KR20230154015A - Retroviral vector - Google Patents
Retroviral vector
- Publication number
- KR20230154015A (application number KR1020237029670A)
- Authority
- KR
- South Korea
- Prior art keywords
- vector
- plasmid
- siv
- codon
- nucleic acid
- Prior art date
Links
- 239000013598 vector Substances 0.000 title claims abstract description 359
- 241001430294 unidentified retrovirus Species 0.000 title description 25
- 230000001177 retroviral effect Effects 0.000 claims abstract description 164
- 108090000623 proteins and genes Proteins 0.000 claims abstract description 142
- 108700019146 Transgenes Proteins 0.000 claims abstract description 113
- 101710133291 Hemagglutinin-neuraminidase Proteins 0.000 claims abstract description 85
- 102000004169 proteins and genes Human genes 0.000 claims abstract description 80
- 230000004927 fusion Effects 0.000 claims abstract description 23
- 230000000241 respiratory effect Effects 0.000 claims abstract description 22
- 201000003883 Cystic fibrosis Diseases 0.000 claims abstract description 19
- 239000013612 plasmid Substances 0.000 claims description 234
- 241000713311 Simian immunodeficiency virus Species 0.000 claims description 211
- 238000000034 method Methods 0.000 claims description 185
- 150000007523 nucleic acids Chemical group 0.000 claims description 120
- 101150047047 gag-pol gene Proteins 0.000 claims description 87
- 108091028043 Nucleic acid sequence Proteins 0.000 claims description 67
- 102000039446 nucleic acids Human genes 0.000 claims description 58
- 108020004707 nucleic acids Proteins 0.000 claims description 58
- 208000037265 diseases, disorders, signs and symptoms Diseases 0.000 claims description 47
- 108010050122 alpha 1-Antitrypsin Proteins 0.000 claims description 44
- 102000015395 alpha 1-Antitrypsin Human genes 0.000 claims description 43
- 201000010099 disease Diseases 0.000 claims description 43
- 229940024142 alpha 1-antitrypsin Drugs 0.000 claims description 42
- 101000911390 Homo sapiens Coagulation factor VIII Proteins 0.000 claims description 39
- 102100026735 Coagulation factor VIII Human genes 0.000 claims description 38
- 230000001965 increasing effect Effects 0.000 claims description 29
- 230000001225 therapeutic effect Effects 0.000 claims description 29
- 241000713666 Lentivirus Species 0.000 claims description 25
- 241000282414 Homo sapiens Species 0.000 claims description 20
- 241000701022 Cytomegalovirus Species 0.000 claims description 17
- 101710163270 Nuclease Proteins 0.000 claims description 10
- 238000000746 purification Methods 0.000 claims description 10
- 101001086862 Homo sapiens Pulmonary surfactant-associated protein B Proteins 0.000 claims description 9
- 241000725303 Human immunodeficiency virus Species 0.000 claims description 9
- 102100032617 Pulmonary surfactant-associated protein B Human genes 0.000 claims description 9
- 238000003306 harvesting Methods 0.000 claims description 9
- 208000019693 Lung disease Diseases 0.000 claims description 7
- 239000003623 enhancer Substances 0.000 claims description 7
- 239000013603 viral vector Substances 0.000 claims description 7
- 241000713730 Equine infectious anemia virus Species 0.000 claims description 6
- 241000713800 Feline immunodeficiency virus Species 0.000 claims description 6
- 241000711408 Murine respirovirus Species 0.000 claims description 6
- 102000004142 Trypsin Human genes 0.000 claims description 6
- 108090000631 Trypsin Proteins 0.000 claims description 6
- 239000012588 trypsin Substances 0.000 claims description 6
- 102000004457 Granulocyte-Macrophage Colony-Stimulating Factor Human genes 0.000 claims description 5
- 108010017213 Granulocyte-Macrophage Colony-Stimulating Factor Proteins 0.000 claims description 5
- 102100022641 Coagulation factor IX Human genes 0.000 claims description 4
- 102100023804 Coagulation factor VII Human genes 0.000 claims description 4
- 102100032300 Dynein axonemal heavy chain 11 Human genes 0.000 claims description 4
- 102100031648 Dynein axonemal heavy chain 5 Human genes 0.000 claims description 4
- 102100033595 Dynein axonemal intermediate chain 1 Human genes 0.000 claims description 4
- 102100033596 Dynein axonemal intermediate chain 2 Human genes 0.000 claims description 4
- 108010076282 Factor IX Proteins 0.000 claims description 4
- 108010023321 Factor VII Proteins 0.000 claims description 4
- 108010054218 Factor VIII Proteins 0.000 claims description 4
- 102000001690 Factor VIII Human genes 0.000 claims description 4
- 101001016208 Homo sapiens Dynein axonemal heavy chain 11 Proteins 0.000 claims description 4
- 101000866368 Homo sapiens Dynein axonemal heavy chain 5 Proteins 0.000 claims description 4
- 101000872267 Homo sapiens Dynein axonemal intermediate chain 1 Proteins 0.000 claims description 4
- 101000872272 Homo sapiens Dynein axonemal intermediate chain 2 Proteins 0.000 claims description 4
- 208000010094 Visna Diseases 0.000 claims description 4
- 238000011210 chromatographic step Methods 0.000 claims description 4
- 229960004222 factor ix Drugs 0.000 claims description 4
- 229940012413 factor vii Drugs 0.000 claims description 4
- 229960000301 factor viii Drugs 0.000 claims description 4
- 239000012678 infectious agent Substances 0.000 claims description 4
- 238000012794 pre-harvesting Methods 0.000 claims description 4
- 239000000725 suspension Substances 0.000 claims description 4
- 101000801640 Homo sapiens Phospholipid-transporting ATPase ABCA3 Proteins 0.000 claims description 3
- 102000002508 Peptide Elongation Factors Human genes 0.000 claims description 3
- 108010068204 Peptide Elongation Factors Proteins 0.000 claims description 3
- 102100033623 Phospholipid-transporting ATPase ABCA3 Human genes 0.000 claims description 3
- 108010047303 von Willebrand Factor Proteins 0.000 claims description 3
- 102100036537 von Willebrand factor Human genes 0.000 claims description 3
- 229960001134 von willebrand factor Drugs 0.000 claims description 3
- 108010014173 Factor X Proteins 0.000 claims description 2
- 108010074864 Factor XI Proteins 0.000 claims description 2
- 229940012426 factor x Drugs 0.000 claims description 2
- 108010079245 Cystic Fibrosis Transmembrane Conductance Regulator Proteins 0.000 claims 1
- 102000008371 intracellularly ATP-gated chloride channel activity proteins Human genes 0.000 claims 1
- 238000007670 refining Methods 0.000 claims 1
- 238000004519 manufacturing process Methods 0.000 abstract description 46
- 238000011282 treatment Methods 0.000 abstract description 15
- 238000001415 gene therapy Methods 0.000 abstract description 10
- 238000012546 transfer Methods 0.000 abstract description 9
- 208000023504 respiratory system disease Diseases 0.000 abstract description 5
- 108020004414 DNA Proteins 0.000 description 102
- 210000004027 cell Anatomy 0.000 description 85
- 235000018102 proteins Nutrition 0.000 description 65
- 230000014509 gene expression Effects 0.000 description 57
- 235000001014 amino acid Nutrition 0.000 description 52
- 150000001413 amino acids Chemical class 0.000 description 47
- 108090000765 processed proteins & peptides Proteins 0.000 description 46
- 229940024606 amino acid Drugs 0.000 description 44
- 102000004196 processed proteins & peptides Human genes 0.000 description 43
- 230000006870 function Effects 0.000 description 42
- 229920001184 polypeptide Polymers 0.000 description 42
- 108010027225 gag-pol Fusion Proteins Proteins 0.000 description 40
- 102100023419 Cystic fibrosis transmembrane conductance regulator Human genes 0.000 description 28
- 101000907783 Homo sapiens Cystic fibrosis transmembrane conductance regulator Proteins 0.000 description 28
- 239000012634 fragment Substances 0.000 description 23
- 108020004705 Codon Proteins 0.000 description 22
- 108700026244 Open Reading Frames Proteins 0.000 description 21
- 239000000203 mixture Substances 0.000 description 20
- 230000000295 complement effect Effects 0.000 description 19
- 230000014616 translation Effects 0.000 description 19
- 230000004048 modification Effects 0.000 description 18
- 238000012986 modification Methods 0.000 description 18
- 239000002245 particle Substances 0.000 description 17
- 230000000670 limiting effect Effects 0.000 description 16
- 125000003729 nucleotide group Chemical group 0.000 description 16
- 210000002345 respiratory system Anatomy 0.000 description 16
- 210000004072 lung Anatomy 0.000 description 15
- 238000005457 optimization Methods 0.000 description 15
- 238000013519 translation Methods 0.000 description 15
- 125000003275 alpha amino acid group Chemical group 0.000 description 14
- 239000002773 nucleotide Substances 0.000 description 14
- 238000006467 substitution reaction Methods 0.000 description 14
- 239000000443 aerosol Substances 0.000 description 13
- 125000000539 amino acid group Chemical class 0.000 description 13
- 108091033319 polynucleotide Proteins 0.000 description 13
- 102000040430 polynucleotide Human genes 0.000 description 13
- 239000002157 polynucleotide Substances 0.000 description 13
- 238000010361 transduction Methods 0.000 description 13
- 230000026683 transduction Effects 0.000 description 13
- WHUUTDBJXJRKMK-UHFFFAOYSA-N Glutamic acid Natural products OC(=O)C(N)CCC(O)=O WHUUTDBJXJRKMK-UHFFFAOYSA-N 0.000 description 12
- KDXKERNSBIXSRK-UHFFFAOYSA-N Lysine Natural products NCCCCC(N)C(O)=O KDXKERNSBIXSRK-UHFFFAOYSA-N 0.000 description 11
- 230000007774 longterm Effects 0.000 description 11
- 230000002829 reductive effect Effects 0.000 description 10
- 230000003612 virological effect Effects 0.000 description 10
- 241000700605 Viruses Species 0.000 description 9
- 239000003814 drug Substances 0.000 description 9
- 210000000981 epithelium Anatomy 0.000 description 9
- 230000000694 effects Effects 0.000 description 8
- 241000894007 species Species 0.000 description 8
- SITLTJHOQZFJGG-UHFFFAOYSA-N N-L-alpha-glutamyl-L-valine Natural products CC(C)C(C(O)=O)NC(=O)C(N)CCC(O)=O SITLTJHOQZFJGG-UHFFFAOYSA-N 0.000 description 7
- 230000007812 deficiency Effects 0.000 description 7
- 238000009472 formulation Methods 0.000 description 7
- 108010051242 phenylalanylserine Proteins 0.000 description 7
- 239000000047 product Substances 0.000 description 7
- 208000024891 symptom Diseases 0.000 description 7
- 108010002311 N-glycylglutamic acid Proteins 0.000 description 6
- 230000008901 benefit Effects 0.000 description 6
- 238000012217 deletion Methods 0.000 description 6
- 230000037430 deletion Effects 0.000 description 6
- 238000013400 design of experiment Methods 0.000 description 6
- 108010063718 gamma-glutamylaspartic acid Proteins 0.000 description 6
- 108010055341 glutamyl-glutamic acid Proteins 0.000 description 6
- 108010016686 methionyl-alanyl-serine Proteins 0.000 description 6
- 230000035772 mutation Effects 0.000 description 6
- 230000036961 partial effect Effects 0.000 description 6
- 108010073025 phenylalanylphenylalanine Proteins 0.000 description 6
- 230000009467 reduction Effects 0.000 description 6
- 210000000130 stem cell Anatomy 0.000 description 6
- 238000001890 transfection Methods 0.000 description 6
- JSNWZMFSLIWAHS-HJGDQZAQSA-N Asp-Thr-Leu Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC(C)C)C(=O)O)NC(=O)[C@H](CC(=O)O)N)O JSNWZMFSLIWAHS-HJGDQZAQSA-N 0.000 description 5
- AGPKZVBTJJNPAG-WHFBIAKZSA-N L-isoleucine Chemical compound CC[C@H](C)[C@H](N)C(O)=O AGPKZVBTJJNPAG-WHFBIAKZSA-N 0.000 description 5
- COLNVLDHVKWLRT-QMMMGPOBSA-N L-phenylalanine Chemical compound OC(=O)[C@@H](N)CC1=CC=CC=C1 COLNVLDHVKWLRT-QMMMGPOBSA-N 0.000 description 5
- AYFVYJQAPQTCCC-GBXIJSLDSA-N L-threonine Chemical compound C[C@@H](O)[C@H](N)C(O)=O AYFVYJQAPQTCCC-GBXIJSLDSA-N 0.000 description 5
- KZNQNBZMBZJQJO-UHFFFAOYSA-N N-glycyl-L-proline Natural products NCC(=O)N1CCCC1C(O)=O KZNQNBZMBZJQJO-UHFFFAOYSA-N 0.000 description 5
- AYFVYJQAPQTCCC-UHFFFAOYSA-N Threonine Natural products CC(O)C(N)C(O)=O AYFVYJQAPQTCCC-UHFFFAOYSA-N 0.000 description 5
- PZTZYZUTCPZWJH-FXQIFTODSA-N Val-Ser-Ser Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CO)C(=O)N[C@@H](CO)C(=O)O)N PZTZYZUTCPZWJH-FXQIFTODSA-N 0.000 description 5
- KZSNJWFQEVHDMF-UHFFFAOYSA-N Valine Natural products CC(C)C(N)C(O)=O KZSNJWFQEVHDMF-UHFFFAOYSA-N 0.000 description 5
- 238000007792 addition Methods 0.000 description 5
- 210000005058 airway cell Anatomy 0.000 description 5
- 238000002474 experimental method Methods 0.000 description 5
- 230000002068 genetic effect Effects 0.000 description 5
- 108010050848 glycylleucine Proteins 0.000 description 5
- 238000003780 insertion Methods 0.000 description 5
- 230000037431 insertion Effects 0.000 description 5
- 230000005923 long-lasting effect Effects 0.000 description 5
- 230000002459 sustained effect Effects 0.000 description 5
- 238000002560 therapeutic procedure Methods 0.000 description 5
- 108010068327 4-hydroxyphenylpyruvate dioxygenase Proteins 0.000 description 4
- DPXDVGDLWJYZBH-GUBZILKMSA-N Arg-Asn-Arg Chemical compound NC(N)=NCCC[C@H](N)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CCCN=C(N)N)C(O)=O DPXDVGDLWJYZBH-GUBZILKMSA-N 0.000 description 4
- UDSVWSUXKYXSTR-QWRGUYRKSA-N Asn-Gly-Tyr Chemical compound [H]N[C@@H](CC(N)=O)C(=O)NCC(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O UDSVWSUXKYXSTR-QWRGUYRKSA-N 0.000 description 4
- AYFVRYXNDHBECD-YUMQZZPRSA-N Asp-Leu-Gly Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(C)C)C(=O)NCC(O)=O AYFVRYXNDHBECD-YUMQZZPRSA-N 0.000 description 4
- SHERTACNJPYHAR-ACZMJKKPSA-N Gln-Ala-Ser Chemical compound OC[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)[C@@H](N)CCC(N)=O SHERTACNJPYHAR-ACZMJKKPSA-N 0.000 description 4
- DHMQDGOQFOQNFH-UHFFFAOYSA-N Glycine Chemical compound NCC(O)=O DHMQDGOQFOQNFH-UHFFFAOYSA-N 0.000 description 4
- OUYCCCASQSFEME-QMMMGPOBSA-N L-tyrosine Chemical compound OC(=O)[C@@H](N)CC1=CC=C(O)C=C1 OUYCCCASQSFEME-QMMMGPOBSA-N 0.000 description 4
- 241000880493 Leptailurus serval Species 0.000 description 4
- LOLUPZNNADDTAA-AVGNSLFASA-N Leu-Gln-Leu Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(C)C)C(O)=O LOLUPZNNADDTAA-AVGNSLFASA-N 0.000 description 4
- YOKVEHGYYQEQOP-QWRGUYRKSA-N Leu-Leu-Gly Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)NCC(O)=O YOKVEHGYYQEQOP-QWRGUYRKSA-N 0.000 description 4
- XMBSYZWANAQXEV-UHFFFAOYSA-N N-alpha-L-glutamyl-L-phenylalanine Natural products OC(=O)CCC(N)C(=O)NC(C(O)=O)CC1=CC=CC=C1 XMBSYZWANAQXEV-UHFFFAOYSA-N 0.000 description 4
- GZFAWAQTEYDKII-YUMQZZPRSA-N Ser-Gly-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)CNC(=O)[C@@H](N)CO GZFAWAQTEYDKII-YUMQZZPRSA-N 0.000 description 4
- HNDMFDBQXYZSRM-IHRRRGAJSA-N Ser-Val-Phe Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O HNDMFDBQXYZSRM-IHRRRGAJSA-N 0.000 description 4
- ZOBLBMGJKVJVEV-BZSNNMDCSA-N Tyr-Lys-Lys Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCCCN)C(=O)O)N)O ZOBLBMGJKVJVEV-BZSNNMDCSA-N 0.000 description 4
- VPGCVZRRBYOGCD-AVGNSLFASA-N Val-Lys-Val Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](C(C)C)C(O)=O VPGCVZRRBYOGCD-AVGNSLFASA-N 0.000 description 4
- 230000002411 adverse Effects 0.000 description 4
- 108010086434 alanyl-seryl-glycine Proteins 0.000 description 4
- 108010087924 alanylproline Proteins 0.000 description 4
- KOSRFJWDECSPRO-UHFFFAOYSA-N alpha-L-glutamyl-L-glutamic acid Natural products OC(=O)CCC(N)C(=O)NC(CCC(O)=O)C(O)=O KOSRFJWDECSPRO-UHFFFAOYSA-N 0.000 description 4
- 238000004458 analytical method Methods 0.000 description 4
- 108010077245 asparaginyl-proline Proteins 0.000 description 4
- 108010040443 aspartyl-aspartic acid Proteins 0.000 description 4
- 108010038633 aspartylglutamate Proteins 0.000 description 4
- 210000000270 basal cell Anatomy 0.000 description 4
- 239000003153 chemical reaction reagent Substances 0.000 description 4
- 238000009295 crossflow filtration Methods 0.000 description 4
- 208000035475 disorder Diseases 0.000 description 4
- 238000005516 engineering process Methods 0.000 description 4
- 108010078144 glutaminyl-glycine Proteins 0.000 description 4
- VPZXBVLAVMBEQI-UHFFFAOYSA-N glycyl-DL-alpha-alanine Natural products OC(=O)C(C)NC(=O)CN VPZXBVLAVMBEQI-UHFFFAOYSA-N 0.000 description 4
- 238000000338 in vitro Methods 0.000 description 4
- 230000005764 inhibitory process Effects 0.000 description 4
- 108010057821 leucylproline Proteins 0.000 description 4
- 108010054155 lysyllysine Proteins 0.000 description 4
- 108010017391 lysylvaline Proteins 0.000 description 4
- 239000011159 matrix material Substances 0.000 description 4
- 239000002609 medium Substances 0.000 description 4
- 108020004999 messenger RNA Proteins 0.000 description 4
- 108010056582 methionylglutamic acid Proteins 0.000 description 4
- 210000001331 nose Anatomy 0.000 description 4
- -1 phosphorylated Chemical class 0.000 description 4
- 108700004029 pol Genes Proteins 0.000 description 4
- 238000002360 preparation method Methods 0.000 description 4
- 108700042769 prolyl-leucyl-glycine Proteins 0.000 description 4
- 230000010076 replication Effects 0.000 description 4
- 230000028327 secretion Effects 0.000 description 4
- 108010071207 serylmethionine Proteins 0.000 description 4
- 238000010561 standard procedure Methods 0.000 description 4
- 239000004094 surface-active agent Substances 0.000 description 4
- 210000001519 tissue Anatomy 0.000 description 4
- 108010027345 wheylin-1 peptide Proteins 0.000 description 4
- 108091032973 (ribonucleotides)n+m Proteins 0.000 description 3
- VCSABYLVNWQYQE-UHFFFAOYSA-N Ala-Lys-Lys Natural products NCCCCC(NC(=O)C(N)C)C(=O)NC(CCCCN)C(O)=O VCSABYLVNWQYQE-UHFFFAOYSA-N 0.000 description 3
- JPAWCMXVNZPJLO-IHRRRGAJSA-N Arg-Ser-Phe Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O JPAWCMXVNZPJLO-IHRRRGAJSA-N 0.000 description 3
- UVTGNSWSRSCPLP-UHFFFAOYSA-N Arg-Tyr Natural products NC(CCNC(=N)N)C(=O)NC(Cc1ccc(O)cc1)C(=O)O UVTGNSWSRSCPLP-UHFFFAOYSA-N 0.000 description 3
- PHJPKNUWWHRAOC-PEFMBERDSA-N Asn-Ile-Gln Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)O)NC(=O)[C@H](CC(=O)N)N PHJPKNUWWHRAOC-PEFMBERDSA-N 0.000 description 3
- LVHMEJJWEXBMKK-GMOBBJLQSA-N Asn-Ile-Met Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCSC)C(=O)O)NC(=O)[C@H](CC(=O)N)N LVHMEJJWEXBMKK-GMOBBJLQSA-N 0.000 description 3
- HMUKKNAMNSXDBB-CIUDSAMLSA-N Asn-Met-Glu Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CCC(O)=O)C(O)=O HMUKKNAMNSXDBB-CIUDSAMLSA-N 0.000 description 3
- XJQRWGXKUSDEFI-ACZMJKKPSA-N Asp-Glu-Asn Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O XJQRWGXKUSDEFI-ACZMJKKPSA-N 0.000 description 3
- XDGBFDYXZCMYEX-NUMRIWBASA-N Asp-Glu-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)O)NC(=O)[C@H](CC(=O)O)N)O XDGBFDYXZCMYEX-NUMRIWBASA-N 0.000 description 3
- CLUMZOKVGUWUFD-CIUDSAMLSA-N Asp-Leu-Asn Chemical compound OC(=O)C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(N)=O)C(O)=O CLUMZOKVGUWUFD-CIUDSAMLSA-N 0.000 description 3
- 208000019838 Blood disease Diseases 0.000 description 3
- 108091003079 Bovine Serum Albumin Proteins 0.000 description 3
- 101150029409 CFTR gene Proteins 0.000 description 3
- 208000024172 Cardiovascular disease Diseases 0.000 description 3
- 241000282552 Chlorocebus aethiops Species 0.000 description 3
- 208000017667 Chronic Disease Diseases 0.000 description 3
- 108700010070 Codon Usage Proteins 0.000 description 3
- 102000053602 DNA Human genes 0.000 description 3
- 241000588724 Escherichia coli Species 0.000 description 3
- LYCAIKOWRPUZTN-UHFFFAOYSA-N Ethylene glycol Chemical compound OCCO LYCAIKOWRPUZTN-UHFFFAOYSA-N 0.000 description 3
- MWLYSLMKFXWZPW-ZPFDUUQYSA-N Gln-Arg-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@H](CCCN=C(N)N)NC(=O)[C@@H](N)CCC(N)=O MWLYSLMKFXWZPW-ZPFDUUQYSA-N 0.000 description 3
- DLISPGXMKZTWQG-IFFSRLJSSA-N Glu-Thr-Val Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C(C)C)C(O)=O DLISPGXMKZTWQG-IFFSRLJSSA-N 0.000 description 3
- FNXSYBOHALPRHV-ONGXEEELSA-N Gly-Val-Lys Chemical compound NCC(=O)N[C@@H](C(C)C)C(=O)N[C@H](C(O)=O)CCCCN FNXSYBOHALPRHV-ONGXEEELSA-N 0.000 description 3
- PEDCQBHIVMGVHV-UHFFFAOYSA-N Glycerine Chemical compound OCC(O)CO PEDCQBHIVMGVHV-UHFFFAOYSA-N 0.000 description 3
- RVKIPWVMZANZLI-UHFFFAOYSA-N H-Lys-Trp-OH Natural products C1=CC=C2C(CC(NC(=O)C(N)CCCCN)C(O)=O)=CNC2=C1 RVKIPWVMZANZLI-UHFFFAOYSA-N 0.000 description 3
- PGTISAJTWZPFGN-PEXQALLHSA-N His-Gly-Ile Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)NCC(=O)N[C@@H]([C@@H](C)CC)C(O)=O PGTISAJTWZPFGN-PEXQALLHSA-N 0.000 description 3
- JMSONHOUHFDOJH-GUBZILKMSA-N His-Ser-Glu Chemical compound OC(=O)CC[C@@H](C(O)=O)NC(=O)[C@H](CO)NC(=O)[C@@H](N)CC1=CN=CN1 JMSONHOUHFDOJH-GUBZILKMSA-N 0.000 description 3
- CUEQQFOGARVNHU-VGDYDELISA-N His-Ser-Ile Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O CUEQQFOGARVNHU-VGDYDELISA-N 0.000 description 3
- KDDKJKKQODQQBR-NHCYSSNCSA-N His-Val-Asp Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)O)NC(=O)[C@H](CC1=CN=CN1)N KDDKJKKQODQQBR-NHCYSSNCSA-N 0.000 description 3
- 102100034349 Integrase Human genes 0.000 description 3
- CKLJMWTZIZZHCS-REOHCLBHSA-N L-aspartic acid Chemical compound OC(=O)[C@@H](N)CC(O)=O CKLJMWTZIZZHCS-REOHCLBHSA-N 0.000 description 3
- ROHFNLRQFUQHCH-YFKPBYRVSA-N L-leucine Chemical compound CC(C)C[C@H](N)C(O)=O ROHFNLRQFUQHCH-YFKPBYRVSA-N 0.000 description 3
- SENJXOPIZNYLHU-UHFFFAOYSA-N L-leucyl-L-arginine Natural products CC(C)CC(N)C(=O)NC(C(O)=O)CCCN=C(N)N SENJXOPIZNYLHU-UHFFFAOYSA-N 0.000 description 3
- TYYLDKGBCJGJGW-UHFFFAOYSA-N L-tryptophan-L-tyrosine Natural products C=1NC2=CC=CC=C2C=1CC(N)C(=O)NC(C(O)=O)CC1=CC=C(O)C=C1 TYYLDKGBCJGJGW-UHFFFAOYSA-N 0.000 description 3
- WSGXUIQTEZDVHJ-GARJFASQSA-N Leu-Ala-Pro Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](C)C(=O)N1CCC[C@@H]1C(O)=O WSGXUIQTEZDVHJ-GARJFASQSA-N 0.000 description 3
- YFBBUHJJUXXZOF-UWVGGRQHSA-N Leu-Gly-Pro Chemical compound CC(C)C[C@H](N)C(=O)NCC(=O)N1CCC[C@H]1C(O)=O YFBBUHJJUXXZOF-UWVGGRQHSA-N 0.000 description 3
- QJXHMYMRGDOHRU-NHCYSSNCSA-N Leu-Ile-Gly Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(=O)NCC(O)=O QJXHMYMRGDOHRU-NHCYSSNCSA-N 0.000 description 3
- RXGLHDWAZQECBI-SRVKXCTJSA-N Leu-Leu-Ser Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(O)=O RXGLHDWAZQECBI-SRVKXCTJSA-N 0.000 description 3
- ZDJQVSIPFLMNOX-RHYQMDGZSA-N Leu-Thr-Arg Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@H](C(O)=O)CCCN=C(N)N ZDJQVSIPFLMNOX-RHYQMDGZSA-N 0.000 description 3
- UIJVKVHLCQSPOJ-XIRDDKMYSA-N Lys-Ser-Trp Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@@H](Cc1c[nH]c2ccccc12)C(O)=O UIJVKVHLCQSPOJ-XIRDDKMYSA-N 0.000 description 3
- YBAFDPFAUTYYRW-UHFFFAOYSA-N N-L-alpha-glutamyl-L-leucine Natural products CC(C)CC(C(O)=O)NC(=O)C(N)CCC(O)=O YBAFDPFAUTYYRW-UHFFFAOYSA-N 0.000 description 3
- 108010079364 N-glycylalanine Proteins 0.000 description 3
- KDYPMIZMXDECSU-JYJNAYRXSA-N Phe-Leu-Glu Chemical compound OC(=O)CC[C@@H](C(O)=O)NC(=O)[C@H](CC(C)C)NC(=O)[C@@H](N)CC1=CC=CC=C1 KDYPMIZMXDECSU-JYJNAYRXSA-N 0.000 description 3
- INHMISZWLJZQGH-ULQDDVLXSA-N Phe-Leu-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@H](CC(C)C)NC(=O)[C@@H](N)CC1=CC=CC=C1 INHMISZWLJZQGH-ULQDDVLXSA-N 0.000 description 3
- 108020004511 Recombinant DNA Proteins 0.000 description 3
- 241000712907 Retroviridae Species 0.000 description 3
- IOVHBRCQOGWAQH-ZKWXMUAHSA-N Ser-Gly-Ile Chemical compound [H]N[C@@H](CO)C(=O)NCC(=O)N[C@@H]([C@@H](C)CC)C(O)=O IOVHBRCQOGWAQH-ZKWXMUAHSA-N 0.000 description 3
- FAPWRFPIFSIZLT-UHFFFAOYSA-M Sodium chloride Chemical compound [Na+].[Cl-] FAPWRFPIFSIZLT-UHFFFAOYSA-M 0.000 description 3
- PRNGXSILMXSWQQ-OEAJRASXSA-N Thr-Leu-Phe Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O PRNGXSILMXSWQQ-OEAJRASXSA-N 0.000 description 3
- XSEPSRUDSPHMPX-KATARQTJSA-N Thr-Lys-Ser Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CO)C(O)=O XSEPSRUDSPHMPX-KATARQTJSA-N 0.000 description 3
- VGYVVSQFSSKZRJ-OEAJRASXSA-N Thr-Phe-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@@H](N)[C@H](O)C)CC1=CC=CC=C1 VGYVVSQFSSKZRJ-OEAJRASXSA-N 0.000 description 3
- 239000004473 Threonine Substances 0.000 description 3
- OFCKFBGRYHOKFP-IHPCNDPISA-N Trp-Asp-Tyr Chemical compound C1=CC=C2C(=C1)C(=CN2)C[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CC3=CC=C(C=C3)O)C(=O)O)N OFCKFBGRYHOKFP-IHPCNDPISA-N 0.000 description 3
- VNRTXOUAOUZCFW-WDSOQIARSA-N Trp-Val-His Chemical compound CC(C)[C@H](NC(=O)[C@@H](N)Cc1c[nH]c2ccccc12)C(=O)N[C@@H](Cc1cnc[nH]1)C(O)=O VNRTXOUAOUZCFW-WDSOQIARSA-N 0.000 description 3
- BEIGSKUPTIFYRZ-SRVKXCTJSA-N Tyr-Asp-Asp Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CC(=O)O)C(=O)O)N)O BEIGSKUPTIFYRZ-SRVKXCTJSA-N 0.000 description 3
- HQYVQDRYODWONX-DCAQKATOSA-N Val-His-Ser Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC1=CN=CN1)C(=O)N[C@@H](CO)C(=O)O)N HQYVQDRYODWONX-DCAQKATOSA-N 0.000 description 3
- 241000711975 Vesicular stomatitis virus Species 0.000 description 3
- 108020000999 Viral RNA Proteins 0.000 description 3
- 241001492404 Woodchuck hepatitis virus Species 0.000 description 3
- 210000001552 airway epithelial cell Anatomy 0.000 description 3
- 108010044940 alanylglutamine Proteins 0.000 description 3
- 108010013835 arginine glutamate Proteins 0.000 description 3
- 108010060035 arginylproline Proteins 0.000 description 3
- 108010010430 asparagine-proline-alanine Proteins 0.000 description 3
- 108010093581 aspartyl-proline Proteins 0.000 description 3
- 210000003123 bronchiole Anatomy 0.000 description 3
- 238000004422 calculation algorithm Methods 0.000 description 3
- 238000003776 cleavage reaction Methods 0.000 description 3
- 230000006378 damage Effects 0.000 description 3
- 230000002950 deficient Effects 0.000 description 3
- 238000010586 diagram Methods 0.000 description 3
- 238000010790 dilution Methods 0.000 description 3
- 239000012895 dilution Substances 0.000 description 3
- 238000009826 distribution Methods 0.000 description 3
- PMMYEEVYMWASQN-UHFFFAOYSA-N dl-hydroxyproline Natural products OC1C[NH2+]C(C([O-])=O)C1 PMMYEEVYMWASQN-UHFFFAOYSA-N 0.000 description 3
- 229940079593 drug Drugs 0.000 description 3
- 210000002919 epithelial cell Anatomy 0.000 description 3
- 108700004026 gag Genes Proteins 0.000 description 3
- 108010049041 glutamylalanine Proteins 0.000 description 3
- 108010037850 glycylvaline Proteins 0.000 description 3
- 208000014951 hematologic disease Diseases 0.000 description 3
- 208000018706 hematopoietic system disease Diseases 0.000 description 3
- 108010040030 histidinoalanine Proteins 0.000 description 3
- 108010025306 histidylleucine Proteins 0.000 description 3
- 210000005260 human cell Anatomy 0.000 description 3
- 208000026278 immune system disease Diseases 0.000 description 3
- 230000006872 improvement Effects 0.000 description 3
- 208000015181 infectious disease Diseases 0.000 description 3
- 208000027866 inflammatory disease Diseases 0.000 description 3
- 230000002757 inflammatory effect Effects 0.000 description 3
- 108010053037 kyotorphin Proteins 0.000 description 3
- 108010000761 leucylarginine Proteins 0.000 description 3
- 108010003700 lysyl aspartic acid Proteins 0.000 description 3
- 208000030159 metabolic disease Diseases 0.000 description 3
- 238000000491 multivariate analysis Methods 0.000 description 3
- 239000007922 nasal spray Substances 0.000 description 3
- COLNVLDHVKWLRT-UHFFFAOYSA-N phenylalanine Natural products OC(=O)C(N)CC1=CC=CC=C1 COLNVLDHVKWLRT-UHFFFAOYSA-N 0.000 description 3
- 108010012581 phenylalanylglutamate Proteins 0.000 description 3
- 108010004914 prolylarginine Proteins 0.000 description 3
- 108010090894 prolylleucine Proteins 0.000 description 3
- 230000022532 regulation of transcription, DNA-dependent Effects 0.000 description 3
- 230000001105 regulatory effect Effects 0.000 description 3
- 230000000717 retained effect Effects 0.000 description 3
- 150000003839 salts Chemical class 0.000 description 3
- 230000007017 scission Effects 0.000 description 3
- 238000002864 sequence alignment Methods 0.000 description 3
- 108010048818 seryl-histidine Proteins 0.000 description 3
- 108010026333 seryl-proline Proteins 0.000 description 3
- 230000008685 targeting Effects 0.000 description 3
- 230000002463 transducing effect Effects 0.000 description 3
- 108010044292 tryptophyltyrosine Proteins 0.000 description 3
- 238000011144 upstream manufacturing Methods 0.000 description 3
- PDRJLZDUOULRHE-ZETCQYMHSA-N (2s)-2-amino-3-pyridin-2-ylpropanoic acid Chemical compound OC(=O)[C@@H](N)CC1=CC=CC=N1 PDRJLZDUOULRHE-ZETCQYMHSA-N 0.000 description 2
- FWMNVWWHGCHHJJ-SKKKGAJSSA-N 4-amino-1-[(2r)-6-amino-2-[[(2r)-2-[[(2r)-2-[[(2r)-2-amino-3-phenylpropanoyl]amino]-3-phenylpropanoyl]amino]-4-methylpentanoyl]amino]hexanoyl]piperidine-4-carboxylic acid Chemical compound C([C@H](C(=O)N[C@H](CC(C)C)C(=O)N[C@H](CCCCN)C(=O)N1CCC(N)(CC1)C(O)=O)NC(=O)[C@H](N)CC=1C=CC=CC=1)C1=CC=CC=C1 FWMNVWWHGCHHJJ-SKKKGAJSSA-N 0.000 description 2
- XWHHYOYVRVGJJY-QMMMGPOBSA-N 4-fluoro-L-phenylalanine Chemical compound OC(=O)[C@@H](N)CC1=CC=C(F)C=C1 XWHHYOYVRVGJJY-QMMMGPOBSA-N 0.000 description 2
- RLMISHABBKUNFO-WHFBIAKZSA-N Ala-Ala-Gly Chemical compound C[C@H](N)C(=O)N[C@@H](C)C(=O)NCC(O)=O RLMISHABBKUNFO-WHFBIAKZSA-N 0.000 description 2
- YYSWCHMLFJLLBJ-ZLUOBGJFSA-N Ala-Ala-Ser Chemical compound C[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@@H](CO)C(O)=O YYSWCHMLFJLLBJ-ZLUOBGJFSA-N 0.000 description 2
- WYPUMLRSQMKIJU-BPNCWPANSA-N Ala-Arg-Tyr Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O WYPUMLRSQMKIJU-BPNCWPANSA-N 0.000 description 2
- NKJBKNVQHBZUIX-ACZMJKKPSA-N Ala-Gln-Asp Chemical compound C[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(O)=O)C(O)=O NKJBKNVQHBZUIX-ACZMJKKPSA-N 0.000 description 2
- SFNFGFDRYJKZKN-XQXXSGGOSA-N Ala-Gln-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)N)NC(=O)[C@H](C)N)O SFNFGFDRYJKZKN-XQXXSGGOSA-N 0.000 description 2
- KXEVYGKATAMXJJ-ACZMJKKPSA-N Ala-Glu-Asp Chemical compound C[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(O)=O)C(O)=O KXEVYGKATAMXJJ-ACZMJKKPSA-N 0.000 description 2
- WKOBSJOZRJJVRZ-FXQIFTODSA-N Ala-Glu-Glu Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O WKOBSJOZRJJVRZ-FXQIFTODSA-N 0.000 description 2
- OMMDTNGURYRDAC-NRPADANISA-N Ala-Glu-Val Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](C(C)C)C(O)=O OMMDTNGURYRDAC-NRPADANISA-N 0.000 description 2
- VCSABYLVNWQYQE-SRVKXCTJSA-N Ala-Lys-Lys Chemical compound NCCCC[C@H](NC(=O)[C@@H](N)C)C(=O)N[C@@H](CCCCN)C(O)=O VCSABYLVNWQYQE-SRVKXCTJSA-N 0.000 description 2
- CHFFHQUVXHEGBY-GARJFASQSA-N Ala-Lys-Pro Chemical compound C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)N1CCC[C@@H]1C(=O)O)N CHFFHQUVXHEGBY-GARJFASQSA-N 0.000 description 2
- RMAWDDRDTRSZIR-ZLUOBGJFSA-N Ala-Ser-Asp Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(O)=O)C(O)=O RMAWDDRDTRSZIR-ZLUOBGJFSA-N 0.000 description 2
- VRTOMXFZHGWHIJ-KZVJFYERSA-N Ala-Thr-Arg Chemical compound [H]N[C@@H](C)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O VRTOMXFZHGWHIJ-KZVJFYERSA-N 0.000 description 2
- JPOQZCHGOTWRTM-FQPOAREZSA-N Ala-Tyr-Thr Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H]([C@@H](C)O)C(O)=O JPOQZCHGOTWRTM-FQPOAREZSA-N 0.000 description 2
- DEAGTWNKODHUIY-MRFFXTKBSA-N Ala-Tyr-Trp Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(O)=O DEAGTWNKODHUIY-MRFFXTKBSA-N 0.000 description 2
- VHAQSYHSDKERBS-XPUUQOCRSA-N Ala-Val-Gly Chemical compound C[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)NCC(O)=O VHAQSYHSDKERBS-XPUUQOCRSA-N 0.000 description 2
- DHONNEYAZPNGSG-UBHSHLNASA-N Ala-Val-Phe Chemical compound C[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 DHONNEYAZPNGSG-UBHSHLNASA-N 0.000 description 2
- 206010001881 Alveolar proteinosis Diseases 0.000 description 2
- VKKYFICVTYKFIO-CIUDSAMLSA-N Arg-Ala-Glu Chemical compound OC(=O)CC[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)[C@@H](N)CCCN=C(N)N VKKYFICVTYKFIO-CIUDSAMLSA-N 0.000 description 2
- OLDOLPWZEMHNIA-PJODQICGSA-N Arg-Ala-Trp Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](C)C(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(O)=O OLDOLPWZEMHNIA-PJODQICGSA-N 0.000 description 2
- NUBPTCMEOCKWDO-DCAQKATOSA-N Arg-Asn-His Chemical compound C1=C(NC=N1)C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)[C@H](CCCN=C(N)N)N NUBPTCMEOCKWDO-DCAQKATOSA-N 0.000 description 2
- FEZJJKXNPSEYEV-CIUDSAMLSA-N Arg-Gln-Ala Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](C)C(O)=O FEZJJKXNPSEYEV-CIUDSAMLSA-N 0.000 description 2
- HPKSHFSEXICTLI-CIUDSAMLSA-N Arg-Glu-Ala Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](C)C(O)=O HPKSHFSEXICTLI-CIUDSAMLSA-N 0.000 description 2
- PBSOQGZLPFVXPU-YUMQZZPRSA-N Arg-Glu-Gly Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCC(O)=O)C(=O)NCC(O)=O PBSOQGZLPFVXPU-YUMQZZPRSA-N 0.000 description 2
- WMEVEPXNCMKNGH-IHRRRGAJSA-N Arg-Leu-His Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)NC(=O)[C@H](CCCN=C(N)N)N WMEVEPXNCMKNGH-IHRRRGAJSA-N 0.000 description 2
- NMRHDSAOIURTNT-RWMBFGLXSA-N Arg-Leu-Pro Chemical compound CC(C)C[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CCCN=C(N)N)N NMRHDSAOIURTNT-RWMBFGLXSA-N 0.000 description 2
- SSZGOKWBHLOCHK-DCAQKATOSA-N Arg-Lys-Asn Chemical compound NC(=O)C[C@@H](C(O)=O)NC(=O)[C@H](CCCCN)NC(=O)[C@@H](N)CCCN=C(N)N SSZGOKWBHLOCHK-DCAQKATOSA-N 0.000 description 2
- NPAVRDPEFVKELR-DCAQKATOSA-N Arg-Lys-Ser Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CO)C(O)=O NPAVRDPEFVKELR-DCAQKATOSA-N 0.000 description 2
- VIINVRPKMUZYOI-DCAQKATOSA-N Arg-Met-Glu Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CCC(O)=O)C(O)=O VIINVRPKMUZYOI-DCAQKATOSA-N 0.000 description 2
- FKQITMVNILRUCQ-IHRRRGAJSA-N Arg-Phe-Asp Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(O)=O)C(O)=O FKQITMVNILRUCQ-IHRRRGAJSA-N 0.000 description 2
- IGFJVXOATGZTHD-UHFFFAOYSA-N Arg-Phe-His Natural products NC(CCNC(=N)N)C(=O)NC(Cc1ccccc1)C(=O)NC(Cc2c[nH]cn2)C(=O)O IGFJVXOATGZTHD-UHFFFAOYSA-N 0.000 description 2
- UGZUVYDKAYNCII-ULQDDVLXSA-N Arg-Phe-Leu Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(C)C)C(O)=O UGZUVYDKAYNCII-ULQDDVLXSA-N 0.000 description 2
- NGYHSXDNNOFHNE-AVGNSLFASA-N Arg-Pro-Leu Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC(C)C)C(O)=O NGYHSXDNNOFHNE-AVGNSLFASA-N 0.000 description 2
- YCYXHLZRUSJITQ-SRVKXCTJSA-N Arg-Pro-Pro Chemical compound NC(=N)NCCC[C@H](N)C(=O)N1CCC[C@H]1C(=O)N1[C@H](C(O)=O)CCC1 YCYXHLZRUSJITQ-SRVKXCTJSA-N 0.000 description 2
- OWSMKCJUBAPHED-JYJNAYRXSA-N Arg-Pro-Tyr Chemical compound NC(N)=NCCC[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 OWSMKCJUBAPHED-JYJNAYRXSA-N 0.000 description 2
- FBXMCPLCVYUWBO-BPUTZDHNSA-N Arg-Ser-Trp Chemical compound C1=CC=C2C(=C1)C(=CN2)C[C@@H](C(=O)O)NC(=O)[C@H](CO)NC(=O)[C@H](CCCN=C(N)N)N FBXMCPLCVYUWBO-BPUTZDHNSA-N 0.000 description 2
- OQPAZKMGCWPERI-GUBZILKMSA-N Arg-Ser-Val Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CO)C(=O)N[C@@H](C(C)C)C(O)=O OQPAZKMGCWPERI-GUBZILKMSA-N 0.000 description 2
- CNBIWSCSSCAINS-UFYCRDLUSA-N Arg-Tyr-Tyr Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O CNBIWSCSSCAINS-UFYCRDLUSA-N 0.000 description 2
- SUMJNGAMIQSNGX-TUAOUCFPSA-N Arg-Val-Pro Chemical compound CC(C)[C@H](NC(=O)[C@@H](N)CCCNC(N)=N)C(=O)N1CCC[C@@H]1C(O)=O SUMJNGAMIQSNGX-TUAOUCFPSA-N 0.000 description 2
- ZZXMOQIUIJJOKZ-ZLUOBGJFSA-N Asn-Asn-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H](CC(N)=O)NC(=O)[C@@H](N)CC(N)=O ZZXMOQIUIJJOKZ-ZLUOBGJFSA-N 0.000 description 2
- UPALZCBCKAMGIY-PEFMBERDSA-N Asn-Gln-Ile Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O UPALZCBCKAMGIY-PEFMBERDSA-N 0.000 description 2
- BZMWJLLUAKSIMH-FXQIFTODSA-N Asn-Glu-Glu Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O BZMWJLLUAKSIMH-FXQIFTODSA-N 0.000 description 2
- GJFYPBDMUGGLFR-NKWVEPMBSA-N Asn-Gly-Pro Chemical compound C1C[C@@H](N(C1)C(=O)CNC(=O)[C@H](CC(=O)N)N)C(=O)O GJFYPBDMUGGLFR-NKWVEPMBSA-N 0.000 description 2
- JQBCANGGAVVERB-CFMVVWHZSA-N Asn-Ile-Tyr Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)O)NC(=O)[C@H](CC(=O)N)N JQBCANGGAVVERB-CFMVVWHZSA-N 0.000 description 2
- HNXWVVHIGTZTBO-LKXGYXEUSA-N Asn-Ser-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@H](CO)NC(=O)[C@@H](N)CC(N)=O HNXWVVHIGTZTBO-LKXGYXEUSA-N 0.000 description 2
- FMNBYVSGRCXWEK-FOHZUACHSA-N Asn-Thr-Gly Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)NCC(O)=O FMNBYVSGRCXWEK-FOHZUACHSA-N 0.000 description 2
- ZLGKHJHFYSRUBH-FXQIFTODSA-N Asp-Arg-Asp Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(O)=O)C(O)=O ZLGKHJHFYSRUBH-FXQIFTODSA-N 0.000 description 2
- JGDBHIVECJGXJA-FXQIFTODSA-N Asp-Asp-Arg Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O JGDBHIVECJGXJA-FXQIFTODSA-N 0.000 description 2
- QOVWVLLHMMCFFY-ZLUOBGJFSA-N Asp-Asp-Asn Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O QOVWVLLHMMCFFY-ZLUOBGJFSA-N 0.000 description 2
- KIJLEFNHWSXHRU-NUMRIWBASA-N Asp-Gln-Thr Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O KIJLEFNHWSXHRU-NUMRIWBASA-N 0.000 description 2
- VFUXXFVCYZPOQG-WDSKDSINSA-N Asp-Glu-Gly Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)NCC(O)=O VFUXXFVCYZPOQG-WDSKDSINSA-N 0.000 description 2
- RQYMKRMRZWJGHC-BQBZGAKWSA-N Asp-Gly-Met Chemical compound CSCC[C@@H](C(=O)O)NC(=O)CNC(=O)[C@H](CC(=O)O)N RQYMKRMRZWJGHC-BQBZGAKWSA-N 0.000 description 2
- POTCZYQVVNXUIG-BQBZGAKWSA-N Asp-Gly-Pro Chemical compound OC(=O)C[C@H](N)C(=O)NCC(=O)N1CCC[C@H]1C(O)=O POTCZYQVVNXUIG-BQBZGAKWSA-N 0.000 description 2
- JNNVNVRBYUJYGS-CIUDSAMLSA-N Asp-Leu-Ala Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](C)C(O)=O JNNVNVRBYUJYGS-CIUDSAMLSA-N 0.000 description 2
- MYLZFUMPZCPJCJ-NHCYSSNCSA-N Asp-Lys-Val Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](C(C)C)C(O)=O MYLZFUMPZCPJCJ-NHCYSSNCSA-N 0.000 description 2
- YRZIYQGXTSBRLT-AVGNSLFASA-N Asp-Phe-Gln Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCC(N)=O)C(O)=O YRZIYQGXTSBRLT-AVGNSLFASA-N 0.000 description 2
- LTCKTLYKRMCFOC-KKUMJFAQSA-N Asp-Phe-Leu Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(C)C)C(O)=O LTCKTLYKRMCFOC-KKUMJFAQSA-N 0.000 description 2
- PWAIZUBWHRHYKS-MELADBBJSA-N Asp-Phe-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CC2=CC=CC=C2)NC(=O)[C@H](CC(=O)O)N)C(=O)O PWAIZUBWHRHYKS-MELADBBJSA-N 0.000 description 2
- FIAKNCXQFFKSSI-ZLUOBGJFSA-N Asp-Ser-Cys Chemical compound OC(=O)C[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@@H](CS)C(O)=O FIAKNCXQFFKSSI-ZLUOBGJFSA-N 0.000 description 2
- XYPJXLLXNSAWHZ-SRVKXCTJSA-N Asp-Ser-Tyr Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O XYPJXLLXNSAWHZ-SRVKXCTJSA-N 0.000 description 2
- PLNJUJGNLDSFOP-UWJYBYFXSA-N Asp-Tyr-Ala Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](C)C(O)=O PLNJUJGNLDSFOP-UWJYBYFXSA-N 0.000 description 2
- ZUNMTUPRQMWMHX-LSJOCFKGSA-N Asp-Val-Val Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](C(C)C)C(O)=O ZUNMTUPRQMWMHX-LSJOCFKGSA-N 0.000 description 2
- 208000006545 Chronic Obstructive Pulmonary Disease Diseases 0.000 description 2
- 208000025678 Ciliary Motility disease Diseases 0.000 description 2
- 108091029430 CpG site Proteins 0.000 description 2
- YMBAVNPKBWHDAW-CIUDSAMLSA-N Cys-Asp-Lys Chemical compound C(CCN)C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)O)NC(=O)[C@H](CS)N YMBAVNPKBWHDAW-CIUDSAMLSA-N 0.000 description 2
- VNXXMHTZQGGDSG-CIUDSAMLSA-N Cys-His-Asn Chemical compound [H]N[C@@H](CS)C(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CC(N)=O)C(O)=O VNXXMHTZQGGDSG-CIUDSAMLSA-N 0.000 description 2
- CHRCKSPMGYDLIA-SRVKXCTJSA-N Cys-Phe-Ser Chemical compound [H]N[C@@H](CS)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CO)C(O)=O CHRCKSPMGYDLIA-SRVKXCTJSA-N 0.000 description 2
- FBPFZTCFMRRESA-FSIIMWSLSA-N D-Glucitol Natural products OC[C@H](O)[C@H](O)[C@@H](O)[C@H](O)CO FBPFZTCFMRRESA-FSIIMWSLSA-N 0.000 description 2
- 102000004190 Enzymes Human genes 0.000 description 2
- 108090000790 Enzymes Proteins 0.000 description 2
- 241000714165 Feline leukemia virus Species 0.000 description 2
- PGPJSRSLQNXBDT-YUMQZZPRSA-N Gln-Arg-Gly Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)NCC(O)=O PGPJSRSLQNXBDT-YUMQZZPRSA-N 0.000 description 2
- HWEINOMSWQSJDC-SRVKXCTJSA-N Gln-Leu-Arg Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O HWEINOMSWQSJDC-SRVKXCTJSA-N 0.000 description 2
- CELXWPDNIGWCJN-WDCWCFNPSA-N Gln-Lys-Thr Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)O)C(O)=O CELXWPDNIGWCJN-WDCWCFNPSA-N 0.000 description 2
- XZUUUKNKNWVPHQ-JYJNAYRXSA-N Gln-Phe-Leu Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(C)C)C(O)=O XZUUUKNKNWVPHQ-JYJNAYRXSA-N 0.000 description 2
- NHMRJKKAVMENKJ-WDCWCFNPSA-N Gln-Thr-Leu Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(C)C)C(O)=O NHMRJKKAVMENKJ-WDCWCFNPSA-N 0.000 description 2
- HNAUFGBKJLTWQE-IFFSRLJSSA-N Gln-Val-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](C(C)C)NC(=O)[C@H](CCC(=O)N)N)O HNAUFGBKJLTWQE-IFFSRLJSSA-N 0.000 description 2
- JJKKWYQVHRUSDG-GUBZILKMSA-N Glu-Ala-Lys Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](C)C(=O)N[C@@H](CCCCN)C(O)=O JJKKWYQVHRUSDG-GUBZILKMSA-N 0.000 description 2
- KBKGRMNVKPSQIF-XDTLVQLUSA-N Glu-Ala-Tyr Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](C)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O KBKGRMNVKPSQIF-XDTLVQLUSA-N 0.000 description 2
- IESFZVCAVACGPH-PEFMBERDSA-N Glu-Asp-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@H](CC(O)=O)NC(=O)[C@@H](N)CCC(O)=O IESFZVCAVACGPH-PEFMBERDSA-N 0.000 description 2
- PAQUJCSYVIBPLC-AVGNSLFASA-N Glu-Asp-Phe Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 PAQUJCSYVIBPLC-AVGNSLFASA-N 0.000 description 2
- PBFGQTGPSKWHJA-QEJZJMRPSA-N Glu-Asp-Trp Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(O)=O PBFGQTGPSKWHJA-QEJZJMRPSA-N 0.000 description 2
- AIGROOHQXCACHL-WDSKDSINSA-N Glu-Gly-Ala Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)NCC(=O)N[C@@H](C)C(O)=O AIGROOHQXCACHL-WDSKDSINSA-N 0.000 description 2
- UGSVSNXPJJDJKL-SDDRHHMPSA-N Glu-Leu-Pro Chemical compound CC(C)C[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CCC(=O)O)N UGSVSNXPJJDJKL-SDDRHHMPSA-N 0.000 description 2
- FBEJIDRSQCGFJI-GUBZILKMSA-N Glu-Leu-Ser Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(O)=O FBEJIDRSQCGFJI-GUBZILKMSA-N 0.000 description 2
- BCYGDJXHAGZNPQ-DCAQKATOSA-N Glu-Lys-Glu Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCC(O)=O)C(O)=O BCYGDJXHAGZNPQ-DCAQKATOSA-N 0.000 description 2
- SOEPMWQCTJITPZ-SRVKXCTJSA-N Glu-Met-Lys Chemical compound CSCC[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CCC(=O)O)N SOEPMWQCTJITPZ-SRVKXCTJSA-N 0.000 description 2
- HHSKZJZWQFPSKN-AVGNSLFASA-N Glu-Tyr-Asp Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(O)=O)C(O)=O HHSKZJZWQFPSKN-AVGNSLFASA-N 0.000 description 2
- HQTDNEZTGZUWSY-XVKPBYJWSA-N Glu-Val-Gly Chemical compound CC(C)[C@H](NC(=O)[C@@H](N)CCC(O)=O)C(=O)NCC(O)=O HQTDNEZTGZUWSY-XVKPBYJWSA-N 0.000 description 2
- VSVZIEVNUYDAFR-YUMQZZPRSA-N Gly-Ala-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)CN VSVZIEVNUYDAFR-YUMQZZPRSA-N 0.000 description 2
- KFMBRBPXHVMDFN-UWVGGRQHSA-N Gly-Arg-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)CN)CCCNC(N)=N KFMBRBPXHVMDFN-UWVGGRQHSA-N 0.000 description 2
- YYQGVXNKAXUTJU-YUMQZZPRSA-N Gly-Cys-His Chemical compound NCC(=O)N[C@@H](CS)C(=O)N[C@@H](Cc1cnc[nH]1)C(O)=O YYQGVXNKAXUTJU-YUMQZZPRSA-N 0.000 description 2
- STVHDEHTKFXBJQ-LAEOZQHASA-N Gly-Glu-Ile Chemical compound [H]NCC(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O STVHDEHTKFXBJQ-LAEOZQHASA-N 0.000 description 2
- QSVMIMFAAZPCAQ-PMVVWTBXSA-N Gly-His-Thr Chemical compound [H]NCC(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H]([C@@H](C)O)C(O)=O QSVMIMFAAZPCAQ-PMVVWTBXSA-N 0.000 description 2
- ITZOBNKQDZEOCE-NHCYSSNCSA-N Gly-Ile-Lys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)CN ITZOBNKQDZEOCE-NHCYSSNCSA-N 0.000 description 2
- LRQXRHGQEVWGPV-NHCYSSNCSA-N Gly-Leu-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@H](CC(C)C)NC(=O)CN LRQXRHGQEVWGPV-NHCYSSNCSA-N 0.000 description 2
- JPAACTMBBBGAAR-HOTGVXAUSA-N Gly-Leu-Trp Chemical compound C1=CC=C2C(C[C@H](NC(=O)[C@@H](NC(=O)CN)CC(C)C)C(O)=O)=CNC2=C1 JPAACTMBBBGAAR-HOTGVXAUSA-N 0.000 description 2
- OCPPBNKYGYSLOE-IUCAKERBSA-N Gly-Pro-Met Chemical compound CSCC[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1C(=O)CN OCPPBNKYGYSLOE-IUCAKERBSA-N 0.000 description 2
- MKIAPEZXQDILRR-YUMQZZPRSA-N Gly-Ser-His Chemical compound C1=C(NC=N1)C[C@@H](C(=O)O)NC(=O)[C@H](CO)NC(=O)CN MKIAPEZXQDILRR-YUMQZZPRSA-N 0.000 description 2
- BNMRSWQOHIQTFL-JSGCOSHPSA-N Gly-Val-Phe Chemical compound NCC(=O)N[C@@H](C(C)C)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 BNMRSWQOHIQTFL-JSGCOSHPSA-N 0.000 description 2
- 208000031220 Hemophilia Diseases 0.000 description 2
- 208000009292 Hemophilia A Diseases 0.000 description 2
- ORERHHPZDDEMSC-VGDYDELISA-N His-Ile-Ser Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CO)C(=O)O)NC(=O)[C@H](CC1=CN=CN1)N ORERHHPZDDEMSC-VGDYDELISA-N 0.000 description 2
- AIPUZFXMXAHZKY-QWRGUYRKSA-N His-Leu-Gly Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CC(C)C)C(=O)NCC(O)=O AIPUZFXMXAHZKY-QWRGUYRKSA-N 0.000 description 2
- RNMNYMDTESKEAJ-KKUMJFAQSA-N His-Leu-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](CC(C)C)NC(=O)[C@@H](N)CC1=CN=CN1 RNMNYMDTESKEAJ-KKUMJFAQSA-N 0.000 description 2
- DPQIPEAHIYMUEJ-IHRRRGAJSA-N His-Lys-Met Chemical compound CSCC[C@@H](C(=O)O)NC(=O)[C@H](CCCCN)NC(=O)[C@H](CC1=CN=CN1)N DPQIPEAHIYMUEJ-IHRRRGAJSA-N 0.000 description 2
- IGBBXBFSLKRHJB-BZSNNMDCSA-N His-Lys-Phe Chemical compound C([C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC=1C=CC=CC=1)C(O)=O)C1=CN=CN1 IGBBXBFSLKRHJB-BZSNNMDCSA-N 0.000 description 2
- PGXZHYYGOPKYKM-IHRRRGAJSA-N His-Pro-Lys Chemical compound C1C[C@H](N(C1)C(=O)[C@H](CC2=CN=CN2)N)C(=O)N[C@@H](CCCCN)C(=O)O PGXZHYYGOPKYKM-IHRRRGAJSA-N 0.000 description 2
- OWYIDJCNRWRSJY-QTKMDUPCSA-N His-Pro-Thr Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N1CCC[C@H]1C(=O)N[C@@H]([C@@H](C)O)C(O)=O OWYIDJCNRWRSJY-QTKMDUPCSA-N 0.000 description 2
- FLXCRBXJRJSDHX-AVGNSLFASA-N His-Pro-Val Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N1CCC[C@H]1C(=O)N[C@@H](C(C)C)C(O)=O FLXCRBXJRJSDHX-AVGNSLFASA-N 0.000 description 2
- VXZZUXWAOMWWJH-QTKMDUPCSA-N His-Thr-Val Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C(C)C)C(O)=O VXZZUXWAOMWWJH-QTKMDUPCSA-N 0.000 description 2
- FFYYUUWROYYKFY-IHRRRGAJSA-N His-Val-Leu Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC(C)C)C(O)=O FFYYUUWROYYKFY-IHRRRGAJSA-N 0.000 description 2
- 241000701024 Human betaherpesvirus 5 Species 0.000 description 2
- PMMYEEVYMWASQN-DMTCNVIQSA-N Hydroxyproline Chemical compound O[C@H]1CN[C@H](C(O)=O)C1 PMMYEEVYMWASQN-DMTCNVIQSA-N 0.000 description 2
- TZCGZYWNIDZZMR-UHFFFAOYSA-N Ile-Arg-Ala Natural products CCC(C)C(N)C(=O)NC(C(=O)NC(C)C(O)=O)CCCN=C(N)N TZCGZYWNIDZZMR-UHFFFAOYSA-N 0.000 description 2
- ZGGWRNBSBOHIGH-HVTMNAMFSA-N Ile-Gln-His Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)N ZGGWRNBSBOHIGH-HVTMNAMFSA-N 0.000 description 2
- DMZOUKXXHJQPTL-GRLWGSQLSA-N Ile-Gln-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)N[C@@H]([C@@H](C)CC)C(=O)O)N DMZOUKXXHJQPTL-GRLWGSQLSA-N 0.000 description 2
- XLCZWMJPVGRWHJ-KQXIARHKSA-N Ile-Glu-Pro Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)N1CCC[C@@H]1C(=O)O)N XLCZWMJPVGRWHJ-KQXIARHKSA-N 0.000 description 2
- GQKSJYINYYWPMR-NGZCFLSTSA-N Ile-Gly-Pro Chemical compound CC[C@H](C)[C@@H](C(=O)NCC(=O)N1CCC[C@@H]1C(=O)O)N GQKSJYINYYWPMR-NGZCFLSTSA-N 0.000 description 2
- FZWVCYCYWCLQDH-NHCYSSNCSA-N Ile-Leu-Gly Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(C)C)C(=O)NCC(=O)O)N FZWVCYCYWCLQDH-NHCYSSNCSA-N 0.000 description 2
- IOVUXUSIGXCREV-DKIMLUQUSA-N Ile-Leu-Phe Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 IOVUXUSIGXCREV-DKIMLUQUSA-N 0.000 description 2
- PMMMQRVUMVURGJ-XUXIUFHCSA-N Ile-Leu-Pro Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N1CCC[C@H]1C(O)=O PMMMQRVUMVURGJ-XUXIUFHCSA-N 0.000 description 2
- CNMOKANDJMLAIF-CIQUZCHMSA-N Ile-Thr-Ala Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C)C(O)=O CNMOKANDJMLAIF-CIQUZCHMSA-N 0.000 description 2
- 208000022559 Inflammatory bowel disease Diseases 0.000 description 2
- PMGDADKJMCOXHX-UHFFFAOYSA-N L-Arginyl-L-glutamine acetate Natural products NC(=N)NCCCC(N)C(=O)NC(CCC(N)=O)C(O)=O PMGDADKJMCOXHX-UHFFFAOYSA-N 0.000 description 2
- XUJNEKJLAYXESH-REOHCLBHSA-N L-Cysteine Chemical compound SC[C@H](N)C(O)=O XUJNEKJLAYXESH-REOHCLBHSA-N 0.000 description 2
- DCXYFEDJOCDNAF-REOHCLBHSA-N L-asparagine Chemical compound OC(=O)[C@@H](N)CC(N)=O DCXYFEDJOCDNAF-REOHCLBHSA-N 0.000 description 2
- HNDVDQJCIGZPNO-YFKPBYRVSA-N L-histidine Chemical compound OC(=O)[C@@H](N)CC1=CN=CN1 HNDVDQJCIGZPNO-YFKPBYRVSA-N 0.000 description 2
- KFKWRHQBZQICHA-STQMWFEESA-N L-leucyl-L-phenylalanine Natural products CC(C)C[C@H](N)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 KFKWRHQBZQICHA-STQMWFEESA-N 0.000 description 2
- QIVBCDIJIAJPQS-VIFPVBQESA-N L-tryptophane Chemical compound C1=CC=C2C(C[C@H](N)C(O)=O)=CNC2=C1 QIVBCDIJIAJPQS-VIFPVBQESA-N 0.000 description 2
- KZSNJWFQEVHDMF-BYPYZUCNSA-N L-valine Chemical compound CC(C)[C@H](N)C(O)=O KZSNJWFQEVHDMF-BYPYZUCNSA-N 0.000 description 2
- LZDNBBYBDGBADK-UHFFFAOYSA-N L-valyl-L-tryptophan Natural products C1=CC=C2C(CC(NC(=O)C(N)C(C)C)C(O)=O)=CNC2=C1 LZDNBBYBDGBADK-UHFFFAOYSA-N 0.000 description 2
- WGNOPSQMIQERPK-GARJFASQSA-N Leu-Asn-Pro Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N1CCC[C@@H]1C(=O)O)N WGNOPSQMIQERPK-GARJFASQSA-N 0.000 description 2
- WGNOPSQMIQERPK-UHFFFAOYSA-N Leu-Asn-Pro Natural products CC(C)CC(N)C(=O)NC(CC(=O)N)C(=O)N1CCCC1C(=O)O WGNOPSQMIQERPK-UHFFFAOYSA-N 0.000 description 2
- GPICTNQYKHHHTH-GUBZILKMSA-N Leu-Gln-Ser Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CO)C(O)=O GPICTNQYKHHHTH-GUBZILKMSA-N 0.000 description 2
- HPBCTWSUJOGJSH-MNXVOIDGSA-N Leu-Glu-Ile Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O HPBCTWSUJOGJSH-MNXVOIDGSA-N 0.000 description 2
- OXRLYTYUXAQTHP-YUMQZZPRSA-N Leu-Gly-Ala Chemical compound [H]N[C@@H](CC(C)C)C(=O)NCC(=O)N[C@@H](C)C(O)=O OXRLYTYUXAQTHP-YUMQZZPRSA-N 0.000 description 2
- QPXBPQUGXHURGP-UWVGGRQHSA-N Leu-Gly-Met Chemical compound CC(C)C[C@@H](C(=O)NCC(=O)N[C@@H](CCSC)C(=O)O)N QPXBPQUGXHURGP-UWVGGRQHSA-N 0.000 description 2
- KXODZBLFVFSLAI-AVGNSLFASA-N Leu-His-Glu Chemical compound OC(=O)CC[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@@H](N)CC(C)C)CC1=CN=CN1 KXODZBLFVFSLAI-AVGNSLFASA-N 0.000 description 2
- ORWTWZXGDBYVCP-BJDJZHNGSA-N Leu-Ile-Cys Chemical compound SC[C@@H](C(O)=O)NC(=O)[C@H]([C@@H](C)CC)NC(=O)[C@@H](N)CC(C)C ORWTWZXGDBYVCP-BJDJZHNGSA-N 0.000 description 2
- SEMUSFOBZGKBGW-YTFOTSKYSA-N Leu-Ile-Ile Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O SEMUSFOBZGKBGW-YTFOTSKYSA-N 0.000 description 2
- DSFYPIUSAMSERP-IHRRRGAJSA-N Leu-Leu-Arg Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@H](C(O)=O)CCCN=C(N)N DSFYPIUSAMSERP-IHRRRGAJSA-N 0.000 description 2
- LXKNSJLSGPNHSK-KKUMJFAQSA-N Leu-Leu-Lys Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(=O)O)N LXKNSJLSGPNHSK-KKUMJFAQSA-N 0.000 description 2
- UBZGNBKMIJHOHL-BZSNNMDCSA-N Leu-Leu-Phe Chemical compound CC(C)C[C@H]([NH3+])C(=O)N[C@@H](CC(C)C)C(=O)N[C@H](C([O-])=O)CC1=CC=CC=C1 UBZGNBKMIJHOHL-BZSNNMDCSA-N 0.000 description 2
- IEWBEPKLKUXQBU-VOAKCMCISA-N Leu-Leu-Thr Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O IEWBEPKLKUXQBU-VOAKCMCISA-N 0.000 description 2
- WXUOJXIGOPMDJM-SRVKXCTJSA-N Leu-Lys-Asn Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(N)=O)C(O)=O WXUOJXIGOPMDJM-SRVKXCTJSA-N 0.000 description 2
- BIZNDKMFQHDOIE-KKUMJFAQSA-N Leu-Phe-Asn Chemical compound CC(C)C[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](CC(N)=O)C(O)=O)CC1=CC=CC=C1 BIZNDKMFQHDOIE-KKUMJFAQSA-N 0.000 description 2
- YESNGRDJQWDYLH-KKUMJFAQSA-N Leu-Phe-Cys Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CS)C(=O)O)N YESNGRDJQWDYLH-KKUMJFAQSA-N 0.000 description 2
- UHNQRAFSEBGZFZ-YESZJQIVSA-N Leu-Phe-Pro Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N2CCC[C@@H]2C(=O)O)N UHNQRAFSEBGZFZ-YESZJQIVSA-N 0.000 description 2
- JIHDFWWRYHSAQB-GUBZILKMSA-N Leu-Ser-Glu Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@H](C(O)=O)CCC(O)=O JIHDFWWRYHSAQB-GUBZILKMSA-N 0.000 description 2
- ICYRCNICGBJLGM-HJGDQZAQSA-N Leu-Thr-Asp Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@H](C(O)=O)CC(O)=O ICYRCNICGBJLGM-HJGDQZAQSA-N 0.000 description 2
- LFSQWRSVPNKJGP-WDCWCFNPSA-N Leu-Thr-Glu Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@H](C(O)=O)CCC(O)=O LFSQWRSVPNKJGP-WDCWCFNPSA-N 0.000 description 2
- HGLKOTPFWOMPOB-MEYUZBJRSA-N Leu-Thr-Tyr Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 HGLKOTPFWOMPOB-MEYUZBJRSA-N 0.000 description 2
- WUHBLPVELFTPQK-KKUMJFAQSA-N Leu-Tyr-Asn Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(N)=O)C(O)=O WUHBLPVELFTPQK-KKUMJFAQSA-N 0.000 description 2
- WFCKERTZVCQXKH-KBPBESRZSA-N Leu-Tyr-Gly Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)NCC(O)=O WFCKERTZVCQXKH-KBPBESRZSA-N 0.000 description 2
- AXVIGSRGTMNSJU-YESZJQIVSA-N Leu-Tyr-Pro Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)N2CCC[C@@H]2C(=O)O)N AXVIGSRGTMNSJU-YESZJQIVSA-N 0.000 description 2
- FBNPMTNBFFAMMH-UHFFFAOYSA-N Leu-Val-Arg Natural products CC(C)CC(N)C(=O)NC(C(C)C)C(=O)NC(C(O)=O)CCCN=C(N)N FBNPMTNBFFAMMH-UHFFFAOYSA-N 0.000 description 2
- TUIOUEWKFFVNLH-DCAQKATOSA-N Leu-Val-Cys Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CS)C(O)=O TUIOUEWKFFVNLH-DCAQKATOSA-N 0.000 description 2
- YQFZRHYZLARWDY-IHRRRGAJSA-N Leu-Val-Lys Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@H](C(O)=O)CCCCN YQFZRHYZLARWDY-IHRRRGAJSA-N 0.000 description 2
- 239000012097 Lipofectamine 2000 Substances 0.000 description 2
- ALSRJRIWBNENFY-DCAQKATOSA-N Lys-Arg-Asn Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(N)=O)C(O)=O ALSRJRIWBNENFY-DCAQKATOSA-N 0.000 description 2
- ZXEUFAVXODIPHC-GUBZILKMSA-N Lys-Glu-Asn Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O ZXEUFAVXODIPHC-GUBZILKMSA-N 0.000 description 2
- IMAKMJCBYCSMHM-AVGNSLFASA-N Lys-Glu-Lys Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@H](C(O)=O)CCCCN IMAKMJCBYCSMHM-AVGNSLFASA-N 0.000 description 2
- WRODMZBHNNPRLN-SRVKXCTJSA-N Lys-Leu-Ser Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(O)=O WRODMZBHNNPRLN-SRVKXCTJSA-N 0.000 description 2
- SQXZLVXQXWILKW-KKUMJFAQSA-N Lys-Ser-Phe Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O SQXZLVXQXWILKW-KKUMJFAQSA-N 0.000 description 2
- WINFHLHJTRGLCV-BZSNNMDCSA-N Lys-Tyr-Lys Chemical compound NCCCC[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](CCCCN)C(O)=O)CC1=CC=C(O)C=C1 WINFHLHJTRGLCV-BZSNNMDCSA-N 0.000 description 2
- 208000015439 Lysosomal storage disease Diseases 0.000 description 2
- 101710125418 Major capsid protein Proteins 0.000 description 2
- 241000124008 Mammalia Species 0.000 description 2
- WXHHTBVYQOSYSL-FXQIFTODSA-N Met-Ala-Ser Chemical compound CSCC[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@@H](CO)C(O)=O WXHHTBVYQOSYSL-FXQIFTODSA-N 0.000 description 2
- AWOMRHGUWFBDNU-ZPFDUUQYSA-N Met-Gln-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)N)NC(=O)[C@H](CCSC)N AWOMRHGUWFBDNU-ZPFDUUQYSA-N 0.000 description 2
- RZJOHSFAEZBWLK-CIUDSAMLSA-N Met-Gln-Ser Chemical compound CSCC[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)N[C@@H](CO)C(=O)O)N RZJOHSFAEZBWLK-CIUDSAMLSA-N 0.000 description 2
- AETNZPKUUYYYEK-CIUDSAMLSA-N Met-Glu-Asn Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O AETNZPKUUYYYEK-CIUDSAMLSA-N 0.000 description 2
- SXWQMBGNFXAGAT-FJXKBIBVSA-N Met-Gly-Thr Chemical compound CSCC[C@H](N)C(=O)NCC(=O)N[C@@H]([C@@H](C)O)C(O)=O SXWQMBGNFXAGAT-FJXKBIBVSA-N 0.000 description 2
- JCMMNFZUKMMECJ-DCAQKATOSA-N Met-Lys-Asn Chemical compound [H]N[C@@H](CCSC)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(N)=O)C(O)=O JCMMNFZUKMMECJ-DCAQKATOSA-N 0.000 description 2
- RDLSEGZJMYGFNS-FXQIFTODSA-N Met-Ser-Asp Chemical compound [H]N[C@@H](CCSC)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(O)=O)C(O)=O RDLSEGZJMYGFNS-FXQIFTODSA-N 0.000 description 2
- GMMLGMFBYCFCCX-KZVJFYERSA-N Met-Thr-Ala Chemical compound CSCC[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C)C(O)=O GMMLGMFBYCFCCX-KZVJFYERSA-N 0.000 description 2
- 241000714177 Murine leukemia virus Species 0.000 description 2
- AUEJLPRZGVVDNU-UHFFFAOYSA-N N-L-tyrosyl-L-leucine Natural products CC(C)CC(C(O)=O)NC(=O)C(N)CC1=CC=C(O)C=C1 AUEJLPRZGVVDNU-UHFFFAOYSA-N 0.000 description 2
- 108010066427 N-valyltryptophan Proteins 0.000 description 2
- 108020004485 Nonsense Codon Proteins 0.000 description 2
- 108091060545 Nonsense suppressor Proteins 0.000 description 2
- OJUMUUXGSXUZJZ-SRVKXCTJSA-N Phe-Asp-Ser Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CO)C(O)=O OJUMUUXGSXUZJZ-SRVKXCTJSA-N 0.000 description 2
- TXKWKTWYTIAZSV-KKUMJFAQSA-N Phe-Leu-Cys Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CS)C(=O)O)NC(=O)[C@H](CC1=CC=CC=C1)N TXKWKTWYTIAZSV-KKUMJFAQSA-N 0.000 description 2
- RMKGXGPQIPLTFC-KKUMJFAQSA-N Phe-Lys-Asn Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(N)=O)C(O)=O RMKGXGPQIPLTFC-KKUMJFAQSA-N 0.000 description 2
- XZQYIJALMGEUJD-OEAJRASXSA-N Phe-Lys-Thr Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)O)C(O)=O XZQYIJALMGEUJD-OEAJRASXSA-N 0.000 description 2
- FQUUYTNBMIBOHS-IHRRRGAJSA-N Phe-Met-Ser Chemical compound CSCC[C@@H](C(=O)N[C@@H](CO)C(=O)O)NC(=O)[C@H](CC1=CC=CC=C1)N FQUUYTNBMIBOHS-IHRRRGAJSA-N 0.000 description 2
- MMJJFXWMCMJMQA-STQMWFEESA-N Phe-Pro-Gly Chemical compound C([C@H](N)C(=O)N1[C@@H](CCC1)C(=O)NCC(O)=O)C1=CC=CC=C1 MMJJFXWMCMJMQA-STQMWFEESA-N 0.000 description 2
- ZJPGOXWRFNKIQL-JYJNAYRXSA-N Phe-Pro-Pro Chemical compound C([C@H](N)C(=O)N1[C@@H](CCC1)C(=O)N1[C@@H](CCC1)C(O)=O)C1=CC=CC=C1 ZJPGOXWRFNKIQL-JYJNAYRXSA-N 0.000 description 2
- BPCLGWHVPVTTFM-QWRGUYRKSA-N Phe-Ser-Gly Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CO)C(=O)NCC(O)=O BPCLGWHVPVTTFM-QWRGUYRKSA-N 0.000 description 2
- SHUFSZDAIPLZLF-BEAPCOKYSA-N Phe-Thr-Pro Chemical compound C[C@H]([C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CC2=CC=CC=C2)N)O SHUFSZDAIPLZLF-BEAPCOKYSA-N 0.000 description 2
- RGMLUHANLDVMPB-ULQDDVLXSA-N Phe-Val-Lys Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CC1=CC=CC=C1)N RGMLUHANLDVMPB-ULQDDVLXSA-N 0.000 description 2
- OLHDPZMYUSBGDE-GUBZILKMSA-N Pro-Arg-Cys Chemical compound C1C[C@H](NC1)C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CS)C(=O)O OLHDPZMYUSBGDE-GUBZILKMSA-N 0.000 description 2
- OBVCYFIHIIYIQF-CIUDSAMLSA-N Pro-Asn-Glu Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O OBVCYFIHIIYIQF-CIUDSAMLSA-N 0.000 description 2
- TXPUNZXZDVJUJQ-LPEHRKFASA-N Pro-Asn-Pro Chemical compound C1C[C@H](NC1)C(=O)N[C@@H](CC(=O)N)C(=O)N2CCC[C@@H]2C(=O)O TXPUNZXZDVJUJQ-LPEHRKFASA-N 0.000 description 2
- FRKBNXCFJBPJOL-GUBZILKMSA-N Pro-Glu-Glu Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O FRKBNXCFJBPJOL-GUBZILKMSA-N 0.000 description 2
- WFHYFCWBLSKEMS-KKUMJFAQSA-N Pro-Glu-Phe Chemical compound N([C@@H](CCC(=O)O)C(=O)N[C@@H](CC=1C=CC=CC=1)C(O)=O)C(=O)[C@@H]1CCCN1 WFHYFCWBLSKEMS-KKUMJFAQSA-N 0.000 description 2
- NFLNBHLMLYALOO-DCAQKATOSA-N Pro-Leu-Cys Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CS)C(=O)O)NC(=O)[C@@H]1CCCN1 NFLNBHLMLYALOO-DCAQKATOSA-N 0.000 description 2
- XYSXOCIWCPFOCG-IHRRRGAJSA-N Pro-Leu-Leu Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(C)C)C(O)=O XYSXOCIWCPFOCG-IHRRRGAJSA-N 0.000 description 2
- PUQRDHNIOONJJN-AVGNSLFASA-N Pro-Lys-Met Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCSC)C(O)=O PUQRDHNIOONJJN-AVGNSLFASA-N 0.000 description 2
- VGVCNKSUVSZEIE-IHRRRGAJSA-N Pro-Phe-Asn Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(N)=O)C(O)=O VGVCNKSUVSZEIE-IHRRRGAJSA-N 0.000 description 2
- SXJOPONICMGFCR-DCAQKATOSA-N Pro-Ser-Lys Chemical compound C1C[C@H](NC1)C(=O)N[C@@H](CO)C(=O)N[C@@H](CCCCN)C(=O)O SXJOPONICMGFCR-DCAQKATOSA-N 0.000 description 2
- QKDIHFHGHBYTKB-IHRRRGAJSA-N Pro-Ser-Phe Chemical compound N([C@@H](CO)C(=O)N[C@@H](CC=1C=CC=CC=1)C(O)=O)C(=O)[C@@H]1CCCN1 QKDIHFHGHBYTKB-IHRRRGAJSA-N 0.000 description 2
- CHYAYDLYYIJCKY-OSUNSFLBSA-N Pro-Thr-Ile Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O CHYAYDLYYIJCKY-OSUNSFLBSA-N 0.000 description 2
- BXHRXLMCYSZSIY-STECZYCISA-N Pro-Tyr-Ile Chemical compound CC[C@H](C)[C@H](NC(=O)[C@H](Cc1ccc(O)cc1)NC(=O)[C@@H]1CCCN1)C(O)=O BXHRXLMCYSZSIY-STECZYCISA-N 0.000 description 2
- ONIBWKKTOPOVIA-UHFFFAOYSA-N Proline Natural products OC(=O)C1CCCN1 ONIBWKKTOPOVIA-UHFFFAOYSA-N 0.000 description 2
- 101710150344 Protein Rev Proteins 0.000 description 2
- 108010092799 RNA-directed DNA polymerase Proteins 0.000 description 2
- 102000007056 Recombinant Fusion Proteins Human genes 0.000 description 2
- 108010008281 Recombinant Fusion Proteins Proteins 0.000 description 2
- PZZJMBYSYAKYPK-UWJYBYFXSA-N Ser-Ala-Tyr Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](C)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O PZZJMBYSYAKYPK-UWJYBYFXSA-N 0.000 description 2
- OLIJLNWFEQEFDM-SRVKXCTJSA-N Ser-Asp-Phe Chemical compound OC[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 OLIJLNWFEQEFDM-SRVKXCTJSA-N 0.000 description 2
- RNMRYWZYFHHOEV-CIUDSAMLSA-N Ser-Gln-Arg Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O RNMRYWZYFHHOEV-CIUDSAMLSA-N 0.000 description 2
- CDVFZMOFNJPUDD-ACZMJKKPSA-N Ser-Gln-Asn Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O CDVFZMOFNJPUDD-ACZMJKKPSA-N 0.000 description 2
- VDVYTKZBMFADQH-AVGNSLFASA-N Ser-Gln-Tyr Chemical compound OC[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 VDVYTKZBMFADQH-AVGNSLFASA-N 0.000 description 2
- GRSLLFZTTLBOQX-CIUDSAMLSA-N Ser-Glu-Met Chemical compound CSCC[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)O)NC(=O)[C@H](CO)N GRSLLFZTTLBOQX-CIUDSAMLSA-N 0.000 description 2
- IXCHOHLPHNGFTJ-YUMQZZPRSA-N Ser-Gly-His Chemical compound C1=C(NC=N1)C[C@@H](C(=O)O)NC(=O)CNC(=O)[C@H](CO)N IXCHOHLPHNGFTJ-YUMQZZPRSA-N 0.000 description 2
- OQPNSDWGAMFJNU-QWRGUYRKSA-N Ser-Gly-Tyr Chemical compound OC[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 OQPNSDWGAMFJNU-QWRGUYRKSA-N 0.000 description 2
- ZFVFHHZBCVNLGD-GUBZILKMSA-N Ser-His-Gln Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CCC(N)=O)C(O)=O ZFVFHHZBCVNLGD-GUBZILKMSA-N 0.000 description 2
- IFPBAGJBHSNYPR-ZKWXMUAHSA-N Ser-Ile-Gly Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)CC)C(=O)NCC(O)=O IFPBAGJBHSNYPR-ZKWXMUAHSA-N 0.000 description 2
- FUMGHWDRRFCKEP-CIUDSAMLSA-N Ser-Leu-Ala Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](C)C(O)=O FUMGHWDRRFCKEP-CIUDSAMLSA-N 0.000 description 2
- NLOAIFSWUUFQFR-CIUDSAMLSA-N Ser-Leu-Asp Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(O)=O)C(O)=O NLOAIFSWUUFQFR-CIUDSAMLSA-N 0.000 description 2
- IUXGJEIKJBYKOO-SRVKXCTJSA-N Ser-Leu-His Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)NC(=O)[C@H](CO)N IUXGJEIKJBYKOO-SRVKXCTJSA-N 0.000 description 2
- KCGIREHVWRXNDH-GARJFASQSA-N Ser-Leu-Pro Chemical compound CC(C)C[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CO)N KCGIREHVWRXNDH-GARJFASQSA-N 0.000 description 2
- YUJLIIRMIAGMCQ-CIUDSAMLSA-N Ser-Leu-Ser Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(O)=O YUJLIIRMIAGMCQ-CIUDSAMLSA-N 0.000 description 2
- JWOBLHJRDADHLN-KKUMJFAQSA-N Ser-Leu-Tyr Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O JWOBLHJRDADHLN-KKUMJFAQSA-N 0.000 description 2
- GZSZPKSBVAOGIE-CIUDSAMLSA-N Ser-Lys-Ala Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](C)C(O)=O GZSZPKSBVAOGIE-CIUDSAMLSA-N 0.000 description 2
- FBLNYDYPCLFTSP-IXOXFDKPSA-N Ser-Phe-Thr Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H]([C@@H](C)O)C(O)=O FBLNYDYPCLFTSP-IXOXFDKPSA-N 0.000 description 2
- QMCDMHWAKMUGJE-IHRRRGAJSA-N Ser-Phe-Val Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](C(C)C)C(O)=O QMCDMHWAKMUGJE-IHRRRGAJSA-N 0.000 description 2
- GZGFSPWOMUKKCV-NAKRPEOUSA-N Ser-Pro-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1C(=O)[C@@H](N)CO GZGFSPWOMUKKCV-NAKRPEOUSA-N 0.000 description 2
- BMKNXTJLHFIAAH-CIUDSAMLSA-N Ser-Ser-Leu Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(O)=O BMKNXTJLHFIAAH-CIUDSAMLSA-N 0.000 description 2
- CUXJENOFJXOSOZ-BIIVOSGPSA-N Ser-Ser-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CO)NC(=O)[C@H](CO)N)C(=O)O CUXJENOFJXOSOZ-BIIVOSGPSA-N 0.000 description 2
- NADLKBTYNKUJEP-KATARQTJSA-N Ser-Thr-Leu Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(C)C)C(O)=O NADLKBTYNKUJEP-KATARQTJSA-N 0.000 description 2
- HKHCTNFKZXAMIF-KKUMJFAQSA-N Ser-Tyr-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@@H](N)CO)CC1=CC=C(O)C=C1 HKHCTNFKZXAMIF-KKUMJFAQSA-N 0.000 description 2
- UKKROEYWYIHWBD-ZKWXMUAHSA-N Ser-Val-Asp Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC(O)=O)C(O)=O UKKROEYWYIHWBD-ZKWXMUAHSA-N 0.000 description 2
- SGZVZUCRAVSPKQ-FXQIFTODSA-N Ser-Val-Cys Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CS)C(=O)O)NC(=O)[C@H](CO)N SGZVZUCRAVSPKQ-FXQIFTODSA-N 0.000 description 2
- 108010034546 Serratia marcescens nuclease Proteins 0.000 description 2
- 101000629318 Severe acute respiratory syndrome coronavirus 2 Spike glycoprotein Proteins 0.000 description 2
- PXIPVTKHYLBLMZ-UHFFFAOYSA-N Sodium azide Chemical compound [Na+].[N-]=[N+]=[N-] PXIPVTKHYLBLMZ-UHFFFAOYSA-N 0.000 description 2
- DDPVJPIGACCMEH-XQXXSGGOSA-N Thr-Ala-Gln Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C)C(=O)N[C@@H](CCC(N)=O)C(O)=O DDPVJPIGACCMEH-XQXXSGGOSA-N 0.000 description 2
- DWYAUVCQDTZIJI-VZFHVOOUSA-N Thr-Ala-Ser Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@@H](CO)C(O)=O DWYAUVCQDTZIJI-VZFHVOOUSA-N 0.000 description 2
- PQLXHSACXPGWPD-GSSVUCPTSA-N Thr-Asn-Thr Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O PQLXHSACXPGWPD-GSSVUCPTSA-N 0.000 description 2
- NOWXWJLVGTVJKM-PBCZWWQYSA-N Thr-Asp-His Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)N)O NOWXWJLVGTVJKM-PBCZWWQYSA-N 0.000 description 2
- ZUUDNCOCILSYAM-KKHAAJSZSA-N Thr-Asp-Val Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](C(C)C)C(O)=O ZUUDNCOCILSYAM-KKHAAJSZSA-N 0.000 description 2
- LOHBIDZYHQQTDM-IXOXFDKPSA-N Thr-Cys-Phe Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CS)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O LOHBIDZYHQQTDM-IXOXFDKPSA-N 0.000 description 2
- UHBPFYOQQPFKQR-JHEQGTHGSA-N Thr-Gln-Gly Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCC(N)=O)C(=O)NCC(O)=O UHBPFYOQQPFKQR-JHEQGTHGSA-N 0.000 description 2
- MEJHFIOYJHTWMK-VOAKCMCISA-N Thr-Leu-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CC(C)C)NC(=O)[C@@H](N)[C@@H](C)O MEJHFIOYJHTWMK-VOAKCMCISA-N 0.000 description 2
- JWQNAFHCXKVZKZ-UVOCVTCTSA-N Thr-Lys-Thr Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)O)C(O)=O JWQNAFHCXKVZKZ-UVOCVTCTSA-N 0.000 description 2
- WNQJTLATMXYSEL-OEAJRASXSA-N Thr-Phe-Leu Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(C)C)C(O)=O WNQJTLATMXYSEL-OEAJRASXSA-N 0.000 description 2
- XKWABWFMQXMUMT-HJGDQZAQSA-N Thr-Pro-Glu Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCC(O)=O)C(O)=O XKWABWFMQXMUMT-HJGDQZAQSA-N 0.000 description 2
- IEZVHOULSUULHD-XGEHTFHBSA-N Thr-Ser-Val Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CO)C(=O)N[C@@H](C(C)C)C(O)=O IEZVHOULSUULHD-XGEHTFHBSA-N 0.000 description 2
- KVEWWQRTAVMOFT-KJEVXHAQSA-N Thr-Tyr-Val Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](C(C)C)C(O)=O KVEWWQRTAVMOFT-KJEVXHAQSA-N 0.000 description 2
- KPMIQCXJDVKWKO-IFFSRLJSSA-N Thr-Val-Glu Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O KPMIQCXJDVKWKO-IFFSRLJSSA-N 0.000 description 2
- JLTQXEOXIJMCLZ-ZVZYQTTQSA-N Trp-Gln-Val Chemical compound C1=CC=C2C(C[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](C(C)C)C(O)=O)=CNC2=C1 JLTQXEOXIJMCLZ-ZVZYQTTQSA-N 0.000 description 2
- NLLARHRWSFNEMH-NUTKFTJISA-N Trp-Lys-Ala Chemical compound C[C@@H](C(=O)O)NC(=O)[C@H](CCCCN)NC(=O)[C@H](CC1=CNC2=CC=CC=C21)N NLLARHRWSFNEMH-NUTKFTJISA-N 0.000 description 2
- RERRMBXDSFMBQE-ZFWWWQNUSA-N Trp-Met-Gly Chemical compound CSCC[C@@H](C(=O)NCC(=O)O)NC(=O)[C@H](CC1=CNC2=CC=CC=C21)N RERRMBXDSFMBQE-ZFWWWQNUSA-N 0.000 description 2
- STKZKWFOKOCSLW-UMPQAUOISA-N Trp-Thr-Val Chemical compound C1=CC=C2C(C[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](C(C)C)C(O)=O)[C@@H](C)O)=CNC2=C1 STKZKWFOKOCSLW-UMPQAUOISA-N 0.000 description 2
- QIVBCDIJIAJPQS-UHFFFAOYSA-N Tryptophan Natural products C1=CC=C2C(CC(N)C(O)=O)=CNC2=C1 QIVBCDIJIAJPQS-UHFFFAOYSA-N 0.000 description 2
- AKXBNSZMYAOGLS-STQMWFEESA-N Tyr-Arg-Gly Chemical compound NC(N)=NCCC[C@@H](C(=O)NCC(O)=O)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 AKXBNSZMYAOGLS-STQMWFEESA-N 0.000 description 2
- MNMYOSZWCKYEDI-JRQIVUDYSA-N Tyr-Asp-Thr Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O MNMYOSZWCKYEDI-JRQIVUDYSA-N 0.000 description 2
- USYGMBIIUDLYHJ-GVARAGBVSA-N Tyr-Ile-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H]([C@@H](C)CC)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 USYGMBIIUDLYHJ-GVARAGBVSA-N 0.000 description 2
- HVPPEXXUDXAPOM-MGHWNKPDSA-N Tyr-Ile-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H]([C@@H](C)CC)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 HVPPEXXUDXAPOM-MGHWNKPDSA-N 0.000 description 2
- NSGZILIDHCIZAM-KKUMJFAQSA-N Tyr-Leu-Ser Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CO)C(=O)O)NC(=O)[C@H](CC1=CC=C(C=C1)O)N NSGZILIDHCIZAM-KKUMJFAQSA-N 0.000 description 2
- DMWNPLOERDAHSY-MEYUZBJRSA-N Tyr-Leu-Thr Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O DMWNPLOERDAHSY-MEYUZBJRSA-N 0.000 description 2
- JLKVWTICWVWGSK-JYJNAYRXSA-N Tyr-Lys-Glu Chemical compound OC(=O)CC[C@@H](C(O)=O)NC(=O)[C@H](CCCCN)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 JLKVWTICWVWGSK-JYJNAYRXSA-N 0.000 description 2
- PLXQRTXVLZUNMU-RNXOBYDBSA-N Tyr-Phe-Trp Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CC2=CNC3=CC=CC=C32)C(=O)O)NC(=O)[C@H](CC4=CC=C(C=C4)O)N PLXQRTXVLZUNMU-RNXOBYDBSA-N 0.000 description 2
- RWOKVQUCENPXGE-IHRRRGAJSA-N Tyr-Ser-Arg Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CO)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O RWOKVQUCENPXGE-IHRRRGAJSA-N 0.000 description 2
- ULUXAIYMVXLDQP-PMVMPFDFSA-N Tyr-Trp-His Chemical compound C1=CC=C2C(=C1)C(=CN2)C[C@@H](C(=O)N[C@@H](CC3=CN=CN3)C(=O)O)NC(=O)[C@H](CC4=CC=C(C=C4)O)N ULUXAIYMVXLDQP-PMVMPFDFSA-N 0.000 description 2
- JQOMHZMWQHXALX-FHWLQOOXSA-N Tyr-Tyr-Glu Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CCC(O)=O)C(O)=O JQOMHZMWQHXALX-FHWLQOOXSA-N 0.000 description 2
- AGDDLOQMXUQPDY-BZSNNMDCSA-N Tyr-Tyr-Ser Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CO)C(O)=O AGDDLOQMXUQPDY-BZSNNMDCSA-N 0.000 description 2
- IVXJODPZRWHCCR-JYJNAYRXSA-N Val-Arg-Phe Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)O)N IVXJODPZRWHCCR-JYJNAYRXSA-N 0.000 description 2
- CVUDMNSZAIZFAE-UHFFFAOYSA-N Val-Arg-Pro Natural products NC(N)=NCCCC(NC(=O)C(N)C(C)C)C(=O)N1CCCC1C(O)=O CVUDMNSZAIZFAE-UHFFFAOYSA-N 0.000 description 2
- UDLYXGYWTVOIKU-QXEWZRGKSA-N Val-Asn-Arg Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CCCN=C(N)N)C(=O)O)N UDLYXGYWTVOIKU-QXEWZRGKSA-N 0.000 description 2
- ISERLACIZUGCDX-ZKWXMUAHSA-N Val-Asp-Ala Chemical compound C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)O)NC(=O)[C@H](C(C)C)N ISERLACIZUGCDX-ZKWXMUAHSA-N 0.000 description 2
- YLHLNFUXDBOAGX-DCAQKATOSA-N Val-Cys-His Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CS)C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)N YLHLNFUXDBOAGX-DCAQKATOSA-N 0.000 description 2
- GBESYURLQOYWLU-LAEOZQHASA-N Val-Glu-Asp Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H](CC(=O)O)C(=O)O)N GBESYURLQOYWLU-LAEOZQHASA-N 0.000 description 2
- ROLGIBMFNMZANA-GVXVVHGQSA-N Val-Glu-Leu Chemical compound CC(C)C[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)O)NC(=O)[C@H](C(C)C)N ROLGIBMFNMZANA-GVXVVHGQSA-N 0.000 description 2
- FOADDSDHGRFUOC-DZKIICNBSA-N Val-Glu-Phe Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)O)N FOADDSDHGRFUOC-DZKIICNBSA-N 0.000 description 2
- UKEVLVBHRKWECS-LSJOCFKGSA-N Val-Ile-Gly Chemical compound CC[C@H](C)[C@@H](C(=O)NCC(=O)O)NC(=O)[C@H](C(C)C)N UKEVLVBHRKWECS-LSJOCFKGSA-N 0.000 description 2
- APQIVBCUIUDSMB-OSUNSFLBSA-N Val-Ile-Thr Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H]([C@@H](C)O)C(=O)O)NC(=O)[C@H](C(C)C)N APQIVBCUIUDSMB-OSUNSFLBSA-N 0.000 description 2
- UMPVMAYCLYMYGA-ONGXEEELSA-N Val-Leu-Gly Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)NCC(O)=O UMPVMAYCLYMYGA-ONGXEEELSA-N 0.000 description 2
- HWNYVQMOLCYHEA-IHRRRGAJSA-N Val-Ser-Tyr Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CO)C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)O)N HWNYVQMOLCYHEA-IHRRRGAJSA-N 0.000 description 2
- PFMSJVIPEZMKSC-DZKIICNBSA-N Val-Tyr-Glu Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)N[C@@H](CCC(=O)O)C(=O)O)N PFMSJVIPEZMKSC-DZKIICNBSA-N 0.000 description 2
- DFQZDQPLWBSFEJ-LSJOCFKGSA-N Val-Val-Asn Chemical compound CC(C)[C@@H](C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC(=O)N)C(=O)O)N DFQZDQPLWBSFEJ-LSJOCFKGSA-N 0.000 description 2
- 108010003533 Viral Envelope Proteins Proteins 0.000 description 2
- 230000009471 action Effects 0.000 description 2
- 108010008685 alanyl-glutamyl-aspartic acid Proteins 0.000 description 2
- 108010066829 alanyl-glutamyl-aspartylprolyine Proteins 0.000 description 2
- 108010070944 alanylhistidine Proteins 0.000 description 2
- 238000005571 anion exchange chromatography Methods 0.000 description 2
- 230000000890 antigenic effect Effects 0.000 description 2
- 108010008355 arginyl-glutamine Proteins 0.000 description 2
- 108010029539 arginyl-prolyl-proline Proteins 0.000 description 2
- 108010068380 arginylarginine Proteins 0.000 description 2
- 125000003118 aryl group Chemical group 0.000 description 2
- 108010047857 aspartylglycine Proteins 0.000 description 2
- 108010092854 aspartyllysine Proteins 0.000 description 2
- 239000003833 bile salt Substances 0.000 description 2
- 230000023555 blood coagulation Effects 0.000 description 2
- 229940098773 bovine serum albumin Drugs 0.000 description 2
- 125000003178 carboxy group Chemical group [H]OC(*)=O 0.000 description 2
- 239000006143 cell culture medium Substances 0.000 description 2
- 239000003795 chemical substances by application Substances 0.000 description 2
- 238000004587 chromatography analysis Methods 0.000 description 2
- PMMYEEVYMWASQN-IMJSIDKUSA-N cis-4-Hydroxy-L-proline Chemical compound O[C@@H]1CN[C@H](C(O)=O)C1 PMMYEEVYMWASQN-IMJSIDKUSA-N 0.000 description 2
- 239000002299 complementary DNA Substances 0.000 description 2
- 238000010276 construction Methods 0.000 description 2
- XUJNEKJLAYXESH-UHFFFAOYSA-N cysteine Natural products SCC(N)C(O)=O XUJNEKJLAYXESH-UHFFFAOYSA-N 0.000 description 2
- 235000018417 cysteine Nutrition 0.000 description 2
- 108010060199 cysteinylproline Proteins 0.000 description 2
- 239000003937 drug carrier Substances 0.000 description 2
- 241001493065 dsRNA viruses Species 0.000 description 2
- 108010048367 enhanced green fluorescent protein Proteins 0.000 description 2
- 239000003797 essential amino acid Substances 0.000 description 2
- 235000020776 essential amino acid Nutrition 0.000 description 2
- 210000003527 eukaryotic cell Anatomy 0.000 description 2
- 238000001914 filtration Methods 0.000 description 2
- 239000012530 fluid Substances 0.000 description 2
- 239000007789 gas Substances 0.000 description 2
- 238000001476 gene delivery Methods 0.000 description 2
- 108010042598 glutamyl-aspartyl-glycine Proteins 0.000 description 2
- 108010000434 glycyl-alanyl-leucine Proteins 0.000 description 2
- 108010077515 glycylproline Proteins 0.000 description 2
- HNDVDQJCIGZPNO-UHFFFAOYSA-N histidine Natural products OC(=O)C(N)CC1=CN=CN1 HNDVDQJCIGZPNO-UHFFFAOYSA-N 0.000 description 2
- 108010085325 histidylproline Proteins 0.000 description 2
- 239000003906 humectant Substances 0.000 description 2
- 238000009396 hybridization Methods 0.000 description 2
- 230000028993 immune response Effects 0.000 description 2
- 230000010354 integration Effects 0.000 description 2
- AGPKZVBTJJNPAG-UHFFFAOYSA-N isoleucine Natural products CCC(C)C(N)C(O)=O AGPKZVBTJJNPAG-UHFFFAOYSA-N 0.000 description 2
- 229960000310 isoleucine Drugs 0.000 description 2
- 229930027917 kanamycin Natural products 0.000 description 2
- 229960000318 kanamycin Drugs 0.000 description 2
- SBUJHOSQTJFQJX-NOAMYHISSA-N kanamycin Chemical compound O[C@@H]1[C@@H](O)[C@H](O)[C@@H](CN)O[C@@H]1O[C@H]1[C@H](O)[C@@H](O[C@@H]2[C@@H]([C@@H](N)[C@H](O)[C@@H](CO)O2)O)[C@H](N)C[C@@H]1N SBUJHOSQTJFQJX-NOAMYHISSA-N 0.000 description 2
- 229930182823 kanamycin A Natural products 0.000 description 2
- 108010044056 leucyl-phenylalanine Proteins 0.000 description 2
- 108010030617 leucyl-phenylalanyl-valine Proteins 0.000 description 2
- 108010044348 lysyl-glutamyl-aspartic acid Proteins 0.000 description 2
- 108010076718 lysyl-glutamyl-tryptophan Proteins 0.000 description 2
- 108010072591 lysyl-leucyl-alanyl-arginine Proteins 0.000 description 2
- 108010045397 lysyl-tyrosyl-lysine Proteins 0.000 description 2
- 108010075702 lysyl-valyl-aspartyl-leucine Proteins 0.000 description 2
- 108010009298 lysylglutamic acid Proteins 0.000 description 2
- 210000004962 mammalian cell Anatomy 0.000 description 2
- 239000000463 material Substances 0.000 description 2
- 238000005259 measurement Methods 0.000 description 2
- 210000004379 membrane Anatomy 0.000 description 2
- 239000012528 membrane Substances 0.000 description 2
- 125000001360 methionine group Chemical group N[C@@H](CCSC)C(=O)* 0.000 description 2
- 210000004400 mucous membrane Anatomy 0.000 description 2
- 238000002703 mutagenesis Methods 0.000 description 2
- 231100000350 mutagenesis Toxicity 0.000 description 2
- 231100000219 mutagenic Toxicity 0.000 description 2
- 230000003505 mutagenic effect Effects 0.000 description 2
- 229940097496 nasal spray Drugs 0.000 description 2
- 230000037434 nonsense mutation Effects 0.000 description 2
- 238000004806 packaging method and process Methods 0.000 description 2
- 238000002823 phage display Methods 0.000 description 2
- 108010064486 phenylalanyl-leucyl-valine Proteins 0.000 description 2
- 108010084525 phenylalanyl-phenylalanyl-glycine Proteins 0.000 description 2
- 108010089520 pol Gene Products Proteins 0.000 description 2
- 101150088264 pol gene Proteins 0.000 description 2
- 229920000642 polymer Polymers 0.000 description 2
- 239000000843 powder Substances 0.000 description 2
- 201000009266 primary ciliary dyskinesia Diseases 0.000 description 2
- 108010020755 prolyl-glycyl-glycine Proteins 0.000 description 2
- 108010015796 prolylisoleucine Proteins 0.000 description 2
- 108010053725 prolylvaline Proteins 0.000 description 2
- 239000003380 propellant Substances 0.000 description 2
- 201000003489 pulmonary alveolar proteinosis Diseases 0.000 description 2
- 238000003753 real-time PCR Methods 0.000 description 2
- 230000001172 regenerating effect Effects 0.000 description 2
- 230000008439 repair process Effects 0.000 description 2
- 238000011160 research Methods 0.000 description 2
- 210000001533 respiratory mucosa Anatomy 0.000 description 2
- 238000005070 sampling Methods 0.000 description 2
- FSYKKLYZXJSNPZ-UHFFFAOYSA-N sarcosine Chemical compound C[NH2+]CC([O-])=O FSYKKLYZXJSNPZ-UHFFFAOYSA-N 0.000 description 2
- 238000012216 screening Methods 0.000 description 2
- 210000002966 serum Anatomy 0.000 description 2
- 238000002741 site-directed mutagenesis Methods 0.000 description 2
- 238000001542 size-exclusion chromatography Methods 0.000 description 2
- MFBOGIVSZKQAPD-UHFFFAOYSA-M sodium butyrate Chemical compound [Na+].CCCC([O-])=O MFBOGIVSZKQAPD-UHFFFAOYSA-M 0.000 description 2
- 239000011780 sodium chloride Substances 0.000 description 2
- 239000007787 solid Substances 0.000 description 2
- 239000000243 solution Substances 0.000 description 2
- 239000000600 sorbitol Substances 0.000 description 2
- 238000001228 spectrum Methods 0.000 description 2
- UCSJYZPVAKXKNQ-HZYVHMACSA-N streptomycin Chemical compound CN[C@H]1[C@H](O)[C@@H](O)[C@H](CO)O[C@H]1O[C@@H]1[C@](C=O)(O)[C@H](C)O[C@H]1O[C@@H]1[C@@H](NC(N)=N)[C@H](O)[C@@H](NC(N)=N)[C@H](O)[C@H]1O UCSJYZPVAKXKNQ-HZYVHMACSA-N 0.000 description 2
- 238000003786 synthesis reaction Methods 0.000 description 2
- 230000002194 synthesizing effect Effects 0.000 description 2
- 239000012085 test solution Substances 0.000 description 2
- 238000012360 testing method Methods 0.000 description 2
- 108010031491 threonyl-lysyl-glutamic acid Proteins 0.000 description 2
- 239000012096 transfection reagent Substances 0.000 description 2
- 230000009261 transgenic effect Effects 0.000 description 2
- 108010029384 tryptophyl-histidine Proteins 0.000 description 2
- 108010084932 tryptophyl-proline Proteins 0.000 description 2
- OUYCCCASQSFEME-UHFFFAOYSA-N tyrosine Natural products OC(=O)C(N)CC1=CC=C(O)C=C1 OUYCCCASQSFEME-UHFFFAOYSA-N 0.000 description 2
- 239000004474 valine Substances 0.000 description 2
- IBIDRSSEHFLGSD-UHFFFAOYSA-N valinyl-arginine Natural products CC(C)C(N)C(=O)NC(C(O)=O)CCCN=C(N)N IBIDRSSEHFLGSD-UHFFFAOYSA-N 0.000 description 2
- 108010073969 valyllysine Proteins 0.000 description 2
- 108010009962 valyltyrosine Proteins 0.000 description 2
- 230000003442 weekly effect Effects 0.000 description 2
- MTCFGRXMJLQNBG-REOHCLBHSA-N (2S)-2-Amino-3-hydroxypropanoic acid Chemical compound OC[C@H](N)C(O)=O MTCFGRXMJLQNBG-REOHCLBHSA-N 0.000 description 1
- WTKYBFQVZPCGAO-LURJTMIESA-N (2s)-2-(pyridin-3-ylamino)propanoic acid Chemical compound OC(=O)[C@H](C)NC1=CC=CN=C1 WTKYBFQVZPCGAO-LURJTMIESA-N 0.000 description 1
- SAAQPSNNIOGFSQ-LURJTMIESA-N (2s)-2-(pyridin-4-ylamino)propanoic acid Chemical compound OC(=O)[C@H](C)NC1=CC=NC=C1 SAAQPSNNIOGFSQ-LURJTMIESA-N 0.000 description 1
- DFZVZEMNPGABKO-ZETCQYMHSA-N (2s)-2-amino-3-pyridin-3-ylpropanoic acid Chemical compound OC(=O)[C@@H](N)CC1=CC=CN=C1 DFZVZEMNPGABKO-ZETCQYMHSA-N 0.000 description 1
- FQFVANSXYKWQOT-ZETCQYMHSA-N (2s)-2-azaniumyl-3-pyridin-4-ylpropanoate Chemical compound OC(=O)[C@@H](N)CC1=CC=NC=C1 FQFVANSXYKWQOT-ZETCQYMHSA-N 0.000 description 1
- FXGZFWDCXQRZKI-VKHMYHEASA-N (2s)-5-amino-2-nitramido-5-oxopentanoic acid Chemical compound NC(=O)CC[C@@H](C(O)=O)N[N+]([O-])=O FXGZFWDCXQRZKI-VKHMYHEASA-N 0.000 description 1
- CCAIIPMIAFGKSI-DMTCNVIQSA-N (2s,3r)-3-hydroxy-2-(methylazaniumyl)butanoate Chemical compound CN[C@@H]([C@@H](C)O)C(O)=O CCAIIPMIAFGKSI-DMTCNVIQSA-N 0.000 description 1
- CNPSFBUUYIVHAP-WHFBIAKZSA-N (2s,3s)-3-methylpyrrolidin-1-ium-2-carboxylate Chemical compound C[C@H]1CCN[C@@H]1C(O)=O CNPSFBUUYIVHAP-WHFBIAKZSA-N 0.000 description 1
- IARAAMMRAOTQKW-XOMXTNSMSA-N (4r)-4-[[(2r)-2-[[(2r)-2-[[(2r)-2-amino-4-methylsulfanylbutanoyl]amino]-3-phenylpropanoyl]amino]-4-methylpentanoyl]amino]-5-[[(2r)-1-[[(2r,3r)-1-[(2s)-2-[[(2s)-1-[[(1s)-1-carboxy-2-hydroxyethyl]amino]-4-methylsulfanyl-1-oxobutan-2-yl]carbamoyl]pyrrolidin- Chemical compound C([C@H](C(=O)N[C@H](CC(C)C)C(=O)N[C@H](CCC(O)=O)C(=O)N[C@H](C)C(=O)N[C@H]([C@H](C)CC)C(=O)N1[C@@H](CCC1)C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CO)C(O)=O)NC(=O)[C@H](N)CCSC)C1=CC=CC=C1 IARAAMMRAOTQKW-XOMXTNSMSA-N 0.000 description 1
- RYCNUMLMNKHWPZ-SNVBAGLBSA-N 1-acetyl-sn-glycero-3-phosphocholine Chemical compound CC(=O)OC[C@@H](O)COP([O-])(=O)OCC[N+](C)(C)C RYCNUMLMNKHWPZ-SNVBAGLBSA-N 0.000 description 1
- GXVUZYLYWKWJIM-UHFFFAOYSA-N 2-(2-aminoethoxy)ethanamine Chemical compound NCCOCCN GXVUZYLYWKWJIM-UHFFFAOYSA-N 0.000 description 1
- FUOOLUPWFVMBKG-UHFFFAOYSA-N 2-Aminoisobutyric acid Chemical compound CC(C)(N)C(O)=O FUOOLUPWFVMBKG-UHFFFAOYSA-N 0.000 description 1
- OIALAIQRYISUEV-UHFFFAOYSA-N 2-[2-[2-[2-[2-[2-[2-[2-[2-[2-[2-[2-[2-[2-[2-[2-[2-[2-[2-[2-[2-[2-[2-[2-[2-[2-[2-[2-[2-[2-[2-[2-[2-[2-[2-[2-[2-[2-[2-[2-[2-[2-[2-[2-[2-[2-[2-[2-[2-(2-hydroxyethoxy)ethoxy]ethoxy]ethoxy]ethoxy]ethoxy]ethoxy]ethoxy]ethoxy]ethoxy]ethoxy]ethoxy]ethoxy]ethoxy]e Polymers CCCCCCCCCCCCCCCCCC(=O)OCCOCCOCCOCCOCCOCCOCCOCCOCCOCCOCCOCCOCCOCCOCCOCCOCCOCCOCCOCCOCCOCCOCCOCCOCCOCCOCCOCCOCCOCCOCCOCCOCCOCCOCCOCCOCCOCCOCCOCCOCCOCCOCCOCCOCCOCCOCCOCCOCCOCCO OIALAIQRYISUEV-UHFFFAOYSA-N 0.000 description 1
- CDUUKBXTEOFITR-BYPYZUCNSA-N 2-methyl-L-serine Chemical compound OC[C@@]([NH3+])(C)C([O-])=O CDUUKBXTEOFITR-BYPYZUCNSA-N 0.000 description 1
- XEVFXAFXZZYFSX-UHFFFAOYSA-N 3-azabicyclo[2.1.1]hexane-4-carboxylic acid Chemical compound C1C2CC1(C(=O)O)NC2 XEVFXAFXZZYFSX-UHFFFAOYSA-N 0.000 description 1
- SBGXWWCLHIOABR-UHFFFAOYSA-N Ala Ala Gly Ala Chemical compound CC(N)C(=O)NC(C)C(=O)NCC(=O)NC(C)C(O)=O SBGXWWCLHIOABR-UHFFFAOYSA-N 0.000 description 1
- QDRGPQWIVZNJQD-CIUDSAMLSA-N Ala-Arg-Gln Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCC(N)=O)C(O)=O QDRGPQWIVZNJQD-CIUDSAMLSA-N 0.000 description 1
- FVSOUJZKYWEFOB-KBIXCLLPSA-N Ala-Gln-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@H](CCC(N)=O)NC(=O)[C@H](C)N FVSOUJZKYWEFOB-KBIXCLLPSA-N 0.000 description 1
- AWAXZRDKUHOPBO-GUBZILKMSA-N Ala-Gln-Lys Chemical compound C[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCCCN)C(O)=O AWAXZRDKUHOPBO-GUBZILKMSA-N 0.000 description 1
- MVBWLRJESQOQTM-ACZMJKKPSA-N Ala-Gln-Ser Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CO)C(O)=O MVBWLRJESQOQTM-ACZMJKKPSA-N 0.000 description 1
- VBRDBGCROKWTPV-XHNCKOQMSA-N Ala-Glu-Pro Chemical compound C[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)N1CCC[C@@H]1C(=O)O)N VBRDBGCROKWTPV-XHNCKOQMSA-N 0.000 description 1
- LMFXXZPPZDCPTA-ZKWXMUAHSA-N Ala-Gly-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)CNC(=O)[C@H](C)N LMFXXZPPZDCPTA-ZKWXMUAHSA-N 0.000 description 1
- CWEAKSWWKHGTRJ-BQBZGAKWSA-N Ala-Gly-Met Chemical compound [H]N[C@@H](C)C(=O)NCC(=O)N[C@@H](CCSC)C(O)=O CWEAKSWWKHGTRJ-BQBZGAKWSA-N 0.000 description 1
- SMCGQGDVTPFXKB-XPUUQOCRSA-N Ala-Gly-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)CNC(=O)[C@H](C)N SMCGQGDVTPFXKB-XPUUQOCRSA-N 0.000 description 1
- OKEWAFFWMHBGPT-XPUUQOCRSA-N Ala-His-Gly Chemical compound OC(=O)CNC(=O)[C@@H](NC(=O)[C@@H](N)C)CC1=CN=CN1 OKEWAFFWMHBGPT-XPUUQOCRSA-N 0.000 description 1
- IFKQPMZRDQZSHI-GHCJXIJMSA-N Ala-Ile-Asn Chemical compound [H]N[C@@H](C)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC(N)=O)C(O)=O IFKQPMZRDQZSHI-GHCJXIJMSA-N 0.000 description 1
- VNYMOTCMNHJGTG-JBDRJPRFSA-N Ala-Ile-Ser Chemical compound [H]N[C@@H](C)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CO)C(O)=O VNYMOTCMNHJGTG-JBDRJPRFSA-N 0.000 description 1
- VHVVPYOJIIQCKS-QEJZJMRPSA-N Ala-Leu-Phe Chemical compound C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 VHVVPYOJIIQCKS-QEJZJMRPSA-N 0.000 description 1
- XSTZMVAYYCJTNR-DCAQKATOSA-N Ala-Met-Leu Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CC(C)C)C(O)=O XSTZMVAYYCJTNR-DCAQKATOSA-N 0.000 description 1
- 108010011667 Ala-Phe-Ala Proteins 0.000 description 1
- CYBJZLQSUJEMAS-LFSVMHDDSA-N Ala-Phe-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CC1=CC=CC=C1)NC(=O)[C@H](C)N)O CYBJZLQSUJEMAS-LFSVMHDDSA-N 0.000 description 1
- BTRULDJUUVGRNE-DCAQKATOSA-N Ala-Pro-Lys Chemical compound C[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCCCN)C(O)=O BTRULDJUUVGRNE-DCAQKATOSA-N 0.000 description 1
- BHTBAVZSZCQZPT-GUBZILKMSA-N Ala-Pro-Met Chemical compound CSCC[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1C(=O)[C@H](C)N BHTBAVZSZCQZPT-GUBZILKMSA-N 0.000 description 1
- FFZJHQODAYHGPO-KZVJFYERSA-N Ala-Pro-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1C(=O)[C@H](C)N FFZJHQODAYHGPO-KZVJFYERSA-N 0.000 description 1
- VJVQKGYHIZPSNS-FXQIFTODSA-N Ala-Ser-Arg Chemical compound C[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@H](C(O)=O)CCCN=C(N)N VJVQKGYHIZPSNS-FXQIFTODSA-N 0.000 description 1
- RTZCUEHYUQZIDE-WHFBIAKZSA-N Ala-Ser-Gly Chemical compound C[C@H](N)C(=O)N[C@@H](CO)C(=O)NCC(O)=O RTZCUEHYUQZIDE-WHFBIAKZSA-N 0.000 description 1
- HOVPGJUNRLMIOZ-CIUDSAMLSA-N Ala-Ser-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CO)NC(=O)[C@H](C)N HOVPGJUNRLMIOZ-CIUDSAMLSA-N 0.000 description 1
- OEVCHROQUIVQFZ-YTLHQDLWSA-N Ala-Thr-Ala Chemical compound C[C@H](N)C(=O)N[C@@H]([C@H](O)C)C(=O)N[C@@H](C)C(O)=O OEVCHROQUIVQFZ-YTLHQDLWSA-N 0.000 description 1
- XQNRANMFRPCFFW-GCJQMDKQSA-N Ala-Thr-Asn Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@H](C)N)O XQNRANMFRPCFFW-GCJQMDKQSA-N 0.000 description 1
- LSMDIAAALJJLRO-XQXXSGGOSA-N Ala-Thr-Glu Chemical compound [H]N[C@@H](C)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCC(O)=O)C(O)=O LSMDIAAALJJLRO-XQXXSGGOSA-N 0.000 description 1
- AETQNIIFKCMVHP-UVBJJODRSA-N Ala-Trp-Arg Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O AETQNIIFKCMVHP-UVBJJODRSA-N 0.000 description 1
- TVUFMYKTYXTRPY-HERUPUMHSA-N Ala-Trp-Ser Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)N[C@@H](CO)C(O)=O TVUFMYKTYXTRPY-HERUPUMHSA-N 0.000 description 1
- JNJHNBXBGNJESC-KKXDTOCCSA-N Ala-Tyr-Phe Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O JNJHNBXBGNJESC-KKXDTOCCSA-N 0.000 description 1
- YJHKTAMKPGFJCT-NRPADANISA-N Ala-Val-Glu Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O YJHKTAMKPGFJCT-NRPADANISA-N 0.000 description 1
- 108700028369 Alleles Proteins 0.000 description 1
- 108020000948 Antisense Oligonucleotides Proteins 0.000 description 1
- DBKNLHKEVPZVQC-LPEHRKFASA-N Arg-Ala-Pro Chemical compound NC(N)=NCCC[C@H](N)C(=O)N[C@@H](C)C(=O)N1CCC[C@@H]1C(O)=O DBKNLHKEVPZVQC-LPEHRKFASA-N 0.000 description 1
- UISQLSIBJKEJSS-GUBZILKMSA-N Arg-Arg-Ser Chemical compound NC(N)=NCCC[C@H](N)C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CO)C(O)=O UISQLSIBJKEJSS-GUBZILKMSA-N 0.000 description 1
- RWWPBOUMKFBHAL-FXQIFTODSA-N Arg-Asn-Cys Chemical compound NC(N)=NCCC[C@H](N)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CS)C(O)=O RWWPBOUMKFBHAL-FXQIFTODSA-N 0.000 description 1
- CPSHGRGUPZBMOK-CIUDSAMLSA-N Arg-Asn-Gln Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CCC(N)=O)C(O)=O CPSHGRGUPZBMOK-CIUDSAMLSA-N 0.000 description 1
- JUWQNWXEGDYCIE-YUMQZZPRSA-N Arg-Gln-Gly Chemical compound NC(N)=NCCC[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(=O)NCC(O)=O JUWQNWXEGDYCIE-YUMQZZPRSA-N 0.000 description 1
- LMPKCSXZJSXBBL-NHCYSSNCSA-N Arg-Gln-Val Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](C(C)C)C(O)=O LMPKCSXZJSXBBL-NHCYSSNCSA-N 0.000 description 1
- OHYQKYUTLIPFOX-ZPFDUUQYSA-N Arg-Glu-Ile Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O OHYQKYUTLIPFOX-ZPFDUUQYSA-N 0.000 description 1
- PNIGSVZJNVUVJA-BQBZGAKWSA-N Arg-Gly-Asn Chemical compound NC(N)=NCCC[C@H](N)C(=O)NCC(=O)N[C@@H](CC(N)=O)C(O)=O PNIGSVZJNVUVJA-BQBZGAKWSA-N 0.000 description 1
- AUFHLLPVPSMEOG-YUMQZZPRSA-N Arg-Gly-Glu Chemical compound NC(N)=NCCC[C@H](N)C(=O)NCC(=O)N[C@@H](CCC(O)=O)C(O)=O AUFHLLPVPSMEOG-YUMQZZPRSA-N 0.000 description 1
- NVUIWHJLPSZZQC-CYDGBPFRSA-N Arg-Ile-Arg Chemical compound NC(N)=NCCC[C@H](N)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CCCN=C(N)N)C(O)=O NVUIWHJLPSZZQC-CYDGBPFRSA-N 0.000 description 1
- FLYANDHDFRGGTM-PYJNHQTQSA-N Arg-Ile-His Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)NC(=O)[C@H](CCCN=C(N)N)N FLYANDHDFRGGTM-PYJNHQTQSA-N 0.000 description 1
- UHFUZWSZQKMDSX-DCAQKATOSA-N Arg-Leu-Asn Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@H](CCCN=C(N)N)N UHFUZWSZQKMDSX-DCAQKATOSA-N 0.000 description 1
- OGSQONVYSTZIJB-WDSOQIARSA-N Arg-Leu-Trp Chemical compound CC(C)C[C@H](NC(=O)[C@@H](N)CCCN=C(N)N)C(=O)N[C@@H](Cc1c[nH]c2ccccc12)C(O)=O OGSQONVYSTZIJB-WDSOQIARSA-N 0.000 description 1
- BTJVOUQWFXABOI-IHRRRGAJSA-N Arg-Lys-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](CCCCN)NC(=O)[C@@H](N)CCCNC(N)=N BTJVOUQWFXABOI-IHRRRGAJSA-N 0.000 description 1
- FIQKRDXFTANIEJ-ULQDDVLXSA-N Arg-Phe-His Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CC2=CN=CN2)C(=O)O)NC(=O)[C@H](CCCN=C(N)N)N FIQKRDXFTANIEJ-ULQDDVLXSA-N 0.000 description 1
- DNBMCNQKNOKOSD-DCAQKATOSA-N Arg-Pro-Gln Chemical compound NC(N)=NCCC[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCC(N)=O)C(O)=O DNBMCNQKNOKOSD-DCAQKATOSA-N 0.000 description 1
- YFHATWYGAAXQCF-JYJNAYRXSA-N Arg-Pro-Phe Chemical compound NC(N)=NCCC[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 YFHATWYGAAXQCF-JYJNAYRXSA-N 0.000 description 1
- AMIQZQAAYGYKOP-FXQIFTODSA-N Arg-Ser-Asn Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(N)=O)C(O)=O AMIQZQAAYGYKOP-FXQIFTODSA-N 0.000 description 1
- ZPWMEWYQBWSGAO-ZJDVBMNYSA-N Arg-Thr-Thr Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O ZPWMEWYQBWSGAO-ZJDVBMNYSA-N 0.000 description 1
- VJIQPOJMISSUPO-BVSLBCMMSA-N Arg-Trp-Tyr Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O VJIQPOJMISSUPO-BVSLBCMMSA-N 0.000 description 1
- CGWVCWFQGXOUSJ-ULQDDVLXSA-N Arg-Tyr-Leu Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(C)C)C(O)=O CGWVCWFQGXOUSJ-ULQDDVLXSA-N 0.000 description 1
- 239000004475 Arginine Substances 0.000 description 1
- NTXNUXPCNRDMAF-WFBYXXMGSA-N Asn-Ala-Trp Chemical compound C1=CC=C2C(C[C@H](NC(=O)[C@@H](NC(=O)[C@@H](N)CC(N)=O)C)C(O)=O)=CNC2=C1 NTXNUXPCNRDMAF-WFBYXXMGSA-N 0.000 description 1
- KSBHCUSPLWRVEK-ZLUOBGJFSA-N Asn-Asn-Asp Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(O)=O)C(O)=O KSBHCUSPLWRVEK-ZLUOBGJFSA-N 0.000 description 1
- NLCDVZJDEXIDDL-BIIVOSGPSA-N Asn-Asn-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CC(=O)N)NC(=O)[C@H](CC(=O)N)N)C(=O)O NLCDVZJDEXIDDL-BIIVOSGPSA-N 0.000 description 1
- PAXHINASXXXILC-SRVKXCTJSA-N Asn-Asp-Tyr Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)O)NC(=O)[C@H](CC(=O)N)N)O PAXHINASXXXILC-SRVKXCTJSA-N 0.000 description 1
- FAEFJTCTNZTPHX-ACZMJKKPSA-N Asn-Gln-Ala Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](C)C(O)=O FAEFJTCTNZTPHX-ACZMJKKPSA-N 0.000 description 1
- HJRBIWRXULGMOA-ACZMJKKPSA-N Asn-Gln-Asp Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(O)=O)C(O)=O HJRBIWRXULGMOA-ACZMJKKPSA-N 0.000 description 1
- FUHFYEKSGWOWGZ-XHNCKOQMSA-N Asn-Gln-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CCC(=O)N)NC(=O)[C@H](CC(=O)N)N)C(=O)O FUHFYEKSGWOWGZ-XHNCKOQMSA-N 0.000 description 1
- XVAPVJNJGLWGCS-ACZMJKKPSA-N Asn-Glu-Asn Chemical compound C(CC(=O)O)[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@H](CC(=O)N)N XVAPVJNJGLWGCS-ACZMJKKPSA-N 0.000 description 1
- HYQYLOSCICEYTR-YUMQZZPRSA-N Asn-Gly-Leu Chemical compound [H]N[C@@H](CC(N)=O)C(=O)NCC(=O)N[C@@H](CC(C)C)C(O)=O HYQYLOSCICEYTR-YUMQZZPRSA-N 0.000 description 1
- OLVIPTLKNSAYRJ-YUMQZZPRSA-N Asn-Gly-Lys Chemical compound C(CCN)C[C@@H](C(=O)O)NC(=O)CNC(=O)[C@H](CC(=O)N)N OLVIPTLKNSAYRJ-YUMQZZPRSA-N 0.000 description 1
- XLZCLJRGGMBKLR-PCBIJLKTSA-N Asn-Ile-Phe Chemical compound NC(=O)C[C@H](N)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 XLZCLJRGGMBKLR-PCBIJLKTSA-N 0.000 description 1
- BZWRLDPIWKOVKB-ZPFDUUQYSA-N Asn-Leu-Ile Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O BZWRLDPIWKOVKB-ZPFDUUQYSA-N 0.000 description 1
- FODVBOKTYKYRFJ-CIUDSAMLSA-N Asn-Lys-Cys Chemical compound C(CCN)C[C@@H](C(=O)N[C@@H](CS)C(=O)O)NC(=O)[C@H](CC(=O)N)N FODVBOKTYKYRFJ-CIUDSAMLSA-N 0.000 description 1
- JWKDQOORUCYUIW-ZPFDUUQYSA-N Asn-Lys-Ile Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O JWKDQOORUCYUIW-ZPFDUUQYSA-N 0.000 description 1
- RTFWCVDISAMGEQ-SRVKXCTJSA-N Asn-Phe-Asn Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@H](CC(=O)N)N RTFWCVDISAMGEQ-SRVKXCTJSA-N 0.000 description 1
- KYQJHBWHRASMKG-ZLUOBGJFSA-N Asn-Ser-Cys Chemical compound NC(=O)C[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@@H](CS)C(O)=O KYQJHBWHRASMKG-ZLUOBGJFSA-N 0.000 description 1
- JWQWPRCDYWNVNM-ACZMJKKPSA-N Asn-Ser-Gln Chemical compound C(CC(=O)N)[C@@H](C(=O)O)NC(=O)[C@H](CO)NC(=O)[C@H](CC(=O)N)N JWQWPRCDYWNVNM-ACZMJKKPSA-N 0.000 description 1
- MKJBPDLENBUHQU-CIUDSAMLSA-N Asn-Ser-Leu Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(O)=O MKJBPDLENBUHQU-CIUDSAMLSA-N 0.000 description 1
- QIRJQYQOIKBPBZ-IHRRRGAJSA-N Asn-Tyr-Arg Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O QIRJQYQOIKBPBZ-IHRRRGAJSA-N 0.000 description 1
- KRXIWXCXOARFNT-ZLUOBGJFSA-N Asp-Ala-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H](C)NC(=O)[C@@H](N)CC(O)=O KRXIWXCXOARFNT-ZLUOBGJFSA-N 0.000 description 1
- VTYQAQFKMQTKQD-ACZMJKKPSA-N Asp-Ala-Gln Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](C)C(=O)N[C@@H](CCC(N)=O)C(O)=O VTYQAQFKMQTKQD-ACZMJKKPSA-N 0.000 description 1
- BFOYULZBKYOKAN-OLHMAJIHSA-N Asp-Asp-Thr Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O BFOYULZBKYOKAN-OLHMAJIHSA-N 0.000 description 1
- ACEDJCOOPZFUBU-CIUDSAMLSA-N Asp-Cys-Lys Chemical compound C(CCN)C[C@@H](C(=O)O)NC(=O)[C@H](CS)NC(=O)[C@H](CC(=O)O)N ACEDJCOOPZFUBU-CIUDSAMLSA-N 0.000 description 1
- BKXPJCBEHWFSTF-ACZMJKKPSA-N Asp-Gln-Asp Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(O)=O)C(O)=O BKXPJCBEHWFSTF-ACZMJKKPSA-N 0.000 description 1
- RYKWOUUZJFSJOH-FXQIFTODSA-N Asp-Gln-Glu Chemical compound C(CC(=O)N)[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)O)NC(=O)[C@H](CC(=O)O)N RYKWOUUZJFSJOH-FXQIFTODSA-N 0.000 description 1
- XAJRHVUUVUPFQL-ACZMJKKPSA-N Asp-Glu-Asp Chemical compound OC(=O)C[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(O)=O)C(O)=O XAJRHVUUVUPFQL-ACZMJKKPSA-N 0.000 description 1
- OVPHVTCDVYYTHN-AVGNSLFASA-N Asp-Glu-Phe Chemical compound OC(=O)C[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 OVPHVTCDVYYTHN-AVGNSLFASA-N 0.000 description 1
- SNDBKTFJWVEVPO-WHFBIAKZSA-N Asp-Gly-Ser Chemical compound [H]N[C@@H](CC(O)=O)C(=O)NCC(=O)N[C@@H](CO)C(O)=O SNDBKTFJWVEVPO-WHFBIAKZSA-N 0.000 description 1
- UBPMOJLRVMGTOQ-GARJFASQSA-N Asp-His-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CC2=CN=CN2)NC(=O)[C@H](CC(=O)O)N)C(=O)O UBPMOJLRVMGTOQ-GARJFASQSA-N 0.000 description 1
- NHSDEZURHWEZPN-SXTJYALSSA-N Asp-Ile-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H]([C@@H](C)CC)C(=O)O)NC(=O)[C@H](CC(=O)O)N NHSDEZURHWEZPN-SXTJYALSSA-N 0.000 description 1
- RTXQQDVBACBSCW-CFMVVWHZSA-N Asp-Ile-Tyr Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O RTXQQDVBACBSCW-CFMVVWHZSA-N 0.000 description 1
- XLILXFRAKOYEJX-GUBZILKMSA-N Asp-Leu-Gln Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(N)=O)C(O)=O XLILXFRAKOYEJX-GUBZILKMSA-N 0.000 description 1
- DWOGMPWRQQWPPF-GUBZILKMSA-N Asp-Leu-Glu Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O DWOGMPWRQQWPPF-GUBZILKMSA-N 0.000 description 1
- UJGRZQYSNYTCAX-SRVKXCTJSA-N Asp-Leu-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CC(C)C)NC(=O)[C@@H](N)CC(O)=O UJGRZQYSNYTCAX-SRVKXCTJSA-N 0.000 description 1
- QNMKWNONJGKJJC-NHCYSSNCSA-N Asp-Leu-Val Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](C(C)C)C(O)=O QNMKWNONJGKJJC-NHCYSSNCSA-N 0.000 description 1
- DJCAHYVLMSRBFR-QXEWZRGKSA-N Asp-Met-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@H](CCSC)NC(=O)[C@@H](N)CC(O)=O DJCAHYVLMSRBFR-QXEWZRGKSA-N 0.000 description 1
- BKOIIURTQAJHAT-GUBZILKMSA-N Asp-Pro-Pro Chemical compound OC(=O)C[C@H](N)C(=O)N1CCC[C@H]1C(=O)N1[C@H](C(O)=O)CCC1 BKOIIURTQAJHAT-GUBZILKMSA-N 0.000 description 1
- RVMXMLSYBTXCAV-VEVYYDQMSA-N Asp-Pro-Thr Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N1CCC[C@H]1C(=O)N[C@@H]([C@@H](C)O)C(O)=O RVMXMLSYBTXCAV-VEVYYDQMSA-N 0.000 description 1
- XXAMCEGRCZQGEM-ZLUOBGJFSA-N Asp-Ser-Asn Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(N)=O)C(O)=O XXAMCEGRCZQGEM-ZLUOBGJFSA-N 0.000 description 1
- ZVGRHIRJLWBWGJ-ACZMJKKPSA-N Asp-Ser-Gln Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CCC(N)=O)C(O)=O ZVGRHIRJLWBWGJ-ACZMJKKPSA-N 0.000 description 1
- MGSVBZIBCCKGCY-ZLUOBGJFSA-N Asp-Ser-Ser Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CO)C(O)=O MGSVBZIBCCKGCY-ZLUOBGJFSA-N 0.000 description 1
- GCACQYDBDHRVGE-LKXGYXEUSA-N Asp-Thr-Ser Chemical compound OC[C@@H](C(O)=O)NC(=O)[C@H]([C@H](O)C)NC(=O)[C@@H](N)CC(O)=O GCACQYDBDHRVGE-LKXGYXEUSA-N 0.000 description 1
- AWPWHMVCSISSQK-QWRGUYRKSA-N Asp-Tyr-Gly Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)NCC(O)=O AWPWHMVCSISSQK-QWRGUYRKSA-N 0.000 description 1
- GGBQDSHTXKQSLP-NHCYSSNCSA-N Asp-Val-Lys Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CC(=O)O)N GGBQDSHTXKQSLP-NHCYSSNCSA-N 0.000 description 1
- DCXYFEDJOCDNAF-UHFFFAOYSA-N Asparagine Natural products OC(=O)C(N)CC(N)=O DCXYFEDJOCDNAF-UHFFFAOYSA-N 0.000 description 1
- 241000894006 Bacteria Species 0.000 description 1
- 208000035143 Bacterial infection Diseases 0.000 description 1
- 241000283690 Bos taurus Species 0.000 description 1
- 108091028026 C-DNA Proteins 0.000 description 1
- 101710132601 Capsid protein Proteins 0.000 description 1
- 108010062745 Chloride Channels Proteins 0.000 description 1
- 102000011045 Chloride Channels Human genes 0.000 description 1
- 101710094648 Coat protein Proteins 0.000 description 1
- 208000035473 Communicable disease Diseases 0.000 description 1
- SZQCDCKIGWQAQN-FXQIFTODSA-N Cys-Arg-Ala Chemical compound [H]N[C@@H](CS)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](C)C(O)=O SZQCDCKIGWQAQN-FXQIFTODSA-N 0.000 description 1
- DCXGXDGGXVZVMY-GHCJXIJMSA-N Cys-Asn-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@H](CC(N)=O)NC(=O)[C@@H](N)CS DCXGXDGGXVZVMY-GHCJXIJMSA-N 0.000 description 1
- XABFFGOGKOORCG-CIUDSAMLSA-N Cys-Asp-Leu Chemical compound [H]N[C@@H](CS)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(C)C)C(O)=O XABFFGOGKOORCG-CIUDSAMLSA-N 0.000 description 1
- YUZPQIQWXLRFBW-ACZMJKKPSA-N Cys-Glu-Ala Chemical compound [H]N[C@@H](CS)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](C)C(O)=O YUZPQIQWXLRFBW-ACZMJKKPSA-N 0.000 description 1
- KGIHMGPYGXBYJJ-SRVKXCTJSA-N Cys-Lys-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](CCCCN)NC(=O)[C@@H](N)CS KGIHMGPYGXBYJJ-SRVKXCTJSA-N 0.000 description 1
- 102000012410 DNA Ligases Human genes 0.000 description 1
- 108010061982 DNA Ligases Proteins 0.000 description 1
- 230000007023 DNA restriction-modification system Effects 0.000 description 1
- 230000006820 DNA synthesis Effects 0.000 description 1
- 108010014303 DNA-directed DNA polymerase Proteins 0.000 description 1
- 102000016928 DNA-directed DNA polymerase Human genes 0.000 description 1
- 241000196324 Embryophyta Species 0.000 description 1
- 102100038132 Endogenous retrovirus group K member 6 Pro protein Human genes 0.000 description 1
- 102100031780 Endonuclease Human genes 0.000 description 1
- 108010042407 Endonucleases Proteins 0.000 description 1
- 241000283073 Equus caballus Species 0.000 description 1
- 241000282326 Felis catus Species 0.000 description 1
- 206010016654 Fibrosis Diseases 0.000 description 1
- 101710177291 Gag polyprotein Proteins 0.000 description 1
- 241001663880 Gammaretrovirus Species 0.000 description 1
- WOACHWLUOFZLGJ-GUBZILKMSA-N Gln-Arg-Gln Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCC(N)=O)C(O)=O WOACHWLUOFZLGJ-GUBZILKMSA-N 0.000 description 1
- BTSPOOHJBYJRKO-CIUDSAMLSA-N Gln-Asp-Arg Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O BTSPOOHJBYJRKO-CIUDSAMLSA-N 0.000 description 1
- RKAQZCDMSUQTSS-FXQIFTODSA-N Gln-Asp-Gln Chemical compound C(CC(=O)N)[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CCC(=O)N)C(=O)O)N RKAQZCDMSUQTSS-FXQIFTODSA-N 0.000 description 1
- WQWMZOIPXWSZNE-WDSKDSINSA-N Gln-Asp-Gly Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(O)=O)C(=O)NCC(O)=O WQWMZOIPXWSZNE-WDSKDSINSA-N 0.000 description 1
- XEYMBRRKIFYQMF-GUBZILKMSA-N Gln-Asp-Leu Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(C)C)C(O)=O XEYMBRRKIFYQMF-GUBZILKMSA-N 0.000 description 1
- WLODHVXYKYHLJD-ACZMJKKPSA-N Gln-Asp-Ser Chemical compound C(CC(=O)N)[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CO)C(=O)O)N WLODHVXYKYHLJD-ACZMJKKPSA-N 0.000 description 1
- SNLOOPZHAQDMJG-CIUDSAMLSA-N Gln-Glu-Glu Chemical compound NC(=O)CC[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O SNLOOPZHAQDMJG-CIUDSAMLSA-N 0.000 description 1
- PNENQZWRFMUZOM-DCAQKATOSA-N Gln-Glu-Leu Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(O)=O PNENQZWRFMUZOM-DCAQKATOSA-N 0.000 description 1
- LFIVHGMKWFGUGK-IHRRRGAJSA-N Gln-Glu-Phe Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)O)NC(=O)[C@H](CCC(=O)N)N LFIVHGMKWFGUGK-IHRRRGAJSA-N 0.000 description 1
- XKBASPWPBXNVLQ-WDSKDSINSA-N Gln-Gly-Asn Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)NCC(=O)N[C@@H](CC(N)=O)C(O)=O XKBASPWPBXNVLQ-WDSKDSINSA-N 0.000 description 1
- NSNUZSPSADIMJQ-WDSKDSINSA-N Gln-Gly-Asp Chemical compound NC(=O)CC[C@H](N)C(=O)NCC(=O)N[C@@H](CC(O)=O)C(O)=O NSNUZSPSADIMJQ-WDSKDSINSA-N 0.000 description 1
- HDUDGCZEOZEFOA-KBIXCLLPSA-N Gln-Ile-Ala Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](C)C(=O)O)NC(=O)[C@H](CCC(=O)N)N HDUDGCZEOZEFOA-KBIXCLLPSA-N 0.000 description 1
- GIVHPCWYVWUUSG-HVTMNAMFSA-N Gln-Ile-His Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)NC(=O)[C@H](CCC(=O)N)N GIVHPCWYVWUUSG-HVTMNAMFSA-N 0.000 description 1
- HYPVLWGNBIYTNA-GUBZILKMSA-N Gln-Leu-Ala Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](C)C(O)=O HYPVLWGNBIYTNA-GUBZILKMSA-N 0.000 description 1
- QBLMTCRYYTVUQY-GUBZILKMSA-N Gln-Leu-Asp Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(O)=O)C(O)=O QBLMTCRYYTVUQY-GUBZILKMSA-N 0.000 description 1
- QKCZZAZNMMVICF-DCAQKATOSA-N Gln-Leu-Glu Chemical compound NC(=O)CC[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O QKCZZAZNMMVICF-DCAQKATOSA-N 0.000 description 1
- CAXXTYYGFYTBPV-IUCAKERBSA-N Gln-Leu-Gly Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(C)C)C(=O)NCC(O)=O CAXXTYYGFYTBPV-IUCAKERBSA-N 0.000 description 1
- UWKPRVKWEKEMSY-DCAQKATOSA-N Gln-Lys-Gln Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCC(N)=O)C(O)=O UWKPRVKWEKEMSY-DCAQKATOSA-N 0.000 description 1
- JRHPEMVLTRADLJ-AVGNSLFASA-N Gln-Lys-Lys Chemical compound C(CCN)C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CCC(=O)N)N JRHPEMVLTRADLJ-AVGNSLFASA-N 0.000 description 1
- ILKYYKRAULNYMS-JYJNAYRXSA-N Gln-Lys-Phe Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O ILKYYKRAULNYMS-JYJNAYRXSA-N 0.000 description 1
- QMVCEWKHIUHTSD-GUBZILKMSA-N Gln-Met-Glu Chemical compound CSCC[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)O)NC(=O)[C@H](CCC(=O)N)N QMVCEWKHIUHTSD-GUBZILKMSA-N 0.000 description 1
- WHVLABLIJYGVEK-QEWYBTABSA-N Gln-Phe-Ile Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O WHVLABLIJYGVEK-QEWYBTABSA-N 0.000 description 1
- JUUNNOLZGVYCJT-JYJNAYRXSA-N Gln-Phe-Lys Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CCC(=O)N)N JUUNNOLZGVYCJT-JYJNAYRXSA-N 0.000 description 1
- XQDGOJPVMSWZSO-SRVKXCTJSA-N Gln-Pro-Leu Chemical compound CC(C)C[C@@H](C(=O)O)NC(=O)[C@@H]1CCCN1C(=O)[C@H](CCC(=O)N)N XQDGOJPVMSWZSO-SRVKXCTJSA-N 0.000 description 1
- OSCLNNWLKKIQJM-WDSKDSINSA-N Gln-Ser-Gly Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CO)C(=O)NCC(O)=O OSCLNNWLKKIQJM-WDSKDSINSA-N 0.000 description 1
- SYZZMPFLOLSMHL-XHNCKOQMSA-N Gln-Ser-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CO)NC(=O)[C@H](CCC(=O)N)N)C(=O)O SYZZMPFLOLSMHL-XHNCKOQMSA-N 0.000 description 1
- QENSHQJGWGRPQS-QEJZJMRPSA-N Gln-Ser-Trp Chemical compound C1=CC=C2C(C[C@H](NC(=O)[C@H](CO)NC(=O)[C@H](CCC(N)=O)N)C(O)=O)=CNC2=C1 QENSHQJGWGRPQS-QEJZJMRPSA-N 0.000 description 1
- VLOLPWWCNKWRNB-LOKLDPHHSA-N Gln-Thr-Pro Chemical compound C[C@H]([C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CCC(=O)N)N)O VLOLPWWCNKWRNB-LOKLDPHHSA-N 0.000 description 1
- XKPACHRGOWQHFH-IRIUXVKKSA-N Gln-Thr-Tyr Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O XKPACHRGOWQHFH-IRIUXVKKSA-N 0.000 description 1
- CVPXINNKRTZBMO-CIUDSAMLSA-N Glu-Arg-Asn Chemical compound C(C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@H](CCC(=O)O)N)CN=C(N)N CVPXINNKRTZBMO-CIUDSAMLSA-N 0.000 description 1
- MLCPTRRNICEKIS-FXQIFTODSA-N Glu-Asn-Gln Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CCC(N)=O)C(O)=O MLCPTRRNICEKIS-FXQIFTODSA-N 0.000 description 1
- YYOBUPFZLKQUAX-FXQIFTODSA-N Glu-Asn-Glu Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O YYOBUPFZLKQUAX-FXQIFTODSA-N 0.000 description 1
- AFODTOLGSZQDSL-PEFMBERDSA-N Glu-Asn-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)[C@H](CCC(=O)O)N AFODTOLGSZQDSL-PEFMBERDSA-N 0.000 description 1
- LJLPOZGRPLORTF-CIUDSAMLSA-N Glu-Asn-Met Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CCSC)C(O)=O LJLPOZGRPLORTF-CIUDSAMLSA-N 0.000 description 1
- ZJICFHQSPWFBKP-AVGNSLFASA-N Glu-Asn-Tyr Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O ZJICFHQSPWFBKP-AVGNSLFASA-N 0.000 description 1
- VAIWPXWHWAPYDF-FXQIFTODSA-N Glu-Asp-Gln Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CCC(N)=O)C(O)=O VAIWPXWHWAPYDF-FXQIFTODSA-N 0.000 description 1
- CKOFNWCLWRYUHK-XHNCKOQMSA-N Glu-Asp-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CC(=O)O)NC(=O)[C@H](CCC(=O)O)N)C(=O)O CKOFNWCLWRYUHK-XHNCKOQMSA-N 0.000 description 1
- RQNYYRHRKSVKAB-GUBZILKMSA-N Glu-Cys-Leu Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CS)C(=O)N[C@@H](CC(C)C)C(O)=O RQNYYRHRKSVKAB-GUBZILKMSA-N 0.000 description 1
- ALCAUWPAMLVUDB-FXQIFTODSA-N Glu-Gln-Asn Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O ALCAUWPAMLVUDB-FXQIFTODSA-N 0.000 description 1
- CGOHAEBMDSEKFB-FXQIFTODSA-N Glu-Glu-Ala Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](C)C(O)=O CGOHAEBMDSEKFB-FXQIFTODSA-N 0.000 description 1
- NKLRYVLERDYDBI-FXQIFTODSA-N Glu-Glu-Asp Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(O)=O)C(O)=O NKLRYVLERDYDBI-FXQIFTODSA-N 0.000 description 1
- BUAKRRKDHSSIKK-IHRRRGAJSA-N Glu-Glu-Tyr Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 BUAKRRKDHSSIKK-IHRRRGAJSA-N 0.000 description 1
- LRPXYSGPOBVBEH-IUCAKERBSA-N Glu-Gly-Leu Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)NCC(=O)N[C@@H](CC(C)C)C(O)=O LRPXYSGPOBVBEH-IUCAKERBSA-N 0.000 description 1
- KRGZZKWSBGPLKL-IUCAKERBSA-N Glu-Gly-Lys Chemical compound C(CCN)C[C@@H](C(=O)O)NC(=O)CNC(=O)[C@H](CCC(=O)O)N KRGZZKWSBGPLKL-IUCAKERBSA-N 0.000 description 1
- ZWQVYZXPYSYPJD-RYUDHWBXSA-N Glu-Gly-Phe Chemical compound OC(=O)CC[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 ZWQVYZXPYSYPJD-RYUDHWBXSA-N 0.000 description 1
- DVLZZEPUNFEUBW-AVGNSLFASA-N Glu-His-Leu Chemical compound CC(C)C[C@@H](C(=O)O)NC(=O)[C@H](CC1=CN=CN1)NC(=O)[C@H](CCC(=O)O)N DVLZZEPUNFEUBW-AVGNSLFASA-N 0.000 description 1
- LGYCLOCORAEQSZ-PEFMBERDSA-N Glu-Ile-Asp Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC(O)=O)C(O)=O LGYCLOCORAEQSZ-PEFMBERDSA-N 0.000 description 1
- QXDXIXFSFHUYAX-MNXVOIDGSA-N Glu-Ile-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H]([C@@H](C)CC)NC(=O)[C@@H](N)CCC(O)=O QXDXIXFSFHUYAX-MNXVOIDGSA-N 0.000 description 1
- ZSWGJYOZWBHROQ-RWRJDSDZSA-N Glu-Ile-Thr Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H]([C@@H](C)O)C(O)=O ZSWGJYOZWBHROQ-RWRJDSDZSA-N 0.000 description 1
- VMKCPNBBPGGQBJ-GUBZILKMSA-N Glu-Leu-Asn Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@H](CCC(=O)O)N VMKCPNBBPGGQBJ-GUBZILKMSA-N 0.000 description 1
- CUPSDFQZTVVTSK-GUBZILKMSA-N Glu-Lys-Asp Chemical compound OC(=O)C[C@@H](C(O)=O)NC(=O)[C@H](CCCCN)NC(=O)[C@@H](N)CCC(O)=O CUPSDFQZTVVTSK-GUBZILKMSA-N 0.000 description 1
- CBEUFCJRFNZMCU-SRVKXCTJSA-N Glu-Met-Leu Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CC(C)C)C(O)=O CBEUFCJRFNZMCU-SRVKXCTJSA-N 0.000 description 1
- YHOJJFFTSMWVGR-HJGDQZAQSA-N Glu-Met-Thr Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCSC)C(=O)N[C@@H]([C@@H](C)O)C(O)=O YHOJJFFTSMWVGR-HJGDQZAQSA-N 0.000 description 1
- PMSMKNYRZCKVMC-DRZSPHRISA-N Glu-Phe-Ala Chemical compound C[C@@H](C(=O)O)NC(=O)[C@H](CC1=CC=CC=C1)NC(=O)[C@H](CCC(=O)O)N PMSMKNYRZCKVMC-DRZSPHRISA-N 0.000 description 1
- QNJNPKSWAHPYGI-JYJNAYRXSA-N Glu-Phe-Leu Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](CC(C)C)C(O)=O)CC1=CC=CC=C1 QNJNPKSWAHPYGI-JYJNAYRXSA-N 0.000 description 1
- ITVBKCZZLJUUHI-HTUGSXCWSA-N Glu-Phe-Thr Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H]([C@@H](C)O)C(O)=O ITVBKCZZLJUUHI-HTUGSXCWSA-N 0.000 description 1
- RFTVTKBHDXCEEX-WDSKDSINSA-N Glu-Ser-Gly Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CO)C(=O)NCC(O)=O RFTVTKBHDXCEEX-WDSKDSINSA-N 0.000 description 1
- QOXDAWODGSIDDI-GUBZILKMSA-N Glu-Ser-Lys Chemical compound C(CCN)C[C@@H](C(=O)O)NC(=O)[C@H](CO)NC(=O)[C@H](CCC(=O)O)N QOXDAWODGSIDDI-GUBZILKMSA-N 0.000 description 1
- DTLLNDVORUEOTM-WDCWCFNPSA-N Glu-Thr-Lys Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCCCN)C(O)=O DTLLNDVORUEOTM-WDCWCFNPSA-N 0.000 description 1
- CQGBSALYGOXQPE-HTUGSXCWSA-N Glu-Thr-Phe Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)O)NC(=O)[C@H](CCC(=O)O)N)O CQGBSALYGOXQPE-HTUGSXCWSA-N 0.000 description 1
- VIPDPMHGICREIS-GVXVVHGQSA-N Glu-Val-Leu Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC(C)C)C(O)=O VIPDPMHGICREIS-GVXVVHGQSA-N 0.000 description 1
- ZYRXTRTUCAVNBQ-GVXVVHGQSA-N Glu-Val-Lys Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CCC(=O)O)N ZYRXTRTUCAVNBQ-GVXVVHGQSA-N 0.000 description 1
- PUUYVMYCMIWHFE-BQBZGAKWSA-N Gly-Ala-Arg Chemical compound NCC(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CCCN=C(N)N PUUYVMYCMIWHFE-BQBZGAKWSA-N 0.000 description 1
- RLFSBAPJTYKSLG-WHFBIAKZSA-N Gly-Ala-Asp Chemical compound NCC(=O)N[C@@H](C)C(=O)N[C@@H](CC(O)=O)C(O)=O RLFSBAPJTYKSLG-WHFBIAKZSA-N 0.000 description 1
- MFVQGXGQRIXBPK-WDSKDSINSA-N Gly-Ala-Glu Chemical compound NCC(=O)N[C@@H](C)C(=O)N[C@@H](CCC(O)=O)C(O)=O MFVQGXGQRIXBPK-WDSKDSINSA-N 0.000 description 1
- YMUFWNJHVPQNQD-ZKWXMUAHSA-N Gly-Ala-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)CN YMUFWNJHVPQNQD-ZKWXMUAHSA-N 0.000 description 1
- GWCRIHNSVMOBEQ-BQBZGAKWSA-N Gly-Arg-Ser Chemical compound [H]NCC(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CO)C(O)=O GWCRIHNSVMOBEQ-BQBZGAKWSA-N 0.000 description 1
- GRIRDMVMJJDZKV-RCOVLWMOSA-N Gly-Asn-Val Chemical compound [H]NCC(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](C(C)C)C(O)=O GRIRDMVMJJDZKV-RCOVLWMOSA-N 0.000 description 1
- PMNHJLASAAWELO-FOHZUACHSA-N Gly-Asp-Thr Chemical compound [H]NCC(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O PMNHJLASAAWELO-FOHZUACHSA-N 0.000 description 1
- XXGQRGQPGFYECI-WDSKDSINSA-N Gly-Cys-Glu Chemical compound NCC(=O)N[C@@H](CS)C(=O)N[C@H](C(O)=O)CCC(O)=O XXGQRGQPGFYECI-WDSKDSINSA-N 0.000 description 1
- LJXWZPHEMJSNRC-KBPBESRZSA-N Gly-Gln-Trp Chemical compound [H]NCC(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(O)=O LJXWZPHEMJSNRC-KBPBESRZSA-N 0.000 description 1
- JLJLBWDKDRYOPA-RYUDHWBXSA-N Gly-Gln-Tyr Chemical compound NCC(=O)N[C@@H](CCC(N)=O)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 JLJLBWDKDRYOPA-RYUDHWBXSA-N 0.000 description 1
- ZKLYPEGLWFVRGF-IUCAKERBSA-N Gly-His-Gln Chemical compound [H]NCC(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CCC(N)=O)C(O)=O ZKLYPEGLWFVRGF-IUCAKERBSA-N 0.000 description 1
- ALOBJFDJTMQQPW-ONGXEEELSA-N Gly-His-Val Chemical compound CC(C)[C@@H](C(=O)O)NC(=O)[C@H](CC1=CN=CN1)NC(=O)CN ALOBJFDJTMQQPW-ONGXEEELSA-N 0.000 description 1
- LLZXNUUIBOALNY-QWRGUYRKSA-N Gly-Leu-Lys Chemical compound NCC(=O)N[C@@H](CC(C)C)C(=O)N[C@H](C(O)=O)CCCCN LLZXNUUIBOALNY-QWRGUYRKSA-N 0.000 description 1
- YSDLIYZLOTZZNP-UWVGGRQHSA-N Gly-Leu-Met Chemical compound CSCC[C@@H](C(O)=O)NC(=O)[C@H](CC(C)C)NC(=O)CN YSDLIYZLOTZZNP-UWVGGRQHSA-N 0.000 description 1
- BXICSAQLIHFDDL-YUMQZZPRSA-N Gly-Lys-Asn Chemical compound [H]NCC(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(N)=O)C(O)=O BXICSAQLIHFDDL-YUMQZZPRSA-N 0.000 description 1
- VEPBEGNDJYANCF-QWRGUYRKSA-N Gly-Lys-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)CN)CCCCN VEPBEGNDJYANCF-QWRGUYRKSA-N 0.000 description 1
- OQQKUTVULYLCDG-ONGXEEELSA-N Gly-Lys-Val Chemical compound CC(C)[C@H](NC(=O)[C@H](CCCCN)NC(=O)CN)C(O)=O OQQKUTVULYLCDG-ONGXEEELSA-N 0.000 description 1
- DBJYVKDPGIFXFO-BQBZGAKWSA-N Gly-Met-Ala Chemical compound [H]NCC(=O)N[C@@H](CCSC)C(=O)N[C@@H](C)C(O)=O DBJYVKDPGIFXFO-BQBZGAKWSA-N 0.000 description 1
- SJLKKOZFHSJJAW-YUMQZZPRSA-N Gly-Met-Glu Chemical compound CSCC[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)O)NC(=O)CN SJLKKOZFHSJJAW-YUMQZZPRSA-N 0.000 description 1
- QGDOOCIPHSSADO-STQMWFEESA-N Gly-Met-Phe Chemical compound [H]NCC(=O)N[C@@H](CCSC)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O QGDOOCIPHSSADO-STQMWFEESA-N 0.000 description 1
- OMOZPGCHVWOXHN-BQBZGAKWSA-N Gly-Met-Ser Chemical compound CSCC[C@@H](C(=O)N[C@@H](CO)C(=O)O)NC(=O)CN OMOZPGCHVWOXHN-BQBZGAKWSA-N 0.000 description 1
- IALQAMYQJBZNSK-WHFBIAKZSA-N Gly-Ser-Asn Chemical compound [H]NCC(=O)N[C@@H](CO)C(=O)N[C@@H](CC(N)=O)C(O)=O IALQAMYQJBZNSK-WHFBIAKZSA-N 0.000 description 1
- ZLCLYFGMKFCDCN-XPUUQOCRSA-N Gly-Ser-Val Chemical compound CC(C)[C@H](NC(=O)[C@H](CO)NC(=O)CN)C(O)=O ZLCLYFGMKFCDCN-XPUUQOCRSA-N 0.000 description 1
- CQMFNTVQVLQRLT-JHEQGTHGSA-N Gly-Thr-Gln Chemical compound [H]NCC(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCC(N)=O)C(O)=O CQMFNTVQVLQRLT-JHEQGTHGSA-N 0.000 description 1
- DBUNZBWUWCIELX-JHEQGTHGSA-N Gly-Thr-Glu Chemical compound [H]NCC(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCC(O)=O)C(O)=O DBUNZBWUWCIELX-JHEQGTHGSA-N 0.000 description 1
- ZZWUYQXMIFTIIY-WEDXCCLWSA-N Gly-Thr-Leu Chemical compound [H]NCC(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(C)C)C(O)=O ZZWUYQXMIFTIIY-WEDXCCLWSA-N 0.000 description 1
- KOYUSMBPJOVSOO-XEGUGMAKSA-N Gly-Tyr-Ile Chemical compound [H]NCC(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O KOYUSMBPJOVSOO-XEGUGMAKSA-N 0.000 description 1
- NGRPGJGKJMUGDM-XVKPBYJWSA-N Gly-Val-Gln Chemical compound NCC(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CCC(N)=O)C(O)=O NGRPGJGKJMUGDM-XVKPBYJWSA-N 0.000 description 1
- AFMOTCMSEBITOE-YEPSODPASA-N Gly-Val-Thr Chemical compound NCC(=O)N[C@@H](C(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O AFMOTCMSEBITOE-YEPSODPASA-N 0.000 description 1
- 239000004471 Glycine Substances 0.000 description 1
- 102000003886 Glycoproteins Human genes 0.000 description 1
- 108090000288 Glycoproteins Proteins 0.000 description 1
- 102100021181 Golgi phosphoprotein 3 Human genes 0.000 description 1
- 208000009329 Graft vs Host Disease Diseases 0.000 description 1
- 102100021519 Hemoglobin subunit beta Human genes 0.000 description 1
- 108091005904 Hemoglobin subunit beta Proteins 0.000 description 1
- 241000238631 Hexapoda Species 0.000 description 1
- DCRODRAURLJOFY-XPUUQOCRSA-N His-Ala-Gly Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](C)C(=O)NCC(O)=O DCRODRAURLJOFY-XPUUQOCRSA-N 0.000 description 1
- SOFSRBYHDINIRG-QTKMDUPCSA-N His-Arg-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CCCN=C(N)N)NC(=O)[C@H](CC1=CN=CN1)N)O SOFSRBYHDINIRG-QTKMDUPCSA-N 0.000 description 1
- UZZXGLOJRZKYEL-DJFWLOJKSA-N His-Asn-Ile Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O UZZXGLOJRZKYEL-DJFWLOJKSA-N 0.000 description 1
- AAXMRLWFJFDYQO-GUBZILKMSA-N His-Asp-Gln Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CCC(N)=O)C(O)=O AAXMRLWFJFDYQO-GUBZILKMSA-N 0.000 description 1
- LIEIYPBMQJLASB-SRVKXCTJSA-N His-Gln-Arg Chemical compound NC(N)=NCCC[C@@H](C(O)=O)NC(=O)[C@H](CCC(N)=O)NC(=O)[C@@H](N)CC1=CN=CN1 LIEIYPBMQJLASB-SRVKXCTJSA-N 0.000 description 1
- DVHGLDYMGWTYKW-GUBZILKMSA-N His-Gln-Ser Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CO)C(O)=O DVHGLDYMGWTYKW-GUBZILKMSA-N 0.000 description 1
- IMPKSPYRPUXYAP-SZMVWBNQSA-N His-Gln-Trp Chemical compound C1=CC=C2C(=C1)C(=CN2)C[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)N)NC(=O)[C@H](CC3=CN=CN3)N IMPKSPYRPUXYAP-SZMVWBNQSA-N 0.000 description 1
- PYNUBZSXKQKAHL-UWVGGRQHSA-N His-Gly-Arg Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)NCC(=O)N[C@@H](CCCNC(N)=N)C(O)=O PYNUBZSXKQKAHL-UWVGGRQHSA-N 0.000 description 1
- RGPWUJOMKFYFSR-QWRGUYRKSA-N His-Gly-Leu Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)NCC(=O)N[C@@H](CC(C)C)C(O)=O RGPWUJOMKFYFSR-QWRGUYRKSA-N 0.000 description 1
- GHAFKUCRIVBLDJ-IHRRRGAJSA-N His-His-Met Chemical compound CSCC[C@@H](C(=O)O)NC(=O)[C@H](CC1=CN=CN1)NC(=O)[C@H](CC2=CN=CN2)N GHAFKUCRIVBLDJ-IHRRRGAJSA-N 0.000 description 1
- BZKDJRSZWLPJNI-SRVKXCTJSA-N His-His-Ser Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CO)C(O)=O BZKDJRSZWLPJNI-SRVKXCTJSA-N 0.000 description 1
- JJHWJUYYTWYXPL-PYJNHQTQSA-N His-Ile-Arg Chemical compound NC(N)=NCCC[C@@H](C(O)=O)NC(=O)[C@H]([C@@H](C)CC)NC(=O)[C@@H](N)CC1=CN=CN1 JJHWJUYYTWYXPL-PYJNHQTQSA-N 0.000 description 1
- UROVZOUMHNXPLZ-AVGNSLFASA-N His-Leu-Gln Chemical compound NC(=O)CC[C@@H](C(O)=O)NC(=O)[C@H](CC(C)C)NC(=O)[C@@H](N)CC1=CN=CN1 UROVZOUMHNXPLZ-AVGNSLFASA-N 0.000 description 1
- BPOHQCZZSFBSON-KKUMJFAQSA-N His-Leu-His Chemical compound CC(C)C[C@H](NC(=O)[C@@H](N)Cc1cnc[nH]1)C(=O)N[C@@H](Cc1cnc[nH]1)C(O)=O BPOHQCZZSFBSON-KKUMJFAQSA-N 0.000 description 1
- SKOKHBGDXGTDDP-MELADBBJSA-N His-Leu-Pro Chemical compound CC(C)C[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CC2=CN=CN2)N SKOKHBGDXGTDDP-MELADBBJSA-N 0.000 description 1
- PGRPSOUCWRBWKZ-DLOVCJGASA-N His-Lys-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H](CCCCN)NC(=O)[C@@H](N)CC1=CN=CN1 PGRPSOUCWRBWKZ-DLOVCJGASA-N 0.000 description 1
- YVCGJPIKRMGNPA-LSJOCFKGSA-N His-Met-Ala Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CCSC)C(=O)N[C@@H](C)C(O)=O YVCGJPIKRMGNPA-LSJOCFKGSA-N 0.000 description 1
- AYUOWUNWZGTNKB-ULQDDVLXSA-N His-Phe-Arg Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O AYUOWUNWZGTNKB-ULQDDVLXSA-N 0.000 description 1
- HYWZHNUGAYVEEW-KKUMJFAQSA-N His-Phe-Ser Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CO)C(=O)O)NC(=O)[C@H](CC2=CN=CN2)N HYWZHNUGAYVEEW-KKUMJFAQSA-N 0.000 description 1
- ZHHLTWUOWXHVQJ-YUMQZZPRSA-N His-Ser-Gly Chemical compound C1=C(NC=N1)C[C@@H](C(=O)N[C@@H](CO)C(=O)NCC(=O)O)N ZHHLTWUOWXHVQJ-YUMQZZPRSA-N 0.000 description 1
- MRVZCDSYLJXKKX-ACRUOGEOSA-N His-Tyr-Phe Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)O)NC(=O)[C@H](CC2=CC=C(C=C2)O)NC(=O)[C@H](CC3=CN=CN3)N MRVZCDSYLJXKKX-ACRUOGEOSA-N 0.000 description 1
- JATYGDHMDRAISQ-KKUMJFAQSA-N His-Tyr-Ser Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CO)C(O)=O JATYGDHMDRAISQ-KKUMJFAQSA-N 0.000 description 1
- 102000003964 Histone deacetylase Human genes 0.000 description 1
- 108090000353 Histone deacetylase Proteins 0.000 description 1
- YZJSUQQZGCHHNQ-UHFFFAOYSA-N Homoglutamine Chemical compound OC(=O)C(N)CCCC(N)=O YZJSUQQZGCHHNQ-UHFFFAOYSA-N 0.000 description 1
- 241000713772 Human immunodeficiency virus 1 Species 0.000 description 1
- 241000726041 Human respirovirus 1 Species 0.000 description 1
- NKVZTQVGUNLLQW-JBDRJPRFSA-N Ile-Ala-Ala Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](C)C(=O)N[C@@H](C)C(=O)O)N NKVZTQVGUNLLQW-JBDRJPRFSA-N 0.000 description 1
- VAXBXNPRXPHGHG-BJDJZHNGSA-N Ile-Ala-Leu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](C)C(=O)N[C@@H](CC(C)C)C(=O)O)N VAXBXNPRXPHGHG-BJDJZHNGSA-N 0.000 description 1
- SACHLUOUHCVIKI-GMOBBJLQSA-N Ile-Arg-Asp Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CC(=O)O)C(=O)O)N SACHLUOUHCVIKI-GMOBBJLQSA-N 0.000 description 1
- WECYRWOMWSCWNX-XUXIUFHCSA-N Ile-Arg-Leu Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CC(C)C)C(O)=O WECYRWOMWSCWNX-XUXIUFHCSA-N 0.000 description 1
- NULSANWBUWLTKN-NAKRPEOUSA-N Ile-Arg-Ser Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CO)C(=O)O)N NULSANWBUWLTKN-NAKRPEOUSA-N 0.000 description 1
- QADCTXFNLZBZAB-GHCJXIJMSA-N Ile-Asn-Ala Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](C)C(=O)O)N QADCTXFNLZBZAB-GHCJXIJMSA-N 0.000 description 1
- UAVQIQOOBXFKRC-BYULHYEWSA-N Ile-Asn-Gly Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H](CC(N)=O)C(=O)NCC(O)=O UAVQIQOOBXFKRC-BYULHYEWSA-N 0.000 description 1
- IDAHFEPYTJJZFD-PEFMBERDSA-N Ile-Asp-Glu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CCC(=O)O)C(=O)O)N IDAHFEPYTJJZFD-PEFMBERDSA-N 0.000 description 1
- AQTWDZDISVGCAC-CFMVVWHZSA-N Ile-Asp-Tyr Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)O)N AQTWDZDISVGCAC-CFMVVWHZSA-N 0.000 description 1
- KMBPQYKVZBMRMH-PEFMBERDSA-N Ile-Gln-Asn Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O KMBPQYKVZBMRMH-PEFMBERDSA-N 0.000 description 1
- WNQKUUQIVDDAFA-ZPFDUUQYSA-N Ile-Gln-Met Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)N[C@@H](CCSC)C(=O)O)N WNQKUUQIVDDAFA-ZPFDUUQYSA-N 0.000 description 1
- DFJJAVZIHDFOGQ-MNXVOIDGSA-N Ile-Glu-Lys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H](CCCCN)C(=O)O)N DFJJAVZIHDFOGQ-MNXVOIDGSA-N 0.000 description 1
- LPFBXFILACZHIB-LAEOZQHASA-N Ile-Gly-Glu Chemical compound CC[C@H](C)[C@@H](C(=O)NCC(=O)N[C@@H](CCC(=O)O)C(=O)O)N LPFBXFILACZHIB-LAEOZQHASA-N 0.000 description 1
- CCYGNFBYUNHFSC-MGHWNKPDSA-N Ile-His-Phe Chemical compound [H]N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O CCYGNFBYUNHFSC-MGHWNKPDSA-N 0.000 description 1
- LNJLOZYNZFGJMM-DEQVHRJGSA-N Ile-His-Pro Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC1=CN=CN1)C(=O)N2CCC[C@@H]2C(=O)O)N LNJLOZYNZFGJMM-DEQVHRJGSA-N 0.000 description 1
- KEKTTYCXKGBAAL-VGDYDELISA-N Ile-His-Ser Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC1=CN=CN1)C(=O)N[C@@H](CO)C(=O)O)N KEKTTYCXKGBAAL-VGDYDELISA-N 0.000 description 1
- HUWYGQOISIJNMK-SIGLWIIPSA-N Ile-Ile-His Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)N HUWYGQOISIJNMK-SIGLWIIPSA-N 0.000 description 1
- UDBPXJNOEWDBDF-XUXIUFHCSA-N Ile-Lys-Val Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](C(C)C)C(=O)O)N UDBPXJNOEWDBDF-XUXIUFHCSA-N 0.000 description 1
- UOPBQSJRBONRON-STECZYCISA-N Ile-Met-Tyr Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H](CCSC)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 UOPBQSJRBONRON-STECZYCISA-N 0.000 description 1
- SNHYFFQZRFIRHO-CYDGBPFRSA-N Ile-Met-Val Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCSC)C(=O)N[C@@H](C(C)C)C(=O)O)N SNHYFFQZRFIRHO-CYDGBPFRSA-N 0.000 description 1
- OTSVBELRDMSPKY-PCBIJLKTSA-N Ile-Phe-Asn Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(=O)N)C(=O)O)N OTSVBELRDMSPKY-PCBIJLKTSA-N 0.000 description 1
- IIWQTXMUALXGOV-PCBIJLKTSA-N Ile-Phe-Asp Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(=O)O)C(=O)O)N IIWQTXMUALXGOV-PCBIJLKTSA-N 0.000 description 1
- CIDLJWVDMNDKPT-FIRPJDEBSA-N Ile-Phe-Phe Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC2=CC=CC=C2)C(=O)O)N CIDLJWVDMNDKPT-FIRPJDEBSA-N 0.000 description 1
- FQYQMFCIJNWDQZ-CYDGBPFRSA-N Ile-Pro-Pro Chemical compound CC[C@H](C)[C@H](N)C(=O)N1CCC[C@H]1C(=O)N1[C@H](C(O)=O)CCC1 FQYQMFCIJNWDQZ-CYDGBPFRSA-N 0.000 description 1
- JZNVOBUNTWNZPW-GHCJXIJMSA-N Ile-Ser-Asp Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(=O)O)C(=O)O)N JZNVOBUNTWNZPW-GHCJXIJMSA-N 0.000 description 1
- XMYURPUVJSKTMC-KBIXCLLPSA-N Ile-Ser-Gln Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CO)C(=O)N[C@@H](CCC(=O)N)C(=O)O)N XMYURPUVJSKTMC-KBIXCLLPSA-N 0.000 description 1
- PXKACEXYLPBMAD-JBDRJPRFSA-N Ile-Ser-Ser Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CO)C(=O)N[C@@H](CO)C(=O)O)N PXKACEXYLPBMAD-JBDRJPRFSA-N 0.000 description 1
- WLRJHVNFGAOYPS-HJPIBITLSA-N Ile-Ser-Tyr Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CO)C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)O)N WLRJHVNFGAOYPS-HJPIBITLSA-N 0.000 description 1
- HXIDVIFHRYRXLZ-NAKRPEOUSA-N Ile-Ser-Val Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CO)C(=O)N[C@@H](C(C)C)C(=O)O)N HXIDVIFHRYRXLZ-NAKRPEOUSA-N 0.000 description 1
- YBKKLDBBPFIXBQ-MBLNEYKQSA-N Ile-Thr-Gly Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H]([C@@H](C)O)C(=O)NCC(=O)O)N YBKKLDBBPFIXBQ-MBLNEYKQSA-N 0.000 description 1
- QGXQHJQPAPMACW-PPCPHDFISA-N Ile-Thr-Lys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCCCN)C(=O)O)N QGXQHJQPAPMACW-PPCPHDFISA-N 0.000 description 1
- BLFXHAFTNYZEQE-VKOGCVSHSA-N Ile-Trp-Arg Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC1=CNC2=CC=CC=C21)C(=O)N[C@@H](CCCN=C(N)N)C(=O)O)N BLFXHAFTNYZEQE-VKOGCVSHSA-N 0.000 description 1
- 208000026350 Inborn Genetic disease Diseases 0.000 description 1
- 206010062717 Increased upper airway secretion Diseases 0.000 description 1
- 206010061218 Inflammation Diseases 0.000 description 1
- 206010022004 Influenza like illness Diseases 0.000 description 1
- 108010065920 Insulin Lispro Proteins 0.000 description 1
- 108010061833 Integrases Proteins 0.000 description 1
- SNDPXSYFESPGGJ-BYPYZUCNSA-N L-2-aminopentanoic acid Chemical compound CCC[C@H](N)C(O)=O SNDPXSYFESPGGJ-BYPYZUCNSA-N 0.000 description 1
- HGCNKOLVKRAVHD-UHFFFAOYSA-N L-Met-L-Phe Natural products CSCCC(N)C(=O)NC(C(O)=O)CC1=CC=CC=C1 HGCNKOLVKRAVHD-UHFFFAOYSA-N 0.000 description 1
- FADYJNXDPBKVCA-UHFFFAOYSA-N L-Phenylalanyl-L-lysin Natural products NCCCCC(C(O)=O)NC(=O)C(N)CC1=CC=CC=C1 FADYJNXDPBKVCA-UHFFFAOYSA-N 0.000 description 1
- ONIBWKKTOPOVIA-BYPYZUCNSA-N L-Proline Chemical compound OC(=O)[C@@H]1CCCN1 ONIBWKKTOPOVIA-BYPYZUCNSA-N 0.000 description 1
- QNAYBMKLOCPYGJ-REOHCLBHSA-N L-alanine Chemical compound C[C@H](N)C(O)=O QNAYBMKLOCPYGJ-REOHCLBHSA-N 0.000 description 1
- 125000000998 L-alanino group Chemical group [H]N([*])[C@](C([H])([H])[H])([H])C(=O)O[H] 0.000 description 1
- 125000000899 L-alpha-glutamyl group Chemical group [H]N([H])[C@]([H])(C(=O)[*])C([H])([H])C([H])([H])C(O[H])=O 0.000 description 1
- ODKSFYDXXFIFQN-BYPYZUCNSA-P L-argininium(2+) Chemical compound NC(=[NH2+])NCCC[C@H]([NH3+])C(O)=O ODKSFYDXXFIFQN-BYPYZUCNSA-P 0.000 description 1
- 125000000010 L-asparaginyl group Chemical group O=C([*])[C@](N([H])[H])([H])C([H])([H])C(=O)N([H])[H] 0.000 description 1
- WHUUTDBJXJRKMK-VKHMYHEASA-N L-glutamic acid Chemical compound OC(=O)[C@@H](N)CCC(O)=O WHUUTDBJXJRKMK-VKHMYHEASA-N 0.000 description 1
- ZDXPYRJPNDTMRX-VKHMYHEASA-N L-glutamine Chemical compound OC(=O)[C@@H](N)CCC(N)=O ZDXPYRJPNDTMRX-VKHMYHEASA-N 0.000 description 1
- LHSGPCFBGJHPCY-UHFFFAOYSA-N L-leucine-L-tyrosine Natural products CC(C)CC(N)C(=O)NC(C(O)=O)CC1=CC=C(O)C=C1 LHSGPCFBGJHPCY-UHFFFAOYSA-N 0.000 description 1
- KDXKERNSBIXSRK-YFKPBYRVSA-N L-lysine Chemical compound NCCCC[C@H](N)C(O)=O KDXKERNSBIXSRK-YFKPBYRVSA-N 0.000 description 1
- FFEARJCKVFRZRR-BYPYZUCNSA-N L-methionine Chemical compound CSCC[C@H](N)C(O)=O FFEARJCKVFRZRR-BYPYZUCNSA-N 0.000 description 1
- SNDPXSYFESPGGJ-UHFFFAOYSA-N L-norVal-OH Natural products CCCC(N)C(O)=O SNDPXSYFESPGGJ-UHFFFAOYSA-N 0.000 description 1
- HXEACLLIILLPRG-YFKPBYRVSA-N L-pipecolic acid Chemical compound [O-]C(=O)[C@@H]1CCCC[NH2+]1 HXEACLLIILLPRG-YFKPBYRVSA-N 0.000 description 1
- 108091026898 Leader sequence (mRNA) Proteins 0.000 description 1
- CQQGCWPXDHTTNF-GUBZILKMSA-N Leu-Ala-Glu Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CCC(O)=O CQQGCWPXDHTTNF-GUBZILKMSA-N 0.000 description 1
- GRZSCTXVCDUIPO-SRVKXCTJSA-N Leu-Arg-Gln Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCC(N)=O)C(O)=O GRZSCTXVCDUIPO-SRVKXCTJSA-N 0.000 description 1
- GPXFZVUVPCFTMG-AVGNSLFASA-N Leu-Arg-Met Chemical compound CSCC[C@@H](C(O)=O)NC(=O)[C@H](CCCN=C(N)N)NC(=O)[C@@H](N)CC(C)C GPXFZVUVPCFTMG-AVGNSLFASA-N 0.000 description 1
- STAVRDQLZOTNKJ-RHYQMDGZSA-N Leu-Arg-Thr Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H]([C@@H](C)O)C(O)=O STAVRDQLZOTNKJ-RHYQMDGZSA-N 0.000 description 1
- KKXDHFKZWKLYGB-GUBZILKMSA-N Leu-Asn-Glu Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CCC(=O)O)C(=O)O)N KKXDHFKZWKLYGB-GUBZILKMSA-N 0.000 description 1
- ULXYQAJWJGLCNR-YUMQZZPRSA-N Leu-Asp-Gly Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)NCC(O)=O ULXYQAJWJGLCNR-YUMQZZPRSA-N 0.000 description 1
- JQSXWJXBASFONF-KKUMJFAQSA-N Leu-Asp-Phe Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O JQSXWJXBASFONF-KKUMJFAQSA-N 0.000 description 1
- DPWGZWUMUUJQDT-IUCAKERBSA-N Leu-Gln-Gly Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(=O)NCC(O)=O DPWGZWUMUUJQDT-IUCAKERBSA-N 0.000 description 1
- BOFAFKVZQUMTID-AVGNSLFASA-N Leu-Gln-His Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)N BOFAFKVZQUMTID-AVGNSLFASA-N 0.000 description 1
- QDSKNVXKLPQNOJ-GVXVVHGQSA-N Leu-Gln-Val Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](C(C)C)C(O)=O QDSKNVXKLPQNOJ-GVXVVHGQSA-N 0.000 description 1
- YVKSMSDXKMSIRX-GUBZILKMSA-N Leu-Glu-Asn Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O YVKSMSDXKMSIRX-GUBZILKMSA-N 0.000 description 1
- HFBCHNRFRYLZNV-GUBZILKMSA-N Leu-Glu-Asp Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(O)=O)C(O)=O HFBCHNRFRYLZNV-GUBZILKMSA-N 0.000 description 1
- WQWSMEOYXJTFRU-GUBZILKMSA-N Leu-Glu-Ser Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CO)C(O)=O WQWSMEOYXJTFRU-GUBZILKMSA-N 0.000 description 1
- WRLPVDVHNWSSCL-MELADBBJSA-N Leu-His-Pro Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC1=CN=CN1)C(=O)N2CCC[C@@H]2C(=O)O)N WRLPVDVHNWSSCL-MELADBBJSA-N 0.000 description 1
- HMDDEJADNKQTBR-BZSNNMDCSA-N Leu-His-Tyr Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O HMDDEJADNKQTBR-BZSNNMDCSA-N 0.000 description 1
- JLWZLIQRYCTYBD-IHRRRGAJSA-N Leu-Lys-Arg Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O JLWZLIQRYCTYBD-IHRRRGAJSA-N 0.000 description 1
- KPYAOIVPJKPIOU-KKUMJFAQSA-N Leu-Lys-Lys Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCCCN)C(O)=O KPYAOIVPJKPIOU-KKUMJFAQSA-N 0.000 description 1
- VCHVSKNMTXWIIP-SRVKXCTJSA-N Leu-Lys-Ser Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CO)C(O)=O VCHVSKNMTXWIIP-SRVKXCTJSA-N 0.000 description 1
- KXCMQWMNYQOAKA-SRVKXCTJSA-N Leu-Met-Gln Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CCC(=O)N)C(=O)O)N KXCMQWMNYQOAKA-SRVKXCTJSA-N 0.000 description 1
- FLNPJLDPGMLWAU-UWVGGRQHSA-N Leu-Met-Gly Chemical compound OC(=O)CNC(=O)[C@H](CCSC)NC(=O)[C@@H](N)CC(C)C FLNPJLDPGMLWAU-UWVGGRQHSA-N 0.000 description 1
- DDVHDMSBLRAKNV-IHRRRGAJSA-N Leu-Met-Leu Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CC(C)C)C(O)=O DDVHDMSBLRAKNV-IHRRRGAJSA-N 0.000 description 1
- IBSGMIPRBMPMHE-IHRRRGAJSA-N Leu-Met-Lys Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CCCCN)C(O)=O IBSGMIPRBMPMHE-IHRRRGAJSA-N 0.000 description 1
- DRWMRVFCKKXHCH-BZSNNMDCSA-N Leu-Phe-Leu Chemical compound CC(C)C[C@H]([NH3+])C(=O)N[C@H](C(=O)N[C@@H](CC(C)C)C([O-])=O)CC1=CC=CC=C1 DRWMRVFCKKXHCH-BZSNNMDCSA-N 0.000 description 1
- MJWVXZABPOKJJF-ACRUOGEOSA-N Leu-Phe-Phe Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O MJWVXZABPOKJJF-ACRUOGEOSA-N 0.000 description 1
- BMVFXOQHDQZAQU-DCAQKATOSA-N Leu-Pro-Asp Chemical compound CC(C)C[C@@H](C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC(=O)O)C(=O)O)N BMVFXOQHDQZAQU-DCAQKATOSA-N 0.000 description 1
- UCBPDSYUVAAHCD-UWVGGRQHSA-N Leu-Pro-Gly Chemical compound CC(C)C[C@H](N)C(=O)N1CCC[C@H]1C(=O)NCC(O)=O UCBPDSYUVAAHCD-UWVGGRQHSA-N 0.000 description 1
- CHJKEDSZNSONPS-DCAQKATOSA-N Leu-Pro-Ser Chemical compound [H]N[C@@H](CC(C)C)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CO)C(O)=O CHJKEDSZNSONPS-DCAQKATOSA-N 0.000 description 1
- RGUXWMDNCPMQFB-YUMQZZPRSA-N Leu-Ser-Gly Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CO)C(=O)NCC(O)=O RGUXWMDNCPMQFB-YUMQZZPRSA-N 0.000 description 1
- GOFJOGXGMPHOGL-DCAQKATOSA-N Leu-Ser-Met Chemical compound CSCC[C@@H](C(O)=O)NC(=O)[C@H](CO)NC(=O)[C@@H](N)CC(C)C GOFJOGXGMPHOGL-DCAQKATOSA-N 0.000 description 1
- BRTVHXHCUSXYRI-CIUDSAMLSA-N Leu-Ser-Ser Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@@H](CO)C(O)=O BRTVHXHCUSXYRI-CIUDSAMLSA-N 0.000 description 1
- FGZVGOAAROXFAB-IXOXFDKPSA-N Leu-Thr-His Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)NC(=O)[C@H](CC(C)C)N)O FGZVGOAAROXFAB-IXOXFDKPSA-N 0.000 description 1
- XZNJZXJZBMBGGS-NHCYSSNCSA-N Leu-Val-Asn Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC(N)=O)C(O)=O XZNJZXJZBMBGGS-NHCYSSNCSA-N 0.000 description 1
- QQXJROOJCMIHIV-AVGNSLFASA-N Leu-Val-Met Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CCSC)C(O)=O QQXJROOJCMIHIV-AVGNSLFASA-N 0.000 description 1
- ROHFNLRQFUQHCH-UHFFFAOYSA-N Leucine Natural products CC(C)CC(N)C(O)=O ROHFNLRQFUQHCH-UHFFFAOYSA-N 0.000 description 1
- 108010028275 Leukocyte Elastase Proteins 0.000 description 1
- 102000016799 Leukocyte elastase Human genes 0.000 description 1
- 206010024971 Lower respiratory tract infections Diseases 0.000 description 1
- XFIHDSBIPWEYJJ-YUMQZZPRSA-N Lys-Ala-Gly Chemical compound OC(=O)CNC(=O)[C@H](C)NC(=O)[C@@H](N)CCCCN XFIHDSBIPWEYJJ-YUMQZZPRSA-N 0.000 description 1
- YRWCPXOFBKTCFY-NUTKFTJISA-N Lys-Ala-Trp Chemical compound C[C@@H](C(=O)N[C@@H](CC1=CNC2=CC=CC=C21)C(=O)O)NC(=O)[C@H](CCCCN)N YRWCPXOFBKTCFY-NUTKFTJISA-N 0.000 description 1
- IRNSXVOWSXSULE-DCAQKATOSA-N Lys-Ala-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)[C@@H](N)CCCCN IRNSXVOWSXSULE-DCAQKATOSA-N 0.000 description 1
- YNNPKXBBRZVIRX-IHRRRGAJSA-N Lys-Arg-Leu Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(C)C)C(O)=O YNNPKXBBRZVIRX-IHRRRGAJSA-N 0.000 description 1
- DGWXCIORNLWGGG-CIUDSAMLSA-N Lys-Asn-Ser Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CO)C(O)=O DGWXCIORNLWGGG-CIUDSAMLSA-N 0.000 description 1
- OVIVOCSURJYCTM-GUBZILKMSA-N Lys-Asp-Glu Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@H](C(O)=O)CCC(O)=O OVIVOCSURJYCTM-GUBZILKMSA-N 0.000 description 1
- GKFNXYMAMKJSKD-NHCYSSNCSA-N Lys-Asp-Val Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](C(C)C)C(O)=O GKFNXYMAMKJSKD-NHCYSSNCSA-N 0.000 description 1
- MLLKLNYPZRDIQG-GUBZILKMSA-N Lys-Cys-Gln Chemical compound C(CCN)C[C@@H](C(=O)N[C@@H](CS)C(=O)N[C@@H](CCC(=O)N)C(=O)O)N MLLKLNYPZRDIQG-GUBZILKMSA-N 0.000 description 1
- YVMQJGWLHRWMDF-MNXVOIDGSA-N Lys-Gln-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)N)NC(=O)[C@H](CCCCN)N YVMQJGWLHRWMDF-MNXVOIDGSA-N 0.000 description 1
- LLSUNJYOSCOOEB-GUBZILKMSA-N Lys-Glu-Asp Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(O)=O)C(O)=O LLSUNJYOSCOOEB-GUBZILKMSA-N 0.000 description 1
- PBIPLDMFHAICIP-DCAQKATOSA-N Lys-Glu-Glu Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O PBIPLDMFHAICIP-DCAQKATOSA-N 0.000 description 1
- DCRWPTBMWMGADO-AVGNSLFASA-N Lys-Glu-Leu Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(O)=O DCRWPTBMWMGADO-AVGNSLFASA-N 0.000 description 1
- PAMDBWYMLWOELY-SDDRHHMPSA-N Lys-Glu-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CCC(=O)O)NC(=O)[C@H](CCCCN)N)C(=O)O PAMDBWYMLWOELY-SDDRHHMPSA-N 0.000 description 1
- GHOIOYHDDKXIDX-SZMVWBNQSA-N Lys-Glu-Trp Chemical compound C1=CC=C2C(C[C@H](NC(=O)[C@H](CCC(O)=O)NC(=O)[C@@H](N)CCCCN)C(O)=O)=CNC2=C1 GHOIOYHDDKXIDX-SZMVWBNQSA-N 0.000 description 1
- SQJSXOQXJYAVRV-SRVKXCTJSA-N Lys-His-Asn Chemical compound C1=C(NC=N1)C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@H](CCCCN)N SQJSXOQXJYAVRV-SRVKXCTJSA-N 0.000 description 1
- WAIHHELKYSFIQN-XUXIUFHCSA-N Lys-Ile-Val Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](C(C)C)C(O)=O WAIHHELKYSFIQN-XUXIUFHCSA-N 0.000 description 1
- VMTYLUGCXIEDMV-QWRGUYRKSA-N Lys-Leu-Gly Chemical compound OC(=O)CNC(=O)[C@H](CC(C)C)NC(=O)[C@@H](N)CCCCN VMTYLUGCXIEDMV-QWRGUYRKSA-N 0.000 description 1
- VUTWYNQUSJWBHO-BZSNNMDCSA-N Lys-Leu-Tyr Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O VUTWYNQUSJWBHO-BZSNNMDCSA-N 0.000 description 1
- BEGQVWUZFXLNHZ-IHPCNDPISA-N Lys-Lys-Trp Chemical compound C1=CC=C2C(C[C@H](NC(=O)[C@H](CCCCN)NC(=O)[C@@H](N)CCCCN)C(O)=O)=CNC2=C1 BEGQVWUZFXLNHZ-IHPCNDPISA-N 0.000 description 1
- QQPSCXKFDSORFT-IHRRRGAJSA-N Lys-Lys-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@H](CCCCN)NC(=O)[C@@H](N)CCCCN QQPSCXKFDSORFT-IHRRRGAJSA-N 0.000 description 1
- URGPVYGVWLIRGT-DCAQKATOSA-N Lys-Met-Ala Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCSC)C(=O)N[C@@H](C)C(O)=O URGPVYGVWLIRGT-DCAQKATOSA-N 0.000 description 1
- LUAJJLPHUXPQLH-KKUMJFAQSA-N Lys-Phe-Ser Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CO)C(=O)O)NC(=O)[C@H](CCCCN)N LUAJJLPHUXPQLH-KKUMJFAQSA-N 0.000 description 1
- TVHCDSBMFQYPNA-RHYQMDGZSA-N Lys-Thr-Arg Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O TVHCDSBMFQYPNA-RHYQMDGZSA-N 0.000 description 1
- UWHCKWNPWKTMBM-WDCWCFNPSA-N Lys-Thr-Gln Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCC(N)=O)C(O)=O UWHCKWNPWKTMBM-WDCWCFNPSA-N 0.000 description 1
- WAAZECNCPVGPIV-RHYQMDGZSA-N Lys-Thr-Met Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCSC)C(O)=O WAAZECNCPVGPIV-RHYQMDGZSA-N 0.000 description 1
- XGZDDOKIHSYHTO-SZMVWBNQSA-N Lys-Trp-Glu Chemical compound C1=CC=C2C(C[C@H](NC(=O)[C@@H](N)CCCCN)C(=O)N[C@@H](CCC(O)=O)C(O)=O)=CNC2=C1 XGZDDOKIHSYHTO-SZMVWBNQSA-N 0.000 description 1
- XABXVVSWUVCZST-GVXVVHGQSA-N Lys-Val-Gln Chemical compound NC(=O)CC[C@@H](C(O)=O)NC(=O)[C@H](C(C)C)NC(=O)[C@@H](N)CCCCN XABXVVSWUVCZST-GVXVVHGQSA-N 0.000 description 1
- TXTZMVNJIRZABH-ULQDDVLXSA-N Lys-Val-Phe Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 TXTZMVNJIRZABH-ULQDDVLXSA-N 0.000 description 1
- RIPJMCFGQHGHNP-RHYQMDGZSA-N Lys-Val-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](C(C)C)NC(=O)[C@H](CCCCN)N)O RIPJMCFGQHGHNP-RHYQMDGZSA-N 0.000 description 1
- IKXQOBUBZSOWDY-AVGNSLFASA-N Lys-Val-Val Chemical compound CC(C)[C@@H](C(=O)N[C@@H](C(C)C)C(=O)O)NC(=O)[C@H](CCCCN)N IKXQOBUBZSOWDY-AVGNSLFASA-N 0.000 description 1
- 239000004472 Lysine Substances 0.000 description 1
- 108010052285 Membrane Proteins Proteins 0.000 description 1
- MCNGIXXCMJAURZ-VEVYYDQMSA-N Met-Asp-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CC(=O)O)NC(=O)[C@H](CCSC)N)O MCNGIXXCMJAURZ-VEVYYDQMSA-N 0.000 description 1
- SJDQOYTYNGZZJX-SRVKXCTJSA-N Met-Glu-Leu Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(O)=O SJDQOYTYNGZZJX-SRVKXCTJSA-N 0.000 description 1
- STLBOMUOQNIALW-BQBZGAKWSA-N Met-Gly-Cys Chemical compound CSCC[C@H](N)C(=O)NCC(=O)N[C@@H](CS)C(O)=O STLBOMUOQNIALW-BQBZGAKWSA-N 0.000 description 1
- JACAKCWAOHKQBV-UWVGGRQHSA-N Met-Gly-Lys Chemical compound CSCC[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CCCCN JACAKCWAOHKQBV-UWVGGRQHSA-N 0.000 description 1
- LRALLISKBZNSKN-BQBZGAKWSA-N Met-Gly-Ser Chemical compound CSCC[C@H](N)C(=O)NCC(=O)N[C@@H](CO)C(O)=O LRALLISKBZNSKN-BQBZGAKWSA-N 0.000 description 1
- MVMNUCOHQGYYKB-PEDHHIEDSA-N Met-Ile-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H]([C@@H](C)CC)C(=O)O)NC(=O)[C@H](CCSC)N MVMNUCOHQGYYKB-PEDHHIEDSA-N 0.000 description 1
- WTHGNAAQXISJHP-AVGNSLFASA-N Met-Lys-Val Chemical compound [H]N[C@@H](CCSC)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](C(C)C)C(O)=O WTHGNAAQXISJHP-AVGNSLFASA-N 0.000 description 1
- MIAZEQZXAFTCCG-UBHSHLNASA-N Met-Phe-Ala Chemical compound CSCC[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](C)C(O)=O)CC1=CC=CC=C1 MIAZEQZXAFTCCG-UBHSHLNASA-N 0.000 description 1
- YLDSJJOGQNEQJK-AVGNSLFASA-N Met-Pro-Leu Chemical compound CSCC[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC(C)C)C(O)=O YLDSJJOGQNEQJK-AVGNSLFASA-N 0.000 description 1
- MIXPUVSPPOWTCR-FXQIFTODSA-N Met-Ser-Ser Chemical compound [H]N[C@@H](CCSC)C(=O)N[C@@H](CO)C(=O)N[C@@H](CO)C(O)=O MIXPUVSPPOWTCR-FXQIFTODSA-N 0.000 description 1
- DBMLDOWSVHMQQN-XGEHTFHBSA-N Met-Ser-Thr Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)O)C(O)=O DBMLDOWSVHMQQN-XGEHTFHBSA-N 0.000 description 1
- PNHRPOWKRRJATF-IHRRRGAJSA-N Met-Tyr-Ser Chemical compound CSCC[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](CO)C(O)=O)CC1=CC=C(O)C=C1 PNHRPOWKRRJATF-IHRRRGAJSA-N 0.000 description 1
- JACMWNXOOUYXCD-JYJNAYRXSA-N Met-Val-Phe Chemical compound CSCC[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 JACMWNXOOUYXCD-JYJNAYRXSA-N 0.000 description 1
- 241001465754 Metazoa Species 0.000 description 1
- 241001529936 Murinae Species 0.000 description 1
- PQNASZJZHFPQLE-LURJTMIESA-N N(6)-methyl-L-lysine Chemical compound CNCCCC[C@H](N)C(O)=O PQNASZJZHFPQLE-LURJTMIESA-N 0.000 description 1
- 238000005481 NMR spectroscopy Methods 0.000 description 1
- 101710141454 Nucleoprotein Proteins 0.000 description 1
- 108091034117 Oligonucleotide Proteins 0.000 description 1
- 241000283973 Oryctolagus cuniculus Species 0.000 description 1
- 229930182555 Penicillin Natural products 0.000 description 1
- JGSARLDLIJGVTE-MBNYWOFBSA-N Penicillin G Chemical compound N([C@H]1[C@H]2SC([C@@H](N2C1=O)C(O)=O)(C)C)C(=O)CC1=CC=CC=C1 JGSARLDLIJGVTE-MBNYWOFBSA-N 0.000 description 1
- 108091005804 Peptidases Proteins 0.000 description 1
- LBSARGIQACMGDF-WBAXXEDZSA-N Phe-Ala-Phe Chemical compound C([C@H](N)C(=O)N[C@@H](C)C(=O)N[C@@H](CC=1C=CC=CC=1)C(O)=O)C1=CC=CC=C1 LBSARGIQACMGDF-WBAXXEDZSA-N 0.000 description 1
- BKWJQWJPZMUWEG-LFSVMHDDSA-N Phe-Ala-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)[C@@H](N)CC1=CC=CC=C1 BKWJQWJPZMUWEG-LFSVMHDDSA-N 0.000 description 1
- QCHNRQQVLJYDSI-DLOVCJGASA-N Phe-Asn-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H](CC(N)=O)NC(=O)[C@@H](N)CC1=CC=CC=C1 QCHNRQQVLJYDSI-DLOVCJGASA-N 0.000 description 1
- KAHUBGWSIQNZQQ-KKUMJFAQSA-N Phe-Asn-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](CC(N)=O)NC(=O)[C@@H](N)CC1=CC=CC=C1 KAHUBGWSIQNZQQ-KKUMJFAQSA-N 0.000 description 1
- WGXOKDLDIWSOCV-MELADBBJSA-N Phe-Asn-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CC(=O)N)NC(=O)[C@H](CC2=CC=CC=C2)N)C(=O)O WGXOKDLDIWSOCV-MELADBBJSA-N 0.000 description 1
- UEXCHCYDPAIVDE-SRVKXCTJSA-N Phe-Asp-Cys Chemical compound SC[C@@H](C(O)=O)NC(=O)[C@H](CC(O)=O)NC(=O)[C@@H](N)CC1=CC=CC=C1 UEXCHCYDPAIVDE-SRVKXCTJSA-N 0.000 description 1
- IUVYJBMTHARMIP-PCBIJLKTSA-N Phe-Asp-Ile Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O IUVYJBMTHARMIP-PCBIJLKTSA-N 0.000 description 1
- UMKYAYXCMYYNHI-AVGNSLFASA-N Phe-Gln-Asn Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)N[C@@H](CC(=O)N)C(=O)O)N UMKYAYXCMYYNHI-AVGNSLFASA-N 0.000 description 1
- LLGTYVHITPVGKR-RYUDHWBXSA-N Phe-Gln-Gly Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCC(N)=O)C(=O)NCC(O)=O LLGTYVHITPVGKR-RYUDHWBXSA-N 0.000 description 1
- RJYBHZVWJPUSLB-QEWYBTABSA-N Phe-Gln-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)N)NC(=O)[C@H](CC1=CC=CC=C1)N RJYBHZVWJPUSLB-QEWYBTABSA-N 0.000 description 1
- WYPVCIACUMJRIB-JYJNAYRXSA-N Phe-Gln-Lys Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)N[C@@H](CCCCN)C(=O)O)N WYPVCIACUMJRIB-JYJNAYRXSA-N 0.000 description 1
- WPTYDQPGBMDUBI-QWRGUYRKSA-N Phe-Gly-Asn Chemical compound N[C@@H](Cc1ccccc1)C(=O)NCC(=O)N[C@@H](CC(N)=O)C(O)=O WPTYDQPGBMDUBI-QWRGUYRKSA-N 0.000 description 1
- HBGFEEQFVBWYJQ-KBPBESRZSA-N Phe-Gly-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)CNC(=O)[C@@H](N)CC1=CC=CC=C1 HBGFEEQFVBWYJQ-KBPBESRZSA-N 0.000 description 1
- PMKIMKUGCSVFSV-CQDKDKBSSA-N Phe-His-Ala Chemical compound C[C@@H](C(=O)O)NC(=O)[C@H](CC1=CN=CN1)NC(=O)[C@H](CC2=CC=CC=C2)N PMKIMKUGCSVFSV-CQDKDKBSSA-N 0.000 description 1
- GXDPQJUBLBZKDY-IAVJCBSLSA-N Phe-Ile-Ile Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O GXDPQJUBLBZKDY-IAVJCBSLSA-N 0.000 description 1
- KZRQONDKKJCAOL-DKIMLUQUSA-N Phe-Leu-Ile Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O KZRQONDKKJCAOL-DKIMLUQUSA-N 0.000 description 1
- KPEIBEPEUAZWNS-ULQDDVLXSA-N Phe-Leu-Met Chemical compound CSCC[C@@H](C(O)=O)NC(=O)[C@H](CC(C)C)NC(=O)[C@@H](N)CC1=CC=CC=C1 KPEIBEPEUAZWNS-ULQDDVLXSA-N 0.000 description 1
- WLYPRKLMRIYGPP-JYJNAYRXSA-N Phe-Lys-Glu Chemical compound OC(=O)CC[C@@H](C(O)=O)NC(=O)[C@H](CCCCN)NC(=O)[C@@H](N)CC1=CC=CC=C1 WLYPRKLMRIYGPP-JYJNAYRXSA-N 0.000 description 1
- AUJWXNGCAQWLEI-KBPBESRZSA-N Phe-Lys-Gly Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCCCN)C(=O)NCC(O)=O AUJWXNGCAQWLEI-KBPBESRZSA-N 0.000 description 1
- ROOQMPCUFLDOSB-FHWLQOOXSA-N Phe-Phe-Gln Chemical compound C([C@H](N)C(=O)N[C@@H](CC=1C=CC=CC=1)C(=O)N[C@@H](CCC(N)=O)C(O)=O)C1=CC=CC=C1 ROOQMPCUFLDOSB-FHWLQOOXSA-N 0.000 description 1
- YMIZSYUAZJSOFL-SRVKXCTJSA-N Phe-Ser-Asn Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(N)=O)C(O)=O YMIZSYUAZJSOFL-SRVKXCTJSA-N 0.000 description 1
- XDMMOISUAHXXFD-SRVKXCTJSA-N Phe-Ser-Asp Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(O)=O)C(O)=O XDMMOISUAHXXFD-SRVKXCTJSA-N 0.000 description 1
- QSWKNJAPHQDAAS-MELADBBJSA-N Phe-Ser-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CO)NC(=O)[C@H](CC2=CC=CC=C2)N)C(=O)O QSWKNJAPHQDAAS-MELADBBJSA-N 0.000 description 1
- MRWOVVNKSXXLRP-IHPCNDPISA-N Phe-Ser-Trp Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(O)=O MRWOVVNKSXXLRP-IHPCNDPISA-N 0.000 description 1
- JHSRGEODDALISP-XVSYOHENSA-N Phe-Thr-Asn Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(N)=O)C(O)=O JHSRGEODDALISP-XVSYOHENSA-N 0.000 description 1
- BSTPNLNKHKBONJ-HTUGSXCWSA-N Phe-Thr-Gln Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)O)NC(=O)[C@H](CC1=CC=CC=C1)N)O BSTPNLNKHKBONJ-HTUGSXCWSA-N 0.000 description 1
- XNQMZHLAYFWSGJ-HTUGSXCWSA-N Phe-Thr-Glu Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCC(O)=O)C(O)=O XNQMZHLAYFWSGJ-HTUGSXCWSA-N 0.000 description 1
- MSSXKZBDKZAHCX-UNQGMJICSA-N Phe-Thr-Val Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C(C)C)C(O)=O MSSXKZBDKZAHCX-UNQGMJICSA-N 0.000 description 1
- APMXLWHMIVWLLR-BZSNNMDCSA-N Phe-Tyr-Ser Chemical compound C([C@H](N)C(=O)N[C@@H](CC=1C=CC(O)=CC=1)C(=O)N[C@@H](CO)C(O)=O)C1=CC=CC=C1 APMXLWHMIVWLLR-BZSNNMDCSA-N 0.000 description 1
- 229920003171 Poly (ethylene oxide) Polymers 0.000 description 1
- 229920002701 Polyoxyl 40 Stearate Polymers 0.000 description 1
- LUGOKRWYNMDGTD-FXQIFTODSA-N Pro-Cys-Asn Chemical compound C1C[C@H](NC1)C(=O)N[C@@H](CS)C(=O)N[C@@H](CC(=O)N)C(=O)O LUGOKRWYNMDGTD-FXQIFTODSA-N 0.000 description 1
- ZPPVJIJMIKTERM-YUMQZZPRSA-N Pro-Gln-Gly Chemical compound OC(=O)CNC(=O)[C@H](CCC(=O)N)NC(=O)[C@@H]1CCCN1 ZPPVJIJMIKTERM-YUMQZZPRSA-N 0.000 description 1
- HJSCRFZVGXAGNG-SRVKXCTJSA-N Pro-Gln-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CCC(N)=O)NC(=O)[C@@H]1CCCN1 HJSCRFZVGXAGNG-SRVKXCTJSA-N 0.000 description 1
- LQZZPNDMYNZPFT-KKUMJFAQSA-N Pro-Gln-Phe Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O LQZZPNDMYNZPFT-KKUMJFAQSA-N 0.000 description 1
- DIFXZGPHVCIVSQ-CIUDSAMLSA-N Pro-Gln-Ser Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CO)C(O)=O DIFXZGPHVCIVSQ-CIUDSAMLSA-N 0.000 description 1
- UAYHMOIGIQZLFR-NHCYSSNCSA-N Pro-Gln-Val Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](C(C)C)C(O)=O UAYHMOIGIQZLFR-NHCYSSNCSA-N 0.000 description 1
- LHALYDBUDCWMDY-CIUDSAMLSA-N Pro-Glu-Ala Chemical compound C[C@H](NC(=O)[C@H](CCC(O)=O)NC(=O)[C@@H]1CCCN1)C(O)=O LHALYDBUDCWMDY-CIUDSAMLSA-N 0.000 description 1
- KIPIKSXPPLABPN-CIUDSAMLSA-N Pro-Glu-Asn Chemical compound NC(=O)C[C@@H](C(O)=O)NC(=O)[C@H](CCC(O)=O)NC(=O)[C@@H]1CCCN1 KIPIKSXPPLABPN-CIUDSAMLSA-N 0.000 description 1
- FKLSMYYLJHYPHH-UWVGGRQHSA-N Pro-Gly-Leu Chemical compound [H]N1CCC[C@H]1C(=O)NCC(=O)N[C@@H](CC(C)C)C(O)=O FKLSMYYLJHYPHH-UWVGGRQHSA-N 0.000 description 1
- LNOWDSPAYBWJOR-PEDHHIEDSA-N Pro-Ile-Ile Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O LNOWDSPAYBWJOR-PEDHHIEDSA-N 0.000 description 1
- DRKAXLDECUGLFE-ULQDDVLXSA-N Pro-Leu-Phe Chemical compound CC(C)C[C@H](NC(=O)[C@@H]1CCCN1)C(=O)N[C@@H](Cc1ccccc1)C(O)=O DRKAXLDECUGLFE-ULQDDVLXSA-N 0.000 description 1
- FKYKZHOKDOPHSA-DCAQKATOSA-N Pro-Leu-Ser Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(O)=O FKYKZHOKDOPHSA-DCAQKATOSA-N 0.000 description 1
- CPRLKHJUFAXVTD-ULQDDVLXSA-N Pro-Leu-Tyr Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O CPRLKHJUFAXVTD-ULQDDVLXSA-N 0.000 description 1
- XQPHBAKJJJZOBX-SRVKXCTJSA-N Pro-Lys-Glu Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCC(O)=O)C(O)=O XQPHBAKJJJZOBX-SRVKXCTJSA-N 0.000 description 1
- ZZCJYPLMOPTZFC-SRVKXCTJSA-N Pro-Met-Met Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CCSC)C(O)=O ZZCJYPLMOPTZFC-SRVKXCTJSA-N 0.000 description 1
- ANESFYPBAJPYNJ-SDDRHHMPSA-N Pro-Met-Pro Chemical compound CSCC[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@@H]2CCCN2 ANESFYPBAJPYNJ-SDDRHHMPSA-N 0.000 description 1
- GFHXZNVJIKMAGO-IHRRRGAJSA-N Pro-Phe-Ser Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CO)C(O)=O GFHXZNVJIKMAGO-IHRRRGAJSA-N 0.000 description 1
- XYAFCOJKICBRDU-JYJNAYRXSA-N Pro-Phe-Val Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](C(C)C)C(O)=O XYAFCOJKICBRDU-JYJNAYRXSA-N 0.000 description 1
- DWPXHLIBFQLKLK-CYDGBPFRSA-N Pro-Pro-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1C(=O)[C@H]1NCCC1 DWPXHLIBFQLKLK-CYDGBPFRSA-N 0.000 description 1
- CGSOWZUPLOKYOR-AVGNSLFASA-N Pro-Pro-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1C(=O)[C@H]1NCCC1 CGSOWZUPLOKYOR-AVGNSLFASA-N 0.000 description 1
- GMJDSFYVTAMIBF-FXQIFTODSA-N Pro-Ser-Asp Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(O)=O)C(O)=O GMJDSFYVTAMIBF-FXQIFTODSA-N 0.000 description 1
- BJCXXMGGPHRSHV-GUBZILKMSA-N Pro-Ser-Met Chemical compound CSCC[C@@H](C(=O)O)NC(=O)[C@H](CO)NC(=O)[C@@H]1CCCN1 BJCXXMGGPHRSHV-GUBZILKMSA-N 0.000 description 1
- SNGZLPOXVRTNMB-LPEHRKFASA-N Pro-Ser-Pro Chemical compound C1C[C@H](NC1)C(=O)N[C@@H](CO)C(=O)N2CCC[C@@H]2C(=O)O SNGZLPOXVRTNMB-LPEHRKFASA-N 0.000 description 1
- AJJDPGVVNPUZCR-RHYQMDGZSA-N Pro-Thr-Lys Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@@H]1CCCN1)O AJJDPGVVNPUZCR-RHYQMDGZSA-N 0.000 description 1
- AIOWVDNPESPXRB-YTWAJWBKSA-N Pro-Thr-Pro Chemical compound C[C@H]([C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@@H]2CCCN2)O AIOWVDNPESPXRB-YTWAJWBKSA-N 0.000 description 1
- VEUACYMXJKXALX-IHRRRGAJSA-N Pro-Tyr-Ser Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CO)C(O)=O VEUACYMXJKXALX-IHRRRGAJSA-N 0.000 description 1
- DGDCSVGVWWAJRS-AVGNSLFASA-N Pro-Val-His Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)NC(=O)[C@@H]2CCCN2 DGDCSVGVWWAJRS-AVGNSLFASA-N 0.000 description 1
- FHJQROWZEJFZPO-SRVKXCTJSA-N Pro-Val-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@H](C(C)C)NC(=O)[C@@H]1CCCN1 FHJQROWZEJFZPO-SRVKXCTJSA-N 0.000 description 1
- 101710083689 Probable capsid protein Proteins 0.000 description 1
- 239000004365 Protease Substances 0.000 description 1
- 101710192141 Protein Nef Proteins 0.000 description 1
- 208000004756 Respiratory Insufficiency Diseases 0.000 description 1
- 241000315672 SARS coronavirus Species 0.000 description 1
- 240000004808 Saccharomyces cerevisiae Species 0.000 description 1
- 108010077895 Sarcosine Proteins 0.000 description 1
- 206010040047 Sepsis Diseases 0.000 description 1
- VQBLHWSPVYYZTB-DCAQKATOSA-N Ser-Arg-His Chemical compound C1=C(NC=N1)C[C@@H](C(=O)O)NC(=O)[C@H](CCCN=C(N)N)NC(=O)[C@H](CO)N VQBLHWSPVYYZTB-DCAQKATOSA-N 0.000 description 1
- NRCJWSGXMAPYQX-LPEHRKFASA-N Ser-Arg-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CCCN=C(N)N)NC(=O)[C@H](CO)N)C(=O)O NRCJWSGXMAPYQX-LPEHRKFASA-N 0.000 description 1
- OOKCGAYXSNJBGQ-ZLUOBGJFSA-N Ser-Asn-Asn Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O OOKCGAYXSNJBGQ-ZLUOBGJFSA-N 0.000 description 1
- UBRXAVQWXOWRSJ-ZLUOBGJFSA-N Ser-Asn-Asp Chemical compound C([C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)O)NC(=O)[C@H](CO)N)C(=O)N UBRXAVQWXOWRSJ-ZLUOBGJFSA-N 0.000 description 1
- UGJRQLURDVGULT-LKXGYXEUSA-N Ser-Asn-Thr Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O UGJRQLURDVGULT-LKXGYXEUSA-N 0.000 description 1
- TYYBJUYSTWJHGO-ZKWXMUAHSA-N Ser-Asn-Val Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](C(C)C)C(O)=O TYYBJUYSTWJHGO-ZKWXMUAHSA-N 0.000 description 1
- MESDJCNHLZBMEP-ZLUOBGJFSA-N Ser-Asp-Asp Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(O)=O)C(O)=O MESDJCNHLZBMEP-ZLUOBGJFSA-N 0.000 description 1
- BGOWRLSWJCVYAQ-CIUDSAMLSA-N Ser-Asp-Leu Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(C)C)C(O)=O BGOWRLSWJCVYAQ-CIUDSAMLSA-N 0.000 description 1
- SWSRFJZZMNLMLY-ZKWXMUAHSA-N Ser-Asp-Val Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](C(C)C)C(O)=O SWSRFJZZMNLMLY-ZKWXMUAHSA-N 0.000 description 1
- MOVJSUIKUNCVMG-ZLUOBGJFSA-N Ser-Cys-Ser Chemical compound C([C@@H](C(=O)N[C@@H](CS)C(=O)N[C@@H](CO)C(=O)O)N)O MOVJSUIKUNCVMG-ZLUOBGJFSA-N 0.000 description 1
- ULVMNZOKDBHKKI-ACZMJKKPSA-N Ser-Gln-Asp Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(O)=O)C(O)=O ULVMNZOKDBHKKI-ACZMJKKPSA-N 0.000 description 1
- SNVIOQXAHVORQM-WDSKDSINSA-N Ser-Gly-Gln Chemical compound [H]N[C@@H](CO)C(=O)NCC(=O)N[C@@H](CCC(N)=O)C(O)=O SNVIOQXAHVORQM-WDSKDSINSA-N 0.000 description 1
- YMTLKLXDFCSCNX-BYPYZUCNSA-N Ser-Gly-Gly Chemical compound OC[C@H](N)C(=O)NCC(=O)NCC(O)=O YMTLKLXDFCSCNX-BYPYZUCNSA-N 0.000 description 1
- UIGMAMGZOJVTDN-WHFBIAKZSA-N Ser-Gly-Ser Chemical compound OC[C@H](N)C(=O)NCC(=O)N[C@@H](CO)C(O)=O UIGMAMGZOJVTDN-WHFBIAKZSA-N 0.000 description 1
- IOVBCLGAJJXOHK-SRVKXCTJSA-N Ser-His-His Chemical compound C([C@H](NC(=O)[C@H](CO)N)C(=O)N[C@@H](CC=1NC=NC=1)C(O)=O)C1=CN=CN1 IOVBCLGAJJXOHK-SRVKXCTJSA-N 0.000 description 1
- SFTZTYBXIXLRGQ-JBDRJPRFSA-N Ser-Ile-Ala Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](C)C(O)=O SFTZTYBXIXLRGQ-JBDRJPRFSA-N 0.000 description 1
- YIUWWXVTYLANCJ-NAKRPEOUSA-N Ser-Ile-Arg Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O YIUWWXVTYLANCJ-NAKRPEOUSA-N 0.000 description 1
- LQESNKGTTNHZPZ-GHCJXIJMSA-N Ser-Ile-Asn Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC(N)=O)C(O)=O LQESNKGTTNHZPZ-GHCJXIJMSA-N 0.000 description 1
- XNCUYZKGQOCOQH-YUMQZZPRSA-N Ser-Leu-Gly Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(=O)NCC(O)=O XNCUYZKGQOCOQH-YUMQZZPRSA-N 0.000 description 1
- HEUVHBXOVZONPU-BJDJZHNGSA-N Ser-Leu-Ile Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O HEUVHBXOVZONPU-BJDJZHNGSA-N 0.000 description 1
- VMLONWHIORGALA-SRVKXCTJSA-N Ser-Leu-Leu Chemical compound CC(C)C[C@@H](C([O-])=O)NC(=O)[C@H](CC(C)C)NC(=O)[C@@H]([NH3+])CO VMLONWHIORGALA-SRVKXCTJSA-N 0.000 description 1
- XXNYYSXNXCJYKX-DCAQKATOSA-N Ser-Leu-Met Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCSC)C(O)=O XXNYYSXNXCJYKX-DCAQKATOSA-N 0.000 description 1
- XUDRHBPSPAPDJP-SRVKXCTJSA-N Ser-Lys-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CCCCN)NC(=O)[C@@H](N)CO XUDRHBPSPAPDJP-SRVKXCTJSA-N 0.000 description 1
- ZSLFCBHEINFXRS-LPEHRKFASA-N Ser-Met-Pro Chemical compound CSCC[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CO)N ZSLFCBHEINFXRS-LPEHRKFASA-N 0.000 description 1
- AXOHAHIUJHCLQR-IHRRRGAJSA-N Ser-Met-Tyr Chemical compound CSCC[C@@H](C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)O)NC(=O)[C@H](CO)N AXOHAHIUJHCLQR-IHRRRGAJSA-N 0.000 description 1
- NUEHQDHDLDXCRU-GUBZILKMSA-N Ser-Pro-Arg Chemical compound OC[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCCN=C(N)N)C(O)=O NUEHQDHDLDXCRU-GUBZILKMSA-N 0.000 description 1
- DINQYZRMXGWWTG-GUBZILKMSA-N Ser-Pro-Pro Chemical compound OC[C@H](N)C(=O)N1CCC[C@H]1C(=O)N1[C@H](C(O)=O)CCC1 DINQYZRMXGWWTG-GUBZILKMSA-N 0.000 description 1
- XQJCEKXQUJQNNK-ZLUOBGJFSA-N Ser-Ser-Ser Chemical compound OC[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@@H](CO)C(O)=O XQJCEKXQUJQNNK-ZLUOBGJFSA-N 0.000 description 1
- DKGRNFUXVTYRAS-UBHSHLNASA-N Ser-Ser-Trp Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(O)=O DKGRNFUXVTYRAS-UBHSHLNASA-N 0.000 description 1
- OLKICIBQRVSQMA-SRVKXCTJSA-N Ser-Ser-Tyr Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O OLKICIBQRVSQMA-SRVKXCTJSA-N 0.000 description 1
- RXUOAOOZIWABBW-XGEHTFHBSA-N Ser-Thr-Arg Chemical compound OC[C@H](N)C(=O)N[C@@H]([C@H](O)C)C(=O)N[C@H](C(O)=O)CCCN=C(N)N RXUOAOOZIWABBW-XGEHTFHBSA-N 0.000 description 1
- PURRNJBBXDDWLX-ZDLURKLDSA-N Ser-Thr-Gly Chemical compound C[C@H]([C@@H](C(=O)NCC(=O)O)NC(=O)[C@H](CO)N)O PURRNJBBXDDWLX-ZDLURKLDSA-N 0.000 description 1
- FLMYSKVSDVHLEW-SVSWQMSJSA-N Ser-Thr-Ile Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O FLMYSKVSDVHLEW-SVSWQMSJSA-N 0.000 description 1
- RTXKJFWHEBTABY-IHPCNDPISA-N Ser-Trp-Tyr Chemical compound C1=CC=C2C(=C1)C(=CN2)C[C@@H](C(=O)N[C@@H](CC3=CC=C(C=C3)O)C(=O)O)NC(=O)[C@H](CO)N RTXKJFWHEBTABY-IHPCNDPISA-N 0.000 description 1
- QYBRQMLZDDJBSW-AVGNSLFASA-N Ser-Tyr-Glu Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CCC(O)=O)C(O)=O QYBRQMLZDDJBSW-AVGNSLFASA-N 0.000 description 1
- ZVBCMFDJIMUELU-BZSNNMDCSA-N Ser-Tyr-Phe Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)O)NC(=O)[C@H](CC2=CC=C(C=C2)O)NC(=O)[C@H](CO)N ZVBCMFDJIMUELU-BZSNNMDCSA-N 0.000 description 1
- ANOQEBQWIAYIMV-AEJSXWLSSA-N Ser-Val-Pro Chemical compound CC(C)[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CO)N ANOQEBQWIAYIMV-AEJSXWLSSA-N 0.000 description 1
- MTCFGRXMJLQNBG-UHFFFAOYSA-N Serine Natural products OCC(N)C(O)=O MTCFGRXMJLQNBG-UHFFFAOYSA-N 0.000 description 1
- 108091027967 Small hairpin RNA Proteins 0.000 description 1
- 108020004459 Small interfering RNA Proteins 0.000 description 1
- 108091081024 Start codon Proteins 0.000 description 1
- KEGBFULVYKYJRD-LFSVMHDDSA-N Thr-Ala-Phe Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 KEGBFULVYKYJRD-LFSVMHDDSA-N 0.000 description 1
- PKXHGEXFMIZSER-QTKMDUPCSA-N Thr-Arg-His Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)N)O PKXHGEXFMIZSER-QTKMDUPCSA-N 0.000 description 1
- WFUAUEQXPVNAEF-ZJDVBMNYSA-N Thr-Arg-Thr Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@H](C(=O)N[C@@H]([C@@H](C)O)C(O)=O)CCCN=C(N)N WFUAUEQXPVNAEF-ZJDVBMNYSA-N 0.000 description 1
- QNJZOAHSYPXTAB-VEVYYDQMSA-N Thr-Asn-Met Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CCSC)C(O)=O QNJZOAHSYPXTAB-VEVYYDQMSA-N 0.000 description 1
- MFEBUIFJVPNZLO-OLHMAJIHSA-N Thr-Asp-Asn Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O MFEBUIFJVPNZLO-OLHMAJIHSA-N 0.000 description 1
- JEDIEMIJYSRUBB-FOHZUACHSA-N Thr-Asp-Gly Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)NCC(O)=O JEDIEMIJYSRUBB-FOHZUACHSA-N 0.000 description 1
- JXKMXEBNZCKSDY-JIOCBJNQSA-N Thr-Asp-Pro Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N1CCC[C@@H]1C(=O)O)N)O JXKMXEBNZCKSDY-JIOCBJNQSA-N 0.000 description 1
- XDARBNMYXKUFOJ-GSSVUCPTSA-N Thr-Asp-Thr Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O XDARBNMYXKUFOJ-GSSVUCPTSA-N 0.000 description 1
- LAFLAXHTDVNVEL-WDCWCFNPSA-N Thr-Gln-Lys Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)N[C@@H](CCCCN)C(=O)O)N)O LAFLAXHTDVNVEL-WDCWCFNPSA-N 0.000 description 1
- UDQBCBUXAQIZAK-GLLZPBPUSA-N Thr-Glu-Glu Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O UDQBCBUXAQIZAK-GLLZPBPUSA-N 0.000 description 1
- JKGGPMOUIAAJAA-YEPSODPASA-N Thr-Gly-Val Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)NCC(=O)N[C@@H](C(C)C)C(O)=O JKGGPMOUIAAJAA-YEPSODPASA-N 0.000 description 1
- IGGFFPOIFHZYKC-PBCZWWQYSA-N Thr-His-Asp Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC1=CN=CN1)C(=O)N[C@@H](CC(=O)O)C(=O)O)N)O IGGFFPOIFHZYKC-PBCZWWQYSA-N 0.000 description 1
- YDWLCDQXLCILCZ-BWAGICSOSA-N Thr-His-Tyr Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O YDWLCDQXLCILCZ-BWAGICSOSA-N 0.000 description 1
- UYTYTDMCDBPDSC-URLPEUOOSA-N Thr-Ile-Phe Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)O)NC(=O)[C@H]([C@@H](C)O)N UYTYTDMCDBPDSC-URLPEUOOSA-N 0.000 description 1
- FQPDRTDDEZXCEC-SVSWQMSJSA-N Thr-Ile-Ser Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CO)C(O)=O FQPDRTDDEZXCEC-SVSWQMSJSA-N 0.000 description 1
- AMXMBCAXAZUCFA-RHYQMDGZSA-N Thr-Leu-Arg Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O AMXMBCAXAZUCFA-RHYQMDGZSA-N 0.000 description 1
- FIFDDJFLNVAVMS-RHYQMDGZSA-N Thr-Leu-Met Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCSC)C(O)=O FIFDDJFLNVAVMS-RHYQMDGZSA-N 0.000 description 1
- KRDSCBLRHORMRK-JXUBOQSCSA-N Thr-Lys-Ala Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](C)C(O)=O KRDSCBLRHORMRK-JXUBOQSCSA-N 0.000 description 1
- BDGBHYCAZJPLHX-HJGDQZAQSA-N Thr-Lys-Asn Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(N)=O)C(O)=O BDGBHYCAZJPLHX-HJGDQZAQSA-N 0.000 description 1
- SCSVNSNWUTYSFO-WDCWCFNPSA-N Thr-Lys-Glu Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCC(O)=O)C(O)=O SCSVNSNWUTYSFO-WDCWCFNPSA-N 0.000 description 1
- QNCFWHZVRNXAKW-OEAJRASXSA-N Thr-Lys-Phe Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O QNCFWHZVRNXAKW-OEAJRASXSA-N 0.000 description 1
- WVVOFCVMHAXGLE-LFSVMHDDSA-N Thr-Phe-Ala Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](C)C(O)=O WVVOFCVMHAXGLE-LFSVMHDDSA-N 0.000 description 1
- WRQLCVIALDUQEQ-UNQGMJICSA-N Thr-Phe-Arg Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O WRQLCVIALDUQEQ-UNQGMJICSA-N 0.000 description 1
- NYQIZWROIMIQSL-VEVYYDQMSA-N Thr-Pro-Asn Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC(N)=O)C(O)=O NYQIZWROIMIQSL-VEVYYDQMSA-N 0.000 description 1
- DEGCBBCMYWNJNA-RHYQMDGZSA-N Thr-Pro-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1C(=O)[C@@H](N)[C@@H](C)O DEGCBBCMYWNJNA-RHYQMDGZSA-N 0.000 description 1
- STUAPCLEDMKXKL-LKXGYXEUSA-N Thr-Ser-Asn Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(N)=O)C(O)=O STUAPCLEDMKXKL-LKXGYXEUSA-N 0.000 description 1
- XZUBGOYOGDRYFC-XGEHTFHBSA-N Thr-Ser-Met Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CCSC)C(O)=O XZUBGOYOGDRYFC-XGEHTFHBSA-N 0.000 description 1
- WPSKTVVMQCXPRO-BWBBJGPYSA-N Thr-Ser-Ser Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CO)C(O)=O WPSKTVVMQCXPRO-BWBBJGPYSA-N 0.000 description 1
- NDZYTIMDOZMECO-SHGPDSBTSA-N Thr-Thr-Ala Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C)C(O)=O NDZYTIMDOZMECO-SHGPDSBTSA-N 0.000 description 1
- MFMGPEKYBXFIRF-SUSMZKCASA-N Thr-Thr-Gln Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCC(N)=O)C(O)=O MFMGPEKYBXFIRF-SUSMZKCASA-N 0.000 description 1
- UQCNIMDPYICBTR-KYNKHSRBSA-N Thr-Thr-Gly Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)O)C(=O)NCC(O)=O UQCNIMDPYICBTR-KYNKHSRBSA-N 0.000 description 1
- NHQVWACSJZJCGJ-FLBSBUHZSA-N Thr-Thr-Ile Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O NHQVWACSJZJCGJ-FLBSBUHZSA-N 0.000 description 1
- BBPCSGKKPJUYRB-UVOCVTCTSA-N Thr-Thr-Leu Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(C)C)C(O)=O BBPCSGKKPJUYRB-UVOCVTCTSA-N 0.000 description 1
- IJKNKFJZOJCKRR-GBALPHGKSA-N Thr-Trp-Ser Chemical compound C1=CC=C2C(C[C@H](NC(=O)[C@@H](N)[C@H](O)C)C(=O)N[C@@H](CO)C(O)=O)=CNC2=C1 IJKNKFJZOJCKRR-GBALPHGKSA-N 0.000 description 1
- BZTSQFWJNJYZSX-JRQIVUDYSA-N Thr-Tyr-Asp Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(O)=O)C(O)=O BZTSQFWJNJYZSX-JRQIVUDYSA-N 0.000 description 1
- QGVBFDIREUUSHX-IFFSRLJSSA-N Thr-Val-Gln Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CCC(N)=O)C(O)=O QGVBFDIREUUSHX-IFFSRLJSSA-N 0.000 description 1
- SPIFGZFZMVLPHN-UNQGMJICSA-N Thr-Val-Phe Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O SPIFGZFZMVLPHN-UNQGMJICSA-N 0.000 description 1
- 108091036066 Three prime untranslated region Proteins 0.000 description 1
- 108010046722 Thrombospondin 1 Proteins 0.000 description 1
- 102100036034 Thrombospondin-1 Human genes 0.000 description 1
- 108020004566 Transfer RNA Proteins 0.000 description 1
- 206010052779 Transplant rejections Diseases 0.000 description 1
- VZBWRZGNEPBRDE-HZUKXOBISA-N Trp-Ala-Pro Chemical compound C[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CC2=CNC3=CC=CC=C32)N VZBWRZGNEPBRDE-HZUKXOBISA-N 0.000 description 1
- NIWAGRRZHCMPOY-GMVOTWDCSA-N Trp-Ala-Tyr Chemical compound C[C@@H](C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)O)NC(=O)[C@H](CC2=CNC3=CC=CC=C32)N NIWAGRRZHCMPOY-GMVOTWDCSA-N 0.000 description 1
- ICNFHVUVCNWUAB-SZMVWBNQSA-N Trp-Arg-Val Chemical compound CC(C)[C@@H](C(=O)O)NC(=O)[C@H](CCCN=C(N)N)NC(=O)[C@H](CC1=CNC2=CC=CC=C21)N ICNFHVUVCNWUAB-SZMVWBNQSA-N 0.000 description 1
- GWQUSADRQCTMHN-NWLDYVSISA-N Trp-Gln-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)N)NC(=O)[C@H](CC1=CNC2=CC=CC=C21)N)O GWQUSADRQCTMHN-NWLDYVSISA-N 0.000 description 1
- ILDJYIDXESUBOE-HSCHXYMDSA-N Trp-Ile-Lys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CC1=CNC2=CC=CC=C21)N ILDJYIDXESUBOE-HSCHXYMDSA-N 0.000 description 1
- YVXIAOOYAKBAAI-SZMVWBNQSA-N Trp-Leu-Gln Chemical compound C1=CC=C2C(C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(N)=O)C(O)=O)=CNC2=C1 YVXIAOOYAKBAAI-SZMVWBNQSA-N 0.000 description 1
- BGWSLEYVITZIQP-DCPHZVHLSA-N Trp-Phe-Ala Chemical compound C[C@H](NC(=O)[C@H](Cc1ccccc1)NC(=O)[C@@H](N)Cc1c[nH]c2ccccc12)C(O)=O BGWSLEYVITZIQP-DCPHZVHLSA-N 0.000 description 1
- GEGYPBOPIGNZIF-CWRNSKLLSA-N Trp-Ser-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CO)NC(=O)[C@H](CC2=CNC3=CC=CC=C32)N)C(=O)O GEGYPBOPIGNZIF-CWRNSKLLSA-N 0.000 description 1
- GSCPHMSPGQSZJT-JYBASQMISA-N Trp-Ser-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CO)NC(=O)[C@H](CC1=CNC2=CC=CC=C21)N)O GSCPHMSPGQSZJT-JYBASQMISA-N 0.000 description 1
- HTGJDTPQYFMKNC-VFAJRCTISA-N Trp-Thr-Leu Chemical compound C1=CC=C2C(C[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](CC(C)C)C(O)=O)[C@@H](C)O)=CNC2=C1 HTGJDTPQYFMKNC-VFAJRCTISA-N 0.000 description 1
- DVLHKUWLNKDINO-PMVMPFDFSA-N Trp-Tyr-Leu Chemical compound [H]N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(C)C)C(O)=O DVLHKUWLNKDINO-PMVMPFDFSA-N 0.000 description 1
- NMOIRIIIUVELLY-WDSOQIARSA-N Trp-Val-Leu Chemical compound C1=CC=C2C(C[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](CC(C)C)C(O)=O)C(C)C)=CNC2=C1 NMOIRIIIUVELLY-WDSOQIARSA-N 0.000 description 1
- 206010067584 Type 1 diabetes mellitus Diseases 0.000 description 1
- YGKVNUAKYPGORG-AVGNSLFASA-N Tyr-Asp-Glu Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O YGKVNUAKYPGORG-AVGNSLFASA-N 0.000 description 1
- VFJIWSJKZJTQII-SRVKXCTJSA-N Tyr-Asp-Ser Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CO)C(O)=O VFJIWSJKZJTQII-SRVKXCTJSA-N 0.000 description 1
- JWGXUKHIKXZWNG-RYUDHWBXSA-N Tyr-Gly-Gln Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)NCC(=O)N[C@@H](CCC(=O)N)C(=O)O)N)O JWGXUKHIKXZWNG-RYUDHWBXSA-N 0.000 description 1
- HFJJDMOFTCQGEI-STECZYCISA-N Tyr-Ile-Met Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCSC)C(=O)O)NC(=O)[C@H](CC1=CC=C(C=C1)O)N HFJJDMOFTCQGEI-STECZYCISA-N 0.000 description 1
- AZZLDIDWPZLCCW-ZEWNOJEFSA-N Tyr-Ile-Phe Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O AZZLDIDWPZLCCW-ZEWNOJEFSA-N 0.000 description 1
- QARCDOCCDOLJSF-HJPIBITLSA-N Tyr-Ile-Ser Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CO)C(=O)O)NC(=O)[C@H](CC1=CC=C(C=C1)O)N QARCDOCCDOLJSF-HJPIBITLSA-N 0.000 description 1
- QSFJHIRIHOJRKS-ULQDDVLXSA-N Tyr-Leu-Arg Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O QSFJHIRIHOJRKS-ULQDDVLXSA-N 0.000 description 1
- DWAMXBFJNZIHMC-KBPBESRZSA-N Tyr-Leu-Gly Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(C)C)C(=O)NCC(O)=O DWAMXBFJNZIHMC-KBPBESRZSA-N 0.000 description 1
- GZOCMHSZGGJBCX-ULQDDVLXSA-N Tyr-Lys-Met Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCSC)C(O)=O GZOCMHSZGGJBCX-ULQDDVLXSA-N 0.000 description 1
- FASACHWGQBNSRO-ZEWNOJEFSA-N Tyr-Phe-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CC1=CC=CC=C1)NC(=O)[C@H](CC2=CC=C(C=C2)O)N FASACHWGQBNSRO-ZEWNOJEFSA-N 0.000 description 1
- VBFVQTPETKJCQW-RPTUDFQQSA-N Tyr-Phe-Thr Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H]([C@@H](C)O)C(O)=O VBFVQTPETKJCQW-RPTUDFQQSA-N 0.000 description 1
- SOAUMCDLIUGXJJ-SRVKXCTJSA-N Tyr-Ser-Asn Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(N)=O)C(O)=O SOAUMCDLIUGXJJ-SRVKXCTJSA-N 0.000 description 1
- ZPFLBLFITJCBTP-QWRGUYRKSA-N Tyr-Ser-Gly Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CO)C(=O)NCC(O)=O ZPFLBLFITJCBTP-QWRGUYRKSA-N 0.000 description 1
- HRHYJNLMIJWGLF-BZSNNMDCSA-N Tyr-Ser-Phe Chemical compound C([C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC=1C=CC=CC=1)C(O)=O)C1=CC=C(O)C=C1 HRHYJNLMIJWGLF-BZSNNMDCSA-N 0.000 description 1
- GOPQNCQSXBJAII-ULQDDVLXSA-N Tyr-Val-Lys Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CC1=CC=C(C=C1)O)N GOPQNCQSXBJAII-ULQDDVLXSA-N 0.000 description 1
- PAPWZOJOLKZEFR-AVGNSLFASA-N Val-Arg-Lys Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CCCCN)C(=O)O)N PAPWZOJOLKZEFR-AVGNSLFASA-N 0.000 description 1
- BYOHPUZJVXWHAE-BYULHYEWSA-N Val-Asn-Asn Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CC(=O)N)C(=O)O)N BYOHPUZJVXWHAE-BYULHYEWSA-N 0.000 description 1
- QGFPYRPIUXBYGR-YDHLFZDLSA-N Val-Asn-Phe Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)O)N QGFPYRPIUXBYGR-YDHLFZDLSA-N 0.000 description 1
- TZVUSFMQWPWHON-NHCYSSNCSA-N Val-Asp-Leu Chemical compound CC(C)C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)O)NC(=O)[C@H](C(C)C)N TZVUSFMQWPWHON-NHCYSSNCSA-N 0.000 description 1
- BMGOFDMKDVVGJG-NHCYSSNCSA-N Val-Asp-Lys Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CCCCN)C(=O)O)N BMGOFDMKDVVGJG-NHCYSSNCSA-N 0.000 description 1
- HHSILIQTHXABKM-YDHLFZDLSA-N Val-Asp-Phe Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](Cc1ccccc1)C(O)=O HHSILIQTHXABKM-YDHLFZDLSA-N 0.000 description 1
- YODDULVCGFQRFZ-ZKWXMUAHSA-N Val-Asp-Ser Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CO)C(O)=O YODDULVCGFQRFZ-ZKWXMUAHSA-N 0.000 description 1
- QHFQQRKNGCXTHL-AUTRQRHGSA-N Val-Gln-Glu Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O QHFQQRKNGCXTHL-AUTRQRHGSA-N 0.000 description 1
- IWZYXFRGWKEKBJ-GVXVVHGQSA-N Val-Gln-His Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)N IWZYXFRGWKEKBJ-GVXVVHGQSA-N 0.000 description 1
- BRPKEERLGYNCNC-NHCYSSNCSA-N Val-Glu-Arg Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@H](C(O)=O)CCCN=C(N)N BRPKEERLGYNCNC-NHCYSSNCSA-N 0.000 description 1
- AHHJARQXFFGOKF-NRPADANISA-N Val-Glu-Cys Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H](CS)C(=O)O)N AHHJARQXFFGOKF-NRPADANISA-N 0.000 description 1
- ZXAGTABZUOMUDO-GVXVVHGQSA-N Val-Glu-Lys Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H](CCCCN)C(=O)O)N ZXAGTABZUOMUDO-GVXVVHGQSA-N 0.000 description 1
- MHAHQDBEIDPFQS-NHCYSSNCSA-N Val-Glu-Met Chemical compound CSCC[C@@H](C(O)=O)NC(=O)[C@H](CCC(O)=O)NC(=O)[C@@H](N)C(C)C MHAHQDBEIDPFQS-NHCYSSNCSA-N 0.000 description 1
- SDSCOOZQQGUQFC-GVXVVHGQSA-N Val-His-Gln Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC1=CN=CN1)C(=O)N[C@@H](CCC(=O)N)C(=O)O)N SDSCOOZQQGUQFC-GVXVVHGQSA-N 0.000 description 1
- SYSWVVCYSXBVJG-RHYQMDGZSA-N Val-Leu-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CC(C)C)NC(=O)[C@H](C(C)C)N)O SYSWVVCYSXBVJG-RHYQMDGZSA-N 0.000 description 1
- KTEZUXISLQTDDQ-NHCYSSNCSA-N Val-Lys-Asp Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(=O)O)C(=O)O)N KTEZUXISLQTDDQ-NHCYSSNCSA-N 0.000 description 1
- IJGPOONOTBNTFS-GVXVVHGQSA-N Val-Lys-Glu Chemical compound [H]N[C@@H](C(C)C)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCC(O)=O)C(O)=O IJGPOONOTBNTFS-GVXVVHGQSA-N 0.000 description 1
- JAKHAONCJJZVHT-DCAQKATOSA-N Val-Lys-Ser Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CO)C(=O)O)N JAKHAONCJJZVHT-DCAQKATOSA-N 0.000 description 1
- MBGFDZDWMDLXHQ-GUBZILKMSA-N Val-Met-Ala Chemical compound C[C@@H](C(=O)O)NC(=O)[C@H](CCSC)NC(=O)[C@H](C(C)C)N MBGFDZDWMDLXHQ-GUBZILKMSA-N 0.000 description 1
- UZFNHAXYMICTBU-DZKIICNBSA-N Val-Phe-Gln Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCC(=O)N)C(=O)O)N UZFNHAXYMICTBU-DZKIICNBSA-N 0.000 description 1
- BCBFMJYTNKDALA-UFYCRDLUSA-N Val-Phe-Phe Chemical compound N[C@@H](C(C)C)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)O BCBFMJYTNKDALA-UFYCRDLUSA-N 0.000 description 1
- YKNOJPJWNVHORX-UNQGMJICSA-N Val-Phe-Thr Chemical compound CC(C)[C@H](N)C(=O)N[C@H](C(=O)N[C@@H]([C@@H](C)O)C(O)=O)CC1=CC=CC=C1 YKNOJPJWNVHORX-UNQGMJICSA-N 0.000 description 1
- GBIUHAYJGWVNLN-UHFFFAOYSA-N Val-Ser-Pro Natural products CC(C)C(N)C(=O)NC(CO)C(=O)N1CCCC1C(O)=O GBIUHAYJGWVNLN-UHFFFAOYSA-N 0.000 description 1
- UVHFONIHVHLDDQ-IFFSRLJSSA-N Val-Thr-Glu Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)O)NC(=O)[C@H](C(C)C)N)O UVHFONIHVHLDDQ-IFFSRLJSSA-N 0.000 description 1
- PDDJTOSAVNRJRH-UNQGMJICSA-N Val-Thr-Phe Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)O)NC(=O)[C@H](C(C)C)N)O PDDJTOSAVNRJRH-UNQGMJICSA-N 0.000 description 1
- IECQJCJNPJVUSB-IHRRRGAJSA-N Val-Tyr-Ser Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](Cc1ccc(O)cc1)C(=O)N[C@@H](CO)C(O)=O IECQJCJNPJVUSB-IHRRRGAJSA-N 0.000 description 1
- 206010047115 Vasculitis Diseases 0.000 description 1
- 108010059722 Viral Fusion Proteins Proteins 0.000 description 1
- 208000036142 Viral infection Diseases 0.000 description 1
- 241000713325 Visna/maedi virus Species 0.000 description 1
- 208000027276 Von Willebrand disease Diseases 0.000 description 1
- 208000027418 Wounds and injury Diseases 0.000 description 1
- 241000269370 Xenopus <genus> Species 0.000 description 1
- 230000002378 acidificating effect Effects 0.000 description 1
- 206010069351 acute lung injury Diseases 0.000 description 1
- 206010000891 acute myocardial infarction Diseases 0.000 description 1
- 230000033289 adaptive immune response Effects 0.000 description 1
- 210000003539 airway basal cell Anatomy 0.000 description 1
- 235000004279 alanine Nutrition 0.000 description 1
- 238000012867 alanine scanning Methods 0.000 description 1
- CDUUKBXTEOFITR-UHFFFAOYSA-N alpha-methylserine Natural products OCC([NH3+])(C)C([O-])=O CDUUKBXTEOFITR-UHFFFAOYSA-N 0.000 description 1
- 230000000689 aminoacylating effect Effects 0.000 description 1
- 125000000129 anionic group Chemical group 0.000 description 1
- 239000003945 anionic surfactant Substances 0.000 description 1
- 238000000137 annealing Methods 0.000 description 1
- 230000003110 anti-inflammatory effect Effects 0.000 description 1
- 239000000427 antigen Substances 0.000 description 1
- 108091007433 antigens Proteins 0.000 description 1
- 102000036639 antigens Human genes 0.000 description 1
- 239000000074 antisense oligonucleotide Substances 0.000 description 1
- 238000012230 antisense oligonucleotides Methods 0.000 description 1
- 238000013459 approach Methods 0.000 description 1
- ODKSFYDXXFIFQN-UHFFFAOYSA-N arginine Natural products OC(=O)C(N)CCCNC(N)=N ODKSFYDXXFIFQN-UHFFFAOYSA-N 0.000 description 1
- 108010043240 arginyl-leucyl-glycine Proteins 0.000 description 1
- 235000009582 asparagine Nutrition 0.000 description 1
- 229960001230 asparagine Drugs 0.000 description 1
- 235000003704 aspartic acid Nutrition 0.000 description 1
- 108010069205 aspartyl-phenylalanine Proteins 0.000 description 1
- 108010068265 aspartyltyrosine Proteins 0.000 description 1
- 230000001580 bacterial effect Effects 0.000 description 1
- 208000022362 bacterial infectious disease Diseases 0.000 description 1
- 230000004888 barrier function Effects 0.000 description 1
- 238000013398 bayesian method Methods 0.000 description 1
- SQVRNKJHWKZAKO-UHFFFAOYSA-N beta-N-Acetyl-D-neuraminic acid Natural products CC(=O)NC1C(O)CC(O)(C(O)=O)OC1C(O)C(O)CO SQVRNKJHWKZAKO-UHFFFAOYSA-N 0.000 description 1
- OQFSQFPPLPISGP-UHFFFAOYSA-N beta-carboxyaspartic acid Natural products OC(=O)C(N)C(C(O)=O)C(O)=O OQFSQFPPLPISGP-UHFFFAOYSA-N 0.000 description 1
- 229940093761 bile salts Drugs 0.000 description 1
- 230000009141 biological interaction Effects 0.000 description 1
- 230000033228 biological regulation Effects 0.000 description 1
- 230000015572 biosynthetic process Effects 0.000 description 1
- 210000004369 blood Anatomy 0.000 description 1
- 239000008280 blood Substances 0.000 description 1
- 210000000621 bronchi Anatomy 0.000 description 1
- 239000000872 buffer Substances 0.000 description 1
- 244000309464 bull Species 0.000 description 1
- 238000004364 calculation method Methods 0.000 description 1
- 210000000234 capsid Anatomy 0.000 description 1
- 239000003093 cationic surfactant Substances 0.000 description 1
- 230000007910 cell fusion Effects 0.000 description 1
- 210000004671 cell-free system Anatomy 0.000 description 1
- 238000007385 chemical modification Methods 0.000 description 1
- 238000006243 chemical reaction Methods 0.000 description 1
- 230000007665 chronic toxicity Effects 0.000 description 1
- 231100000160 chronic toxicity Toxicity 0.000 description 1
- 230000007882 cirrhosis Effects 0.000 description 1
- 208000019425 cirrhosis of liver Diseases 0.000 description 1
- 238000007621 cluster analysis Methods 0.000 description 1
- 230000015271 coagulation Effects 0.000 description 1
- 238000005345 coagulation Methods 0.000 description 1
- 238000004590 computer program Methods 0.000 description 1
- 238000012937 correction Methods 0.000 description 1
- 239000012228 culture supernatant Substances 0.000 description 1
- 238000007418 data mining Methods 0.000 description 1
- 238000013500 data storage Methods 0.000 description 1
- 230000007547 defect Effects 0.000 description 1
- 239000003599 detergent Substances 0.000 description 1
- 206010012601 diabetes mellitus Diseases 0.000 description 1
- 238000003745 diagnosis Methods 0.000 description 1
- 235000014113 dietary fatty acids Nutrition 0.000 description 1
- 238000002050 diffraction method Methods 0.000 description 1
- 238000007865 diluting Methods 0.000 description 1
- FSXRLASFHBWESK-UHFFFAOYSA-N dipeptide phenylalanyl-tyrosine Natural products C=1C=C(O)C=CC=1CC(C(O)=O)NC(=O)C(N)CC1=CC=CC=C1 FSXRLASFHBWESK-UHFFFAOYSA-N 0.000 description 1
- LOKCTEFSRHRXRJ-UHFFFAOYSA-I dipotassium trisodium dihydrogen phosphate hydrogen phosphate dichloride Chemical compound P(=O)(O)(O)[O-].[K+].P(=O)(O)([O-])[O-].[Na+].[Na+].[Cl-].[K+].[Cl-].[Na+] LOKCTEFSRHRXRJ-UHFFFAOYSA-I 0.000 description 1
- 208000016097 disease of metabolism Diseases 0.000 description 1
- 239000006185 dispersion Substances 0.000 description 1
- 238000007876 drug discovery Methods 0.000 description 1
- 238000001035 drying Methods 0.000 description 1
- 238000002003 electron diffraction Methods 0.000 description 1
- 238000010828 elution Methods 0.000 description 1
- 239000000839 emulsion Substances 0.000 description 1
- 210000003890 endocrine cell Anatomy 0.000 description 1
- 238000002641 enzyme replacement therapy Methods 0.000 description 1
- 150000002148 esters Chemical class 0.000 description 1
- 238000011156 evaluation Methods 0.000 description 1
- 201000007386 factor VII deficiency Diseases 0.000 description 1
- 229930195729 fatty acid Natural products 0.000 description 1
- 239000000194 fatty acid Substances 0.000 description 1
- 150000004665 fatty acids Chemical class 0.000 description 1
- 239000012091 fetal bovine serum Substances 0.000 description 1
- 230000004761 fibrosis Effects 0.000 description 1
- 238000000684 flow cytometry Methods 0.000 description 1
- 235000003599 food sweetener Nutrition 0.000 description 1
- 238000003052 fractional factorial design Methods 0.000 description 1
- 108020001507 fusion proteins Proteins 0.000 description 1
- 102000037865 fusion proteins Human genes 0.000 description 1
- 101150098622 gag gene Proteins 0.000 description 1
- 208000016361 genetic disease Diseases 0.000 description 1
- 230000000762 glandular Effects 0.000 description 1
- 235000013922 glutamic acid Nutrition 0.000 description 1
- 239000004220 glutamic acid Substances 0.000 description 1
- ZDXPYRJPNDTMRX-UHFFFAOYSA-N glutamine Natural products OC(=O)C(N)CCC(N)=O ZDXPYRJPNDTMRX-UHFFFAOYSA-N 0.000 description 1
- 108010085059 glutamyl-arginyl-proline Proteins 0.000 description 1
- 235000011187 glycerol Nutrition 0.000 description 1
- XBGGUPMXALFZOT-UHFFFAOYSA-N glycyl-L-tyrosine hemihydrate Natural products NCC(=O)NC(C(O)=O)CC1=CC=C(O)C=C1 XBGGUPMXALFZOT-UHFFFAOYSA-N 0.000 description 1
- 108010051307 glycyl-glycyl-proline Proteins 0.000 description 1
- 108010066198 glycyl-leucyl-phenylalanine Proteins 0.000 description 1
- 108010087823 glycyltyrosine Proteins 0.000 description 1
- 208000024908 graft versus host disease Diseases 0.000 description 1
- 108010036413 histidylglycine Proteins 0.000 description 1
- 108010028295 histidylhistidine Proteins 0.000 description 1
- 108010092114 histidylphenylalanine Proteins 0.000 description 1
- 230000005099 host tropism Effects 0.000 description 1
- 230000002209 hydrophobic effect Effects 0.000 description 1
- 125000001165 hydrophobic group Chemical group 0.000 description 1
- MWFRVMDVLYIXJF-BYPYZUCNSA-N hydroxyethylcysteine Chemical compound OC(=O)[C@@H](N)CSCCO MWFRVMDVLYIXJF-BYPYZUCNSA-N 0.000 description 1
- 229960002591 hydroxyproline Drugs 0.000 description 1
- 230000002163 immunogen Effects 0.000 description 1
- 238000001727 in vivo Methods 0.000 description 1
- 238000011534 incubation Methods 0.000 description 1
- 230000001939 inductive effect Effects 0.000 description 1
- 230000002458 infectious effect Effects 0.000 description 1
- 210000004969 inflammatory cell Anatomy 0.000 description 1
- 230000004054 inflammatory process Effects 0.000 description 1
- 230000028709 inflammatory response Effects 0.000 description 1
- 230000002401 inhibitory effect Effects 0.000 description 1
- 208000014674 injury Diseases 0.000 description 1
- 230000003993 interaction Effects 0.000 description 1
- 230000003834 intracellular effect Effects 0.000 description 1
- 230000007794 irritation Effects 0.000 description 1
- 108010031424 isoleucyl-prolyl-proline Proteins 0.000 description 1
- 108010044374 isoleucyl-tyrosine Proteins 0.000 description 1
- GCHPUFAZSONQIV-UHFFFAOYSA-N isovaline Chemical compound CCC(C)(N)C(O)=O GCHPUFAZSONQIV-UHFFFAOYSA-N 0.000 description 1
- HXEACLLIILLPRG-RXMQYKEDSA-N l-pipecolic acid Natural products OC(=O)[C@H]1CCCCN1 HXEACLLIILLPRG-RXMQYKEDSA-N 0.000 description 1
- 108010091871 leucylmethionine Proteins 0.000 description 1
- 108010012058 leucyltyrosine Proteins 0.000 description 1
- 238000002898 library design Methods 0.000 description 1
- 210000004185 liver Anatomy 0.000 description 1
- 208000019423 liver disease Diseases 0.000 description 1
- 108010064235 lysylglycine Proteins 0.000 description 1
- 238000010801 machine learning Methods 0.000 description 1
- 238000012423 maintenance Methods 0.000 description 1
- 238000002483 medication Methods 0.000 description 1
- 229930182817 methionine Natural products 0.000 description 1
- 108010068488 methionylphenylalanine Proteins 0.000 description 1
- 238000000520 microinjection Methods 0.000 description 1
- 239000002480 mineral oil Substances 0.000 description 1
- 235000010446 mineral oil Nutrition 0.000 description 1
- 238000012544 monitoring process Methods 0.000 description 1
- 210000003097 mucus Anatomy 0.000 description 1
- 201000006417 multiple sclerosis Diseases 0.000 description 1
- 238000002887 multiple sequence alignment Methods 0.000 description 1
- 208000031225 myocardial ischemia Diseases 0.000 description 1
- 239000007923 nasal drop Substances 0.000 description 1
- 229940100662 nasal drops Drugs 0.000 description 1
- 210000002850 nasal mucosa Anatomy 0.000 description 1
- 239000006199 nebulizer Substances 0.000 description 1
- 210000005036 nerve Anatomy 0.000 description 1
- 210000004412 neuroendocrine cell Anatomy 0.000 description 1
- 239000002736 nonionic surfactant Substances 0.000 description 1
- 229940066429 octoxynol Drugs 0.000 description 1
- 229920002113 octoxynol Polymers 0.000 description 1
- 210000000287 oocyte Anatomy 0.000 description 1
- 230000002669 organ and tissue protective effect Effects 0.000 description 1
- 239000003960 organic solvent Substances 0.000 description 1
- 238000003909 pattern recognition Methods 0.000 description 1
- 229940049954 penicillin Drugs 0.000 description 1
- 239000000137 peptide hydrolase inhibitor Substances 0.000 description 1
- 230000002085 persistent effect Effects 0.000 description 1
- 239000008249 pharmaceutical aerosol Substances 0.000 description 1
- 208000026435 phlegm Diseases 0.000 description 1
- 239000002953 phosphate buffered saline Substances 0.000 description 1
- 150000008300 phosphoramidites Chemical class 0.000 description 1
- 238000005222 photoaffinity labeling Methods 0.000 description 1
- 210000004043 pneumocyte Anatomy 0.000 description 1
- 230000008488 polyadenylation Effects 0.000 description 1
- 235000010482 polyoxyethylene sorbitan monooleate Nutrition 0.000 description 1
- 229940099429 polyoxyl 40 stearate Drugs 0.000 description 1
- 229920000053 polysorbate 80 Polymers 0.000 description 1
- 230000001124 posttranscriptional effect Effects 0.000 description 1
- 239000003755 preservative agent Substances 0.000 description 1
- 230000002335 preservative effect Effects 0.000 description 1
- 125000002924 primary amino group Chemical group [H]N([H])* 0.000 description 1
- 230000008569 process Effects 0.000 description 1
- 238000004393 prognosis Methods 0.000 description 1
- 230000000750 progressive effect Effects 0.000 description 1
- 210000001236 prokaryotic cell Anatomy 0.000 description 1
- 125000001500 prolyl group Chemical group [H]N1C([H])(C(=O)[*])C([H])([H])C([H])([H])C1([H])[H] 0.000 description 1
- 108010077112 prolyl-proline Proteins 0.000 description 1
- 108010070643 prolylglutamic acid Proteins 0.000 description 1
- 108010029020 prolylglycine Proteins 0.000 description 1
- 230000000069 prophylactic effect Effects 0.000 description 1
- 235000004252 protein component Nutrition 0.000 description 1
- 108020003175 receptors Proteins 0.000 description 1
- 230000000306 recurrent effect Effects 0.000 description 1
- 238000000611 regression analysis Methods 0.000 description 1
- 201000004193 respiratory failure Diseases 0.000 description 1
- 108091008146 restriction endonucleases Proteins 0.000 description 1
- 108700004030 rev Genes Proteins 0.000 description 1
- 101150098213 rev gene Proteins 0.000 description 1
- 206010039073 rheumatoid arthritis Diseases 0.000 description 1
- 229920002477 rna polymer Polymers 0.000 description 1
- 230000035945 sensitivity Effects 0.000 description 1
- 239000004017 serum-free culture medium Substances 0.000 description 1
- 108010048397 seryl-lysyl-leucine Proteins 0.000 description 1
- 230000001568 sexual effect Effects 0.000 description 1
- SQVRNKJHWKZAKO-OQPLDHBCSA-N sialic acid Chemical compound CC(=O)N[C@@H]1[C@@H](O)C[C@@](O)(C(O)=O)OC1[C@H](O)[C@H](O)CO SQVRNKJHWKZAKO-OQPLDHBCSA-N 0.000 description 1
- 239000002924 silencing RNA Substances 0.000 description 1
- 239000004055 small Interfering RNA Substances 0.000 description 1
- 229940126586 small molecule drug Drugs 0.000 description 1
- 235000010356 sorbitol Nutrition 0.000 description 1
- 239000003381 stabilizer Substances 0.000 description 1
- 238000007619 statistical method Methods 0.000 description 1
- 238000003860 storage Methods 0.000 description 1
- 229960005322 streptomycin Drugs 0.000 description 1
- 210000004878 submucosal gland Anatomy 0.000 description 1
- 239000000126 substance Substances 0.000 description 1
- 239000003765 sweetening agent Substances 0.000 description 1
- 101150065190 term gene Proteins 0.000 description 1
- NPDBDJFLKKQMCM-UHFFFAOYSA-N tert-butylglycine Chemical compound CC(C)(C)C(N)C(O)=O NPDBDJFLKKQMCM-UHFFFAOYSA-N 0.000 description 1
- RTKIYNMVFMVABJ-UHFFFAOYSA-L thimerosal Chemical compound [Na+].CC[Hg]SC1=CC=CC=C1C([O-])=O RTKIYNMVFMVABJ-UHFFFAOYSA-L 0.000 description 1
- 229960004906 thiomersal Drugs 0.000 description 1
- 238000013518 transcription Methods 0.000 description 1
- 230000035897 transcription Effects 0.000 description 1
- 230000005945 translocation Effects 0.000 description 1
- 150000005691 triesters Chemical class 0.000 description 1
- 108010038745 tryptophylglycine Proteins 0.000 description 1
- 208000001072 type 2 diabetes mellitus Diseases 0.000 description 1
- 235000015112 vegetable and seed oil Nutrition 0.000 description 1
- 239000008158 vegetable oil Substances 0.000 description 1
- 230000009385 viral infection Effects 0.000 description 1
- 208000012137 von Willebrand disease (hereditary or acquired) Diseases 0.000 description 1
- XLYOFNOQVPJJNP-UHFFFAOYSA-N water Substances O XLYOFNOQVPJJNP-UHFFFAOYSA-N 0.000 description 1
Classifications
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/09—Recombinant DNA-technology
- C12N15/63—Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
- C12N15/79—Vectors or expression systems specially adapted for eukaryotic hosts
- C12N15/85—Vectors or expression systems specially adapted for eukaryotic hosts for animal cells
- C12N15/86—Viral vectors
-
- A—HUMAN NECESSITIES
- A61—MEDICAL OR VETERINARY SCIENCE; HYGIENE
- A61K—PREPARATIONS FOR MEDICAL, DENTAL OR TOILETRY PURPOSES
- A61K35/00—Medicinal preparations containing materials or reaction products thereof with undetermined constitution
- A61K35/66—Microorganisms or materials therefrom
- A61K35/76—Viruses; Subviral particles; Bacteriophages
-
- A—HUMAN NECESSITIES
- A61—MEDICAL OR VETERINARY SCIENCE; HYGIENE
- A61K—PREPARATIONS FOR MEDICAL, DENTAL OR TOILETRY PURPOSES
- A61K38/00—Medicinal preparations containing peptides
- A61K38/16—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof
- A61K38/162—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from virus
-
- A—HUMAN NECESSITIES
- A61—MEDICAL OR VETERINARY SCIENCE; HYGIENE
- A61K—PREPARATIONS FOR MEDICAL, DENTAL OR TOILETRY PURPOSES
- A61K48/00—Medicinal preparations containing genetic material which is inserted into cells of the living body to treat genetic diseases; Gene therapy
- A61K48/005—Medicinal preparations containing genetic material which is inserted into cells of the living body to treat genetic diseases; Gene therapy characterised by an aspect of the 'active' part of the composition delivered, i.e. the nucleic acid delivered
-
- A—HUMAN NECESSITIES
- A61—MEDICAL OR VETERINARY SCIENCE; HYGIENE
- A61K—PREPARATIONS FOR MEDICAL, DENTAL OR TOILETRY PURPOSES
- A61K48/00—Medicinal preparations containing genetic material which is inserted into cells of the living body to treat genetic diseases; Gene therapy
- A61K48/0075—Medicinal preparations containing genetic material which is inserted into cells of the living body to treat genetic diseases; Gene therapy characterised by an aspect of the delivery route, e.g. oral, subcutaneous
-
- A—HUMAN NECESSITIES
- A61—MEDICAL OR VETERINARY SCIENCE; HYGIENE
- A61P—SPECIFIC THERAPEUTIC ACTIVITY OF CHEMICAL COMPOUNDS OR MEDICINAL PREPARATIONS
- A61P11/00—Drugs for disorders of the respiratory system
-
- C—CHEMISTRY; METALLURGY
- C07—ORGANIC CHEMISTRY
- C07K—PEPTIDES
- C07K14/00—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof
- C07K14/005—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from viruses
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/09—Recombinant DNA-technology
- C12N15/11—DNA or RNA fragments; Modified forms thereof; Non-coding nucleic acids having a biological activity
- C12N15/62—DNA sequences coding for fusion proteins
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/09—Recombinant DNA-technology
- C12N15/63—Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
- C12N15/79—Vectors or expression systems specially adapted for eukaryotic hosts
- C12N15/85—Vectors or expression systems specially adapted for eukaryotic hosts for animal cells
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N5/00—Undifferentiated human, animal or plant cells, e.g. cell lines; Tissues; Cultivation or maintenance thereof; Culture media therefor
- C12N5/06—Animal cells or tissues; Human cells or tissues
- C12N5/0602—Vertebrate cells
- C12N5/0684—Cells of the urinary tract or kidneys
- C12N5/0687—Renal stem cells; Renal progenitors
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N2740/00—Reverse transcribing RNA viruses
- C12N2740/00011—Details
- C12N2740/10011—Retroviridae
- C12N2740/15011—Lentivirus, not HIV, e.g. FIV, SIV
- C12N2740/15022—New viral proteins or individual genes, new structural or functional aspects of known viral proteins or genes
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N2740/00—Reverse transcribing RNA viruses
- C12N2740/00011—Details
- C12N2740/10011—Retroviridae
- C12N2740/15011—Lentivirus, not HIV, e.g. FIV, SIV
- C12N2740/15041—Use of virus, viral particle or viral elements as a vector
- C12N2740/15043—Use of virus, viral particle or viral elements as a vector viral genome or elements thereof as genetic vector
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N2740/00—Reverse transcribing RNA viruses
- C12N2740/00011—Details
- C12N2740/10011—Retroviridae
- C12N2740/15011—Lentivirus, not HIV, e.g. FIV, SIV
- C12N2740/15051—Methods of production or purification of viral material
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N2760/00—ssRNA viruses negative-sense
- C12N2760/00011—Details
- C12N2760/18011—Paramyxoviridae
- C12N2760/18811—Sendai virus
- C12N2760/18822—New viral proteins or individual genes, new structural or functional aspects of known viral proteins or genes
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N2760/00—ssRNA viruses negative-sense
- C12N2760/00011—Details
- C12N2760/18011—Paramyxoviridae
- C12N2760/18811—Sendai virus
- C12N2760/18841—Use of virus, viral particle or viral elements as a vector
- C12N2760/18843—Use of virus, viral particle or viral elements as a vector viral genome or elements thereof as genetic vector
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N2800/00—Nucleic acids vectors
- C12N2800/10—Plasmid DNA
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N2800/00—Nucleic acids vectors
- C12N2800/10—Plasmid DNA
- C12N2800/106—Plasmid DNA for vertebrates
- C12N2800/107—Plasmid DNA for vertebrates for mammalian
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N2800/00—Nucleic acids vectors
- C12N2800/22—Vectors comprising a coding region that has been codon optimised for expression in a respective host
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N2810/00—Vectors comprising a targeting moiety
- C12N2810/50—Vectors comprising as targeting moiety peptide derived from defined protein
- C12N2810/60—Vectors comprising as targeting moiety peptide derived from defined protein from viruses
- C12N2810/6072—Vectors comprising as targeting moiety peptide derived from defined protein from viruses negative strand RNA viruses
Landscapes
- Health & Medical Sciences (AREA)
- Life Sciences & Earth Sciences (AREA)
- Genetics & Genomics (AREA)
- Engineering & Computer Science (AREA)
- Chemical & Material Sciences (AREA)
- Organic Chemistry (AREA)
- Biomedical Technology (AREA)
- Biotechnology (AREA)
- Bioinformatics & Cheminformatics (AREA)
- General Health & Medical Sciences (AREA)
- Wood Science & Technology (AREA)
- Zoology (AREA)
- General Engineering & Computer Science (AREA)
- Molecular Biology (AREA)
- Biochemistry (AREA)
- Microbiology (AREA)
- Medicinal Chemistry (AREA)
- Biophysics (AREA)
- Virology (AREA)
- Pharmacology & Pharmacy (AREA)
- Animal Behavior & Ethology (AREA)
- Public Health (AREA)
- Veterinary Medicine (AREA)
- Epidemiology (AREA)
- Plant Pathology (AREA)
- Physics & Mathematics (AREA)
- Urology & Nephrology (AREA)
- Proteomics, Peptides & Aminoacids (AREA)
- Gastroenterology & Hepatology (AREA)
- Cell Biology (AREA)
- Mycology (AREA)
- Immunology (AREA)
- Pulmonology (AREA)
- Chemical Kinetics & Catalysis (AREA)
- General Chemical & Material Sciences (AREA)
- Nuclear Medicine, Radiotherapy & Molecular Imaging (AREA)
- Medicines Containing Material From Animals Or Micro-Organisms (AREA)
- Medicines That Contain Protein Lipid Enzymes And Other Medicines (AREA)
- Micro-Organisms Or Cultivation Processes Thereof (AREA)
- Preparation Of Compounds By Using Micro-Organisms (AREA)
Abstract
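The present invention relates to retroviral gene transfer vectors comprising a promoter and a transgene, in particular lentiviral vectors pseudotyped with hemagglutinin-neuraminidase (HN) and fusion (F) proteins from a respiratory paramyxovirus, and to methods of making them. The invention also relates to the use of said vectors in gene therapy, in particular for the treatment of respiratory diseases such as cystic fibrosis (CF).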
Description
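The present invention relates to retroviral gene transfer vectors, in particular lentiviral vectors that comprise a promoter and a transgene and are pseudotyped with hemagglutinin-neuraminidase (HN) and fusion (F) proteins from a respiratory paramyxovirus, and to methods of making them.
Retroviruses are a family of RNA viruses (Retroviridae) that encode the enzyme reverse transcriptase. Lentiviruses are a genus of the family Retroviridae, characterised by a long incubation period. Retroviruses, and lentiviruses in particular, can deliver a significant amount of viral RNA into the DNA of the host cell, and lentiviruses have the ability, unique among retroviruses, to infect non-dividing cells, which makes them one of the most efficient gene delivery vectors.
Pseudotyping is the process of producing viruses or viral vectors in combination with foreign viral envelope proteins. In this way, the foreign viral envelope proteins can be used to alter host tropism or to increase or decrease the stability of the virus particles. For example, pseudotyping allows the properties of the envelope proteins to be specified. A protein frequently used to pseudotype retroviral and lentiviral vectors is glycoprotein G of vesicular stomatitis virus (VSV), VSV-G for short.
Lentiviral vectors, especially those derived from HIV-1, are widely studied and frequently used. The evolution of the lentiviral vector backbone and the ability of the virus to deliver recombinant DNA molecules (transgenes) into target cells have led to their use in many applications. Two possible applications of viral vectors are the restoration of functional genes in gene therapy and in vitro recombinant protein production.
When designing retroviral/lentiviral vectors suitable for use as gene transfer vectors, one key factor is making the vector as safe as possible for the patient. A second key factor is the need to produce vector in sufficient quantity, not only to treat an individual patient but also to allow wider clinical access to the treatment for all patients who could benefit from it. These two factors can conflict, because modifications that improve vector safety are often associated with reduced yield during vector production.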
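One example of a clinical setting that would benefit from gene transfer to the airway epithelium is the treatment of cystic fibrosis (CF). CF is a fatal genetic disorder caused by mutations in the CF transmembrane conductance regulator (CFTR) gene, which acts as a chloride channel in airway epithelial cells. CF is characterised by recurrent chest infections, increased airway secretions and, ultimately, respiratory failure. In the UK, the current average age at death is about 25 years. For most genotypes there is no treatment that targets the underlying defect; current treatments for symptomatic relief require hours of self-administered therapy every day. Unlike small-molecule drugs, gene therapy is independent of CFTR mutation class and is therefore applicable to all affected CF individuals. To date, however, no viral vector has been approved for clinical use in the treatment of CF, and the same is true of other diseases, in particular many other respiratory diseases.
In addition to patient safety and yield, there are other difficulties commonly associated with gene transfer to the airway epithelium.
The efficiency of gene transfer to the airway epithelium is typically poor, because the respective receptors for many viral vectors appear to be predominantly localised to the basolateral surface of the airway epithelium. As such, prior to the work of the present inventors, the use of lentiviral pseudotypes that require disruption of epithelial integrity in order to transduce the airways, through the use of detergents such as lysophosphatidylcholine or ethylene glycol bis(2-aminoethyl ether)-N,N,N',N'-tetraacetic acid, has been associated with an increased risk of sepsis. In addition, existing gene transfer vectors have difficulty penetrating the respiratory mucus layer, which further reduces gene transfer efficiency. The ability to administer existing viral vectors repeatedly, which is essential for life-long treatment of a self-renewing epithelium, is limited because the patient's adaptive immune response interferes with successful repeat administration.
Administration of the vector for clinical application is another pertinent factor. For a therapeutic effect, viral stability must be maintained through the use of clinically relevant devices (e.g. bronchoscopes and nebulisers).
There is therefore a need for gene therapy vectors that avoid one or more of the problems described above. In particular, it is an object of the present invention to provide a method of producing pseudotyped retroviral or lentiviral (e.g. SIV) vectors, and means for carrying out that method, wherein the resulting vectors are safe, are suited to improved gene transfer efficiency across the airway epithelium, and are produced at a clinically relevant scale.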
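Summary of the Invention
The present inventors previously developed a lentiviral vector pseudotyped with hemagglutinin-neuraminidase (HN) and fusion (F) proteins from a respiratory paramyxovirus, which comprises a promoter and a transgene. Typically, the backbone of the vector is derived from a simian immunodeficiency virus (SIV), such as SIV1 or African green monkey SIV (SIV-AGM). Preferably, the backbone of the viral vector of the invention is derived from SIV-AGM. The HN and F proteins function, respectively, to attach to sialic acid and to mediate cell fusion for vector entry into target cells. The inventors found that this specifically F/HN-pseudotyped lentiviral vector can efficiently transduce the airway epithelium, resulting in transgene expression that is sustained for periods exceeding the proposed lifespan of airway epithelial cells. Importantly, the inventors also found that re-administration does not result in a loss of efficacy. These features make the vectors of the invention attractive candidates for treating disease by expressing therapeutic proteins that are (i) retained within the cells of the airway; (ii) secreted into the lumen of the airway; and (iii) secreted into the circulatory system.
However, there was a potential safety concern with this lentiviral vector. In particular, there was a significant degree of sequence homology between the genome vector and the GagPol vector used in its production. This sequence homology gives rise to a theoretical risk that replication-competent lentivirus (RCL) could be generated during manufacture, or during clinical use after administration to a patient. This represents a safety risk to patients. The risk of generating replication-competent viral particles is also a concern for other retroviral/lentiviral vectors.
Although it is desirable to mitigate this risk, doing so is not straightforward, or at least not without introducing other unacceptable drawbacks. In particular, it is established in the art that modifications aimed at reducing the risk of RCL, such as codon-optimisation of the gag-pol genes used in manufacture, typically have a negative impact on vector titre or yield. Given the high titre of vector required to treat a single patient, such a reduction in yield is likely to make production commercially unviable.
The present inventors have now demonstrated for the first time that the use of codon-optimised gag-pol genes from SIV does not negatively affect the manufactured titre of SIV vectors pseudotyped with hemagglutinin-neuraminidase (HN) and fusion (F) proteins from a respiratory paramyxovirus, and can even increase vector titre. This is surprising given that, under normal manufacturing conditions (where the vector genome plasmid, rather than the gag-pol genes, is limiting), codon-optimisation of the gag-pol genes typically reduces vector yield.
Accordingly, the present inventors are the first to provide a method of producing retroviral vectors, in particular lentiviral vectors such as SIV, pseudotyped with hemagglutinin-neuraminidase (HN) and fusion (F) proteins from a respiratory paramyxovirus, in which the risk of RCL is reduced without negatively affecting vector titre, and in some cases while increasing it. The methods of the invention therefore provide safer vectors produced at a commercially desirable yield.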
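Accordingly, the invention provides a method of producing a retroviral vector pseudotyped with hemagglutinin-neuraminidase (HN) and fusion (F) proteins from a respiratory paramyxovirus, which comprises a promoter and a transgene, wherein said method comprises the use of codon-optimised gag-pol genes. Preferably the retroviral vector is a lentiviral vector, optionally wherein the lentiviral vector is selected from the group consisting of a simian immunodeficiency virus (SIV) vector, a human immunodeficiency virus (HIV) vector, a feline immunodeficiency virus (FIV) vector, an equine infectious anaemia virus (EIAV) vector and a Visna/maedi virus vector. Methods of producing SIV vectors are particularly preferred.
The codon-optimised gag-pol genes may be SIV gag-pol genes. The codon-optimised gag-pol genes may comprise or consist of a nucleic acid sequence having at least 80% sequence identity to SEQ ID NO: 1. The codon-optimised gag-pol genes may comprise or consist of the nucleic acid sequence of SEQ ID NO: 1. The codon-optimised gag-pol genes may be comprised in a plasmid which comprises or consists of a nucleic acid sequence having at least 80% sequence identity to SEQ ID NO: 5. The codon-optimised gag-pol genes may be comprised in a plasmid which comprises or consists of the nucleic acid sequence of SEQ ID NO: 5.
The respiratory paramyxovirus may be a Sendai virus.
The titre of retroviral vector produced by a method of the invention may be: (a) equivalent to the titre of retroviral vector produced by a corresponding method that does not use codon-optimised gag-pol genes; or (b) increased compared with the titre of retroviral vector produced by a corresponding method that does not use codon-optimised gag-pol genes. Optionally, the titre of retroviral vector may be at least 1.5-fold, at least 2-fold or at least 2.5-fold greater than the titre of retroviral vector produced by a corresponding method that does not use codon-optimised gag-pol genes.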
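The fold-change comparison described above can be illustrated with a minimal Python sketch. The function names and the example titres are hypothetical and chosen only for illustration; the thresholds 1.5, 2 and 2.5 are the ones recited in the text, and both titres are assumed to be measured in the same units under otherwise corresponding conditions.

```python
def titre_fold_change(titre_codon_optimised: float, titre_reference: float) -> float:
    """Ratio of the titre obtained with codon-optimised gag-pol to the
    titre obtained with the corresponding non-optimised method."""
    if titre_reference <= 0:
        raise ValueError("reference titre must be positive")
    return titre_codon_optimised / titre_reference


def thresholds_met(fold_change: float, thresholds=(1.5, 2.0, 2.5)) -> dict:
    """Report which of the recited 'at least N-fold' thresholds are reached."""
    return {t: fold_change >= t for t in thresholds}


# Hypothetical titres in transducing units per mL (illustrative values only).
fold = titre_fold_change(5e7, 2e7)
print(fold, thresholds_met(fold))
```

A fold change of 1 or thereabouts would correspond to option (a), an equivalent titre; how close to 1 counts as equivalent is not specified here and would have to be decided by the user.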
The promoter may be selected from the group consisting of a cytomegalovirus (CMV) promoter, an elongation factor 1a (EF1a) promoter and a hybrid human CMV enhancer/EF1a (hCEF) promoter. Preferably the vector comprises a hybrid human CMV enhancer/EF1a (hCEF) promoter.
The transgene may be selected from: (a) a secreted therapeutic protein, optionally alpha-1 antitrypsin (A1AT), Factor VIII, surfactant protein B (SFTPB), Factor VII, Factor IX, Factor X, Factor XI, von Willebrand Factor, granulocyte-macrophage colony-stimulating factor (GM-CSF) and a monoclonal antibody against an infectious agent; or (b) CFTR, ABCA3, DNAH5, DNAH11, DNAI1 and DNAI2. Preferably the transgene encodes: (i) CFTR; (ii) A1AT; or (iii) FVIII.
In particularly preferred embodiments, the method produces a retroviral/lentiviral (e.g. SIV) vector wherein: (a) the promoter is a hCEF promoter and the transgene encodes CFTR; (b) the promoter is a hCEF promoter and the transgene encodes A1AT; or (c) the promoter is a hCEF or CMV promoter and the transgene encodes FVIII.
The method of the invention may use one or more plasmids comprising or consisting of the following: (a) a vector genome plasmid, preferably selected from pGM830 and pGM326, or a variant thereof as defined herein; (b) a co-gagpol plasmid, preferably pGM691 or a variant thereof as defined herein; (c) a Rev plasmid, preferably pGM299 or a variant thereof as defined herein; (d) a fusion (F) protein plasmid, preferably pGM301 or a variant thereof as defined herein; and (e) a hemagglutinin-neuraminidase (HN) plasmid, preferably pGM303 or a variant thereof as defined herein. The ratio of vector genome plasmid: co-gagpol plasmid: Rev plasmid: F plasmid: HN plasmid may be 20:9:6:6:6.
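As a worked illustration of the 20:9:6:6:6 ratio, the following Python sketch splits a total amount of plasmid DNA across the five plasmids. It assumes the ratio is applied by mass, which the text does not state (it could equally be a molar ratio), and the total of 47 µg is a hypothetical figure chosen because the ratio parts sum to 47.

```python
# Ratio parts as recited in the text; the dictionary keys are illustrative labels.
PLASMID_RATIO = {
    "vector_genome": 20,
    "co_gagpol": 9,
    "rev": 6,
    "fusion_F": 6,
    "hn": 6,
}


def split_dna_amount(total_amount: float, ratio: dict = PLASMID_RATIO) -> dict:
    """Divide a total DNA amount among plasmids in proportion to `ratio`.

    Assumption: the ratio is by mass; units of the result match `total_amount`.
    """
    total_parts = sum(ratio.values())
    return {name: total_amount * parts / total_parts for name, parts in ratio.items()}


# Hypothetical transfection using 47 micrograms of total plasmid DNA.
for name, amount in split_dna_amount(47.0).items():
    print(f"{name}: {amount:.1f} ug")
```

With this ratio the vector genome plasmid makes up 20/47 of the total, or roughly 43%.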
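Steps (a)-(f) of the method may be carried out sequentially. The cells may be HEK293 cells (e.g. HEK293F or HEK293T cells) or 293T/17 cells. The addition of a nuclease may be a pre-harvest step. The addition of trypsin may be a post-harvest step. The purification step may comprise one or more chromatography steps.
The vector genome plasmid may be modified to reduce the number of retroviral ORFs.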
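To make the idea of counting open reading frames (ORFs) concrete, here is a minimal Python sketch of a forward-strand ORF scan. It is not the method used by the inventors, which is not described here; it simply looks for ATG-to-stop stretches in the three forward reading frames and reports those at or above a chosen length. The function name, the `min_aa` parameter and the toy sequence are all hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_forward_orfs(dna: str, min_aa: int = 100) -> list:
    """Return (start, end, length_aa) tuples for ATG..stop ORFs on the
    forward strand. `start` is the 0-based index of the ATG, `end` is the
    index just past the stop codon, and `length_aa` excludes the stop."""
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first in-frame start codon opens an ORF
            elif start is not None and codon in STOP_CODONS:
                length_aa = (i - start) // 3
                if length_aa >= min_aa:
                    orfs.append((start, i + 3, length_aa))
                start = None  # look for the next ORF in this frame
    return orfs


# Toy sequence: one 5-codon ORF (ATG + 4 x GCT) followed by a stop codon.
toy = "CC" + "ATG" + "GCT" * 4 + "TAA" + "GG"
print(find_forward_orfs(toy, min_aa=5))
```

A fuller analysis would also scan the reverse complement and might treat alternative start codons differently; those choices are left out to keep the sketch short.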
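The invention also provides a nucleic acid comprising codon-optimised gag-pol genes, wherein said nucleic acid has at least 80% sequence identity to SEQ ID NO: 1. Preferably the nucleic acid comprises or consists of the nucleic acid sequence of SEQ ID NO: 1.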
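The "at least 80% sequence identity" criterion can be sketched in Python as follows. This is only an illustration under stated assumptions: the two sequences are taken to be already aligned to equal length (gaps written as "-"), and identity is computed as identical non-gap positions divided by alignment length. Other conventions for the denominator exist, and the text here does not say which one applies. The function name and the toy sequences are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity of two pre-aligned sequences of equal length.

    Assumption: identity = identical non-gap columns / alignment length.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_a:
        raise ValueError("sequences must not be empty")
    matches = sum(
        1
        for x, y in zip(aligned_a.upper(), aligned_b.upper())
        if x == y and x != "-"
    )
    return 100.0 * matches / len(aligned_a)


# Toy alignment: 8 identical columns out of 10.
identity = percent_identity("ATGGCC-TAA", "ATGGCTATAA")
print(identity, identity >= 80.0)
```

In practice the alignment itself would come from a dedicated alignment tool, and the result depends on that tool's scoring parameters.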
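The invention further provides a plasmid comprising a nucleic acid of the invention, optionally wherein: (a) the plasmid comprises or consists of a nucleic acid sequence having at least 80% sequence identity to SEQ ID NO: 5; or (b) the plasmid comprises or consists of the nucleic acid sequence of SEQ ID NO: 5. Optionally, within the plasmid the nucleic acid is operably linked to a promoter that drives expression of the Gag and Pol proteins, preferably a CAG promoter.
The invention also provides a host cell comprising a nucleic acid of the invention and/or a plasmid of the invention.
The invention further provides a retroviral vector pseudotyped with hemagglutinin-neuraminidase (HN) and fusion (F) proteins from a respiratory paramyxovirus, which is obtainable by a method of the invention.
The invention also provides a method of treating a disease, comprising administering to a subject in need thereof a retroviral vector pseudotyped with hemagglutinin-neuraminidase (HN) and fusion (F) proteins from a respiratory paramyxovirus, which is obtainable by a method of the invention. The disease to be treated may be a lung disease, preferably cystic fibrosis.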
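Figure 1: shows an alignment of the wild-type (non-codon-optimised) gag-pol genes from pGM297 with exemplary codon-optimised gag-pol genes of the invention from pGM691, showing the changes relative to the wild-type sequence.
Figure 2: a-f show schematic drawings of exemplary plasmids used in the production of vectors of the invention. g shows a non-codon-optimised gag-pol plasmid (pDNA2a, specifically pGM297) that can be codon-optimised according to the invention.
Figure 3: shows a schematic drawing of an exemplary pDNA1 plasmid used in the production of an A1AT vector of the invention.
Figure 4: a-d show schematic drawings of exemplary pDNA1 plasmids used in the production of FVIII vectors of the invention.
Figure 5: a illustrates the homology between the pDNA1 plasmid pGM326 and the non-codon-optimised pDNA2a plasmid pGM297. b compares the non-codon-optimised pDNA2a plasmid pGM297 with the codon-optimised pDNA2a plasmid pGM691 of the invention, with the differences between the two annotated. c is a DNA matrix homology plot showing the homology between the DNA sequences present in pGM297 (horizontal axis) and pGM691 (vertical axis). Solid diagonal lines indicate sequence homology, and dashed lines highlight regions of reduced sequence identity; note the reduced sequence identity in the codon-optimised gag and pol gene regions of pGM691. Note also the additional sequence present in pGM297 (located at approximately bases 6000 to 7000 on the horizontal-axis numbering), which is the RRE region present in pGM297 but absent from pGM691. d is a ClustalW DNA sequence alignment of the gag pol regions of pGM297 (bottom DNA sequence) and pGM691 (top DNA sequence); sequence homology is indicated by boxed, shaded regions, and the consensus DNA sequence is shown beneath the pGM691 and pGM297 sequences. Note the complete DNA homology between the pGM297 and pGM691 sequences in (i) the gag pol slip region, the overlapping portion of the gag and pol genes, and (ii) the rabbit beta-globin polyadenylation sequence (RBG pA). Note also that pGM297 contains the SIV RRE sequence whereas pGM691 does not. e shows a restriction map of the codon-optimised gag-pol genes within the pGM693 plasmid.
Figure 6: a shows that, under design of experiments (DOE) conditions, use of the codon-optimised pDNA2a plasmid pGM691 resulted in an observable increase in the titre of rSIV.F/HN hCEF-CFTR vector. b shows that the increase in rSIV.F/HN hCEF-CFTR vector titre obtained using the codon-optimised pDNA2a plasmid pGM691 was seen across two different sets of experimental conditions.
Figure 7: shows that the titre of rSIV.F/HN CMV-EGFP vector obtained using the codon-optimised pDNA2a plasmid pGM691 is greater than that obtained using the non-codon-optimised gagpol in the pDNA2a plasmid pGM297. This suggests that the advantageous properties of codon-optimised gagpol in F/HN-pseudotyped vectors are not restricted to rSIV.F/HN hCEF-CFTR, but are a general property of using codon-optimised gagpol in F/HN-pseudotyped vectors.
Figure 8: shows a linear plasmid map of the Partial Gag RRE cPPT hCEF region of the pGM326 vector genome plasmid.
Figure 9: shows an annotated schematic of the pGM326 vector genome plasmid with the SIV ORFs identified. In particular, two large ORFs, one of 189 amino acids (aa) and one of 250 aa, were identified upstream of the hCEF promoter and the soCFTR2 transgene.
Figure 10: shows that, under otherwise identical conditions (including non-coGagPol), the pGM326 vector genome plasmid and the modified pGM830 vector genome plasmid produce similar vector titres in both HEK293T cells (left panel) and A549 cells (right panel).
Figure 11: shows the vector titres produced under otherwise identical conditions using coGagPol with pGM326 or pGM830; there is an observable trend towards increased vector titre when coGagPol is combined with pGM830.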
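Detailed Description of the Invention
Definitions
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this disclosure belongs. Singleton, et al., DICTIONARY OF MICROBIOLOGY AND MOLECULAR BIOLOGY, 2D ED., John Wiley and Sons, New York (1994), and Hale & Marham, THE HARPER COLLINS DICTIONARY OF BIOLOGY, Harper Perennial, NY (1991) provide the skilled person with a general dictionary of many of the terms used in this disclosure. The meaning and scope of the terms should be clear; however, in the event of any latent ambiguity, the definitions provided herein take precedence over any dictionary or extrinsic definition. It is to be understood that this invention is not limited to the particular methodology, protocols, reagents and the like described herein, and as such may vary.
This disclosure is not limited by the exemplary methods and materials disclosed herein, and any methods and materials similar or equivalent to those described herein can be used in the practice or testing of embodiments of this disclosure. The terminology used herein is for the purpose of describing particular embodiments only and is not intended to limit the scope of the present invention, which is defined solely by the claims.
The description of embodiments of the disclosure is not intended to be exhaustive or to limit the invention to the precise form disclosed. While specific embodiments of, and examples for, the disclosure are described herein for illustrative purposes, various equivalent modifications are possible within the scope of the disclosure, as those skilled in the relevant art will recognise. For example, while method steps or functions are presented in a given order, alternative embodiments may perform functions in a different order, or functions may be performed substantially concurrently. The teachings of the disclosure provided herein can be applied to other procedures or methods as appropriate. The various embodiments described herein can be combined to provide further embodiments. Aspects of the disclosure can be modified, if necessary, to employ the compositions, functions and concepts of the above references and application to provide yet further embodiments of the disclosure. Moreover, due to biological functional equivalency considerations, some changes can be made in protein structure without affecting the biological or chemical action in kind or amount. These and other changes can be made to the disclosure in light of the detailed description. All such modifications are intended to be included within the scope of the appended claims.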
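Unless otherwise indicated, any nucleic acid sequences are written left to right in 5' to 3' orientation; amino acid sequences are written left to right in amino to carboxy orientation, respectively.
The headings provided herein are not limitations of the various aspects or embodiments of this disclosure.
As used herein, the term "capable of", when used with a verb, encompasses or means the action of the corresponding verb. For example, "capable of interacting" also means interacting, "capable of cleaving" also means cleaving, "capable of binding" also means binding, and "capable of specifically targeting..." also means specifically targeting.
Other definitions of terms may appear throughout the specification. Before the exemplary embodiments are described in more detail, it is to be understood that this disclosure is not limited to the particular embodiments described, and as such may vary. It is also to be understood that the terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting, since the scope of the present invention will be defined only by the appended claims.
Numeric ranges are inclusive of the numbers defining the range. Where a range of values is provided, it is understood that each intervening value, to the tenth of the unit of the lower limit unless the context clearly dictates otherwise, between the upper and lower limits of that range is also specifically disclosed. Each smaller range between any stated value or intervening value in a stated range and any other stated or intervening value in that stated range is encompassed within this disclosure. The upper and lower limits of these smaller ranges may independently be included in or excluded from the range, and each range in which either, neither or both limits are included in the smaller ranges is also encompassed within this disclosure, subject to any specifically excluded limit in the stated range. Where the stated range includes one or both of the limits, ranges excluding either or both of those included limits are also included in this disclosure.
As used herein, the articles "a" and "an" may refer to one or to more than one (e.g. to at least one) of the grammatical object of the article. Further, unless otherwise required by context, singular terms shall include pluralities and plural terms shall include the singular. In this application, the use of "or" means "and/or" unless stated otherwise. Furthermore, the use of the term "including", as well as other forms such as "includes" and "included", is not limiting.
"About" may generally mean an acceptable degree of error for the quantity measured, given the nature or precision of the measurements. Exemplary degrees of error are within 20 percent (%), typically within 10%, and more typically within 5% of a given value or range of values. Preferably, the term "about" shall be understood herein as plus or minus (±) 5%, preferably ±4%, ±3%, ±2%, ±1%, ±0.5% or ±0.1% of the numerical value of the number with which it is being used.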
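The term "consisting of" refers to compositions, methods and respective components thereof as described herein, which are exclusive of any element not recited in that description of the invention.
As used herein, the term "consisting essentially of" refers to those elements required for a given invention. The term permits the presence of elements that do not materially affect the basic and novel or functional characteristic(s) of that invention (i.e. inactive or non-immunogenic ingredients).
Embodiments described herein as "comprising" one or more features may also be considered as disclosure of the corresponding embodiments "consisting of" and/or "consisting essentially of" such features.
Concentrations, amounts, volumes, percentages and other numerical values may be presented herein in a range format. It is to be understood that such range format is used merely for convenience and brevity, and should be interpreted flexibly to include not only the numerical values explicitly recited as the limits of the range, but also all the individual numerical values or sub-ranges encompassed within that range, as if each numerical value and sub-range were explicitly recited.
As used herein, the terms "vector", "retroviral vector" and "retroviral F/HN vector" are used interchangeably to mean a retroviral vector pseudotyped with hemagglutinin-neuraminidase (HN) and fusion (F) proteins from a respiratory paramyxovirus, unless otherwise stated. The terms "lentiviral vector" and "lentiviral F/HN vector" are used interchangeably to mean a lentiviral vector pseudotyped with hemagglutinin-neuraminidase (HN) and fusion (F) proteins from a respiratory paramyxovirus, unless otherwise stated. All disclosure herein in relation to retroviral vectors of the invention applies equally and without reservation to lentiviral vectors of the invention and to SIV vectors pseudotyped with hemagglutinin-neuraminidase (HN) and fusion (F) proteins from a respiratory paramyxovirus (also referred to herein as SIV F/HN or SIV-FHN).
As used herein, the terms "titre" and "yield" are used interchangeably to mean the amount of lentiviral (e.g. SIV) vector produced by a method of the invention. Titre is the primary benchmark characterising manufacturing efficiency, with higher titres generally indicating that more retroviral/lentiviral (e.g. SIV) vector is manufactured (e.g. using the same amount of reagents). Titre or yield may relate to the number of vector genomes that have integrated into the genome of a target cell (integration titre), which is a measure of "active" virus particles, i.e. the number of particles capable of transducing a cell. Transducing units (also referred to as TU/mL or TTU/mL) are a biological readout of the number of host cells that are transduced under certain tissue culture/virus dilution conditions, and are a measure of the number of "active" virus particles. The total number of (active + inactive) virus particles may also be determined using any appropriate means, such as by measuring either how much Gag is present in the test solution or how many copies of viral RNA are in the test solution. Assumptions are then made that a lentiviral particle contains either 2000 Gag molecules or 2 viral RNA molecules. Once the total particle number and the transducing titre/TU have been measured, a particle:infectivity ratio is calculated.
Amino acids are referred to herein using the name of the amino acid, the three-letter abbreviation or the single-letter abbreviation.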
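The particle:infectivity calculation described above can be written out as a short Python sketch. The two per-particle figures (2000 Gag molecules, 2 viral RNA molecules) are the assumptions stated in the text; the function names and the example measurements are hypothetical, and all quantities are assumed to be expressed per mL of the same test solution.

```python
GAG_MOLECULES_PER_PARTICLE = 2000  # assumption stated in the text
RNA_COPIES_PER_PARTICLE = 2        # assumption stated in the text


def particles_from_gag(gag_molecules_per_ml: float) -> float:
    """Estimate total particles per mL from a Gag molecule count."""
    return gag_molecules_per_ml / GAG_MOLECULES_PER_PARTICLE


def particles_from_rna(rna_copies_per_ml: float) -> float:
    """Estimate total particles per mL from a viral RNA copy number."""
    return rna_copies_per_ml / RNA_COPIES_PER_PARTICLE


def particle_to_infectivity_ratio(particles_per_ml: float, tu_per_ml: float) -> float:
    """Total (active + inactive) particles per transducing unit."""
    if tu_per_ml <= 0:
        raise ValueError("transducing titre must be positive")
    return particles_per_ml / tu_per_ml


# Hypothetical measurements (illustrative values only).
total_particles = particles_from_rna(2e9)  # 2e9 RNA copies/mL
ratio = particle_to_infectivity_ratio(total_particles, 1e7)  # 1e7 TU/mL
print(total_particles, ratio)
```

A lower ratio means a larger fraction of the particles in the preparation are able to transduce cells.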
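As used herein, the terms "protein" and "polypeptide" are used interchangeably to designate a series of amino acid residues connected to each other by peptide bonds between the alpha-amino and carboxyl groups of adjacent residues. The terms "protein" and "polypeptide" refer to a polymer of amino acids, including modified amino acids (e.g. phosphorylated, glycated, glycosylated, etc.) and amino acid analogues, regardless of size or function. "Protein" and "polypeptide" are often used in reference to relatively large polypeptides, whereas the term "peptide" is often used in reference to small polypeptides, but usage of these terms in the art overlaps. The terms "protein" and "polypeptide" are used interchangeably herein when referring to a gene product and fragments thereof. Thus, exemplary polypeptides or proteins include gene products, naturally occurring proteins, homologues, orthologues, paralogues, fragments and other equivalents, variants and analogues of the foregoing.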
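As used herein, the terms "polynucleotide", "nucleic acid" and "nucleic acid sequence" refer to any molecule, preferably a polymeric molecule, incorporating units of ribonucleic acid, deoxyribonucleic acid or an analogue thereof. The nucleic acid can be single-stranded or double-stranded. A single-stranded nucleic acid can be one strand of a denatured double-stranded DNA. Alternatively, it can be a single-stranded nucleic acid not derived from any double-stranded DNA. In one aspect, the nucleic acid can be DNA. In another aspect, the nucleic acid can be RNA. Suitable nucleic acid molecules are DNA, including genomic DNA or cDNA. Other suitable nucleic acid molecules are RNA, including siRNA, shRNA and antisense oligonucleotides. The terms "transgene" and "gene" are also used interchangeably, and both terms encompass fragments or variants that encode the target protein.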
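The transgenes of the invention include nucleic acid sequences that have been removed from their naturally occurring environment, recombinant or cloned DNA isolates, and chemically synthesised analogues or analogues biologically synthesised by heterologous systems.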
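Minor variations in the amino acid sequence(s) of the invention are contemplated as being encompassed by the invention, provided that the variations maintain at least 60%, at least 70%, more preferably at least 80%, at least 85%, at least 90%, at least 95%, and most preferably at least 97% or at least 99% sequence identity to the amino acid sequence of the invention or a fragment thereof as defined herein. The term homology is used herein to mean identity. As such, a variant or analogue of an amino acid sequence of the invention may differ on the basis of substitution (typically conservative substitution), deletion or insertion. Proteins comprising such variations are referred to herein as variants.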
Proteins of the invention may include variants in which amino acid residues from one species are substituted for the corresponding residues in another species, at either conserved or non-conserved positions. Variants of the protein molecules described herein may be produced and used in the invention. Following the lead of computational chemistry in applying multivariate data analysis techniques to structure/property-activity relationships [see, for example, Wold, et al. Multivariate data analysis in chemistry. Chemometrics-Mathematics and Statistics in Chemistry (Ed.: B. Kowalski); D. Reidel Publishing Company, Dordrecht, Holland, 1984 (ISBN 90-277-1846-6)], quantitative activity-property relationships of proteins can be derived using well-known mathematical techniques such as statistical regression, pattern recognition and classification [see, for example, Norman et al. Applied Regression Analysis. Wiley-Interscience; 3rd edition (April 1998) ISBN: 0471170828; Kandel, Abraham et al. Computer-Assisted Reasoning in Cluster Analysis. Prentice Hall PTR, (May 11, 1995), ISBN: 0133418847; Krzanowski, Wojtek. Principles of Multivariate Analysis: A User's Perspective (Oxford Statistical Science Series, No 22 (Paper)). Oxford University Press; (December 2000), ISBN: 0198507089; Witten, Ian H. et al Data Mining: Practical Machine Learning Tools and Techniques with Java Implementations. Morgan Kaufmann; (October 11, 1999), ISBN: 1558605525; Denison David G. T. (Editor) et al Bayesian Methods for Nonlinear Classification and Regression (Wiley Series in Probability and Statistics). John Wiley & Sons; (July 2002), ISBN: 0471490369; Ghose, Arup K. et al. Combinatorial Library Design and Evaluation Principles, Software, Tools, and Applications in Drug Discovery. ISBN: 0-8247-0487-8]. The properties of proteins can be derived from empirical and theoretical models of protein sequence, function and three-dimensional structure (for example, analysis of likely contact residues or calculated physicochemical properties), and these properties can be considered individually or in combination.
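Amino acids are referred to herein using the name of the amino acid, the three-letter abbreviation or the single-letter abbreviation. As used herein, the term "protein" includes proteins, polypeptides and peptides. As used herein, the term "amino acid sequence" is synonymous with the term "polypeptide" and/or the term "protein". In some instances, the term "amino acid sequence" is synonymous with the term "peptide". The terms "protein" and "polypeptide" are used interchangeably herein. In the present disclosure and claims, the conventional one-letter and three-letter codes for amino acid residues may be used. The three-letter code for amino acids is as defined in conformity with the IUPAC-IUB Joint Commission on Biochemical Nomenclature (JCBN). It is also understood that a polypeptide may be encoded by more than one nucleotide sequence owing to the degeneracy of the genetic code.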
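Amino acid residues at non-conserved positions may be substituted with conservative or non-conservative residues. In particular, conservative amino acid replacements are contemplated.
A "conservative amino acid substitution" is one in which the amino acid residue is replaced with an amino acid residue having a similar side chain. Families of amino acid residues having similar side chains have been defined in the art, including basic side chains (e.g. lysine, arginine or histidine), acidic side chains (e.g. aspartic acid or glutamic acid), uncharged polar side chains (e.g. glycine, asparagine, glutamine, serine, threonine, tyrosine or cysteine), non-polar side chains (e.g. alanine, valine, leucine, isoleucine, proline, phenylalanine, methionine or tryptophan), beta-branched side chains (e.g. threonine, valine, isoleucine) and aromatic side chains (e.g. tyrosine, phenylalanine, tryptophan or histidine). Thus, if an amino acid in a polypeptide is replaced with another amino acid from the same side chain family, the substitution is considered to be conservative. The inclusion of conservatively modified variants in a protein of the invention does not exclude other forms of variant, for example polymorphic variants, interspecies homologues and alleles.
"Non-conservative amino acid substitutions" include those in which (i) a residue having an electropositive side chain (e.g. Arg, His or Lys) is substituted for, or by, an electronegative residue (e.g. Glu or Asp), (ii) a hydrophilic residue (e.g. Ser or Thr) is substituted for, or by, a hydrophobic residue (e.g. Ala, Leu, Ile, Phe or Val), (iii) a cysteine or proline is substituted for, or by, any other residue, or (iv) a residue having a bulky hydrophobic or aromatic side chain (e.g. Val, His, Ile or Trp) is substituted for, or by, one having a smaller side chain (e.g. Ala or Ser) or no side chain (e.g. Gly).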
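"Insertions" or "deletions" are typically in the range of about 1, 2 or 3 amino acids. The variation allowed may be determined experimentally by systematically introducing insertions or deletions of amino acids into a protein using recombinant DNA techniques and assaying the resulting recombinant variants for activity. This requires no more than routine experimentation for the skilled person.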
A "fragment" of a polypeptide comprises at least 50%, at least 60%, at least 70%, at least 80%, at least 90%, at least 95%, at least 97% or more of the original polypeptide.
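The polynucleotides of the invention may be prepared by any means known in the art. For example, large amounts of a polynucleotide may be produced by replication in a suitable host cell. The natural or synthetic DNA fragment encoding the desired fragment will be incorporated into a recombinant nucleic acid construct, typically a DNA construct, capable of introduction into and replication in a prokaryotic or eukaryotic cell. Usually the DNA construct will be suitable for autonomous replication in a unicellular host, such as yeast or bacteria, but may also be intended for introduction into and integration within the genome of a cultured insect, mammalian, plant or other eukaryotic cell line.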
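The polynucleotides of the invention may also be produced by chemical synthesis, e.g. by the phosphoramidite method or the tri-ester method, and may be performed on commercial automated oligonucleotide synthesisers. A double-stranded fragment may be obtained from the single-stranded product of chemical synthesis either by synthesising the complementary strand and annealing the strands together under appropriate conditions, or by adding the complementary strand using DNA polymerase with an appropriate primer sequence.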
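When applied to a nucleic acid sequence, the term "isolated" in the context of the invention denotes that the polynucleotide sequence has been removed from its natural genetic milieu and is thus free of other extraneous or unwanted coding sequences (but may include naturally occurring 5' and 3' untranslated regions such as promoters and terminators), and is in a form suitable for use within genetically engineered protein production systems. Such isolated molecules are those that are separated from their natural environment.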
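In view of the degeneracy of the genetic code, considerable sequence variation is possible among the polynucleotides of the invention. Degenerate codons encompassing all possible codons for a given amino acid are set out below:
Amino Acid | Codons | Degenerate Codon |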
Cys | TGC TGT | TGY |
Ser | AGC AGT TCA TCC TCG TCT | WSN |
Thr | ACA ACC ACG ACT | ACN |
Pro | CCA CCC CCG CCT | CCN |
Ala | GCA GCC GCG GCT | GCN |
Gly | GGA GGC GGG GGT | GGN |
Asn | AAC AAT | AAY |
Asp | GAC GAT | GAY |
Glu | GAA GAG | GAR |
Gln | CAA CAG | CAR |
His | CAC CAT | CAY |
Arg | AGA AGG CGA CGC CGG CGT | MGN |
Lys | AAA AAG | AAR |
Met | ATG | ATG |
Ile | ATA ATC ATT | ATH |
Leu | CTA CTC CTG CTT TTA TTG | YTN |
Val | GTA GTC GTG GTT | GTN |
Phe | TTC TTT | TTY |
Tyr | TAC TAT | TAY |
Trp | TGG | TGG |
Ter | TAA TAG TGA | TRR |
Asn/Asp | | RAY |
Glu/Gln | | SAR |
Any | | NNN |
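To make the table concrete, the following Python sketch expands a degenerate codon into the explicit codons it represents, using the standard IUPAC nucleotide ambiguity codes. The function name is a hypothetical choice for illustration only.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes (DNA alphabet).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "M": "AC", "K": "GT", "S": "CG", "W": "AT",
    "H": "ACT", "B": "CGT", "V": "ACG", "D": "AGT", "N": "ACGT",
}


def expand_degenerate_codon(codon: str) -> list[str]:
    """Return every concrete codon matched by a degenerate codon, sorted."""
    choices = [IUPAC[base] for base in codon.upper()]
    return sorted("".join(p) for p in product(*choices))
```

Expanding `TGY` gives exactly the two Cys codons (TGC, TGT). Expanding `WSN` gives 16 codons, of which only six encode Ser, and `TRR` also matches TGG (Trp) in addition to the three stop codons, so some sequences covered by a degenerate codon encode a different amino acid.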
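The skilled person will understand that some flexibility exists when determining a degenerate codon representative of all possible codons encoding each amino acid. For example, some polynucleotides encompassed by the degenerate sequence may encode variant amino acid sequences, but the skilled person can readily identify such variant sequences by reference to the amino acid sequences of the invention.
A "variant" nucleic acid sequence has substantial homology or substantial similarity to a reference nucleic acid sequence (or a fragment thereof). A nucleic acid sequence or fragment thereof is "substantially homologous" (or "substantially identical") to a reference sequence if, when optimally aligned (with appropriate nucleotide insertions or deletions) with the other nucleic acid (or its complementary strand), there is nucleotide sequence identity in at least about 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or more of the nucleotide bases. Methods for determining the homology of nucleic acid sequences are known in the art.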
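Alternatively, a "variant" nucleic acid sequence is substantially homologous (or substantially identical) to a reference sequence (or a fragment thereof) if the "variant" and the reference sequence are capable of hybridising under stringent (e.g. highly stringent) hybridisation conditions. Nucleic acid sequence hybridisation will be affected by conditions such as salt concentration (e.g. NaCl), temperature or organic solvents, in addition to the base composition, the length of the complementary strands and the number of nucleotide base mismatches between the hybridising nucleic acids, as will be readily appreciated by the skilled person. Stringent temperature conditions are preferably employed, and generally include temperatures in excess of 30°C, typically in excess of 37°C, and preferably in excess of 45°C. Stringent salt conditions will ordinarily be less than 1000 mM, typically less than 500 mM, and preferably less than 200 mM. The pH is typically between 7.0 and 8.3. The combination of parameters is much more important than any single parameter.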
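Methods of determining nucleic acid percentage sequence identity are known in the art. By way of example, when assessing nucleic acid sequence identity, a sequence having a defined number of contiguous nucleotides may be aligned with a nucleic acid sequence (having the same number of contiguous nucleotides) from the corresponding portion of a nucleic acid sequence of the invention. Tools known in the art for determining nucleic acid percentage sequence identity include Nucleotide BLAST (as described below).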
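As a minimal sketch of the identity calculation itself, the Python function below computes percentage identity between two sequences that have already been aligned (for example by BLAST), with `-` marking gaps. The choice of denominator (all alignment columns that are not gap-only) is an assumption made here for illustration; different tools use different conventions.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percentage of alignment columns with identical, non-gap residues.

    Both inputs must be the same length (i.e. already aligned, '-' for gaps).
    Columns where both sequences have a gap are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" and y == "-":
            continue  # gap-only column carries no information
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared
```

A variant could then be checked against a threshold from the text, e.g. `percent_identity(a, b) >= 90.0`.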
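The skilled person will recognise that different species exhibit "preferential codon usage". As used herein, the term "preferential codon usage" refers to the codons that are most frequently used in cells of a certain species, thus favouring one or a few of the possible codons encoding each amino acid. For example, the amino acid threonine (Thr) may be encoded by ACA, ACC, ACG or ACT, but in mammalian host cells ACC is the most commonly used codon; in other species, different codons may be preferential. Preferential codons for a particular host cell species can be introduced into the polynucleotides of the invention by a variety of methods known in the art. For example, introducing preferential codon sequences into recombinant DNA can enhance production of the protein by making protein translation more efficient within a particular cell type or species. Therefore, according to the invention, in addition to the gag-pol gene, any nucleic acid sequence may be codon-optimised for expression in a host or target cell. In particular, the vector genome (or corresponding plasmid), the REV gene (or corresponding plasmid), the fusion protein (F) gene (or corresponding plasmid) and/or the haemagglutinin-neuraminidase (HN) gene (or corresponding plasmid), or any combination thereof, may be codon-optimised.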
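The idea of introducing preferential codons can be illustrated with a toy Python recoder that swaps each codon for a preferred synonymous codon without changing the encoded protein. Only the Thr entry (ACC) comes from the text; the other entries in `PREFERRED` are illustrative assumptions, and this is not the optimisation procedure used for the sequences of the invention. Real codon optimisation also weighs factors such as CpG content, secondary structure and, for gag-pol, the translational slip region.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
GENETIC_CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Hypothetical preferred-codon table; only Thr -> ACC is taken from the text.
PREFERRED = {"T": "ACC", "A": "GCC", "L": "CTG", "K": "AAG"}


def translate(cds: str) -> str:
    """Translate an in-frame coding sequence into one-letter amino acids."""
    cds = cds.upper()
    return "".join(GENETIC_CODE[cds[i:i + 3]] for i in range(0, len(cds), 3))


def recode(cds: str, preferred: dict[str, str] = PREFERRED) -> str:
    """Replace codons with preferred synonymous codons where one is listed."""
    cds = cds.upper()
    if len(cds) % 3:
        raise ValueError("coding sequence length must be a multiple of 3")
    out = []
    for i in range(0, len(cds), 3):
        codon = cds[i:i + 3]
        out.append(preferred.get(GENETIC_CODE[codon], codon))
    return "".join(out)
```

Because each replacement is synonymous, `translate(recode(seq))` equals `translate(seq)` for any in-frame input.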
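A "fragment" of a polynucleotide of interest comprises a series of consecutive nucleotides from the sequence of said full-length polynucleotide. By way of example, a "fragment" of a polynucleotide of interest may comprise (or consist of) at least 30 consecutive nucleotides from the sequence of said polynucleotide (e.g. at least 35, 50, 75, 100, 150, 200, 250, 300, 350, 400, 450, 500, 550, 600, 650, 700, 750, 800, 850, 900, 950 or 1000 consecutive nucleic acid residues of said polynucleotide). A fragment may include at least one antigenic determinant and/or may encode at least one antigenic epitope of the corresponding polypeptide of interest. Typically, a fragment as defined herein retains the same function as the full-length polynucleotide.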
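The terms "decrease", "reduced", "reduction" and "inhibit" are all used herein to mean a decrease by a statistically significant amount. These terms typically mean a decrease of at least 10% compared with a reference level (e.g. the absence of a given treatment) and can include, for example, a decrease of at least about 10%, at least about 20%, at least about 25%, at least about 30%, at least about 35%, at least about 40%, at least about 45%, at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 95%, at least about 98%, at least about 99% or more. As used herein, "reduction" or "inhibition" encompasses a complete inhibition or reduction compared with a reference level. "Complete inhibition" is a 100% inhibition (i.e. abrogation) compared with a reference level.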
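The terms "increased", "increase", "enhance" and "activate" are all used herein to mean an increase by a statistically significant amount. These terms can mean an increase of at least 25% or at least 50% compared with a reference level, for example an increase of at least about 50%, or at least about 75%, or at least about 80%, or at least about 90%, or at least about 100%, or at least about 150%, or at least about 200%, or at least about 250% or more compared with a reference level, or an increase of at least about 1.5-fold, or at least about 2-fold, or at least about 2.5-fold, or at least about 3-fold, or at least about 4-fold, or at least about 5-fold, or at least about 10-fold, or any increase between 1.5-fold and 10-fold or greater, compared with a reference level. In the context of yield or titre, an "increase" is an observable or statistically significant increase in such level.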
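The percentage and fold thresholds in the two definitions above are related by simple arithmetic, sketched below in Python. The function names and the default threshold are illustrative assumptions, and the sketch deliberately leaves out the statistical significance test that the definitions also require.

```python
def percent_change(reference: float, value: float) -> float:
    """Signed percentage change of value relative to a reference level."""
    if reference <= 0:
        raise ValueError("reference level must be positive")
    return 100.0 * (value - reference) / reference


def fold_change(reference: float, value: float) -> float:
    """Ratio of value to the reference level (2.0 means a 100% increase)."""
    if reference <= 0:
        raise ValueError("reference level must be positive")
    return value / reference


def is_increased(reference: float, value: float, min_percent: float = 25.0) -> bool:
    """True if value exceeds the reference by at least min_percent."""
    return percent_change(reference, value) >= min_percent
```

For instance, a titre rising from 2e7 to 5e7 TU/mL is a 2.5-fold change, i.e. a 150% increase, while a drop from 200 to 150 units is a 25% decrease.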
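The terms "individual", "subject" and "patient" are used interchangeably herein to refer to a mammalian subject for whom diagnosis, prognosis, disease monitoring, treatment, therapy and/or therapy optimisation is desired. The mammal can be (without limitation) a human, a non-human primate, a mouse, a rat, a dog, a cat, a horse or a cow. In a preferred embodiment, the individual, subject or patient is a human. An "individual" may be an adult, a juvenile or an infant. An "individual" may be male or female.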
A "subject in need" of treatment for a particular condition can be an individual having that condition, diagnosed as having that condition, or at risk of developing that condition.
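A subject can be one who has previously been diagnosed with or identified as suffering from or having a condition in need of treatment, or one or more complications or symptoms related to such a condition, and who optionally has already undergone treatment for a condition as defined herein or for one or more complications or symptoms related to said condition. Alternatively, a subject can be one who has not previously been diagnosed as having a condition as defined herein or one or more symptoms or complications related to said condition. For example, a subject can be one who exhibits one or more risk factors for a condition, one who exhibits one or more symptoms or complications related to said condition, or one who does not exhibit risk factors.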
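As used herein, the term "healthy individual" refers to an individual or group of individuals who are in a healthy state, e.g. individuals who have not shown any symptoms of the disease, have not been diagnosed with the disease and/or are not likely to develop the disease (e.g. cystic fibrosis (CF) or any other disease described herein). Preferably said healthy individual(s) is not taking medication affecting CF and has not been diagnosed with any other disease. The one or more healthy individuals may have a similar sex, age and/or body mass index (BMI) compared with the test individual. Applying standard statistical methods used in medicine allows the normal level of expression in healthy individuals, and significant deviations from such normal levels, to be determined.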
The terms "control" and "reference population" are used interchangeably herein.
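As used herein, the term "pharmaceutically acceptable" means approved by a regulatory agency of the Federal or a state government, or listed in the U.S. Pharmacopeia, the European Pharmacopoeia or another generally recognised pharmacopoeia.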
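The publications discussed herein are provided solely for their disclosure prior to the filing date of the present application. Nothing herein is to be construed as an admission that such publications constitute prior art to the claims appended hereto.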
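Disclosure relating to the various methods of the invention is intended to apply equally to other methods, therapeutic uses or methods, the data storage medium or device, the computer program product, and vice versa.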
Retroviral and Lentiviral Vectors
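The invention relates to the production of retroviral/lentiviral (e.g. SIV) constructs. The term "retrovirus" refers to any member of the Retroviridae family of RNA viruses that encode the enzyme reverse transcriptase. The term "lentivirus" refers to a family of retroviruses. Examples of retroviruses suitable for use in the invention include gammaretroviruses such as murine leukaemia virus (MLV) and feline leukaemia virus (FLV). Examples of lentiviruses suitable for use in the invention include Simian immunodeficiency virus (SIV), Human immunodeficiency virus (HIV), Feline immunodeficiency virus (FIV), Equine infectious anaemia virus (EIAV) and Visna/maedi virus. Preferably the invention relates to lentiviral vectors and their production. A particularly preferred lentiviral vector is an SIV vector (including all strains and subtypes), such as SIV-AGM (originally isolated from African green monkeys, Cercopithecus aethiops). Alternatively, the invention may relate to HIV vectors.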
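The retroviral/lentiviral (e.g. SIV) vectors of the invention are typically pseudotyped with haemagglutinin-neuraminidase (HN) and fusion (F) proteins from a respiratory paramyxovirus. Preferably the respiratory paramyxovirus is a Sendai virus (murine parainfluenza virus type 1). The retroviral/lentiviral (e.g. SIV) vectors of the invention may be pseudotyped with proteins from another virus, provided that the use of a codon-optimised gag-pol gene (e.g. from SIV) does not negatively affect the manufactured titre of the vector, or even increases the titre of the vector. Non-limiting examples of other proteins that may be used to pseudotype retroviral/lentiviral (e.g. SIV) vectors of the invention include the G glycoprotein of Vesicular Stomatitis Virus (G-VSV) and the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) spike protein or modified forms thereof, for example as described in UK Patent Application Nos. 2118685.3 and 2105278.2, each of which is herein incorporated by reference in its entirety. Thus, the invention may relate to the production of SIV pseudotyped with G-VSV, or SIV pseudotyped with the SARS-CoV-2 spike protein, using a codon-optimised gag-pol gene.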
A retroviral/lentiviral (e.g. SIV) vector produced according to the invention may be integrase-competent (IC). Alternatively, the lentiviral (e.g. SIV) vector may be integrase-deficient (ID).
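Retroviral/lentiviral vectors, such as those produced according to the invention, can integrate into the genome of transduced cells and lead to long-lasting expression, making them suitable for transduction of stem/progenitor cells. In the lung, several cell types with regenerative capacity have been identified as responsible for maintaining specific cell lineages in the conducting airways and alveoli. These include basal cells and submucosal gland duct cells in the upper airways, club cells and neuroendocrine cells in the bronchiolar airways, bronchioalveolar stem cells in the terminal bronchioles, and type II pneumocytes in the alveoli. Accordingly, and without being bound by theory, the retroviral/lentiviral (e.g. SIV) vectors bring about long-term gene expression of the transgene of interest by introducing the transgene into one or more long-lived airway epithelial cells or cell types such as basal cells and submucosal gland duct cells in the upper airways, club cells and neuroendocrine cells in the bronchiolar airways, bronchioalveolar stem cells in the terminal bronchioles, and type II pneumocytes in the alveoli.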
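Accordingly, a retroviral/lentiviral (e.g. SIV) vector produced according to the invention may transduce one or more cells or cell lines with regenerative potential within the lung (including the airways and respiratory tract) to achieve long-term gene expression. For example, the retroviral/lentiviral (e.g. SIV) vector may transduce basal cells, such as those in the upper airways/respiratory tract. Basal cells have a central role in processes of epithelial maintenance and repair following injury. In addition, basal cells are widely distributed along the human respiratory epithelium, with a relative distribution ranging from 30% (larger airways) to 6% (smaller airways).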
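A retroviral/lentiviral (e.g. SIV) vector produced according to the invention may be used to transduce isolated and expanded stem/progenitor cells ex vivo prior to administration to a patient. Preferably, a retroviral/lentiviral (e.g. SIV) vector produced according to the invention is used to transduce cells within the lung (or airways/respiratory tract) in vivo.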
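The retroviral/lentiviral (e.g. SIV) vectors of the invention exhibit remarkable resistance to shear forces, with only a modest reduction in transduction ability when passed through clinically relevant delivery devices such as bronchoscopes, spray bottles and nebulisers.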
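The retroviral/lentiviral (e.g. SIV) vectors of the invention enable high levels of transgene expression, resulting in high (therapeutic) levels of expression of a therapeutic protein. The retroviral/lentiviral (e.g. SIV) vectors of the invention typically provide high expression levels of a transgene when administered to a patient. The terms high expression and therapeutic expression are used interchangeably herein. Expression may be measured by any appropriate method (qualitative or quantitative, preferably quantitative), and concentrations given in any appropriate unit of measurement (e.g. ng/ml or nM).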
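Expression of a transgene of interest may be given relative to the expression of the corresponding endogenous (defective) gene in a patient. Expression may be measured in terms of mRNA or protein expression. The expression of a transgene of the invention, such as a functional CFTR gene, may be quantified relative to the endogenous gene, such as the endogenous (dysfunctional) CFTR gene, in terms of mRNA copies per cell or any other appropriate unit.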
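Expression levels of a transgene and/or the encoded therapeutic protein of the invention may be measured in lung tissue, epithelial lining fluid and/or serum/plasma, as appropriate. A high and/or therapeutic expression level may therefore refer to the concentration in the lung, epithelial lining fluid and/or serum/plasma.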
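The transgene included in a vector of the invention may be modified to facilitate expression. For example, the transgene sequence may be in CpG-depleted (or CpG-free) and/or codon-optimised form to facilitate gene expression. Standard techniques for modifying a transgene sequence in this way are known in the art.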
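The retroviral/lentiviral (e.g. SIV) vectors of the invention exhibit efficient airway cell uptake and enhanced transgene expression, and suffer no loss of efficacy upon repeated administration. Accordingly, the retroviral/lentiviral (e.g. SIV) vectors of the invention can produce long-lasting, repeatable, high-level expression in airway cells without inducing an undue immune response.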
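The retroviral/lentiviral (e.g. SIV) vectors of the invention enable long-term transgene expression, resulting in long-term expression of a therapeutic protein. As described herein, the phrases "long-term expression", "sustained expression", "long-lasting expression" and "persistent expression" are used interchangeably. Long-term expression according to the invention means expression of a therapeutic gene and/or protein, preferably at therapeutic levels, for at least 45 days, at least 60 days, at least 90 days, at least 120 days, at least 180 days, at least 250 days, at least 360 days, at least 450 days, at least 730 days or more. Preferably long-term expression means expression for at least 90 days, at least 120 days, at least 180 days, at least 250 days, at least 360 days, at least 450 days, at least 720 days or more, more preferably for at least 360 days, at least 450 days, at least 720 days or more. This long-term expression may be achieved by repeated doses or by a single dose.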
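Repeated doses may be administered twice daily, daily, twice weekly, weekly, monthly, every two months, every three months, every four months, every six months, yearly, every two years, or less frequently. Dosing may be continued for as long as required, for example for at least six months, at least one year, two, three, four, five, ten, fifteen or twenty years or more, up to the lifetime of the patient to be treated.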
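The retroviral/lentiviral (e.g. SIV) vector comprises a promoter operably linked to a transgene, enabling expression of the transgene. Typically the promoter is a hybrid human CMV enhancer/EF1a (hCEF) promoter. This hCEF promoter may lack the intron corresponding to nucleotides 570-709 and the exon corresponding to nucleotides 728-733 of the hCEF promoter. A preferred example of an hCEF promoter sequence of the invention is provided by SEQ ID NO: 10. The promoter may be a CMV promoter. An example of a CMV promoter sequence is provided by SEQ ID NO: 11. The promoter may be a human elongation factor 1a (EF1a) promoter. An example of an EF1a promoter is provided by SEQ ID NO: 12. Other promoters for transgene expression are known in the art, and their suitability for the retroviral/lentiviral (e.g. SIV) vectors of the invention can be determined using routine techniques known in the art. Non-limiting examples of other promoters include UbC and UCOE. As described herein, the promoter may be modified to further regulate expression of the transgene of the invention.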
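The promoter included in the retroviral/lentiviral (e.g. SIV) vector of the invention may be specifically selected and/or modified to further refine regulation of expression of the therapeutic gene. Suitable promoters and standard techniques for their modification are known in the art. As a non-limiting example, a number of suitable (CpG-free) promoters suitable for use in the invention are described in Pringle et al. (J. Mol. Med. Berl. 2012, 90(12): 1487-96), which is herein incorporated by reference in its entirety. Preferably, the retroviral/lentiviral vectors (particularly SIV F/HN vectors) of the invention comprise an hCEF promoter having low or no CpG dinucleotide content. The hCEF promoter may have all CG dinucleotides replaced with any one of AG, TG or GT. Thus, the hCEF promoter may be CpG-free. A preferred example of a CpG-free hCEF promoter sequence of the invention is provided by SEQ ID NO: 10. The absence of CpG dinucleotides further improves the performance of the retroviral/lentiviral (e.g. SIV) vectors of the invention, in particular in situations where it is not desired to induce an immune response against an expressed antigen or an inflammatory response against the delivered expression construct. The elimination of CpG dinucleotides reduces the occurrence of flu-like symptoms and inflammation which may result from administration of constructs, particularly when administered to the airways.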
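The CpG replacement described above can be illustrated with a short Python sketch that counts CG dinucleotides and replaces each with AG or TG. The function names are hypothetical. The text also allows GT as a replacement, but this sketch restricts itself to AG and TG because a naive GT substitution can create a new CG when the preceding base is C (e.g. CCG becomes CGT). The sketch applies to non-coding sequence such as a promoter; in coding sequence the same edit could change the encoded amino acid.

```python
def count_cpg(seq: str) -> int:
    """Number of CG dinucleotides on the given strand."""
    return seq.upper().count("CG")


def remove_cpg(seq: str, replacement: str = "TG") -> str:
    """Replace every CG dinucleotide with AG or TG.

    Both options keep the G and change the C, so no new CG can be created.
    """
    if replacement not in {"AG", "TG"}:
        raise ValueError("this sketch only supports AG or TG replacements")
    return seq.upper().replace("CG", replacement)
```

After `remove_cpg`, `count_cpg` on the result is always zero.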
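The retroviral/lentiviral (e.g. SIV) vectors of the invention may be modified to allow gene expression to be shut down. Standard techniques for modifying the vector in this way are known in the art. As a non-limiting example, Tet-responsive promoters are widely used.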
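Preferably, the invention relates to F/HN retroviral/lentiviral vectors, particularly SIV F/HN vectors, comprising a promoter and a transgene. F/HN pseudotyping is particularly efficient at targeting cells in the airway epithelium and, as such, for therapeutic applications the vector is typically delivered to cells of the respiratory tract, including cells of the airway epithelium. Accordingly, the retroviral/lentiviral (e.g. SIV) vectors of the invention are particularly suited to the treatment of diseases or disorders of the airways, respiratory tract or lung. Typically, the retroviral/lentiviral (e.g. SIV) vectors may be used for the treatment of a genetic respiratory disease.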
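A retroviral/lentiviral (e.g. SIV) vector of the invention may comprise a transgene encoding a polypeptide or protein that is therapeutic for the treatment of such a disease, particularly a disease or disorder of the airways, respiratory tract or lung.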
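Accordingly, a retroviral/lentiviral (e.g. SIV) vector of the invention may comprise a transgene encoding: (i) a secreted therapeutic protein, optionally selected from alpha-1 antitrypsin (A1AT), Factor VIII, surfactant protein B (SFTPB), Factor VII, Factor IX, Factor X, Factor XI, von Willebrand factor, granulocyte-macrophage colony-stimulating factor (GM-CSF) and a monoclonal antibody against an infectious agent; or (ii) a protein selected from CFTR, ABCA3, DNAH5, DNAH11, DNAI1 and DNAI2. Other examples of transgenes that may be included in a retroviral/lentiviral (e.g. SIV) vector of the invention include genes related to or associated with other surfactant deficiencies.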
Preferably, the transgene encodes CFTR. An example of a CFTR cDNA is provided by SEQ ID NO: 13. Variants thereof (as described herein) are also encompassed, particularly variants having at least 90% (e.g. at least 90, 92, 94, 95, 96, 97, 98, 99 or 100%) sequence identity to SEQ ID NO: 13.
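The transgene may encode A1AT. An example of an A1AT transgene is provided by SEQ ID NO: 14, or by the complementary sequence of SEQ ID NO: 15. SEQ ID NO: 14 is a codon-optimised, CpG-depleted A1AT transgene previously designed by the present inventors to enhance translation in human cells. Such optimisation has been shown to enhance gene expression by up to 15-fold. Variants of this sequence (as defined herein) which have the same technical effect of enhancing translation compared with the unmodified (wild-type) A1AT gene sequence are also encompassed by the invention. The polypeptide encoded by said A1AT transgene may be exemplified by the polypeptide of SEQ ID NO: 16. Variants thereof (as described herein) are also encompassed, particularly variants having at least 90% (e.g. at least 90, 92, 94, 95, 96, 97, 98, 99 or 100%) sequence identity to SEQ ID NO: 14, 15 or 16.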
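The transgene may encode FVIII. Examples of FVIII transgenes are provided by SEQ ID NOs: 17 and 18, or by the respective complementary sequences of SEQ ID NOs: 19 and 20. The polypeptide encoded by the FVIII transgene may be exemplified by the polypeptide of SEQ ID NO: 21 or 22. Variants thereof (as described herein) are also encompassed, particularly variants having at least 90% (e.g. at least 90, 92, 94, 95, 96, 97, 98, 99 or 100%) sequence identity to any one of SEQ ID NOs: 17 to 22.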
The transgene of the invention may be any one or more of DNAH5, DNAH11, DNAI1 and DNAI2, or other known related genes.
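When the respiratory epithelium is targeted for delivery of the retroviral/lentiviral (e.g. SIV) vector, the transgene may encode A1AT, SFTPB or GM-CSF. The transgene may encode a monoclonal antibody (mAb) against an infectious agent. The transgene may encode anti-TNF alpha. The transgene may encode a therapeutic protein implicated in an inflammatory, immune or metabolic condition.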
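The retroviral/lentiviral (e.g. SIV) vectors of the invention may be delivered to cells of the respiratory tract to allow production of proteins that are secreted into the circulatory system. In such embodiments, the transgene may encode Factor VII, Factor VIII, Factor IX, Factor X, Factor XI and/or von Willebrand factor. Such vectors may be used in the treatment of diseases, particularly cardiovascular diseases and blood disorders, preferably blood clotting deficiencies such as haemophilia. In addition, the transgene may encode a mAb against an infectious agent, or a protein implicated in an inflammatory, immune or metabolic condition such as a lysosomal storage disease.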
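The retroviral/lentiviral (e.g. SIV) vector of the invention may have no intron positioned between the promoter and the transgene. Similarly, there may be no intron between the promoter and the transgene in the vector genome (pDNA1) plasmid (e.g. pGM326 as described herein, illustrated in Figure 2a and having the sequence of SEQ ID NO: 3).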
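In some preferred embodiments of the invention, the retroviral/lentiviral (e.g. SIV) vector comprises an hCEF promoter and a CFTR transgene, including those described herein. Optionally, said retroviral/lentiviral (e.g. SIV) vector may have no intron positioned between the promoter and the transgene. Such a retroviral/lentiviral (e.g. SIV) vector may be produced by the methods described herein, using a genome plasmid carrying the CFTR transgene and a promoter.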
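In some preferred embodiments of the invention, the retroviral/lentiviral (e.g. SIV) vector comprises an hCEF promoter and an A1AT transgene, including those described herein. Optionally, said retroviral/lentiviral (e.g. SIV) vector may have no intron positioned between the promoter and the transgene. Such a retroviral/lentiviral (e.g. SIV) vector may be produced by the methods described herein, using a genome plasmid carrying the A1AT transgene and a promoter.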
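In some preferred embodiments of the invention, the retroviral/lentiviral (e.g. SIV) vector comprises an hCEF or CMV promoter and an FVIII transgene, including those described herein. Optionally, said retroviral/lentiviral (e.g. SIV) vector may have no intron positioned between the promoter and the transgene. Such a retroviral/lentiviral (e.g. SIV) vector may be produced by the methods described herein, using a genome plasmid carrying the FVIII transgene and a promoter.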
A retroviral/lentiviral (e.g. SIV) vector as described herein comprises a transgene. The transgene comprises a nucleic acid sequence encoding a gene product, e.g. a protein, particularly a therapeutic protein.
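For example, in one embodiment, the nucleic acid sequence encoding CFTR, A1AT or FVIII comprises (or consists of) a nucleic acid sequence having at least 90% (e.g. at least 90, 92, 94, 95, 96, 97, 98, 99 or 100%) sequence identity to the CFTR, A1AT or FVIII nucleic acid sequence respectively, examples of which are described herein. In a further embodiment, the nucleic acid sequence encoding CFTR, A1AT or FVIII comprises (or consists of) a nucleic acid sequence having at least 95% (e.g. at least 95, 96, 97, 98, 99 or 100%) sequence identity to the CFTR, A1AT or FVIII nucleic acid sequence respectively, examples of which are described herein. In one embodiment, the nucleic acid sequence encoding CFTR is provided by SEQ ID NO: 13, the nucleic acid sequence encoding A1AT is provided by SEQ ID NO: 14 or by the complementary sequence of SEQ ID NO: 15, and/or the nucleic acid sequence encoding FVIII is provided by SEQ ID NOs: 17 and 18, or by the respective complementary sequences of SEQ ID NOs: 19 and 20, or variants thereof.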
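The amino acid sequence encoded by the CFTR, A1AT or FVIII transgene may comprise (or consist of) an amino acid sequence having at least 95% (e.g. at least 95, 96, 97, 98, 99 or 100%) sequence identity to the functional CFTR, A1AT or FVIII polypeptide sequence respectively.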
The retroviral/lentiviral (e.g. SIV) vectors of the invention may comprise a central polypurine tract (cPPT) and/or the Woodchuck hepatitis virus posttranscriptional regulatory element (WPRE). An exemplary WPRE sequence is provided by SEQ ID NO: 23.
Methods of Production
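As described herein, the present inventors have demonstrated for the first time that using a codon-optimised gag-pol gene from SIV does not negatively affect the manufactured titre of an SIV vector pseudotyped with haemagglutinin-neuraminidase (HN) and fusion (F) proteins from a respiratory paramyxovirus, and may even increase the titre of the vector. In addition, the inventors have further shown that the use of a codon-optimised gag-pol gene can be combined with the use of a modified vector genome plasmid as described herein, while maintaining or even increasing vector titre.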
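Codon optimisation is a technique for maximising protein expression by increasing the translational efficiency of the encoding gene. The increase in translational efficiency is brought about by modification of the nucleic acid sequence. Codon optimisation is routine in the art, and devising a codon-optimised version of a given nucleic acid sequence is within the routine practice of one of ordinary skill. What is not straightforward, however, is predicting the effect that codon optimisation has on other parameters. For example, as described herein, conventional wisdom teaches that codon-optimisation of the gag-pol gene typically reduces vector yield under normal manufacturing conditions (where the vector genome plasmid, rather than the gag-pol gene, is limiting).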
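Accordingly, the invention provides a method of producing a retroviral/lentiviral (e.g. SIV) vector which is pseudotyped with haemagglutinin-neuraminidase (HN) and fusion (F) proteins from a respiratory paramyxovirus and which comprises a promoter and a transgene, wherein said method comprises the use of a codon-optimised gag-pol gene. Preferably said vector is a lentiviral vector, with Simian immunodeficiency virus (SIV) vectors being particularly preferred.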
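Typically the codon-optimised gag-pol gene used in the production methods of the invention is matched to the retroviral/lentiviral vector being produced. As a non-limiting example, if the lentiviral vector is an HIV vector, the codon-optimised gag-pol gene used in the production method of the invention is an HIV gag-pol gene. As a further non-limiting example, if the lentiviral vector is an SIV vector, the codon-optimised gag-pol gene used in the production method of the invention is an SIV gag-pol gene.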
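Preferably the codon-optimised gag-pol gene used in the production methods of the invention is an SIV gag-pol gene. An exemplary wild-type SIV gag-pol gene which may be modified to produce a codon-optimised gag-pol gene is given in SEQ ID NO: 2. The modifications made to the wild-type gag-pol gene of SEQ ID NO: 2 to arrive at an exemplary codon-optimised gag-pol gene of the invention (SEQ ID NO: 1) are shown in the alignment of Figure 1.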
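In addition to codon-optimisation, the codon-optimised gag-pol genes used in the production methods of the invention may comprise other modifications, for example a translational slip (allowing translation to slip from one region to another so that both Gag and Pol can be produced). Any appropriate variation in codon usage may be used in a codon-optimised gag-pol gene of the invention, provided that (i) homology between the vector genome plasmid and the GagPol plasmid is reduced, minimising the risk of RCL (replication-competent lentivirus) generation; and (ii) sufficient GagPol is produced after codon optimisation without the inclusion of an RRE (which further reduces homology and the risk of RCL generation).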
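The codon-optimised gag-pol genes used in the production methods of the invention may be completely (100%) or partially codon-optimised. Partial codon-optimisation encompasses at least 70%, at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99% or more codon optimisation.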
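Preferably, the gag-pol gene itself is fully codon-optimised, but may comprise regions of non-coding sequence (e.g. between the gag and pol genes) that are not codon-optimised. As a non-limiting example, in order to retain the translational slip of reading frame between the gag and pol genes, the region around the translational slip sequence may not be codon-optimised (e.g. where the exact translational slip sequence is important for this function). A non-codon-optimised translational slip sequence within a codon-optimised gag-pol gene is exemplified in SEQ ID NO: 1.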
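Preferably, the codon-optimised gag-pol gene used in the methods of the invention comprises or consists of the nucleic acid sequence of SEQ ID NO: 1, or a variant thereof (as defined herein). In particular, the codon-optimised gag-pol gene used in the methods of the invention comprises or consists of a nucleic acid sequence having at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99% or more sequence identity to SEQ ID NO: 1. Preferably, the codon-optimised gag-pol gene used in the methods of the invention comprises or consists of a nucleic acid sequence having at least 90%, more preferably at least 95%, even more preferably at least 98% or more sequence identity to SEQ ID NO: 1. The codon-optimised gag-pol gene of SEQ ID NO: 1 comprises a translational slip and therefore does not form a single conventional open reading frame.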
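The methods of the invention may be scalable, GMP-compatible methods. Thus, the methods of the invention typically allow the production of high-titre purified F/HN retroviral/lentiviral (e.g. SIV) vectors. Typically a method of the invention produces a titre of retroviral/lentiviral (e.g. SIV) vector that is at least equivalent to the titre produced by a corresponding method that does not use a codon-optimised gag-pol gene. As used herein, the term "equivalent" may be defined such that the use of the codon-optimised gag-pol gene does not significantly reduce the titre of retroviral/lentiviral (e.g. SIV) vector compared with the use of the corresponding non-codon-optimised gag-pol gene. As a non-limiting example, a method of the invention produces a titre of retroviral/lentiviral (e.g. SIV) vector that is no more than 2-fold, no more than 1.5-fold, no more than 1.0-fold, no more than 0.5-fold, no more than 0.25-fold, or less, lower than the titre obtained using the corresponding non-codon-optimised gag-pol gene. The term "equivalent" may also be defined such that the titre produced by a method using a codon-optimised gag-pol gene is statistically unchanged compared with the titre produced by a method using the corresponding non-codon-optimised gag-pol gene (e.g. p<0.05, p<0.01).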
Preferably, a method of the invention produces an increased titre of retroviral/lentiviral (e.g. SIV) vector compared with the titre produced by a corresponding method that does not use a codon-optimised gag-pol gene. The titre of retroviral/lentiviral (e.g. SIV) vector may be at least 1.5-fold, at least 2-fold or at least 2.5-fold greater than the titre produced by a corresponding method that does not use a codon-optimised gag-pol gene.
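The production of retroviral/lentiviral (e.g. SIV) vectors typically employs one or more plasmids which provide the elements needed for vector production: the genome of the retroviral/lentiviral vector, Gag-Pol, Rev, F and HN. Multiple elements can be provided on a single plasmid. Preferably each element is provided on a separate plasmid, such that there are five plasmids, one for each of the vector genome, Gag-Pol, Rev, F and HN.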
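Alternatively, a single plasmid may provide the Gag-Pol and Rev elements, and may be referred to as a packaging plasmid (pDNA2). The remaining elements (genome, F and HN) may be provided by separate plasmids (pDNA1, pDNA3a and pDNA3b respectively), such that four plasmids are used for the production of a retroviral/lentiviral (e.g. SIV) vector according to the invention. In the four-plasmid method, pDNA1, pDNA3a and pDNA3b may be as described herein in the context of the five-plasmid method.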
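Preferably, the codon-optimised gag-pol gene used in the methods of the invention is comprised in a plasmid that comprises or consists of the nucleic acid sequence of SEQ ID NO: 5 (pGM691), or a variant thereof (as defined herein). In particular, the codon-optimised gag-pol gene used in the methods of the invention is comprised in a plasmid that comprises or consists of a nucleic acid sequence having at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99% or more sequence identity to SEQ ID NO: 5. Preferably, the codon-optimised gag-pol gene used in the methods of the invention is comprised in a plasmid that comprises or consists of a nucleic acid sequence having at least 90%, more preferably at least 95%, even more preferably at least 98% or more sequence identity to SEQ ID NO: 5. In the plasmid of SEQ ID NO: 5 (or variants thereof): (i) the codon-optimised gag-pol gene of SEQ ID NO: 1 comprises a translational slip and therefore does not form a single conventional open reading frame; and (ii) the codon-optimised gag-pol gene of SEQ ID NO: 1 is operably linked to a CAG promoter.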
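In the preferred five-plasmid method of the invention, the vector genome plasmid encodes all the genetic material that is packaged into the final retroviral/lentiviral vector, including the transgene. Typically only a portion of the genetic material found in the vector genome plasmid ends up in the virus. The vector genome plasmid may be designated herein as "pDNA1", and typically comprises the transgene and the transgene promoter.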
The other four plasmids are manufacturing plasmids encoding the Gag-Pol, Rev, F and HN proteins. These plasmids may be designated "pDNA2a", "pDNA2b", "pDNA3a" and "pDNA3b" respectively.
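Modifications may be made to the vector genome plasmid (pDNA1), in particular to further improve the safety profile of the vector. As exemplified herein, such modifications may comprise or consist of modifying the pDNA1 sequence to remove viral, particularly retroviral/lentiviral (e.g. SIV), ORFs from the pDNA1 sequence. Thus, the methods of the invention may use a modified pDNA1 comprising a reduced number of non-transgene ORFs. Said modified pDNA1 may comprise modifications within any region of the plasmid sequence. In particular, the modified pDNA1 may comprise modifications to remove: (i) ORFs in the 5' to 3' direction; (ii) ORFs of ≥100 amino acids; and/or (iii) ORFs upstream of the transgene and/or of the promoter operably linked to the transgene. The modified pDNA1 may comprise no ORFs other than the transgene, although this is not essential. Rather, the modified pDNA1 may still comprise ORFs other than the transgene, but comprise a reduced number of non-transgene ORFs compared with the unmodified pDNA1 from which it is derived. As a non-limiting example, the modified pDNA1 may comprise at least one, at least two, at least three, at least four, at least five or more fewer non-transgene ORFs than the corresponding unmodified pDNA1. As a specific example, pGM830 (which is derived from pGM326) comprises two fewer non-transgene ORFs than pGM326. The modified pDNA1 may comprise at least one, at least five, at least six, at least seven, at least eight, at least nine, at least 10, at least 15, at least 20 or more modifications (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15 or 20 modifications) compared with the corresponding unmodified pDNA1. As a non-limiting example, the modified pDNA1 may comprise from about 1 to about 20, such as from about 5 to about 15, or from about 5 to about 10, modifications compared with the corresponding unmodified pDNA1. As a specific example, pGM830 (which is derived from pGM326) comprises seven modifications compared with pGM326.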
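One way to picture criteria (i) and (ii) above is an ORF scan of the forward strand. The Python sketch below finds ATG-to-stop ORFs of at least 100 amino acids in the three 5' to 3' reading frames. It is an illustrative assumption of how such ORFs might be enumerated, not the analysis used to design pGM830, and the function name and the ATG-start definition of an ORF are choices made here.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def forward_orfs(seq: str, min_aa: int = 100) -> list[tuple[int, int]]:
    """Find ORFs (ATG ... stop) on the forward strand in all three frames.

    Returns (start, end) index pairs, end exclusive and including the stop
    codon, for ORFs encoding at least `min_aa` amino acids (stop not counted).
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first in-frame ATG opens the ORF
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_aa:
                    orfs.append((start, i + 3))
                start = None  # look for the next ORF in this frame
    return orfs
```

Comparing `len(forward_orfs(unmodified))` with `len(forward_orfs(modified))` for two plasmid sequences would then give the reduction in qualifying ORFs under this definition, before excluding the transgene itself.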
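As exemplified herein, the use of pGM830 as plasmid pDNA1 has the potential to produce an improved SIV titre compared with a production method in which the pDNA1 plasmid is pGM326 but all other plasmids and parameters are kept constant (Figure 11). In other words, the use of a modified pDNA1 such as pGM830 does not negatively affect the improved titre achieved using a codon-optimised gag-pol gene, and may even provide a further improvement in titre over the effect of using a codon-optimised gag-pol gene, such as that provided by using pGM691 as pDNA2a. The term "increased titre" as defined herein applies equally to methods of the invention that use both a codon-optimised gag-pol gene and a modified pDNA1.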
Typically, the lentivirus is SIV, such as SIV1, preferably SIV-AGM. The F and HN proteins are derived from a respiratory paramyxovirus, preferably a Sendai virus.
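In a specific embodiment relating to CFTR, the five plasmids are characterised by Figures 2a-2f; thus pDNA1 is the pGM326 plasmid of Figure 2a or the pGM830 plasmid of Figure 2b, pDNA2a is the pGM691 plasmid of Figure 2c, pDNA2b is the pGM299 plasmid of Figure 2d, pDNA3a is the pGM301 plasmid of Figure 2e and pDNA3b is the pGM303 plasmid of Figure 2f, or variants of any of these plasmids (as described herein). In this embodiment, the final CFTR-containing retroviral/lentiviral vector may be referred to as vGM195 (see the Examples). The pGM691 plasmid and the vGM195 vector are preferred embodiments of the invention.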
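As exemplified herein, the use of pGM691 as plasmid pDNA2a has the potential to produce an improved SIV titre compared with a production method in which the pDNA2a plasmid is pGM297 (Figure 2g) but all other plasmids and method parameters are kept constant.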
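When the methods of the invention are used to produce an A1AT vector, the five plasmids may be characterised by Figure 3 (thus plasmid pDNA1 may be pGM407) together with all of Figures 2c-f (as described above for the specific CFTR embodiment), or variants of any of these plasmids (as described herein).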
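When the methods of the invention are used to produce an FVIII vector, the five plasmids may be characterised by one of Figures 4a-d (thus plasmid pDNA1 may be pGM411, pGM412, pGM413 or pGM414) together with all of Figures 2c-f, or variants of any of these plasmids (as described herein).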
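The plasmid as defined in Figure 2a is represented by SEQ ID NO: 3; the plasmid as defined in Figure 2b is represented by SEQ ID NO: 4; the plasmid as defined in Figure 2c is represented by SEQ ID NO: 5; the plasmid as defined in Figure 2d is represented by SEQ ID NO: 6; the plasmid as defined in Figure 2e is represented by SEQ ID NO: 7; the plasmid as defined in Figure 2f is represented by SEQ ID NO: 8; the plasmid as defined in Figure 2g is represented by SEQ ID NO: 9; the plasmid as defined in Figure 3 is represented by SEQ ID NO: 24; and the F/HN-SIV-CMV-HFVIII-V3, F/HN-SIV-hCEF-HFVIII-V3, F/HN-SIV-CMV-HFVIII-N6-co and/or F/HN-SIV-hCEF-HFVIII-N6-co plasmids as defined in Figures 4a to 4d are represented by SEQ ID NOs: 25 to 28 respectively. Variants of these plasmids (as defined herein) are also encompassed by the invention. In particular, variants having at least 90% (e.g. at least 90, 92, 94, 95, 96, 97, 98, 99, 99.5 or 100%) sequence identity to any one of SEQ ID NOs: 3 to 9, 24 and 25 to 28 are encompassed.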
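In the five-plasmid method of the invention, all five plasmids contribute to the formation of the final retroviral/lentiviral (e.g. SIV) vector. During manufacture of the retroviral/lentiviral (e.g. SIV) vector, the vector genome plasmid (pDNA1) provides the enhancer/promoter, Psi, RRE, cPPT, mWPRE, SIN LTR and SV40 polyA (see Figure 2a or 2b), which are important for virus manufacture. Using pGM326 or pGM830 as a non-limiting example of pDNA1, the CMV enhancer/promoter, SV40 polyA, colE1 Ori and KanR are involved in the manufacture of the retroviral/lentiviral (e.g. SIV) vector of the invention (e.g. vGM195 or vGM244), but are not found in the final retroviral/lentiviral (e.g. SIV) vector. The RRE, cPPT (central polypurine tract), hCEF, soCFTR2 (transgene) and mWPRE from pGM326 or pGM830 are found in the final retroviral/lentiviral (e.g. SIV) vector. The SIN LTR (long terminal repeat, SIN/IN self-inactivating) and Psi (packaging signal) may be found in the final retroviral/lentiviral (e.g. SIV) vector.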
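For other retroviral/lentiviral (e.g. SIV) vectors of the invention, the corresponding elements from other vector genome plasmids (pDNA1) are either required for manufacture (but not found in the final vector) or present in the final retroviral/lentiviral (e.g. SIV) vector.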
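The F and HN proteins from pDNA3a and pDNA3b (preferably Sendai F and HN proteins) are important for infection of target cells by the final retroviral/lentiviral (e.g. SIV) vector, i.e. for entry into a patient's epithelial cells (typically lung or nasal cells as described herein). The products of the pDNA2a and pDNA2b plasmids are important for virus transduction, i.e. for inserting the retroviral/lentiviral (e.g. SIV) DNA into the host's genome. The promoter, regulatory elements (such as WPRE) and transgene are important for transgene expression within the target cell(s).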
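A method of the invention may comprise or consist of the following steps: (a) growing cells in suspension; (b) transfecting the cells with one or more plasmids; (c) adding a nuclease; (d) harvesting the lentivirus (e.g. SIV); (e) adding trypsin; and (f) purifying the lentivirus (e.g. SIV).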
This method may use a four- or five-plasmid system as described herein. Thus, for the preferred five-plasmid method, the one or more plasmids comprise or consist of: a vector genome plasmid, pDNA1; a co-gagpol plasmid, pDNA2a; a Rev plasmid, pDNA2b; a fusion (F) protein plasmid, pDNA3a; and a haemagglutinin-neuraminidase (HN) plasmid, pDNA3b. The pDNA1 may be selected from pGM326 and pGM830, preferably pGM830. The pDNA2a may be pGM691. The pDNA2b may be pGM299. The pDNA3a may be pGM301. The pDNA3b may be pGM303. Any combination of pDNA1, pDNA2a, pDNA2b, pDNA3a and pDNA3b may be used. Preferably, pDNA1 is pGM326 or pGM830 (with pGM830 being particularly preferred); pDNA2a is pGM691; pDNA2b is pGM299; pDNA3a is pGM301; and pDNA3b is pGM303. An SIV vector produced using pGM830, pGM691, pGM299, pGM301 and pGM303 is designated vGM244. An SIV vector produced using pGM326, pGM691, pGM299, pGM301 and pGM303 is designated vGM195.
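The plasmid combinations named above can be captured in a small lookup structure. The Python sketch below records the two five-plasmid sets and the vector designation each produces; the class, field and function names are hypothetical, while the plasmid and vector identifiers are those given in the text.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class PlasmidSet:
    """The five plasmids of the five-plasmid production method."""
    pdna1: str   # vector genome plasmid
    pdna2a: str  # codon-optimised gag-pol (co-gagpol) plasmid
    pdna2b: str  # Rev plasmid
    pdna3a: str  # fusion (F) protein plasmid
    pdna3b: str  # haemagglutinin-neuraminidase (HN) plasmid


# Vector designations and their plasmid sets, as stated in the text.
VECTORS = {
    "vGM195": PlasmidSet("pGM326", "pGM691", "pGM299", "pGM301", "pGM303"),
    "vGM244": PlasmidSet("pGM830", "pGM691", "pGM299", "pGM301", "pGM303"),
}


def vector_for(plasmids: PlasmidSet) -> Optional[str]:
    """Return the vector designation for a plasmid set, or None if unnamed."""
    for name, known in VECTORS.items():
        if known == plasmids:
            return name
    return None
```

The two sets differ only in pDNA1 (pGM326 versus pGM830), which the structure makes easy to see at a glance.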
벡터 게놈 플라스미드: co-gagpol 플라스미드: Rev 플라스미드: F 플라스미드: HN 플라스미드의 임의의 적절한 비율을 사용하여 생성된 레트로바이러스/렌티바이러스(예를 들어, SIV) 역가를 추가로 최적화(증가)할 수 있다. 비제한적 예로서, 벡터 게놈 플라스미드: co-gagpol 플라스미드: Rev 플라스미드: F 플라스미드: HN 플라스미드의 비율은 10-40:-4-20:3-12:3-12:3-12, 전형적으로 15-20:7-11:4-8:4-8:4-8, 예를 들어 약 18-22:7-11:4-8:4-8:4-8, 19-21:8-10:5-7:5-7:5-7 범위일 수 있다. 바람직하게는 벡터 게놈 플라스미드: co-gagpol 플라스미드: Rev 플라스미드: F 플라스미드: HN 플라스미드의 비율은 약 20:9:6:6:6이다.
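As an illustration of how the preferred 20:9:6:6:6 mass ratio translates into per-plasmid amounts, the following minimal sketch splits a chosen total DNA concentration across the five plasmids. The function name and the dictionary keys are illustrative assumptions; the 2.2 μg/mL total is taken from condition A of Example 2.

```python
def split_dna_by_ratio(total_ug_per_ml, ratio=(20, 9, 6, 6, 6)):
    """Split a total DNA concentration (ug/mL) across the five plasmids.

    The ratio is genome : co-gagpol : Rev : F : HN by mass.
    The plasmid labels below are illustrative names only.
    """
    names = ("genome", "co-gagpol", "Rev", "F", "HN")
    if len(ratio) != len(names):
        raise ValueError("ratio must have five parts")
    total_parts = sum(ratio)
    return {name: total_ug_per_ml * part / total_parts
            for name, part in zip(names, ratio)}


# Example: total DNA of 2.2 ug/mL (condition A in Example 2).
masses = split_dna_by_ratio(2.2)
for name, ug_per_ml in masses.items():
    print(f"{name}: {ug_per_ml:.3f} ug/mL")
```

The per-plasmid amounts always sum back to the chosen total, so the same helper can be reused for other totals or for other ratios within the ranges given above.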
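Steps (a)-(f) of the method are typically carried out sequentially, starting at step (a) and continuing through to step (f). The method may comprise one or more additional steps, such as additional purification steps, buffer exchange, concentration of the retroviral/lentiviral (e.g. SIV) vector after purification, and/or formulation of the retroviral/lentiviral (e.g. SIV) vector after purification (or concentration). Each step may comprise one or more sub-steps. For example, harvesting may comprise one or more steps or sub-steps, and/or purification may comprise one or more steps or sub-steps.
Any appropriate cell type may be transfected with the one or more plasmids (e.g. the five plasmids described herein) to produce a retroviral/lentiviral (e.g. SIV) vector of the invention. Typically mammalian cells, particularly human cell lines, are used. Non-limiting examples of cells suitable for use in the methods of the invention are HEK293 cells (e.g. HEK293F or HEK293T cells) and 293T/17 cells. Commercial cell lines suitable for virus production are also readily available (e.g. Gibco Viral Production Cells, Catalogue Number A35347 from ThermoFisher Scientific).
The cells may be grown in animal-component-free media, including serum-free media. The cells may be grown in media containing human components. The cells may be grown in defined media comprising or consisting of synthetically produced components.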
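Any appropriate transfection means may be used according to the invention. Selection of appropriate transfection means is within the routine practice of one of ordinary skill in the art. By way of non-limiting example, transfection may be carried out using PEIPro™, Lipofectamine2000™ or Lipofectamine3000™.
Any appropriate nuclease may be used according to the invention. Selection of an appropriate nuclease is within the routine practice of one of ordinary skill in the art. Typically the nuclease is an endonuclease. By way of non-limiting example, the nuclease may be Benzonase® or Denarase®. The nuclease may be added before harvesting, after harvesting, or between harvesting steps.
Trypsin activity may preferably be provided by an animal-origin-free recombinant enzyme such as TrypLE Select™. The trypsin may be added before harvesting, after harvesting, or between harvesting steps.
Any appropriate purification means may be used to purify the retroviral/lentiviral (e.g. SIV) vector. Non-limiting examples of suitable purification steps include depth/end filtration, tangential flow filtration (TFF) and chromatography. The purification step typically comprises at least one chromatography step. Non-limiting examples of chromatography steps that may be used according to the invention include mixed-mode size exclusion chromatography (SEC) and/or anion exchange chromatography. Elution may be carried out with or without a salt gradient, preferably without.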
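The method may be used to produce a retroviral/lentiviral (e.g. SIV) vector of the invention, such as one comprising a CFTR, A1AT and/or FVIII gene as described herein. Alternatively, the retroviral/lentiviral (e.g. SIV) vector of the invention comprises any of the above-mentioned genes, or a gene encoding any of the above-mentioned proteins.
The method of the invention may use any combination of one or more of the specific plasmid constructs provided by Figures 2a-2f, Figure 3 and/or Figures 4a-4d to provide a retroviral/lentiviral (e.g. SIV) vector of the invention. In particular, the plasmid constructs of Figures 2c-2f are preferably used together with a plasmid of Figure 2b, Figure 2a, Figure 3 or Figures 4a-4d, with the plasmid of Figure 2b being particularly preferred.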
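The invention also provides codon-optimised SIV gag-pol genes. These codon-optimised SIV gag-pol genes are typically suitable for use in the methods of the invention. A codon-optimised gag-pol gene of the invention may comprise or consist of the nucleic acid sequence of SEQ ID NO: 1, or a variant thereof (as defined herein). In particular, a codon-optimised gag-pol gene of the invention may comprise or consist of a nucleic acid sequence having at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99% or more sequence identity to SEQ ID NO: 1. Preferably, a codon-optimised gag-pol gene of the invention may comprise or consist of a nucleic acid sequence having at least 90%, more preferably at least 95%, even more preferably at least 98% or more sequence identity to SEQ ID NO: 1. Accordingly, the invention provides a nucleic acid comprising a codon-optimised gag-pol gene, wherein said nucleic acid has at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99% or more sequence identity to SEQ ID NO: 1, preferably at least 90%, more preferably at least 95%, even more preferably at least 98% or more sequence identity to SEQ ID NO: 1. In a particularly preferred embodiment, the invention provides a nucleic acid comprising or consisting of the nucleic acid sequence of SEQ ID NO: 1. A codon-optimised gag-pol gene of the invention (e.g. an SIV gag-pol gene) is typically operably linked to a promoter to facilitate expression of the gag-pol proteins. Any suitable promoter may be used, including those described herein in relation to promoters for the transgene. Preferably, the promoter is a CAG promoter, as used in the exemplified pGM691 plasmid. An exemplary CAG promoter is set out in SEQ ID NO: 29. The codon-optimised gag-pol gene of SEQ ID NO: 1 contains a translational slip and therefore does not form a single conventional open reading frame.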
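The invention also provides plasmids comprising a codon-optimised SIV gag-pol gene of the invention, i.e. a pDNA2a comprising a codon-optimised SIV gag-pol gene of the invention. These plasmids are typically suitable for use in the methods of the invention. A (pDNA2a) plasmid of the invention may comprise or consist of the nucleic acid sequence of SEQ ID NO: 5 (pGM691), or a variant thereof (as defined herein). In particular, a (pDNA2a) plasmid of the invention may comprise or consist of a nucleic acid sequence having at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99% or more sequence identity to SEQ ID NO: 5. Preferably, a (pDNA2a) plasmid of the invention may comprise or consist of a nucleic acid sequence having at least 90%, more preferably at least 95%, even more preferably at least 98% or more sequence identity to SEQ ID NO: 5. Accordingly, the invention provides a plasmid comprising a codon-optimised SIV gag-pol gene of the invention (as defined herein), particularly a nucleic acid sequence comprising or consisting of SEQ ID NO: 1, or a variant thereof (as defined herein). Said plasmid may comprise or consist of a nucleic acid sequence having at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99% or more sequence identity to SEQ ID NO: 5, preferably at least 90%, more preferably at least 95%, even more preferably at least 98% or more sequence identity to SEQ ID NO: 5. In a particularly preferred embodiment, the invention provides a plasmid comprising or consisting of the nucleic acid sequence of SEQ ID NO: 5. In the plasmid of SEQ ID NO: 5 (or a variant thereof): (i) the codon-optimised gag-pol gene of SEQ ID NO: 1 contains a translational slip and therefore does not form a single conventional open reading frame; and (ii) the codon-optimised gag-pol gene of SEQ ID NO: 1 is operably linked to a CAG promoter (e.g. as exemplified herein).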
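The codon-optimised gag-pol genes (or nucleic acids comprising or consisting thereof), and plasmids comprising said genes or nucleic acids, are advantageous for the production of retroviral/lentiviral (e.g. SIV) vectors using the methods of the invention because they enable the production of high-titre F/HN retroviral/lentiviral (e.g. SIV) vectors. Typically, said codon-optimised gag-pol genes (or nucleic acids comprising or consisting thereof), and plasmids comprising said genes or nucleic acids, can be used to produce a titre of retroviral/lentiviral (e.g. SIV) vector that is at least equivalent to the titre of retroviral/lentiviral (e.g. SIV) vector produced by a corresponding method that does not use a codon-optimised gag-pol gene as described herein.
Preferably, the codon-optimised gag-pol genes (or nucleic acids comprising or consisting thereof), and plasmids comprising said genes or nucleic acids, enable the production of an increased titre of retroviral/lentiviral (e.g. SIV) vector compared with the titre of retroviral/lentiviral (e.g. SIV) vector produced by a corresponding method that does not use a codon-optimised gag-pol gene as described herein.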
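The invention also provides a host cell comprising (i) a retroviral/lentiviral (e.g. SIV) vector of the invention; (ii) a codon-optimised gag-pol gene of the invention (or nucleic acid comprising or consisting thereof); and/or (iii) a plasmid comprising said gene or nucleic acid; or any combination thereof. Typically the host cell is a mammalian cell, particularly a human cell or cell line. Non-limiting examples of host cells include HEK293 cells (such as HEK293F or HEK293T cells) and 293T/17 cells. Commercial cell lines suitable for virus production are also readily available (as described herein).
The invention also provides a retroviral/lentiviral (e.g. SIV) vector obtainable by a method of the invention, or using a codon-optimised gag-pol gene (or nucleic acid comprising or consisting thereof), a plasmid comprising said gene or nucleic acid, or a host cell of the invention.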
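Typically, a retroviral/lentiviral (e.g. SIV) vector obtainable by a method of the invention, or using a codon-optimised gag-pol gene (or nucleic acid comprising or consisting thereof), a plasmid comprising said gene or nucleic acid, or a host cell of the invention, is produced at a high titre. Titre may be measured in transducing units (TU) as defined herein. As described herein, the methods of the invention typically produce retroviral/lentiviral (e.g. SIV) vectors at a titre equivalent to or higher than a corresponding method that does not use a codon-optimised gag-pol gene. Accordingly, a retroviral/lentiviral (e.g. SIV) vector obtainable by a method of the invention, or using a codon-optimised gag-pol gene (or nucleic acid comprising or consisting thereof), a plasmid comprising said gene or nucleic acid, or a host cell of the invention, may optionally be at a titre of at least about 2.5×10^6 TU/mL, at least about 3.0×10^6 TU/mL, at least about 3.1×10^6 TU/mL, at least about 3.2×10^6 TU/mL, at least about 3.3×10^6 TU/mL, at least about 3.4×10^6 TU/mL, at least about 3.5×10^6 TU/mL, at least about 3.6×10^6 TU/mL, at least about 3.7×10^6 TU/mL, at least about 3.8×10^6 TU/mL, at least about 3.9×10^6 TU/mL, at least about 4.0×10^6 TU/mL or more. Preferably, the retroviral/lentiviral (e.g. SIV) vector is produced at a titre of at least about 3.0×10^6 TU/mL, or at least about 3.5×10^6 TU/mL.
The production of high-titre retroviral/lentiviral (e.g. SIV) vectors may impart other desirable properties on the resulting vector product. For example, without being bound by theory, it is believed that production at high titre without the need for intensive concentration by methods such as TFF results in a higher-quality vector product than retroviral/lentiviral (e.g. SIV) vectors produced by a corresponding method that does not use a codon-optimised gag-pol gene (and optionally a modified vector genome plasmid), because the vectors are exposed to fewer of the shear forces that can damage viral particles and their RNA cargo.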
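The invention also provides a method of increasing retroviral/lentiviral (e.g. SIV) vector titre, comprising the use of a codon-optimised gag-pol gene (or nucleic acid comprising or consisting thereof), a plasmid comprising said gene or nucleic acid, or a host cell of the invention. Said method of increasing retroviral/lentiviral (e.g. SIV) vector titre according to the invention may increase titre by at least 1.5-fold, at least 2-fold, at least 2.5-fold or more compared with a corresponding method that uses a non-codon-optimised version of the gag-pol gene (or nucleic acid comprising or consisting thereof), or a plasmid or host cell comprising said non-codon-optimised gene or nucleic acid. Alternatively, said method of increasing retroviral/lentiviral (e.g. SIV) titre according to the invention may increase titre by at least about 25%, at least about 50%, at least about 100%, at least about 150%, at least about 200% or more compared with a corresponding method that uses a non-codon-optimised version of the gag-pol gene (or nucleic acid comprising or consisting thereof), or a plasmid or host cell comprising said non-codon-optimised gene or nucleic acid. Preferably, the method of increasing retroviral/lentiviral (e.g. SIV) titre according to the invention may increase titre by (a) at least 1.5-fold or at least 2-fold; and/or (b) at least about 25%, more preferably at least about 50%, even more preferably at least about 100%. Typically, the corresponding method is identical to the method of the invention except for the use of the codon-optimised gag-pol gene (or nucleic acid comprising or consisting thereof), the plasmid comprising said gene or nucleic acid, or the host cell of the invention. All disclosure herein relating to methods of producing retroviral/lentiviral (e.g. SIV) vectors applies equally and without reservation to the methods of increasing retroviral/lentiviral (e.g. SIV) titre of the invention.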
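The invention also provides the use of a codon-optimised gag-pol gene (or nucleic acid comprising or consisting thereof), a plasmid comprising said gene or nucleic acid, or a host cell of the invention, for increasing the titre of a retroviral/lentiviral (e.g. SIV) vector. Said use may increase retroviral/lentiviral (e.g. SIV) vector titre by at least 1.5-fold, at least 2-fold, at least 2.5-fold or more compared with the use of a corresponding non-codon-optimised version of the gag-pol gene (or nucleic acid comprising or consisting thereof), or a plasmid or host cell comprising said non-codon-optimised gene or nucleic acid. Alternatively, said use may increase retroviral/lentiviral (e.g. SIV) titre by at least about 25%, at least about 50%, at least about 100%, at least about 150%, at least about 200% or more compared with the use of a corresponding non-codon-optimised version of the gag-pol gene (or nucleic acid comprising or consisting thereof), or a plasmid or host cell comprising said non-codon-optimised gene or nucleic acid. Preferably, said use increases retroviral/lentiviral (e.g. SIV) titre by (a) at least 1.5-fold or at least 2-fold; and/or (b) at least about 25%, more preferably at least about 50%, even more preferably at least about 100%. Typically, the corresponding use is identical to the method of the invention except for the use of the codon-optimised gag-pol gene (or nucleic acid comprising or consisting thereof), the plasmid comprising said gene or nucleic acid, or the host cell of the invention. All disclosure herein relating to methods of producing retroviral/lentiviral (e.g. SIV) vectors applies equally and without reservation to the use of a codon-optimised gag-pol gene, a plasmid comprising said gene or nucleic acid, or a host cell of the invention for increasing the titre of a retroviral/lentiviral (e.g. SIV) vector according to the invention. The use of a codon-optimised gag-pol gene together with a modified vector genome plasmid (containing reduced viral ORFs) may provide additional advantages in terms of safety and/or vector titre. Accordingly, the increased vector yields described herein may be achieved using a codon-optimised gag-pol gene alone, or in combination with a modified vector genome plasmid. All disclosure herein relating to increased vector titre in the context of methods using a codon-optimised gag-pol gene applies equally and without limitation to methods using a codon-optimised gag-pol gene together with a modified vector genome plasmid of the invention, and to vectors produced by such methods.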
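Therapeutic Indications
The retroviral/lentiviral (e.g. SIV) vectors of the invention enable higher and sustained gene expression through efficient gene transfer. The F/HN-pseudotyped retroviral/lentiviral (e.g. SIV) vectors of the invention are capable of: (i) airway transduction without disruption of epithelial integrity; (ii) persistent gene expression; (iii) lack of chronic toxicity; and (iv) efficient repeat administration. Long-term/persistent stable gene expression, preferably at a therapeutically effective level, may be achieved using repeat doses of a vector of the invention. Alternatively, a single dose may be used to achieve the desired long-term expression.
Accordingly, the retroviral/lentiviral (e.g. SIV) vectors of the invention can advantageously be used in gene therapy. By way of example, the efficient airway cell uptake properties of the retroviral/lentiviral (e.g. SIV) vectors of the invention make them highly suitable for treating respiratory tract diseases. The retroviral/lentiviral (e.g. SIV) vectors of the invention can also be used in methods of gene therapy to promote secretion of therapeutic proteins. By way of further example, the invention provides secretion of therapeutic proteins into the lumen of the respiratory tract or into the circulatory system. Thus, administration of a retroviral/lentiviral (e.g. SIV) vector of the invention and its uptake by airway cells may enable the use of the lungs (or nose or airways) as a "factory" to produce a therapeutic protein that is secreted and enters the general circulation at therapeutic levels, where it can travel to cells/tissues of interest to elicit a therapeutic effect. In contrast to intracellular or membrane proteins, the production of such secreted proteins does not rely on specific disease target cells being transduced, which is a significant advantage and achieves high levels of protein expression. Thus, diseases other than respiratory tract diseases, such as cardiovascular diseases and blood disorders, particularly blood clotting deficiencies, can also be treated by the retroviral/lentiviral (e.g. SIV) vectors of the invention.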
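The retroviral/lentiviral (e.g. SIV) vectors of the invention can effectively treat a disease by providing a transgene for the correction of the disease. For example, a functional copy of the CFTR gene is inserted to ameliorate or prevent lung disease in CF patients, independent of the underlying mutation. Accordingly, the retroviral/lentiviral (e.g. SIV) vectors of the invention may be used to treat cystic fibrosis (CF), typically by gene therapy with a CFTR transgene as described herein.
As another example, the retroviral/lentiviral (e.g. SIV) vectors of the invention may be used to treat alpha-1 antitrypsin (A1AT) deficiency, typically by gene therapy with an A1AT transgene as described herein. A1AT is a secreted anti-protease that is produced mainly in the liver and then trafficked to the lung, with smaller amounts also being produced in the lung itself. The main function of A1AT is to bind to and neutralise/inhibit neutrophil elastase. Gene therapy with A1AT according to the invention is relevant to A1AT-deficient patients, as well as to other lung diseases such as CF or chronic obstructive pulmonary disease (COPD). It offers the opportunity to overcome some of the problems faced by conventional enzyme replacement therapy (in which A1AT is isolated from human blood and administered intravenously every week) by providing stable, long-lasting expression in the target tissue (lung/nasal epithelium), ease of administration and unlimited availability.
Transduction with a retroviral/lentiviral (e.g. SIV) vector of the invention may lead to secretion of the recombinant protein into the lumen of the lung as well as into the circulation. One advantage of this is that the therapeutic protein reaches the interstitium. A1AT gene therapy may therefore also be beneficial in other disease indications, non-limiting examples of which include type 1 and type 2 diabetes, acute myocardial infarction, ischaemic heart disease, rheumatoid arthritis, inflammatory bowel disease, transplant rejection, graft versus host (GvH) disease, multiple sclerosis, liver disease, cirrhosis, vasculitides, and infections such as bacterial and/or viral infections.
A1AT has numerous other anti-inflammatory and tissue-protective effects, for example in pre-clinical models of diabetes, graft versus host disease and inflammatory bowel disease. The production of A1AT in the lung and/or nose following transduction according to the invention may therefore be more widely applicable, including to these indications.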
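Other examples of diseases that may be treated with gene therapy of a secreted protein according to the invention include cardiovascular diseases and blood disorders, particularly blood clotting deficiencies such as haemophilia (A, B or C), von Willebrand disease and Factor VII deficiency.
Other examples of diseases or disorders to be treated include primary ciliary dyskinesia (PCD), acute lung injury, surfactant protein B (SFTB) deficiency, pulmonary alveolar proteinosis (PAP), chronic obstructive pulmonary disease (COPD), and/or inflammatory, infectious, immune or metabolic conditions such as lysosomal storage diseases.
Accordingly, the invention provides a method of treating a disease, the method comprising administering a retroviral/lentiviral (e.g. SIV) vector of the invention to a subject. Typically the retroviral/lentiviral (e.g. SIV) vector is produced using a method of the invention. Any disease described herein may be treated according to the invention. In particular, the invention provides a method of treating a lung disease using a retroviral/lentiviral (e.g. SIV) vector of the invention. The disease to be treated may be a chronic disease. Preferably, a method of treating CF is provided.
The invention also provides a retroviral/lentiviral (e.g. SIV) vector as described herein for use in a method of treating a disease. Typically the retroviral/lentiviral (e.g. SIV) vector is produced using a method of the invention. Any disease described herein may be treated according to the invention. In particular, the invention provides a retroviral/lentiviral (e.g. SIV) vector of the invention for use in a method of treating a lung disease. The disease to be treated may be a chronic disease. Preferably, a retroviral/lentiviral (e.g. SIV) vector for use in treating CF is provided.
The invention also provides the use of a retroviral/lentiviral (e.g. SIV) vector as described herein in the manufacture of a medicament for use in a method of treating a disease. Typically the retroviral/lentiviral (e.g. SIV) vector is produced using a method of the invention. Any disease described herein may be treated according to the invention. In particular, the invention provides the use of a retroviral/lentiviral (e.g. SIV) vector of the invention for the manufacture of a medicament for use in a method of treating a lung disease. The disease to be treated may be a chronic disease. Preferably, the use of a retroviral/lentiviral (e.g. SIV) vector in the manufacture of a medicament for use in a method of treating CF is provided.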
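Formulation and Administration
The retroviral/lentiviral (e.g. SIV) vectors of the invention may be administered at any dose appropriate to achieve the desired therapeutic effect. Appropriate doses may be determined by a clinician or other medical practitioner using standard techniques, within the normal course of their work. Non-limiting examples of suitable doses include 1×10^8 transducing units (TU), 1×10^9 TU, 1×10^10 TU, 1×10^11 TU or more.
The invention also provides compositions comprising a retroviral/lentiviral (e.g. SIV) vector as described above and a pharmaceutically acceptable carrier. Non-limiting examples of pharmaceutically acceptable carriers include water, saline and phosphate-buffered saline. In some embodiments, however, the composition is in lyophilised form, in which case it may include a stabiliser such as bovine serum albumin (BSA). In some embodiments, it may be desirable to formulate the composition with a preservative, such as thiomersal or sodium azide, to facilitate long-term storage.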
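The retroviral/lentiviral (e.g. SIV) vectors of the invention may be administered by any appropriate route. It may be desirable to direct the compositions of the invention (as described above) to the respiratory system of a subject. Efficient delivery of a therapeutic/prophylactic composition or medicament to the site of infection in the respiratory tract may be achieved, for example, by oral or intranasal administration as an aerosol (e.g. a nasal spray), or by a catheter. Typically the retroviral/lentiviral (e.g. SIV) vectors of the invention are stable in clinically relevant nebulisers, inhalers (including metered dose inhalers), catheters and aerosols, etc.
In some embodiments of the invention, the nose is a preferred production site for a therapeutic protein using a retroviral/lentiviral (e.g. SIV) vector of the invention for at least one of the following reasons: (i) extracellular barriers such as inflammatory cells and sputum are less pronounced in the nose; (ii) ease of vector administration; (iii) smaller quantities of vector are required; and (iv) ethical considerations. Thus, transduction of nasal epithelial cells with a retroviral/lentiviral (e.g. SIV) vector of the invention may result in efficient (high-level) and long-lasting expression of the therapeutic transgene of interest. Accordingly, nasal administration of a retroviral/lentiviral (e.g. SIV) vector of the invention may be preferred.
Formulations for intranasal administration may be in the form of nasal droplets or a nasal spray. An intranasal formulation may comprise droplets having an approximate diameter in the range of 100-5000 μm, such as 500-4000 μm, 1000-3000 μm or 100-1000 μm. Alternatively, in terms of volume, the droplets may be in the range of about 0.001-100 μl, such as 0.1-50 μl or 1.0-25 μl, or such as 0.001-1 μl.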
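To relate the droplet diameter ranges to the droplet volume ranges given above, the following minimal sketch converts a diameter in micrometres to a volume in microlitres. It assumes perfectly spherical droplets, which is a simplification that the text does not state; the function name is illustrative.

```python
import math


def droplet_volume_ul(diameter_um):
    """Volume (in microlitres) of a spherical droplet of the given diameter (in micrometres)."""
    volume_um3 = math.pi / 6.0 * diameter_um ** 3  # sphere volume: (pi / 6) * d^3
    return volume_um3 / 1e9                        # 1 uL = 1 mm^3 = 1e9 um^3


# Print the volume for a few diameters across the 100-5000 um range.
for diameter in (100, 500, 1000, 3000, 5000):
    print(f"{diameter} um -> {droplet_volume_ul(diameter):.4g} uL")
```

Under this assumption a 1000 μm droplet holds roughly half a microlitre, so the stated diameter range of 100-5000 μm spans volumes from well below one microlitre to several tens of microlitres.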
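Aerosol formulations may take the form of a powder, suspension or solution. The size of aerosol particles is relevant to the delivery capability of an aerosol. Smaller particles may travel further down the respiratory airway towards the alveoli than larger particles. In one embodiment, the aerosol particles have a diameter distribution that facilitates delivery along the entire length of the bronchi, bronchioles and alveoli. Alternatively, the particle size distribution may be selected to target a particular section of the respiratory airway, for example the alveoli. For aerosol delivery of the medicament, the particles may have diameters in the approximate range of 0.1-50 μm, preferably 1-25 μm, more preferably 1-5 μm.
Aerosol particles may be for delivery using a nebuliser (e.g. via the mouth) or a nasal spray. An aerosol formulation may optionally contain a propellant and/or a compressed gas.
The formulation of pharmaceutical aerosols is routine to those skilled in the art; see, for example, Sciarra, J. in Remington's Pharmaceutical Sciences (supra). The agents may be formulated as solution aerosols, dispersions or suspensions of dry powders, emulsions or semi-solid preparations. The aerosol may be delivered using any propellant system known to those skilled in the art. The aerosols may be applied to the upper respiratory tract, for example by nasal inhalation, or to the lower respiratory tract, or to both. The part of the lung to which the medicament is delivered may be determined by the disorder. Compositions comprising a vector of the invention, in particular where intranasal delivery is to be used, may comprise a humectant. This may help to reduce or prevent drying of the mucous membrane and to prevent irritation of the membranes. Suitable humectants include, for example, sorbitol, mineral oil, vegetable oil and glycerol; soothing agents; membrane conditioners; sweeteners; and combinations thereof. The compositions may comprise a surfactant. Suitable surfactants include non-ionic, anionic and cationic surfactants. Examples of surfactants that may be used include, for example, polyoxyethylene derivatives of fatty acid partial esters of sorbitol anhydrides, such as Tween 80, polyoxyl 40 stearate, polyoxyethylene 50 stearate, fusieates, bile salts and octoxynol.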
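In some cases, the initial administration may be followed by subsequent administration of the retroviral/lentiviral (e.g. SIV) vector. For example, the administration may be at least 1 week, 2 weeks, 1 month, 2 months, 6 months, 1 year or more after the initial administration. In some instances, the retroviral/lentiviral (e.g. SIV) vector of the invention may be administered at least once a week, once every two weeks, once a month, every two months, every six months, annually or at longer intervals. Preferably, administration is every six months, more preferably annually. The retroviral/lentiviral (e.g. SIV) vector may, for example, be administered at intervals dictated by when the effects of the previous administration are decreasing.
Any two or more retroviral/lentiviral (e.g. SIV) vectors of the invention may be administered separately, sequentially or simultaneously. Thus two or more retroviral/lentiviral (e.g. SIV) vectors, at least one of which is a retroviral/lentiviral (e.g. SIV) vector of the invention, may be administered separately, simultaneously or sequentially, and in particular two or more retroviral/lentiviral (e.g. SIV) vectors of the invention may be administered in this way. The two retroviral/lentiviral (e.g. SIV) vectors may be administered in the same or different compositions. In a preferred instance, the two retroviral/lentiviral (e.g. SIV) vectors may be delivered in the same composition.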
SEQUENCE HOMOLOGY
Any of a variety of sequence alignment methods can be used to determine percent identity, including, without limitation, global methods, local methods and hybrid methods, such as segment approach methods. Protocols to determine percent identity are routine procedures within the scope of one skilled in the art. Global methods align sequences from the beginning to the end of the molecule and determine the best alignment by adding up the scores of individual residue pairs and by imposing gap penalties. Non-limiting methods include, e.g., CLUSTAL W, see, e.g., Julie D. Thompson et al., CLUSTAL W: Improving the Sensitivity of Progressive Multiple Sequence Alignment Through Sequence Weighting, Position-Specific Gap Penalties and Weight Matrix Choice, 22(22) Nucleic Acids Research 4673-4680 (1994); and iterative refinement, see, e.g., Osamu Gotoh, Significant Improvement in Accuracy of Multiple Protein Sequence Alignments by Iterative Refinement as Assessed by Reference to Structural Alignments, 264(4) J. Mol. Biol. 823-838 (1996). Local methods align sequences by identifying one or more conserved motifs shared by all of the input sequences. Non-limiting methods include, e.g., Match-box, see, e.g., Eric Depiereux and Ernest Feytmans, Match-Box: A Fundamentally New Algorithm for the Simultaneous Alignment of Several Protein Sequences, 8(5) CABIOS 501-509 (1992); Gibbs sampling, see, e.g., C. E. Lawrence et al., Detecting Subtle Sequence Signals: A Gibbs Sampling Strategy for Multiple Alignment, 262(5131) Science 208-214 (1993); and Align-M, see, e.g., Ivo Van Walle et al., Align-M - A New Algorithm for Multiple Alignment of Highly Divergent Sequences, 20(9) Bioinformatics 1428-1435 (2004).
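Thus, percent sequence identity is determined by conventional methods. See, for example, Altschul et al., Bull. Math. Bio. 48: 603-16, 1986 and Henikoff and Henikoff, Proc. Natl. Acad. Sci. USA 89:10915-19, 1992. Briefly, two amino acid sequences are aligned to optimise the alignment scores using a gap opening penalty of 10, a gap extension penalty of 1, and the "blosum 62" scoring matrix of Henikoff and Henikoff (ibid.) as shown below (amino acids are indicated by the standard one-letter codes).
The "percent sequence identity" between two or more nucleic acid or amino acid sequences is a function of the number of identical positions shared by the sequences. Thus, % identity may be calculated as the number of identical nucleotides/amino acids divided by the total number of nucleotides/amino acids, multiplied by 100. Calculations of % sequence identity may also take into account the number of gaps, and the length of each gap, that need to be introduced to optimise the alignment of two or more sequences. Sequence comparisons and the determination of percent identity between two or more sequences can be carried out using specific mathematical algorithms, such as BLAST, which will be familiar to a skilled person.
Alignment scores for determining sequence identity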
A R N D C Q E G H I L K M F P S T W Y V
A 4
R -1 5
N -2 0 6
D -2 -2 1 6
C 0 -3 -3 -3 9
Q -1 1 0 0 -3 5
E -1 0 0 2 -4 2 5
G 0 -2 0 -1 -3 -2 -2 6
H -2 0 1 -1 -3 0 0 -2 8
I -1 -3 -3 -3 -1 -3 -3 -4 -3 4
L -1 -2 -3 -4 -1 -2 -3 -4 -3 2 4
K -1 2 0 -1 -3 1 1 -2 -1 -3 -2 5
M -1 -1 -2 -3 -1 0 -2 -3 -2 1 2 -1 5
F -2 -3 -3 -3 -2 -3 -3 -3 -1 0 0 -3 0 6
P -1 -2 -2 -1 -3 -1 -1 -2 -2 -3 -3 -1 -2 -4 7
S 1 -1 1 0 -1 0 0 0 -1 -2 -2 0 -1 -2 -1 4
T 0 -1 0 -1 -1 -1 -1 -2 -2 -1 -1 -1 -1 -2 -1 1 5
W -3 -3 -4 -4 -2 -2 -3 -2 -2 -3 -2 -3 -1 1 -4 -3 -2 11
Y -2 -2 -2 -3 -2 -1 -2 -3 2 -1 -1 -2 -1 3 -3 -2 -2 2 7
V 0 -3 -3 -3 -1 -2 -2 -3 -3 3 1 -2 1 -1 -2 -2 0 -3 -1 4
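To show how the lower-triangular matrix above can be used in practice, the following minimal sketch loads it into a symmetric lookup table and scores an alignment that has already been made, using the gap opening penalty of 10 and gap extension penalty of 1 mentioned in the text. It does not search for the optimal alignment. The function names are illustrative, and the gap convention (the first gap position costs the opening penalty, each further position in the same gap costs the extension penalty) is an assumption, since the text does not specify one.

```python
ORDER = "ARNDCQEGHILKMFPSTWYV"

# Lower-triangular matrix copied from the table above (row letter, then scores).
MATRIX_TEXT = """\
A 4
R -1 5
N -2 0 6
D -2 -2 1 6
C 0 -3 -3 -3 9
Q -1 1 0 0 -3 5
E -1 0 0 2 -4 2 5
G 0 -2 0 -1 -3 -2 -2 6
H -2 0 1 -1 -3 0 0 -2 8
I -1 -3 -3 -3 -1 -3 -3 -4 -3 4
L -1 -2 -3 -4 -1 -2 -3 -4 -3 2 4
K -1 2 0 -1 -3 1 1 -2 -1 -3 -2 5
M -1 -1 -2 -3 -1 0 -2 -3 -2 1 2 -1 5
F -2 -3 -3 -3 -2 -3 -3 -3 -1 0 0 -3 0 6
P -1 -2 -2 -1 -3 -1 -1 -2 -2 -3 -3 -1 -2 -4 7
S 1 -1 1 0 -1 0 0 0 -1 -2 -2 0 -1 -2 -1 4
T 0 -1 0 -1 -1 -1 -1 -2 -2 -1 -1 -1 -1 -2 -1 1 5
W -3 -3 -4 -4 -2 -2 -3 -2 -2 -3 -2 -3 -1 1 -4 -3 -2 11
Y -2 -2 -2 -3 -2 -1 -2 -3 2 -1 -1 -2 -1 3 -3 -2 -2 2 7
V 0 -3 -3 -3 -1 -2 -2 -3 -3 3 1 -2 1 -1 -2 -2 0 -3 -1 4
"""


def load_matrix(text=MATRIX_TEXT, order=ORDER):
    """Build a symmetric {(residue, residue): score} table from the triangular text."""
    scores = {}
    for row_index, line in enumerate(text.strip().splitlines()):
        fields = line.split()
        row_letter, values = fields[0], fields[1:]
        if row_letter != order[row_index] or len(values) != row_index + 1:
            raise ValueError(f"unexpected matrix row: {line!r}")
        for col_letter, value in zip(order, values):
            scores[(row_letter, col_letter)] = int(value)
            scores[(col_letter, row_letter)] = int(value)
    return scores


def alignment_score(aligned_a, aligned_b, scores, gap_open=10, gap_extend=1):
    """Score two equal-length aligned sequences in which '-' marks a gap position."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    total = 0
    previous_gap = None  # which sequence carried a gap in the previous column
    for x, y in zip(aligned_a, aligned_b):
        if x == "-" or y == "-":
            current_gap = "a" if x == "-" else "b"
            total -= gap_extend if current_gap == previous_gap else gap_open
            previous_gap = current_gap
        else:
            total += scores[(x, y)]
            previous_gap = None
    return total


BLOSUM62 = load_matrix()
print(alignment_score("A--W", "ARNW", BLOSUM62))
```

An alignment program would search over all possible gap placements to maximise this score; the sketch only evaluates one given alignment.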
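Percent identity is calculated as:
$$\text{percent identity} = \frac{\text{total number of identical matches}}{\text{length of the longer sequence} + \text{number of gaps introduced into the longer sequence in order to align the two sequences}} \times 100$$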
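The percent identity formula above can be sketched for two sequences that have already been aligned. This is a minimal illustration, not the method of any particular alignment program: it assumes that '-' marks a gap, that "the number of gaps" means the number of gap positions (rather than gap events) in the longer sequence, and the function name is hypothetical.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity of two equal-length aligned sequences ('-' marks a gap position)."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    length_a = len(aligned_a) - aligned_a.count("-")  # ungapped length of sequence a
    length_b = len(aligned_b) - aligned_b.count("-")  # ungapped length of sequence b
    if length_a >= length_b:
        longer_length, gaps_in_longer = length_a, aligned_a.count("-")
    else:
        longer_length, gaps_in_longer = length_b, aligned_b.count("-")
    # Count columns where both sequences carry the same residue (gaps never match).
    matches = sum(1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-")
    return 100.0 * matches / (longer_length + gaps_in_longer)


# The longer sequence (6 residues) carries 1 gap position; 4 columns are identical.
print(percent_identity("ACG-TAA", "ACGCT--"))
```

Because the denominator uses the longer sequence plus its gaps, a short sequence that matches perfectly against part of a long one still scores well below 100%.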
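Substantially homologous polypeptides are characterised as having one or more amino acid substitutions, deletions or additions. These changes are preferably of a minor nature, that is conservative amino acid substitutions (as described herein) and other substitutions that do not significantly affect the folding or activity of the polypeptide; small deletions, typically of one to about 30 amino acids; and small amino- or carboxyl-terminal extensions, such as an amino-terminal methionine residue, a small linker peptide of up to about 20-25 residues, or an affinity tag.
In addition to the 20 standard amino acids, non-standard amino acids (such as 4-hydroxyproline, 6-N-methyl lysine, 2-aminoisobutyric acid, isovaline and α-methyl serine) may be substituted for amino acid residues of the polypeptides of the invention. A limited number of non-conservative amino acids, amino acids that are not encoded by the genetic code, and unnatural amino acids may be substituted for polypeptide amino acid residues. The polypeptides of the invention can also comprise non-naturally occurring amino acid residues.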
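Non-naturally occurring amino acids include, without limitation, trans-3-methylproline, 2,4-methano-proline, cis-4-hydroxyproline, trans-4-hydroxy-proline, N-methylglycine, allo-threonine, methyl-threonine, hydroxy-ethylcysteine, hydroxyethylhomo-cysteine, nitro-glutamine, homoglutamine, pipecolic acid, tert-leucine, norvaline, 2-azaphenylalanine, 3-azaphenyl-alanine, 4-azaphenyl-alanine and 4-fluorophenylalanine. Several methods are known in the art for incorporating non-naturally occurring amino acid residues into proteins. For example, an in vitro system can be employed in which nonsense mutations are suppressed using chemically aminoacylated suppressor tRNAs. Methods for synthesising amino acids and aminoacylating tRNA are known in the art. Transcription and translation of plasmids containing nonsense mutations is carried out in a cell-free system comprising an E. coli S30 extract and commercially available enzymes and other reagents. Proteins are purified by chromatography. See, for example, Robertson et al., J. Am. Chem. Soc. 113:2722, 1991; Ellman et al., Methods Enzymol. 202:301, 1991; Chung et al., Science 259:806-9, 1993; and Chung et al., Proc. Natl. Acad. Sci. USA 90:10145-9, 1993. In a second method, translation is carried out in Xenopus oocytes by microinjection of mutated mRNA and chemically aminoacylated suppressor tRNAs (Turcatti et al., J. Biol. Chem. 271:19991-8, 1996). Within a third method, E. coli cells are cultured in the absence of a natural amino acid that is to be replaced (e.g. phenylalanine) and in the presence of the desired non-naturally occurring amino acid(s) (e.g. 2-azaphenylalanine, 3-azaphenylalanine, 4-azaphenylalanine or 4-fluorophenylalanine). The non-naturally occurring amino acid is incorporated into the polypeptide in place of its natural counterpart. See Koide et al., Biochem. 33:7470-6, 1994. Naturally occurring amino acid residues can be converted to non-naturally occurring species by in vitro chemical modification. Chemical modification can be combined with site-directed mutagenesis to further expand the range of substitutions (Wynn and Richards, Protein Sci. 2:395-403, 1993).
A limited number of non-conservative amino acids, amino acids that are not encoded by the genetic code, non-naturally occurring amino acids and unnatural amino acids may be substituted for amino acid residues of the polypeptides of the invention.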
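Essential amino acids in the polypeptides of the invention can be identified according to procedures known in the art, such as site-directed mutagenesis or alanine-scanning mutagenesis (Cunningham and Wells, Science 244: 1081-5, 1989). Sites of biological interaction can also be determined by physical analysis of structure, as determined by techniques such as nuclear magnetic resonance, crystallography, electron diffraction or photoaffinity labelling, in conjunction with mutation of putative contact site amino acids. See, for example, de Vos et al., Science 255:306-12, 1992; Smith et al., J. Mol. Biol. 224:899-904, 1992; Wlodaver et al., FEBS Lett. 309:59-64, 1992. The identities of essential amino acids can also be inferred from analysis of homologies with related components (e.g. the translocation or protease components) of the polypeptides of the invention.
Multiple amino acid substitutions can be made and tested using known methods of mutagenesis and screening, such as those disclosed by Reidhaar-Olson and Sauer (Science 241:53-7, 1988) or Bowie and Sauer (Proc. Natl. Acad. Sci. USA 86:2152-6, 1989).
Briefly, these authors disclose methods for simultaneously randomising two or more positions in a polypeptide, selecting for functional polypeptide, and then sequencing the mutagenised polypeptides to determine the spectrum of allowable substitutions at each position. Other methods that can be used include phage display (e.g. Lowman et al., Biochem. 30:10832-7, 1991; Ladner et al., U.S. Patent No. 5,223,409; Huse, WIPO Publication WO 92/06204) and region-directed mutagenesis (Derbyshire et al., Gene 46:145, 1986; Ner et al., DNA 7:127, 1988).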
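EXAMPLES
The invention will now be described with reference to the following Examples. These do not limit the scope of the invention, and a person skilled in the art will appreciate that suitable equivalents may be used within the scope of the invention. Thus, the Examples may be considered component parts of the invention, and the individual aspects described herein may be considered as disclosed independently or in any combination.
Example 1 - Plasmid pGM691 construction
A comparison of the vector genome plasmid (pDNA1) pGM326 and the GagPol plasmid (pDNA2a) pGM297 was carried out. As shown in Figure 5a, there is significant homology between the partial gagpol nucleotide sequence of pGM326 and the non-codon-optimised gagpol sequence of pGM297.
A modified pDNA2a plasmid was designed to: (i) reduce the homology between the partial gagpol nucleotide sequence of pGM326 and the non-codon-optimised gagpol sequence of pGM297; (ii) codon-optimise the gagpol gene for increased gagpol protein expression; (iii) reduce the theoretical risk of generating replication-competent lentivirus (RCL) during manufacture or clinical use; and (iv) remove the dependence of gagpol expression on Rev. A comparison of pGM297 and the modified pDNA2a (pGM691) is shown in Figures 5b-5d, with the changes annotated.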
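pGM691 was generated by digesting pGM297 with the restriction enzymes XhoI, EcoRV and BglII to yield DNA fragments of 4583 bp, 3662 bp and 1641 bp. The 4583 bp fragment, containing the plasmid origin of replication and the CBA promoter intron, was purified and retained. Plasmid pGM693 was prepared by DNA synthesis at GeneArt/LifeTechnologies. pGM693 was designed by the inventors to contain a 4481 bp XhoI to BglII DNA fragment comprising the codon-optimised GagPol sequence ultimately found in pGM691. pGM693 was digested with XhoI and BglII to yield DNA fragments of 4481 bp, 1236 bp and 1048 bp. The 4481 bp fragment, containing the codon-optimised GagPol sequence, was purified and retained (see Figure 5e). The two retained DNA fragments were ligated with DNA ligase, and the resulting mixture of ligated DNA was transformed into E. coli Stbl3 cells; cells containing a plasmid capable of replication were selected by resistance to kanamycin. Well-isolated individual colonies of kanamycin-resistant, transformed Stbl3 cells were picked and expanded. DNA restriction analysis of the resulting clones identified multiple clones with the expected DNA structure; one was retained and named pGM691.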
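The fragment sizes reported above can be sanity-checked with a small model of a digest of a circular plasmid. In this minimal sketch the cut coordinates are hypothetical values chosen only so that they reproduce the reported fragment lengths; the actual restriction-site positions are not given in the text. The plasmid lengths are simply the sums of the reported fragments, and the function name is illustrative.

```python
def circular_fragments(plasmid_length, cut_positions):
    """Fragment lengths obtained by cutting a circular plasmid at the given positions."""
    cuts = sorted(set(cut_positions))
    if any(not 0 <= c < plasmid_length for c in cuts):
        raise ValueError("cut positions must lie within the plasmid")
    if not cuts:
        return [plasmid_length]  # an uncut circle stays as one molecule
    n = len(cuts)
    # Distance from each cut to the next one, wrapping around the circle.
    return [(cuts[(i + 1) % n] - cuts[i]) % plasmid_length or plasmid_length
            for i in range(n)]


# Hypothetical coordinates reproducing the reported digests.
pgm297 = circular_fragments(4583 + 3662 + 1641, [0, 4583, 8245])
pgm693 = circular_fragments(4481 + 1236 + 1048, [0, 4481, 5717])
print(sorted(pgm297, reverse=True), sorted(pgm693, reverse=True))

# Ligating the two retained fragments would be expected to give this size.
print("expected pGM691 size (bp):", 4583 + 4481)
```

If exactly the two retained fragments are joined, the resulting plasmid would be expected to be 9064 bp; this figure is derived here from the reported fragment sizes and is not stated in the text.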
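Example 2 - Production of rSIV.F/HN hCEF-CFTR vector
The vector genome pGM326, comprising the CFTR transgene under the transcriptional control of the hCEF promoter, was used in two design of experiments (DoE) studies to assess the production yield obtained using pGM297 GagPol or pGM691 coGagPol.
In each DoE study, a broad range of conditions was used, covering low, centre and high concentrations of each component: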
Function | Code | Low | Centre | High |
Genome | pGM326 | 0.2 | 1.1 | 2 |
(co)GagPol | pGM297 or pGM691 | 0.1 | 0.55 | 1 |
Rev | pGM299 | 0.1 | 0.55 | 1 |
F | pGM301 | 0.1 | 0.55 | 1 |
HN | pGM303 | 0.1 | 0.55 | 1 |
Transfection reagent | Lipofectamine 2000 | 4 | 7 | 10 |
Units are μL/mL for the transfection reagent and μg/mL for all other reagents.
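A three-level fractional factorial design was used, with duplicate vector stocks prepared for most conditions and six replicate centre points. In total, 31 vector stocks were prepared using identical conditions for pGM297 GagPol and pGM691 coGagPol.
The integrating transducing unit titre (TU/mL), determined by transducing 293T cells with dilutions of the vector stocks followed by detection of the ratio of vector-specific to genome-specific DNA sequences in the transduced cells by quantitative PCR, is shown in Figure 6a (replicate vector stocks are shown as dots; lines indicate identical conditions).
Following the DoE experiments, rSIV.F/HN vector stocks were prepared in triplicate using the vector genome pGM326, comprising the CFTR transgene under the transcriptional control of the hCEF promoter, and either pGM297 GagPol or pGM691 coGagPol, as indicated.
For all preparations, Rev, F and HN were provided by pGM299, pGM301 and pGM303, respectively. The DNA mass ratio of vector genome:GagPol:Rev:F:HN used was 20:9:6:6:6 in all cases. For conditions A and B, the total DNA levels used were 2.2 μg/mL and 1.8 μg/mL, respectively. For conditions A and B, the total Lipofectamine 2000 levels used were 7 μL/mL and 8 μL/mL, respectively.
The integrating transducing unit titre (TU/mL), determined by transducing 293T cells with dilutions of the vector stocks followed by detection of the ratio of vector-specific to genome-specific DNA sequences in the transduced cells by quantitative PCR, is plotted (individual vector stocks are shown as dots; lines indicate the group median).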
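The vector yield using coGagPol provided by pGM691 was observed to be ~2.3-fold higher under condition A and ~1.5-fold higher under condition B (Figure 6b). Thus, the use of pGM691 as pDNA2a markedly increased SIV viral titre, regardless of the other culture conditions used. This is surprising, as several independent published studies report that codon-optimisation of the gagpol gene is associated with a reduction in lentiviral titre.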
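A fold difference of the kind reported above can be derived from group medians, and the same number can be expressed as a percentage increase. The following minimal sketch illustrates the arithmetic; the titre values are made-up placeholders and are not data from the Examples, and the function name is hypothetical.

```python
from statistics import median


def fold_and_percent_increase(test_titres, reference_titres):
    """Return (fold change, percent increase) of the test group median over the reference group median."""
    fold = median(test_titres) / median(reference_titres)
    return fold, (fold - 1.0) * 100.0


# Placeholder triplicate titres in TU/mL (illustrative values only).
co_gagpol = [3.0e6, 3.5e6, 4.0e6]
wild_type = [1.4e6, 1.5e6, 1.6e6]
fold, percent = fold_and_percent_increase(co_gagpol, wild_type)
print(f"{fold:.2f}-fold, i.e. +{percent:.0f}%")
```

Note that a 1.5-fold titre corresponds to a 50% increase and a 2-fold titre to a 100% increase, which is how fold-based and percentage-based thresholds relate to one another.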
Example 3 - Production of rSIV.F/HN CMV-EGFP
To investigate whether the ability of codon-optimised gagpol to maintain or increase vector titre is limited to one particular rSIV.F/HN construct (rSIV.F/HN hCEF-CFTR), experiments were carried out using plasmids to produce a vector carrying a different transgene operably linked to a different promoter.
HEK293T, Freestyle 293F (Life Technologies, Paisley, UK) and 293T/17 cells (CRL-11268; ATCC, Manassas, VA) were maintained in Dulbecco's minimal Eagle's medium (Invitrogen, Carlsbad, CA) containing 10% fetal bovine serum and supplemented with penicillin (100 U/ml) and streptomycin (100 μg/ml), or in Freestyle™ 293 Expression Medium (Life Technologies).
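SeV-F/HN-pseudotyped SIV vectors were produced by transfecting HEK293T or 293T/17 cells, cultured in FreeStyle™ 293 Expression Medium, with a mixture of five plasmids complexed with PEIpro (Polyplus, Illkirch, France): pDNA1 (pGM311; comprising the EGFP transgene under the transcriptional control of the CMV promoter) encodes the lentiviral vector mRNA; pDNA2a (pGM691; Figure 2c) encodes the SIV Gag and Pol proteins; pDNA2b (pGM299; Figure 2d) encodes the SIV Rev protein; pDNA3a (pGM301; Figure 2e) encodes the Sendai virus-derived Fct4 protein [Kobayashi et al., 2003 J. Virol. 77:2607]; and pDNA3b (pGM303; Figure 2f) encodes the Sendai virus-derived SIVct+HN [Kobayashi et al., 2003 J. Virol. 77:2607]. The cell culture medium was supplemented with sodium butyrate at 12-24 hours post-transfection. Sodium butyrate stimulates vector production by inhibiting histone deacetylases, thereby increasing expression of the SIV and Sendai virus fusion protein components encoded by the five plasmids. The cell culture medium was supplemented with 5 units/mL Benzonase nuclease (Merck Millipore, Nottingham, UK) at 44-52 hours and/or 68-76 hours post-transfection. Culture supernatant containing the SIV vector was harvested at 68-76.5 hours post-transfection and clarified by filtration through a 0.45 μm membrane. The SIV vector was treated by digestion with TrypLE Select™. The SIV vector was then further purified and concentrated by anion-exchange chromatography and tangential flow filtration.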
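rSIV.F/HN vector stocks were prepared in triplicate using pGM297 GagPol or pGM691 coGagPol, as indicated. The DNA mass ratio of vector genome:GagPol:Rev:F:HN used was 20:9:6:6:6 in all cases.
The functional transducing unit titre (FTU/mL), determined by transducing 293T cells with dilutions of the vector stocks followed by detection of EGFP-positive cells by flow cytometry, is shown in Figure 7 (individual vector stocks are shown as dots; lines indicate the group median). As with the rSIV.F/HN hCEF-CFTR construct of Example 2, the rSIV.F/HN CMV-EGFP vector yield using coGagPol provided by pGM691 was observed to be ~1.6-fold higher than that obtained using the non-codon-optimised gagpol of pGM297. This suggests that the ability of codon-optimised gagpol to maintain or increase vector titre is not limited to the particular rSIV.F/HN hCEF-CFTR construct, but is instead a feature generally associated with the use of coGagPol.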
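One common way of turning a flow cytometry read-out into a functional titre is sketched below. The text does not give the calculation actually used, so the formula, the parameter names and the example numbers are all assumptions made for illustration only.

```python
def functional_titre_ftu_per_ml(fraction_positive, cells_at_transduction,
                                inoculum_volume_ml, dilution_factor):
    """Estimate functional transducing units per mL of undiluted vector stock.

    Assumed formula (not taken from the text):
        FTU/mL = fraction positive x cells at transduction x dilution factor / inoculum volume
    """
    if not 0.0 <= fraction_positive <= 1.0:
        raise ValueError("fraction_positive must be between 0 and 1")
    if inoculum_volume_ml <= 0:
        raise ValueError("inoculum_volume_ml must be positive")
    return (fraction_positive * cells_at_transduction * dilution_factor
            / inoculum_volume_ml)


# Illustrative numbers only: 10% EGFP-positive cells, 1e5 cells at transduction,
# 0.5 mL of a 1:100 dilution of the vector stock.
print(f"{functional_titre_ftu_per_ml(0.10, 1e5, 0.5, 100):.2e} FTU/mL")
```

Estimates of this kind are usually taken from dilutions at which only a small fraction of cells is positive, since at higher fractions a single positive cell is more likely to have been transduced more than once.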
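Example 3 - Reducing the number of intact SIV ORFs in the vector genome plasmid
Further modifications to one or more of the constituent plasmids may further improve the safety of the final vector product, providing additional clinical benefit.
The inventors reviewed the sequences of the constituent plasmids and identified several regions of interest within the vector genome plasmid pGM326. In particular, the partial Gag RRE cPPT hCEF region of pGM326 contains: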
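·77 start codons (ATG);
·32 ORFs of 10 or more amino acids in length;
·2 large ORFs in the 5' to 3' direction:
o 189 amino acids from the most 5' ATG in the vector genome, encoding p17 matrix and part of p24 capsid (Gag/RRE fusion)
o 250 amino acids from an ATG inside the RRE (RRE/cPPT/hCEF fusion)
This is illustrated in Figure 8. The two large ORFs (see Figure 9) were of particular interest.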
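The kind of sequence survey summarised above (counting start codons and ORFs above a length threshold) can be sketched as follows. This is a minimal illustration under stated assumptions, not the inventors' analysis: an ORF is taken to run from an ATG to the first in-frame stop codon, only the forward (5' to 3') strand is scanned, ORFs that run off the end of the sequence are ignored, and the function names are hypothetical.

```python
import re

STOP_CODONS = {"TAA", "TAG", "TGA"}


def count_start_codons(seq):
    """Count every ATG in the forward strand, in any reading frame."""
    return len(re.findall(r"(?=ATG)", seq.upper()))


def find_orfs(seq, min_aa=10):
    """Return (start, end, length_in_aa) for forward-strand ORFs of at least min_aa residues.

    start and end are 0-based, end-exclusive, and include the stop codon;
    the amino acid length counts the initiating methionine but not the stop.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i                      # first ATG opens an ORF in this frame
            elif start is not None and codon in STOP_CODONS:
                length_aa = (i - start) // 3
                if length_aa >= min_aa:
                    orfs.append((start, i + 3, length_aa))
                start = None                   # look for the next ORF after this stop
    return orfs


# Toy sequence: ATG followed by ten alanine codons and a stop (11 amino acids).
toy = "ATG" + "GCT" * 10 + "TAA"
print(count_start_codons(toy), find_orfs(toy))

# Changing the start codon (for example ATG to TTG) removes the ORF.
print(find_orfs("TTG" + toy[3:]))
```

Run against a vector genome region, a scan of this type gives the number of ATGs and a list of candidate ORFs whose start codons could then be considered for removal.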
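Accordingly, the inventors designed a modified version of the pGM326 plasmid with a combination of further modifications to reduce the number of intact SIV ORFs (and in particular to remove these two large ORFs) for improved safety. The modifications were made in the two large ORFs upstream of the hCEF promoter and the CFTR transgene (soCFTR2). The changes are as follows:
·Removal of six ATGs (3 x ATG to ATTG, 1 x ATG to TTG, 2 x ATG to AAG)
·Insertion of one stop (TCC to TAAA)
·One restriction site changed between the partial Gag and the RRE (EcoRI GAATTC to GCCTGCAGG SbfI)
The resulting vector genome plasmid is pGM830, which has the sequence of SEQ ID NO: 4, as shown in Figure 2b.
A comparison of vector titres using either the pGM326 or the pGM830 vector genome plasmid in the same production protocol demonstrated that the use of pGM830 provides titres comparable to those of pGM326 using both HEK293T and A549 cells (see Figure 10), indicating that an improved safety profile can be achieved without adversely affecting titre.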
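Example 4 - The combination of coGagPol and a modified vector genome plasmid maintains or even increases vector titre
The experiments reported in Example 2 demonstrated that, surprisingly, rather than the expected reduction in yield, production of SIV.F/HN hCEF-CFTR using coGagPol tends to maintain or even increase vector titre. The experiments reported in Example 3 demonstrated that further improvements to the safety profile of the vector can be achieved by modifying the vector genome plasmid, without adversely affecting vector titre.
Further experiments combining the use of the pGM830 vector genome plasmid with the use of coGagPol were then carried out to investigate whether these two safety-related modifications could be combined while maintaining vector titre.
As illustrated in Figure 11, the inventors surprisingly found not only that the use of coGagPol can be combined with the use of the modified vector genome plasmid (pGM830), but also that this combination provides an observable trend towards increased vector titre.
This suggests not only that vectors with a further improved safety profile can be obtained by combining the use of a modified vector genome plasmid with the use of coGagPol, but also that, surprisingly, this can be achieved while maintaining or even increasing rSIV.F/HN hCEF-transgene titre.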
Sequence Information
Key to Sequences
SEQ ID NO: 1 Codon-optimised SIV gag-pol nucleic acid sequence
SEQ ID NO: 2 Wild-type SIV gag-pol nucleic acid sequence
SEQ ID NO: 3 Plasmid as defined in Figure 2a (pDNA1 pGM326)
SEQ ID NO: 4 Plasmid as defined in Figure 2b (pDNA1 pGM830)
SEQ ID NO: 5 Plasmid as defined in Figure 2c (pDNA2a pGM691)
SEQ ID NO: 6 Plasmid as defined in Figure 2d (pDNA2b pGM299)
SEQ ID NO: 7 Plasmid as defined in Figure 2e (pDNA3a pGM301)
SEQ ID NO: 8 Plasmid as defined in Figure 2f (pDNA3b pGM303)
SEQ ID NO: 9 Plasmid as defined in Figure 2g (pDNA2a pGM297)
SEQ ID NO: 10 Exemplified hCEF promoter
SEQ ID NO: 11 Exemplified CMV promoter
SEQ ID NO: 12 Exemplified EF1a promoter
SEQ ID NO: 13 Exemplified CFTR transgene (soCFTR2)
SEQ ID NO: 14 Exemplified A1AT transgene
SEQ ID NO: 15 Complementary strand to the exemplified A1AT transgene
SEQ ID NO: 16 Exemplified A1AT polypeptide
SEQ ID NO: 17 Exemplified FVIII transgene (N6)
SEQ ID NO: 18 Exemplified FVIII transgene (V3)
SEQ ID NO: 19 Complementary strand to the exemplified FVIII transgene (N6)
SEQ ID NO: 20 Complementary strand to the exemplified FVIII transgene (V3)
SEQ ID NO: 21 Exemplified FVIII polypeptide (N6)
SEQ ID NO: 22 Exemplified FVIII polypeptide (V3)
SEQ ID NO: 23 Exemplified WPRE component (mWPRE)
SEQ ID NO: 24 F/HN-SIV-hCEF-soA1AT plasmid as defined in Figure 3 (pDNA1 pGM407)
SEQ ID NO: 25 F/HN-SIV-CMV-HFVIII-V3 plasmid as defined in Figure 4a (pDNA1 pGM411)
SEQ ID NO: 26 F/HN-SIV-hCEF-HFVIII-V3 plasmid as defined in Figure 4b (pDNA1 pGM413)
SEQ ID NO: 27 F/HN-SIV-CMV-HFVIII-N6-co plasmid as defined in Figure 4c (pDNA1 pGM412)
SEQ ID NO: 28 F/HN-SIV-hCEF-HFVIII-N6-co plasmid as defined in Figure 4d (pDNA1 pGM414)
SEQ ID NO: 29 Exemplified CAG promoter
Sequences
SEQ ID NO: 1 Codon-optimised SIV gag-pol nucleic acid sequence (from pGM691)
Length: 4391; Molecule type: DNA; Features Location/Qualifiers: source, 1..4391; mol_type, other DNA; codon-optimised SIV gag-pol nucleic acid sequence (from pGM691); organism, synthetic construct
ATGGGAGCTGCCACATCTGCCCTGAATAGACGGCAGCTGGACCAGTTCGAGAAGATCAGACTGCGGCCCAACGGCAAGAAGAAGTACCAGATCAAGCACCTGATCTGGGCCGGCAAAGAGATGGAAAGATTCGGCCTGCACGAGCGGCTGCTGGAAACCGAGGAAGGCTGCAAGAGAATTATCGAGGTGCTGTACCCTCTGGAACCTACCGGCTCTGAGGGCCTGAAGTCCCTGTTCAATCTCGTGTGCGTGCTGTACTGCCTGCACAAAGAACAGAAAGTGAAGGACACCGAAGAGGCCGTGGCCACAGTTAGACAGCACTGCCACCTGGTGGAAAAAGAGAAGTCCGCCACAGAGACAAGCAGCGGCCAGAAGAAGAACGACAAGGGAATTGCTGCCCCTCCTGGCGGCAGCCAGAATTTTCCTGCTCAGCAGCAGGGAAACGCCTGGGTGCACGTTCCACTGAGCCCTAGAACACTGAATGCCTGGGTCAAAGCCGTGGAAGAGAAGAAGTTTGGCGCCGAGATCGTGCCCATGTTCCAGGCTCTGTCTGAGGGCTGCACCCCTTACGACATCAACCAGATGCTGAACGTGCTGGGAGATCACCAGGGCGCTCTGCAGATCGTGAAAGAGATCATCAACGAAGAGGCTGCCCAGTGGGACGTGACACATCCATTGCCTGCTGGACCTCTGCCAGCCGGACAACTGAGAGATCCTAGAGGCTCTGATATCGCCGGCACCACCAGCTCTGTGCAAGAGCAGCTGGAATGGATCTACACCGCCAATCCTAGAGTGGACGTGGGCGCCATCTACAGAAGATGGATCATCCTGGGCCTGCAGAAATGCGTGAAGATGTACAACCCCGTGTCCGTGCTGGACATCAGACAGGGACCCAAAGAGCCCTTCAAGGACTACGTGGACCGGTTCTATAAGGCCATTAGAGCCGAGCAGGCCAGCGGCGAAGTGAAGCAGTGGATGACAGAGAGCCTGCTGATCCAGAACGCCAATCCAGACTGCAAAGTGATCCTGAAAGGCCTGGGCATGCACCCCACACTGGAAGAGATGCTGACAGCCTGTCAAGGCGTTGGCGGCCCTTCTTACAAAGCCAAAGTGATGGCCGAGATGATGCAGACCATGCAGAACCAGAACATGGTGCAGCAAGGCGGCCCTAAGAGACAGAGGCCTCCTCTGAGATGCTACAACTGCGGCAAGTTCGGCCACATGCAGAGACAGTGTCCTGAGCCTAGGAAAACAAAATGTCTAAAGTGTGGAAAATTGGGACACCTAGCAAAAGACTGCAGGGGACAGGTGAATTTTTTAGGGTATGGACGGTGGATGGGGGCAAAACCGAGAAATTTTCCCGCCGCTACTCTTGGAGCGGAACCGAGTGCGCCTCCTCCACCGAGCGGCACCACCCCATACGACCCAGCAAAGAAGCTCCTGCAGCAATATGCAGAGAAAGGGAAACAACTGAGGGAGCAAAAGAGGAATCCACCGGCAATGAATCCGGATTGGACCGAGGGATATTCTTTGAACTCCCTCTTTGGAGAAGACCAATAAAGACCGTGTACATCGAGGGCGTGCCCATCAAGGCTCTGCTGGATACAGGCGCCGACGACACCATCATCAAAGAGAACGACCTGCAGCTGAGCGGCCCTTGGAGGCCTAAGATCATTGGAGGAATCGGCGGAGGCCTGAACGTCAAAGAGTACAACGACCGGGAAGTGAAGATCGAGGACAAGATCCTGAGGGGCACAATCCTGCTGGGCGCCACACCTATCAACATCATCGGCAGAAATCTGCTGGCCCCTGCCGGCGCTAGACTGGTTATGGGACAGCTCTCTGAGAAGATCCCCGTGACACCCGTGAAGCTGAAAGAAGGCGCTAGAGGACCTTGTGTGCGACAGTGGCCTCTGAGCAAAGAGAAGATTGAGGCCCTGCAAGAAATCTGTAGCCAGCTGGAACAAGAGGGCAAGATCAGCAGAGTTGGCGGCGAGAAC
GCCTACAATACCCCTATCTTCTGCATCAAGAAAAAGGACAAGAGCCAGTGGCGGATGCTGGTGGACTTTAGAGAGCTGAACAAGGCTACCCAGGACTTCTTCGAGGTGCAGCTGGGAATTCCTCATCCTGCCGGCCTGCGGAAGATGAGACAGATCACAGTGCTGGATGTGGGCGACGCCTACTACAGCATCCCTCTGGACCCCAACTTCAGAAAGTACACCGCCTTCACAATCCCCACCGTGAACAATCAAGGCCCTGGCATCAGATACCAGTTCAACTGCCTGCCTCAAGGCTGGAAGGGCAGCCCCACCATTTTTCAGAATACCGCCGCCAGCATCCTGGAAGAAATCAAGAGAAACCTGCCTGCTCTGACCATCGTGCAGTACATGGACGATCTGTGGGTCGGAAGCCAAGAGAATGAGCACACCCACGACAAGCTGGTGGAACAGCTGAGAACAAAGCTGCAGGCCTGGGGCCTCGAAACCCCTGAGAAGAAGGTGCAGAAAGAACCTCCTTACGAGTGGATGGGCTACAAGCTGTGGCCTCACAAGTGGGAGCTGAGCCGGATTCAGCTCGAAGAGAAGGACGAGTGGACCGTGAACGACATCCAGAAACTCGTGGGCAAGCTGAATTGGGCAGCCCAGCTGTATCCCGGCCTGAGGACCAAGAACATCTGCAAGCTGATCCGGGGAAAGAAGAACCTGCTGGAACTGGTCACATGGACACCTGAGGCCGAGGCCGAATATGCCGAGAATGCCGAAATCCTGAAAACCGAGCAAGAGGGGACCTACTACAAGCCTGGCATTCCAATCAGAGCTGCCGTGCAGAAACTGGAAGGCGGCCAGTGGTCCTACCAGTTTAAGCAAGAAGGCCAGGTCCTGAAAGTGGGCAAGTACACCAAGCAGAAGAACACCCACACCAACGAGCTGAGGACACTGGCTGGCCTGGTCCAGAAAATCTGCAAAGAGGCCCTGGTCATTTGGGGCATCCTGCCTGTTCTGGAACTGCCCATTGAGCGGGAAGTGTGGGAACAGTGGTGGGCCGATTACTGGCAAGTGTCTTGGATCCCCGAGTGGGACTTCGTGTCTACCCCTCCTCTGCTGAAACTGTGGTACACCCTGACAAAAGAGCCCATTCCTAAAGAGGACGTCTACTACGTTGACGGCGCCTGCAACCGGAACTCCAAAGAAGGCAAGGCCGGCTACATCAGCCAGTACGGCAAGCAGAGAGTGGAAACCCTGGAAAACACCACCAACCAGCAGGCCGAGCTGACCGCCATTAAGATGGCCCTGGAAGATAGCGGCCCCAATGTGAACATCGTGACCGACTCTCAGTACGCCATGGGAATCCTGACAGCCCAGCCTACACAGAGCGATAGCCCTCTGGTTGAGCAGATCATTGCCCTGATGATTCAGAAGCAGCAAATCTACCTGCAGTGGGTGCCCGCTCACAAAGGCATCGGCGGAAACGAAGAGATCGATAAGCTGGTGTCCAAGGGAATCAGACGGGTGCTGTTCCTGGAAAAGATTGAAGAGGCCCAAGAGGAACACGAGCGCTACCACAACAACTGGAAGAATCTGGCCGACACCTACGGACTGCCCCAGATCGTGGCCAAAGAAATCGTGGCTATGTGCCCCAAGTGTCAGATCAAGGGCGAACCTGTGCACGGCCAAGTGGATGCTTCTCCTGGCACATGGCAGATGGACTGTACCCACCTGGAAGGCAAAGTGGTCATCGTGGCTGTGCACGTGGCCTCCGGCTTTATTGAGGCCGAAGTGATCCCCAGAGAGACAGGCAAAGAAACCGCCAAGTTCCTGCTGAAGATCCTGTCCAGATGGCCCATCACACAGCTGCACACCGACAACGGCCCTAACTTCACATCTCAAGAGGTGGCCGCCATCTGTTGGTGGGGAAAGATTGAGCACACAACCGGCATTCCCTACAATCCACAGAGCCAGGGCAGCATCGAGTCCATGAACAAGCAGCTCAAAGAGATTAT
CGGCAAGATCCGGGACGACTGCCAGTACACAGAAACAGCCGTGCTGATGGCCTGTCACATCCACAACTTCAAGCGGAAAGGCGGCATCGGAGGACAGACATCTGCCGAGAGACTGATCAATATCATCACCACTCAGCTGGAAATCCAGCACCTCCAGACCAAGATCCAGAAGATTCTGAACTTCCGGGTGTACTACCGCGAGGGCAGAGATCCTGTTTGGAAAGGCCCAGCACAGCTGATCTGGAAAGGCGAAGGTGCCGTGGTGCTGAAGGATGGCTCTGATCTGAAGGTGGTGCCCAGACGGAAGGCCAAGATTATCAAGGATTACGAGCCCAAACAGCGCGTGGGCAATGAAGGCGACGTTGAGGGCACAAGAGGCAGCGACAATTGA
SEQ ID NO: 2 Wild-type SIV gag-pol nucleic acid sequence (from pGM297)
Length: 4391; Molecule type: DNA; Feature location/qualifiers: source, 1..4391; mol_type, unassigned DNA; organism, Simian immunodeficiency virus
ATGGGGGCGGCTACCTCAGCACTAAATAGGAGACAATTAGACCAATTTGAGAAAATACGACTTCGCCCGAACGGAAAGAAAAAGTACCAAATTAAACATTTAATATGGGCAGGCAAGGAGATGGAGCGCTTCGGCCTCCATGAGAGGTTGTTGGAGACAGAGGAGGGGTGTAAAAGAATCATAGAAGTCCTCTACCCCCTAGAACCAACAGGATCGGAGGGCTTAAAAAGTCTGTTCAATCTTGTGTGCGTACTATATTGCTTGCACAAGGAACAGAAAGTGAAAGACACAGAGGAAGCAGTAGCAACAGTAAGACAACACTGCCATCTAGTGGAAAAAGAAAAAAGTGCAACAGAGACATCTAGTGGACAAAAGAAAAATGACAAGGGAATAGCAGCGCCACCTGGTGGCAGTCAGAATTTTCCAGCGCAACAACAAGGAAATGCCTGGGTACATGTACCCTTGTCACCGCGCACCTTAAATGCGTGGGTAAAAGCAGTAGAGGAGAAAAAATTTGGAGCAGAAATAGTACCCATGTTTCAAGCCCTATCAGAAGGCTGCACACCCTATGACATTAATCAGATGCTTAATGTGCTAGGAGATCATCAAGGGGCATTACAAATAGTGAAAGAGATCATTAATGAAGAAGCAGCCCAGTGGGATGTAACACACCCACTACCCGCAGGACCCCTACCAGCAGGACAGCTCAGGGACCCTCGCGGCTCAGATATAGCAGGGACCACCAGCTCAGTACAAGAACAGTTAGAATGGATCTATACTGCTAACCCCCGGGTAGATGTAGGTGCCATCTACCGGAGATGGATTATTCTAGGACTTCAAAAGTGTGTCAAAATGTACAACCCAGTATCAGTCCTAGACATTAGGCAGGGACCTAAAGAGCCCTTCAAGGATTATGTGGACAGATTTTACAAGGCAATTAGAGCAGAACAAGCCTCAGGGGAAGTGAAACAATGGATGACAGAATCATTACTCATTCAAAATGCTAATCCAGATTGTAAGGTCATCCTGAAGGGCCTAGGAATGCACCCCACCCTTGAAGAAATGTTAACGGCTTGTCAGGGGGTAGGAGGCCCAAGCTACAAAGCAAAAGTAATGGCAGAAATGATGCAGACCATGCAAAATCAAAACATGGTGCAGCAGGGAGGTCCAAAAAGACAAAGACCCCCACTAAGATGTTATAATTGTGGAAAATTTGGCCATATGCAAAGACAATGTCCGGAACCAAGGAAAACAAAATGTCTAAAGTGTGGAAAATTGGGACACCTAGCAAAAGACTGCAGGGGACAGGTGAATTTTTTAGGGTATGGACGGTGGATGGGGGCAAAACCGAGAAATTTTCCCGCCGCTACTCTTGGAGCGGAACCGAGTGCGCCTCCTCCACCGAGCGGCACCACCCCATACGACCCAGCAAAGAAGCTCCTGCAGCAATATGCAGAGAAAGGGAAACAACTGAGGGAGCAAAAGAGGAATCCACCGGCAATGAATCCGGATTGGACCGAGGGATATTCTTTGAACTCCCTCTTTGGAGAAGACCAATAAAGACAGTGTATATAGAAGGGGTCCCCATTAAGGCACTGCTAGACACAGGGGCAGATGACACCATAATTAAAGAAAATGATTTACAATTATCAGGTCCATGGAGACCCAAAATTATAGGGGGCATAGGAGGAGGCCTTAATGTAAAAGAATATAACGACAGGGAAGTAAAAATAGAAGATAAAATTTTGAGAGGAACAATATTGTTAGGAGCAACTCCCATTAATATAATAGGTAGAAATTTGCTGGCCCCGGCAGGTGCCCGGTTAGTAATGGGACAATTATCAGAAAAAATTCCTGTCACACCTGTCAAATTGAAGGAAGGGGCTCGGGGACCCTGTGTAAGACAATGGCCTCTCTCTAAAGAGAAGATTGAAGCTTTACAGGAAATATGTTCCCAATTAGAGCAGGAAGGAAAAATCAGTAGAGTAGGAGGAGAAAAT
GCATACAATACCCCAATATTTTGCATAAAGAAGAAGGACAAATCCCAGTGGAGGATGCTAGTAGACTTTAGAGAGTTAAATAAGGCAACCCAAGATTTCTTTGAAGTGCAATTAGGGATACCCCACCCAGCAGGATTAAGAAAGATGAGACAGATAACAGTTTTAGATGTAGGAGACGCCTATTATTCCATACCATTGGATCCAAATTTTAGGAAATATACTGCTTTTACTATTCCCACAGTGAATAATCAGGGACCCGGGATTAGGTATCAATTCAACTGTCTCCCGCAAGGGTGGAAAGGATCTCCTACAATCTTCCAAAATACAGCAGCATCCATTTTGGAGGAGATAAAAAGAAACTTGCCAGCACTAACCATTGTACAATACATGGATGATTTATGGGTAGGTTCTCAAGAAAATGAACACACCCATGACAAATTAGTAGAACAGTTAAGAACAAAATTACAAGCCTGGGGCTTAGAAACCCCAGAAAAGAAGGTGCAAAAAGAACCACCTTATGAGTGGATGGGATACAAACTTTGGCCTCACAAATGGGAACTAAGCAGAATACAACTGGAGGAAAAAGATGAATGGACTGTCAATGACATCCAGAAGTTAGTTGGGAAACTAAATTGGGCAGCACAATTGTATCCAGGTCTTAGGACCAAGAATATATGCAAGTTAATTAGAGGAAAGAAAAATCTGTTAGAGCTAGTGACTTGGACACCTGAGGCAGAAGCTGAATATGCAGAAAATGCAGAGATTCTTAAAACAGAACAGGAAGGAACCTATTACAAACCAGGAATACCTATTAGGGCAGCAGTACAGAAATTGGAAGGAGGACAGTGGAGTTACCAATTCAAACAAGAAGGACAAGTCTTGAAAGTAGGAAAATACACCAAGCAAAAGAACACCCATACAAATGAACTTCGCACATTAGCTGGTTTAGTGCAGAAGATTTGCAAAGAAGCTCTAGTTATTTGGGGGATATTACCAGTTCTAGAACTCCCGATAGAAAGAGAGGTATGGGAACAATGGTGGGCGGATTACTGGCAGGTAAGCTGGATTCCCGAATGGGATTTTGTCAGCACCCCACCTTTGCTCAAACTATGGTACACATTAACAAAAGAACCCATACCCAAGGAGGACGTTTACTATGTAGATGGAGCATGCAACAGAAATTCAAAAGAAGGAAAAGCAGGATACATCTCACAATACGGAAAACAGAGAGTAGAAACATTAGAAAACACTACCAATCAGCAAGCAGAATTAACAGCTATAAAAATGGCTTTGGAAGACAGTGGGCCTAATGTGAACATAGTAACAGACTCTCAATATGCAATGGGAATTTTGACAGCACAACCCACACAAAGTGATTCACCATTAGTAGAGCAAATTATAGCCTTAATGATACAAAAGCAACAAATATATTTGCAGTGGGTACCAGCACATAAAGGAATAGGAGGAAATGAGGAGATAGATAAATTAGTGAGTAAAGGCATTAGAAGAGTTTTATTCTTAGAAAAAATAGAAGAAGCTCAAGAAGAGCATGAAAGATATCATAATAATTGGAAAAACCTAGCAGATACATATGGGCTTCCACAAATAGTAGCAAAAGAGATAGTGGCCATGTGTCCAAAATGTCAGATAAAGGGAGAACCAGTGCATGGACAAGTGGATGCCTCACCTGGAACATGGCAGATGGATTGTACTCATCTAGAAGGAAAAGTAGTCATAGTTGCGGTCCATGTAGCCAGTGGATTCATAGAAGCAGAAGTCATACCTAGGGAAACAGGAAAAGAAACGGCAAAGTTTCTATTAAAAATACTGAGTAGATGGCCTATAACACAGTTACACACAGACAATGGGCCTAACTTTACCTCCCAAGAAGTGGCAGCAATATGTTGGTGGGGAAAAATTGAACATACAACAGGTATACCATATAACCCCCAATCTCAAGGATCAATAGAAAGCATGAACAAACAATTAAAAGAGATAAT
TGGGAAAATAAGAGATGATTGCCAATATACAGAGACAGCAGTACTGATGGCTTGCCATATTCACAATTTTAAAAGAAAGGGAGGAATAGGGGGACAGACTTCAGCAGAGAGACTAATTAATATAATAACAACACAATTAGAAATACAACATTTACAAACCAAAATTCAAAAAATTTTAAATTTTAGAGTCTACTACAGAGAAGGGAGAGACCCTGTGTGGAAAGGACCAGCACAATTAATCTGGAAAGGGGAAGGAGCAGTGGTCCTCAAGGACGGAAGTGACCTAAAGGTTGTACCAAGAAGGAAAGCTAAAATTATTAAGGATTATGAACCCAAACAAAGAGTGGGTAATGAGGGTGACGTGGAAGGTACCAGGGGATCTGATAACTAA
SEQ ID NO: 3 Plasmid as defined in Figure 2A (pDNA1 pGM326)
Length: 10528; Molecule type: DNA; Feature location/qualifiers: source, 1..10528; mol_type, other DNA; note, pGM326; organism, synthetic construct
GGTACCTCAATATTGGCCATTAGCCATATTATTCATTGGTTATATAGCATAAATCAATATTGGCTATTGGCCATTGCATACGTTGTATCTATATCATAATATGTACATTTATATTGGCTCATGTCCAATATGACCGCCATGTTGGCATTGATTATTGACTAGTTATTAATAGTAATCAATTACGGGGTCATTAGTTCATAGCCCATATATGGAGTTCCGCGTTACATAACTTACGGTAAATGGCCCGCCTGGCTGACCGCCCAACGACCCCCGCCCATTGACGTCAATAATGACGTATGTTCCCATAGTAACGCCAATAGGGACTTTCCATTGACGTCAATGGGTGGAGTATTTACGGTAAACTGCCCACTTGGCAGTACATCAAGTGTATCATATGCCAAGTCCGCCCCCTATTGACGTCAATGACGGTAAATGGCCCGCCTGGCATTATGCCCAGTACATGACCTTACGGGACTTTCCTACTTGGCAGTACATCTACGTATTAGTCATCGCTATTACCATGGTGATGCGGTTTTGGCAGTACACCAATGGGCGTGGATAGCGGTTTGACTCACGGGGATTTCCAAGTCTCCACCCCATTGACGTCAATGGGAGTTTGTTTTGGCACCAAAATCAACGGGACTTTCCAAAATGTCGTAACAACTGCGATCGCCCGCCCCGTTGACGCAAATGGGCGGTAGGCGTGTACGGTGGGAGGTCTATATAAGCAGAGCTCGCTGGCTTGTAACTCAGTCTCTTACTAGGAGACCAGCTTGAGCCTGGGTGTTCGCTGGTTAGCCTAACCTGGTTGGCCACCAGGGGTAAGGACTCCTTGGCTTAGAAAGCTAATAAACTTGCCTGCATTAGAGCTTATCTGAGTCAAGTGTCCTCATTGACGCCTCACTCTCTTGAACGGGAATCTTCCTTACTGGGTTCTCTCTCTGACCCAGGCGAGAGAAACTCCAGCAGTGGCGCCCGAACAGGGACTTGAGTGAGAGTGTAGGCACGTACAGCTGAGAAGGCGTCGGACGCGAAGGAAGCGCGGGGTGCGACGCGACCAAGAAGGAGACTTGGTGAGTAGGCTTCTCGAGTGCCGGGAAAAAGCTCGAGCCTAGTTAGAGGACTAGGAGAGGCCGTAGCCGTAACTACTCTGGGCAAGTAGGGCAGGCGGTGGGTACGCAATGGGGGCGGCTACCTCAGCACTAAATAGGAGACAATTAGACCAATTTGAGAAAATACGACTTCGCCCGAACGGAAAGAAAAAGTACCAAATTAAACATTTAATATGGGCAGGCAAGGAGATGGAGCGCTTCGGCCTCCATGAGAGGTTGTTGGAGACAGAGGAGGGGTGTAAAAGAATCATAGAAGTCCTCTACCCCCTAGAACCAACAGGATCGGAGGGCTTAAAAAGTCTGTTCAATCTTGTGTGCGTGCTATATTGCTTGCACAAGGAACAGAAAGTGAAAGACACAGAGGAAGCAGTAGCAACAGTAAGACAACACTGCCATCTAGTGGAAAAAGAAAAAAGTGCAACAGAGACATCTAGTGGACAAAAGAAAAATGACAAGGGAATAGCAGCGCCACCTGGTGGCAGTCAGAATTTTCCAGCGCAACAACAAGGAAATGCCTGGGTACATGTACCCTTGTCACCGCGCACCTTAAATGCGTGGGTAAAAGCAGTAGAGGAGAAAAAATTTGGAGCAGAAATAGTACCCATGTTTCAAGCCCTATCGAATTCCCGTTTGTGCTAGGGTTCTTAGGCTTCTTGGGGGCTGCTGGAACTGCAATGGGAGCAGCGGCGACAGCCCTGACGGTCCAGTCTCAGCATTTGCTTGCTGGGATACTGCAGCAGCAGAAGAATCTGCTGGCGGCTGTGGAGGCTCAACAGCAGATGTTGAAGCTGACCATTTGGGGTGTTAAAAACCTCAATGCCCGCGTCACAGCCCTTGAGAAGTACCTAGAGGATCAGGCACGACTAAACTCCTGGGGGTGCGCATGGA
AACAAGTATGTCATACCACAGTGGAGTGGCCCTGGACAAATCGGACTCCGGATTGGCAAAATATGACTTGGTTGGAGTGGGAAAGACAAATAGCTGATTTGGAAAGCAACATTACGAGACAATTAGTGAAGGCTAGAGAACAAGAGGAAAAGAATCTAGATGCCTATCAGAAGTTAACTAGTTGGTCAGATTTCTGGTCTTGGTTCGATTTCTCAAAATGGCTTAACATTTTAAAAATGGGATTTTTAGTAATAGTAGGAATAATAGGGTTAAGATTACTTTACACAGTATATGGATGTATAGTGAGGGTTAGGCAGGGATATGTTCCTCTATCTCCACAGATCCATATCCGCGGCAATTTTAAAAGAAAGGGAGGAATAGGGGGACAGACTTCAGCAGAGAGACTAATTAATATAATAACAACACAATTAGAAATACAACATTTACAAACCAAAATTCAAAAAATTTTAAATTTTAGAGCCGCGGAGATCTGTTACATAACTTATGGTAAATGGCCTGCCTGGCTGACTGCCCAATGACCCCTGCCCAATGATGTCAATAATGATGTATGTTCCCATGTAATGCCAATAGGGACTTTCCATTGATGTCAATGGGTGGAGTATTTATGGTAACTGCCCACTTGGCAGTACATCAAGTGTATCATATGCCAAGTATGCCCCCTATTGATGTCAATGATGGTAAATGGCCTGCCTGGCATTATGCCCAGTACATGACCTTATGGGACTTTCCTACTTGGCAGTACATCTATGTATTAGTCATTGCTATTACCATGGGAATTCACTAGTGGAGAAGAGCATGCTTGAGGGCTGAGTGCCCCTCAGTGGGCAGAGAGCACATGGCCCACAGTCCCTGAGAAGTTGGGGGGAGGGGTGGGCAATTGAACTGGTGCCTAGAGAAGGTGGGGCTTGGGTAAACTGGGAAAGTGATGTGGTGTACTGGCTCCACCTTTTTCCCCAGGGTGGGGGAGAACCATATATAAGTGCAGTAGTCTCTGTGAACATTCAAGCTTCTGCCTTCTCCCTCCTGTGAGTTTGCTAGCCACCATGCAGAGAAGCCCTCTGGAGAAGGCCTCTGTGGTGAGCAAGCTGTTCTTCAGCTGGACCAGGCCCATCCTGAGGAAGGGCTACAGGCAGAGACTGGAGCTGTCTGACATCTACCAGATCCCCTCTGTGGACTCTGCTGACAACCTGTCTGAGAAGCTGGAGAGGGAGTGGGATAGAGAGCTGGCCAGCAAGAAGAACCCCAAGCTGATCAATGCCCTGAGGAGATGCTTCTTCTGGAGATTCATGTTCTATGGCATCTTCCTGTACCTGGGGGAAGTGACCAAGGCTGTGCAGCCTCTGCTGCTGGGCAGAATCATTGCCAGCTATGACCCTGACAACAAGGAGGAGAGGAGCATTGCCATCTACCTGGGCATTGGCCTGTGCCTGCTGTTCATTGTGAGGACCCTGCTGCTGCACCCTGCCATCTTTGGCCTGCACCACATTGGCATGCAGATGAGGATTGCCATGTTCAGCCTGATCTACAAGAAAACCCTGAAGCTGTCCAGCAGAGTGCTGGACAAGATCAGCATTGGCCAGCTGGTGAGCCTGCTGAGCAACAACCTGAACAAGTTTGATGAGGGCCTGGCCCTGGCCCACTTTGTGTGGATTGCCCCTCTGCAGGTGGCCCTGCTGATGGGCCTGATTTGGGAGCTGCTGCAGGCCTCTGCCTTTTGTGGCCTGGGCTTCCTGATTGTGCTGGCCCTGTTTCAGGCTGGCCTGGGCAGGATGATGATGAAGTACAGGGACCAGAGGGCAGGCAAGATCAGTGAGAGGCTGGTGATCACCTCTGAGATGATTGAGAACATCCAGTCTGTGAAGGCCTACTGTTGGGAGGAAGCTATGGAGAAGATGATTGAAAACCTGAGGCAGACAGAGCTGAAGCTGACCAGGAAGGCTGCCTATGTGAGATACTTCAACAGCTCTGCCTTCTTCTTC
TCTGGCTTCTTTGTGGTGTTCCTGTCTGTGCTGCCCTATGCCCTGATCAAGGGGATCATCCTGAGAAAGATTTTCACCACCATCAGCTTCTGCATTGTGCTGAGGATGGCTGTGACCAGACAGTTCCCCTGGGCTGTGCAGACCTGGTATGACAGCCTGGGGGCCATCAACAAGATCCAGGACTTCCTGCAGAAGCAGGAGTACAAGACCCTGGAGTACAACCTGACCACCACAGAAGTGGTGATGGAGAATGTGACAGCCTTCTGGGAGGAGGGCTTTGGGGAGCTGTTTGAGAAGGCCAAGCAGAACAACAACAACAGAAAGACCAGCAATGGGGATGACTCCCTGTTCTTCTCCAACTTCTCCCTGCTGGGCACACCTGTGCTGAAGGACATCAACTTCAAGATTGAGAGGGGGCAGCTGCTGGCTGTGGCTGGATCTACAGGGGCTGGCAAGACCAGCCTGCTGATGATGATCATGGGGGAGCTGGAGCCTTCTGAGGGCAAGATCAAGCACTCTGGCAGGATCAGCTTTTGCAGCCAGTTCAGCTGGATCATGCCTGGCACCATCAAGGAGAACATCATCTTTGGAGTGAGCTATGATGAGTACAGATACAGGAGTGTGATCAAGGCCTGCCAGCTGGAGGAGGACATCAGCAAGTTTGCTGAGAAGGACAACATTGTGCTGGGGGAGGGAGGCATTACACTGTCTGGGGGCCAGAGAGCCAGAATCAGCCTGGCCAGGGCTGTGTACAAGGATGCTGACCTGTACCTGCTGGACTCCCCCTTTGGCTACCTGGATGTGCTGACAGAGAAGGAGATTTTTGAGAGCTGTGTGTGCAAGCTGATGGCCAACAAGACCAGAATCCTGGTGACCAGCAAGATGGAGCACCTGAAGAAGGCTGACAAGATCCTGATCCTGCATGAGGGCAGCAGCTACTTCTATGGGACCTTCTCTGAGCTGCAGAACCTGCAGCCTGACTTCAGCTCTAAGCTGATGGGCTGTGACAGCTTTGACCAGTTCTCTGCTGAGAGGAGGAACAGCATCCTGACAGAGACCCTGCACAGATTCAGCCTGGAGGGAGATGCCCCTGTGAGCTGGACAGAGACCAAGAAGCAGAGCTTCAAGCAGACAGGGGAGTTTGGGGAGAAGAGGAAGAACTCCATCCTGAACCCCATCAACAGCATCAGGAAGTTCAGCATTGTGCAGAAAACCCCCCTGCAGATGAATGGCATTGAGGAAGATTCTGATGAGCCCCTGGAGAGGAGACTGAGCCTGGTGCCTGATTCTGAGCAGGGAGAGGCCATCCTGCCTAGGATCTCTGTGATCAGCACAGGCCCTACACTGCAGGCCAGAAGGAGGCAGTCTGTGCTGAACCTGATGACCCACTCTGTGAACCAGGGCCAGAACATCCACAGGAAAACCACAGCCTCCACCAGGAAAGTGAGCCTGGCCCCTCAGGCCAATCTGACAGAGCTGGACATCTACAGCAGGAGGCTGTCTCAGGAGACAGGCCTGGAGATTTCTGAGGAGATCAATGAGGAGGACCTGAAAGAGTGCTTCTTTGATGACATGGAGAGCATCCCTGCTGTGACCACCTGGAACACCTACCTGAGATACATCACAGTGCACAAGAGCCTGATCTTTGTGCTGATCTGGTGCCTGGTGATCTTCCTGGCTGAAGTGGCTGCCTCTCTGGTGGTGCTGTGGCTGCTGGGAAACACCCCACTGCAGGACAAGGGCAACAGCACCCACAGCAGGAACAACAGCTATGCTGTGATCATCACCTCCACCTCCAGCTACTATGTGTTCTACATCTATGTGGGAGTGGCTGATACCCTGCTGGCTATGGGCTTCTTTAGAGGCCTGCCCCTGGTGCACACACTGATCACAGTGAGCAAGATCCTCCACCACAAGATGCTGCACTCTGTGCTGCAGGCTCCTATGAGCACCCTGAATACCCTGAAGGCTGGGGGCATCCTGAACAGATTCTCCAAGGA
TATTGCCATCCTGGATGACCTGCTGCCTCTCACCATCTTTGACTTCATCCAGCTGCTGCTGATTGTGATTGGGGCCATTGCTGTGGTGGCAGTGCTGCAGCCCTACATCTTTGTGGCCACAGTGCCTGTGATTGTGGCCTTCATCATGCTGAGGGCCTACTTTCTGCAGACCTCCCAGCAGCTGAAGCAGCTGGAGTCTGAGGGCAGAAGCCCCATCTTCACCCACCTGGTGACAAGCCTGAAGGGCCTGTGGACCCTGAGAGCCTTTGGCAGGCAGCCCTACTTTGAGACCCTGTTCCACAAGGCCCTGAACCTGCACACAGCCAACTGGTTCCTCTACCTGTCCACCCTGAGATGGTTCCAGATGAGAATTGAGATGATCTTTGTCATCTTCTTCATTGCTGTGACCTTCATCAGCATTCTGACCACAGGAGAGGGAGAGGGCAGAGTGGGCATTATCCTGACCCTGGCCATGAACATCATGAGCACACTGCAGTGGGCAGTGAACAGCAGCATTGATGTGGACAGCCTGATGAGGAGTGTGAGCAGAGTGTTCAAGTTCATTGATATGCCCACAGAGGGCAAGCCTACCAAGAGCACCAAGCCCTACAAGAATGGCCAGCTGAGCAAAGTGATGATCATTGAGAACAGCCATGTGAAGAAGGATGATATCTGGCCCAGTGGAGGCCAGATGACAGTGAAGGACCTGACAGCCAAGTACACAGAGGGGGGCAATGCTATCCTGGAGAACATCTCCTTCAGCATCTCCCCTGGCCAGAGAGTGGGACTGCTGGGAAGAACAGGCTCTGGCAAGTCTACCCTGCTGTCTGCCTTCCTGAGGCTGCTGAACACAGAGGGAGAGATCCAGATTGATGGAGTGTCCTGGGACAGCATCACACTGCAGCAGTGGAGGAAGGCCTTTGGTGTGATCCCCCAGAAAGTGTTCATCTTCAGTGGCACCTTCAGGAAGAACCTGGACCCCTATGAGCAGTGGTCTGACCAGGAGATTTGGAAAGTGGCTGATGAAGTGGGCCTGAGAAGTGTGATTGAGCAGTTCCCTGGCAAGCTGGACTTTGTCCTGGTGGATGGGGGCTGTGTGCTGAGCCATGGCCACAAGCAGCTGATGTGCCTGGCCAGATCAGTGCTGAGCAAGGCCAAGATCCTGCTGCTGGATGAGCCTTCTGCCCACCTGGATCCTGTGACCTACCAGATCATCAGGAGGACCCTCAAGCAGGCCTTTGCTGACTGCACAGTCATCCTGTGTGAGCACAGGATTGAGGCCATGCTGGAGTGCCAGCAGTTCCTGGTGATTGAGGAGAACAAAGTGAGGCAGTATGACAGCATCCAGAAGCTGCTGAATGAGAGGAGCCTGTTCAGGCAGGCCATCAGCCCCTCTGATAGAGTGAAGCTGTTCCCCCACAGGAACAGCTCCAAGTGCAAGAGCAAGCCCCAGATTGCTGCCCTGAAGGAGGAGACAGAGGAGGAAGTGCAGGACACCAGGCTGTGAGGGCCCAATCAACCTCTGGATTACAAAATTTGTGAAAGATTGACTGGTATTCTTAACTATGTTGCTCCTTTTACGCTATGTGGATACGCTGCTTTAATGCCTTTGTATCATGCTATTGCTTCCCGTATGGCTTTCATTTTCTCCTCCTTGTATAAATCCTGGTTGCTGTCTCTTTATGAGGAGTTGTGGCCCGTTGTCAGGCAACGTGGCGTGGTGTGCACTGTGTTTGCTGACGCAACCCCCACTGGTTGGGGCATTGCCACCACCTGTCAGCTCCTTTCCGGGACTTTCGCTTTCCCCCTCCCTATTGCCACGGCGGAACTCATCGCCGCCTGCCTTGCCCGCTGCTGGACAGGGGCTCGGCTGTTGGGCACTGACAATTCCGTGGTGTTGTCGGGGAAATCATCGTCCTTTCCTTGGCTGCTCGCCTGTGTTGCCACCTGGATTCTGCGCGGGACGTCCTTCTGCTACGTCCCTTCGGCCCTCAATCCAG
CGGACCTTCCTTCCCGCGGCCTGCTGCCGGCTCTGCGGCCTCTTCCGCGTCTTCGCCTTCGCCCTCAGACGAGTCGGATCTCCCTTTGGGCCGCCTCCCCGCAAGCTTCGCACTTTTTAAAAGAAAAGGGAGGACTGGATGGGATTTATTACTCCGATAGGACGCTGGCTTGTAACTCAGTCTCTTACTAGGAGACCAGCTTGAGCCTGGGTGTTCGCTGGTTAGCCTAACCTGGTTGGCCACCAGGGGTAAGGACTCCTTGGCTTAGAAAGCTAATAAACTTGCCTGCATTAGAGCTCTTACGCGTCCCGGGCTCGAGATCCGCATCTCAATTAGTCAGCAACCATAGTCCCGCCCCTAACTCCGCCCATCCCGCCCCTAACTCCGCCCAGTTCCGCCCATTCTCCGCCCCATGGCTGACTAATTTTTTTTATTTATGCAGAGGCCGAGGCCGCCTCGGCCTCTGAGCTATTCCAGAAGTAGTGAGGAGGCTTTTTTGGAGGCCTAGGCTTTTGCAAAAAGCTAACTTGTTTATTGCAGCTTATAATGGTTACAAATAAAGCAATAGCATCACAAATTTCACAAATAAAGCATTTTTTTCACTGCATTCTAGTTGTGGTTTGTCCAAACTCATCAATGTATCTTATCATGTCTGTCCGCTTCCTCGCTCACTGACTCGCTGCGCTCGGTCGTTCGGCTGCGGCGAGCGGTATCAGCTCACTCAAAGGCGGTAATACGGTTATCCACAGAATCAGGGGATAACGCAGGAAAGAACATGTGAGCAAAAGGCCAGCAAAAGGCCAGGAACCGTAAAAAGGCCGCGTTGCTGGCGTTTTTCCATAGGCTCCGCCCCCCTGACGAGCATCACAAAAATCGACGCTCAAGTCAGAGGTGGCGAAACCCGACAGGACTATAAAGATACCAGGCGTTTCCCCCTGGAAGCTCCCTCGTGCGCTCTCCTGTTCCGACCCTGCCGCTTACCGGATACCTGTCCGCCTTTCTCCCTTCGGGAAGCGTGGCGCTTTCTCATAGCTCACGCTGTAGGTATCTCAGTTCGGTGTAGGTCGTTCGCTCCAAGCTGGGCTGTGTGCACGAACCCCCCGTTCAGCCCGACCGCTGCGCCTTATCCGGTAACTATCGTCTTGAGTCCAACCCGGTAAGACACGACTTATCGCCACTGGCAGCAGCCACTGGTAACAGGATTAGCAGAGCGAGGTATGTAGGCGGTGCTACAGAGTTCTTGAAGTGGTGGCCTAACTACGGCTACACTAGAAGAACAGTATTTGGTATCTGCGCTCTGCTGAAGCCAGTTACCTTCGGAAAAAGAGTTGGTAGCTCTTGATCCGGCAAACAAACCACCGCTGGTAGCGGTGGTTTTTTTGTTTGCAAGCAGCAGATTACGCGCAGAAAAAAAGGATCTCAAGAAGATCCTTTGATCTTTTCTACGGGGTCTGACGCTCAGTGGAACGAAAACTCACGTTAAGGGATTTTGGTCATGAGATTATCAAAAAGGATCTTCACCTAGATCCTTTTAAATTAAAAATGAAGTTTTAAATCAATCTAAAGTATATATGAGTAAACTTGGTCTGACAGTTAGAAAAACTCATCGAGCATCAAATGAAACTGCAATTTATTCATATCAGGATTATCAATACCATATTTTTGAAAAAGCCGTTTCTGTAATGAAGGAGAAAACTCACCGAGGCAGTTCCATAGGATGGCAAGATCCTGGTATCGGTCTGCGATTCCGACTCGTCCAACATCAATACAACCTATTAATTTCCCCTCGTCAAAAATAAGGTTATCAAGTGAGAAATCACCATGAGTGACGACTGAATCCGGTGAGAATGGCAACAGCTTATGCATTTCTTTCCAGACTTGTTCAACAGGCCAGCCATTACGCTCGTCATCAAAATCACTCGCATCAACCAAACCGTTATTCATTCGTGATTGCGCCTGAGCGAGACGAAATACGCGATCGCTGTTAAAAGGACAATTAC
AAACAGGAATCGAATGCAACCGGCGCAGGAACACTGCCAGCGCATCAACAATATTTTCACCTGAATCAGGATATTCTTCTAATACCTGGAATGCTGTTTTTCCGGGGATCGCAGTGGTGAGTAACCATGCATCATCAGGAGTACGGATAAAATGCTTGATGGTCGGAAGAGGCATAAATTCCGTCAGCCAGTTTAGTCTGACCATCTCATCTGTAACATCATTGGCAACGCTACCTTTGCCATGTTTCAGAAACAACTCTGGCGCATCGGGCTTCCCATACAATCGATAGATTGTCGCACCTGATTGCCCGACATTATCGCGAGCCCATTTATACCCATATAAATCAGCATCCATGTTGGAATTTAATCGCGGCCTAGAGCAAGACGTTTCCCGTTGAATATGGCTCATAACACCCCTTGTATTACTGTTTATGTAAGCAGACAGTTTTATTGTTCATGATGATATATTTTTATCTTGTGCAATGTAACATCAGAGATTTTGAGACACAACAATTGGTCGACGGATCC
SEQ ID NO: 4 Plasmid as defined in Figure 2B (pDNA1 pGM830)
Length: 10536; Molecule type: DNA; Feature location/qualifiers: source, 1..10536; mol_type, other DNA; note, pGM830; organism, synthetic construct
GGTACCTCAATATTGGCCATTAGCCATATTATTCATTGGTTATATAGCATAAATCAATATTGGCTATTGGCCATTGCATACGTTGTATCTATATCATAATATGTACATTTATATTGGCTCATGTCCAATATGACCGCCATGTTGGCATTGATTATTGACTAGTTATTAATAGTAATCAATTACGGGGTCATTAGTTCATAGCCCATATATGGAGTTCCGCGTTACATAACTTACGGTAAATGGCCCGCCTGGCTGACCGCCCAACGACCCCCGCCCATTGACGTCAATAATGACGTATGTTCCCATAGTAACGCCAATAGGGACTTTCCATTGACGTCAATGGGTGGAGTATTTACGGTAAACTGCCCACTTGGCAGTACATCAAGTGTATCATATGCCAAGTCCGCCCCCTATTGACGTCAATGACGGTAAATGGCCCGCCTGGCATTATGCCCAGTACATGACCTTACGGGACTTTCCTACTTGGCAGTACATCTACGTATTAGTCATCGCTATTACCATGGTGATGCGGTTTTGGCAGTACACCAATGGGCGTGGATAGCGGTTTGACTCACGGGGATTTCCAAGTCTCCACCCCATTGACGTCAATGGGAGTTTGTTTTGGCACCAAAATCAACGGGACTTTCCAAAATGTCGTAACAACTGCGATCGCCCGCCCCGTTGACGCAAATGGGCGGTAGGCGTGTACGGTGGGAGGTCTATATAAGCAGAGCTCGCTGGCTTGTAACTCAGTCTCTTACTAGGAGACCAGCTTGAGCCTGGGTGTTCGCTGGTTAGCCTAACCTGGTTGGCCACCAGGGGTAAGGACTCCTTGGCTTAGAAAGCTAATAAACTTGCCTGCATTAGAGCTTATCTGAGTCAAGTGTCCTCATTGACGCCTCACTCTCTTGAACGGGAATCTTCCTTACTGGGTTCTCTCTCTGACCCAGGCGAGAGAAACTCCAGCAGTGGCGCCCGAACAGGGACTTGAGTGAGAGTGTAGGCACGTACAGCTGAGAAGGCGTCGGACGCGAAGGAAGCGCGGGGTGCGACGCGACCAAGAAGGAGACTTGGTGAGTAGGCTTCTCGAGTGCCGGGAAAAAGCTCGAGCCTAGTTAGAGGACTAGGAGAGGCCGTAGCCGTAACTACTCTGGGCAAGTAGGGCAGGCGGTGGGTACGCAATTGGGGGCGGCTACCTCAGCACTAAATAGGAGACAATTAGACCAATTTGAGAAAATACGACTTCGCCCGAACGGAAAGAAAAAGTACCAAATTAAACATTTAATATTGGGCAGGCAAGGAGATTGGAGCGCTTCGGCCTCCATGAGAGGTTGTTGGAGACAGAGGAGGGGTGTAAAAGAATCATAGAAGTCCTCTACCCCCTAGAACCAACAGGATCGGAGGGCTTAAAAAGTCTGTTCAATCTTGTGTGCGTGCTATATTGCTTGCACAAGGAACAGAAAGTGAAAGACACAGAGGAAGCAGTAGCAACAGTAAGACAACACTGCCATCTAGTGGAAAAAGAAAAAAGTGCAACAGAGACATCTAGTGGACAAAAGAAAAATGACAAGGGAATAGCAGCGCCACCTGGTGGCAGTCAGAATTTTCCAGCGCAACAACAAGGAAATTGCCTGGGTACATGTACCCTTGTCACCGCGCACCTTAAATGCGTGGGTAAAAGCAGTAGAGGAGAAAAAATTTGGAGCAGAAATAGTACCCATGTTTCAAGCCCTATCGCCTGCAGGCCGTTTGTGCTAGGGTTCTTAGGCTTCTTGGGGGCTGCTGGAACTGCATTGGGAGCAGCGGCGACAGCCCTGACGGTCCAGTCTCAGCATTTGCTTGCTGGGATACTGCAGCAGCAGAAGAATCTGCTGGCGGCTGTGGAGGCTCAACAGCAGATGTTGAAGCTGACCATTTGGGGTGTTAAAAACCTCAATGCCCGCGTCACAGCCCTTGAGAAGTACCTAGAGGATCAGGCACGACTAAACTCCTGGGGGTGC
GCATGGAAACAAGTATGTCATACCACAGTGGAGTGGCCCTGGACAAATCGGACTCCGGATTGGCAAAATAAGACTTGGTTGGAGTGGGAAAGACAAATAGCTGATTTGGAAAGCAACATTACGAGACAATTAGTGAAGGCTAGAGAACAAGAGGAAAAGAATCTAGATGCCTATCAGAAGTTAACTAGTTGGTCAGATTTCTGGTCTTGGTTCGATTTCTCAAAATGGCTTAACATTTTAAAAAAGGGATTTTTAGTAATAGTAGGAATAATAGGGTTAAGATTACTTTACACAGTATATGGATGTATAGTGAGGGTTAGGCAGGGATATGTTCCTCTATCTCCACAGATCCATATAAAGCGGCAATTTTAAAAGAAAGGGAGGAATAGGGGGACAGACTTCAGCAGAGAGACTAATTAATATAATAACAACACAATTAGAAATACAACATTTACAAACCAAAATTCAAAAAATTTTAAATTTTAGAGCCGCGGAGATCTGTTACATAACTTATGGTAAATGGCCTGCCTGGCTGACTGCCCAATGACCCCTGCCCAATGATGTCAATAATGATGTATGTTCCCATGTAATGCCAATAGGGACTTTCCATTGATGTCAATGGGTGGAGTATTTATGGTAACTGCCCACTTGGCAGTACATCAAGTGTATCATATGCCAAGTATGCCCCCTATTGATGTCAATGATGGTAAATGGCCTGCCTGGCATTATGCCCAGTACATGACCTTATGGGACTTTCCTACTTGGCAGTACATCTATGTATTAGTCATTGCTATTACCATGGGAATTCACTAGTGGAGAAGAGCATGCTTGAGGGCTGAGTGCCCCTCAGTGGGCAGAGAGCACATGGCCCACAGTCCCTGAGAAGTTGGGGGGAGGGGTGGGCAATTGAACTGGTGCCTAGAGAAGGTGGGGCTTGGGTAAACTGGGAAAGTGATGTGGTGTACTGGCTCCACCTTTTTCCCCAGGGTGGGGGAGAACCATATATAAGTGCAGTAGTCTCTGTGAACATTCAAGCTTCTGCCTTCTCCCTCCTGTGAGTTTGCTAGCCACCATGCAGAGAAGCCCTCTGGAGAAGGCCTCTGTGGTGAGCAAGCTGTTCTTCAGCTGGACCAGGCCCATCCTGAGGAAGGGCTACAGGCAGAGACTGGAGCTGTCTGACATCTACCAGATCCCCTCTGTGGACTCTGCTGACAACCTGTCTGAGAAGCTGGAGAGGGAGTGGGATAGAGAGCTGGCCAGCAAGAAGAACCCCAAGCTGATCAATGCCCTGAGGAGATGCTTCTTCTGGAGATTCATGTTCTATGGCATCTTCCTGTACCTGGGGGAAGTGACCAAGGCTGTGCAGCCTCTGCTGCTGGGCAGAATCATTGCCAGCTATGACCCTGACAACAAGGAGGAGAGGAGCATTGCCATCTACCTGGGCATTGGCCTGTGCCTGCTGTTCATTGTGAGGACCCTGCTGCTGCACCCTGCCATCTTTGGCCTGCACCACATTGGCATGCAGATGAGGATTGCCATGTTCAGCCTGATCTACAAGAAAACCCTGAAGCTGTCCAGCAGAGTGCTGGACAAGATCAGCATTGGCCAGCTGGTGAGCCTGCTGAGCAACAACCTGAACAAGTTTGATGAGGGCCTGGCCCTGGCCCACTTTGTGTGGATTGCCCCTCTGCAGGTGGCCCTGCTGATGGGCCTGATTTGGGAGCTGCTGCAGGCCTCTGCCTTTTGTGGCCTGGGCTTCCTGATTGTGCTGGCCCTGTTTCAGGCTGGCCTGGGCAGGATGATGATGAAGTACAGGGACCAGAGGGCAGGCAAGATCAGTGAGAGGCTGGTGATCACCTCTGAGATGATTGAGAACATCCAGTCTGTGAAGGCCTACTGTTGGGAGGAAGCTATGGAGAAGATGATTGAAAACCTGAGGCAGACAGAGCTGAAGCTGACCAGGAAGGCTGCCTATGTGAGATACTTCAACAGCTCTGCCT
TCTTCTTCTCTGGCTTCTTTGTGGTGTTCCTGTCTGTGCTGCCCTATGCCCTGATCAAGGGGATCATCCTGAGAAAGATTTTCACCACCATCAGCTTCTGCATTGTGCTGAGGATGGCTGTGACCAGACAGTTCCCCTGGGCTGTGCAGACCTGGTATGACAGCCTGGGGGCCATCAACAAGATCCAGGACTTCCTGCAGAAGCAGGAGTACAAGACCCTGGAGTACAACCTGACCACCACAGAAGTGGTGATGGAGAATGTGACAGCCTTCTGGGAGGAGGGCTTTGGGGAGCTGTTTGAGAAGGCCAAGCAGAACAACAACAACAGAAAGACCAGCAATGGGGATGACTCCCTGTTCTTCTCCAACTTCTCCCTGCTGGGCACACCTGTGCTGAAGGACATCAACTTCAAGATTGAGAGGGGGCAGCTGCTGGCTGTGGCTGGATCTACAGGGGCTGGCAAGACCAGCCTGCTGATGATGATCATGGGGGAGCTGGAGCCTTCTGAGGGCAAGATCAAGCACTCTGGCAGGATCAGCTTTTGCAGCCAGTTCAGCTGGATCATGCCTGGCACCATCAAGGAGAACATCATCTTTGGAGTGAGCTATGATGAGTACAGATACAGGAGTGTGATCAAGGCCTGCCAGCTGGAGGAGGACATCAGCAAGTTTGCTGAGAAGGACAACATTGTGCTGGGGGAGGGAGGCATTACACTGTCTGGGGGCCAGAGAGCCAGAATCAGCCTGGCCAGGGCTGTGTACAAGGATGCTGACCTGTACCTGCTGGACTCCCCCTTTGGCTACCTGGATGTGCTGACAGAGAAGGAGATTTTTGAGAGCTGTGTGTGCAAGCTGATGGCCAACAAGACCAGAATCCTGGTGACCAGCAAGATGGAGCACCTGAAGAAGGCTGACAAGATCCTGATCCTGCATGAGGGCAGCAGCTACTTCTATGGGACCTTCTCTGAGCTGCAGAACCTGCAGCCTGACTTCAGCTCTAAGCTGATGGGCTGTGACAGCTTTGACCAGTTCTCTGCTGAGAGGAGGAACAGCATCCTGACAGAGACCCTGCACAGATTCAGCCTGGAGGGAGATGCCCCTGTGAGCTGGACAGAGACCAAGAAGCAGAGCTTCAAGCAGACAGGGGAGTTTGGGGAGAAGAGGAAGAACTCCATCCTGAACCCCATCAACAGCATCAGGAAGTTCAGCATTGTGCAGAAAACCCCCCTGCAGATGAATGGCATTGAGGAAGATTCTGATGAGCCCCTGGAGAGGAGACTGAGCCTGGTGCCTGATTCTGAGCAGGGAGAGGCCATCCTGCCTAGGATCTCTGTGATCAGCACAGGCCCTACACTGCAGGCCAGAAGGAGGCAGTCTGTGCTGAACCTGATGACCCACTCTGTGAACCAGGGCCAGAACATCCACAGGAAAACCACAGCCTCCACCAGGAAAGTGAGCCTGGCCCCTCAGGCCAATCTGACAGAGCTGGACATCTACAGCAGGAGGCTGTCTCAGGAGACAGGCCTGGAGATTTCTGAGGAGATCAATGAGGAGGACCTGAAAGAGTGCTTCTTTGATGACATGGAGAGCATCCCTGCTGTGACCACCTGGAACACCTACCTGAGATACATCACAGTGCACAAGAGCCTGATCTTTGTGCTGATCTGGTGCCTGGTGATCTTCCTGGCTGAAGTGGCTGCCTCTCTGGTGGTGCTGTGGCTGCTGGGAAACACCCCACTGCAGGACAAGGGCAACAGCACCCACAGCAGGAACAACAGCTATGCTGTGATCATCACCTCCACCTCCAGCTACTATGTGTTCTACATCTATGTGGGAGTGGCTGATACCCTGCTGGCTATGGGCTTCTTTAGAGGCCTGCCCCTGGTGCACACACTGATCACAGTGAGCAAGATCCTCCACCACAAGATGCTGCACTCTGTGCTGCAGGCTCCTATGAGCACCCTGAATACCCTGAAGGCTGGGGGCATCCTGAACAGATTC
TCCAAGGATATTGCCATCCTGGATGACCTGCTGCCTCTCACCATCTTTGACTTCATCCAGCTGCTGCTGATTGTGATTGGGGCCATTGCTGTGGTGGCAGTGCTGCAGCCCTACATCTTTGTGGCCACAGTGCCTGTGATTGTGGCCTTCATCATGCTGAGGGCCTACTTTCTGCAGACCTCCCAGCAGCTGAAGCAGCTGGAGTCTGAGGGCAGAAGCCCCATCTTCACCCACCTGGTGACAAGCCTGAAGGGCCTGTGGACCCTGAGAGCCTTTGGCAGGCAGCCCTACTTTGAGACCCTGTTCCACAAGGCCCTGAACCTGCACACAGCCAACTGGTTCCTCTACCTGTCCACCCTGAGATGGTTCCAGATGAGAATTGAGATGATCTTTGTCATCTTCTTCATTGCTGTGACCTTCATCAGCATTCTGACCACAGGAGAGGGAGAGGGCAGAGTGGGCATTATCCTGACCCTGGCCATGAACATCATGAGCACACTGCAGTGGGCAGTGAACAGCAGCATTGATGTGGACAGCCTGATGAGGAGTGTGAGCAGAGTGTTCAAGTTCATTGATATGCCCACAGAGGGCAAGCCTACCAAGAGCACCAAGCCCTACAAGAATGGCCAGCTGAGCAAAGTGATGATCATTGAGAACAGCCATGTGAAGAAGGATGATATCTGGCCCAGTGGAGGCCAGATGACAGTGAAGGACCTGACAGCCAAGTACACAGAGGGGGGCAATGCTATCCTGGAGAACATCTCCTTCAGCATCTCCCCTGGCCAGAGAGTGGGACTGCTGGGAAGAACAGGCTCTGGCAAGTCTACCCTGCTGTCTGCCTTCCTGAGGCTGCTGAACACAGAGGGAGAGATCCAGATTGATGGAGTGTCCTGGGACAGCATCACACTGCAGCAGTGGAGGAAGGCCTTTGGTGTGATCCCCCAGAAAGTGTTCATCTTCAGTGGCACCTTCAGGAAGAACCTGGACCCCTATGAGCAGTGGTCTGACCAGGAGATTTGGAAAGTGGCTGATGAAGTGGGCCTGAGAAGTGTGATTGAGCAGTTCCCTGGCAAGCTGGACTTTGTCCTGGTGGATGGGGGCTGTGTGCTGAGCCATGGCCACAAGCAGCTGATGTGCCTGGCCAGATCAGTGCTGAGCAAGGCCAAGATCCTGCTGCTGGATGAGCCTTCTGCCCACCTGGATCCTGTGACCTACCAGATCATCAGGAGGACCCTCAAGCAGGCCTTTGCTGACTGCACAGTCATCCTGTGTGAGCACAGGATTGAGGCCATGCTGGAGTGCCAGCAGTTCCTGGTGATTGAGGAGAACAAAGTGAGGCAGTATGACAGCATCCAGAAGCTGCTGAATGAGAGGAGCCTGTTCAGGCAGGCCATCAGCCCCTCTGATAGAGTGAAGCTGTTCCCCCACAGGAACAGCTCCAAGTGCAAGAGCAAGCCCCAGATTGCTGCCCTGAAGGAGGAGACAGAGGAGGAAGTGCAGGACACCAGGCTGTGAGGGCCCAATCAACCTCTGGATTACAAAATTTGTGAAAGATTGACTGGTATTCTTAACTATGTTGCTCCTTTTACGCTATGTGGATACGCTGCTTTAATGCCTTTGTATCATGCTATTGCTTCCCGTATGGCTTTCATTTTCTCCTCCTTGTATAAATCCTGGTTGCTGTCTCTTTATGAGGAGTTGTGGCCCGTTGTCAGGCAACGTGGCGTGGTGTGCACTGTGTTTGCTGACGCAACCCCCACTGGTTGGGGCATTGCCACCACCTGTCAGCTCCTTTCCGGGACTTTCGCTTTCCCCCTCCCTATTGCCACGGCGGAACTCATCGCCGCCTGCCTTGCCCGCTGCTGGACAGGGGCTCGGCTGTTGGGCACTGACAATTCCGTGGTGTTGTCGGGGAAATCATCGTCCTTTCCTTGGCTGCTCGCCTGTGTTGCCACCTGGATTCTGCGCGGGACGTCCTTCTGCTACGTCCCTTCGGCCCT
CAATCCAGCGGACCTTCCTTCCCGCGGCCTGCTGCCGGCTCTGCGGCCTCTTCCGCGTCTTCGCCTTCGCCCTCAGACGAGTCGGATCTCCCTTTGGGCCGCCTCCCCGCAAGCTTCGCACTTTTTAAAAGAAAAGGGAGGACTGGATGGGATTTATTACTCCGATAGGACGCTGGCTTGTAACTCAGTCTCTTACTAGGAGACCAGCTTGAGCCTGGGTGTTCGCTGGTTAGCCTAACCTGGTTGGCCACCAGGGGTAAGGACTCCTTGGCTTAGAAAGCTAATAAACTTGCCTGCATTAGAGCTCTTACGCGTCCCGGGCTCGAGATCCGCATCTCAATTAGTCAGCAACCATAGTCCCGCCCCTAACTCCGCCCATCCCGCCCCTAACTCCGCCCAGTTCCGCCCATTCTCCGCCCCATGGCTGACTAATTTTTTTTATTTATGCAGAGGCCGAGGCCGCCTCGGCCTCTGAGCTATTCCAGAAGTAGTGAGGAGGCTTTTTTGGAGGCCTAGGCTTTTGCAAAAAGCTAACTTGTTTATTGCAGCTTATAATGGTTACAAATAAAGCAATAGCATCACAAATTTCACAAATAAAGCATTTTTTTCACTGCATTCTAGTTGTGGTTTGTCCAAACTCATCAATGTATCTTATCATGTCTGTCCGCTTCCTCGCTCACTGACTCGCTGCGCTCGGTCGTTCGGCTGCGGCGAGCGGTATCAGCTCACTCAAAGGCGGTAATACGGTTATCCACAGAATCAGGGGATAACGCAGGAAAGAACATGTGAGCAAAAGGCCAGCAAAAGGCCAGGAACCGTAAAAAGGCCGCGTTGCTGGCGTTTTTCCATAGGCTCCGCCCCCCTGACGAGCATCACAAAAATCGACGCTCAAGTCAGAGGTGGCGAAACCCGACAGGACTATAAAGATACCAGGCGTTTCCCCCTGGAAGCTCCCTCGTGCGCTCTCCTGTTCCGACCCTGCCGCTTACCGGATACCTGTCCGCCTTTCTCCCTTCGGGAAGCGTGGCGCTTTCTCATAGCTCACGCTGTAGGTATCTCAGTTCGGTGTAGGTCGTTCGCTCCAAGCTGGGCTGTGTGCACGAACCCCCCGTTCAGCCCGACCGCTGCGCCTTATCCGGTAACTATCGTCTTGAGTCCAACCCGGTAAGACACGACTTATCGCCACTGGCAGCAGCCACTGGTAACAGGATTAGCAGAGCGAGGTATGTAGGCGGTGCTACAGAGTTCTTGAAGTGGTGGCCTAACTACGGCTACACTAGAAGAACAGTATTTGGTATCTGCGCTCTGCTGAAGCCAGTTACCTTCGGAAAAAGAGTTGGTAGCTCTTGATCCGGCAAACAAACCACCGCTGGTAGCGGTGGTTTTTTTGTTTGCAAGCAGCAGATTACGCGCAGAAAAAAAGGATCTCAAGAAGATCCTTTGATCTTTTCTACGGGGTCTGACGCTCAGTGGAACGAAAACTCACGTTAAGGGATTTTGGTCATGAGATTATCAAAAAGGATCTTCACCTAGATCCTTTTAAATTAAAAATGAAGTTTTAAATCAATCTAAAGTATATATGAGTAAACTTGGTCTGACAGTTAGAAAAACTCATCGAGCATCAAATGAAACTGCAATTTATTCATATCAGGATTATCAATACCATATTTTTGAAAAAGCCGTTTCTGTAATGAAGGAGAAAACTCACCGAGGCAGTTCCATAGGATGGCAAGATCCTGGTATCGGTCTGCGATTCCGACTCGTCCAACATCAATACAACCTATTAATTTCCCCTCGTCAAAAATAAGGTTATCAAGTGAGAAATCACCATGAGTGACGACTGAATCCGGTGAGAATGGCAACAGCTTATGCATTTCTTTCCAGACTTGTTCAACAGGCCAGCCATTACGCTCGTCATCAAAATCACTCGCATCAACCAAACCGTTATTCATTCGTGATTGCGCCTGAGCGAGACGAAATACGCGATCGCTGTTAAAAGG
ACAATTACAAACAGGAATCGAATGCAACCGGCGCAGGAACACTGCCAGCGCATCAACAATATTTTCACCTGAATCAGGATATTCTTCTAATACCTGGAATGCTGTTTTTCCGGGGATCGCAGTGGTGAGTAACCATGCATCATCAGGAGTACGGATAAAATGCTTGATGGTCGGAAGAGGCATAAATTCCGTCAGCCAGTTTAGTCTGACCATCTCATCTGTAACATCATTGGCAACGCTACCTTTGCCATGTTTCAGAAACAACTCTGGCGCATCGGGCTTCCCATACAATCGATAGATTGTCGCACCTGATTGCCCGACATTATCGCGAGCCCATTTATACCCATATAAATCAGCATCCATGTTGGAATTTAATCGCGGCCTAGAGCAAGACGTTTCCCGTTGAATATGGCTCATAACACCCCTTGTATTACTGTTTATGTAAGCAGACAGTTTTATTGTTCATGATGATATATTTTTATCTTGTGCAATGTAACATCAGAGATTTTGAGACACAACAATTGGTCGACGGATCC
SEQ ID NO: 5 Plasmid as defined in Figure 2C (pDNA2a pGM691)
Length: 9064; Molecule type: DNA; Feature location/qualifiers: source, 1..9064; mol_type, other DNA; note, pGM691; organism, synthetic construct
ATTGATTATTGACTAGTTATTAATAGTAATCAATTACGGGGTCATTAGTTCATAGCCCATATATGGAGTTCCGCGTTACATAACTTACGGTAAATGGCCCGCCTGGCTGACCGCCCAACGACCCCCGCCCATTGACGTCAATAATGACGTATGTTCCCATAGTAACGCCAATAGGGACTTTCCATTGACGTCAATGGGTGGAGTATTTACGGTAAACTGCCCACTTGGCAGTACATCAAGTGTATCATATGCCAAGTACGCCCCCTATTGACGTCAATGACGGTAAATGGCCCGCCTGGCATTATGCCCAGTACATGACCTTATGGGACTTTCCTACTTGGCAGTACATCTACGTATTAGTCATCGCTATTACCATGGTCGAGGTGAGCCCCACGTTCTGCTTCACTCTCCCCATCTCCCCCCCCTCCCCACCCCCAATTTTGTATTTATTTATTTTTTAATTATTTTGTGCAGCGATGGGGGCGGGGGGGGGGGGGGGGCGCGCGCCAGGCGGGGCGGGGCGGGGCGAGGGGCGGGGCGGGGCGAGGCGGAGAGGTGCGGCGGCAGCCAATCAGAGCGGCGCGCTCCGAAAGTTTCCTTTTATGGCGAGGCGGCGGCGGCGGCGGCCCTATAAAAAGCGAAGCGCGCGGCGGGCGGGAGTCGCTGCGCGCTGCCTTCGCCCCGTGCCCCGCTCCGCCGCCGCCTCGCGCCGCCCGCCCCGGCTCTGACTGACCGCGTTACTCCCACAGGTGAGCGGGCGGGACGGCCCTTCTCCTCCGGGCTGTAATTAGCGCTTGGTTTAATGACGGCTTGTTTCTTTTCTGTGGCTGCGTGAAAGCCTTGAGGGGCTCCGGGAGGGCCCTTTGTGCGGGGGGAGCGGCTCGGGGGGTGCGTGCGTGTGTGTGTGCGTGGGGAGCGCCGCGTGCGGCTCCGCGCTGCCCGGCGGCTGTGAGCGCTGCGGGCGCGGCGCGGGGCTTTGTGCGCTCCGCAGTGTGCGCGAGGGGAGCGCGGCCGGGGGCGGTGCCCCGCGGTGCGGGGGGGGCTGCGAGGGGAACAAAGGCTGCGTGCGGGGTGTGTGCGTGGGGGGGTGAGCAGGGGGTGTGGGCGCGTCGGTCGGGCTGCAACCCCCCCTGCACCCCCCTCCCCGAGTTGCTGAGCACGGCCCGGCTTCGGGTGCGGGGCTCCGTACGGGGCGTGGCGCGGGGCTCGCCGTGCCGGGCGGGGGGTGGCGGCAGGTGGGGGTGCCGGGCGGGGCGGGGCCGCCTCGGGCCGGGGAGGGCTCGGGGGAGGGGCGCGGCGGCCCCCGGAGCGCCGGCGGCTGTCGAGGCGCGGCGAGCCGCAGCCATTGCCTTTTATGGTAATCGTGCGAGAGGGCGCAGGGACTTCCTTTGTCCCAAATCTGTGCGGAGCCGAAATCTGGGAGGCGCCGCCGCACCCCCTCTAGCGGGCGCGGGGCGAAGCGGTGCGGCGCCGGCAGGAAGGAAATGGGCGGGGAGGGCCTTCGTGCGTCGCCGCGCCGCCGTCCCCTTCTCCCTCTCCAGCCTCGGGGCTGTCCGCGGGGGGACGGCTGCCTTCGGGGGGGACGGGGCAGGGCGGGGTTCGGCTTCTGGCGTGTGACCGGCGGCTCTAGAGCCTCTGCTAACCATGTTCATGCCTTCTTCTTTTTCCTACAGCTCCTGGGCAACGTGCTGGTTATTGTGCTGTCTCATCATTTTGGCAAAGAATTGCTCGAGCCACCATGGGAGCTGCCACATCTGCCCTGAATAGACGGCAGCTGGACCAGTTCGAGAAGATCAGACTGCGGCCCAACGGCAAGAAGAAGTACCAGATCAAGCACCTGATCTGGGCCGGCAAAGAGATGGAAAGATTCGGCCTGCACGAGCGGCTGCTGGAAACCGAGGAAGGCTGCAAGAGAATTATCGAGGTGCTGTACCCTCTGGAACCTACCGGCTCTGAGGGCCTGAAGTCCCTGTTCAATCTCGTGTGCGTGCTGTACTGCC
TGCACAAAGAACAGAAAGTGAAGGACACCGAAGAGGCCGTGGCCACAGTTAGACAGCACTGCCACCTGGTGGAAAAAGAGAAGTCCGCCACAGAGACAAGCAGCGGCCAGAAGAAGAACGACAAGGGAATTGCTGCCCCTCCTGGCGGCAGCCAGAATTTTCCTGCTCAGCAGCAGGGAAACGCCTGGGTGCACGTTCCACTGAGCCCTAGAACACTGAATGCCTGGGTCAAAGCCGTGGAAGAGAAGAAGTTTGGCGCCGAGATCGTGCCCATGTTCCAGGCTCTGTCTGAGGGCTGCACCCCTTACGACATCAACCAGATGCTGAACGTGCTGGGAGATCACCAGGGCGCTCTGCAGATCGTGAAAGAGATCATCAACGAAGAGGCTGCCCAGTGGGACGTGACACATCCATTGCCTGCTGGACCTCTGCCAGCCGGACAACTGAGAGATCCTAGAGGCTCTGATATCGCCGGCACCACCAGCTCTGTGCAAGAGCAGCTGGAATGGATCTACACCGCCAATCCTAGAGTGGACGTGGGCGCCATCTACAGAAGATGGATCATCCTGGGCCTGCAGAAATGCGTGAAGATGTACAACCCCGTGTCCGTGCTGGACATCAGACAGGGACCCAAAGAGCCCTTCAAGGACTACGTGGACCGGTTCTATAAGGCCATTAGAGCCGAGCAGGCCAGCGGCGAAGTGAAGCAGTGGATGACAGAGAGCCTGCTGATCCAGAACGCCAATCCAGACTGCAAAGTGATCCTGAAAGGCCTGGGCATGCACCCCACACTGGAAGAGATGCTGACAGCCTGTCAAGGCGTTGGCGGCCCTTCTTACAAAGCCAAAGTGATGGCCGAGATGATGCAGACCATGCAGAACCAGAACATGGTGCAGCAAGGCGGCCCTAAGAGACAGAGGCCTCCTCTGAGATGCTACAACTGCGGCAAGTTCGGCCACATGCAGAGACAGTGTCCTGAGCCTAGGAAAACAAAATGTCTAAAGTGTGGAAAATTGGGACACCTAGCAAAAGACTGCAGGGGACAGGTGAATTTTTTAGGGTATGGACGGTGGATGGGGGCAAAACCGAGAAATTTTCCCGCCGCTACTCTTGGAGCGGAACCGAGTGCGCCTCCTCCACCGAGCGGCACCACCCCATACGACCCAGCAAAGAAGCTCCTGCAGCAATATGCAGAGAAAGGGAAACAACTGAGGGAGCAAAAGAGGAATCCACCGGCAATGAATCCGGATTGGACCGAGGGATATTCTTTGAACTCCCTCTTTGGAGAAGACCAATAAAGACCGTGTACATCGAGGGCGTGCCCATCAAGGCTCTGCTGGATACAGGCGCCGACGACACCATCATCAAAGAGAACGACCTGCAGCTGAGCGGCCCTTGGAGGCCTAAGATCATTGGAGGAATCGGCGGAGGCCTGAACGTCAAAGAGTACAACGACCGGGAAGTGAAGATCGAGGACAAGATCCTGAGGGGCACAATCCTGCTGGGCGCCACACCTATCAACATCATCGGCAGAAATCTGCTGGCCCCTGCCGGCGCTAGACTGGTTATGGGACAGCTCTCTGAGAAGATCCCCGTGACACCCGTGAAGCTGAAAGAAGGCGCTAGAGGACCTTGTGTGCGACAGTGGCCTCTGAGCAAAGAGAAGATTGAGGCCCTGCAAGAAATCTGTAGCCAGCTGGAACAAGAGGGCAAGATCAGCAGAGTTGGCGGCGAGAACGCCTACAATACCCCTATCTTCTGCATCAAGAAAAAGGACAAGAGCCAGTGGCGGATGCTGGTGGACTTTAGAGAGCTGAACAAGGCTACCCAGGACTTCTTCGAGGTGCAGCTGGGAATTCCTCATCCTGCCGGCCTGCGGAAGATGAGACAGATCACAGTGCTGGATGTGGGCGACGCCTACTACAGCATCCCTCTGGACCCCAACTTCAGAAAGTACACCGCCTTCACAATCCCCACCGTGAACAATCAAGGCCCTGGCA
TCAGATACCAGTTCAACTGCCTGCCTCAAGGCTGGAAGGGCAGCCCCACCATTTTTCAGAATACCGCCGCCAGCATCCTGGAAGAAATCAAGAGAAACCTGCCTGCTCTGACCATCGTGCAGTACATGGACGATCTGTGGGTCGGAAGCCAAGAGAATGAGCACACCCACGACAAGCTGGTGGAACAGCTGAGAACAAAGCTGCAGGCCTGGGGCCTCGAAACCCCTGAGAAGAAGGTGCAGAAAGAACCTCCTTACGAGTGGATGGGCTACAAGCTGTGGCCTCACAAGTGGGAGCTGAGCCGGATTCAGCTCGAAGAGAAGGACGAGTGGACCGTGAACGACATCCAGAAACTCGTGGGCAAGCTGAATTGGGCAGCCCAGCTGTATCCCGGCCTGAGGACCAAGAACATCTGCAAGCTGATCCGGGGAAAGAAGAACCTGCTGGAACTGGTCACATGGACACCTGAGGCCGAGGCCGAATATGCCGAGAATGCCGAAATCCTGAAAACCGAGCAAGAGGGGACCTACTACAAGCCTGGCATTCCAATCAGAGCTGCCGTGCAGAAACTGGAAGGCGGCCAGTGGTCCTACCAGTTTAAGCAAGAAGGCCAGGTCCTGAAAGTGGGCAAGTACACCAAGCAGAAGAACACCCACACCAACGAGCTGAGGACACTGGCTGGCCTGGTCCAGAAAATCTGCAAAGAGGCCCTGGTCATTTGGGGCATCCTGCCTGTTCTGGAACTGCCCATTGAGCGGGAAGTGTGGGAACAGTGGTGGGCCGATTACTGGCAAGTGTCTTGGATCCCCGAGTGGGACTTCGTGTCTACCCCTCCTCTGCTGAAACTGTGGTACACCCTGACAAAAGAGCCCATTCCTAAAGAGGACGTCTACTACGTTGACGGCGCCTGCAACCGGAACTCCAAAGAAGGCAAGGCCGGCTACATCAGCCAGTACGGCAAGCAGAGAGTGGAAACCCTGGAAAACACCACCAACCAGCAGGCCGAGCTGACCGCCATTAAGATGGCCCTGGAAGATAGCGGCCCCAATGTGAACATCGTGACCGACTCTCAGTACGCCATGGGAATCCTGACAGCCCAGCCTACACAGAGCGATAGCCCTCTGGTTGAGCAGATCATTGCCCTGATGATTCAGAAGCAGCAAATCTACCTGCAGTGGGTGCCCGCTCACAAAGGCATCGGCGGAAACGAAGAGATCGATAAGCTGGTGTCCAAGGGAATCAGACGGGTGCTGTTCCTGGAAAAGATTGAAGAGGCCCAAGAGGAACACGAGCGCTACCACAACAACTGGAAGAATCTGGCCGACACCTACGGACTGCCCCAGATCGTGGCCAAAGAAATCGTGGCTATGTGCCCCAAGTGTCAGATCAAGGGCGAACCTGTGCACGGCCAAGTGGATGCTTCTCCTGGCACATGGCAGATGGACTGTACCCACCTGGAAGGCAAAGTGGTCATCGTGGCTGTGCACGTGGCCTCCGGCTTTATTGAGGCCGAAGTGATCCCCAGAGAGACAGGCAAAGAAACCGCCAAGTTCCTGCTGAAGATCCTGTCCAGATGGCCCATCACACAGCTGCACACCGACAACGGCCCTAACTTCACATCTCAAGAGGTGGCCGCCATCTGTTGGTGGGGAAAGATTGAGCACACAACCGGCATTCCCTACAATCCACAGAGCCAGGGCAGCATCGAGTCCATGAACAAGCAGCTCAAAGAGATTATCGGCAAGATCCGGGACGACTGCCAGTACACAGAAACAGCCGTGCTGATGGCCTGTCACATCCACAACTTCAAGCGGAAAGGCGGCATCGGAGGACAGACATCTGCCGAGAGACTGATCAATATCATCACCACTCAGCTGGAAATCCAGCACCTCCAGACCAAGATCCAGAAGATTCTGAACTTCCGGGTGTACTACCGCGAGGGCAGAGATCCTGTTTGGAAAGGCCCAGCACAGCTGATCTGGAAAGGCGAAGGTGCCGTG
GTGCTGAAGGATGGCTCTGATCTGAAGGTGGTGCCCAGACGGAAGGCCAAGATTATCAAGGATTACGAGCCCAAACAGCGCGTGGGCAATGAAGGCGACGTTGAGGGCACAAGAGGCAGCGACAATTGAAATTCACTCCTCAGGTGCAGGCTGCCTATCAGAAGGTGGTGGCTGGTGTGGCCAATGCCCTGGCTCACAAATACCACTGAGATCTTTTTCCCTCTGCCAAAAATTATGGGGACATCATGAAGCCCCTTGAGCATCTGACTTCTGGCTAATAAAGGAAATTTATTTTCATTGCAATAGTGTGTTGGAATTTTTTGTGTCTCTCACTCGGAAGGACATATGGGAGGGCAAATCATTTAAAACATCAGAATGAGTATTTGGTTTAGAGTTTGGCAACATATGCCCATATGCTGGCTGCCATGAACAAAGGTTGGCTATAAAGAGGTCATCAGTATATGAAACAGCCCCCTGCTGTCCATTCCTTATTCCATAGAAAAGCCTTGACTTGAGGTTAGATTTTTTTTATATTTTGTTTTGTGTTATTTTTTTCTTTAACATCCCTAAAATTTTCCTTACATGTTTTACTAGCCAGATTTTTCCTCCTCTCCTGACTACTCCCAGTCATAGCTGTCCCTCTTCTCTTATGGAGATCCCTCGACCTGCAGCCCAAGCTTGGCGTAATCATGGTCATAGCTGTTTCCTGTGTGAAATTGTTATCCGCTCACAATTCCACACAACATACGAGCCGGAAGCATAAAGTGTAAAGCCTGGGGTGCCTAATGAGTGAGCTAACTCACATTAATTGCGTTGCGCTCACTGCCCGCTTTCCAGTCGGGAAACCTGTCGTGCCAGCGGATCCGCATCTCAATTAGTCAGCAACCATAGTCCCGCCCCTAACTCCGCCCATCCCGCCCCTAACTCCGCCCAGTTCCGCCCATTCTCCGCCCCATGGCTGACTAATTTTTTTTATTTATGCAGAGGCCGAGGCCGCCTCGGCCTCTGAGCTATTCCAGAAGTAGTGAGGAGGCTTTTTTGGAGGCCTAGGCTTTTGCAAAAAGCTAACTTGTTTATTGCAGCTTATAATGGTTACAAATAAAGCAATAGCATCACAAATTTCACAAATAAAGCATTTTTTTCACTGCATTCTAGTTGTGGTTTGTCCAAACTCATCAATGTATCTTATCATGTCTGTCCGCTTCCTCGCTCACTGACTCGCTGCGCTCGGTCGTTCGGCTGCGGCGAGCGGTATCAGCTCACTCAAAGGCGGTAATACGGTTATCCACAGAATCAGGGGATAACGCAGGAAAGAACATGTGAGCAAAAGGCCAGCAAAAGGCCAGGAACCGTAAAAAGGCCGCGTTGCTGGCGTTTTTCCATAGGCTCCGCCCCCCTGACGAGCATCACAAAAATCGACGCTCAAGTCAGAGGTGGCGAAACCCGACAGGACTATAAAGATACCAGGCGTTTCCCCCTGGAAGCTCCCTCGTGCGCTCTCCTGTTCCGACCCTGCCGCTTACCGGATACCTGTCCGCCTTTCTCCCTTCGGGAAGCGTGGCGCTTTCTCATAGCTCACGCTGTAGGTATCTCAGTTCGGTGTAGGTCGTTCGCTCCAAGCTGGGCTGTGTGCACGAACCCCCCGTTCAGCCCGACCGCTGCGCCTTATCCGGTAACTATCGTCTTGAGTCCAACCCGGTAAGACACGACTTATCGCCACTGGCAGCAGCCACTGGTAACAGGATTAGCAGAGCGAGGTATGTAGGCGGTGCTACAGAGTTCTTGAAGTGGTGGCCTAACTACGGCTACACTAGAAGAACAGTATTTGGTATCTGCGCTCTGCTGAAGCCAGTTACCTTCGGAAAAAGAGTTGGTAGCTCTTGATCCGGCAAACAAACCACCGCTGGTAGCGGTGGTTTTTTTGTTTGCAAGCAGCAGATTACGCGCAGAAAAAAAGGATCTCAAGAAGATCCTTTGATCTTTTCTACGGGGTCTGACGC
TCAGTGGAACGAAAACTCACGTTAAGGGATTTTGGTCATGAGATTATCAAAAAGGATCTTCACCTAGATCCTTTTAAATTAAAAATGAAGTTTTAAATCAATCTAAAGTATATATGAGTAAACTTGGTCTGACAGTTAGAAAAACTCATCGAGCATCAAATGAAACTGCAATTTATTCATATCAGGATTATCAATACCATATTTTTGAAAAAGCCGTTTCTGTAATGAAGGAGAAAACTCACCGAGGCAGTTCCATAGGATGGCAAGATCCTGGTATCGGTCTGCGATTCCGACTCGTCCAACATCAATACAACCTATTAATTTCCCCTCGTCAAAAATAAGGTTATCAAGTGAGAAATCACCATGAGTGACGACTGAATCCGGTGAGAATGGCAACAGCTTATGCATTTCTTTCCAGACTTGTTCAACAGGCCAGCCATTACGCTCGTCATCAAAATCACTCGCATCAACCAAACCGTTATTCATTCGTGATTGCGCCTGAGCGAGACGAAATACGCGATCGCTGTTAAAAGGACAATTACAAACAGGAATCGAATGCAACCGGCGCAGGAACACTGCCAGCGCATCAACAATATTTTCACCTGAATCAGGATATTCTTCTAATACCTGGAATGCTGTTTTTCCGGGGATCGCAGTGGTGAGTAACCATGCATCATCAGGAGTACGGATAAAATGCTTGATGGTCGGAAGAGGCATAAATTCCGTCAGCCAGTTTAGTCTGACCATCTCATCTGTAACATCATTGGCAACGCTACCTTTGCCATGTTTCAGAAACAACTCTGGCGCATCGGGCTTCCCATACAATCGATAGATTGTCGCACCTGATTGCCCGACATTATCGCGAGCCCATTTATACCCATATAAATCAGCATCCATGTTGGAATTTAATCGCGGCCTAGAGCAAGACGTTTCCCGTTGAATATGGCTCATAACACCCCTTGTATTACTGTTTATGTAAGCAGACAGTTTTATTGTTCATGATGATATATTTTTATCTTGTGCAATGTAACATCAGAGATTTTGAGACACAACAATTGGTCGAC
SEQ ID NO: 6 Plasmid as defined in Figure 2d (pDNA2b pGM299)
Length: 3384; Molecule type: DNA; Feature location/qualifiers: source, 1..3384; mol_type, other DNA; note, pGM299; organism, synthetic construct
TCAATATTGGCCATTAGCCATATTATTCATTGGTTATATAGCATAAATCAATATTGGCTATTGGCCATTGCATACGTTGTATCTATATCATAATATGTACATTTATATTGGCTCATGTCCAATATGACCGCCATGTTGGCATTGATTATTGACTAGTTATTAATAGTAATCAATTACGGGGTCATTAGTTCATAGCCCATATATGGAGTTCCGCGTTACATAACTTACGGTAAATGGCCCGCCTGGCTGACCGCCCAACGACCCCCGCCCATTGACGTCAATAATGACGTATGTTCCCATAGTAACGCCAATAGGGACTTTCCATTGACGTCAATGGGTGGAGTATTTACGGTAAACTGCCCACTTGGCAGTACATCAAGTGTATCATATGCCAAGTCCGCCCCCTATTGACGTCAATGACGGTAAATGGCCCGCCTGGCATTATGCCCAGTACATGACCTTACGGGACTTTCCTACTTGGCAGTACATCTACGTATTAGTCATCGCTATTACCATGGTGATGCGGTTTTGGCAGTACACCAATGGGCGTGGATAGCGGTTTGACTCACGGGGATTTCCAAGTCTCCACCCCATTGACGTCAATGGGAGTTTGTTTTGGCACCAAAATCAACGGGACTTTCCAAAATGTCGTAATAACCCCGCCCCGTTGACGCAAATGGGCGGTAGGCGTGTACGGTGGGAGGTCTATATAAGCAGAGCTCGTTTAGTGAACCGTCAGATCACTAGAAGCTTTATTGCGGTAGTTTATCACAGTTAAATTGCTAACGCAGTCAGTGCTTCTGACACAACAGTCTCGAACTTAAGCTGCAGAAGTTGGTCGTGAGGCACTGGGCAGGTAAGTATCAAGGTTACAAGACAGGTTTAAGGAGACCAATAGAAACTGGGCTTGTCGAGACAGAGAAGACTCTTGCGTTTCTGATAGGCACCTATTGGTCTTACTGACATCCACTTTGCCTTTCTCTCCACAGGTGTCCACTCCCAGTTCAATTACAGCTCTTAAGGCTAGAGTACTTAATACGACTCACTATAGGCTAGCCTCGAGAATTCGATTATGCCCCTAGGACCAGAAGAAAGAAGATTGCTTCGCTTGATTTGGCTCCTTTACAGCACCAATCCATATCCACCAAGTGGGGAAGGGACGGCCAGACAACGCCGACGAGCCAGGAGAAGGTGGAGACAACAGCAGGATCAAATTAGAGTCTTGGTAGAAAGACTCCAAGAGCAGGTGTATGCAGTTGACCGCCTGGCTGACGAGGCTCAACACTTGGCTATACAACAGTTGCCTGACCCTCCTCATTCAGCTTAGAATCACTAGTGAATTCACGCGTGGTACCTCTAGAGTCGACCCGGGCGGCCGCTTCGAGCAGACATGATAAGATACATTGATGAGTTTGGACAAACCACAACTAGAATGCAGTGAAAAAAATGCTTTATTTGTGAAATTTGTGATGCTATTGCTTTATTTGTAACCATTATAAGCTGCAATAAACAAGTTAACAACAACAATTGCATTCATTTTATGTTTCAGGTTCAGGGGGAGATGTGGGAGGTTTTTTAAAGCAAGTAAAACCTCTACAAATGTGGTAAAATCGATAAGGATCCGTCGACCAATTGTTGTGTCTCAAAATCTCTGATGTTACATTGCACAAGATAAAAATATATCATCATGAACAATAAAACTGTCTGCTTACATAAACAGTAATACAAGGGGTGTTATGAGCCATATTCAACGGGAAACGTCTTGCTCTAGGCCGCGATTAAATTCCAACATGGATGCTGATTTATATGGGTATAAATGGGCTCGCGATAATGTCGGGCAATCAGGTGCGACAATCTATCGATTGTATGGGAAGCCCGATGCGCCAGAGTTGTTTCTGAAACATGGCAAAGGTAGCGTTGCCAATGATGTTACAGATGAGATGGTCAGACTAAACTGGCTGACGGAATTTATGCCTCTTCCGACCATCAAGCATTTTATC
CGTACTCCTGATGATGCATGGTTACTCACCACTGCGATCCCCGGAAAAACAGCATTCCAGGTATTAGAAGAATATCCTGATTCAGGTGAAAATATTGTTGATGCGCTGGCAGTGTTCCTGCGCCGGTTGCATTCGATTCCTGTTTGTAATTGTCCTTTTAACAGCGATCGCGTATTTCGTCTCGCTCAGGCGCAATCACGAATGAATAACGGTTTGGTTGATGCGAGTGATTTTGATGACGAGCGTAATGGCTGGCCTGTTGAACAAGTCTGGAAAGAAATGCATAAGCTGTTGCCATTCTCACCGGATTCAGTCGTCACTCATGGTGATTTCTCACTTGATAACCTTATTTTTGACGAGGGGAAATTAATAGGTTGTATTGATGTTGGACGAGTCGGAATCGCAGACCGATACCAGGATCTTGCCATCCTATGGAACTGCCTCGGTGAGTTTTCTCCTTCATTACAGAAACGGCTTTTTCAAAAATATGGTATTGATAATCCTGATATGAATAAATTGCAGTTTCATTTGATGCTCGATGAGTTTTTCTAACTGTCAGACCAAGTTTACTCATATATACTTTAGATTGATTTAAAACTTCATTTTTAATTTAAAAGGATCTAGGTGAAGATCCTTTTTGATAATCTCATGACCAAAATCCCTTAACGTGAGTTTTCGTTCCACTGAGCGTCAGACCCCGTAGAAAAGATCAAAGGATCTTCTTGAGATCCTTTTTTTCTGCGCGTAATCTGCTGCTTGCAAACAAAAAAACCACCGCTACCAGCGGTGGTTTGTTTGCCGGATCAAGAGCTACCAACTCTTTTTCCGAAGGTAACTGGCTTCAGCAGAGCGCAGATACCAAATACTGTTCTTCTAGTGTAGCCGTAGTTAGGCCACCACTTCAAGAACTCTGTAGCACCGCCTACATACCTCGCTCTGCTAATCCTGTTACCAGTGGCTGCTGCCAGTGGCGATAAGTCGTGTCTTACCGGGTTGGACTCAAGACGATAGTTACCGGATAAGGCGCAGCGGTCGGGCTGAACGGGGGGTTCGTGCACACAGCCCAGCTTGGAGCGAACGACCTACACCGAACTGAGATACCTACAGCGTGAGCTATGAGAAAGCGCCACGCTTCCCGAAGGGAGAAAGGCGGACAGGTATCCGGTAAGCGGCAGGGTCGGAACAGGAGAGCGCACGAGGGAGCTTCCAGGGGGAAACGCCTGGTATCTTTATAGTCCTGTCGGGTTTCGCCACCTCTGACTTGAGCGTCGATTTTTGTGATGCTCGTCAGGGGGGCGGAGCCTATGGAAAAACGCCAGCAACGCGGCCTTTTTACGGTTCCTGGCCTTTTGCTGGCCTTTTGCTCACATGGCTCGACAGATCT
SEQ ID NO: 7 Plasmid as defined in Figure 2e (pDNA3a pGM301)
Length: 6264; Molecule type: DNA; Feature location/qualifiers: source, 1..6264; mol_type, other DNA; note, pGM301; organism, synthetic construct
ATTGATTATTGACTAGTTATTAATAGTAATCAATTACGGGGTCATTAGTTCATAGCCCATATATGGAGTTCCGCGTTACATAACTTACGGTAAATGGCCCGCCTGGCTGACCGCCCAACGACCCCCGCCCATTGACGTCAATAATGACGTATGTTCCCATAGTAACGCCAATAGGGACTTTCCATTGACGTCAATGGGTGGAGTATTTACGGTAAACTGCCCACTTGGCAGTACATCAAGTGTATCATATGCCAAGTACGCCCCCTATTGACGTCAATGACGGTAAATGGCCCGCCTGGCATTATGCCCAGTACATGACCTTATGGGACTTTCCTACTTGGCAGTACATCTACGTATTAGTCATCGCTATTACCATGGTCGAGGTGAGCCCCACGTTCTGCTTCACTCTCCCCATCTCCCCCCCCTCCCCACCCCCAATTTTGTATTTATTTATTTTTTAATTATTTTGTGCAGCGATGGGGGCGGGGGGGGGGGGGGGGCGCGCGCCAGGCGGGGCGGGGCGGGGCGAGGGGCGGGGCGGGGCGAGGCGGAGAGGTGCGGCGGCAGCCAATCAGAGCGGCGCGCTCCGAAAGTTTCCTTTTATGGCGAGGCGGCGGCGGCGGCGGCCCTATAAAAAGCGAAGCGCGCGGCGGGCGGGAGTCGCTGCGCGCTGCCTTCGCCCCGTGCCCCGCTCCGCCGCCGCCTCGCGCCGCCCGCCCCGGCTCTGACTGACCGCGTTACTCCCACAGGTGAGCGGGCGGGACGGCCCTTCTCCTCCGGGCTGTAATTAGCGCTTGGTTTAATGACGGCTTGTTTCTTTTCTGTGGCTGCGTGAAAGCCTTGAGGGGCTCCGGGAGGGCCCTTTGTGCGGGGGGAGCGGCTCGGGGGGTGCGTGCGTGTGTGTGTGCGTGGGGAGCGCCGCGTGCGGCTCCGCGCTGCCCGGCGGCTGTGAGCGCTGCGGGCGCGGCGCGGGGCTTTGTGCGCTCCGCAGTGTGCGCGAGGGGAGCGCGGCCGGGGGCGGTGCCCCGCGGTGCGGGGGGGGCTGCGAGGGGAACAAAGGCTGCGTGCGGGGTGTGTGCGTGGGGGGGTGAGCAGGGGGTGTGGGCGCGTCGGTCGGGCTGCAACCCCCCCTGCACCCCCCTCCCCGAGTTGCTGAGCACGGCCCGGCTTCGGGTGCGGGGCTCCGTACGGGGCGTGGCGCGGGGCTCGCCGTGCCGGGCGGGGGGTGGCGGCAGGTGGGGGTGCCGGGCGGGGCGGGGCCGCCTCGGGCCGGGGAGGGCTCGGGGGAGGGGCGCGGCGGCCCCCGGAGCGCCGGCGGCTGTCGAGGCGCGGCGAGCCGCAGCCATTGCCTTTTATGGTAATCGTGCGAGAGGGCGCAGGGACTTCCTTTGTCCCAAATCTGTGCGGAGCCGAAATCTGGGAGGCGCCGCCGCACCCCCTCTAGCGGGCGCGGGGCGAAGCGGTGCGGCGCCGGCAGGAAGGAAATGGGCGGGGAGGGCCTTCGTGCGTCGCCGCGCCGCCGTCCCCTTCTCCCTCTCCAGCCTCGGGGCTGTCCGCGGGGGGACGGCTGCCTTCGGGGGGGACGGGGCAGGGCGGGGTTCGGCTTCTGGCGTGTGACCGGCGGCTCTAGAGCCTCTGCTAACCATGTTCATGCCTTCTTCTTTTTCCTACAGCTCCTGGGCAACGTGCTGGTTATTGTGCTGTCTCATCATTTTGGCAAAGAATTCGATTGCCATGGCAACATATATCCAGAGAGTACAGTGCATCTCAACATCACTACTGGTTGTTCTCACCACATTGGTCTCGTGTCAGATTCCCAGGGATAGGCTCTCTAACATAGGGGTCATAGTCGATGAAGGGAAATCACTGAAGATAGCTGGATCCCACGAATCGAGGTACATAGTACTGAGTCTAGTTCCGGGGGTAGACTTTGAGAATGGGTGCGGAACAGCCCAGGTTATCCAGTACAAGAGCCTACTGAACAGGCTGTTAATCCCATT
GAGGGATGCCTTAGATCTTCAGGAGGCTCTGATAACTGTCACCAATGATACGACACAAAATGCCGGTGCTCCCCAGTCGAGATTCTTCGGTGCTGTGATTGGTACTATCGCACTTGGAGTGGCGACATCAGCACAAATCACCGCAGGGATTGCACTAGCCGAAGCGAGGGAGGCCAAAAGAGACATAGCGCTCATCAAAGAATCGATGACAAAAACACACAAGTCTATAGAACTGCTGCAAAACGCTGTGGGGGAACAAATTCTTGCTCTAAAGACACTCCAGGATTTCGTGAATGATGAGATCAAACCCGCAATAAGCGAATTAGGCTGTGAGACTGCTGCCTTAAGACTGGGTATAAAATTGACACAGCATTACTCCGAGCTGTTAACTGCGTTCGGCTCGAATTTCGGAACCATCGGAGAGAAGAGCCTCACGCTGCAGGCGCTGTCTTCACTTTACTCTGCTAACATTACTGAGATTATGACCACAATCAGGACAGGGCAGTCTAACATCTATGATGTCATTTATACAGAACAGATCAAAGGAACGGTGATAGATGTGGATCTAGAGAGATACATGGTCACCCTGTCTGTGAAGATCCCTATTCTTTCTGAAGTCCCAGGTGTGCTCATACACAAGGCATCATCTATTTCTTACAACATAGACGGGGAGGAATGGTATGTGACTGTCCCCAGCCATATACTCAGTCGTGCTTCTTTCTTAGGGGGTGCAGACATAACCGATTGTGTTGAGTCCAGATTGACCTATATATGCCCCAGGGATCCCGCACAACTGATACCTGACAGCCAGCAAAAGTGTATCCTGGGGGACACAACAAGGTGTCCTGTCACAAAAGTTGTGGACAGCCTTATCCCCAAGTTTGCTTTTGTGAATGGGGGCGTTGTTGCTAACTGCATAGCATCCACATGTACCTGCGGGACAGGCCGAAGACCAATCAGTCAGGATCGCTCTAAAGGTGTAGTATTCCTAACCCATGACAACTGTGGTCTTATAGGTGTCAATGGGGTAGAATTGTATGCTAACCGGAGAGGGCACGATGCCACTTGGGGGGTCCAGAACTTGACAGTCGGTCCTGCAATTGCTATCAGACCCGTTGATATTTCTCTCAACCTTGCTGATGCTACGAATTTCTTGCAAGACTCTAAGGCTGAGCTTGAGAAAGCACGGAAAATCCTCTCGGAGGTAGGTAGATGGTACAACTCAAGAGAGACTGTGATTACGATCATAGTAGTTATGGTCGTAATATTGGTGGTCATTATAGTGATCATCATCGTGCTTTATAGACTCAGAAGGTGAAATCACTAGTGAATTCACTCCTCAGGTGCAGGCTGCCTATCAGAAGGTGGTGGCTGGTGTGGCCAATGCCCTGGCTCACAAATACCACTGAGATCTTTTTCCCTCTGCCAAAAATTATGGGGACATCATGAAGCCCCTTGAGCATCTGACTTCTGGCTAATAAAGGAAATTTATTTTCATTGCAATAGTGTGTTGGAATTTTTTGTGTCTCTCACTCGGAAGGACATATGGGAGGGCAAATCATTTAAAACATCAGAATGAGTATTTGGTTTAGAGTTTGGCAACATATGCCCATATGCTGGCTGCCATGAACAAAGGTTGGCTATAAAGAGGTCATCAGTATATGAAACAGCCCCCTGCTGTCCATTCCTTATTCCATAGAAAAGCCTTGACTTGAGGTTAGATTTTTTTTATATTTTGTTTTGTGTTATTTTTTTCTTTAACATCCCTAAAATTTTCCTTACATGTTTTACTAGCCAGATTTTTCCTCCTCTCCTGACTACTCCCAGTCATAGCTGTCCCTCTTCTCTTATGGAGATCCCTCGACCTGCAGCCCAAGCTTGGCGTAATCATGGTCATAGCTGTTTCCTGTGTGAAATTGTTATCCGCTCACAATTCCACACAACATACGAGCCGGAAGCATAAAGTGTAAAGCCTGGGGTGCCTAATGAGTGAGCTAACT
CACATTAATTGCGTTGCGCTCACTGCCCGCTTTCCAGTCGGGAAACCTGTCGTGCCAGCGGATCCGCATCTCAATTAGTCAGCAACCATAGTCCCGCCCCTAACTCCGCCCATCCCGCCCCTAACTCCGCCCAGTTCCGCCCATTCTCCGCCCCATGGCTGACTAATTTTTTTTATTTATGCAGAGGCCGAGGCCGCCTCGGCCTCTGAGCTATTCCAGAAGTAGTGAGGAGGCTTTTTTGGAGGCCTAGGCTTTTGCAAAAAGCTAACTTGTTTATTGCAGCTTATAATGGTTACAAATAAAGCAATAGCATCACAAATTTCACAAATAAAGCATTTTTTTCACTGCATTCTAGTTGTGGTTTGTCCAAACTCATCAATGTATCTTATCATGTCTGTCCGCTTCCTCGCTCACTGACTCGCTGCGCTCGGTCGTTCGGCTGCGGCGAGCGGTATCAGCTCACTCAAAGGCGGTAATACGGTTATCCACAGAATCAGGGGATAACGCAGGAAAGAACATGTGAGCAAAAGGCCAGCAAAAGGCCAGGAACCGTAAAAAGGCCGCGTTGCTGGCGTTTTTCCATAGGCTCCGCCCCCCTGACGAGCATCACAAAAATCGACGCTCAAGTCAGAGGTGGCGAAACCCGACAGGACTATAAAGATACCAGGCGTTTCCCCCTGGAAGCTCCCTCGTGCGCTCTCCTGTTCCGACCCTGCCGCTTACCGGATACCTGTCCGCCTTTCTCCCTTCGGGAAGCGTGGCGCTTTCTCATAGCTCACGCTGTAGGTATCTCAGTTCGGTGTAGGTCGTTCGCTCCAAGCTGGGCTGTGTGCACGAACCCCCCGTTCAGCCCGACCGCTGCGCCTTATCCGGTAACTATCGTCTTGAGTCCAACCCGGTAAGACACGACTTATCGCCACTGGCAGCAGCCACTGGTAACAGGATTAGCAGAGCGAGGTATGTAGGCGGTGCTACAGAGTTCTTGAAGTGGTGGCCTAACTACGGCTACACTAGAAGAACAGTATTTGGTATCTGCGCTCTGCTGAAGCCAGTTACCTTCGGAAAAAGAGTTGGTAGCTCTTGATCCGGCAAACAAACCACCGCTGGTAGCGGTGGTTTTTTTGTTTGCAAGCAGCAGATTACGCGCAGAAAAAAAGGATCTCAAGAAGATCCTTTGATCTTTTCTACGGGGTCTGACGCTCAGTGGAACGAAAACTCACGTTAAGGGATTTTGGTCATGAGATTATCAAAAAGGATCTTCACCTAGATCCTTTTAAATTAAAAATGAAGTTTTAAATCAATCTAAAGTATATATGAGTAAACTTGGTCTGACAGTTAGAAAAACTCATCGAGCATCAAATGAAACTGCAATTTATTCATATCAGGATTATCAATACCATATTTTTGAAAAAGCCGTTTCTGTAATGAAGGAGAAAACTCACCGAGGCAGTTCCATAGGATGGCAAGATCCTGGTATCGGTCTGCGATTCCGACTCGTCCAACATCAATACAACCTATTAATTTCCCCTCGTCAAAAATAAGGTTATCAAGTGAGAAATCACCATGAGTGACGACTGAATCCGGTGAGAATGGCAACAGCTTATGCATTTCTTTCCAGACTTGTTCAACAGGCCAGCCATTACGCTCGTCATCAAAATCACTCGCATCAACCAAACCGTTATTCATTCGTGATTGCGCCTGAGCGAGACGAAATACGCGATCGCTGTTAAAAGGACAATTACAAACAGGAATCGAATGCAACCGGCGCAGGAACACTGCCAGCGCATCAACAATATTTTCACCTGAATCAGGATATTCTTCTAATACCTGGAATGCTGTTTTTCCGGGGATCGCAGTGGTGAGTAACCATGCATCATCAGGAGTACGGATAAAATGCTTGATGGTCGGAAGAGGCATAAATTCCGTCAGCCAGTTTAGTCTGACCATCTCATCTGTAACATCATTGGCAACGCTACCTTTGCCATGTTTCAGAAACAACT
CTGGCGCATCGGGCTTCCCATACAATCGATAGATTGTCGCACCTGATTGCCCGACATTATCGCGAGCCCATTTATACCCATATAAATCAGCATCCATGTTGGAATTTAATCGCGGCCTAGAGCAAGACGTTTCCCGTTGAATATGGCTCATAACACCCCTTGTATTACTGTTTATGTAAGCAGACAGTTTTATTGTTCATGATGATATATTTTTATCTTGTGCAATGTAACATCAGAGATTTTGAGACACAACAATTGGTCGAC
SEQ ID NO: 8 Plasmid as defined in Figure 2f (pDNA3b pGM303)
Length: 6522; Molecule type: DNA; Feature location/qualifiers: source, 1..6522; mol_type, other DNA; note, pGM303; organism, synthetic construct
ATTGATTATTGACTAGTTATTAATAGTAATCAATTACGGGGTCATTAGTTCATAGCCCATATATGGAGTTCCGCGTTACATAACTTACGGTAAATGGCCCGCCTGGCTGACCGCCCAACGACCCCCGCCCATTGACGTCAATAATGACGTATGTTCCCATAGTAACGCCAATAGGGACTTTCCATTGACGTCAATGGGTGGAGTATTTACGGTAAACTGCCCACTTGGCAGTACATCAAGTGTATCATATGCCAAGTACGCCCCCTATTGACGTCAATGACGGTAAATGGCCCGCCTGGCATTATGCCCAGTACATGACCTTATGGGACTTTCCTACTTGGCAGTACATCTACGTATTAGTCATCGCTATTACCATGGTCGAGGTGAGCCCCACGTTCTGCTTCACTCTCCCCATCTCCCCCCCCTCCCCACCCCCAATTTTGTATTTATTTATTTTTTAATTATTTTGTGCAGCGATGGGGGCGGGGGGGGGGGGGGGGCGCGCGCCAGGCGGGGCGGGGCGGGGCGAGGGGCGGGGCGGGGCGAGGCGGAGAGGTGCGGCGGCAGCCAATCAGAGCGGCGCGCTCCGAAAGTTTCCTTTTATGGCGAGGCGGCGGCGGCGGCGGCCCTATAAAAAGCGAAGCGCGCGGCGGGCGGGAGTCGCTGCGCGCTGCCTTCGCCCCGTGCCCCGCTCCGCCGCCGCCTCGCGCCGCCCGCCCCGGCTCTGACTGACCGCGTTACTCCCACAGGTGAGCGGGCGGGACGGCCCTTCTCCTCCGGGCTGTAATTAGCGCTTGGTTTAATGACGGCTTGTTTCTTTTCTGTGGCTGCGTGAAAGCCTTGAGGGGCTCCGGGAGGGCCCTTTGTGCGGGGGGAGCGGCTCGGGGGGTGCGTGCGTGTGTGTGTGCGTGGGGAGCGCCGCGTGCGGCTCCGCGCTGCCCGGCGGCTGTGAGCGCTGCGGGCGCGGCGCGGGGCTTTGTGCGCTCCGCAGTGTGCGCGAGGGGAGCGCGGCCGGGGGCGGTGCCCCGCGGTGCGGGGGGGGCTGCGAGGGGAACAAAGGCTGCGTGCGGGGTGTGTGCGTGGGGGGGTGAGCAGGGGGTGTGGGCGCGTCGGTCGGGCTGCAACCCCCCCTGCACCCCCCTCCCCGAGTTGCTGAGCACGGCCCGGCTTCGGGTGCGGGGCTCCGTACGGGGCGTGGCGCGGGGCTCGCCGTGCCGGGCGGGGGGTGGCGGCAGGTGGGGGTGCCGGGCGGGGCGGGGCCGCCTCGGGCCGGGGAGGGCTCGGGGGAGGGGCGCGGCGGCCCCCGGAGCGCCGGCGGCTGTCGAGGCGCGGCGAGCCGCAGCCATTGCCTTTTATGGTAATCGTGCGAGAGGGCGCAGGGACTTCCTTTGTCCCAAATCTGTGCGGAGCCGAAATCTGGGAGGCGCCGCCGCACCCCCTCTAGCGGGCGCGGGGCGAAGCGGTGCGGCGCCGGCAGGAAGGAAATGGGCGGGGAGGGCCTTCGTGCGTCGCCGCGCCGCCGTCCCCTTCTCCCTCTCCAGCCTCGGGGCTGTCCGCGGGGGGACGGGGCAGGGCGGGGTTCGGCTTCTGGCGTGTGACCGGCGGCTCTAGAGCCTCTGCTAACCATGTTCATGCCTTCTTCTTTTTCCTACAGCTCCTGGGCAACGTGCTGGTTATTGTGCTGTCTCATCATTTTGGCAAAGAATTCCTCGAGCATGTGGTCTGAGTTAAAAATCAGGAGCAACGACGGAGGTGAAGGACCAGAGGACGCCAACGACCCCCGGGGAAAGGGGGTGCAACACATCCATATCCAGCCATCTCTACCTGTTTATGGACAGAGGGTTAGGGATGGTGATAGGGGCAAACGTGACTCGTACTGGTCTACTTCTCCTAGTGGTAGCACCACAAAACCAGCATCAGGTTGGGAGAGGTCAAGTAAAGCCGACACATGGTTGCTGATTCTCTCATTCACCCAGTGGGCTTTGTCAATTGCCACAGTG
ATCATCTGTATCATAATTTCTGCTAGACAAGGGTATAGTATGAAAGAGTACTCAATGACTGTAGAGGCATTGAACATGAGCAGCAGGGAGGTGAAAGAGTCACTTACCAGTCTAATAAGGCAAGAGGTTATAGCAAGGGCTGTCAACATTCAGAGCTCTGTGCAAACCGGAATCCCAGTCTTGTTGAACAAAAACAGCAGGGATGTCATCCAGATGATTGATAAGTCGTGCAGCAGACAAGAGCTCACTCAGCACTGTGAGAGTACGATCGCAGTCCACCATGCCGATGGAATTGCCCCACTTGAGCCACATAGTTTCTGGAGATGCCCTGTCGGAGAACCGTATCTTAGCTCAGATCCTGAAATCTCATTGCTGCCTGGTCCGAGCTTGTTATCTGGTTCTACAACGATCTCTGGATGTGTTAGGCTCCCTTCACTCTCAATTGGCGAGGCAATCTATGCCTATTCATCAAATCTCATTACACAAGGTTGTGCTGACATAGGGAAATCATATCAGGTCCTGCAGCTAGGGTACATATCACTCAATTCAGATATGTTCCCTGATCTTAACCCCGTAGTGTCCCACACTTATGACATCAACGACAATCGGAAATCATGCTCTGTGGTGGCAACCGGGACTAGGGGTTATCAGCTTTGCTCCATGCCGACTGTAGACGAAAGAACCGACTACTCTAGTGATGGTATTGAGGATCTGGTCCTTGATGTCCTGGATCTCAAAGGGAGAACTAAGTCTCACCGGTATCGCAACAGCGAGGTAGATCTTGATCACCCGTTCTCTGCACTATACCCCAGTGTAGGCAACGGCATTGCAACAGAAGGCTCATTGATATTTCTTGGGTATGGTGGACTAACCACCCCTCTGCAGGGTGATACAAAATGTAGGACCCAAGGATGCCAACAGGTGTCGCAAGACACATGCAATGAGGCTCTGAAAATTACATGGCTAGGAGGGAAACAGGTGGTCAGCGTGATCATCCAGGTCAATGACTATCTCTCAGAGAGGCCAAAGATAAGAGTCACAACCATTCCAATCACTCAAAACTATCTCGGGGCGGAAGGTAGATTATTAAAATTGGGTGATCGGGTGTACATCTATACAAGATCATCAGGCTGGCACTCTCAACTGCAGATAGGAGTACTTGATGTCAGCCACCCTTTGACTATCAACTGGACACCTCATGAAGCCTTGTCTAGACCAGGAAATAAAGAGTGCAATTGGTACAATAAGTGTCCGAAGGAATGCATATCAGGCGTATACACTGATGCTTATCCATTGTCCCCTGATGCAGCTAACGTCGCTACCGTCACGCTATATGCCAATACATCGCGTGTCAACCCAACAATCATGTATTCTAACACTACTAACATTATAAATATGTTAAGGATAAAGGATGTTCAATTAGAGGCTGCATATACCACGACATCGTGTATCACGCATTTTGGTAAAGGCTACTGCTTTCACATCATCGAGATCAATCAGAAGAGCCTGAATACCTTACAGCCGATGCTCTTTAAGACTAGCATCCCTAAATTATGCAAGGCCGAGTCTTAAGCGGCCGCGCATGCGAATTCACTCCTCAGGTGCAGGCTGCCTATCAGAAGGTGGTGGCTGGTGTGGCCAATGCCCTGGCTCACAAATACCACTGAGATCTTTTTCCCTCTGCCAAAAATTATGGGGACATCATGAAGCCCCTTGAGCATCTGACTTCTGGCTAATAAAGGAAATTTATTTTCATTGCAATAGTGTGTTGGAATTTTTTGTGTCTCTCACTCGGAAGGACATATGGGAGGGCAAATCATTTAAAACATCAGAATGAGTATTTGGTTTAGAGTTTGGCAACATATGCCCATATGCTGGCTGCCATGAACAAAGGTTGGCTATAAAGAGGTCATCAGTATATGAAACAGCCCCCTGCTGTCTATTCCTTATTCCATAGAAAAGCCTTGACTTGAGGTTAGATTTTTTTTATATTTTGTTTT
GTGTTATTTTTTTCTTTAACATCCCTAAAATTTTCCTTACATGTTTTACTAGCCAGATTTTTCCTCCTCTCCTGACTACTCCCAGTCATAGCTGTCCCTCTTCTCTTATGGAGATCCCTCGACCTGCAGCCCAAGCTTGGCGTAATCATGGTCATAGCTGTTTCCTGTGTGAAATTGTTATCCGCTCACAATTCCACACAACATACGAGCCGGAAGCATAAAGTGTAAAGCCTGGGGTGCCTAATGAGTGAGCTAACTCACATTAATTGCGTTGCGCTCACTGCCCGCTTTCCAGTCGGGAAACCTGTCGTGCCAGCGGATCCGCATCTCAATTAGTCAGCAACCATAGTCCCGCCCCTAACTCCGCCCATCCCGCCCCTAACTCCGCCCAGTTCCGCCCATTCTCCGCCCCATGGCTGACTAATTTTTTTTATTTATGCAGAGGCCGAGGCCGCCTCGGCCTCTGAGCTATTCCAGAAGTAGTGAGGAGGCTTTTTTGGAGGCCTAGGCTTTTGCAAAAAGCTAACTTGTTTATTGCAGCTTATAATGGTTACAAATAAAGCAATAGCATCACAAATTTCACAAATAAAGCATTTTTTTCACTGCATTCTAGTTGTGGTTTGTCCAAACTCATCAATGTATCTTATCATGTCTGTCCGCTTCCTCGCTCACTGACTCGCTGCGCTCGGTCGTTCGGCTGCGGCGAGCGGTATCAGCTCACTCAAAGGCGGTAATACGGTTATCCACAGAATCAGGGGATAACGCAGGAAAGAACATGTGAGCAAAAGGCCAGCAAAAGGCCAGGAACCGTAAAAAGGCCGCGTTGCTGGCGTTTTTCCATAGGCTCCGCCCCCCTGACGAGCATCACAAAAATCGACGCTCAAGTCAGAGGTGGCGAAACCCGACAGGACTATAAAGATACCAGGCGTTTCCCCCTGGAAGCTCCCTCGTGCGCTCTCCTGTTCCGACCCTGCCGCTTACCGGATACCTGTCCGCCTTTCTCCCTTCGGGAAGCGTGGCGCTTTCTCATAGCTCACGCTGTAGGTATCTCAGTTCGGTGTAGGTCGTTCGCTCCAAGCTGGGCTGTGTGCACGAACCCCCCGTTCAGCCCGACCGCTGCGCCTTATCCGGTAACTATCGTCTTGAGTCCAACCCGGTAAGACACGACTTATCGCCACTGGCAGCAGCCACTGGTAACAGGATTAGCAGAGCGAGGTATGTAGGCGGTGCTACAGAGTTCTTGAAGTGGTGGCCTAACTACGGCTACACTAGAAGAACAGTATTTGGTATCTGCGCTCTGCTGAAGCCAGTTACCTTCGGAAAAAGAGTTGGTAGCTCTTGATCCGGCAAACAAACCACCGCTGGTAGCGGTGGTTTTTTTGTTTGCAAGCAGCAGATTACGCGCAGAAAAAAAGGATCTCAAGAAGATCCTTTGATCTTTTCTACGGGGTCTGACGCTCAGTGGAACGAAAACTCACGTTAAGGGATTTTGGTCATGAGATTATCAAAAAGGATCTTCACCTAGATCCTTTTAAATTAAAAATGAAGTTTTAAATCAATCTAAAGTATATATGAGTAAACTTGGTCTGACAGTTAGAAAAACTCATCGAGCATCAAATGAAACTGCAATTTATTCATATCAGGATTATCAATACCATATTTTTGAAAAAGCCGTTTCTGTAATGAAGGAGAAAACTCACCGAGGCAGTTCCATAGGATGGCAAGATCCTGGTATCGGTCTGCGATTCCGACTCGTCCAACATCAATACAACCTATTAATTTCCCCTCGTCAAAAATAAGGTTATCAAGTGAGAAATCACCATGAGTGACGACTGAATCCGGTGAGAATGGCAACAGCTTATGCATTTCTTTCCAGACTTGTTCAACAGGCCAGCCATTACGCTCGTCATCAAAATCACTCGCATCAACCAAACCGTTATTCATTCGTGATTGCGCCTGAGCGAGACGAAATACGCGATCGCTGTTAAAAGGACAATTAC
AAACAGGAATCGAATGCAACCGGCGCAGGAACACTGCCAGCGCATCAACAATATTTTCACCTGAATCAGGATATTCTTCTAATACCTGGAATGCTGTTTTTCCGGGGATCGCAGTGGTGAGTAACCATGCATCATCAGGAGTACGGATAAAATGCTTGATGGTCGGAAGAGGCATAAATTCCGTCAGCCAGTTTAGTCTGACCATCTCATCTGTAACATCATTGGCAACGCTACCTTTGCCATGTTTCAGAAACAACTCTGGCGCATCGGGCTTCCCATACAATCGATAGATTGTCGCACCTGATTGCCCGACATTATCGCGAGCCCATTTATACCCATATAAATCAGCATCCATGTTGGAATTTAATCGCGGCCTAGAGCAAGACGTTTCCCGTTGAATATGGCTCATAACACCCCTTGTATTACTGTTTATGTAAGCAGACAGTTTTATTGTTCATGATGATATATTTTTATCTTGTGCAATGTAACATCAGAGATTTTGAGACACAACAATTGGTCGAC
SEQ ID NO: 9 Plasmid as defined in Figure 2g (pDNA2a pGM297)
Length: 9886; Molecule type: DNA; Feature location/qualifiers: source, 1..9886; mol_type, other DNA; note, pGM297; organism, synthetic construct
ATTGATTATTGACTAGTTATTAATAGTAATCAATTACGGGGTCATTAGTTCATAGCCCATATATGGAGTTCCGCGTTACATAACTTACGGTAAATGGCCCGCCTGGCTGACCGCCCAACGACCCCCGCCCATTGACGTCAATAATGACGTATGTTCCCATAGTAACGCCAATAGGGACTTTCCATTGACGTCAATGGGTGGAGTATTTACGGTAAACTGCCCACTTGGCAGTACATCAAGTGTATCATATGCCAAGTACGCCCCCTATTGACGTCAATGACGGTAAATGGCCCGCCTGGCATTATGCCCAGTACATGACCTTATGGGACTTTCCTACTTGGCAGTACATCTACGTATTAGTCATCGCTATTACCATGGTCGAGGTGAGCCCCACGTTCTGCTTCACTCTCCCCATCTCCCCCCCCTCCCCACCCCCAATTTTGTATTTATTTATTTTTTAATTATTTTGTGCAGCGATGGGGGCGGGGGGGGGGGGGGGGCGCGCGCCAGGCGGGGCGGGGCGGGGCGAGGGGCGGGGCGGGGCGAGGCGGAGAGGTGCGGCGGCAGCCAATCAGAGCGGCGCGCTCCGAAAGTTTCCTTTTATGGCGAGGCGGCGGCGGCGGCGGCCCTATAAAAAGCGAAGCGCGCGGCGGGCGGGAGTCGCTGCGCGCTGCCTTCGCCCCGTGCCCCGCTCCGCCGCCGCCTCGCGCCGCCCGCCCCGGCTCTGACTGACCGCGTTACTCCCACAGGTGAGCGGGCGGGACGGCCCTTCTCCTCCGGGCTGTAATTAGCGCTTGGTTTAATGACGGCTTGTTTCTTTTCTGTGGCTGCGTGAAAGCCTTGAGGGGCTCCGGGAGGGCCCTTTGTGCGGGGGGAGCGGCTCGGGGGGTGCGTGCGTGTGTGTGTGCGTGGGGAGCGCCGCGTGCGGCTCCGCGCTGCCCGGCGGCTGTGAGCGCTGCGGGCGCGGCGCGGGGCTTTGTGCGCTCCGCAGTGTGCGCGAGGGGAGCGCGGCCGGGGGCGGTGCCCCGCGGTGCGGGGGGGGCTGCGAGGGGAACAAAGGCTGCGTGCGGGGTGTGTGCGTGGGGGGGTGAGCAGGGGGTGTGGGCGCGTCGGTCGGGCTGCAACCCCCCCTGCACCCCCCTCCCCGAGTTGCTGAGCACGGCCCGGCTTCGGGTGCGGGGCTCCGTACGGGGCGTGGCGCGGGGCTCGCCGTGCCGGGCGGGGGGTGGCGGCAGGTGGGGGTGCCGGGCGGGGCGGGGCCGCCTCGGGCCGGGGAGGGCTCGGGGGAGGGGCGCGGCGGCCCCCGGAGCGCCGGCGGCTGTCGAGGCGCGGCGAGCCGCAGCCATTGCCTTTTATGGTAATCGTGCGAGAGGGCGCAGGGACTTCCTTTGTCCCAAATCTGTGCGGAGCCGAAATCTGGGAGGCGCCGCCGCACCCCCTCTAGCGGGCGCGGGGCGAAGCGGTGCGGCGCCGGCAGGAAGGAAATGGGCGGGGAGGGCCTTCGTGCGTCGCCGCGCCGCCGTCCCCTTCTCCCTCTCCAGCCTCGGGGCTGTCCGCGGGGGGACGGCTGCCTTCGGGGGGGACGGGGCAGGGCGGGGTTCGGCTTCTGGCGTGTGACCGGCGGCTCTAGAGCCTCTGCTAACCATGTTCATGCCTTCTTCTTTTTCCTACAGCTCCTGGGCAACGTGCTGGTTATTGTGCTGTCTCATCATTTTGGCAAAGAATTGCTCGAGACTAGTGACTTGGTGAGTAGGCTTCGAGCCTAGTTAGAGGACTAGGAGAGGCCGTAGCCGTAACTACTCTGGGCAAGTAGGGCAGGCGGTGGGTACGCAATGGGGGCGGCTACCTCAGCACTAAATAGGAGACAATTAGACCAATTTGAGAAAATACGACTTCGCCCGAACGGAAAGAAAAAGTACCAAATTAAACATTTAATATGGGCAGGCAAGGAGATGGAGCGCTTCGGCCTCCATGAGAGGTTGTTGGAGACAGAGGAGGGG
TGTAAAAGAATCATAGAAGTCCTCTACCCCCTAGAACCAACAGGATCGGAGGGCTTAAAAAGTCTGTTCAATCTTGTGTGCGTACTATATTGCTTGCACAAGGAACAGAAAGTGAAAGACACAGAGGAAGCAGTAGCAACAGTAAGACAACACTGCCATCTAGTGGAAAAAGAAAAAAGTGCAACAGAGACATCTAGTGGACAAAAGAAAAATGACAAGGGAATAGCAGCGCCACCTGGTGGCAGTCAGAATTTTCCAGCGCAACAACAAGGAAATGCCTGGGTACATGTACCCTTGTCACCGCGCACCTTAAATGCGTGGGTAAAAGCAGTAGAGGAGAAAAAATTTGGAGCAGAAATAGTACCCATGTTTCAAGCCCTATCAGAAGGCTGCACACCCTATGACATTAATCAGATGCTTAATGTGCTAGGAGATCATCAAGGGGCATTACAAATAGTGAAAGAGATCATTAATGAAGAAGCAGCCCAGTGGGATGTAACACACCCACTACCCGCAGGACCCCTACCAGCAGGACAGCTCAGGGACCCTCGCGGCTCAGATATAGCAGGGACCACCAGCTCAGTACAAGAACAGTTAGAATGGATCTATACTGCTAACCCCCGGGTAGATGTAGGTGCCATCTACCGGAGATGGATTATTCTAGGACTTCAAAAGTGTGTCAAAATGTACAACCCAGTATCAGTCCTAGACATTAGGCAGGGACCTAAAGAGCCCTTCAAGGATTATGTGGACAGATTTTACAAGGCAATTAGAGCAGAACAAGCCTCAGGGGAAGTGAAACAATGGATGACAGAATCATTACTCATTCAAAATGCTAATCCAGATTGTAAGGTCATCCTGAAGGGCCTAGGAATGCACCCCACCCTTGAAGAAATGTTAACGGCTTGTCAGGGGGTAGGAGGCCCAAGCTACAAAGCAAAAGTAATGGCAGAAATGATGCAGACCATGCAAAATCAAAACATGGTGCAGCAGGGAGGTCCAAAAAGACAAAGACCCCCACTAAGATGTTATAATTGTGGAAAATTTGGCCATATGCAAAGACAATGTCCGGAACCAAGGAAAACAAAATGTCTAAAGTGTGGAAAATTGGGACACCTAGCAAAAGACTGCAGGGGACAGGTGAATTTTTTAGGGTATGGACGGTGGATGGGGGCAAAACCGAGAAATTTTCCCGCCGCTACTCTTGGAGCGGAACCGAGTGCGCCTCCTCCACCGAGCGGCACCACCCCATACGACCCAGCAAAGAAGCTCCTGCAGCAATATGCAGAGAAAGGGAAACAACTGAGGGAGCAAAAGAGGAATCCACCGGCAATGAATCCGGATTGGACCGAGGGATATTCTTTGAACTCCCTCTTTGGAGAAGACCAATAAAGACAGTGTATATAGAAGGGGTCCCCATTAAGGCACTGCTAGACACAGGGGCAGATGACACCATAATTAAAGAAAATGATTTACAATTATCAGGTCCATGGAGACCCAAAATTATAGGGGGCATAGGAGGAGGCCTTAATGTAAAAGAATATAACGACAGGGAAGTAAAAATAGAAGATAAAATTTTGAGAGGAACAATATTGTTAGGAGCAACTCCCATTAATATAATAGGTAGAAATTTGCTGGCCCCGGCAGGTGCCCGGTTAGTAATGGGACAATTATCAGAAAAAATTCCTGTCACACCTGTCAAATTGAAGGAAGGGGCTCGGGGACCCTGTGTAAGACAATGGCCTCTCTCTAAAGAGAAGATTGAAGCTTTACAGGAAATATGTTCCCAATTAGAGCAGGAAGGAAAAATCAGTAGAGTAGGAGGAGAAAATGCATACAATACCCCAATATTTTGCATAAAGAAGAAGGACAAATCCCAGTGGAGGATGCTAGTAGACTTTAGAGAGTTAAATAAGGCAACCCAAGATTTCTTTGAAGTGCAATTAGGGATACCCCACCCAGCAGGATTAAGAAAGATGAGACAGATAACAGTTTTAGAT
GTAGGAGACGCCTATTATTCCATACCATTGGATCCAAATTTTAGGAAATATACTGCTTTTACTATTCCCACAGTGAATAATCAGGGACCCGGGATTAGGTATCAATTCAACTGTCTCCCGCAAGGGTGGAAAGGATCTCCTACAATCTTCCAAAATACAGCAGCATCCATTTTGGAGGAGATAAAAAGAAACTTGCCAGCACTAACCATTGTACAATACATGGATGATTTATGGGTAGGTTCTCAAGAAAATGAACACACCCATGACAAATTAGTAGAACAGTTAAGAACAAAATTACAAGCCTGGGGCTTAGAAACCCCAGAAAAGAAGGTGCAAAAAGAACCACCTTATGAGTGGATGGGATACAAACTTTGGCCTCACAAATGGGAACTAAGCAGAATACAACTGGAGGAAAAAGATGAATGGACTGTCAATGACATCCAGAAGTTAGTTGGGAAACTAAATTGGGCAGCACAATTGTATCCAGGTCTTAGGACCAAGAATATATGCAAGTTAATTAGAGGAAAGAAAAATCTGTTAGAGCTAGTGACTTGGACACCTGAGGCAGAAGCTGAATATGCAGAAAATGCAGAGATTCTTAAAACAGAACAGGAAGGAACCTATTACAAACCAGGAATACCTATTAGGGCAGCAGTACAGAAATTGGAAGGAGGACAGTGGAGTTACCAATTCAAACAAGAAGGACAAGTCTTGAAAGTAGGAAAATACACCAAGCAAAAGAACACCCATACAAATGAACTTCGCACATTAGCTGGTTTAGTGCAGAAGATTTGCAAAGAAGCTCTAGTTATTTGGGGGATATTACCAGTTCTAGAACTCCCGATAGAAAGAGAGGTATGGGAACAATGGTGGGCGGATTACTGGCAGGTAAGCTGGATTCCCGAATGGGATTTTGTCAGCACCCCACCTTTGCTCAAACTATGGTACACATTAACAAAAGAACCCATACCCAAGGAGGACGTTTACTATGTAGATGGAGCATGCAACAGAAATTCAAAAGAAGGAAAAGCAGGATACATCTCACAATACGGAAAACAGAGAGTAGAAACATTAGAAAACACTACCAATCAGCAAGCAGAATTAACAGCTATAAAAATGGCTTTGGAAGACAGTGGGCCTAATGTGAACATAGTAACAGACTCTCAATATGCAATGGGAATTTTGACAGCACAACCCACACAAAGTGATTCACCATTAGTAGAGCAAATTATAGCCTTAATGATACAAAAGCAACAAATATATTTGCAGTGGGTACCAGCACATAAAGGAATAGGAGGAAATGAGGAGATAGATAAATTAGTGAGTAAAGGCATTAGAAGAGTTTTATTCTTAGAAAAAATAGAAGAAGCTCAAGAAGAGCATGAAAGATATCATAATAATTGGAAAAACCTAGCAGATACATATGGGCTTCCACAAATAGTAGCAAAAGAGATAGTGGCCATGTGTCCAAAATGTCAGATAAAGGGAGAACCAGTGCATGGACAAGTGGATGCCTCACCTGGAACATGGCAGATGGATTGTACTCATCTAGAAGGAAAAGTAGTCATAGTTGCGGTCCATGTAGCCAGTGGATTCATAGAAGCAGAAGTCATACCTAGGGAAACAGGAAAAGAAACGGCAAAGTTTCTATTAAAAATACTGAGTAGATGGCCTATAACACAGTTACACACAGACAATGGGCCTAACTTTACCTCCCAAGAAGTGGCAGCAATATGTTGGTGGGGAAAAATTGAACATACAACAGGTATACCATATAACCCCCAATCTCAAGGATCAATAGAAAGCATGAACAAACAATTAAAAGAGATAATTGGGAAAATAAGAGATGATTGCCAATATACAGAGACAGCAGTACTGATGGCTTGCCATATTCACAATTTTAAAAGAAAGGGAGGAATAGGGGGACAGACTTCAGCAGAGAGACTAATTAATATAATAACAACACAATTAGAAATACAACATTTACAAACCAAAATTCA
AAAAATTTTAAATTTTAGAGTCTACTACAGAGAAGGGAGAGACCCTGTGTGGAAAGGACCAGCACAATTAATCTGGAAAGGGGAAGGAGCAGTGGTCCTCAAGGACGGAAGTGACCTAAAGGTTGTACCAAGAAGGAAAGCTAAAATTATTAAGGATTATGAACCCAAACAAAGAGTGGGTAATGAGGGTGACGTGGAAGGTACCAGGGGATCTGATAACTAAATGGCAGGGAATAGTCAGATATTGGATGAGACAAAGAAATTTGAAATGGAACTATTATATGCATCAGCTGGCGGCCGCGAATTCACTAGTGATTCCCGTTTGTGCTAGGGTTCTTAGGCTTCTTGGGGGCTGCTGGAACTGCAATGGGAGCAGCGGCGACAGCCCTGACGGTCCAGTCTCAGCATTTGCTTGCTGGGATACTGCAGCAGCAGAAGAATCTGCTGGCGGCTGTGGAGGCTCAACAGCAGATGTTGAAGCTGACCATTTGGGGTGTTAAAAACCTCAATGCCCGCGTCACAGCCCTTGAGAAGTACCTAGAGGATCAGGCACGACTAAACTCCTGGGGGTGCGCATGGAAACAAGTATGTCATACCACAGTGGAGTGGCCCTGGACAAATCGGACTCCGGATTGGCAAAATATGACTTGGTTGGAGTGGGAAAGACAAATAGCTGATTTGGAAAGCAACATTACGAGACAATTAGTGAAGGCTAGAGAACAAGAGGAAAAGAATCTAGATGCCTATCAGAAGTTAACTAGTTGGTCAGATTTCTGGTCTTGGTTCGATTTCTCAAAATGGCTTAACATTTTAAAAATGGGATTTTTAGTAATAGTAGGAATAATAGGGTTAAGATTACTTTACACAGTATATGGATGTATAGTGAGGGTTAGGCAGGGATATGTTCCTCTATCTCCACAGATCCATATCCAATCGAATTCCCGCGGCCGCAATTCACTCCTCAGGTGCAGGCTGCCTATCAGAAGGTGGTGGCTGGTGTGGCCAATGCCCTGGCTCACAAATACCACTGAGATCTTTTTCCCTCTGCCAAAAATTATGGGGACATCATGAAGCCCCTTGAGCATCTGACTTCTGGCTAATAAAGGAAATTTATTTTCATTGCAATAGTGTGTTGGAATTTTTTGTGTCTCTCACTCGGAAGGACATATGGGAGGGCAAATCATTTAAAACATCAGAATGAGTATTTGGTTTAGAGTTTGGCAACATATGCCCATATGCTGGCTGCCATGAACAAAGGTTGGCTATAAAGAGGTCATCAGTATATGAAACAGCCCCCTGCTGTCCATTCCTTATTCCATAGAAAAGCCTTGACTTGAGGTTAGATTTTTTTTATATTTTGTTTTGTGTTATTTTTTTCTTTAACATCCCTAAAATTTTCCTTACATGTTTTACTAGCCAGATTTTTCCTCCTCTCCTGACTACTCCCAGTCATAGCTGTCCCTCTTCTCTTATGGAGATCCCTCGACCTGCAGCCCAAGCTTGGCGTAATCATGGTCATAGCTGTTTCCTGTGTGAAATTGTTATCCGCTCACAATTCCACACAACATACGAGCCGGAAGCATAAAGTGTAAAGCCTGGGGTGCCTAATGAGTGAGCTAACTCACATTAATTGCGTTGCGCTCACTGCCCGCTTTCCAGTCGGGAAACCTGTCGTGCCAGCGGATCCGCATCTCAATTAGTCAGCAACCATAGTCCCGCCCCTAACTCCGCCCATCCCGCCCCTAACTCCGCCCAGTTCCGCCCATTCTCCGCCCCATGGCTGACTAATTTTTTTTATTTATGCAGAGGCCGAGGCCGCCTCGGCCTCTGAGCTATTCCAGAAGTAGTGAGGAGGCTTTTTTGGAGGCCTAGGCTTTTGCAAAAAGCTAACTTGTTTATTGCAGCTTATAATGGTTACAAATAAAGCAATAGCATCACAAATTTCACAAATAAAGCATTTTTTTCACTGCATTCTAGTTGTGGTTTGTCCAAACTCATCA
ATGTATCTTATCATGTCTGTCCGCTTCCTCGCTCACTGACTCGCTGCGCTCGGTCGTTCGGCTGCGGCGAGCGGTATCAGCTCACTCAAAGGCGGTAATACGGTTATCCACAGAATCAGGGGATAACGCAGGAAAGAACATGTGAGCAAAAGGCCAGCAAAAGGCCAGGAACCGTAAAAAGGCCGCGTTGCTGGCGTTTTTCCATAGGCTCCGCCCCCCTGACGAGCATCACAAAAATCGACGCTCAAGTCAGAGGTGGCGAAACCCGACAGGACTATAAAGATACCAGGCGTTTCCCCCTGGAAGCTCCCTCGTGCGCTCTCCTGTTCCGACCCTGCCGCTTACCGGATACCTGTCCGCCTTTCTCCCTTCGGGAAGCGTGGCGCTTTCTCATAGCTCACGCTGTAGGTATCTCAGTTCGGTGTAGGTCGTTCGCTCCAAGCTGGGCTGTGTGCACGAACCCCCCGTTCAGCCCGACCGCTGCGCCTTATCCGGTAACTATCGTCTTGAGTCCAACCCGGTAAGACACGACTTATCGCCACTGGCAGCAGCCACTGGTAACAGGATTAGCAGAGCGAGGTATGTAGGCGGTGCTACAGAGTTCTTGAAGTGGTGGCCTAACTACGGCTACACTAGAAGAACAGTATTTGGTATCTGCGCTCTGCTGAAGCCAGTTACCTTCGGAAAAAGAGTTGGTAGCTCTTGATCCGGCAAACAAACCACCGCTGGTAGCGGTGGTTTTTTTGTTTGCAAGCAGCAGATTACGCGCAGAAAAAAAGGATCTCAAGAAGATCCTTTGATCTTTTCTACGGGGTCTGACGCTCAGTGGAACGAAAACTCACGTTAAGGGATTTTGGTCATGAGATTATCAAAAAGGATCTTCACCTAGATCCTTTTAAATTAAAAATGAAGTTTTAAATCAATCTAAAGTATATATGAGTAAACTTGGTCTGACAGTTAGAAAAACTCATCGAGCATCAAATGAAACTGCAATTTATTCATATCAGGATTATCAATACCATATTTTTGAAAAAGCCGTTTCTGTAATGAAGGAGAAAACTCACCGAGGCAGTTCCATAGGATGGCAAGATCCTGGTATCGGTCTGCGATTCCGACTCGTCCAACATCAATACAACCTATTAATTTCCCCTCGTCAAAAATAAGGTTATCAAGTGAGAAATCACCATGAGTGACGACTGAATCCGGTGAGAATGGCAACAGCTTATGCATTTCTTTCCAGACTTGTTCAACAGGCCAGCCATTACGCTCGTCATCAAAATCACTCGCATCAACCAAACCGTTATTCATTCGTGATTGCGCCTGAGCGAGACGAAATACGCGATCGCTGTTAAAAGGACAATTACAAACAGGAATCGAATGCAACCGGCGCAGGAACACTGCCAGCGCATCAACAATATTTTCACCTGAATCAGGATATTCTTCTAATACCTGGAATGCTGTTTTTCCGGGGATCGCAGTGGTGAGTAACCATGCATCATCAGGAGTACGGATAAAATGCTTGATGGTCGGAAGAGGCATAAATTCCGTCAGCCAGTTTAGTCTGACCATCTCATCTGTAACATCATTGGCAACGCTACCTTTGCCATGTTTCAGAAACAACTCTGGCGCATCGGGCTTCCCATACAATCGATAGATTGTCGCACCTGATTGCCCGACATTATCGCGAGCCCATTTATACCCATATAAATCAGCATCCATGTTGGAATTTAATCGCGGCCTAGAGCAAGACGTTTCCCGTTGAATATGGCTCATAACACCCCTTGTATTACTGTTTATGTAAGCAGACAGTTTTATTGTTCATGATGATATATTTTTATCTTGTGCAATGTAACATCAGAGATTTTGAGACACAACAATTGGTCGAC
SEQ ID NO: 10 Exemplary hCEF promoter
Length: 574; Molecule type: DNA; Feature location/qualifiers: source, 1..574; mol_type, other DNA; note, hCEF promoter; organism, synthetic construct
1 AGATCTGTTA CATAACTTAT GGTAAATGGC CTGCCTGGCT GACTGCCCAA TGACCCCTGC
61 CCAATGATGT CAATAATGAT GTATGTTCCC ATGTAATGCC AATAGGGACT TTCCATTGAT
121 GTCAATGGGT GGAGTATTTA TGGTAACTGC CCACTTGGCA GTACATCAAG TGTATCATAT
181 GCCAAGTATG CCCCCTATTG ATGTCAATGA TGGTAAATGG CCTGCCTGGC ATTATGCCCA
241 GTACATGACC TTATGGGACT TTCCTACTTG GCAGTACATC TATGTATTAG TCATTGCTAT
301 TACCATGGGA ATTCACTAGT GGAGAAGAGC ATGCTTGAGG GCTGAGTGCC CCTCAGTGGG
361 CAGAGAGCAC ATGGCCCACA GTCCCTGAGA AGTTGGGGGG AGGGGTGGGC AATTGAACTG
421 GTGCCTAGAG AAGGTGGGGC TTGGGTAAAC TGGGAAAGTG ATGTGGTGTA CTGGCTCCAC
481 CTTTTTCCCC AGGGTGGGGG AGAACCATAT ATAAGTGCAG TAGTCTCTGT GAACATTCAA
541 GCTTCTGCCT TCTCCCTCCT GTGAGTTTGC TAGC
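The numbered-block layout used for SEQ ID NO: 10 (a position index followed by blocks of ten bases) can be checked against the stated length. The following is a minimal sketch rather than part of the patent: the helper name `join_numbered_blocks` is hypothetical, and only the last two lines of the listing are used as input.

```python
def join_numbered_blocks(lines):
    """Concatenate sequence lines of the form '<position> BLOCK BLOCK ...'.

    The leading integer is treated as a position index and dropped;
    the blanks between ten-base blocks are removed.
    """
    parts = []
    for line in lines:
        tokens = line.split()
        # Drop the leading position index if the line has one.
        if tokens and tokens[0].isdigit():
            tokens = tokens[1:]
        parts.append("".join(tokens))
    return "".join(parts)


# Last two lines of SEQ ID NO: 10, copied from the listing above.
tail = [
    "481 CTTTTTCCCC AGGGTGGGGG AGAACCATAT ATAAGTGCAG TAGTCTCTGT GAACATTCAA",
    "541 GCTTCTGCCT TCTCCCTCCT GTGAGTTTGC TAGC",
]
seq_tail = join_numbered_blocks(tail)

# The first of these lines starts at position 481, so the position of the
# final base follows from the number of bases recovered.
last_position = 481 + len(seq_tail) - 1
print(len(seq_tail), last_position)
```

If the printed final position equals the declared length (574), the listing is internally consistent for these lines. The same helper could be applied to the other numbered listings, such as SEQ ID NO: 13.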
SEQ ID NO: 11 Exemplary CMV promoter
Length: 873; Molecule type: DNA; Feature location/qualifiers: source, 1..873; mol_type, unassigned DNA; organism, Human cytomegalovirus
CCGCGGAGATCTCAATATTGGCCATTAGCCATATTATTCATTGGTTATATAGCATAAATCAATATTGGCT ATTGGCCATTGCATACGTTGTATCTATATCATAATATGTACATTTATATTGGCTCATGTCCAATATGACC GCCATGTTGGCATTGATTATTGACTAGTTATTAATAGTAATCAATTACGGGGTCATTAGTTCATAGCCCA TATATGGAGTTCCGCGTTACATAACTTACGGTAAATGGCCCGCCTGGCTGACCGCCCAACGACCCCCGCC CATTGACGTCAATAATGACGTATGTTCCCATAGTAACGCCAATAGGGACTTTCCATTGACGTCAATGGGT GGAGTATTTACGGTAAACTGCCCACTTGGCAGTACATCAAGTGTATCATATGCCAAGTCCGCCCCCTATT GACGTCAATGACGGTAAATGGCCCGCCTGGCATTATGCCCAGTACATGACCTTACGGGACTTTCCTACTT GGCAGTACATCTACGTATTAGTCATCGCTATTACCATGGTGATGCGGTTTTGGCAGTACACCAATGGGCG TGGATAGCGGTTTGACTCACGGGGATTTCCAAGTCTCCACCCCATTGACGTCAATGGGAGTTTGTTTTGG CACCAAAATCAACGGGACTTTCCAAAATGTCGTAATAACCCCGCCCCGTTGACGCAAATGGGCGGTAGGC GTGTACGGTGGGAGGTCTATATAAGCAGAGCTCGTTTAGTGAACCGTCAGATCACTAGAAGCTTTATTGC GGTAGTTTATCACAGTTAAATTGCTAACGCAGTCAGTGCTTCTGACACAACAGTCTCGAACTTAAGCTGC AGAAGTTGGTCGTGAGGCACTGGGCAGGCTAGC
SEQ ID NO: 12 Exemplary EF1a promoter
Length: 395; Molecule type: DNA; Feature location/qualifiers: source, 1..395; mol_type, unassigned DNA; organism, Homo sapiens
AGATCCATATCCGCGGCAATTTTAAAAGAAAGGGAGGAATAGGGGGACAGACTTCAGCAGAGAGACTAATTAATATAATAACAACACAATTAGAAATACAACATTTACAAACCAAAATTCAAAAAATTTTAAATTTTAGAGCCGCGGAGATCCCGTGAGGCTCCGGTGCCCGTCAGTGGGCAGAGCGCACATCGCCCACAGTCCCCGAGAAGTTGGGGGGAGGGGTCGGCAATTGAACCGGTGCCTAGAGAAGGTGGCGCGGGGTAAACTGGGAAAGTGATGTCGTGTACTGGCTCCGCCTTTTTCCCGAGGGTGGGGGAGAACCGTATATAAGTGCAGTAGTCGCCGTGAACGTTCTTTTTCGCAACGGGTTTGCCGCCAGAACACAGGCTAGC
SEQ ID NO: 13 Exemplary CFTR transgene (soCFTR2)
Length: 4459; Molecule type: DNA; Feature location/qualifiers: source, 1..4459; mol_type, other DNA; note, soCFTR2; organism, synthetic construct
1 GCTAGCCACC ATGCAGAGAA GCCCTCTGGA GAAGGCCTCT GTGGTGAGCA AGCTGTTCTT
61 CAGCTGGACC AGGCCCATCC TGAGGAAGGG CTACAGGCAG AGACTGGAGC TGTCTGACAT
121 CTACCAGATC CCCTCTGTGG ACTCTGCTGA CAACCTGTCT GAGAAGCTGG AGAGGGAGTG
181 GGATAGAGAG CTGGCCAGCA AGAAGAACCC CAAGCTGATC AATGCCCTGA GGAGATGCTT
241 CTTCTGGAGA TTCATGTTCT ATGGCATCTT CCTGTACCTG GGGGAAGTGA CCAAGGCTGT
301 GCAGCCTCTG CTGCTGGGCA GAATCATTGC CAGCTATGAC CCTGACAACA AGGAGGAGAG
361 GAGCATTGCC ATCTACCTGG GCATTGGCCT GTGCCTGCTG TTCATTGTGA GGACCCTGCT
421 GCTGCACCCT GCCATCTTTG GCCTGCACCA CATTGGCATG CAGATGAGGA TTGCCATGTT
481 CAGCCTGATC TACAAGAAAA CCCTGAAGCT GTCCAGCAGA GTGCTGGACA AGATCAGCAT
541 TGGCCAGCTG GTGAGCCTGC TGAGCAACAA CCTGAACAAG TTTGATGAGG GCCTGGCCCT
601 GGCCCACTTT GTGTGGATTG CCCCTCTGCA GGTGGCCCTG CTGATGGGCC TGATTTGGGA
661 GCTGCTGCAG GCCTCTGCCT TTTGTGGCCT GGGCTTCCTG ATTGTGCTGG CCCTGTTTCA
721 GGCTGGCCTG GGCAGGATGA TGATGAAGTA CAGGGACCAG AGGGCAGGCA AGATCAGTGA
781 GAGGCTGGTG ATCACCTCTG AGATGATTGA GAACATCCAG TCTGTGAAGG CCTACTGTTG
841 GGAGGAAGCT ATGGAGAAGA TGATTGAAAA CCTGAGGCAG ACAGAGCTGA AGCTGACCAG
901 GAAGGCTGCC TATGTGAGAT ACTTCAACAG CTCTGCCTTC TTCTTCTCTG GCTTCTTTGT
961 GGTGTTCCTG TCTGTGCTGC CCTATGCCCT GATCAAGGGG ATCATCCTGA GAAAGATTTT
1021 CACCACCATC AGCTTCTGCA TTGTGCTGAG GATGGCTGTG ACCAGACAGT TCCCCTGGGC
1081 TGTGCAGACC TGGTATGACA GCCTGGGGGC CATCAACAAG ATCCAGGACT TCCTGCAGAA
1141 GCAGGAGTAC AAGACCCTGG AGTACAACCT GACCACCACA GAAGTGGTGA TGGAGAATGT
1201 GACAGCCTTC TGGGAGGAGG GCTTTGGGGA GCTGTTTGAG AAGGCCAAGC AGAACAACAA
1261 CAACAGAAAG ACCAGCAATG GGGATGACTC CCTGTTCTTC TCCAACTTCT CCCTGCTGGG
1321 CACACCTGTG CTGAAGGACA TCAACTTCAA GATTGAGAGG GGGCAGCTGC TGGCTGTGGC
1381 TGGATCTACA GGGGCTGGCA AGACCAGCCT GCTGATGATG ATCATGGGGG AGCTGGAGCC
1441 TTCTGAGGGC AAGATCAAGC ACTCTGGCAG GATCAGCTTT TGCAGCCAGT TCAGCTGGAT
1501 CATGCCTGGC ACCATCAAGG AGAACATCAT CTTTGGAGTG AGCTATGATG AGTACAGATA
1561 CAGGAGTGTG ATCAAGGCCT GCCAGCTGGA GGAGGACATC AGCAAGTTTG CTGAGAAGGA
1621 CAACATTGTG CTGGGGGAGG GAGGCATTAC ACTGTCTGGG GGCCAGAGAG CCAGAATCAG
1681 CCTGGCCAGG GCTGTGTACA AGGATGCTGA CCTGTACCTG CTGGACTCCC CCTTTGGCTA
1741 CCTGGATGTG CTGACAGAGA AGGAGATTTT TGAGAGCTGT GTGTGCAAGC TGATGGCCAA
1801 CAAGACCAGA ATCCTGGTGA CCAGCAAGAT GGAGCACCTG AAGAAGGCTG ACAAGATCCT
1861 GATCCTGCAT GAGGGCAGCA GCTACTTCTA TGGGACCTTC TCTGAGCTGC AGAACCTGCA
1921 GCCTGACTTC AGCTCTAAGC TGATGGGCTG TGACAGCTTT GACCAGTTCT CTGCTGAGAG
1981 GAGGAACAGC ATCCTGACAG AGACCCTGCA CAGATTCAGC CTGGAGGGAG ATGCCCCTGT
2041 GAGCTGGACA GAGACCAAGA AGCAGAGCTT CAAGCAGACA GGGGAGTTTG GGGAGAAGAG
2101 GAAGAACTCC ATCCTGAACC CCATCAACAG CATCAGGAAG TTCAGCATTG TGCAGAAAAC
2161 CCCCCTGCAG ATGAATGGCA TTGAGGAAGA TTCTGATGAG CCCCTGGAGA GGAGACTGAG
2221 CCTGGTGCCT GATTCTGAGC AGGGAGAGGC CATCCTGCCT AGGATCTCTG TGATCAGCAC
2281 AGGCCCTACA CTGCAGGCCA GAAGGAGGCA GTCTGTGCTG AACCTGATGA CCCACTCTGT
2341 GAACCAGGGC CAGAACATCC ACAGGAAAAC CACAGCCTCC ACCAGGAAAG TGAGCCTGGC
2401 CCCTCAGGCC AATCTGACAG AGCTGGACAT CTACAGCAGG AGGCTGTCTC AGGAGACAGG
2461 CCTGGAGATT TCTGAGGAGA TCAATGAGGA GGACCTGAAA GAGTGCTTCT TTGATGACAT
2521 GGAGAGCATC CCTGCTGTGA CCACCTGGAA CACCTACCTG AGATACATCA CAGTGCACAA
2581 GAGCCTGATC TTTGTGCTGA TCTGGTGCCT GGTGATCTTC CTGGCTGAAG TGGCTGCCTC
2641 TCTGGTGGTG CTGTGGCTGC TGGGAAACAC CCCACTGCAG GACAAGGGCA ACAGCACCCA
2701 CAGCAGGAAC AACAGCTATG CTGTGATCAT CACCTCCACC TCCAGCTACT ATGTGTTCTA
2761 CATCTATGTG GGAGTGGCTG ATACCCTGCT GGCTATGGGC TTCTTTAGAG GCCTGCCCCT
2821 GGTGCACACA CTGATCACAG TGAGCAAGAT CCTCCACCAC AAGATGCTGC ACTCTGTGCT
2881 GCAGGCTCCT ATGAGCACCC TGAATACCCT GAAGGCTGGG GGCATCCTGA ACAGATTCTC
2941 CAAGGATATT GCCATCCTGG ATGACCTGCT GCCTCTCACC ATCTTTGACT TCATCCAGCT
3001 GCTGCTGATT GTGATTGGGG CCATTGCTGT GGTGGCAGTG CTGCAGCCCT ACATCTTTGT
3061 GGCCACAGTG CCTGTGATTG TGGCCTTCAT CATGCTGAGG GCCTACTTTC TGCAGACCTC
3121 CCAGCAGCTG AAGCAGCTGG AGTCTGAGGG CAGAAGCCCC ATCTTCACCC ACCTGGTGAC
3181 AAGCCTGAAG GGCCTGTGGA CCCTGAGAGC CTTTGGCAGG CAGCCCTACT TTGAGACCCT
3241 GTTCCACAAG GCCCTGAACC TGCACACAGC CAACTGGTTC CTCTACCTGT CCACCCTGAG
3301 ATGGTTCCAG ATGAGAATTG AGATGATCTT TGTCATCTTC TTCATTGCTG TGACCTTCAT
3361 CAGCATTCTG ACCACAGGAG AGGGAGAGGG CAGAGTGGGC ATTATCCTGA CCCTGGCCAT
3421 GAACATCATG AGCACACTGC AGTGGGCAGT GAACAGCAGC ATTGATGTGG ACAGCCTGAT
3481 GAGGAGTGTG AGCAGAGTGT TCAAGTTCAT TGATATGCCC ACAGAGGGCA AGCCTACCAA
3541 GAGCACCAAG CCCTACAAGA ATGGCCAGCT GAGCAAAGTG ATGATCATTG AGAACAGCCA
3601 TGTGAAGAAG GATGATATCT GGCCCAGTGG AGGCCAGATG ACAGTGAAGG ACCTGACAGC
3661 CAAGTACACA GAGGGGGGCA ATGCTATCCT GGAGAACATC TCCTTCAGCA TCTCCCCTGG
3721 CCAGAGAGTG GGACTGCTGG GAAGAACAGG CTCTGGCAAG TCTACCCTGC TGTCTGCCTT
3781 CCTGAGGCTG CTGAACACAG AGGGAGAGAT CCAGATTGAT GGAGTGTCCT GGGACAGCAT
3841 CACACTGCAG CAGTGGAGGA AGGCCTTTGG TGTGATCCCC CAGAAAGTGT TCATCTTCAG
3901 TGGCACCTTC AGGAAGAACC TGGACCCCTA TGAGCAGTGG TCTGACCAGG AGATTTGGAA
3961 AGTGGCTGAT GAAGTGGGCC TGAGAAGTGT GATTGAGCAG TTCCCTGGCA AGCTGGACTT
4021 TGTCCTGGTG GATGGGGGCT GTGTGCTGAG CCATGGCCAC AAGCAGCTGA TGTGCCTGGC
4081 CAGATCAGTG CTGAGCAAGG CCAAGATCCT GCTGCTGGAT GAGCCTTCTG CCCACCTGGA
4141 TCCTGTGACC TACCAGATCA TCAGGAGGAC CCTCAAGCAG GCCTTTGCTG ACTGCACAGT
4201 CATCCTGTGT GAGCACAGGA TTGAGGCCAT GCTGGAGTGC CAGCAGTTCC TGGTGATTGA
4261 GGAGAACAAA GTGAGGCAGT ATGACAGCAT CCAGAAGCTG CTGAATGAGA GGAGCCTGTT
4321 CAGGCAGGCC ATCAGCCCCT CTGATAGAGT GAAGCTGTTC CCCCACAGGA ACAGCTCCAA
4381 GTGCAAGAGC AAGCCCCAGA TTGCTGCCCT GAAGGAGGAG ACAGAGGAGG AAGTGCAGGA
4441 CACCAGGCTG TGAGGGCCC
SEQ ID NO: 14 Exemplary A1AT transgene
Length: 1257; Molecule type: DNA; Feature location/qualifiers: source, 1..1257; mol_type, other DNA; note, sohAAT; organism, synthetic construct
ATGCCCAGCTCTGTGTCCTGGGGCATTCTGCTGCTGGCTGGCCTGTGCTGTCTGGTGCCTGTGTCCCTGG CTGAGGACCCTCAGGGGGATGCTGCCCAGAAAACAGACACCTCCCACCATGACCAGGACCACCCCACCTT CAACAAGATCACCCCCAACCTGGCAGAGTTTGCCTTCAGCCTGTACAGACAGCTGGCCCACCAGAGCAAC AGCACCAACATCTTTTTCAGCCCTGTGTCCATTGCCACAGCCTTTGCCATGCTGAGCCTGGGCACCAAGG CTGACACCCATGATGAGATCCTGGAAGGCCTGAACTTCAACCTGACAGAGATCCCTGAGGCCCAGATCCA TGAGGGCTTCCAGGAACTGCTGAGAACCCTGAACCAGCCAGACAGCCAGCTGCAGCTGACAACAGGCAAT GGGCTGTTCCTGTCTGAGGGCCTGAAGCTGGTGGACAAGTTTCTGGAAGATGTGAAGAAGCTGTACCACT CTGAGGCCTTCACAGTGAACTTTGGGGACACAGAAGAGGCCAAGAAACAGATCAATGACTATGTGGAAAA GGGCACCCAGGGCAAGATTGTGGACCTTGTGAAAGAGCTGGACAGGGACACTGTGTTTGCCCTTGTGAAC TACATCTTCTTCAAGGGCAAGTGGGAGAGGCCCTTTGAAGTGAAGGACACTGAGGAAGAGGACTTCCATG TGGACCAAGTGACCACAGTGAAGGTGCCAATGATGAAGAGACTGGGGATGTTCAATATCCAGCACTGCAA GAAACTGAGCAGCTGGGTGCTGCTGATGAAGTACCTGGGCAATGCTACAGCCATATTCTTTCTGCCTGAT GAGGGCAAGCTGCAGCACCTGGAAAATGAGCTGACCCATGACATCATCACCAAATTTCTGGAAAATGAGG ACAGAAGATCTGCCAGCCTGCATCTGCCCAAGCTGAGCATCACAGGCACATATGACCTGAAGTCTGTGCT GGGACAGCTGGGAATCACCAAGGTGTTCAGCAATGGGGCAGACCTGAGTGGAGTGACAGAGGAAGCCCCT CTGAAGCTGTCCAAGGCTGTGCACAAGGCAGTGCTGACCATTGATGAGAAGGGCACAGAGGCTGCTGGGG CCATGTTTCTGGAAGCCATCCCCATGTCCATCCCCCCAGAAGTGAAGTTCAACAAGCCCTTTGTGTTCCT GATGATTGAGCAGAACACCAAGAGCCCCCTGTTCATGGGCAAGGTTGTGAACCCCACCCAGAAATGA
SEQ ID NO: 15 Complementary strand to the exemplary A1AT transgene
Length: 1257; Molecule type: DNA; Feature location/qualifiers: source, 1..1257; mol_type, other DNA; note, sohAAT complementary strand; organism, synthetic construct
TACGGGTCGAGACACAGGACCCCGTAAGACGACGACCGACCGGACACGACAGACCACGGACACAGGGACC GACTCCTGGGAGTCCCCCTACGACGGGTCTTTTGTCTGTGGAGGGTGGTACTGGTCCTGGTGGGGTGGAA GTTGTTCTAGTGGGGGTTGGACCGTCTCAAACGGAAGTCGGACATGTCTGTCGACCGGGTGGTCTCGTTG TCGTGGTTGTAGAAAAAGTCGGGACACAGGTAACGGTGTCGGAAACGGTACGACTCGGACCCGTGGTTCC GACTGTGGGTACTACTCTAGGACCTTCCGGACTTGAAGTTGGACTGTCTCTAGGGACTCCGGGTCTAGGT ACTCCCGAAGGTCCTTGACGACTCTTGGGACTTGGTCGGTCTGTCGGTCGACGTCGACTGTTGTCCGTTA CCCGACAAGGACAGACTCCCGGACTTCGACCACCTGTTCAAAGACCTTCTACACTTCTTCGACATGGTGA GACTCCGGAAGTGTCACTTGAAACCCCTGTGTCTTCTCCGGTTCTTTGTCTAGTTACTGATACACCTTTT CCCGTGGGTCCCGTTCTAACACCTGGAACACTTTCTCGACCTGTCCCTGTGACACAAACGGGAACACTTG ATGTAGAAGAAGTTCCCGTTCACCCTCTCCGGGAAACTTCACTTCCTGTGACTCCTTCTCCTGAAGGTAC ACCTGGTTCACTGGTGTCACTTCCACGGTTACTACTTCTCTGACCCCTACAAGTTATAGGTCGTGACGTT CTTTGACTCGTCGACCCACGACGACTACTTCATGGACCCGTTACGATGTCGGTATAAGAAAGACGGACTA CTCCCGTTCGACGTCGTGGACCTTTTACTCGACTGGGTACTGTAGTAGTGGTTTAAAGACCTTTTACTCC TGTCTTCTAGACGGTCGGACGTAGACGGGTTCGACTCGTAGTGTCCGTGTATACTGGACTTCAGACACGA CCCTGTCGACCCTTAGTGGTTCCACAAGTCGTTACCCCGTCTGGACTCACCTCACTGTCTCCTTCGGGGA GACTTCGACAGGTTCCGACACGTGTTCCGTCACGACTGGTAACTACTCTTCCCGTGTCTCCGACGACCCC GGTACAAAGACCTTCGGTAGGGGTACAGGTAGGGGGGTCTTCACTTCAAGTTGTTCGGGAAACACAAGGA CTACTAACTCGTCTTGTGGTTCTCGGGGGACAAGTACCCGTTCCAACACTTGGGGTGGGTCTTTACT
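The relationship between SEQ ID NO: 14 and SEQ ID NO: 15 can be illustrated on their opening bases. The sketch below is illustrative only. It assumes that the complementary strand is listed base by base in the same left-to-right order as the transgene (that is, it is not reversed), which is what the first 18 bases of the two listings suggest. The function name `complement` is an assumption made for this example.

```python
# Watson-Crick pairing table: A<->T, C<->G.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def complement(seq):
    """Return the base-by-base complement of a DNA string (no reversal)."""
    return seq.upper().translate(COMPLEMENT)


# First 18 bases of SEQ ID NO: 14 and SEQ ID NO: 15 as listed above.
sense_prefix = "ATGCCCAGCTCTGTGTCC"
listed_complement_prefix = "TACGGGTCGAGACACAGG"

print(complement(sense_prefix) == listed_complement_prefix)
```

A reverse complement, if needed, would be `complement(seq)[::-1]`. The listing does not appear to use that orientation for these sequences.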
SEQ ID NO: 16 Exemplary A1AT polypeptide
Length: 419; Molecule type: AA; Feature location/qualifiers: source, 1..419; MOL_TYPE, protein; organism, Homo sapiens
AEDPQGDAAQKTDTSHHDQDHPTFAEDPQGDAAQKTDTSHHDQDHPTFNKITPNLAEFAFSLYRQLAHQSNSTNIFFSPVSIATAFAMLSLGTKADTHDEILEGLNFNLTEIPEAQIHEGFQELLRTLNQPDSQLQLTTGNGLFLSEGLKLVDKFLEDVKKLYHSEAFTVNFGDTEEAKKQINDYVEKGTQGKIVDLVKELDRDTVFALVNYIFFKGKWERPFEVKDTEEEDFHVDQVTTVKVPMMKRLGMFNIQHCKKLSSWVLLMKYLGNATAIFFLPDEGKLQHLENELTHDIITKFLENEDRRSASLHLPKLSITGTYDLKSVLGQLGITKVFSNGADLSGVTEEAPLKLSKAVHKAVLTIDEKGTEAAGAMFLEAIPMSIPPEVKFNKPFVFLMIEQNTKSPLFMGKVVNPTQK
SEQ ID NO: 17 Exemplary FVIII transgene (N6)
Length: 5013; Molecule type: DNA; Feature location/qualifiers: source, 1..5013; mol_type, other DNA; note, codon-optimized FVIII transgene (N6); organism, synthetic construct
ATGCAGATTGAGCTGAGCACCTGCTTCTTCCTGTGCCTGCTGAGGTTCTGCTTCTCTGCCACCAGGAGAT ACTACCTGGGGGCTGTGGAGCTGAGCTGGGACTACATGCAGTCTGACCTGGGGGAGCTGCCTGTGGATGC CAGGTTCCCCCCCAGAGTGCCCAAGAGCTTCCCCTTCAACACCTCTGTGGTGTACAAGAAGACCCTGTTT GTGGAGTTCACTGACCACCTGTTCAACATTGCCAAGCCCAGGCCCCCCTGGATGGGCCTGCTGGGCCCCA CCATCCAGGCTGAGGTGTATGACACTGTGGTGATCACCCTGAAGAACATGGCCAGCCACCCTGTGAGCCT GCATGCTGTGGGGGTGAGCTACTGGAAGGCCTCTGAGGGGGCTGAGTATGATGACCAGACCAGCCAGAGG GAGAAGGAGGATGACAAGGTGTTCCCTGGGGGCAGCCACACCTATGTGTGGCAGGTGCTGAAGGAGAATG GCCCCATGGCCTCTGACCCCCTGTGCCTGACCTACAGCTACCTGAGCCATGTGGACCTGGTGAAGGACCT GAACTCTGGCCTGATTGGGGCCCTGCTGGTGTGCAGGGAGGGCAGCCTGGCCAAGGAGAAGACCCAGACC CTGCACAAGTTCATCCTGCTGTTTGCTGTGTTTGATGAGGGCAAGAGCTGGCACTCTGAAACCAAGAACA GCCTGATGCAGGACAGGGATGCTGCCTCTGCCAGGGCCTGGCCCAAGATGCACACTGTGAATGGCTATGT GAACAGGAGCCTGCCTGGCCTGATTGGCTGCCACAGGAAGTCTGTGTACTGGCATGTGATTGGCATGGGC ACCACCCCTGAGGTGCACAGCATCTTCCTGGAGGGCCACACCTTCCTGGTCAGGAACCACAGGCAGGCCA GCCTGGAGATCAGCCCCATCACCTTCCTGACTGCCCAGACCCTGCTGATGGACCTGGGCCAGTTCCTGCT GTTCTGCCACATCAGCAGCCACCAGCATGATGGCATGGAGGCCTATGTGAAGGTGGACAGCTGCCCTGAG GAGCCCCAGCTGAGGATGAAGAACAATGAGGAGGCTGAGGACTATGATGATGACCTGACTGACTCTGAGA TGGATGTGGTGAGGTTTGATGATGACAACAGCCCCAGCTTCATCCAGATCAGGTCTGTGGCCAAGAAGCA CCCCAAGACCTGGGTGCACTACATTGCTGCTGAGGAGGAGGACTGGGACTATGCCCCCCTGGTGCTGGCC CCTGATGACAGGAGCTACAAGAGCCAGTACCTGAACAATGGCCCCCAGAGGATTGGCAGGAAGTACAAGA AGGTCAGGTTCATGGCCTACACTGATGAAACCTTCAAGACCAGGGAGGCCATCCAGCATGAGTCTGGCAT CCTGGGCCCCCTGCTGTATGGGGAGGTGGGGGACACCCTGCTGATCATCTTCAAGAACCAGGCCAGCAGG CCCTACAACATCTACCCCCATGGCATCACTGATGTGAGGCCCCTGTACAGCAGGAGGCTGCCCAAGGGGG TGAAGCACCTGAAGGACTTCCCCATCCTGCCTGGGGAGATCTTCAAGTACAAGTGGACTGTGACTGTGGA GGATGGCCCCACCAAGTCTGACCCCAGGTGCCTGACCAGATACTACAGCAGCTTTGTGAACATGGAGAGG GACCTGGCCTCTGGCCTGATTGGCCCCCTGCTGATCTGCTACAAGGAGTCTGTGGACCAGAGGGGCAACC AGATCATGTCTGACAAGAGGAATGTGATCCTGTTCTCTGTGTTTGATGAGAACAGGAGCTGGTACCTGAC TGAGAACATCCAGAGGTTCCTGCCCAACCCTGCTGGGGTGCAGCTGGAGGACCCTGAGTTCCAGGCCAGC AACATCATGCACAGCATCAATGGCTATGTGTTTGACAGCCTGCAGCTGTCTGTGTGCCTGCATGAGGTGG 
CCTACTGGTACATCCTGAGCATTGGGGCCCAGACTGACTTCCTGTCTGTGTTCTTCTCTGGCTACACCTT CAAGCACAAGATGGTGTATGAGGACACCCTGACCCTGTTCCCCTTCTCTGGGGAGACTGTGTTCATGAGC ATGGAGAACCCTGGCCTGTGGATTCTGGGCTGCCACAACTCTGACTTCAGGAACAGGGGCATGACTGCCC TGCTGAAAGTCTCCAGCTGTGACAAGAACACTGGGGACTACTATGAGGACAGCTATGAGGACATCTCTGC CTACCTGCTGAGCAAGAACAATGCCATTGAGCCCAGGAGCTTCAGCCAGAACAGCAGGCACCCCAGCACC AGGCAGAAGCAGTTCAATGCCACCACCATCCCTGAGAATGACATAGAGAAGACAGACCCATGGTTTGCCC ACCGGACCCCCATGCCCAAGATCCAGAATGTGAGCAGCTCTGACCTGCTGATGCTGCTGAGGCAGAGCCC CACCCCCCATGGCCTGAGCCTGTCTGACCTGCAGGAGGCCAAGTATGAAACCTTCTCTGATGACCCCAGC CCTGGGGCCATTGACAGCAACAACAGCCTGTCTGAGATGACCCACTTCAGGCCCCAGCTGCACCACTCTG GGGACATGGTGTTCACCCCTGAGTCTGGCCTGCAGCTGAGGCTGAATGAGAAGCTGGGCACCACTGCTGC CACTGAGCTGAAGAAGCTGGACTTCAAAGTCTCCAGCACCAGCAACAACCTGATCAGCACCATCCCCTCT GACAACCTGGCTGCTGGCACTGACAACACCAGCAGCCTGGGCCCCCCCAGCATGCCTGTGCACTATGACA GCCAGCTGGACACCACCCTGTTTGGCAAGAAGAGCAGCCCCCTGACTGAGTCTGGGGGCCCCCTGAGCCT GTCTGAGGAGAACAATGACAGCAAGCTGCTGGAGTCTGGCCTGATGAACAGCCAGGAGAGCAGCTGGGGC AAGAATGTGAGCAGCAGGGAGATCACCAGGACCACCCTGCAGTCTGACCAGGAGGAGATTGACTATGATG ACACCATCTCTGTGGAGATGAAGAAGGAGGACTTTGACATCTACGACGAGGACGAGAACCAGAGCCCCAG GAGCTTCCAGAAGAAGACCAGGCACTACTTCATTGCTGCTGTGGAGAGGCTGTGGGACTATGGCATGAGC AGCAGCCCCCATGTGCTGAGGAACAGGGCCCAGTCTGGCTCTGTGCCCCAGTTCAAGAAGGTGGTGTTCC AGGAGTTCACTGATGGCAGCTTCACCCAGCCCCTGTACAGAGGGGAGCTGAATGAGCACCTGGGCCTGCT GGGCCCCTACATCAGGGCTGAGGTGGAGGACAACATCATGGTGACCTTCAGGAACCAGGCCAGCAGGCCC TACAGCTTCTACAGCAGCCTGATCAGCTATGAGGAGGACCAGAGGCAGGGGGCTGAGCCCAGGAAGAACT TTGTGAAGCCCAATGAAACCAAGACCTACTTCTGGAAGGTGCAGCACCACATGGCCCCCACCAAGGATGA GTTTGACTGCAAGGCCTGGGCCTACTTCTCTGATGTGGACCTGGAGAAGGATGTGCACTCTGGCCTGATT GGCCCCCTGCTGGTGTGCCACACCAACACCCTGAACCCTGCCCATGGCAGGCAGGTGACTGTGCAGGAGT TTGCCCTGTTCTTCACCATCTTTGATGAAACCAAGAGCTGGTACTTCACTGAGAACATGGAGAGGAACTG CAGGGCCCCCTGCAACATCCAGATGGAGGACCCCACCTTCAAGGAGAACTACAGGTTCCATGCCATCAAT GGCTACATCATGGACACCCTGCCTGGCCTGGTGATGGCCCAGGACCAGAGGATCAGGTGGTACCTGCTGA GCATGGGCAGCAATGAGAACATCCACAGCATCCACTTCTCTGGCCATGTGTTCACTGTGAGGAAGAAGGA 
GGAGTACAAGATGGCCCTGTACAACCTGTACCCTGGGGTGTTTGAGACTGTGGAGATGCTGCCCAGCAAG GCTGGCATCTGGAGGGTGGAGTGCCTGATTGGGGAGCACCTGCATGCTGGCATGAGCACCCTGTTCCTGG TGTACAGCAACAAGTGCCAGACCCCCCTGGGCATGGCCTCTGGCCACATCAGGGACTTCCAGATCACTGC CTCTGGCCAGTATGGCCAGTGGGCCCCCAAGCTGGCCAGGCTGCACTACTCTGGCAGCATCAATGCCTGG AGCACCAAGGAGCCCTTCAGCTGGATCAAGGTGGACCTGCTGGCCCCCATGATCATCCATGGCATCAAGA CCCAGGGGGCCAGGCAGAAGTTCAGCAGCCTGTACATCAGCCAGTTCATCATCATGTACAGCCTGGATGG CAAGAAGTGGCAGACCTACAGGGGCAACAGCACTGGCACCCTGATGGTGTTCTTTGGCAATGTGGACAGC TCTGGCATCAAGCACAACATCTTCAACCCCCCCATCATTGCCAGATACATCAGGCTGCACCCCACCCACT ACAGCATCAGGAGCACCCTGAGGATGGAGCTGATGGGCTGTGACCTGAACAGCTGCAGCATGCCCCTGGG CATGGAGAGCAAGGCCATCTCTGATGCCCAGATCACTGCCAGCAGCTACTTCACCAACATGTTTGCCACC TGGAGCCCCAGCAAGGCCAGGCTGCACCTGCAGGGCAGGAGCAATGCCTGGAGGCCCCAGGTCAACAACC CCAAGGAGTGGCTGCAGGTGGACTTCCAGAAGACCATGAAGGTGACTGGGGTGACCACCCAGGGGGTGAA GAGCCTGCTGACCAGCATGTATGTGAAGGAGTTCCTGATCAGCAGCAGCCAGGATGGCCACCAGTGGACC CTGTTCTTCCAGAATGGCAAGGTGAAGGTGTTCCAGGGCAACCAGGACAGCTTCACCCCTGTGGTGAACA GCCTGGACCCCCCCCTGCTGACCAGATACCTGAGGATTCACCCCCAGAGCTGGGTGCACCAGATTGCCCT GAGGATGGAGGTGCTGGGCTGTGAGGCCCAGGACCTGTACTGA
SEQ ID NO: 18 Exemplary FVIII transgene (V3)
Length: 4425; Molecule type: DNA; Feature location/qualifiers: source, 1..4425; mol_type, other DNA; note, codon-optimized FVIII transgene (V3); organism, synthetic construct
ATGCAGATTGAGCTGAGCACCTGCTTCTTCCTGTGCCTGCTGAGGTTCTGCTTCTCTGCCACCAGGAGAT ACTACCTGGGGGCTGTGGAGCTGAGCTGGGACTACATGCAGTCTGACCTGGGGGAGCTGCCTGTGGATGC CAGGTTCCCCCCCAGAGTGCCCAAGAGCTTCCCCTTCAACACCTCTGTGGTGTACAAGAAGACCCTGTTT GTGGAGTTCACTGACCACCTGTTCAACATTGCCAAGCCCAGGCCCCCCTGGATGGGCCTGCTGGGCCCCA CCATCCAGGCTGAGGTGTATGACACTGTGGTGATCACCCTGAAGAACATGGCCAGCCACCCTGTGAGCCT GCATGCTGTGGGGGTGAGCTACTGGAAGGCCTCTGAGGGGGCTGAGTATGATGACCAGACCAGCCAGAGG GAGAAGGAGGATGACAAGGTGTTCCCTGGGGGCAGCCACACCTATGTGTGGCAGGTGCTGAAGGAGAATG GCCCCATGGCCTCTGACCCCCTGTGCCTGACCTACAGCTACCTGAGCCATGTGGACCTGGTGAAGGACCT GAACTCTGGCCTGATTGGGGCCCTGCTGGTGTGCAGGGAGGGCAGCCTGGCCAAGGAGAAGACCCAGACC CTGCACAAGTTCATCCTGCTGTTTGCTGTGTTTGATGAGGGCAAGAGCTGGCACTCTGAAACCAAGAACA GCCTGATGCAGGACAGGGATGCTGCCTCTGCCAGGGCCTGGCCCAAGATGCACACTGTGAATGGCTATGT GAACAGGAGCCTGCCTGGCCTGATTGGCTGCCACAGGAAGTCTGTGTACTGGCATGTGATTGGCATGGGC ACCACCCCTGAGGTGCACAGCATCTTCCTGGAGGGCCACACCTTCCTGGTCAGGAACCACAGGCAGGCCA GCCTGGAGATCAGCCCCATCACCTTCCTGACTGCCCAGACCCTGCTGATGGACCTGGGCCAGTTCCTGCT GTTCTGCCACATCAGCAGCCACCAGCATGATGGCATGGAGGCCTATGTGAAGGTGGACAGCTGCCCTGAG GAGCCCCAGCTGAGGATGAAGAACAATGAGGAGGCTGAGGACTATGATGATGACCTGACTGACTCTGAGA TGGATGTGGTGAGGTTTGATGATGACAACAGCCCCAGCTTCATCCAGATCAGGTCTGTGGCCAAGAAGCA CCCCAAGACCTGGGTGCACTACATTGCTGCTGAGGAGGAGGACTGGGACTATGCCCCCCTGGTGCTGGCC CCTGATGACAGGAGCTACAAGAGCCAGTACCTGAACAATGGCCCCCAGAGGATTGGCAGGAAGTACAAGA AGGTCAGGTTCATGGCCTACACTGATGAAACCTTCAAGACCAGGGAGGCCATCCAGCATGAGTCTGGCAT CCTGGGCCCCCTGCTGTATGGGGAGGTGGGGGACACCCTGCTGATCATCTTCAAGAACCAGGCCAGCAGG CCCTACAACATCTACCCCCATGGCATCACTGATGTGAGGCCCCTGTACAGCAGGAGGCTGCCCAAGGGGG TGAAGCACCTGAAGGACTTCCCCATCCTGCCTGGGGAGATCTTCAAGTACAAGTGGACTGTGACTGTGGA GGATGGCCCCACCAAGTCTGACCCCAGGTGCCTGACCAGATACTACAGCAGCTTTGTGAACATGGAGAGG GACCTGGCCTCTGGCCTGATTGGCCCCCTGCTGATCTGCTACAAGGAGTCTGTGGACCAGAGGGGCAACC AGATCATGTCTGACAAGAGGAATGTGATCCTGTTCTCTGTGTTTGATGAGAACAGGAGCTGGTACCTGAC TGAGAACATCCAGAGGTTCCTGCCCAACCCTGCTGGGGTGCAGCTGGAGGACCCTGAGTTCCAGGCCAGC AACATCATGCACAGCATCAATGGCTATGTGTTTGACAGCCTGCAGCTGTCTGTGTGCCTGCATGAGGTGG 
CCTACTGGTACATCCTGAGCATTGGGGCCCAGACTGACTTCCTGTCTGTGTTCTTCTCTGGCTACACCTT CAAGCACAAGATGGTGTATGAGGACACCCTGACCCTGTTCCCCTTCTCTGGGGAGACTGTGTTCATGAGC ATGGAGAACCCTGGCCTGTGGATTCTGGGCTGCCACAACTCTGACTTCAGGAACAGGGGCATGACTGCCC TGCTGAAAGTCTCCAGCTGTGACAAGAACACTGGGGACTACTATGAGGACAGCTATGAGGACATCTCTGC CTACCTGCTGAGCAAGAACAATGCCATTGAGCCCAGGAGCTTCAGCCAGAATGCCACTAATGTGTCTAAC AACAGCAACACCAGCAATGACAGCAATGTGTCTCCCCCAGTGCTGAAGAGGCACCAGAGGGAGATCACCA GGACCACCCTGCAGTCTGACCAGGAGGAGATTGACTATGATGACACCATCTCTGTGGAGATGAAGAAGGA GGACTTTGACATCTACGACGAGGACGAGAACCAGAGCCCCAGGAGCTTCCAGAAGAAGACCAGGCACTAC TTCATTGCTGCTGTGGAGAGGCTGTGGGACTATGGCATGAGCAGCAGCCCCCATGTGCTGAGGAACAGGG CCCAGTCTGGCTCTGTGCCCCAGTTCAAGAAGGTGGTGTTCCAGGAGTTCACTGATGGCAGCTTCACCCA GCCCCTGTACAGAGGGGAGCTGAATGAGCACCTGGGCCTGCTGGGCCCCTACATCAGGGCTGAGGTGGAG GACAACATCATGGTGACCTTCAGGAACCAGGCCAGCAGGCCCTACAGCTTCTACAGCAGCCTGATCAGCT ATGAGGAGGACCAGAGGCAGGGGGCTGAGCCCAGGAAGAACTTTGTGAAGCCCAATGAAACCAAGACCTA CTTCTGGAAGGTGCAGCACCACATGGCCCCCACCAAGGATGAGTTTGACTGCAAGGCCTGGGCCTACTTC TCTGATGTGGACCTGGAGAAGGATGTGCACTCTGGCCTGATTGGCCCCCTGCTGGTGTGCCACACCAACA CCCTGAACCCTGCCCATGGCAGGCAGGTGACTGTGCAGGAGTTTGCCCTGTTCTTCACCATCTTTGATGA AACCAAGAGCTGGTACTTCACTGAGAACATGGAGAGGAACTGCAGGGCCCCCTGCAACATCCAGATGGAG GACCCCACCTTCAAGGAGAACTACAGGTTCCATGCCATCAATGGCTACATCATGGACACCCTGCCTGGCC TGGTGATGGCCCAGGACCAGAGGATCAGGTGGTACCTGCTGAGCATGGGCAGCAATGAGAACATCCACAG CATCCACTTCTCTGGCCATGTGTTCACTGTGAGGAAGAAGGAGGAGTACAAGATGGCCCTGTACAACCTG TACCCTGGGGTGTTTGAGACTGTGGAGATGCTGCCCAGCAAGGCTGGCATCTGGAGGGTGGAGTGCCTGA TTGGGGAGCACCTGCATGCTGGCATGAGCACCCTGTTCCTGGTGTACAGCAACAAGTGCCAGACCCCCCT GGGCATGGCCTCTGGCCACATCAGGGACTTCCAGATCACTGCCTCTGGCCAGTATGGCCAGTGGGCCCCC AAGCTGGCCAGGCTGCACTACTCTGGCAGCATCAATGCCTGGAGCACCAAGGAGCCCTTCAGCTGGATCA AGGTGGACCTGCTGGCCCCCATGATCATCCATGGCATCAAGACCCAGGGGGCCAGGCAGAAGTTCAGCAG CCTGTACATCAGCCAGTTCATCATCATGTACAGCCTGGATGGCAAGAAGTGGCAGACCTACAGGGGCAAC AGCACTGGCACCCTGATGGTGTTCTTTGGCAATGTGGACAGCTCTGGCATCAAGCACAACATCTTCAACC CCCCCATCATTGCCAGATACATCAGGCTGCACCCCACCCACTACAGCATCAGGAGCACCCTGAGGATGGA 
GCTGATGGGCTGTGACCTGAACAGCTGCAGCATGCCCCTGGGCATGGAGAGCAAGGCCATCTCTGATGCC CAGATCACTGCCAGCAGCTACTTCACCAACATGTTTGCCACCTGGAGCCCCAGCAAGGCCAGGCTGCACC TGCAGGGCAGGAGCAATGCCTGGAGGCCCCAGGTCAACAACCCCAAGGAGTGGCTGCAGGTGGACTTCCA GAAGACCATGAAGGTGACTGGGGTGACCACCCAGGGGGTGAAGAGCCTGCTGACCAGCATGTATGTGAAG GAGTTCCTGATCAGCAGCAGCCAGGATGGCCACCAGTGGACCCTGTTCTTCCAGAATGGCAAGGTGAAGG TGTTCCAGGGCAACCAGGACAGCTTCACCCCTGTGGTGAACAGCCTGGACCCCCCCCTGCTGACCAGATA CCTGAGGATTCACCCCCAGAGCTGGGTGCACCAGATTGCCCTGAGGATGGAGGTGCTGGGCTGTGAGGCC CAGGACCTGTACTGA
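SEQ ID NO: 19 Complementary strand to the exemplary FVIII transgene (N6)
Length: 5013; Molecule type: DNA; Feature location/qualifiers: source, 1..5013; mol_type, other DNA; note, codon-optimized FVIII transgene (N6) complementary strand; organism, synthetic construct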
TACGTCTAACTCGACTCGTGGACGAAGAAGGACACGGACGACTCCAAGACGAAGAGACGGTGGTCCTCTA
TGATGGACCCCCGACACCTCGACTCGACCCTGATGTACGTCAGACTGGACCCCCTCGACGGACACCTACG
GTCCAAGGGGGGGTCTCACGGGTTCTCGAAGGGGAAGTTGTGGAGACACCACATGTTCTTCTGGGACAAA
CACCTCAAGTGACTGGTGGACAAGTTGTAACGGTTCGGGTCCGGGGGGACCTACCCGGACGACCCGGGGT
GGTAGGTCCGACTCCACATACTGTGACACCACTAGTGGGACTTCTTGTACCGGTCGGTGGGACACTCGGA
CGTACGACACCCCCACTCGATGACCTTCCGGAGACTCCCCCGACTCATACTACTGGTCTGGTCGGTCTCC
CTCTTCCTCCTACTGTTCCACAAGGGACCCCCGTCGGTGTGGATACACACCGTCCACGACTTCCTCTTAC
CGGGGTACCGGAGACTGGGGGACACGGACTGGATGTCGATGGACTCGGTACACCTGGACCACTTCCTGGA
CTTGAGACCGGACTAACCCCGGGACGACCACACGTCCCTCCCGTCGGACCGGTTCCTCTTCTGGGTCTGG
GACGTGTTCAAGTAGGACGACAAACGACACAAACTACTCCCGTTCTCGACCGTGAGACTTTGGTTCTTGT
CGGACTACGTCCTGTCCCTACGACGGAGACGGTCCCGGACCGGGTTCTACGTGTGACACTTACCGATACA
CTTGTCCTCGGACGGACCGGACTAACCGACGGTGTCCTTCAGACACATGACCGTACACTAACCGTACCCG
TGGTGGGGACTCCACGTGTCGTAGAAGGACCTCCCGGTGTGGAAGGACCAGTCCTTGGTGTCCGTCCGGT
CGGACCTCTAGTCGGGGTAGTGGAAGGACTGACGGGTCTGGGACGACTACCTGGACCCGGTCAAGGACGA
CAAGACGGTGTAGTCGTCGGTGGTCGTACTACCGTACCTCCGGATACACTTCCACCTGTCGACGGGACTC
CTCGGGGTCGACTCCTACTTCTTGTTACTCCTCCGACTCCTGATACTACTACTGGACTGACTGAGACTCT
ACCTACACCACTCCAAACTACTACTGTTGTCGGGGTCGAAGTAGGTCTAGTCCAGACACCGGTTCTTCGT
GGGGTTCTGGACCCACGTGATGTAACGACGACTCCTCCTCCTGACCCTGATACGGGGGGACCACGACCGG
GGACTACTGTCCTCGATGTTCTCGGTCATGGACTTGTTACCGGGGGTCTCCTAACCGTCCTTCATGTTCT
TCCAGTCCAAGTACCGGATGTGACTACTTTGGAAGTTCTGGTCCCTCCGGTAGGTCGTACTCAGACCGTA
GGACCCGGGGGACGACATACCCCTCCACCCCCTGTGGGACGACTAGTAGAAGTTCTTGGTCCGGTCGTCC
GGGATGTTGTAGATGGGGGTACCGTAGTGACTACACTCCGGGGACATGTCGTCCTCCGACGGGTTCCCCC
ACTTCGTGGACTTCCTGAAGGGGTAGGACGGACCCCTCTAGAAGTTCATGTTCACCTGACACTGACACCT
CCTACCGGGGTGGTTCAGACTGGGGTCCACGGACTGGTCTATGATGTCGTCGAAACACTTGTACCTCTCC
CTGGACCGGAGACCGGACTAACCGGGGGACGACTAGACGATGTTCCTCAGACACCTGGTCTCCCCGTTGG
TCTAGTACAGACTGTTCTCCTTACACTAGGACAAGAGACACAAACTACTCTTGTCCTCGACCATGGACTG
ACTCTTGTAGGTCTCCAAGGACGGGTTGGGACGACCCCACGTCGACCTCCTGGGACTCAAGGTCCGGTCG
TTGTAGTACGTGTCGTAGTTACCGATACACAAACTGTCGGACGTCGACAGACACACGGACGTACTCCACC
GGATGACCATGTAGGACTCGTAACCCCGGGTCTGACTGAAGGACAGACACAAGAAGAGACCGATGTGGAA
GTTCGTGTTCTACCACATACTCCTGTGGGACTGGGACAAGGGGAAGAGACCCCTCTGACACAAGTACTCG
TACCTCTTGGGACCGGACACCTAAGACCCGACGGTGTTGAGACTGAAGTCCTTGTCCCCGTACTGACGGG
ACGACTTTCAGAGGTCGACACTGTTCTTGTGACCCCTGATGATACTCCTGTCGATACTCCTGTAGAGACG
GATGGACGACTCGTTCTTGTTACGGTAACTCGGGTCCTCGAAGTCGGTCTTGTCGTCCGTGGGGTCGTGG
TCCGTCTTCGTCAAGTTACGGTGGTGGTAGGGACTCTTACTGTATCTCTTCTGTCTGGGTACCAAACGGG
TGGCCTGGGGGTACGGGTTCTAGGTCTTACACTCGTCGAGACTGGACGACTACGACGACTCCGTCTCGGG
GTGGGGGGTACCGGACTCGGACAGACTGGACGTCCTCCGGTTCATACTTTGGAAGAGACTACTGGGGTCG
GGACCCCGGTAACTGTCGTTGTTGTCGGACAGACTCTACTGGGTGAAGTCCGGGGTCGACGTGGTGAGAC
CCCTGTACCACAAGTGGGGACTCAGACCGGACGTCGACTCCGACTTACTCTTCGACCCGTGGTGACGACG
GTGACTCGACTTCTTCGACCTGAAGTTTCAGAGGTCGTGGTCGTTGTTGGACTAGTCGTGGTAGGGGAGA
CTGTTGGACCGACGACCGTGACTGTTGTGGTCGTCGGACCCGGGGGGGTCGTACGGACACGTGATACTGT
CGGTCGACCTGTGGTGGGACAAACCGTTCTTCTCGTCGGGGGACTGACTCAGACCCCCGGGGGACTCGGA
CAGACTCCTCTTGTTACTGTCGTTCGACGACCTCAGACCGGACTACTTGTCGGTCCTCTCGTCGACCCCG
TTCTTACACTCGTCGTCCCTCTAGTGGTCCTGGTGGGACGTCAGACTGGTCCTCCTCTAACTGATACTAC
TGTGGTAGAGACACCTCTACTTCTTCCTCCTGAAACTGTAGATGCTGCTCCTGCTCTTGGTCTCGGGGTC
CTCGAAGGTCTTCTTCTGGTCCGTGATGAAGTAACGACGACACCTCTCCGACACCCTGATACCGTACTCG
TCGTCGGGGGTACACGACTCCTTGTCCCGGGTCAGACCGAGACACGGGGTCAAGTTCTTCCACCACAAGG
TCCTCAAGTGACTACCGTCGAAGTGGGTCGGGGACATGTCTCCCCTCGACTTACTCGTGGACCCGGACGA
CCCGGGGATGTAGTCCCGACTCCACCTCCTGTTGTAGTACCACTGGAAGTCCTTGGTCCGGTCGTCCGGG
ATGTCGAAGATGTCGTCGGACTAGTCGATACTCCTCCTGGTCTCCGTCCCCCGACTCGGGTCCTTCTTGA
AACACTTCGGGTTACTTTGGTTCTGGATGAAGACCTTCCACGTCGTGGTGTACCGGGGGTGGTTCCTACT
CAAACTGACGTTCCGGACCCGGATGAAGAGACTACACCTGGACCTCTTCCTACACGTGAGACCGGACTAA
CCGGGGGACGACCACACGGTGTGGTTGTGGGACTTGGGACGGGTACCGTCCGTCCACTGACACGTCCTCA
AACGGGACAAGAAGTGGTAGAAACTACTTTGGTTCTCGACCATGAAGTGACTCTTGTACCTCTCCTTGAC
GTCCCGGGGGACGTTGTAGGTCTACCTCCTGGGGTGGAAGTTCCTCTTGATGTCCAAGGTACGGTAGTTA
CCGATGTAGTACCTGTGGGACGGACCGGACCACTACCGGGTCCTGGTCTCCTAGTCCACCATGGACGACT
CGTACCCGTCGTTACTCTTGTAGGTGTCGTAGGTGAAGAGACCGGTACACAAGTGACACTCCTTCTTCCT
CCTCATGTTCTACCGGGACATGTTGGACATGGGACCCCACAAACTCTGACACCTCTACGACGGGTCGTTC
CGACCGTAGACCTCCCACCTCACGGACTAACCCCTCGTGGACGTACGACCGTACTCGTGGGACAAGGACC
ACATGTCGTTGTTCACGGTCTGGGGGGACCCGTACCGGAGACCGGTGTAGTCCCTGAAGGTCTAGTGACG
GAGACCGGTCATACCGGTCACCCGGGGGTTCGACCGGTCCGACGTGATGAGACCGTCGTAGTTACGGACC
TCGTGGTTCCTCGGGAAGTCGACCTAGTTCCACCTGGACGACCGGGGGTACTAGTAGGTACCGTAGTTCT
GGGTCCCCCGGTCCGTCTTCAAGTCGTCGGACATGTAGTCGGTCAAGTAGTAGTACATGTCGGACCTACC
GTTCTTCACCGTCTGGATGTCCCCGTTGTCGTGACCGTGGGACTACCACAAGAAACCGTTACACCTGTCG
AGACCGTAGTTCGTGTTGTAGAAGTTGGGGGGGTAGTAACGGTCTATGTAGTCCGACGTGGGGTGGGTGA
TGTCGTAGTCCTCGTGGGACTCCTACCTCGACTACCCGACACTGGACTTGTCGACGTCGTACGGGGACCC
GTACCTCTCGTTCCGGTAGAGACTACGGGTCTAGTGACGGTCGTCGATGAAGTGGTTGTACAAACGGTGG
ACCTCGGGGTCGTTCCGGTCCGACGTGGACGTCCCGTCCTCGTTACGGACCTCCGGGGTCCAGTTGTTGG
GGTTCCTCACCGACGTCCACCTGAAGGTCTTCTGGTACTTCCACTGACCCCACTGGTGGGTCCCCCACTT
CTCGGACGACTGGTCGTACATACACTTCCTCAAGGACTAGTCGTCGTCGGTCCTACCGGTGGTCACCTGG
GACAAGAAGGTCTTACCGTTCCACTTCCACAAGGTCCCGTTGGTCCTGTCGAAGTGGGGACACCACTTGT
CGGACCTGGGGGGGGACGACTGGTCTATGGACTCCTAAGTGGGGGTCTCGACCCACGTGGTCTAACGGGA
CTCCTACCTCCACGACCCGACACTCCGGGTCCTGGACATGACT
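SEQ ID NO: 20 Complementary strand to the exemplary FVIII transgene (V3)
Length: 4425; Molecule type: DNA; Feature location/qualifiers: source, 1..4425; mol_type, other DNA; note, codon-optimized FVIII transgene (V3) complementary strand; organism, synthetic construct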
TACGTCTAACTCGACTCGTGGACGAAGAAGGACACGGACGACTCCAAGACGAAGAGACGGTGGTCCTCTA TGATGGACCCCCGACACCTCGACTCGACCCTGATGTACGTCAGACTGGACCCCCTCGACGGACACCTACG GTCCAAGGGGGGGTCTCACGGGTTCTCGAAGGGGAAGTTGTGGAGACACCACATGTTCTTCTGGGACAAA CACCTCAAGTGACTGGTGGACAAGTTGTAACGGTTCGGGTCCGGGGGGACCTACCCGGACGACCCGGGGT GGTAGGTCCGACTCCACATACTGTGACACCACTAGTGGGACTTCTTGTACCGGTCGGTGGGACACTCGGA CGTACGACACCCCCACTCGATGACCTTCCGGAGACTCCCCCGACTCATACTACTGGTCTGGTCGGTCTCC CTCTTCCTCCTACTGTTCCACAAGGGACCCCCGTCGGTGTGGATACACACCGTCCACGACTTCCTCTTAC CGGGGTACCGGAGACTGGGGGACACGGACTGGATGTCGATGGACTCGGTACACCTGGACCACTTCCTGGA CTTGAGACCGGACTAACCCCGGGACGACCACACGTCCCTCCCGTCGGACCGGTTCCTCTTCTGGGTCTGG GACGTGTTCAAGTAGGACGACAAACGACACAAACTACTCCCGTTCTCGACCGTGAGACTTTGGTTCTTGT CGGACTACGTCCTGTCCCTACGACGGAGACGGTCCCGGACCGGGTTCTACGTGTGACACTTACCGATACA CTTGTCCTCGGACGGACCGGACTAACCGACGGTGTCCTTCAGACACATGACCGTACACTAACCGTACCCG TGGTGGGGACTCCACGTGTCGTAGAAGGACCTCCCGGTGTGGAAGGACCAGTCCTTGGTGTCCGTCCGGT CGGACCTCTAGTCGGGGTAGTGGAAGGACTGACGGGTCTGGGACGACTACCTGGACCCGGTCAAGGACGA CAAGACGGTGTAGTCGTCGGTGGTCGTACTACCGTACCTCCGGATACACTTCCACCTGTCGACGGGACTC CTCGGGGTCGACTCCTACTTCTTGTTACTCCTCCGACTCCTGATACTACTACTGGACTGACTGAGACTCT ACCTACACCACTCCAAACTACTACTGTTGTCGGGGTCGAAGTAGGTCTAGTCCAGACACCGGTTCTTCGT GGGGTTCTGGACCCACGTGATGTAACGACGACTCCTCCTCCTGACCCTGATACGGGGGGACCACGACCGG GGACTACTGTCCTCGATGTTCTCGGTCATGGACTTGTTACCGGGGGTCTCCTAACCGTCCTTCATGTTCT TCCAGTCCAAGTACCGGATGTGACTACTTTGGAAGTTCTGGTCCCTCCGGTAGGTCGTACTCAGACCGTA GGACCCGGGGGACGACATACCCCTCCACCCCCTGTGGGACGACTAGTAGAAGTTCTTGGTCCGGTCGTCC GGGATGTTGTAGATGGGGGTACCGTAGTGACTACACTCCGGGGACATGTCGTCCTCCGACGGGTTCCCCC ACTTCGTGGACTTCCTGAAGGGGTAGGACGGACCCCTCTAGAAGTTCATGTTCACCTGACACTGACACCT CCTACCGGGGTGGTTCAGACTGGGGTCCACGGACTGGTCTATGATGTCGTCGAAACACTTGTACCTCTCC CTGGACCGGAGACCGGACTAACCGGGGGACGACTAGACGATGTTCCTCAGACACCTGGTCTCCCCGTTGG TCTAGTACAGACTGTTCTCCTTACACTAGGACAAGAGACACAAACTACTCTTGTCCTCGACCATGGACTG ACTCTTGTAGGTCTCCAAGGACGGGTTGGGACGACCCCACGTCGACCTCCTGGGACTCAAGGTCCGGTCG TTGTAGTACGTGTCGTAGTTACCGATACACAAACTGTCGGACGTCGACAGACACACGGACGTACTCCACC 
GGATGACCATGTAGGACTCGTAACCCCGGGTCTGACTGAAGGACAGACACAAGAAGAGACCGATGTGGAA GTTCGTGTTCTACCACATACTCCTGTGGGACTGGGACAAGGGGAAGAGACCCCTCTGACACAAGTACTCG TACCTCTTGGGACCGGACACCTAAGACCCGACGGTGTTGAGACTGAAGTCCTTGTCCCCGTACTGACGGG ACGACTTTCAGAGGTCGACACTGTTCTTGTGACCCCTGATGATACTCCTGTCGATACTCCTGTAGAGACG GATGGACGACTCGTTCTTGTTACGGTAACTCGGGTCCTCGAAGTCGGTCTTACGGTGATTACACAGATTG TTGTCGTTGTGGTCGTTACTGTCGTTACACAGAGGGGGTCACGACTTCTCCGTGGTCTCCCTCTAGTGGT CCTGGTGGGACGTCAGACTGGTCCTCCTCTAACTGATACTACTGTGGTAGAGACACCTCTACTTCTTCCT CCTGAAACTGTAGATGCTGCTCCTGCTCTTGGTCTCGGGGTCCTCGAAGGTCTTCTTCTGGTCCGTGATG AAGTAACGACGACACCTCTCCGACACCCTGATACCGTACTCGTCGTCGGGGGTACACGACTCCTTGTCCC GGGTCAGACCGAGACACGGGGTCAAGTTCTTCCACCACAAGGTCCTCAAGTGACTACCGTCGAAGTGGGT CGGGGACATGTCTCCCCTCGACTTACTCGTGGACCCGGACGACCCGGGGATGTAGTCCCGACTCCACCTC CTGTTGTAGTACCACTGGAAGTCCTTGGTCCGGTCGTCCGGGATGTCGAAGATGTCGTCGGACTAGTCGA TACTCCTCCTGGTCTCCGTCCCCCGACTCGGGTCCTTCTTGAAACACTTCGGGTTACTTTGGTTCTGGAT GAAGACCTTCCACGTCGTGGTGTACCGGGGGTGGTTCCTACTCAAACTGACGTTCCGGACCCGGATGAAG AGACTACACCTGGACCTCTTCCTACACGTGAGACCGGACTAACCGGGGGACGACCACACGGTGTGGTTGT GGGACTTGGGACGGGTACCGTCCGTCCACTGACACGTCCTCAAACGGGACAAGAAGTGGTAGAAACTACT TTGGTTCTCGACCATGAAGTGACTCTTGTACCTCTCCTTGACGTCCCGGGGGACGTTGTAGGTCTACCTC CTGGGGTGGAAGTTCCTCTTGATGTCCAAGGTACGGTAGTTACCGATGTAGTACCTGTGGGACGGACCGG ACCACTACCGGGTCCTGGTCTCCTAGTCCACCATGGACGACTCGTACCCGTCGTTACTCTTGTAGGTGTC GTAGGTGAAGAGACCGGTACACAAGTGACACTCCTTCTTCCTCCTCATGTTCTACCGGGACATGTTGGAC ATGGGACCCCACAAACTCTGACACCTCTACGACGGGTCGTTCCGACCGTAGACCTCCCACCTCACGGACT AACCCCTCGTGGACGTACGACCGTACTCGTGGGACAAGGACCACATGTCGTTGTTCACGGTCTGGGGGGA CCCGTACCGGAGACCGGTGTAGTCCCTGAAGGTCTAGTGACGGAGACCGGTCATACCGGTCACCCGGGGG TTCGACCGGTCCGACGTGATGAGACCGTCGTAGTTACGGACCTCGTGGTTCCTCGGGAAGTCGACCTAGT TCCACCTGGACGACCGGGGGTACTAGTAGGTACCGTAGTTCTGGGTCCCCCGGTCCGTCTTCAAGTCGTC GGACATGTAGTCGGTCAAGTAGTAGTACATGTCGGACCTACCGTTCTTCACCGTCTGGATGTCCCCGTTG TCGTGACCGTGGGACTACCACAAGAAACCGTTACACCTGTCGAGACCGTAGTTCGTGTTGTAGAAGTTGG GGGGGTAGTAACGGTCTATGTAGTCCGACGTGGGGTGGGTGATGTCGTAGTCCTCGTGGGACTCCTACCT 
CGACTACCCGACACTGGACTTGTCGACGTCGTACGGGGACCCGTACCTCTCGTTCCGGTAGAGACTACGG GTCTAGTGACGGTCGTCGATGAAGTGGTTGTACAAACGGTGGACCTCGGGGTCGTTCCGGTCCGACGTGG ACGTCCCGTCCTCGTTACGGACCTCCGGGGTCCAGTTGTTGGGGTTCCTCACCGACGTCCACCTGAAGGT CTTCTGGTACTTCCACTGACCCCACTGGTGGGTCCCCCACTTCTCGGACGACTGGTCGTACATACACTTC CTCAAGGACTAGTCGTCGTCGGTCCTACCGGTGGTCACCTGGGACAAGAAGGTCTTACCGTTCCACTTCC ACAAGGTCCCGTTGGTCCTGTCGAAGTGGGGACACCACTTGTCGGACCTGGGGGGGGACGACTGGTCTAT GGACTCCTAAGTGGGGGTCTCGACCCACGTGGTCTAACGGGACTCCTACCTCCACGACCCGACACTCCGG GTCCTGGACATGACT
SEQ ID NO: 21 Exemplary FVIII polypeptide (N6)
Length: 1670; Molecule type: AA; Feature location/qualifiers: source, 1..1670; MOL_TYPE, protein; organism, Homo sapiens
MQIELSTCFFLCLLRFCFSATRRYYLGAVELSWDYMQSDLGELPVDARFPPRVPKSFPFNTSVVYKKTLFVEFTDHLFNIAKPRPPWMGLLGPTIQAEVYDTVVITLKNMASHPVSLHAVGVSYWKASEGAEYDDQTSQREKEDDKVFPGGSHTYVWQVLKENGPMASDPLCLTYSYLSHVDLVKDLNSGLIGALLVCREGSLAKEKTQTLHKFILLFAVFDEGKSWHSETKNSLMQDRDAASARAWPKMHTVNGYVNRSLPGLIGCHRKSVYWHVIGMGTTPEVHSIFLEGHTFLVRNHRQASLEISPITFLTAQTLLMDLGQFLLFCHISSHQHDGMEAYVKVDSCPEEPQLRMKNNEEAEDYDDDLTDSEMDVVRFDDDNSPSFIQIRSVAKKHPKTWVHYIAAEEEDWDYAPLVLAPDDRSYKSQYLNNGPQRIGRKYKKVRFMAYTDETFKTREAIQHESGILGPLLYGEVGDTLLIIFKNQASRPYNIYPHGITDVRPLYSRRLPKGVKHLKDFPILPGEIFKYKWTVTVEDGPTKSDPRCLTRYYSSFVNMERDLASGLIGPLLICYKESVDQRGNQIMSDKRNVILFSVFDENRSWYLTENIQRFLPNPAGVQLEDPEFQASNIMHSINGYVFDSLQLSVCLHEVAYWYILSIGAQTDFLSVFFSGYTFKHKMVYEDTLTLFPFSGETVFMSMENPGLWILGCHNSDFRNRGMTALLKVSSCDKNTGDYYEDSYEDISAYLLSKNNAIEPRSFSQNSRHPSTRQKQFNATTIPENDIEKTDPWFAHRTPMPKIQNVSSSDLLMLLRQSPTPHGLSLSDLQEAKYETFSDDPSPGAIDSNNSLSEMTHFRPQLHHSGDMVFTPESGLQLRLNEKLGTTAATELKKLDFKVSSTSNNLISTIPSDNLAAGTDNTSSLGPPSMPVHYDSQLDTTLFGKKSSPLTESGGPLSLSEENNDSKLLESGLMNSQESSWGKNVSSREITRTTLQSDQEEIDYDDTISVEMKKEDFDIYDEDENQSPRSFQKKTRHYFIAAVERLWDYGMSSSPHVLRNRAQSGSVPQFKKVVFQEFTDGSFTQPLYRGELNEHLGLLGPYIRAEVEDNIMVTFRNQASRPYSFYSSLISYEEDQRQGAEPRKNFVKPNETKTYFWKVQHHMAPTKDEFDCKAWAYFSDVDLEKDVHSGLIGPLLVCHTNTLNPAHGRQVTVQEFALFFTIFDETKSWYFTENMERNCRAPCNIQMEDPTFKENYRFHAINGYIMDTLPGLVMAQDQRIRWYLLSMGSNENIHSIHFSGHVFTVRKKEEYKMALYNLYPGVFETVEMLPSKAGIWRVECLIGEHLHAGMSTLFLVYSNKCQTPLGMASGHIRDFQITASGQYGQWAPKLARLHYSGSINAWSTKEPFSWIKVDLLAPMIIHGIKTQGARQKFSSLYISQFIIMYSLDGKKWQTYRGNSTGTLMVFFGNVDSSGIKHNIFNPPIIARYIRLHPTHYSIRSTLRMELMGCDLNSCSMPLGMESKAISDAQITASSYFTNMFATWSPSKARLHLQGRSNAWRPQVNNPKEWLQVDFQKTMKVTGVTTQGVKSLLTSMYVKEFLISSSQDGHQWTLFFQNGKVKVFQGNQDSFTPVVNSLDPPLLTRYLRIHPQSWVHQIALRMEVLGCEAQDLY
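The correspondence between the codon-optimized FVIII transgene (SEQ ID NO: 17) and the FVIII polypeptide (SEQ ID NO: 21) can be spot-checked by translating the opening codons. This is a minimal sketch, not the applicant's method. It assumes the standard genetic code and a reading frame that starts at the first base, and the names `CODON_TABLE` and `translate` are introduced here purely for illustration.

```python
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order for each position.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate DNA in frame 0, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3].upper()]
        if aa == "*":  # stop codon ends the open reading frame
            break
        protein.append(aa)
    return "".join(protein)


# First 21 bases of SEQ ID NO: 17 and first 7 residues of SEQ ID NO: 21.
dna_prefix = "ATGCAGATTGAGCTGAGCACC"
protein_prefix = "MQIELST"
print(translate(dna_prefix) == protein_prefix)
```

Applying `translate` to a full coding sequence would give a residue-by-residue comparison with the listed polypeptide. Any whitespace in the listing would need to be removed first.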
SEQ ID NO: 22 Exemplary FVIII polypeptide (V3)
Length: 1474; Molecule type: AA; Feature location/qualifiers: source, 1..1474; mol_type, protein; organism, Homo sapiens
MQIELSTCFFLCLLRFCFSATRRYYLGAVELSWDYMQSDLGELPVDARFPPRVPKSFPFNTSVVYKKTLF
VEFTDHLFNIAKPRPPWMGLLGPTIQAEVYDTVVITLKNMASHPVSLHAVGVSYWKASEGAEYDDQTSQR
EKEDDKVFPGGSHTYVWQVLKENGPMASDPLCLTYSYLSHVDLVKDLNSGLIGALLVCREGSLAKEKTQT
LHKFILLFAVFDEGKSWHSETKNSLMQDRDAASARAWPKMHTVNGYVNRSLPGLIGCHRKSVYWHVIGMG
TTPEVHSIFLEGHTFLVRNHRQASLEISPITFLTAQTLLMDLGQFLLFCHISSHQHDGMEAYVKVDSCPE
EPQLRMKNNEEAEDYDDDLTDSEMDVVRFDDDNSPSFIQIRSVAKKHPKTWVHYIAAEEEDWDYAPLVLA
PDDRSYKSQYLNNGPQRIGRKYKKVRFMAYTDETFKTREAIQHESGILGPLLYGEVGDTLLIIFKNQASR
PYNIYPHGITDVRPLYSRRLPKGVKHLKDFPILPGEIFKYKWTVTVEDGPTKSDPRCLTRYYSSFVNMER
DLASGLIGPLLICYKESVDQRGNQIMSDKRNVILFSVFDENRSWYLTENIQRFLPNPAGVQLEDPEFQAS
NIMHSINGYVFDSLQLSVCLHEVAYWYILSIGAQTDFLSVFFSGYTFKHKMVYEDTLTLFPFSGETVFMS
MENPGLWILGCHNSDFRNRGMTALLKVSSCDKNTGDYYEDSYEDISAYLLSKNNAIEPRSFSQNATNVSN
NSNTSNDSNVSPPVLKRHQREITRTTLQSDQEEIDYDDTISVEMKKEDFDIYDEDENQSPRSFQKKTRHY
FIAAVERLWDYGMSSSPHVLRNRAQSGSVPQFKKVVFQEFTDGSFTQPLYRGELNEHLGLLGPYIRAEVE
DNIMVTFRNQASRPYSFYSSLISYEEDQRQGAEPRKNFVKPNETKTYFWKVQHHMAPTKDEFDCKAWAYF
SDVDLEKDVHSGLIGPLLVCHTNTLNPAHGRQVTVQEFALFFTIFDETKSWYFTENMERNCRAPCNIQME
DPTFKENYRFHAINGYIMDTLPGLVMAQDQRIRWYLLSMGSNENIHSIHFSGHVFTVRKKEEYKMALYNL
YPGVFETVEMLPSKAGIWRVECLIGEHLHAGMSTLFLVYSNKCQTPLGMASGHIRDFQITASGQYGQWAP
KLARLHYSGSINAWSTKEPFSWIKVDLLAPMIIHGIKTQGARQKFSSLYISQFIIMYSLDGKKWQTYRGN
STGTLMVFFGNVDSSGIKHNIFNPPIIARYIRLHPTHYSIRSTLRMELMGCDLNSCSMPLGMESKAISDA
QITASSYFTNMFATWSPSKARLHLQGRSNAWRPQVNNPKEWLQVDFQKTMKVTGVTTQGVKSLLTSMYVK
EFLISSSQDGHQWTLFFQNGKVKVFQGNQDSFTPVVNSLDPPLLTRYLRIHPQSWVHQIALRMEVLGCEA
QDLY
SEQ ID NO: 23 Exemplary WPRE component (mWPRE)
Length: 600; Molecule type: DNA; Feature location/qualifiers: source, 1..600; mol_type, unassigned DNA; organism, Woodchuck hepatitis virus
1 GGGCCCAATC AACCTCTGGA TTACAAAATT TGTGAAAGAT TGACTGGTAT TCTTAACTAT
61 GTTGCTCCTT TTACGCTATG TGGATACGCT GCTTTAATGC CTTTGTATCA TGCTATTGCT
121 TCCCGTATGG CTTTCATTTT CTCCTCCTTG TATAAATCCT GGTTGCTGTC TCTTTATGAG
181 GAGTTGTGGC CCGTTGTCAG GCAACGTGGC GTGGTGTGCA CTGTGTTTGC TGACGCAACC
241 CCCACTGGTT GGGGCATTGC CACCACCTGT CAGCTCCTTT CCGGGACTTT CGCTTTCCCC
301 CTCCCTATTG CCACGGCGGA ACTCATCGCC GCCTGCCTTG CCCGCTGCTG GACAGGGGCT
361 CGGCTGTTGG GCACTGACAA TTCCGTGGTG TTGTCGGGGA AATCATCGTC CTTTCCTTGG
421 CTGCTCGCCT GTGTTGCCAC CTGGATTCTG CGCGGGACGT CCTTCTGCTA CGTCCCTTCG
481 GCCCTCAATC CAGCGGACCT TCCTTCCCGC GGCCTGCTGC CGGCTCTGCG GCCTCTTCCG
541 CGTCTTCGCC TTCGCCCTCA GACGAGTCGG ATCTCCCTTT GGGCCGCCTC CCCGCAAGCT
SEQ ID NO: 24 F/HN-SIV-hCEF-soA1AT plasmid as defined in Figure 3 (pDNA1 pGM407)
Length: 7349; Molecule type: DNA; Feature location/qualifiers: source, 1..7349; mol_type, other DNA; note, pGM407; organism, synthetic construct
1 GGTACCTCAA TATTGGCCAT TAGCCATATT ATTCATTGGT TATATAGCAT AAATCAATAT
61 TGGCTATTGG CCATTGCATA CGTTGTATCT ATATCATAAT ATGTACATTT ATATTGGCTC
121 ATGTCCAATA TGACCGCCAT GTTGGCATTG ATTATTGACT AGTTATTAAT AGTAATCAAT
181 TACGGGGTCA TTAGTTCATA GCCCATATAT GGAGTTCCGC GTTACATAAC TTACGGTAAA
241 TGGCCCGCCT GGCTGACCGC CCAACGACCC CCGCCCATTG ACGTCAATAA TGACGTATGT
301 TCCCATAGTA ACGCCAATAG GGACTTTCCA TTGACGTCAA TGGGTGGAGT ATTTACGGTA
361 AACTGCCCAC TTGGCAGTAC ATCAAGTGTA TCATATGCCA AGTCCGCCCC CTATTGACGT
421 CAATGACGGT AAATGGCCCG CCTGGCATTA TGCCCAGTAC ATGACCTTAC GGGACTTTCC
481 TACTTGGCAG TACATCTACG TATTAGTCAT CGCTATTACC ATGGTGATGC GGTTTTGGCA
541 GTACACCAAT GGGCGTGGAT AGCGGTTTGA CTCACGGGGA TTTCCAAGTC TCCACCCCAT
601 TGACGTCAAT GGGAGTTTGT TTTGGCACCA AAATCAACGG GACTTTCCAA AATGTCGTAA
661 CAACTGCGAT CGCCCGCCCC GTTGACGCAA ATGGGCGGTA GGCGTGTACG GTGGGAGGTC
721 TATATAAGCA GAGCTCGCTG GCTTGTAACT CAGTCTCTTA CTAGGAGACC AGCTTGAGCC
781 TGGGTGTTCG CTGGTTAGCC TAACCTGGTT GGCCACCAGG GGTAAGGACT CCTTGGCTTA
841 GAAAGCTAAT AAACTTGCCT GCATTAGAGC TTATCTGAGT CAAGTGTCCT CATTGACGCC
901 TCACTCTCTT GAACGGGAAT CTTCCTTACT GGGTTCTCTC TCTGACCCAG GCGAGAGAAA
961 CTCCAGCAGT GGCGCCCGAA CAGGGACTTG AGTGAGAGTG TAGGCACGTA CAGCTGAGAA
1021 GGCGTCGGAC GCGAAGGAAG CGCGGGGTGC GACGCGACCA AGAAGGAGAC TTGGTGAGTA
1081 GGCTTCTCGA GTGCCGGGAA AAAGCTCGAG CCTAGTTAGA GGACTAGGAG AGGCCGTAGC
1141 CGTAACTACT CTTGGGCAAG TAGGGCAGGC GGTGGGTACG CAATGGGGGC GGCTACCTCA
1201 GCACTAAATA GGAGACAATT AGACCAATTT GAGAAAATAC GACTTCGCCC GAACGGAAAG
1261 AAAAAGTACC AAATTAAACA TTTAATATGG GCAGGCAAGG AGATGGAGCG CTTCGGCCTC
1321 CATGAGAGGT TGTTGGAGAC AGAGGAGGGG TGTAAAAGAA TCATAGAAGT CCTCTACCCC
1381 CTAGAACCAA CAGGATCGGA GGGCTTAAAA AGTCTGTTCA ATCTTGTGTG CGTGCTATAT
1441 TGCTTGCACA AGGAACAGAA AGTGAAAGAC ACAGAGGAAG CAGTAGCAAC AGTAAGACAA
1501 CACTGCCATC TAGTGGAAAA AGAAAAAAGT GCAACAGAGA CATCTAGTGG ACAAAAGAAA
1561 AATGACAAGG GAATAGCAGC GCCACCTGGT GGCAGTCAGA ATTTTCCAGC GCAACAACAA
1621 GGAAATGCCT GGGTACATGT ACCCTTGTCA CCGCGCACCT TAAATGCGTG GGTAAAAGCA
1681 GTAGAGGAGA AAAAATTTGG AGCAGAAATA GTACCCATTT TTTTGTTTCA AGCCCTATCG
1741 AATTCCCGTT TGTGCTAGGG TTCTTAGGCT TCTTGGGGGC TGCTGGAACT GCAATGGGAG
1801 CAGCGGCGAC AGCCCTGACG GTCCAGTCTC AGCATTTGCT TGCTGGGATA CTGCAGCAGC
1861 AGAAGAATCT GCTGGCGGCT GTGGAGGCTC AACAGCAGAT GTTGAAGCTG ACCATTTGGG
1921 GTGTTAAAAA CCTCAATGCC CGCGTCACAG CCCTTGAGAA GTACCTAGAG GATCAGGCAC
1981 GACTAAACTC CTGGGGGTGC GCATGGAAAC AAGTATGTCA TACCACAGTG GAGTGGCCCT
2041 GGACAAATCG GACTCCGGAT TGGCAAAATA TGACTTGGTT GGAGTGGGAA AGACAAATAG
2101 CTGATTTGGA AAGCAACATT ACGAGACAAT TAGTGAAGGC TAGAGAACAA GAGGAAAAGA
2161 ATCTAGATGC CTATCAGAAG TTAACTAGTT GGTCAGATTT CTGGTCTTGG TTCGATTTCT
2221 CAAAATGGCT TAACATTTTA AAAATGGGAT TTTTAGTAAT AGTAGGAATA ATAGGGTTAA
2281 GATTACTTTA CACAGTATAT GGATGTATAG TGAGGGTTAG GCAGGGATAT GTTCCTCTAT
2341 CTCCACAGAT CCATATCCGC GGCAATTTTA AAAGAAAGGG AGGAATAGGG GGACAGACTT
2401 CAGCAGAGAG ACTAATTAAT ATAATAACAA CACAATTAGA AATACAACAT TTACAAACCA
2461 AAATTCAAAA AATTTTAAAT TTTAGAGCCG CGGAGATCTG TTACATAACT TATGGTAAAT
2521 GGCCTGCCTG GCTGACTGCC CAATGACCCC TGCCCAATGA TGTCAATAAT GATGTATGTT
2581 CCCATGTAAT GCCAATAGGG ACTTTCCATT GATGTCAATG GGTGGAGTAT TTATGGTAAC
2641 TGCCCACTTG GCAGTACATC AAGTGTATCA TATGCCAAGT ATGCCCCCTA TTGATGTCAA
2701 TGATGGTAAA TGGCCTGCCT GGCATTATGC CCAGTACATG ACCTTATGGG ACTTTCCTAC
2761 TTGGCAGTAC ATCTATGTAT TAGTCATTGC TATTACCATG GGAATTCACT AGTGGAGAAG
2821 AGCATGCTTG AGGGCTGAGT GCCCCTCAGT GGGCAGAGAG CACATGGCCC ACAGTCCCTG
2881 AGAAGTTGGG GGGAGGGGTG GGCAATTGAA CTGGTGCCTA GAGAAGGTGG GGCTTGGGTA
2941 AACTGGGAAA GTGATGTGGT GTACTGGCTC CACCTTTTTC CCCAGGGTGG GGGAGAACCA
3001 TATATAAGTG CAGTAGTCTC TGTGAACATT CAAGCTTCTG CCTTCTCCCT CCTGTGAGTT
3061 TGCTAGCCAC CATGCCCAGC TCTGTGTCCT GGGGCATTCT GCTGCTGGCT GGCCTGTGCT
3121 GTCTGGTGCC TGTGTCCCTG GCTGAGGACC CTCAGGGGGA TGCTGCCCAG AAAACAGACA
3181 CCTCCCACCA TGACCAGGAC CACCCCACCT TCAACAAGAT CACCCCCAAC CTGGCAGAGT
3241 TTGCCTTCAG CCTGTACAGA CAGCTGGCCC ACCAGAGCAA CAGCACCAAC ATCTTTTTCA
3301 GCCCTGTGTC CATTGCCACA GCCTTTGCCA TGCTGAGCCT GGGCACCAAG GCTGACACCC
3361 ATGATGAGAT CCTGGAAGGC CTGAACTTCA ACCTGACAGA GATCCCTGAG GCCCAGATCC
3421 ATGAGGGCTT CCAGGAACTG CTGAGAACCC TGAACCAGCC AGACAGCCAG CTGCAGCTGA
3481 CAACAGGCAA TGGGCTGTTC CTGTCTGAGG GCCTGAAGCT GGTGGACAAG TTTCTGGAAG
3541 ATGTGAAGAA GCTGTACCAC TCTGAGGCCT TCACAGTGAA CTTTGGGGAC ACAGAAGAGG
3601 CCAAGAAACA GATCAATGAC TATGTGGAAA AGGGCACCCA GGGCAAGATT GTGGACCTTG
3661 TGAAAGAGCT GGACAGGGAC ACTGTGTTTG CCCTTGTGAA CTACATCTTC TTCAAGGGCA
3721 AGTGGGAGAG GCCCTTTGAA GTGAAGGACA CTGAGGAAGA GGACTTCCAT GTGGACCAAG
3781 TGACCACAGT GAAGGTGCCA ATGATGAAGA GACTGGGGAT GTTCAATATC CAGCACTGCA
3841 AGAAACTGAG CAGCTGGGTG CTGCTGATGA AGTACCTGGG CAATGCTACA GCCATATTCT
3901 TTCTGCCTGA TGAGGGCAAG CTGCAGCACC TGGAAAATGA GCTGACCCAT GACATCATCA
3961 CCAAATTTCT GGAAAATGAG GACAGAAGAT CTGCCAGCCT GCATCTGCCC AAGCTGAGCA
4021 TCACAGGCAC ATATGACCTG AAGTCTGTGC TGGGACAGCT GGGAATCACC AAGGTGTTCA
4081 GCAATGGGGC AGACCTGAGT GGAGTGACAG AGGAAGCCCC TCTGAAGCTG TCCAAGGCTG
4141 TGCACAAGGC AGTGCTGACC ATTGATGAGA AGGGCACAGA GGCTGCTGGG GCCATGTTTC
4201 TGGAAGCCAT CCCCATGTCC ATCCCCCCAG AAGTGAAGTT CAACAAGCCC TTTGTGTTCC
4261 TGATGATTGA GCAGAACACC AAGAGCCCCC TGTTCATGGG CAAGGTTGTG AACCCCACCC
4321 AGAAATGAGG GCCCAATCAA CCTCTGGATT ACAAAATTTG TGAAAGATTG ACTGGTATTC
4381 TTAACTATGT TGCTCCTTTT ACGCTATGTG GATACGCTGC TTTAATGCCT TTGTATCATG
4441 CTATTGCTTC CCGTATGGCT TTCATTTTCT CCTCCTTGTA TAAATCCTGG TTGCTGTCTC
4501 TTTATGAGGA GTTGTGGCCC GTTGTCAGGC AACGTGGCGT GGTGTGCACT GTGTTTGCTG
4561 ACGCAACCCC CACTGGTTGG GGCATTGCCA CCACCTGTCA GCTCCTTTCC GGGACTTTCG
4621 CTTTCCCCCT CCCTATTGCC ACGGCGGAAC TCATCGCCGC CTGCCTTGCC CGCTGCTGGA
4681 CAGGGGCTCG GCTGTTGGGC ACTGACAATT CCGTGGTGTT GTCGGGGAAA TCATCGTCCT
4741 TTCCTTGGCT GCTCGCCTGT GTTGCCACCT GGATTCTGCG CGGGACGTCC TTCTGCTACG
4801 TCCCTTCGGC CCTCAATCCA GCGGACCTTC CTTCCCGCGG CCTGCTGCCG GCTCTGCGGC
4861 CTCTTCCGCG TCTTCGCCTT CGCCCTCAGA CGAGTCGGAT CTCCCTTTGG GCCGCCTCCC
4921 CGCAAGCTTC GCACTTTTTA AAAGAAAAGG GAGGACTGGA TGGGATTTAT TACTCCGATA
4981 GGACGCTGGC TTGTAACTCA GTCTCTTACT AGGAGACCAG CTTGAGCCTG GGTGTTCGCT
5041 GGTTAGCCTA ACCTGGTTGG CCACCAGGGG TAAGGACTCC TTGGCTTAGA AAGCTAATAA
5101 ACTTGCCTGC ATTAGAGCTC TTACGCGTCC CGGGCTCGAG ATCCGCATCT CAATTAGTCA
5161 GCAACCATAG TCCCGCCCCT AACTCCGCCC ATCCCGCCCC TAACTCCGCC CAGTTCCGCC
5221 CATTCTCCGC CCCATGGCTG ACTAATTTTT TTTATTTATG CAGAGGCCGA GGCCGCCTCG
5281 GCCTCTGAGC TATTCCAGAA GTAGTGAGGA GGCTTTTTTG GAGGCCTAGG CTTTTGCAAA
5341 AAGCTAACTT GTTTATTGCA GCTTATAATG GTTACAAATA AAGCAATAGC ATCACAAATT
5401 TCACAAATAA AGCATTTTTT TCACTGCATT CTAGTTGTGG TTTGTCCAAA CTCATCAATG
5461 TATCTTATCA TGTCTGTCCG CTTCCTCGCT CACTGACTCG CTGCGCTCGG TCGTTCGGCT
5521 GCGGCGAGCG GTATCAGCTC ACTCAAAGGC GGTAATACGG TTATCCACAG AATCAGGGGA
5581 TAACGCAGGA AAGAACATGT GAGCAAAAGG CCAGCAAAAG GCCAGGAACC GTAAAAAGGC
5641 CGCGTTGCTG GCGTTTTTCC ATAGGCTCCG CCCCCCTGAC GAGCATCACA AAAATCGACG
5701 CTCAAGTCAG AGGTGGCGAA ACCCGACAGG ACTATAAAGA TACCAGGCGT TTCCCCCTGG
5761 AAGCTCCCTC GTGCGCTCTC CTGTTCCGAC CCTGCCGCTT ACCGGATACC TGTCCGCCTT
5821 TCTCCCTTCG GGAAGCGTGG CGCTTTCTCA TAGCTCACGC TGTAGGTATC TCAGTTCGGT
5881 GTAGGTCGTT CGCTCCAAGC TGGGCTGTGT GCACGAACCC CCCGTTCAGC CCGACCGCTG
5941 CGCCTTATCC GGTAACTATC GTCTTGAGTC CAACCCGGTA AGACACGACT TATCGCCACT
6001 GGCAGCAGCC ACTGGTAACA GGATTAGCAG AGCGAGGTAT GTAGGCGGTG CTACAGAGTT
6061 CTTGAAGTGG TGGCCTAACT ACGGCTACAC TAGAAGAACA GTATTTGGTA TCTGCGCTCT
6121 GCTGAAGCCA GTTACCTTCG GAAAAAGAGT TGGTAGCTCT TGATCCGGCA AACAAACCAC
6181 CGCTGGTAGC GGTGGTTTTT TTGTTTGCAA GCAGCAGATT ACGCGCAGAA AAAAAGGATC
6241 TCAAGAAGAT CCTTTGATCT TTTCTACGGG GTCTGACGCT CAGTGGAACG AAAACTCACG
6301 TTAAGGGATT TTGGTCATGA GATTATCAAA AAGGATCTTC ACCTAGATCC TTTTAAATTA
6361 AAAATGAAGT TTTAAATCAA TCTAAAGTAT ATATGAGTAA ACTTGGTCTG ACAGTTAGAA
6421 AAACTCATCG AGCATCAAAT GAAACTGCAA TTTATTCATA TCAGGATTAT CAATACCATA
6481 TTTTTGAAAA AGCCGTTTCT GTAATGAAGG AGAAAACTCA CCGAGGCAGT TCCATAGGAT
6541 GGCAAGATCC TGGTATCGGT CTGCGATTCC GACTCGTCCA ACATCAATAC AACCTATTAA
6601 TTTCCCCTCG TCAAAAATAA GGTTATCAAG TGAGAAATCA CCATGAGTGA CGACTGAATC
6661 CGGTGAGAAT GGCAACAGCT TATGCATTTC TTTCCAGACT TGTTCAACAG GCCAGCCATT
6721 ACGCTCGTCA TCAAAATCAC TCGCATCAAC CAAACCGTTA TTCATTCGTG ATTGCGCCTG
6781 AGCGAGACGA AATACGCGAT CGCTGTTAAA AGGACAATTA CAAACAGGAA TCGAATGCAA
6841 CCGGCGCAGG AACACTGCCA GCGCATCAAC AATATTTTCA CCTGAATCAG GATATTCTTC
6901 TAATACCTGG AATGCTGTTT TTCCGGGGAT CGCAGTGGTG AGTAACCATG CATCATCAGG
6961 AGTACGGATA AAATGCTTGA TGGTCGGAAG AGGCATAAAT TCCGTCAGCC AGTTTAGTCT
7021 GACCATCTCA TCTGTAACAT CATTGGCAAC GCTACCTTTG CCATGTTTCA GAAACAACTC
7081 TGGCGCATCG GGCTTCCCAT ACAATCGATA GATTGTCGCA CCTGATTGCC CGACATTATC
7141 GCGAGCCCAT TTATACCCAT ATAAATCAGC ATCCATGTTG GAATTTAATC GCGGCCTAGA
7201 GCAAGACGTT TCCCGTTGAA TATGGCTCAT AACACCCCTT GTATTACTGT TTATGTAAGC
7261 AGACAGTTTT ATTGTTCATG ATGATATATT TTTATCTTGT GCAATGTAAC ATCAGAGATT
7321 TTGAGACACA ACAATTGGTC GACGGATCC
SEQ ID NO: 25 F/HN-SIV-CMV-HFVIII-V3 plasmid as defined in Figure 4A (pDNA1 pGM411)
Length: 10812; Molecule type: DNA; Feature location/qualifiers: source, 1..10812; mol_type, other DNA; note, pGM411; organism, synthetic construct
1 GGTACCTCAA TATTGGCCAT TAGCCATATT ATTCATTGGT TATATAGCAT AAATCAATAT
61 TGGCTATTGG CCATTGCATA CGTTGTATCT ATATCATAAT ATGTACATTT ATATTGGCTC
121 ATGTCCAATA TGACCGCCAT GTTGGCATTG ATTATTGACT AGTTATTAAT AGTAATCAAT
181 TACGGGGTCA TTAGTTCATA GCCCATATAT GGAGTTCCGC GTTACATAAC TTACGGTAAA
241 TGGCCCGCCT GGCTGACCGC CCAACGACCC CCGCCCATTG ACGTCAATAA TGACGTATGT
301 TCCCATAGTA ACGCCAATAG GGACTTTCCA TTGACGTCAA TGGGTGGAGT ATTTACGGTA
361 AACTGCCCAC TTGGCAGTAC ATCAAGTGTA TCATATGCCA AGTCCGCCCC CTATTGACGT
421 CAATGACGGT AAATGGCCCG CCTGGCATTA TGCCCAGTAC ATGACCTTAC GGGACTTTCC
481 TACTTGGCAG TACATCTACG TATTAGTCAT CGCTATTACC ATGGTGATGC GGTTTTGGCA
541 GTACACCAAT GGGCGTGGAT AGCGGTTTGA CTCACGGGGA TTTCCAAGTC TCCACCCCAT
601 TGACGTCAAT GGGAGTTTGT TTTGGCACCA AAATCAACGG GACTTTCCAA AATGTCGTAA
661 CAACTGCGAT CGCCCGCCCC GTTGACGCAA ATGGGCGGTA GGCGTGTACG GTGGGAGGTC
721 TATATAAGCA GAGCTCGCTG GCTTGTAACT CAGTCTCTTA CTAGGAGACC AGCTTGAGCC
781 TGGGTGTTCG CTGGTTAGCC TAACCTGGTT GGCCACCAGG GGTAAGGACT CCTTGGCTTA
841 GAAAGCTAAT AAACTTGCCT GCATTAGAGC TTATCTGAGT CAAGTGTCCT CATTGACGCC
901 TCACTCTCTT GAACGGGAAT CTTCCTTACT GGGTTCTCTC TCTGACCCAG GCGAGAGAAA
961 CTCCAGCAGT GGCGCCCGAA CAGGGACTTG AGTGAGAGTG TAGGCACGTA CAGCTGAGAA
1021 GGCGTCGGAC GCGAAGGAAG CGCGGGGTGC GACGCGACCA AGAAGGAGAC TTGGTGAGTA
1081 GGCTTCTCGA GTGCCGGGAA AAAGCTCGAG CCTAGTTAGA GGACTAGGAG AGGCCGTAGC
1141 CGTAACTACT CTGGGCAAGT AGGGCAGGCG GTGGGTACGC AATGGGGGCG GCTACCTCAG
1201 CACTAAATAG GAGACAATTA GACCAATTTG AGAAAATACG ACTTCGCCCG AACGGAAAGA
1261 AAAAGTACCA AATTAAACAT TTAATATGGG CAGGCAAGGA GATGGAGCGC TTCGGCCTCC
1321 ATGAGAGGTT GTTGGAGACA GAGGAGGGGT GTAAAAGAAT CATAGAAGTC CTCTACCCCC
1381 TAGAACCAAC AGGATCGGAG GGCTTAAAAA GTCTGTTCAA TCTTGTGTGC GTGCTATATT
1441 GCTTGCACAA GGAACAGAAA GTGAAAGACA CAGAGGAAGC AGTAGCAACA GTAAGACAAC
1501 ACTGCCATCT AGTGGAAAAA GAAAAAAGTG CAACAGAGAC ATCTAGTGGA CAAAAGAAAA
1561 ATGACAAGGG AATAGCAGCG CCACCTGGTG GCAGTCAGAA TTTTCCAGCG CAACAACAAG
1621 GAAATGCCTG GGTACATGTA CCCTTGTCAC CGCGCACCTT AAATGCGTGG GTAAAAGCAG
1681 TAGAGGAGAA AAAATTTGGA GCAGAAATAG TACCCATGTT TCAAGCCCTA TCGAATTCCC
1741 GTTTGTGCTA GGGTTCTTAG GCTTCTTGGG GGCTGCTGGA ACTGCAATGG GAGCAGCGGC
1801 GACAGCCCTG ACGGTCCAGT CTCAGCATTT GCTTGCTGGG ATACTGCAGC AGCAGAAGAA
1861 TCTGCTGGCG GCTGTGGAGG CTCAACAGCA GATGTTGAAG CTGACCATTT GGGGTGTTAA
1921 AAACCTCAAT GCCCGCGTCA CAGCCCTTGA GAAGTACCTA GAGGATCAGG CACGACTAAA
1981 CTCCTGGGGG TGCGCATGGA AACAAGTATG TCATACCACA GTGGAGTGGC CCTGGACAAA
2041 TCGGACTCCG GATTGGCAAA ATATGACTTG GTTGGAGTGG GAAAGACAAA TAGCTGATTT
2101 GGAAAGCAAC ATTACGAGAC AATTAGTGAA GGCTAGAGAA CAAGAGGAAA AGAATCTAGA
2161 TGCCTATCAG AAGTTAACTA GTTGGTCAGA TTTCTGGTCT TGGTTCGATT TCTCAAAATG
2221 GCTTAACATT TTAAAAATGG GATTTTTAGT AATAGTAGGA ATAATAGGGT TAAGATTACT
2281 TTACACAGTA TATGGATGTA TAGTGAGGGT TAGGCAGGGA TATGTTCCTC TATCTCCACA
2341 GATCCATATC CGCGGCAATT TTAAAAGAAA GGGAGGAATA GGGGGACAGA CTTCAGCAGA
2401 GAGACTAATT AATATAATAA CAACACAATT AGAAATACAA CATTTACAAA CCAAAATTCA
2461 AAAAATTTTA AATTTTAGAG CCGCGGAGAT CTCAATATTG GCCATTAGCC ATATTATTCA
2521 TTGGTTATAT AGCATAAATC AATATTGGCT ATTGGCCATT GCATACGTTG TATCTATATC
2581 ATAATATGTA CATTTATATT GGCTCATGTC CAATATGACC GCCATGTTGG CATTGATTAT
2641 TGACTAGTTA TTAATAGTAA TCAATTACGG GGTCATTAGT TCATAGCCCA TATATGGAGT
2701 TCCGCGTTAC ATAACTTACG GTAAATGGCC CGCCTGGCTG ACCGCCCAAC GACCCCCGCC
2761 CATTGACGTC AATAATGACG TATGTTCCCA TAGTAACGCC AATAGGGACT TTCCATTGAC
2821 GTCAATGGGT GGAGTATTTA CGGTAAACTG CCCACTTGGC AGTACATCAA GTGTATCATA
2881 TGCCAAGTCC GCCCCCTATT GACGTCAATG ACGGTAAATG GCCCGCCTGG CATTATGCCC
2941 AGTACATGAC CTTACGGGAC TTTCCTACTT GGCAGTACAT CTACGTATTA GTCATCGCTA
3001 TTACCATGGT GATGCGGTTT TGGCAGTACA CCAATGGGCG TGGATAGCGG TTTGACTCAC
3061 GGGGATTTCC AAGTCTCCAC CCCATTGACG TCAATGGGAG TTTGTTTTGG CACCAAAATC
3121 AACGGGACTT TCCAAAATGT CGTAATAACC CCGCCCCGTT GACGCAAATG GGCGGTAGGC
3181 GTGTACGGTG GGAGGTCTAT ATAAGCAGAG CTCGTTTAGT GAACCGTCAG ATCACTAGAA
3241 GCTTTATTGC GGTAGTTTAT CACAGTTAAA TTGCTAACGC AGTCAGTGCT TCTGACACAA
3301 CAGTCTCGAA CTTAAGCTGC AGAAGTTGGT CGTGAGGCAC TGGGCAGGCT AGCCACCAAT
3361 GCAGATTGAG CTGAGCACCT GCTTCTTCCT GTGCCTGCTG AGGTTCTGCT TCTCTGCCAC
3421 CAGGAGATAC TACCTGGGGG CTGTGGAGCT GAGCTGGGAC TACATGCAGT CTGACCTGGG
3481 GGAGCTGCCT GTGGATGCCA GGTTCCCCCC CAGAGTGCCC AAGAGCTTCC CCTTCAACAC
3541 CTCTGTGGTG TACAAGAAGA CCCTGTTTGT GGAGTTCACT GACCACCTGT TCAACATTGC
3601 CAAGCCCAGG CCCCCCTGGA TGGGCCTGCT GGGCCCCACC ATCCAGGCTG AGGTGTATGA
3661 CACTGTGGTG ATCACCCTGA AGAACATGGC CAGCCACCCT GTGAGCCTGC ATGCTGTGGG
3721 GGTGAGCTAC TGGAAGGCCT CTGAGGGGGC TGAGTATGAT GACCAGACCA GCCAGAGGGA
3781 GAAGGAGGAT GACAAGGTGT TCCCTGGGGG CAGCCACACC TATGTGTGGC AGGTGCTGAA
3841 GGAGAATGGC CCCATGGCCT CTGACCCCCT GTGCCTGACC TACAGCTACC TGAGCCATGT
3901 GGACCTGGTG AAGGACCTGA ACTCTGGCCT GATTGGGGCC CTGCTGGTGT GCAGGGAGGG
3961 CAGCCTGGCC AAGGAGAAGA CCCAGACCCT GCACAAGTTC ATCCTGCTGT TTGCTGTGTT
4021 TGATGAGGGC AAGAGCTGGC ACTCTGAAAC CAAGAACAGC CTGATGCAGG ACAGGGATGC
4081 TGCCTCTGCC AGGGCCTGGC CCAAGATGCA CACTGTGAAT GGCTATGTGA ACAGGAGCCT
4141 GCCTGGCCTG ATTGGCTGCC ACAGGAAGTC TGTGTACTGG CATGTGATTG GCATGGGCAC
4201 CACCCCTGAG GTGCACAGCA TCTTCCTGGA GGGCCACACC TTCCTGGTCA GGAACCACAG
4261 GCAGGCCAGC CTGGAGATCA GCCCCATCAC CTTCCTGACT GCCCAGACCC TGCTGATGGA
4321 CCTGGGCCAG TTCCTGCTGT TCTGCCACAT CAGCAGCCAC CAGCATGATG GCATGGAGGC
4381 CTATGTGAAG GTGGACAGCT GCCCTGAGGA GCCCCAGCTG AGGATGAAGA ACAATGAGGA
4441 GGCTGAGGAC TATGATGATG ACCTGACTGA CTCTGAGATG GATGTGGTGA GGTTTGATGA
4501 TGACAACAGC CCCAGCTTCA TCCAGATCAG GTCTGTGGCC AAGAAGCACC CCAAGACCTG
4561 GGTGCACTAC ATTGCTGCTG AGGAGGAGGA CTGGGACTAT GCCCCCCTGG TGCTGGCCCC
4621 TGATGACAGG AGCTACAAGA GCCAGTACCT GAACAATGGC CCCCAGAGGA TTGGCAGGAA
4681 GTACAAGAAG GTCAGGTTCA TGGCCTACAC TGATGAAACC TTCAAGACCA GGGAGGCCAT
4741 CCAGCATGAG TCTGGCATCC TGGGCCCCCT GCTGTATGGG GAGGTGGGGG ACACCCTGCT
4801 GATCATCTTC AAGAACCAGG CCAGCAGGCC CTACAACATC TACCCCCATG GCATCACTGA
4861 TGTGAGGCCC CTGTACAGCA GGAGGCTGCC CAAGGGGGTG AAGCACCTGA AGGACTTCCC
4921 CATCCTGCCT GGGGAGATCT TCAAGTACAA GTGGACTGTG ACTGTGGAGG ATGGCCCCAC
4981 CAAGTCTGAC CCCAGGTGCC TGACCAGATA CTACAGCAGC TTTGTGAACA TGGAGAGGGA
5041 CCTGGCCTCT GGCCTGATTG GCCCCCTGCT GATCTGCTAC AAGGAGTCTG TGGACCAGAG
5101 GGGCAACCAG ATCATGTCTG ACAAGAGGAA TGTGATCCTG TTCTCTGTGT TTGATGAGAA
5161 CAGGAGCTGG TACCTGACTG AGAACATCCA GAGGTTCCTG CCCAACCCTG CTGGGGTGCA
5221 GCTGGAGGAC CCTGAGTTCC AGGCCAGCAA CATCATGCAC AGCATCAATG GCTATGTGTT
5281 TGACAGCCTG CAGCTGTCTG TGTGCCTGCA TGAGGTGGCC TACTGGTACA TCCTGAGCAT
5341 TGGGGCCCAG ACTGACTTCC TGTCTGTGTT CTTCTCTGGC TACACCTTCA AGCACAAGAT
5401 GGTGTATGAG GACACCCTGA CCCTGTTCCC CTTCTCTGGG GAGACTGTGT TCATGAGCAT
5461 GGAGAACCCT GGCCTGTGGA TTCTGGGCTG CCACAACTCT GACTTCAGGA ACAGGGGCAT
5521 GACTGCCCTG CTGAAAGTCT CCAGCTGTGA CAAGAACACT GGGGACTACT ATGAGGACAG
5581 CTATGAGGAC ATCTCTGCCT ACCTGCTGAG CAAGAACAAT GCCATTGAGC CCAGGAGCTT
5641 CAGCCAGAAT GCCACTAATG TGTCTAACAA CAGCAACACC AGCAATGACA GCAATGTGTC
5701 TCCCCCAGTG CTGAAGAGGC ACCAGAGGGA GATCACCAGG ACCACCCTGC AGTCTGACCA
5761 GGAGGAGATT GACTATGATG ACACCATCTC TGTGGAGATG AAGAAGGAGG ACTTTGACAT
5821 CTACGACGAG GACGAGAACC AGAGCCCCAG GAGCTTCCAG AAGAAGACCA GGCACTACTT
5881 CATTGCTGCT GTGGAGAGGC TGTGGGACTA TGGCATGAGC AGCAGCCCCC ATGTGCTGAG
5941 GAACAGGGCC CAGTCTGGCT CTGTGCCCCA GTTCAAGAAG GTGGTGTTCC AGGAGTTCAC
6001 TGATGGCAGC TTCACCCAGC CCCTGTACAG AGGGGAGCTG AATGAGCACC TGGGCCTGCT
6061 GGGCCCCTAC ATCAGGGCTG AGGTGGAGGA CAACATCATG GTGACCTTCA GGAACCAGGC
6121 CAGCAGGCCC TACAGCTTCT ACAGCAGCCT GATCAGCTAT GAGGAGGACC AGAGGCAGGG
6181 GGCTGAGCCC AGGAAGAACT TTGTGAAGCC CAATGAAACC AAGACCTACT TCTGGAAGGT
6241 GCAGCACCAC ATGGCCCCCA CCAAGGATGA GTTTGACTGC AAGGCCTGGG CCTACTTCTC
6301 TGATGTGGAC CTGGAGAAGG ATGTGCACTC TGGCCTGATT GGCCCCCTGC TGGTGTGCCA
6361 CACCAACACC CTGAACCCTG CCCATGGCAG GCAGGTGACT GTGCAGGAGT TTGCCCTGTT
6421 CTTCACCATC TTTGATGAAA CCAAGAGCTG GTACTTCACT GAGAACATGG AGAGGAACTG
6481 CAGGGCCCCC TGCAACATCC AGATGGAGGA CCCCACCTTC AAGGAGAACT ACAGGTTCCA
6541 TGCCATCAAT GGCTACATCA TGGACACCCT GCCTGGCCTG GTGATGGCCC AGGACCAGAG
6601 GATCAGGTGG TACCTGCTGA GCATGGGCAG CAATGAGAAC ATCCACAGCA TCCACTTCTC
6661 TGGCCATGTG TTCACTGTGA GGAAGAAGGA GGAGTACAAG ATGGCCCTGT ACAACCTGTA
6721 CCCTGGGGTG TTTGAGACTG TGGAGATGCT GCCCAGCAAG GCTGGCATCT GGAGGGTGGA
6781 GTGCCTGATT GGGGAGCACC TGCATGCTGG CATGAGCACC CTGTTCCTGG TGTACAGCAA
6841 CAAGTGCCAG ACCCCCCTGG GCATGGCCTC TGGCCACATC AGGGACTTCC AGATCACTGC
6901 CTCTGGCCAG TATGGCCAGT GGGCCCCCAA GCTGGCCAGG CTGCACTACT CTGGCAGCAT
6961 CAATGCCTGG AGCACCAAGG AGCCCTTCAG CTGGATCAAG GTGGACCTGC TGGCCCCCAT
7021 GATCATCCAT GGCATCAAGA CCCAGGGGGC CAGGCAGAAG TTCAGCAGCC TGTACATCAG
7081 CCAGTTCATC ATCATGTACA GCCTGGATGG CAAGAAGTGG CAGACCTACA GGGGCAACAG
7141 CACTGGCACC CTGATGGTGT TCTTTGGCAA TGTGGACAGC TCTGGCATCA AGCACAACAT
7201 CTTCAACCCC CCCATCATTG CCAGATACAT CAGGCTGCAC CCCACCCACT ACAGCATCAG
7261 GAGCACCCTG AGGATGGAGC TGATGGGCTG TGACCTGAAC AGCTGCAGCA TGCCCCTGGG
7321 CATGGAGAGC AAGGCCATCT CTGATGCCCA GATCACTGCC AGCAGCTACT TCACCAACAT
7381 GTTTGCCACC TGGAGCCCCA GCAAGGCCAG GCTGCACCTG CAGGGCAGGA GCAATGCCTG
7441 GAGGCCCCAG GTCAACAACC CCAAGGAGTG GCTGCAGGTG GACTTCCAGA AGACCATGAA
7501 GGTGACTGGG GTGACCACCC AGGGGGTGAA GAGCCTGCTG ACCAGCATGT ATGTGAAGGA
7561 GTTCCTGATC AGCAGCAGCC AGGATGGCCA CCAGTGGACC CTGTTCTTCC AGAATGGCAA
7621 GGTGAAGGTG TTCCAGGGCA ACCAGGACAG CTTCACCCCT GTGGTGAACA GCCTGGACCC
7681 CCCCCTGCTG ACCAGATACC TGAGGATTCA CCCCCAGAGC TGGGTGCACC AGATTGCCCT
7741 GAGGATGGAG GTGCTGGGCT GTGAGGCCCA GGACCTGTAC TGAGCGGCCG CGGGCCCAAT
7801 CAACCTCTGG ATTACAAAAT TTGTGAAAGA TTGACTGGTA TTCTTAACTA TGTTGCTCCT
7861 TTTACGCTAT GTGGATACGC TGCTTTAATG CCTTTGTATC ATGCTATTGC TTCCCGTATG
7921 GCTTTCATTT TCTCCTCCTT GTATAAATCC TGGTTGCTGT CTCTTTATGA GGAGTTGTGG
7981 CCCGTTGTCA GGCAACGTGG CGTGGTGTGC ACTGTGTTTG CTGACGCAAC CCCCACTGGT
8041 TGGGGCATTG CCACCACCTG TCAGCTCCTT TCCGGGACTT TCGCTTTCCC CCTCCCTATT
8101 GCCACGGCGG AACTCATCGC CGCCTGCCTT GCCCGCTGCT GGACAGGGGC TCGGCTGTTG
8161 GGCACTGACA ATTCCGTGGT GTTGTCGGGG AAATCATCGT CCTTTCCTTG GCTGCTCGCC
8221 TGTGTTGCCA CCTGGATTCT GCGCGGGACG TCCTTCTGCT ACGTCCCTTC GGCCCTCAAT
8281 CCAGCGGACC TTCCTTCCCG CGGCCTGCTG CCGGCTCTGC GGCCTCTTCC GCGTCTTCGC
8341 CTTCGCCCTC AGACGAGTCG GATCTCCCTT TGGGCCGCCT CCCCGCAAGC TTCGCACTTT
8401 TTAAAAGAAA AGGGAGGACT GGATGGGATT TATTACTCCG ATAGGACGCT GGCTTGTAAC
8461 TCAGTCTCTT ACTAGGAGAC CAGCTTGAGC CTGGGTGTTC GCTGGTTAGC CTAACCTGGT
8521 TGGCCACCAG GGGTAAGGAC TCCTTGGCTT AGAAAGCTAA TAAACTTGCC TGCATTAGAG
8581 CTCTTACGCG TCCCGGGCTC GAGATCCGCA TCTCAATTAG TCAGCAACCA TAGTCCCGCC
8641 CCTAACTCCG CCCATCCCGC CCCTAACTCC GCCCAGTTCC GCCCATTCTC CGCCCCATGG
8701 CTGACTAATT TTTTTTATTT ATGCAGAGGC CGAGGCCGCC TCGGCCTCTG AGCTATTCCA
8761 GAAGTAGTGA GGAGGCTTTT TTGGAGGCCT AGGCTTTTGC AAAAAGCTAA CTTGTTTATT
8821 GCAGCTTATA ATGGTTACAA ATAAAGCAAT AGCATCACAA ATTTCACAAA TAAAGCATTT
8881 TTTTCACTGC ATTCTAGTTG TGGTTTGTCC AAACTCATCA ATGTATCTTA TCATGTCTGT
8941 CCGCTTCCTC GCTCACTGAC TCGCTGCGCT CGGTCGTTCG GCTGCGGCGA GCGGTATCAG
9001 CTCACTCAAA GGCGGTAATA CGGTTATCCA CAGAATCAGG GGATAACGCA GGAAAGAACA
9061 TGTGAGCAAA AGGCCAGCAA AAGGCCAGGA ACCGTAAAAA GGCCGCGTTG CTGGCGTTTT
9121 TCCATAGGCT CCGCCCCCCT GACGAGCATC ACAAAAATCG ACGCTCAAGT CAGAGGTGGC
9181 GAAACCCGAC AGGACTATAA AGATACCAGG CGTTTCCCCC TGGAAGCTCC CTCGTGCGCT
9241 CTCCTGTTCC GACCCTGCCG CTTACCGGAT ACCTGTCCGC CTTTCTCCCT TCGGGAAGCG
9301 TGGCGCTTTC TCATAGCTCA CGCTGTAGGT ATCTCAGTTC GGTGTAGGTC GTTCGCTCCA
9361 AGCTGGGCTG TGTGCACGAA CCCCCCGTTC AGCCCGACCG CTGCGCCTTA TCCGGTAACT
9421 ATCGTCTTGA GTCCAACCCG GTAAGACACG ACTTATCGCC ACTGGCAGCA GCCACTGGTA
9481 ACAGGATTAG CAGAGCGAGG TATGTAGGCG GTGCTACAGA GTTCTTGAAG TGGTGGCCTA
9541 ACTACGGCTA CACTAGAAGA ACAGTATTTG GTATCTGCGC TCTGCTGAAG CCAGTTACCT
9601 TCGGAAAAAG AGTTGGTAGC TCTTGATCCG GCAAACAAAC CACCGCTGGT AGCGGTGGTT
9661 TTTTTGTTTG CAAGCAGCAG ATTACGCGCA GAAAAAAAGG ATCTCAAGAA GATCCTTTGA
9721 TCTTTTCTAC GGGGTCTGAC GCTCAGTGGA ACGAAAACTC ACGTTAAGGG ATTTTGGTCA
9781 TGAGATTATC AAAAAGGATC TTCACCTAGA TCCTTTTAAA TTAAAAATGA AGTTTTAAAT
9841 CAATCTAAAG TATATATGAG TAAACTTGGT CTGACAGTTA GAAAAACTCA TCGAGCATCA
9901 AATGAAACTG CAATTTATTC ATATCAGGAT TATCAATACC ATATTTTTGA AAAAGCCGTT
9961 TCTGTAATGA AGGAGAAAAC TCACCGAGGC AGTTCCATAG GATGGCAAGA TCCTGGTATC
10021 GGTCTGCGAT TCCGACTCGT CCAACATCAA TACAACCTAT TAATTTCCCC TCGTCAAAAA
10081 TAAGGTTATC AAGTGAGAAA TCACCATGAG TGACGACTGA ATCCGGTGAG AATGGCAACA
10141 GCTTATGCAT TTCTTTCCAG ACTTGTTCAA CAGGCCAGCC ATTACGCTCG TCATCAAAAT
10201 CACTCGCATC AACCAAACCG TTATTCATTC GTGATTGCGC CTGAGCGAGA CGAAATACGC
10261 GATCGCTGTT AAAAGGACAA TTACAAACAG GAATCGAATG CAACCGGCGC AGGAACACTG
10321 CCAGCGCATC AACAATATTT TCACCTGAAT CAGGATATTC TTCTAATACC TGGAATGCTG
10381 TTTTTCCGGG GATCGCAGTG GTGAGTAACC ATGCATCATC AGGAGTACGG ATAAAATGCT
10441 TGATGGTCGG AAGAGGCATA AATTCCGTCA GCCAGTTTAG TCTGACCATC TCATCTGTAA
10501 CATCATTGGC AACGCTACCT TTGCCATGTT TCAGAAACAA CTCTGGCGCA TCGGGCTTCC
10561 CATACAATCG ATAGATTGTC GCACCTGATT GCCCGACATT ATCGCGAGCC CATTTATACC
10621 CATATAAATC AGCATCCATG TTGGAATTTA ATCGCGGCCT AGAGCAAGAC GTTTCCCGTT
10681 GAATATGGCT CATAACACCC CTTGTATTAC TGTTTATGTA AGCAGACAGT TTTATTGTTC
10741 ATGATGATAT ATTTTTATCT TGTGCAATGT AACATCAGAG ATTTTGAGAC ACAACAATTG
10801 GTCGACGGAT CC
SEQ ID NO: 26 F/HN-SIV-hCEF-HFVIII-V3 plasmid as defined in Figure 4B (pDNA1 pGM413)
Length: 10519; Molecule type: DNA; Feature location/qualifiers: source, 1..10519; mol_type, other DNA; note, pGM413; organism, synthetic construct
1 GGTACCTCAA TATTGGCCAT TAGCCATATT ATTCATTGGT TATATAGCAT AAATCAATAT
61 TGGCTATTGG CCATTGCATA CGTTGTATCT ATATCATAAT ATGTACATTT ATATTGGCTC
121 ATGTCCAATA TGACCGCCAT GTTGGCATTG ATTATTGACT AGTTATTAAT AGTAATCAAT
181 TACGGGGTCA TTAGTTCATA GCCCATATAT GGAGTTCCGC GTTACATAAC TTACGGTAAA
241 TGGCCCGCCT GGCTGACCGC CCAACGACCC CCGCCCATTG ACGTCAATAA TGACGTATGT
301 TCCCATAGTA ACGCCAATAG GGACTTTCCA TTGACGTCAA TGGGTGGAGT ATTTACGGTA
361 AACTGCCCAC TTGGCAGTAC ATCAAGTGTA TCATATGCCA AGTCCGCCCC CTATTGACGT
421 CAATGACGGT AAATGGCCCG CCTGGCATTA TGCCCAGTAC ATGACCTTAC GGGACTTTCC
481 TACTTGGCAG TACATCTACG TATTAGTCAT CGCTATTACC ATGGTGATGC GGTTTTGGCA
541 GTACACCAAT GGGCGTGGAT AGCGGTTTGA CTCACGGGGA TTTCCAAGTC TCCACCCCAT
601 TGACGTCAAT GGGAGTTTGT TTTGGCACCA AAATCAACGG GACTTTCCAA AATGTCGTAA
661 CAACTGCGAT CGCCCGCCCC GTTGACGCAA ATGGGCGGTA GGCGTGTACG GTGGGAGGTC
721 TATATAAGCA GAGCTCGCTG GCTTGTAACT CAGTCTCTTA CTAGGAGACC AGCTTGAGCC
781 TGGGTGTTCG CTGGTTAGCC TAACCTGGTT GGCCACCAGG GGTAAGGACT CCTTGGCTTA
841 GAAAGCTAAT AAACTTGCCT GCATTAGAGC TTATCTGAGT CAAGTGTCCT CATTGACGCC
901 TCACTCTCTT GAACGGGAAT CTTCCTTACT GGGTTCTCTC TCTGACCCAG GCGAGAGAAA
961 CTCCAGCAGT GGCGCCCGAA CAGGGACTTG AGTGAGAGTG TAGGCACGTA CAGCTGAGAA
1021 GGCGTCGGAC GCGAAGGAAG CGCGGGGTGC GACGCGACCA AGAAGGAGAC TTGGTGAGTA
1081 GGCTTCTCGA GTGCCGGGAA AAAGCTCGAG CCTAGTTAGA GGACTAGGAG AGGCCGTAGC
1141 CGTAACTACT CTGGGCAAGT AGGGCAGGCG GTGGGTACGC AATGGGGGCG GCTACCTCAG
1201 CACTAAATAG GAGACAATTA GACCAATTTG AGAAAATACG ACTTCGCCCG AACGGAAAGA
1261 AAAAGTACCA AATTAAACAT TTAATATGGG CAGGCAAGGA GATGGAGCGC TTCGGCCTCC
1321 ATGAGAGGTT GTTGGAGACA GAGGAGGGGT GTAAAAGAAT CATAGAAGTC CTCTACCCCC
1381 TAGAACCAAC AGGATCGGAG GGCTTAAAAA GTCTGTTCAA TCTTGTGTGC GTGCTATATT
1441 GCTTGCACAA GGAACAGAAA GTGAAAGACA CAGAGGAAGC AGTAGCAACA GTAAGACAAC
1501 ACTGCCATCT AGTGGAAAAA GAAAAAAGTG CAACAGAGAC ATCTAGTGGA CAAAAGAAAA
1561 ATGACAAGGG AATAGCAGCG CCACCTGGTG GCAGTCAGAA TTTTCCAGCG CAACAACAAG
1621 GAAATGCCTG GGTACATGTA CCCTTGTCAC CGCGCACCTT AAATGCGTGG GTAAAAGCAG
1681 TAGAGGAGAA AAAATTTGGA GCAGAAATAG TACCCATGTT TCAAGCCCTA TCGAATTCCC
1741 GTTTGTGCTA GGGTTCTTAG GCTTCTTGGG GGCTGCTGGA ACTGCAATGG GAGCAGCGGC
1801 GACAGCCCTG ACGGTCCAGT CTCAGCATTT GCTTGCTGGG ATACTGCAGC AGCAGAAGAA
1861 TCTGCTGGCG GCTGTGGAGG CTCAACAGCA GATGTTGAAG CTGACCATTT GGGGTGTTAA
1921 AAACCTCAAT GCCCGCGTCA CAGCCCTTGA GAAGTACCTA GAGGATCAGG CACGACTAAA
1981 CTCCTGGGGG TGCGCATGGA AACAAGTATG TCATACCACA GTGGAGTGGC CCTGGACAAA
2041 TCGGACTCCG GATTGGCAAA ATATGACTTG GTTGGAGTGG GAAAGACAAA TAGCTGATTT
2101 GGAAAGCAAC ATTACGAGAC AATTAGTGAA GGCTAGAGAA CAAGAGGAAA AGAATCTAGA
2161 TGCCTATCAG AAGTTAACTA GTTGGTCAGA TTTCTGGTCT TGGTTCGATT TCTCAAAATG
2221 GCTTAACATT TTAAAAATGG GATTTTTAGT AATAGTAGGA ATAATAGGGT TAAGATTACT
2281 TTACACAGTA TATGGATGTA TAGTGAGGGT TAGGCAGGGA TATGTTCCTC TATCTCCACA
2341 GATCCATATC CGCGGCAATT TTAAAAGAAA GGGAGGAATA GGGGGACAGA CTTCAGCAGA
2401 GAGACTAATT AATATAATAA CAACACAATT AGAAATACAA CATTTACAAA CCAAAATTCA
2461 AAAAATTTTA AATTTTAGAG CCGCGGAGAT CTGTTACATA ACTTATGGTA AATGGCCTGC
2521 CTGGCTGACT GCCCAATGAC CCCTGCCCAA TGATGTCAAT AATGATGTAT GTTCCCATGT
2581 AATGCCAATA GGGACTTTCC ATTGATGTCA ATGGGTGGAG TATTTATGGT AACTGCCCAC
2641 TTGGCAGTAC ATCAAGTGTA TCATATGCCA AGTATGCCCC CTATTGATGT CAATGATGGT
2701 AAATGGCCTG CCTGGCATTA TGCCCAGTAC ATGACCTTAT GGGACTTTCC TACTTGGCAG
2761 TACATCTATG TATTAGTCAT TGCTATTACC ATGGGAATTC ACTAGTGGAG AAGAGCATGC
2821 TTGAGGGCTG AGTGCCCCTC AGTGGGCAGA GAGCACATGG CCCACAGTCC CTGAGAAGTT
2881 GGGGGGAGGG GTGGGCAATT GAACTGGTGC CTAGAGAAGG TGGGGCTTGG GTAAACTGGG
2941 AAAGTGATGT GGTGTACTGG CTCCACCTTT TTCCCCAGGG TGGGGGAGAA CCATATATAA
3001 GTGCAGTAGT CTCTGTGAAC ATTCAAGCTT CTGCCTTCTC CCTCCTGTGA GTTTGCTAGC
3061 CACCAATGCA GATTGAGCTG AGCACCTGCT TCTTCCTGTG CCTGCTGAGG TTCTGCTTCT
3121 CTGCCACCAG GAGATACTAC CTGGGGGCTG TGGAGCTGAG CTGGGACTAC ATGCAGTCTG
3181 ACCTGGGGGA GCTGCCTGTG GATGCCAGGT TCCCCCCCAG AGTGCCCAAG AGCTTCCCCT
3241 TCAACACCTC TGTGGTGTAC AAGAAGACCC TGTTTGTGGA GTTCACTGAC CACCTGTTCA
3301 ACATTGCCAA GCCCAGGCCC CCCTGGATGG GCCTGCTGGG CCCCACCATC CAGGCTGAGG
3361 TGTATGACAC TGTGGTGATC ACCCTGAAGA ACATGGCCAG CCACCCTGTG AGCCTGCATG
3421 CTGTGGGGGT GAGCTACTGG AAGGCCTCTG AGGGGGCTGA GTATGATGAC CAGACCAGCC
3481 AGAGGGAGAA GGAGGATGAC AAGGTGTTCC CTGGGGGCAG CCACACCTAT GTGTGGCAGG
3541 TGCTGAAGGA GAATGGCCCC ATGGCCTCTG ACCCCCTGTG CCTGACCTAC AGCTACCTGA
3601 GCCATGTGGA CCTGGTGAAG GACCTGAACT CTGGCCTGAT TGGGGCCCTG CTGGTGTGCA
3661 GGGAGGGCAG CCTGGCCAAG GAGAAGACCC AGACCCTGCA CAAGTTCATC CTGCTGTTTG
3721 CTGTGTTTGA TGAGGGCAAG AGCTGGCACT CTGAAACCAA GAACAGCCTG ATGCAGGACA
3781 GGGATGCTGC CTCTGCCAGG GCCTGGCCCA AGATGCACAC TGTGAATGGC TATGTGAACA
3841 GGAGCCTGCC TGGCCTGATT GGCTGCCACA GGAAGTCTGT GTACTGGCAT GTGATTGGCA
3901 TGGGCACCAC CCCTGAGGTG CACAGCATCT TCCTGGAGGG CCACACCTTC CTGGTCAGGA
3961 ACCACAGGCA GGCCAGCCTG GAGATCAGCC CCATCACCTT CCTGACTGCC CAGACCCTGC
4021 TGATGGACCT GGGCCAGTTC CTGCTGTTCT GCCACATCAG CAGCCACCAG CATGATGGCA
4081 TGGAGGCCTA TGTGAAGGTG GACAGCTGCC CTGAGGAGCC CCAGCTGAGG ATGAAGAACA
4141 ATGAGGAGGC TGAGGACTAT GATGATGACC TGACTGACTC TGAGATGGAT GTGGTGAGGT
4201 TTGATGATGA CAACAGCCCC AGCTTCATCC AGATCAGGTC TGTGGCCAAG AAGCACCCCA
4261 AGACCTGGGT GCACTACATT GCTGCTGAGG AGGAGGACTG GGACTATGCC CCCCTGGTGC
4321 TGGCCCCTGA TGACAGGAGC TACAAGAGCC AGTACCTGAA CAATGGCCCC CAGAGGATTG
4381 GCAGGAAGTA CAAGAAGGTC AGGTTCATGG CCTACACTGA TGAAACCTTC AAGACCAGGG
4441 AGGCCATCCA GCATGAGTCT GGCATCCTGG GCCCCCTGCT GTATGGGGAG GTGGGGGACA
4501 CCCTGCTGAT CATCTTCAAG AACCAGGCCA GCAGGCCCTA CAACATCTAC CCCCATGGCA
4561 TCACTGATGT GAGGCCCCTG TACAGCAGGA GGCTGCCCAA GGGGGTGAAG CACCTGAAGG
4621 ACTTCCCCAT CCTGCCTGGG GAGATCTTCA AGTACAAGTG GACTGTGACT GTGGAGGATG
4681 GCCCCACCAA GTCTGACCCC AGGTGCCTGA CCAGATACTA CAGCAGCTTT GTGAACATGG
4741 AGAGGGACCT GGCCTCTGGC CTGATTGGCC CCCTGCTGAT CTGCTACAAG GAGTCTGTGG
4801 ACCAGAGGGG CAACCAGATC ATGTCTGACA AGAGGAATGT GATCCTGTTC TCTGTGTTTG
4861 ATGAGAACAG GAGCTGGTAC CTGACTGAGA ACATCCAGAG GTTCCTGCCC AACCCTGCTG
4921 GGGTGCAGCT GGAGGACCCT GAGTTCCAGG CCAGCAACAT CATGCACAGC ATCAATGGCT
4981 ATGTGTTTGA CAGCCTGCAG CTGTCTGTGT GCCTGCATGA GGTGGCCTAC TGGTACATCC
5041 TGAGCATTGG GGCCCAGACT GACTTCCTGT CTGTGTTCTT CTCTGGCTAC ACCTTCAAGC
5101 ACAAGATGGT GTATGAGGAC ACCCTGACCC TGTTCCCCTT CTCTGGGGAG ACTGTGTTCA
5161 TGAGCATGGA GAACCCTGGC CTGTGGATTC TGGGCTGCCA CAACTCTGAC TTCAGGAACA
5221 GGGGCATGAC TGCCCTGCTG AAAGTCTCCA GCTGTGACAA GAACACTGGG GACTACTATG
5281 AGGACAGCTA TGAGGACATC TCTGCCTACC TGCTGAGCAA GAACAATGCC ATTGAGCCCA
5341 GGAGCTTCAG CCAGAATGCC ACTAATGTGT CTAACAACAG CAACACCAGC AATGACAGCA
5401 ATGTGTCTCC CCCAGTGCTG AAGAGGCACC AGAGGGAGAT CACCAGGACC ACCCTGCAGT
5461 CTGACCAGGA GGAGATTGAC TATGATGACA CCATCTCTGT GGAGATGAAG AAGGAGGACT
5521 TTGACATCTA CGACGAGGAC GAGAACCAGA GCCCCAGGAG CTTCCAGAAG AAGACCAGGC
5581 ACTACTTCAT TGCTGCTGTG GAGAGGCTGT GGGACTATGG CATGAGCAGC AGCCCCCATG
5641 TGCTGAGGAA CAGGGCCCAG TCTGGCTCTG TGCCCCAGTT CAAGAAGGTG GTGTTCCAGG
5701 AGTTCACTGA TGGCAGCTTC ACCCAGCCCC TGTACAGAGG GGAGCTGAAT GAGCACCTGG
5761 GCCTGCTGGG CCCCTACATC AGGGCTGAGG TGGAGGACAA CATCATGGTG ACCTTCAGGA
5821 ACCAGGCCAG CAGGCCCTAC AGCTTCTACA GCAGCCTGAT CAGCTATGAG GAGGACCAGA
5881 GGCAGGGGGC TGAGCCCAGG AAGAACTTTG TGAAGCCCAA TGAAACCAAG ACCTACTTCT
5941 GGAAGGTGCA GCACCACATG GCCCCCACCA AGGATGAGTT TGACTGCAAG GCCTGGGCCT
6001 ACTTCTCTGA TGTGGACCTG GAGAAGGATG TGCACTCTGG CCTGATTGGC CCCCTGCTGG
6061 TGTGCCACAC CAACACCCTG AACCCTGCCC ATGGCAGGCA GGTGACTGTG CAGGAGTTTG
6121 CCCTGTTCTT CACCATCTTT GATGAAACCA AGAGCTGGTA CTTCACTGAG AACATGGAGA
6181 GGAACTGCAG GGCCCCCTGC AACATCCAGA TGGAGGACCC CACCTTCAAG GAGAACTACA
6241 GGTTCCATGC CATCAATGGC TACATCATGG ACACCCTGCC TGGCCTGGTG ATGGCCCAGG
6301 ACCAGAGGAT CAGGTGGTAC CTGCTGAGCA TGGGCAGCAA TGAGAACATC CACAGCATCC
6361 ACTTCTCTGG CCATGTGTTC ACTGTGAGGA AGAAGGAGGA GTACAAGATG GCCCTGTACA
6421 ACCTGTACCC TGGGGTGTTT GAGACTGTGG AGATGCTGCC CAGCAAGGCT GGCATCTGGA
6481 GGGTGGAGTG CCTGATTGGG GAGCACCTGC ATGCTGGCAT GAGCACCCTG TTCCTGGTGT
6541 ACAGCAACAA GTGCCAGACC CCCCTGGGCA TGGCCTCTGG CCACATCAGG GACTTCCAGA
6601 TCACTGCCTC TGGCCAGTAT GGCCAGTGGG CCCCCAAGCT GGCCAGGCTG CACTACTCTG
6661 GCAGCATCAA TGCCTGGAGC ACCAAGGAGC CCTTCAGCTG GATCAAGGTG GACCTGCTGG
6721 CCCCCATGAT CATCCATGGC ATCAAGACCC AGGGGGCCAG GCAGAAGTTC AGCAGCCTGT
6781 ACATCAGCCA GTTCATCATC ATGTACAGCC TGGATGGCAA GAAGTGGCAG ACCTACAGGG
6841 GCAACAGCAC TGGCACCCTG ATGGTGTTCT TTGGCAATGT GGACAGCTCT GGCATCAAGC
6901 ACAACATCTT CAACCCCCCC ATCATTGCCA GATACATCAG GCTGCACCCC ACCCACTACA
6961 GCATCAGGAG CACCCTGAGG ATGGAGCTGA TGGGCTGTGA CCTGAACAGC TGCAGCATGC
7021 CCCTGGGCAT GGAGAGCAAG GCCATCTCTG ATGCCCAGAT CACTGCCAGC AGCTACTTCA
7081 CCAACATGTT TGCCACCTGG AGCCCCAGCA AGGCCAGGCT GCACCTGCAG GGCAGGAGCA
7141 ATGCCTGGAG GCCCCAGGTC AACAACCCCA AGGAGTGGCT GCAGGTGGAC TTCCAGAAGA
7201 CCATGAAGGT GACTGGGGTG ACCACCCAGG GGGTGAAGAG CCTGCTGACC AGCATGTATG
7261 TGAAGGAGTT CCTGATCAGC AGCAGCCAGG ATGGCCACCA GTGGACCCTG TTCTTCCAGA
7321 ATGGCAAGGT GAAGGTGTTC CAGGGCAACC AGGACAGCTT CACCCCTGTG GTGAACAGCC
7381 TGGACCCCCC CCTGCTGACC AGATACCTGA GGATTCACCC CCAGAGCTGG GTGCACCAGA
7441 TTGCCCTGAG GATGGAGGTG CTGGGCTGTG AGGCCCAGGA CCTGTACTGA GCGGCCGCGG
7501 GCCCAATCAA CCTCTGGATT ACAAAATTTG TGAAAGATTG ACTGGTATTC TTAACTATGT
7561 TGCTCCTTTT ACGCTATGTG GATACGCTGC TTTAATGCCT TTGTATCATG CTATTGCTTC
7621 CCGTATGGCT TTCATTTTCT CCTCCTTGTA TAAATCCTGG TTGCTGTCTC TTTATGAGGA
7681 GTTGTGGCCC GTTGTCAGGC AACGTGGCGT GGTGTGCACT GTGTTTGCTG ACGCAACCCC
7741 CACTGGTTGG GGCATTGCCA CCACCTGTCA GCTCCTTTCC GGGACTTTCG CTTTCCCCCT
7801 CCCTATTGCC ACGGCGGAAC TCATCGCCGC CTGCCTTGCC CGCTGCTGGA CAGGGGCTCG
7861 GCTGTTGGGC ACTGACAATT CCGTGGTGTT GTCGGGGAAA TCATCGTCCT TTCCTTGGCT
7921 GCTCGCCTGT GTTGCCACCT GGATTCTGCG CGGGACGTCC TTCTGCTACG TCCCTTCGGC
7981 CCTCAATCCA GCGGACCTTC CTTCCCGCGG CCTGCTGCCG GCTCTGCGGC CTCTTCCGCG
8041 TCTTCGCCTT CGCCCTCAGA CGAGTCGGAT CTCCCTTTGG GCCGCCTCCC CGCAAGCTTC
8101 GCACTTTTTA AAAGAAAAGG GAGGACTGGA TGGGATTTAT TACTCCGATA GGACGCTGGC
8161 TTGTAACTCA GTCTCTTACT AGGAGACCAG CTTGAGCCTG GGTGTTCGCT GGTTAGCCTA
8221 ACCTGGTTGG CCACCAGGGG TAAGGACTCC TTGGCTTAGA AAGCTAATAA ACTTGCCTGC
8281 ATTAGAGCTC TTACGCGTCC CGGGCTCGAG ATCCGCATCT CAATTAGTCA GCAACCATAG
8341 TCCCGCCCCT AACTCCGCCC ATCCCGCCCC TAACTCCGCC CAGTTCCGCC CATTCTCCGC
8401 CCCATGGCTG ACTAATTTTT TTTATTTATG CAGAGGCCGA GGCCGCCTCG GCCTCTGAGC
8461 TATTCCAGAA GTAGTGAGGA GGCTTTTTTG GAGGCCTAGG CTTTTGCAAA AAGCTAACTT
8521 GTTTATTGCA GCTTATAATG GTTACAAATA AAGCAATAGC ATCACAAATT TCACAAATAA
8581 AGCATTTTTT TCACTGCATT CTAGTTGTGG TTTGTCCAAA CTCATCAATG TATCTTATCA
8641 TGTCTGTCCG CTTCCTCGCT CACTGACTCG CTGCGCTCGG TCGTTCGGCT GCGGCGAGCG
8701 GTATCAGCTC ACTCAAAGGC GGTAATACGG TTATCCACAG AATCAGGGGA TAACGCAGGA
8761 AAGAACATGT GAGCAAAAGG CCAGCAAAAG GCCAGGAACC GTAAAAAGGC CGCGTTGCTG
8821 GCGTTTTTCC ATAGGCTCCG CCCCCCTGAC GAGCATCACA AAAATCGACG CTCAAGTCAG
8881 AGGTGGCGAA ACCCGACAGG ACTATAAAGA TACCAGGCGT TTCCCCCTGG AAGCTCCCTC
8941 GTGCGCTCTC CTGTTCCGAC CCTGCCGCTT ACCGGATACC TGTCCGCCTT TCTCCCTTCG
9001 GGAAGCGTGG CGCTTTCTCA TAGCTCACGC TGTAGGTATC TCAGTTCGGT GTAGGTCGTT
9061 CGCTCCAAGC TGGGCTGTGT GCACGAACCC CCCGTTCAGC CCGACCGCTG CGCCTTATCC
9121 GGTAACTATC GTCTTGAGTC CAACCCGGTA AGACACGACT TATCGCCACT GGCAGCAGCC
9181 ACTGGTAACA GGATTAGCAG AGCGAGGTAT GTAGGCGGTG CTACAGAGTT CTTGAAGTGG
9241 TGGCCTAACT ACGGCTACAC TAGAAGAACA GTATTTGGTA TCTGCGCTCT GCTGAAGCCA
9301 GTTACCTTCG GAAAAAGAGT TGGTAGCTCT TGATCCGGCA AACAAACCAC CGCTGGTAGC
9361 GGTGGTTTTT TTGTTTGCAA GCAGCAGATT ACGCGCAGAA AAAAAGGATC TCAAGAAGAT
9421 CCTTTGATCT TTTCTACGGG GTCTGACGCT CAGTGGAACG AAAACTCACG TTAAGGGATT
9481 TTGGTCATGA GATTATCAAA AAGGATCTTC ACCTAGATCC TTTTAAATTA AAAATGAAGT
9541 TTTAAATCAA TCTAAAGTAT ATATGAGTAA ACTTGGTCTG ACAGTTAGAA AAACTCATCG
9601 AGCATCAAAT GAAACTGCAA TTTATTCATA TCAGGATTAT CAATACCATA TTTTTGAAAA
9661 AGCCGTTTCT GTAATGAAGG AGAAAACTCA CCGAGGCAGT TCCATAGGAT GGCAAGATCC
9721 TGGTATCGGT CTGCGATTCC GACTCGTCCA ACATCAATAC AACCTATTAA TTTCCCCTCG
9781 TCAAAAATAA GGTTATCAAG TGAGAAATCA CCATGAGTGA CGACTGAATC CGGTGAGAAT
9841 GGCAACAGCT TATGCATTTC TTTCCAGACT TGTTCAACAG GCCAGCCATT ACGCTCGTCA
9901 TCAAAATCAC TCGCATCAAC CAAACCGTTA TTCATTCGTG ATTGCGCCTG AGCGAGACGA
9961 AATACGCGAT CGCTGTTAAA AGGACAATTA CAAACAGGAA TCGAATGCAA CCGGCGCAGG
10021 AACACTGCCA GCGCATCAAC AATATTTTCA CCTGAATCAG GATATTCTTC TAATACCTGG
10081 AATGCTGTTT TTCCGGGGAT CGCAGTGGTG AGTAACCATG CATCATCAGG AGTACGGATA
10141 AAATGCTTGA TGGTCGGAAG AGGCATAAAT TCCGTCAGCC AGTTTAGTCT GACCATCTCA
10201 TCTGTAACAT CATTGGCAAC GCTACCTTTG CCATGTTTCA GAAACAACTC TGGCGCATCG
10261 GGCTTCCCAT ACAATCGATA GATTGTCGCA CCTGATTGCC CGACATTATC GCGAGCCCAT
10321 TTATACCCAT ATAAATCAGC ATCCATGTTG GAATTTAATC GCGGCCTAGA GCAAGACGTT
10381 TCCCGTTGAA TATGGCTCAT AACACCCCTT GTATTACTGT TTATGTAAGC AGACAGTTTT
10441 ATTGTTCATG ATGATATATT TTTATCTTGT GCAATGTAAC ATCAGAGATT TTGAGACACA
10501 ACAATTGGTC GACGGATCC
SEQ ID NO: 27 F/HN-SIV-CMV-HFVIII-N6-co plasmid as defined in FIG. 4c (pDNA1 pGM412)
Length: 11400; Molecule type: DNA; Feature location/qualifiers: source, 1..11400; mol_type, other DNA; note, pGM412; organism, synthetic construct
1 GGTACCTCAA TATTGGCCAT TAGCCATATT ATTCATTGGT TATATAGCAT AAATCAATAT
61 TGGCTATTGG CCATTGCATA CGTTGTATCT ATATCATAAT ATGTACATTT ATATTGGCTC
121 ATGTCCAATA TGACCGCCAT GTTGGCATTG ATTATTGACT AGTTATTAAT AGTAATCAAT
181 TACGGGGTCA TTAGTTCATA GCCCATATAT GGAGTTCCGC GTTACATAAC TTACGGTAAA
241 TGGCCCGCCT GGCTGACCGC CCAACGACCC CCGCCCATTG ACGTCAATAA TGACGTATGT
301 TCCCATAGTA ACGCCAATAG GGACTTTCCA TTGACGTCAA TGGGTGGAGT ATTTACGGTA
361 AACTGCCCAC TTGGCAGTAC ATCAAGTGTA TCATATGCCA AGTCCGCCCC CTATTGACGT
421 CAATGACGGT AAATGGCCCG CCTGGCATTA TGCCCAGTAC ATGACCTTAC GGGACTTTCC
481 TACTTGGCAG TACATCTACG TATTAGTCAT CGCTATTACC ATGGTGATGC GGTTTTGGCA
541 GTACACCAAT GGGCGTGGAT AGCGGTTTGA CTCACGGGGA TTTCCAAGTC TCCACCCCAT
601 TGACGTCAAT GGGAGTTTGT TTTGGCACCA AAATCAACGG GACTTTCCAA AATGTCGTAA
661 CAACTGCGAT CGCCCGCCCC GTTGACGCAA ATGGGCGGTA GGCGTGTACG GTGGGAGGTC
721 TATATAAGCA GAGCTCGCTG GCTTGTAACT CAGTCTCTTA CTAGGAGACC AGCTTGAGCC
781 TGGGTGTTCG CTGGTTAGCC TAACCTGGTT GGCCACCAGG GGTAAGGACT CCTTGGCTTA
841 GAAAGCTAAT AAACTTGCCT GCATTAGAGC TTATCTGAGT CAAGTGTCCT CATTGACGCC
901 TCACTCTCTT GAACGGGAAT CTTCCTTACT GGGTTCTCTC TCTGACCCAG GCGAGAGAAA
961 CTCCAGCAGT GGCGCCCGAA CAGGGACTTG AGTGAGAGTG TAGGCACGTA CAGCTGAGAA
1021 GGCGTCGGAC GCGAAGGAAG CGCGGGGTGC GACGCGACCA AGAAGGAGAC TTGGTGAGTA
1081 GGCTTCTCGA GTGCCGGGAA AAAGCTCGAG CCTAGTTAGA GGACTAGGAG AGGCCGTAGC
1141 CGTAACTACT CTGGGCAAGT AGGGCAGGCG GTGGGTACGC AATGGGGGCG GCTACCTCAG
1201 CACTAAATAG GAGACAATTA GACCAATTTG AGAAAATACG ACTTCGCCCG AACGGAAAGA
1261 AAAAGTACCA AATTAAACAT TTAATATGGG CAGGCAAGGA GATGGAGCGC TTCGGCCTCC
1321 ATGAGAGGTT GTTGGAGACA GAGGAGGGGT GTAAAAGAAT CATAGAAGTC CTCTACCCCC
1381 TAGAACCAAC AGGATCGGAG GGCTTAAAAA GTCTGTTCAA TCTTGTGTGC GTGCTATATT
1441 GCTTGCACAA GGAACAGAAA GTGAAAGACA CAGAGGAAGC AGTAGCAACA GTAAGACAAC
1501 ACTGCCATCT AGTGGAAAAA GAAAAAAGTG CAACAGAGAC ATCTAGTGGA CAAAAGAAAA
1561 ATGACAAGGG AATAGCAGCG CCACCTGGTG GCAGTCAGAA TTTTCCAGCG CAACAACAAG
1621 GAAATGCCTG GGTACATGTA CCCTTGTCAC CGCGCACCTT AAATGCGTGG GTAAAAGCAG
1681 TAGAGGAGAA AAAATTTGGA GCAGAAATAG TACCCATGTT TCAAGCCCTA TCGAATTCCC
1741 GTTTGTGCTA GGGTTCTTAG GCTTCTTGGG GGCTGCTGGA ACTGCAATGG GAGCAGCGGC
1801 GACAGCCCTG ACGGTCCAGT CTCAGCATTT GCTTGCTGGG ATACTGCAGC AGCAGAAGAA
1861 TCTGCTGGCG GCTGTGGAGG CTCAACAGCA GATGTTGAAG CTGACCATTT GGGGTGTTAA
1921 AAACCTCAAT GCCCGCGTCA CAGCCCTTGA GAAGTACCTA GAGGATCAGG CACGACTAAA
1981 CTCCTGGGGG TGCGCATGGA AACAAGTATG TCATACCACA GTGGAGTGGC CCTGGACAAA
2041 TCGGACTCCG GATTGGCAAA ATATGACTTG GTTGGAGTGG GAAAGACAAA TAGCTGATTT
2101 GGAAAGCAAC ATTACGAGAC AATTAGTGAA GGCTAGAGAA CAAGAGGAAA AGAATCTAGA
2161 TGCCTATCAG AAGTTAACTA GTTGGTCAGA TTTCTGGTCT TGGTTCGATT TCTCAAAATG
2221 GCTTAACATT TTAAAAATGG GATTTTTAGT AATAGTAGGA ATAATAGGGT TAAGATTACT
2281 TTACACAGTA TATGGATGTA TAGTGAGGGT TAGGCAGGGA TATGTTCCTC TATCTCCACA
2341 GATCCATATC CGCGGCAATT TTAAAAGAAA GGGAGGAATA GGGGGACAGA CTTCAGCAGA
2401 GAGACTAATT AATATAATAA CAACACAATT AGAAATACAA CATTTACAAA CCAAAATTCA
2461 AAAAATTTTA AATTTTAGAG CCGCGGAGAT CTCAATATTG GCCATTAGCC ATATTATTCA
2521 TTGGTTATAT AGCATAAATC AATATTGGCT ATTGGCCATT GCATACGTTG TATCTATATC
2581 ATAATATGTA CATTTATATT GGCTCATGTC CAATATGACC GCCATGTTGG CATTGATTAT
2641 TGACTAGTTA TTAATAGTAA TCAATTACGG GGTCATTAGT TCATAGCCCA TATATGGAGT
2701 TCCGCGTTAC ATAACTTACG GTAAATGGCC CGCCTGGCTG ACCGCCCAAC GACCCCCGCC
2761 CATTGACGTC AATAATGACG TATGTTCCCA TAGTAACGCC AATAGGGACT TTCCATTGAC
2821 GTCAATGGGT GGAGTATTTA CGGTAAACTG CCCACTTGGC AGTACATCAA GTGTATCATA
2881 TGCCAAGTCC GCCCCCTATT GACGTCAATG ACGGTAAATG GCCCGCCTGG CATTATGCCC
2941 AGTACATGAC CTTACGGGAC TTTCCTACTT GGCAGTACAT CTACGTATTA GTCATCGCTA
3001 TTACCATGGT GATGCGGTTT TGGCAGTACA CCAATGGGCG TGGATAGCGG TTTGACTCAC
3061 GGGGATTTCC AAGTCTCCAC CCCATTGACG TCAATGGGAG TTTGTTTTGG CACCAAAATC
3121 AACGGGACTT TCCAAAATGT CGTAATAACC CCGCCCCGTT GACGCAAATG GGCGGTAGGC
3181 GTGTACGGTG GGAGGTCTAT ATAAGCAGAG CTCGTTTAGT GAACCGTCAG ATCACTAGAA
3241 GCTTTATTGC GGTAGTTTAT CACAGTTAAA TTGCTAACGC AGTCAGTGCT TCTGACACAA
3301 CAGTCTCGAA CTTAAGCTGC AGAAGTTGGT CGTGAGGCAC TGGGCAGGCT AGCCACCAAT
3361 GCAGATTGAG CTGAGCACCT GCTTCTTCCT GTGCCTGCTG AGGTTCTGCT TCTCTGCCAC
3421 CAGGAGATAC TACCTGGGGG CTGTGGAGCT GAGCTGGGAC TACATGCAGT CTGACCTGGG
3481 GGAGCTGCCT GTGGATGCCA GGTTCCCCCC CAGAGTGCCC AAGAGCTTCC CCTTCAACAC
3541 CTCTGTGGTG TACAAGAAGA CCCTGTTTGT GGAGTTCACT GACCACCTGT TCAACATTGC
3601 CAAGCCCAGG CCCCCCTGGA TGGGCCTGCT GGGCCCCACC ATCCAGGCTG AGGTGTATGA
3661 CACTGTGGTG ATCACCCTGA AGAACATGGC CAGCCACCCT GTGAGCCTGC ATGCTGTGGG
3721 GGTGAGCTAC TGGAAGGCCT CTGAGGGGGC TGAGTATGAT GACCAGACCA GCCAGAGGGA
3781 GAAGGAGGAT GACAAGGTGT TCCCTGGGGG CAGCCACACC TATGTGTGGC AGGTGCTGAA
3841 GGAGAATGGC CCCATGGCCT CTGACCCCCT GTGCCTGACC TACAGCTACC TGAGCCATGT
3901 GGACCTGGTG AAGGACCTGA ACTCTGGCCT GATTGGGGCC CTGCTGGTGT GCAGGGAGGG
3961 CAGCCTGGCC AAGGAGAAGA CCCAGACCCT GCACAAGTTC ATCCTGCTGT TTGCTGTGTT
4021 TGATGAGGGC AAGAGCTGGC ACTCTGAAAC CAAGAACAGC CTGATGCAGG ACAGGGATGC
4081 TGCCTCTGCC AGGGCCTGGC CCAAGATGCA CACTGTGAAT GGCTATGTGA ACAGGAGCCT
4141 GCCTGGCCTG ATTGGCTGCC ACAGGAAGTC TGTGTACTGG CATGTGATTG GCATGGGCAC
4201 CACCCCTGAG GTGCACAGCA TCTTCCTGGA GGGCCACACC TTCCTGGTCA GGAACCACAG
4261 GCAGGCCAGC CTGGAGATCA GCCCCATCAC CTTCCTGACT GCCCAGACCC TGCTGATGGA
4321 CCTGGGCCAG TTCCTGCTGT TCTGCCACAT CAGCAGCCAC CAGCATGATG GCATGGAGGC
4381 CTATGTGAAG GTGGACAGCT GCCCTGAGGA GCCCCAGCTG AGGATGAAGA ACAATGAGGA
4441 GGCTGAGGAC TATGATGATG ACCTGACTGA CTCTGAGATG GATGTGGTGA GGTTTGATGA
4501 TGACAACAGC CCCAGCTTCA TCCAGATCAG GTCTGTGGCC AAGAAGCACC CCAAGACCTG
4561 GGTGCACTAC ATTGCTGCTG AGGAGGAGGA CTGGGACTAT GCCCCCCTGG TGCTGGCCCC
4621 TGATGACAGG AGCTACAAGA GCCAGTACCT GAACAATGGC CCCCAGAGGA TTGGCAGGAA
4681 GTACAAGAAG GTCAGGTTCA TGGCCTACAC TGATGAAACC TTCAAGACCA GGGAGGCCAT
4741 CCAGCATGAG TCTGGCATCC TGGGCCCCCT GCTGTATGGG GAGGTGGGGG ACACCCTGCT
4801 GATCATCTTC AAGAACCAGG CCAGCAGGCC CTACAACATC TACCCCCATG GCATCACTGA
4861 TGTGAGGCCC CTGTACAGCA GGAGGCTGCC CAAGGGGGTG AAGCACCTGA AGGACTTCCC
4921 CATCCTGCCT GGGGAGATCT TCAAGTACAA GTGGACTGTG ACTGTGGAGG ATGGCCCCAC
4981 CAAGTCTGAC CCCAGGTGCC TGACCAGATA CTACAGCAGC TTTGTGAACA TGGAGAGGGA
5041 CCTGGCCTCT GGCCTGATTG GCCCCCTGCT GATCTGCTAC AAGGAGTCTG TGGACCAGAG
5101 GGGCAACCAG ATCATGTCTG ACAAGAGGAA TGTGATCCTG TTCTCTGTGT TTGATGAGAA
5161 CAGGAGCTGG TACCTGACTG AGAACATCCA GAGGTTCCTG CCCAACCCTG CTGGGGTGCA
5221 GCTGGAGGAC CCTGAGTTCC AGGCCAGCAA CATCATGCAC AGCATCAATG GCTATGTGTT
5281 TGACAGCCTG CAGCTGTCTG TGTGCCTGCA TGAGGTGGCC TACTGGTACA TCCTGAGCAT
5341 TGGGGCCCAG ACTGACTTCC TGTCTGTGTT CTTCTCTGGC TACACCTTCA AGCACAAGAT
5401 GGTGTATGAG GACACCCTGA CCCTGTTCCC CTTCTCTGGG GAGACTGTGT TCATGAGCAT
5461 GGAGAACCCT GGCCTGTGGA TTCTGGGCTG CCACAACTCT GACTTCAGGA ACAGGGGCAT
5521 GACTGCCCTG CTGAAAGTCT CCAGCTGTGA CAAGAACACT GGGGACTACT ATGAGGACAG
5581 CTATGAGGAC ATCTCTGCCT ACCTGCTGAG CAAGAACAAT GCCATTGAGC CCAGGAGCTT
5641 CAGCCAGAAC AGCAGGCACC CCAGCACCAG GCAGAAGCAG TTCAATGCCA CCACCATCCC
5701 TGAGAATGAC ATAGAGAAGA CAGACCCATG GTTTGCCCAC CGGACCCCCA TGCCCAAGAT
5761 CCAGAATGTG AGCAGCTCTG ACCTGCTGAT GCTGCTGAGG CAGAGCCCCA CCCCCCATGG
5821 CCTGAGCCTG TCTGACCTGC AGGAGGCCAA GTATGAAACC TTCTCTGATG ACCCCAGCCC
5881 TGGGGCCATT GACAGCAACA ACAGCCTGTC TGAGATGACC CACTTCAGGC CCCAGCTGCA
5941 CCACTCTGGG GACATGGTGT TCACCCCTGA GTCTGGCCTG CAGCTGAGGC TGAATGAGAA
6001 GCTGGGCACC ACTGCTGCCA CTGAGCTGAA GAAGCTGGAC TTCAAAGTCT CCAGCACCAG
6061 CAACAACCTG ATCAGCACCA TCCCCTCTGA CAACCTGGCT GCTGGCACTG ACAACACCAG
6121 CAGCCTGGGC CCCCCCAGCA TGCCTGTGCA CTATGACAGC CAGCTGGACA CCACCCTGTT
6181 TGGCAAGAAG AGCAGCCCCC TGACTGAGTC TGGGGGCCCC CTGAGCCTGT CTGAGGAGAA
6241 CAATGACAGC AAGCTGCTGG AGTCTGGCCT GATGAACAGC CAGGAGAGCA GCTGGGGCAA
6301 GAATGTGAGC AGCAGGGAGA TCACCAGGAC CACCCTGCAG TCTGACCAGG AGGAGATTGA
6361 CTATGATGAC ACCATCTCTG TGGAGATGAA GAAGGAGGAC TTTGACATCT ACGACGAGGA
6421 CGAGAACCAG AGCCCCAGGA GCTTCCAGAA GAAGACCAGG CACTACTTCA TTGCTGCTGT
6481 GGAGAGGCTG TGGGACTATG GCATGAGCAG CAGCCCCCAT GTGCTGAGGA ACAGGGCCCA
6541 GTCTGGCTCT GTGCCCCAGT TCAAGAAGGT GGTGTTCCAG GAGTTCACTG ATGGCAGCTT
6601 CACCCAGCCC CTGTACAGAG GGGAGCTGAA TGAGCACCTG GGCCTGCTGG GCCCCTACAT
6661 CAGGGCTGAG GTGGAGGACA ACATCATGGT GACCTTCAGG AACCAGGCCA GCAGGCCCTA
6721 CAGCTTCTAC AGCAGCCTGA TCAGCTATGA GGAGGACCAG AGGCAGGGGG CTGAGCCCAG
6781 GAAGAACTTT GTGAAGCCCA ATGAAACCAA GACCTACTTC TGGAAGGTGC AGCACCACAT
6841 GGCCCCCACC AAGGATGAGT TTGACTGCAA GGCCTGGGCC TACTTCTCTG ATGTGGACCT
6901 GGAGAAGGAT GTGCACTCTG GCCTGATTGG CCCCCTGCTG GTGTGCCACA CCAACACCCT
6961 GAACCCTGCC CATGGCAGGC AGGTGACTGT GCAGGAGTTT GCCCTGTTCT TCACCATCTT
7021 TGATGAAACC AAGAGCTGGT ACTTCACTGA GAACATGGAG AGGAACTGCA GGGCCCCCTG
7081 CAACATCCAG ATGGAGGACC CCACCTTCAA GGAGAACTAC AGGTTCCATG CCATCAATGG
7141 CTACATCATG GACACCCTGC CTGGCCTGGT GATGGCCCAG GACCAGAGGA TCAGGTGGTA
7201 CCTGCTGAGC ATGGGCAGCA ATGAGAACAT CCACAGCATC CACTTCTCTG GCCATGTGTT
7261 CACTGTGAGG AAGAAGGAGG AGTACAAGAT GGCCCTGTAC AACCTGTACC CTGGGGTGTT
7321 TGAGACTGTG GAGATGCTGC CCAGCAAGGC TGGCATCTGG AGGGTGGAGT GCCTGATTGG
7381 GGAGCACCTG CATGCTGGCA TGAGCACCCT GTTCCTGGTG TACAGCAACA AGTGCCAGAC
7441 CCCCCTGGGC ATGGCCTCTG GCCACATCAG GGACTTCCAG ATCACTGCCT CTGGCCAGTA
7501 TGGCCAGTGG GCCCCCAAGC TGGCCAGGCT GCACTACTCT GGCAGCATCA ATGCCTGGAG
7561 CACCAAGGAG CCCTTCAGCT GGATCAAGGT GGACCTGCTG GCCCCCATGA TCATCCATGG
7621 CATCAAGACC CAGGGGGCCA GGCAGAAGTT CAGCAGCCTG TACATCAGCC AGTTCATCAT
7681 CATGTACAGC CTGGATGGCA AGAAGTGGCA GACCTACAGG GGCAACAGCA CTGGCACCCT
7741 GATGGTGTTC TTTGGCAATG TGGACAGCTC TGGCATCAAG CACAACATCT TCAACCCCCC
7801 CATCATTGCC AGATACATCA GGCTGCACCC CACCCACTAC AGCATCAGGA GCACCCTGAG
7861 GATGGAGCTG ATGGGCTGTG ACCTGAACAG CTGCAGCATG CCCCTGGGCA TGGAGAGCAA
7921 GGCCATCTCT GATGCCCAGA TCACTGCCAG CAGCTACTTC ACCAACATGT TTGCCACCTG
7981 GAGCCCCAGC AAGGCCAGGC TGCACCTGCA GGGCAGGAGC AATGCCTGGA GGCCCCAGGT
8041 CAACAACCCC AAGGAGTGGC TGCAGGTGGA CTTCCAGAAG ACCATGAAGG TGACTGGGGT
8101 GACCACCCAG GGGGTGAAGA GCCTGCTGAC CAGCATGTAT GTGAAGGAGT TCCTGATCAG
8161 CAGCAGCCAG GATGGCCACC AGTGGACCCT GTTCTTCCAG AATGGCAAGG TGAAGGTGTT
8221 CCAGGGCAAC CAGGACAGCT TCACCCCTGT GGTGAACAGC CTGGACCCCC CCCTGCTGAC
8281 CAGATACCTG AGGATTCACC CCCAGAGCTG GGTGCACCAG ATTGCCCTGA GGATGGAGGT
8341 GCTGGGCTGT GAGGCCCAGG ACCTGTACTG AGCGGCCGCG GGCCCAATCA ACCTCTGGAT
8401 TACAAAATTT GTGAAAGATT GACTGGTATT CTTAACTATG TTGCTCCTTT TACGCTATGT
8461 GGATACGCTG CTTTAATGCC TTTGTATCAT GCTATTGCTT CCCGTATGGC TTTCATTTTC
8521 TCCTCCTTGT ATAAATCCTG GTTGCTGTCT CTTTATGAGG AGTTGTGGCC CGTTGTCAGG
8581 CAACGTGGCG TGGTGTGCAC TGTGTTTGCT GACGCAACCC CCACTGGTTG GGGCATTGCC
8641 ACCACCTGTC AGCTCCTTTC CGGGACTTTC GCTTTCCCCC TCCCTATTGC CACGGCGGAA
8701 CTCATCGCCG CCTGCCTTGC CCGCTGCTGG ACAGGGGCTC GGCTGTTGGG CACTGACAAT
8761 TCCGTGGTGT TGTCGGGGAA ATCATCGTCC TTTCCTTGGC TGCTCGCCTG TGTTGCCACC
8821 TGGATTCTGC GCGGGACGTC CTTCTGCTAC GTCCCTTCGG CCCTCAATCC AGCGGACCTT
8881 CCTTCCCGCG GCCTGCTGCC GGCTCTGCGG CCTCTTCCGC GTCTTCGCCT TCGCCCTCAG
8941 ACGAGTCGGA TCTCCCTTTG GGCCGCCTCC CCGCAAGCTT CGCACTTTTT AAAAGAAAAG
9001 GGAGGACTGG ATGGGATTTA TTACTCCGAT AGGACGCTGG CTTGTAACTC AGTCTCTTAC
9061 TAGGAGACCA GCTTGAGCCT GGGTGTTCGC TGGTTAGCCT AACCTGGTTG GCCACCAGGG
9121 GTAAGGACTC CTTGGCTTAG AAAGCTAATA AACTTGCCTG CATTAGAGCT CTTACGCGTC
9181 CCGGGCTCGA GATCCGCATC TCAATTAGTC AGCAACCATA GTCCCGCCCC TAACTCCGCC
9241 CATCCCGCCC CTAACTCCGC CCAGTTCCGC CCATTCTCCG CCCCATGGCT GACTAATTTT
9301 TTTTATTTAT GCAGAGGCCG AGGCCGCCTC GGCCTCTGAG CTATTCCAGA AGTAGTGAGG
9361 AGGCTTTTTT GGAGGCCTAG GCTTTTGCAA AAAGCTAACT TGTTTATTGC AGCTTATAAT
9421 GGTTACAAAT AAAGCAATAG CATCACAAAT TTCACAAATA AAGCATTTTT TTCACTGCAT
9481 TCTAGTTGTG GTTTGTCCAA ACTCATCAAT GTATCTTATC ATGTCTGTCC GCTTCCTCGC
9541 TCACTGACTC GCTGCGCTCG GTCGTTCGGC TGCGGCGAGC GGTATCAGCT CACTCAAAGG
9601 CGGTAATACG GTTATCCACA GAATCAGGGG ATAACGCAGG AAAGAACATG TGAGCAAAAG
9661 GCCAGCAAAA GGCCAGGAAC CGTAAAAAGG CCGCGTTGCT GGCGTTTTTC CATAGGCTCC
9721 GCCCCCCTGA CGAGCATCAC AAAAATCGAC GCTCAAGTCA GAGGTGGCGA AACCCGACAG
9781 GACTATAAAG ATACCAGGCG TTTCCCCCTG GAAGCTCCCT CGTGCGCTCT CCTGTTCCGA
9841 CCCTGCCGCT TACCGGATAC CTGTCCGCCT TTCTCCCTTC GGGAAGCGTG GCGCTTTCTC
9901 ATAGCTCACG CTGTAGGTAT CTCAGTTCGG TGTAGGTCGT TCGCTCCAAG CTGGGCTGTG
9961 TGCACGAACC CCCCGTTCAG CCCGACCGCT GCGCCTTATC CGGTAACTAT CGTCTTGAGT
10021 CCAACCCGGT AAGACACGAC TTATCGCCAC TGGCAGCAGC CACTGGTAAC AGGATTAGCA
10081 GAGCGAGGTA TGTAGGCGGT GCTACAGAGT TCTTGAAGTG GTGGCCTAAC TACGGCTACA
10141 CTAGAAGAAC AGTATTTGGT ATCTGCGCTC TGCTGAAGCC AGTTACCTTC GGAAAAAGAG
10201 TTGGTAGCTC TTGATCCGGC AAACAAACCA CCGCTGGTAG CGGTGGTTTT TTTGTTTGCA
10261 AGCAGCAGAT TACGCGCAGA AAAAAAGGAT CTCAAGAAGA TCCTTTGATC TTTTCTACGG
10321 GGTCTGACGC TCAGTGGAAC GAAAACTCAC GTTAAGGGAT TTTGGTCATG AGATTATCAA
10381 AAAGGATCTT CACCTAGATC CTTTTAAATT AAAAATGAAG TTTTAAATCA ATCTAAAGTA
10441 TATATGAGTA AACTTGGTCT GACAGTTAGA AAAACTCATC GAGCATCAAA TGAAACTGCA
10501 ATTTATTCAT ATCAGGATTA TCAATACCAT ATTTTTGAAA AAGCCGTTTC TGTAATGAAG
10561 GAGAAAACTC ACCGAGGCAG TTCCATAGGA TGGCAAGATC CTGGTATCGG TCTGCGATTC
10621 CGACTCGTCC AACATCAATA CAACCTATTA ATTTCCCCTC GTCAAAAATA AGGTTATCAA
10681 GTGAGAAATC ACCATGAGTG ACGACTGAAT CCGGTGAGAA TGGCAACAGC TTATGCATTT
10741 CTTTCCAGAC TTGTTCAACA GGCCAGCCAT TACGCTCGTC ATCAAAATCA CTCGCATCAA
10801 CCAAACCGTT ATTCATTCGT GATTGCGCCT GAGCGAGACG AAATACGCGA TCGCTGTTAA
10861 AAGGACAATT ACAAACAGGA ATCGAATGCA ACCGGCGCAG GAACACTGCC AGCGCATCAA
10921 CAATATTTTC ACCTGAATCA GGATATTCTT CTAATACCTG GAATGCTGTT TTTCCGGGGA
10981 TCGCAGTGGT GAGTAACCAT GCATCATCAG GAGTACGGAT AAAATGCTTG ATGGTCGGAA
11041 GAGGCATAAA TTCCGTCAGC CAGTTTAGTC TGACCATCTC ATCTGTAACA TCATTGGCAA
11101 CGCTACCTTT GCCATGTTTC AGAAACAACT CTGGCGCATC GGGCTTCCCA TACAATCGAT
11161 AGATTGTCGC ACCTGATTGC CCGACATTAT CGCGAGCCCA TTTATACCCA TATAAATCAG
11221 CATCCATGTT GGAATTTAAT CGCGGCCTAG AGCAAGACGT TTCCCGTTGA ATATGGCTCA
11281 TAACACCCCT TGTATTACTG TTTATGTAAG CAGACAGTTT TATTGTTCAT GATGATATAT
11341 TTTTATCTTG TGCAATGTAA CATCAGAGAT TTTGAGACAC AACAATTGGT CGACGGATCC
SEQ ID NO: 28 F/HN-SIV-hCEF-HFVIII-N6-co plasmid as defined in FIG. 4d (pDNA1 pGM414)
Length: 11108; Molecule type: DNA; Feature location/qualifiers: source, 1..11108; mol_type, other DNA; note, pGM414; organism, synthetic construct
1 GGTACCTCAA TATTGGCCAT TAGCCATATT ATTCATTGGT TATATAGCAT AAATCAATAT
61 TGGCTATTGG CCATTGCATA CGTTGTATCT ATATCATAAT ATGTACATTT ATATTGGCTC
121 ATGTCCAATA TGACCGCCAT GTTGGCATTG ATTATTGACT AGTTATTAAT AGTAATCAAT
181 TACGGGGTCA TTAGTTCATA GCCCATATAT GGAGTTCCGC GTTACATAAC TTACGGTAAA
241 TGGCCCGCCT GGCTGACCGC CCAACGACCC CCGCCCATTG ACGTCAATAA TGACGTATGT
301 TCCCATAGTA ACGCCAATAG GGACTTTCCA TTGACGTCAA TGGGTGGAGT ATTTACGGTA
361 AACTGCCCAC TTGGCAGTAC ATCAAGTGTA TCATATGCCA AGTCCGCCCC CTATTGACGT
421 CAATGACGGT AAATGGCCCG CCTGGCATTA TGCCCAGTAC ATGACCTTAC GGGACTTTCC
481 TACTTGGCAG TACATCTACG TATTAGTCAT CGCTATTACC ATGGTGATGC GGTTTTGGCA
541 GTACACCAAT GGGCGTGGAT AGCGGTTTGA CTCACGGGGA TTTCCAAGTC TCCACCCCAT
601 TGACGTCAAT GGGAGTTTGT TTTGGCACCA AAATCAACGG GACTTTCCAA AATGTCGTAA
661 CAACTGCGAT CGCCCGCCCC GTTGACGCAA ATGGGCGGTA GGCGTGTACG GTGGGAGGTC
721 TATATAAGCA GAGCTCGCTG GCTTGTAACT CAGTCTCTTA CTAGGAGACC AGCTTGAGCC
781 TGGGTGTTCG CTGGTTAGCC TAACCTGGTT GGCCACCAGG GGTAAGGACT CCTTGGCTTA
841 GAAAGCTAAT AAACTTGCCT GCATTAGAGC TTATCTGAGT CAAGTGTCCT CATTGACGCC
901 TCACTCTCTT GAACGGGAAT CTTCCTTACT GGGTTCTCTC TCTGACCCAG GCGAGAGAAA
961 CTCCAGCAGT GGCGCCCGAA CAGGGACTTG AGTGAGAGTG TAGGCACGTA CAGCTGAGAA
1021 GGCGTCGGAC GCGAAGGAAG CGCGGGGTGC GACGCGACCA AGAAGGAGAC TTGGTGAGTA
1081 GGCTTCTCGA GTGCCGGGAA AAAGCTCGAG CCTAGTTAGA GGACTAGGAG AGGCCGTAGC
1141 CGTAACTACT CTTGGGCAAG TAGGGCAGGC GGTGGGTACG CAATGGGGGC GGCTACCTCA
1201 GCACTAAATA GGAGACAATT AGACCAATTT GAGAAAATAC GACTTCGCCC GAACGGAAAG
1261 AAAAAGTACC AAATTAAACA TTTAATATGG GCAGGCAAGG AGATGGAGCG CTTCGGCCTC
1321 CATGAGAGGT TGTTGGAGAC AGAGGAGGGG TGTAAAAGAA TCATAGAAGT CCTCTACCCC
1381 CTAGAACCAA CAGGATCGGA GGGCTTAAAA AGTCTGTTCA ATCTTGTGTG CGTGCTATAT
1441 TGCTTGCACA AGGAACAGAA AGTGAAAGAC ACAGAGGAAG CAGTAGCAAC AGTAAGACAA
1501 CACTGCCATC TAGTGGAAAA AGAAAAAAGT GCAACAGAGA CATCTAGTGG ACAAAAGAAA
1561 AATGACAAGG GAATAGCAGC GCCACCTGGT GGCAGTCAGA ATTTTCCAGC GCAACAACAA
1621 GGAAATGCCT GGGTACATGT ACCCTTGTCA CCGCGCACCT TAAATGCGTG GGTAAAAGCA
1681 GTAGAGGAGA AAAAATTTGG AGCAGAAATA GTACCCATGT TTCAAGCCCT ATCGAATTCC
1741 CGTTTGTGCT AGGGTTCTTA GGCTTCTTGG GGGCTGCTGG AACTGCAATG GGAGCAGCGG
1801 CGACAGCCCT GACGGTCCAG TCTCAGCATT TGCTTGCTGG GATACTGCAG CAGCAGAAGA
1861 ATCTGCTGGC GGCTGTGGAG GCTCAACAGC AGATGTTGAA GCTGACCATT TGGGGTGTTA
1921 AAAACCTCAA TGCCCGCGTC ACAGCCCTTG AGAAGTACCT AGAGGATCAG GCACGACTAA
1981 ACTCCTGGGG GTGCGCATGG AAACAAGTAT GTCATACCAC AGTGGAGTGG CCCTGGACAA
2041 ATCGGACTCC GGATTGGCAA AATATGACTT GGTTGGAGTG GGAAAGACAA ATAGCTGATT
2101 TGGAAAGCAA CATTACGAGA CAATTAGTGA AGGCTAGAGA ACAAGAGGAA AAGAATCTAG
2161 ATGCCTATCA GAAGTTAACT AGTTGGTCAG ATTTCTGGTC TTGGTTCGAT TTCTCAAAAT
2221 GGCTTAACAT TTTAAAAATG GGATTTTTAG TAATAGTAGG AATAATAGGG TTAAGATTAC
2281 TTTACACAGT ATATGGATGT ATAGTGAGGG TTAGGCAGGG ATATGTTCCT CTATCTCCAC
2341 AGATCCATAT CCGCGGCAAT TTTAAAAGAA AGGGAGGAAT AGGGGGACAG ACTTCAGCAG
2401 AGAGACTAAT TAATATAATA ACAACACAAT TAGAAATACA ACATTTACAA ACCAAAATTC
2461 AAAAAATTTT AAATTTTAGA GCCGCGGAGA TCTGTTACAT AACTTATGGT AAATGGCCTG
2521 CCTGGCTGAC TGCCCAATGA CCCCTGCCCA ATGATGTCAA TAATGATGTA TGTTCCCATG
2581 TAATGCCAAT AGGGACTTTC CATTGATGTC AATGGGTGGA GTATTTATGG TAACTGCCCA
2641 CTTGGCAGTA CATCAAGTGT ATCATATGCC AAGTATGCCC CCTATTGATG TCAATGATGG
2701 TAAATGGCCT GCCTGGCATT ATGCCCAGTA CATGACCTTA TGGGACTTTC CTACTTGGCA
2761 GTACATCTAT GTATTAGTCA TTGCTATTAC CATGGGAATT CACTAGTGGA GAAGAGCATG
2821 CTTGAGGGCT GAGTGCCCCT CAGTGGGCAG AGAGCACATG GCCCACAGTC CCTGAGAAGT
2881 TGGGGGGAGG GGTGGGCAAT TGAACTGGTG CCTAGAGAAG GTGGGGCTTG GGTAAACTGG
2941 GAAAGTGATG TGGTGTACTG GCTCCACCTT TTTCCCCAGG GTGGGGGAGA ACCATATATA
3001 AGTGCAGTAG TCTCTGTGAA CATTCAAGCT TCTGCCTTCT CCCTCCTGTG AGTTTGCTAG
3061 CCACCAATGC AGATTGAGCT GAGCACCTGC TTCTTCCTGT GCCTGCTGAG GTTCTGCTTC
3121 TCTGCCACCA GGAGATACTA CCTGGGGGCT GTGGAGCTGA GCTGGGACTA CATGCAGTCT
3181 GACCTGGGGG AGCTGCCTGT GGATGCCAGG TTCCCCCCCA GAGTGCCCAA GAGCTTCCCC
3241 TTCAACACCT CTGTGGTGTA CAAGAAGACC CTGTTTGTGG AGTTCACTGA CCACCTGTTC
3301 AACATTGCCA AGCCCAGGCC CCCCTGGATG GGCCTGCTGG GCCCCACCAT CCAGGCTGAG
3361 GTGTATGACA CTGTGGTGAT CACCCTGAAG AACATGGCCA GCCACCCTGT GAGCCTGCAT
3421 GCTGTGGGGG TGAGCTACTG GAAGGCCTCT GAGGGGGCTG AGTATGATGA CCAGACCAGC
3481 CAGAGGGAGA AGGAGGATGA CAAGGTGTTC CCTGGGGGCA GCCACACCTA TGTGTGGCAG
3541 GTGCTGAAGG AGAATGGCCC CATGGCCTCT GACCCCCTGT GCCTGACCTA CAGCTACCTG
3601 AGCCATGTGG ACCTGGTGAA GGACCTGAAC TCTGGCCTGA TTGGGGCCCT GCTGGTGTGC
3661 AGGGAGGGCA GCCTGGCCAA GGAGAAGACC CAGACCCTGC ACAAGTTCAT CCTGCTGTTT
3721 GCTGTGTTTG ATGAGGGCAA GAGCTGGCAC TCTGAAACCA AGAACAGCCT GATGCAGGAC
3781 AGGGATGCTG CCTCTGCCAG GGCCTGGCCC AAGATGCACA CTGTGAATGG CTATGTGAAC
3841 AGGAGCCTGC CTGGCCTGAT TGGCTGCCAC AGGAAGTCTG TGTACTGGCA TGTGATTGGC
3901 ATGGGCACCA CCCCTGAGGT GCACAGCATC TTCCTGGAGG GCCACACCTT CCTGGTCAGG
3961 AACCACAGGC AGGCCAGCCT GGAGATCAGC CCCATCACCT TCCTGACTGC CCAGACCCTG
4021 CTGATGGACC TGGGCCAGTT CCTGCTGTTC TGCCACATCA GCAGCCACCA GCATGATGGC
4081 ATGGAGGCCT ATGTGAAGGT GGACAGCTGC CCTGAGGAGC CCCAGCTGAG GATGAAGAAC
4141 AATGAGGAGG CTGAGGACTA TGATGATGAC CTGACTGACT CTGAGATGGA TGTGGTGAGG
4201 TTTGATGATG ACAACAGCCC CAGCTTCATC CAGATCAGGT CTGTGGCCAA GAAGCACCCC
4261 AAGACCTGGG TGCACTACAT TGCTGCTGAG GAGGAGGACT GGGACTATGC CCCCCTGGTG
4321 CTGGCCCCTG ATGACAGGAG CTACAAGAGC CAGTACCTGA ACAATGGCCC CCAGAGGATT
4381 GGCAGGAAGT ACAAGAAGGT CAGGTTCATG GCCTACACTG ATGAAACCTT CAAGACCAGG
4441 GAGGCCATCC AGCATGAGTC TGGCATCCTG GGCCCCCTGC TGTATGGGGA GGTGGGGGAC
4501 ACCCTGCTGA TCATCTTCAA GAACCAGGCC AGCAGGCCCT ACAACATCTA CCCCCATGGC
4561 ATCACTGATG TGAGGCCCCT GTACAGCAGG AGGCTGCCCA AGGGGGTGAA GCACCTGAAG
4621 GACTTCCCCA TCCTGCCTGG GGAGATCTTC AAGTACAAGT GGACTGTGAC TGTGGAGGAT
4681 GGCCCCACCA AGTCTGACCC CAGGTGCCTG ACCAGATACT ACAGCAGCTT TGTGAACATG
4741 GAGAGGGACC TGGCCTCTGG CCTGATTGGC CCCCTGCTGA TCTGCTACAA GGAGTCTGTG
4801 GACCAGAGGG GCAACCAGAT CATGTCTGAC AAGAGGAATG TGATCCTGTT CTCTGTGTTT
4861 GATGAGAACA GGAGCTGGTA CCTGACTGAG AACATCCAGA GGTTCCTGCC CAACCCTGCT
4921 GGGGTGCAGC TGGAGGACCC TGAGTTCCAG GCCAGCAACA TCATGCACAG CATCAATGGC
4981 TATGTGTTTG ACAGCCTGCA GCTGTCTGTG TGCCTGCATG AGGTGGCCTA CTGGTACATC
5041 CTGAGCATTG GGGCCCAGAC TGACTTCCTG TCTGTGTTCT TCTCTGGCTA CACCTTCAAG
5101 CACAAGATGG TGTATGAGGA CACCCTGACC CTGTTCCCCT TCTCTGGGGA GACTGTGTTC
5161 ATGAGCATGG AGAACCCTGG CCTGTGGATT CTGGGCTGCC ACAACTCTGA CTTCAGGAAC
5221 AGGGGCATGA CTGCCCTGCT GAAAGTCTCC AGCTGTGACA AGAACACTGG GGACTACTAT
5281 GAGGACAGCT ATGAGGACAT CTCTGCCTAC CTGCTGAGCA AGAACAATGC CATTGAGCCC
5341 AGGAGCTTCA GCCAGAACAG CAGGCACCCC AGCACCAGGC AGAAGCAGTT CAATGCCACC
5401 ACCATCCCTG AGAATGACAT AGAGAAGACA GACCCATGGT TTGCCCACCG GACCCCCATG
5461 CCCAAGATCC AGAATGTGAG CAGCTCTGAC CTGCTGATGC TGCTGAGGCA GAGCCCCACC
5521 CCCCATGGCC TGAGCCTGTC TGACCTGCAG GAGGCCAAGT ATGAAACCTT CTCTGATGAC
5581 CCCAGCCCTG GGGCCATTGA CAGCAACAAC AGCCTGTCTG AGATGACCCA CTTCAGGCCC
5641 CAGCTGCACC ACTCTGGGGA CATGGTGTTC ACCCCTGAGT CTGGCCTGCA GCTGAGGCTG
5701 AATGAGAAGC TGGGCACCAC TGCTGCCACT GAGCTGAAGA AGCTGGACTT CAAAGTCTCC
5761 AGCACCAGCA ACAACCTGAT CAGCACCATC CCCTCTGACA ACCTGGCTGC TGGCACTGAC
5821 AACACCAGCA GCCTGGGCCC CCCCAGCATG CCTGTGCACT ATGACAGCCA GCTGGACACC
5881 ACCCTGTTTG GCAAGAAGAG CAGCCCCCTG ACTGAGTCTG GGGGCCCCCT GAGCCTGTCT
5941 GAGGAGAACA ATGACAGCAA GCTGCTGGAG TCTGGCCTGA TGAACAGCCA GGAGAGCAGC
6001 TGGGGCAAGA ATGTGAGCAG CAGGGAGATC ACCAGGACCA CCCTGCAGTC TGACCAGGAG
6061 GAGATTGACT ATGATGACAC CATCTCTGTG GAGATGAAGA AGGAGGACTT TGACATCTAC
6121 GACGAGGACG AGAACCAGAG CCCCAGGAGC TTCCAGAAGA AGACCAGGCA CTACTTCATT
6181 GCTGCTGTGG AGAGGCTGTG GGACTATGGC ATGAGCAGCA GCCCCCATGT GCTGAGGAAC
6241 AGGGCCCAGT CTGGCTCTGT GCCCCAGTTC AAGAAGGTGG TGTTCCAGGA GTTCACTGAT
6301 GGCAGCTTCA CCCAGCCCCT GTACAGAGGG GAGCTGAATG AGCACCTGGG CCTGCTGGGC
6361 CCCTACATCA GGGCTGAGGT GGAGGACAAC ATCATGGTGA CCTTCAGGAA CCAGGCCAGC
6421 AGGCCCTACA GCTTCTACAG CAGCCTGATC AGCTATGAGG AGGACCAGAG GCAGGGGGCT
6481 GAGCCCAGGA AGAACTTTGT GAAGCCCAAT GAAACCAAGA CCTACTTCTG GAAGGTGCAG
6541 CACCACATGG CCCCCACCAA GGATGAGTTT GACTGCAAGG CCTGGGCCTA CTTCTCTGAT
6601 GTGGACCTGG AGAAGGATGT GCACTCTGGC CTGATTGGCC CCCTGCTGGT GTGCCACACC
6661 AACACCCTGA ACCCTGCCCA TGGCAGGCAG GTGACTGTGC AGGAGTTTGC CCTGTTCTTC
6721 ACCATCTTTG ATGAAACCAA GAGCTGGTAC TTCACTGAGA ACATGGAGAG GAACTGCAGG
6781 GCCCCCTGCA ACATCCAGAT GGAGGACCCC ACCTTCAAGG AGAACTACAG GTTCCATGCC
6841 ATCAATGGCT ACATCATGGA CACCCTGCCT GGCCTGGTGA TGGCCCAGGA CCAGAGGATC
6901 AGGTGGTACC TGCTGAGCAT GGGCAGCAAT GAGAACATCC ACAGCATCCA CTTCTCTGGC
6961 CATGTGTTCA CTGTGAGGAA GAAGGAGGAG TACAAGATGG CCCTGTACAA CCTGTACCCT
7021 GGGGTGTTTG AGACTGTGGA GATGCTGCCC AGCAAGGCTG GCATCTGGAG GGTGGAGTGC
7081 CTGATTGGGG AGCACCTGCA TGCTGGCATG AGCACCCTGT TCCTGGTGTA CAGCAACAAG
7141 TGCCAGACCC CCCTGGGCAT GGCCTCTGGC CACATCAGGG ACTTCCAGAT CACTGCCTCT
7201 GGCCAGTATG GCCAGTGGGC CCCCAAGCTG GCCAGGCTGC ACTACTCTGG CAGCATCAAT
7261 GCCTGGAGCA CCAAGGAGCC CTTCAGCTGG ATCAAGGTGG ACCTGCTGGC CCCCATGATC
7321 ATCCATGGCA TCAAGACCCA GGGGGCCAGG CAGAAGTTCA GCAGCCTGTA CATCAGCCAG
7381 TTCATCATCA TGTACAGCCT GGATGGCAAG AAGTGGCAGA CCTACAGGGG CAACAGCACT
7441 GGCACCCTGA TGGTGTTCTT TGGCAATGTG GACAGCTCTG GCATCAAGCA CAACATCTTC
7501 AACCCCCCCA TCATTGCCAG ATACATCAGG CTGCACCCCA CCCACTACAG CATCAGGAGC
7561 ACCCTGAGGA TGGAGCTGAT GGGCTGTGAC CTGAACAGCT GCAGCATGCC CCTGGGCATG
7621 GAGAGCAAGG CCATCTCTGA TGCCCAGATC ACTGCCAGCA GCTACTTCAC CAACATGTTT
7681 GCCACCTGGA GCCCCAGCAA GGCCAGGCTG CACCTGCAGG GCAGGAGCAA TGCCTGGAGG
7741 CCCCAGGTCA ACAACCCCAA GGAGTGGCTG CAGGTGGACT TCCAGAAGAC CATGAAGGTG
7801 ACTGGGGTGA CCACCCAGGG GGTGAAGAGC CTGCTGACCA GCATGTATGT GAAGGAGTTC
7861 CTGATCAGCA GCAGCCAGGA TGGCCACCAG TGGACCCTGT TCTTCCAGAA TGGCAAGGTG
7921 AAGGTGTTCC AGGGCAACCA GGACAGCTTC ACCCCTGTGG TGAACAGCCT GGACCCCCCC
7981 CTGCTGACCA GATACCTGAG GATTCACCCC CAGAGCTGGG TGCACCAGAT TGCCCTGAGG
8041 ATGGAGGTGC TGGGCTGTGA GGCCCAGGAC CTGTACTGAG CGGCCGCGGG CCCAATCAAC
8101 CTCTGGATTA CAAAATTTGT GAAAGATTGA CTGGTATTCT TAACTATGTT GCTCCTTTTA
8161 CGCTATGTGG ATACGCTGCT TTAATGCCTT TGTATCATGC TATTGCTTCC CGTATGGCTT
8221 TCATTTTCTC CTCCTTGTAT AAATCCTGGT TGCTGTCTCT TTATGAGGAG TTGTGGCCCG
8281 TTGTCAGGCA ACGTGGCGTG GTGTGCACTG TGTTTGCTGA CGCAACCCCC ACTGGTTGGG
8341 GCATTGCCAC CACCTGTCAG CTCCTTTCCG GGACTTTCGC TTTCCCCCTC CCTATTGCCA
8401 CGGCGGAACT CATCGCCGCC TGCCTTGCCC GCTGCTGGAC AGGGGCTCGG CTGTTGGGCA
8461 CTGACAATTC CGTGGTGTTG TCGGGGAAAT CATCGTCCTT TCCTTGGCTG CTCGCCTGTG
8521 TTGCCACCTG GATTCTGCGC GGGACGTCCT TCTGCTACGT CCCTTCGGCC CTCAATCCAG
8581 CGGACCTTCC TTCCCGCGGC CTGCTGCCGG CTCTGCGGCC TCTTCCGCGT CTTCGCCTTC
8641 GCCCTCAGAC GAGTCGGATC TCCCTTTGGG CCGCCTCCCC GCAAGCTTCG CACTTTTTAA
8701 AAGAAAAGGG AGGACTGGAT GGGATTTATT ACTCCGATAG GACGCTGGCT TGTAACTCAG
8761 TCTCTTACTA GGAGACCAGC TTGAGCCTGG GTGTTCGCTG GTTAGCCTAA CCTGGTTGGC
8821 CACCAGGGGT AAGGACTCCT TGGCTTAGAA AGCTAATAAA CTTGCCTGCA TTAGAGCTCT
8881 TACGCGTCCC GGGCTCGAGA TCCGCATCTC AATTAGTCAG CAACCATAGT CCCGCCCCTA
8941 ACTCCGCCCA TCCCGCCCCT AACTCCGCCC AGTTCCGCCC ATTCTCCGCC CCATGGCTGA
9001 CTAATTTTTT TTATTTATGC AGAGGCCGAG GCCGCCTCGG CCTCTGAGCT ATTCCAGAAG
9061 TAGTGAGGAG GCTTTTTTGG AGGCCTAGGC TTTTGCAAAA AGCTAACTTG TTTATTGCAG
9121 CTTATAATGG TTACAAATAA AGCAATAGCA TCACAAATTT CACAAATAAA GCATTTTTTT
9181 CACTGCATTC TAGTTGTGGT TTGTCCAAAC TCATCAATGT ATCTTATCAT GTCTGTCCGC
9241 TTCCTCGCTC ACTGACTCGC TGCGCTCGGT CGTTCGGCTG CGGCGAGCGG TATCAGCTCA
9301 CTCAAAGGCG GTAATACGGT TATCCACAGA ATCAGGGGAT AACGCAGGAA AGAACATGTG
9361 AGCAAAAGGC CAGCAAAAGG CCAGGAACCG TAAAAAGGCC GCGTTGCTGG CGTTTTTCCA
9421 TAGGCTCCGC CCCCCTGACG AGCATCACAA AAATCGACGC TCAAGTCAGA GGTGGCGAAA
9481 CCCGACAGGA CTATAAAGAT ACCAGGCGTT TCCCCCTGGA AGCTCCCTCG TGCGCTCTCC
9541 TGTTCCGACC CTGCCGCTTA CCGGATACCT GTCCGCCTTT CTCCCTTCGG GAAGCGTGGC
9601 GCTTTCTCAT AGCTCACGCT GTAGGTATCT CAGTTCGGTG TAGGTCGTTC GCTCCAAGCT
9661 GGGCTGTGTG CACGAACCCC CCGTTCAGCC CGACCGCTGC GCCTTATCCG GTAACTATCG
9721 TCTTGAGTCC AACCCGGTAA GACACGACTT ATCGCCACTG GCAGCAGCCA CTGGTAACAG
9781 GATTAGCAGA GCGAGGTATG TAGGCGGTGC TACAGAGTTC TTGAAGTGGT GGCCTAACTA
9841 CGGCTACACT AGAAGAACAG TATTTGGTAT CTGCGCTCTG CTGAAGCCAG TTACCTTCGG
9901 AAAAAGAGTT GGTAGCTCTT GATCCGGCAA ACAAACCACC GCTGGTAGCG GTGGTTTTTT
9961 TGTTTGCAAG CAGCAGATTA CGCGCAGAAA AAAAGGATCT CAAGAAGATC CTTTGATCTT
10021 TTCTACGGGG TCTGACGCTC AGTGGAACGA AAACTCACGT TAAGGGATTT TGGTCATGAG
10081 ATTATCAAAA AGGATCTTCA CCTAGATCCT TTTAAATTAA AAATGAAGTT TTAAATCAAT
10141 CTAAAGTATA TATGAGTAAA CTTGGTCTGA CAGTTAGAAA AACTCATCGA GCATCAAATG
10201 AAACTGCAAT TTATTCATAT CAGGATTATC AATACCATAT TTTTGAAAAA GCCGTTTCTG
10261 TAATGAAGGA GAAAACTCAC CGAGGCAGTT CCATAGGATG GCAAGATCCT GGTATCGGTC
10321 TGCGATTCCG ACTCGTCCAA CATCAATACA ACCTATTAAT TTCCCCTCGT CAAAAATAAG
10381 GTTATCAAGT GAGAAATCAC CATGAGTGAC GACTGAATCC GGTGAGAATG GCAACAGCTT
10441 ATGCATTTCT TTCCAGACTT GTTCAACAGG CCAGCCATTA CGCTCGTCAT CAAAATCACT
10501 CGCATCAACC AAACCGTTAT TCATTCGTGA TTGCGCCTGA GCGAGACGAA ATACGCGATC
10561 GCTGTTAAAA GGACAATTAC AAACAGGAAT CGAATGCAAC CGGCGCAGGA ACACTGCCAG
10621 CGCATCAACA ATATTTTCAC CTGAATCAGG ATATTCTTCT AATACCTGGA ATGCTGTTTT
10681 TCCGGGGATC GCAGTGGTGA GTAACCATGC ATCATCAGGA GTACGGATAA AATGCTTGAT
10741 GGTCGGAAGA GGCATAAATT CCGTCAGCCA GTTTAGTCTG ACCATCTCAT CTGTAACATC
10801 ATTGGCAACG CTACCTTTGC CATGTTTCAG AAACAACTCT GGCGCATCGG GCTTCCCATA
10861 CAATCGATAG ATTGTCGCAC CTGATTGCCC GACATTATCG CGAGCCCATT TATACCCATA
10921 TAAATCAGCA TCCATGTTGG AATTTAATCG CGGCCTAGAG CAAGACGTTT CCCGTTGAAT
10981 ATGGCTCATA ACACCCCTTG TATTACTGTT TATGTAAGCA GACAGTTTTA TTGTTCATGA
11041 TGATATATTT TTATCTTGTG CAATGTAACA TCAGAGATTT TGAGACACAA CAATTGGTCG
11101 ACGGATCC
SEQ ID NO: 29 Exemplified CAG promoter
Length: 1738; Molecule type: DNA; Feature location/qualifiers: source, 1..1738; mol_type, other DNA; note, CAG promoter; organism, synthetic construct
ATTGATTATTGACTAGTTATTAATAGTAATCAATTACGGGGTCATTAGTTCATAGCCCATATATGGAGTTCCGCGTTACATAACTTACGGTAAATGGCCCGCCTGGCTGACCGCCCAACGACCCCCGCCCATTGACGTCAATAATGACGTATGTTCCCATAGTAACGCCAATAGGGACTTTCCATTGACGTCAATGGGTGGAGTATTTACGGTAAACTGCCCACTTGGCAGTACATCAAGTGTATCATATGCCAAGTACGCCCCCTATTGACGTCAATGACGGTAAATGGCCCGCCTGGCATTATGCCCAGTACATGACCTTATGGGACTTTCCTACTTGGCAGTACATCTACGTATTAGTCATCGCTATTACCATGGTCGAGGTGAGCCCCACGTTCTGCTTCACTCTCCCCATCTCCCCCCCCTCCCCACCCCCAATTTTGTATTTATTTATTTTTTAATTATTTTGTGCAGCGATGGGGGCGGGGGGGGGGGGGGGGCGCGCGCCAGGCGGGGCGGGGCGGGGCGAGGGGCGGGGCGGGGCGAGGCGGAGAGGTGCGGCGGCAGCCAATCAGAGCGGCGCGCTCCGAAAGTTTCCTTTTATGGCGAGGCGGCGGCGGCGGCGGCCCTATAAAAAGCGAAGCGCGCGGCGGGCGGGAGTCGCTGCGCGCTGCCTTCGCCCCGTGCCCCGCTCCGCCGCCGCCTCGCGCCGCCCGCCCCGGCTCTGACTGACCGCGTTACTCCCACAGGTGAGCGGGCGGGACGGCCCTTCTCCTCCGGGCTGTAATTAGCGCTTGGTTTAATGACGGCTTGTTTCTTTTCTGTGGCTGCGTGAAAGCCTTGAGGGGCTCCGGGAGGGCCCTTTGTGCGGGGGGAGCGGCTCGGGGGGTGCGTGCGTGTGTGTGTGCGTGGGGAGCGCCGCGTGCGGCTCCGCGCTGCCCGGCGGCTGTGAGCGCTGCGGGCGCGGCGCGGGGCTTTGTGCGCTCCGCAGTGTGCGCGAGGGGAGCGCGGCCGGGGGCGGTGCCCCGCGGTGCGGGGGGGGCTGCGAGGGGAACAAAGGCTGCGTGCGGGGTGTGTGCGTGGGGGGGTGAGCAGGGGGTGTGGGCGCGTCGGTCGGGCTGCAACCCCCCCTGCACCCCCCTCCCCGAGTTGCTGAGCACGGCCCGGCTTCGGGTGCGGGGCTCCGTACGGGGCGTGGCGCGGGGCTCGCCGTGCCGGGCGGGGGGTGGCGGCAGGTGGGGGTGCCGGGCGGGGCGGGGCCGCCTCGGGCCGGGGAGGGCTCGGGGGAGGGGCGCGGCGGCCCCCGGAGCGCCGGCGGCTGTCGAGGCGCGGCGAGCCGCAGCCATTGCCTTTTATGGTAATCGTGCGAGAGGGCGCAGGGACTTCCTTTGTCCCAAATCTGTGCGGAGCCGAAATCTGGGAGGCGCCGCCGCACCCCCTCTAGCGGGCGCGGGGCGAAGCGGTGCGGCGCCGGCAGGAAGGAAATGGGCGGGGAGGGCCTTCGTGCGTCGCCGCGCCGCCGTCCCCTTCTCCCTCTCCAGCCTCGGGGCTGTCCGCGGGGGGACGGCTGCCTTCGGGGGGGACGGGGCAGGGCGGGGTTCGGCTTCTGGCGTGTGACCGGCGGCTCTAGAGCCTCTGCTAACCATGTTCATGCCTTCTTCTTTTTCCTACAGCTCCTGGGCAACGTGCTGGTTATTGTGCTGTCTCATCATTTTGGCAAAGAATTGCTCGAGCCACC
SEQUENCE LISTING
<110> IP2IPO Innovations Limited
<120> RETROVIRAL VECTORS
<130> P68229WO
<160> 29
<170> PatentIn version 3.5
<210> 1
<211> 4391
<212> DNA
<213> Artificial Sequence
<220>
<223> codon-optimised SIV gag-pol nucleic acid sequence (from pGM691)
<400> 1
atgggagctg ccacatctgc cctgaataga cggcagctgg accagttcga gaagatcaga 60
ctgcggccca acggcaagaa gaagtaccag atcaagcacc tgatctgggc cggcaaagag 120
atggaaagat tcggcctgca cgagcggctg ctggaaaccg aggaaggctg caagagaatt 180
atcgaggtgc tgtaccctct ggaacctacc ggctctgagg gcctgaagtc cctgttcaat 240
ctcgtgtgcg tgctgtactg cctgcacaaa gaacagaaag tgaaggacac cgaagaggcc 300
gtggccacag ttagacagca ctgccacctg gtggaaaaag agaagtccgc cacagagaca 360
agcagcggcc agaagaagaa cgacaaggga attgctgccc ctcctggcgg cagccagaat 420
tttcctgctc agcagcaggg aaacgcctgg gtgcacgttc cactgagccc tagaacactg 480
aatgcctggg tcaaagccgt ggaagagaag aagtttggcg ccgagatcgt gcccatgttc 540
caggctctgt ctgagggctg caccccttac gacatcaacc agatgctgaa cgtgctggga 600
gatcaccagg gcgctctgca gatcgtgaaa gagatcatca acgaagaggc tgcccagtgg 660
gacgtgacac atccattgcc tgctggacct ctgccagccg gacaactgag agatcctaga 720
ggctctgata tcgccggcac caccagctct gtgcaagagc agctggaatg gatctacacc 780
gccaatccta gagtggacgt gggcgccatc tacagaagat ggatcatcct gggcctgcag 840
aaatgcgtga agatgtacaa ccccgtgtcc gtgctggaca tcagacaggg acccaaagag 900
cccttcaagg actacgtgga ccggttctat aaggccatta gagccgagca ggccagcggc 960
gaagtgaagc agtggatgac agagagcctg ctgatccaga acgccaatcc agactgcaaa 1020
gtgatcctga aaggcctggg catgcacccc acactggaag agatgctgac agcctgtcaa 1080
ggcgttggcg gcccttctta caaagccaaa gtgatggccg agatgatgca gaccatgcag 1140
aaccagaaca tggtgcagca aggcggccct aagagacaga ggcctcctct gagatgctac 1200
aactgcggca agttcggcca catgcagaga cagtgtcctg agcctaggaa aacaaaatgt 1260
ctaaagtgtg gaaaattggg acacctagca aaagactgca ggggacaggt gaatttttta 1320
gggtatggac ggtggatggg ggcaaaaccg agaaattttc ccgccgctac tcttggagcg 1380
gaaccgagtg cgcctcctcc accgagcggc accaccccat acgacccagc aaagaagctc 1440
ctgcagcaat atgcagagaa agggaaacaa ctgagggagc aaaagaggaa tccaccggca 1500
atgaatccgg attggaccga gggatattct ttgaactccc tctttggaga agaccaataa 1560
agaccgtgta catcgagggc gtgcccatca aggctctgct ggatacaggc gccgacgaca 1620
ccatcatcaa agagaacgac ctgcagctga gcggcccttg gaggcctaag atcattggag 1680
gaatcggcgg aggcctgaac gtcaaagagt acaacgaccg ggaagtgaag atcgaggaca 1740
agatcctgag gggcacaatc ctgctgggcg ccacacctat caacatcatc ggcagaaatc 1800
tgctggcccc tgccggcgct agactggtta tgggacagct ctctgagaag atccccgtga 1860
cacccgtgaa gctgaaagaa ggcgctagag gaccttgtgt gcgacagtgg cctctgagca 1920
aagagaagat tgaggccctg caagaaatct gtagccagct ggaacaagag ggcaagatca 1980
gcagagttgg cggcgagaac gcctacaata cccctatctt ctgcatcaag aaaaaggaca 2040
agagccagtg gcggatgctg gtggacttta gagagctgaa caaggctacc caggacttct 2100
tcgaggtgca gctgggaatt cctcatcctg ccggcctgcg gaagatgaga cagatcacag 2160
tgctggatgt gggcgacgcc tactacagca tccctctgga ccccaacttc agaaagtaca 2220
ccgccttcac aatccccacc gtgaacaatc aaggccctgg catcagatac cagttcaact 2280
gcctgcctca aggctggaag ggcagcccca ccatttttca gaataccgcc gccagcatcc 2340
tggaagaaat caagagaaac ctgcctgctc tgaccatcgt gcagtacatg gacgatctgt 2400
gggtcggaag ccaagagaat gagcacaccc acgacaagct ggtggaacag ctgagaacaa 2460
agctgcaggc ctggggcctc gaaacccctg agaagaaggt gcagaaagaa cctccttacg 2520
agtggatggg ctacaagctg tggcctcaca agtgggagct gagccggatt cagctcgaag 2580
agaaggacga gtggaccgtg aacgacatcc agaaactcgt gggcaagctg aattgggcag 2640
cccagctgta tcccggcctg aggaccaaga acatctgcaa gctgatccgg ggaaagaaga 2700
acctgctgga actggtcaca tggacacctg aggccgaggc cgaatatgcc gagaatgccg 2760
aaatcctgaa aaccgagcaa gaggggacct actacaagcc tggcattcca atcagagctg 2820
ccgtgcagaa actggaaggc ggccagtggt cctaccagtt taagcaagaa ggccaggtcc 2880
tgaaagtggg caagtacacc aagcagaaga acacccacac caacgagctg aggacactgg 2940
ctggcctggt ccagaaaatc tgcaaagagg ccctggtcat ttggggcatc ctgcctgttc 3000
tggaactgcc cattgagcgg gaagtgtggg aacagtggtg ggccgattac tggcaagtgt 3060
cttggatccc cgagtgggac ttcgtgtcta cccctcctct gctgaaactg tggtacaccc 3120
tgacaaaaga gcccattcct aaagaggacg tctactacgt tgacggcgcc tgcaaccgga 3180
actccaaaga aggcaaggcc ggctacatca gccagtacgg caagcagaga gtggaaaccc 3240
tggaaaacac caccaaccag caggccgagc tgaccgccat taagatggcc ctggaagata 3300
gcggccccaa tgtgaacatc gtgaccgact ctcagtacgc catgggaatc ctgacagccc 3360
agcctacaca gagcgatagc cctctggttg agcagatcat tgccctgatg attcagaagc 3420
agcaaatcta cctgcagtgg gtgcccgctc acaaaggcat cggcggaaac gaagagatcg 3480
ataagctggt gtccaaggga atcagacggg tgctgttcct ggaaaagatt gaagaggccc 3540
aagaggaaca cgagcgctac cacaacaact ggaagaatct ggccgacacc tacggactgc 3600
cccagatcgt ggccaaagaa atcgtggcta tgtgccccaa gtgtcagatc aagggcgaac 3660
ctgtgcacgg ccaagtggat gcttctcctg gcacatggca gatggactgt acccacctgg 3720
aaggcaaagt ggtcatcgtg gctgtgcacg tggcctccgg ctttattgag gccgaagtga 3780
tccccagaga gacaggcaaa gaaaccgcca agttcctgct gaagatcctg tccagatggc 3840
ccatcacaca gctgcacacc gacaacggcc ctaacttcac atctcaagag gtggccgcca 3900
tctgttggtg gggaaagatt gagcacacaa ccggcattcc ctacaatcca cagagccagg 3960
gcagcatcga gtccatgaac aagcagctca aagagattat cggcaagatc cgggacgact 4020
gccagtacac agaaacagcc gtgctgatgg cctgtcacat ccacaacttc aagcggaaag 4080
gcggcatcgg aggacagaca tctgccgaga gactgatcaa tatcatcacc actcagctgg 4140
aaatccagca cctccagacc aagatccaga agattctgaa cttccgggtg tactaccgcg 4200
agggcagaga tcctgtttgg aaaggcccag cacagctgat ctggaaaggc gaaggtgccg 4260
tggtgctgaa ggatggctct gatctgaagg tggtgcccag acggaaggcc aagattatca 4320
aggattacga gcccaaacag cgcgtgggca atgaaggcga cgttgagggc acaagaggca 4380
gcgacaattg a 4391
<210> 2
<211> 4391
<212> DNA
<213> Simian immunodeficiency virus
<400> 2
atgggggcgg ctacctcagc actaaatagg agacaattag accaatttga gaaaatacga 60
cttcgcccga acggaaagaa aaagtaccaa attaaacatt taatatgggc aggcaaggag 120
atggagcgct tcggcctcca tgagaggttg ttggagacag aggaggggtg taaaagaatc 180
atagaagtcc tctaccccct agaaccaaca ggatcggagg gcttaaaaag tctgttcaat 240
cttgtgtgcg tactatattg cttgcacaag gaacagaaag tgaaagacac agaggaagca 300
gtagcaacag taagacaaca ctgccatcta gtggaaaaag aaaaaagtgc aacagagaca 360
tctagtggac aaaagaaaaa tgacaaggga atagcagcgc cacctggtgg cagtcagaat 420
tttccagcgc aacaacaagg aaatgcctgg gtacatgtac ccttgtcacc gcgcacctta 480
aatgcgtggg taaaagcagt agaggagaaa aaatttggag cagaaatagt acccatgttt 540
caagccctat cagaaggctg cacaccctat gacattaatc agatgcttaa tgtgctagga 600
gatcatcaag gggcattaca aatagtgaaa gagatcatta atgaagaagc agcccagtgg 660
gatgtaacac acccactacc cgcaggaccc ctaccagcag gacagctcag ggaccctcgc 720
ggctcagata tagcagggac caccagctca gtacaagaac agttagaatg gatctatact 780
gctaaccccc gggtagatgt aggtgccatc taccggagat ggattattct aggacttcaa 840
aagtgtgtca aaatgtacaa cccagtatca gtcctagaca ttaggcaggg acctaaagag 900
cccttcaagg attatgtgga cagattttac aaggcaatta gagcagaaca agcctcaggg 960
gaagtgaaac aatggatgac agaatcatta ctcattcaaa atgctaatcc agattgtaag 1020
gtcatcctga agggcctagg aatgcacccc acccttgaag aaatgttaac ggcttgtcag 1080
ggggtaggag gcccaagcta caaagcaaaa gtaatggcag aaatgatgca gaccatgcaa 1140
aatcaaaaca tggtgcagca gggaggtcca aaaagacaaa gacccccact aagatgttat 1200
aattgtggaa aatttggcca tatgcaaaga caatgtccgg aaccaaggaa aacaaaatgt 1260
ctaaagtgtg gaaaattggg acacctagca aaagactgca ggggacaggt gaatttttta 1320
gggtatggac ggtggatggg ggcaaaaccg agaaattttc ccgccgctac tcttggagcg 1380
gaaccgagtg cgcctcctcc accgagcggc accaccccat acgacccagc aaagaagctc 1440
ctgcagcaat atgcagagaa agggaaacaa ctgagggagc aaaagaggaa tccaccggca 1500
atgaatccgg attggaccga gggatattct ttgaactccc tctttggaga agaccaataa 1560
agacagtgta tatagaaggg gtccccatta aggcactgct agacacaggg gcagatgaca 1620
ccataattaa agaaaatgat ttacaattat caggtccatg gagacccaaa attatagggg 1680
gcataggagg aggccttaat gtaaaagaat ataacgacag ggaagtaaaa atagaagata 1740
aaattttgag aggaacaata ttgttaggag caactcccat taatataata ggtagaaatt 1800
tgctggcccc ggcaggtgcc cggttagtaa tgggacaatt atcagaaaaa attcctgtca 1860
cacctgtcaa attgaaggaa ggggctcggg gaccctgtgt aagacaatgg cctctctcta 1920
aagagaagat tgaagcttta caggaaatat gttcccaatt agagcaggaa ggaaaaatca 1980
gtagagtagg aggagaaaat gcatacaata ccccaatatt ttgcataaag aagaaggaca 2040
aatcccagtg gaggatgcta gtagacttta gagagttaaa taaggcaacc caagatttct 2100
ttgaagtgca attagggata ccccacccag caggattaag aaagatgaga cagataacag 2160
ttttagatgt aggagacgcc tattattcca taccattgga tccaaatttt aggaaatata 2220
ctgcttttac tattcccaca gtgaataatc agggacccgg gattaggtat caattcaact 2280
gtctcccgca agggtggaaa ggatctccta caatcttcca aaatacagca gcatccattt 2340
tggaggagat aaaaagaaac ttgccagcac taaccattgt acaatacatg gatgatttat 2400
gggtaggttc tcaagaaaat gaacacaccc atgacaaatt agtagaacag ttaagaacaa 2460
aattacaagc ctggggctta gaaaccccag aaaagaaggt gcaaaaagaa ccaccttatg 2520
agtggatggg atacaaactt tggcctcaca aatgggaact aagcagaata caactggagg 2580
aaaaagatga atggactgtc aatgacatcc agaagttagt tgggaaacta aattgggcag 2640
cacaattgta tccaggtctt aggaccaaga atatatgcaa gttaattaga ggaaagaaaa 2700
atctgttaga gctagtgact tggacacctg aggcagaagc tgaatatgca gaaaatgcag 2760
agattcttaa aacagaacag gaaggaacct attacaaacc aggaatacct attagggcag 2820
cagtacagaa attggaagga ggacagtgga gttaccaatt caaacaagaa ggacaagtct 2880
tgaaagtagg aaaatacacc aagcaaaaga acacccatac aaatgaactt cgcacattag 2940
ctggtttagt gcagaagatt tgcaaagaag ctctagttat ttgggggata ttaccagttc 3000
tagaactccc gatagaaaga gaggtatggg aacaatggtg ggcggattac tggcaggtaa 3060
gctggattcc cgaatgggat tttgtcagca ccccaccttt gctcaaacta tggtacacat 3120
taacaaaaga acccataccc aaggaggacg tttactatgt agatggagca tgcaacagaa 3180
attcaaaaga aggaaaagca ggatacatct cacaatacgg aaaacagaga gtagaaacat 3240
tagaaaacac taccaatcag caagcagaat taacagctat aaaaatggct ttggaagaca 3300
gtgggcctaa tgtgaacata gtaacagact ctcaatatgc aatgggaatt ttgacagcac 3360
aacccacaca aagtgattca ccattagtag agcaaattat agccttaatg atacaaaagc 3420
aacaaatata tttgcagtgg gtaccagcac ataaaggaat aggaggaaat gaggagatag 3480
ataaattagt gagtaaaggc attagaagag ttttattctt agaaaaaata gaagaagctc 3540
aagaagagca tgaaagatat cataataatt ggaaaaacct agcagataca tatgggcttc 3600
cacaaatagt agcaaaagag atagtggcca tgtgtccaaa atgtcagata aagggagaac 3660
cagtgcatgg acaagtggat gcctcacctg gaacatggca gatggattgt actcatctag 3720
aaggaaaagt agtcatagtt gcggtccatg tagccagtgg attcatagaa gcagaagtca 3780
tacctaggga aacaggaaaa gaaacggcaa agtttctatt aaaaatactg agtagatggc 3840
ctataacaca gttacacaca gacaatgggc ctaactttac ctcccaagaa gtggcagcaa 3900
tatgttggtg gggaaaaatt gaacatacaa caggtatacc atataacccc caatctcaag 3960
gatcaataga aagcatgaac aaacaattaa aagagataat tgggaaaata agagatgatt 4020
gccaatatac agagacagca gtactgatgg cttgccatat tcacaatttt aaaagaaagg 4080
gaggaatagg gggacagact tcagcagaga gactaattaa tataataaca acacaattag 4140
aaatacaaca tttacaaacc aaaattcaaa aaattttaaa ttttagagtc tactacagag 4200
aagggagaga ccctgtgtgg aaaggaccag cacaattaat ctggaaaggg gaaggagcag 4260
tggtcctcaa ggacggaagt gacctaaagg ttgtaccaag aaggaaagct aaaattatta 4320
aggattatga acccaaacaa agagtgggta atgagggtga cgtggaaggt accaggggat 4380
ctgataacta a 4391
<210> 3
<211> 10528
<212> DNA
<213> Artificial Sequence
<220>
<223> pGM326
<400> 3
ggtacctcaa tattggccat tagccatatt attcattggt tatatagcat aaatcaatat 60
tggctattgg ccattgcata cgttgtatct atatcataat atgtacattt atattggctc 120
atgtccaata tgaccgccat gttggcattg attattgact agttattaat agtaatcaat 180
tacggggtca ttagttcata gcccatatat ggagttccgc gttacataac ttacggtaaa 240
tggcccgcct ggctgaccgc ccaacgaccc ccgcccattg acgtcaataa tgacgtatgt 300
tcccatagta acgccaatag ggactttcca ttgacgtcaa tgggtggagt atttacggta 360
aactgcccac ttggcagtac atcaagtgta tcatatgcca agtccgcccc ctattgacgt 420
caatgacggt aaatggcccg cctggcatta tgcccagtac atgaccttac gggactttcc 480
tacttggcag tacatctacg tattagtcat cgctattacc atggtgatgc ggttttggca 540
gtacaccaat gggcgtggat agcggtttga ctcacgggga tttccaagtc tccaccccat 600
tgacgtcaat gggagtttgt tttggcacca aaatcaacgg gactttccaa aatgtcgtaa 660
caactgcgat cgcccgcccc gttgacgcaa atgggcggta ggcgtgtacg gtgggaggtc 720
tatataagca gagctcgctg gcttgtaact cagtctctta ctaggagacc agcttgagcc 780
tgggtgttcg ctggttagcc taacctggtt ggccaccagg ggtaaggact ccttggctta 840
gaaagctaat aaacttgcct gcattagagc ttatctgagt caagtgtcct cattgacgcc 900
tcactctctt gaacgggaat cttccttact gggttctctc tctgacccag gcgagagaaa 960
ctccagcagt ggcgcccgaa cagggacttg agtgagagtg taggcacgta cagctgagaa 1020
ggcgtcggac gcgaaggaag cgcggggtgc gacgcgacca agaaggagac ttggtgagta 1080
ggcttctcga gtgccgggaa aaagctcgag cctagttaga ggactaggag aggccgtagc 1140
cgtaactact ctgggcaagt agggcaggcg gtgggtacgc aatgggggcg gctacctcag 1200
cactaaatag gagacaatta gaccaatttg agaaaatacg acttcgcccg aacggaaaga 1260
aaaagtacca aattaaacat ttaatatggg caggcaagga gatggagcgc ttcggcctcc 1320
atgagaggtt gttggagaca gaggaggggt gtaaaagaat catagaagtc ctctaccccc 1380
tagaaccaac aggatcggag ggcttaaaaa gtctgttcaa tcttgtgtgc gtgctatatt 1440
gcttgcacaa ggaacagaaa gtgaaagaca cagaggaagc agtagcaaca gtaagacaac 1500
actgccatct agtggaaaaa gaaaaaagtg caacagagac atctagtgga caaaagaaaa 1560
atgacaaggg aatagcagcg ccacctggtg gcagtcagaa ttttccagcg caacaacaag 1620
gaaatgcctg ggtacatgta cccttgtcac cgcgcacctt aaatgcgtgg gtaaaagcag 1680
tagaggagaa aaaatttgga gcagaaatag tacccatgtt tcaagcccta tcgaattccc 1740
gtttgtgcta gggttcttag gcttcttggg ggctgctgga actgcaatgg gagcagcggc 1800
gacagccctg acggtccagt ctcagcattt gcttgctggg atactgcagc agcagaagaa 1860
tctgctggcg gctgtggagg ctcaacagca gatgttgaag ctgaccattt ggggtgttaa 1920
aaacctcaat gcccgcgtca cagcccttga gaagtaccta gaggatcagg cacgactaaa 1980
ctcctggggg tgcgcatgga aacaagtatg tcataccaca gtggagtggc cctggacaaa 2040
tcggactccg gattggcaaa atatgacttg gttggagtgg gaaagacaaa tagctgattt 2100
ggaaagcaac attacgagac aattagtgaa ggctagagaa caagaggaaa agaatctaga 2160
tgcctatcag aagttaacta gttggtcaga tttctggtct tggttcgatt tctcaaaatg 2220
gcttaacatt ttaaaaatgg gatttttagt aatagtagga ataatagggt taagattact 2280
ttacacagta tatggatgta tagtgagggt taggcaggga tatgttcctc tatctccaca 2340
gatccatatc cgcggcaatt ttaaaagaaa gggaggaata gggggacaga cttcagcaga 2400
gagactaatt aatataataa caacacaatt agaaatacaa catttacaaa ccaaaattca 2460
aaaaatttta aattttagag ccgcggagat ctgttacata acttatggta aatggcctgc 2520
ctggctgact gcccaatgac ccctgcccaa tgatgtcaat aatgatgtat gttcccatgt 2580
aatgccaata gggactttcc attgatgtca atgggtggag tatttatggt aactgcccac 2640
ttggcagtac atcaagtgta tcatatgcca agtatgcccc ctattgatgt caatgatggt 2700
aaatggcctg cctggcatta tgcccagtac atgaccttat gggactttcc tacttggcag 2760
tacatctatg tattagtcat tgctattacc atgggaattc actagtggag aagagcatgc 2820
ttgagggctg agtgcccctc agtgggcaga gagcacatgg cccacagtcc ctgagaagtt 2880
ggggggaggg gtgggcaatt gaactggtgc ctagagaagg tggggcttgg gtaaactggg 2940
aaagtgatgt ggtgtactgg ctccaccttt ttccccaggg tgggggagaa ccatatataa 3000
gtgcagtagt ctctgtgaac attcaagctt ctgccttctc cctcctgtga gtttgctagc 3060
caccatgcag agaagccctc tggagaaggc ctctgtggtg agcaagctgt tcttcagctg 3120
gaccaggccc atcctgagga agggctacag gcagagactg gagctgtctg acatctacca 3180
gatcccctct gtggactctg ctgacaacct gtctgagaag ctggagaggg agtgggatag 3240
agagctggcc agcaagaaga accccaagct gatcaatgcc ctgaggagat gcttcttctg 3300
gagattcatg ttctatggca tcttcctgta cctgggggaa gtgaccaagg ctgtgcagcc 3360
tctgctgctg ggcagaatca ttgccagcta tgaccctgac aacaaggagg agaggagcat 3420
tgccatctac ctgggcattg gcctgtgcct gctgttcatt gtgaggaccc tgctgctgca 3480
ccctgccatc tttggcctgc accacattgg catgcagatg aggattgcca tgttcagcct 3540
gatctacaag aaaaccctga agctgtccag cagagtgctg gacaagatca gcattggcca 3600
gctggtgagc ctgctgagca acaacctgaa caagtttgat gagggcctgg ccctggccca 3660
ctttgtgtgg attgcccctc tgcaggtggc cctgctgatg ggcctgattt gggagctgct 3720
gcaggcctct gccttttgtg gcctgggctt cctgattgtg ctggccctgt ttcaggctgg 3780
cctgggcagg atgatgatga agtacaggga ccagagggca ggcaagatca gtgagaggct 3840
ggtgatcacc tctgagatga ttgagaacat ccagtctgtg aaggcctact gttgggagga 3900
agctatggag aagatgattg aaaacctgag gcagacagag ctgaagctga ccaggaaggc 3960
tgcctatgtg agatacttca acagctctgc cttcttcttc tctggcttct ttgtggtgtt 4020
cctgtctgtg ctgccctatg ccctgatcaa ggggatcatc ctgagaaaga ttttcaccac 4080
catcagcttc tgcattgtgc tgaggatggc tgtgaccaga cagttcccct gggctgtgca 4140
gacctggtat gacagcctgg gggccatcaa caagatccag gacttcctgc agaagcagga 4200
gtacaagacc ctggagtaca acctgaccac cacagaagtg gtgatggaga atgtgacagc 4260
cttctgggag gagggctttg gggagctgtt tgagaaggcc aagcagaaca acaacaacag 4320
aaagaccagc aatggggatg actccctgtt cttctccaac ttctccctgc tgggcacacc 4380
tgtgctgaag gacatcaact tcaagattga gagggggcag ctgctggctg tggctggatc 4440
tacaggggct ggcaagacca gcctgctgat gatgatcatg ggggagctgg agccttctga 4500
gggcaagatc aagcactctg gcaggatcag cttttgcagc cagttcagct ggatcatgcc 4560
tggcaccatc aaggagaaca tcatctttgg agtgagctat gatgagtaca gatacaggag 4620
tgtgatcaag gcctgccagc tggaggagga catcagcaag tttgctgaga aggacaacat 4680
tgtgctgggg gagggaggca ttacactgtc tgggggccag agagccagaa tcagcctggc 4740
cagggctgtg tacaaggatg ctgacctgta cctgctggac tccccctttg gctacctgga 4800
tgtgctgaca gagaaggaga tttttgagag ctgtgtgtgc aagctgatgg ccaacaagac 4860
cagaatcctg gtgaccagca agatggagca cctgaagaag gctgacaaga tcctgatcct 4920
gcatgagggc agcagctact tctatgggac cttctctgag ctgcagaacc tgcagcctga 4980
cttcagctct aagctgatgg gctgtgacag ctttgaccag ttctctgctg agaggaggaa 5040
cagcatcctg acagagaccc tgcacagatt cagcctggag ggagatgccc ctgtgagctg 5100
gacagagacc aagaagcaga gcttcaagca gacaggggag tttggggaga agaggaagaa 5160
ctccatcctg aaccccatca acagcatcag gaagttcagc attgtgcaga aaacccccct 5220
gcagatgaat ggcattgagg aagattctga tgagcccctg gagaggagac tgagcctggt 5280
gcctgattct gagcagggag aggccatcct gcctaggatc tctgtgatca gcacaggccc 5340
tacactgcag gccagaagga ggcagtctgt gctgaacctg atgacccact ctgtgaacca 5400
gggccagaac atccacagga aaaccacagc ctccaccagg aaagtgagcc tggcccctca 5460
ggccaatctg acagagctgg acatctacag caggaggctg tctcaggaga caggcctgga 5520
gatttctgag gagatcaatg aggaggacct gaaagagtgc ttctttgatg acatggagag 5580
catccctgct gtgaccacct ggaacaccta cctgagatac atcacagtgc acaagagcct 5640
gatctttgtg ctgatctggt gcctggtgat cttcctggct gaagtggctg cctctctggt 5700
ggtgctgtgg ctgctgggaa acaccccact gcaggacaag ggcaacagca cccacagcag 5760
gaacaacagc tatgctgtga tcatcacctc cacctccagc tactatgtgt tctacatcta 5820
tgtgggagtg gctgataccc tgctggctat gggcttcttt agaggcctgc ccctggtgca 5880
cacactgatc acagtgagca agatcctcca ccacaagatg ctgcactctg tgctgcaggc 5940
tcctatgagc accctgaata ccctgaaggc tgggggcatc ctgaacagat tctccaagga 6000
tattgccatc ctggatgacc tgctgcctct caccatcttt gacttcatcc agctgctgct 6060
gattgtgatt ggggccattg ctgtggtggc agtgctgcag ccctacatct ttgtggccac 6120
agtgcctgtg attgtggcct tcatcatgct gagggcctac tttctgcaga cctcccagca 6180
gctgaagcag ctggagtctg agggcagaag ccccatcttc acccacctgg tgacaagcct 6240
gaagggcctg tggaccctga gagcctttgg caggcagccc tactttgaga ccctgttcca 6300
caaggccctg aacctgcaca cagccaactg gttcctctac ctgtccaccc tgagatggtt 6360
ccagatgaga attgagatga tctttgtcat cttcttcatt gctgtgacct tcatcagcat 6420
tctgaccaca ggagagggag agggcagagt gggcattatc ctgaccctgg ccatgaacat 6480
catgagcaca ctgcagtggg cagtgaacag cagcattgat gtggacagcc tgatgaggag 6540
tgtgagcaga gtgttcaagt tcattgatat gcccacagag ggcaagccta ccaagagcac 6600
caagccctac aagaatggcc agctgagcaa agtgatgatc attgagaaca gccatgtgaa 6660
gaaggatgat atctggccca gtggaggcca gatgacagtg aaggacctga cagccaagta 6720
cacagagggg ggcaatgcta tcctggagaa catctccttc agcatctccc ctggccagag 6780
agtgggactg ctgggaagaa caggctctgg caagtctacc ctgctgtctg ccttcctgag 6840
gctgctgaac acagagggag agatccagat tgatggagtg tcctgggaca gcatcacact 6900
gcagcagtgg aggaaggcct ttggtgtgat cccccagaaa gtgttcatct tcagtggcac 6960
cttcaggaag aacctggacc cctatgagca gtggtctgac caggagattt ggaaagtggc 7020
tgatgaagtg ggcctgagaa gtgtgattga gcagttccct ggcaagctgg actttgtcct 7080
ggtggatggg ggctgtgtgc tgagccatgg ccacaagcag ctgatgtgcc tggccagatc 7140
agtgctgagc aaggccaaga tcctgctgct ggatgagcct tctgcccacc tggatcctgt 7200
gacctaccag atcatcagga ggaccctcaa gcaggccttt gctgactgca cagtcatcct 7260
gtgtgagcac aggattgagg ccatgctgga gtgccagcag ttcctggtga ttgaggagaa 7320
caaagtgagg cagtatgaca gcatccagaa gctgctgaat gagaggagcc tgttcaggca 7380
ggccatcagc ccctctgata gagtgaagct gttcccccac aggaacagct ccaagtgcaa 7440
gagcaagccc cagattgctg ccctgaagga ggagacagag gaggaagtgc aggacaccag 7500
gctgtgaggg cccaatcaac ctctggatta caaaatttgt gaaagattga ctggtattct 7560
taactatgtt gctcctttta cgctatgtgg atacgctgct ttaatgcctt tgtatcatgc 7620
tattgcttcc cgtatggctt tcattttctc ctccttgtat aaatcctggt tgctgtctct 7680
ttatgaggag ttgtggcccg ttgtcaggca acgtggcgtg gtgtgcactg tgtttgctga 7740
cgcaaccccc actggttggg gcattgccac cacctgtcag ctcctttccg ggactttcgc 7800
tttccccctc cctattgcca cggcggaact catcgccgcc tgccttgccc gctgctggac 7860
aggggctcgg ctgttgggca ctgacaattc cgtggtgttg tcggggaaat catcgtcctt 7920
tccttggctg ctcgcctgtg ttgccacctg gattctgcgc gggacgtcct tctgctacgt 7980
cccttcggcc ctcaatccag cggaccttcc ttcccgcggc ctgctgccgg ctctgcggcc 8040
tcttccgcgt cttcgccttc gccctcagac gagtcggatc tccctttggg ccgcctcccc 8100
gcaagcttcg cactttttaa aagaaaaggg aggactggat gggatttatt actccgatag 8160
gacgctggct tgtaactcag tctcttacta ggagaccagc ttgagcctgg gtgttcgctg 8220
gttagcctaa cctggttggc caccaggggt aaggactcct tggcttagaa agctaataaa 8280
cttgcctgca ttagagctct tacgcgtccc gggctcgaga tccgcatctc aattagtcag 8340
caaccatagt cccgccccta actccgccca tcccgcccct aactccgccc agttccgccc 8400
attctccgcc ccatggctga ctaatttttt ttatttatgc agaggccgag gccgcctcgg 8460
cctctgagct attccagaag tagtgaggag gcttttttgg aggcctaggc ttttgcaaaa 8520
agctaacttg tttattgcag cttataatgg ttacaaataa agcaatagca tcacaaattt 8580
cacaaataaa gcattttttt cactgcattc tagttgtggt ttgtccaaac tcatcaatgt 8640
atcttatcat gtctgtccgc ttcctcgctc actgactcgc tgcgctcggt cgttcggctg 8700
cggcgagcgg tatcagctca ctcaaaggcg gtaatacggt tatccacaga atcaggggat 8760
aacgcaggaa agaacatgtg agcaaaaggc cagcaaaagg ccaggaaccg taaaaaggcc 8820
gcgttgctgg cgtttttcca taggctccgc ccccctgacg agcatcacaa aaatcgacgc 8880
tcaagtcaga ggtggcgaaa cccgacagga ctataaagat accaggcgtt tccccctgga 8940
agctccctcg tgcgctctcc tgttccgacc ctgccgctta ccggatacct gtccgccttt 9000
ctcccttcgg gaagcgtggc gctttctcat agctcacgct gtaggtatct cagttcggtg 9060
taggtcgttc gctccaagct gggctgtgtg cacgaacccc ccgttcagcc cgaccgctgc 9120
gccttatccg gtaactatcg tcttgagtcc aacccggtaa gacacgactt atcgccactg 9180
gcagcagcca ctggtaacag gattagcaga gcgaggtatg taggcggtgc tacagagttc 9240
ttgaagtggt ggcctaacta cggctacact agaagaacag tatttggtat ctgcgctctg 9300
ctgaagccag ttaccttcgg aaaaagagtt ggtagctctt gatccggcaa acaaaccacc 9360
gctggtagcg gtggtttttt tgtttgcaag cagcagatta cgcgcagaaa aaaaggatct 9420
caagaagatc ctttgatctt ttctacgggg tctgacgctc agtggaacga aaactcacgt 9480
taagggattt tggtcatgag attatcaaaa aggatcttca cctagatcct tttaaattaa 9540
aaatgaagtt ttaaatcaat ctaaagtata tatgagtaaa cttggtctga cagttagaaa 9600
aactcatcga gcatcaaatg aaactgcaat ttattcatat caggattatc aataccatat 9660
ttttgaaaaa gccgtttctg taatgaagga gaaaactcac cgaggcagtt ccataggatg 9720
gcaagatcct ggtatcggtc tgcgattccg actcgtccaa catcaataca acctattaat 9780
ttcccctcgt caaaaataag gttatcaagt gagaaatcac catgagtgac gactgaatcc 9840
ggtgagaatg gcaacagctt atgcatttct ttccagactt gttcaacagg ccagccatta 9900
cgctcgtcat caaaatcact cgcatcaacc aaaccgttat tcattcgtga ttgcgcctga 9960
gcgagacgaa atacgcgatc gctgttaaaa ggacaattac aaacaggaat cgaatgcaac 10020
cggcgcagga acactgccag cgcatcaaca atattttcac ctgaatcagg atattcttct 10080
aatacctgga atgctgtttt tccggggatc gcagtggtga gtaaccatgc atcatcagga 10140
gtacggataa aatgcttgat ggtcggaaga ggcataaatt ccgtcagcca gtttagtctg 10200
accatctcat ctgtaacatc attggcaacg ctacctttgc catgtttcag aaacaactct 10260
ggcgcatcgg gcttcccata caatcgatag attgtcgcac ctgattgccc gacattatcg 10320
cgagcccatt tatacccata taaatcagca tccatgttgg aatttaatcg cggcctagag 10380
caagacgttt cccgttgaat atggctcata acaccccttg tattactgtt tatgtaagca 10440
gacagtttta ttgttcatga tgatatattt ttatcttgtg caatgtaaca tcagagattt 10500
tgagacacaa caattggtcg acggatcc 10528
<210> 4
<211> 10536
<212> DNA
<213> Artificial Sequence
<220>
<223> pGM830
<400> 4
ggtacctcaa tattggccat tagccatatt attcattggt tatatagcat aaatcaatat 60
tggctattgg ccattgcata cgttgtatct atatcataat atgtacattt atattggctc 120
atgtccaata tgaccgccat gttggcattg attattgact agttattaat agtaatcaat 180
tacggggtca ttagttcata gcccatatat ggagttccgc gttacataac ttacggtaaa 240
tggcccgcct ggctgaccgc ccaacgaccc ccgcccattg acgtcaataa tgacgtatgt 300
tcccatagta acgccaatag ggactttcca ttgacgtcaa tgggtggagt atttacggta 360
aactgcccac ttggcagtac atcaagtgta tcatatgcca agtccgcccc ctattgacgt 420
caatgacggt aaatggcccg cctggcatta tgcccagtac atgaccttac gggactttcc 480
tacttggcag tacatctacg tattagtcat cgctattacc atggtgatgc ggttttggca 540
gtacaccaat gggcgtggat agcggtttga ctcacgggga tttccaagtc tccaccccat 600
tgacgtcaat gggagtttgt tttggcacca aaatcaacgg gactttccaa aatgtcgtaa 660
caactgcgat cgcccgcccc gttgacgcaa atgggcggta ggcgtgtacg gtgggaggtc 720
tatataagca gagctcgctg gcttgtaact cagtctctta ctaggagacc agcttgagcc 780
tgggtgttcg ctggttagcc taacctggtt ggccaccagg ggtaaggact ccttggctta 840
gaaagctaat aaacttgcct gcattagagc ttatctgagt caagtgtcct cattgacgcc 900
tcactctctt gaacgggaat cttccttact gggttctctc tctgacccag gcgagagaaa 960
ctccagcagt ggcgcccgaa cagggacttg agtgagagtg taggcacgta cagctgagaa 1020
ggcgtcggac gcgaaggaag cgcggggtgc gacgcgacca agaaggagac ttggtgagta 1080
ggcttctcga gtgccgggaa aaagctcgag cctagttaga ggactaggag aggccgtagc 1140
cgtaactact ctgggcaagt agggcaggcg gtgggtacgc aattgggggc ggctacctca 1200
gcactaaata ggagacaatt agaccaattt gagaaaatac gacttcgccc gaacggaaag 1260
aaaaagtacc aaattaaaca tttaatattg ggcaggcaag gagattggag cgcttcggcc 1320
tccatgagag gttgttggag acagaggagg ggtgtaaaag aatcatagaa gtcctctacc 1380
ccctagaacc aacaggatcg gagggcttaa aaagtctgtt caatcttgtg tgcgtgctat 1440
attgcttgca caaggaacag aaagtgaaag acacagagga agcagtagca acagtaagac 1500
aacactgcca tctagtggaa aaagaaaaaa gtgcaacaga gacatctagt ggacaaaaga 1560
aaaatgacaa gggaatagca gcgccacctg gtggcagtca gaattttcca gcgcaacaac 1620
aaggaaattg cctgggtaca tgtacccttg tcaccgcgca ccttaaatgc gtgggtaaaa 1680
gcagtagagg agaaaaaatt tggagcagaa atagtaccca tgtttcaagc cctatcgcct 1740
gcaggccgtt tgtgctaggg ttcttaggct tcttgggggc tgctggaact gcattgggag 1800
cagcggcgac agccctgacg gtccagtctc agcatttgct tgctgggata ctgcagcagc 1860
agaagaatct gctggcggct gtggaggctc aacagcagat gttgaagctg accatttggg 1920
gtgttaaaaa cctcaatgcc cgcgtcacag cccttgagaa gtacctagag gatcaggcac 1980
gactaaactc ctgggggtgc gcatggaaac aagtatgtca taccacagtg gagtggccct 2040
ggacaaatcg gactccggat tggcaaaata agacttggtt ggagtgggaa agacaaatag 2100
ctgatttgga aagcaacatt acgagacaat tagtgaaggc tagagaacaa gaggaaaaga 2160
atctagatgc ctatcagaag ttaactagtt ggtcagattt ctggtcttgg ttcgatttct 2220
caaaatggct taacatttta aaaaagggat ttttagtaat agtaggaata atagggttaa 2280
gattacttta cacagtatat ggatgtatag tgagggttag gcagggatat gttcctctat 2340
ctccacagat ccatataaag cggcaatttt aaaagaaagg gaggaatagg gggacagact 2400
tcagcagaga gactaattaa tataataaca acacaattag aaatacaaca tttacaaacc 2460
aaaattcaaa aaattttaaa ttttagagcc gcggagatct gttacataac ttatggtaaa 2520
tggcctgcct ggctgactgc ccaatgaccc ctgcccaatg atgtcaataa tgatgtatgt 2580
tcccatgtaa tgccaatagg gactttccat tgatgtcaat gggtggagta tttatggtaa 2640
ctgcccactt ggcagtacat caagtgtatc atatgccaag tatgccccct attgatgtca 2700
atgatggtaa atggcctgcc tggcattatg cccagtacat gaccttatgg gactttccta 2760
cttggcagta catctatgta ttagtcattg ctattaccat gggaattcac tagtggagaa 2820
gagcatgctt gagggctgag tgcccctcag tgggcagaga gcacatggcc cacagtccct 2880
gagaagttgg ggggaggggt gggcaattga actggtgcct agagaaggtg gggcttgggt 2940
aaactgggaa agtgatgtgg tgtactggct ccaccttttt ccccagggtg ggggagaacc 3000
atatataagt gcagtagtct ctgtgaacat tcaagcttct gccttctccc tcctgtgagt 3060
ttgctagcca ccatgcagag aagccctctg gagaaggcct ctgtggtgag caagctgttc 3120
ttcagctgga ccaggcccat cctgaggaag ggctacaggc agagactgga gctgtctgac 3180
atctaccaga tcccctctgt ggactctgct gacaacctgt ctgagaagct ggagagggag 3240
tgggatagag agctggccag caagaagaac cccaagctga tcaatgccct gaggagatgc 3300
ttcttctgga gattcatgtt ctatggcatc ttcctgtacc tgggggaagt gaccaaggct 3360
gtgcagcctc tgctgctggg cagaatcatt gccagctatg accctgacaa caaggaggag 3420
aggagcattg ccatctacct gggcattggc ctgtgcctgc tgttcattgt gaggaccctg 3480
ctgctgcacc ctgccatctt tggcctgcac cacattggca tgcagatgag gattgccatg 3540
ttcagcctga tctacaagaa aaccctgaag ctgtccagca gagtgctgga caagatcagc 3600
attggccagc tggtgagcct gctgagcaac aacctgaaca agtttgatga gggcctggcc 3660
ctggcccact ttgtgtggat tgcccctctg caggtggccc tgctgatggg cctgatttgg 3720
gagctgctgc aggcctctgc cttttgtggc ctgggcttcc tgattgtgct ggccctgttt 3780
caggctggcc tgggcaggat gatgatgaag tacagggacc agagggcagg caagatcagt 3840
gagaggctgg tgatcacctc tgagatgatt gagaacatcc agtctgtgaa ggcctactgt 3900
tgggaggaag ctatggagaa gatgattgaa aacctgaggc agacagagct gaagctgacc 3960
aggaaggctg cctatgtgag atacttcaac agctctgcct tcttcttctc tggcttcttt 4020
gtggtgttcc tgtctgtgct gccctatgcc ctgatcaagg ggatcatcct gagaaagatt 4080
ttcaccacca tcagcttctg cattgtgctg aggatggctg tgaccagaca gttcccctgg 4140
gctgtgcaga cctggtatga cagcctgggg gccatcaaca agatccagga cttcctgcag 4200
aagcaggagt acaagaccct ggagtacaac ctgaccacca cagaagtggt gatggagaat 4260
gtgacagcct tctgggagga gggctttggg gagctgtttg agaaggccaa gcagaacaac 4320
aacaacagaa agaccagcaa tggggatgac tccctgttct tctccaactt ctccctgctg 4380
ggcacacctg tgctgaagga catcaacttc aagattgaga gggggcagct gctggctgtg 4440
gctggatcta caggggctgg caagaccagc ctgctgatga tgatcatggg ggagctggag 4500
ccttctgagg gcaagatcaa gcactctggc aggatcagct tttgcagcca gttcagctgg 4560
atcatgcctg gcaccatcaa ggagaacatc atctttggag tgagctatga tgagtacaga 4620
tacaggagtg tgatcaaggc ctgccagctg gaggaggaca tcagcaagtt tgctgagaag 4680
gacaacattg tgctggggga gggaggcatt acactgtctg ggggccagag agccagaatc 4740
agcctggcca gggctgtgta caaggatgct gacctgtacc tgctggactc cccctttggc 4800
tacctggatg tgctgacaga gaaggagatt tttgagagct gtgtgtgcaa gctgatggcc 4860
aacaagacca gaatcctggt gaccagcaag atggagcacc tgaagaaggc tgacaagatc 4920
ctgatcctgc atgagggcag cagctacttc tatgggacct tctctgagct gcagaacctg 4980
cagcctgact tcagctctaa gctgatgggc tgtgacagct ttgaccagtt ctctgctgag 5040
aggaggaaca gcatcctgac agagaccctg cacagattca gcctggaggg agatgcccct 5100
gtgagctgga cagagaccaa gaagcagagc ttcaagcaga caggggagtt tggggagaag 5160
aggaagaact ccatcctgaa ccccatcaac agcatcagga agttcagcat tgtgcagaaa 5220
acccccctgc agatgaatgg cattgaggaa gattctgatg agcccctgga gaggagactg 5280
agcctggtgc ctgattctga gcagggagag gccatcctgc ctaggatctc tgtgatcagc 5340
acaggcccta cactgcaggc cagaaggagg cagtctgtgc tgaacctgat gacccactct 5400
gtgaaccagg gccagaacat ccacaggaaa accacagcct ccaccaggaa agtgagcctg 5460
gcccctcagg ccaatctgac agagctggac atctacagca ggaggctgtc tcaggagaca 5520
ggcctggaga tttctgagga gatcaatgag gaggacctga aagagtgctt ctttgatgac 5580
atggagagca tccctgctgt gaccacctgg aacacctacc tgagatacat cacagtgcac 5640
aagagcctga tctttgtgct gatctggtgc ctggtgatct tcctggctga agtggctgcc 5700
tctctggtgg tgctgtggct gctgggaaac accccactgc aggacaaggg caacagcacc 5760
cacagcagga acaacagcta tgctgtgatc atcacctcca cctccagcta ctatgtgttc 5820
tacatctatg tgggagtggc tgataccctg ctggctatgg gcttctttag aggcctgccc 5880
ctggtgcaca cactgatcac agtgagcaag atcctccacc acaagatgct gcactctgtg 5940
ctgcaggctc ctatgagcac cctgaatacc ctgaaggctg ggggcatcct gaacagattc 6000
tccaaggata ttgccatcct ggatgacctg ctgcctctca ccatctttga cttcatccag 6060
ctgctgctga ttgtgattgg ggccattgct gtggtggcag tgctgcagcc ctacatcttt 6120
gtggccacag tgcctgtgat tgtggccttc atcatgctga gggcctactt tctgcagacc 6180
tcccagcagc tgaagcagct ggagtctgag ggcagaagcc ccatcttcac ccacctggtg 6240
acaagcctga agggcctgtg gaccctgaga gcctttggca ggcagcccta ctttgagacc 6300
ctgttccaca aggccctgaa cctgcacaca gccaactggt tcctctacct gtccaccctg 6360
agatggttcc agatgagaat tgagatgatc tttgtcatct tcttcattgc tgtgaccttc 6420
atcagcattc tgaccacagg agagggagag ggcagagtgg gcattatcct gaccctggcc 6480
atgaacatca tgagcacact gcagtgggca gtgaacagca gcattgatgt ggacagcctg 6540
atgaggagtg tgagcagagt gttcaagttc attgatatgc ccacagaggg caagcctacc 6600
aagagcacca agccctacaa gaatggccag ctgagcaaag tgatgatcat tgagaacagc 6660
catgtgaaga aggatgatat ctggcccagt ggaggccaga tgacagtgaa ggacctgaca 6720
gccaagtaca cagagggggg caatgctatc ctggagaaca tctccttcag catctcccct 6780
ggccagagag tgggactgct gggaagaaca ggctctggca agtctaccct gctgtctgcc 6840
ttcctgaggc tgctgaacac agagggagag atccagattg atggagtgtc ctgggacagc 6900
atcacactgc agcagtggag gaaggccttt ggtgtgatcc cccagaaagt gttcatcttc 6960
agtggcacct tcaggaagaa cctggacccc tatgagcagt ggtctgacca ggagatttgg 7020
aaagtggctg atgaagtggg cctgagaagt gtgattgagc agttccctgg caagctggac 7080
tttgtcctgg tggatggggg ctgtgtgctg agccatggcc acaagcagct gatgtgcctg 7140
gccagatcag tgctgagcaa ggccaagatc ctgctgctgg atgagccttc tgcccacctg 7200
gatcctgtga cctaccagat catcaggagg accctcaagc aggcctttgc tgactgcaca 7260
gtcatcctgt gtgagcacag gattgaggcc atgctggagt gccagcagtt cctggtgatt 7320
gaggagaaca aagtgaggca gtatgacagc atccagaagc tgctgaatga gaggagcctg 7380
ttcaggcagg ccatcagccc ctctgataga gtgaagctgt tcccccacag gaacagctcc 7440
aagtgcaaga gcaagcccca gattgctgcc ctgaaggagg agacagagga ggaagtgcag 7500
gacaccaggc tgtgagggcc caatcaacct ctggattaca aaatttgtga aagattgact 7560
ggtattctta actatgttgc tccttttacg ctatgtggat acgctgcttt aatgcctttg 7620
tatcatgcta ttgcttcccg tatggctttc attttctcct ccttgtataa atcctggttg 7680
ctgtctcttt atgaggagtt gtggcccgtt gtcaggcaac gtggcgtggt gtgcactgtg 7740
tttgctgacg caacccccac tggttggggc attgccacca cctgtcagct cctttccggg 7800
actttcgctt tccccctccc tattgccacg gcggaactca tcgccgcctg ccttgcccgc 7860
tgctggacag gggctcggct gttgggcact gacaattccg tggtgttgtc ggggaaatca 7920
tcgtcctttc cttggctgct cgcctgtgtt gccacctgga ttctgcgcgg gacgtccttc 7980
tgctacgtcc cttcggccct caatccagcg gaccttcctt cccgcggcct gctgccggct 8040
ctgcggcctc ttccgcgtct tcgccttcgc cctcagacga gtcggatctc cctttgggcc 8100
gcctccccgc aagcttcgca ctttttaaaa gaaaagggag gactggatgg gatttattac 8160
tccgatagga cgctggcttg taactcagtc tcttactagg agaccagctt gagcctgggt 8220
gttcgctggt tagcctaacc tggttggcca ccaggggtaa ggactccttg gcttagaaag 8280
ctaataaact tgcctgcatt agagctctta cgcgtcccgg gctcgagatc cgcatctcaa 8340
ttagtcagca accatagtcc cgcccctaac tccgcccatc ccgcccctaa ctccgcccag 8400
ttccgcccat tctccgcccc atggctgact aatttttttt atttatgcag aggccgaggc 8460
cgcctcggcc tctgagctat tccagaagta gtgaggaggc ttttttggag gcctaggctt 8520
ttgcaaaaag ctaacttgtt tattgcagct tataatggtt acaaataaag caatagcatc 8580
acaaatttca caaataaagc atttttttca ctgcattcta gttgtggttt gtccaaactc 8640
atcaatgtat cttatcatgt ctgtccgctt cctcgctcac tgactcgctg cgctcggtcg 8700
ttcggctgcg gcgagcggta tcagctcact caaaggcggt aatacggtta tccacagaat 8760
caggggataa cgcaggaaag aacatgtgag caaaaggcca gcaaaaggcc aggaaccgta 8820
aaaaggccgc gttgctggcg tttttccata ggctccgccc ccctgacgag catcacaaaa 8880
atcgacgctc aagtcagagg tggcgaaacc cgacaggact ataaagatac caggcgtttc 8940
cccctggaag ctccctcgtg cgctctcctg ttccgaccct gccgcttacc ggatacctgt 9000
ccgcctttct cccttcggga agcgtggcgc tttctcatag ctcacgctgt aggtatctca 9060
gttcggtgta ggtcgttcgc tccaagctgg gctgtgtgca cgaacccccc gttcagcccg 9120
accgctgcgc cttatccggt aactatcgtc ttgagtccaa cccggtaaga cacgacttat 9180
cgccactggc agcagccact ggtaacagga ttagcagagc gaggtatgta ggcggtgcta 9240
cagagttctt gaagtggtgg cctaactacg gctacactag aagaacagta tttggtatct 9300
gcgctctgct gaagccagtt accttcggaa aaagagttgg tagctcttga tccggcaaac 9360
aaaccaccgc tggtagcggt ggtttttttg tttgcaagca gcagattacg cgcagaaaaa 9420
aaggatctca agaagatcct ttgatctttt ctacggggtc tgacgctcag tggaacgaaa 9480
actcacgtta agggattttg gtcatgagat tatcaaaaag gatcttcacc tagatccttt 9540
taaattaaaa atgaagtttt aaatcaatct aaagtatata tgagtaaact tggtctgaca 9600
gttagaaaaa ctcatcgagc atcaaatgaa actgcaattt attcatatca ggattatcaa 9660
taccatattt ttgaaaaagc cgtttctgta atgaaggaga aaactcaccg aggcagttcc 9720
ataggatggc aagatcctgg tatcggtctg cgattccgac tcgtccaaca tcaatacaac 9780
ctattaattt cccctcgtca aaaataaggt tatcaagtga gaaatcacca tgagtgacga 9840
ctgaatccgg tgagaatggc aacagcttat gcatttcttt ccagacttgt tcaacaggcc 9900
agccattacg ctcgtcatca aaatcactcg catcaaccaa accgttattc attcgtgatt 9960
gcgcctgagc gagacgaaat acgcgatcgc tgttaaaagg acaattacaa acaggaatcg 10020
aatgcaaccg gcgcaggaac actgccagcg catcaacaat attttcacct gaatcaggat 10080
attcttctaa tacctggaat gctgtttttc cggggatcgc agtggtgagt aaccatgcat 10140
catcaggagt acggataaaa tgcttgatgg tcggaagagg cataaattcc gtcagccagt 10200
ttagtctgac catctcatct gtaacatcat tggcaacgct acctttgcca tgtttcagaa 10260
acaactctgg cgcatcgggc ttcccataca atcgatagat tgtcgcacct gattgcccga 10320
cattatcgcg agcccattta tacccatata aatcagcatc catgttggaa tttaatcgcg 10380
gcctagagca agacgtttcc cgttgaatat ggctcataac accccttgta ttactgttta 10440
tgtaagcaga cagttttatt gttcatgatg atatattttt atcttgtgca atgtaacatc 10500
agagattttg agacacaaca attggtcgac ggatcc 10536
<210> 5
<211> 9064
<212> DNA
<213> Artificial Sequence
<220>
<223> pGM691
<400> 5
attgattatt gactagttat taatagtaat caattacggg gtcattagtt catagcccat 60
atatggagtt ccgcgttaca taacttacgg taaatggccc gcctggctga ccgcccaacg 120
acccccgccc attgacgtca ataatgacgt atgttcccat agtaacgcca atagggactt 180
tccattgacg tcaatgggtg gagtatttac ggtaaactgc ccacttggca gtacatcaag 240
tgtatcatat gccaagtacg ccccctattg acgtcaatga cggtaaatgg cccgcctggc 300
attatgccca gtacatgacc ttatgggact ttcctacttg gcagtacatc tacgtattag 360
tcatcgctat taccatggtc gaggtgagcc ccacgttctg cttcactctc cccatctccc 420
ccccctcccc acccccaatt ttgtatttat ttatttttta attattttgt gcagcgatgg 480
gggcgggggg gggggggggg cgcgcgccag gcggggcggg gcggggcgag gggcggggcg 540
gggcgaggcg gagaggtgcg gcggcagcca atcagagcgg cgcgctccga aagtttcctt 600
ttatggcgag gcggcggcgg cggcggccct ataaaaagcg aagcgcgcgg cgggcgggag 660
tcgctgcgcg ctgccttcgc cccgtgcccc gctccgccgc cgcctcgcgc cgcccgcccc 720
ggctctgact gaccgcgtta ctcccacagg tgagcgggcg ggacggccct tctcctccgg 780
gctgtaatta gcgcttggtt taatgacggc ttgtttcttt tctgtggctg cgtgaaagcc 840
ttgaggggct ccgggagggc cctttgtgcg gggggagcgg ctcggggggt gcgtgcgtgt 900
gtgtgtgcgt ggggagcgcc gcgtgcggct ccgcgctgcc cggcggctgt gagcgctgcg 960
ggcgcggcgc ggggctttgt gcgctccgca gtgtgcgcga ggggagcgcg gccgggggcg 1020
gtgccccgcg gtgcgggggg ggctgcgagg ggaacaaagg ctgcgtgcgg ggtgtgtgcg 1080
tgggggggtg agcagggggt gtgggcgcgt cggtcgggct gcaacccccc ctgcaccccc 1140
ctccccgagt tgctgagcac ggcccggctt cgggtgcggg gctccgtacg gggcgtggcg 1200
cggggctcgc cgtgccgggc ggggggtggc ggcaggtggg ggtgccgggc ggggcggggc 1260
cgcctcgggc cggggagggc tcgggggagg ggcgcggcgg cccccggagc gccggcggct 1320
gtcgaggcgc ggcgagccgc agccattgcc ttttatggta atcgtgcgag agggcgcagg 1380
gacttccttt gtcccaaatc tgtgcggagc cgaaatctgg gaggcgccgc cgcaccccct 1440
ctagcgggcg cggggcgaag cggtgcggcg ccggcaggaa ggaaatgggc ggggagggcc 1500
ttcgtgcgtc gccgcgccgc cgtccccttc tccctctcca gcctcggggc tgtccgcggg 1560
gggacggctg ccttcggggg ggacggggca gggcggggtt cggcttctgg cgtgtgaccg 1620
gcggctctag agcctctgct aaccatgttc atgccttctt ctttttccta cagctcctgg 1680
gcaacgtgct ggttattgtg ctgtctcatc attttggcaa agaattgctc gagccaccat 1740
gggagctgcc acatctgccc tgaatagacg gcagctggac cagttcgaga agatcagact 1800
gcggcccaac ggcaagaaga agtaccagat caagcacctg atctgggccg gcaaagagat 1860
ggaaagattc ggcctgcacg agcggctgct ggaaaccgag gaaggctgca agagaattat 1920
cgaggtgctg taccctctgg aacctaccgg ctctgagggc ctgaagtccc tgttcaatct 1980
cgtgtgcgtg ctgtactgcc tgcacaaaga acagaaagtg aaggacaccg aagaggccgt 2040
ggccacagtt agacagcact gccacctggt ggaaaaagag aagtccgcca cagagacaag 2100
cagcggccag aagaagaacg acaagggaat tgctgcccct cctggcggca gccagaattt 2160
tcctgctcag cagcagggaa acgcctgggt gcacgttcca ctgagcccta gaacactgaa 2220
tgcctgggtc aaagccgtgg aagagaagaa gtttggcgcc gagatcgtgc ccatgttcca 2280
ggctctgtct gagggctgca ccccttacga catcaaccag atgctgaacg tgctgggaga 2340
tcaccagggc gctctgcaga tcgtgaaaga gatcatcaac gaagaggctg cccagtggga 2400
cgtgacacat ccattgcctg ctggacctct gccagccgga caactgagag atcctagagg 2460
ctctgatatc gccggcacca ccagctctgt gcaagagcag ctggaatgga tctacaccgc 2520
caatcctaga gtggacgtgg gcgccatcta cagaagatgg atcatcctgg gcctgcagaa 2580
atgcgtgaag atgtacaacc ccgtgtccgt gctggacatc agacagggac ccaaagagcc 2640
cttcaaggac tacgtggacc ggttctataa ggccattaga gccgagcagg ccagcggcga 2700
agtgaagcag tggatgacag agagcctgct gatccagaac gccaatccag actgcaaagt 2760
gatcctgaaa ggcctgggca tgcaccccac actggaagag atgctgacag cctgtcaagg 2820
cgttggcggc ccttcttaca aagccaaagt gatggccgag atgatgcaga ccatgcagaa 2880
ccagaacatg gtgcagcaag gcggccctaa gagacagagg cctcctctga gatgctacaa 2940
ctgcggcaag ttcggccaca tgcagagaca gtgtcctgag cctaggaaaa caaaatgtct 3000
aaagtgtgga aaattgggac acctagcaaa agactgcagg ggacaggtga attttttagg 3060
gtatggacgg tggatggggg caaaaccgag aaattttccc gccgctactc ttggagcgga 3120
accgagtgcg cctcctccac cgagcggcac caccccatac gacccagcaa agaagctcct 3180
gcagcaatat gcagagaaag ggaaacaact gagggagcaa aagaggaatc caccggcaat 3240
gaatccggat tggaccgagg gatattcttt gaactccctc tttggagaag accaataaag 3300
accgtgtaca tcgagggcgt gcccatcaag gctctgctgg atacaggcgc cgacgacacc 3360
atcatcaaag agaacgacct gcagctgagc ggcccttgga ggcctaagat cattggagga 3420
atcggcggag gcctgaacgt caaagagtac aacgaccggg aagtgaagat cgaggacaag 3480
atcctgaggg gcacaatcct gctgggcgcc acacctatca acatcatcgg cagaaatctg 3540
ctggcccctg ccggcgctag actggttatg ggacagctct ctgagaagat ccccgtgaca 3600
cccgtgaagc tgaaagaagg cgctagagga ccttgtgtgc gacagtggcc tctgagcaaa 3660
gagaagattg aggccctgca agaaatctgt agccagctgg aacaagaggg caagatcagc 3720
agagttggcg gcgagaacgc ctacaatacc cctatcttct gcatcaagaa aaaggacaag 3780
agccagtggc ggatgctggt ggactttaga gagctgaaca aggctaccca ggacttcttc 3840
gaggtgcagc tgggaattcc tcatcctgcc ggcctgcgga agatgagaca gatcacagtg 3900
ctggatgtgg gcgacgccta ctacagcatc cctctggacc ccaacttcag aaagtacacc 3960
gccttcacaa tccccaccgt gaacaatcaa ggccctggca tcagatacca gttcaactgc 4020
ctgcctcaag gctggaaggg cagccccacc atttttcaga ataccgccgc cagcatcctg 4080
gaagaaatca agagaaacct gcctgctctg accatcgtgc agtacatgga cgatctgtgg 4140
gtcggaagcc aagagaatga gcacacccac gacaagctgg tggaacagct gagaacaaag 4200
ctgcaggcct ggggcctcga aacccctgag aagaaggtgc agaaagaacc tccttacgag 4260
tggatgggct acaagctgtg gcctcacaag tgggagctga gccggattca gctcgaagag 4320
aaggacgagt ggaccgtgaa cgacatccag aaactcgtgg gcaagctgaa ttgggcagcc 4380
cagctgtatc ccggcctgag gaccaagaac atctgcaagc tgatccgggg aaagaagaac 4440
ctgctggaac tggtcacatg gacacctgag gccgaggccg aatatgccga gaatgccgaa 4500
atcctgaaaa ccgagcaaga ggggacctac tacaagcctg gcattccaat cagagctgcc 4560
gtgcagaaac tggaaggcgg ccagtggtcc taccagttta agcaagaagg ccaggtcctg 4620
aaagtgggca agtacaccaa gcagaagaac acccacacca acgagctgag gacactggct 4680
ggcctggtcc agaaaatctg caaagaggcc ctggtcattt ggggcatcct gcctgttctg 4740
gaactgccca ttgagcggga agtgtgggaa cagtggtggg ccgattactg gcaagtgtct 4800
tggatccccg agtgggactt cgtgtctacc cctcctctgc tgaaactgtg gtacaccctg 4860
acaaaagagc ccattcctaa agaggacgtc tactacgttg acggcgcctg caaccggaac 4920
tccaaagaag gcaaggccgg ctacatcagc cagtacggca agcagagagt ggaaaccctg 4980
gaaaacacca ccaaccagca ggccgagctg accgccatta agatggccct ggaagatagc 5040
ggccccaatg tgaacatcgt gaccgactct cagtacgcca tgggaatcct gacagcccag 5100
cctacacaga gcgatagccc tctggttgag cagatcattg ccctgatgat tcagaagcag 5160
caaatctacc tgcagtgggt gcccgctcac aaaggcatcg gcggaaacga agagatcgat 5220
aagctggtgt ccaagggaat cagacgggtg ctgttcctgg aaaagattga agaggcccaa 5280
gaggaacacg agcgctacca caacaactgg aagaatctgg ccgacaccta cggactgccc 5340
cagatcgtgg ccaaagaaat cgtggctatg tgccccaagt gtcagatcaa gggcgaacct 5400
gtgcacggcc aagtggatgc ttctcctggc acatggcaga tggactgtac ccacctggaa 5460
ggcaaagtgg tcatcgtggc tgtgcacgtg gcctccggct ttattgaggc cgaagtgatc 5520
cccagagaga caggcaaaga aaccgccaag ttcctgctga agatcctgtc cagatggccc 5580
atcacacagc tgcacaccga caacggccct aacttcacat ctcaagaggt ggccgccatc 5640
tgttggtggg gaaagattga gcacacaacc ggcattccct acaatccaca gagccagggc 5700
agcatcgagt ccatgaacaa gcagctcaaa gagattatcg gcaagatccg ggacgactgc 5760
cagtacacag aaacagccgt gctgatggcc tgtcacatcc acaacttcaa gcggaaaggc 5820
ggcatcggag gacagacatc tgccgagaga ctgatcaata tcatcaccac tcagctggaa 5880
atccagcacc tccagaccaa gatccagaag attctgaact tccgggtgta ctaccgcgag 5940
ggcagagatc ctgtttggaa aggcccagca cagctgatct ggaaaggcga aggtgccgtg 6000
gtgctgaagg atggctctga tctgaaggtg gtgcccagac ggaaggccaa gattatcaag 6060
gattacgagc ccaaacagcg cgtgggcaat gaaggcgacg ttgagggcac aagaggcagc 6120
gacaattgaa attcactcct caggtgcagg ctgcctatca gaaggtggtg gctggtgtgg 6180
ccaatgccct ggctcacaaa taccactgag atctttttcc ctctgccaaa aattatgggg 6240
acatcatgaa gccccttgag catctgactt ctggctaata aaggaaattt attttcattg 6300
caatagtgtg ttggaatttt ttgtgtctct cactcggaag gacatatggg agggcaaatc 6360
atttaaaaca tcagaatgag tatttggttt agagtttggc aacatatgcc catatgctgg 6420
ctgccatgaa caaaggttgg ctataaagag gtcatcagta tatgaaacag ccccctgctg 6480
tccattcctt attccataga aaagccttga cttgaggtta gatttttttt atattttgtt 6540
ttgtgttatt tttttcttta acatccctaa aattttcctt acatgtttta ctagccagat 6600
ttttcctcct ctcctgacta ctcccagtca tagctgtccc tcttctctta tggagatccc 6660
tcgacctgca gcccaagctt ggcgtaatca tggtcatagc tgtttcctgt gtgaaattgt 6720
tatccgctca caattccaca caacatacga gccggaagca taaagtgtaa agcctggggt 6780
gcctaatgag tgagctaact cacattaatt gcgttgcgct cactgcccgc tttccagtcg 6840
ggaaacctgt cgtgccagcg gatccgcatc tcaattagtc agcaaccata gtcccgcccc 6900
taactccgcc catcccgccc ctaactccgc ccagttccgc ccattctccg ccccatggct 6960
gactaatttt ttttatttat gcagaggccg aggccgcctc ggcctctgag ctattccaga 7020
agtagtgagg aggctttttt ggaggcctag gcttttgcaa aaagctaact tgtttattgc 7080
agcttataat ggttacaaat aaagcaatag catcacaaat ttcacaaata aagcattttt 7140
ttcactgcat tctagttgtg gtttgtccaa actcatcaat gtatcttatc atgtctgtcc 7200
gcttcctcgc tcactgactc gctgcgctcg gtcgttcggc tgcggcgagc ggtatcagct 7260
cactcaaagg cggtaatacg gttatccaca gaatcagggg ataacgcagg aaagaacatg 7320
tgagcaaaag gccagcaaaa ggccaggaac cgtaaaaagg ccgcgttgct ggcgtttttc 7380
cataggctcc gcccccctga cgagcatcac aaaaatcgac gctcaagtca gaggtggcga 7440
aacccgacag gactataaag ataccaggcg tttccccctg gaagctccct cgtgcgctct 7500
cctgttccga ccctgccgct taccggatac ctgtccgcct ttctcccttc gggaagcgtg 7560
gcgctttctc atagctcacg ctgtaggtat ctcagttcgg tgtaggtcgt tcgctccaag 7620
ctgggctgtg tgcacgaacc ccccgttcag cccgaccgct gcgccttatc cggtaactat 7680
cgtcttgagt ccaacccggt aagacacgac ttatcgccac tggcagcagc cactggtaac 7740
aggattagca gagcgaggta tgtaggcggt gctacagagt tcttgaagtg gtggcctaac 7800
tacggctaca ctagaagaac agtatttggt atctgcgctc tgctgaagcc agttaccttc 7860
ggaaaaagag ttggtagctc ttgatccggc aaacaaacca ccgctggtag cggtggtttt 7920
tttgtttgca agcagcagat tacgcgcaga aaaaaaggat ctcaagaaga tcctttgatc 7980
ttttctacgg ggtctgacgc tcagtggaac gaaaactcac gttaagggat tttggtcatg 8040
agattatcaa aaaggatctt cacctagatc cttttaaatt aaaaatgaag ttttaaatca 8100
atctaaagta tatatgagta aacttggtct gacagttaga aaaactcatc gagcatcaaa 8160
tgaaactgca atttattcat atcaggatta tcaataccat atttttgaaa aagccgtttc 8220
tgtaatgaag gagaaaactc accgaggcag ttccatagga tggcaagatc ctggtatcgg 8280
tctgcgattc cgactcgtcc aacatcaata caacctatta atttcccctc gtcaaaaata 8340
aggttatcaa gtgagaaatc accatgagtg acgactgaat ccggtgagaa tggcaacagc 8400
ttatgcattt ctttccagac ttgttcaaca ggccagccat tacgctcgtc atcaaaatca 8460
ctcgcatcaa ccaaaccgtt attcattcgt gattgcgcct gagcgagacg aaatacgcga 8520
tcgctgttaa aaggacaatt acaaacagga atcgaatgca accggcgcag gaacactgcc 8580
agcgcatcaa caatattttc acctgaatca ggatattctt ctaatacctg gaatgctgtt 8640
tttccgggga tcgcagtggt gagtaaccat gcatcatcag gagtacggat aaaatgcttg 8700
atggtcggaa gaggcataaa ttccgtcagc cagtttagtc tgaccatctc atctgtaaca 8760
tcattggcaa cgctaccttt gccatgtttc agaaacaact ctggcgcatc gggcttccca 8820
tacaatcgat agattgtcgc acctgattgc ccgacattat cgcgagccca tttataccca 8880
tataaatcag catccatgtt ggaatttaat cgcggcctag agcaagacgt ttcccgttga 8940
atatggctca taacacccct tgtattactg tttatgtaag cagacagttt tattgttcat 9000
gatgatatat ttttatcttg tgcaatgtaa catcagagat tttgagacac aacaattggt 9060
cgac 9064
<210> 6
<211> 3384
<212> DNA
<213> Artificial Sequence
<220>
<223> pGM299
<400> 6
tcaatattgg ccattagcca tattattcat tggttatata gcataaatca atattggcta 60
ttggccattg catacgttgt atctatatca taatatgtac atttatattg gctcatgtcc 120
aatatgaccg ccatgttggc attgattatt gactagttat taatagtaat caattacggg 180
gtcattagtt catagcccat atatggagtt ccgcgttaca taacttacgg taaatggccc 240
gcctggctga ccgcccaacg acccccgccc attgacgtca ataatgacgt atgttcccat 300
agtaacgcca atagggactt tccattgacg tcaatgggtg gagtatttac ggtaaactgc 360
ccacttggca gtacatcaag tgtatcatat gccaagtccg ccccctattg acgtcaatga 420
cggtaaatgg cccgcctggc attatgccca gtacatgacc ttacgggact ttcctacttg 480
gcagtacatc tacgtattag tcatcgctat taccatggtg atgcggtttt ggcagtacac 540
caatgggcgt ggatagcggt ttgactcacg gggatttcca agtctccacc ccattgacgt 600
caatgggagt ttgttttggc accaaaatca acgggacttt ccaaaatgtc gtaataaccc 660
cgccccgttg acgcaaatgg gcggtaggcg tgtacggtgg gaggtctata taagcagagc 720
tcgtttagtg aaccgtcaga tcactagaag ctttattgcg gtagtttatc acagttaaat 780
tgctaacgca gtcagtgctt ctgacacaac agtctcgaac ttaagctgca gaagttggtc 840
gtgaggcact gggcaggtaa gtatcaaggt tacaagacag gtttaaggag accaatagaa 900
actgggcttg tcgagacaga gaagactctt gcgtttctga taggcaccta ttggtcttac 960
tgacatccac tttgcctttc tctccacagg tgtccactcc cagttcaatt acagctctta 1020
aggctagagt acttaatacg actcactata ggctagcctc gagaattcga ttatgcccct 1080
aggaccagaa gaaagaagat tgcttcgctt gatttggctc ctttacagca ccaatccata 1140
tccaccaagt ggggaaggga cggccagaca acgccgacga gccaggagaa ggtggagaca 1200
acagcaggat caaattagag tcttggtaga aagactccaa gagcaggtgt atgcagttga 1260
ccgcctggct gacgaggctc aacacttggc tatacaacag ttgcctgacc ctcctcattc 1320
agcttagaat cactagtgaa ttcacgcgtg gtacctctag agtcgacccg ggcggccgct 1380
tcgagcagac atgataagat acattgatga gtttggacaa accacaacta gaatgcagtg 1440
aaaaaaatgc tttatttgtg aaatttgtga tgctattgct ttatttgtaa ccattataag 1500
ctgcaataaa caagttaaca acaacaattg cattcatttt atgtttcagg ttcaggggga 1560
gatgtgggag gttttttaaa gcaagtaaaa cctctacaaa tgtggtaaaa tcgataagga 1620
tccgtcgacc aattgttgtg tctcaaaatc tctgatgtta cattgcacaa gataaaaata 1680
tatcatcatg aacaataaaa ctgtctgctt acataaacag taatacaagg ggtgttatga 1740
gccatattca acgggaaacg tcttgctcta ggccgcgatt aaattccaac atggatgctg 1800
atttatatgg gtataaatgg gctcgcgata atgtcgggca atcaggtgcg acaatctatc 1860
gattgtatgg gaagcccgat gcgccagagt tgtttctgaa acatggcaaa ggtagcgttg 1920
ccaatgatgt tacagatgag atggtcagac taaactggct gacggaattt atgcctcttc 1980
cgaccatcaa gcattttatc cgtactcctg atgatgcatg gttactcacc actgcgatcc 2040
ccggaaaaac agcattccag gtattagaag aatatcctga ttcaggtgaa aatattgttg 2100
atgcgctggc agtgttcctg cgccggttgc attcgattcc tgtttgtaat tgtcctttta 2160
acagcgatcg cgtatttcgt ctcgctcagg cgcaatcacg aatgaataac ggtttggttg 2220
atgcgagtga ttttgatgac gagcgtaatg gctggcctgt tgaacaagtc tggaaagaaa 2280
tgcataagct gttgccattc tcaccggatt cagtcgtcac tcatggtgat ttctcacttg 2340
ataaccttat ttttgacgag gggaaattaa taggttgtat tgatgttgga cgagtcggaa 2400
tcgcagaccg ataccaggat cttgccatcc tatggaactg cctcggtgag ttttctcctt 2460
cattacagaa acggcttttt caaaaatatg gtattgataa tcctgatatg aataaattgc 2520
agtttcattt gatgctcgat gagtttttct aactgtcaga ccaagtttac tcatatatac 2580
tttagattga tttaaaactt catttttaat ttaaaaggat ctaggtgaag atcctttttg 2640
ataatctcat gaccaaaatc ccttaacgtg agttttcgtt ccactgagcg tcagaccccg 2700
tagaaaagat caaaggatct tcttgagatc ctttttttct gcgcgtaatc tgctgcttgc 2760
aaacaaaaaa accaccgcta ccagcggtgg tttgtttgcc ggatcaagag ctaccaactc 2820
tttttccgaa ggtaactggc ttcagcagag cgcagatacc aaatactgtt cttctagtgt 2880
agccgtagtt aggccaccac ttcaagaact ctgtagcacc gcctacatac ctcgctctgc 2940
taatcctgtt accagtggct gctgccagtg gcgataagtc gtgtcttacc gggttggact 3000
caagacgata gttaccggat aaggcgcagc ggtcgggctg aacggggggt tcgtgcacac 3060
agcccagctt ggagcgaacg acctacaccg aactgagata cctacagcgt gagctatgag 3120
aaagcgccac gcttcccgaa gggagaaagg cggacaggta tccggtaagc ggcagggtcg 3180
gaacaggaga gcgcacgagg gagcttccag ggggaaacgc ctggtatctt tatagtcctg 3240
tcgggtttcg ccacctctga cttgagcgtc gatttttgtg atgctcgtca ggggggcgga 3300
gcctatggaa aaacgccagc aacgcggcct ttttacggtt cctggccttt tgctggcctt 3360
ttgctcacat ggctcgacag atct 3384
<210> 7
<211> 6264
<212> DNA
<213> Artificial Sequence
<220>
<223> pGM301
<400> 7
attgattatt gactagttat taatagtaat caattacggg gtcattagtt catagcccat 60
atatggagtt ccgcgttaca taacttacgg taaatggccc gcctggctga ccgcccaacg 120
acccccgccc attgacgtca ataatgacgt atgttcccat agtaacgcca atagggactt 180
tccattgacg tcaatgggtg gagtatttac ggtaaactgc ccacttggca gtacatcaag 240
tgtatcatat gccaagtacg ccccctattg acgtcaatga cggtaaatgg cccgcctggc 300
attatgccca gtacatgacc ttatgggact ttcctacttg gcagtacatc tacgtattag 360
tcatcgctat taccatggtc gaggtgagcc ccacgttctg cttcactctc cccatctccc 420
ccccctcccc acccccaatt ttgtatttat ttatttttta attattttgt gcagcgatgg 480
gggcgggggg gggggggggg cgcgcgccag gcggggcggg gcggggcgag gggcggggcg 540
gggcgaggcg gagaggtgcg gcggcagcca atcagagcgg cgcgctccga aagtttcctt 600
ttatggcgag gcggcggcgg cggcggccct ataaaaagcg aagcgcgcgg cgggcgggag 660
tcgctgcgcg ctgccttcgc cccgtgcccc gctccgccgc cgcctcgcgc cgcccgcccc 720
ggctctgact gaccgcgtta ctcccacagg tgagcgggcg ggacggccct tctcctccgg 780
gctgtaatta gcgcttggtt taatgacggc ttgtttcttt tctgtggctg cgtgaaagcc 840
ttgaggggct ccgggagggc cctttgtgcg gggggagcgg ctcggggggt gcgtgcgtgt 900
gtgtgtgcgt ggggagcgcc gcgtgcggct ccgcgctgcc cggcggctgt gagcgctgcg 960
ggcgcggcgc ggggctttgt gcgctccgca gtgtgcgcga ggggagcgcg gccgggggcg 1020
gtgccccgcg gtgcgggggg ggctgcgagg ggaacaaagg ctgcgtgcgg ggtgtgtgcg 1080
tgggggggtg agcagggggt gtgggcgcgt cggtcgggct gcaacccccc ctgcaccccc 1140
ctccccgagt tgctgagcac ggcccggctt cgggtgcggg gctccgtacg gggcgtggcg 1200
cggggctcgc cgtgccgggc ggggggtggc ggcaggtggg ggtgccgggc ggggcggggc 1260
cgcctcgggc cggggagggc tcgggggagg ggcgcggcgg cccccggagc gccggcggct 1320
gtcgaggcgc ggcgagccgc agccattgcc ttttatggta atcgtgcgag agggcgcagg 1380
gacttccttt gtcccaaatc tgtgcggagc cgaaatctgg gaggcgccgc cgcaccccct 1440
ctagcgggcg cggggcgaag cggtgcggcg ccggcaggaa ggaaatgggc ggggagggcc 1500
ttcgtgcgtc gccgcgccgc cgtccccttc tccctctcca gcctcggggc tgtccgcggg 1560
gggacggctg ccttcggggg ggacggggca gggcggggtt cggcttctgg cgtgtgaccg 1620
gcggctctag agcctctgct aaccatgttc atgccttctt ctttttccta cagctcctgg 1680
gcaacgtgct ggttattgtg ctgtctcatc attttggcaa agaattcgat tgccatggca 1740
acatatatcc agagagtaca gtgcatctca acatcactac tggttgttct caccacattg 1800
gtctcgtgtc agattcccag ggataggctc tctaacatag gggtcatagt cgatgaaggg 1860
aaatcactga agatagctgg atcccacgaa tcgaggtaca tagtactgag tctagttccg 1920
ggggtagact ttgagaatgg gtgcggaaca gcccaggtta tccagtacaa gagcctactg 1980
aacaggctgt taatcccatt gagggatgcc ttagatcttc aggaggctct gataactgtc 2040
accaatgata cgacacaaaa tgccggtgct ccccagtcga gattcttcgg tgctgtgatt 2100
ggtactatcg cacttggagt ggcgacatca gcacaaatca ccgcagggat tgcactagcc 2160
gaagcgaggg aggccaaaag agacatagcg ctcatcaaag aatcgatgac aaaaacacac 2220
aagtctatag aactgctgca aaacgctgtg ggggaacaaa ttcttgctct aaagacactc 2280
caggatttcg tgaatgatga gatcaaaccc gcaataagcg aattaggctg tgagactgct 2340
gccttaagac tgggtataaa attgacacag cattactccg agctgttaac tgcgttcggc 2400
tcgaatttcg gaaccatcgg agagaagagc ctcacgctgc aggcgctgtc ttcactttac 2460
tctgctaaca ttactgagat tatgaccaca atcaggacag ggcagtctaa catctatgat 2520
gtcatttata cagaacagat caaaggaacg gtgatagatg tggatctaga gagatacatg 2580
gtcaccctgt ctgtgaagat ccctattctt tctgaagtcc caggtgtgct catacacaag 2640
gcatcatcta tttcttacaa catagacggg gaggaatggt atgtgactgt ccccagccat 2700
atactcagtc gtgcttcttt cttagggggt gcagacataa ccgattgtgt tgagtccaga 2760
ttgacctata tatgccccag ggatcccgca caactgatac ctgacagcca gcaaaagtgt 2820
atcctggggg acacaacaag gtgtcctgtc acaaaagttg tggacagcct tatccccaag 2880
tttgcttttg tgaatggggg cgttgttgct aactgcatag catccacatg tacctgcggg 2940
acaggccgaa gaccaatcag tcaggatcgc tctaaaggtg tagtattcct aacccatgac 3000
aactgtggtc ttataggtgt caatggggta gaattgtatg ctaaccggag agggcacgat 3060
gccacttggg gggtccagaa cttgacagtc ggtcctgcaa ttgctatcag acccgttgat 3120
atttctctca accttgctga tgctacgaat ttcttgcaag actctaaggc tgagcttgag 3180
aaagcacgga aaatcctctc ggaggtaggt agatggtaca actcaagaga gactgtgatt 3240
acgatcatag tagttatggt cgtaatattg gtggtcatta tagtgatcat catcgtgctt 3300
tatagactca gaaggtgaaa tcactagtga attcactcct caggtgcagg ctgcctatca 3360
gaaggtggtg gctggtgtgg ccaatgccct ggctcacaaa taccactgag atctttttcc 3420
ctctgccaaa aattatgggg acatcatgaa gccccttgag catctgactt ctggctaata 3480
aaggaaattt attttcattg caatagtgtg ttggaatttt ttgtgtctct cactcggaag 3540
gacatatggg agggcaaatc atttaaaaca tcagaatgag tatttggttt agagtttggc 3600
aacatatgcc catatgctgg ctgccatgaa caaaggttgg ctataaagag gtcatcagta 3660
tatgaaacag ccccctgctg tccattcctt attccataga aaagccttga cttgaggtta 3720
gatttttttt atattttgtt ttgtgttatt tttttcttta acatccctaa aattttcctt 3780
acatgtttta ctagccagat ttttcctcct ctcctgacta ctcccagtca tagctgtccc 3840
tcttctctta tggagatccc tcgacctgca gcccaagctt ggcgtaatca tggtcatagc 3900
tgtttcctgt gtgaaattgt tatccgctca caattccaca caacatacga gccggaagca 3960
taaagtgtaa agcctggggt gcctaatgag tgagctaact cacattaatt gcgttgcgct 4020
cactgcccgc tttccagtcg ggaaacctgt cgtgccagcg gatccgcatc tcaattagtc 4080
agcaaccata gtcccgcccc taactccgcc catcccgccc ctaactccgc ccagttccgc 4140
ccattctccg ccccatggct gactaatttt ttttatttat gcagaggccg aggccgcctc 4200
ggcctctgag ctattccaga agtagtgagg aggctttttt ggaggcctag gcttttgcaa 4260
aaagctaact tgtttattgc agcttataat ggttacaaat aaagcaatag catcacaaat 4320
ttcacaaata aagcattttt ttcactgcat tctagttgtg gtttgtccaa actcatcaat 4380
gtatcttatc atgtctgtcc gcttcctcgc tcactgactc gctgcgctcg gtcgttcggc 4440
tgcggcgagc ggtatcagct cactcaaagg cggtaatacg gttatccaca gaatcagggg 4500
ataacgcagg aaagaacatg tgagcaaaag gccagcaaaa ggccaggaac cgtaaaaagg 4560
ccgcgttgct ggcgtttttc cataggctcc gcccccctga cgagcatcac aaaaatcgac 4620
gctcaagtca gaggtggcga aacccgacag gactataaag ataccaggcg tttccccctg 4680
gaagctccct cgtgcgctct cctgttccga ccctgccgct taccggatac ctgtccgcct 4740
ttctcccttc gggaagcgtg gcgctttctc atagctcacg ctgtaggtat ctcagttcgg 4800
tgtaggtcgt tcgctccaag ctgggctgtg tgcacgaacc ccccgttcag cccgaccgct 4860
gcgccttatc cggtaactat cgtcttgagt ccaacccggt aagacacgac ttatcgccac 4920
tggcagcagc cactggtaac aggattagca gagcgaggta tgtaggcggt gctacagagt 4980
tcttgaagtg gtggcctaac tacggctaca ctagaagaac agtatttggt atctgcgctc 5040
tgctgaagcc agttaccttc ggaaaaagag ttggtagctc ttgatccggc aaacaaacca 5100
ccgctggtag cggtggtttt tttgtttgca agcagcagat tacgcgcaga aaaaaaggat 5160
ctcaagaaga tcctttgatc ttttctacgg ggtctgacgc tcagtggaac gaaaactcac 5220
gttaagggat tttggtcatg agattatcaa aaaggatctt cacctagatc cttttaaatt 5280
aaaaatgaag ttttaaatca atctaaagta tatatgagta aacttggtct gacagttaga 5340
aaaactcatc gagcatcaaa tgaaactgca atttattcat atcaggatta tcaataccat 5400
atttttgaaa aagccgtttc tgtaatgaag gagaaaactc accgaggcag ttccatagga 5460
tggcaagatc ctggtatcgg tctgcgattc cgactcgtcc aacatcaata caacctatta 5520
atttcccctc gtcaaaaata aggttatcaa gtgagaaatc accatgagtg acgactgaat 5580
ccggtgagaa tggcaacagc ttatgcattt ctttccagac ttgttcaaca ggccagccat 5640
tacgctcgtc atcaaaatca ctcgcatcaa ccaaaccgtt attcattcgt gattgcgcct 5700
gagcgagacg aaatacgcga tcgctgttaa aaggacaatt acaaacagga atcgaatgca 5760
accggcgcag gaacactgcc agcgcatcaa caatattttc acctgaatca ggatattctt 5820
ctaatacctg gaatgctgtt tttccgggga tcgcagtggt gagtaaccat gcatcatcag 5880
gagtacggat aaaatgcttg atggtcggaa gaggcataaa ttccgtcagc cagtttagtc 5940
tgaccatctc atctgtaaca tcattggcaa cgctaccttt gccatgtttc agaaacaact 6000
ctggcgcatc gggcttccca tacaatcgat agattgtcgc acctgattgc ccgacattat 6060
cgcgagccca tttataccca tataaatcag catccatgtt ggaatttaat cgcggcctag 6120
agcaagacgt ttcccgttga atatggctca taacacccct tgtattactg tttatgtaag 6180
cagacagttt tattgttcat gatgatatat ttttatcttg tgcaatgtaa catcagagat 6240
tttgagacac aacaattggt cgac 6264
<210> 8
<211> 6522
<212> DNA
<213> Artificial Sequence
<220>
<223> pGM303
<400> 8
attgattatt gactagttat taatagtaat caattacggg gtcattagtt catagcccat 60
atatggagtt ccgcgttaca taacttacgg taaatggccc gcctggctga ccgcccaacg 120
acccccgccc attgacgtca ataatgacgt atgttcccat agtaacgcca atagggactt 180
tccattgacg tcaatgggtg gagtatttac ggtaaactgc ccacttggca gtacatcaag 240
tgtatcatat gccaagtacg ccccctattg acgtcaatga cggtaaatgg cccgcctggc 300
attatgccca gtacatgacc ttatgggact ttcctacttg gcagtacatc tacgtattag 360
tcatcgctat taccatggtc gaggtgagcc ccacgttctg cttcactctc cccatctccc 420
ccccctcccc acccccaatt ttgtatttat ttatttttta attattttgt gcagcgatgg 480
gggcgggggg gggggggggg cgcgcgccag gcggggcggg gcggggcgag gggcggggcg 540
gggcgaggcg gagaggtgcg gcggcagcca atcagagcgg cgcgctccga aagtttcctt 600
ttatggcgag gcggcggcgg cggcggccct ataaaaagcg aagcgcgcgg cgggcgggag 660
tcgctgcgcg ctgccttcgc cccgtgcccc gctccgccgc cgcctcgcgc cgcccgcccc 720
ggctctgact gaccgcgtta ctcccacagg tgagcgggcg ggacggccct tctcctccgg 780
gctgtaatta gcgcttggtt taatgacggc ttgtttcttt tctgtggctg cgtgaaagcc 840
ttgaggggct ccgggagggc cctttgtgcg gggggagcgg ctcggggggt gcgtgcgtgt 900
gtgtgtgcgt ggggagcgcc gcgtgcggct ccgcgctgcc cggcggctgt gagcgctgcg 960
ggcgcggcgc ggggctttgt gcgctccgca gtgtgcgcga ggggagcgcg gccgggggcg 1020
gtgccccgcg gtgcgggggg ggctgcgagg ggaacaaagg ctgcgtgcgg ggtgtgtgcg 1080
tgggggggtg agcagggggt gtgggcgcgt cggtcgggct gcaacccccc ctgcaccccc 1140
ctccccgagt tgctgagcac ggcccggctt cgggtgcggg gctccgtacg gggcgtggcg 1200
cggggctcgc cgtgccgggc ggggggtggc ggcaggtggg ggtgccgggc ggggcggggc 1260
cgcctcgggc cggggagggc tcgggggagg ggcgcggcgg cccccggagc gccggcggct 1320
gtcgaggcgc ggcgagccgc agccattgcc ttttatggta atcgtgcgag agggcgcagg 1380
gacttccttt gtcccaaatc tgtgcggagc cgaaatctgg gaggcgccgc cgcaccccct 1440
ctagcgggcg cggggcgaag cggtgcggcg ccggcaggaa ggaaatgggc ggggagggcc 1500
ttcgtgcgtc gccgcgccgc cgtccccttc tccctctcca gcctcggggc tgtccgcggg 1560
gggacggggc agggcggggt tcggcttctg gcgtgtgacc ggcggctcta gagcctctgc 1620
taaccatgtt catgccttct tctttttcct acagctcctg ggcaacgtgc tggttattgt 1680
gctgtctcat cattttggca aagaattcct cgagcatgtg gtctgagtta aaaatcagga 1740
gcaacgacgg aggtgaagga ccagaggacg ccaacgaccc ccggggaaag ggggtgcaac 1800
acatccatat ccagccatct ctacctgttt atggacagag ggttagggat ggtgataggg 1860
gcaaacgtga ctcgtactgg tctacttctc ctagtggtag caccacaaaa ccagcatcag 1920
gttgggagag gtcaagtaaa gccgacacat ggttgctgat tctctcattc acccagtggg 1980
ctttgtcaat tgccacagtg atcatctgta tcataatttc tgctagacaa gggtatagta 2040
tgaaagagta ctcaatgact gtagaggcat tgaacatgag cagcagggag gtgaaagagt 2100
cacttaccag tctaataagg caagaggtta tagcaagggc tgtcaacatt cagagctctg 2160
tgcaaaccgg aatcccagtc ttgttgaaca aaaacagcag ggatgtcatc cagatgattg 2220
ataagtcgtg cagcagacaa gagctcactc agcactgtga gagtacgatc gcagtccacc 2280
atgccgatgg aattgcccca cttgagccac atagtttctg gagatgccct gtcggagaac 2340
cgtatcttag ctcagatcct gaaatctcat tgctgcctgg tccgagcttg ttatctggtt 2400
ctacaacgat ctctggatgt gttaggctcc cttcactctc aattggcgag gcaatctatg 2460
cctattcatc aaatctcatt acacaaggtt gtgctgacat agggaaatca tatcaggtcc 2520
tgcagctagg gtacatatca ctcaattcag atatgttccc tgatcttaac cccgtagtgt 2580
cccacactta tgacatcaac gacaatcgga aatcatgctc tgtggtggca accgggacta 2640
ggggttatca gctttgctcc atgccgactg tagacgaaag aaccgactac tctagtgatg 2700
gtattgagga tctggtcctt gatgtcctgg atctcaaagg gagaactaag tctcaccggt 2760
atcgcaacag cgaggtagat cttgatcacc cgttctctgc actatacccc agtgtaggca 2820
acggcattgc aacagaaggc tcattgatat ttcttgggta tggtggacta accacccctc 2880
tgcagggtga tacaaaatgt aggacccaag gatgccaaca ggtgtcgcaa gacacatgca 2940
atgaggctct gaaaattaca tggctaggag ggaaacaggt ggtcagcgtg atcatccagg 3000
tcaatgacta tctctcagag aggccaaaga taagagtcac aaccattcca atcactcaaa 3060
actatctcgg ggcggaaggt agattattaa aattgggtga tcgggtgtac atctatacaa 3120
gatcatcagg ctggcactct caactgcaga taggagtact tgatgtcagc caccctttga 3180
ctatcaactg gacacctcat gaagccttgt ctagaccagg aaataaagag tgcaattggt 3240
acaataagtg tccgaaggaa tgcatatcag gcgtatacac tgatgcttat ccattgtccc 3300
ctgatgcagc taacgtcgct accgtcacgc tatatgccaa tacatcgcgt gtcaacccaa 3360
caatcatgta ttctaacact actaacatta taaatatgtt aaggataaag gatgttcaat 3420
tagaggctgc atataccacg acatcgtgta tcacgcattt tggtaaaggc tactgctttc 3480
acatcatcga gatcaatcag aagagcctga ataccttaca gccgatgctc tttaagacta 3540
gcatccctaa attatgcaag gccgagtctt aagcggccgc gcatgcgaat tcactcctca 3600
ggtgcaggct gcctatcaga aggtggtggc tggtgtggcc aatgccctgg ctcacaaata 3660
ccactgagat ctttttccct ctgccaaaaa ttatggggac atcatgaagc cccttgagca 3720
tctgacttct ggctaataaa ggaaatttat tttcattgca atagtgtgtt ggaatttttt 3780
gtgtctctca ctcggaagga catatgggag ggcaaatcat ttaaaacatc agaatgagta 3840
tttggtttag agtttggcaa catatgccca tatgctggct gccatgaaca aaggttggct 3900
ataaagaggt catcagtata tgaaacagcc ccctgctgtc tattccttat tccatagaaa 3960
agccttgact tgaggttaga ttttttttat attttgtttt gtgttatttt tttctttaac 4020
atccctaaaa ttttccttac atgttttact agccagattt ttcctcctct cctgactact 4080
cccagtcata gctgtccctc ttctcttatg gagatccctc gacctgcagc ccaagcttgg 4140
cgtaatcatg gtcatagctg tttcctgtgt gaaattgtta tccgctcaca attccacaca 4200
acatacgagc cggaagcata aagtgtaaag cctggggtgc ctaatgagtg agctaactca 4260
cattaattgc gttgcgctca ctgcccgctt tccagtcggg aaacctgtcg tgccagcgga 4320
tccgcatctc aattagtcag caaccatagt cccgccccta actccgccca tcccgcccct 4380
aactccgccc agttccgccc attctccgcc ccatggctga ctaatttttt ttatttatgc 4440
agaggccgag gccgcctcgg cctctgagct attccagaag tagtgaggag gcttttttgg 4500
aggcctaggc ttttgcaaaa agctaacttg tttattgcag cttataatgg ttacaaataa 4560
agcaatagca tcacaaattt cacaaataaa gcattttttt cactgcattc tagttgtggt 4620
ttgtccaaac tcatcaatgt atcttatcat gtctgtccgc ttcctcgctc actgactcgc 4680
tgcgctcggt cgttcggctg cggcgagcgg tatcagctca ctcaaaggcg gtaatacggt 4740
tatccacaga atcaggggat aacgcaggaa agaacatgtg agcaaaaggc cagcaaaagg 4800
ccaggaaccg taaaaaggcc gcgttgctgg cgtttttcca taggctccgc ccccctgacg 4860
agcatcacaa aaatcgacgc tcaagtcaga ggtggcgaaa cccgacagga ctataaagat 4920
accaggcgtt tccccctgga agctccctcg tgcgctctcc tgttccgacc ctgccgctta 4980
ccggatacct gtccgccttt ctcccttcgg gaagcgtggc gctttctcat agctcacgct 5040
gtaggtatct cagttcggtg taggtcgttc gctccaagct gggctgtgtg cacgaacccc 5100
ccgttcagcc cgaccgctgc gccttatccg gtaactatcg tcttgagtcc aacccggtaa 5160
gacacgactt atcgccactg gcagcagcca ctggtaacag gattagcaga gcgaggtatg 5220
taggcggtgc tacagagttc ttgaagtggt ggcctaacta cggctacact agaagaacag 5280
tatttggtat ctgcgctctg ctgaagccag ttaccttcgg aaaaagagtt ggtagctctt 5340
gatccggcaa acaaaccacc gctggtagcg gtggtttttt tgtttgcaag cagcagatta 5400
cgcgcagaaa aaaaggatct caagaagatc ctttgatctt ttctacgggg tctgacgctc 5460
agtggaacga aaactcacgt taagggattt tggtcatgag attatcaaaa aggatcttca 5520
cctagatcct tttaaattaa aaatgaagtt ttaaatcaat ctaaagtata tatgagtaaa 5580
cttggtctga cagttagaaa aactcatcga gcatcaaatg aaactgcaat ttattcatat 5640
caggattatc aataccatat ttttgaaaaa gccgtttctg taatgaagga gaaaactcac 5700
cgaggcagtt ccataggatg gcaagatcct ggtatcggtc tgcgattccg actcgtccaa 5760
catcaataca acctattaat ttcccctcgt caaaaataag gttatcaagt gagaaatcac 5820
catgagtgac gactgaatcc ggtgagaatg gcaacagctt atgcatttct ttccagactt 5880
gttcaacagg ccagccatta cgctcgtcat caaaatcact cgcatcaacc aaaccgttat 5940
tcattcgtga ttgcgcctga gcgagacgaa atacgcgatc gctgttaaaa ggacaattac 6000
aaacaggaat cgaatgcaac cggcgcagga acactgccag cgcatcaaca atattttcac 6060
ctgaatcagg atattcttct aatacctgga atgctgtttt tccggggatc gcagtggtga 6120
gtaaccatgc atcatcagga gtacggataa aatgcttgat ggtcggaaga ggcataaatt 6180
ccgtcagcca gtttagtctg accatctcat ctgtaacatc attggcaacg ctacctttgc 6240
catgtttcag aaacaactct ggcgcatcgg gcttcccata caatcgatag attgtcgcac 6300
ctgattgccc gacattatcg cgagcccatt tatacccata taaatcagca tccatgttgg 6360
aatttaatcg cggcctagag caagacgttt cccgttgaat atggctcata acaccccttg 6420
tattactgtt tatgtaagca gacagtttta ttgttcatga tgatatattt ttatcttgtg 6480
caatgtaaca tcagagattt tgagacacaa caattggtcg ac 6522
<210> 9
<211> 9886
<212> DNA
<213> Artificial Sequence
<220>
<223> pGM297
<400> 9
attgattatt gactagttat taatagtaat caattacggg gtcattagtt catagcccat 60
atatggagtt ccgcgttaca taacttacgg taaatggccc gcctggctga ccgcccaacg 120
acccccgccc attgacgtca ataatgacgt atgttcccat agtaacgcca atagggactt 180
tccattgacg tcaatgggtg gagtatttac ggtaaactgc ccacttggca gtacatcaag 240
tgtatcatat gccaagtacg ccccctattg acgtcaatga cggtaaatgg cccgcctggc 300
attatgccca gtacatgacc ttatgggact ttcctacttg gcagtacatc tacgtattag 360
tcatcgctat taccatggtc gaggtgagcc ccacgttctg cttcactctc cccatctccc 420
ccccctcccc acccccaatt ttgtatttat ttatttttta attattttgt gcagcgatgg 480
gggcgggggg gggggggggg cgcgcgccag gcggggcggg gcggggcgag gggcggggcg 540
gggcgaggcg gagaggtgcg gcggcagcca atcagagcgg cgcgctccga aagtttcctt 600
ttatggcgag gcggcggcgg cggcggccct ataaaaagcg aagcgcgcgg cgggcgggag 660
tcgctgcgcg ctgccttcgc cccgtgcccc gctccgccgc cgcctcgcgc cgcccgcccc 720
ggctctgact gaccgcgtta ctcccacagg tgagcgggcg ggacggccct tctcctccgg 780
gctgtaatta gcgcttggtt taatgacggc ttgtttcttt tctgtggctg cgtgaaagcc 840
ttgaggggct ccgggagggc cctttgtgcg gggggagcgg ctcggggggt gcgtgcgtgt 900
gtgtgtgcgt ggggagcgcc gcgtgcggct ccgcgctgcc cggcggctgt gagcgctgcg 960
ggcgcggcgc ggggctttgt gcgctccgca gtgtgcgcga ggggagcgcg gccgggggcg 1020
gtgccccgcg gtgcgggggg ggctgcgagg ggaacaaagg ctgcgtgcgg ggtgtgtgcg 1080
tgggggggtg agcagggggt gtgggcgcgt cggtcgggct gcaacccccc ctgcaccccc 1140
ctccccgagt tgctgagcac ggcccggctt cgggtgcggg gctccgtacg gggcgtggcg 1200
cggggctcgc cgtgccgggc ggggggtggc ggcaggtggg ggtgccgggc ggggcggggc 1260
cgcctcgggc cggggagggc tcgggggagg ggcgcggcgg cccccggagc gccggcggct 1320
gtcgaggcgc ggcgagccgc agccattgcc ttttatggta atcgtgcgag agggcgcagg 1380
gacttccttt gtcccaaatc tgtgcggagc cgaaatctgg gaggcgccgc cgcaccccct 1440
ctagcgggcg cggggcgaag cggtgcggcg ccggcaggaa ggaaatgggc ggggagggcc 1500
ttcgtgcgtc gccgcgccgc cgtccccttc tccctctcca gcctcggggc tgtccgcggg 1560
gggacggctg ccttcggggg ggacggggca gggcggggtt cggcttctgg cgtgtgaccg 1620
gcggctctag agcctctgct aaccatgttc atgccttctt ctttttccta cagctcctgg 1680
gcaacgtgct ggttattgtg ctgtctcatc attttggcaa agaattgctc gagactagtg 1740
acttggtgag taggcttcga gcctagttag aggactagga gaggccgtag ccgtaactac 1800
tctgggcaag tagggcaggc ggtgggtacg caatgggggc ggctacctca gcactaaata 1860
ggagacaatt agaccaattt gagaaaatac gacttcgccc gaacggaaag aaaaagtacc 1920
aaattaaaca tttaatatgg gcaggcaagg agatggagcg cttcggcctc catgagaggt 1980
tgttggagac agaggagggg tgtaaaagaa tcatagaagt cctctacccc ctagaaccaa 2040
caggatcgga gggcttaaaa agtctgttca atcttgtgtg cgtactatat tgcttgcaca 2100
aggaacagaa agtgaaagac acagaggaag cagtagcaac agtaagacaa cactgccatc 2160
tagtggaaaa agaaaaaagt gcaacagaga catctagtgg acaaaagaaa aatgacaagg 2220
gaatagcagc gccacctggt ggcagtcaga attttccagc gcaacaacaa ggaaatgcct 2280
gggtacatgt acccttgtca ccgcgcacct taaatgcgtg ggtaaaagca gtagaggaga 2340
aaaaatttgg agcagaaata gtacccatgt ttcaagccct atcagaaggc tgcacaccct 2400
atgacattaa tcagatgctt aatgtgctag gagatcatca aggggcatta caaatagtga 2460
aagagatcat taatgaagaa gcagcccagt gggatgtaac acacccacta cccgcaggac 2520
ccctaccagc aggacagctc agggaccctc gcggctcaga tatagcaggg accaccagct 2580
cagtacaaga acagttagaa tggatctata ctgctaaccc ccgggtagat gtaggtgcca 2640
tctaccggag atggattatt ctaggacttc aaaagtgtgt caaaatgtac aacccagtat 2700
cagtcctaga cattaggcag ggacctaaag agcccttcaa ggattatgtg gacagatttt 2760
acaaggcaat tagagcagaa caagcctcag gggaagtgaa acaatggatg acagaatcat 2820
tactcattca aaatgctaat ccagattgta aggtcatcct gaagggccta ggaatgcacc 2880
ccacccttga agaaatgtta acggcttgtc agggggtagg aggcccaagc tacaaagcaa 2940
aagtaatggc agaaatgatg cagaccatgc aaaatcaaaa catggtgcag cagggaggtc 3000
caaaaagaca aagaccccca ctaagatgtt ataattgtgg aaaatttggc catatgcaaa 3060
gacaatgtcc ggaaccaagg aaaacaaaat gtctaaagtg tggaaaattg ggacacctag 3120
caaaagactg caggggacag gtgaattttt tagggtatgg acggtggatg ggggcaaaac 3180
cgagaaattt tcccgccgct actcttggag cggaaccgag tgcgcctcct ccaccgagcg 3240
gcaccacccc atacgaccca gcaaagaagc tcctgcagca atatgcagag aaagggaaac 3300
aactgaggga gcaaaagagg aatccaccgg caatgaatcc ggattggacc gagggatatt 3360
ctttgaactc cctctttgga gaagaccaat aaagacagtg tatatagaag gggtccccat 3420
taaggcactg ctagacacag gggcagatga caccataatt aaagaaaatg atttacaatt 3480
atcaggtcca tggagaccca aaattatagg gggcatagga ggaggcctta atgtaaaaga 3540
atataacgac agggaagtaa aaatagaaga taaaattttg agaggaacaa tattgttagg 3600
agcaactccc attaatataa taggtagaaa tttgctggcc ccggcaggtg cccggttagt 3660
aatgggacaa ttatcagaaa aaattcctgt cacacctgtc aaattgaagg aaggggctcg 3720
gggaccctgt gtaagacaat ggcctctctc taaagagaag attgaagctt tacaggaaat 3780
atgttcccaa ttagagcagg aaggaaaaat cagtagagta ggaggagaaa atgcatacaa 3840
taccccaata ttttgcataa agaagaagga caaatcccag tggaggatgc tagtagactt 3900
tagagagtta aataaggcaa cccaagattt ctttgaagtg caattaggga taccccaccc 3960
agcaggatta agaaagatga gacagataac agttttagat gtaggagacg cctattattc 4020
cataccattg gatccaaatt ttaggaaata tactgctttt actattccca cagtgaataa 4080
tcagggaccc gggattaggt atcaattcaa ctgtctcccg caagggtgga aaggatctcc 4140
tacaatcttc caaaatacag cagcatccat tttggaggag ataaaaagaa acttgccagc 4200
actaaccatt gtacaataca tggatgattt atgggtaggt tctcaagaaa atgaacacac 4260
ccatgacaaa ttagtagaac agttaagaac aaaattacaa gcctggggct tagaaacccc 4320
agaaaagaag gtgcaaaaag aaccacctta tgagtggatg ggatacaaac tttggcctca 4380
caaatgggaa ctaagcagaa tacaactgga ggaaaaagat gaatggactg tcaatgacat 4440
ccagaagtta gttgggaaac taaattgggc agcacaattg tatccaggtc ttaggaccaa 4500
gaatatatgc aagttaatta gaggaaagaa aaatctgtta gagctagtga cttggacacc 4560
tgaggcagaa gctgaatatg cagaaaatgc agagattctt aaaacagaac aggaaggaac 4620
ctattacaaa ccaggaatac ctattagggc agcagtacag aaattggaag gaggacagtg 4680
gagttaccaa ttcaaacaag aaggacaagt cttgaaagta ggaaaataca ccaagcaaaa 4740
gaacacccat acaaatgaac ttcgcacatt agctggttta gtgcagaaga tttgcaaaga 4800
agctctagtt atttggggga tattaccagt tctagaactc ccgatagaaa gagaggtatg 4860
ggaacaatgg tgggcggatt actggcaggt aagctggatt cccgaatggg attttgtcag 4920
caccccacct ttgctcaaac tatggtacac attaacaaaa gaacccatac ccaaggagga 4980
cgtttactat gtagatggag catgcaacag aaattcaaaa gaaggaaaag caggatacat 5040
ctcacaatac ggaaaacaga gagtagaaac attagaaaac actaccaatc agcaagcaga 5100
attaacagct ataaaaatgg ctttggaaga cagtgggcct aatgtgaaca tagtaacaga 5160
ctctcaatat gcaatgggaa ttttgacagc acaacccaca caaagtgatt caccattagt 5220
agagcaaatt atagccttaa tgatacaaaa gcaacaaata tatttgcagt gggtaccagc 5280
acataaagga ataggaggaa atgaggagat agataaatta gtgagtaaag gcattagaag 5340
agttttattc ttagaaaaaa tagaagaagc tcaagaagag catgaaagat atcataataa 5400
ttggaaaaac ctagcagata catatgggct tccacaaata gtagcaaaag agatagtggc 5460
catgtgtcca aaatgtcaga taaagggaga accagtgcat ggacaagtgg atgcctcacc 5520
tggaacatgg cagatggatt gtactcatct agaaggaaaa gtagtcatag ttgcggtcca 5580
tgtagccagt ggattcatag aagcagaagt catacctagg gaaacaggaa aagaaacggc 5640
aaagtttcta ttaaaaatac tgagtagatg gcctataaca cagttacaca cagacaatgg 5700
gcctaacttt acctcccaag aagtggcagc aatatgttgg tggggaaaaa ttgaacatac 5760
aacaggtata ccatataacc cccaatctca aggatcaata gaaagcatga acaaacaatt 5820
aaaagagata attgggaaaa taagagatga ttgccaatat acagagacag cagtactgat 5880
ggcttgccat attcacaatt ttaaaagaaa gggaggaata gggggacaga cttcagcaga 5940
gagactaatt aatataataa caacacaatt agaaatacaa catttacaaa ccaaaattca 6000
aaaaatttta aattttagag tctactacag agaagggaga gaccctgtgt ggaaaggacc 6060
agcacaatta atctggaaag gggaaggagc agtggtcctc aaggacggaa gtgacctaaa 6120
ggttgtacca agaaggaaag ctaaaattat taaggattat gaacccaaac aaagagtggg 6180
taatgagggt gacgtggaag gtaccagggg atctgataac taaatggcag ggaatagtca 6240
gatattggat gagacaaaga aatttgaaat ggaactatta tatgcatcag ctggcggccg 6300
cgaattcact agtgattccc gtttgtgcta gggttcttag gcttcttggg ggctgctgga 6360
actgcaatgg gagcagcggc gacagccctg acggtccagt ctcagcattt gcttgctggg 6420
atactgcagc agcagaagaa tctgctggcg gctgtggagg ctcaacagca gatgttgaag 6480
ctgaccattt ggggtgttaa aaacctcaat gcccgcgtca cagcccttga gaagtaccta 6540
gaggatcagg cacgactaaa ctcctggggg tgcgcatgga aacaagtatg tcataccaca 6600
gtggagtggc cctggacaaa tcggactccg gattggcaaa atatgacttg gttggagtgg 6660
gaaagacaaa tagctgattt ggaaagcaac attacgagac aattagtgaa ggctagagaa 6720
caagaggaaa agaatctaga tgcctatcag aagttaacta gttggtcaga tttctggtct 6780
tggttcgatt tctcaaaatg gcttaacatt ttaaaaatgg gatttttagt aatagtagga 6840
ataatagggt taagattact ttacacagta tatggatgta tagtgagggt taggcaggga 6900
tatgttcctc tatctccaca gatccatatc caatcgaatt cccgcggccg caattcactc 6960
ctcaggtgca ggctgcctat cagaaggtgg tggctggtgt ggccaatgcc ctggctcaca 7020
aataccactg agatcttttt ccctctgcca aaaattatgg ggacatcatg aagccccttg 7080
agcatctgac ttctggctaa taaaggaaat ttattttcat tgcaatagtg tgttggaatt 7140
ttttgtgtct ctcactcgga aggacatatg ggagggcaaa tcatttaaaa catcagaatg 7200
agtatttggt ttagagtttg gcaacatatg cccatatgct ggctgccatg aacaaaggtt 7260
ggctataaag aggtcatcag tatatgaaac agccccctgc tgtccattcc ttattccata 7320
gaaaagcctt gacttgaggt tagatttttt ttatattttg ttttgtgtta tttttttctt 7380
taacatccct aaaattttcc ttacatgttt tactagccag atttttcctc ctctcctgac 7440
tactcccagt catagctgtc cctcttctct tatggagatc cctcgacctg cagcccaagc 7500
ttggcgtaat catggtcata gctgtttcct gtgtgaaatt gttatccgct cacaattcca 7560
cacaacatac gagccggaag cataaagtgt aaagcctggg gtgcctaatg agtgagctaa 7620
ctcacattaa ttgcgttgcg ctcactgccc gctttccagt cgggaaacct gtcgtgccag 7680
cggatccgca tctcaattag tcagcaacca tagtcccgcc cctaactccg cccatcccgc 7740
ccctaactcc gcccagttcc gcccattctc cgccccatgg ctgactaatt ttttttattt 7800
atgcagaggc cgaggccgcc tcggcctctg agctattcca gaagtagtga ggaggctttt 7860
ttggaggcct aggcttttgc aaaaagctaa cttgtttatt gcagcttata atggttacaa 7920
ataaagcaat agcatcacaa atttcacaaa taaagcattt ttttcactgc attctagttg 7980
tggtttgtcc aaactcatca atgtatctta tcatgtctgt ccgcttcctc gctcactgac 8040
tcgctgcgct cggtcgttcg gctgcggcga gcggtatcag ctcactcaaa ggcggtaata 8100
cggttatcca cagaatcagg ggataacgca ggaaagaaca tgtgagcaaa aggccagcaa 8160
aaggccagga accgtaaaaa ggccgcgttg ctggcgtttt tccataggct ccgcccccct 8220
gacgagcatc acaaaaatcg acgctcaagt cagaggtggc gaaacccgac aggactataa 8280
agataccagg cgtttccccc tggaagctcc ctcgtgcgct ctcctgttcc gaccctgccg 8340
cttaccggat acctgtccgc ctttctccct tcgggaagcg tggcgctttc tcatagctca 8400
cgctgtaggt atctcagttc ggtgtaggtc gttcgctcca agctgggctg tgtgcacgaa 8460
ccccccgttc agcccgaccg ctgcgcctta tccggtaact atcgtcttga gtccaacccg 8520
gtaagacacg acttatcgcc actggcagca gccactggta acaggattag cagagcgagg 8580
tatgtaggcg gtgctacaga gttcttgaag tggtggccta actacggcta cactagaaga 8640
acagtatttg gtatctgcgc tctgctgaag ccagttacct tcggaaaaag agttggtagc 8700
tcttgatccg gcaaacaaac caccgctggt agcggtggtt tttttgtttg caagcagcag 8760
attacgcgca gaaaaaaagg atctcaagaa gatcctttga tcttttctac ggggtctgac 8820
gctcagtgga acgaaaactc acgttaaggg attttggtca tgagattatc aaaaaggatc 8880
ttcacctaga tccttttaaa ttaaaaatga agttttaaat caatctaaag tatatatgag 8940
taaacttggt ctgacagtta gaaaaactca tcgagcatca aatgaaactg caatttattc 9000
atatcaggat tatcaatacc atatttttga aaaagccgtt tctgtaatga aggagaaaac 9060
tcaccgaggc agttccatag gatggcaaga tcctggtatc ggtctgcgat tccgactcgt 9120
ccaacatcaa tacaacctat taatttcccc tcgtcaaaaa taaggttatc aagtgagaaa 9180
tcaccatgag tgacgactga atccggtgag aatggcaaca gcttatgcat ttctttccag 9240
acttgttcaa caggccagcc attacgctcg tcatcaaaat cactcgcatc aaccaaaccg 9300
ttattcattc gtgattgcgc ctgagcgaga cgaaatacgc gatcgctgtt aaaaggacaa 9360
ttacaaacag gaatcgaatg caaccggcgc aggaacactg ccagcgcatc aacaatattt 9420
tcacctgaat caggatattc ttctaatacc tggaatgctg tttttccggg gatcgcagtg 9480
gtgagtaacc atgcatcatc aggagtacgg ataaaatgct tgatggtcgg aagaggcata 9540
aattccgtca gccagtttag tctgaccatc tcatctgtaa catcattggc aacgctacct 9600
ttgccatgtt tcagaaacaa ctctggcgca tcgggcttcc catacaatcg atagattgtc 9660
gcacctgatt gcccgacatt atcgcgagcc catttatacc catataaatc agcatccatg 9720
ttggaattta atcgcggcct agagcaagac gtttcccgtt gaatatggct cataacaccc 9780
cttgtattac tgtttatgta agcagacagt tttattgttc atgatgatat atttttatct 9840
tgtgcaatgt aacatcagag attttgagac acaacaattg gtcgac 9886
<210> 10
<211> 574
<212> DNA
<213> Artificial Sequence
<220>
<223> hCEF promoter
<400> 10
agatctgtta cataacttat ggtaaatggc ctgcctggct gactgcccaa tgacccctgc 60
ccaatgatgt caataatgat gtatgttccc atgtaatgcc aatagggact ttccattgat 120
gtcaatgggt ggagtattta tggtaactgc ccacttggca gtacatcaag tgtatcatat 180
gccaagtatg ccccctattg atgtcaatga tggtaaatgg cctgcctggc attatgccca 240
gtacatgacc ttatgggact ttcctacttg gcagtacatc tatgtattag tcattgctat 300
taccatggga attcactagt ggagaagagc atgcttgagg gctgagtgcc cctcagtggg 360
cagagagcac atggcccaca gtccctgaga agttgggggg aggggtgggc aattgaactg 420
gtgcctagag aaggtggggc ttgggtaaac tgggaaagtg atgtggtgta ctggctccac 480
ctttttcccc agggtggggg agaaccatat ataagtgcag tagtctctgt gaacattcaa 540
gcttctgcct tctccctcct gtgagtttgc tagc 574
<210> 11
<211> 873
<212> DNA
<213> Human cytomegalovirus
<400> 11
ccgcggagat ctcaatattg gccattagcc atattattca ttggttatat agcataaatc 60
aatattggct attggccatt gcatacgttg tatctatatc ataatatgta catttatatt 120
ggctcatgtc caatatgacc gccatgttgg cattgattat tgactagtta ttaatagtaa 180
tcaattacgg ggtcattagt tcatagccca tatatggagt tccgcgttac ataacttacg 240
gtaaatggcc cgcctggctg accgcccaac gacccccgcc cattgacgtc aataatgacg 300
tatgttccca tagtaacgcc aatagggact ttccattgac gtcaatgggt ggagtattta 360
cggtaaactg cccacttggc agtacatcaa gtgtatcata tgccaagtcc gccccctatt 420
gacgtcaatg acggtaaatg gcccgcctgg cattatgccc agtacatgac cttacgggac 480
tttcctactt ggcagtacat ctacgtatta gtcatcgcta ttaccatggt gatgcggttt 540
tggcagtaca ccaatgggcg tggatagcgg tttgactcac ggggatttcc aagtctccac 600
cccattgacg tcaatgggag tttgttttgg caccaaaatc aacgggactt tccaaaatgt 660
cgtaataacc ccgccccgtt gacgcaaatg ggcggtaggc gtgtacggtg ggaggtctat 720
ataagcagag ctcgtttagt gaaccgtcag atcactagaa gctttattgc ggtagtttat 780
cacagttaaa ttgctaacgc agtcagtgct tctgacacaa cagtctcgaa cttaagctgc 840
agaagttggt cgtgaggcac tgggcaggct agc 873
<210> 12
<211> 395
<212> DNA
<213> Homo sapiens
<400> 12
agatccatat ccgcggcaat tttaaaagaa agggaggaat agggggacag acttcagcag 60
agagactaat taatataata acaacacaat tagaaataca acatttacaa accaaaattc 120
aaaaaatttt aaattttaga gccgcggaga tcccgtgagg ctccggtgcc cgtcagtggg 180
cagagcgcac atcgcccaca gtccccgaga agttgggggg aggggtcggc aattgaaccg 240
gtgcctagag aaggtggcgc ggggtaaact gggaaagtga tgtcgtgtac tggctccgcc 300
tttttcccga gggtggggga gaaccgtata taagtgcagt agtcgccgtg aacgttcttt 360
ttcgcaacgg gtttgccgcc agaacacagg ctagc 395
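Note: each nucleotide record in this listing declares its length under <211> and ends every sequence line with a running residue count. The following is a minimal sketch, not part of the filed listing, of how that bookkeeping could be checked for SEQ ID NO: 12 as printed above. The function name `parse_record` and the variable `RECORD` are illustrative assumptions, and the parser only handles the nucleotide layout (protein records put their position numbers on separate lines).

```python
import re

# SEQ ID NO: 12 copied verbatim from the listing above.
RECORD = """\
<210> 12
<211> 395
<212> DNA
<213> Homo sapiens
<400> 12
agatccatat ccgcggcaat tttaaaagaa agggaggaat agggggacag acttcagcag 60
agagactaat taatataata acaacacaat tagaaataca acatttacaa accaaaattc 120
aaaaaatttt aaattttaga gccgcggaga tcccgtgagg ctccggtgcc cgtcagtggg 180
cagagcgcac atcgcccaca gtccccgaga agttgggggg aggggtcggc aattgaaccg 240
gtgcctagag aaggtggcgc ggggtaaact gggaaagtga tgtcgtgtac tggctccgcc 300
tttttcccga gggtggggga gaaccgtata taagtgcagt agtcgccgtg aacgttcttt 360
ttcgcaacgg gtttgccgcc agaacacagg ctagc 395
"""


def parse_record(text):
    """Return (declared_length, sequence, counters_ok) for one DNA record."""
    declared = None        # value of the <211> field
    residues = []          # 10-base groups collected after <400>
    total = 0              # running number of bases seen so far
    counters_ok = True     # True while every end-of-line count matches
    in_sequence = False
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        tag = re.match(r"<(\d{3})>\s*(.*)", line)
        if tag:
            code, value = tag.groups()
            if code == "211":
                declared = int(value)
            # Sequence lines follow the <400> identifier.
            in_sequence = code == "400"
            continue
        if in_sequence:
            tokens = line.split()
            counter = int(tokens.pop()) if tokens[-1].isdigit() else None
            residues.extend(tokens)
            total += sum(len(t) for t in tokens)
            if counter is not None and counter != total:
                counters_ok = False
    return declared, "".join(residues), counters_ok


declared, sequence, counters_ok = parse_record(RECORD)
print(declared, len(sequence), counters_ok)
```

A mismatch between the declared length, the counted length, or any running count would point to a line lost or damaged during text extraction rather than to a difference in the underlying sequence.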
<210> 13
<211> 4459
<212> DNA
<213> Artificial Sequence
<220>
<223> soCFTR2
<400> 13
gctagccacc atgcagagaa gccctctgga gaaggcctct gtggtgagca agctgttctt 60
cagctggacc aggcccatcc tgaggaaggg ctacaggcag agactggagc tgtctgacat 120
ctaccagatc ccctctgtgg actctgctga caacctgtct gagaagctgg agagggagtg 180
ggatagagag ctggccagca agaagaaccc caagctgatc aatgccctga ggagatgctt 240
cttctggaga ttcatgttct atggcatctt cctgtacctg ggggaagtga ccaaggctgt 300
gcagcctctg ctgctgggca gaatcattgc cagctatgac cctgacaaca aggaggagag 360
gagcattgcc atctacctgg gcattggcct gtgcctgctg ttcattgtga ggaccctgct 420
gctgcaccct gccatctttg gcctgcacca cattggcatg cagatgagga ttgccatgtt 480
cagcctgatc tacaagaaaa ccctgaagct gtccagcaga gtgctggaca agatcagcat 540
tggccagctg gtgagcctgc tgagcaacaa cctgaacaag tttgatgagg gcctggccct 600
ggcccacttt gtgtggattg cccctctgca ggtggccctg ctgatgggcc tgatttggga 660
gctgctgcag gcctctgcct tttgtggcct gggcttcctg attgtgctgg ccctgtttca 720
ggctggcctg ggcaggatga tgatgaagta cagggaccag agggcaggca agatcagtga 780
gaggctggtg atcacctctg agatgattga gaacatccag tctgtgaagg cctactgttg 840
ggaggaagct atggagaaga tgattgaaaa cctgaggcag acagagctga agctgaccag 900
gaaggctgcc tatgtgagat acttcaacag ctctgccttc ttcttctctg gcttctttgt 960
ggtgttcctg tctgtgctgc cctatgccct gatcaagggg atcatcctga gaaagatttt 1020
caccaccatc agcttctgca ttgtgctgag gatggctgtg accagacagt tcccctgggc 1080
tgtgcagacc tggtatgaca gcctgggggc catcaacaag atccaggact tcctgcagaa 1140
gcaggagtac aagaccctgg agtacaacct gaccaccaca gaagtggtga tggagaatgt 1200
gacagccttc tgggaggagg gctttgggga gctgtttgag aaggccaagc agaacaacaa 1260
caacagaaag accagcaatg gggatgactc cctgttcttc tccaacttct ccctgctggg 1320
cacacctgtg ctgaaggaca tcaacttcaa gattgagagg gggcagctgc tggctgtggc 1380
tggatctaca ggggctggca agaccagcct gctgatgatg atcatggggg agctggagcc 1440
ttctgagggc aagatcaagc actctggcag gatcagcttt tgcagccagt tcagctggat 1500
catgcctggc accatcaagg agaacatcat ctttggagtg agctatgatg agtacagata 1560
caggagtgtg atcaaggcct gccagctgga ggaggacatc agcaagtttg ctgagaagga 1620
caacattgtg ctgggggagg gaggcattac actgtctggg ggccagagag ccagaatcag 1680
cctggccagg gctgtgtaca aggatgctga cctgtacctg ctggactccc cctttggcta 1740
cctggatgtg ctgacagaga aggagatttt tgagagctgt gtgtgcaagc tgatggccaa 1800
caagaccaga atcctggtga ccagcaagat ggagcacctg aagaaggctg acaagatcct 1860
gatcctgcat gagggcagca gctacttcta tgggaccttc tctgagctgc agaacctgca 1920
gcctgacttc agctctaagc tgatgggctg tgacagcttt gaccagttct ctgctgagag 1980
gaggaacagc atcctgacag agaccctgca cagattcagc ctggagggag atgcccctgt 2040
gagctggaca gagaccaaga agcagagctt caagcagaca ggggagtttg gggagaagag 2100
gaagaactcc atcctgaacc ccatcaacag catcaggaag ttcagcattg tgcagaaaac 2160
ccccctgcag atgaatggca ttgaggaaga ttctgatgag cccctggaga ggagactgag 2220
cctggtgcct gattctgagc agggagaggc catcctgcct aggatctctg tgatcagcac 2280
aggccctaca ctgcaggcca gaaggaggca gtctgtgctg aacctgatga cccactctgt 2340
gaaccagggc cagaacatcc acaggaaaac cacagcctcc accaggaaag tgagcctggc 2400
ccctcaggcc aatctgacag agctggacat ctacagcagg aggctgtctc aggagacagg 2460
cctggagatt tctgaggaga tcaatgagga ggacctgaaa gagtgcttct ttgatgacat 2520
ggagagcatc cctgctgtga ccacctggaa cacctacctg agatacatca cagtgcacaa 2580
gagcctgatc tttgtgctga tctggtgcct ggtgatcttc ctggctgaag tggctgcctc 2640
tctggtggtg ctgtggctgc tgggaaacac cccactgcag gacaagggca acagcaccca 2700
cagcaggaac aacagctatg ctgtgatcat cacctccacc tccagctact atgtgttcta 2760
catctatgtg ggagtggctg ataccctgct ggctatgggc ttctttagag gcctgcccct 2820
ggtgcacaca ctgatcacag tgagcaagat cctccaccac aagatgctgc actctgtgct 2880
gcaggctcct atgagcaccc tgaataccct gaaggctggg ggcatcctga acagattctc 2940
caaggatatt gccatcctgg atgacctgct gcctctcacc atctttgact tcatccagct 3000
gctgctgatt gtgattgggg ccattgctgt ggtggcagtg ctgcagccct acatctttgt 3060
ggccacagtg cctgtgattg tggccttcat catgctgagg gcctactttc tgcagacctc 3120
ccagcagctg aagcagctgg agtctgaggg cagaagcccc atcttcaccc acctggtgac 3180
aagcctgaag ggcctgtgga ccctgagagc ctttggcagg cagccctact ttgagaccct 3240
gttccacaag gccctgaacc tgcacacagc caactggttc ctctacctgt ccaccctgag 3300
atggttccag atgagaattg agatgatctt tgtcatcttc ttcattgctg tgaccttcat 3360
cagcattctg accacaggag agggagaggg cagagtgggc attatcctga ccctggccat 3420
gaacatcatg agcacactgc agtgggcagt gaacagcagc attgatgtgg acagcctgat 3480
gaggagtgtg agcagagtgt tcaagttcat tgatatgccc acagagggca agcctaccaa 3540
gagcaccaag ccctacaaga atggccagct gagcaaagtg atgatcattg agaacagcca 3600
tgtgaagaag gatgatatct ggcccagtgg aggccagatg acagtgaagg acctgacagc 3660
caagtacaca gaggggggca atgctatcct ggagaacatc tccttcagca tctcccctgg 3720
ccagagagtg ggactgctgg gaagaacagg ctctggcaag tctaccctgc tgtctgcctt 3780
cctgaggctg ctgaacacag agggagagat ccagattgat ggagtgtcct gggacagcat 3840
cacactgcag cagtggagga aggcctttgg tgtgatcccc cagaaagtgt tcatcttcag 3900
tggcaccttc aggaagaacc tggaccccta tgagcagtgg tctgaccagg agatttggaa 3960
agtggctgat gaagtgggcc tgagaagtgt gattgagcag ttccctggca agctggactt 4020
tgtcctggtg gatgggggct gtgtgctgag ccatggccac aagcagctga tgtgcctggc 4080
cagatcagtg ctgagcaagg ccaagatcct gctgctggat gagccttctg cccacctgga 4140
tcctgtgacc taccagatca tcaggaggac cctcaagcag gcctttgctg actgcacagt 4200
catcctgtgt gagcacagga ttgaggccat gctggagtgc cagcagttcc tggtgattga 4260
ggagaacaaa gtgaggcagt atgacagcat ccagaagctg ctgaatgaga ggagcctgtt 4320
caggcaggcc atcagcccct ctgatagagt gaagctgttc ccccacagga acagctccaa 4380
gtgcaagagc aagccccaga ttgctgccct gaaggaggag acagaggagg aagtgcagga 4440
caccaggctg tgagggccc 4459
<210> 14
<211> 1257
<212> DNA
<213> Artificial Sequence
<220>
<223> sohAAT
<400> 14
atgcccagct ctgtgtcctg gggcattctg ctgctggctg gcctgtgctg tctggtgcct 60
gtgtccctgg ctgaggaccc tcagggggat gctgcccaga aaacagacac ctcccaccat 120
gaccaggacc accccacctt caacaagatc acccccaacc tggcagagtt tgccttcagc 180
ctgtacagac agctggccca ccagagcaac agcaccaaca tctttttcag ccctgtgtcc 240
attgccacag cctttgccat gctgagcctg ggcaccaagg ctgacaccca tgatgagatc 300
ctggaaggcc tgaacttcaa cctgacagag atccctgagg cccagatcca tgagggcttc 360
caggaactgc tgagaaccct gaaccagcca gacagccagc tgcagctgac aacaggcaat 420
gggctgttcc tgtctgaggg cctgaagctg gtggacaagt ttctggaaga tgtgaagaag 480
ctgtaccact ctgaggcctt cacagtgaac tttggggaca cagaagaggc caagaaacag 540
atcaatgact atgtggaaaa gggcacccag ggcaagattg tggaccttgt gaaagagctg 600
gacagggaca ctgtgtttgc ccttgtgaac tacatcttct tcaagggcaa gtgggagagg 660
ccctttgaag tgaaggacac tgaggaagag gacttccatg tggaccaagt gaccacagtg 720
aaggtgccaa tgatgaagag actggggatg ttcaatatcc agcactgcaa gaaactgagc 780
agctgggtgc tgctgatgaa gtacctgggc aatgctacag ccatattctt tctgcctgat 840
gagggcaagc tgcagcacct ggaaaatgag ctgacccatg acatcatcac caaatttctg 900
gaaaatgagg acagaagatc tgccagcctg catctgccca agctgagcat cacaggcaca 960
tatgacctga agtctgtgct gggacagctg ggaatcacca aggtgttcag caatggggca 1020
gacctgagtg gagtgacaga ggaagcccct ctgaagctgt ccaaggctgt gcacaaggca 1080
gtgctgacca ttgatgagaa gggcacagag gctgctgggg ccatgtttct ggaagccatc 1140
cccatgtcca tccccccaga agtgaagttc aacaagccct ttgtgttcct gatgattgag 1200
cagaacacca agagccccct gttcatgggc aaggttgtga accccaccca gaaatga 1257
<210> 15
<211> 1257
<212> DNA
<213> Artificial Sequence
<220>
<223> sohAAT complementary strand
<400> 15
tacgggtcga gacacaggac cccgtaagac gacgaccgac cggacacgac agaccacgga 60
cacagggacc gactcctggg agtcccccta cgacgggtct tttgtctgtg gagggtggta 120
ctggtcctgg tggggtggaa gttgttctag tgggggttgg accgtctcaa acggaagtcg 180
gacatgtctg tcgaccgggt ggtctcgttg tcgtggttgt agaaaaagtc gggacacagg 240
taacggtgtc ggaaacggta cgactcggac ccgtggttcc gactgtgggt actactctag 300
gaccttccgg acttgaagtt ggactgtctc tagggactcc gggtctaggt actcccgaag 360
gtccttgacg actcttggga cttggtcggt ctgtcggtcg acgtcgactg ttgtccgtta 420
cccgacaagg acagactccc ggacttcgac cacctgttca aagaccttct acacttcttc 480
gacatggtga gactccggaa gtgtcacttg aaacccctgt gtcttctccg gttctttgtc 540
tagttactga tacacctttt cccgtgggtc ccgttctaac acctggaaca ctttctcgac 600
ctgtccctgt gacacaaacg ggaacacttg atgtagaaga agttcccgtt caccctctcc 660
gggaaacttc acttcctgtg actccttctc ctgaaggtac acctggttca ctggtgtcac 720
ttccacggtt actacttctc tgacccctac aagttatagg tcgtgacgtt ctttgactcg 780
tcgacccacg acgactactt catggacccg ttacgatgtc ggtataagaa agacggacta 840
ctcccgttcg acgtcgtgga ccttttactc gactgggtac tgtagtagtg gtttaaagac 900
cttttactcc tgtcttctag acggtcggac gtagacgggt tcgactcgta gtgtccgtgt 960
atactggact tcagacacga ccctgtcgac ccttagtggt tccacaagtc gttaccccgt 1020
ctggactcac ctcactgtct ccttcgggga gacttcgaca ggttccgaca cgtgttccgt 1080
cacgactggt aactactctt cccgtgtctc cgacgacccc ggtacaaaga ccttcggtag 1140
gggtacaggt aggggggtct tcacttcaag ttgttcggga aacacaagga ctactaactc 1200
gtcttgtggt tctcggggga caagtacccg ttccaacact tggggtgggt ctttact 1257
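Note: as printed, SEQ ID NO: 15 is the base-by-base complement of SEQ ID NO: 14 written in the same left-to-right order, not the reverse complement. The sketch below is an illustration, not part of the filed listing. It checks that relationship on the first 60 bases of each record; the helper name `complement` and the variable names are assumptions made for the example.

```python
# Translation table for lowercase DNA bases as used in the listing.
COMPLEMENT = str.maketrans("acgt", "tgca")


def complement(seq: str) -> str:
    """Return the base-wise complement, ignoring the 10-base spacing."""
    return seq.replace(" ", "").translate(COMPLEMENT)


# First sequence line (bases 1-60) of SEQ ID NO: 14 and SEQ ID NO: 15,
# copied from the listing above without the trailing position counter.
seq14_first60 = (
    "atgcccagct ctgtgtcctg gggcattctg ctgctggctg gcctgtgctg tctggtgcct"
)
seq15_first60 = (
    "tacgggtcga gacacaggac cccgtaagac gacgaccgac cggacacgac agaccacgga"
)

# Same orientation: position i of SEQ ID NO: 15 pairs with position i
# of SEQ ID NO: 14, so no reversal is applied before comparing.
matches = complement(seq14_first60) == seq15_first60.replace(" ", "")
print(matches)
```

To obtain the conventional 5'-to-3' reverse complement instead, the result of `complement` would additionally be reversed, for example with `complement(seq)[::-1]`.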
<210> 16
<211> 419
<212> PRT
<213> Homo sapiens
<400> 16
Ala Glu Asp Pro Gln Gly Asp Ala Ala Gln Lys Thr Asp Thr Ser His
1 5 10 15
His Asp Gln Asp His Pro Thr Phe Ala Glu Asp Pro Gln Gly Asp Ala
20 25 30
Ala Gln Lys Thr Asp Thr Ser His His Asp Gln Asp His Pro Thr Phe
35 40 45
Asn Lys Ile Thr Pro Asn Leu Ala Glu Phe Ala Phe Ser Leu Tyr Arg
50 55 60
Gln Leu Ala His Gln Ser Asn Ser Thr Asn Ile Phe Phe Ser Pro Val
65 70 75 80
Ser Ile Ala Thr Ala Phe Ala Met Leu Ser Leu Gly Thr Lys Ala Asp
85 90 95
Thr His Asp Glu Ile Leu Glu Gly Leu Asn Phe Asn Leu Thr Glu Ile
100 105 110
Pro Glu Ala Gln Ile His Glu Gly Phe Gln Glu Leu Leu Arg Thr Leu
115 120 125
Asn Gln Pro Asp Ser Gln Leu Gln Leu Thr Thr Gly Asn Gly Leu Phe
130 135 140
Leu Ser Glu Gly Leu Lys Leu Val Asp Lys Phe Leu Glu Asp Val Lys
145 150 155 160
Lys Leu Tyr His Ser Glu Ala Phe Thr Val Asn Phe Gly Asp Thr Glu
165 170 175
Glu Ala Lys Lys Gln Ile Asn Asp Tyr Val Glu Lys Gly Thr Gln Gly
180 185 190
Lys Ile Val Asp Leu Val Lys Glu Leu Asp Arg Asp Thr Val Phe Ala
195 200 205
Leu Val Asn Tyr Ile Phe Phe Lys Gly Lys Trp Glu Arg Pro Phe Glu
210 215 220
Val Lys Asp Thr Glu Glu Glu Asp Phe His Val Asp Gln Val Thr Thr
225 230 235 240
Val Lys Val Pro Met Met Lys Arg Leu Gly Met Phe Asn Ile Gln His
245 250 255
Cys Lys Lys Leu Ser Ser Trp Val Leu Leu Met Lys Tyr Leu Gly Asn
260 265 270
Ala Thr Ala Ile Phe Phe Leu Pro Asp Glu Gly Lys Leu Gln His Leu
275 280 285
Glu Asn Glu Leu Thr His Asp Ile Ile Thr Lys Phe Leu Glu Asn Glu
290 295 300
Asp Arg Arg Ser Ala Ser Leu His Leu Pro Lys Leu Ser Ile Thr Gly
305 310 315 320
Thr Tyr Asp Leu Lys Ser Val Leu Gly Gln Leu Gly Ile Thr Lys Val
325 330 335
Phe Ser Asn Gly Ala Asp Leu Ser Gly Val Thr Glu Glu Ala Pro Leu
340 345 350
Lys Leu Ser Lys Ala Val His Lys Ala Val Leu Thr Ile Asp Glu Lys
355 360 365
Gly Thr Glu Ala Ala Gly Ala Met Phe Leu Glu Ala Ile Pro Met Ser
370 375 380
Ile Pro Pro Glu Val Lys Phe Asn Lys Pro Phe Val Phe Leu Met Ile
385 390 395 400
Glu Gln Asn Thr Lys Ser Pro Leu Phe Met Gly Lys Val Val Asn Pro
405 410 415
Thr Gln Lys
<210> 17
<211> 5013
<212> DNA
<213> Artificial Sequence
<220>
<223> codon-optimised FVIII transgene (N6)
<400> 17
atgcagattg agctgagcac ctgcttcttc ctgtgcctgc tgaggttctg cttctctgcc 60
accaggagat actacctggg ggctgtggag ctgagctggg actacatgca gtctgacctg 120
ggggagctgc ctgtggatgc caggttcccc cccagagtgc ccaagagctt ccccttcaac 180
acctctgtgg tgtacaagaa gaccctgttt gtggagttca ctgaccacct gttcaacatt 240
gccaagccca ggcccccctg gatgggcctg ctgggcccca ccatccaggc tgaggtgtat 300
gacactgtgg tgatcaccct gaagaacatg gccagccacc ctgtgagcct gcatgctgtg 360
ggggtgagct actggaaggc ctctgagggg gctgagtatg atgaccagac cagccagagg 420
gagaaggagg atgacaaggt gttccctggg ggcagccaca cctatgtgtg gcaggtgctg 480
aaggagaatg gccccatggc ctctgacccc ctgtgcctga cctacagcta cctgagccat 540
gtggacctgg tgaaggacct gaactctggc ctgattgggg ccctgctggt gtgcagggag 600
ggcagcctgg ccaaggagaa gacccagacc ctgcacaagt tcatcctgct gtttgctgtg 660
tttgatgagg gcaagagctg gcactctgaa accaagaaca gcctgatgca ggacagggat 720
gctgcctctg ccagggcctg gcccaagatg cacactgtga atggctatgt gaacaggagc 780
ctgcctggcc tgattggctg ccacaggaag tctgtgtact ggcatgtgat tggcatgggc 840
accacccctg aggtgcacag catcttcctg gagggccaca ccttcctggt caggaaccac 900
aggcaggcca gcctggagat cagccccatc accttcctga ctgcccagac cctgctgatg 960
gacctgggcc agttcctgct gttctgccac atcagcagcc accagcatga tggcatggag 1020
gcctatgtga aggtggacag ctgccctgag gagccccagc tgaggatgaa gaacaatgag 1080
gaggctgagg actatgatga tgacctgact gactctgaga tggatgtggt gaggtttgat 1140
gatgacaaca gccccagctt catccagatc aggtctgtgg ccaagaagca ccccaagacc 1200
tgggtgcact acattgctgc tgaggaggag gactgggact atgcccccct ggtgctggcc 1260
cctgatgaca ggagctacaa gagccagtac ctgaacaatg gcccccagag gattggcagg 1320
aagtacaaga aggtcaggtt catggcctac actgatgaaa ccttcaagac cagggaggcc 1380
atccagcatg agtctggcat cctgggcccc ctgctgtatg gggaggtggg ggacaccctg 1440
ctgatcatct tcaagaacca ggccagcagg ccctacaaca tctaccccca tggcatcact 1500
gatgtgaggc ccctgtacag caggaggctg cccaaggggg tgaagcacct gaaggacttc 1560
cccatcctgc ctggggagat cttcaagtac aagtggactg tgactgtgga ggatggcccc 1620
accaagtctg accccaggtg cctgaccaga tactacagca gctttgtgaa catggagagg 1680
gacctggcct ctggcctgat tggccccctg ctgatctgct acaaggagtc tgtggaccag 1740
aggggcaacc agatcatgtc tgacaagagg aatgtgatcc tgttctctgt gtttgatgag 1800
aacaggagct ggtacctgac tgagaacatc cagaggttcc tgcccaaccc tgctggggtg 1860
cagctggagg accctgagtt ccaggccagc aacatcatgc acagcatcaa tggctatgtg 1920
tttgacagcc tgcagctgtc tgtgtgcctg catgaggtgg cctactggta catcctgagc 1980
attggggccc agactgactt cctgtctgtg ttcttctctg gctacacctt caagcacaag 2040
atggtgtatg aggacaccct gaccctgttc cccttctctg gggagactgt gttcatgagc 2100
atggagaacc ctggcctgtg gattctgggc tgccacaact ctgacttcag gaacaggggc 2160
atgactgccc tgctgaaagt ctccagctgt gacaagaaca ctggggacta ctatgaggac 2220
agctatgagg acatctctgc ctacctgctg agcaagaaca atgccattga gcccaggagc 2280
ttcagccaga acagcaggca ccccagcacc aggcagaagc agttcaatgc caccaccatc 2340
cctgagaatg acatagagaa gacagaccca tggtttgccc accggacccc catgcccaag 2400
atccagaatg tgagcagctc tgacctgctg atgctgctga ggcagagccc caccccccat 2460
ggcctgagcc tgtctgacct gcaggaggcc aagtatgaaa ccttctctga tgaccccagc 2520
cctggggcca ttgacagcaa caacagcctg tctgagatga cccacttcag gccccagctg 2580
caccactctg gggacatggt gttcacccct gagtctggcc tgcagctgag gctgaatgag 2640
aagctgggca ccactgctgc cactgagctg aagaagctgg acttcaaagt ctccagcacc 2700
agcaacaacc tgatcagcac catcccctct gacaacctgg ctgctggcac tgacaacacc 2760
agcagcctgg gcccccccag catgcctgtg cactatgaca gccagctgga caccaccctg 2820
tttggcaaga agagcagccc cctgactgag tctgggggcc ccctgagcct gtctgaggag 2880
aacaatgaca gcaagctgct ggagtctggc ctgatgaaca gccaggagag cagctggggc 2940
aagaatgtga gcagcaggga gatcaccagg accaccctgc agtctgacca ggaggagatt 3000
gactatgatg acaccatctc tgtggagatg aagaaggagg actttgacat ctacgacgag 3060
gacgagaacc agagccccag gagcttccag aagaagacca ggcactactt cattgctgct 3120
gtggagaggc tgtgggacta tggcatgagc agcagccccc atgtgctgag gaacagggcc 3180
cagtctggct ctgtgcccca gttcaagaag gtggtgttcc aggagttcac tgatggcagc 3240
ttcacccagc ccctgtacag aggggagctg aatgagcacc tgggcctgct gggcccctac 3300
atcagggctg aggtggagga caacatcatg gtgaccttca ggaaccaggc cagcaggccc 3360
tacagcttct acagcagcct gatcagctat gaggaggacc agaggcaggg ggctgagccc 3420
aggaagaact ttgtgaagcc caatgaaacc aagacctact tctggaaggt gcagcaccac 3480
atggccccca ccaaggatga gtttgactgc aaggcctggg cctacttctc tgatgtggac 3540
ctggagaagg atgtgcactc tggcctgatt ggccccctgc tggtgtgcca caccaacacc 3600
ctgaaccctg cccatggcag gcaggtgact gtgcaggagt ttgccctgtt cttcaccatc 3660
tttgatgaaa ccaagagctg gtacttcact gagaacatgg agaggaactg cagggccccc 3720
tgcaacatcc agatggagga ccccaccttc aaggagaact acaggttcca tgccatcaat 3780
ggctacatca tggacaccct gcctggcctg gtgatggccc aggaccagag gatcaggtgg 3840
tacctgctga gcatgggcag caatgagaac atccacagca tccacttctc tggccatgtg 3900
ttcactgtga ggaagaagga ggagtacaag atggccctgt acaacctgta ccctggggtg 3960
tttgagactg tggagatgct gcccagcaag gctggcatct ggagggtgga gtgcctgatt 4020
ggggagcacc tgcatgctgg catgagcacc ctgttcctgg tgtacagcaa caagtgccag 4080
acccccctgg gcatggcctc tggccacatc agggacttcc agatcactgc ctctggccag 4140
tatggccagt gggcccccaa gctggccagg ctgcactact ctggcagcat caatgcctgg 4200
agcaccaagg agcccttcag ctggatcaag gtggacctgc tggcccccat gatcatccat 4260
ggcatcaaga cccagggggc caggcagaag ttcagcagcc tgtacatcag ccagttcatc 4320
atcatgtaca gcctggatgg caagaagtgg cagacctaca ggggcaacag cactggcacc 4380
ctgatggtgt tctttggcaa tgtggacagc tctggcatca agcacaacat cttcaacccc 4440
cccatcattg ccagatacat caggctgcac cccacccact acagcatcag gagcaccctg 4500
aggatggagc tgatgggctg tgacctgaac agctgcagca tgcccctggg catggagagc 4560
aaggccatct ctgatgccca gatcactgcc agcagctact tcaccaacat gtttgccacc 4620
tggagcccca gcaaggccag gctgcacctg cagggcagga gcaatgcctg gaggccccag 4680
gtcaacaacc ccaaggagtg gctgcaggtg gacttccaga agaccatgaa ggtgactggg 4740
gtgaccaccc agggggtgaa gagcctgctg accagcatgt atgtgaagga gttcctgatc 4800
agcagcagcc aggatggcca ccagtggacc ctgttcttcc agaatggcaa ggtgaaggtg 4860
ttccagggca accaggacag cttcacccct gtggtgaaca gcctggaccc ccccctgctg 4920
accagatacc tgaggattca cccccagagc tgggtgcacc agattgccct gaggatggag 4980
gtgctgggct gtgaggccca ggacctgtac tga 5013
<210> 18
<211> 4425
<212> DNA
<213> Artificial Sequence
<220>
<223> codon-optimised FVIII transgene (V3)
<400> 18
atgcagattg agctgagcac ctgcttcttc ctgtgcctgc tgaggttctg cttctctgcc 60
accaggagat actacctggg ggctgtggag ctgagctggg actacatgca gtctgacctg 120
ggggagctgc ctgtggatgc caggttcccc cccagagtgc ccaagagctt ccccttcaac 180
acctctgtgg tgtacaagaa gaccctgttt gtggagttca ctgaccacct gttcaacatt 240
gccaagccca ggcccccctg gatgggcctg ctgggcccca ccatccaggc tgaggtgtat 300
gacactgtgg tgatcaccct gaagaacatg gccagccacc ctgtgagcct gcatgctgtg 360
ggggtgagct actggaaggc ctctgagggg gctgagtatg atgaccagac cagccagagg 420
gagaaggagg atgacaaggt gttccctggg ggcagccaca cctatgtgtg gcaggtgctg 480
aaggagaatg gccccatggc ctctgacccc ctgtgcctga cctacagcta cctgagccat 540
gtggacctgg tgaaggacct gaactctggc ctgattgggg ccctgctggt gtgcagggag 600
ggcagcctgg ccaaggagaa gacccagacc ctgcacaagt tcatcctgct gtttgctgtg 660
tttgatgagg gcaagagctg gcactctgaa accaagaaca gcctgatgca ggacagggat 720
gctgcctctg ccagggcctg gcccaagatg cacactgtga atggctatgt gaacaggagc 780
ctgcctggcc tgattggctg ccacaggaag tctgtgtact ggcatgtgat tggcatgggc 840
accacccctg aggtgcacag catcttcctg gagggccaca ccttcctggt caggaaccac 900
aggcaggcca gcctggagat cagccccatc accttcctga ctgcccagac cctgctgatg 960
gacctgggcc agttcctgct gttctgccac atcagcagcc accagcatga tggcatggag 1020
gcctatgtga aggtggacag ctgccctgag gagccccagc tgaggatgaa gaacaatgag 1080
gaggctgagg actatgatga tgacctgact gactctgaga tggatgtggt gaggtttgat 1140
gatgacaaca gccccagctt catccagatc aggtctgtgg ccaagaagca ccccaagacc 1200
tgggtgcact acattgctgc tgaggaggag gactgggact atgcccccct ggtgctggcc 1260
cctgatgaca ggagctacaa gagccagtac ctgaacaatg gcccccagag gattggcagg 1320
aagtacaaga aggtcaggtt catggcctac actgatgaaa ccttcaagac cagggaggcc 1380
atccagcatg agtctggcat cctgggcccc ctgctgtatg gggaggtggg ggacaccctg 1440
ctgatcatct tcaagaacca ggccagcagg ccctacaaca tctaccccca tggcatcact 1500
gatgtgaggc ccctgtacag caggaggctg cccaaggggg tgaagcacct gaaggacttc 1560
cccatcctgc ctggggagat cttcaagtac aagtggactg tgactgtgga ggatggcccc 1620
accaagtctg accccaggtg cctgaccaga tactacagca gctttgtgaa catggagagg 1680
gacctggcct ctggcctgat tggccccctg ctgatctgct acaaggagtc tgtggaccag 1740
aggggcaacc agatcatgtc tgacaagagg aatgtgatcc tgttctctgt gtttgatgag 1800
aacaggagct ggtacctgac tgagaacatc cagaggttcc tgcccaaccc tgctggggtg 1860
cagctggagg accctgagtt ccaggccagc aacatcatgc acagcatcaa tggctatgtg 1920
tttgacagcc tgcagctgtc tgtgtgcctg catgaggtgg cctactggta catcctgagc 1980
attggggccc agactgactt cctgtctgtg ttcttctctg gctacacctt caagcacaag 2040
atggtgtatg aggacaccct gaccctgttc cccttctctg gggagactgt gttcatgagc 2100
atggagaacc ctggcctgtg gattctgggc tgccacaact ctgacttcag gaacaggggc 2160
atgactgccc tgctgaaagt ctccagctgt gacaagaaca ctggggacta ctatgaggac 2220
agctatgagg acatctctgc ctacctgctg agcaagaaca atgccattga gcccaggagc 2280
ttcagccaga atgccactaa tgtgtctaac aacagcaaca ccagcaatga cagcaatgtg 2340
tctcccccag tgctgaagag gcaccagagg gagatcacca ggaccaccct gcagtctgac 2400
caggaggaga ttgactatga tgacaccatc tctgtggaga tgaagaagga ggactttgac 2460
atctacgacg aggacgagaa ccagagcccc aggagcttcc agaagaagac caggcactac 2520
ttcattgctg ctgtggagag gctgtgggac tatggcatga gcagcagccc ccatgtgctg 2580
aggaacaggg cccagtctgg ctctgtgccc cagttcaaga aggtggtgtt ccaggagttc 2640
actgatggca gcttcaccca gcccctgtac agaggggagc tgaatgagca cctgggcctg 2700
ctgggcccct acatcagggc tgaggtggag gacaacatca tggtgacctt caggaaccag 2760
gccagcaggc cctacagctt ctacagcagc ctgatcagct atgaggagga ccagaggcag 2820
ggggctgagc ccaggaagaa ctttgtgaag cccaatgaaa ccaagaccta cttctggaag 2880
gtgcagcacc acatggcccc caccaaggat gagtttgact gcaaggcctg ggcctacttc 2940
tctgatgtgg acctggagaa ggatgtgcac tctggcctga ttggccccct gctggtgtgc 3000
cacaccaaca ccctgaaccc tgcccatggc aggcaggtga ctgtgcagga gtttgccctg 3060
ttcttcacca tctttgatga aaccaagagc tggtacttca ctgagaacat ggagaggaac 3120
tgcagggccc cctgcaacat ccagatggag gaccccacct tcaaggagaa ctacaggttc 3180
catgccatca atggctacat catggacacc ctgcctggcc tggtgatggc ccaggaccag 3240
aggatcaggt ggtacctgct gagcatgggc agcaatgaga acatccacag catccacttc 3300
tctggccatg tgttcactgt gaggaagaag gaggagtaca agatggccct gtacaacctg 3360
taccctgggg tgtttgagac tgtggagatg ctgcccagca aggctggcat ctggagggtg 3420
gagtgcctga ttggggagca cctgcatgct ggcatgagca ccctgttcct ggtgtacagc 3480
aacaagtgcc agacccccct gggcatggcc tctggccaca tcagggactt ccagatcact 3540
gcctctggcc agtatggcca gtgggccccc aagctggcca ggctgcacta ctctggcagc 3600
atcaatgcct ggagcaccaa ggagcccttc agctggatca aggtggacct gctggccccc 3660
atgatcatcc atggcatcaa gacccagggg gccaggcaga agttcagcag cctgtacatc 3720
agccagttca tcatcatgta cagcctggat ggcaagaagt ggcagaccta caggggcaac 3780
agcactggca ccctgatggt gttctttggc aatgtggaca gctctggcat caagcacaac 3840
atcttcaacc cccccatcat tgccagatac atcaggctgc accccaccca ctacagcatc 3900
aggagcaccc tgaggatgga gctgatgggc tgtgacctga acagctgcag catgcccctg 3960
ggcatggaga gcaaggccat ctctgatgcc cagatcactg ccagcagcta cttcaccaac 4020
atgtttgcca cctggagccc cagcaaggcc aggctgcacc tgcagggcag gagcaatgcc 4080
tggaggcccc aggtcaacaa ccccaaggag tggctgcagg tggacttcca gaagaccatg 4140
aaggtgactg gggtgaccac ccagggggtg aagagcctgc tgaccagcat gtatgtgaag 4200
gagttcctga tcagcagcag ccaggatggc caccagtgga ccctgttctt ccagaatggc 4260
aaggtgaagg tgttccaggg caaccaggac agcttcaccc ctgtggtgaa cagcctggac 4320
ccccccctgc tgaccagata cctgaggatt cacccccaga gctgggtgca ccagattgcc 4380
ctgaggatgg aggtgctggg ctgtgaggcc caggacctgt actga 4425
<210> 19
<211> 5013
<212> DNA
<213> Artificial Sequence
<220>
<223> codon-optimised FVIII transgene (N6) complementary strand
<400> 19
tacgtctaac tcgactcgtg gacgaagaag gacacggacg actccaagac gaagagacgg 60
tggtcctcta tgatggaccc ccgacacctc gactcgaccc tgatgtacgt cagactggac 120
cccctcgacg gacacctacg gtccaagggg gggtctcacg ggttctcgaa ggggaagttg 180
tggagacacc acatgttctt ctgggacaaa cacctcaagt gactggtgga caagttgtaa 240
cggttcgggt ccggggggac ctacccggac gacccggggt ggtaggtccg actccacata 300
ctgtgacacc actagtggga cttcttgtac cggtcggtgg gacactcgga cgtacgacac 360
ccccactcga tgaccttccg gagactcccc cgactcatac tactggtctg gtcggtctcc 420
ctcttcctcc tactgttcca caagggaccc ccgtcggtgt ggatacacac cgtccacgac 480
ttcctcttac cggggtaccg gagactgggg gacacggact ggatgtcgat ggactcggta 540
cacctggacc acttcctgga cttgagaccg gactaacccc gggacgacca cacgtccctc 600
ccgtcggacc ggttcctctt ctgggtctgg gacgtgttca agtaggacga caaacgacac 660
aaactactcc cgttctcgac cgtgagactt tggttcttgt cggactacgt cctgtcccta 720
cgacggagac ggtcccggac cgggttctac gtgtgacact taccgataca cttgtcctcg 780
gacggaccgg actaaccgac ggtgtccttc agacacatga ccgtacacta accgtacccg 840
tggtggggac tccacgtgtc gtagaaggac ctcccggtgt ggaaggacca gtccttggtg 900
tccgtccggt cggacctcta gtcggggtag tggaaggact gacgggtctg ggacgactac 960
ctggacccgg tcaaggacga caagacggtg tagtcgtcgg tggtcgtact accgtacctc 1020
cggatacact tccacctgtc gacgggactc ctcggggtcg actcctactt cttgttactc 1080
ctccgactcc tgatactact actggactga ctgagactct acctacacca ctccaaacta 1140
ctactgttgt cggggtcgaa gtaggtctag tccagacacc ggttcttcgt ggggttctgg 1200
acccacgtga tgtaacgacg actcctcctc ctgaccctga tacgggggga ccacgaccgg 1260
ggactactgt cctcgatgtt ctcggtcatg gacttgttac cgggggtctc ctaaccgtcc 1320
ttcatgttct tccagtccaa gtaccggatg tgactacttt ggaagttctg gtccctccgg 1380
taggtcgtac tcagaccgta ggacccgggg gacgacatac ccctccaccc cctgtgggac 1440
gactagtaga agttcttggt ccggtcgtcc gggatgttgt agatgggggt accgtagtga 1500
ctacactccg gggacatgtc gtcctccgac gggttccccc acttcgtgga cttcctgaag 1560
gggtaggacg gacccctcta gaagttcatg ttcacctgac actgacacct cctaccgggg 1620
tggttcagac tggggtccac ggactggtct atgatgtcgt cgaaacactt gtacctctcc 1680
ctggaccgga gaccggacta accgggggac gactagacga tgttcctcag acacctggtc 1740
tccccgttgg tctagtacag actgttctcc ttacactagg acaagagaca caaactactc 1800
ttgtcctcga ccatggactg actcttgtag gtctccaagg acgggttggg acgaccccac 1860
gtcgacctcc tgggactcaa ggtccggtcg ttgtagtacg tgtcgtagtt accgatacac 1920
aaactgtcgg acgtcgacag acacacggac gtactccacc ggatgaccat gtaggactcg 1980
taaccccggg tctgactgaa ggacagacac aagaagagac cgatgtggaa gttcgtgttc 2040
taccacatac tcctgtggga ctgggacaag gggaagagac ccctctgaca caagtactcg 2100
tacctcttgg gaccggacac ctaagacccg acggtgttga gactgaagtc cttgtccccg 2160
tactgacggg acgactttca gaggtcgaca ctgttcttgt gacccctgat gatactcctg 2220
tcgatactcc tgtagagacg gatggacgac tcgttcttgt tacggtaact cgggtcctcg 2280
aagtcggtct tgtcgtccgt ggggtcgtgg tccgtcttcg tcaagttacg gtggtggtag 2340
ggactcttac tgtatctctt ctgtctgggt accaaacggg tggcctgggg gtacgggttc 2400
taggtcttac actcgtcgag actggacgac tacgacgact ccgtctcggg gtggggggta 2460
ccggactcgg acagactgga cgtcctccgg ttcatacttt ggaagagact actggggtcg 2520
ggaccccggt aactgtcgtt gttgtcggac agactctact gggtgaagtc cggggtcgac 2580
gtggtgagac ccctgtacca caagtgggga ctcagaccgg acgtcgactc cgacttactc 2640
ttcgacccgt ggtgacgacg gtgactcgac ttcttcgacc tgaagtttca gaggtcgtgg 2700
tcgttgttgg actagtcgtg gtaggggaga ctgttggacc gacgaccgtg actgttgtgg 2760
tcgtcggacc cgggggggtc gtacggacac gtgatactgt cggtcgacct gtggtgggac 2820
aaaccgttct tctcgtcggg ggactgactc agacccccgg gggactcgga cagactcctc 2880
ttgttactgt cgttcgacga cctcagaccg gactacttgt cggtcctctc gtcgaccccg 2940
ttcttacact cgtcgtccct ctagtggtcc tggtgggacg tcagactggt cctcctctaa 3000
ctgatactac tgtggtagag acacctctac ttcttcctcc tgaaactgta gatgctgctc 3060
ctgctcttgg tctcggggtc ctcgaaggtc ttcttctggt ccgtgatgaa gtaacgacga 3120
cacctctccg acaccctgat accgtactcg tcgtcggggg tacacgactc cttgtcccgg 3180
gtcagaccga gacacggggt caagttcttc caccacaagg tcctcaagtg actaccgtcg 3240
aagtgggtcg gggacatgtc tcccctcgac ttactcgtgg acccggacga cccggggatg 3300
tagtcccgac tccacctcct gttgtagtac cactggaagt ccttggtccg gtcgtccggg 3360
atgtcgaaga tgtcgtcgga ctagtcgata ctcctcctgg tctccgtccc ccgactcggg 3420
tccttcttga aacacttcgg gttactttgg ttctggatga agaccttcca cgtcgtggtg 3480
taccgggggt ggttcctact caaactgacg ttccggaccc ggatgaagag actacacctg 3540
gacctcttcc tacacgtgag accggactaa ccgggggacg accacacggt gtggttgtgg 3600
gacttgggac gggtaccgtc cgtccactga cacgtcctca aacgggacaa gaagtggtag 3660
aaactacttt ggttctcgac catgaagtga ctcttgtacc tctccttgac gtcccggggg 3720
acgttgtagg tctacctcct ggggtggaag ttcctcttga tgtccaaggt acggtagtta 3780
ccgatgtagt acctgtggga cggaccggac cactaccggg tcctggtctc ctagtccacc 3840
atggacgact cgtacccgtc gttactcttg taggtgtcgt aggtgaagag accggtacac 3900
aagtgacact ccttcttcct cctcatgttc taccgggaca tgttggacat gggaccccac 3960
aaactctgac acctctacga cgggtcgttc cgaccgtaga cctcccacct cacggactaa 4020
cccctcgtgg acgtacgacc gtactcgtgg gacaaggacc acatgtcgtt gttcacggtc 4080
tggggggacc cgtaccggag accggtgtag tccctgaagg tctagtgacg gagaccggtc 4140
ataccggtca cccgggggtt cgaccggtcc gacgtgatga gaccgtcgta gttacggacc 4200
tcgtggttcc tcgggaagtc gacctagttc cacctggacg accgggggta ctagtaggta 4260
ccgtagttct gggtcccccg gtccgtcttc aagtcgtcgg acatgtagtc ggtcaagtag 4320
tagtacatgt cggacctacc gttcttcacc gtctggatgt ccccgttgtc gtgaccgtgg 4380
gactaccaca agaaaccgtt acacctgtcg agaccgtagt tcgtgttgta gaagttgggg 4440
gggtagtaac ggtctatgta gtccgacgtg gggtgggtga tgtcgtagtc ctcgtgggac 4500
tcctacctcg actacccgac actggacttg tcgacgtcgt acggggaccc gtacctctcg 4560
ttccggtaga gactacgggt ctagtgacgg tcgtcgatga agtggttgta caaacggtgg 4620
acctcggggt cgttccggtc cgacgtggac gtcccgtcct cgttacggac ctccggggtc 4680
cagttgttgg ggttcctcac cgacgtccac ctgaaggtct tctggtactt ccactgaccc 4740
cactggtggg tcccccactt ctcggacgac tggtcgtaca tacacttcct caaggactag 4800
tcgtcgtcgg tcctaccggt ggtcacctgg gacaagaagg tcttaccgtt ccacttccac 4860
aaggtcccgt tggtcctgtc gaagtgggga caccacttgt cggacctggg gggggacgac 4920
tggtctatgg actcctaagt gggggtctcg acccacgtgg tctaacggga ctcctacctc 4980
cacgacccga cactccgggt cctggacatg act 5013
<210> 20
<211> 4425
<212> DNA
<213> Artificial Sequence
<220>
<223> codon-optimised FVIII transgene (V3) complementary strand
<400> 20
tacgtctaac tcgactcgtg gacgaagaag gacacggacg actccaagac gaagagacgg 60
tggtcctcta tgatggaccc ccgacacctc gactcgaccc tgatgtacgt cagactggac 120
cccctcgacg gacacctacg gtccaagggg gggtctcacg ggttctcgaa ggggaagttg 180
tggagacacc acatgttctt ctgggacaaa cacctcaagt gactggtgga caagttgtaa 240
cggttcgggt ccggggggac ctacccggac gacccggggt ggtaggtccg actccacata 300
ctgtgacacc actagtggga cttcttgtac cggtcggtgg gacactcgga cgtacgacac 360
ccccactcga tgaccttccg gagactcccc cgactcatac tactggtctg gtcggtctcc 420
ctcttcctcc tactgttcca caagggaccc ccgtcggtgt ggatacacac cgtccacgac 480
ttcctcttac cggggtaccg gagactgggg gacacggact ggatgtcgat ggactcggta 540
cacctggacc acttcctgga cttgagaccg gactaacccc gggacgacca cacgtccctc 600
ccgtcggacc ggttcctctt ctgggtctgg gacgtgttca agtaggacga caaacgacac 660
aaactactcc cgttctcgac cgtgagactt tggttcttgt cggactacgt cctgtcccta 720
cgacggagac ggtcccggac cgggttctac gtgtgacact taccgataca cttgtcctcg 780
gacggaccgg actaaccgac ggtgtccttc agacacatga ccgtacacta accgtacccg 840
tggtggggac tccacgtgtc gtagaaggac ctcccggtgt ggaaggacca gtccttggtg 900
tccgtccggt cggacctcta gtcggggtag tggaaggact gacgggtctg ggacgactac 960
ctggacccgg tcaaggacga caagacggtg tagtcgtcgg tggtcgtact accgtacctc 1020
cggatacact tccacctgtc gacgggactc ctcggggtcg actcctactt cttgttactc 1080
ctccgactcc tgatactact actggactga ctgagactct acctacacca ctccaaacta 1140
ctactgttgt cggggtcgaa gtaggtctag tccagacacc ggttcttcgt ggggttctgg 1200
acccacgtga tgtaacgacg actcctcctc ctgaccctga tacgggggga ccacgaccgg 1260
ggactactgt cctcgatgtt ctcggtcatg gacttgttac cgggggtctc ctaaccgtcc 1320
ttcatgttct tccagtccaa gtaccggatg tgactacttt ggaagttctg gtccctccgg 1380
taggtcgtac tcagaccgta ggacccgggg gacgacatac ccctccaccc cctgtgggac 1440
gactagtaga agttcttggt ccggtcgtcc gggatgttgt agatgggggt accgtagtga 1500
ctacactccg gggacatgtc gtcctccgac gggttccccc acttcgtgga cttcctgaag 1560
gggtaggacg gacccctcta gaagttcatg ttcacctgac actgacacct cctaccgggg 1620
tggttcagac tggggtccac ggactggtct atgatgtcgt cgaaacactt gtacctctcc 1680
ctggaccgga gaccggacta accgggggac gactagacga tgttcctcag acacctggtc 1740
tccccgttgg tctagtacag actgttctcc ttacactagg acaagagaca caaactactc 1800
ttgtcctcga ccatggactg actcttgtag gtctccaagg acgggttggg acgaccccac 1860
gtcgacctcc tgggactcaa ggtccggtcg ttgtagtacg tgtcgtagtt accgatacac 1920
aaactgtcgg acgtcgacag acacacggac gtactccacc ggatgaccat gtaggactcg 1980
taaccccggg tctgactgaa ggacagacac aagaagagac cgatgtggaa gttcgtgttc 2040
taccacatac tcctgtggga ctgggacaag gggaagagac ccctctgaca caagtactcg 2100
tacctcttgg gaccggacac ctaagacccg acggtgttga gactgaagtc cttgtccccg 2160
tactgacggg acgactttca gaggtcgaca ctgttcttgt gacccctgat gatactcctg 2220
tcgatactcc tgtagagacg gatggacgac tcgttcttgt tacggtaact cgggtcctcg 2280
aagtcggtct tacggtgatt acacagattg ttgtcgttgt ggtcgttact gtcgttacac 2340
agagggggtc acgacttctc cgtggtctcc ctctagtggt cctggtggga cgtcagactg 2400
gtcctcctct aactgatact actgtggtag agacacctct acttcttcct cctgaaactg 2460
tagatgctgc tcctgctctt ggtctcgggg tcctcgaagg tcttcttctg gtccgtgatg 2520
aagtaacgac gacacctctc cgacaccctg ataccgtact cgtcgtcggg ggtacacgac 2580
tccttgtccc gggtcagacc gagacacggg gtcaagttct tccaccacaa ggtcctcaag 2640
tgactaccgt cgaagtgggt cggggacatg tctcccctcg acttactcgt ggacccggac 2700
gacccgggga tgtagtcccg actccacctc ctgttgtagt accactggaa gtccttggtc 2760
cggtcgtccg ggatgtcgaa gatgtcgtcg gactagtcga tactcctcct ggtctccgtc 2820
ccccgactcg ggtccttctt gaaacacttc gggttacttt ggttctggat gaagaccttc 2880
cacgtcgtgg tgtaccgggg gtggttccta ctcaaactga cgttccggac ccggatgaag 2940
agactacacc tggacctctt cctacacgtg agaccggact aaccggggga cgaccacacg 3000
gtgtggttgt gggacttggg acgggtaccg tccgtccact gacacgtcct caaacgggac 3060
aagaagtggt agaaactact ttggttctcg accatgaagt gactcttgta cctctccttg 3120
acgtcccggg ggacgttgta ggtctacctc ctggggtgga agttcctctt gatgtccaag 3180
gtacggtagt taccgatgta gtacctgtgg gacggaccgg accactaccg ggtcctggtc 3240
tcctagtcca ccatggacga ctcgtacccg tcgttactct tgtaggtgtc gtaggtgaag 3300
agaccggtac acaagtgaca ctccttcttc ctcctcatgt tctaccggga catgttggac 3360
atgggacccc acaaactctg acacctctac gacgggtcgt tccgaccgta gacctcccac 3420
ctcacggact aacccctcgt ggacgtacga ccgtactcgt gggacaagga ccacatgtcg 3480
ttgttcacgg tctgggggga cccgtaccgg agaccggtgt agtccctgaa ggtctagtga 3540
cggagaccgg tcataccggt cacccggggg ttcgaccggt ccgacgtgat gagaccgtcg 3600
tagttacgga cctcgtggtt cctcgggaag tcgacctagt tccacctgga cgaccggggg 3660
tactagtagg taccgtagtt ctgggtcccc cggtccgtct tcaagtcgtc ggacatgtag 3720
tcggtcaagt agtagtacat gtcggaccta ccgttcttca ccgtctggat gtccccgttg 3780
tcgtgaccgt gggactacca caagaaaccg ttacacctgt cgagaccgta gttcgtgttg 3840
tagaagttgg gggggtagta acggtctatg tagtccgacg tggggtgggt gatgtcgtag 3900
tcctcgtggg actcctacct cgactacccg acactggact tgtcgacgtc gtacggggac 3960
ccgtacctct cgttccggta gagactacgg gtctagtgac ggtcgtcgat gaagtggttg 4020
tacaaacggt ggacctcggg gtcgttccgg tccgacgtgg acgtcccgtc ctcgttacgg 4080
acctccgggg tccagttgtt ggggttcctc accgacgtcc acctgaaggt cttctggtac 4140
ttccactgac cccactggtg ggtcccccac ttctcggacg actggtcgta catacacttc 4200
ctcaaggact agtcgtcgtc ggtcctaccg gtggtcacct gggacaagaa ggtcttaccg 4260
ttccacttcc acaaggtccc gttggtcctg tcgaagtggg gacaccactt gtcggacctg 4320
gggggggacg actggtctat ggactcctaa gtgggggtct cgacccacgt ggtctaacgg 4380
gactcctacc tccacgaccc gacactccgg gtcctggaca tgact 4425
<210> 21
<211> 1670
<212> PRT
<213> Homo sapiens
<400> 21
Met Gln Ile Glu Leu Ser Thr Cys Phe Phe Leu Cys Leu Leu Arg Phe
1 5 10 15
Cys Phe Ser Ala Thr Arg Arg Tyr Tyr Leu Gly Ala Val Glu Leu Ser
20 25 30
Trp Asp Tyr Met Gln Ser Asp Leu Gly Glu Leu Pro Val Asp Ala Arg
35 40 45
Phe Pro Pro Arg Val Pro Lys Ser Phe Pro Phe Asn Thr Ser Val Val
50 55 60
Tyr Lys Lys Thr Leu Phe Val Glu Phe Thr Asp His Leu Phe Asn Ile
65 70 75 80
Ala Lys Pro Arg Pro Pro Trp Met Gly Leu Leu Gly Pro Thr Ile Gln
85 90 95
Ala Glu Val Tyr Asp Thr Val Val Ile Thr Leu Lys Asn Met Ala Ser
100 105 110
His Pro Val Ser Leu His Ala Val Gly Val Ser Tyr Trp Lys Ala Ser
115 120 125
Glu Gly Ala Glu Tyr Asp Asp Gln Thr Ser Gln Arg Glu Lys Glu Asp
130 135 140
Asp Lys Val Phe Pro Gly Gly Ser His Thr Tyr Val Trp Gln Val Leu
145 150 155 160
Lys Glu Asn Gly Pro Met Ala Ser Asp Pro Leu Cys Leu Thr Tyr Ser
165 170 175
Tyr Leu Ser His Val Asp Leu Val Lys Asp Leu Asn Ser Gly Leu Ile
180 185 190
Gly Ala Leu Leu Val Cys Arg Glu Gly Ser Leu Ala Lys Glu Lys Thr
195 200 205
Gln Thr Leu His Lys Phe Ile Leu Leu Phe Ala Val Phe Asp Glu Gly
210 215 220
Lys Ser Trp His Ser Glu Thr Lys Asn Ser Leu Met Gln Asp Arg Asp
225 230 235 240
Ala Ala Ser Ala Arg Ala Trp Pro Lys Met His Thr Val Asn Gly Tyr
245 250 255
Val Asn Arg Ser Leu Pro Gly Leu Ile Gly Cys His Arg Lys Ser Val
260 265 270
Tyr Trp His Val Ile Gly Met Gly Thr Thr Pro Glu Val His Ser Ile
275 280 285
Phe Leu Glu Gly His Thr Phe Leu Val Arg Asn His Arg Gln Ala Ser
290 295 300
Leu Glu Ile Ser Pro Ile Thr Phe Leu Thr Ala Gln Thr Leu Leu Met
305 310 315 320
Asp Leu Gly Gln Phe Leu Leu Phe Cys His Ile Ser Ser His Gln His
325 330 335
Asp Gly Met Glu Ala Tyr Val Lys Val Asp Ser Cys Pro Glu Glu Pro
340 345 350
Gln Leu Arg Met Lys Asn Asn Glu Glu Ala Glu Asp Tyr Asp Asp Asp
355 360 365
Leu Thr Asp Ser Glu Met Asp Val Val Arg Phe Asp Asp Asp Asn Ser
370 375 380
Pro Ser Phe Ile Gln Ile Arg Ser Val Ala Lys Lys His Pro Lys Thr
385 390 395 400
Trp Val His Tyr Ile Ala Ala Glu Glu Glu Asp Trp Asp Tyr Ala Pro
405 410 415
Leu Val Leu Ala Pro Asp Asp Arg Ser Tyr Lys Ser Gln Tyr Leu Asn
420 425 430
Asn Gly Pro Gln Arg Ile Gly Arg Lys Tyr Lys Lys Val Arg Phe Met
435 440 445
Ala Tyr Thr Asp Glu Thr Phe Lys Thr Arg Glu Ala Ile Gln His Glu
450 455 460
Ser Gly Ile Leu Gly Pro Leu Leu Tyr Gly Glu Val Gly Asp Thr Leu
465 470 475 480
Leu Ile Ile Phe Lys Asn Gln Ala Ser Arg Pro Tyr Asn Ile Tyr Pro
485 490 495
His Gly Ile Thr Asp Val Arg Pro Leu Tyr Ser Arg Arg Leu Pro Lys
500 505 510
Gly Val Lys His Leu Lys Asp Phe Pro Ile Leu Pro Gly Glu Ile Phe
515 520 525
Lys Tyr Lys Trp Thr Val Thr Val Glu Asp Gly Pro Thr Lys Ser Asp
530 535 540
Pro Arg Cys Leu Thr Arg Tyr Tyr Ser Ser Phe Val Asn Met Glu Arg
545 550 555 560
Asp Leu Ala Ser Gly Leu Ile Gly Pro Leu Leu Ile Cys Tyr Lys Glu
565 570 575
Ser Val Asp Gln Arg Gly Asn Gln Ile Met Ser Asp Lys Arg Asn Val
580 585 590
Ile Leu Phe Ser Val Phe Asp Glu Asn Arg Ser Trp Tyr Leu Thr Glu
595 600 605
Asn Ile Gln Arg Phe Leu Pro Asn Pro Ala Gly Val Gln Leu Glu Asp
610 615 620
Pro Glu Phe Gln Ala Ser Asn Ile Met His Ser Ile Asn Gly Tyr Val
625 630 635 640
Phe Asp Ser Leu Gln Leu Ser Val Cys Leu His Glu Val Ala Tyr Trp
645 650 655
Tyr Ile Leu Ser Ile Gly Ala Gln Thr Asp Phe Leu Ser Val Phe Phe
660 665 670
Ser Gly Tyr Thr Phe Lys His Lys Met Val Tyr Glu Asp Thr Leu Thr
675 680 685
Leu Phe Pro Phe Ser Gly Glu Thr Val Phe Met Ser Met Glu Asn Pro
690 695 700
Gly Leu Trp Ile Leu Gly Cys His Asn Ser Asp Phe Arg Asn Arg Gly
705 710 715 720
Met Thr Ala Leu Leu Lys Val Ser Ser Cys Asp Lys Asn Thr Gly Asp
725 730 735
Tyr Tyr Glu Asp Ser Tyr Glu Asp Ile Ser Ala Tyr Leu Leu Ser Lys
740 745 750
Asn Asn Ala Ile Glu Pro Arg Ser Phe Ser Gln Asn Ser Arg His Pro
755 760 765
Ser Thr Arg Gln Lys Gln Phe Asn Ala Thr Thr Ile Pro Glu Asn Asp
770 775 780
Ile Glu Lys Thr Asp Pro Trp Phe Ala His Arg Thr Pro Met Pro Lys
785 790 795 800
Ile Gln Asn Val Ser Ser Ser Asp Leu Leu Met Leu Leu Arg Gln Ser
805 810 815
Pro Thr Pro His Gly Leu Ser Leu Ser Asp Leu Gln Glu Ala Lys Tyr
820 825 830
Glu Thr Phe Ser Asp Asp Pro Ser Pro Gly Ala Ile Asp Ser Asn Asn
835 840 845
Ser Leu Ser Glu Met Thr His Phe Arg Pro Gln Leu His His Ser Gly
850 855 860
Asp Met Val Phe Thr Pro Glu Ser Gly Leu Gln Leu Arg Leu Asn Glu
865 870 875 880
Lys Leu Gly Thr Thr Ala Ala Thr Glu Leu Lys Lys Leu Asp Phe Lys
885 890 895
Val Ser Ser Thr Ser Asn Asn Leu Ile Ser Thr Ile Pro Ser Asp Asn
900 905 910
Leu Ala Ala Gly Thr Asp Asn Thr Ser Ser Leu Gly Pro Pro Ser Met
915 920 925
Pro Val His Tyr Asp Ser Gln Leu Asp Thr Thr Leu Phe Gly Lys Lys
930 935 940
Ser Ser Pro Leu Thr Glu Ser Gly Gly Pro Leu Ser Leu Ser Glu Glu
945 950 955 960
Asn Asn Asp Ser Lys Leu Leu Glu Ser Gly Leu Met Asn Ser Gln Glu
965 970 975
Ser Ser Trp Gly Lys Asn Val Ser Ser Arg Glu Ile Thr Arg Thr Thr
980 985 990
Leu Gln Ser Asp Gln Glu Glu Ile Asp Tyr Asp Asp Thr Ile Ser Val
995 1000 1005
Glu Met Lys Lys Glu Asp Phe Asp Ile Tyr Asp Glu Asp Glu Asn
1010 1015 1020
Gln Ser Pro Arg Ser Phe Gln Lys Lys Thr Arg His Tyr Phe Ile
1025 1030 1035
Ala Ala Val Glu Arg Leu Trp Asp Tyr Gly Met Ser Ser Ser Pro
1040 1045 1050
His Val Leu Arg Asn Arg Ala Gln Ser Gly Ser Val Pro Gln Phe
1055 1060 1065
Lys Lys Val Val Phe Gln Glu Phe Thr Asp Gly Ser Phe Thr Gln
1070 1075 1080
Pro Leu Tyr Arg Gly Glu Leu Asn Glu His Leu Gly Leu Leu Gly
1085 1090 1095
Pro Tyr Ile Arg Ala Glu Val Glu Asp Asn Ile Met Val Thr Phe
1100 1105 1110
Arg Asn Gln Ala Ser Arg Pro Tyr Ser Phe Tyr Ser Ser Leu Ile
1115 1120 1125
Ser Tyr Glu Glu Asp Gln Arg Gln Gly Ala Glu Pro Arg Lys Asn
1130 1135 1140
Phe Val Lys Pro Asn Glu Thr Lys Thr Tyr Phe Trp Lys Val Gln
1145 1150 1155
His His Met Ala Pro Thr Lys Asp Glu Phe Asp Cys Lys Ala Trp
1160 1165 1170
Ala Tyr Phe Ser Asp Val Asp Leu Glu Lys Asp Val His Ser Gly
1175 1180 1185
Leu Ile Gly Pro Leu Leu Val Cys His Thr Asn Thr Leu Asn Pro
1190 1195 1200
Ala His Gly Arg Gln Val Thr Val Gln Glu Phe Ala Leu Phe Phe
1205 1210 1215
Thr Ile Phe Asp Glu Thr Lys Ser Trp Tyr Phe Thr Glu Asn Met
1220 1225 1230
Glu Arg Asn Cys Arg Ala Pro Cys Asn Ile Gln Met Glu Asp Pro
1235 1240 1245
Thr Phe Lys Glu Asn Tyr Arg Phe His Ala Ile Asn Gly Tyr Ile
1250 1255 1260
Met Asp Thr Leu Pro Gly Leu Val Met Ala Gln Asp Gln Arg Ile
1265 1270 1275
Arg Trp Tyr Leu Leu Ser Met Gly Ser Asn Glu Asn Ile His Ser
1280 1285 1290
Ile His Phe Ser Gly His Val Phe Thr Val Arg Lys Lys Glu Glu
1295 1300 1305
Tyr Lys Met Ala Leu Tyr Asn Leu Tyr Pro Gly Val Phe Glu Thr
1310 1315 1320
Val Glu Met Leu Pro Ser Lys Ala Gly Ile Trp Arg Val Glu Cys
1325 1330 1335
Leu Ile Gly Glu His Leu His Ala Gly Met Ser Thr Leu Phe Leu
1340 1345 1350
Val Tyr Ser Asn Lys Cys Gln Thr Pro Leu Gly Met Ala Ser Gly
1355 1360 1365
His Ile Arg Asp Phe Gln Ile Thr Ala Ser Gly Gln Tyr Gly Gln
1370 1375 1380
Trp Ala Pro Lys Leu Ala Arg Leu His Tyr Ser Gly Ser Ile Asn
1385 1390 1395
Ala Trp Ser Thr Lys Glu Pro Phe Ser Trp Ile Lys Val Asp Leu
1400 1405 1410
Leu Ala Pro Met Ile Ile His Gly Ile Lys Thr Gln Gly Ala Arg
1415 1420 1425
Gln Lys Phe Ser Ser Leu Tyr Ile Ser Gln Phe Ile Ile Met Tyr
1430 1435 1440
Ser Leu Asp Gly Lys Lys Trp Gln Thr Tyr Arg Gly Asn Ser Thr
1445 1450 1455
Gly Thr Leu Met Val Phe Phe Gly Asn Val Asp Ser Ser Gly Ile
1460 1465 1470
Lys His Asn Ile Phe Asn Pro Pro Ile Ile Ala Arg Tyr Ile Arg
1475 1480 1485
Leu His Pro Thr His Tyr Ser Ile Arg Ser Thr Leu Arg Met Glu
1490 1495 1500
Leu Met Gly Cys Asp Leu Asn Ser Cys Ser Met Pro Leu Gly Met
1505 1510 1515
Glu Ser Lys Ala Ile Ser Asp Ala Gln Ile Thr Ala Ser Ser Tyr
1520 1525 1530
Phe Thr Asn Met Phe Ala Thr Trp Ser Pro Ser Lys Ala Arg Leu
1535 1540 1545
His Leu Gln Gly Arg Ser Asn Ala Trp Arg Pro Gln Val Asn Asn
1550 1555 1560
Pro Lys Glu Trp Leu Gln Val Asp Phe Gln Lys Thr Met Lys Val
1565 1570 1575
Thr Gly Val Thr Thr Gln Gly Val Lys Ser Leu Leu Thr Ser Met
1580 1585 1590
Tyr Val Lys Glu Phe Leu Ile Ser Ser Ser Gln Asp Gly His Gln
1595 1600 1605
Trp Thr Leu Phe Phe Gln Asn Gly Lys Val Lys Val Phe Gln Gly
1610 1615 1620
Asn Gln Asp Ser Phe Thr Pro Val Val Asn Ser Leu Asp Pro Pro
1625 1630 1635
Leu Leu Thr Arg Tyr Leu Arg Ile His Pro Gln Ser Trp Val His
1640 1645 1650
Gln Ile Ala Leu Arg Met Glu Val Leu Gly Cys Glu Ala Gln Asp
1655 1660 1665
Leu Tyr
1670
<210> 22
<211> 1474
<212> PRT
<213> Homo sapiens
<400> 22
Met Gln Ile Glu Leu Ser Thr Cys Phe Phe Leu Cys Leu Leu Arg Phe
1 5 10 15
Cys Phe Ser Ala Thr Arg Arg Tyr Tyr Leu Gly Ala Val Glu Leu Ser
20 25 30
Trp Asp Tyr Met Gln Ser Asp Leu Gly Glu Leu Pro Val Asp Ala Arg
35 40 45
Phe Pro Pro Arg Val Pro Lys Ser Phe Pro Phe Asn Thr Ser Val Val
50 55 60
Tyr Lys Lys Thr Leu Phe Val Glu Phe Thr Asp His Leu Phe Asn Ile
65 70 75 80
Ala Lys Pro Arg Pro Pro Trp Met Gly Leu Leu Gly Pro Thr Ile Gln
85 90 95
Ala Glu Val Tyr Asp Thr Val Val Ile Thr Leu Lys Asn Met Ala Ser
100 105 110
His Pro Val Ser Leu His Ala Val Gly Val Ser Tyr Trp Lys Ala Ser
115 120 125
Glu Gly Ala Glu Tyr Asp Asp Gln Thr Ser Gln Arg Glu Lys Glu Asp
130 135 140
Asp Lys Val Phe Pro Gly Gly Ser His Thr Tyr Val Trp Gln Val Leu
145 150 155 160
Lys Glu Asn Gly Pro Met Ala Ser Asp Pro Leu Cys Leu Thr Tyr Ser
165 170 175
Tyr Leu Ser His Val Asp Leu Val Lys Asp Leu Asn Ser Gly Leu Ile
180 185 190
Gly Ala Leu Leu Val Cys Arg Glu Gly Ser Leu Ala Lys Glu Lys Thr
195 200 205
Gln Thr Leu His Lys Phe Ile Leu Leu Phe Ala Val Phe Asp Glu Gly
210 215 220
Lys Ser Trp His Ser Glu Thr Lys Asn Ser Leu Met Gln Asp Arg Asp
225 230 235 240
Ala Ala Ser Ala Arg Ala Trp Pro Lys Met His Thr Val Asn Gly Tyr
245 250 255
Val Asn Arg Ser Leu Pro Gly Leu Ile Gly Cys His Arg Lys Ser Val
260 265 270
Tyr Trp His Val Ile Gly Met Gly Thr Thr Pro Glu Val His Ser Ile
275 280 285
Phe Leu Glu Gly His Thr Phe Leu Val Arg Asn His Arg Gln Ala Ser
290 295 300
Leu Glu Ile Ser Pro Ile Thr Phe Leu Thr Ala Gln Thr Leu Leu Met
305 310 315 320
Asp Leu Gly Gln Phe Leu Leu Phe Cys His Ile Ser Ser His Gln His
325 330 335
Asp Gly Met Glu Ala Tyr Val Lys Val Asp Ser Cys Pro Glu Glu Pro
340 345 350
Gln Leu Arg Met Lys Asn Asn Glu Glu Ala Glu Asp Tyr Asp Asp Asp
355 360 365
Leu Thr Asp Ser Glu Met Asp Val Val Arg Phe Asp Asp Asp Asn Ser
370 375 380
Pro Ser Phe Ile Gln Ile Arg Ser Val Ala Lys Lys His Pro Lys Thr
385 390 395 400
Trp Val His Tyr Ile Ala Ala Glu Glu Glu Asp Trp Asp Tyr Ala Pro
405 410 415
Leu Val Leu Ala Pro Asp Asp Arg Ser Tyr Lys Ser Gln Tyr Leu Asn
420 425 430
Asn Gly Pro Gln Arg Ile Gly Arg Lys Tyr Lys Lys Val Arg Phe Met
435 440 445
Ala Tyr Thr Asp Glu Thr Phe Lys Thr Arg Glu Ala Ile Gln His Glu
450 455 460
Ser Gly Ile Leu Gly Pro Leu Leu Tyr Gly Glu Val Gly Asp Thr Leu
465 470 475 480
Leu Ile Ile Phe Lys Asn Gln Ala Ser Arg Pro Tyr Asn Ile Tyr Pro
485 490 495
His Gly Ile Thr Asp Val Arg Pro Leu Tyr Ser Arg Arg Leu Pro Lys
500 505 510
Gly Val Lys His Leu Lys Asp Phe Pro Ile Leu Pro Gly Glu Ile Phe
515 520 525
Lys Tyr Lys Trp Thr Val Thr Val Glu Asp Gly Pro Thr Lys Ser Asp
530 535 540
Pro Arg Cys Leu Thr Arg Tyr Tyr Ser Ser Phe Val Asn Met Glu Arg
545 550 555 560
Asp Leu Ala Ser Gly Leu Ile Gly Pro Leu Leu Ile Cys Tyr Lys Glu
565 570 575
Ser Val Asp Gln Arg Gly Asn Gln Ile Met Ser Asp Lys Arg Asn Val
580 585 590
Ile Leu Phe Ser Val Phe Asp Glu Asn Arg Ser Trp Tyr Leu Thr Glu
595 600 605
Asn Ile Gln Arg Phe Leu Pro Asn Pro Ala Gly Val Gln Leu Glu Asp
610 615 620
Pro Glu Phe Gln Ala Ser Asn Ile Met His Ser Ile Asn Gly Tyr Val
625 630 635 640
Phe Asp Ser Leu Gln Leu Ser Val Cys Leu His Glu Val Ala Tyr Trp
645 650 655
Tyr Ile Leu Ser Ile Gly Ala Gln Thr Asp Phe Leu Ser Val Phe Phe
660 665 670
Ser Gly Tyr Thr Phe Lys His Lys Met Val Tyr Glu Asp Thr Leu Thr
675 680 685
Leu Phe Pro Phe Ser Gly Glu Thr Val Phe Met Ser Met Glu Asn Pro
690 695 700
Gly Leu Trp Ile Leu Gly Cys His Asn Ser Asp Phe Arg Asn Arg Gly
705 710 715 720
Met Thr Ala Leu Leu Lys Val Ser Ser Cys Asp Lys Asn Thr Gly Asp
725 730 735
Tyr Tyr Glu Asp Ser Tyr Glu Asp Ile Ser Ala Tyr Leu Leu Ser Lys
740 745 750
Asn Asn Ala Ile Glu Pro Arg Ser Phe Ser Gln Asn Ala Thr Asn Val
755 760 765
Ser Asn Asn Ser Asn Thr Ser Asn Asp Ser Asn Val Ser Pro Pro Val
770 775 780
Leu Lys Arg His Gln Arg Glu Ile Thr Arg Thr Thr Leu Gln Ser Asp
785 790 795 800
Gln Glu Glu Ile Asp Tyr Asp Asp Thr Ile Ser Val Glu Met Lys Lys
805 810 815
Glu Asp Phe Asp Ile Tyr Asp Glu Asp Glu Asn Gln Ser Pro Arg Ser
820 825 830
Phe Gln Lys Lys Thr Arg His Tyr Phe Ile Ala Ala Val Glu Arg Leu
835 840 845
Trp Asp Tyr Gly Met Ser Ser Ser Pro His Val Leu Arg Asn Arg Ala
850 855 860
Gln Ser Gly Ser Val Pro Gln Phe Lys Lys Val Val Phe Gln Glu Phe
865 870 875 880
Thr Asp Gly Ser Phe Thr Gln Pro Leu Tyr Arg Gly Glu Leu Asn Glu
885 890 895
His Leu Gly Leu Leu Gly Pro Tyr Ile Arg Ala Glu Val Glu Asp Asn
900 905 910
Ile Met Val Thr Phe Arg Asn Gln Ala Ser Arg Pro Tyr Ser Phe Tyr
915 920 925
Ser Ser Leu Ile Ser Tyr Glu Glu Asp Gln Arg Gln Gly Ala Glu Pro
930 935 940
Arg Lys Asn Phe Val Lys Pro Asn Glu Thr Lys Thr Tyr Phe Trp Lys
945 950 955 960
Val Gln His His Met Ala Pro Thr Lys Asp Glu Phe Asp Cys Lys Ala
965 970 975
Trp Ala Tyr Phe Ser Asp Val Asp Leu Glu Lys Asp Val His Ser Gly
980 985 990
Leu Ile Gly Pro Leu Leu Val Cys His Thr Asn Thr Leu Asn Pro Ala
995 1000 1005
His Gly Arg Gln Val Thr Val Gln Glu Phe Ala Leu Phe Phe Thr
1010 1015 1020
Ile Phe Asp Glu Thr Lys Ser Trp Tyr Phe Thr Glu Asn Met Glu
1025 1030 1035
Arg Asn Cys Arg Ala Pro Cys Asn Ile Gln Met Glu Asp Pro Thr
1040 1045 1050
Phe Lys Glu Asn Tyr Arg Phe His Ala Ile Asn Gly Tyr Ile Met
1055 1060 1065
Asp Thr Leu Pro Gly Leu Val Met Ala Gln Asp Gln Arg Ile Arg
1070 1075 1080
Trp Tyr Leu Leu Ser Met Gly Ser Asn Glu Asn Ile His Ser Ile
1085 1090 1095
His Phe Ser Gly His Val Phe Thr Val Arg Lys Lys Glu Glu Tyr
1100 1105 1110
Lys Met Ala Leu Tyr Asn Leu Tyr Pro Gly Val Phe Glu Thr Val
1115 1120 1125
Glu Met Leu Pro Ser Lys Ala Gly Ile Trp Arg Val Glu Cys Leu
1130 1135 1140
Ile Gly Glu His Leu His Ala Gly Met Ser Thr Leu Phe Leu Val
1145 1150 1155
Tyr Ser Asn Lys Cys Gln Thr Pro Leu Gly Met Ala Ser Gly His
1160 1165 1170
Ile Arg Asp Phe Gln Ile Thr Ala Ser Gly Gln Tyr Gly Gln Trp
1175 1180 1185
Ala Pro Lys Leu Ala Arg Leu His Tyr Ser Gly Ser Ile Asn Ala
1190 1195 1200
Trp Ser Thr Lys Glu Pro Phe Ser Trp Ile Lys Val Asp Leu Leu
1205 1210 1215
Ala Pro Met Ile Ile His Gly Ile Lys Thr Gln Gly Ala Arg Gln
1220 1225 1230
Lys Phe Ser Ser Leu Tyr Ile Ser Gln Phe Ile Ile Met Tyr Ser
1235 1240 1245
Leu Asp Gly Lys Lys Trp Gln Thr Tyr Arg Gly Asn Ser Thr Gly
1250 1255 1260
Thr Leu Met Val Phe Phe Gly Asn Val Asp Ser Ser Gly Ile Lys
1265 1270 1275
His Asn Ile Phe Asn Pro Pro Ile Ile Ala Arg Tyr Ile Arg Leu
1280 1285 1290
His Pro Thr His Tyr Ser Ile Arg Ser Thr Leu Arg Met Glu Leu
1295 1300 1305
Met Gly Cys Asp Leu Asn Ser Cys Ser Met Pro Leu Gly Met Glu
1310 1315 1320
Ser Lys Ala Ile Ser Asp Ala Gln Ile Thr Ala Ser Ser Tyr Phe
1325 1330 1335
Thr Asn Met Phe Ala Thr Trp Ser Pro Ser Lys Ala Arg Leu His
1340 1345 1350
Leu Gln Gly Arg Ser Asn Ala Trp Arg Pro Gln Val Asn Asn Pro
1355 1360 1365
Lys Glu Trp Leu Gln Val Asp Phe Gln Lys Thr Met Lys Val Thr
1370 1375 1380
Gly Val Thr Thr Gln Gly Val Lys Ser Leu Leu Thr Ser Met Tyr
1385 1390 1395
Val Lys Glu Phe Leu Ile Ser Ser Ser Gln Asp Gly His Gln Trp
1400 1405 1410
Thr Leu Phe Phe Gln Asn Gly Lys Val Lys Val Phe Gln Gly Asn
1415 1420 1425
Gln Asp Ser Phe Thr Pro Val Val Asn Ser Leu Asp Pro Pro Leu
1430 1435 1440
Leu Thr Arg Tyr Leu Arg Ile His Pro Gln Ser Trp Val His Gln
1445 1450 1455
Ile Ala Leu Arg Met Glu Val Leu Gly Cys Glu Ala Gln Asp Leu
1460 1465 1470
Tyr
<210> 23
<211> 600
<212> DNA
<213> Woodchuck hepatitis virus
<400> 23
gggcccaatc aacctctgga ttacaaaatt tgtgaaagat tgactggtat tcttaactat 60
gttgctcctt ttacgctatg tggatacgct gctttaatgc ctttgtatca tgctattgct 120
tcccgtatgg ctttcatttt ctcctccttg tataaatcct ggttgctgtc tctttatgag 180
gagttgtggc ccgttgtcag gcaacgtggc gtggtgtgca ctgtgtttgc tgacgcaacc 240
cccactggtt ggggcattgc caccacctgt cagctccttt ccgggacttt cgctttcccc 300
ctccctattg ccacggcgga actcatcgcc gcctgccttg cccgctgctg gacaggggct 360
cggctgttgg gcactgacaa ttccgtggtg ttgtcgggga aatcatcgtc ctttccttgg 420
ctgctcgcct gtgttgccac ctggattctg cgcgggacgt ccttctgcta cgtcccttcg 480
gccctcaatc cagcggacct tccttcccgc ggcctgctgc cggctctgcg gcctcttccg 540
cgtcttcgcc ttcgccctca gacgagtcgg atctcccttt gggccgcctc cccgcaagct 600
<210> 24
<211> 7349
<212> DNA
<213> Artificial Sequence
<220>
<223> pGM407
<400> 24
ggtacctcaa tattggccat tagccatatt attcattggt tatatagcat aaatcaatat 60
tggctattgg ccattgcata cgttgtatct atatcataat atgtacattt atattggctc 120
atgtccaata tgaccgccat gttggcattg attattgact agttattaat agtaatcaat 180
tacggggtca ttagttcata gcccatatat ggagttccgc gttacataac ttacggtaaa 240
tggcccgcct ggctgaccgc ccaacgaccc ccgcccattg acgtcaataa tgacgtatgt 300
tcccatagta acgccaatag ggactttcca ttgacgtcaa tgggtggagt atttacggta 360
aactgcccac ttggcagtac atcaagtgta tcatatgcca agtccgcccc ctattgacgt 420
caatgacggt aaatggcccg cctggcatta tgcccagtac atgaccttac gggactttcc 480
tacttggcag tacatctacg tattagtcat cgctattacc atggtgatgc ggttttggca 540
gtacaccaat gggcgtggat agcggtttga ctcacgggga tttccaagtc tccaccccat 600
tgacgtcaat gggagtttgt tttggcacca aaatcaacgg gactttccaa aatgtcgtaa 660
caactgcgat cgcccgcccc gttgacgcaa atgggcggta ggcgtgtacg gtgggaggtc 720
tatataagca gagctcgctg gcttgtaact cagtctctta ctaggagacc agcttgagcc 780
tgggtgttcg ctggttagcc taacctggtt ggccaccagg ggtaaggact ccttggctta 840
gaaagctaat aaacttgcct gcattagagc ttatctgagt caagtgtcct cattgacgcc 900
tcactctctt gaacgggaat cttccttact gggttctctc tctgacccag gcgagagaaa 960
ctccagcagt ggcgcccgaa cagggacttg agtgagagtg taggcacgta cagctgagaa 1020
ggcgtcggac gcgaaggaag cgcggggtgc gacgcgacca agaaggagac ttggtgagta 1080
ggcttctcga gtgccgggaa aaagctcgag cctagttaga ggactaggag aggccgtagc 1140
cgtaactact cttgggcaag tagggcaggc ggtgggtacg caatgggggc ggctacctca 1200
gcactaaata ggagacaatt agaccaattt gagaaaatac gacttcgccc gaacggaaag 1260
aaaaagtacc aaattaaaca tttaatatgg gcaggcaagg agatggagcg cttcggcctc 1320
catgagaggt tgttggagac agaggagggg tgtaaaagaa tcatagaagt cctctacccc 1380
ctagaaccaa caggatcgga gggcttaaaa agtctgttca atcttgtgtg cgtgctatat 1440
tgcttgcaca aggaacagaa agtgaaagac acagaggaag cagtagcaac agtaagacaa 1500
cactgccatc tagtggaaaa agaaaaaagt gcaacagaga catctagtgg acaaaagaaa 1560
aatgacaagg gaatagcagc gccacctggt ggcagtcaga attttccagc gcaacaacaa 1620
ggaaatgcct gggtacatgt acccttgtca ccgcgcacct taaatgcgtg ggtaaaagca 1680
gtagaggaga aaaaatttgg agcagaaata gtacccattt ttttgtttca agccctatcg 1740
aattcccgtt tgtgctaggg ttcttaggct tcttgggggc tgctggaact gcaatgggag 1800
cagcggcgac agccctgacg gtccagtctc agcatttgct tgctgggata ctgcagcagc 1860
agaagaatct gctggcggct gtggaggctc aacagcagat gttgaagctg accatttggg 1920
gtgttaaaaa cctcaatgcc cgcgtcacag cccttgagaa gtacctagag gatcaggcac 1980
gactaaactc ctgggggtgc gcatggaaac aagtatgtca taccacagtg gagtggccct 2040
ggacaaatcg gactccggat tggcaaaata tgacttggtt ggagtgggaa agacaaatag 2100
ctgatttgga aagcaacatt acgagacaat tagtgaaggc tagagaacaa gaggaaaaga 2160
atctagatgc ctatcagaag ttaactagtt ggtcagattt ctggtcttgg ttcgatttct 2220
caaaatggct taacatttta aaaatgggat ttttagtaat agtaggaata atagggttaa 2280
gattacttta cacagtatat ggatgtatag tgagggttag gcagggatat gttcctctat 2340
ctccacagat ccatatccgc ggcaatttta aaagaaaggg aggaataggg ggacagactt 2400
cagcagagag actaattaat ataataacaa cacaattaga aatacaacat ttacaaacca 2460
aaattcaaaa aattttaaat tttagagccg cggagatctg ttacataact tatggtaaat 2520
ggcctgcctg gctgactgcc caatgacccc tgcccaatga tgtcaataat gatgtatgtt 2580
cccatgtaat gccaataggg actttccatt gatgtcaatg ggtggagtat ttatggtaac 2640
tgcccacttg gcagtacatc aagtgtatca tatgccaagt atgcccccta ttgatgtcaa 2700
tgatggtaaa tggcctgcct ggcattatgc ccagtacatg accttatggg actttcctac 2760
ttggcagtac atctatgtat tagtcattgc tattaccatg ggaattcact agtggagaag 2820
agcatgcttg agggctgagt gcccctcagt gggcagagag cacatggccc acagtccctg 2880
agaagttggg gggaggggtg ggcaattgaa ctggtgccta gagaaggtgg ggcttgggta 2940
aactgggaaa gtgatgtggt gtactggctc cacctttttc cccagggtgg gggagaacca 3000
tatataagtg cagtagtctc tgtgaacatt caagcttctg ccttctccct cctgtgagtt 3060
tgctagccac catgcccagc tctgtgtcct ggggcattct gctgctggct ggcctgtgct 3120
gtctggtgcc tgtgtccctg gctgaggacc ctcaggggga tgctgcccag aaaacagaca 3180
cctcccacca tgaccaggac caccccacct tcaacaagat cacccccaac ctggcagagt 3240
ttgccttcag cctgtacaga cagctggccc accagagcaa cagcaccaac atctttttca 3300
gccctgtgtc cattgccaca gcctttgcca tgctgagcct gggcaccaag gctgacaccc 3360
atgatgagat cctggaaggc ctgaacttca acctgacaga gatccctgag gcccagatcc 3420
atgagggctt ccaggaactg ctgagaaccc tgaaccagcc agacagccag ctgcagctga 3480
caacaggcaa tgggctgttc ctgtctgagg gcctgaagct ggtggacaag tttctggaag 3540
atgtgaagaa gctgtaccac tctgaggcct tcacagtgaa ctttggggac acagaagagg 3600
ccaagaaaca gatcaatgac tatgtggaaa agggcaccca gggcaagatt gtggaccttg 3660
tgaaagagct ggacagggac actgtgtttg cccttgtgaa ctacatcttc ttcaagggca 3720
agtgggagag gccctttgaa gtgaaggaca ctgaggaaga ggacttccat gtggaccaag 3780
tgaccacagt gaaggtgcca atgatgaaga gactggggat gttcaatatc cagcactgca 3840
agaaactgag cagctgggtg ctgctgatga agtacctggg caatgctaca gccatattct 3900
ttctgcctga tgagggcaag ctgcagcacc tggaaaatga gctgacccat gacatcatca 3960
ccaaatttct ggaaaatgag gacagaagat ctgccagcct gcatctgccc aagctgagca 4020
tcacaggcac atatgacctg aagtctgtgc tgggacagct gggaatcacc aaggtgttca 4080
gcaatggggc agacctgagt ggagtgacag aggaagcccc tctgaagctg tccaaggctg 4140
tgcacaaggc agtgctgacc attgatgaga agggcacaga ggctgctggg gccatgtttc 4200
tggaagccat ccccatgtcc atccccccag aagtgaagtt caacaagccc tttgtgttcc 4260
tgatgattga gcagaacacc aagagccccc tgttcatggg caaggttgtg aaccccaccc 4320
agaaatgagg gcccaatcaa cctctggatt acaaaatttg tgaaagattg actggtattc 4380
ttaactatgt tgctcctttt acgctatgtg gatacgctgc tttaatgcct ttgtatcatg 4440
ctattgcttc ccgtatggct ttcattttct cctccttgta taaatcctgg ttgctgtctc 4500
tttatgagga gttgtggccc gttgtcaggc aacgtggcgt ggtgtgcact gtgtttgctg 4560
acgcaacccc cactggttgg ggcattgcca ccacctgtca gctcctttcc gggactttcg 4620
ctttccccct ccctattgcc acggcggaac tcatcgccgc ctgccttgcc cgctgctgga 4680
caggggctcg gctgttgggc actgacaatt ccgtggtgtt gtcggggaaa tcatcgtcct 4740
ttccttggct gctcgcctgt gttgccacct ggattctgcg cgggacgtcc ttctgctacg 4800
tcccttcggc cctcaatcca gcggaccttc cttcccgcgg cctgctgccg gctctgcggc 4860
ctcttccgcg tcttcgcctt cgccctcaga cgagtcggat ctccctttgg gccgcctccc 4920
cgcaagcttc gcacttttta aaagaaaagg gaggactgga tgggatttat tactccgata 4980
ggacgctggc ttgtaactca gtctcttact aggagaccag cttgagcctg ggtgttcgct 5040
ggttagccta acctggttgg ccaccagggg taaggactcc ttggcttaga aagctaataa 5100
acttgcctgc attagagctc ttacgcgtcc cgggctcgag atccgcatct caattagtca 5160
gcaaccatag tcccgcccct aactccgccc atcccgcccc taactccgcc cagttccgcc 5220
cattctccgc cccatggctg actaattttt tttatttatg cagaggccga ggccgcctcg 5280
gcctctgagc tattccagaa gtagtgagga ggcttttttg gaggcctagg cttttgcaaa 5340
aagctaactt gtttattgca gcttataatg gttacaaata aagcaatagc atcacaaatt 5400
tcacaaataa agcatttttt tcactgcatt ctagttgtgg tttgtccaaa ctcatcaatg 5460
tatcttatca tgtctgtccg cttcctcgct cactgactcg ctgcgctcgg tcgttcggct 5520
gcggcgagcg gtatcagctc actcaaaggc ggtaatacgg ttatccacag aatcagggga 5580
taacgcagga aagaacatgt gagcaaaagg ccagcaaaag gccaggaacc gtaaaaaggc 5640
cgcgttgctg gcgtttttcc ataggctccg cccccctgac gagcatcaca aaaatcgacg 5700
ctcaagtcag aggtggcgaa acccgacagg actataaaga taccaggcgt ttccccctgg 5760
aagctccctc gtgcgctctc ctgttccgac cctgccgctt accggatacc tgtccgcctt 5820
tctcccttcg ggaagcgtgg cgctttctca tagctcacgc tgtaggtatc tcagttcggt 5880
gtaggtcgtt cgctccaagc tgggctgtgt gcacgaaccc cccgttcagc ccgaccgctg 5940
cgccttatcc ggtaactatc gtcttgagtc caacccggta agacacgact tatcgccact 6000
ggcagcagcc actggtaaca ggattagcag agcgaggtat gtaggcggtg ctacagagtt 6060
cttgaagtgg tggcctaact acggctacac tagaagaaca gtatttggta tctgcgctct 6120
gctgaagcca gttaccttcg gaaaaagagt tggtagctct tgatccggca aacaaaccac 6180
cgctggtagc ggtggttttt ttgtttgcaa gcagcagatt acgcgcagaa aaaaaggatc 6240
tcaagaagat cctttgatct tttctacggg gtctgacgct cagtggaacg aaaactcacg 6300
ttaagggatt ttggtcatga gattatcaaa aaggatcttc acctagatcc ttttaaatta 6360
aaaatgaagt tttaaatcaa tctaaagtat atatgagtaa acttggtctg acagttagaa 6420
aaactcatcg agcatcaaat gaaactgcaa tttattcata tcaggattat caataccata 6480
tttttgaaaa agccgtttct gtaatgaagg agaaaactca ccgaggcagt tccataggat 6540
ggcaagatcc tggtatcggt ctgcgattcc gactcgtcca acatcaatac aacctattaa 6600
tttcccctcg tcaaaaataa ggttatcaag tgagaaatca ccatgagtga cgactgaatc 6660
cggtgagaat ggcaacagct tatgcatttc tttccagact tgttcaacag gccagccatt 6720
acgctcgtca tcaaaatcac tcgcatcaac caaaccgtta ttcattcgtg attgcgcctg 6780
agcgagacga aatacgcgat cgctgttaaa aggacaatta caaacaggaa tcgaatgcaa 6840
ccggcgcagg aacactgcca gcgcatcaac aatattttca cctgaatcag gatattcttc 6900
taatacctgg aatgctgttt ttccggggat cgcagtggtg agtaaccatg catcatcagg 6960
agtacggata aaatgcttga tggtcggaag aggcataaat tccgtcagcc agtttagtct 7020
gaccatctca tctgtaacat cattggcaac gctacctttg ccatgtttca gaaacaactc 7080
tggcgcatcg ggcttcccat acaatcgata gattgtcgca cctgattgcc cgacattatc 7140
gcgagcccat ttatacccat ataaatcagc atccatgttg gaatttaatc gcggcctaga 7200
gcaagacgtt tcccgttgaa tatggctcat aacacccctt gtattactgt ttatgtaagc 7260
agacagtttt attgttcatg atgatatatt tttatcttgt gcaatgtaac atcagagatt 7320
ttgagacaca acaattggtc gacggatcc 7349
<210> 25
<211> 10812
<212> DNA
<213> Artificial Sequence
<220>
<223> pGM411
<400> 25
ggtacctcaa tattggccat tagccatatt attcattggt tatatagcat aaatcaatat 60
tggctattgg ccattgcata cgttgtatct atatcataat atgtacattt atattggctc 120
atgtccaata tgaccgccat gttggcattg attattgact agttattaat agtaatcaat 180
tacggggtca ttagttcata gcccatatat ggagttccgc gttacataac ttacggtaaa 240
tggcccgcct ggctgaccgc ccaacgaccc ccgcccattg acgtcaataa tgacgtatgt 300
tcccatagta acgccaatag ggactttcca ttgacgtcaa tgggtggagt atttacggta 360
aactgcccac ttggcagtac atcaagtgta tcatatgcca agtccgcccc ctattgacgt 420
caatgacggt aaatggcccg cctggcatta tgcccagtac atgaccttac gggactttcc 480
tacttggcag tacatctacg tattagtcat cgctattacc atggtgatgc ggttttggca 540
gtacaccaat gggcgtggat agcggtttga ctcacgggga tttccaagtc tccaccccat 600
tgacgtcaat gggagtttgt tttggcacca aaatcaacgg gactttccaa aatgtcgtaa 660
caactgcgat cgcccgcccc gttgacgcaa atgggcggta ggcgtgtacg gtgggaggtc 720
tatataagca gagctcgctg gcttgtaact cagtctctta ctaggagacc agcttgagcc 780
tgggtgttcg ctggttagcc taacctggtt ggccaccagg ggtaaggact ccttggctta 840
gaaagctaat aaacttgcct gcattagagc ttatctgagt caagtgtcct cattgacgcc 900
tcactctctt gaacgggaat cttccttact gggttctctc tctgacccag gcgagagaaa 960
ctccagcagt ggcgcccgaa cagggacttg agtgagagtg taggcacgta cagctgagaa 1020
ggcgtcggac gcgaaggaag cgcggggtgc gacgcgacca agaaggagac ttggtgagta 1080
ggcttctcga gtgccgggaa aaagctcgag cctagttaga ggactaggag aggccgtagc 1140
cgtaactact ctgggcaagt agggcaggcg gtgggtacgc aatgggggcg gctacctcag 1200
cactaaatag gagacaatta gaccaatttg agaaaatacg acttcgcccg aacggaaaga 1260
aaaagtacca aattaaacat ttaatatggg caggcaagga gatggagcgc ttcggcctcc 1320
atgagaggtt gttggagaca gaggaggggt gtaaaagaat catagaagtc ctctaccccc 1380
tagaaccaac aggatcggag ggcttaaaaa gtctgttcaa tcttgtgtgc gtgctatatt 1440
gcttgcacaa ggaacagaaa gtgaaagaca cagaggaagc agtagcaaca gtaagacaac 1500
actgccatct agtggaaaaa gaaaaaagtg caacagagac atctagtgga caaaagaaaa 1560
atgacaaggg aatagcagcg ccacctggtg gcagtcagaa ttttccagcg caacaacaag 1620
gaaatgcctg ggtacatgta cccttgtcac cgcgcacctt aaatgcgtgg gtaaaagcag 1680
tagaggagaa aaaatttgga gcagaaatag tacccatgtt tcaagcccta tcgaattccc 1740
gtttgtgcta gggttcttag gcttcttggg ggctgctgga actgcaatgg gagcagcggc 1800
gacagccctg acggtccagt ctcagcattt gcttgctggg atactgcagc agcagaagaa 1860
tctgctggcg gctgtggagg ctcaacagca gatgttgaag ctgaccattt ggggtgttaa 1920
aaacctcaat gcccgcgtca cagcccttga gaagtaccta gaggatcagg cacgactaaa 1980
ctcctggggg tgcgcatgga aacaagtatg tcataccaca gtggagtggc cctggacaaa 2040
tcggactccg gattggcaaa atatgacttg gttggagtgg gaaagacaaa tagctgattt 2100
ggaaagcaac attacgagac aattagtgaa ggctagagaa caagaggaaa agaatctaga 2160
tgcctatcag aagttaacta gttggtcaga tttctggtct tggttcgatt tctcaaaatg 2220
gcttaacatt ttaaaaatgg gatttttagt aatagtagga ataatagggt taagattact 2280
ttacacagta tatggatgta tagtgagggt taggcaggga tatgttcctc tatctccaca 2340
gatccatatc cgcggcaatt ttaaaagaaa gggaggaata gggggacaga cttcagcaga 2400
gagactaatt aatataataa caacacaatt agaaatacaa catttacaaa ccaaaattca 2460
aaaaatttta aattttagag ccgcggagat ctcaatattg gccattagcc atattattca 2520
ttggttatat agcataaatc aatattggct attggccatt gcatacgttg tatctatatc 2580
ataatatgta catttatatt ggctcatgtc caatatgacc gccatgttgg cattgattat 2640
tgactagtta ttaatagtaa tcaattacgg ggtcattagt tcatagccca tatatggagt 2700
tccgcgttac ataacttacg gtaaatggcc cgcctggctg accgcccaac gacccccgcc 2760
cattgacgtc aataatgacg tatgttccca tagtaacgcc aatagggact ttccattgac 2820
gtcaatgggt ggagtattta cggtaaactg cccacttggc agtacatcaa gtgtatcata 2880
tgccaagtcc gccccctatt gacgtcaatg acggtaaatg gcccgcctgg cattatgccc 2940
agtacatgac cttacgggac tttcctactt ggcagtacat ctacgtatta gtcatcgcta 3000
ttaccatggt gatgcggttt tggcagtaca ccaatgggcg tggatagcgg tttgactcac 3060
ggggatttcc aagtctccac cccattgacg tcaatgggag tttgttttgg caccaaaatc 3120
aacgggactt tccaaaatgt cgtaataacc ccgccccgtt gacgcaaatg ggcggtaggc 3180
gtgtacggtg ggaggtctat ataagcagag ctcgtttagt gaaccgtcag atcactagaa 3240
gctttattgc ggtagtttat cacagttaaa ttgctaacgc agtcagtgct tctgacacaa 3300
cagtctcgaa cttaagctgc agaagttggt cgtgaggcac tgggcaggct agccaccaat 3360
gcagattgag ctgagcacct gcttcttcct gtgcctgctg aggttctgct tctctgccac 3420
caggagatac tacctggggg ctgtggagct gagctgggac tacatgcagt ctgacctggg 3480
ggagctgcct gtggatgcca ggttcccccc cagagtgccc aagagcttcc ccttcaacac 3540
ctctgtggtg tacaagaaga ccctgtttgt ggagttcact gaccacctgt tcaacattgc 3600
caagcccagg cccccctgga tgggcctgct gggccccacc atccaggctg aggtgtatga 3660
cactgtggtg atcaccctga agaacatggc cagccaccct gtgagcctgc atgctgtggg 3720
ggtgagctac tggaaggcct ctgagggggc tgagtatgat gaccagacca gccagaggga 3780
gaaggaggat gacaaggtgt tccctggggg cagccacacc tatgtgtggc aggtgctgaa 3840
ggagaatggc cccatggcct ctgaccccct gtgcctgacc tacagctacc tgagccatgt 3900
ggacctggtg aaggacctga actctggcct gattggggcc ctgctggtgt gcagggaggg 3960
cagcctggcc aaggagaaga cccagaccct gcacaagttc atcctgctgt ttgctgtgtt 4020
tgatgagggc aagagctggc actctgaaac caagaacagc ctgatgcagg acagggatgc 4080
tgcctctgcc agggcctggc ccaagatgca cactgtgaat ggctatgtga acaggagcct 4140
gcctggcctg attggctgcc acaggaagtc tgtgtactgg catgtgattg gcatgggcac 4200
cacccctgag gtgcacagca tcttcctgga gggccacacc ttcctggtca ggaaccacag 4260
gcaggccagc ctggagatca gccccatcac cttcctgact gcccagaccc tgctgatgga 4320
cctgggccag ttcctgctgt tctgccacat cagcagccac cagcatgatg gcatggaggc 4380
ctatgtgaag gtggacagct gccctgagga gccccagctg aggatgaaga acaatgagga 4440
ggctgaggac tatgatgatg acctgactga ctctgagatg gatgtggtga ggtttgatga 4500
tgacaacagc cccagcttca tccagatcag gtctgtggcc aagaagcacc ccaagacctg 4560
ggtgcactac attgctgctg aggaggagga ctgggactat gcccccctgg tgctggcccc 4620
tgatgacagg agctacaaga gccagtacct gaacaatggc ccccagagga ttggcaggaa 4680
gtacaagaag gtcaggttca tggcctacac tgatgaaacc ttcaagacca gggaggccat 4740
ccagcatgag tctggcatcc tgggccccct gctgtatggg gaggtggggg acaccctgct 4800
gatcatcttc aagaaccagg ccagcaggcc ctacaacatc tacccccatg gcatcactga 4860
tgtgaggccc ctgtacagca ggaggctgcc caagggggtg aagcacctga aggacttccc 4920
catcctgcct ggggagatct tcaagtacaa gtggactgtg actgtggagg atggccccac 4980
caagtctgac cccaggtgcc tgaccagata ctacagcagc tttgtgaaca tggagaggga 5040
cctggcctct ggcctgattg gccccctgct gatctgctac aaggagtctg tggaccagag 5100
gggcaaccag atcatgtctg acaagaggaa tgtgatcctg ttctctgtgt ttgatgagaa 5160
caggagctgg tacctgactg agaacatcca gaggttcctg cccaaccctg ctggggtgca 5220
gctggaggac cctgagttcc aggccagcaa catcatgcac agcatcaatg gctatgtgtt 5280
tgacagcctg cagctgtctg tgtgcctgca tgaggtggcc tactggtaca tcctgagcat 5340
tggggcccag actgacttcc tgtctgtgtt cttctctggc tacaccttca agcacaagat 5400
ggtgtatgag gacaccctga ccctgttccc cttctctggg gagactgtgt tcatgagcat 5460
ggagaaccct ggcctgtgga ttctgggctg ccacaactct gacttcagga acaggggcat 5520
gactgccctg ctgaaagtct ccagctgtga caagaacact ggggactact atgaggacag 5580
ctatgaggac atctctgcct acctgctgag caagaacaat gccattgagc ccaggagctt 5640
cagccagaat gccactaatg tgtctaacaa cagcaacacc agcaatgaca gcaatgtgtc 5700
tcccccagtg ctgaagaggc accagaggga gatcaccagg accaccctgc agtctgacca 5760
ggaggagatt gactatgatg acaccatctc tgtggagatg aagaaggagg actttgacat 5820
ctacgacgag gacgagaacc agagccccag gagcttccag aagaagacca ggcactactt 5880
cattgctgct gtggagaggc tgtgggacta tggcatgagc agcagccccc atgtgctgag 5940
gaacagggcc cagtctggct ctgtgcccca gttcaagaag gtggtgttcc aggagttcac 6000
tgatggcagc ttcacccagc ccctgtacag aggggagctg aatgagcacc tgggcctgct 6060
gggcccctac atcagggctg aggtggagga caacatcatg gtgaccttca ggaaccaggc 6120
cagcaggccc tacagcttct acagcagcct gatcagctat gaggaggacc agaggcaggg 6180
ggctgagccc aggaagaact ttgtgaagcc caatgaaacc aagacctact tctggaaggt 6240
gcagcaccac atggccccca ccaaggatga gtttgactgc aaggcctggg cctacttctc 6300
tgatgtggac ctggagaagg atgtgcactc tggcctgatt ggccccctgc tggtgtgcca 6360
caccaacacc ctgaaccctg cccatggcag gcaggtgact gtgcaggagt ttgccctgtt 6420
cttcaccatc tttgatgaaa ccaagagctg gtacttcact gagaacatgg agaggaactg 6480
cagggccccc tgcaacatcc agatggagga ccccaccttc aaggagaact acaggttcca 6540
tgccatcaat ggctacatca tggacaccct gcctggcctg gtgatggccc aggaccagag 6600
gatcaggtgg tacctgctga gcatgggcag caatgagaac atccacagca tccacttctc 6660
tggccatgtg ttcactgtga ggaagaagga ggagtacaag atggccctgt acaacctgta 6720
ccctggggtg tttgagactg tggagatgct gcccagcaag gctggcatct ggagggtgga 6780
gtgcctgatt ggggagcacc tgcatgctgg catgagcacc ctgttcctgg tgtacagcaa 6840
caagtgccag acccccctgg gcatggcctc tggccacatc agggacttcc agatcactgc 6900
ctctggccag tatggccagt gggcccccaa gctggccagg ctgcactact ctggcagcat 6960
caatgcctgg agcaccaagg agcccttcag ctggatcaag gtggacctgc tggcccccat 7020
gatcatccat ggcatcaaga cccagggggc caggcagaag ttcagcagcc tgtacatcag 7080
ccagttcatc atcatgtaca gcctggatgg caagaagtgg cagacctaca ggggcaacag 7140
cactggcacc ctgatggtgt tctttggcaa tgtggacagc tctggcatca agcacaacat 7200
cttcaacccc cccatcattg ccagatacat caggctgcac cccacccact acagcatcag 7260
gagcaccctg aggatggagc tgatgggctg tgacctgaac agctgcagca tgcccctggg 7320
catggagagc aaggccatct ctgatgccca gatcactgcc agcagctact tcaccaacat 7380
gtttgccacc tggagcccca gcaaggccag gctgcacctg cagggcagga gcaatgcctg 7440
gaggccccag gtcaacaacc ccaaggagtg gctgcaggtg gacttccaga agaccatgaa 7500
ggtgactggg gtgaccaccc agggggtgaa gagcctgctg accagcatgt atgtgaagga 7560
gttcctgatc agcagcagcc aggatggcca ccagtggacc ctgttcttcc agaatggcaa 7620
ggtgaaggtg ttccagggca accaggacag cttcacccct gtggtgaaca gcctggaccc 7680
ccccctgctg accagatacc tgaggattca cccccagagc tgggtgcacc agattgccct 7740
gaggatggag gtgctgggct gtgaggccca ggacctgtac tgagcggccg cgggcccaat 7800
caacctctgg attacaaaat ttgtgaaaga ttgactggta ttcttaacta tgttgctcct 7860
tttacgctat gtggatacgc tgctttaatg cctttgtatc atgctattgc ttcccgtatg 7920
gctttcattt tctcctcctt gtataaatcc tggttgctgt ctctttatga ggagttgtgg 7980
cccgttgtca ggcaacgtgg cgtggtgtgc actgtgtttg ctgacgcaac ccccactggt 8040
tggggcattg ccaccacctg tcagctcctt tccgggactt tcgctttccc cctccctatt 8100
gccacggcgg aactcatcgc cgcctgcctt gcccgctgct ggacaggggc tcggctgttg 8160
ggcactgaca attccgtggt gttgtcgggg aaatcatcgt cctttccttg gctgctcgcc 8220
tgtgttgcca cctggattct gcgcgggacg tccttctgct acgtcccttc ggccctcaat 8280
ccagcggacc ttccttcccg cggcctgctg ccggctctgc ggcctcttcc gcgtcttcgc 8340
cttcgccctc agacgagtcg gatctccctt tgggccgcct ccccgcaagc ttcgcacttt 8400
ttaaaagaaa agggaggact ggatgggatt tattactccg ataggacgct ggcttgtaac 8460
tcagtctctt actaggagac cagcttgagc ctgggtgttc gctggttagc ctaacctggt 8520
tggccaccag gggtaaggac tccttggctt agaaagctaa taaacttgcc tgcattagag 8580
ctcttacgcg tcccgggctc gagatccgca tctcaattag tcagcaacca tagtcccgcc 8640
cctaactccg cccatcccgc ccctaactcc gcccagttcc gcccattctc cgccccatgg 8700
ctgactaatt ttttttattt atgcagaggc cgaggccgcc tcggcctctg agctattcca 8760
gaagtagtga ggaggctttt ttggaggcct aggcttttgc aaaaagctaa cttgtttatt 8820
gcagcttata atggttacaa ataaagcaat agcatcacaa atttcacaaa taaagcattt 8880
ttttcactgc attctagttg tggtttgtcc aaactcatca atgtatctta tcatgtctgt 8940
ccgcttcctc gctcactgac tcgctgcgct cggtcgttcg gctgcggcga gcggtatcag 9000
ctcactcaaa ggcggtaata cggttatcca cagaatcagg ggataacgca ggaaagaaca 9060
tgtgagcaaa aggccagcaa aaggccagga accgtaaaaa ggccgcgttg ctggcgtttt 9120
tccataggct ccgcccccct gacgagcatc acaaaaatcg acgctcaagt cagaggtggc 9180
gaaacccgac aggactataa agataccagg cgtttccccc tggaagctcc ctcgtgcgct 9240
ctcctgttcc gaccctgccg cttaccggat acctgtccgc ctttctccct tcgggaagcg 9300
tggcgctttc tcatagctca cgctgtaggt atctcagttc ggtgtaggtc gttcgctcca 9360
agctgggctg tgtgcacgaa ccccccgttc agcccgaccg ctgcgcctta tccggtaact 9420
atcgtcttga gtccaacccg gtaagacacg acttatcgcc actggcagca gccactggta 9480
acaggattag cagagcgagg tatgtaggcg gtgctacaga gttcttgaag tggtggccta 9540
actacggcta cactagaaga acagtatttg gtatctgcgc tctgctgaag ccagttacct 9600
tcggaaaaag agttggtagc tcttgatccg gcaaacaaac caccgctggt agcggtggtt 9660
tttttgtttg caagcagcag attacgcgca gaaaaaaagg atctcaagaa gatcctttga 9720
tcttttctac ggggtctgac gctcagtgga acgaaaactc acgttaaggg attttggtca 9780
tgagattatc aaaaaggatc ttcacctaga tccttttaaa ttaaaaatga agttttaaat 9840
caatctaaag tatatatgag taaacttggt ctgacagtta gaaaaactca tcgagcatca 9900
aatgaaactg caatttattc atatcaggat tatcaatacc atatttttga aaaagccgtt 9960
tctgtaatga aggagaaaac tcaccgaggc agttccatag gatggcaaga tcctggtatc 10020
ggtctgcgat tccgactcgt ccaacatcaa tacaacctat taatttcccc tcgtcaaaaa 10080
taaggttatc aagtgagaaa tcaccatgag tgacgactga atccggtgag aatggcaaca 10140
gcttatgcat ttctttccag acttgttcaa caggccagcc attacgctcg tcatcaaaat 10200
cactcgcatc aaccaaaccg ttattcattc gtgattgcgc ctgagcgaga cgaaatacgc 10260
gatcgctgtt aaaaggacaa ttacaaacag gaatcgaatg caaccggcgc aggaacactg 10320
ccagcgcatc aacaatattt tcacctgaat caggatattc ttctaatacc tggaatgctg 10380
tttttccggg gatcgcagtg gtgagtaacc atgcatcatc aggagtacgg ataaaatgct 10440
tgatggtcgg aagaggcata aattccgtca gccagtttag tctgaccatc tcatctgtaa 10500
catcattggc aacgctacct ttgccatgtt tcagaaacaa ctctggcgca tcgggcttcc 10560
catacaatcg atagattgtc gcacctgatt gcccgacatt atcgcgagcc catttatacc 10620
catataaatc agcatccatg ttggaattta atcgcggcct agagcaagac gtttcccgtt 10680
gaatatggct cataacaccc cttgtattac tgtttatgta agcagacagt tttattgttc 10740
atgatgatat atttttatct tgtgcaatgt aacatcagag attttgagac acaacaattg 10800
gtcgacggat cc 10812
<210> 26
<211> 10519
<212> DNA
<213> Artificial Sequence
<220>
<223> pGM413
<400> 26
ggtacctcaa tattggccat tagccatatt attcattggt tatatagcat aaatcaatat 60
tggctattgg ccattgcata cgttgtatct atatcataat atgtacattt atattggctc 120
atgtccaata tgaccgccat gttggcattg attattgact agttattaat agtaatcaat 180
tacggggtca ttagttcata gcccatatat ggagttccgc gttacataac ttacggtaaa 240
tggcccgcct ggctgaccgc ccaacgaccc ccgcccattg acgtcaataa tgacgtatgt 300
tcccatagta acgccaatag ggactttcca ttgacgtcaa tgggtggagt atttacggta 360
aactgcccac ttggcagtac atcaagtgta tcatatgcca agtccgcccc ctattgacgt 420
caatgacggt aaatggcccg cctggcatta tgcccagtac atgaccttac gggactttcc 480
tacttggcag tacatctacg tattagtcat cgctattacc atggtgatgc ggttttggca 540
gtacaccaat gggcgtggat agcggtttga ctcacgggga tttccaagtc tccaccccat 600
tgacgtcaat gggagtttgt tttggcacca aaatcaacgg gactttccaa aatgtcgtaa 660
caactgcgat cgcccgcccc gttgacgcaa atgggcggta ggcgtgtacg gtgggaggtc 720
tatataagca gagctcgctg gcttgtaact cagtctctta ctaggagacc agcttgagcc 780
tgggtgttcg ctggttagcc taacctggtt ggccaccagg ggtaaggact ccttggctta 840
gaaagctaat aaacttgcct gcattagagc ttatctgagt caagtgtcct cattgacgcc 900
tcactctctt gaacgggaat cttccttact gggttctctc tctgacccag gcgagagaaa 960
ctccagcagt ggcgcccgaa cagggacttg agtgagagtg taggcacgta cagctgagaa 1020
ggcgtcggac gcgaaggaag cgcggggtgc gacgcgacca agaaggagac ttggtgagta 1080
ggcttctcga gtgccgggaa aaagctcgag cctagttaga ggactaggag aggccgtagc 1140
cgtaactact ctgggcaagt agggcaggcg gtgggtacgc aatgggggcg gctacctcag 1200
cactaaatag gagacaatta gaccaatttg agaaaatacg acttcgcccg aacggaaaga 1260
aaaagtacca aattaaacat ttaatatggg caggcaagga gatggagcgc ttcggcctcc 1320
atgagaggtt gttggagaca gaggaggggt gtaaaagaat catagaagtc ctctaccccc 1380
tagaaccaac aggatcggag ggcttaaaaa gtctgttcaa tcttgtgtgc gtgctatatt 1440
gcttgcacaa ggaacagaaa gtgaaagaca cagaggaagc agtagcaaca gtaagacaac 1500
actgccatct agtggaaaaa gaaaaaagtg caacagagac atctagtgga caaaagaaaa 1560
atgacaaggg aatagcagcg ccacctggtg gcagtcagaa ttttccagcg caacaacaag 1620
gaaatgcctg ggtacatgta cccttgtcac cgcgcacctt aaatgcgtgg gtaaaagcag 1680
tagaggagaa aaaatttgga gcagaaatag tacccatgtt tcaagcccta tcgaattccc 1740
gtttgtgcta gggttcttag gcttcttggg ggctgctgga actgcaatgg gagcagcggc 1800
gacagccctg acggtccagt ctcagcattt gcttgctggg atactgcagc agcagaagaa 1860
tctgctggcg gctgtggagg ctcaacagca gatgttgaag ctgaccattt ggggtgttaa 1920
aaacctcaat gcccgcgtca cagcccttga gaagtaccta gaggatcagg cacgactaaa 1980
ctcctggggg tgcgcatgga aacaagtatg tcataccaca gtggagtggc cctggacaaa 2040
tcggactccg gattggcaaa atatgacttg gttggagtgg gaaagacaaa tagctgattt 2100
ggaaagcaac attacgagac aattagtgaa ggctagagaa caagaggaaa agaatctaga 2160
tgcctatcag aagttaacta gttggtcaga tttctggtct tggttcgatt tctcaaaatg 2220
gcttaacatt ttaaaaatgg gatttttagt aatagtagga ataatagggt taagattact 2280
ttacacagta tatggatgta tagtgagggt taggcaggga tatgttcctc tatctccaca 2340
gatccatatc cgcggcaatt ttaaaagaaa gggaggaata gggggacaga cttcagcaga 2400
gagactaatt aatataataa caacacaatt agaaatacaa catttacaaa ccaaaattca 2460
aaaaatttta aattttagag ccgcggagat ctgttacata acttatggta aatggcctgc 2520
ctggctgact gcccaatgac ccctgcccaa tgatgtcaat aatgatgtat gttcccatgt 2580
aatgccaata gggactttcc attgatgtca atgggtggag tatttatggt aactgcccac 2640
ttggcagtac atcaagtgta tcatatgcca agtatgcccc ctattgatgt caatgatggt 2700
aaatggcctg cctggcatta tgcccagtac atgaccttat gggactttcc tacttggcag 2760
tacatctatg tattagtcat tgctattacc atgggaattc actagtggag aagagcatgc 2820
ttgagggctg agtgcccctc agtgggcaga gagcacatgg cccacagtcc ctgagaagtt 2880
ggggggaggg gtgggcaatt gaactggtgc ctagagaagg tggggcttgg gtaaactggg 2940
aaagtgatgt ggtgtactgg ctccaccttt ttccccaggg tgggggagaa ccatatataa 3000
gtgcagtagt ctctgtgaac attcaagctt ctgccttctc cctcctgtga gtttgctagc 3060
caccaatgca gattgagctg agcacctgct tcttcctgtg cctgctgagg ttctgcttct 3120
ctgccaccag gagatactac ctgggggctg tggagctgag ctgggactac atgcagtctg 3180
acctggggga gctgcctgtg gatgccaggt tcccccccag agtgcccaag agcttcccct 3240
tcaacacctc tgtggtgtac aagaagaccc tgtttgtgga gttcactgac cacctgttca 3300
acattgccaa gcccaggccc ccctggatgg gcctgctggg ccccaccatc caggctgagg 3360
tgtatgacac tgtggtgatc accctgaaga acatggccag ccaccctgtg agcctgcatg 3420
ctgtgggggt gagctactgg aaggcctctg agggggctga gtatgatgac cagaccagcc 3480
agagggagaa ggaggatgac aaggtgttcc ctgggggcag ccacacctat gtgtggcagg 3540
tgctgaagga gaatggcccc atggcctctg accccctgtg cctgacctac agctacctga 3600
gccatgtgga cctggtgaag gacctgaact ctggcctgat tggggccctg ctggtgtgca 3660
gggagggcag cctggccaag gagaagaccc agaccctgca caagttcatc ctgctgtttg 3720
ctgtgtttga tgagggcaag agctggcact ctgaaaccaa gaacagcctg atgcaggaca 3780
gggatgctgc ctctgccagg gcctggccca agatgcacac tgtgaatggc tatgtgaaca 3840
ggagcctgcc tggcctgatt ggctgccaca ggaagtctgt gtactggcat gtgattggca 3900
tgggcaccac ccctgaggtg cacagcatct tcctggaggg ccacaccttc ctggtcagga 3960
accacaggca ggccagcctg gagatcagcc ccatcacctt cctgactgcc cagaccctgc 4020
tgatggacct gggccagttc ctgctgttct gccacatcag cagccaccag catgatggca 4080
tggaggccta tgtgaaggtg gacagctgcc ctgaggagcc ccagctgagg atgaagaaca 4140
atgaggaggc tgaggactat gatgatgacc tgactgactc tgagatggat gtggtgaggt 4200
ttgatgatga caacagcccc agcttcatcc agatcaggtc tgtggccaag aagcacccca 4260
agacctgggt gcactacatt gctgctgagg aggaggactg ggactatgcc cccctggtgc 4320
tggcccctga tgacaggagc tacaagagcc agtacctgaa caatggcccc cagaggattg 4380
gcaggaagta caagaaggtc aggttcatgg cctacactga tgaaaccttc aagaccaggg 4440
aggccatcca gcatgagtct ggcatcctgg gccccctgct gtatggggag gtgggggaca 4500
ccctgctgat catcttcaag aaccaggcca gcaggcccta caacatctac ccccatggca 4560
tcactgatgt gaggcccctg tacagcagga ggctgcccaa gggggtgaag cacctgaagg 4620
acttccccat cctgcctggg gagatcttca agtacaagtg gactgtgact gtggaggatg 4680
gccccaccaa gtctgacccc aggtgcctga ccagatacta cagcagcttt gtgaacatgg 4740
agagggacct ggcctctggc ctgattggcc ccctgctgat ctgctacaag gagtctgtgg 4800
accagagggg caaccagatc atgtctgaca agaggaatgt gatcctgttc tctgtgtttg 4860
atgagaacag gagctggtac ctgactgaga acatccagag gttcctgccc aaccctgctg 4920
gggtgcagct ggaggaccct gagttccagg ccagcaacat catgcacagc atcaatggct 4980
atgtgtttga cagcctgcag ctgtctgtgt gcctgcatga ggtggcctac tggtacatcc 5040
tgagcattgg ggcccagact gacttcctgt ctgtgttctt ctctggctac accttcaagc 5100
acaagatggt gtatgaggac accctgaccc tgttcccctt ctctggggag actgtgttca 5160
tgagcatgga gaaccctggc ctgtggattc tgggctgcca caactctgac ttcaggaaca 5220
ggggcatgac tgccctgctg aaagtctcca gctgtgacaa gaacactggg gactactatg 5280
aggacagcta tgaggacatc tctgcctacc tgctgagcaa gaacaatgcc attgagccca 5340
ggagcttcag ccagaatgcc actaatgtgt ctaacaacag caacaccagc aatgacagca 5400
atgtgtctcc cccagtgctg aagaggcacc agagggagat caccaggacc accctgcagt 5460
ctgaccagga ggagattgac tatgatgaca ccatctctgt ggagatgaag aaggaggact 5520
ttgacatcta cgacgaggac gagaaccaga gccccaggag cttccagaag aagaccaggc 5580
actacttcat tgctgctgtg gagaggctgt gggactatgg catgagcagc agcccccatg 5640
tgctgaggaa cagggcccag tctggctctg tgccccagtt caagaaggtg gtgttccagg 5700
agttcactga tggcagcttc acccagcccc tgtacagagg ggagctgaat gagcacctgg 5760
gcctgctggg cccctacatc agggctgagg tggaggacaa catcatggtg accttcagga 5820
accaggccag caggccctac agcttctaca gcagcctgat cagctatgag gaggaccaga 5880
ggcagggggc tgagcccagg aagaactttg tgaagcccaa tgaaaccaag acctacttct 5940
ggaaggtgca gcaccacatg gcccccacca aggatgagtt tgactgcaag gcctgggcct 6000
acttctctga tgtggacctg gagaaggatg tgcactctgg cctgattggc cccctgctgg 6060
tgtgccacac caacaccctg aaccctgccc atggcaggca ggtgactgtg caggagtttg 6120
ccctgttctt caccatcttt gatgaaacca agagctggta cttcactgag aacatggaga 6180
ggaactgcag ggccccctgc aacatccaga tggaggaccc caccttcaag gagaactaca 6240
ggttccatgc catcaatggc tacatcatgg acaccctgcc tggcctggtg atggcccagg 6300
accagaggat caggtggtac ctgctgagca tgggcagcaa tgagaacatc cacagcatcc 6360
acttctctgg ccatgtgttc actgtgagga agaaggagga gtacaagatg gccctgtaca 6420
acctgtaccc tggggtgttt gagactgtgg agatgctgcc cagcaaggct ggcatctgga 6480
gggtggagtg cctgattggg gagcacctgc atgctggcat gagcaccctg ttcctggtgt 6540
acagcaacaa gtgccagacc cccctgggca tggcctctgg ccacatcagg gacttccaga 6600
tcactgcctc tggccagtat ggccagtggg cccccaagct ggccaggctg cactactctg 6660
gcagcatcaa tgcctggagc accaaggagc ccttcagctg gatcaaggtg gacctgctgg 6720
cccccatgat catccatggc atcaagaccc agggggccag gcagaagttc agcagcctgt 6780
acatcagcca gttcatcatc atgtacagcc tggatggcaa gaagtggcag acctacaggg 6840
gcaacagcac tggcaccctg atggtgttct ttggcaatgt ggacagctct ggcatcaagc 6900
acaacatctt caaccccccc atcattgcca gatacatcag gctgcacccc acccactaca 6960
gcatcaggag caccctgagg atggagctga tgggctgtga cctgaacagc tgcagcatgc 7020
ccctgggcat ggagagcaag gccatctctg atgcccagat cactgccagc agctacttca 7080
ccaacatgtt tgccacctgg agccccagca aggccaggct gcacctgcag ggcaggagca 7140
atgcctggag gccccaggtc aacaacccca aggagtggct gcaggtggac ttccagaaga 7200
ccatgaaggt gactggggtg accacccagg gggtgaagag cctgctgacc agcatgtatg 7260
tgaaggagtt cctgatcagc agcagccagg atggccacca gtggaccctg ttcttccaga 7320
atggcaaggt gaaggtgttc cagggcaacc aggacagctt cacccctgtg gtgaacagcc 7380
tggacccccc cctgctgacc agatacctga ggattcaccc ccagagctgg gtgcaccaga 7440
ttgccctgag gatggaggtg ctgggctgtg aggcccagga cctgtactga gcggccgcgg 7500
gcccaatcaa cctctggatt acaaaatttg tgaaagattg actggtattc ttaactatgt 7560
tgctcctttt acgctatgtg gatacgctgc tttaatgcct ttgtatcatg ctattgcttc 7620
ccgtatggct ttcattttct cctccttgta taaatcctgg ttgctgtctc tttatgagga 7680
gttgtggccc gttgtcaggc aacgtggcgt ggtgtgcact gtgtttgctg acgcaacccc 7740
cactggttgg ggcattgcca ccacctgtca gctcctttcc gggactttcg ctttccccct 7800
ccctattgcc acggcggaac tcatcgccgc ctgccttgcc cgctgctgga caggggctcg 7860
gctgttgggc actgacaatt ccgtggtgtt gtcggggaaa tcatcgtcct ttccttggct 7920
gctcgcctgt gttgccacct ggattctgcg cgggacgtcc ttctgctacg tcccttcggc 7980
cctcaatcca gcggaccttc cttcccgcgg cctgctgccg gctctgcggc ctcttccgcg 8040
tcttcgcctt cgccctcaga cgagtcggat ctccctttgg gccgcctccc cgcaagcttc 8100
gcacttttta aaagaaaagg gaggactgga tgggatttat tactccgata ggacgctggc 8160
ttgtaactca gtctcttact aggagaccag cttgagcctg ggtgttcgct ggttagccta 8220
acctggttgg ccaccagggg taaggactcc ttggcttaga aagctaataa acttgcctgc 8280
attagagctc ttacgcgtcc cgggctcgag atccgcatct caattagtca gcaaccatag 8340
tcccgcccct aactccgccc atcccgcccc taactccgcc cagttccgcc cattctccgc 8400
cccatggctg actaattttt tttatttatg cagaggccga ggccgcctcg gcctctgagc 8460
tattccagaa gtagtgagga ggcttttttg gaggcctagg cttttgcaaa aagctaactt 8520
gtttattgca gcttataatg gttacaaata aagcaatagc atcacaaatt tcacaaataa 8580
agcatttttt tcactgcatt ctagttgtgg tttgtccaaa ctcatcaatg tatcttatca 8640
tgtctgtccg cttcctcgct cactgactcg ctgcgctcgg tcgttcggct gcggcgagcg 8700
gtatcagctc actcaaaggc ggtaatacgg ttatccacag aatcagggga taacgcagga 8760
aagaacatgt gagcaaaagg ccagcaaaag gccaggaacc gtaaaaaggc cgcgttgctg 8820
gcgtttttcc ataggctccg cccccctgac gagcatcaca aaaatcgacg ctcaagtcag 8880
aggtggcgaa acccgacagg actataaaga taccaggcgt ttccccctgg aagctccctc 8940
gtgcgctctc ctgttccgac cctgccgctt accggatacc tgtccgcctt tctcccttcg 9000
ggaagcgtgg cgctttctca tagctcacgc tgtaggtatc tcagttcggt gtaggtcgtt 9060
cgctccaagc tgggctgtgt gcacgaaccc cccgttcagc ccgaccgctg cgccttatcc 9120
ggtaactatc gtcttgagtc caacccggta agacacgact tatcgccact ggcagcagcc 9180
actggtaaca ggattagcag agcgaggtat gtaggcggtg ctacagagtt cttgaagtgg 9240
tggcctaact acggctacac tagaagaaca gtatttggta tctgcgctct gctgaagcca 9300
gttaccttcg gaaaaagagt tggtagctct tgatccggca aacaaaccac cgctggtagc 9360
ggtggttttt ttgtttgcaa gcagcagatt acgcgcagaa aaaaaggatc tcaagaagat 9420
cctttgatct tttctacggg gtctgacgct cagtggaacg aaaactcacg ttaagggatt 9480
ttggtcatga gattatcaaa aaggatcttc acctagatcc ttttaaatta aaaatgaagt 9540
tttaaatcaa tctaaagtat atatgagtaa acttggtctg acagttagaa aaactcatcg 9600
agcatcaaat gaaactgcaa tttattcata tcaggattat caataccata tttttgaaaa 9660
agccgtttct gtaatgaagg agaaaactca ccgaggcagt tccataggat ggcaagatcc 9720
tggtatcggt ctgcgattcc gactcgtcca acatcaatac aacctattaa tttcccctcg 9780
tcaaaaataa ggttatcaag tgagaaatca ccatgagtga cgactgaatc cggtgagaat 9840
ggcaacagct tatgcatttc tttccagact tgttcaacag gccagccatt acgctcgtca 9900
tcaaaatcac tcgcatcaac caaaccgtta ttcattcgtg attgcgcctg agcgagacga 9960
aatacgcgat cgctgttaaa aggacaatta caaacaggaa tcgaatgcaa ccggcgcagg 10020
aacactgcca gcgcatcaac aatattttca cctgaatcag gatattcttc taatacctgg 10080
aatgctgttt ttccggggat cgcagtggtg agtaaccatg catcatcagg agtacggata 10140
aaatgcttga tggtcggaag aggcataaat tccgtcagcc agtttagtct gaccatctca 10200
tctgtaacat cattggcaac gctacctttg ccatgtttca gaaacaactc tggcgcatcg 10260
ggcttcccat acaatcgata gattgtcgca cctgattgcc cgacattatc gcgagcccat 10320
ttatacccat ataaatcagc atccatgttg gaatttaatc gcggcctaga gcaagacgtt 10380
tcccgttgaa tatggctcat aacacccctt gtattactgt ttatgtaagc agacagtttt 10440
attgttcatg atgatatatt tttatcttgt gcaatgtaac atcagagatt ttgagacaca 10500
acaattggtc gacggatcc 10519
<210> 27
<211> 11400
<212> DNA
<213> Artificial Sequence
<220>
<223> pGM412
<400> 27
ggtacctcaa tattggccat tagccatatt attcattggt tatatagcat aaatcaatat 60
tggctattgg ccattgcata cgttgtatct atatcataat atgtacattt atattggctc 120
atgtccaata tgaccgccat gttggcattg attattgact agttattaat agtaatcaat 180
tacggggtca ttagttcata gcccatatat ggagttccgc gttacataac ttacggtaaa 240
tggcccgcct ggctgaccgc ccaacgaccc ccgcccattg acgtcaataa tgacgtatgt 300
tcccatagta acgccaatag ggactttcca ttgacgtcaa tgggtggagt atttacggta 360
aactgcccac ttggcagtac atcaagtgta tcatatgcca agtccgcccc ctattgacgt 420
caatgacggt aaatggcccg cctggcatta tgcccagtac atgaccttac gggactttcc 480
tacttggcag tacatctacg tattagtcat cgctattacc atggtgatgc ggttttggca 540
gtacaccaat gggcgtggat agcggtttga ctcacgggga tttccaagtc tccaccccat 600
tgacgtcaat gggagtttgt tttggcacca aaatcaacgg gactttccaa aatgtcgtaa 660
caactgcgat cgcccgcccc gttgacgcaa atgggcggta ggcgtgtacg gtgggaggtc 720
tatataagca gagctcgctg gcttgtaact cagtctctta ctaggagacc agcttgagcc 780
tgggtgttcg ctggttagcc taacctggtt ggccaccagg ggtaaggact ccttggctta 840
gaaagctaat aaacttgcct gcattagagc ttatctgagt caagtgtcct cattgacgcc 900
tcactctctt gaacgggaat cttccttact gggttctctc tctgacccag gcgagagaaa 960
ctccagcagt ggcgcccgaa cagggacttg agtgagagtg taggcacgta cagctgagaa 1020
ggcgtcggac gcgaaggaag cgcggggtgc gacgcgacca agaaggagac ttggtgagta 1080
ggcttctcga gtgccgggaa aaagctcgag cctagttaga ggactaggag aggccgtagc 1140
cgtaactact ctgggcaagt agggcaggcg gtgggtacgc aatgggggcg gctacctcag 1200
cactaaatag gagacaatta gaccaatttg agaaaatacg acttcgcccg aacggaaaga 1260
aaaagtacca aattaaacat ttaatatggg caggcaagga gatggagcgc ttcggcctcc 1320
atgagaggtt gttggagaca gaggaggggt gtaaaagaat catagaagtc ctctaccccc 1380
tagaaccaac aggatcggag ggcttaaaaa gtctgttcaa tcttgtgtgc gtgctatatt 1440
gcttgcacaa ggaacagaaa gtgaaagaca cagaggaagc agtagcaaca gtaagacaac 1500
actgccatct agtggaaaaa gaaaaaagtg caacagagac atctagtgga caaaagaaaa 1560
atgacaaggg aatagcagcg ccacctggtg gcagtcagaa ttttccagcg caacaacaag 1620
gaaatgcctg ggtacatgta cccttgtcac cgcgcacctt aaatgcgtgg gtaaaagcag 1680
tagaggagaa aaaatttgga gcagaaatag tacccatgtt tcaagcccta tcgaattccc 1740
gtttgtgcta gggttcttag gcttcttggg ggctgctgga actgcaatgg gagcagcggc 1800
gacagccctg acggtccagt ctcagcattt gcttgctggg atactgcagc agcagaagaa 1860
tctgctggcg gctgtggagg ctcaacagca gatgttgaag ctgaccattt ggggtgttaa 1920
aaacctcaat gcccgcgtca cagcccttga gaagtaccta gaggatcagg cacgactaaa 1980
ctcctggggg tgcgcatgga aacaagtatg tcataccaca gtggagtggc cctggacaaa 2040
tcggactccg gattggcaaa atatgacttg gttggagtgg gaaagacaaa tagctgattt 2100
ggaaagcaac attacgagac aattagtgaa ggctagagaa caagaggaaa agaatctaga 2160
tgcctatcag aagttaacta gttggtcaga tttctggtct tggttcgatt tctcaaaatg 2220
gcttaacatt ttaaaaatgg gatttttagt aatagtagga ataatagggt taagattact 2280
ttacacagta tatggatgta tagtgagggt taggcaggga tatgttcctc tatctccaca 2340
gatccatatc cgcggcaatt ttaaaagaaa gggaggaata gggggacaga cttcagcaga 2400
gagactaatt aatataataa caacacaatt agaaatacaa catttacaaa ccaaaattca 2460
aaaaatttta aattttagag ccgcggagat ctcaatattg gccattagcc atattattca 2520
ttggttatat agcataaatc aatattggct attggccatt gcatacgttg tatctatatc 2580
ataatatgta catttatatt ggctcatgtc caatatgacc gccatgttgg cattgattat 2640
tgactagtta ttaatagtaa tcaattacgg ggtcattagt tcatagccca tatatggagt 2700
tccgcgttac ataacttacg gtaaatggcc cgcctggctg accgcccaac gacccccgcc 2760
cattgacgtc aataatgacg tatgttccca tagtaacgcc aatagggact ttccattgac 2820
gtcaatgggt ggagtattta cggtaaactg cccacttggc agtacatcaa gtgtatcata 2880
tgccaagtcc gccccctatt gacgtcaatg acggtaaatg gcccgcctgg cattatgccc 2940
agtacatgac cttacgggac tttcctactt ggcagtacat ctacgtatta gtcatcgcta 3000
ttaccatggt gatgcggttt tggcagtaca ccaatgggcg tggatagcgg tttgactcac 3060
ggggatttcc aagtctccac cccattgacg tcaatgggag tttgttttgg caccaaaatc 3120
aacgggactt tccaaaatgt cgtaataacc ccgccccgtt gacgcaaatg ggcggtaggc 3180
gtgtacggtg ggaggtctat ataagcagag ctcgtttagt gaaccgtcag atcactagaa 3240
gctttattgc ggtagtttat cacagttaaa ttgctaacgc agtcagtgct tctgacacaa 3300
cagtctcgaa cttaagctgc agaagttggt cgtgaggcac tgggcaggct agccaccaat 3360
gcagattgag ctgagcacct gcttcttcct gtgcctgctg aggttctgct tctctgccac 3420
caggagatac tacctggggg ctgtggagct gagctgggac tacatgcagt ctgacctggg 3480
ggagctgcct gtggatgcca ggttcccccc cagagtgccc aagagcttcc ccttcaacac 3540
ctctgtggtg tacaagaaga ccctgtttgt ggagttcact gaccacctgt tcaacattgc 3600
caagcccagg cccccctgga tgggcctgct gggccccacc atccaggctg aggtgtatga 3660
cactgtggtg atcaccctga agaacatggc cagccaccct gtgagcctgc atgctgtggg 3720
ggtgagctac tggaaggcct ctgagggggc tgagtatgat gaccagacca gccagaggga 3780
gaaggaggat gacaaggtgt tccctggggg cagccacacc tatgtgtggc aggtgctgaa 3840
ggagaatggc cccatggcct ctgaccccct gtgcctgacc tacagctacc tgagccatgt 3900
ggacctggtg aaggacctga actctggcct gattggggcc ctgctggtgt gcagggaggg 3960
cagcctggcc aaggagaaga cccagaccct gcacaagttc atcctgctgt ttgctgtgtt 4020
tgatgagggc aagagctggc actctgaaac caagaacagc ctgatgcagg acagggatgc 4080
tgcctctgcc agggcctggc ccaagatgca cactgtgaat ggctatgtga acaggagcct 4140
gcctggcctg attggctgcc acaggaagtc tgtgtactgg catgtgattg gcatgggcac 4200
cacccctgag gtgcacagca tcttcctgga gggccacacc ttcctggtca ggaaccacag 4260
gcaggccagc ctggagatca gccccatcac cttcctgact gcccagaccc tgctgatgga 4320
cctgggccag ttcctgctgt tctgccacat cagcagccac cagcatgatg gcatggaggc 4380
ctatgtgaag gtggacagct gccctgagga gccccagctg aggatgaaga acaatgagga 4440
ggctgaggac tatgatgatg acctgactga ctctgagatg gatgtggtga ggtttgatga 4500
tgacaacagc cccagcttca tccagatcag gtctgtggcc aagaagcacc ccaagacctg 4560
ggtgcactac attgctgctg aggaggagga ctgggactat gcccccctgg tgctggcccc 4620
tgatgacagg agctacaaga gccagtacct gaacaatggc ccccagagga ttggcaggaa 4680
gtacaagaag gtcaggttca tggcctacac tgatgaaacc ttcaagacca gggaggccat 4740
ccagcatgag tctggcatcc tgggccccct gctgtatggg gaggtggggg acaccctgct 4800
gatcatcttc aagaaccagg ccagcaggcc ctacaacatc tacccccatg gcatcactga 4860
tgtgaggccc ctgtacagca ggaggctgcc caagggggtg aagcacctga aggacttccc 4920
catcctgcct ggggagatct tcaagtacaa gtggactgtg actgtggagg atggccccac 4980
caagtctgac cccaggtgcc tgaccagata ctacagcagc tttgtgaaca tggagaggga 5040
cctggcctct ggcctgattg gccccctgct gatctgctac aaggagtctg tggaccagag 5100
gggcaaccag atcatgtctg acaagaggaa tgtgatcctg ttctctgtgt ttgatgagaa 5160
caggagctgg tacctgactg agaacatcca gaggttcctg cccaaccctg ctggggtgca 5220
gctggaggac cctgagttcc aggccagcaa catcatgcac agcatcaatg gctatgtgtt 5280
tgacagcctg cagctgtctg tgtgcctgca tgaggtggcc tactggtaca tcctgagcat 5340
tggggcccag actgacttcc tgtctgtgtt cttctctggc tacaccttca agcacaagat 5400
ggtgtatgag gacaccctga ccctgttccc cttctctggg gagactgtgt tcatgagcat 5460
ggagaaccct ggcctgtgga ttctgggctg ccacaactct gacttcagga acaggggcat 5520
gactgccctg ctgaaagtct ccagctgtga caagaacact ggggactact atgaggacag 5580
ctatgaggac atctctgcct acctgctgag caagaacaat gccattgagc ccaggagctt 5640
cagccagaac agcaggcacc ccagcaccag gcagaagcag ttcaatgcca ccaccatccc 5700
tgagaatgac atagagaaga cagacccatg gtttgcccac cggaccccca tgcccaagat 5760
ccagaatgtg agcagctctg acctgctgat gctgctgagg cagagcccca ccccccatgg 5820
cctgagcctg tctgacctgc aggaggccaa gtatgaaacc ttctctgatg accccagccc 5880
tggggccatt gacagcaaca acagcctgtc tgagatgacc cacttcaggc cccagctgca 5940
ccactctggg gacatggtgt tcacccctga gtctggcctg cagctgaggc tgaatgagaa 6000
gctgggcacc actgctgcca ctgagctgaa gaagctggac ttcaaagtct ccagcaccag 6060
caacaacctg atcagcacca tcccctctga caacctggct gctggcactg acaacaccag 6120
cagcctgggc ccccccagca tgcctgtgca ctatgacagc cagctggaca ccaccctgtt 6180
tggcaagaag agcagccccc tgactgagtc tgggggcccc ctgagcctgt ctgaggagaa 6240
caatgacagc aagctgctgg agtctggcct gatgaacagc caggagagca gctggggcaa 6300
gaatgtgagc agcagggaga tcaccaggac caccctgcag tctgaccagg aggagattga 6360
ctatgatgac accatctctg tggagatgaa gaaggaggac tttgacatct acgacgagga 6420
cgagaaccag agccccagga gcttccagaa gaagaccagg cactacttca ttgctgctgt 6480
ggagaggctg tgggactatg gcatgagcag cagcccccat gtgctgagga acagggccca 6540
gtctggctct gtgccccagt tcaagaaggt ggtgttccag gagttcactg atggcagctt 6600
cacccagccc ctgtacagag gggagctgaa tgagcacctg ggcctgctgg gcccctacat 6660
cagggctgag gtggaggaca acatcatggt gaccttcagg aaccaggcca gcaggcccta 6720
cagcttctac agcagcctga tcagctatga ggaggaccag aggcaggggg ctgagcccag 6780
gaagaacttt gtgaagccca atgaaaccaa gacctacttc tggaaggtgc agcaccacat 6840
ggcccccacc aaggatgagt ttgactgcaa ggcctgggcc tacttctctg atgtggacct 6900
ggagaaggat gtgcactctg gcctgattgg ccccctgctg gtgtgccaca ccaacaccct 6960
gaaccctgcc catggcaggc aggtgactgt gcaggagttt gccctgttct tcaccatctt 7020
tgatgaaacc aagagctggt acttcactga gaacatggag aggaactgca gggccccctg 7080
caacatccag atggaggacc ccaccttcaa ggagaactac aggttccatg ccatcaatgg 7140
ctacatcatg gacaccctgc ctggcctggt gatggcccag gaccagagga tcaggtggta 7200
cctgctgagc atgggcagca atgagaacat ccacagcatc cacttctctg gccatgtgtt 7260
cactgtgagg aagaaggagg agtacaagat ggccctgtac aacctgtacc ctggggtgtt 7320
tgagactgtg gagatgctgc ccagcaaggc tggcatctgg agggtggagt gcctgattgg 7380
ggagcacctg catgctggca tgagcaccct gttcctggtg tacagcaaca agtgccagac 7440
ccccctgggc atggcctctg gccacatcag ggacttccag atcactgcct ctggccagta 7500
tggccagtgg gcccccaagc tggccaggct gcactactct ggcagcatca atgcctggag 7560
caccaaggag cccttcagct ggatcaaggt ggacctgctg gcccccatga tcatccatgg 7620
catcaagacc cagggggcca ggcagaagtt cagcagcctg tacatcagcc agttcatcat 7680
catgtacagc ctggatggca agaagtggca gacctacagg ggcaacagca ctggcaccct 7740
gatggtgttc tttggcaatg tggacagctc tggcatcaag cacaacatct tcaacccccc 7800
catcattgcc agatacatca ggctgcaccc cacccactac agcatcagga gcaccctgag 7860
gatggagctg atgggctgtg acctgaacag ctgcagcatg cccctgggca tggagagcaa 7920
ggccatctct gatgcccaga tcactgccag cagctacttc accaacatgt ttgccacctg 7980
gagccccagc aaggccaggc tgcacctgca gggcaggagc aatgcctgga ggccccaggt 8040
caacaacccc aaggagtggc tgcaggtgga cttccagaag accatgaagg tgactggggt 8100
gaccacccag ggggtgaaga gcctgctgac cagcatgtat gtgaaggagt tcctgatcag 8160
cagcagccag gatggccacc agtggaccct gttcttccag aatggcaagg tgaaggtgtt 8220
ccagggcaac caggacagct tcacccctgt ggtgaacagc ctggaccccc ccctgctgac 8280
cagatacctg aggattcacc cccagagctg ggtgcaccag attgccctga ggatggaggt 8340
gctgggctgt gaggcccagg acctgtactg agcggccgcg ggcccaatca acctctggat 8400
tacaaaattt gtgaaagatt gactggtatt cttaactatg ttgctccttt tacgctatgt 8460
ggatacgctg ctttaatgcc tttgtatcat gctattgctt cccgtatggc tttcattttc 8520
tcctccttgt ataaatcctg gttgctgtct ctttatgagg agttgtggcc cgttgtcagg 8580
caacgtggcg tggtgtgcac tgtgtttgct gacgcaaccc ccactggttg gggcattgcc 8640
accacctgtc agctcctttc cgggactttc gctttccccc tccctattgc cacggcggaa 8700
ctcatcgccg cctgccttgc ccgctgctgg acaggggctc ggctgttggg cactgacaat 8760
tccgtggtgt tgtcggggaa atcatcgtcc tttccttggc tgctcgcctg tgttgccacc 8820
tggattctgc gcgggacgtc cttctgctac gtcccttcgg ccctcaatcc agcggacctt 8880
ccttcccgcg gcctgctgcc ggctctgcgg cctcttccgc gtcttcgcct tcgccctcag 8940
acgagtcgga tctccctttg ggccgcctcc ccgcaagctt cgcacttttt aaaagaaaag 9000
ggaggactgg atgggattta ttactccgat aggacgctgg cttgtaactc agtctcttac 9060
taggagacca gcttgagcct gggtgttcgc tggttagcct aacctggttg gccaccaggg 9120
gtaaggactc cttggcttag aaagctaata aacttgcctg cattagagct cttacgcgtc 9180
ccgggctcga gatccgcatc tcaattagtc agcaaccata gtcccgcccc taactccgcc 9240
catcccgccc ctaactccgc ccagttccgc ccattctccg ccccatggct gactaatttt 9300
ttttatttat gcagaggccg aggccgcctc ggcctctgag ctattccaga agtagtgagg 9360
aggctttttt ggaggcctag gcttttgcaa aaagctaact tgtttattgc agcttataat 9420
ggttacaaat aaagcaatag catcacaaat ttcacaaata aagcattttt ttcactgcat 9480
tctagttgtg gtttgtccaa actcatcaat gtatcttatc atgtctgtcc gcttcctcgc 9540
tcactgactc gctgcgctcg gtcgttcggc tgcggcgagc ggtatcagct cactcaaagg 9600
cggtaatacg gttatccaca gaatcagggg ataacgcagg aaagaacatg tgagcaaaag 9660
gccagcaaaa ggccaggaac cgtaaaaagg ccgcgttgct ggcgtttttc cataggctcc 9720
gcccccctga cgagcatcac aaaaatcgac gctcaagtca gaggtggcga aacccgacag 9780
gactataaag ataccaggcg tttccccctg gaagctccct cgtgcgctct cctgttccga 9840
ccctgccgct taccggatac ctgtccgcct ttctcccttc gggaagcgtg gcgctttctc 9900
atagctcacg ctgtaggtat ctcagttcgg tgtaggtcgt tcgctccaag ctgggctgtg 9960
tgcacgaacc ccccgttcag cccgaccgct gcgccttatc cggtaactat cgtcttgagt 10020
ccaacccggt aagacacgac ttatcgccac tggcagcagc cactggtaac aggattagca 10080
gagcgaggta tgtaggcggt gctacagagt tcttgaagtg gtggcctaac tacggctaca 10140
ctagaagaac agtatttggt atctgcgctc tgctgaagcc agttaccttc ggaaaaagag 10200
ttggtagctc ttgatccggc aaacaaacca ccgctggtag cggtggtttt tttgtttgca 10260
agcagcagat tacgcgcaga aaaaaaggat ctcaagaaga tcctttgatc ttttctacgg 10320
ggtctgacgc tcagtggaac gaaaactcac gttaagggat tttggtcatg agattatcaa 10380
aaaggatctt cacctagatc cttttaaatt aaaaatgaag ttttaaatca atctaaagta 10440
tatatgagta aacttggtct gacagttaga aaaactcatc gagcatcaaa tgaaactgca 10500
atttattcat atcaggatta tcaataccat atttttgaaa aagccgtttc tgtaatgaag 10560
gagaaaactc accgaggcag ttccatagga tggcaagatc ctggtatcgg tctgcgattc 10620
cgactcgtcc aacatcaata caacctatta atttcccctc gtcaaaaata aggttatcaa 10680
gtgagaaatc accatgagtg acgactgaat ccggtgagaa tggcaacagc ttatgcattt 10740
ctttccagac ttgttcaaca ggccagccat tacgctcgtc atcaaaatca ctcgcatcaa 10800
ccaaaccgtt attcattcgt gattgcgcct gagcgagacg aaatacgcga tcgctgttaa 10860
aaggacaatt acaaacagga atcgaatgca accggcgcag gaacactgcc agcgcatcaa 10920
caatattttc acctgaatca ggatattctt ctaatacctg gaatgctgtt tttccgggga 10980
tcgcagtggt gagtaaccat gcatcatcag gagtacggat aaaatgcttg atggtcggaa 11040
gaggcataaa ttccgtcagc cagtttagtc tgaccatctc atctgtaaca tcattggcaa 11100
cgctaccttt gccatgtttc agaaacaact ctggcgcatc gggcttccca tacaatcgat 11160
agattgtcgc acctgattgc ccgacattat cgcgagccca tttataccca tataaatcag 11220
catccatgtt ggaatttaat cgcggcctag agcaagacgt ttcccgttga atatggctca 11280
taacacccct tgtattactg tttatgtaag cagacagttt tattgttcat gatgatatat 11340
ttttatcttg tgcaatgtaa catcagagat tttgagacac aacaattggt cgacggatcc 11400
<210> 28
<211> 11108
<212> DNA
<213> Artificial Sequence
<220>
<223> pGM414
<400> 28
ggtacctcaa tattggccat tagccatatt attcattggt tatatagcat aaatcaatat 60
tggctattgg ccattgcata cgttgtatct atatcataat atgtacattt atattggctc 120
atgtccaata tgaccgccat gttggcattg attattgact agttattaat agtaatcaat 180
tacggggtca ttagttcata gcccatatat ggagttccgc gttacataac ttacggtaaa 240
tggcccgcct ggctgaccgc ccaacgaccc ccgcccattg acgtcaataa tgacgtatgt 300
tcccatagta acgccaatag ggactttcca ttgacgtcaa tgggtggagt atttacggta 360
aactgcccac ttggcagtac atcaagtgta tcatatgcca agtccgcccc ctattgacgt 420
caatgacggt aaatggcccg cctggcatta tgcccagtac atgaccttac gggactttcc 480
tacttggcag tacatctacg tattagtcat cgctattacc atggtgatgc ggttttggca 540
gtacaccaat gggcgtggat agcggtttga ctcacgggga tttccaagtc tccaccccat 600
tgacgtcaat gggagtttgt tttggcacca aaatcaacgg gactttccaa aatgtcgtaa 660
caactgcgat cgcccgcccc gttgacgcaa atgggcggta ggcgtgtacg gtgggaggtc 720
tatataagca gagctcgctg gcttgtaact cagtctctta ctaggagacc agcttgagcc 780
tgggtgttcg ctggttagcc taacctggtt ggccaccagg ggtaaggact ccttggctta 840
gaaagctaat aaacttgcct gcattagagc ttatctgagt caagtgtcct cattgacgcc 900
tcactctctt gaacgggaat cttccttact gggttctctc tctgacccag gcgagagaaa 960
ctccagcagt ggcgcccgaa cagggacttg agtgagagtg taggcacgta cagctgagaa 1020
ggcgtcggac gcgaaggaag cgcggggtgc gacgcgacca agaaggagac ttggtgagta 1080
ggcttctcga gtgccgggaa aaagctcgag cctagttaga ggactaggag aggccgtagc 1140
cgtaactact cttgggcaag tagggcaggc ggtgggtacg caatgggggc ggctacctca 1200
gcactaaata ggagacaatt agaccaattt gagaaaatac gacttcgccc gaacggaaag 1260
aaaaagtacc aaattaaaca tttaatatgg gcaggcaagg agatggagcg cttcggcctc 1320
catgagaggt tgttggagac agaggagggg tgtaaaagaa tcatagaagt cctctacccc 1380
ctagaaccaa caggatcgga gggcttaaaa agtctgttca atcttgtgtg cgtgctatat 1440
tgcttgcaca aggaacagaa agtgaaagac acagaggaag cagtagcaac agtaagacaa 1500
cactgccatc tagtggaaaa agaaaaaagt gcaacagaga catctagtgg acaaaagaaa 1560
aatgacaagg gaatagcagc gccacctggt ggcagtcaga attttccagc gcaacaacaa 1620
ggaaatgcct gggtacatgt acccttgtca ccgcgcacct taaatgcgtg ggtaaaagca 1680
gtagaggaga aaaaatttgg agcagaaata gtacccatgt ttcaagccct atcgaattcc 1740
cgtttgtgct agggttctta ggcttcttgg gggctgctgg aactgcaatg ggagcagcgg 1800
cgacagccct gacggtccag tctcagcatt tgcttgctgg gatactgcag cagcagaaga 1860
atctgctggc ggctgtggag gctcaacagc agatgttgaa gctgaccatt tggggtgtta 1920
aaaacctcaa tgcccgcgtc acagcccttg agaagtacct agaggatcag gcacgactaa 1980
actcctgggg gtgcgcatgg aaacaagtat gtcataccac agtggagtgg ccctggacaa 2040
atcggactcc ggattggcaa aatatgactt ggttggagtg ggaaagacaa atagctgatt 2100
tggaaagcaa cattacgaga caattagtga aggctagaga acaagaggaa aagaatctag 2160
atgcctatca gaagttaact agttggtcag atttctggtc ttggttcgat ttctcaaaat 2220
ggcttaacat tttaaaaatg ggatttttag taatagtagg aataataggg ttaagattac 2280
tttacacagt atatggatgt atagtgaggg ttaggcaggg atatgttcct ctatctccac 2340
agatccatat ccgcggcaat tttaaaagaa agggaggaat agggggacag acttcagcag 2400
agagactaat taatataata acaacacaat tagaaataca acatttacaa accaaaattc 2460
aaaaaatttt aaattttaga gccgcggaga tctgttacat aacttatggt aaatggcctg 2520
cctggctgac tgcccaatga cccctgccca atgatgtcaa taatgatgta tgttcccatg 2580
taatgccaat agggactttc cattgatgtc aatgggtgga gtatttatgg taactgccca 2640
cttggcagta catcaagtgt atcatatgcc aagtatgccc cctattgatg tcaatgatgg 2700
taaatggcct gcctggcatt atgcccagta catgacctta tgggactttc ctacttggca 2760
gtacatctat gtattagtca ttgctattac catgggaatt cactagtgga gaagagcatg 2820
cttgagggct gagtgcccct cagtgggcag agagcacatg gcccacagtc cctgagaagt 2880
tggggggagg ggtgggcaat tgaactggtg cctagagaag gtggggcttg ggtaaactgg 2940
gaaagtgatg tggtgtactg gctccacctt tttccccagg gtgggggaga accatatata 3000
agtgcagtag tctctgtgaa cattcaagct tctgccttct ccctcctgtg agtttgctag 3060
ccaccaatgc agattgagct gagcacctgc ttcttcctgt gcctgctgag gttctgcttc 3120
tctgccacca ggagatacta cctgggggct gtggagctga gctgggacta catgcagtct 3180
gacctggggg agctgcctgt ggatgccagg ttccccccca gagtgcccaa gagcttcccc 3240
ttcaacacct ctgtggtgta caagaagacc ctgtttgtgg agttcactga ccacctgttc 3300
aacattgcca agcccaggcc cccctggatg ggcctgctgg gccccaccat ccaggctgag 3360
gtgtatgaca ctgtggtgat caccctgaag aacatggcca gccaccctgt gagcctgcat 3420
gctgtggggg tgagctactg gaaggcctct gagggggctg agtatgatga ccagaccagc 3480
cagagggaga aggaggatga caaggtgttc cctgggggca gccacaccta tgtgtggcag 3540
gtgctgaagg agaatggccc catggcctct gaccccctgt gcctgaccta cagctacctg 3600
agccatgtgg acctggtgaa ggacctgaac tctggcctga ttggggccct gctggtgtgc 3660
agggagggca gcctggccaa ggagaagacc cagaccctgc acaagttcat cctgctgttt 3720
gctgtgtttg atgagggcaa gagctggcac tctgaaacca agaacagcct gatgcaggac 3780
agggatgctg cctctgccag ggcctggccc aagatgcaca ctgtgaatgg ctatgtgaac 3840
aggagcctgc ctggcctgat tggctgccac aggaagtctg tgtactggca tgtgattggc 3900
atgggcacca cccctgaggt gcacagcatc ttcctggagg gccacacctt cctggtcagg 3960
aaccacaggc aggccagcct ggagatcagc cccatcacct tcctgactgc ccagaccctg 4020
ctgatggacc tgggccagtt cctgctgttc tgccacatca gcagccacca gcatgatggc 4080
atggaggcct atgtgaaggt ggacagctgc cctgaggagc cccagctgag gatgaagaac 4140
aatgaggagg ctgaggacta tgatgatgac ctgactgact ctgagatgga tgtggtgagg 4200
tttgatgatg acaacagccc cagcttcatc cagatcaggt ctgtggccaa gaagcacccc 4260
aagacctggg tgcactacat tgctgctgag gaggaggact gggactatgc ccccctggtg 4320
ctggcccctg atgacaggag ctacaagagc cagtacctga acaatggccc ccagaggatt 4380
ggcaggaagt acaagaaggt caggttcatg gcctacactg atgaaacctt caagaccagg 4440
gaggccatcc agcatgagtc tggcatcctg ggccccctgc tgtatgggga ggtgggggac 4500
accctgctga tcatcttcaa gaaccaggcc agcaggccct acaacatcta cccccatggc 4560
atcactgatg tgaggcccct gtacagcagg aggctgccca agggggtgaa gcacctgaag 4620
gacttcccca tcctgcctgg ggagatcttc aagtacaagt ggactgtgac tgtggaggat 4680
ggccccacca agtctgaccc caggtgcctg accagatact acagcagctt tgtgaacatg 4740
gagagggacc tggcctctgg cctgattggc cccctgctga tctgctacaa ggagtctgtg 4800
gaccagaggg gcaaccagat catgtctgac aagaggaatg tgatcctgtt ctctgtgttt 4860
gatgagaaca ggagctggta cctgactgag aacatccaga ggttcctgcc caaccctgct 4920
ggggtgcagc tggaggaccc tgagttccag gccagcaaca tcatgcacag catcaatggc 4980
tatgtgtttg acagcctgca gctgtctgtg tgcctgcatg aggtggccta ctggtacatc 5040
ctgagcattg gggcccagac tgacttcctg tctgtgttct tctctggcta caccttcaag 5100
cacaagatgg tgtatgagga caccctgacc ctgttcccct tctctgggga gactgtgttc 5160
atgagcatgg agaaccctgg cctgtggatt ctgggctgcc acaactctga cttcaggaac 5220
aggggcatga ctgccctgct gaaagtctcc agctgtgaca agaacactgg ggactactat 5280
gaggacagct atgaggacat ctctgcctac ctgctgagca agaacaatgc cattgagccc 5340
aggagcttca gccagaacag caggcacccc agcaccaggc agaagcagtt caatgccacc 5400
accatccctg agaatgacat agagaagaca gacccatggt ttgcccaccg gacccccatg 5460
cccaagatcc agaatgtgag cagctctgac ctgctgatgc tgctgaggca gagccccacc 5520
ccccatggcc tgagcctgtc tgacctgcag gaggccaagt atgaaacctt ctctgatgac 5580
cccagccctg gggccattga cagcaacaac agcctgtctg agatgaccca cttcaggccc 5640
cagctgcacc actctgggga catggtgttc acccctgagt ctggcctgca gctgaggctg 5700
aatgagaagc tgggcaccac tgctgccact gagctgaaga agctggactt caaagtctcc 5760
agcaccagca acaacctgat cagcaccatc ccctctgaca acctggctgc tggcactgac 5820
aacaccagca gcctgggccc ccccagcatg cctgtgcact atgacagcca gctggacacc 5880
accctgtttg gcaagaagag cagccccctg actgagtctg ggggccccct gagcctgtct 5940
gaggagaaca atgacagcaa gctgctggag tctggcctga tgaacagcca ggagagcagc 6000
tggggcaaga atgtgagcag cagggagatc accaggacca ccctgcagtc tgaccaggag 6060
gagattgact atgatgacac catctctgtg gagatgaaga aggaggactt tgacatctac 6120
gacgaggacg agaaccagag ccccaggagc ttccagaaga agaccaggca ctacttcatt 6180
gctgctgtgg agaggctgtg ggactatggc atgagcagca gcccccatgt gctgaggaac 6240
agggcccagt ctggctctgt gccccagttc aagaaggtgg tgttccagga gttcactgat 6300
ggcagcttca cccagcccct gtacagaggg gagctgaatg agcacctggg cctgctgggc 6360
ccctacatca gggctgaggt ggaggacaac atcatggtga ccttcaggaa ccaggccagc 6420
aggccctaca gcttctacag cagcctgatc agctatgagg aggaccagag gcagggggct 6480
gagcccagga agaactttgt gaagcccaat gaaaccaaga cctacttctg gaaggtgcag 6540
caccacatgg cccccaccaa ggatgagttt gactgcaagg cctgggccta cttctctgat 6600
gtggacctgg agaaggatgt gcactctggc ctgattggcc ccctgctggt gtgccacacc 6660
aacaccctga accctgccca tggcaggcag gtgactgtgc aggagtttgc cctgttcttc 6720
accatctttg atgaaaccaa gagctggtac ttcactgaga acatggagag gaactgcagg 6780
gccccctgca acatccagat ggaggacccc accttcaagg agaactacag gttccatgcc 6840
atcaatggct acatcatgga caccctgcct ggcctggtga tggcccagga ccagaggatc 6900
aggtggtacc tgctgagcat gggcagcaat gagaacatcc acagcatcca cttctctggc 6960
catgtgttca ctgtgaggaa gaaggaggag tacaagatgg ccctgtacaa cctgtaccct 7020
ggggtgtttg agactgtgga gatgctgccc agcaaggctg gcatctggag ggtggagtgc 7080
ctgattgggg agcacctgca tgctggcatg agcaccctgt tcctggtgta cagcaacaag 7140
tgccagaccc ccctgggcat ggcctctggc cacatcaggg acttccagat cactgcctct 7200
ggccagtatg gccagtgggc ccccaagctg gccaggctgc actactctgg cagcatcaat 7260
gcctggagca ccaaggagcc cttcagctgg atcaaggtgg acctgctggc ccccatgatc 7320
atccatggca tcaagaccca gggggccagg cagaagttca gcagcctgta catcagccag 7380
ttcatcatca tgtacagcct ggatggcaag aagtggcaga cctacagggg caacagcact 7440
ggcaccctga tggtgttctt tggcaatgtg gacagctctg gcatcaagca caacatcttc 7500
aaccccccca tcattgccag atacatcagg ctgcacccca cccactacag catcaggagc 7560
accctgagga tggagctgat gggctgtgac ctgaacagct gcagcatgcc cctgggcatg 7620
gagagcaagg ccatctctga tgcccagatc actgccagca gctacttcac caacatgttt 7680
gccacctgga gccccagcaa ggccaggctg cacctgcagg gcaggagcaa tgcctggagg 7740
ccccaggtca acaaccccaa ggagtggctg caggtggact tccagaagac catgaaggtg 7800
actggggtga ccacccaggg ggtgaagagc ctgctgacca gcatgtatgt gaaggagttc 7860
ctgatcagca gcagccagga tggccaccag tggaccctgt tcttccagaa tggcaaggtg 7920
aaggtgttcc agggcaacca ggacagcttc acccctgtgg tgaacagcct ggaccccccc 7980
ctgctgacca gatacctgag gattcacccc cagagctggg tgcaccagat tgccctgagg 8040
atggaggtgc tgggctgtga ggcccaggac ctgtactgag cggccgcggg cccaatcaac 8100
ctctggatta caaaatttgt gaaagattga ctggtattct taactatgtt gctcctttta 8160
cgctatgtgg atacgctgct ttaatgcctt tgtatcatgc tattgcttcc cgtatggctt 8220
tcattttctc ctccttgtat aaatcctggt tgctgtctct ttatgaggag ttgtggcccg 8280
ttgtcaggca acgtggcgtg gtgtgcactg tgtttgctga cgcaaccccc actggttggg 8340
gcattgccac cacctgtcag ctcctttccg ggactttcgc tttccccctc cctattgcca 8400
cggcggaact catcgccgcc tgccttgccc gctgctggac aggggctcgg ctgttgggca 8460
ctgacaattc cgtggtgttg tcggggaaat catcgtcctt tccttggctg ctcgcctgtg 8520
ttgccacctg gattctgcgc gggacgtcct tctgctacgt cccttcggcc ctcaatccag 8580
cggaccttcc ttcccgcggc ctgctgccgg ctctgcggcc tcttccgcgt cttcgccttc 8640
gccctcagac gagtcggatc tccctttggg ccgcctcccc gcaagcttcg cactttttaa 8700
aagaaaaggg aggactggat gggatttatt actccgatag gacgctggct tgtaactcag 8760
tctcttacta ggagaccagc ttgagcctgg gtgttcgctg gttagcctaa cctggttggc 8820
caccaggggt aaggactcct tggcttagaa agctaataaa cttgcctgca ttagagctct 8880
tacgcgtccc gggctcgaga tccgcatctc aattagtcag caaccatagt cccgccccta 8940
actccgccca tcccgcccct aactccgccc agttccgccc attctccgcc ccatggctga 9000
ctaatttttt ttatttatgc agaggccgag gccgcctcgg cctctgagct attccagaag 9060
tagtgaggag gcttttttgg aggcctaggc ttttgcaaaa agctaacttg tttattgcag 9120
cttataatgg ttacaaataa agcaatagca tcacaaattt cacaaataaa gcattttttt 9180
cactgcattc tagttgtggt ttgtccaaac tcatcaatgt atcttatcat gtctgtccgc 9240
ttcctcgctc actgactcgc tgcgctcggt cgttcggctg cggcgagcgg tatcagctca 9300
ctcaaaggcg gtaatacggt tatccacaga atcaggggat aacgcaggaa agaacatgtg 9360
agcaaaaggc cagcaaaagg ccaggaaccg taaaaaggcc gcgttgctgg cgtttttcca 9420
taggctccgc ccccctgacg agcatcacaa aaatcgacgc tcaagtcaga ggtggcgaaa 9480
cccgacagga ctataaagat accaggcgtt tccccctgga agctccctcg tgcgctctcc 9540
tgttccgacc ctgccgctta ccggatacct gtccgccttt ctcccttcgg gaagcgtggc 9600
gctttctcat agctcacgct gtaggtatct cagttcggtg taggtcgttc gctccaagct 9660
gggctgtgtg cacgaacccc ccgttcagcc cgaccgctgc gccttatccg gtaactatcg 9720
tcttgagtcc aacccggtaa gacacgactt atcgccactg gcagcagcca ctggtaacag 9780
gattagcaga gcgaggtatg taggcggtgc tacagagttc ttgaagtggt ggcctaacta 9840
cggctacact agaagaacag tatttggtat ctgcgctctg ctgaagccag ttaccttcgg 9900
aaaaagagtt ggtagctctt gatccggcaa acaaaccacc gctggtagcg gtggtttttt 9960
tgtttgcaag cagcagatta cgcgcagaaa aaaaggatct caagaagatc ctttgatctt 10020
ttctacgggg tctgacgctc agtggaacga aaactcacgt taagggattt tggtcatgag 10080
attatcaaaa aggatcttca cctagatcct tttaaattaa aaatgaagtt ttaaatcaat 10140
ctaaagtata tatgagtaaa cttggtctga cagttagaaa aactcatcga gcatcaaatg 10200
aaactgcaat ttattcatat caggattatc aataccatat ttttgaaaaa gccgtttctg 10260
taatgaagga gaaaactcac cgaggcagtt ccataggatg gcaagatcct ggtatcggtc 10320
tgcgattccg actcgtccaa catcaataca acctattaat ttcccctcgt caaaaataag 10380
gttatcaagt gagaaatcac catgagtgac gactgaatcc ggtgagaatg gcaacagctt 10440
atgcatttct ttccagactt gttcaacagg ccagccatta cgctcgtcat caaaatcact 10500
cgcatcaacc aaaccgttat tcattcgtga ttgcgcctga gcgagacgaa atacgcgatc 10560
gctgttaaaa ggacaattac aaacaggaat cgaatgcaac cggcgcagga acactgccag 10620
cgcatcaaca atattttcac ctgaatcagg atattcttct aatacctgga atgctgtttt 10680
tccggggatc gcagtggtga gtaaccatgc atcatcagga gtacggataa aatgcttgat 10740
ggtcggaaga ggcataaatt ccgtcagcca gtttagtctg accatctcat ctgtaacatc 10800
attggcaacg ctacctttgc catgtttcag aaacaactct ggcgcatcgg gcttcccata 10860
caatcgatag attgtcgcac ctgattgccc gacattatcg cgagcccatt tatacccata 10920
taaatcagca tccatgttgg aatttaatcg cggcctagag caagacgttt cccgttgaat 10980
atggctcata acaccccttg tattactgtt tatgtaagca gacagtttta ttgttcatga 11040
tgatatattt ttatcttgtg caatgtaaca tcagagattt tgagacacaa caattggtcg 11100
acggatcc 11108
<210> 29
<211> 1738
<212> DNA
<213> Artificial Sequence
<220>
<223> CAG promoter
<400> 29
attgattatt gactagttat taatagtaat caattacggg gtcattagtt catagcccat 60
atatggagtt ccgcgttaca taacttacgg taaatggccc gcctggctga ccgcccaacg 120
acccccgccc attgacgtca ataatgacgt atgttcccat agtaacgcca atagggactt 180
tccattgacg tcaatgggtg gagtatttac ggtaaactgc ccacttggca gtacatcaag 240
tgtatcatat gccaagtacg ccccctattg acgtcaatga cggtaaatgg cccgcctggc 300
attatgccca gtacatgacc ttatgggact ttcctacttg gcagtacatc tacgtattag 360
tcatcgctat taccatggtc gaggtgagcc ccacgttctg cttcactctc cccatctccc 420
ccccctcccc acccccaatt ttgtatttat ttatttttta attattttgt gcagcgatgg 480
gggcgggggg gggggggggg cgcgcgccag gcggggcggg gcggggcgag gggcggggcg 540
gggcgaggcg gagaggtgcg gcggcagcca atcagagcgg cgcgctccga aagtttcctt 600
ttatggcgag gcggcggcgg cggcggccct ataaaaagcg aagcgcgcgg cgggcgggag 660
tcgctgcgcg ctgccttcgc cccgtgcccc gctccgccgc cgcctcgcgc cgcccgcccc 720
ggctctgact gaccgcgtta ctcccacagg tgagcgggcg ggacggccct tctcctccgg 780
gctgtaatta gcgcttggtt taatgacggc ttgtttcttt tctgtggctg cgtgaaagcc 840
ttgaggggct ccgggagggc cctttgtgcg gggggagcgg ctcggggggt gcgtgcgtgt 900
gtgtgtgcgt ggggagcgcc gcgtgcggct ccgcgctgcc cggcggctgt gagcgctgcg 960
ggcgcggcgc ggggctttgt gcgctccgca gtgtgcgcga ggggagcgcg gccgggggcg 1020
gtgccccgcg gtgcgggggg ggctgcgagg ggaacaaagg ctgcgtgcgg ggtgtgtgcg 1080
tgggggggtg agcagggggt gtgggcgcgt cggtcgggct gcaacccccc ctgcaccccc 1140
ctccccgagt tgctgagcac ggcccggctt cgggtgcggg gctccgtacg gggcgtggcg 1200
cggggctcgc cgtgccgggc ggggggtggc ggcaggtggg ggtgccgggc ggggcggggc 1260
cgcctcgggc cggggagggc tcgggggagg ggcgcggcgg cccccggagc gccggcggct 1320
gtcgaggcgc ggcgagccgc agccattgcc ttttatggta atcgtgcgag agggcgcagg 1380
gacttccttt gtcccaaatc tgtgcggagc cgaaatctgg gaggcgccgc cgcaccccct 1440
ctagcgggcg cggggcgaag cggtgcggcg ccggcaggaa ggaaatgggc ggggagggcc 1500
ttcgtgcgtc gccgcgccgc cgtccccttc tccctctcca gcctcggggc tgtccgcggg 1560
gggacggctg ccttcggggg ggacggggca gggcggggtt cggcttctgg cgtgtgaccg 1620
gcggctctag agcctctgct aaccatgttc atgccttctt ctttttccta cagctcctgg 1680
gcaacgtgct ggttattgtg ctgtctcatc attttggcaa agaattgctc gagccacc 1738
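The sequence entries above follow the numeric-identifier layout used in patent sequence listings (`<210>` for the sequence number, `<211>` for the length, `<400>` for the residues), with residues in groups of ten and a running position counter at the end of each line. As a minimal sketch, assuming that layout holds for every line, the Python below joins such lines into one string and checks the result against the declared length. The names `parse_sequence_block` and `length_matches` are hypothetical helpers, not part of the patent or of any library.

```python
def parse_sequence_block(lines):
    """Join sequence-listing lines into one residue string.

    Each line is assumed to hold space-separated residue groups followed by a
    running position counter, e.g. "attgattatt gactagttat ... 60".
    Returns (sequence, last_counter); last_counter is None if no counter was seen.
    """
    residues = []
    last_counter = None
    for line in lines:
        tokens = line.split()
        if not tokens:
            continue  # skip blank lines
        if tokens[-1].isdigit():
            last_counter = int(tokens[-1])  # trailing counter, not sequence data
            tokens = tokens[:-1]
        residues.append("".join(tokens).lower())
    return "".join(residues), last_counter


def length_matches(lines, declared_length):
    """True if the joined sequence and the final counter both equal the <211> value."""
    sequence, last_counter = parse_sequence_block(lines)
    return len(sequence) == declared_length and last_counter == declared_length


# First line of SEQ ID NO: 29 plus a made-up short second line for illustration.
example = [
    "attgattatt gactagttat taatagtaat caattacggg gtcattagtt catagcccat 60",
    "atatggag 68",
]
sequence, counter = parse_sequence_block(example)
print(len(sequence), counter, length_matches(example, 68))
```

Applied to the full SEQ ID NO: 29 entry, the declared length to check against would be its `<211>` value, 1738.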
Claims (33)
- A method of producing a retroviral vector pseudotyped with hemagglutinin-neuraminidase (HN) and fusion (F) proteins from a respiratory paramyxovirus, the vector comprising a promoter and a transgene, wherein the method comprises the use of a codon-optimised gag-pol gene.
- The method of claim 1, wherein the retroviral vector is a lentiviral vector.
- The method of claim 2, wherein the lentiviral vector is selected from the group consisting of a simian immunodeficiency virus (SIV) vector, a human immunodeficiency virus (HIV) vector, a feline immunodeficiency virus (FIV) vector, an equine infectious anaemia virus (EIAV) vector, and a Visna/maedi virus vector.
- The method of claim 2 or 3, wherein the lentiviral vector is an SIV vector.
- The method of any one of claims 1 to 4, wherein the codon-optimised gag-pol gene is an SIV gag-pol gene.
- The method of any one of claims 1 to 5, wherein the codon-optimised gag-pol gene comprises or consists of a nucleic acid sequence having at least 80% sequence identity to SEQ ID NO: 1.
- The method of claim 6, wherein the codon-optimised gag-pol gene comprises or consists of the nucleic acid sequence of SEQ ID NO: 1.
- The method of any one of claims 1 to 7, wherein the codon-optimised gag-pol gene is comprised in a plasmid that comprises or consists of a nucleic acid sequence having at least 80% sequence identity to SEQ ID NO: 5.
- The method of claim 8, wherein the codon-optimised gag-pol gene is comprised in a plasmid that comprises or consists of the nucleic acid sequence of SEQ ID NO: 5.
- The method of any one of claims 1 to 9, wherein the respiratory paramyxovirus is a Sendai virus.
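- The method of any one of claims 1 to 10, wherein the titre of the retroviral vector produced is:
a) equivalent to the titre of a retroviral vector produced by a corresponding method that does not use a codon-optimised gag-pol gene; or
b) increased compared with the titre of a retroviral vector produced by a corresponding method that does not use a codon-optimised gag-pol gene.
- The method of claim 11, wherein the titre of the retroviral vector is at least 2-fold, or at least 2.5-fold, greater than the titre of a retroviral vector produced by a corresponding method that does not use a codon-optimised gag-pol gene.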
- The method of any one of claims 1 to 12, wherein the promoter is selected from the group consisting of a cytomegalovirus (CMV) promoter, an elongation factor 1a (EF1a) promoter, and a hybrid human CMV enhancer/EF1a (hCEF) promoter.
- The method of any one of claims 1 to 13, wherein the vector comprises a hybrid human CMV enhancer/EF1a (hCEF) promoter.
- The method of any one of claims 1 to 14, wherein the transgene is selected from:
a) a secreted therapeutic protein, optionally alpha-1 antitrypsin (A1AT), Factor VIII, surfactant protein B (SFTPB), Factor VII, Factor IX, Factor X, Factor XI, von Willebrand factor, granulocyte-macrophage colony-stimulating factor (GM-CSF), and a monoclonal antibody against an infectious agent; or
b) CFTR, ABCA3, DNAH5, DNAH11, DNAI1, and DNAI2.
- The method of any one of claims 1 to 15, wherein the transgene encodes:
a) CFTR;
b) A1AT; or
c) FVIII.
- The method of any one of claims 1 to 16, wherein:
a) the promoter is an hCEF promoter and the transgene encodes CFTR;
b) the promoter is an hCEF promoter and the transgene encodes A1AT; or
c) the promoter is an hCEF or CMV promoter and the transgene encodes FVIII.
- The method of any one of claims 1 to 17, wherein the method comprises the steps of:
a) growing cells in suspension;
b) transfecting the cells with one or more plasmids;
c) adding a nuclease;
d) harvesting the lentivirus;
e) adding trypsin; and
f) purifying.
- The method of claim 18, wherein the one or more plasmids comprise or consist of:
a) a vector genome plasmid, preferably selected from pGM830 and pGM326;
b) a co-gagpol plasmid, preferably pGM691;
c) a Rev plasmid, preferably pGM299;
d) a fusion (F) protein plasmid, preferably pGM301; and
e) a hemagglutinin-neuraminidase (HN) plasmid, preferably pGM303.
- The method of claim 19, wherein the ratio of vector genome plasmid:co-gagpol plasmid:Rev plasmid:F plasmid:HN plasmid is 20:9:6:6:6.
- The method of any one of claims 18 to 20, wherein steps (a)-(f) are carried out sequentially.
- The method of any one of claims 18 to 21, wherein the cells are HEK293T or 293T/17 cells.
- The method of any one of claims 18 to 22, wherein the addition of the nuclease is a pre-harvest step.
- The method of any one of claims 18 to 23, wherein the addition of the trypsin is a post-harvest step.
- The method of any one of claims 18 to 24, wherein the purification step comprises a chromatography step.
- The method of any one of claims 19 to 24, wherein the vector genome plasmid has been modified to reduce the number of retroviral ORFs.
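- A nucleic acid comprising a codon-optimised gag-pol gene, wherein the nucleic acid has at least 80% sequence identity to SEQ ID NO: 1.
- The nucleic acid of claim 27, comprising or consisting of the nucleic acid sequence of SEQ ID NO: 1.
- A plasmid comprising a nucleic acid as defined in claim 27 or 28, optionally wherein:
a) the plasmid comprises or consists of a nucleic acid sequence having at least 80% sequence identity to SEQ ID NO: 5; or
b) the plasmid comprises or consists of the nucleic acid sequence of SEQ ID NO: 5.
- A host cell comprising a nucleic acid as defined in claim 27 or 28 and/or a plasmid as defined in claim 29.
- A retroviral vector pseudotyped with hemagglutinin-neuraminidase (HN) and fusion (F) proteins from a respiratory paramyxovirus, obtainable by the method as defined in any one of claims 1 to 26.
- A method of treating a disease, comprising administering to a subject in need thereof a retroviral vector pseudotyped with hemagglutinin-neuraminidase (HN) and fusion (F) proteins from a respiratory paramyxovirus, obtainable by the method as defined in any one of claims 1 to 26.
- The method of treatment of claim 32, wherein the disease is a lung disease, preferably cystic fibrosis.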
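Two quantities in the claims lend themselves to a small worked example: the 20:9:6:6:6 plasmid ratio and the "at least 80% sequence identity" threshold. The Python sketch below is illustrative only. It assumes the ratio is applied to a single total quantity (the claim as worded does not say whether that is by mass or by mole), and it computes identity over two sequences that have already been aligned, counting identical non-gap columns over the alignment length. That is one common convention and may differ from the definition used in the patent description. The function names and the total of 47 units are hypothetical.

```python
from fractions import Fraction

# Plasmid parts as listed in the ratio claim.
PLASMID_RATIO = {
    "vector genome": 20,
    "co-gagpol": 9,
    "Rev": 6,
    "F": 6,
    "HN": 6,
}


def split_by_ratio(total, ratio):
    """Divide `total` among the keys of `ratio` in proportion to their parts."""
    parts = sum(ratio.values())
    if parts <= 0:
        raise ValueError("ratio must contain positive parts")
    # Fraction keeps the arithmetic exact until the final conversion to float.
    return {name: float(Fraction(total) * part / parts) for name, part in ratio.items()}


def percent_identity(aligned_a, aligned_b):
    """Percent of alignment columns holding the same residue in both sequences.

    Both inputs must already be aligned to equal length, with '-' marking gaps.
    """
    if len(aligned_a) != len(aligned_b) or not aligned_a:
        raise ValueError("sequences must be aligned, equal-length and non-empty")
    matches = sum(
        1
        for a, b in zip(aligned_a.lower(), aligned_b.lower())
        if a == b and a != "-"
    )
    return 100.0 * matches / len(aligned_a)


amounts = split_by_ratio(47, PLASMID_RATIO)  # 47 = 20 + 9 + 6 + 6 + 6
identity = percent_identity("acgt-acgta", "acgtaacgtt")
print(amounts)
print(identity, identity >= 80.0)
```

A real comparison against SEQ ID NO: 1 or SEQ ID NO: 5 would first need an alignment step, which this sketch deliberately leaves out.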
Applications Claiming Priority (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
GB2102832.9 | 2021-02-26 | | |
GBGB2102832.9A GB202102832D0 (en) | 2021-02-26 | 2021-02-26 | Retroviral vectors |
PCT/GB2022/050524 WO2022180411A1 (en) | 2021-02-26 | 2022-02-25 | Retroviral vectors |
Publications (1)
Publication Number | Publication Date |
---|---|
KR20230154015A (ko) | 2023-11-07 |
Family
ID=75339978
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
KR1020237029670A KR20230154015A (ko) | 2021-02-26 | 2022-02-25 | Retroviral vectors |
Country Status (17)
Country | Link |
---|---|
US (1) | US20220273821A1 (ko) |
EP (1) | EP4298226A1 (ko) |
JP (1) | JP2024509789A (ko) |
KR (1) | KR20230154015A (ko) |
CN (1) | CN116940686A (ko) |
AR (1) | AR124992A1 (ko) |
AU (1) | AU2022225723A1 (ko) |
CA (1) | CA3208936A1 (ko) |
CL (1) | CL2023002470A1 (ko) |
CO (1) | CO2023012522A2 (ko) |
CR (1) | CR20230453A (ko) |
DO (1) | DOP2023000167A (ko) |
GB (1) | GB202102832D0 (ko) |
IL (1) | IL304808A (ko) |
MX (1) | MX2023009990A (ko) |
TW (1) | TW202246508A (ko) |
WO (1) | WO2022180411A1 (ko) |
Family Cites Families (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5223409A (en) | 1988-09-02 | 1993-06-29 | Protein Engineering Corp. | Directed evolution of novel binding proteins |
IL99552A0 (en) | 1990-09-28 | 1992-08-18 | Ixsys Inc | Compositions containing procaryotic cells,a kit for the preparation of vectors useful for the coexpression of two or more dna sequences and methods for the use thereof |
GB0009760D0 (en) * | 2000-04-19 | 2000-06-07 | Oxford Biomedica Ltd | Method |
IL157335A0 (en) * | 2001-03-13 | 2004-02-19 | Novartis Ag | Lentiviral packaging constructs |
EP3119798B1 (en) * | 2014-03-17 | 2020-08-05 | Adverum Biotechnologies, Inc. | Compositions and methods for enhanced gene expression in cone cells |
GB2526339A (en) * | 2014-05-21 | 2015-11-25 | Imp Innovations Ltd | Lentiviral vectors |
2021
- 2021-02-26 GB GBGB2102832.9A patent/GB202102832D0/en not_active Ceased
2022
- 2022-02-25 CN CN202280014096.XA patent/CN116940686A/zh active Pending
- 2022-02-25 CA CA3208936A patent/CA3208936A1/en active Pending
- 2022-02-25 EP EP22708589.1A patent/EP4298226A1/en active Pending
- 2022-02-25 US US17/681,647 patent/US20220273821A1/en active Pending
- 2022-02-25 KR KR1020237029670A patent/KR20230154015A/ko unknown
- 2022-02-25 WO PCT/GB2022/050524 patent/WO2022180411A1/en active Application Filing
- 2022-02-25 MX MX2023009990A patent/MX2023009990A/es unknown
- 2022-02-25 TW TW111106985A patent/TW202246508A/zh unknown
- 2022-02-25 AR ARP220100434A patent/AR124992A1/es unknown
- 2022-02-25 JP JP2023552051A patent/JP2024509789A/ja active Pending
- 2022-02-25 CR CR20230453A patent/CR20230453A/es unknown
- 2022-02-25 AU AU2022225723A patent/AU2022225723A1/en active Pending
2023
- 2023-07-27 IL IL304808A patent/IL304808A/en unknown
- 2023-08-21 CL CL2023002470A patent/CL2023002470A1/es unknown
- 2023-08-25 DO DO2023000167A patent/DOP2023000167A/es unknown
- 2023-09-22 CO CONC2023/0012522A patent/CO2023012522A2/es unknown
Also Published As
Publication number | Publication date |
---|---|
GB202102832D0 (en) | 2021-04-14 |
CA3208936A1 (en) | 2022-09-01 |
JP2024509789A (ja) | 2024-03-05 |
CL2023002470A1 (es) | 2024-01-26 |
DOP2023000167A (es) | 2023-11-30 |
AR124992A1 (es) | 2023-05-24 |
CN116940686A (zh) | 2023-10-24 |
US20220273821A1 (en) | 2022-09-01 |
CO2023012522A2 (es) | 2023-10-09 |
AU2022225723A1 (en) | 2023-08-10 |
TW202246508A (zh) | 2022-12-01 |
WO2022180411A1 (en) | 2022-09-01 |
CR20230453A (es) | 2023-11-15 |
EP4298226A1 (en) | 2024-01-03 |
MX2023009990A (es) | 2023-10-16 |
IL304808A (en) | 2023-09-01 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
ES2805045T3 (es) | | Lentiviral vectors |
AU2020260485B2 (en) | | Gene therapies for lysosomal disorders |
AU2019203955C1 (en) | | Multipartite signaling proteins and uses thereof |
SA516371030B1 (ar) | | Vectors for expression of prostate-associated antigens |
KR20230035689A (ko) | | Engineered Cascade components and Cascade complexes |
KR20210150486A (ko) | | Gene therapies for lysosomal disorders |
KR20230066360A (ko) | | Gene therapies for neurodegenerative disorders |
KR20230019450A (ko) | | Encapsulated RNA replicons and methods of use |
TW202308669A (zh) | | Chimeric costimulatory receptors, chemokine receptors, and their use in cellular immunotherapy |
KR20220078607A (ko) | | Compositions and methods for TCR reprogramming using fusion proteins |
US20240082327A1 (en) | | Retroviral vectors |
KR20240037192A (ko) | | Methods and compositions for genomic integration |
KR20230154015A (ko) | | Retroviral vectors |
TW202424202A (zh) | | Retroviral vectors |
WO2024062259A1 (en) | | Retroviral vector comprising rre inserted within an intron |
WO2024069192A1 (en) | | Gene therapy |
KR20210150487A (ko) | | Gene therapies for lysosomal disorders |
KR20240029020A (ko) | | CRISPR-transposon systems for DNA modification |
TW202233830A (zh) | | Compositions and methods for treating cancer using next-generation engineered T cell therapies |