CN113106107A - Polynucleotide expressing HPV35 L1, and expression vector, host cell and use thereof - Google Patents
- Publication number
- CN113106107A (application CN202110442669.6A)
- Authority
- CN
- China
- Prior art keywords
- HPV35 L1
- protein
- thalli
- polynucleotide
- methanol
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Links
- 239000013604 expression vector Substances 0.000 title claims abstract description 18
- 108091033319 polynucleotide Proteins 0.000 title claims abstract description 18
- 102000040430 polynucleotide Human genes 0.000 title claims abstract description 18
- 239000002157 polynucleotide Substances 0.000 title claims abstract description 18
- 101000781698 Human papillomavirus 35 Major capsid protein L1 Proteins 0.000 claims abstract description 53
- 238000000034 method Methods 0.000 claims abstract description 20
- OKKJLVBELUTLKV-UHFFFAOYSA-N Methanol Chemical group OC OKKJLVBELUTLKV-UHFFFAOYSA-N 0.000 claims description 51
- PEDCQBHIVMGVHV-UHFFFAOYSA-N Glycerine Chemical compound OCC(O)CO PEDCQBHIVMGVHV-UHFFFAOYSA-N 0.000 claims description 30
- 241000320412 Ogataea angusta Species 0.000 claims description 26
- 241001052560 Thallis Species 0.000 claims description 25
- 238000000855 fermentation Methods 0.000 claims description 22
- 230000004151 fermentation Effects 0.000 claims description 22
- 229960002566 papillomavirus vaccine Drugs 0.000 claims description 14
- 230000006698 induction Effects 0.000 claims description 13
- 238000003259 recombinant expression Methods 0.000 claims description 13
- 239000001963 growth medium Substances 0.000 claims description 10
- FWMNVWWHGCHHJJ-SKKKGAJSSA-N 4-amino-1-[(2r)-6-amino-2-[[(2r)-2-[[(2r)-2-[[(2r)-2-amino-3-phenylpropanoyl]amino]-3-phenylpropanoyl]amino]-4-methylpentanoyl]amino]hexanoyl]piperidine-4-carboxylic acid Chemical compound C([C@H](C(=O)N[C@H](CC(C)C)C(=O)N[C@H](CCCCN)C(=O)N1CCC(N)(CC1)C(O)=O)NC(=O)[C@H](N)CC=1C=CC=CC=1)C1=CC=CC=C1 FWMNVWWHGCHHJJ-SKKKGAJSSA-N 0.000 claims description 8
- QVGXLLKOCUKJST-UHFFFAOYSA-N atomic oxygen Chemical compound [O] QVGXLLKOCUKJST-UHFFFAOYSA-N 0.000 claims description 8
- 239000002773 nucleotide Substances 0.000 claims description 8
- 125000003729 nucleotide group Chemical group 0.000 claims description 8
- 229910052760 oxygen Inorganic materials 0.000 claims description 8
- 239000001301 oxygen Substances 0.000 claims description 8
- 240000004808 Saccharomyces cerevisiae Species 0.000 claims description 7
- 239000013612 plasmid Substances 0.000 claims description 7
- 238000002360 preparation method Methods 0.000 claims description 7
- 238000000746 purification Methods 0.000 claims description 7
- 238000012258 culturing Methods 0.000 claims description 6
- 239000006166 lysate Substances 0.000 claims description 6
- 230000008569 process Effects 0.000 claims description 6
- 238000004519 manufacturing process Methods 0.000 claims description 5
- 238000003756 stirring Methods 0.000 claims description 5
- 238000004587 chromatography analysis Methods 0.000 claims description 4
- 238000005277 cation exchange chromatography Methods 0.000 claims description 3
- 230000001133 acceleration Effects 0.000 claims description 2
- 229920001184 polypeptide Polymers 0.000 claims description 2
- 108090000765 processed proteins & peptides Proteins 0.000 claims description 2
- 102000004196 processed proteins & peptides Human genes 0.000 claims description 2
- 238000000926 separation method Methods 0.000 claims description 2
- 239000012646 vaccine adjuvant Substances 0.000 claims description 2
- 229940124931 vaccine adjuvant Drugs 0.000 claims description 2
- 229960005486 vaccine Drugs 0.000 abstract description 19
- 208000015181 infectious disease Diseases 0.000 abstract description 4
- 230000014509 gene expression Effects 0.000 description 30
- 108020004414 DNA Proteins 0.000 description 12
- 241000701806 Human papillomavirus Species 0.000 description 11
- 108090000623 proteins and genes Proteins 0.000 description 11
- 239000000243 solution Substances 0.000 description 11
- 241000894006 Bacteria Species 0.000 description 10
- 239000007788 liquid Substances 0.000 description 10
- 238000002965 ELISA Methods 0.000 description 9
- 230000001580 bacterial effect Effects 0.000 description 9
- 239000013598 vector Substances 0.000 description 9
- 230000001939 inductive effect Effects 0.000 description 8
- 239000000725 suspension Substances 0.000 description 8
- 239000007924 injection Substances 0.000 description 7
- 238000002347 injection Methods 0.000 description 7
- 230000003612 virological effect Effects 0.000 description 7
- RYGMFSIKBFXOCR-UHFFFAOYSA-N Copper Chemical compound [Cu] RYGMFSIKBFXOCR-UHFFFAOYSA-N 0.000 description 6
- FAPWRFPIFSIZLT-UHFFFAOYSA-M Sodium chloride Chemical compound [Na+].[Cl-] FAPWRFPIFSIZLT-UHFFFAOYSA-M 0.000 description 6
- 239000000427 antigen Substances 0.000 description 6
- 102000036639 antigens Human genes 0.000 description 6
- 108091007433 antigens Proteins 0.000 description 6
- 229910052802 copper Inorganic materials 0.000 description 6
- 239000010949 copper Substances 0.000 description 6
- 108091026890 Coding region Proteins 0.000 description 5
- 235000014680 Saccharomyces cerevisiae Nutrition 0.000 description 5
- 150000001413 amino acids Chemical group 0.000 description 5
- 230000005847 immunogenicity Effects 0.000 description 5
- 239000002245 particle Substances 0.000 description 5
- 102000004169 proteins and genes Human genes 0.000 description 5
- 210000002966 serum Anatomy 0.000 description 5
- 238000013518 transcription Methods 0.000 description 5
- 230000035897 transcription Effects 0.000 description 5
- 206010008342 Cervix carcinoma Diseases 0.000 description 4
- 208000006105 Uterine Cervical Neoplasms Diseases 0.000 description 4
- 239000002671 adjuvant Substances 0.000 description 4
- 201000010881 cervical cancer Diseases 0.000 description 4
- 238000006243 chemical reaction Methods 0.000 description 4
- 239000013613 expression plasmid Substances 0.000 description 4
- 230000003053 immunization Effects 0.000 description 4
- 238000012986 modification Methods 0.000 description 4
- 230000004048 modification Effects 0.000 description 4
- 238000012216 screening Methods 0.000 description 4
- 239000006228 supernatant Substances 0.000 description 4
- 108010025188 Alcohol oxidase Proteins 0.000 description 3
- 108090000565 Capsid Proteins Proteins 0.000 description 3
- 102100023321 Ceruloplasmin Human genes 0.000 description 3
- 108090000790 Enzymes Proteins 0.000 description 3
- 102000004190 Enzymes Human genes 0.000 description 3
- 241000341655 Human papillomavirus type 16 Species 0.000 description 3
- 241001465754 Metazoa Species 0.000 description 3
- 241000699666 Mus <mouse, genus> Species 0.000 description 3
- 241000699670 Mus sp. Species 0.000 description 3
- KWYUFKZDYYNOTN-UHFFFAOYSA-M Potassium hydroxide Chemical compound [OH-].[K+] KWYUFKZDYYNOTN-UHFFFAOYSA-M 0.000 description 3
- 102000007056 Recombinant Fusion Proteins Human genes 0.000 description 3
- 108010008281 Recombinant Fusion Proteins Proteins 0.000 description 3
- 238000010276 construction Methods 0.000 description 3
- 238000001514 detection method Methods 0.000 description 3
- 230000000694 effects Effects 0.000 description 3
- 239000000463 material Substances 0.000 description 3
- 238000002156 mixing Methods 0.000 description 3
- 239000000203 mixture Substances 0.000 description 3
- 150000007523 nucleic acids Chemical class 0.000 description 3
- 239000000047 product Substances 0.000 description 3
- 239000012460 protein solution Substances 0.000 description 3
- 238000012552 review Methods 0.000 description 3
- 239000011780 sodium chloride Substances 0.000 description 3
- 238000002415 sodium dodecyl sulfate polyacrylamide gel electrophoresis Methods 0.000 description 3
- 238000012360 testing method Methods 0.000 description 3
- FDWIKIIKBRJSHK-UHFFFAOYSA-N 2-(2-methyl-4-oxochromen-5-yl)acetic acid Chemical compound C1=CC=C2OC(C)=CC(=O)C2=C1CC(O)=O FDWIKIIKBRJSHK-UHFFFAOYSA-N 0.000 description 2
- ZIWWTZWAKYBUOB-CIUDSAMLSA-N Ala-Asp-Leu Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(C)C)C(O)=O ZIWWTZWAKYBUOB-CIUDSAMLSA-N 0.000 description 2
- 108020004705 Codon Proteins 0.000 description 2
- ZZWUYQXMIFTIIY-WEDXCCLWSA-N Gly-Thr-Leu Chemical compound [H]NCC(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(C)C)C(O)=O ZZWUYQXMIFTIIY-WEDXCCLWSA-N 0.000 description 2
- 108010065920 Insulin Lispro Proteins 0.000 description 2
- 239000007993 MOPS buffer Substances 0.000 description 2
- 108010002311 N-glycylglutamic acid Proteins 0.000 description 2
- 206010028980 Neoplasm Diseases 0.000 description 2
- 108091028043 Nucleic acid sequence Proteins 0.000 description 2
- NBIIXXVUZAFLBC-UHFFFAOYSA-N Phosphoric acid Chemical compound OP(O)(O)=O NBIIXXVUZAFLBC-UHFFFAOYSA-N 0.000 description 2
- 108010047495 alanylglycine Proteins 0.000 description 2
- XAGFODPZIPBFFR-UHFFFAOYSA-N aluminium Chemical compound [Al] XAGFODPZIPBFFR-UHFFFAOYSA-N 0.000 description 2
- 229910052782 aluminium Inorganic materials 0.000 description 2
- 230000005540 biological transmission Effects 0.000 description 2
- 229940031416 bivalent vaccine Drugs 0.000 description 2
- 239000007853 buffer solution Substances 0.000 description 2
- 201000011510 cancer Diseases 0.000 description 2
- 150000001768 cations Chemical class 0.000 description 2
- 238000003776 cleavage reaction Methods 0.000 description 2
- 230000001276 controlling effect Effects 0.000 description 2
- 238000001816 cooling Methods 0.000 description 2
- 238000010586 diagram Methods 0.000 description 2
- 238000007865 diluting Methods 0.000 description 2
- 239000003085 diluting agent Substances 0.000 description 2
- 201000010099 disease Diseases 0.000 description 2
- 208000037265 diseases, disorders, signs and symptoms Diseases 0.000 description 2
- 238000001914 filtration Methods 0.000 description 2
- 238000010353 genetic engineering Methods 0.000 description 2
- 239000011521 glass Substances 0.000 description 2
- 108010020688 glycylhistidine Proteins 0.000 description 2
- 108010050848 glycylleucine Proteins 0.000 description 2
- 230000036541 health Effects 0.000 description 2
- 238000002649 immunization Methods 0.000 description 2
- 238000001727 in vivo Methods 0.000 description 2
- 108010057821 leucylproline Proteins 0.000 description 2
- 102000039446 nucleic acids Human genes 0.000 description 2
- 108020004707 nucleic acids Proteins 0.000 description 2
- 238000005457 optimization Methods 0.000 description 2
- 230000004481 post-translational protein modification Effects 0.000 description 2
- 229940023143 protein vaccine Drugs 0.000 description 2
- 230000001105 regulatory effect Effects 0.000 description 2
- 230000007017 scission Effects 0.000 description 2
- 239000011550 stock solution Substances 0.000 description 2
- 108010061238 threonyl-glycine Proteins 0.000 description 2
- 238000004627 transmission electron microscopy Methods 0.000 description 2
- 238000012795 verification Methods 0.000 description 2
- 238000001262 western blot Methods 0.000 description 2
- 101150028074 2 gene Proteins 0.000 description 1
- CXRCVCURMBFFOL-FXQIFTODSA-N Ala-Ala-Pro Chemical compound C[C@H](N)C(=O)N[C@@H](C)C(=O)N1CCC[C@H]1C(O)=O CXRCVCURMBFFOL-FXQIFTODSA-N 0.000 description 1
- UCIYCBSJBQGDGM-LPEHRKFASA-N Ala-Arg-Pro Chemical compound C[C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)N1CCC[C@@H]1C(=O)O)N UCIYCBSJBQGDGM-LPEHRKFASA-N 0.000 description 1
- ZEXDYVGDZJBRMO-ACZMJKKPSA-N Ala-Asn-Gln Chemical compound C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CCC(=O)N)C(=O)O)N ZEXDYVGDZJBRMO-ACZMJKKPSA-N 0.000 description 1
- XQGIRPGAVLFKBJ-CIUDSAMLSA-N Ala-Asn-Lys Chemical compound N[C@@H](C)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CCCCN)C(=O)O XQGIRPGAVLFKBJ-CIUDSAMLSA-N 0.000 description 1
- UQJUGHFKNKGHFQ-VZFHVOOUSA-N Ala-Cys-Thr Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CS)C(=O)N[C@@H]([C@@H](C)O)C(O)=O UQJUGHFKNKGHFQ-VZFHVOOUSA-N 0.000 description 1
- IFTVANMRTIHKML-WDSKDSINSA-N Ala-Gln-Gly Chemical compound C[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(=O)NCC(O)=O IFTVANMRTIHKML-WDSKDSINSA-N 0.000 description 1
- WQKAQKZRDIZYNV-VZFHVOOUSA-N Ala-Ser-Thr Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)O)C(O)=O WQKAQKZRDIZYNV-VZFHVOOUSA-N 0.000 description 1
- VHAQSYHSDKERBS-XPUUQOCRSA-N Ala-Val-Gly Chemical compound C[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)NCC(O)=O VHAQSYHSDKERBS-XPUUQOCRSA-N 0.000 description 1
- NLYYHIKRBRMAJV-AEJSXWLSSA-N Ala-Val-Pro Chemical compound C[C@@H](C(=O)N[C@@H](C(C)C)C(=O)N1CCC[C@@H]1C(=O)O)N NLYYHIKRBRMAJV-AEJSXWLSSA-N 0.000 description 1
- OMSKGWFGWCQFBD-KZVJFYERSA-N Ala-Val-Thr Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O OMSKGWFGWCQFBD-KZVJFYERSA-N 0.000 description 1
- 206010061424 Anal cancer Diseases 0.000 description 1
- 208000007860 Anus Neoplasms Diseases 0.000 description 1
- KWKQGHSSNHPGOW-BQBZGAKWSA-N Arg-Ala-Gly Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](C)C(=O)NCC(O)=O KWKQGHSSNHPGOW-BQBZGAKWSA-N 0.000 description 1
- UXJCMQFPDWCHKX-DCAQKATOSA-N Arg-Arg-Glu Chemical compound NC(N)=NCCC[C@H](N)C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CCC(O)=O)C(O)=O UXJCMQFPDWCHKX-DCAQKATOSA-N 0.000 description 1
- HJVGMOYJDDXLMI-AVGNSLFASA-N Arg-Arg-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](CCCNC(N)=N)NC(=O)[C@@H](N)CCCNC(N)=N HJVGMOYJDDXLMI-AVGNSLFASA-N 0.000 description 1
- XMZZGVGKGXRIGJ-JYJNAYRXSA-N Arg-Tyr-Val Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](C(C)C)C(O)=O XMZZGVGKGXRIGJ-JYJNAYRXSA-N 0.000 description 1
- ULRPXVNMIIYDDJ-ACZMJKKPSA-N Asn-Glu-Ala Chemical compound C[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)O)NC(=O)[C@H](CC(=O)N)N ULRPXVNMIIYDDJ-ACZMJKKPSA-N 0.000 description 1
- JWKDQOORUCYUIW-ZPFDUUQYSA-N Asn-Lys-Ile Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O JWKDQOORUCYUIW-ZPFDUUQYSA-N 0.000 description 1
- WXVGISRWSYGEDK-KKUMJFAQSA-N Asn-Lys-Phe Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)O)NC(=O)[C@H](CCCCN)NC(=O)[C@H](CC(=O)N)N WXVGISRWSYGEDK-KKUMJFAQSA-N 0.000 description 1
- OROMFUQQTSWUTI-IHRRRGAJSA-N Asn-Phe-Arg Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)O)NC(=O)[C@H](CC(=O)N)N OROMFUQQTSWUTI-IHRRRGAJSA-N 0.000 description 1
- PPCORQFLAZWUNO-QWRGUYRKSA-N Asn-Phe-Gly Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)NCC(=O)O)NC(=O)[C@H](CC(=O)N)N PPCORQFLAZWUNO-QWRGUYRKSA-N 0.000 description 1
- HPBNLFLSSQDFQW-WHFBIAKZSA-N Asn-Ser-Gly Chemical compound NC(=O)C[C@H](N)C(=O)N[C@@H](CO)C(=O)NCC(O)=O HPBNLFLSSQDFQW-WHFBIAKZSA-N 0.000 description 1
- VTYQAQFKMQTKQD-ACZMJKKPSA-N Asp-Ala-Gln Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](C)C(=O)N[C@@H](CCC(N)=O)C(O)=O VTYQAQFKMQTKQD-ACZMJKKPSA-N 0.000 description 1
- HOQGTAIGQSDCHR-SRVKXCTJSA-N Asp-Asn-Phe Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O HOQGTAIGQSDCHR-SRVKXCTJSA-N 0.000 description 1
- LKIYSIYBKYLKPU-BIIVOSGPSA-N Asp-Asp-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CC(=O)O)NC(=O)[C@H](CC(=O)O)N)C(=O)O LKIYSIYBKYLKPU-BIIVOSGPSA-N 0.000 description 1
- OEUQMKNNOWJREN-AVGNSLFASA-N Asp-Gln-Phe Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)N)NC(=O)[C@H](CC(=O)O)N OEUQMKNNOWJREN-AVGNSLFASA-N 0.000 description 1
- XLILXFRAKOYEJX-GUBZILKMSA-N Asp-Leu-Gln Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(N)=O)C(O)=O XLILXFRAKOYEJX-GUBZILKMSA-N 0.000 description 1
- HXVILZUZXFLVEN-DCAQKATOSA-N Asp-Met-Leu Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CC(C)C)C(O)=O HXVILZUZXFLVEN-DCAQKATOSA-N 0.000 description 1
- DJCAHYVLMSRBFR-QXEWZRGKSA-N Asp-Met-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@H](CCSC)NC(=O)[C@@H](N)CC(O)=O DJCAHYVLMSRBFR-QXEWZRGKSA-N 0.000 description 1
- USNJAPJZSGTTPX-XVSYOHENSA-N Asp-Phe-Thr Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H]([C@@H](C)O)C(O)=O USNJAPJZSGTTPX-XVSYOHENSA-N 0.000 description 1
- JJQGZGOEDSSHTE-FOHZUACHSA-N Asp-Thr-Gly Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)NCC(O)=O JJQGZGOEDSSHTE-FOHZUACHSA-N 0.000 description 1
- GCACQYDBDHRVGE-LKXGYXEUSA-N Asp-Thr-Ser Chemical compound OC[C@@H](C(O)=O)NC(=O)[C@H]([C@H](O)C)NC(=O)[C@@H](N)CC(O)=O GCACQYDBDHRVGE-LKXGYXEUSA-N 0.000 description 1
- RSMZEHCMIOKNMW-GSSVUCPTSA-N Asp-Thr-Thr Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O RSMZEHCMIOKNMW-GSSVUCPTSA-N 0.000 description 1
- 108010083946 Asp-Tyr-Leu-Lys Proteins 0.000 description 1
- 206010005003 Bladder cancer Diseases 0.000 description 1
- 206010006187 Breast cancer Diseases 0.000 description 1
- 208000026310 Breast neoplasm Diseases 0.000 description 1
- 241000283707 Capra Species 0.000 description 1
- 208000005623 Carcinogenesis Diseases 0.000 description 1
- 108020004638 Circular DNA Proteins 0.000 description 1
- QLCPDGRAEJSYQM-LPEHRKFASA-N Cys-Arg-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CCCN=C(N)N)NC(=O)[C@H](CS)N)C(=O)O QLCPDGRAEJSYQM-LPEHRKFASA-N 0.000 description 1
- VKAWJBQTFCBHQY-GUBZILKMSA-N Cys-Gln-Lys Chemical compound C(CCN)C[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)N)NC(=O)[C@H](CS)N VKAWJBQTFCBHQY-GUBZILKMSA-N 0.000 description 1
- VPQZSNQICFCCSO-BJDJZHNGSA-N Cys-Leu-Ile Chemical compound [H]N[C@@H](CS)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O VPQZSNQICFCCSO-BJDJZHNGSA-N 0.000 description 1
- CMYVIUWVYHOLRD-ZLUOBGJFSA-N Cys-Ser-Ala Chemical compound [H]N[C@@H](CS)C(=O)N[C@@H](CO)C(=O)N[C@@H](C)C(O)=O CMYVIUWVYHOLRD-ZLUOBGJFSA-N 0.000 description 1
- YNJBLTDKTMKEET-ZLUOBGJFSA-N Cys-Ser-Ser Chemical compound SC[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@@H](CO)C(O)=O YNJBLTDKTMKEET-ZLUOBGJFSA-N 0.000 description 1
- 102000053602 DNA Human genes 0.000 description 1
- 241000450599 DNA viruses Species 0.000 description 1
- 101100364969 Dictyostelium discoideum scai gene Proteins 0.000 description 1
- 206010014733 Endometrial cancer Diseases 0.000 description 1
- 206010014759 Endometrial neoplasm Diseases 0.000 description 1
- 241000588724 Escherichia coli Species 0.000 description 1
- 208000000461 Esophageal Neoplasms Diseases 0.000 description 1
- WQWMZOIPXWSZNE-WDSKDSINSA-N Gln-Asp-Gly Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(O)=O)C(=O)NCC(O)=O WQWMZOIPXWSZNE-WDSKDSINSA-N 0.000 description 1
- WLODHVXYKYHLJD-ACZMJKKPSA-N Gln-Asp-Ser Chemical compound C(CC(=O)N)[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CO)C(=O)O)N WLODHVXYKYHLJD-ACZMJKKPSA-N 0.000 description 1
- HHQCBFGKQDMWSP-GUBZILKMSA-N Gln-Leu-Cys Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CS)C(=O)O)NC(=O)[C@H](CCC(=O)N)N HHQCBFGKQDMWSP-GUBZILKMSA-N 0.000 description 1
- DFRYZTUPVZNRLG-KKUMJFAQSA-N Gln-Met-Phe Chemical compound CSCC[C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)O)NC(=O)[C@H](CCC(=O)N)N DFRYZTUPVZNRLG-KKUMJFAQSA-N 0.000 description 1
- SBCYJMOOHUDWDA-NUMRIWBASA-N Glu-Asp-Thr Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O SBCYJMOOHUDWDA-NUMRIWBASA-N 0.000 description 1
- XKPOCESCRTVRPL-KBIXCLLPSA-N Glu-Cys-Ile Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CS)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O XKPOCESCRTVRPL-KBIXCLLPSA-N 0.000 description 1
- ZGKXAUIVGIBISK-SZMVWBNQSA-N Glu-His-Trp Chemical compound N[C@@H](CCC(O)=O)C(=O)N[C@@H](Cc1c[nH]cn1)C(=O)N[C@@H](Cc1c[nH]c2ccccc12)C(O)=O ZGKXAUIVGIBISK-SZMVWBNQSA-N 0.000 description 1
- MWMJCGBSIORNCD-AVGNSLFASA-N Glu-Leu-Leu Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(C)C)C(O)=O MWMJCGBSIORNCD-AVGNSLFASA-N 0.000 description 1
- DLISPGXMKZTWQG-IFFSRLJSSA-N Glu-Thr-Val Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C(C)C)C(O)=O DLISPGXMKZTWQG-IFFSRLJSSA-N 0.000 description 1
- HBMRTXJZQDVRFT-DZKIICNBSA-N Glu-Tyr-Val Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](C(C)C)C(O)=O HBMRTXJZQDVRFT-DZKIICNBSA-N 0.000 description 1
- PABFFPWEJMEVEC-JGVFFNPUSA-N Gly-Gln-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CCC(=O)N)NC(=O)CN)C(=O)O PABFFPWEJMEVEC-JGVFFNPUSA-N 0.000 description 1
- BIRKKBCSAIHDDF-WDSKDSINSA-N Gly-Glu-Cys Chemical compound NCC(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CS)C(O)=O BIRKKBCSAIHDDF-WDSKDSINSA-N 0.000 description 1
- SOEATRRYCIPEHA-BQBZGAKWSA-N Gly-Glu-Glu Chemical compound [H]NCC(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O SOEATRRYCIPEHA-BQBZGAKWSA-N 0.000 description 1
- YFGONBOFGGWKKY-VHSXEESVSA-N Gly-His-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CC2=CN=CN2)NC(=O)CN)C(=O)O YFGONBOFGGWKKY-VHSXEESVSA-N 0.000 description 1
- HAXARWKYFIIHKD-ZKWXMUAHSA-N Gly-Ile-Ser Chemical compound NCC(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CO)C(O)=O HAXARWKYFIIHKD-ZKWXMUAHSA-N 0.000 description 1
- LLZXNUUIBOALNY-QWRGUYRKSA-N Gly-Leu-Lys Chemical compound NCC(=O)N[C@@H](CC(C)C)C(=O)N[C@H](C(O)=O)CCCCN LLZXNUUIBOALNY-QWRGUYRKSA-N 0.000 description 1
- CLNSYANKYVMZNM-UWVGGRQHSA-N Gly-Lys-Arg Chemical compound NCCCC[C@H](NC(=O)CN)C(=O)N[C@H](C(O)=O)CCCN=C(N)N CLNSYANKYVMZNM-UWVGGRQHSA-N 0.000 description 1
- PDUHNKAFQXQNLH-ZETCQYMHSA-N Gly-Lys-Gly Chemical compound NCCCC[C@H](NC(=O)CN)C(=O)NCC(O)=O PDUHNKAFQXQNLH-ZETCQYMHSA-N 0.000 description 1
- IEGFSKKANYKBDU-QWHCGFSZSA-N Gly-Phe-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CC2=CC=CC=C2)NC(=O)CN)C(=O)O IEGFSKKANYKBDU-QWHCGFSZSA-N 0.000 description 1
- YABRDIBSPZONIY-BQBZGAKWSA-N Gly-Ser-Met Chemical compound [H]NCC(=O)N[C@@H](CO)C(=O)N[C@@H](CCSC)C(O)=O YABRDIBSPZONIY-BQBZGAKWSA-N 0.000 description 1
- TVTZEOHWHUVYCG-KYNKHSRBSA-N Gly-Thr-Thr Chemical compound [H]NCC(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O TVTZEOHWHUVYCG-KYNKHSRBSA-N 0.000 description 1
- SYOJVRNQCXYEOV-XVKPBYJWSA-N Gly-Val-Glu Chemical compound [H]NCC(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O SYOJVRNQCXYEOV-XVKPBYJWSA-N 0.000 description 1
- JBCLFWXMTIKCCB-UHFFFAOYSA-N H-Gly-Phe-OH Natural products NCC(=O)NC(C(O)=O)CC1=CC=CC=C1 JBCLFWXMTIKCCB-UHFFFAOYSA-N 0.000 description 1
- 241000238631 Hexapoda Species 0.000 description 1
- DCRODRAURLJOFY-XPUUQOCRSA-N His-Ala-Gly Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](C)C(=O)NCC(O)=O DCRODRAURLJOFY-XPUUQOCRSA-N 0.000 description 1
- AVQOSMRPITVTRB-CIUDSAMLSA-N His-Asn-Asn Chemical compound C1=C(NC=N1)C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CC(=O)N)C(=O)O)N AVQOSMRPITVTRB-CIUDSAMLSA-N 0.000 description 1
- 208000022361 Human papillomavirus infectious disease Diseases 0.000 description 1
- AWTDTFXPVCTHAK-BJDJZHNGSA-N Ile-Cys-Lys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CS)C(=O)N[C@@H](CCCCN)C(=O)O)N AWTDTFXPVCTHAK-BJDJZHNGSA-N 0.000 description 1
- ZIPOVLBRVPXWJQ-SPOWBLRKSA-N Ile-Cys-Trp Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CS)C(=O)N[C@@H](CC1=CNC2=CC=CC=C21)C(=O)O)N ZIPOVLBRVPXWJQ-SPOWBLRKSA-N 0.000 description 1
- GVNNAHIRSDRIII-AJNGGQMLSA-N Ile-Lys-Lys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCCCN)C(=O)O)N GVNNAHIRSDRIII-AJNGGQMLSA-N 0.000 description 1
- NXRNRBOKDBIVKQ-CXTHYWKRSA-N Ile-Tyr-Tyr Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)N[C@@H](CC2=CC=C(C=C2)O)C(=O)O)N NXRNRBOKDBIVKQ-CXTHYWKRSA-N 0.000 description 1
- 241000235058 Komagataella pastoris Species 0.000 description 1
- FADYJNXDPBKVCA-UHFFFAOYSA-N L-Phenylalanyl-L-lysin Natural products NCCCCC(C(O)=O)NC(=O)C(N)CC1=CC=CC=C1 FADYJNXDPBKVCA-UHFFFAOYSA-N 0.000 description 1
- RCFDOSNHHZGBOY-UHFFFAOYSA-N L-isoleucyl-L-alanine Natural products CCC(C)C(N)C(=O)NC(C)C(O)=O RCFDOSNHHZGBOY-UHFFFAOYSA-N 0.000 description 1
- LHSGPCFBGJHPCY-UHFFFAOYSA-N L-leucine-L-tyrosine Natural products CC(C)CC(N)C(=O)NC(C(O)=O)CC1=CC=C(O)C=C1 LHSGPCFBGJHPCY-UHFFFAOYSA-N 0.000 description 1
- FJUKMPUELVROGK-IHRRRGAJSA-N Leu-Arg-His Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)N FJUKMPUELVROGK-IHRRRGAJSA-N 0.000 description 1
- VQPPIMUZCZCOIL-GUBZILKMSA-N Leu-Gln-Ala Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](C)C(O)=O VQPPIMUZCZCOIL-GUBZILKMSA-N 0.000 description 1
- VPKIQULSKFVCSM-SRVKXCTJSA-N Leu-Gln-Arg Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O VPKIQULSKFVCSM-SRVKXCTJSA-N 0.000 description 1
- KUEVMUXNILMJTK-JYJNAYRXSA-N Leu-Gln-Tyr Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 KUEVMUXNILMJTK-JYJNAYRXSA-N 0.000 description 1
- HFBCHNRFRYLZNV-GUBZILKMSA-N Leu-Glu-Asp Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(O)=O)C(O)=O HFBCHNRFRYLZNV-GUBZILKMSA-N 0.000 description 1
- POZULHZYLPGXMR-ONGXEEELSA-N Leu-Gly-Val Chemical compound CC(C)C[C@H](N)C(=O)NCC(=O)N[C@@H](C(C)C)C(O)=O POZULHZYLPGXMR-ONGXEEELSA-N 0.000 description 1
- IAJFFZORSWOZPQ-SRVKXCTJSA-N Leu-Leu-Asn Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(N)=O)C(O)=O IAJFFZORSWOZPQ-SRVKXCTJSA-N 0.000 description 1
- WXUOJXIGOPMDJM-SRVKXCTJSA-N Leu-Lys-Asn Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(N)=O)C(O)=O WXUOJXIGOPMDJM-SRVKXCTJSA-N 0.000 description 1
- HVHRPWQEQHIQJF-AVGNSLFASA-N Leu-Lys-Glu Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCC(O)=O)C(O)=O HVHRPWQEQHIQJF-AVGNSLFASA-N 0.000 description 1
- BIZNDKMFQHDOIE-KKUMJFAQSA-N Leu-Phe-Asn Chemical compound CC(C)C[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](CC(N)=O)C(O)=O)CC1=CC=CC=C1 BIZNDKMFQHDOIE-KKUMJFAQSA-N 0.000 description 1
- FYPWFNKQVVEELI-ULQDDVLXSA-N Leu-Phe-Val Chemical compound CC(C)C[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](C(C)C)C(O)=O)CC1=CC=CC=C1 FYPWFNKQVVEELI-ULQDDVLXSA-N 0.000 description 1
- DPURXCQCHSQPAN-AVGNSLFASA-N Leu-Pro-Pro Chemical compound CC(C)C[C@H](N)C(=O)N1CCC[C@H]1C(=O)N1[C@H](C(O)=O)CCC1 DPURXCQCHSQPAN-AVGNSLFASA-N 0.000 description 1
- ZJZNLRVCZWUONM-JXUBOQSCSA-N Leu-Thr-Ala Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C)C(O)=O ZJZNLRVCZWUONM-JXUBOQSCSA-N 0.000 description 1
- KLSUAWUZBMAZCL-RHYQMDGZSA-N Leu-Thr-Pro Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N1CCC[C@H]1C(O)=O KLSUAWUZBMAZCL-RHYQMDGZSA-N 0.000 description 1
- NTXYXFDMIHXTHE-WDSOQIARSA-N Leu-Val-Trp Chemical compound C1=CC=C2C(C[C@H](NC(=O)[C@H](C(C)C)NC(=O)[C@@H](N)CC(C)C)C(O)=O)=CNC2=C1 NTXYXFDMIHXTHE-WDSOQIARSA-N 0.000 description 1
- 206010058467 Lung neoplasm malignant Diseases 0.000 description 1
- ODUQLUADRKMHOZ-JYJNAYRXSA-N Lys-Glu-Tyr Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)O)NC(=O)[C@H](CCCCN)N)O ODUQLUADRKMHOZ-JYJNAYRXSA-N 0.000 description 1
- NCZIQZYZPUPMKY-PPCPHDFISA-N Lys-Ile-Thr Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H]([C@@H](C)O)C(O)=O NCZIQZYZPUPMKY-PPCPHDFISA-N 0.000 description 1
- MUXNCRWTWBMNHX-SRVKXCTJSA-N Lys-Leu-Asp Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(O)=O)C(O)=O MUXNCRWTWBMNHX-SRVKXCTJSA-N 0.000 description 1
- LNMKRJJLEFASGA-BZSNNMDCSA-N Lys-Phe-Leu Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(C)C)C(O)=O LNMKRJJLEFASGA-BZSNNMDCSA-N 0.000 description 1
- LUAJJLPHUXPQLH-KKUMJFAQSA-N Lys-Phe-Ser Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CO)C(=O)O)NC(=O)[C@H](CCCCN)N LUAJJLPHUXPQLH-KKUMJFAQSA-N 0.000 description 1
- LUTDBHBIHHREDC-IHRRRGAJSA-N Lys-Pro-Lys Chemical compound NCCCC[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCCCN)C(O)=O LUTDBHBIHHREDC-IHRRRGAJSA-N 0.000 description 1
- IKXQOBUBZSOWDY-AVGNSLFASA-N Lys-Val-Val Chemical compound CC(C)[C@@H](C(=O)N[C@@H](C(C)C)C(=O)O)NC(=O)[C@H](CCCCN)N IKXQOBUBZSOWDY-AVGNSLFASA-N 0.000 description 1
- 101710135729 Major capsid protein L1 Proteins 0.000 description 1
- 208000032271 Malignant tumor of penis Diseases 0.000 description 1
- HLZORBMOISUNIV-DCAQKATOSA-N Met-Ser-Leu Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@H](C(O)=O)CC(C)C HLZORBMOISUNIV-DCAQKATOSA-N 0.000 description 1
- GGXZOTSDJJTDGB-GUBZILKMSA-N Met-Ser-Val Chemical compound [H]N[C@@H](CCSC)C(=O)N[C@@H](CO)C(=O)N[C@@H](C(C)C)C(O)=O GGXZOTSDJJTDGB-GUBZILKMSA-N 0.000 description 1
- LBSWWNKMVPAXOI-GUBZILKMSA-N Met-Val-Ser Chemical compound CSCC[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CO)C(O)=O LBSWWNKMVPAXOI-GUBZILKMSA-N 0.000 description 1
- 101710163801 Minor capsid protein L2 Proteins 0.000 description 1
- 101100364971 Mus musculus Scai gene Proteins 0.000 description 1
- BQVUABVGYYSDCJ-UHFFFAOYSA-N Nalpha-L-Leucyl-L-tryptophan Natural products C1=CC=C2C(CC(NC(=O)C(N)CC(C)C)C(O)=O)=CNC2=C1 BQVUABVGYYSDCJ-UHFFFAOYSA-N 0.000 description 1
- 206010030155 Oesophageal carcinoma Diseases 0.000 description 1
- 108700026244 Open Reading Frames Proteins 0.000 description 1
- 206010033128 Ovarian cancer Diseases 0.000 description 1
- 206010061535 Ovarian neoplasm Diseases 0.000 description 1
- 241001631646 Papillomaviridae Species 0.000 description 1
- 208000009608 Papillomavirus Infections Diseases 0.000 description 1
- 208000002471 Penile Neoplasms Diseases 0.000 description 1
- 206010034299 Penile cancer Diseases 0.000 description 1
- 208000037581 Persistent Infection Diseases 0.000 description 1
- KAHUBGWSIQNZQQ-KKUMJFAQSA-N Phe-Asn-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](CC(N)=O)NC(=O)[C@@H](N)CC1=CC=CC=C1 KAHUBGWSIQNZQQ-KKUMJFAQSA-N 0.000 description 1
- JJHVFCUWLSKADD-ONGXEEELSA-N Phe-Gly-Ala Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)NCC(=O)N[C@@H](C)C(O)=O JJHVFCUWLSKADD-ONGXEEELSA-N 0.000 description 1
- BWTKUQPNOMMKMA-FIRPJDEBSA-N Phe-Ile-Phe Chemical compound C([C@H](N)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC=1C=CC=CC=1)C(O)=O)C1=CC=CC=C1 BWTKUQPNOMMKMA-FIRPJDEBSA-N 0.000 description 1
- DSXPMZMSJHOKKK-HJOGWXRNSA-N Phe-Phe-Tyr Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O DSXPMZMSJHOKKK-HJOGWXRNSA-N 0.000 description 1
- BAONJAHBAUDJKA-BZSNNMDCSA-N Phe-Tyr-Asp Chemical compound C([C@H](N)C(=O)N[C@@H](CC=1C=CC(O)=CC=1)C(=O)N[C@@H](CC(O)=O)C(O)=O)C1=CC=CC=C1 BAONJAHBAUDJKA-BZSNNMDCSA-N 0.000 description 1
- 241000235648 Pichia Species 0.000 description 1
- XQLBWXHVZVBNJM-FXQIFTODSA-N Pro-Ala-Ser Chemical compound OC[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)[C@@H]1CCCN1 XQLBWXHVZVBNJM-FXQIFTODSA-N 0.000 description 1
- YFNOUBWUIIJQHF-LPEHRKFASA-N Pro-Asp-Pro Chemical compound C1C[C@H](NC1)C(=O)N[C@@H](CC(=O)O)C(=O)N2CCC[C@@H]2C(=O)O YFNOUBWUIIJQHF-LPEHRKFASA-N 0.000 description 1
- XYHMFGGWNOFUOU-QXEWZRGKSA-N Pro-Ile-Gly Chemical compound OC(=O)CNC(=O)[C@H]([C@@H](C)CC)NC(=O)[C@@H]1CCCN1 XYHMFGGWNOFUOU-QXEWZRGKSA-N 0.000 description 1
- CLJLVCYFABNTHP-DCAQKATOSA-N Pro-Leu-Asp Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(O)=O)C(O)=O CLJLVCYFABNTHP-DCAQKATOSA-N 0.000 description 1
- FXGIMYRVJJEIIM-UWVGGRQHSA-N Pro-Leu-Gly Chemical compound OC(=O)CNC(=O)[C@H](CC(C)C)NC(=O)[C@@H]1CCCN1 FXGIMYRVJJEIIM-UWVGGRQHSA-N 0.000 description 1
- CGSOWZUPLOKYOR-AVGNSLFASA-N Pro-Pro-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1C(=O)[C@H]1NCCC1 CGSOWZUPLOKYOR-AVGNSLFASA-N 0.000 description 1
- FDMKYQQYJKYCLV-GUBZILKMSA-N Pro-Pro-Ser Chemical compound OC[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1C(=O)[C@H]1NCCC1 FDMKYQQYJKYCLV-GUBZILKMSA-N 0.000 description 1
- POQFNPILEQEODH-FXQIFTODSA-N Pro-Ser-Ala Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CO)C(=O)N[C@@H](C)C(O)=O POQFNPILEQEODH-FXQIFTODSA-N 0.000 description 1
- ITUDDXVFGFEKPD-NAKRPEOUSA-N Pro-Ser-Ile Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O ITUDDXVFGFEKPD-NAKRPEOUSA-N 0.000 description 1
- KWMZPPWYBVZIER-XGEHTFHBSA-N Pro-Ser-Thr Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)O)C(O)=O KWMZPPWYBVZIER-XGEHTFHBSA-N 0.000 description 1
- LZHHZYDPMZEMRX-STQMWFEESA-N Pro-Tyr-Gly Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)NCC(O)=O LZHHZYDPMZEMRX-STQMWFEESA-N 0.000 description 1
- QMABBZHZMDXHKU-FKBYEOEOSA-N Pro-Tyr-Trp Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(O)=O QMABBZHZMDXHKU-FKBYEOEOSA-N 0.000 description 1
- 206010060862 Prostate cancer Diseases 0.000 description 1
- 208000000236 Prostatic Neoplasms Diseases 0.000 description 1
- BCKYYTVFBXHPOG-ACZMJKKPSA-N Ser-Asn-Gln Chemical compound C(CC(=O)N)[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)[C@H](CO)N BCKYYTVFBXHPOG-ACZMJKKPSA-N 0.000 description 1
- VGNYHOBZJKWRGI-CIUDSAMLSA-N Ser-Asn-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](CC(N)=O)NC(=O)[C@@H](N)CO VGNYHOBZJKWRGI-CIUDSAMLSA-N 0.000 description 1
- MMAPOBOTRUVNKJ-ZLUOBGJFSA-N Ser-Asp-Ser Chemical compound C([C@@H](C(=O)N[C@@H](CO)C(=O)O)NC(=O)[C@H](CO)N)C(=O)O MMAPOBOTRUVNKJ-ZLUOBGJFSA-N 0.000 description 1
- SWSRFJZZMNLMLY-ZKWXMUAHSA-N Ser-Asp-Val Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](C(C)C)C(O)=O SWSRFJZZMNLMLY-ZKWXMUAHSA-N 0.000 description 1
- CRJZZXMAADSBBQ-SRVKXCTJSA-N Ser-Lys-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](CCCCN)NC(=O)[C@@H](N)CO CRJZZXMAADSBBQ-SRVKXCTJSA-N 0.000 description 1
- AMRRYKHCILPAKD-FXQIFTODSA-N Ser-Met-Asn Chemical compound CSCC[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@H](CO)N AMRRYKHCILPAKD-FXQIFTODSA-N 0.000 description 1
- KJKQUQXDEKMPDK-FXQIFTODSA-N Ser-Met-Asp Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CC(O)=O)C(O)=O KJKQUQXDEKMPDK-FXQIFTODSA-N 0.000 description 1
- KQNDIKOYWZTZIX-FXQIFTODSA-N Ser-Ser-Arg Chemical compound OC[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@H](C(O)=O)CCCNC(N)=N KQNDIKOYWZTZIX-FXQIFTODSA-N 0.000 description 1
- PYTKULIABVRXSC-BWBBJGPYSA-N Ser-Ser-Thr Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)O)C(O)=O PYTKULIABVRXSC-BWBBJGPYSA-N 0.000 description 1
- SQHKXWODKJDZRC-LKXGYXEUSA-N Ser-Thr-Asn Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(N)=O)C(O)=O SQHKXWODKJDZRC-LKXGYXEUSA-N 0.000 description 1
- WUXCHQZLUHBSDJ-LKXGYXEUSA-N Ser-Thr-Asp Chemical compound OC[C@H](N)C(=O)N[C@@H]([C@H](O)C)C(=O)N[C@@H](CC(O)=O)C(O)=O WUXCHQZLUHBSDJ-LKXGYXEUSA-N 0.000 description 1
- JGUWRQWULDWNCM-FXQIFTODSA-N Ser-Val-Ser Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CO)C(O)=O JGUWRQWULDWNCM-FXQIFTODSA-N 0.000 description 1
- WFUAUEQXPVNAEF-ZJDVBMNYSA-N Thr-Arg-Thr Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@H](C(=O)N[C@@H]([C@@H](C)O)C(O)=O)CCCN=C(N)N WFUAUEQXPVNAEF-ZJDVBMNYSA-N 0.000 description 1
- MFEBUIFJVPNZLO-OLHMAJIHSA-N Thr-Asp-Asn Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O MFEBUIFJVPNZLO-OLHMAJIHSA-N 0.000 description 1
- RKDFEMGVMMYYNG-WDCWCFNPSA-N Thr-Gln-Leu Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(C)C)C(O)=O RKDFEMGVMMYYNG-WDCWCFNPSA-N 0.000 description 1
- CQNFRKAKGDSJFR-NUMRIWBASA-N Thr-Glu-Asn Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H](CC(=O)N)C(=O)O)N)O CQNFRKAKGDSJFR-NUMRIWBASA-N 0.000 description 1
- HOVLHEKTGVIKAP-WDCWCFNPSA-N Thr-Leu-Gln Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(N)=O)C(O)=O HOVLHEKTGVIKAP-WDCWCFNPSA-N 0.000 description 1
- DNCUODYZAMHLCV-XGEHTFHBSA-N Thr-Pro-Cys Chemical compound C[C@H]([C@@H](C(=O)N1CCC[C@H]1C(=O)N[C@@H](CS)C(=O)O)N)O DNCUODYZAMHLCV-XGEHTFHBSA-N 0.000 description 1
- KERCOYANYUPLHJ-XGEHTFHBSA-N Thr-Pro-Ser Chemical compound C[C@@H](O)[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CO)C(O)=O KERCOYANYUPLHJ-XGEHTFHBSA-N 0.000 description 1
- DOBIBIXIHJKVJF-XKBZYTNZSA-N Thr-Ser-Gln Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@H](C(O)=O)CCC(N)=O DOBIBIXIHJKVJF-XKBZYTNZSA-N 0.000 description 1
- AKHDFZHUPGVFEJ-YEPSODPASA-N Thr-Val-Gly Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C(C)C)C(=O)NCC(O)=O AKHDFZHUPGVFEJ-YEPSODPASA-N 0.000 description 1
- BKVICMPZWRNWOC-RHYQMDGZSA-N Thr-Val-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](C(C)C)NC(=O)[C@@H](N)[C@@H](C)O BKVICMPZWRNWOC-RHYQMDGZSA-N 0.000 description 1
- BPGDJSUFQKWUBK-KJEVXHAQSA-N Thr-Val-Tyr Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 BPGDJSUFQKWUBK-KJEVXHAQSA-N 0.000 description 1
- VYVBSMCZNHOZGD-RCWTZXSCSA-N Thr-Val-Val Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](C(C)C)C(O)=O VYVBSMCZNHOZGD-RCWTZXSCSA-N 0.000 description 1
- TWJDQTTXXZDJKV-BPUTZDHNSA-N Trp-Arg-Ser Chemical compound [H]N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CO)C(O)=O TWJDQTTXXZDJKV-BPUTZDHNSA-N 0.000 description 1
- DVWAIHZOPSYMSJ-ZVZYQTTQSA-N Trp-Glu-Val Chemical compound C1=CC=C2C(C[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](C(C)C)C(O)=O)=CNC2=C1 DVWAIHZOPSYMSJ-ZVZYQTTQSA-N 0.000 description 1
- DZKFGCNKEVMXFA-JUKXBJQTSA-N Tyr-Ile-His Chemical compound CC[C@H](C)[C@H](NC(=O)[C@@H](N)Cc1ccc(O)cc1)C(=O)N[C@@H](Cc1cnc[nH]1)C(O)=O DZKFGCNKEVMXFA-JUKXBJQTSA-N 0.000 description 1
- WSFXJLFSJSXGMQ-MGHWNKPDSA-N Tyr-Ile-Lys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CC1=CC=C(C=C1)O)N WSFXJLFSJSXGMQ-MGHWNKPDSA-N 0.000 description 1
- GITNQBVCEQBDQC-KKUMJFAQSA-N Tyr-Lys-Asn Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(N)=O)C(O)=O GITNQBVCEQBDQC-KKUMJFAQSA-N 0.000 description 1
- BYAKMYBZADCNMN-JYJNAYRXSA-N Tyr-Lys-Gln Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCC(N)=O)C(O)=O BYAKMYBZADCNMN-JYJNAYRXSA-N 0.000 description 1
- WURLIFOWSMBUAR-SLFFLAALSA-N Tyr-Phe-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CC2=CC=CC=C2)NC(=O)[C@H](CC3=CC=C(C=C3)O)N)C(=O)O WURLIFOWSMBUAR-SLFFLAALSA-N 0.000 description 1
- XJPXTYLVMUZGNW-IHRRRGAJSA-N Tyr-Pro-Asp Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC(O)=O)C(O)=O XJPXTYLVMUZGNW-IHRRRGAJSA-N 0.000 description 1
- LDKDSFQSEUOCOO-RPTUDFQQSA-N Tyr-Thr-Phe Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O LDKDSFQSEUOCOO-RPTUDFQQSA-N 0.000 description 1
- MWUYSCVVPVITMW-IGNZVWTISA-N Tyr-Tyr-Ala Chemical compound C([C@@H](C(=O)N[C@@H](C)C(O)=O)NC(=O)[C@@H](N)CC=1C=CC(O)=CC=1)C1=CC=C(O)C=C1 MWUYSCVVPVITMW-IGNZVWTISA-N 0.000 description 1
- NWEGIYMHTZXVBP-JSGCOSHPSA-N Tyr-Val-Gly Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](C(C)C)C(=O)NCC(O)=O NWEGIYMHTZXVBP-JSGCOSHPSA-N 0.000 description 1
- 208000007097 Urinary Bladder Neoplasms Diseases 0.000 description 1
- HNWQUBBOBKSFQV-AVGNSLFASA-N Val-Arg-His Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)N HNWQUBBOBKSFQV-AVGNSLFASA-N 0.000 description 1
- JTWIMNMUYLQNPI-WPRPVWTQSA-N Val-Gly-Arg Chemical compound CC(C)[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CCCNC(N)=N JTWIMNMUYLQNPI-WPRPVWTQSA-N 0.000 description 1
- RWOGENDAOGMHLX-DCAQKATOSA-N Val-Lys-Ala Chemical compound C[C@@H](C(=O)O)NC(=O)[C@H](CCCCN)NC(=O)[C@H](C(C)C)N RWOGENDAOGMHLX-DCAQKATOSA-N 0.000 description 1
- ZRSZTKTVPNSUNA-IHRRRGAJSA-N Val-Lys-Leu Chemical compound CC(C)C[C@H](NC(=O)[C@H](CCCCN)NC(=O)[C@@H](N)C(C)C)C(O)=O ZRSZTKTVPNSUNA-IHRRRGAJSA-N 0.000 description 1
- JAKHAONCJJZVHT-DCAQKATOSA-N Val-Lys-Ser Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CO)C(=O)O)N JAKHAONCJJZVHT-DCAQKATOSA-N 0.000 description 1
- QPPZEDOTPZOSEC-RCWTZXSCSA-N Val-Met-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CCSC)NC(=O)[C@H](C(C)C)N)O QPPZEDOTPZOSEC-RCWTZXSCSA-N 0.000 description 1
- VNGKMNPAENRGDC-JYJNAYRXSA-N Val-Phe-Arg Chemical compound NC(N)=NCCC[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@@H](N)C(C)C)CC1=CC=CC=C1 VNGKMNPAENRGDC-JYJNAYRXSA-N 0.000 description 1
- UGFMVXRXULGLNO-XPUUQOCRSA-N Val-Ser-Gly Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CO)C(=O)NCC(O)=O UGFMVXRXULGLNO-XPUUQOCRSA-N 0.000 description 1
- PZTZYZUTCPZWJH-FXQIFTODSA-N Val-Ser-Ser Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CO)C(=O)N[C@@H](CO)C(=O)O)N PZTZYZUTCPZWJH-FXQIFTODSA-N 0.000 description 1
- OFTXTCGQJXTNQS-XGEHTFHBSA-N Val-Thr-Ser Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CO)C(=O)O)NC(=O)[C@H](C(C)C)N)O OFTXTCGQJXTNQS-XGEHTFHBSA-N 0.000 description 1
- 108010052104 Viral Regulatory and Accessory Proteins Proteins 0.000 description 1
- 241000700605 Viruses Species 0.000 description 1
- 206010047741 Vulval cancer Diseases 0.000 description 1
- 208000004354 Vulvar Neoplasms Diseases 0.000 description 1
- 108010084455 Zeocin Proteins 0.000 description 1
- 239000002253 acid Substances 0.000 description 1
- 108010087924 alanylproline Proteins 0.000 description 1
- ILRRQNADMUWWFW-UHFFFAOYSA-K aluminium phosphate Chemical compound O1[Al]2OP1(=O)O2 ILRRQNADMUWWFW-UHFFFAOYSA-K 0.000 description 1
- 229910000147 aluminium phosphate Inorganic materials 0.000 description 1
- 238000004458 analytical method Methods 0.000 description 1
- 230000000890 antigenic effect Effects 0.000 description 1
- 201000011165 anus cancer Diseases 0.000 description 1
- 108010029539 arginyl-prolyl-proline Proteins 0.000 description 1
- 108010077245 asparaginyl-proline Proteins 0.000 description 1
- 108010040443 aspartyl-aspartic acid Proteins 0.000 description 1
- 108010093581 aspartyl-proline Proteins 0.000 description 1
- 108010038633 aspartylglutamate Proteins 0.000 description 1
- 108010068265 aspartyltyrosine Proteins 0.000 description 1
- 238000003556 assay Methods 0.000 description 1
- 239000011324 bead Substances 0.000 description 1
- 230000009286 beneficial effect Effects 0.000 description 1
- 239000003124 biologic agent Substances 0.000 description 1
- 229960000074 biopharmaceutical Drugs 0.000 description 1
- 230000015572 biosynthetic process Effects 0.000 description 1
- 230000000903 blocking effect Effects 0.000 description 1
- 210000004369 blood Anatomy 0.000 description 1
- 239000008280 blood Substances 0.000 description 1
- PASHVRUKOFIRIK-UHFFFAOYSA-L calcium sulfate dihydrate Chemical compound O.O.[Ca+2].[O-]S([O-])(=O)=O PASHVRUKOFIRIK-UHFFFAOYSA-L 0.000 description 1
- 230000036952 cancer formation Effects 0.000 description 1
- 210000000234 capsid Anatomy 0.000 description 1
- 231100000504 carcinogenesis Toxicity 0.000 description 1
- 230000007969 cellular immunity Effects 0.000 description 1
- 238000005119 centrifugation Methods 0.000 description 1
- 239000011248 coating agent Substances 0.000 description 1
- 238000000576 coating method Methods 0.000 description 1
- 201000002758 colorectal adenoma Diseases 0.000 description 1
- 238000004440 column chromatography Methods 0.000 description 1
- 230000000052 comparative effect Effects 0.000 description 1
- 108091036078 conserved sequence Proteins 0.000 description 1
- 108010060199 cysteinylproline Proteins 0.000 description 1
- 238000013461 design Methods 0.000 description 1
- 238000011161 development Methods 0.000 description 1
- UQLDLKMNUJERMK-UHFFFAOYSA-L di(octadecanoyloxy)lead Chemical compound [Pb+2].CCCCCCCCCCCCCCCCCC([O-])=O.CCCCCCCCCCCCCCCCCC([O-])=O UQLDLKMNUJERMK-UHFFFAOYSA-L 0.000 description 1
- 239000013024 dilution buffer Substances 0.000 description 1
- 108010054813 diprotin B Proteins 0.000 description 1
- 238000001035 drying Methods 0.000 description 1
- 238000004043 dyeing Methods 0.000 description 1
- 238000001493 electron microscopy Methods 0.000 description 1
- 238000010828 elution Methods 0.000 description 1
- 239000002158 endotoxin Substances 0.000 description 1
- 201000004101 esophageal cancer Diseases 0.000 description 1
- 238000002474 experimental method Methods 0.000 description 1
- 230000002068 genetic effect Effects 0.000 description 1
- 108010008237 glutamyl-valyl-glycine Proteins 0.000 description 1
- 230000013595 glycosylation Effects 0.000 description 1
- 238000006206 glycosylation reaction Methods 0.000 description 1
- 108010089804 glycyl-threonine Proteins 0.000 description 1
- 201000010536 head and neck cancer Diseases 0.000 description 1
- 208000014829 head and neck neoplasm Diseases 0.000 description 1
- 238000000703 high-speed centrifugation Methods 0.000 description 1
- 108010045383 histidyl-glycyl-glutamic acid Proteins 0.000 description 1
- 108010025306 histidylleucine Proteins 0.000 description 1
- 108010018006 histidylserine Proteins 0.000 description 1
- 230000004727 humoral immunity Effects 0.000 description 1
- 230000008696 hypoxemic pulmonary vasoconstriction Effects 0.000 description 1
- 238000011534 incubation Methods 0.000 description 1
- 230000003834 intracellular effect Effects 0.000 description 1
- 108010053037 kyotorphin Proteins 0.000 description 1
- 238000011031 large-scale manufacturing process Methods 0.000 description 1
- 230000003902 lesion Effects 0.000 description 1
- 108010030617 leucyl-phenylalanyl-valine Proteins 0.000 description 1
- 108010073472 leucyl-prolyl-proline Proteins 0.000 description 1
- 108010012058 leucyltyrosine Proteins 0.000 description 1
- 238000011068 loading method Methods 0.000 description 1
- 201000005202 lung cancer Diseases 0.000 description 1
- 208000020816 lung neoplasm Diseases 0.000 description 1
- 108010003700 lysyl aspartic acid Proteins 0.000 description 1
- 108010064235 lysylglycine Proteins 0.000 description 1
- 108010017391 lysylvaline Proteins 0.000 description 1
- QSOMFNQEXNFPNU-UHFFFAOYSA-L magnesium;hydrogen sulfate;hydroxide;hydrate Chemical compound O.O.[Mg+2].[O-]S([O-])(=O)=O QSOMFNQEXNFPNU-UHFFFAOYSA-L 0.000 description 1
- 230000003211 malignant effect Effects 0.000 description 1
- 239000002609 medium Substances 0.000 description 1
- 108700023046 methionyl-leucyl-phenylalanine Proteins 0.000 description 1
- 230000000877 morphologic effect Effects 0.000 description 1
- 230000003472 neutralizing effect Effects 0.000 description 1
- 230000008520 organization Effects 0.000 description 1
- 238000012856 packing Methods 0.000 description 1
- 108010083476 phenylalanyltryptophan Proteins 0.000 description 1
- CWCMIVBLVUHDHK-ZSNHEYEWSA-N phleomycin D1 Chemical compound N([C@H](C(=O)N[C@H](C)[C@@H](O)[C@H](C)C(=O)N[C@@H]([C@H](O)C)C(=O)NCCC=1SC[C@@H](N=1)C=1SC=C(N=1)C(=O)NCCCCNC(N)=N)[C@@H](O[C@H]1[C@H]([C@@H](O)[C@H](O)[C@H](CO)O1)O[C@@H]1[C@H]([C@@H](OC(N)=O)[C@H](O)[C@@H](CO)O1)O)C=1N=CNC=1)C(=O)C1=NC([C@H](CC(N)=O)NC[C@H](N)C(N)=O)=NC(N)=C1C CWCMIVBLVUHDHK-ZSNHEYEWSA-N 0.000 description 1
- 239000008363 phosphate buffer Substances 0.000 description 1
- 239000008055 phosphate buffer solution Substances 0.000 description 1
- 239000002504 physiological saline solution Substances 0.000 description 1
- OTYBMLCTZGSZBG-UHFFFAOYSA-L potassium sulfate Chemical compound [K+].[K+].[O-]S([O-])(=O)=O OTYBMLCTZGSZBG-UHFFFAOYSA-L 0.000 description 1
- 229910052939 potassium sulfate Inorganic materials 0.000 description 1
- 235000011151 potassium sulphates Nutrition 0.000 description 1
- 239000000843 powder Substances 0.000 description 1
- 230000003449 preventive effect Effects 0.000 description 1
- 230000009465 prokaryotic expression Effects 0.000 description 1
- 230000002062 proliferating effect Effects 0.000 description 1
- 108700042769 prolyl-leucyl-glycine Proteins 0.000 description 1
- 108010077112 prolyl-proline Proteins 0.000 description 1
- 108010090894 prolylleucine Proteins 0.000 description 1
- 229940032313 prophylactic HPV vaccine Drugs 0.000 description 1
- 238000007670 refining Methods 0.000 description 1
- 230000010076 replication Effects 0.000 description 1
- 238000011160 research Methods 0.000 description 1
- 231100000279 safety data Toxicity 0.000 description 1
- 238000007789 sealing Methods 0.000 description 1
- 238000012163 sequencing technique Methods 0.000 description 1
- 235000020183 skimmed milk Nutrition 0.000 description 1
- 239000012089 stop solution Substances 0.000 description 1
- 238000007920 subcutaneous administration Methods 0.000 description 1
- 238000003786 synthesis reaction Methods 0.000 description 1
- 238000010998 test method Methods 0.000 description 1
- 229940031351 tetravalent vaccine Drugs 0.000 description 1
- 108010072986 threonyl-seryl-lysine Proteins 0.000 description 1
- 230000002103 transcriptional effect Effects 0.000 description 1
- 238000013519 translation Methods 0.000 description 1
- 230000014616 translation Effects 0.000 description 1
- 108010080629 tryptophan-leucine Proteins 0.000 description 1
- 108010038745 tryptophylglycine Proteins 0.000 description 1
- 201000005112 urinary bladder cancer Diseases 0.000 description 1
- 238000002255 vaccination Methods 0.000 description 1
- 206010046885 vaginal cancer Diseases 0.000 description 1
- 208000013139 vaginal neoplasm Diseases 0.000 description 1
- 238000010200 validation analysis Methods 0.000 description 1
- 210000000605 viral structure Anatomy 0.000 description 1
- 201000005102 vulva cancer Diseases 0.000 description 1
- 238000005406 washing Methods 0.000 description 1
- 239000007221 ypg medium Substances 0.000 description 1
Classifications
-
- C—CHEMISTRY; METALLURGY
- C07—ORGANIC CHEMISTRY
- C07K—PEPTIDES
- C07K14/00—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof
- C07K14/005—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from viruses
-
- A—HUMAN NECESSITIES
- A61—MEDICAL OR VETERINARY SCIENCE; HYGIENE
- A61K—PREPARATIONS FOR MEDICAL, DENTAL OR TOILETRY PURPOSES
- A61K39/00—Medicinal preparations containing antigens or antibodies
- A61K39/12—Viral antigens
-
- A—HUMAN NECESSITIES
- A61—MEDICAL OR VETERINARY SCIENCE; HYGIENE
- A61P—SPECIFIC THERAPEUTIC ACTIVITY OF CHEMICAL COMPOUNDS OR MEDICINAL PREPARATIONS
- A61P31/00—Antiinfectives, i.e. antibiotics, antiseptics, chemotherapeutics
- A61P31/12—Antivirals
- A61P31/20—Antivirals for DNA viruses
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/09—Recombinant DNA-technology
- C12N15/63—Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
- C12N15/79—Vectors or expression systems specially adapted for eukaryotic hosts
- C12N15/80—Vectors or expression systems specially adapted for eukaryotic hosts for fungi
- C12N15/81—Vectors or expression systems specially adapted for eukaryotic hosts for fungi for yeasts
- C12N15/815—Vectors or expression systems specially adapted for eukaryotic hosts for fungi for yeasts for yeasts other than Saccharomyces
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N2710/00—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA dsDNA viruses
- C12N2710/00011—Details
- C12N2710/20011—Papillomaviridae
- C12N2710/20022—New viral proteins or individual genes, new structural or functional aspects of known viral proteins or genes
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N2710/00—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA dsDNA viruses
- C12N2710/00011—Details
- C12N2710/20011—Papillomaviridae
- C12N2710/20034—Use of virus or viral component as vaccine, e.g. live-attenuated or inactivated virus, VLP, viral protein
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N2800/00—Nucleic acids vectors
- C12N2800/10—Plasmid DNA
- C12N2800/102—Plasmid DNA for yeast
Landscapes
- Health & Medical Sciences (AREA)
- Life Sciences & Earth Sciences (AREA)
- Chemical & Material Sciences (AREA)
- Genetics & Genomics (AREA)
- Organic Chemistry (AREA)
- Engineering & Computer Science (AREA)
- Virology (AREA)
- General Health & Medical Sciences (AREA)
- Mycology (AREA)
- Medicinal Chemistry (AREA)
- Biotechnology (AREA)
- Molecular Biology (AREA)
- General Engineering & Computer Science (AREA)
- Pharmacology & Pharmacy (AREA)
- Wood Science & Technology (AREA)
- Zoology (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Biochemistry (AREA)
- Veterinary Medicine (AREA)
- Biophysics (AREA)
- Biomedical Technology (AREA)
- Public Health (AREA)
- Animal Behavior & Ethology (AREA)
- Microbiology (AREA)
- Communicable Diseases (AREA)
- Immunology (AREA)
- Epidemiology (AREA)
- Physics & Mathematics (AREA)
- Gastroenterology & Hepatology (AREA)
- Plant Pathology (AREA)
- Proteomics, Peptides & Aminoacids (AREA)
- Oncology (AREA)
- Nuclear Medicine, Radiotherapy & Molecular Imaging (AREA)
- General Chemical & Material Sciences (AREA)
- Chemical Kinetics & Catalysis (AREA)
- Micro-Organisms Or Cultivation Processes Thereof (AREA)
- Preparation Of Compounds By Using Micro-Organisms (AREA)
- Medicines Containing Antibodies Or Antigens For Use As Internal Diagnostic Agents (AREA)
Abstract
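The invention provides a polynucleotide for expressing HPV 35 L1, together with its expression vector, host cell, and uses. Producing HPV 35 L1 protein with this polynucleotide gives a high yield. The HPV 35 L1 protein prepared by this method can be used to prepare a vaccine for preventing HPV 35 infection.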
Description
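Technical Field
The invention belongs to the field of biotechnology and relates to a method for producing HPV 35 L1 protein, in particular to a polynucleotide for expressing HPV 35 L1 and its expression vector, host cell, and uses.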
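Background
Human papillomavirus (HPV) is a small, non-enveloped, double-stranded circular DNA virus, a member of papillomavirus genus A of the family Papovaviridae. To date, more than 200 HPV genotypes have been identified, of which at least 13 can induce carcinogenesis after persistent infection and are regarded as high-risk HPV (hrHPV). According to data published by the International Agency for Research on Cancer (IARC), genotypes including HPV-16, -18, -31, -33, -35, -39, -45, -51, -52, -56, -58, and -59 have been shown to transform infected cells into malignant tumor cells and thereby cause cervical cancer. [IARC. Biological agents: a review of human carcinogenesis. IARC Monogr Eval Carcinog Risks Hum 2012;100B.] Under the electron microscope, HPV appears as a sphere about 60 nm in diameter: a viral particle consisting of an icosahedrally symmetric capsid enclosing a nucleic acid of about 8000 base pairs. [Knipe DM, Howley PM. Fields Virology. 6th ed. Philadelphia, PA: Wolters Kluwer/Lippincott Williams & Wilkins Health; 2013.] Only one strand of the double-stranded viral DNA genome serves as the transcription template. The genome contains ten open reading frames and is organized into three regions: the early region (E), which encodes six viral regulatory proteins (E1, E2, E4, E5, E6, and E7); the late region (L), which encodes the two viral capsid proteins L1 and L2; and the long control region (LCR), which regulates replication, transcription, and translation of the viral genome.
The antigen component of the prophylactic HPV vaccines currently on the market is mainly virus-like particles (VLPs) composed of the capsid protein L1. A VLP is a recombinant protein expressed by genetic engineering: the viral capsid protein is produced in a heterologous recombinant expression system, and the purified expression product forms virus-like particles that contain no viral nucleic acid and have a spatial structure similar to that of the native virus. Lacking viral genetic material, VLPs cannot infect a host, but their near-native structure elicits effective humoral and cellular immunity and thereby prevents infection and disease. Vaccines produced with this strategy have a single, stable component, are strongly immunogenic, and are relatively safe. The Global Advisory Committee on Vaccine Safety (GACVS), which works with the World Health Organization (WHO), regularly reviews safety data on HPV vaccines. In its most recent review, on 20 July 2017, it pooled data from more than 270 million administered doses and concluded that HPV vaccines are very safe and that there is currently no clear evidence linking them to any serious side effect or major medical condition. [GACVS. Safety update of HPV vaccines. https://www.who.int/vaccine_safety/committee/topics/hpv/June_2017/en/; 2017.]
Numerous studies have shown that the HPV major capsid protein L1 can be expressed in a variety of expression systems and, without help from the minor capsid protein L2, assembles into virus-like particles morphologically similar to native HPV. Prophylactic HPV vaccines from three companies are currently on the market: the bivalent vaccine (HPV 16, 18) from GlaxoSmithKline; the quadrivalent vaccine (HPV 6, 11, 16, 18) and nonavalent vaccine (HPV 6, 11, 16, 18, 31, 33, 45, 52, 58) from Merck Sharp & Dohme; and the bivalent vaccine (HPV 16, 18) from Xiamen Innovax Biotech Co., Ltd. These three companies produce HPV L1 protein in the insect cell-baculovirus expression system, the Saccharomyces cerevisiae expression system, and the Escherichia coli expression system, respectively; the purified antigen is adsorbed onto an adjuvant to give a VLP vaccine that prevents HPV infection.
HPV 35, however, is a high-risk HPV type capable of inducing cervical cancer and other malignancies, and there has been no report of expressing HPV 35 L1 protein in Hansenula yeast to assemble VLPs.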
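Summary of the Invention
The object of the invention is to provide a polynucleotide sequence for expressing HPV 35 L1, together with its expression vector, host cell, and uses.
In a first aspect, the invention provides a polynucleotide encoding HPV 35 L1 protein, the sequence of which is shown in SEQ ID NO:2.
Further, the amino acid sequence of the HPV 35 L1 protein is shown in SEQ ID NO:1.
In a second aspect, the invention provides a recombinant expression vector containing the polynucleotide described above.
Optionally, the recombinant expression vector is obtained by inserting the polynucleotide shown in SEQ ID NO:2 into a plasmid. The plasmid may be one commonly used in the laboratory; for example, the plasmid provided in the examples of this application is pMTZ.
Further, the recombinant expression vector also contains a promoter and a terminator.
Optionally, the promoter may be pMOX and the terminator may be MOX TT.
In a third aspect, the invention provides a host cell that contains the above recombinant expression vector or has it integrated.
Further, the host cell is a yeast.
Preferably, the yeast is a methylotrophic yeast, more preferably Hansenula polymorpha.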
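In a fourth aspect, the invention provides a method for producing HPV 35 L1 protein, comprising the following steps: constructing a recombinant Hansenula strain that contains, or has integrated, the polynucleotide whose nucleotide sequence is shown in SEQ ID NO:2; culturing the strain; harvesting the cells; disrupting the cells to obtain a lysate; and separating and purifying the lysate to obtain the HPV 35 L1 protein.
Further, the polynucleotide is integrated into a plasmid, and the plasmid is integrated into the genome of the recombinant Hansenula strain.
Further, the culture conditions include: pH 5.0 to 7.0, fermentation temperature 37 °C, stirring speed ≤ 950 rpm, air flow ≤ 2.0 VVM, tank pressure ≤ 0.10 MPa, and dissolved oxygen of 10% or more.
Further, the recombinant Hansenula strain is cultured in a glycerol-containing medium. During culture, once the glycerol in the medium is exhausted and the wet cell weight exceeds 100 g/L, glycerol feeding begins at a rate of 200 to 600 g/h. Once the wet cell weight exceeds 200 g/L, methanol is added in a single dose to 0.5% (w/v), starting the methanol induction phase. After the methanol is completely consumed and dissolved oxygen rises to 80%, continuous methanol feeding begins, and the feed rate is adjusted stepwise as the cells use methanol faster. Dissolved oxygen is kept at 20% or more during induction, and fermentation ends after 30 to 50 h of induction, when the wet cell weight reaches 300 to 400 g/L.
Further, the separation and purification means passing the cell lysate first through a cation-exchange chromatography column and then through a CHT chromatography column.
Further, the exchange chromatography packing of the cation-exchange column is POROS HS, Nanogel SP, or the like.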
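The feeding schedule in the fourth aspect can be read as a small decision rule. The Python sketch below is only an illustration under stated assumptions: the `FermenterState` fields, the function name `next_feed_action`, and the returned labels are hypothetical and do not come from the patent. Only the numeric thresholds (100 g/L, 200 g/L, 0.5% w/v methanol, 80% and 20% dissolved oxygen, 30 h, 300 g/L) are taken from the text, and the end-of-run test is a simplification of "30 to 50 h, 300 to 400 g/L".

```python
from dataclasses import dataclass


@dataclass
class FermenterState:
    """Hypothetical snapshot of the tank; field names are illustrative only."""
    wet_cell_weight_g_per_l: float
    glycerol_exhausted: bool = False
    methanol_pulse_added: bool = False
    methanol_feed_started: bool = False
    methanol_exhausted: bool = False
    dissolved_oxygen_pct: float = 100.0
    induction_hours: float = 0.0


def next_feed_action(state: FermenterState) -> str:
    """Suggest the next feeding step from the thresholds stated in the text."""
    wcw = state.wet_cell_weight_g_per_l
    if not state.methanol_pulse_added:
        if wcw > 200:
            # A single methanol dose to 0.5% (w/v) starts the induction phase.
            return "add methanol pulse to 0.5% (w/v)"
        if state.glycerol_exhausted and wcw > 100:
            # Glycerol fed-batch phase, 200 to 600 g/h.
            return "feed glycerol at 200-600 g/h"
        return "continue batch growth on glycerol"
    if not state.methanol_feed_started:
        # Wait until the pulse is consumed and dissolved oxygen rebounds to 80%.
        if state.methanol_exhausted and state.dissolved_oxygen_pct >= 80:
            return "start continuous methanol feed"
        return "wait for methanol consumption"
    # Simplified stop rule (assumption): lower bounds of the stated ranges.
    if state.induction_hours >= 30 and wcw >= 300:
        return "end fermentation"
    return "adjust methanol feed, keep dissolved oxygen >= 20%"


print(next_feed_action(FermenterState(150, glycerol_exhausted=True)))
```

A real controller would also need the pH, temperature, stirring, air-flow, and pressure limits listed above; they are left out here to keep the phase logic readable.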
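In a fifth aspect, the invention provides an HPV 35 L1 protein obtained by the above method for producing HPV 35 L1 protein.
In a sixth aspect, the invention provides the use of the above polynucleotide encoding HPV 35 L1 protein, recombinant expression vector, host cell, or HPV 35 L1 protein in the preparation of an HPV vaccine.
In a seventh aspect, the invention provides a method for preparing an anti-HPV vaccine, comprising the following steps: preparing HPV 35 L1 protein by the above method for producing HPV 35 L1 protein, and adding a pharmaceutically acceptable vaccine adjuvant.
In an eighth aspect, the invention provides an anti-HPV vaccine obtained by the above method for preparing an anti-HPV vaccine.
Beneficial technical effects of the invention: the invention provides the polynucleotide shown in SEQ ID NO:2, which yields far more HPV 35 L1 protein than the other polynucleotide sequences compared. As a eukaryotic unicellular organism, Hansenula yeast offers low culture cost, fast growth, and a well-understood molecular biology background. Compared with prokaryotic expression systems, it has a more complete system of post-translational protein modification, and its expression products are free of endotoxin. In addition, compared with other eukaryotic expression systems (such as Saccharomyces cerevisiae), Hansenula yeast offers stable genetic traits, high yield, and more appropriate product glycosylation, and it avoids problems such as the low integrated copy number of foreign genes in Pichia yeast.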
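Brief Description of the Drawings
Figure 1: Structure of the pMTZ vector according to one embodiment of the invention.
Figure 2: Structure of the 35L1-1-pMTZ vector according to one embodiment of the invention.
Figure 3: Structure of the 35L1-2-pMTZ vector according to one embodiment of the invention.
Figure 4: Structure of the 35L1-3-pMTZ vector according to one embodiment of the invention.
Figure 5: Structure of the 35L1-4-pMTZ vector according to one embodiment of the invention.
Figure 6: ELISA detection of 35L1 protein expression in recombinant Hansenula engineered strains containing the different nucleotide coding sequences 35L1-1, 35L1-2, 35L1-3, and 35L1-4.
Figure 7: SDS-PAGE analysis of HPV 35 L1 protein expression during fermentation. M: molecular weight marker; 1: before induction; 2: 10 hours of induction; 3: 20 hours of induction; 4: 30 hours of induction; 5: cells at harvest.
Figure 8: Western blot analysis of HPV 35 L1 protein expression during fermentation. M: molecular weight marker; 1: before induction; 2: 10 hours of induction; 3: 20 hours of induction; 4: 30 hours of induction; 5: cells at harvest.
Figure 9: SDS-PAGE analysis of purified HPV 35 L1 protein. M: molecular weight marker; 1: purified HPV 35 L1 protein.
Figure 10: Transmission electron microscopy of purified HPV 35 L1 protein.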
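Detailed Description
To achieve efficient expression of HPV 35 L1 protein in Hansenula yeast, the invention discloses a nucleotide sequence encoding HPV 35 L1 protein and a method for preparing a recombinant Hansenula strain that expresses it, together with a fermentation process that ensures efficient expression of HPV 35 L1 VLPs. The expressed HPV 35 L1 protein is purified sequentially on a POROS HS cation-exchange column and a CHT column to give a high-purity solution of the target protein. This solution can serve as the antigen component of a monovalent recombinant HPV 35 L1 vaccine or a multivalent recombinant HPV vaccine, thereby preventing HPV 35 infection and, in turn, cervical cancer and other related diseases caused by HPV 35 infection (including but not limited to cancers such as cervical, vaginal, vulvar, endometrial, anal, penile, head and neck, lung, bladder, breast, esophageal, prostate, and ovarian cancer and colorectal adenoma, as well as their precancerous lesions).
Based on the amino acid sequence of HPV 35 L1 protein, four different DNA coding sequences were synthesized. Each synthesized DNA sequence was cloned into a Hansenula expression vector, giving four recombinant expression plasmids carrying the HPV 35 L1 coding gene; all four are intracellular-expression plasmids. The recombinant plasmids were integrated into the Hansenula genome by genetic engineering methods. Expression screening showed that strains containing the SEQ ID NO:2 gene expressed more HPV 35 L1 protein than strains carrying the other, comparative DNA coding sequences. The high-expressing strain containing the SEQ ID NO:2 gene was cultured in a fermenter and the product purified by chromatography to give high-purity HPV 35 L1 protein, which becomes the HPV 35 L1 vaccine after adsorption onto an aluminum adjuvant.
The embodiments of the invention are illustrated below by specific examples, and those skilled in the art can readily understand other advantages and effects of the invention from the content disclosed in this specification. The invention can also be implemented or applied through other, different specific embodiments, and the details in this specification can be modified or changed on the basis of different viewpoints and applications without departing from the spirit of the invention.
Before the specific embodiments are described further, it should be understood that the scope of protection of the invention is not limited to the particular embodiments described below. It should also be understood that the terms used in the examples are intended to describe particular embodiments, not to limit the scope of protection. In the specification and claims, unless the context clearly indicates otherwise, the singular forms "a", "an", and "the" include the plural.
Where an example gives a numerical range, it should be understood that, unless otherwise stated, both endpoints of the range and any value between them may be used. Unless otherwise defined, all technical and scientific terms used herein have the meaning commonly understood by those skilled in the art. In addition to the specific methods, equipment, and materials used in the examples, any prior-art methods, equipment, and materials similar or equivalent to those described in the examples may also be used to practice the invention, based on the skilled person's knowledge of the prior art and the description herein.
The examples are intended to illustrate the disclosed embodiments of the invention and should not be construed as limiting it. Moreover, various modifications and variations of the methods listed herein will be apparent to those skilled in the art without departing from the scope and spirit of the invention. Although the invention has been described specifically in connection with various preferred embodiments, it should be understood that the invention is not limited to these embodiments. Indeed, modifications such as those described above that are obvious to those skilled in the art are intended to fall within the scope of the invention.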
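Example 1 Construction of the HPV 35 L1 protein engineered strain
1. Selection of the HPV 35 L1 amino acid sequence
The full-length HPV 35 L1 protein consists of 502 amino acids. After searching NCBI GenBank and performing alignment analysis, the most representative conserved sequence (GenBank: CAA52566.1) was selected as the amino acid sequence of HPV 35 L1; its sequence is shown in SEQ ID NO:1.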
SEQ ID NO:1
MSLWRSNEATVYLPPVSVSKVVSTDEYVTRTNIYYHAGSSRLLAVGHPYYAIKKQDSNKIAVPKVSGLQYRVFRVKLPDPNKFGFPDTSFYDPASQRLVWACTGVEVGRGQPLGVGISGHPLLNKLDDTENSNKYVGNSGTDNRECISMDYKQTQLCLIGCRPPIGEHWGKGTPCNANQVKAGECPPLELLNTVLQDGDMVDTGFGAMDFTTLQANKSDVPLDICSSICKYPDYLKMVSEPYGDMLFFYLRREQMFVRHLFNRAGTVGETVPADLYIKGTTGTLPSTSYFPTPSGSMVTSDAQIFNKPYWLQRAQGHNNGICWSNQLFVTVVDTTRSTNMSVCSAVSSSDSTYKNDNFKEYLRHGEEYDLQFIFQLCKITLTADVMTYIHSMNPSILEDWNFGLTPPPSGTLEDTYRYVTSQAVTCQKPSAPKPKDDPLKNYTFWEVDLKEKFSADLDQFPLGRKFLLQAGLKARPNFRLGKRAAPASTSKKSSTKRRKVKS
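2. Design and synthesis of the HPV 35 L1 coding gene
To express HPV 35 L1 protein efficiently in Hansenula yeast, the nucleotide coding sequence of HPV 35 L1 was optimized with a Hansenula codon-optimization strategy, starting from the nucleotide sequence of the wild-type HPV 35 strain whose L1 has GenBank ID CAA52566.1. The codon-optimized nucleotide sequences are shown in SEQ ID NO:2, SEQ ID NO:3, SEQ ID NO:4, and SEQ ID NO:5. Suzhou GENEWIZ Biotechnology Co., Ltd. was commissioned to synthesize the full-length genes from these optimized coding sequences, and the synthesized genes were verified by sequencing.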
SEQ ID NO:2
atgtctctgtggagatccaacgaggccactgtctacctgcctccagtttcggtgtctaaggttgtgtccacggacgagtacgtcactagaaccaacatctactaccacgcaggttcctctagactcctggctgttggtcacccttactatgccattaagaagcaggactcgaacaagatcgccgtcccaaaggtttctggcttgcagtacagagtgttcagagttaagctgccagaccctaacaagttcggatttccagacacctccttctacgaccctgcttcccagagattggtttgggcatgcactggagtcgaggtgggcagaggtcagccattgggagttggtatctctggccaccctttgctgaacaagctcgacgataccgagaactccaacaagtacgttggcaactctggaaccgacaacagagagtgcatctcgatggactacaagcagacccagttgtgtctcatcggatgcagaccacctattggtgaacattggggaaagggcaccccttgcaacgccaaccaggtcaaggccggagagtgtcctccattggagcttctgaacactgttctccaagatggtgacatggttgacaccggctttggtgctatggacttcacgaccttgcaggccaacaagtccgacgtgccacttgacatctgttcttccatttgcaagtaccctgattacctgaagatggtttcggagccatacggagacatgctcttcttttacctgagaagagagcagatgttcgtgagacacttgttcaacagagcaggaactgttggtgaaacggtccctgctgacctgtacatcaagggcaccactggtacgttgccatctacctcgtacttccctactccatctggttcgatggtcacctccgatgcccagatcttcaacaagccatactggttgcagagagcccagggacacaacaatggcatttgctggtccaaccagctgttcgtgaccgtcgttgacactacgagatccaccaacatgtcggtgtgttctgcagtcagctcttccgactcgacctacaagaacgacaacttcaaggagtacctcagacacggtgaagagtacgacctgcaattcatcttccagttgtgcaagatcaccctgactgctgacgttatgacgtacattcactccatgaacccttcgatcctggaggactggaacttcggtcttactccacctccatctggcaccttggaggacacttacagatatgtcacctcccaagctgttacgtgtcagaagccttcggccccaaagcctaaggacgatccactgaagaactacaccttctgggaggttgacctgaaggagaagttctccgcagacctcgaccagttcccattgggcagaaagttcctgctccaagctggattgaaggccagacctaacttcagacttggcaagagagccgctccagcatctacctctaagaaatcgtccacgaagcgcagaaaggtgaagtcgtaatag
SEQ ID NO:3
atgtccttgtggagatctaacgaggctaccgtttacctcccacctgtctctgtttccaaggtcgtttcgactgacgaatacgtgaccagaacgaacatctactaccacgccggatcttcgagactgcttgccgtcggacacccatattacgctatcaagaagcaggactccaacaagatcgctgttcctaaggtctcgggtctccagtacagagttttcagagtgaagttgcctgacccaaacaagttcggcttccctgacacgtcgttctacgacccagcctctcaaagactggtctgggcctgtaccggtgttgaggtcggaagaggccaacctctgggtgtgggaatttccggtcacccactcttgaacaagctggatgacactgagaactcgaataagtacgtcggaaactccggcacagacaacagagaatgtatttctatggactacaagcagacgcaactgtgccttatcggctgtagacctccaatcggagagcactggggcaagggtactccatgcaacgctaaccaggttaaggcaggtgagtgcccacctctggagttgctcaacaccgtgcttcaggacggagatatggttgacaccggtttcggcgcaatggactttactacgctccaggctaacaagtcggacgttcctttggatatttgctcctctatctgtaagtacccagactacttgaagatggtttctgagccttacggcgacatgctgttcttctacctcagacgcgagcagatgttcgttagacacctgtttaacagagccggtactgtgggcgagaccgttccagccgacttgtacattaagggaacgaccggcacactgccttccacctcttacttcccaaccccttcgggatctatggttacttctgacgctcaaatcttcaacaagccttactggctgcaaagagcacagggtcacaacaacggtatctgctggtcgaaccagttgttcgtcactgttgtggacacgaccagatctaccaacatgtccgtttgctctgcagtttccagctctgactccacttacaagaacgacaacttcaaggaatacttgagacacggcgaggaatacgacctccagttcatcttccagctgtgcaagattaccttgaccgccgatgtgatgacttacatccactccatgaacccatccatcctcgaagactggaacttcggactgacccctccaccttctggtactctggaggacacctatagatacgttacctctcaggccgtgacttgccagaagccatccgcacctaagccaaaggatgaccctttgaagaactacacgttttgggaggtcgacttgaaggagaagttctctgccgacttggatcagttccctctgggtagaaagttcctgcttcaggccggcttgaaggctagaccaaacttcagactgggcaagagagcagccccagcttccacttccaagaagtcctcgaccaagagaagaaaggtcaagtcctaatag
SEQ ID NO:4
atgagcctgtggagaagcaacgaggccaccgtgtacctgcccccggtgagcgtgagcaaggtggtgagcaccgacgagtacgtgacaagaaccaacatctactaccacgccggcagcagcagactgctggccgtgggccacccctactacgccatcaagaagcaagacagcaacaagatcgccgtgcccaaggtgagcggcctgcagtacagagtgttcagagtgaagctgcccgaccccaacaagttcggcttccccgacactagcttctatgacccggctagccaaagactggtgtgggcgtgtacgggcgtggaggtaggtagagggcagccactgggcgtgggcatcagcggccaccccctgctgaacaagctggacgacaccgagaacagcaacaagtacgtgggcaacagcggcaccgacaacagagagtgcatcagcatggactacaagcagacacagctgtgcctgatcggctgcagaccccccatcggcgagcactggggcaaaggcaccccgtgtaacgctaaccaagtcaaggcgggagagtgcccccccctggagctgctgaacaccgtgctgcaagacggcgacatggtggacaccggcttcggcgccatggacttcaccaccctgcaagccaacaagagcgacgtgcccctggacatctgcagcagcatctgcaagtaccccgactacctgaagatggtgagcgagccctacggcgacatgctgttcttctacctgagaagagagcagatgttcgtgagacacctgttcaacagagccggcaccgtgggcgagaccgtgcccgccgacctgtacatcaagggcaccaccggcaccctgcctagcacaagctacttccccacgcctagcggcagcatggtgacaagcgacgctcagatcttcaacaagccctactggctgcagagagcccaaggccacaacaacggcatctgctggagcaatcagctgttcgtgaccgtggtggacaccacaagaagcaccaacatgagcgtgtgcagcgccgtgagcagcagcgacagcacctacaagaacgacaacttcaaggagtacctgagacacggcgaggagtacgacctgcagttcatctttcagctgtgcaagatcaccctgaccgccgacgtgatgacctacatccacagcatgaaccctagcatcctggaggactggaacttcggcctgaccccccctcctagcggcaccctggaggacacctacagatacgtgacaagccaagccgtgacctgtcagaagcctagcgcccccaagcccaaggacgaccccctgaagaactacaccttctgggaggtggacctgaaggagaagttcagcgccgacctggatcagttccccctgggcagaaagttcctgctgcaagccggcctgaaggctagacccaacttcagactgggcaagagagccgcccccgctagcacaagcaagaagagcagcaccaagagaagaaaggtgaagagctaatag
SEQ ID NO:5
atgagcctgtggagaagcaatgaggccacagtgtacctgccccctgtgtctgtgagcaaggtggtgagcacagatgagtatgtgacaagaaccaacatctactaccatgctggcagcagcagactgctggctgtgggccacccctactatgccatcaagaagcaagacagcaacaagattgctgtgcccaaggtgtctggcctgcagtacagagtgttcagagtgaagctgcctgaccccaacaagtttggcttccctgacactagcttctatgaccctgctagccaaagactggtgtgggcctgtactggggtggaggtaggtagagggcagccactgggggtgggcatctctggccaccccctgctgaacaagctggatgacacagagaacagcaacaagtatgtgggcaactctggcacagacaacagagagtgcatcagcatggactacaagcagacacagctgtgcctgattggctgcagaccccccattggggagcactggggcaaaggcaccccctgtaatgccaaccaagtcaaggctggagagtgcccccccctggagctgctgaacacagtgctgcaagatggggacatggtggacactggctttggggccatggacttcaccaccctgcaagccaacaagtctgatgtgcccctggacatctgcagcagcatctgcaagtaccctgactacctgaagatggtgtctgagccctatggggacatgctgttcttctacctgagaagagagcagatgtttgtgagacacctgttcaacagagctggcacagtgggggagacagtgcctgctgacctgtacatcaagggcaccactggcaccctgcctagcacaagctacttccccaccccctctggcagcatggtgacctctgatgctcagatcttcaacaagccctactggctgcagagagcccaaggccacaacaatggcatctgctggagcaatcagctgtttgtgacagtggtggacaccacaagaagcaccaacatgtctgtgtgctctgctgtgagcagctctgacagcacctacaagaatgacaacttcaaggagtacctgagacatggggaggagtatgacctgcagttcatctttcagctgtgcaagatcaccctgacagctgatgtgatgacctacatccacagcatgaaccctagcatcctggaggactggaactttggcctgaccccccctccctctggcaccctggaggacacctacagatatgtgacaagccaagctgtgacctgtcagaagccctctgcccccaagcccaaggatgaccccctgaagaactacaccttctgggaggtggacctgaaggagaagttctctgctgacctggatcagttccccctgggcagaaagttcctgctgcaagctggcctgaaggctagacccaacttcagactgggcaagagagctgcccctgctagcacaagcaagaagagcagcaccaagagaagaaaggtgaagagctaatag
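3. Construction of the HPV 35L1 protein recombinant expression vectors
The Hansenula polymorpha expression vector pMTZ used in the present invention (SEQ ID NO:6, Fig. 1) was engineered in-house from the commercial vector pPICZ B by replacing the original promoter and transcription terminator of pPICZ B with a Hansenula polymorpha promoter and transcription terminator. The four optimized HPV 35L1 coding sequences were each cloned into the pMTZ vector via a 5' BstBI restriction site and a 3' KpnI restriction site, yielding the expression vectors 35L1-1-pMTZ (SEQ ID NO:7, Fig. 2), 35L1-2-pMTZ (SEQ ID NO:8, Fig. 3), 35L1-3-pMTZ (SEQ ID NO:9, Fig. 4) and 35L1-4-pMTZ (SEQ ID NO:10, Fig. 5). Transcription of the HPV 35L1 coding sequence is controlled by the Hansenula polymorpha methanol oxidase promoter pMOX and the MOX transcription termination region.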
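Directional cloning through BstBI and KpnI only works cleanly if neither enzyme also cuts inside the insert. The following is a minimal sketch of such a pre-cloning check, not a procedure described in the patent. The function names are hypothetical and the toy fragments are illustrative rather than taken from the sequence listing. The recognition sequences used are the standard ones for these enzymes (TTCGAA for BstBI, GGTACC for KpnI).

```python
# Recognition sequences of the two cloning enzymes named in the text.
# Both are palindromic, so scanning one strand is sufficient.
SITES = {"BstBI": "TTCGAA", "KpnI": "GGTACC"}


def find_sites(seq, site):
    """Return the 0-based start positions of every occurrence of site in seq."""
    seq, site = seq.upper(), site.upper()
    positions = []
    start = seq.find(site)
    while start != -1:
        positions.append(start)
        start = seq.find(site, start + 1)  # step by 1 to catch overlaps
    return positions


def internal_sites(cds):
    """Map each enzyme name to the positions of its site inside the CDS."""
    return {name: find_sites(cds, site) for name, site in SITES.items()}


def is_clonable(cds):
    """True if neither cloning enzyme would cut inside the CDS."""
    return all(not hits for hits in internal_sites(cds).values())


# Hypothetical toy fragments, not sequences from the patent.
print(internal_sites("atgtctctgtggagatcc"))
print(is_clonable("atgggtacctaa"))
```

In practice the same check would be run on each full coding sequence before ordering the synthetic gene, so that an internal site can be removed by a synonymous codon change.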
pMTZ vector sequence (SEQ ID NO:6):
agatctgtcgacgcggagaacgatctcctcgagctgctcgcggatcagcttgtggcccggtaatggaaccaggccgacgcgacgctccttgcggaccacggtggctggcgagcccagtttgtgaacgaggtcgtttagaacgtcctccgcaaagtccagtgtcagatgaatgtcctcctcggaccaattcagcatgttctcgagcagccatctgtctttggagtagaagcgtaatctctgctcctcgttactgtaccggaagaggtagtttgcctcgccgcccataatgaacaggttctctttctggtggcctgtgagcagcggggacgtctggacggcgtcgatgaggcccttgaggcgctcgtagtacttgttccgtcgctgtagccggccgcggtgacgatacccacatagaggtccttggccattagtttgatgaggtggggcaggatgggcgactcggcatcgaaatttttgccgtcgtcgtacagtgtgatgtcaccatcgaatgtaatgagctgcagcttgcgatctcggatggttttggaatggaagaaccgcgacatctccaacagctgggccgtgttgagaatgagccggacgtcgttgaacgagggggccacaagccggcgtttgctgatggcgcggcgctcgtcctcgatgtacaaggccttttccagaggcagtctcgtgaagaagctgccaacgctcggaaccagctgcacgagccgagacaattcgggggtgccggctttggtcatttcaatcttgtcgtcgatgaggagttcgaggtcgtggaagatttccgcgtagcggcgttttgcctcagagtttaccatgaggtcgtccactgcagagatgccgttgctcttcaccgcgtacaggaccaacggcgtcgccagcaggcccttgatccattctatgaggccatctcgacggtgttccttgagtgcgtactccactctgtagcgactggacatctcgagactgggcttgctgtgctcgatgcaccaattaattgttgccgcatgcatccttgcaccgcaagtttttaaaacccactcgctttagccgtcgcgtaaaacttgtgaatctggcaactgagggggttctgcagccgcaaccgaacttttcgcttcgaggacgcagctgcatggtgtcatgtgaggctctgtttgctggcgtagcctacaacgtgaccttgcctaaccggacggcgctacccactgctgtctgtgcctgctaccagaaaatcaccagagcagcagaggcccgatgtggcaactggtggggtgtcggacaggctgtttctccacagtgcaaatgcgggtgaaccggccagaaagtaaattcttatgctaccgtgcagcgactccgacatccccagtttttgccctacttgatcacagatggggtcagcgctgccgctaagtgtacccaaccgtgcccacacggtccatctataaatactgctgccagtgcacggtggtgacatcaatctaaagtacaaaaacaaattcgaaacgaggaattcacgtggcccagccggccgtctcggatcggtaccggagacgtggaaggacataccgcttttgagaagcgtgtttgaaaatagttctttttctggtttatatcgtttatgaagtgatgagatgaaaagctgaaatagcgagtataggaaaatttaatgaaaattaaattaaatattttcttaggctattagtcaccttcaaaatgccggccgcttctaagaacgttgtcatgatcgacaactacgactcgtttacctggaacctgtacgagtacctgtgtcaggagggagccaatgtcgaggttttcaggaacgatcagatcaccattccggagattgagcagctcaagccggacgttgtggtgatatcccctggtcctggccatccaagaacagactcgggaatatctcgcgacgtgatcagccattttaaaggcaagattcctgtctttggtgtctgtatgggcca
gcagtgtatcttcgaggagtttggcggagacgtcgagtatgcgggcgagattgtccatggaaaaacgtccactgttaagcacgacaacaagggaatgttcaaaaacgttccgcaagatgttgctgtcaccagataccactcgctggccggaacgctcaagtcgcttccggactgtctagagatcactgctcgcacagacaacgggatcattatgggtgtgagacacaagaagtacaccatcgagggcgtccagtttcatccagagagcattctgaccgaggagggccatctgatgatccagaatatcctcaacgtttccggtggttactgggaggaaaatgccaacggcgcggctcagagaaaggaaagcatattggagaaaatatacgcgcagagacgaaaagactacgagtttgagatgaacagaccggggcgcagatttgctgatctagaactgtacttgtccatgggactgcaccgccgctaatcaatttttacgacagattggagcagaacatcagcgccggcaaggttgcaattctcagcgaaatcaagagagcgtcgccttctaaaggcgtcatcgacggagacgctaacgctgccaaacaggccctcaactacgccaaggctggagttgccacaatttctgttttgaccgagccaacctggtttaaaggaaatatccaggacctggaggtggccagaaaagccattgactctgtggccaatagaccgtgtattttgcggaaggagtttatcttcaacaagtaccaaattctagaggcccgactggcgggagcagacacggttctgctgattgtcaagatgctgagctcggatcccccacacaccatagcttcaaaatgtttctactccttttttactcttccagattttctcggactccgcgcatcgccgtaccacttcaaaacacccaagcacagcatactaaattttccctctttcttcctctagggtgtcgttaattacccgtactaaaggtttggaaaagaaaaaagagaccgcctcgtttctttttcttcgtcgaaaaaggcaataaaaatttttatcacgtttctttttcttgaaatttttttttttagtttttttctctttcagtgacctccattgatatttaagttaataaacggtcttcaatttctcaagtttcagtttcatttttcttgttctattacaactttttttacttcttgttcattagaaagaaagcatagcaatctaatctaaggggcggtgttgacaattaatcatcggcatagtatatcggcatagtataatacgacaaggtgaggaactaaaccatggccaagttgaccagtgccgttccggtgctcaccgcgcgcgacgtcgccggagcggtcgagttctggaccgaccggctcgggttctcccgggacttcgtggaggacgacttcgccggtgtggtccgggacgacgtgaccctgttcatcagcgcggtccaggaccaggtggtgccggacaacaccctggcctgggtgtgggtgcgcggcctggacgagctgtacgccgagtggtcggaggtcgtgtccacgaacttccgggacgcctccgggccggccatgaccgagatcggcgagcagccgtgggggcgggagttcgccctgcgcgacccggccggcaactgcgtgcacttcgtggccgaggagcaggactgacacgtccgacggcggcccacgggtcccaggcctcggagatccgtcccccttttcctttgtcgatatcatgtaattagttatgtcacgcttacattcacgccctccccccacatccgctctaaccgaaaaggaaggagttagacaacctgaagtctaggtccctatttatttttttatagttatgttagtattaagaacgttatttatatttcaaatttttcttttttttctgtacagacgcgtgtacgcatgtaacattatactgaaaaccttgcttgagaaggttttgggacgctcgaaggc
tttaatttgcaagctggagaccaacatgtgagcaaaaggccagcaaaaggccaggaaccgtaaaaaggccgcgttgctggcgtttttccataggctccgcccccctgacgagcatcacaaaaatcgacgctcaagtcagaggtggcgaaacccgacaggactataaagataccaggcgtttccccctggaagctccctcgtgcgctctcctgttccgaccctgccgcttaccggatacctgtccgcctttctcccttcgggaagcgtggcgctttctcaatgctcacgctgtaggtatctcagttcggtgtaggtcgttcgctccaagctgggctgtgtgcacgaaccccccgttcagcccgaccgctgcgccttatccggtaactatcgtcttgagtccaacccggtaagacacgacttatcgccactggcagcagccactggtaacaggattagcagagcgaggtatgtaggcggtgctacagagttcttgaagtggtggcctaactacggctacactagaaggacagtatttggtatctgcgctctgctgaagccagttaccttcggaaaaagagttggtagctcttgatccggcaaacaaaccaccgctggtagcggtggtttttttgtttgcaagcagcagattacgcgcagaaaaaaaggatctcaagaagatcctttgatcttttctacggggtctgacgctcagtggaacgaaaactcacgttaagggattttggtcatgagatc
35L1-1-pMTZ (SEQ ID NO:7):
agatctgtcgacgcggagaacgatctcctcgagctgctcgcggatcagcttgtggcccggtaatggaaccaggccgacgcgacgctccttgcggaccacggtggctggcgagcccagtttgtgaacgaggtcgtttagaacgtcctccgcaaagtccagtgtcagatgaatgtcctcctcggaccaattcagcatgttctcgagcagccatctgtctttggagtagaagcgtaatctctgctcctcgttactgtaccggaagaggtagtttgcctcgccgcccataatgaacaggttctctttctggtggcctgtgagcagcggggacgtctggacggcgtcgatgaggcccttgaggcgctcgtagtacttgttccgtcgctgtagccggccgcggtgacgatacccacatagaggtccttggccattagtttgatgaggtggggcaggatgggcgactcggcatcgaaatttttgccgtcgtcgtacagtgtgatgtcaccatcgaatgtaatgagctgcagcttgcgatctcggatggttttggaatggaagaaccgcgacatctccaacagctgggccgtgttgagaatgagccggacgtcgttgaacgagggggccacaagccggcgtttgctgatggcgcggcgctcgtcctcgatgtacaaggccttttccagaggcagtctcgtgaagaagctgccaacgctcggaaccagctgcacgagccgagacaattcgggggtgccggctttggtcatttcaatcttgtcgtcgatgaggagttcgaggtcgtggaagatttccgcgtagcggcgttttgcctcagagtttaccatgaggtcgtccactgcagagatgccgttgctcttcaccgcgtacaggaccaacggcgtcgccagcaggcccttgatccattctatgaggccatctcgacggtgttccttgagtgcgtactccactctgtagcgactggacatctcgagactgggcttgctgtgctcgatgcaccaattaattgttgccgcatgcatccttgcaccgcaagtttttaaaacccactcgctttagccgtcgcgtaaaacttgtgaatctggcaactgagggggttctgcagccgcaaccgaacttttcgcttcgaggacgcagctgcatggtgtcatgtgaggctctgtttgctggcgtagcctacaacgtgaccttgcctaaccggacggcgctacccactgctgtctgtgcctgctaccagaaaatcaccagagcagcagaggcccgatgtggcaactggtggggtgtcggacaggctgtttctccacagtgcaaatgcgggtgaaccggccagaaagtaaattcttatgctaccgtgcagcgactccgacatccccagtttttgccctacttgatcacagatggggtcagcgctgccgctaagtgtacccaaccgtgcccacacggtccatctataaatactgctgccagtgcacggtggtgacatcaatctaaagtacaaaaacaaattcgaaacgatgtctctgtggagatccaacgaggccactgtctacctgcctccagtttcggtgtctaaggttgtgtccacggacgagtacgtcactagaaccaacatctactaccacgcaggttcctctagactcctggctgttggtcacccttactatgccattaagaagcaggactcgaacaagatcgccgtcccaaaggtttctggcttgcagtacagagtgttcagagttaagctgccagaccctaacaagttcggatttccagacacctccttctacgaccctgcttcccagagattggtttgggcatgcactggagtcgaggtgggcagaggtcagccattgggagttggtatctctggccaccctttgctgaacaagctcgacgataccgagaactccaacaagtacgttggcaactctggaaccgacaacagagagtgcatctcgatggactacaagcagacccagttgtgtctc
atcggatgcagaccacctattggtgaacattggggaaagggcaccccttgcaacgccaaccaggtcaaggccggagagtgtcctccattggagcttctgaacactgttctccaagatggtgacatggttgacaccggctttggtgctatggacttcacgaccttgcaggccaacaagtccgacgtgccacttgacatctgttcttccatttgcaagtaccctgattacctgaagatggtttcggagccatacggagacatgctcttcttttacctgagaagagagcagatgttcgtgagacacttgttcaacagagcaggaactgttggtgaaacggtccctgctgacctgtacatcaagggcaccactggtacgttgccatctacctcgtacttccctactccatctggttcgatggtcacctccgatgcccagatcttcaacaagccatactggttgcagagagcccagggacacaacaatggcatttgctggtccaaccagctgttcgtgaccgtcgttgacactacgagatccaccaacatgtcggtgtgttctgcagtcagctcttccgactcgacctacaagaacgacaacttcaaggagtacctcagacacggtgaagagtacgacctgcaattcatcttccagttgtgcaagatcaccctgactgctgacgttatgacgtacattcactccatgaacccttcgatcctggaggactggaacttcggtcttactccacctccatctggcaccttggaggacacttacagatatgtcacctcccaagctgttacgtgtcagaagccttcggccccaaagcctaaggacgatccactgaagaactacaccttctgggaggttgacctgaaggagaagttctccgcagacctcgaccagttcccattgggcagaaagttcctgctccaagctggattgaaggccagacctaacttcagacttggcaagagagccgctccagcatctacctctaagaaatcgtccacgaagcgcagaaaggtgaagtcgtaataggtaccggagacgtggaaggacataccgcttttgagaagcgtgtttgaaaatagttctttttctggtttatatcgtttatgaagtgatgagatgaaaagctgaaatagcgagtataggaaaatttaatgaaaattaaattaaatattttcttaggctattagtcaccttcaaaatgccggccgcttctaagaacgttgtcatgatcgacaactacgactcgtttacctggaacctgtacgagtacctgtgtcaggagggagccaatgtcgaggttttcaggaacgatcagatcaccattccggagattgagcagctcaagccggacgttgtggtgatatcccctggtcctggccatccaagaacagactcgggaatatctcgcgacgtgatcagccattttaaaggcaagattcctgtctttggtgtctgtatgggccagcagtgtatcttcgaggagtttggcggagacgtcgagtatgcgggcgagattgtccatggaaaaacgtccactgttaagcacgacaacaagggaatgttcaaaaacgttccgcaagatgttgctgtcaccagataccactcgctggccggaacgctcaagtcgcttccggactgtctagagatcactgctcgcacagacaacgggatcattatgggtgtgagacacaagaagtacaccatcgagggcgtccagtttcatccagagagcattctgaccgaggagggccatctgatgatccagaatatcctcaacgtttccggtggttactgggaggaaaatgccaacggcgcggctcagagaaaggaaagcatattggagaaaatatacgcgcagagacgaaaagactacgagtttgagatgaacagaccggggcgcagatttgctgatctagaactgtacttgtccatgggactgcaccgccgctaatcaatttttacgacagattggagcagaacatcagcgc
cggcaaggttgcaattctcagcgaaatcaagagagcgtcgccttctaaaggcgtcatcgacggagacgctaacgctgccaaacaggccctcaactacgccaaggctggagttgccacaatttctgttttgaccgagccaacctggtttaaaggaaatatccaggacctggaggtggccagaaaagccattgactctgtggccaatagaccgtgtattttgcggaaggagtttatcttcaacaagtaccaaattctagaggcccgactggcgggagcagacacggttctgctgattgtcaagatgctgagctcggatcccccacacaccatagcttcaaaatgtttctactccttttttactcttccagattttctcggactccgcgcatcgccgtaccacttcaaaacacccaagcacagcatactaaattttccctctttcttcctctagggtgtcgttaattacccgtactaaaggtttggaaaagaaaaaagagaccgcctcgtttctttttcttcgtcgaaaaaggcaataaaaatttttatcacgtttctttttcttgaaatttttttttttagtttttttctctttcagtgacctccattgatatttaagttaataaacggtcttcaatttctcaagtttcagtttcatttttcttgttctattacaactttttttacttcttgttcattagaaagaaagcatagcaatctaatctaaggggcggtgttgacaattaatcatcggcatagtatatcggcatagtataatacgacaaggtgaggaactaaaccatggccaagttgaccagtgccgttccggtgctcaccgcgcgcgacgtcgccggagcggtcgagttctggaccgaccggctcgggttctcccgggacttcgtggaggacgacttcgccggtgtggtccgggacgacgtgaccctgttcatcagcgcggtccaggaccaggtggtgccggacaacaccctggcctgggtgtgggtgcgcggcctggacgagctgtacgccgagtggtcggaggtcgtgtccacgaacttccgggacgcctccgggccggccatgaccgagatcggcgagcagccgtgggggcgggagttcgccctgcgcgacccggccggcaactgcgtgcacttcgtggccgaggagcaggactgacacgtccgacggcggcccacgggtcccaggcctcggagatccgtcccccttttcctttgtcgatatcatgtaattagttatgtcacgcttacattcacgccctccccccacatccgctctaaccgaaaaggaaggagttagacaacctgaagtctaggtccctatttatttttttatagttatgttagtattaagaacgttatttatatttcaaatttttcttttttttctgtacagacgcgtgtacgcatgtaacattatactgaaaaccttgcttgagaaggttttgggacgctcgaaggctttaatttgcaagctggagaccaacatgtgagcaaaaggccagcaaaaggccaggaaccgtaaaaaggccgcgttgctggcgtttttccataggctccgcccccctgacgagcatcacaaaaatcgacgctcaagtcagaggtggcgaaacccgacaggactataaagataccaggcgtttccccctggaagctccctcgtgcgctctcctgttccgaccctgccgcttaccggatacctgtccgcctttctcccttcgggaagcgtggcgctttctcaatgctcacgctgtaggtatctcagttcggtgtaggtcgttcgctccaagctgggctgtgtgcacgaaccccccgttcagcccgaccgctgcgccttatccggtaactatcgtcttgagtccaacccggtaagacacgacttatcgccactggcagcagccactggtaacaggattagcagagcgaggtatgtaggcggtgctacagagttcttgaagtggtggcctaactacggctacactagaa
ggacagtatttggtatctgcgctctgctgaagccagttaccttcggaaaaagagttggtagctcttgatccggcaaacaaaccaccgctggtagcggtggtttttttgtttgcaagcagcagattacgcgcagaaaaaaaggatctcaagaagatcctttgatcttttctacggggtctgacgctcagtggaacgaaaactcacgttaagggattttggtcatgagatc
35L1-2-pMTZ (SEQ ID NO:8):
agatctgtcgacgcggagaacgatctcctcgagctgctcgcggatcagcttgtggcccggtaatggaaccaggccgacgcgacgctccttgcggaccacggtggctggcgagcccagtttgtgaacgaggtcgtttagaacgtcctccgcaaagtccagtgtcagatgaatgtcctcctcggaccaattcagcatgttctcgagcagccatctgtctttggagtagaagcgtaatctctgctcctcgttactgtaccggaagaggtagtttgcctcgccgcccataatgaacaggttctctttctggtggcctgtgagcagcggggacgtctggacggcgtcgatgaggcccttgaggcgctcgtagtacttgttccgtcgctgtagccggccgcggtgacgatacccacatagaggtccttggccattagtttgatgaggtggggcaggatgggcgactcggcatcgaaatttttgccgtcgtcgtacagtgtgatgtcaccatcgaatgtaatgagctgcagcttgcgatctcggatggttttggaatggaagaaccgcgacatctccaacagctgggccgtgttgagaatgagccggacgtcgttgaacgagggggccacaagccggcgtttgctgatggcgcggcgctcgtcctcgatgtacaaggccttttccagaggcagtctcgtgaagaagctgccaacgctcggaaccagctgcacgagccgagacaattcgggggtgccggctttggtcatttcaatcttgtcgtcgatgaggagttcgaggtcgtggaagatttccgcgtagcggcgttttgcctcagagtttaccatgaggtcgtccactgcagagatgccgttgctcttcaccgcgtacaggaccaacggcgtcgccagcaggcccttgatccattctatgaggccatctcgacggtgttccttgagtgcgtactccactctgtagcgactggacatctcgagactgggcttgctgtgctcgatgcaccaattaattgttgccgcatgcatccttgcaccgcaagtttttaaaacccactcgctttagccgtcgcgtaaaacttgtgaatctggcaactgagggggttctgcagccgcaaccgaacttttcgcttcgaggacgcagctgcatggtgtcatgtgaggctctgtttgctggcgtagcctacaacgtgaccttgcctaaccggacggcgctacccactgctgtctgtgcctgctaccagaaaatcaccagagcagcagaggcccgatgtggcaactggtggggtgtcggacaggctgtttctccacagtgcaaatgcgggtgaaccggccagaaagtaaattcttatgctaccgtgcagcgactccgacatccccagtttttgccctacttgatcacagatggggtcagcgctgccgctaagtgtacccaaccgtgcccacacggtccatctataaatactgctgccagtgcacggtggtgacatcaatctaaagtacaaaaacaaattcgaaacgatgtccttgtggagatctaacgaggctaccgtttacctcccacctgtctctgtttccaaggtcgtttcgactgacgaatacgtgaccagaacgaacatctactaccacgccggatcttcgagactgcttgccgtcggacacccatattacgctatcaagaagcaggactccaacaagatcgctgttcctaaggtctcgggtctccagtacagagttttcagagtgaagttgcctgacccaaacaagttcggcttccctgacacgtcgttctacgacccagcctctcaaagactggtctgggcctgtaccggtgttgaggtcggaagaggccaacctctgggtgtgggaatttccggtcacccactcttgaacaagctggatgacactgagaactcgaataagtacgtcggaaactccggcacagacaacagagaatgtatttctatggactacaagcagacgcaactgtgcctt
atcggctgtagacctccaatcggagagcactggggcaagggtactccatgcaacgctaaccaggttaaggcaggtgagtgcccacctctggagttgctcaacaccgtgcttcaggacggagatatggttgacaccggtttcggcgcaatggactttactacgctccaggctaacaagtcggacgttcctttggatatttgctcctctatctgtaagtacccagactacttgaagatggtttctgagccttacggcgacatgctgttcttctacctcagacgcgagcagatgttcgttagacacctgtttaacagagccggtactgtgggcgagaccgttccagccgacttgtacattaagggaacgaccggcacactgccttccacctcttacttcccaaccccttcgggatctatggttacttctgacgctcaaatcttcaacaagccttactggctgcaaagagcacagggtcacaacaacggtatctgctggtcgaaccagttgttcgtcactgttgtggacacgaccagatctaccaacatgtccgtttgctctgcagtttccagctctgactccacttacaagaacgacaacttcaaggaatacttgagacacggcgaggaatacgacctccagttcatcttccagctgtgcaagattaccttgaccgccgatgtgatgacttacatccactccatgaacccatccatcctcgaagactggaacttcggactgacccctccaccttctggtactctggaggacacctatagatacgttacctctcaggccgtgacttgccagaagccatccgcacctaagccaaaggatgaccctttgaagaactacacgttttgggaggtcgacttgaaggagaagttctctgccgacttggatcagttccctctgggtagaaagttcctgcttcaggccggcttgaaggctagaccaaacttcagactgggcaagagagcagccccagcttccacttccaagaagtcctcgaccaagagaagaaaggtcaagtcctaataggtaccggagacgtggaaggacataccgcttttgagaagcgtgtttgaaaatagttctttttctggtttatatcgtttatgaagtgatgagatgaaaagctgaaatagcgagtataggaaaatttaatgaaaattaaattaaatattttcttaggctattagtcaccttcaaaatgccggccgcttctaagaacgttgtcatgatcgacaactacgactcgtttacctggaacctgtacgagtacctgtgtcaggagggagccaatgtcgaggttttcaggaacgatcagatcaccattccggagattgagcagctcaagccggacgttgtggtgatatcccctggtcctggccatccaagaacagactcgggaatatctcgcgacgtgatcagccattttaaaggcaagattcctgtctttggtgtctgtatgggccagcagtgtatcttcgaggagtttggcggagacgtcgagtatgcgggcgagattgtccatggaaaaacgtccactgttaagcacgacaacaagggaatgttcaaaaacgttccgcaagatgttgctgtcaccagataccactcgctggccggaacgctcaagtcgcttccggactgtctagagatcactgctcgcacagacaacgggatcattatgggtgtgagacacaagaagtacaccatcgagggcgtccagtttcatccagagagcattctgaccgaggagggccatctgatgatccagaatatcctcaacgtttccggtggttactgggaggaaaatgccaacggcgcggctcagagaaaggaaagcatattggagaaaatatacgcgcagagacgaaaagactacgagtttgagatgaacagaccggggcgcagatttgctgatctagaactgtacttgtccatgggactgcaccgccgctaatcaatttttacgacagattggagcagaacatcagcgc
cggcaaggttgcaattctcagcgaaatcaagagagcgtcgccttctaaaggcgtcatcgacggagacgctaacgctgccaaacaggccctcaactacgccaaggctggagttgccacaatttctgttttgaccgagccaacctggtttaaaggaaatatccaggacctggaggtggccagaaaagccattgactctgtggccaatagaccgtgtattttgcggaaggagtttatcttcaacaagtaccaaattctagaggcccgactggcgggagcagacacggttctgctgattgtcaagatgctgagctcggatcccccacacaccatagcttcaaaatgtttctactccttttttactcttccagattttctcggactccgcgcatcgccgtaccacttcaaaacacccaagcacagcatactaaattttccctctttcttcctctagggtgtcgttaattacccgtactaaaggtttggaaaagaaaaaagagaccgcctcgtttctttttcttcgtcgaaaaaggcaataaaaatttttatcacgtttctttttcttgaaatttttttttttagtttttttctctttcagtgacctccattgatatttaagttaataaacggtcttcaatttctcaagtttcagtttcatttttcttgttctattacaactttttttacttcttgttcattagaaagaaagcatagcaatctaatctaaggggcggtgttgacaattaatcatcggcatagtatatcggcatagtataatacgacaaggtgaggaactaaaccatggccaagttgaccagtgccgttccggtgctcaccgcgcgcgacgtcgccggagcggtcgagttctggaccgaccggctcgggttctcccgggacttcgtggaggacgacttcgccggtgtggtccgggacgacgtgaccctgttcatcagcgcggtccaggaccaggtggtgccggacaacaccctggcctgggtgtgggtgcgcggcctggacgagctgtacgccgagtggtcggaggtcgtgtccacgaacttccgggacgcctccgggccggccatgaccgagatcggcgagcagccgtgggggcgggagttcgccctgcgcgacccggccggcaactgcgtgcacttcgtggccgaggagcaggactgacacgtccgacggcggcccacgggtcccaggcctcggagatccgtcccccttttcctttgtcgatatcatgtaattagttatgtcacgcttacattcacgccctccccccacatccgctctaaccgaaaaggaaggagttagacaacctgaagtctaggtccctatttatttttttatagttatgttagtattaagaacgttatttatatttcaaatttttcttttttttctgtacagacgcgtgtacgcatgtaacattatactgaaaaccttgcttgagaaggttttgggacgctcgaaggctttaatttgcaagctggagaccaacatgtgagcaaaaggccagcaaaaggccaggaaccgtaaaaaggccgcgttgctggcgtttttccataggctccgcccccctgacgagcatcacaaaaatcgacgctcaagtcagaggtggcgaaacccgacaggactataaagataccaggcgtttccccctggaagctccctcgtgcgctctcctgttccgaccctgccgcttaccggatacctgtccgcctttctcccttcgggaagcgtggcgctttctcaatgctcacgctgtaggtatctcagttcggtgtaggtcgttcgctccaagctgggctgtgtgcacgaaccccccgttcagcccgaccgctgcgccttatccggtaactatcgtcttgagtccaacccggtaagacacgacttatcgccactggcagcagccactggtaacaggattagcagagcgaggtatgtaggcggtgctacagagttcttgaagtggtggcctaactacggctacactagaa
ggacagtatttggtatctgcgctctgctgaagccagttaccttcggaaaaagagttggtagctcttgatccggcaaacaaaccaccgctggtagcggtggtttttttgtttgcaagcagcagattacgcgcagaaaaaaaggatctcaagaagatcctttgatcttttctacggggtctgacgctcagtggaacgaaaactcacgttaagggattttggtcatgagatc
35L1-3-pMTZ (SEQ ID NO:9):
agatctgtcgacgcggagaacgatctcctcgagctgctcgcggatcagcttgtggcccggtaatggaaccaggccgacgcgacgctccttgcggaccacggtggctggcgagcccagtttgtgaacgaggtcgtttagaacgtcctccgcaaagtccagtgtcagatgaatgtcctcctcggaccaattcagcatgttctcgagcagccatctgtctttggagtagaagcgtaatctctgctcctcgttactgtaccggaagaggtagtttgcctcgccgcccataatgaacaggttctctttctggtggcctgtgagcagcggggacgtctggacggcgtcgatgaggcccttgaggcgctcgtagtacttgttccgtcgctgtagccggccgcggtgacgatacccacatagaggtccttggccattagtttgatgaggtggggcaggatgggcgactcggcatcgaaatttttgccgtcgtcgtacagtgtgatgtcaccatcgaatgtaatgagctgcagcttgcgatctcggatggttttggaatggaagaaccgcgacatctccaacagctgggccgtgttgagaatgagccggacgtcgttgaacgagggggccacaagccggcgtttgctgatggcgcggcgctcgtcctcgatgtacaaggccttttccagaggcagtctcgtgaagaagctgccaacgctcggaaccagctgcacgagccgagacaattcgggggtgccggctttggtcatttcaatcttgtcgtcgatgaggagttcgaggtcgtggaagatttccgcgtagcggcgttttgcctcagagtttaccatgaggtcgtccactgcagagatgccgttgctcttcaccgcgtacaggaccaacggcgtcgccagcaggcccttgatccattctatgaggccatctcgacggtgttccttgagtgcgtactccactctgtagcgactggacatctcgagactgggcttgctgtgctcgatgcaccaattaattgttgccgcatgcatccttgcaccgcaagtttttaaaacccactcgctttagccgtcgcgtaaaacttgtgaatctggcaactgagggggttctgcagccgcaaccgaacttttcgcttcgaggacgcagctgcatggtgtcatgtgaggctctgtttgctggcgtagcctacaacgtgaccttgcctaaccggacggcgctacccactgctgtctgtgcctgctaccagaaaatcaccagagcagcagaggcccgatgtggcaactggtggggtgtcggacaggctgtttctccacagtgcaaatgcgggtgaaccggccagaaagtaaattcttatgctaccgtgcagcgactccgacatccccagtttttgccctacttgatcacagatggggtcagcgctgccgctaagtgtacccaaccgtgcccacacggtccatctataaatactgctgccagtgcacggtggtgacatcaatctaaagtacaaaaacaaattcgaaacgatgagcctgtggagaagcaacgaggccaccgtgtacctgcccccggtgagcgtgagcaaggtggtgagcaccgacgagtacgtgacaagaaccaacatctactaccacgccggcagcagcagactgctggccgtgggccacccctactacgccatcaagaagcaagacagcaacaagatcgccgtgcccaaggtgagcggcctgcagtacagagtgttcagagtgaagctgcccgaccccaacaagttcggcttccccgacactagcttctatgacccggctagccaaagactggtgtgggcgtgtacgggcgtggaggtaggtagagggcagccactgggcgtgggcatcagcggccaccccctgctgaacaagctggacgacaccgagaacagcaacaagtacgtgggcaacagcggcaccgacaacagagagtgcatcagcatggactacaagcagacacagctgtgcctg
atcggctgcagaccccccatcggcgagcactggggcaaaggcaccccgtgtaacgctaaccaagtcaaggcgggagagtgcccccccctggagctgctgaacaccgtgctgcaagacggcgacatggtggacaccggcttcggcgccatggacttcaccaccctgcaagccaacaagagcgacgtgcccctggacatctgcagcagcatctgcaagtaccccgactacctgaagatggtgagcgagccctacggcgacatgctgttcttctacctgagaagagagcagatgttcgtgagacacctgttcaacagagccggcaccgtgggcgagaccgtgcccgccgacctgtacatcaagggcaccaccggcaccctgcctagcacaagctacttccccacgcctagcggcagcatggtgacaagcgacgctcagatcttcaacaagccctactggctgcagagagcccaaggccacaacaacggcatctgctggagcaatcagctgttcgtgaccgtggtggacaccacaagaagcaccaacatgagcgtgtgcagcgccgtgagcagcagcgacagcacctacaagaacgacaacttcaaggagtacctgagacacggcgaggagtacgacctgcagttcatctttcagctgtgcaagatcaccctgaccgccgacgtgatgacctacatccacagcatgaaccctagcatcctggaggactggaacttcggcctgaccccccctcctagcggcaccctggaggacacctacagatacgtgacaagccaagccgtgacctgtcagaagcctagcgcccccaagcccaaggacgaccccctgaagaactacaccttctgggaggtggacctgaaggagaagttcagcgccgacctggatcagttccccctgggcagaaagttcctgctgcaagccggcctgaaggctagacccaacttcagactgggcaagagagccgcccccgctagcacaagcaagaagagcagcaccaagagaagaaaggtgaagagctaataggtaccggagacgtggaaggacataccgcttttgagaagcgtgtttgaaaatagttctttttctggtttatatcgtttatgaagtgatgagatgaaaagctgaaatagcgagtataggaaaatttaatgaaaattaaattaaatattttcttaggctattagtcaccttcaaaatgccggccgcttctaagaacgttgtcatgatcgacaactacgactcgtttacctggaacctgtacgagtacctgtgtcaggagggagccaatgtcgaggttttcaggaacgatcagatcaccattccggagattgagcagctcaagccggacgttgtggtgatatcccctggtcctggccatccaagaacagactcgggaatatctcgcgacgtgatcagccattttaaaggcaagattcctgtctttggtgtctgtatgggccagcagtgtatcttcgaggagtttggcggagacgtcgagtatgcgggcgagattgtccatggaaaaacgtccactgttaagcacgacaacaagggaatgttcaaaaacgttccgcaagatgttgctgtcaccagataccactcgctggccggaacgctcaagtcgcttccggactgtctagagatcactgctcgcacagacaacgggatcattatgggtgtgagacacaagaagtacaccatcgagggcgtccagtttcatccagagagcattctgaccgaggagggccatctgatgatccagaatatcctcaacgtttccggtggttactgggaggaaaatgccaacggcgcggctcagagaaaggaaagcatattggagaaaatatacgcgcagagacgaaaagactacgagtttgagatgaacagaccggggcgcagatttgctgatctagaactgtacttgtccatgggactgcaccgccgctaatcaatttttacgacagattggagcagaacatcagcgc
cggcaaggttgcaattctcagcgaaatcaagagagcgtcgccttctaaaggcgtcatcgacggagacgctaacgctgccaaacaggccctcaactacgccaaggctggagttgccacaatttctgttttgaccgagccaacctggtttaaaggaaatatccaggacctggaggtggccagaaaagccattgactctgtggccaatagaccgtgtattttgcggaaggagtttatcttcaacaagtaccaaattctagaggcccgactggcgggagcagacacggttctgctgattgtcaagatgctgagctcggatcccccacacaccatagcttcaaaatgtttctactccttttttactcttccagattttctcggactccgcgcatcgccgtaccacttcaaaacacccaagcacagcatactaaattttccctctttcttcctctagggtgtcgttaattacccgtactaaaggtttggaaaagaaaaaagagaccgcctcgtttctttttcttcgtcgaaaaaggcaataaaaatttttatcacgtttctttttcttgaaatttttttttttagtttttttctctttcagtgacctccattgatatttaagttaataaacggtcttcaatttctcaagtttcagtttcatttttcttgttctattacaactttttttacttcttgttcattagaaagaaagcatagcaatctaatctaaggggcggtgttgacaattaatcatcggcatagtatatcggcatagtataatacgacaaggtgaggaactaaaccatggccaagttgaccagtgccgttccggtgctcaccgcgcgcgacgtcgccggagcggtcgagttctggaccgaccggctcgggttctcccgggacttcgtggaggacgacttcgccggtgtggtccgggacgacgtgaccctgttcatcagcgcggtccaggaccaggtggtgccggacaacaccctggcctgggtgtgggtgcgcggcctggacgagctgtacgccgagtggtcggaggtcgtgtccacgaacttccgggacgcctccgggccggccatgaccgagatcggcgagcagccgtgggggcgggagttcgccctgcgcgacccggccggcaactgcgtgcacttcgtggccgaggagcaggactgacacgtccgacggcggcccacgggtcccaggcctcggagatccgtcccccttttcctttgtcgatatcatgtaattagttatgtcacgcttacattcacgccctccccccacatccgctctaaccgaaaaggaaggagttagacaacctgaagtctaggtccctatttatttttttatagttatgttagtattaagaacgttatttatatttcaaatttttcttttttttctgtacagacgcgtgtacgcatgtaacattatactgaaaaccttgcttgagaaggttttgggacgctcgaaggctttaatttgcaagctggagaccaacatgtgagcaaaaggccagcaaaaggccaggaaccgtaaaaaggccgcgttgctggcgtttttccataggctccgcccccctgacgagcatcacaaaaatcgacgctcaagtcagaggtggcgaaacccgacaggactataaagataccaggcgtttccccctggaagctccctcgtgcgctctcctgttccgaccctgccgcttaccggatacctgtccgcctttctcccttcgggaagcgtggcgctttctcaatgctcacgctgtaggtatctcagttcggtgtaggtcgttcgctccaagctgggctgtgtgcacgaaccccccgttcagcccgaccgctgcgccttatccggtaactatcgtcttgagtccaacccggtaagacacgacttatcgccactggcagcagccactggtaacaggattagcagagcgaggtatgtaggcggtgctacagagttcttgaagtggtggcctaactacggctacactagaa
ggacagtatttggtatctgcgctctgctgaagccagttaccttcggaaaaagagttggtagctcttgatccggcaaacaaaccaccgctggtagcggtggtttttttgtttgcaagcagcagattacgcgcagaaaaaaaggatctcaagaagatcctttgatcttttctacggggtctgacgctcagtggaacgaaaactcacgttaagggattttggtcatgagatc
35L1-4-pMTZ (SEQ ID NO:10):
agatctgtcgacgcggagaacgatctcctcgagctgctcgcggatcagcttgtggcccggtaatggaaccaggccgacgcgacgctccttgcggaccacggtggctggcgagcccagtttgtgaacgaggtcgtttagaacgtcctccgcaaagtccagtgtcagatgaatgtcctcctcggaccaattcagcatgttctcgagcagccatctgtctttggagtagaagcgtaatctctgctcctcgttactgtaccggaagaggtagtttgcctcgccgcccataatgaacaggttctctttctggtggcctgtgagcagcggggacgtctggacggcgtcgatgaggcccttgaggcgctcgtagtacttgttccgtcgctgtagccggccgcggtgacgatacccacatagaggtccttggccattagtttgatgaggtggggcaggatgggcgactcggcatcgaaatttttgccgtcgtcgtacagtgtgatgtcaccatcgaatgtaatgagctgcagcttgcgatctcggatggttttggaatggaagaaccgcgacatctccaacagctgggccgtgttgagaatgagccggacgtcgttgaacgagggggccacaagccggcgtttgctgatggcgcggcgctcgtcctcgatgtacaaggccttttccagaggcagtctcgtgaagaagctgccaacgctcggaaccagctgcacgagccgagacaattcgggggtgccggctttggtcatttcaatcttgtcgtcgatgaggagttcgaggtcgtggaagatttccgcgtagcggcgttttgcctcagagtttaccatgaggtcgtccactgcagagatgccgttgctcttcaccgcgtacaggaccaacggcgtcgccagcaggcccttgatccattctatgaggccatctcgacggtgttccttgagtgcgtactccactctgtagcgactggacatctcgagactgggcttgctgtgctcgatgcaccaattaattgttgccgcatgcatccttgcaccgcaagtttttaaaacccactcgctttagccgtcgcgtaaaacttgtgaatctggcaactgagggggttctgcagccgcaaccgaacttttcgcttcgaggacgcagctgcatggtgtcatgtgaggctctgtttgctggcgtagcctacaacgtgaccttgcctaaccggacggcgctacccactgctgtctgtgcctgctaccagaaaatcaccagagcagcagaggcccgatgtggcaactggtggggtgtcggacaggctgtttctccacagtgcaaatgcgggtgaaccggccagaaagtaaattcttatgctaccgtgcagcgactccgacatccccagtttttgccctacttgatcacagatggggtcagcgctgccgctaagtgtacccaaccgtgcccacacggtccatctataaatactgctgccagtgcacggtggtgacatcaatctaaagtacaaaaacaaattcgaaacgatgagcctgtggagaagcaatgaggccacagtgtacctgccccctgtgtctgtgagcaaggtggtgagcacagatgagtatgtgacaagaaccaacatctactaccatgctggcagcagcagactgctggctgtgggccacccctactatgccatcaagaagcaagacagcaacaagattgctgtgcccaaggtgtctggcctgcagtacagagtgttcagagtgaagctgcctgaccccaacaagtttggcttccctgacactagcttctatgaccctgctagccaaagactggtgtgggcctgtactggggtggaggtaggtagagggcagccactgggggtgggcatctctggccaccccctgctgaacaagctggatgacacagagaacagcaacaagtatgtgggcaactctggcacagacaacagagagtgcatcagcatggactacaagcagacacagctgtgcctg
attggctgcagaccccccattggggagcactggggcaaaggcaccccctgtaatgccaaccaagtcaaggctggagagtgcccccccctggagctgctgaacacagtgctgcaagatggggacatggtggacactggctttggggccatggacttcaccaccctgcaagccaacaagtctgatgtgcccctggacatctgcagcagcatctgcaagtaccctgactacctgaagatggtgtctgagccctatggggacatgctgttcttctacctgagaagagagcagatgtttgtgagacacctgttcaacagagctggcacagtgggggagacagtgcctgctgacctgtacatcaagggcaccactggcaccctgcctagcacaagctacttccccaccccctctggcagcatggtgacctctgatgctcagatcttcaacaagccctactggctgcagagagcccaaggccacaacaatggcatctgctggagcaatcagctgtttgtgacagtggtggacaccacaagaagcaccaacatgtctgtgtgctctgctgtgagcagctctgacagcacctacaagaatgacaacttcaaggagtacctgagacatggggaggagtatgacctgcagttcatctttcagctgtgcaagatcaccctgacagctgatgtgatgacctacatccacagcatgaaccctagcatcctggaggactggaactttggcctgaccccccctccctctggcaccctggaggacacctacagatatgtgacaagccaagctgtgacctgtcagaagccctctgcccccaagcccaaggatgaccccctgaagaactacaccttctgggaggtggacctgaaggagaagttctctgctgacctggatcagttccccctgggcagaaagttcctgctgcaagctggcctgaaggctagacccaacttcagactgggcaagagagctgcccctgctagcacaagcaagaagagcagcaccaagagaagaaaggtgaagagctaataggtaccggagacgtggaaggacataccgcttttgagaagcgtgtttgaaaatagttctttttctggtttatatcgtttatgaagtgatgagatgaaaagctgaaatagcgagtataggaaaatttaatgaaaattaaattaaatattttcttaggctattagtcaccttcaaaatgccggccgcttctaagaacgttgtcatgatcgacaactacgactcgtttacctggaacctgtacgagtacctgtgtcaggagggagccaatgtcgaggttttcaggaacgatcagatcaccattccggagattgagcagctcaagccggacgttgtggtgatatcccctggtcctggccatccaagaacagactcgggaatatctcgcgacgtgatcagccattttaaaggcaagattcctgtctttggtgtctgtatgggccagcagtgtatcttcgaggagtttggcggagacgtcgagtatgcgggcgagattgtccatggaaaaacgtccactgttaagcacgacaacaagggaatgttcaaaaacgttccgcaagatgttgctgtcaccagataccactcgctggccggaacgctcaagtcgcttccggactgtctagagatcactgctcgcacagacaacgggatcattatgggtgtgagacacaagaagtacaccatcgagggcgtccagtttcatccagagagcattctgaccgaggagggccatctgatgatccagaatatcctcaacgtttccggtggttactgggaggaaaatgccaacggcgcggctcagagaaaggaaagcatattggagaaaatatacgcgcagagacgaaaagactacgagtttgagatgaacagaccggggcgcagatttgctgatctagaactgtacttgtccatgggactgcaccgccgctaatcaatttttacgacagattggagcagaacatcagcgc
cggcaaggttgcaattctcagcgaaatcaagagagcgtcgccttctaaaggcgtcatcgacggagacgctaacgctgccaaacaggccctcaactacgccaaggctggagttgccacaatttctgttttgaccgagccaacctggtttaaaggaaatatccaggacctggaggtggccagaaaagccattgactctgtggccaatagaccgtgtattttgcggaaggagtttatcttcaacaagtaccaaattctagaggcccgactggcgggagcagacacggttctgctgattgtcaagatgctgagctcggatcccccacacaccatagcttcaaaatgtttctactccttttttactcttccagattttctcggactccgcgcatcgccgtaccacttcaaaacacccaagcacagcatactaaattttccctctttcttcctctagggtgtcgttaattacccgtactaaaggtttggaaaagaaaaaagagaccgcctcgtttctttttcttcgtcgaaaaaggcaataaaaatttttatcacgtttctttttcttgaaatttttttttttagtttttttctctttcagtgacctccattgatatttaagttaataaacggtcttcaatttctcaagtttcagtttcatttttcttgttctattacaactttttttacttcttgttcattagaaagaaagcatagcaatctaatctaaggggcggtgttgacaattaatcatcggcatagtatatcggcatagtataatacgacaaggtgaggaactaaaccatggccaagttgaccagtgccgttccggtgctcaccgcgcgcgacgtcgccggagcggtcgagttctggaccgaccggctcgggttctcccgggacttcgtggaggacgacttcgccggtgtggtccgggacgacgtgaccctgttcatcagcgcggtccaggaccaggtggtgccggacaacaccctggcctgggtgtgggtgcgcggcctggacgagctgtacgccgagtggtcggaggtcgtgtccacgaacttccgggacgcctccgggccggccatgaccgagatcggcgagcagccgtgggggcgggagttcgccctgcgcgacccggccggcaactgcgtgcacttcgtggccgaggagcaggactgacacgtccgacggcggcccacgggtcccaggcctcggagatccgtcccccttttcctttgtcgatatcatgtaattagttatgtcacgcttacattcacgccctccccccacatccgctctaaccgaaaaggaaggagttagacaacctgaagtctaggtccctatttatttttttatagttatgttagtattaagaacgttatttatatttcaaatttttcttttttttctgtacagacgcgtgtacgcatgtaacattatactgaaaaccttgcttgagaaggttttgggacgctcgaaggctttaatttgcaagctggagaccaacatgtgagcaaaaggccagcaaaaggccaggaaccgtaaaaaggccgcgttgctggcgtttttccataggctccgcccccctgacgagcatcacaaaaatcgacgctcaagtcagaggtggcgaaacccgacaggactataaagataccaggcgtttccccctggaagctccctcgtgcgctctcctgttccgaccctgccgcttaccggatacctgtccgcctttctcccttcgggaagcgtggcgctttctcaatgctcacgctgtaggtatctcagttcggtgtaggtcgttcgctccaagctgggctgtgtgcacgaaccccccgttcagcccgaccgctgcgccttatccggtaactatcgtcttgagtccaacccggtaagacacgacttatcgccactggcagcagccactggtaacaggattagcagagcgaggtatgtaggcggtgctacagagttcttgaagtggtggcctaactacggctacactagaa
ggacagtatttggtatctgcgctctgctgaagccagttaccttcggaaaaagagttggtagctcttgatccggcaaacaaaccaccgctggtagcggtggtttttttgtttgcaagcagcagattacgcgcagaaaaaaaggatctcaagaagatcctttgatcttttctacggggtctgacgctcagtggaacgaaaactcacgttaagggattttggtcatgagatc
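4. Construction of the HPV 35L1 protein recombinant expression strains
The Hansenula polymorpha host strain used in the present invention is derived from the wild-type Hansenula polymorpha strain CBS4732 (ATCC 34438), purchased from the American Type Culture Collection (ATCC). The recombinant expression plasmids 35L1-1-pMTZ, 35L1-2-pMTZ, 35L1-3-pMTZ and 35L1-4-pMTZ were each linearized with ScaI and electroporated into Hansenula polymorpha at 1500 V, 120 Ω and 50 μF. After electroporation, the cell suspension was spread on YPD plates (200 μg/mL Zeocin) and incubated inverted at 37 °C for 1 to 2 days.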
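As a quick sanity check on the pulse settings, the resistance and capacitance give a nominal RC time constant of 120 Ω × 50 μF = 6 ms. The sketch below works through that arithmetic. It assumes the 120 Ω setting dominates the circuit resistance, so the real pulse would be somewhat shorter. The cuvette gap is not stated in the text, so the 0.2 cm value used for the field strength is purely hypothetical.

```python
def time_constant_ms(resistance_ohm, capacitance_uf):
    """Nominal RC time constant in milliseconds (tau = R * C)."""
    return resistance_ohm * capacitance_uf * 1e-6 * 1e3


def field_strength_kv_per_cm(voltage_v, gap_cm):
    """Nominal field strength for a cuvette with the given electrode gap."""
    return voltage_v / gap_cm / 1000.0


tau = time_constant_ms(120, 50)
print(f"nominal time constant: {tau:.1f} ms")

# The cuvette gap is NOT given in the text; 0.2 cm is a hypothetical value.
field = field_strength_kv_per_cm(1500, 0.2)
print(f"field strength at an assumed 0.2 cm gap: {field:.1f} kV/cm")
```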
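Example 2 Expression screening of HPV 35L1 recombinant engineered strains
1. Expression screening in glass test tubes
Six recombinant Hansenula polymorpha single colonies were picked at random from each of the YPD plates of the 35L1-1-pMTZ, 35L1-2-pMTZ, 35L1-3-pMTZ and 35L1-4-pMTZ electrotransformations, inoculated into YPD liquid medium and cultured overnight at 37 °C. A portion of each culture was centrifuged, the YPD medium was discarded, the induction medium BMMY was added, and the cells were induced at 37 °C for 48 hours and then harvested. The cells were disrupted by vigorous shaking with acid-washed glass beads, and the lysate supernatant was collected after centrifugation. Expression of the HPV 35L1 protein in the lysate supernatant was quantified by enzyme-linked immunosorbent assay (ELISA). As shown in Fig. 6, the recombinant engineered strains containing the different HPV 35L1 coding sequences all showed clear expression, but the expression level differed among the coding sequences. Strains containing the 35L1-1 or 35L1-2 coding sequence expressed significantly more protein than strains containing the 35L1-3 or 35L1-4 coding sequence, and strains containing 35L1-1 expressed significantly more than strains containing 35L1-2; the differences were statistically significant (in Fig. 6, * denotes p<0.05 and ** denotes p<0.01).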
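The text reports significance levels but does not name the statistical test used. As one possible illustration only, the sketch below runs an exact two-sided permutation test on the difference of group means for two sets of six clones. The test is a stand-in chosen here, not the patent's method, and the ELISA readings are hypothetical placeholders rather than data from Fig. 6.

```python
from itertools import combinations


def permutation_p_value(a, b):
    """Exact two-sided permutation test on the difference of group means."""
    pooled = list(a) + list(b)
    n_a, n_b = len(a), len(b)
    total = sum(pooled)
    observed = abs(sum(a) / n_a - sum(b) / n_b)
    extreme = 0
    count = 0
    # Enumerate every way of relabelling n_a of the pooled values as group a.
    for idx in combinations(range(len(pooled)), n_a):
        sum_a = sum(pooled[i] for i in idx)
        diff = abs(sum_a / n_a - (total - sum_a) / n_b)
        if diff >= observed - 1e-12:  # tolerance for floating-point ties
            extreme += 1
        count += 1
    return extreme / count


# Hypothetical ELISA readings for six clones per construct (arbitrary units).
clones_construct_a = [10, 11, 12, 13, 14, 15]
clones_construct_b = [1, 2, 3, 4, 5, 6]
p = permutation_p_value(clones_construct_a, clones_construct_b)
print(f"p = {p:.4f}")
```

With six clones per group there are only 924 relabellings, so the exact enumeration is cheap and avoids any distributional assumption.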
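2. Expression screening in a fermenter
To further compare expression from the 35L1-1 and 35L1-2 coding sequences, one strain was selected from the engineered strains containing each coding sequence for expression verification in a fermenter, and the 35L1 protein expression of the two strains was compared.
Main fermentation parameters: 30 L fermentation volume; cell culture temperature 37°C; culture pH 5.00, with 3-fold glycerol propagation; induction pH 6.50; induction for 30 hours.
Cell disruption parameters: wet cells harvested from the fermenter were mixed with disruption buffer (containing 0.4 mol/L sodium chloride and 0.1 mol/L MOPS) at a ratio of 1:4. The cells were resuspended and stirred until homogeneous, the suspension was filtered through a mesh screen, the filtered suspension was cooled to 4°C in an ice bath, and the chilled suspension was disrupted 5 times at a pressure of 1500 bar. The lysate was centrifuged at 4°C and 8500 for 20 min, and the supernatant was collected for antigen content determination. As shown in Table 1, the strain containing 35L1-1 expressed markedly more antigen than the strain containing 35L1-2.
Table 1 Antigen content of 35L1 protein in the lysate supernatants of different strains, determined by ELISA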
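Example 3 Fermentation process for the recombinant Hansenula polymorpha strain expressing HPV 35L1
Seed culture preparation: the strain containing 35L1-1 from Example 2 was inoculated, in a clean bench under aseptic conditions, into sterilized YPG medium in a 1000 mL shake flask. The flask was placed in a constant-temperature shaker and cultured at 37°C and 190 rpm for 24 h. When the OD600 of the seed culture reached 2.0, shake-flask culture was stopped. After passing quality testing, the culture could be stored at 4°C for use as the fermentation seed culture.
Fermenter fermentation: 20 L of basal medium was prepared according to the BSM1 formulation (BSM1 medium formulation: 85% phosphoric acid 26.7 mL/L, calcium sulfate dihydrate 0.93 g/L, potassium sulfate 18.2 g/L, magnesium sulfate dihydrate 14.9 g/L, potassium hydroxide 4.13 g/L, glycerol 40 g/L, PTM1 4 mL/L) and sterilized at 121°C for 30 min before use. The qualified seed culture was inoculated at a ratio of 5% into a 30 L fermenter under flame protection. During fermentation, the pH was controlled at 5.0, the temperature at 37°C, the agitation speed at ≤950 rpm, the air flow at ≤2.0 VVM, the tank pressure at ≤0.10 MPa, and the dissolved oxygen above 10%. When the glycerol in the basal medium was exhausted and the wet cell weight was about 100 g/L, glycerol feeding was started at a feed rate of 200 to 600 g/h. When the wet cell weight exceeded 200 g/L, methanol feeding was started to enter the methanol induction phase, and the methanol feed rate was adjusted stepwise as the cells utilized methanol faster. Dissolved oxygen was kept above 20% during induction, and fermentation was ended after 30 h of induction. The cells were collected by high-speed centrifugation and stored at -20°C until purification. Fermentation supernatants sampled at different times were analyzed by SDS-PAGE (Figure 7) and Western blot (Figure 8). The results show that HPV 35L1 protein expression increased continuously with induction time, and that the fermentation expression level meets the requirements of large-scale production.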
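The per-litre BSM1 amounts above can be scaled to the 20 L batch with simple arithmetic. The following is a minimal Python sketch of that scaling; the function names are illustrative, and taking the 5% inoculation ratio relative to the 20 L medium volume is an assumption, since the text does not state the basis.

```python
# Per-litre BSM1 amounts as listed in the text: (amount, unit).
BSM1_PER_LITRE = {
    "85% phosphoric acid": (26.7, "mL"),
    "calcium sulfate dihydrate": (0.93, "g"),
    "potassium sulfate": (18.2, "g"),
    "magnesium sulfate dihydrate": (14.9, "g"),
    "potassium hydroxide": (4.13, "g"),
    "glycerol": (40.0, "g"),
    "PTM1": (4.0, "mL"),
}


def scale_recipe(per_litre, volume_l):
    """Return the total amount of each component for volume_l litres."""
    return {
        name: (round(amount * volume_l, 2), unit)
        for name, (amount, unit) in per_litre.items()
    }


def inoculum_volume_l(medium_volume_l, fraction=0.05):
    """Seed culture volume, ASSUMING the ratio refers to the medium volume."""
    return medium_volume_l * fraction


batch = scale_recipe(BSM1_PER_LITRE, 20)
for name, (amount, unit) in batch.items():
    print(f"{name}: {amount} {unit}")
print(f"seed culture: {inoculum_volume_l(20):.2f} L")
```

Under that assumption the seed volume comes to 1 L, which is consistent with the 1000 mL shake flask used for seed preparation.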
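Example 4 Purification process for recombinant HPV 35L1 protein
Cell disruption: wet HPV 35L1 cells harvested from the fermenter and stored at -20°C were mixed with disruption buffer (containing 0.4 mol/L sodium chloride and 0.1 mol/L MOPS) at a ratio of 1:4. The cells were resuspended and stirred until homogeneous, the suspension was filtered through a mesh screen, the filtered suspension was cooled to 4°C in an ice bath, and the chilled suspension was disrupted 5 times at a pressure of 1500 bar. Microscopic examination showed a cell disruption rate of ≥80%. The lysate was centrifuged at 4°C and 8500 for 20 min, and the supernatant was collected.
Column chromatography: the clarified supernatant was loaded onto a POROS HS cation-exchange column for initial purification and eluted with 1.5 mol/L sodium chloride solution, and the initially purified eluate was collected. The initially purified protein solution was then loaded onto a CHT column for polishing and eluted with 200 mmol/L phosphate buffer, and the eluted HPV 35L1 protein was collected (as shown in Figure 9).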
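Example 5 Observation of recombinant HPV 35L1 protein by transmission electron microscopy
Purified HPV 35L1 protein was dropped onto a clean plastic plate to form a droplet. A copper grid was inserted into the middle of the droplet with tweezers so that both faces of the grid were immersed. After standing at room temperature for 20 minutes, the grid was removed with tweezers and the liquid was blotted from its edge with filter paper. The grid with the adsorbed sample was placed on the surface of the staining solution, stained at room temperature for 10 seconds, removed, blotted with filter paper to remove excess liquid, and air-dried. The morphology of the virus-like particles was observed with a transmission electron microscope (JEM-2100, JEOL Ltd.). The transmission electron micrograph of the HPV 35L1 protein is shown in Figure 10.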
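Example 6 Preparation of a vaccine containing HPV 35L1 protein
The HPV 35L1 protein bulk prepared according to Examples 1 to 4 was diluted to 250 μg/mL with bulk dilution buffer. 1 mL of the diluted protein solution was mixed with 250 μg/mL aluminum phosphate adjuvant and adsorbed for 1 to 3 h to obtain the HPV 35L1 protein vaccine, which was stored at 4°C protected from light.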
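Example 7 Immunogenicity of the HPV 35L1 protein vaccine
Mice were given different doses of the HPV 35L1 vaccine, the seroconversion rate of specific antibodies in serum was determined by enzyme-linked immunosorbent assay (ELISA), and the percentage of positive sera in each dose group was calculated. The ED50 (median effective dose) was then calculated with SPSS software to evaluate the immunogenicity of the vaccine.
1. Immunization of animals
Sixty female BALB/c mice aged 6 to 8 weeks were randomly divided into 6 groups of 10 mice each. An appropriate dose range was chosen according to the antigen content of the sample, and the sample was diluted with blank aluminum adjuvant diluent as shown in the table below. The sample must be mixed thoroughly both during dilution and when immunizing the animals. Each mouse received 0.5 mL by subcutaneous injection at five sites. One dose was given on day 0, blood was collected from the orbit 28 days later, and serum was separated for determination of the neutralizing antibody seroconversion rate.
The animal groups are shown in Table 2:
Table 2 Grouping for the mouse immunogenicity experiment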
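Group | Test article | Dose (μg/0.5 mL) | Immunization schedule | Mice |
1 | HPV35L1 vaccine | 0.04000 | One dose on day 0 | 10 |
2 | HPV35L1 vaccine | 0.01000 | One dose on day 0 | 10 |
3 | HPV35L1 vaccine | 0.00250 | One dose on day 0 | 10 |
4 | HPV35L1 vaccine | 0.00063 | One dose on day 0 | 10 |
5 | HPV35L1 vaccine | 0.00016 | One dose on day 0 | 10 |
6 | Normal saline | / | One dose on day 0 | 10 |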
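The five vaccine doses in Table 2 are consistent with a 4-fold serial dilution starting from 0.04 μg per 0.5 mL, with the two lowest values rounded to five decimal places. The short Python sketch below regenerates the series as a check; the function name is illustrative, and the 4-fold factor is inferred from the table values rather than stated in the text.

```python
def dilution_series(top_dose, fold, n):
    """Doses obtained by repeated fold-dilution of top_dose (n values in total)."""
    return [top_dose / fold ** k for k in range(n)]


# Doses as printed in Table 2 (μg per 0.5 mL), groups 1 to 5.
TABLE_2_DOSES = [0.04000, 0.01000, 0.00250, 0.00063, 0.00016]

series = dilution_series(0.04, 4, 5)
for computed, tabulated in zip(series, TABLE_2_DOSES):
    print(f"computed {computed:.8f}  tabulated {tabulated:.5f}")
```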
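2. Determination of the serum antibody seroconversion rate by ELISA
The procedure was as follows:
1) Coating: HPV 35L1 bulk was diluted to 5 μg/mL with phosphate buffer (0.01 mol/L, pH 7.4), added to the microplate at 100 μL/well, and left overnight at 4°C or incubated at 37°C for 2 hours.
2) Blocking: the plate was washed 6 times with 300 μL/well of wash buffer, 200 μL of blocking solution was added to each well, and the plate was blocked at 37°C for 2 hours.
3) Serum was diluted 1:1000 in PBST diluent containing 2.0% skim milk powder, added to the microplate at 100 μL/well in duplicate wells, and incubated at 37°C for 1 hour. Positive and blank controls were included.
4) Enzyme-labeled secondary antibody: the plate was washed 6 times with 300 μL/well of wash buffer, goat anti-mouse-HRP diluted 1:10000 in diluent was added at 100 μL/well, and the plate was incubated at 37°C for 1 hour.
5) Color development: the plate was washed 6 times with 300 μL/well of wash buffer, freshly prepared substrate solution was added at 100 μL/well, and color was developed at 37°C for 10 minutes.
6) Stopping and reading: stop solution was added at 50 μL/well, the plate was shaken briefly to mix, and the plate was read on a microplate reader at a measurement wavelength of 450 nm and a reference wavelength of 620 nm.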
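3. Calculation of the in vivo potency ED50
Calculated from the antibody seroconversion rates of the mouse sera at the different dose levels, the in vivo potency ED50 of the HPV 35L1 vaccine was 0.00034 μg, indicating that the HPV 35L1 vaccine has good immunogenicity.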
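The text computes the ED50 with SPSS and does not state the underlying model or the per-group seroconversion counts. As an illustration of how a median effective dose can be derived from quantal dose-response data, the sketch below uses the Spearman-Kärber estimator on log10 dose. This is a stand-in technique rather than the method used in the source, and the example counts are invented purely to exercise the function.

```python
import math


def spearman_karber_ed50(doses, responders, group_size):
    """Spearman-Karber ED50 estimate on the log10(dose) scale.

    doses: doses in ascending order.
    responders: number of seroconverted animals in each dose group.
    Requires a non-decreasing response running from 0% to 100%.
    """
    if len(doses) != len(responders) or list(doses) != sorted(doses):
        raise ValueError("doses must be ascending and match responders")
    p = [r / group_size for r in responders]
    if p[0] != 0.0 or p[-1] != 1.0 or any(b < a for a, b in zip(p, p[1:])):
        raise ValueError("responses must rise monotonically from 0% to 100%")
    x = [math.log10(d) for d in doses]
    # Mean of the tolerance distribution: each interval midpoint weighted
    # by the increase in response across that interval.
    mean_log = sum(
        (p[i + 1] - p[i]) * (x[i] + x[i + 1]) / 2 for i in range(len(x) - 1)
    )
    return 10 ** mean_log


# Hypothetical data (NOT from the source): three 10-fold spaced doses,
# 10 animals per group, with 0, 5 and 10 responders.
print(spearman_karber_ed50([1, 10, 100], [0, 5, 10], 10))
```

When the observed responses do not span 0% to 100%, or are not monotone, a probit or logistic fit is usually preferred over this estimator.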
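The above examples are intended to illustrate the embodiments disclosed by the present invention and are not to be construed as limiting the invention. In addition, the various modifications set out herein, and variations of the methods and compositions of the invention, will be apparent to those skilled in the art without departing from the scope and spirit of the invention. Although the invention has been specifically described in connection with various specific preferred embodiments, it should be understood that the invention is not limited to these specific embodiments. Indeed, the various modifications described above that are obvious to those skilled in the art for carrying out the invention are intended to be included within the scope of the invention.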
Sequence Listing
<110> 重庆博唯佰泰生物制药有限公司
<120> Polynucleotide for expressing HPV 35L1, and expression vector, host cell and use thereof
<160> 10
<170> SIPOSequenceListing 1.0
<210> 1
<211> 502
<212> PRT
<213> 人工序列(Artificial Sequence)
<400> 1
Met Ser Leu Trp Arg Ser Asn Glu Ala Thr Val Tyr Leu Pro Pro Val
1 5 10 15
Ser Val Ser Lys Val Val Ser Thr Asp Glu Tyr Val Thr Arg Thr Asn
20 25 30
Ile Tyr Tyr His Ala Gly Ser Ser Arg Leu Leu Ala Val Gly His Pro
35 40 45
Tyr Tyr Ala Ile Lys Lys Gln Asp Ser Asn Lys Ile Ala Val Pro Lys
50 55 60
Val Ser Gly Leu Gln Tyr Arg Val Phe Arg Val Lys Leu Pro Asp Pro
65 70 75 80
Asn Lys Phe Gly Phe Pro Asp Thr Ser Phe Tyr Asp Pro Ala Ser Gln
85 90 95
Arg Leu Val Trp Ala Cys Thr Gly Val Glu Val Gly Arg Gly Gln Pro
100 105 110
Leu Gly Val Gly Ile Ser Gly His Pro Leu Leu Asn Lys Leu Asp Asp
115 120 125
Thr Glu Asn Ser Asn Lys Tyr Val Gly Asn Ser Gly Thr Asp Asn Arg
130 135 140
Glu Cys Ile Ser Met Asp Tyr Lys Gln Thr Gln Leu Cys Leu Ile Gly
145 150 155 160
Cys Arg Pro Pro Ile Gly Glu His Trp Gly Lys Gly Thr Pro Cys Asn
165 170 175
Ala Asn Gln Val Lys Ala Gly Glu Cys Pro Pro Leu Glu Leu Leu Asn
180 185 190
Thr Val Leu Gln Asp Gly Asp Met Val Asp Thr Gly Phe Gly Ala Met
195 200 205
Asp Phe Thr Thr Leu Gln Ala Asn Lys Ser Asp Val Pro Leu Asp Ile
210 215 220
Cys Ser Ser Ile Cys Lys Tyr Pro Asp Tyr Leu Lys Met Val Ser Glu
225 230 235 240
Pro Tyr Gly Asp Met Leu Phe Phe Tyr Leu Arg Arg Glu Gln Met Phe
245 250 255
Val Arg His Leu Phe Asn Arg Ala Gly Thr Val Gly Glu Thr Val Pro
260 265 270
Ala Asp Leu Tyr Ile Lys Gly Thr Thr Gly Thr Leu Pro Ser Thr Ser
275 280 285
Tyr Phe Pro Thr Pro Ser Gly Ser Met Val Thr Ser Asp Ala Gln Ile
290 295 300
Phe Asn Lys Pro Tyr Trp Leu Gln Arg Ala Gln Gly His Asn Asn Gly
305 310 315 320
Ile Cys Trp Ser Asn Gln Leu Phe Val Thr Val Val Asp Thr Thr Arg
325 330 335
Ser Thr Asn Met Ser Val Cys Ser Ala Val Ser Ser Ser Asp Ser Thr
340 345 350
Tyr Lys Asn Asp Asn Phe Lys Glu Tyr Leu Arg His Gly Glu Glu Tyr
355 360 365
Asp Leu Gln Phe Ile Phe Gln Leu Cys Lys Ile Thr Leu Thr Ala Asp
370 375 380
Val Met Thr Tyr Ile His Ser Met Asn Pro Ser Ile Leu Glu Asp Trp
385 390 395 400
Asn Phe Gly Leu Thr Pro Pro Pro Ser Gly Thr Leu Glu Asp Thr Tyr
405 410 415
Arg Tyr Val Thr Ser Gln Ala Val Thr Cys Gln Lys Pro Ser Ala Pro
420 425 430
Lys Pro Lys Asp Asp Pro Leu Lys Asn Tyr Thr Phe Trp Glu Val Asp
435 440 445
Leu Lys Glu Lys Phe Ser Ala Asp Leu Asp Gln Phe Pro Leu Gly Arg
450 455 460
Lys Phe Leu Leu Gln Ala Gly Leu Lys Ala Arg Pro Asn Phe Arg Leu
465 470 475 480
Gly Lys Arg Ala Ala Pro Ala Ser Thr Ser Lys Lys Ser Ser Thr Lys
485 490 495
Arg Arg Lys Val Lys Ser
500
<210> 2
<211> 1512
<212> DNA
<213> 人工序列(Artificial Sequence)
<400> 2
atgtctctgt ggagatccaa cgaggccact gtctacctgc ctccagtttc ggtgtctaag 60
gttgtgtcca cggacgagta cgtcactaga accaacatct actaccacgc aggttcctct 120
agactcctgg ctgttggtca cccttactat gccattaaga agcaggactc gaacaagatc 180
gccgtcccaa aggtttctgg cttgcagtac agagtgttca gagttaagct gccagaccct 240
aacaagttcg gatttccaga cacctccttc tacgaccctg cttcccagag attggtttgg 300
gcatgcactg gagtcgaggt gggcagaggt cagccattgg gagttggtat ctctggccac 360
cctttgctga acaagctcga cgataccgag aactccaaca agtacgttgg caactctgga 420
accgacaaca gagagtgcat ctcgatggac tacaagcaga cccagttgtg tctcatcgga 480
tgcagaccac ctattggtga acattgggga aagggcaccc cttgcaacgc caaccaggtc 540
aaggccggag agtgtcctcc attggagctt ctgaacactg ttctccaaga tggtgacatg 600
gttgacaccg gctttggtgc tatggacttc acgaccttgc aggccaacaa gtccgacgtg 660
ccacttgaca tctgttcttc catttgcaag taccctgatt acctgaagat ggtttcggag 720
ccatacggag acatgctctt cttttacctg agaagagagc agatgttcgt gagacacttg 780
ttcaacagag caggaactgt tggtgaaacg gtccctgctg acctgtacat caagggcacc 840
actggtacgt tgccatctac ctcgtacttc cctactccat ctggttcgat ggtcacctcc 900
gatgcccaga tcttcaacaa gccatactgg ttgcagagag cccagggaca caacaatggc 960
atttgctggt ccaaccagct gttcgtgacc gtcgttgaca ctacgagatc caccaacatg 1020
tcggtgtgtt ctgcagtcag ctcttccgac tcgacctaca agaacgacaa cttcaaggag 1080
tacctcagac acggtgaaga gtacgacctg caattcatct tccagttgtg caagatcacc 1140
ctgactgctg acgttatgac gtacattcac tccatgaacc cttcgatcct ggaggactgg 1200
aacttcggtc ttactccacc tccatctggc accttggagg acacttacag atatgtcacc 1260
tcccaagctg ttacgtgtca gaagccttcg gccccaaagc ctaaggacga tccactgaag 1320
aactacacct tctgggaggt tgacctgaag gagaagttct ccgcagacct cgaccagttc 1380
ccattgggca gaaagttcct gctccaagct ggattgaagg ccagacctaa cttcagactt 1440
ggcaagagag ccgctccagc atctacctct aagaaatcgt ccacgaagcg cagaaaggtg 1500
aagtcgtaat ag 1512
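As a consistency check between SEQ ID NO: 1 and SEQ ID NO: 2, note that 502 codons plus the two stop codons (taa, tag) give 502 × 3 + 6 = 1512 nucleotides, matching the declared length. The minimal Python sketch below translates a listing fragment with the standard genetic code, so the result can be compared with the protein sequence. The helper names are illustrative, and the translator simply ignores the spaces and position numbers in the listing format.

```python
BASES = "tcag"
# Standard genetic code, codons ordered t, c, a, g at each position.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna):
    """Translate a DNA string in frame 1; '*' marks a stop codon.

    Characters other than a, c, g, t (spaces, digits) are ignored, so a
    line can be pasted directly from the sequence listing.
    """
    seq = "".join(ch for ch in dna.lower() if ch in BASES)
    usable = len(seq) - len(seq) % 3
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))


# First and last lines of SEQ ID NO: 2 as printed in the listing.
first_line = "atgtctctgt ggagatccaa cgaggccact gtctacctgc ctccagtttc ggtgtctaag 60"
last_line = "aagtcgtaat ag 1512"
print(translate(first_line))
print(translate(last_line))
```

The first line should correspond to the first 20 residues of SEQ ID NO: 1 (Met Ser Leu Trp Arg Ser Asn Glu Ala Thr Val Tyr Leu Pro Pro Val Ser Val Ser Lys) in one-letter code, and the last line to the C-terminal Lys Ser followed by the two stop codons.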
<210> 3
<211> 1512
<212> DNA
<213> 人工序列(Artificial Sequence)
<400> 3
atgtccttgt ggagatctaa cgaggctacc gtttacctcc cacctgtctc tgtttccaag 60
gtcgtttcga ctgacgaata cgtgaccaga acgaacatct actaccacgc cggatcttcg 120
agactgcttg ccgtcggaca cccatattac gctatcaaga agcaggactc caacaagatc 180
gctgttccta aggtctcggg tctccagtac agagttttca gagtgaagtt gcctgaccca 240
aacaagttcg gcttccctga cacgtcgttc tacgacccag cctctcaaag actggtctgg 300
gcctgtaccg gtgttgaggt cggaagaggc caacctctgg gtgtgggaat ttccggtcac 360
ccactcttga acaagctgga tgacactgag aactcgaata agtacgtcgg aaactccggc 420
acagacaaca gagaatgtat ttctatggac tacaagcaga cgcaactgtg ccttatcggc 480
tgtagacctc caatcggaga gcactggggc aagggtactc catgcaacgc taaccaggtt 540
aaggcaggtg agtgcccacc tctggagttg ctcaacaccg tgcttcagga cggagatatg 600
gttgacaccg gtttcggcgc aatggacttt actacgctcc aggctaacaa gtcggacgtt 660
cctttggata tttgctcctc tatctgtaag tacccagact acttgaagat ggtttctgag 720
ccttacggcg acatgctgtt cttctacctc agacgcgagc agatgttcgt tagacacctg 780
tttaacagag ccggtactgt gggcgagacc gttccagccg acttgtacat taagggaacg 840
accggcacac tgccttccac ctcttacttc ccaacccctt cgggatctat ggttacttct 900
gacgctcaaa tcttcaacaa gccttactgg ctgcaaagag cacagggtca caacaacggt 960
atctgctggt cgaaccagtt gttcgtcact gttgtggaca cgaccagatc taccaacatg 1020
tccgtttgct ctgcagtttc cagctctgac tccacttaca agaacgacaa cttcaaggaa 1080
tacttgagac acggcgagga atacgacctc cagttcatct tccagctgtg caagattacc 1140
ttgaccgccg atgtgatgac ttacatccac tccatgaacc catccatcct cgaagactgg 1200
aacttcggac tgacccctcc accttctggt actctggagg acacctatag atacgttacc 1260
tctcaggccg tgacttgcca gaagccatcc gcacctaagc caaaggatga ccctttgaag 1320
aactacacgt tttgggaggt cgacttgaag gagaagttct ctgccgactt ggatcagttc 1380
cctctgggta gaaagttcct gcttcaggcc ggcttgaagg ctagaccaaa cttcagactg 1440
ggcaagagag cagccccagc ttccacttcc aagaagtcct cgaccaagag aagaaaggtc 1500
aagtcctaat ag 1512
<210> 4
<211> 1512
<212> DNA
<213> 人工序列(Artificial Sequence)
<400> 4
atgagcctgt ggagaagcaa cgaggccacc gtgtacctgc ccccggtgag cgtgagcaag 60
gtggtgagca ccgacgagta cgtgacaaga accaacatct actaccacgc cggcagcagc 120
agactgctgg ccgtgggcca cccctactac gccatcaaga agcaagacag caacaagatc 180
gccgtgccca aggtgagcgg cctgcagtac agagtgttca gagtgaagct gcccgacccc 240
aacaagttcg gcttccccga cactagcttc tatgacccgg ctagccaaag actggtgtgg 300
gcgtgtacgg gcgtggaggt aggtagaggg cagccactgg gcgtgggcat cagcggccac 360
cccctgctga acaagctgga cgacaccgag aacagcaaca agtacgtggg caacagcggc 420
accgacaaca gagagtgcat cagcatggac tacaagcaga cacagctgtg cctgatcggc 480
tgcagacccc ccatcggcga gcactggggc aaaggcaccc cgtgtaacgc taaccaagtc 540
aaggcgggag agtgcccccc cctggagctg ctgaacaccg tgctgcaaga cggcgacatg 600
gtggacaccg gcttcggcgc catggacttc accaccctgc aagccaacaa gagcgacgtg 660
cccctggaca tctgcagcag catctgcaag taccccgact acctgaagat ggtgagcgag 720
ccctacggcg acatgctgtt cttctacctg agaagagagc agatgttcgt gagacacctg 780
ttcaacagag ccggcaccgt gggcgagacc gtgcccgccg acctgtacat caagggcacc 840
accggcaccc tgcctagcac aagctacttc cccacgccta gcggcagcat ggtgacaagc 900
gacgctcaga tcttcaacaa gccctactgg ctgcagagag cccaaggcca caacaacggc 960
atctgctgga gcaatcagct gttcgtgacc gtggtggaca ccacaagaag caccaacatg 1020
agcgtgtgca gcgccgtgag cagcagcgac agcacctaca agaacgacaa cttcaaggag 1080
tacctgagac acggcgagga gtacgacctg cagttcatct ttcagctgtg caagatcacc 1140
ctgaccgccg acgtgatgac ctacatccac agcatgaacc ctagcatcct ggaggactgg 1200
aacttcggcc tgaccccccc tcctagcggc accctggagg acacctacag atacgtgaca 1260
agccaagccg tgacctgtca gaagcctagc gcccccaagc ccaaggacga ccccctgaag 1320
aactacacct tctgggaggt ggacctgaag gagaagttca gcgccgacct ggatcagttc 1380
cccctgggca gaaagttcct gctgcaagcc ggcctgaagg ctagacccaa cttcagactg 1440
ggcaagagag ccgcccccgc tagcacaagc aagaagagca gcaccaagag aagaaaggtg 1500
aagagctaat ag 1512
<210> 5
<211> 1512
<212> DNA
<213> 人工序列(Artificial Sequence)
<400> 5
atgagcctgt ggagaagcaa tgaggccaca gtgtacctgc cccctgtgtc tgtgagcaag 60
gtggtgagca cagatgagta tgtgacaaga accaacatct actaccatgc tggcagcagc 120
agactgctgg ctgtgggcca cccctactat gccatcaaga agcaagacag caacaagatt 180
gctgtgccca aggtgtctgg cctgcagtac agagtgttca gagtgaagct gcctgacccc 240
aacaagtttg gcttccctga cactagcttc tatgaccctg ctagccaaag actggtgtgg 300
gcctgtactg gggtggaggt aggtagaggg cagccactgg gggtgggcat ctctggccac 360
cccctgctga acaagctgga tgacacagag aacagcaaca agtatgtggg caactctggc 420
acagacaaca gagagtgcat cagcatggac tacaagcaga cacagctgtg cctgattggc 480
tgcagacccc ccattgggga gcactggggc aaaggcaccc cctgtaatgc caaccaagtc 540
aaggctggag agtgcccccc cctggagctg ctgaacacag tgctgcaaga tggggacatg 600
gtggacactg gctttggggc catggacttc accaccctgc aagccaacaa gtctgatgtg 660
cccctggaca tctgcagcag catctgcaag taccctgact acctgaagat ggtgtctgag 720
ccctatgggg acatgctgtt cttctacctg agaagagagc agatgtttgt gagacacctg 780
ttcaacagag ctggcacagt gggggagaca gtgcctgctg acctgtacat caagggcacc 840
actggcaccc tgcctagcac aagctacttc cccaccccct ctggcagcat ggtgacctct 900
gatgctcaga tcttcaacaa gccctactgg ctgcagagag cccaaggcca caacaatggc 960
atctgctgga gcaatcagct gtttgtgaca gtggtggaca ccacaagaag caccaacatg 1020
tctgtgtgct ctgctgtgag cagctctgac agcacctaca agaatgacaa cttcaaggag 1080
tacctgagac atggggagga gtatgacctg cagttcatct ttcagctgtg caagatcacc 1140
ctgacagctg atgtgatgac ctacatccac agcatgaacc ctagcatcct ggaggactgg 1200
aactttggcc tgaccccccc tccctctggc accctggagg acacctacag atatgtgaca 1260
agccaagctg tgacctgtca gaagccctct gcccccaagc ccaaggatga ccccctgaag 1320
aactacacct tctgggaggt ggacctgaag gagaagttct ctgctgacct ggatcagttc 1380
cccctgggca gaaagttcct gctgcaagct ggcctgaagg ctagacccaa cttcagactg 1440
ggcaagagag ctgcccctgc tagcacaagc aagaagagca gcaccaagag aagaaaggtg 1500
aagagctaat ag 1512
<210> 6
<211> 4753
<212> DNA
<213> 人工序列(Artificial Sequence)
<400> 6
agatctgtcg acgcggagaa cgatctcctc gagctgctcg cggatcagct tgtggcccgg 60
taatggaacc aggccgacgc gacgctcctt gcggaccacg gtggctggcg agcccagttt 120
gtgaacgagg tcgtttagaa cgtcctccgc aaagtccagt gtcagatgaa tgtcctcctc 180
ggaccaattc agcatgttct cgagcagcca tctgtctttg gagtagaagc gtaatctctg 240
ctcctcgtta ctgtaccgga agaggtagtt tgcctcgccg cccataatga acaggttctc 300
tttctggtgg cctgtgagca gcggggacgt ctggacggcg tcgatgaggc ccttgaggcg 360
ctcgtagtac ttgttccgtc gctgtagccg gccgcggtga cgatacccac atagaggtcc 420
ttggccatta gtttgatgag gtggggcagg atgggcgact cggcatcgaa atttttgccg 480
tcgtcgtaca gtgtgatgtc accatcgaat gtaatgagct gcagcttgcg atctcggatg 540
gttttggaat ggaagaaccg cgacatctcc aacagctggg ccgtgttgag aatgagccgg 600
acgtcgttga acgagggggc cacaagccgg cgtttgctga tggcgcggcg ctcgtcctcg 660
atgtacaagg ccttttccag aggcagtctc gtgaagaagc tgccaacgct cggaaccagc 720
tgcacgagcc gagacaattc gggggtgccg gctttggtca tttcaatctt gtcgtcgatg 780
aggagttcga ggtcgtggaa gatttccgcg tagcggcgtt ttgcctcaga gtttaccatg 840
aggtcgtcca ctgcagagat gccgttgctc ttcaccgcgt acaggaccaa cggcgtcgcc 900
agcaggccct tgatccattc tatgaggcca tctcgacggt gttccttgag tgcgtactcc 960
actctgtagc gactggacat ctcgagactg ggcttgctgt gctcgatgca ccaattaatt 1020
gttgccgcat gcatccttgc accgcaagtt tttaaaaccc actcgcttta gccgtcgcgt 1080
aaaacttgtg aatctggcaa ctgagggggt tctgcagccg caaccgaact tttcgcttcg 1140
aggacgcagc tgcatggtgt catgtgaggc tctgtttgct ggcgtagcct acaacgtgac 1200
cttgcctaac cggacggcgc tacccactgc tgtctgtgcc tgctaccaga aaatcaccag 1260
agcagcagag gcccgatgtg gcaactggtg gggtgtcgga caggctgttt ctccacagtg 1320
caaatgcggg tgaaccggcc agaaagtaaa ttcttatgct accgtgcagc gactccgaca 1380
tccccagttt ttgccctact tgatcacaga tggggtcagc gctgccgcta agtgtaccca 1440
accgtgccca cacggtccat ctataaatac tgctgccagt gcacggtggt gacatcaatc 1500
taaagtacaa aaacaaattc gaaacgagga attcacgtgg cccagccggc cgtctcggat 1560
cggtaccgga gacgtggaag gacataccgc ttttgagaag cgtgtttgaa aatagttctt 1620
tttctggttt atatcgttta tgaagtgatg agatgaaaag ctgaaatagc gagtatagga 1680
aaatttaatg aaaattaaat taaatatttt cttaggctat tagtcacctt caaaatgccg 1740
gccgcttcta agaacgttgt catgatcgac aactacgact cgtttacctg gaacctgtac 1800
gagtacctgt gtcaggaggg agccaatgtc gaggttttca ggaacgatca gatcaccatt 1860
ccggagattg agcagctcaa gccggacgtt gtggtgatat cccctggtcc tggccatcca 1920
agaacagact cgggaatatc tcgcgacgtg atcagccatt ttaaaggcaa gattcctgtc 1980
tttggtgtct gtatgggcca gcagtgtatc ttcgaggagt ttggcggaga cgtcgagtat 2040
gcgggcgaga ttgtccatgg aaaaacgtcc actgttaagc acgacaacaa gggaatgttc 2100
aaaaacgttc cgcaagatgt tgctgtcacc agataccact cgctggccgg aacgctcaag 2160
tcgcttccgg actgtctaga gatcactgct cgcacagaca acgggatcat tatgggtgtg 2220
agacacaaga agtacaccat cgagggcgtc cagtttcatc cagagagcat tctgaccgag 2280
gagggccatc tgatgatcca gaatatcctc aacgtttccg gtggttactg ggaggaaaat 2340
gccaacggcg cggctcagag aaaggaaagc atattggaga aaatatacgc gcagagacga 2400
aaagactacg agtttgagat gaacagaccg gggcgcagat ttgctgatct agaactgtac 2460
ttgtccatgg gactgcaccg ccgctaatca atttttacga cagattggag cagaacatca 2520
gcgccggcaa ggttgcaatt ctcagcgaaa tcaagagagc gtcgccttct aaaggcgtca 2580
tcgacggaga cgctaacgct gccaaacagg ccctcaacta cgccaaggct ggagttgcca 2640
caatttctgt tttgaccgag ccaacctggt ttaaaggaaa tatccaggac ctggaggtgg 2700
ccagaaaagc cattgactct gtggccaata gaccgtgtat tttgcggaag gagtttatct 2760
tcaacaagta ccaaattcta gaggcccgac tggcgggagc agacacggtt ctgctgattg 2820
tcaagatgct gagctcggat cccccacaca ccatagcttc aaaatgtttc tactcctttt 2880
ttactcttcc agattttctc ggactccgcg catcgccgta ccacttcaaa acacccaagc 2940
acagcatact aaattttccc tctttcttcc tctagggtgt cgttaattac ccgtactaaa 3000
ggtttggaaa agaaaaaaga gaccgcctcg tttctttttc ttcgtcgaaa aaggcaataa 3060
aaatttttat cacgtttctt tttcttgaaa tttttttttt tagttttttt ctctttcagt 3120
gacctccatt gatatttaag ttaataaacg gtcttcaatt tctcaagttt cagtttcatt 3180
tttcttgttc tattacaact ttttttactt cttgttcatt agaaagaaag catagcaatc 3240
taatctaagg ggcggtgttg acaattaatc atcggcatag tatatcggca tagtataata 3300
cgacaaggtg aggaactaaa ccatggccaa gttgaccagt gccgttccgg tgctcaccgc 3360
gcgcgacgtc gccggagcgg tcgagttctg gaccgaccgg ctcgggttct cccgggactt 3420
cgtggaggac gacttcgccg gtgtggtccg ggacgacgtg accctgttca tcagcgcggt 3480
ccaggaccag gtggtgccgg acaacaccct ggcctgggtg tgggtgcgcg gcctggacga 3540
gctgtacgcc gagtggtcgg aggtcgtgtc cacgaacttc cgggacgcct ccgggccggc 3600
catgaccgag atcggcgagc agccgtgggg gcgggagttc gccctgcgcg acccggccgg 3660
caactgcgtg cacttcgtgg ccgaggagca ggactgacac gtccgacggc ggcccacggg 3720
tcccaggcct cggagatccg tccccctttt cctttgtcga tatcatgtaa ttagttatgt 3780
cacgcttaca ttcacgccct ccccccacat ccgctctaac cgaaaaggaa ggagttagac 3840
aacctgaagt ctaggtccct atttattttt ttatagttat gttagtatta agaacgttat 3900
ttatatttca aatttttctt ttttttctgt acagacgcgt gtacgcatgt aacattatac 3960
tgaaaacctt gcttgagaag gttttgggac gctcgaaggc tttaatttgc aagctggaga 4020
ccaacatgtg agcaaaaggc cagcaaaagg ccaggaaccg taaaaaggcc gcgttgctgg 4080
cgtttttcca taggctccgc ccccctgacg agcatcacaa aaatcgacgc tcaagtcaga 4140
ggtggcgaaa cccgacagga ctataaagat accaggcgtt tccccctgga agctccctcg 4200
tgcgctctcc tgttccgacc ctgccgctta ccggatacct gtccgccttt ctcccttcgg 4260
gaagcgtggc gctttctcaa tgctcacgct gtaggtatct cagttcggtg taggtcgttc 4320
gctccaagct gggctgtgtg cacgaacccc ccgttcagcc cgaccgctgc gccttatccg 4380
gtaactatcg tcttgagtcc aacccggtaa gacacgactt atcgccactg gcagcagcca 4440
ctggtaacag gattagcaga gcgaggtatg taggcggtgc tacagagttc ttgaagtggt 4500
ggcctaacta cggctacact agaaggacag tatttggtat ctgcgctctg ctgaagccag 4560
ttaccttcgg aaaaagagtt ggtagctctt gatccggcaa acaaaccacc gctggtagcg 4620
gtggtttttt tgtttgcaag cagcagatta cgcgcagaaa aaaaggatct caagaagatc 4680
ctttgatctt ttctacgggg tctgacgctc agtggaacga aaactcacgt taagggattt 4740
tggtcatgag atc 4753
<210> 7
<211> 6229
<212> DNA
<213> 人工序列(Artificial Sequence)
<400> 7
agatctgtcg acgcggagaa cgatctcctc gagctgctcg cggatcagct tgtggcccgg 60
taatggaacc aggccgacgc gacgctcctt gcggaccacg gtggctggcg agcccagttt 120
gtgaacgagg tcgtttagaa cgtcctccgc aaagtccagt gtcagatgaa tgtcctcctc 180
ggaccaattc agcatgttct cgagcagcca tctgtctttg gagtagaagc gtaatctctg 240
ctcctcgtta ctgtaccgga agaggtagtt tgcctcgccg cccataatga acaggttctc 300
tttctggtgg cctgtgagca gcggggacgt ctggacggcg tcgatgaggc ccttgaggcg 360
ctcgtagtac ttgttccgtc gctgtagccg gccgcggtga cgatacccac atagaggtcc 420
ttggccatta gtttgatgag gtggggcagg atgggcgact cggcatcgaa atttttgccg 480
tcgtcgtaca gtgtgatgtc accatcgaat gtaatgagct gcagcttgcg atctcggatg 540
gttttggaat ggaagaaccg cgacatctcc aacagctggg ccgtgttgag aatgagccgg 600
acgtcgttga acgagggggc cacaagccgg cgtttgctga tggcgcggcg ctcgtcctcg 660
atgtacaagg ccttttccag aggcagtctc gtgaagaagc tgccaacgct cggaaccagc 720
tgcacgagcc gagacaattc gggggtgccg gctttggtca tttcaatctt gtcgtcgatg 780
aggagttcga ggtcgtggaa gatttccgcg tagcggcgtt ttgcctcaga gtttaccatg 840
aggtcgtcca ctgcagagat gccgttgctc ttcaccgcgt acaggaccaa cggcgtcgcc 900
agcaggccct tgatccattc tatgaggcca tctcgacggt gttccttgag tgcgtactcc 960
actctgtagc gactggacat ctcgagactg ggcttgctgt gctcgatgca ccaattaatt 1020
gttgccgcat gcatccttgc accgcaagtt tttaaaaccc actcgcttta gccgtcgcgt 1080
aaaacttgtg aatctggcaa ctgagggggt tctgcagccg caaccgaact tttcgcttcg 1140
aggacgcagc tgcatggtgt catgtgaggc tctgtttgct ggcgtagcct acaacgtgac 1200
cttgcctaac cggacggcgc tacccactgc tgtctgtgcc tgctaccaga aaatcaccag 1260
agcagcagag gcccgatgtg gcaactggtg gggtgtcgga caggctgttt ctccacagtg 1320
caaatgcggg tgaaccggcc agaaagtaaa ttcttatgct accgtgcagc gactccgaca 1380
tccccagttt ttgccctact tgatcacaga tggggtcagc gctgccgcta agtgtaccca 1440
accgtgccca cacggtccat ctataaatac tgctgccagt gcacggtggt gacatcaatc 1500
taaagtacaa aaacaaattc gaaacgatgt ctctgtggag atccaacgag gccactgtct 1560
acctgcctcc agtttcggtg tctaaggttg tgtccacgga cgagtacgtc actagaacca 1620
acatctacta ccacgcaggt tcctctagac tcctggctgt tggtcaccct tactatgcca 1680
ttaagaagca ggactcgaac aagatcgccg tcccaaaggt ttctggcttg cagtacagag 1740
tgttcagagt taagctgcca gaccctaaca agttcggatt tccagacacc tccttctacg 1800
accctgcttc ccagagattg gtttgggcat gcactggagt cgaggtgggc agaggtcagc 1860
cattgggagt tggtatctct ggccaccctt tgctgaacaa gctcgacgat accgagaact 1920
ccaacaagta cgttggcaac tctggaaccg acaacagaga gtgcatctcg atggactaca 1980
agcagaccca gttgtgtctc atcggatgca gaccacctat tggtgaacat tggggaaagg 2040
gcaccccttg caacgccaac caggtcaagg ccggagagtg tcctccattg gagcttctga 2100
acactgttct ccaagatggt gacatggttg acaccggctt tggtgctatg gacttcacga 2160
ccttgcaggc caacaagtcc gacgtgccac ttgacatctg ttcttccatt tgcaagtacc 2220
ctgattacct gaagatggtt tcggagccat acggagacat gctcttcttt tacctgagaa 2280
gagagcagat gttcgtgaga cacttgttca acagagcagg aactgttggt gaaacggtcc 2340
ctgctgacct gtacatcaag ggcaccactg gtacgttgcc atctacctcg tacttcccta 2400
ctccatctgg ttcgatggtc acctccgatg cccagatctt caacaagcca tactggttgc 2460
agagagccca gggacacaac aatggcattt gctggtccaa ccagctgttc gtgaccgtcg 2520
ttgacactac gagatccacc aacatgtcgg tgtgttctgc agtcagctct tccgactcga 2580
cctacaagaa cgacaacttc aaggagtacc tcagacacgg tgaagagtac gacctgcaat 2640
tcatcttcca gttgtgcaag atcaccctga ctgctgacgt tatgacgtac attcactcca 2700
tgaacccttc gatcctggag gactggaact tcggtcttac tccacctcca tctggcacct 2760
tggaggacac ttacagatat gtcacctccc aagctgttac gtgtcagaag ccttcggccc 2820
caaagcctaa ggacgatcca ctgaagaact acaccttctg ggaggttgac ctgaaggaga 2880
agttctccgc agacctcgac cagttcccat tgggcagaaa gttcctgctc caagctggat 2940
tgaaggccag acctaacttc agacttggca agagagccgc tccagcatct acctctaaga 3000
aatcgtccac gaagcgcaga aaggtgaagt cgtaataggt accggagacg tggaaggaca 3060
taccgctttt gagaagcgtg tttgaaaata gttctttttc tggtttatat cgtttatgaa 3120
gtgatgagat gaaaagctga aatagcgagt ataggaaaat ttaatgaaaa ttaaattaaa 3180
tattttctta ggctattagt caccttcaaa atgccggccg cttctaagaa cgttgtcatg 3240
atcgacaact acgactcgtt tacctggaac ctgtacgagt acctgtgtca ggagggagcc 3300
aatgtcgagg ttttcaggaa cgatcagatc accattccgg agattgagca gctcaagccg 3360
gacgttgtgg tgatatcccc tggtcctggc catccaagaa cagactcggg aatatctcgc 3420
gacgtgatca gccattttaa aggcaagatt cctgtctttg gtgtctgtat gggccagcag 3480
tgtatcttcg aggagtttgg cggagacgtc gagtatgcgg gcgagattgt ccatggaaaa 3540
acgtccactg ttaagcacga caacaaggga atgttcaaaa acgttccgca agatgttgct 3600
gtcaccagat accactcgct ggccggaacg ctcaagtcgc ttccggactg tctagagatc 3660
actgctcgca cagacaacgg gatcattatg ggtgtgagac acaagaagta caccatcgag 3720
ggcgtccagt ttcatccaga gagcattctg accgaggagg gccatctgat gatccagaat 3780
atcctcaacg tttccggtgg ttactgggag gaaaatgcca acggcgcggc tcagagaaag 3840
gaaagcatat tggagaaaat atacgcgcag agacgaaaag actacgagtt tgagatgaac 3900
agaccggggc gcagatttgc tgatctagaa ctgtacttgt ccatgggact gcaccgccgc 3960
taatcaattt ttacgacaga ttggagcaga acatcagcgc cggcaaggtt gcaattctca 4020
gcgaaatcaa gagagcgtcg ccttctaaag gcgtcatcga cggagacgct aacgctgcca 4080
aacaggccct caactacgcc aaggctggag ttgccacaat ttctgttttg accgagccaa 4140
cctggtttaa aggaaatatc caggacctgg aggtggccag aaaagccatt gactctgtgg 4200
ccaatagacc gtgtattttg cggaaggagt ttatcttcaa caagtaccaa attctagagg 4260
cccgactggc gggagcagac acggttctgc tgattgtcaa gatgctgagc tcggatcccc 4320
cacacaccat agcttcaaaa tgtttctact ccttttttac tcttccagat tttctcggac 4380
tccgcgcatc gccgtaccac ttcaaaacac ccaagcacag catactaaat tttccctctt 4440
tcttcctcta gggtgtcgtt aattacccgt actaaaggtt tggaaaagaa aaaagagacc 4500
gcctcgtttc tttttcttcg tcgaaaaagg caataaaaat ttttatcacg tttctttttc 4560
ttgaaatttt tttttttagt ttttttctct ttcagtgacc tccattgata tttaagttaa 4620
taaacggtct tcaatttctc aagtttcagt ttcatttttc ttgttctatt acaacttttt 4680
ttacttcttg ttcattagaa agaaagcata gcaatctaat ctaaggggcg gtgttgacaa 4740
ttaatcatcg gcatagtata tcggcatagt ataatacgac aaggtgagga actaaaccat 4800
ggccaagttg accagtgccg ttccggtgct caccgcgcgc gacgtcgccg gagcggtcga 4860
gttctggacc gaccggctcg ggttctcccg ggacttcgtg gaggacgact tcgccggtgt 4920
ggtccgggac gacgtgaccc tgttcatcag cgcggtccag gaccaggtgg tgccggacaa 4980
caccctggcc tgggtgtggg tgcgcggcct ggacgagctg tacgccgagt ggtcggaggt 5040
cgtgtccacg aacttccggg acgcctccgg gccggccatg accgagatcg gcgagcagcc 5100
gtgggggcgg gagttcgccc tgcgcgaccc ggccggcaac tgcgtgcact tcgtggccga 5160
ggagcaggac tgacacgtcc gacggcggcc cacgggtccc aggcctcgga gatccgtccc 5220
ccttttcctt tgtcgatatc atgtaattag ttatgtcacg cttacattca cgccctcccc 5280
ccacatccgc tctaaccgaa aaggaaggag ttagacaacc tgaagtctag gtccctattt 5340
atttttttat agttatgtta gtattaagaa cgttatttat atttcaaatt tttctttttt 5400
ttctgtacag acgcgtgtac gcatgtaaca ttatactgaa aaccttgctt gagaaggttt 5460
tgggacgctc gaaggcttta atttgcaagc tggagaccaa catgtgagca aaaggccagc 5520
aaaaggccag gaaccgtaaa aaggccgcgt tgctggcgtt tttccatagg ctccgccccc 5580
ctgacgagca tcacaaaaat cgacgctcaa gtcagaggtg gcgaaacccg acaggactat 5640
aaagatacca ggcgtttccc cctggaagct ccctcgtgcg ctctcctgtt ccgaccctgc 5700
cgcttaccgg atacctgtcc gcctttctcc cttcgggaag cgtggcgctt tctcaatgct 5760
cacgctgtag gtatctcagt tcggtgtagg tcgttcgctc caagctgggc tgtgtgcacg 5820
aaccccccgt tcagcccgac cgctgcgcct tatccggtaa ctatcgtctt gagtccaacc 5880
cggtaagaca cgacttatcg ccactggcag cagccactgg taacaggatt agcagagcga 5940
ggtatgtagg cggtgctaca gagttcttga agtggtggcc taactacggc tacactagaa 6000
ggacagtatt tggtatctgc gctctgctga agccagttac cttcggaaaa agagttggta 6060
gctcttgatc cggcaaacaa accaccgctg gtagcggtgg tttttttgtt tgcaagcagc 6120
agattacgcg cagaaaaaaa ggatctcaag aagatccttt gatcttttct acggggtctg 6180
acgctcagtg gaacgaaaac tcacgttaag ggattttggt catgagatc 6229
<210> 8
<211> 6229
<212> DNA
<213> 人工序列(Artificial Sequence)
<400> 8
agatctgtcg acgcggagaa cgatctcctc gagctgctcg cggatcagct tgtggcccgg 60
taatggaacc aggccgacgc gacgctcctt gcggaccacg gtggctggcg agcccagttt 120
gtgaacgagg tcgtttagaa cgtcctccgc aaagtccagt gtcagatgaa tgtcctcctc 180
ggaccaattc agcatgttct cgagcagcca tctgtctttg gagtagaagc gtaatctctg 240
ctcctcgtta ctgtaccgga agaggtagtt tgcctcgccg cccataatga acaggttctc 300
tttctggtgg cctgtgagca gcggggacgt ctggacggcg tcgatgaggc ccttgaggcg 360
ctcgtagtac ttgttccgtc gctgtagccg gccgcggtga cgatacccac atagaggtcc 420
ttggccatta gtttgatgag gtggggcagg atgggcgact cggcatcgaa atttttgccg 480
tcgtcgtaca gtgtgatgtc accatcgaat gtaatgagct gcagcttgcg atctcggatg 540
gttttggaat ggaagaaccg cgacatctcc aacagctggg ccgtgttgag aatgagccgg 600
acgtcgttga acgagggggc cacaagccgg cgtttgctga tggcgcggcg ctcgtcctcg 660
atgtacaagg ccttttccag aggcagtctc gtgaagaagc tgccaacgct cggaaccagc 720
tgcacgagcc gagacaattc gggggtgccg gctttggtca tttcaatctt gtcgtcgatg 780
aggagttcga ggtcgtggaa gatttccgcg tagcggcgtt ttgcctcaga gtttaccatg 840
aggtcgtcca ctgcagagat gccgttgctc ttcaccgcgt acaggaccaa cggcgtcgcc 900
agcaggccct tgatccattc tatgaggcca tctcgacggt gttccttgag tgcgtactcc 960
actctgtagc gactggacat ctcgagactg ggcttgctgt gctcgatgca ccaattaatt 1020
gttgccgcat gcatccttgc accgcaagtt tttaaaaccc actcgcttta gccgtcgcgt 1080
aaaacttgtg aatctggcaa ctgagggggt tctgcagccg caaccgaact tttcgcttcg 1140
aggacgcagc tgcatggtgt catgtgaggc tctgtttgct ggcgtagcct acaacgtgac 1200
cttgcctaac cggacggcgc tacccactgc tgtctgtgcc tgctaccaga aaatcaccag 1260
agcagcagag gcccgatgtg gcaactggtg gggtgtcgga caggctgttt ctccacagtg 1320
caaatgcggg tgaaccggcc agaaagtaaa ttcttatgct accgtgcagc gactccgaca 1380
tccccagttt ttgccctact tgatcacaga tggggtcagc gctgccgcta agtgtaccca 1440
accgtgccca cacggtccat ctataaatac tgctgccagt gcacggtggt gacatcaatc 1500
taaagtacaa aaacaaattc gaaacgatgt ccttgtggag atctaacgag gctaccgttt 1560
acctcccacc tgtctctgtt tccaaggtcg tttcgactga cgaatacgtg accagaacga 1620
acatctacta ccacgccgga tcttcgagac tgcttgccgt cggacaccca tattacgcta 1680
tcaagaagca ggactccaac aagatcgctg ttcctaaggt ctcgggtctc cagtacagag 1740
ttttcagagt gaagttgcct gacccaaaca agttcggctt ccctgacacg tcgttctacg 1800
acccagcctc tcaaagactg gtctgggcct gtaccggtgt tgaggtcgga agaggccaac 1860
ctctgggtgt gggaatttcc ggtcacccac tcttgaacaa gctggatgac actgagaact 1920
cgaataagta cgtcggaaac tccggcacag acaacagaga atgtatttct atggactaca 1980
agcagacgca actgtgcctt atcggctgta gacctccaat cggagagcac tggggcaagg 2040
gtactccatg caacgctaac caggttaagg caggtgagtg cccacctctg gagttgctca 2100
acaccgtgct tcaggacgga gatatggttg acaccggttt cggcgcaatg gactttacta 2160
cgctccaggc taacaagtcg gacgttcctt tggatatttg ctcctctatc tgtaagtacc 2220
cagactactt gaagatggtt tctgagcctt acggcgacat gctgttcttc tacctcagac 2280
gcgagcagat gttcgttaga cacctgttta acagagccgg tactgtgggc gagaccgttc 2340
cagccgactt gtacattaag ggaacgaccg gcacactgcc ttccacctct tacttcccaa 2400
ccccttcggg atctatggtt acttctgacg ctcaaatctt caacaagcct tactggctgc 2460
aaagagcaca gggtcacaac aacggtatct gctggtcgaa ccagttgttc gtcactgttg 2520
tggacacgac cagatctacc aacatgtccg tttgctctgc agtttccagc tctgactcca 2580
cttacaagaa cgacaacttc aaggaatact tgagacacgg cgaggaatac gacctccagt 2640
tcatcttcca gctgtgcaag attaccttga ccgccgatgt gatgacttac atccactcca 2700
tgaacccatc catcctcgaa gactggaact tcggactgac ccctccacct tctggtactc 2760
tggaggacac ctatagatac gttacctctc aggccgtgac ttgccagaag ccatccgcac 2820
ctaagccaaa ggatgaccct ttgaagaact acacgttttg ggaggtcgac ttgaaggaga 2880
agttctctgc cgacttggat cagttccctc tgggtagaaa gttcctgctt caggccggct 2940
tgaaggctag accaaacttc agactgggca agagagcagc cccagcttcc acttccaaga 3000
agtcctcgac caagagaaga aaggtcaagt cctaataggt accggagacg tggaaggaca 3060
taccgctttt gagaagcgtg tttgaaaata gttctttttc tggtttatat cgtttatgaa 3120
gtgatgagat gaaaagctga aatagcgagt ataggaaaat ttaatgaaaa ttaaattaaa 3180
tattttctta ggctattagt caccttcaaa atgccggccg cttctaagaa cgttgtcatg 3240
atcgacaact acgactcgtt tacctggaac ctgtacgagt acctgtgtca ggagggagcc 3300
aatgtcgagg ttttcaggaa cgatcagatc accattccgg agattgagca gctcaagccg 3360
gacgttgtgg tgatatcccc tggtcctggc catccaagaa cagactcggg aatatctcgc 3420
gacgtgatca gccattttaa aggcaagatt cctgtctttg gtgtctgtat gggccagcag 3480
tgtatcttcg aggagtttgg cggagacgtc gagtatgcgg gcgagattgt ccatggaaaa 3540
acgtccactg ttaagcacga caacaaggga atgttcaaaa acgttccgca agatgttgct 3600
gtcaccagat accactcgct ggccggaacg ctcaagtcgc ttccggactg tctagagatc 3660
actgctcgca cagacaacgg gatcattatg ggtgtgagac acaagaagta caccatcgag 3720
ggcgtccagt ttcatccaga gagcattctg accgaggagg gccatctgat gatccagaat 3780
atcctcaacg tttccggtgg ttactgggag gaaaatgcca acggcgcggc tcagagaaag 3840
gaaagcatat tggagaaaat atacgcgcag agacgaaaag actacgagtt tgagatgaac 3900
agaccggggc gcagatttgc tgatctagaa ctgtacttgt ccatgggact gcaccgccgc 3960
taatcaattt ttacgacaga ttggagcaga acatcagcgc cggcaaggtt gcaattctca 4020
gcgaaatcaa gagagcgtcg ccttctaaag gcgtcatcga cggagacgct aacgctgcca 4080
aacaggccct caactacgcc aaggctggag ttgccacaat ttctgttttg accgagccaa 4140
cctggtttaa aggaaatatc caggacctgg aggtggccag aaaagccatt gactctgtgg 4200
ccaatagacc gtgtattttg cggaaggagt ttatcttcaa caagtaccaa attctagagg 4260
cccgactggc gggagcagac acggttctgc tgattgtcaa gatgctgagc tcggatcccc 4320
cacacaccat agcttcaaaa tgtttctact ccttttttac tcttccagat tttctcggac 4380
tccgcgcatc gccgtaccac ttcaaaacac ccaagcacag catactaaat tttccctctt 4440
tcttcctcta gggtgtcgtt aattacccgt actaaaggtt tggaaaagaa aaaagagacc 4500
gcctcgtttc tttttcttcg tcgaaaaagg caataaaaat ttttatcacg tttctttttc 4560
ttgaaatttt tttttttagt ttttttctct ttcagtgacc tccattgata tttaagttaa 4620
taaacggtct tcaatttctc aagtttcagt ttcatttttc ttgttctatt acaacttttt 4680
ttacttcttg ttcattagaa agaaagcata gcaatctaat ctaaggggcg gtgttgacaa 4740
ttaatcatcg gcatagtata tcggcatagt ataatacgac aaggtgagga actaaaccat 4800
ggccaagttg accagtgccg ttccggtgct caccgcgcgc gacgtcgccg gagcggtcga 4860
gttctggacc gaccggctcg ggttctcccg ggacttcgtg gaggacgact tcgccggtgt 4920
ggtccgggac gacgtgaccc tgttcatcag cgcggtccag gaccaggtgg tgccggacaa 4980
caccctggcc tgggtgtggg tgcgcggcct ggacgagctg tacgccgagt ggtcggaggt 5040
cgtgtccacg aacttccggg acgcctccgg gccggccatg accgagatcg gcgagcagcc 5100
gtgggggcgg gagttcgccc tgcgcgaccc ggccggcaac tgcgtgcact tcgtggccga 5160
ggagcaggac tgacacgtcc gacggcggcc cacgggtccc aggcctcgga gatccgtccc 5220
ccttttcctt tgtcgatatc atgtaattag ttatgtcacg cttacattca cgccctcccc 5280
ccacatccgc tctaaccgaa aaggaaggag ttagacaacc tgaagtctag gtccctattt 5340
atttttttat agttatgtta gtattaagaa cgttatttat atttcaaatt tttctttttt 5400
ttctgtacag acgcgtgtac gcatgtaaca ttatactgaa aaccttgctt gagaaggttt 5460
tgggacgctc gaaggcttta atttgcaagc tggagaccaa catgtgagca aaaggccagc 5520
aaaaggccag gaaccgtaaa aaggccgcgt tgctggcgtt tttccatagg ctccgccccc 5580
ctgacgagca tcacaaaaat cgacgctcaa gtcagaggtg gcgaaacccg acaggactat 5640
aaagatacca ggcgtttccc cctggaagct ccctcgtgcg ctctcctgtt ccgaccctgc 5700
cgcttaccgg atacctgtcc gcctttctcc cttcgggaag cgtggcgctt tctcaatgct 5760
cacgctgtag gtatctcagt tcggtgtagg tcgttcgctc caagctgggc tgtgtgcacg 5820
aaccccccgt tcagcccgac cgctgcgcct tatccggtaa ctatcgtctt gagtccaacc 5880
cggtaagaca cgacttatcg ccactggcag cagccactgg taacaggatt agcagagcga 5940
ggtatgtagg cggtgctaca gagttcttga agtggtggcc taactacggc tacactagaa 6000
ggacagtatt tggtatctgc gctctgctga agccagttac cttcggaaaa agagttggta 6060
gctcttgatc cggcaaacaa accaccgctg gtagcggtgg tttttttgtt tgcaagcagc 6120
agattacgcg cagaaaaaaa ggatctcaag aagatccttt gatcttttct acggggtctg 6180
acgctcagtg gaacgaaaac tcacgttaag ggattttggt catgagatc 6229
<210> 9
<211> 6229
<212> DNA
<213> Artificial Sequence
<400> 9
agatctgtcg acgcggagaa cgatctcctc gagctgctcg cggatcagct tgtggcccgg 60
taatggaacc aggccgacgc gacgctcctt gcggaccacg gtggctggcg agcccagttt 120
gtgaacgagg tcgtttagaa cgtcctccgc aaagtccagt gtcagatgaa tgtcctcctc 180
ggaccaattc agcatgttct cgagcagcca tctgtctttg gagtagaagc gtaatctctg 240
ctcctcgtta ctgtaccgga agaggtagtt tgcctcgccg cccataatga acaggttctc 300
tttctggtgg cctgtgagca gcggggacgt ctggacggcg tcgatgaggc ccttgaggcg 360
ctcgtagtac ttgttccgtc gctgtagccg gccgcggtga cgatacccac atagaggtcc 420
ttggccatta gtttgatgag gtggggcagg atgggcgact cggcatcgaa atttttgccg 480
tcgtcgtaca gtgtgatgtc accatcgaat gtaatgagct gcagcttgcg atctcggatg 540
gttttggaat ggaagaaccg cgacatctcc aacagctggg ccgtgttgag aatgagccgg 600
acgtcgttga acgagggggc cacaagccgg cgtttgctga tggcgcggcg ctcgtcctcg 660
atgtacaagg ccttttccag aggcagtctc gtgaagaagc tgccaacgct cggaaccagc 720
tgcacgagcc gagacaattc gggggtgccg gctttggtca tttcaatctt gtcgtcgatg 780
aggagttcga ggtcgtggaa gatttccgcg tagcggcgtt ttgcctcaga gtttaccatg 840
aggtcgtcca ctgcagagat gccgttgctc ttcaccgcgt acaggaccaa cggcgtcgcc 900
agcaggccct tgatccattc tatgaggcca tctcgacggt gttccttgag tgcgtactcc 960
actctgtagc gactggacat ctcgagactg ggcttgctgt gctcgatgca ccaattaatt 1020
gttgccgcat gcatccttgc accgcaagtt tttaaaaccc actcgcttta gccgtcgcgt 1080
aaaacttgtg aatctggcaa ctgagggggt tctgcagccg caaccgaact tttcgcttcg 1140
aggacgcagc tgcatggtgt catgtgaggc tctgtttgct ggcgtagcct acaacgtgac 1200
cttgcctaac cggacggcgc tacccactgc tgtctgtgcc tgctaccaga aaatcaccag 1260
agcagcagag gcccgatgtg gcaactggtg gggtgtcgga caggctgttt ctccacagtg 1320
caaatgcggg tgaaccggcc agaaagtaaa ttcttatgct accgtgcagc gactccgaca 1380
tccccagttt ttgccctact tgatcacaga tggggtcagc gctgccgcta agtgtaccca 1440
accgtgccca cacggtccat ctataaatac tgctgccagt gcacggtggt gacatcaatc 1500
taaagtacaa aaacaaattc gaaacgatga gcctgtggag aagcaacgag gccaccgtgt 1560
acctgccccc ggtgagcgtg agcaaggtgg tgagcaccga cgagtacgtg acaagaacca 1620
acatctacta ccacgccggc agcagcagac tgctggccgt gggccacccc tactacgcca 1680
tcaagaagca agacagcaac aagatcgccg tgcccaaggt gagcggcctg cagtacagag 1740
tgttcagagt gaagctgccc gaccccaaca agttcggctt ccccgacact agcttctatg 1800
acccggctag ccaaagactg gtgtgggcgt gtacgggcgt ggaggtaggt agagggcagc 1860
cactgggcgt gggcatcagc ggccaccccc tgctgaacaa gctggacgac accgagaaca 1920
gcaacaagta cgtgggcaac agcggcaccg acaacagaga gtgcatcagc atggactaca 1980
agcagacaca gctgtgcctg atcggctgca gaccccccat cggcgagcac tggggcaaag 2040
gcaccccgtg taacgctaac caagtcaagg cgggagagtg cccccccctg gagctgctga 2100
acaccgtgct gcaagacggc gacatggtgg acaccggctt cggcgccatg gacttcacca 2160
ccctgcaagc caacaagagc gacgtgcccc tggacatctg cagcagcatc tgcaagtacc 2220
ccgactacct gaagatggtg agcgagccct acggcgacat gctgttcttc tacctgagaa 2280
gagagcagat gttcgtgaga cacctgttca acagagccgg caccgtgggc gagaccgtgc 2340
ccgccgacct gtacatcaag ggcaccaccg gcaccctgcc tagcacaagc tacttcccca 2400
cgcctagcgg cagcatggtg acaagcgacg ctcagatctt caacaagccc tactggctgc 2460
agagagccca aggccacaac aacggcatct gctggagcaa tcagctgttc gtgaccgtgg 2520
tggacaccac aagaagcacc aacatgagcg tgtgcagcgc cgtgagcagc agcgacagca 2580
cctacaagaa cgacaacttc aaggagtacc tgagacacgg cgaggagtac gacctgcagt 2640
tcatctttca gctgtgcaag atcaccctga ccgccgacgt gatgacctac atccacagca 2700
tgaaccctag catcctggag gactggaact tcggcctgac cccccctcct agcggcaccc 2760
tggaggacac ctacagatac gtgacaagcc aagccgtgac ctgtcagaag cctagcgccc 2820
ccaagcccaa ggacgacccc ctgaagaact acaccttctg ggaggtggac ctgaaggaga 2880
agttcagcgc cgacctggat cagttccccc tgggcagaaa gttcctgctg caagccggcc 2940
tgaaggctag acccaacttc agactgggca agagagccgc ccccgctagc acaagcaaga 3000
agagcagcac caagagaaga aaggtgaaga gctaataggt accggagacg tggaaggaca 3060
taccgctttt gagaagcgtg tttgaaaata gttctttttc tggtttatat cgtttatgaa 3120
gtgatgagat gaaaagctga aatagcgagt ataggaaaat ttaatgaaaa ttaaattaaa 3180
tattttctta ggctattagt caccttcaaa atgccggccg cttctaagaa cgttgtcatg 3240
atcgacaact acgactcgtt tacctggaac ctgtacgagt acctgtgtca ggagggagcc 3300
aatgtcgagg ttttcaggaa cgatcagatc accattccgg agattgagca gctcaagccg 3360
gacgttgtgg tgatatcccc tggtcctggc catccaagaa cagactcggg aatatctcgc 3420
gacgtgatca gccattttaa aggcaagatt cctgtctttg gtgtctgtat gggccagcag 3480
tgtatcttcg aggagtttgg cggagacgtc gagtatgcgg gcgagattgt ccatggaaaa 3540
acgtccactg ttaagcacga caacaaggga atgttcaaaa acgttccgca agatgttgct 3600
gtcaccagat accactcgct ggccggaacg ctcaagtcgc ttccggactg tctagagatc 3660
actgctcgca cagacaacgg gatcattatg ggtgtgagac acaagaagta caccatcgag 3720
ggcgtccagt ttcatccaga gagcattctg accgaggagg gccatctgat gatccagaat 3780
atcctcaacg tttccggtgg ttactgggag gaaaatgcca acggcgcggc tcagagaaag 3840
gaaagcatat tggagaaaat atacgcgcag agacgaaaag actacgagtt tgagatgaac 3900
agaccggggc gcagatttgc tgatctagaa ctgtacttgt ccatgggact gcaccgccgc 3960
taatcaattt ttacgacaga ttggagcaga acatcagcgc cggcaaggtt gcaattctca 4020
gcgaaatcaa gagagcgtcg ccttctaaag gcgtcatcga cggagacgct aacgctgcca 4080
aacaggccct caactacgcc aaggctggag ttgccacaat ttctgttttg accgagccaa 4140
cctggtttaa aggaaatatc caggacctgg aggtggccag aaaagccatt gactctgtgg 4200
ccaatagacc gtgtattttg cggaaggagt ttatcttcaa caagtaccaa attctagagg 4260
cccgactggc gggagcagac acggttctgc tgattgtcaa gatgctgagc tcggatcccc 4320
cacacaccat agcttcaaaa tgtttctact ccttttttac tcttccagat tttctcggac 4380
tccgcgcatc gccgtaccac ttcaaaacac ccaagcacag catactaaat tttccctctt 4440
tcttcctcta gggtgtcgtt aattacccgt actaaaggtt tggaaaagaa aaaagagacc 4500
gcctcgtttc tttttcttcg tcgaaaaagg caataaaaat ttttatcacg tttctttttc 4560
ttgaaatttt tttttttagt ttttttctct ttcagtgacc tccattgata tttaagttaa 4620
taaacggtct tcaatttctc aagtttcagt ttcatttttc ttgttctatt acaacttttt 4680
ttacttcttg ttcattagaa agaaagcata gcaatctaat ctaaggggcg gtgttgacaa 4740
ttaatcatcg gcatagtata tcggcatagt ataatacgac aaggtgagga actaaaccat 4800
ggccaagttg accagtgccg ttccggtgct caccgcgcgc gacgtcgccg gagcggtcga 4860
gttctggacc gaccggctcg ggttctcccg ggacttcgtg gaggacgact tcgccggtgt 4920
ggtccgggac gacgtgaccc tgttcatcag cgcggtccag gaccaggtgg tgccggacaa 4980
caccctggcc tgggtgtggg tgcgcggcct ggacgagctg tacgccgagt ggtcggaggt 5040
cgtgtccacg aacttccggg acgcctccgg gccggccatg accgagatcg gcgagcagcc 5100
gtgggggcgg gagttcgccc tgcgcgaccc ggccggcaac tgcgtgcact tcgtggccga 5160
ggagcaggac tgacacgtcc gacggcggcc cacgggtccc aggcctcgga gatccgtccc 5220
ccttttcctt tgtcgatatc atgtaattag ttatgtcacg cttacattca cgccctcccc 5280
ccacatccgc tctaaccgaa aaggaaggag ttagacaacc tgaagtctag gtccctattt 5340
atttttttat agttatgtta gtattaagaa cgttatttat atttcaaatt tttctttttt 5400
ttctgtacag acgcgtgtac gcatgtaaca ttatactgaa aaccttgctt gagaaggttt 5460
tgggacgctc gaaggcttta atttgcaagc tggagaccaa catgtgagca aaaggccagc 5520
aaaaggccag gaaccgtaaa aaggccgcgt tgctggcgtt tttccatagg ctccgccccc 5580
ctgacgagca tcacaaaaat cgacgctcaa gtcagaggtg gcgaaacccg acaggactat 5640
aaagatacca ggcgtttccc cctggaagct ccctcgtgcg ctctcctgtt ccgaccctgc 5700
cgcttaccgg atacctgtcc gcctttctcc cttcgggaag cgtggcgctt tctcaatgct 5760
cacgctgtag gtatctcagt tcggtgtagg tcgttcgctc caagctgggc tgtgtgcacg 5820
aaccccccgt tcagcccgac cgctgcgcct tatccggtaa ctatcgtctt gagtccaacc 5880
cggtaagaca cgacttatcg ccactggcag cagccactgg taacaggatt agcagagcga 5940
ggtatgtagg cggtgctaca gagttcttga agtggtggcc taactacggc tacactagaa 6000
ggacagtatt tggtatctgc gctctgctga agccagttac cttcggaaaa agagttggta 6060
gctcttgatc cggcaaacaa accaccgctg gtagcggtgg tttttttgtt tgcaagcagc 6120
agattacgcg cagaaaaaaa ggatctcaag aagatccttt gatcttttct acggggtctg 6180
acgctcagtg gaacgaaaac tcacgttaag ggattttggt catgagatc 6229
<210> 10
<211> 6229
<212> DNA
<213> Artificial Sequence
<400> 10
agatctgtcg acgcggagaa cgatctcctc gagctgctcg cggatcagct tgtggcccgg 60
taatggaacc aggccgacgc gacgctcctt gcggaccacg gtggctggcg agcccagttt 120
gtgaacgagg tcgtttagaa cgtcctccgc aaagtccagt gtcagatgaa tgtcctcctc 180
ggaccaattc agcatgttct cgagcagcca tctgtctttg gagtagaagc gtaatctctg 240
ctcctcgtta ctgtaccgga agaggtagtt tgcctcgccg cccataatga acaggttctc 300
tttctggtgg cctgtgagca gcggggacgt ctggacggcg tcgatgaggc ccttgaggcg 360
ctcgtagtac ttgttccgtc gctgtagccg gccgcggtga cgatacccac atagaggtcc 420
ttggccatta gtttgatgag gtggggcagg atgggcgact cggcatcgaa atttttgccg 480
tcgtcgtaca gtgtgatgtc accatcgaat gtaatgagct gcagcttgcg atctcggatg 540
gttttggaat ggaagaaccg cgacatctcc aacagctggg ccgtgttgag aatgagccgg 600
acgtcgttga acgagggggc cacaagccgg cgtttgctga tggcgcggcg ctcgtcctcg 660
atgtacaagg ccttttccag aggcagtctc gtgaagaagc tgccaacgct cggaaccagc 720
tgcacgagcc gagacaattc gggggtgccg gctttggtca tttcaatctt gtcgtcgatg 780
aggagttcga ggtcgtggaa gatttccgcg tagcggcgtt ttgcctcaga gtttaccatg 840
aggtcgtcca ctgcagagat gccgttgctc ttcaccgcgt acaggaccaa cggcgtcgcc 900
agcaggccct tgatccattc tatgaggcca tctcgacggt gttccttgag tgcgtactcc 960
actctgtagc gactggacat ctcgagactg ggcttgctgt gctcgatgca ccaattaatt 1020
gttgccgcat gcatccttgc accgcaagtt tttaaaaccc actcgcttta gccgtcgcgt 1080
aaaacttgtg aatctggcaa ctgagggggt tctgcagccg caaccgaact tttcgcttcg 1140
aggacgcagc tgcatggtgt catgtgaggc tctgtttgct ggcgtagcct acaacgtgac 1200
cttgcctaac cggacggcgc tacccactgc tgtctgtgcc tgctaccaga aaatcaccag 1260
agcagcagag gcccgatgtg gcaactggtg gggtgtcgga caggctgttt ctccacagtg 1320
caaatgcggg tgaaccggcc agaaagtaaa ttcttatgct accgtgcagc gactccgaca 1380
tccccagttt ttgccctact tgatcacaga tggggtcagc gctgccgcta agtgtaccca 1440
accgtgccca cacggtccat ctataaatac tgctgccagt gcacggtggt gacatcaatc 1500
taaagtacaa aaacaaattc gaaacgatga gcctgtggag aagcaatgag gccacagtgt 1560
acctgccccc tgtgtctgtg agcaaggtgg tgagcacaga tgagtatgtg acaagaacca 1620
acatctacta ccatgctggc agcagcagac tgctggctgt gggccacccc tactatgcca 1680
tcaagaagca agacagcaac aagattgctg tgcccaaggt gtctggcctg cagtacagag 1740
tgttcagagt gaagctgcct gaccccaaca agtttggctt ccctgacact agcttctatg 1800
accctgctag ccaaagactg gtgtgggcct gtactggggt ggaggtaggt agagggcagc 1860
cactgggggt gggcatctct ggccaccccc tgctgaacaa gctggatgac acagagaaca 1920
gcaacaagta tgtgggcaac tctggcacag acaacagaga gtgcatcagc atggactaca 1980
agcagacaca gctgtgcctg attggctgca gaccccccat tggggagcac tggggcaaag 2040
gcaccccctg taatgccaac caagtcaagg ctggagagtg cccccccctg gagctgctga 2100
acacagtgct gcaagatggg gacatggtgg acactggctt tggggccatg gacttcacca 2160
ccctgcaagc caacaagtct gatgtgcccc tggacatctg cagcagcatc tgcaagtacc 2220
ctgactacct gaagatggtg tctgagccct atggggacat gctgttcttc tacctgagaa 2280
gagagcagat gtttgtgaga cacctgttca acagagctgg cacagtgggg gagacagtgc 2340
ctgctgacct gtacatcaag ggcaccactg gcaccctgcc tagcacaagc tacttcccca 2400
ccccctctgg cagcatggtg acctctgatg ctcagatctt caacaagccc tactggctgc 2460
agagagccca aggccacaac aatggcatct gctggagcaa tcagctgttt gtgacagtgg 2520
tggacaccac aagaagcacc aacatgtctg tgtgctctgc tgtgagcagc tctgacagca 2580
cctacaagaa tgacaacttc aaggagtacc tgagacatgg ggaggagtat gacctgcagt 2640
tcatctttca gctgtgcaag atcaccctga cagctgatgt gatgacctac atccacagca 2700
tgaaccctag catcctggag gactggaact ttggcctgac cccccctccc tctggcaccc 2760
tggaggacac ctacagatat gtgacaagcc aagctgtgac ctgtcagaag ccctctgccc 2820
ccaagcccaa ggatgacccc ctgaagaact acaccttctg ggaggtggac ctgaaggaga 2880
agttctctgc tgacctggat cagttccccc tgggcagaaa gttcctgctg caagctggcc 2940
tgaaggctag acccaacttc agactgggca agagagctgc ccctgctagc acaagcaaga 3000
agagcagcac caagagaaga aaggtgaaga gctaataggt accggagacg tggaaggaca 3060
taccgctttt gagaagcgtg tttgaaaata gttctttttc tggtttatat cgtttatgaa 3120
gtgatgagat gaaaagctga aatagcgagt ataggaaaat ttaatgaaaa ttaaattaaa 3180
tattttctta ggctattagt caccttcaaa atgccggccg cttctaagaa cgttgtcatg 3240
atcgacaact acgactcgtt tacctggaac ctgtacgagt acctgtgtca ggagggagcc 3300
aatgtcgagg ttttcaggaa cgatcagatc accattccgg agattgagca gctcaagccg 3360
gacgttgtgg tgatatcccc tggtcctggc catccaagaa cagactcggg aatatctcgc 3420
gacgtgatca gccattttaa aggcaagatt cctgtctttg gtgtctgtat gggccagcag 3480
tgtatcttcg aggagtttgg cggagacgtc gagtatgcgg gcgagattgt ccatggaaaa 3540
acgtccactg ttaagcacga caacaaggga atgttcaaaa acgttccgca agatgttgct 3600
gtcaccagat accactcgct ggccggaacg ctcaagtcgc ttccggactg tctagagatc 3660
actgctcgca cagacaacgg gatcattatg ggtgtgagac acaagaagta caccatcgag 3720
ggcgtccagt ttcatccaga gagcattctg accgaggagg gccatctgat gatccagaat 3780
atcctcaacg tttccggtgg ttactgggag gaaaatgcca acggcgcggc tcagagaaag 3840
gaaagcatat tggagaaaat atacgcgcag agacgaaaag actacgagtt tgagatgaac 3900
agaccggggc gcagatttgc tgatctagaa ctgtacttgt ccatgggact gcaccgccgc 3960
taatcaattt ttacgacaga ttggagcaga acatcagcgc cggcaaggtt gcaattctca 4020
gcgaaatcaa gagagcgtcg ccttctaaag gcgtcatcga cggagacgct aacgctgcca 4080
aacaggccct caactacgcc aaggctggag ttgccacaat ttctgttttg accgagccaa 4140
cctggtttaa aggaaatatc caggacctgg aggtggccag aaaagccatt gactctgtgg 4200
ccaatagacc gtgtattttg cggaaggagt ttatcttcaa caagtaccaa attctagagg 4260
cccgactggc gggagcagac acggttctgc tgattgtcaa gatgctgagc tcggatcccc 4320
cacacaccat agcttcaaaa tgtttctact ccttttttac tcttccagat tttctcggac 4380
tccgcgcatc gccgtaccac ttcaaaacac ccaagcacag catactaaat tttccctctt 4440
tcttcctcta gggtgtcgtt aattacccgt actaaaggtt tggaaaagaa aaaagagacc 4500
gcctcgtttc tttttcttcg tcgaaaaagg caataaaaat ttttatcacg tttctttttc 4560
ttgaaatttt tttttttagt ttttttctct ttcagtgacc tccattgata tttaagttaa 4620
taaacggtct tcaatttctc aagtttcagt ttcatttttc ttgttctatt acaacttttt 4680
ttacttcttg ttcattagaa agaaagcata gcaatctaat ctaaggggcg gtgttgacaa 4740
ttaatcatcg gcatagtata tcggcatagt ataatacgac aaggtgagga actaaaccat 4800
ggccaagttg accagtgccg ttccggtgct caccgcgcgc gacgtcgccg gagcggtcga 4860
gttctggacc gaccggctcg ggttctcccg ggacttcgtg gaggacgact tcgccggtgt 4920
ggtccgggac gacgtgaccc tgttcatcag cgcggtccag gaccaggtgg tgccggacaa 4980
caccctggcc tgggtgtggg tgcgcggcct ggacgagctg tacgccgagt ggtcggaggt 5040
cgtgtccacg aacttccggg acgcctccgg gccggccatg accgagatcg gcgagcagcc 5100
gtgggggcgg gagttcgccc tgcgcgaccc ggccggcaac tgcgtgcact tcgtggccga 5160
ggagcaggac tgacacgtcc gacggcggcc cacgggtccc aggcctcgga gatccgtccc 5220
ccttttcctt tgtcgatatc atgtaattag ttatgtcacg cttacattca cgccctcccc 5280
ccacatccgc tctaaccgaa aaggaaggag ttagacaacc tgaagtctag gtccctattt 5340
atttttttat agttatgtta gtattaagaa cgttatttat atttcaaatt tttctttttt 5400
ttctgtacag acgcgtgtac gcatgtaaca ttatactgaa aaccttgctt gagaaggttt 5460
tgggacgctc gaaggcttta atttgcaagc tggagaccaa catgtgagca aaaggccagc 5520
aaaaggccag gaaccgtaaa aaggccgcgt tgctggcgtt tttccatagg ctccgccccc 5580
ctgacgagca tcacaaaaat cgacgctcaa gtcagaggtg gcgaaacccg acaggactat 5640
aaagatacca ggcgtttccc cctggaagct ccctcgtgcg ctctcctgtt ccgaccctgc 5700
cgcttaccgg atacctgtcc gcctttctcc cttcgggaag cgtggcgctt tctcaatgct 5760
cacgctgtag gtatctcagt tcggtgtagg tcgttcgctc caagctgggc tgtgtgcacg 5820
aaccccccgt tcagcccgac cgctgcgcct tatccggtaa ctatcgtctt gagtccaacc 5880
cggtaagaca cgacttatcg ccactggcag cagccactgg taacaggatt agcagagcga 5940
ggtatgtagg cggtgctaca gagttcttga agtggtggcc taactacggc tacactagaa 6000
ggacagtatt tggtatctgc gctctgctga agccagttac cttcggaaaa agagttggta 6060
gctcttgatc cggcaaacaa accaccgctg gtagcggtgg tttttttgtt tgcaagcagc 6120
agattacgcg cagaaaaaaa ggatctcaag aagatccttt gatcttttct acggggtctg 6180
acgctcagtg gaacgaaaac tcacgttaag ggattttggt catgagatc 6229
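Each row of the sequence listings above holds blocks of bases followed by a running base count. As a minimal sketch that is not part of the patent, the following Python shows one way such rows could be rebuilt into a single sequence and two listings compared position by position. The names `parse_listing` and `diff_positions`, and the `start` and `offset` parameters, are illustrative assumptions. The two sample rows are the 1501 to 1560 rows of SEQ ID NO: 9 and SEQ ID NO: 10, copied from the listings.

```python
import re


def parse_listing(lines, start=0):
    """Join listing rows into one string, checking each running base count.

    `start` is the number of bases that precede the first row supplied
    (0 when the rows begin at position 1). Hypothetical helper.
    """
    pieces = []
    total = start
    for line in lines:
        parts = line.split()
        if not parts:
            continue  # skip blank rows
        *blocks, count = parts  # last field is the running count
        bases = "".join(blocks)
        if not re.fullmatch(r"[acgtn]+", bases):
            raise ValueError(f"unexpected characters in row: {line!r}")
        total += len(bases)
        if total != int(count):
            raise ValueError(f"running count {count} does not match {total}")
        pieces.append(bases)
    return "".join(pieces)


def diff_positions(a, b, offset=0):
    """Return 1-based positions (shifted by `offset`) where a and b differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return [offset + i + 1 for i, (x, y) in enumerate(zip(a, b)) if x != y]


if __name__ == "__main__":
    row_seq9 = ("taaagtacaa aaacaaattc gaaacgatga gcctgtggag "
                "aagcaacgag gccaccgtgt 1560")
    row_seq10 = ("taaagtacaa aaacaaattc gaaacgatga gcctgtggag "
                 "aagcaatgag gccacagtgt 1560")
    a = parse_listing([row_seq9], start=1500)
    b = parse_listing([row_seq10], start=1500)
    print(diff_positions(a, b, offset=1500))
```

Checking the running count on every row catches dropped or duplicated blocks, which are a common failure when a listing is copied out of a rendered page.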
Claims (10)
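1. A polynucleotide encoding the HPV 35 L1 protein, characterized in that the sequence of the polynucleotide is as shown in SEQ ID NO: 2.
2. A recombinant expression vector, characterized in that the recombinant expression vector contains the polynucleotide of claim 1.
3. A host cell, characterized in that the host cell contains, or has integrated therein, the recombinant expression vector of claim 2.
4. The host cell according to claim 3, characterized in that the host cell is a yeast; preferably a methylotrophic yeast; more preferably Hansenula polymorpha.
5. A method for producing the HPV 35 L1 protein, comprising the following steps: constructing a recombinant Hansenula strain that has integrated, or contains, a polynucleotide whose nucleotide sequence is as shown in SEQ ID NO: 2; culturing the strain; collecting the cells; disrupting the cells to obtain a lysate; and separating and purifying the lysate to obtain the HPV 35 L1 protein.
6. The method for producing the HPV 35 L1 protein according to claim 5, characterized by further comprising one or more of the following features:
1) the polynucleotide is integrated into a plasmid, and the plasmid is integrated into the genome of the recombinant Hansenula strain;
2) the culture conditions include: pH 5.0 to 7.0, fermentation temperature 30 to 37 °C, stirring speed ≤ 950 rpm, air flow ≤ 2.0 VVM, tank pressure ≤ 0.10 MPa, and dissolved oxygen of 10% or more;
3) the recombinant Hansenula strain is cultured in a medium containing glycerol; during culture, once the glycerol in the medium has been exhausted and the wet cell weight exceeds 100 g/L, glycerol feeding begins at a rate of 200 to 600 g/h; once the wet cell weight exceeds 200 g/L, methanol is added in a single dose to 0.5% (w/v), starting the methanol induction phase; once the methanol has been completely consumed and dissolved oxygen has risen to 80%, fed-batch addition of methanol begins, and the methanol feed rate is adjusted stepwise as the cells use methanol faster; dissolved oxygen is kept at 20% or more throughout induction; fermentation ends after 30 to 50 h of induction, when the wet cell weight has reached 300 to 400 g/L;
4) the separation and purification means passing the cell lysate first through cation-exchange chromatography and then through CHT chromatography.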
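The feeding schedule in feature 3) of claim 6 is a sequence of threshold-triggered phase changes. The following Python is a minimal sketch of that decision logic, not the applicant's control software. The class `TankState`, the function `next_step`, the returned labels, and the exact end rule (stop at 50 h, or from 30 h once wet weight is at least 300 g/L) are assumptions introduced here for illustration; the claim itself gives only the ranges.

```python
from dataclasses import dataclass


@dataclass
class TankState:
    """Snapshot of the fermenter (hypothetical field names)."""
    wet_weight_g_per_l: float
    glycerol_exhausted: bool = False
    methanol_pulse_added: bool = False   # single dose to 0.5% (w/v) given
    methanol_consumed: bool = False      # that dose has been used up
    methanol_feed_started: bool = False  # fed-batch methanol is running
    dissolved_oxygen_pct: float = 100.0
    induction_hours: float = 0.0


def next_step(s: TankState) -> str:
    """Map a tank snapshot to the next action described in claim 6, item 3."""
    if s.methanol_pulse_added:
        # Induction phase: the claim ends fermentation after 30 to 50 h,
        # at 300 to 400 g/L wet weight. The rule below is an assumed reading.
        if s.induction_hours >= 50 or (
            s.induction_hours >= 30 and s.wet_weight_g_per_l >= 300
        ):
            return "end_fermentation"
        # Fed-batch methanol starts once the dose is consumed and dissolved
        # oxygen has risen to 80%; once started it simply continues.
        if s.methanol_feed_started or (
            s.methanol_consumed and s.dissolved_oxygen_pct >= 80
        ):
            return "feed_methanol"
        return "wait_for_methanol_consumption"
    if s.wet_weight_g_per_l > 200:
        return "add_methanol_pulse"
    if s.glycerol_exhausted and s.wet_weight_g_per_l > 100:
        return "feed_glycerol"  # claim gives 200 to 600 g/h
    return "batch_growth"


if __name__ == "__main__":
    print(next_step(TankState(120, glycerol_exhausted=True)))
```

The sketch leaves out the dissolved-oxygen floor of 20% during induction and the stepwise adjustment of the methanol feed rate, since the claim does not state how either is controlled.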
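7. An HPV 35 L1 protein, obtained by the method for producing the HPV 35 L1 protein according to any one of claims 5-6.
8. Use of the polynucleotide encoding the HPV 35 L1 protein of claim 1, or the recombinant expression vector of claim 2, or the host cell of claim 3, or the HPV 35 L1 protein of claim 7, in the preparation of an HPV vaccine.
9. A method for preparing an anti-HPV vaccine, comprising the following steps: preparing the HPV 35 L1 protein by the method for producing the HPV 35 L1 protein according to any one of claims 5-6, and adding a pharmaceutically acceptable vaccine adjuvant.
10. An anti-HPV vaccine, obtained by the method for preparing an anti-HPV vaccine of claim 9.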
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110442669.6A CN113106107A (zh) | 2021-04-23 | 2021-04-23 | Polynucleotide expressing HPV 35 L1, and expression vector, host cell and use thereof |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110442669.6A CN113106107A (zh) | 2021-04-23 | 2021-04-23 | Polynucleotide expressing HPV 35 L1, and expression vector, host cell and use thereof |
Publications (1)
Publication Number | Publication Date |
---|---|
CN113106107A (zh) | 2021-07-13 |
Family ID: 76719924
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110442669.6A Pending CN113106107A (zh) | Polynucleotide expressing HPV 35 L1, and expression vector, host cell and use thereof | 2021-04-23 | 2021-04-23 |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113106107A (zh) |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
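CN113073105A (zh) * | 2021-03-23 | 2021-07-06 | 重庆博唯佰泰生物制药有限公司 | Polynucleotide sequence expressing HPV 56 L1, and expression vector, host cell and use thereof |
CN113604482A (zh) * | 2021-08-25 | 2021-11-05 | 重庆博唯佰泰生物制药有限公司 | Polynucleotide expressing HPV 68 L1, and expression vector, host cell and use thereof |
CN113667683A (zh) * | 2021-08-25 | 2021-11-19 | 上海博唯生物科技有限公司 | Polynucleotide expressing HPV 39 L1, and expression vector, host cell and use thereof |
WO2024140161A1 (zh) * | 2022-12-28 | 2024-07-04 | 北京康乐卫士生物技术股份有限公司 | Expression of human papillomavirus HPV35 L1 protein, virus-like particles, and preparation method thereof |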
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109251236A (zh) * | 2017-07-14 | 2019-01-22 | 厦门大学 | Mutant of human papillomavirus type 35 L1 protein |
WO2021013073A1 (zh) * | 2019-07-19 | 2021-01-28 | 神州细胞工程有限公司 | Chimeric papillomavirus L1 protein |
CN112680462A (zh) * | 2020-12-29 | 2021-04-20 | 上海博唯生物科技有限公司 | Human papillomavirus type 35 (HPV35) L1/L2 and preparation and use thereof |
Non-Patent Citations (4)
Title |
---|
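Elizabeth Webb et al.: "Cervical Cancer-causing Human Papillomaviruses have an Alternative Initiation Site for the L1 Protein", Virus Genes, vol. 30, 31 December 2005 (2005-12-31), pages 31-35, XP036779243, DOI: 10.1007/s11262-004-4579-8 * |
Ye Jiang et al.: "Concise Course in Genetic Engineering" (《基因工程简明教程》), vol. 1, East China University of Science and Technology Press, page 215 * |
Gao Bo et al.: "Preparation and immunogenicity of virus-like particles of human papillomavirus type 31 and 33 L1 proteins", Chinese Journal of Biologicals (《中国生物制品学杂志》) * |
Gao Bo et al.: "Preparation and immunogenicity of virus-like particles of human papillomavirus type 31 and 33 L1 proteins", Chinese Journal of Biologicals (《中国生物制品学杂志》), vol. 27, no. 12, 31 December 2014 (2014-12-31), pages 1508-1511 * |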
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
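CN113073105A (zh) * | 2021-03-23 | 2021-07-06 | 重庆博唯佰泰生物制药有限公司 | Polynucleotide sequence expressing HPV 56 L1, and expression vector, host cell and use thereof |
CN113604482A (zh) * | 2021-08-25 | 2021-11-05 | 重庆博唯佰泰生物制药有限公司 | Polynucleotide expressing HPV 68 L1, and expression vector, host cell and use thereof |
CN113667683A (zh) * | 2021-08-25 | 2021-11-19 | 上海博唯生物科技有限公司 | Polynucleotide expressing HPV 39 L1, and expression vector, host cell and use thereof |
CN113604482B (zh) * | 2021-08-25 | 2023-02-07 | 重庆博唯佰泰生物制药有限公司 | Polynucleotide expressing HPV 68 L1, and expression vector, host cell and use thereof |
CN113667683B (zh) * | 2021-08-25 | 2023-02-10 | 上海博唯生物科技有限公司 | Polynucleotide expressing HPV 39 L1, and expression vector, host cell and use thereof |
WO2024140161A1 (zh) * | 2022-12-28 | 2024-07-04 | 北京康乐卫士生物技术股份有限公司 | Expression of human papillomavirus HPV35 L1 protein, virus-like particles, and preparation method thereof |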
Similar Documents
Publication | Publication Date | Title |
---|---|---|
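CN113106107A (zh) | Polynucleotide expressing HPV 35 L1, and expression vector, host cell and use thereof | |
CN106867975B (zh) | Newcastle disease virus chimeric virus-like particle, vaccine and preparation method | |
JP2016104015A (ja) | Recombinant virus-like particles encoded by a multi-gene vector | |
CN113481115A (zh) | Recombinant Pichia pastoris expressing human α-lactalbumin, and construction method and use thereof | |
CN108342362A (zh) | Stable MDCK cell line for amplifying recombinant canine adenovirus CAV2 and construction method thereof | |
CN109797168A (zh) | Recombinant adenovirus expression vector SAd23-L | |
CN113201550B (zh) | Polynucleotide expressing HPV 51 L1, and expression vector, host cell and use thereof | |
WO2024051150A1 (zh) | Human cytomegalovirus recombinant vector, and preparation method and use thereof | |
CN113088527B (zh) | Polynucleotide expressing HPV 53 L1, and expression vector, host cell and use thereof | |
CN106755087B (zh) | Recombinant cell line stably expressing classical swine fever virus E2 protein, preparation method, use, and classical swine fever virus subunit vaccine | |
CN102827875B (zh) | Recombinant adenovirus expressing mouse nerve growth factor and preparation method thereof | |
CN113774071B (zh) | Polynucleotide expressing HPV 66 L1, and expression vector, host cell and use thereof | |
CN110358733B (zh) | Cell line stably expressing subgroup A avian leukosis virus gp85 protein and use thereof | |
CN113073105B (zh) | Polynucleotide sequence expressing HPV 56 L1, and expression vector, host cell and use thereof | |
CN113151311B (zh) | Polynucleotide expressing HPV 59 L1, and expression vector, host cell and use thereof | |
CN113480645B (zh) | Anti-Helicobacter pylori recombinant antibody, preparation method and use | |
CN106754982B (zh) | Replication-restricted West Nile virus system expressing green fluorescent protein and use thereof | |
CN109134624A (zh) | Avian influenza virus hemagglutinin antigen, preparation method and use thereof, and avian influenza vaccine | |
CN103215267B (zh) | siRNA inhibiting influenza virus-related genes and use thereof | |
CN113061167B (zh) | Rabbit hemorrhagic disease virus recombinant antigen and use thereof | |
CN113667683A (zh) | Polynucleotide expressing HPV 39 L1, and expression vector, host cell and use thereof | |
CN113501866A (zh) | Duck Tembusu virus inhibitor | |
CN109738648B (zh) | Engineered cell line stably and efficiently expressing antibody against hepatitis C virus core antigen and use thereof | |
CN101671688A (zh) | EGFP-fused β2 adrenergic receptor, recombinant expression vector thereof and construction method thereof | |
CN109985235A (zh) | Genetically engineered subunit vaccine for chicken infectious bronchitis |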
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | ||
Application publication date: 20210713 |