KR20190013627A - Yeast strains and methods for controlling hydroxylation of recombinant collagen - Google Patents
Yeast strains and methods for controlling hydroxylation of recombinant collagen
- Publication number
- KR20190013627A (application number KR1020180088161A)
- Authority
- KR
- South Korea
- Prior art keywords
- gly
- collagen
- pro
- dna
- yeast
- Prior art date
Links
- 102000008186 Collagen Human genes 0.000 title claims abstract description 428
- 108010035532 Collagen Proteins 0.000 title claims abstract description 428
- 229920001436 collagen Polymers 0.000 title claims abstract description 428
- 240000004808 Saccharomyces cerevisiae Species 0.000 title claims abstract description 166
- 238000000034 method Methods 0.000 title claims abstract description 104
- 230000033444 hydroxylation Effects 0.000 title description 29
- 238000005805 hydroxylation reaction Methods 0.000 title description 29
- 108020004414 DNA Proteins 0.000 claims abstract description 224
- 239000013598 vector Substances 0.000 claims abstract description 128
- 108091028043 Nucleic acid sequence Proteins 0.000 claims abstract description 114
- OKKJLVBELUTLKV-UHFFFAOYSA-N Methanol Chemical compound OC OKKJLVBELUTLKV-UHFFFAOYSA-N 0.000 claims description 87
- 239000003550 marker Substances 0.000 claims description 49
- 102000008109 Mixed Function Oxygenases Human genes 0.000 claims description 38
- 108010074633 Mixed Function Oxygenases Proteins 0.000 claims description 38
- 241000283690 Bos taurus Species 0.000 claims description 34
- 102100036352 Protein disulfide-isomerase Human genes 0.000 claims description 34
- 102100040477 Prolyl 4-hydroxylase subunit alpha-1 Human genes 0.000 claims description 32
- 230000002201 biotropic effect Effects 0.000 claims description 30
- 101000614345 Homo sapiens Prolyl 4-hydroxylase subunit alpha-1 Proteins 0.000 claims description 29
- 241000235648 Pichia Species 0.000 claims description 29
- 108020004705 Codon Proteins 0.000 claims description 28
- 230000003115 biocidal effect Effects 0.000 claims description 24
- 238000004519 manufacturing process Methods 0.000 claims description 18
- PEDCQBHIVMGVHV-UHFFFAOYSA-N Glycerine Chemical compound OCC(O)CO PEDCQBHIVMGVHV-UHFFFAOYSA-N 0.000 claims description 15
- 241000235070 Saccharomyces Species 0.000 claims description 15
- 241000222120 Candida <Saccharomycetales> Species 0.000 claims description 14
- 241001337994 Cryptococcus <scale insect> Species 0.000 claims description 14
- BRZYSWJRSDMWLG-CAXSIQPQSA-N geneticin Chemical compound O1C[C@@](O)(C)[C@H](NC)[C@@H](O)[C@H]1O[C@@H]1[C@@H](O)[C@H](O[C@@H]2[C@@H]([C@@H](O)[C@H](O)[C@@H](C(C)O)O2)N)[C@@H](N)C[C@H]1N BRZYSWJRSDMWLG-CAXSIQPQSA-N 0.000 claims description 13
- 102100036826 Aldehyde oxidase Human genes 0.000 claims description 12
- 241000270722 Crocodylidae Species 0.000 claims description 12
- 101000928314 Homo sapiens Aldehyde oxidase Proteins 0.000 claims description 12
- 235000002198 Annona diversifolia Nutrition 0.000 claims description 11
- 241000086550 Dinosauria Species 0.000 claims description 11
- 241001416177 Vicugna pacos Species 0.000 claims description 11
- 241000270728 Alligator Species 0.000 claims description 10
- 241000894006 Bacteria Species 0.000 claims description 10
- 239000008280 blood Substances 0.000 claims description 10
- 210000004369 blood Anatomy 0.000 claims description 10
- 244000303258 Annona diversifolia Species 0.000 claims description 9
- 241000283070 Equus zebra Species 0.000 claims description 9
- 241000282819 Giraffa Species 0.000 claims description 9
- 241000406668 Loxodonta cyclotis Species 0.000 claims description 9
- 241000289581 Macropus sp. Species 0.000 claims description 9
- 108020005091 Replication Origin Proteins 0.000 claims description 9
- 230000010354 integration Effects 0.000 claims description 9
- 235000019687 Lamb Nutrition 0.000 claims description 8
- LNNWVNGFPYWNQE-GMIGKAJZSA-N desomorphine Chemical compound C1C2=CC=C(O)C3=C2[C@]24CCN(C)[C@H]1[C@@H]2CCC[C@@H]4O3 LNNWVNGFPYWNQE-GMIGKAJZSA-N 0.000 claims description 8
- XLYOFNOQVPJJNP-UHFFFAOYSA-M hydroxide Chemical compound [OH-] XLYOFNOQVPJJNP-UHFFFAOYSA-M 0.000 claims description 8
- 238000006243 chemical reaction Methods 0.000 claims description 7
- 238000004520 electroporation Methods 0.000 claims description 7
- 229920002791 poly-4-hydroxybutyrate Polymers 0.000 claims description 7
- -1 zeaxin Natural products 0.000 claims description 7
- YQYJSBFKSSDGFO-UHFFFAOYSA-N Epihygromycin Natural products OC1C(O)C(C(=O)C)OC1OC(C(=C1)O)=CC=C1C=C(C)C(=O)NC1C(O)C(O)C2OCOC2C1O YQYJSBFKSSDGFO-UHFFFAOYSA-N 0.000 claims description 6
- WQZGKKKJIJFFOK-GASJEMHNSA-N Glucose Natural products OC[C@H]1OC(O)[C@H](O)[C@@H](O)[C@@H]1O WQZGKKKJIJFFOK-GASJEMHNSA-N 0.000 claims description 6
- 102000003505 Myosin Human genes 0.000 claims description 6
- 108060008487 Myosin Proteins 0.000 claims description 6
- 230000010076 replication Effects 0.000 claims description 6
- 239000001888 Peptone Substances 0.000 claims description 5
- 108010080698 Peptones Proteins 0.000 claims description 5
- 229940041514 candida albicans extract Drugs 0.000 claims description 5
- 238000010367 cloning Methods 0.000 claims description 5
- 239000008121 dextrose Substances 0.000 claims description 5
- 235000019319 peptone Nutrition 0.000 claims description 5
- 239000012138 yeast extract Substances 0.000 claims description 5
- 230000000994 depressogenic effect Effects 0.000 claims description 4
- 230000013011 mating Effects 0.000 claims description 4
- 230000008569 process Effects 0.000 claims description 4
- 239000003242 anti bacterial agent Substances 0.000 claims description 3
- 230000001580 bacterial effect Effects 0.000 claims description 3
- 150000001720 carbohydrates Chemical class 0.000 claims description 2
- 101001072202 Homo sapiens Protein disulfide-isomerase Proteins 0.000 claims 2
- 230000002611 ovarian Effects 0.000 claims 2
- 206010010071 Coma Diseases 0.000 claims 1
- 241000316848 Rhodococcus <scale insect> Species 0.000 claims 1
- 102000004190 Enzymes Human genes 0.000 abstract description 25
- 108090000790 Enzymes Proteins 0.000 abstract description 25
- 230000000640 hydroxylating effect Effects 0.000 abstract description 2
- VPZXBVLAVMBEQI-UHFFFAOYSA-N glycyl-DL-alpha-alanine Natural products OC(=O)C(C)NC(=O)CN VPZXBVLAVMBEQI-UHFFFAOYSA-N 0.000 description 154
- 108010014614 prolyl-glycyl-proline Proteins 0.000 description 141
- KZNQNBZMBZJQJO-UHFFFAOYSA-N N-glycyl-L-proline Natural products NCC(=O)N1CCCC1C(O)=O KZNQNBZMBZJQJO-UHFFFAOYSA-N 0.000 description 128
- 235000014680 Saccharomyces cerevisiae Nutrition 0.000 description 127
- 108010029020 prolylglycine Proteins 0.000 description 107
- 108010087846 prolyl-prolyl-glycine Proteins 0.000 description 93
- OOCFXNOVSLSHAB-IUCAKERBSA-N Gly-Pro-Pro Chemical compound NCC(=O)N1CCC[C@H]1C(=O)N1[C@H](C(O)=O)CCC1 OOCFXNOVSLSHAB-IUCAKERBSA-N 0.000 description 91
- 108010079364 N-glycylalanine Proteins 0.000 description 86
- LEIKGVHQTKHOLM-IUCAKERBSA-N Pro-Pro-Gly Chemical compound OC(=O)CNC(=O)[C@@H]1CCCN1C(=O)[C@H]1NCCC1 LEIKGVHQTKHOLM-IUCAKERBSA-N 0.000 description 84
- BRPMXFSTKXXNHF-IUCAKERBSA-N (2s)-1-[2-[[(2s)-pyrrolidine-2-carbonyl]amino]acetyl]pyrrolidine-2-carboxylic acid Chemical compound OC(=O)[C@@H]1CCCN1C(=O)CNC(=O)[C@H]1NCCC1 BRPMXFSTKXXNHF-IUCAKERBSA-N 0.000 description 75
- UGTHTQWIQKEDEH-BQBZGAKWSA-N L-alanyl-L-prolylglycine zwitterion Chemical compound C[C@H](N)C(=O)N1CCC[C@H]1C(=O)NCC(O)=O UGTHTQWIQKEDEH-BQBZGAKWSA-N 0.000 description 65
- DHMQDGOQFOQNFH-UHFFFAOYSA-N Glycine Chemical compound NCC(O)=O DHMQDGOQFOQNFH-UHFFFAOYSA-N 0.000 description 59
- 108010047495 alanylglycine Proteins 0.000 description 58
- 108010077515 glycylproline Proteins 0.000 description 58
- 108010069020 alanyl-prolyl-glycine Proteins 0.000 description 56
- JYPCXBJRLBHWME-UHFFFAOYSA-N glycyl-L-prolyl-L-arginine Natural products NCC(=O)N1CCCC1C(=O)NC(CCCN=C(N)N)C(O)=O JYPCXBJRLBHWME-UHFFFAOYSA-N 0.000 description 55
- CAVKXZMMDNOZJU-UHFFFAOYSA-N Gly-Pro-Ala-Gly-Pro Natural products C1CCC(C(O)=O)N1C(=O)CNC(=O)C(C)NC(=O)C1CCCN1C(=O)CN CAVKXZMMDNOZJU-UHFFFAOYSA-N 0.000 description 47
- 210000004027 cell Anatomy 0.000 description 45
- 108090000623 proteins and genes Proteins 0.000 description 45
- 108010002311 N-glycylglutamic acid Proteins 0.000 description 41
- KIZQGKLMXKGDIV-BQBZGAKWSA-N Pro-Ala-Gly Chemical compound OC(=O)CNC(=O)[C@H](C)NC(=O)[C@@H]1CCCN1 KIZQGKLMXKGDIV-BQBZGAKWSA-N 0.000 description 41
- 108091033319 polynucleotide Proteins 0.000 description 39
- 102000040430 polynucleotide Human genes 0.000 description 39
- 239000002157 polynucleotide Substances 0.000 description 39
- JBCLFWXMTIKCCB-UHFFFAOYSA-N H-Gly-Phe-OH Natural products NCC(=O)NC(C(O)=O)CC1=CC=CC=C1 JBCLFWXMTIKCCB-UHFFFAOYSA-N 0.000 description 38
- 108010009111 arginyl-glycyl-glutamic acid Proteins 0.000 description 38
- RHAPJNVNWDBFQI-BQBZGAKWSA-N Ser-Pro-Gly Chemical compound OC[C@H](N)C(=O)N1CCC[C@H]1C(=O)NCC(O)=O RHAPJNVNWDBFQI-BQBZGAKWSA-N 0.000 description 37
- 108010051307 glycyl-glycyl-proline Proteins 0.000 description 37
- DXTOOBDIIAJZBJ-BQBZGAKWSA-N Pro-Gly-Ser Chemical compound [H]N1CCC[C@H]1C(=O)NCC(=O)N[C@@H](CO)C(O)=O DXTOOBDIIAJZBJ-BQBZGAKWSA-N 0.000 description 35
- GGLIDLCEPDHEJO-BQBZGAKWSA-N Gly-Pro-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@@H]1CCCN1C(=O)CN GGLIDLCEPDHEJO-BQBZGAKWSA-N 0.000 description 34
- QXPRJQPCFXMCIY-NKWVEPMBSA-N Gly-Ala-Pro Chemical compound C[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)CN QXPRJQPCFXMCIY-NKWVEPMBSA-N 0.000 description 33
- 108010078144 glutaminyl-glycine Proteins 0.000 description 33
- 108020003519 protein disulfide isomerase Proteins 0.000 description 32
- BNBBNGZZKQUWCD-IUCAKERBSA-N Pro-Arg-Gly Chemical compound NC(N)=NCCC[C@@H](C(=O)NCC(O)=O)NC(=O)[C@@H]1CCCN1 BNBBNGZZKQUWCD-IUCAKERBSA-N 0.000 description 30
- KKBWDNZXYLGJEY-UHFFFAOYSA-N Gly-Arg-Pro Natural products NCC(=O)NC(CCNC(=N)N)C(=O)N1CCCC1C(=O)O KKBWDNZXYLGJEY-UHFFFAOYSA-N 0.000 description 29
- 230000014509 gene expression Effects 0.000 description 29
- 108010061238 threonyl-glycine Proteins 0.000 description 29
- 108010047857 aspartylglycine Proteins 0.000 description 28
- MQIGTEQXYCRLGK-BQBZGAKWSA-N Ala-Gly-Pro Chemical compound C[C@H](N)C(=O)NCC(=O)N1CCC[C@H]1C(O)=O MQIGTEQXYCRLGK-BQBZGAKWSA-N 0.000 description 27
- VYWNORHENYEQDW-YUMQZZPRSA-N Pro-Gly-Glu Chemical compound OC(=O)CC[C@@H](C(O)=O)NC(=O)CNC(=O)[C@@H]1CCCN1 VYWNORHENYEQDW-YUMQZZPRSA-N 0.000 description 27
- 108010091092 arginyl-glycyl-proline Proteins 0.000 description 27
- JYPCXBJRLBHWME-IUCAKERBSA-N Gly-Pro-Arg Chemical compound NCC(=O)N1CCC[C@H]1C(=O)N[C@@H](CCCNC(N)=N)C(O)=O JYPCXBJRLBHWME-IUCAKERBSA-N 0.000 description 26
- UCBPDSYUVAAHCD-UWVGGRQHSA-N Leu-Pro-Gly Chemical compound CC(C)C[C@H](N)C(=O)N1CCC[C@H]1C(=O)NCC(O)=O UCBPDSYUVAAHCD-UWVGGRQHSA-N 0.000 description 26
- CLNJSLSHKJECME-BQBZGAKWSA-N Pro-Gly-Ala Chemical compound OC(=O)[C@H](C)NC(=O)CNC(=O)[C@@H]1CCCN1 CLNJSLSHKJECME-BQBZGAKWSA-N 0.000 description 26
- NLKVNZUFDPWPNL-YUMQZZPRSA-N Glu-Arg-Gly Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)NCC(O)=O NLKVNZUFDPWPNL-YUMQZZPRSA-N 0.000 description 25
- 108010076504 Protein Sorting Signals Proteins 0.000 description 25
- 108090000765 processed proteins & peptides Proteins 0.000 description 25
- 229940088598 enzyme Drugs 0.000 description 24
- 125000001500 prolyl group Chemical group [H]N1C([H])(C(=O)[*])C([H])([H])C([H])([H])C1([H])[H] 0.000 description 24
- 102000004169 proteins and genes Human genes 0.000 description 24
- ABPRMMYHROQBLY-NKWVEPMBSA-N Gly-Ser-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CO)NC(=O)CN)C(=O)O ABPRMMYHROQBLY-NKWVEPMBSA-N 0.000 description 23
- LCMWVZLBCUVDAZ-IUCAKERBSA-N Lys-Gly-Glu Chemical compound [NH3+]CCCC[C@H]([NH3+])C(=O)NCC(=O)N[C@H](C([O-])=O)CCC([O-])=O LCMWVZLBCUVDAZ-IUCAKERBSA-N 0.000 description 23
- KDXKERNSBIXSRK-UHFFFAOYSA-N Lysine Natural products NCCCCC(N)C(O)=O KDXKERNSBIXSRK-UHFFFAOYSA-N 0.000 description 23
- XKUKSGPZAADMRA-UHFFFAOYSA-N glycyl-glycyl-glycine Natural products NCC(=O)NCC(=O)NCC(O)=O XKUKSGPZAADMRA-UHFFFAOYSA-N 0.000 description 23
- 102000004196 processed proteins & peptides Human genes 0.000 description 23
- MMJJFXWMCMJMQA-STQMWFEESA-N Phe-Pro-Gly Chemical compound C([C@H](N)C(=O)N1[C@@H](CCC1)C(=O)NCC(O)=O)C1=CC=CC=C1 MMJJFXWMCMJMQA-STQMWFEESA-N 0.000 description 22
- ZPPVJIJMIKTERM-YUMQZZPRSA-N Pro-Gln-Gly Chemical compound OC(=O)CNC(=O)[C@H](CCC(=O)N)NC(=O)[C@@H]1CCCN1 ZPPVJIJMIKTERM-YUMQZZPRSA-N 0.000 description 22
- ABSSTGUCBCDKMU-UWVGGRQHSA-N Pro-Lys-Gly Chemical compound NCCCC[C@@H](C(=O)NCC(O)=O)NC(=O)[C@@H]1CCCN1 ABSSTGUCBCDKMU-UWVGGRQHSA-N 0.000 description 22
- 229920001184 polypeptide Polymers 0.000 description 22
- 241000282326 Felis catus Species 0.000 description 21
- PUUYVMYCMIWHFE-BQBZGAKWSA-N Gly-Ala-Arg Chemical compound NCC(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CCCN=C(N)N PUUYVMYCMIWHFE-BQBZGAKWSA-N 0.000 description 21
- 235000018102 proteins Nutrition 0.000 description 21
- 108010069502 Collagen Type III Proteins 0.000 description 20
- 102000001187 Collagen Type III Human genes 0.000 description 20
- BUEFQXUHTUZXHR-LURJTMIESA-N Gly-Gly-Pro zwitterion Chemical compound NCC(=O)NCC(=O)N1CCC[C@H]1C(O)=O BUEFQXUHTUZXHR-LURJTMIESA-N 0.000 description 20
- KDXKERNSBIXSRK-YFKPBYRVSA-N L-lysine Chemical compound NCCCC[C@H](N)C(O)=O KDXKERNSBIXSRK-YFKPBYRVSA-N 0.000 description 20
- 102000057297 Pepsin A Human genes 0.000 description 20
- 108090000284 Pepsin A Proteins 0.000 description 20
- BGWKULMLUIUPKY-BQBZGAKWSA-N Pro-Ser-Gly Chemical compound OC(=O)CNC(=O)[C@H](CO)NC(=O)[C@@H]1CCCN1 BGWKULMLUIUPKY-BQBZGAKWSA-N 0.000 description 20
- 238000010586 diagram Methods 0.000 description 20
- 108010050848 glycylleucine Proteins 0.000 description 20
- 239000010985 leather Substances 0.000 description 20
- 230000004048 modification Effects 0.000 description 20
- 238000012986 modification Methods 0.000 description 20
- 229940111202 pepsin Drugs 0.000 description 20
- AUFHLLPVPSMEOG-YUMQZZPRSA-N Arg-Gly-Glu Chemical compound NC(N)=NCCC[C@H](N)C(=O)NCC(=O)N[C@@H](CCC(O)=O)C(O)=O AUFHLLPVPSMEOG-YUMQZZPRSA-N 0.000 description 19
- LOEANKRDMMVOGZ-YUMQZZPRSA-N Gly-Lys-Asp Chemical compound NCCCC[C@H](NC(=O)CN)C(=O)N[C@@H](CC(O)=O)C(O)=O LOEANKRDMMVOGZ-YUMQZZPRSA-N 0.000 description 19
- 108010025801 glycyl-prolyl-arginine Proteins 0.000 description 19
- 108010064235 lysylglycine Proteins 0.000 description 19
- 239000002609 medium Substances 0.000 description 19
- SCAKQYSGEIHPLV-IUCAKERBSA-N (4S)-4-[(2-aminoacetyl)amino]-5-[(2S)-2-(carboxymethylcarbamoyl)pyrrolidin-1-yl]-5-oxopentanoic acid Chemical compound NCC(=O)N[C@@H](CCC(O)=O)C(=O)N1CCC[C@H]1C(=O)NCC(O)=O SCAKQYSGEIHPLV-IUCAKERBSA-N 0.000 description 18
- 239000004472 Lysine Substances 0.000 description 18
- SUHLZMHFRALVSY-YUMQZZPRSA-N Ala-Lys-Gly Chemical compound NCCCC[C@H](NC(=O)[C@@H](N)C)C(=O)NCC(O)=O SUHLZMHFRALVSY-YUMQZZPRSA-N 0.000 description 17
- SWQALSGKVLYKDT-UHFFFAOYSA-N Gly-Ile-Ala Natural products NCC(=O)NC(C(C)CC)C(=O)NC(C)C(O)=O SWQALSGKVLYKDT-UHFFFAOYSA-N 0.000 description 17
- 108010024078 alanyl-glycyl-serine Proteins 0.000 description 17
- 108010020755 prolyl-glycyl-glycine Proteins 0.000 description 17
- CYXCAHZVPFREJD-LURJTMIESA-N Arg-Gly-Gly Chemical compound NC(=N)NCCC[C@H](N)C(=O)NCC(=O)NCC(O)=O CYXCAHZVPFREJD-LURJTMIESA-N 0.000 description 16
- JMVQDLDPDBXAAX-YUMQZZPRSA-N Pro-Gly-Gln Chemical compound NC(=O)CC[C@@H](C(O)=O)NC(=O)CNC(=O)[C@@H]1CCCN1 JMVQDLDPDBXAAX-YUMQZZPRSA-N 0.000 description 16
- FEPSEIDIPBMIOS-QXEWZRGKSA-N Pro-Gly-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)CNC(=O)[C@@H]1CCCN1 FEPSEIDIPBMIOS-QXEWZRGKSA-N 0.000 description 16
- 239000000203 mixture Substances 0.000 description 16
- PEZMQPADLFXCJJ-ZETCQYMHSA-N 2-[[2-[[(2s)-1-(2-aminoacetyl)pyrrolidine-2-carbonyl]amino]acetyl]amino]acetic acid Chemical compound NCC(=O)N1CCC[C@H]1C(=O)NCC(=O)NCC(O)=O PEZMQPADLFXCJJ-ZETCQYMHSA-N 0.000 description 15
- UGVQELHRNUDMAA-BYPYZUCNSA-N Gly-Ala-Gly Chemical compound [NH3+]CC(=O)N[C@@H](C)C(=O)NCC([O-])=O UGVQELHRNUDMAA-BYPYZUCNSA-N 0.000 description 15
- VXKCPBPQEKKERH-IUCAKERBSA-N Gly-Arg-Pro Chemical compound NC(N)=NCCC[C@H](NC(=O)CN)C(=O)N1CCC[C@H]1C(O)=O VXKCPBPQEKKERH-IUCAKERBSA-N 0.000 description 15
- IEGFSKKANYKBDU-QWHCGFSZSA-N Gly-Phe-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CC2=CC=CC=C2)NC(=O)CN)C(=O)O IEGFSKKANYKBDU-QWHCGFSZSA-N 0.000 description 15
- HJARVELKOSZUEW-YUMQZZPRSA-N Gly-Pro-Gln Chemical compound [H]NCC(=O)N1CCC[C@H]1C(=O)N[C@@H](CCC(N)=O)C(O)=O HJARVELKOSZUEW-YUMQZZPRSA-N 0.000 description 15
- ONIBWKKTOPOVIA-BYPYZUCNSA-N L-Proline Chemical compound OC(=O)[C@@H]1CCCN1 ONIBWKKTOPOVIA-BYPYZUCNSA-N 0.000 description 15
- ONIBWKKTOPOVIA-UHFFFAOYSA-N Proline Natural products OC(=O)C1CCCN1 ONIBWKKTOPOVIA-UHFFFAOYSA-N 0.000 description 15
- 108010043005 Prolyl Hydroxylases Proteins 0.000 description 15
- 102000004079 Prolyl Hydroxylases Human genes 0.000 description 15
- 108010026333 seryl-proline Proteins 0.000 description 15
- ZVFVBBGVOILKPO-WHFBIAKZSA-N Ala-Gly-Ala Chemical compound C[C@H](N)C(=O)NCC(=O)N[C@@H](C)C(O)=O ZVFVBBGVOILKPO-WHFBIAKZSA-N 0.000 description 14
- PABFFPWEJMEVEC-JGVFFNPUSA-N Gly-Gln-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CCC(=O)N)NC(=O)CN)C(=O)O PABFFPWEJMEVEC-JGVFFNPUSA-N 0.000 description 14
- FKLSMYYLJHYPHH-UWVGGRQHSA-N Pro-Gly-Leu Chemical compound [H]N1CCC[C@H]1C(=O)NCC(=O)N[C@@H](CC(C)C)C(O)=O FKLSMYYLJHYPHH-UWVGGRQHSA-N 0.000 description 14
- 108010081551 glycylphenylalanine Proteins 0.000 description 14
- KMSHNDWHPWXPEC-BQBZGAKWSA-N Arg-Asp-Gly Chemical compound NC(N)=NCCC[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)NCC(O)=O KMSHNDWHPWXPEC-BQBZGAKWSA-N 0.000 description 13
- KGHLGJAXYSVNJP-WHFBIAKZSA-N Asp-Ser-Gly Chemical compound OC(=O)C[C@H](N)C(=O)N[C@@H](CO)C(=O)NCC(O)=O KGHLGJAXYSVNJP-WHFBIAKZSA-N 0.000 description 13
- 108010022452 Collagen Type I Proteins 0.000 description 13
- 102000012422 Collagen Type I Human genes 0.000 description 13
- RJIVPOXLQFJRTG-LURJTMIESA-N Gly-Arg-Gly Chemical compound OC(=O)CNC(=O)[C@@H](NC(=O)CN)CCCN=C(N)N RJIVPOXLQFJRTG-LURJTMIESA-N 0.000 description 13
- NSVOVKWEKGEOQB-LURJTMIESA-N Gly-Pro-Gly Chemical compound NCC(=O)N1CCC[C@H]1C(=O)NCC(O)=O NSVOVKWEKGEOQB-LURJTMIESA-N 0.000 description 13
- 108010057821 leucylproline Proteins 0.000 description 13
- 210000005253 yeast cell Anatomy 0.000 description 13
- LBOVBQONZJRWPV-YUMQZZPRSA-N Asp-Lys-Gly Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CCCCN)C(=O)NCC(O)=O LBOVBQONZJRWPV-YUMQZZPRSA-N 0.000 description 12
- CQAHWYDHKUWYIX-YUMQZZPRSA-N Glu-Pro-Gly Chemical compound OC(=O)CC[C@H](N)C(=O)N1CCC[C@H]1C(=O)NCC(O)=O CQAHWYDHKUWYIX-YUMQZZPRSA-N 0.000 description 12
- DHDOADIPGZTAHT-YUMQZZPRSA-N Gly-Glu-Arg Chemical compound NCC(=O)N[C@@H](CCC(O)=O)C(=O)N[C@H](C(O)=O)CCCN=C(N)N DHDOADIPGZTAHT-YUMQZZPRSA-N 0.000 description 12
- UUYBFNKHOCJCHT-VHSXEESVSA-N Gly-Leu-Pro Chemical compound CC(C)C[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)CN UUYBFNKHOCJCHT-VHSXEESVSA-N 0.000 description 12
- 108010076324 alanyl-glycyl-glycine Proteins 0.000 description 12
- 125000003275 alpha amino acid group Chemical group 0.000 description 12
- 239000013604 expression vector Substances 0.000 description 12
- 239000000463 material Substances 0.000 description 12
- 108010031719 prolyl-serine Proteins 0.000 description 12
- JBGSZRYCXBPWGX-BQBZGAKWSA-N Ala-Arg-Gly Chemical compound OC(=O)CNC(=O)[C@@H](NC(=O)[C@@H](N)C)CCCN=C(N)N JBGSZRYCXBPWGX-BQBZGAKWSA-N 0.000 description 11
- BLIMFWGRQKRCGT-YUMQZZPRSA-N Ala-Gly-Lys Chemical compound C[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CCCCN BLIMFWGRQKRCGT-YUMQZZPRSA-N 0.000 description 11
- AQPVUEJJARLJHB-BQBZGAKWSA-N Arg-Gly-Ala Chemical compound OC(=O)[C@H](C)NC(=O)CNC(=O)[C@@H](N)CCCN=C(N)N AQPVUEJJARLJHB-BQBZGAKWSA-N 0.000 description 11
- NSORZJXKUQFEKL-JGVFFNPUSA-N Gln-Gly-Pro Chemical compound C1C[C@@H](N(C1)C(=O)CNC(=O)[C@H](CCC(=O)N)N)C(=O)O NSORZJXKUQFEKL-JGVFFNPUSA-N 0.000 description 11
- MBOAPAXLTUSMQI-JHEQGTHGSA-N Gly-Glu-Thr Chemical compound [H]NCC(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O MBOAPAXLTUSMQI-JHEQGTHGSA-N 0.000 description 11
- GLACUWHUYFBSPJ-FJXKBIBVSA-N Gly-Pro-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1C(=O)CN GLACUWHUYFBSPJ-FJXKBIBVSA-N 0.000 description 11
- KDGARKCAKHBEDB-NKWVEPMBSA-N Ser-Gly-Pro Chemical compound C1C[C@@H](N(C1)C(=O)CNC(=O)[C@H](CO)N)C(=O)O KDGARKCAKHBEDB-NKWVEPMBSA-N 0.000 description 11
- TYVAWPFQYFPSBR-BFHQHQDPSA-N Thr-Ala-Gly Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C)C(=O)NCC(O)=O TYVAWPFQYFPSBR-BFHQHQDPSA-N 0.000 description 11
- 235000001014 amino acid Nutrition 0.000 description 11
- 150000001413 amino acids Chemical class 0.000 description 11
- 108010069926 arginyl-glycyl-serine Proteins 0.000 description 11
- 108010037850 glycylvaline Proteins 0.000 description 11
- 210000003491 skin Anatomy 0.000 description 11
- 239000013638 trimer Substances 0.000 description 11
- NBTGEURICRTMGL-WHFBIAKZSA-N Ala-Gly-Ser Chemical compound C[C@H](N)C(=O)NCC(=O)N[C@@H](CO)C(O)=O NBTGEURICRTMGL-WHFBIAKZSA-N 0.000 description 10
- GKKUBLFXKRDMFC-BQBZGAKWSA-N Asn-Pro-Gly Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N1CCC[C@H]1C(=O)NCC(O)=O GKKUBLFXKRDMFC-BQBZGAKWSA-N 0.000 description 10
- HFXJIZNEXNIZIJ-BQBZGAKWSA-N Gly-Glu-Gln Chemical compound NCC(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(N)=O)C(O)=O HFXJIZNEXNIZIJ-BQBZGAKWSA-N 0.000 description 10
- SJRUJQFQVLMZFW-WPRPVWTQSA-N Val-Pro-Gly Chemical compound CC(C)[C@H](N)C(=O)N1CCC[C@H]1C(=O)NCC(O)=O SJRUJQFQVLMZFW-WPRPVWTQSA-N 0.000 description 10
- 108010077245 asparaginyl-proline Proteins 0.000 description 10
- 210000001519 tissue Anatomy 0.000 description 10
- 108010000998 wheylin-2 peptide Proteins 0.000 description 10
- OQCWXQJLCDPRHV-UWVGGRQHSA-N Arg-Gly-Leu Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)NCC(=O)N[C@@H](CC(C)C)C(O)=O OQCWXQJLCDPRHV-UWVGGRQHSA-N 0.000 description 9
- GPSHCSTUYOQPAI-JHEQGTHGSA-N Glu-Thr-Gly Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)NCC(O)=O GPSHCSTUYOQPAI-JHEQGTHGSA-N 0.000 description 9
- JBRBACJPBZNFMF-YUMQZZPRSA-N Gly-Ala-Lys Chemical compound NCC(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CCCCN JBRBACJPBZNFMF-YUMQZZPRSA-N 0.000 description 9
- PDUHNKAFQXQNLH-ZETCQYMHSA-N Gly-Lys-Gly Chemical compound NCCCC[C@H](NC(=O)CN)C(=O)NCC(O)=O PDUHNKAFQXQNLH-ZETCQYMHSA-N 0.000 description 9
- GWCJMBNBFYBQCV-XPUUQOCRSA-N Gly-Val-Ala Chemical compound NCC(=O)N[C@@H](C(C)C)C(=O)N[C@@H](C)C(O)=O GWCJMBNBFYBQCV-XPUUQOCRSA-N 0.000 description 9
- 241000235058 Komagataella pastoris Species 0.000 description 9
- RFQATBGBLDAKGI-VHSXEESVSA-N Lys-Gly-Pro Chemical compound C1C[C@@H](N(C1)C(=O)CNC(=O)[C@H](CCCCN)N)C(=O)O RFQATBGBLDAKGI-VHSXEESVSA-N 0.000 description 9
- IXKSXJFAGXLQOQ-XISFHERQSA-N WHWLQLKPGQPMY Chemical compound C([C@@H](C(=O)N[C@@H](CC=1C2=CC=CC=C2NC=1)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(C)C)C(=O)N1CCC[C@H]1C(=O)NCC(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(O)=O)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CC=1C=CC(O)=CC=1)C(O)=O)NC(=O)[C@@H](N)CC=1C2=CC=CC=C2NC=1)C1=CNC=N1 IXKSXJFAGXLQOQ-XISFHERQSA-N 0.000 description 9
- 230000000694 effects Effects 0.000 description 9
- 125000003588 lysine group Chemical group [H]N([H])C([H])([H])C([H])([H])C([H])([H])C([H])([H])C([H])(N([H])[H])C(*)=O 0.000 description 9
- 230000028327 secretion Effects 0.000 description 9
- RLMISHABBKUNFO-WHFBIAKZSA-N Ala-Ala-Gly Chemical compound C[C@H](N)C(=O)N[C@@H](C)C(=O)NCC(O)=O RLMISHABBKUNFO-WHFBIAKZSA-N 0.000 description 8
- OPEPUCYIGFEGSW-WDSKDSINSA-N Asn-Gly-Glu Chemical compound [H]N[C@@H](CC(N)=O)C(=O)NCC(=O)N[C@@H](CCC(O)=O)C(O)=O OPEPUCYIGFEGSW-WDSKDSINSA-N 0.000 description 8
- OLVIPTLKNSAYRJ-YUMQZZPRSA-N Asn-Gly-Lys Chemical compound C(CCN)C[C@@H](C(=O)O)NC(=O)CNC(=O)[C@H](CC(=O)N)N OLVIPTLKNSAYRJ-YUMQZZPRSA-N 0.000 description 8
- DTNUIAJCPRMNBT-WHFBIAKZSA-N Asp-Gly-Ala Chemical compound [H]N[C@@H](CC(O)=O)C(=O)NCC(=O)N[C@@H](C)C(O)=O DTNUIAJCPRMNBT-WHFBIAKZSA-N 0.000 description 8
- OMMIEVATLAGRCK-BYPYZUCNSA-N Asp-Gly-Gly Chemical compound OC(=O)C[C@H](N)C(=O)NCC(=O)NCC(O)=O OMMIEVATLAGRCK-BYPYZUCNSA-N 0.000 description 8
- HICVMZCGVFKTPM-BQBZGAKWSA-N Asp-Pro-Gly Chemical compound OC(=O)C[C@H](N)C(=O)N1CCC[C@H]1C(=O)NCC(O)=O HICVMZCGVFKTPM-BQBZGAKWSA-N 0.000 description 8
- JJQGZGOEDSSHTE-FOHZUACHSA-N Asp-Thr-Gly Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)NCC(O)=O JJQGZGOEDSSHTE-FOHZUACHSA-N 0.000 description 8
- 102000009842 Fibril-Associated Collagens Human genes 0.000 description 8
- 108010020305 Fibril-Associated Collagens Proteins 0.000 description 8
- HQRHFUYMGCHHJS-LURJTMIESA-N Gly-Gly-Arg Chemical compound NCC(=O)NCC(=O)N[C@H](C(O)=O)CCCN=C(N)N HQRHFUYMGCHHJS-LURJTMIESA-N 0.000 description 8
- SSFWXSNOKDZNHY-QXEWZRGKSA-N Gly-Pro-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1C(=O)CN SSFWXSNOKDZNHY-QXEWZRGKSA-N 0.000 description 8
- HAOUOFNNJJLVNS-BQBZGAKWSA-N Gly-Pro-Ser Chemical compound NCC(=O)N1CCC[C@H]1C(=O)N[C@@H](CO)C(O)=O HAOUOFNNJJLVNS-BQBZGAKWSA-N 0.000 description 8
- 108010065920 Insulin Lispro Proteins 0.000 description 8
- LZDNBBYBDGBADK-UHFFFAOYSA-N L-valyl-L-tryptophan Natural products C1=CC=C2C(CC(NC(=O)C(N)C(C)C)C(O)=O)=CNC2=C1 LZDNBBYBDGBADK-UHFFFAOYSA-N 0.000 description 8
- GPJGFSFYBJGYRX-YUMQZZPRSA-N Lys-Gly-Asp Chemical compound NCCCC[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CC(O)=O GPJGFSFYBJGYRX-YUMQZZPRSA-N 0.000 description 8
- VSJAPSMRFYUOKS-IUCAKERBSA-N Met-Pro-Gly Chemical compound CSCC[C@H](N)C(=O)N1CCC[C@H]1C(=O)NCC(O)=O VSJAPSMRFYUOKS-IUCAKERBSA-N 0.000 description 8
- 241001465754 Metazoa Species 0.000 description 8
- ULIWFCCJIOEHMU-BQBZGAKWSA-N Pro-Gly-Asp Chemical compound OC(=O)C[C@@H](C(O)=O)NC(=O)CNC(=O)[C@@H]1CCCN1 ULIWFCCJIOEHMU-BQBZGAKWSA-N 0.000 description 8
- DCHQYSOGURGJST-FJXKBIBVSA-N Pro-Thr-Gly Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H]([C@@H](C)O)C(=O)NCC(O)=O DCHQYSOGURGJST-FJXKBIBVSA-N 0.000 description 8
- 108010087924 alanylproline Proteins 0.000 description 8
- 238000012258 culturing Methods 0.000 description 8
- 108010063718 gamma-glutamylaspartic acid Proteins 0.000 description 8
- 239000000047 product Substances 0.000 description 8
- 108010015796 prolylisoleucine Proteins 0.000 description 8
- 108010020504 2-Oxoglutarate 5-Dioxygenase Procollagen-Lysine Proteins 0.000 description 7
- 102000008490 2-Oxoglutarate 5-Dioxygenase Procollagen-Lysine Human genes 0.000 description 7
- CVGNCMIULZNYES-WHFBIAKZSA-N Ala-Asn-Gly Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC(N)=O)C(=O)NCC(O)=O CVGNCMIULZNYES-WHFBIAKZSA-N 0.000 description 7
- RTZCUEHYUQZIDE-WHFBIAKZSA-N Ala-Ser-Gly Chemical compound C[C@H](N)C(=O)N[C@@H](CO)C(=O)NCC(O)=O RTZCUEHYUQZIDE-WHFBIAKZSA-N 0.000 description 7
- 108010051330 Arg-Pro-Gly-Pro Proteins 0.000 description 7
- HYQYLOSCICEYTR-YUMQZZPRSA-N Asn-Gly-Leu Chemical compound [H]N[C@@H](CC(N)=O)C(=O)NCC(=O)N[C@@H](CC(C)C)C(O)=O HYQYLOSCICEYTR-YUMQZZPRSA-N 0.000 description 7
- CKRUHITYRFNUKW-WDSKDSINSA-N Glu-Asn-Gly Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(N)=O)C(=O)NCC(O)=O CKRUHITYRFNUKW-WDSKDSINSA-N 0.000 description 7
- PYTZFYUXZZHOAD-WHFBIAKZSA-N Gly-Ala-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H](C)NC(=O)CN PYTZFYUXZZHOAD-WHFBIAKZSA-N 0.000 description 7
- YOBGUCWZPXJHTN-BQBZGAKWSA-N Gly-Ser-Arg Chemical compound NCC(=O)N[C@@H](CO)C(=O)N[C@H](C(O)=O)CCCN=C(N)N YOBGUCWZPXJHTN-BQBZGAKWSA-N 0.000 description 7
- QCBYAHHNOHBXIH-UWVGGRQHSA-N His-Pro-Gly Chemical compound C([C@H](N)C(=O)N1[C@@H](CCC1)C(=O)NCC(O)=O)C1=CN=CN1 QCBYAHHNOHBXIH-UWVGGRQHSA-N 0.000 description 7
- KCTIFOCXAIUQQK-QXEWZRGKSA-N Ile-Pro-Gly Chemical compound CC[C@H](C)[C@H](N)C(=O)N1CCC[C@H]1C(=O)NCC(O)=O KCTIFOCXAIUQQK-QXEWZRGKSA-N 0.000 description 7
- 108091005804 Peptidases Proteins 0.000 description 7
- UIMCLYYSUCIUJM-UWVGGRQHSA-N Pro-Gly-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)CNC(=O)[C@@H]1CCCN1 UIMCLYYSUCIUJM-UWVGGRQHSA-N 0.000 description 7
- XQSREVQDGCPFRJ-STQMWFEESA-N Pro-Gly-Phe Chemical compound [H]N1CCC[C@H]1C(=O)NCC(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O XQSREVQDGCPFRJ-STQMWFEESA-N 0.000 description 7
- AFXCXDQNRXTSBD-FJXKBIBVSA-N Pro-Gly-Thr Chemical compound [H]N1CCC[C@H]1C(=O)NCC(=O)N[C@@H]([C@@H](C)O)C(O)=O AFXCXDQNRXTSBD-FJXKBIBVSA-N 0.000 description 7
- 239000004365 Protease Substances 0.000 description 7
- 101710137510 Saimiri transformation-associated protein Proteins 0.000 description 7
- UQFYNFTYDHUIMI-WHFBIAKZSA-N Ser-Gly-Ala Chemical compound OC(=O)[C@H](C)NC(=O)CNC(=O)[C@@H](N)CO UQFYNFTYDHUIMI-WHFBIAKZSA-N 0.000 description 7
- 108010062796 arginyllysine Proteins 0.000 description 7
- 108010010430 asparagine-proline-alanine Proteins 0.000 description 7
- 239000012634 fragment Substances 0.000 description 7
- 108010015792 glycyllysine Proteins 0.000 description 7
- 108020004999 messenger RNA Proteins 0.000 description 7
- 108010005942 methionylglycine Proteins 0.000 description 7
- 239000002773 nucleotide Substances 0.000 description 7
- 125000003729 nucleotide group Chemical group 0.000 description 7
- PAIHPOGPJVUFJY-WDSKDSINSA-N Ala-Glu-Gly Chemical compound C[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)NCC(O)=O PAIHPOGPJVUFJY-WDSKDSINSA-N 0.000 description 6
- VHAQSYHSDKERBS-XPUUQOCRSA-N Ala-Val-Gly Chemical compound C[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)NCC(O)=O VHAQSYHSDKERBS-XPUUQOCRSA-N 0.000 description 6
- HAVKMRGWNXMCDR-STQMWFEESA-N Arg-Gly-Phe Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)NCC(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O HAVKMRGWNXMCDR-STQMWFEESA-N 0.000 description 6
- ZATRYQNPUHGXCU-DTWKUNHWSA-N Arg-Gly-Pro Chemical compound C1C[C@@H](N(C1)C(=O)CNC(=O)[C@H](CCCN=C(N)N)N)C(=O)O ZATRYQNPUHGXCU-DTWKUNHWSA-N 0.000 description 6
- NMRHDSAOIURTNT-RWMBFGLXSA-N Arg-Leu-Pro Chemical compound CC(C)C[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CCCN=C(N)N)N NMRHDSAOIURTNT-RWMBFGLXSA-N 0.000 description 6
- AXXCUABIFZPKPM-BQBZGAKWSA-N Asp-Arg-Gly Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)NCC(O)=O AXXCUABIFZPKPM-BQBZGAKWSA-N 0.000 description 6
- RYKWOUUZJFSJOH-FXQIFTODSA-N Asp-Gln-Glu Chemical compound C(CC(=O)N)[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)O)NC(=O)[C@H](CC(=O)O)N RYKWOUUZJFSJOH-FXQIFTODSA-N 0.000 description 6
- 108091026890 Coding region Proteins 0.000 description 6
- 102000000503 Collagen Type II Human genes 0.000 description 6
- 108010041390 Collagen Type II Proteins 0.000 description 6
- LHMSYHSAAJOEBL-CIUDSAMLSA-N Cys-Lys-Asn Chemical compound [H]N[C@@H](CS)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(N)=O)C(O)=O LHMSYHSAAJOEBL-CIUDSAMLSA-N 0.000 description 6
- CAXGCBSRJLADPD-FXQIFTODSA-N Cys-Pro-Asn Chemical compound [H]N[C@@H](CS)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC(N)=O)C(O)=O CAXGCBSRJLADPD-FXQIFTODSA-N 0.000 description 6
- 108010072062 GEKG peptide Proteins 0.000 description 6
- OAGVHWYIBZMWLA-YFKPBYRVSA-N Glu-Gly-Gly Chemical compound OC(=O)CC[C@H](N)C(=O)NCC(=O)NCC(O)=O OAGVHWYIBZMWLA-YFKPBYRVSA-N 0.000 description 6
- GUOWMVFLAJNPDY-CIUDSAMLSA-N Glu-Ser-Met Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CCSC)C(O)=O GUOWMVFLAJNPDY-CIUDSAMLSA-N 0.000 description 6
- IWAXHBCACVWNHT-BQBZGAKWSA-N Gly-Asp-Arg Chemical compound NCC(=O)N[C@@H](CC(O)=O)C(=O)N[C@H](C(O)=O)CCCN=C(N)N IWAXHBCACVWNHT-BQBZGAKWSA-N 0.000 description 6
- MHHUEAIBJZWDBH-YUMQZZPRSA-N Gly-Asp-Lys Chemical compound C(CCN)C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)O)NC(=O)CN MHHUEAIBJZWDBH-YUMQZZPRSA-N 0.000 description 6
- DTRUBYPMMVPQPD-YUMQZZPRSA-N Gly-Gln-Arg Chemical compound [H]NCC(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O DTRUBYPMMVPQPD-YUMQZZPRSA-N 0.000 description 6
- QSVCIFZPGLOZGH-WDSKDSINSA-N Gly-Glu-Ser Chemical compound NCC(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CO)C(O)=O QSVCIFZPGLOZGH-WDSKDSINSA-N 0.000 description 6
- GAAHQHNCMIAYEX-UWVGGRQHSA-N Gly-Pro-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1C(=O)CN GAAHQHNCMIAYEX-UWVGGRQHSA-N 0.000 description 6
- BMWFDYIYBAFROD-WPRPVWTQSA-N Gly-Pro-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1C(=O)CN BMWFDYIYBAFROD-WPRPVWTQSA-N 0.000 description 6
- IALQAMYQJBZNSK-WHFBIAKZSA-N Gly-Ser-Asn Chemical compound [H]NCC(=O)N[C@@H](CO)C(=O)N[C@@H](CC(N)=O)C(O)=O IALQAMYQJBZNSK-WHFBIAKZSA-N 0.000 description 6
- ZIMTWPHIKZEHSE-UWVGGRQHSA-N His-Arg-Gly Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CCCNC(N)=N)C(=O)NCC(O)=O ZIMTWPHIKZEHSE-UWVGGRQHSA-N 0.000 description 6
- MDOBWSFNSNPENN-PMVVWTBXSA-N His-Thr-Gly Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H]([C@@H](C)O)C(=O)NCC(O)=O MDOBWSFNSNPENN-PMVVWTBXSA-N 0.000 description 6
- ISHNZELVUVPCHY-ZETCQYMHSA-N Lys-Gly-Gly Chemical compound NCCCC[C@H](N)C(=O)NCC(=O)NCC(O)=O ISHNZELVUVPCHY-ZETCQYMHSA-N 0.000 description 6
- XOQMURBBIXRRCR-SRVKXCTJSA-N Lys-Lys-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H](CCCCN)NC(=O)[C@@H](N)CCCCN XOQMURBBIXRRCR-SRVKXCTJSA-N 0.000 description 6
- VHTOGMKQXXJOHG-RHYQMDGZSA-N Lys-Thr-Val Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C(C)C)C(O)=O VHTOGMKQXXJOHG-RHYQMDGZSA-N 0.000 description 6
- MVBZBRKNZVJEKK-DTWKUNHWSA-N Met-Gly-Pro Chemical compound CSCC[C@@H](C(=O)NCC(=O)N1CCC[C@@H]1C(=O)O)N MVBZBRKNZVJEKK-DTWKUNHWSA-N 0.000 description 6
- AXHNAGAYRGCDLG-UWVGGRQHSA-N Met-Lys-Gly Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)NCC(O)=O AXHNAGAYRGCDLG-UWVGGRQHSA-N 0.000 description 6
- XZFYRXDAULDNFX-UHFFFAOYSA-N N-L-cysteinyl-L-phenylalanine Natural products SCC(N)C(=O)NC(C(O)=O)CC1=CC=CC=C1 XZFYRXDAULDNFX-UHFFFAOYSA-N 0.000 description 6
- XMBSYZWANAQXEV-UHFFFAOYSA-N N-alpha-L-glutamyl-L-phenylalanine Natural products OC(=O)CCC(N)C(=O)NC(C(O)=O)CC1=CC=CC=C1 XMBSYZWANAQXEV-UHFFFAOYSA-N 0.000 description 6
- 102000035195 Peptidases Human genes 0.000 description 6
- GDBOREPXIRKSEQ-FHWLQOOXSA-N Phe-Gln-Phe Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O GDBOREPXIRKSEQ-FHWLQOOXSA-N 0.000 description 6
- VGTJSEYTVMAASM-RPTUDFQQSA-N Phe-Thr-Tyr Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O VGTJSEYTVMAASM-RPTUDFQQSA-N 0.000 description 6
- STGVYUTZKGPRCI-GUBZILKMSA-N Pro-Val-Cys Chemical compound SC[C@@H](C(O)=O)NC(=O)[C@H](C(C)C)NC(=O)[C@@H]1CCCN1 STGVYUTZKGPRCI-GUBZILKMSA-N 0.000 description 6
- FUOGXAQMNJMBFG-WPRPVWTQSA-N Pro-Val-Gly Chemical compound OC(=O)CNC(=O)[C@H](C(C)C)NC(=O)[C@@H]1CCCN1 FUOGXAQMNJMBFG-WPRPVWTQSA-N 0.000 description 6
- UCTIUWKCVNGEFH-OBJOEFQTSA-N Pro-Val-Gly-Pro Chemical compound N([C@@H](C(C)C)C(=O)NCC(=O)N1[C@@H](CCC1)C(O)=O)C(=O)[C@@H]1CCCN1 UCTIUWKCVNGEFH-OBJOEFQTSA-N 0.000 description 6
- 108010079005 RDV peptide Proteins 0.000 description 6
- CDVFZMOFNJPUDD-ACZMJKKPSA-N Ser-Gln-Asn Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O CDVFZMOFNJPUDD-ACZMJKKPSA-N 0.000 description 6
- MUARUIBTKQJKFY-WHFBIAKZSA-N Ser-Gly-Asp Chemical compound [H]N[C@@H](CO)C(=O)NCC(=O)N[C@@H](CC(O)=O)C(O)=O MUARUIBTKQJKFY-WHFBIAKZSA-N 0.000 description 6
- FHXGMDRKJHKLKW-QWRGUYRKSA-N Ser-Tyr-Gly Chemical compound OC[C@H](N)C(=O)N[C@H](C(=O)NCC(O)=O)CC1=CC=C(O)C=C1 FHXGMDRKJHKLKW-QWRGUYRKSA-N 0.000 description 6
- 108010077465 Tropocollagen Proteins 0.000 description 6
- YSGAPESOXHFTQY-IHRRRGAJSA-N Tyr-Met-Asp Chemical compound CSCC[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)O)NC(=O)[C@H](CC1=CC=C(C=C1)O)N YSGAPESOXHFTQY-IHRRRGAJSA-N 0.000 description 6
- ASQFIHTXXMFENG-XPUUQOCRSA-N Val-Ala-Gly Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](C)C(=O)NCC(O)=O ASQFIHTXXMFENG-XPUUQOCRSA-N 0.000 description 6
- YTPLVNUZZOBFFC-SCZZXKLOSA-N Val-Gly-Pro Chemical compound CC(C)[C@H](N)C(=O)NCC(=O)N1CCC[C@@H]1C(O)=O YTPLVNUZZOBFFC-SCZZXKLOSA-N 0.000 description 6
- KOSRFJWDECSPRO-UHFFFAOYSA-N alpha-L-glutamyl-L-glutamic acid Natural products OC(=O)CCC(N)C(=O)NC(CCC(O)=O)C(O)=O KOSRFJWDECSPRO-UHFFFAOYSA-N 0.000 description 6
- 108010040443 aspartyl-aspartic acid Proteins 0.000 description 6
- 210000000845 cartilage Anatomy 0.000 description 6
- 108010006664 gamma-glutamyl-glycyl-glycine Proteins 0.000 description 6
- XBGGUPMXALFZOT-UHFFFAOYSA-N glycyl-L-tyrosine hemihydrate Natural products NCC(=O)NC(C(O)=O)CC1=CC=C(O)C=C1 XBGGUPMXALFZOT-UHFFFAOYSA-N 0.000 description 6
- 108010043293 glycyl-prolyl-glycyl-glycine Proteins 0.000 description 6
- 108010027338 isoleucylcysteine Proteins 0.000 description 6
- 108010083708 leucyl-aspartyl-valine Proteins 0.000 description 6
- 239000013612 plasmid Substances 0.000 description 6
- 238000013518 transcription Methods 0.000 description 6
- 230000035897 transcription Effects 0.000 description 6
- FWMNVWWHGCHHJJ-SKKKGAJSSA-N 4-amino-1-[(2r)-6-amino-2-[[(2r)-2-[[(2r)-2-[[(2r)-2-amino-3-phenylpropanoyl]amino]-3-phenylpropanoyl]amino]-4-methylpentanoyl]amino]hexanoyl]piperidine-4-carboxylic acid Chemical compound C([C@H](C(=O)N[C@H](CC(C)C)C(=O)N[C@H](CCCCN)C(=O)N1CCC(N)(CC1)C(O)=O)NC(=O)[C@H](N)CC=1C=CC=CC=1)C1=CC=CC=C1 FWMNVWWHGCHHJJ-SKKKGAJSSA-N 0.000 description 5
- LMFXXZPPZDCPTA-ZKWXMUAHSA-N Ala-Gly-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)CNC(=O)[C@H](C)N LMFXXZPPZDCPTA-ZKWXMUAHSA-N 0.000 description 5
- HQIZDMIGUJOSNI-IUCAKERBSA-N Arg-Gly-Arg Chemical compound N[C@@H](CCCNC(N)=N)C(=O)NCC(=O)N[C@@H](CCCNC(N)=N)C(O)=O HQIZDMIGUJOSNI-IUCAKERBSA-N 0.000 description 5
- VJTWLBMESLDOMK-WDSKDSINSA-N Asn-Gln-Gly Chemical compound NC(=O)C[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(=O)NCC(O)=O VJTWLBMESLDOMK-WDSKDSINSA-N 0.000 description 5
- UGXYFDQFLVCDFC-CIUDSAMLSA-N Asn-Ser-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](CO)NC(=O)[C@@H](N)CC(N)=O UGXYFDQFLVCDFC-CIUDSAMLSA-N 0.000 description 5
- XBQSLMACWDXWLJ-GHCJXIJMSA-N Asp-Ala-Ile Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](C)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O XBQSLMACWDXWLJ-GHCJXIJMSA-N 0.000 description 5
- 102000004266 Collagen Type IV Human genes 0.000 description 5
- 108010042086 Collagen Type IV Proteins 0.000 description 5
- FBPFZTCFMRRESA-FSIIMWSLSA-N D-Glucitol Natural products OC[C@H](O)[C@H](O)[C@@H](O)[C@H](O)CO FBPFZTCFMRRESA-FSIIMWSLSA-N 0.000 description 5
- 108010067193 Formaldehyde transketolase Proteins 0.000 description 5
- FGYPOQPQTUNESW-IUCAKERBSA-N Gln-Gly-Leu Chemical compound CC(C)C[C@@H](C(=O)O)NC(=O)CNC(=O)[C@H](CCC(=O)N)N FGYPOQPQTUNESW-IUCAKERBSA-N 0.000 description 5
- LWYUQLZOIORFFJ-XKBZYTNZSA-N Glu-Thr-Cys Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CS)C(=O)O)NC(=O)[C@H](CCC(=O)O)N)O LWYUQLZOIORFFJ-XKBZYTNZSA-N 0.000 description 5
- GRIRDMVMJJDZKV-RCOVLWMOSA-N Gly-Asn-Val Chemical compound [H]NCC(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](C(C)C)C(O)=O GRIRDMVMJJDZKV-RCOVLWMOSA-N 0.000 description 5
- YFGONBOFGGWKKY-VHSXEESVSA-N Gly-His-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CC2=CN=CN2)NC(=O)CN)C(=O)O YFGONBOFGGWKKY-VHSXEESVSA-N 0.000 description 5
- PAWIVEIWWYGBAM-YUMQZZPRSA-N Gly-Leu-Ala Chemical compound NCC(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](C)C(O)=O PAWIVEIWWYGBAM-YUMQZZPRSA-N 0.000 description 5
- IRJWAYCXIYUHQE-WHFBIAKZSA-N Gly-Ser-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H](CO)NC(=O)CN IRJWAYCXIYUHQE-WHFBIAKZSA-N 0.000 description 5
- OHUKZZYSJBKFRR-WHFBIAKZSA-N Gly-Ser-Asp Chemical compound [H]NCC(=O)N[C@@H](CO)C(=O)N[C@@H](CC(O)=O)C(O)=O OHUKZZYSJBKFRR-WHFBIAKZSA-N 0.000 description 5
- ZLCLYFGMKFCDCN-XPUUQOCRSA-N Gly-Ser-Val Chemical compound CC(C)[C@H](NC(=O)[C@H](CO)NC(=O)CN)C(O)=O ZLCLYFGMKFCDCN-XPUUQOCRSA-N 0.000 description 5
- PMMYEEVYMWASQN-DMTCNVIQSA-N Hydroxyproline Chemical compound O[C@H]1CN[C@H](C(O)=O)C1 PMMYEEVYMWASQN-DMTCNVIQSA-N 0.000 description 5
- GQKSJYINYYWPMR-NGZCFLSTSA-N Ile-Gly-Pro Chemical compound CC[C@H](C)[C@@H](C(=O)NCC(=O)N1CCC[C@@H]1C(=O)O)N GQKSJYINYYWPMR-NGZCFLSTSA-N 0.000 description 5
- 241000880493 Leptailurus serval Species 0.000 description 5
- KSZCCRIGNVSHFH-UWVGGRQHSA-N Leu-Arg-Gly Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCCNC(N)=N)C(=O)NCC(O)=O KSZCCRIGNVSHFH-UWVGGRQHSA-N 0.000 description 5
- DPWGZWUMUUJQDT-IUCAKERBSA-N Leu-Gln-Gly Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(=O)NCC(O)=O DPWGZWUMUUJQDT-IUCAKERBSA-N 0.000 description 5
- NCTDKZKNBDZDOL-GARJFASQSA-N Lys-Asn-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CC(=O)N)NC(=O)[C@H](CCCCN)N)C(=O)O NCTDKZKNBDZDOL-GARJFASQSA-N 0.000 description 5
- UETQMSASAVBGJY-QWRGUYRKSA-N Lys-Gly-His Chemical compound NCCCC[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CC1=CNC=N1 UETQMSASAVBGJY-QWRGUYRKSA-N 0.000 description 5
- HMZPYMSEAALNAE-ULQDDVLXSA-N Lys-Val-Tyr Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O HMZPYMSEAALNAE-ULQDDVLXSA-N 0.000 description 5
- YBAFDPFAUTYYRW-UHFFFAOYSA-N N-L-alpha-glutamyl-L-leucine Natural products CC(C)CC(C(O)=O)NC(=O)C(N)CCC(O)=O YBAFDPFAUTYYRW-UHFFFAOYSA-N 0.000 description 5
- DMKWYMWNEKIPFC-IUCAKERBSA-N Pro-Gly-Arg Chemical compound [H]N1CCC[C@H]1C(=O)NCC(=O)N[C@@H](CCCNC(N)=N)C(O)=O DMKWYMWNEKIPFC-IUCAKERBSA-N 0.000 description 5
- WSRWHZRUOCACLJ-UWVGGRQHSA-N Pro-Gly-His Chemical compound C([C@@H](C(=O)O)NC(=O)CNC(=O)[C@H]1NCCC1)C1=CN=CN1 WSRWHZRUOCACLJ-UWVGGRQHSA-N 0.000 description 5
- HAEGAELAYWSUNC-WPRPVWTQSA-N Pro-Gly-Val Chemical compound [H]N1CCC[C@H]1C(=O)NCC(=O)N[C@@H](C(C)C)C(O)=O HAEGAELAYWSUNC-WPRPVWTQSA-N 0.000 description 5
- XYHMFGGWNOFUOU-QXEWZRGKSA-N Pro-Ile-Gly Chemical compound OC(=O)CNC(=O)[C@H]([C@@H](C)CC)NC(=O)[C@@H]1CCCN1 XYHMFGGWNOFUOU-QXEWZRGKSA-N 0.000 description 5
- WIPAMEKBSHNFQE-IUCAKERBSA-N Pro-Met-Gly Chemical compound CSCC[C@@H](C(=O)NCC(=O)O)NC(=O)[C@@H]1CCCN1 WIPAMEKBSHNFQE-IUCAKERBSA-N 0.000 description 5
- HQTKVSCNCDLXSX-BQBZGAKWSA-N Ser-Arg-Gly Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCCNC(N)=N)C(=O)NCC(O)=O HQTKVSCNCDLXSX-BQBZGAKWSA-N 0.000 description 5
- BPMRXBZYPGYPJN-WHFBIAKZSA-N Ser-Gly-Asn Chemical compound [H]N[C@@H](CO)C(=O)NCC(=O)N[C@@H](CC(N)=O)C(O)=O BPMRXBZYPGYPJN-WHFBIAKZSA-N 0.000 description 5
- MIJWOJAXARLEHA-WDSKDSINSA-N Ser-Gly-Glu Chemical compound OC[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CCC(O)=O MIJWOJAXARLEHA-WDSKDSINSA-N 0.000 description 5
- MSIYNSBKKVMGFO-BHNWBGBOSA-N Thr-Gly-Pro Chemical compound C[C@H]([C@@H](C(=O)NCC(=O)N1CCC[C@@H]1C(=O)O)N)O MSIYNSBKKVMGFO-BHNWBGBOSA-N 0.000 description 5
- 238000007792 addition Methods 0.000 description 5
- 108010042598 glutamyl-aspartyl-glycine Proteins 0.000 description 5
- 108010049041 glutamylalanine Proteins 0.000 description 5
- 108010062266 glycyl-glycyl-argininal Proteins 0.000 description 5
- 108010067216 glycyl-glycyl-glycine Proteins 0.000 description 5
- 108010026364 glycyl-glycyl-leucine Proteins 0.000 description 5
- 108010082286 glycyl-seryl-alanine Proteins 0.000 description 5
- 125000002887 hydroxy group Chemical group [H]O* 0.000 description 5
- 238000011534 incubation Methods 0.000 description 5
- 238000002844 melting Methods 0.000 description 5
- 230000008018 melting Effects 0.000 description 5
- 239000000600 sorbitol Substances 0.000 description 5
- 238000006467 substitution reaction Methods 0.000 description 5
- 230000009466 transformation Effects 0.000 description 5
- CWFMWBHMIMNZLN-NAKRPEOUSA-N (2s)-1-[(2s)-2-[[(2s,3s)-2-amino-3-methylpentanoyl]amino]propanoyl]pyrrolidine-2-carboxylic acid Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H](C)C(=O)N1CCC[C@H]1C(O)=O CWFMWBHMIMNZLN-NAKRPEOUSA-N 0.000 description 4
- CUVSTAMIHSSVKL-UWVGGRQHSA-N (4s)-4-[(2-aminoacetyl)amino]-5-[[(2s)-6-amino-1-(carboxymethylamino)-1-oxohexan-2-yl]amino]-5-oxopentanoic acid Chemical compound NCCCC[C@@H](C(=O)NCC(O)=O)NC(=O)[C@H](CCC(O)=O)NC(=O)CN CUVSTAMIHSSVKL-UWVGGRQHSA-N 0.000 description 4
- OTEWWRBKGONZBW-UHFFFAOYSA-N 2-[[2-[[2-[(2-azaniumylacetyl)amino]-4-methylpentanoyl]amino]acetyl]amino]acetate Chemical compound NCC(=O)NC(CC(C)C)C(=O)NCC(=O)NCC(O)=O OTEWWRBKGONZBW-UHFFFAOYSA-N 0.000 description 4
- RXTBLQVXNIECFP-FXQIFTODSA-N Ala-Gln-Gln Chemical compound C[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCC(N)=O)C(O)=O RXTBLQVXNIECFP-FXQIFTODSA-N 0.000 description 4
- HMRWQTHUDVXMGH-GUBZILKMSA-N Ala-Glu-Lys Chemical compound C[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@H](C(O)=O)CCCCN HMRWQTHUDVXMGH-GUBZILKMSA-N 0.000 description 4
- NIZKGBJVCMRDKO-KWQFWETISA-N Ala-Gly-Tyr Chemical compound C[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 NIZKGBJVCMRDKO-KWQFWETISA-N 0.000 description 4
- ZSOICJZJSRWNHX-ACZMJKKPSA-N Ala-Ile Chemical compound CC[C@H](C)[C@@H](C([O-])=O)NC(=O)[C@H](C)[NH3+] ZSOICJZJSRWNHX-ACZMJKKPSA-N 0.000 description 4
- OMDNCNKNEGFOMM-BQBZGAKWSA-N Ala-Met-Gly Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCSC)C(=O)NCC(O)=O OMDNCNKNEGFOMM-BQBZGAKWSA-N 0.000 description 4
- WPWUFUBLGADILS-WDSKDSINSA-N Ala-Pro Chemical compound C[C@H](N)C(=O)N1CCC[C@H]1C(O)=O WPWUFUBLGADILS-WDSKDSINSA-N 0.000 description 4
- YCTIYBUTCKNOTI-UWJYBYFXSA-N Ala-Tyr-Asp Chemical compound C[C@@H](C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)N[C@@H](CC(=O)O)C(=O)O)N YCTIYBUTCKNOTI-UWJYBYFXSA-N 0.000 description 4
- NONSEUUPKITYQT-BQBZGAKWSA-N Arg-Asn-Gly Chemical compound C(C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)NCC(=O)O)N)CN=C(N)N NONSEUUPKITYQT-BQBZGAKWSA-N 0.000 description 4
- VXXHDZKEQNGXNU-QXEWZRGKSA-N Arg-Asp-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@H](CC(O)=O)NC(=O)[C@@H](N)CCCN=C(N)N VXXHDZKEQNGXNU-QXEWZRGKSA-N 0.000 description 4
- SSZGOKWBHLOCHK-DCAQKATOSA-N Arg-Lys-Asn Chemical compound NC(=O)C[C@@H](C(O)=O)NC(=O)[C@H](CCCCN)NC(=O)[C@@H](N)CCCN=C(N)N SSZGOKWBHLOCHK-DCAQKATOSA-N 0.000 description 4
- HGKHPCFTRQDHCU-IUCAKERBSA-N Arg-Pro-Gly Chemical compound NC(N)=NCCC[C@H](N)C(=O)N1CCC[C@H]1C(=O)NCC(O)=O HGKHPCFTRQDHCU-IUCAKERBSA-N 0.000 description 4
- YCYXHLZRUSJITQ-SRVKXCTJSA-N Arg-Pro-Pro Chemical compound NC(=N)NCCC[C@H](N)C(=O)N1CCC[C@H]1C(=O)N1[C@H](C(O)=O)CCC1 YCYXHLZRUSJITQ-SRVKXCTJSA-N 0.000 description 4
- GXMSVVBIAMWMKO-BQBZGAKWSA-N Asn-Arg-Gly Chemical compound NC(=O)C[C@H](N)C(=O)N[C@H](C(=O)NCC(O)=O)CCCN=C(N)N GXMSVVBIAMWMKO-BQBZGAKWSA-N 0.000 description 4
- XSGBIBGAMKTHMY-WHFBIAKZSA-N Asn-Asp-Gly Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(O)=O)C(=O)NCC(O)=O XSGBIBGAMKTHMY-WHFBIAKZSA-N 0.000 description 4
- WQSCVMQDZYTFQU-FXQIFTODSA-N Asn-Cys-Arg Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CS)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O WQSCVMQDZYTFQU-FXQIFTODSA-N 0.000 description 4
- GNKVBRYFXYWXAB-WDSKDSINSA-N Asn-Glu-Gly Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)NCC(O)=O GNKVBRYFXYWXAB-WDSKDSINSA-N 0.000 description 4
- WONGRTVAMHFGBE-WDSKDSINSA-N Asn-Gly-Gln Chemical compound C(CC(=O)N)[C@@H](C(=O)O)NC(=O)CNC(=O)[C@H](CC(=O)N)N WONGRTVAMHFGBE-WDSKDSINSA-N 0.000 description 4
- YUOXLJYVSZYPBJ-CIUDSAMLSA-N Asn-Pro-Glu Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCC(O)=O)C(O)=O YUOXLJYVSZYPBJ-CIUDSAMLSA-N 0.000 description 4
- DBLPNHGKMDHWNZ-UHFFFAOYSA-N Asp Gly Arg Asn Chemical compound OC(=O)CC(N)C(=O)NCC(=O)NC(CCCN=C(N)N)C(=O)NC(CC(N)=O)C(O)=O DBLPNHGKMDHWNZ-UHFFFAOYSA-N 0.000 description 4
- HPNDBHLITCHRSO-WHFBIAKZSA-N Asp-Ala-Gly Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](C)C(=O)NCC(O)=O HPNDBHLITCHRSO-WHFBIAKZSA-N 0.000 description 4
- ZLGKHJHFYSRUBH-FXQIFTODSA-N Asp-Arg-Asp Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(O)=O)C(O)=O ZLGKHJHFYSRUBH-FXQIFTODSA-N 0.000 description 4
- FRSGNOZCTWDVFZ-ACZMJKKPSA-N Asp-Asp-Gln Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CCC(N)=O)C(O)=O FRSGNOZCTWDVFZ-ACZMJKKPSA-N 0.000 description 4
- ZEDBMCPXPIYJLW-XHNCKOQMSA-N Asp-Glu-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CCC(=O)O)NC(=O)[C@H](CC(=O)O)N)C(=O)O ZEDBMCPXPIYJLW-XHNCKOQMSA-N 0.000 description 4
- PZXPWHFYZXTFBI-YUMQZZPRSA-N Asp-Gly-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)CNC(=O)[C@@H](N)CC(O)=O PZXPWHFYZXTFBI-YUMQZZPRSA-N 0.000 description 4
- SNDBKTFJWVEVPO-WHFBIAKZSA-N Asp-Gly-Ser Chemical compound [H]N[C@@H](CC(O)=O)C(=O)NCC(=O)N[C@@H](CO)C(O)=O SNDBKTFJWVEVPO-WHFBIAKZSA-N 0.000 description 4
- SEMWSADZTMJELF-BYULHYEWSA-N Asp-Ile-Gly Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H]([C@@H](C)CC)C(=O)NCC(O)=O SEMWSADZTMJELF-BYULHYEWSA-N 0.000 description 4
- CJUKAWUWBZCTDQ-SRVKXCTJSA-N Asp-Leu-Lys Chemical compound OC(=O)C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(O)=O CJUKAWUWBZCTDQ-SRVKXCTJSA-N 0.000 description 4
- 102000004427 Collagen Type IX Human genes 0.000 description 4
- 108010042106 Collagen Type IX Proteins 0.000 description 4
- 102000002734 Collagen Type VI Human genes 0.000 description 4
- 108010043741 Collagen Type VI Proteins 0.000 description 4
- 102000004510 Collagen Type VII Human genes 0.000 description 4
- 108010017377 Collagen Type VII Proteins 0.000 description 4
- SQJSYLDKQBZQTG-FXQIFTODSA-N Cys-Asn-Met Chemical compound CSCC[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)[C@H](CS)N SQJSYLDKQBZQTG-FXQIFTODSA-N 0.000 description 4
- YZFCGHIBLBDZDA-ZLUOBGJFSA-N Cys-Asp-Ser Chemical compound [H]N[C@@H](CS)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CO)C(O)=O YZFCGHIBLBDZDA-ZLUOBGJFSA-N 0.000 description 4
- HYKFOHGZGLOCAY-ZLUOBGJFSA-N Cys-Cys-Ala Chemical compound [H]N[C@@H](CS)C(=O)N[C@@H](CS)C(=O)N[C@@H](C)C(O)=O HYKFOHGZGLOCAY-ZLUOBGJFSA-N 0.000 description 4
- ZEXHDOQQYZKOIB-ACZMJKKPSA-N Cys-Glu-Ser Chemical compound [H]N[C@@H](CS)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CO)C(O)=O ZEXHDOQQYZKOIB-ACZMJKKPSA-N 0.000 description 4
- IDFVDSBJNMPBSX-SRVKXCTJSA-N Cys-Lys-Leu Chemical compound [H]N[C@@H](CS)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(O)=O IDFVDSBJNMPBSX-SRVKXCTJSA-N 0.000 description 4
- IRKLTAKLAFUTLA-KATARQTJSA-N Cys-Thr-Lys Chemical compound C[C@@H](O)[C@H](NC(=O)[C@@H](N)CS)C(=O)N[C@@H](CCCCN)C(O)=O IRKLTAKLAFUTLA-KATARQTJSA-N 0.000 description 4
- ZPDVKYLJTOFQJV-WDSKDSINSA-N Gln-Asn-Gly Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(N)=O)C(=O)NCC(O)=O ZPDVKYLJTOFQJV-WDSKDSINSA-N 0.000 description 4
- JNENSVNAUWONEZ-GUBZILKMSA-N Gln-Lys-Asn Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(N)=O)C(O)=O JNENSVNAUWONEZ-GUBZILKMSA-N 0.000 description 4
- FQCILXROGNOZON-YUMQZZPRSA-N Gln-Pro-Gly Chemical compound NC(=O)CC[C@H](N)C(=O)N1CCC[C@H]1C(=O)NCC(O)=O FQCILXROGNOZON-YUMQZZPRSA-N 0.000 description 4
- WLRYGVYQFXRJDA-DCAQKATOSA-N Gln-Pro-Pro Chemical compound NC(=O)CC[C@H](N)C(=O)N1CCC[C@H]1C(=O)N1[C@H](C(O)=O)CCC1 WLRYGVYQFXRJDA-DCAQKATOSA-N 0.000 description 4
- OSCLNNWLKKIQJM-WDSKDSINSA-N Gln-Ser-Gly Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CO)C(=O)NCC(O)=O OSCLNNWLKKIQJM-WDSKDSINSA-N 0.000 description 4
- DYVMTEWCGAVKSE-HJGDQZAQSA-N Gln-Thr-Arg Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)O)NC(=O)[C@H](CCC(=O)N)N)O DYVMTEWCGAVKSE-HJGDQZAQSA-N 0.000 description 4
- WIMVKDYAKRAUCG-IHRRRGAJSA-N Gln-Tyr-Glu Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)O)NC(=O)[C@H](CCC(=O)N)N)O WIMVKDYAKRAUCG-IHRRRGAJSA-N 0.000 description 4
- NCWOMXABNYEPLY-NRPADANISA-N Glu-Ala-Val Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](C)C(=O)N[C@@H](C(C)C)C(O)=O NCWOMXABNYEPLY-NRPADANISA-N 0.000 description 4
- DSPQRJXOIXHOHK-WDSKDSINSA-N Glu-Asp-Gly Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)NCC(O)=O DSPQRJXOIXHOHK-WDSKDSINSA-N 0.000 description 4
- PJBVXVBTTFZPHJ-GUBZILKMSA-N Glu-Leu-Asp Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)O)NC(=O)[C@H](CCC(=O)O)N PJBVXVBTTFZPHJ-GUBZILKMSA-N 0.000 description 4
- QMOSCLNJVKSHHU-YUMQZZPRSA-N Glu-Met-Gly Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCSC)C(=O)NCC(O)=O QMOSCLNJVKSHHU-YUMQZZPRSA-N 0.000 description 4
- ZIYGTCDTJJCDDP-JYJNAYRXSA-N Glu-Phe-Lys Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CCC(=O)O)N ZIYGTCDTJJCDDP-JYJNAYRXSA-N 0.000 description 4
- RFTVTKBHDXCEEX-WDSKDSINSA-N Glu-Ser-Gly Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CO)C(=O)NCC(O)=O RFTVTKBHDXCEEX-WDSKDSINSA-N 0.000 description 4
- IDEODOAVGCMUQV-GUBZILKMSA-N Glu-Ser-Leu Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(O)=O IDEODOAVGCMUQV-GUBZILKMSA-N 0.000 description 4
- ZTNHPMZHAILHRB-JSGCOSHPSA-N Glu-Trp-Gly Chemical compound C1=CC=C2C(C[C@H](NC(=O)[C@H](CCC(O)=O)N)C(=O)NCC(O)=O)=CNC2=C1 ZTNHPMZHAILHRB-JSGCOSHPSA-N 0.000 description 4
- QLNKFGTZOBVMCS-JBACZVJFSA-N Glu-Tyr-Trp Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(O)=O QLNKFGTZOBVMCS-JBACZVJFSA-N 0.000 description 4
- RLFSBAPJTYKSLG-WHFBIAKZSA-N Gly-Ala-Asp Chemical compound NCC(=O)N[C@@H](C)C(=O)N[C@@H](CC(O)=O)C(O)=O RLFSBAPJTYKSLG-WHFBIAKZSA-N 0.000 description 4
- JRDYDYXZKFNNRQ-XPUUQOCRSA-N Gly-Ala-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)CN JRDYDYXZKFNNRQ-XPUUQOCRSA-N 0.000 description 4
- CLODWIOAKCSBAN-BQBZGAKWSA-N Gly-Arg-Asp Chemical compound NC(N)=NCCC[C@H](NC(=O)CN)C(=O)N[C@@H](CC(O)=O)C(O)=O CLODWIOAKCSBAN-BQBZGAKWSA-N 0.000 description 4
- AIJAPFVDBFYNKN-WHFBIAKZSA-N Gly-Asn-Asp Chemical compound C([C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)O)NC(=O)CN)C(=O)N AIJAPFVDBFYNKN-WHFBIAKZSA-N 0.000 description 4
- GGEJHJIXRBTJPD-BYPYZUCNSA-N Gly-Asn-Gly Chemical compound NCC(=O)N[C@@H](CC(N)=O)C(=O)NCC(O)=O GGEJHJIXRBTJPD-BYPYZUCNSA-N 0.000 description 4
- LXXLEUBUOMCAMR-NKWVEPMBSA-N Gly-Asp-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CC(=O)O)NC(=O)CN)C(=O)O LXXLEUBUOMCAMR-NKWVEPMBSA-N 0.000 description 4
- BULIVUZUDBHKKZ-WDSKDSINSA-N Gly-Gln-Asn Chemical compound NCC(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O BULIVUZUDBHKKZ-WDSKDSINSA-N 0.000 description 4
- BPQYBFAXRGMGGY-LAEOZQHASA-N Gly-Gln-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)N)NC(=O)CN BPQYBFAXRGMGGY-LAEOZQHASA-N 0.000 description 4
- GNPVTZJUUBPZKW-WDSKDSINSA-N Gly-Gln-Ser Chemical compound [H]NCC(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CO)C(O)=O GNPVTZJUUBPZKW-WDSKDSINSA-N 0.000 description 4
- JSNNHGHYGYMVCK-XVKPBYJWSA-N Gly-Glu-Val Chemical compound [H]NCC(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](C(C)C)C(O)=O JSNNHGHYGYMVCK-XVKPBYJWSA-N 0.000 description 4
- XMPXVJIDADUOQB-RCOVLWMOSA-N Gly-Gly-Ile Chemical compound CC[C@H](C)[C@@H](C([O-])=O)NC(=O)CNC(=O)C[NH3+] XMPXVJIDADUOQB-RCOVLWMOSA-N 0.000 description 4
- QITBQGJOXQYMOA-ZETCQYMHSA-N Gly-Gly-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)CNC(=O)CN QITBQGJOXQYMOA-ZETCQYMHSA-N 0.000 description 4
- YWAQATDNEKZFFK-BYPYZUCNSA-N Gly-Gly-Ser Chemical compound NCC(=O)NCC(=O)N[C@@H](CO)C(O)=O YWAQATDNEKZFFK-BYPYZUCNSA-N 0.000 description 4
- SWQALSGKVLYKDT-ZKWXMUAHSA-N Gly-Ile-Ala Chemical compound NCC(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](C)C(O)=O SWQALSGKVLYKDT-ZKWXMUAHSA-N 0.000 description 4
- LHYJCVCQPWRMKZ-WEDXCCLWSA-N Gly-Leu-Thr Chemical compound [H]NCC(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O LHYJCVCQPWRMKZ-WEDXCCLWSA-N 0.000 description 4
- BXICSAQLIHFDDL-YUMQZZPRSA-N Gly-Lys-Asn Chemical compound [H]NCC(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(N)=O)C(O)=O BXICSAQLIHFDDL-YUMQZZPRSA-N 0.000 description 4
- HFPVRZWORNJRRC-UWVGGRQHSA-N Gly-Pro-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1C(=O)CN HFPVRZWORNJRRC-UWVGGRQHSA-N 0.000 description 4
- FKESCSGWBPUTPN-FOHZUACHSA-N Gly-Thr-Asn Chemical compound [H]NCC(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(N)=O)C(O)=O FKESCSGWBPUTPN-FOHZUACHSA-N 0.000 description 4
- JQFILXICXLDTRR-FBCQKBJTSA-N Gly-Thr-Gly Chemical compound NCC(=O)N[C@@H]([C@H](O)C)C(=O)NCC(O)=O JQFILXICXLDTRR-FBCQKBJTSA-N 0.000 description 4
- FFALDIDGPLUDKV-ZDLURKLDSA-N Gly-Thr-Ser Chemical compound [H]NCC(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CO)C(O)=O FFALDIDGPLUDKV-ZDLURKLDSA-N 0.000 description 4
- YGHSQRJSHKYUJY-SCZZXKLOSA-N Gly-Val-Pro Chemical compound CC(C)[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)CN YGHSQRJSHKYUJY-SCZZXKLOSA-N 0.000 description 4
- DCRODRAURLJOFY-XPUUQOCRSA-N His-Ala-Gly Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](C)C(=O)NCC(O)=O DCRODRAURLJOFY-XPUUQOCRSA-N 0.000 description 4
- DZMVESFTHXSSPZ-XVYDVKMFSA-N His-Ala-Ser Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](C)C(=O)N[C@@H](CO)C(O)=O DZMVESFTHXSSPZ-XVYDVKMFSA-N 0.000 description 4
- HVCRQRQPIIRNLY-IUCAKERBSA-N His-Gln-Gly Chemical compound C1=C(NC=N1)C[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)NCC(=O)O)N HVCRQRQPIIRNLY-IUCAKERBSA-N 0.000 description 4
- OWYIDJCNRWRSJY-QTKMDUPCSA-N His-Pro-Thr Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N1CCC[C@H]1C(=O)N[C@@H]([C@@H](C)O)C(O)=O OWYIDJCNRWRSJY-QTKMDUPCSA-N 0.000 description 4
- HGNUKGZQASSBKQ-PCBIJLKTSA-N Ile-Asp-Phe Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)O)N HGNUKGZQASSBKQ-PCBIJLKTSA-N 0.000 description 4
- JHCVYQKVKOLAIU-NAKRPEOUSA-N Ile-Cys-Val Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CS)C(=O)N[C@@H](C(C)C)C(=O)O)N JHCVYQKVKOLAIU-NAKRPEOUSA-N 0.000 description 4
- NZOCIWKZUVUNDW-ZKWXMUAHSA-N Ile-Gly-Ala Chemical compound CC[C@H](C)[C@H](N)C(=O)NCC(=O)N[C@@H](C)C(O)=O NZOCIWKZUVUNDW-ZKWXMUAHSA-N 0.000 description 4
- AFERFBZLVUFWRA-HTFCKZLJSA-N Ile-Ile-Cys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CS)C(=O)O)N AFERFBZLVUFWRA-HTFCKZLJSA-N 0.000 description 4
- PARSHQDZROHERM-NHCYSSNCSA-N Ile-Lys-Gly Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)NCC(=O)O)N PARSHQDZROHERM-NHCYSSNCSA-N 0.000 description 4
- BKPPWVSPSIUXHZ-OSUNSFLBSA-N Ile-Met-Thr Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCSC)C(=O)N[C@@H]([C@@H](C)O)C(=O)O)N BKPPWVSPSIUXHZ-OSUNSFLBSA-N 0.000 description 4
- YKZAMJXNJUWFIK-JBDRJPRFSA-N Ile-Ser-Ala Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CO)C(=O)N[C@@H](C)C(=O)O)N YKZAMJXNJUWFIK-JBDRJPRFSA-N 0.000 description 4
- ZLFNNVATRMCAKN-ZKWXMUAHSA-N Ile-Ser-Gly Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CO)C(=O)NCC(=O)O)N ZLFNNVATRMCAKN-ZKWXMUAHSA-N 0.000 description 4
- JNLSTRPWUXOORL-MMWGEVLESA-N Ile-Ser-Pro Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CO)C(=O)N1CCC[C@@H]1C(=O)O)N JNLSTRPWUXOORL-MMWGEVLESA-N 0.000 description 4
- BCISUQVFDGYZBO-QSFUFRPTSA-N Ile-Val-Asp Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@H](C(O)=O)CC(O)=O BCISUQVFDGYZBO-QSFUFRPTSA-N 0.000 description 4
- 102100034343 Integrase Human genes 0.000 description 4
- FADYJNXDPBKVCA-UHFFFAOYSA-N L-Phenylalanyl-L-lysin Natural products NCCCCC(C(O)=O)NC(=O)C(N)CC1=CC=CC=C1 FADYJNXDPBKVCA-UHFFFAOYSA-N 0.000 description 4
- KFKWRHQBZQICHA-STQMWFEESA-N L-leucyl-L-phenylalanine Natural products CC(C)C[C@H](N)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 KFKWRHQBZQICHA-STQMWFEESA-N 0.000 description 4
- WNGVUZWBXZKQES-YUMQZZPRSA-N Leu-Ala-Gly Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](C)C(=O)NCC(O)=O WNGVUZWBXZKQES-YUMQZZPRSA-N 0.000 description 4
- XIRYQRLFHWWWTC-QEJZJMRPSA-N Leu-Ala-Phe Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 XIRYQRLFHWWWTC-QEJZJMRPSA-N 0.000 description 4
- RRSLQOLASISYTB-CIUDSAMLSA-N Leu-Cys-Asp Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CS)C(=O)N[C@@H](CC(O)=O)C(O)=O RRSLQOLASISYTB-CIUDSAMLSA-N 0.000 description 4
- RXGLHDWAZQECBI-SRVKXCTJSA-N Leu-Leu-Ser Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(O)=O RXGLHDWAZQECBI-SRVKXCTJSA-N 0.000 description 4
- LVTJJOJKDCVZGP-QWRGUYRKSA-N Leu-Lys-Gly Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(=O)NCC(O)=O LVTJJOJKDCVZGP-QWRGUYRKSA-N 0.000 description 4
- BGZCJDGBBUUBHA-KKUMJFAQSA-N Leu-Lys-Leu Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(O)=O BGZCJDGBBUUBHA-KKUMJFAQSA-N 0.000 description 4
- VULJUQZPSOASBZ-SRVKXCTJSA-N Leu-Pro-Glu Chemical compound [H]N[C@@H](CC(C)C)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCC(O)=O)C(O)=O VULJUQZPSOASBZ-SRVKXCTJSA-N 0.000 description 4
- IRNSXVOWSXSULE-DCAQKATOSA-N Lys-Ala-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)[C@@H](N)CCCCN IRNSXVOWSXSULE-DCAQKATOSA-N 0.000 description 4
- HQVDJTYKCMIWJP-YUMQZZPRSA-N Lys-Asn-Gly Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(N)=O)C(=O)NCC(O)=O HQVDJTYKCMIWJP-YUMQZZPRSA-N 0.000 description 4
- AAORVPFVUIHEAB-YUMQZZPRSA-N Lys-Asp-Gly Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(O)=O)C(=O)NCC(O)=O AAORVPFVUIHEAB-YUMQZZPRSA-N 0.000 description 4
- CANPXOLVTMKURR-WEDXCCLWSA-N Lys-Gly-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)CNC(=O)[C@@H](N)CCCCN CANPXOLVTMKURR-WEDXCCLWSA-N 0.000 description 4
- SPCHLZUWJTYZFC-IHRRRGAJSA-N Lys-His-Val Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](C(C)C)C(O)=O SPCHLZUWJTYZFC-IHRRRGAJSA-N 0.000 description 4
- KYNNSEJZFVCDIV-ZPFDUUQYSA-N Lys-Ile-Asn Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC(N)=O)C(O)=O KYNNSEJZFVCDIV-ZPFDUUQYSA-N 0.000 description 4
- SBQDRNOLGSYHQA-YUMQZZPRSA-N Lys-Ser-Gly Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CO)C(=O)NCC(O)=O SBQDRNOLGSYHQA-YUMQZZPRSA-N 0.000 description 4
- LRALLISKBZNSKN-BQBZGAKWSA-N Met-Gly-Ser Chemical compound CSCC[C@H](N)C(=O)NCC(=O)N[C@@H](CO)C(O)=O LRALLISKBZNSKN-BQBZGAKWSA-N 0.000 description 4
- XOFDBXYPKZUAAM-GUBZILKMSA-N Met-Met-Ser Chemical compound CSCC[C@@H](C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CO)C(=O)O)N XOFDBXYPKZUAAM-GUBZILKMSA-N 0.000 description 4
- 108010066427 N-valyltryptophan Proteins 0.000 description 4
- 101100342977 Neurospora crassa (strain ATCC 24698 / 74-OR23-1A / CBS 708.71 / DSM 1257 / FGSC 987) leu-1 gene Proteins 0.000 description 4
- BBDSZDHUCPSYAC-QEJZJMRPSA-N Phe-Ala-Leu Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](C)C(=O)N[C@@H](CC(C)C)C(O)=O BBDSZDHUCPSYAC-QEJZJMRPSA-N 0.000 description 4
- CGOMLCQJEMWMCE-STQMWFEESA-N Phe-Arg-Gly Chemical compound NC(N)=NCCC[C@@H](C(=O)NCC(O)=O)NC(=O)[C@@H](N)CC1=CC=CC=C1 CGOMLCQJEMWMCE-STQMWFEESA-N 0.000 description 4
- QEPZQAPZKIPVDV-KKUMJFAQSA-N Phe-Cys-His Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CS)C(=O)N[C@@H](CC2=CN=CN2)C(=O)O)N QEPZQAPZKIPVDV-KKUMJFAQSA-N 0.000 description 4
- LLGTYVHITPVGKR-RYUDHWBXSA-N Phe-Gln-Gly Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCC(N)=O)C(=O)NCC(O)=O LLGTYVHITPVGKR-RYUDHWBXSA-N 0.000 description 4
- ABQFNJAFONNUTH-FHWLQOOXSA-N Phe-Gln-Tyr Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)N[C@@H](CC2=CC=C(C=C2)O)C(=O)O)N ABQFNJAFONNUTH-FHWLQOOXSA-N 0.000 description 4
- JJHVFCUWLSKADD-ONGXEEELSA-N Phe-Gly-Ala Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)NCC(=O)N[C@@H](C)C(O)=O JJHVFCUWLSKADD-ONGXEEELSA-N 0.000 description 4
- ZLGQEBCCANLYRA-RYUDHWBXSA-N Phe-Gly-Glu Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)NCC(=O)N[C@@H](CCC(O)=O)C(O)=O ZLGQEBCCANLYRA-RYUDHWBXSA-N 0.000 description 4
- KUSYCSMTTHSZOA-DZKIICNBSA-N Phe-Val-Gln Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)O)NC(=O)[C@H](CC1=CC=CC=C1)N KUSYCSMTTHSZOA-DZKIICNBSA-N 0.000 description 4
- VXCHGLYSIOOZIS-GUBZILKMSA-N Pro-Ala-Arg Chemical compound NC(N)=NCCC[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)[C@@H]1CCCN1 VXCHGLYSIOOZIS-GUBZILKMSA-N 0.000 description 4
- WFLWKEUBTSOFMP-FXQIFTODSA-N Pro-Cys-Cys Chemical compound OC(=O)[C@H](CS)NC(=O)[C@H](CS)NC(=O)[C@@H]1CCCN1 WFLWKEUBTSOFMP-FXQIFTODSA-N 0.000 description 4
- DIZLUAZLNDFDPR-CIUDSAMLSA-N Pro-Cys-Gln Chemical compound NC(=O)CC[C@@H](C(O)=O)NC(=O)[C@H](CS)NC(=O)[C@@H]1CCCN1 DIZLUAZLNDFDPR-CIUDSAMLSA-N 0.000 description 4
- WVOXLKUUVCCCSU-ZPFDUUQYSA-N Pro-Glu-Ile Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O WVOXLKUUVCCCSU-ZPFDUUQYSA-N 0.000 description 4
- NXEYSLRNNPWCRN-SRVKXCTJSA-N Pro-Glu-Leu Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(O)=O NXEYSLRNNPWCRN-SRVKXCTJSA-N 0.000 description 4
- FEVDNIBDCRKMER-IUCAKERBSA-N Pro-Gly-Met Chemical compound CSCC[C@@H](C(=O)O)NC(=O)CNC(=O)[C@@H]1CCCN1 FEVDNIBDCRKMER-IUCAKERBSA-N 0.000 description 4
- 102000018399 Prolyl 3-hydroxylases Human genes 0.000 description 4
- 108010092799 RNA-directed DNA polymerase Proteins 0.000 description 4
- GXXTUIUYTWGPMV-FXQIFTODSA-N Ser-Arg-Ala Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](C)C(O)=O GXXTUIUYTWGPMV-FXQIFTODSA-N 0.000 description 4
- VAUMZJHYZQXZBQ-WHFBIAKZSA-N Ser-Asn-Gly Chemical compound OC[C@H](N)C(=O)N[C@@H](CC(N)=O)C(=O)NCC(O)=O VAUMZJHYZQXZBQ-WHFBIAKZSA-N 0.000 description 4
- BNFVPSRLHHPQKS-WHFBIAKZSA-N Ser-Asp-Gly Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC(O)=O)C(=O)NCC(O)=O BNFVPSRLHHPQKS-WHFBIAKZSA-N 0.000 description 4
- VQBCMLMPEWPUTB-ACZMJKKPSA-N Ser-Glu-Ser Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CO)C(O)=O VQBCMLMPEWPUTB-ACZMJKKPSA-N 0.000 description 4
- SNVIOQXAHVORQM-WDSKDSINSA-N Ser-Gly-Gln Chemical compound [H]N[C@@H](CO)C(=O)NCC(=O)N[C@@H](CCC(N)=O)C(O)=O SNVIOQXAHVORQM-WDSKDSINSA-N 0.000 description 4
- XXXAXOWMBOKTRN-XPUUQOCRSA-N Ser-Gly-Val Chemical compound [H]N[C@@H](CO)C(=O)NCC(=O)N[C@@H](C(C)C)C(O)=O XXXAXOWMBOKTRN-XPUUQOCRSA-N 0.000 description 4
- CICQXRWZNVXFCU-SRVKXCTJSA-N Ser-His-Leu Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CC(C)C)C(O)=O CICQXRWZNVXFCU-SRVKXCTJSA-N 0.000 description 4
- SFTZTYBXIXLRGQ-JBDRJPRFSA-N Ser-Ile-Ala Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](C)C(O)=O SFTZTYBXIXLRGQ-JBDRJPRFSA-N 0.000 description 4
- UBRMZSHOOIVJPW-SRVKXCTJSA-N Ser-Leu-Lys Chemical compound OC[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(O)=O UBRMZSHOOIVJPW-SRVKXCTJSA-N 0.000 description 4
- FKYWFUYPVKLJLP-DCAQKATOSA-N Ser-Pro-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1C(=O)[C@@H](N)CO FKYWFUYPVKLJLP-DCAQKATOSA-N 0.000 description 4
- PCMZJFMUYWIERL-ZKWXMUAHSA-N Ser-Val-Asn Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC(N)=O)C(O)=O PCMZJFMUYWIERL-ZKWXMUAHSA-N 0.000 description 4
- LVHHEVGYAZGXDE-KDXUFGMBSA-N Thr-Ala-Pro Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](C)C(=O)N1CCC[C@@H]1C(=O)O)N)O LVHHEVGYAZGXDE-KDXUFGMBSA-N 0.000 description 4
- YBXMGKCLOPDEKA-NUMRIWBASA-N Thr-Asp-Glu Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O YBXMGKCLOPDEKA-NUMRIWBASA-N 0.000 description 4
- YJCVECXVYHZOBK-KNZXXDILSA-N Thr-Ile-Pro Chemical compound CC[C@H](C)[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H]([C@@H](C)O)N YJCVECXVYHZOBK-KNZXXDILSA-N 0.000 description 4
- MXDOAJQRJBMGMO-FJXKBIBVSA-N Thr-Pro-Gly Chemical compound C[C@@H](O)[C@H](N)C(=O)N1CCC[C@H]1C(=O)NCC(O)=O MXDOAJQRJBMGMO-FJXKBIBVSA-N 0.000 description 4
- ABCLYRRGTZNIFU-BWAGICSOSA-N Thr-Tyr-His Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)N[C@@H](CC2=CN=CN2)C(=O)O)N)O ABCLYRRGTZNIFU-BWAGICSOSA-N 0.000 description 4
- CCZXBOFIBYQLEV-IHPCNDPISA-N Trp-Leu-Leu Chemical compound CC(C)C[C@H](NC(=O)[C@H](CC(C)C)NC(=O)[C@@H](N)Cc1c[nH]c2ccccc12)C(O)=O CCZXBOFIBYQLEV-IHPCNDPISA-N 0.000 description 4
- VDUJEEQMRQCLHB-YTQUADARSA-N Trp-Lys-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CCCCN)NC(=O)[C@H](CC2=CNC3=CC=CC=C32)N)C(=O)O VDUJEEQMRQCLHB-YTQUADARSA-N 0.000 description 4
- ACGIVBXINJFALS-HKUYNNGSSA-N Trp-Phe-Gly Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)NCC(=O)O)NC(=O)[C@H](CC2=CNC3=CC=CC=C32)N ACGIVBXINJFALS-HKUYNNGSSA-N 0.000 description 4
- GQYPNFIFJRNDPY-ONUFPDRFSA-N Trp-Trp-Thr Chemical compound C1=CC=C2C(C[C@H](N)C(=O)N[C@@H](CC=3C4=CC=CC=C4NC=3)C(=O)N[C@@H]([C@H](O)C)C(O)=O)=CNC2=C1 GQYPNFIFJRNDPY-ONUFPDRFSA-N 0.000 description 4
- BURPTJBFWIOHEY-UWJYBYFXSA-N Tyr-Ala-Asp Chemical compound OC(=O)C[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 BURPTJBFWIOHEY-UWJYBYFXSA-N 0.000 description 4
- NLMXVDDEQFKQQU-CFMVVWHZSA-N Tyr-Asp-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@H](CC(O)=O)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 NLMXVDDEQFKQQU-CFMVVWHZSA-N 0.000 description 4
- RYSNTWVRSLCAJZ-RYUDHWBXSA-N Tyr-Gln-Gly Chemical compound OC(=O)CNC(=O)[C@H](CCC(N)=O)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 RYSNTWVRSLCAJZ-RYUDHWBXSA-N 0.000 description 4
- SYFHQHYTNCQCCN-MELADBBJSA-N Tyr-Ser-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CO)NC(=O)[C@H](CC2=CC=C(C=C2)O)N)C(=O)O SYFHQHYTNCQCCN-MELADBBJSA-N 0.000 description 4
- WYOBRXPIZVKNMF-IRXDYDNUSA-N Tyr-Tyr-Gly Chemical compound C([C@H](N)C(=O)N[C@@H](CC=1C=CC(O)=CC=1)C(=O)NCC(O)=O)C1=CC=C(O)C=C1 WYOBRXPIZVKNMF-IRXDYDNUSA-N 0.000 description 4
- DDNIHOWRDOXXPF-NGZCFLSTSA-N Val-Asp-Pro Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N1CCC[C@@H]1C(=O)O)N DDNIHOWRDOXXPF-NGZCFLSTSA-N 0.000 description 4
- DLYOEFGPYTZVSP-AEJSXWLSSA-N Val-Cys-Pro Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CS)C(=O)N1CCC[C@@H]1C(=O)O)N DLYOEFGPYTZVSP-AEJSXWLSSA-N 0.000 description 4
- PIFJAFRUVWZRKR-QMMMGPOBSA-N Val-Gly-Gly Chemical compound CC(C)[C@H]([NH3+])C(=O)NCC(=O)NCC([O-])=O PIFJAFRUVWZRKR-QMMMGPOBSA-N 0.000 description 4
- FTKXYXACXYOHND-XUXIUFHCSA-N Val-Ile-Leu Chemical compound CC(C)[C@H](N)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC(C)C)C(O)=O FTKXYXACXYOHND-XUXIUFHCSA-N 0.000 description 4
- AGXGCFSECFQMKB-NHCYSSNCSA-N Val-Leu-Asp Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)O)NC(=O)[C@H](C(C)C)N AGXGCFSECFQMKB-NHCYSSNCSA-N 0.000 description 4
- JVGHIFMSFBZDHH-WPRPVWTQSA-N Val-Met-Gly Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCSC)C(=O)NCC(=O)O)N JVGHIFMSFBZDHH-WPRPVWTQSA-N 0.000 description 4
- 108010045023 alanyl-prolyl-tyrosine Proteins 0.000 description 4
- 108010070783 alanyltyrosine Proteins 0.000 description 4
- 125000000539 amino acid group Chemical group 0.000 description 4
- 108010029539 arginyl-prolyl-proline Proteins 0.000 description 4
- 108010038633 aspartylglutamate Proteins 0.000 description 4
- 108010092854 aspartyllysine Proteins 0.000 description 4
- 238000003556 assay Methods 0.000 description 4
- 210000000988 bone and bone Anatomy 0.000 description 4
- 238000012217 deletion Methods 0.000 description 4
- 230000037430 deletion Effects 0.000 description 4
- 239000000835 fiber Substances 0.000 description 4
- 102000013373 fibrillar collagen Human genes 0.000 description 4
- 108060002894 fibrillar collagen Proteins 0.000 description 4
- 230000002068 genetic effect Effects 0.000 description 4
- 108010019832 glycyl-asparaginyl-glycine Proteins 0.000 description 4
- 108010001064 glycyl-glycyl-glycyl-glycine Proteins 0.000 description 4
- 108010078326 glycyl-glycyl-valine Proteins 0.000 description 4
- YMAWOPBAYDPSLA-UHFFFAOYSA-N glycylglycine Chemical compound [NH3+]CC(=O)NCC([O-])=O YMAWOPBAYDPSLA-UHFFFAOYSA-N 0.000 description 4
- 108010020688 glycylhistidine Proteins 0.000 description 4
- 230000012010 growth Effects 0.000 description 4
- 108010085325 histidylproline Proteins 0.000 description 4
- 230000006698 induction Effects 0.000 description 4
- 238000002955 isolation Methods 0.000 description 4
- 108010076756 leucyl-alanyl-phenylalanine Proteins 0.000 description 4
- 108010044056 leucyl-phenylalanine Proteins 0.000 description 4
- 108010003700 lysyl aspartic acid Proteins 0.000 description 4
- 108010038320 lysylphenylalanine Proteins 0.000 description 4
- 108010056582 methionylglutamic acid Proteins 0.000 description 4
- 108010077112 prolyl-proline Proteins 0.000 description 4
- 235000019419 proteases Nutrition 0.000 description 4
- 230000003248 secreting effect Effects 0.000 description 4
- 238000002415 sodium dodecyl sulfate polyacrylamide gel electrophoresis Methods 0.000 description 4
- 239000006228 supernatant Substances 0.000 description 4
- 108010020532 tyrosyl-proline Proteins 0.000 description 4
- IBIDRSSEHFLGSD-UHFFFAOYSA-N valinyl-arginine Natural products CC(C)C(N)C(=O)NC(C(O)=O)CCCN=C(N)N IBIDRSSEHFLGSD-UHFFFAOYSA-N 0.000 description 4
- 108010073969 valyllysine Proteins 0.000 description 4
- XLYOFNOQVPJJNP-UHFFFAOYSA-N water Substances O XLYOFNOQVPJJNP-UHFFFAOYSA-N 0.000 description 4
- 108010050451 2-oxoglutarate 3-dioxygenase proline Proteins 0.000 description 3
- QTBSBXVTEAMEQO-UHFFFAOYSA-N Acetic acid Chemical compound CC(O)=O QTBSBXVTEAMEQO-UHFFFAOYSA-N 0.000 description 3
- WGDNWOMKBUXFHR-BQBZGAKWSA-N Ala-Gly-Arg Chemical compound C[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CCCN=C(N)N WGDNWOMKBUXFHR-BQBZGAKWSA-N 0.000 description 3
- PCIFXPRIFWKWLK-YUMQZZPRSA-N Ala-Gly-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)CNC(=O)[C@H](C)N PCIFXPRIFWKWLK-YUMQZZPRSA-N 0.000 description 3
- 108010025188 Alcohol oxidase Proteins 0.000 description 3
- YNCHFVRXEQFPBY-BQBZGAKWSA-N Asp-Gly-Arg Chemical compound OC(=O)C[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CCCN=C(N)N YNCHFVRXEQFPBY-BQBZGAKWSA-N 0.000 description 3
- 102100033601 Collagen alpha-1(I) chain Human genes 0.000 description 3
- 102000010834 Extracellular Matrix Proteins Human genes 0.000 description 3
- 108010037362 Extracellular Matrix Proteins Proteins 0.000 description 3
- MFJAPSYJQJCQDN-BQBZGAKWSA-N Gln-Gly-Glu Chemical compound NC(=O)CC[C@H](N)C(=O)NCC(=O)N[C@@H](CCC(O)=O)C(O)=O MFJAPSYJQJCQDN-BQBZGAKWSA-N 0.000 description 3
- UTKUTMJSWKKHEM-WDSKDSINSA-N Glu-Ala-Gly Chemical compound OC(=O)CNC(=O)[C@H](C)NC(=O)[C@@H](N)CCC(O)=O UTKUTMJSWKKHEM-WDSKDSINSA-N 0.000 description 3
- XRTDOIOIBMAXCT-NKWVEPMBSA-N Gly-Asn-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CC(=O)N)NC(=O)CN)C(=O)O XRTDOIOIBMAXCT-NKWVEPMBSA-N 0.000 description 3
- BHPQOIPBLYJNAW-NGZCFLSTSA-N Gly-Ile-Pro Chemical compound CC[C@H](C)[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)CN BHPQOIPBLYJNAW-NGZCFLSTSA-N 0.000 description 3
- TWTPDFFBLQEBOE-IUCAKERBSA-N Gly-Leu-Gln Chemical compound [H]NCC(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(N)=O)C(O)=O TWTPDFFBLQEBOE-IUCAKERBSA-N 0.000 description 3
- SCJJPCQUJYPHRZ-BQBZGAKWSA-N Gly-Pro-Asn Chemical compound NCC(=O)N1CCC[C@H]1C(=O)N[C@@H](CC(N)=O)C(O)=O SCJJPCQUJYPHRZ-BQBZGAKWSA-N 0.000 description 3
- CSMYMGFCEJWALV-WDSKDSINSA-N Gly-Ser-Gln Chemical compound NCC(=O)N[C@@H](CO)C(=O)N[C@H](C(O)=O)CCC(N)=O CSMYMGFCEJWALV-WDSKDSINSA-N 0.000 description 3
- VPZXBVLAVMBEQI-VKHMYHEASA-N Glycyl-alanine Chemical compound OC(=O)[C@H](C)NC(=O)CN VPZXBVLAVMBEQI-VKHMYHEASA-N 0.000 description 3
- 101000945357 Homo sapiens Collagen alpha-1(I) chain Proteins 0.000 description 3
- 101000875067 Homo sapiens Collagen alpha-2(I) chain Proteins 0.000 description 3
- ROHFNLRQFUQHCH-YFKPBYRVSA-N L-leucine Chemical compound CC(C)C[C@H](N)C(O)=O ROHFNLRQFUQHCH-YFKPBYRVSA-N 0.000 description 3
- 101150023810 PHO1 gene Proteins 0.000 description 3
- UUHXBJHVTVGSKM-BQBZGAKWSA-N Pro-Gly-Asn Chemical compound [H]N1CCC[C@H]1C(=O)NCC(=O)N[C@@H](CC(N)=O)C(O)=O UUHXBJHVTVGSKM-BQBZGAKWSA-N 0.000 description 3
- 101710084323 Prolyl 4-hydroxylase subunit alpha-1 Proteins 0.000 description 3
- 108010029485 Protein Isoforms Proteins 0.000 description 3
- 102000001708 Protein Isoforms Human genes 0.000 description 3
- 108010003894 Protein-Lysine 6-Oxidase Proteins 0.000 description 3
- 102000004669 Protein-Lysine 6-Oxidase Human genes 0.000 description 3
- 101100271429 Saccharomyces cerevisiae (strain ATCC 204508 / S288c) ATP6 gene Proteins 0.000 description 3
- 241000282887 Suidae Species 0.000 description 3
- SLUWOCTZVGMURC-BFHQHQDPSA-N Thr-Gly-Ala Chemical compound C[C@@H](O)[C@H](N)C(=O)NCC(=O)N[C@@H](C)C(O)=O SLUWOCTZVGMURC-BFHQHQDPSA-N 0.000 description 3
- BKVICMPZWRNWOC-RHYQMDGZSA-N Thr-Val-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](C(C)C)NC(=O)[C@@H](N)[C@@H](C)O BKVICMPZWRNWOC-RHYQMDGZSA-N 0.000 description 3
- 101100115751 Trypanosoma brucei brucei dnaaf11 gene Proteins 0.000 description 3
- COYSIHFOCOMGCF-UHFFFAOYSA-N Val-Arg-Gly Natural products CC(C)C(N)C(=O)NC(C(=O)NCC(O)=O)CCCN=C(N)N COYSIHFOCOMGCF-UHFFFAOYSA-N 0.000 description 3
- 108010086434 alanyl-seryl-glycine Proteins 0.000 description 3
- OFHCOWSQAMBJIW-AVJTYSNKSA-N alfacalcidol Chemical compound C1(/[C@@H]2CC[C@@H]([C@]2(CCC1)C)[C@H](C)CCCC(C)C)=C\C=C1\C[C@@H](O)C[C@H](O)C1=C OFHCOWSQAMBJIW-AVJTYSNKSA-N 0.000 description 3
- 230000004071 biological effect Effects 0.000 description 3
- 230000015572 biosynthetic process Effects 0.000 description 3
- 239000002299 complementary DNA Substances 0.000 description 3
- 210000004087 cornea Anatomy 0.000 description 3
- PMMYEEVYMWASQN-UHFFFAOYSA-N dl-hydroxyproline Natural products OC1C[NH2+]C(C([O-])=O)C1 PMMYEEVYMWASQN-UHFFFAOYSA-N 0.000 description 3
- 238000005516 engineering process Methods 0.000 description 3
- 210000002744 extracellular matrix Anatomy 0.000 description 3
- 239000000499 gel Substances 0.000 description 3
- 230000013595 glycosylation Effects 0.000 description 3
- 238000006206 glycosylation reaction Methods 0.000 description 3
- 230000001939 inductive effect Effects 0.000 description 3
- 230000003993 interaction Effects 0.000 description 3
- 239000012139 lysis buffer Substances 0.000 description 3
- 108010009298 lysylglutamic acid Proteins 0.000 description 3
- 239000011159 matrix material Substances 0.000 description 3
- 230000008520 organization Effects 0.000 description 3
- SBKVPJHMSUXZTA-MEJXFZFPSA-N (2S)-2-[[(2S)-2-[[(2S)-1-[(2S)-5-amino-2-[[2-[[(2S)-1-[(2S)-6-amino-2-[[(2S)-2-[[(2S)-5-amino-2-[[(2S)-2-[[(2S)-2-[[(2S)-2-[[(2S)-2-amino-3-(1H-indol-3-yl)propanoyl]amino]-3-(1H-imidazol-4-yl)propanoyl]amino]-3-(1H-indol-3-yl)propanoyl]amino]-4-methylpentanoyl]amino]-5-oxopentanoyl]amino]-4-methylpentanoyl]amino]hexanoyl]pyrrolidine-2-carbonyl]amino]acetyl]amino]-5-oxopentanoyl]pyrrolidine-2-carbonyl]amino]-4-methylsulfanylbutanoyl]amino]-3-(4-hydroxyphenyl)propanoic acid Chemical compound C([C@@H](C(=O)N[C@@H](CC=1C2=CC=CC=C2NC=1)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(=O)N1CCC[C@H]1C(=O)NCC(=O)N[C@@H](CCC(N)=O)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CC=1C=CC(O)=CC=1)C(O)=O)NC(=O)[C@@H](N)CC=1C2=CC=CC=C2NC=1)C1=CNC=N1 SBKVPJHMSUXZTA-MEJXFZFPSA-N 0.000 description 2
- BHLYRWXGMIUIHG-HNNXBMFYSA-N (S)-reticuline Chemical compound C1=C(O)C(OC)=CC=C1C[C@H]1C2=CC(O)=C(OC)C=C2CCN1C BHLYRWXGMIUIHG-HNNXBMFYSA-N 0.000 description 2
- RLCSROTYKMPBDL-USJZOSNVSA-N 2-[[(2s)-1-[(2s)-2-[[(2s)-2-[[2-[[(2s)-2-amino-3-methylbutanoyl]amino]acetyl]amino]-3-methylbutanoyl]amino]propanoyl]pyrrolidine-2-carbonyl]amino]acetic acid Chemical compound CC(C)[C@H](N)C(=O)NCC(=O)N[C@@H](C(C)C)C(=O)N[C@@H](C)C(=O)N1CCC[C@H]1C(=O)NCC(O)=O RLCSROTYKMPBDL-USJZOSNVSA-N 0.000 description 2
- WOJJIRYPFAZEPF-YFKPBYRVSA-N 2-[[(2s)-2-[[2-[(2-azaniumylacetyl)amino]acetyl]amino]propanoyl]amino]acetate Chemical compound OC(=O)CNC(=O)[C@H](C)NC(=O)CNC(=O)CN WOJJIRYPFAZEPF-YFKPBYRVSA-N 0.000 description 2
- QKNYBSVHEMOAJP-UHFFFAOYSA-N 2-amino-2-(hydroxymethyl)propane-1,3-diol;hydron;chloride Chemical compound Cl.OCC(N)(CO)CO QKNYBSVHEMOAJP-UHFFFAOYSA-N 0.000 description 2
- QFVHZQCOUORWEI-UHFFFAOYSA-N 4-[(4-anilino-5-sulfonaphthalen-1-yl)diazenyl]-5-hydroxynaphthalene-2,7-disulfonic acid Chemical compound C=12C(O)=CC(S(O)(=O)=O)=CC2=CC(S(O)(=O)=O)=CC=1N=NC(C1=CC=CC(=C11)S(O)(=O)=O)=CC=C1NC1=CC=CC=C1 QFVHZQCOUORWEI-UHFFFAOYSA-N 0.000 description 2
- 101150006240 AOX2 gene Proteins 0.000 description 2
- JBVSSSZFNTXJDX-YTLHQDLWSA-N Ala-Ala-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)[C@H](C)N JBVSSSZFNTXJDX-YTLHQDLWSA-N 0.000 description 2
- JAMAWBXXKFGFGX-KZVJFYERSA-N Ala-Arg-Thr Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H]([C@@H](C)O)C(O)=O JAMAWBXXKFGFGX-KZVJFYERSA-N 0.000 description 2
- KIUYPHAMDKDICO-WHFBIAKZSA-N Ala-Asp-Gly Chemical compound C[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)NCC(O)=O KIUYPHAMDKDICO-WHFBIAKZSA-N 0.000 description 2
- IKKVASZHTMKJIR-ZKWXMUAHSA-N Ala-Asp-Val Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](C(C)C)C(O)=O IKKVASZHTMKJIR-ZKWXMUAHSA-N 0.000 description 2
- WMYJZJRILUVVRG-WDSKDSINSA-N Ala-Gly-Gln Chemical compound C[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CCC(N)=O WMYJZJRILUVVRG-WDSKDSINSA-N 0.000 description 2
- QHASENCZLDHBGX-ONGXEEELSA-N Ala-Gly-Phe Chemical compound C[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 QHASENCZLDHBGX-ONGXEEELSA-N 0.000 description 2
- GSHKMNKPMLXSQW-KBIXCLLPSA-N Ala-Ile-Gln Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)O)NC(=O)[C@H](C)N GSHKMNKPMLXSQW-KBIXCLLPSA-N 0.000 description 2
- RZZMZYZXNJRPOJ-BJDJZHNGSA-N Ala-Ile-Lys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](C)N RZZMZYZXNJRPOJ-BJDJZHNGSA-N 0.000 description 2
- QPBSRMDNJOTFAL-AICCOOGYSA-N Ala-Leu-Leu-Thr Chemical compound C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O QPBSRMDNJOTFAL-AICCOOGYSA-N 0.000 description 2
- MDNAVFBZPROEHO-DCAQKATOSA-N Ala-Lys-Val Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](C(C)C)C(O)=O MDNAVFBZPROEHO-DCAQKATOSA-N 0.000 description 2
- MDNAVFBZPROEHO-UHFFFAOYSA-N Ala-Lys-Val Natural products CC(C)C(C(O)=O)NC(=O)C(NC(=O)C(C)N)CCCCN MDNAVFBZPROEHO-UHFFFAOYSA-N 0.000 description 2
- WNHNMKOFKCHKKD-BFHQHQDPSA-N Ala-Thr-Gly Chemical compound [H]N[C@@H](C)C(=O)N[C@@H]([C@@H](C)O)C(=O)NCC(O)=O WNHNMKOFKCHKKD-BFHQHQDPSA-N 0.000 description 2
- LFFOJBOTZUWINF-ZANVPECISA-N Ala-Trp-Gly Chemical compound C1=CC=C2C(C[C@H](NC(=O)[C@@H](N)C)C(=O)NCC(O)=O)=CNC2=C1 LFFOJBOTZUWINF-ZANVPECISA-N 0.000 description 2
- PNIGSVZJNVUVJA-BQBZGAKWSA-N Arg-Gly-Asn Chemical compound NC(N)=NCCC[C@H](N)C(=O)NCC(=O)N[C@@H](CC(N)=O)C(O)=O PNIGSVZJNVUVJA-BQBZGAKWSA-N 0.000 description 2
- NKNILFJYKKHBKE-WPRPVWTQSA-N Arg-Gly-Val Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)NCC(=O)N[C@@H](C(C)C)C(O)=O NKNILFJYKKHBKE-WPRPVWTQSA-N 0.000 description 2
- DGFXIWKPTDKBLF-AVGNSLFASA-N Arg-His-Val Chemical compound CC(C)[C@@H](C(=O)O)NC(=O)[C@H](CC1=CN=CN1)NC(=O)[C@H](CCCN=C(N)N)N DGFXIWKPTDKBLF-AVGNSLFASA-N 0.000 description 2
- XRLOBFSLPCHYLQ-ULQDDVLXSA-N Arg-Tyr-His Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)N[C@@H](CC2=CN=CN2)C(=O)O)NC(=O)[C@H](CCCN=C(N)N)N)O XRLOBFSLPCHYLQ-ULQDDVLXSA-N 0.000 description 2
- CNBIWSCSSCAINS-UFYCRDLUSA-N Arg-Tyr-Tyr Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O CNBIWSCSSCAINS-UFYCRDLUSA-N 0.000 description 2
- VYZBPPBKFCHCIS-WPRPVWTQSA-N Arg-Val-Gly Chemical compound OC(=O)CNC(=O)[C@H](C(C)C)NC(=O)[C@@H](N)CCCN=C(N)N VYZBPPBKFCHCIS-WPRPVWTQSA-N 0.000 description 2
- HZPSDHRYYIORKR-WHFBIAKZSA-N Asn-Ala-Gly Chemical compound OC(=O)CNC(=O)[C@H](C)NC(=O)[C@@H](N)CC(N)=O HZPSDHRYYIORKR-WHFBIAKZSA-N 0.000 description 2
- DXVMJJNAOVECBA-WHFBIAKZSA-N Asn-Gly-Asn Chemical compound NC(=O)C[C@H](N)C(=O)NCC(=O)N[C@@H](CC(N)=O)C(O)=O DXVMJJNAOVECBA-WHFBIAKZSA-N 0.000 description 2
- IICZCLFBILYRCU-WHFBIAKZSA-N Asn-Gly-Asp Chemical compound [H]N[C@@H](CC(N)=O)C(=O)NCC(=O)N[C@@H](CC(O)=O)C(O)=O IICZCLFBILYRCU-WHFBIAKZSA-N 0.000 description 2
- OLISTMZJGQUOGS-GMOBBJLQSA-N Asn-Ile-Arg Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)O)NC(=O)[C@H](CC(=O)N)N OLISTMZJGQUOGS-GMOBBJLQSA-N 0.000 description 2
- BXUHCIXDSWRSBS-CIUDSAMLSA-N Asn-Leu-Asp Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(O)=O)C(O)=O BXUHCIXDSWRSBS-CIUDSAMLSA-N 0.000 description 2
- HMUKKNAMNSXDBB-CIUDSAMLSA-N Asn-Met-Glu Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CCC(O)=O)C(O)=O HMUKKNAMNSXDBB-CIUDSAMLSA-N 0.000 description 2
- CDGHMJJJHYKMPA-DLOVCJGASA-N Asn-Phe-Ala Chemical compound C[C@@H](C(=O)O)NC(=O)[C@H](CC1=CC=CC=C1)NC(=O)[C@H](CC(=O)N)N CDGHMJJJHYKMPA-DLOVCJGASA-N 0.000 description 2
- OOXUBGLNDRGOKT-FXQIFTODSA-N Asn-Ser-Arg Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O OOXUBGLNDRGOKT-FXQIFTODSA-N 0.000 description 2
- KDFQZBWWPYQBEN-ZLUOBGJFSA-N Asp-Ala-Asn Chemical compound C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@H](CC(=O)O)N KDFQZBWWPYQBEN-ZLUOBGJFSA-N 0.000 description 2
- PXLNPFOJZQMXAT-BYULHYEWSA-N Asp-Asp-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@H](CC(O)=O)NC(=O)[C@@H](N)CC(O)=O PXLNPFOJZQMXAT-BYULHYEWSA-N 0.000 description 2
- VIRHEUMYXXLCBF-WDSKDSINSA-N Asp-Gly-Glu Chemical compound [H]N[C@@H](CC(O)=O)C(=O)NCC(=O)N[C@@H](CCC(O)=O)C(O)=O VIRHEUMYXXLCBF-WDSKDSINSA-N 0.000 description 2
- PSLSTUMPZILTAH-BYULHYEWSA-N Asp-Gly-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)CNC(=O)[C@@H](N)CC(O)=O PSLSTUMPZILTAH-BYULHYEWSA-N 0.000 description 2
- WSGVTKZFVJSJOG-RCOVLWMOSA-N Asp-Gly-Val Chemical compound [H]N[C@@H](CC(O)=O)C(=O)NCC(=O)N[C@@H](C(C)C)C(O)=O WSGVTKZFVJSJOG-RCOVLWMOSA-N 0.000 description 2
- BWJZSLQJNBSUPM-FXQIFTODSA-N Asp-Pro-Asn Chemical compound OC(=O)C[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC(N)=O)C(O)=O BWJZSLQJNBSUPM-FXQIFTODSA-N 0.000 description 2
- RSMZEHCMIOKNMW-GSSVUCPTSA-N Asp-Thr-Thr Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O RSMZEHCMIOKNMW-GSSVUCPTSA-N 0.000 description 2
- FIRWLDUOFOULCA-XIRDDKMYSA-N Asp-Trp-Lys Chemical compound C1=CC=C2C(=C1)C(=CN2)C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CC(=O)O)N FIRWLDUOFOULCA-XIRDDKMYSA-N 0.000 description 2
- XWKPSMRPIKKDDU-RCOVLWMOSA-N Asp-Val-Gly Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](C(C)C)C(=O)NCC(O)=O XWKPSMRPIKKDDU-RCOVLWMOSA-N 0.000 description 2
- 102000001191 Collagen Type VIII Human genes 0.000 description 2
- 108010069526 Collagen Type VIII Proteins 0.000 description 2
- 102000030746 Collagen Type X Human genes 0.000 description 2
- 108010022510 Collagen Type X Proteins 0.000 description 2
- 102000014870 Collagen Type XII Human genes 0.000 description 2
- 108010039001 Collagen Type XII Proteins 0.000 description 2
- 102000047200 Collagen Type XVIII Human genes 0.000 description 2
- 108010001463 Collagen Type XVIII Proteins 0.000 description 2
- 102100036213 Collagen alpha-2(I) chain Human genes 0.000 description 2
- GMXSSZUVDNPRMA-FXQIFTODSA-N Cys-Arg-Asp Chemical compound [H]N[C@@H](CS)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(O)=O)C(O)=O GMXSSZUVDNPRMA-FXQIFTODSA-N 0.000 description 2
- RFHGRMMADHHQSA-KBIXCLLPSA-N Cys-Gln-Ile Chemical compound [H]N[C@@H](CS)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O RFHGRMMADHHQSA-KBIXCLLPSA-N 0.000 description 2
- WAJDEKCJRKGRPG-CIUDSAMLSA-N Cys-His-Ser Chemical compound C1=C(NC=N1)C[C@@H](C(=O)N[C@@H](CO)C(=O)O)NC(=O)[C@H](CS)N WAJDEKCJRKGRPG-CIUDSAMLSA-N 0.000 description 2
- WTXCNOPZMQRTNN-BWBBJGPYSA-N Cys-Thr-Ser Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CO)C(=O)O)NC(=O)[C@H](CS)N)O WTXCNOPZMQRTNN-BWBBJGPYSA-N 0.000 description 2
- MQQLYEHXSBJTRK-FXQIFTODSA-N Cys-Val-Cys Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CS)C(=O)O)NC(=O)[C@H](CS)N MQQLYEHXSBJTRK-FXQIFTODSA-N 0.000 description 2
- IOLWXFWVYYCVTJ-NRPADANISA-N Cys-Val-Gln Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)O)NC(=O)[C@H](CS)N IOLWXFWVYYCVTJ-NRPADANISA-N 0.000 description 2
- 241000283086 Equidae Species 0.000 description 2
- 241000588724 Escherichia coli Species 0.000 description 2
- 241000287828 Gallus gallus Species 0.000 description 2
- 108010010803 Gelatin Proteins 0.000 description 2
- 241000282818 Giraffidae Species 0.000 description 2
- NPTGGVQJYRSMCM-GLLZPBPUSA-N Gln-Gln-Thr Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O NPTGGVQJYRSMCM-GLLZPBPUSA-N 0.000 description 2
- HVQCEQTUSWWFOS-WDSKDSINSA-N Gln-Gly-Cys Chemical compound C(CC(=O)N)[C@@H](C(=O)NCC(=O)N[C@@H](CS)C(=O)O)N HVQCEQTUSWWFOS-WDSKDSINSA-N 0.000 description 2
- GNMQDOGFWYWPNM-LAEOZQHASA-N Gln-Gly-Ile Chemical compound CC[C@H](C)[C@H](NC(=O)CNC(=O)[C@@H](N)CCC(N)=O)C(O)=O GNMQDOGFWYWPNM-LAEOZQHASA-N 0.000 description 2
- ORYMMTRPKVTGSJ-XVKPBYJWSA-N Gln-Gly-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)CNC(=O)[C@@H](N)CCC(N)=O ORYMMTRPKVTGSJ-XVKPBYJWSA-N 0.000 description 2
- HXOLDXKNWKLDMM-YVNDNENWSA-N Gln-Ile-Glu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)O)NC(=O)[C@H](CCC(=O)N)N HXOLDXKNWKLDMM-YVNDNENWSA-N 0.000 description 2
- VNTGPISAOMAXRK-CIUDSAMLSA-N Gln-Pro-Ser Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CO)C(O)=O VNTGPISAOMAXRK-CIUDSAMLSA-N 0.000 description 2
- IESFZVCAVACGPH-PEFMBERDSA-N Glu-Asp-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@H](CC(O)=O)NC(=O)[C@@H](N)CCC(O)=O IESFZVCAVACGPH-PEFMBERDSA-N 0.000 description 2
- MXPBQDFWIMBACQ-ACZMJKKPSA-N Glu-Cys-Cys Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H](CS)C(=O)N[C@@H](CS)C(O)=O MXPBQDFWIMBACQ-ACZMJKKPSA-N 0.000 description 2
- BUZMZDDKFCSKOT-CIUDSAMLSA-N Glu-Glu-Glu Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O BUZMZDDKFCSKOT-CIUDSAMLSA-N 0.000 description 2
- SJPMNHCEWPTRBR-BQBZGAKWSA-N Glu-Glu-Gly Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)NCC(O)=O SJPMNHCEWPTRBR-BQBZGAKWSA-N 0.000 description 2
- OGNJZUXUTPQVBR-BQBZGAKWSA-N Glu-Gly-Glu Chemical compound OC(=O)CC[C@H](N)C(=O)NCC(=O)N[C@@H](CCC(O)=O)C(O)=O OGNJZUXUTPQVBR-BQBZGAKWSA-N 0.000 description 2
- KRGZZKWSBGPLKL-IUCAKERBSA-N Glu-Gly-Lys Chemical compound C(CCN)C[C@@H](C(=O)O)NC(=O)CNC(=O)[C@H](CCC(=O)O)N KRGZZKWSBGPLKL-IUCAKERBSA-N 0.000 description 2
- RAUDKMVXNOWDLS-WDSKDSINSA-N Glu-Gly-Ser Chemical compound OC(=O)CC[C@H](N)C(=O)NCC(=O)N[C@@H](CO)C(O)=O RAUDKMVXNOWDLS-WDSKDSINSA-N 0.000 description 2
- VGUYMZGLJUJRBV-YVNDNENWSA-N Glu-Ile-Glu Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CCC(O)=O)C(O)=O VGUYMZGLJUJRBV-YVNDNENWSA-N 0.000 description 2
- IVGJYOOGJLFKQE-AVGNSLFASA-N Glu-Leu-Lys Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CCC(=O)O)N IVGJYOOGJLFKQE-AVGNSLFASA-N 0.000 description 2
- HMJULNMJWOZNFI-XHNCKOQMSA-N Glu-Ser-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CO)NC(=O)[C@H](CCC(=O)O)N)C(=O)O HMJULNMJWOZNFI-XHNCKOQMSA-N 0.000 description 2
- CAQXJMUDOLSBPF-SUSMZKCASA-N Glu-Thr-Thr Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O CAQXJMUDOLSBPF-SUSMZKCASA-N 0.000 description 2
- HAGKYCXGTRUUFI-RYUDHWBXSA-N Glu-Tyr-Gly Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)NCC(=O)O)NC(=O)[C@H](CCC(=O)O)N)O HAGKYCXGTRUUFI-RYUDHWBXSA-N 0.000 description 2
- BRFJMRSRMOMIMU-WHFBIAKZSA-N Gly-Ala-Asn Chemical compound NCC(=O)N[C@@H](C)C(=O)N[C@@H](CC(N)=O)C(O)=O BRFJMRSRMOMIMU-WHFBIAKZSA-N 0.000 description 2
- MFVQGXGQRIXBPK-WDSKDSINSA-N Gly-Ala-Glu Chemical compound NCC(=O)N[C@@H](C)C(=O)N[C@@H](CCC(O)=O)C(O)=O MFVQGXGQRIXBPK-WDSKDSINSA-N 0.000 description 2
- LJPIRKICOISLKN-WHFBIAKZSA-N Gly-Ala-Ser Chemical compound NCC(=O)N[C@@H](C)C(=O)N[C@@H](CO)C(O)=O LJPIRKICOISLKN-WHFBIAKZSA-N 0.000 description 2
- OGCIHJPYKVSMTE-YUMQZZPRSA-N Gly-Arg-Glu Chemical compound [H]NCC(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCC(O)=O)C(O)=O OGCIHJPYKVSMTE-YUMQZZPRSA-N 0.000 description 2
- DTPOVRRYXPJJAZ-FJXKBIBVSA-N Gly-Arg-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)CN)CCCN=C(N)N DTPOVRRYXPJJAZ-FJXKBIBVSA-N 0.000 description 2
- JVWPPCWUDRJGAE-YUMQZZPRSA-N Gly-Asn-Leu Chemical compound [H]NCC(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(C)C)C(O)=O JVWPPCWUDRJGAE-YUMQZZPRSA-N 0.000 description 2
- KQDMENMTYNBWMR-WHFBIAKZSA-N Gly-Asp-Ala Chemical compound [H]NCC(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](C)C(O)=O KQDMENMTYNBWMR-WHFBIAKZSA-N 0.000 description 2
- CQZDZKRHFWJXDF-WDSKDSINSA-N Gly-Gln-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H](CCC(N)=O)NC(=O)CN CQZDZKRHFWJXDF-WDSKDSINSA-N 0.000 description 2
- BYYNJRSNDARRBX-YFKPBYRVSA-N Gly-Gln-Gly Chemical compound NCC(=O)N[C@@H](CCC(N)=O)C(=O)NCC(O)=O BYYNJRSNDARRBX-YFKPBYRVSA-N 0.000 description 2
- YZPVGIVFMZLQMM-YUMQZZPRSA-N Gly-Gln-Met Chemical compound CSCC[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)N)NC(=O)CN YZPVGIVFMZLQMM-YUMQZZPRSA-N 0.000 description 2
- HDNXXTBKOJKWNN-WDSKDSINSA-N Gly-Glu-Asn Chemical compound NCC(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O HDNXXTBKOJKWNN-WDSKDSINSA-N 0.000 description 2
- CCQOOWAONKGYKQ-BYPYZUCNSA-N Gly-Gly-Ala Chemical compound OC(=O)[C@H](C)NC(=O)CNC(=O)CN CCQOOWAONKGYKQ-BYPYZUCNSA-N 0.000 description 2
- TVDHVLGFJSHPAX-UWVGGRQHSA-N Gly-His-Arg Chemical compound NC(N)=NCCC[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)CN)CC1=CN=CN1 TVDHVLGFJSHPAX-UWVGGRQHSA-N 0.000 description 2
- AYBKPDHHVADEDA-YUMQZZPRSA-N Gly-His-Asn Chemical compound [H]NCC(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CC(N)=O)C(O)=O AYBKPDHHVADEDA-YUMQZZPRSA-N 0.000 description 2
- SXJHOPPTOJACOA-QXEWZRGKSA-N Gly-Ile-Arg Chemical compound NCC(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@H](C(O)=O)CCCN=C(N)N SXJHOPPTOJACOA-QXEWZRGKSA-N 0.000 description 2
- ITZOBNKQDZEOCE-NHCYSSNCSA-N Gly-Ile-Lys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)CN ITZOBNKQDZEOCE-NHCYSSNCSA-N 0.000 description 2
- HAXARWKYFIIHKD-ZKWXMUAHSA-N Gly-Ile-Ser Chemical compound NCC(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CO)C(O)=O HAXARWKYFIIHKD-ZKWXMUAHSA-N 0.000 description 2
- NSTUFLGQJCOCDL-UWVGGRQHSA-N Gly-Leu-Arg Chemical compound NCC(=O)N[C@@H](CC(C)C)C(=O)N[C@H](C(O)=O)CCCN=C(N)N NSTUFLGQJCOCDL-UWVGGRQHSA-N 0.000 description 2
- IUZGUFAJDBHQQV-YUMQZZPRSA-N Gly-Leu-Asn Chemical compound NCC(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(N)=O)C(O)=O IUZGUFAJDBHQQV-YUMQZZPRSA-N 0.000 description 2
- LLZXNUUIBOALNY-QWRGUYRKSA-N Gly-Leu-Lys Chemical compound NCC(=O)N[C@@H](CC(C)C)C(=O)N[C@H](C(O)=O)CCCCN LLZXNUUIBOALNY-QWRGUYRKSA-N 0.000 description 2
- NTBOEZICHOSJEE-YUMQZZPRSA-N Gly-Lys-Ser Chemical compound [H]NCC(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CO)C(O)=O NTBOEZICHOSJEE-YUMQZZPRSA-N 0.000 description 2
- LXTRSHQLGYINON-DTWKUNHWSA-N Gly-Met-Pro Chemical compound CSCC[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)CN LXTRSHQLGYINON-DTWKUNHWSA-N 0.000 description 2
- GGAPHLIUUTVYMX-QWRGUYRKSA-N Gly-Phe-Ser Chemical compound OC[C@@H](C([O-])=O)NC(=O)[C@@H](NC(=O)C[NH3+])CC1=CC=CC=C1 GGAPHLIUUTVYMX-QWRGUYRKSA-N 0.000 description 2
- WDXLKVQATNEAJQ-BQBZGAKWSA-N Gly-Pro-Asp Chemical compound NCC(=O)N1CCC[C@H]1C(=O)N[C@@H](CC(O)=O)C(O)=O WDXLKVQATNEAJQ-BQBZGAKWSA-N 0.000 description 2
- OCPPBNKYGYSLOE-IUCAKERBSA-N Gly-Pro-Met Chemical compound CSCC[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1C(=O)CN OCPPBNKYGYSLOE-IUCAKERBSA-N 0.000 description 2
- NGRPGJGKJMUGDM-XVKPBYJWSA-N Gly-Val-Gln Chemical compound NCC(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CCC(N)=O)C(O)=O NGRPGJGKJMUGDM-XVKPBYJWSA-N 0.000 description 2
- SYOJVRNQCXYEOV-XVKPBYJWSA-N Gly-Val-Glu Chemical compound [H]NCC(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O SYOJVRNQCXYEOV-XVKPBYJWSA-N 0.000 description 2
- MUGLKCQHTUFLGF-WPRPVWTQSA-N Gly-Val-Met Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCSC)C(=O)O)NC(=O)CN MUGLKCQHTUFLGF-WPRPVWTQSA-N 0.000 description 2
- HAPWZEVRQYGLSG-IUCAKERBSA-N His-Gly-Glu Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)NCC(=O)N[C@@H](CCC(O)=O)C(O)=O HAPWZEVRQYGLSG-IUCAKERBSA-N 0.000 description 2
- 241000282414 Homo sapiens Species 0.000 description 2
- AQCUAZTZSPQJFF-ZKWXMUAHSA-N Ile-Ala-Gly Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H](C)C(=O)NCC(O)=O AQCUAZTZSPQJFF-ZKWXMUAHSA-N 0.000 description 2
- CTHAJJYOHOBUDY-GHCJXIJMSA-N Ile-Cys-Asp Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CS)C(=O)N[C@@H](CC(=O)O)C(=O)O)N CTHAJJYOHOBUDY-GHCJXIJMSA-N 0.000 description 2
- CYHJCEKUMCNDFG-LAEOZQHASA-N Ile-Gln-Gly Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)NCC(=O)O)N CYHJCEKUMCNDFG-LAEOZQHASA-N 0.000 description 2
- JXMSHKFPDIUYGS-SIUGBPQLSA-N Ile-Glu-Tyr Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)O)N JXMSHKFPDIUYGS-SIUGBPQLSA-N 0.000 description 2
- SLQVFYWBGNNOTK-BYULHYEWSA-N Ile-Gly-Asn Chemical compound CC[C@H](C)[C@@H](C(=O)NCC(=O)N[C@@H](CC(=O)N)C(=O)O)N SLQVFYWBGNNOTK-BYULHYEWSA-N 0.000 description 2
- PWDSHAAAFXISLE-SXTJYALSSA-N Ile-Ile-Asp Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC(O)=O)C(O)=O PWDSHAAAFXISLE-SXTJYALSSA-N 0.000 description 2
- DGTOKVBDZXJHNZ-WZLNRYEVSA-N Ile-Thr-Tyr Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)O)N DGTOKVBDZXJHNZ-WZLNRYEVSA-N 0.000 description 2
- RNKSNIBMTUYWSH-YFKPBYRVSA-N L-prolylglycine Chemical compound [O-]C(=O)CNC(=O)[C@@H]1CCC[NH2+]1 RNKSNIBMTUYWSH-YFKPBYRVSA-N 0.000 description 2
- TYYLDKGBCJGJGW-UHFFFAOYSA-N L-tryptophan-L-tyrosine Natural products C=1NC2=CC=CC=C2C=1CC(N)C(=O)NC(C(O)=O)CC1=CC=C(O)C=C1 TYYLDKGBCJGJGW-UHFFFAOYSA-N 0.000 description 2
- 241000282838 Lama Species 0.000 description 2
- QCSFMCFHVGTLFF-NHCYSSNCSA-N Leu-Asp-Val Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](C(C)C)C(O)=O QCSFMCFHVGTLFF-NHCYSSNCSA-N 0.000 description 2
- HVJVUYQWFYMGJS-GVXVVHGQSA-N Leu-Glu-Val Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](C(C)C)C(O)=O HVJVUYQWFYMGJS-GVXVVHGQSA-N 0.000 description 2
- OXRLYTYUXAQTHP-YUMQZZPRSA-N Leu-Gly-Ala Chemical compound [H]N[C@@H](CC(C)C)C(=O)NCC(=O)N[C@@H](C)C(O)=O OXRLYTYUXAQTHP-YUMQZZPRSA-N 0.000 description 2
- JNDYEOUZBLOVOF-AVGNSLFASA-N Leu-Leu-Gln Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(N)=O)C(O)=O JNDYEOUZBLOVOF-AVGNSLFASA-N 0.000 description 2
- VVQJGYPTIYOFBR-IHRRRGAJSA-N Leu-Lys-Met Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCSC)C(=O)O)N VVQJGYPTIYOFBR-IHRRRGAJSA-N 0.000 description 2
- VCHVSKNMTXWIIP-SRVKXCTJSA-N Leu-Lys-Ser Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CO)C(O)=O VCHVSKNMTXWIIP-SRVKXCTJSA-N 0.000 description 2
- HDHQQEDVWQGBEE-DCAQKATOSA-N Leu-Met-Ser Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CO)C(O)=O HDHQQEDVWQGBEE-DCAQKATOSA-N 0.000 description 2
- AKVBOOKXVAMKSS-GUBZILKMSA-N Leu-Ser-Gln Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(=O)N[C@@H](CCC(N)=O)C(O)=O AKVBOOKXVAMKSS-GUBZILKMSA-N 0.000 description 2
- ODRREERHVHMIPT-OEAJRASXSA-N Leu-Thr-Phe Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 ODRREERHVHMIPT-OEAJRASXSA-N 0.000 description 2
- ROHFNLRQFUQHCH-UHFFFAOYSA-N Leucine Natural products CC(C)CC(N)C(O)=O ROHFNLRQFUQHCH-UHFFFAOYSA-N 0.000 description 2
- 229910009891 LiAc Inorganic materials 0.000 description 2
- XFIHDSBIPWEYJJ-YUMQZZPRSA-N Lys-Ala-Gly Chemical compound OC(=O)CNC(=O)[C@H](C)NC(=O)[C@@H](N)CCCCN XFIHDSBIPWEYJJ-YUMQZZPRSA-N 0.000 description 2
- JBRWKVANRYPCAF-XIRDDKMYSA-N Lys-Asn-Trp Chemical compound C1=CC=C2C(=C1)C(=CN2)C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)[C@H](CCCCN)N JBRWKVANRYPCAF-XIRDDKMYSA-N 0.000 description 2
- IMAKMJCBYCSMHM-AVGNSLFASA-N Lys-Glu-Lys Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@H](C(O)=O)CCCCN IMAKMJCBYCSMHM-AVGNSLFASA-N 0.000 description 2
- QZONCCHVHCOBSK-YUMQZZPRSA-N Lys-Gly-Asn Chemical compound [H]N[C@@H](CCCCN)C(=O)NCC(=O)N[C@@H](CC(N)=O)C(O)=O QZONCCHVHCOBSK-YUMQZZPRSA-N 0.000 description 2
- FHIAJWBDZVHLAH-YUMQZZPRSA-N Lys-Gly-Ser Chemical compound NCCCC[C@H](N)C(=O)NCC(=O)N[C@@H](CO)C(O)=O FHIAJWBDZVHLAH-YUMQZZPRSA-N 0.000 description 2
- MIFFFXHMAHFACR-KATARQTJSA-N Lys-Ser-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@H](CO)NC(=O)[C@@H](N)CCCCN MIFFFXHMAHFACR-KATARQTJSA-N 0.000 description 2
- JHNOXVASMSXSNB-WEDXCCLWSA-N Lys-Thr-Gly Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)O)C(=O)NCC(O)=O JHNOXVASMSXSNB-WEDXCCLWSA-N 0.000 description 2
- RMOKGALPSPOYKE-KATARQTJSA-N Lys-Thr-Ser Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CO)C(O)=O RMOKGALPSPOYKE-KATARQTJSA-N 0.000 description 2
- CAVRAQIDHUPECU-UVOCVTCTSA-N Lys-Thr-Thr Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O CAVRAQIDHUPECU-UVOCVTCTSA-N 0.000 description 2
- 241000289619 Macropodidae Species 0.000 description 2
- 108010038049 Mating Factor Proteins 0.000 description 2
- JQHYVIKEFYETEW-IHRRRGAJSA-N Met-Phe-Ser Chemical compound CSCC[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](CO)C(O)=O)CC1=CC=CC=C1 JQHYVIKEFYETEW-IHRRRGAJSA-N 0.000 description 2
- 241000283903 Ovis aries Species 0.000 description 2
- NHCKESBLOMHIIE-IRXDYDNUSA-N Phe-Gly-Phe Chemical compound C([C@H](N)C(=O)NCC(=O)N[C@@H](CC=1C=CC=CC=1)C(O)=O)C1=CC=CC=C1 NHCKESBLOMHIIE-IRXDYDNUSA-N 0.000 description 2
- BPCLGWHVPVTTFM-QWRGUYRKSA-N Phe-Ser-Gly Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CO)C(=O)NCC(O)=O BPCLGWHVPVTTFM-QWRGUYRKSA-N 0.000 description 2
- JSGWNFKWZNPDAV-YDHLFZDLSA-N Phe-Val-Asp Chemical compound OC(=O)C[C@@H](C(O)=O)NC(=O)[C@H](C(C)C)NC(=O)[C@@H](N)CC1=CC=CC=C1 JSGWNFKWZNPDAV-YDHLFZDLSA-N 0.000 description 2
- ISWSIDIOOBJBQZ-UHFFFAOYSA-N Phenol Chemical compound OC1=CC=CC=C1 ISWSIDIOOBJBQZ-UHFFFAOYSA-N 0.000 description 2
- 241000425347 Phyla <beetle> Species 0.000 description 2
- ALJGSKMBIUEJOB-FXQIFTODSA-N Pro-Ala-Cys Chemical compound C[C@@H](C(=O)N[C@@H](CS)C(=O)O)NC(=O)[C@@H]1CCCN1 ALJGSKMBIUEJOB-FXQIFTODSA-N 0.000 description 2
- SGCZFWSQERRKBD-BQBZGAKWSA-N Pro-Asp-Gly Chemical compound OC(=O)CNC(=O)[C@H](CC(O)=O)NC(=O)[C@@H]1CCCN1 SGCZFWSQERRKBD-BQBZGAKWSA-N 0.000 description 2
- UPJGUQPLYWTISV-GUBZILKMSA-N Pro-Gln-Glu Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O UPJGUQPLYWTISV-GUBZILKMSA-N 0.000 description 2
- HJSCRFZVGXAGNG-SRVKXCTJSA-N Pro-Gln-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CCC(N)=O)NC(=O)[C@@H]1CCCN1 HJSCRFZVGXAGNG-SRVKXCTJSA-N 0.000 description 2
- SKICPQLTOXGWGO-GARJFASQSA-N Pro-Gln-Pro Chemical compound C1C[C@H](NC1)C(=O)N[C@@H](CCC(=O)N)C(=O)N2CCC[C@@H]2C(=O)O SKICPQLTOXGWGO-GARJFASQSA-N 0.000 description 2
- HAAQQNHQZBOWFO-LURJTMIESA-N Pro-Gly-Gly Chemical compound OC(=O)CNC(=O)CNC(=O)[C@@H]1CCCN1 HAAQQNHQZBOWFO-LURJTMIESA-N 0.000 description 2
- RCYUBVHMVUHEBM-RCWTZXSCSA-N Pro-Pro-Thr Chemical compound [H]N1CCC[C@H]1C(=O)N1CCC[C@H]1C(=O)N[C@@H]([C@@H](C)O)C(O)=O RCYUBVHMVUHEBM-RCWTZXSCSA-N 0.000 description 2
- QUBVFEANYYWBTM-VEVYYDQMSA-N Pro-Thr-Asp Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(O)=O)C(O)=O QUBVFEANYYWBTM-VEVYYDQMSA-N 0.000 description 2
- ZMLRZBWCXPQADC-TUAOUCFPSA-N Pro-Val-Pro Chemical compound CC(C)[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@@H]2CCCN2 ZMLRZBWCXPQADC-TUAOUCFPSA-N 0.000 description 2
- 241000283080 Proboscidea <mammal> Species 0.000 description 2
- 108010050808 Procollagen Proteins 0.000 description 2
- 102100035202 Procollagen-lysine,2-oxoglutarate 5-dioxygenase 1 Human genes 0.000 description 2
- 241000220010 Rhode Species 0.000 description 2
- GHPQVUYZQQGEDA-BIIVOSGPSA-N Ser-Asp-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CC(=O)O)NC(=O)[C@H](CO)N)C(=O)O GHPQVUYZQQGEDA-BIIVOSGPSA-N 0.000 description 2
- YRBGKVIWMNEVCZ-WDSKDSINSA-N Ser-Glu-Gly Chemical compound OC[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)NCC(O)=O YRBGKVIWMNEVCZ-WDSKDSINSA-N 0.000 description 2
- YMTLKLXDFCSCNX-BYPYZUCNSA-N Ser-Gly-Gly Chemical compound OC[C@H](N)C(=O)NCC(=O)NCC(O)=O YMTLKLXDFCSCNX-BYPYZUCNSA-N 0.000 description 2
- OWCVUSJMEBGMOK-YUMQZZPRSA-N Ser-Lys-Gly Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCCCN)C(=O)NCC(O)=O OWCVUSJMEBGMOK-YUMQZZPRSA-N 0.000 description 2
- UPLYXVPQLJVWMM-KKUMJFAQSA-N Ser-Phe-Leu Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(C)C)C(O)=O UPLYXVPQLJVWMM-KKUMJFAQSA-N 0.000 description 2
- BSXKBOUZDAZXHE-CIUDSAMLSA-N Ser-Pro-Glu Chemical compound [H]N[C@@H](CO)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCC(O)=O)C(O)=O BSXKBOUZDAZXHE-CIUDSAMLSA-N 0.000 description 2
- PURRNJBBXDDWLX-ZDLURKLDSA-N Ser-Thr-Gly Chemical compound C[C@H]([C@@H](C(=O)NCC(=O)O)NC(=O)[C@H](CO)N)O PURRNJBBXDDWLX-ZDLURKLDSA-N 0.000 description 2
- PMTWIUBUQRGCSB-FXQIFTODSA-N Ser-Val-Ala Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](C)C(O)=O PMTWIUBUQRGCSB-FXQIFTODSA-N 0.000 description 2
- SIEBDTCABMZCLF-XGEHTFHBSA-N Ser-Val-Thr Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O SIEBDTCABMZCLF-XGEHTFHBSA-N 0.000 description 2
- VYPSYNLAJGMNEJ-UHFFFAOYSA-N Silicium dioxide Chemical compound O=[Si]=O VYPSYNLAJGMNEJ-UHFFFAOYSA-N 0.000 description 2
- FAPWRFPIFSIZLT-UHFFFAOYSA-M Sodium chloride Chemical compound [Na+].[Cl-] FAPWRFPIFSIZLT-UHFFFAOYSA-M 0.000 description 2
- 108091081024 Start codon Proteins 0.000 description 2
- 241000282898 Sus scrofa Species 0.000 description 2
- DCCGCVLVVSAJFK-NUMRIWBASA-N Thr-Asp-Gln Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CCC(=O)N)C(=O)O)N)O DCCGCVLVVSAJFK-NUMRIWBASA-N 0.000 description 2
- DSLHSTIUAPKERR-XGEHTFHBSA-N Thr-Cys-Val Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CS)C(=O)N[C@@H](C(C)C)C(O)=O DSLHSTIUAPKERR-XGEHTFHBSA-N 0.000 description 2
- VGYBYGQXZJDZJU-XQXXSGGOSA-N Thr-Glu-Ala Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](C)C(O)=O VGYBYGQXZJDZJU-XQXXSGGOSA-N 0.000 description 2
- AQAMPXBRJJWPNI-JHEQGTHGSA-N Thr-Gly-Glu Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)NCC(=O)N[C@@H](CCC(O)=O)C(O)=O AQAMPXBRJJWPNI-JHEQGTHGSA-N 0.000 description 2
- XPNSAQMEAVSQRD-FBCQKBJTSA-N Thr-Gly-Gly Chemical compound C[C@@H](O)[C@H](N)C(=O)NCC(=O)NCC(O)=O XPNSAQMEAVSQRD-FBCQKBJTSA-N 0.000 description 2
- ZKVANNIVSDOQMG-HKUYNNGSSA-N Trp-Tyr-Gly Chemical compound C1=CC=C2C(=C1)C(=CN2)C[C@@H](C(=O)N[C@@H](CC3=CC=C(C=C3)O)C(=O)NCC(=O)O)N ZKVANNIVSDOQMG-HKUYNNGSSA-N 0.000 description 2
- YGKVNUAKYPGORG-AVGNSLFASA-N Tyr-Asp-Glu Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O YGKVNUAKYPGORG-AVGNSLFASA-N 0.000 description 2
- JWHOIHCOHMZSAR-QWRGUYRKSA-N Tyr-Asp-Gly Chemical compound OC(=O)CNC(=O)[C@H](CC(O)=O)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 JWHOIHCOHMZSAR-QWRGUYRKSA-N 0.000 description 2
- RCLOWEZASFJFEX-KKUMJFAQSA-N Tyr-Asp-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CC(O)=O)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 RCLOWEZASFJFEX-KKUMJFAQSA-N 0.000 description 2
- QARCDOCCDOLJSF-HJPIBITLSA-N Tyr-Ile-Ser Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CO)C(=O)O)NC(=O)[C@H](CC1=CC=C(C=C1)O)N QARCDOCCDOLJSF-HJPIBITLSA-N 0.000 description 2
- XGZBEGGGAUQBMB-KJEVXHAQSA-N Tyr-Pro-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@@H]1CCCN1C(=O)[C@H](CC2=CC=C(C=C2)O)N)O XGZBEGGGAUQBMB-KJEVXHAQSA-N 0.000 description 2
- KUXCBJFJURINGF-PXDAIIFMSA-N Tyr-Trp-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CC1=CNC2=CC=CC=C21)NC(=O)[C@H](CC3=CC=C(C=C3)O)N KUXCBJFJURINGF-PXDAIIFMSA-N 0.000 description 2
- DRTQHJPVMGBUCF-XVFCMESISA-N Uridine Chemical compound O[C@@H]1[C@H](O)[C@@H](CO)O[C@H]1N1C(=O)NC(=O)C=C1 DRTQHJPVMGBUCF-XVFCMESISA-N 0.000 description 2
- IZFVRRYRMQFVGX-NRPADANISA-N Val-Ala-Gln Chemical compound C[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)O)NC(=O)[C@H](C(C)C)N IZFVRRYRMQFVGX-NRPADANISA-N 0.000 description 2
- ZLFHAAGHGQBQQN-AEJSXWLSSA-N Val-Ala-Pro Chemical compound C[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](C(C)C)N ZLFHAAGHGQBQQN-AEJSXWLSSA-N 0.000 description 2
- ZLFHAAGHGQBQQN-GUBZILKMSA-N Val-Ala-Pro Natural products CC(C)[C@H](N)C(=O)N[C@@H](C)C(=O)N1CCC[C@H]1C(O)=O ZLFHAAGHGQBQQN-GUBZILKMSA-N 0.000 description 2
- LABUITCFCAABSV-UHFFFAOYSA-N Val-Ala-Tyr Natural products CC(C)C(N)C(=O)NC(C)C(=O)NC(C(O)=O)CC1=CC=C(O)C=C1 LABUITCFCAABSV-UHFFFAOYSA-N 0.000 description 2
- WFENBJPLZMPVAX-XVKPBYJWSA-N Val-Gly-Glu Chemical compound CC(C)[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CCC(O)=O WFENBJPLZMPVAX-XVKPBYJWSA-N 0.000 description 2
- BMOFUVHDBROBSE-DCAQKATOSA-N Val-Leu-Cys Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CS)C(=O)O)NC(=O)[C@H](C(C)C)N BMOFUVHDBROBSE-DCAQKATOSA-N 0.000 description 2
- YQMILNREHKTFBS-IHRRRGAJSA-N Val-Phe-Cys Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CS)C(=O)O)N YQMILNREHKTFBS-IHRRRGAJSA-N 0.000 description 2
- QTXGUIMEHKCPBH-FHWLQOOXSA-N Val-Trp-Lys Chemical compound C1=CC=C2C(C[C@H](NC(=O)[C@@H](N)C(C)C)C(=O)N[C@@H](CCCCN)C(O)=O)=CNC2=C1 QTXGUIMEHKCPBH-FHWLQOOXSA-N 0.000 description 2
- RTJPAGFXOWEBAI-SRVKXCTJSA-N Val-Val-Arg Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@H](C(O)=O)CCCN=C(N)N RTJPAGFXOWEBAI-SRVKXCTJSA-N 0.000 description 2
- 229940088710 antibiotic agent Drugs 0.000 description 2
- 108010072041 arginyl-glycyl-aspartic acid Proteins 0.000 description 2
- 108010086780 arginyl-glycyl-aspartyl-alanine Proteins 0.000 description 2
- 230000002457 bidirectional effect Effects 0.000 description 2
- 229960000074 biopharmaceutical Drugs 0.000 description 2
- 210000004204 blood vessel Anatomy 0.000 description 2
- 210000004899 c-terminal region Anatomy 0.000 description 2
- 230000015556 catabolic process Effects 0.000 description 2
- 230000004186 co-expression Effects 0.000 description 2
- 230000037319 collagen production Effects 0.000 description 2
- 210000002808 connective tissue Anatomy 0.000 description 2
- 238000004132 cross linking Methods 0.000 description 2
- 108010060199 cysteinylproline Proteins 0.000 description 2
- 238000006731 degradation reaction Methods 0.000 description 2
- YSMODUONRAFBET-UHFFFAOYSA-N delta-DL-hydroxylysine Natural products NCC(O)CCC(N)C(O)=O YSMODUONRAFBET-UHFFFAOYSA-N 0.000 description 2
- 238000011161 development Methods 0.000 description 2
- 208000037265 diseases, disorders, signs and symptoms Diseases 0.000 description 2
- 238000009826 distribution Methods 0.000 description 2
- 210000002472 endoplasmic reticulum Anatomy 0.000 description 2
- YSMODUONRAFBET-UHNVWZDZSA-N erythro-5-hydroxy-L-lysine Chemical compound NC[C@H](O)CC[C@H](N)C(O)=O YSMODUONRAFBET-UHNVWZDZSA-N 0.000 description 2
- 210000003754 fetus Anatomy 0.000 description 2
- 230000006870 function Effects 0.000 description 2
- 239000008273 gelatin Substances 0.000 description 2
- 229920000159 gelatin Polymers 0.000 description 2
- 235000019322 gelatine Nutrition 0.000 description 2
- 235000011852 gelatine desserts Nutrition 0.000 description 2
- 108010045624 glutamyl-lysyl-alanyl-histidyl-aspartyl-glycyl-glycyl-arginine Proteins 0.000 description 2
- 108010010096 glycyl-glycyl-tyrosine Proteins 0.000 description 2
- 108010087823 glycyltyrosine Proteins 0.000 description 2
- 108010045383 histidyl-glycyl-glutamic acid Proteins 0.000 description 2
- 230000007062 hydrolysis Effects 0.000 description 2
- 238000006460 hydrolysis reaction Methods 0.000 description 2
- 229960002591 hydroxyproline Drugs 0.000 description 2
- 238000001727 in vivo Methods 0.000 description 2
- 108010031424 isoleucyl-prolyl-proline Proteins 0.000 description 2
- 108010073093 leucyl-glycyl-glycyl-glycine Proteins 0.000 description 2
- PWPJGUXAGUPAHP-UHFFFAOYSA-N lufenuron Chemical compound C1=C(Cl)C(OC(F)(F)C(C(F)(F)F)F)=CC(Cl)=C1NC(=O)NC(=O)C1=C(F)C=CC=C1F PWPJGUXAGUPAHP-UHFFFAOYSA-N 0.000 description 2
- 108010017391 lysylvaline Proteins 0.000 description 2
- 239000012528 membrane Substances 0.000 description 2
- 150000007523 nucleic acids Chemical class 0.000 description 2
- 239000008188 pellet Substances 0.000 description 2
- 108010012581 phenylalanylglutamate Proteins 0.000 description 2
- 230000004481 post-translational protein modification Effects 0.000 description 2
- 108010004914 prolylarginine Proteins 0.000 description 2
- 108010053725 prolylvaline Proteins 0.000 description 2
- 238000000746 purification Methods 0.000 description 2
- 238000003259 recombinant expression Methods 0.000 description 2
- 230000001105 regulatory effect Effects 0.000 description 2
- 108010048818 seryl-histidine Proteins 0.000 description 2
- 241000894007 species Species 0.000 description 2
- 210000002435 tendon Anatomy 0.000 description 2
- 108010044292 tryptophyltyrosine Proteins 0.000 description 2
- 108010011876 valyl-glycyl-valyl-alanyl-prolyl-glycine Proteins 0.000 description 2
- 230000002792 vascular Effects 0.000 description 2
- DGVVWUTYPXICAM-UHFFFAOYSA-N β‐Mercaptoethanol Chemical compound OCCS DGVVWUTYPXICAM-UHFFFAOYSA-N 0.000 description 2
- RUPUUZZJJXCDHS-UHFFFAOYSA-N (+)-orientaline Natural products C1=C(O)C(OC)=CC(CC2C3=CC(O)=C(OC)C=C3CCN2C)=C1 RUPUUZZJJXCDHS-UHFFFAOYSA-N 0.000 description 1
- BJBUEDPLEOHJGE-UHFFFAOYSA-N (2R,3S)-3-Hydroxy-2-pyrolidinecarboxylic acid Natural products OC1CCNC1C(O)=O BJBUEDPLEOHJGE-UHFFFAOYSA-N 0.000 description 1
- VZQZXAJWZUSYHU-IKCSJVAGSA-N (2r,3s,4s,5r)-2,3,4,5,6-pentahydroxy-1-[(3r,4s,5s,6r)-3,4,5-trihydroxy-6-(hydroxymethyl)oxan-2-yl]hexan-1-one Chemical compound OC[C@@H](O)[C@H](O)[C@H](O)[C@@H](O)C(=O)C1O[C@H](CO)[C@@H](O)[C@H](O)[C@H]1O VZQZXAJWZUSYHU-IKCSJVAGSA-N 0.000 description 1
- 108091032973 (ribonucleotides)n+m Proteins 0.000 description 1
- JKMHFZQWWAIEOD-UHFFFAOYSA-N 2-[4-(2-hydroxyethyl)piperazin-1-yl]ethanesulfonic acid Chemical compound OCC[NH+]1CCN(CCS([O-])(=O)=O)CC1 JKMHFZQWWAIEOD-UHFFFAOYSA-N 0.000 description 1
- 241000251468 Actinopterygii Species 0.000 description 1
- BEMGNWZECGIJOI-WDSKDSINSA-N Ala-Gly-Glu Chemical compound [H]N[C@@H](C)C(=O)NCC(=O)N[C@@H](CCC(O)=O)C(O)=O BEMGNWZECGIJOI-WDSKDSINSA-N 0.000 description 1
- OBVSBEYOMDWLRJ-BFHQHQDPSA-N Ala-Gly-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)CNC(=O)[C@H](C)N OBVSBEYOMDWLRJ-BFHQHQDPSA-N 0.000 description 1
- SMCGQGDVTPFXKB-XPUUQOCRSA-N Ala-Gly-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)CNC(=O)[C@H](C)N SMCGQGDVTPFXKB-XPUUQOCRSA-N 0.000 description 1
- DVJSJDDYCYSMFR-ZKWXMUAHSA-N Ala-Ile-Gly Chemical compound [H]N[C@@H](C)C(=O)N[C@@H]([C@@H](C)CC)C(=O)NCC(O)=O DVJSJDDYCYSMFR-ZKWXMUAHSA-N 0.000 description 1
- VCSABYLVNWQYQE-SRVKXCTJSA-N Ala-Lys-Lys Chemical compound NCCCC[C@H](NC(=O)[C@@H](N)C)C(=O)N[C@@H](CCCCN)C(O)=O VCSABYLVNWQYQE-SRVKXCTJSA-N 0.000 description 1
- VCSABYLVNWQYQE-UHFFFAOYSA-N Ala-Lys-Lys Natural products NCCCCC(NC(=O)C(N)C)C(=O)NC(CCCCN)C(O)=O VCSABYLVNWQYQE-UHFFFAOYSA-N 0.000 description 1
- OEVCHROQUIVQFZ-YTLHQDLWSA-N Ala-Thr-Ala Chemical compound C[C@H](N)C(=O)N[C@@H]([C@H](O)C)C(=O)N[C@@H](C)C(O)=O OEVCHROQUIVQFZ-YTLHQDLWSA-N 0.000 description 1
- OMSKGWFGWCQFBD-KZVJFYERSA-N Ala-Val-Thr Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O OMSKGWFGWCQFBD-KZVJFYERSA-N 0.000 description 1
- 102100034044 All-trans-retinol dehydrogenase [NAD(+)] ADH1B Human genes 0.000 description 1
- 101710193111 All-trans-retinol dehydrogenase [NAD(+)] ADH4 Proteins 0.000 description 1
- 101100378521 Arabidopsis thaliana ADH2 gene Proteins 0.000 description 1
- 108010010777 Arg-Gly-Asp-Gly Proteins 0.000 description 1
- KRQSPVKUISQQFS-FJXKBIBVSA-N Arg-Gly-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)CNC(=O)[C@@H](N)CCCN=C(N)N KRQSPVKUISQQFS-FJXKBIBVSA-N 0.000 description 1
- NGTYEHIRESTSRX-UWVGGRQHSA-N Arg-Lys-Gly Chemical compound NCCCC[C@@H](C(=O)NCC(O)=O)NC(=O)[C@@H](N)CCCN=C(N)N NGTYEHIRESTSRX-UWVGGRQHSA-N 0.000 description 1
- FVBZXNSRIDVYJS-AVGNSLFASA-N Arg-Pro-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1C(=O)[C@@H](N)CCCN=C(N)N FVBZXNSRIDVYJS-AVGNSLFASA-N 0.000 description 1
- DNLQVHBBMPZUGJ-BQBZGAKWSA-N Arg-Ser-Gly Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CO)C(=O)NCC(O)=O DNLQVHBBMPZUGJ-BQBZGAKWSA-N 0.000 description 1
- ICRHGPYYXMWHIE-LPEHRKFASA-N Arg-Ser-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CO)NC(=O)[C@H](CCCN=C(N)N)N)C(=O)O ICRHGPYYXMWHIE-LPEHRKFASA-N 0.000 description 1
- UPALZCBCKAMGIY-PEFMBERDSA-N Asn-Gln-Ile Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O UPALZCBCKAMGIY-PEFMBERDSA-N 0.000 description 1
- GJFYPBDMUGGLFR-NKWVEPMBSA-N Asn-Gly-Pro Chemical compound C1C[C@@H](N(C1)C(=O)CNC(=O)[C@H](CC(=O)N)N)C(=O)O GJFYPBDMUGGLFR-NKWVEPMBSA-N 0.000 description 1
- RAQMSGVCGSJKCL-FOHZUACHSA-N Asn-Gly-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)CNC(=O)[C@@H](N)CC(N)=O RAQMSGVCGSJKCL-FOHZUACHSA-N 0.000 description 1
- CSEJMKNZDCJYGJ-XHNCKOQMSA-N Asp-Gln-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CCC(=O)N)NC(=O)[C@H](CC(=O)O)N)C(=O)O CSEJMKNZDCJYGJ-XHNCKOQMSA-N 0.000 description 1
- HAFCJCDJGIOYPW-WDSKDSINSA-N Asp-Gly-Gln Chemical compound OC(=O)C[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CCC(N)=O HAFCJCDJGIOYPW-WDSKDSINSA-N 0.000 description 1
- POTCZYQVVNXUIG-BQBZGAKWSA-N Asp-Gly-Pro Chemical compound OC(=O)C[C@H](N)C(=O)NCC(=O)N1CCC[C@H]1C(O)=O POTCZYQVVNXUIG-BQBZGAKWSA-N 0.000 description 1
- SPKCGKRUYKMDHP-GUDRVLHUSA-N Asp-Ile-Pro Chemical compound CC[C@H](C)[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CC(=O)O)N SPKCGKRUYKMDHP-GUDRVLHUSA-N 0.000 description 1
- SCQIQCWLOMOEFP-DCAQKATOSA-N Asp-Leu-Arg Chemical compound OC(=O)C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCCN=C(N)N)C(O)=O SCQIQCWLOMOEFP-DCAQKATOSA-N 0.000 description 1
- MJJIHRWNWSQTOI-VEVYYDQMSA-N Asp-Thr-Arg Chemical compound OC(=O)C[C@H](N)C(=O)N[C@@H]([C@H](O)C)C(=O)N[C@@H](CCCN=C(N)N)C(O)=O MJJIHRWNWSQTOI-VEVYYDQMSA-N 0.000 description 1
- KNDCWFXCFKSEBM-AVGNSLFASA-N Asp-Tyr-Glu Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CCC(O)=O)C(O)=O KNDCWFXCFKSEBM-AVGNSLFASA-N 0.000 description 1
- 241001513093 Aspergillus awamori Species 0.000 description 1
- 241000228245 Aspergillus niger Species 0.000 description 1
- 241000972773 Aulopiformes Species 0.000 description 1
- 102100029516 Basic salivary proline-rich protein 1 Human genes 0.000 description 1
- 101100028942 Bos taurus P4HB gene Proteins 0.000 description 1
- 108010004032 Bromelains Proteins 0.000 description 1
- 101150082216 COL2A1 gene Proteins 0.000 description 1
- 241000244203 Caenorhabditis elegans Species 0.000 description 1
- 101100494773 Caenorhabditis elegans ctl-2 gene Proteins 0.000 description 1
- OYPRJOBELJOOCE-UHFFFAOYSA-N Calcium Chemical compound [Ca] OYPRJOBELJOOCE-UHFFFAOYSA-N 0.000 description 1
- OKTJSMMVPCPJKN-UHFFFAOYSA-N Carbon Chemical compound [C] OKTJSMMVPCPJKN-UHFFFAOYSA-N 0.000 description 1
- 108010053835 Catalase Proteins 0.000 description 1
- 241000251730 Chondrichthyes Species 0.000 description 1
- 108090000317 Chymotrypsin Proteins 0.000 description 1
- 101150008975 Col3a1 gene Proteins 0.000 description 1
- 102000012432 Collagen Type V Human genes 0.000 description 1
- 108010022514 Collagen Type V Proteins 0.000 description 1
- 102000009736 Collagen Type XI Human genes 0.000 description 1
- 108010034789 Collagen Type XI Proteins 0.000 description 1
- 102000009089 Collagen Type XIII Human genes 0.000 description 1
- 108010073180 Collagen Type XIII Proteins 0.000 description 1
- 102000029816 Collagenase Human genes 0.000 description 1
- 108060005980 Collagenase Proteins 0.000 description 1
- 241001481833 Coryphaena hippurus Species 0.000 description 1
- 241000938605 Crocodylia Species 0.000 description 1
- 241000195493 Cryptophyta Species 0.000 description 1
- 244000303965 Cyamopsis psoralioides Species 0.000 description 1
- IIGHQOPGMGKDMT-SRVKXCTJSA-N Cys-Asp-Phe Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)O)NC(=O)[C@H](CS)N IIGHQOPGMGKDMT-SRVKXCTJSA-N 0.000 description 1
- HHABWQIFXZPZCK-ACZMJKKPSA-N Cys-Gln-Ser Chemical compound C(CC(=O)N)[C@@H](C(=O)N[C@@H](CO)C(=O)O)NC(=O)[C@H](CS)N HHABWQIFXZPZCK-ACZMJKKPSA-N 0.000 description 1
- GFAPBMCRSMSGDZ-XGEHTFHBSA-N Cys-Thr-Met Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CCSC)C(=O)O)NC(=O)[C@H](CS)N)O GFAPBMCRSMSGDZ-XGEHTFHBSA-N 0.000 description 1
- 101150097493 D gene Proteins 0.000 description 1
- KCXVZYZYPLLWCC-UHFFFAOYSA-N EDTA Chemical compound OC(=O)CN(CC(O)=O)CCN(CC(O)=O)CC(O)=O KCXVZYZYPLLWCC-UHFFFAOYSA-N 0.000 description 1
- 241000283073 Equus caballus Species 0.000 description 1
- 244000166124 Eucalyptus globulus Species 0.000 description 1
- 101150055254 FBA2 gene Proteins 0.000 description 1
- 101150034017 FDH1 gene Proteins 0.000 description 1
- 101100112369 Fasciola hepatica Cat-1 gene Proteins 0.000 description 1
- 108090000698 Formate Dehydrogenases Proteins 0.000 description 1
- 102100037181 Fructose-1,6-bisphosphatase 1 Human genes 0.000 description 1
- 241001200922 Gagata Species 0.000 description 1
- 108700007698 Genetic Terminator Regions Proteins 0.000 description 1
- SWDSRANUCKNBLA-AVGNSLFASA-N Gln-Phe-Asp Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)O)NC(=O)[C@H](CCC(=O)N)N SWDSRANUCKNBLA-AVGNSLFASA-N 0.000 description 1
- HMIXCETWRYDVMO-GUBZILKMSA-N Gln-Pro-Glu Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCC(O)=O)C(O)=O HMIXCETWRYDVMO-GUBZILKMSA-N 0.000 description 1
- PXHABOCPJVTGEK-BQBZGAKWSA-N Glu-Gln-Gly Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(N)=O)C(=O)NCC(O)=O PXHABOCPJVTGEK-BQBZGAKWSA-N 0.000 description 1
- OPAINBJQDQTGJY-JGVFFNPUSA-N Glu-Gly-Pro Chemical compound C1C[C@@H](N(C1)C(=O)CNC(=O)[C@H](CCC(=O)O)N)C(=O)O OPAINBJQDQTGJY-JGVFFNPUSA-N 0.000 description 1
- ITBHUUMCJJQUSC-LAEOZQHASA-N Glu-Ile-Gly Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)CC)C(=O)NCC(O)=O ITBHUUMCJJQUSC-LAEOZQHASA-N 0.000 description 1
- YQAQQKPWFOBSMU-WDCWCFNPSA-N Glu-Thr-Leu Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(C)C)C(O)=O YQAQQKPWFOBSMU-WDCWCFNPSA-N 0.000 description 1
- 108010073178 Glucan 1,4-alpha-Glucosidase Proteins 0.000 description 1
- 102100022624 Glucoamylase Human genes 0.000 description 1
- GZUKEVBTYNNUQF-WDSKDSINSA-N Gly-Ala-Gln Chemical compound NCC(=O)N[C@@H](C)C(=O)N[C@@H](CCC(N)=O)C(O)=O GZUKEVBTYNNUQF-WDSKDSINSA-N 0.000 description 1
- QSDKBRMVXSWAQE-BFHQHQDPSA-N Gly-Ala-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)CN QSDKBRMVXSWAQE-BFHQHQDPSA-N 0.000 description 1
- OCQUNKSFDYDXBG-QXEWZRGKSA-N Gly-Arg-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)CN)CCCN=C(N)N OCQUNKSFDYDXBG-QXEWZRGKSA-N 0.000 description 1
- UXJHNZODTMHWRD-WHFBIAKZSA-N Gly-Asn-Ala Chemical compound [H]NCC(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](C)C(O)=O UXJHNZODTMHWRD-WHFBIAKZSA-N 0.000 description 1
- CIMULJZTTOBOPN-WHFBIAKZSA-N Gly-Asn-Asn Chemical compound NCC(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O CIMULJZTTOBOPN-WHFBIAKZSA-N 0.000 description 1
- OCDLPQDYTJPWNG-YUMQZZPRSA-N Gly-Asn-Lys Chemical compound C(CCN)C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)CN OCDLPQDYTJPWNG-YUMQZZPRSA-N 0.000 description 1
- FUTAPPOITCCWTH-WHFBIAKZSA-N Gly-Asp-Asp Chemical compound [H]NCC(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(O)=O)C(O)=O FUTAPPOITCCWTH-WHFBIAKZSA-N 0.000 description 1
- XEJTYSCIXKYSHR-WDSKDSINSA-N Gly-Asp-Gln Chemical compound C(CC(=O)N)[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)O)NC(=O)CN XEJTYSCIXKYSHR-WDSKDSINSA-N 0.000 description 1
- XBWMTPAIUQIWKA-BYULHYEWSA-N Gly-Asp-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@H](CC(O)=O)NC(=O)CN XBWMTPAIUQIWKA-BYULHYEWSA-N 0.000 description 1
- RPLLQZBOVIVGMX-QWRGUYRKSA-N Gly-Asp-Phe Chemical compound [H]NCC(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O RPLLQZBOVIVGMX-QWRGUYRKSA-N 0.000 description 1
- MOJKRXIRAZPZLW-WDSKDSINSA-N Gly-Glu-Ala Chemical compound [H]NCC(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](C)C(O)=O MOJKRXIRAZPZLW-WDSKDSINSA-N 0.000 description 1
- FIQQRCFQXGLOSZ-WDSKDSINSA-N Gly-Glu-Asp Chemical compound [H]NCC(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(O)=O)C(O)=O FIQQRCFQXGLOSZ-WDSKDSINSA-N 0.000 description 1
- XTQFHTHIAKKCTM-YFKPBYRVSA-N Gly-Glu-Gly Chemical compound NCC(=O)N[C@@H](CCC(O)=O)C(=O)NCC(O)=O XTQFHTHIAKKCTM-YFKPBYRVSA-N 0.000 description 1
- LHRXAHLCRMQBGJ-RYUDHWBXSA-N Gly-Glu-Phe Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)O)NC(=O)CN LHRXAHLCRMQBGJ-RYUDHWBXSA-N 0.000 description 1
- BEQGFMIBZFNROK-JGVFFNPUSA-N Gly-Glu-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CCC(=O)O)NC(=O)CN)C(=O)O BEQGFMIBZFNROK-JGVFFNPUSA-N 0.000 description 1
- KMSGYZQRXPUKGI-BYPYZUCNSA-N Gly-Gly-Asn Chemical compound NCC(=O)NCC(=O)N[C@H](C(O)=O)CC(N)=O KMSGYZQRXPUKGI-BYPYZUCNSA-N 0.000 description 1
- KAJAOGBVWCYGHZ-JTQLQIEISA-N Gly-Gly-Phe Chemical compound [NH3+]CC(=O)NCC(=O)N[C@H](C([O-])=O)CC1=CC=CC=C1 KAJAOGBVWCYGHZ-JTQLQIEISA-N 0.000 description 1
- CQIIXEHDSZUSAG-QWRGUYRKSA-N Gly-His-His Chemical compound C([C@H](NC(=O)CN)C(=O)N[C@@H](CC=1NC=NC=1)C(O)=O)C1=CN=CN1 CQIIXEHDSZUSAG-QWRGUYRKSA-N 0.000 description 1
- UTYGDAHJBBDPBA-BYULHYEWSA-N Gly-Ile-Asp Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)O)NC(=O)CN UTYGDAHJBBDPBA-BYULHYEWSA-N 0.000 description 1
- YTSVAIMKVLZUDU-YUMQZZPRSA-N Gly-Leu-Asp Chemical compound [H]NCC(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(O)=O)C(O)=O YTSVAIMKVLZUDU-YUMQZZPRSA-N 0.000 description 1
- YSDLIYZLOTZZNP-UWVGGRQHSA-N Gly-Leu-Met Chemical compound CSCC[C@@H](C(O)=O)NC(=O)[C@H](CC(C)C)NC(=O)CN YSDLIYZLOTZZNP-UWVGGRQHSA-N 0.000 description 1
- NNCSJUBVFBDDLC-YUMQZZPRSA-N Gly-Leu-Ser Chemical compound NCC(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(O)=O NNCSJUBVFBDDLC-YUMQZZPRSA-N 0.000 description 1
- VBOBNHSVQKKTOT-YUMQZZPRSA-N Gly-Lys-Ala Chemical compound [H]NCC(=O)N[C@@H](CCCCN)C(=O)N[C@@H](C)C(O)=O VBOBNHSVQKKTOT-YUMQZZPRSA-N 0.000 description 1
- GMTXWRIDLGTVFC-IUCAKERBSA-N Gly-Lys-Glu Chemical compound [H]NCC(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCC(O)=O)C(O)=O GMTXWRIDLGTVFC-IUCAKERBSA-N 0.000 description 1
- WDEHMRNSGHVNOH-VHSXEESVSA-N Gly-Lys-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CCCCN)NC(=O)CN)C(=O)O WDEHMRNSGHVNOH-VHSXEESVSA-N 0.000 description 1
- GAFKBWKVXNERFA-QWRGUYRKSA-N Gly-Phe-Asp Chemical compound OC(=O)C[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)CN)CC1=CC=CC=C1 GAFKBWKVXNERFA-QWRGUYRKSA-N 0.000 description 1
- FFJQHWKSGAWSTJ-BFHQHQDPSA-N Gly-Thr-Ala Chemical compound [H]NCC(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C)C(O)=O FFJQHWKSGAWSTJ-BFHQHQDPSA-N 0.000 description 1
- MYXNLWDWWOTERK-BHNWBGBOSA-N Gly-Thr-Pro Chemical compound C[C@H]([C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)CN)O MYXNLWDWWOTERK-BHNWBGBOSA-N 0.000 description 1
- LYZYGGWCBLBDMC-QWHCGFSZSA-N Gly-Tyr-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CC2=CC=C(C=C2)O)NC(=O)CN)C(=O)O LYZYGGWCBLBDMC-QWHCGFSZSA-N 0.000 description 1
- GJHWILMUOANXTG-WPRPVWTQSA-N Gly-Val-Arg Chemical compound [H]NCC(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O GJHWILMUOANXTG-WPRPVWTQSA-N 0.000 description 1
- KSOBNUBCYHGUKH-UWVGGRQHSA-N Gly-Val-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@H](C(C)C)NC(=O)CN KSOBNUBCYHGUKH-UWVGGRQHSA-N 0.000 description 1
- 239000004471 Glycine Substances 0.000 description 1
- 108700023372 Glycosyltransferases Proteins 0.000 description 1
- 239000007995 HEPES buffer Substances 0.000 description 1
- 101150009006 HIS3 gene Proteins 0.000 description 1
- 244000286779 Hansenula anomala Species 0.000 description 1
- 235000014683 Hansenula anomala Nutrition 0.000 description 1
- 241000282821 Hippopotamus Species 0.000 description 1
- VBOFRJNDIOPNDO-YUMQZZPRSA-N His-Gly-Asn Chemical compound C1=C(NC=N1)C[C@@H](C(=O)NCC(=O)N[C@@H](CC(=O)N)C(=O)O)N VBOFRJNDIOPNDO-YUMQZZPRSA-N 0.000 description 1
- CMPHFUWXKBPNRS-WDSOQIARSA-N His-Val-Trp Chemical compound C([C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC=1C2=CC=CC=C2NC=1)C(O)=O)C1=CNC=N1 CMPHFUWXKBPNRS-WDSOQIARSA-N 0.000 description 1
- 101001125486 Homo sapiens Basic salivary proline-rich protein 1 Proteins 0.000 description 1
- 101001028852 Homo sapiens Fructose-1,6-bisphosphatase 1 Proteins 0.000 description 1
- 101000599940 Homo sapiens Interferon gamma Proteins 0.000 description 1
- 101000595904 Homo sapiens Procollagen-lysine,2-oxoglutarate 5-dioxygenase 1 Proteins 0.000 description 1
- LCWXJXMHJVIJFK-UHFFFAOYSA-N Hydroxylysine Natural products NCC(O)CC(N)CC(O)=O LCWXJXMHJVIJFK-UHFFFAOYSA-N 0.000 description 1
- 241000270349 Iguana Species 0.000 description 1
- TZCGZYWNIDZZMR-NAKRPEOUSA-N Ile-Arg-Ala Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](C)C(=O)O)N TZCGZYWNIDZZMR-NAKRPEOUSA-N 0.000 description 1
- TZCGZYWNIDZZMR-UHFFFAOYSA-N Ile-Arg-Ala Natural products CCC(C)C(N)C(=O)NC(C(=O)NC(C)C(O)=O)CCCN=C(N)N TZCGZYWNIDZZMR-UHFFFAOYSA-N 0.000 description 1
- NPROWIBAWYMPAZ-GUDRVLHUSA-N Ile-Asp-Pro Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N1CCC[C@@H]1C(=O)O)N NPROWIBAWYMPAZ-GUDRVLHUSA-N 0.000 description 1
- ODPKZZLRDNXTJZ-WHOFXGATSA-N Ile-Gly-Phe Chemical compound CC[C@H](C)[C@@H](C(=O)NCC(=O)N[C@@H](CC1=CC=CC=C1)C(=O)O)N ODPKZZLRDNXTJZ-WHOFXGATSA-N 0.000 description 1
- 241000235649 Kluyveromyces Species 0.000 description 1
- 244000285963 Kluyveromyces fragilis Species 0.000 description 1
- 101100502336 Komagataella pastoris FLD1 gene Proteins 0.000 description 1
- IBMVEYRWAWIOTN-UHFFFAOYSA-N L-Leucyl-L-Arginyl-L-Proline Natural products CC(C)CC(N)C(=O)NC(CCCN=C(N)N)C(=O)N1CCCC1C(O)=O IBMVEYRWAWIOTN-UHFFFAOYSA-N 0.000 description 1
- COLNVLDHVKWLRT-QMMMGPOBSA-N L-phenylalanine Chemical compound OC(=O)[C@@H](N)CC1=CC=CC=C1 COLNVLDHVKWLRT-QMMMGPOBSA-N 0.000 description 1
- QIVBCDIJIAJPQS-VIFPVBQESA-N L-tryptophane Chemical compound C1=CC=C2C(C[C@H](N)C(O)=O)=CNC2=C1 QIVBCDIJIAJPQS-VIFPVBQESA-N 0.000 description 1
- 241000270322 Lepidosauria Species 0.000 description 1
- XBBKIIGCUMBKCO-JXUBOQSCSA-N Leu-Ala-Thr Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O XBBKIIGCUMBKCO-JXUBOQSCSA-N 0.000 description 1
- ZTLGVASZOIKNIX-DCAQKATOSA-N Leu-Gln-Glu Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)N[C@@H](CCC(=O)O)C(=O)O)N ZTLGVASZOIKNIX-DCAQKATOSA-N 0.000 description 1
- CCQLQKZTXZBXTN-NHCYSSNCSA-N Leu-Gly-Ile Chemical compound [H]N[C@@H](CC(C)C)C(=O)NCC(=O)N[C@@H]([C@@H](C)CC)C(O)=O CCQLQKZTXZBXTN-NHCYSSNCSA-N 0.000 description 1
- HYIFFZAQXPUEAU-QWRGUYRKSA-N Leu-Gly-Leu Chemical compound CC(C)C[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CC(C)C HYIFFZAQXPUEAU-QWRGUYRKSA-N 0.000 description 1
- YFBBUHJJUXXZOF-UWVGGRQHSA-N Leu-Gly-Pro Chemical compound CC(C)C[C@H](N)C(=O)NCC(=O)N1CCC[C@H]1C(O)=O YFBBUHJJUXXZOF-UWVGGRQHSA-N 0.000 description 1
- YOKVEHGYYQEQOP-QWRGUYRKSA-N Leu-Leu-Gly Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)NCC(O)=O YOKVEHGYYQEQOP-QWRGUYRKSA-N 0.000 description 1
- ADJWHHZETYAAAX-SRVKXCTJSA-N Leu-Ser-His Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CO)C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)N ADJWHHZETYAAAX-SRVKXCTJSA-N 0.000 description 1
- AAKRWBIIGKPOKQ-ONGXEEELSA-N Leu-Val-Gly Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)NCC(O)=O AAKRWBIIGKPOKQ-ONGXEEELSA-N 0.000 description 1
- HBBGRARXTFLTSG-UHFFFAOYSA-N Lithium ion Chemical compound [Li+] HBBGRARXTFLTSG-UHFFFAOYSA-N 0.000 description 1
- GAOJCVKPIGHTGO-UWVGGRQHSA-N Lys-Arg-Gly Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCCNC(N)=N)C(=O)NCC(O)=O GAOJCVKPIGHTGO-UWVGGRQHSA-N 0.000 description 1
- ITWQLSZTLBKWJM-YUMQZZPRSA-N Lys-Gly-Ala Chemical compound OC(=O)[C@H](C)NC(=O)CNC(=O)[C@@H](N)CCCCN ITWQLSZTLBKWJM-YUMQZZPRSA-N 0.000 description 1
- DTUZCYRNEJDKSR-NHCYSSNCSA-N Lys-Gly-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)CNC(=O)[C@@H](N)CCCCN DTUZCYRNEJDKSR-NHCYSSNCSA-N 0.000 description 1
- PDIDTSZKKFEDMB-UWVGGRQHSA-N Lys-Pro-Gly Chemical compound [H]N[C@@H](CCCCN)C(=O)N1CCC[C@H]1C(=O)NCC(O)=O PDIDTSZKKFEDMB-UWVGGRQHSA-N 0.000 description 1
- 241000124008 Mammalia Species 0.000 description 1
- 101710084218 Master replication protein Proteins 0.000 description 1
- 101000686985 Mouse mammary tumor virus (strain C3H) Protein PR73 Proteins 0.000 description 1
- 108010014251 Muramidase Proteins 0.000 description 1
- 102000016943 Muramidase Human genes 0.000 description 1
- 241001529936 Murinae Species 0.000 description 1
- 101100243377 Mus musculus Pepd gene Proteins 0.000 description 1
- 108010062010 N-Acetylmuramoyl-L-alanine Amidase Proteins 0.000 description 1
- SITLTJHOQZFJGG-UHFFFAOYSA-N N-L-alpha-glutamyl-L-valine Natural products CC(C)C(C(O)=O)NC(=O)C(N)CCC(O)=O SITLTJHOQZFJGG-UHFFFAOYSA-N 0.000 description 1
- 101100005271 Neurospora crassa (strain ATCC 24698 / 74-OR23-1A / CBS 708.71 / DSM 1257 / FGSC 987) cat-1 gene Proteins 0.000 description 1
- 241001033367 Ogataea siamensis Species 0.000 description 1
- 241001221837 Ogataea thermomethanolica Species 0.000 description 1
- 108700026244 Open Reading Frames Proteins 0.000 description 1
- 241000906034 Orthops Species 0.000 description 1
- 101150029183 PEP4 gene Proteins 0.000 description 1
- 241000833020 Padilla Species 0.000 description 1
- 108090000526 Papain Proteins 0.000 description 1
- 101710112083 Para-Rep C1 Proteins 0.000 description 1
- 101710112078 Para-Rep C2 Proteins 0.000 description 1
- 241001494479 Pecora Species 0.000 description 1
- 241000269799 Perca fluviatilis Species 0.000 description 1
- BJEYSVHMGIJORT-NHCYSSNCSA-N Phe-Ala-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H](C)NC(=O)[C@@H](N)CC1=CC=CC=C1 BJEYSVHMGIJORT-NHCYSSNCSA-N 0.000 description 1
- FPTXMUIBLMGTQH-ONGXEEELSA-N Phe-Ala-Gly Chemical compound OC(=O)CNC(=O)[C@H](C)NC(=O)[C@@H](N)CC1=CC=CC=C1 FPTXMUIBLMGTQH-ONGXEEELSA-N 0.000 description 1
- 241001483078 Phyto Species 0.000 description 1
- 101100226950 Pichia angusta FMDH gene Proteins 0.000 description 1
- 241000521510 Pichia barkeri Species 0.000 description 1
- 241000521557 Pichia cactophila Species 0.000 description 1
- 241000522591 Pichia cephalocereana Species 0.000 description 1
- 241000522642 Pichia eremophila Species 0.000 description 1
- 241000468776 Pichia exigua Species 0.000 description 1
- 241000521549 Pichia heedii Species 0.000 description 1
- 241000521555 Pichia nakasei Species 0.000 description 1
- 241000235056 Pichia norvegensis Species 0.000 description 1
- 241000531873 Pichia occidentalis Species 0.000 description 1
- 239000002202 Polyethylene glycol Substances 0.000 description 1
- 240000002181 Portulaca pilosa Species 0.000 description 1
- 101150096292 Ppme1 gene Proteins 0.000 description 1
- HMNSRTLZAJHSIK-YUMQZZPRSA-N Pro-Arg Chemical compound NC(=N)NCCC[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1 HMNSRTLZAJHSIK-YUMQZZPRSA-N 0.000 description 1
- ZTVCLZLGHZXLOT-ULQDDVLXSA-N Pro-Glu-Trp Chemical compound C1C[C@H](NC1)C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H](CC2=CNC3=CC=CC=C32)C(=O)O ZTVCLZLGHZXLOT-ULQDDVLXSA-N 0.000 description 1
- 102000000602 Prolyl 3-hydroxylase 1 Human genes 0.000 description 1
- 108050008031 Prolyl 3-hydroxylase 1 Proteins 0.000 description 1
- 108050007606 Prolyl 3-hydroxylases Proteins 0.000 description 1
- 102100022881 Rab proteins geranylgeranyltransferase component A 1 Human genes 0.000 description 1
- 102100022880 Rab proteins geranylgeranyltransferase component A 2 Human genes 0.000 description 1
- 102000007056 Recombinant Fusion Proteins Human genes 0.000 description 1
- 108010008281 Recombinant Fusion Proteins Proteins 0.000 description 1
- 101100394989 Rhodopseudomonas palustris (strain ATCC BAA-98 / CGA009) hisI gene Proteins 0.000 description 1
- 235000011449 Rosa Nutrition 0.000 description 1
- 101100010928 Saccharolobus solfataricus (strain ATCC 35092 / DSM 1617 / JCM 11322 / P2) tuf gene Proteins 0.000 description 1
- 101100421128 Saccharomyces cerevisiae (strain ATCC 204508 / S288c) SEI1 gene Proteins 0.000 description 1
- 241000277331 Salmonidae Species 0.000 description 1
- 101100446293 Schizosaccharomyces pombe (strain 972 / ATCC 24843) fbh1 gene Proteins 0.000 description 1
- IOVHBRCQOGWAQH-ZKWXMUAHSA-N Ser-Gly-Ile Chemical compound [H]N[C@@H](CO)C(=O)NCC(=O)N[C@@H]([C@@H](C)CC)C(O)=O IOVHBRCQOGWAQH-ZKWXMUAHSA-N 0.000 description 1
- KCNSGAMPBPYUAI-CIUDSAMLSA-N Ser-Leu-Asn Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(N)=O)C(O)=O KCNSGAMPBPYUAI-CIUDSAMLSA-N 0.000 description 1
- QMCDMHWAKMUGJE-IHRRRGAJSA-N Ser-Phe-Val Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](C(C)C)C(O)=O QMCDMHWAKMUGJE-IHRRRGAJSA-N 0.000 description 1
- SRSPTFBENMJHMR-WHFBIAKZSA-N Ser-Ser-Gly Chemical compound OC[C@H](N)C(=O)N[C@@H](CO)C(=O)NCC(O)=O SRSPTFBENMJHMR-WHFBIAKZSA-N 0.000 description 1
- 241000270295 Serpentes Species 0.000 description 1
- 108010071390 Serum Albumin Proteins 0.000 description 1
- 102000007562 Serum Albumin Human genes 0.000 description 1
- 241000272534 Struthio camelus Species 0.000 description 1
- 239000008049 TAE buffer Substances 0.000 description 1
- 101150001810 TEAD1 gene Proteins 0.000 description 1
- 101150074253 TEF1 gene Proteins 0.000 description 1
- 241000270666 Testudines Species 0.000 description 1
- NRUPKQSXTJNQGD-XGEHTFHBSA-N Thr-Cys-Arg Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CS)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O NRUPKQSXTJNQGD-XGEHTFHBSA-N 0.000 description 1
- MEJHFIOYJHTWMK-VOAKCMCISA-N Thr-Leu-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CC(C)C)NC(=O)[C@@H](N)[C@@H](C)O MEJHFIOYJHTWMK-VOAKCMCISA-N 0.000 description 1
- MECLEFZMPPOEAC-VOAKCMCISA-N Thr-Leu-Lys Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(=O)O)N)O MECLEFZMPPOEAC-VOAKCMCISA-N 0.000 description 1
- XKWABWFMQXMUMT-HJGDQZAQSA-N Thr-Pro-Glu Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCC(O)=O)C(O)=O XKWABWFMQXMUMT-HJGDQZAQSA-N 0.000 description 1
- AHERARIZBPOMNU-KATARQTJSA-N Thr-Ser-Leu Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(O)=O AHERARIZBPOMNU-KATARQTJSA-N 0.000 description 1
- 101710119887 Trans-acting factor B Proteins 0.000 description 1
- 101710119961 Trans-acting factor C Proteins 0.000 description 1
- 102100029898 Transcriptional enhancer factor TEF-1 Human genes 0.000 description 1
- 229920004890 Triton X-100 Polymers 0.000 description 1
- 239000013504 Triton X-100 Substances 0.000 description 1
- UIDJDMVRDUANDL-BVSLBCMMSA-N Trp-Tyr-Arg Chemical compound [H]N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O UIDJDMVRDUANDL-BVSLBCMMSA-N 0.000 description 1
- 108090000631 Trypsin Proteins 0.000 description 1
- 102000004142 Trypsin Human genes 0.000 description 1
- QIVBCDIJIAJPQS-UHFFFAOYSA-N Tryptophan Natural products C1=CC=C2C(CC(N)C(O)=O)=CNC2=C1 QIVBCDIJIAJPQS-UHFFFAOYSA-N 0.000 description 1
- HSVPZJLMPLMPOX-BPNCWPANSA-N Tyr-Arg-Ala Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](C)C(O)=O HSVPZJLMPLMPOX-BPNCWPANSA-N 0.000 description 1
- HDSKHCBAVVWPCQ-FHWLQOOXSA-N Tyr-Glu-Phe Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O HDSKHCBAVVWPCQ-FHWLQOOXSA-N 0.000 description 1
- FZADUTOCSFDBRV-RNXOBYDBSA-N Tyr-Tyr-Trp Chemical compound C([C@H](N)C(=O)N[C@@H](CC=1C=CC(O)=CC=1)C(=O)N[C@@H](CC=1C2=CC=CC=C2NC=1)C(O)=O)C1=CC=C(O)C=C1 FZADUTOCSFDBRV-RNXOBYDBSA-N 0.000 description 1
- UDNYEPLJTRDMEJ-RCOVLWMOSA-N Val-Asn-Gly Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)NCC(=O)O)N UDNYEPLJTRDMEJ-RCOVLWMOSA-N 0.000 description 1
- ISERLACIZUGCDX-ZKWXMUAHSA-N Val-Asp-Ala Chemical compound C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)O)NC(=O)[C@H](C(C)C)N ISERLACIZUGCDX-ZKWXMUAHSA-N 0.000 description 1
- XEYUMGGWQCIWAR-XVKPBYJWSA-N Val-Gln-Gly Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)NCC(=O)O)N XEYUMGGWQCIWAR-XVKPBYJWSA-N 0.000 description 1
- CELJCNRXKZPTCX-XPUUQOCRSA-N Val-Gly-Ala Chemical compound CC(C)[C@H](N)C(=O)NCC(=O)N[C@@H](C)C(O)=O CELJCNRXKZPTCX-XPUUQOCRSA-N 0.000 description 1
- NXRAUQGGHPCJIB-RCOVLWMOSA-N Val-Gly-Asn Chemical compound CC(C)[C@H](N)C(=O)NCC(=O)N[C@@H](CC(N)=O)C(O)=O NXRAUQGGHPCJIB-RCOVLWMOSA-N 0.000 description 1
- BEGDZYNDCNEGJZ-XVKPBYJWSA-N Val-Gly-Gln Chemical compound CC(C)[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CCC(N)=O BEGDZYNDCNEGJZ-XVKPBYJWSA-N 0.000 description 1
- SYOMXKPPFZRELL-ONGXEEELSA-N Val-Gly-Lys Chemical compound CC(C)[C@@H](C(=O)NCC(=O)N[C@@H](CCCCN)C(=O)O)N SYOMXKPPFZRELL-ONGXEEELSA-N 0.000 description 1
- XXWBHOWRARMUOC-NHCYSSNCSA-N Val-Lys-Asn Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(=O)N)C(=O)O)N XXWBHOWRARMUOC-NHCYSSNCSA-N 0.000 description 1
- DIOSYUIWOQCXNR-ONGXEEELSA-N Val-Lys-Gly Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)NCC(O)=O DIOSYUIWOQCXNR-ONGXEEELSA-N 0.000 description 1
- AEFJNECXZCODJM-UWVGGRQHSA-N Val-Val-Gly Chemical compound CC(C)[C@H]([NH3+])C(=O)N[C@@H](C(C)C)C(=O)NCC([O-])=O AEFJNECXZCODJM-UWVGGRQHSA-N 0.000 description 1
- 241000251539 Vertebrata <Metazoa> Species 0.000 description 1
- 241001193070 Wickerhamomyces subpelliculosus Species 0.000 description 1
- 241001377336 [Pichia] myanmarensis Species 0.000 description 1
- 230000002745 absorbent Effects 0.000 description 1
- 239000002250 absorbent Substances 0.000 description 1
- HGEVZDLYZYVYHD-UHFFFAOYSA-N acetic acid;2-amino-2-(hydroxymethyl)propane-1,3-diol;2-[2-[bis(carboxymethyl)amino]ethyl-(carboxymethyl)amino]acetic acid Chemical compound CC(O)=O.OCC(N)(CO)CO.OC(=O)CN(CC(O)=O)CCN(CC(O)=O)CC(O)=O HGEVZDLYZYVYHD-UHFFFAOYSA-N 0.000 description 1
- 239000002253 acid Substances 0.000 description 1
- 239000004480 active ingredient Substances 0.000 description 1
- 230000006978 adaptation Effects 0.000 description 1
- 108010044940 alanylglutamine Proteins 0.000 description 1
- 125000002947 alkylene group Chemical group 0.000 description 1
- 230000000735 allogeneic effect Effects 0.000 description 1
- 102000004139 alpha-Amylases Human genes 0.000 description 1
- 108090000637 alpha-Amylases Proteins 0.000 description 1
- WQZGKKKJIJFFOK-PHYPRBDBSA-N alpha-D-galactose Chemical compound OC[C@H]1O[C@H](O)[C@H](O)[C@@H](O)[C@H]1O WQZGKKKJIJFFOK-PHYPRBDBSA-N 0.000 description 1
- 229940024171 alpha-amylase Drugs 0.000 description 1
- 230000004075 alteration Effects 0.000 description 1
- 238000004458 analytical method Methods 0.000 description 1
- 238000013459 approach Methods 0.000 description 1
- 108010093581 aspartyl-proline Proteins 0.000 description 1
- 230000000712 assembly Effects 0.000 description 1
- 238000000429 assembly Methods 0.000 description 1
- QVGXLLKOCUKJST-UHFFFAOYSA-N atomic oxygen Chemical compound [O] QVGXLLKOCUKJST-UHFFFAOYSA-N 0.000 description 1
- 210000002469 basement membrane Anatomy 0.000 description 1
- 230000008901 benefit Effects 0.000 description 1
- 108010051210 beta-Fructofuranosidase Proteins 0.000 description 1
- DRTQHJPVMGBUCF-PSQAKQOGSA-N beta-L-uridine Natural products O[C@H]1[C@@H](O)[C@H](CO)O[C@@H]1N1C(=O)NC(=O)C=C1 DRTQHJPVMGBUCF-PSQAKQOGSA-N 0.000 description 1
- 230000008827 biological function Effects 0.000 description 1
- 238000009534 blood test Methods 0.000 description 1
- 239000001045 blue dye Substances 0.000 description 1
- 238000009395 breeding Methods 0.000 description 1
- 230000001488 breeding effect Effects 0.000 description 1
- 235000019835 bromelain Nutrition 0.000 description 1
- 239000000872 buffer Substances 0.000 description 1
- 239000006227 byproduct Substances 0.000 description 1
- 229910052791 calcium Inorganic materials 0.000 description 1
- 239000011575 calcium Substances 0.000 description 1
- 229910052799 carbon Inorganic materials 0.000 description 1
- 206010061592 cardiac fibrillation Diseases 0.000 description 1
- 230000006037 cell lysis Effects 0.000 description 1
- 210000002421 cell wall Anatomy 0.000 description 1
- 230000001413 cellular effect Effects 0.000 description 1
- 238000005119 centrifugation Methods 0.000 description 1
- 230000008859 change Effects 0.000 description 1
- 150000005829 chemical entities Chemical class 0.000 description 1
- 238000007385 chemical modification Methods 0.000 description 1
- 235000013330 chicken meat Nutrition 0.000 description 1
- 238000004587 chromatography analysis Methods 0.000 description 1
- 229960002376 chymotrypsin Drugs 0.000 description 1
- 229960002424 collagenase Drugs 0.000 description 1
- 230000000295 complement effect Effects 0.000 description 1
- 150000001875 compounds Chemical class 0.000 description 1
- 230000001276 controlling effect Effects 0.000 description 1
- 238000007796 conventional method Methods 0.000 description 1
- 125000004093 cyano group Chemical group *C#N 0.000 description 1
- 125000000151 cysteine group Chemical group N[C@@H](CS)C(=O)* 0.000 description 1
- 230000007812 deficiency Effects 0.000 description 1
- 230000002950 deficient Effects 0.000 description 1
- 239000008367 deionised water Substances 0.000 description 1
- 229910021641 deionized water Inorganic materials 0.000 description 1
- 238000001212 derivatisation Methods 0.000 description 1
- 210000002555 descemet membrane Anatomy 0.000 description 1
- 150000002016 disaccharides Chemical class 0.000 description 1
- 201000010099 disease Diseases 0.000 description 1
- 230000009881 electrostatic interaction Effects 0.000 description 1
- 239000003623 enhancer Substances 0.000 description 1
- 230000007613 environmental effect Effects 0.000 description 1
- 230000002255 enzymatic effect Effects 0.000 description 1
- 210000000981 epithelium Anatomy 0.000 description 1
- 239000013613 expression plasmid Substances 0.000 description 1
- 238000000605 extraction Methods 0.000 description 1
- 238000000855 fermentation Methods 0.000 description 1
- 230000004151 fermentation Effects 0.000 description 1
- 230000002600 fibrillogenic effect Effects 0.000 description 1
- 210000002950 fibroblast Anatomy 0.000 description 1
- 102000034240 fibrous proteins Human genes 0.000 description 1
- 108091005899 fibrous proteins Proteins 0.000 description 1
- 238000001914 filtration Methods 0.000 description 1
- 235000019688 fish Nutrition 0.000 description 1
- 239000012530 fluid Substances 0.000 description 1
- 229930182830 galactose Natural products 0.000 description 1
- 239000008103 glucose Substances 0.000 description 1
- 125000003976 glyceryl group Chemical group [H]C([*])([H])C(O[H])([H])C(O[H])([H])[H] 0.000 description 1
- 102000045442 glycosyltransferase activity proteins Human genes 0.000 description 1
- 108700014210 glycosyltransferase activity proteins Proteins 0.000 description 1
- 108010027668 glycyl-alanyl-valine Proteins 0.000 description 1
- 108010072405 glycyl-aspartyl-glycine Proteins 0.000 description 1
- 108010010147 glycylglutamine Proteins 0.000 description 1
- 239000001963 growth medium Substances 0.000 description 1
- 210000004349 growth plate Anatomy 0.000 description 1
- HNDVDQJCIGZPNO-UHFFFAOYSA-N histidine Natural products OC(=O)C(N)CC1=CN=CN1 HNDVDQJCIGZPNO-UHFFFAOYSA-N 0.000 description 1
- 108010040030 histidinoalanine Proteins 0.000 description 1
- 239000000710 homodimer Substances 0.000 description 1
- 102000043557 human IFNG Human genes 0.000 description 1
- 230000036571 hydration Effects 0.000 description 1
- 238000006703 hydration reaction Methods 0.000 description 1
- 239000000017 hydrogel Substances 0.000 description 1
- 229910052739 hydrogen Inorganic materials 0.000 description 1
- 239000001257 hydrogen Substances 0.000 description 1
- 230000003301 hydrolyzing effect Effects 0.000 description 1
- 230000002209 hydrophobic effect Effects 0.000 description 1
- 150000001261 hydroxy acids Chemical class 0.000 description 1
- QJHBJHUKURJDLG-UHFFFAOYSA-N hydroxy-L-lysine Natural products NCCCCC(NO)C(O)=O QJHBJHUKURJDLG-UHFFFAOYSA-N 0.000 description 1
- 229910052588 hydroxylapatite Inorganic materials 0.000 description 1
- 230000001969 hypertrophic effect Effects 0.000 description 1
- 239000004615 ingredient Substances 0.000 description 1
- 238000003780 insertion Methods 0.000 description 1
- 230000037431 insertion Effects 0.000 description 1
- 210000000936 intestine Anatomy 0.000 description 1
- 230000003834 intracellular effect Effects 0.000 description 1
- 108010090785 inulinase Proteins 0.000 description 1
- 239000001573 invertase Substances 0.000 description 1
- 235000011073 invertase Nutrition 0.000 description 1
- 210000002510 keratinocyte Anatomy 0.000 description 1
- 108010044311 leucyl-glycyl-glycine Proteins 0.000 description 1
- 239000007788 liquid Substances 0.000 description 1
- 229910001416 lithium ion Inorganic materials 0.000 description 1
- 210000004185 liver Anatomy 0.000 description 1
- 239000012160 loading buffer Substances 0.000 description 1
- 210000004072 lung Anatomy 0.000 description 1
- 229960000274 lysozyme Drugs 0.000 description 1
- 239000004325 lysozyme Substances 0.000 description 1
- 235000010335 lysozyme Nutrition 0.000 description 1
- 229920002521 macromolecule Polymers 0.000 description 1
- 238000004949 mass spectrometry Methods 0.000 description 1
- 210000004379 membrane Anatomy 0.000 description 1
- 230000004060 metabolic process Effects 0.000 description 1
- 244000005700 microbiome Species 0.000 description 1
- 230000011278 mitosis Effects 0.000 description 1
- 239000000178 monomer Substances 0.000 description 1
- 150000002772 monosaccharides Chemical class 0.000 description 1
- 210000003205 muscle Anatomy 0.000 description 1
- 239000002121 nanofiber Substances 0.000 description 1
- 239000013642 negative control Substances 0.000 description 1
- 230000007935 neutral effect Effects 0.000 description 1
- 102000039446 nucleic acids Human genes 0.000 description 1
- 108020004707 nucleic acids Proteins 0.000 description 1
- 235000015097 nutrients Nutrition 0.000 description 1
- 235000016709 nutrition Nutrition 0.000 description 1
- 238000005457 optimization Methods 0.000 description 1
- 210000000056 organ Anatomy 0.000 description 1
- 229910052760 oxygen Inorganic materials 0.000 description 1
- 239000001301 oxygen Substances 0.000 description 1
- 229940055729 papain Drugs 0.000 description 1
- 235000019834 papain Nutrition 0.000 description 1
- 230000000149 penetrating effect Effects 0.000 description 1
- XYJRXVWERLGGKC-UHFFFAOYSA-D pentacalcium;hydroxide;triphosphate Chemical compound [OH-].[Ca+2].[Ca+2].[Ca+2].[Ca+2].[Ca+2].[O-]P([O-])([O-])=O.[O-]P([O-])([O-])=O.[O-]P([O-])([O-])=O XYJRXVWERLGGKC-UHFFFAOYSA-D 0.000 description 1
- COLNVLDHVKWLRT-UHFFFAOYSA-N phenylalanine Natural products OC(=O)C(N)CC1=CC=CC=C1 COLNVLDHVKWLRT-UHFFFAOYSA-N 0.000 description 1
- 239000013600 plasmid vector Substances 0.000 description 1
- 229920001223 polyethylene glycol Polymers 0.000 description 1
- 238000003752 polymerase chain reaction Methods 0.000 description 1
- 239000013641 positive control Substances 0.000 description 1
- 238000001556 precipitation Methods 0.000 description 1
- 239000002243 precursor Substances 0.000 description 1
- 230000002028 premature Effects 0.000 description 1
- 238000002360 preparation method Methods 0.000 description 1
- 238000012545 processing Methods 0.000 description 1
- 229940002612 prodrug Drugs 0.000 description 1
- 239000000651 prodrug Substances 0.000 description 1
- 108010007513 prolyl-glycyl-prolyl-leucine Proteins 0.000 description 1
- 125000001436 propyl group Chemical group [H]C([*])([H])C([H])([H])C([H])([H])[H] 0.000 description 1
- 235000019833 protease Nutrition 0.000 description 1
- 230000012846 protein folding Effects 0.000 description 1
- 230000017854 proteolysis Effects 0.000 description 1
- 230000002797 proteolythic effect Effects 0.000 description 1
- 238000010188 recombinant method Methods 0.000 description 1
- 230000006798 recombination Effects 0.000 description 1
- 238000005215 recombination Methods 0.000 description 1
- 108091008146 restriction endonucleases Proteins 0.000 description 1
- 230000000717 retained effect Effects 0.000 description 1
- BNUZUOWRDKPBQR-UHFFFAOYSA-N reticuline Natural products CN1CCC2=CC(OC)=CC=C2C1CC1=CC=C(OC)C(O)=C1 BNUZUOWRDKPBQR-UHFFFAOYSA-N 0.000 description 1
- 238000012552 review Methods 0.000 description 1
- 201000009410 rhabdomyosarcoma Diseases 0.000 description 1
- 235000019515 salmon Nutrition 0.000 description 1
- 150000003839 salts Chemical group 0.000 description 1
- 238000005070 sampling Methods 0.000 description 1
- 239000013606 secretion vector Substances 0.000 description 1
- 239000000377 silicon dioxide Substances 0.000 description 1
- 239000011780 sodium chloride Substances 0.000 description 1
- 230000002269 spontaneous effect Effects 0.000 description 1
- 238000010186 staining Methods 0.000 description 1
- 125000005504 styryl group Chemical group 0.000 description 1
- 239000000126 substance Substances 0.000 description 1
- 231100000617 superantigen Toxicity 0.000 description 1
- 238000002560 therapeutic procedure Methods 0.000 description 1
- 230000000451 tissue damage Effects 0.000 description 1
- 231100000827 tissue damage Toxicity 0.000 description 1
- BJBUEDPLEOHJGE-IMJSIDKUSA-N trans-3-hydroxy-L-proline Chemical compound O[C@H]1CC[NH2+][C@@H]1C([O-])=O BJBUEDPLEOHJGE-IMJSIDKUSA-N 0.000 description 1
- FGMPLJWBKKVCDB-UHFFFAOYSA-N trans-L-hydroxy-proline Natural products ON1CCCC1C(O)=O FGMPLJWBKKVCDB-UHFFFAOYSA-N 0.000 description 1
- 238000001890 transfection Methods 0.000 description 1
- 230000007704 transition Effects 0.000 description 1
- 238000013519 translation Methods 0.000 description 1
- 230000032258 transport Effects 0.000 description 1
- 239000012588 trypsin Substances 0.000 description 1
- 229960001322 trypsin Drugs 0.000 description 1
- DRTQHJPVMGBUCF-UHFFFAOYSA-N uracil arabinoside Natural products OC1C(O)C(CO)OC1N1C(=O)NC(=O)C=C1 DRTQHJPVMGBUCF-UHFFFAOYSA-N 0.000 description 1
- 229940045145 uridine Drugs 0.000 description 1
- 210000004127 vitreous body Anatomy 0.000 description 1
- 239000007222 ypd medium Substances 0.000 description 1
Classifications
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/09—Recombinant DNA-technology
- C12N15/63—Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
- C12N15/79—Vectors or expression systems specially adapted for eukaryotic hosts
- C12N15/80—Vectors or expression systems specially adapted for eukaryotic hosts for fungi
- C12N15/81—Vectors or expression systems specially adapted for eukaryotic hosts for fungi for yeasts
- C12N15/815—Vectors or expression systems specially adapted for eukaryotic hosts for fungi for yeasts for yeasts other than Saccharomyces
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/09—Recombinant DNA-technology
- C12N15/63—Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
- C12N15/79—Vectors or expression systems specially adapted for eukaryotic hosts
- C12N15/80—Vectors or expression systems specially adapted for eukaryotic hosts for fungi
- C12N15/81—Vectors or expression systems specially adapted for eukaryotic hosts for fungi for yeasts
-
- C—CHEMISTRY; METALLURGY
- C07—ORGANIC CHEMISTRY
- C07K—PEPTIDES
- C07K14/00—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof
- C07K14/435—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from animals; from humans
- C07K14/78—Connective tissue peptides, e.g. collagen, elastin, laminin, fibronectin, vitronectin or cold insoluble globulin [CIG]
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/09—Recombinant DNA-technology
- C12N15/63—Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
- C12N15/65—Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression using markers
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N9/00—Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
- C12N9/0004—Oxidoreductases (1.)
- C12N9/0071—Oxidoreductases (1.) acting on paired donors with incorporation of molecular oxygen (1.14)
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12P—FERMENTATION OR ENZYME-USING PROCESSES TO SYNTHESISE A DESIRED CHEMICAL COMPOUND OR COMPOSITION OR TO SEPARATE OPTICAL ISOMERS FROM A RACEMIC MIXTURE
- C12P21/00—Preparation of peptides or proteins
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12P—FERMENTATION OR ENZYME-USING PROCESSES TO SYNTHESISE A DESIRED CHEMICAL COMPOUND OR COMPOSITION OR TO SEPARATE OPTICAL ISOMERS FROM A RACEMIC MIXTURE
- C12P21/00—Preparation of peptides or proteins
- C12P21/02—Preparation of peptides or proteins having a known sequence of two or more amino acids, e.g. glutathione
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N2800/00—Nucleic acids vectors
- C12N2800/22—Vectors comprising a coding region that has been codon optimised for expression in a respective host
Landscapes
- Life Sciences & Earth Sciences (AREA)
- Health & Medical Sciences (AREA)
- Chemical & Material Sciences (AREA)
- Genetics & Genomics (AREA)
- Organic Chemistry (AREA)
- Engineering & Computer Science (AREA)
- Zoology (AREA)
- Wood Science & Technology (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Biotechnology (AREA)
- General Engineering & Computer Science (AREA)
- Biomedical Technology (AREA)
- Molecular Biology (AREA)
- Biochemistry (AREA)
- General Health & Medical Sciences (AREA)
- Mycology (AREA)
- Microbiology (AREA)
- Biophysics (AREA)
- Physics & Mathematics (AREA)
- Plant Pathology (AREA)
- Medicinal Chemistry (AREA)
- Proteomics, Peptides & Aminoacids (AREA)
- Toxicology (AREA)
- Gastroenterology & Hepatology (AREA)
- Chemical Kinetics & Catalysis (AREA)
- General Chemical & Material Sciences (AREA)
- Micro-Organisms Or Cultivation Processes Thereof (AREA)
- Preparation Of Compounds By Using Micro-Organisms (AREA)
- Peptides Or Proteins (AREA)
Abstract
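Genetically engineered yeast strains for producing increased amounts of non-hydroxylated collagen or hydroxylated collagen are described. Chimeric collagen DNA sequences comprising 10 to 40 percent or 60 to 90 percent optimized DNA, based on the total length of the chimeric collagen DNA, are also described. An all-in-one vector comprising the DNA necessary to produce collagen, promoters, and hydroxylating enzymes is also described. Methods for producing non-hydroxylated or hydroxylated collagen are also provided.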
Description
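Cross-Reference to Related Applications
This application claims priority to U.S. Provisional Application No. 62/539,213, filed July 31, 2017, which is incorporated herein by reference in its entirety.
This application is related to U.S. Patent Application No. 15/433,566, entitled "Biofabricated Material Containing Collagen Fibrils", and No. 15/433,650, entitled "Method for Making a Biofabricated Material Containing Collagen Fibrils", which are incorporated by reference.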
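Technical Field
The present invention relates to genetically engineered yeast strains and methods for producing recombinant collagen that is used to make biofabricated leather, or a material with leather-like properties, containing recombinant or engineered collagen. The yeast strains are engineered to permit control over the structural and textural properties of the recombinant collagen by selecting a specific degree of hydroxylation of the recombinant collagen. This allows the properties of the recombinant collagen to be tailored to a specific end use, for example for incorporation into a variety of different cruelty-free and green biofabricated leathers and similar materials.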
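Leather is used in a wide variety of applications, including furniture upholstery, clothing, shoes, luggage, handbags and accessories, and automotive applications. The estimated global trade value of leather is approximately US $100 billion per year (Future Trends in the World Leather Products Industry and Trade, United Nations Industrial Development Organization, Vienna, 2010), and there is a continuing and increasing demand for leather products. New ways to meet this demand are needed in view of the economic, environmental, and social costs of producing leather. To keep up with technological and aesthetic trends, producers and users of leather products seek new materials that incorporate natural components and exhibit superior strength, uniformity, processability, and fashionable, appealing aesthetic properties.
Given population growth and the state of the global environment, there will be a need for alternative materials that have leather-like aesthetics and improved functionality. Leather is animal hide and consists almost entirely of collagen. There is a need for new sources of collagen that can be incorporated into biofabricated leather materials.
The production of biofabricated leather using recombinantly expressed collagen faces a number of challenges, including the need for a method of efficiently producing collagen in the forms and quantities required for diverse commercial applications. For some applications a softer and more permeable collagen component is desired; for others, a harder, more resistant, and more durable collagen component is needed.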
Recombinant expression of some collagens and collagen-like proteins is known; see Bell, EP 1232182B1, Bovine collagen and method for producing recombinant gelatin; Olsen et al., U.S. Patent No. 6,428,978, Methods for the production of gelatin and full-length triple helical collagen in recombinant cells; and VanHeerde et al., U.S. Patent No. 8,188,230, Method for recombinant microorganism expression and isolation of collagen-like polypeptides, the disclosures of which are incorporated herein by reference. Such recombinant collagens have not been used to produce leather or biofabricated leather products.
Vectors useful for expressing proteins in yeast are known; see Ausubel et al., In: Current Protocols in Molecular Biology, Vol. 2, Chapter 13, Greene Publish. Assoc. & Wiley Interscience, 1988; Grant et al. (1987), Expression and Secretion Vectors for Yeast, in Methods in Enzymology, Ed. Wu & Grossman, Acad. Press, N.Y. 153:516-544; Glover (1986) DNA Cloning, Vol. II, IRL Press, Wash., D.C., Ch. 3; Bitter (1987), Heterologous Gene Expression in Yeast, in Methods in Enzymology, Eds. Berger & Kimmel, Acad. Press, N.Y. 152:673-684; and The Molecular Biology of the Yeast Saccharomyces, Eds. Strathern et al., Cold Spring Harbor Press, Vols. I and II (1982), the disclosures of which are incorporated herein by reference. Yeast expression vectors are commercially available, for example as described in the catalogs of ThermoFisher Scientific (www._thermofisher.com); ATUM (https://www._atum.bio/products/expression-vectors/yeast); or IBA (https://www._iba-lifesciences.com/cloning-yeast-vectors.html) (each last accessed July 16, 2018; incorporated by reference).
Pichia pastoris is a yeast species that has been used to recombinantly express biopharmaceutical proteins such as human interferon gamma; see Razaghi, et al., Biologicals 45: 52-60 (2017). It has also been used to express type III collagen and prolyl-4-hydroxylase; see Vuorela, et al., EMBO J. 16:6702-6712 (1997). Collagen and prolyl-4-hydroxylase have also been expressed in Escherichia coli to produce a collagenous material; see Pinkas, et al., ACS Chem. Biol. 6(4):320-324 (2011).
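The use of codon modification to provide tropocollagen with a selected degree of hydroxylation, and thereby to provide a range of different collagen materials for use in producing bioengineered leather, has not previously been investigated.
The inventors sought to address this challenge by engineering recombinant yeasts that can abundantly express collagen in different forms characterized by a selected degree of hydroxylation.
One aspect of the invention is directed to recombinant yeast strains engineered to efficiently express collagen and to control the degree of hydroxylation of lysine and proline residues in the expressed collagen. This aspect of the invention provides a recombinant yeast that can express a recombinant collagen having a selected degree of hydroxylation of its lysine residues, its proline residues, or both, based on the number of lysine, proline, or lysine and proline residues in the collagen. The degree of hydroxylation of a collagen correlates with the looseness or tightness of the collagen triple helix or tropocollagen and with the functional and aesthetic properties of products made with the recombinant collagen, such as biofabricated leather.
Other embodiments of the invention include codon-modified nucleic acid sequences encoding collagen or hydroxylases, vectors encoding collagen and hydroxylase(s), such as "all-in-one vectors", and methods of making and using recombinant collagen. In yet another embodiment, the invention provides chimeric DNA sequences in yeast hosts that are useful for producing hydroxylated collagen and non-hydroxylated collagen.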
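FIG. 1 shows a vector diagram of MMV-63, which is designed to produce non-hydroxylated collagen.
FIG. 2 shows a vector diagram of MMV-77, which is designed to produce non-hydroxylated collagen.
FIG. 3 shows a vector diagram of MMV-129, which is designed to produce non-hydroxylated collagen.
FIG. 4 shows a vector diagram of MMV-130, which is designed to produce non-hydroxylated collagen.
FIG. 5 shows a vector diagram of MMV-78, which is designed to produce hydroxylated collagen.
FIG. 6 shows a vector diagram of MMV-94, which is designed to produce hydroxylated collagen.
FIG. 7 shows a vector diagram of MMV-156, which is designed to produce hydroxylated collagen.
FIG. 8 shows a vector diagram of MMV-191, which is designed to produce hydroxylated collagen.
FIG. 9 shows the all-in-one vector MMV-208, which is designed to produce non-hydroxylated or hydroxylated collagen.
FIG. 10 shows a vector diagram of MMV-84.
FIG. 11 shows a vector diagram of MMV-150.
FIG. 12 shows a vector diagram of MMV-140.
FIG. 13 shows a vector diagram of MMV-132.
FIG. 14 shows a vector diagram of MMV-193.
FIG. 15 shows a vector diagram of MMV-194.
FIG. 16 shows a vector diagram of MMV-195.
FIG. 17 shows a vector diagram of MMV-197.
FIG. 18 shows a vector diagram of MMV-198.
FIG. 19 shows a vector diagram of MMV-199.
FIG. 20 shows a vector diagram of MMV-200.
FIG. 21 shows a vector diagram of MMV-128.
FIG. 22 depicts Col3A1 chimeric molecules.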
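As exemplified herein, Pichia pastoris is used to express recombinant bovine type III collagen with different degrees of hydroxylation. Hydroxylation of the recombinant collagen is accomplished by co-expression of bovine P4HA and bovine P4HB, which encode the alpha and beta subunits of bovine prolyl-4-hydroxylase, respectively. However, the invention is not limited to the production and expression of type III collagen and may be practiced with polynucleotides encoding the subunits of other kinds of collagen and with enzymes that hydroxylate proline residues, lysine residues, or both proline and lysine residues. Type III tropocollagen is a homotrimer. In some embodiments, however, a collagen will form a heterotrimer composed of different polypeptide chains, such as type I collagen, which is initially composed of two pro-α1(I) chains and one pro-α2(I) chain.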
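Collagen. Collagen is the main component of leather. Skin, or animal hide, contains significant amounts of collagen, a fibrous protein. Collagen is a generic term for a family of at least 28 distinct collagen types; animal skin is typically type I collagen, although other types of collagen, including type III collagen, can be used in forming leather. The term "collagen" encompasses unprocessed collagen (e.g., procollagen) as well as post-translationally modified and proteolyzed collagen having a triple helical structure.
Collagens are characterized by a repeating triplet of amino acids, -(Gly-X-Y)n-, so that approximately one-third of the amino acid residues in collagen are glycine. X is often proline and Y is often hydroxyproline, though there may be as many as 400 possible Gly-X-Y triplets. Different animals may produce collagens with different amino acid compositions, which may impart different properties to the collagen and result in leathers with different properties or appearance.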
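The -(Gly-X-Y)n- repeat described above can be illustrated with a short sketch. It assumes a one-letter amino acid string that begins at a glycine; the function names and the 9-residue example segment are hypothetical and are not taken from any sequence in this disclosure.

```python
def gly_xy_triplets(seq: str) -> list[str]:
    """Split a one-letter amino acid sequence into consecutive Gly-X-Y triplets.

    Scanning starts at the first residue and stops at the first triplet
    that does not begin with glycine (G).
    """
    triplets = []
    for i in range(0, len(seq) - 2, 3):
        triplet = seq[i:i + 3]
        if triplet[0] != "G":
            break
        triplets.append(triplet)
    return triplets


def glycine_fraction(seq: str) -> float:
    """Fraction of residues that are glycine (about 1/3 for a pure repeat)."""
    return seq.count("G") / len(seq) if seq else 0.0


# Hypothetical collagen-like segment, used only for illustration.
example = "GPPGAPGPQ"
print(gly_xy_triplets(example))
print(round(glycine_fraction(example), 3))
```

A sequence that follows the repeat throughout yields one triplet per three residues and a glycine fraction of one-third, matching the composition stated in the text.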
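The structure of collagen can consist of three intertwined peptide chains of differing lengths. Collagen triple helices (or monomers) may be produced from alpha chains about 1,050 amino acids long, so that the triple helix takes the form of a rod about 300 nm long with a diameter of approximately 1.5 nm.
Collagen fibers may have a range of diameters depending on the type of animal hide. In addition to type I collagen, skin (hide) may also include other types of collagen, including type III collagen (reticulin), type IV collagen and type VII collagen.
Various types of collagen exist throughout the mammalian body. For example, besides being the main component of skin and animal hide, type I collagen also exists in cartilage, tendon, vascular ligature, organs, muscle and the organic portion of bone. Successful efforts have been made to isolate collagen from various regions of the mammalian body in addition to animal skin or hide. Decades ago, researchers found that acid-solubilized collagen at neutral pH self-assembles into fibrils composed of the same cross-striated patterns observed in native tissue; Schmitt F.O. J. Cell. Comp Physiol. 1942;20:11. This led to the use of collagen in tissue engineering and a variety of biomedical applications. In more recent years, collagen has been harvested from bacteria and yeast using recombinant techniques.
Collagens are formed and stabilized through a combination of physical and chemical interactions, including electrostatic interactions such as salt bridging, hydrogen bonding, Van der Waals interactions, dipole-dipole forces, polarization forces, hydrophobic interactions, and covalent bonding that is often catalyzed by enzymatic reactions. Various distinct collagen types have been identified in vertebrates, including bovine, ovine, porcine, chicken and human collagens.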
The invention may be practiced with polynucleotides encoding one or more types of collagen. Generally, collagen types are numbered by Roman numerals, and the chains found in each collagen type are identified by Arabic numerals. Detailed descriptions of the structure and biological functions of the various types of naturally occurring collagens are available in the art; see, e.g., Ayad et al. (1998) The Extracellular Matrix Facts Book, Academic Press, San Diego, CA; Burgeson, R. E., and Nimni (1992) "Collagen types: Molecular Structure and Tissue Distribution" in Clin. Orthop. 282:250-272; Kielty, C. M. et al. (1993) "The Collagen Family: Structure, Assembly And Organization In The Extracellular Matrix," Connective Tissue And Its Heritable Disorders, Molecular Genetics, And Medical Aspects, Royce, P. M. and B. Steinmann eds., Wiley-Liss, NY, pp. 103-147; and Prockop, D.J. and K.I. Kivirikko (1995) "Collagens: Molecular Biology, Diseases, and Potentials for Therapy," Annu. Rev. Biochem., 64:403-434.
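Type I collagen is the major fibrillar collagen of bone and skin, comprising approximately 80% to 90% of an organism's total collagen. Type I collagen is the major structural macromolecule present in the extracellular matrix of multicellular organisms and comprises approximately 20% of total protein mass. Type I collagen is a heterotrimeric molecule comprising two α1(I) chains and one α2(I) chain, encoded by the COL1A1 and COL1A2 genes, respectively. In vivo, assembly of type I collagen fibrils, fibers and fiber bundles takes place during development and provides mechanical support to the tissue while allowing for cellular motility and nutrient transport. Other collagen types are less abundant than type I collagen and exhibit different distribution patterns. For example, type II collagen is the predominant collagen in cartilage and vitreous humor, while type III collagen is found at high levels in blood vessels and to a lesser extent in skin.
Type II collagen is a homotrimeric collagen comprising three identical α1(II) chains encoded by the COL2A1 gene. Purified type II collagen may be prepared from tissues by methods known in the art, for example by the procedures described in Miller and Rhodes (1982) Methods In Enzymology 82:33-64.
Type III collagen is a major fibrillar collagen found in skin and vascular tissues. Type III collagen is a homotrimeric collagen comprising three identical α1(III) chains encoded by the COL3A1 gene. Methods for purifying type III collagen from tissues may be found in, for example, Byers et al. (1974) Biochemistry 13:5243-5248; and Miller and Rhodes, supra, and collagen so purified may be used in combination with collagen expressed by the methods of the invention.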
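Type IV collagen is found in basement membranes in the form of sheets rather than fibrils. Most commonly, type IV collagen contains two α1(IV) chains and one α2(IV) chain. The particular chains comprising type IV collagen are tissue-specific. Type IV collagen may be purified using, for example, the procedures described in Furuto and Miller (1987) Methods in Enzymology, 144:41-61, Academic Press.
Type V collagen is a fibrillar collagen found primarily in bones, tendon, cornea, skin and blood vessels. Type V collagen exists in both homotrimeric and heterotrimeric forms. One form of type V collagen is a heterotrimer of two α1(V) chains and one α2(V) chain. Another form of type V collagen is a heterotrimer of α1(V), α2(V) and α3(V) chains. A further form of type V collagen is a homotrimer of α1(V). Methods for isolating type V collagen from natural sources may be found in, for example, Elstow and Weiss (1983) Collagen Rel. Res. 3:181-193, and Abedin et al. (1982) Biosci. Rep. 2:493-502.
Type VI collagen has a small triple helical region and two large non-collagenous remainder portions. Type VI collagen is a heterotrimer comprising α1(VI), α2(VI) and α3(VI) chains. Type VI collagen is found in many connective tissues. Descriptions of how to purify type VI collagen from natural sources may be found in, for example, Wu et al. (1987) Biochem. J. 248:373-381, and Kielty et al. (1991) J. Cell Sci. 99:797-807.
Type VII collagen is a fibrillar collagen found in particular epithelial tissues. Type VII collagen is a homotrimeric molecule of three α1(VII) chains. Descriptions of how to purify type VII collagen from tissue may be found in, for example, Lunstrum et al. (1986) J. Biol. Chem. 261:9042-9048, and Bentz et al. (1983) Proc. Natl. Acad. Sci. USA 80:3168-3172.
Type VIII collagen can be found in Descemet's membrane in the cornea. Type VIII collagen is a heterotrimer comprising two α1(VIII) chains and one α2(VIII) chain, although other chain compositions have been reported. Methods for the purification of type VIII collagen from nature may be found in, for example, Benya and Padilla (1986) J. Biol. Chem. 261:4160-4169, and Kapoor et al. (1986) Biochemistry 25:3930-3937.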
Type IX collagen is a fibril-associated collagen found in cartilage and vitreous humor. Type IX collagen is a heterotrimeric molecule comprising α1(IX), α2(IX) and α3(IX) chains. Type IX collagen has been classified as a FACIT (Fibril Associated Collagens with Interrupted Triple Helices) collagen, possessing several triple helical domains separated by non-triple helical domains. Procedures for purifying type IX collagen may be found in, for example, Duance, et al. (1984) Biochem. J. 221:885-889; Ayad et al. (1989) Biochem. J. 262:753-761; and Grant et al. (1988) The Control of Tissue Damage, Glauert, A. M., ed., Elsevier Science Publishers, Amsterdam, pp. 3-28.
Type X collagen is a homotrimeric compound of α1(X) chains. Type X collagen has been isolated from, for example, hypertrophic cartilage found in growth plates; see, e.g., Apte et al. (1992) Eur J Biochem 206 (1):217-24.
Type XI collagen can be found in cartilaginous tissues associated with type II and type IX collagens, and at other locations in the body. Type XI collagen is a heterotrimeric molecule comprising α1(XI), α2(XI) and α3(XI) chains. Methods for purifying type XI collagen may be found in, for example, Grant et al., supra.
Type XII collagen is a FACIT collagen found primarily in association with type I collagen. Type XII collagen is a homotrimeric molecule comprising three α1(XII) chains. Methods for purifying type XII collagen and variants thereof may be found in, for example, Dublet et al. (1989) J. Biol. Chem. 264:13150-13156; Lunstrum et al. (1992) J. Biol. Chem. 267:20087-20092; and Watt et al. (1992) J. Biol. Chem. 267:20093-20099.
Type XIII is a non-fibrillar collagen found in, for example, skin, intestine, bone, cartilage and striated muscle. A detailed description of type XIII collagen may be found in, for example, Juvonen et al. (1992) J. Biol. Chem. 267: 24700-24707.
Type XIV is a FACIT collagen characterized as a homotrimeric molecule comprising α1(XIV) chains. Methods for isolating type XIV collagen may be found in, for example, Aubert-Foucher et al. (1992) J. Biol. Chem. 267:15759-15764, and Watt et al., supra.
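Type XV collagen is homologous in structure to type XVIII collagen. Information about the structure and isolation of natural type XV collagen may be found in, for example, Myers et al. (1992) Proc. Natl. Acad. Sci. USA 89:10144-10148; Huebner et al. (1992) Genomics 14:220-224; Kivirikko et al. (1994) J. Biol. Chem. 269:4773-4779; and Muragaki, J. (1994) Biol. Chem. 264:4042-4046.
Type XVI collagen is a fibril-associated collagen found in, for example, skin, lung fibroblasts and keratinocytes. Information on the structure of type XVI collagen and the gene encoding type XVI collagen may be found in, for example, Pan et al. (1992) Proc. Natl. Acad. Sci. USA 89:6565-6569; and Yamaguchi et al. (1992) J. Biochem. 112:856-863.
Type XVII collagen is a hemidesmosomal transmembrane collagen, also known as the bullous pemphigoid antigen. Information on the structure of type XVII collagen and the gene encoding type XVII collagen may be found in, for example, Li et al. (1993) J. Biol. Chem. 268(12):8825-8834; and McGrath et al. (1995) Nat. Genet. 11(1):83-86.
Type XVIII collagen is similar in structure to type XV collagen and can be isolated from the liver. Descriptions of the structure of type XVIII collagen and its isolation from natural sources may be found in, for example, Rehn and Pihlajaniemi (1994) Proc. Natl. Acad. Sci USA 91:4234-4238; Oh et al. (1994) Proc. Natl. Acad. Sci USA 91:4229-4233; Rehn et al. (1994) J. Biol. Chem. 269:13924-13935; and Oh et al. (1994) Genomics 19:494-499.
Type XIX collagen is believed to be another member of the FACIT collagen family and has been found in mRNA isolated from rhabdomyosarcoma cells. Descriptions of the structure and isolation of type XIX collagen may be found in, for example, Inoguchi et al. (1995) J. Biochem. 117:137-146; Yoshioka et al. (1992) Genomics 13:884-886; and Myers et al., J. Biol. Chem. 289:18549-18557 (1994).
Type XX collagen is a newly found member of the FACIT collagenous family and has been identified in chick cornea; see, e.g., Gordon et al. (1999) FASEB Journal 13:A1119; and Gordon et al. (1998), IOVS 39:S1128.
One or more kinds of collagen may be expressed using the methods of the invention, and the expressed collagen may be further processed or purified as described in the references cited above (incorporated by reference for all purposes).
The term "collagen" refers to any one of the known collagen types, including collagen types I through XX described above, as well as to any other collagen, whether natural, synthetic, semi-synthetic or recombinant. It includes all of the collagens, modified collagens and collagen-like proteins described herein. The term also encompasses procollagens and collagen-like proteins or collagenous proteins comprising the motif (Gly-X-Y)n, where n is an integer. It encompasses molecules of collagen and collagen-like proteins, trimers of collagen molecules, fibrils of collagen, and fibers of collagen fibrils. It also refers to chemically, enzymatically or recombinantly modified collagens or collagen-like molecules that can be fibrillated, as well as fragments of collagen, collagen-like molecules and collagenous molecules capable of assembling into a nanofiber. Recombinant collagen molecules, whether native or engineered, will generally comprise the repeated -(Gly-X-Y)n- sequence described herein.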
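Hydroxylation of proline and lysine residues in collagen. The principal post-translational modifications of the polypeptides of collagen are the hydroxylation of proline and/or lysine residues to yield 4-hydroxyproline, 3-hydroxyproline (Hyp) and/or hydroxylysine (Hyl), and the glycosylation of hydroxylysyl residues. These modifications are catalyzed by three hydroxylases (prolyl 4-hydroxylase, prolyl 3-hydroxylase and lysyl hydroxylase) and two glycosyl transferases. In vivo, these reactions occur until the polypeptides form the triple-helical collagen structure, which inhibits further modification.
Prolyl-4-hydroxylase. This enzyme catalyzes the hydroxylation of proline residues to (2S,4R)-4-hydroxyproline (Hyp). Gorres, et al., Critical Reviews in Biochemistry and Molecular Biology 45 (2): (2010) (incorporated by reference). The examples below employ tetrameric bovine prolyl-4-hydroxylase (2 alpha and 2 beta chains) encoded by P4HA (SEQ ID NO: 54) and P4HB (SEQ ID NO: 52); however, isoforms, orthologs, variants, fragments and prolyl-4-hydroxylases from non-bovine sources may also be used as long as they retain hydroxylase activity in a yeast host cell. P4HA1 is further described by http://_www.omim.org/entry/176710 and P4HB1 is further described by http://www.omim.org/entry/176790 (both incorporated by reference).
Prolyl 3-hydroxylase. This enzyme catalyzes the hydroxylation of proline residues. Prolyl 3-hydroxylase 1 precursor [Bos taurus] is described by NCBI Reference Sequence: NP_001096761.1 or NM_001103291.1 (SEQ ID NO: 48). For further description see Vranka, et al., J. Biol. Chem. 279: 23615-23621 (2004) or http://_www.omim.org/entry/610339 (last accessed Jul. 14, 2017) (incorporated by reference). This enzyme may be used in its native form. However, isoforms, orthologs, variants, fragments and prolyl-3-hydroxylases from non-bovine sources may also be used as long as they retain hydroxylase activity in a yeast host cell.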
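Lysyl hydroxylase. Lysyl hydroxylase (EC 1.14.11.4) catalyzes the formation of hydroxylysine in collagens and in other proteins with collagen-like amino acid sequences, by the hydroxylation of lysine residues in X-lys-gly sequences. The enzyme is a homodimer consisting of subunits with a molecular mass of about 85 kD. Despite marked similarities in kinetic properties between lysyl hydroxylase and prolyl-4-hydroxylase, no significant homology has been found between the primary structure of lysyl hydroxylase and the two types of subunits of prolyl-4-hydroxylase (176710, 176790). The hydroxylysine residues formed in the lysyl hydroxylase reaction have two important functions: first, their hydroxy groups serve as sites of attachment for carbohydrate units, either the monosaccharide galactose or the disaccharide glucosylgalactose; second, they stabilize intermolecular collagen crosslinks.
PLOD1 procollagen-lysine,2-oxoglutarate 5-dioxygenase 1 [Bos taurus (cattle)] is described by Gene ID: 281409, updated on May 25, 2017, and is incorporated by reference to https://www.ncbi.nlm.nih.gov/gene/281409 (last accessed Jul. 14, 2017). Another example is given by SEQ ID NO: 50, which describes Bos taurus lysyl oxidase (LOX). This enzyme may be used in its native form. However, isoforms, orthologs, variants, fragments and lysyl hydroxylases from non-bovine sources may also be used as long as they retain hydroxylase activity in a yeast host cell.
Assays for the degree of hydroxylation of proline residues in recombinant collagen. The degree of hydroxylation of proline residues in recombinant collagen may be assessed by known methods, including liquid chromatography-mass spectrometry as described by Chan, et al., BMC Biotechnology 12:51 (2012) (incorporated by reference).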
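Once an assay such as LC-MS has yielded residue counts, the degree of hydroxylation can be expressed as a percentage of the residues of that type, which is how the percentages elsewhere in this description are stated. The sketch below is only an illustration of that arithmetic; the function name and the example counts are hypothetical and do not come from the cited method.

```python
def percent_hydroxylated(modified: int, unmodified: int) -> float:
    """Percent of residues of one type (proline or lysine) that are hydroxylated.

    modified:   count of hydroxylated residues (e.g., hydroxyproline)
    unmodified: count of residues of the same type that were not hydroxylated
    """
    total = modified + unmodified
    if total == 0:
        raise ValueError("no residues counted")
    return 100.0 * modified / total


# Hypothetical counts for illustration: 45 hydroxyproline, 55 proline.
print(percent_hydroxylated(45, 55))
```

The same calculation applies to lysine and hydroxylysine counts.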
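Assays for the degree of hydroxylation of lysine residues in recombinant collagen. Lysine hydroxylation and cross-linking of collagen is described by Yamauchi, et al., Methods in Molecular Biology, vol. 446, pages 95-108; Humana Press (2008) (incorporated by reference). The degree of hydroxylation of lysine residues in recombinant collagen may be assessed by known methods, including the method described by Hausmann, Biochimica et Biophysica Acta (BBA) - Protein Structure 133(3): 591-593 (1967) (incorporated by reference).
Collagen melting point. The degree of hydroxylation of proline, lysine, or proline and lysine residues in collagen may be estimated from the melting temperature of a hydrated collagen, such as a hydrogel, compared to a control collagen having a known content of hydroxylated amino acid residues. Collagen melting temperatures can range from 25 to 40° C., with more highly hydroxylated collagens generally having higher melting temperatures. This range includes all intermediate subranges and values, including 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39 and 40.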
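Codon modification. This process involves altering a polynucleotide sequence that encodes a collagen, such as a collagen DNA sequence found in nature, in order to modify the amount of recombinant collagen expressed by a yeast such as Pichia pastoris, to modify the amount of recombinant collagen secreted by the recombinant yeast, to modify the rate of expression of recombinant collagen in the recombinant yeast, or to modify the degree of hydroxylation of lysine or proline residues in the recombinant collagen. Codon modification may also be applied to other proteins, such as hydroxylases, for similar purposes or to target a hydroxylase to a particular intracellular or extracellular compartment, for example to target a proline hydroxylase to the same compartment as the recombinant collagen molecule, such as the endoplasmic reticulum.
Codon selections may be made based on their effect on RNA secondary structure, on transcription and gene expression, on the rate of translation elongation, and/or on protein folding.
Codons encoding collagen or a hydroxylase may be modified to reduce or increase secondary structure in the mRNA encoding the recombinant collagen or hydroxylase. They may also be modified to replace redundant codons with: codons that, on average, are most frequently used by the yeast host cell based on all of its protein-coding sequences (e.g., codon sampling); codons that are least frequently used by the yeast host cell based on all of its protein-coding sequences (e.g., codon sampling); or redundant codons that appear in proteins abundantly expressed or secreted by the yeast host cell (e.g., a codon selection based on a High Codon Adaptation Index that makes the gene "look like" a highly expressed gene or a gene encoding a secretable protein from the expression host).
Codon modification may be applied to all or part of a protein-coding sequence, for example to at least one of the first, second, third, fourth, fifth, sixth, seventh, eighth, ninth or tenth 10% of the coding sequence, or to combinations thereof. It may also be applied selectively to the codons encoding a particular amino acid, or to codons encoding some but not all of the amino acids that are encoded by redundant codons. For example, only the codons for leucine and phenylalanine may be codon-modified as described above. Amino acids encoded by more than one codon are well known in the art and are described by the codon table at https://en.wikipedia.org/wiki/DNA_codon_table (last accessed Jul. 13, 2017), which is incorporated by reference.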
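A minimal sketch of this kind of selective codon modification follows. It recodes only leucine and phenylalanine codons, and only within chosen tenths (10% blocks) of a coding sequence. The "preferred" codons in the table are placeholders assumed for illustration; in practice they would be taken from a codon usage table for the chosen yeast host. The function name and the 10-codon example sequence are likewise hypothetical.

```python
# Assumed "preferred" codons for illustration only; real choices would come
# from a codon usage table for the expression host.
PREFERRED = {"L": "TTG", "F": "TTC"}

# Synonymous codons for leucine and phenylalanine (standard genetic code).
SYNONYMS = {
    "L": {"TTA", "TTG", "CTT", "CTC", "CTA", "CTG"},
    "F": {"TTT", "TTC"},
}


def recode(cds: str, deciles: set[int]) -> str:
    """Replace Leu/Phe codons with the preferred codon in the selected deciles.

    deciles: subset of {1, ..., 10}; 1 is the first 10% of the codons.
    """
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    codons = [cds[i:i + 3] for i in range(0, len(cds), 3)]
    n = len(codons)
    out = []
    for idx, codon in enumerate(codons):
        decile = idx * 10 // n + 1  # maps codon index to 1..10
        new_codon = codon
        if decile in deciles:
            for aa, synonyms in SYNONYMS.items():
                if codon in synonyms:
                    new_codon = PREFERRED[aa]
        out.append(new_codon)
    return "".join(out)


# Hypothetical 10-codon sequence, so each codon falls in its own decile.
example = "".join(["ATG", "CTT", "TTT", "CTA", "GGT",
                   "CTG", "TTT", "CCA", "TTA", "TAA"])
print(recode(example, {1, 2, 3}))
```

Because every replacement is synonymous, the encoded amino acid sequence is unchanged; only the codons in the first three deciles are altered in this example.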
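Codon modification includes the so-called codon-optimization methods described by https://www.atum.bio/services/genegps (last accessed Jul. 13, 2017), https://www.idtdna.com/CodonOpt; https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1523223/ or https://en.wikipedia.org/wiki/DNA2.0Algorithm (each incorporated by reference).
Codon modification also includes selecting codons so as to permit the formation of mRNA secondary structure, or to minimize or eliminate secondary structure. An example is making codon selections that eliminate, reduce or weaken strong secondary structure at or around a ribosome-binding site or initiation codon.
Collagen fragments. A recombinant collagen molecule may comprise a fragment of the amino acid sequence of a native collagen molecule capable of forming tropocollagen (trimeric collagen), or of a modified collagen molecule or truncated collagen molecule (or a fibril-forming region thereof, or a segment substantially comprising [Gly-X-Y]n) having an amino acid sequence that is at least 70%, 80%, 90%, 95%, 96%, 97%, 98% or 99% identical or similar to a native collagen amino acid sequence, such as the amino acid sequences of Col1A1, Col1A2 and Col3A1 described by Accession Nos. NP_001029211.1 (https://_www.ncbi.nlm.nih.gov/protein/77404252, last accessed Feb. 9, 2017), NP_776945.1 (https://_www.ncbi.nlm.nih.gov/protein/27806257, last accessed Feb. 9, 2017) and NP_001070299.1 (https://_www.ncbi.nlm.nih.gov/protein/116003881, last accessed Feb. 9, 2017) (incorporated by reference).
A gene encoding collagen or a hydroxylase may be truncated or otherwise modified to add or remove sequences. Such modifications may be made to customize the size of a polynucleotide or vector, to target the expressed protein to the endoplasmic reticulum or another cellular or extracellular compartment, or to control the length of the encoded protein. For example, the inventors found that constructs containing only the Pre sequence often work better than those containing the entire Pre-pro sequence. The Pre sequence was fused to P4HB to localize P4HB in the ER, where collagen is also localized.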
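Modified coding sequences for collagens and hydroxylases. A polynucleotide coding sequence for a collagen, a hydroxylase or another protein may be modified to encode a protein that is at least 70%, 80%, 90%, 95%, 96%, 97%, 98% or 100% identical or similar to a known amino acid sequence and that retains the essential properties of the unmodified molecule, for example the ability to form tropocollagen or the ability to hydroxylate proline or lysine residues in collagen. Glycosylation sites in a collagen molecule may be removed or added. Modifications may be made to facilitate collagen production by, or secretion from, the yeast host cell, or to change its structural, functional or aesthetic properties. A modified collagen or hydroxylase coding sequence may also be codon-modified as described herein.
The terms "native collagen", "native polypeptide" or "native polynucleotide" mean a polypeptide or polynucleotide sequence as found in nature, without alteration of the native sequence: for example, without deletion, substitution or addition of amino acid residues, and without deletion, insertion or substitution of nucleotides, such as by codon modification. The types of collagens and enzymes described herein include their native forms as well as modified forms that retain a biological activity of the native collagen or enzyme. Modified forms of polynucleotides and polypeptides may be identified by having a specified degree of sequence identity or similarity to the corresponding native sequence. Modified polynucleotide sequences also include those having 70%, 80%, 90%, 95%, 96%, 97%, 98%, 99% or 100% sequence identity or similarity to any vector described herein or to any polynucleotide element making up such a vector, for example as shown in FIGS. 1 to 20.
BLASTN may be used to identify a polynucleotide sequence having at least 70%, 75%, 80%, 85%, 87.5%, 90%, 92.5%, 95%, 97.5%, 98%, 99% or <100% sequence identity to a reference polynucleotide, such as a polynucleotide encoding a collagen, one or more hydroxylases described herein, a signal, leader or secretion peptide, or any other protein disclosed herein. A representative BLASTN setting modified to find highly similar sequences uses an Expect Threshold of 10, a Word size of 28, max matches in query range of 0, match/mismatch scores of 1/-2, and linear gap cost. Low complexity regions may be filtered or masked. Default settings of the standard nucleotide BLAST are described by, and incorporated by reference to, https://_blast.ncbi.nlm.nih.gov/Blast.cgi?PROGRAM=blastn&PAGE_TYPE=BlastSearch&LINK_LOC=blasthome (last accessed Jul. 13, 2017).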
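To make the identity thresholds concrete, the sketch below computes percent identity for two sequences that have already been aligned (gaps written as "-") and compares it to a cutoff such as 70%. This is a simplified illustration and is not a reimplementation of BLASTN; the function name and the example alignment are hypothetical, and identity is taken here as matching columns divided by alignment length.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over the columns of a pairwise alignment.

    Both inputs must be the same length; '-' marks a gap. Columns where
    both sequences have a gap are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(x, y) for x, y in zip(aligned_a, aligned_b)
               if not (x == "-" and y == "-")]
    if not columns:
        raise ValueError("empty alignment")
    matches = sum(1 for x, y in columns if x == y and x != "-")
    return 100.0 * matches / len(columns)


# Hypothetical aligned fragments: one mismatch and one gap in ten columns.
reference = "ATGGCTCCAG"
candidate = "ATGGCACC-G"
identity = percent_identity(reference, candidate)
print(identity, identity >= 70.0)
```

A candidate sequence would fall within the stated scope when its identity to the reference meets the chosen threshold.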
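BLASTP may be used to identify an amino acid sequence having at least 70%, 75%, 80%, 85%, 87.5%, 90%, 92.5%, 95%, 97.5%, 98%, 99% or <100% sequence identity or similarity to a reference amino acid sequence, such as a collagen amino acid sequence, using a similarity matrix such as BLOSUM45, BLOSUM62 or BLOSUM80, where BLOSUM80 can be used for closely related sequences, BLOSUM62 for midrange sequences, and BLOSUM45 for more distantly related sequences. Unless otherwise indicated, a similarity score will be based on the use of BLOSUM62. When BLASTP is used, the percent similarity is based on the BLASTP positives score and the percent sequence identity is based on the BLASTP identities score. BLASTP "Identities" shows the number and fraction of total residues in the high-scoring sequence pairs that are identical, and BLASTP "Positives" shows the number and fraction of residues for which the alignment scores have positive values and which are similar to each other. Amino acid sequences having these degrees of identity or similarity, or any intermediate degree of identity or similarity, to the amino acid sequences disclosed herein are contemplated and encompassed by this disclosure. A representative BLASTP setting uses an Expect Threshold of 10, a Word Size of 3, BLOSUM 62 as the matrix, a gap penalty of 11 (existence) and 1 (extension), and a conditional compositional score matrix adjustment. Other default settings for BLASTP are described by, and incorporated by reference to, the disclosure available at https://blast.ncbi.nlm.nih.gov/Blast.cgi?PROGRAM=blastp&PAGE_TYPE=BlastSearch&LINK_LOC=blasthome (last accessed Jul. 13, 2017).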
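The terms "derivative thereof", "modified sequence" or "analog", as applied to the polypeptides disclosed herein, refer to a polypeptide comprising an amino acid sequence that is at least 70%, 80%, 90%, 95% or 99% identical or similar to the amino acid sequence of a biologically active molecule. In some embodiments, the derivative comprises an amino acid sequence that is at least 75%, 80%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% identical to a native amino acid sequence or a previously engineered sequence. The derivative may comprise additions, deletions, substitutions or a combination thereof relative to the native amino acid sequence or the previously engineered molecule. For example, a derivative may introduce or delete 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 or more proline or lysine residues compared to a native collagen sequence. Such selections may be made to modify the looseness or tightness of a recombinant tropocollagen or fibrillated collagen.
A derivative may include a mutant polypeptide with 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11 to 15, 16 to 20, 21 to 25, or 26 to 30 additions, substitutions or deletions of amino acid residues. Additions or substitutions also include the use of non-naturally occurring amino acids or modified amino acids. A derivative may also include chemical modifications to a polypeptide, such as crosslinks between cysteine residues, or hydroxylated or glycosylated residues. Derivatives include those of all polypeptides disclosed herein, including the collagens and enzymes. Generally, a derivative will have at least one biological activity of the unmodified parent molecule: an enzyme derivative will generally have the enzymatic activity of the parent enzyme, and a collagen derivative will generally have at least one structural, chemical or biological property of the parent collagen.
Biofabricated leather. Any type of collagen, truncated collagen, unmodified or post-translationally modified collagen, or amino acid sequence-modified collagen that can be fibrillated and crosslinked by the methods described herein can be used to produce a biofabricated material or biofabricated leather. Biofabricated leather may contain a substantially homogeneous collagen, such as only type I or only type III collagen, or may contain mixtures of 2, 3, 4 or more different kinds of collagen. In some embodiments, a recombinant collagen, for example a component of a biofabricated leather, will have none of its lysine, proline, or lysine and proline residues hydroxylated. In other embodiments, at least 1%, 2%, 3%, 4%, 5%, 10%, 15%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, 95% or 100% (or any intermediate value or subrange) of the lysine, proline, or lysine and proline residues in a recombinant collagen will be hydroxylated.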
Yeast strains. The present invention utilizes yeast to produce collagen. Suitable yeasts include, but are not limited to, those of the genera Pichia, Candida, Komagataella, Hansenula, Saccharomyces and Cryptococcus, and combinations thereof. The yeast may be modified or hybridized. Hybridized yeasts are produced by mixed breeding of different strains of the same species, different species of the same genus, or strains of different genera. Yeast strains that may be used according to the invention include Pichia pastoris, Pichia membranifaciens, Pichia deserticola, Pichia cephalocereana, Pichia eremophila, Pichia myanmarensis, Pichia anomala, Pichia nakasei, Pichia siamensis, Pichia heedii, Pichia barkeri, Pichia norvegensis, Pichia thermomethanolica, Pichia stipitis, Pichia subpelliculosa, Pichia exigua, Pichia occidentalis and Pichia cactophila.
In one embodiment, the invention is directed to Pichia pastoris strains that have been engineered to express codon-modified polynucleotides encoding collagen and/or hydroxylase(s). Useful Pichia pastoris host strains include, but are not limited to, BG10 (wild type) (strain PPS-9010); BG11, aox1Δ (MutS) (strain PPS-9011), which is a slow methanol utilization derivative of PPS-9010; and BG16, pep4Δ, prb4Δ (strain PPS-9016), which is protease deficient. These strains are publicly available and may be obtained from ATUM at https://www._atum.bio/products/cell-strains.
Polypeptide secretion sequences for yeast. In some embodiments, a polypeptide encoded by a yeast host cell will be fused to a polypeptide sequence that facilitates its secretion from the yeast; for example, a vector may encode a chimeric gene comprising a coding sequence for collagen fused to a sequence encoding a secretion peptide. Secretion sequences that may be used for this purpose include the Saccharomyces alpha mating factor Prepro sequence, the Saccharomyces alpha mating factor Pre sequence, the PHO1 secretion signal, the α-amylase signal sequence from Aspergillus niger, the glucoamylase signal sequence from Aspergillus awamori, the serum albumin signal sequence from Homo sapiens, the inulinase signal sequence from Kluyveromyces marxianus, the invertase signal sequence from Saccharomyces cerevisiae, the killer protein signal sequence from Saccharomyces cerevisiae, and the lysozyme signal sequence from Gallus gallus. Other secretion sequences known in the art may also be used.
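Yeast promoters and terminators. In some embodiments, one or more of the following yeast promoters may be incorporated into a vector to promote transcription of mRNA encoding collagen or a hydroxylase. Promoters are known in the art and include pAOX1, pDas1, pDas2, pPMP20, pCAT, pDF, pGAP, pFDH1, pFLD1, pTAL1, pFBA2, pAOX2, pRKI1, pRPE2, pPEX5, pDAK1, pFGH1, pADH2, pTPI1, pFBP1, pTAL1, pPFK1, pGPM1 and pGCW14.
In some embodiments, a yeast terminator sequence is incorporated into a vector to terminate transcription of mRNA encoding collagen or a hydroxylase. Terminators include, but are not limited to, AOX1 TT, Das1 TT, Das2 TT, AOD TT, PMP TT, Cat1 TT, TPI TT, FDH1 TT, TEF1 TT, FLD1 TT, GCW14 TT, FBA2 TT, ADH2 TT, FBP1 TT and GAP TT.
Peptidases other than pepsin. Pepsin may be used to process collagen into tropocollagen by removing N-terminal and C-terminal sequences. Other proteases, including but not limited to collagenase, trypsin, chymotrypsin, papain, ficin and bromelain, may also be used for this purpose. As used herein, "stable collagen" means that after being exposed to a particular concentration of pepsin or another protease, at least 20%, 30%, 40%, 50%, 60%, 75%, 80%, 85%, 90%, 95% or 100% (or any intermediate value or subrange) of the initial concentration of collagen is still present. Preferably, at least 75% of a stable collagen will remain after treatment with pepsin or another protease, as compared to an unstable collagen treated under the same conditions for the same amount of time. Prior to post-translational modification, collagen is non-hydroxylated and degrades in the presence of a high pepsin concentration (e.g., a pepsin:protein ratio of 1:200 or more).
Once post-translationally modified, a collagen may be contacted with pepsin or another protease to cleave the N-terminal and C-terminal propeptides of the collagen, thus enabling collagen fibrillation. Hydroxylated collagen has better thermostability than non-hydroxylated collagen and is resistant to high-concentration pepsin digestion, for example at a pepsin:total protein ratio of 1:25, 1:20, 1:15, 1:10, 1:5 to 1:1 (or any intermediate value). Therefore, to avoid premature proteolysis of recombinant collagen, it is useful to provide hydroxylated collagen.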
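Alternative expression systems. Collagen can be expressed in kinds of yeast cells other than Pichia pastoris, for example in another yeast, in a methylotrophic yeast, or in another organism. Saccharomyces cerevisiae can be used with any of a large number of expression vectors. Commonly employed expression vectors are shuttle vectors containing the 2μ origin of replication for propagation in yeast and the Col E1 origin for E. coli, and provide for efficient transcription of the foreign gene. A typical example of such vectors based on 2μ plasmids is pWYG4, which has the 2μ ORI-STB elements, the GAL1-10 promoter and the 2μ D gene terminator. In this vector, an NcoI cloning site is used to insert the gene for the polypeptide to be expressed and to provide the ATG start codon.
Another expression vector is pWYG7L, which has intact 2μ ORI, STB, REP1 and REP2, and the GAL1-10 promoter, and uses the FLP terminator. In this vector, the encoding polynucleotide is inserted in the polylinker with its 5' end at a BamHI or NcoI site. The vector containing the inserted polynucleotide is transformed into S. cerevisiae either after removal of the cell wall to produce spheroplasts that take up DNA on treatment with calcium and polyethylene glycol, or by treatment of intact cells with lithium ions.
Alternatively, DNA can be introduced by electroporation. Transformants can be selected, for example, using host yeast cells that are auxotrophic for leucine, tryptophan, uracil or histidine together with selectable marker genes such as LEU2, TRP1, URA3, HIS3 or LEU2-D.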
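There are a number of methanol-responsive genes in methylotrophic yeasts such as Pichia pastoris, the expression of each being controlled by methanol-responsive regulatory regions, also referred to as promoters. Any such methanol-responsive promoter is suitable for use in the practice of the invention. Examples of specific regulatory regions include the AOX1 promoter, the AOX2 promoter, the dihydroxyacetone synthase (DAS) promoter, the P40 promoter, and the promoter for the catalase gene from P. pastoris.
The methylotrophic yeast Hansenula polymorpha may also be used. Growth on methanol results in the induction of key enzymes of methanol metabolism, such as MOX (methanol oxidase), DAS (dihydroxyacetone synthase) and FMDH (formate dehydrogenase). These enzymes can constitute up to 30% to 40% of total cell protein. The genes encoding MOX, DAS and FMDH production are controlled by strong promoters that are induced by growth on methanol and repressed by growth on glucose. Any or all three of these promoters may be used to obtain high-level expression of heterologous genes in H. polymorpha. Therefore, in one aspect, a polynucleotide encoding an animal collagen, or a fragment or variant thereof, is cloned into an expression vector under the control of an inducible H. polymorpha promoter. If secretion of the product is desired, a polynucleotide encoding a signal sequence for secretion in yeast is fused in frame with the polynucleotide. In a further embodiment, the expression vector preferably contains an auxotrophic marker gene, such as URA3 or LEU2, which may be used to complement the deficiency of an auxotrophic host.
The expression vector is then used to transform H. polymorpha host cells using techniques known to those of skill in the art. A useful feature of H. polymorpha transformation is the spontaneous integration of up to 100 copies of the expression vector into the genome. In most cases, the integrated polynucleotide forms multimers exhibiting a head-to-tail arrangement. The integrated foreign polynucleotide has been shown to be mitotically stable in several recombinant strains, even under non-selective conditions. This phenomenon of high-copy integration further adds to the high productivity potential of the system.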
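Foreign DNA is inserted into the yeast genome or maintained episomally to produce collagen. The DNA sequence for the collagen is introduced into the yeast via a vector. Foreign DNA is any non-yeast host DNA and includes, for example but without limitation, DNA from mammals, Caenorhabditis elegans and bacteria. Mammalian DNA suitable for collagen production in yeast includes, but is not limited to, that of bovine, equine, porcine, kangaroo, elephant, rhinoceros, hippopotamus, whale, dolphin, giraffe, zebra, llama, alpaca, goat and sheep (lamb). Other DNA for collagen production includes that of reptiles (e.g., alligator, crocodile, turtle, iguana, lizard, snake), birds (e.g., ostrich, emu, moa), dinosaurs, amphibians and fish (e.g., tilapia, bass, salmon, trout, shark, eel collagen), and combinations thereof.
DNA is inserted in a vector; suitable vectors include, but are not limited to, pHTX1-BiDi-P4HA-Pre-P4HB Hygro, pHTX1-BiDi-P4HA-PHO1-P4HB Hygro, pGCW14-pGAP1-BiDi-P4HA-Prepro-P4HB G418, pGCW14-pGAP1-BiDi-P4HA-PHO1-P4HB Hygro, pDF-Col3A1 modified Zeocin, pCAT-Col3A1 modified Zeocin, pDF-Col3A1 modified Zeocin with AOX1 landing pad, and pHTX1-BiDi-P4HA-Pre-Pro-P4HB Hygro. The vectors typically include at least one restriction site for linearization of the DNA.
A selected promoter may improve the production of a recombinant protein and may be included in a vector comprising sequences encoding collagen or hydroxylases. Promoters suitable for use in the invention include, but are not limited to, the AOX1 methanol-induced promoter, the pDF de-repressed promoter, the pCAT de-repressed promoter, the Das1-Das2 methanol-induced bidirectional promoter, the pHTX1 constitutive bidirectional promoter, the pGCW14-pGAP1 constitutive bidirectional promoter, and combinations thereof. Suitable methanol-induced promoters include, but are not limited to, AOX2, Das 1, Das 2, pDF, pCAT, pPMP20, pFDH1, pFLD1, pTAL2, pFBA2, pPEX5, pDAK1, pFGH1, pRKI1, pREP2, and combinations thereof.
In vectors according to the invention, including the all-in-one vector, a terminator may be placed at the end of each open reading frame utilized in the vectors introduced into the yeast. The DNA sequence for the terminator is inserted into the vector. For replicating vectors, an origin of replication is necessary to initiate replication. The DNA sequence for the origin of replication is inserted into the vector. One or more DNA sequences containing homology to the yeast genome may be incorporated into the vector to facilitate recombination and incorporation into the yeast genome, or to stabilize the vector once it has been transformed into the yeast cell.
A vector according to the invention will also generally include at least one selective marker used to select yeast cells that have been successfully transformed. The markers sometimes relate to antibiotic resistance, and markers may also relate to the ability to grow with or without certain amino acids (auxotrophic markers). Suitable auxotrophic markers include, but are not limited to, ADE, HIS, URA, LEU, LYS, TRP, and combinations thereof. To provide for selection of yeast cells containing a recombinant vector, at least one DNA sequence for a selection marker is incorporated into the vector.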
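In some embodiments of the invention, amino acid residues such as lysine and proline in a recombinant yeast-expressed collagen or collagen-like protein may lack hydroxylation, or may have a lesser or greater degree of hydroxylation than the corresponding natural or unmodified collagen or collagen-like protein. In other embodiments, amino acid residues in a collagen or collagen-like protein may lack glycosylation, or may have a lesser or greater degree of glycosylation than the corresponding natural or unmodified collagen or collagen-like protein.
Hydroxylated collagen has a higher melting temperature (above 37° C.) than non-hydroxylated or underhydroxylated collagen (below 32° C.); it also fibrillates better than non-hydroxylated or underhydroxylated collagen and forms stronger, more durable structures for use as a material. The melting temperature of a collagen preparation may be used to estimate its degree of hydroxylation and may range, for example, from 30 to 40° C., including all intermediate values such as 30, 31, 32, 33, 34, 35, 36, 37, 38, 39 and 40° C. Underhydroxylated collagen is not suitable for durable items such as shoes or bags; it can only form a jello-like or gelatin-like material, which may be formulated into softer or more absorbent products.
The collagen in a collagen composition may be homogeneous and contain a single type of collagen molecule, such as 100% bovine type I collagen or 100% bovine type III collagen, or may contain a mixture of different kinds of collagen molecules or collagen-like molecules, such as a mixture of bovine type I and type III molecules. Such mixtures may include more than 0%, 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, 95%, 99% or less than 100% (or any intermediate value or subrange) of the individual collagen or collagen-like protein components. This range includes all intermediate values. For example, a collagen composition may contain 30% type I collagen and 70% type III collagen, or may contain 33.3% type I collagen, 33.3% type II collagen and 33.3% type III collagen, where the percentage of collagen is based on the total mass of collagen in the composition or on the molecular percentages of collagen molecules.
The engineered yeast cells described above can be used to produce collagen. To do so, the cells are placed in media within a fermentation chamber and fed dissolved oxygen and a source of carbon under controlled pH conditions for a period of time ranging from twelve hours to one week. Suitable media include, but are not limited to, buffered glycerol complex media (BMGY), buffered methanol complex media (BMMY), and yeast extract peptone dextrose (YPD). Because the collagen is produced inside the yeast cell, isolating it requires either using a secretory strain of yeast or lysing the yeast cells to release the collagen. The collagen may then be purified through conventional techniques such as centrifugation, precipitation, filtration, chromatography and the like.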
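In another embodiment, the invention provides chimeric DNA sequences in yeast hosts that are useful for producing hydroxylated collagen and non-hydroxylated collagen. A chimeric DNA sequence is produced by combining an unmodified DNA sequence and a modified DNA sequence. The unmodified DNA sequence can be cut at various base pair positions. The modified DNA sequence can also be cut at the corresponding base pair positions. The unmodified and modified cuts can be combined front to back and back to front. The chimeric DNA sequence can be combined with the promoters, vectors, terminators and selection markers described above and inserted into a host to produce a yeast capable of producing hydroxylated collagen and non-hydroxylated collagen.
The percentages of optimized DNA and non-optimized DNA may be calculated based on the total length of the sequence. A chimeric strain may be a combination of optimized DNA at the N-terminus and non-optimized DNA at the C-terminus. The percentage of optimized DNA may range from 1%, 5%, 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90% to 99% (or any intermediate value or subrange); for example, it may range from 10 to 40% or from 60 to 90%. Alternatively, a chimeric strain may be a combination of non-optimized DNA at the N-terminus and optimized DNA at the C-terminus. The percentage of non-optimized DNA may range from 1%, 5%, 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90% to 99% (or any intermediate value or subrange); for example, it may range from 10 to 40% or from 60 to 90%. For example, a DNA sequence of 1486 base pairs cut at position 1331 would provide optimized DNA from 0 to 1331 and non-optimized DNA from 1332 to 1486, and the chimera would be about 90% optimized. An optimized polynucleotide sequence may encode a segment of collagen at the C-terminus, at the N-terminus, or elsewhere within the body of the collagen molecule; for example, it may encode the first 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80% or 90% of the collagen molecule, or the last 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80% or 90% of the collagen molecule.
Alternatively, a chimeric strain may be composed of 2, 3, 4 or more sections of optimized DNA and non-optimized DNA fused together. For example, a DNA sequence with 1,500 base pairs may have an optimized DNA section from 0 to 500, a non-optimized DNA section from 501 to 1,000, and an optimized DNA section from 1001 to 1500.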
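The percentage calculation described above can be written out as a small sketch. It treats a chimera as an ordered list of sections, each with a length in base pairs and a flag for whether it is optimized; this representation and the function name are assumptions made for illustration only.

```python
def percent_optimized(sections: list[tuple[int, bool]]) -> float:
    """Percent of a chimeric sequence that is optimized DNA.

    sections: ordered (length_in_bp, is_optimized) pairs from the
    N-terminal end to the C-terminal end of the coding sequence.
    """
    total = sum(length for length, _ in sections)
    if total == 0:
        raise ValueError("chimera has no length")
    optimized = sum(length for length, is_opt in sections if is_opt)
    return 100.0 * optimized / total


# 1486 bp sequence cut at position 1331: optimized front, non-optimized back.
two_part = [(1331, True), (1486 - 1331, False)]
print(round(percent_optimized(two_part), 1))

# 1500 bp sequence in three 500 bp sections: optimized, non-optimized, optimized.
three_part = [(500, True), (500, False), (500, True)]
print(round(percent_optimized(three_part), 1))
```

For the two-section example the result is 1331/1486, or roughly 90%, in line with the figure given in the text; the three-section example is two-thirds optimized.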
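The collagen disclosed herein makes it possible to produce biofabricated leather. Methods for converting collagen into biofabricated leather are taught in co-pending U.S. patent application Ser. Nos. 15/433566, 15/433650, 15/433632, 15/433693, 15/433777, 15/433675, 15/433676 and 15/433877 (the disclosures of which are incorporated herein by reference).
Embodiments of the Invention
Non-limiting embodiments of the invention include, but are not limited to, the following: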
소 콜라겐, 예컨대 I형 또는 III형 콜라겐, 또는 콜라겐 변이체 또는 유도체 및 코딩된 콜라겐에서의 프롤린, 라이신, 또는 라이신 및 프롤린 잔기를 수산화시키는 적어도 하나의 효소를 코딩하는 폴리뉴클레오타이드. 몇몇 실시형태에서, 폴리뉴클레오타이드는 네이티브 콜라겐 또는 수산화효소 폴리뉴클레오타이드 서열의 전부 또는 일부를 코돈 변형시키거나, 숙주 효모 세포에서의 콜라겐 또는 수산화효소의 발현을 수월하게 하도록 발현 제어 유전요소, 예컨대 효모 촉진자 서열을 도입할 것이다. 변형된 폴리뉴클레오타이드는 효모에서 발현될 때 10, 20, 25, 30, 35, 40, 45, 50, 60, 70, 80, 90, 100 또는 100중량% 초과만큼 동일한 콜라겐 서열을 코딩하는 동일한 조건 하에 발현된 비변형된 폴리펩타이드와 비교하여 콜라겐 발현을 증가시킬 수 있다.
몇몇 실시형태에서, 콜라겐 또는 수산화효소 단백질의 2배, 3배, 4배, 5배, 6배, 7배, 8배, 9배, 10배, 11배, 12배, 13배, 14배, 15배 또는 이것 초과의 배수 발현이 획득될 수 있다. 몇몇 실시형태에서, III형 콜라겐 또는 콜라겐 변이체는 발현될 것이고, 여기서 변이체는 서열 번호 2의 것과 적어도 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99% 또는 100% 동일한 아미노산 서열을 갖는다. 다른 실시형태에서, 소 콜라겐은 α1(I) 사슬 및 α2(I) 사슬 둘 다를 코딩하거나, 네이티브 I형 콜라겐 사슬과 적어도 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99% 또는 100% 동일한 하나 이상의 콜라겐 사슬을 코딩하는 I형 소 콜라겐 또는 콜라겐 변이체이다.
상기 기재된 소 콜라겐을 코딩하는 폴리뉴클레오타이드는 프롤릴 4-수산화효소의 P4HA 및 P4HB 하위단위를 코딩하는 폴리뉴클레오타이드 서열 또는 분절 또는 이것과 적어도 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99% 또는 100% 동일한 효소를 코딩하는 폴리뉴클레오타이드 서열을 포함할 수 있다. 다른 실시형태에서, 폴리뉴클레오타이드는 프롤릴-3-수산화효소, 라이실 수산화효소 및/또는 라이실 산화효소를 코딩하는 폴리뉴클레오타이드 서열 또는 분절 또는 이것과 적어도 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99% 또는 100% 동일한 효소를 코딩하는 폴리뉴클레오타이드 서열을 함유할 수 있다. 예를 들어, 본 발명의 폴리뉴클레오타이드는 서열 번호 2의 III형 소 콜라겐 아미노산 서열과 적어도 75% 내지 99% 동일한 폴리펩타이드 및 각각 서열 번호 54 및 52와 적어도 75% 내지 99% 동일한 P4HA 및 P4HB 하위단위를 포함하는 수산화효소를 코딩하는 분절을 코딩할 수 있다.
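The polynucleotide sequence of the invention may further encode a polypeptide secretion sequence operable in yeast, generally positioned adjacent to the polynucleotide sequence encoding the collagen, which may be type I collagen, type III collagen or some other collagen described herein.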
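The polynucleotide sequence of the invention may further contain a promoter or other sequence that facilitates or controls expression of the collagen or of an enzyme such as a hydroxylase; for example, it may contain at least one of the AOX1 methanol-induced promoter, the DN pDF de-repressed promoter, the pCAT de-repressed promoter, the Das1-Das2 methanol-induced bidirectional promoter, the pHTX1 constitutive bidirectional promoter, the pGCW14-pGAP1 constitutive bidirectional promoter, or a combination thereof.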
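The polynucleotide of the invention may also contain other genetic elements, such as an alpha factor pre or alpha factor pre-pro sequence, for example those encoded by SEQ ID NOs: 23 and 24, respectively. In some embodiments, such a sequence may be operably linked to a polynucleotide sequence expressing an enzyme, such as a hydroxylase or other enzyme described herein, for example P4HA (SEQ ID NO: 54) or P4HB (SEQ ID NO: 52), or a variant enzyme at least 75%, 80%, 90% or 95% to 100% identical thereto.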
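Vectors containing the polynucleotide sequences disclosed above represent further embodiments of the invention. These include vectors containing any polynucleotide sequence disclosed herein, such as chimeric polynucleotide sequences encoding collagen, truncated collagen, collagen variants and enzymes such as hydroxylases or other enzymes described herein. In some embodiments, the sequence encoding the collagen and the sequence encoding the hydroxylase or other enzyme will be on the same vector; in other embodiments, they may be on different vectors.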
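The invention also contemplates host cells, such as yeast host cells, containing the vectors described herein. In some embodiments, the vector may be produced in a non-yeast cell, such as a bacterial host cell, and subsequently transformed into a yeast host cell, such as a Pichia pastoris host cell, that expresses collagen or hydroxylated collagen.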
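Another aspect of the invention relates to a method of producing recombinant collagen having less than 1%, or 2%, 3%, 4%, 5%, 6%, 7%, 8%, 9% or 10%, of its proline residues hydroxylated. The method involves culturing Pichia pastoris or another suitable yeast host cell (or eukaryotic host cell) for a time and under conditions suitable for producing collagen, and recovering the collagen, wherein the vector carried by the host cell is configured to express an amount or form of prolyl 4-hydroxylase that hydroxylates no more than 1%, or 2%, 3%, 4%, 5%, 6%, 7%, 8%, 9% or 10%, of the proline residues. Another embodiment of the invention is a method of producing recombinant type III collagen having less than 1%, or 2%, 3%, 4%, 5%, 6%, 7%, 8%, 9% or 10%, of its proline residues hydroxylated, which involves culturing Pichia pastoris or another suitable yeast host cell for a time and under conditions suitable for producing type III collagen, and recovering the collagen, wherein the vector is configured to express an amount of prolyl 4-hydroxylase that hydroxylates no more than 1%, or 2%, 3%, 4%, 5%, 6%, 7%, 8%, 9% or 10%, of the proline residues. An all-in-one vector encoding both collagen and hydroxylase may be configured so that little or no functional hydroxylase is expressed, for example by using an inducible or temperature-sensitive promoter for the hydroxylase.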
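A further embodiment of the invention is a method of producing recombinant collagen having more than 10%, or 15%, 20%, 25%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, 95% or more than 95%, of its proline residues hydroxylated, by culturing Pichia pastoris or another suitable yeast host cell containing a vector as described herein for a time and under conditions suitable for producing collagen, and recovering the collagen, wherein the vector is configured to express an amount or form of prolyl 4-hydroxylase that hydroxylates more than 10%, or 15%, 20%, 25%, 30%, 40%, 50%, 60%, 70%, 80%, 90% or more than 90%, of the proline residues in the collagen. The culture time and conditions and the amount or activity of the hydroxylase may be used to control the amount of hydroxylation. Another embodiment of the invention is a method of producing recombinant type III collagen having 50%, 60%, 70%, 80%, 90%, 95% or more than 95% of its proline residues hydroxylated, comprising culturing a Pichia pastoris host cell containing a vector according to the invention for a time and under conditions suitable for producing type III collagen, and recovering the type III collagen, wherein the vector is configured to express an amount or form of prolyl 4-hydroxylase that hydroxylates 50%, 60%, 70%, 80%, 90% or more than 90% of the proline residues. The culture time and conditions and the amount or activity of the hydroxylase may be used to control the amount of hydroxylation.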
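Another embodiment of the invention relates to a method of producing recombinant collagen having 50%, 60%, 70%, 80%, 90%, 95% or more than 95% of its proline residues hydroxylated, comprising culturing Pichia pastoris or another suitable yeast host cell containing a vector of the invention for a time and under conditions suitable for producing collagen, and recovering the collagen, wherein the vector is configured to express an amount of prolyl 4-hydroxylase that hydroxylates 50%, 60%, 70%, 80%, 90%, 95% or more than 95% of the proline residues. The culture time and conditions and the amount or activity of the hydroxylase may be used to control the amount of hydroxylation.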
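Another embodiment of the invention relates to a method of producing recombinant type III collagen having 50%, 60%, 70%, 80%, 90%, 95% or more than 95% of its proline residues hydroxylated, comprising culturing Pichia pastoris or another yeast host cell containing a vector of the invention for a time and under conditions suitable for producing collagen, and recovering the collagen, wherein the vector is configured to express an amount of prolyl 4-hydroxylase that hydroxylates 50%, 60%, 70%, 80%, 90%, 95% or more than 95% of the proline residues. The culture time and conditions and the amount or activity of the hydroxylase may be used to control the amount of hydroxylation.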
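Another embodiment of the invention relates to a method of producing recombinant collagen having 75%, 80%, 90%, 95% or more than 95% of its proline residues hydroxylated, comprising culturing Pichia pastoris or another yeast host cell containing a vector of the invention for a time and under conditions suitable for producing collagen, and recovering the collagen, wherein the vector is configured to express an amount of prolyl 4-hydroxylase that hydroxylates 75%, 80%, 90%, 95% or more than 95% of the proline residues.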
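A further embodiment of the invention is a method of producing recombinant type III collagen having 75%, 80%, 90%, 95% or more than 95% of its proline residues hydroxylated, comprising culturing Pichia pastoris or another yeast host cell containing a vector of the invention for a time and under conditions suitable for producing collagen, and recovering the collagen, wherein the vector is configured to express an amount of prolyl 4-hydroxylase that hydroxylates 75%, 80%, 90%, 95% or more than 95% of the proline residues.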
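Another embodiment of the invention is recombinant collagen produced by any one of the methods described herein. Such recombinant collagen may have none of its proline or lysine residues hydroxylated, or may have more than 0%, or 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, 95% or 100%, of its proline, lysine, or proline and lysine residues hydroxylated.
A further embodiment of the invention is biofabricated leather or other material comprising recombinant collagen as described herein or produced by a method described herein.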
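In another embodiment, the invention provides chimeric DNA sequences in yeast host cells that are useful for producing hydroxylated and non-hydroxylated collagen. The chimeric DNA sequence is produced by combining an unmodified DNA sequence with a modified DNA sequence. The unmodified DNA sequence may be cut at various base pair positions. The modified DNA sequence may also be cut at the corresponding base pair positions. The unmodified and modified cut fragments may be combined front to back and back to front. The chimeric DNA sequence may be combined with the promoters, vectors, terminators and selection markers described above and inserted into a host to produce yeast capable of producing hydroxylated and non-hydroxylated collagen.
Other embodiments of the invention include, but are not limited to, the following: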
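A strain of yeast genetically engineered to produce non-hydroxylated collagen, comprising (i) a strain of yeast; and (ii) a vector, inserted into the strain of yeast, comprising a DNA sequence for collagen; a DNA sequence for a collagen promoter; a DNA sequence for a collagen terminator; a DNA sequence for a selection marker; a DNA sequence for a promoter for the selection marker; a DNA sequence for a terminator for the selection marker; a DNA sequence for an origin of replication selected from one for bacteria and one for yeast; and a DNA sequence containing homology to the yeast genome. In this embodiment, the strain of yeast may be selected from the group consisting of those from the genera Pichia, Candida, Komagataella, Hansenula, Saccharomyces and Cryptococcus, and combinations thereof. In the above embodiment, the vector may contain a DNA sequence for a collagen selected from the group consisting of bovine, porcine, kangaroo, alligator, crocodile, elephant, giraffe, zebra, llama, alpaca, lamb and dinosaur collagen, and combinations thereof. In this embodiment, the DNA sequence for collagen may be selected from native collagen DNA, engineered collagen DNA and codon-modified collagen DNA.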
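In this embodiment, the DNA sequence for the promoter may be selected from the group consisting of DNA for the AOX1 methanol-induced promoter, DNA for the pDF de-repressed promoter, DNA for the pCAT de-repressed promoter, DNA for the Das1-Das2 methanol-induced bidirectional promoter, DNA for the pHTX1 constitutive bidirectional promoter, DNA for the pGCW14-pGAP1 constitutive bidirectional promoter, and combinations thereof. In this embodiment, the selection marker may be selected from the group consisting of DNA for antibiotic resistance and DNA for an auxotrophic marker; for example, the antibiotic resistance may be to an antibiotic selected from the group consisting of hygromycin, zeocin, geneticin and combinations thereof.
A yeast strain as described in the above embodiment may contain a vector inserted into the yeast by a method selected from the group consisting of electroporation, chemical transformation and mating.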
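Another embodiment of the invention relates to a method of producing non-hydroxylated collagen comprising (i) providing a strain of yeast as described in the above embodiment; and (ii) growing the strain in a medium for a period of time sufficient to produce collagen. In this method, the yeast may be selected from the group consisting of those from the genera Pichia, Candida, Komagataella, Hansenula, Saccharomyces and Cryptococcus, and combinations thereof, and/or the medium may be selected from the group consisting of buffered glycerol complex medium (BMGY), buffered methanol complex medium (BMMY) and yeast extract peptone dextrose (YPD). The yeast strain may be cultured or grown for a period of 24, 48 or 72 hours, or for any intermediate period. In this method, the yeast strain may express a DNA sequence for a collagen selected from the group consisting of bovine, porcine, kangaroo, alligator, crocodile, elephant, giraffe, zebra, llama, alpaca, lamb and dinosaur collagen, and combinations thereof. In this method, the DNA sequence for the promoter in the yeast strain may be selected from the group consisting of DNA for the pHTX1 constitutive bidirectional promoter and DNA for the pGCW14-pGAP1 constitutive bidirectional promoter, and/or the selection marker may be selected from the group consisting of DNA for antibiotic resistance and DNA for an auxotrophic marker.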
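Another embodiment of the invention is a strain of yeast genetically engineered to produce hydroxylated collagen, comprising (i) a strain of yeast; (ii) a vector, inserted into the strain of yeast, containing a DNA sequence for collagen; a DNA sequence for a collagen promoter; a DNA sequence for a terminator; a DNA sequence for a selection marker; a DNA sequence for a promoter for the selection marker; a DNA sequence for a terminator for the selection marker; a DNA sequence for an origin of replication for bacteria and/or yeast; and a DNA sequence containing homology to the yeast genome; and (iii) a second vector, inserted into the strain of yeast, comprising a DNA sequence for P4HA1; a DNA sequence for P4HB; and at least one DNA sequence for a promoter. In this embodiment, the yeast strain may be selected from the group consisting of those from the genera Pichia, Candida, Komagataella, Hansenula, Saccharomyces and Cryptococcus, and combinations thereof, and/or the yeast strain may express a DNA sequence for a collagen selected from the group consisting of bovine, porcine, kangaroo, alligator, crocodile, elephant, giraffe, zebra, llama, alpaca, lamb and dinosaur collagen, and combinations thereof. In some embodiments, the DNA sequence for collagen is selected from native collagen DNA, engineered collagen DNA and modified collagen DNA, and/or the DNA sequence for the promoter is selected from the group consisting of DNA for the AOX1 methanol-induced promoter, DNA for the pDF de-repressed promoter, DNA for the pCAT de-repressed promoter, DNA for the Das1-Das2 methanol-induced bidirectional promoter, DNA for the pHTX1 constitutive bidirectional promoter, DNA for the pGCW14-pGAP1 constitutive bidirectional promoter, and combinations thereof. In the strain of yeast, the DNA sequence for the promoter may be selected from the group consisting of DNA for the pHTX1 constitutive bidirectional promoter and DNA for the pGCW14-pGAP1 constitutive bidirectional promoter, and/or the DNA sequence for the selection marker may be selected from the group consisting of DNA for antibiotic resistance and DNA for an auxotrophic marker. Some examples of antibiotic resistance genes or DNA include resistance to an antibiotic selected from the group consisting of hygromycin, zeocin, geneticin and combinations thereof, although other known antibiotic resistance genes may also be used. The vectors may be inserted into the yeast strain by a method selected from the group consisting of electroporation, chemical transformation and mating.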
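Another embodiment of the invention is a method of producing hydroxylated collagen comprising (i) providing a strain of yeast as described herein and (ii) growing the strain in a medium for a period of time sufficient to produce collagen. The strain of yeast may be selected from the group consisting of those from the genera Candida, Komagataella, Pichia, Hansenula, Saccharomyces and Cryptococcus, and combinations thereof; the collagen DNA expressed by the yeast strain may be selected from the group consisting of DNA encoding bovine, porcine, kangaroo, alligator, crocodile, elephant, giraffe, zebra, llama, alpaca, lamb or dinosaur collagen, or combinations thereof; and/or the medium is selected from the group consisting of BMGY, BMMY and YPD. The yeast strain may be cultured or grown for a period of about 24, 48 or 72 hours. In some embodiments, the DNA for the promoter is selected from the group consisting of DNA for the pHTX1 constitutive bidirectional promoter and DNA for the pGCW14-pGAP1 constitutive bidirectional promoter, and/or the DNA for the selection marker is selected from the group consisting of DNA for antibiotic resistance and DNA for an auxotrophic marker.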
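Another embodiment of the invention relates to an all-in-one vector comprising (i) DNA that, when expressed, produces collagen, together with a promoter and a terminator; (ii) at least one DNA for one or more hydroxylating enzymes selected from the group consisting of P4HA1 and P4HB, including a promoter and a terminator; (iii) at least one DNA for a selection marker, including a promoter and a terminator; (iv) at least one DNA for an origin of replication for yeast and bacteria; (v) one or more DNAs with homology to the yeast genome for integration into the genome; and (vi) one or more restriction sites at positions selected from the group consisting of 5', 3', within the above DNAs, and combinations thereof, which allow modular cloning. In some embodiments, the all-in-one vector will contain one or more DNA sequences that, when expressed, produce a collagen selected from the group consisting of bovine, porcine, kangaroo, alligator, crocodile, elephant, giraffe, zebra, llama, alpaca, lamb and dinosaur collagen, and combinations thereof.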
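The all-in-one vector may comprise a promoter selected from the group consisting of DNA for the pHTX1 constitutive bidirectional promoter and DNA for the pGCW14-pGAP1 constitutive bidirectional promoter, and may comprise one or more DNA sequences for a selection marker, such as an antibiotic resistance and/or auxotrophic marker. Antibiotic resistance markers include resistance to an antibiotic selected from the group consisting of hygromycin, zeocin, geneticin and combinations thereof.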
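Another embodiment of the invention relates to a chimeric collagen DNA sequence containing 10%, 20%, 30% to 40%, or 60%, 70%, 80% to 90%, optimized DNA based on the full length of the chimeric collagen DNA. In this chimeric collagen DNA sequence, the optimized DNA may originate at the C-terminus, or the optimized DNA may originate at the N-terminus.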
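Another embodiment of the invention relates to a collagen-producing strain of yeast comprising a vector that comprises a DNA sequence for a chimeric collagen as described herein; a DNA sequence for a collagen promoter; a DNA sequence for a terminator; a DNA sequence for a selection marker; a DNA sequence for a promoter for the selection marker; a DNA sequence for a terminator for the selection marker; a DNA sequence for an origin of replication for bacteria and/or yeast; and a DNA sequence containing homology to the yeast genome. In this embodiment, the strain of yeast may contain DNA for a promoter selected from the group consisting of DNA for the pHTX1 constitutive bidirectional promoter and DNA for the pGCW14-pGAP1 constitutive bidirectional promoter. The strain of yeast may contain a selection marker selected from the group consisting of DNA encoding at least one antibiotic resistance and DNA encoding at least one auxotrophic marker.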
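Another embodiment of the invention relates to a method of producing hydroxylated collagen comprising (i) providing a strain of yeast as described herein; and (ii) growing the strain in a medium for a period of time sufficient to produce collagen. In this embodiment, the strain of yeast may be selected from the group consisting of those from the genera Pichia, Candida, Komagataella, Hansenula, Saccharomyces and Cryptococcus, and combinations thereof; the medium may be selected from the group consisting of buffered glycerol complex medium, buffered methanol complex medium and yeast extract peptone dextrose; and the culture or growth time may be 24, 48 or 72 hours. In some embodiments of this method, the strain of yeast comprises a promoter selected from the group consisting of DNA for the pHTX1 constitutive bidirectional promoter and DNA for the pGCW14-pGAP1 constitutive bidirectional promoter. In other embodiments of this method, the strain of yeast comprises at least one selection marker selected from the group consisting of DNA encoding antibiotic resistance and DNA encoding an auxotrophic marker.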
Examples
The following non-limiting examples illustrate the invention. The scope of the invention is not limited to the details described in these examples.
Example 1
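Pichia pastoris strain BG10 (wild type) was obtained from ATUM (formerly DNA 2.0). The MMV63 (SEQ ID NO: 11) ("Sequence 9") DNA sequence, comprising the collagen sequence and vector, was inserted into wild-type Pichia pastoris to produce strain PP28. MMV63 was digested with Pme I and transformed into PP1 (the wild-type Pichia pastoris strain) to produce PP28. Vector MMV63 is shown in FIG. 1.
DNA encoding native type III bovine collagen was sequenced (SEQ ID NO: 1), and the sequence was amplified by a polymerase chain reaction ("PCR") protocol to produce a linear DNA sequence.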
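The DNA was transformed into wild-type Pichia yeast cells (PP1) from DNA 2.0 using a Pichia electroporation protocol (Bio-Rad Gene Pulser Xcell™ Total System #1652660). Yeast cells were transformed with the P4HA/B co-expression plasmid, and transformants (for example, clone #4) were selected on Hygro plates (200 µg/mL).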
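A single colony of clone #4 was inoculated into 100 mL of YPD medium and grown overnight at 30 °C with shaking at 215 rpm. The next day, when the culture reached an OD600 of ~3.5 (approximately 3 to 5 × 10^7 cells/OD600), it was diluted with fresh YPD to an OD600 of ~1.7 and grown for another hour at 30 °C with shaking at 215 rpm.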
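The cells were then spun down at 3,500 g for 5 minutes, washed once with water, and resuspended in 10 mL of 10 mM Tris-HCl (pH 7.5), 100 mM LiAc, 10 mM DTT (freshly added) and 0.6 M sorbitol.
For each transformation, an aliquot of 8 × 10^8 cells was placed in 8 mL of 10 mM Tris-HCl (pH 7.5), 100 mM LiAc, 10 mM DTT and 0.6 M sorbitol, and incubated at room temperature for 30 minutes.
The cells were spun down at 5,000 g for 5 minutes, washed three times with 1.5 mL of ice-cold 1 M sorbitol, and resuspended in 80 µL of ice-cold 1 M sorbitol.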
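Various amounts (approximately 5 µg) of linearized DNA were added to the cells and mixed by pipetting.
The cell and DNA mixture (80 to 100 µL) was added to a 0.2 cm cuvette and pulsed according to the protocol for Pichia at 1,500 V, 25 µF and 200 Ω.
The mixture was then immediately transferred to 1 mL of a 1:1 mixture of YPD and 1 M sorbitol and incubated at 30 °C for more than 2 hours.
The cells were plated at different densities.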
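The dilution and cell-count arithmetic behind the culture steps above can be sketched as follows. This is an illustrative calculation only: the function names are hypothetical, "cells/OD600" is read here as cells per mL per OD600 unit (an assumption), and the OD600 at harvest is taken to be 1.7 purely for illustration, since the protocol does not report the value after the extra hour of growth.

```python
def dilution_volumes(od_start, od_target, final_volume_ml):
    """Split a final volume into culture and fresh medium (C1*V1 = C2*V2)."""
    culture_ml = final_volume_ml * od_target / od_start
    return culture_ml, final_volume_ml - culture_ml


def culture_volume_for_cells(n_cells, od600, cells_per_ml_per_od):
    """Estimate the culture volume (mL) that holds n_cells at a given OD600."""
    return n_cells / (od600 * cells_per_ml_per_od)


# Dilute an overnight culture at OD600 ~3.5 down to ~1.7 in 100 mL total.
culture_ml, medium_ml = dilution_volumes(3.5, 1.7, 100.0)
print(f"{culture_ml:.1f} mL culture + {medium_ml:.1f} mL fresh YPD")

# Volume holding 8e8 cells, bracketing the stated 3e7 to 5e7 cells per OD unit.
for density in (3e7, 5e7):
    vol = culture_volume_for_cells(8e8, 1.7, density)
    print(f"{density:.0e} cells/mL/OD -> {vol:.1f} mL")
```

Because the cells-per-OD conversion spans a range, the estimate is best treated as a bracket rather than a single volume.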
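Single colonies were inoculated into 2 mL of BMGY medium in a 24 deep-well plate and grown at 30 °C for at least 48 hours with shaking at 900 rpm. The resulting cells were tested for collagen using cell lysis, SDS-PAGE and a pepsin assay according to the following procedures.
Yeast cells were lysed in 1x lysis buffer using a Qiagen TissueLyser at 30 Hz for 1 minute continuously. The lysis buffer was prepared from 2.5 mL of 1 M HEPES (final concentration 50 mM); 438.3 mg NaCl (final concentration 150 mM); 5 mL glycerol (final concentration 10%); 0.5 mL Triton X-100 (final concentration 1%); and 42 mL Millipure water.
The lysed cells were centrifuged at 2,500 rpm for 15 minutes in a tabletop centrifuge. The supernatant was retained and the pellet was discarded.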
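The stated final concentrations of the lysis buffer can be checked from the recipe. The sketch below is an illustration only; it assumes the liquid volumes are additive (50 mL total), that the dissolved NaCl does not change the volume, and a NaCl molar mass of 58.44 g/mol. The function name is hypothetical.

```python
NACL_G_PER_MOL = 58.44  # molar mass of NaCl


def lysis_buffer_concentrations():
    """Recompute final concentrations from the recipe volumes and mass."""
    hepes_ml, glycerol_ml, triton_ml, water_ml = 2.5, 5.0, 0.5, 42.0
    nacl_mg = 438.3
    total_ml = hepes_ml + glycerol_ml + triton_ml + water_ml  # 50 mL

    return {
        "total_ml": total_ml,
        # 1 M HEPES stock diluted into the total volume, expressed in mM
        "hepes_mM": 1000.0 * hepes_ml / total_ml,
        # mg / (g/mol) = mmol; mmol / mL = mol/L; times 1000 gives mM
        "nacl_mM": 1000.0 * (nacl_mg / NACL_G_PER_MOL) / total_ml,
        "glycerol_pct_v_v": 100.0 * glycerol_ml / total_ml,
        "triton_pct_v_v": 100.0 * triton_ml / total_ml,
    }


for name, value in lysis_buffer_concentrations().items():
    print(f"{name}: {value:.1f}")
```

Under these assumptions the recipe reproduces the stated 50 mM HEPES, 150 mM NaCl, 10% glycerol and 1% Triton X-100.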
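SDS-PAGE. SDS-PAGE in the presence of 2-mercaptoethanol was performed on the supernatant, molecular weight markers, a negative control and a positive control. After electrophoresis, the gel was removed, stained with Coomassie Blue and then destained in water.
Pepsin assay. The pepsin assay was performed by the following procedure: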
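Before pepsin treatment, a BCA assay was performed according to the Thermo Scientific protocol to obtain the total protein of each sample. The amount of total protein was normalized across all samples to the lowest concentration at or above 0.5 mg/mL. A 100 µL sample of the lysate was placed in a microcentrifuge tube. A master mix was prepared containing the following: 37% HCl (0.6 µL of acid per 100 µL) and pepsin stock at 1 mg/mL in deionized water. The amount of pepsin added was a 1:25 ratio of pepsin:total protein (weight:weight).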
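After the pepsin was added, the samples were mixed three times by pipette and incubated at room temperature for 1 hour to allow the pepsin reaction to proceed. After 1 hour, a 1:1 volume of LDS loading buffer containing β-mercaptoethanol was added to each sample, which was then incubated at 70 °C for 7 minutes. After incubation, the samples were spun at 14,000 rpm for 1 minute to remove turbidity.
Then 18 µL from the top of each sample was loaded onto a 3% to 8% TAE gel with TAE buffer and run at 150 V for 1 hour 10 minutes. Table 1 below reports the results.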
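The master-mix amounts for the pepsin assay follow from the 1:25 weight ratio and the normalized protein concentration. The sketch below is a hedged illustration: it reads "1 mg/mL in deionized water" as the pepsin stock concentration (an interpretation of an ambiguous sentence), and the function and parameter names are hypothetical.

```python
def pepsin_master_mix(sample_ul, protein_mg_per_ml,
                      pepsin_stock_mg_per_ml=1.0,
                      protein_per_pepsin=25.0,
                      hcl_ul_per_100_ul=0.6):
    """Per-sample amounts for a 1:25 (w/w) pepsin:total-protein digest."""
    protein_ug = sample_ul * protein_mg_per_ml        # uL x mg/mL = ug
    pepsin_ug = protein_ug / protein_per_pepsin       # 1 part pepsin : 25 parts protein
    pepsin_ul = pepsin_ug / pepsin_stock_mg_per_ml    # ug / (mg/mL) = uL
    hcl_ul = hcl_ul_per_100_ul * sample_ul / 100.0    # 37% HCl, scaled to sample size
    return {"protein_ug": protein_ug, "pepsin_ug": pepsin_ug,
            "pepsin_stock_ul": pepsin_ul, "hcl_ul": hcl_ul}


# A 100 uL sample normalized to 0.5 mg/mL total protein.
print(pepsin_master_mix(100.0, 0.5))
```

For a 100 µL sample at 0.5 mg/mL, this gives 50 µg of protein, 2 µg of pepsin (2 µL of a 1 mg/mL stock) and 0.6 µL of 37% HCl.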
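Example 2
Example 1 was repeated following the same procedures and protocols with the following changes: the DNA MMV77 (SEQ ID NO: 12) ("Sequence 10") sequence, comprising a bovine collagen sequence modified to increase expression in Pichia (SEQ ID NO: 3) ("Sequence 2"), was inserted into the yeast. The pAOX1 promoter (SEQ ID NO: 5) ("Sequence 3") was used to drive expression of the collagen sequence. YPD plates containing zeocin at 500 µg/mL were used to select successful transformants. The resulting strain was PP8. Vector MMV77 is shown in FIG. 2. Restriction digestion was performed with Pme I. The strain was grown in BMMY medium and tested for collagen. The results are shown in Table 1 below.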
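Example 3
Example 1 was repeated following the same procedures and protocols with the following changes: the DNA MMV-129 (SEQ ID NO: 13) ("Sequence 11") sequence, comprising a bovine collagen sequence modified to increase Pichia expression, was inserted into the yeast. The pCAT promoter (SEQ ID NO: 9) ("Sequence 7") was used to drive expression of the collagen sequence. YPD plates containing zeocin at 500 µg/mL were used to select successful transformants. The resulting strain was PP123. MMV129 was digested with Swa I and transformed into PP1 to produce PP123. Vector MMV129 is shown in FIG. 3. The strain was grown in BMGY medium and tested for collagen. The results are shown in Table 1 below.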
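Example 4
Example 1 was repeated following the same procedures and protocols with the following changes: the DNA MMV-130 (SEQ ID NO: 14) ("Sequence 12") sequence, comprising a bovine Col3A1 (type III) collagen sequence modified to increase expression in Pichia (SEQ ID NO: 3) ("Sequence 2"), was inserted into the yeast. The pDF promoter set forth in SEQ ID NO: 8 ("Sequence 6") was used to drive expression of the collagen sequence. An AOX1 landing pad (SEQ ID NO: 10) ("Sequence 8") cut with Pme I was used to facilitate site-specific integration of the vector into the Pichia genome. YPD plates containing zeocin at 500 µg/mL were used to select successful transformants. The resulting strain is referred to as PP153. MMV130 was digested with Pme I and transformed into PP1 to produce PP153. The modified bovine Col3A1 sequence is set forth in SEQ ID NO: 3 ("Sequence 2").
A PureLink PCR purification kit was used instead of phenol extraction to recover the linearized DNA. The strain was grown in BMGY medium and tested for collagen. The results are shown in Table 1 below.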
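Example 5
Example 2 was repeated following the same procedures and protocols with the following changes: one DNA vector, MMV-78 (SEQ ID NO: 15) ("Sequence 13"), containing optimized bovine P4HA (SEQ ID NO: 6) ("Sequence 4") and bovine P4HB (SEQ ID NO: 7) ("Sequence 5") sequences, was inserted into the yeast. MMV78 was digested with Pme I and transformed into PP1 to produce PP8. Both P4HA and P4HB contained their endogenous signal peptides and were driven by the Das1-Das2 bidirectional promoter (SEQ ID NO: 27) ("Sequence 24"). The DNA was digested with Kpn I and transformed into PP8 to produce PP3. Vector MMV78 is shown in FIG. 5. The strain was grown in BMMY medium and tested for collagen and hydroxylation. The results are shown in Table 1 below.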
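Example 6
Example 2 was repeated following the same procedures and protocols with the following changes: one DNA vector, MMV-78, containing both bovine P4HA and bovine P4HB sequences, was inserted into the yeast. Both P4HA and P4HB contained their endogenous signal peptides and were driven by the Das1-Das2 bidirectional promoter. The DNA was digested with Kpn I and transformed into PP8 to produce PP3.
Another vector, MMV-94 (SEQ ID NO: 16) ("Sequence 14"), containing P4HB driven by the pAOX1 promoter, was also used and inserted into the yeast. The endogenous signal peptide of P4HB was replaced with the PHO1 signal peptide. The resulting strain was PP38. MMV94 was digested with Avr II and transformed into PP3 to produce PP38. Vector MMV94 is shown in FIG. 6. The strain was grown in BMMY medium and tested for collagen and hydroxylation. The results are shown in Table 1 below.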
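Example 7
Example 4 was repeated following the same procedures and protocols with the following changes: one DNA vector, MMV-156 (SEQ ID NO: 17) ("Sequence 15"), containing both bovine P4HA and bovine P4HB sequences, was inserted into the yeast. P4HA contained its endogenous signal peptide, and the P4HB signal sequence was replaced with the alpha-factor Pre (SEQ ID NO: 23) ("Sequence 21") sequence. Both genes were driven by the pHTX1 bidirectional promoter (SEQ ID NO: 26) ("Sequence 25"). MMV156 was digested with Bam HI and transformed into PP153 to produce PP154. Vector MMV156 is shown in FIG. 7. The strain was grown in BMGY medium and tested for collagen and hydroxylation. The results are shown in Table 1 below.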
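Example 8
Example 4 was repeated following the same procedures and protocols with the following changes: one DNA vector, MMV-156, containing both bovine P4HA and bovine P4HB sequences, was inserted into the yeast. P4HA contained its endogenous signal peptide, and the P4HB signal sequence was replaced with the alpha-factor Pre sequence. Both genes were driven by the pHTX1 bidirectional promoter. The DNA was digested with Swa I and transformed into PP153 to produce PP154.
Another vector, MMV-191 (SEQ ID NO: 18) ("Sequence 16"), containing both P4HA and P4HB, was also inserted into the yeast. The additional copy of P4HA contained its endogenous signal peptide, and the signal sequence of the additional copy of P4HB was replaced with the alpha-factor Pre-Pro (SEQ ID NO: 24) ("Sequence 22") sequence. The additional copies of P4HA and P4HB were driven by the pGCW14-GAP1 bidirectional promoter (SEQ ID NO: 25) ("Sequence 23"). MMV191 was digested with Bam HI and transformed into PP154 to produce PP268. Vector MMV191 is shown in FIG. 8. The strain was grown in BMGY medium and tested for collagen and hydroxylation. The results are shown in Table 1 below.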
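Example 9
An all-in-one vector was produced using the methods and procedures of Example 1. The all-in-one vector contains DNA for collagen and its associated promoter and terminator, DNA for the enzymes that hydroxylate the collagen and their associated promoters and terminators, DNA for marker expression and its associated promoter and terminator, DNA for origin(s) of replication for bacteria and yeast, and DNA(s) with homology to the yeast genome for integration. The all-in-one vector contains strategically placed unique restriction sites 5' of, 3' of, or within the above components. When any modification to collagen expression or to other vector components is desired, the DNA for the selected component can easily be excised with restriction enzymes and replaced by the user's chosen cloning method. The simplest version of the all-in-one vector, MMV208 (SEQ ID NO: 19) ("Sequence 17"), contains all of the above components except the promoter(s) for the hydroxylase enzymes. Vector MMV208 was prepared using the following components: AOX homology from MMV84 (SEQ ID NO: 20) ("Sequence 18"), ribosomal homology from MMV150 (SEQ ID NO: 21) ("Sequence 19"), bacterial and yeast origins of replication from MMV140 (SEQ ID NO: 22) ("Sequence 20"), the zeocin marker from MMV140, and Col3A1 from MMV129. Modified versions of P4HA and B and their associated terminators were synthesized by Genscript to remove the following restriction sites: AvrII, NotI, PvuI, PmeI, BamHI, SacII, SwaI, XbaI, SpeI. The vector was transformed into strain PP1.
The strain was grown in BMGY medium and tested for collagen and hydroxylation. The results are shown in Table 1 below.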
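Checking a component for the restriction sites listed above amounts to a motif scan. The following is a minimal sketch rather than the procedure used in the example: the recognition sequences are the standard ones for these enzymes, all nine are palindromic so scanning one strand is sufficient, and the function name `find_sites` is hypothetical. The test fragment is nucleotides 145 to 192 of SEQ ID NO: 1, used here only as convenient input.

```python
# Standard recognition sequences for the enzymes named in Example 9.
RESTRICTION_SITES = {
    "AvrII": "CCTAGG",
    "NotI": "GCGGCCGC",
    "PvuI": "CGATCG",
    "PmeI": "GTTTAAAC",
    "BamHI": "GGATCC",
    "SacII": "CCGCGG",
    "SwaI": "ATTTAAAT",
    "XbaI": "TCTAGA",
    "SpeI": "ACTAGT",
}


def find_sites(dna, sites=RESTRICTION_SITES):
    """Return {enzyme: [0-based start positions]} for every site present."""
    seq = dna.upper().replace(" ", "")
    hits = {}
    for enzyme, motif in sites.items():
        positions = []
        start = seq.find(motif)
        while start != -1:
            positions.append(start)
            start = seq.find(motif, start + 1)  # allow overlapping matches
        if positions:
            hits[enzyme] = positions
    return hits


fragment = "ccg tgc caa ata tgc gtc tgt gac tca gga tcc gtt ctc tgt gat gac"
print(find_sites(fragment))
```

A component is free of the listed sites when `find_sites` returns an empty dictionary; removing a site without changing the protein would additionally require a synonymous codon change, which this sketch does not attempt.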
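Table 1 lists the amount of collagen produced in g/L and the percentage of hydroxylated collagen. The amount of expressed collagen was quantified by staining the gel with Coomassie Blue dye and comparing the results against a standard curve for collagen content. The amount of hydroxylated collagen was determined by comparing the sample bands with standard bands after 1:25 pepsin treatment. Expression of hydroxylated collagen by Pichia is advantageous because hydroxylated collagen is stable at the high pepsin concentrations needed for further processing of the collagen polypeptide.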
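The data in Examples 1 and 2 show that codon modification of the type III bovine collagen sequence doubles the amount of collagen expressed by Pichia. Comparison of the data from Examples 2 and 3 shows that expression of type III bovine collagen is further increased by a factor of 5 by driving transcription of the type III collagen coding sequence with the pCAT promoter. Comparison of the data from Examples 2 and 4 shows that bovine type III collagen expression is increased 10- to 15-fold by driving transcription of the type III collagen coding sequence with the pDF promoter and providing an AOX1 landing pad to facilitate integration of the vector into Pichia genomic DNA. Comparison of the data from Examples 2, 5 and 6 shows that transformation of Pichia with the coding sequences for proline hydroxylase (P4HA + P4HB) produces hydroxylated collagen, and that the amount of hydroxylated collagen can be increased by further modulating expression of the proline hydroxylase. Examples 7 to 9 show that collagen expression can be boosted 5- to 15-fold and that the amount of hydroxylated collagen is increased by introducing two vectors or by the all-in-one vector approach (in which both the collagen and hydroxylase sequences are encoded by the same vector).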
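Example 10
A chimeric Col3A1 vector was produced using the methods and procedures of Example 1. Vector MMV132 was modified to include DNA for the chimeric collagen with its associated promoter pDF and terminator AOX1TT, DNA for marker expression and its associated promoter and terminator, DNA for origin(s) of replication for bacteria and yeast, and DNA(s) with homology to the yeast genome for integration. Vector MMV63 is the source DNA for the unmodified Col3A1 domains. Vector MMV128 (FIG. 21) is the source DNA for the modified Col3A1 domains. The full length of the Col3A1 polypeptide is 1465 amino acids (aa). Plasmids were designed to incorporate the native bovine DNA sequence (unmodified) and the Pichia pastoris codon-modified DNA sequence. The plasmids were designed so that the transitions between the modified and unmodified sequences of Col3A1 fall at aa 710, 1,200 and 1,331. These methods were used to produce plasmids MMV193, MMV194, MMV195, MMV197, MMV198 and MMV199. The resulting plasmid vectors are listed in Table 2 below, alongside the fully optimized plasmid MMV130 and the fully non-optimized plasmid MMV200 (FIG. 20) for comparison.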
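Splicing two coding sequences at an amino acid junction can be sketched in a few lines. This is an illustration under stated assumptions, not the cloning method of the example: it assumes the native and codon-modified sequences encode the same protein in frame and have equal length, and that a junction at residue N means the first N codons come from the N-terminal source. The function names and the toy 4-codon sequences are hypothetical.

```python
def make_chimera(n_term_source, c_term_source, junction_aa):
    """Join the first `junction_aa` codons of one CDS to the rest of another."""
    if len(n_term_source) != len(c_term_source):
        raise ValueError("sources must be the same length")
    if len(n_term_source) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    cut = 3 * junction_aa  # nucleotide index of the codon boundary
    if not 0 <= cut <= len(n_term_source):
        raise ValueError("junction lies outside the sequence")
    return n_term_source[:cut] + c_term_source[cut:]


def percent_from_n_term_source(junction_aa, total_aa):
    """Share of the construct contributed by the N-terminal source."""
    return 100.0 * junction_aa / total_aa


# Toy sequences, both encoding Met-Gly-Pro-Gly with different codons.
native = "ATGGGTCCTGGT"
modified = "ATGGGACCAGGA"
print(make_chimera(modified, native, 2))  # modified N-terminus, native C-terminus

# Junctions named in Example 10, against the stated 1465 aa length.
for junction in (710, 1200, 1331):
    print(junction, round(percent_from_n_term_source(junction, 1465), 1))
```

Swapping the first two arguments gives the reciprocal chimera (native N-terminus, modified C-terminus), which is one way three junctions can yield six constructs.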
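Example 11
Example 2 was repeated following the same procedures and protocols with the following changes: PP1 and PP97 were obtained. PP97 was a strain in which two protease genes (PEP4 and PRB1) had been knocked out of the host strain. The DNA MMV194, MMV195, MMV130 and MMV200 sequences, comprising different combinations of bovine collagen sequence DNA modified and unmodified for Pichia expression, were inserted into the yeast. The pDF promoter was used to drive expression of the collagen sequence. YPD plates containing zeocin at 500 µg/mL were used to select successful transformants. Restriction digestion was performed with Swa I to linearize the DNA for integration, and 3 to 5 µg of cut DNA was transformed for each vector, except that MMV130 was digested with Pme I and 200 ng of DNA was transformed. The resulting strains are listed in Table 3 below.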
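Example 12
Example 7 was repeated following the same procedures and protocols with the following changes: one DNA vector, MMV-156, containing both bovine P4HA and bovine P4HB sequences, was inserted into the yeast. P4HA contained its endogenous signal peptide, and the P4HB signal sequence was replaced with the alpha-factor Pre sequence. Both genes were driven by the pHTX1 bidirectional promoter. The DNA was digested with BamHI and transformed. See Table 4 for strain and transformation information.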
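Example 13
Example 8 was repeated following the same procedures and protocols with the following changes: the lysis buffer was prepared with 50 mM Na2PO4, 1 mM EDTA and 5% glycerol, with the pH adjusted to 7.4 with acetic acid. Another vector, MMV-191, containing both P4HA and P4HB, was also inserted into the yeast. The additional copy of P4HA contained its endogenous signal peptide, and the signal sequence of the additional copy of P4HB was replaced with the alpha-factor Pre-Pro sequence. The additional copies of P4HA and P4HB were driven by the pGCW14-GAP1 bidirectional promoter. The DNA was digested with BamHI and transformed. See Table 5 for transformation and new strain information. The strains were grown in BMGY medium and tested for collagen.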
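Example 14
Example 2 was repeated following the same procedures and protocols with the following changes: one DNA vector, MMV-78, containing both bovine P4HA and bovine P4HB sequences, was inserted into the yeast. P4HA and P4HB were driven by the Das1-Das2 bidirectional promoter. The DNA was digested with Kpn I and transformed into PP8 to produce PP3, which contains the collagen sequence of SEQ ID NO: 3 ("Sequence 2"). Another vector, MMV-94 (SEQ ID NO: 16) ("Sequence 14"), containing P4HB driven by the pAOX1 promoter, was also used and inserted into the yeast. The endogenous signal peptide of P4HB was replaced with the PHO1 signal peptide. The resulting strain was PP38.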
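A 24 deep-well plate was filled with 2 mL of YPD in each well and inoculated with single colonies of strain PP38. The colonies were grown in YPD for 24 hours with shaking at 900 rpm. The cells were spun down at 3,000 rpm for 5 minutes and the supernatant was removed. For methanol-free induction, the supernatant was replaced with 2 mL of BMGY (1%) and the cells were grown for another 48 hours. For methanol induction, methanol was added to a final concentration of 0.5% and the cells were grown for 24 hours. Methanol was added again and the cells were grown for another 24 hours. At the end of induction, a 1 mL sample was removed for analysis.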
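The volume of methanol needed to reach a given final concentration in a well can be estimated as below. This is a rough sketch under assumptions not stated in the example: a 2 mL culture volume per well, a 100% methanol stock, percentages by volume, and no methanol already present. The function name is hypothetical.

```python
def methanol_to_add_ul(culture_ml, final_percent, stock_percent=100.0):
    """Volume of stock (uL) to add so the mixture reaches final_percent (v/v).

    Solves stock% * v = final% * (V + v) for v, i.e. the added volume
    is counted in the final volume.
    """
    if not 0 < final_percent < stock_percent:
        raise ValueError("final concentration must be below the stock concentration")
    added_ml = culture_ml * final_percent / (stock_percent - final_percent)
    return 1000.0 * added_ml


# 0.5% (v/v) methanol in an assumed 2 mL well, from pure methanol.
print(round(methanol_to_add_ul(2.0, 0.5), 2))
```

For the repeat addition after 24 hours the residual methanol is unknown, so the same volume would only be an approximation.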
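Samples were tested for collagen using the Coomassie staining and SDS-PAGE described in Example 1. The band for the methanol-free induction sample was darker than the band for the methanol-induced sample, showing that the methanol-free induction sample had a higher concentration of expressed collagen.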
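As used herein, terms such as "optimized" or "optimize" encompass values or features realized by careful selection of the features of the chimeric DNA construct or of other important process variables, and do not imply the use of a known result-effective variable.
The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention.
The headings (such as "Background" and "Summary") and sub-headings used herein are intended only for general organization of topics within the invention and are not intended to limit the disclosure of the invention or any aspect thereof. In particular, subject matter disclosed in the "Background" may include novel technology and may not constitute a recitation of prior art. Subject matter disclosed in the "Summary" is not an exhaustive or complete disclosure of the entire scope of the technology or any embodiment thereof. Classification or discussion of a material within a section of this specification as having a particular utility is made for convenience, and no inference should be drawn that the material must necessarily or solely function in accordance with its classification herein when it is used in any given composition.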
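As used herein, the singular forms "a", "an" and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise.
As used herein, the term "and/or" includes any and all combinations of one or more of the associated listed items and may be abbreviated as "/".
Links are disabled by insertion of a space or an underlined space before "www" and may be reactivated by removal of the space.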
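As used herein in the specification and claims, including as used in the examples, and unless otherwise expressly specified, all numbers may be read as if prefaced by the word "substantially", "about" or "approximately", even if the term does not expressly appear. The phrase "about" or "approximately" may be used when describing magnitude and/or position to indicate that the value and/or position described is within a reasonable expected range of values and/or positions. For example, a numeric value may have a value that is ±0.1% of the stated value (or range of values), ±1% of the stated value (or range of values), ±2% of the stated value (or range of values), ±5% of the stated value (or range of values), ±10% of the stated value (or range of values), ±15% of the stated value (or range of values), ±20% of the stated value (or range of values), and so on. Any numerical range recited herein is intended to include all subranges subsumed therein.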
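As used herein, the words "preferred" and "preferably" refer to embodiments of the technology that afford certain benefits under certain circumstances. However, other embodiments may also be preferred under the same or other circumstances. Furthermore, the recitation of one or more preferred embodiments does not imply that other embodiments are not useful, and is not intended to exclude other embodiments from the scope of the technology. As referred to herein, all compositional percentages are by weight of the total composition, unless otherwise specified. As used herein, the word "include" and its variants are intended to be non-limiting, such that recitation of items in a list is not to the exclusion of other like items that may also be useful in the materials, compositions, devices and methods of this technology. Similarly, the terms "can" and "may" and their variants are intended to be non-limiting, such that recitation that an embodiment can or may comprise certain elements or features does not exclude other embodiments of the invention that do not contain those elements or features.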
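Although the terms "first" and "second" may be used herein to describe various features/elements (including steps), these features/elements should not be limited by these terms, unless the context indicates otherwise. These terms may be used to distinguish one feature/element from another feature/element. Thus, a first feature/element discussed below could be termed a second feature/element, and similarly, a second feature/element discussed below could be termed a first feature/element without departing from the teachings of the invention.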
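Spatially relative terms, such as "under", "below", "lower", "over", "upper" and the like, may be used herein for ease of description to describe the relationship of one element or feature to another element(s) or feature(s) as illustrated in the figures. It will be understood that the spatially relative terms are intended to encompass different orientations of the device in use or operation in addition to the orientation depicted in the figures. For example, if a device in the figures is inverted, elements described as "under" or "beneath" other elements or features would then be oriented "over" the other elements or features. Thus, the exemplary term "under" can encompass both an orientation of over and under. The device may be otherwise oriented (rotated 90° or at other orientations) and the spatially relative descriptors used herein interpreted accordingly. Similarly, the terms "upwardly", "downwardly", "vertical", "horizontal" and the like are used herein for the purpose of explanation only, unless specifically indicated otherwise.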
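When a feature or element is referred to herein as being "on" another feature or element, it can be directly on the other feature or element, or intervening features and/or elements may also be present. In contrast, when a feature or element is referred to as being "directly on" another feature or element, there are no intervening features or elements present. It will also be understood that, when a feature or element is referred to as being "connected", "attached" or "coupled" to another feature or element, it can be directly connected, attached or coupled to the other feature or element, or intervening features or elements may be present. In contrast, when a feature or element is referred to as being "directly connected", "directly attached" or "directly coupled" to another feature or element, there are no intervening features or elements present. Although described or shown with respect to one embodiment, the features and elements so described or shown can apply to other embodiments. It will also be appreciated by those of skill in the art that references to a structure or feature that is disposed "adjacent" to another feature may have portions that overlap or underlie the adjacent feature.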
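All publications and patent applications mentioned herein are incorporated by reference in their entirety to the same extent as if each individual publication or patent application were specifically and individually indicated to be incorporated by reference, and the disclosure appearing in the same sentence, paragraph, page or section of the specification in which the incorporation by reference appears is specifically referenced. The citation of references herein does not constitute an admission that those references are prior art or have any relevance to the patentability of the technology disclosed herein. Any discussion of the content of references cited is intended merely to provide a general summary of assertions made by the authors of the references, and does not constitute an admission as to the accuracy of the content of such references.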
SEQUENCE LISTING
<110> Modern Meadow
<120> YEAST STRAINS AND METHODS FOR CONTROLLING HYDROXYLATION OF
RECOMBINANT COLLAGEN
<130> 515112US
<160> 55
<170> PatentIn version 3.5
<210> 1
<211> 4401
<212> DNA
<213> Bos taurus
<220>
<221> CDS
<222> (1)..(4401)
<223> Collagen Sequence 1: cDNA sequence - unoptimized natural DNA
sequence from cow
<400> 1
atg atg agc ttt gtg caa aag ggg acc tgg tta ctt ttc gct ctg ctt 48
Met Met Ser Phe Val Gln Lys Gly Thr Trp Leu Leu Phe Ala Leu Leu
1 5 10 15
cat ccc act gtt att ttg gca caa cag gaa gct gtt gac gga gga tgc 96
His Pro Thr Val Ile Leu Ala Gln Gln Glu Ala Val Asp Gly Gly Cys
20 25 30
tcc cat ctc ggt cag tct tat gca gat aga gat gta tgg aaa cca gaa 144
Ser His Leu Gly Gln Ser Tyr Ala Asp Arg Asp Val Trp Lys Pro Glu
35 40 45
ccg tgc caa ata tgc gtc tgt gac tca gga tcc gtt ctc tgt gat gac 192
Pro Cys Gln Ile Cys Val Cys Asp Ser Gly Ser Val Leu Cys Asp Asp
50 55 60
ata ata tgt gac gac caa gaa tta gac tgc ccc aac cct gaa atc ccg 240
Ile Ile Cys Asp Asp Gln Glu Leu Asp Cys Pro Asn Pro Glu Ile Pro
65 70 75 80
ttt gga gaa tgt tgt gca gtt tgc cca cag cct cca aca gct ccc act 288
Phe Gly Glu Cys Cys Ala Val Cys Pro Gln Pro Pro Thr Ala Pro Thr
85 90 95
cgc cct cct aat ggt caa gga cct caa ggc ccc aag gga gat cca ggt 336
Arg Pro Pro Asn Gly Gln Gly Pro Gln Gly Pro Lys Gly Asp Pro Gly
100 105 110
cct cct ggt att cct ggg cga aat ggc gat cct ggt cct cca gga tca 384
Pro Pro Gly Ile Pro Gly Arg Asn Gly Asp Pro Gly Pro Pro Gly Ser
115 120 125
cca ggc tcc cca ggt tct ccc ggc cct cct gga atc tgt gaa tca tgt 432
Pro Gly Ser Pro Gly Ser Pro Gly Pro Pro Gly Ile Cys Glu Ser Cys
130 135 140
cct act ggt ggc cag aac tat tct ccc cag tac gaa gca tat gat gtc 480
Pro Thr Gly Gly Gln Asn Tyr Ser Pro Gln Tyr Glu Ala Tyr Asp Val
145 150 155 160
aag tct gga gta gca gga gga gga atc gca ggc tat cct ggg cca gct 528
Lys Ser Gly Val Ala Gly Gly Gly Ile Ala Gly Tyr Pro Gly Pro Ala
165 170 175
ggt cct cct ggc cca ccc gga ccc cct ggc aca tct ggc cat cct ggt 576
Gly Pro Pro Gly Pro Pro Gly Pro Pro Gly Thr Ser Gly His Pro Gly
180 185 190
gcc cct ggc gct cca gga tac caa ggt ccc ccc ggt gaa cct ggg caa 624
Ala Pro Gly Ala Pro Gly Tyr Gln Gly Pro Pro Gly Glu Pro Gly Gln
195 200 205
gct ggt ccg gca ggt cct cca gga cct cct ggt gct ata ggt cca tct 672
Ala Gly Pro Ala Gly Pro Pro Gly Pro Pro Gly Ala Ile Gly Pro Ser
210 215 220
ggc cct gct gga aaa gat ggg gaa tca gga aga ccc gga cga cct gga 720
Gly Pro Ala Gly Lys Asp Gly Glu Ser Gly Arg Pro Gly Arg Pro Gly
225 230 235 240
gag cga gga ttt cct ggc cct cct ggt atg aaa ggc cca gct ggt atg 768
Glu Arg Gly Phe Pro Gly Pro Pro Gly Met Lys Gly Pro Ala Gly Met
245 250 255
cct gga ttc cct ggt atg aaa gga cac aga ggc ttt gat gga cga aat 816
Pro Gly Phe Pro Gly Met Lys Gly His Arg Gly Phe Asp Gly Arg Asn
260 265 270
gga gag aaa ggc gaa act ggt gct cct gga tta aag ggg gaa aat ggc 864
Gly Glu Lys Gly Glu Thr Gly Ala Pro Gly Leu Lys Gly Glu Asn Gly
275 280 285
gtt cca ggt gaa aat gga gct cct gga ccc atg ggt cca aga ggg gct 912
Val Pro Gly Glu Asn Gly Ala Pro Gly Pro Met Gly Pro Arg Gly Ala
290 295 300
ccc ggt gag aga gga cgg cca gga ctt cct gga gcc gca ggg gct cga 960
Pro Gly Glu Arg Gly Arg Pro Gly Leu Pro Gly Ala Ala Gly Ala Arg
305 310 315 320
ggt aat gat gga gct cga gga agt gat gga caa ccg ggc ccc cct ggt 1008
Gly Asn Asp Gly Ala Arg Gly Ser Asp Gly Gln Pro Gly Pro Pro Gly
325 330 335
cct cct gga act gca gga ttc cct ggt tcc cct ggt gct aag ggt gaa 1056
Pro Pro Gly Thr Ala Gly Phe Pro Gly Ser Pro Gly Ala Lys Gly Glu
340 345 350
gtt gga cct gca gga tct cct ggt tca agt ggc gcc cct gga caa aga 1104
Val Gly Pro Ala Gly Ser Pro Gly Ser Ser Gly Ala Pro Gly Gln Arg
355 360 365
gga gaa cct gga cct cag gga cat gct ggt gct cca ggt ccc cct ggg 1152
Gly Glu Pro Gly Pro Gln Gly His Ala Gly Ala Pro Gly Pro Pro Gly
370 375 380
cct cct ggg agt aat ggt agt cct ggt ggc aaa ggt gaa atg ggt cct 1200
Pro Pro Gly Ser Asn Gly Ser Pro Gly Gly Lys Gly Glu Met Gly Pro
385 390 395 400
gct ggc att cct ggg gct cct ggg ctg ata gga gct cgt ggt cct cca 1248
Ala Gly Ile Pro Gly Ala Pro Gly Leu Ile Gly Ala Arg Gly Pro Pro
405 410 415
ggg cca cct ggc acc aat ggt gtt ccc ggg caa cga ggt gct gca ggt 1296
Gly Pro Pro Gly Thr Asn Gly Val Pro Gly Gln Arg Gly Ala Ala Gly
420 425 430
gaa ccc ggt aag aat gga gcc aaa gga gac cca gga cca cgt ggg gaa 1344
Glu Pro Gly Lys Asn Gly Ala Lys Gly Asp Pro Gly Pro Arg Gly Glu
435 440 445
cgc gga gaa gct ggt tct cca ggt atc gca gga cct aag ggt gaa gat 1392
Arg Gly Glu Ala Gly Ser Pro Gly Ile Ala Gly Pro Lys Gly Glu Asp
450 455 460
ggc aaa gat ggt tct cct gga gaa cct ggt gca aat gga ctt cct gga 1440
Gly Lys Asp Gly Ser Pro Gly Glu Pro Gly Ala Asn Gly Leu Pro Gly
465 470 475 480
gct gca gga gaa agg ggt gtg cct gga ttc cga gga cct gct gga gca 1488
Ala Ala Gly Glu Arg Gly Val Pro Gly Phe Arg Gly Pro Ala Gly Ala
485 490 495
aat ggc ctt cca gga gaa aag ggt cct cct ggg gac cgt ggt ggc cca 1536
Asn Gly Leu Pro Gly Glu Lys Gly Pro Pro Gly Asp Arg Gly Gly Pro
500 505 510
ggc cct gca ggg ccc aga ggt gtt gct gga gag ccc ggc aga gat ggt 1584
Gly Pro Ala Gly Pro Arg Gly Val Ala Gly Glu Pro Gly Arg Asp Gly
515 520 525
ctc cct gga ggt cca gga ttg agg ggt att cct ggt agc ccc gga gga 1632
Leu Pro Gly Gly Pro Gly Leu Arg Gly Ile Pro Gly Ser Pro Gly Gly
530 535 540
cca ggc agt gat ggg aaa cca ggg cct cct gga agc caa gga gag acg 1680
Pro Gly Ser Asp Gly Lys Pro Gly Pro Pro Gly Ser Gln Gly Glu Thr
545 550 555 560
ggt cga ccc ggt cct cca ggt tca cct ggt ccg cga ggc cag cct ggt 1728
Gly Arg Pro Gly Pro Pro Gly Ser Pro Gly Pro Arg Gly Gln Pro Gly
565 570 575
gtc atg ggc ttc cct ggt ccc aaa gga aac gat ggt gct cct gga aaa 1776
Val Met Gly Phe Pro Gly Pro Lys Gly Asn Asp Gly Ala Pro Gly Lys
580 585 590
aat gga gaa cga ggt ggc cct gga ggt cct ggc cct cag ggt cct gct 1824
Asn Gly Glu Arg Gly Gly Pro Gly Gly Pro Gly Pro Gln Gly Pro Ala
595 600 605
gga aag aat ggt gag acc gga cct cag ggt cct cca gga cct act ggc 1872
Gly Lys Asn Gly Glu Thr Gly Pro Gln Gly Pro Pro Gly Pro Thr Gly
610 615 620
cct tct ggt gac aaa gga gac aca gga ccc cct ggt cca caa gga cta 1920
Pro Ser Gly Asp Lys Gly Asp Thr Gly Pro Pro Gly Pro Gln Gly Leu
625 630 635 640
caa ggc ttg cct gga acg agt ggt ccc cca gga gaa aac gga aaa cct 1968
Gln Gly Leu Pro Gly Thr Ser Gly Pro Pro Gly Glu Asn Gly Lys Pro
645 650 655
ggt gaa cct ggt cca aag ggt gag gct ggt gca cct gga att cca gga 2016
Gly Glu Pro Gly Pro Lys Gly Glu Ala Gly Ala Pro Gly Ile Pro Gly
660 665 670
ggc aag ggt gat tct ggt gct ccc ggt gaa cgc gga cct cct gga gca 2064
Gly Lys Gly Asp Ser Gly Ala Pro Gly Glu Arg Gly Pro Pro Gly Ala
675 680 685
gga ggg ccc cct gga cct aga ggt gga gct ggc ccc cct ggt ccc gaa 2112
Gly Gly Pro Pro Gly Pro Arg Gly Gly Ala Gly Pro Pro Gly Pro Glu
690 695 700
gga gga aag ggt gct gct ggt ccc cct ggg cca cct ggt tct gct ggt 2160
Gly Gly Lys Gly Ala Ala Gly Pro Pro Gly Pro Pro Gly Ser Ala Gly
705 710 715 720
aca cct ggt ctg caa gga atg cct gga gaa aga ggg ggt cct gga ggc 2208
Thr Pro Gly Leu Gln Gly Met Pro Gly Glu Arg Gly Gly Pro Gly Gly
725 730 735
cct ggt cca aag ggt gat aag ggt gag cct ggc agc tca ggt gtc gat 2256
Pro Gly Pro Lys Gly Asp Lys Gly Glu Pro Gly Ser Ser Gly Val Asp
740 745 750
ggt gct cca ggg aaa gat ggt cca cgg ggt ccc act ggt ccc att ggt 2304
Gly Ala Pro Gly Lys Asp Gly Pro Arg Gly Pro Thr Gly Pro Ile Gly
755 760 765
cct cct ggc cca gct ggt cag cct gga gat aag ggt gaa agt ggt gcc 2352
Pro Pro Gly Pro Ala Gly Gln Pro Gly Asp Lys Gly Glu Ser Gly Ala
770 775 780
cct gga gtt ccg ggt ata gct ggt cct cgc ggt ggc cct ggt gag aga 2400
Pro Gly Val Pro Gly Ile Ala Gly Pro Arg Gly Gly Pro Gly Glu Arg
785 790 795 800
ggc gaa cag ggg ccc cca gga cct gct ggc ttc cct ggt gct cct ggc 2448
Gly Glu Gln Gly Pro Pro Gly Pro Ala Gly Phe Pro Gly Ala Pro Gly
805 810 815
cag aat ggt gag cct ggt gct aaa gga gaa aga ggc gct cct ggt gag 2496
Gln Asn Gly Glu Pro Gly Ala Lys Gly Glu Arg Gly Ala Pro Gly Glu
820 825 830
aaa ggt gaa gga ggc cct ccc gga gcc gca gga ccc gcc gga ggt tct 2544
Lys Gly Glu Gly Gly Pro Pro Gly Ala Ala Gly Pro Ala Gly Gly Ser
835 840 845
ggg cct gcc ggt ccc cca ggc ccc caa ggt gtc aaa ggc gaa cgt ggc 2592
Gly Pro Ala Gly Pro Pro Gly Pro Gln Gly Val Lys Gly Glu Arg Gly
850 855 860
agt cct ggt ggt cct ggt gct gct ggc ttc ccc ggt ggt cgt ggt cct 2640
Ser Pro Gly Gly Pro Gly Ala Ala Gly Phe Pro Gly Gly Arg Gly Pro
865 870 875 880
cct ggc cct cct ggc agt aat ggt aac cca ggc ccc cca ggc tcc agt 2688
Pro Gly Pro Pro Gly Ser Asn Gly Asn Pro Gly Pro Pro Gly Ser Ser
885 890 895
ggt gct cca ggc aaa gat ggt ccc cca ggt cca cct ggc agt aat ggt 2736
Gly Ala Pro Gly Lys Asp Gly Pro Pro Gly Pro Pro Gly Ser Asn Gly
900 905 910
gct cct ggc agc ccc ggg atc tct gga cca aag ggt gat tct ggt cca 2784
Ala Pro Gly Ser Pro Gly Ile Ser Gly Pro Lys Gly Asp Ser Gly Pro
915 920 925
cca ggt gag agg gga gca cct ggc ccc cag ggc cct ccg gga gct cca 2832
Pro Gly Glu Arg Gly Ala Pro Gly Pro Gln Gly Pro Pro Gly Ala Pro
930 935 940
ggc cca cta gga att gca gga ctt act gga gca cga ggt ctt gca ggc 2880
Gly Pro Leu Gly Ile Ala Gly Leu Thr Gly Ala Arg Gly Leu Ala Gly
945 950 955 960
cca cca ggc atg cca ggt gct agg ggc agc ccc ggc cca cag ggc atc 2928
Pro Pro Gly Met Pro Gly Ala Arg Gly Ser Pro Gly Pro Gln Gly Ile
965 970 975
aag ggt gaa aat ggt aaa cca gga cct agt ggt cag aat gga gaa cgt 2976
Lys Gly Glu Asn Gly Lys Pro Gly Pro Ser Gly Gln Asn Gly Glu Arg
980 985 990
ggt cct cct ggc ccc cag ggt ctt cct ggt ctg gct ggt aca gct ggt 3024
Gly Pro Pro Gly Pro Gln Gly Leu Pro Gly Leu Ala Gly Thr Ala Gly
995 1000 1005
gag cct gga aga gat gga aac cct gga tca gat ggt ctg cca ggc 3069
Glu Pro Gly Arg Asp Gly Asn Pro Gly Ser Asp Gly Leu Pro Gly
1010 1015 1020
cga gat gga gct cca ggt gcc aag ggt gac cgt ggt gaa aat ggc 3114
Arg Asp Gly Ala Pro Gly Ala Lys Gly Asp Arg Gly Glu Asn Gly
1025 1030 1035
tct cct ggt gcc cct gga gct cct ggt cac cca ggc cct cct ggt 3159
Ser Pro Gly Ala Pro Gly Ala Pro Gly His Pro Gly Pro Pro Gly
1040 1045 1050
cct gtc ggt cca gct gga aag agc ggt gac aga gga gaa act ggc 3204
Pro Val Gly Pro Ala Gly Lys Ser Gly Asp Arg Gly Glu Thr Gly
1055 1060 1065
cct gct ggt cct tct ggg gcc ccc ggt cct gcc gga tca aga ggt 3249
Pro Ala Gly Pro Ser Gly Ala Pro Gly Pro Ala Gly Ser Arg Gly
1070 1075 1080
cct cct ggt ccc caa ggc cca cgc ggt gac aaa ggg gaa acc ggt 3294
Pro Pro Gly Pro Gln Gly Pro Arg Gly Asp Lys Gly Glu Thr Gly
1085 1090 1095
gag cgt ggt gct atg ggc atc aaa gga cat cgc gga ttc cct ggc 3339
Glu Arg Gly Ala Met Gly Ile Lys Gly His Arg Gly Phe Pro Gly
1100 1105 1110
aac cca ggg gcc ccc gga tct ccg ggt ccc gct ggt cat caa ggt 3384
Asn Pro Gly Ala Pro Gly Ser Pro Gly Pro Ala Gly His Gln Gly
1115 1120 1125
gca gtt ggc agt cca ggc cct gca ggc ccc aga gga cct gtt gga 3429
Ala Val Gly Ser Pro Gly Pro Ala Gly Pro Arg Gly Pro Val Gly
1130 1135 1140
cct agc ggg ccc cct gga aag gac gga gca agt gga cac cct ggt 3474
Pro Ser Gly Pro Pro Gly Lys Asp Gly Ala Ser Gly His Pro Gly
1145 1150 1155
ccc att gga cca ccg ggg ccc cga ggt aac aga ggt gaa aga gga 3519
Pro Ile Gly Pro Pro Gly Pro Arg Gly Asn Arg Gly Glu Arg Gly
1160 1165 1170
tct gag ggc tcc cca ggc cac cca gga caa cca ggc cct cct gga 3564
Ser Glu Gly Ser Pro Gly His Pro Gly Gln Pro Gly Pro Pro Gly
1175 1180 1185
cct cct ggt gcc cct ggt cca tgt tgt ggt gct ggc ggg gtt gct 3609
Pro Pro Gly Ala Pro Gly Pro Cys Cys Gly Ala Gly Gly Val Ala
1190 1195 1200
gcc att gct ggt gtt gga gcc gaa aaa gct ggt ggt ttt gcc cca 3654
Ala Ile Ala Gly Val Gly Ala Glu Lys Ala Gly Gly Phe Ala Pro
1205 1210 1215
tat tat gga gat gaa ccg ata gat ttc aaa atc aac acc gat gag 3699
Tyr Tyr Gly Asp Glu Pro Ile Asp Phe Lys Ile Asn Thr Asp Glu
1220 1225 1230
att atg acc tca ctc aaa tca gtc aat gga caa ata gaa agc ctc 3744
Ile Met Thr Ser Leu Lys Ser Val Asn Gly Gln Ile Glu Ser Leu
1235 1240 1245
att agt cct gat ggt tcc cgt aaa aac cct gca cgg aac tgc agg 3789
Ile Ser Pro Asp Gly Ser Arg Lys Asn Pro Ala Arg Asn Cys Arg
1250 1255 1260
gac ctg aaa ttc tgc cat cct gaa ctc cag agt gga gaa tat tgg 3834
Asp Leu Lys Phe Cys His Pro Glu Leu Gln Ser Gly Glu Tyr Trp
1265 1270 1275
gtt gat cct aac caa ggt tgc aaa ttg gat gct att aaa gtc tac 3879
Val Asp Pro Asn Gln Gly Cys Lys Leu Asp Ala Ile Lys Val Tyr
1280 1285 1290
tgt aac atg gaa act ggg gaa acg tgc ata agt gcc agt cct ttg 3924
Cys Asn Met Glu Thr Gly Glu Thr Cys Ile Ser Ala Ser Pro Leu
1295 1300 1305
act atc cca cag aag aac tgg tgg aca gat tct ggt gct gag aag 3969
Thr Ile Pro Gln Lys Asn Trp Trp Thr Asp Ser Gly Ala Glu Lys
1310 1315 1320
aaa cat gtt tgg ttt gga gaa tcc atg gag ggt ggt ttt cag ttt 4014
Lys His Val Trp Phe Gly Glu Ser Met Glu Gly Gly Phe Gln Phe
1325 1330 1335
agc tat ggc aat cct gaa ctt ccc gaa gac gtc ctc gat gtc cag 4059
Ser Tyr Gly Asn Pro Glu Leu Pro Glu Asp Val Leu Asp Val Gln
1340 1345 1350
ctg gca ttc ctc cga ctt ctc tcc agc cgg gcc tct cag aac atc 4104
Leu Ala Phe Leu Arg Leu Leu Ser Ser Arg Ala Ser Gln Asn Ile
1355 1360 1365
aca tat cac tgc aag aat agc att gca tac atg gat cat gcc agt 4149
Thr Tyr His Cys Lys Asn Ser Ile Ala Tyr Met Asp His Ala Ser
1370 1375 1380
ggg aat gta aag aaa gcc ttg aag ctg atg ggg tca aat gaa ggt 4194
Gly Asn Val Lys Lys Ala Leu Lys Leu Met Gly Ser Asn Glu Gly
1385 1390 1395
gaa ttc aag gct gaa gga aat agc aaa ttc aca tac aca gtt ctg 4239
Glu Phe Lys Ala Glu Gly Asn Ser Lys Phe Thr Tyr Thr Val Leu
1400 1405 1410
gag gat ggt tgc aca aaa cac act ggg gaa tgg ggc aaa aca gtc 4284
Glu Asp Gly Cys Thr Lys His Thr Gly Glu Trp Gly Lys Thr Val
1415 1420 1425
ttc cag tat caa aca cgc aag gcc gtc aga cta cct att gta gat 4329
Phe Gln Tyr Gln Thr Arg Lys Ala Val Arg Leu Pro Ile Val Asp
1430 1435 1440
att gca ccc tat gat atc ggt ggt cct gat caa gaa ttt ggt gcg 4374
Ile Ala Pro Tyr Asp Ile Gly Gly Pro Asp Gln Glu Phe Gly Ala
1445 1450 1455
gac att ggc cct gtt tgc ttt tta taa 4401
Asp Ile Gly Pro Val Cys Phe Leu
1460 1465
<210> 2
<211> 1466
<212> PRT
<213> Bos taurus
<400> 2
Met Met Ser Phe Val Gln Lys Gly Thr Trp Leu Leu Phe Ala Leu Leu
1 5 10 15
His Pro Thr Val Ile Leu Ala Gln Gln Glu Ala Val Asp Gly Gly Cys
20 25 30
Ser His Leu Gly Gln Ser Tyr Ala Asp Arg Asp Val Trp Lys Pro Glu
35 40 45
Pro Cys Gln Ile Cys Val Cys Asp Ser Gly Ser Val Leu Cys Asp Asp
50 55 60
Ile Ile Cys Asp Asp Gln Glu Leu Asp Cys Pro Asn Pro Glu Ile Pro
65 70 75 80
Phe Gly Glu Cys Cys Ala Val Cys Pro Gln Pro Pro Thr Ala Pro Thr
85 90 95
Arg Pro Pro Asn Gly Gln Gly Pro Gln Gly Pro Lys Gly Asp Pro Gly
100 105 110
Pro Pro Gly Ile Pro Gly Arg Asn Gly Asp Pro Gly Pro Pro Gly Ser
115 120 125
Pro Gly Ser Pro Gly Ser Pro Gly Pro Pro Gly Ile Cys Glu Ser Cys
130 135 140
Pro Thr Gly Gly Gln Asn Tyr Ser Pro Gln Tyr Glu Ala Tyr Asp Val
145 150 155 160
Lys Ser Gly Val Ala Gly Gly Gly Ile Ala Gly Tyr Pro Gly Pro Ala
165 170 175
Gly Pro Pro Gly Pro Pro Gly Pro Pro Gly Thr Ser Gly His Pro Gly
180 185 190
Ala Pro Gly Ala Pro Gly Tyr Gln Gly Pro Pro Gly Glu Pro Gly Gln
195 200 205
Ala Gly Pro Ala Gly Pro Pro Gly Pro Pro Gly Ala Ile Gly Pro Ser
210 215 220
Gly Pro Ala Gly Lys Asp Gly Glu Ser Gly Arg Pro Gly Arg Pro Gly
225 230 235 240
Glu Arg Gly Phe Pro Gly Pro Pro Gly Met Lys Gly Pro Ala Gly Met
245 250 255
Pro Gly Phe Pro Gly Met Lys Gly His Arg Gly Phe Asp Gly Arg Asn
260 265 270
Gly Glu Lys Gly Glu Thr Gly Ala Pro Gly Leu Lys Gly Glu Asn Gly
275 280 285
Val Pro Gly Glu Asn Gly Ala Pro Gly Pro Met Gly Pro Arg Gly Ala
290 295 300
Pro Gly Glu Arg Gly Arg Pro Gly Leu Pro Gly Ala Ala Gly Ala Arg
305 310 315 320
Gly Asn Asp Gly Ala Arg Gly Ser Asp Gly Gln Pro Gly Pro Pro Gly
325 330 335
Pro Pro Gly Thr Ala Gly Phe Pro Gly Ser Pro Gly Ala Lys Gly Glu
340 345 350
Val Gly Pro Ala Gly Ser Pro Gly Ser Ser Gly Ala Pro Gly Gln Arg
355 360 365
Gly Glu Pro Gly Pro Gln Gly His Ala Gly Ala Pro Gly Pro Pro Gly
370 375 380
Pro Pro Gly Ser Asn Gly Ser Pro Gly Gly Lys Gly Glu Met Gly Pro
385 390 395 400
Ala Gly Ile Pro Gly Ala Pro Gly Leu Ile Gly Ala Arg Gly Pro Pro
405 410 415
Gly Pro Pro Gly Thr Asn Gly Val Pro Gly Gln Arg Gly Ala Ala Gly
420 425 430
Glu Pro Gly Lys Asn Gly Ala Lys Gly Asp Pro Gly Pro Arg Gly Glu
435 440 445
Arg Gly Glu Ala Gly Ser Pro Gly Ile Ala Gly Pro Lys Gly Glu Asp
450 455 460
Gly Lys Asp Gly Ser Pro Gly Glu Pro Gly Ala Asn Gly Leu Pro Gly
465 470 475 480
Ala Ala Gly Glu Arg Gly Val Pro Gly Phe Arg Gly Pro Ala Gly Ala
485 490 495
Asn Gly Leu Pro Gly Glu Lys Gly Pro Pro Gly Asp Arg Gly Gly Pro
500 505 510
Gly Pro Ala Gly Pro Arg Gly Val Ala Gly Glu Pro Gly Arg Asp Gly
515 520 525
Leu Pro Gly Gly Pro Gly Leu Arg Gly Ile Pro Gly Ser Pro Gly Gly
530 535 540
Pro Gly Ser Asp Gly Lys Pro Gly Pro Pro Gly Ser Gln Gly Glu Thr
545 550 555 560
Gly Arg Pro Gly Pro Pro Gly Ser Pro Gly Pro Arg Gly Gln Pro Gly
565 570 575
Val Met Gly Phe Pro Gly Pro Lys Gly Asn Asp Gly Ala Pro Gly Lys
580 585 590
Asn Gly Glu Arg Gly Gly Pro Gly Gly Pro Gly Pro Gln Gly Pro Ala
595 600 605
Gly Lys Asn Gly Glu Thr Gly Pro Gln Gly Pro Pro Gly Pro Thr Gly
610 615 620
Pro Ser Gly Asp Lys Gly Asp Thr Gly Pro Pro Gly Pro Gln Gly Leu
625 630 635 640
Gln Gly Leu Pro Gly Thr Ser Gly Pro Pro Gly Glu Asn Gly Lys Pro
645 650 655
Gly Glu Pro Gly Pro Lys Gly Glu Ala Gly Ala Pro Gly Ile Pro Gly
660 665 670
Gly Lys Gly Asp Ser Gly Ala Pro Gly Glu Arg Gly Pro Pro Gly Ala
675 680 685
Gly Gly Pro Pro Gly Pro Arg Gly Gly Ala Gly Pro Pro Gly Pro Glu
690 695 700
Gly Gly Lys Gly Ala Ala Gly Pro Pro Gly Pro Pro Gly Ser Ala Gly
705 710 715 720
Thr Pro Gly Leu Gln Gly Met Pro Gly Glu Arg Gly Gly Pro Gly Gly
725 730 735
Pro Gly Pro Lys Gly Asp Lys Gly Glu Pro Gly Ser Ser Gly Val Asp
740 745 750
Gly Ala Pro Gly Lys Asp Gly Pro Arg Gly Pro Thr Gly Pro Ile Gly
755 760 765
Pro Pro Gly Pro Ala Gly Gln Pro Gly Asp Lys Gly Glu Ser Gly Ala
770 775 780
Pro Gly Val Pro Gly Ile Ala Gly Pro Arg Gly Gly Pro Gly Glu Arg
785 790 795 800
Gly Glu Gln Gly Pro Pro Gly Pro Ala Gly Phe Pro Gly Ala Pro Gly
805 810 815
Gln Asn Gly Glu Pro Gly Ala Lys Gly Glu Arg Gly Ala Pro Gly Glu
820 825 830
Lys Gly Glu Gly Gly Pro Pro Gly Ala Ala Gly Pro Ala Gly Gly Ser
835 840 845
Gly Pro Ala Gly Pro Pro Gly Pro Gln Gly Val Lys Gly Glu Arg Gly
850 855 860
Ser Pro Gly Gly Pro Gly Ala Ala Gly Phe Pro Gly Gly Arg Gly Pro
865 870 875 880
Pro Gly Pro Pro Gly Ser Asn Gly Asn Pro Gly Pro Pro Gly Ser Ser
885 890 895
Gly Ala Pro Gly Lys Asp Gly Pro Pro Gly Pro Pro Gly Ser Asn Gly
900 905 910
Ala Pro Gly Ser Pro Gly Ile Ser Gly Pro Lys Gly Asp Ser Gly Pro
915 920 925
Pro Gly Glu Arg Gly Ala Pro Gly Pro Gln Gly Pro Pro Gly Ala Pro
930 935 940
Gly Pro Leu Gly Ile Ala Gly Leu Thr Gly Ala Arg Gly Leu Ala Gly
945 950 955 960
Pro Pro Gly Met Pro Gly Ala Arg Gly Ser Pro Gly Pro Gln Gly Ile
965 970 975
Lys Gly Glu Asn Gly Lys Pro Gly Pro Ser Gly Gln Asn Gly Glu Arg
980 985 990
Gly Pro Pro Gly Pro Gln Gly Leu Pro Gly Leu Ala Gly Thr Ala Gly
995 1000 1005
Glu Pro Gly Arg Asp Gly Asn Pro Gly Ser Asp Gly Leu Pro Gly
1010 1015 1020
Arg Asp Gly Ala Pro Gly Ala Lys Gly Asp Arg Gly Glu Asn Gly
1025 1030 1035
Ser Pro Gly Ala Pro Gly Ala Pro Gly His Pro Gly Pro Pro Gly
1040 1045 1050
Pro Val Gly Pro Ala Gly Lys Ser Gly Asp Arg Gly Glu Thr Gly
1055 1060 1065
Pro Ala Gly Pro Ser Gly Ala Pro Gly Pro Ala Gly Ser Arg Gly
1070 1075 1080
Pro Pro Gly Pro Gln Gly Pro Arg Gly Asp Lys Gly Glu Thr Gly
1085 1090 1095
Glu Arg Gly Ala Met Gly Ile Lys Gly His Arg Gly Phe Pro Gly
1100 1105 1110
Asn Pro Gly Ala Pro Gly Ser Pro Gly Pro Ala Gly His Gln Gly
1115 1120 1125
Ala Val Gly Ser Pro Gly Pro Ala Gly Pro Arg Gly Pro Val Gly
1130 1135 1140
Pro Ser Gly Pro Pro Gly Lys Asp Gly Ala Ser Gly His Pro Gly
1145 1150 1155
Pro Ile Gly Pro Pro Gly Pro Arg Gly Asn Arg Gly Glu Arg Gly
1160 1165 1170
Ser Glu Gly Ser Pro Gly His Pro Gly Gln Pro Gly Pro Pro Gly
1175 1180 1185
Pro Pro Gly Ala Pro Gly Pro Cys Cys Gly Ala Gly Gly Val Ala
1190 1195 1200
Ala Ile Ala Gly Val Gly Ala Glu Lys Ala Gly Gly Phe Ala Pro
1205 1210 1215
Tyr Tyr Gly Asp Glu Pro Ile Asp Phe Lys Ile Asn Thr Asp Glu
1220 1225 1230
Ile Met Thr Ser Leu Lys Ser Val Asn Gly Gln Ile Glu Ser Leu
1235 1240 1245
Ile Ser Pro Asp Gly Ser Arg Lys Asn Pro Ala Arg Asn Cys Arg
1250 1255 1260
Asp Leu Lys Phe Cys His Pro Glu Leu Gln Ser Gly Glu Tyr Trp
1265 1270 1275
Val Asp Pro Asn Gln Gly Cys Lys Leu Asp Ala Ile Lys Val Tyr
1280 1285 1290
Cys Asn Met Glu Thr Gly Glu Thr Cys Ile Ser Ala Ser Pro Leu
1295 1300 1305
Thr Ile Pro Gln Lys Asn Trp Trp Thr Asp Ser Gly Ala Glu Lys
1310 1315 1320
Lys His Val Trp Phe Gly Glu Ser Met Glu Gly Gly Phe Gln Phe
1325 1330 1335
Ser Tyr Gly Asn Pro Glu Leu Pro Glu Asp Val Leu Asp Val Gln
1340 1345 1350
Leu Ala Phe Leu Arg Leu Leu Ser Ser Arg Ala Ser Gln Asn Ile
1355 1360 1365
Thr Tyr His Cys Lys Asn Ser Ile Ala Tyr Met Asp His Ala Ser
1370 1375 1380
Gly Asn Val Lys Lys Ala Leu Lys Leu Met Gly Ser Asn Glu Gly
1385 1390 1395
Glu Phe Lys Ala Glu Gly Asn Ser Lys Phe Thr Tyr Thr Val Leu
1400 1405 1410
Glu Asp Gly Cys Thr Lys His Thr Gly Glu Trp Gly Lys Thr Val
1415 1420 1425
Phe Gln Tyr Gln Thr Arg Lys Ala Val Arg Leu Pro Ile Val Asp
1430 1435 1440
Ile Ala Pro Tyr Asp Ile Gly Gly Pro Asp Gln Glu Phe Gly Ala
1445 1450 1455
Asp Ile Gly Pro Val Cys Phe Leu
1460 1465
<210> 3
<211> 4404
<212> DNA
<213> Artificial Sequence
<220>
<223> Col3A1 cDNA sequence (Sequence 2)
<220>
<221> CDS
<222> (1)..(4404)
<400> 3
atg atg tct ttt gtc caa aag ggt act tgg tta ctt ttt gct ctg ttg 48
Met Met Ser Phe Val Gln Lys Gly Thr Trp Leu Leu Phe Ala Leu Leu
1 5 10 15
cac cca act gtt att ctc gca caa cag gaa gca gta gat ggt ggt tgc 96
His Pro Thr Val Ile Leu Ala Gln Gln Glu Ala Val Asp Gly Gly Cys
20 25 30
tca cat tta ggt caa tct tac gca gat aga gat gta tgg aaa cct gaa 144
Ser His Leu Gly Gln Ser Tyr Ala Asp Arg Asp Val Trp Lys Pro Glu
35 40 45
cca tgt caa att tgc gtg tgt gac tca ggt tca gtg ctc tgc gac gat 192
Pro Cys Gln Ile Cys Val Cys Asp Ser Gly Ser Val Leu Cys Asp Asp
50 55 60
atc ata tgt gac gac cag gaa ttg gac tgt cca aac cca gag ata cca 240
Ile Ile Cys Asp Asp Gln Glu Leu Asp Cys Pro Asn Pro Glu Ile Pro
65 70 75 80
ttc ggt gaa tgt tgt gct gtt tgt cca cag cca cca act gct cct aca 288
Phe Gly Glu Cys Cys Ala Val Cys Pro Gln Pro Pro Thr Ala Pro Thr
85 90 95
aga cct cca aac ggt caa ggt cca caa ggt cct aaa ggt gat ccg ggt 336
Arg Pro Pro Asn Gly Gln Gly Pro Gln Gly Pro Lys Gly Asp Pro Gly
100 105 110
cca cct ggt att cct ggt aga aat ggt gac cct gga cct ccc ggt tcc 384
Pro Pro Gly Ile Pro Gly Arg Asn Gly Asp Pro Gly Pro Pro Gly Ser
115 120 125
cca ggt agc cca gga tca cct ggg cct cct gga ata tgt gaa tcc tgc 432
Pro Gly Ser Pro Gly Ser Pro Gly Pro Pro Gly Ile Cys Glu Ser Cys
130 135 140
cca act ggt ggt cag aac tat agc cca caa tac gag gcc tac gac gtc 480
Pro Thr Gly Gly Gln Asn Tyr Ser Pro Gln Tyr Glu Ala Tyr Asp Val
145 150 155 160
aaa tct ggt gtt gct gga gga ggt att gca ggc tac cct ggt ccc gca 528
Lys Ser Gly Val Ala Gly Gly Gly Ile Ala Gly Tyr Pro Gly Pro Ala
165 170 175
ggg ccc cca ggt ccg ccg ggt ccg ccc gga aca tca ggt cat ccc gga 576
Gly Pro Pro Gly Pro Pro Gly Pro Pro Gly Thr Ser Gly His Pro Gly
180 185 190
gcc cct ggt gca cca ggt tat cag gga ccg ccc gga gag cct gga caa 624
Ala Pro Gly Ala Pro Gly Tyr Gln Gly Pro Pro Gly Glu Pro Gly Gln
195 200 205
gct ggt ccc gct gga ccc cct ggt cca cca ggt gct att gga cca agt 672
Ala Gly Pro Ala Gly Pro Pro Gly Pro Pro Gly Ala Ile Gly Pro Ser
210 215 220
ggt cct gcc gga aaa gac ggt gaa tcc ggt aga cct ggt aga ccc ggc 720
Gly Pro Ala Gly Lys Asp Gly Glu Ser Gly Arg Pro Gly Arg Pro Gly
225 230 235 240
gaa agg ggt ttc cca ggt cct ccc gga atg aag ggt cca gcc ggt atg 768
Glu Arg Gly Phe Pro Gly Pro Pro Gly Met Lys Gly Pro Ala Gly Met
245 250 255
ccc ggt ttt cct ggg atg aag ggt cac aga gga ttt gat ggt aga aac 816
Pro Gly Phe Pro Gly Met Lys Gly His Arg Gly Phe Asp Gly Arg Asn
260 265 270
gga gag aaa ggc gaa acc ggt gct ccc gga ctg aag ggt gaa aac ggt 864
Gly Glu Lys Gly Glu Thr Gly Ala Pro Gly Leu Lys Gly Glu Asn Gly
275 280 285
gtc cct ggt gag aac ggc gct cct gga cct atg ggt cca cgt ggt gct 912
Val Pro Gly Glu Asn Gly Ala Pro Gly Pro Met Gly Pro Arg Gly Ala
290 295 300
cca gga gaa aga ggc aga cca gga ttg cct ggt gca gct ggt gct aga 960
Pro Gly Glu Arg Gly Arg Pro Gly Leu Pro Gly Ala Ala Gly Ala Arg
305 310 315 320
ggt aac gat ggt gcc cgt ggt tcc gat gga caa ccc ggg cca ccc ggc 1008
Gly Asn Asp Gly Ala Arg Gly Ser Asp Gly Gln Pro Gly Pro Pro Gly
325 330 335
cct cca ggt acc gct gga ttt cct gga agc cct ggt gct aag ggg gag 1056
Pro Pro Gly Thr Ala Gly Phe Pro Gly Ser Pro Gly Ala Lys Gly Glu
340 345 350
gtt ggt ccg gct ggt agt ccc gga agt agc ggt gcc cca ggt caa aga 1104
Val Gly Pro Ala Gly Ser Pro Gly Ser Ser Gly Ala Pro Gly Gln Arg
355 360 365
ggc gaa cca ggc cct cag ggt cac gca gga gca cct gga ccg cct ggt 1152
Gly Glu Pro Gly Pro Gln Gly His Ala Gly Ala Pro Gly Pro Pro Gly
370 375 380
cct cct ggt tcg aat ggt tcg cct gga gga aaa ggt gaa atg ggg ccc 1200
Pro Pro Gly Ser Asn Gly Ser Pro Gly Gly Lys Gly Glu Met Gly Pro
385 390 395 400
gca gga atc ccc ggt gcg cct ggt ctt att ggt gcc agg ggt cct cca 1248
Ala Gly Ile Pro Gly Ala Pro Gly Leu Ile Gly Ala Arg Gly Pro Pro
405 410 415
ggc ccg cca ggt aca aat ggt gta ccc gga cag cga gga gca gct ggt 1296
Gly Pro Pro Gly Thr Asn Gly Val Pro Gly Gln Arg Gly Ala Ala Gly
420 425 430
gaa cct ggt aaa aac ggt gcc aaa gga gat cca ggt cct cgt gga gag 1344
Glu Pro Gly Lys Asn Gly Ala Lys Gly Asp Pro Gly Pro Arg Gly Glu
435 440 445
cgt ggt gaa gct ggc tct ccc ggt atc gcc ggt cca aaa ggt gag gac 1392
Arg Gly Glu Ala Gly Ser Pro Gly Ile Ala Gly Pro Lys Gly Glu Asp
450 455 460
ggt aag gac ggt tcc cct ggt gag cca ggt gcg aac gga ctg cca ggt 1440
Gly Lys Asp Gly Ser Pro Gly Glu Pro Gly Ala Asn Gly Leu Pro Gly
465 470 475 480
gca gcc gga gag cga gga gtc cca gga ttc agg gga cca gcc ggt gct 1488
Ala Ala Gly Glu Arg Gly Val Pro Gly Phe Arg Gly Pro Ala Gly Ala
485 490 495
aac ggc ttg cct ggt gaa aaa ggg ccc cct ggt gat agg gga gga ccc 1536
Asn Gly Leu Pro Gly Glu Lys Gly Pro Pro Gly Asp Arg Gly Gly Pro
500 505 510
ggt cca gca ggc cct cgt gga gtt gct ggt gag cct gga cgt gac ggt 1584
Gly Pro Ala Gly Pro Arg Gly Val Ala Gly Glu Pro Gly Arg Asp Gly
515 520 525
tta cca gga ggg cca ggt ttg agg ggt att ccc ggg tcc cct ggc ggt 1632
Leu Pro Gly Gly Pro Gly Leu Arg Gly Ile Pro Gly Ser Pro Gly Gly
530 535 540
cct gga tcg gat gga aaa cca ggg cca cca ggt tcg cag ggt gaa aca 1680
Pro Gly Ser Asp Gly Lys Pro Gly Pro Pro Gly Ser Gln Gly Glu Thr
545 550 555 560
gga cgt cca ggc cca ccc ggc tca cct ggt cca agg ggt cag cct ggt 1728
Gly Arg Pro Gly Pro Pro Gly Ser Pro Gly Pro Arg Gly Gln Pro Gly
565 570 575
gtc atg ggt ttc ccc ggt cca aag ggt aat gac gga gca ccg ggt aaa 1776
Val Met Gly Phe Pro Gly Pro Lys Gly Asn Asp Gly Ala Pro Gly Lys
580 585 590
aat ggt gaa cgt ggt ggc cca ggt ggt cca gga ccc caa ggt cca gct 1824
Asn Gly Glu Arg Gly Gly Pro Gly Gly Pro Gly Pro Gln Gly Pro Ala
595 600 605
gga aaa aac ggt gag aca ggt cct caa gga cct cca gga cct acc ggt 1872
Gly Lys Asn Gly Glu Thr Gly Pro Gln Gly Pro Pro Gly Pro Thr Gly
610 615 620
cct agc gga gat aag gga gat acg gga ccg cca gga cct caa gga ttg 1920
Pro Ser Gly Asp Lys Gly Asp Thr Gly Pro Pro Gly Pro Gln Gly Leu
625 630 635 640
caa ggt ttg cct ggt aca tct ggc cct ccc gga gaa aat ggt aag cct 1968
Gln Gly Leu Pro Gly Thr Ser Gly Pro Pro Gly Glu Asn Gly Lys Pro
645 650 655
gga gag cca gga cca aaa ggc gaa gct gga gcc cca ggt atc ccc gga 2016
Gly Glu Pro Gly Pro Lys Gly Glu Ala Gly Ala Pro Gly Ile Pro Gly
660 665 670
ggt aag gga gac tca ggt gct ccg ggt gag cgt ggt cct ccg ggt gcc 2064
Gly Lys Gly Asp Ser Gly Ala Pro Gly Glu Arg Gly Pro Pro Gly Ala
675 680 685
ggt ggt cca cct gga cct aga ggt ggt gcc ggg ccg cca ggt cct gaa 2112
Gly Gly Pro Pro Gly Pro Arg Gly Gly Ala Gly Pro Pro Gly Pro Glu
690 695 700
ggt ggt aaa ggt gct gct ggt cca ccg gga ccg cct ggc tct gct ggt 2160
Gly Gly Lys Gly Ala Ala Gly Pro Pro Gly Pro Pro Gly Ser Ala Gly
705 710 715 720
act cct ggc ttg cag gga atg cca gga gag aga ggt gga cct gga ggt 2208
Thr Pro Gly Leu Gln Gly Met Pro Gly Glu Arg Gly Gly Pro Gly Gly
725 730 735
ccc ggt ccg aag ggt gat aaa ggg gag cca gga tca tcc ggt gtt gac 2256
Pro Gly Pro Lys Gly Asp Lys Gly Glu Pro Gly Ser Ser Gly Val Asp
740 745 750
ggc gca cct ggt aaa gac gga cca agg gga cca acg ggt cca atc gga 2304
Gly Ala Pro Gly Lys Asp Gly Pro Arg Gly Pro Thr Gly Pro Ile Gly
755 760 765
cca cca gga ccc gct ggc cag cca gga gat aaa ggc gag tcc gga gca 2352
Pro Pro Gly Pro Ala Gly Gln Pro Gly Asp Lys Gly Glu Ser Gly Ala
770 775 780
ccc ggt gtt cct ggt ata gct gga ccc agg ggt ggt ccc ggt gaa aga 2400
Pro Gly Val Pro Gly Ile Ala Gly Pro Arg Gly Gly Pro Gly Glu Arg
785 790 795 800
ggt gaa cag ggc cca ccg ggt ccc gcc ggt ttc cct ggc gcc cct ggt 2448
Gly Glu Gln Gly Pro Pro Gly Pro Ala Gly Phe Pro Gly Ala Pro Gly
805 810 815
caa aat gga gaa cca ggt gca aag ggc gag aga gga gcc cca gga gaa 2496
Gln Asn Gly Glu Pro Gly Ala Lys Gly Glu Arg Gly Ala Pro Gly Glu
820 825 830
aag ggt gag gga gga cca ccc ggt gct gcc ggt cca gct ggg ggt tca 2544
Lys Gly Glu Gly Gly Pro Pro Gly Ala Ala Gly Pro Ala Gly Gly Ser
835 840 845
ggt cct gct gga cca cca ggt cca cag ggc gtt aaa ggt gag aga gga 2592
Gly Pro Ala Gly Pro Pro Gly Pro Gln Gly Val Lys Gly Glu Arg Gly
850 855 860
agt cca ggt ggt cct gga gct gct gga ttc cca ggt ggc cgt gga cct 2640
Ser Pro Gly Gly Pro Gly Ala Ala Gly Phe Pro Gly Gly Arg Gly Pro
865 870 875 880
cct ggt ccc cct gga tcg aat ggt aat cct ggt ccg cca ggt agt tcg 2688
Pro Gly Pro Pro Gly Ser Asn Gly Asn Pro Gly Pro Pro Gly Ser Ser
885 890 895
ggt gct cct ggg aag gac ggt cca cct ggc ccc cca ggt agt aac ggt 2736
Gly Ala Pro Gly Lys Asp Gly Pro Pro Gly Pro Pro Gly Ser Asn Gly
900 905 910
gca cct ggt agt cca ggt ata tcc gga cct aaa gga gat tcc ggt cca 2784
Ala Pro Gly Ser Pro Gly Ile Ser Gly Pro Lys Gly Asp Ser Gly Pro
915 920 925
cca ggc gaa aga ggg gcc cca ggc cca cag ggt cca cca gga gcc ccc 2832
Pro Gly Glu Arg Gly Ala Pro Gly Pro Gln Gly Pro Pro Gly Ala Pro
930 935 940
ggt cct ctg ggt att gct ggt ctt act ggt gca cgt gga ctg gcc ggt 2880
Gly Pro Leu Gly Ile Ala Gly Leu Thr Gly Ala Arg Gly Leu Ala Gly
945 950 955 960
cca ccc gga atg cct gga gca aga ggt tca cct gga cca caa ggt att 2928
Pro Pro Gly Met Pro Gly Ala Arg Gly Ser Pro Gly Pro Gln Gly Ile
965 970 975
aaa gga gag aac ggt aaa cct gga cct tcc ggt caa aac gga gag cgg 2976
Lys Gly Glu Asn Gly Lys Pro Gly Pro Ser Gly Gln Asn Gly Glu Arg
980 985 990
gga ccc cca ggc ccc caa ggt ctg cca gga cta gct ggt acc gca ggg 3024
Gly Pro Pro Gly Pro Gln Gly Leu Pro Gly Leu Ala Gly Thr Ala Gly
995 1000 1005
gaa cca gga aga gat gga aat cca ggt tca gac gga cta ccc ggt 3069
Glu Pro Gly Arg Asp Gly Asn Pro Gly Ser Asp Gly Leu Pro Gly
1010 1015 1020
aga gat ggt gca ccg ggg gcc aag ggc gac agg ggt gag aat gga 3114
Arg Asp Gly Ala Pro Gly Ala Lys Gly Asp Arg Gly Glu Asn Gly
1025 1030 1035
tct cct ggt gcg cca ggg gca cca ggc cac cca ggt ccc cca ggt 3159
Ser Pro Gly Ala Pro Gly Ala Pro Gly His Pro Gly Pro Pro Gly
1040 1045 1050
cct gtg ggc cct gct gga aag tca ggt gac agg gga gag aca ggc 3204
Pro Val Gly Pro Ala Gly Lys Ser Gly Asp Arg Gly Glu Thr Gly
1055 1060 1065
ccg gct ggt cca tct ggc gca ccc gga cca gct ggt tcc aga ggc 3249
Pro Ala Gly Pro Ser Gly Ala Pro Gly Pro Ala Gly Ser Arg Gly
1070 1075 1080
cca cct ggt ccg caa ggc cct aga ggt gac aag gga gag act gga 3294
Pro Pro Gly Pro Gln Gly Pro Arg Gly Asp Lys Gly Glu Thr Gly
1085 1090 1095
gaa cga ggt gct atg ggt atc aag ggt cat aga ggt ttt ccg ggt 3339
Glu Arg Gly Ala Met Gly Ile Lys Gly His Arg Gly Phe Pro Gly
1100 1105 1110
aat ccc ggc gcc cca ggt tct cct ggt cca gct ggc cat caa ggt 3384
Asn Pro Gly Ala Pro Gly Ser Pro Gly Pro Ala Gly His Gln Gly
1115 1120 1125
gca gtc gga tcg ccc ggc cca gcc ggt ccc agg ggc cct gtt ggt 3429
Ala Val Gly Ser Pro Gly Pro Ala Gly Pro Arg Gly Pro Val Gly
1130 1135 1140
cca tcc ggt cct cca gga aag gat ggt gct tct gga cac cca gga 3474
Pro Ser Gly Pro Pro Gly Lys Asp Gly Ala Ser Gly His Pro Gly
1145 1150 1155
cct atc gga cct ccg ggt cct aga ggt aat aga gga gaa cgt gga 3519
Pro Ile Gly Pro Pro Gly Pro Arg Gly Asn Arg Gly Glu Arg Gly
1160 1165 1170
tcc gag ggt agt cct ggt cac cct ggt caa cct ggc cca cca ggg 3564
Ser Glu Gly Ser Pro Gly His Pro Gly Gln Pro Gly Pro Pro Gly
1175 1180 1185
cct cca ggt gca ccc ggt cca tgt tgt ggt gca ggc ggt gtg gct 3609
Pro Pro Gly Ala Pro Gly Pro Cys Cys Gly Ala Gly Gly Val Ala
1190 1195 1200
gca att gct ggt gtg ggt gct gaa aag gcc ggc ggt ttc gct cca 3654
Ala Ile Ala Gly Val Gly Ala Glu Lys Ala Gly Gly Phe Ala Pro
1205 1210 1215
tat tat ggt gat gaa ccg att gat ttt aag atc aat act gac gaa 3699
Tyr Tyr Gly Asp Glu Pro Ile Asp Phe Lys Ile Asn Thr Asp Glu
1220 1225 1230
atc atg act tcc tta aag tcc gtt aat ggt caa att gag tct cta 3744
Ile Met Thr Ser Leu Lys Ser Val Asn Gly Gln Ile Glu Ser Leu
1235 1240 1245
atc tcc cca gat ggt tca cgt aaa aat cct gct aga aat tgt aga 3789
Ile Ser Pro Asp Gly Ser Arg Lys Asn Pro Ala Arg Asn Cys Arg
1250 1255 1260
gat ttg aag ttt tgt cac ccc gag ttg cag tcc ggt gag tac tgg 3834
Asp Leu Lys Phe Cys His Pro Glu Leu Gln Ser Gly Glu Tyr Trp
1265 1270 1275
gtg gac ccc aat caa ggt tgt aag tta gac gct att aaa gtt tac 3879
Val Asp Pro Asn Gln Gly Cys Lys Leu Asp Ala Ile Lys Val Tyr
1280 1285 1290
tgc aat atg gag aca gga gaa act tgc atc agc gct tct cca ttg 3924
Cys Asn Met Glu Thr Gly Glu Thr Cys Ile Ser Ala Ser Pro Leu
1295 1300 1305
act atc cca caa aaa aat tgg tgg act gac tct gga gct gag aaa 3969
Thr Ile Pro Gln Lys Asn Trp Trp Thr Asp Ser Gly Ala Glu Lys
1310 1315 1320
aag cat gta tgg ttc ggg gaa tcg atg gaa ggt ggt ttc caa ttc 4014
Lys His Val Trp Phe Gly Glu Ser Met Glu Gly Gly Phe Gln Phe
1325 1330 1335
agc tac ggt aac cct gaa ctt cct gaa gat gtt ctt gac gtt caa 4059
Ser Tyr Gly Asn Pro Glu Leu Pro Glu Asp Val Leu Asp Val Gln
1340 1345 1350
ttg gca ttt ctg aga ttg ttg tcc agt cgt gca agc caa aac att 4104
Leu Ala Phe Leu Arg Leu Leu Ser Ser Arg Ala Ser Gln Asn Ile
1355 1360 1365
aca tac cat tgc aaa aat tcc atc gca tat atg gat cat gct agc 4149
Thr Tyr His Cys Lys Asn Ser Ile Ala Tyr Met Asp His Ala Ser
1370 1375 1380
gga aat gtg aaa aag gca ttg aag ctg atg gga tca aat gaa ggt 4194
Gly Asn Val Lys Lys Ala Leu Lys Leu Met Gly Ser Asn Glu Gly
1385 1390 1395
gaa ttt aaa gca gag ggt aat tct aag ttt act tac act gta ttg 4239
Glu Phe Lys Ala Glu Gly Asn Ser Lys Phe Thr Tyr Thr Val Leu
1400 1405 1410
gag gat ggt tgt acg aag cat aca ggt gaa tgg ggt aaa aca gtg 4284
Glu Asp Gly Cys Thr Lys His Thr Gly Glu Trp Gly Lys Thr Val
1415 1420 1425
ttt caa tat caa acc cgc aaa gca gtt aga ttg cca atc gtc gat 4329
Phe Gln Tyr Gln Thr Arg Lys Ala Val Arg Leu Pro Ile Val Asp
1430 1435 1440
atc gca cca tac gac att gga gga cca gat caa gag ttc gga gct 4374
Ile Ala Pro Tyr Asp Ile Gly Gly Pro Asp Gln Glu Phe Gly Ala
1445 1450 1455
gac atc ggt ccg gtg tgt ttc ctt tga taa 4404
Asp Ile Gly Pro Val Cys Phe Leu
1460 1465
<210> 4
<211> 1466
<212> PRT
<213> Artificial Sequence
<220>
<223> Synthetic Construct
<400> 4
Met Met Ser Phe Val Gln Lys Gly Thr Trp Leu Leu Phe Ala Leu Leu
1 5 10 15
His Pro Thr Val Ile Leu Ala Gln Gln Glu Ala Val Asp Gly Gly Cys
20 25 30
Ser His Leu Gly Gln Ser Tyr Ala Asp Arg Asp Val Trp Lys Pro Glu
35 40 45
Pro Cys Gln Ile Cys Val Cys Asp Ser Gly Ser Val Leu Cys Asp Asp
50 55 60
Ile Ile Cys Asp Asp Gln Glu Leu Asp Cys Pro Asn Pro Glu Ile Pro
65 70 75 80
Phe Gly Glu Cys Cys Ala Val Cys Pro Gln Pro Pro Thr Ala Pro Thr
85 90 95
Arg Pro Pro Asn Gly Gln Gly Pro Gln Gly Pro Lys Gly Asp Pro Gly
100 105 110
Pro Pro Gly Ile Pro Gly Arg Asn Gly Asp Pro Gly Pro Pro Gly Ser
115 120 125
Pro Gly Ser Pro Gly Ser Pro Gly Pro Pro Gly Ile Cys Glu Ser Cys
130 135 140
Pro Thr Gly Gly Gln Asn Tyr Ser Pro Gln Tyr Glu Ala Tyr Asp Val
145 150 155 160
Lys Ser Gly Val Ala Gly Gly Gly Ile Ala Gly Tyr Pro Gly Pro Ala
165 170 175
Gly Pro Pro Gly Pro Pro Gly Pro Pro Gly Thr Ser Gly His Pro Gly
180 185 190
Ala Pro Gly Ala Pro Gly Tyr Gln Gly Pro Pro Gly Glu Pro Gly Gln
195 200 205
Ala Gly Pro Ala Gly Pro Pro Gly Pro Pro Gly Ala Ile Gly Pro Ser
210 215 220
Gly Pro Ala Gly Lys Asp Gly Glu Ser Gly Arg Pro Gly Arg Pro Gly
225 230 235 240
Glu Arg Gly Phe Pro Gly Pro Pro Gly Met Lys Gly Pro Ala Gly Met
245 250 255
Pro Gly Phe Pro Gly Met Lys Gly His Arg Gly Phe Asp Gly Arg Asn
260 265 270
Gly Glu Lys Gly Glu Thr Gly Ala Pro Gly Leu Lys Gly Glu Asn Gly
275 280 285
Val Pro Gly Glu Asn Gly Ala Pro Gly Pro Met Gly Pro Arg Gly Ala
290 295 300
Pro Gly Glu Arg Gly Arg Pro Gly Leu Pro Gly Ala Ala Gly Ala Arg
305 310 315 320
Gly Asn Asp Gly Ala Arg Gly Ser Asp Gly Gln Pro Gly Pro Pro Gly
325 330 335
Pro Pro Gly Thr Ala Gly Phe Pro Gly Ser Pro Gly Ala Lys Gly Glu
340 345 350
Val Gly Pro Ala Gly Ser Pro Gly Ser Ser Gly Ala Pro Gly Gln Arg
355 360 365
Gly Glu Pro Gly Pro Gln Gly His Ala Gly Ala Pro Gly Pro Pro Gly
370 375 380
Pro Pro Gly Ser Asn Gly Ser Pro Gly Gly Lys Gly Glu Met Gly Pro
385 390 395 400
Ala Gly Ile Pro Gly Ala Pro Gly Leu Ile Gly Ala Arg Gly Pro Pro
405 410 415
Gly Pro Pro Gly Thr Asn Gly Val Pro Gly Gln Arg Gly Ala Ala Gly
420 425 430
Glu Pro Gly Lys Asn Gly Ala Lys Gly Asp Pro Gly Pro Arg Gly Glu
435 440 445
Arg Gly Glu Ala Gly Ser Pro Gly Ile Ala Gly Pro Lys Gly Glu Asp
450 455 460
Gly Lys Asp Gly Ser Pro Gly Glu Pro Gly Ala Asn Gly Leu Pro Gly
465 470 475 480
Ala Ala Gly Glu Arg Gly Val Pro Gly Phe Arg Gly Pro Ala Gly Ala
485 490 495
Asn Gly Leu Pro Gly Glu Lys Gly Pro Pro Gly Asp Arg Gly Gly Pro
500 505 510
Gly Pro Ala Gly Pro Arg Gly Val Ala Gly Glu Pro Gly Arg Asp Gly
515 520 525
Leu Pro Gly Gly Pro Gly Leu Arg Gly Ile Pro Gly Ser Pro Gly Gly
530 535 540
Pro Gly Ser Asp Gly Lys Pro Gly Pro Pro Gly Ser Gln Gly Glu Thr
545 550 555 560
Gly Arg Pro Gly Pro Pro Gly Ser Pro Gly Pro Arg Gly Gln Pro Gly
565 570 575
Val Met Gly Phe Pro Gly Pro Lys Gly Asn Asp Gly Ala Pro Gly Lys
580 585 590
Asn Gly Glu Arg Gly Gly Pro Gly Gly Pro Gly Pro Gln Gly Pro Ala
595 600 605
Gly Lys Asn Gly Glu Thr Gly Pro Gln Gly Pro Pro Gly Pro Thr Gly
610 615 620
Pro Ser Gly Asp Lys Gly Asp Thr Gly Pro Pro Gly Pro Gln Gly Leu
625 630 635 640
Gln Gly Leu Pro Gly Thr Ser Gly Pro Pro Gly Glu Asn Gly Lys Pro
645 650 655
Gly Glu Pro Gly Pro Lys Gly Glu Ala Gly Ala Pro Gly Ile Pro Gly
660 665 670
Gly Lys Gly Asp Ser Gly Ala Pro Gly Glu Arg Gly Pro Pro Gly Ala
675 680 685
Gly Gly Pro Pro Gly Pro Arg Gly Gly Ala Gly Pro Pro Gly Pro Glu
690 695 700
Gly Gly Lys Gly Ala Ala Gly Pro Pro Gly Pro Pro Gly Ser Ala Gly
705 710 715 720
Thr Pro Gly Leu Gln Gly Met Pro Gly Glu Arg Gly Gly Pro Gly Gly
725 730 735
Pro Gly Pro Lys Gly Asp Lys Gly Glu Pro Gly Ser Ser Gly Val Asp
740 745 750
Gly Ala Pro Gly Lys Asp Gly Pro Arg Gly Pro Thr Gly Pro Ile Gly
755 760 765
Pro Pro Gly Pro Ala Gly Gln Pro Gly Asp Lys Gly Glu Ser Gly Ala
770 775 780
Pro Gly Val Pro Gly Ile Ala Gly Pro Arg Gly Gly Pro Gly Glu Arg
785 790 795 800
Gly Glu Gln Gly Pro Pro Gly Pro Ala Gly Phe Pro Gly Ala Pro Gly
805 810 815
Gln Asn Gly Glu Pro Gly Ala Lys Gly Glu Arg Gly Ala Pro Gly Glu
820 825 830
Lys Gly Glu Gly Gly Pro Pro Gly Ala Ala Gly Pro Ala Gly Gly Ser
835 840 845
Gly Pro Ala Gly Pro Pro Gly Pro Gln Gly Val Lys Gly Glu Arg Gly
850 855 860
Ser Pro Gly Gly Pro Gly Ala Ala Gly Phe Pro Gly Gly Arg Gly Pro
865 870 875 880
Pro Gly Pro Pro Gly Ser Asn Gly Asn Pro Gly Pro Pro Gly Ser Ser
885 890 895
Gly Ala Pro Gly Lys Asp Gly Pro Pro Gly Pro Pro Gly Ser Asn Gly
900 905 910
Ala Pro Gly Ser Pro Gly Ile Ser Gly Pro Lys Gly Asp Ser Gly Pro
915 920 925
Pro Gly Glu Arg Gly Ala Pro Gly Pro Gln Gly Pro Pro Gly Ala Pro
930 935 940
Gly Pro Leu Gly Ile Ala Gly Leu Thr Gly Ala Arg Gly Leu Ala Gly
945 950 955 960
Pro Pro Gly Met Pro Gly Ala Arg Gly Ser Pro Gly Pro Gln Gly Ile
965 970 975
Lys Gly Glu Asn Gly Lys Pro Gly Pro Ser Gly Gln Asn Gly Glu Arg
980 985 990
Gly Pro Pro Gly Pro Gln Gly Leu Pro Gly Leu Ala Gly Thr Ala Gly
995 1000 1005
Glu Pro Gly Arg Asp Gly Asn Pro Gly Ser Asp Gly Leu Pro Gly
1010 1015 1020
Arg Asp Gly Ala Pro Gly Ala Lys Gly Asp Arg Gly Glu Asn Gly
1025 1030 1035
Ser Pro Gly Ala Pro Gly Ala Pro Gly His Pro Gly Pro Pro Gly
1040 1045 1050
Pro Val Gly Pro Ala Gly Lys Ser Gly Asp Arg Gly Glu Thr Gly
1055 1060 1065
Pro Ala Gly Pro Ser Gly Ala Pro Gly Pro Ala Gly Ser Arg Gly
1070 1075 1080
Pro Pro Gly Pro Gln Gly Pro Arg Gly Asp Lys Gly Glu Thr Gly
1085 1090 1095
Glu Arg Gly Ala Met Gly Ile Lys Gly His Arg Gly Phe Pro Gly
1100 1105 1110
Asn Pro Gly Ala Pro Gly Ser Pro Gly Pro Ala Gly His Gln Gly
1115 1120 1125
Ala Val Gly Ser Pro Gly Pro Ala Gly Pro Arg Gly Pro Val Gly
1130 1135 1140
Pro Ser Gly Pro Pro Gly Lys Asp Gly Ala Ser Gly His Pro Gly
1145 1150 1155
Pro Ile Gly Pro Pro Gly Pro Arg Gly Asn Arg Gly Glu Arg Gly
1160 1165 1170
Ser Glu Gly Ser Pro Gly His Pro Gly Gln Pro Gly Pro Pro Gly
1175 1180 1185
Pro Pro Gly Ala Pro Gly Pro Cys Cys Gly Ala Gly Gly Val Ala
1190 1195 1200
Ala Ile Ala Gly Val Gly Ala Glu Lys Ala Gly Gly Phe Ala Pro
1205 1210 1215
Tyr Tyr Gly Asp Glu Pro Ile Asp Phe Lys Ile Asn Thr Asp Glu
1220 1225 1230
Ile Met Thr Ser Leu Lys Ser Val Asn Gly Gln Ile Glu Ser Leu
1235 1240 1245
Ile Ser Pro Asp Gly Ser Arg Lys Asn Pro Ala Arg Asn Cys Arg
1250 1255 1260
Asp Leu Lys Phe Cys His Pro Glu Leu Gln Ser Gly Glu Tyr Trp
1265 1270 1275
Val Asp Pro Asn Gln Gly Cys Lys Leu Asp Ala Ile Lys Val Tyr
1280 1285 1290
Cys Asn Met Glu Thr Gly Glu Thr Cys Ile Ser Ala Ser Pro Leu
1295 1300 1305
Thr Ile Pro Gln Lys Asn Trp Trp Thr Asp Ser Gly Ala Glu Lys
1310 1315 1320
Lys His Val Trp Phe Gly Glu Ser Met Glu Gly Gly Phe Gln Phe
1325 1330 1335
Ser Tyr Gly Asn Pro Glu Leu Pro Glu Asp Val Leu Asp Val Gln
1340 1345 1350
Leu Ala Phe Leu Arg Leu Leu Ser Ser Arg Ala Ser Gln Asn Ile
1355 1360 1365
Thr Tyr His Cys Lys Asn Ser Ile Ala Tyr Met Asp His Ala Ser
1370 1375 1380
Gly Asn Val Lys Lys Ala Leu Lys Leu Met Gly Ser Asn Glu Gly
1385 1390 1395
Glu Phe Lys Ala Glu Gly Asn Ser Lys Phe Thr Tyr Thr Val Leu
1400 1405 1410
Glu Asp Gly Cys Thr Lys His Thr Gly Glu Trp Gly Lys Thr Val
1415 1420 1425
Phe Gln Tyr Gln Thr Arg Lys Ala Val Arg Leu Pro Ile Val Asp
1430 1435 1440
Ile Ala Pro Tyr Asp Ile Gly Gly Pro Asp Gln Glu Phe Gly Ala
1445 1450 1455
Asp Ile Gly Pro Val Cys Phe Leu
1460 1465
<210> 5
<211> 940
<212> DNA
<213> Artificial Sequence
<220>
<223> pAOX1 (Sequence 3)
<400> 5
agatctaaca tccaaagacg aaaggttgaa tgaaaccttt ttgccatccg acatccacag 60
gtccattctc acacataagt gccaaacgca acaggagggg atacactagc agcagaccgt 120
tgcaaacgca ggacctccac tcctcttctc ctcaacaccc acttttgcca tcgaaaaacc 180
agcccagtta ttgggcttga ttggagctcg ctcattccaa ttccttctat taggctacta 240
acaccatgac tttattagcc tgtctatcct ggcccccctg gcgaggttca tgtttgttta 300
tttccgaatg caacaagctc cgcattacac ccgaacatca ctccagatga gggctttctg 360
agtgtggggt caaatagttt catgttcccc aaatggccca aaactgacag tttaaacgct 420
gtcttggaac ctaatatgac aaaagcgtga tctcatccaa gatgaactaa gtttggttcg 480
ttgaaatgct aacggccagt tggtcaaaaa gaaacttcca aaagtcggca taccgtttgt 540
cttgtttggt attgattgac gaatgctcaa aaataatctc attaatgctt agcgcagtct 600
ctctatcgct tctgaacccc ggtgcacctg tgccgaaacg caaatgggga aacacccgct 660
ttttggatga ttatgcattg tctccacatt gtatgcttcc aagattctgg tgggaatact 720
gctgatagcc taacgttcat gatcaaaatt taactgttct aacccctact tgacagcaat 780
atataaacag aaggaagctg ccctgtctta aacctttttt tttatcatca ttattagctt 840
actttcataa ttgcgactgg ttccaattga caagcttttg attttaacga cttttaacga 900
caacttgaga agatcaaaaa acaactaatt attcgaaacg 940
<210> 6
<211> 1612
<212> DNA
<213> Artificial Sequence
<220>
<223> Bovine P4HA cDNA Optimized (Sequence 4)
<400> 6
atgatttggt atatcctagt cgttggtatt ttgttgccac agtcactggc tcacccaggc 60
ttcttcactt ctataggaca gatgactgat ttgattcaca cagaaaaaga cctagttaca 120
agccttaaag actatatcaa agctgaagag gataagttgg agcaaatcaa aaagtgggca 180
gagaaactcg atagattgac tagtactgca acaaaagatc ctgagggttt tgtgggtcac 240
ccagtgaatg ctttcaagct gatgaagaga cttaatacag agtggtcaga attggaaaac 300
ttggtactta aagatatgag tgatggattc atttctaact taacaattca aagacaatac 360
tttccaaacg atgaggacca agtaggagca gcaaaagctt tgttgcgatt gcaggacaca 420
tacaatttgg acaccgacac gatatcgaag ggtgatttac ctggtgtgaa gcataagtcc 480
ttcctcactg tggaagattg ttttgaattg ggaaaagtcg catatacaga agccgactac 540
tatcacacag aattatggat ggagcaagct ctgcgtcagt tggacgaagg tgaagtttct 600
accgttgata aggtttcagt tttggattac ttatcatacg ctgtttacca gcaaggtgat 660
ctggacaaag ctctactttt aactaaaaag ttgttggagc tggacccgga gcatcaaaga 720
gctaacggta atctgaaata ctttgaatac atcatggcta aggaaaagga cgcaaataag 780
tcctcgtccg atgaccaatc cgatcaaaag accactctga aaaaaaaagg tgcagctgtt 840
gactacctcc cagagagaca aaagtatgaa atgctgtgta gaggagaggg tatcaagatg 900
actccaagga gacagaaaaa gctgttctgt agatatcatg atgggaaccg taacccaaaa 960
ttcattcttg ctccagcgaa acaggaagat gaatgggaca agcctagaat cattcgtttt 1020
catgacatca tctccgatgc agaaatagag gttgtgaaag acttggccaa accaagattg 1080
agtagggcta ccgtccatga ccctgagact ggaaaattga ctaccgcaca atatcgtgtc 1140
tctaaatcag catggttgtc cggttacgag aatcccgtgg tcagccgtat caatatgcgt 1200
attcaagatt tgactggtct tgacgtaagc actgctgagg aactacaagt tgccaactat 1260
ggtgtgggcg gtcagtatga accccacttt gatttcgcca gaaaggacga gcctgatgct 1320
tttaaggagc taggtactgg aaatagaatc gcaacgtggt tgttctatat gtccgatgtg 1380
cttgctggag gagccacagt tttccctgag gtaggtgctt ctgtttggcc taaaaagggc 1440
acggccgtat tttggtacaa tctgtttgca tctggagaag gtgattacag cactagacat 1500
gctgcttgtc ccgtcttagt cggtaataag tgggtttcca ataagtggct gcatgagaga 1560
ggtcaagagt ttaggaggcc atgcacattg tcagaattag aatgataatt tt 1612
<210> 7
<211> 1750
<212> DNA
<213> Artificial Sequence
<220>
<223> Bovine P4HB (PDI) sequence, with Alpha pre-pro signal sequence (Sequence 5)
<400> 7
aaaatgagat tcccatctat tttcaccgct gtcttgttcg ctgcctcctc tgcattggct 60
gcccctgtta acactaccac tgaagacgag actgctcaaa ttccagctga agcagttatc 120
ggttactctg accttgaggg tgatttcgac gtcgctgttt tgcctttctc taactccact 180
aacaacggtt tgttgttcat taacaccact atcgcttcca ttgctgctaa ggaagagggt 240
gtctctctcg agaaaagaga ggccgaagct gcacccgatg aggaagatca tgttttagta 300
ttgcataaag gaaatttcga tgaagctttg gccgctcaca aatatctgct cgtcgagttt 360
tacgctccct ggtgcggtca ttgtaaggcc cttgcaccag agtacgccaa ggcagctggt 420
aagttaaagg ccgaaggttc agagatcaga ttagcaaaag ttgatgctac agaagagtcc 480
gatcttgctc aacaatacgg ggttcgagga tacccaacaa ttaagttttt caaaaatggt 540
gatactgctt ccccaaagga atatactgct ggtagagagg cagacgacat agtcaactgg 600
ctcaaaaaga gaacgggccc agctgcgtct acattaagcg acggagcagc agccgaagct 660
cttgtggaat ctagtgaagt tgctgtaatc ggtttcttta aggacatgga atctgattca 720
gctaaacagt tccttttagc agctgaagca atcgatgaca tccctttcgg aatcacctca 780
aatagtgacg tgttcagcaa gtaccaactt gacaaagatg gagtggtctt gttcaaaaag 840
tttgacgaag gcagaaacaa tttcgagggt gaggttacaa aggagaaact gcttgatttc 900
attaaacata accaactacc cttagttatc gaattcactg aacaaactgc tcctaagatt 960
ttcggtggag aaatcaaaac acatatcttg ttgtttttgc caaagtccgt atcggattat 1020
gaaggtaaac tctccaattt caaaaaggcc gctgagagct ttaagggcaa gattttgttc 1080
atctttattg actcagacca cacagacaat cagaggattt tggagttttt cggtttgaaa 1140
aaggaggaat gtccagcagt ccgtttgatc accttggagg aggagatgac caaatacaaa 1200
ccagagtcgg atgagttgac tgccgagaag ataacagaat tttgtcacag atttctggaa 1260
ggtaagatca agcctcatct tatgtctcaa gagttgcctg atgactggga taagcaacca 1320
gttaaagtat tggtgggtaa aaactttgag gaagtggcct tcgacgagaa aaaaaatgtc 1380
tttgttgaat tctatgctcc gtggtgtggt cactgtaagc agctggcacc aatttgggat 1440
aaactgggtg aaacttacaa agatcacgaa aacattgtta ttgcaaagat ggacagtact 1500
gctaacgaag tggaggctgt gaaagttcac tccttcccta cgctgaagtt ctttcctgca 1560
tctgctgaca gaactgttat cgactataat ggagagagga cattggatgg ttttaaaaag 1620
tttcttgaat ccggaggtca agacggagct ggtgacgacg atgatttgga agatctggag 1680
gaggctgagg aacctgatct tgaggaggat gacgaccaga aggcagtcaa agatgaactg 1740
tgataagggg 1750
<210> 8
<211> 7479
<212> DNA
<213> Artificial Sequence
<220>
<223> Collagen expression vectors - pDF-Col3A1 (Sequence 6)
<400> 8
ggatccttca gtaatgtctt gtttcttttg ttgcagtggt gagccatttt gacttcgtga 60
aagtttcttt agaatagttg tttccagagg ccaaacattc cacccgtagt aaagtgcaag 120
cgtaggaaga ccaagactgg cataaatcag gtataagtgt cgagcactgg caggtgatct 180
tctgaaagtt tctactagca gataagatcc agtagtcatg catatggcaa caatgtaccg 240
tgtggatcta agaacgcgtc ctactaacct tcgcattcgt tggtccagtt tgttgttatc 300
gatcaacgtg acaaggttgt cgattccgcg taagcatgca tacccaagga cgcctgttgc 360
aattccaagt gagccagttc caacaatctt tgtaatatta gagcacttca ttgtgttgcg 420
cttgaaagta aaatgcgaac aaattaagag ataatctcga aaccgcgact tcaaacgcca 480
atatgatgtg cggcacacaa taagcgttca tatccgctgg gtgactttct cgctttaaaa 540
aattatccga aaaaattttc tagagtgttg ttactttata cttccggctc gtataatacg 600
acaaggtgta aggaggacta aaccatggct aaactcacct ctgctgttcc agtcctgact 660
gctcgtgatg ttgctggtgc tgttgagttc tggactgata ggctcggttt ctcccgtgac 720
ttcgtagagg acgactttgc cggtgttgta cgtgacgacg ttaccctgtt catctccgca 780
gttcaggacc aggttgtgcc agacaacact ctggcatggg tatgggttcg tggtctggac 840
gaactgtacg ctgagtggtc tgaggtcgtg tctaccaact tccgtgatgc atctggtcca 900
gctatgaccg agatcggtga acagccctgg ggtcgtgagt ttgcactgcg tgatccagct 960
ggtaactgcg tgcatttcgt cgcagaagag caggactaac aattgacacc ttacgattat 1020
ttagagagta tttattagtt ttattgtatg tatacggatg ttttattatc tatttatgcc 1080
cttatattct gtaactatcc aaaagtccta tcttatcaag ccagcaatct atgtccgcga 1140
acgtcaacta aaaataagct ttttatgctc ttctctcttt ttttcccttc ggtataatta 1200
taccttgcat ccacagattc tcctgccaaa ttttgcataa tcctttacaa catggctata 1260
tgggagcact tagcgccctc caaaacccat attgcctacg catgtatagg tgttttttcc 1320
acaatatttt ctctgtgctc tctttttatt aaagagaagc tctatatcgg agaagcttct 1380
gtggccgtta tattcggcct tatcgtggga ccacattgcc tgaattggtt tgccccggaa 1440
gattggggaa acttggatct gattacctta gctgcagaaa agggtaccac tgagcgtcag 1500
accccgtaga aaagatcaaa ggatcttctt gagatccttt ttttctgcgc gtaatctgct 1560
gcttgcaaac aaaaaaacca ccgctaccag cggtggtttg tttgccggat caagagctac 1620
caactctttt tccgaaggta actggcttca gcagagcgca gataccaaat actgttcttc 1680
tagtgtagcc gtagttaggc caccacttca agaactctgt agcaccgcct acatacctcg 1740
ctctgctaat cctgttacca gtggctgctg ccagtggcga taagtcgtgt cttaccgggt 1800
tggacccaag acgatagtta ccggataagg cgcagcggtc gggctgaacg gggggttcgt 1860
gcacacagcc cagcttggag cgaacgacct acaccgaact gagataccta cagcgtgagc 1920
tatgagaaag cgccacgctt cccgaaggga gaaaggcgga caggtatccg gtaagcggca 1980
gggtcggaac aggagagcgc acgagggagc ttccaggggg aaacgcctgg tatctttata 2040
gtcctgtcgg gtttcgccac ctctgacttg agcgtcgatt tttgtgatgc tcgtcagggg 2100
ggcggagcct atggaaaaac gccagcaacg cggccttttt acggttcctg gccttttgct 2160
ggccttttgc tcacatgtat ttaaataatg tatctaaacg caaactccga gctggaaaaa 2220
tgttaccggc gatgcgcgga caatttagag gcggcgatca agaaacacct gctgggcgag 2280
cagtctggag cacagtcttc gatgggcccg agatcccacc gcgttcctgg gtaccgggac 2340
gtgaggcagc gcgacatcca tcaaatatac caggcgccaa ccgagtctct cggaaaacag 2400
cttctggata tcttccgctg gcggcgcaac gacgaataat agtccctgga ggtgacggaa 2460
tatatatgtg tggagggtaa atctgacagg gtgtagcaaa ggtaatattt tcctaaaaca 2520
tgcaatcggc tgccccgcaa cgggaaaaag aatgactttg gcactcttca ccagagtggg 2580
gtgtcccgct cgtgtgtgca aataggctcc cactggtcac cccggatttt gcagaaaaac 2640
agcaagttcc ggggtgtctc actggtgtcc gccaataaga ggagccggca ggcacggagt 2700
ctacatcaag ctgtctccga tacactcgac taccatccgg gtctctcaga gaggggaatg 2760
gcactataaa taccgcctcc ttgcgctctc tgccttcatc aatcaaatca tgatgtcttt 2820
tgtccaaaag ggtacttggt tactttttgc tctgttgcac ccaactgtta ttctcgcaca 2880
acaggaagca gtagatggtg gttgctcaca tttaggtcaa tcttacgcag atagagatgt 2940
atggaaacct gaaccatgtc aaatttgcgt gtgtgactca ggttcagtgc tctgcgacga 3000
tatcatatgt gacgaccagg aattggactg tccaaaccca gagataccat tcggtgaatg 3060
ttgtgctgtt tgtccacagc caccaactgc tcctacaaga cctccaaacg gtcaaggtcc 3120
acaaggtcct aaaggtgatc cgggtccacc tggtattcct ggtagaaatg gtgaccctgg 3180
acctcccggt tccccaggta gcccaggatc acctgggcct cctggaatat gtgaatcctg 3240
cccaactggt ggtcagaact atagcccaca atacgaggcc tacgacgtca aatctggtgt 3300
tgctggagga ggtattgcag gctaccctgg tcccgcaggg cccccaggtc cgccgggtcc 3360
gcccggaaca tcaggtcatc ccggagcccc tggtgcacca ggttatcagg gaccgcccgg 3420
agagcctgga caagctggtc ccgctggacc ccctggtcca ccaggtgcta ttggaccaag 3480
tggtcctgcc ggaaaagacg gtgaatccgg tagacctggt agacccggcg aaaggggttt 3540
cccaggtcct cccggaatga agggtccagc cggtatgccc ggttttcctg ggatgaaggg 3600
tcacagagga tttgatggta gaaacggaga gaaaggcgaa accggtgctc ccggactgaa 3660
gggtgaaaac ggtgtccctg gtgagaacgg cgctcctgga cctatgggtc cacgtggtgc 3720
tccaggagaa agaggcagac caggattgcc tggtgcagct ggtgctagag gtaacgatgg 3780
tgcccgtggt tccgatggac aacccgggcc acccggccct ccaggtaccg ctggatttcc 3840
tggaagccct ggtgctaagg gggaggttgg tccggctggt agtcccggaa gtagcggtgc 3900
cccaggtcaa agaggcgaac caggccctca gggtcacgca ggagcacctg gaccgcctgg 3960
tcctcctggt tcgaatggtt cgcctggagg aaaaggtgaa atggggcccg caggaatccc 4020
cggtgcgcct ggtcttattg gtgccagggg tcctccaggc ccgccaggta caaatggtgt 4080
acccggacag cgaggagcag ctggtgaacc tggtaaaaac ggtgccaaag gagatccagg 4140
tcctcgtgga gagcgtggtg aagctggctc tcccggtatc gccggtccaa aaggtgagga 4200
cggtaaggac ggttcccctg gtgagccagg tgcgaacgga ctgccaggtg cagccggaga 4260
gcgaggagtc ccaggattca ggggaccagc cggtgctaac ggcttgcctg gtgaaaaagg 4320
gccccctggt gataggggag gacccggtcc agcaggccct cgtggagttg ctggtgagcc 4380
tggacgtgac ggtttaccag gagggccagg tttgaggggt attcccgggt cccctggcgg 4440
tcctggatcg gatggaaaac cagggccacc aggttcgcag ggtgaaacag gacgtccagg 4500
cccacccggc tcacctggtc caaggggtca gcctggtgtc atgggtttcc ccggtccaaa 4560
gggtaatgac ggagcaccgg gtaaaaatgg tgaacgtggt ggcccaggtg gtccaggacc 4620
ccaaggtcca gctggaaaaa acggtgagac aggtcctcaa ggacctccag gacctaccgg 4680
tcctagcgga gataagggag atacgggacc gccaggacct caaggattgc aaggtttgcc 4740
tggtacatct ggccctcccg gagaaaatgg taagcctgga gagccaggac caaaaggcga 4800
agctggagcc ccaggtatcc ccggaggtaa gggagactca ggtgctccgg gtgagcgtgg 4860
tcctccgggt gccggtggtc cacctggacc tagaggtggt gccgggccgc caggtcctga 4920
aggtggtaaa ggtgctgctg gtccaccggg accgcctggc tctgctggta ctcctggctt 4980
gcagggaatg ccaggagaga gaggtggacc tggaggtccc ggtccgaagg gtgataaagg 5040
ggagccagga tcatccggtg ttgacggcgc acctggtaaa gacggaccaa ggggaccaac 5100
gggtccaatc ggaccaccag gacccgctgg ccagccagga gataaaggcg agtccggagc 5160
acccggtgtt cctggtatag ctggacccag gggtggtccc ggtgaaagag gtgaacaggg 5220
cccaccgggt cccgccggtt tccctggcgc ccctggtcaa aatggagaac caggtgcaaa 5280
gggcgagaga ggagccccag gagaaaaggg tgagggagga ccacccggtg ctgccggtcc 5340
agctgggggt tcaggtcctg ctggaccacc aggtccacag ggcgttaaag gtgagagagg 5400
aagtccaggt ggtcctggag ctgctggatt cccaggtggc cgtggacctc ctggtccccc 5460
tggatcgaat ggtaatcctg gtccgccagg tagttcgggt gctcctggga aggacggtcc 5520
acctggcccc ccaggtagta acggtgcacc tggtagtcca ggtatatccg gacctaaagg 5580
agattccggt ccaccaggcg aaagaggggc cccaggccca cagggtccac caggagcccc 5640
cggtcctctg ggtattgctg gtcttactgg tgcacgtgga ctggccggtc cacccggaat 5700
gcctggagca agaggttcac ctggaccaca aggtattaaa ggagagaacg gtaaacctgg 5760
accttccggt caaaacggag agcggggacc cccaggcccc caaggtctgc caggactagc 5820
tggtaccgca ggggaaccag gaagagatgg aaatccaggt tcagacggac tacccggtag 5880
agatggtgca ccgggggcca agggcgacag gggtgagaat ggatctcctg gtgcgccagg 5940
ggcaccaggc cacccaggtc ccccaggtcc tgtgggccct gctggaaagt caggtgacag 6000
gggagagaca ggcccggctg gtccatctgg cgcacccgga ccagctggtt ccagaggccc 6060
acctggtccg caaggcccta gaggtgacaa gggagagact ggagaacgag gtgctatggg 6120
tatcaagggt catagaggtt ttccgggtaa tcccggcgcc ccaggttctc ctggtccagc 6180
tggccatcaa ggtgcagtcg gatcgcccgg cccagccggt cccaggggcc ctgttggtcc 6240
atccggtcct ccaggaaagg atggtgcttc tggacaccca ggacctatcg gacctccggg 6300
tcctagaggt aatagaggag aacgtggatc cgagggtagt cctggtcacc ctggtcaacc 6360
tggcccacca gggcctccag gtgcacccgg tccatgttgt ggtgcaggcg gtgtggctgc 6420
aattgctggt gtgggtgctg aaaaggccgg cggtttcgct ccatattatg gtgatgaacc 6480
gattgatttt aagatcaata ctgacgaaat catgacttcc ttaaagtccg ttaatggtca 6540
aattgagtct ctaatctccc cagatggttc acgtaaaaat cctgctagaa attgtagaga 6600
tttgaagttt tgtcaccccg agttgcagtc cggtgagtac tgggtggacc ccaatcaagg 6660
ttgtaagtta gacgctatta aagtttactg caatatggag acaggagaaa cttgcatcag 6720
cgcttctcca ttgactatcc cacaaaaaaa ttggtggact gactctggag ctgagaaaaa 6780
gcatgtatgg ttcggggaat cgatggaagg tggtttccaa ttcagctacg gtaaccctga 6840
acttcctgaa gatgttcttg acgttcaatt ggcatttctg agattgttgt ccagtcgtgc 6900
aagccaaaac attacatacc attgcaaaaa ttccatcgca tatatggatc atgctagcgg 6960
aaatgtgaaa aaggcattga agctgatggg atcaaatgaa ggtgaattta aagcagaggg 7020
taattctaag tttacttaca ctgtattgga ggatggttgt acgaagcata caggtgaatg 7080
gggtaaaaca gtgtttcaat atcaaacccg caaagcagtt agattgccaa tcgtcgatat 7140
cgcaccatac gacattggag gaccagatca agagttcgga gctgacatcg gtccggtgtg 7200
tttcctttga taatcaagag gatgtcagaa tgccatttgc ctgagagatg caggcttcat 7260
ttttgatact tttttatttg taacctatat agtataggat tttttttgtc attttgtttc 7320
ttctcgtacg agcttgctcc tgatcagcct atctcgcagc tgatgaatat cttgtggtag 7380
gggtttggga aaatcattcg agtttgatgt ttttcttggt atttcccact cctcttcaga 7440
gtacagaaga ttaagtgaga cgttcgtttg tgctccgga 7479
<210> 9
<211> 7356
<212> DNA
<213> Artificial Sequence
<220>
<223> Collagen expression vectors - pCAT1-Col3A1 (Sequence 7)
<400> 9
ggatccttca gtaatgtctt gtttcttttg ttgcagtggt gagccatttt gacttcgtga 60
aagtttcttt agaatagttg tttccagagg ccaaacattc cacccgtagt aaagtgcaag 120
cgtaggaaga ccaagactgg cataaatcag gtataagtgt cgagcactgg caggtgatct 180
tctgaaagtt tctactagca gataagatcc agtagtcatg catatggcaa caatgtaccg 240
tgtggatcta agaacgcgtc ctactaacct tcgcattcgt tggtccagtt tgttgttatc 300
gatcaacgtg acaaggttgt cgattccgcg taagcatgca tacccaagga cgcctgttgc 360
aattccaagt gagccagttc caacaatctt tgtaatatta gagcacttca ttgtgttgcg 420
cttgaaagta aaatgcgaac aaattaagag ataatctcga aaccgcgact tcaaacgcca 480
atatgatgtg cggcacacaa taagcgttca tatccgctgg gtgactttct cgctttaaaa 540
aattatccga aaaaattttc tagagtgttg ttactttata cttccggctc gtataatacg 600
acaaggtgta aggaggacta aaccatggct aaactcacct ctgctgttcc agtcctgact 660
gctcgtgatg ttgctggtgc tgttgagttc tggactgata ggctcggttt ctcccgtgac 720
ttcgtagagg acgactttgc cggtgttgta cgtgacgacg ttaccctgtt catctccgca 780
gttcaggacc aggttgtgcc agacaacact ctggcatggg tatgggttcg tggtctggac 840
gaactgtacg ctgagtggtc tgaggtcgtg tctaccaact tccgtgatgc atctggtcca 900
gctatgaccg agatcggtga acagccctgg ggtcgtgagt ttgcactgcg tgatccagct 960
ggtaactgcg tgcatttcgt cgcagaagag caggactaac aattgacacc ttacgattat 1020
ttagagagta tttattagtt ttattgtatg tatacggatg ttttattatc tatttatgcc 1080
cttatattct gtaactatcc aaaagtccta tcttatcaag ccagcaatct atgtccgcga 1140
acgtcaacta aaaataagct ttttatgctc ttctctcttt ttttcccttc ggtataatta 1200
taccttgcat ccacagattc tcctgccaaa ttttgcataa tcctttacaa catggctata 1260
tgggagcact tagcgccctc caaaacccat attgcctacg catgtatagg tgttttttcc 1320
acaatatttt ctctgtgctc tctttttatt aaagagaagc tctatatcgg agaagcttct 1380
gtggccgtta tattcggcct tatcgtggga ccacattgcc tgaattggtt tgccccggaa 1440
gattggggaa acttggatct gattacctta gctgcagaaa agggtaccac tgagcgtcag 1500
accccgtaga aaagatcaaa ggatcttctt gagatccttt ttttctgcgc gtaatctgct 1560
gcttgcaaac aaaaaaacca ccgctaccag cggtggtttg tttgccggat caagagctac 1620
caactctttt tccgaaggta actggcttca gcagagcgca gataccaaat actgttcttc 1680
tagtgtagcc gtagttaggc caccacttca agaactctgt agcaccgcct acatacctcg 1740
ctctgctaat cctgttacca gtggctgctg ccagtggcga taagtcgtgt cttaccgggt 1800
tggacccaag acgatagtta ccggataagg cgcagcggtc gggctgaacg gggggttcgt 1860
gcacacagcc cagcttggag cgaacgacct acaccgaact gagataccta cagcgtgagc 1920
tatgagaaag cgccacgctt cccgaaggga gaaaggcgga caggtatccg gtaagcggca 1980
gggtcggaac aggagagcgc acgagggagc ttccaggggg aaacgcctgg tatctttata 2040
gtcctgtcgg gtttcgccac ctctgacttg agcgtcgatt tttgtgatgc tcgtcagggg 2100
ggcggagcct atggaaaaac gccagcaacg cggccttttt acggttcctg gccttttgct 2160
ggccttttgc tcacatgtat ttaaattaat cgaactccga atgcggttct cctgtaacct 2220
taattgtagc atagatcact taaataaact catggcctga catctgtaca cgttcttatt 2280
ggtcttttag caatcttgaa gtctttctat tgttccggtc ggcattacct aataaattcg 2340
aatcgagatt gctagtacct gatatcatat gaagtaatca tcacatgcaa gttccatgat 2400
accctctact aatggaattg aacaaagttt aagcttctcg cacgagaccg aatccatact 2460
atgcacccct caaagttggg attagtcagg aaagctgagc aattaacttc cctcgattgg 2520
cctggacttt tcgcttagcc tgccgcaatc ggtaagtttc attatcccag cggggtgata 2580
gcctctgttg ctcatcaggc caaaatcata tataagctgt agacccagca cttcaattac 2640
ttgaaattca ccataacact tgctctagtc aagacttaca attaaaatga tgtcttttgt 2700
ccaaaagggt acttggttac tttttgctct gttgcaccca actgttattc tcgcacaaca 2760
ggaagcagta gatggtggtt gctcacattt aggtcaatct tacgcagata gagatgtatg 2820
gaaacctgaa ccatgtcaaa tttgcgtgtg tgactcaggt tcagtgctct gcgacgatat 2880
catatgtgac gaccaggaat tggactgtcc aaacccagag ataccattcg gtgaatgttg 2940
tgctgtttgt ccacagccac caactgctcc tacaagacct ccaaacggtc aaggtccaca 3000
aggtcctaaa ggtgatccgg gtccacctgg tattcctggt agaaatggtg accctggacc 3060
tcccggttcc ccaggtagcc caggatcacc tgggcctcct ggaatatgtg aatcctgccc 3120
aactggtggt cagaactata gcccacaata cgaggcctac gacgtcaaat ctggtgttgc 3180
tggaggaggt attgcaggct accctggtcc cgcagggccc ccaggtccgc cgggtccgcc 3240
cggaacatca ggtcatcccg gagcccctgg tgcaccaggt tatcagggac cgcccggaga 3300
gcctggacaa gctggtcccg ctggaccccc tggtccacca ggtgctattg gaccaagtgg 3360
tcctgccgga aaagacggtg aatccggtag acctggtaga cccggcgaaa ggggtttccc 3420
aggtcctccc ggaatgaagg gtccagccgg tatgcccggt tttcctggga tgaagggtca 3480
cagaggattt gatggtagaa acggagagaa aggcgaaacc ggtgctcccg gactgaaggg 3540
tgaaaacggt gtccctggtg agaacggcgc tcctggacct atgggtccac gtggtgctcc 3600
aggagaaaga ggcagaccag gattgcctgg tgcagctggt gctagaggta acgatggtgc 3660
ccgtggttcc gatggacaac ccgggccacc cggccctcca ggtaccgctg gatttcctgg 3720
aagccctggt gctaaggggg aggttggtcc ggctggtagt cccggaagta gcggtgcccc 3780
aggtcaaaga ggcgaaccag gccctcaggg tcacgcagga gcacctggac cgcctggtcc 3840
tcctggttcg aatggttcgc ctggaggaaa aggtgaaatg gggcccgcag gaatccccgg 3900
tgcgcctggt cttattggtg ccaggggtcc tccaggcccg ccaggtacaa atggtgtacc 3960
cggacagcga ggagcagctg gtgaacctgg taaaaacggt gccaaaggag atccaggtcc 4020
tcgtggagag cgtggtgaag ctggctctcc cggtatcgcc ggtccaaaag gtgaggacgg 4080
taaggacggt tcccctggtg agccaggtgc gaacggactg ccaggtgcag ccggagagcg 4140
aggagtccca ggattcaggg gaccagccgg tgctaacggc ttgcctggtg aaaaagggcc 4200
ccctggtgat aggggaggac ccggtccagc aggccctcgt ggagttgctg gtgagcctgg 4260
acgtgacggt ttaccaggag ggccaggttt gaggggtatt cccgggtccc ctggcggtcc 4320
tggatcggat ggaaaaccag ggccaccagg ttcgcagggt gaaacaggac gtccaggccc 4380
acccggctca cctggtccaa ggggtcagcc tggtgtcatg ggtttccccg gtccaaaggg 4440
taatgacgga gcaccgggta aaaatggtga acgtggtggc ccaggtggtc caggacccca 4500
aggtccagct ggaaaaaacg gtgagacagg tcctcaagga cctccaggac ctaccggtcc 4560
tagcggagat aagggagata cgggaccgcc aggacctcaa ggattgcaag gtttgcctgg 4620
tacatctggc cctcccggag aaaatggtaa gcctggagag ccaggaccaa aaggcgaagc 4680
tggagcccca ggtatccccg gaggtaaggg agactcaggt gctccgggtg agcgtggtcc 4740
tccgggtgcc ggtggtccac ctggacctag aggtggtgcc gggccgccag gtcctgaagg 4800
tggtaaaggt gctgctggtc caccgggacc gcctggctct gctggtactc ctggcttgca 4860
gggaatgcca ggagagagag gtggacctgg aggtcccggt ccgaagggtg ataaagggga 4920
gccaggatca tccggtgttg acggcgcacc tggtaaagac ggaccaaggg gaccaacggg 4980
tccaatcgga ccaccaggac ccgctggcca gccaggagat aaaggcgagt ccggagcacc 5040
cggtgttcct ggtatagctg gacccagggg tggtcccggt gaaagaggtg aacagggccc 5100
accgggtccc gccggtttcc ctggcgcccc tggtcaaaat ggagaaccag gtgcaaaggg 5160
cgagagagga gccccaggag aaaagggtga gggaggacca cccggtgctg ccggtccagc 5220
tgggggttca ggtcctgctg gaccaccagg tccacagggc gttaaaggtg agagaggaag 5280
tccaggtggt cctggagctg ctggattccc aggtggccgt ggacctcctg gtccccctgg 5340
atcgaatggt aatcctggtc cgccaggtag ttcgggtgct cctgggaagg acggtccacc 5400
tggcccccca ggtagtaacg gtgcacctgg tagtccaggt atatccggac ctaaaggaga 5460
ttccggtcca ccaggcgaaa gaggggcccc aggcccacag ggtccaccag gagcccccgg 5520
tcctctgggt attgctggtc ttactggtgc acgtggactg gccggtccac ccggaatgcc 5580
tggagcaaga ggttcacctg gaccacaagg tattaaagga gagaacggta aacctggacc 5640
ttccggtcaa aacggagagc ggggaccccc aggcccccaa ggtctgccag gactagctgg 5700
taccgcaggg gaaccaggaa gagatggaaa tccaggttca gacggactac ccggtagaga 5760
tggtgcaccg ggggccaagg gcgacagggg tgagaatgga tctcctggtg cgccaggggc 5820
accaggccac ccaggtcccc caggtcctgt gggccctgct ggaaagtcag gtgacagggg 5880
agagacaggc ccggctggtc catctggcgc acccggacca gctggttcca gaggcccacc 5940
tggtccgcaa ggccctagag gtgacaaggg agagactgga gaacgaggtg ctatgggtat 6000
caagggtcat agaggttttc cgggtaatcc cggcgcccca ggttctcctg gtccagctgg 6060
ccatcaaggt gcagtcggat cgcccggccc agccggtccc aggggccctg ttggtccatc 6120
cggtcctcca ggaaaggatg gtgcttctgg acacccagga cctatcggac ctccgggtcc 6180
tagaggtaat agaggagaac gtggatccga gggtagtcct ggtcaccctg gtcaacctgg 6240
cccaccaggg cctccaggtg cacccggtcc atgttgtggt gcaggcggtg tggctgcaat 6300
tgctggtgtg ggtgctgaaa aggccggcgg tttcgctcca tattatggtg atgaaccgat 6360
tgattttaag atcaatactg acgaaatcat gacttcctta aagtccgtta atggtcaaat 6420
tgagtctcta atctccccag atggttcacg taaaaatcct gctagaaatt gtagagattt 6480
gaagttttgt caccccgagt tgcagtccgg tgagtactgg gtggacccca atcaaggttg 6540
taagttagac gctattaaag tttactgcaa tatggagaca ggagaaactt gcatcagcgc 6600
ttctccattg actatcccac aaaaaaattg gtggactgac tctggagctg agaaaaagca 6660
tgtatggttc ggggaatcga tggaaggtgg tttccaattc agctacggta accctgaact 6720
tcctgaagat gttcttgacg ttcaattggc atttctgaga ttgttgtcca gtcgtgcaag 6780
ccaaaacatt acataccatt gcaaaaattc catcgcatat atggatcatg ctagcggaaa 6840
tgtgaaaaag gcattgaagc tgatgggatc aaatgaaggt gaatttaaag cagagggtaa 6900
ttctaagttt acttacactg tattggagga tggttgtacg aagcatacag gtgaatgggg 6960
taaaacagtg tttcaatatc aaacccgcaa agcagttaga ttgccaatcg tcgatatcgc 7020
accatacgac attggaggac cagatcaaga gttcggagct gacatcggtc cggtgtgttt 7080
cctttgataa tcaagaggat gtcagaatgc catttgcctg agagatgcag gcttcatttt 7140
tgatactttt ttatttgtaa cctatatagt ataggatttt ttttgtcatt ttgtttcttc 7200
tcgtacgagc ttgctcctga tcagcctatc tcgcagctga tgaatatctt gtggtagggg 7260
tttgggaaaa tcattcgagt ttgatgtttt tcttggtatt tcccactcct cttcagagta 7320
cagaagatta agtgagacgt tcgtttgtgc tccgga 7356
<210> 10
<211> 404
<212> DNA
<213> Artificial Sequence
<220>
<223> AOX1 landing pad (Sequence 8)
<400> 10
agaagcgata gagagactgc gctaagcatt aatgagatta tttttgagca ttcgtcaatc 60
aataccaaac aagacaaacg gtatgccgac ttttggaagt ttctttttga ccaactggcc 120
gttagcattt caacgaacca aacttagttc atcttggatg agatcacgct tttgtcatat 180
taggttccaa gacagcgttt aaactgtcag ttttgggcca tttggggaac atgaaactat 240
ttgaccccac actcagaaag ccctcatctg gagtgatgtt cgggtgtaat gcggagcttg 300
ttgcattcgg aaataaacaa acatgaacct cgccaggggg gccaggatag acaggctaat 360
aaagtcatgg tgttagtagc ctaatagaag gaattggaat gagc 404
<210> 11
<211> 7942
<212> DNA
<213> Artificial Sequence
<220>
<223> MMV63 (Sequence 9)
<400> 11
ttctttcctg cggtacccag atccaattcc cgctttgact gcctgaaatc tccatcgcct 60
acaatgatga catttggatt tggttgactc atgttggtat tgtgaaatag acgcagatcg 120
ggaacactga aaaatacaca gttattattc atttaaataa catccaaaga cgaaaggttg 180
aatgaaacct ttttgccatc cgacatccac aggtccattc tcacacataa gtgccaaacg 240
caacaggagg ggatacacta gcagcagacc gttgcaaacg caggacctcc actcctcttc 300
tcctcaacac ccacttttgc catcgaaaaa ccagcccagt tattgggctt gattggagct 360
cgctcattcc aattccttct attaggctac taacaccatg actttattag cctgtctatc 420
ctggcccccc tggcgaggtt catgtttgtt tatttccgaa tgcaacaagc tccgcattac 480
acccgaacat cactccagat gagggctttc tgagtgtggg gtcaaatagt ttcatgttcc 540
ccaaatggcc caaaactgac agtttaaacg ctgtcttgga acctaatatg acaaaagcgt 600
gatctcatcc aagatgaact aagtttggtt cgttgaaatg ctaacggcca gttggtcaaa 660
aagaaacttc caaaagtcgg cataccgttt gtcttgtttg gtattgattg acgaatgctc 720
aaaaataatc tcattaatgc ttagcgcagt ctctctatcg cttctgaacc ccggtgcacc 780
tgtgccgaaa cgcaaatggg gaaacacccg ctttttggat gattatgcat tgtctccaca 840
ttgtatgctt ccaagattct ggtgggaata ctgctgatag cctaacgttc atgatcaaaa 900
tttaactgtt ctaaccccta cttgacagca atatataaac agaaggaagc tgccctgtct 960
taaacctttt tttttatcat cattattagc ttactttcat aattgcgact ggttccaatt 1020
gacaagcttt tgattttaac gacttttaac gacaacttga gaagatcaaa aaacaactaa 1080
ttattgaaag aattcaaaac gatgagcttt gtgcaaaagg ggacctggtt acttttcgct 1140
ctgcttcatc ccactgttat tttggcacaa caggaagctg ttgacggagg atgctcccat 1200
ctcggtcagt cttatgcaga tagagatgta tggaaaccag aaccgtgcca aatatgcgtc 1260
tgtgactcag gatccgttct ctgtgatgac ataatatgtg acgaccaaga attagactgc 1320
cccaaccctg aaatcccgtt tggagaatgt tgtgcagttt gcccacagcc tccaacagct 1380
cccactcgcc ctcctaatgg tcaaggacct caaggcccca agggagatcc aggtcctcct 1440
ggtattcctg ggcgaaatgg cgatcctggt cctccaggat caccaggctc cccaggttct 1500
cccggccctc ctggaatctg tgaatcatgt cctactggtg gccagaacta ttctccccag 1560
tacgaagcat atgatgtcaa gtctggagta gcaggaggag gaatcgcagg ctatcctggg 1620
ccagctggtc ctcctggccc acccggaccc cctggcacat ctggccatcc tggtgcccct 1680
ggcgctccag gataccaagg tccccccggt gaacctgggc aagctggtcc ggcaggtcct 1740
ccaggacctc ctggtgctat aggtccatct ggccctgctg gaaaagatgg ggaatcagga 1800
agacccggac gacctggaga gcgaggattt cctggccctc ctggtatgaa aggcccagct 1860
ggtatgcctg gattccctgg tatgaaagga cacagaggct ttgatggacg aaatggagag 1920
aaaggcgaaa ctggtgctcc tggattaaag ggggaaaatg gcgttccagg tgaaaatgga 1980
gctcctggac ccatgggtcc aagaggggct cccggtgaga gaggacggcc aggacttcct 2040
ggagccgcag gggctcgagg taatgatgga gctcgaggaa gtgatggaca accgggcccc 2100
cctggtcctc ctggaactgc aggattccct ggttcccctg gtgctaaggg tgaagttgga 2160
cctgcaggat ctcctggttc aagtggcgcc cctggacaaa gaggagaacc tggacctcag 2220
ggacatgctg gtgctccagg tccccctggg cctcctggga gtaatggtag tcctggtggc 2280
aaaggtgaaa tgggtcctgc tggcattcct ggggctcctg ggctgatagg agctcgtggt 2340
cctccagggc cacctggcac caatggtgtt cccgggcaac gaggtgctgc aggtgaaccc 2400
ggtaagaatg gagccaaagg agacccagga ccacgtgggg aacgcggaga agctggttct 2460
ccaggtatcg caggacctaa gggtgaagat ggcaaagatg gttctcctgg agaacctggt 2520
gcaaatggac ttcctggagc tgcaggagaa aggggtgtgc ctggattccg aggacctgct 2580
ggagcaaatg gccttccagg agaaaagggt cctcctgggg accgtggtgg cccaggccct 2640
gcagggccca gaggtgttgc tggagagccc ggcagagatg gtctccctgg aggtccagga 2700
ttgaggggta ttcctggtag ccccggagga ccaggcagtg atgggaaacc agggcctcct 2760
ggaagccaag gagagacggg tcgacccggt cctccaggtt cacctggtcc gcgaggccag 2820
cctggtgtca tgggcttccc tggtcccaaa ggaaacgatg gtgctcctgg aaaaaatgga 2880
gaacgaggtg gccctggagg tcctggccct cagggtcctg ctggaaagaa tggtgagacc 2940
ggacctcagg gtcctccagg acctactggc ccttctggtg acaaaggaga cacaggaccc 3000
cctggtccac aaggactaca aggcttgcct ggaacgagtg gtcccccagg agaaaacgga 3060
aaacctggtg aacctggtcc aaagggtgag gctggtgcac ctggaattcc aggaggcaag 3120
ggtgattctg gtgctcccgg tgaacgcgga cctcctggag caggagggcc ccctggacct 3180
agaggtggag ctggcccccc tggtcccgaa ggaggaaagg gtgctgctgg tccccctggg 3240
ccacctggtt ctgctggtac acctggtctg caaggaatgc ctggagaaag agggggtcct 3300
ggaggccctg gtccaaaggg tgataagggt gagcctggca gctcaggtgt cgatggtgct 3360
ccagggaaag atggtccacg gggtcccact ggtcccattg gtcctcctgg cccagctggt 3420
cagcctggag ataagggtga aagtggtgcc cctggagttc cgggtatagc tggtcctcgc 3480
ggtggccctg gtgagagagg cgaacagggg cccccaggac ctgctggctt ccctggtgct 3540
cctggccaga atggtgagcc tggtgctaaa ggagaaagag gcgctcctgg tgagaaaggt 3600
gaaggaggcc ctcccggagc cgcaggaccc gccggaggtt ctgggcctgc cggtccccca 3660
ggcccccaag gtgtcaaagg cgaacgtggc agtcctggtg gtcctggtgc tgctggcttc 3720
cccggtggtc gtggtcctcc tggccctcct ggcagtaatg gtaacccagg ccccccaggc 3780
tccagtggtg ctccaggcaa agatggtccc ccaggtccac ctggcagtaa tggtgctcct 3840
ggcagccccg ggatctctgg accaaagggt gattctggtc caccaggtga gaggggagca 3900
cctggccccc agggccctcc gggagctcca ggcccactag gaattgcagg acttactgga 3960
gcacgaggtc ttgcaggccc accaggcatg ccaggtgcta ggggcagccc cggcccacag 4020
ggcatcaagg gtgaaaatgg taaaccagga cctagtggtc agaatggaga acgtggtcct 4080
cctggccccc agggtcttcc tggtctggct ggtacagctg gtgagcctgg aagagatgga 4140
aaccctggat cagatggtct gccaggccga gatggagctc caggtgccaa gggtgaccgt 4200
ggtgaaaatg gctctcctgg tgcccctgga gctcctggtc acccaggccc tcctggtcct 4260
gtcggtccag ctggaaagag cggtgacaga ggagaaactg gccctgctgg tccttctggg 4320
gcccccggtc ctgccggatc aagaggtcct cctggtcccc aaggcccacg cggtgacaaa 4380
ggggaaaccg gtgagcgtgg tgctatgggc atcaaaggac atcgcggatt ccctggcaac 4440
ccaggggccc ccggatctcc gggtcccgct ggtcatcaag gtgcagttgg cagtccaggc 4500
cctgcaggcc ccagaggacc tgttggacct agcgggcccc ctggaaagga cggagcaagt 4560
ggacaccctg gtcccattgg accaccgggg ccccgaggta acagaggtga aagaggatct 4620
gagggctccc caggccaccc aggacaacca ggccctcctg gacctcctgg tgcccctggt 4680
ccatgttgtg gtgctggcgg ggttgctgcc attgctggtg ttggagccga aaaagctggt 4740
ggttttgccc catattatgg agatgaaccg atagatttca aaatcaacac cgatgagatt 4800
atgacctcac tcaaatcagt caatggacaa atagaaagcc tcattagtcc tgatggttcc 4860
cgtaaaaacc ctgcacggaa ctgcagggac ctgaaattct gccatcctga actccagagt 4920
ggagaatatt gggttgatcc taaccaaggt tgcaaattgg atgctattaa agtctactgt 4980
aacatggaaa ctggggaaac gtgcataagt gccagtcctt tgactatccc acagaagaac 5040
tggtggacag attctggtgc tgagaagaaa catgtttggt ttggagaatc catggagggt 5100
ggttttcagt ttagctatgg caatcctgaa cttcccgaag acgtcctcga tgtccagctg 5160
gcattcctcc gacttctctc cagccgggcc tctcagaaca tcacatatca ctgcaagaat 5220
agcattgcat acatggatca tgccagtggg aatgtaaaga aagccttgaa gctgatgggg 5280
tcaaatgaag gtgaattcaa ggctgaagga aatagcaaat tcacatacac agttctggag 5340
gatggttgca caaaacacac tggggaatgg ggcaaaacag tcttccagta tcaaacacgc 5400
aaggccgtca gactacctat tgtagatatt gcaccctatg atatcggtgg tcctgatcaa 5460
gaatttggtg cggacattgg ccctgtttgc tttttataaa ggggcggccg ctcaagagga 5520
tgtcagaatg ccatttgcct gagagatgca ggcttcattt ttgatacttt tttatttgta 5580
acctatatag tataggattt tttttgtcat tttgtttctt ctcgtacgag cttgctcctg 5640
atcagcctat ctcgcagcag atgaatatct tgtggtaggg gtttgggaaa atcattcgag 5700
tttgatgttt ttcttggtat ttcccactcc tcttcagagt acagaagatt aagtgaaacc 5760
ttcgtttgtg cggatccttc agtaatgtct tgtttctttt gttgcagtgg tgagccattt 5820
tgacttcgtg aaagtttctt tagaatagtt gtttccagag gccaaacatt ccacccgtag 5880
taaagtgcaa gcgtaggaag accaagactg gcataaatca ggtataagtg tcgagcactg 5940
gcaggtgatc ttctgaaagt ttctactagc agataagatc cagtagtcat gcatatggca 6000
acaatgtacc gtgtggatct aagaacgcgt cctactaacc ttcgcattcg ttggtccagt 6060
ttgttgttat cgatcaacgt gacaaggttg tcgattccgc gtaagcatgc atacccaagg 6120
acgcctgttg caattccaag tgagccagtt ccaacaatct ttgtaatatt agagcacttc 6180
attgtgttgc gcttgaaagt aaaatgcgaa caaattaaga gataatctcg aaaccgcgac 6240
ttcaaacgcc aatatgatgt gcggcacaca ataagcgttc atatccgctg ggtgactttc 6300
tcgctttaaa aaattatccg aaaaaatttt ctagagtgtt gttactttat acttccggct 6360
cgtataatac gacaaggtgt aaggaggact aaaccatggc taaactcacc tctgctgttc 6420
cagtcctgac tgctcgtgat gttgctggtg ctgttgagtt ctggactgat agactcggtt 6480
tctcccgtga cttcgtagag gacgactttg ccggtgttgt acgtgacgac gttaccctgt 6540
tcatctccgc agttcaggac caggttgtgc cagacaacac tctggcatgg gtatgggttc 6600
gtggtctgga cgaactgtac gctgagtggt ctgaggtcgt gtctaccaac ttccgtgatg 6660
catctggtcc agctatgacc gagatcggtg aacagccctg gggtcgtgag tttgcactgc 6720
gtgatccagc tggtaactgc gtgcatttcg tcgcagaaga gcaggactaa caattgacac 6780
cttacgatta tttagagagt atttattagt tttattgtat gtatacggat gttttattat 6840
ctatttatgc ccttatattc tgtaactatc caaaagtcct atcttatcaa gccagcaatc 6900
tatgtccgcg aacgtcaact aaaaataagc tttttatgct cttctctctt tttttccctt 6960
cggtataatt ataccttgca tccacagatt ctcctgccaa attttgcata atcctttaca 7020
acatggctat atgggagcac ttagcgccct ccaaaaccca tattgcctac gcatgtatag 7080
gtgttttttc cacaatattt tctctgtgct ctctttttat taaagagaag ctctatatcg 7140
gagaagcttc tgtggccgtt atattcggcc ttatcgtggg accacattgc ctgaattggt 7200
ttgccccgga agattgggga aacttggatc tgattacctt agctgcaggt accactgagc 7260
gtcagacccc gtagaaaaga tcaaaggatc ttcttgagat cctttttttc tgcgcgtaat 7320
ctgctgcttg caaacaaaaa aaccaccgct accagcggtg gtttgtttgc cggatcaaga 7380
gctaccaact ctttttccga aggtaactgg cttcagcaga gcgcagatac caaatactgt 7440
tcttctagtg tagccgtagt taggccacca cttcaagaac tctgtagcac cgcctacata 7500
cctcgctctg ctaatcctgt taccagtggc tgctgccagt ggcgataagt cgtgtcttac 7560
cgggttggac tcaagacgat agttaccgga taaggcgcag cggtcgggct gaacgggggg 7620
ttcgtgcaca cagcccagct tggagcgaac gacctacacc gaactgagat acctacagcg 7680
tgagctatga gaaagcgcca cgcttcccga agggagaaag gcggacaggt atccggtaag 7740
cggcagggtc ggaacaggag agcgcacgag ggagcttcca gggggaaacg cctggtatct 7800
ttatagtcct gtcgggtttc gccacctctg acttgagcgt cgatttttgt gatgctcgtc 7860
aggggggcgg agcctatgga aaaacgccag caacgcggcc tttttacggt tcctggcctt 7920
ttgctggcct tttgctcaca tg 7942
<210> 12
<211> 7954
<212> DNA
<213> Artificial Sequence
<220>
<223> MMV77 (Sequence 10)
<400> 12
ccgtagaaaa gatcaaagga tcttcttgag atcctttttt tctgcgcgta atctgctgct 60
tgcaaacaaa aaaaccaccg ctaccagcgg tggtttgttt gccggatcaa gagctaccaa 120
ctctttttcc gaaggtaact ggcttcagca gagcgcagat accaaatact gttcttctag 180
tgtagccgta gttaggccac cacttcaaga actctgtagc accgcctaca tacctcgctc 240
tgctaatcct gttaccagtg gctgctgcca gtggcgataa gtcgtgtctt accgggttgg 300
actcaagacg atagttaccg gataaggcgc agcggtcggg ctgaacgggg ggttcgtgca 360
cacagcccag cttggagcga acgacctaca ccgaactgag atacctacag cgtgagctat 420
gagaaagcgc cacgcttccc gaagggagaa aggcggacag gtatccggta agcggcaggg 480
tcggaacagg agagcgcacg agggagcttc cagggggaaa cgcctggtat ctttatagtc 540
ctgtcgggtt tcgccacctc tgacttgagc gtcgattttt gtgatgctcg tcaggggggc 600
ggagcctatg gaaaaacgcc agcaacgcgg cctttttacg gttcctggcc ttttgctggc 660
cttttgctca catgttcttt cctgcggtac ccagatccaa ttcccgcttt gactgcctga 720
aatctccatc gcctacaatg atgacatttg gatttggttg actcatgttg gtattgtgaa 780
atagacgcag atcgggaaca ctgaaaaata cacagttatt attcatttaa ataacatcca 840
aagacgaaag gttgaatgaa acctttttgc catccgacat ccacaggtcc attctcacac 900
ataagtgcca aacgcaacag gaggggatac actagcagca gaccgttgca aacgcaggac 960
ctccactcct cttctcctca acacccactt ttgccatcga aaaaccagcc cagttattgg 1020
gcttgattgg agctcgctca ttccaattcc ttctattagg ctactaacac catgacttta 1080
ttagcctgtc tatcctggcc cccctggcga ggttcatgtt tgtttatttc cgaatgcaac 1140
aagctccgca ttacacccga acatcactcc agatgagggc tttctgagtg tggggtcaaa 1200
tagtttcatg ttccccaaat ggcccaaaac tgacagttta aacgctgtct tggaacctaa 1260
tatgacaaaa gcgtgatctc atccaagatg aactaagttt ggttcgttga aatgctaacg 1320
gccagttggt caaaaagaaa cttccaaaag tcggcatacc gtttgtcttg tttggtattg 1380
attgacgaat gctcaaaaat aatctcatta atgcttagcg cagtctctct atcgcttctg 1440
aaccccggtg cacctgtgcc gaaacgcaaa tggggaaaca cccgcttttt ggatgattat 1500
gcattgtctc cacattgtat gcttccaaga ttctggtggg aatactgctg atagcctaac 1560
gttcatgatc aaaatttaac tgttctaacc cctacttgac agcaatatat aaacagaagg 1620
aagctgccct gtcttaaacc ttttttttta tcatcattat tagcttactt tcataattgc 1680
gactggttcc aattgacaag cttttgattt taacgacttt taacgacaac ttgagaagat 1740
caaaaaacaa ctaattattg aaagaattca aaacgatgat gtcttttgtc caaaagggta 1800
cttggttact ttttgctctg ttgcacccaa ctgttattct cgcacaacag gaagcagtag 1860
atggtggttg ctcacattta ggtcaatctt acgcagatag agatgtatgg aaacctgaac 1920
catgtcaaat ttgcgtgtgt gactcaggtt cagtgctctg cgacgatatc atatgtgacg 1980
accaggaatt ggactgtcca aacccagaga taccattcgg tgaatgttgt gctgtttgtc 2040
cacagccacc aactgctcct acaagacctc caaacggtca aggtccacaa ggtcctaaag 2100
gtgatccggg tccacctggt attcctggta gaaatggtga ccctggacct cccggttccc 2160
caggtagccc aggatcacct gggcctcctg gaatatgtga atcctgccca actggtggtc 2220
agaactatag cccacaatac gaggcctacg acgtcaaatc tggtgttgct ggaggaggta 2280
ttgcaggcta ccctggtccc gcagggcccc caggtccgcc gggtccgccc ggaacatcag 2340
gtcatcccgg agcccctggt gcaccaggtt atcagggacc gcccggagag cctggacaag 2400
ctggtcccgc tggaccccct ggtccaccag gtgctattgg accaagtggt cctgccggaa 2460
aagacggtga atccggtaga cctggtagac ccggcgaaag gggtttccca ggtcctcccg 2520
gaatgaaggg tccagccggt atgcccggtt ttcctgggat gaagggtcac agaggatttg 2580
atggtagaaa cggagagaaa ggcgaaaccg gtgctcccgg actgaagggt gaaaacggtg 2640
tccctggtga gaacggcgct cctggaccta tgggtccacg tggtgctcca ggagaaagag 2700
gcagaccagg attgcctggt gcagctggtg ctagaggtaa cgatggtgcc cgtggttccg 2760
atggacaacc cgggccaccc ggccctccag gtaccgctgg atttcctgga agccctggtg 2820
ctaaggggga ggttggtccg gctggtagtc ccggaagtag cggtgcccca ggtcaaagag 2880
gcgaaccagg ccctcagggt cacgcaggag cacctggacc gcctggtcct cctggttcga 2940
atggttcgcc tggaggaaaa ggtgaaatgg ggcccgcagg aatccccggt gcgcctggtc 3000
ttattggtgc caggggtcct ccaggcccgc caggtacaaa tggtgtaccc ggacagcgag 3060
gagcagctgg tgaacctggt aaaaacggtg ccaaaggaga tccaggtcct cgtggagagc 3120
gtggtgaagc tggctctccc ggtatcgccg gtccaaaagg tgaggacggt aaggacggtt 3180
cccctggtga gccaggtgcg aacggactgc caggtgcagc cggagagcga ggagtcccag 3240
gattcagggg accagccggt gctaacggct tgcctggtga aaaagggccc cctggtgata 3300
ggggaggacc cggtccagca ggccctcgtg gagttgctgg tgagcctgga cgtgacggtt 3360
taccaggagg gccaggtttg aggggtattc ccgggtcccc tggcggtcct ggatcggatg 3420
gaaaaccagg gccaccaggt tcgcagggtg aaacaggacg tccaggccca cccggctcac 3480
ctggtccaag gggtcagcct ggtgtcatgg gtttccccgg tccaaagggt aatgacggag 3540
caccgggtaa aaatggtgaa cgtggtggcc caggtggtcc aggaccccaa ggtccagctg 3600
gaaaaaacgg tgagacaggt cctcaaggac ctccaggacc taccggtcct agcggagata 3660
agggagatac gggaccgcca ggacctcaag gattgcaagg tttgcctggt acatctggcc 3720
ctcccggaga aaatggtaag cctggagagc caggaccaaa aggcgaagct ggagccccag 3780
gtatccccgg aggtaaggga gactcaggtg ctccgggtga gcgtggtcct ccgggtgccg 3840
gtggtccacc tggacctaga ggtggtgccg ggccgccagg tcctgaaggt ggtaaaggtg 3900
ctgctggtcc accgggaccg cctggctctg ctggtactcc tggcttgcag ggaatgccag 3960
gagagagagg tggacctgga ggtcccggtc cgaagggtga taaaggggag ccaggatcat 4020
ccggtgttga cggcgcacct ggtaaagacg gaccaagggg accaacgggt ccaatcggac 4080
caccaggacc cgctggccag ccaggagata aaggcgagtc cggagcaccc ggtgttcctg 4140
gtatagctgg acccaggggt ggtcccggtg aaagaggtga acagggccca ccgggtcccg 4200
ccggtttccc tggcgcccct ggtcaaaatg gagaaccagg tgcaaagggc gagagaggag 4260
ccccaggaga aaagggtgag ggaggaccac ccggtgctgc cggtccagct gggggttcag 4320
gtcctgctgg accaccaggt ccacagggcg ttaaaggtga gagaggaagt ccaggtggtc 4380
ctggagctgc tggattccca ggtggccgtg gacctcctgg tccccctgga tcgaatggta 4440
atcctggtcc gccaggtagt tcgggtgctc ctgggaagga cggtccacct ggccccccag 4500
gtagtaacgg tgcacctggt agtccaggta tatccggacc taaaggagat tccggtccac 4560
caggcgaaag aggggcccca ggcccacagg gtccaccagg agcccccggt cctctgggta 4620
ttgctggtct tactggtgca cgtggactgg ccggtccacc cggaatgcct ggagcaagag 4680
gttcacctgg accacaaggt attaaaggag agaacggtaa acctggacct tccggtcaaa 4740
acggagagcg gggaccccca ggcccccaag gtctgccagg actagctggt accgcagggg 4800
aaccaggaag agatggaaat ccaggttcag acggactacc cggtagagat ggtgcaccgg 4860
gggccaaggg cgacaggggt gagaatggat ctcctggtgc gccaggggca ccaggccacc 4920
caggtccccc aggtcctgtg ggccctgctg gaaagtcagg tgacagggga gagacaggcc 4980
cggctggtcc atctggcgca cccggaccag ctggttccag aggcccacct ggtccgcaag 5040
gccctagagg tgacaaggga gagactggag aacgaggtgc tatgggtatc aagggtcata 5100
gaggttttcc gggtaatccc ggcgccccag gttctcctgg tccagctggc catcaaggtg 5160
cagtcggatc gcccggccca gccggtccca ggggccctgt tggtccatcc ggtcctccag 5220
gaaaggatgg tgcttctgga cacccaggac ctatcggacc tccgggtcct agaggtaata 5280
gaggagaacg tggatccgag ggtagtcctg gtcaccctgg tcaacctggc ccaccagggc 5340
ctccaggtgc acccggtcca tgttgtggtg caggcggtgt ggctgcaatt gctggtgtgg 5400
gtgctgaaaa ggccggcggt ttcgctccat attatggtga tgaaccgatt gattttaaga 5460
tcaatactga cgaaatcatg acttccttaa agtccgttaa tggtcaaatt gagtctctaa 5520
tctccccaga tggttcacgt aaaaatcctg ctagaaattg tagagatttg aagttttgtc 5580
accccgagtt gcagtccggt gagtactggg tggaccccaa tcaaggttgt aagttagacg 5640
ctattaaagt ttactgcaat atggagacag gagaaacttg catcagcgct tctccattga 5700
ctatcccaca aaaaaattgg tggactgact ctggagctga gaaaaagcat gtatggttcg 5760
gggaatcgat ggaaggtggt ttccaattca gctacggtaa ccctgaactt cctgaagatg 5820
ttcttgacgt tcaattggca tttctgagat tgttgtccag tcgtgcaagc caaaacatta 5880
cataccattg caaaaattcc atcgcatata tggatcatgc tagcggaaat gtgaaaaagg 5940
cattgaagct gatgggatca aatgaaggtg aatttaaagc agagggtaat tctaagttta 6000
cttacactgt attggaggat ggttgtacga agcatacagg tgaatggggt aaaacagtgt 6060
ttcaatatca aacccgcaaa gcagttagat tgccaatcgt cgatatcgca ccatacgaca 6120
ttggaggacc agatcaagag ttcggagctg acatcggtcc ggtgtgtttc ctttgataag 6180
gttaaagggg cggccgctca agaggatgtc agaatgccat ttgcctgaga gatgcaggct 6240
tcatttttga tactttttta tttgtaacct atatagtata ggattttttt tgtcattttg 6300
tttcttctcg tacgagcttg ctcctgatca gcctatctcg cagcagatga atatcttgtg 6360
gtaggggttt gggaaaatca ttcgagtttg atgtttttct tggtatttcc cactcctctt 6420
cagagtacag aagattaagt gaaaccttcg tttgtgcgga tccttcagta atgtcttgtt 6480
tcttttgttg cagtggtgag ccattttgac ttcgtgaaag tttctttaga atagttgttt 6540
ccagaggcca aacattccac ccgtagtaaa gtgcaagcgt aggaagacca agactggcat 6600
aaatcaggta taagtgtcga gcactggcag gtgatcttct gaaagtttct actagcagat 6660
aagatccagt agtcatgcat atggcaacaa tgtaccgtgt ggatctaaga acgcgtccta 6720
ctaaccttcg cattcgttgg tccagtttgt tgttatcgat caacgtgaca aggttgtcga 6780
ttccgcgtaa gcatgcatac ccaaggacgc ctgttgcaat tccaagtgag ccagttccaa 6840
caatctttgt aatattagag cacttcattg tgttgcgctt gaaagtaaaa tgcgaacaaa 6900
ttaagagata atctcgaaac cgcgacttca aacgccaata tgatgtgcgg cacacaataa 6960
gcgttcatat ccgctgggtg actttctcgc tttaaaaaat tatccgaaaa aattttctag 7020
agtgttgtta ctttatactt ccggctcgta taatacgaca aggtgtaagg aggactaaac 7080
catggctaaa ctcacctctg ctgttccagt cctgactgct cgtgatgttg ctggtgctgt 7140
tgagttctgg actgatagac tcggtttctc ccgtgacttc gtagaggacg actttgccgg 7200
tgttgtacgt gacgacgtta ccctgttcat ctccgcagtt caggaccagg ttgtgccaga 7260
caacactctg gcatgggtat gggttcgtgg tctggacgaa ctgtacgctg agtggtctga 7320
ggtcgtgtct accaacttcc gtgatgcatc tggtccagct atgaccgaga tcggtgaaca 7380
gccctggggt cgtgagtttg cactgcgtga tccagctggt aactgcgtgc atttcgtcgc 7440
agaagaacag gactaacaat tgacacctta cgattattta gagagtattt attagtttta 7500
ttgtatgtat acggatgttt tattatctat ttatgccctt atattctgta actatccaaa 7560
agtcctatct tatcaagcca gcaatctatg tccgcgaacg tcaactaaaa ataagctttt 7620
tatgctgttc tctctttttt tcccttcggt ataattatac cttgcatcca cagattctcc 7680
tgccaaattt tgcataatcc tttacaacat ggctatatgg gagcacttag cgccctccaa 7740
aacccatatt gcctacgcat gtataggtgt tttttccaca atattttctc tgtgctctct 7800
ttttattaaa gagaagctct atatcggaga agcttctgtg gccgttatat tcggccttat 7860
cgtgggacca cattgcctga attggtttgc cccggaagat tggggaaact tggatctgat 7920
taccttagct gcaggtacca ctgagcgtca gacc 7954
<210> 13
<211> 7356
<212> DNA
<213> Artificial Sequence
<220>
<223> MMV129 (Sequence 11)
<400> 13
ggatccttca gtaatgtctt gtttcttttg ttgcagtggt gagccatttt gacttcgtga 60
aagtttcttt agaatagttg tttccagagg ccaaacattc cacccgtagt aaagtgcaag 120
cgtaggaaga ccaagactgg cataaatcag gtataagtgt cgagcactgg caggtgatct 180
tctgaaagtt tctactagca gataagatcc agtagtcatg catatggcaa caatgtaccg 240
tgtggatcta agaacgcgtc ctactaacct tcgcattcgt tggtccagtt tgttgttatc 300
gatcaacgtg acaaggttgt cgattccgcg taagcatgca tacccaagga cgcctgttgc 360
aattccaagt gagccagttc caacaatctt tgtaatatta gagcacttca ttgtgttgcg 420
cttgaaagta aaatgcgaac aaattaagag ataatctcga aaccgcgact tcaaacgcca 480
atatgatgtg cggcacacaa taagcgttca tatccgctgg gtgactttct cgctttaaaa 540
aattatccga aaaaattttc tagagtgttg ttactttata cttccggctc gtataatacg 600
acaaggtgta aggaggacta aaccatggct aaactcacct ctgctgttcc agtcctgact 660
gctcgtgatg ttgctggtgc tgttgagttc tggactgata ggctcggttt ctcccgtgac 720
ttcgtagagg acgactttgc cggtgttgta cgtgacgacg ttaccctgtt catctccgca 780
gttcaggacc aggttgtgcc agacaacact ctggcatggg tatgggttcg tggtctggac 840
gaactgtacg ctgagtggtc tgaggtcgtg tctaccaact tccgtgatgc atctggtcca 900
gctatgaccg agatcggtga acagccctgg ggtcgtgagt ttgcactgcg tgatccagct 960
ggtaactgcg tgcatttcgt cgcagaagag caggactaac aattgacacc ttacgattat 1020
ttagagagta tttattagtt ttattgtatg tatacggatg ttttattatc tatttatgcc 1080
cttatattct gtaactatcc aaaagtccta tcttatcaag ccagcaatct atgtccgcga 1140
acgtcaacta aaaataagct ttttatgctc ttctctcttt ttttcccttc ggtataatta 1200
taccttgcat ccacagattc tcctgccaaa ttttgcataa tcctttacaa catggctata 1260
tgggagcact tagcgccctc caaaacccat attgcctacg catgtatagg tgttttttcc 1320
acaatatttt ctctgtgctc tctttttatt aaagagaagc tctatatcgg agaagcttct 1380
gtggccgtta tattcggcct tatcgtggga ccacattgcc tgaattggtt tgccccggaa 1440
gattggggaa acttggatct gattacctta gctgcagaaa agggtaccac tgagcgtcag 1500
accccgtaga aaagatcaaa ggatcttctt gagatccttt ttttctgcgc gtaatctgct 1560
gcttgcaaac aaaaaaacca ccgctaccag cggtggtttg tttgccggat caagagctac 1620
caactctttt tccgaaggta actggcttca gcagagcgca gataccaaat actgttcttc 1680
tagtgtagcc gtagttaggc caccacttca agaactctgt agcaccgcct acatacctcg 1740
ctctgctaat cctgttacca gtggctgctg ccagtggcga taagtcgtgt cttaccgggt 1800
tggacccaag acgatagtta ccggataagg cgcagcggtc gggctgaacg gggggttcgt 1860
gcacacagcc cagcttggag cgaacgacct acaccgaact gagataccta cagcgtgagc 1920
tatgagaaag cgccacgctt cccgaaggga gaaaggcgga caggtatccg gtaagcggca 1980
gggtcggaac aggagagcgc acgagggagc ttccaggggg aaacgcctgg tatctttata 2040
gtcctgtcgg gtttcgccac ctctgacttg agcgtcgatt tttgtgatgc tcgtcagggg 2100
ggcggagcct atggaaaaac gccagcaacg cggccttttt acggttcctg gccttttgct 2160
ggccttttgc tcacatgtat ttaaattaat cgaactccga atgcggttct cctgtaacct 2220
taattgtagc atagatcact taaataaact catggcctga catctgtaca cgttcttatt 2280
ggtcttttag caatcttgaa gtctttctat tgttccggtc ggcattacct aataaattcg 2340
aatcgagatt gctagtacct gatatcatat gaagtaatca tcacatgcaa gttccatgat 2400
accctctact aatggaattg aacaaagttt aagcttctcg cacgagaccg aatccatact 2460
atgcacccct caaagttggg attagtcagg aaagctgagc aattaacttc cctcgattgg 2520
cctggacttt tcgcttagcc tgccgcaatc ggtaagtttc attatcccag cggggtgata 2580
gcctctgttg ctcatcaggc caaaatcata tataagctgt agacccagca cttcaattac 2640
ttgaaattca ccataacact tgctctagtc aagacttaca attaaaatga tgtcttttgt 2700
ccaaaagggt acttggttac tttttgctct gttgcaccca actgttattc tcgcacaaca 2760
ggaagcagta gatggtggtt gctcacattt aggtcaatct tacgcagata gagatgtatg 2820
gaaacctgaa ccatgtcaaa tttgcgtgtg tgactcaggt tcagtgctct gcgacgatat 2880
catatgtgac gaccaggaat tggactgtcc aaacccagag ataccattcg gtgaatgttg 2940
tgctgtttgt ccacagccac caactgctcc tacaagacct ccaaacggtc aaggtccaca 3000
aggtcctaaa ggtgatccgg gtccacctgg tattcctggt agaaatggtg accctggacc 3060
tcccggttcc ccaggtagcc caggatcacc tgggcctcct ggaatatgtg aatcctgccc 3120
aactggtggt cagaactata gcccacaata cgaggcctac gacgtcaaat ctggtgttgc 3180
tggaggaggt attgcaggct accctggtcc cgcagggccc ccaggtccgc cgggtccgcc 3240
cggaacatca ggtcatcccg gagcccctgg tgcaccaggt tatcagggac cgcccggaga 3300
gcctggacaa gctggtcccg ctggaccccc tggtccacca ggtgctattg gaccaagtgg 3360
tcctgccgga aaagacggtg aatccggtag acctggtaga cccggcgaaa ggggtttccc 3420
aggtcctccc ggaatgaagg gtccagccgg tatgcccggt tttcctggga tgaagggtca 3480
cagaggattt gatggtagaa acggagagaa aggcgaaacc ggtgctcccg gactgaaggg 3540
tgaaaacggt gtccctggtg agaacggcgc tcctggacct atgggtccac gtggtgctcc 3600
aggagaaaga ggcagaccag gattgcctgg tgcagctggt gctagaggta acgatggtgc 3660
ccgtggttcc gatggacaac ccgggccacc cggccctcca ggtaccgctg gatttcctgg 3720
aagccctggt gctaaggggg aggttggtcc ggctggtagt cccggaagta gcggtgcccc 3780
aggtcaaaga ggcgaaccag gccctcaggg tcacgcagga gcacctggac cgcctggtcc 3840
tcctggttcg aatggttcgc ctggaggaaa aggtgaaatg gggcccgcag gaatccccgg 3900
tgcgcctggt cttattggtg ccaggggtcc tccaggcccg ccaggtacaa atggtgtacc 3960
cggacagcga ggagcagctg gtgaacctgg taaaaacggt gccaaaggag atccaggtcc 4020
tcgtggagag cgtggtgaag ctggctctcc cggtatcgcc ggtccaaaag gtgaggacgg 4080
taaggacggt tcccctggtg agccaggtgc gaacggactg ccaggtgcag ccggagagcg 4140
aggagtccca ggattcaggg gaccagccgg tgctaacggc ttgcctggtg aaaaagggcc 4200
ccctggtgat aggggaggac ccggtccagc aggccctcgt ggagttgctg gtgagcctgg 4260
acgtgacggt ttaccaggag ggccaggttt gaggggtatt cccgggtccc ctggcggtcc 4320
tggatcggat ggaaaaccag ggccaccagg ttcgcagggt gaaacaggac gtccaggccc 4380
acccggctca cctggtccaa ggggtcagcc tggtgtcatg ggtttccccg gtccaaaggg 4440
taatgacgga gcaccgggta aaaatggtga acgtggtggc ccaggtggtc caggacccca 4500
aggtccagct ggaaaaaacg gtgagacagg tcctcaagga cctccaggac ctaccggtcc 4560
tagcggagat aagggagata cgggaccgcc aggacctcaa ggattgcaag gtttgcctgg 4620
tacatctggc cctcccggag aaaatggtaa gcctggagag ccaggaccaa aaggcgaagc 4680
tggagcccca ggtatccccg gaggtaaggg agactcaggt gctccgggtg agcgtggtcc 4740
tccgggtgcc ggtggtccac ctggacctag aggtggtgcc gggccgccag gtcctgaagg 4800
tggtaaaggt gctgctggtc caccgggacc gcctggctct gctggtactc ctggcttgca 4860
gggaatgcca ggagagagag gtggacctgg aggtcccggt ccgaagggtg ataaagggga 4920
gccaggatca tccggtgttg acggcgcacc tggtaaagac ggaccaaggg gaccaacggg 4980
tccaatcgga ccaccaggac ccgctggcca gccaggagat aaaggcgagt ccggagcacc 5040
cggtgttcct ggtatagctg gacccagggg tggtcccggt gaaagaggtg aacagggccc 5100
accgggtccc gccggtttcc ctggcgcccc tggtcaaaat ggagaaccag gtgcaaaggg 5160
cgagagagga gccccaggag aaaagggtga gggaggacca cccggtgctg ccggtccagc 5220
tgggggttca ggtcctgctg gaccaccagg tccacagggc gttaaaggtg agagaggaag 5280
tccaggtggt cctggagctg ctggattccc aggtggccgt ggacctcctg gtccccctgg 5340
atcgaatggt aatcctggtc cgccaggtag ttcgggtgct cctgggaagg acggtccacc 5400
tggcccccca ggtagtaacg gtgcacctgg tagtccaggt atatccggac ctaaaggaga 5460
ttccggtcca ccaggcgaaa gaggggcccc aggcccacag ggtccaccag gagcccccgg 5520
tcctctgggt attgctggtc ttactggtgc acgtggactg gccggtccac ccggaatgcc 5580
tggagcaaga ggttcacctg gaccacaagg tattaaagga gagaacggta aacctggacc 5640
ttccggtcaa aacggagagc ggggaccccc aggcccccaa ggtctgccag gactagctgg 5700
taccgcaggg gaaccaggaa gagatggaaa tccaggttca gacggactac ccggtagaga 5760
tggtgcaccg ggggccaagg gcgacagggg tgagaatgga tctcctggtg cgccaggggc 5820
accaggccac ccaggtcccc caggtcctgt gggccctgct ggaaagtcag gtgacagggg 5880
agagacaggc ccggctggtc catctggcgc acccggacca gctggttcca gaggcccacc 5940
tggtccgcaa ggccctagag gtgacaaggg agagactgga gaacgaggtg ctatgggtat 6000
caagggtcat agaggttttc cgggtaatcc cggcgcccca ggttctcctg gtccagctgg 6060
ccatcaaggt gcagtcggat cgcccggccc agccggtccc aggggccctg ttggtccatc 6120
cggtcctcca ggaaaggatg gtgcttctgg acacccagga cctatcggac ctccgggtcc 6180
tagaggtaat agaggagaac gtggatccga gggtagtcct ggtcaccctg gtcaacctgg 6240
cccaccaggg cctccaggtg cacccggtcc atgttgtggt gcaggcggtg tggctgcaat 6300
tgctggtgtg ggtgctgaaa aggccggcgg tttcgctcca tattatggtg atgaaccgat 6360
tgattttaag atcaatactg acgaaatcat gacttcctta aagtccgtta atggtcaaat 6420
tgagtctcta atctccccag atggttcacg taaaaatcct gctagaaatt gtagagattt 6480
gaagttttgt caccccgagt tgcagtccgg tgagtactgg gtggacccca atcaaggttg 6540
taagttagac gctattaaag tttactgcaa tatggagaca ggagaaactt gcatcagcgc 6600
ttctccattg actatcccac aaaaaaattg gtggactgac tctggagctg agaaaaagca 6660
tgtatggttc ggggaatcga tggaaggtgg tttccaattc agctacggta accctgaact 6720
tcctgaagat gttcttgacg ttcaattggc atttctgaga ttgttgtcca gtcgtgcaag 6780
ccaaaacatt acataccatt gcaaaaattc catcgcatat atggatcatg ctagcggaaa 6840
tgtgaaaaag gcattgaagc tgatgggatc aaatgaaggt gaatttaaag cagagggtaa 6900
ttctaagttt acttacactg tattggagga tggttgtacg aagcatacag gtgaatgggg 6960
taaaacagtg tttcaatatc aaacccgcaa agcagttaga ttgccaatcg tcgatatcgc 7020
accatacgac attggaggac cagatcaaga gttcggagct gacatcggtc cggtgtgttt 7080
cctttgataa tcaagaggat gtcagaatgc catttgcctg agagatgcag gcttcatttt 7140
tgatactttt ttatttgtaa cctatatagt ataggatttt ttttgtcatt ttgtttcttc 7200
tcgtacgagc ttgctcctga tcagcctatc tcgcagctga tgaatatctt gtggtagggg 7260
tttgggaaaa tcattcgagt ttgatgtttt tcttggtatt tcccactcct cttcagagta 7320
cagaagatta agtgagacgt tcgtttgtgc tccgga 7356
<210> 14
<211> 7879
<212> DNA
<213> Artificial Sequence
<220>
<223> MMV130 (Sequence 12)
<400> 14
ggatccttca gtaatgtctt gtttcttttg ttgcagtggt gagccatttt gacttcgtga 60
aagtttcttt agaatagttg tttccagagg ccaaacattc cacccgtagt aaagtgcaag 120
cgtaggaaga ccaagactgg cataaatcag gtataagtgt cgagcactgg caggtgatct 180
tctgaaagtt tctactagca gataagatcc agtagtcatg catatggcaa caatgtaccg 240
tgtggatcta agaacgcgtc ctactaacct tcgcattcgt tggtccagtt tgttgttatc 300
gatcaacgtg acaaggttgt cgattccgcg taagcatgca tacccaagga cgcctgttgc 360
aattccaagt gagccagttc caacaatctt tgtaatatta gagcacttca ttgtgttgcg 420
cttgaaagta aaatgcgaac aaattaagag ataatctcga aaccgcgact tcaaacgcca 480
atatgatgtg cggcacacaa taagcgttca tatccgctgg gtgactttct cgctttaaaa 540
aattatccga aaaaattttc tagagtgttg ttactttata cttccggctc gtataatacg 600
acaaggtgta aggaggacta aaccatggct aaactcacct ctgctgttcc agtcctgact 660
gctcgtgatg ttgctggtgc tgttgagttc tggactgata ggctcggttt ctcccgtgac 720
ttcgtagagg acgactttgc cggtgttgta cgtgacgacg ttaccctgtt catctccgca 780
gttcaggacc aggttgtgcc agacaacact ctggcatggg tatgggttcg tggtctggac 840
gaactgtacg ctgagtggtc tgaggtcgtg tctaccaact tccgtgatgc atctggtcca 900
gctatgaccg agatcggtga acagccctgg ggtcgtgagt ttgcactgcg tgatccagct 960
ggtaactgcg tgcatttcgt cgcagaagag caggactaac aattgacacc ttacgattat 1020
ttagagagta tttattagtt ttattgtatg tatacggatg ttttattatc tatttatgcc 1080
cttatattct gtaactatcc aaaagtccta tcttatcaag ccagcaatct atgtccgcga 1140
acgtcaacta aaaataagct ttttatgctc ttctctcttt ttttcccttc ggtataatta 1200
taccttgcat ccacagattc tcctgccaaa ttttgcataa tcctttacaa catggctata 1260
tgggagcact tagcgccctc caaaacccat attgcctacg catgtatagg tgttttttcc 1320
acaatatttt ctctgtgctc tctttttatt aaagagaagc tctatatcgg agaagcttct 1380
gtggccgtta tattcggcct tatcgtggga ccacattgcc tgaattggtt tgccccggaa 1440
gattggggaa acttggatct gattacctta gctgcagaaa agggtaccac tgagcgtcag 1500
accccgtaga aaagatcaaa ggatcttctt gagatccttt ttttctgcgc gtaatctgct 1560
gcttgcaaac aaaaaaacca ccgctaccag cggtggtttg tttgccggat caagagctac 1620
caactctttt tccgaaggta actggcttca gcagagcgca gataccaaat actgttcttc 1680
tagtgtagcc gtagttaggc caccacttca agaactctgt agcaccgcct acatacctcg 1740
ctctgctaat cctgttacca gtggctgctg ccagtggcga taagtcgtgt cttaccgggt 1800
tggacccaag acgatagtta ccggataagg cgcagcggtc gggctgaacg gggggttcgt 1860
gcacacagcc cagcttggag cgaacgacct acaccgaact gagataccta cagcgtgagc 1920
tatgagaaag cgccacgctt cccgaaggga gaaaggcgga caggtatccg gtaagcggca 1980
gggtcggaac aggagagcgc acgagggagc ttccaggggg aaacgcctgg tatctttata 2040
gtcctgtcgg gtttcgccac ctctgacttg agcgtcgatt tttgtgatgc tcgtcagggg 2100
ggcggagcct atggaaaaac gccagcaacg cggccttttt acggttcctg gccttttgct 2160
ggccttttgc tcacatgtat ttcagaagcg atagagagac tgcgctaagc attaatgaga 2220
ttatttttga gcattcgtca atcaatacca aacaagacaa acggtatgcc gacttttgga 2280
agtttctttt tgaccaactg gccgttagca tttcaacgaa ccaaacttag ttcatcttgg 2340
atgagatcac gcttttgtca tattaggttc caagacagcg tttaaactgt cagttttggg 2400
ccatttgggg aacatgaaac tatttgaccc cacactcaga aagccctcat ctgagtgatg 2460
ttcgggtgta atgcggagct tgttgcattc ggaaataaac aaacatgaac ctcgccaggg 2520
gggccaggat agacaggcta ataaagtcat ggtgttagta gcctaataga aggaattgga 2580
ataaataatg tatctaaacg caaactccga gctggaaaaa tgttaccggc gatgcgcgga 2640
caatttagag gcggcgatca agaaacacct gctgggcgag cagtctggag cacagtcttc 2700
gatgggcccg agatcccacc gcgttcctgg gtaccgggac gtgaggcagc gcgacatcca 2760
tcaaatatac caggcgccaa ccgagtctct cggaaaacag cttctggata tcttccgctg 2820
gcggcgcaac gacgaataat agtccctgga ggtgacggaa tatatatgtg tggagggtaa 2880
atctgacagg gtgtagcaaa ggtaatattt tcctaaaaca tgcaatcggc tgccccgcaa 2940
cgggaaaaag aatgactttg gcactcttca ccagagtggg gtgtcccgct cgtgtgtgca 3000
aataggctcc cactggtcac cccggatttt gcagaaaaac agcaagttcc ggggtgtctc 3060
actggtgtcc gccaataaga ggagccggca ggcacggagt ctacatcaag ctgtctccga 3120
tacactcgac taccatccgg gtctctcaga gaggggaatg gcactataaa taccgcctcc 3180
ttgcgctctc tgccttcatc aatcaaatca tgatgtcttt tgtccaaaag ggtacttggt 3240
tactttttgc tctgttgcac ccaactgtta ttctcgcaca acaggaagca gtagatggtg 3300
gttgctcaca tttaggtcaa tcttacgcag atagagatgt atggaaacct gaaccatgtc 3360
aaatttgcgt gtgtgactca ggttcagtgc tctgcgacga tatcatatgt gacgaccagg 3420
aattggactg tccaaaccca gagataccat tcggtgaatg ttgtgctgtt tgtccacagc 3480
caccaactgc tcctacaaga cctccaaacg gtcaaggtcc acaaggtcct aaaggtgatc 3540
cgggtccacc tggtattcct ggtagaaatg gtgaccctgg acctcccggt tccccaggta 3600
gcccaggatc acctgggcct cctggaatat gtgaatcctg cccaactggt ggtcagaact 3660
atagcccaca atacgaggcc tacgacgtca aatctggtgt tgctggagga ggtattgcag 3720
gctaccctgg tcccgcaggg cccccaggtc cgccgggtcc gcccggaaca tcaggtcatc 3780
ccggagcccc tggtgcacca ggttatcagg gaccgcccgg agagcctgga caagctggtc 3840
ccgctggacc ccctggtcca ccaggtgcta ttggaccaag tggtcctgcc ggaaaagacg 3900
gtgaatccgg tagacctggt agacccggcg aaaggggttt cccaggtcct cccggaatga 3960
agggtccagc cggtatgccc ggttttcctg ggatgaaggg tcacagagga tttgatggta 4020
gaaacggaga gaaaggcgaa accggtgctc ccggactgaa gggtgaaaac ggtgtccctg 4080
gtgagaacgg cgctcctgga cctatgggtc cacgtggtgc tccaggagaa agaggcagac 4140
caggattgcc tggtgcagct ggtgctagag gtaacgatgg tgcccgtggt tccgatggac 4200
aacccgggcc acccggccct ccaggtaccg ctggatttcc tggaagccct ggtgctaagg 4260
gggaggttgg tccggctggt agtcccggaa gtagcggtgc cccaggtcaa agaggcgaac 4320
caggccctca gggtcacgca ggagcacctg gaccgcctgg tcctcctggt tcgaatggtt 4380
cgcctggagg aaaaggtgaa atggggcccg caggaatccc cggtgcgcct ggtcttattg 4440
gtgccagggg tcctccaggc ccgccaggta caaatggtgt acccggacag cgaggagcag 4500
ctggtgaacc tggtaaaaac ggtgccaaag gagatccagg tcctcgtgga gagcgtggtg 4560
aagctggctc tcccggtatc gccggtccaa aaggtgagga cggtaaggac ggttcccctg 4620
gtgagccagg tgcgaacgga ctgccaggtg cagccggaga gcgaggagtc ccaggattca 4680
ggggaccagc cggtgctaac ggcttgcctg gtgaaaaagg gccccctggt gataggggag 4740
gacccggtcc agcaggccct cgtggagttg ctggtgagcc tggacgtgac ggtttaccag 4800
gagggccagg tttgaggggt attcccgggt cccctggcgg tcctggatcg gatggaaaac 4860
cagggccacc aggttcgcag ggtgaaacag gacgtccagg cccacccggc tcacctggtc 4920
caaggggtca gcctggtgtc atgggtttcc ccggtccaaa gggtaatgac ggagcaccgg 4980
gtaaaaatgg tgaacgtggt ggcccaggtg gtccaggacc ccaaggtcca gctggaaaaa 5040
acggtgagac aggtcctcaa ggacctccag gacctaccgg tcctagcgga gataagggag 5100
atacgggacc gccaggacct caaggattgc aaggtttgcc tggtacatct ggccctcccg 5160
gagaaaatgg taagcctgga gagccaggac caaaaggcga agctggagcc ccaggtatcc 5220
ccggaggtaa gggagactca ggtgctccgg gtgagcgtgg tcctccgggt gccggtggtc 5280
cacctggacc tagaggtggt gccgggccgc caggtcctga aggtggtaaa ggtgctgctg 5340
gtccaccggg accgcctggc tctgctggta ctcctggctt gcagggaatg ccaggagaga 5400
gaggtggacc tggaggtccc ggtccgaagg gtgataaagg ggagccagga tcatccggtg 5460
ttgacggcgc acctggtaaa gacggaccaa ggggaccaac gggtccaatc ggaccaccag 5520
gacccgctgg ccagccagga gataaaggcg agtccggagc acccggtgtt cctggtatag 5580
ctggacccag gggtggtccc ggtgaaagag gtgaacaggg cccaccgggt cccgccggtt 5640
tccctggcgc ccctggtcaa aatggagaac caggtgcaaa gggcgagaga ggagccccag 5700
gagaaaaggg tgagggagga ccacccggtg ctgccggtcc agctgggggt tcaggtcctg 5760
ctggaccacc aggtccacag ggcgttaaag gtgagagagg aagtccaggt ggtcctggag 5820
ctgctggatt cccaggtggc cgtggacctc ctggtccccc tggatcgaat ggtaatcctg 5880
gtccgccagg tagttcgggt gctcctggga aggacggtcc acctggcccc ccaggtagta 5940
acggtgcacc tggtagtcca ggtatatccg gacctaaagg agattccggt ccaccaggcg 6000
aaagaggggc cccaggccca cagggtccac caggagcccc cggtcctctg ggtattgctg 6060
gtcttactgg tgcacgtgga ctggccggtc cacccggaat gcctggagca agaggttcac 6120
ctggaccaca aggtattaaa ggagagaacg gtaaacctgg accttccggt caaaacggag 6180
agcggggacc cccaggcccc caaggtctgc caggactagc tggtaccgca ggggaaccag 6240
gaagagatgg aaatccaggt tcagacggac tacccggtag agatggtgca ccgggggcca 6300
agggcgacag gggtgagaat ggatctcctg gtgcgccagg ggcaccaggc cacccaggtc 6360
ccccaggtcc tgtgggccct gctggaaagt caggtgacag gggagagaca ggcccggctg 6420
gtccatctgg cgcacccgga ccagctggtt ccagaggccc acctggtccg caaggcccta 6480
gaggtgacaa gggagagact ggagaacgag gtgctatggg tatcaagggt catagaggtt 6540
ttccgggtaa tcccggcgcc ccaggttctc ctggtccagc tggccatcaa ggtgcagtcg 6600
gatcgcccgg cccagccggt cccaggggcc ctgttggtcc atccggtcct ccaggaaagg 6660
atggtgcttc tggacaccca ggacctatcg gacctccggg tcctagaggt aatagaggag 6720
aacgtggatc cgagggtagt cctggtcacc ctggtcaacc tggcccacca gggcctccag 6780
gtgcacccgg tccatgttgt ggtgcaggcg gtgtggctgc aattgctggt gtgggtgctg 6840
aaaaggccgg cggtttcgct ccatattatg gtgatgaacc gattgatttt aagatcaata 6900
ctgacgaaat catgacttcc ttaaagtccg ttaatggtca aattgagtct ctaatctccc 6960
cagatggttc acgtaaaaat cctgctagaa attgtagaga tttgaagttt tgtcaccccg 7020
agttgcagtc cggtgagtac tgggtggacc ccaatcaagg ttgtaagtta gacgctatta 7080
aagtttactg caatatggag acaggagaaa cttgcatcag cgcttctcca ttgactatcc 7140
cacaaaaaaa ttggtggact gactctggag ctgagaaaaa gcatgtatgg ttcggggaat 7200
cgatggaagg tggtttccaa ttcagctacg gtaaccctga acttcctgaa gatgttcttg 7260
acgttcaatt ggcatttctg agattgttgt ccagtcgtgc aagccaaaac attacatacc 7320
attgcaaaaa ttccatcgca tatatggatc atgctagcgg aaatgtgaaa aaggcattga 7380
agctgatggg atcaaatgaa ggtgaattta aagcagaggg taattctaag tttacttaca 7440
ctgtattgga ggatggttgt acgaagcata caggtgaatg gggtaaaaca gtgtttcaat 7500
atcaaacccg caaagcagtt agattgccaa tcgtcgatat cgcaccatac gacattggag 7560
gaccagatca agagttcgga gctgacatcg gtccggtgtg tttcctttga taatcaagag 7620
gatgtcagaa tgccatttgc ctgagagatg caggcttcat ttttgatact tttttatttg 7680
taacctatat agtataggat tttttttgtc attttgtttc ttctcgtacg agcttgctcc 7740
tgatcagcct atctcgcagc tgatgaatat cttgtggtag gggtttggga aaatcattcg 7800
agtttgatgt ttttcttggt atttcccact cctcttcaga gtacagaaga ttaagtgaga 7860
cgttcgtttg tgctccgga 7879
<210> 15
<211> 7963
<212> DNA
<213> Artificial Sequence
<220>
<223> MMV78 (Sequence 13)
<400> 15
aattgacacc ttacgattat ttagagagta tttattagtt ttattgtatg tatacggatg 60
ttttattatc tatttatgcc cttatattct gtaactatcc aaaagtccta tcttatcaag 120
ccagcaatct atgtccgcga acgtcaacta aaaataagct ttttatgctg ttctctcttt 180
ttttcccttc ggtataatta taccttgcat ccacagattc tcctgccaaa ttttgcataa 240
tcctttacaa catggctata tgggagcact tagcgccctc caaaacccat attgcctacg 300
catgtatagg tgttttttcc acaatatttt ctctgtgctc tctttttatt aaagagaagc 360
tctatatcgg agaagcttct gtggccgtta tattcggcct tatcgtggga ccacattgcc 420
tgaattggtt tgccccggaa gattggggaa acttggatct gattacctta gctgcaggta 480
ccactgagcg tcagaccccg tagaaaagat caaaggatct tcttgagatc ctttttttct 540
gcgcgtaatc tgctgcttgc aaacaaaaaa accaccgcta ccagcggtgg tttgtttgcc 600
ggatcaagag ctaccaactc tttttccgaa ggtaactggc ttcagcagag cgcagatacc 660
aaatactgtt cttctagtgt agccgtagtt aggccaccac ttcaagaact ctgtagcacc 720
gcctacatac ctcgctctgc taatcctgtt accagtggct gctgccagtg gcgataagtc 780
gtgtcttacc gggttggact caagacgata gttaccggat aaggcgcagc ggtcgggctg 840
aacggggggt tcgtgcacac agcccagctt ggagcgaacg acctacaccg aactgagata 900
cctacagcgt gagctatgag aaagcgccac gcttcccgaa gggagaaagg cggacaggta 960
tccggtaagc ggcagggtcg gaacaggaga gcgcacgagg gagcttccag ggggaaacgc 1020
ctggtatctt tatagtcctg tcgggtttcg ccacctctga cttgagcgtc gatttttgtg 1080
atgctcgtca ggggggcgga gcctatggaa aaacgccagc aacgcggcct ttttacggtt 1140
cctggccttt tgctggcctt ttgctcacat gtcgcacaaa cgaaggtttc acttaatctt 1200
ctgtactctg aagaggagtg ggaaatacca agaaaaacat caaactcgaa tgattttccc 1260
aaacccctac cacaagatat tcatctgctg cgagataggc tgatcaggag caagctcgta 1320
cgagaagaaa caaaatgaca aaaaaaatcc tatactatat aggttacaaa taaaaaagta 1380
tcaaaaatga agcctgcatc tctcaggcaa atggcattct gacatcctct tgaaaattat 1440
cattctaatt ctgacaatgt gcatggcctc ctaaactctt gacctctctc atgcagccac 1500
ttattggaaa cccacttatt accgactaag acgggacaag cagcatgtct agtgctgtaa 1560
tcaccttctc cagatgcaaa cagattgtac caaaatacgg ccgtgccctt tttaggccaa 1620
acagaagcac ctacctcagg gaaaactgtg gctcctccag caagcacatc ggacatatag 1680
aacaaccacg ttgcgattct atttccagta cctagctcct taaaagcatc aggctcgtcc 1740
tttctggcga aatcaaagtg gggttcatac tgaccgccca caccatagtt ggcaacttgt 1800
agttcctcag cagtgcttac gtcaagacca gtcaaatctt gaatacgcat attgatacgg 1860
ctgaccacgg gattctcgta accggacaac catgctgatt tagagacacg atattgtgcg 1920
gtagtcaatt ttccagtctc agggtcatgg acggtagccc tactcaatct tggtttggcc 1980
aagtctttca caacctctat ttctgcatcg gagatgatgt catgaaaacg aatgattcta 2040
ggcttgtccc attcatcttc ctgtttcgct ggagcaagaa tgaattttgg gttacggttc 2100
ccatcatgat atctacagaa cagctttttc tgtctccttg gagtcatctt gataccctct 2160
cctctacaca gcatttcata cttttgtctc tctgggaggt agtcaacagc tgcacctttt 2220
tttttcagag tggtcttttg atcggattgg tcatcggacg aggacttatt tgcgtccttt 2280
tccttagcca tgatgtattc aaagtatttc agattaccgt tagctctttg atgctccggg 2340
tccagctcca acaacttttt agttaaaagt agagctttgt ccagatcacc ttgctggtaa 2400
acagcgtatg ataagtaatc caaaactgaa accttatcaa cggtagaaac ttcaccttcg 2460
tccaactgac gcagagcttg ctccatccat aattctgtgt gatagtagtc ggcttctgta 2520
tatgcgactt ttcccaattc aaaacaatct tccacagtga ggaaggactt atgcttcaca 2580
ccaggtaaat cacccttcga tatcgtgtcg gtgtccaaat tgtatgtgtc ctgcaatcgc 2640
aacaaagctt ttgctgctcc tacttggtcc tcatcgtttg gaaagtattg tctttgaatt 2700
gttaagttag aaatgaatcc atcactcata tctttaagta ccaagttttc caattctgac 2760
cactctgtat taagtctctt catcagcttg aaagcattca ctgggtgacc cacaaaaccc 2820
tcaggatctt ttgttgcagt actagtcaat ctatcgagtt tctctgccca ctttttgatt 2880
tgctccaact tatcctcttc agctttgata tagtctttaa ggcttgtaac taggtctttt 2940
tctgtgtgaa tcaaatcagt catctgtcct atagaagtga agaagcctgg gtgagccagt 3000
gactgtggca acaaaatacc aacgactagg atataccaaa tcatttttga tgtttgatag 3060
tttgataaga gtgaacttta gtgtttagag gggttataat ttgttgtaac tggttttggt 3120
cttaagttaa aacgaacttg ttatattaaa cacaacggtc actcaggata caagaatagg 3180
aaagaaaaac tttaaactgg ggacatgttg tctttatata atttggcggt taacccttaa 3240
tgcccgtttc cgtctcttca tgataacaaa gctgcccatc tatgactgaa tgtggagaag 3300
tatcggaaca acccttcact aaggatatct aggctaaact cattcgcgcc ttagatttct 3360
ccaaggtatc ggttaagttt cctctttcgt actggctaac gatggtgttg ctcaacaaag 3420
ggatggaacg gcagctaaag ggagtgcatg gaatgacttt aattggctga gaaagtgttc 3480
tatttgtccg aatttctttt ttctattatc tgttcgtttg ggcggatctc tccagtgggg 3540
ggtaaatgga agatttctgt tcatggggta aggaagctga aatccttcgt ttcttatagg 3600
ggcaagtata ctaaatctcg gaacattgaa tggggtttac tttcattggc tacagaaatt 3660
attaagtttg ttatggggtg aagttaccag taattttcat tttttcactt caacttttgg 3720
ggtatttctg tggggtagca tagagcaatg atataaacaa caattgagtg acaggtctac 3780
tttgttctca aaaggccata accatctgtt tgcatctctt atcaccacac catcctcctc 3840
atctggcctt caattgtggg gaacaactag catcccaaca ccagactaac tccacccaga 3900
tgaaaccagt tgtcgcttac cagtcaatga atgttgagct aacgttcctt gaaactcgaa 3960
tgatcccagc cttgctgcgt atcatccctc cgctattccg ccgcttgctc caaccatgtt 4020
tccgcctttt tcgaacaagt tcaaatacct atctttggca ggacttttcc tcctgccttt 4080
tttagcctca gctctcggtt agcctctagg caaattctgg tcttcatacc tatatcaact 4140
tttcatcaga tagcctttgg gttcaaaaaa gaactaaagc aggatgcctg atatataaat 4200
cccagatgat ctgcttttga aactattttc agtatcttga ttcgtttact tacaaacaac 4260
tattgttgat tttatctgga gaataatcga acaaaatgag attcccatct attttcaccg 4320
ctgtcttgtt cgctgcctcc tctgcattgg ctgcccctgt taacactacc actgaagacg 4380
agactgctca aattccagct gaagcagtta tcggttactc tgaccttgag ggtgatttcg 4440
acgtcgctgt tttgcctttc tctaactcca ctaacaacgg tttgttgttc attaacacca 4500
ctatcgcttc cattgctgct aaggaagagg gtgtctctct cgagaaaaga gaggccgaag 4560
ctgcacccga tgaggaagat catgttttag tattgcataa aggaaatttc gatgaagctt 4620
tggccgctca caaatatctg ctcgtcgagt tttacgctcc ctggtgcggt cattgtaagg 4680
cccttgcacc agagtacgcc aaggcagctg gtaagttaaa ggccgaaggt tcagagatca 4740
gattagcaaa agttgatgct acagaagagt ccgatcttgc tcaacaatac ggggttcgag 4800
gatacccaac aattaagttt ttcaaaaatg gtgatactgc ttccccaaag gaatatactg 4860
ctggtagaga ggcagacgac atagtcaact ggctcaaaaa gagaacgggc ccagctgcgt 4920
ctacattaag cgacggagca gcagccgaag ctcttgtgga atctagtgaa gttgctgtaa 4980
tcggtttctt taaggacatg gaatctgatt cagctaaaca gttcctttta gcagctgaag 5040
caatcgatga catccctttc ggaatcacct caaatagtga cgtgttcagc aagtaccaac 5100
ttgacaaaga tggagtggtc ttgttcaaaa agtttgacga aggcagaaac aatttcgagg 5160
gtgaggttac aaaggagaaa ctgcttgatt tcattaaaca taaccaacta cccttagtta 5220
tcgaattcac tgaacaaact gctcctaaga ttttcggtgg agaaatcaaa acacatatct 5280
tgttgttttt gccaaagtcc gtatcggatt atgaaggtaa actctccaat ttcaaaaagg 5340
ccgctgagag ctttaagggc aagattttgt tcatctttat tgactcagac cacacagaca 5400
atcagaggat tttggagttt ttcggtttga aaaaggagga atgtccagca gtccgtttga 5460
tcaccttgga ggaggagatg accaaataca aaccagagtc ggatgagttg actgccgaga 5520
agataacaga attttgtcac agatttctgg aaggtaagat caagcctcat cttatgtctc 5580
aagagttgcc tgatgactgg gataagcaac cagttaaagt attggtgggt aaaaactttg 5640
aggaagtggc cttcgacgag aaaaaaaatg tctttgttga attctatgct ccgtggtgtg 5700
gtcactgtaa gcagctggca ccaatttggg ataaactggg tgaaacttac aaagatcacg 5760
aaaacattgt tattgcaaag atggacagta ctgctaacga agtggaggct gtgaaagttc 5820
actccttccc tacgctgaag ttctttcctg catctgctga cagaactgtt atcgactata 5880
atggagagag gacattggat ggttttaaaa agtttcttga atccggaggt caagacggag 5940
ctggtgacga cgatgatttg gaagatctgg aggaggctga ggaacctgat cttgaggagg 6000
atgacgacca gaaggcagtc aaagatgaac tgtgataagg ggcggccgct caagaggatg 6060
tcagaatgcc atttgcctga gagatgcagg cttcattttt gatacttttt tatttgtaac 6120
ctatatagta taggattttt tttgtcattt tgtttcttct cgtacgagct tgctcctgat 6180
cagcctatct cgcagcagat gaatatcttg tggtaggggt ttgggaaaat cattcgagtt 6240
tgatgttttt cttggtattt cccactcctc ttcagagtac agaagattaa gtgaaacctt 6300
cgtttgtgcg gatccttcag taatgtcttg tttcttttgt tgcagtggtg agccattttg 6360
acttcgtgaa agtttcttta gaatagttgt ttccagaggc caaacattcc acccgtagta 6420
aagtgcaagc gtaggaagac caagactggc ataaatcagg tataagtgtc gagcactggc 6480
aggtgatctt ctgaaagttt ctactagcag ataagatcca gtagtcatgc atatggcaac 6540
aatgtaccgt gtggatctaa gaacgcgtcc tactaacctt cgcattcgtt ggtccagttt 6600
gttgttatcg atcaacgtga caaggttgtc gattccgcgt aagcatgcat acccaaggac 6660
gcctgttgca attccaagtg agccagttcc aacaatcttt gtaatattag agcacttcat 6720
tgtgttgcgc ttgaaagtaa aatgcgaaca aattaagaga taatctcgaa accgcgactt 6780
caaacgccaa tatgatgtgc ggcacacaat aagcgttcat atccgctggg tgactttctc 6840
gctttaaaaa attatccgaa aaaattttct agagtgttga cactttatac ttccggctcg 6900
tataatacga caaggtgtaa ggaggactaa accatgaaaa agccagagct tacagcaacg 6960
agcgttgaga aattcttgat tgaaaagttt gattcagttt ccgacctgat gcagttgtct 7020
gagggtgaag agtcaagagc cttttcgttc gatgtgggtg gtagaggtta cgtccttagg 7080
gtgaactctt gtgccgatgg tttttacaaa gatagatatg tttacagaca tttcgcatcc 7140
gcagcactcc ccatcccaga agtattggac attggagagt tttccgaatc cttgacctat 7200
tgcatctctc gacgtgccca aggtgtcact ttacaagact tgccggagac tgaacttcca 7260
gcagttttac aacctgtagc agaggctatg gacgctattg ctgctgctga tttgtctcaa 7320
acaagtggat tcggcccttt tggtcctcag ggtatcgggc aatacacaac ttggagagac 7380
tttatctgtg ctatcgcaga cccacatgtg tatcactggc aaaccgtcat ggatgacact 7440
gtatcggcta gtgtggccca agctcttgat gagctaatgc tgtgggctga ggactgtcca 7500
gaagtgaggc acttggttca cgcagacttt ggatccaata atgttctgac agataacgga 7560
cgtataacag ctgtcattga ctggtccgaa gctatgttcg gtgattcaca atatgaagtc 7620
gctaacatat tcttttggcg tccctggtta gcatgtatgg agcaacaaac tagatatttc 7680
gaacgtagac atcctgaact agctggatct ccaagattga gagcttacat gctgaggatc 7740
ggtttggatc agctgtacca gagcttggta gacggaaatt tcgacgacgc cgcatgggcg 7800
caaggtagat gcgatgccat tgtgagaagt ggtgctggca ctgttggtag aacccagatt 7860
gcaagacgtt cagctgctgt ttggacggat ggttgtgttg aggttttggc agattccgga 7920
aatcgtagac ctagcactag gccaagagct aaggaataat agc 7963
<210> 16
<211> 5508
<212> DNA
<213> Artificial Sequence
<220>
<223> MMV94 (Sequence 14)
<400> 16
aacatccaaa gacgaaaggt tgaatgaaac ctttttgcca tccgacatcc acaggtccat 60
tctcacacat aagtgccaaa cgcaacagga ggggatacac tagcagcaga ccgttgcaaa 120
cgcaggacct ccactcctct tctcctcaac acccactttt gccatcgaaa aaccagccca 180
gttattgggc ttgattggag ctcgctcatt ccaattcctt ctattaggct actaacacca 240
tgactttatt agcctgtcta tcctggcccc cctggcgagg ttcatgtttg tttatttccg 300
aatgcaacaa gctccgcatt acacccgaac atcactccag atgagggctt tctgagtgtg 360
gggtcaaata gtttcatgtt ccccaaatgg cccaaaactg acagtttaaa cgctgtcttg 420
gaacctaata tgacaaaagc gtgatctcat ccaagatgaa ctaagtttgg ttcgttgaaa 480
tgctaacggc cagttggtca aaaagaaact tccaaaagtc ggcataccgt ttgtcttgtt 540
tggtattgat tgacgaatgc tcaaaaataa tctcattaat gcttagcgca gtctctctat 600
cgcttctgaa ccccggtgca cctgtgccga aacgcaaatg gggaaacacc cgctttttgg 660
atgattatgc attgtctcca cattgtatgc ttccaagatt ctggtgggaa tactgctgat 720
agcctaacgt tcatgatcaa aatttaactg ttctaacccc tacttgacag caatatataa 780
acagaaggaa gctgccctgt cttaaacctt tttttttatc atcattatta gcttactttc 840
ataattgcga ctggttccaa ttgacaagct tttgatttta acgactttta acgacaactt 900
gagaagatca aaaaacaact aattattgaa agaattcatg ttctctccaa ttttgtcctt 960
ggaaattatt ttagctttgg ctactttgca atctgtcttc gctgcccccg acgaggagga 1020
ccacgtcctg gtgctccata agggcaactt cgacgaggcg ctggcggccc acaagtacct 1080
gctggtggag ttctacgccc catggtgcgg ccactgcaag gctctggccc cggagtatgc 1140
caaagcagct gggaagctga aggcagaagg ttctgagatc agactggcca aggtggatgc 1200
cactgaagag tctgacctgg cccagcagta tggtgtccga ggctacccca ccatcaagtt 1260
cttcaagaat ggagacacag cttcccccaa agagtacaca gctggccgag aagcggatga 1320
tatcgtgaac tggctgaaga agcgcacggg ccccgctgcc agcacgctgt ccgacggggc 1380
tgctgcagag gccttggtgg agtccagtga ggtggccgtc attggcttct tcaaggacat 1440
ggagtcggac tccgcaaagc agttcttgtt ggcagcagag gccattgatg acatcccctt 1500
cgggatcaca tctaacagcg atgtgttctc caaataccag ctggacaagg atggggttgt 1560
cctctttaag aagtttgacg aaggccggaa caactttgag ggggaggtca ccaaagaaaa 1620
gcttctggac ttcatcaagc acaaccagtt gcccctggtc attgagttca ccgagcagac 1680
agccccgaag atcttcggag gggaaatcaa gactcacatc ctgctgttcc tgccgaaaag 1740
cgtgtctgac tatgagggca agctgagcaa cttcaaaaaa gcggctgaga gcttcaaggg 1800
caagatcctg tttatcttca tcgacagcga ccacactgac aaccagcgca tcctggaatt 1860
cttcggccta aagaaagagg agtgcccggc cgtgcgcctc atcacgctgg aggaggagat 1920
gaccaaatat aagccagagt cagatgagct gacggcagag aagatcaccg agttctgcca 1980
ccgcttcctg gagggcaaga ttaagcccca cctgatgagc caggagctgc ctgacgactg 2040
ggacaagcag cctgtcaaag tgctggttgg gaagaacttt gaagaggttg cttttgatga 2100
gaaaaagaac gtctttgtag agttctatgc cccgtggtgc ggtcactgca agcagctggc 2160
ccccatctgg gataagctgg gagagacgta caaggaccac gagaacatag tcatcgccaa 2220
gatggactcc acggccaacg aggtggaggc ggtgaaagtg cacagcttcc ccacgctcaa 2280
gttcttcccc gccagcgccg acaggacggt catcgactac aatggggagc ggacactgga 2340
tggttttaag aagttcctgg agagtggtgg ccaggatggg gccggagatg atgacgatct 2400
agaagatctt gaagaagcag aagagcctga tctggaggaa gatgatgatc aaaaagctgt 2460
gaaagatgaa ctgtaagcgg ccgctcaaga ggatgtcaga atgccatttg cctgagagat 2520
gcaggcttca tttttgatac ttttttattt gtaacctata tagtatagga ttttttttgt 2580
cattttgttt cttctcgtac gagcttgctc ctgatcagcc tatctcgcag cagatgaata 2640
tcttgtggta ggggtttggg aaaatcattc gagtttgatg tttttcttgg tatttcccac 2700
tcctcttcag agtacagaag attaagtgaa accttcgttt gtgcggatcc ttcagtaatg 2760
tcttgtttct tttgttgcag tggtgagcca ttttgacttc gtgaaagttt ctttagaata 2820
gttgtttcca gaggccaaac attccacccg tagtaaagtg caagcgtagg aagaccaaga 2880
ctggcataaa tcaggtataa gtgtcgagca ctggcaggtg atcttctgaa agtttctact 2940
agcagataag atccagtagt catgcatatg gcaacaatgt accgtgtgga tctaagaacg 3000
cgtcctacta accttcgcat tcgttggtcc agtttgttgt tatcgatcaa cgtgacaagg 3060
ttgtcgattc cgcgtaagca tgcataccca aggacgcctg ttgcaattcc aagtgagcca 3120
gttccaacaa tctttgtaat attagagcac ttcattgtgt tgcgcttgaa agtaaaatgc 3180
gaacaaatta agagataatc tcgaaaccgc gacttcaaac gccaatatga tgtgcggcac 3240
acaataagcg ttcatatccg ctgggtgact ttctcgcttt aaaaaattat ccgaaaaaat 3300
tttctagagt gttgacactt tatacttccg gctcgtataa tacgacaagg tgtaaggagg 3360
actaaaccat gggtaaggaa aagactcacg tttcgaggcc gcgattaaat tccaacatgg 3420
atgctgattt atatgggtat aaatgggctc gcgataatgt cgggcaatca ggtgcgacaa 3480
tctatcgatt gtatgggaag cccgatgcgc cagagttgtt tctgaaacat ggcaaaggta 3540
gcgttgccaa tgatgttaca gatgagatgg tcagactaaa ctggctgacg gaatttatgc 3600
ctcttccgac catcaagcat tttatccgta ctcctgatga tgcatggtta ctcaccactg 3660
cgatccccgg caaaacagca ttccaggtat tagaagaata tcctgattca ggtgaaaata 3720
ttgttgatgc gctggcagtg ttcctgcgcc ggttgcattc gattcctgtt tgtaattgtc 3780
cttttaacag cgatcgcgta tttcgtctcg ctcaggcgca atcacgaatg aataacggtt 3840
tggttgatgc gagtgatttt gatgacgagc gtaatggctg gcctgttgaa caagtctgga 3900
aagaaatgca taagcttttg ccattctcac cggattcagt cgtcactcat ggtgatttct 3960
cacttgataa ccttattttt gacgagggga aattaatagg ttgtattgat gttggacgag 4020
tcggaatcgc agaccgatac caggatcttg ccatcctatg gaactgcctc ggtgagtttt 4080
ctccttcatt acagaaacgg ctttttcaaa aatatggtat tgataatcct gatatgaata 4140
aattgcagtt tcatttgatg ctcgatgagt ttttctaaca attgacacct tacgattatt 4200
tagagagtat ttattagttt tattgtatgt atacggatgt tttattatct atttatgccc 4260
ttatattctg taactatcca aaagtcctat cttatcaagc cagcaatcta tgtccgcgaa 4320
cgtcaactaa aaataagctt tttatgctgt tctctctttt tttcccttcg gtataattat 4380
accttgcatc cacagattct cctgccaaat tttgcataat cctttacaac atggctatat 4440
gggagcactt agcgccctcc aaaacccata ttgcctacgc atgtataggt gttttttcca 4500
caatattttc tctgtgctct ctttttatta aagagaagct ctatatcgga gaagcttctg 4560
tggccgttat attcggcctt atcgtgggac cacattgcct gaattggttt gccccggaag 4620
attggggaaa cttggatctg attaccttag ctgcaggtac cactgagcgt cagaccccgt 4680
agaaaagatc aaaggatctt cttgagatcc tttttttctg cgcgtaatct gctgcttgca 4740
aacaaaaaaa ccaccgctac cagcggtggt ttgtttgccg gatcaagagc taccaactct 4800
ttttccgaag gtaactggct tcagcagagc gcagatacca aatactgttc ttctagtgta 4860
gccgtagtta ggccaccact tcaagaactc tgtagcaccg cctacatacc tcgctctgct 4920
aatcctgtta ccagtggctg ctgccagtgg cgataagtcg tgtcttaccg ggttggactc 4980
aagacgatag ttaccggata aggcgcagcg gtcgggctga acggggggtt cgtgcacaca 5040
gcccagcttg gagcgaacga cctacaccga actgagatac ctacagcgtg agctatgaga 5100
aagcgccacg cttcccgaag ggagaaaggc ggacaggtat ccggtaagcg gcagggtcgg 5160
aacaggagag cgcacgaggg agcttccagg gggaaacgcc tggtatcttt atagtcctgt 5220
cgggtttcgc cacctctgac ttgagcgtcg atttttgtga tgctcgtcag gggggcggag 5280
cctatggaaa aacgccagca acgcggcctt tttacggttc ctggcctttt gctggccttt 5340
tgctcacatg ttctttcctg cggtacccag atccaattcc cgctttgact gcctgaaatc 5400
tccatcgcct acaatgatga catttggatt tggttgactc atgttggtat tgtgaaatag 5460
acgcagatcg ggaacactga aaaatacaca gttattattc atttaaat 5508
<210> 17
<211> 7605
<212> DNA
<213> Artificial Sequence
<220>
<223> MMV156 (Sequence 15)
<400> 17
tgcaggtacc actgagcgtc agaccccgta gaaaagatca aaggatcttc ttgagatcct 60
ttttttctgc gcgtaatctg ctgcttgcaa acaaaaaaac caccgctacc agcggtggtt 120
tgtttgccgg atcaagagct accaactctt tttccgaagg taactggctt cagcagagcg 180
cagataccaa atactgttct tctagtgtag ccgtagttag gccaccactt caagaactct 240
gtagcaccgc ctacatacct cgctctgcta atcctgttac cagtggctgc tgccagtggc 300
gataagtcgt gtcttaccgg gttggactca agacgatagt taccggataa ggcgcagcgg 360
tcgggctgaa cggggggttc gtgcacacag cccagcttgg agcgaacgac ctacaccgaa 420
ctgagatacc tacagcgtga gctatgagaa agcgccacgc ttcccgaagg gagaaaggcg 480
gacaggtatc cggtaagcgg cagggtcgga acaggagagc gcacgaggga gcttccaggg 540
ggaaacgcct ggtatcttta tagtcctgtc gggtttcgcc acctctgact tgagcgtcga 600
tttttgtgat gctcgtcagg ggggcggagc ctatggaaaa acgccagcaa cgcggccttt 660
ttacggttcc tggccttttg ctggcctttt gctcacatgt tctttcctgc ggtacccaga 720
tccaattccc gctttgactg cctgaaatct ccatcgccta caatgatgac atttggattt 780
ggttgactca tgttggtatt gtgaaataga cgcagatcgg gaacactgaa aaatacacag 840
ttattattca tttcagaagc gatagagaga ctgcgctaag cattaatgag attatttttg 900
agcattcgtc aatcaatacc aaacaagaca aacggtatgc cgacttttgg aagtttcttt 960
ttgaccaact ggccgttagc atttcaacga accaaactta gttcatcttg gatgagatca 1020
cgcttttgtc atattaggtt ccaagacagc gtttaaactg tcagttttgg gccatttggg 1080
gaacatgaaa ctatttgacc ccacactcag aaagccctca tctggagtga tgttcgggtg 1140
taatgcggag cttgttgcat tcggaaataa acaaacatga acctcgccag gggggccagg 1200
atagacaggc taataaagtc atggtgttag tagcctaata gaaggaattg gaataaatga 1260
cccttgtgac tgacactttg ggagtcccta ttctacttag tctcatatcg catgaaactt 1320
ttgataaatt attttctgat aggaattttt catcagatat tatcatcgcg gcttacgtaa 1380
taacaaaaaa aattgatgga gtctatacta ggctaacata aactaagtta ttaattaaac 1440
aaaacaaaac gtactagcat tactgtcata tataagggct cctaactaaa actgtaaaga 1500
cttcccgtaa aattatcatt ctaattctga caatgtgcat ggcctcctaa actcttgacc 1560
tctctcatgc agccacttat tggaaaccca cttattaccg actaagacgg gacaagcagc 1620
atgtctagtg ctgtaatcac cttctccaga tgcaaacaga ttgtaccaaa atacggccgt 1680
gcccttttta ggccaaacag aagcacctac ctcagggaaa actgtggctc ctccagcaag 1740
cacatcggac atatagaaca accacgttgc gattctattt ccagtaccta gctccttaaa 1800
agcatcaggc tcgtcctttc tggcgaaatc aaagtggggt tcatactgac cgcccacacc 1860
atagttggca acttgtagtt cctcagcagt gcttacgtca agaccagtca aatcttgaat 1920
acgcatattg atacggctga ccacgggatt ctcgtaaccg gacaaccatg ctgatttaga 1980
gacacgatat tgtgcggtag tcaattttcc agtctcaggg tcatggacgg tagccctact 2040
caatcttggt ttggccaagt ctttcacaac ctctatttct gcatcggaga tgatgtcatg 2100
aaaacgaatg attctaggct tgtcccattc atcttcctgt ttcgctggag caagaatgaa 2160
ttttgggtta cggttcccat catgatatct acagaacagc tttttctgtc tccttggagt 2220
catcttgata ccctctcctc tacacagcat ttcatacttt tgtctctctg ggaggtagtc 2280
aacagctgca cctttttttt tcagagtggt cttttgatcg gattggtcat cggacgagga 2340
cttatttgcg tccttttcct tagccatgat gtattcaaag tatttcagat taccgttagc 2400
tctttgatgc tccgggtcca gctccaacaa ctttttagtt aaaagtagag ctttgtccag 2460
atcaccttgc tggtaaacag cgtatgataa gtaatccaaa actgaaacct tatcaacggt 2520
agaaacttca ccttcgtcca actgacgcag agcttgctcc atccataatt ctgtgtgata 2580
gtagtcggct tctgtatatg cgacttttcc caattcaaaa caatcttcca cagtgaggaa 2640
ggacttatgc ttcacaccag gtaaatcacc cttcgatatc gtgtcggtgt ccaaattgta 2700
tgtgtcctgc aatcgcaaca aagcttttgc tgctcctact tggtcctcat cgtttggaaa 2760
gtattgtctt tgaattgtta agttagaaat gaatccatca ctcatatctt taagtaccaa 2820
gttttccaat tctgaccact ctgtattaag tctcttcatc agcttgaaag cattcactgg 2880
gtgacccaca aaaccctcag gatcttttgt tgcagtacta gtcaatctat cgagtttctc 2940
tgcccacttt ttgatttgct ccaacttatc ctcttcagct ttgatatagt ctttaaggct 3000
tgtaactagg tctttttctg tgtgaatcaa atcagtcatc tgtcctatag aagtgaagaa 3060
gcctgggtga gccagtgact gtggcaacaa aataccaacg actaggatat accaaatcat 3120
gcggcctgtt gtagttttaa tatagtttga gtatgagatg gaactcagaa cgaaggaatt 3180
atcaccagtt tatatattct gaggaaaggg tgtgtcctaa attggacagt cacgatggca 3240
ataaacgctc agccaatcag aatgcaggag ccataaattg ttgtattatt gctgcaagat 3300
ttatgtgggt tcacattcca ctgaatggtt ttcactgtag aattggtgtc ctagttgtta 3360
tgtttcgaga tgttttcaag aaaaactaaa atgcacaaac tgaccaataa tgtgccgtcg 3420
cgcttggtac aaacgtcagg attgccacca cttttttcgc actctggtac aaaagttcgc 3480
acttcccact cgtatgtaac gaaaaacaga gcagtctatc cagaacgaga caaattagcg 3540
cgtactgtcc cattccataa ggtatcatag gaaacgagag tcctcccccc atcacgtata 3600
tataaacaca ctgatatccc acatccgctt gtcaccaaac taatacatcc agttcaagtt 3660
acctaaacaa atcaaagcat gagattccca tctattttca ccgctgtctt gttcgctgcc 3720
tcctctgcat tggctgcacc cgatgaggaa gatcatgttt tagtattgca taaaggaaat 3780
ttcgatgaag ctttggccgc tcacaaatat ctgctcgtcg agttttacgc tccctggtgc 3840
ggtcattgta aggcccttgc accagagtac gccaaggcag ctggtaagtt aaaggccgaa 3900
ggttcagaga tcagattagc aaaagttgat gctacagaag agtccgatct tgctcaacaa 3960
tacggggttc gaggataccc aacaattaag tttttcaaaa atggtgatac tgcttcccca 4020
aaggaatata ctgctggtag agaggcagac gacatagtca actggctcaa aaagagaacg 4080
ggcccagctg cgtctacatt aagcgacgga gcagcagccg aagctcttgt ggaatctagt 4140
gaagttgctg taatcggttt ctttaaggac atggaatctg attcagctaa acagttcctt 4200
ttagcagctg aagcaatcga tgacatccct ttcggaatca cctcaaatag tgacgtgttc 4260
agcaagtacc aacttgacaa agatggagtg gtcttgttca aaaagtttga cgaaggcaga 4320
aacaatttcg agggtgaggt tacaaaggag aaactgcttg atttcattaa acataaccaa 4380
ctacccttag ttatcgaatt cactgaacaa actgctccta agattttcgg tggagaaatc 4440
aaaacacata tcttgttgtt tttgccaaag tccgtatcgg attatgaagg taaactctcc 4500
aatttcaaaa aggccgctga gagctttaag ggcaagattt tgttcatctt tattgactca 4560
gaccacacag acaatcagag gattttggag tttttcggtt tgaaaaagga ggaatgtcca 4620
gcagtccgtt tgatcacctt ggaggaggag atgaccaaat acaaaccaga gtcggatgag 4680
ttgactgccg agaagataac agaattttgt cacagatttc tggaaggtaa gatcaagcct 4740
catcttatgt ctcaagagtt gcctgatgac tgggataagc aaccagttaa agtattggtg 4800
ggtaaaaact ttgaggaagt ggccttcgac gagaaaaaaa atgtctttgt tgaattctat 4860
gctccgtggt gtggtcactg taagcagctg gcaccaattt gggataaact gggtgaaact 4920
tacaaagatc acgaaaacat tgttattgca aagatggaca gtactgctaa cgaagtggag 4980
gctgtgaaag ttcactcctt ccctacgctg aagttctttc ctgcatctgc tgacagaact 5040
gttatcgact ataatggaga gaggacattg gatggtttta aaaagtttct tgaatccgga 5100
ggtcaagacg gagctggtga cgacgatgat ttggaagatc tggaggaggc tgaggaacct 5160
gatcttgagg aggatgacga ccagaaggca gtcaaagatg aactgtgata aggggtcaag 5220
aggatgtcag aatgccattt gcctgagaga tgcaggcttc atttttgata cttttttatt 5280
tgtaacctat atagtatagg attttttttg tcattttgtt tcttctcgta cgagcttgct 5340
cctgatcagc ctatctcgca gcagatgaat atcttgtggt aggggtttgg gaaaatcatt 5400
cgagtttgat gtttttcttg gtatttccca ctcctcttca gagtacagaa gattaagtga 5460
gaccttcgtt tgtgcggatc cttcagtaat gtcttgtttc ttttgttgca gtggtgagcc 5520
attttgactt cgtgaaagtt tctttagaat agttgtttcc agaggccaaa cattccaccc 5580
gtagtaaagt gcaagcgtag gaagaccaag actggcataa atcaggtata agtgtcgagc 5640
actggcaggt gatcttctga aagtttctac tagcagataa gatccagtag tcatgcatat 5700
ggcaacaatg taccgtgtgg atctaagaac gcgtcctact aaccttcgca ttcgttggtc 5760
cagtttgttg ttatcgatca acgtgacaag gttgtcgatt ccgcgtaagc atgcataccc 5820
aaggacgcct gttgcaattc caagtgagcc agttccaaca atctttgtaa tattagagca 5880
cttcattgtg ttgcgcttga aagtaaaatg cgaacaaatt aagagataat ctcgaaaccg 5940
cgacttcaaa cgccaatatg atgtgcggca cacaataagc gttcatatcc gctgggtgac 6000
tttctcgctt taaaaaatta tccgaaaaaa ttttctagag tgttgacact ttatacttcc 6060
ggctcgtata atacgacaag gtgtaaggag gactaaacca tgggtaaaaa gcctgaactc 6120
accgcgacgt ctgtcgagaa gtttctgatc gaaaagttcg acagcgtctc cgacctgatg 6180
cagctctcgg agggcgaaga atctcgtgct ttcagcttcg atgtaggagg gcgtggatat 6240
gtcctgcggg taaatagctg cgccgatggt ttctacaaag atcgttatgt ttatcggcac 6300
tttgcatcgg ccgcgctccc gattccggaa gtgcttgaca ttggggaatt cagcgagagc 6360
ctgacctatt gcatctcccg ccgtgcacag ggtgtcacgt tgcaagacct gcctgaaacc 6420
gaactgcccg ctgttctgca gccggtcgcg gaggccatgg atgcgatcgc tgcggccgat 6480
cttagccaga cgagcgggtt cggcccattc ggaccgcaag gaatcggtca atacactaca 6540
tggcgtgatt tcatatgcgc gattgctgat ccccatgtgt atcactggca aactgtgatg 6600
gacgacaccg tcagtgcgtc cgtcgcgcag gctctcgatg agctgatgct ttgggccgag 6660
gactgccccg aagtccggca cctcgtgcac gcggatttcg gctccaacaa tgtcctgacg 6720
gacaatggcc gcataacagc ggtcattgac tggagcgagg cgatgttcgg ggattcccaa 6780
tacgaggtcg ccaacatctt cttctggagg ccgtggttgg cttgtatgga gcagcagacg 6840
cgctacttcg agcggaggca tccggagctt gcaggatcgc cgcggctccg ggcgtatatg 6900
ctccgcattg gtcttgacca actctatcag agcttggttg acggcaattt cgatgatgca 6960
gcttgggcgc agggtcgatg cgacgcaatc gtccgatccg gagccgggac tgtcgggcgt 7020
acacaaatcg cccgcagaag cgcggccgtc tggaccgatg gctgtgtaga agtactcgcc 7080
gatagtggaa accgacgccc cagcactcgt ccgagggcaa aggaataaca attgacacct 7140
tacgattatt tagagagtat ttattagttt tattgtatgt atacggatgt tttattatct 7200
atttatgccc ttatattctg taactatcca aaagtcctat cttatcaagc cagcaatcta 7260
tgtccgcgaa cgtcaactaa aaataagctt tttatgctct tctctctttt tttcccttcg 7320
gtataattat accttgcatc cacagattct cctgccaaat tttgcataat cctttacaac 7380
atggctatat gggagcactt agcgccctcc aaaacccata ttgcctacgc atgtataggt 7440
gttttttcca caatattttc tctgtgctct ctttttatta aagagaagct ctatatcgga 7500
gaagcttctg tggccgttat attcggcctt atcgtgggac cacattgcct gaattggttt 7560
gccccggaag attggggaaa cttggatctg attaccttag ctgca 7605
<210> 18
<211> 8743
<212> DNA
<213> Artificial Sequence
<220>
<223> MMV191 (Sequence 16)
<400> 18
ggatccttca gtaatgtctt gtttcttttg ttgcagtggt gagccatttt gacttcgtga 60
aagtttcttt agaatagttg tttccagagg ccaaacattc cacccgtagt aaagtgcaag 120
cgtaggaaga ccaagactgg cataaatcag gtataagtgt cgagcactgg caggtgatct 180
tctgaaagtt tctactagca gataagatcc agtagtcatg catatggcaa caatgtaccg 240
tgtggatcta agaacgcgtc ctactaacct tcgcattcgt tggtccagtt tgttgttatc 300
gatcaacgtg acaaggttgt cgattccgcg taagcatgca tacccaagga cgcctgttgc 360
aattccaagt gagccagttc caacaatctt tgtaatatta gagcacttca ttgtgttgcg 420
cttgaaagta aaatgcgaac aaattaagag ataatctcga aaccgcgact tcaaacgcca 480
atatgatgtg cggcacacaa taagcgttca tatccgctgg gtgactttct cgctttaaaa 540
aattatccga aaaaattttc tagacttctc ttccaaatat cgtctccaca aaatgggtaa 600
ggaaaagact cacgtttcga ggccgcgatt aaattccaac atggatgctg atttatatgg 660
gtataaatgg gctcgcgata atgtcgggca atcaggtgcg acaatctatc gattgtatgg 720
gaagcccgat gcgccagagt tgtttctgaa acatggcaaa ggtagcgttg ccaatgatgt 780
tacagatgag atggtcagac taaactggct gacggaattt atgcctcttc cgaccatcaa 840
gcattttatc cgtactcctg atgatgcatg gttactcacc actgcgatcc ccggcaaaac 900
agcattccag gtattagaag aatatcctga ttcaggtgaa aatattgttg atgcgctggc 960
agtgttcctg cgccggttgc attcgattcc tgtttgtaat tgtcctttta acagcgatcg 1020
cgtatttcgt ctcgctcagg cgcaatcacg aatgaataac ggtttggttg atgcgagtga 1080
ttttgatgac gagcgtaatg gctggcctgt tgaacaagtc tggaaagaaa tgcataagct 1140
tttgccattc tcaccggatt cagtcgtcac tcatggtgat ttctcacttg ataaccttat 1200
ttttgacgag gggaaattaa taggttgtat tgatgttgga cgagtcggaa tcgcagaccg 1260
ataccaggat cttgccatcc tatggaactg cctcggtgag ttttctcctt cattacagaa 1320
acggcttttt caaaaatatg gtattgataa tcctgatatg aataaattgc agtttcattt 1380
gatgctcgat gagtttttct aaaattgaca ccttacgatt atttagagag tatttattag 1440
ttttattgta tgtatacgga tgttttatta tctatttatg cccttatatt ctgtaactat 1500
ccaaaagtcc tatcttatca agccagcaat ctatgtccgc gaacgtcaac taaaaataag 1560
ctttttatgc tgttctctct ttttttccct tcggtataat tataccttgc atccacagat 1620
tctcctgcca aattttgcat aatcctttac aacatggcta tatgggagca cttagcgccc 1680
tccaaaaccc atattgccta cgcatgtata ggtgtttttt ccacaatatt ttctctgtgc 1740
tctcttttta ttaaagagaa gctctatatc ggagaagctt ctgtggccgt tatattcggc 1800
cttatcgtgg gaccacattg cctgaattgg tttgccccgg aagattgggg aaacttggat 1860
ctgattacct tagctgcatc agaattggtt aattggttgt aacactgacc cctatttgtt 1920
tatttttcta aatacattca aatatgtatc cgctcatgag acaataaccc tgataaatgc 1980
ttcaataata ttgaaaaagg aagaatatga gtattcaaca tttccgtgtc gcccttattc 2040
ccttttttgc ggcattttgc cttcctgttt ttgctcaccc agaaacgctg gtgaaagtaa 2100
aagatgctga agatcagttg ggtgcacgag tgggttacat cgaactggat ctcaacagcg 2160
gtaagatcct tgagagtttt cgccccgaag aacgttttcc aatgatgagc acttttaaag 2220
ttctgctatg tggcgcggta ttatcccgta ttgacgccgg gcaagagcaa ctcggtcgcc 2280
gcatacacta ttctcagaat gacttggttg agtactcacc agtcacagaa aagcatctta 2340
cggatggcat gacagtaaga gaattatgca gtgctgccat aaccatgagt gataacactg 2400
cggccaactt acttctgaca acgatcggag gaccgaagga gctaaccgct tttttgcaca 2460
acatggggga tcatgtaact cgccttgatc gttgggaacc ggagctgaat gaagccatac 2520
caaacgacga gcgtgacacc acgatgcctg tagcgatggc aacaacgttg cgcaaactat 2580
taactggcga actacttact ctagcttccc ggcaacaatt aatagactgg atggaggcgg 2640
ataaagttgc aggaccactt ctgcgctcgg cccttccggc tggctggttt attgctgata 2700
aatccggagc cggtgagcgt ggttctcgcg gtatcatcgc agcgctgggg ccagatggta 2760
agccctcccg tatcgtagtt atctacacga cggggagtca ggcaactatg gatgaacgaa 2820
atagacagat cgctgagata ggtgcctcac tgattaagca ttggtaaggt accactgagc 2880
gtcagacccc gtagaaaaga tcaaaggatc ttcttgagat cctttttttc tgcgcgtaat 2940
ctgctgcttg caaacaaaaa aaccaccgct accagcggtg gtttgtttgc cggatcaaga 3000
gctaccaact ctttttccga aggtaactgg cttcagcaga gcgcagatac caaatactgt 3060
tcttctagtg tagccgtagt taggccacca cttcaagaac tctgtagcac cgcctacata 3120
cctcgctctg ctaatcctgt taccagtggc tgctgccagt ggcgataagt cgtgtcttac 3180
cgggttggac tcaagacgat agttaccgga taaggcgcag cggtcgggct gaacgggggg 3240
ttcgtgcaca cagcccagct tggagcgaac gacctacacc gaactgagat acctacagcg 3300
tgagctatga gaaagcgcca cgcttcccga agggagaaag gcggacaggt atccggtaag 3360
cggcagggtc ggaacaggag agcgcacgag ggagcttcca gggggaaacg cctggtatct 3420
ttatagtcct gtcgggtttc gccacctctg acttgagcgt cgatttttgt gatgctcgtc 3480
aggggggcgg agcctatgga aaaacgccag caacgcggcc tttttacggt tcctggcctt 3540
ttgctggcct tttgctcaca atttaaatga cccttgtgac tgacactttg ggagtcccta 3600
ttctacttag tctcatatcg catgaaactt ttgataaatt attttctgat aggaattttt 3660
catcagatat tatcatcgcg gcttacgtaa taacaaaaaa aattgatgga gtctatacta 3720
ggctaacata aactaagtta ttaattaaac aaaacaaaac gtactagcat tactgtcata 3780
tataagggct cctaactaaa actgtaaaga cttcccgtaa aattatcatt ctaattctga 3840
caatgtgcat ggcctcctaa actcttgacc tctctcatgc agccacttat tggaaaccca 3900
cttattaccg actaagacgg gacaagcagc atgtctagtg ctgtaatcac cttctccaga 3960
tgcaaacaga ttgtaccaaa atacggccgt gcccttttta ggccaaacag aagcacctac 4020
ctcagggaaa actgtggctc ctccagcaag cacatcggac atatagaaca accacgttgc 4080
gattctattt ccagtaccta gctccttaaa agcatcaggc tcgtcctttc tggcgaaatc 4140
aaagtggggt tcatactgac cgcccacacc atagttggca acttgtagtt cctcagcagt 4200
gcttacgtca agaccagtca aatcttgaat acgcatattg atacggctga ccacgggatt 4260
ctcgtaaccg gacaaccatg ctgatttaga gacacgatat tgtgcggtag tcaattttcc 4320
agtctcaggg tcatggacgg tagccctact caatcttggt ttggccaagt ctttcacaac 4380
ctctatttct gcatcggaga tgatgtcatg aaaacgaatg attctaggct tgtcccattc 4440
atcttcctgt ttcgctggag caagaatgaa ttttgggtta cggttcccat catgatatct 4500
acagaacagc tttttctgtc tccttggagt catcttgata ccctctcctc tacacagcat 4560
ttcatacttt tgtctctctg ggaggtagtc aacagctgca cctttttttt tcagagtggt 4620
cttttgatcg gattggtcat cggacgagga cttatttgcg tccttttcct tagccatgat 4680
gtattcaaag tatttcagat taccgttagc tctttgatgc tccgggtcca gctccaacaa 4740
ctttttagtt aaaagtagag ctttgtccag atcaccttgc tggtaaacag cgtatgataa 4800
gtaatccaaa actgaaacct tatcaacggt agaaacttca ccttcgtcca actgacgcag 4860
agcttgctcc atccataatt ctgtgtgata gtagtcggct tctgtatatg cgacttttcc 4920
caattcaaaa caatcttcca cagtgaggaa ggacttatgc ttcacaccag gtaaatcacc 4980
cttcgatatc gtgtcggtgt ccaaattgta tgtgtcctgc aatcgcaaca aagcttttgc 5040
tgctcctact tggtcctcat cgtttggaaa gtattgtctt tgaattgtta agttagaaat 5100
gaatccatca ctcatatctt taagtaccaa gttttccaat tctgaccact ctgtattaag 5160
tctcttcatc agcttgaaag cattcactgg gtgacccaca aaaccctcag gatcttttgt 5220
tgcagtacta gtcaatctat cgagtttctc tgcccacttt ttgatttgct ccaacttatc 5280
ctcttcagct ttgatatagt ctttaaggct tgtaactagg tctttttctg tgtgaatcaa 5340
atcagtcatc tgtcctatag aagtgaagaa gcctgggtga gccagtgact gtggcaacaa 5400
aataccaacg actaggatat accaaatcat gcttttgttg ttgagtgaag cgagtgacgg 5460
aacggtaaaa tgtaagtaac aaaagaaaaa gagaaccagg ggggggagga gagtatgtat 5520
ttataccgta cggcaccagg cgaaaagcta taaacaaacc tttttcgcgg tatatttgtt 5580
tatatttcct attttaaact caaaatctgc cctaatctgg acttttcatg caaagttatg 5640
cacctgaggc aggaatgaag caggctcgac gacgaaaagg ctggaatggg taactatgga 5700
tcgattgatt tgtctgttga aatcttgatt tggcactcgt ttaaattaac attctgcatc 5760
atggtgaatt gcggtcacag gtactggttt ttcctgaagc tctaggcggt gttactgttc 5820
ccacaactta aaacctaaaa gaggtgggtg cttctttgcg tgggtgacca aaaataaaac 5880
cgactgccta gtggcattga tacctttttt tgggtgttgt cctggaaacc actgaacgta 5940
tctgcgagat acaaaagtat ttttagataa gtggcaaatg caaaaaatct gattggtcag 6000
ttaatgattg atgaacgact ttaaggttaa aaagcaaaat agtgactgct gccatgtgcc 6060
tgtatagcac atgaactgat tattctgttc ccacgctacg atgaaaacgc cttctctgcc 6120
gaaagattaa agctgcgcgg gaaaaaaaaa ttaactttac ggggcgagca cggttccccg 6180
aaacaaaaga tggttggctt tcacccagcg agctcactgg atgccagtta aaaatagtta 6240
ggtgggttca cctgtttttg tagaaatgtc ttggtgtcct cgaccaatca ggtagccatc 6300
cctgaaatac ctggctccgt ggcaacaccg aacgacctgc tggcaacgtt aaattctccg 6360
gggtaaaact taaatgtgga gtaatagaac cagaaacgtc tcttcccttc tctctccttc 6420
caccgcccgt taccgtccct aggaaatttt actctgctgg agagcttctt ctacggcccc 6480
cttgcagcaa tgctcttccc agcattacgt tgcgggtaaa acggaggtcg tgtacccgac 6540
ctagcagccc agggatggaa agtcccggcc gtcgctggca ataactgcgg gcggacgcat 6600
gtcttgagat tattggaaac caccagaatc gaatataaaa ggcgaacacc tttcccaatt 6660
ttggtttctc ctgacccaaa gactttaaat ttaatttatt tgtccctatt tcaatcaatt 6720
gaacaactat ggccgcatga gattcccatc tattttcacc gctgtcttgt tcgctgcctc 6780
ctctgcattg gctgcccctg ttaacactac cactgaagac gagactgctc aaattccagc 6840
tgaagcagtt atcggttact ctgaccttga gggtgatttc gacgtcgctg ttttgccttt 6900
ctctaactcc actaacaacg gtttgttgtt cattaacacc actatcgctt ccattgctgc 6960
taaggaagag ggtgtctctc tcgagaaaag agaggccgaa gctgcacccg atgaggaaga 7020
tcatgtttta gtattgcata aaggaaattt cgatgaagct ttggccgctc acaaatatct 7080
gctcgtcgag ttttacgctc cctggtgcgg tcattgtaag gcccttgcac cagagtacgc 7140
caaggcagct ggtaagttaa aggccgaagg ttcagagatc agattagcaa aagttgatgc 7200
tacagaagag tccgatcttg ctcaacaata cggggttcga ggatacccaa caattaagtt 7260
tttcaaaaat ggtgatactg cttccccaaa ggaatatact gctggtagag aggcagacga 7320
catagtcaac tggctcaaaa agagaacggg cccagctgcg tctacattaa gcgacggagc 7380
agcagccgaa gctcttgtgg aatctagtga agttgctgta atcggtttct ttaaggacat 7440
ggaatctgat tcagctaaac agttcctttt agcagctgaa gcaatcgatg acatcccttt 7500
cggaatcacc tcaaatagtg acgtgttcag caagtaccaa cttgacaaag atggagtggt 7560
cttgttcaaa aagtttgacg aaggcagaaa caatttcgag ggtgaggtta caaaggagaa 7620
actgcttgat ttcattaaac ataaccaact acccttagtt atcgaattca ctgaacaaac 7680
tgctcctaag attttcggtg gagaaatcaa aacacatatc ttgttgtttt tgccaaagtc 7740
cgtatcggat tatgaaggta aactctccaa tttcaaaaag gccgctgaga gctttaaggg 7800
caagattttg ttcatcttta ttgactcaga ccacacagac aatcagagga ttttggagtt 7860
tttcggtttg aaaaaggagg aatgtccagc agtccgtttg atcaccttgg aggaggagat 7920
gaccaaatac aaaccagagt cggatgagtt gactgccgag aagataacag aattttgtca 7980
cagatttctg gaaggtaaga tcaagcctca tcttatgtct caagagttgc ctgatgactg 8040
ggataagcaa ccagttaaag tattggtggg taaaaacttt gaggaagtgg ccttcgacga 8100
gaaaaaaaat gtctttgttg aattctatgc tccgtggtgt ggtcactgta agcagctggc 8160
accaatttgg gataaactgg gtgaaactta caaagatcac gaaaacattg ttattgcaaa 8220
gatggacagt actgctaacg aagtggaggc tgtgaaagtt cactccttcc ctacgctgaa 8280
gttctttcct gcatctgctg acagaactgt tatcgactat aatggagaga ggacattgga 8340
tggttttaaa aagtttcttg aatccggagg tcaagacgga gctggtgacg acgatgattt 8400
ggaagatctg gaggaggctg aggaacctga tcttgaggag gatgacgacc agaaggcagt 8460
caaagatgaa ctgtgataag gggtcaagag gatgtcagaa tgccatttgc ctgagagatg 8520
caggcttcat ttttgatact tttttatttg taacctatat agtataggat tttttttgtc 8580
attttgtttc ttctcgtacg agcttgctcc tgatcagcct atctcgcagc agatgaatat 8640
cttgtggtag gggtttggga aaatcattcg agtttgatgt ttttcttggt atttcccact 8700
cctcttcaga gtacagaaga ttaagtgaga ccttcgtttg tgc 8743
<210> 19
<211> 12068
<212> DNA
<213> Artificial Sequence
<220>
<223> MMV208 (Sequence 17)
<400> 19
cggatgtttt attatctatt tatgccctta tattctgtaa ctatccaaaa gtcctatctt 60
atcaagccag caatctatgt ccgcgaacgt caactaaaaa taagcttttt atgctcttct 120
ctcttttttt cccttcggta taattatacc ttgcatccac agattctcct gccaaatttt 180
gcataatcct ttacaacatg gctatatggg agcacttagc gccctccaaa acccatattg 240
cctacgcatg tataggtgtt ttttccacaa tattttctct gtgctctctt tttattaaag 300
agaagctcta tatcggagaa gcttctgtgg ccgttatatt cggccttatc gtgggaccac 360
attgcctgaa ttggtttgcc ccggaagatt ggggaaactt ggatctgatt accttagctg 420
cagaaaaggg taccactgag cgtcagaccc cgtagaaaag atcaaaggat cttcttgaga 480
tccttttttt ctgcgcgtaa tctgctgctt gcaaacaaaa aaaccaccgc taccagcggt 540
ggtttgtttg ccggatcaag agctaccaac tctttttccg aaggtaactg gcttcagcag 600
agcgcagata ccaaatactg ttcttctagt gtagccgtag ttaggccacc acttcaagaa 660
ctctgtagca ccgcctacat acctcgctct gctaatcctg ttaccagtgg ctgctgccag 720
tggcgataag tcgtgtctta ccgggttgga cccaagacga tagttaccgg ataaggcgca 780
gcggtcgggc tgaacggggg gttcgtgcac acagcccagc ttggagcgaa cgacctacac 840
cgaactgaga tacctacagc gtgagctatg agaaagcgcc acgcttcccg aagggagaaa 900
ggcggacagg tatccggtaa gcggcagggt cggaacagga gagcgcacga gggagcttcc 960
agggggaaac gcctggtatc tttatagtcc tgtcgggttt cgccacctct gacttgagcg 1020
tcgatttttg tgatgctcgt caggggggcg gagcctatgg aaaaacgcca gcaacgcggc 1080
ctttttacgg ttcctggcct tttgctggcc ttttgctcat atgtaagctt tgaacactta 1140
tgtaagctcg aaaccagtta ggtaagcagc tttgtaagca atctggacaa tatgtaagcg 1200
ggttacgtaa acagttatgt aagcagaaaa atttcaaacg acaaaacttg gggtctacag 1260
acacagtagc cagaagattg cactaccatt cgactcctca tgacccactc tttcgatcca 1320
tgtagttagg ttaccgtttt tcctaatatt taaggatgtt gaaaattcat tttcattttt 1380
tttcgttttt aagattttct cacaactctt ccaaagatta ctagttgact tttcaaaata 1440
tttagggtat ttttctcact ttttcctagc aaactccaat tggtgggttc agtgcaatgg 1500
agtaccacct tgcaaccaca acgtaatagc taacttgtgg ccaccatgtc tggttgtaga 1560
gataattgga ttctaatgtg gatcacatga ctactcacgt gtcaaaaacc caacctgact 1620
tggcccagct tagcaagaat atttcgaatc cactcttgtg gcctagtgga caactgggac 1680
ctagggaccc ttgtgactga cactttggga gtccctattc tacttagtct catatcgcat 1740
gaaacttttg ataaattatt ttctgatagg aatttttcat cagatattat catcgcggct 1800
tacgtaataa caaaaaaaat tgatggagtc tatactaggc taacataaac taagttatta 1860
attaaacaaa acaaaacgta ctagcattac tgtcatatat aagggctcct aactaaaact 1920
gtaaagactt cccgtaaaat tatcattcta attctgacaa tgtgcatggc ctcctaaact 1980
cttgacctct ctcatgcagc cacttattgg aaacccactt attaccgact aagacgggac 2040
aagcagcatg tctagtgctg taatcacctt ctccagatgc aaacagattg taccaaaata 2100
cggccgtgcc ctttttaggc caaacagaag caccaacctc agggaaaact gtggctcctc 2160
cagcaagcac atcggacata tagaacaacc acgttgcgat tctatttcca gtacctagct 2220
ccttaaaagc atcaggctcg tcctttctgg cgaaatcaaa gtggggttca tactgaccgc 2280
ccacaccata gttggcaact tgtagttctt cagcagtgct tacgtcaaga ccagtcaaat 2340
cttgaatacg catattgata cggctgacca cgggattctc gtaaccggac aaccatgctg 2400
atttagagac acgatattgt gcggtagtca attttccagt ctcagggtca tggacggtag 2460
ccctactcaa tcttggtttg gccaagtctt tcacaacctc tatttctgca tcggagatga 2520
tgtcatgaaa acgaatgatt ctaggcttgt cccattcatc ttcctgtttc gctggagcaa 2580
gaatgaattt tgggttacgg ttcccatcat gatatctaca gaacagcttt ttctgtctcc 2640
ttggagtcat cttgataccc tctcctctac acagcatttc atacttttgt ctctctggga 2700
ggtagtcaac agccgcacct ttttttttca gagtggtctt ttgatcggat tggtcatcgg 2760
acgaggactt atttgcgtcc ttttccttag ccatgatgta ttcaaagtat ttcagattac 2820
cgttagctct ttgatgctcc gggtccagct ccaacaactt tttagttaaa agtagagctt 2880
tgtccagatc accttgctgg taaacagcgt atgataagta atccaaaact gaaaccttat 2940
caacggtaga aacttcacct tcgtccaact gacgtagagc ttgctccatc cataattctg 3000
tgtgatagta gtcggcttct gtatatgcga cttttcccaa ttcaaaacaa tcttccacag 3060
tgaggaagga cttatgcttc acaccaggta aatcaccctt cgatatcgtg tcggtgtcca 3120
aattgtatgt gtcctgcaat cgcaacaaag cttttgctgc tcctacttgg tcctcatcgt 3180
ttggaaagta ttgtctttga attgttaagt tagaaatgaa tccatcactc atatctttaa 3240
gtaccaagtt ttccaattct gaccactctg tattaagtct cttcatcagc ttgaaagcat 3300
tcactgggtg acccacaaaa ccctcaggat cttttgttgc agtacttgtc aatctatcga 3360
gtttctctgc ccactttttg atttgctcca acttatcctc ttcagctttg atatagtctt 3420
taaggcttgt aactaggtct ttttctgtgt gaatcaaatc agtcatctgt cctatagaag 3480
tgaagaagcc tgggtgagcc agtgactgtg gcaacaaaat accaacgact aggatatacc 3540
aaatcatgcg gccgcatggc ccccgacgag gaggaccacg tcctggtgct ccataagggc 3600
aacttcgacg aggcgctggc ggcccacaag tacctgctgg tggagttcta cgccccatgg 3660
tgcggccact gcaaggctct ggccccggag tatgccaaag cagctgggaa gctgaaggca 3720
gaaggttctg agatcagact ggccaaggtg gatgccactg aagagtctga cctggcccag 3780
cagtatggtg tccgaggcta ccccaccatc aagttcttca agaatggaga cacagcttcc 3840
cccaaagagt acacagctgg ccgggaagcg gatgatatcg tgaactggct gaagaagcgc 3900
acgggccccg ctgccagcac gctgtccgac ggggctgctg cagaggcttt ggtggagtcc 3960
agtgaggtgg ccgtcattgg cttcttcaag gatatggagt cggactccgc aaagcagttc 4020
ttcttggcag cagaggtcat tgatgacatc cccttcggga tcacatctaa cagcgatgtg 4080
ttctccaaat accagctgga caaggatggg gttgtcctct ttaagaagtt tgacgaaggc 4140
cggaacaact ttgaggggga ggtcaccaaa gaaaagcttc tggacttcat caagcacaac 4200
cagttgcccc tggtcattga gttcaccgag cagacagccc cgaagatctt cggaggggaa 4260
atcaagactc acatcctgct gttcctgccg aaaagcgtgt ctgactatga gggcaagctg 4320
agtaacttca aaaaagcggc tgagagcttc aagggcaaga tcctgtttat cttcatcgac 4380
agcgaccaca ctgacaacca gcgcatcctg gagttcttcg gcctaaagaa agaggagtgc 4440
ccggccgtgc gcctcatcac gctggaggag gagatgacca aatataagcc agagtcagat 4500
gagctgacgg cagagaagat caccgagttc tgccaccgct tcctggaggg caagattaag 4560
ccccacctga tgagccagga gctgcctgac gactgggaca agcagcctgt caaagtgctg 4620
gttgggaaga actttgaaga ggttgctttt gatgagaaaa agaacgtctt tgtagagttc 4680
tatgccccgt ggtgcggtca ctgcaagcag ctggccccca tctgggataa gctgggagag 4740
acgtacaagg accacgagaa catagtcatc gccaagatgg actccacggc caacgaggtg 4800
gaggcggtga aagtgcacag cttccccacg ctcaagttct tccccgccag cgccgacagg 4860
acggtcatcg actacaatgg ggaacggaca ctggatggtt ttaagaagtt cctggagagt 4920
ggtggccagg atggggccgg agatgatgac gatcttgaag atcttgaaga agcagaagag 4980
cctgatctgg aggaagatga tgatcaaaaa gctgtgaaag atgaactgta atcaagagga 5040
tgtcagaatg ccatttgcct gagagatgca ggcttcattt ttgatacttt tttatttgta 5100
acctatatag tataggattt tttttgtcat tttgtttctt ctcgtacgag cttgctcctg 5160
atcagcctat ctcgcagcag atgaatatct tgtggtaggg gtttgggaaa atcattcgag 5220
tttgatgttt ttcttggtat ttcccactcc tcttcagagt acagaagatt aagtgagacc 5280
ttcgtttgtg ccgatcggtt cagaagcgat agagagactg cgctaagcat taatgagatt 5340
atttttgagc attcgtcaat caataccaaa caagacaaac ggtatgccga cttttggaag 5400
tttctttttg accaactggc cgttagcatt tcaacgaacc aaacttagtt catcttggat 5460
gagatcacgc ttttgtcata ttaggttcca agacagcgtt taaactgtca gttttgggcc 5520
atttggggaa catgaaacta tttgacccca cactcagaaa gccctcatct ggagtgatgt 5580
tcgggtgtaa tgcggagctt gttgcattcg gaaataaaca aacatgaacc tcgccagggg 5640
ggccaggata gacaggctaa taaagtcatg gtgttagtag cctaatagaa ggaattggaa 5700
tgagcggatc caatgtatct aaacgcaaac tccgagctgg aaaaatgtta ccggcgatgc 5760
gcggacaatt tagaggcggc gatcaagaaa cacctgctgg gcgagcagtc tggagcacag 5820
tcttcgatgg gcccgagatc ccaccgcgtt cctgggtacc gggacgtgag gcagcgcgac 5880
atccatcaaa tataccaggc gccaaccgag tctctcggaa aacagcttct ggatatcttc 5940
cgctggcggc gcaacgacga ataatagtcc ctggaggtga cggaatatat atgtgtggag 6000
ggtaaatctg acagggtgta gcaaaggtaa tattttccta aaacatgcaa tcggctgccc 6060
cgcaacggga aaaagaatga ctttggcact cttcaccaga gtggggtgtc ccgctcgtgt 6120
gtgcaaatag gctcccactg gtcaccccgg attttgcaga aaaacagcaa gttccggggt 6180
gtctcactgg tgtccgccaa taagaggagc cggcaggcac ggagtctaca tcaagctgtc 6240
tccgatacac tcgactacca tccgggtctc tcagagaggg gaatggcact ataaataccg 6300
cctccttgcg ctctctgcct tcatcaatca aatcggatcc atgtcttttg tccaaaaggg 6360
tacttggtta ctttttgctc tgttgcaccc aactgttatt ctcgcacaac aggaagcagt 6420
agatggtggt tgctcacatt taggtcaatc ttacgcagat agagatgtat ggaaacctga 6480
accatgtcaa atttgcgtgt gtgactcagg ttcagtgctc tgcgacgata tcatatgtga 6540
cgaccaggaa ttggactgtc caaacccaga gataccattc ggtgaatgtt gtgctgtttg 6600
tccacagcca ccaactgctc ctacaagacc tccaaacggt caaggtccac aaggtcctaa 6660
aggtgatccg ggtccacctg gtattcctgg tagaaatggt gaccctggac ctcccggttc 6720
cccaggtagc ccaggatcac ctgggcctcc tggaatatgt gaatcctgcc caactggtgg 6780
tcagaactat agcccacaat acgaggccta cgacgtcaaa tctggtgttg ctggaggagg 6840
tattgcaggc taccctggtc ccgcagggcc cccaggtccg ccgggtccgc ccggaacatc 6900
aggtcatccc ggagcccctg gtgcaccagg ttatcaggga ccgcccggag agcctggaca 6960
agctggtccc gctggacccc ctggtccacc aggtgctatt ggaccaagtg gtcctgccgg 7020
aaaagacggt gaatccggta gacctggtag acccggcgaa aggggtttcc caggtcctcc 7080
cggaatgaag ggtccagccg gtatgcccgg ttttcctggg atgaagggtc acagaggatt 7140
tgatggtaga aacggagaga aaggcgaaac cggtgctccc ggactgaagg gtgaaaacgg 7200
tgtccctggt gagaacggcg ctcctggacc tatgggtcca cgtggtgctc caggagaaag 7260
aggcagacca ggattgcctg gtgcagctgg tgctagaggt aacgatggtg cccgtggttc 7320
cgatggacaa cccgggccac ccggccctcc aggtaccgct ggatttcctg gaagccctgg 7380
tgctaagggg gaggttggtc cggctggtag tcccggaagt agcggtgccc caggtcaaag 7440
aggcgaacca ggccctcagg gtcacgcagg agcacctgga ccgcctggtc ctcctggttc 7500
gaatggttcg cctggaggaa aaggtgaaat ggggcccgca ggaatccccg gtgcgcctgg 7560
tcttattggt gccaggggtc ctccaggccc gccaggtaca aatggtgtac ccggacagcg 7620
aggagcagct ggtgaacctg gtaaaaacgg tgccaaagga gatccaggtc ctcgtggaga 7680
gcgtggtgaa gctggctctc ccggtatcgc cggtccaaaa ggtgaggacg gtaaggacgg 7740
ttcccctggt gagccaggtg cgaacggact gccaggtgca gccggagagc gaggagtccc 7800
aggattcagg ggaccagccg gtgctaacgg cttgcctggt gaaaaagggc cccctggtga 7860
taggggagga cccggtccag caggccctcg tggagttgct ggtgagcctg gacgtgacgg 7920
tttaccagga gggccaggtt tgaggggtat tcccgggtcc cctggcggtc ctggatcgga 7980
tggaaaacca gggccaccag gttcgcaggg tgaaacagga cgtccaggcc cacccggctc 8040
acctggtcca aggggtcagc ctggtgtcat gggtttcccc ggtccaaagg gtaatgacgg 8100
agcaccgggt aaaaatggtg aacgtggtgg cccaggtggt ccaggacccc aaggtccagc 8160
tggaaaaaac ggtgagacag gtcctcaagg acctccagga cctaccggtc ctagcggaga 8220
taagggagat acgggaccgc caggacctca aggattgcaa ggtttgcctg gtacatctgg 8280
ccctcccgga gaaaatggta agcctggaga gccaggacca aaaggcgaag ctggagcccc 8340
aggtatcccc ggaggtaagg gagactcagg tgctccgggt gagcgtggtc ctccgggtgc 8400
cggtggtcca cctggaccta gaggtggtgc cgggccgcca ggtcctgaag gtggtaaagg 8460
tgctgctggt ccaccgggac cgcctggctc tgctggtact cctggcttgc agggaatgcc 8520
aggagagaga ggtggacctg gaggtcccgg tccgaagggt gataaagggg agccaggatc 8580
atccggtgtt gacggcgcac ctggtaaaga cggaccaagg ggaccaacgg gtccaatcgg 8640
accaccagga cccgctggcc agccaggaga taaaggcgag tccggagcac ccggtgttcc 8700
tggtatagct ggacccaggg gtggtcccgg tgaaagaggt gaacagggcc caccgggtcc 8760
cgccggtttc cctggcgccc ctggtcaaaa tggagaacca ggtgcaaagg gcgagagagg 8820
agccccagga gaaaagggtg agggaggacc acccggtgct gccggtccag ctgggggttc 8880
aggtcctgct ggaccaccag gtccacaggg cgttaaaggt gagagaggaa gtccaggtgg 8940
tcctggagct gctggattcc caggtggccg tggacctcct ggtccccctg gatcgaatgg 9000
taatcctggt ccgccaggta gttcgggtgc tcctgggaag gacggtccac ctggcccccc 9060
aggtagtaac ggtgcacctg gtagtccagg tatatccgga cctaaaggag attccggtcc 9120
accaggcgaa agaggggccc caggcccaca gggtccacca ggagcccccg gtcctctggg 9180
tattgctggt cttactggtg cacgtggact ggccggtcca cccggaatgc ctggagcaag 9240
aggttcacct ggaccacaag gtattaaagg agagaacggt aaacctggac cttccggtca 9300
aaacggagag cggggacccc caggccccca aggtctgcca ggactagctg gtaccgcagg 9360
ggaaccagga agagatggaa atccaggttc agacggacta cccggtagag atggtgcacc 9420
gggggccaag ggcgacaggg gtgagaatgg atctcctggt gcgccagggg caccaggcca 9480
cccaggtccc ccaggtcctg tgggccctgc tggaaagtca ggtgacaggg gagagacagg 9540
cccggctggt ccatctggcg cacccggacc agctggttcc agaggcccac ctggtccgca 9600
aggccctaga ggtgacaagg gagagactgg agaacgaggt gctatgggta tcaagggtca 9660
tagaggtttt ccgggtaatc ccggcgcccc aggttctcct ggtccagctg gccatcaagg 9720
tgcagtcgga tcgcccggcc cagccggtcc caggggccct gttggtccat ccggtcctcc 9780
aggaaaggat ggtgcttctg gacacccagg acctatcgga cctccgggtc ctagaggtaa 9840
tagaggagaa cgtggttccg agggtagtcc tggtcaccct ggtcaacctg gcccaccagg 9900
gcctccaggt gcacccggtc catgttgtgg tgcaggcggt gtggctgcaa ttgctggtgt 9960
gggtgctgaa aaggccggcg gtttcgctcc atattatggt gatgaaccga ttgattttaa 10020
gatcaatact gacgaaatca tgacttcctt aaagtccgtt aatggtcaaa ttgagtctct 10080
aatctcccca gatggttcac gtaaaaatcc tgctagaaat tgtagagatt tgaagttttg 10140
tcaccccgag ttgcagtccg gtgagtactg ggtggacccc aatcaaggtt gtaagttaga 10200
cgctattaaa gtttactgca atatggagac aggagaaact tgcatcagcg cttctccatt 10260
gactatccca caaaaaaatt ggtggactga ctctggagct gagaaaaagc atgtatggtt 10320
cggggaatcg atggaaggtg gtttccaatt cagctacggt aaccctgaac ttcctgaaga 10380
tgttcttgac gttcaattgg catttctgag attgttgtcc agtcgtgcaa gccaaaacat 10440
tacataccat tgcaaaaatt ccatcgcata tatggatcat gctagcggaa atgtgaaaaa 10500
ggcattgaag ctgatgggat caaatgaagg tgaatttaaa gcagagggta attctaagtt 10560
tacttacact gtattggagg atggttgtac gaagcataca ggtgaatggg gtaaaacagt 10620
gtttcaatat caaacccgca aagcagttag attgccaatc gtcgatatcg caccatacga 10680
cattggagga ccagatcaag agttcggagc tgacatcggt ccggtgtgtt tcctttgata 10740
atcaagagga tgtcagaatg ccatttgcct gagagatgca ggcttcattt ttgatacttt 10800
tttatttgta acctatatag tataggattt tttttgtcat tttgtttctt ctcgtacgag 10860
cttgctcctg atcagcctat ctcgcagctg atgaatatct tgtggtaggg gtttgggaaa 10920
atcattcgag tttgatgttt ttcttggtat ttcccactcc tcttcagagt acagaagatt 10980
aagtgagacg ttcgtttgtg cccgcggatt taaatgatcc ttcagtaatg tcttgtttct 11040
tttgttgcag tggtgagcca ttttgacttc gtgaaagttt ctttagaata gttgtttcca 11100
gaggccaaac attccacccg tagtaaagtg caagcgtagg aagaccaaga ctggcataaa 11160
tcaggtataa gtgtcgagca ctggcaggtg atcttctgaa agtttctact agcagataag 11220
atccagtagt catgcatatg gcaacaatgt accgtgtgga tctaagaacg cgtcctacta 11280
accttcgcat tcgttggtcc agtttgttgt tatcgatcaa cgtgacaagg ttgtcgattc 11340
cgcgtaagca tgcataccca aggacgcctg ttgcaattcc aagtgagcca gttccaacaa 11400
tctttgtaat attagagcac ttcattgtgt tgcgcttgaa agtaaaatgc gaacaaatta 11460
agagataatc tcgaaaccgc gacttcaaac gccaatatga tgtgcggcac acaataagcg 11520
ttcatatccg ctgggtgact ttctcgcttt aaaaaattat ccgaaaaaat tttctagagt 11580
gttgttactt tatacttccg gctcgtataa tacgacaagg tgtaaggagg actaaaccat 11640
ggctaaactc acctctgctg ttccagtcct gactgctcgt gatgttgctg gtgctgttga 11700
gttctggact gataggctcg gtttctcccg tgacttcgta gaggacgact ttgccggtgt 11760
tgtacgtgac gacgttaccc tgttcatctc cgcagttcag gaccaggttg tgccagacaa 11820
cactctggca tgggtatggg ttcgtggtct ggacgaactg tacgctgagt ggtctgaggt 11880
cgtgtctacc aacttccgtg atgcatctgg tccagctatg accgagatcg gtgaacagcc 11940
ctggggtcgt gagtttgcac tgcgtgatcc agctggtaac tgcgtgcatt tcgtcgcaga 12000
agagcaggac taacaattga caccttacga ttatttagag agtatttatt agttttattg 12060
tatgtata 12068
<210> 20
<211> 5735
<212> DNA
<213> Artificial Sequence
<220>
<223> MMV84 (Sequence 18)
<400> 20
aacatccaaa gacgaaaggt tgaatgaaac ctttttgcca tccgacatcc acaggtccat 60
tctcacacat aagtgccaaa cgcaacagga ggggatacac tagcagcaga ccgttgcaaa 120
cgcaggacct ccactcctct tctcctcaac acccactttt gccatcgaaa aaccagccca 180
gttattgggc ttgattggag ctcgctcatt ccaattcctt ctattaggct actaacacca 240
tgactttatt agcctgtcta tcctggcccc cctggcgagg ttcatgtttg tttatttccg 300
aatgcaacaa gctccgcatt acacccgaac atcactccag atgagggctt tctgagtgtg 360
gggtcaaata gtttcatgtt ccccaaatgg cccaaaactg acagtttaaa cgctgtcttg 420
gaacctaata tgacaaaagc gtgatctcat ccaagatgaa ctaagtttgg ttcgttgaaa 480
tgctaacggc cagttggtca aaaagaaact tccaaaagtc ggcataccgt ttgtcttgtt 540
tggtattgat tgacgaatgc tcaaaaataa tctcattaat gcttagcgca gtctctctat 600
cgcttctgaa ccccggtgca cctgtgccga aacgcaaatg gggaaacacc cgctttttgg 660
atgattatgc attgtctcca cattgtatgc ttccaagatt ctggtgggaa tactgctgat 720
agcctaacgt tcatgatcaa aatttaactg ttctaacccc tacttgacag caatatataa 780
acagaaggaa gctgccctgt cttaaacctt tttttttatc atcattatta gcttactttc 840
ataattgcga ctggttccaa ttgacaagct tttgatttta acgactttta acgacaactt 900
gagaagatca aaaaacaact aattattgaa agaattcaaa acgaaaatga gattcccatc 960
tattttcacc gctgtcttgt tcgctgcctc ctctgcattg gctgcccctg ttaacactac 1020
cactgaagac gagactgctc aaattccagc tgaagcagtt atcggttact ctgaccttga 1080
gggtgatttc gacgtcgctg ttttgccttt ctctaactcc actaacaacg gtttgttgtt 1140
cattaacacc actatcgctt ccattgctgc taaggaagag ggtgtctctc tcgagaaaag 1200
agaggccgaa gctgcacccg atgaggaaga tcatgtttta gtattgcata aaggaaattt 1260
cgatgaagct ttggccgctc acaaatatct gctcgtcgag ttttacgctc cctggtgcgg 1320
tcattgtaag gcccttgcac cagagtacgc caaggcagct ggtaagttaa aggccgaagg 1380
ttcagagatc agattagcaa aagttgatgc tacagaagag tccgatcttg ctcaacaata 1440
cggggttcga ggatacccaa caattaagtt tttcaaaaat ggtgatactg cttccccaaa 1500
ggaatatact gctggtagag aggcagacga catagtcaac tggctcaaaa agagaacggg 1560
cccagctgcg tctacattaa gcgacggagc agcagccgaa gctcttgtgg aatctagtga 1620
agttgctgta atcggtttct ttaaggacat ggaatctgat tcagctaaac agttcctttt 1680
agcagctgaa gcaatcgatg acatcccttt cggaatcacc tcaaatagtg acgtgttcag 1740
caagtaccaa cttgacaaag atggagtggt cttgttcaaa aagtttgacg aaggcagaaa 1800
caatttcgag ggtgaggtta caaaggagaa actgcttgat ttcattaaac ataaccaact 1860
acccttagtt atcgaattca ctgaacaaac tgctcctaag attttcggtg gagaaatcaa 1920
aacacatatc ttgttgtttt tgccaaagtc cgtatcggat tatgaaggta aactctccaa 1980
tttcaaaaag gccgctgaga gctttaaggg caagattttg ttcatcttta ttgactcaga 2040
ccacacagac aatcagagga ttttggagtt tttcggtttg aaaaaggagg aatgtccagc 2100
agtccgtttg atcaccttgg aggaggagat gaccaaatac aaaccagagt cggatgagtt 2160
gactgccgag aagataacag aattttgtca cagatttctg gaaggtaaga tcaagcctca 2220
tcttatgtct caagagttgc ctgatgactg ggataagcaa ccagttaaag tattggtggg 2280
taaaaacttt gaggaagtgg ccttcgacga gaaaaaaaat gtctttgttg aattctatgc 2340
tccgtggtgt ggtcactgta agcagctggc accaatttgg gataaactgg gtgaaactta 2400
caaagatcac gaaaacattg ttattgcaaa gatggacagt actgctaacg aagtggaggc 2460
tgtgaaagtt cactccttcc ctacgctgaa gttctttcct gcatctgctg acagaactgt 2520
tatcgactat aatggagaga ggacattgga tggttttaaa aagtttcttg aatccggagg 2580
tcaagacgga gctggtgacg acgatgattt ggaagatctg gaggaggctg aggaacctga 2640
tcttgaggag gatgacgacc agaaggcagt caaagatgaa ctgtgataag gggggttaaa 2700
ggggcggccg ctcaagagga tgtcagaatg ccatttgcct gagagatgca ggcttcattt 2760
ttgatacttt tttatttgta acctatatag tataggattt tttttgtcat tttgtttctt 2820
ctcgtacgag cttgctcctg atcagcctat ctcgcagcag atgaatatct tgtggtaggg 2880
gtttgggaaa atcattcgag tttgatgttt ttcttggtat ttcccactcc tcttcagagt 2940
acagaagatt aagtgaaacc ttcgtttgtg cggatccttc agtaatgtct tgtttctttt 3000
gttgcagtgg tgagccattt tgacttcgtg aaagtttctt tagaatagtt gtttccagag 3060
gccaaacatt ccacccgtag taaagtgcaa gcgtaggaag accaagactg gcataaatca 3120
ggtataagtg tcgagcactg gcaggtgatc ttctgaaagt ttctactagc agataagatc 3180
cagtagtcat gcatatggca acaatgtacc gtgtggatct aagaacgcgt cctactaacc 3240
ttcgcattcg ttggtccagt ttgttgttat cgatcaacgt gacaaggttg tcgattccgc 3300
gtaagcatgc atacccaagg acgcctgttg caattccaag tgagccagtt ccaacaatct 3360
ttgtaatatt agagcacttc attgtgttgc gcttgaaagt aaaatgcgaa caaattaaga 3420
gataatctcg aaaccgcgac ttcaaacgcc aatatgatgt gcggcacaca ataagcgttc 3480
atatccgctg ggtgactttc tcgctttaaa aaattatccg aaaaaatttt ctagagtgtt 3540
gttactttat acttccggct cgtataatac gacaaggtgt aaggaggact aaaccatggg 3600
taaggaaaag actcacgttt cgaggccgcg attaaattcc aacatggatg ctgatttata 3660
tgggtataaa tgggctcgcg ataatgtcgg gcaatcaggt gcgacaatct atcgattgta 3720
tgggaagccc gatgcgccag agttgtttct gaaacatggc aaaggtagcg ttgccaatga 3780
tgttacagat gagatggtca gactaaactg gctgacggaa tttatgcctc ttccgaccat 3840
caagcatttt atccgtactc ctgatgatgc atggttactc accactgcga tccccggcaa 3900
aacagcattc caggtattag aagaatatcc tgattcaggt gaaaatattg ttgatgcgct 3960
ggcagtgttc ctgcgccggt tgcattcgat tcctgtttgt aattgtcctt ttaacagcga 4020
tcgcgtattt cgtctcgctc aggcgcaatc acgaatgaat aacggtttgg ttgatgcgag 4080
tgattttgat gacgagcgta atggctggcc tgttgaacaa gtctggaaag aaatgcataa 4140
gcttttgcca ttctcaccgg attcagtcgt cactcatggt gatttctcac ttgataacct 4200
tatttttgac gaggggaaat taataggttg tattgatgtt ggacgagtcg gaatcgcaga 4260
ccgataccag gatcttgcca tcctatggaa ctgcctcggt gagttttctc cttcattaca 4320
gaaacggctt tttcaaaaat atggtattga taatcctgat atgaataaat tgcagtttca 4380
tttgatgctc gatgagtttt tctaacaatt gacaccttac gattatttag agagtattta 4440
ttagttttat tgtatgtata cggatgtttt attatctatt tatgccctta tattctgtaa 4500
ctatccaaaa gtcctatctt atcaagccag caatctatgt ccgcgaacgt caactaaaaa 4560
taagcttttt atgctgttct ctcttttttt cccttcggta taattatacc ttgcatccac 4620
agattctcct gccaaatttt gcataatcct ttacaacatg gctatatggg agcacttagc 4680
gccctccaaa acccatattg cctacgcatg tataggtgtt ttttccacaa tattttctct 4740
gtgctctctt tttattaaag agaagctcta tatcggagaa gcttctgtgg ccgttatatt 4800
cggccttatc gtgggaccac attgcctgaa ttggtttgcc ccggaagatt ggggaaactt 4860
ggatctgatt accttagctg caggtaccac tgagcgtcag accccgtaga aaagatcaaa 4920
ggatcttctt gagatccttt ttttctgcgc gtaatctgct gcttgcaaac aaaaaaacca 4980
ccgctaccag cggtggtttg tttgccggat caagagctac caactctttt tccgaaggta 5040
actggcttca gcagagcgca gataccaaat actgttcttc tagtgtagcc gtagttaggc 5100
caccacttca agaactctgt agcaccgcct acatacctcg ctctgctaat cctgttacca 5160
gtggctgctg ccagtggcga taagtcgtgt cttaccgggt tggactcaag acgatagtta 5220
ccggataagg cgcagcggtc gggctgaacg gggggttcgt gcacacagcc cagcttggag 5280
cgaacgacct acaccgaact gagataccta cagcgtgagc tatgagaaag cgccacgctt 5340
cccgaaggga gaaaggcgga caggtatccg gtaagcggca gggtcggaac aggagagcgc 5400
acgagggagc ttccaggggg aaacgcctgg tatctttata gtcctgtcgg gtttcgccac 5460
ctctgacttg agcgtcgatt tttgtgatgc tcgtcagggg ggcggagcct atggaaaaac 5520
gccagcaacg cggccttttt acggttcctg gccttttgct ggccttttgc tcacatgttc 5580
tttcctgcgg tacccagatc caattcccgc tttgactgcc tgaaatctcc atcgcctaca 5640
atgatgacat ttggatttgg ttgactcatg ttggtattgt gaaatagacg cagatcggga 5700
acactgaaaa atacacagtt attattcatt taaat 5735
<210> 21
<211> 7204
<212> DNA
<213> Artificial Sequence
<220>
<223> MMV150 (Sequence 19)
<400> 21
aaaaataagc tttttatgct cttctctctt tttttccctt cggtataatt ataccttgca 60
tccacagatt ctcctgccaa attttgcata atcctttaca acatggctat atgggagcac 120
ttagcgccct ccaaaaccca tattgcctac gcatgtatag gtgttttttc cacaatattt 180
tctctgtgct ctctttttat taaagagaag ctctatatcg gagaagcttc tgtggccgtt 240
atattcggcc ttatcgtggg accacattgc ctgaattggt ttgccccgga agattgggga 300
aacttggatc tgattacctt agctgcagaa aagggtacca ctgagcgtca gaccccgtag 360
aaaagatcaa aggatcttct tgagatcctt tttttctgcg cgtaatctgc tgcttgcaaa 420
caaaaaaacc accgctacca gcggtggttt gtttgccgga tcaagagcta ccaactcttt 480
ttccgaaggt aactggcttc agcagagcgc agataccaaa tactgttctt ctagtgtagc 540
cgtagttagg ccaccacttc aagaactctg tagcaccgcc tacatacctc gctctgctaa 600
tcctgttacc agtggctgct gccagtggcg ataagtcgtg tcttaccggg ttggacccaa 660
gacgatagtt accggataag gcgcagcggt cgggctgaac ggggggttcg tgcacacagc 720
ccagcttgga gcgaacgacc tacaccgaac tgagatacct acagcgtgag ctatgagaaa 780
gcgccacgct tcccgaaggg agaaaggcgg acaggtatcc ggtaagcggc agggtcggaa 840
caggagagcg cacgagggag cttccagggg gaaacgcctg gtatctttat agtcctgtcg 900
ggtttcgcca cctctgactt gagcgtcgat ttttgtgatg ctcgtcaggg gggcggagcc 960
tatggaaaaa cgccagcaac gcggcctttt tacggttcct ggccttttgc tggccttttg 1020
ctcacatgta ttttatgtaa gctttgaaca cttatgtaag ctcgaaacca gttaggtaag 1080
cagctttgta agcaatctgg acaatatgta agcgggttac gtaaacagtt atgtaagcag 1140
aaaaatttca aacgacaaaa cttggggtct acagacacag tagccagaag attgcactac 1200
cattcgactc ctcatgaccc actctttcga tccatgtagt taggttaccg tttttcctaa 1260
tatttaagga tgttgaaaat tcattttcat tttttttcgt ttttaagatt ttctcacaac 1320
tcttccaaag attactagtt gacttttcaa aatatttagg gtatttttct cactttttcc 1380
tagcaaactc caattggtgg gttcagtgca atggagtacc accttgcaac cacaacgtaa 1440
tagctaactt gtggccacca tgtctggttg tagagataat tggattctaa tgtggatcac 1500
atgactactc acgtgtcaaa aacccaacct gacttggccc agcttagcaa gaatatttcg 1560
aatccactct tgtggcctag tggacaactg ggaaagcttg cgacgcagtc gtttttggcg 1620
atccaggcgt agtactagga aataatgtat ctaaacgcaa actccgagct ggaaaaatgt 1680
taccggcgat gcgcggacaa tttagaggcg gcgatcaaga aacacctgct gggcgagcag 1740
tctggagcac agtcttcgat gggcccgaga tcccaccgcg ttcctgggta ccgggacgtg 1800
aggcagcgcg acatccatca aatataccag gcgccaaccg agtgtctcgg aaaacagctt 1860
ctggatatct tccgctggcg gcgcaacgac gaataatagt ccctggaggt gacggaatat 1920
atatgtgtgg agggtaaatc tgacagggtg tagcaaaggt aatattttcc taaaacatgc 1980
aatcggctgc cccgcaacgg gaaaaagaat gactttggca ctcttcacca gagtggggtg 2040
tcccgctcgt gtgtgcaaat aggctcccac tggtcacccc ggattttgca gaaaaacagc 2100
aagttccggg gtgtctcact ggtgtccgcc aataagagga gccggcaggc acggagttta 2160
catcaagctg tctccgatac actcgactac catccgggtc tctcagagag gggaatggca 2220
ctataaatac cgcctccttg cgctctctgc cttcatcaat caaatcatgc tgaggactcg 2280
aattccctag gatgttctct ccaattttgt ccttggaaat tattttagct ttggctactt 2340
tgcaatctgt cttcgctcaa cagtatccgt atgatgtgcc ggattatgcg tctccccagt 2400
acgaagcata tgatgtcaag tctggagtag caggaggagg aatcgcaggc tatcctgggc 2460
cagctggtcc tcctggccca cccggacccc ctggcacatc tggccatcct ggtgcccctg 2520
gcgctccagg ataccaaggt ccccccggtg aacctgggca agctggtccg gcaggtcctc 2580
caggacctcc tggtgctata ggtccatctg gccctgctgg aaaagatggg gaatcaggaa 2640
gacccggacg acctggagag cgaggatttc ctggccctcc tggtatgaaa ggcccagctg 2700
gtatgcctgg attccctggt atgaaaggac acagaggctt tgatggacga aatggagaga 2760
aaggcgaaac tggtgctcct ggattaaagg gggaaaatgg cgttccaggt gaaaatggag 2820
ctcctggacc catgggtcca agaggggctc ccggtgagag aggacggcca ggacttcctg 2880
gagccgcagg ggctcgaggt aatgatggag ctcgaggaag tgatggacaa ccgggccccc 2940
ctggtcctcc tggaactgca ggattccctg gttcccctgg tgctaagggt gaagttggac 3000
ctgcaggatc tcctggttca agtggcgccc ctggacaaag aggagaacct ggacctcagg 3060
gacatgctgg tgctccaggt ccccctgggc ctcctgggag taatggtagt cctggtggca 3120
aaggtgaaat gggtcctgct ggcattcctg gggctcctgg gctgatagga gctcgtggtc 3180
ctccagggcc acctggcacc aatggtgttc ccgggcaacg aggtgctgca ggtgaacccg 3240
gtaagaatgg agccaaagga gacccaggac cacgtgggga acgcggagaa gctggttctc 3300
caggtatcgc aggacctaag ggtgaagatg gcaaagatgg ttctcctgga gaacctggtg 3360
caaatggact tcctggagct gcaggagaaa ggggtgtgcc tggattccga ggacctgctg 3420
gagcaaatgg ccttccagga gaaaagggtc ctcctgggga ccgtggtggc ccaggccctg 3480
cagggcccag aggtgttgct ggagagcccg gcagagatgg tctccctgga ggtccaggat 3540
tgaggggtat tcctggtagc cccggaggac caggcagtga tgggaaacca gggcctcctg 3600
gaagccaagg agagacgggt cgacccggtc ctccaggttc acctggtccg cgaggccagc 3660
ctggtgtcat gggcttccct ggtcccaaag gaaacgatgg tgctcctgga aaaaatggag 3720
aacgaggtgg ccctggaggt cctggccctc agggtcctgc tggaaagaat ggtgagaccg 3780
gacctcaggg tcctccagga cctactggcc cttctggtga caaaggagac acaggacccc 3840
ctggtccaca aggactacaa ggcttgcctg gaacgagtgg tcccccagga gaaaacggaa 3900
aacctggtga acctggtcca aagggtgagg ctggtgcacc tggaattcca ggaggcaagg 3960
gtgattctgg tgctcccggt gaacgcggac ctcctggagc aggagggccc cctggaccta 4020
gaggtggagc tggcccccct ggtcccgaag gaggaaaggg tgctgctggt ccccctgggc 4080
cacctggttc tgctggtaca cctggtctgc aaggaatgcc tggagaaaga gggggtcctg 4140
gaggccctgg tccaaagggt gataagggtg agcctggcag ctcaggtgtc gatggtgctc 4200
cagggaaaga tggtccacgg ggtcccactg gtcccattgg tcctcctggc ccagctggtc 4260
agcctggaga taagggtgaa agtggtgccc ctggagttcc gggtatagct ggtcctcgcg 4320
gtggccctgg tgagagaggc gaacaggggc ccccaggacc tgctggcttc cctggtgctc 4380
ctggccagaa tggtgagcct ggtgctaaag gagaaagagg cgctcctggt gagaaaggtg 4440
aaggaggccc tcccggagcc gcaggacccg ccggaggttc tgggcctgcc ggtcccccag 4500
gcccccaagg tgtcaaaggc gaacgtggca gtcctggtgg tcctggtgct gctggcttcc 4560
ccggtggtcg tggtcctcct ggccctcctg gcagtaatgg taacccaggc cccccaggct 4620
ccagtggtgc tccaggcaaa gatggtcccc caggtccacc tggcagtaat ggtgctcctg 4680
gcagccccgg gatctctgga ccaaagggtg attctggtcc accaggtgag aggggagcac 4740
ctggccccca gggccctccg ggagctccag gcccactagg aattgcagga cttactggag 4800
cacgaggtct tgcaggccca ccaggcatgc caggtgctag gggcagcccc ggcccacagg 4860
gcatcaaggg tgaaaatggt aaaccaggac ctagtggtca gaatggagaa cgtggtcctc 4920
ctggccccca gggtcttcct ggtctggctg gtacagctgg tgagcctgga agagatggaa 4980
accctggatc agatggtctg ccaggccgag atggagctcc aggtgccaag ggtgaccgtg 5040
gtgaaaatgg ctctcctggt gcccctggag ctcctggtca cccaggccct cctggtcctg 5100
tcggtccagc tggaaagagc ggtgacagag gagaaactgg ccctgctggt ccttctgggg 5160
cccccggtcc tgccggatca agaggtcctc ctggtcccca aggcccacgc ggtgacaaag 5220
gggaaaccgg tgagcgtggt gctatgggca tcaaaggaca tcgcggattc cctggcaacc 5280
caggggcccc cggatctccg ggtcccgctg gtcatcaagg tgcagttggc agtccaggcc 5340
ctgcaggccc cagaggacct gttggaccta gcgggccccc tggaaaggac ggagcaagtg 5400
gacaccctgg tcccattgga ccaccggggc cccgaggtaa cagaggtgaa agaggatctg 5460
agggctcccc aggccaccca ggacaaccag gccctcctgg acctcctggt gcccctggtc 5520
catgttgtgg tgctggcggg gttgctgcca ttgctggtgt tggagccgaa aaagctggtg 5580
gttttgcccc atattatgga gctagcggtt acattcctga agctcctaga gacggacaag 5640
catacgttag aaaggacggt gagtgggtgt tgctgtccac cttcttagct agcgattaca 5700
aggatgacga cgataaggga tcgtgttgcc cgggctgctg tcatcaccat catcaccata 5760
gatcttaagc ggccgcgagt cgtgagtaat caagaggatg tcagaatgcc atttgcctga 5820
gagatgcagg cttcattttt gatacttttt tatttgtaac ctatatagta taggattttt 5880
tttgtcattt tgtttcttct cgtacgagct tgctcctgat cagcctatct cgcagctgat 5940
gaatatcttg tggtaggggt ttgggaaaat cattcgagtt tgatgttttt cttggtattt 6000
cccactcctc ttcagagtac agaagattaa gtgagacgtt cgtttgtgct ccggaggatc 6060
cttcagtaat gtcttgtttc ttttgttgca gtggtgagcc attttgactt cgtgaaagtt 6120
tctttagaat agttgtttcc agaggccaaa cattccaccc gtagtaaagt gcaagcgtag 6180
gaagaccaag actggcataa atcaggtata agtgtcgagc actggcaggt gatcttctga 6240
aagtttctac tagcagataa gatccagtag tcatgcatat ggcaacaatg taccgtgtgg 6300
atctaagaac gcgtcctact aaccttcgca ttcgttggtc cagtttgttg ttatcgatca 6360
acgtgacaag gttgtcgatt ccgcgtaagc atgcataccc aaggacgcct gttgcaattc 6420
caagtgagcc agttccaaca atctttgtaa tattagagca cttcattgtg ttgcgcttga 6480
aagtaaaatg cgaacaaatt aagagataat ctcgaaaccg cgacttcaaa cgccaatatg 6540
atgtgcggca cacaataagc gttcatatcc gctgggtgac tttctcgctt taaaaaatta 6600
tccgaaaaaa ttttctagag tgttgttact ttatacttcc ggctcgtata atacgacaag 6660
gtgtaaggag gactaaacca tggctaaact cacctctgct gttccagtcc tgactgctcg 6720
tgatgttgct ggtgctgttg agttctggac tgataggctc ggtttctccc gtgacttcgt 6780
agaggacgac tttgccggtg ttgtacgtga cgacgttacc ctgttcatct ccgcagttca 6840
ggaccaggtt gtgccagaca acactctggc atgggtatgg gttcgtggtc tggacgaact 6900
gtacgctgag tggtctgagg tcgtgtctac caacttccgt gatgcatctg gtccagctat 6960
gaccgagatc ggtgaacagc cctggggtcg tgagtttgca ctgcgtgatc cagctggtaa 7020
ctgcgtgcat ttcgtcgcag aagagcagga ctaacaattg acaccttacg attatttaga 7080
gagtatttat tagttttatt gtatgtatac ggatgtttta ttatctattt atgcccttat 7140
attctgtaac tatccaaaag tcctatctta tcaagccagc aatctatgtc cgcgaacgtc 7200
aact 7204
<210> 22
<211> 6601
<212> DNA
<213> Artificial Sequence
<220>
<223> MMV140 (Sequence 20)
<400> 22
gatcaaagga tcttcttgag atcctttttt tctgcgcgta atctgctgct tgcaaacaaa 60
aaaaccaccg ctaccagcgg tggtttgttt gccggatcaa gagctaccaa ctctttttcc 120
gaaggtaact ggcttcagca gagcgcagat accaaatact gttcttctag tgtagccgta 180
gttaggccac cacttcaaga actctgtagc accgcctaca tacctcgctc tgctaatcct 240
gttaccagtg gctgctgcca gtggcgataa gtcgtgtctt accgggttgg acccaagacg 300
atagttaccg gataaggcgc agcggtcggg ctgaacgggg ggttcgtgca cacagcccag 360
cttggagcga acgacctaca ccgaactgag atacctacag cgtgagctat gagaaagcgc 420
cacgcttccc gaagggagaa aggcggacag gtatccggta agcggcaggg tcggaacagg 480
agagcgcacg agggagcttc cagggggaaa cgcctggtat ctttatagtc ctgtcgggtt 540
tcgccacctc tgacttgagc gtcgattttt gtgatgctcg tcaggggggc ggagcctatg 600
gaaaaacgcc agcaacgcgg cctttttacg gttcctggcc ttttgctggc cttttgctca 660
catgtattta aataatgtat ctaaacgcaa actccgagct ggaaaaatgt taccggcgat 720
gcgcggacaa tttagaggcg gcgatcaaga aacacctgct gggcgagcag tctggagcac 780
agtcttcgat gggcccgaga tcccaccgcg ttcctgggta ccgggacgtg aggcagcgcg 840
acatccatca aatataccag gcgccaaccg agtgtctcgg aaaacagctt ctggatatct 900
tccgctggcg gcgcaacgac gaataatagt ccctggaggt gacggaatat atatgtgtgg 960
agggtaaatc tgacagggtg tagcaaaggt aatattttcc taaaacatgc aatcggctgc 1020
cccgcaacgg gaaaaagaat gactttggca ctcttcacca gagtggggtg tcccgctcgt 1080
gtgtgcaaat aggctcccac tggtcacccc ggattttgca gaaaaacagc aagttccggg 1140
gtgtctcact ggtgtccgcc aataagagga gccggcaggc acggagttta catcaagctg 1200
tctccgatac actcgactac catccgggtc tctcagagag gggaatggca ctataaatac 1260
cgcctccttg cgctctctgc cttcatcaat caaatcatgc tgaggactcg aattccctag 1320
gatgatgagc tttgtgcaaa aggggacctg gttacttttc gctctgcttc atcccactgt 1380
tattttggca caacagtatc cgtatgatgt gccggattat gcgtctcccc agtacgaagc 1440
atatgatgtc aagtctggag tagcaggagg aggaatcgca ggctatcctg ggccagctgg 1500
tcctcctggc ccacccggac cccctggcac atctggccat cctggtgccc ctggcgctcc 1560
aggataccaa ggtccccccg gtgaacctgg gcaagctggt ccggcaggtc ctccaggacc 1620
tcctggtgct ataggtccat ctggccctgc tggaaaagat ggggaatcag gaagacccgg 1680
acgacctgga gagcgaggat ttcctggccc tcctggtatg aaaggcccag ctggtatgcc 1740
tggattccct ggtatgaaag gacacagagg ctttgatgga cgaaatggag agaaaggcga 1800
aactggtgct cctggattaa agggggaaaa tggcgttcca ggtgaaaatg gagctcctgg 1860
acccatgggt ccaagagggg ctcccggtga gagaggacgg ccaggacttc ctggagccgc 1920
aggggctcga ggtaatgatg gagctcgagg aagtgatgga caaccgggcc cccctggtcc 1980
tcctggaact gcaggattcc ctggttcccc tggtgctaag ggtgaagttg gacctgcagg 2040
atctcctggt tcaagtggcg cccctggaca aagaggagaa cctggacctc agggacatgc 2100
tggtgctcca ggtccccctg ggcctcctgg gagtaatggt agtcctggtg gcaaaggtga 2160
aatgggtcct gctggcattc ctggggctcc tgggctgata ggagctcgtg gtcctccagg 2220
gccacctggc accaatggtg ttcccgggca acgaggtgct gcaggtgaac ccggtaagaa 2280
tggagccaaa ggagacccag gaccacgtgg ggaacgcgga gaagctggtt ctccaggtat 2340
cgcaggacct aagggtgaag atggcaaaga tggttctcct ggagaacctg gtgcaaatgg 2400
acttcctgga gctgcaggag aaaggggtgt gcctggattc cgaggacctg ctggagcaaa 2460
tggccttcca ggagaaaagg gtcctcctgg ggaccgtggt ggcccaggcc ctgcagggcc 2520
cagaggtgtt gctggagagc ccggcagaga tggtctccct ggaggtccag gattgagggg 2580
tattcctggt agccccggag gaccaggcag tgatgggaaa ccagggcctc ctggaagcca 2640
aggagagacg ggtcgacccg gtcctccagg ttcacctggt ccgcgaggcc agcctggtgt 2700
catgggcttc cctggtccca aaggaaacga tggtgctcct ggaaaaaatg gagaacgagg 2760
tggccctgga ggtcctggcc ctcagggtcc tgctggaaag aatggtgaga ccggacctca 2820
gggtcctcca ggacctactg gcccttctgg tgacaaagga gacacaggac cccctggtcc 2880
acaaggacta caaggcttgc ctggaacgag tggtccccca ggagaaaacg gaaaacctgg 2940
tgaacctggt ccaaagggtg aggctggtgc acctggaatt ccaggaggca agggtgattc 3000
tggtgctccc ggtgaacgcg gacctcctgg agcaggaggg ccccctggac ctagaggtgg 3060
agctggcccc cctggtcccg aaggaggaaa gggtgctgct ggtccccctg ggccacctgg 3120
ttctgctggt acacctggtc tgcaaggaat gcctggagaa agagggggtc ctggaggccc 3180
tggtccaaag ggtgataagg gtgagcctgg cagctcaggt gtcgatggtg ctccagggaa 3240
agatggtcca cggggtccca ctggtcccat tggtcctcct ggcccagctg gtcagcctgg 3300
agataagggt gaaagtggtg cccctggagt tccgggtata gctggtcctc gcggtggccc 3360
tggtgagaga ggcgaacagg ggcccccagg acctgctggc ttccctggtg ctcctggcca 3420
gaatggtgag cctggtgcta aaggagaaag aggcgctcct ggtgagaaag gtgaaggagg 3480
ccctcccgga gccgcaggac ccgccggagg ttctgggcct gccggtcccc caggccccca 3540
aggtgtcaaa ggcgaacgtg gcagtcctgg tggtcctggt gctgctggct tccccggtgg 3600
tcgtggtcct cctggccctc ctggcagtaa tggtaaccca ggccccccag gctccagtgg 3660
tgctccaggc aaagatggtc ccccaggtcc acctggcagt aatggtgctc ctggcagccc 3720
cgggatctct ggaccaaagg gtgattctgg tccaccaggt gagaggggag cacctggccc 3780
ccagggccct ccgggagctc caggcccact aggaattgca ggacttactg gagcacgagg 3840
tcttgcaggc ccaccaggca tgccaggtgc taggggcagc cccggcccac agggcatcaa 3900
gggtgaaaat ggtaaaccag gacctagtgg tcagaatgga gaacgtggtc ctcctggccc 3960
ccagggtctt cctggtctgg ctggtacagc tggtgagcct ggaagagatg gaaaccctgg 4020
atcagatggt ctgccaggcc gagatggagc tccaggtgcc aagggtgacc gtggtgaaaa 4080
tggctctcct ggtgcccctg gagctcctgg tcacccaggc cctcctggtc ctgtcggtcc 4140
agctggaaag agcggtgaca gaggagaaac tggccctgct ggtccttctg gggcccccgg 4200
tcctgccgga tcaagaggtc ctcctggtcc ccaaggccca cgcggtgaca aaggggaaac 4260
cggtgagcgt ggtgctatgg gcatcaaagg acatcgcgga ttccctggca acccaggggc 4320
ccccggatct ccgggtcccg ctggtcatca aggtgcagtt ggcagtccag gccctgcagg 4380
ccccagagga cctgttggac ctagcgggcc ccctggaaag gacggagcaa gtggacaccc 4440
tggtcccatt ggaccaccgg ggccccgagg taacagaggt gaaagaggat ctgagggctc 4500
cccaggccac ccaggacaac caggccctcc tggacctcct ggtgcccctg gtccatgttg 4560
tggtgctggc ggggttgctg ccattgctgg tgttggagcc gaaaaagctg gtggttttgc 4620
cccatattat ggagctagcg gttacattcc tgaagctcct agagacggac aagcatacgt 4680
tagaaaggac ggtgagtggg tgttgctgtc caccttctta gctagcgatt acaaggatga 4740
cgacgataag ggatcgtgtt gcccgggctg ctgtcatcac catcatcacc atagatctta 4800
agcggccgcg agtcgtgagt aatcaagagg atgtcagaat gccatttgcc tgagagatgc 4860
aggcttcatt tttgatactt ttttatttgt aacctatata gtataggatt ttttttgtca 4920
ttttgtttct tctcgtacga gcttgctcct gatcagccta tctcgcagct gatgaatatc 4980
ttgtggtagg ggtttgggaa aatcattcga gtttgatgtt tttcttggta tttcccactc 5040
ctcttcagag tacagaagat taagtgagac gttcgtttgt gctccggagg atccttcagt 5100
aatgtcttgt ttcttttgtt gcagtggtga gccattttga cttcgtgaaa gtttctttag 5160
aatagttgtt tccagaggcc aaacattcca cccgtagtaa agtgcaagcg taggaagacc 5220
aagactggca taaatcaggt ataagtgtcg agcactggca ggtgatcttc tgaaagtttc 5280
tactagcaga taagatccag tagtcatgca tatggcaaca atgtaccgtg tggatctaag 5340
aacgcgtcct actaaccttc gcattcgttg gtccagtttg ttgttatcga tcaacgtgac 5400
aaggttgtcg attccgcgta agcatgcata cccaaggacg cctgttgcaa ttccaagtga 5460
gccagttcca acaatctttg taatattaga gcacttcatt gtgttgcgct tgaaagtaaa 5520
atgcgaacaa attaagagat aatctcgaaa ccgcgacttc aaacgccaat atgatgtgcg 5580
gcacacaata agcgttcata tccgctgggt gactttctcg ctttaaaaaa ttatccgaaa 5640
aaattttcta gagtgttgtt actttatact tccggctcgt ataatacgac aaggtgtaag 5700
gaggactaaa ccatggctaa actcacctct gctgttccag tcctgactgc tcgtgatgtt 5760
gctggtgctg ttgagttctg gactgatagg ctcggtttct cccgtgactt cgtagaggac 5820
gactttgccg gtgttgtacg tgacgacgtt accctgttca tctccgcagt tcaggaccag 5880
gttgtgccag acaacactct ggcatgggta tgggttcgtg gtctggacga actgtacgct 5940
gagtggtctg aggtcgtgtc taccaacttc cgtgatgcat ctggtccagc tatgaccgag 6000
atcggtgaac agccctgggg tcgtgagttt gcactgcgtg atccagctgg taactgcgtg 6060
catttcgtcg cagaagagca ggactaacaa ttgacacctt acgattattt agagagtatt 6120
tattagtttt attgtatgta tacggatgtt ttattatcta tttatgccct tatattctgt 6180
aactatccaa aagtcctatc ttatcaagcc agcaatctat gtccgcgaac gtcaactaaa 6240
aataagcttt ttatgctctt ctctcttttt ttcccttcgg tataattata ccttgcatcc 6300
acagattctc ctgccaaatt ttgcataatc ctttacaaca tggctatatg ggagcactta 6360
gcgccctcca aaacccatat tgcctacgca tgtataggtg ttttttccac aatattttct 6420
ctgtgctctc tttttattaa agagaagctc tatatcggag aagcttctgt ggccgttata 6480
ttcggcctta tcgtgggacc acattgcctg aattggtttg ccccggaaga ttggggaaac 6540
ttggatctga ttaccttagc tgcagaaaag ggtaccactg agcgtcagac cccgtagaaa 6600
a 6601
<210> 23
<211> 57
<212> DNA
<213> Artificial Sequence
<220>
<223> alpha-factor Pre (Sequence 21)
<400> 23
atgagattcc catctatttt caccgctgtc ttgttcgctg cctcctctgc attggct 57
<210> 24
<211> 267
<212> DNA
<213> Artificial Sequence
<220>
<223> alpha-factor Pre-pro (Sequence 22)
<400> 24
atgagattcc catctatttt caccgctgtc ttgttcgctg cctcctctgc attggctgcc 60
cctgttaaca ctaccactga agacgagact gctcaaattc cagctgaagc agttatcggt 120
tactctgacc ttgagggtga tttcgacgtc gctgttttgc ctttctctaa ctccactaac 180
aacggtttgt tgttcattaa caccactatc gcttccattg ctgctaagga agagggtgtc 240
tctctcgaga aaagagaggc cgaagct 267
<210> 25
<211> 1298
<212> DNA
<213> Artificial Sequence
<220>
<223> pGCW14-GAP1 bidirectional promoter (Sequence 23)
<400> 25
ttttgttgtt gagtgaagcg agtgacggaa cggtaaaatg taagtaacaa aagaaaaaga 60
gaaccagggg ggggaggaga gtatgtattt ataccgtacg gcaccaggcg aaaagctata 120
aacaaacctt tttcgcggta tatttgttta tatttcctat tttaaactca aaatctgccc 180
taatctggac ttttcatgca aagttatgca cctgaggcag gaatgaagca ggctcgacga 240
cgaaaaggct ggaatgggta actatggatc gattgatttg tctgttgaaa tcttgatttg 300
gcactcgttt aaattaacat tctgcatcat ggtgaattgc ggtcacaggt actggttttt 360
cctgaagctc taggcggtgt tactgttccc acaacttaaa acctaaaaga ggtgggtgct 420
tctttgcgtg ggtgaccaaa aataaaaccg actgcctagt ggcattgata cctttttttg 480
ggtgttgtcc tggaaaccac tgaacgtatc tgcgagatac aaaagtattt ttagataagt 540
ggcaaatgca aaaaatctga ttggtcagtt aatgattgat gaacgacttt aaggttaaaa 600
agcaaaatag tgactgctgc catgtgcctg tatagcacat gaactgatta ttctgttccc 660
acgctacgat gaaaacgcct tctctgccga aagattaaag ctgcgcggga aaaaaaaatt 720
aactttacgg ggcgagcacg gttccccgaa acaaaagatg gttggctttc acccagcgag 780
ctcactggat cccagttaaa aatagttagg tgggttcacc tgtttttgta gaaatgtctt 840
ggtgtcctcg accaatcagg tagccatccc tgaaatacct ggctccgtgg caacaccgaa 900
cgacctgctg gcaacgttaa attctccggg gtaaaactta aatgtggagt aatagaacca 960
gaaacgtctc ttcccttctc tctccttcca ccgcccgtta ccgtccctag gaaattttac 1020
tctgctggag agcttcttct acggccccct tgcagcaatg ctcttcccag cattacgttg 1080
cgggtaaaac ggaggtcgtg tacccgacct agcagcccag ggatggaaag tcccggccgt 1140
cgctggcaat aactgcgggc ggacgcatgt cttgagatta ttggaaacca ccagaatcga 1200
atataaaagg cgaacacctt tcccaatttt ggtttctcct gacccaaaga ctttaaattt 1260
aatttatttg tccctatttc aatcaattga acaactat 1298
<210> 26
<211> 550
<212> DNA
<213> Artificial Sequence
<220>
<223> pHTX1 bi-directional promoter (Sequence 25)
<400> 26
tgttgtagtt ttaatatagt ttgagtatga gatggaactc agaacgaagg aattatcacc 60
agtttatata ttctgaggaa agggtgtgtc ctaaattgga cagtcacgat ggcaataaac 120
gctcagccaa tcagaatgca ggagccataa attgttgtat tattgctgca agatttatgt 180
gggttcacat tccactgaat ggttttcact gtagaattgg tgtcctagtt gttatgtttc 240
gagatgtttt caagaaaaac taaaatgcac aaactgacca ataatgtgcc gtcgcgcttg 300
gtacaaacgt caggattgcc accacttttt tcgcactctg gtacaaaagt tcgcacttcc 360
cactcgtatg taacgaaaaa cagagcagtc tatccagaac gagacaaatt agcgcgtact 420
gtcccattcc ataaggtatc ataggaaacg agagtcctcc ccccatcacg tatatataaa 480
cacactgata tcccacatcc gcttgtcacc aaactaatac atccagttca agttacctaa 540
acaaatcaaa 550
<210> 27
<211> 1251
<212> DNA
<213> Artificial Sequence
<220>
<223> Das1-Das2 bi-directional promoter (Sequence 24)
<400> 27
ttttgatgtt tgatagtttg ataagagtga actttagtgt ttagaggggt tataatttgt 60
tgtaactggt tttggtctta agttaaaacg aacttgttat attaaacaca acggtcactc 120
aggatacaag aataggaaag aaaaacttta aactggggac atgttgtctt tatataattt 180
ggcggttaac ccttaatgcc cgtttccgtc tcttcatgat aacaaagctg cccatctatg 240
actgaatgtg gagaagtatc ggaacaaccc ttcactaagg atatctaggc taaactcatt 300
cgcgccttag atttctccaa ggtatcggtt aagtttcctc tttcgtactg gctaacgatg 360
gtgttgctca acaaagggat ggaacggcag ctaaagggag tgcatggaat gactttaatt 420
ggctgagaaa gtgttctatt tgtccgaatt tcttttttct attatctgtt cgtttgggcg 480
gatctctcca gtggggggta aatggaagat ttctgttcat ggggtaagga agctgaaatc 540
cttcgtttct tataggggca agtatactaa atctcggaac attgaatggg gtttactttc 600
attggctaca gaaattatta agtttgttat ggggtgaagt taccagtaat tttcattttt 660
tcacttcaac ttttggggta tttctgtggg gtagcataga gcaatgatat aaacaacaat 720
tgagtgacag gtctactttg ttctcaaaag gccataacca tctgtttgca tctcttatca 780
ccacaccatc ctcctcatct ggccttcaat tgtggggaac aactagcatc ccaacaccag 840
actaactcca cccagatgaa accagttgtc gcttaccagt caatgaatgt tgagctaacg 900
ttccttgaaa ctcgaatgat cccagccttg ctgcgtatca tccctccgct attccgccgc 960
ttgctccaac catgtttccg cctttttcga acaagttcaa atacctatct ttggcaggac 1020
ttttcctcct gcctttttta gcctcagctc tcggttagcc tctaggcaaa ttctggtctt 1080
catacctata tcaacttttc atcagatagc ctttgggttc aaaaaagaac taaagcagga 1140
tgcctgatat ataaatccca gatgatctgc ttttgaaact attttcagta tcttgattcg 1200
tttacttaca aacaactatt gttgatttta tctggagaat aatcgaacaa a 1251
<210> 28
<211> 3908
<212> DNA
<213> Artificial Sequence
<220>
<223> MMV132
<400> 28
ggatccttca gtaatgtctt gtttcttttg ttgcagtggt gagccatttt gacttcgtga 60
aagtttcttt agaatagttg tttccagagg ccaaacattc cacccgtagt aaagtgcaag 120
cgtaggaaga ccaagactgg cataaatcag gtataagtgt cgagcactgg caggtgatct 180
tctgaaagtt tctactagca gataagatcc agtagtcatg catatggcaa caatgtaccg 240
tgtggatcta agaacgcgtc ctactaacct tcgcattcgt tggtccagtt tgttgttatc 300
gatcaacgtg acaaggttgt cgattccgcg taagcatgca tacccaagga cgcctgttgc 360
aattccaagt gagccagttc caacaatctt tgtaatatta gagcacttca ttgtgttgcg 420
cttgaaagta aaatgcgaac aaattaagag ataatctcga aaccgcgact tcaaacgcca 480
atatgatgtg cggcacacaa taagcgttca tatccgctgg gtgactttct cgctttaaaa 540
aattatccga aaaaattttc tagagtgttg ttactttata cttccggctc gtataatacg 600
acaaggtgta aggaggacta aaccatggct aaactcacct ctgctgttcc agtcctgact 660
gctcgtgatg ttgctggtgc tgttgagttc tggactgata ggctcggttt ctcccgtgac 720
ttcgtagagg acgactttgc cggtgttgta cgtgacgacg ttaccctgtt catctccgca 780
gttcaggacc aggttgtgcc agacaacact ctggcatggg tatgggttcg tggtctggac 840
gaactgtacg ctgagtggtc tgaggtcgtg tctaccaact tccgtgatgc atctggtcca 900
gctatgaccg agatcggtga acagccctgg ggtcgtgagt ttgcactgcg tgatccagct 960
ggtaactgcg tgcatttcgt cgcagaagag caggactaac aattgacacc ttacgattat 1020
ttagagagta tttattagtt ttattgtatg tatacggatg ttttattatc tatttatgcc 1080
cttatattct gtaactatcc aaaagtccta tcttatcaag ccagcaatct atgtccgcga 1140
acgtcaacta aaaataagct ttttatgctc ttctctcttt ttttcccttc ggtataatta 1200
taccttgcat ccacagattc tcctgccaaa ttttgcataa tcctttacaa catggctata 1260
tgggagcact tagcgccctc caaaacccat attgcctacg catgtatagg tgttttttcc 1320
acaatatttt ctctgtgctc tctttttatt aaagagaagc tctatatcgg agaagcttct 1380
gtggccgtta tattcggcct tatcgtggga ccacattgcc tgaattggtt tgccccggaa 1440
gattggggaa acttggatct gattacctta gctgcagaaa agggtaccac tgagcgtcag 1500
accccgtaga aaagatcaaa ggatcttctt gagatccttt ttttctgcgc gtaatctgct 1560
gcttgcaaac aaaaaaacca ccgctaccag cggtggtttg tttgccggat caagagctac 1620
caactctttt tccgaaggta actggcttca gcagagcgca gataccaaat actgttcttc 1680
tagtgtagcc gtagttaggc caccacttca agaactctgt agcaccgcct acatacctcg 1740
ctctgctaat cctgttacca gtggctgctg ccagtggcga taagtcgtgt cttaccgggt 1800
tggacccaag acgatagtta ccggataagg cgcagcggtc gggctgaacg gggggttcgt 1860
gcacacagcc cagcttggag cgaacgacct acaccgaact gagataccta cagcgtgagc 1920
tatgagaaag cgccacgctt cccgaaggga gaaaggcgga caggtatccg gtaagcggca 1980
gggtcggaac aggagagcgc acgagggagc ttccaggggg aaacgcctgg tatctttata 2040
gtcctgtcgg gtttcgccac ctctgacttg agcgtcgatt tttgtgatgc tcgtcagggg 2100
ggcggagcct atggaaaaac gccagcaacg cggccttttt acggttcctg gccttttgct 2160
ggccttttgc tcacatgtat ttaaataatg tatctaaacg caaactccga gctggaaaaa 2220
tgttaccggc gatgcgcgga caatttagag gcggcgatca agaaacacct gctgggcgag 2280
cagtctggag cacagtcttc gatgggcccg agatcccacc gcgttcctgg gtaccgggac 2340
gtgaggcagc gcgacatcca tcaaatatac caggcgccaa ccgagtgtct cggaaaacag 2400
cttctggata tcttccgctg gcggcgcaac gacgaataat agtccctgga ggtgacggaa 2460
tatatatgtg tggagggtaa atctgacagg gtgtagcaaa ggtaatattt tcctaaaaca 2520
tgcaatcggc tgccccgcaa cgggaaaaag aatgactttg gcactcttca ccagagtggg 2580
gtgtcccgct cgtgtgtgca aataggctcc cactggtcac cccggatttt gcagaaaaac 2640
agcaagttcc ggggtgtctc actggtgtcc gccaataaga ggagccggca ggcacggagt 2700
ttacatcaag ctgtctccga tacactcgac taccatccgg gtctctcaga gaggggaatg 2760
gcactataaa taccgcctcc ttgcgctctc tgccttcatc aatcaaatca tgctgaggac 2820
tcgaattcga cctctgttgc ctctttgttg gacgaaccat tcaccggtgt cttgtactta 2880
aagggcagtg gtatcactga agacttccag tccctaaagg gtaagaagat cggttacgtt 2940
ggtgacttcg gtaagatcca aatcgatgaa ttgaccaagc actacggtat gaagccagaa 3000
gactacaccg ccgtcagatg tggtatgaat gtcgccaagt acatcatcga aggtaagatt 3060
gatgccggta ttggtatcga atgtatgcaa caagtcgaat tggaagagta cttggccaag 3120
caaggcagac cagcttctga tgctaaaatg ttgagaattg acaagttggc ttgcttgggt 3180
tgctgttgct tctgtaccgt tctttacatc tgcaacgatg aatttttgaa gaagaaccct 3240
gaaaaggtca gaaagttctt gaaagccatc aagaaggcaa ccgactacgt tctagccgac 3300
cctgtgaagg cttggaaaga atacatcgac ttcaagcctc aattgaacaa cgatctatcc 3360
tacaagcaat accaaagatg ttacgcttac ttctcttcat ctttgtacaa tgttcaccgt 3420
gactggaaga aggttaccgg ttacggtaag agattagcca tcttgccacc agactatgtc 3480
tcgaactaca ctaatgaata cttgtcctgg ccagaaccag aagaggtttc tgatcctttg 3540
gaagctcaaa gattgatggc tattcatcaa gaaaaatgca gacaggaagg tactttcaag 3600
agattggctc ttccagctta agcggccgcg agtcgtgagt aatcaagagg atgtcagaat 3660
gccatttgcc tgagagatgc aggcttcatt tttgatactt ttttatttgt aacctatata 3720
gtataggatt ttttttgtca ttttgtttct tctcgtacga gcttgctcct gatcagccta 3780
tctcgcagct gatgaatatc ttgtggtagg ggtttgggaa aatcattcga gtttgatgtt 3840
tttcttggta tttcccactc ctcttcagag tacagaagat taagtgagac gttcgtttgt 3900
gctccgga 3908
<210> 29
<211> 7476
<212> DNA
<213> Artificial Sequence
<220>
<223> MMV193
<400> 29
ccgtagaaaa gatcaaagga tcttcttgag atcctttttt tctgcgcgta atctgctgct 60
tgcaaacaaa aaaaccaccg ctaccagcgg tggtttgttt gccggatcaa gagctaccaa 120
ctctttttcc gaaggtaact ggcttcagca gagcgcagat accaaatact gttcttctag 180
tgtagccgta gttaggccac cacttcaaga actctgtagc accgcctaca tacctcgctc 240
tgctaatcct gttaccagtg gctgctgcca gtggcgataa gtcgtgtctt accgggttgg 300
acccaagacg atagttaccg gataaggcgc agcggtcggg ctgaacgggg ggttcgtgca 360
cacagcccag cttggagcga acgacctaca ccgaactgag atacctacag cgtgagctat 420
gagaaagcgc cacgcttccc gaagggagaa aggcggacag gtatccggta agcggcaggg 480
tcggaacagg agagcgcacg agggagcttc cagggggaaa cgcctggtat ctttatagtc 540
ctgtcgggtt tcgccacctc tgacttgagc gtcgattttt gtgatgctcg tcaggggggc 600
ggagcctatg gaaaaacgcc agcaacgcgg cctttttacg gttcctggcc ttttgctggc 660
cttttgctca catgtattta aataatgtat ctaaacgcaa actccgagct ggaaaaatgt 720
taccggcgat gcgcggacaa tttagaggcg gcgatcaaga aacacctgct gggcgagcag 780
tctggagcac agtcttcgat gggcccgaga tcccaccgcg ttcctgggta ccgggacgtg 840
aggcagcgcg acatccatca aatataccag gcgccaaccg agtgtctcgg aaaacagctt 900
ctggatatct tccgctggcg gcgcaacgac gaataatagt ccctggaggt gacggaatat 960
atatgtgtgg agggtaaatc tgacagggtg tagcaaaggt aatattttcc taaaacatgc 1020
aatcggctgc cccgcaacgg gaaaaagaat gactttggca ctcttcacca gagtggggtg 1080
tcccgctcgt gtgtgcaaat aggctcccac tggtcacccc ggattttgca gaaaaacagc 1140
aagttccggg gtgtctcact ggtgtccgcc aataagagga gccggcaggc acggagttta 1200
catcaagctg tctccgatac actcgactac catccgggtc tctcagagag gggaatggca 1260
ctataaatac cgcctccttg cgctctctgc cttcatcaat caaatcatga tgtcttttgt 1320
ccaaaagggt acttggttac tttttgctct gttgcaccca actgttattc tcgcacaaca 1380
ggaagcagta gatggtggtt gctcacattt aggtcaatct tacgcagata gagatgtatg 1440
gaaacctgaa ccatgtcaaa tttgcgtgtg tgactcaggt tcagtgctct gcgacgatat 1500
catatgtgac gaccaggaat tggactgtcc aaacccagag ataccattcg gtgaatgttg 1560
tgctgtttgt ccacagccac caactgctcc tacaagacct ccaaacggtc aaggtccaca 1620
aggtcctaaa ggtgatccgg gtccacctgg tattcctggt agaaatggtg accctggacc 1680
tcccggttcc ccaggtagcc caggatcacc tgggcctcct ggaatatgtg aatcctgccc 1740
aactggtggt cagaactata gcccacaata cgaggcctac gacgtcaaat ctggtgttgc 1800
tggaggaggt attgcaggct accctggtcc cgcagggccc ccaggtccgc cgggtccgcc 1860
cggaacatca ggtcatcccg gagcccctgg tgcaccaggt tatcagggac cgcccggaga 1920
gcctggacaa gctggtcccg ctggaccccc tggtccacca ggtgctattg gaccaagtgg 1980
tcctgccgga aaagacggtg aatccggtag acctggtaga cccggcgaaa ggggtttccc 2040
aggtcctccc ggaatgaagg gtccagccgg tatgcccggt tttcctggga tgaagggtca 2100
cagaggattt gatggtagaa acggagagaa aggcgaaacc ggtgctcccg gactgaaggg 2160
tgaaaacggt gtccctggtg agaacggcgc tcctggacct atgggtccac gtggtgctcc 2220
aggagaaaga ggcagaccag gattgcctgg tgcagctggt gctagaggta acgatggtgc 2280
ccgtggttcc gatggacaac ccgggccacc cggccctcca ggtaccgctg gatttcctgg 2340
aagccctggt gctaaggggg aggttggtcc ggctggtagt cccggaagta gcggtgcccc 2400
aggtcaaaga ggcgaaccag gccctcaggg tcacgcagga gcacctggac cgcctggtcc 2460
tcctggttcg aatggttcgc ctggaggaaa aggtgaaatg gggcccgcag gaatccccgg 2520
tgcgcctggt cttattggtg ccaggggtcc tccaggcccg ccaggtacaa atggtgtacc 2580
cggacagcga ggagcagctg gtgaacctgg taaaaacggt gccaaaggag atccaggtcc 2640
tcgtggagag cgtggtgaag ctggctctcc cggtatcgcc ggtccaaaag gtgaggacgg 2700
taaggacggt tcccctggtg agccaggtgc gaacggactg ccaggtgcag ccggagagcg 2760
aggagtccca ggattcaggg gaccagccgg tgctaacggc ttgcctggtg aaaaagggcc 2820
ccctggtgat aggggaggac ccggtccagc aggccctcgt ggagttgctg gtgagcctgg 2880
acgtgacggt ttaccaggag ggccaggttt gaggggtatt cccgggtccc ctggcggtcc 2940
tggatcggat ggaaaaccag ggccaccagg ttcgcagggt gaaacaggac gtccaggccc 3000
acccggctca cctggtccaa ggggtcagcc tggtgtcatg ggtttccccg gtccaaaggg 3060
taatgacgga gcaccgggta aaaatggtga acgtggtggc ccaggtggtc caggacccca 3120
aggtccagct ggaaaaaacg gtgagacagg tcctcaagga cctccaggac ctaccggtcc 3180
tagcggagat aagggagata cgggaccgcc aggacctcaa ggattgcaag gtttgcctgg 3240
tacatctggc cctcccggag aaaatggtaa gcctggagag ccaggaccaa aaggcgaagc 3300
tggagcccca ggtatccccg gaggtaaggg agactcaggt gctccgggtg agcgtggtcc 3360
tccgggtgcc ggtggtccac ctggacctag aggtggtgcc gggccgccag gtcctgaagg 3420
tggtaaaggt gctgctggtc cccctgggcc acctggttct gctggtacac ctggtctgca 3480
aggaatgcct ggagaaagag ggggtcctgg aggccctggt ccaaagggtg ataagggtga 3540
gcctggcagc tcaggtgtcg atggtgctcc agggaaagat ggtccacggg gtcccactgg 3600
tcccattggt cctcctggcc cagctggtca gcctggagat aagggtgaaa gtggtgcccc 3660
tggagttccg ggtatagctg gtcctcgcgg tggccctggt gagagaggcg aacaggggcc 3720
cccaggacct gctggcttcc ctggtgctcc tggccagaat ggtgagcctg gtgctaaagg 3780
agaaagaggc gctcctggtg agaaaggtga aggaggccct cccggagccg caggacccgc 3840
cggaggttct gggcctgccg gtcccccagg cccccaaggt gtcaaaggcg aacgtggcag 3900
tcctggtggt cctggtgctg ctggcttccc cggtggtcgt ggtcctcctg gccctcctgg 3960
cagtaatggt aacccaggcc ccccaggctc cagtggtgct ccaggcaaag atggtccccc 4020
aggtccacct ggcagtaatg gtgctcctgg cagccccggg atctctggac caaagggtga 4080
ttctggtcca ccaggtgaga ggggagcacc tggcccccag ggccctccgg gagctccagg 4140
cccactagga attgcaggac ttactggagc acgaggtctt gcaggcccac caggcatgcc 4200
aggtgctagg ggcagccccg gcccacaggg catcaagggt gaaaatggta aaccaggacc 4260
tagtggtcag aatggagaac gtggtcctcc tggcccccag ggtcttcctg gtctggctgg 4320
tacagctggt gagcctggaa gagatggaaa ccctggatca gatggtctgc caggccgaga 4380
tggagctcca ggtgccaagg gtgaccgtgg tgaaaatggc tctcctggtg cccctggagc 4440
tcctggtcac ccaggccctc ctggtcctgt cggtccagct ggaaagagcg gtgacagagg 4500
agaaactggc cctgctggtc cttctggggc ccccggtcct gccggatcaa gaggtcctcc 4560
tggtccccaa ggcccacgcg gtgacaaagg ggaaaccggt gagcgtggtg ctatgggcat 4620
caaaggacat cgcggattcc ctggcaaccc aggggccccc ggatctccgg gtcccgctgg 4680
tcatcaaggt gcagttggca gtccaggccc tgcaggcccc agaggacctg ttggacctag 4740
cgggccccct ggaaaggacg gagcaagtgg acaccctggt cccattggac caccggggcc 4800
ccgaggtaac agaggtgaaa gaggatctga gggctcccca ggccacccag gacaaccagg 4860
ccctcctgga cctcctggtg cccctggtcc atgttgtggt gctggcgggg ttgctgccat 4920
tgctggtgtt ggagccgaaa aagctggtgg ttttgcccca tattatggag atgaaccgat 4980
agatttcaaa atcaacaccg atgagattat gacctcactc aaatcagtca atggacaaat 5040
agaaagcctc attagtcctg atggttcccg taaaaaccct gcacggaact gcagggacct 5100
gaaattctgc catcctgaac tccagagtgg agaatattgg gttgatccta accaaggttg 5160
caaattggat gctattaaag tctactgtaa catggaaact ggggaaacgt gcataagtgc 5220
cagtcctttg actatcccac agaagaactg gtggacagat tctggtgctg agaagaaaca 5280
tgtttggttt ggagaatcca tggagggtgg ttttcagttt agctatggca atcctgaact 5340
tcccgaagac gtcctcgatg tccagctggc attcctccga cttctctcca gccgggcctc 5400
tcagaacatc acatatcact gcaagaatag cattgcatac atggatcatg ccagtgggaa 5460
tgtaaagaaa gccttgaagc tgatggggtc aaatgaaggt gaattcaagg ctgaaggaaa 5520
tagcaaattc acatacacag ttctggagga tggttgcaca aaacacactg gggaatgggg 5580
caaaacagtc ttccagtatc aaacacgcaa ggccgtcaga ctacctattg tagatattgc 5640
accctatgat atcggtggtc ctgatcaaga atttggtgcg gacattggcc ctgtttgctt 5700
tttataatca agaggatgtc agaatgccat ttgcctgaga gatgcaggct tcatttttga 5760
tactttttta tttgtaacct atatagtata ggattttttt tgtcattttg tttcttctcg 5820
tacgagcttg ctcctgatca gcctatctcg cagctgatga atatcttgtg gtaggggttt 5880
gggaaaatca ttcgagtttg atgtttttct tggtatttcc cactcctctt cagagtacag 5940
aagattaagt gagacgttcg tttgtgctcc ggaggatcct tcagtaatgt cttgtttctt 6000
ttgttgcagt ggtgagccat tttgacttcg tgaaagtttc tttagaatag ttgtttccag 6060
aggccaaaca ttccacccgt agtaaagtgc aagcgtagga agaccaagac tggcataaat 6120
caggtataag tgtcgagcac tggcaggtga tcttctgaaa gtttctacta gcagataaga 6180
tccagtagtc atgcatatgg caacaatgta ccgtgtggat ctaagaacgc gtcctactaa 6240
ccttcgcatt cgttggtcca gtttgttgtt atcgatcaac gtgacaaggt tgtcgattcc 6300
gcgtaagcat gcatacccaa ggacgcctgt tgcaattcca agtgagccag ttccaacaat 6360
ctttgtaata ttagagcact tcattgtgtt gcgcttgaaa gtaaaatgcg aacaaattaa 6420
gagataatct cgaaaccgcg acttcaaacg ccaatatgat gtgcggcaca caataagcgt 6480
tcatatccgc tgggtgactt tctcgcttta aaaaattatc cgaaaaaatt ttctagagtg 6540
ttgttacttt atacttccgg ctcgtataat acgacaaggt gtaaggagga ctaaaccatg 6600
gctaaactca cctctgctgt tccagtcctg actgctcgtg atgttgctgg tgctgttgag 6660
ttctggactg ataggctcgg tttctcccgt gacttcgtag aggacgactt tgccggtgtt 6720
gtacgtgacg acgttaccct gttcatctcc gcagttcagg accaggttgt gccagacaac 6780
actctggcat gggtatgggt tcgtggtctg gacgaactgt acgctgagtg gtctgaggtc 6840
gtgtctacca acttccgtga tgcatctggt ccagctatga ccgagatcgg tgaacagccc 6900
tggggtcgtg agtttgcact gcgtgatcca gctggtaact gcgtgcattt cgtcgcagaa 6960
gagcaggact aacaattgac accttacgat tatttagaga gtatttatta gttttattgt 7020
atgtatacgg atgttttatt atctatttat gcccttatat tctgtaacta tccaaaagtc 7080
ctatcttatc aagccagcaa tctatgtccg cgaacgtcaa ctaaaaataa gctttttatg 7140
ctcttctctc tttttttccc ttcggtataa ttataccttg catccacaga ttctcctgcc 7200
aaattttgca taatccttta caacatggct atatgggagc acttagcgcc ctccaaaacc 7260
catattgcct acgcatgtat aggtgttttt tccacaatat tttctctgtg ctctcttttt 7320
attaaagaga agctctatat cggagaagct tctgtggccg ttatattcgg ccttatcgtg 7380
ggaccacatt gcctgaattg gtttgccccg gaagattggg gaaacttgga tctgattacc 7440
ttagctgcag aaaagggtac cactgagcgt cagacc 7476
<210> 30
<211> 7476
<212> DNA
<213> Artificial Sequence
<220>
<223> MMV194
<400> 30
acgcatgtat aggtgttttt tccacaatat tttctctgtg ctctcttttt attaaagaga 60
agctctatat cggagaagct tctgtggccg ttatattcgg ccttatcgtg ggaccacatt 120
gcctgaattg gtttgccccg gaagattggg gaaacttgga tctgattacc ttagctgcag 180
aaaagggtac cactgagcgt cagaccccgt agaaaagatc aaaggatctt cttgagatcc 240
tttttttctg cgcgtaatct gctgcttgca aacaaaaaaa ccaccgctac cagcggtggt 300
ttgtttgccg gatcaagagc taccaactct ttttccgaag gtaactggct tcagcagagc 360
gcagatacca aatactgttc ttctagtgta gccgtagtta ggccaccact tcaagaactc 420
tgtagcaccg cctacatacc tcgctctgct aatcctgtta ccagtggctg ctgccagtgg 480
cgataagtcg tgtcttaccg ggttggaccc aagacgatag ttaccggata aggcgcagcg 540
gtcgggctga acggggggtt cgtgcacaca gcccagcttg gagcgaacga cctacaccga 600
actgagatac ctacagcgtg agctatgaga aagcgccacg cttcccgaag ggagaaaggc 660
ggacaggtat ccggtaagcg gcagggtcgg aacaggagag cgcacgaggg agcttccagg 720
gggaaacgcc tggtatcttt atagtcctgt cgggtttcgc cacctctgac ttgagcgtcg 780
atttttgtga tgctcgtcag gggggcggag cctatggaaa aacgccagca acgcggcctt 840
tttacggttc ctggcctttt gctggccttt tgctcacatg tatttaaata atgtatctaa 900
acgcaaactc cgagctggaa aaatgttacc ggcgatgcgc ggacaattta gaggcggcga 960
tcaagaaaca cctgctgggc gagcagtctg gagcacagtc ttcgatgggc ccgagatccc 1020
accgcgttcc tgggtaccgg gacgtgaggc agcgcgacat ccatcaaata taccaggcgc 1080
caaccgagtg tctcggaaaa cagcttctgg atatcttccg ctggcggcgc aacgacgaat 1140
aatagtccct ggaggtgacg gaatatatat gtgtggaggg taaatctgac agggtgtagc 1200
aaaggtaata ttttcctaaa acatgcaatc ggctgccccg caacgggaaa aagaatgact 1260
ttggcactct tcaccagagt ggggtgtccc gctcgtgtgt gcaaataggc tcccactggt 1320
caccccggat tttgcagaaa aacagcaagt tccggggtgt ctcactggtg tccgccaata 1380
agaggagccg gcaggcacgg agtttacatc aagctgtctc cgatacactc gactaccatc 1440
cgggtctctc agagagggga atggcactat aaataccgcc tccttgcgct ctctgccttc 1500
atcaatcaaa tcatgatgtc ttttgtccaa aagggtactt ggttactttt tgctctgttg 1560
cacccaactg ttattctcgc acaacaggaa gcagtagatg gtggttgctc acatttaggt 1620
caatcttacg cagatagaga tgtatggaaa cctgaaccat gtcaaatttg cgtgtgtgac 1680
tcaggttcag tgctctgcga cgatatcata tgtgacgacc aggaattgga ctgtccaaac 1740
ccagagatac cattcggtga atgttgtgct gtttgtccac agccaccaac tgctcctaca 1800
agacctccaa acggtcaagg tccacaaggt cctaaaggtg atccgggtcc acctggtatt 1860
cctggtagaa atggtgaccc tggacctccc ggttccccag gtagcccagg atcacctggg 1920
cctcctggaa tatgtgaatc ctgcccaact ggtggtcaga actatagccc acaatacgag 1980
gcctacgacg tcaaatctgg tgttgctgga ggaggtattg caggctaccc tggtcccgca 2040
gggcccccag gtccgccggg tccgcccgga acatcaggtc atcccggagc ccctggtgca 2100
ccaggttatc agggaccgcc cggagagcct ggacaagctg gtcccgctgg accccctggt 2160
ccaccaggtg ctattggacc aagtggtcct gccggaaaag acggtgaatc cggtagacct 2220
ggtagacccg gcgaaagggg tttcccaggt cctcccggaa tgaagggtcc agccggtatg 2280
cccggttttc ctgggatgaa gggtcacaga ggatttgatg gtagaaacgg agagaaaggc 2340
gaaaccggtg ctcccggact gaagggtgaa aacggtgtcc ctggtgagaa cggcgctcct 2400
ggacctatgg gtccacgtgg tgctccagga gaaagaggca gaccaggatt gcctggtgca 2460
gctggtgcta gaggtaacga tggtgcccgt ggttccgatg gacaacccgg gccacccggc 2520
cctccaggta ccgctggatt tcctggaagc cctggtgcta agggggaggt tggtccggct 2580
ggtagtcccg gaagtagcgg tgccccaggt caaagaggcg aaccaggccc tcagggtcac 2640
gcaggagcac ctggaccgcc tggtcctcct ggttcgaatg gttcgcctgg aggaaaaggt 2700
gaaatggggc ccgcaggaat ccccggtgcg cctggtctta ttggtgccag gggtcctcca 2760
ggcccgccag gtacaaatgg tgtacccgga cagcgaggag cagctggtga acctggtaaa 2820
aacggtgcca aaggagatcc aggtcctcgt ggagagcgtg gtgaagctgg ctctcccggt 2880
atcgccggtc caaaaggtga ggacggtaag gacggttccc ctggtgagcc aggtgcgaac 2940
ggactgccag gtgcagccgg agagcgagga gtcccaggat tcaggggacc agccggtgct 3000
aacggcttgc ctggtgaaaa agggccccct ggtgataggg gaggacccgg tccagcaggc 3060
cctcgtggag ttgctggtga gcctggacgt gacggtttac caggagggcc aggtttgagg 3120
ggtattcccg ggtcccctgg cggtcctgga tcggatggaa aaccagggcc accaggttcg 3180
cagggtgaaa caggacgtcc aggcccaccc ggctcacctg gtccaagggg tcagcctggt 3240
gtcatgggtt tccccggtcc aaagggtaat gacggagcac cgggtaaaaa tggtgaacgt 3300
ggtggcccag gtggtccagg accccaaggt ccagctggaa aaaacggtga gacaggtcct 3360
caaggacctc caggacctac cggtcctagc ggagataagg gagatacggg accgccagga 3420
cctcaaggat tgcaaggttt gcctggtaca tctggccctc ccggagaaaa tggtaagcct 3480
ggagagccag gaccaaaagg cgaagctgga gccccaggta tccccggagg taagggagac 3540
tcaggtgctc cgggtgagcg tggtcctccg ggtgccggtg gtccacctgg acctagaggt 3600
ggtgccgggc cgccaggtcc tgaaggtggt aaaggtgctg ctggtccacc gggaccgcct 3660
ggctctgctg gtactcctgg cttgcaggga atgccaggag agagaggtgg acctggaggt 3720
cccggtccga agggtgataa aggggagcca ggatcatccg gtgttgacgg cgcacctggt 3780
aaagacggac caaggggacc aacgggtcca atcggaccac caggacccgc tggccagcca 3840
ggagataaag gcgagtccgg agcacccggt gttcctggta tagctggacc caggggtggt 3900
cccggtgaaa gaggtgaaca gggcccaccg ggtcccgccg gtttccctgg cgcccctggt 3960
caaaatggag aaccaggtgc aaagggcgag agaggagccc caggagaaaa gggtgaggga 4020
ggaccacccg gtgctgccgg tccagctggg ggttcaggtc ctgctggacc accaggtcca 4080
cagggcgtta aaggtgagag aggaagtcca ggtggtcctg gagctgctgg attcccaggt 4140
ggccgtggac ctcctggtcc ccctggatcg aatggtaatc ctggtccgcc aggtagttcg 4200
ggtgctcctg ggaaggacgg tccacctggc cccccaggta gtaacggtgc acctggtagt 4260
ccaggtatat ccggacctaa aggagattcc ggtccaccag gcgaaagagg ggccccaggc 4320
ccacagggtc caccaggagc ccccggtcct ctgggtattg ctggtcttac tggtgcacgt 4380
ggactggccg gtccacccgg aatgcctgga gcaagaggtt cacctggacc acaaggtatt 4440
aaaggagaga acggtaaacc tggaccttcc ggtcaaaacg gagagcgggg acccccaggc 4500
ccccaaggtc tgccaggact agctggtacc gcaggggaac caggaagaga tggaaatcca 4560
ggttcagacg gactacccgg tagagatggt gcaccggggg ccaagggcga caggggtgag 4620
aatggatctc ctggtgcgcc aggggcacca ggccacccag gtcccccagg tcctgtgggc 4680
cctgctggaa agtcaggtga caggggagag acaggcccgg ctggtccatc tggcgcaccc 4740
ggaccagctg gttccagagg cccacctggt ccgcaaggcc ctagaggtga caagggagag 4800
actggagaac gaggtgctat gggtatcaag ggtcatagag gttttccggg taatcccggc 4860
gccccaggtt ctcctggtcc agctggccat caaggtgcag tcggatcgcc cggcccagcc 4920
ggtcccaggg gccctgttgg tccatccggt cctccaggaa aggatggtgc ttctggacac 4980
ccaggaccta tcggacctcc gggtcctaga ggtaatagag gagaacgtgg atccgagggt 5040
agtcctggtc accctggtca acctggccca ccagggcctc caggtgcacc cggtccatgt 5100
tgtggtgcag gcggggttgc tgccattgct ggtgttggag ccgaaaaagc tggtggtttt 5160
gccccatatt atggagatga accgatagat ttcaaaatca acaccgatga gattatgacc 5220
tcactcaaat cagtcaatgg acaaatagaa agcctcatta gtcctgatgg ttcccgtaaa 5280
aaccctgcac ggaactgcag ggacctgaaa ttctgccatc ctgaactcca gagtggagaa 5340
tattgggttg atcctaacca aggttgcaaa ttggatgcta ttaaagtcta ctgtaacatg 5400
gaaactgggg aaacgtgcat aagtgccagt cctttgacta tcccacagaa gaactggtgg 5460
acagattctg gtgctgagaa gaaacatgtt tggtttggag aatccatgga gggtggtttt 5520
cagtttagct atggcaatcc tgaacttccc gaagacgtcc tcgatgtcca gctggcattc 5580
ctccgacttc tctccagccg ggcctctcag aacatcacat atcactgcaa gaatagcatt 5640
gcatacatgg atcatgccag tgggaatgta aagaaagcct tgaagctgat ggggtcaaat 5700
gaaggtgaat tcaaggctga aggaaatagc aaattcacat acacagttct ggaggatggt 5760
tgcacaaaac acactgggga atggggcaaa acagtcttcc agtatcaaac acgcaaggcc 5820
gtcagactac ctattgtaga tattgcaccc tatgatatcg gtggtcctga tcaagaattt 5880
ggtgcggaca ttggccctgt ttgcttttta taatcaagag gatgtcagaa tgccatttgc 5940
ctgagagatg caggcttcat ttttgatact tttttatttg taacctatat agtataggat 6000
tttttttgtc attttgtttc ttctcgtacg agcttgctcc tgatcagcct atctcgcagc 6060
tgatgaatat cttgtggtag gggtttggga aaatcattcg agtttgatgt ttttcttggt 6120
atttcccact cctcttcaga gtacagaaga ttaagtgaga cgttcgtttg tgctccggag 6180
gatccttcag taatgtcttg tttcttttgt tgcagtggtg agccattttg acttcgtgaa 6240
agtttcttta gaatagttgt ttccagaggc caaacattcc acccgtagta aagtgcaagc 6300
gtaggaagac caagactggc ataaatcagg tataagtgtc gagcactggc aggtgatctt 6360
ctgaaagttt ctactagcag ataagatcca gtagtcatgc atatggcaac aatgtaccgt 6420
gtggatctaa gaacgcgtcc tactaacctt cgcattcgtt ggtccagttt gttgttatcg 6480
atcaacgtga caaggttgtc gattccgcgt aagcatgcat acccaaggac gcctgttgca 6540
attccaagtg agccagttcc aacaatcttt gtaatattag agcacttcat tgtgttgcgc 6600
ttgaaagtaa aatgcgaaca aattaagaga taatctcgaa accgcgactt caaacgccaa 6660
tatgatgtgc ggcacacaat aagcgttcat atccgctggg tgactttctc gctttaaaaa 6720
attatccgaa aaaattttct agagtgttgt tactttatac ttccggctcg tataatacga 6780
caaggtgtaa ggaggactaa accatggcta aactcacctc tgctgttcca gtcctgactg 6840
ctcgtgatgt tgctggtgct gttgagttct ggactgatag gctcggtttc tcccgtgact 6900
tcgtagagga cgactttgcc ggtgttgtac gtgacgacgt taccctgttc atctccgcag 6960
ttcaggacca ggttgtgcca gacaacactc tggcatgggt atgggttcgt ggtctggacg 7020
aactgtacgc tgagtggtct gaggtcgtgt ctaccaactt ccgtgatgca tctggtccag 7080
ctatgaccga gatcggtgaa cagccctggg gtcgtgagtt tgcactgcgt gatccagctg 7140
gtaactgcgt gcatttcgtc gcagaagagc aggactaaca attgacacct tacgattatt 7200
tagagagtat ttattagttt tattgtatgt atacggatgt tttattatct atttatgccc 7260
ttatattctg taactatcca aaagtcctat cttatcaagc cagcaatcta tgtccgcgaa 7320
cgtcaactaa aaataagctt tttatgctct tctctctttt tttcccttcg gtataattat 7380
accttgcatc cacagattct cctgccaaat tttgcataat cctttacaac atggctatat 7440
gggagcactt agcgccctcc aaaacccata ttgcct 7476
<210> 31
<211> 7476
<212> DNA
<213> Artificial Sequence
<220>
<223> MMV 195
<400> 31
cgcatgtata ggtgtttttt ccacaatatt ttctctgtgc tctcttttta ttaaagagaa 60
gctctatatc ggagaagctt ctgtggccgt tatattcggc cttatcgtgg gaccacattg 120
cctgaattgg tttgccccgg aagattgggg aaacttggat ctgattacct tagctgcaga 180
aaagggtacc actgagcgtc agaccccgta gaaaagatca aaggatcttc ttgagatcct 240
ttttttctgc gcgtaatctg ctgcttgcaa acaaaaaaac caccgctacc agcggtggtt 300
tgtttgccgg atcaagagct accaactctt tttccgaagg taactggctt cagcagagcg 360
cagataccaa atactgttct tctagtgtag ccgtagttag gccaccactt caagaactct 420
gtagcaccgc ctacatacct cgctctgcta atcctgttac cagtggctgc tgccagtggc 480
gataagtcgt gtcttaccgg gttggaccca agacgatagt taccggataa ggcgcagcgg 540
tcgggctgaa cggggggttc gtgcacacag cccagcttgg agcgaacgac ctacaccgaa 600
ctgagatacc tacagcgtga gctatgagaa agcgccacgc ttcccgaagg gagaaaggcg 660
gacaggtatc cggtaagcgg cagggtcgga acaggagagc gcacgaggga gcttccaggg 720
ggaaacgcct ggtatcttta tagtcctgtc gggtttcgcc acctctgact tgagcgtcga 780
tttttgtgat gctcgtcagg ggggcggagc ctatggaaaa acgccagcaa cgcggccttt 840
ttacggttcc tggccttttg ctggcctttt gctcacatgt atttaaataa tgtatctaaa 900
cgcaaactcc gagctggaaa aatgttaccg gcgatgcgcg gacaatttag aggcggcgat 960
caagaaacac ctgctgggcg agcagtctgg agcacagtct tcgatgggcc cgagatccca 1020
ccgcgttcct gggtaccggg acgtgaggca gcgcgacatc catcaaatat accaggcgcc 1080
aaccgagtgt ctcggaaaac agcttctgga tatcttccgc tggcggcgca acgacgaata 1140
atagtccctg gaggtgacgg aatatatatg tgtggagggt aaatctgaca gggtgtagca 1200
aaggtaatat tttcctaaaa catgcaatcg gctgccccgc aacgggaaaa agaatgactt 1260
tggcactctt caccagagtg gggtgtcccg ctcgtgtgtg caaataggct cccactggtc 1320
accccggatt ttgcagaaaa acagcaagtt ccggggtgtc tcactggtgt ccgccaataa 1380
gaggagccgg caggcacgga gtttacatca agctgtctcc gatacactcg actaccatcc 1440
gggtctctca gagaggggaa tggcactata aataccgcct ccttgcgctc tctgccttca 1500
tcaatcaaat catgatgtct tttgtccaaa agggtacttg gttacttttt gctctgttgc 1560
acccaactgt tattctcgca caacaggaag cagtagatgg tggttgctca catttaggtc 1620
aatcttacgc agatagagat gtatggaaac ctgaaccatg tcaaatttgc gtgtgtgact 1680
caggttcagt gctctgcgac gatatcatat gtgacgacca ggaattggac tgtccaaacc 1740
cagagatacc attcggtgaa tgttgtgctg tttgtccaca gccaccaact gctcctacaa 1800
gacctccaaa cggtcaaggt ccacaaggtc ctaaaggtga tccgggtcca cctggtattc 1860
ctggtagaaa tggtgaccct ggacctcccg gttccccagg tagcccagga tcacctgggc 1920
ctcctggaat atgtgaatcc tgcccaactg gtggtcagaa ctatagccca caatacgagg 1980
cctacgacgt caaatctggt gttgctggag gaggtattgc aggctaccct ggtcccgcag 2040
ggcccccagg tccgccgggt ccgcccggaa catcaggtca tcccggagcc cctggtgcac 2100
caggttatca gggaccgccc ggagagcctg gacaagctgg tcccgctgga ccccctggtc 2160
caccaggtgc tattggacca agtggtcctg ccggaaaaga cggtgaatcc ggtagacctg 2220
gtagacccgg cgaaaggggt ttcccaggtc ctcccggaat gaagggtcca gccggtatgc 2280
ccggttttcc tgggatgaag ggtcacagag gatttgatgg tagaaacgga gagaaaggcg 2340
aaaccggtgc tcccggactg aagggtgaaa acggtgtccc tggtgagaac ggcgctcctg 2400
gacctatggg tccacgtggt gctccaggag aaagaggcag accaggattg cctggtgcag 2460
ctggtgctag aggtaacgat ggtgcccgtg gttccgatgg acaacccggg ccacccggcc 2520
ctccaggtac cgctggattt cctggaagcc ctggtgctaa gggggaggtt ggtccggctg 2580
gtagtcccgg aagtagcggt gccccaggtc aaagaggcga accaggccct cagggtcacg 2640
caggagcacc tggaccgcct ggtcctcctg gttcgaatgg ttcgcctgga ggaaaaggtg 2700
aaatggggcc cgcaggaatc cccggtgcgc ctggtcttat tggtgccagg ggtcctccag 2760
gcccgccagg tacaaatggt gtacccggac agcgaggagc agctggtgaa cctggtaaaa 2820
acggtgccaa aggagatcca ggtcctcgtg gagagcgtgg tgaagctggc tctcccggta 2880
tcgccggtcc aaaaggtgag gacggtaagg acggttcccc tggtgagcca ggtgcgaacg 2940
gactgccagg tgcagccgga gagcgaggag tcccaggatt caggggacca gccggtgcta 3000
acggcttgcc tggtgaaaaa gggccccctg gtgatagggg aggacccggt ccagcaggcc 3060
ctcgtggagt tgctggtgag cctggacgtg acggtttacc aggagggcca ggtttgaggg 3120
gtattcccgg gtcccctggc ggtcctggat cggatggaaa accagggcca ccaggttcgc 3180
agggtgaaac aggacgtcca ggcccacccg gctcacctgg tccaaggggt cagcctggtg 3240
tcatgggttt ccccggtcca aagggtaatg acggagcacc gggtaaaaat ggtgaacgtg 3300
gtggcccagg tggtccagga ccccaaggtc cagctggaaa aaacggtgag acaggtcctc 3360
aaggacctcc aggacctacc ggtcctagcg gagataaggg agatacggga ccgccaggac 3420
ctcaaggatt gcaaggtttg cctggtacat ctggccctcc cggagaaaat ggtaagcctg 3480
gagagccagg accaaaaggc gaagctggag ccccaggtat ccccggaggt aagggagact 3540
caggtgctcc gggtgagcgt ggtcctccgg gtgccggtgg tccacctgga cctagaggtg 3600
gtgccgggcc gccaggtcct gaaggtggta aaggtgctgc tggtccaccg ggaccgcctg 3660
gctctgctgg tactcctggc ttgcagggaa tgccaggaga gagaggtgga cctggaggtc 3720
ccggtccgaa gggtgataaa ggggagccag gatcatccgg tgttgacggc gcacctggta 3780
aagacggacc aaggggacca acgggtccaa tcggaccacc aggacccgct ggccagccag 3840
gagataaagg cgagtccgga gcacccggtg ttcctggtat agctggaccc aggggtggtc 3900
ccggtgaaag aggtgaacag ggcccaccgg gtcccgccgg tttccctggc gcccctggtc 3960
aaaatggaga accaggtgca aagggcgaga gaggagcccc aggagaaaag ggtgagggag 4020
gaccacccgg tgctgccggt ccagctgggg gttcaggtcc tgctggacca ccaggtccac 4080
agggcgttaa aggtgagaga ggaagtccag gtggtcctgg agctgctgga ttcccaggtg 4140
gccgtggacc tcctggtccc cctggatcga atggtaatcc tggtccgcca ggtagttcgg 4200
gtgctcctgg gaaggacggt ccacctggcc ccccaggtag taacggtgca cctggtagtc 4260
caggtatatc cggacctaaa ggagattccg gtccaccagg cgaaagaggg gccccaggcc 4320
cacagggtcc accaggagcc cccggtcctc tgggtattgc tggtcttact ggtgcacgtg 4380
gactggccgg tccacccgga atgcctggag caagaggttc acctggacca caaggtatta 4440
aaggagagaa cggtaaacct ggaccttccg gtcaaaacgg agagcgggga cccccaggcc 4500
cccaaggtct gccaggacta gctggtaccg caggggaacc aggaagagat ggaaatccag 4560
gttcagacgg actacccggt agagatggtg caccgggggc caagggcgac aggggtgaga 4620
atggatctcc tggtgcgcca ggggcaccag gccacccagg tcccccaggt cctgtgggcc 4680
ctgctggaaa gtcaggtgac aggggagaga caggcccggc tggtccatct ggcgcacccg 4740
gaccagctgg ttccagaggc ccacctggtc cgcaaggccc tagaggtgac aagggagaga 4800
ctggagaacg aggtgctatg ggtatcaagg gtcatagagg ttttccgggt aatcccggcg 4860
ccccaggttc tcctggtcca gctggccatc aaggtgcagt cggatcgccc ggcccagccg 4920
gtcccagggg ccctgttggt ccatccggtc ctccaggaaa ggatggtgct tctggacacc 4980
caggacctat cggacctccg ggtcctagag gtaatagagg agaacgtgga tccgagggta 5040
gtcctggtca ccctggtcaa cctggcccac cagggcctcc aggtgcaccc ggtccatgtt 5100
gtggtgcagg cggtgtggct gcaattgctg gtgtgggtgc tgaaaaggcc ggcggtttcg 5160
ctccatatta tggtgatgaa ccgattgatt ttaagatcaa tactgacgaa atcatgactt 5220
ccttaaagtc cgttaatggt caaattgagt ctctaatctc cccagatggt tcacgtaaaa 5280
atcctgctag aaattgtaga gatttgaagt tttgtcaccc cgagttgcag tccggtgagt 5340
actgggtgga ccccaatcaa ggttgtaagt tagacgctat taaagtttac tgcaatatgg 5400
agacaggaga aacttgcatc agcgcttctc cattgactat cccacaaaaa aattggtgga 5460
ctgactctgg agctgagaaa aagcatgtat ggttcgggga atcgatggag ggtggttttc 5520
agtttagcta tggcaatcct gaacttcccg aagacgtcct cgatgtccag ctggcattcc 5580
tccgacttct ctccagccgg gcctctcaga acatcacata tcactgcaag aatagcattg 5640
catacatgga tcatgccagt gggaatgtaa agaaagcctt gaagctgatg gggtcaaatg 5700
aaggtgaatt caaggctgaa ggaaatagca aattcacata cacagttctg gaggatggtt 5760
gcacaaaaca cactggggaa tggggcaaaa cagtcttcca gtatcaaaca cgcaaggccg 5820
tcagactacc tattgtagat attgcaccct atgatatcgg tggtcctgat caagaatttg 5880
gtgcggacat tggccctgtt tgctttttat aatcaagagg atgtcagaat gccatttgcc 5940
tgagagatgc aggcttcatt tttgatactt ttttatttgt aacctatata gtataggatt 6000
ttttttgtca ttttgtttct tctcgtacga gcttgctcct gatcagccta tctcgcagct 6060
gatgaatatc ttgtggtagg ggtttgggaa aatcattcga gtttgatgtt tttcttggta 6120
tttcccactc ctcttcagag tacagaagat taagtgagac gttcgtttgt gctccggagg 6180
atccttcagt aatgtcttgt ttcttttgtt gcagtggtga gccattttga cttcgtgaaa 6240
gtttctttag aatagttgtt tccagaggcc aaacattcca cccgtagtaa agtgcaagcg 6300
taggaagacc aagactggca taaatcaggt ataagtgtcg agcactggca ggtgatcttc 6360
tgaaagtttc tactagcaga taagatccag tagtcatgca tatggcaaca atgtaccgtg 6420
tggatctaag aacgcgtcct actaaccttc gcattcgttg gtccagtttg ttgttatcga 6480
tcaacgtgac aaggttgtcg attccgcgta agcatgcata cccaaggacg cctgttgcaa 6540
ttccaagtga gccagttcca acaatctttg taatattaga gcacttcatt gtgttgcgct 6600
tgaaagtaaa atgcgaacaa attaagagat aatctcgaaa ccgcgacttc aaacgccaat 6660
atgatgtgcg gcacacaata agcgttcata tccgctgggt gactttctcg ctttaaaaaa 6720
ttatccgaaa aaattttcta gagtgttgtt actttatact tccggctcgt ataatacgac 6780
aaggtgtaag gaggactaaa ccatggctaa actcacctct gctgttccag tcctgactgc 6840
tcgtgatgtt gctggtgctg ttgagttctg gactgatagg ctcggtttct cccgtgactt 6900
cgtagaggac gactttgccg gtgttgtacg tgacgacgtt accctgttca tctccgcagt 6960
tcaggaccag gttgtgccag acaacactct ggcatgggta tgggttcgtg gtctggacga 7020
actgtacgct gagtggtctg aggtcgtgtc taccaacttc cgtgatgcat ctggtccagc 7080
tatgaccgag atcggtgaac agccctgggg tcgtgagttt gcactgcgtg atccagctgg 7140
taactgcgtg catttcgtcg cagaagagca ggactaacaa ttgacacctt acgattattt 7200
agagagtatt tattagtttt attgtatgta tacggatgtt ttattatcta tttatgccct 7260
tatattctgt aactatccaa aagtcctatc ttatcaagcc agcaatctat gtccgcgaac 7320
gtcaactaaa aataagcttt ttatgctctt ctctcttttt ttcccttcgg tataattata 7380
ccttgcatcc acagattctc ctgccaaatt ttgcataatc ctttacaaca tggctatatg 7440
ggagcactta gcgccctcca aaacccatat tgccta 7476
<210> 32
<211> 7479
<212> DNA
<213> Artificial Sequence
<220>
<223> MMV197
<400> 32
aaggggagcc aggatcatcc ggtgttgacg gcgcacctgg taaagacgga ccaaggggac 60
caacgggtcc aatcggacca ccaggacccg ctggccagcc aggagataaa ggcgagtccg 120
gagcacccgg tgttcctggt atagctggac ccaggggtgg tcccggtgaa agaggtgaac 180
agggcccacc gggtcccgcc ggtttccctg gcgcccctgg tcaaaatgga gaaccaggtg 240
caaagggcga gagaggagcc ccaggagaaa agggtgaggg aggaccaccc ggtgctgccg 300
gtccagctgg gggttcaggt cctgctggac caccaggtcc acagggcgtt aaaggtgaga 360
gaggaagtcc aggtggtcct ggagctgctg gattcccagg tggccgtgga cctcctggtc 420
cccctggatc gaatggtaat cctggtccgc caggtagttc gggtgctcct gggaaggacg 480
gtccacctgg ccccccaggt agtaacggtg cacctggtag tccaggtata tccggaccta 540
aaggagattc cggtccacca ggcgaaagag gggccccagg cccacagggt ccaccaggag 600
cccccggtcc tctgggtatt gctggtctta ctggtgcacg tggactggcc ggtccacccg 660
gaatgcctgg agcaagaggt tcacctggac cacaaggtat taaaggagag aacggtaaac 720
ctggaccttc cggtcaaaac ggagagcggg gacccccagg cccccaaggt ctgccaggac 780
tagctggtac cgcaggggaa ccaggaagag atggaaatcc aggttcagac ggactacccg 840
gtagagatgg tgcaccgggg gccaagggcg acaggggtga gaatggatct cctggtgcgc 900
caggggcacc aggccaccca ggtcccccag gtcctgtggg ccctgctgga aagtcaggtg 960
acaggggaga gacaggcccg gctggtccat ctggcgcacc cggaccagct ggttccagag 1020
gcccacctgg tccgcaaggc cctagaggtg acaagggaga gactggagaa cgaggtgcta 1080
tgggtatcaa gggtcataga ggttttccgg gtaatcccgg cgccccaggt tctcctggtc 1140
cagctggcca tcaaggtgca gtcggatcgc ccggcccagc cggtcccagg ggccctgttg 1200
gtccatccgg tcctccagga aaggatggtg cttctggaca cccaggacct atcggacctc 1260
cgggtcctag aggtaataga ggagaacgtg gatccgaggg tagtcctggt caccctggtc 1320
aacctggccc accagggcct ccaggtgcac ccggtccatg ttgtggtgca ggcggtgtgg 1380
ctgcaattgc tggtgtgggt gctgaaaagg ccggcggttt cgctccatat tatggtgatg 1440
aaccgattga ttttaagatc aatactgacg aaatcatgac ttccttaaag tccgttaatg 1500
gtcaaattga gtctctaatc tccccagatg gttcacgtaa aaatcctgct agaaattgta 1560
gagatttgaa gttttgtcac cccgagttgc agtccggtga gtactgggtg gaccccaatc 1620
aaggttgtaa gttagacgct attaaagttt actgcaatat ggagacagga gaaacttgca 1680
tcagcgcttc tccattgact atcccacaaa aaaattggtg gactgactct ggagctgaga 1740
aaaagcatgt atggttcggg gaatcgatgg aaggtggttt ccaattcagc tacggtaacc 1800
ctgaacttcc tgaagatgtt cttgacgttc aattggcatt tctgagattg ttgtccagtc 1860
gtgcaagcca aaacattaca taccattgca aaaattccat cgcatatatg gatcatgcta 1920
gcggaaatgt gaaaaaggca ttgaagctga tgggatcaaa tgaaggtgaa tttaaagcag 1980
agggtaattc taagtttact tacactgtat tggaggatgg ttgtacgaag catacaggtg 2040
aatggggtaa aacagtgttt caatatcaaa cccgcaaagc agttagattg ccaatcgtcg 2100
atatcgcacc atacgacatt ggaggaccag atcaagagtt cggagctgac atcggtccgg 2160
tgtgtttcct ttgataatca agaggatgtc agaatgccat ttgcctgaga gatgcaggct 2220
tcatttttga tactttttta tttgtaacct atatagtata ggattttttt tgtcattttg 2280
tttcttctcg tacgagcttg ctcctgatca gcctatctcg cagctgatga atatcttgtg 2340
gtaggggttt gggaaaatca ttcgagtttg atgtttttct tggtatttcc cactcctctt 2400
cagagtacag aagattaagt gagacgttcg tttgtgctcc ggaggatcct tcagtaatgt 2460
cttgtttctt ttgttgcagt ggtgagccat tttgacttcg tgaaagtttc tttagaatag 2520
ttgtttccag aggccaaaca ttccacccgt agtaaagtgc aagcgtagga agaccaagac 2580
tggcataaat caggtataag tgtcgagcac tggcaggtga tcttctgaaa gtttctacta 2640
gcagataaga tccagtagtc atgcatatgg caacaatgta ccgtgtggat ctaagaacgc 2700
gtcctactaa ccttcgcatt cgttggtcca gtttgttgtt atcgatcaac gtgacaaggt 2760
tgtcgattcc gcgtaagcat gcatacccaa ggacgcctgt tgcaattcca agtgagccag 2820
ttccaacaat ctttgtaata ttagagcact tcattgtgtt gcgcttgaaa gtaaaatgcg 2880
aacaaattaa gagataatct cgaaaccgcg acttcaaacg ccaatatgat gtgcggcaca 2940
caataagcgt tcatatccgc tgggtgactt tctcgcttta aaaaattatc cgaaaaaatt 3000
ttctagagtg ttgttacttt atacttccgg ctcgtataat acgacaaggt gtaaggagga 3060
ctaaaccatg gctaaactca cctctgctgt tccagtcctg actgctcgtg atgttgctgg 3120
tgctgttgag ttctggactg ataggctcgg tttctcccgt gacttcgtag aggacgactt 3180
tgccggtgtt gtacgtgacg acgttaccct gttcatctcc gcagttcagg accaggttgt 3240
gccagacaac actctggcat gggtatgggt tcgtggtctg gacgaactgt acgctgagtg 3300
gtctgaggtc gtgtctacca acttccgtga tgcatctggt ccagctatga ccgagatcgg 3360
tgaacagccc tggggtcgtg agtttgcact gcgtgatcca gctggtaact gcgtgcattt 3420
cgtcgcagaa gagcaggact aacaattgac accttacgat tatttagaga gtatttatta 3480
gttttattgt atgtatacgg atgttttatt atctatttat gcccttatat tctgtaacta 3540
tccaaaagtc ctatcttatc aagccagcaa tctatgtccg cgaacgtcaa ctaaaaataa 3600
gctttttatg ctcttctctc tttttttccc ttcggtataa ttataccttg catccacaga 3660
ttctcctgcc aaattttgca taatccttta caacatggct atatgggagc acttagcgcc 3720
ctccaaaacc catattgcct acgcatgtat aggtgttttt tccacaatat tttctctgtg 3780
ctctcttttt attaaagaga agctctatat cggagaagct tctgtggccg ttatattcgg 3840
ccttatcgtg ggaccacatt gcctgaattg gtttgccccg gaagattggg gaaacttgga 3900
tctgattacc ttagctgcag aaaagggtac cactgagcgt cagaccccgt agaaaagatc 3960
aaaggatctt cttgagatcc tttttttctg cgcgtaatct gctgcttgca aacaaaaaaa 4020
ccaccgctac cagcggtggt ttgtttgccg gatcaagagc taccaactct ttttccgaag 4080
gtaactggct tcagcagagc gcagatacca aatactgttc ttctagtgta gccgtagtta 4140
ggccaccact tcaagaactc tgtagcaccg cctacatacc tcgctctgct aatcctgtta 4200
ccagtggctg ctgccagtgg cgataagtcg tgtcttaccg ggttggaccc aagacgatag 4260
ttaccggata aggcgcagcg gtcgggctga acggggggtt cgtgcacaca gcccagcttg 4320
gagcgaacga cctacaccga actgagatac ctacagcgtg agctatgaga aagcgccacg 4380
cttcccgaag ggagaaaggc ggacaggtat ccggtaagcg gcagggtcgg aacaggagag 4440
cgcacgaggg agcttccagg gggaaacgcc tggtatcttt atagtcctgt cgggtttcgc 4500
cacctctgac ttgagcgtcg atttttgtga tgctcgtcag gggggcggag cctatggaaa 4560
aacgccagca acgcggcctt tttacggttc ctggcctttt gctggccttt tgctcacatg 4620
tatttaaata atgtatctaa acgcaaactc cgagctggaa aaatgttacc ggcgatgcgc 4680
ggacaattta gaggcggcga tcaagaaaca cctgctgggc gagcagtctg gagcacagtc 4740
ttcgatgggc ccgagatccc accgcgttcc tgggtaccgg gacgtgaggc agcgcgacat 4800
ccatcaaata taccaggcgc caaccgagtg tctcggaaaa cagcttctgg atatcttccg 4860
ctggcggcgc aacgacgaat aatagtccct ggaggtgacg gaatatatat gtgtggaggg 4920
taaatctgac agggtgtagc aaaggtaata ttttcctaaa acatgcaatc ggctgccccg 4980
caacgggaaa aagaatgact ttggcactct tcaccagagt ggggtgtccc gctcgtgtgt 5040
gcaaataggc tcccactggt caccccggat tttgcagaaa aacagcaagt tccggggtgt 5100
ctcactggtg tccgccaata agaggagccg gcaggcacgg agtttacatc aagctgtctc 5160
cgatacactc gactaccatc cgggtctctc agagagggga atggcactat aaataccgcc 5220
tccttgcgct ctctgccttc atcaatcaaa tcatgatgag ctttgtgcaa aaggggacct 5280
ggttactttt cgctctgctt catcccactg ttattttggc acaacaggaa gctgttgacg 5340
gaggatgctc ccatctcggt cagtcttatg cagatagaga tgtatggaaa ccagaaccgt 5400
gccaaatatg cgtctgtgac tcaggatccg ttctctgtga tgacataata tgtgacgacc 5460
aagaattaga ctgccccaac cctgaaatcc cgtttggaga atgttgtgca gtttgcccac 5520
agcctccaac agctcccact cgccctccta atggtcaagg acctcaaggc cccaagggag 5580
atccaggtcc tcctggtatt cctgggcgaa atggcgatcc tggtcctcca ggatcaccag 5640
gctccccagg ttctcccggc cctcctggaa tctgtgaatc atgtcctact ggtggccaga 5700
actattctcc ccagtacgaa gcatatgatg tcaagtctgg agtagcagga ggaggaatcg 5760
caggctatcc tgggccagct ggtcctcctg gcccacccgg accccctggc acatctggcc 5820
atcctggtgc ccctggcgct ccaggatacc aaggtccccc cggtgaacct gggcaagctg 5880
gtccggcagg tcctccagga cctcctggtg ctataggtcc atctggccct gctggaaaag 5940
atggggaatc aggaagaccc ggacgacctg gagagcgagg atttcctggc cctcctggta 6000
tgaaaggccc agctggtatg cctggattcc ctggtatgaa aggacacaga ggctttgatg 6060
gacgaaatgg agagaaaggc gaaactggtg ctcctggatt aaagggggaa aatggcgttc 6120
caggtgaaaa tggagctcct ggacccatgg gtccaagagg ggctcccggt gagagaggac 6180
ggccaggact tcctggagcc gcaggggctc gaggtaatga tggagctcga ggaagtgatg 6240
gacaaccggg cccccctggt cctcctggaa ctgcaggatt ccctggttcc cctggtgcta 6300
agggtgaagt tggacctgca ggatctcctg gttcaagtgg cgcccctgga caaagaggag 6360
aacctggacc tcagggacat gctggtgctc caggtccccc tgggcctcct gggagtaatg 6420
gtagtcctgg tggcaaaggt gaaatgggtc ctgctggcat tcctggggct cctgggctga 6480
taggagctcg tggtcctcca gggccacctg gcaccaatgg tgttcccggg caacgaggtg 6540
ctgcaggtga acccggtaag aatggagcca aaggagaccc aggaccacgt ggggaacgcg 6600
gagaagctgg ttctccaggt atcgcaggac ctaagggtga agatggcaaa gatggttctc 6660
ctggagaacc tggtgcaaat ggacttcctg gagctgcagg agaaaggggt gtgcctggat 6720
tccgaggacc tgctggagca aatggccttc caggagaaaa gggtcctcct ggggaccgtg 6780
gtggcccagg ccctgcaggg cccagaggtg ttgctggaga gcccggcaga gatggtctcc 6840
ctggaggtcc aggattgagg ggtattcctg gtagccccgg aggaccaggc agtgatggga 6900
aaccagggcc tcctggaagc caaggagaga cgggtcgacc cggtcctcca ggttcacctg 6960
gtccgcgagg ccagcctggt gtcatgggct tccctggtcc caaaggaaac gatggtgctc 7020
ctggaaaaaa tggagaacga ggtggccctg gaggtcctgg ccctcagggt cctgctggaa 7080
agaatggtga gaccggacct cagggtcctc caggacctac tggcccttct ggtgacaaag 7140
gagacacagg accccctggt ccacaaggac tacaaggctt gcctggaacg agtggtcccc 7200
caggagaaaa cggaaaacct ggtgaacctg gtccaaaggg tgaggctggt gcacctggaa 7260
ttccaggagg caagggtgat tctggtgctc ccggtgaacg cggacctcct ggagcaggag 7320
ggccccctgg acctagaggt ggagctggcc cccctggtcc cgaaggagga aagggtgctg 7380
ctggtccacc gggaccgcct ggctctgctg gtactcctgg cttgcaggga atgccaggag 7440
agagaggtgg acctggaggt cccggtccga agggtgata 7479
<210> 33
<211> 7479
<212> DNA
<213> Artificial Sequence
<220>
<223> MMV198
<400> 33
tccgccaata agaggagccg gcaggcacgg agtttacatc aagctgtctc cgatacactc 60
gactaccatc cgggtctctc agagagggga atggcactat aaataccgcc tccttgcgct 120
ctctgccttc atcaatcaaa tcatgatgag ctttgtgcaa aaggggacct ggttactttt 180
cgctctgctt catcccactg ttattttggc acaacaggaa gctgttgacg gaggatgctc 240
ccatctcggt cagtcttatg cagatagaga tgtatggaaa ccagaaccgt gccaaatatg 300
cgtctgtgac tcaggatccg ttctctgtga tgacataata tgtgacgacc aagaattaga 360
ctgccccaac cctgaaatcc cgtttggaga atgttgtgca gtttgcccac agcctccaac 420
agctcccact cgccctccta atggtcaagg acctcaaggc cccaagggag atccaggtcc 480
tcctggtatt cctgggcgaa atggcgatcc tggtcctcca ggatcaccag gctccccagg 540
ttctcccggc cctcctggaa tctgtgaatc atgtcctact ggtggccaga actattctcc 600
ccagtacgaa gcatatgatg tcaagtctgg agtagcagga ggaggaatcg caggctatcc 660
tgggccagct ggtcctcctg gcccacccgg accccctggc acatctggcc atcctggtgc 720
ccctggcgct ccaggatacc aaggtccccc cggtgaacct gggcaagctg gtccggcagg 780
tcctccagga cctcctggtg ctataggtcc atctggccct gctggaaaag atggggaatc 840
aggaagaccc ggacgacctg gagagcgagg atttcctggc cctcctggta tgaaaggccc 900
agctggtatg cctggattcc ctggtatgaa aggacacaga ggctttgatg gacgaaatgg 960
agagaaaggc gaaactggtg ctcctggatt aaagggggaa aatggcgttc caggtgaaaa 1020
tggagctcct ggacccatgg gtccaagagg ggctcccggt gagagaggac ggccaggact 1080
tcctggagcc gcaggggctc gaggtaatga tggagctcga ggaagtgatg gacaaccggg 1140
cccccctggt cctcctggaa ctgcaggatt ccctggttcc cctggtgcta agggtgaagt 1200
tggacctgca ggatctcctg gttcaagtgg cgcccctgga caaagaggag aacctggacc 1260
tcagggacat gctggtgctc caggtccccc tgggcctcct gggagtaatg gtagtcctgg 1320
tggcaaaggt gaaatgggtc ctgctggcat tcctggggct cctgggctga taggagctcg 1380
tggtcctcca gggccacctg gcaccaatgg tgttcccggg caacgaggtg ctgcaggtga 1440
acccggtaag aatggagcca aaggagaccc aggaccacgt ggggaacgcg gagaagctgg 1500
ttctccaggt atcgcaggac ctaagggtga agatggcaaa gatggttctc ctggagaacc 1560
tggtgcaaat ggacttcctg gagctgcagg agaaaggggt gtgcctggat tccgaggacc 1620
tgctggagca aatggccttc caggagaaaa gggtcctcct ggggaccgtg gtggcccagg 1680
ccctgcaggg cccagaggtg ttgctggaga gcccggcaga gatggtctcc ctggaggtcc 1740
aggattgagg ggtattcctg gtagccccgg aggaccaggc agtgatggga aaccagggcc 1800
tcctggaagc caaggagaga cgggtcgacc cggtcctcca ggttcacctg gtccgcgagg 1860
ccagcctggt gtcatgggct tccctggtcc caaaggaaac gatggtgctc ctggaaaaaa 1920
tggagaacga ggtggccctg gaggtcctgg ccctcagggt cctgctggaa agaatggtga 1980
gaccggacct cagggtcctc caggacctac tggcccttct ggtgacaaag gagacacagg 2040
accccctggt ccacaaggac tacaaggctt gcctggaacg agtggtcccc caggagaaaa 2100
cggaaaacct ggtgaacctg gtccaaaggg tgaggctggt gcacctggaa ttccaggagg 2160
caagggtgat tctggtgctc ccggtgaacg cggacctcct ggagcaggag ggccccctgg 2220
acctagaggt ggagctggcc cccctggtcc cgaaggagga aagggtgctg ctggtccccc 2280
tgggccacct ggttctgctg gtacacctgg tctgcaagga atgcctggag aaagaggggg 2340
tcctggaggc cctggtccaa agggtgataa gggtgagcct ggcagctcag gtgtcgatgg 2400
tgctccaggg aaagatggtc cacggggtcc cactggtccc attggtcctc ctggcccagc 2460
tggtcagcct ggagataagg gtgaaagtgg tgcccctgga gttccgggta tagctggtcc 2520
tcgcggtggc cctggtgaga gaggcgaaca ggggccccca ggacctgctg gcttccctgg 2580
tgctcctggc cagaatggtg agcctggtgc taaaggagaa agaggcgctc ctggtgagaa 2640
aggtgaagga ggccctcccg gagccgcagg acccgccgga ggttctgggc ctgccggtcc 2700
cccaggcccc caaggtgtca aaggcgaacg tggcagtcct ggtggtcctg gtgctgctgg 2760
cttccccggt ggtcgtggtc ctcctggccc tcctggcagt aatggtaacc caggcccccc 2820
aggctccagt ggtgctccag gcaaagatgg tcccccaggt ccacctggca gtaatggtgc 2880
tcctggcagc cccgggatct ctggaccaaa gggtgattct ggtccaccag gtgagagggg 2940
agcacctggc ccccagggcc ctccgggagc tccaggccca ctaggaattg caggacttac 3000
tggagcacga ggtcttgcag gcccaccagg catgccaggt gctaggggca gccccggccc 3060
acagggcatc aagggtgaaa atggtaaacc aggacctagt ggtcagaatg gagaacgtgg 3120
tcctcctggc ccccagggtc ttcctggtct ggctggtaca gctggtgagc ctggaagaga 3180
tggaaaccct ggatcagatg gtctgccagg ccgagatgga gctccaggtg ccaagggtga 3240
ccgtggtgaa aatggctctc ctggtgcccc tggagctcct ggtcacccag gccctcctgg 3300
tcctgtcggt ccagctggaa agagcggtga cagaggagaa actggccctg ctggtccttc 3360
tggggccccc ggtcctgccg gatcaagagg tcctcctggt ccccaaggcc cacgcggtga 3420
caaaggggaa accggtgagc gtggtgctat gggcatcaaa ggacatcgcg gattccctgg 3480
caacccaggg gcccccggat ctccgggtcc cgctggtcat caaggtgcag ttggcagtcc 3540
aggccctgca ggccccagag gacctgttgg acctagcggg ccccctggaa aggacggagc 3600
aagtggacac cctggtccca ttggaccacc ggggccccga ggtaacagag gtgaaagagg 3660
atctgagggc tccccaggcc acccaggaca accaggccct cctggacctc ctggtgcccc 3720
tggtccatgt tgtggtgctg gcggtgtggc tgcaattgct ggtgtgggtg ctgaaaaggc 3780
cggcggtttc gctccatatt atggtgatga accgattgat tttaagatca atactgacga 3840
aatcatgact tccttaaagt ccgttaatgg tcaaattgag tctctaatct ccccagatgg 3900
ttcacgtaaa aatcctgcta gaaattgtag agatttgaag ttttgtcacc ccgagttgca 3960
gtccggtgag tactgggtgg accccaatca aggttgtaag ttagacgcta ttaaagttta 4020
ctgcaatatg gagacaggag aaacttgcat cagcgcttct ccattgacta tcccacaaaa 4080
aaattggtgg actgactctg gagctgagaa aaagcatgta tggttcgggg aatcgatgga 4140
aggtggtttc caattcagct acggtaaccc tgaacttcct gaagatgttc ttgacgttca 4200
attggcattt ctgagattgt tgtccagtcg tgcaagccaa aacattacat accattgcaa 4260
aaattccatc gcatatatgg atcatgctag cggaaatgtg aaaaaggcat tgaagctgat 4320
gggatcaaat gaaggtgaat ttaaagcaga gggtaattct aagtttactt acactgtatt 4380
ggaggatggt tgtacgaagc atacaggtga atggggtaaa acagtgtttc aatatcaaac 4440
ccgcaaagca gttagattgc caatcgtcga tatcgcacca tacgacattg gaggaccaga 4500
tcaagagttc ggagctgaca tcggtccggt gtgtttcctt tgataatcaa gaggatgtca 4560
gaatgccatt tgcctgagag atgcaggctt catttttgat acttttttat ttgtaaccta 4620
tatagtatag gatttttttt gtcattttgt ttcttctcgt acgagcttgc tcctgatcag 4680
cctatctcgc agctgatgaa tatcttgtgg taggggtttg ggaaaatcat tcgagtttga 4740
tgtttttctt ggtatttccc actcctcttc agagtacaga agattaagtg agacgttcgt 4800
ttgtgctccg gaggatcctt cagtaatgtc ttgtttcttt tgttgcagtg gtgagccatt 4860
ttgacttcgt gaaagtttct ttagaatagt tgtttccaga ggccaaacat tccacccgta 4920
gtaaagtgca agcgtaggaa gaccaagact ggcataaatc aggtataagt gtcgagcact 4980
ggcaggtgat cttctgaaag tttctactag cagataagat ccagtagtca tgcatatggc 5040
aacaatgtac cgtgtggatc taagaacgcg tcctactaac cttcgcattc gttggtccag 5100
tttgttgtta tcgatcaacg tgacaaggtt gtcgattccg cgtaagcatg catacccaag 5160
gacgcctgtt gcaattccaa gtgagccagt tccaacaatc tttgtaatat tagagcactt 5220
cattgtgttg cgcttgaaag taaaatgcga acaaattaag agataatctc gaaaccgcga 5280
cttcaaacgc caatatgatg tgcggcacac aataagcgtt catatccgct gggtgacttt 5340
ctcgctttaa aaaattatcc gaaaaaattt tctagagtgt tgttacttta tacttccggc 5400
tcgtataata cgacaaggtg taaggaggac taaaccatgg ctaaactcac ctctgctgtt 5460
ccagtcctga ctgctcgtga tgttgctggt gctgttgagt tctggactga taggctcggt 5520
ttctcccgtg acttcgtaga ggacgacttt gccggtgttg tacgtgacga cgttaccctg 5580
ttcatctccg cagttcagga ccaggttgtg ccagacaaca ctctggcatg ggtatgggtt 5640
cgtggtctgg acgaactgta cgctgagtgg tctgaggtcg tgtctaccaa cttccgtgat 5700
gcatctggtc cagctatgac cgagatcggt gaacagccct ggggtcgtga gtttgcactg 5760
cgtgatccag ctggtaactg cgtgcatttc gtcgcagaag agcaggacta acaattgaca 5820
ccttacgatt atttagagag tatttattag ttttattgta tgtatacgga tgttttatta 5880
tctatttatg cccttatatt ctgtaactat ccaaaagtcc tatcttatca agccagcaat 5940
ctatgtccgc gaacgtcaac taaaaataag ctttttatgc tcttctctct ttttttccct 6000
tcggtataat tataccttgc atccacagat tctcctgcca aattttgcat aatcctttac 6060
aacatggcta tatgggagca cttagcgccc tccaaaaccc atattgccta cgcatgtata 6120
ggtgtttttt ccacaatatt ttctctgtgc tctcttttta ttaaagagaa gctctatatc 6180
ggagaagctt ctgtggccgt tatattcggc cttatcgtgg gaccacattg cctgaattgg 6240
tttgccccgg aagattgggg aaacttggat ctgattacct tagctgcaga aaagggtacc 6300
actgagcgtc agaccccgta gaaaagatca aaggatcttc ttgagatcct ttttttctgc 6360
gcgtaatctg ctgcttgcaa acaaaaaaac caccgctacc agcggtggtt tgtttgccgg 6420
atcaagagct accaactctt tttccgaagg taactggctt cagcagagcg cagataccaa 6480
atactgttct tctagtgtag ccgtagttag gccaccactt caagaactct gtagcaccgc 6540
ctacatacct cgctctgcta atcctgttac cagtggctgc tgccagtggc gataagtcgt 6600
gtcttaccgg gttggaccca agacgatagt taccggataa ggcgcagcgg tcgggctgaa 6660
cggggggttc gtgcacacag cccagcttgg agcgaacgac ctacaccgaa ctgagatacc 6720
tacagcgtga gctatgagaa agcgccacgc ttcccgaagg gagaaaggcg gacaggtatc 6780
cggtaagcgg cagggtcgga acaggagagc gcacgaggga gcttccaggg ggaaacgcct 6840
ggtatcttta tagtcctgtc gggtttcgcc acctctgact tgagcgtcga tttttgtgat 6900
gctcgtcagg ggggcggagc ctatggaaaa acgccagcaa cgcggccttt ttacggttcc 6960
tggccttttg ctggcctttt gctcacatgt atttaaataa tgtatctaaa cgcaaactcc 7020
gagctggaaa aatgttaccg gcgatgcgcg gacaatttag aggcggcgat caagaaacac 7080
ctgctgggcg agcagtctgg agcacagtct tcgatgggcc cgagatccca ccgcgttcct 7140
gggtaccggg acgtgaggca gcgcgacatc catcaaatat accaggcgcc aaccgagtgt 7200
ctcggaaaac agcttctgga tatcttccgc tggcggcgca acgacgaata atagtccctg 7260
gaggtgacgg aatatatatg tgtggagggt aaatctgaca gggtgtagca aaggtaatat 7320
tttcctaaaa catgcaatcg gctgccccgc aacgggaaaa agaatgactt tggcactctt 7380
caccagagtg gggtgtcccg ctcgtgtgtg caaataggct cccactggtc accccggatt 7440
ttgcagaaaa acagcaagtt ccggggtgtc tcactggtg 7479
<210> 34
<211> 7479
<212> DNA
<213> Artificial Sequence
<220>
<223> MMV199
<400> 34
gcatgtatag gtgttttttc cacaatattt tctctgtgct ctctttttat taaagagaag 60
ctctatatcg gagaagcttc tgtggccgtt atattcggcc ttatcgtggg accacattgc 120
ctgaattggt ttgccccgga agattgggga aacttggatc tgattacctt agctgcagaa 180
aagggtacca ctgagcgtca gaccccgtag aaaagatcaa aggatcttct tgagatcctt 240
tttttctgcg cgtaatctgc tgcttgcaaa caaaaaaacc accgctacca gcggtggttt 300
gtttgccgga tcaagagcta ccaactcttt ttccgaaggt aactggcttc agcagagcgc 360
agataccaaa tactgttctt ctagtgtagc cgtagttagg ccaccacttc aagaactctg 420
tagcaccgcc tacatacctc gctctgctaa tcctgttacc agtggctgct gccagtggcg 480
ataagtcgtg tcttaccggg ttggacccaa gacgatagtt accggataag gcgcagcggt 540
cgggctgaac ggggggttcg tgcacacagc ccagcttgga gcgaacgacc tacaccgaac 600
tgagatacct acagcgtgag ctatgagaaa gcgccacgct tcccgaaggg agaaaggcgg 660
acaggtatcc ggtaagcggc agggtcggaa caggagagcg cacgagggag cttccagggg 720
gaaacgcctg gtatctttat agtcctgtcg ggtttcgcca cctctgactt gagcgtcgat 780
ttttgtgatg ctcgtcaggg gggcggagcc tatggaaaaa cgccagcaac gcggcctttt 840
tacggttcct ggccttttgc tggccttttg ctcacatgta tttaaataat gtatctaaac 900
gcaaactccg agctggaaaa atgttaccgg cgatgcgcgg acaatttaga ggcggcgatc 960
aagaaacacc tgctgggcga gcagtctgga gcacagtctt cgatgggccc gagatcccac 1020
cgcgttcctg ggtaccggga cgtgaggcag cgcgacatcc atcaaatata ccaggcgcca 1080
accgagtgtc tcggaaaaca gcttctggat atcttccgct ggcggcgcaa cgacgaataa 1140
tagtccctgg aggtgacgga atatatatgt gtggagggta aatctgacag ggtgtagcaa 1200
aggtaatatt ttcctaaaac atgcaatcgg ctgccccgca acgggaaaaa gaatgacttt 1260
ggcactcttc accagagtgg ggtgtcccgc tcgtgtgtgc aaataggctc ccactggtca 1320
ccccggattt tgcagaaaaa cagcaagttc cggggtgtct cactggtgtc cgccaataag 1380
aggagccggc aggcacggag tttacatcaa gctgtctccg atacactcga ctaccatccg 1440
ggtctctcag agaggggaat ggcactataa ataccgcctc cttgcgctct ctgccttcat 1500
caatcaaatc atgatgagct ttgtgcaaaa ggggacctgg ttacttttcg ctctgcttca 1560
tcccactgtt attttggcac aacaggaagc tgttgacgga ggatgctccc atctcggtca 1620
gtcttatgca gatagagatg tatggaaacc agaaccgtgc caaatatgcg tctgtgactc 1680
aggatccgtt ctctgtgatg acataatatg tgacgaccaa gaattagact gccccaaccc 1740
tgaaatcccg tttggagaat gttgtgcagt ttgcccacag cctccaacag ctcccactcg 1800
ccctcctaat ggtcaaggac ctcaaggccc caagggagat ccaggtcctc ctggtattcc 1860
tgggcgaaat ggcgatcctg gtcctccagg atcaccaggc tccccaggtt ctcccggccc 1920
tcctggaatc tgtgaatcat gtcctactgg tggccagaac tattctcccc agtacgaagc 1980
atatgatgtc aagtctggag tagcaggagg aggaatcgca ggctatcctg ggccagctgg 2040
tcctcctggc ccacccggac cccctggcac atctggccat cctggtgccc ctggcgctcc 2100
aggataccaa ggtccccccg gtgaacctgg gcaagctggt ccggcaggtc ctccaggacc 2160
tcctggtgct ataggtccat ctggccctgc tggaaaagat ggggaatcag gaagacccgg 2220
acgacctgga gagcgaggat ttcctggccc tcctggtatg aaaggcccag ctggtatgcc 2280
tggattccct ggtatgaaag gacacagagg ctttgatgga cgaaatggag agaaaggcga 2340
aactggtgct cctggattaa agggggaaaa tggcgttcca ggtgaaaatg gagctcctgg 2400
acccatgggt ccaagagggg ctcccggtga gagaggacgg ccaggacttc ctggagccgc 2460
aggggctcga ggtaatgatg gagctcgagg aagtgatgga caaccgggcc cccctggtcc 2520
tcctggaact gcaggattcc ctggttcccc tggtgctaag ggtgaagttg gacctgcagg 2580
atctcctggt tcaagtggcg cccctggaca aagaggagaa cctggacctc agggacatgc 2640
tggtgctcca ggtccccctg ggcctcctgg gagtaatggt agtcctggtg gcaaaggtga 2700
aatgggtcct gctggcattc ctggggctcc tgggctgata ggagctcgtg gtcctccagg 2760
gccacctggc accaatggtg ttcccgggca acgaggtgct gcaggtgaac ccggtaagaa 2820
tggagccaaa ggagacccag gaccacgtgg ggaacgcgga gaagctggtt ctccaggtat 2880
cgcaggacct aagggtgaag atggcaaaga tggttctcct ggagaacctg gtgcaaatgg 2940
acttcctgga gctgcaggag aaaggggtgt gcctggattc cgaggacctg ctggagcaaa 3000
tggccttcca ggagaaaagg gtcctcctgg ggaccgtggt ggcccaggcc ctgcagggcc 3060
cagaggtgtt gctggagagc ccggcagaga tggtctccct ggaggtccag gattgagggg 3120
tattcctggt agccccggag gaccaggcag tgatgggaaa ccagggcctc ctggaagcca 3180
aggagagacg ggtcgacccg gtcctccagg ttcacctggt ccgcgaggcc agcctggtgt 3240
catgggcttc cctggtccca aaggaaacga tggtgctcct ggaaaaaatg gagaacgagg 3300
tggccctgga ggtcctggcc ctcagggtcc tgctggaaag aatggtgaga ccggacctca 3360
gggtcctcca ggacctactg gcccttctgg tgacaaagga gacacaggac cccctggtcc 3420
acaaggacta caaggcttgc ctggaacgag tggtccccca ggagaaaacg gaaaacctgg 3480
tgaacctggt ccaaagggtg aggctggtgc acctggaatt ccaggaggca agggtgattc 3540
tggtgctccc ggtgaacgcg gacctcctgg agcaggaggg ccccctggac ctagaggtgg 3600
agctggcccc cctggtcccg aaggaggaaa gggtgctgct ggtccccctg ggccacctgg 3660
ttctgctggt acacctggtc tgcaaggaat gcctggagaa agagggggtc ctggaggccc 3720
tggtccaaag ggtgataagg gtgagcctgg cagctcaggt gtcgatggtg ctccagggaa 3780
agatggtcca cggggtccca ctggtcccat tggtcctcct ggcccagctg gtcagcctgg 3840
agataagggt gaaagtggtg cccctggagt tccgggtata gctggtcctc gcggtggccc 3900
tggtgagaga ggcgaacagg ggcccccagg acctgctggc ttccctggtg ctcctggcca 3960
gaatggtgag cctggtgcta aaggagaaag aggcgctcct ggtgagaaag gtgaaggagg 4020
ccctcccgga gccgcaggac ccgccggagg ttctgggcct gccggtcccc caggccccca 4080
aggtgtcaaa ggcgaacgtg gcagtcctgg tggtcctggt gctgctggct tccccggtgg 4140
tcgtggtcct cctggccctc ctggcagtaa tggtaaccca ggccccccag gctccagtgg 4200
tgctccaggc aaagatggtc ccccaggtcc acctggcagt aatggtgctc ctggcagccc 4260
cgggatctct ggaccaaagg gtgattctgg tccaccaggt gagaggggag cacctggccc 4320
ccagggccct ccgggagctc caggcccact aggaattgca ggacttactg gagcacgagg 4380
tcttgcaggc ccaccaggca tgccaggtgc taggggcagc cccggcccac agggcatcaa 4440
gggtgaaaat ggtaaaccag gacctagtgg tcagaatgga gaacgtggtc ctcctggccc 4500
ccagggtctt cctggtctgg ctggtacagc tggtgagcct ggaagagatg gaaaccctgg 4560
atcagatggt ctgccaggcc gagatggagc tccaggtgcc aagggtgacc gtggtgaaaa 4620
tggctctcct ggtgcccctg gagctcctgg tcacccaggc cctcctggtc ctgtcggtcc 4680
agctggaaag agcggtgaca gaggagaaac tggccctgct ggtccttctg gggcccccgg 4740
tcctgccgga tcaagaggtc ctcctggtcc ccaaggccca cgcggtgaca aaggggaaac 4800
cggtgagcgt ggtgctatgg gcatcaaagg acatcgcgga ttccctggca acccaggggc 4860
ccccggatct ccgggtcccg ctggtcatca aggtgcagtt ggcagtccag gccctgcagg 4920
ccccagagga cctgttggac ctagcgggcc ccctggaaag gacggagcaa gtggacaccc 4980
tggtcccatt ggaccaccgg ggccccgagg taacagaggt gaaagaggat ctgagggctc 5040
cccaggccac ccaggacaac caggccctcc tggacctcct ggtgcccctg gtccatgttg 5100
tggtgctggc ggggttgctg ccattgctgg tgttggagcc gaaaaagctg gtggttttgc 5160
cccatattat ggagatgaac cgatagattt caaaatcaac accgatgaga ttatgacctc 5220
actcaaatca gtcaatggac aaatagaaag cctcattagt cctgatggtt cccgtaaaaa 5280
ccctgcacgg aactgcaggg acctgaaatt ctgccatcct gaactccaga gtggagaata 5340
ttgggttgat cctaaccaag gttgcaaatt ggatgctatt aaagtctact gtaacatgga 5400
aactggggaa acgtgcataa gtgccagtcc tttgactatc ccacagaaga actggtggac 5460
agattctggt gctgagaaga aacatgtttg gtttggagaa tccatggaag gtggtttcca 5520
attcagctac ggtaaccctg aacttcctga agatgttctt gacgttcaat tggcatttct 5580
gagattgttg tccagtcgtg caagccaaaa cattacatac cattgcaaaa attccatcgc 5640
atatatggat catgctagcg gaaatgtgaa aaaggcattg aagctgatgg gatcaaatga 5700
aggtgaattt aaagcagagg gtaattctaa gtttacttac actgtattgg aggatggttg 5760
tacgaagcat acaggtgaat ggggtaaaac agtgtttcaa tatcaaaccc gcaaagcagt 5820
tagattgcca atcgtcgata tcgcaccata cgacattgga ggaccagatc aagagttcgg 5880
agctgacatc ggtccggtgt gtttcctttg ataatcaaga ggatgtcaga atgccatttg 5940
cctgagagat gcaggcttca tttttgatac ttttttattt gtaacctata tagtatagga 6000
ttttttttgt cattttgttt cttctcgtac gagcttgctc ctgatcagcc tatctcgcag 6060
ctgatgaata tcttgtggta ggggtttggg aaaatcattc gagtttgatg tttttcttgg 6120
tatttcccac tcctcttcag agtacagaag attaagtgag acgttcgttt gtgctccgga 6180
ggatccttca gtaatgtctt gtttcttttg ttgcagtggt gagccatttt gacttcgtga 6240
aagtttcttt agaatagttg tttccagagg ccaaacattc cacccgtagt aaagtgcaag 6300
cgtaggaaga ccaagactgg cataaatcag gtataagtgt cgagcactgg caggtgatct 6360
tctgaaagtt tctactagca gataagatcc agtagtcatg catatggcaa caatgtaccg 6420
tgtggatcta agaacgcgtc ctactaacct tcgcattcgt tggtccagtt tgttgttatc 6480
gatcaacgtg acaaggttgt cgattccgcg taagcatgca tacccaagga cgcctgttgc 6540
aattccaagt gagccagttc caacaatctt tgtaatatta gagcacttca ttgtgttgcg 6600
cttgaaagta aaatgcgaac aaattaagag ataatctcga aaccgcgact tcaaacgcca 6660
atatgatgtg cggcacacaa taagcgttca tatccgctgg gtgactttct cgctttaaaa 6720
aattatccga aaaaattttc tagagtgttg ttactttata cttccggctc gtataatacg 6780
acaaggtgta aggaggacta aaccatggct aaactcacct ctgctgttcc agtcctgact 6840
gctcgtgatg ttgctggtgc tgttgagttc tggactgata ggctcggttt ctcccgtgac 6900
ttcgtagagg acgactttgc cggtgttgta cgtgacgacg ttaccctgtt catctccgca 6960
gttcaggacc aggttgtgcc agacaacact ctggcatggg tatgggttcg tggtctggac 7020
gaactgtacg ctgagtggtc tgaggtcgtg tctaccaact tccgtgatgc atctggtcca 7080
gctatgaccg agatcggtga acagccctgg ggtcgtgagt ttgcactgcg tgatccagct 7140
ggtaactgcg tgcatttcgt cgcagaagag caggactaac aattgacacc ttacgattat 7200
ttagagagta tttattagtt ttattgtatg tatacggatg ttttattatc tatttatgcc 7260
cttatattct gtaactatcc aaaagtccta tcttatcaag ccagcaatct atgtccgcga 7320
acgtcaacta aaaataagct ttttatgctc ttctctcttt ttttcccttc ggtataatta 7380
taccttgcat ccacagattc tcctgccaaa ttttgcataa tcctttacaa catggctata 7440
tgggagcact tagcgccctc caaaacccat attgcctac 7479
<210> 35
<211> 4751
<212> DNA
<213> Bos taurus
<220>
<221> misc_feature
<222> (1)..(4751)
<223> Bos taurus collagen type I alpha 1 chain (COL1A1), mRNA; NCBI
Reference Sequence: NM_001034039.2
<220>
<221> CDS
<222> (119)..(4510)
<400> 35
gcagacggga gtttctcctc ggggtcggag caggaggcac gcggagtgtg aggccacgca 60
tgagcggacg ctaaccccca ccccagccgc aaagagtcta catgtctagg gtctagac 118
atg ttc agc ttt gtg gac ctc cgg ctc ctg ctc ctc tta gcg gcc acc 166
Met Phe Ser Phe Val Asp Leu Arg Leu Leu Leu Leu Leu Ala Ala Thr
1 5 10 15
gcc ctc ctg acg cac ggc caa gag gag ggc cag gaa gaa ggc caa gaa 214
Ala Leu Leu Thr His Gly Gln Glu Glu Gly Gln Glu Glu Gly Gln Glu
20 25 30
gaa gac atc cca cca gtc acc tgc gta cag aac ggc ctc agg tac cat 262
Glu Asp Ile Pro Pro Val Thr Cys Val Gln Asn Gly Leu Arg Tyr His
35 40 45
gac cga gac gtg tgg aaa ccc gtg ccc tgc cag atc tgt gtc tgc gac 310
Asp Arg Asp Val Trp Lys Pro Val Pro Cys Gln Ile Cys Val Cys Asp
50 55 60
aac ggc aac gtg ctg tgc gat gac gtg atc tgc gac gaa ctt aag gac 358
Asn Gly Asn Val Leu Cys Asp Asp Val Ile Cys Asp Glu Leu Lys Asp
65 70 75 80
tgt cct aac gcc aaa gtc ccc acg gac gaa tgc tgc ccc gtc tgc ccc 406
Cys Pro Asn Ala Lys Val Pro Thr Asp Glu Cys Cys Pro Val Cys Pro
85 90 95
gaa ggc cag gaa tca ccc acg gac caa gaa acc acc gga gtc gag gga 454
Glu Gly Gln Glu Ser Pro Thr Asp Gln Glu Thr Thr Gly Val Glu Gly
100 105 110
ccg aaa gga gac act ggc ccc cga ggc cca agg gga ccc gcc ggc ccc 502
Pro Lys Gly Asp Thr Gly Pro Arg Gly Pro Arg Gly Pro Ala Gly Pro
115 120 125
ccc ggc cga gat ggc atc cct gga caa cct gga ctt ccc gga ccc cct 550
Pro Gly Arg Asp Gly Ile Pro Gly Gln Pro Gly Leu Pro Gly Pro Pro
130 135 140
gga ccc ccc gga cct ccc gga ccc cct ggc ctc gga gga aac ttt gct 598
Gly Pro Pro Gly Pro Pro Gly Pro Pro Gly Leu Gly Gly Asn Phe Ala
145 150 155 160
ccc cag ttg tct tac ggc tat gat gag aaa tca aca gga att tcc gtg 646
Pro Gln Leu Ser Tyr Gly Tyr Asp Glu Lys Ser Thr Gly Ile Ser Val
165 170 175
cct ggt ccc atg ggt cct tct ggt cct cgt ggt ctc cct ggc ccc cct 694
Pro Gly Pro Met Gly Pro Ser Gly Pro Arg Gly Leu Pro Gly Pro Pro
180 185 190
ggc gca cct ggt ccc caa ggt ttc caa ggc ccc cct ggt gag cct ggc 742
Gly Ala Pro Gly Pro Gln Gly Phe Gln Gly Pro Pro Gly Glu Pro Gly
195 200 205
gag cca gga gcc tca ggt ccc atg ggt ccc cgt ggt ccc cct ggc ccc 790
Glu Pro Gly Ala Ser Gly Pro Met Gly Pro Arg Gly Pro Pro Gly Pro
210 215 220
cct ggc aag aac gga gat gat ggc gaa gct gga aag cct ggt cgt cct 838
Pro Gly Lys Asn Gly Asp Asp Gly Glu Ala Gly Lys Pro Gly Arg Pro
225 230 235 240
ggt gag cgc ggg cct ccc gga cct cag ggt gct cgg gga ttg cct gga 886
Gly Glu Arg Gly Pro Pro Gly Pro Gln Gly Ala Arg Gly Leu Pro Gly
245 250 255
aca gct ggc ctc cct gga atg aag gga cac aga ggt ttc agt ggt ttg 934
Thr Ala Gly Leu Pro Gly Met Lys Gly His Arg Gly Phe Ser Gly Leu
260 265 270
gat ggt gcc aag gga gat gct ggt cct gct ggc ccc aag ggc gag cct 982
Asp Gly Ala Lys Gly Asp Ala Gly Pro Ala Gly Pro Lys Gly Glu Pro
275 280 285
ggt agc ccc ggt gaa aat gga gct cct ggt cag atg ggc ccc cgt ggt 1030
Gly Ser Pro Gly Glu Asn Gly Ala Pro Gly Gln Met Gly Pro Arg Gly
290 295 300
ctg cct ggt gag aga ggt cgc cct gga gcc cct ggc cct gct ggt gct 1078
Leu Pro Gly Glu Arg Gly Arg Pro Gly Ala Pro Gly Pro Ala Gly Ala
305 310 315 320
cga gga aat gat ggt gcg act ggt gct gct ggg ccc cct ggt ccc act 1126
Arg Gly Asn Asp Gly Ala Thr Gly Ala Ala Gly Pro Pro Gly Pro Thr
325 330 335
ggc ccc gct ggt cct cct ggt ttc cct ggt gct gtg ggt gct aag ggt 1174
Gly Pro Ala Gly Pro Pro Gly Phe Pro Gly Ala Val Gly Ala Lys Gly
340 345 350
gaa ggt ggt ccc caa gga ccc cga ggt tct gaa ggt ccc cag ggt gta 1222
Glu Gly Gly Pro Gln Gly Pro Arg Gly Ser Glu Gly Pro Gln Gly Val
355 360 365
cgt ggt gag cct ggc ccc cct ggc cct gct ggt gct gct ggc cct gct 1270
Arg Gly Glu Pro Gly Pro Pro Gly Pro Ala Gly Ala Ala Gly Pro Ala
370 375 380
ggc aac cct ggt gct gat gga cag cct ggt gct aaa gga gcc aat ggc 1318
Gly Asn Pro Gly Ala Asp Gly Gln Pro Gly Ala Lys Gly Ala Asn Gly
385 390 395 400
gct cct ggt att gct ggt gct cct ggc ttc cct ggt gcc cga ggc ccc 1366
Ala Pro Gly Ile Ala Gly Ala Pro Gly Phe Pro Gly Ala Arg Gly Pro
405 410 415
tct gga ccc cag ggc ccc agc ggc ccc cct ggc ccc aag ggt aac agc 1414
Ser Gly Pro Gln Gly Pro Ser Gly Pro Pro Gly Pro Lys Gly Asn Ser
420 425 430
ggt gaa cct ggt gct cct ggc agc aaa gga gac act ggc gcc aag gga 1462
Gly Glu Pro Gly Ala Pro Gly Ser Lys Gly Asp Thr Gly Ala Lys Gly
435 440 445
gaa ccc ggt ccc act ggt att caa ggc ccc cct ggc ccc gct ggg gaa 1510
Glu Pro Gly Pro Thr Gly Ile Gln Gly Pro Pro Gly Pro Ala Gly Glu
450 455 460
gaa gga aag cga gga gcc cga ggt gaa cct gga cct gct ggc ctg cct 1558
Glu Gly Lys Arg Gly Ala Arg Gly Glu Pro Gly Pro Ala Gly Leu Pro
465 470 475 480
gga ccc cct ggc gag cgt ggt gga cct gga agc cgt ggt ttc cct ggc 1606
Gly Pro Pro Gly Glu Arg Gly Gly Pro Gly Ser Arg Gly Phe Pro Gly
485 490 495
gcc gac ggt gtt gct ggt ccc aag ggt cct gct ggt gaa cgc ggt gct 1654
Ala Asp Gly Val Ala Gly Pro Lys Gly Pro Ala Gly Glu Arg Gly Ala
500 505 510
cct ggc cct gct ggc ccc aaa ggt tct cct ggt gaa gct ggt cgc ccc 1702
Pro Gly Pro Ala Gly Pro Lys Gly Ser Pro Gly Glu Ala Gly Arg Pro
515 520 525
ggt gaa gct ggt ctg ccc ggt gcc aag ggt ctg act gga agc cct ggc 1750
Gly Glu Ala Gly Leu Pro Gly Ala Lys Gly Leu Thr Gly Ser Pro Gly
530 535 540
agc ccg ggt cct gat ggc aaa act ggc ccc cct ggt ccc gcc ggt caa 1798
Ser Pro Gly Pro Asp Gly Lys Thr Gly Pro Pro Gly Pro Ala Gly Gln
545 550 555 560
gat ggc cgc cct gga cct cca ggc cct ccc ggt gcc cgt ggt cag gct 1846
Asp Gly Arg Pro Gly Pro Pro Gly Pro Pro Gly Ala Arg Gly Gln Ala
565 570 575
ggc gtg atg ggt ttc cct gga cct aaa ggt gct gct gga gag cct gga 1894
Gly Val Met Gly Phe Pro Gly Pro Lys Gly Ala Ala Gly Glu Pro Gly
580 585 590
aaa gct gga gag cga ggt gtt cct gga ccc cct ggc gct gtt ggt cct 1942
Lys Ala Gly Glu Arg Gly Val Pro Gly Pro Pro Gly Ala Val Gly Pro
595 600 605
gct ggc aaa gac gga gaa gct gga gct cag gga ccc cca gga cct gct 1990
Ala Gly Lys Asp Gly Glu Ala Gly Ala Gln Gly Pro Pro Gly Pro Ala
610 615 620
ggc ccc gct ggt gag aga ggc gaa caa ggc cct gct ggc tcc cct gga 2038
Gly Pro Ala Gly Glu Arg Gly Glu Gln Gly Pro Ala Gly Ser Pro Gly
625 630 635 640
ttc cag ggt ctc ccc ggc cct gct ggt cct cct ggt gaa gca ggc aaa 2086
Phe Gln Gly Leu Pro Gly Pro Ala Gly Pro Pro Gly Glu Ala Gly Lys
645 650 655
cct ggt gaa cag ggt gtt cct gga gat ctt ggt gcc ccc ggc ccc tct 2134
Pro Gly Glu Gln Gly Val Pro Gly Asp Leu Gly Ala Pro Gly Pro Ser
660 665 670
gga gca aga ggc gag aga ggt ttc ccc ggc gag cgt ggt gtg caa ggg 2182
Gly Ala Arg Gly Glu Arg Gly Phe Pro Gly Glu Arg Gly Val Gln Gly
675 680 685
ccg ccc ggt cct gca ggt ccc cgt ggg gcc aat ggt gcc cct ggc aac 2230
Pro Pro Gly Pro Ala Gly Pro Arg Gly Ala Asn Gly Ala Pro Gly Asn
690 695 700
gat ggt gct aag ggt gat gct ggt gcc cct gga gcc ccc ggt agc cag 2278
Asp Gly Ala Lys Gly Asp Ala Gly Ala Pro Gly Ala Pro Gly Ser Gln
705 710 715 720
ggt gcc cct ggc ctt caa gga atg cct ggt gaa cga ggt gca gct ggt 2326
Gly Ala Pro Gly Leu Gln Gly Met Pro Gly Glu Arg Gly Ala Ala Gly
725 730 735
ctt cca ggc cct aag ggt gac aga ggg gat gct ggt ccc aaa ggt gct 2374
Leu Pro Gly Pro Lys Gly Asp Arg Gly Asp Ala Gly Pro Lys Gly Ala
740 745 750
gat ggt gct cct ggc aaa gat ggc gtc cgt ggt ctg act ggt ccc atc 2422
Asp Gly Ala Pro Gly Lys Asp Gly Val Arg Gly Leu Thr Gly Pro Ile
755 760 765
ggt cct cct ggc ccc gct ggt gcc cct ggt gac aag ggt gaa gct ggt 2470
Gly Pro Pro Gly Pro Ala Gly Ala Pro Gly Asp Lys Gly Glu Ala Gly
770 775 780
cct agt ggc cca gcc ggt ccc act gga gct cgt ggt gcc ccc ggt gac 2518
Pro Ser Gly Pro Ala Gly Pro Thr Gly Ala Arg Gly Ala Pro Gly Asp
785 790 795 800
cgt ggt gag cct ggt ccc ccc ggc cct gct ggc ttc gct ggc ccc cct 2566
Arg Gly Glu Pro Gly Pro Pro Gly Pro Ala Gly Phe Ala Gly Pro Pro
805 810 815
ggt gct gat ggc caa cct ggt gct aaa ggc gaa cct ggt gat gct ggt 2614
Gly Ala Asp Gly Gln Pro Gly Ala Lys Gly Glu Pro Gly Asp Ala Gly
820 825 830
gct aaa ggt gac gct ggt ccc ccc ggc cct gct ggg ccc gct gga ccc 2662
Ala Lys Gly Asp Ala Gly Pro Pro Gly Pro Ala Gly Pro Ala Gly Pro
835 840 845
ccc ggc ccc att ggt aac gtt ggt gct ccc gga ccc aaa ggt gct cgt 2710
Pro Gly Pro Ile Gly Asn Val Gly Ala Pro Gly Pro Lys Gly Ala Arg
850 855 860
ggc agc gct ggt ccc cct ggt gct act ggt ttc cca ggt gct gct ggc 2758
Gly Ser Ala Gly Pro Pro Gly Ala Thr Gly Phe Pro Gly Ala Ala Gly
865 870 875 880
cga gtc ggt ccc ccc ggc ccc tct gga aat gct gga ccc cct ggc cct 2806
Arg Val Gly Pro Pro Gly Pro Ser Gly Asn Ala Gly Pro Pro Gly Pro
885 890 895
cct ggc cct gct ggc aaa gaa ggc agc aaa ggc ccc cgc ggt gag act 2854
Pro Gly Pro Ala Gly Lys Glu Gly Ser Lys Gly Pro Arg Gly Glu Thr
900 905 910
ggc ccc gct ggg cgt ccc ggt gaa gtc ggt ccc cct ggt ccc cct ggc 2902
Gly Pro Ala Gly Arg Pro Gly Glu Val Gly Pro Pro Gly Pro Pro Gly
915 920 925
ccc gct ggt gag aaa gga gcc cct ggt gct gac gga cct gct gga gct 2950
Pro Ala Gly Glu Lys Gly Ala Pro Gly Ala Asp Gly Pro Ala Gly Ala
930 935 940
cct ggc act cct gga cct caa ggt att gct gga cag cgt ggt gtg gtc 2998
Pro Gly Thr Pro Gly Pro Gln Gly Ile Ala Gly Gln Arg Gly Val Val
945 950 955 960
ggc ctg cct ggt cag aga gga gaa aga ggc ttc cct ggt ctt cct ggc 3046
Gly Leu Pro Gly Gln Arg Gly Glu Arg Gly Phe Pro Gly Leu Pro Gly
965 970 975
ccc tct ggt gaa ccc ggc aaa caa ggt cct tct gga gca agt ggt gaa 3094
Pro Ser Gly Glu Pro Gly Lys Gln Gly Pro Ser Gly Ala Ser Gly Glu
980 985 990
cgt ggc ccc cct ggt ccc atg ggc ccc cct gga ttg gct gga ccc cct 3142
Arg Gly Pro Pro Gly Pro Met Gly Pro Pro Gly Leu Ala Gly Pro Pro
995 1000 1005
ggc gag tct gga cgt gag gga gct cct ggt gct gaa gga tcc cct 3187
Gly Glu Ser Gly Arg Glu Gly Ala Pro Gly Ala Glu Gly Ser Pro
1010 1015 1020
gga cga gat ggt tct cct ggc gcc aag ggt gac cgt ggt gag acc 3232
Gly Arg Asp Gly Ser Pro Gly Ala Lys Gly Asp Arg Gly Glu Thr
1025 1030 1035
ggc cct gct gga cct cct ggt gct cct ggc gct ccc ggt gcc ccc 3277
Gly Pro Ala Gly Pro Pro Gly Ala Pro Gly Ala Pro Gly Ala Pro
1040 1045 1050
ggc cct gtc gga cct gcc ggc aag agc ggt gat cgt ggt gag acc 3322
Gly Pro Val Gly Pro Ala Gly Lys Ser Gly Asp Arg Gly Glu Thr
1055 1060 1065
ggt cct gct ggt cct gct ggt ccc att ggc ccc gtt ggt gcc cgt 3367
Gly Pro Ala Gly Pro Ala Gly Pro Ile Gly Pro Val Gly Ala Arg
1070 1075 1080
ggc ccc gct gga ccc caa ggc ccc cgt ggt gac aag ggt gag aca 3412
Gly Pro Ala Gly Pro Gln Gly Pro Arg Gly Asp Lys Gly Glu Thr
1085 1090 1095
ggc gaa cag ggc gac aga ggc att aag ggt cac cgt ggc ttc tct 3457
Gly Glu Gln Gly Asp Arg Gly Ile Lys Gly His Arg Gly Phe Ser
1100 1105 1110
ggt ctc cag ggt ccc ccc ggc cct ccc ggc tct cct ggt gag caa 3502
Gly Leu Gln Gly Pro Pro Gly Pro Pro Gly Ser Pro Gly Glu Gln
1115 1120 1125
ggt cct tcc gga gcc tct ggt cct gct ggt ccc cgc ggt ccc cct 3547
Gly Pro Ser Gly Ala Ser Gly Pro Ala Gly Pro Arg Gly Pro Pro
1130 1135 1140
ggc tct gct ggt tct ccc ggc aaa gat gga ctc aat ggt ctc cca 3592
Gly Ser Ala Gly Ser Pro Gly Lys Asp Gly Leu Asn Gly Leu Pro
1145 1150 1155
ggc ccc atc ggt ccc cct ggg cct cga ggt cgc act ggt gat gct 3637
Gly Pro Ile Gly Pro Pro Gly Pro Arg Gly Arg Thr Gly Asp Ala
1160 1165 1170
ggt cct gct ggt cct ccc ggc cct cct gga ccc cct ggt ccc cca 3682
Gly Pro Ala Gly Pro Pro Gly Pro Pro Gly Pro Pro Gly Pro Pro
1175 1180 1185
ggt cct ccc agc ggc ggc tac gac ttg agc ttc ctg ccc cag cca 3727
Gly Pro Pro Ser Gly Gly Tyr Asp Leu Ser Phe Leu Pro Gln Pro
1190 1195 1200
cct caa gag aag gct cac gat ggt ggc cgc tac tac cgg gct gat 3772
Pro Gln Glu Lys Ala His Asp Gly Gly Arg Tyr Tyr Arg Ala Asp
1205 1210 1215
gat gcc aat gtg gtc cgt gac cgt gac ctc gag gtg gac acc acc 3817
Asp Ala Asn Val Val Arg Asp Arg Asp Leu Glu Val Asp Thr Thr
1220 1225 1230
ctc aag agc ctg agc cag cag atc gag aac atc cgg agc cct gaa 3862
Leu Lys Ser Leu Ser Gln Gln Ile Glu Asn Ile Arg Ser Pro Glu
1235 1240 1245
ggc agc cgc aag aac ccc gcc cgc acc tgc cgt gac ctc aag atg 3907
Gly Ser Arg Lys Asn Pro Ala Arg Thr Cys Arg Asp Leu Lys Met
1250 1255 1260
tgc cac tct gac tgg aag agc gga gaa tac tgg att gac ccc aac 3952
Cys His Ser Asp Trp Lys Ser Gly Glu Tyr Trp Ile Asp Pro Asn
1265 1270 1275
caa ggc tgc aac ctg gat gcc att aag gtc ttc tgc aac atg gaa 3997
Gln Gly Cys Asn Leu Asp Ala Ile Lys Val Phe Cys Asn Met Glu
1280 1285 1290
acc ggt gag acc tgt gta tac ccc act cag ccc agc gtg gcc cag 4042
Thr Gly Glu Thr Cys Val Tyr Pro Thr Gln Pro Ser Val Ala Gln
1295 1300 1305
aag aac tgg tat atc agc aag aac ccc aag gaa aag agg cac gtc 4087
Lys Asn Trp Tyr Ile Ser Lys Asn Pro Lys Glu Lys Arg His Val
1310 1315 1320
tgg tac ggc gag agc atg acc ggc gga ttc cag ttc gag tat ggc 4132
Trp Tyr Gly Glu Ser Met Thr Gly Gly Phe Gln Phe Glu Tyr Gly
1325 1330 1335
ggc cag ggg tcc gat cct gcc gat gtg gcc atc cag ctg act ttc 4177
Gly Gln Gly Ser Asp Pro Ala Asp Val Ala Ile Gln Leu Thr Phe
1340 1345 1350
ctg cgc ctg atg tcc acc gag gcc tcc cag aac atc acc tac cac 4222
Leu Arg Leu Met Ser Thr Glu Ala Ser Gln Asn Ile Thr Tyr His
1355 1360 1365
tgc aag aac agc gtg gcc tac atg gac cag cag act ggc aac ctc 4267
Cys Lys Asn Ser Val Ala Tyr Met Asp Gln Gln Thr Gly Asn Leu
1370 1375 1380
aag aag gcc ctg ctc ctc cag ggc tcc aac gag atc gag atc cgg 4312
Lys Lys Ala Leu Leu Leu Gln Gly Ser Asn Glu Ile Glu Ile Arg
1385 1390 1395
gcc gag ggc aac agc cgc ttc acc tac agc gtc acc tac gat ggc 4357
Ala Glu Gly Asn Ser Arg Phe Thr Tyr Ser Val Thr Tyr Asp Gly
1400 1405 1410
tgc acg agt cac acc gga gcc tgg ggc aag aca gtg atc gaa tac 4402
Cys Thr Ser His Thr Gly Ala Trp Gly Lys Thr Val Ile Glu Tyr
1415 1420 1425
aaa acc acc aag acc tcc cgc ttg ccc atc atc gat gtg gcc ccc 4447
Lys Thr Thr Lys Thr Ser Arg Leu Pro Ile Ile Asp Val Ala Pro
1430 1435 1440
ttg gac gtt ggc gcc cca gac cag gaa ttc ggc ttc gac gtt ggc 4492
Leu Asp Val Gly Ala Pro Asp Gln Glu Phe Gly Phe Asp Val Gly
1445 1450 1455
cct gcc tgc ttc ctg taa actccttcca ccccaacctg gctccctccc 4540
Pro Ala Cys Phe Leu
1460
acccaaccca cttgcccctg actctggaaa cagacaaaca acccaaactg aaacccccga 4600
aaagccaaaa aatgggagac aatttcacat ggactttgga aaatattttt ttcctttgca 4660
ttcatctctc aaacttagtt tttatctttg accaactgaa catgaccaaa aaccaaaagt 4720
gcattcaacc ttaccaaaaa aaaaaaaaaa a 4751
<210> 36
<211> 1463
<212> PRT
<213> Bos taurus
<400> 36
Met Phe Ser Phe Val Asp Leu Arg Leu Leu Leu Leu Leu Ala Ala Thr
1 5 10 15
Ala Leu Leu Thr His Gly Gln Glu Glu Gly Gln Glu Glu Gly Gln Glu
20 25 30
Glu Asp Ile Pro Pro Val Thr Cys Val Gln Asn Gly Leu Arg Tyr His
35 40 45
Asp Arg Asp Val Trp Lys Pro Val Pro Cys Gln Ile Cys Val Cys Asp
50 55 60
Asn Gly Asn Val Leu Cys Asp Asp Val Ile Cys Asp Glu Leu Lys Asp
65 70 75 80
Cys Pro Asn Ala Lys Val Pro Thr Asp Glu Cys Cys Pro Val Cys Pro
85 90 95
Glu Gly Gln Glu Ser Pro Thr Asp Gln Glu Thr Thr Gly Val Glu Gly
100 105 110
Pro Lys Gly Asp Thr Gly Pro Arg Gly Pro Arg Gly Pro Ala Gly Pro
115 120 125
Pro Gly Arg Asp Gly Ile Pro Gly Gln Pro Gly Leu Pro Gly Pro Pro
130 135 140
Gly Pro Pro Gly Pro Pro Gly Pro Pro Gly Leu Gly Gly Asn Phe Ala
145 150 155 160
Pro Gln Leu Ser Tyr Gly Tyr Asp Glu Lys Ser Thr Gly Ile Ser Val
165 170 175
Pro Gly Pro Met Gly Pro Ser Gly Pro Arg Gly Leu Pro Gly Pro Pro
180 185 190
Gly Ala Pro Gly Pro Gln Gly Phe Gln Gly Pro Pro Gly Glu Pro Gly
195 200 205
Glu Pro Gly Ala Ser Gly Pro Met Gly Pro Arg Gly Pro Pro Gly Pro
210 215 220
Pro Gly Lys Asn Gly Asp Asp Gly Glu Ala Gly Lys Pro Gly Arg Pro
225 230 235 240
Gly Glu Arg Gly Pro Pro Gly Pro Gln Gly Ala Arg Gly Leu Pro Gly
245 250 255
Thr Ala Gly Leu Pro Gly Met Lys Gly His Arg Gly Phe Ser Gly Leu
260 265 270
Asp Gly Ala Lys Gly Asp Ala Gly Pro Ala Gly Pro Lys Gly Glu Pro
275 280 285
Gly Ser Pro Gly Glu Asn Gly Ala Pro Gly Gln Met Gly Pro Arg Gly
290 295 300
Leu Pro Gly Glu Arg Gly Arg Pro Gly Ala Pro Gly Pro Ala Gly Ala
305 310 315 320
Arg Gly Asn Asp Gly Ala Thr Gly Ala Ala Gly Pro Pro Gly Pro Thr
325 330 335
Gly Pro Ala Gly Pro Pro Gly Phe Pro Gly Ala Val Gly Ala Lys Gly
340 345 350
Glu Gly Gly Pro Gln Gly Pro Arg Gly Ser Glu Gly Pro Gln Gly Val
355 360 365
Arg Gly Glu Pro Gly Pro Pro Gly Pro Ala Gly Ala Ala Gly Pro Ala
370 375 380
Gly Asn Pro Gly Ala Asp Gly Gln Pro Gly Ala Lys Gly Ala Asn Gly
385 390 395 400
Ala Pro Gly Ile Ala Gly Ala Pro Gly Phe Pro Gly Ala Arg Gly Pro
405 410 415
Ser Gly Pro Gln Gly Pro Ser Gly Pro Pro Gly Pro Lys Gly Asn Ser
420 425 430
Gly Glu Pro Gly Ala Pro Gly Ser Lys Gly Asp Thr Gly Ala Lys Gly
435 440 445
Glu Pro Gly Pro Thr Gly Ile Gln Gly Pro Pro Gly Pro Ala Gly Glu
450 455 460
Glu Gly Lys Arg Gly Ala Arg Gly Glu Pro Gly Pro Ala Gly Leu Pro
465 470 475 480
Gly Pro Pro Gly Glu Arg Gly Gly Pro Gly Ser Arg Gly Phe Pro Gly
485 490 495
Ala Asp Gly Val Ala Gly Pro Lys Gly Pro Ala Gly Glu Arg Gly Ala
500 505 510
Pro Gly Pro Ala Gly Pro Lys Gly Ser Pro Gly Glu Ala Gly Arg Pro
515 520 525
Gly Glu Ala Gly Leu Pro Gly Ala Lys Gly Leu Thr Gly Ser Pro Gly
530 535 540
Ser Pro Gly Pro Asp Gly Lys Thr Gly Pro Pro Gly Pro Ala Gly Gln
545 550 555 560
Asp Gly Arg Pro Gly Pro Pro Gly Pro Pro Gly Ala Arg Gly Gln Ala
565 570 575
Gly Val Met Gly Phe Pro Gly Pro Lys Gly Ala Ala Gly Glu Pro Gly
580 585 590
Lys Ala Gly Glu Arg Gly Val Pro Gly Pro Pro Gly Ala Val Gly Pro
595 600 605
Ala Gly Lys Asp Gly Glu Ala Gly Ala Gln Gly Pro Pro Gly Pro Ala
610 615 620
Gly Pro Ala Gly Glu Arg Gly Glu Gln Gly Pro Ala Gly Ser Pro Gly
625 630 635 640
Phe Gln Gly Leu Pro Gly Pro Ala Gly Pro Pro Gly Glu Ala Gly Lys
645 650 655
Pro Gly Glu Gln Gly Val Pro Gly Asp Leu Gly Ala Pro Gly Pro Ser
660 665 670
Gly Ala Arg Gly Glu Arg Gly Phe Pro Gly Glu Arg Gly Val Gln Gly
675 680 685
Pro Pro Gly Pro Ala Gly Pro Arg Gly Ala Asn Gly Ala Pro Gly Asn
690 695 700
Asp Gly Ala Lys Gly Asp Ala Gly Ala Pro Gly Ala Pro Gly Ser Gln
705 710 715 720
Gly Ala Pro Gly Leu Gln Gly Met Pro Gly Glu Arg Gly Ala Ala Gly
725 730 735
Leu Pro Gly Pro Lys Gly Asp Arg Gly Asp Ala Gly Pro Lys Gly Ala
740 745 750
Asp Gly Ala Pro Gly Lys Asp Gly Val Arg Gly Leu Thr Gly Pro Ile
755 760 765
Gly Pro Pro Gly Pro Ala Gly Ala Pro Gly Asp Lys Gly Glu Ala Gly
770 775 780
Pro Ser Gly Pro Ala Gly Pro Thr Gly Ala Arg Gly Ala Pro Gly Asp
785 790 795 800
Arg Gly Glu Pro Gly Pro Pro Gly Pro Ala Gly Phe Ala Gly Pro Pro
805 810 815
Gly Ala Asp Gly Gln Pro Gly Ala Lys Gly Glu Pro Gly Asp Ala Gly
820 825 830
Ala Lys Gly Asp Ala Gly Pro Pro Gly Pro Ala Gly Pro Ala Gly Pro
835 840 845
Pro Gly Pro Ile Gly Asn Val Gly Ala Pro Gly Pro Lys Gly Ala Arg
850 855 860
Gly Ser Ala Gly Pro Pro Gly Ala Thr Gly Phe Pro Gly Ala Ala Gly
865 870 875 880
Arg Val Gly Pro Pro Gly Pro Ser Gly Asn Ala Gly Pro Pro Gly Pro
885 890 895
Pro Gly Pro Ala Gly Lys Glu Gly Ser Lys Gly Pro Arg Gly Glu Thr
900 905 910
Gly Pro Ala Gly Arg Pro Gly Glu Val Gly Pro Pro Gly Pro Pro Gly
915 920 925
Pro Ala Gly Glu Lys Gly Ala Pro Gly Ala Asp Gly Pro Ala Gly Ala
930 935 940
Pro Gly Thr Pro Gly Pro Gln Gly Ile Ala Gly Gln Arg Gly Val Val
945 950 955 960
Gly Leu Pro Gly Gln Arg Gly Glu Arg Gly Phe Pro Gly Leu Pro Gly
965 970 975
Pro Ser Gly Glu Pro Gly Lys Gln Gly Pro Ser Gly Ala Ser Gly Glu
980 985 990
Arg Gly Pro Pro Gly Pro Met Gly Pro Pro Gly Leu Ala Gly Pro Pro
995 1000 1005
Gly Glu Ser Gly Arg Glu Gly Ala Pro Gly Ala Glu Gly Ser Pro
1010 1015 1020
Gly Arg Asp Gly Ser Pro Gly Ala Lys Gly Asp Arg Gly Glu Thr
1025 1030 1035
Gly Pro Ala Gly Pro Pro Gly Ala Pro Gly Ala Pro Gly Ala Pro
1040 1045 1050
Gly Pro Val Gly Pro Ala Gly Lys Ser Gly Asp Arg Gly Glu Thr
1055 1060 1065
Gly Pro Ala Gly Pro Ala Gly Pro Ile Gly Pro Val Gly Ala Arg
1070 1075 1080
Gly Pro Ala Gly Pro Gln Gly Pro Arg Gly Asp Lys Gly Glu Thr
1085 1090 1095
Gly Glu Gln Gly Asp Arg Gly Ile Lys Gly His Arg Gly Phe Ser
1100 1105 1110
Gly Leu Gln Gly Pro Pro Gly Pro Pro Gly Ser Pro Gly Glu Gln
1115 1120 1125
Gly Pro Ser Gly Ala Ser Gly Pro Ala Gly Pro Arg Gly Pro Pro
1130 1135 1140
Gly Ser Ala Gly Ser Pro Gly Lys Asp Gly Leu Asn Gly Leu Pro
1145 1150 1155
Gly Pro Ile Gly Pro Pro Gly Pro Arg Gly Arg Thr Gly Asp Ala
1160 1165 1170
Gly Pro Ala Gly Pro Pro Gly Pro Pro Gly Pro Pro Gly Pro Pro
1175 1180 1185
Gly Pro Pro Ser Gly Gly Tyr Asp Leu Ser Phe Leu Pro Gln Pro
1190 1195 1200
Pro Gln Glu Lys Ala His Asp Gly Gly Arg Tyr Tyr Arg Ala Asp
1205 1210 1215
Asp Ala Asn Val Val Arg Asp Arg Asp Leu Glu Val Asp Thr Thr
1220 1225 1230
Leu Lys Ser Leu Ser Gln Gln Ile Glu Asn Ile Arg Ser Pro Glu
1235 1240 1245
Gly Ser Arg Lys Asn Pro Ala Arg Thr Cys Arg Asp Leu Lys Met
1250 1255 1260
Cys His Ser Asp Trp Lys Ser Gly Glu Tyr Trp Ile Asp Pro Asn
1265 1270 1275
Gln Gly Cys Asn Leu Asp Ala Ile Lys Val Phe Cys Asn Met Glu
1280 1285 1290
Thr Gly Glu Thr Cys Val Tyr Pro Thr Gln Pro Ser Val Ala Gln
1295 1300 1305
Lys Asn Trp Tyr Ile Ser Lys Asn Pro Lys Glu Lys Arg His Val
1310 1315 1320
Trp Tyr Gly Glu Ser Met Thr Gly Gly Phe Gln Phe Glu Tyr Gly
1325 1330 1335
Gly Gln Gly Ser Asp Pro Ala Asp Val Ala Ile Gln Leu Thr Phe
1340 1345 1350
Leu Arg Leu Met Ser Thr Glu Ala Ser Gln Asn Ile Thr Tyr His
1355 1360 1365
Cys Lys Asn Ser Val Ala Tyr Met Asp Gln Gln Thr Gly Asn Leu
1370 1375 1380
Lys Lys Ala Leu Leu Leu Gln Gly Ser Asn Glu Ile Glu Ile Arg
1385 1390 1395
Ala Glu Gly Asn Ser Arg Phe Thr Tyr Ser Val Thr Tyr Asp Gly
1400 1405 1410
Cys Thr Ser His Thr Gly Ala Trp Gly Lys Thr Val Ile Glu Tyr
1415 1420 1425
Lys Thr Thr Lys Thr Ser Arg Leu Pro Ile Ile Asp Val Ala Pro
1430 1435 1440
Leu Asp Val Gly Ala Pro Asp Gln Glu Phe Gly Phe Asp Val Gly
1445 1450 1455
Pro Ala Cys Phe Leu
1460
<210> 37
<211> 4628
<212> DNA
<213> Bos taurus
<220>
<221> misc_feature
<222> (1)..(4628)
<223> Bos taurus collagen type I alpha 2 chain (COL1A2), mRNA; NCBI
Reference Sequence: NM_174520.2
<220>
<221> CDS
<222> (111)..(4205)
<400> 37
taagttggag gtactggcca cgactgcatg cctgcgcccg ccaggtgata cctccgccgg 60
tgacccaggg gctctgcgac acaaggagtc tgcatgtctg agtggtagac atg ctc 116
Met Leu
1
agc ttt gtg gat acg cgg act ttg ttg ctg ctt gca gta act tcg tgc 164
Ser Phe Val Asp Thr Arg Thr Leu Leu Leu Leu Ala Val Thr Ser Cys
5 10 15
cta gca aca tgc caa tcc tta caa gag gca act gca aga aag ggc cca 212
Leu Ala Thr Cys Gln Ser Leu Gln Glu Ala Thr Ala Arg Lys Gly Pro
20 25 30
agt gga gat aga gga cca cgc gga gaa agg ggt cca cca ggc cca cca 260
Ser Gly Asp Arg Gly Pro Arg Gly Glu Arg Gly Pro Pro Gly Pro Pro
35 40 45 50
ggc aga gat ggt gat gac ggc atc cca ggc cct cct ggc ccc cct ggc 308
Gly Arg Asp Gly Asp Asp Gly Ile Pro Gly Pro Pro Gly Pro Pro Gly
55 60 65
cct cct ggc ccc cct ggt ctt ggc ggg aac ttt gct gct cag ttt gat 356
Pro Pro Gly Pro Pro Gly Leu Gly Gly Asn Phe Ala Ala Gln Phe Asp
70 75 80
gca aaa gga ggt ggc cct gga cca atg ggg ctg atg gga cct cgc ggc 404
Ala Lys Gly Gly Gly Pro Gly Pro Met Gly Leu Met Gly Pro Arg Gly
85 90 95
cct cct ggg gct tct gga gcc cct ggc cct caa ggt ttc cag gga cct 452
Pro Pro Gly Ala Ser Gly Ala Pro Gly Pro Gln Gly Phe Gln Gly Pro
100 105 110
ccg ggt gag cct ggt gaa cct ggt cag act ggt cct gca ggt gct cgt 500
Pro Gly Glu Pro Gly Glu Pro Gly Gln Thr Gly Pro Ala Gly Ala Arg
115 120 125 130
ggc ccg cct ggc cct cct ggc aag gct ggt gag gat ggt cac cct gga 548
Gly Pro Pro Gly Pro Pro Gly Lys Ala Gly Glu Asp Gly His Pro Gly
135 140 145
aaa cct gga cga cct ggt gag aga ggg gtt gtt gga cca cag ggt gct 596
Lys Pro Gly Arg Pro Gly Glu Arg Gly Val Val Gly Pro Gln Gly Ala
150 155 160
cgt ggc ttt cct gga act cct gga ctc cct ggc ttc aag ggc att agg 644
Arg Gly Phe Pro Gly Thr Pro Gly Leu Pro Gly Phe Lys Gly Ile Arg
165 170 175
ggt cac aat ggt ctg gat gga ttg aag gga cag cct ggt gct cca ggt 692
Gly His Asn Gly Leu Asp Gly Leu Lys Gly Gln Pro Gly Ala Pro Gly
180 185 190
gtg aag ggt gaa cct ggt gcc cct ggt gaa aat gga act cca ggt caa 740
Val Lys Gly Glu Pro Gly Ala Pro Gly Glu Asn Gly Thr Pro Gly Gln
195 200 205 210
acg gga gcc cgt ggt ctt cct ggt gag aga gga cgt gtt ggt gcc cct 788
Thr Gly Ala Arg Gly Leu Pro Gly Glu Arg Gly Arg Val Gly Ala Pro
215 220 225
ggc cca gct ggt gcc cgt gga agt gat gga agt gtg ggt cct gtg ggc 836
Gly Pro Ala Gly Ala Arg Gly Ser Asp Gly Ser Val Gly Pro Val Gly
230 235 240
cct gct ggt ccc att ggg tct gct ggc cct cca ggc ttc cca ggt gct 884
Pro Ala Gly Pro Ile Gly Ser Ala Gly Pro Pro Gly Phe Pro Gly Ala
245 250 255
cct ggc ccc aag ggt gaa ctc gga cct gtt ggt aac cct ggc cct gct 932
Pro Gly Pro Lys Gly Glu Leu Gly Pro Val Gly Asn Pro Gly Pro Ala
260 265 270
ggt ccc gcg ggt ccc cgt ggt gaa gtg ggt ctc cca ggc ctt tct ggc 980
Gly Pro Ala Gly Pro Arg Gly Glu Val Gly Leu Pro Gly Leu Ser Gly
275 280 285 290
cct gtc gga cct cct gga aac ccc gga gcc aat ggg ctt cct ggc gct 1028
Pro Val Gly Pro Pro Gly Asn Pro Gly Ala Asn Gly Leu Pro Gly Ala
295 300 305
aag ggt gct gct ggc ctt ccc ggt gtt gct ggg gct ccc ggc ctc cct 1076
Lys Gly Ala Ala Gly Leu Pro Gly Val Ala Gly Ala Pro Gly Leu Pro
310 315 320
gga ccc cgg ggt att cct ggc cct gtt ggc gct gct ggt gct act ggc 1124
Gly Pro Arg Gly Ile Pro Gly Pro Val Gly Ala Ala Gly Ala Thr Gly
325 330 335
gcc aga gga ctt gtt ggt gag ccc ggc cca gct ggt tcg aaa gga gag 1172
Ala Arg Gly Leu Val Gly Glu Pro Gly Pro Ala Gly Ser Lys Gly Glu
340 345 350
agc ggc aac aag ggc gag cct ggt gct gtt ggg cag cca ggt cct cct 1220
Ser Gly Asn Lys Gly Glu Pro Gly Ala Val Gly Gln Pro Gly Pro Pro
355 360 365 370
ggc ccc agt ggt gaa gaa gga aag aga ggc tcc act gga gaa atc gga 1268
Gly Pro Ser Gly Glu Glu Gly Lys Arg Gly Ser Thr Gly Glu Ile Gly
375 380 385
ccc gct ggc ccc cca gga cct cct ggg ctg agg gga aat cct ggc tcc 1316
Pro Ala Gly Pro Pro Gly Pro Pro Gly Leu Arg Gly Asn Pro Gly Ser
390 395 400
cgt ggt cta cct gga gct gac ggc aga gct ggt gtc atg ggt cct gct 1364
Arg Gly Leu Pro Gly Ala Asp Gly Arg Ala Gly Val Met Gly Pro Ala
405 410 415
ggt agc cgt ggt gca act ggc cct gct ggt gtg cga ggt ccc aat gga 1412
Gly Ser Arg Gly Ala Thr Gly Pro Ala Gly Val Arg Gly Pro Asn Gly
420 425 430
gat tct ggt cgc cct gga gag cct ggc ctc atg gga ccc cga ggt ttc 1460
Asp Ser Gly Arg Pro Gly Glu Pro Gly Leu Met Gly Pro Arg Gly Phe
435 440 445 450
cca ggt tcc cct gga aat atc ggc cca gct ggt aaa gaa ggt cct gtg 1508
Pro Gly Ser Pro Gly Asn Ile Gly Pro Ala Gly Lys Glu Gly Pro Val
455 460 465
ggt ctc cct ggt att gac ggc aga cct ggg ccc att ggc cca gcg gga 1556
Gly Leu Pro Gly Ile Asp Gly Arg Pro Gly Pro Ile Gly Pro Ala Gly
470 475 480
gca aga gga gag cct ggc aac att gga ttc cct gga ccc aaa ggc ccc 1604
Ala Arg Gly Glu Pro Gly Asn Ile Gly Phe Pro Gly Pro Lys Gly Pro
485 490 495
agt ggt gat cct ggc aaa gct ggt gaa aaa ggt cat gct ggt ctt gct 1652
Ser Gly Asp Pro Gly Lys Ala Gly Glu Lys Gly His Ala Gly Leu Ala
500 505 510
ggt gct cgg ggc gct cca ggt ccc gat ggc aac aac ggt gct cag gga 1700
Gly Ala Arg Gly Ala Pro Gly Pro Asp Gly Asn Asn Gly Ala Gln Gly
515 520 525 530
ccc cct gga cta cag ggt gtc caa ggt gga aaa ggt gaa cag ggt cct 1748
Pro Pro Gly Leu Gln Gly Val Gln Gly Gly Lys Gly Glu Gln Gly Pro
535 540 545
gct ggt cct cca ggc ttc cag ggt ctg cct ggc cct gca ggc aca gct 1796
Ala Gly Pro Pro Gly Phe Gln Gly Leu Pro Gly Pro Ala Gly Thr Ala
550 555 560
ggt gaa gct ggc aaa cca gga gaa agg ggt atc cct ggt gaa ttt ggt 1844
Gly Glu Ala Gly Lys Pro Gly Glu Arg Gly Ile Pro Gly Glu Phe Gly
565 570 575
ctc cct ggc cct gct ggt gca aga ggg gag cgg ggg ccc cca ggt gaa 1892
Leu Pro Gly Pro Ala Gly Ala Arg Gly Glu Arg Gly Pro Pro Gly Glu
580 585 590
agt ggt gct gct ggg cct act ggg cct att gga agc cga ggt cct tct 1940
Ser Gly Ala Ala Gly Pro Thr Gly Pro Ile Gly Ser Arg Gly Pro Ser
595 600 605 610
gga ccc cca ggg cct gat gga aac aag ggt gaa ccg ggt gtg gtt ggc 1988
Gly Pro Pro Gly Pro Asp Gly Asn Lys Gly Glu Pro Gly Val Val Gly
615 620 625
gct cca ggc act gct ggc cca tct ggt cct agc gga ctc cca gga gag 2036
Ala Pro Gly Thr Ala Gly Pro Ser Gly Pro Ser Gly Leu Pro Gly Glu
630 635 640
agg ggt gcg gct ggc att cct gga ggc aag gga gaa aag ggt gaa act 2084
Arg Gly Ala Ala Gly Ile Pro Gly Gly Lys Gly Glu Lys Gly Glu Thr
645 650 655
ggt ctc aga ggt gac att ggt agc cct ggt aga gat ggt gct cgt ggt 2132
Gly Leu Arg Gly Asp Ile Gly Ser Pro Gly Arg Asp Gly Ala Arg Gly
660 665 670
gct cct ggt gct att ggt gct cct ggc cct gct gga gcc aat ggg gac 2180
Ala Pro Gly Ala Ile Gly Ala Pro Gly Pro Ala Gly Ala Asn Gly Asp
675 680 685 690
cgg ggt gaa gct ggt ccc gct ggc cct gct ggc cct gct ggt cct cgt 2228
Arg Gly Glu Ala Gly Pro Ala Gly Pro Ala Gly Pro Ala Gly Pro Arg
695 700 705
ggt agc cct ggt gaa cgt ggt gag gtc ggt ccc gct ggc ccc aac gga 2276
Gly Ser Pro Gly Glu Arg Gly Glu Val Gly Pro Ala Gly Pro Asn Gly
710 715 720
ttt gct ggt cct gct ggt gct gct ggt caa cct ggt gct aaa gga gag 2324
Phe Ala Gly Pro Ala Gly Ala Ala Gly Gln Pro Gly Ala Lys Gly Glu
725 730 735
aga gga acc aaa gga ccc aag ggt gaa aat ggt cct gtt ggt ccc aca 2372
Arg Gly Thr Lys Gly Pro Lys Gly Glu Asn Gly Pro Val Gly Pro Thr
740 745 750
ggc ccc gtt gga gct gcc ggt ccg tct ggt cca aat ggc cca cct ggt 2420
Gly Pro Val Gly Ala Ala Gly Pro Ser Gly Pro Asn Gly Pro Pro Gly
755 760 765 770
cct gct gga agt cgt ggt gat gga ggg ccc cct ggg gct act ggt ttc 2468
Pro Ala Gly Ser Arg Gly Asp Gly Gly Pro Pro Gly Ala Thr Gly Phe
775 780 785
cct ggt gct gct gga cgg act ggt ccc cct gga ccc tct ggt atc tct 2516
Pro Gly Ala Ala Gly Arg Thr Gly Pro Pro Gly Pro Ser Gly Ile Ser
790 795 800
ggc ccc cct ggc ccc cct ggt cct gct ggt aaa gaa ggg ctt cgt ggg 2564
Gly Pro Pro Gly Pro Pro Gly Pro Ala Gly Lys Glu Gly Leu Arg Gly
805 810 815
cct cgt ggt gac caa ggt cca gtt ggt cga agt gga gag aca ggt gcc 2612
Pro Arg Gly Asp Gln Gly Pro Val Gly Arg Ser Gly Glu Thr Gly Ala
820 825 830
tct ggc cct cct ggc ttt gtt ggt gag aag ggt ccc tct gga gag cct 2660
Ser Gly Pro Pro Gly Phe Val Gly Glu Lys Gly Pro Ser Gly Glu Pro
835 840 845 850
ggt act gct ggg cct cct gga acc cca ggt cca caa ggc ctt ctt ggt 2708
Gly Thr Ala Gly Pro Pro Gly Thr Pro Gly Pro Gln Gly Leu Leu Gly
855 860 865
gct cct ggt ttt ctg ggt ctc cca ggc tct aga ggt gag cgt ggt cta 2756
Ala Pro Gly Phe Leu Gly Leu Pro Gly Ser Arg Gly Glu Arg Gly Leu
870 875 880
cca ggt gtc gct gga tct gtg ggt gaa cct ggc ccc ctc ggc atc gca 2804
Pro Gly Val Ala Gly Ser Val Gly Glu Pro Gly Pro Leu Gly Ile Ala
885 890 895
ggc cca cct ggg gcc cgt ggt ccc cct ggt aat gtc ggt aat cct ggc 2852
Gly Pro Pro Gly Ala Arg Gly Pro Pro Gly Asn Val Gly Asn Pro Gly
900 905 910
gtc aat ggt gct cct ggt gaa gcc ggt cgt gac ggc aac cct ggg aat 2900
Val Asn Gly Ala Pro Gly Glu Ala Gly Arg Asp Gly Asn Pro Gly Asn
915 920 925 930
gac ggt ccc cca ggc cgc gat ggt caa ccc gga cac aag ggg gag cgt 2948
Asp Gly Pro Pro Gly Arg Asp Gly Gln Pro Gly His Lys Gly Glu Arg
935 940 945
ggt tac ccc ggt aac gca ggt cct gtt ggt gct gcc ggt gct cct ggc 2996
Gly Tyr Pro Gly Asn Ala Gly Pro Val Gly Ala Ala Gly Ala Pro Gly
950 955 960
cct caa ggc cct gtg ggt ccc gtt ggt aaa cac gga aac cgt ggt gaa 3044
Pro Gln Gly Pro Val Gly Pro Val Gly Lys His Gly Asn Arg Gly Glu
965 970 975
ccg ggt cct gcc ggt gct gtt ggt cct gct ggt gcc gtt ggc cca aga 3092
Pro Gly Pro Ala Gly Ala Val Gly Pro Ala Gly Ala Val Gly Pro Arg
980 985 990
ggt ccc agt ggc cca caa ggt att cga ggt gac aag gga gag cct 3137
Gly Pro Ser Gly Pro Gln Gly Ile Arg Gly Asp Lys Gly Glu Pro
995 1000 1005
ggt gat aag ggt ccc aga ggt ctt cct ggc tta aag gga cac aat 3182
Gly Asp Lys Gly Pro Arg Gly Leu Pro Gly Leu Lys Gly His Asn
1010 1015 1020
ggg ttg caa ggt ctc ccg ggt ctt gct ggt cat cat ggc gat caa 3227
Gly Leu Gln Gly Leu Pro Gly Leu Ala Gly His His Gly Asp Gln
1025 1030 1035
ggt gct ccc ggt gct gtg ggt ccc gct ggt ccc agg ggc cct gct 3272
Gly Ala Pro Gly Ala Val Gly Pro Ala Gly Pro Arg Gly Pro Ala
1040 1045 1050
ggt cct tct ggc ccc gct ggc aaa gac ggt cgc att gga cag cct 3317
Gly Pro Ser Gly Pro Ala Gly Lys Asp Gly Arg Ile Gly Gln Pro
1055 1060 1065
ggt gca gtc gga cct gct ggc att cgt ggc tct cag ggt agc caa 3362
Gly Ala Val Gly Pro Ala Gly Ile Arg Gly Ser Gln Gly Ser Gln
1070 1075 1080
ggt cct gct ggc cct cct ggt ccc cct ggc cct cct gga cct cct 3407
Gly Pro Ala Gly Pro Pro Gly Pro Pro Gly Pro Pro Gly Pro Pro
1085 1090 1095
ggc cca agt ggt ggt ggt tac gag ttt ggt ttt gat gga gac ttc 3452
Gly Pro Ser Gly Gly Gly Tyr Glu Phe Gly Phe Asp Gly Asp Phe
1100 1105 1110
tac agg gct gac cag cct cgc tca cca act tct ctc aga ccc aag 3497
Tyr Arg Ala Asp Gln Pro Arg Ser Pro Thr Ser Leu Arg Pro Lys
1115 1120 1125
gat tat gaa gtt gat gct act ctg aaa tct ctc aac aac cag att 3542
Asp Tyr Glu Val Asp Ala Thr Leu Lys Ser Leu Asn Asn Gln Ile
1130 1135 1140
gag acc ctt ctt act cca gaa ggc tct agg aag aac cca gct cgc 3587
Glu Thr Leu Leu Thr Pro Glu Gly Ser Arg Lys Asn Pro Ala Arg
1145 1150 1155
aca tgc cga gac ttg aga ctc agc cac cca gaa tgg agc agt ggt 3632
Thr Cys Arg Asp Leu Arg Leu Ser His Pro Glu Trp Ser Ser Gly
1160 1165 1170
tac tac tgg att gac cct aac caa gga tgt act atg gat gct atc 3677
Tyr Tyr Trp Ile Asp Pro Asn Gln Gly Cys Thr Met Asp Ala Ile
1175 1180 1185
aaa gta tac tgt gat ttc tct act ggc gaa acc tgc atc cgg gct 3722
Lys Val Tyr Cys Asp Phe Ser Thr Gly Glu Thr Cys Ile Arg Ala
1190 1195 1200
caa cct gaa gac atc cca gtc aag aac tgg tac aga aat tcc aag 3767
Gln Pro Glu Asp Ile Pro Val Lys Asn Trp Tyr Arg Asn Ser Lys
1205 1210 1215
gcc aag aag cat gtc tgg gta gga gaa act atc aac ggt ggt acc 3812
Ala Lys Lys His Val Trp Val Gly Glu Thr Ile Asn Gly Gly Thr
1220 1225 1230
cag ttt gaa tat aat gtt gaa gga gta acc acc aag gaa atg gct 3857
Gln Phe Glu Tyr Asn Val Glu Gly Val Thr Thr Lys Glu Met Ala
1235 1240 1245
acc caa ctt gcc ttc atg cgt ctg ctg gcc aac cat gcc tct cag 3902
Thr Gln Leu Ala Phe Met Arg Leu Leu Ala Asn His Ala Ser Gln
1250 1255 1260
aac atc acc tac cat tgc aag aac agc att gca tac atg gat gag 3947
Asn Ile Thr Tyr His Cys Lys Asn Ser Ile Ala Tyr Met Asp Glu
1265 1270 1275
gaa act ggc aac ctg aaa aag gct gtc att ctg caa gga tcc aat 3992
Glu Thr Gly Asn Leu Lys Lys Ala Val Ile Leu Gln Gly Ser Asn
1280 1285 1290
gat gtc gaa ctt gtt gcc gag ggc aac agc aga ttc act tac act 4037
Asp Val Glu Leu Val Ala Glu Gly Asn Ser Arg Phe Thr Tyr Thr
1295 1300 1305
gtt ctt gta gat ggc tgc tct aaa aag aca aat gaa tgg cag aag 4082
Val Leu Val Asp Gly Cys Ser Lys Lys Thr Asn Glu Trp Gln Lys
1310 1315 1320
aca atc att gaa tat aaa aca aac aag cca tct cgc ctg cct atc 4127
Thr Ile Ile Glu Tyr Lys Thr Asn Lys Pro Ser Arg Leu Pro Ile
1325 1330 1335
ctt gat att gca cct ttg gac atc ggt ggc gct gac caa gaa atc 4172
Leu Asp Ile Ala Pro Leu Asp Ile Gly Gly Ala Asp Gln Glu Ile
1340 1345 1350
aga ttg aac att ggc cca gtc tgt ttc aaa taa acgaactcaa 4215
Arg Leu Asn Ile Gly Pro Val Cys Phe Lys
1355 1360
cctaaattaa agaaaaagga aatctgaaac atttctcttg gccatttctt tttcttcttt 4275
cctaactgaa agctgaatcc ttccatttct tctgcacatc tacttgctta aattgtggca 4335
aaagaggaga aggattgatc agagcattgt gcaatacaat ttaattcact ccccctccct 4395
tttcccctct ccaaaagatt tggaattttt tttttttcaa cactcttaca cctgttgtgg 4455
aaaatgtcaa cctttgtaag aaaaccaaaa taaaaattga aaaataaaaa ccatgaacat 4515
ttgcaccact tgtggctttt gaatatcttc cacggaggga agtttaaaac ccaaacttcc 4575
aaaggtttaa actacctcaa aacactttcc tgtgagtgtg atccacacct cgt 4628
<210> 38
<211> 1364
<212> PRT
<213> Bos taurus
<400> 38
Met Leu Ser Phe Val Asp Thr Arg Thr Leu Leu Leu Leu Ala Val Thr
1 5 10 15
Ser Cys Leu Ala Thr Cys Gln Ser Leu Gln Glu Ala Thr Ala Arg Lys
20 25 30
Gly Pro Ser Gly Asp Arg Gly Pro Arg Gly Glu Arg Gly Pro Pro Gly
35 40 45
Pro Pro Gly Arg Asp Gly Asp Asp Gly Ile Pro Gly Pro Pro Gly Pro
50 55 60
Pro Gly Pro Pro Gly Pro Pro Gly Leu Gly Gly Asn Phe Ala Ala Gln
65 70 75 80
Phe Asp Ala Lys Gly Gly Gly Pro Gly Pro Met Gly Leu Met Gly Pro
85 90 95
Arg Gly Pro Pro Gly Ala Ser Gly Ala Pro Gly Pro Gln Gly Phe Gln
100 105 110
Gly Pro Pro Gly Glu Pro Gly Glu Pro Gly Gln Thr Gly Pro Ala Gly
115 120 125
Ala Arg Gly Pro Pro Gly Pro Pro Gly Lys Ala Gly Glu Asp Gly His
130 135 140
Pro Gly Lys Pro Gly Arg Pro Gly Glu Arg Gly Val Val Gly Pro Gln
145 150 155 160
Gly Ala Arg Gly Phe Pro Gly Thr Pro Gly Leu Pro Gly Phe Lys Gly
165 170 175
Ile Arg Gly His Asn Gly Leu Asp Gly Leu Lys Gly Gln Pro Gly Ala
180 185 190
Pro Gly Val Lys Gly Glu Pro Gly Ala Pro Gly Glu Asn Gly Thr Pro
195 200 205
Gly Gln Thr Gly Ala Arg Gly Leu Pro Gly Glu Arg Gly Arg Val Gly
210 215 220
Ala Pro Gly Pro Ala Gly Ala Arg Gly Ser Asp Gly Ser Val Gly Pro
225 230 235 240
Val Gly Pro Ala Gly Pro Ile Gly Ser Ala Gly Pro Pro Gly Phe Pro
245 250 255
Gly Ala Pro Gly Pro Lys Gly Glu Leu Gly Pro Val Gly Asn Pro Gly
260 265 270
Pro Ala Gly Pro Ala Gly Pro Arg Gly Glu Val Gly Leu Pro Gly Leu
275 280 285
Ser Gly Pro Val Gly Pro Pro Gly Asn Pro Gly Ala Asn Gly Leu Pro
290 295 300
Gly Ala Lys Gly Ala Ala Gly Leu Pro Gly Val Ala Gly Ala Pro Gly
305 310 315 320
Leu Pro Gly Pro Arg Gly Ile Pro Gly Pro Val Gly Ala Ala Gly Ala
325 330 335
Thr Gly Ala Arg Gly Leu Val Gly Glu Pro Gly Pro Ala Gly Ser Lys
340 345 350
Gly Glu Ser Gly Asn Lys Gly Glu Pro Gly Ala Val Gly Gln Pro Gly
355 360 365
Pro Pro Gly Pro Ser Gly Glu Glu Gly Lys Arg Gly Ser Thr Gly Glu
370 375 380
Ile Gly Pro Ala Gly Pro Pro Gly Pro Pro Gly Leu Arg Gly Asn Pro
385 390 395 400
Gly Ser Arg Gly Leu Pro Gly Ala Asp Gly Arg Ala Gly Val Met Gly
405 410 415
Pro Ala Gly Ser Arg Gly Ala Thr Gly Pro Ala Gly Val Arg Gly Pro
420 425 430
Asn Gly Asp Ser Gly Arg Pro Gly Glu Pro Gly Leu Met Gly Pro Arg
435 440 445
Gly Phe Pro Gly Ser Pro Gly Asn Ile Gly Pro Ala Gly Lys Glu Gly
450 455 460
Pro Val Gly Leu Pro Gly Ile Asp Gly Arg Pro Gly Pro Ile Gly Pro
465 470 475 480
Ala Gly Ala Arg Gly Glu Pro Gly Asn Ile Gly Phe Pro Gly Pro Lys
485 490 495
Gly Pro Ser Gly Asp Pro Gly Lys Ala Gly Glu Lys Gly His Ala Gly
500 505 510
Leu Ala Gly Ala Arg Gly Ala Pro Gly Pro Asp Gly Asn Asn Gly Ala
515 520 525
Gln Gly Pro Pro Gly Leu Gln Gly Val Gln Gly Gly Lys Gly Glu Gln
530 535 540
Gly Pro Ala Gly Pro Pro Gly Phe Gln Gly Leu Pro Gly Pro Ala Gly
545 550 555 560
Thr Ala Gly Glu Ala Gly Lys Pro Gly Glu Arg Gly Ile Pro Gly Glu
565 570 575
Phe Gly Leu Pro Gly Pro Ala Gly Ala Arg Gly Glu Arg Gly Pro Pro
580 585 590
Gly Glu Ser Gly Ala Ala Gly Pro Thr Gly Pro Ile Gly Ser Arg Gly
595 600 605
Pro Ser Gly Pro Pro Gly Pro Asp Gly Asn Lys Gly Glu Pro Gly Val
610 615 620
Val Gly Ala Pro Gly Thr Ala Gly Pro Ser Gly Pro Ser Gly Leu Pro
625 630 635 640
Gly Glu Arg Gly Ala Ala Gly Ile Pro Gly Gly Lys Gly Glu Lys Gly
645 650 655
Glu Thr Gly Leu Arg Gly Asp Ile Gly Ser Pro Gly Arg Asp Gly Ala
660 665 670
Arg Gly Ala Pro Gly Ala Ile Gly Ala Pro Gly Pro Ala Gly Ala Asn
675 680 685
Gly Asp Arg Gly Glu Ala Gly Pro Ala Gly Pro Ala Gly Pro Ala Gly
690 695 700
Pro Arg Gly Ser Pro Gly Glu Arg Gly Glu Val Gly Pro Ala Gly Pro
705 710 715 720
Asn Gly Phe Ala Gly Pro Ala Gly Ala Ala Gly Gln Pro Gly Ala Lys
725 730 735
Gly Glu Arg Gly Thr Lys Gly Pro Lys Gly Glu Asn Gly Pro Val Gly
740 745 750
Pro Thr Gly Pro Val Gly Ala Ala Gly Pro Ser Gly Pro Asn Gly Pro
755 760 765
Pro Gly Pro Ala Gly Ser Arg Gly Asp Gly Gly Pro Pro Gly Ala Thr
770 775 780
Gly Phe Pro Gly Ala Ala Gly Arg Thr Gly Pro Pro Gly Pro Ser Gly
785 790 795 800
Ile Ser Gly Pro Pro Gly Pro Pro Gly Pro Ala Gly Lys Glu Gly Leu
805 810 815
Arg Gly Pro Arg Gly Asp Gln Gly Pro Val Gly Arg Ser Gly Glu Thr
820 825 830
Gly Ala Ser Gly Pro Pro Gly Phe Val Gly Glu Lys Gly Pro Ser Gly
835 840 845
Glu Pro Gly Thr Ala Gly Pro Pro Gly Thr Pro Gly Pro Gln Gly Leu
850 855 860
Leu Gly Ala Pro Gly Phe Leu Gly Leu Pro Gly Ser Arg Gly Glu Arg
865 870 875 880
Gly Leu Pro Gly Val Ala Gly Ser Val Gly Glu Pro Gly Pro Leu Gly
885 890 895
Ile Ala Gly Pro Pro Gly Ala Arg Gly Pro Pro Gly Asn Val Gly Asn
900 905 910
Pro Gly Val Asn Gly Ala Pro Gly Glu Ala Gly Arg Asp Gly Asn Pro
915 920 925
Gly Asn Asp Gly Pro Pro Gly Arg Asp Gly Gln Pro Gly His Lys Gly
930 935 940
Glu Arg Gly Tyr Pro Gly Asn Ala Gly Pro Val Gly Ala Ala Gly Ala
945 950 955 960
Pro Gly Pro Gln Gly Pro Val Gly Pro Val Gly Lys His Gly Asn Arg
965 970 975
Gly Glu Pro Gly Pro Ala Gly Ala Val Gly Pro Ala Gly Ala Val Gly
980 985 990
Pro Arg Gly Pro Ser Gly Pro Gln Gly Ile Arg Gly Asp Lys Gly Glu
995 1000 1005
Pro Gly Asp Lys Gly Pro Arg Gly Leu Pro Gly Leu Lys Gly His
1010 1015 1020
Asn Gly Leu Gln Gly Leu Pro Gly Leu Ala Gly His His Gly Asp
1025 1030 1035
Gln Gly Ala Pro Gly Ala Val Gly Pro Ala Gly Pro Arg Gly Pro
1040 1045 1050
Ala Gly Pro Ser Gly Pro Ala Gly Lys Asp Gly Arg Ile Gly Gln
1055 1060 1065
Pro Gly Ala Val Gly Pro Ala Gly Ile Arg Gly Ser Gln Gly Ser
1070 1075 1080
Gln Gly Pro Ala Gly Pro Pro Gly Pro Pro Gly Pro Pro Gly Pro
1085 1090 1095
Pro Gly Pro Ser Gly Gly Gly Tyr Glu Phe Gly Phe Asp Gly Asp
1100 1105 1110
Phe Tyr Arg Ala Asp Gln Pro Arg Ser Pro Thr Ser Leu Arg Pro
1115 1120 1125
Lys Asp Tyr Glu Val Asp Ala Thr Leu Lys Ser Leu Asn Asn Gln
1130 1135 1140
Ile Glu Thr Leu Leu Thr Pro Glu Gly Ser Arg Lys Asn Pro Ala
1145 1150 1155
Arg Thr Cys Arg Asp Leu Arg Leu Ser His Pro Glu Trp Ser Ser
1160 1165 1170
Gly Tyr Tyr Trp Ile Asp Pro Asn Gln Gly Cys Thr Met Asp Ala
1175 1180 1185
Ile Lys Val Tyr Cys Asp Phe Ser Thr Gly Glu Thr Cys Ile Arg
1190 1195 1200
Ala Gln Pro Glu Asp Ile Pro Val Lys Asn Trp Tyr Arg Asn Ser
1205 1210 1215
Lys Ala Lys Lys His Val Trp Val Gly Glu Thr Ile Asn Gly Gly
1220 1225 1230
Thr Gln Phe Glu Tyr Asn Val Glu Gly Val Thr Thr Lys Glu Met
1235 1240 1245
Ala Thr Gln Leu Ala Phe Met Arg Leu Leu Ala Asn His Ala Ser
1250 1255 1260
Gln Asn Ile Thr Tyr His Cys Lys Asn Ser Ile Ala Tyr Met Asp
1265 1270 1275
Glu Glu Thr Gly Asn Leu Lys Lys Ala Val Ile Leu Gln Gly Ser
1280 1285 1290
Asn Asp Val Glu Leu Val Ala Glu Gly Asn Ser Arg Phe Thr Tyr
1295 1300 1305
Thr Val Leu Val Asp Gly Cys Ser Lys Lys Thr Asn Glu Trp Gln
1310 1315 1320
Lys Thr Ile Ile Glu Tyr Lys Thr Asn Lys Pro Ser Arg Leu Pro
1325 1330 1335
Ile Leu Asp Ile Ala Pro Leu Asp Ile Gly Gly Ala Asp Gln Glu
1340 1345 1350
Ile Arg Leu Asn Ile Gly Pro Val Cys Phe Lys
1355 1360
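Collagen triple-helical regions are characterised by a repeating Gly-X-Y triplet, and this pattern is visible in the three-letter residues listed for SEQ ID NO: 38. The Python sketch below is an illustration only and is not part of the filed listing. It converts a short stretch (residues 33-48, copied from the listing) to one-letter codes and checks that every third residue is glycine. The helper names `to_one_letter` and `is_gly_x_y` are hypothetical.

```python
# Standard three-letter to one-letter amino acid codes.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}

def to_one_letter(residues):
    """Convert a whitespace-separated three-letter residue string."""
    return "".join(THREE_TO_ONE[r] for r in residues.split())

def is_gly_x_y(seq):
    """True if every third residue, starting from the first, is glycine."""
    return all(aa == "G" for aa in seq[::3])

# Residues 33-48 of SEQ ID NO: 38, copied from the listing above.
segment = to_one_letter(
    "Gly Pro Ser Gly Asp Arg Gly Pro Arg Gly Glu Arg Gly Pro Pro Gly"
)
print(segment, is_gly_x_y(segment))
```

The check only holds when the window starts on a glycine of the repeat; the signal peptide and the non-helical terminal regions of the same sequence would not be expected to pass it.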
<210> 39
<211> 623
<212> DNA
<213> Artificial Sequence
<220>
<223> pDF promoter
<400> 39
aatgtatcta aacgcaaact ccgagctgga aaaatgttac cggcgatgcg cggacaattt 60
agaggcggcg atcaagaaac acctgctggg cgagcagtct ggagcacagt cttcgatggg 120
cccgagatcc caccgcgttc ctgggtaccg ggacgtgagg cagcgcgaca tccatcaaat 180
ataccaggcg ccaaccgagt gtctcggaaa acagcttctg gatatcttcc gctggcggcg 240
caacgacgaa taatagtccc tggaggtgac ggaatatata tgtgtggagg gtaaatctga 300
cagggtgtag caaaggtaat attttcctaa aacatgcaat cggctgcccc gcaacgggaa 360
aaagaatgac tttggcactc ttcaccagag tggggtgtcc cgctcgtgtg tgcaaatagg 420
ctcccactgg tcaccccgga ttttgcagaa aaacagcaag ttccggggtg tctcactggt 480
gtccgccaat aagaggagcc ggcaggcacg gagtttacat caagctgtct ccgatacact 540
cgactaccat ccgggtctct cagagagggg aatggcacta taaataccgc ctccttgcgc 600
tctctgcctt catcaatcaa atc 623
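The number at the end of each nucleotide line is a running position counter, so a transcribed sequence can be checked against its <211> length. The Python sketch below is an illustration only and is not part of the filed listing. The function name `check_counters` is hypothetical, and the sample data are the last two lines of SEQ ID NO: 39 as printed above.

```python
def check_counters(lines, start=0):
    """Return the final base count if every line's trailing counter
    equals the running number of bases; raise ValueError otherwise."""
    total = start
    for line in lines:
        # Everything except the last field is a block of bases.
        *groups, counter = line.split()
        total += sum(len(g) for g in groups)
        if total != int(counter):
            raise ValueError(f"counter {counter} != running total {total}")
    return total

# Last two lines of SEQ ID NO: 39 (the preceding line ends at position 540).
tail = [
    "cgactaccat ccgggtctct cagagagggg aatggcacta taaataccgc ctccttgcgc 600",
    "tctctgcctt catcaatcaa atc 623",
]
final = check_counters(tail, start=540)
print(final)
```

A final value equal to the <211> entry (623 for SEQ ID NO: 39) indicates that no bases were dropped or duplicated in those lines.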
<210> 40
<211> 822
<212> DNA
<213> Artificial Sequence
<220>
<223> pGCEW14 promoter
<400> 40
caggtgaacc cacctaacta tttttaactg ggatccagtg agctcgctgg gtgaaagcca 60
accatctttt gtttcgggga accgtgctcg ccccgtaaag ttaatttttt tttcccgcgc 120
agctttaatc tttcggcaga gaaggcgttt tcatcgtagc gtgggaacag aataatcagt 180
tcatgtgcta tacaggcaca tggcagcagt cactattttg ctttttaacc ttaaagtcgt 240
tcatcaatca ttaactgacc aatcagattt tttgcatttg ccacttatct aaaaatactt 300
ttgtatctcg cagatacgtt cagtggtttc caggacaaca cccaaaaaaa ggtatcaatg 360
ccactaggca gtcggtttta tttttggtca cccacgcaaa gaagcaccca cctcttttag 420
gttttaagtt gtgggaacag taacaccgcc tagagcttca ggaaaaacca gtacctgtga 480
ccgcaattca ccatgatgca gaatgttaat ttaaacgagt gccaaatcaa gatttcaaca 540
gacaaatcaa tcgatccata gttacccatt ccagcctttt cgtcgtcgag cctgcttcat 600
tcctgcctca ggtgcataac tttgcatgaa aagtccagat tagggcagat tttgagttta 660
aaataggaaa tataaacaaa tataccgcga aaaaggtttg tttatagctt ttcgcctggt 720
gccgtacggt ataaatacat actctcctcc cccccctggt tctctttttc ttttgttact 780
tacattttac cgttccgtca ctcgcttcac tcaacaacaa aa 822
<210> 41
<211> 476
<212> DNA
<213> Artificial Sequence
<220>
<223> pGAP1 promoter
<400> 41
tttttgtaga aatgtcttgg tgtcctcgac caatcaggta gccatccctg aaatacctgg 60
ctccgtggca acaccgaacg acctgctggc aacgttaaat tctccggggt aaaacttaaa 120
tgtggagtaa tagaaccaga aacgtctctt cccttctctc tccttccacc gcccgttacc 180
gtccctagga aattttactc tgctggagag cttcttctac ggcccccttg cagcaatgct 240
cttcccagca ttacgttgcg ggtaaaacgg aggtcgtgta cccgacctag cagcccaggg 300
atggaaagtc ccggccgtcg ctggcaataa ctgcgggcgg acgcatgtct tgagattatt 360
ggaaaccacc agaatcgaat ataaaaggcg aacacctttc ccaattttgg tttctcctga 420
cccaaagact ttaaatttaa tttatttgtc cctatttcaa tcaattgaac aactat 476
<210> 42
<211> 550
<212> DNA
<213> Artificial Sequence
<220>
<223> pHTX1 bi-directional promoter
<400> 42
tgttgtagtt ttaatatagt ttgagtatga gatggaactc agaacgaagg aattatcacc 60
agtttatata ttctgaggaa agggtgtgtc ctaaattgga cagtcacgat ggcaataaac 120
gctcagccaa tcagaatgca ggagccataa attgttgtat tattgctgca agatttatgt 180
gggttcacat tccactgaat ggttttcact gtagaattgg tgtcctagtt gttatgtttc 240
gagatgtttt caagaaaaac taaaatgcac aaactgacca ataatgtgcc gtcgcgcttg 300
gtacaaacgt caggattgcc accacttttt tcgcactctg gtacaaaagt tcgcacttcc 360
cactcgtatg taacgaaaaa cagagcagtc tatccagaac gagacaaatt agcgcgtact 420
gtcccattcc ataaggtatc ataggaaacg agagtcctcc ccccatcacg tatatataaa 480
cacactgata tcccacatcc gcttgtcacc aaactaatac atccagttca agttacctaa 540
acaaatcaaa 550
<210> 43
<211> 931
<212> DNA
<213> Artificial Sequence
<220>
<223> pAOX1 promoter
<400> 43
aacatccaaa gacgaaaggt tgaatgaaac ctttttgcca tccgacatcc acaggtccat 60
tctcacacat aagtgccaaa cgcaacagga ggggatacac tagcagcaga ccgttgcaaa 120
cgcaggacct ccactcctct tctcctcaac acccactttt gccatcgaaa aaccagccca 180
gttattgggc ttgattggag ctcgctcatt ccaattcctt ctattaggct actaacacca 240
tgactttatt agcctgtcta tcctggcccc cctggcgagg ttcatgtttg tttatttccg 300
aatgcaacaa gctccgcatt acacccgaac atcactccag atgagggctt tctgagtgtg 360
gggtcaaata gtttcatgtt ccccaaatgg cccaaaactg acagtttaaa cgctgtcttg 420
gaacctaata tgacaaaagc gtgatctcat ccaagatgaa ctaagtttgg ttcgttgaaa 480
tgctaacggc cagttggtca aaaagaaact tccaaaagtc ggcataccgt ttgtcttgtt 540
tggtattgat tgacgaatgc tcaaaaataa tctcattaat gcttagcgca gtctctctat 600
cgcttctgaa ccccggtgca cctgtgccga aacgcaaatg gggaaacacc cgctttttgg 660
atgattatgc attgtctcca cattgtatgc ttccaagatt ctggtgggaa tactgctgat 720
agcctaacgt tcatgatcaa aatttaactg ttctaacccc tacttgacag caatatataa 780
acagaaggaa gctgccctgt cttaaacctt tttttttatc atcattatta gcttactttc 840
ataattgcga ctggttccaa ttgacaagct tttgatttta acgactttta acgacaactt 900
gagaagatca aaaaacaact aattattgaa a 931
<210> 44
<211> 699
<212> DNA
<213> Artificial Sequence
<220>
<223> pDas1 promoter
<400> 44
ctatgctacc ccacagaaat accccaaaag ttgaagtgaa aaaatgaaaa ttactggtaa 60
cttcacccca taacaaactt aataatttct gtagccaatg aaagtaaacc ccattcaatg 120
ttccgagatt tagtatactt gcccctataa gaaacgaagg atttcagctt ccttacccca 180
tgaacagaaa tcttccattt accccccact ggagagatcc gcccaaacga acagataata 240
gaaaaaagaa attcggacaa atagaacact ttctcagcca attaaagtca ttccatgcac 300
tccctttagc tgccgttcca tccctttgtt gagcaacacc atcgttagcc agtacgaaag 360
aggaaactta accgatacct tggagaaatc taaggcgcga atgagtttag cctagatatc 420
cttagtgaag ggttgttccg atacttctcc acattcagtc atagatgggc agctttgtta 480
tcatgaagag acggaaacgg gcattaaggg ttaaccgcca aattatataa agacaacatg 540
tccccagttt aaagtttttc tttcctattc ttgtatcctg agtgaccgtt gtgtttaata 600
taacaagttc gttttaactt aagaccaaaa ccagttacaa caaattataa cccctctaaa 660
cactaaagtt cactcttatc aaactatcaa acatcaaaa 699
<210> 45
<211> 552
<212> DNA
<213> Artificial Sequence
<220>
<223> pDas2 promoter
<400> 45
agcaatgata taaacaacaa ttgagtgaca ggtctacttt gttctcaaaa ggccataacc 60
atctgtttgc atctcttatc accacaccat cctcctcatc tggccttcaa ttgtggggaa 120
caactagcat cccaacacca gactaactcc acccagatga aaccagttgt cgcttaccag 180
tcaatgaatg ttgagctaac gttccttgaa actcgaatga tcccagcctt gctgcgtatc 240
atccctccgc tattccgccg cttgctccaa ccatgtttcc gcctttttcg aacaagttca 300
aatacctatc tttggcagga cttttcctcc tgcctttttt agcctcagct ctcggttagc 360
ctctaggcaa attctggtct tcatacctat atcaactttt catcagatag cctttgggtt 420
caaaaaagaa ctaaagcagg atgcctgata tataaatccc agatgatctg cttttgaaac 480
tattttcagt atcttgattc gtttacttac aaacaactat tgttgatttt atctggagaa 540
taatcgaaca aa 552
<210> 46
<211> 2326
<212> DNA
<213> Bos taurus
<220>
<221> misc_feature
<222> (1)..(2326)
<223> Bos taurus prolyl 4-hydroxylase, alpha polypeptide II, mRNA (cDNA clone MGC:127031 IMAGE:7942056), complete cds
<220>
<221> CDS
<222> (413)..(1876)
<223> Bos taurus prolyl 4-hydroxylase, alpha polypeptide II, mRNA (cDNA clone MGC:127031 IMAGE:7942056), complete cds
<400> 46
aaaaagttcg agtctgtacc ggactgtgca acggagcagg gaaaggctca gggccgccct 60
accacgctgt caccgccggg cctccgagga agagtggcgt tttctctcga ctttggaggt 120
tctgggttct aggctctgtg ctggacctgg atacacagtg ataaacaggc cagaagcagc 180
tcccatccct aggaaggcaa agtggtgaag gatgcagaca tgacagtcag atcatcctga 240
ttacccagtt ttgcctcagc agccgcggag actgtaacta gttaactaat tcaagaaacg 300
aacccttcag tgttaatcag aaactgcaag gagttgctgg cctagtgggg cacgtggact 360
ggagaccagg aaaggccagg ccccggtcag tgtgacactg ccctctgtga cc atg aaa 418
Met Lys
1
ccc tgg gag tcc acg ttg ctg gtg gcc tgg ttt ggt gtc ctg agc tgc 466
Pro Trp Glu Ser Thr Leu Leu Val Ala Trp Phe Gly Val Leu Ser Cys
5 10 15
gtg cag gct gaa ttc ttc act tct att gga cac atg aca gac ctg att 514
Val Gln Ala Glu Phe Phe Thr Ser Ile Gly His Met Thr Asp Leu Ile
20 25 30
tat gca gag aag gac ctg gtg cag tcc ctg aag gag tac atc ctg gtg 562
Tyr Ala Glu Lys Asp Leu Val Gln Ser Leu Lys Glu Tyr Ile Leu Val
35 40 45 50
gag gaa gcc aag ctc tcc aag att aag agc tgg gct gac aaa atg gaa 610
Glu Glu Ala Lys Leu Ser Lys Ile Lys Ser Trp Ala Asp Lys Met Glu
55 60 65
gcc ctg acc agc aag tcg gct gct gac cct gag ggc tac ctg gcc cac 658
Ala Leu Thr Ser Lys Ser Ala Ala Asp Pro Glu Gly Tyr Leu Ala His
70 75 80
cct gtg aat gcc tat aaa ctg gtg aag cgg cta aac acg gac tgg cct 706
Pro Val Asn Ala Tyr Lys Leu Val Lys Arg Leu Asn Thr Asp Trp Pro
85 90 95
gca ctg gag gac ctt gtc ctg cag aac tcg gcc gca gga acc aaa tac 754
Ala Leu Glu Asp Leu Val Leu Gln Asn Ser Ala Ala Gly Thr Lys Tyr
100 105 110
cag gcc atg ctg agt gtg gat gac tgc ttt ggg atg ggc cgc tcg gcc 802
Gln Ala Met Leu Ser Val Asp Asp Cys Phe Gly Met Gly Arg Ser Ala
115 120 125 130
tac aac gaa ggc gac tat tac cac acg gtg ttg tgg atg gaa cag gtg 850
Tyr Asn Glu Gly Asp Tyr Tyr His Thr Val Leu Trp Met Glu Gln Val
135 140 145
cta aag cag ctc gat gct ggg gag gag gcc acc aca tcc aag gcc cag 898
Leu Lys Gln Leu Asp Ala Gly Glu Glu Ala Thr Thr Ser Lys Ala Gln
150 155 160
gtg ctg gac tat ctg agc tac gct gtc ttc cag ttg ggt gac ctg cac 946
Val Leu Asp Tyr Leu Ser Tyr Ala Val Phe Gln Leu Gly Asp Leu His
165 170 175
cgt gcc gtg gag ctc acc cgc cgc ctg ctc tcc ctt gac ccg agc cat 994
Arg Ala Val Glu Leu Thr Arg Arg Leu Leu Ser Leu Asp Pro Ser His
180 185 190
gaa cga gct gga ggg aat ctg cac tac ttt gaa cgg ttg ttg gaa gaa 1042
Glu Arg Ala Gly Gly Asn Leu His Tyr Phe Glu Arg Leu Leu Glu Glu
195 200 205 210
gaa aga gaa aaa atg tta tcg aat cac aca gaa gct gag ctt gca tcc 1090
Glu Arg Glu Lys Met Leu Ser Asn His Thr Glu Ala Glu Leu Ala Ser
215 220 225
cag caa ggc ata tac gag agg cct gtg gac tac ctg ccg gag agg gat 1138
Gln Gln Gly Ile Tyr Glu Arg Pro Val Asp Tyr Leu Pro Glu Arg Asp
230 235 240
gtc tac gag agc ctc tgt cgt ggg gag ggt gtc aaa ctg acc ccc cga 1186
Val Tyr Glu Ser Leu Cys Arg Gly Glu Gly Val Lys Leu Thr Pro Arg
245 250 255
agg cag aag agg ctc ttc tgt agg tat cac cat ggc aac agg gtg ccg 1234
Arg Gln Lys Arg Leu Phe Cys Arg Tyr His His Gly Asn Arg Val Pro
260 265 270
cag ctg ctc atc gcc ccc ttc aaa gag gag gat gag tgg gac agc ccg 1282
Gln Leu Leu Ile Ala Pro Phe Lys Glu Glu Asp Glu Trp Asp Ser Pro
275 280 285 290
cac atc gtc agg tac tac gac gtc atg tct gac gag gaa atc gag agg 1330
His Ile Val Arg Tyr Tyr Asp Val Met Ser Asp Glu Glu Ile Glu Arg
295 300 305
atc aag gag att gcg aaa ccc aaa ctt gca cga gcc act gtt cgt gat 1378
Ile Lys Glu Ile Ala Lys Pro Lys Leu Ala Arg Ala Thr Val Arg Asp
310 315 320
ccc aag aca ggt gtg ctt act gtc gcc agc tac agg gtt tcc aaa agc 1426
Pro Lys Thr Gly Val Leu Thr Val Ala Ser Tyr Arg Val Ser Lys Ser
325 330 335
tcc tgg ctg gag gag gac gat gac ccc gtt gtg gct cgg gtg aat ctg 1474
Ser Trp Leu Glu Glu Asp Asp Asp Pro Val Val Ala Arg Val Asn Leu
340 345 350
cgg atg cag cac atc aca ggg cta aca gtg aag act gca gaa ttg ttg 1522
Arg Met Gln His Ile Thr Gly Leu Thr Val Lys Thr Ala Glu Leu Leu
355 360 365 370
cag gtt gct aat tat gga atg gga gga cag tac gag cca cat ttt gac 1570
Gln Val Ala Asn Tyr Gly Met Gly Gly Gln Tyr Glu Pro His Phe Asp
375 380 385
ttc tcc agg cga cct ttt gac agc ggc ctc aaa acg gag ggg aat agg 1618
Phe Ser Arg Arg Pro Phe Asp Ser Gly Leu Lys Thr Glu Gly Asn Arg
390 395 400
tta gcg acg ttt ctt aac tat atg agt gat gta gaa gct ggt ggt gcc 1666
Leu Ala Thr Phe Leu Asn Tyr Met Ser Asp Val Glu Ala Gly Gly Ala
405 410 415
acc gtc ttt cct gat ctg ggg gct gca att tgg cct aag aag ggc aca 1714
Thr Val Phe Pro Asp Leu Gly Ala Ala Ile Trp Pro Lys Lys Gly Thr
420 425 430
gct gta ttc tgg tac aac ctc cta cgg agt ggg gaa ggt gac tat cga 1762
Ala Val Phe Trp Tyr Asn Leu Leu Arg Ser Gly Glu Gly Asp Tyr Arg
435 440 445 450
aca aga cat gct gcc tgc cct gtg ctt gtg ggc tgc aag tgg gtc tcc 1810
Thr Arg His Ala Ala Cys Pro Val Leu Val Gly Cys Lys Trp Val Ser
455 460 465
aat aag tgg ttc cat gaa cga gga cag gaa ttc ttg agg ccg tgt gga 1858
Asn Lys Trp Phe His Glu Arg Gly Gln Glu Phe Leu Arg Pro Cys Gly
470 475 480
tcg aca gaa gtt gac tga catcattttc tgcccttcgc cttcctggcc 1906
Ser Thr Glu Val Asp
485
ccacagtccg tgttgtcttc aagttcaatg tgacagactc ctgtctatgt tccagtccca 1966
tcaggcgggt ctctggaggc ataaatgttt tgtgtggagt agagagtgga ctagggaagg 2026
tcctggacga cctgggcccc agcctctctg accagcccgt gctatctctg gacgctcggg 2086
tagggttgga gcagagtcag gtggtctgca cctagcaagg tgcttttgta cctcagatgc 2146
tttaggtgtg agatgtttca gtgaaccaaa gttctgattc cttgtttaca tgcttgtttt 2206
tatggaattt ctattaatgt ggctttaacc aaaataaaac gtccctgcca gaagccttaa 2266
aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aagaaaataa aaaaaaaaga 2326
<210> 47
<211> 487
<212> PRT
<213> Bos taurus
<400> 47
Met Lys Pro Trp Glu Ser Thr Leu Leu Val Ala Trp Phe Gly Val Leu
1 5 10 15
Ser Cys Val Gln Ala Glu Phe Phe Thr Ser Ile Gly His Met Thr Asp
20 25 30
Leu Ile Tyr Ala Glu Lys Asp Leu Val Gln Ser Leu Lys Glu Tyr Ile
35 40 45
Leu Val Glu Glu Ala Lys Leu Ser Lys Ile Lys Ser Trp Ala Asp Lys
50 55 60
Met Glu Ala Leu Thr Ser Lys Ser Ala Ala Asp Pro Glu Gly Tyr Leu
65 70 75 80
Ala His Pro Val Asn Ala Tyr Lys Leu Val Lys Arg Leu Asn Thr Asp
85 90 95
Trp Pro Ala Leu Glu Asp Leu Val Leu Gln Asn Ser Ala Ala Gly Thr
100 105 110
Lys Tyr Gln Ala Met Leu Ser Val Asp Asp Cys Phe Gly Met Gly Arg
115 120 125
Ser Ala Tyr Asn Glu Gly Asp Tyr Tyr His Thr Val Leu Trp Met Glu
130 135 140
Gln Val Leu Lys Gln Leu Asp Ala Gly Glu Glu Ala Thr Thr Ser Lys
145 150 155 160
Ala Gln Val Leu Asp Tyr Leu Ser Tyr Ala Val Phe Gln Leu Gly Asp
165 170 175
Leu His Arg Ala Val Glu Leu Thr Arg Arg Leu Leu Ser Leu Asp Pro
180 185 190
Ser His Glu Arg Ala Gly Gly Asn Leu His Tyr Phe Glu Arg Leu Leu
195 200 205
Glu Glu Glu Arg Glu Lys Met Leu Ser Asn His Thr Glu Ala Glu Leu
210 215 220
Ala Ser Gln Gln Gly Ile Tyr Glu Arg Pro Val Asp Tyr Leu Pro Glu
225 230 235 240
Arg Asp Val Tyr Glu Ser Leu Cys Arg Gly Glu Gly Val Lys Leu Thr
245 250 255
Pro Arg Arg Gln Lys Arg Leu Phe Cys Arg Tyr His His Gly Asn Arg
260 265 270
Val Pro Gln Leu Leu Ile Ala Pro Phe Lys Glu Glu Asp Glu Trp Asp
275 280 285
Ser Pro His Ile Val Arg Tyr Tyr Asp Val Met Ser Asp Glu Glu Ile
290 295 300
Glu Arg Ile Lys Glu Ile Ala Lys Pro Lys Leu Ala Arg Ala Thr Val
305 310 315 320
Arg Asp Pro Lys Thr Gly Val Leu Thr Val Ala Ser Tyr Arg Val Ser
325 330 335
Lys Ser Ser Trp Leu Glu Glu Asp Asp Asp Pro Val Val Ala Arg Val
340 345 350
Asn Leu Arg Met Gln His Ile Thr Gly Leu Thr Val Lys Thr Ala Glu
355 360 365
Leu Leu Gln Val Ala Asn Tyr Gly Met Gly Gly Gln Tyr Glu Pro His
370 375 380
Phe Asp Phe Ser Arg Arg Pro Phe Asp Ser Gly Leu Lys Thr Glu Gly
385 390 395 400
Asn Arg Leu Ala Thr Phe Leu Asn Tyr Met Ser Asp Val Glu Ala Gly
405 410 415
Gly Ala Thr Val Phe Pro Asp Leu Gly Ala Ala Ile Trp Pro Lys Lys
420 425 430
Gly Thr Ala Val Phe Trp Tyr Asn Leu Leu Arg Ser Gly Glu Gly Asp
435 440 445
Tyr Arg Thr Arg His Ala Ala Cys Pro Val Leu Val Gly Cys Lys Trp
450 455 460
Val Ser Asn Lys Trp Phe His Glu Arg Gly Gln Glu Phe Leu Arg Pro
465 470 475 480
Cys Gly Ser Thr Glu Val Asp
485
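The CDS coordinates declared for SEQ ID NO: 46 can be cross-checked against the protein length declared for SEQ ID NO: 47. The Python sketch below is an illustration only and is not part of the filed listing. It assumes the CDS range is 1-based, inclusive, and includes the stop codon, as the listing shows with tga at positions 1874-1876. The function name `cds_residue_count` is hypothetical.

```python
def cds_residue_count(start, end, has_stop=True):
    """Number of amino acids encoded by a CDS spanning start..end
    (1-based, inclusive). The stop codon, if included, is not counted."""
    length = end - start + 1
    if length % 3 != 0:
        raise ValueError("CDS length is not a multiple of three")
    codons = length // 3
    return codons - 1 if has_stop else codons

# SEQ ID NO: 46 declares CDS (413)..(1876); SEQ ID NO: 47 declares <211> 487.
residues = cds_residue_count(413, 1876)
print(residues)
```

The span covers 1464 nucleotides, or 488 codons, and removing the stop codon leaves 487 residues, which agrees with the <211> entry of SEQ ID NO: 47.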
<210> 48
<211> 2580
<212> DNA
<213> Bos taurus
<220>
<221> misc_feature
<222> (1)..(2580)
<223> Bos taurus prolyl 3-hydroxylase 1 (P3H1), mRNA; NCBI Reference Sequence: NM_001103291.1
<220>
<221> CDS
<222> (28)..(2238)
<400> 48
caagggtccc gttaggtctg agcggcc atg gcg gca cgc gct tta agg ctg ctg 54
Met Ala Ala Arg Ala Leu Arg Leu Leu
1 5
acc ata ttg ctg gcc gtc gcc gcc act gcc tcc cag gct gag gcc gag 102
Thr Ile Leu Leu Ala Val Ala Ala Thr Ala Ser Gln Ala Glu Ala Glu
10 15 20 25
tcc gag gcg gga tgg gac ctg acg gcg cct gat ctg ctg ttc gcg gag 150
Ser Glu Ala Gly Trp Asp Leu Thr Ala Pro Asp Leu Leu Phe Ala Glu
30 35 40
ggg acg gcg gcc tat gct cgc ggg gac tgg gcc ggt gtg gtt ctg agc 198
Gly Thr Ala Ala Tyr Ala Arg Gly Asp Trp Ala Gly Val Val Leu Ser
45 50 55
atg gag cgg gcg ctc cgc tcg cgg gcc gcc ctg cgc gcc ctc cgt ctg 246
Met Glu Arg Ala Leu Arg Ser Arg Ala Ala Leu Arg Ala Leu Arg Leu
60 65 70
cgc tgc cgc act cgg tgt gcc gcc gac ctc cca tgg gaa gtg gac cca 294
Arg Cys Arg Thr Arg Cys Ala Ala Asp Leu Pro Trp Glu Val Asp Pro
75 80 85
gac tcg ccc cca agc ttg gcg cag gct tca ggt gcc tcc gcc ctg cac 342
Asp Ser Pro Pro Ser Leu Ala Gln Ala Ser Gly Ala Ser Ala Leu His
90 95 100 105
gac ctg cgg ttc ttc gga ggc ttg ctg cgc cgc gcc gct tgc ctg cgc 390
Asp Leu Arg Phe Phe Gly Gly Leu Leu Arg Arg Ala Ala Cys Leu Arg
110 115 120
cgc tgc ctc ggg ccg tcg acc gcc cac tcg ctc agc gag gag ctg gag 438
Arg Cys Leu Gly Pro Ser Thr Ala His Ser Leu Ser Glu Glu Leu Glu
125 130 135
ttg gag ttc cgc aag cgg agc ccc tac aac tac ctg cag gtc gcc tac 486
Leu Glu Phe Arg Lys Arg Ser Pro Tyr Asn Tyr Leu Gln Val Ala Tyr
140 145 150
ttc aag ata aac aag ttg gag aaa gct gta gca gca gcc cat acc ttc 534
Phe Lys Ile Asn Lys Leu Glu Lys Ala Val Ala Ala Ala His Thr Phe
155 160 165
ttc gtg ggc aac cct gag cac atg gag atg cga cag aac ctg gac tat 582
Phe Val Gly Asn Pro Glu His Met Glu Met Arg Gln Asn Leu Asp Tyr
170 175 180 185
tac cag acc atg tct ggg gtg aag gag gct gac ttc aag gat ctt gag 630
Tyr Gln Thr Met Ser Gly Val Lys Glu Ala Asp Phe Lys Asp Leu Glu
190 195 200
gcc aaa ccc cat atg cac gaa ttt cgg ctg gga gtg cgc ctc tac tcc 678
Ala Lys Pro His Met His Glu Phe Arg Leu Gly Val Arg Leu Tyr Ser
205 210 215
gag gag cag ccg cag gaa gcc gtg ccc cac ctg gag gcg gcg ctg cgg 726
Glu Glu Gln Pro Gln Glu Ala Val Pro His Leu Glu Ala Ala Leu Arg
220 225 230
gag tac ttc gtg gcg gcc gag gag tgc cgc gcg ctc tgc gaa ggg ccc 774
Glu Tyr Phe Val Ala Ala Glu Glu Cys Arg Ala Leu Cys Glu Gly Pro
235 240 245
tat gac tac gac ggc tac aac tac ctg gag tac aat gcc gac ctc ttc 822
Tyr Asp Tyr Asp Gly Tyr Asn Tyr Leu Glu Tyr Asn Ala Asp Leu Phe
250 255 260 265
cag gcc atc aca gat cat tac atc cag gtc ctc agc tgt aag cag aac 870
Gln Ala Ile Thr Asp His Tyr Ile Gln Val Leu Ser Cys Lys Gln Asn
270 275 280
tgt gtc acg gag ctt gct tcc cac cca agt cga gag aag ccc ttt gaa 918
Cys Val Thr Glu Leu Ala Ser His Pro Ser Arg Glu Lys Pro Phe Glu
285 290 295
gac ttc ctg cca tct cat tat aat tat ctg cag ttt gcc tac tat aac 966
Asp Phe Leu Pro Ser His Tyr Asn Tyr Leu Gln Phe Ala Tyr Tyr Asn
300 305 310
att ggg aat tac aca cag gcc att gaa tgt gcc aag acc tat ctc ctc 1014
Ile Gly Asn Tyr Thr Gln Ala Ile Glu Cys Ala Lys Thr Tyr Leu Leu
315 320 325
ttc ttt ccc aat gat gag gtg atg agc cag aat ctg gcc tac tat aca 1062
Phe Phe Pro Asn Asp Glu Val Met Ser Gln Asn Leu Ala Tyr Tyr Thr
330 335 340 345
gcc atg ctt gga gaa gag caa gcc aga tcc att ggc ccc cgt gag agt 1110
Ala Met Leu Gly Glu Glu Gln Ala Arg Ser Ile Gly Pro Arg Glu Ser
350 355 360
gcc cag gag tac cgc cag cgg agc ctg ctg gag aag gaa ctg ctt ttc 1158
Ala Gln Glu Tyr Arg Gln Arg Ser Leu Leu Glu Lys Glu Leu Leu Phe
365 370 375
ttc gcc tat gac gtt ttt gga att ccc ttt gtt gat ccg gat tca tgg 1206
Phe Ala Tyr Asp Val Phe Gly Ile Pro Phe Val Asp Pro Asp Ser Trp
380 385 390
act cca gtg gag gtg att cct aag aga ctg caa gag aaa cag aag tca 1254
Thr Pro Val Glu Val Ile Pro Lys Arg Leu Gln Glu Lys Gln Lys Ser
395 400 405
gaa cgg gaa aca gct gcc cgc atc tcc cag gaa atc ggg aac ctt atg 1302
Glu Arg Glu Thr Ala Ala Arg Ile Ser Gln Glu Ile Gly Asn Leu Met
410 415 420 425
aag gag atc gag acc ctc gtg gag gag aag acc aag gag tca ctg gac 1350
Lys Glu Ile Glu Thr Leu Val Glu Glu Lys Thr Lys Glu Ser Leu Asp
430 435 440
gtg agc agg ctg acc cgg gaa ggt ggc ccc ctg ctg tat gat ggc atc 1398
Val Ser Arg Leu Thr Arg Glu Gly Gly Pro Leu Leu Tyr Asp Gly Ile
445 450 455
aga ctc acc atg aac tcc aaa gtc ctg aat ggt tcc cag cgg gtg gtg 1446
Arg Leu Thr Met Asn Ser Lys Val Leu Asn Gly Ser Gln Arg Val Val
460 465 470
atg gat ggc gtc atc tct gac gag gag tgc cag gag ctg cag aga ctg 1494
Met Asp Gly Val Ile Ser Asp Glu Glu Cys Gln Glu Leu Gln Arg Leu
475 480 485
acc aat gca gca gca act tca gga gat ggc tac cgg ggt cag acc tcc 1542
Thr Asn Ala Ala Ala Thr Ser Gly Asp Gly Tyr Arg Gly Gln Thr Ser
490 495 500 505
cca cac acc ccc agc gag aag ttc tac ggt gtc acc gtc ttc aag gcc 1590
Pro His Thr Pro Ser Glu Lys Phe Tyr Gly Val Thr Val Phe Lys Ala
510 515 520
ctc aag ctg ggg cag gaa ggg aag gtt cct ctg cag agc gcc cac ctg 1638
Leu Lys Leu Gly Gln Glu Gly Lys Val Pro Leu Gln Ser Ala His Leu
525 530 535
tac tac aac gtg acg gag aag gtg cgc cgc gtc atg gag tcg tac ttc 1686
Tyr Tyr Asn Val Thr Glu Lys Val Arg Arg Val Met Glu Ser Tyr Phe
540 545 550
cgc ctg gat acc ccg ctc tac ttc tcc tac tcc cac ctg gtg tgc cgc 1734
Arg Leu Asp Thr Pro Leu Tyr Phe Ser Tyr Ser His Leu Val Cys Arg
555 560 565
acc gcc atc gaa gag gca cag gct gag agg aag gac ggt agc cac ccc 1782
Thr Ala Ile Glu Glu Ala Gln Ala Glu Arg Lys Asp Gly Ser His Pro
570 575 580 585
gtc cac gtg gac aac tgc atc ctg aat gcc gag gcc ctc gtg tgc atc 1830
Val His Val Asp Asn Cys Ile Leu Asn Ala Glu Ala Leu Val Cys Ile
590 595 600
aag gag ccc cct gcc tac act ttc cgg gac ttc agc gcc att ctt tat 1878
Lys Glu Pro Pro Ala Tyr Thr Phe Arg Asp Phe Ser Ala Ile Leu Tyr
605 610 615
ctg aac gaa gac ttc gat gga gga aac ttt tat ttc act gaa cta gat 1926
Leu Asn Glu Asp Phe Asp Gly Gly Asn Phe Tyr Phe Thr Glu Leu Asp
620 625 630
gcc aag acc gtg acg gca gag gtg cag ccc cag tgc gga agg gct gtg 1974
Ala Lys Thr Val Thr Ala Glu Val Gln Pro Gln Cys Gly Arg Ala Val
635 640 645
gga ttc tct tcc ggc acg gaa aac ccg cat gga gta aag gcc gtc acc 2022
Gly Phe Ser Ser Gly Thr Glu Asn Pro His Gly Val Lys Ala Val Thr
650 655 660 665
aga ggg cag cgc tgt gcc att gcc ctc tgg ttc act ttg gat gct cga 2070
Arg Gly Gln Arg Cys Ala Ile Ala Leu Trp Phe Thr Leu Asp Ala Arg
670 675 680
cac agc gag agg gag cga gtg cag gcg gac gac ctg gta aag atg ctc 2118
His Ser Glu Arg Glu Arg Val Gln Ala Asp Asp Leu Val Lys Met Leu
685 690 695
ttt agc cca gaa gag atg gac ctc ccc cac gag cag ccc caa gaa gcc 2166
Phe Ser Pro Glu Glu Met Asp Leu Pro His Glu Gln Pro Gln Glu Ala
700 705 710
cag gag ggg acc ccc gag ccc cta cag gag ccc gtc tcc agc agt gag 2214
Gln Glu Gly Thr Pro Glu Pro Leu Gln Glu Pro Val Ser Ser Ser Glu
715 720 725
tca ggg cac aag gat gag ctc tga caactcccgt ggatggtgat cagacccaca 2268
Ser Gly His Lys Asp Glu Leu
730 735
cgagggactc tgtcctgcag cctggactgg ccagccccgg gcgaggagca gtgggaaccc 2328
aggcctgccg cccagctgag ggggctctgc tcacggccgt ccgcatggtg ctgctgctct 2388
tggagtggac atggcgagat ggccctctcc cctctgggcc tgactgaggg ctcaggacgc 2448
aggcccagag ccactctggg ggcccacaca ggcagccacg tgacagcaat acagtattta 2508
agtgcctgtg tagacaacca aagaataaat gattcgtggt tttttttaaa aaaaaaaaaa 2568
aaaaaaaaaa aa 2580
<210> 49
<211> 736
<212> PRT
<213> Bos taurus
<400> 49
Met Ala Ala Arg Ala Leu Arg Leu Leu Thr Ile Leu Leu Ala Val Ala
1 5 10 15
Ala Thr Ala Ser Gln Ala Glu Ala Glu Ser Glu Ala Gly Trp Asp Leu
20 25 30
Thr Ala Pro Asp Leu Leu Phe Ala Glu Gly Thr Ala Ala Tyr Ala Arg
35 40 45
Gly Asp Trp Ala Gly Val Val Leu Ser Met Glu Arg Ala Leu Arg Ser
50 55 60
Arg Ala Ala Leu Arg Ala Leu Arg Leu Arg Cys Arg Thr Arg Cys Ala
65 70 75 80
Ala Asp Leu Pro Trp Glu Val Asp Pro Asp Ser Pro Pro Ser Leu Ala
85 90 95
Gln Ala Ser Gly Ala Ser Ala Leu His Asp Leu Arg Phe Phe Gly Gly
100 105 110
Leu Leu Arg Arg Ala Ala Cys Leu Arg Arg Cys Leu Gly Pro Ser Thr
115 120 125
Ala His Ser Leu Ser Glu Glu Leu Glu Leu Glu Phe Arg Lys Arg Ser
130 135 140
Pro Tyr Asn Tyr Leu Gln Val Ala Tyr Phe Lys Ile Asn Lys Leu Glu
145 150 155 160
Lys Ala Val Ala Ala Ala His Thr Phe Phe Val Gly Asn Pro Glu His
165 170 175
Met Glu Met Arg Gln Asn Leu Asp Tyr Tyr Gln Thr Met Ser Gly Val
180 185 190
Lys Glu Ala Asp Phe Lys Asp Leu Glu Ala Lys Pro His Met His Glu
195 200 205
Phe Arg Leu Gly Val Arg Leu Tyr Ser Glu Glu Gln Pro Gln Glu Ala
210 215 220
Val Pro His Leu Glu Ala Ala Leu Arg Glu Tyr Phe Val Ala Ala Glu
225 230 235 240
Glu Cys Arg Ala Leu Cys Glu Gly Pro Tyr Asp Tyr Asp Gly Tyr Asn
245 250 255
Tyr Leu Glu Tyr Asn Ala Asp Leu Phe Gln Ala Ile Thr Asp His Tyr
260 265 270
Ile Gln Val Leu Ser Cys Lys Gln Asn Cys Val Thr Glu Leu Ala Ser
275 280 285
His Pro Ser Arg Glu Lys Pro Phe Glu Asp Phe Leu Pro Ser His Tyr
290 295 300
Asn Tyr Leu Gln Phe Ala Tyr Tyr Asn Ile Gly Asn Tyr Thr Gln Ala
305 310 315 320
Ile Glu Cys Ala Lys Thr Tyr Leu Leu Phe Phe Pro Asn Asp Glu Val
325 330 335
Met Ser Gln Asn Leu Ala Tyr Tyr Thr Ala Met Leu Gly Glu Glu Gln
340 345 350
Ala Arg Ser Ile Gly Pro Arg Glu Ser Ala Gln Glu Tyr Arg Gln Arg
355 360 365
Ser Leu Leu Glu Lys Glu Leu Leu Phe Phe Ala Tyr Asp Val Phe Gly
370 375 380
Ile Pro Phe Val Asp Pro Asp Ser Trp Thr Pro Val Glu Val Ile Pro
385 390 395 400
Lys Arg Leu Gln Glu Lys Gln Lys Ser Glu Arg Glu Thr Ala Ala Arg
405 410 415
Ile Ser Gln Glu Ile Gly Asn Leu Met Lys Glu Ile Glu Thr Leu Val
420 425 430
Glu Glu Lys Thr Lys Glu Ser Leu Asp Val Ser Arg Leu Thr Arg Glu
435 440 445
Gly Gly Pro Leu Leu Tyr Asp Gly Ile Arg Leu Thr Met Asn Ser Lys
450 455 460
Val Leu Asn Gly Ser Gln Arg Val Val Met Asp Gly Val Ile Ser Asp
465 470 475 480
Glu Glu Cys Gln Glu Leu Gln Arg Leu Thr Asn Ala Ala Ala Thr Ser
485 490 495
Gly Asp Gly Tyr Arg Gly Gln Thr Ser Pro His Thr Pro Ser Glu Lys
500 505 510
Phe Tyr Gly Val Thr Val Phe Lys Ala Leu Lys Leu Gly Gln Glu Gly
515 520 525
Lys Val Pro Leu Gln Ser Ala His Leu Tyr Tyr Asn Val Thr Glu Lys
530 535 540
Val Arg Arg Val Met Glu Ser Tyr Phe Arg Leu Asp Thr Pro Leu Tyr
545 550 555 560
Phe Ser Tyr Ser His Leu Val Cys Arg Thr Ala Ile Glu Glu Ala Gln
565 570 575
Ala Glu Arg Lys Asp Gly Ser His Pro Val His Val Asp Asn Cys Ile
580 585 590
Leu Asn Ala Glu Ala Leu Val Cys Ile Lys Glu Pro Pro Ala Tyr Thr
595 600 605
Phe Arg Asp Phe Ser Ala Ile Leu Tyr Leu Asn Glu Asp Phe Asp Gly
610 615 620
Gly Asn Phe Tyr Phe Thr Glu Leu Asp Ala Lys Thr Val Thr Ala Glu
625 630 635 640
Val Gln Pro Gln Cys Gly Arg Ala Val Gly Phe Ser Ser Gly Thr Glu
645 650 655
Asn Pro His Gly Val Lys Ala Val Thr Arg Gly Gln Arg Cys Ala Ile
660 665 670
Ala Leu Trp Phe Thr Leu Asp Ala Arg His Ser Glu Arg Glu Arg Val
675 680 685
Gln Ala Asp Asp Leu Val Lys Met Leu Phe Ser Pro Glu Glu Met Asp
690 695 700
Leu Pro His Glu Gln Pro Gln Glu Ala Gln Glu Gly Thr Pro Glu Pro
705 710 715 720
Leu Gln Glu Pro Val Ser Ser Ser Glu Ser Gly His Lys Asp Glu Leu
725 730 735
<210> 50
<211> 2030
<212> DNA
<213> Bos taurus
<220>
<221> misc_feature
<222> (1)..(2030)
<223> Bos taurus lysyl oxidase (LOX), mRNA; NCBI Reference Sequence:
NM_173932.4
<220>
<221> CDS
<222> (25)..(1281)
<400> 50
ggggacagtc caggaaaggg agcg atg cgc ttc gcc tgg acc gca ctc ctc 51
Met Arg Phe Ala Trp Thr Ala Leu Leu
1 5
ggg tcg ctg cag ctc tgc gca ctc gtg cgc tgc gcc ccg ccg gcc gcc 99
Gly Ser Leu Gln Leu Cys Ala Leu Val Arg Cys Ala Pro Pro Ala Ala
10 15 20 25
agc cac cgg cag ccc cct cgc gaa cag gcg gcg gct ccc ggc gcc tgg 147
Ser His Arg Gln Pro Pro Arg Glu Gln Ala Ala Ala Pro Gly Ala Trp
30 35 40
cgc cag aag atc caa tgg gag aac aac ggg cag gtg ttc agc ctg ctg 195
Arg Gln Lys Ile Gln Trp Glu Asn Asn Gly Gln Val Phe Ser Leu Leu
45 50 55
agc ctg ggc tcg cag tac cag ccg caa cgg cga cgg gac ccc ggc gcc 243
Ser Leu Gly Ser Gln Tyr Gln Pro Gln Arg Arg Arg Asp Pro Gly Ala
60 65 70
acc gcc ccg ggg gcc gcc aac gcc act gcc cca cag atg cgc aca cca 291
Thr Ala Pro Gly Ala Ala Asn Ala Thr Ala Pro Gln Met Arg Thr Pro
75 80 85
atc ctg ctg ctc cgc aac aac cgc acc gcg gcg gcg cga gtg cgg acg 339
Ile Leu Leu Leu Arg Asn Asn Arg Thr Ala Ala Ala Arg Val Arg Thr
90 95 100 105
gcc ggc ccc tct gcg gcc gca gct ggc cgc ccc agg ccc gcc gcc cgc 387
Ala Gly Pro Ser Ala Ala Ala Ala Gly Arg Pro Arg Pro Ala Ala Arg
110 115 120
cac tgg ttc caa gct ggc tac tcg acg tcc ggg gcc cac gac gct ggg 435
His Trp Phe Gln Ala Gly Tyr Ser Thr Ser Gly Ala His Asp Ala Gly
125 130 135
acc tcg cgc gct gat aac cag acg gca ccg gga gag gtc ccg acg ctc 483
Thr Ser Arg Ala Asp Asn Gln Thr Ala Pro Gly Glu Val Pro Thr Leu
140 145 150
agt aac ctg cga ccg ccc aac cgc gtg gac gtg gac ggc atg gtg ggc 531
Ser Asn Leu Arg Pro Pro Asn Arg Val Asp Val Asp Gly Met Val Gly
155 160 165
gac gac ccg tac aac ccc tat aag tac acc gac gac aac ccc tat tac 579
Asp Asp Pro Tyr Asn Pro Tyr Lys Tyr Thr Asp Asp Asn Pro Tyr Tyr
170 175 180 185
aac tat tac gac acg tac gaa agg ccc agg cct ggg agc agg tac cgg 627
Asn Tyr Tyr Asp Thr Tyr Glu Arg Pro Arg Pro Gly Ser Arg Tyr Arg
190 195 200
ccc gga tac ggc acc ggc tac ttc cag tat ggt ctt ccg gac ctg gtg 675
Pro Gly Tyr Gly Thr Gly Tyr Phe Gln Tyr Gly Leu Pro Asp Leu Val
205 210 215
ccc gat ccc tac tac atc cag gcg tcc aca tac gtg caa aag atg gcc 723
Pro Asp Pro Tyr Tyr Ile Gln Ala Ser Thr Tyr Val Gln Lys Met Ala
220 225 230
atg tac aac ctt aga tgc gct gcg gag gaa aac tgc ttg gcc agc tca 771
Met Tyr Asn Leu Arg Cys Ala Ala Glu Glu Asn Cys Leu Ala Ser Ser
235 240 245
gca tac agg gga gat gtc aga gat tat gat cac agg gtg ctg cta aga 819
Ala Tyr Arg Gly Asp Val Arg Asp Tyr Asp His Arg Val Leu Leu Arg
250 255 260 265
ttt ccc cag aga gtg aaa aac caa ggg aca tct gat ttc cta cca agt 867
Phe Pro Gln Arg Val Lys Asn Gln Gly Thr Ser Asp Phe Leu Pro Ser
270 275 280
cga cca aga tat tcc tgg gaa tgg cac agt tgt cac cag cat tac cac 915
Arg Pro Arg Tyr Ser Trp Glu Trp His Ser Cys His Gln His Tyr His
285 290 295
agc atg gat gaa ttc agc cac tat gac ctg ctt gat gcc agc acc cag 963
Ser Met Asp Glu Phe Ser His Tyr Asp Leu Leu Asp Ala Ser Thr Gln
300 305 310
agg aga gtg gct gag ggc cat aaa gcg agt ttc tgt ctt gag gac aca 1011
Arg Arg Val Ala Glu Gly His Lys Ala Ser Phe Cys Leu Glu Asp Thr
315 320 325
tcg tgt gac tac ggc tac cac agg cga ttt gca tgt act gca cac aca 1059
Ser Cys Asp Tyr Gly Tyr His Arg Arg Phe Ala Cys Thr Ala His Thr
330 335 340 345
cag ggc ttg agt cct ggc tgc tat gat acc tat aat gca gac ata gac 1107
Gln Gly Leu Ser Pro Gly Cys Tyr Asp Thr Tyr Asn Ala Asp Ile Asp
350 355 360
tgc caa tgg att gat atc act gat gtc aaa cct gga aac tat att ctc 1155
Cys Gln Trp Ile Asp Ile Thr Asp Val Lys Pro Gly Asn Tyr Ile Leu
365 370 375
aag gtc agt gtg aat ccc agc tat ttg gtg cct gag tcg gat tat tcc 1203
Lys Val Ser Val Asn Pro Ser Tyr Leu Val Pro Glu Ser Asp Tyr Ser
380 385 390
aac aat gtc gtc cgc tgt gaa att cgc tac aca gga cat cac gca tat 1251
Asn Asn Val Val Arg Cys Glu Ile Arg Tyr Thr Gly His His Ala Tyr
395 400 405
gcc tcg ggc tgc aca att tca ccg tat tag aaagcaagcc aaaactccca 1301
Ala Ser Gly Cys Thr Ile Ser Pro Tyr
410 415
aaggatatat cagtgcctgg tgttctgaag tggaaaaaaa tagattaact tcagtaggat 1361
ttatgtattt tgaaagagag aacagaaaac aacaaaagaa tttttgtttg gactgtttta 1421
taacaaagca cataactgga ttttgaacat ttcaatcggc attatttggg aaatttttaa 1481
tattattatt cacattactt tgtgaattaa cacagtgttt caattctgta attgcacact 1541
tggctctttc tgagaaatcc aaatttctta tgcttcttct gaaattatag tgcaaaaggg 1601
aaaaaaaatt cgatgaatga gtcaaaatta ttttaaaact gagaattttc taaagttcta 1661
aaactttagt gaaccttaat aataactggc ttatatatgt cctagcatag atcactttag 1721
aaatgaagct cctactgttt aaatagatat ggacacattt ggtactgagg gaggaataaa 1781
caggttacca ttggtgtcaa gaaatgttac tatatagcag agaaatggca atgtatgtat 1841
tcagatagtt acatccctat ataaaatttg tttacatttt aaaaattagt agataaactc 1901
ctttctttct gtcaagtgta caagttcatt ctgacttaag tcagcttttg ttgtggaaca 1961
aattaagtaa ttgagctgcc caaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa 2021
aaaaaaaaa 2030
<210> 51
<211> 418
<212> PRT
<213> Bos taurus
<400> 51
Met Arg Phe Ala Trp Thr Ala Leu Leu Gly Ser Leu Gln Leu Cys Ala
1 5 10 15
Leu Val Arg Cys Ala Pro Pro Ala Ala Ser His Arg Gln Pro Pro Arg
20 25 30
Glu Gln Ala Ala Ala Pro Gly Ala Trp Arg Gln Lys Ile Gln Trp Glu
35 40 45
Asn Asn Gly Gln Val Phe Ser Leu Leu Ser Leu Gly Ser Gln Tyr Gln
50 55 60
Pro Gln Arg Arg Arg Asp Pro Gly Ala Thr Ala Pro Gly Ala Ala Asn
65 70 75 80
Ala Thr Ala Pro Gln Met Arg Thr Pro Ile Leu Leu Leu Arg Asn Asn
85 90 95
Arg Thr Ala Ala Ala Arg Val Arg Thr Ala Gly Pro Ser Ala Ala Ala
100 105 110
Ala Gly Arg Pro Arg Pro Ala Ala Arg His Trp Phe Gln Ala Gly Tyr
115 120 125
Ser Thr Ser Gly Ala His Asp Ala Gly Thr Ser Arg Ala Asp Asn Gln
130 135 140
Thr Ala Pro Gly Glu Val Pro Thr Leu Ser Asn Leu Arg Pro Pro Asn
145 150 155 160
Arg Val Asp Val Asp Gly Met Val Gly Asp Asp Pro Tyr Asn Pro Tyr
165 170 175
Lys Tyr Thr Asp Asp Asn Pro Tyr Tyr Asn Tyr Tyr Asp Thr Tyr Glu
180 185 190
Arg Pro Arg Pro Gly Ser Arg Tyr Arg Pro Gly Tyr Gly Thr Gly Tyr
195 200 205
Phe Gln Tyr Gly Leu Pro Asp Leu Val Pro Asp Pro Tyr Tyr Ile Gln
210 215 220
Ala Ser Thr Tyr Val Gln Lys Met Ala Met Tyr Asn Leu Arg Cys Ala
225 230 235 240
Ala Glu Glu Asn Cys Leu Ala Ser Ser Ala Tyr Arg Gly Asp Val Arg
245 250 255
Asp Tyr Asp His Arg Val Leu Leu Arg Phe Pro Gln Arg Val Lys Asn
260 265 270
Gln Gly Thr Ser Asp Phe Leu Pro Ser Arg Pro Arg Tyr Ser Trp Glu
275 280 285
Trp His Ser Cys His Gln His Tyr His Ser Met Asp Glu Phe Ser His
290 295 300
Tyr Asp Leu Leu Asp Ala Ser Thr Gln Arg Arg Val Ala Glu Gly His
305 310 315 320
Lys Ala Ser Phe Cys Leu Glu Asp Thr Ser Cys Asp Tyr Gly Tyr His
325 330 335
Arg Arg Phe Ala Cys Thr Ala His Thr Gln Gly Leu Ser Pro Gly Cys
340 345 350
Tyr Asp Thr Tyr Asn Ala Asp Ile Asp Cys Gln Trp Ile Asp Ile Thr
355 360 365
Asp Val Lys Pro Gly Asn Tyr Ile Leu Lys Val Ser Val Asn Pro Ser
370 375 380
Tyr Leu Val Pro Glu Ser Asp Tyr Ser Asn Asn Val Val Arg Cys Glu
385 390 395 400
Ile Arg Tyr Thr Gly His His Ala Tyr Ala Ser Gly Cys Thr Ile Ser
405 410 415
Pro Tyr
<210> 52
<211> 2375
<212> DNA
<213> Bos taurus
<220>
<221> misc_feature
<222> (1)..(2375)
<223> Bos taurus prolyl 4-hydroxylase subunit beta (P4HB), mRNA; NCBI Reference Sequence: NM_174135.3
<220>
<221> misc_feature
<222> (6)..(65)
<223> Bos taurus prolyl 4-hydroxylase subunit beta (P4HB): signal
peptide
<220>
<221> CDS
<222> (66)..(1535)
<400> 52
ccgacatgct gcgccgcgct ctgctctgcc tggccctgac cgcgctattc cgcgcgggtg 60
ccggc gcc ccc gac gag gag gac cac gtc ctg gtg ctc cat aag ggc aac 110
Ala Pro Asp Glu Glu Asp His Val Leu Val Leu His Lys Gly Asn
1 5 10 15
ttc gac gag gcg ctg gcg gcc cac aag tac ctg ctg gtg gag ttc tac 158
Phe Asp Glu Ala Leu Ala Ala His Lys Tyr Leu Leu Val Glu Phe Tyr
20 25 30
gcc cca tgg tgc ggc cac tgc aag gct ctg gcc ccg gag tat gcc aaa 206
Ala Pro Trp Cys Gly His Cys Lys Ala Leu Ala Pro Glu Tyr Ala Lys
35 40 45
gca gct ggg aag ctg aag gca gaa ggt tct gag atc aga ctg gcc aag 254
Ala Ala Gly Lys Leu Lys Ala Glu Gly Ser Glu Ile Arg Leu Ala Lys
50 55 60
gtg gat gcc act gaa gag tct gac ctg gcc cag cag tat ggt gtc cga 302
Val Asp Ala Thr Glu Glu Ser Asp Leu Ala Gln Gln Tyr Gly Val Arg
65 70 75
ggc tac ccc acc atc aag ttc ttc aag aat gga gac aca gct tcc ccc 350
Gly Tyr Pro Thr Ile Lys Phe Phe Lys Asn Gly Asp Thr Ala Ser Pro
80 85 90 95
aaa gag tac aca gct ggc cga gaa gcg gat gat atc gtg aac tgg ctg 398
Lys Glu Tyr Thr Ala Gly Arg Glu Ala Asp Asp Ile Val Asn Trp Leu
100 105 110
aag aag cgc acg ggc ccc gct gcc agc acg ctg tcc gac ggg gct gct 446
Lys Lys Arg Thr Gly Pro Ala Ala Ser Thr Leu Ser Asp Gly Ala Ala
115 120 125
gca gag gcc ttg gtg gag tcc agt gag gtg gcc gtc att ggc ttc ttc 494
Ala Glu Ala Leu Val Glu Ser Ser Glu Val Ala Val Ile Gly Phe Phe
130 135 140
aag gac atg gag tcg gac tcc gca aag cag ttc ttc ttg gca gca gag 542
Lys Asp Met Glu Ser Asp Ser Ala Lys Gln Phe Phe Leu Ala Ala Glu
145 150 155
gtc att gat gac atc ccc ttc ggg atc aca tct aac agc gat gtg ttc 590
Val Ile Asp Asp Ile Pro Phe Gly Ile Thr Ser Asn Ser Asp Val Phe
160 165 170 175
tcc aaa tac cag ctg gac aag gat ggg gtt gtc ctc ttt aag aag ttt 638
Ser Lys Tyr Gln Leu Asp Lys Asp Gly Val Val Leu Phe Lys Lys Phe
180 185 190
gac gaa ggc cgg aac aac ttt gag ggg gag gtc acc aaa gaa aag ctt 686
Asp Glu Gly Arg Asn Asn Phe Glu Gly Glu Val Thr Lys Glu Lys Leu
195 200 205
ctg gac ttc atc aag cac aac cag ttg ccc ctg gtc att gag ttc acc 734
Leu Asp Phe Ile Lys His Asn Gln Leu Pro Leu Val Ile Glu Phe Thr
210 215 220
gag cag aca gcc ccg aag atc ttc gga ggg gaa atc aag act cac atc 782
Glu Gln Thr Ala Pro Lys Ile Phe Gly Gly Glu Ile Lys Thr His Ile
225 230 235
ctg ctg ttc ctg ccg aaa agc gtg tct gac tat gag ggc aag ctg agc 830
Leu Leu Phe Leu Pro Lys Ser Val Ser Asp Tyr Glu Gly Lys Leu Ser
240 245 250 255
aac ttc aaa aaa gcg gct gag agc ttc aag ggc aag atc ctg ttt atc 878
Asn Phe Lys Lys Ala Ala Glu Ser Phe Lys Gly Lys Ile Leu Phe Ile
260 265 270
ttc atc gac agc gac cac act gac aac cag cgc atc ctg gaa ttc ttc 926
Phe Ile Asp Ser Asp His Thr Asp Asn Gln Arg Ile Leu Glu Phe Phe
275 280 285
ggc cta aag aaa gag gag tgc ccg gcc gtg cgc ctc atc acg ctg gag 974
Gly Leu Lys Lys Glu Glu Cys Pro Ala Val Arg Leu Ile Thr Leu Glu
290 295 300
gag gag atg acc aaa tat aag cca gag tca gat gag ctg acg gca gag 1022
Glu Glu Met Thr Lys Tyr Lys Pro Glu Ser Asp Glu Leu Thr Ala Glu
305 310 315
aag atc acc gag ttc tgc cac cgc ttc ctg gag ggc aag att aag ccc 1070
Lys Ile Thr Glu Phe Cys His Arg Phe Leu Glu Gly Lys Ile Lys Pro
320 325 330 335
cac ctg atg agc cag gag ctg cct gac gac tgg gac aag cag cct gtc 1118
His Leu Met Ser Gln Glu Leu Pro Asp Asp Trp Asp Lys Gln Pro Val
340 345 350
aaa gtg ctg gtt ggg aag aac ttt gaa gag gtt gct ttt gat gag aaa 1166
Lys Val Leu Val Gly Lys Asn Phe Glu Glu Val Ala Phe Asp Glu Lys
355 360 365
aag aac gtc ttt gta gag ttc tat gcc ccg tgg tgc ggt cac tgc aag 1214
Lys Asn Val Phe Val Glu Phe Tyr Ala Pro Trp Cys Gly His Cys Lys
370 375 380
cag ctg gcc ccc atc tgg gat aag ctg gga gag acg tac aag gac cac 1262
Gln Leu Ala Pro Ile Trp Asp Lys Leu Gly Glu Thr Tyr Lys Asp His
385 390 395
gag aac ata gtc atc gcc aag atg gac tcc acg gcc aac gag gtg gag 1310
Glu Asn Ile Val Ile Ala Lys Met Asp Ser Thr Ala Asn Glu Val Glu
400 405 410 415
gcg gtg aaa gtg cac agc ttc ccc acg ctc aag ttc ttc ccc gcc agc 1358
Ala Val Lys Val His Ser Phe Pro Thr Leu Lys Phe Phe Pro Ala Ser
420 425 430
gcc gac agg acg gtc atc gac tac aat ggg gag cgg aca ctg gat ggt 1406
Ala Asp Arg Thr Val Ile Asp Tyr Asn Gly Glu Arg Thr Leu Asp Gly
435 440 445
ttt aag aag ttc ctg gag agt ggt ggc cag gat ggg gcc gga gat gat 1454
Phe Lys Lys Phe Leu Glu Ser Gly Gly Gln Asp Gly Ala Gly Asp Asp
450 455 460
gac gat cta gaa gat ctt gaa gaa gca gaa gag cct gat ctg gag gaa 1502
Asp Asp Leu Glu Asp Leu Glu Glu Ala Glu Glu Pro Asp Leu Glu Glu
465 470 475
gat gat gat caa aaa gct gtg aaa gat gaa ctg taacacagag agccagacct 1555
Asp Asp Asp Gln Lys Ala Val Lys Asp Glu Leu
480 485 490
gggcaccaaa cccggacctc ccagtgggct gcacacccag cagcacagcc tccagacgcc 1615
cgcagaccct cccagcgagg gagcgtcgat tggaaatgca gggaactttt ctgaagccac 1675
acttcactct accacacgtg caaatctaaa cccgtcttcc tttgcttttc aacttttgga 1735
aaagggttta tttccaggcc agcccagccc agcccatctt ggtgggcctt tttttttaaa 1795
tcgtgatgta ctttttttgt acctggtttt gtccagagtg ctcgctaaaa tgttttggac 1855
tctcacgctg gcaatgtctc tcattcctgt taggtttata ctatcacttt aaaaaaattc 1915
cgtctgtggg atttttagac atttttggac gtcagggtgt gtgctccacc ttggccaggc 1975
ctccctggga ctcctgccct ctgtggggca gaaccaggca aggctggacg ggtccctcac 2035
ctcatgcggt attgccatgg tggagcgtgg ctcctgcatc atttgattaa atggagactt 2095
tccggtctct gtcacaggcc gctccccaac cgtgagtgga gggtgtggct gggccaggac 2155
aagcccagca ctgtgccagg cagaaccggg acccttcgtt tccaggctgg gagacagcca 2215
aggatgcttg gccccctcct tccccaagcc agggtcctta ttgctctgtg atgtccaggg 2275
tggcctgagg agctgaatca catgttgaca gttcttcagg catttctacc acaatattgg 2335
aattggacac attggccaaa taaagttaaa attttctgcc 2375
<210> 53
<211> 490
<212> PRT
<213> Bos taurus
<400> 53
Ala Pro Asp Glu Glu Asp His Val Leu Val Leu His Lys Gly Asn Phe
1 5 10 15
Asp Glu Ala Leu Ala Ala His Lys Tyr Leu Leu Val Glu Phe Tyr Ala
20 25 30
Pro Trp Cys Gly His Cys Lys Ala Leu Ala Pro Glu Tyr Ala Lys Ala
35 40 45
Ala Gly Lys Leu Lys Ala Glu Gly Ser Glu Ile Arg Leu Ala Lys Val
50 55 60
Asp Ala Thr Glu Glu Ser Asp Leu Ala Gln Gln Tyr Gly Val Arg Gly
65 70 75 80
Tyr Pro Thr Ile Lys Phe Phe Lys Asn Gly Asp Thr Ala Ser Pro Lys
85 90 95
Glu Tyr Thr Ala Gly Arg Glu Ala Asp Asp Ile Val Asn Trp Leu Lys
100 105 110
Lys Arg Thr Gly Pro Ala Ala Ser Thr Leu Ser Asp Gly Ala Ala Ala
115 120 125
Glu Ala Leu Val Glu Ser Ser Glu Val Ala Val Ile Gly Phe Phe Lys
130 135 140
Asp Met Glu Ser Asp Ser Ala Lys Gln Phe Phe Leu Ala Ala Glu Val
145 150 155 160
Ile Asp Asp Ile Pro Phe Gly Ile Thr Ser Asn Ser Asp Val Phe Ser
165 170 175
Lys Tyr Gln Leu Asp Lys Asp Gly Val Val Leu Phe Lys Lys Phe Asp
180 185 190
Glu Gly Arg Asn Asn Phe Glu Gly Glu Val Thr Lys Glu Lys Leu Leu
195 200 205
Asp Phe Ile Lys His Asn Gln Leu Pro Leu Val Ile Glu Phe Thr Glu
210 215 220
Gln Thr Ala Pro Lys Ile Phe Gly Gly Glu Ile Lys Thr His Ile Leu
225 230 235 240
Leu Phe Leu Pro Lys Ser Val Ser Asp Tyr Glu Gly Lys Leu Ser Asn
245 250 255
Phe Lys Lys Ala Ala Glu Ser Phe Lys Gly Lys Ile Leu Phe Ile Phe
260 265 270
Ile Asp Ser Asp His Thr Asp Asn Gln Arg Ile Leu Glu Phe Phe Gly
275 280 285
Leu Lys Lys Glu Glu Cys Pro Ala Val Arg Leu Ile Thr Leu Glu Glu
290 295 300
Glu Met Thr Lys Tyr Lys Pro Glu Ser Asp Glu Leu Thr Ala Glu Lys
305 310 315 320
Ile Thr Glu Phe Cys His Arg Phe Leu Glu Gly Lys Ile Lys Pro His
325 330 335
Leu Met Ser Gln Glu Leu Pro Asp Asp Trp Asp Lys Gln Pro Val Lys
340 345 350
Val Leu Val Gly Lys Asn Phe Glu Glu Val Ala Phe Asp Glu Lys Lys
355 360 365
Asn Val Phe Val Glu Phe Tyr Ala Pro Trp Cys Gly His Cys Lys Gln
370 375 380
Leu Ala Pro Ile Trp Asp Lys Leu Gly Glu Thr Tyr Lys Asp His Glu
385 390 395 400
Asn Ile Val Ile Ala Lys Met Asp Ser Thr Ala Asn Glu Val Glu Ala
405 410 415
Val Lys Val His Ser Phe Pro Thr Leu Lys Phe Phe Pro Ala Ser Ala
420 425 430
Asp Arg Thr Val Ile Asp Tyr Asn Gly Glu Arg Thr Leu Asp Gly Phe
435 440 445
Lys Lys Phe Leu Glu Ser Gly Gly Gln Asp Gly Ala Gly Asp Asp Asp
450 455 460
Asp Leu Glu Asp Leu Glu Glu Ala Glu Glu Pro Asp Leu Glu Glu Asp
465 470 475 480
Asp Asp Gln Lys Ala Val Lys Asp Glu Leu
485 490
<210> 54
<211> 2786
<212> DNA
<213> Bos taurus
<220>
<221> misc_feature
<222> (1)..(2786)
<223> Bos taurus prolyl 4-hydroxylase subunit alpha 1 (P4HA1), mRNA; NCBI Reference Sequence: NM_001075770.1
<220>
<221> CDS
<222> (104)..(1708)
<400> 54
gagtaggtag ccggccgggt gcaggcgacc gggtactgaa gaacgcgcag ctctcgcgtg 60
ccacttccca ggtgtgtgag cctgtaaaat taaacctttg aag atg atc tgg tat 115
Met Ile Trp Tyr
1
att tta gtt gta ggg att cta ctt ccc cag tct ttg gcc cat cca ggc 163
Ile Leu Val Val Gly Ile Leu Leu Pro Gln Ser Leu Ala His Pro Gly
5 10 15 20
ttt ttt act tct att ggt cag atg act gat ttg att cat act gaa aaa 211
Phe Phe Thr Ser Ile Gly Gln Met Thr Asp Leu Ile His Thr Glu Lys
25 30 35
gat ctg gtg act tcc ctg aaa gac tat ata aag gca gaa gag gac aaa 259
Asp Leu Val Thr Ser Leu Lys Asp Tyr Ile Lys Ala Glu Glu Asp Lys
40 45 50
tta gaa caa ata aaa aaa tgg gca gag aaa tta gat cga tta acc agc 307
Leu Glu Gln Ile Lys Lys Trp Ala Glu Lys Leu Asp Arg Leu Thr Ser
55 60 65
aca gcg aca aaa gat cca gaa gga ttt gtt gga cac cct gta aat gca 355
Thr Ala Thr Lys Asp Pro Glu Gly Phe Val Gly His Pro Val Asn Ala
70 75 80
ttc aaa tta atg aaa cgt ctg aac act gag tgg agt gag ttg gag aat 403
Phe Lys Leu Met Lys Arg Leu Asn Thr Glu Trp Ser Glu Leu Glu Asn
85 90 95 100
ctg gtc ctt aag gat atg tca gat ggt ttt atc tct aac cta acc att 451
Leu Val Leu Lys Asp Met Ser Asp Gly Phe Ile Ser Asn Leu Thr Ile
105 110 115
cag aga cag tac ttc cct aat gat gaa gat cag gtt ggg gca gcc aaa 499
Gln Arg Gln Tyr Phe Pro Asn Asp Glu Asp Gln Val Gly Ala Ala Lys
120 125 130
gct ctg ttg cgt cta cag gac acc tac aat ttg gat aca gat acc atc 547
Ala Leu Leu Arg Leu Gln Asp Thr Tyr Asn Leu Asp Thr Asp Thr Ile
135 140 145
tca aag ggt gat ctt cca gga gta aaa cac aaa tct ttt cta aca gtt 595
Ser Lys Gly Asp Leu Pro Gly Val Lys His Lys Ser Phe Leu Thr Val
150 155 160
gag gac tgt ttt gag ttg ggc aaa gtg gcc tac aca gaa gca gat tat 643
Glu Asp Cys Phe Glu Leu Gly Lys Val Ala Tyr Thr Glu Ala Asp Tyr
165 170 175 180
tac cat aca gag ctg tgg atg gaa caa gca ctg agg cag ctg gat gaa 691
Tyr His Thr Glu Leu Trp Met Glu Gln Ala Leu Arg Gln Leu Asp Glu
185 190 195
ggc gag gtt tct acc gtt gat aaa gtc tct gtt ctg gat tat ttg agc 739
Gly Glu Val Ser Thr Val Asp Lys Val Ser Val Leu Asp Tyr Leu Ser
200 205 210
tat gca gta tac cag cag gga gac ctg gat aag gcg ctt ttg ctc aca 787
Tyr Ala Val Tyr Gln Gln Gly Asp Leu Asp Lys Ala Leu Leu Leu Thr
215 220 225
aag aag ctt ctt gaa cta gat cct gaa cat cag aga gct aac ggt aac 835
Lys Lys Leu Leu Glu Leu Asp Pro Glu His Gln Arg Ala Asn Gly Asn
230 235 240
tta aaa tac ttt gag tat ata atg gct aaa gaa aaa gat gcc aat aag 883
Leu Lys Tyr Phe Glu Tyr Ile Met Ala Lys Glu Lys Asp Ala Asn Lys
245 250 255 260
tct tct tca gat gac caa tct gat cag aaa acc aca ctg aag aag aaa 931
Ser Ser Ser Asp Asp Gln Ser Asp Gln Lys Thr Thr Leu Lys Lys Lys
265 270 275
ggt gct gct gtg gat tac ctg cca gag aga cag aag tac gaa atg ctg 979
Gly Ala Ala Val Asp Tyr Leu Pro Glu Arg Gln Lys Tyr Glu Met Leu
280 285 290
tgc cgt ggg gag ggt atc aaa atg act cct cgg aga cag aaa aaa ctc 1027
Cys Arg Gly Glu Gly Ile Lys Met Thr Pro Arg Arg Gln Lys Lys Leu
295 300 305
ttc tgt cgc tac cat gat gga aac cgg aat cct aaa ttt atc ctg gct 1075
Phe Cys Arg Tyr His Asp Gly Asn Arg Asn Pro Lys Phe Ile Leu Ala
310 315 320
cca gcc aaa cag gag gat gag tgg gac aag cct cgt att atc cgc ttc 1123
Pro Ala Lys Gln Glu Asp Glu Trp Asp Lys Pro Arg Ile Ile Arg Phe
325 330 335 340
cat gat att att tct gat gca gaa att gaa gtc gtt aaa gat cta gca 1171
His Asp Ile Ile Ser Asp Ala Glu Ile Glu Val Val Lys Asp Leu Ala
345 350 355
aaa cca agg ctg agg cga gcc acc att tca aac cca ata aca gga gac 1219
Lys Pro Arg Leu Arg Arg Ala Thr Ile Ser Asn Pro Ile Thr Gly Asp
360 365 370
ttg gag acg gta cat tac aga att agc aaa agt gcc tgg ctg tct ggc 1267
Leu Glu Thr Val His Tyr Arg Ile Ser Lys Ser Ala Trp Leu Ser Gly
375 380 385
tat gaa aac cct gtg gtg tca cga att aat atg aga atc caa gat ctg 1315
Tyr Glu Asn Pro Val Val Ser Arg Ile Asn Met Arg Ile Gln Asp Leu
390 395 400
aca gga cta gat gtc tcc aca gca gag gaa tta cag gta gca aat tat 1363
Thr Gly Leu Asp Val Ser Thr Ala Glu Glu Leu Gln Val Ala Asn Tyr
405 410 415 420
gga gtt gga gga cag tat gaa ccc cat ttt gat ttt gca cgg aaa gat 1411
Gly Val Gly Gly Gln Tyr Glu Pro His Phe Asp Phe Ala Arg Lys Asp
425 430 435
gag cca gat gct ttc aaa gag ctg ggg aca gga aat aga att gct aca 1459
Glu Pro Asp Ala Phe Lys Glu Leu Gly Thr Gly Asn Arg Ile Ala Thr
440 445 450
tgg ctg ttt tat atg agt gat gtg tta gca gga gga gcc act gtt ttt 1507
Trp Leu Phe Tyr Met Ser Asp Val Leu Ala Gly Gly Ala Thr Val Phe
455 460 465
cct gaa gta gga gct agt gtt tgg ccc aaa aag gga act gct gtt ttc 1555
Pro Glu Val Gly Ala Ser Val Trp Pro Lys Lys Gly Thr Ala Val Phe
470 475 480
tgg tat aat ctg ttt gcc agt gga gaa gga gat tat agt aca cgg cat 1603
Trp Tyr Asn Leu Phe Ala Ser Gly Glu Gly Asp Tyr Ser Thr Arg His
485 490 495 500
gca gcc tgt cca gtg ctg gtt gga aac aaa tgg gta tcc aat aaa tgg 1651
Ala Ala Cys Pro Val Leu Val Gly Asn Lys Trp Val Ser Asn Lys Trp
505 510 515
ctc cat gaa cgt gga cag gaa ttt cga aga cca tgc acc ttg tca gaa 1699
Leu His Glu Arg Gly Gln Glu Phe Arg Arg Pro Cys Thr Leu Ser Glu
520 525 530
ttg gaa tga caaatgaact ttctctcctg ttgtactcta atgtgtctga 1748
Leu Glu
tacacacaat tcccagtctt aactttcaag agtttacaat tgactaacac tccgtgattg 1808
attcagtcat gaacctcatc ccatgtttca tctgtggaca atcactaact ttgtggggtt 1868
tgtttttttt ttcttttaaa agtaacacta aatcaccaca ttgtacatat aaaaaacctt 1928
aaagttcagt tggcatcaca gaggacaaaa agacagggtt aaaaatgagg aacttttacc 1988
tttatattaa aaaaattttt ttttagttgg ggaaaaaaaa agtcaagcat ctgattataa 2048
tatttcagta tatctctgtt ggtgggtggt ggactaaaat ggtccatctg attaaggaac 2108
agatgcctta tagtgtatac ctaggtactg tgtttaccta gtcttaactt tcttctggat 2168
ctgcctgacg actaggaata aattagccct ctaaactcgg ttcagtttaa cgtttgcccc 2228
tatgtttact aagtagattt tttcttctcc caagtccttt ctaaagtatt ctttattttt 2288
accaatctgt tcctttcata gctcctctgt ggtgaattaa atttgagtta aaatactttg 2348
attttaaaaa aaatttaaca gaaggtccta cattaaaaag ttttggcctt cttaacagaa 2408
atgatcatga cttagtctgt ttctgctttt tcttaaatga ctcatgattt tgtccaggaa 2468
tttttgttgt tttccttagt gctaattcct tgcctcttgt tccagctata gacagcgggg 2528
gatgatgatg ttggcattca gattaaataa atactgtgcc ttaggagact ggaaatttta 2588
aaatgtacaa gttctttcaa tgatgaggga attgataaaa aaaaaaaaaa aaaaaaaaaa 2648
aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa 2708
aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa 2768
aaaaaaaaaa aaaaaaaa 2786
<210> 55
<211> 534
<212> PRT
<213> Bos taurus
<400> 55
Met Ile Trp Tyr Ile Leu Val Val Gly Ile Leu Leu Pro Gln Ser Leu
1 5 10 15
Ala His Pro Gly Phe Phe Thr Ser Ile Gly Gln Met Thr Asp Leu Ile
20 25 30
His Thr Glu Lys Asp Leu Val Thr Ser Leu Lys Asp Tyr Ile Lys Ala
35 40 45
Glu Glu Asp Lys Leu Glu Gln Ile Lys Lys Trp Ala Glu Lys Leu Asp
50 55 60
Arg Leu Thr Ser Thr Ala Thr Lys Asp Pro Glu Gly Phe Val Gly His
65 70 75 80
Pro Val Asn Ala Phe Lys Leu Met Lys Arg Leu Asn Thr Glu Trp Ser
85 90 95
Glu Leu Glu Asn Leu Val Leu Lys Asp Met Ser Asp Gly Phe Ile Ser
100 105 110
Asn Leu Thr Ile Gln Arg Gln Tyr Phe Pro Asn Asp Glu Asp Gln Val
115 120 125
Gly Ala Ala Lys Ala Leu Leu Arg Leu Gln Asp Thr Tyr Asn Leu Asp
130 135 140
Thr Asp Thr Ile Ser Lys Gly Asp Leu Pro Gly Val Lys His Lys Ser
145 150 155 160
Phe Leu Thr Val Glu Asp Cys Phe Glu Leu Gly Lys Val Ala Tyr Thr
165 170 175
Glu Ala Asp Tyr Tyr His Thr Glu Leu Trp Met Glu Gln Ala Leu Arg
180 185 190
Gln Leu Asp Glu Gly Glu Val Ser Thr Val Asp Lys Val Ser Val Leu
195 200 205
Asp Tyr Leu Ser Tyr Ala Val Tyr Gln Gln Gly Asp Leu Asp Lys Ala
210 215 220
Leu Leu Leu Thr Lys Lys Leu Leu Glu Leu Asp Pro Glu His Gln Arg
225 230 235 240
Ala Asn Gly Asn Leu Lys Tyr Phe Glu Tyr Ile Met Ala Lys Glu Lys
245 250 255
Asp Ala Asn Lys Ser Ser Ser Asp Asp Gln Ser Asp Gln Lys Thr Thr
260 265 270
Leu Lys Lys Lys Gly Ala Ala Val Asp Tyr Leu Pro Glu Arg Gln Lys
275 280 285
Tyr Glu Met Leu Cys Arg Gly Glu Gly Ile Lys Met Thr Pro Arg Arg
290 295 300
Gln Lys Lys Leu Phe Cys Arg Tyr His Asp Gly Asn Arg Asn Pro Lys
305 310 315 320
Phe Ile Leu Ala Pro Ala Lys Gln Glu Asp Glu Trp Asp Lys Pro Arg
325 330 335
Ile Ile Arg Phe His Asp Ile Ile Ser Asp Ala Glu Ile Glu Val Val
340 345 350
Lys Asp Leu Ala Lys Pro Arg Leu Arg Arg Ala Thr Ile Ser Asn Pro
355 360 365
Ile Thr Gly Asp Leu Glu Thr Val His Tyr Arg Ile Ser Lys Ser Ala
370 375 380
Trp Leu Ser Gly Tyr Glu Asn Pro Val Val Ser Arg Ile Asn Met Arg
385 390 395 400
Ile Gln Asp Leu Thr Gly Leu Asp Val Ser Thr Ala Glu Glu Leu Gln
405 410 415
Val Ala Asn Tyr Gly Val Gly Gly Gln Tyr Glu Pro His Phe Asp Phe
420 425 430
Ala Arg Lys Asp Glu Pro Asp Ala Phe Lys Glu Leu Gly Thr Gly Asn
435 440 445
Arg Ile Ala Thr Trp Leu Phe Tyr Met Ser Asp Val Leu Ala Gly Gly
450 455 460
Ala Thr Val Phe Pro Glu Val Gly Ala Ser Val Trp Pro Lys Lys Gly
465 470 475 480
Thr Ala Val Phe Trp Tyr Asn Leu Phe Ala Ser Gly Glu Gly Asp Tyr
485 490 495
Ser Thr Arg His Ala Ala Cys Pro Val Leu Val Gly Asn Lys Trp Val
500 505 510
Ser Asn Lys Trp Leu His Glu Arg Gly Gln Glu Phe Arg Arg Pro Cys
515 520 525
Thr Leu Ser Glu Leu Glu
530
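The listing above uses the WIPO ST.25 layout, in which each row of three-letter residue codes is followed by a row of position numbers. As a minimal sketch, such a block could be converted to a one-letter string as shown below. The function name `st25_to_one_letter` and the parsing rules are illustrative assumptions and are not part of the filing.

```python
# Illustrative helper (hypothetical name); not part of the patent filing.
# Maps the 20 standard three-letter amino acid codes to one-letter codes.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}


def st25_to_one_letter(block: str) -> str:
    """Convert ST.25-style residue rows to a one-letter sequence string."""
    residues = []
    for line in block.splitlines():
        tokens = line.split()
        # Skip blank lines and position-number rows such as "1 5 10 15".
        if not tokens or all(tok.isdigit() for tok in tokens):
            continue
        residues.extend(THREE_TO_ONE[tok] for tok in tokens)
    return "".join(residues)


# First two residue rows of SEQ ID NO: 55, copied from the listing above.
sample = """Met Ile Trp Tyr Ile Leu Val Val Gly Ile Leu Leu Pro Gln Ser Leu
1 5 10 15
Ala His Pro Gly Phe Phe Thr Ser Ile Gly Gln Met Thr Asp Leu Ile
20 25 30"""

print(st25_to_one_letter(sample))
```

Applied to the full SEQ ID NO: 55 block, the resulting string should contain the 534 residues declared in the <211> field.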
Claims (50)
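- A strain of yeast genetically engineered to produce non-hydroxylated collagen, comprising:
(i) a strain of yeast; and
(ii) a vector, inserted into the strain of yeast, comprising a DNA sequence for collagen; a DNA sequence for a collagen promoter; a DNA sequence for a collagen terminator; a DNA sequence for a selection marker; a DNA sequence for a promoter for the selection marker; a DNA sequence for a terminator for the selection marker; a DNA sequence for an origin of replication selected from one for bacteria and one for yeast; and a DNA sequence containing homology to the yeast genome.
- The strain of yeast of claim 1, wherein the strain of yeast is selected from the group consisting of those from the genera Pichia, Candida, Komagataella, Hansenula, Saccharomyces, and Cryptococcus, and combinations thereof.
- The strain of yeast of claim 1, wherein the DNA sequence for collagen is selected from the group consisting of bovine, porcine, kangaroo, alligator, crocodile, elephant, giraffe, zebra, llama, alpaca, lamb, and dinosaur collagen, and combinations thereof.
- The strain of yeast of claim 3, wherein the DNA sequence for collagen is selected from native collagen DNA, engineered collagen DNA, and codon-modified collagen DNA.
- The strain of yeast of claim 1, wherein the DNA sequence for the promoter is selected from the group consisting of DNA for the AOX1 methanol-induced promoter, DNA for the pDF de-repressed promoter, DNA for the pCAT de-repressed promoter, DNA for the Das1-Das2 methanol-induced bidirectional promoter, DNA for the pHTX1 constitutive bidirectional promoter, DNA for the pGCW14-pGAP1 constitutive bidirectional promoter, and combinations thereof.
- The strain of yeast of claim 1, wherein the DNA sequence for the selection marker is selected from the group consisting of DNA for antibiotic resistance and DNA for an auxotrophic marker.
- The strain of yeast of claim 6, wherein the antibiotic resistance marker is selected from the group consisting of resistance to hygromycin, zeocin, geneticin, and combinations thereof.
- The strain of yeast of claim 1, wherein the vector is inserted into the yeast by a method selected from the group consisting of electroporation, chemical transformation, and mating.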
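- A method for producing non-hydroxylated collagen, comprising:
(i) providing the strain of yeast of claim 1; and
(ii) growing the strain in a medium for a period of time sufficient to produce collagen.
- The method of claim 9, wherein the strain of yeast is selected from the group consisting of those from the genera Pichia, Candida, Komagataella, Hansenula, Saccharomyces, and Cryptococcus, and combinations thereof.
- The method of claim 9, wherein the medium is selected from the group consisting of buffered glycerol complex media (BMGY), buffered methanol complex media (BMMY), and yeast extract peptone dextrose (YPD).
- The method of claim 9, wherein the period of time is from 24 hours to 72 hours.
- The method of claim 9, wherein the yeast is selected from the group consisting of those from the genera Pichia, Candida, Komagataella, Hansenula, Saccharomyces, and Cryptococcus, and combinations thereof.
- The method of claim 9, wherein the DNA sequence for collagen is selected from the group consisting of bovine, porcine, kangaroo, alligator, crocodile, elephant, giraffe, zebra, llama, alpaca, lamb, and dinosaur collagen, and combinations thereof.
- The method of claim 9, wherein the DNA sequence for the promoter is selected from the group consisting of DNA for the pHTX1 constitutive bidirectional promoter and DNA for the pGCW14-pGAP1 constitutive bidirectional promoter.
- The method of claim 9, wherein the DNA sequence for the selection marker is selected from the group consisting of DNA for antibiotic resistance and DNA for an auxotrophic marker.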
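- A strain of yeast genetically engineered to produce hydroxylated collagen, comprising:
(i) a strain of yeast;
(ii) a vector, inserted into the strain of yeast, comprising a DNA sequence for collagen; a DNA sequence for a collagen promoter; a DNA sequence for a terminator; a DNA sequence for a selection marker; a DNA sequence for a promoter for the selection marker; a DNA sequence for a terminator for the selection marker; a DNA sequence for an origin of replication for bacteria and/or yeast; and a DNA sequence containing homology to the yeast genome; and
(iii) a second vector, inserted into the strain of yeast, comprising a DNA sequence for P4HA1; a DNA sequence for P4HB; and at least one DNA sequence for a promoter.
- The strain of yeast of claim 17, wherein the strain of yeast is selected from the group consisting of those from the genera Pichia, Candida, Komagataella, Hansenula, Saccharomyces, and Cryptococcus, and combinations thereof.
- The strain of yeast of claim 17, wherein the DNA sequence for collagen is selected from the group consisting of bovine, porcine, kangaroo, alligator, crocodile, elephant, giraffe, zebra, llama, alpaca, lamb, and dinosaur collagen, and combinations thereof.
- The strain of yeast of claim 17, wherein the DNA sequence for collagen is selected from native collagen DNA, engineered collagen DNA, and modified collagen DNA.
- The strain of yeast of claim 17, wherein the DNA sequence for the promoter is selected from the group consisting of DNA for the AOX1 methanol-induced promoter, DNA for the pDF de-repressed promoter, DNA for the pCAT de-repressed promoter, DNA for the Das1-Das2 methanol-induced bidirectional promoter, DNA for the pHTX1 constitutive bidirectional promoter, DNA for the pGCW14-pGAP1 constitutive bidirectional promoter, and combinations thereof.
- The strain of yeast of claim 17, wherein the DNA sequence for the promoter is selected from the group consisting of DNA for the pHTX1 constitutive bidirectional promoter and DNA for the pGCW14-pGAP1 constitutive bidirectional promoter.
- The strain of yeast of claim 17, wherein the DNA sequence for the selection marker is selected from the group consisting of DNA for antibiotic resistance and DNA for an auxotrophic marker.
- The strain of yeast of claim 23, wherein the selection marker is for antibiotic resistance selected from the group consisting of resistance to hygromycin, zeocin, geneticin, and combinations thereof.
- The strain of yeast of claim 17, wherein the vector is inserted into the yeast by a method selected from the group consisting of electroporation, chemical transformation, and mating.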
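- A method for producing hydroxylated collagen, comprising:
(i) providing the strain of yeast of claim 17; and
(ii) growing the strain in a medium for a period of time sufficient to produce collagen.
- The method of claim 26, wherein the strain of yeast is selected from the group consisting of those from the genera Candida, Komagataella, Pichia, Hansenula, Saccharomyces, and Cryptococcus, and combinations thereof.
- The method of claim 26, wherein the medium is selected from the group consisting of BMGY, BMMY, and YPD.
- The method of claim 26, wherein the period of time is from 24 hours to 72 hours.
- The method of claim 26, wherein the yeast is selected from the group consisting of those from the genera Candida, Komagataella, Pichia, Hansenula, Saccharomyces, and Cryptococcus, and combinations thereof.
- The method of claim 26, wherein the DNA for collagen is selected from the group consisting of bovine, porcine, kangaroo, alligator, crocodile, elephant, giraffe, zebra, llama, alpaca, lamb, and dinosaur collagen, and combinations thereof.
- The method of claim 26, wherein the DNA for the promoter is selected from the group consisting of DNA for the pHTX1 constitutive bidirectional promoter and DNA for the pGCW14-pGAP1 constitutive bidirectional promoter.
- The method of claim 26, wherein the DNA for the selection marker is selected from the group consisting of DNA for antibiotic resistance and DNA for an auxotrophic marker.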
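- An all-in-one vector, comprising:
(i) DNA necessary to produce collagen, including a promoter and a terminator;
(ii) DNA for a hydroxylating enzyme selected from the group consisting of P4HA1 and P4HB, including a promoter and a terminator;
(iii) DNA for a selection marker, including a promoter and a terminator;
(iv) DNA for an origin of replication for yeast and bacteria;
(v) DNA with homology to the yeast genome for integration into the genome; and
(vi) restriction sites, allowing for modular cloning, at a position selected from the group consisting of 5', 3', within the above DNA, and combinations thereof.
- The all-in-one vector of claim 34, wherein the DNA sequence necessary to produce collagen is selected from the group consisting of bovine, porcine, kangaroo, alligator, crocodile, elephant, giraffe, zebra, llama, alpaca, lamb, and dinosaur collagen, and combinations thereof.
- The all-in-one vector of claim 34, wherein the DNA sequence for the promoter is selected from the group consisting of DNA for the pHTX1 constitutive bidirectional promoter and DNA for the pGCW14-pGAP1 constitutive bidirectional promoter.
- The all-in-one vector of claim 34, wherein the DNA sequence for the selection marker is DNA for antibiotic resistance and DNA for an auxotrophic marker.
- The all-in-one vector of claim 37, wherein the selection marker is for antibiotic resistance to an antibiotic selected from the group consisting of hygromycin, zeocin, geneticin, and combinations thereof.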
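- A chimeric collagen DNA sequence comprising from 10 to 40% or from 60 to 90% of optimized DNA, based on the total length of the chimeric collagen DNA.
- The chimeric collagen DNA sequence of claim 39, wherein the optimized DNA originates at the C-terminus.
- The chimeric collagen DNA sequence of claim 39, wherein the optimized DNA originates at the N-terminus.
- A collagen-producing strain of yeast, comprising:
a vector comprising a DNA sequence for the chimeric collagen of claim 39;
a DNA sequence for a collagen promoter;
a DNA sequence for a terminator; a DNA sequence for a selection marker;
a DNA sequence for a promoter for the selection marker;
a DNA sequence for a terminator for the selection marker;
a DNA sequence for an origin of replication for bacteria and/or yeast; and
a DNA sequence containing homology to the yeast genome.
- The strain of yeast of claim 42, wherein the DNA for the promoter is selected from the group consisting of DNA for the pHTX1 constitutive bidirectional promoter and DNA for the pGCW14-pGAP1 constitutive bidirectional promoter.
- The strain of yeast of claim 42, wherein the DNA for the selection marker is selected from the group consisting of DNA encoding at least one antibiotic resistance and DNA encoding at least one auxotrophic marker.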
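- A method for producing hydroxylated collagen, comprising:
(i) providing the collagen-producing strain of yeast of claim 42; and
(ii) growing the strain in a medium for a period of time sufficient to produce collagen.
- The method of claim 45, wherein the strain of yeast is selected from the group consisting of those from the genera Pichia, Candida, Komagataella, Hansenula, Saccharomyces, and Cryptococcus, and combinations thereof.
- The method of claim 45, wherein the medium is selected from the group consisting of buffered glycerol complex media, buffered methanol complex media, and yeast extract peptone dextrose.
- The method of claim 45, wherein the period of time is in the range of 24 hours to 72 hours.
- The method of claim 45, wherein the strain of yeast comprises a promoter selected from the group consisting of DNA for the pHTX1 constitutive bidirectional promoter and DNA for the pGCW14-pGAP1 constitutive bidirectional promoter.
- The method of claim 45, wherein the strain of yeast comprises at least one selection marker selected from the group consisting of DNA encoding antibiotic resistance and DNA encoding an auxotrophic marker.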
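Claim 39 defines the chimeric collagen DNA by the share of optimized DNA relative to the total sequence length (10 to 40% or 60 to 90%). The arithmetic can be sketched as follows. The function names and the example lengths are hypothetical, and treating the range endpoints as inclusive is an assumption rather than something the claims state.

```python
def optimized_fraction(optimized_bp: int, total_bp: int) -> float:
    """Return the share of optimized bases in a chimeric sequence."""
    if total_bp <= 0 or not 0 <= optimized_bp <= total_bp:
        raise ValueError("lengths must satisfy 0 <= optimized_bp <= total_bp, total_bp > 0")
    return optimized_bp / total_bp


def in_claimed_range(fraction: float) -> bool:
    """Check the 10-40% or 60-90% windows (endpoints assumed inclusive)."""
    return 0.10 <= fraction <= 0.40 or 0.60 <= fraction <= 0.90


# Hypothetical example: 1,000 optimized bases in a 4,000-base chimeric sequence.
frac = optimized_fraction(1000, 4000)
print(f"{frac:.0%}", in_claimed_range(frac))
```

A sequence that is half optimized (50%) falls between the two windows and would not satisfy this check.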
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US201762539213P | 2017-07-31 | 2017-07-31 | |
US62/539,213 | 2017-07-31 |
Publications (1)
Publication Number | Publication Date |
---|---|
KR20190013627A (ko) | 2019-02-11 |
Family
ID=63254509
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
KR1020180088161A KR20190013627A (ko) | 2017-07-31 | 2018-07-27 | Yeast strains and methods for controlling hydroxylation of recombinant collagen |
Country Status (10)
Country | Link |
---|---|
US (1) | US20190040400A1 (ko) |
EP (1) | EP3438125B1 (ko) |
JP (2) | JP7402604B2 (ko) |
KR (1) | KR20190013627A (ko) |
CN (1) | CN109321480B (ko) |
BR (1) | BR102018015599A2 (ko) |
CA (1) | CA3012006A1 (ko) |
DK (1) | DK3438125T5 (ko) |
ES (1) | ES2967088T3 (ko) |
FI (1) | FI3438125T3 (ko) |
Families Citing this family (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CA3008850A1 (en) | 2017-06-29 | 2018-12-29 | Modern Meadow, Inc. | Yeast strains and methods for producing collagen |
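KR20220139877A (ko) * | 2020-02-14 | 2022-10-17 | 모던 메도우 아이엔씨. | Monomeric proteins for hydroxylation of amino acids, and products |
CN112626074B (zh) * | 2021-01-11 | 2021-11-23 | 肽源(广州)生物科技有限公司 | Hydroxyproline-modified recombinant human type III collagen mature peptide, and preparation method and use thereof |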
CA3174981A1 (en) * | 2021-04-30 | 2022-10-30 | Modern Meadow, Inc. | Collagen compositions and methods of use thereof |
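CN114404667A (zh) * | 2021-12-07 | 2022-04-29 | 尚诚怡美(成都)生物科技有限公司 | Long-acting recombinant human collagen implant and use thereof |
CN114480471A (zh) * | 2021-12-27 | 2022-05-13 | 江苏创健医疗科技有限公司 | Yeast-expressed recombinant human type III triple-helix collagen and preparation method thereof |
CN116375847A (zh) * | 2022-10-26 | 2023-07-04 | 江苏创健医疗科技股份有限公司 | Yeast-expressed recombinant type XVII humanized collagen and preparation method thereof |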
EP4379068A1 (en) | 2022-12-01 | 2024-06-05 | Institut National Des Sciences Appliquées De Rouen | Iridoid or seco-iridoid derivatives and their use in a tanning process |
EP4379067A1 (en) | 2022-12-01 | 2024-06-05 | Institut National Des Sciences Appliquées De Rouen | Iridoid derivatives and their use in a tanning process |
CN116240187B (zh) * | 2023-04-06 | 2024-05-07 | 广东普言生物科技有限公司 | Prolyl hydroxylase α1 subunit mutant, encoding gene, and use thereof in catalyzing proline hydroxylation |
Family Cites Families (14)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20020142391A1 (en) * | 1991-06-12 | 2002-10-03 | Kivirikko Kari I. | Synthesis of human procollagens and collagens in recombinant DNA systems |
US5804407A (en) * | 1993-11-04 | 1998-09-08 | University Technologies International, Inc. | Method of expressing genes in mammalian cells |
US6150081A (en) | 1997-12-24 | 2000-11-21 | Fuji Photo Film B.V. | Silver halide emulsions with recombinant collagen suitable for photographic application and also the preparation thereof |
US6428978B1 (en) | 1998-05-08 | 2002-08-06 | Cohesion Technologies, Inc. | Methods for the production of gelatin and full-length triple helical collagen in recombinant cells |
EP1232182B1 (en) | 1999-11-12 | 2007-10-03 | Fibrogen, Inc. | Bovine collagen and method for producing recombinant gelatin |
AU2005246967B2 (en) * | 1999-11-12 | 2008-09-04 | Fibrogen, Inc. | Recombinant gelatins |
CA2399371A1 (en) * | 1999-11-12 | 2001-05-17 | Fibrogen, Inc. | Animal collagens and gelatins |
KR20060015296A (ko) * | 2003-05-23 | 2006-02-16 | 제넨테크, 인크. | Compositions and methods for the diagnosis and treatment of tumors of glial origin |
US20080081353A1 (en) * | 2006-09-29 | 2008-04-03 | Universite Laval | Production of recombinant human collagen |
ES2536198T3 (es) * | 2008-12-22 | 2015-05-21 | National University Corporation Hokkaido University | Protein substance having a triple helix structure and method for producing the same |
JP2011190210A (ja) * | 2010-03-15 | 2011-09-29 | Sumitomo Chemical Co Ltd | Transformant producing a protein with a glycine repeat sequence |
CN102020712B (zh) * | 2010-11-11 | 2013-05-29 | 北京东方红航天生物技术股份有限公司 | Human-like collagen usable as a vaccine stabilizer and production method thereof |
CN105358694B (zh) * | 2013-03-08 | 2019-06-18 | 凯克应用生命科学研究生院 | Yeast promoters from Pichia pastoris |
CA3008850A1 (en) * | 2017-06-29 | 2018-12-29 | Modern Meadow, Inc. | Yeast strains and methods for producing collagen |
-
2018
- 2018-07-20 CA CA3012006A patent/CA3012006A1/en active Pending
- 2018-07-27 KR KR1020180088161A patent/KR20190013627A/ko not_active Application Discontinuation
- 2018-07-30 US US16/048,920 patent/US20190040400A1/en not_active Abandoned
- 2018-07-30 JP JP2018141979A patent/JP7402604B2/ja active Active
- 2018-07-30 BR BR102018015599-7A patent/BR102018015599A2/pt unknown
- 2018-07-31 FI FIEP18186574.2T patent/FI3438125T3/fi active
- 2018-07-31 EP EP18186574.2A patent/EP3438125B1/en active Active
- 2018-07-31 ES ES18186574T patent/ES2967088T3/es active Active
- 2018-07-31 DK DK18186574.2T patent/DK3438125T5/da active
- 2018-07-31 CN CN201810855911.0A patent/CN109321480B/zh active Active
-
2023
- 2023-09-15 JP JP2023150398A patent/JP2023179488A/ja active Pending
Also Published As
Publication number | Publication date |
---|---|
CN109321480B (zh) | 2024-01-02 |
JP2023179488A (ja) | 2023-12-19 |
EP3438125A1 (en) | 2019-02-06 |
JP2019047774A (ja) | 2019-03-28 |
ES2967088T3 (es) | 2024-04-26 |
CA3012006A1 (en) | 2019-01-31 |
DK3438125T3 (da) | 2024-01-08 |
EP3438125B1 (en) | 2023-10-11 |
CN109321480A (zh) | 2019-02-12 |
BR102018015599A2 (pt) | 2019-03-26 |
US20190040400A1 (en) | 2019-02-07 |
FI3438125T3 (fi) | 2024-01-02 |
DK3438125T5 (da) | 2024-09-16 |
JP7402604B2 (ja) | 2023-12-21 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
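CN109321480B (zh) | Yeast strains and methods for controlling hydroxylation of recombinant collagen | |
CN109207387B (zh) | Yeast strains and methods for producing collagen | |
KR101778174B1 (ko) | Protease screening methods and proteases identified thereby | |
KR20210149060A (ko) | RNA-guided DNA integration using Tn7-like transposons | |
CN101939434B (zh) | DGAT genes from Yarrowia lipolytica for increased seed storage lipid production and altered fatty acid profiles in soybean | |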
AU2022200903B2 (en) | Engineered Cascade components and Cascade complexes | |
KR102528337B1 (ko) | Scalable biotechnological production of DNA single-strand molecules of defined sequence and length | |
JP2010524440A (ja) | Expression system | |
KR20220004959A (ko) | Immunostimulatory bacteria engineered to colonize tumors, tumor-resident immune cells, and the tumor microenvironment | |
KR20220113943A (ko) | Immunostimulatory bacteria delivery platforms and their use for delivery of therapeutic products | |
KR20200126997A (ko) | Compositions and methods for the treatment of non-age-associated hearing impairment in human subjects | |
CN109661403A (zh) | Leader-modified glucoamylase polypeptides and engineered yeast strains having enhanced bioproduct production | |
CN101517074B (zh) | Protease screening methods and proteases identified thereby | |
BRPI0806354A2 (pt) | Transgenic oilseed plants, seeds, oils, food products or food analogs, medical food products or medical food analogs, pharmaceutical products, beverages, infant formulas, nutritional supplements, pet foods, aquaculture feeds, animal feeds, whole-seed products, blended oil products, products, by-products, and partially processed by-products | |
KR20130020842A (ko) | High-throughput screening of genetically modified photosynthetic organisms | |
KR20230066000A (ko) | Immunostimulatory bacteria-based vaccines, therapeutics, and RNA delivery platforms | |
KR20220029676A (ko) | Rare earth element (REE) binding proteins | |
KR20210151916A (ko) | AAV vector-mediated deletion of a large mutational hotspot for treatment of Duchenne muscular dystrophy | |
KR20220157944A (ko) | Compositions and methods for treating non-age-associated hearing impairment in a human subject | |
CN115128266A (zh) | Methods and reagents for detecting autoantibodies | |
CN113943718B (zh) | Glycosyltransferase and use thereof in labeling, imaging, and detection of Tn antigen | |
RU2774631C1 (ru) | Engineered Cascade components and Cascade complexes | |
TW202403048A (zh) | Therapeutic adeno-associated virus with codon-optimized nucleic acid encoding alpha-glucosidase (GAA) with signal peptide modifications for treating Pompe disease | |
TW202305362A (zh) | Methods and means for detecting autoantibodies | |
US20040259122A1 (en) | Autocatalysis/yeast two-hybrid assay |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
E902 | Notification of reason for refusal | ||
E90F | Notification of reason for final refusal |