CN112921052A - In vivo cell proliferation marking and tracing system and application thereof
- Publication number
- CN112921052A (application number CN201911242892.5A)
- Authority
- CN
- China
- Prior art keywords
- recombinase
- sequence
- nucleic acid
- leu
- recognition site
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Links
- 230000004663 cell proliferation Effects 0.000 title claims abstract description 71
- 238000001727 in vivo Methods 0.000 title claims abstract description 37
- 102000018120 Recombinases Human genes 0.000 claims abstract description 110
- 108010091086 Recombinases Proteins 0.000 claims abstract description 110
- 102000040430 polynucleotide Human genes 0.000 claims abstract description 97
- 108091033319 polynucleotide Proteins 0.000 claims abstract description 97
- 239000002157 polynucleotide Substances 0.000 claims abstract description 97
- 150000007523 nucleic acids Chemical class 0.000 claims abstract description 85
- 102000039446 nucleic acids Human genes 0.000 claims abstract description 71
- 108020004707 nucleic acids Proteins 0.000 claims abstract description 71
- 108010038795 estrogen receptors Proteins 0.000 claims abstract description 56
- 102000015694 estrogen receptors Human genes 0.000 claims abstract description 55
- 108091026890 Coding region Proteins 0.000 claims abstract description 48
- 238000000034 method Methods 0.000 claims abstract description 46
- 210000004027 cell Anatomy 0.000 claims description 118
- 108090000623 proteins and genes Proteins 0.000 claims description 66
- 241001465754 Metazoa Species 0.000 claims description 58
- 238000002744 homologous recombination Methods 0.000 claims description 28
- 230000009261 transgenic effect Effects 0.000 claims description 25
- 230000006801 homologous recombination Effects 0.000 claims description 24
- 150000001413 amino acids Chemical class 0.000 claims description 20
- 108020001507 fusion proteins Proteins 0.000 claims description 19
- 102000037865 fusion proteins Human genes 0.000 claims description 19
- 239000003550 marker Substances 0.000 claims description 19
- 108091028043 Nucleic acid sequence Proteins 0.000 claims description 16
- 238000002372 labelling Methods 0.000 claims description 9
- 239000002773 nucleotide Substances 0.000 claims description 9
- 125000003729 nucleotide group Chemical group 0.000 claims description 9
- 230000006798 recombination Effects 0.000 claims description 9
- 102000004190 Enzymes Human genes 0.000 claims description 8
- 108090000790 Enzymes Proteins 0.000 claims description 8
- 238000005215 recombination Methods 0.000 claims description 8
- 230000007774 longterm Effects 0.000 claims description 7
- 108050006400 Cyclin Proteins 0.000 claims description 6
- 230000000295 complement effect Effects 0.000 claims description 5
- 239000003153 chemical reaction reagent Substances 0.000 claims description 4
- 239000000411 inducer Substances 0.000 claims description 4
- 210000004102 animal cell Anatomy 0.000 claims description 3
- FWMNVWWHGCHHJJ-SKKKGAJSSA-N 4-amino-1-[(2r)-6-amino-2-[[(2r)-2-[[(2r)-2-[[(2r)-2-amino-3-phenylpropanoyl]amino]-3-phenylpropanoyl]amino]-4-methylpentanoyl]amino]hexanoyl]piperidine-4-carboxylic acid Chemical compound C([C@H](C(=O)N[C@H](CC(C)C)C(=O)N[C@H](CCCCN)C(=O)N1CCC(N)(CC1)C(O)=O)NC(=O)[C@H](N)CC=1C=CC=CC=1)C1=CC=CC=C1 FWMNVWWHGCHHJJ-SKKKGAJSSA-N 0.000 claims description 2
- 238000012258 culturing Methods 0.000 claims description 2
- 125000003275 alpha amino acid group Chemical group 0.000 claims 3
- 102100036691 Proliferating cell nuclear antigen Human genes 0.000 claims 1
- 230000035755 proliferation Effects 0.000 abstract description 61
- 230000002062 proliferating effect Effects 0.000 abstract description 28
- 239000012634 fragment Substances 0.000 abstract description 21
- 230000008859 change Effects 0.000 abstract description 6
- 108020001756 ligand binding domains Proteins 0.000 abstract description 5
- 108700026220 vif Genes Proteins 0.000 abstract description 4
- NKANXQFJJICGDU-QPLCGJKRSA-N Tamoxifen Chemical compound C=1C=CC=CC=1C(/CC)=C(C=1C=CC(OCCN(C)C)=CC=1)/C1=CC=CC=C1 NKANXQFJJICGDU-QPLCGJKRSA-N 0.000 description 78
- 239000005090 green fluorescent protein Substances 0.000 description 48
- 210000004413 cardiac myocyte Anatomy 0.000 description 46
- 229960001603 tamoxifen Drugs 0.000 description 39
- 241000699670 Mus sp. Species 0.000 description 36
- 108010043121 Green Fluorescent Proteins Proteins 0.000 description 34
- 230000014509 gene expression Effects 0.000 description 33
- 102000004144 Green Fluorescent Proteins Human genes 0.000 description 32
- 210000001519 tissue Anatomy 0.000 description 26
- 241000699666 Mus <mouse, genus> Species 0.000 description 24
- 108020004414 DNA Proteins 0.000 description 21
- 210000003494 hepatocyte Anatomy 0.000 description 18
- 239000013598 vector Substances 0.000 description 18
- 230000006698 induction Effects 0.000 description 17
- 102000004169 proteins and genes Human genes 0.000 description 15
- 108091033409 CRISPR Proteins 0.000 description 13
- 210000002216 heart Anatomy 0.000 description 13
- 230000001939 inductive effect Effects 0.000 description 11
- 210000000056 organ Anatomy 0.000 description 11
- 230000002068 genetic effect Effects 0.000 description 10
- 210000004185 liver Anatomy 0.000 description 10
- 238000003125 immunofluorescent labeling Methods 0.000 description 9
- 108090000765 processed proteins & peptides Proteins 0.000 description 9
- 238000006467 substitution reaction Methods 0.000 description 9
- 238000005516 engineering process Methods 0.000 description 8
- 239000003446 ligand Substances 0.000 description 8
- 230000032823 cell division Effects 0.000 description 7
- 230000000694 effects Effects 0.000 description 7
- 239000013604 expression vector Substances 0.000 description 7
- 238000010348 incorporation Methods 0.000 description 7
- 238000010186 staining Methods 0.000 description 7
- 108020005004 Guide RNA Proteins 0.000 description 6
- -1 fatty acid amino acids Chemical class 0.000 description 6
- 238000013518 transcription Methods 0.000 description 6
- 230000035897 transcription Effects 0.000 description 6
- 108010051219 Cre recombinase Proteins 0.000 description 5
- 102100038595 Estrogen receptor Human genes 0.000 description 5
- 102000009339 Proliferating Cell Nuclear Antigen Human genes 0.000 description 5
- 108700008625 Reporter Genes Proteins 0.000 description 5
- 230000009471 action Effects 0.000 description 5
- 108010069205 aspartyl-phenylalanine Proteins 0.000 description 5
- 238000006243 chemical reaction Methods 0.000 description 5
- 238000010276 construction Methods 0.000 description 5
- 210000003527 eukaryotic cell Anatomy 0.000 description 5
- 230000013011 mating Effects 0.000 description 5
- 239000000700 radioactive tracer Substances 0.000 description 5
- 238000011160 research Methods 0.000 description 5
- 210000000813 small intestine Anatomy 0.000 description 5
- 230000002861 ventricular Effects 0.000 description 5
- 238000010354 CRISPR gene editing Methods 0.000 description 4
- 241000588724 Escherichia coli Species 0.000 description 4
- 238000007796 conventional method Methods 0.000 description 4
- 235000013601 eggs Nutrition 0.000 description 4
- 230000002440 hepatic effect Effects 0.000 description 4
- 230000035772 mutation Effects 0.000 description 4
- 239000013612 plasmid Substances 0.000 description 4
- 229920001184 polypeptide Polymers 0.000 description 4
- 102000004196 processed proteins & peptides Human genes 0.000 description 4
- 238000011830 transgenic mouse model Methods 0.000 description 4
- FADYJNXDPBKVCA-UHFFFAOYSA-N L-Phenylalanyl-L-lysin Natural products NCCCCC(C(O)=O)NC(=O)C(N)CC1=CC=CC=C1 FADYJNXDPBKVCA-UHFFFAOYSA-N 0.000 description 3
- 241000880493 Leptailurus serval Species 0.000 description 3
- QNBVTHNJGCOVFA-AVGNSLFASA-N Leu-Leu-Glu Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@H](C(O)=O)CCC(O)=O QNBVTHNJGCOVFA-AVGNSLFASA-N 0.000 description 3
- 101710163270 Nuclease Proteins 0.000 description 3
- 241000700605 Viruses Species 0.000 description 3
- 108010005233 alanylglutamic acid Proteins 0.000 description 3
- 108010047857 aspartylglycine Proteins 0.000 description 3
- 230000001580 bacterial effect Effects 0.000 description 3
- 239000003795 chemical substances by application Substances 0.000 description 3
- 238000001514 detection method Methods 0.000 description 3
- 229940011871 estrogen Drugs 0.000 description 3
- 239000000262 estrogen Substances 0.000 description 3
- VPZXBVLAVMBEQI-UHFFFAOYSA-N glycyl-DL-alpha-alanine Natural products OC(=O)C(C)NC(=O)CN VPZXBVLAVMBEQI-UHFFFAOYSA-N 0.000 description 3
- 108010050848 glycylleucine Proteins 0.000 description 3
- 238000000338 in vitro Methods 0.000 description 3
- 238000002347 injection Methods 0.000 description 3
- 239000007924 injection Substances 0.000 description 3
- 108010057821 leucylproline Proteins 0.000 description 3
- 210000005229 liver cell Anatomy 0.000 description 3
- 108020004999 messenger RNA Proteins 0.000 description 3
- 230000002107 myocardial effect Effects 0.000 description 3
- 108010051242 phenylalanylserine Proteins 0.000 description 3
- 230000009466 transformation Effects 0.000 description 3
- 238000011714 129 mouse Methods 0.000 description 2
- 108020005345 3' Untranslated Regions Proteins 0.000 description 2
- LZRNYBIJOSKKRJ-XVYDVKMFSA-N Ala-Asp-His Chemical compound C[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)N LZRNYBIJOSKKRJ-XVYDVKMFSA-N 0.000 description 2
- RYQSYXFGFOTJDJ-RHYQMDGZSA-N Arg-Thr-Leu Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(C)C)C(O)=O RYQSYXFGFOTJDJ-RHYQMDGZSA-N 0.000 description 2
- QCTOLCVIGRLMQS-HRCADAONSA-N Arg-Tyr-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CC2=CC=C(C=C2)O)NC(=O)[C@H](CCCN=C(N)N)N)C(=O)O QCTOLCVIGRLMQS-HRCADAONSA-N 0.000 description 2
- WBDWQKRLTVCDSY-WHFBIAKZSA-N Asp-Gly-Asp Chemical compound OC(=O)C[C@H](N)C(=O)NCC(=O)N[C@@H](CC(O)=O)C(O)=O WBDWQKRLTVCDSY-WHFBIAKZSA-N 0.000 description 2
- OMMIEVATLAGRCK-BYPYZUCNSA-N Asp-Gly-Gly Chemical compound OC(=O)C[C@H](N)C(=O)NCC(=O)NCC(O)=O OMMIEVATLAGRCK-BYPYZUCNSA-N 0.000 description 2
- LNENWJXDHCFVOF-DCAQKATOSA-N Asp-His-Met Chemical compound CSCC[C@@H](C(=O)O)NC(=O)[C@H](CC1=CN=CN1)NC(=O)[C@H](CC(=O)O)N LNENWJXDHCFVOF-DCAQKATOSA-N 0.000 description 2
- XMKXONRMGJXCJV-LAEOZQHASA-N Asp-Val-Glu Chemical compound OC(=O)C[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O XMKXONRMGJXCJV-LAEOZQHASA-N 0.000 description 2
- 238000011746 C57BL/6J (JAX™ mouse strain) Methods 0.000 description 2
- 108091033380 Coding strand Proteins 0.000 description 2
- 108700007698 Genetic Terminator Regions Proteins 0.000 description 2
- LXAUHIRMWXQRKI-XHNCKOQMSA-N Glu-Asn-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CC(=O)N)NC(=O)[C@H](CCC(=O)O)N)C(=O)O LXAUHIRMWXQRKI-XHNCKOQMSA-N 0.000 description 2
- MTAOBYXRYJZRGQ-WDSKDSINSA-N Glu-Gly-Asp Chemical compound OC(=O)CC[C@H](N)C(=O)NCC(=O)N[C@@H](CC(O)=O)C(O)=O MTAOBYXRYJZRGQ-WDSKDSINSA-N 0.000 description 2
- IVGJYOOGJLFKQE-AVGNSLFASA-N Glu-Leu-Lys Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CCC(=O)O)N IVGJYOOGJLFKQE-AVGNSLFASA-N 0.000 description 2
- DHDOADIPGZTAHT-YUMQZZPRSA-N Gly-Glu-Arg Chemical compound NCC(=O)N[C@@H](CCC(O)=O)C(=O)N[C@H](C(O)=O)CCCN=C(N)N DHDOADIPGZTAHT-YUMQZZPRSA-N 0.000 description 2
- MHXKHKWHPNETGG-QWRGUYRKSA-N Gly-Lys-Leu Chemical compound [H]NCC(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(O)=O MHXKHKWHPNETGG-QWRGUYRKSA-N 0.000 description 2
- BMWFDYIYBAFROD-WPRPVWTQSA-N Gly-Pro-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1C(=O)CN BMWFDYIYBAFROD-WPRPVWTQSA-N 0.000 description 2
- NGRPGJGKJMUGDM-XVKPBYJWSA-N Gly-Val-Gln Chemical compound NCC(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CCC(N)=O)C(O)=O NGRPGJGKJMUGDM-XVKPBYJWSA-N 0.000 description 2
- JBCLFWXMTIKCCB-UHFFFAOYSA-N H-Gly-Phe-OH Natural products NCC(=O)NC(C(O)=O)CC1=CC=CC=C1 JBCLFWXMTIKCCB-UHFFFAOYSA-N 0.000 description 2
- IGBBXBFSLKRHJB-BZSNNMDCSA-N His-Lys-Phe Chemical compound C([C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC=1C=CC=CC=1)C(O)=O)C1=CN=CN1 IGBBXBFSLKRHJB-BZSNNMDCSA-N 0.000 description 2
- 101000851334 Homo sapiens Troponin I, cardiac muscle Proteins 0.000 description 2
- 108020004684 Internal Ribosome Entry Sites Proteins 0.000 description 2
- PPGBXYKMUMHFBF-KATARQTJSA-N Leu-Ser-Thr Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)O)C(O)=O PPGBXYKMUMHFBF-KATARQTJSA-N 0.000 description 2
- AIMGJYMCTAABEN-GVXVVHGQSA-N Leu-Val-Glu Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O AIMGJYMCTAABEN-GVXVVHGQSA-N 0.000 description 2
- QESXLSQLQHHTIX-RHYQMDGZSA-N Leu-Val-Thr Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O QESXLSQLQHHTIX-RHYQMDGZSA-N 0.000 description 2
- GQUDMNDPQTXZRV-DCAQKATOSA-N Lys-Arg-Asp Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(O)=O)C(O)=O GQUDMNDPQTXZRV-DCAQKATOSA-N 0.000 description 2
- SKRGVGLIRUGANF-AVGNSLFASA-N Lys-Leu-Glu Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O SKRGVGLIRUGANF-AVGNSLFASA-N 0.000 description 2
- TWRXJAOTZQYOKJ-UHFFFAOYSA-L Magnesium chloride Chemical compound [Mg+2].[Cl-].[Cl-] TWRXJAOTZQYOKJ-UHFFFAOYSA-L 0.000 description 2
- SODXFJOPSCXOHE-IHRRRGAJSA-N Met-Leu-Leu Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(C)C)C(O)=O SODXFJOPSCXOHE-IHRRRGAJSA-N 0.000 description 2
- 241000699660 Mus musculus Species 0.000 description 2
- 108010079364 N-glycylalanine Proteins 0.000 description 2
- 108010002311 N-glycylglutamic acid Proteins 0.000 description 2
- RBRNEFJTEHPDSL-ACRUOGEOSA-N Phe-Phe-Lys Chemical compound C([C@@H](C(=O)N[C@@H](CCCCN)C(O)=O)NC(=O)[C@@H](N)CC=1C=CC=CC=1)C1=CC=CC=C1 RBRNEFJTEHPDSL-ACRUOGEOSA-N 0.000 description 2
- 101800001494 Protease 2A Proteins 0.000 description 2
- 101800001066 Protein 2A Proteins 0.000 description 2
- 101710200251 Recombinase cre Proteins 0.000 description 2
- 240000004808 Saccharomyces cerevisiae Species 0.000 description 2
- UQFYNFTYDHUIMI-WHFBIAKZSA-N Ser-Gly-Ala Chemical compound OC(=O)[C@H](C)NC(=O)CNC(=O)[C@@H](N)CO UQFYNFTYDHUIMI-WHFBIAKZSA-N 0.000 description 2
- GZFAWAQTEYDKII-YUMQZZPRSA-N Ser-Gly-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)CNC(=O)[C@@H](N)CO GZFAWAQTEYDKII-YUMQZZPRSA-N 0.000 description 2
- XNCUYZKGQOCOQH-YUMQZZPRSA-N Ser-Leu-Gly Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(=O)NCC(O)=O XNCUYZKGQOCOQH-YUMQZZPRSA-N 0.000 description 2
- GXUWHVZYDAHFSV-FLBSBUHZSA-N Thr-Ile-Thr Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H]([C@@H](C)O)C(O)=O GXUWHVZYDAHFSV-FLBSBUHZSA-N 0.000 description 2
- MYNYCUXMIIWUNW-IEGACIPQSA-N Thr-Trp-Lys Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC1=CNC2=CC=CC=C21)C(=O)N[C@@H](CCCCN)C(=O)O)N)O MYNYCUXMIIWUNW-IEGACIPQSA-N 0.000 description 2
- 102100036859 Troponin I, cardiac muscle Human genes 0.000 description 2
- COYSIHFOCOMGCF-UHFFFAOYSA-N Val-Arg-Gly Natural products CC(C)C(N)C(=O)NC(C(=O)NCC(O)=O)CCCN=C(N)N COYSIHFOCOMGCF-UHFFFAOYSA-N 0.000 description 2
- KTEZUXISLQTDDQ-NHCYSSNCSA-N Val-Lys-Asp Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(=O)O)C(=O)O)N KTEZUXISLQTDDQ-NHCYSSNCSA-N 0.000 description 2
- 238000009825 accumulation Methods 0.000 description 2
- 108010087924 alanylproline Proteins 0.000 description 2
- KOSRFJWDECSPRO-UHFFFAOYSA-N alpha-L-glutamyl-L-glutamic acid Natural products OC(=O)CCC(N)C(=O)NC(CCC(O)=O)C(O)=O KOSRFJWDECSPRO-UHFFFAOYSA-N 0.000 description 2
- 108010009111 arginyl-glycyl-glutamic acid Proteins 0.000 description 2
- 108010069926 arginyl-glycyl-serine Proteins 0.000 description 2
- 108010038633 aspartylglutamate Proteins 0.000 description 2
- 108010092854 aspartyllysine Proteins 0.000 description 2
- 230000017531 blood circulation Effects 0.000 description 2
- 210000004556 brain Anatomy 0.000 description 2
- 239000002771 cell marker Substances 0.000 description 2
- 238000010367 cloning Methods 0.000 description 2
- 108010082025 cyan fluorescent protein Proteins 0.000 description 2
- 238000010586 diagram Methods 0.000 description 2
- 238000004520 electroporation Methods 0.000 description 2
- 230000002538 fungal effect Effects 0.000 description 2
- 238000003198 gene knock in Methods 0.000 description 2
- 108010042598 glutamyl-aspartyl-glycine Proteins 0.000 description 2
- 108010055341 glutamyl-glutamic acid Proteins 0.000 description 2
- 108010081551 glycylphenylalanine Proteins 0.000 description 2
- 108010037850 glycylvaline Proteins 0.000 description 2
- 210000005003 heart tissue Anatomy 0.000 description 2
- 108010025306 histidylleucine Proteins 0.000 description 2
- 238000010166 immunofluorescence Methods 0.000 description 2
- 238000003780 insertion Methods 0.000 description 2
- 230000037431 insertion Effects 0.000 description 2
- 230000003834 intracellular effect Effects 0.000 description 2
- 210000005228 liver tissue Anatomy 0.000 description 2
- 108010003700 lysyl aspartic acid Proteins 0.000 description 2
- 108010012988 lysyl-glutamyl-aspartyl-glycine Proteins 0.000 description 2
- 108010064235 lysylglycine Proteins 0.000 description 2
- 108010017391 lysylvaline Proteins 0.000 description 2
- 210000004962 mammalian cell Anatomy 0.000 description 2
- 230000002503 metabolic effect Effects 0.000 description 2
- 230000007102 metabolic function Effects 0.000 description 2
- 239000003147 molecular marker Substances 0.000 description 2
- 230000008569 process Effects 0.000 description 2
- 210000001236 prokaryotic cell Anatomy 0.000 description 2
- 108010031719 prolyl-serine Proteins 0.000 description 2
- 238000003259 recombinant expression Methods 0.000 description 2
- 230000001105 regulatory effect Effects 0.000 description 2
- 239000000523 sample Substances 0.000 description 2
- 108010069117 seryl-lysyl-aspartic acid Proteins 0.000 description 2
- 108010071207 serylmethionine Proteins 0.000 description 2
- 230000008685 targeting Effects 0.000 description 2
- 108010061238 threonyl-glycine Proteins 0.000 description 2
- 238000003151 transfection method Methods 0.000 description 2
- 108700004896 tripeptide FEG Proteins 0.000 description 2
- 241001430294 unidentified retrovirus Species 0.000 description 2
- 238000010200 validation analysis Methods 0.000 description 2
- 108091005957 yellow fluorescent proteins Proteins 0.000 description 2
- 108091032973 (ribonucleotides)n+m Proteins 0.000 description 1
- ONEGZXHXCLCVRF-UHFFFAOYSA-N 2-[[2-[[1-(2-amino-3-methylbutanoyl)pyrrolidine-2-carbonyl]amino]-3-methylbutanoyl]amino]-4-methylpentanoic acid Chemical compound CC(C)CC(C(O)=O)NC(=O)C(C(C)C)NC(=O)C1CCCN1C(=O)C(N)C(C)C ONEGZXHXCLCVRF-UHFFFAOYSA-N 0.000 description 1
- AAQGRPOPTAUUBM-ZLUOBGJFSA-N Ala-Ala-Asn Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](C)C(=O)N[C@@H](CC(N)=O)C(O)=O AAQGRPOPTAUUBM-ZLUOBGJFSA-N 0.000 description 1
- YLTKNGYYPIWKHZ-ACZMJKKPSA-N Ala-Ala-Glu Chemical compound C[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CCC(O)=O YLTKNGYYPIWKHZ-ACZMJKKPSA-N 0.000 description 1
- LGQPPBQRUBVTIF-JBDRJPRFSA-N Ala-Ala-Ile Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](C)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O LGQPPBQRUBVTIF-JBDRJPRFSA-N 0.000 description 1
- SVBXIUDNTRTKHE-CIUDSAMLSA-N Ala-Arg-Glu Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCC(O)=O)C(O)=O SVBXIUDNTRTKHE-CIUDSAMLSA-N 0.000 description 1
- JBGSZRYCXBPWGX-BQBZGAKWSA-N Ala-Arg-Gly Chemical compound OC(=O)CNC(=O)[C@@H](NC(=O)[C@@H](N)C)CCCN=C(N)N JBGSZRYCXBPWGX-BQBZGAKWSA-N 0.000 description 1
- FXKNPWNXPQZLES-ZLUOBGJFSA-N Ala-Asn-Ser Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CO)C(O)=O FXKNPWNXPQZLES-ZLUOBGJFSA-N 0.000 description 1
- PBAMJJXWDQXOJA-FXQIFTODSA-N Ala-Asp-Arg Chemical compound C[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@H](C(O)=O)CCCN=C(N)N PBAMJJXWDQXOJA-FXQIFTODSA-N 0.000 description 1
- ZIWWTZWAKYBUOB-CIUDSAMLSA-N Ala-Asp-Leu Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(C)C)C(O)=O ZIWWTZWAKYBUOB-CIUDSAMLSA-N 0.000 description 1
- KUDREHRZRIVKHS-UWJYBYFXSA-N Ala-Asp-Tyr Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O KUDREHRZRIVKHS-UWJYBYFXSA-N 0.000 description 1
- RCQRKPUXJAGEEC-ZLUOBGJFSA-N Ala-Cys-Cys Chemical compound C[C@H](N)C(=O)N[C@@H](CS)C(=O)N[C@@H](CS)C(O)=O RCQRKPUXJAGEEC-ZLUOBGJFSA-N 0.000 description 1
- BLGHHPHXVJWCNK-GUBZILKMSA-N Ala-Gln-Leu Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(C)C)C(O)=O BLGHHPHXVJWCNK-GUBZILKMSA-N 0.000 description 1
- LMFXXZPPZDCPTA-ZKWXMUAHSA-N Ala-Gly-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)CNC(=O)[C@H](C)N LMFXXZPPZDCPTA-ZKWXMUAHSA-N 0.000 description 1
- PCIFXPRIFWKWLK-YUMQZZPRSA-N Ala-Gly-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)CNC(=O)[C@H](C)N PCIFXPRIFWKWLK-YUMQZZPRSA-N 0.000 description 1
- OKIKVSXTXVVFDV-MMWGEVLESA-N Ala-Ile-Pro Chemical compound CC[C@H](C)[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](C)N OKIKVSXTXVVFDV-MMWGEVLESA-N 0.000 description 1
- CCDFBRZVTDDJNM-GUBZILKMSA-N Ala-Leu-Glu Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O CCDFBRZVTDDJNM-GUBZILKMSA-N 0.000 description 1
- SOBIAADAMRHGKH-CIUDSAMLSA-N Ala-Leu-Ser Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(O)=O SOBIAADAMRHGKH-CIUDSAMLSA-N 0.000 description 1
- SDZRIBWEVVRDQI-CIUDSAMLSA-N Ala-Lys-Asp Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(O)=O)C(O)=O SDZRIBWEVVRDQI-CIUDSAMLSA-N 0.000 description 1
- PIXQDIGKDNNOOV-GUBZILKMSA-N Ala-Lys-Gln Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCC(N)=O)C(O)=O PIXQDIGKDNNOOV-GUBZILKMSA-N 0.000 description 1
- CNQAFFMNJIQYGX-DRZSPHRISA-N Ala-Phe-Glu Chemical compound OC(=O)CC[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@@H](N)C)CC1=CC=CC=C1 CNQAFFMNJIQYGX-DRZSPHRISA-N 0.000 description 1
- IPZQNYYAYVRKKK-FXQIFTODSA-N Ala-Pro-Ala Chemical compound C[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](C)C(O)=O IPZQNYYAYVRKKK-FXQIFTODSA-N 0.000 description 1
- XWFWAXPOLRTDFZ-FXQIFTODSA-N Ala-Pro-Ser Chemical compound C[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CO)C(O)=O XWFWAXPOLRTDFZ-FXQIFTODSA-N 0.000 description 1
- WQKAQKZRDIZYNV-VZFHVOOUSA-N Ala-Ser-Thr Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)O)C(O)=O WQKAQKZRDIZYNV-VZFHVOOUSA-N 0.000 description 1
- QOIGKCBMXUCDQU-KDXUFGMBSA-N Ala-Thr-Pro Chemical compound C[C@H]([C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](C)N)O QOIGKCBMXUCDQU-KDXUFGMBSA-N 0.000 description 1
- KTXKIYXZQFWJKB-VZFHVOOUSA-N Ala-Thr-Ser Chemical compound [H]N[C@@H](C)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CO)C(O)=O KTXKIYXZQFWJKB-VZFHVOOUSA-N 0.000 description 1
- IDLBLNBDLCTPGC-HERUPUMHSA-N Ala-Trp-Cys Chemical compound C[C@@H](C(=O)N[C@@H](CC1=CNC2=CC=CC=C21)C(=O)N[C@@H](CS)C(=O)O)N IDLBLNBDLCTPGC-HERUPUMHSA-N 0.000 description 1
- VQBULXOHAZSTQY-GKCIPKSASA-N Ala-Trp-Phe Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O VQBULXOHAZSTQY-GKCIPKSASA-N 0.000 description 1
- AENHOIXXHKNIQL-AUTRQRHGSA-N Ala-Tyr-Ala Chemical compound [O-]C(=O)[C@H](C)NC(=O)[C@@H](NC(=O)[C@@H]([NH3+])C)CC1=CC=C(O)C=C1 AENHOIXXHKNIQL-AUTRQRHGSA-N 0.000 description 1
- XCIGOVDXZULBBV-DCAQKATOSA-N Ala-Val-Lys Chemical compound CC(C)[C@H](NC(=O)[C@H](C)N)C(=O)N[C@@H](CCCCN)C(O)=O XCIGOVDXZULBBV-DCAQKATOSA-N 0.000 description 1
- VBFJESQBIWCWRL-DCAQKATOSA-N Arg-Ala-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)[C@@H](N)CCCNC(N)=N VBFJESQBIWCWRL-DCAQKATOSA-N 0.000 description 1
- IASNWHAGGYTEKX-IUCAKERBSA-N Arg-Arg-Gly Chemical compound NC(N)=NCCC[C@H](N)C(=O)N[C@@H](CCCN=C(N)N)C(=O)NCC(O)=O IASNWHAGGYTEKX-IUCAKERBSA-N 0.000 description 1
- QPOARHANPULOTM-GMOBBJLQSA-N Arg-Asn-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)[C@H](CCCN=C(N)N)N QPOARHANPULOTM-GMOBBJLQSA-N 0.000 description 1
- KWTVWJPNHAOREN-IHRRRGAJSA-N Arg-Asn-Phe Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O KWTVWJPNHAOREN-IHRRRGAJSA-N 0.000 description 1
- XVLLUZMFSAYKJV-GUBZILKMSA-N Arg-Asp-Arg Chemical compound NC(N)=NCCC[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CCCN=C(N)N)C(O)=O XVLLUZMFSAYKJV-GUBZILKMSA-N 0.000 description 1
- QAXCZGMLVICQKS-SRVKXCTJSA-N Arg-Glu-His Chemical compound C1=C(NC=N1)C[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)O)NC(=O)[C@H](CCCN=C(N)N)N QAXCZGMLVICQKS-SRVKXCTJSA-N 0.000 description 1
- QEHMMRSQJMOYNO-DCAQKATOSA-N Arg-His-Asn Chemical compound C1=C(NC=N1)C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@H](CCCN=C(N)N)N QEHMMRSQJMOYNO-DCAQKATOSA-N 0.000 description 1
- GFMWTFHOZGLTLC-AVGNSLFASA-N Arg-His-Met Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CCSC)C(O)=O GFMWTFHOZGLTLC-AVGNSLFASA-N 0.000 description 1
- NVUIWHJLPSZZQC-CYDGBPFRSA-N Arg-Ile-Arg Chemical compound NC(N)=NCCC[C@H](N)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CCCN=C(N)N)C(O)=O NVUIWHJLPSZZQC-CYDGBPFRSA-N 0.000 description 1
- HCIUUZGFTDTEGM-NAKRPEOUSA-N Arg-Ile-Cys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CS)C(=O)O)NC(=O)[C@H](CCCN=C(N)N)N HCIUUZGFTDTEGM-NAKRPEOUSA-N 0.000 description 1
- UHFUZWSZQKMDSX-DCAQKATOSA-N Arg-Leu-Asn Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@H](CCCN=C(N)N)N UHFUZWSZQKMDSX-DCAQKATOSA-N 0.000 description 1
- NOZYDJOPOGKUSR-AVGNSLFASA-N Arg-Leu-Met Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCSC)C(O)=O NOZYDJOPOGKUSR-AVGNSLFASA-N 0.000 description 1
- SSZGOKWBHLOCHK-DCAQKATOSA-N Arg-Lys-Asn Chemical compound NC(=O)C[C@@H](C(O)=O)NC(=O)[C@H](CCCCN)NC(=O)[C@@H](N)CCCN=C(N)N SSZGOKWBHLOCHK-DCAQKATOSA-N 0.000 description 1
- OISWSORSLQOGFV-AVGNSLFASA-N Arg-Met-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CCSC)NC(=O)[C@@H](N)CCCN=C(N)N OISWSORSLQOGFV-AVGNSLFASA-N 0.000 description 1
- AMIQZQAAYGYKOP-FXQIFTODSA-N Arg-Ser-Asn Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(N)=O)C(O)=O AMIQZQAAYGYKOP-FXQIFTODSA-N 0.000 description 1
- DNLQVHBBMPZUGJ-BQBZGAKWSA-N Arg-Ser-Gly Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CO)C(=O)NCC(O)=O DNLQVHBBMPZUGJ-BQBZGAKWSA-N 0.000 description 1
- JQHASVQBAKRJKD-GUBZILKMSA-N Arg-Ser-Met Chemical compound CSCC[C@@H](C(=O)O)NC(=O)[C@H](CO)NC(=O)[C@H](CCCN=C(N)N)N JQHASVQBAKRJKD-GUBZILKMSA-N 0.000 description 1
- WCZXPVPHUMYLMS-VEVYYDQMSA-N Arg-Thr-Asp Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(O)=O)C(O)=O WCZXPVPHUMYLMS-VEVYYDQMSA-N 0.000 description 1
- YNSUUAOAFCVINY-OSUNSFLBSA-N Arg-Thr-Ile Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O YNSUUAOAFCVINY-OSUNSFLBSA-N 0.000 description 1
- UVTGNSWSRSCPLP-UHFFFAOYSA-N Arg-Tyr Natural products NC(CCNC(=N)N)C(=O)NC(Cc1ccc(O)cc1)C(=O)O UVTGNSWSRSCPLP-UHFFFAOYSA-N 0.000 description 1
- QMQZYILAWUOLPV-JYJNAYRXSA-N Arg-Tyr-Arg Chemical compound NC(N)=NCCC[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](CCCN=C(N)N)C(O)=O)CC1=CC=C(O)C=C1 QMQZYILAWUOLPV-JYJNAYRXSA-N 0.000 description 1
- PJOPLXOCKACMLK-KKUMJFAQSA-N Arg-Tyr-Glu Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CCC(O)=O)C(O)=O PJOPLXOCKACMLK-KKUMJFAQSA-N 0.000 description 1
- VYZBPPBKFCHCIS-WPRPVWTQSA-N Arg-Val-Gly Chemical compound OC(=O)CNC(=O)[C@H](C(C)C)NC(=O)[C@@H](N)CCCN=C(N)N VYZBPPBKFCHCIS-WPRPVWTQSA-N 0.000 description 1
- QLSRIZIDQXDQHK-RCWTZXSCSA-N Arg-Val-Thr Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O QLSRIZIDQXDQHK-RCWTZXSCSA-N 0.000 description 1
- MEFGKQUUYZOLHM-GMOBBJLQSA-N Asn-Arg-Ile Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O MEFGKQUUYZOLHM-GMOBBJLQSA-N 0.000 description 1
- DQTIWTULBGLJBL-DCAQKATOSA-N Asn-Arg-Lys Chemical compound C(CCN)C[C@@H](C(=O)O)NC(=O)[C@H](CCCN=C(N)N)NC(=O)[C@H](CC(=O)N)N DQTIWTULBGLJBL-DCAQKATOSA-N 0.000 description 1
- GOVUDFOGXOONFT-VEVYYDQMSA-N Asn-Arg-Thr Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H]([C@@H](C)O)C(O)=O GOVUDFOGXOONFT-VEVYYDQMSA-N 0.000 description 1
- FANGHKQYFPYDNB-UBHSHLNASA-N Asn-Asp-Trp Chemical compound C1=CC=C2C(=C1)C(=CN2)C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)O)NC(=O)[C@H](CC(=O)N)N FANGHKQYFPYDNB-UBHSHLNASA-N 0.000 description 1
- VJTWLBMESLDOMK-WDSKDSINSA-N Asn-Gln-Gly Chemical compound NC(=O)C[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(=O)NCC(O)=O VJTWLBMESLDOMK-WDSKDSINSA-N 0.000 description 1
- HYQYLOSCICEYTR-YUMQZZPRSA-N Asn-Gly-Leu Chemical compound [H]N[C@@H](CC(N)=O)C(=O)NCC(=O)N[C@@H](CC(C)C)C(O)=O HYQYLOSCICEYTR-YUMQZZPRSA-N 0.000 description 1
- WQLJRNRLHWJIRW-KKUMJFAQSA-N Asn-His-Tyr Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)O)NC(=O)[C@H](CC2=CN=CN2)NC(=O)[C@H](CC(=O)N)N)O WQLJRNRLHWJIRW-KKUMJFAQSA-N 0.000 description 1
- UYXXMIZGHYKYAT-NHCYSSNCSA-N Asn-His-Val Chemical compound CC(C)[C@@H](C(=O)O)NC(=O)[C@H](CC1=CN=CN1)NC(=O)[C@H](CC(=O)N)N UYXXMIZGHYKYAT-NHCYSSNCSA-N 0.000 description 1
- GLWFAWNYGWBMOC-SRVKXCTJSA-N Asn-Leu-Leu Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(C)C)C(O)=O GLWFAWNYGWBMOC-SRVKXCTJSA-N 0.000 description 1
- NYGILGUOUOXGMJ-YUMQZZPRSA-N Asn-Lys-Gly Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CCCCN)C(=O)NCC(O)=O NYGILGUOUOXGMJ-YUMQZZPRSA-N 0.000 description 1
- AEZCCDMZZJOGII-DCAQKATOSA-N Asn-Met-Leu Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CC(C)C)C(O)=O AEZCCDMZZJOGII-DCAQKATOSA-N 0.000 description 1
- YUUIAUXBNOHFRJ-IHRRRGAJSA-N Asn-Phe-Met Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCSC)C(O)=O YUUIAUXBNOHFRJ-IHRRRGAJSA-N 0.000 description 1
- DOURAOODTFJRIC-CIUDSAMLSA-N Asn-Ser-His Chemical compound C1=C(NC=N1)C[C@@H](C(=O)O)NC(=O)[C@H](CO)NC(=O)[C@H](CC(=O)N)N DOURAOODTFJRIC-CIUDSAMLSA-N 0.000 description 1
- MKJBPDLENBUHQU-CIUDSAMLSA-N Asn-Ser-Leu Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(O)=O MKJBPDLENBUHQU-CIUDSAMLSA-N 0.000 description 1
- FMNBYVSGRCXWEK-FOHZUACHSA-N Asn-Thr-Gly Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)NCC(O)=O FMNBYVSGRCXWEK-FOHZUACHSA-N 0.000 description 1
- WUQXMTITJLFXAU-JIOCBJNQSA-N Asn-Thr-Pro Chemical compound C[C@H]([C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CC(=O)N)N)O WUQXMTITJLFXAU-JIOCBJNQSA-N 0.000 description 1
- IPPFAOCLQSGHJV-WFBYXXMGSA-N Asn-Trp-Ala Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)N[C@@H](C)C(O)=O IPPFAOCLQSGHJV-WFBYXXMGSA-N 0.000 description 1
- RTFXPCYMDYBZNQ-SRVKXCTJSA-N Asn-Tyr-Asn Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(N)=O)C(O)=O RTFXPCYMDYBZNQ-SRVKXCTJSA-N 0.000 description 1
- DATSKXOXPUAOLK-KKUMJFAQSA-N Asn-Tyr-Leu Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(C)C)C(O)=O DATSKXOXPUAOLK-KKUMJFAQSA-N 0.000 description 1
- QNNBHTFDFFFHGC-KKUMJFAQSA-N Asn-Tyr-Lys Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CC(=O)N)N)O QNNBHTFDFFFHGC-KKUMJFAQSA-N 0.000 description 1
- GHWWTICYPDKPTE-NGZCFLSTSA-N Asn-Val-Pro Chemical compound CC(C)[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CC(=O)N)N GHWWTICYPDKPTE-NGZCFLSTSA-N 0.000 description 1
- HBUJSDCLZCXXCW-YDHLFZDLSA-N Asn-Val-Tyr Chemical compound NC(=O)C[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 HBUJSDCLZCXXCW-YDHLFZDLSA-N 0.000 description 1
- ZVTDYGWRRPMFCL-WFBYXXMGSA-N Asp-Ala-Trp Chemical compound C[C@@H](C(=O)N[C@@H](CC1=CNC2=CC=CC=C21)C(=O)O)NC(=O)[C@H](CC(=O)O)N ZVTDYGWRRPMFCL-WFBYXXMGSA-N 0.000 description 1
- QHAJMRDEWNAIBQ-FXQIFTODSA-N Asp-Arg-Asn Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(N)=O)C(O)=O QHAJMRDEWNAIBQ-FXQIFTODSA-N 0.000 description 1
- FANQWNCPNFEPGZ-WHFBIAKZSA-N Asp-Asp-Gly Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(O)=O)C(=O)NCC(O)=O FANQWNCPNFEPGZ-WHFBIAKZSA-N 0.000 description 1
- VHQOCWWKXIOAQI-WDSKDSINSA-N Asp-Gln-Gly Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CCC(N)=O)C(=O)NCC(O)=O VHQOCWWKXIOAQI-WDSKDSINSA-N 0.000 description 1
- ZSJFGGSPCCHMNE-LAEOZQHASA-N Asp-Gln-Val Chemical compound CC(C)[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)N)NC(=O)[C@H](CC(=O)O)N ZSJFGGSPCCHMNE-LAEOZQHASA-N 0.000 description 1
- JUWZKMBALYLZCK-WHFBIAKZSA-N Asp-Gly-Asn Chemical compound OC(=O)C[C@H](N)C(=O)NCC(=O)N[C@@H](CC(N)=O)C(O)=O JUWZKMBALYLZCK-WHFBIAKZSA-N 0.000 description 1
- HOBNTSHITVVNBN-ZPFDUUQYSA-N Asp-Ile-Leu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(C)C)C(=O)O)NC(=O)[C@H](CC(=O)O)N HOBNTSHITVVNBN-ZPFDUUQYSA-N 0.000 description 1
- SARSTIZOZFBDOM-FXQIFTODSA-N Asp-Met-Ala Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CCSC)C(=O)N[C@@H](C)C(O)=O SARSTIZOZFBDOM-FXQIFTODSA-N 0.000 description 1
- HXVILZUZXFLVEN-DCAQKATOSA-N Asp-Met-Leu Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CC(C)C)C(O)=O HXVILZUZXFLVEN-DCAQKATOSA-N 0.000 description 1
- UCHSVZYJKJLPHF-BZSNNMDCSA-N Asp-Phe-Phe Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O UCHSVZYJKJLPHF-BZSNNMDCSA-N 0.000 description 1
- BWJZSLQJNBSUPM-FXQIFTODSA-N Asp-Pro-Asn Chemical compound OC(=O)C[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC(N)=O)C(O)=O BWJZSLQJNBSUPM-FXQIFTODSA-N 0.000 description 1
- WMLFFCRUSPNENW-ZLUOBGJFSA-N Asp-Ser-Ala Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](C)C(O)=O WMLFFCRUSPNENW-ZLUOBGJFSA-N 0.000 description 1
- KGHLGJAXYSVNJP-WHFBIAKZSA-N Asp-Ser-Gly Chemical compound OC(=O)C[C@H](N)C(=O)N[C@@H](CO)C(=O)NCC(O)=O KGHLGJAXYSVNJP-WHFBIAKZSA-N 0.000 description 1
- JSNWZMFSLIWAHS-HJGDQZAQSA-N Asp-Thr-Leu Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC(C)C)C(=O)O)NC(=O)[C@H](CC(=O)O)N)O JSNWZMFSLIWAHS-HJGDQZAQSA-N 0.000 description 1
- 108091062157 Cis-regulatory element Proteins 0.000 description 1
- 108020004705 Codon Proteins 0.000 description 1
- OJQJUQUBJGTCRY-WFBYXXMGSA-N Cys-Ala-Trp Chemical compound C[C@@H](C(=O)N[C@@H](CC1=CNC2=CC=CC=C21)C(=O)O)NC(=O)[C@H](CS)N OJQJUQUBJGTCRY-WFBYXXMGSA-N 0.000 description 1
- XGIAHEUULGOZHH-GUBZILKMSA-N Cys-Arg-Val Chemical compound CC(C)[C@@H](C(=O)O)NC(=O)[C@H](CCCN=C(N)N)NC(=O)[C@H](CS)N XGIAHEUULGOZHH-GUBZILKMSA-N 0.000 description 1
- WPXPYZPGSGWQSC-DCAQKATOSA-N Cys-His-Met Chemical compound CSCC[C@@H](C(=O)O)NC(=O)[C@H](CC1=CN=CN1)NC(=O)[C@H](CS)N WPXPYZPGSGWQSC-DCAQKATOSA-N 0.000 description 1
- UIKLEGZPIOXFHJ-DLOVCJGASA-N Cys-Phe-Ala Chemical compound [H]N[C@@H](CS)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](C)C(O)=O UIKLEGZPIOXFHJ-DLOVCJGASA-N 0.000 description 1
- NAPULYCVEVVFRB-HEIBUPTGSA-N Cys-Thr-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@H]([C@@H](C)O)NC(=O)[C@@H](N)CS NAPULYCVEVVFRB-HEIBUPTGSA-N 0.000 description 1
- 241000255581 Drosophila <fruit fly, genus> Species 0.000 description 1
- 241000701959 Escherichia virus Lambda Species 0.000 description 1
- 241000206602 Eukaryota Species 0.000 description 1
- 241000233866 Fungi Species 0.000 description 1
- 230000010190 G1 phase Effects 0.000 description 1
- 230000010337 G2 phase Effects 0.000 description 1
- 108700028146 Genetic Enhancer Elements Proteins 0.000 description 1
- NUMFTVCBONFQIQ-DRZSPHRISA-N Gln-Ala-Phe Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](C)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O NUMFTVCBONFQIQ-DRZSPHRISA-N 0.000 description 1
- PRBLYKYHAJEABA-SRVKXCTJSA-N Gln-Arg-Leu Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(C)C)C(O)=O PRBLYKYHAJEABA-SRVKXCTJSA-N 0.000 description 1
- MQANCSUBSBJNLU-KKUMJFAQSA-N Gln-Arg-Tyr Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O MQANCSUBSBJNLU-KKUMJFAQSA-N 0.000 description 1
- AAOBFSKXAVIORT-GUBZILKMSA-N Gln-Asn-Leu Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(C)C)C(O)=O AAOBFSKXAVIORT-GUBZILKMSA-N 0.000 description 1
- ULXXDWZMMSQBDC-ACZMJKKPSA-N Gln-Asp-Asp Chemical compound C(CC(=O)N)[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CC(=O)O)C(=O)O)N ULXXDWZMMSQBDC-ACZMJKKPSA-N 0.000 description 1
- LPJVZYMINRLCQA-AVGNSLFASA-N Gln-Cys-Phe Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)O)NC(=O)[C@H](CS)NC(=O)[C@H](CCC(=O)N)N LPJVZYMINRLCQA-AVGNSLFASA-N 0.000 description 1
- GPISLLFQNHELLK-DCAQKATOSA-N Gln-Gln-His Chemical compound C1=C(NC=N1)C[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)N)NC(=O)[C@H](CCC(=O)N)N GPISLLFQNHELLK-DCAQKATOSA-N 0.000 description 1
- KCJJFESQRXGTGC-BQBZGAKWSA-N Gln-Glu-Gly Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)NCC(O)=O KCJJFESQRXGTGC-BQBZGAKWSA-N 0.000 description 1
- DWDBJWAXPXXYLP-SRVKXCTJSA-N Gln-His-Arg Chemical compound C1=C(NC=N1)C[C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)O)NC(=O)[C@H](CCC(=O)N)N DWDBJWAXPXXYLP-SRVKXCTJSA-N 0.000 description 1
- LTXLIIZACMCQTO-GUBZILKMSA-N Gln-His-Asp Chemical compound C1=C(NC=N1)C[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)O)NC(=O)[C@H](CCC(=O)N)N LTXLIIZACMCQTO-GUBZILKMSA-N 0.000 description 1
- GLEGHWQNGPMKHO-DCAQKATOSA-N Gln-His-Glu Chemical compound C1=C(NC=N1)C[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)O)NC(=O)[C@H](CCC(=O)N)N GLEGHWQNGPMKHO-DCAQKATOSA-N 0.000 description 1
- IWUFOVSLWADEJC-AVGNSLFASA-N Gln-His-Leu Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CC(C)C)C(O)=O IWUFOVSLWADEJC-AVGNSLFASA-N 0.000 description 1
- XFAUJGNLHIGXET-AVGNSLFASA-N Gln-Leu-Leu Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(C)C)C(O)=O XFAUJGNLHIGXET-AVGNSLFASA-N 0.000 description 1
- KUBFPYIMAGXGBT-ACZMJKKPSA-N Gln-Ser-Ala Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](C)C(O)=O KUBFPYIMAGXGBT-ACZMJKKPSA-N 0.000 description 1
- ZGHMRONFHDVXEF-AVGNSLFASA-N Gln-Ser-Phe Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O ZGHMRONFHDVXEF-AVGNSLFASA-N 0.000 description 1
- OUBUHIODTNUUTC-WDCWCFNPSA-N Gln-Thr-Lys Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CCC(=O)N)N)O OUBUHIODTNUUTC-WDCWCFNPSA-N 0.000 description 1
- SGVGIVDZLSHSEN-RYUDHWBXSA-N Gln-Tyr-Gly Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)NCC(O)=O SGVGIVDZLSHSEN-RYUDHWBXSA-N 0.000 description 1
- FHPXTPQBODWBIY-CIUDSAMLSA-N Glu-Ala-Arg Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](C)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O FHPXTPQBODWBIY-CIUDSAMLSA-N 0.000 description 1
- SZXSSXUNOALWCH-ACZMJKKPSA-N Glu-Ala-Asn Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](C)C(=O)N[C@@H](CC(N)=O)C(O)=O SZXSSXUNOALWCH-ACZMJKKPSA-N 0.000 description 1
- MXOODARRORARSU-ACZMJKKPSA-N Glu-Ala-Ser Chemical compound C[C@@H](C(=O)N[C@@H](CO)C(=O)O)NC(=O)[C@H](CCC(=O)O)N MXOODARRORARSU-ACZMJKKPSA-N 0.000 description 1
- FYBSCGZLICNOBA-XQXXSGGOSA-N Glu-Ala-Thr Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O FYBSCGZLICNOBA-XQXXSGGOSA-N 0.000 description 1
- SRZLHYPAOXBBSB-HJGDQZAQSA-N Glu-Arg-Thr Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H]([C@@H](C)O)C(O)=O SRZLHYPAOXBBSB-HJGDQZAQSA-N 0.000 description 1
- DSPQRJXOIXHOHK-WDSKDSINSA-N Glu-Asp-Gly Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)NCC(O)=O DSPQRJXOIXHOHK-WDSKDSINSA-N 0.000 description 1
- MUSGDMDGNGXULI-DCAQKATOSA-N Glu-Glu-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CCC(O)=O)NC(=O)[C@@H](N)CCC(O)=O MUSGDMDGNGXULI-DCAQKATOSA-N 0.000 description 1
- LGYZYFFDELZWRS-DCAQKATOSA-N Glu-Glu-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](CCC(O)=O)NC(=O)[C@@H](N)CCC(O)=O LGYZYFFDELZWRS-DCAQKATOSA-N 0.000 description 1
- OGNJZUXUTPQVBR-BQBZGAKWSA-N Glu-Gly-Glu Chemical compound OC(=O)CC[C@H](N)C(=O)NCC(=O)N[C@@H](CCC(O)=O)C(O)=O OGNJZUXUTPQVBR-BQBZGAKWSA-N 0.000 description 1
- OAGVHWYIBZMWLA-YFKPBYRVSA-N Glu-Gly-Gly Chemical compound OC(=O)CC[C@H](N)C(=O)NCC(=O)NCC(O)=O OAGVHWYIBZMWLA-YFKPBYRVSA-N 0.000 description 1
- VOORMNJKNBGYGK-YUMQZZPRSA-N Glu-Gly-Met Chemical compound CSCC[C@@H](C(=O)O)NC(=O)CNC(=O)[C@H](CCC(=O)O)N VOORMNJKNBGYGK-YUMQZZPRSA-N 0.000 description 1
- ZWQVYZXPYSYPJD-RYUDHWBXSA-N Glu-Gly-Phe Chemical compound OC(=O)CC[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 ZWQVYZXPYSYPJD-RYUDHWBXSA-N 0.000 description 1
- YDJOULGWHQRPEV-SRVKXCTJSA-N Glu-His-Met Chemical compound CSCC[C@@H](C(=O)O)NC(=O)[C@H](CC1=CN=CN1)NC(=O)[C@H](CCC(=O)O)N YDJOULGWHQRPEV-SRVKXCTJSA-N 0.000 description 1
- VGOFRWOTSXVPAU-SDDRHHMPSA-N Glu-His-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CC2=CN=CN2)NC(=O)[C@H](CCC(=O)O)N)C(=O)O VGOFRWOTSXVPAU-SDDRHHMPSA-N 0.000 description 1
- ZWABFSSWTSAMQN-KBIXCLLPSA-N Glu-Ile-Ala Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](C)C(O)=O ZWABFSSWTSAMQN-KBIXCLLPSA-N 0.000 description 1
- GRHXUHCFENOCOS-ZPFDUUQYSA-N Glu-Ile-Met Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCSC)C(=O)O)NC(=O)[C@H](CCC(=O)O)N GRHXUHCFENOCOS-ZPFDUUQYSA-N 0.000 description 1
- GJBUAAAIZSRCDC-GVXVVHGQSA-N Glu-Leu-Val Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](C(C)C)C(O)=O GJBUAAAIZSRCDC-GVXVVHGQSA-N 0.000 description 1
- HHSKZJZWQFPSKN-AVGNSLFASA-N Glu-Tyr-Asp Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(O)=O)C(O)=O HHSKZJZWQFPSKN-AVGNSLFASA-N 0.000 description 1
- MFYLRRCYBBJYPI-JYJNAYRXSA-N Glu-Tyr-Lys Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CCC(=O)O)N)O MFYLRRCYBBJYPI-JYJNAYRXSA-N 0.000 description 1
- MLILEEIVMRUYBX-NHCYSSNCSA-N Glu-Val-Arg Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CCCN=C(N)N)C(O)=O MLILEEIVMRUYBX-NHCYSSNCSA-N 0.000 description 1
- ZYRXTRTUCAVNBQ-GVXVVHGQSA-N Glu-Val-Lys Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CCC(=O)O)N ZYRXTRTUCAVNBQ-GVXVVHGQSA-N 0.000 description 1
- LJPIRKICOISLKN-WHFBIAKZSA-N Gly-Ala-Ser Chemical compound NCC(=O)N[C@@H](C)C(=O)N[C@@H](CO)C(O)=O LJPIRKICOISLKN-WHFBIAKZSA-N 0.000 description 1
- RJIVPOXLQFJRTG-LURJTMIESA-N Gly-Arg-Gly Chemical compound OC(=O)CNC(=O)[C@@H](NC(=O)CN)CCCN=C(N)N RJIVPOXLQFJRTG-LURJTMIESA-N 0.000 description 1
- DTPOVRRYXPJJAZ-FJXKBIBVSA-N Gly-Arg-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)CN)CCCN=C(N)N DTPOVRRYXPJJAZ-FJXKBIBVSA-N 0.000 description 1
- WJZLEENECIOOSA-WDSKDSINSA-N Gly-Asn-Gln Chemical compound NCC(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CCC(N)=O)C(=O)O WJZLEENECIOOSA-WDSKDSINSA-N 0.000 description 1
- BGVYNAQWHSTTSP-BYULHYEWSA-N Gly-Asn-Ile Chemical compound [H]NCC(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O BGVYNAQWHSTTSP-BYULHYEWSA-N 0.000 description 1
- KQDMENMTYNBWMR-WHFBIAKZSA-N Gly-Asp-Ala Chemical compound [H]NCC(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](C)C(O)=O KQDMENMTYNBWMR-WHFBIAKZSA-N 0.000 description 1
- PMNHJLASAAWELO-FOHZUACHSA-N Gly-Asp-Thr Chemical compound [H]NCC(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O PMNHJLASAAWELO-FOHZUACHSA-N 0.000 description 1
- CEXINUGNTZFNRY-BYPYZUCNSA-N Gly-Cys-Gly Chemical compound [NH3+]CC(=O)N[C@@H](CS)C(=O)NCC([O-])=O CEXINUGNTZFNRY-BYPYZUCNSA-N 0.000 description 1
- LXXANCRPFBSSKS-IUCAKERBSA-N Gly-Gln-Leu Chemical compound [H]NCC(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(C)C)C(O)=O LXXANCRPFBSSKS-IUCAKERBSA-N 0.000 description 1
- QPDUVFSVVAOUHE-XVKPBYJWSA-N Gly-Gln-Val Chemical compound CC(C)[C@H](NC(=O)[C@H](CCC(N)=O)NC(=O)CN)C(O)=O QPDUVFSVVAOUHE-XVKPBYJWSA-N 0.000 description 1
- MOJKRXIRAZPZLW-WDSKDSINSA-N Gly-Glu-Ala Chemical compound [H]NCC(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](C)C(O)=O MOJKRXIRAZPZLW-WDSKDSINSA-N 0.000 description 1
- SOEATRRYCIPEHA-BQBZGAKWSA-N Gly-Glu-Glu Chemical compound [H]NCC(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O SOEATRRYCIPEHA-BQBZGAKWSA-N 0.000 description 1
- XTQFHTHIAKKCTM-YFKPBYRVSA-N Gly-Glu-Gly Chemical compound NCC(=O)N[C@@H](CCC(O)=O)C(=O)NCC(O)=O XTQFHTHIAKKCTM-YFKPBYRVSA-N 0.000 description 1
- KAJAOGBVWCYGHZ-JTQLQIEISA-N Gly-Gly-Phe Chemical compound [NH3+]CC(=O)NCC(=O)N[C@H](C([O-])=O)CC1=CC=CC=C1 KAJAOGBVWCYGHZ-JTQLQIEISA-N 0.000 description 1
- UPADCCSMVOQAGF-LBPRGKRZSA-N Gly-Gly-Trp Chemical compound C1=CC=C2C(C[C@H](NC(=O)CNC(=O)CN)C(O)=O)=CNC2=C1 UPADCCSMVOQAGF-LBPRGKRZSA-N 0.000 description 1
- ITZOBNKQDZEOCE-NHCYSSNCSA-N Gly-Ile-Lys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)CN ITZOBNKQDZEOCE-NHCYSSNCSA-N 0.000 description 1
- FCKPEGOCSVZPNC-WHOFXGATSA-N Gly-Ile-Phe Chemical compound NCC(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 FCKPEGOCSVZPNC-WHOFXGATSA-N 0.000 description 1
- UUYBFNKHOCJCHT-VHSXEESVSA-N Gly-Leu-Pro Chemical compound CC(C)C[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)CN UUYBFNKHOCJCHT-VHSXEESVSA-N 0.000 description 1
- MIIVFRCYJABHTQ-ONGXEEELSA-N Gly-Leu-Val Chemical compound [H]NCC(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](C(C)C)C(O)=O MIIVFRCYJABHTQ-ONGXEEELSA-N 0.000 description 1
- LOEANKRDMMVOGZ-YUMQZZPRSA-N Gly-Lys-Asp Chemical compound NCCCC[C@H](NC(=O)CN)C(=O)N[C@@H](CC(O)=O)C(O)=O LOEANKRDMMVOGZ-YUMQZZPRSA-N 0.000 description 1
- GMTXWRIDLGTVFC-IUCAKERBSA-N Gly-Lys-Glu Chemical compound [H]NCC(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCC(O)=O)C(O)=O GMTXWRIDLGTVFC-IUCAKERBSA-N 0.000 description 1
- CVFOYJJOZYYEPE-KBPBESRZSA-N Gly-Lys-Tyr Chemical compound [H]NCC(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O CVFOYJJOZYYEPE-KBPBESRZSA-N 0.000 description 1
- OJNZVYSGVYLQIN-BQBZGAKWSA-N Gly-Met-Asp Chemical compound [H]NCC(=O)N[C@@H](CCSC)C(=O)N[C@@H](CC(O)=O)C(O)=O OJNZVYSGVYLQIN-BQBZGAKWSA-N 0.000 description 1
- OCPPBNKYGYSLOE-IUCAKERBSA-N Gly-Pro-Met Chemical compound CSCC[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1C(=O)CN OCPPBNKYGYSLOE-IUCAKERBSA-N 0.000 description 1
- WNGHUXFWEWTKAO-YUMQZZPRSA-N Gly-Ser-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CO)NC(=O)CN WNGHUXFWEWTKAO-YUMQZZPRSA-N 0.000 description 1
- YABRDIBSPZONIY-BQBZGAKWSA-N Gly-Ser-Met Chemical compound [H]NCC(=O)N[C@@H](CO)C(=O)N[C@@H](CCSC)C(O)=O YABRDIBSPZONIY-BQBZGAKWSA-N 0.000 description 1
- WCORRBXVISTKQL-WHFBIAKZSA-N Gly-Ser-Ser Chemical compound NCC(=O)N[C@@H](CO)C(=O)N[C@@H](CO)C(O)=O WCORRBXVISTKQL-WHFBIAKZSA-N 0.000 description 1
- GWCJMBNBFYBQCV-XPUUQOCRSA-N Gly-Val-Ala Chemical compound NCC(=O)N[C@@H](C(C)C)C(=O)N[C@@H](C)C(O)=O GWCJMBNBFYBQCV-XPUUQOCRSA-N 0.000 description 1
- SYOJVRNQCXYEOV-XVKPBYJWSA-N Gly-Val-Glu Chemical compound [H]NCC(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O SYOJVRNQCXYEOV-XVKPBYJWSA-N 0.000 description 1
- BAYQNCWLXIDLHX-ONGXEEELSA-N Gly-Val-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](C(C)C)NC(=O)CN BAYQNCWLXIDLHX-ONGXEEELSA-N 0.000 description 1
- SBVMXEZQJVUARN-XPUUQOCRSA-N Gly-Val-Ser Chemical compound NCC(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CO)C(O)=O SBVMXEZQJVUARN-XPUUQOCRSA-N 0.000 description 1
- RVKIPWVMZANZLI-UHFFFAOYSA-N H-Lys-Trp-OH Natural products C1=CC=C2C(CC(NC(=O)C(N)CCCCN)C(O)=O)=CNC2=C1 RVKIPWVMZANZLI-UHFFFAOYSA-N 0.000 description 1
- 241000238631 Hexapoda Species 0.000 description 1
- PDSUIXMZYNURGI-AVGNSLFASA-N His-Arg-Arg Chemical compound NC(N)=NCCC[C@@H](C(O)=O)NC(=O)[C@H](CCCN=C(N)N)NC(=O)[C@@H](N)CC1=CN=CN1 PDSUIXMZYNURGI-AVGNSLFASA-N 0.000 description 1
- SYMSVYVUSPSAAO-IHRRRGAJSA-N His-Arg-Leu Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(C)C)C(O)=O SYMSVYVUSPSAAO-IHRRRGAJSA-N 0.000 description 1
- UZZXGLOJRZKYEL-DJFWLOJKSA-N His-Asn-Ile Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O UZZXGLOJRZKYEL-DJFWLOJKSA-N 0.000 description 1
- DGYNAJNQMBFYIF-SZMVWBNQSA-N His-Glu-Trp Chemical compound C([C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC=1C2=CC=CC=C2NC=1)C(O)=O)C1=CN=CN1 DGYNAJNQMBFYIF-SZMVWBNQSA-N 0.000 description 1
- UQTKYYNHMVAOAA-HJPIBITLSA-N His-Ile-His Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)NC(=O)[C@H](CC2=CN=CN2)N UQTKYYNHMVAOAA-HJPIBITLSA-N 0.000 description 1
- ORERHHPZDDEMSC-VGDYDELISA-N His-Ile-Ser Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CO)C(=O)O)NC(=O)[C@H](CC1=CN=CN1)N ORERHHPZDDEMSC-VGDYDELISA-N 0.000 description 1
- HJUPAYWVVVRYFQ-PYJNHQTQSA-N His-Met-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CCSC)NC(=O)[C@H](CC1=CN=CN1)N HJUPAYWVVVRYFQ-PYJNHQTQSA-N 0.000 description 1
- WYSJPCTWSBJFCO-AVGNSLFASA-N His-Met-Val Chemical compound CC(C)[C@@H](C(=O)O)NC(=O)[C@H](CCSC)NC(=O)[C@H](CC1=CN=CN1)N WYSJPCTWSBJFCO-AVGNSLFASA-N 0.000 description 1
- YXXKBPJEIYFGOD-MGHWNKPDSA-N His-Phe-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CC1=CC=CC=C1)NC(=O)[C@H](CC2=CN=CN2)N YXXKBPJEIYFGOD-MGHWNKPDSA-N 0.000 description 1
- KAXZXLSXFWSNNZ-XVYDVKMFSA-N His-Ser-Ala Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CO)C(=O)N[C@@H](C)C(O)=O KAXZXLSXFWSNNZ-XVYDVKMFSA-N 0.000 description 1
- IXQGOKWTQPCIQM-YJRXYDGGSA-N His-Thr-His Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)NC(=O)[C@H](CC2=CN=CN2)N)O IXQGOKWTQPCIQM-YJRXYDGGSA-N 0.000 description 1
- JUCZDDVZBMPKRT-IXOXFDKPSA-N His-Thr-Lys Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CC1=CN=CN1)N)O JUCZDDVZBMPKRT-IXOXFDKPSA-N 0.000 description 1
- 238000010867 Hoechst staining Methods 0.000 description 1
- DMHGKBGOUAJRHU-RVMXOQNASA-N Ile-Arg-Pro Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)N1CCC[C@@H]1C(=O)O)N DMHGKBGOUAJRHU-RVMXOQNASA-N 0.000 description 1
- DMHGKBGOUAJRHU-UHFFFAOYSA-N Ile-Arg-Pro Natural products CCC(C)C(N)C(=O)NC(CCCN=C(N)N)C(=O)N1CCCC1C(O)=O DMHGKBGOUAJRHU-UHFFFAOYSA-N 0.000 description 1
- NULSANWBUWLTKN-NAKRPEOUSA-N Ile-Arg-Ser Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CO)C(=O)O)N NULSANWBUWLTKN-NAKRPEOUSA-N 0.000 description 1
- IIXDMJNYALIKGP-DJFWLOJKSA-N Ile-Asn-His Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)N IIXDMJNYALIKGP-DJFWLOJKSA-N 0.000 description 1
- XENGULNPUDGALZ-ZPFDUUQYSA-N Ile-Asn-Leu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CC(C)C)C(=O)O)N XENGULNPUDGALZ-ZPFDUUQYSA-N 0.000 description 1
- HGNUKGZQASSBKQ-PCBIJLKTSA-N Ile-Asp-Phe Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)O)N HGNUKGZQASSBKQ-PCBIJLKTSA-N 0.000 description 1
- VQUCKIAECLVLAD-SVSWQMSJSA-N Ile-Cys-Thr Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CS)C(=O)N[C@@H]([C@@H](C)O)C(=O)O)N VQUCKIAECLVLAD-SVSWQMSJSA-N 0.000 description 1
- BALLIXFZYSECCF-QEWYBTABSA-N Ile-Gln-Phe Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)O)N BALLIXFZYSECCF-QEWYBTABSA-N 0.000 description 1
- PHIXPNQDGGILMP-YVNDNENWSA-N Ile-Glu-Glu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H](CCC(=O)O)C(=O)O)N PHIXPNQDGGILMP-YVNDNENWSA-N 0.000 description 1
- LPXHYGGZJOCAFR-MNXVOIDGSA-N Ile-Glu-Leu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H](CC(C)C)C(=O)O)N LPXHYGGZJOCAFR-MNXVOIDGSA-N 0.000 description 1
- KFVUBLZRFSVDGO-BYULHYEWSA-N Ile-Gly-Asp Chemical compound CC[C@H](C)[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CC(O)=O KFVUBLZRFSVDGO-BYULHYEWSA-N 0.000 description 1
- KOPIAUWNLKKELG-SIGLWIIPSA-N Ile-His-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC1=CN=CN1)C(=O)N[C@@H]([C@@H](C)CC)C(=O)O)N KOPIAUWNLKKELG-SIGLWIIPSA-N 0.000 description 1
- TWPSALMCEHCIOY-YTFOTSKYSA-N Ile-Ile-Leu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC(C)C)C(=O)O)N TWPSALMCEHCIOY-YTFOTSKYSA-N 0.000 description 1
- PFPUFNLHBXKPHY-HTFCKZLJSA-N Ile-Ile-Ser Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CO)C(=O)O)N PFPUFNLHBXKPHY-HTFCKZLJSA-N 0.000 description 1
- FZWVCYCYWCLQDH-NHCYSSNCSA-N Ile-Leu-Gly Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(C)C)C(=O)NCC(=O)O)N FZWVCYCYWCLQDH-NHCYSSNCSA-N 0.000 description 1
- TVYWVSJGSHQWMT-AJNGGQMLSA-N Ile-Leu-Lys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(=O)O)N TVYWVSJGSHQWMT-AJNGGQMLSA-N 0.000 description 1
- IOVUXUSIGXCREV-DKIMLUQUSA-N Ile-Leu-Phe Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 IOVUXUSIGXCREV-DKIMLUQUSA-N 0.000 description 1
- ZBYBKIQDPOSLDR-XSXWSVAESA-N Ile-Leu-Val-Tyr Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](C(C)C)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 ZBYBKIQDPOSLDR-XSXWSVAESA-N 0.000 description 1
- NZGTYCMLUGYMCV-XUXIUFHCSA-N Ile-Lys-Arg Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCCN=C(N)N)C(=O)O)N NZGTYCMLUGYMCV-XUXIUFHCSA-N 0.000 description 1
- IDMNOFVUXYYZPF-DKIMLUQUSA-N Ile-Lys-Phe Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)O)N IDMNOFVUXYYZPF-DKIMLUQUSA-N 0.000 description 1
- NPAYJTAXWXJKLO-NAKRPEOUSA-N Ile-Met-Ser Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CO)C(=O)O)N NPAYJTAXWXJKLO-NAKRPEOUSA-N 0.000 description 1
- FGBRXCZYVRFNKQ-MXAVVETBSA-N Ile-Phe-Ser Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CO)C(=O)O)N FGBRXCZYVRFNKQ-MXAVVETBSA-N 0.000 description 1
- JHNJNTMTZHEDLJ-NAKRPEOUSA-N Ile-Ser-Arg Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@@H](CCCN=C(N)N)C(O)=O JHNJNTMTZHEDLJ-NAKRPEOUSA-N 0.000 description 1
- JZNVOBUNTWNZPW-GHCJXIJMSA-N Ile-Ser-Asp Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(=O)O)C(=O)O)N JZNVOBUNTWNZPW-GHCJXIJMSA-N 0.000 description 1
- VGSPNSSCMOHRRR-BJDJZHNGSA-N Ile-Ser-Lys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CO)C(=O)N[C@@H](CCCCN)C(=O)O)N VGSPNSSCMOHRRR-BJDJZHNGSA-N 0.000 description 1
- YCKPUHHMCFSUMD-IUKAMOBKSA-N Ile-Thr-Asp Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(=O)O)C(=O)O)N YCKPUHHMCFSUMD-IUKAMOBKSA-N 0.000 description 1
- NURNJECQNNCRBK-FLBSBUHZSA-N Ile-Thr-Thr Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O NURNJECQNNCRBK-FLBSBUHZSA-N 0.000 description 1
- PRTZQMBYUZFSFA-XEGUGMAKSA-N Ile-Tyr-Gly Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)NCC(=O)O)N PRTZQMBYUZFSFA-XEGUGMAKSA-N 0.000 description 1
- PMGDADKJMCOXHX-UHFFFAOYSA-N L-Arginyl-L-glutamin-acetat Natural products NC(=N)NCCCC(N)C(=O)NC(CCC(N)=O)C(O)=O PMGDADKJMCOXHX-UHFFFAOYSA-N 0.000 description 1
- HGCNKOLVKRAVHD-UHFFFAOYSA-N L-Met-L-Phe Natural products CSCCC(N)C(=O)NC(C(O)=O)CC1=CC=CC=C1 HGCNKOLVKRAVHD-UHFFFAOYSA-N 0.000 description 1
- KFKWRHQBZQICHA-STQMWFEESA-N L-leucyl-L-phenylalanine Natural products CC(C)C[C@H](N)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 KFKWRHQBZQICHA-STQMWFEESA-N 0.000 description 1
- LZDNBBYBDGBADK-UHFFFAOYSA-N L-valyl-L-tryptophan Natural products C1=CC=C2C(CC(NC(=O)C(N)C(C)C)C(O)=O)=CNC2=C1 LZDNBBYBDGBADK-UHFFFAOYSA-N 0.000 description 1
- XBBKIIGCUMBKCO-JXUBOQSCSA-N Leu-Ala-Thr Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O XBBKIIGCUMBKCO-JXUBOQSCSA-N 0.000 description 1
- HXWALXSAVBLTPK-NUTKFTJISA-N Leu-Ala-Trp Chemical compound C[C@@H](C(=O)N[C@@H](CC1=CNC2=CC=CC=C21)C(=O)O)NC(=O)[C@H](CC(C)C)N HXWALXSAVBLTPK-NUTKFTJISA-N 0.000 description 1
- REPPKAMYTOJTFC-DCAQKATOSA-N Leu-Arg-Asp Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(O)=O)C(O)=O REPPKAMYTOJTFC-DCAQKATOSA-N 0.000 description 1
- XYUBOFCTGPZFSA-WDSOQIARSA-N Leu-Arg-Trp Chemical compound C1=CC=C2C(C[C@H](NC(=O)[C@H](CCCN=C(N)N)NC(=O)[C@@H](N)CC(C)C)C(O)=O)=CNC2=C1 XYUBOFCTGPZFSA-WDSOQIARSA-N 0.000 description 1
- OGCQGUIWMSBHRZ-CIUDSAMLSA-N Leu-Asn-Ser Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CO)C(O)=O OGCQGUIWMSBHRZ-CIUDSAMLSA-N 0.000 description 1
- YKNBJXOJTURHCU-DCAQKATOSA-N Leu-Asp-Arg Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@H](C(O)=O)CCCN=C(N)N YKNBJXOJTURHCU-DCAQKATOSA-N 0.000 description 1
- MYGQXVYRZMKRDB-SRVKXCTJSA-N Leu-Asp-Lys Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@H](C(O)=O)CCCCN MYGQXVYRZMKRDB-SRVKXCTJSA-N 0.000 description 1
- QCSFMCFHVGTLFF-NHCYSSNCSA-N Leu-Asp-Val Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](C(C)C)C(O)=O QCSFMCFHVGTLFF-NHCYSSNCSA-N 0.000 description 1
- KAFOIVJDVSZUMD-UHFFFAOYSA-N Leu-Gln-Gln Natural products CC(C)CC(N)C(=O)NC(CCC(N)=O)C(=O)NC(CCC(N)=O)C(O)=O KAFOIVJDVSZUMD-UHFFFAOYSA-N 0.000 description 1
- ZTLGVASZOIKNIX-DCAQKATOSA-N Leu-Gln-Glu Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)N[C@@H](CCC(=O)O)C(=O)O)N ZTLGVASZOIKNIX-DCAQKATOSA-N 0.000 description 1
- FQZPTCNSNPWHLJ-AVGNSLFASA-N Leu-Gln-Lys Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCCCN)C(O)=O FQZPTCNSNPWHLJ-AVGNSLFASA-N 0.000 description 1
- CQGSYZCULZMEDE-SRVKXCTJSA-N Leu-Gln-Pro Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(=O)N1CCC[C@H]1C(O)=O CQGSYZCULZMEDE-SRVKXCTJSA-N 0.000 description 1
- CQGSYZCULZMEDE-UHFFFAOYSA-N Leu-Gln-Pro Natural products CC(C)CC(N)C(=O)NC(CCC(N)=O)C(=O)N1CCCC1C(O)=O CQGSYZCULZMEDE-UHFFFAOYSA-N 0.000 description 1
- NEEOBPIXKWSBRF-IUCAKERBSA-N Leu-Glu-Gly Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(=O)NCC(O)=O NEEOBPIXKWSBRF-IUCAKERBSA-N 0.000 description 1
- HPBCTWSUJOGJSH-MNXVOIDGSA-N Leu-Glu-Ile Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O HPBCTWSUJOGJSH-MNXVOIDGSA-N 0.000 description 1
- HQUXQAMSWFIRET-AVGNSLFASA-N Leu-Glu-Lys Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@H](C(O)=O)CCCCN HQUXQAMSWFIRET-AVGNSLFASA-N 0.000 description 1
- LAPSXOAUPNOINL-YUMQZZPRSA-N Leu-Gly-Asp Chemical compound CC(C)C[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CC(O)=O LAPSXOAUPNOINL-YUMQZZPRSA-N 0.000 description 1
- VBZOAGIPCULURB-QWRGUYRKSA-N Leu-Gly-His Chemical compound CC(C)C[C@@H](C(=O)NCC(=O)N[C@@H](CC1=CN=CN1)C(=O)O)N VBZOAGIPCULURB-QWRGUYRKSA-N 0.000 description 1
- YWYQSLOTVIRCFE-SRVKXCTJSA-N Leu-His-Asp Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CC(O)=O)C(O)=O YWYQSLOTVIRCFE-SRVKXCTJSA-N 0.000 description 1
- KUIDCYNIEJBZBU-AJNGGQMLSA-N Leu-Ile-Leu Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC(C)C)C(O)=O KUIDCYNIEJBZBU-AJNGGQMLSA-N 0.000 description 1
- OMHLATXVNQSALM-FQUUOJAGSA-N Leu-Ile-Pro Chemical compound CC[C@H](C)[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CC(C)C)N OMHLATXVNQSALM-FQUUOJAGSA-N 0.000 description 1
- IAJFFZORSWOZPQ-SRVKXCTJSA-N Leu-Leu-Asn Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(N)=O)C(O)=O IAJFFZORSWOZPQ-SRVKXCTJSA-N 0.000 description 1
- LXKNSJLSGPNHSK-KKUMJFAQSA-N Leu-Leu-Lys Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(=O)O)N LXKNSJLSGPNHSK-KKUMJFAQSA-N 0.000 description 1
- XVZCXCTYGHPNEM-UHFFFAOYSA-N Leu-Leu-Pro Natural products CC(C)CC(N)C(=O)NC(CC(C)C)C(=O)N1CCCC1C(O)=O XVZCXCTYGHPNEM-UHFFFAOYSA-N 0.000 description 1
- QNTJIDXQHWUBKC-BZSNNMDCSA-N Leu-Lys-Phe Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O QNTJIDXQHWUBKC-BZSNNMDCSA-N 0.000 description 1
- VCHVSKNMTXWIIP-SRVKXCTJSA-N Leu-Lys-Ser Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CO)C(O)=O VCHVSKNMTXWIIP-SRVKXCTJSA-N 0.000 description 1
- PKKMDPNFGULLNQ-AVGNSLFASA-N Leu-Met-Arg Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CCCN=C(N)N)C(O)=O PKKMDPNFGULLNQ-AVGNSLFASA-N 0.000 description 1
- AUNMOHYWTAPQLA-XUXIUFHCSA-N Leu-Met-Ile Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCSC)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O AUNMOHYWTAPQLA-XUXIUFHCSA-N 0.000 description 1
- ZDBMWELMUCLUPL-QEJZJMRPSA-N Leu-Phe-Ala Chemical compound CC(C)C[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](C)C(O)=O)CC1=CC=CC=C1 ZDBMWELMUCLUPL-QEJZJMRPSA-N 0.000 description 1
- UHNQRAFSEBGZFZ-YESZJQIVSA-N Leu-Phe-Pro Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N2CCC[C@@H]2C(=O)O)N UHNQRAFSEBGZFZ-YESZJQIVSA-N 0.000 description 1
- BMVFXOQHDQZAQU-DCAQKATOSA-N Leu-Pro-Asp Chemical compound CC(C)C[C@@H](C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC(=O)O)C(=O)O)N BMVFXOQHDQZAQU-DCAQKATOSA-N 0.000 description 1
- JDBQSGMJBMPNFT-AVGNSLFASA-N Leu-Pro-Val Chemical compound CC(C)C[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](C(C)C)C(O)=O JDBQSGMJBMPNFT-AVGNSLFASA-N 0.000 description 1
- ZJZNLRVCZWUONM-JXUBOQSCSA-N Leu-Thr-Ala Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C)C(O)=O ZJZNLRVCZWUONM-JXUBOQSCSA-N 0.000 description 1
- AEDWWMMHUGYIFD-HJGDQZAQSA-N Leu-Thr-Asn Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(N)=O)C(O)=O AEDWWMMHUGYIFD-HJGDQZAQSA-N 0.000 description 1
- ONHCDMBHPQIPAI-YTQUADARSA-N Leu-Trp-Pro Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC1=CNC2=CC=CC=C21)C(=O)N3CCC[C@@H]3C(=O)O)N ONHCDMBHPQIPAI-YTQUADARSA-N 0.000 description 1
- UCRJTSIIAYHOHE-ULQDDVLXSA-N Leu-Tyr-Arg Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)N[C@@H](CCCN=C(N)N)C(=O)O)N UCRJTSIIAYHOHE-ULQDDVLXSA-N 0.000 description 1
- BTEMNFBEAAOGBR-BZSNNMDCSA-N Leu-Tyr-Lys Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)N[C@@H](CCCCN)C(=O)O)N BTEMNFBEAAOGBR-BZSNNMDCSA-N 0.000 description 1
- VUBIPAHVHMZHCM-KKUMJFAQSA-N Leu-Tyr-Ser Chemical compound CC(C)C[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](CO)C(O)=O)CC1=CC=C(O)C=C1 VUBIPAHVHMZHCM-KKUMJFAQSA-N 0.000 description 1
- MVJRBCJCRYGCKV-GVXVVHGQSA-N Leu-Val-Gln Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CCC(N)=O)C(O)=O MVJRBCJCRYGCKV-GVXVVHGQSA-N 0.000 description 1
- RVOMPSJXSRPFJT-DCAQKATOSA-N Lys-Ala-Arg Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](C)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O RVOMPSJXSRPFJT-DCAQKATOSA-N 0.000 description 1
- MPOHDJKRBLVGCT-CIUDSAMLSA-N Lys-Ala-Asn Chemical compound C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@H](CCCCN)N MPOHDJKRBLVGCT-CIUDSAMLSA-N 0.000 description 1
- KCXUCYYZNZFGLL-SRVKXCTJSA-N Lys-Ala-Leu Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](C)C(=O)N[C@@H](CC(C)C)C(O)=O KCXUCYYZNZFGLL-SRVKXCTJSA-N 0.000 description 1
- NTSPQIONFJUMJV-AVGNSLFASA-N Lys-Arg-Val Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](C(C)C)C(O)=O NTSPQIONFJUMJV-AVGNSLFASA-N 0.000 description 1
- FACUGMGEFUEBTI-SRVKXCTJSA-N Lys-Asn-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CC(N)=O)NC(=O)[C@@H](N)CCCCN FACUGMGEFUEBTI-SRVKXCTJSA-N 0.000 description 1
- LZWNAOIMTLNMDW-NHCYSSNCSA-N Lys-Asn-Val Chemical compound CC(C)[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)[C@H](CCCCN)N LZWNAOIMTLNMDW-NHCYSSNCSA-N 0.000 description 1
- OPTCSTACHGNULU-DCAQKATOSA-N Lys-Cys-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@H](CS)NC(=O)[C@@H](N)CCCCN OPTCSTACHGNULU-DCAQKATOSA-N 0.000 description 1
- QQUJSUFWEDZQQY-AVGNSLFASA-N Lys-Gln-Lys Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@H](C(O)=O)CCCCN QQUJSUFWEDZQQY-AVGNSLFASA-N 0.000 description 1
- NDORZBUHCOJQDO-GVXVVHGQSA-N Lys-Gln-Val Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](C(C)C)C(O)=O NDORZBUHCOJQDO-GVXVVHGQSA-N 0.000 description 1
- ITWQLSZTLBKWJM-YUMQZZPRSA-N Lys-Gly-Ala Chemical compound OC(=O)[C@H](C)NC(=O)CNC(=O)[C@@H](N)CCCCN ITWQLSZTLBKWJM-YUMQZZPRSA-N 0.000 description 1
- QZONCCHVHCOBSK-YUMQZZPRSA-N Lys-Gly-Asn Chemical compound [H]N[C@@H](CCCCN)C(=O)NCC(=O)N[C@@H](CC(N)=O)C(O)=O QZONCCHVHCOBSK-YUMQZZPRSA-N 0.000 description 1
- HAUUXTXKJNVIFY-ONGXEEELSA-N Lys-Gly-Val Chemical compound [H]N[C@@H](CCCCN)C(=O)NCC(=O)N[C@@H](C(C)C)C(O)=O HAUUXTXKJNVIFY-ONGXEEELSA-N 0.000 description 1
- OIYWBDBHEGAVST-BZSNNMDCSA-N Lys-His-Tyr Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O OIYWBDBHEGAVST-BZSNNMDCSA-N 0.000 description 1
- NJNRBRKHOWSGMN-SRVKXCTJSA-N Lys-Leu-Asn Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(N)=O)C(O)=O NJNRBRKHOWSGMN-SRVKXCTJSA-N 0.000 description 1
- OIQSIMFSVLLWBX-VOAKCMCISA-N Lys-Leu-Thr Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O OIQSIMFSVLLWBX-VOAKCMCISA-N 0.000 description 1
- GOVDTWNJCBRRBJ-DCAQKATOSA-N Lys-Met-Asn Chemical compound CSCC[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@H](CCCCN)N GOVDTWNJCBRRBJ-DCAQKATOSA-N 0.000 description 1
- HYSVGEAWTGPMOA-IHRRRGAJSA-N Lys-Pro-Leu Chemical compound [H]N[C@@H](CCCCN)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC(C)C)C(O)=O HYSVGEAWTGPMOA-IHRRRGAJSA-N 0.000 description 1
- HKXSZKJMDBHOTG-CIUDSAMLSA-N Lys-Ser-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H](CO)NC(=O)[C@@H](N)CCCCN HKXSZKJMDBHOTG-CIUDSAMLSA-N 0.000 description 1
- IOQWIOPSKJOEKI-SRVKXCTJSA-N Lys-Ser-Leu Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(O)=O IOQWIOPSKJOEKI-SRVKXCTJSA-N 0.000 description 1
- TVHCDSBMFQYPNA-RHYQMDGZSA-N Lys-Thr-Arg Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O TVHCDSBMFQYPNA-RHYQMDGZSA-N 0.000 description 1
- RPWTZTBIFGENIA-VOAKCMCISA-N Lys-Thr-Leu Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(C)C)C(O)=O RPWTZTBIFGENIA-VOAKCMCISA-N 0.000 description 1
- MDDUIRLQCYVRDO-NHCYSSNCSA-N Lys-Val-Asn Chemical compound NC(=O)C[C@@H](C(O)=O)NC(=O)[C@H](C(C)C)NC(=O)[C@@H](N)CCCCN MDDUIRLQCYVRDO-NHCYSSNCSA-N 0.000 description 1
- 230000027311 M phase Effects 0.000 description 1
- QAHFGYLFLVGBNW-DCAQKATOSA-N Met-Ala-Lys Chemical compound CSCC[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CCCCN QAHFGYLFLVGBNW-DCAQKATOSA-N 0.000 description 1
- IVCPHARVJUYDPA-FXQIFTODSA-N Met-Asn-Asp Chemical compound CSCC[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CC(=O)O)C(=O)O)N IVCPHARVJUYDPA-FXQIFTODSA-N 0.000 description 1
- YNOVBMBQSQTLFM-DCAQKATOSA-N Met-Asn-Leu Chemical compound [H]N[C@@H](CCSC)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(C)C)C(O)=O YNOVBMBQSQTLFM-DCAQKATOSA-N 0.000 description 1
- GODBLDDYHFTUAH-CIUDSAMLSA-N Met-Asp-Glu Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@H](C(O)=O)CCC(O)=O GODBLDDYHFTUAH-CIUDSAMLSA-N 0.000 description 1
- FBQMBZLJHOQAIH-GUBZILKMSA-N Met-Asp-Met Chemical compound [H]N[C@@H](CCSC)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CCSC)C(O)=O FBQMBZLJHOQAIH-GUBZILKMSA-N 0.000 description 1
- YLLWCSDBVGZLOW-CIUDSAMLSA-N Met-Gln-Ala Chemical compound [H]N[C@@H](CCSC)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](C)C(O)=O YLLWCSDBVGZLOW-CIUDSAMLSA-N 0.000 description 1
- PQPMMGQTRQFSDA-SRVKXCTJSA-N Met-Glu-His Chemical compound [H]N[C@@H](CCSC)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC1=CNC=N1)C(O)=O PQPMMGQTRQFSDA-SRVKXCTJSA-N 0.000 description 1
- VZBXCMCHIHEPBL-SRVKXCTJSA-N Met-Glu-Lys Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@H](C(O)=O)CCCCN VZBXCMCHIHEPBL-SRVKXCTJSA-N 0.000 description 1
- IUYCGMNKIZDRQI-BQBZGAKWSA-N Met-Gly-Ala Chemical compound CSCC[C@H](N)C(=O)NCC(=O)N[C@@H](C)C(O)=O IUYCGMNKIZDRQI-BQBZGAKWSA-N 0.000 description 1
- LRALLISKBZNSKN-BQBZGAKWSA-N Met-Gly-Ser Chemical compound CSCC[C@H](N)C(=O)NCC(=O)N[C@@H](CO)C(O)=O LRALLISKBZNSKN-BQBZGAKWSA-N 0.000 description 1
- QZPXMHVKPHJNTR-DCAQKATOSA-N Met-Leu-Asn Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(N)=O)C(O)=O QZPXMHVKPHJNTR-DCAQKATOSA-N 0.000 description 1
- RATXDYWHIYNZLE-DCAQKATOSA-N Met-Lys-Cys Chemical compound CSCC[C@@H](C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CS)C(=O)O)N RATXDYWHIYNZLE-DCAQKATOSA-N 0.000 description 1
- HAQLBBVZAGMESV-IHRRRGAJSA-N Met-Lys-Lys Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCCCN)C(O)=O HAQLBBVZAGMESV-IHRRRGAJSA-N 0.000 description 1
- CGUYGMFQZCYJSG-DCAQKATOSA-N Met-Lys-Ser Chemical compound [H]N[C@@H](CCSC)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CO)C(O)=O CGUYGMFQZCYJSG-DCAQKATOSA-N 0.000 description 1
- JOYFULUKJRJCSX-IUCAKERBSA-N Met-Met-Gly Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CCSC)C(=O)NCC(O)=O JOYFULUKJRJCSX-IUCAKERBSA-N 0.000 description 1
- HUURTRNKPBHHKZ-JYJNAYRXSA-N Met-Phe-Val Chemical compound CSCC[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](C(C)C)C(O)=O)CC1=CC=CC=C1 HUURTRNKPBHHKZ-JYJNAYRXSA-N 0.000 description 1
- BQHLZUMZOXUWNU-DCAQKATOSA-N Met-Pro-Glu Chemical compound CSCC[C@@H](C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCC(=O)O)C(=O)O)N BQHLZUMZOXUWNU-DCAQKATOSA-N 0.000 description 1
- 101100020156 Mus musculus Mki67 gene Proteins 0.000 description 1
- YBAFDPFAUTYYRW-UHFFFAOYSA-N N-L-alpha-glutamyl-L-leucine Natural products CC(C)CC(C(O)=O)NC(=O)C(N)CCC(O)=O YBAFDPFAUTYYRW-UHFFFAOYSA-N 0.000 description 1
- SITLTJHOQZFJGG-UHFFFAOYSA-N N-L-alpha-glutamyl-L-valine Natural products CC(C)C(C(O)=O)NC(=O)C(N)CCC(O)=O SITLTJHOQZFJGG-UHFFFAOYSA-N 0.000 description 1
- XMBSYZWANAQXEV-UHFFFAOYSA-N N-alpha-L-glutamyl-L-phenylalanine Natural products OC(=O)CCC(N)C(=O)NC(C(O)=O)CC1=CC=CC=C1 XMBSYZWANAQXEV-UHFFFAOYSA-N 0.000 description 1
- 108010066427 N-valyltryptophan Proteins 0.000 description 1
- 101710118186 Neomycin resistance protein Proteins 0.000 description 1
- 108700026244 Open Reading Frames Proteins 0.000 description 1
- 208000037273 Pathologic Processes Diseases 0.000 description 1
- FPTXMUIBLMGTQH-ONGXEEELSA-N Phe-Ala-Gly Chemical compound OC(=O)CNC(=O)[C@H](C)NC(=O)[C@@H](N)CC1=CC=CC=C1 FPTXMUIBLMGTQH-ONGXEEELSA-N 0.000 description 1
- LBSARGIQACMGDF-WBAXXEDZSA-N Phe-Ala-Phe Chemical compound C([C@H](N)C(=O)N[C@@H](C)C(=O)N[C@@H](CC=1C=CC=CC=1)C(O)=O)C1=CC=CC=C1 LBSARGIQACMGDF-WBAXXEDZSA-N 0.000 description 1
- PLNHHOXNVSYKOB-JYJNAYRXSA-N Phe-Arg-Met Chemical compound CSCC[C@@H](C(=O)O)NC(=O)[C@H](CCCN=C(N)N)NC(=O)[C@H](CC1=CC=CC=C1)N PLNHHOXNVSYKOB-JYJNAYRXSA-N 0.000 description 1
- FIRWJEJVFFGXSH-RYUDHWBXSA-N Phe-Glu-Gly Chemical compound OC(=O)CNC(=O)[C@H](CCC(O)=O)NC(=O)[C@@H](N)CC1=CC=CC=C1 FIRWJEJVFFGXSH-RYUDHWBXSA-N 0.000 description 1
- YCCUXNNKXDGMAM-KKUMJFAQSA-N Phe-Leu-Ser Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(O)=O YCCUXNNKXDGMAM-KKUMJFAQSA-N 0.000 description 1
- WLYPRKLMRIYGPP-JYJNAYRXSA-N Phe-Lys-Glu Chemical compound OC(=O)CC[C@@H](C(O)=O)NC(=O)[C@H](CCCCN)NC(=O)[C@@H](N)CC1=CC=CC=C1 WLYPRKLMRIYGPP-JYJNAYRXSA-N 0.000 description 1
- XZQYIJALMGEUJD-OEAJRASXSA-N Phe-Lys-Thr Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)O)C(O)=O XZQYIJALMGEUJD-OEAJRASXSA-N 0.000 description 1
- YOFKMVUAZGPFCF-IHRRRGAJSA-N Phe-Met-Asn Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CC(N)=O)C(O)=O YOFKMVUAZGPFCF-IHRRRGAJSA-N 0.000 description 1
- FENSZYFJQOFSQR-FIRPJDEBSA-N Phe-Phe-Ile Chemical compound C([C@@H](C(=O)N[C@@H]([C@@H](C)CC)C(O)=O)NC(=O)[C@@H](N)CC=1C=CC=CC=1)C1=CC=CC=C1 FENSZYFJQOFSQR-FIRPJDEBSA-N 0.000 description 1
- PTDAGKJHZBGDKD-OEAJRASXSA-N Phe-Thr-Lys Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CC1=CC=CC=C1)N)O PTDAGKJHZBGDKD-OEAJRASXSA-N 0.000 description 1
- OLZVAVSJEUAOHI-UNQGMJICSA-N Phe-Thr-Met Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CCSC)C(=O)O)NC(=O)[C@H](CC1=CC=CC=C1)N)O OLZVAVSJEUAOHI-UNQGMJICSA-N 0.000 description 1
- VGTJSEYTVMAASM-RPTUDFQQSA-N Phe-Thr-Tyr Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O VGTJSEYTVMAASM-RPTUDFQQSA-N 0.000 description 1
- APMXLWHMIVWLLR-BZSNNMDCSA-N Phe-Tyr-Ser Chemical compound C([C@H](N)C(=O)N[C@@H](CC=1C=CC(O)=CC=1)C(=O)N[C@@H](CO)C(O)=O)C1=CC=CC=C1 APMXLWHMIVWLLR-BZSNNMDCSA-N 0.000 description 1
- DXWNFNOPBYAFRM-IHRRRGAJSA-N Phe-Val-Cys Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CS)C(=O)O)NC(=O)[C@H](CC1=CC=CC=C1)N DXWNFNOPBYAFRM-IHRRRGAJSA-N 0.000 description 1
- YUPRIZTWANWWHK-DZKIICNBSA-N Phe-Val-Glu Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)O)NC(=O)[C@H](CC1=CC=CC=C1)N YUPRIZTWANWWHK-DZKIICNBSA-N 0.000 description 1
- GAMLAXHLYGLQBJ-UFYCRDLUSA-N Phe-Val-Tyr Chemical compound N[C@H](C(=O)N[C@H](C(=O)N[C@H](C(=O)O)CC1=CC=C(C=C1)O)C(C)C)CC1=CC=CC=C1 GAMLAXHLYGLQBJ-UFYCRDLUSA-N 0.000 description 1
- 208000020584 Polyploidy Diseases 0.000 description 1
- KIZQGKLMXKGDIV-BQBZGAKWSA-N Pro-Ala-Gly Chemical compound OC(=O)CNC(=O)[C@H](C)NC(=O)[C@@H]1CCCN1 KIZQGKLMXKGDIV-BQBZGAKWSA-N 0.000 description 1
- IFMDQWDAJUMMJC-DCAQKATOSA-N Pro-Ala-Leu Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](C)C(=O)N[C@@H](CC(C)C)C(O)=O IFMDQWDAJUMMJC-DCAQKATOSA-N 0.000 description 1
- CQZNGNCAIXMAIQ-UBHSHLNASA-N Pro-Ala-Phe Chemical compound C[C@H](NC(=O)[C@@H]1CCCN1)C(=O)N[C@@H](Cc1ccccc1)C(O)=O CQZNGNCAIXMAIQ-UBHSHLNASA-N 0.000 description 1
- CGBYDGAJHSOGFQ-LPEHRKFASA-N Pro-Ala-Pro Chemical compound C[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@@H]2CCCN2 CGBYDGAJHSOGFQ-LPEHRKFASA-N 0.000 description 1
- HFZNNDWPHBRNPV-KZVJFYERSA-N Pro-Ala-Thr Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O HFZNNDWPHBRNPV-KZVJFYERSA-N 0.000 description 1
- ICTZKEXYDDZZFP-SRVKXCTJSA-N Pro-Arg-Pro Chemical compound N([C@@H](CCCN=C(N)N)C(=O)N1[C@@H](CCC1)C(O)=O)C(=O)[C@@H]1CCCN1 ICTZKEXYDDZZFP-SRVKXCTJSA-N 0.000 description 1
- SMCHPSMKAFIERP-FXQIFTODSA-N Pro-Asn-Asp Chemical compound OC(=O)C[C@@H](C(O)=O)NC(=O)[C@H](CC(=O)N)NC(=O)[C@@H]1CCCN1 SMCHPSMKAFIERP-FXQIFTODSA-N 0.000 description 1
- OBVCYFIHIIYIQF-CIUDSAMLSA-N Pro-Asn-Glu Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O OBVCYFIHIIYIQF-CIUDSAMLSA-N 0.000 description 1
- AMBLXEMWFARNNQ-DCAQKATOSA-N Pro-Asn-Leu Chemical compound CC(C)C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)[C@@H]1CCCN1 AMBLXEMWFARNNQ-DCAQKATOSA-N 0.000 description 1
- SWXSLPHTJVAWDF-VEVYYDQMSA-N Pro-Asn-Thr Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O SWXSLPHTJVAWDF-VEVYYDQMSA-N 0.000 description 1
- UTAUEDINXUMHLG-FXQIFTODSA-N Pro-Asp-Ala Chemical compound C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)O)NC(=O)[C@@H]1CCCN1 UTAUEDINXUMHLG-FXQIFTODSA-N 0.000 description 1
- NMELOOXSGDRBRU-YUMQZZPRSA-N Pro-Glu-Gly Chemical compound OC(=O)CNC(=O)[C@H](CCC(=O)O)NC(=O)[C@@H]1CCCN1 NMELOOXSGDRBRU-YUMQZZPRSA-N 0.000 description 1
- WVOXLKUUVCCCSU-ZPFDUUQYSA-N Pro-Glu-Ile Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O WVOXLKUUVCCCSU-ZPFDUUQYSA-N 0.000 description 1
- PTLOFJZJADCNCD-DCAQKATOSA-N Pro-Glu-Met Chemical compound CSCC[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)O)NC(=O)[C@@H]1CCCN1 PTLOFJZJADCNCD-DCAQKATOSA-N 0.000 description 1
- UEHYFUCOGHWASA-HJGDQZAQSA-N Pro-Glu-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@H](CCC(O)=O)NC(=O)[C@@H]1CCCN1 UEHYFUCOGHWASA-HJGDQZAQSA-N 0.000 description 1
- QGOZJLYCGRYYRW-KKUMJFAQSA-N Pro-Glu-Tyr Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O QGOZJLYCGRYYRW-KKUMJFAQSA-N 0.000 description 1
- XQSREVQDGCPFRJ-STQMWFEESA-N Pro-Gly-Phe Chemical compound [H]N1CCC[C@H]1C(=O)NCC(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O XQSREVQDGCPFRJ-STQMWFEESA-N 0.000 description 1
- FJLODLCIOJUDRG-PYJNHQTQSA-N Pro-Ile-His Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)NC(=O)[C@@H]2CCCN2 FJLODLCIOJUDRG-PYJNHQTQSA-N 0.000 description 1
- UREQLMJCKFLLHM-NAKRPEOUSA-N Pro-Ile-Ser Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CO)C(O)=O UREQLMJCKFLLHM-NAKRPEOUSA-N 0.000 description 1
- MCWHYUWXVNRXFV-RWMBFGLXSA-N Pro-Leu-Pro Chemical compound CC(C)C[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@@H]2CCCN2 MCWHYUWXVNRXFV-RWMBFGLXSA-N 0.000 description 1
- FKYKZHOKDOPHSA-DCAQKATOSA-N Pro-Leu-Ser Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(O)=O FKYKZHOKDOPHSA-DCAQKATOSA-N 0.000 description 1
- VTFXTWDFPTWNJY-RHYQMDGZSA-N Pro-Leu-Thr Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O VTFXTWDFPTWNJY-RHYQMDGZSA-N 0.000 description 1
- GFHXZNVJIKMAGO-IHRRRGAJSA-N Pro-Phe-Ser Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CO)C(O)=O GFHXZNVJIKMAGO-IHRRRGAJSA-N 0.000 description 1
- PKHDJFHFMGQMPS-RCWTZXSCSA-N Pro-Thr-Arg Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O PKHDJFHFMGQMPS-RCWTZXSCSA-N 0.000 description 1
- SNSYSBUTTJBPDG-OKZBNKHCSA-N Pro-Trp-Pro Chemical compound C1C[C@H](NC1)C(=O)N[C@@H](CC2=CNC3=CC=CC=C32)C(=O)N4CCC[C@@H]4C(=O)O SNSYSBUTTJBPDG-OKZBNKHCSA-N 0.000 description 1
- KHRLUIPIMIQFGT-AVGNSLFASA-N Pro-Val-Leu Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC(C)C)C(O)=O KHRLUIPIMIQFGT-AVGNSLFASA-N 0.000 description 1
- ZMLRZBWCXPQADC-TUAOUCFPSA-N Pro-Val-Pro Chemical compound CC(C)[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@@H]2CCCN2 ZMLRZBWCXPQADC-TUAOUCFPSA-N 0.000 description 1
- 108020004511 Recombinant DNA Proteins 0.000 description 1
- 102000007056 Recombinant Fusion Proteins Human genes 0.000 description 1
- 108010008281 Recombinant Fusion Proteins Proteins 0.000 description 1
- 230000018199 S phase Effects 0.000 description 1
- 241000293869 Salmonella enterica subsp. enterica serovar Typhimurium Species 0.000 description 1
- IYCBDVBJWDXQRR-FXQIFTODSA-N Ser-Ala-Met Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](C)C(=O)N[C@@H](CCSC)C(O)=O IYCBDVBJWDXQRR-FXQIFTODSA-N 0.000 description 1
- HQTKVSCNCDLXSX-BQBZGAKWSA-N Ser-Arg-Gly Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCCNC(N)=N)C(=O)NCC(O)=O HQTKVSCNCDLXSX-BQBZGAKWSA-N 0.000 description 1
- VGNYHOBZJKWRGI-CIUDSAMLSA-N Ser-Asn-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](CC(N)=O)NC(=O)[C@@H](N)CO VGNYHOBZJKWRGI-CIUDSAMLSA-N 0.000 description 1
- MMAPOBOTRUVNKJ-ZLUOBGJFSA-N Ser-Asp-Ser Chemical compound C([C@@H](C(=O)N[C@@H](CO)C(=O)O)NC(=O)[C@H](CO)N)C(=O)O MMAPOBOTRUVNKJ-ZLUOBGJFSA-N 0.000 description 1
- UICKAKRRRBTILH-GUBZILKMSA-N Ser-Glu-His Chemical compound C1=C(NC=N1)C[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)O)NC(=O)[C@H](CO)N UICKAKRRRBTILH-GUBZILKMSA-N 0.000 description 1
- LALNXSXEYFUUDD-GUBZILKMSA-N Ser-Glu-Leu Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(O)=O LALNXSXEYFUUDD-GUBZILKMSA-N 0.000 description 1
- YMTLKLXDFCSCNX-BYPYZUCNSA-N Ser-Gly-Gly Chemical compound OC[C@H](N)C(=O)NCC(=O)NCC(O)=O YMTLKLXDFCSCNX-BYPYZUCNSA-N 0.000 description 1
- WSTIOCFMWXNOCX-YUMQZZPRSA-N Ser-Gly-Lys Chemical compound C(CCN)C[C@@H](C(=O)O)NC(=O)CNC(=O)[C@H](CO)N WSTIOCFMWXNOCX-YUMQZZPRSA-N 0.000 description 1
- LOKXAXAESFYFAX-CIUDSAMLSA-N Ser-His-Cys Chemical compound OC[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](CS)C(O)=O)CC1=CN=CN1 LOKXAXAESFYFAX-CIUDSAMLSA-N 0.000 description 1
- UGHCUDLCCVVIJR-VGDYDELISA-N Ser-His-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CC1=CN=CN1)NC(=O)[C@H](CO)N UGHCUDLCCVVIJR-VGDYDELISA-N 0.000 description 1
- CICQXRWZNVXFCU-SRVKXCTJSA-N Ser-His-Leu Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CC(C)C)C(O)=O CICQXRWZNVXFCU-SRVKXCTJSA-N 0.000 description 1
- CLKKNZQUQMZDGD-SRVKXCTJSA-N Ser-His-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@@H](N)CO)CC1=CN=CN1 CLKKNZQUQMZDGD-SRVKXCTJSA-N 0.000 description 1
- CAOYHZOWXFFAIR-CIUDSAMLSA-N Ser-His-Ser Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CO)C(O)=O CAOYHZOWXFFAIR-CIUDSAMLSA-N 0.000 description 1
- VMLONWHIORGALA-SRVKXCTJSA-N Ser-Leu-Leu Chemical compound CC(C)C[C@@H](C([O-])=O)NC(=O)[C@H](CC(C)C)NC(=O)[C@@H]([NH3+])CO VMLONWHIORGALA-SRVKXCTJSA-N 0.000 description 1
- PPNPDKGQRFSCAC-CIUDSAMLSA-N Ser-Lys-Asp Chemical compound NCCCC[C@H](NC(=O)[C@@H](N)CO)C(=O)N[C@@H](CC(O)=O)C(O)=O PPNPDKGQRFSCAC-CIUDSAMLSA-N 0.000 description 1
- BYCVMHKULKRVPV-GUBZILKMSA-N Ser-Lys-Gln Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCC(N)=O)C(O)=O BYCVMHKULKRVPV-GUBZILKMSA-N 0.000 description 1
- CRJZZXMAADSBBQ-SRVKXCTJSA-N Ser-Lys-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](CCCCN)NC(=O)[C@@H](N)CO CRJZZXMAADSBBQ-SRVKXCTJSA-N 0.000 description 1
- RQXDSYQXBCRXBT-GUBZILKMSA-N Ser-Met-Arg Chemical compound OC[C@H](N)C(=O)N[C@@H](CCSC)C(=O)N[C@H](C(O)=O)CCCN=C(N)N RQXDSYQXBCRXBT-GUBZILKMSA-N 0.000 description 1
- ASGYVPAVFNDZMA-GUBZILKMSA-N Ser-Met-Val Chemical compound CC(C)[C@@H](C(=O)O)NC(=O)[C@H](CCSC)NC(=O)[C@H](CO)N ASGYVPAVFNDZMA-GUBZILKMSA-N 0.000 description 1
- BUYHXYIUQUBEQP-AVGNSLFASA-N Ser-Phe-Glu Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)O)NC(=O)[C@H](CO)N BUYHXYIUQUBEQP-AVGNSLFASA-N 0.000 description 1
- MQUZANJDFOQOBX-SRVKXCTJSA-N Ser-Phe-Ser Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CO)C(O)=O MQUZANJDFOQOBX-SRVKXCTJSA-N 0.000 description 1
- FKYWFUYPVKLJLP-DCAQKATOSA-N Ser-Pro-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1C(=O)[C@@H](N)CO FKYWFUYPVKLJLP-DCAQKATOSA-N 0.000 description 1
- KQNDIKOYWZTZIX-FXQIFTODSA-N Ser-Ser-Arg Chemical compound OC[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@H](C(O)=O)CCCNC(N)=N KQNDIKOYWZTZIX-FXQIFTODSA-N 0.000 description 1
- NADLKBTYNKUJEP-KATARQTJSA-N Ser-Thr-Leu Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(C)C)C(O)=O NADLKBTYNKUJEP-KATARQTJSA-N 0.000 description 1
- SNXUIBACCONSOH-BWBBJGPYSA-N Ser-Thr-Ser Chemical compound OC[C@H](N)C(=O)N[C@@H]([C@H](O)C)C(=O)N[C@@H](CO)C(O)=O SNXUIBACCONSOH-BWBBJGPYSA-N 0.000 description 1
- STIAINRLUUKYKM-WFBYXXMGSA-N Ser-Trp-Ala Chemical compound C1=CC=C2C(C[C@@H](C(=O)N[C@@H](C)C(O)=O)NC(=O)[C@@H](N)CO)=CNC2=C1 STIAINRLUUKYKM-WFBYXXMGSA-N 0.000 description 1
- IAOHCSQDQDWRQU-GUBZILKMSA-N Ser-Val-Arg Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O IAOHCSQDQDWRQU-GUBZILKMSA-N 0.000 description 1
- JGUWRQWULDWNCM-FXQIFTODSA-N Ser-Val-Ser Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CO)C(O)=O JGUWRQWULDWNCM-FXQIFTODSA-N 0.000 description 1
- 241000187747 Streptomyces Species 0.000 description 1
- 239000004098 Tetracycline Substances 0.000 description 1
- NJEMRSFGDNECGF-GCJQMDKQSA-N Thr-Ala-Asp Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CC(O)=O NJEMRSFGDNECGF-GCJQMDKQSA-N 0.000 description 1
- TYVAWPFQYFPSBR-BFHQHQDPSA-N Thr-Ala-Gly Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C)C(=O)NCC(O)=O TYVAWPFQYFPSBR-BFHQHQDPSA-N 0.000 description 1
- PXQUBKWZENPDGE-CIQUZCHMSA-N Thr-Ala-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](C)NC(=O)[C@H]([C@@H](C)O)N PXQUBKWZENPDGE-CIQUZCHMSA-N 0.000 description 1
- CAGTXGDOIFXLPC-KZVJFYERSA-N Thr-Arg-Ala Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](C)C(O)=O)CCCN=C(N)N CAGTXGDOIFXLPC-KZVJFYERSA-N 0.000 description 1
- TWLMXDWFVNEFFK-FJXKBIBVSA-N Thr-Arg-Gly Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)NCC(O)=O TWLMXDWFVNEFFK-FJXKBIBVSA-N 0.000 description 1
- JBHMLZSKIXMVFS-XVSYOHENSA-N Thr-Asn-Phe Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O JBHMLZSKIXMVFS-XVSYOHENSA-N 0.000 description 1
- PZVGOVRNGKEFCB-KKHAAJSZSA-N Thr-Asn-Val Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](C(C)C)C(=O)O)N)O PZVGOVRNGKEFCB-KKHAAJSZSA-N 0.000 description 1
- DCCGCVLVVSAJFK-NUMRIWBASA-N Thr-Asp-Gln Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CCC(=O)N)C(=O)O)N)O DCCGCVLVVSAJFK-NUMRIWBASA-N 0.000 description 1
- JEDIEMIJYSRUBB-FOHZUACHSA-N Thr-Asp-Gly Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)NCC(O)=O JEDIEMIJYSRUBB-FOHZUACHSA-N 0.000 description 1
- LYGKYFKSZTUXGZ-ZDLURKLDSA-N Thr-Cys-Gly Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CS)C(=O)NCC(O)=O LYGKYFKSZTUXGZ-ZDLURKLDSA-N 0.000 description 1
- HJOSVGCWOTYJFG-WDCWCFNPSA-N Thr-Glu-Lys Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H](CCCCN)C(=O)O)N)O HJOSVGCWOTYJFG-WDCWCFNPSA-N 0.000 description 1
- VYEHBMMAJFVTOI-JHEQGTHGSA-N Thr-Gly-Gln Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)NCC(=O)N[C@@H](CCC(N)=O)C(O)=O VYEHBMMAJFVTOI-JHEQGTHGSA-N 0.000 description 1
- MPUMPERGHHJGRP-WEDXCCLWSA-N Thr-Gly-Lys Chemical compound C[C@H]([C@@H](C(=O)NCC(=O)N[C@@H](CCCCN)C(=O)O)N)O MPUMPERGHHJGRP-WEDXCCLWSA-N 0.000 description 1
- CRZNCABIJLRFKZ-IUKAMOBKSA-N Thr-Ile-Asp Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)O)NC(=O)[C@H]([C@@H](C)O)N CRZNCABIJLRFKZ-IUKAMOBKSA-N 0.000 description 1
- ZBKDBZUTTXINIX-RWRJDSDZSA-N Thr-Ile-Gln Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CCC(N)=O)C(O)=O ZBKDBZUTTXINIX-RWRJDSDZSA-N 0.000 description 1
- ODXKUIGEPAGKKV-KATARQTJSA-N Thr-Leu-Cys Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CS)C(=O)O)N)O ODXKUIGEPAGKKV-KATARQTJSA-N 0.000 description 1
- HOVLHEKTGVIKAP-WDCWCFNPSA-N Thr-Leu-Gln Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(N)=O)C(O)=O HOVLHEKTGVIKAP-WDCWCFNPSA-N 0.000 description 1
- RFKVQLIXNVEOMB-WEDXCCLWSA-N Thr-Leu-Gly Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC(C)C)C(=O)NCC(=O)O)N)O RFKVQLIXNVEOMB-WEDXCCLWSA-N 0.000 description 1
- XIULAFZYEKSGAJ-IXOXFDKPSA-N Thr-Leu-His Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@H](C(O)=O)CC1=CNC=N1 XIULAFZYEKSGAJ-IXOXFDKPSA-N 0.000 description 1
- FLPZMPOZGYPBEN-PPCPHDFISA-N Thr-Leu-Ile Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O FLPZMPOZGYPBEN-PPCPHDFISA-N 0.000 description 1
- KZSYAEWQMJEGRZ-RHYQMDGZSA-N Thr-Leu-Val Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](C(C)C)C(O)=O KZSYAEWQMJEGRZ-RHYQMDGZSA-N 0.000 description 1
- IVDFVBVIVLJJHR-LKXGYXEUSA-N Thr-Ser-Asp Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(O)=O)C(O)=O IVDFVBVIVLJJHR-LKXGYXEUSA-N 0.000 description 1
- PELIQFPESHBTMA-WLTAIBSBSA-N Thr-Tyr-Gly Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@H](C(=O)NCC(O)=O)CC1=CC=C(O)C=C1 PELIQFPESHBTMA-WLTAIBSBSA-N 0.000 description 1
- KVEWWQRTAVMOFT-KJEVXHAQSA-N Thr-Tyr-Val Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](C(C)C)C(O)=O KVEWWQRTAVMOFT-KJEVXHAQSA-N 0.000 description 1
- CURFABYITJVKEW-QTKMDUPCSA-N Thr-Val-His Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)N)O CURFABYITJVKEW-QTKMDUPCSA-N 0.000 description 1
- 108091036066 Three prime untranslated region Proteins 0.000 description 1
- 108010022394 Threonine synthase Proteins 0.000 description 1
- IQFYYKKMVGJFEH-XLPZGREQSA-N Thymidine Chemical class O=C1NC(=O)C(C)=CN1[C@@H]1O[C@H](CO)[C@@H](O)C1 IQFYYKKMVGJFEH-XLPZGREQSA-N 0.000 description 1
- 102000006601 Thymidine Kinase Human genes 0.000 description 1
- 108020004440 Thymidine kinase Proteins 0.000 description 1
- SAKLWFSRZTZQAJ-GQGQLFGLSA-N Trp-Ile-Ser Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CO)C(=O)O)NC(=O)[C@H](CC1=CNC2=CC=CC=C21)N SAKLWFSRZTZQAJ-GQGQLFGLSA-N 0.000 description 1
- IKUMWSDCGQVGHC-UMPQAUOISA-N Trp-Pro-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@@H]1CCCN1C(=O)[C@H](CC2=CNC3=CC=CC=C32)N)O IKUMWSDCGQVGHC-UMPQAUOISA-N 0.000 description 1
- SEXRBCGSZRCIPE-LYSGOOTNSA-N Trp-Thr-Gly Chemical compound C[C@H]([C@@H](C(=O)NCC(=O)O)NC(=O)[C@H](CC1=CNC2=CC=CC=C21)N)O SEXRBCGSZRCIPE-LYSGOOTNSA-N 0.000 description 1
- QYSBJAUCUKHSLU-JYJNAYRXSA-N Tyr-Arg-Val Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](C(C)C)C(O)=O QYSBJAUCUKHSLU-JYJNAYRXSA-N 0.000 description 1
- QUILOGWWLXMSAT-IHRRRGAJSA-N Tyr-Gln-Gln Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCC(N)=O)C(O)=O QUILOGWWLXMSAT-IHRRRGAJSA-N 0.000 description 1
- IYHNBRUWVBIVJR-IHRRRGAJSA-N Tyr-Gln-Glu Chemical compound OC(=O)CC[C@@H](C(O)=O)NC(=O)[C@H](CCC(N)=O)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 IYHNBRUWVBIVJR-IHRRRGAJSA-N 0.000 description 1
- MPKPIWFFDWVJGC-IRIUXVKKSA-N Tyr-Gln-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)N)NC(=O)[C@H](CC1=CC=C(C=C1)O)N)O MPKPIWFFDWVJGC-IRIUXVKKSA-N 0.000 description 1
- FNWGDMZVYBVAGJ-XEGUGMAKSA-N Tyr-Gly-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)CNC(=O)[C@H](CC1=CC=C(C=C1)O)N FNWGDMZVYBVAGJ-XEGUGMAKSA-N 0.000 description 1
- AXWBYOVVDRBOGU-SIUGBPQLSA-N Tyr-Ile-Gln Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)O)NC(=O)[C@H](CC1=CC=C(C=C1)O)N AXWBYOVVDRBOGU-SIUGBPQLSA-N 0.000 description 1
- NSGZILIDHCIZAM-KKUMJFAQSA-N Tyr-Leu-Ser Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CO)C(=O)O)NC(=O)[C@H](CC1=CC=C(C=C1)O)N NSGZILIDHCIZAM-KKUMJFAQSA-N 0.000 description 1
- SCZJKZLFSSPJDP-ACRUOGEOSA-N Tyr-Phe-Leu Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(C)C)C(O)=O SCZJKZLFSSPJDP-ACRUOGEOSA-N 0.000 description 1
- QKXAEWMHAAVVGS-KKUMJFAQSA-N Tyr-Pro-Glu Chemical compound N[C@@H](Cc1ccc(O)cc1)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCC(O)=O)C(O)=O QKXAEWMHAAVVGS-KKUMJFAQSA-N 0.000 description 1
- HRHYJNLMIJWGLF-BZSNNMDCSA-N Tyr-Ser-Phe Chemical compound C([C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC=1C=CC=CC=1)C(O)=O)C1=CC=C(O)C=C1 HRHYJNLMIJWGLF-BZSNNMDCSA-N 0.000 description 1
- KRXFXDCNKLANCP-CXTHYWKRSA-N Tyr-Tyr-Ile Chemical compound C([C@@H](C(=O)N[C@@H]([C@@H](C)CC)C(O)=O)NC(=O)[C@@H](N)CC=1C=CC(O)=CC=1)C1=CC=C(O)C=C1 KRXFXDCNKLANCP-CXTHYWKRSA-N 0.000 description 1
- KLOZTPOXVVRVAQ-DZKIICNBSA-N Tyr-Val-Gln Chemical compound NC(=O)CC[C@@H](C(O)=O)NC(=O)[C@H](C(C)C)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 KLOZTPOXVVRVAQ-DZKIICNBSA-N 0.000 description 1
- FZSPNKUFROZBSG-ZKWXMUAHSA-N Val-Ala-Asp Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CC(O)=O FZSPNKUFROZBSG-ZKWXMUAHSA-N 0.000 description 1
- COYSIHFOCOMGCF-WPRPVWTQSA-N Val-Arg-Gly Chemical compound CC(C)[C@H](N)C(=O)N[C@H](C(=O)NCC(O)=O)CCCN=C(N)N COYSIHFOCOMGCF-WPRPVWTQSA-N 0.000 description 1
- UDLYXGYWTVOIKU-QXEWZRGKSA-N Val-Asn-Arg Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CCCN=C(N)N)C(=O)O)N UDLYXGYWTVOIKU-QXEWZRGKSA-N 0.000 description 1
- UDNYEPLJTRDMEJ-RCOVLWMOSA-N Val-Asn-Gly Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)NCC(=O)O)N UDNYEPLJTRDMEJ-RCOVLWMOSA-N 0.000 description 1
- QGFPYRPIUXBYGR-YDHLFZDLSA-N Val-Asn-Phe Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)O)N QGFPYRPIUXBYGR-YDHLFZDLSA-N 0.000 description 1
- ISERLACIZUGCDX-ZKWXMUAHSA-N Val-Asp-Ala Chemical compound C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)O)NC(=O)[C@H](C(C)C)N ISERLACIZUGCDX-ZKWXMUAHSA-N 0.000 description 1
- TZVUSFMQWPWHON-NHCYSSNCSA-N Val-Asp-Leu Chemical compound CC(C)C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)O)NC(=O)[C@H](C(C)C)N TZVUSFMQWPWHON-NHCYSSNCSA-N 0.000 description 1
- FRUYSSRPJXNRRB-GUBZILKMSA-N Val-Cys-Arg Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CS)C(=O)N[C@@H](CCCN=C(N)N)C(=O)O)N FRUYSSRPJXNRRB-GUBZILKMSA-N 0.000 description 1
- VFOHXOLPLACADK-GVXVVHGQSA-N Val-Gln-Leu Chemical compound CC(C)C[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)N)NC(=O)[C@H](C(C)C)N VFOHXOLPLACADK-GVXVVHGQSA-N 0.000 description 1
- SZTTYWIUCGSURQ-AUTRQRHGSA-N Val-Glu-Glu Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O SZTTYWIUCGSURQ-AUTRQRHGSA-N 0.000 description 1
- VCAWFLIWYNMHQP-UKJIMTQDSA-N Val-Glu-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)O)NC(=O)[C@H](C(C)C)N VCAWFLIWYNMHQP-UKJIMTQDSA-N 0.000 description 1
- ROLGIBMFNMZANA-GVXVVHGQSA-N Val-Glu-Leu Chemical compound CC(C)C[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)O)NC(=O)[C@H](C(C)C)N ROLGIBMFNMZANA-GVXVVHGQSA-N 0.000 description 1
- PTFPUAXGIKTVNN-ONGXEEELSA-N Val-His-Gly Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC1=CN=CN1)C(=O)NCC(=O)O)N PTFPUAXGIKTVNN-ONGXEEELSA-N 0.000 description 1
- ZRSZTKTVPNSUNA-IHRRRGAJSA-N Val-Lys-Leu Chemical compound CC(C)C[C@H](NC(=O)[C@H](CCCCN)NC(=O)[C@@H](N)C(C)C)C(O)=O ZRSZTKTVPNSUNA-IHRRRGAJSA-N 0.000 description 1
- HPANGHISDXDUQY-ULQDDVLXSA-N Val-Lys-Phe Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)O)N HPANGHISDXDUQY-ULQDDVLXSA-N 0.000 description 1
- NHXZRXLFOBFMDM-AVGNSLFASA-N Val-Pro-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1C(=O)[C@@H](N)C(C)C NHXZRXLFOBFMDM-AVGNSLFASA-N 0.000 description 1
- UGFMVXRXULGLNO-XPUUQOCRSA-N Val-Ser-Gly Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CO)C(=O)NCC(O)=O UGFMVXRXULGLNO-XPUUQOCRSA-N 0.000 description 1
- PGQUDQYHWICSAB-NAKRPEOUSA-N Val-Ser-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CO)NC(=O)[C@H](C(C)C)N PGQUDQYHWICSAB-NAKRPEOUSA-N 0.000 description 1
- UJMCYJKPDFQLHX-XGEHTFHBSA-N Val-Ser-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CO)NC(=O)[C@H](C(C)C)N)O UJMCYJKPDFQLHX-XGEHTFHBSA-N 0.000 description 1
- CEKSLIVSNNGOKH-KZVJFYERSA-N Val-Thr-Ala Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](C)C(=O)O)NC(=O)[C@H](C(C)C)N)O CEKSLIVSNNGOKH-KZVJFYERSA-N 0.000 description 1
- LCHZBEUVGAVMKS-RHYQMDGZSA-N Val-Thr-Leu Chemical compound CC(C)C[C@H](NC(=O)[C@@H](NC(=O)[C@@H](N)C(C)C)[C@@H](C)O)C(O)=O LCHZBEUVGAVMKS-RHYQMDGZSA-N 0.000 description 1
- GVNLOVJNNDZUHS-RHYQMDGZSA-N Val-Thr-Lys Chemical compound [H]N[C@@H](C(C)C)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCCCN)C(O)=O GVNLOVJNNDZUHS-RHYQMDGZSA-N 0.000 description 1
- GTACFKZDQFTVAI-STECZYCISA-N Val-Tyr-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@@H](N)C(C)C)CC1=CC=C(O)C=C1 GTACFKZDQFTVAI-STECZYCISA-N 0.000 description 1
- BGTDGENDNWGMDQ-KJEVXHAQSA-N Val-Tyr-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CC1=CC=C(C=C1)O)NC(=O)[C@H](C(C)C)N)O BGTDGENDNWGMDQ-KJEVXHAQSA-N 0.000 description 1
- DFQZDQPLWBSFEJ-LSJOCFKGSA-N Val-Val-Asn Chemical compound CC(C)[C@@H](C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC(=O)N)C(=O)O)N DFQZDQPLWBSFEJ-LSJOCFKGSA-N 0.000 description 1
- YKZVPMUGEJXEOR-JYJNAYRXSA-N Val-Val-Tyr Chemical compound CC(C)[C@@H](C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)O)N YKZVPMUGEJXEOR-JYJNAYRXSA-N 0.000 description 1
- 230000004913 activation Effects 0.000 description 1
- 238000005377 adsorption chromatography Methods 0.000 description 1
- TXUZVZSFRXZGTL-QPLCGJKRSA-N afimoxifene Chemical compound C=1C=CC=CC=1C(/CC)=C(C=1C=CC(OCCN(C)C)=CC=1)/C1=CC=C(O)C=C1 TXUZVZSFRXZGTL-QPLCGJKRSA-N 0.000 description 1
- 108010039538 alanyl-glycyl-aspartyl-valine Proteins 0.000 description 1
- 108010076324 alanyl-glycyl-glycine Proteins 0.000 description 1
- 108010024078 alanyl-glycyl-serine Proteins 0.000 description 1
- 108010069020 alanyl-prolyl-glycine Proteins 0.000 description 1
- 108010086434 alanyl-seryl-glycine Proteins 0.000 description 1
- 108010045350 alanyl-tyrosyl-alanine Proteins 0.000 description 1
- 229960000723 ampicillin Drugs 0.000 description 1
Classifications
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/09—Recombinant DNA-technology
- C12N15/63—Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
- C12N15/79—Vectors or expression systems specially adapted for eukaryotic hosts
- C12N15/85—Vectors or expression systems specially adapted for eukaryotic hosts for animal cells
- C12N15/8509—Vectors or expression systems specially adapted for eukaryotic hosts for animal cells for producing genetically modified animals, e.g. transgenic
-
- A—HUMAN NECESSITIES
- A01—AGRICULTURE; FORESTRY; ANIMAL HUSBANDRY; HUNTING; TRAPPING; FISHING
- A01K—ANIMAL HUSBANDRY; AVICULTURE; APICULTURE; PISCICULTURE; FISHING; REARING OR BREEDING ANIMALS, NOT OTHERWISE PROVIDED FOR; NEW BREEDS OF ANIMALS
- A01K67/00—Rearing or breeding animals, not otherwise provided for; New or modified breeds of animals
- A01K67/027—New or modified breeds of vertebrates
- A01K67/0275—Genetically modified vertebrates, e.g. transgenic
- A01K67/0278—Knock-in vertebrates, e.g. humanised vertebrates
-
- C—CHEMISTRY; METALLURGY
- C07—ORGANIC CHEMISTRY
- C07K—PEPTIDES
- C07K14/00—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof
- C07K14/435—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from animals; from humans
- C07K14/705—Receptors; Cell surface antigens; Cell surface determinants
- C07K14/72—Receptors; Cell surface antigens; Cell surface determinants for hormones
- C07K14/721—Steroid/thyroid hormone superfamily, e.g. GR, EcR, androgen receptor, oestrogen receptor
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/09—Recombinant DNA-technology
- C12N15/87—Introduction of foreign genetic material using processes not otherwise provided for, e.g. co-transformation
- C12N15/90—Stable introduction of foreign DNA into chromosome
- C12N15/902—Stable introduction of foreign DNA into chromosome using homologous recombination
- C12N15/907—Stable introduction of foreign DNA into chromosome using homologous recombination in mammalian cells
-
- A—HUMAN NECESSITIES
- A01—AGRICULTURE; FORESTRY; ANIMAL HUSBANDRY; HUNTING; TRAPPING; FISHING
- A01K—ANIMAL HUSBANDRY; AVICULTURE; APICULTURE; PISCICULTURE; FISHING; REARING OR BREEDING ANIMALS, NOT OTHERWISE PROVIDED FOR; NEW BREEDS OF ANIMALS
- A01K2217/00—Genetically modified animals
- A01K2217/07—Animals genetically altered by homologous recombination
- A01K2217/072—Animals genetically altered by homologous recombination maintaining or altering function, i.e. knock in
-
- A—HUMAN NECESSITIES
- A01—AGRICULTURE; FORESTRY; ANIMAL HUSBANDRY; HUNTING; TRAPPING; FISHING
- A01K—ANIMAL HUSBANDRY; AVICULTURE; APICULTURE; PISCICULTURE; FISHING; REARING OR BREEDING ANIMALS, NOT OTHERWISE PROVIDED FOR; NEW BREEDS OF ANIMALS
- A01K2227/00—Animals characterised by species
- A01K2227/10—Mammal
- A01K2227/105—Murine
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N2800/00—Nucleic acids vectors
- C12N2800/10—Plasmid DNA
- C12N2800/106—Plasmid DNA for vertebrates
- C12N2800/107—Plasmid DNA for vertebrates for mammalian
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N2800/00—Nucleic acids vectors
- C12N2800/30—Vector systems comprising sequences for excision in presence of a recombinase, e.g. loxP or FRT
Landscapes
- Health & Medical Sciences (AREA)
- Life Sciences & Earth Sciences (AREA)
- Genetics & Genomics (AREA)
- Chemical & Material Sciences (AREA)
- Engineering & Computer Science (AREA)
- Organic Chemistry (AREA)
- Zoology (AREA)
- Biotechnology (AREA)
- General Engineering & Computer Science (AREA)
- Molecular Biology (AREA)
- Bioinformatics & Cheminformatics (AREA)
- General Health & Medical Sciences (AREA)
- Wood Science & Technology (AREA)
- Biomedical Technology (AREA)
- Biochemistry (AREA)
- Biophysics (AREA)
- Cell Biology (AREA)
- Physics & Mathematics (AREA)
- Microbiology (AREA)
- Veterinary Medicine (AREA)
- Plant Pathology (AREA)
- Environmental Sciences (AREA)
- Gastroenterology & Hepatology (AREA)
- Immunology (AREA)
- Toxicology (AREA)
- Endocrinology (AREA)
- Medicinal Chemistry (AREA)
- Proteomics, Peptides & Aminoacids (AREA)
- Animal Behavior & Ethology (AREA)
- Animal Husbandry (AREA)
- Biodiversity & Conservation Biology (AREA)
- Mycology (AREA)
- Micro-Organisms Or Cultivation Processes Thereof (AREA)
Abstract
The present invention relates to a polynucleotide sequence comprising a chimeric recombinase coding sequence, a fragment comprising a cell proliferation factor gene, a coding sequence of a first recombinase, a coding sequence of an estrogen receptor (ER) ligand-binding domain, and a recognition site for a second recombinase. The invention also provides polynucleotide products, nucleic acid constructs, host cells and the like comprising the polynucleotide sequence, as well as methods of labeling or tracing proliferating cells in vivo using these polynucleotide products, nucleic acid constructs and host cells. The invention provides a proliferating-cell labeling system based on an artificially engineered chimeric recombinase and achieves continuous, long-term cell labeling and tracing. The invention traces the proliferation of various cell types in vivo for the first time, which is of great significance for understanding the dynamic changes of cell populations in vivo.
Description
Technical Field
The present invention relates to a cell proliferation labeling and tracing system and uses thereof.
Background
Cell proliferation in vivo is a dynamic and unstable process: cells of the same type proliferate to different extents over the same period. Capturing the proliferation state of cells in vivo is crucial to understanding physiological and pathological processes and has therefore long been a focus of biological research. Conventional methods for capturing cell proliferation mainly comprise staining for molecular markers of proliferation, incorporation of DNA analogs, and incorporation of isotopes. Each has drawbacks. Staining for proliferation markers captures only the proliferation occurring within a short time window. DNA analog incorporation can be used only over a limited period, because long-term incorporation of a DNA analog is likely to affect normal cell proliferation. Isotope incorporation has less effect on normal cell proliferation, but detection is difficult.
Genetic lineage tracing has already been used to trace proliferating cells in vivo. For example, researchers have used the promoter of the cell proliferation marker Ki67 to drive an inducible recombinase that labels cells by recombination; that is, Ki67-CreER tool mice are crossed with a corresponding constitutively expressed reporter line. When a cell proliferates and expresses Ki67, the activity of the Ki67 promoter drives expression of the downstream CreER; if tamoxifen is present at the same time, Cre enters the nucleus and recombines the LoxP sites of the reporter gene, so that the cell is labeled with the fluorescence of the reporter. Although this technique can track proliferating cells in vivo to some extent, it has a significant drawback: labeling with Ki67-CreER requires two conditions to be met simultaneously, namely Ki67 activity to drive CreER expression and the presence in vivo of tamoxifen to bring CreER into the nucleus. For rapidly proliferating cells, tamoxifen acting in vivo for a period of time is essentially sufficient to capture the proliferating cells. For slowly proliferating cells, however, tamoxifen acts in vivo for only a short time; if a cell has Ki67 activity driving CreER expression at a moment when no tamoxifen is present to bring Cre into the nucleus, the cell is not fluorescently labeled and the signal is missed. Treating with tamoxifen continuously throughout a long tracing period is also unrealistic: it would affect the physiological state of the mice, and tamoxifen-induced nuclear entry of Cre is not fully efficient.
The present work aims to establish a genetic lineage tracing technique for tracing proliferating cells in vivo that captures signals easily missed because proliferation genes are expressed only transiently, and that continuously records cell proliferation in vivo over a long period.
Disclosure of Invention
In a first aspect, the invention provides a nucleic acid molecule selected from:
(1) a nucleic acid molecule comprising a 5' homology arm, a 3' homology arm and, located between the 5' homology arm and the 3' homology arm, a first recombinase coding sequence, an estrogen receptor ER coding sequence, and a recognition site for a second recombinase, wherein the 5' homology arm and the 3' homology arm are capable of recombining the sequence between them into a genome such that the sequence is co-expressed with a cell proliferation factor gene in the genome; and
(2) the complementary sequence of the nucleic acid molecule of (1).
In one or more embodiments, the polynucleotide sequence of the nucleic acid molecule comprises, in order from the 5' end to the 3' end, a 5' homology arm, a first recombinase coding sequence, a recognition site for a second recombinase, an estrogen receptor ER coding sequence, a recognition site for a second recombinase, and a 3' homology arm, the 5' homology arm and the 3' homology arm being capable of recombining the sequence between them at the 5' or 3' end of the cell proliferation factor gene.
In one or more embodiments, the cell proliferation factor is Ki67 or PCNA.
In one or more embodiments, the nucleic acid sequence of the 5' homology arm is as set forth in nucleotides 1-3000 of SEQ ID NO: 1.
In one or more embodiments, the nucleic acid sequence of the 3' homology arm is as set forth in nucleotides 5128-8127 of SEQ ID NO: 1.
In one or more embodiments, the first recombinase and the second recombinase are Cre and Dre or Dre and Cre, respectively, wherein the recognition site for Cre is LoxP and the recognition site for Dre is rox.
In one or more embodiments, the amino acid sequence of Dre is as set forth in amino acids 1-356 of SEQ ID NO: 2.
In one or more embodiments, the amino acid sequence of Cre is as set forth in SEQ ID NO: 3. In one or more embodiments, the nucleic acid sequence of Cre is as set forth in nucleotides 3067-4099 of SEQ ID NO: 1.
In one or more embodiments, the nucleic acid sequence of LoxP is as set forth in SEQ ID NO: 4.
In one or more embodiments, the nucleic acid sequence of rox is as set forth in SEQ ID NO: 5.
In one or more embodiments, the amino acid sequence of the estrogen receptor ER is as set forth in amino acids 357-666 of SEQ ID NO: 2. In one or more embodiments, the nucleic acid sequence of the estrogen receptor ER is as set forth in nucleotides 4153-5085 of SEQ ID NO: 1.
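The coordinates given above for SEQ ID NO: 1 can be collected into a small feature table to check that the annotated regions are ordered and non-overlapping. The following is only an illustrative sketch: the `Feature` class and `gaps` helper are hypothetical names introduced here, the coordinates are assumed to be 1-based and inclusive, and the content of the unannotated intervals between features is not stated in the text.

```python
from typing import NamedTuple


class Feature(NamedTuple):
    """An annotated region of SEQ ID NO: 1 (1-based, inclusive coordinates assumed)."""
    name: str
    start: int
    end: int

    @property
    def length(self) -> int:
        return self.end - self.start + 1


# Coordinates as stated in the embodiments above.
FEATURES = [
    Feature("5' homology arm", 1, 3000),
    Feature("Cre coding sequence", 3067, 4099),
    Feature("ER coding sequence", 4153, 5085),
    Feature("3' homology arm", 5128, 8127),
]


def gaps(features):
    """Return (left name, right name, unannotated length) for consecutive features."""
    ordered = sorted(features, key=lambda f: f.start)
    return [(a.name, b.name, b.start - a.end - 1) for a, b in zip(ordered, ordered[1:])]


if __name__ == "__main__":
    for f in FEATURES:
        print(f"{f.name}: {f.start}-{f.end} ({f.length} nt)")
    for left, right, n in gaps(FEATURES):
        print(f"unannotated interval between {left} and {right}: {n} nt")
```

A negative interval length from `gaps` would indicate overlapping annotations; with the coordinates above, all intervals are positive and both homology arms are the same length.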
The invention also provides a nucleic acid construct comprising a nucleic acid molecule as described herein.
The present invention also provides a recombinase system comprising (1) a nucleic acid molecule as described herein, (2) optionally, a nucleic acid molecule encoding a fusion protein of a second recombinase and an estrogen receptor ER, and (3) optionally, a nucleic acid molecule comprising, in order from the 5' end to the 3' end, a recognition site for the first recombinase, a termination sequence, a recognition site for the first recombinase, and a marker coding sequence.
The present invention also provides a recombinase system comprising (1) a nucleic acid construct as described herein, (2) optionally, a second nucleic acid construct having a polynucleotide sequence encoding a fusion protein of a second recombinase and an estrogen receptor ER, and (3) optionally, a third nucleic acid construct having a polynucleotide sequence comprising, in order from the 5' end to the 3' end, a recognition site for the first recombinase, a termination sequence, a recognition site for the first recombinase, and a marker coding sequence.
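The effect of each recombinase on its target construct can be illustrated by treating a construct as an ordered list of elements and removing whatever lies between two copies of a recognition site. This is a simplified sketch, not a model of the actual sequences: it assumes the two sites are in the same orientation (so that recombination excises the flanked segment), uses the Cre/LoxP and Dre/rox pairing from the embodiments above, and the element labels `"STOP"` and `"GFP"` are placeholder names for the termination sequence and the marker.

```python
def excise(construct, site):
    """Remove the segment flanked by the first two copies of `site`,
    leaving a single copy of the site behind.

    Assumes both sites are in the same orientation. If fewer than two
    copies are present, the construct is returned unchanged.
    """
    positions = [i for i, element in enumerate(construct) if element == site]
    if len(positions) < 2:
        return list(construct)
    first, second = positions[0], positions[1]
    return list(construct[: first + 1]) + list(construct[second + 1 :])


# First construct: recombinase coding sequence followed by a rox-flanked ER.
crexer = ["Cre", "rox", "ER", "rox"]
# Third construct: a LoxP-flanked termination sequence upstream of the marker.
reporter = ["loxP", "STOP", "loxP", "GFP"]

if __name__ == "__main__":
    print(excise(crexer, "rox"))     # Dre acting on rox removes ER
    print(excise(reporter, "loxP"))  # Cre acting on LoxP removes the termination sequence
```

In this picture, the second recombinase converts the ER-fused first recombinase into a free first recombinase, and the first recombinase in turn removes the termination sequence so that the marker can be expressed.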
The invention also provides a host cell comprising one or more of: (1) a polynucleotide sequence as described herein; (2) a nucleic acid construct as described herein; (3) a system as described herein.
The present invention also provides a method for constructing a transgenic animal, comprising:
mating a first F0 animal with a second F0 animal or a third F0 animal, wherein homologous recombination occurs in the progeny, to obtain a transgenic animal comprising the first and second polynucleotide sequences or a transgenic animal comprising the first and third polynucleotide sequences; or
mating any two of the three F0 animals, then mating the resulting F1 animal, in which homologous recombination has occurred, with the third F0 animal, wherein homologous recombination occurs in the F2 generation, to obtain a transgenic animal comprising the first, second and third polynucleotide sequences.
In one or more embodiments, the genome of the first F0 animal comprises a first polynucleotide sequence, wherein the first polynucleotide sequence is the polynucleotide sequence of the nucleic acid molecule described herein, or the first polynucleotide sequence comprises, in order from the 5' end to the 3' end, a first recombinase coding sequence, a recognition site for a second recombinase, an estrogen receptor (ER) coding sequence, and a recognition site for a second recombinase, and the first polynucleotide sequence is co-expressed with a cell proliferation factor gene in the genome.
In one or more embodiments, the genome of the second F0 animal comprises a second polynucleotide sequence encoding a fusion protein of a second recombinase and an estrogen receptor ER.
In one or more embodiments, the genome of the third F0 animal comprises a third polynucleotide sequence comprising, in order from the 5' end to the 3' end, a recognition site for a first recombinase, a termination sequence, a recognition site for a first recombinase, and a marker coding sequence.
In one or more embodiments, the animal is a mouse.
In one or more embodiments, the first and second recombinases, estrogen receptor ER and cell proliferation factor are as described herein.
The present invention also provides a method for constructing a transgenic animal, comprising introducing any one, two or three of the first, second and third polynucleotide sequences into animal cells, culturing the cells, and screening for a transgenic animal whose genome comprises any one, two or three of the first, second and third polynucleotide sequences, wherein
the first polynucleotide sequence comprises, in order from the 5' end to the 3' end, a first recombinase coding sequence, a recognition site for a second recombinase, an estrogen receptor ER coding sequence and a recognition site for the second recombinase, and the first polynucleotide sequence is co-expressed with a cell proliferation factor gene in the genome of the transgenic animal,
the second polynucleotide sequence encodes a fusion protein of a second recombinase and an estrogen receptor ER, and
the third polynucleotide sequence comprises, in order from the 5' end to the 3' end, a recognition site for the first recombinase, a termination sequence, a recognition site for the first recombinase and a marker coding sequence.
In one or more embodiments, the cell is an animal ES cell.
In one or more embodiments, the animal is a mouse.
In one or more embodiments, the first and second recombinases, estrogen receptor ER and cell proliferation factor are as described herein.
The invention also provides an in vivo long-term cell labeling method comprising labeling cells expressing a cell proliferation factor gene in an animal comprising a first, second and third polynucleotide sequence in the presence of an inducer that interacts with the estrogen receptor ER, wherein,
the first polynucleotide sequence comprises, in order from the 5' end to the 3' end, a first recombinase coding sequence, a recognition site for a second recombinase, an estrogen receptor ER coding sequence and a recognition site for the second recombinase, and the first polynucleotide sequence is co-expressed with a cell proliferation factor gene in the genome of the animal,
the second polynucleotide sequence encodes a fusion protein of a second recombinase and an estrogen receptor ER, and
the third polynucleotide sequence comprises, in order from the 5' end to the 3' end, a recognition site for the first recombinase, a termination sequence, a recognition site for the first recombinase and a marker coding sequence.
In one or more embodiments, the animal is a mouse.
In one or more embodiments, the first and second recombinases, estrogen receptor ER and cell proliferation factor are as described herein.
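The difference between this labeling method and the conventional Ki67-CreER approach described in the Background can be sketched as a toy logic model. The sketch below is an illustration under strong assumptions that are not part of the source: time is divided into discrete steps, proliferation factor expression and inducer presence are booleans at each step, and every recombination event is treated as fully efficient and irreversible. The function names are hypothetical.

```python
def conventional_labeled(ki67, inducer):
    """Conventional Ki67-CreER model: a cell is labeled only if the
    proliferation factor and the inducer are present at the same time step."""
    return any(k and t for k, t in zip(ki67, inducer))


def three_sequence_labeled(ki67, inducer):
    """Model of the three-sequence system: once the inducer has been present,
    the ER sequence is excised (irreversibly), and any later expression of
    the proliferation factor leads to labeling without further inducer."""
    er_removed = False
    labeled = False
    for k, t in zip(ki67, inducer):
        if t:
            er_removed = True  # second recombinase-ER fusion enters the nucleus and excises ER
        if er_removed and k:
            labeled = True     # first recombinase is co-expressed and removes the termination sequence
    return labeled


if __name__ == "__main__":
    # A slowly proliferating cell: the inducer is present only at step 0,
    # and the proliferation factor is expressed only at step 3.
    ki67 = [False, False, False, True, False]
    inducer = [True, False, False, False, False]
    print(conventional_labeled(ki67, inducer))
    print(three_sequence_labeled(ki67, inducer))
```

In this toy model, a single early pulse of inducer is enough for later proliferation events to be recorded, whereas the conventional model records only events that coincide with the inducer.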
The invention also provides the use of a nucleic acid molecule, nucleic acid construct, system or host cell as described herein in long-term cell labeling or cell tracking.
The invention also provides a kit comprising a nucleic acid molecule, nucleic acid construct, system or host cell as described herein, and reagents required to knock the nucleic acid molecule into the genome of the cell.
Drawings
FIG. 1 shows cell proliferation tracing according to one embodiment of the present invention alongside the conventional technique. (a) Cartoon of the Ki67 expression profile and fate profile: Ki67 is dynamically expressed at different time points from T1 to Tn, and the Ki67 fate profile means that all Ki67 expression from T1 to Tn is captured. (b) Under tamoxifen induction, DreER enters the nucleus and excises the ER sequence downstream of Cre, so that whenever Ki67 is expressed, the co-expressed Cre enters the nucleus and recombines the reporter gene to label the proliferating cell, allowing seamless tracing of cell proliferation over a long period. (c) When cell proliferation is traced with the conventional genetic lineage tracing technique, proliferation can be detected only over a short time because the window of tamoxifen action is short.
FIG. 2 shows the construction and validation of an exemplary chimeric recombinase according to one embodiment of the present invention. (a) Schematic of the strategy for constructing the Ki67-CrexER knock-in mouse. (b) Immunofluorescence staining of small intestine sections from adult Ki67-CrexER knock-in mice. Co-staining of ESR with Ki67 and EdU, and its quantification, show that the expression pattern of ESR is substantially consistent with the proliferation markers. (c) Ki67-CrexER mice were crossed with R26-GFP mice and induced with tamoxifen (Tam) in adulthood; two-photon confocal images of aortic sections immunostained for GFP and VE-Cad were acquired 5 days later. Quantification shows that Ki67-CrexER labels endothelial cells occurring in pairs, suggesting that the Ki67-CrexER knock-in allele captures the proliferative activity of cells in vivo. Scale bars: 100 μm in b and 500 μm in c. Each image is representative of at least three biological replicates.
FIG. 3 shows a method of labeling and tracing proliferating cells (ProTracer) according to one embodiment of the present invention. (a) Cartoon of the conversion of Ki67-CrexER to Ki67-Cre under induction by DreER and tamoxifen. (b) The three mouse lines required to trace cell proliferation using the method described herein. (c) Timeline of tamoxifen induction and sample collection for tracing proliferating cells by the conventional method and by ProTracer. (d) Proliferation signals in tissues and organs of adult mice traced for two days and for four weeks by the conventional method and by ProTracer, shown as whole-mount fluorescence images and immunofluorescence images of tissue sections; insets are bright-field whole-mount images of the same tissue or organ. In each pair of images, the upper one is the result of the conventional method (Ki67-CreER) and the lower one is the result of the ProTracer system. Scale bars: 1 mm in whole-mount images and 100 μm in section fluorescence images. Each image is representative of five independent biological replicates.
FIG. 4 shows that DreER does not cross-react with R26-GFP. Mice obtained by crossing R26-DreER with R26-GFP were induced with tamoxifen in adulthood; four weeks later, tissues and organs were collected and imaged for whole-mount fluorescence or, after sectioning, for GFP immunofluorescence. Insets are bright-field whole-mount images of the same tissue or organ. Scale bars: 1 mm for whole-mount GFP fluorescence and 100 μm otherwise. Each image is representative of 5 biological replicates.
FIG. 5 shows that Ki67-CrexER;R26-GFP and Ki67-CrexER;R26-DreER;R26-GFP mice have substantially no signal leakage. (a, b) Whole-mount fluorescence images and immunofluorescence-stained tissue sections from 12-week-old Ki67-CrexER;R26-GFP and Ki67-CrexER;R26-DreER;R26-GFP mice. Scale bars: 1 mm for whole-mount images and 100 μm for tissue sections. Each image is representative of 5 individual biological replicates.
FIG. 6 shows proliferation of hepatocytes in the adult liver. (a) Cartoon of the experimental strategy for tracing hepatocyte proliferation using the method of the invention. (b) Cartoon of the division of the liver into three metabolic zones from the portal vein region to the central vein region; arrows indicate the direction of blood flow. (c) Immunofluorescence staining for GFP, GS and E-CAD of liver sections from ProTracer mice on day 0 and day 2 of tamoxifen induction. (d) Immunofluorescence staining for GFP, GS and E-CAD of liver sections from ProTracer mice at 2, 4, 6, 8, 10 and 12 weeks after tamoxifen induction. (e) Quantification of the number of GFP+ cells per mm² in different zones of the hepatic lobule at the different sampling time points. 1, 2 and 3 represent Zone 1 (E-CAD+), Zone 2 (E-CAD−GS−) and Zone 3 (GS+), respectively. Scale bar: 100 μm. Each image is representative of 5 individual biological replicates.
FIG. 7 shows the proliferation of cardiomyocytes in the adult heart. (a) Cartoon representation of lineage tracing of Ki67-expressing cells from day 1 to day n; green represents the proliferation signal of Ki67 expression. (b) Cartoon representation of seamless tracing of proliferating cells. (c) GFP and TNNI3 immunofluorescence staining of heart sections from ProTracker mice after three months of lineage tracing; 1 and 2 are magnified views of the regions marked in the left panel. (d) GFP and WGA fluorescence staining of heart tissue sections from ProTracker mice after three months of lineage tracing; the right side shows a magnified view of the region outlined on the left, with GFP and TNNI3 immunofluorescence staining of the same magnified region below.
FIG. 8 shows the capture of nuclear division and cell division of cardiomyocytes in the adult heart. (a) Cartoon representation of lineage tracing of Ki67-expressing cells from day 1 to day n; green represents the proliferation signal of Ki67 expression, and the number of nuclei can be analyzed subsequently. (b) Cardiomyocytes were isolated from hearts of ProTracker mice after three months of lineage tracing and stained with Hoechst, and the number of nuclei was counted in GFP+ and GFP- cardiomyocytes. (c) 3D rendering of a multi-slice scan of GFP+ cardiomyocytes (xyz: 500 × 100 μm). (d) Magnified multi-slice scan showing a GFP+ binucleated cardiomyocyte; yellow arrows indicate cardiomyocyte nuclei, and the inset is a cartoon representation of the binucleated cardiomyocyte. (e) Magnified xy and yz 3D views of a region in panel c showing two immediately adjacent cardiomyocytes. (f) Magnified multi-slice scan showing that the two immediately adjacent GFP+ cardiomyocytes are both mononucleated. Yellow arrows indicate cardiomyocyte nuclei; hollow white arrows indicate nuclei of non-cardiomyocytes next to the GFP+ cardiomyocytes. The inset is a cartoon representation of two adjacent mononucleated cardiomyocytes. (g) Quantification of the number of adjacent cardiomyocytes among GFP+ cardiomyocytes. The scale bar represents 100 μm. Each image is representative of five independent biological replicates.
Detailed Description
It is understood that within the scope of the present invention, the above-described technical features of the present invention and the technical features described in detail below (e.g., the embodiments) can be combined with each other to constitute a preferred technical solution.
The present invention aims to provide a genetic lineage tracing technique for tracing proliferating cells in vivo, which captures signals that would otherwise be missed because proliferation genes are expressed only transiently, and thereby reconstructs long-term cell proliferation in vivo.
The Ki67-CreER mouse established with traditional genetic lineage tracing technology cannot track slowly proliferating cells over a long period, because the window of tamoxifen action in vivo is short and only cells that proliferate within that window are labeled (FIG. 1, c). Cell proliferation factor genes such as Ki67 are expressed dynamically in vivo, so staining for a single proliferation marker, like the traditional lineage tracing method, captures the proliferation expression profile only at a single time point; only a fate map of the cells that have expressed the cell proliferation factor gene can truly reflect cell proliferation in vivo over a given period (FIG. 1, a). However, direct lineage tracing with Ki67-Cre is not feasible: because all cells of the body descend from the same zygote through proliferation, a constitutively active Cre would label every cell. In order to track slowly proliferating cells in vivo over a long period, the inducible CreER must therefore be converted, under defined conditions, into a Cre that enters the nucleus as soon as it is expressed.
To solve this problem, the invention introduces a second recombination system, Dre-rox, which is independent of Cre-LoxP, to modify the CreER used in prior-art lineage tracing. Recognition sites (rox) of the Dre recombinase are added at both ends of the ER DNA sequence, so that CreERT2 is converted into Cre-rox-ERT2-rox, referred to herein as CrexER. When DreER and tamoxifen are both present in vivo, tamoxifen induces Dre to enter the nucleus and recombine the rox sites at both ends of the ER DNA, which changes CrexER from an inducible recombinase into a Cre recombinase that enters the nucleus directly. Thereafter, whenever the proliferation-specific gene is expressed in the cell, the co-expressed Cre directly recognizes the reporter gene and labels the cell (FIG. 1, b). The invention uses this cell proliferation tracker (ProTracker) to trace the expression of proliferation-specific genes in cells continuously over a long time course.
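The conversion of CrexER to Cre described above can be sketched as a toy model. This is only an illustration under stated assumptions: the construct is represented as an ordered list of element names, the function names are hypothetical, and the model assumes the usual outcome of excision between two same-orientation recognition sites, in which a single site remains.

```python
# Toy model (illustrative only): a construct is an ordered list of element names.

def excise_between(construct, site):
    """Remove the segment flanked by the first two `site` elements.

    One copy of the site is left behind, which is the usual result of
    excision between two recognition sites in the same orientation.
    """
    positions = [i for i, element in enumerate(construct) if element == site]
    if len(positions) < 2:
        return list(construct)  # fewer than two sites: nothing to excise
    first, second = positions[0], positions[1]
    return construct[:first + 1] + construct[second + 1:]


def dre_recombination(construct, dreer_present, tamoxifen_present):
    """DreER acts on rox sites only when tamoxifen lets it enter the nucleus."""
    if dreer_present and tamoxifen_present:
        return excise_between(construct, "rox")
    return list(construct)


def needs_tamoxifen(construct):
    """A recombinase still fused to ER stays tamoxifen-dependent."""
    return "ER" in construct


crexer = ["Cre", "rox", "ER", "rox"]
converted = dre_recombination(crexer, dreer_present=True, tamoxifen_present=True)
print(converted)                   # ['Cre', 'rox']
print(needs_tamoxifen(crexer))     # True
print(needs_tamoxifen(converted))  # False
```

In this sketch the conversion is permanent: once ER has been removed, later calls no longer depend on tamoxifen, which mirrors the reason a single induction is sufficient for long-term tracing.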
The engineered tracing system of the invention comprises a nucleic acid molecule encoding a chimeric recombinase, the polynucleotide sequence of said nucleic acid molecule being selected from the group consisting of: a polynucleotide sequence comprising a fragment of a cell proliferation factor gene, a coding sequence for a first recombinase, a coding sequence for an estrogen receptor ER, and a recognition site for a second recombinase, and/or a complement of the polynucleotide sequence. In one embodiment, the polynucleotide sequence of the nucleic acid molecule comprises, in order from the 5' end to the 3' end, a first fragment of the cell proliferation factor gene, a coding sequence for a first recombinase, a recognition site for a second recombinase, a coding sequence for an estrogen receptor ER, a recognition site for a second recombinase, and a second fragment of the cell proliferation factor gene. Preferably, the nucleic acid molecule of the invention is selected from (1) a nucleic acid molecule comprising a 5' homology arm and a 3' homology arm, and, located between the 5' homology arm and the 3' homology arm, a first recombinase coding sequence, an estrogen receptor ER coding sequence, and a recognition site for a second recombinase, wherein the 5' homology arm and the 3' homology arm are capable of recombining the sequences between them into a genome such that those sequences are co-expressed with a cell proliferation factor gene in the genome; and (2) a complement of the nucleic acid molecule of (1).
The polynucleotides herein may be in the form of DNA or RNA. The form of DNA includes cDNA or artificially synthesized DNA. The DNA may be single-stranded or double-stranded. The DNA may be the coding strand or the non-coding strand. In certain embodiments, the polynucleotide sequence is set forth in SEQ ID NO: 1.
The "cell proliferation factor gene" or "cell proliferation-specific gene" as used herein refers to a gene that is expressed only during cell proliferation. Cell proliferation as described herein includes, but is not limited to, mitosis, amitosis, meiosis, binary fission, and the like. In one embodiment, the cell proliferation factor is Ki67 and/or PCNA. Ki-67 is currently one of the most commonly used cell proliferation factors; it is expressed in every phase of the cell cycle except G0. PCNA stands for Proliferating Cell Nuclear Antigen and is present only in normally proliferating cells and tumor cells.
Herein, the 5' homology arm and the 3' homology arm may each be a fragment of a cell proliferation factor gene, as long as the fragment can serve as a homology arm for homologous recombination. The fragment may comprise the promoter of the cell proliferation factor gene or be located after it, or the fragment may be located before or within the 3' UTR of the gene or comprise the 3' UTR. In this way, expression of a sequence of interest (e.g., a chimeric recombinase as described herein) is placed under the control of a specific promoter, so that it is expressed in a particular tissue or organ or at a particular developmental stage. In one embodiment, the fragments are selected such that the cell proliferation factor gene is expressed or not expressed following homologous recombination. In one embodiment, the fragments are selected such that expression of the cell proliferation factor gene is not affected following homologous recombination. The fragments of the cell proliferation factor gene include a first fragment and a second fragment, which serve as the 5' and 3' homology arms required for homologous recombination, respectively. In one embodiment, the first and second fragments are selected such that the sequence of interest is inserted downstream of a promoter of the cell proliferation factor gene. In one embodiment, the first and second fragments are selected such that the sequence of interest is inserted between the last exon and the 3' UTR of the cell proliferation factor gene. In one or more embodiments, the fragments of the cell proliferation factor gene are as shown at positions 1-3000 and 5128-8127 of SEQ ID NO: 1.
A recombinase suitable for use herein can be any recombinase, including a first recombinase and a second recombinase. The first and second recombinases herein may be different. In one embodiment, the first and second recombinases are Cre and Dre, or Dre and Cre, respectively. The Cre recombinase herein may be a Cre recombinase known in the art, whose gene coding region is 1029 bp in full length (EMBL database accession number X03453) and encodes a 38 kDa monomeric protein of 343 amino acids. Cre recombinase not only has catalytic activity but, like a restriction enzyme, also recognizes a specific DNA sequence, namely the LoxP site. Cre recombinase mediates specific recombination between two LoxP sites (sequences), so that the gene sequence between the LoxP sites is deleted or recombined. Cre recombinases suitable for use herein also include mutants of Cre that retain recombinase activity. In certain embodiments, the amino acid sequence of recombinase Cre is as set forth in SEQ ID NO: 3 and the nucleic acid sequence is as set forth at positions 3067-4099 of SEQ ID NO: 1. Dre is a recombinase similar to Cre; it specifically recognizes a different recombination site, rox, just as Cre specifically recognizes the LoxP site. Dre recombinases suitable for use herein also include mutants of Dre that retain recombinase activity. In certain embodiments, the amino acid sequence of recombinase Dre is as set forth at positions 1-356 of SEQ ID NO: 2. "LoxP" and "rox" as used herein have their art-recognized meanings, and their sequences are well known in the art. Illustratively, the nucleic acid sequence of LoxP is shown in SEQ ID NO: 4, and the nucleic acid sequence of rox is shown in SEQ ID NO: 5, wherein N represents any nucleotide.
Estrogen receptors (ERs) are members of the steroid hormone receptor protein superfamily. Chimeric recombinases can be generated by fusing a ligand-binding domain (LBD) of the estrogen receptor to the recombinase. Because of the presence of the estrogen receptor ligand binding region, the chimeric recombinase cannot enter the nucleus to bind the recombination site and remains localized in the cytoplasm. Only after the addition of estrogen can the chimeric recombinase enter the nucleus and function. The ligand binding domain of the estrogen receptor may be mutated such that it is unable to bind physiological estrogen in the body and binds only exogenous inducers. The inducer forms a stable complex with the ligand binding domain of the estrogen receptor, and the complex is transported into the nucleus. Inducers include estrogen analogues such as tamoxifen or 4-OHT. Estrogen receptor ligand binding regions suitable for use herein also include mutants known in the art in which the ligand binding region is mutated but the resulting mutated region retains the biological function of the ligand binding region (i.e., the chimeric recombinase is still capable of binding an inducer after the mutation). The mutation may be an insertion, deletion or substitution, and the number of mutated amino acids may be one or more, for example, within 20, preferably within 10, more preferably within 5. In some embodiments, the mutation is a substitution mutation. It is well known in the art that substitution of an amino acid with one that is chemically similar (i.e., a conservative substitution) has little to no effect on the function of the resulting protein. Thus, in some preferred embodiments of the invention, the substitution is a conservative substitution.
Examples of conservative substitutions include, but are not limited to, substitutions between amino acids whose side chain groups have the same polarity, such as between non-polar amino acids such as Ala, Val, Leu, Ile, Pro, Phe, Trp and Met, between polar uncharged (hydrophilic) amino acids such as Gly, Ser, Thr, Cys, Tyr, Asn and Gln, between polar positively charged amino acids such as Lys, Arg and His, or between polar negatively charged amino acids such as Asp and Glu; and substitutions between aliphatic amino acids such as Ala, Val, Leu, Ile, Met, Asp, Glu, Lys, Arg, Gly, Ser, Thr, Cys, Asn and Gln, between aromatic amino acids such as Phe and Tyr, between heterocyclic amino acids such as His and Trp, and the like. In exemplary embodiments, the amino acid sequence of the estrogen receptor ligand binding region described herein is shown at positions 357-666 of SEQ ID NO: 2, and the nucleic acid sequence is shown at positions 4153-5085 of SEQ ID NO: 1.
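The polarity-based grouping above lends itself to a simple lookup. The following is a minimal sketch that encodes only the four side-chain polarity classes listed in the preceding paragraph; the structural classes (aliphatic, aromatic, heterocyclic) are deliberately left out, and the function names are hypothetical.

```python
# Side-chain polarity classes as listed in the text (three-letter codes).
POLARITY_CLASSES = {
    "nonpolar": {"Ala", "Val", "Leu", "Ile", "Pro", "Phe", "Trp", "Met"},
    "polar_uncharged": {"Gly", "Ser", "Thr", "Cys", "Tyr", "Asn", "Gln"},
    "positive": {"Lys", "Arg", "His"},
    "negative": {"Asp", "Glu"},
}


def polarity_class(residue):
    """Return the polarity class name of a residue, or raise for unknown codes."""
    for name, members in POLARITY_CLASSES.items():
        if residue in members:
            return name
    raise ValueError(f"unknown residue: {residue}")


def is_conservative_by_polarity(original, substitute):
    """True if the substitution stays within one polarity class."""
    if original == substitute:
        return False  # not a substitution at all
    return polarity_class(original) == polarity_class(substitute)


print(is_conservative_by_polarity("Leu", "Ile"))  # True
print(is_conservative_by_polarity("Asp", "Glu"))  # True
print(is_conservative_by_polarity("Lys", "Glu"))  # False
```

A check of this kind only classifies a substitution; whether a given mutant ligand binding region actually retains inducer binding would still have to be established experimentally.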
The polypeptides forming the fusion protein of the invention may be linked directly or through a linker sequence. The linker may be a linking sequence that allows multiple cistrons to be expressed from a single vector, such as an internal ribosome entry site (IRES) or a 2A peptide. It is well known in the art that 2A peptides are short peptides capable of inducing self-cleavage of proteins, and include the F2A, P2A and T2A peptides, among others. The 2A peptide may also be attached to the flanking polypeptides through a conventional linker containing G and S. In one or more embodiments, the coding sequence for the P2A peptide comprises or consists of nucleotides 3001-3066 of SEQ ID NO: 1.
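The coordinates given so far for SEQ ID NO: 1 (homology arms, P2A, Cre and the ER ligand binding region) can be collected into a small feature map to check lengths and to see which intervals are not explicitly annotated. This is a sketch only: the feature labels and function names are illustrative, and the sequence itself is not reproduced here.

```python
# 1-based, inclusive coordinates on SEQ ID NO: 1 as stated in the description.
FEATURES = [
    ("5' homology arm", 1, 3000),
    ("P2A coding sequence", 3001, 3066),
    ("Cre coding sequence", 3067, 4099),
    ("ER ligand binding region coding sequence", 4153, 5085),
    ("3' homology arm", 5128, 8127),
]


def feature_length(start, end):
    """Length of a 1-based inclusive interval."""
    return end - start + 1


def unannotated_gaps(features):
    """Return (start, end) intervals lying between consecutive features."""
    ordered = sorted(features, key=lambda f: f[1])
    gaps = []
    for (_, _, prev_end), (_, next_start, _) in zip(ordered, ordered[1:]):
        if next_start > prev_end + 1:
            gaps.append((prev_end + 1, next_start - 1))
    return gaps


for name, start, end in FEATURES:
    print(f"{name}: {start}-{end} ({feature_length(start, end)} nt)")

for start, end in unannotated_gaps(FEATURES):
    print(f"unannotated: {start}-{end} ({feature_length(start, end)} nt)")
```

The two unannotated intervals (4100-4152 and 5086-5127) lie on either side of the ER coding sequence, which is where the element order Cre, rox, ER, rox would place the rox sites; the description does not state this explicitly, so it should be read as an inference rather than an annotation.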
The polynucleotide sequence encoding the chimeric recombinase herein may also include one or more regulatory sequences operatively linked to the chimeric recombinase. The control sequence may be an appropriate promoter sequence. The promoter sequence is typically operably linked to the coding sequence of the protein to be expressed. The promoter may be any nucleotide sequence which shows transcriptional activity in the host cell of choice including mutant, truncated, and hybrid promoters, and may be obtained from genes encoding extracellular or intracellular polypeptides either homologous or heterologous to the host cell. The control sequence may also be a suitable transcription terminator sequence, a sequence recognized by a host cell to terminate transcription. The terminator sequence is operably linked to the 3' terminus of the nucleotide sequence encoding the polypeptide. Any terminator which is functional in the host cell of choice may be used in the present invention.
Also included herein are nucleic acid molecules encoding a fusion protein of a second recombinase and an estrogen receptor ligand binding region. The second recombinase and the estrogen receptor ligand binding region are as defined elsewhere herein. In certain embodiments, the polynucleotide sequence of the nucleic acid molecule encodes the fusion protein Dre-ER shown in SEQ ID NO: 2. In certain embodiments, the fusion protein of the second recombinase and the estrogen receptor ligand binding region is constitutively expressed in the host. In certain embodiments, the fusion protein of the second recombinase and the estrogen receptor ligand binding region is inducibly expressed or specifically expressed in the host. In certain embodiments, the mouse of the invention has the coding sequence for Dre-ER inserted at the Rosa26 locus.
Also included herein are nucleic acid molecules comprising, in order from the 5' end to the 3' end, a recognition site for the first recombinase, a termination sequence, a recognition site for the first recombinase, and a marker coding sequence, so that expression of the marker can be regulated by the recombinase. An exemplary polynucleotide sequence of the nucleic acid molecule has the structure LoxP-termination sequence-LoxP-marker. In certain embodiments, the fusion protein encoded by the polynucleotide is constitutively expressed in a host. Suitable termination sequences for use herein can be any termination sequence known in the art. Suitable markers for use herein may be any marker known in the art. The marker can be a fluorescent protein, including but not limited to green fluorescent proteins (e.g., GFP, ZsGreen), red fluorescent proteins (e.g., tdTomato, DsRed, mCherry), yellow fluorescent protein (YFP), cyan fluorescent protein (CFP), and the like. In one or more embodiments, the marker is GFP, the amino acid sequence of which is shown in SEQ ID NO: 6. In certain embodiments, the mouse of the invention has the LoxP-stop-LoxP-GFP coding sequence inserted at the Rosa26 locus.
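How the LoxP-termination sequence-LoxP-marker structure gates marker expression can be sketched with a toy list model. The element names and function names below are illustrative assumptions, not part of the disclosure; the model simply treats the marker as unreadable while a termination sequence lies upstream of it.

```python
def cre_excise(reporter):
    """Remove the segment between the first two LoxP elements, leaving one LoxP."""
    sites = [i for i, element in enumerate(reporter) if element == "LoxP"]
    if len(sites) < 2:
        return list(reporter)  # already recombined, or no floxed segment
    first, second = sites[0], sites[1]
    return reporter[:first + 1] + reporter[second + 1:]


def marker_expressed(reporter, marker="GFP"):
    """The marker is read only if no termination sequence precedes it."""
    if marker not in reporter:
        return False
    upstream = reporter[:reporter.index(marker)]
    return "stop" not in upstream


reporter = ["LoxP", "stop", "LoxP", "GFP"]
print(marker_expressed(reporter))  # False

recombined = cre_excise(reporter)
print(recombined)                    # ['LoxP', 'GFP']
print(marker_expressed(recombined))  # True
```

Because the excision changes the DNA itself, the recombined state is inherited by daughter cells, which is what makes the marker a lineage label rather than a transient readout.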
Also provided herein are polynucleotide products. A "polynucleotide product" as used herein means a product comprising one or more polynucleotide sequences as described herein. The polynucleotide sequences contained in the polynucleotide product may be the same or different. The plurality of polynucleotide sequences may be linked to one another in any number or be independent of one another. In one embodiment, the polynucleotide product comprises a plurality of polynucleotide sequences that are separate from each other.
In exemplary embodiments, the polynucleotide products described herein comprise: a polynucleotide sequence comprising a fragment of a cell proliferation factor gene, a first recombinase coding sequence, an estrogen receptor ER coding sequence, and a recognition site for a second recombinase; and optionally, a polynucleotide sequence encoding a fusion protein of a second recombinase and an estrogen receptor ER, or a complement thereof, and a polynucleotide sequence comprising, in order from the 5' end to the 3' end, a recognition site for the first recombinase, a termination sequence, a recognition site for the first recombinase, and a marker coding sequence.
Also provided herein are one or more nucleic acid constructs comprising one or more polynucleotide sequences described herein. The nucleic acid construct may also contain one or more regulatory sequences operably linked to the polynucleotide sequence, or sequences required for homologous recombination with the genome. The nucleic acid construct may be a vector. For example, the polynucleotide sequences herein can be inserted into a recombinant expression vector or a gene knock-in vector. In some embodiments, the polynucleotide sequences herein are contained on the same nucleic acid construct. In certain embodiments, the polynucleotide sequences herein are contained on different nucleic acid constructs.
The term "recombinant expression vector" refers to a bacterial plasmid, bacteriophage, yeast plasmid, plant cell virus, mammalian cell virus such as adenovirus, retrovirus, or other vectors well known in the art. Any plasmid or vector may be used as long as it can replicate and is stable in the host. An important feature of expression vectors is that they generally contain an origin of replication, a promoter, a marker gene and translation control elements. The expression vector may also include a ribosome binding site for translation initiation and a transcription terminator. The polynucleotide sequences described herein are operably linked to an appropriate promoter in an expression vector to direct mRNA synthesis via the promoter. Representative examples of such promoters are: lac or trp promoter of E.coli; a lambda phage PL promoter; eukaryotic promoters include CMV immediate early promoter, HSV thymidine kinase promoter, early and late SV40 promoter, LTR of retrovirus, and other known promoters capable of controlling gene expression in prokaryotic or eukaryotic cells or viruses. Marker genes can be used to provide phenotypic traits useful for selection of transformed host cells, including but not limited to dihydrofolate reductase, neomycin resistance, and Green Fluorescent Protein (GFP) for eukaryotic cell culture, or tetracycline or ampicillin resistance for E.coli. When the polynucleotides described herein are expressed in higher eukaryotic cells, transcription will be enhanced if an enhancer sequence is inserted into the vector. Enhancers are cis-acting elements of DNA, usually about 10 to 300 base pairs, that act on a promoter to increase transcription of a gene.
The vectors described herein may be transformed into an appropriate host cell to enable expression of the proteins described herein. In certain embodiments, the polynucleotide or cell marker system described herein is contained in the genome of the host cell. The host cell may be a prokaryotic cell, such as a bacterial cell; a lower eukaryotic cell, such as a yeast cell or a filamentous fungal cell; or a higher eukaryotic cell, such as a mammalian cell. The host cell may also be a plant cell. Representative examples of host cells are: bacterial cells such as E. coli, Streptomyces and Salmonella typhimurium; fungal cells such as yeast and filamentous fungi; plant cells; insect cells such as Drosophila S2 or Sf9; and CHO, COS, 293 cells, or Bowes melanoma cells.
Transformation of a host cell with recombinant DNA can be carried out using conventional techniques well known to those skilled in the art. When the host is a prokaryote such as E. coli, competent cells capable of taking up DNA can be harvested after the exponential growth phase and treated by the CaCl2 method, the steps of which are well known in the art. Another method is to use MgCl2. If desired, transformation can also be carried out by electroporation. When the host is a eukaryote, the following DNA transfection methods may be used: calcium phosphate coprecipitation, or conventional mechanical methods such as microinjection, electroporation, liposome encapsulation, etc. For mice, the DNA may be introduced by fertilized egg injection.
After transformation of the host cell, the resulting transformant can be cultured by conventional methods to allow expression of the fusion protein described herein. The medium used in the culture may be selected from various conventional media depending on the host cell used. The recombinant fusion proteins herein can be isolated and purified using various isolation methods known in the art. Such methods are well known to those skilled in the art and include, but are not limited to: conventional renaturation treatment, treatment with a protein precipitant (such as salt precipitation), centrifugation, cell lysis by osmosis, sonication, ultracentrifugation, molecular sieve chromatography (gel filtration), adsorption chromatography, ion exchange chromatography, High Performance Liquid Chromatography (HPLC), and other various liquid chromatography techniques, and combinations thereof.
Accordingly, host cells comprising the proteins or polynucleotides or expression vectors described herein are also included herein. Such host cells can constitutively express the proteins described herein, can also express the proteins described herein under certain induction conditions, and can also specifically express the proteins described herein in different host cell types. Methods of how to make host cells constitutively express, inducibly express, or specifically express a protein of the invention are well known in the art. For example, in certain embodiments, inducible expression of a protein is achieved by constructing an expression vector of the invention using an inducible promoter. In certain embodiments, tissue-specific expression of a protein is achieved using a tissue-specific expression promoter or associating the coding sequence of the protein with a tissue-specific gene.
The knock-in vector is used to knock the polynucleotide sequences described herein into a region of interest in the genome. Typically, the knock-in vector contains, in addition to the polynucleotide sequence, a 5' homology arm and a 3' homology arm required for homologous recombination with the genome. In certain embodiments, the nucleic acid constructs herein comprise a first fragment of a cell proliferation factor gene serving as the 5' homology arm, a coding sequence for a chimeric recombinase described herein, and a second fragment of the cell proliferation factor gene serving as the 3' homology arm. In other embodiments, the nucleic acid constructs herein comprise a 5' homology arm, a polynucleotide sequence described herein, and a 3' homology arm. When a knock-in vector is used, CRISPR/Cas9 technology can be used at the same time to recombine the polynucleotide sequence into the location of interest. In the CRISPR/Cas9 technique, a guide RNA designed against the target gene directs the Cas9 nuclease to modify the genome at the insertion position, which increases the efficiency of homologous recombination in the modified region so that the target fragment contained in the knock-in vector is recombined into the target site. The Cas9 nuclease may be a Cas9 nuclease well known in the art.
When the polynucleotide sequence described herein is recombined to a position of interest using CRISPR/Cas9 technology, a guide RNA target sequence for the genome of the site of interest is designed and transcribed in vitro according to the sequence to obtain a guide RNA for the gene. Also, knock-in vectors are constructed for recombination of fragments of interest, which may be the coding sequences of proteins (e.g., chimeric recombinases) described herein or polynucleotide sequences described herein. Then, the in vitro transcribed guide RNA and the constructed knock-in vector are co-transformed into a cell of interest (for example, a fertilized egg), and then cells in which the desired fragment is knocked in the site of interest in the genome are selected.
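As an illustration of the guide RNA target design step, candidate target sequences can be enumerated by scanning for protospacers followed by a PAM. The sketch below assumes the commonly used SpCas9 with an NGG PAM and a 20-nt protospacer, which the description does not specify; it scans the forward strand only, and the example sequence is made up for the demonstration.

```python
import re


def find_protospacers(seq, spacer_len=20):
    """List (start, protospacer, PAM) for forward-strand sites followed by NGG.

    Assumes an SpCas9-style NGG PAM; a lookahead is used so that overlapping
    candidates are all reported. Positions are 0-based.
    """
    seq = seq.upper()
    pattern = r"(?=([ACGT]{%d})([ACGT]GG))" % spacer_len
    return [(m.start(), m.group(1), m.group(2)) for m in re.finditer(pattern, seq)]


# Hypothetical 26-nt sequence, not taken from the Mki67 locus.
example = "ATGCATGCATGCATGCATGCAGGTTT"
hits = find_protospacers(example)
print(hits)  # [(0, 'ATGCATGCATGCATGCATGC', 'AGG')]
```

A practical design would also scan the reverse complement and score candidates for off-target matches elsewhere in the genome; those steps are omitted here.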
The invention also includes methods of introducing (e.g., by genetic recombination) a polynucleotide sequence described herein into the genome of a mouse, thereby obtaining a transgenic mouse. Methods for obtaining transgenic animals such as transgenic mice are known in the art, such as fertilized egg injection, embryonic stem cell injection, and the like. In addition, different transgenic mice can be mated to obtain progeny mice with multiple polynucleotides of interest, thereby creating a transgenic mouse model.
The invention provides methods for constructing transgenic animals comprising introducing a nucleic acid molecule, polynucleotide product or nucleic acid construct as described herein into animal cells containing a cell proliferation factor gene and selecting transgenic animals that undergo homologous recombination.
The invention provides a method for constructing a transgenic animal, which comprises the following steps: (1) providing a first transgenic animal having in its genome one or more nucleic acid constructs; (2) providing a second transgenic animal whose genome comprises another one or more nucleic acid constructs that are different from the one or more nucleic acid constructs; and (3) mating the first transgenic animal and the second transgenic animal, and performing homologous recombination in a progeny animal to obtain a progeny transgenic animal.
The invention provides a method for long-term in vivo cell labeling comprising providing an animal comprising a nucleic acid molecule, polynucleotide product or nucleic acid construct as described herein, and labeling cells expressing the cell proliferation factor gene in the animal in the presence of an inducing agent.
For example, when Ki67-CrexER, DreER, and Rosa26-LoxP-stop-LoxP-GFP are all present, tamoxifen induces the Dre in DreER to recognize the rox sites in the CrexER gene sequence, and recombination converts CrexER into Cre, so that Cre is co-expressed and released whenever Ki67 is expressed. Cre then recognizes LoxP and activates GFP expression (FIG. 1, b and FIG. 3, a). The cell labeling method of the present invention thus enables continuous tracing of gene expression over a long period.
The invention also provides kits comprising a chimeric recombinase, fusion protein, coding sequence, nucleic acid molecule, polynucleotide sequence product, nucleic acid construct, or host cell as described herein, and reagents necessary for knocking-in the nucleic acid molecule of the invention into the genome of the cell.
Embodiments of the present invention will be described in detail with reference to examples. It will be appreciated by those skilled in the art that the following examples are illustrative of the invention only and should not be taken as limiting the scope of the invention. Where specific techniques or conditions are not given in the examples, they follow the techniques or conditions described in the literature in the art (for example, Molecular Cloning: A Laboratory Manual, 3rd edition, by J. Sambrook et al., translated by Huang Peitang et al., Science Press) or the product instructions. Reagents or instruments for which no manufacturer is indicated are all conventional, commercially available products.
Examples
Example 1 construction of mice
Ki67-CrexER mouse
The genetic tool mouse Ki67-CrexER (FIG. 2, a) was constructed using CRISPR/Cas9 technology to knock in a 2A-CrexER2 expression cassette at the stop codon site of the Mki67 gene by homologous recombination. The brief procedure is as follows: Cas9 mRNA and gRNA (SEQ ID NO: 11) were obtained by in vitro transcription; a homologous recombination vector (donor vector) containing a 4.1 kb 5' homology arm, a 2.1 kb KI fragment and a 4.0 kb 3' homology arm was constructed on the pBR322 plasmid backbone by the In-Fusion cloning method (primer sequences used: SEQ ID NOs: 7-8). Cas9 mRNA, gRNA and the donor vector were microinjected into fertilized eggs of C57BL/6J mice to obtain F0 mice. F0 mice with correct homologous recombination were identified by long-fragment PCR and mated with C57BL/6J mice to obtain positive F1 mice.
The constructed tool mice were validated by immunofluorescence staining for ESR (estrogen receptor) together with Ki67 or EdU (a thymidine analogue) (FIG. 2, b). The results showed that in the small intestine, a rapidly proliferating tissue, both Ki67 expression and ESR expression were concentrated in the crypt structures at the base of the epithelium. ESR and EdU staining further showed that the two signals co-localized in the same cells, indicating that the constructed tool mouse consistently labels proliferating cells. Whether the tool mice could undergo inducible recombination was further verified by mating them with R26-GFP mice (FIG. 2, c). The results show that, under tamoxifen induction, the Ki67-CrexER tool mouse labels pairs of cells in the aortic endothelium of adult mice, which indicates that the tool mouse can undergo inducible recombination and capture the proliferative activity of cells. The above results demonstrate that the genetic tool mouse Ki67-CrexER was constructed successfully.
R26-DreER mice
The R26-DreER mouse was constructed by ES cell targeting, with a CAG promoter-DREERT2-polyA expression cassette site-specifically inserted into the Rosa26 locus. The brief procedure is as follows: an ES cell targeting vector was constructed by In-Fusion cloning, which contained a 1.087 kb 5' homology arm, the CAG promoter, the DREERT2 coding region, FRT-PGK-Neo-polyA-FRT, a 4.259 kb 3' homology arm, and an MC1-DTA-polyA negative selection marker (primer sequences: SEQ ID NOs: 9-10). After linearization, the vector was electroporated into C57/129 ES cells. After G418 and Ganc drug selection, a total of 144 resistant ES cell clones were obtained; long-fragment PCR identified 7 positive clones with correct homologous recombination. Positive ES cell clones were expanded and injected into blastocysts of C57/129 mice to obtain chimeric mice. High-percentage chimeric mice were mated with C57/129 mice to obtain 4 positive F1 mice carrying Neo.
R26-GFP mouse
R26-GFP mice were obtained from the Allen Institute for Brain Science. This mouse carries a LoxP-stop-LoxP-GFP cassette inserted at the Rosa26 locus.
Example 2 construction and validation of the proliferating cell marker System ProTracker
The system consisting of Ki67-CrexER; R26-DreER; R26-GFP is called ProTracker (FIG. 3, b). When Ki67-CrexER is combined with R26-DreER, tamoxifen induces DreER to enter the nucleus, where Dre recognizes the rox sites in CrexER and recombination converts CrexER to Cre, so that Cre is released whenever Ki67 is expressed (FIG. 1, b; FIG. 3, a). Ki67-CrexER on its own can be regarded as the traditional method for tracing proliferation. Mice obtained by mating Ki67-CrexER with R26-GFP (the traditional method) and ProTracker mice were induced with tamoxifen in adulthood and sampled at the beginning of the chase (two days) and four weeks later, and the proliferation signals traced by the two systems were compared (FIG. 3, b and c). The results show that at the onset of tracing the traditional method did not differ from ProTracker, and few GFP-positive cells were detected in tissues other than the small intestine, indicating that adult tissues and organs proliferate little (FIG. 3, d). Because ProTracker allows long-term continuous tracing, the samples collected at four weeks showed clearly different proliferation signals for the two methods. In tissues and organs that proliferate relatively slowly, such as the heart, lung, liver, pancreas, and kidney, ProTracker traced significantly more signals than the traditional method, suggesting that, thanks to signal accumulation by the ProTracker system, almost all cells that proliferated during the tracing period can be seen. In muscle and brain, where cells proliferate slowly, neither system captured proliferation signals. In the small intestine, a rapidly proliferating tissue, almost all epithelial cells were captured by ProTracker, indicating that the ProTracker system is more efficient (FIG. 3, d).
The above results demonstrate that, compared with the traditional method for tracing proliferation signals, ProTracker accumulates signal by seamlessly tracing cell proliferation once tracing has been initiated by tamoxifen induction. It thereby presents all cells that proliferated and reconstructs the state of cell proliferation in vivo over a given period of time.
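The difference between transient and cumulative labeling described above can be illustrated with a toy model. This is a minimal sketch under stated assumptions, not the patent's method: the two-day tamoxifen window, the 100% recombination efficiency, the example cells, and the days on which they express Ki67 are all hypothetical, and inheritance of the GFP label by daughter cells is ignored.

```python
from dataclasses import dataclass
from typing import List

# Assumed length of the tamoxifen activity window, in days (hypothetical value).
TAMOXIFEN_WINDOW_DAYS = 2


@dataclass
class Cell:
    name: str
    ki67_days: List[int]  # days after induction on which the cell expresses Ki67


def labeled_traditional(cell: Cell, window: int = TAMOXIFEN_WINDOW_DAYS) -> bool:
    """Ki67-CrexER x R26-GFP: CrexER is only active while tamoxifen is present,
    so a cell is labeled only if it expresses Ki67 inside the window."""
    return any(0 <= day <= window for day in cell.ki67_days)


def labeled_protracker(cell: Cell, harvest_day: int) -> bool:
    """ProTracker: DreER converts CrexER to Cre during the window (assumed to be
    fully efficient), so any later Ki67 expression before harvest labels the cell."""
    return any(0 <= day <= harvest_day for day in cell.ki67_days)


# Hypothetical cells with made-up proliferation days, for illustration only.
cells = [
    Cell("crypt cell", [1, 5, 9]),
    Cell("hepatocyte", [17]),
    Cell("cardiomyocyte", [60]),
    Cell("neuron", []),
]

harvest_day = 28  # four-week chase, as in the example above
traditional = [c.name for c in cells if labeled_traditional(c)]
protracker = [c.name for c in cells if labeled_protracker(c, harvest_day)]

print("traditional:", traditional)
print("ProTracker: ", protracker)
```

In this toy model the fast-cycling cell is labeled by both systems, while the slowly cycling cell that divides after the tamoxifen window is labeled only by the cumulative system, which mirrors the qualitative result reported for slowly proliferating organs.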
Example 3 verification of reliability
R26-DreER tool mice were mated with the LoxP reporter mouse R26-GFP, the offspring were induced with tamoxifen in adulthood, and individual tissues were collected for whole-mount fluorescence imaging or section immunofluorescence imaging (FIG. 4). The results show that essentially no tissue or organ had a GFP fluorescence signal, indicating that DreER and LoxP in the cell proliferation tracing system of the present invention, ProTracker, do not cross-react in homologous recombination.
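As a sequence-level illustration of why LoxP and rox are distinct recognition sites, the sketch below compares the two sites given in SEQ ID NO: 4 and SEQ ID NO: 5. The helper names are hypothetical, and the treatment of the run from the first to the last `n` as the spacer is an assumption made for this sketch. Sequence difference alone does not prove the absence of cross-reaction; the experimental result above is the evidence for that.

```python
# Recognition sites as given in the sequence listing (n = any nucleotide).
LOXP = "ataacttcgtatannntannntatacgaagttat"  # SEQ ID NO: 4, 34 nt
ROX = "taactttaaataatnnnnattatttaaagtta"     # SEQ ID NO: 5, 32 nt

COMPLEMENT = str.maketrans("acgtn", "tgcan")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a lowercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def split_site(seq: str):
    """Split a site into (left arm, spacer, right arm).
    Assumption: the spacer runs from the first 'n' to the last 'n'."""
    start = seq.index("n")
    end = seq.rindex("n") + 1
    return seq[:start], seq[start:end], seq[end:]


for name, site in (("LoxP", LOXP), ("rox", ROX)):
    left, spacer, right = split_site(site)
    inverted = reverse_complement(left) == right
    print(f"{name}: arm {len(left)} nt, spacer {len(spacer)} nt, "
          f"arms are inverted repeats: {inverted}")

loxp_left = split_site(LOXP)[0]
rox_left = split_site(ROX)[0]
print("arms identical between sites:", loxp_left == rox_left)
```

Both sites consist of two inverted-repeat arms around a spacer, but the arm sequences, arm lengths, and spacer lengths differ between LoxP and rox.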
Tracing of in vivo proliferation signals by ProTracker is initiated by tamoxifen induction. To prove that the whole system is indeed controlled by tamoxifen, littermates of the proliferation-tracing tool mice of the traditional tracing system and of the ProTracker system were used as controls without tamoxifen induction. The results show that mice of the traditional tracing system that received no tamoxifen had essentially no green fluorescence signal (FIG. 5, a), indicating that the traditional tracing system is essentially free of leakage. ProTracker mice that received no tamoxifen likewise had essentially no green fluorescence signal in any tissue or organ of the body, indicating that the ProTracker system is indeed controlled by tamoxifen induction. This indicates that the proliferating cell marking and tracing system of the present invention can faithfully and reliably trace in vivo cell proliferation signals.
Example 4 tracking of proliferating cells of liver Using ProTracker
The ProTracker system constructed in Example 2 was used to study specific cells proliferating in vivo. The liver is an organ with regenerative capacity, composed of many hepatic lobules in which hepatocytes have a certain proliferative capacity. As the liver is a metabolic organ, hepatocytes in different regions of the hepatic lobule perform different metabolic functions. Along the direction of blood flow from the portal vein to the central vein, the hepatocytes of a hepatic lobule can be roughly divided into three zones: the E-CAD+ zone 1, the GS+ zone 3, and zone 2 located between them (FIG. 6, b). Since cells in different zones perform different metabolic functions, these cells are also considered in the art to have different proliferative capacities. Previous studies used molecular markers of hepatocytes in different zones to construct genetic tool mice for lineage tracing of the hepatocytes in those zones, and concluded that hepatocytes in the central vein region have a stronger proliferative capacity under physiological homeostasis. However, lineage tracing based on zone-specific hepatocyte markers can trace only a single population of hepatocytes at a time and cannot probe hepatocyte proliferation at the whole-organ level. Studying hepatocyte proliferation at a global level requires a genetic lineage tracing technique that does not depend on zone-specific molecular markers of hepatocytes. The ProTracker system of the present invention meets this requirement and can be used to study the source and fate of newly generated hepatocytes.
Cell proliferation tracing was initiated by tamoxifen induction in adult ProTracker mice, and samples were collected at various time points after induction (FIG. 6, a) to observe in which zone hepatocytes first began to proliferate and where the proliferating cells then slowly migrated. The results show that at the very beginning of tracing (zero and two days after tamoxifen induction), essentially no proliferating hepatocyte signals were captured (FIG. 6, c). When the tracing time was extended to two weeks, sporadic proliferation signals appeared in zones 1 and 2 of the hepatic lobules, and essentially no proliferation signals appeared in zone 3. By the fourth week, the proliferation signal in zone 2 had increased significantly, the proliferation signal in zone 1 had also increased, and there was still essentially no proliferation in zone 3. Hepatocyte proliferation persisted at weeks 6, 8, 10, and 12, with zone 2 proliferating the most, zone 1 next, and zone 3 showing fewer hepatocyte proliferation signals (FIG. 6, d and e). We therefore believe that, under physiological homeostasis, the most actively proliferating hepatocytes in the hepatic lobule are those in zone 2, which lies between the portal vein region and the central vein region.
Example 5 tracking of proliferating cells of the Heart Using ProTracker
Cardiomyocytes, as a terminally differentiated cell type, were initially considered incapable of proliferation. Recent studies suggest that cardiomyocytes in the adult heart can produce new cardiomyocytes by proliferation. However, these studies mainly used isotope incorporation methods, which are difficult to detect. Moreover, because cardiomyocytes are polyploid and isotope incorporation mainly detects nuclei, it is difficult to distinguish nuclear polyploidization from cell division in cardiomyocytes. In addition, whether newly generated cardiomyocytes are regionally distributed, and whether the generation of new cardiomyocytes by one cell dividing into two can be detected, remain problems to be solved.
Cardiomyocytes proliferate relatively slowly, and no potential proliferation signal was captured by traditional proliferation marker staining or DNA analogue incorporation. Traditional lineage tracing also misses most proliferating cardiomyocytes because the window of tamoxifen action is short. With the ProTracker system of the present invention, DreER enters the nucleus under tamoxifen action and converts Ki67-CrexER to Ki67-Cre (FIG. 7, b), which allows the activity of Ki67 to be captured continuously, so that signals accumulate from the very beginning of signal capture until the time of detection (FIG. 7, a). The accumulated proliferation signals are ultimately displayed in cardiomyocytes, which are large cells, and this facilitates an overall view of the location of all the signals and the distances between them.
Adult ProTracker mice were induced with tamoxifen, and three months later heart tissue was harvested for immunofluorescence staining, which revealed many GFP+ cardiomyocytes in the heart. Quantification of the immunofluorescence staining showed that about 0.7% ± 0.14% of cardiomyocytes had actively expressed Ki67 within three months. Furthermore, we found that most of the captured GFP+ cardiomyocytes are located on the side of the left ventricular wall close to the endocardium, and that in the ventricular septum there are significantly more GFP+ cardiomyocytes on the side facing the left ventricular cavity than on the side facing the right ventricular cavity. The right ventricular wall was found to be essentially free of GFP+ cardiomyocytes. Thus, using the ProTracker system, we discovered a population of actively proliferating cardiomyocytes encircling the left ventricular cavity in the adult heart.
Actively proliferating cells express Ki67 in the G1, S, G2, and M phases of the cell cycle, so the GFP+ proliferation signal captured by ProTracker includes both cell division and nuclear division (FIG. 8, a). The captured GFP+ cardiomyocytes were therefore investigated further. Hoechst staining of isolated cardiomyocytes showed that GFP+ cardiomyocytes included mononucleated, binucleated, and multinucleated cells (FIG. 8, b). Counting the nuclei of these isolated cardiomyocytes showed that, compared with GFP- cardiomyocytes, GFP+ cardiomyocytes included more mononucleated cells and fewer binucleated and multinucleated cells, indicating that cell division occurs in cardiomyocytes. Continuous multi-slice confocal scanning captured both individual cardiomyocytes and pairs of immediately adjacent cardiomyocytes (FIG. 8, c and e). The serial slice scans showed that the individual cardiomyocytes captured were multinucleated and that the two adjacent cardiomyocytes were mononucleated (FIG. 8, d and f). If the captured adjacent cardiomyocytes are considered to be the result of cell division, then about 8% of the actively proliferating cells (GFP+ cardiomyocytes) in the adult heart underwent cell division within 12 weeks. Thus, using the ProTracker system to study proliferation in the adult heart, we found a population of proliferating cardiomyocytes encircling the left ventricular cavity and captured cardiomyocyte division in the adult heart.
By designing new genetic tool mice, the invention achieves seamless tracing of in vivo cell proliferation signals and has been used to study the proliferation of hepatocytes and cardiomyocytes in adult mice. The proliferation of various other cell types in vivo can also be traced with this technique. In addition, by crossing with different DreER tool mice, the proliferation of a particular cell population in vivo can be traced individually, which is of great significance for understanding the dynamic changes of various cell populations in vivo.
After reading the above teachings of the present invention, those skilled in the art may make various changes or modifications to the present invention, and such equivalents fall within the scope of the invention as defined by the appended claims.
Sequence listing
<110> Shanghai Life science research institute of Chinese academy of sciences
<120> in vivo cell proliferation marking and tracing system and application thereof
<130> 194954
<160> 11
<170> SIPOSequenceListing 1.0
<210> 1
<211> 8127
<212> DNA
<213> Artificial Sequence
<400> 1
actctgaagg aggtaaggac ccttgctgtc ttatattgtg tgagacccag aacattctaa 60
tctttatgcc caggttacaa aggcaaataa ggacatgcca gcagctcctg ctatgctagt 120
gggaaagcca tcattcgggc cacttacttt cttgtgtctg ttttggtaat tccactttga 180
atagtgaatt tacctattga gggtttcatt ccagaagagc tagctattta tatctaatac 240
cattttctga tgtttgtgat gtggcttatc ttggttatag atgacatttt cagcctctct 300
atgttggcaa cacttaacta aaacaaaggt gggcatgtca ggacctcaaa agatttcctt 360
taaaatagca gaatgagttg gctcagcagt tgagagcacc aaatactctt tcaaaggagc 420
cagttgttat tcctaggact cacatggctc acaactatct gtaactccag ttccagggga 480
tttgatggtc tcttctgacc tccttaaaca ccaagcaggc atatgcatag gtgaaggcaa 540
aaaacataaa aattaaaagg aataaatatg atttaaaaaa aataaataaa aaggcagaat 600
aagcccagta ttgtctatct cctaagcaaa taagtgaaaa taggtaaatt ctcctcagtg 660
ggaagtgtgt tgtttcagca tcccaagctc acagtactaa gctagtagac tgtaaaccca 720
tgtgctgtcc gtgctttgca tcattcccag tggcgtgctt gcattaagca gtagtctagg 780
accaggtaaa gatggggtgg gcaggagtga cacataattg ttctggacat tcaagttgaa 840
cctgaacata ctctgaacat tctgaaacag ctgatgggag ataaaggtat tgacagccgc 900
tcgaggcaga ggtgtgctga gctggcagtc ctacatgtga ctggcacagc acaagaagac 960
ccaggctttc cagaggtcat gaaacatccc atatagctga gggatggagc aagaggcatg 1020
gaagtctagt gcgaaggtgc agctaaagca cccagaatgg cggctctcaa cctcccaatg 1080
ctgtgactct ttaatacagt tcttctgtgg tgatccccaa ccataaaatt attttattgc 1140
tacttgatca ctaatgttgc tactgttatg aatcataatg taaatatctg gtatgcatga 1200
tatctgatat atgaaacaac ctgtgaaagg gttgtttgac ccccacaaag ggcttgagaa 1260
ccatggaagc gaggctcagg aaaggagagc aatacccagg aagtggaggt ggctgtccgg 1320
aagcagaaac tatctcatgg tggcaagcct tcattacaga caccaagaca caccaggaac 1380
tcagcctcat aggcttaaca agtatcttat ctttcctcag agctctaagc acagcttcat 1440
caccttgaaa gtagtacttt atcagaagga aatagaagga ataaaaccca ggttttttta 1500
gtcaaatgat cctgaacaca acaggcaagg cctgagggtg atcaggccag gtcatcgtgt 1560
catagacact caggtctctt ctcctcactg gtgcccagcg gatgtcatac actgacgagt 1620
tttcccagga ctatatcttt cttgtctgtt ctgttgtcca taggcccaga ttctacatat 1680
gtgtgtgtgg gaggggtggg tgcttcctgg cggtccatcc tgcaagtatg ctggagaagc 1740
aagcctctta tctggtgtgt gtgcctttct aacatgtgta cagtagatcc atctacctgt 1800
tattttctag aattcaacag cattcacata ccaagatctt cttgtccata ctgagcctca 1860
cacttaagag ttcctgtttt ccgtctccct ttcttaactg tccataatca ctcataaaac 1920
tgtgtctaaa gtatgcccag catcaccctc ggctttttct aatttttgtt ggatgggcct 1980
tgtgtttatc aggtaccaga actttgggtc atttgctcta agaggctatt gtgacctttt 2040
gctttctgta gattggatcc ccgcttccag gtagatgggc ctgcctctat ctccccactg 2100
ccttagagga cccactatcc ttttgcagcc acataggaga cctcaggaca cagtgactgt 2160
cctttgtctg tgggaagttg gctttaggat acttaagttt tcatctaggc cacagtcaaa 2220
ttttgtgaat gatgtttttt aattagtgaa ccacatacag tgatagagac cgtgtatgct 2280
ttagaaactt gtgaaagagc acagagtttg agttttaaaa actaagttaa aaaaaaacat 2340
ttaggaagaa acaaccttat ggtaagcatt gtaaaaggat ttccaacttt aatttttttc 2400
tttttaaaaa cactttgtag ccaggcagtg gtggtgcaca cctttaatcc cagcacttgg 2460
gaggcagagg caggtggatt tctgagttcg aggccagcct ggcctacaga gtgagttcca 2520
ggacagccag ggctatacag aggaaccctg tcttgataaa ccaaacaaac aaaaaagaaa 2580
aacactttgt ttttgttttg tgggtttttt tttttaatgt atactgagta ttttgccgtg 2640
tgtgtgtgtg tgtgtgtgtg tgtgtgtgtg tgtgtgtgtg tatcatgtgc ttgcctgcgg 2700
aggtcagtca gaaaagggca tctggttccc tggatggttg tgagccacca tgtggatgct 2760
gggagttaaa cttaggtcct ctgtaagtgc agaaagtgcc cctagacact gagccatctt 2820
tccagacctt caatgttaat ttatagatga gagacctgaa tgacacctag taaggacaag 2880
gggctcattg agttgaggtg atcacaaaaa ttgttccttc atattattta ggagtgtggt 2940
tttttttttc ccaaacagga tgaagacatt gtatgcacca agaagttaag aacaagaagt 3000
ggaagcggag ctactaactt cagcctgctg aagcaggctg gcgacgtgga ggagaaccct 3060
ggtcctatgg gctccaattt actgaccgta caccaaaatt tgcctgcatt accggtcgat 3120
gcaacgagtg atgaggttcg caagaacctg atggacatgt tcagggatcg ccaggcgttt 3180
tctgagcata cctggaaaat gcttctgtcc gtttgccggt cgtgggcggc atggtgcaag 3240
ttgaataacc ggaaatggtt tcccgcagaa cctgaagatg ttcgcgatta tcttctatat 3300
cttcaggcgc gcggtctggc agtaaaaact atccagcaac atttgggcca gctaaacatg 3360
cttcatcgtc ggtccgggct gccacgacca agtgacagca atgctgtttc actggttatg 3420
cggcggatcc gaaaagaaaa cgttgatgcc ggtgaacgtg caaaacaggc tctagcgttc 3480
gaacgcactg atttcgacca ggttcgttca ctcatggaaa atagcgatcg ctgccaggat 3540
atacgtaatc tggcatttct ggggattgct tataacaccc tgttacgtat agccgaaatt 3600
gccaggatca gggttaaaga tatctcacgt actgacggtg ggagaatgtt aatccatatt 3660
ggcagaacga aaacgctggt tagcaccgca ggtgtagaga aggcacttag cctgggggta 3720
actaaactgg tcgagcgatg gatttccgtc tctggtgtag ctgatgatcc gaataactac 3780
ctgttttgcc gggtcagaaa aaatggtgtt gccgcgccat ctgccaccag ccagctatca 3840
actcgcgccc tggaagggat ttttgaagca actcatcgat tgatttacgg cgctaaggat 3900
gactctggtc agagatacct ggcctggtct ggacacagtg cccgtgtcgg agccgcgcga 3960
gatatggccc gcgctggagt ttcaataccg gagatcatgc aagctggtgg ctggaccaat 4020
gtaaatattg tcatgaacta tatccgtaac ctggatagtg aaacaggggc aatggtgcgc 4080
ctgctggaag atggcgatct aactttaaat aattggcatt atttaaagtt actcgagcca 4140
tctgctggag acatgagagc tgccaacctt tggccaagcc cgctcatgat caaacgctct 4200
aagaagaaca gcctggcctt gtccctgacg gccgaccaga tggtcagtgc cttgttggat 4260
gctgagcccc ccatactcta ttccgagtat gatcctacca gacccttcag tgaagcttcg 4320
atgatgggct tactgaccaa cctggcagac agggagctgg ttcacatgat caactgggcg 4380
aagagggtgc caggctttgt ggatttgacc ctccatgatc aggtccacct tctagaatgt 4440
gcctggctag agatcctgat gattggtctc gtctggcgct ccatggagca cccagtgaag 4500
ctactgtttg ctcctaactt gctcttggac aggaaccagg gaaaatgtgt agagggcatg 4560
gtggagatct tcgacatgct gctggctaca tcatctcggt tccgcatgat gaatctgcag 4620
ggagaggagt ttgtgtgcct caaatctatt attttgctta attctggagt gtacacattt 4680
ctgtccagca ccctgaagtc tctggaagag aaggaccata tccaccgagt cctggacaag 4740
atcacagaca ctttgatcca cctgatggcc aaggcaggcc tgaccctgca gcagcagcac 4800
cagcggctgg cccagctcct cctcatcctc tcccacatca ggcacatgag taacaaaggc 4860
atggagcatc tgtacagcat gaagtgcaag aacgtggtgc ccctctatga cctgctgctg 4920
gaggcggcgg acgcccaccg cctacatgcg cccactagcc gtggaggggc atccgtggag 4980
gagacggacc aaagccactt ggccactgcg ggctctactt catcgcattc cttgcaaaag 5040
tattacatca cgggggaggc agagggtttc cctgccacag cttgataact aactttaaat 5100
aattggcatt atttaaagtt atgataagaa caagaagtta ccagaaaagt gaaactatgt 5160
agcaaagaca tttaagaagg aaaagtaaat ttgacttagt gataagttcc agtgtggttt 5220
tcacctccag tgtaaagatg aactgtaaat actactgcta ctgcctgagt ttaaggaagg 5280
aagctttgag ctttcctggt catactctct tcagacgcca atggaggtca tgaggaagat 5340
caccagggat ctcagcgcaa ttacagttta ggggtgagca ggcagaaatg tggccctctg 5400
tcctatccaa taaagctctg aaattcgctg cctcctttgg cctctctgac aactgcagct 5460
gctcccctct gccctcatga aggaggggaa ggtggtgccc ctccattcat tagacattgg 5520
ttgtgcagtt atatcagcca accttacaca ggatgactgt acggtggagt ggttggtttg 5580
taggctacac cattagtcac ttacgcaagt cagcctaatc ctctgggcct gtgacctttg 5640
ggagaaacat ctgacaagga tggctgccga gctcccttca ggggcacggg tcgctatgtt 5700
aaagagcggt tgatgtctgt gcttttcatt aggcctctgt attgagtgga ttggctgcct 5760
tgcctgtgga acctttgctg ctggggagtc tcctgtcccc actggagtct ccactccagt 5820
ctcctgtcct agcgtctgct tttatcacgg gctttctctg acctcttgcc tggcagcaca 5880
aggccatcct ggtgtctggt atgagatgct tatcttccaa gtttcacttt aaccctaaac 5940
tcttttctgt tggaaaccac tgcgcatttg catatgcaac tttgtgcttt tcactctgcc 6000
tgctagtccc ctttctgttt tccagcagta acatatctgc tggtgctgga agagagccta 6060
gagtgtgccc tggtcagcca ttgccctaac ctcttcactt ctccatctcc tgtctgagat 6120
acaggtgaag aacactgggt acgcaggtga gaaacactga gtggaggccg ggatttagca 6180
ttttgggtga gtctgggagt tctgccattt catctacctc aggaattctg taatcaagga 6240
atggcaactg gttattaata agggggcaaa agcttcatag ggtgggtaac agtggaactg 6300
gcaaaggaga ttgtgtagag cagaaggcac aggaaaagag cgcccttttt acctgttaga 6360
gggtgtgagg catgaaagtg cccttaattg acttaaatcc taaagtcaaa gtctttgaag 6420
taacaggaac cttgactatg aattgctctg atgtagaatt agaaatatca catgtatgtg 6480
ggaaattgta gtcaactgca tgctgattga atggaactgg gtgataaggg aaaggcctgc 6540
tcagttatag gaaattctgt ctgagccatg ttagcacatt ttctcactta ggacagatgt 6600
gacggctctg aagcagctgc tatgcaggca agaaggcaag agcagattag cagaacctat 6660
gtctgagctg ggcctggtga cataggtctg caaccccagc attgggatat ggagtcaggt 6720
atatgagtgc ccgaggcttg ctagccagcc accctagcca aagaggatcc agtagaaaga 6780
tgtcccaaat cagcctacat acatgtctgc ttgtgtgggc tgatgtgtgc acttggtatg 6840
tatatatgca cacacatgca gccaccatgt aacctaaaac gctcatttga gggtgatacc 6900
attgccaaga cattcttaga acacatcctc tatttatctc tgtgtgcaca tctgagaaag 6960
acccacttgt tggttgattg taacaaatat ccacccattc ctcaagtgtt tagctatggt 7020
ccctagcaat gtcagtttcc cagcagaaag catgatggga gattcccaag aaaggagtgc 7080
tgtacttttt gcctcccaga tctgtgactc ttcctgtttt gttgtcattt gctcctgccc 7140
ttctcataaa cagctactgt tttccctgcc tggaacttga cccagccccg cacttcatca 7200
attgtattca ttggaatgat gaacttagct ccaagaagct tcctggcctc tccactgcag 7260
ccactgtccc gggttaggaa cggcaggtcc ttagttgtca gcagcatcta ggcacctagt 7320
gagaatcggc atctgtatta gtcagggttc tctagagtca cagaacttag gaatagtctc 7380
tatatagtaa aggaatttat tgatgattta tagtagtcca attcccaaca atggttcagt 7440
agaagctgtg aatggaagtc caaagatcta gcagttactc agtcccacac ggcaagcagg 7500
cgaaggagca agagccagac tcccttcttc caatgtcctt atatggtctc cagcagaagg 7560
tgtagcccag attaaaggtc tgtcccacca cacctttaat cccagatgac cttgaactca 7620
gagatctccc tgtcttaatc ttctggaatc catagccact atgcctcaag atctccatac 7680
caagatccag atcagaaact tccatctccc agcctccaga ttagggtcac tggtgagcct 7740
tccaattctg gattgtagtt cattccaaat atagtcaagt tgacagctgg gaatagccac 7800
tacagcatcc taaatgcaat tttcatcccc ttgactaaac tgatttagtt taatagcatg 7860
taatctcagc ttgcctgatg attgcaatgt gacttggggc aaatctttaa caggcagttt 7920
tctggtctat agaatgatgt tctcagtgct ccatctcagg gtagttaaga tgaacagaat 7980
agaatactgc ttgcagctcc tgtagccttt ggccagtgct tggagtcaag ctgggtcatg 8040
agggctttct ccactgagaa ggtagaagga agatttggag caccgaagtc tcagcactag 8100
attttatatg atgtcctgaa cagggaa 8127
<210> 2
<211> 666
<212> PRT
<213> Artificial Sequence
<400> 2
Met Gly Ala Ser Glu Leu Ile Ile Ser Gly Ser Ser Gly Gly Phe Leu
1 5 10 15
Arg Asn Ile Gly Lys Glu Tyr Gln Glu Ala Ala Glu Asn Phe Met Arg
20 25 30
Phe Met Asn Asp Gln Gly Ala Tyr Ala Pro Asn Thr Leu Arg Asp Leu
35 40 45
Arg Leu Val Phe His Ser Trp Ala Arg Trp Cys His Ala Arg Gln Leu
50 55 60
Ala Trp Phe Pro Ile Ser Pro Glu Met Ala Arg Glu Tyr Phe Leu Gln
65 70 75 80
Leu His Asp Ala Asp Leu Ala Ser Thr Thr Ile Asp Lys His Tyr Ala
85 90 95
Met Leu Asn Met Leu Leu Ser His Cys Gly Leu Pro Pro Leu Ser Asp
100 105 110
Asp Lys Ser Val Ser Leu Ala Met Arg Arg Ile Arg Arg Glu Ala Ala
115 120 125
Thr Glu Lys Gly Glu Arg Thr Gly Gln Ala Ile Pro Leu Arg Trp Asp
130 135 140
Asp Leu Lys Leu Leu Asp Val Leu Leu Ser Arg Ser Glu Arg Leu Val
145 150 155 160
Asp Leu Arg Asn Arg Ala Phe Leu Phe Val Ala Tyr Asn Thr Leu Met
165 170 175
Arg Met Ser Glu Ile Ser Arg Ile Arg Val Gly Asp Leu Asp Gln Thr
180 185 190
Gly Asp Thr Val Thr Leu His Ile Ser His Thr Lys Thr Ile Thr Thr
195 200 205
Ala Ala Gly Leu Asp Lys Val Leu Ser Arg Arg Thr Thr Ala Val Leu
210 215 220
Asn Asp Trp Leu Asp Val Ser Gly Leu Arg Glu His Pro Asp Ala Val
225 230 235 240
Leu Phe Pro Pro Ile His Arg Ser Asn Lys Ala Arg Ile Thr Thr Thr
245 250 255
Pro Leu Thr Ala Pro Ala Met Glu Lys Ile Phe Ser Asp Ala Trp Val
260 265 270
Leu Leu Asn Lys Arg Asp Ala Thr Pro Asn Lys Gly Arg Tyr Arg Thr
275 280 285
Trp Thr Gly His Ser Ala Arg Val Gly Ala Ala Ile Asp Met Ala Glu
290 295 300
Lys Gln Val Ser Met Val Glu Ile Met Gln Glu Gly Thr Trp Lys Lys
305 310 315 320
Pro Glu Thr Leu Met Arg Tyr Leu Arg Arg Gly Gly Val Ser Val Gly
325 330 335
Ala Asn Ser Arg Leu Met Asp Ser Ala Ser Gly Ala Arg Arg Ile Cys
340 345 350
Val Arg Gly Ser Met Arg Ala Ala Asn Leu Trp Pro Ser Pro Leu Met
355 360 365
Ile Lys Arg Ser Lys Lys Asn Ser Leu Ala Leu Ser Leu Thr Ala Asp
370 375 380
Gln Met Val Ser Ala Leu Leu Asp Ala Glu Pro Pro Ile Leu Tyr Ser
385 390 395 400
Glu Tyr Asp Pro Thr Arg Pro Phe Ser Glu Ala Ser Met Met Gly Leu
405 410 415
Leu Thr Asn Leu Ala Asp Arg Glu Leu Val His Met Ile Asn Trp Ala
420 425 430
Lys Arg Val Pro Gly Phe Val Asp Leu Thr Leu His Asp Gln Val His
435 440 445
Leu Leu Glu Cys Ala Trp Leu Glu Ile Leu Met Ile Gly Leu Val Trp
450 455 460
Arg Ser Met Glu His Pro Val Lys Leu Leu Phe Ala Pro Asn Leu Leu
465 470 475 480
Leu Asp Arg Asn Gln Gly Lys Cys Val Glu Gly Met Val Glu Ile Phe
485 490 495
Asp Met Leu Leu Ala Thr Ser Ser Arg Phe Arg Met Met Asn Leu Gln
500 505 510
Gly Glu Glu Phe Val Cys Leu Lys Ser Ile Ile Leu Leu Asn Ser Gly
515 520 525
Val Tyr Thr Phe Leu Ser Ser Thr Leu Lys Ser Leu Glu Glu Lys Asp
530 535 540
His Ile His Arg Val Leu Asp Lys Ile Thr Asp Thr Leu Ile His Leu
545 550 555 560
Met Ala Lys Ala Gly Leu Thr Leu Gln Gln Gln His Gln Arg Leu Ala
565 570 575
Gln Leu Leu Leu Ile Leu Ser His Ile Arg His Met Ser Asn Lys Gly
580 585 590
Met Glu His Leu Tyr Ser Met Lys Cys Lys Asn Val Val Pro Leu Tyr
595 600 605
Asp Leu Leu Leu Glu Ala Ala Asp Ala His Arg Leu His Ala Pro Thr
610 615 620
Ser Arg Gly Gly Ala Ser Val Glu Glu Thr Asp Gln Ser His Leu Ala
625 630 635 640
Thr Ala Gly Ser Thr Ser Ser His Ser Leu Gln Lys Tyr Tyr Ile Thr
645 650 655
Gly Glu Ala Glu Gly Phe Pro Ala Thr Ala
660 665
<210> 3
<211> 344
<212> PRT
<213> Artificial Sequence
<400> 3
Met Gly Ser Asn Leu Leu Thr Val His Gln Asn Leu Pro Ala Leu Pro
1 5 10 15
Val Asp Ala Thr Ser Asp Glu Val Arg Lys Asn Leu Met Asp Met Phe
20 25 30
Arg Asp Arg Gln Ala Phe Ser Glu His Thr Trp Lys Met Leu Leu Ser
35 40 45
Val Cys Arg Ser Trp Ala Ala Trp Cys Lys Leu Asn Asn Arg Lys Trp
50 55 60
Phe Pro Ala Glu Pro Glu Asp Val Arg Asp Tyr Leu Leu Tyr Leu Gln
65 70 75 80
Ala Arg Gly Leu Ala Val Lys Thr Ile Gln Gln His Leu Gly Gln Leu
85 90 95
Asn Met Leu His Arg Arg Ser Gly Leu Pro Arg Pro Ser Asp Ser Asn
100 105 110
Ala Val Ser Leu Val Met Arg Arg Ile Arg Lys Glu Asn Val Asp Ala
115 120 125
Gly Glu Arg Ala Lys Gln Ala Leu Ala Phe Glu Arg Thr Asp Phe Asp
130 135 140
Gln Val Arg Ser Leu Met Glu Asn Ser Asp Arg Cys Gln Asp Ile Arg
145 150 155 160
Asn Leu Ala Phe Leu Gly Ile Ala Tyr Asn Thr Leu Leu Arg Ile Ala
165 170 175
Glu Ile Ala Arg Ile Arg Val Lys Asp Ile Ser Arg Thr Asp Gly Gly
180 185 190
Arg Met Leu Ile His Ile Gly Arg Thr Lys Thr Leu Val Ser Thr Ala
195 200 205
Gly Val Glu Lys Ala Leu Ser Leu Gly Val Thr Lys Leu Val Glu Arg
210 215 220
Trp Ile Ser Val Ser Gly Val Ala Asp Asp Pro Asn Asn Tyr Leu Phe
225 230 235 240
Cys Arg Val Arg Lys Asn Gly Val Ala Ala Pro Ser Ala Thr Ser Gln
245 250 255
Leu Ser Thr Arg Ala Leu Glu Gly Ile Phe Glu Ala Thr His Arg Leu
260 265 270
Ile Tyr Gly Ala Lys Asp Asp Ser Gly Gln Arg Tyr Leu Ala Trp Ser
275 280 285
Gly His Ser Ala Arg Val Gly Ala Ala Arg Asp Met Ala Arg Ala Gly
290 295 300
Val Ser Ile Pro Glu Ile Met Gln Ala Gly Gly Trp Thr Asn Val Asn
305 310 315 320
Ile Val Met Asn Tyr Ile Arg Asn Leu Asp Ser Glu Thr Gly Ala Met
325 330 335
Val Arg Leu Leu Glu Asp Gly Asp
340
<210> 4
<211> 34
<212> DNA
<213> Artificial Sequence
<400> 4
ataacttcgt atannntann ntatacgaag ttat 34
<210> 5
<211> 32
<212> DNA
<213> Artificial Sequence
<400> 5
taactttaaa taatnnnnat tatttaaagt ta 32
<210> 6
<211> 765
<212> PRT
<213> Artificial Sequence
<400> 6
Met Val Ser Lys Gly Glu Glu Leu Phe Thr Gly Val Val Pro Ile Leu
1 5 10 15
Val Glu Leu Asp Gly Asp Val Asn Gly His Lys Phe Ser Val Ser Gly
20 25 30
Glu Gly Glu Gly Asp Ala Thr Tyr Gly Lys Leu Thr Leu Lys Phe Ile
35 40 45
Cys Thr Thr Gly Lys Leu Pro Val Pro Trp Pro Thr Leu Val Thr Thr
50 55 60
Phe Thr Tyr Gly Val Gln Cys Phe Ala Arg Tyr Pro Asp His Met Lys
65 70 75 80
Gln His Asp Phe Phe Lys Ser Ala Met Pro Glu Gly Tyr Val Gln Glu
85 90 95
Arg Thr Ile Phe Phe Lys Asp Asp Gly Asn Tyr Lys Thr Arg Ala Glu
100 105 110
Val Lys Phe Glu Gly Asp Thr Leu Val Asn Arg Ile Glu Leu Lys Gly
115 120 125
Ile Asp Phe Lys Glu Asp Gly Asn Ile Leu Gly His Lys Leu Glu Tyr
130 135 140
Asn Tyr Asn Ser His Lys Val Tyr Ile Thr Ala Asp Lys Gln Lys Asn
145 150 155 160
Gly Ile Lys Val Asn Phe Lys Thr Arg His Asn Ile Glu Asp Gly Ser
165 170 175
Val Gln Leu Ala Asp His Tyr Gln Gln Asn Thr Pro Ile Gly Asp Gly
180 185 190
Pro Val Leu Leu Pro Asp Asn His Tyr Leu Ser Thr Gln Ser Ala Leu
195 200 205
Ser Lys Asp Pro Asn Glu Lys Arg Asp His Met Val Leu Leu Glu Phe
210 215 220
Val Thr Ala Ala Gly Ile Thr Leu Gly Met Asp Glu Leu Tyr Lys Glu
225 230 235 240
Gly Arg Gly Ser Leu Leu Thr Cys Gly Asp Val Glu Glu Asn Pro Gly
245 250 255
Pro Ala Pro Gly Ser Met Ser Gly Gly Glu Glu Leu Phe Ala Gly Ile
260 265 270
Val Pro Val Leu Ile Glu Leu Asp Gly Asp Val His Gly His Lys Phe
275 280 285
Ser Val Arg Gly Glu Gly Glu Gly Asp Ala Asp Tyr Gly Lys Leu Glu
290 295 300
Ile Lys Phe Ile Cys Thr Thr Gly Lys Leu Pro Val Pro Trp Pro Thr
305 310 315 320
Leu Val Thr Thr Leu Cys Tyr Gly Ile Gln Cys Phe Ala Arg Tyr Pro
325 330 335
Glu His Met Lys Met Asn Asp Phe Phe Lys Ser Ala Met Pro Glu Gly
340 345 350
Tyr Ile Gln Glu Arg Thr Ile Gln Phe Gln Asp Asp Gly Lys Tyr Lys
355 360 365
Thr Arg Gly Glu Val Lys Phe Glu Gly Asp Thr Leu Val Asn Arg Ile
370 375 380
Glu Leu Lys Gly Lys Asp Phe Lys Glu Asp Gly Asn Ile Leu Gly His
385 390 395 400
Lys Leu Glu Tyr Ser Phe Asn Ser His Asn Val Tyr Ile Arg Pro Asp
405 410 415
Lys Ala Asn Asn Gly Leu Glu Ala Asn Phe Lys Thr Arg His Asn Ile
420 425 430
Glu Gly Gly Gly Val Gln Leu Ala Asp His Tyr Gln Thr Asn Val Pro
435 440 445
Leu Gly Asp Gly Pro Val Leu Ile Pro Ile Asn His Tyr Leu Ser Thr
450 455 460
Gln Thr Lys Ile Ser Lys Asp Arg Asn Glu Ala Arg Asp His Met Val
465 470 475 480
Leu Leu Glu Ser Phe Ser Ala Cys Cys His Thr His Gly Met Asp Glu
485 490 495
Leu Tyr Arg Arg Ala Lys Arg Ser Gly Ser Gly Ala Thr Asn Phe Ser
500 505 510
Leu Leu Lys Gln Ala Gly Asp Val Glu Glu Asn Pro Gly Pro Met Val
515 520 525
Ser Lys Gln Ile Leu Lys Asn Thr Gly Leu Gln Glu Ile Met Ser Phe
530 535 540
Lys Val Asn Leu Glu Gly Val Val Asn Asn His Val Phe Thr Met Glu
545 550 555 560
Gly Cys Gly Lys Gly Asn Ile Leu Phe Gly Asn Gln Leu Val Gln Ile
565 570 575
Arg Val Thr Lys Gly Ala Pro Leu Pro Phe Ala Phe Asp Ile Leu Ser
580 585 590
Pro Ala Phe Gln Tyr Gly Asn Arg Thr Phe Thr Lys Tyr Pro Glu Asp
595 600 605
Ile Ser Asp Phe Phe Ile Gln Ser Phe Pro Ala Gly Phe Val Tyr Glu
610 615 620
Arg Thr Leu Arg Tyr Glu Asp Gly Gly Leu Val Glu Ile Arg Ser Asp
625 630 635 640
Ile Asn Leu Ile Glu Glu Met Phe Val Tyr Arg Val Glu Tyr Lys Gly
645 650 655
Arg Asn Phe Pro Asn Asp Gly Pro Val Met Lys Lys Thr Ile Thr Gly
660 665 670
Leu Gln Pro Ser Phe Glu Val Val Tyr Met Asn Asp Gly Val Leu Val
675 680 685
Gly Gln Val Ile Leu Val Tyr Arg Leu Asn Ser Gly Lys Phe Tyr Ser
690 695 700
Cys His Met Arg Thr Leu Met Lys Ser Lys Gly Val Val Lys Asp Phe
705 710 715 720
Pro Glu Tyr His Phe Ile Gln His Arg Leu Glu Lys Thr Tyr Val Glu
725 730 735
Asp Gly Gly Phe Val Glu Gln His Glu Thr Ala Ile Ala Gln Leu Thr
740 745 750
Ser Leu Gly Lys Pro Leu Gly Ser Leu His Glu Trp Val
755 760 765
<210> 7
<211> 29
<212> DNA
<213> Artificial Sequence
<400> 7
tggattttgc tgcagttatt tgtgtatag 29
<210> 8
<211> 27
<212> DNA
<213> Artificial Sequence
<400> 8
cacataaatt aaacatgact tggttac 27
<210> 9
<211> 23
<212> DNA
<213> Artificial Sequence
<400> 9
tccgagcgtg gtggagccgt tct 23
<210> 10
<211> 27
<212> DNA
<213> Artificial Sequence
<400> 10
tactaccttg ttctgataga aatattt 27
<210> 11
<211> 20
<212> DNA
<213> Artificial Sequence
<400> 11
ataaattaac attgaaggtc 20
Claims (10)
1. A nucleic acid molecule selected from
(1) A nucleic acid molecule comprising a 5 'homology arm and a 3' homology arm, the 5 'homology arm and the 3' homology arm being capable of recombining, into a genome, sequences therebetween such that the sequences therebetween are co-expressed with a cell proliferation factor gene in the genome, and a coding sequence for a first recombinase, an estrogen receptor ER coding sequence, and a recognition site for a second recombinase located between the 5 'homology arm and the 3' homology arm, and
(2) the complementary sequence of the nucleic acid molecule of (1).
2. The nucleic acid molecule of claim 1, wherein the polynucleotide sequence of the nucleic acid molecule is, in order from the 5' end to the 3' end, a 5' homology arm, a coding sequence for a first recombinase, a recognition site for a second recombinase, an estrogen receptor ER coding sequence, a recognition site for the second recombinase, and a 3' homology arm, wherein the 5' homology arm and the 3' homology arm recombine the sequences between them at the 5' or 3' end of the cell proliferation factor gene,
preferably, the nucleic acid molecule has one or more characteristics selected from the group consisting of:
the cell proliferation factor is Ki67 or PCNA,
the nucleic acid sequence of the 5' homology arm is shown as nucleotides 1-3000 of SEQ ID NO: 1,
the nucleic acid sequence of the 3' homology arm is shown as nucleotides 5128-8127 of SEQ ID NO: 1,
the first recombinase and the second recombinase are Cre and Dre, or Dre and Cre, respectively, wherein the recognition site of Cre is LoxP and the recognition site of Dre is rox,
the amino acid sequence of Dre is shown as amino acids 1-356 of SEQ ID NO: 2,
the amino acid sequence of Cre is shown in SEQ ID NO: 3,
the nucleic acid sequence of LoxP is shown in SEQ ID NO: 4,
the nucleic acid sequence of rox is shown in SEQ ID NO: 5,
the amino acid sequence of the estrogen receptor ER is shown as amino acids 357-666 of SEQ ID NO: 2.
3. A nucleic acid construct comprising the nucleic acid molecule of claim 1 or 2.
4. A recombinase system comprising:
(1) the nucleic acid molecule of claim 1 or 2, and
(2) optionally, a nucleic acid molecule encoding a fusion protein of a second recombinase and an estrogen receptor ER,
(3) optionally, a nucleic acid molecule comprising, in order from the 5' end to the 3' end, a recognition site for the first recombinase, a termination sequence, a recognition site for the first recombinase, and a marker coding sequence, or
the system comprises:
(1) the nucleic acid construct of claim 3, and
(2) optionally, a second nucleic acid construct having a polynucleotide sequence encoding a fusion protein of a second recombinase and an estrogen receptor ER,
(3) optionally, a third nucleic acid construct having a polynucleotide sequence comprising, in order from the 5' end to the 3' end, a recognition site for the first recombinase, a termination sequence, a recognition site for the first recombinase, and a marker coding sequence.
5. A host cell, comprising one or more selected from the group consisting of:
(1) the nucleic acid molecule of claim 1 or 2;
(2) the nucleic acid construct of claim 3;
(3) the system of claim 4.
6. A method of constructing a transgenic animal, comprising:
mating an F0-generation first animal with an F0-generation second animal or an F0-generation third animal, homologous recombination occurring in the progeny animals, to obtain a transgenic animal comprising the first and second polynucleotide sequences or a transgenic animal comprising the first and third polynucleotide sequences, or
mating any two of the three kinds of F0-generation animals, then mating the resulting F1-generation animal, in which homologous recombination has occurred, with the third kind of F0-generation animal, homologous recombination occurring in the F2-generation animals, to obtain a transgenic animal comprising the first, second and third polynucleotide sequences,
wherein,
the genome of the F0-generation first animal contains a first polynucleotide sequence, wherein the first polynucleotide sequence is the polynucleotide sequence of the nucleic acid molecule of claim 1 or 2, or the first polynucleotide sequence comprises, in order from the 5' end to the 3' end, a first recombinase coding sequence, a recognition site for a second recombinase, an estrogen receptor (ER) coding sequence, and a recognition site for the second recombinase, and the first polynucleotide sequence is co-expressed with a cell proliferation factor gene in the genome;
the genome of the F0-generation second animal contains a second polynucleotide sequence, the second polynucleotide sequence encoding a fusion protein of a second recombinase and an estrogen receptor (ER);
the genome of the F0-generation third animal contains a third polynucleotide sequence having, in order from the 5' end to the 3' end, a recognition site of the first recombinase, a termination sequence, a recognition site of the first recombinase, and a marker coding sequence;
preferably, the animal is a mouse,
preferably, the first and second recombinases, the estrogen receptor ER and the cell proliferation factor are as defined in claim 1 or 2.
7. A method for constructing a transgenic animal comprising introducing any one, two or three of the first, second and third polynucleotide sequences into animal cells and culturing, and selecting a transgenic animal whose genome comprises any one, two or three of the first, second and third polynucleotide sequences, wherein
the first polynucleotide sequence is, in order from the 5' end to the 3' end, a first recombinase coding sequence, a recognition site of a second recombinase, an estrogen receptor ER coding sequence, and a recognition site of the second recombinase, and the first polynucleotide sequence is co-expressed with a cell proliferation factor gene in the genome of the transgenic animal,
the second polynucleotide sequence encodes a fusion protein of a second recombinase and an estrogen receptor ER,
the third polynucleotide sequence has, in order from the 5' end to the 3' end, a recognition site of the first recombinase, a termination sequence, a recognition site of the first recombinase, and a marker coding sequence,
preferably, the cells are animal ES cells,
preferably, the animal is a mouse,
preferably, the first and second recombinases, the estrogen receptor ER and the cell proliferation factor are as defined in claim 1 or 2.
8. An in vivo long-term cell labeling method comprising labeling cells expressing a cell proliferation factor gene in an animal comprising a first, second, and third polynucleotide sequence in the presence of an inducer that interacts with the estrogen receptor ER, wherein,
the first polynucleotide sequence is, in order from the 5' end to the 3' end, a first recombinase coding sequence, a recognition site of a second recombinase, an estrogen receptor ER coding sequence, and a recognition site of the second recombinase, and the first polynucleotide sequence is co-expressed with a cell proliferation factor gene in the genome of the animal,
the second polynucleotide sequence encodes a fusion protein of a second recombinase and an estrogen receptor ER, and
the third polynucleotide sequence has, in order from the 5' end to the 3' end, a recognition site of the first recombinase, a termination sequence, a recognition site of the first recombinase, and a marker coding sequence,
preferably, the animal is a mouse,
preferably, the first and second recombinases, the estrogen receptor ER and the cell proliferation factor are as defined in claim 1 or 2.
9. Use of the nucleic acid molecule of claim 1 or 2, the nucleic acid construct of claim 3, the system of claim 4 or the host cell of claim 5 for long-term cell labeling or cell tracking.
10. A kit comprising the nucleic acid molecule of claim 1 or 2, the nucleic acid construct of claim 3, the system of claim 4, or the host cell of claim 5, and reagents required to knock the nucleic acid molecule into the genome of the cell.
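The three polynucleotide sequences recited in claims 4 and 8 can be read as a two-step genetic switch. The Python sketch below models one such reading and is not the patent's own description: it assumes that the second recombinase-ER fusion acts only in the presence of the inducer and excises the ER coding sequence flanked by the second recombinase's recognition sites, that the first recombinase is produced only while the cell proliferation factor gene is expressed, and that the first recombinase then removes the termination sequence so the marker is expressed permanently. The class Cell, the function step, and all field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Cell:
    # Hypothetical state flags, not terms from the claims.
    er_cassette_present: bool = True  # ER coding sequence still sits between its flanking sites
    stop_present: bool = True         # termination sequence still sits before the marker

    @property
    def labeled(self) -> bool:
        # The marker is assumed to be expressed once the termination sequence is gone.
        return not self.stop_present


def step(cell: Cell, inducer: bool, proliferating: bool) -> Cell:
    """Apply one round of the assumed recombination logic to a cell."""
    # Assumption: the second recombinase-ER fusion is active only with inducer
    # and excises the ER coding sequence from the first polynucleotide sequence.
    if inducer and cell.er_cassette_present:
        cell.er_cassette_present = False

    # Assumption: the first recombinase is made only when the proliferation gene
    # is expressed; fused to ER it needs inducer, without ER it does not.
    first_recombinase_active = proliferating and (
        inducer or not cell.er_cassette_present
    )
    if first_recombinase_active and cell.stop_present:
        cell.stop_present = False  # irreversible: the label persists afterwards
    return cell


if __name__ == "__main__":
    cell = Cell()
    step(cell, inducer=True, proliferating=False)   # inducer pulse in a resting cell
    step(cell, inducer=False, proliferating=True)   # later proliferation, no inducer
    print(cell.labeled)
```

Under these assumptions, a single exposure to the inducer primes a cell, and any later expression of the proliferation factor gene labels it for good, which is consistent with the "long-term" labeling named in claim 8.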
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201911242892.5A CN112921052B (en) | 2019-12-06 | 2019-12-06 | In vivo cell proliferation marker and tracer system and application thereof |
Publications (2)
Publication Number | Publication Date |
---|---|
CN112921052A (en) | 2021-06-08 |
CN112921052B (en) | 2023-07-21 |
Family ID: 76161668
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201911242892.5A Active CN112921052B (en) | 2019-12-06 | 2019-12-06 | In vivo cell proliferation marker and tracer system and application thereof |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112921052B (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN115927414A (en) * | 2022-07-04 | 2023-04-07 | Yu Wei | Nucleic acid molecule, homologous recombinant vector, transgenic animal and construction method thereof, and animal in-vivo cell marking method |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2002028175A2 (en) * | 2000-10-03 | 2002-04-11 | Association Pour Le Developpement De La Recherche En Genetique Moleculaire (Aderegem) | Transgenic mouse for targeted recombination mediated by modified cre-er |
WO2013155222A2 (en) * | 2012-04-10 | 2013-10-17 | The Regents Of The University Of California | Brain-specific enhancers for cell-based therapy |
CN107779462A (en) * | 2016-08-29 | 2018-03-09 | Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences | Double homologous recombination pedigree tracer techniques |
CN107849583A (en) * | 2015-03-09 | 2018-03-27 | Sinai Health System | The tool and method bred using cell division locus control cell |
CN108070035A (en) * | 2017-10-12 | 2018-05-25 | Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences | Inducibility Genetic Recombination enzyme system CrexER |
Non-Patent Citations (6)
Title |
---|
KAI KRETZSCHMAR ET AL.: "Profiling proliferative cells and their progeny in damaged murine hearts", PNAS * |
LINGJUAN HE ET AL.: "Proliferation tracing reveals regional hepatocyte generation in liver homeostasis and repair", SCIENCE * |
ONUR BASAK ET AL.: "Troy+ brain stem cells cycle through quiescence and regulate their number by sensing niche occupancy", PNAS * |
LIU XIUXIU ET AL.: "Cardiomyocyte proliferation in adult mammals and its regulation", Journal of Shanghai University (Natural Science Edition) * |
LI JIE ET AL.: "ProTracer traces cell proliferation in tissue and organ homeostasis, repair and regeneration", Chemistry of Life * |
MA DUAN: "Application of Frontier Biological Technologies in Medical Research", 30 September 2007, Fudan University Press * |
Also Published As
Publication number | Publication date |
---|---|
CN112921052B (en) | 2023-07-21 |
Similar Documents
Publication | Title | Publication Date |
---|---|---|
Moessler et al. | The SM 22 promoter directs tissue-specific expression in arterial but not in venous or visceral smooth muscle cells in transgenic mice | |
CN113424798B (en) | Humanized SIRP alpha-IL 15 knock-in mice and methods of use thereof | |
Madisen et al. | A robust and high-throughput Cre reporting and characterization system for the whole mouse brain | |
Kim et al. | Generation and characterization of a conditional deletion allele for Lmna in mice | |
CN108070035B (en) | Inducible genetic recombinase System CrexER | |
KR20100103810A (en) | Methods for sequential replacement of targeted region by homologous recombination | |
EP1811023A1 (en) | Gene expressed specifically in es cells and utilization of the same | |
US20090202988A1 (en) | Trap vectors and gene trapping using the same | |
JP2021527427A (en) | Transposon-mediated gene transfer and related compositions, systems, and methods of the enhanced hAT family | |
CN108779159B (en) | Non-human animals with engineered ANGPTL8 genes | |
CN110461146A (en) | The non-human animal model of retinoschisis | |
WO2010010887A1 (en) | Tissue expression promoter | |
CN110996658A (en) | Non-human animals comprising a humanized ASGR1 locus | |
CN112921052B (en) | In vivo cell proliferation marker and tracer system and application thereof | |
KR20080016687A (en) | Expression vector having promoter sequence of vasa homologue gene derived from mammal and use thereof | |
US9084814B2 (en) | Conditional Mst overexpressing construct and conditional myostatin overexpressing transgenic mouse | |
Lau et al. | Adaptive evolution of gene expression in Antarctic fishes: Divergent transcription of the 5′-to-5′ linked adult α1-and β-globin genes of the Antarctic teleost Notothenia coriiceps is controlled by dual promoters and intergenic enhancers | |
US20230357792A1 (en) | Method of engineering and isolating adeno-associated virus | |
JP2004500879A (en) | Renal regulatory elements and methods of their use | |
CN112391366B (en) | Dre recombination system activated by light induction | |
CA2077686A1 (en) | Binary genetic system to control expression of a transgene in a transgenic animal | |
CN114075294A (en) | Intercellular genetic marker tracing technology | |
CN106701764A (en) | Promoter of 15kDa selenoprotein gene and core area and application of promoter | |
KR101289474B1 (en) | Transgenic Mouse Useful for Researching the Function of Ephrin-A5 Gene and the Method For Preparing the Same | |
JP2001157588A (en) | Gene trap using green fluorescent protein |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||