WO2023177526A2 - Compositions and methods for detecting an endotoxin - Google Patents
Compositions and methods for detecting an endotoxin
- Publication number
- WO2023177526A2 (PCT/US2023/014214)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- seq
- sequence
- nucleic acid
- protein
- acid molecule
- Prior art date
Links
- 239000002158 endotoxin Substances 0.000 title claims abstract description 46
- 238000000034 method Methods 0.000 title claims abstract description 41
- 239000000203 mixture Substances 0.000 title description 4
- 108090000623 proteins and genes Proteins 0.000 claims abstract description 330
- 102000004169 proteins and genes Human genes 0.000 claims abstract description 284
- 230000014509 gene expression Effects 0.000 claims abstract description 206
- 150000007523 nucleic acids Chemical class 0.000 claims abstract description 169
- 239000003153 chemical reaction reagent Substances 0.000 claims abstract description 115
- 102000039446 nucleic acids Human genes 0.000 claims abstract description 98
- 108020004707 nucleic acids Proteins 0.000 claims abstract description 98
- 239000013612 plasmid Substances 0.000 claims abstract description 64
- 239000002510 pyrogen Substances 0.000 claims abstract description 35
- 238000003259 recombinant expression Methods 0.000 claims abstract description 10
- 241000239218 Limulus Species 0.000 claims abstract description 7
- 241000186226 Corynebacterium glutamicum Species 0.000 claims description 143
- 108090000765 processed proteins & peptides Proteins 0.000 claims description 125
- 108010076504 Protein Sorting Signals Proteins 0.000 claims description 113
- 102000004196 processed proteins & peptides Human genes 0.000 claims description 112
- 229920001184 polypeptide Polymers 0.000 claims description 109
- 108091028043 Nucleic acid sequence Proteins 0.000 claims description 63
- 108010045487 coagulogen Proteins 0.000 claims description 42
- 108090000056 Complement factor B Proteins 0.000 claims description 36
- 102000003712 Complement factor B Human genes 0.000 claims description 36
- 241000239219 Carcinoscorpius rotundicauda Species 0.000 claims description 33
- 241000239224 Tachypleus tridentatus Species 0.000 claims description 31
- 102000010911 Enzyme Precursors Human genes 0.000 claims description 29
- 108010062466 Enzyme Precursors Proteins 0.000 claims description 29
- 102000012479 Serine Proteases Human genes 0.000 claims description 29
- 108010022999 Serine Proteases Proteins 0.000 claims description 29
- 108010048121 pro-clotting enzyme Proteins 0.000 claims description 28
- 101000882917 Penaeus paulensis Hemolymph clottable protein Proteins 0.000 claims description 26
- 241000588724 Escherichia coli Species 0.000 claims description 25
- 101150110403 cspA gene Proteins 0.000 claims description 25
- 241001529572 Chaceon affinis Species 0.000 claims description 19
- 150000001413 amino acids Chemical group 0.000 claims description 19
- 230000003248 secreting effect Effects 0.000 claims description 17
- 125000002924 primary amino group Chemical group [H]N([H])* 0.000 claims description 11
- HVLSXIKZNLPZJJ-TXZCQADKSA-N HA peptide Chemical compound C([C@@H](C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](C(C)C)C(=O)N1[C@@H](CCC1)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC=1C=CC(O)=CC=1)C(=O)N[C@@H](C)C(O)=O)NC(=O)[C@H]1N(CCC1)C(=O)[C@@H](N)CC=1C=CC(O)=CC=1)C1=CC=C(O)C=C1 HVLSXIKZNLPZJJ-TXZCQADKSA-N 0.000 claims description 9
- 230000027455 binding Effects 0.000 claims description 8
- RWSXRVCMGQZWBV-WDSKDSINSA-N glutathione Chemical compound OC(=O)[C@@H](N)CCC(=O)N[C@@H](CS)C(=O)NCC(O)=O RWSXRVCMGQZWBV-WDSKDSINSA-N 0.000 claims description 8
- 108010018381 streptavidin-binding peptide Proteins 0.000 claims description 8
- 102000000584 Calmodulin Human genes 0.000 claims description 7
- 108010041952 Calmodulin Proteins 0.000 claims description 7
- 101710175625 Maltose/maltodextrin-binding periplasmic protein Proteins 0.000 claims description 7
- FWMNVWWHGCHHJJ-SKKKGAJSSA-N 4-amino-1-[(2r)-6-amino-2-[[(2r)-2-[[(2r)-2-[[(2r)-2-amino-3-phenylpropanoyl]amino]-3-phenylpropanoyl]amino]-4-methylpentanoyl]amino]hexanoyl]piperidine-4-carboxylic acid Chemical compound C([C@H](C(=O)N[C@H](CC(C)C)C(=O)N[C@H](CCCCN)C(=O)N1CCC(N)(CC1)C(O)=O)NC(=O)[C@H](N)CC=1C=CC=CC=1)C1=CC=CC=C1 FWMNVWWHGCHHJJ-SKKKGAJSSA-N 0.000 claims description 6
- DHMQDGOQFOQNFH-UHFFFAOYSA-N Glycine Natural products NCC(O)=O DHMQDGOQFOQNFH-UHFFFAOYSA-N 0.000 claims description 6
- ONIBWKKTOPOVIA-BYPYZUCNSA-N L-Proline Chemical compound OC(=O)[C@@H]1CCCN1 ONIBWKKTOPOVIA-BYPYZUCNSA-N 0.000 claims description 5
- ONIBWKKTOPOVIA-UHFFFAOYSA-N Proline Natural products OC(=O)C1CCCN1 ONIBWKKTOPOVIA-UHFFFAOYSA-N 0.000 claims description 5
- 108010024636 Glutathione Proteins 0.000 claims description 4
- 108010070675 Glutathione transferase Proteins 0.000 claims description 4
- 102000005720 Glutathione transferase Human genes 0.000 claims description 4
- 239000004471 Glycine Substances 0.000 claims description 4
- 241000239221 Tachypleus gigas Species 0.000 claims description 4
- 229960003180 glutathione Drugs 0.000 claims description 4
- 101150049887 cspB gene Proteins 0.000 claims description 3
- 101150031507 porB gene Proteins 0.000 claims description 3
- 108010072542 endotoxin binding proteins Proteins 0.000 abstract description 4
- 238000010998 test method Methods 0.000 abstract description 3
- 125000003275 alpha amino acid group Chemical group 0.000 description 44
- 239000000523 sample Substances 0.000 description 34
- 108020004414 DNA Proteins 0.000 description 25
- 102000053602 DNA Human genes 0.000 description 25
- 210000004027 cell Anatomy 0.000 description 21
- 101100007857 Bacillus subtilis (strain 168) cspB gene Proteins 0.000 description 18
- 101150068339 cspLA gene Proteins 0.000 description 18
- 238000005457 optimization Methods 0.000 description 18
- 239000000499 gel Substances 0.000 description 13
- 102000004190 Enzymes Human genes 0.000 description 12
- 108090000790 Enzymes Proteins 0.000 description 12
- 238000011050 LAL assay Methods 0.000 description 12
- 102000007056 Recombinant Fusion Proteins Human genes 0.000 description 12
- 108010008281 Recombinant Fusion Proteins Proteins 0.000 description 12
- 229940088598 enzyme Drugs 0.000 description 12
- 239000006166 lysate Substances 0.000 description 12
- 238000004519 manufacturing process Methods 0.000 description 11
- 241000239220 Limulus polyphemus Species 0.000 description 10
- 238000012360 testing method Methods 0.000 description 10
- 238000013518 transcription Methods 0.000 description 10
- 230000035897 transcription Effects 0.000 description 10
- 238000010367 cloning Methods 0.000 description 9
- 108020004705 Codon Proteins 0.000 description 8
- 238000005345 coagulation Methods 0.000 description 8
- 230000015271 coagulation Effects 0.000 description 8
- 239000012228 culture supernatant Substances 0.000 description 8
- 239000003814 drug Substances 0.000 description 8
- 239000002773 nucleotide Substances 0.000 description 8
- 125000003729 nucleotide group Chemical group 0.000 description 8
- 229920002477 rna polymer Polymers 0.000 description 8
- 101150111062 C gene Proteins 0.000 description 7
- 108020004566 Transfer RNA Proteins 0.000 description 7
- CSCPPACGZOOCGX-UHFFFAOYSA-N Acetone Chemical compound CC(C)=O CSCPPACGZOOCGX-UHFFFAOYSA-N 0.000 description 6
- 239000002609 medium Substances 0.000 description 6
- 108091008146 restriction endonucleases Proteins 0.000 description 6
- 108700010070 Codon Usage Proteins 0.000 description 5
- -1 Factor C Proteins 0.000 description 5
- 239000013642 negative control Substances 0.000 description 5
- 108010011170 Ala-Trp-Arg-His-Pro-Gln-Phe-Gly-Gly Proteins 0.000 description 4
- 241000193755 Bacillus cereus Species 0.000 description 4
- WQZGKKKJIJFFOK-GASJEMHNSA-N Glucose Natural products OC[C@H]1OC(O)[C@H](O)[C@@H](O)[C@@H]1O WQZGKKKJIJFFOK-GASJEMHNSA-N 0.000 description 4
- 108700019146 Transgenes Proteins 0.000 description 4
- 108010028230 Trp-Ser-His-Pro-Gln-Phe-Glu-Lys Proteins 0.000 description 4
- XSQUKJJJFZCRTK-UHFFFAOYSA-N Urea Chemical compound NC(N)=O XSQUKJJJFZCRTK-UHFFFAOYSA-N 0.000 description 4
- 238000003556 assay Methods 0.000 description 4
- 239000000872 buffer Substances 0.000 description 4
- 238000005119 centrifugation Methods 0.000 description 4
- 238000011109 contamination Methods 0.000 description 4
- 229940079593 drug Drugs 0.000 description 4
- 239000013604 expression vector Substances 0.000 description 4
- 235000013305 food Nutrition 0.000 description 4
- 239000008103 glucose Substances 0.000 description 4
- 238000003306 harvesting Methods 0.000 description 4
- 108020004999 messenger RNA Proteins 0.000 description 4
- 239000008188 pellet Substances 0.000 description 4
- 230000028327 secretion Effects 0.000 description 4
- 238000002415 sodium dodecyl sulfate polyacrylamide gel electrophoresis Methods 0.000 description 4
- 230000009466 transformation Effects 0.000 description 4
- 238000000844 transformation Methods 0.000 description 4
- 238000011144 upstream manufacturing Methods 0.000 description 4
- 239000013598 vector Substances 0.000 description 4
- QTBSBXVTEAMEQO-UHFFFAOYSA-N Acetic acid Chemical compound CC(O)=O QTBSBXVTEAMEQO-UHFFFAOYSA-N 0.000 description 3
- 108091035707 Consensus sequence Proteins 0.000 description 3
- PEDCQBHIVMGVHV-UHFFFAOYSA-N Glycerine Chemical compound OCC(O)CO PEDCQBHIVMGVHV-UHFFFAOYSA-N 0.000 description 3
- 102000035195 Peptidases Human genes 0.000 description 3
- 108091005804 Peptidases Proteins 0.000 description 3
- 239000004365 Protease Substances 0.000 description 3
- 108020004511 Recombinant DNA Proteins 0.000 description 3
- 102000002262 Thromboplastin Human genes 0.000 description 3
- 108010000499 Thromboplastin Proteins 0.000 description 3
- 108700009124 Transcription Initiation Site Proteins 0.000 description 3
- 125000000539 amino acid group Chemical group 0.000 description 3
- 230000015572 biosynthetic process Effects 0.000 description 3
- 239000008280 blood Substances 0.000 description 3
- 210000004369 blood Anatomy 0.000 description 3
- 230000001413 cellular effect Effects 0.000 description 3
- LOHQECMUTAPWAC-UHFFFAOYSA-N coagulin Natural products C1C(C)=C(CO)C(=O)OC1C1(C)C(C2(C)CCC3C4(C(=O)CC=CC4=CCC43)C)(O)CCC24O1 LOHQECMUTAPWAC-UHFFFAOYSA-N 0.000 description 3
- 230000006378 damage Effects 0.000 description 3
- 210000000087 hemolymph Anatomy 0.000 description 3
- 238000001556 precipitation Methods 0.000 description 3
- 230000008569 process Effects 0.000 description 3
- 235000019419 proteases Nutrition 0.000 description 3
- 238000000746 purification Methods 0.000 description 3
- 230000006798 recombination Effects 0.000 description 3
- 238000005215 recombination Methods 0.000 description 3
- 239000013605 shuttle vector Substances 0.000 description 3
- 238000002560 therapeutic procedure Methods 0.000 description 3
- 229960005486 vaccine Drugs 0.000 description 3
- KDCGOANMDULRCW-UHFFFAOYSA-N 7H-purine Chemical compound N1=CNC2=NC=NC2=C1 KDCGOANMDULRCW-UHFFFAOYSA-N 0.000 description 2
- 241000894006 Bacteria Species 0.000 description 2
- 208000025721 COVID-19 Diseases 0.000 description 2
- 241001678559 COVID-19 virus Species 0.000 description 2
- 108091026890 Coding region Proteins 0.000 description 2
- KCXVZYZYPLLWCC-UHFFFAOYSA-N EDTA Chemical compound OC(=O)CN(CC(O)=O)CCN(CC(O)=O)CC(O)=O KCXVZYZYPLLWCC-UHFFFAOYSA-N 0.000 description 2
- 108700007698 Genetic Terminator Regions Proteins 0.000 description 2
- VEXZGXHMUGYJMC-UHFFFAOYSA-N Hydrochloric acid Chemical compound Cl VEXZGXHMUGYJMC-UHFFFAOYSA-N 0.000 description 2
- 241000239205 Merostomata Species 0.000 description 2
- 241001302191 Polyphemus Species 0.000 description 2
- 206010037660 Pyrexia Diseases 0.000 description 2
- FAPWRFPIFSIZLT-UHFFFAOYSA-M Sodium chloride Chemical compound [Na+].[Cl-] FAPWRFPIFSIZLT-UHFFFAOYSA-M 0.000 description 2
- 108091023040 Transcription factor Proteins 0.000 description 2
- 102000040945 Transcription factor Human genes 0.000 description 2
- ISAKRJDGNUQOIC-UHFFFAOYSA-N Uracil Chemical compound O=C1C=CNC(=O)N1 ISAKRJDGNUQOIC-UHFFFAOYSA-N 0.000 description 2
- 238000001261 affinity purification Methods 0.000 description 2
- 230000001580 bacterial effect Effects 0.000 description 2
- 230000004071 biological effect Effects 0.000 description 2
- 210000004899 c-terminal region Anatomy 0.000 description 2
- 239000004202 carbamide Substances 0.000 description 2
- 239000013592 cell lysate Substances 0.000 description 2
- 238000013375 chromatographic separation Methods 0.000 description 2
- 238000003776 cleavage reaction Methods 0.000 description 2
- 108010055222 clotting enzyme Proteins 0.000 description 2
- OPTASPLRGRRNAP-UHFFFAOYSA-N cytosine Chemical compound NC=1C=CNC(=O)N=1 OPTASPLRGRRNAP-UHFFFAOYSA-N 0.000 description 2
- 238000011161 development Methods 0.000 description 2
- 230000000694 effects Effects 0.000 description 2
- 230000006870 function Effects 0.000 description 2
- 230000004927 fusion Effects 0.000 description 2
- 238000001502 gel electrophoresis Methods 0.000 description 2
- UYTPUPDQBNUYGX-UHFFFAOYSA-N guanine Chemical compound O=C1NC(N)=NC2=C1N=CN2 UYTPUPDQBNUYGX-UHFFFAOYSA-N 0.000 description 2
- 230000002209 hydrophobic effect Effects 0.000 description 2
- 206010022000 influenza Diseases 0.000 description 2
- 239000007924 injection Substances 0.000 description 2
- 238000002347 injection Methods 0.000 description 2
- 230000004807 localization Effects 0.000 description 2
- 239000011159 matrix material Substances 0.000 description 2
- 230000037361 pathway Effects 0.000 description 2
- 102000040430 polynucleotide Human genes 0.000 description 2
- 108091033319 polynucleotide Proteins 0.000 description 2
- 239000002157 polynucleotide Substances 0.000 description 2
- 239000013641 positive control Substances 0.000 description 2
- 238000012545 processing Methods 0.000 description 2
- 108020001580 protein domains Proteins 0.000 description 2
- 230000001105 regulatory effect Effects 0.000 description 2
- 230000007017 scission Effects 0.000 description 2
- 238000011218 seed culture Methods 0.000 description 2
- 238000005063 solubilization Methods 0.000 description 2
- 230000007928 solubilization Effects 0.000 description 2
- 239000000243 solution Substances 0.000 description 2
- 235000000346 sugar Nutrition 0.000 description 2
- 239000006228 supernatant Substances 0.000 description 2
- RWQNBRDOKXIBIV-UHFFFAOYSA-N thymine Chemical compound CC1=CNC(=O)NC1=O RWQNBRDOKXIBIV-UHFFFAOYSA-N 0.000 description 2
- 230000014616 translation Effects 0.000 description 2
- MTCFGRXMJLQNBG-REOHCLBHSA-N (2S)-2-Amino-3-hydroxypropansäure Chemical compound OC[C@H](N)C(O)=O MTCFGRXMJLQNBG-REOHCLBHSA-N 0.000 description 1
- HKZAAJSTFUZYTO-LURJTMIESA-N (2s)-2-[[2-[[2-[[2-[(2-aminoacetyl)amino]acetyl]amino]acetyl]amino]acetyl]amino]-3-hydroxypropanoic acid Chemical compound NCC(=O)NCC(=O)NCC(=O)NCC(=O)N[C@@H](CO)C(O)=O HKZAAJSTFUZYTO-LURJTMIESA-N 0.000 description 1
- OWEGMIWEEQEYGQ-UHFFFAOYSA-N 100676-05-9 Natural products OC1C(O)C(O)C(CO)OC1OCC1C(O)C(O)C(O)C(OC2C(OC(O)C(O)C2O)CO)O1 OWEGMIWEEQEYGQ-UHFFFAOYSA-N 0.000 description 1
- 229930024421 Adenine Natural products 0.000 description 1
- GFFGJBXGBJISGV-UHFFFAOYSA-N Adenine Chemical compound NC1=NC=NC2=C1N=CN2 GFFGJBXGBJISGV-UHFFFAOYSA-N 0.000 description 1
- VHUUQVKOLVNVRT-UHFFFAOYSA-N Ammonium hydroxide Chemical compound [NH4+].[OH-] VHUUQVKOLVNVRT-UHFFFAOYSA-N 0.000 description 1
- 101100301559 Bacillus anthracis repS gene Proteins 0.000 description 1
- 101100247969 Clostridium saccharobutylicum regA gene Proteins 0.000 description 1
- 206010053567 Coagulopathies Diseases 0.000 description 1
- 208000035473 Communicable disease Diseases 0.000 description 1
- 241000238557 Decapoda Species 0.000 description 1
- 101100412434 Escherichia coli (strain K12) repB gene Proteins 0.000 description 1
- 241000644323 Escherichia coli C Species 0.000 description 1
- 108050001049 Extracellular proteins Proteins 0.000 description 1
- 241000192125 Firmicutes Species 0.000 description 1
- 238000003794 Gram staining Methods 0.000 description 1
- 101710154606 Hemagglutinin Proteins 0.000 description 1
- 239000012480 LAL reagent Substances 0.000 description 1
- GUBGYTABKSRVRQ-PICCSMPSSA-N Maltose Natural products O[C@@H]1[C@@H](O)[C@H](O)[C@@H](CO)O[C@@H]1O[C@@H]1[C@@H](CO)OC(O)[C@H](O)[C@H]1O GUBGYTABKSRVRQ-PICCSMPSSA-N 0.000 description 1
- 102000016943 Muramidase Human genes 0.000 description 1
- 108010014251 Muramidase Proteins 0.000 description 1
- 108010062010 N-Acetylmuramoyl-L-alanine Amidase Proteins 0.000 description 1
- 108091092724 Noncoding DNA Proteins 0.000 description 1
- 101710093908 Outer capsid protein VP4 Proteins 0.000 description 1
- 101710135467 Outer capsid protein sigma-1 Proteins 0.000 description 1
- 229910019142 PO4 Inorganic materials 0.000 description 1
- 101710176177 Protein A56 Proteins 0.000 description 1
- CZPWVGJYEJSRLH-UHFFFAOYSA-N Pyrimidine Chemical compound C1=CN=CN=C1 CZPWVGJYEJSRLH-UHFFFAOYSA-N 0.000 description 1
- 108010000605 Ribosomal Proteins Proteins 0.000 description 1
- 108091081021 Sense strand Proteins 0.000 description 1
- MTCFGRXMJLQNBG-UHFFFAOYSA-N Serine Natural products OCC(N)C(O)=O MTCFGRXMJLQNBG-UHFFFAOYSA-N 0.000 description 1
- 108091081024 Start codon Proteins 0.000 description 1
- 101100114425 Streptococcus agalactiae copG gene Proteins 0.000 description 1
- 229930006000 Sucrose Natural products 0.000 description 1
- CZMRCDWAGMRECN-UGDNZRGBSA-N Sucrose Chemical compound O[C@H]1[C@H](O)[C@@H](CO)O[C@@]1(CO)O[C@@H]1[C@H](O)[C@@H](O)[C@H](O)[C@@H](CO)O1 CZMRCDWAGMRECN-UGDNZRGBSA-N 0.000 description 1
- 108700026226 TATA Box Proteins 0.000 description 1
- 239000007983 Tris buffer Substances 0.000 description 1
- 229920004890 Triton X-100 Polymers 0.000 description 1
- 239000013504 Triton X-100 Substances 0.000 description 1
- 230000006978 adaptation Effects 0.000 description 1
- 239000000654 additive Substances 0.000 description 1
- 229960000643 adenine Drugs 0.000 description 1
- 239000011543 agarose gel Substances 0.000 description 1
- 230000003281 allosteric effect Effects 0.000 description 1
- 239000000908 ammonium hydroxide Substances 0.000 description 1
- 238000004458 analytical method Methods 0.000 description 1
- 238000011091 antibody purification Methods 0.000 description 1
- 238000013459 approach Methods 0.000 description 1
- QVGXLLKOCUKJST-UHFFFAOYSA-N atomic oxygen Chemical compound [O] QVGXLLKOCUKJST-UHFFFAOYSA-N 0.000 description 1
- 230000003542 behavioural effect Effects 0.000 description 1
- 230000008901 benefit Effects 0.000 description 1
- 230000003115 biocidal effect Effects 0.000 description 1
- 230000033228 biological regulation Effects 0.000 description 1
- 230000000740 bleeding effect Effects 0.000 description 1
- 210000004556 brain Anatomy 0.000 description 1
- 244000309464 bull Species 0.000 description 1
- 238000004364 calculation method Methods 0.000 description 1
- 102000028861 calmodulin binding Human genes 0.000 description 1
- 108091000084 calmodulin binding Proteins 0.000 description 1
- 238000010523 cascade reaction Methods 0.000 description 1
- 230000008859 change Effects 0.000 description 1
- 239000002738 chelating agent Substances 0.000 description 1
- 239000003795 chemical substances by application Substances 0.000 description 1
- 238000012411 cloning technique Methods 0.000 description 1
- 239000013599 cloning vector Substances 0.000 description 1
- 230000035602 clotting Effects 0.000 description 1
- 238000004440 column chromatography Methods 0.000 description 1
- 230000009137 competitive binding Effects 0.000 description 1
- 238000012790 confirmation Methods 0.000 description 1
- NKLPQNGYXWVELD-UHFFFAOYSA-M coomassie brilliant blue Chemical compound [Na+].C1=CC(OCC)=CC=C1NC1=CC=C(C(=C2C=CC(C=C2)=[N+](CC)CC=2C=C(C=CC=2)S([O-])(=O)=O)C=2C=CC(=CC=2)N(CC)CC=2C=C(C=CC=2)S([O-])(=O)=O)C=C1 NKLPQNGYXWVELD-UHFFFAOYSA-M 0.000 description 1
- 230000009089 cytolysis Effects 0.000 description 1
- 229940104302 cytosine Drugs 0.000 description 1
- 230000007423 decrease Effects 0.000 description 1
- 230000007123 defense Effects 0.000 description 1
- 230000001419 dependent effect Effects 0.000 description 1
- 230000001687 destabilization Effects 0.000 description 1
- 238000001514 detection method Methods 0.000 description 1
- 230000001627 detrimental effect Effects 0.000 description 1
- 238000010790 dilution Methods 0.000 description 1
- 239000012895 dilution Substances 0.000 description 1
- 231100000676 disease causative agent Toxicity 0.000 description 1
- 238000010494 dissociation reaction Methods 0.000 description 1
- 230000005593 dissociations Effects 0.000 description 1
- 229940126534 drug product Drugs 0.000 description 1
- 238000010828 elution Methods 0.000 description 1
- 239000003623 enhancer Substances 0.000 description 1
- 230000002255 enzymatic effect Effects 0.000 description 1
- 239000013613 expression plasmid Substances 0.000 description 1
- 238000001125 extrusion Methods 0.000 description 1
- 238000009472 formulation Methods 0.000 description 1
- 230000012010 growth Effects 0.000 description 1
- 239000001963 growth medium Substances 0.000 description 1
- 235000003642 hunger Nutrition 0.000 description 1
- 238000010348 incorporation Methods 0.000 description 1
- 208000015181 infectious disease Diseases 0.000 description 1
- 238000001802 infusion Methods 0.000 description 1
- 239000004615 ingredient Substances 0.000 description 1
- 230000000977 initiatory effect Effects 0.000 description 1
- 238000003780 insertion Methods 0.000 description 1
- 230000037431 insertion Effects 0.000 description 1
- 230000003993 interaction Effects 0.000 description 1
- 239000007928 intraperitoneal injection Substances 0.000 description 1
- 238000007913 intrathecal administration Methods 0.000 description 1
- 238000001990 intravenous administration Methods 0.000 description 1
- 229930027917 kanamycin Natural products 0.000 description 1
- SBUJHOSQTJFQJX-NOAMYHISSA-N kanamycin Chemical compound O[C@@H]1[C@@H](O)[C@H](O)[C@@H](CN)O[C@@H]1O[C@H]1[C@H](O)[C@@H](O[C@@H]2[C@@H]([C@@H](N)[C@H](O)[C@@H](CO)O2)O)[C@H](N)C[C@@H]1N SBUJHOSQTJFQJX-NOAMYHISSA-N 0.000 description 1
- 229960000318 kanamycin Drugs 0.000 description 1
- 229930182823 kanamycin A Natural products 0.000 description 1
- 238000012454 limulus amebocyte lysate test Methods 0.000 description 1
- 238000004811 liquid chromatography Methods 0.000 description 1
- 229960000274 lysozyme Drugs 0.000 description 1
- 239000004325 lysozyme Substances 0.000 description 1
- 235000010335 lysozyme Nutrition 0.000 description 1
- 101150106875 malE gene Proteins 0.000 description 1
- 238000012768 mass vaccination Methods 0.000 description 1
- 239000012528 membrane Substances 0.000 description 1
- 238000001000 micrograph Methods 0.000 description 1
- 239000000178 monomer Substances 0.000 description 1
- 108091005763 multidomain proteins Proteins 0.000 description 1
- 238000013433 optimization analysis Methods 0.000 description 1
- 210000003463 organelle Anatomy 0.000 description 1
- 239000001301 oxygen Substances 0.000 description 1
- 229910052760 oxygen Inorganic materials 0.000 description 1
- 239000000825 pharmaceutical preparation Substances 0.000 description 1
- 235000021317 phosphate Nutrition 0.000 description 1
- 125000002467 phosphate group Chemical group [H]OP(=O)(O[H])O[*] 0.000 description 1
- 150000004713 phosphodiesters Chemical class 0.000 description 1
- 150000003013 phosphoric acid derivatives Chemical class 0.000 description 1
- 230000001766 physiological effect Effects 0.000 description 1
- 230000006461 physiological response Effects 0.000 description 1
- 230000008488 polyadenylation Effects 0.000 description 1
- 238000002360 preparation method Methods 0.000 description 1
- 239000000047 product Substances 0.000 description 1
- 230000035755 proliferation Effects 0.000 description 1
- 230000012846 protein folding Effects 0.000 description 1
- 238000001742 protein purification Methods 0.000 description 1
- 238000001243 protein synthesis Methods 0.000 description 1
- 230000017854 proteolysis Effects 0.000 description 1
- 238000012205 qualitative assay Methods 0.000 description 1
- 239000002994 raw material Substances 0.000 description 1
- 230000022532 regulation of transcription, DNA-dependent Effects 0.000 description 1
- 238000012429 release testing Methods 0.000 description 1
- 101150044854 repA gene Proteins 0.000 description 1
- 230000003252 repetitive effect Effects 0.000 description 1
- 230000004044 response Effects 0.000 description 1
- 238000004062 sedimentation Methods 0.000 description 1
- 238000000926 separation method Methods 0.000 description 1
- 230000035939 shock Effects 0.000 description 1
- 239000011780 sodium chloride Substances 0.000 description 1
- 230000037351 starvation Effects 0.000 description 1
- 239000007929 subcutaneous injection Substances 0.000 description 1
- 238000010254 subcutaneous injection Methods 0.000 description 1
- 239000005720 sucrose Substances 0.000 description 1
- 150000008163 sugars Chemical class 0.000 description 1
- 238000003786 synthesis reaction Methods 0.000 description 1
- 230000008685 targeting Effects 0.000 description 1
- 229940113082 thymine Drugs 0.000 description 1
- 230000005030 transcription termination Effects 0.000 description 1
- 230000002103 transcriptional effect Effects 0.000 description 1
- 238000012546 transfer Methods 0.000 description 1
- 238000013519 translation Methods 0.000 description 1
- 230000005945 translocation Effects 0.000 description 1
- LENZDBCJOHFCAS-UHFFFAOYSA-N tris Chemical compound OCC(N)(CO)CO LENZDBCJOHFCAS-UHFFFAOYSA-N 0.000 description 1
- 108010087967 type I signal peptidase Proteins 0.000 description 1
- 229940035893 uracil Drugs 0.000 description 1
Classifications
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q1/00—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
- C12Q1/34—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving hydrolase
- C12Q1/37—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving hydrolase involving peptidase or proteinase
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/09—Recombinant DNA-technology
- C12N15/11—DNA or RNA fragments; Modified forms thereof; Non-coding nucleic acids having a biological activity
- C12N15/62—DNA sequences coding for fusion proteins
- C12N15/625—DNA sequences coding for fusion proteins containing a sequence coding for a signal sequence
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/09—Recombinant DNA-technology
- C12N15/63—Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
- C12N15/74—Vectors or expression systems specially adapted for prokaryotic hosts other than E. coli, e.g. Lactobacillus, Micromonospora
- C12N15/77—Vectors or expression systems specially adapted for prokaryotic hosts other than E. coli, e.g. Lactobacillus, Micromonospora for Corynebacterium; for Brevibacterium
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Y—ENZYMES
- C12Y304/00—Hydrolases acting on peptide bonds, i.e. peptidases (3.4)
- C12Y304/21—Serine endopeptidases (3.4.21)
- C12Y304/21084—Serine endopeptidases (3.4.21) limulus clotting factor C (3.4.21.84)
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Y—ENZYMES
- C12Y304/00—Hydrolases acting on peptide bonds, i.e. peptidases (3.4)
- C12Y304/21—Serine endopeptidases (3.4.21)
- C12Y304/21085—Limulus clotting factor B (3.4.21.85)
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Y—ENZYMES
- C12Y304/00—Hydrolases acting on peptide bonds, i.e. peptidases (3.4)
- C12Y304/21—Serine endopeptidases (3.4.21)
- C12Y304/21086—Limulus clotting enzyme (3.4.21.86)
-
- G—PHYSICS
- G01—MEASURING; TESTING
- G01N—INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
- G01N33/00—Investigating or analysing materials by specific methods not covered by groups G01N1/00 - G01N31/00
- G01N33/48—Biological material, e.g. blood, urine; Haemocytometers
- G01N33/50—Chemical analysis of biological material, e.g. blood, urine; Testing involving biospecific ligand binding methods; Immunological testing
- G01N33/68—Chemical analysis of biological material, e.g. blood, urine; Testing involving biospecific ligand binding methods; Immunological testing involving proteins, peptides or amino acids
-
- C—CHEMISTRY; METALLURGY
- C07—ORGANIC CHEMISTRY
- C07K—PEPTIDES
- C07K2319/00—Fusion polypeptide
- C07K2319/01—Fusion polypeptide containing a localisation/targetting motif
- C07K2319/02—Fusion polypeptide containing a localisation/targetting motif containing a signal sequence
-
- C—CHEMISTRY; METALLURGY
- C07—ORGANIC CHEMISTRY
- C07K—PEPTIDES
- C07K2319/00—Fusion polypeptide
- C07K2319/20—Fusion polypeptide containing a tag with affinity for a non-protein ligand
-
- G—PHYSICS
- G01—MEASURING; TESTING
- G01N—INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
- G01N2333/00—Assays involving biological materials from specific organisms or of a specific nature
- G01N2333/195—Assays involving biological materials from specific organisms or of a specific nature from bacteria
- G01N2333/34—Assays involving biological materials from specific organisms or of a specific nature from bacteria from Corynebacterium (G)
-
- G—PHYSICS
- G01—MEASURING; TESTING
- G01N—INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
- G01N2333/00—Assays involving biological materials from specific organisms or of a specific nature
- G01N2333/90—Enzymes; Proenzymes
- G01N2333/914—Hydrolases (3)
- G01N2333/948—Hydrolases (3) acting on peptide bonds (3.4)
- G01N2333/95—Proteinases, i.e. endopeptidases (3.4.21-3.4.99)
- G01N2333/964—Proteinases, i.e. endopeptidases (3.4.21-3.4.99) derived from animal tissue
- G01N2333/96425—Proteinases, i.e. endopeptidases (3.4.21-3.4.99) derived from animal tissue from mammals
- G01N2333/96427—Proteinases, i.e. endopeptidases (3.4.21-3.4.99) derived from animal tissue from mammals in general
- G01N2333/9643—Proteinases, i.e. endopeptidases (3.4.21-3.4.99) derived from animal tissue from mammals in general with EC number
- G01N2333/96433—Serine endopeptidases (3.4.21)
Definitions
- the present invention relates generally to the fields of biotechnology and infectious diseases, and more particularly it pertains to recombinant production of enzymes for detection of pyrogens and endotoxins.
- the assay uses the hemolymph (blood) of the horseshoe crab, Limulus polyphemus (L. polyphemus), and tests for the presence of fever-producing agents of bacterial origin, e.g., endotoxins.
- the limulus amoebocyte lysate (LAL) test method is a qualitative assay during which the L. polyphemus hemolymph lysate reacts with an endotoxin to form a gel.
- the LAL test is considered to be reproducible, simple to conduct, specific for the presence of endotoxins, and sensitive to even picogram quantities of endotoxins.
- the quantity of endotoxin may be determined by dilution techniques comparing gel formation of the test sample to that of a reference pyrogen.
- the following non-essential publications are incorporated by reference in their entirety to aid in understanding of the official use of the LAL assay for release testing of final drug products: Levin, J, et al. Clotting cells and Limulus Amebocyte lysate: an amazing analytical tool.
- Shuster CNJ, Barlow RB and Brockman HJ eds
- McCullough KZ ed.
- the bacterial endotoxins test a practical approach. 2011: 1-13.
- the LAL assay comprises horseshoe crab lysate reagents that form a four-step coagulation cascade.
- Enzyme, and one clotting protein, Coagulogen form the enzymatic coagulation cascade that results in a coagulin gel clot in the presence of an endotoxin.
- an endotoxin activates the Factor C zymogen and the activated Factor C subsequently activates Factor B, which converts the Proclotting Enzyme into Clotting Enzyme that cleaves Coagulogen into Coagulin, forming a gel clot.
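The activation order described above can be sketched as a small lookup-driven model. This is only an illustration of the sequence of steps named in the text; the function name `run_cascade` and the list `CASCADE` are hypothetical and say nothing about kinetics or concentrations.

```python
from typing import List

# Ordered (activator, target) steps as described in the text:
# endotoxin -> Factor C -> Factor B -> Proclotting Enzyme -> Coagulogen.
CASCADE = [
    ("endotoxin", "Factor C"),
    ("Factor C", "Factor B"),
    ("Factor B", "Proclotting Enzyme"),
    ("Proclotting Enzyme", "Coagulogen"),
]


def run_cascade(endotoxin_present: bool) -> List[str]:
    """Return the components acted on, in order, for a given sample."""
    active = {"endotoxin"} if endotoxin_present else set()
    acted_on = []
    for activator, target in CASCADE:
        # A step fires only if its upstream activator is already active.
        if activator in active:
            active.add(target)
            acted_on.append(target)
    return acted_on


def gel_clot_forms(endotoxin_present: bool) -> bool:
    """A clot forms only if the final step (Coagulogen cleavage) is reached."""
    return "Coagulogen" in run_cascade(endotoxin_present)
```

Without endotoxin the first step never fires, so nothing downstream is activated and no clot is predicted.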
- the raw materials for the production of lysate reagents are harvested from wild-caught horseshoe crabs, including L. polyphemus and Tachypleus tridentatus (T. tridentatus). Wild horseshoe crab populations are in decline due to the detrimental effects of capture, blood collection, and release, poor management of harvest regulations, and habitat destruction. Commercial-scale cultivation of horseshoe crabs has not been achieved.
- the following non-essential publications are incorporated by reference in their entirety to aid in understanding of the unsustainability of blood collection from wild-caught crabs for production of LAL assay reagents: Gauvry G. Current
- Horseshoe crab harvesting practices cannot support global demand for TAL/LAL: The pharmaceutical and medical device industries’ role in the sustainability of horseshoe crabs. In:
- the disclosure features expression cassettes, plasmids, and functional recombinant cascade reagents (RCRs) produced from these expression cassettes and plasmids.
- the disclosure also features expression cassettes for functional RCRs optimized for production in Corynebacterium glutamicum (C. glutamicum).
- the disclosure features optimized expression cassettes for production in C. glutamicum of the Factor C, Factor B, and Proclotting Enzyme serine protease zymogens, as well as optimized expression cassettes for production of the Coagulogen clotting protein.
- the disclosure provides nucleic acid molecules, comprising expression cassettes, wherein the expression cassettes comprise, from 5’ to 3’: a promoter; a signal sequence; and a sequence encoding a cascade reagent protein.
- the expression cassette is optimized for expression in C. glutamicum.
- the signal sequence encodes a signal peptide.
- the promoter drives expression of the signal sequence and the sequence encoding the cascade reagent protein.
- the promoter comprises a nucleic acid sequence derived from a promoter of a C. glutamicum secretory gene.
- the signal sequence comprises a nucleic acid sequence derived from a signal sequence of a C. glutamicum secretory gene.
- the C. glutamicum secretory gene is selected from the group consisting of the cg1514 gene, the cspA gene, the cspB gene, the
- sequence encoding the cascade reagent protein encodes
- the sequence encoding the cascade reagent protein comprises a nucleic acid sequence derived from the genome of a horseshoe crab selected from the group consisting of Tachypleus tridentatus, Limulus polyphemus, Tachypleus gigas and Carcinoscorpius rotundicauda (C. rotundicauda).
- the signal sequence is located between the promoter and the sequence encoding the cascade reagent protein.
- the expression cassette comprises a termination sequence.
- the termination sequence is selected from the group consisting of the termination region of the Escherichia coli rrnB gene, the termination region of the Corynebacterium glutamicum cg1502 gene, the termination region of the Corynebacterium glutamicum cg3011 gene, the termination region of the Corynebacterium glutamicum cspA gene, and the termination region of the Corynebacterium glutamicum cg1338 gene.
- the expression cassette comprises a sequence encoding a polypeptide protein tag. In some embodiments, the expression cassette comprises two or more sequences encoding polypeptide protein tags. In some embodiments, the polypeptide protein tag or polypeptide protein tags are selected from the group consisting of polyhistidine-tag, FLAG-tag, HA-tag, calmodulin-binding peptide, streptavidin-binding peptide, glutathione S-transferase, and maltose-binding protein.
- the sequence encoding the polypeptide protein tag is located between the signal sequence and the sequence encoding the cascade reagent protein.
- a sequence encoding a linker is located between the sequence encoding the polypeptide protein tag and the sequence encoding the cascade reagent protein.
- the sequence encoding the polypeptide protein tag is located between the sequence encoding the cascade reagent protein and the termination sequence.
- a sequence encoding a linker is located between the sequence encoding the cascade reagent protein and the sequence encoding the polypeptide protein tag.
- sequence encoding the cascade reagent protein is located between two sequences encoding polypeptide protein tags.
- sequences encoding linkers are located between the sequence encoding the cascade reagent protein and the sequences encoding the polypeptide protein tags.
- the linker or linkers are selected from the group consisting of flexible glycine-serine linkers, flexible glycine linkers, rigid α-helical linkers, rigid proline-rich linkers, and cleavable disulfide linkers.
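The 5’ to 3’ element orders described above (promoter, signal sequence, optional N-terminal tag and linker, cascade reagent coding sequence, optional linker and C-terminal tag, optional terminator) can be sketched as a small data structure. The class `Cassette` and its field names are hypothetical, and the short strings used with it are placeholders, not sequences from the disclosure.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Cassette:
    """Hypothetical container for the elements of one expression cassette."""
    promoter: str
    signal_sequence: str
    coding_sequence: str
    terminator: Optional[str] = None
    n_tag: Optional[str] = None      # tag between signal sequence and coding sequence
    n_linker: Optional[str] = None   # linker between N-terminal tag and coding sequence
    c_tag: Optional[str] = None      # tag between coding sequence and terminator
    c_linker: Optional[str] = None   # linker between coding sequence and C-terminal tag

    def element_order(self) -> List[str]:
        """Names of the elements present, listed from 5' to 3'."""
        parts = ["promoter", "signal_sequence"]
        if self.n_tag:
            parts.append("n_tag")
            if self.n_linker:
                parts.append("n_linker")
        parts.append("coding_sequence")
        if self.c_tag:
            if self.c_linker:
                parts.append("c_linker")
            parts.append("c_tag")
        if self.terminator:
            parts.append("terminator")
        return parts

    def sequence(self) -> str:
        """Concatenate the present elements in 5' to 3' order."""
        return "".join(getattr(self, name) for name in self.element_order())
```

A linker is emitted only when its tag is present, which mirrors the text: linkers sit between a tag and the cascade reagent coding sequence. Reading frame, start codons, and stop codons are deliberately not modeled here.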
- the cascade reagent protein is encoded by a nucleic acid sequence of SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 278-283, or SEQ ID NO: 325, or a sequence at least 90% identical thereto.
- the cascade reagent protein is encoded by a nucleic acid sequence of SEQ ID NO: 3, SEQ ID NO: 4, or SEQ ID NO: 284-289, or a sequence at least 90% identical thereto.
- the cascade reagent protein is encoded by a nucleic acid sequence of SEQ ID NO: 5 or SEQ ID NO: 290-292, or a sequence at least 90% identical thereto.
- the cascade reagent protein is encoded by a nucleic acid sequence of SEQ ID NO: 6-8 or SEQ ID NO: 293-301, or a sequence at least 90% identical thereto.
- the promoter is encoded by a nucleic acid sequence of any one of SEQ ID NO: 9-13, or a sequence at least 90% identical thereto.
- the signal sequence is encoded by a nucleic acid sequence of any one of SEQ ID NO: 14-18, or a sequence at least 90% identical thereto.
- the polypeptide protein tag is encoded by a nucleic acid sequence of any one of SEQ ID NO: 19-32, or a sequence at least 90% identical thereto.
- the termination sequence is encoded by a nucleic acid sequence of any one of SEQ ID NO: 272-277, or a sequence at least 90% identical thereto.
- the linker is encoded by a nucleic acid sequence of any one of SEQ ID NO: 265-271, or a sequence at least 90% identical thereto.
- the signal sequence and cascade reagent protein are encoded by a nucleic acid sequence of any one of SEQ ID NO: 33-96, or a sequence at least 90% identical thereto.
- the expression cassette comprises a nucleic acid sequence of any one of SEQ ID NO: 97-128 or SEQ ID NO: 322-324, or a sequence at least 90% identical thereto.
- the disclosure provides plasmids comprising nucleic acid molecules disclosed herein.
- the disclosure also provides cells comprising any one of the nucleic acid molecules or plasmids disclosed herein.
- the disclosure provides methods of producing a recombinant expression system comprising contacting a C. glutamicum cell with any one of the nucleic acid molecules or plasmids disclosed herein.
- the disclosure also provides recombinant expression systems produced by the method of contacting a C. glutamicum cell with any one of the nucleic acid molecules or plasmids disclosed herein.
- the disclosure provides methods of expressing Factor C serine protease zymogen
- the disclosure provides isolated, purified protein molecules, wherein the amino acid sequence is at least 75% identical to any one of SEQ ID NO: 129-256.
- the disclosure provides kits for detecting a pyrogen or endotoxin in a sample comprising recombinant Factor C serine protease zymogen, recombinant Factor B serine protease zymogen, recombinant Proclotting Enzyme serine protease zymogen, and recombinant Coagulogen clotting protein expressed in C. glutamicum.
- the amino acid sequence of the recombinant Factor C serine protease zymogen is at least 75% identical to any one of SEQ ID NO: 257 or SEQ ID NO: 258. In some embodiments, the amino acid sequence of the recombinant Factor B serine protease zymogen is at least 75% identical to any one of SEQ ID NO: 259 or SEQ ID NO: 260. In some embodiments, the amino acid sequence of the recombinant Proclotting Enzyme serine protease zymogen is at least 75% identical to SEQ ID NO: 261. In some embodiments, the amino acid sequence of the recombinant Coagulogen clotting protein is at least 75% identical to any one of SEQ ID NO: 262
- the disclosure provides methods for detecting a pyrogen or endotoxin in a sample, comprising contacting the sample with an isolated, purified protein molecule wherein the amino acid sequence is at least 75% identical to any one of SEQ ID NO: 129-256.
- the disclosure provides methods of detecting a pyrogen or endotoxin in a sample, comprising contacting the sample with the components of the kits disclosed herein.
- FIG. 1 depicts the coagulation cascade of the present disclosure based on the coagulation cascade in the horseshoe crab amoebocyte lysate.
- FIGS. 2A-2B depict the expression cassettes of the present disclosure.
- FIG. 2A shows expression cassettes comprising a promoter, a signal sequence, a gene of interest, and a termination sequence, and optionally a polypeptide tag.
- FIG. 2B shows exemplary expression cassettes according to the present invention.
- Expression cassette number 4 (SEQ ID NO: 322) comprises the Pcg1514 promoter (SEQ ID NO: 9), the cg1514 signal sequence (SEQ ID NO: 14),
- Expression cassette number 5 (SEQ ID NO: 323) comprises the Pcg1514 promoter (SEQ ID NO: 9), the cg1514 signal sequence (SEQ ID NO: 14), polyhistidine-tag (SEQ ID NO: 26), Factor C gene from T. tridentatus (SEQ ID NO: 325), and the rrnB terminator (SEQ ID NO: 272).
- Expression cassette number 6 (SEQ ID NO: 324) comprises the promoter (SEQ ID NO: 9), the cg1514 signal sequence (SEQ ID NO: 14), Factor C gene from T. tridentatus (SEQ ID NO: 325), polyhistidine-tag (SEQ ID NO: 26), and the rrnB terminator (SEQ ID NO: 272).
- FIGS. 3A-3B show expression of the plasmids containing expression cassettes in
- FIG. 3A depicts microscopy images showing untransformed gram-positive B. cereus and untransformed gram-negative E. coli (top left), untransformed, gram-positive C. glutamicum (top middle), C. glutamicum transformed with empty plasmid (top right), and C. glutamicum transformed with plasmids comprising expression cassette number 4 of FIG. 2B (SEQ ID NO: 322, bottom left), expression cassette number 6 of FIG. 2B
- FIG. 3B depicts gel electrophoresis showing the molecular weight of plasmids containing the expression cassettes of the present disclosure.
- lane 1 shows C. glutamicum as a negative control
- lane 2 shows C. glutamicum expressing the pEC-pk18mob2 empty plasmid as a positive control
- lane 3 shows C. glutamicum expressing the plasmid comprising the expression cassette number 4 of FIG. 2B (SEQ ID NO: 322)
- lane 4 shows C. glutamicum expressing the plasmid comprising the expression cassette number 6 of FIG. 2B (SEQ ID NO: 324)
- lane 5 shows C. glutamicum expressing the plasmid comprising the expression cassette number 5 of FIG. 2B (SEQ ID NO: 323).
- FIG. 2 show C. glutamicum expressing the pEC-pk18mob2 empty plasmid as a negative control
- lanes 3 and 4 show C. glutamicum expressing the plasmid comprising the expression cassette number 4 of FIG. 2B (SEQ ID NO: 322).
- the limulus amoebocyte lysate (LAL) test method is a standard pyrogen assay employed by a variety of industries to ensure that samples are free of harmful endotoxins and pyrogens.
- the U.S. Food and Drug Administration (FDA) approved the LAL assay for testing drugs, products, and devices, and the assay is widely used to test ingredients of pharmaceuticals during manufacturing.
- the LAL assay is based on a coagulation cascade involving reagents harvested from the hemolymph of wild-caught horseshoe crab. Specifically, exposure of endotoxin to the serine protease zymogen Factor C initiates a cascade that activates the serine protease zymogen
- the disclosure provides nucleic acid molecules (comprising expression cassettes) and plasmids for producing the lysate reagents Factor C, Factor B, Proclotting Enzyme, and Coagulogen, wherein the nucleic acid molecules and plasmids are optimized for expression in the generally regarded as safe (GRAS) actinobacteria Corynebacterium glutamicum (C. glutamicum).
- kits for detecting a pyrogen or endotoxin in a sample comprising recombinant lysate reagents, methods of producing a recombinant expression system using the nucleic acid molecules and plasmids disclosed herein, and methods for detecting pyrogen or endotoxin in a sample.
- nucleic acid molecules, plasmids, isolated, purified protein molecules, kits, and methods of the present disclosure may be used to test for contamination in a variety of industries, including pharmaceuticals (both preclinical studies and clinical applications) and biotechnologies, and settings, including healthcare providers, veterinary clinics, agriculture, food processing and service, wineries, breweries, distilleries, military, and direct-to-consumer.
- nucleic acid molecules, plasmids, isolated, purified protein molecules, kits, and methods of the present disclosure may be used in the agriculture, food service, food processing, winery, brewery, or distillery industries to test for contamination at any point along the logistical supply chain.
- nucleic acid molecules, plasmids, isolated, purified protein molecules, kits, and methods of the present disclosure may be used in the healthcare provider, veterinary clinic, military, and direct-to-consumer industries to test for contamination and institute organizational processes and conditions to sanitize frequently touched objects and surfaces and prevent infection.
- expression cassette refers to a nucleic acid component of vector DNA comprising one or more transcriptional control elements (e.g., promoters, enhancers, and/or regulatory sequences and polyadenylation sequences) that direct gene expression of a sequence encoding a protein and/or polypeptide, e.g., a linear nucleic acid sequence encoding one or more transgenes that are expressed by one or more cell types.
- expression vector and “plasmid” are terms of the art understood by skilled persons and refer to synthetic DNA molecules used to carry foreign genetic material into a cell.
- recombinant DNA is a term of the art understood by skilled persons and refers to combining two or more DNA molecules from two or more different sources
- recombinant protein is a term of the art understood by skilled persons and refers to protein encoded by recombinant
- recombinant is a term of the art understood by skilled persons and refers to recombined DNA, e.g., recombinant DNA, and/or artificially produced protein, e.g., recombinant protein.
- the term “recombinant expression system” refers to a system for expressing recombinant protein in cells by transfecting cells with a DNA vector, expression vector, or plasmid.
- expression is a term of the art understood by skilled persons and refers to production of large amounts of recombinant DNA and/or recombinant protein by manipulation of the genetic material.
- optimized expression or “optimized for expression” refer to adaptation of some or all of nucleic acid molecules, including synthetic DNA molecules, recombinant DNA, and/or DNA vector, to the host organism to optimize synthesis and/or production of recombinant proteins. Optimization for expression may include optimizing GC content and noncoding DNA elements. Optimization for expression may include optimization based on highly expressed genes (HEG) wherein the codon usage of predicted highly expressed genes from 150 bacterial genomes under translational selection determines codon usage.
- Optimization for expression may also include determination of codon usage based on ribosomal protein genes (RPG) or tRNA gene copy number (tRNA).
- Optimization for expression may include general optimization based on the C. glutamicum codon usage table generated from 9,019 coding sequences representing 2,866,198 codons. Optimization for expression may include optimization based on the software OptimWiz, a proprietary codon optimization analysis tool, which may optimize for expression by modifying GC-content, mRNA secondary structure, Shine-Dalgarno sequence, RNA instability motifs, repetitive sequences, internal splice sites, and restriction enzyme recognition sites.
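A codon usage table of the kind mentioned above is, at its simplest, a frequency count over the codons of a set of coding sequences, and GC content is a simple base count. The sketch below shows both calculations under that assumption; the function names `codon_usage` and `gc_content` are hypothetical, and this is not a description of how OptimWiz or the cited C. glutamicum table were actually produced.

```python
from collections import Counter
from typing import Dict, Iterable


def codon_usage(coding_sequences: Iterable[str]) -> Dict[str, float]:
    """Relative frequency of each codon across in-frame coding sequences."""
    counts = Counter()
    for cds in coding_sequences:
        cds = cds.upper()
        usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
        for i in range(0, usable, 3):
            counts[cds[i:i + 3]] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {codon: n / total for codon, n in counts.items()}


def gc_content(sequence: str) -> float:
    """Fraction of bases in the sequence that are G or C."""
    sequence = sequence.upper()
    if not sequence:
        return 0.0
    return (sequence.count("G") + sequence.count("C")) / len(sequence)
```

For example, `codon_usage(["ATGGCCGCC", "ATGAAA"])` counts five codons in total, two of which are `ATG` and two `GCC`. A production table would normally report frequencies per amino acid rather than over all codons; the overall frequency is used here only to keep the sketch short.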
- Exemplary optimization for expression of the present invention includes replacing nucleic acids of a sequence encoding a cascade reagent protein, a sequence encoding a polypeptide protein tag, and/or sequences encoding linkers with nucleic acids encoding codons based on the HEG, RPG, tRNA, general, or OptimWiz optimization methods.
- optimized expression or “optimized for expression” may also refer to polypeptides or proteins encoded by nucleic acid sequences that have been optimized for expression, i.e. optimization of the coding sequence that codes for the sequence of amino acids in a protein.
- nucleic acid As used herein, the term “nucleic acid”, “nucleic acid molecule”, or
- nucleotide refers to a sequence of more than one nucleotide base monomer, for example deoxyribonucleic acid (DNA) and ribonucleic acid (RNA), in a single chain, including naturally occurring and non-naturally occurring nucleotides.
- nucleotide refers to conventional nucleotide bases, e.g., the purine and pyrimidine bases adenine (A), guanine (G), thymine (T), cytosine (C), and uracil (U).
- a nucleic acid will generally contain sugars and phosphates connected in an alternating chain through phosphodiester linkages.
- Nucleic acid sequences may encode polypeptides or may include sequences regulating transcription (e.g., promoters and terminators).
- polypeptide refers to a continuous, unbranched chain of peptides linked by peptide bonds. Amino acids incorporated into peptides are known as residues, and the term “amino acid sequence” refers to a sequence of amino acids, including naturally occurring and non-naturally occurring amino acids. Longer polypeptides are known as proteins, and the term “protein tag” is used to refer to a shorter polypeptide. Generally, polypeptides have an N-terminus, also known as the N-terminal end or amine-terminus, and a C-terminus, also known as the C-terminal end, carboxyl-terminus, or carboxy-terminus. Polypeptides may be fused to other polypeptides by combining the genes or parts of genes that encode them to produce recombinant
- DNA that encodes a recombinant fusion protein may be fused N-terminally or C-terminally to another protein tag or domain. Fusion of a protein tag to the N-terminus of a protein results in an N-terminally tagged protein, and fusion of a protein tag to the
- Linker sequences may encode cleavable polypeptides, which can be cleaved upon exposure to enzyme, chemical reagents, or irradiation, or non-cleavable polypeptides, including flexible polypeptide linkers composed of glycine and serine known as GS linkers, for example (Gly-Gly-Gly-Gly-Ser)n, or rigid linkers, for example proline-rich or α-helical linkers.
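The repeat notation (Gly-Gly-Gly-Gly-Ser)n can be made concrete with a short helper that writes out the linker in one-letter amino acid code (G for glycine, S for serine). The function name `gs_linker` is hypothetical, and the sketch covers only the amino acid sequence, not the choice of codons.

```python
def gs_linker(n: int) -> str:
    """Amino acid sequence of a (Gly-Gly-Gly-Gly-Ser)n linker, one-letter code."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    return "GGGGS" * n
```

For instance, n = 3 gives a 15-residue linker, `GGGGSGGGGSGGGGS`.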
- streptavidin-binding peptide and “SBP” may include the 38-amino acid sequence or
- sequences are aligned for optimal comparison performance and the nucleotides or amino acid residues at corresponding nucleotide positions or amino acid positions are then compared.
- Molecules are identical at a position when a position in the first sequence is occupied by the same nucleotide or amino acid residue as the corresponding position in the second sequence.
- the percent identity between the two sequences is a function of the number of identical positions shared by the sequences.
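The comparison described above can be sketched for two sequences that have already been aligned, with `-` marking gaps. The function names are hypothetical, and the choice of denominator (every alignment column in which at least one sequence has a residue) is an assumption; other conventions, such as dividing by the length of the shorter sequence, give different values.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of compared alignment columns holding the same residue."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" and b == "-":
            continue  # column is a gap in both sequences; not compared
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * identical / compared


def meets_threshold(aligned_a: str, aligned_b: str, threshold: float) -> bool:
    """True if the percent identity is at least the given threshold."""
    return percent_identity(aligned_a, aligned_b) >= threshold
```

With this convention, `percent_identity("AC-T", "ACGT")` counts four columns, three of them identical, so a 90% threshold would not be met while a 75% threshold would.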
- the term “homolog” refers to a protein that has a common ancestor, and may include proteins that exhibit sequence homology, i.e., the proteins share sequence similarity.
- promoter refers to one or more DNA sequences that regulate expression of operably linked nucleic acid sequences by facilitating binding of proteins
- the transcription start site is the location where transcription starts at the 5’-end of the operably linked nucleic acid sequence, and the promoter generally includes consensus sequences, such as a TATA box, near the transcription start site.
- terminator or “termination sequence” refer to one or more DNA sequences that regulate expression of operably linked nucleic acid sequences by facilitating termination of transcription of RNA from the DNA upstream of the terminator.
- the termination sequence is downstream of a stop codon that signals termination of translation of the protein translated from the RNA transcribed from the DNA upstream of the stop codon.
- transgene refers to a gene transferred from one organism to another, i.e., an exogenous nucleic acid sequence encoding a polypeptide to be expressed in a cell.
- a transgene contains a promoter, a protein coding sequence, and a termination sequence.
- the term “gene of interest” refers to the nucleic acid sequence encoding a protein, i.e., a protein coding sequence.
- Exemplary genes of interest of the present invention include nucleic acid sequences encoding clotting proteins, Factor C serine proteases, Factor B serine proteases,
- signal sequence refers to a nucleic acid sequence encoding a short peptide present at the terminus of most proteins destined for secretion via the cellular secretory pathway.
- signal peptide refers to the polypeptide encoded by the signal sequence, and is generally present at the N-terminus of secreted proteins.
- secretory gene refers to genes encoding proteins destined for secretion via the cellular secretory pathway.
- pyrogen and “endotoxin” are used interchangeably and refer to causative agents responsible for biological effects incidental to therapy administered parenterally, i.e. therapies administered to the body other than through the mouth and alimentary canal.
- Parenteral therapies including injection (e.g., subcutaneous injection, intraperitoneal injection, intrathecal injection, etc.), allow pyrogens or endotoxins to bypass the normal body defenses.
- the host’s responses to pyrogens or endotoxins include fever, shock, and other physiological responses. While the terms pyrogen and endotoxin are used interchangeably herein, not all pyrogens are endotoxins.
- amino acid refers to naturally occurring and non-naturally occurring or synthetic amino acids. Naturally occurring, levorotatory (L-) amino acids and their abbreviations (three-letter code and one-letter code) are shown in Table 1.
- the disclosure provides nucleic acid sequences comprising one or more expression cassettes optimized for expression in C. glutamicum.
- the expression cassette comprises, from 5’ to 3’, a promoter, a signal sequence, and a sequence encoding a cascade reagent protein.
- the cascade reagent protein is Factor C.
- the cascade reagent protein is Factor B.
- the cascade reagent protein is Proclotting Enzyme.
- the cascade reagent protein is Coagulogen.
- the expression cassette comprises a termination sequence, a sequence encoding a polypeptide protein tag, and/or a sequence encoding a linker.
- the expression cassette comprises a nucleic acid sequence having at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least
- Promoters or promoter sequences are sequences of DNA to which transcription factors bind, thereby initiating transcription of RNA from the DNA downstream of the promoter.
- Promoters are located upstream, or toward the 5’ region of the sense strand, of the transcription start site and may include consensus sequences such as TATAAT or TTGACA. Promoters drive expression of DNA, e.g., genes or transgenes, downstream of the promoter. RNA molecules transcribed from operably linked DNA sequences adjacent to promoters may encode a protein.
- RNA sequence helps recruit the ribosome to the messenger RNA (mRNA) to initiate protein synthesis by aligning the ribosome with the start codon and may include consensus sequences such as the Shine-Dalgarno sequence, e.g., AGGAGGU or GAGG.
- tRNA may add amino acids in sequence as dictated by the codons, moving downstream from the translational start site.
- the expression cassettes of the disclosure may comprise a promoter.
- the promoter drives expression of a signal sequence and a sequence encoding a cascade reagent protein.
- the promoter comprises a nucleic acid sequence derived from a promoter of a secretory gene.
- the promoter comprises a nucleic acid sequence derived from a promoter of a C. glutamicum secretory gene, for example the promoters listed in Table 2.
- the promoter may comprise a nucleic acid sequence having at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identity to the nucleic acid sequences of SEQ ID NO:
- Signal sequences are sequences of DNA encoding a signal peptide. Signal sequences may be referred to as localization signals, localization sequences, leader sequences, or targeting signals and a signal peptide may be referred to as a transit peptide or leader peptide.
- Signal peptides are short peptides that prompt a cell to translocate the protein, and are often present at the N-terminus of proteins destined for secretion, which may include translocation to certain organelles, secretion from the cell, or insertion into cellular membranes.
- the expression cassettes of the disclosure may comprise a signal sequence.
- the signal sequence may encode a signal peptide.
- the signal sequence is located between the promoter and the sequence encoding the cascade reagent protein.
- the signal sequence is located between the promoter and the sequence encoding a polypeptide protein tag.
- the signal sequence comprises a nucleic acid sequence derived from a signal sequence of a secretory gene.
- the signal sequence comprises a nucleic acid sequence derived from a signal sequence of a C. glutamicum secretory gene, for example the signal sequences listed in Table 3.
- the core of the signal peptide may comprise a sequence of hydrophobic amino acids.
- the sequence of hydrophobic amino acids may be about 5 to 16 residues in length, for example 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, or 16 residues in length.
- the signal peptide may comprise a short positively charged sequence of amino acids at the N-terminus.
- the signal peptide may comprise a sequence of amino acids recognized and cleaved by signal peptidases.
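The hydrophobic core described above, about 5 to 16 residues long, can be located with a simple scan for the longest run of hydrophobic residues. This is a rough sketch only: the residue set in `HYDROPHOBIC` is an assumed, simplified definition of hydrophobicity, the function names are hypothetical, and real signal peptide prediction uses considerably more than run length.

```python
from typing import Tuple

# Assumption: a simplified set of hydrophobic residues (one-letter code).
HYDROPHOBIC = set("AVLIMFW")


def longest_hydrophobic_run(peptide: str) -> Tuple[int, int]:
    """Return (start index, length) of the longest run of hydrophobic residues."""
    best_start, best_len = 0, 0
    run_start = None
    # The trailing "#" is a sentinel that closes a run ending at the last residue.
    for i, residue in enumerate(peptide.upper() + "#"):
        if residue in HYDROPHOBIC:
            if run_start is None:
                run_start = i
        elif run_start is not None:
            length = i - run_start
            if length > best_len:
                best_start, best_len = run_start, length
            run_start = None
    return best_start, best_len


def has_plausible_core(peptide: str, min_len: int = 5, max_len: int = 16) -> bool:
    """True if the longest hydrophobic run falls within the stated length range."""
    _, length = longest_hydrophobic_run(peptide)
    return min_len <= length <= max_len
```

Applied to a made-up toy peptide such as `MKKRLLLAAVVLLSAQA` (not a sequence from the disclosure), the scan finds a nine-residue run starting at index 4, after the short positively charged stretch at the N-terminus.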
- the signal sequence may comprise a nucleic acid sequence having at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least
- the signal sequence may encode an amino acid sequence having at least 60%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identity to the amino acid sequences of SEQ ID NO: 302-306.
- a sequence encoding a cascade reagent protein is a sequence of DNA encoding any one of the cascade reagent proteins of the LAL assay disclosed herein.
- a protein encoded by this sequence may also be referred to as a recombinant cascade reagent (RCR), and may include any one of three recombinant protease zymogens, namely Factor C, Factor B, and Proclotting Enzyme, and a clotting protein, namely Coagulogen.
- the expression cassettes of the disclosure may comprise a sequence encoding a cascade reagent protein.
- the sequence encoding a cascade reagent protein may be isolated or derived from the genome of one of any horseshoe crab, for example Tachypleus tridentatus,
- the sequence encoding a cascade reagent protein may be optimized for expression in C. glutamicum, e.g., the sequences listed in Table 4.
- optimization for expression in C. glutamicum may include HEG, RPG, tRNA, general, or OptimWiz optimization and/or optimizing GC content of the DNA sequence.
- sequence encoding a cascade reagent protein may be truncated or mutated from the wild type sequence.
- the sequence encoding a cascade reagent protein may encode a recombinant protein with activity higher than, lower than, or equivalent to that of the wild type protein.
- the sequence encoding a cascade reagent protein may encode a cascade reagent protein homolog.
- the sequence encoding a cascade reagent protein may encode the Factor C serine protease zymogen.
- the sequence encoding the cascade reagent protein Factor C may comprise a nucleic acid sequence having at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least
- SEQ ID NO: 278-283 or SEQ ID NO: 325.
- the sequence encoding a cascade reagent protein may encode the Factor B serine protease zymogen and homologs thereof, e.g., C3 and C2/Bf.
- the sequence encoding the cascade reagent protein Factor B may comprise a nucleic acid sequence having at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least
- the sequence encoding a cascade reagent protein may encode the Proclotting Enzyme serine protease zymogen.
- the sequence encoding the cascade reagent protein Proclotting Enzyme may comprise a nucleic acid sequence having at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least
- sequence encoding a cascade reagent protein may encode the Coagulogen clotting protein.
- sequence encoding the cascade reagent protein Coagulogen may comprise a nucleic acid sequence having at least 70%, at least
- a termination sequence, terminator, or transcription terminator is a sequence of
- Prokaryotic transcription terminators of the present disclosure may be Rho-dependent or Rho-independent. Transcription terminators may comprise a downstream transcription stop point sequence and/or a GC-rich region of dyad symmetry followed by a poly-A sequence to promote allosteric dissociation of the transcriptional complex and/or hairpin loop formation of the transcribed mRNA and subsequent transcription termination.
- the expression cassettes of the disclosure may comprise a termination sequence.
- the termination sequence may be isolated or derived from the genome of one of any suitable organism, for example Escherichia coli (E. coli) or C. glutamicum.
- the termination sequence may comprise the termination region of the E. coli rrnB gene, the termination region of the C. glutamicum cgl502 gene, the termination region of the C. glutamicum cg3011 gene, the termination region of the C. glutamicum cspA gene, or the termination region of the C. glutamicum cgl338 gene.
- the termination sequence may be wild type or optimized for expression in C. glutamicum, e.g., the sequences listed in Table 5.
- optimization for expression in C. glutamicum may include replacing nucleotides of the wild type termination sequence to optimize GC content for expression in C. glutamicum.
- the termination sequence may comprise the wild type rrnB termination sequence from E. coli. In some embodiments, the termination sequence may comprise the sequence of SEQ ID NO: 272, or a sequence at least 70%, at least 75%, at least 80%, at least
- the termination sequence may comprise the rrnB termination sequence from E. coli optimized for expression in C. glutamicum.
- the termination sequence may comprise the sequence of SEQ ID NO: 273, or a sequence at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identical thereto.
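The percent-identity thresholds recited above can be made concrete with a small Python sketch. It assumes the two sequences have already been aligned (gaps written as "-") and counts identity over columns where neither sequence has a gap; this denominator is only one of several conventions in use, and the function names are hypothetical rather than taken from the disclosure.

```python
# Illustrative sketch: the identity convention used here is an assumption.
THRESHOLDS = (70, 75, 80, 85, 90, 95, 96, 97, 98, 99)


def percent_identity(aligned_a, aligned_b):
    """Percent identity over aligned columns in which neither sequence is gapped."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" or y == "-":
            continue  # skip gap columns
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no aligned positions to compare")
    return 100.0 * matches / compared


def thresholds_met(identity):
    """Return the recited 'at least N%' thresholds that an identity value satisfies."""
    return [t for t in THRESHOLDS if identity >= t]


# Hypothetical 10-nt sequences differing at one position.
identity = percent_identity("ACGTACGTAC", "ACGTACGTAA")
met = thresholds_met(identity)
```

A different gap treatment (for example, counting gap columns in the denominator) would give lower values for the same alignment, so the convention should be fixed before comparing against a threshold.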
- the termination sequence may comprise the wild type cgl502 termination sequence from C. glutamicum. In some embodiments, the termination sequence may comprise the sequence of SEQ ID NO: 274, or a sequence at least 70%, at least
- the termination sequence may comprise the wild type cg3011 termination sequence from C. glutamicum. In some embodiments, the termination sequence may comprise the sequence of SEQ ID NO: 275, or a sequence at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identical thereto. In some embodiments, the termination sequence may comprise the wild type cspA termination sequence from C. glutamicum.
- the termination sequence may comprise the sequence of SEQ ID NO: 276, or a sequence at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identical thereto.
- the termination sequence may comprise the wild type cgl338 termination sequence from C. glutamicum.
- the termination sequence may comprise the sequence of SEQ ID NO: 277, or a sequence at least
- a sequence encoding a polypeptide protein tag is a sequence of DNA encoding a peptide sequence, protein tag, or polypeptide protein tag.
- a sequence encoding a polypeptide protein tag may be fused, appended, or grafted to a sequence encoding a protein, generally at either the C-terminus or N-terminus, or at both the C-terminus and the N-terminus of the protein. Less frequently a sequence encoding a polypeptide protein tag may be inserted into the sequence encoding a protein.
- a polypeptide protein tag may be appended to a protein to aid in affinity purification from biological lysate, enhance resolution of chromatographic separation, and/or promote solubilization and proper folding of proteins prone to precipitation.
- Polypeptide protein tags may comprise polyanionic amino acids or epitope tags.
- the expression cassettes of the disclosure may comprise a sequence encoding a polypeptide protein tag. In some embodiments, from 5’ to 3’ the sequence encoding a polypeptide protein tag may be located between the signal sequence and the sequence encoding the cascade reagent protein. In some embodiments, from 5’ to 3’ the sequence encoding a polypeptide protein tag may be located between the sequence encoding the cascade reagent protein and the termination sequence.
- a linker is located between the sequence encoding a polypeptide protein tag and the sequence encoding the cascade reagent protein. In some embodiments, from 5’ to 3’ a linker is located between the sequence encoding the cascade reagent protein and the sequence encoding a polypeptide protein tag.
- two or more sequences encoding polypeptide protein tags may be located in tandem at the 5’ end or the 3’ end of the sequence encoding the cascade reagent protein.
- the sequence encoding the cascade reagent protein may be located between two sequences encoding polypeptide protein tags, i.e., the sequences encoding polypeptide protein tags flank the sequence encoding the cascade reagent protein.
- sequences encoding linkers are located between the sequence encoding the cascade reagent protein and the flanking sequences encoding the polypeptide protein tags.
- the cascade reagent protein may be N-terminally tagged with the polypeptide protein tag. In some embodiments, the cascade reagent protein may be C- terminally tagged with the polypeptide protein tag. In some embodiments, the cascade reagent protein may be N-terminally or C-terminally tagged with tandem polypeptide protein tags. In some embodiments, the cascade reagent protein may be both N-terminally and C-terminally tagged with polypeptide protein tags. In some embodiments, the two or more polypeptide protein tags are identical. In some embodiments, the two or more polypeptide protein tags are not identical. In some embodiments, cleavable, flexible, and/or rigid linkers may separate the polypeptide protein tag or tags from the cascade reagent protein.
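The tag arrangements described above (N-terminal, C-terminal, tandem, or flanking tags, with optional linkers) can be sketched as a simple string assembly in Python. The tag sequences used are the commonly cited ones for 6His, FLAG, HA, and Strep-tag II; the signal peptide and protein strings are short hypothetical placeholders, and the function name is an assumption made for illustration.

```python
# Illustrative sketch: placeholders stand in for real signal peptide and protein sequences.
TAGS = {
    "6His": "HHHHHH",
    "FLAG": "DYKDDDDK",
    "HA": "YPYDVPDYA",
    "StrepII": "WSHPQFEK",
}


def build_fusion(signal_peptide, protein, n_tags=(), c_tags=(), linker=""):
    """Assemble signal peptide, optional tags, optional linkers, and protein, N- to C-terminus."""
    parts = [signal_peptide]
    parts.extend(TAGS[name] for name in n_tags)  # tandem N-terminal tags
    if n_tags and linker:
        parts.append(linker)  # linker between N-terminal tag(s) and protein
    parts.append(protein)
    if c_tags and linker:
        parts.append(linker)  # linker between protein and C-terminal tag(s)
    parts.extend(TAGS[name] for name in c_tags)  # tandem C-terminal tags
    return "".join(parts)


# Hypothetical example: flanking 6His tags separated from the protein by PAPAP linkers.
fusion = build_fusion("MKK", "ACDEFG", n_tags=["6His"], c_tags=["6His"], linker="PAPAP")
```

Passing two names in `n_tags` or `c_tags` gives the tandem arrangement, and passing different names on each side gives non-identical flanking tags.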
- sequence encoding a polypeptide protein tag may encode a peptide or protein tag, for example a polyhistidine-tag, FLAG-tag, HA-tag, calmodulin-binding peptide, streptavidin-binding peptide, glutathione S-transferase, or maltose-binding protein.
- the sequence encoding a polypeptide protein tag may encode a polyhistidine-tag, also referred to as His-tag, His6 tag, poly(His) tag, or 6His, which may be about 5-10 residues in length, for example 5, 6, 7, 8, 9, or 10 residues in length, e.g., the amino acid sequence
- the sequence encoding a polypeptide protein tag may encode a FLAG-tag, also referred to as FLAG octapeptide or FLAG epitope, which may have the amino acid sequence D and may be used in tandem and with some variation in sequence identity, e.g., the 3xFLAG peptide of amino acid sequence
- sequence encoding a polypeptide protein tag may encode an HA-tag, also referred to as the human influenza hemagglutinin tag, which may be derived from amino acids
- sequence encoding a polypeptide protein tag may encode a calmodulin-binding peptide, also referred to as a calmodulin-binding protein peptide tag,
- the sequence encoding a polypeptide protein tag may encode a streptavidin-binding peptide, also referred to as an SBP or streptavidin-tag, including a 38-amino acid sequence or 8-amino acid sequences of the Strep-tag system (e.g., the Strep-tag or Strep-tag II), which may have the amino acid sequence WSHPQFEK.
- the sequence encoding a polypeptide protein tag may encode a glutathione S-transferase protein, also referred to as a GST-tag, which may be about 220 amino acids in length and may be derived from a sequence encoding a wild type glutathione S-transferase.
- the sequence encoding a polypeptide protein tag may encode a maltose binding protein, also referred to as MBP-tag or maltose tag, which may be about 370-396 amino acids in length and may be derived from the malE gene of E. coli.
- the sequence encoding a polypeptide protein tag may be wild type or optimized for expression in C. glutamicum, e.g., the sequences listed in Table 6.
- optimization for expression in C. glutamicum may include HEG, RPG, tRNA, general, or OptimWiz optimization and/or optimizing GC content of the DNA sequence.
- sequence encoding a polypeptide protein tag may comprise a nucleic acid sequence having at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identity to the nucleic acid sequences of SEQ ID NO: 19-32.
- sequence encoding a polypeptide tag may encode an amino acid sequence having at least 60%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least
- a sequence encoding a linker is a sequence of DNA encoding a polypeptide linker.
- Polypeptide linkers may be cleavable, rigid, and/or flexible.
- Polypeptide linkers, also referred to as linkers, may link functional protein domains together or release free functional domains after cleavage.
- Linkers may be isolated from or derived from naturally-occurring multidomain proteins, or may be designed de novo. Linkers may increase stability, promote folding, increase expression, or improve biological activity of the protein domains they are fused to.
- the properties of linkers, including length, hydrophobicity, amino acid residues, and secondary structure, may vary.
- linkers may adopt various conformations, such as β-strand, helical, coil/bend, and turns.
- the expression cassettes of the disclosure may comprise a sequence encoding a linker.
- the sequence encoding a linker may encode a polypeptide about 3-
- the sequence encoding a linker may be located between a 5’ sequence encoding a polypeptide protein tag and a 3’ sequence encoding a cascade reagent protein. In some embodiments, the sequence encoding a linker may be located between a 5’ sequence encoding a cascade reagent protein and a 3’ sequence encoding a polypeptide protein tag. In some embodiments, polar uncharged or charged residues are preferable amino acids of the linker.
- the sequence encoding a linker may encode a flexible GS linker, for example (Gly)7 or (Gly)8.
- the sequence encoding a linker may encode a rigid α-helical linker, for example or
- the sequence encoding a linker may encode a rigid proline-rich linker, for example PAPAP, (AP)n, (KP)n, or (EP)n, wherein n is 3-4.
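The repeat notation for the proline-rich linkers above, (AP)n, (KP)n, and (EP)n with n of 3 to 4, expands mechanically into amino acid strings. The following Python sketch enumerates them; the function name and the dictionary layout are illustrative assumptions.

```python
# Illustrative sketch: expands the repeat notation into explicit amino acid strings.
def proline_rich_linkers(units=("AP", "KP", "EP"), repeats=(3, 4)):
    """Return a mapping from linker notation to amino acid sequence."""
    linkers = {"PAPAP": "PAPAP"}  # fixed-sequence example from the text
    for unit in units:
        for n in repeats:
            linkers[f"({unit}){n}"] = unit * n
    return linkers


linkers = proline_rich_linkers()
# e.g. linkers["(AP)3"] is "APAPAP" and linkers["(EP)4"] is "EPEPEPEP"
```

With the default arguments this yields seven candidate linkers, ranging from 5 to 8 residues in length.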
- sequence encoding a linker may encode a cleavable disulfide linker, for example LEAGCKNFFPRSFTSCGSLE, or a cleavable protease linker, for example GFLG.
- the sequence encoding a linker may be optimized for expression in C. glutamicum, e.g., the sequences listed in Table 7.
- optimization for expression in C. glutamicum may include HEG, RPG, tRNA, general, or
- the sequence encoding a linker may comprise a nucleic acid sequence having at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identity to the nucleic acid sequences of
- the sequence encoding a linker may encode an amino acid sequence having at least 60%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identity to the amino acid sequences of SEQ ID NO: 315-321.
- the expression cassette comprises from 5’ to 3’ a promoter, a signal sequence, and a sequence encoding a cascade reagent protein.
- the expression cassette comprises SEQ ID NO: 9 (cgl514 promoter), SEQ ID NO: 14 (cgl514 signal sequence), and SEQ ID NO: 1 (HEG optimized Factor C (T. tridentatus)), SEQ ID NO: 325
- the expression cassette comprises SEQ ID NO: 9 (cgl514 promoter), SEQ
- the expression cassette comprises SEQ ID NO: 4 (Factor B (C. rotundicauda)).
- the expression cassette comprises SEQ ID NO: 9 (cgl514 promoter), SEQ ID NO: 14 (cgl514 signal sequence), and SEQ ID NO: 6 (Coagulogen (L. polyphemus)), SEQ ID NO: 7 (Coagulogen (T. tridentatus)), or SEQ ID NO: 8
- the expression cassette comprises from 5’ to 3’ a promoter, a signal sequence, a sequence encoding a cascade reagent protein, and a terminator.
- the expression cassette comprises SEQ ID NO: 9 (cgl514 promoter), SEQ ID NO:
- the expression cassette comprises
- the expression cassette comprises SEQ ID NO: 9 (cgl514 promoter), SEQ ID NO: 14 (cgl514 signal sequence), SEQ ID NO: 3 (Factor B (T. tridentatus)) or SEQ ID NO: 4 (Factor B (C. rotundicauda)), and SEQ ID NO: 272, SEQ ID NO: 273, SEQ ID NO: 274, SEQ ID NO: 275, SEQ ID NO: 276, or SEQ ID NO: 277 wild type or optimized E. coli rrnB termination sequences
- the expression cassette comprises SEQ ID NO: 101 (Pcgl514-cgl514ss-
- the expression cassette comprises SEQ ID NO: 117 rotundicauda)-rrnB terminator.
- the expression cassette comprises SEQ ID NO: 105
- the expression cassette comprises SEQ ID NO: 9 (cgl514 promoter), SEQ ID NO: 14 (cgl514 signal sequence), SEQ ID NO: 6 (Coagulogen (L. polyphemus)), SEQ ID NO: 7 (Coagulogen (T. tridentatus)), or SEQ ID NO: 8 (Coagulogen (C. rotundicauda)), and SEQ ID NO: 272,
- the expression cassette comprises SEQ ID NO: 109
- the expression cassette comprises from 5’ to 3’ a promoter, a signal sequence, a sequence encoding a polypeptide protein tag, a sequence encoding a cascade reagent protein, and a terminator. In some embodiments, the expression cassette comprises from
- SEQ ID NO: 9 (cgl514 promoter), SEQ ID NO: 34, SEQ ID NO: 37, SEQ ID NO: 39, SEQ ID NO: 41, SEQ ID NO: 43, SEQ ID NO: 45, or SEQ ID NO: 47 (cgl514ss-tag-Factor C where the tag is 6His, FLAG, HA, CBP, SBP, GST, or MBP, respectively)
- SEQ ID NO: 272 SEQ ID NO: 41, SEQ ID NO: 43, SEQ ID NO: 45, or SEQ ID NO: 47
- the expression cassette comprises SEQ ID NO: 98 tridentatus)-rrnB terminator), SEQ ID NO: 323 (T. tridentatus version 2)-rrnB terminator), SEQ ID NO: 114 rotundicauda)-rrnB terminator). In some embodiments, the expression cassette comprises from 5’ to 3’ SEQ ID NO:
- the expression cassette comprises
- the expression cassette comprises from 5’ to 3’ SEQ ID NO: 9 (cgl514 promoter),
- the expression cassette comprises SEQ ID NO: 275, SEQ ID NO: 276, or SEQ ID NO: 277 (E. coli rrnB wild type or optimized termination sequences, C. glutamicum wild type cgl502, cg3011, cspA, or cgl338 termination sequences, respectively).
- the expression cassette comprises SEQ ID NO:
- the expression cassette comprises from 5’ to 3’ SEQ ID NO: 9 (cgl514 promoter),
- SBP, GST, or MBP, respectively
- SEQ ID NO: 272, SEQ ID NO: 273, SEQ ID NO: 274,
- the expression cassette comprises SEQ ID NO: SEQ ID NO: 122 terminator), or SEQ ID NO: 126
- the expression cassette comprises from 5’ to 3’ a promoter, a signal sequence, a sequence encoding a cascade reagent protein, a sequence encoding a polypeptide protein tag, and a terminator.
- the expression cassette comprises from 5’ to 3’ SEQ ID NO: 9 (cgl514 promoter), SEQ ID NO: 35, SEQ ID NO: 38, SEQ ID NO: 40, SEQ ID NO: 42, SEQ ID NO: 44, SEQ ID NO: 46, or SEQ ID NO: tag where the tag is 6His, FLAG, HA, CBP, SBP, GST, or MBP, respectively), and SEQ ID NO: 272, SEQ ID NO: 273, SEQ ID NO: 274, SEQ ID NO: 275, SEQ ID NO: 276, or SEQ ID NO:
- the expression cassette comprises SEQ ID NO: 99 (Pcgl514-cgl514ss-Factor C (T. tridentatus)-6His-rrnB terminator), SEQ ID NO: 324 (Pcgl514-cgl514ss-Factor C (T. tridentatus version 2)-6His-rrnB terminator), SEQ ID NO: 115 (Pcgl514-cgl514ss-Factor C terminator). In some embodiments, the expression cassette comprises from 5’ to 3’ SEQ ID NO:
- the expression cassette comprises
- the expression cassette comprises from 5’ to 3’ SEQ ID NO: 9 (cgl514 promoter),
- the expression cassette comprises SEQ ID NO:
- the expression cassette comprises from 5’ to 3’ SEQ ID NO: 9 (cgl514 promoter),
- SBP, GST, or MBP, respectively
- SEQ ID NO: 272, SEQ ID NO: 273, SEQ ID NO: 274,
- the expression cassette comprises SEQ ID NO: 275, SEQ ID NO: 276, or SEQ ID NO: 277 (E. coli rrnB wild type or optimized termination sequences, wild type or cgl338 termination sequences, respectively).
- the expression cassette comprises SEQ ID NO:
- the expression cassette comprises from 5’ to 3’ a promoter, a signal sequence, a sequence encoding a polypeptide protein tag, a sequence encoding a cascade reagent protein, a sequence encoding a polypeptide protein tag, and a terminator.
- the expression cassette comprises from 5’ to 3’ SEQ ID NO: 9 (cgl514 promoter),
- SEQ ID NO: 36 (cgl514ss-6His-Factor C-6His), and SEQ ID NO: 272, SEQ ID NO: 273, SEQ ID NO:
- the expression cassette comprises
- the expression cassette comprises from 5’ to 3’ SEQ ID NO: 9 (cgl514 promoter), SEQ ID NO: 52 (cgl514ss-6His-Factor B-6His), and SEQ ID NO: 272, SEQ ID NO:
- the expression cassette comprises SEQ ID NO: 104 (T. tridentatus)-6His-rrnB terminator) or SEQ ID NO: 120 (C. rotundicauda)-6His-rrnB terminator). In some embodiments, the expression cassette comprises from 5’ to 3’ SEQ ID NO:
- SEQ ID NO: 272, SEQ ID NO: 273, SEQ ID NO: 274, SEQ ID NO: 275, SEQ ID NO: 276, or SEQ ID NO:
- the expression cassette comprises SEQ ID NO: 108 -Proclotting Enzyme (T. . In some embodiments, the expression cassette comprises from
- SEQ ID NO: 9 (cgl514 promoter), SEQ ID NO: 84 (cgl514ss-6His-Coagulogen-6His), and SEQ ID NO: 272, SEQ ID NO: 273, SEQ ID NO: 274, SEQ ID NO: 275, SEQ ID NO: 276, or SEQ ID NO: 277 (E. coli rrnB wild type or optimized termination sequences, C. glutamicum wild type cgl502, cg3011, cspA, or cgl338 termination sequences, respectively).
- the expression cassette comprises SEQ ID NO: 112
- the expression cassette comprises from 5’ to 3’ a promoter, a signal sequence, a sequence encoding a cascade reagent protein, and a terminator. In some embodiments, the expression cassette comprises from 5’ to 3’ SEQ ID NO: 9 (cgl514 promoter),
- the expression cassette comprises from 5’ to 3’
- SEQ ID NO: 9 (cgl514 promoter), SEQ ID NO: 49 and SEQ ID NO: 272,
- the expression cassette comprises from 5’ to 3’ SEQ ID NO: 9 (cgl514 promoter), SEQ ID NO: 65
- the expression cassette comprises from 5’ to 3’ SEQ ID NO: 9 (cgl514 promoter), SEQ ID NO: 81 and SEQ ID NO: 272, SEQ ID NO:
- SEQ ID NO: 273, SEQ ID NO: 274, SEQ ID NO: 275, SEQ ID NO: 276, or SEQ ID NO: 277 (E. coli rrnB wild type or optimized termination sequences, C. glutamicum wild type cgl502, cg3011, cspA, or cgl338 termination sequences, respectively).
- the disclosure provides methods of recombinant protein expression.
- the expression cassette is cloned into a plasmid.
- the expression cassette may be cloned into a multiple cloning site of a plasmid using restriction enzyme cloning, Gateway cloning, or TOPO cloning.
- the expression cassette may be Gibson assembled into a plasmid.
- the expression cassette may be inserted into a plasmid using a combination of restriction enzyme cloning, Gateway cloning, TOPO cloning, and/or Gibson assembly.
- nucleic acid sequences may comprise restriction enzyme recognition sites and/or recombination sequences to facilitate cloning.
- restriction enzyme recognition sites and/or recombination sequences may be located at the 5’ and/or 3’ ends of: the promoter, the signal sequence, the sequence encoding a cascade reagent protein, the termination sequence, the sequence encoding a polypeptide tag, and the sequence encoding a linker.
- restriction enzyme recognition sites and/or recombination sequences may be located at the 5’ and/or 3’ ends of two or more sequences encoding polypeptide protein tags and two or more sequences encoding linkers.
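When restriction sites are placed at the 5’ and 3’ ends of a part, the same sites should generally not also occur inside the part. The Python sketch below checks a part for internal sites and then flanks it. The four enzymes listed are common examples chosen for illustration (their recognition sequences are palindromic, so scanning one strand suffices); the disclosure does not name specific enzymes, and the function names are hypothetical.

```python
# Illustrative sketch: enzyme choice and function names are assumptions.
SITES = {
    "EcoRI": "GAATTC",
    "BamHI": "GGATCC",
    "HindIII": "AAGCTT",
    "XbaI": "TCTAGA",
}


def find_sites(seq, sites=SITES):
    """Return {enzyme: [0-based start positions]} for every site found in seq."""
    seq = seq.upper()
    found = {}
    for enzyme, motif in sites.items():
        positions = []
        start = seq.find(motif)
        while start != -1:
            positions.append(start)
            start = seq.find(motif, start + 1)
        if positions:
            found[enzyme] = positions
    return found


def flank_with_sites(part, five_prime, three_prime, sites=SITES):
    """Add recognition sites to both ends of a part, refusing parts that contain them internally."""
    chosen = {name: sites[name] for name in (five_prime, three_prime)}
    internal = find_sites(part, chosen)
    if internal:
        raise ValueError(f"part contains internal site(s): {internal}")
    return sites[five_prime] + part.upper() + sites[three_prime]


# Hypothetical 4-nt part flanked by EcoRI (5') and BamHI (3') sites.
flanked = flank_with_sites("ATGC", "EcoRI", "BamHI")
```

The check covers only the part itself; a site created at the junction between the part and a flanking sequence would need a second scan of the assembled product.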
- a plasmid may be a cloning vector, a transfer vector, a shuttle vector, or an expression vector.
- a suitable plasmid may be a mobilizable E. coli-C. glutamicum shuttle vector.
- a suitable plasmid may be the pEC-pkl8mob2 plasmid.
- the disclosure provides methods of recombinant protein purification.
- the RCRs of the present invention may be purified from cultures of recombinant C. glutamicum cells expressing nucleic acid molecules, including expression cassettes and plasmids.
- the expression cassette comprises a sequence encoding a polypeptide tag fused to the 5’ end or the 3’ end of the sequence encoding a cascade reagent protein.
- the polypeptide tag may comprise a solubilization tag that facilitates proper protein folding and prevents precipitation during purification.
- the polypeptide tag may comprise an affinity tag that facilitates affinity purification.
- the polypeptide tag may comprise a chromatographic tag that modulates resolution during chromatographic separation.
- the polypeptide tag may comprise an epitope tag that facilitates antibody purification.
- the RCR may be purified from culture supernatant or cell lysate using column chromatography.
- the culture supernatant or cell lysate may be applied to a column, the column may be washed, and bound protein may be eluted from the column.
- additives and chelating agents, e.g., EDTA, may be incorporated into buffers during purification.
- the tagged protein binds to the column matrix and may be eluted by competitive binding, cleavage of the protein tag, or by destabilization of the interaction between the protein tag and the column matrix, e.g., by a change of pH.
- the RCR may be purified by fast protein liquid chromatography
- elution fractions may be assayed for protein concentration and RCR activity and concentrated to obtain higher protein concentrations.
- the RCR is purified to apparent homogeneity.
- the isolated, purified protein molecule is an RCR derived from T. tridentatus, e.g., serine protease zymogen or clotting protein optimized for expression in C. glutamicum.
- the amino acid sequence of the serine protease zymogen is at least 75% identical to SEQ ID NO: 257 (Factor C), SEQ ID NO: 259 (Factor B), or SEQ ID NO:
- amino acid sequence of the clotting protein is at least 75% identical to SEQ ID NO: 263 (Coagulogen).
- the isolated, purified protein molecule is an RCR derived from C. rotundicauda, including homologs thereof, e.g., the Factor B homologs C3 and C2/Bf.
- the isolated, purified protein molecule is a serine protease zymogen or clotting protein optimized for expression in C. glutamicum.
- the amino acid sequence of the serine protease zymogen is at least 75% identical to SEQ ID NO: 258 (Factor C) or SEQ ID NO: 260 (Factor B).
- the amino acid sequence of the clotting protein is at least 75% identical to SEQ ID NO: 264 (Coagulogen).
- the isolated, purified protein molecule is an N-terminal signal peptide fused to an RCR derived from T. tridentatus, e.g., a serine protease zymogen or clotting protein and optimized for expression in C. glutamicum.
- the amino acid sequence of the RCR is at least 75% identical to SEQ ID NO: 129 (cgl514ss-Factor C), SEQ
- the isolated, purified protein molecule is an N-terminal signal peptide fused to an RCR derived from C. rotundicauda or L. polyphemus, e.g., a serine protease zymogen or clotting protein, and optimized for expression in C. glutamicum.
- the amino acid sequence of the RCR is at least 75% identical to SEQ ID NO: 193
- the isolated, purified protein molecule is an N-terminal signal peptide fused to an N-terminally tagged RCR derived from T. tridentatus and optimized for expression in C. glutamicum. In some embodiments, the isolated, purified protein molecule is an
- the amino acid sequence of the isolated, purified protein molecule is at least 75% identical to SEQ ID NO: 130, SEQ ID NO: 133,
- the isolated, purified protein molecule is an N-terminal signal peptide fused to an N-terminally tagged Factor B derived from T. tridentatus optimized for expression in C. glutamicum.
- the amino acid sequence of the isolated, purified protein molecule is at least 75% identical to SEQ ID NO: 146, SEQ ID NO: 149, SEQ ID NO: 151, SEQ
- the isolated, purified protein molecule is an N-terminal signal peptide fused to an N-terminally tagged Proclotting Enzyme derived from T. tridentatus optimized for expression in C. glutamicum.
- the amino acid sequence of the isolated, purified protein molecule is at least 75% identical to SEQ ID NO: 162, SEQ ID NO: 165, SEQ ID NO: 167, SEQ ID NO: 169, SEQ
- the isolated, purified protein molecule is an N-terminal signal peptide fused to an N-terminally tagged Coagulogen derived from T. tridentatus optimized for expression in C. glutamicum.
- the amino acid sequence of the isolated, purified protein molecule is at least 75% identical to SEQ ID NO: 178, SEQ ID NO: 181, SEQ ID NO: 183, SEQ ID NO: 185, SEQ ID NO:
- SEQ ID NO: 189, or SEQ ID NO: 191 (cgl514ss-tag-Coagulogen where the tag is 6His, FLAG, HA, CBP, SBP, GST, or MBP, respectively).
- the isolated, purified protein molecule is an N-terminal signal peptide fused to an N-terminally tagged RCR derived from C. rotundicauda and optimized for expression in C. glutamicum. In some embodiments, the isolated, purified protein molecule is an N-terminal signal peptide fused to an N-terminally tagged Factor C derived from C. rotundicauda and optimized for expression in C. glutamicum. In some embodiments, the amino acid sequence of the isolated, purified protein molecule is at least 75% identical to SEQ ID NO:
- the isolated, purified protein molecule is an N-terminal signal peptide fused to an N-terminally tagged Factor B derived from C. rotundicauda and optimized for expression in C. glutamicum.
- the amino acid sequence of the isolated, purified protein molecule is at least 75% identical to SEQ ID NO: 210, SEQ ID NO:
- the isolated, purified protein molecule is an N-terminal signal peptide fused to an N-terminally tagged Coagulogen derived from L. polyphemus and optimized for expression in C. glutamicum.
- the amino acid sequence of the isolated, purified protein molecule is at least 75% identical to SEQ ID NO: 226, SEQ ID NO: 229, SEQ ID NO: 231, SEQ ID NO: 233, SEQ ID NO: 235, SEQ ID NO: 237, or SEQ ID NO: 239 (cgl514ss- tag-Coagulogen where the tag is 6His, FLAG, HA, CBP, SBP, GST, or MBP, respectively).
- the isolated, purified protein molecule is an N-terminal signal peptide fused to an N-terminally tagged Coagulogen derived from C. rotundicauda and optimized for expression in C. glutamicum.
- the amino acid sequence of the isolated, purified protein molecule is at least 75% identical to SEQ ID NO: 242, SEQ ID NO: 245, SEQ ID NO: 247, SEQ ID NO: 249, SEQ ID NO: 251, SEQ ID NO: 253, or SEQ ID NO: 255 (cgl514ss-tag-Coagulogen where the tag is 6His, FLAG, HA, CBP, SBP, GST, or MBP, respectively).
- the isolated, purified protein molecule is an N-terminal signal peptide fused to a C-terminally tagged RCR derived from T. tridentatus and optimized for expression in C. glutamicum. In some embodiments, the isolated, purified protein molecule is an
- the amino acid sequence of the isolated, purified protein molecule is at least 75% identical to SEQ ID NO: 131, SEQ ID NO: 134, SEQ ID NO: 136, SEQ ID NO: 138, SEQ ID NO: 140, SEQ ID NO: 142, or SEQ ID NO: 144
- the isolated, purified protein molecule is an N-terminal signal peptide fused to a C-terminally tagged Factor B derived from T. tridentatus and optimized for expression in C. glutamicum.
- the amino acid sequence of the isolated, purified protein molecule is at least 75% identical to SEQ ID NO: 147, SEQ ID NO: 150, SEQ ID NO: 152, SEQ
- the isolated, purified protein molecule is an N-terminal signal peptide fused to a C-terminally tagged Proclotting Enzyme derived from T. tridentatus and optimized for expression in C. glutamicum.
- the amino acid sequence of the isolated, purified protein molecule is at least 75% identical to SEQ ID NO: 163, SEQ ID NO: 166, SEQ ID NO: 168, SEQ
- the isolated, purified protein molecule is an N-terminal signal peptide fused to a C-terminally tagged Coagulogen derived from T. tridentatus and optimized for expression in C. glutamicum.
- the amino acid sequence of the isolated, purified protein molecule is at least 75% identical to SEQ ID NO: 179, SEQ ID NO: 182, SEQ ID NO: 184, SEQ ID NO: 186, SEQ ID NO: 188, SEQ ID NO: 190, or SEQ ID NO: 192 (cgl514ss-Coagulogen-tag where the tag is 6His, FLAG, HA, CBP, SBP, GST, or MBP, respectively).
- the isolated, purified protein molecule is an N-terminal signal peptide fused to a C-terminally tagged RCR derived from C. rotundicauda and optimized for expression in C. glutamicum. In some embodiments, the isolated, purified protein molecule is an N-terminal signal peptide fused to a C-terminally tagged Factor C derived from C. rotundicauda and optimized for expression in C. glutamicum. In some embodiments, the amino acid sequence of the isolated, purified protein molecule is at least 75% identical to SEQ ID NO: 195, SEQ ID
- the isolated, purified protein molecule is an N-terminal signal peptide fused to a C-terminally tagged Factor B derived from C. rotundicauda and optimized for expression in C. glutamicum.
- the amino acid sequence of the isolated, purified protein molecule is at least 75% identical to SEQ ID NO: 211, SEQ ID NO: 214, SEQ ID NO: 216, SEQ ID NO: 218, SEQ ID NO: 220, SEQ ID NO: 222, or SEQ ID NO: 224 (cgl514ss-
- the isolated, purified protein molecule is an N-terminal signal peptide fused to a C-terminally tagged Coagulogen derived from L. polyphemus and optimized for expression in C. glutamicum.
- the amino acid sequence of the isolated, purified protein molecule is at least 75% identical to SEQ ID NO: 227, SEQ ID NO: 230, SEQ ID NO: 232, SEQ
- the isolated, purified protein molecule is an N-terminal signal peptide fused to a C-terminally tagged Coagulogen derived from C. rotundicauda and optimized for expression in C. glutamicum.
- the amino acid sequence of the isolated, purified protein molecule is at least
- SEQ ID NO: 252, SEQ ID NO: 254, or SEQ ID NO: 256 (cgl514ss-Coagulogen-tag where the tag is
- the isolated, purified protein molecule is an N-terminal signal peptide fused to an N-terminally and C-terminally tagged RCR derived from T. tridentatus or C. rotundicauda optimized for expression in C. glutamicum.
- the isolated, purified protein molecule is an N-terminal signal peptide fused to an N-terminally and C-terminally tagged Factor C derived from T. tridentatus or C. rotundicauda and optimized for expression in C. glutamicum.
- the amino acid sequence of the isolated, purified protein molecule is at least 75% identical to SEQ ID NO: 132 (cgl514ss-6His-Factor C
- the isolated, purified protein molecule is an N-terminal signal peptide fused to an N-terminally and C-terminally tagged Factor B derived from T. tridentatus or C. rotundicauda and optimized for expression in C. glutamicum.
- the amino acid sequence of the isolated, purified protein molecule is at least 75% identical to SEQ ID NO:
- the isolated, purified protein molecule is an N-terminal signal peptide fused to an N-terminally and C-terminally tagged Proclotting Enzyme derived from T. tridentatus and optimized for expression in C. glutamicum.
- the amino acid sequence of the isolated, purified protein molecule is at least 75% identical to SEQ
- the isolated, purified protein molecule is an N-terminal signal peptide fused to an N-terminally and C-terminally tagged Coagulogen derived from T. tridentatus, L. polyphemus, or C. rotundicauda and optimized for expression in C. glutamicum.
- the amino acid sequence of the isolated, purified protein molecule is at least 75% identical to SEQ ID NO: 180 (cgl514ss-6His-
- Coagulogen SEQ ID NO: 228 (cgl514ss-6His-Coagulogen (L. or SEQ ID NO: 244 (cgl514ss-6His-Coagulogen (C. rotundicauda)-6His.
- kits and methods for detecting a pyrogen or endotoxin in a sample comprise one or more of the RCR proteins of the present disclosure.
- the kit comprises one or more of recombinant Factor C serine protease zymogen, recombinant Factor B serine protease zymogen, recombinant Proclotting
- the kit comprises one or more RCR proteins expressed in C. glutamicum and purified to apparent homogeneity.
- the disclosure provides methods for detecting a pyrogen or endotoxin in a sample, comprising contacting the sample with one or more RCR proteins expressed in C. glutamicum and purified to apparent homogeneity.
- the method comprises contacting the sample with one or more of the components of the kit described herein, including recombinant
- the method comprises contacting the sample with one or more of the components of the kit described herein in combination with a commercialized natural lysate reagent.
- the method for detecting a pyrogen or endotoxin in a sample comprises the limited proteolysis of each protease zymogen in the coagulation cascade reaction of the LAL assay.
- the method for detecting a pyrogen or endotoxin in a sample may comprise admixing one or more components of the kit with the sample, separating precipitated proteins from the sample, admixing one or more components of the kit with the remaining sample, and measuring coagulation. Measuring coagulation may include observing increased turbidity and viscosity.
- the method further comprises centrifugation of the sample, sedimentation and separation of the sample, and/or removal of one or more layers or portions of the sample.
- Expression cassettes of the present disclosure include nucleic acid molecules comprising a promoter, a signal sequence, a gene of interest, and a termination sequence, and may include a polypeptide tag.
- the expression cassette comprises a promoter, a signal sequence, a gene of interest, and a termination sequence
- the expression cassette comprises a promoter, a signal sequence, an N-terminally tagged gene of interest, and a termination sequence (FIG. 2A, number 1).
- the expression cassette comprises a promoter, a signal sequence, a C-terminally tagged gene of interest, and a termination sequence (FIG. 2A, number 3).
- RCR expression cassettes comprising the , the cgl514 signal sequence indicated by cgl514ss (SEQ ID NO: 14), the T. tridentatus Factor C gene optimized for expression in C. glutamicum (SEQ ID NO: 325), the E. coli rrnBT1T2 terminator sequence indicated by rrnB terminator (SEQ ID NO: 272), and optionally a polyhistidine-tag optimized for expression in C. glutamicum (SEQ ID NO: 26).
- Three RCR expression cassettes were engineered to result in a secretory expression system based on the Cgl514 secreted protein of C. glutamicum by using the promoter and signal sequence (cgl514ss) of cgl514.
- FIG. 2B shows schematic representations of the three RCR expression cassettes optimized for expression in C. glutamicum.
- Expression cassette number 4 (SEQ ID NO: 322) comprises the Pcgl514 promoter (SEQ ID NO: 9), the cgl514 signal sequence (SEQ ID NO: 14),
- Expression cassette number 5 comprises the Pcgl514 promoter (SEQ ID NO: 9), the cgl514 signal sequence (SEQ ID NO: 14), polyhistidine-tag (SEQ ID NO: 26), Factor C gene (SEQ ID NO: 325), and the rrnB terminator (SEQ ID NO: 272).
- Expression cassette number 6 comprises the Pcgl514 promoter (SEQ ID NO: 9), the cgl514 signal sequence (SEQ ID NO: 14),
- the three RCR expression cassettes comprise the nucleic acid sequences of
- SEQ ID NO: 322, SEQ ID NO: 323, and SEQ ID NO: 324 for expression of Factor C (FIG. 2B, number 4), N-terminally polyhistidine-tagged Factor C (FIG. 2B, number 5), and C-terminally polyhistidine-tagged Factor C (FIG. 2B, number 6), respectively.
- the pEC-pK18mob2 plasmid is a mobilizable E. coli - C. glutamicum shuttle vector based on a mini-replicon encoding the repA and per functions of the medium copy number plasmid pGA1.
- for plasmid expression confirmation, a single colony of each of the transformations was isolated from a fresh LBG plate (Luria Broth - Lennox's formulation supplemented with 0.5% glucose), inoculated in LBG broth and incubated at 30 °C shaking at 200 revolutions per minute (RPM) for about 6 - 8 hours. A sample of each of the transformations was then inoculated into fresh LBG broth and incubated at 30 °C shaking at 200
- FIG. 2B number 6 indicated by pK18mob2 - FC-CHis6 (FIG. 3A, bottom middle), and Gram-positive C. glutamicum transformed with the Factor C expression cassette plasmid of FIG. 2B number 5, indicated by pK18mob2 - FC-NHis6 (FIG. 3A, bottom right).
- Plasmids were isolated from bacteria using alkaline lysis, and samples were subjected to gel electrophoresis at 80 volts for 120 minutes at room temperature on a 1% agarose gel in 1X TAE buffer. Safe DNA Gel Stain (Bioland Scientific) was used to visualize
- Lane 1 shows no DNA present from the C. glutamicum negative control
- lane 2 shows the expected molecular weight of the pEC-pK18mob2 empty plasmid expressed in C. glutamicum
- lane 3 shows the expected molecular weight of the plasmid comprising expression cassette number 4 of FIG. 2B (SEQ ID NO: 322) expressed in C. glutamicum
- lane 4 shows the expected molecular weight of the plasmid comprising expression cassette number 6 of FIG. 2B (SEQ ID NO: 324) expressed in C. glutamicum
- lane 5 shows the expected molecular weight of the plasmid comprising expression cassette number 5 of FIG.
- Example 3 Expression of recombinant Factor C in C. glutamicum [0122]
- C. glutamicum expression cassette 4 of FIG. 2B (SEQ ID NO: 322; pK18mob2 - FC
- kanamycin 50 mg/L was added to the culture medium as the sole antibiotic.
- As a seed culture cells were inoculated into 50 mL of semi-defined medium containing
- the semi-defined medium consists of 0.5 g urea, 0.25 mg ZnSO4, 2.5 mg CaCl2 in BHI media.
- the seed culture (40 mL) was inoculated into 400 mL of fresh semi-defined medium in a 1 L jar custom-built bioreactor. Throughout cultivation, the temperature was maintained at 30 °C and stirred with axial flow impeller at 300 RPM. Oxygen concentration was maximized by continual sterile air flow into the medium.
- the pH was maintained at 7.0 by adding 10% v/v ammonium hydroxide solution (LabChem, Zelienople, PA) when the pH dropped below the set point of 7 or 37% hydrochloric acid (GTI Laboratory Supplies, Edna, Texas) when the pH rose above it.
- a glucose solution (90 g in 150 mL BHI) was added to the culture in 90-second increments at a rate of 12.5 mL/hr.
- the pellet was air-dried and resuspended in denaturing 8 M urea (pH 8.0), 300 mM NaCl, 50 mM NaH2PO4, 20 mM Tris-Cl, 1 mM EDTA, 10% glycerol, and 1% Triton X-100.
- lanes 1 and 2 are duplicates of the C. glutamicum pK18mob2 negative control sample
- lanes 3 and 4 are duplicates of the C. glutamicum pK18mob2 - FC sample.
- Lanes 3 and 4 show expression of ~80 kDa and ~43 kDa polypeptides in the culture supernatant of C. glutamicum expressing pK18mob2 - FC, a plasmid which harbors cassette number 4 (SEQ ID NO: 322), referred to as C. glutamicum pK18mob2 - FC.
- Lanes 1 and 2 do not show expression of ~80 kDa and ~43 kDa polypeptides in the culture supernatant of C. glutamicum expressing pK18mob2, an empty plasmid, referred to as C. glutamicum pK18mob2.
- SDS-PAGE gel analysis with an 8% gel under denaturing conditions demonstrates expression of ~80 kDa and ~43 kDa polypeptides in the culture supernatant of C. glutamicum pK18mob2 - FC, corresponding to production of Factor C in C. glutamicum and secretion of the protein into the culture supernatant.
- Embodiment 1 A nucleic acid molecule, comprising an expression cassette, wherein the expression cassette comprises, from 5’ to 3’: a. a promoter; b. a signal sequence; and c. a sequence encoding a cascade reagent protein.
- Embodiment 2 The nucleic acid molecule of embodiment 1, wherein the expression cassette is optimized for expression in Corynebacterium glutamicum.
- Embodiment 3 The nucleic acid molecule of embodiment 1 or 2, wherein the signal sequence encodes a signal peptide.
- Embodiment 4 The nucleic acid molecule as in any one of embodiments 1-3, wherein the promoter drives expression of the signal sequence and the sequence encoding the cascade reagent protein.
- Embodiment 5 The nucleic acid molecule as in any one of embodiments 1-4, wherein the promoter comprises a nucleic acid sequence derived from a promoter of a Corynebacterium glutamicum secretory gene.
- Embodiment 6 The nucleic acid molecule as in any one of embodiments 1-5, wherein the signal sequence comprises a nucleic acid sequence derived from a signal sequence of a Corynebacterium glutamicum secretory gene.
- Embodiment 7 The nucleic acid molecule as in embodiment 5 or 6, wherein the Corynebacterium glutamicum secretory gene is selected from the group consisting of the cgl514 gene, the cspA gene, the cspB gene, the CgR0949 gene, and the porB gene.
- Embodiment 8 The nucleic acid molecule as in any one of embodiments 1-7, wherein the sequence encoding the cascade reagent protein encodes Factor C serine protease zymogen, Factor B serine protease zymogen, Proclotting Enzyme serine protease zymogen, or Coagulogen clotting protein.
- Embodiment 9 The nucleic acid molecule as in any one of embodiments 1-8, wherein the sequence encoding the cascade reagent protein comprises a nucleic acid sequence derived from the genome of a horseshoe crab selected from the group consisting of Tachypleus tridentatus, Limulus polyphemus, Tachypleus gigas, and
- Embodiment 10 The nucleic acid molecule as in any one of embodiments 1-9, wherein the signal sequence is located between the promoter and the sequence encoding the cascade reagent protein.
- Embodiment 11 The nucleic acid molecule as in any one of embodiments 1-10, wherein the expression cassette comprises a termination sequence.
- Embodiment 12 The nucleic acid molecule of embodiment 11, wherein the termination sequence is selected from the group consisting of the termination region of the Escherichia coli rrnB gene, the termination region of the Corynebacterium glutamicum cgl502 gene, the termination region of the Corynebacterium glutamicum cg3011 gene, the termination region of the Corynebacterium glutamicum cspA gene, and the termination region of the Corynebacterium glutamicum cgl338 gene.
- Embodiment 13 The nucleic acid molecule as in any one of embodiments 1-12 wherein the expression cassette comprises a sequence encoding a polypeptide protein tag.
- Embodiment 14 The nucleic acid molecule of embodiment 13, wherein the polypeptide protein tag is selected from the group consisting of polyhistidine-tag, FLAG-tag, HA-tag, calmodulin-binding peptide, streptavidin-binding peptide, glutathione S-transferase, and maltose-binding protein.
- Embodiment 15 The nucleic acid molecule of embodiment 13 or 14, wherein the sequence encoding a polypeptide protein tag is located between the signal sequence and the sequence encoding the cascade reagent protein.
- Embodiment 16 The nucleic acid molecule of embodiment 15, wherein a sequence encoding a linker is located between the sequence encoding the polypeptide protein tag and the sequence encoding the cascade reagent protein.
- Embodiment 17 The nucleic acid molecule of embodiment 13 or 14, wherein the sequence encoding the polypeptide protein tag is located between the sequence encoding the cascade reagent protein and the termination sequence.
- Embodiment 18 The nucleic acid molecule of embodiment 17, wherein a sequence encoding a linker is located between the sequence encoding the cascade reagent protein and the sequence encoding the polypeptide protein tag.
- Embodiment 19 The nucleic acid molecule as in any one of embodiments 1-12 wherein the expression cassette comprises two or more sequences encoding polypeptide protein tags.
- Embodiment 20 The nucleic acid molecule of embodiment 19, wherein the polypeptide protein tags are selected from the group consisting of polyhistidine-tag, FLAG-tag, HA-tag, calmodulin-binding peptide, streptavidin-binding peptide, glutathione S-transferase, and maltose-binding protein.
- Embodiment 21 The nucleic acid molecule of embodiment 19 or 20, wherein the sequence encoding the cascade reagent protein is located between two sequences encoding polypeptide protein tags.
- Embodiment 22 The nucleic acid molecule of embodiment 21, wherein sequences encoding linkers are located between the sequence encoding the cascade reagent protein and the sequences encoding the polypeptide protein tags.
- Embodiment 23 The nucleic acid molecule as in any one of embodiments 16, 18, or
- linker or linkers are selected from the group consisting of flexible glycine-serine (GS) linkers, flexible glycine linkers, rigid α-helical linkers, rigid proline-rich linkers, and cleavable disulfide linkers.
- Embodiment 24 The nucleic acid molecule of any one of embodiments 1-23, wherein the cascade reagent protein is encoded by a nucleic acid sequence of SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 278-283, or SEQ ID NO: 325, or a sequence at least 90% identical thereto.
- Embodiment 25 The nucleic acid molecule of any one of embodiments 1-23, wherein the cascade reagent protein is encoded by a nucleic acid sequence of SEQ ID NO: 3, SEQ ID NO: 4, or SEQ ID NO: 284-289, or a sequence at least 90% identical thereto.
- Embodiment 26 The nucleic acid molecule of any one of embodiments 1-23, wherein the cascade reagent protein is encoded by a nucleic acid sequence of SEQ
- Embodiment 27 The nucleic acid molecule of any one of embodiments 1-23, wherein the cascade reagent protein is encoded by a nucleic acid sequence of SEQ
- Embodiment 28 The nucleic acid molecule as in any one of embodiments 1-27, wherein the promoter is encoded by a nucleic acid sequence of any one of SEQ ID NO:
- Embodiment 29 The nucleic acid molecule as in any one of embodiments 1-28, wherein the signal sequence is encoded by a nucleic acid sequence of any one of SEQ ID NO: 14-18, or a sequence at least 90% identical thereto.
- Embodiment 30 The nucleic acid molecule as in any one of embodiments 13-29, wherein the polypeptide protein tag is encoded by a nucleic acid sequence of any one of SEQ ID NO: 19-32, or a sequence at least 90% identical thereto.
- Embodiment 31 The nucleic acid molecule as in any one of embodiments 11-30, wherein the termination sequence is encoded by a nucleic acid sequence of any one of SEQ ID NO: 272-277, or a sequence at least 90% identical thereto.
- Embodiment 32 The nucleic acid molecule as in any one of embodiments 16, 18, or
- linker or linkers are encoded by a nucleic acid sequence of any one of SEQ ID NO: 265-271, or a sequence at least 90% identical thereto.
- Embodiment 33 The nucleic acid molecule as in any one of embodiments 1-32, wherein the signal sequence and cascade reagent protein are encoded by a nucleic acid sequence of any one of SEQ ID NO: 33-96, or a sequence at least 90% identical thereto.
- Embodiment 34 The nucleic acid molecule as in any one of embodiments 1-33, wherein the expression cassette comprises a nucleic acid sequence of any one of
- Embodiment 35 A plasmid, comprising the nucleic acid molecule as in any one of embodiments 1-34.
- Embodiment 36 A cell, comprising the nucleic acid molecule as in any one of embodiments 1-34 or the plasmid of embodiment 35.
- Embodiment 37 A method of producing a recombinant expression system, the method comprising contacting a Corynebacterium glutamicum cell with a nucleic acid molecule as in any one of embodiments 1-34, or the plasmid of embodiment
- Embodiment 38 A recombinant expression system produced by the method of embodiment 37.
- Embodiment 39 A method of expressing Factor C serine protease zymogen, Factor
- Coagulogen clotting protein comprising contacting a Corynebacterium glutamicum cell with a nucleic acid molecule as in any one of embodiments 1-34, or the plasmid of embodiment 35.
- Embodiment 40 An isolated, purified protein molecule, wherein the amino acid sequence is at least 75% identical to any one of SEQ ID NO: 129-256.
- Embodiment 41 A kit for detecting a pyrogen or endotoxin in a sample comprising recombinant Factor C serine protease zymogen, recombinant Factor B serine protease zymogen, recombinant Proclotting Enzyme serine protease zymogen, and recombinant Coagulogen clotting protein expressed in Corynebacterium glutamicum.
- Embodiment 42 The kit for detecting a pyrogen or endotoxin in a sample of embodiment 41, wherein the amino acid sequence of the recombinant Factor C serine protease zymogen is at least 75% identical to SEQ ID NO: 257 or SEQ ID NO:
- Embodiment 43 The kit for detecting a pyrogen or endotoxin in a sample as in any one of embodiments 41-42, wherein the amino acid sequence of the recombinant Factor B serine protease zymogen is at least 75% identical to SEQ ID NO: 259 or
- Embodiment 44 The kit for detecting a pyrogen or endotoxin in a sample as in any one of embodiments 41-43, wherein the amino acid sequence of the recombinant Proclotting Enzyme serine protease zymogen is at least 75% identical to SEQ ID NO:
- Embodiment 45 The kit for detecting a pyrogen or endotoxin in a sample as in any one of embodiments 41-44, wherein the amino acid sequence of the recombinant Coagulogen clotting protein is at least 75% identical to any one of SEQ ID NO:
- Embodiment 46 A method of detecting a pyrogen or endotoxin in a sample, comprising contacting the sample with an isolated, purified protein molecule of embodiment 40.
- Embodiment 47 A method of detecting a pyrogen or endotoxin in a sample, comprising contacting the sample with the components of the kit as in any one of embodiments 41-45.
Abstract
The disclosure provides nucleic acid molecules (comprising expression cassettes), plasmids, protein molecules, cells (comprising nucleic acid molecules), and recombinant expression systems for producing recombinant cascade reagents for the limulus amoebocyte lysate test method. Also provided herein are kits and methods for detecting a pyrogen or endotoxin in a sample.
Description
COMPOSITIONS AND METHODS FOR DETECTING AN ENDOTOXIN
INVENTORS:
Jennifer Watson
Richard Hatcher
TITLE OF THE INVENTION
Compositions and Methods for Detecting an Endotoxin
CROSS REFERENCE TO RELATED APPLICATION
[0001] The present Application claims the benefit of priority to U.S. Provisional Application No. 63/315,513, filed on March 1, 2022, the contents of which are hereby incorporated by reference in their entirety.
INCORPORATION BY REFERENCE OF SEQUENCE LISTING PROVIDED ELECTRONICALLY
[0002] An electronic version of the Sequence Listing is filed herewith, the contents of which are incorporated by reference in their entirety. The electronic file is 571 kilobytes in size, and is titled 495-LM02_SequenceListing_ST26.txt.
BACKGROUND OF THE INVENTION
Field of the Invention
[0003] The present invention relates generally to the fields of biotechnology and infectious diseases, and more particularly it pertains to recombinant production of enzymes for detection of pyrogens and endotoxins.
Background
[0004] The standard pyrogen assay is a mandatory test for U.S. Food and Drug Administration (FDA) approval of all vaccines, intravenous pharmaceuticals, and internal medical devices to prevent contamination with endotoxins. The assay uses the hemolymph (blood) of the horseshoe crab, Limulus polyphemus (L. polyphemus), and tests for the presence of fever-producing agents of bacterial origin, e.g., endotoxins. The limulus amoebocyte lysate (LAL) test method is a qualitative assay during which the L. polyphemus hemolymph lysate reacts with an endotoxin to form a gel. The LAL test is considered to be reproducible, simple to conduct, specific for the presence of endotoxins, and sensitive to even picogram quantities of endotoxins. The quantity of endotoxin may be determined by dilution techniques comparing gel formation of the test sample to that of a reference pyrogen. The following non-essential publications are incorporated by reference in their entirety to aid in understanding of the official use of the LAL assay for release testing of final drug products: Levin, J, et al. Clotting cells and Limulus Amebocyte lysate: an amazing analytical tool. In: Shuster CNJ, Barlow RB and Brockman HJ (eds) The American horseshoe crab. 2003: 310-340; Cooper, JF. Discovery and acceptance of the bacterial endotoxins test. In: McCullough KZ (ed.) The bacterial endotoxins test: a practical approach. 2011: 1-13.
[0005] The LAL assay comprises horseshoe crab lysate reagents that form a four-step coagulation cascade. Three serine protease zymogens, namely Factor C, Factor B, and Proclotting Enzyme, and one clotting protein, Coagulogen, form the enzymatic coagulation cascade that results in a coagulin gel clot in the presence of an endotoxin. In this cascade, an endotoxin activates the Factor C zymogen and the activated Factor C subsequently activates Factor B, which converts the Proclotting Enzyme into Clotting Enzyme that cleaves Coagulogen into Coagulin, forming a gel clot.
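The ordering of the four steps in paragraph [0005] can be illustrated with a short sketch. This is a purely logical model written for illustration, not an assay simulation: the step table is taken from the paragraph above, while the function names and the label "activated Factor B" are assumptions introduced here.

```python
# Illustrative model of the four-step cascade in paragraph [0005].
# Each entry is (activator, substrate, product). The label
# "activated Factor B" and all function names are hypothetical.
CASCADE = [
    ("endotoxin", "Factor C", "activated Factor C"),
    ("activated Factor C", "Factor B", "activated Factor B"),
    ("activated Factor B", "Proclotting Enzyme", "Clotting Enzyme"),
    ("Clotting Enzyme", "Coagulogen", "Coagulin"),
]


def run_cascade(endotoxin_present: bool) -> list:
    """Return the products formed, in order; empty when no endotoxin is present."""
    active = {"endotoxin"} if endotoxin_present else set()
    products = []
    for activator, _substrate, product in CASCADE:
        if activator not in active:
            break  # an upstream step did not fire, so the cascade stops here
        active.add(product)
        products.append(product)
    return products


def gel_clot_forms(endotoxin_present: bool) -> bool:
    """A gel clot corresponds to Coagulin being produced by the last step."""
    return "Coagulin" in run_cascade(endotoxin_present)
```

Because each step depends on the product of the step before it, removing any one reagent from the table stops the sketch before Coagulin is formed, which mirrors why all four reagents are described together.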
[0006] The raw materials for the production of lysate reagents are harvested from wild-caught horseshoe crabs, including L. polyphemus and Tachypleus tridentatus (T. tridentatus). Wild horseshoe crab populations are in decline due to the detrimental effect of capture, blood collection, and release, poor management of harvest regulations, and habitat destruction. Commercial-scale cultivation of horseshoe crabs has not been achieved. The following non-essential publications are incorporated by reference in their entirety to aid in understanding of the unsustainability of blood collection from wild-caught crabs for production of LAL assay reagents: Gauvry G. Current horseshoe crab harvesting practices cannot support global demand for TAL/LAL: The pharmaceutical and medical device industries’ role in the sustainability of horseshoe crabs. In: Carmichael RH, Botton ML, Shin PKS and Cheung SG (eds) Changing global perspectives on horseshoe crab biology, conservation and management. 2015: 475-482; Anderson RL et al., Sublethal behavioral and physiological effects of the biomedical bleeding process on the American horseshoe crab, Limulus polyphemus. Biol Bull. 2013(225): 137-151; Novitsky TJ. Biomedical implications for managing the Limulus polyphemus harvest along the northeast coast of the United States. In: Carmichael RH, Botton ML, Shin PKS and Cheung SG (eds) Changing global perspectives on horseshoe crab biology, conservation and management. 2015: 483-500.
[0007] Demand for lysate reagents for the LAL assay will likely continue to rise with the growth of the pharmaceutical industry, including the proliferation of biotechnology-based drugs and vaccines. The recent rapid development and deployment of vaccines against severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) in a mass vaccination campaign to address the coronavirus disease 2019 (COVID-19) pandemic demonstrates the ongoing necessity for endotoxin-free development and manufacturing of parenteral pharmaceuticals. The current reliance of the LAL assay on lysate reagents harvested from the horseshoe crab is a threat to horseshoe crab populations, the ecosystems in which the horseshoe crab lives, and humanity as the globe faced the COVID-19 pandemic and the threat of future pandemics.
[0008] Accordingly, a sustainable alternative to lysate reagents for the LAL assay is urgently needed to protect the horseshoe crab and humanity from preventable harm.
BRIEF SUMMARY OF THE INVENTION
[0009] Thus, in accordance with the present disclosure, recombinant generation of lysate reagents for the LAL assay is provided herein. The disclosure features expression cassettes, plasmids, and functional recombinant cascade reagents (RCRs) produced from these expression cassettes and plasmids. The disclosure also features expression cassettes for functional RCRs optimized for production in Corynebacterium glutamicum (C. glutamicum). The disclosure features optimized expression cassettes for production in C. glutamicum of the Factor C, Factor B, and Proclotting Enzyme serine protease zymogens, as well as optimized expression cassettes for production of the Coagulogen clotting protein.
[0010] The disclosure provides nucleic acid molecules, comprising expression cassettes, wherein the expression cassettes comprise, from 5’ to 3’: a promoter; a signal sequence; and a sequence encoding a cascade reagent protein. In some embodiments, the expression cassette is optimized for expression in C. glutamicum. In some embodiments, the signal sequence encodes a signal peptide.
[0011] In some embodiments, the promoter drives expression of the signal sequence and the sequence encoding the cascade reagent protein. In some embodiments, the promoter comprises a nucleic acid sequence derived from a promoter of a C. glutamicum secretory gene. In some embodiments, the signal sequence comprises a nucleic acid sequence derived from a signal sequence of a C. glutamicum secretory gene. In some embodiments, the C. glutamicum secretory gene is selected from the group consisting of the cgl514 gene, the cspA gene, the cspB gene, the CgR0949 gene, and the porB gene.
[0012] In some embodiments, the sequence encoding the cascade reagent protein encodes Factor C serine protease zymogen, Factor B serine protease zymogen, Proclotting Enzyme serine protease zymogen, or Coagulogen clotting protein. In some embodiments, the sequence encoding the cascade reagent protein comprises a nucleic acid sequence derived from the genome of a horseshoe crab selected from the group consisting of Tachypleus tridentatus, Limulus polyphemus, Tachypleus gigas, and Carcinoscorpius rotundicauda (C. rotundicauda).
[0013] In some embodiments, the signal sequence is located between the promoter and the sequence encoding the cascade reagent protein.
[0014] In some embodiments, the expression cassette comprises a termination sequence. In some embodiments, the termination sequence is selected from the group consisting of the termination region of the Escherichia coli rrnB gene, the termination region of the Corynebacterium glutamicum cgl502 gene, the termination region of the Corynebacterium glutamicum cg3011 gene, the termination region of the Corynebacterium glutamicum cspA gene, and the termination region of the Corynebacterium glutamicum cgl338 gene.
[0015] In some embodiments, the expression cassette comprises a sequence encoding a polypeptide protein tag. In some embodiments, the expression cassette comprises two or more sequences encoding polypeptide protein tags. In some embodiments, the polypeptide protein tag or polypeptide protein tags are selected from the group consisting of polyhistidine-tag, FLAG-tag, HA-tag, calmodulin-binding peptide, streptavidin-binding peptide, glutathione S-transferase, and maltose-binding protein.
[0016] In some embodiments, the sequence encoding the polypeptide protein tag is located between the signal sequence and the sequence encoding the cascade reagent protein. In some embodiments, a sequence encoding a linker is located between the sequence encoding the polypeptide protein tag and the sequence encoding the cascade reagent protein.
[0017] In some embodiments, the sequence encoding the polypeptide protein tag is located between the sequence encoding the cascade reagent protein and the termination sequence. In some embodiments, a sequence encoding a linker is located between the sequence encoding the cascade reagent protein and the sequence encoding the polypeptide protein tag.
[0018] In some embodiments, the sequence encoding the cascade reagent protein is located between two sequences encoding polypeptide protein tags. In some embodiments, sequences encoding linkers are located between the sequence encoding the cascade reagent protein and the sequences encoding the polypeptide protein tags.
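The element orderings described in paragraphs [0010] through [0018] can be summarized in a minimal sketch that returns the 5'-to-3' order of cassette elements for each tagging arrangement. The function name, its parameters, and the string labels "N", "C", and "both" are hypothetical conveniences introduced here; only the relative positions of the elements are taken from the text.

```python
from typing import List, Optional


def cassette_layout(tag_position: Optional[str] = None,
                    with_linker: bool = False,
                    with_terminator: bool = True) -> List[str]:
    """Return the 5'-to-3' element order of an expression cassette.

    tag_position is None (untagged), "N", "C", or "both". These labels
    are hypothetical shorthand for the arrangements in [0016]-[0018].
    """
    if tag_position not in (None, "N", "C", "both"):
        raise ValueError("tag_position must be None, 'N', 'C', or 'both'")

    # The signal sequence always sits between promoter and coding sequence.
    elements = ["promoter", "signal sequence"]

    # N-terminal tag: between the signal sequence and the cascade reagent
    # protein, with an optional linker on the protein side of the tag.
    if tag_position in ("N", "both"):
        elements.append("tag")
        if with_linker:
            elements.append("linker")

    elements.append("cascade reagent protein")

    # C-terminal tag: between the cascade reagent protein and the
    # termination sequence, again with an optional linker next to the protein.
    if tag_position in ("C", "both"):
        if with_linker:
            elements.append("linker")
        elements.append("tag")

    if with_terminator:
        elements.append("termination sequence")
    return elements
```

For example, `cassette_layout("N")` lists the tag immediately after the signal sequence, while `cassette_layout("both", with_linker=True)` places a linker on each side of the cascade reagent protein.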
[0019] In some embodiments, the linker or linkers are selected from the group consisting of flexible glycine-serine linkers, flexible glycine linkers, rigid α-helical linkers, rigid proline-rich linkers, and cleavable disulfide linkers.
[0020] In some embodiments, the cascade reagent protein is encoded by a nucleic acid sequence of SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 278-283, or SEQ ID NO: 325, or a sequence at least 90% identical thereto. In some embodiments, the cascade reagent protein is encoded by a nucleic acid sequence of SEQ ID NO: 3, SEQ ID NO: 4, or SEQ ID NO: 284-289, or a sequence at least 90% identical thereto. In some embodiments, the cascade reagent protein is encoded by a nucleic acid sequence of SEQ ID NO: 5 or SEQ ID NO: 290-292, or a sequence at least 90% identical thereto. In some embodiments, the cascade reagent protein is encoded by a nucleic acid sequence of SEQ ID NO: 6-8 or SEQ ID NO: 293-301, or a sequence at least 90% identical thereto.
[0021] In some embodiments, the promoter is encoded by a nucleic acid sequence of any one of SEQ ID NO: 9-13, or a sequence at least 90% identical thereto. In some embodiments, the signal sequence is encoded by a nucleic acid sequence of any one of SEQ ID NO: 14-18, or a sequence at least 90% identical thereto. In some embodiments, the polypeptide protein tag is encoded by a nucleic acid sequence of any one of SEQ ID NO: 19-32, or a sequence at least 90% identical thereto. In some embodiments, the termination sequence is encoded by a nucleic acid sequence of any one of SEQ ID NO: 272-277, or a sequence at least 90% identical thereto. In some embodiments, the linker is encoded by a nucleic acid sequence of any one of SEQ ID NO: 265-271, or a sequence at least 90% identical thereto.
[0022] In some embodiments, the signal sequence and cascade reagent protein are encoded by a nucleic acid sequence of any one of SEQ ID NO: 33-96, or a sequence at least 90% identical thereto.
[0023] In some embodiments, the expression cassette comprises a nucleic acid sequence of any one of SEQ ID NO: 97-128 or SEQ ID NO: 322-324, or a sequence at least 90% identical thereto.
[0024] The disclosure provides plasmids comprising nucleic acid molecules disclosed herein. The disclosure also provides cells comprising any one of the nucleic acid molecules or plasmids disclosed herein.
[0025] The disclosure provides methods of producing a recombinant expression system comprising contacting a C. glutamicum cell with any one of the nucleic acid molecules or plasmids disclosed herein. The disclosure also provides recombinant expression systems produced by the method of contacting a C. glutamicum cell with any one of the nucleic acid molecules or plasmids disclosed herein.
[0026] The disclosure provides methods of expressing Factor C serine protease zymogen, Factor B serine protease zymogen, Proclotting Enzyme serine protease zymogen, or Coagulogen clotting protein, comprising contacting a C. glutamicum cell with any one of the nucleic acid molecules or plasmids disclosed herein.
[0027] The disclosure provides isolated, purified protein molecules, wherein the amino acid sequence is at least 75% identical to any one of SEQ ID NO: 129-256.
[0028] The disclosure provides kits for detecting a pyrogen or endotoxin in a sample comprising recombinant Factor C serine protease zymogen, recombinant Factor B serine protease zymogen, recombinant Proclotting Enzyme serine protease zymogen, and recombinant
Coagulogen clotting protein expressed in C. glutamicum.
[0029] In some embodiments, the amino acid sequence of the recombinant Factor C serine protease zymogen is at least 75% identical to SEQ ID NO: 257 or SEQ ID NO: 258. In some embodiments, the amino acid sequence of the recombinant Factor B serine protease zymogen is at least 75% identical to SEQ ID NO: 259 or SEQ ID NO: 260. In some embodiments, the amino acid sequence of the recombinant Proclotting Enzyme serine protease zymogen is at least 75% identical to SEQ ID NO: 261. In some embodiments, the amino acid sequence of the recombinant Coagulogen clotting protein is at least 75% identical to any one of SEQ ID NO: 262-264.
[0030] The disclosure provides methods for detecting a pyrogen or endotoxin in a sample, comprising contacting the sample with an isolated, purified protein molecule wherein the amino acid sequence is at least 75% identical to any one of SEQ ID NO: 129-256.
[0031] The disclosure provides methods of detecting a pyrogen or endotoxin in a sample, comprising contacting the sample with the components of the kits disclosed herein.
[0032] These and other embodiments are described in more detail in the detailed description below.
BRIEF DESCRIPTION OF THE DRAWINGS
[0033] FIG. 1 depicts the coagulation cascade of the present disclosure based on the coagulation cascade in the horseshoe crab amoebocyte lysate.
[0034] FIGS. 2A-2B depict the expression cassettes of the present disclosure. FIG. 2A shows expression cassettes comprising a promoter, a signal sequence, a gene of interest, and a termination sequence, and optionally a polypeptide tag. FIG. 2B shows exemplary expression cassettes according to the present invention. Expression cassette number 4 (SEQ ID NO: 322) comprises the Pcgisu promoter (SEQ ID NO: 9), the cg1514 signal sequence (SEQ ID NO: 14), Factor C gene from T. tridentatus (SEQ ID NO: 325), and the rrnB terminator (SEQ ID NO: 272).
Expression cassette number 5 (SEQ ID NO: 323) comprises the Pcgisu promoter (SEQ ID NO: 9), the cg1514 signal sequence (SEQ ID NO: 14), polyhistidine-tag (SEQ ID NO: 26), Factor C gene from T. tridentatus (SEQ ID NO: 325), and the rrnB terminator (SEQ ID NO: 272). Expression cassette number 6 (SEQ ID NO: 324) comprises the Pcgisu promoter (SEQ ID NO: 9), the cg1514 signal sequence (SEQ ID NO: 14), Factor C gene from T. tridentatus (SEQ ID NO: 325), polyhistidine-tag (SEQ ID NO: 26), and the rrnB terminator (SEQ ID NO: 272).
[0035] FIGS. 3A-3B show expression of the plasmids containing expression cassettes in
C. glutamicum according to the present disclosure. FIG. 3A depicts microscopy images showing untransformed gram-positive B. cereus and untransformed gram-negative E. coli (top left), untransformed, gram-positive C. glutamicum (top middle), C. glutamicum transformed with empty plasmid (top right), and C. glutamicum transformed with plasmids comprising expression cassette number 4 of FIG. 2B (SEQ ID NO: 322, bottom left), expression cassette number 6 of FIG. 2B
(SEQ ID NO: 324, bottom middle), and expression cassette number 5 of FIG. 2B (SEQ ID NO:
323, bottom right). The scale bar is 10 µm. FIG. 3B depicts gel electrophoresis showing the molecular weight of plasmids containing the expression cassettes of the present disclosure. Lane
1 shows C. glutamicum as a negative control, lane 2 shows C. glutamicum expressing the pEC-pk18mob2 empty plasmid as a positive control, lane 3 shows C. glutamicum expressing the plasmid
comprising the expression cassette number 4 of FIG. 2B (SEQ ID NO: 322), lane 4 shows C. glutamicum expressing the plasmid comprising the expression cassette number 6 of FIG. 2B (SEQ
ID NO: 324), and lane 5 shows C. glutamicum expressing the plasmid comprising the expression cassette number 5 of FIG. 2B (SEQ ID NO: 323).
[0036] FIG. 4 depicts gel electrophoresis showing the molecular weight of polypeptides in the culture supernatant of C. glutamicum in accordance with the present disclosure. Lanes 1 and
2 show C. glutamicum expressing the pEC-pk18mob2 empty plasmid as a negative control, and lanes 3 and 4 show C. glutamicum expressing the plasmid comprising the expression cassette number 4 of FIG. 2B (SEQ ID NO: 322).
DETAILED DESCRIPTION OF CERTAIN EMBODIMENTS
[0037] The limulus amoebocyte lysate (LAL) test method is a standard pyrogen assay employed by a variety of industries to ensure that samples are free of harmful endotoxins and pyrogens. The U.S. Food and Drug Administration (FDA) approved the LAL assay for testing drugs, products, and devices, and the assay is widely used to test ingredients of pharmaceuticals during manufacturing.
[0038] The LAL assay is based on a coagulation cascade involving reagents harvested from the hemolymph of wild-caught horseshoe crabs. Specifically, exposure of the serine protease zymogen Factor C to endotoxin initiates a cascade that activates the serine protease zymogen
Factor B, converts the serine protease zymogen Proclotting Enzyme into Clotting Enzyme, and ultimately cleaves Coagulogen into Coagulin to form a gel clot. The LAL assay depends on the availability of these reagents, and poor harvest management and habitat destruction threaten the horseshoe crab population and thus threaten the supply of horseshoe crab lysate reagents.
[0039] The disclosure provides nucleic acid molecules (comprising expression cassettes) and plasmids for producing the lysate reagents Factor C, Factor B, Proclotting Enzyme, and
Coagulogen, wherein the nucleic acid molecules and plasmids are optimized for expression in the generally recognized as safe (GRAS) actinobacterium Corynebacterium glutamicum (C. glutamicum).
Also provided are isolated, purified protein molecules encoded by the nucleic acid molecules and plasmids disclosed herein, kits for detecting a pyrogen or endotoxin in a sample comprising recombinant lysate reagents, methods of producing a recombinant expression system using the nucleic acid molecules and plasmids disclosed herein, and methods for detecting pyrogen or endotoxin in a sample.
[0040] The nucleic acid molecules, plasmids, isolated, purified protein molecules, kits, and methods of the present disclosure may be used to test for contamination in a variety of industries, including pharmaceuticals (both preclinical studies and clinical applications) and biotechnologies, and settings, including healthcare providers, veterinary clinics, agriculture, food processing and service, wineries, breweries, distilleries, military, and direct-to-consumer. In some embodiments, nucleic acid molecules, plasmids, isolated, purified protein molecules, kits, and methods of the present disclosure may be used in the agriculture, food service, food processing, winery, brewery, or distillery industries to test for contamination at any point along the logistical supply chain. In some embodiments, nucleic acid molecules, plasmids, isolated, purified protein molecules, kits, and methods of the present disclosure may be used in the healthcare provider, veterinary clinic, military, and direct-to-consumer industries to test for contamination and institute organizational processes and conditions to sanitize frequently touched objects and surfaces and prevent infection.
Definitions
[0041] Unless otherwise defined, all scientific and technical terms used in the description herein and in the appended claims have the same meaning as commonly understood by one of ordinary skill in the art. The terminology used herein is not intended to be limiting and is used for the purpose of describing particular embodiments in the description herein.
[0042] The singular forms “a,” “an,” and “the” are intended to include the plural forms as well and are consistent with the meaning of “one or more,” “at least one,” and “one or more than one,” unless the context clearly indicates otherwise.
[0043] As used herein, the term “about” when referring to a measurable value such as concentration, volume, length of time, length of a polypeptide or polynucleotide sequence, quantity, and the like, encompasses ±20%, ±10%, ±5%, ±1%, ±0.5%, or ±0.1% of the specified amount.
[0044] As used herein, the term “expression cassette” refers to a nucleic acid component of vector DNA comprising one or more transcriptional control elements (e.g., promoters, enhancers, and/or regulatory sequences and polyadenylation sequences) that direct gene expression of a sequence encoding a protein and/or polypeptide, e.g., a linear nucleic acid sequence encoding one or more transgenes that are expressed by one or more cell types. The terms “DNA vector,”
“expression vector,” and “plasmid” are terms of the art understood by skilled persons and refer to synthetic DNA molecules used to carry foreign genetic material into a cell. The term
“recombinant DNA” is a term of the art understood by skilled persons and refers to combining two or more DNA molecules from two or more different sources, and the term “recombinant protein” is a term of the art understood by skilled persons and refers to protein encoded by recombinant
DNA that has been cloned into an expression vector. The term “recombinant” is a term of the art
understood by skilled persons and refers to recombined DNA, e.g., recombinant DNA, and/or artificially produced protein, e.g., recombinant protein.
[0045] As used herein, the term “recombinant expression system” refers to a system for expressing recombinant protein in cells by transfecting cells with a DNA vector, expression vector, or plasmid. The term “expression” is a term of the art understood by skilled persons and refers to production of large amounts of recombinant DNA and/or recombinant protein by manipulation of the genetic material. The terms “optimized expression” or “optimized for expression” refer to adaptation of some or all of nucleic acid molecules, including synthetic DNA molecules, recombinant DNA, and/or DNA vector, to the host organism to optimize synthesis and/or production of recombinant proteins. Optimization for expression may include optimizing GC content and noncoding DNA elements. Optimization for expression may include optimization based on highly expressed genes (HEG) wherein the codon usage of predicted highly expressed genes from 150 bacterial genomes under translational selection determines codon usage.
Optimization for expression may also include determination of codon usage based on ribosomal protein genes (RPG) or tRNA gene copy number (tRNA). The HEG, RPG, and tRNA optimization techniques apply a Monte Carlo algorithm using relative codon usage frequencies of a reference set as the relative probability that a given codon will be used in the optimization process.
Optimization for expression may include general optimization based on the C. glutamicum codon usage table generated from 9,019 coding sequences representing 2,866,198 codons. Optimization for expression may include optimization based on the software OptimWiz, a proprietary codon optimization analysis tool, which may optimize for expression by modifying GC-content, mRNA secondary structure, Shine-Dalgarno sequence, RNA instability motifs, repetitive sequences, internal splice sites, and restriction enzyme recognition sites.
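By way of illustration only, the Monte Carlo selection of codons described above can be sketched as follows. This is a minimal sketch and not the optimization software referred to in this disclosure: the codon frequencies are placeholder values rather than entries from an actual C. glutamicum codon usage table, and the names CODON_USAGE, sample_coding_sequence, gc_content, and translate are hypothetical.

```python
import random

# Placeholder relative codon usage frequencies for three amino acids.
# Assumption: these numbers are illustrative only and are NOT taken from
# a real C. glutamicum codon usage table.
CODON_USAGE = {
    "M": {"ATG": 1.0},
    "K": {"AAG": 0.6, "AAA": 0.4},
    "E": {"GAG": 0.55, "GAA": 0.45},
}


def sample_coding_sequence(protein, usage, rng):
    """Pick one codon per residue, using the relative usage frequency of
    each synonymous codon as its selection probability."""
    codons = []
    for residue in protein:
        options = usage[residue]
        names = list(options)
        weights = [options[name] for name in names]
        codons.append(rng.choices(names, weights=weights, k=1)[0])
    return "".join(codons)


def gc_content(dna):
    """Fraction of G and C bases in a DNA string."""
    return sum(base in "GC" for base in dna) / len(dna)


def translate(dna, usage):
    """Back-translate using the same table, to confirm that codon
    replacement left the encoded amino acid sequence unchanged."""
    codon_to_residue = {
        codon: residue for residue, opts in usage.items() for codon in opts
    }
    return "".join(codon_to_residue[dna[i:i + 3]] for i in range(0, len(dna), 3))


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so repeated runs give the same draw
    candidate = sample_coding_sequence("MKE", CODON_USAGE, rng)
    print(candidate, gc_content(candidate), translate(candidate, CODON_USAGE))
```

In such a sketch, repeated draws yield different candidate coding sequences for the same protein, which may then be screened on other criteria named above, such as GC content.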
[0046] Exemplary optimization for expression of the present invention includes replacing nucleic acids of a sequence encoding a cascade reagent protein, a sequence encoding a polypeptide protein tag, and/or sequences encoding linkers with nucleic acids encoding codons based on the
HEG, RPG, tRNA, general, or OptimWiz optimization methods.
[0047] The terms “optimized expression” or “optimized for expression” may also refer to polypeptides or proteins encoded by nucleic acid sequences that have been optimized for expression, i.e. optimization of the coding sequence that codes for the sequence of amino acids in a protein.
[0048] As used herein, the term “nucleic acid”, “nucleic acid molecule”, or
“polynucleotide” refers to a sequence of more than one nucleotide base monomer, for example deoxyribonucleic acid (DNA) and ribonucleic acid (RNA), in a single chain, including naturally occurring and non-naturally occurring nucleotides. As used herein, the term “nucleotide” refers to conventional nucleotide bases, e.g., the purine and pyrimidine bases adenine (A), guanine (G), thymine (T), cytosine (C), and uracil (U). A nucleic acid will generally contain sugars and phosphates connected in an alternating chain through phosphodiester linkages. Generally, the phosphate groups are attached to the 5’ and 3’ carbons of the sugar, imparting directionality to nucleic acids. The ends of nucleic acids are referred to as the 5’-end and 3’-end, the 5’-end is referred to as “upstream” of the 3’-end, and the 3’-end is referred to as “downstream” of the 5’-end. Nucleic acid molecules may be circular (e.g., a plasmid) or linear (e.g., a cassette).
Nucleic acid sequences may encode polypeptides or may include sequences regulating transcription (e.g., promoters and terminators).
[0049] As used herein, the term “polypeptide” refers to a continuous, unbranched chain of amino acids linked by peptide bonds. Amino acids incorporated into peptides are known as residues,
and the term “amino acid sequence” refers to a sequence of amino acids, including naturally occurring and non-naturally occurring amino acids. Longer polypeptides are known as proteins, and the term “protein tag” is used to refer to a shorter polypeptide. Generally, polypeptides have an N-terminus, also known as the N-terminal end or amine-terminus, and a C-terminus, also known as the C-terminal end, carboxyl-terminus, or carboxy-terminus. Polypeptides may be fused to other polypeptides by combining the genes or parts of genes that encode them to produce recombinant
DNA that encodes a recombinant fusion protein. One protein tag or domain may be fused N-terminally or C-terminally to another protein tag or domain. Fusion of a protein tag to the N-terminus of a protein results in an N-terminally tagged protein, and fusion of a protein tag to the
C-terminus of a protein results in a C-terminally tagged protein. Recombinant proteins, including signal polypeptides, cascade reagent proteins, and protein tags, may be fused by linker sequences to separate these domains. Linker sequences may encode cleavable polypeptides, which can be cleaved upon exposure to enzyme, chemical reagents, or irradiation, or non-cleavable polypeptides, including flexible polypeptide linkers composed of glycine and serine known as GS linkers, for example (Gly-Gly-Gly-Gly-Ser)n, or rigid linkers, for example proline-rich or α-helical linkers.
[0050] As used herein, the terms “streptavidin-binding peptide” and “SBP” may include the 38-amino acid sequence or 8-amino acid sequences of the Strep-tag system (e.g., the Strep-tag or Strep-tag II), which may have the amino acid sequence WSHPQFEK.
[0051] Calculations of “identity” between two sequences, e.g., nucleic acid or amino acid sequences, can be performed by practices commonly understood by one of ordinary skill in the art.
The sequences are aligned for optimal comparison purposes and the nucleotides or amino acid
residues at corresponding nucleotide positions or amino acid positions are then compared.
Molecules are identical at a position when a position in the first sequence is occupied by the same nucleotide or amino acid residue as the corresponding position in the second sequence. The percent identity between the two sequences is a function of the number of identical positions shared by the sequences. As used herein, the term “homolog” refers to a protein that has a common ancestor, and may include proteins that exhibit sequence homology, i.e., the proteins share sequence similarity.
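By way of illustration only, a percent identity calculation of the kind described above can be sketched as follows. The sketch assumes the two sequences have already been aligned by some other tool, with gaps written as “-”, and it assumes one particular denominator (all alignment columns except those that are gaps in both sequences); other conventions exist and may give different values. The function name percent_identity is hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of compared alignment columns holding the same residue.

    Assumption: inputs are pre-aligned and of equal length. Columns that
    are gaps in both sequences are ignored; a column with a gap in only
    one sequence counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")

    compared = 0
    identical = 0
    for residue_a, residue_b in zip(aligned_a, aligned_b):
        if residue_a == gap and residue_b == gap:
            continue  # no residue from either sequence at this position
        compared += 1
        if residue_a == residue_b:
            identical += 1

    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * identical / compared


if __name__ == "__main__":
    # One mismatch in four aligned nucleotide positions.
    print(percent_identity("ACGT", "ACGA"))
    # The Strep-tag II sequence compared with itself.
    print(percent_identity("WSHPQFEK", "WSHPQFEK"))
```

Under the stated convention, a threshold such as “at least 75% identical” would correspond to a return value of 75.0 or greater.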
[0052] As used herein, the term “promoter” refers to one or more DNA sequences that regulate expression of operably linked nucleic acid sequences by facilitating binding of proteins
(e.g., transcription factors) that initiate transcription of RNA from the DNA downstream of the promoter. The transcription start site is the location where transcription starts at the 5’-end of the operably linked nucleic acid sequence, and the promoter generally includes consensus sequences, such as a TATA box, near the transcription start site.
[0053] As used herein, the terms “terminator” or “termination sequence” refer to one or more DNA sequences that regulate expression of operably linked nucleic acid sequences by facilitating termination of transcription of RNA from the DNA upstream of the terminator.
Generally, the termination sequence is downstream of a stop codon that signals termination of translation of the protein translated from the RNA transcribed from the DNA upstream of the stop codon.
[0054] As used herein, the term “transgene” refers to a gene transferred from one organism to another, i.e., an exogenous nucleic acid sequence encoding a polypeptide to be expressed in a cell. Generally, a transgene contains a promoter, a protein coding sequence, and a termination sequence. The term “gene of interest” refers to the nucleic acid sequence encoding a protein, i.e.,
a protein coding sequence. Exemplary genes of interest of the present invention include nucleic acid sequences encoding clotting proteins, Factor C serine proteases, Factor B serine proteases,
Proclotting Enzyme serine proteases, Coagulogen clotting proteins, and recombinant cascade reagents (RCRs).
[0055] As used herein, the term “signal sequence” refers to a nucleic acid sequence encoding a short peptide present at the terminus of most proteins destined for secretion via the cellular secretory pathway. The term “signal peptide” refers to the polypeptide encoded by the signal sequence, and is generally present at the N-terminus of secreted proteins. The term
“secretory gene” refers to genes encoding proteins destined for secretion via the cellular secretory pathway.
[0056] As used herein, the terms “pyrogen” and “endotoxin” are used interchangeably and refer to causative agents responsible for biological effects incidental to therapy administered parenterally, i.e., therapies administered to the body other than through the mouth and alimentary canal. Parenteral therapies, including injection (e.g., subcutaneous injection, intraperitoneal injection, intrathecal injection, etc.), allow pyrogens or endotoxins to bypass the normal body defenses. The host’s response to pyrogens or endotoxins includes fever, shock, and other physiological responses. While the terms pyrogen and endotoxin are used interchangeably herein, not all pyrogens are endotoxins.
[0057] As used herein, the term “amino acid” refers to naturally occurring and non-naturally occurring or synthetic amino acids. Naturally occurring, levorotatory (L-) amino acids and their abbreviations (three-letter code and one-letter code) are shown in Table 1.
Expression cassettes
[0058] The disclosure provides nucleic acid sequences comprising one or more expression cassettes optimized for expression in C. glutamicum. In some embodiments, the expression cassette comprises, from 5’ to 3’, a promoter, a signal sequence, and a sequence encoding a cascade reagent protein. In some embodiments, the cascade reagent protein is Factor C. In some embodiments, the cascade reagent protein is Factor B. In some embodiments, the cascade reagent protein is Proclotting Enzyme. In some embodiments, the cascade reagent protein is Coagulogen.
In some embodiments, the expression cassette comprises a termination sequence, a sequence encoding a polypeptide protein tag, and/or a sequence encoding a linker.
[0059] In some embodiments, the expression cassette comprises a nucleic acid sequence having at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least
96%, at least 97%, at least 98%, or at least 99% identity to the nucleic acid sequences of SEQ ID
NO: 97-128 or SEQ ID NO: 322-324.
(i) Promoter
[0060] Promoters or promoter sequences are sequences of DNA to which transcription factors bind, thereby initiating transcription of RNA from the DNA downstream of the promoter.
Promoters are located upstream, or toward the 5’ region of the sense strand, of the transcription start site and may include consensus sequences such as TATAAT or TTGACA. Promoters drive expression of DNA, e.g., genes or transgenes, downstream of the promoter. RNA molecules transcribed from operably linked DNA sequences adjacent to promoters may encode a protein.
[0061] The RNA sequence helps recruit the ribosome to the messenger RNA (mRNA) to initiate protein synthesis by aligning the ribosome with the start codon and may include consensus sequences such as the Shine-Dalgarno sequence, e.g., AGGAGGU or GAGG. Once recruited, tRNA may add amino acids in sequence as dictated by the codons, moving downstream from the translational start site.
[0062] The expression cassettes of the disclosure may comprise a promoter. In some embodiments, the promoter drives expression of a signal sequence and a sequence encoding a cascade reagent protein. In some embodiments, the promoter comprises a nucleic acid sequence derived from a promoter of a secretory gene. In some embodiments, the promoter comprises a
nucleic acid sequence derived from a promoter of a C. glutamicum secretory gene, for example the promoters listed in Table 2.
[0063] In some embodiments, the promoter may comprise a nucleic acid sequence having at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identity to the nucleic acid sequences of SEQ ID NO:
9-13.
(ii) Signal Sequence
[0064] Signal sequences are sequences of DNA encoding a signal peptide. Signal sequences may be referred to as localization signals, localization sequences, leader sequences, or targeting signals and a signal peptide may be referred to as a transit peptide or leader peptide.
Signal peptides are short peptides that prompt a cell to translocate the protein, and are often present at the N-terminus of proteins destined for secretion, which may include translocation to certain organelles, secretion from the cell, or insertion into cellular membranes.
[0065] The expression cassettes of the disclosure may comprise a signal sequence. The signal sequence may encode a signal peptide. In some embodiments, the signal sequence is located between the promoter and the sequence encoding the cascade reagent protein. In some
embodiments, the signal sequence is located between the promoter and the sequence encoding a polypeptide protein tag. In some embodiments, the signal sequence comprises a nucleic acid sequence derived from a signal sequence of a secretory gene. In some embodiments, the signal sequence comprises a nucleic acid sequence derived from a signal sequence of a C. glutamicum secretory gene, for example the signal sequences listed in Table 3.
[0066] In some embodiments, the core of the signal peptide may comprise a sequence of hydrophobic amino acids. The sequence of hydrophobic amino acids may be about 5 to 16 residues in length, for example 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, or 16 residues in length. In some embodiments, the signal peptide may comprise a short positively charged sequence of amino acids at the N-terminus. In some embodiments, the signal peptide may comprise a sequence of amino acids recognized and cleaved by signal peptidases.
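By way of illustration only, the signal peptide features described above (a hydrophobic core of about 5 to 16 residues preceded by a positively charged N-terminal region) can be checked with a simple scan such as the following. This is a rough sketch and not a validated signal peptide predictor: the residue sets, the five-residue N-terminal window, the example peptide, and the function names are all assumptions made for the example, and no signal peptidase cleavage site is modeled.

```python
# Assumption: a simple, illustrative set of hydrophobic residues and of
# positively charged residues (one-letter amino acid codes).
HYDROPHOBIC = set("AVLIMFWC")
POSITIVE = set("KR")


def longest_hydrophobic_run(peptide):
    """Return (start, length) of the longest run of hydrophobic residues.
    If several runs tie, the first one is returned."""
    best_start, best_len = 0, 0
    run_start = None
    # A trailing sentinel character closes any run that reaches the end.
    for i, residue in enumerate(peptide + "#"):
        if residue in HYDROPHOBIC:
            if run_start is None:
                run_start = i
        elif run_start is not None:
            if i - run_start > best_len:
                best_start, best_len = run_start, i - run_start
            run_start = None
    return best_start, best_len


def looks_like_signal_peptide(peptide, n_region=5):
    """True when the longest hydrophobic run is 5 to 16 residues long and a
    positively charged residue occurs before it, within the first
    n_region residues."""
    start, length = longest_hydrophobic_run(peptide)
    has_core = 5 <= length <= 16
    n_terminal = peptide[:min(n_region, start)]
    has_positive = any(residue in POSITIVE for residue in n_terminal)
    return has_core and has_positive


if __name__ == "__main__":
    # Hypothetical peptide written for this example; it is not one of the
    # sequences of SEQ ID NO: 302-306.
    example = "MKRSLLAVLLAALAVAGSASA"
    print(longest_hydrophobic_run(example))
    print(looks_like_signal_peptide(example))
```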
[0067] In some embodiments, the signal sequence may comprise a nucleic acid sequence having at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least
96%, at least 97%, at least 98%, or at least 99% identity to the nucleic acid sequences of SEQ ID
NO: 14-18.
[0068] In some embodiments, the signal sequence may encode an amino acid sequence having at least 60%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least
91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identity to the amino acid sequences of SEQ ID NO: 302-306.
(iii) Sequence Encoding a Cascade Reagent Protein
[0069] A sequence encoding a cascade reagent protein is a sequence of DNA encoding any one of the cascade reagent proteins of the LAL assay disclosed herein. A protein encoded by this sequence may also be referred to as a recombinant cascade reagent (RCR), and may include any one of three recombinant protease zymogens, namely Factor C, Factor B, and Proclotting Enzyme, and a clotting protein, namely Coagulogen.
[0070] The expression cassettes of the disclosure may comprise a sequence encoding a cascade reagent protein. The sequence encoding a cascade reagent protein may be isolated or derived from the genome of any horseshoe crab, for example Tachypleus tridentatus,
Limulus polyphemus, Tachypleus gigas, or Carcinoscorpius rotundicauda. In some embodiments, the sequence encoding a cascade reagent protein may be optimized for expression in C. glutamicum, e.g., the sequences listed in Table 4. In some embodiments, optimization for expression in C. glutamicum may include HEG, RPG, tRNA, general, or OptimWiz optimization and/or optimizing GC content of the DNA sequence.
TABLE 4
[0071] In some embodiments, the sequence encoding a cascade reagent protein may be truncated or mutated from the wild type sequence. In some embodiments, the sequence encoding a cascade reagent protein may encode a recombinant protein with activity higher than, lower than, or equivalent to that of the wild type protein. In some embodiments the sequence encoding a cascade reagent protein may encode a cascade reagent protein homolog.
[0072] In some embodiments, the sequence encoding a cascade reagent protein may encode the Factor C serine protease zymogen. In some embodiments, the sequence encoding the cascade reagent protein Factor C may comprise a nucleic acid sequence having at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least
98%, or at least 99% identity to the nucleic acid sequences of SEQ ID NO: 1, SEQ ID NO: 2,
SEQ ID NO: 278-283, or SEQ ID NO: 325.
[0073] In some embodiments, the sequence encoding a cascade reagent protein may encode the Factor B serine protease zymogen and homologs thereof, e.g., C3 and C2/Bf. In some embodiments, the sequence encoding the cascade reagent protein Factor B may comprise a nucleic acid sequence having at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least
95%, at least 96%, at least 97%, at least 98%, or at least 99% identity to the nucleic acid sequences of SEQ ID NO: 3, SEQ ID NO: 4, or SEQ ID NO: 284-289.
[0074] In some embodiments, the sequence encoding a cascade reagent protein may encode the Proclotting Enzyme serine protease zymogen. In some embodiments, the sequence encoding the cascade reagent protein Proclotting Enzyme may comprise a nucleic acid sequence having at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least
96%, at least 97%, at least 98%, or at least 99% identity to the nucleic acid sequences of SEQ ID
NO: 5 or SEQ ID NO: 290-292.
[0075] In some embodiments, the sequence encoding a cascade reagent protein may encode the Coagulogen clotting protein. In some embodiments, the sequence encoding the cascade reagent protein Coagulogen may comprise a nucleic acid sequence having at least 70%, at least
75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identity to the nucleic acid sequences of SEQ ID NO: 6-8 or SEQ ID NO: 293-301.
(iv) Termination Sequence
[0076] A termination sequence, terminator, or transcription terminator is a sequence of
DNA downstream of the translational stop codon that mediates termination of transcription of operably linked nucleic acid sequences. Prokaryotic transcription terminators of the present disclosure may be Rho-dependent or Rho-independent. Transcription terminators may comprise a downstream transcription stop point sequence and/or a GC-rich region of dyad symmetry followed by a poly-A sequence to promote allosteric dissociation of the transcriptional complex and/or hairpin loop formation of the transcribed mRNA and subsequent transcription termination.
[0077] The expression cassettes of the disclosure may comprise a termination sequence.
The termination sequence may be isolated or derived from the genome of any suitable organism, for example Escherichia coli (E. coli) or C. glutamicum. The termination sequence may comprise the termination region of the E. coli rrnB gene, the termination region of the C. glutamicum cg1502 gene, the termination region of the C. glutamicum cg3011 gene, the termination region of the C. glutamicum cspA gene, or the termination region of the C. glutamicum cg1338 gene. In some embodiments, the termination sequence may be wild type or optimized for expression in C. glutamicum, e.g., the sequences listed in Table 5. In some embodiments, optimization for expression in C. glutamicum may include replacing nucleotides of
the wild type termination sequence to optimize GC content for expression in C. glutamicum.
[0078] In some embodiments, the termination sequence may comprise the wild type rrnB termination sequence from E. coli. In some embodiments, the termination sequence may comprise the sequence of SEQ ID NO: 272, or a sequence at least 70%, at least 75%, at least 80%, at least
85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identical thereto. In some embodiments, the termination sequence may comprise the rrnB termination sequence from E. coli optimized for expression in C. glutamicum. In some embodiments, the termination sequence may comprise the sequence of SEQ ID NO: 273, or a sequence at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identical thereto.
[0079] In some embodiments, the termination sequence may comprise the wild type cg1502 termination sequence from C. glutamicum. In some embodiments, the termination sequence may comprise the sequence of SEQ ID NO: 274, or a sequence at least 70%, at least
75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identical thereto. In some embodiments, the termination sequence may comprise
the wild type cg3011 termination sequence from C. glutamicum. In some embodiments, the termination sequence may comprise the sequence of SEQ ID NO: 275, or a sequence at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identical thereto. In some embodiments, the termination sequence may comprise the wild type cspA termination sequence from C. glutamicum. In some embodiments, the termination sequence may comprise the sequence of SEQ ID NO: 276, or a sequence at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identical thereto. In some embodiments, the termination sequence may comprise the wild type cg1338 termination sequence from C. glutamicum. In some embodiments, the termination sequence may comprise the sequence of SEQ ID NO: 277, or a sequence at least
70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identical thereto.
(v) Sequence Encoding a Polypeptide Protein Tag
[0080] A sequence encoding a polypeptide protein tag is a sequence of DNA encoding a peptide sequence, protein tag, or polypeptide protein tag. A sequence encoding a polypeptide protein tag may be fused, appended, or grafted to a sequence encoding a protein, generally at either the C-terminus or N-terminus, or at both the C-terminus and the N-terminus of the protein. Less frequently a sequence encoding a polypeptide protein tag may be inserted into the sequence encoding a protein. A polypeptide protein tag may be appended to a protein to aid in affinity purification from biological lysate, enhance resolution of chromatographic separation, and/or promote solubilization and proper folding of proteins prone to precipitation. Polypeptide protein tags may comprise polyanionic amino acids or epitope tags.
[0081] The expression cassettes of the disclosure may comprise a sequence encoding a polypeptide protein tag. In some embodiments, from 5’ to 3’ the sequence encoding a polypeptide protein tag may be located between the signal sequence and the sequence encoding the cascade reagent protein. In some embodiments, from 5’ to 3’ the sequence encoding a polypeptide protein tag may be located between the sequence encoding the cascade reagent protein and the termination sequence. In some embodiments, from 5’ to 3’ a linker is located between the sequence encoding a polypeptide protein tag and the sequence encoding the cascade reagent protein. In some embodiments, from 5’ to 3’ a linker is located between the sequence encoding the cascade reagent protein and the sequence encoding a polypeptide protein tag.
[0082] In some embodiments, two or more sequences encoding polypeptide protein tags may be located in tandem at the 5’ end or the 3’ end of the sequence encoding the cascade reagent protein. In some embodiments, the sequence encoding the cascade reagent protein may be located between two sequences encoding polypeptide protein tags, i.e., the sequences encoding polypeptide protein tags flank the sequence encoding the cascade reagent protein. In some embodiments, sequences encoding linkers are located between the sequence encoding the cascade reagent protein and the flanking sequences encoding the polypeptide protein tags.
[0083] In some embodiments, the cascade reagent protein may be N-terminally tagged with the polypeptide protein tag. In some embodiments, the cascade reagent protein may be C-terminally tagged with the polypeptide protein tag. In some embodiments, the cascade reagent protein may be N-terminally or C-terminally tagged with tandem polypeptide protein tags. In some embodiments, the cascade reagent protein may be both N-terminally and C-terminally tagged with polypeptide protein tags. In some embodiments, the two or more polypeptide protein tags are identical. In some embodiments, the two or more polypeptide protein tags are not identical. In some embodiments, cleavable, flexible, and/or rigid linkers may separate the polypeptide protein tag or tags from the cascade reagent protein.
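The tag arrangements described in paragraphs [0081] to [0083] can be illustrated with a minimal sketch. The helper name fusion_layout and the element labels are hypothetical and are introduced only for illustration; the sketch lists elements in N- to C-terminal order and places at most one linker on each side of the cascade reagent protein, which is only one of the arrangements the disclosure permits.

```python
def fusion_layout(n_tags=(), c_tags=(), use_linkers=False):
    """List fusion elements in N- to C-terminal order (hypothetical helper)."""
    layout = ["signal sequence"]
    layout.extend(n_tags)                 # tandem N-terminal tags, if any
    if n_tags and use_linkers:
        layout.append("linker")           # separates N-terminal tag(s) from the protein
    layout.append("cascade reagent protein")
    if c_tags and use_linkers:
        layout.append("linker")           # separates the protein from C-terminal tag(s)
    layout.extend(c_tags)                 # tandem C-terminal tags, if any
    return layout


# Protein flanked by tags, with linkers on both sides.
both_ends = fusion_layout(n_tags=["6His"], c_tags=["6His"], use_linkers=True)
# Two non-identical tags in tandem at the N-terminus, no linker.
tandem_n = fusion_layout(n_tags=["6His", "FLAG"])

print(" - ".join(both_ends))
print(" - ".join(tandem_n))
```

The same ordering applies from 5’ to 3’ to the corresponding coding sequences in the expression cassette.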
[0084] In some embodiments, the sequence encoding a polypeptide protein tag may encode a peptide or protein tag, for example a polyhistidine-tag, FLAG-tag, HA-tag, calmodulin-binding peptide, streptavidin-binding peptide, glutathione S-transferase, or maltose-binding protein. In some embodiments, the sequence encoding a polypeptide protein tag may encode a polyhistidine-tag, also referred to as His-tag, His6 tag, poly(His) tag, or 6His, which may be about 5-10 residues in length, for example 5, 6, 7, 8, 9, or 10 residues in length, e.g., the amino acid sequence
In some embodiments, the sequence encoding a polypeptide protein tag may encode a FLAG-tag, also referred to as FLAG octapeptide or FLAG epitope, which may have the amino acid sequence D
and may be used in tandem and with some variation in sequence identity, e.g., the 3xFLAG peptide of amino acid sequence
In some embodiments, the sequence encoding a polypeptide protein tag may encode an HA-tag, also referred to as the human influenza hemagglutinin tag, which may be derived from amino acids 98-106 of the human influenza hemagglutinin protein and may have the amino acid sequence YPYDVPDYA. In some embodiments, the sequence encoding a polypeptide protein tag may encode a calmodulin-binding peptide, also referred to as a calmodulin-binding protein peptide tag, CBP-tag, or calmodulin-tag, which may have the amino acid sequence
In some embodiments, the sequence encoding a polypeptide protein tag may encode a streptavidin-binding peptide, also referred to as an SBP or streptavidin-tag, including a 38-amino acid sequence or 8-amino acid sequences of the Strep-tag system (e.g., the Strep-tag or Strep-tag II), which may have the amino acid sequence WSHPQFEK. In some embodiments, the sequence encoding a polypeptide protein tag may encode a glutathione S-transferase protein, also referred to as a GST-tag, which may be about 220 amino acids in length and may be derived from a sequence encoding a wild type glutathione S-transferase. In some embodiments, the sequence encoding a polypeptide protein tag may encode a maltose binding protein, also referred to as MBP-tag or maltose tag, which may be about 370-396 amino acids in length and may be derived from the malE gene of E. coli.
[0085] In some embodiments, the sequence encoding a polypeptide protein tag may be wild type or optimized for expression in C. glutamicum, e.g., the sequences listed in Table 6. In some embodiments, optimization for expression in C. glutamicum may include HEG, RPG, tRNA, general, or OptimWiz optimization and/or optimizing GC content of the DNA sequence.
TABLE 6
[0086] In some embodiments, the sequence encoding a polypeptide protein tag may comprise a nucleic acid sequence having at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identity to the nucleic acid sequences of SEQ ID NO: 19-32.
[0087] In some embodiments, the sequence encoding a polypeptide tag may encode an amino acid sequence having at least 60%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identity to the amino acid sequences of SEQ ID NO: 307-314.
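As a hedged illustration of how a percent identity threshold might be checked, the sketch below counts matching positions between two sequences that have already been aligned to equal length. The function name percent_identity, the gap character, and the choice of denominator (all aligned columns) are assumptions made for this example; this section does not specify an alignment method, and other identity conventions exist. The variant sequence is invented for the example.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over the columns of a pre-computed pairwise alignment."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Ignore columns where both sequences carry a gap.
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == gap and b == gap)]
    if not columns:
        raise ValueError("alignment has no comparable columns")
    matches = sum(1 for a, b in columns if a == b and a != gap)
    return 100.0 * matches / len(columns)


ha_tag = "YPYDVPDYA"       # HA-tag sequence given in paragraph [0084]
variant = "YPYDVPDFA"      # invented single-substitution variant
identity = percent_identity(ha_tag, variant)

print(f"{identity:.1f}% identical")
print("meets 85% threshold:", identity >= 85)
print("meets 90% threshold:", identity >= 90)
```

With one substitution in nine residues, the variant falls between the 85% and 90% thresholds recited above.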
(vi) Sequence Encoding a Linker
[0088] A sequence encoding a linker is a sequence of DNA encoding a polypeptide linker. Polypeptide linkers may encode cleavable, rigid, and/or flexible polypeptides. Polypeptide linkers, also referred to as linkers, may link functional protein domains together or release free functional domains after cleavage. Linkers may be isolated from or derived from naturally-occurring multidomain proteins, or may be designed de novo. Linkers may increase stability, promote folding, increase expression, or improve biological activity of the protein domains they are fused to. Properties of linkers, including length, hydrophobicity, amino acid residues, and secondary structure, may vary. For instance, linkers may adopt various conformations, such as β-strand, helical, coil/bend, and turns.
[0089] The expression cassettes of the disclosure may comprise a sequence encoding a linker. In some embodiments, the sequence encoding a linker may encode a polypeptide about 3-30 residues in length, for example 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, or 30 residues in length. In some embodiments, the sequence encoding a linker may be located between a 5’ sequence encoding a polypeptide protein tag and a 3’ sequence encoding a cascade reagent protein. In some embodiments, the sequence encoding a linker may be located between a 5’ sequence encoding a cascade reagent protein and a 3’ sequence encoding a polypeptide protein tag. In some embodiments, polar uncharged or charged residues are preferable amino acids of the linker.
[0090] In some embodiments, the sequence encoding a linker may encode a flexible GS linker, for example
(Gly)7, or (Gly)8. In some embodiments, the sequence encoding a linker may encode a rigid α-helical linker, for example
or In some embodiments, the sequence encoding a linker may encode a rigid proline-rich linker, for example PAPAP, (AP)n, (KP)n, or (EP)n, wherein n is 3-4. In some embodiments, the sequence encoding a linker may encode a cleavable disulfide linker, for example LEAGCKNFFPRSFTSCGSLE, or a cleavable protease linker, for example GFLG.
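The repeat notation used for the rigid proline-rich linkers, (AP)n, (KP)n, and (EP)n with n of 3 to 4, can be expanded mechanically. The following is a minimal sketch; the names repeat_linker and RIGID_UNITS are hypothetical, and the length check simply reuses the range of about 3-30 residues given in paragraph [0089].

```python
def repeat_linker(unit: str, n: int) -> str:
    """Expand a repeat-notation linker such as (AP)n into its amino acid sequence."""
    if n < 1:
        raise ValueError("repeat count must be at least 1")
    return unit * n


RIGID_UNITS = ("AP", "KP", "EP")   # dipeptide units named in paragraph [0090]
REPEAT_COUNTS = (3, 4)             # "wherein n is 3-4"

linkers = [repeat_linker(unit, n) for unit in RIGID_UNITS for n in REPEAT_COUNTS]

for seq in linkers:
    # Each expanded linker should fall within the 3-30 residue range of [0089].
    within_range = 3 <= len(seq) <= 30
    print(seq, len(seq), within_range)
```

Each expanded linker is 6 or 8 residues long, which lies inside the stated length range.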
[0091] In some embodiments, the sequence encoding a linker may be optimized for expression in C. glutamicum, e.g., the sequences listed in Table 7. In some embodiments, optimization for expression in C. glutamicum may include HEG, RPG, tRNA, general, or OptimWiz optimization and/or optimizing GC content of the DNA sequence.
[0092] In some embodiments, the sequence encoding a linker may comprise a nucleic acid sequence having at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identity to the nucleic acid sequences of SEQ ID NO: 265-271.
[0093] In some embodiments, the sequence encoding a linker may encode an amino acid sequence having at least 60%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identity to the amino acid sequences of SEQ ID NO: 315-321.
(vii) Exemplary Expression Cassettes
[0094] In some embodiments, the expression cassette comprises from 5’ to 3’ a promoter, a signal sequence, and a sequence encoding a cascade reagent protein. In some embodiments, the expression cassette comprises SEQ ID NO: 9 (cg1514 promoter), SEQ ID NO: 14 (cg1514 signal sequence), and SEQ ID NO: 1 (HEG optimized Factor C (T. tridentatus)), SEQ ID NO: 325 (OptimWiz optimized Factor C (T. tridentatus)), or SEQ ID NO: 2 (Factor C (C. rotundicauda)). In some embodiments, the expression cassette comprises SEQ ID NO: 9 (cg1514 promoter), SEQ ID NO: 14 (cg1514 signal sequence), and SEQ ID NO: 3 (Factor B (T. tridentatus)) or SEQ ID NO: 4 (Factor B (C. rotundicauda)). In some embodiments, the expression cassette comprises SEQ ID NO: 9 (cg1514 promoter), SEQ ID NO: 14 (cg1514 signal sequence), and SEQ ID NO: 5 (Proclotting Enzyme (T. tridentatus)). In some embodiments, the expression cassette comprises SEQ ID NO: 9 (cg1514 promoter), SEQ ID NO: 14 (cg1514 signal sequence), and SEQ ID NO: 6 (Coagulogen (L. polyphemus)), SEQ ID NO: 7 (Coagulogen (T. tridentatus)), or SEQ ID NO: 8 (Coagulogen (C. rotundicauda)).
[0095] In some embodiments, the expression cassette comprises from 5’ to 3’ a promoter, a signal sequence, a sequence encoding a cascade reagent protein, and a terminator. In some embodiments, the expression cassette comprises SEQ ID NO: 9 (cg1514 promoter), SEQ ID NO: 14 (cg1514 signal sequence), SEQ ID NO: 1 (HEG optimized Factor C (T. tridentatus)), SEQ ID NO: 325 (OptimWiz optimized Factor C (T. tridentatus)), or SEQ ID NO: 2 (Factor C (C. rotundicauda)), and SEQ ID NO: 272 (wild type rrnB termination sequence) or SEQ ID NO: 273 (optimized rrnB termination sequence). In some embodiments, the expression cassette comprises
SEQ ID NO: SEQ ID NO: 322
tridentatus version 2)-rrnB terminator), or SEQ ID NO: 113
rotundicauda)-rrnB terminator). In some embodiments, the expression cassette comprises SEQ ID NO: 9 (cg1514 promoter), SEQ ID NO: 14 (cg1514 signal sequence), SEQ ID NO: 3 (Factor B (T. tridentatus)) or SEQ ID NO: 4 (Factor B (C. rotundicauda)), and SEQ ID NO: 272, SEQ ID NO: 273, SEQ ID NO: 274, SEQ ID NO: 275, SEQ ID NO: 276, or SEQ ID NO: 277 (wild type or optimized E. coli rrnB termination sequences, C. glutamicum wild type cg1502, cg3011, cspA, or cg1338 termination sequences, respectively). In some embodiments, the expression cassette comprises SEQ ID NO: 101 (Pcg1514-cg1514ss-Factor B (T. tridentatus)-rrnB terminator) or SEQ ID NO: 117
rotundicauda)-rrnB terminator). In some embodiments, the expression cassette comprises SEQ ID NO: 9 (cg1514 promoter), SEQ ID NO: 14 (cg1514 signal sequence), SEQ ID NO: 5 (Proclotting Enzyme from T. tridentatus), and SEQ ID NO: 272, SEQ ID NO: 273, SEQ ID NO: 274, SEQ ID NO: 275, SEQ ID NO: 276, or SEQ ID NO: 277 (wild type or optimized E. coli rrnB termination sequences, C. glutamicum wild type cg1502, cg3011, cspA, or cg1338 termination sequences, respectively). In some embodiments, the expression cassette comprises SEQ ID NO: 105 (Pcg1514-cg1514ss-Proclotting Enzyme (T. tridentatus)-rrnB terminator). In some embodiments, the expression cassette comprises SEQ ID NO: 9 (cg1514 promoter), SEQ ID NO: 14 (cg1514 signal sequence), SEQ ID NO: 6 (Coagulogen (L. polyphemus)), SEQ ID NO: 7 (Coagulogen (T. tridentatus)), or SEQ ID NO: 8 (Coagulogen (C. rotundicauda)), and SEQ ID NO: 272, SEQ ID NO: 273, SEQ ID NO: 274, SEQ ID NO: 275, SEQ ID NO: 276, or SEQ ID NO: 277 (wild type or optimized E. coli rrnB termination sequences, C. glutamicum wild type cg1502, cg3011, cspA, or cg1338 termination sequences, respectively). In some embodiments, the expression cassette comprises SEQ ID NO: 109
[0096] In some embodiments, the expression cassette comprises from 5’ to 3’ a promoter, a signal sequence, a sequence encoding a polypeptide protein tag, a sequence encoding a cascade reagent protein, and a terminator. In some embodiments, the expression cassette comprises from 5’ to 3’ SEQ ID NO: 9 (cg1514 promoter), SEQ ID NO: 34, SEQ ID NO: 37, SEQ ID NO: 39, SEQ ID NO: 41, SEQ ID NO: 43, SEQ ID NO: 45, or SEQ ID NO: 47 (cg1514ss-tag-Factor C where the tag is 6His, FLAG, HA, CBP, SBP, GST, or MBP, respectively), and SEQ ID NO: 272, SEQ ID NO: 273, SEQ ID NO: 274, SEQ ID NO: 275, SEQ ID NO: 276, or SEQ ID NO: 277 (E. coli rrnB wild type or optimized termination sequences, C. glutamicum wild type cg1502, cg3011, cspA, or cg1338 termination sequences, respectively). In some embodiments, the expression cassette comprises SEQ ID NO: 98 tridentatus)-rrnB terminator), SEQ ID NO: 323
(T. tridentatus version 2)-rrnB terminator), SEQ ID NO: 114 rotundicauda)-rrnB terminator). In some embodiments, the expression cassette comprises from 5’ to 3’ SEQ ID NO: 9 (cg1514 promoter), SEQ ID NO: 50, SEQ ID NO: 53, SEQ ID NO: 55, SEQ ID NO: 57, SEQ ID NO: 59, SEQ ID NO: 61, or SEQ ID NO: 63 (cg1514ss-tag-Factor B where the tag is 6His, FLAG, HA, CBP, SBP, GST, or MBP, respectively), and SEQ ID NO: 272, SEQ ID NO: 273, SEQ ID NO: 274, SEQ ID NO: 275, SEQ ID NO: 276, or SEQ ID NO: 277 (E. coli rrnB wild type or optimized termination sequences, C. glutamicum wild type cg1502, cg3011, cspA, or cg1338 termination sequences, respectively). In some embodiments, the expression cassette comprises
NO: 118 rotundicauda)-rrnB terminator). In some embodiments, the expression cassette comprises from 5’ to 3’ SEQ ID NO: 9 (cg1514 promoter), SEQ ID NO: 66, SEQ ID NO: 69, SEQ ID NO: 71, SEQ ID NO: 73, SEQ ID NO: 75, SEQ ID
CBP, SBP, GST, or MBP, respectively), and SEQ ID NO: 272, SEQ ID NO: 273, SEQ ID NO: 274, SEQ ID NO: 275, SEQ ID NO: 276, or SEQ ID NO: 277 (E. coli rrnB wild type or optimized termination sequences, C. glutamicum wild type cg1502, cg3011, cspA, or cg1338 termination sequences, respectively). In some embodiments, the expression cassette comprises SEQ ID NO: 106 Enzyme (T. tridentatus)-rrnB terminator). In some embodiments, the expression cassette comprises from 5’ to 3’ SEQ ID NO: 9 (cg1514 promoter), SEQ ID NO: 82, SEQ ID NO: 85, SEQ ID NO: 87, SEQ ID NO: 89, SEQ ID NO: 91, SEQ ID
SBP, GST, or MBP, respectively), and SEQ ID NO: 272, SEQ ID NO: 273, SEQ ID NO: 274, SEQ ID NO: 275, SEQ ID NO: 276, or SEQ ID NO: 277 (E. coli rrnB wild type or optimized termination sequences, C. glutamicum wild type cg1502, cg3011, cspA, or cg1338 termination sequences, respectively). In some embodiments, the expression cassette comprises SEQ ID NO: SEQ ID NO: 122
terminator), or SEQ ID NO: 126
[0097] In some embodiments, the expression cassette comprises from 5’ to 3’ a promoter, a signal sequence, a sequence encoding a cascade reagent protein, a sequence encoding a polypeptide protein tag, and a terminator. In some embodiments, the expression cassette comprises from 5’ to 3’ SEQ ID NO: 9 (cg1514 promoter), SEQ ID NO: 35, SEQ ID NO: 38, SEQ ID NO: 40, SEQ ID NO: 42, SEQ ID NO: 44, SEQ ID NO: 46, or SEQ ID NO:
tag where the tag is 6His, FLAG, HA, CBP, SBP, GST, or MBP, respectively), and SEQ ID NO: 272, SEQ ID NO: 273, SEQ ID NO: 274, SEQ ID NO: 275, SEQ ID NO: 276, or SEQ ID NO: 277 (E. coli rrnB wild type or optimized termination sequences, C. glutamicum wild type cg1502, cg3011, cspA, or cg1338 termination sequences, respectively). In some embodiments, the expression cassette comprises SEQ ID NO: 99 (Pcg1514-cg1514ss-Factor C (T. tridentatus)-6His-rrnB terminator), SEQ ID NO: 324 (Pcg1514-cg1514ss-Factor C (T. tridentatus version 2)-6His-rrnB terminator), SEQ ID NO: 115 (Pcg1514-cg1514ss-Factor C
terminator). In some embodiments, the expression cassette comprises from 5’ to 3’ SEQ ID NO: 9 (cg1514 promoter), SEQ ID NO: 51, SEQ ID NO: 54, SEQ ID NO: 56, SEQ ID NO: 58, SEQ
FLAG, HA, CBP, SBP, GST, or MBP, respectively), and SEQ ID NO: 272, SEQ ID NO: 273, SEQ ID NO: 274, SEQ ID NO: 275, SEQ ID NO: 276, or SEQ ID NO: 277 (E. coli rrnB wild type or optimized termination sequences, C. glutamicum wild type cg1502, cg3011, cspA, or cg1338 termination sequences, respectively). In some embodiments, the expression cassette comprises
NO: 119
terminator). In some embodiments, the expression cassette comprises from 5’ to 3’ SEQ ID NO: 9 (cg1514 promoter), SEQ ID NO: 67, SEQ ID NO: 70, SEQ ID NO: 72, SEQ ID NO: 74, SEQ ID NO: 76, SEQ ID
CBP, SBP, GST, or MBP, respectively), and SEQ ID NO: 272, SEQ ID NO: 273, SEQ ID NO: 274, SEQ ID NO: 275, SEQ ID NO: 276, or SEQ ID NO: 277 (E. coli rrnB wild type or optimized termination sequences, C. glutamicum wild type cg1502, cg3011, cspA, or cg1338 termination sequences, respectively). In some embodiments, the expression cassette comprises SEQ ID NO: 107 Enzyme
In some embodiments, the expression cassette comprises from 5’ to 3’ SEQ ID NO: 9 (cg1514 promoter), SEQ ID NO: 83, SEQ ID NO: 86, SEQ ID NO: 88, SEQ ID NO: 90, SEQ ID NO: 92, SEQ ID
SBP, GST, or MBP, respectively), and SEQ ID NO: 272, SEQ ID NO: 273, SEQ ID NO: 274, SEQ ID NO: 275, SEQ ID NO: 276, or SEQ ID NO: 277 (E. coli rrnB wild type or optimized termination sequences, C. glutamicum wild type cg1502, cg3011, cspA, or cg1338 termination sequences, respectively). In some embodiments, the expression cassette comprises SEQ ID NO:
[0098] In some embodiments, the expression cassette comprises from 5’ to 3’ a promoter, a signal sequence, a sequence encoding a polypeptide protein tag, a sequence encoding a cascade reagent protein, a sequence encoding a polypeptide protein tag, and a terminator. In some embodiments, the expression cassette comprises from 5’ to 3’ SEQ ID NO: 9 (cg1514 promoter), SEQ ID NO: 36 (cg1514ss-6His-Factor C-6His), and SEQ ID NO: 272, SEQ ID NO: 273, SEQ ID NO: 274, SEQ ID NO: 275, SEQ ID NO: 276, or SEQ ID NO: 277 (E. coli rrnB wild type or optimized termination sequences, C. glutamicum wild type cg1502, cg3011, cspA, or cg1338 termination sequences, respectively). In some embodiments, the expression cassette comprises SEQ ID NO: 116
(C. rotundicauda)-6His-rrnB terminator). In some embodiments, the expression cassette comprises from 5’ to 3’ SEQ ID NO: 9 (cg1514 promoter), SEQ ID NO: 52 (cg1514ss-6His-Factor B-6His), and SEQ ID NO: 272, SEQ ID NO: 273, SEQ ID NO: 274, SEQ ID NO: 275, SEQ ID NO: 276, or SEQ ID NO: 277 (E. coli rrnB wild type or optimized termination sequences, C. glutamicum wild type cg1502, cg3011, cspA, or cg1338 termination sequences, respectively). In some embodiments, the expression cassette comprises SEQ ID NO: 104
(T. tridentatus)-6His-rrnB terminator) or SEQ ID NO: 120
(C. rotundicauda)-6His-rrnB terminator). In some embodiments, the expression cassette comprises from 5’ to 3’ SEQ ID NO:
NO: 272, SEQ ID NO: 273, SEQ ID NO: 274, SEQ ID NO: 275, SEQ ID NO: 276, or SEQ ID NO: 277 (E. coli rrnB wild type or optimized termination sequences, C. glutamicum wild type cg1502, cg3011, cspA, or cg1338 termination sequences, respectively). In some embodiments, the expression cassette comprises SEQ ID NO: 108 -Proclotting Enzyme (T.
. In some embodiments, the expression cassette comprises from 5’ to 3’ SEQ ID NO: 9 (cg1514 promoter), SEQ ID NO: 84 (cg1514ss-6His-Coagulogen-6His), and SEQ ID NO: 272, SEQ ID NO: 273, SEQ ID NO: 274, SEQ ID NO: 275, SEQ ID NO: 276, or SEQ ID NO: 277 (E. coli rrnB wild type or optimized termination sequences, C. glutamicum wild type cg1502, cg3011, cspA, or cg1338 termination sequences, respectively). In some embodiments, the expression cassette comprises SEQ ID NO: 112
[0099] In some embodiments, the expression cassette comprises from 5’ to 3’ a promoter, a signal sequence, a sequence encoding a cascade reagent protein, and a terminator. In some embodiments, the expression cassette comprises from 5’ to 3’ SEQ ID NO: 9 (cg1514 promoter),
SEQ ID NO: 275, SEQ ID NO: 276, or SEQ ID NO: 277 (E. coli rrnB wild type or optimized termination sequences, C. glutamicum wild type cg1502, cg3011, cspA, or cg1338 termination sequences, respectively). In some embodiments, the expression cassette comprises from 5’ to 3’
SEQ ID NO: 273, SEQ ID NO: 274, SEQ ID NO: 275, SEQ ID NO: 276, or SEQ ID NO: 277 (E. coli rrnB wild type or optimized termination sequences, C. glutamicum wild type cg1502, cg3011, cspA, or cg1338 termination sequences, respectively). In some embodiments, the expression cassette comprises from 5’ to 3’ SEQ ID NO: 9 (cg1514 promoter), SEQ ID NO: 65
Proclotting Enzyme), and SEQ ID NO: 272, SEQ ID NO: 273, SEQ ID NO: 274, SEQ ID NO: 275, SEQ ID NO: 276, or SEQ ID NO: 277 (E. coli rrnB wild type or optimized termination sequences, C. glutamicum wild type cg1502, cg3011, cspA, or cg1338 termination sequences, respectively). In some embodiments, the expression cassette comprises from 5’ to 3’ SEQ ID NO: 9 (cg1514 promoter), SEQ ID NO: 81 and SEQ ID NO: 272, SEQ ID NO: 273, SEQ ID NO: 274, SEQ ID NO: 275, SEQ ID NO: 276, or SEQ ID NO: 277 (E. coli rrnB wild type or optimized termination sequences, C. glutamicum wild type cg1502, cg3011, cspA, or cg1338 termination sequences, respectively).
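The exemplary cassettes above are defined combinatorially: one option is chosen for each position and the chosen elements are joined from 5’ to 3’. As a minimal sketch, the Factor C embodiments of paragraph [0095] (three coding sequences, two rrnB termination sequences) can be enumerated as follows. The SEQ ID numbers are used purely as labels taken from that paragraph; no nucleotide sequences are represented, and the variable names are hypothetical.

```python
from itertools import product

PROMOTER = 9                   # SEQ ID NO: 9, promoter
SIGNAL_SEQUENCE = 14           # SEQ ID NO: 14, signal sequence
FACTOR_C = (1, 325, 2)         # Factor C coding sequences listed in [0095]
RRNB_TERMINATORS = (272, 273)  # wild type and optimized rrnB termination sequences

# One cassette per (coding sequence, terminator) pair, ordered 5' to 3'.
cassettes = [
    (PROMOTER, SIGNAL_SEQUENCE, coding, terminator)
    for coding, terminator in product(FACTOR_C, RRNB_TERMINATORS)
]

for cassette in cassettes:
    print(" - ".join(f"SEQ ID NO: {n}" for n in cassette))
```

Three coding sequences and two terminators give six promoter, signal sequence, Factor C, terminator combinations; the same pattern extends to the six termination sequences recited for the other cascade reagent proteins.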
Methods of Recombinant Protein Expression and Purification
[0100] The disclosure provides methods of recombinant protein expression. In some embodiments, the expression cassette is cloned into a plasmid. In some embodiments, the expression cassette may be cloned into a multiple cloning site of a plasmid using restriction enzyme cloning, Gateway cloning, or TOPO cloning. In some embodiments, the expression cassette may be Gibson assembled into a plasmid. In some embodiments, the expression cassette may be inserted into a plasmid using a combination of restriction enzyme cloning, Gateway cloning, TOPO cloning, and/or Gibson assembly. In some embodiments, nucleic acid sequences may comprise restriction enzyme recognition sites and/or recombination sequences to facilitate cloning. In some embodiments, restriction enzyme recognition sites and/or recombination sequences may be located at the 5’ and/or 3’ ends of: the promoter, the signal sequence, the sequence encoding a cascade reagent protein, the termination sequence, the sequence encoding a polypeptide tag, and the sequence encoding a linker. In some embodiments, restriction enzyme recognition sites and/or recombination sequences may be located at the 5’ and/or 3’ ends of two or more sequences encoding polypeptide protein tags and two or more sequences encoding linkers.
[0101] In some embodiments, a plasmid may be a cloning vector, a transfer vector, a shuttle vector, or an expression vector. In some embodiments, a suitable plasmid may be a mobilizable E. coli-C. glutamicum shuttle vector. In some embodiments, a suitable plasmid may be the pEC-pk18mob2 plasmid.
[0102] The disclosure provides methods of recombinant protein purification. In some embodiments, the RCRs of the present invention may be purified from cultures of recombinant C. glutamicum cells expressing nucleic acid molecules, including expression cassettes and plasmids.
In some embodiments, the expression cassette comprises a sequence encoding a polypeptide tag fused to the 5’ end or the 3’ end of the sequence encoding a cascade reagent protein. The polypeptide tag may comprise a solubilization tag that facilitates proper protein folding and prevents precipitation during purification. The polypeptide tag may comprise an affinity tag that facilitates affinity purification. The polypeptide tag may comprise a chromatographic tag that modulates resolution during chromatographic separation. The polypeptide tag may comprise an epitope tag that facilitates antibody purification.
[0103] In some embodiments, the RCR may be purified from culture supernatant or cell lysate using column chromatography. In some embodiments, the culture supernatant or cell lysate may be applied to a column, the column may be washed, and bound protein may be eluted from the column. In some embodiments, additives and chelating agents, e.g., EDTA, may be incorporated into buffers during purification. In some embodiments, the tagged protein binds to the column matrix and may be eluted by competitive binding, cleavage of the protein tag, or by destabilization of the interaction between the protein tag and the column matrix, e.g., by a change of pH. In some embodiments, the RCR may be purified by fast protein liquid chromatography (FPLC), batch spin, or drip columns. In some embodiments, elution fractions may be assayed for protein concentration and RCR activity and concentrated to obtain higher protein concentrations. In some embodiments, the RCR is purified to apparent homogeneity.
[0104] In some embodiments, the isolated, purified protein molecule is an RCR derived from T. tridentatus, e.g., serine protease zymogen or clotting protein optimized for expression in C. glutamicum. In some embodiments, the amino acid sequence of the serine protease zymogen is at least 75% identical to SEQ ID NO: 257 (Factor C), SEQ ID NO: 259 (Factor B), or SEQ ID NO: 261 (Proclotting Enzyme). In some embodiments, the amino acid sequence of the clotting protein is at least 75% identical to SEQ ID NO: 263 (Coagulogen).
[0105] In some embodiments, the isolated, purified protein molecule is an RCR derived from C. rotundicauda including homologs thereof, e.g., Factor B C3 and C2/Bf. In some embodiments, the isolated, purified protein molecule is a serine protease zymogen or clotting protein optimized for expression in C. glutamicum. In some embodiments, the amino acid sequence of the serine protease zymogen is at least 75% identical to SEQ ID NO: 258 (Factor C) or SEQ ID NO: 260 (Factor B). In some embodiments, the amino acid sequence of the clotting protein is at least 75% identical to SEQ ID NO: 264 (Coagulogen).
[0106] In some embodiments, the isolated, purified protein molecule is an N-terminal signal peptide fused to an RCR derived from T. tridentatus, e.g., a serine protease zymogen or clotting protein and optimized for expression in C. glutamicum. In some embodiments, the amino acid sequence of the RCR is at least 75% identical to SEQ ID NO: 129 (cg1514ss-Factor C), SEQ ID NO: 145 (cg1514ss-Factor B), SEQ ID NO: 161 (cg1514ss-Proclotting Enzyme), or SEQ ID NO: 177 (cg1514ss-Coagulogen).
[0107] In some embodiments, the isolated, purified protein molecule is an N-terminal signal peptide fused to an RCR derived from C. rotundicauda or L. polyphemus, e.g., a serine protease zymogen or clotting protein, and optimized for expression in C. glutamicum. In some embodiments, the amino acid sequence of the RCR is at least 75% identical to SEQ ID NO: 193 (cg1514ss-Factor C from C. rotundicauda), SEQ ID NO: 209 (cg1514ss-Factor B from C. rotundicauda), SEQ ID NO: 225 (cg1514ss-Coagulogen (L. polyphemus)) or SEQ ID NO: 241 (cg1514ss-Coagulogen (C. rotundicauda)).
[0108] In some embodiments, the isolated, purified protein molecule is an N-terminal signal peptide fused to an N-terminally tagged RCR derived from T. tridentatus and optimized for expression in C. glutamicum. In some embodiments, the isolated, purified protein molecule is an N-terminal signal peptide fused to an N-terminally tagged Factor C derived from T. tridentatus optimized for expression in C. glutamicum. In some embodiments, the amino acid sequence of the isolated, purified protein molecule is at least 75% identical to SEQ ID NO: 130, SEQ ID NO: 133, SEQ ID NO: 135, SEQ ID NO: 137, SEQ ID NO: 139, SEQ ID NO: 141, or SEQ ID NO: 143 (cg1514ss-tag-Factor C where the tag is 6His, FLAG, HA, CBP, SBP, GST, or MBP, respectively). In some embodiments, the isolated, purified protein molecule is an N-terminal signal peptide fused to an N-terminally tagged Factor B derived from T. tridentatus optimized for expression in C. glutamicum. In some embodiments, the amino acid sequence of the isolated, purified protein molecule is at least 75% identical to SEQ ID NO: 146, SEQ ID NO: 149, SEQ ID NO: 151, SEQ ID NO: 153, SEQ ID NO: 155, SEQ ID NO: 157, or SEQ ID NO: 159 (cg1514ss-tag-Factor B where the tag is 6His, FLAG, HA, CBP, SBP, GST, or MBP, respectively). In some embodiments, the isolated, purified protein molecule is an N-terminal signal peptide fused to an N-terminally tagged Proclotting Enzyme derived from T. tridentatus optimized for expression in C. glutamicum. In some embodiments, the amino acid sequence of the isolated, purified protein molecule is at least 75% identical to SEQ ID NO: 162, SEQ ID NO: 165, SEQ ID NO: 167, SEQ ID NO: 169, SEQ ID NO: 171, SEQ ID NO: 173, or SEQ ID NO: 175 (cg1514ss-tag-Proclotting Enzyme where the tag is 6His, FLAG, HA, CBP, SBP, GST, or MBP, respectively). In some embodiments, the isolated, purified protein molecule is an N-terminal signal peptide fused to an N-terminally tagged Coagulogen derived from T. tridentatus optimized for expression in C. glutamicum. In some embodiments, the amino acid sequence of the isolated, purified protein molecule is at least 75% identical to SEQ ID NO: 178, SEQ ID NO: 181, SEQ ID NO: 183, SEQ ID NO: 185, SEQ ID NO: 187, SEQ ID NO: 189, or SEQ ID NO: 191 (cg1514ss-tag-Coagulogen where the tag is 6His, FLAG, HA, CBP, SBP, GST, or MBP, respectively).
[0109] In some embodiments, the isolated, purified protein molecule is an N-terminal signal peptide fused to an N-terminally tagged RCR derived from C. rotundicauda and optimized for expression in C. glutamicum. In some embodiments, the isolated, purified protein molecule is an N-terminal signal peptide fused to an N-terminally tagged Factor C derived from C. rotundicauda and optimized for expression in C. glutamicum. In some embodiments, the amino acid sequence of the isolated, purified protein molecule is at least 75% identical to SEQ ID NO: 194, SEQ ID NO: 197, SEQ ID NO: 199, SEQ ID NO: 201, SEQ ID NO: 203, SEQ ID NO: 205, or SEQ ID NO: 207 (cg1514ss-tag-Factor C where the tag is 6His, FLAG, HA, CBP, SBP, GST, or MBP, respectively). In some embodiments, the isolated, purified protein molecule is an N-terminal signal peptide fused to an N-terminally tagged Factor B derived from C. rotundicauda and optimized for expression in C. glutamicum. In some embodiments, the amino acid sequence of the isolated, purified protein molecule is at least 75% identical to SEQ ID NO: 210, SEQ ID NO: 213, SEQ ID NO: 215, SEQ ID NO: 217, SEQ ID NO: 219, SEQ ID NO: 221, or SEQ ID NO: 223 (cg1514ss-tag-Factor B where the tag is 6His, FLAG, HA, CBP, SBP, GST, or MBP, respectively). In some embodiments, the isolated, purified protein molecule is an N-terminal signal peptide fused to an N-terminally tagged Coagulogen derived from L. polyphemus and optimized for expression in C. glutamicum. In some embodiments, the amino acid sequence of the isolated, purified protein molecule is at least 75% identical to SEQ ID NO: 226, SEQ ID NO: 229, SEQ ID NO: 231, SEQ ID NO: 233, SEQ ID NO: 235, SEQ ID NO: 237, or SEQ ID NO: 239 (cg1514ss-tag-Coagulogen where the tag is 6His, FLAG, HA, CBP, SBP, GST, or MBP, respectively). In some embodiments, the isolated, purified protein molecule is an N-terminal signal peptide fused to an N-terminally tagged Coagulogen derived from C. rotundicauda and optimized for expression in C. glutamicum. In some embodiments, the amino acid sequence of the isolated, purified protein molecule is at least 75% identical to SEQ ID NO: 242, SEQ ID NO: 245, SEQ ID NO: 247, SEQ ID NO: 249, SEQ ID NO: 251, SEQ ID NO: 253, or SEQ ID NO: 255 (cg1514ss-tag-Coagulogen where the tag is 6His, FLAG, HA, CBP, SBP, GST, or MBP, respectively).
[0110] In some embodiments, the isolated, purified protein molecule is an N-terminal signal peptide fused to a C-terminally tagged RCR derived from T. tridentatus and optimized for expression in C. glutamicum. In some embodiments, the isolated, purified protein molecule is an N-terminal signal peptide fused to a C-terminally tagged Factor C derived from T. tridentatus and optimized for expression in C. glutamicum. In some embodiments, the amino acid sequence of the isolated, purified protein molecule is at least 75% identical to SEQ ID NO: 131, SEQ ID NO: 134, SEQ ID NO: 136, SEQ ID NO: 138, SEQ ID NO: 140, SEQ ID NO: 142, or SEQ ID NO: 144 (cgl514ss-Factor C-tag where the tag is 6His, FLAG, HA, CBP, SBP, GST, or MBP, respectively). In some embodiments, the isolated, purified protein molecule is an N-terminal signal peptide fused to a C-terminally tagged Factor B derived from T. tridentatus and optimized for expression in C. glutamicum. In some embodiments, the amino acid sequence of the isolated, purified protein molecule is at least 75% identical to SEQ ID NO: 147, SEQ ID NO: 150, SEQ ID NO: 152, SEQ ID NO: 154, SEQ ID NO: 156, SEQ ID NO: 158, or SEQ ID NO: 160 (cgl514ss-Factor B-tag where the tag is 6His, FLAG, HA, CBP, SBP, GST, or MBP, respectively). In some embodiments, the isolated, purified protein molecule is an N-terminal signal peptide fused to a C-terminally tagged Proclotting Enzyme derived from T. tridentatus and optimized for expression in C. glutamicum. In some embodiments, the amino acid sequence of the isolated, purified protein molecule is at least 75% identical to SEQ ID NO: 163, SEQ ID NO: 166, SEQ ID NO: 168, SEQ ID NO: 170, SEQ ID NO: 172, SEQ ID NO: 174, or SEQ ID NO: 176 (cgl514ss-Proclotting Enzyme-tag where the tag is 6His, FLAG, HA, CBP, SBP, GST, or MBP, respectively). In some embodiments, the isolated, purified protein molecule is an N-terminal signal peptide fused to a C-terminally tagged Coagulogen derived from T. tridentatus and optimized for expression in C. glutamicum. In some embodiments, the amino acid sequence of the isolated, purified protein molecule is at least 75% identical to SEQ ID NO: 179, SEQ ID NO: 182, SEQ ID NO: 184, SEQ ID NO: 186, SEQ ID NO: 188, SEQ ID NO: 190, or SEQ ID NO: 192 (cgl514ss-Coagulogen-tag where the tag is 6His, FLAG, HA, CBP, SBP, GST, or MBP, respectively).
[0111] In some embodiments, the isolated, purified protein molecule is an N-terminal signal peptide fused to a C-terminally tagged RCR derived from C. rotundicauda and optimized for expression in C. glutamicum. In some embodiments, the isolated, purified protein molecule is an N-terminal signal peptide fused to a C-terminally tagged Factor C derived from C. rotundicauda and optimized for expression in C. glutamicum. In some embodiments, the amino acid sequence of the isolated, purified protein molecule is at least 75% identical to SEQ ID NO: 195, SEQ ID NO: 198, SEQ ID NO: 200, SEQ ID NO: 202, SEQ ID NO: 204, SEQ ID NO: 206, or SEQ ID NO: 208 (cgl514ss-Factor C-tag where the tag is 6His, FLAG, HA, CBP, SBP, GST, or MBP, respectively). In some embodiments, the isolated, purified protein molecule is an N-terminal signal peptide fused to a C-terminally tagged Factor B derived from C. rotundicauda and optimized for expression in C. glutamicum. In some embodiments, the amino acid sequence of the isolated, purified protein molecule is at least 75% identical to SEQ ID NO: 211, SEQ ID NO: 214, SEQ ID NO: 216, SEQ ID NO: 218, SEQ ID NO: 220, SEQ ID NO: 222, or SEQ ID NO: 224 (cgl514ss-Factor B-tag where the tag is 6His, FLAG, HA, CBP, SBP, GST, or MBP, respectively). In some embodiments, the isolated, purified protein molecule is an N-terminal signal peptide fused to a C-terminally tagged Coagulogen derived from L. polyphemus and optimized for expression in C. glutamicum. In some embodiments, the amino acid sequence of the isolated, purified protein molecule is at least 75% identical to SEQ ID NO: 227, SEQ ID NO: 230, SEQ ID NO: 232, SEQ ID NO: 234, SEQ ID NO: 236, SEQ ID NO: 238, or SEQ ID NO: 240 (cgl514ss-Coagulogen-tag where the tag is 6His, FLAG, HA, CBP, SBP, GST, or MBP, respectively). In some embodiments, the isolated, purified protein molecule is an N-terminal signal peptide fused to a C-terminally tagged Coagulogen derived from C. rotundicauda and optimized for expression in C. glutamicum. In some embodiments, the amino acid sequence of the isolated, purified protein molecule is at least 75% identical to SEQ ID NO: 243, SEQ ID NO: 246, SEQ ID NO: 248, SEQ ID NO: 250, SEQ ID NO: 252, SEQ ID NO: 254, or SEQ ID NO: 256 (cgl514ss-Coagulogen-tag where the tag is 6His, FLAG, HA, CBP, SBP, GST, or MBP, respectively).
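The "respectively" lists above pair each SEQ ID NO with one tag in a fixed order. As a reading aid, the following Python sketch pairs the seven tags with the SEQ ID NOs recited in paragraph [0111] for the C-terminally tagged C. rotundicauda Factor C constructs. The function and variable names are illustrative assumptions and are not part of the disclosure.

```python
# Tag order as recited in the disclosure
# ("6His, FLAG, HA, CBP, SBP, GST, or MBP, respectively").
TAG_ORDER = ["6His", "FLAG", "HA", "CBP", "SBP", "GST", "MBP"]


def pair_tags_with_seq_ids(seq_ids, tags=TAG_ORDER):
    """Pair each tag with the SEQ ID NO listed in the same position."""
    if len(seq_ids) != len(tags):
        raise ValueError("a 'respectively' list needs one SEQ ID NO per tag")
    return dict(zip(tags, seq_ids))


# SEQ ID NOs recited for cgl514ss-Factor C-tag (C. rotundicauda), paragraph [0111].
FACTOR_C_CTAG_C_ROTUNDICAUDA = pair_tags_with_seq_ids(
    [195, 198, 200, 202, 204, 206, 208]
)

if __name__ == "__main__":
    for tag, seq_id in FACTOR_C_CTAG_C_ROTUNDICAUDA.items():
        print(f"cgl514ss-Factor C-{tag}: SEQ ID NO: {seq_id}")
```

The same helper can be applied to any of the other seven-member lists in paragraphs [0109] to [0111] by passing the recited numbers in their recited order.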
[0112] In some embodiments, the isolated, purified protein molecule is an N-terminal signal peptide fused to an N-terminally and C-terminally tagged RCR derived from T. tridentatus or C. rotundicauda and optimized for expression in C. glutamicum. In some embodiments, the isolated, purified protein molecule is an N-terminal signal peptide fused to an N-terminally and C-terminally tagged Factor C derived from T. tridentatus or C. rotundicauda and optimized for expression in C. glutamicum. In some embodiments, the amino acid sequence of the isolated, purified protein molecule is at least 75% identical to SEQ ID NO: 132 (cgl514ss-6His-Factor C (T. tridentatus)-6His) or SEQ ID NO: 196 (cgl514ss-6His-Factor C (C. rotundicauda)-6His). In some embodiments, the isolated, purified protein molecule is an N-terminal signal peptide fused to an N-terminally and C-terminally tagged Factor B derived from T. tridentatus or C. rotundicauda and optimized for expression in C. glutamicum. In some embodiments, the amino acid sequence of the isolated, purified protein molecule is at least 75% identical to SEQ ID NO:
B (C. rotundicauda)-6His). In some embodiments, the isolated, purified protein molecule is an N-terminal signal peptide fused to an N-terminally and C-terminally tagged Proclotting Enzyme derived from T. tridentatus and optimized for expression in C. glutamicum. In some embodiments, the amino acid sequence of the isolated, purified protein molecule is at least 75% identical to SEQ ID NO: 164 (cgl514ss-6His-Proclotting Enzyme
In some embodiments, the isolated, purified protein molecule is an N-terminal signal peptide fused to an N-terminally and C-terminally tagged Coagulogen derived from T. tridentatus, L. polyphemus or C. rotundicauda and optimized for expression in C. glutamicum. In some embodiments, the amino acid sequence of the isolated, purified protein molecule is at least 75% identical to SEQ ID NO: 180 (cgl514ss-6His-Coagulogen
SEQ ID NO: 228 (cgl514ss-6His-Coagulogen (L.
or SEQ ID NO: 244 (cgl514ss-6His-Coagulogen (C. rotundicauda)-6His).
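The "at least 75% identical" threshold used throughout this section can be illustrated with a simple identity calculation. The sketch below is an assumption for illustration only: it takes two sequences that have already been aligned to equal length (gaps written as "-") and counts matching positions over the alignment length. The disclosure may define percent identity differently (for example, by reference to a particular alignment program), so the function names, the gap convention, and the toy peptides are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of alignment columns in which both sequences carry the same residue.

    Assumes the inputs are pre-aligned to equal length, with '-' marking gaps.
    A column containing a gap never counts as a match.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("aligned sequences must not be empty")
    matches = sum(
        1 for a, b in zip(aligned_a, aligned_b) if a == b and a != "-"
    )
    return 100.0 * matches / len(aligned_a)


def meets_threshold(aligned_a: str, aligned_b: str, threshold: float = 75.0) -> bool:
    """True when the identity is at or above the threshold (75% by default)."""
    return percent_identity(aligned_a, aligned_b) >= threshold


if __name__ == "__main__":
    # Toy peptides for illustration; these are not sequences from the disclosure.
    reference = "MKTAYIAKQR"
    variant = "MKTAYLAKQ-"
    print(percent_identity(reference, variant), meets_threshold(reference, variant))
```

Other conventions exist, such as dividing by the length of the shorter sequence or excluding gapped columns from the denominator; they can give different values for the same pair.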
Kits and Methods for Detecting a Pyrogen or Endotoxin in a Sample
[0113] The disclosure provides kits and methods for detecting a pyrogen or endotoxin in a sample. In some embodiments, the kit comprises one or more of the RCR proteins of the present disclosure. In some embodiments, the kit comprises one or more of recombinant Factor C serine protease zymogen, recombinant Factor B serine protease zymogen, recombinant Proclotting Enzyme serine protease zymogen, and recombinant Coagulogen clotting protein. In some embodiments, the kit comprises one or more RCR proteins expressed in C. glutamicum and purified to apparent homogeneity.
[0114] The disclosure provides methods for detecting a pyrogen or endotoxin in a sample, comprising contacting the sample with one or more RCR proteins expressed in C. glutamicum and purified to apparent homogeneity. In some embodiments, the method comprises contacting the sample with one or more of the components of the kit described herein, including recombinant Factor C serine protease zymogen, recombinant Factor B serine protease zymogen, recombinant Proclotting Enzyme serine protease zymogen, and recombinant Coagulogen clotting protein. In some embodiments, the method comprises contacting the sample with one or more of the components of the kit described herein in combination with a commercialized natural lysate reagent. In some embodiments, the method for detecting a pyrogen or endotoxin in a sample comprises the limited proteolysis of each protease zymogen in the coagulation cascade reaction of the LAL assay.
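The cascade referred to above proceeds in a fixed order in the horseshoe crab clotting system: endotoxin activates Factor C, activated Factor C activates Factor B, activated Factor B converts Proclotting Enzyme to clotting enzyme, and clotting enzyme cleaves Coagulogen to form the clot. A minimal Python sketch of that ordering follows. It models only the sequence of limited-proteolysis steps, not kinetics or concentrations, and every name in it is an illustrative assumption.

```python
# Ordered zymogens of the endotoxin-triggered cascade, upstream to downstream.
ZYMOGENS = ["Factor C", "Factor B", "Proclotting Enzyme"]
SUBSTRATE = "Coagulogen"


def run_cascade(endotoxin_present: bool, available: set) -> list:
    """Return the activation steps that fire, stopping at the first missing component."""
    steps = []
    if not endotoxin_present:
        return steps  # nothing triggers Factor C
    trigger = "endotoxin"
    for zymogen in ZYMOGENS:
        if zymogen not in available:
            return steps  # cascade stalls; nothing downstream is activated
        steps.append(f"{trigger} activates {zymogen}")
        trigger = f"activated {zymogen}"
    if SUBSTRATE in available:
        steps.append(f"{trigger} cleaves {SUBSTRATE}")
    return steps


if __name__ == "__main__":
    full_kit = {"Factor C", "Factor B", "Proclotting Enzyme", "Coagulogen"}
    for step in run_cascade(True, full_kit):
        print(step)
```

The early return shows why a kit built on the full cascade needs every upstream zymogen present: removing any one of them leaves Coagulogen uncleaved.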
[0115] In some embodiments, the method for detecting a pyrogen or endotoxin in a sample may comprise admixing one or more components of the kit with the sample, separating precipitated proteins from the sample, admixing one or more components of the kit with the remaining sample, and measuring coagulation. Measuring coagulation may include observing increased turbidity and viscosity. In some embodiments, the method further comprises centrifugation of the sample, sedimentation and separation of the sample, and/or removal of one or more layers or portions of the sample.
EXAMPLES
[0116] The following examples are not intended to be limiting and are included herein for illustration purposes only.
Example 1: Preparation of RCR expression cassettes
[0117] Expression cassettes of the present disclosure include nucleic acid molecules comprising a promoter, a signal sequence, a gene of interest, and a termination sequence, and may include a polypeptide tag. In an exemplary embodiment of the present invention, the expression cassette comprises a promoter, a signal sequence, a gene of interest, and a termination sequence (FIG. 2A, number 1). In an embodiment, the expression cassette comprises a promoter, a signal sequence, an N-terminally tagged gene of interest, and a termination sequence (FIG. 2A, number 2). In an embodiment, the expression cassette comprises a promoter, a signal sequence, a C-terminally tagged gene of interest, and a termination sequence (FIG. 2A, number 3).
[0118] In an embodiment, standard cloning techniques were used to construct RCR expression cassettes comprising the , the cgl514 signal sequence indicated by cgl514ss (SEQ ID NO: 14), the T. tridentatus Factor C gene optimized for expression in C. glutamicum (SEQ ID NO: 325), the E. coli rrnBT1T2 terminator sequence indicated by rrnB terminator (SEQ ID NO: 272), and optionally a polyhistidine-tag optimized for expression in C. glutamicum (SEQ ID NO: 26). Three RCR expression cassettes were engineered to result in a secretory expression system based on the Cgl514 secreted protein of C. glutamicum by using the promoter ) and signal sequence (cgl514ss) of cgl514.
[0119] FIG. 2B shows schematic representations of the three RCR expression cassettes optimized for expression in C. glutamicum. Expression cassette number 4 (SEQ ID NO: 322) comprises the Pcgl514 promoter (SEQ ID NO: 9), the cgl514 signal sequence (SEQ ID NO: 14), Factor C gene (SEQ ID NO: 325), and the rrnB terminator (SEQ ID NO: 272). Expression cassette number 5 (SEQ ID NO: 323) comprises the Pcgl514 promoter (SEQ ID NO: 9), the cgl514 signal sequence (SEQ ID NO: 14), polyhistidine-tag (SEQ ID NO: 26), Factor C gene (SEQ ID NO: 325), and the rrnB terminator (SEQ ID NO: 272). Expression cassette number 6 (SEQ ID NO: 324) comprises the Pcgl514 promoter (SEQ ID NO: 9), the cgl514 signal sequence (SEQ ID NO: 14), Factor C gene (SEQ ID NO: 325), polyhistidine-tag (SEQ ID NO: 26), and the rrnB terminator (SEQ ID NO: 272). The three RCR expression cassettes comprise the nucleic acid sequences of SEQ ID NO: 322, SEQ ID NO: 323, and SEQ ID NO: 324, for expression of Factor C (FIG. 2B, number 4), N-terminally polyhistidine-tagged Factor C (FIG. 2B, number 5), and C-terminally polyhistidine-tagged Factor C (FIG. 2B, number 6), respectively.
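The element order of the three Factor C cassettes can be written out explicitly. The following Python sketch lists each cassette's elements from 5' to 3' using the SEQ ID NOs stated above. The data structure and names are assumptions made for illustration, and the promoter is labeled only by its SEQ ID NO.

```python
# (element name, SEQ ID NO) pairs taken from the description of FIG. 2B.
PROMOTER = ("promoter", 9)
SIGNAL = ("cgl514 signal sequence", 14)
HIS_TAG = ("polyhistidine-tag", 26)
FACTOR_C = ("Factor C gene", 325)
TERMINATOR = ("rrnB terminator", 272)

# Cassette number -> overall SEQ ID NO and ordered elements (5' to 3').
CASSETTES = {
    4: {"seq_id": 322, "elements": [PROMOTER, SIGNAL, FACTOR_C, TERMINATOR]},
    5: {"seq_id": 323, "elements": [PROMOTER, SIGNAL, HIS_TAG, FACTOR_C, TERMINATOR]},
    6: {"seq_id": 324, "elements": [PROMOTER, SIGNAL, FACTOR_C, HIS_TAG, TERMINATOR]},
}


def tag_position(cassette_number: int) -> str:
    """Classify a cassette as untagged, N-terminally tagged, or C-terminally tagged."""
    names = [name for name, _ in CASSETTES[cassette_number]["elements"]]
    if "polyhistidine-tag" not in names:
        return "untagged"
    tag_index = names.index("polyhistidine-tag")
    gene_index = names.index("Factor C gene")
    return "N-terminal" if tag_index < gene_index else "C-terminal"


if __name__ == "__main__":
    for number, cassette in CASSETTES.items():
        layout = " - ".join(name for name, _ in cassette["elements"])
        print(f"Cassette {number} (SEQ ID NO: {cassette['seq_id']}): "
              f"{layout} [{tag_position(number)}]")
```

Because the tag sits upstream of the gene in cassette 5 and downstream in cassette 6, comparing list positions is enough to recover the N-terminal or C-terminal designation.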
Example 2: Expression of recombinant expression cassettes in C. glutamicum
[0120] Each of the three RCR expression cassettes was cloned into a multiple cloning site (MCS) of the pEC-pk18mob2 plasmid, resulting in three plasmids, each comprising one of the three RCR expression cassettes. The pEC-pk18mob2 plasmid is a mobilizable E. coli - C. glutamicum shuttle vector based on a mini-replicon encoding the repA and per functions of the medium copy number plasmid pGA1. Each of the three plasmids, as well as the pEC-pk18mob2 empty plasmid, was transformed separately into C. glutamicum. For plasmid expression confirmation, a single colony of each of the transformations was isolated from a fresh LBG plate (Luria Broth - Lennox's formulation supplemented with 0.5% glucose), inoculated in LBG broth and incubated at 30 °C shaking at 200 revolutions per minute (RPM) for about 6 - 8 hours. A sample of each of the transformations was then inoculated into fresh LBG broth and incubated at 30 °C shaking at 200 RPM for about 14 - 16 hours. Samples of each of the plasmid transformations were removed for Gram staining. Gram-positive bacteria (Bacillus cereus) and gram-negative bacteria (Escherichia coli K12) were used as positive and negative controls and are indicated by B. cereus and E. coli, respectively. Untransformed, rod-shaped C. glutamicum was also Gram stained. A Gram stain shows gram-positive B. cereus and gram-negative E. coli (FIG. 3A, top left), gram-positive C. glutamicum (FIG. 3A, top middle), gram-positive C. glutamicum transformed with pK18mob2 empty plasmid (FIG. 3A, top right), gram-positive C. glutamicum transformed with the Factor C expression cassette plasmid of FIG. 2B number 4, indicated by pK18mob2 - FC (FIG. 3A, bottom left), gram-positive C. glutamicum transformed with the Factor C expression cassette plasmid of FIG. 2B number 6, indicated by pK18mob2 - FC-CHis6 (FIG. 3A, bottom middle), and gram-positive C. glutamicum transformed with the Factor C expression cassette plasmid of FIG. 2B number 5, indicated by pK18mob2 - FC-NHis6 (FIG. 3A, bottom right).
[0121] Cells were harvested by centrifugation at 3500 RPM for 20 minutes at 4 °C. Supernatants of each of the controls and three experimental plasmid transformations were removed and the cell pellets were washed once with STE buffer (10 mM Tris, 10 mM NaCl, 1 mM EDTA, pH 8.0). Cell pellets were frozen at -20 °C overnight. Cell pellets were thawed and resuspended in STE buffer supplemented with 500 mM sucrose and 10 mg/mL lysozyme, then shaken at 200 RPM at 37 °C for one hour. Plasmids were isolated from bacteria using alkaline lysis, and samples were subjected to gel electrophoresis at 80 Volts for 120 minutes at room temperature on a 1% agarose gel in 1X TAE buffer. Safe DNA Gel Stain (Bioland Scientific) was used to visualize DNA under blue LED light (FIG. 3B). Lane 1 shows no DNA present from the C. glutamicum negative control, lane 2 shows the expected molecular weight of the pEC-pk18mob2 empty plasmid expressed in C. glutamicum, lane 3 shows the expected molecular weight of the plasmid comprising expression cassette number 4 of FIG. 2B (SEQ ID NO: 322) expressed in C. glutamicum, lane 4 shows the expected molecular weight of the plasmid comprising expression cassette number 6 of FIG. 2B (SEQ ID NO: 324) expressed in C. glutamicum, and lane 5 shows the expected molecular weight of the plasmid comprising expression cassette number 5 of FIG. 2B (SEQ ID NO: 323) expressed in C. glutamicum.
Example 3: Expression of recombinant Factor C in C. glutamicum
[0122] C. glutamicum expression cassette 4 of FIG. 2B (SEQ ID NO: 322; pK18mob2 - FC) was cultivated in 14 mL round-bottom culture tubes containing 2.5 mL brain heart infusion (BHI; Carolina Biological Supply Company, Burlington, NC) medium at 30 °C for 48 hours at 200 RPM. In all cultivations, kanamycin (50 mg/L) was added to the culture medium as the sole antibiotic. As a seed culture, cells were inoculated into 50 mL of semi-defined medium containing 20 g/L of glucose in a 250 mL baffled flask and cultivated at 30 °C for 24 hours at 200 RPM. The semi-defined medium consists of 0.5 g urea, 0.25 mg ZnSO4, and 2.5 mg CaCl2 in BHI media. The seed culture (40 mL) was inoculated into 400 mL of fresh semi-defined medium in a 1 L jar custom-built bioreactor. Throughout cultivation, the temperature was maintained at 30 °C and the culture was stirred with an axial flow impeller at 300 RPM. Oxygen concentration was maximized by continual sterile air flow into the medium. The pH was maintained at 7.0 by adding 10% v/v ammonium hydroxide solution (LabChem, Zelienople, PA) when the pH dropped below 7 or 37% hydrochloric acid (GTI Laboratory Supplies, Edna, Texas) when the pH increased above 7. To prevent glucose starvation, a glucose solution (90 g in 150 mL BHI) was added to the culture in 90 second increments at a rate of 12.5 mL/hr.
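The feed figures above imply a glucose delivery rate that can be checked with simple arithmetic. The sketch below assumes that the 90 g of glucose is made up to a final volume of 150 mL and that 12.5 mL/hr is the time-averaged feed rate; both are interpretations, since the text does not state them explicitly. The variable names are illustrative.

```python
# Values stated in paragraph [0122].
GLUCOSE_G = 90.0             # glucose in the feed solution
FEED_VOLUME_ML = 150.0       # assumed final volume of the feed solution
FEED_RATE_ML_PER_HR = 12.5   # assumed time-averaged feed rate
SEED_ML = 40.0               # seed culture volume
MEDIUM_ML = 400.0            # fresh medium volume in the bioreactor


def feed_summary() -> dict:
    """Derive feed concentration, glucose delivery rate, and related ratios."""
    concentration = GLUCOSE_G / FEED_VOLUME_ML  # g glucose per mL feed
    return {
        "glucose_g_per_ml": concentration,
        "glucose_g_per_hr": concentration * FEED_RATE_ML_PER_HR,
        # Time for one 150 mL batch of feed to run out at the stated rate.
        "hours_per_feed_batch": FEED_VOLUME_ML / FEED_RATE_ML_PER_HR,
        # Seed volume relative to the fresh medium volume.
        "inoculum_ratio": SEED_ML / MEDIUM_ML,
    }


if __name__ == "__main__":
    for key, value in feed_summary().items():
        print(f"{key}: {value:.3g}")
```

Under these assumptions one 150 mL batch of feed covers less than the full cultivation time, so a longer run would need the feed to be replenished or the schedule to differ from a continuous feed; the text does not say which.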
[0123] After bioreactor cultivation for 36 hours, extracellular proteins were prepared using acetone precipitation. After centrifugation at 4500 RPM for 10 minutes at 4°C, 75 mL of the culture supernatant was vigorously mixed with two volumes of cold acetone and incubated at -20°C overnight. The protein samples were then precipitated by centrifugation at 13,200 RPM for 30 minutes at 4°C. The pellet was air-dried and resuspended in denaturing 8M urea (pH 8.0), 300 mM NaCl, 50 mM NaH2PO4, 20 mM Tris-Cl, 1 mM EDTA, 10% glycerol, and 1% Triton X-100. 60 µL of the resuspended precipitated supernatant protein was added per lane in an 8% SDS-PAGE gel. SDS-PAGE gel was then stained in 0.025% Coomassie Brilliant Blue R-250 in 10% acetic acid at 50°C for 15 minutes while shaking. SDS-PAGE gel was destained overnight in 10% acetic acid with several changes of 10% acetic acid. Gels were imaged on a light table (Figure 4).
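Centrifugation speeds above are given in RPM, which corresponds to different relative centrifugal forces depending on the rotor. The sketch below applies the standard conversion RCF = 1.118 x 10^-5 x r x RPM^2 (r in cm) and computes the acetone volume for the stated two-volume precipitation. The rotor radii are hypothetical placeholders, since the disclosure does not identify the rotors used.

```python
def rcf_from_rpm(rpm: float, radius_cm: float) -> float:
    """Relative centrifugal force (x g) for a given speed and rotor radius in cm."""
    return 1.118e-5 * radius_cm * rpm ** 2


def acetone_volume_ml(supernatant_ml: float, volumes: float = 2.0) -> float:
    """Volume of cold acetone for precipitation at the given volume ratio."""
    return supernatant_ml * volumes


if __name__ == "__main__":
    # Hypothetical rotor radii; substitute the radius of the rotor actually used.
    assumed_radius_cm = {4500: 10.0, 13200: 8.0}
    for rpm, radius in assumed_radius_cm.items():
        print(f"{rpm} RPM at r = {radius} cm: {rcf_from_rpm(rpm, radius):.0f} x g")
    print(f"Acetone for 75 mL supernatant: {acetone_volume_ml(75.0)} mL")
```

Reporting the force in x g alongside RPM makes a protocol reproducible on a different centrifuge, because the same RPM on a rotor of different radius gives a different force.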
[0124] Referring to Figure 4, lanes 1 and 2 are duplicates of the C. glutamicum pK18mob2 negative control sample, and lanes 3 and 4 are duplicates of the C. glutamicum pK18mob2 - FC sample. Lanes 3 and 4 show expression of ~80 kDa and ~43 kDa polypeptides in the culture supernatant of C. glutamicum expressing pK18mob2 - FC, a plasmid which harbors cassette number 4 (SEQ ID NO: 322), referred to as C. glutamicum pK18mob2 - FC. Lanes 1 and 2 do not show expression of ~80 kDa and ~43 kDa polypeptides in the culture supernatant of C. glutamicum expressing pK18mob2, an empty plasmid, referred to as C. glutamicum pK18mob2. Factor C is a two-chain glycoprotein (Mr = 123 kDa) composed of a heavy chain (Mr = 80 kDa) and a light chain (Mr = 43 kDa). SDS-PAGE gel analysis with an 8% gel under denaturing conditions demonstrates expression of ~80 kDa and ~43 kDa polypeptides in the culture supernatant of C. glutamicum pK18mob2 - FC, corresponding to production of Factor C in C. glutamicum and extrusion of the protein into the culture supernatant.
NUMBERED EMBODIMENTS
[0125] The following list of embodiments is not intended to be limiting and is included herein for illustrative purposes. The subject matter to be claimed is not limited to the following embodiments:
Embodiment 1. A nucleic acid molecule, comprising an expression cassette, wherein the expression cassette comprises, from 5’ to 3’: a. a promoter; b. a signal sequence; and c. a sequence encoding a cascade reagent protein.
Embodiment 2. The nucleic acid molecule of embodiment 1, wherein the expression cassette is optimized for expression in Corynebacterium glutamicum.
Embodiment 3. The nucleic acid molecule of embodiment 1 or 2, wherein the signal sequence encodes a signal peptide.
Embodiment 4. The nucleic acid molecule as in any one of embodiments 1-3, wherein the promoter drives expression of the signal sequence and the sequence encoding the cascade reagent protein.
Embodiment 5. The nucleic acid molecule as in any one of embodiments 1-4, wherein the promoter comprises a nucleic acid sequence derived from a promoter of a Corynebacterium glutamicum secretory gene.
Embodiment 6. The nucleic acid molecule as in any one of embodiments 1-5, wherein the signal sequence comprises a nucleic acid sequence derived from a signal sequence of a Corynebacterium glutamicum secretory gene.
Embodiment 7. The nucleic acid molecule as in embodiment 5 or 6, wherein the Corynebacterium glutamicum secretory gene is selected from the group consisting of the cgl514 gene, the cspA gene, the cspB gene, the CgR0949 gene, and the porB gene.
Embodiment 8. The nucleic acid molecule as in any one of embodiments 1-7, wherein the sequence encoding the cascade reagent protein encodes Factor C serine protease zymogen, Factor B serine protease zymogen, Proclotting Enzyme serine protease zymogen, or Coagulogen clotting protein.
Embodiment 9. The nucleic acid molecule as in any one of embodiments 1-8, wherein the sequence encoding the cascade reagent protein comprises a nucleic acid sequence derived from the genome of a horseshoe crab selected from the group consisting of Tachypleus tridentatus, Limulus polyphemus, Tachypleus gigas, and Carcinoscorpius rotundicauda.
Embodiment 10. The nucleic acid molecule as in any one of embodiments 1-9, wherein the signal sequence is located between the promoter and the sequence encoding the cascade reagent protein.
Embodiment 11. The nucleic acid molecule as in any one of embodiments 1-10, wherein the expression cassette comprises a termination sequence.
Embodiment 12. The nucleic acid molecule of embodiment 11, wherein the termination sequence is selected from the group consisting of the termination region of the Escherichia coli rrnB gene, the termination region of the Corynebacterium glutamicum cgl502 gene, the termination region of the Corynebacterium glutamicum cg3011 gene, the termination region of the Corynebacterium glutamicum cspA gene, and the termination region of the Corynebacterium glutamicum cgl338 gene.
Embodiment 13. The nucleic acid molecule as in any one of embodiments 1-12 wherein the expression cassette comprises a sequence encoding a polypeptide protein tag.
Embodiment 14. The nucleic acid molecule of embodiment 13, wherein the polypeptide protein tag is selected from the group consisting of polyhistidine-tag, FLAG-tag, HA-tag, calmodulin-binding peptide, streptavidin-binding peptide, glutathione S-transferase, and maltose-binding protein.
Embodiment 15. The nucleic acid molecule of embodiment 13 or 14, wherein the sequence encoding a polypeptide protein tag is located between the signal sequence and the sequence encoding the cascade reagent protein.
Embodiment 16. The nucleic acid molecule of embodiment 15, wherein a sequence encoding a linker is located between the sequence encoding the polypeptide protein tag and the sequence encoding the cascade reagent protein.
Embodiment 17. The nucleic acid molecule of embodiment 13 or 14, wherein the sequence encoding the polypeptide protein tag is located between the sequence encoding the cascade reagent protein and the termination sequence.
Embodiment 18. The nucleic acid molecule of embodiment 17, wherein a sequence encoding a linker is located between the sequence encoding the cascade reagent protein and the sequence encoding the polypeptide protein tag.
Embodiment 19. The nucleic acid molecule as in any one of embodiments 1-12 wherein the expression cassette comprises two or more sequences encoding polypeptide protein tags.
Embodiment 20. The nucleic acid molecule of embodiment 19, wherein the polypeptide protein tags are selected from the group consisting of polyhistidine-tag, FLAG-tag, HA-tag, calmodulin-binding peptide, streptavidin-binding peptide, glutathione S-transferase, and maltose-binding protein.
Embodiment 21. The nucleic acid molecule of embodiment 19 or 20, wherein the sequence encoding the cascade reagent protein is located between two sequences encoding polypeptide protein tags.
Embodiment 22. The nucleic acid molecule of embodiment 21, wherein sequences encoding linkers are located between the sequence encoding the cascade reagent protein and the sequences encoding the polypeptide protein tags.
Embodiment 23. The nucleic acid molecule as in any one of embodiments 16, 18, or 22, in which the linker or linkers are selected from the group consisting of flexible GS linkers, flexible glycine linkers, rigid α-helical linkers, rigid proline-rich linkers, and cleavable disulfide linkers.
Embodiment 24. The nucleic acid molecule of any one of embodiments 1-23, wherein the cascade reagent protein is encoded by a nucleic acid sequence of SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 278-283, or SEQ ID NO: 325, or a sequence at least 90% identical thereto.
Embodiment 25. The nucleic acid molecule of any one of embodiments 1-23, wherein the cascade reagent protein is encoded by a nucleic acid sequence of SEQ ID NO: 3, SEQ ID NO: 4, or SEQ ID NO: 284-289, or a sequence at least 90% identical thereto.
Embodiment 26. The nucleic acid molecule of any one of embodiments 1-23, wherein the cascade reagent protein is encoded by a nucleic acid sequence of SEQ ID NO: 5 or SEQ ID NO: 290-292, or a sequence at least 90% identical thereto.
Embodiment 27. The nucleic acid molecule of any one of embodiments 1-23, wherein the cascade reagent protein is encoded by a nucleic acid sequence of SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, or SEQ ID NO: 293-301, or a sequence at least 90% identical thereto.
Embodiment 28. The nucleic acid molecule as in any one of embodiments 1-27, wherein the promoter is encoded by a nucleic acid sequence of any one of SEQ ID NO: 9-13, or a sequence at least 90% identical thereto.
Embodiment 29. The nucleic acid molecule as in any one of embodiments 1-28, wherein the signal sequence is encoded by a nucleic acid sequence of any one of SEQ ID NO: 14-18, or a sequence at least 90% identical thereto.
Embodiment 30. The nucleic acid molecule as in any one of embodiments 13-29, wherein the polypeptide protein tag is encoded by a nucleic acid sequence of any one of SEQ ID NO: 19-32, or a sequence at least 90% identical thereto.
Embodiment 31. The nucleic acid molecule as in any one of embodiments 11-30, wherein the termination sequence is encoded by a nucleic acid sequence of any one of SEQ ID NO: 272-277, or a sequence at least 90% identical thereto.
Embodiment 32. The nucleic acid molecule as in any one of embodiments 16, 18, or 22-31, wherein the linker or linkers are encoded by a nucleic acid sequence of any one of SEQ ID NO: 265-271, or a sequence at least 90% identical thereto.
Embodiment 33. The nucleic acid molecule as in any one of embodiments 1-32, wherein the signal sequence and cascade reagent protein are encoded by a nucleic acid sequence of any one of SEQ ID NO: 33-96, or a sequence at least 90% identical thereto.
Embodiment 34. The nucleic acid molecule as in any one of embodiments 1-33, wherein the expression cassette comprises a nucleic acid sequence of any one of SEQ ID NO: 97-128, SEQ ID NO: 322-324, or a sequence at least 90% identical thereto.
Embodiment 35. A plasmid, comprising the nucleic acid molecule as in any one of embodiments 1-34.
Embodiment 36. A cell, comprising the nucleic acid molecule as in any one of embodiments 1-34 or the plasmid of embodiment 35.
Embodiment 37. A method of producing a recombinant expression system, the method comprising contacting a Corynebacterium glutamicum cell with a nucleic acid molecule as in any one of embodiments 1-34, or the plasmid of embodiment 35.
Embodiment 38. A recombinant expression system produced by the method of embodiment 37.
Embodiment 39. A method of expressing Factor C serine protease zymogen, Factor B serine protease zymogen, Proclotting Enzyme serine protease zymogen, or Coagulogen clotting protein, comprising contacting a Corynebacterium glutamicum cell with a nucleic acid molecule as in any one of embodiments 1-34, or the plasmid of embodiment 35.
Embodiment 40. An isolated, purified protein molecule, wherein the amino acid sequence is at least 75% identical to any one of SEQ ID NO: 129-256.
Embodiment 41. A kit for detecting a pyrogen or endotoxin in a sample comprising recombinant Factor C serine protease zymogen, recombinant Factor B serine protease zymogen, recombinant Proclotting Enzyme serine protease zymogen, and recombinant Coagulogen clotting protein expressed in Corynebacterium glutamicum.
Embodiment 42. The kit for detecting a pyrogen or endotoxin in a sample of embodiment 41, wherein the amino acid sequence of the recombinant Factor C serine protease zymogen is at least 75% identical to SEQ ID NO: 257 or SEQ ID NO: 258.
Embodiment 43. The kit for detecting a pyrogen or endotoxin in a sample as in any one of embodiments 41-42, wherein the amino acid sequence of the recombinant Factor B serine protease zymogen is at least 75% identical to SEQ ID NO: 259 or SEQ ID NO: 260.
Embodiment 44. The kit for detecting a pyrogen or endotoxin in a sample as in any one of embodiments 41-43, wherein the amino acid sequence of the recombinant Proclotting Enzyme serine protease zymogen is at least 75% identical to SEQ ID NO: 261.
Embodiment 45. The kit for detecting a pyrogen or endotoxin in a sample as in any one of embodiments 41-44, wherein the amino acid sequence of the recombinant Coagulogen clotting protein is at least 75% identical to any one of SEQ ID NO: 262-264.
Embodiment 46. A method of detecting a pyrogen or endotoxin in a sample, comprising contacting the sample with an isolated, purified protein molecule of embodiment 40.
Embodiment 47. A method of detecting a pyrogen or endotoxin in a sample, comprising contacting the sample with the components of the kit as in any one of embodiments 41-45.
Claims
1. A nucleic acid molecule, comprising an expression cassette, wherein the expression cassette comprises, from 5’ to 3’:
(i) a promoter;
(ii) a signal sequence; and
(iii) a sequence encoding a cascade reagent protein.
2. The nucleic acid molecule of claim 1, wherein the expression cassette is optimized for expression in Corynebacterium glutamicum.
3. The nucleic acid molecule of claim 1 or 2, wherein the signal sequence encodes a signal peptide.
4. The nucleic acid molecule as in any one of claims 1-3, wherein the promoter drives expression of the signal sequence and the sequence encoding the cascade reagent protein.
5. The nucleic acid molecule as in any one of claims 1-4, wherein the promoter comprises a nucleic acid sequence derived from a promoter of a Corynebacterium glutamicum secretory gene.
6. The nucleic acid molecule as in any one of claims 1-5, wherein the signal sequence comprises a nucleic acid sequence derived from a signal sequence of a Corynebacterium glutamicum secretory gene.
7. The nucleic acid molecule as in claim 5 or 6, wherein the Corynebacterium glutamicum secretory gene is selected from the group consisting of the cgl514 gene, the cspA gene, the cspB gene, the CgR0949 gene, and the porB gene.
8. The nucleic acid molecule as in any one of claims 1-7, wherein the sequence encoding the cascade reagent protein encodes Factor C serine protease zymogen, Factor B serine protease zymogen, Proclotting Enzyme serine protease zymogen, or Coagulogen clotting protein.
9. The nucleic acid molecule as in any one of claims 1-8, wherein the sequence encoding the cascade reagent protein comprises a nucleic acid sequence derived from the genome of a horseshoe crab selected from the group consisting of Tachypleus tridentatus, Limulus polyphemus, Tachypleus gigas, and Carcinoscorpius rotundicauda.
10. The nucleic acid molecule as in any one of claims 1-9, wherein the signal sequence is located between the promoter and the sequence encoding the cascade reagent protein.
11. The nucleic acid molecule as in any one of claims 1-10, wherein the expression cassette comprises a termination sequence.
12. The nucleic acid molecule of claim 11, wherein the termination sequence is selected from the group consisting of the termination region of the Escherichia coli rrnB gene, the termination region of the Corynebacterium glutamicum gene, the termination region of the Corynebacterium glutamicum cg3011 gene, the termination region of the Corynebacterium glutamicum cspA gene, and the termination region of the Corynebacterium glutamicum cg1338 gene.
13. The nucleic acid molecule as in any one of claims 1-12, wherein the expression cassette comprises a sequence encoding a polypeptide protein tag.
14. The nucleic acid molecule of claim 13, wherein the polypeptide protein tag is selected from the group consisting of polyhistidine-tag, FLAG-tag, HA-tag, calmodulin-binding peptide, streptavidin-binding peptide, glutathione S-transferase, and maltose-binding protein.
15. The nucleic acid molecule of claim 13 or 14, wherein the sequence encoding a polypeptide protein tag is located between the signal sequence and the sequence encoding the cascade reagent protein.
16. The nucleic acid molecule of claim 15, wherein a sequence encoding a linker is located between the sequence encoding the polypeptide protein tag and the sequence encoding the cascade reagent protein.
17. The nucleic acid molecule of claim 13 or 14, wherein the sequence encoding the polypeptide protein tag is located between the sequence encoding the cascade reagent protein and the termination sequence.
18. The nucleic acid molecule of claim 17, wherein a sequence encoding a linker is located between the sequence encoding the cascade reagent protein and the sequence encoding the polypeptide protein tag.
19. The nucleic acid molecule as in any one of claims 1-12, wherein the expression cassette comprises two or more sequences encoding polypeptide protein tags.
20. The nucleic acid molecule of claim 19, wherein the polypeptide protein tags are selected from the group consisting of polyhistidine-tag, FLAG-tag, HA-tag, calmodulin-binding peptide, streptavidin-binding peptide, glutathione S-transferase, and maltose-binding protein.
21. The nucleic acid molecule of claim 19 or 20, wherein the sequence encoding the cascade reagent protein is located between two sequences encoding polypeptide protein tags.
22. The nucleic acid molecule of claim 21, wherein sequences encoding linkers are located between the sequence encoding the cascade reagent protein and the sequences encoding the polypeptide protein tags.
23. The nucleic acid molecule as in any one of claims 16, 18, or 22, in which the linker or linkers are selected from the group consisting of flexible GS linkers, flexible glycine linkers, rigid α-helical linkers, rigid proline-rich linkers, and cleavable disulfide linkers.
24. The nucleic acid molecule of any one of claims 1-23, wherein the cascade reagent protein is encoded by a nucleic acid sequence of SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 278-283, or SEQ ID NO: 325 or a sequence at least 90% identical thereto.
25. The nucleic acid molecule of any one of claims 1-23, wherein the cascade reagent protein is encoded by a nucleic acid sequence of SEQ ID NO: 3, SEQ ID NO: 4, or SEQ ID NO: 284-289, or a sequence at least 90% identical thereto.
26. The nucleic acid molecule of any one of claims 1-23, wherein the cascade reagent protein is encoded by a nucleic acid sequence of SEQ ID NO: 5 or SEQ ID NO: 290-292, or a sequence at least 90% identical thereto.
27. The nucleic acid molecule of any one of claims 1-23, wherein the cascade reagent protein is encoded by a nucleic acid sequence of SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, or SEQ ID NO: 293-301, or a sequence at least 90% identical thereto.
28. The nucleic acid molecule as in any one of claims 1-27, wherein the promoter is encoded by a nucleic acid sequence of any one of SEQ ID NO: 9-13, or a sequence at least 90% identical thereto.
29. The nucleic acid molecule as in any one of claims 1-28, wherein the signal sequence is encoded by a nucleic acid sequence of any one of SEQ ID NO: 14-18, or a sequence at least 90% identical thereto.
30. The nucleic acid molecule as in any one of claims 13-29, wherein the polypeptide protein tag is encoded by a nucleic acid sequence of any one of SEQ ID NO: 19-32, or a sequence at least 90% identical thereto.
31. The nucleic acid molecule as in any one of claims 11-30, wherein the termination sequence is encoded by a nucleic acid sequence of any one of SEQ ID NO: 272-277, or a sequence at least 90% identical thereto.
32. The nucleic acid molecule as in any one of claims 16, 18, or 22-31, wherein the linker or linkers are encoded by a nucleic acid sequence of any one of SEQ ID NO: 265-271, or a sequence at least 90% identical thereto.
33. The nucleic acid molecule as in any one of claims 1-32, wherein the signal sequence and cascade reagent protein are encoded by a nucleic acid sequence of any one of SEQ ID NO: 33-96, or a sequence at least 90% identical thereto.
34. The nucleic acid molecule as in any one of claims 1-33, wherein the expression cassette comprises a nucleic acid sequence of any one of SEQ ID NO: 97-128, SEQ ID NO: 322-324, or a sequence at least 90% identical thereto.
35. A plasmid, comprising the nucleic acid molecule as in any one of claims 1-34.
36. A cell, comprising the nucleic acid molecule as in any one of claims 1-34 or the plasmid of claim 35.
37. A method of producing a recombinant expression system, the method comprising contacting a Corynebacterium glutamicum cell with a nucleic acid molecule as in any one of claims 1-34, or the plasmid of claim 35.
38. A recombinant expression system produced by the method of claim 37.
39. A method of expressing Factor C serine protease zymogen, Factor B serine protease zymogen, Proclotting Enzyme serine protease zymogen, or Coagulogen clotting protein, comprising contacting a Corynebacterium glutamicum cell with a nucleic acid molecule as in any one of claims 1-34, or the plasmid of claim 35.
40. An isolated, purified protein molecule, wherein the amino acid sequence is at least 75% identical to any one of SEQ ID NO: 129-256.
41. A kit for detecting a pyrogen or endotoxin in a sample comprising recombinant Factor C serine protease zymogen, recombinant Factor B serine protease zymogen, recombinant Proclotting Enzyme serine protease zymogen, and recombinant Coagulogen clotting protein expressed in Corynebacterium glutamicum.
42. The kit for detecting a pyrogen or endotoxin in a sample of claim 41, wherein the amino acid sequence of the recombinant Factor C serine protease zymogen is at least 75% identical to SEQ ID NO: 257 or SEQ ID NO: 258.
43. The kit for detecting a pyrogen or endotoxin in a sample as in any one of claims 41-42, wherein the amino acid sequence of the recombinant Factor B serine protease zymogen is at least 75% identical to SEQ ID NO: 259 or SEQ ID NO: 260.
44. The kit for detecting a pyrogen or endotoxin in a sample as in any one of claims 41-43, wherein the amino acid sequence of the recombinant Proclotting Enzyme serine protease zymogen is at least 75% identical to SEQ ID NO: 261.
45. The kit for detecting a pyrogen or endotoxin in a sample as in any one of claims 41-44, wherein the amino acid sequence of the recombinant Coagulogen clotting protein is at least 75% identical to any one of SEQ ID NO: 262-264.
46. A method of detecting a pyrogen or endotoxin in a sample, comprising contacting the sample with an isolated, purified protein molecule of claim 40.
47. A method of detecting a pyrogen or endotoxin in a sample, comprising contacting the sample with the components of the kit as in any one of claims 41-45.
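For illustration only, and not as part of the claims, the 5'-to-3' element ordering recited in claims 10, 15, and 17 can be sketched in Python. The element labels and the helper functions `satisfies_claim_10`, `satisfies_claim_15`, and `satisfies_claim_17` are hypothetical names introduced here; the sketch models ordering only and carries no sequence data.

```python
# Hypothetical labels for expression cassette elements (not taken from the source).
PROMOTER = "promoter"
SIGNAL = "signal_sequence"
TAG = "tag"
LINKER = "linker"
CRP = "cascade_reagent_protein"
TERMINATOR = "terminator"


def _pos(cassette, name):
    """Return the 5'-to-3' position of an element, or None if it is absent."""
    return cassette.index(name) if name in cassette else None


def satisfies_claim_10(cassette):
    """Signal sequence lies between the promoter and the cascade reagent protein."""
    p, s, c = _pos(cassette, PROMOTER), _pos(cassette, SIGNAL), _pos(cassette, CRP)
    return None not in (p, s, c) and p < s < c


def satisfies_claim_15(cassette):
    """A tag lies between the signal sequence and the cascade reagent protein."""
    s, c = _pos(cassette, SIGNAL), _pos(cassette, CRP)
    return None not in (s, c) and TAG in cassette[s + 1:c]


def satisfies_claim_17(cassette):
    """A tag lies between the cascade reagent protein and the termination sequence."""
    c, t = _pos(cassette, CRP), _pos(cassette, TERMINATOR)
    return None not in (c, t) and TAG in cassette[c + 1:t]


# Two assumed layouts: tag upstream of the protein, and tag downstream of it.
n_tagged = [PROMOTER, SIGNAL, TAG, LINKER, CRP, TERMINATOR]
c_tagged = [PROMOTER, SIGNAL, CRP, LINKER, TAG, TERMINATOR]

print(satisfies_claim_15(n_tagged), satisfies_claim_17(n_tagged))
print(satisfies_claim_15(c_tagged), satisfies_claim_17(c_tagged))
```

Under these assumptions, both layouts meet the ordering of claim 10, while the upstream-tag layout matches claim 15 and the downstream-tag layout matches claim 17.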
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/US2023/014214 WO2023177526A2 (en) | 2022-03-01 | 2023-03-01 | Compositions and methods for detecting an endotoxin |
Applications Claiming Priority (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US202263315513P | 2022-03-01 | 2022-03-01 | |
US63/315,513 | 2022-03-01 | ||
PCT/US2023/014214 WO2023177526A2 (en) | 2022-03-01 | 2023-03-01 | Compositions and methods for detecting an endotoxin |
Publications (3)
Publication Number | Publication Date |
---|---|
WO2023177526A2 (en) | 2023-09-21 |
WO2023177526A3 (en) | 2024-02-29 |
WO2023177526A9 (en) | 2024-07-18 |
Family
ID=88024578
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/US2023/014214 WO2023177526A2 (en) | 2022-03-01 | 2023-03-01 | Compositions and methods for detecting an endotoxin |
Country Status (1)
Country | Link |
---|---|
WO (1) | WO2023177526A2 (en) |
- 2023-03-01 WO PCT/US2023/014214 patent/WO2023177526A2/en unknown
Also Published As
Publication number | Publication date |
---|---|
WO2023177526A9 (en) | 2024-07-18 |
WO2023177526A3 (en) | 2024-02-29 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
 | WPC | Withdrawal of priority claims after completion of the technical preparations for international publication | Ref document number: 63/315,513; Country of ref document: US; Date of ref document: 20240524; Free format text: WITHDRAWN AFTER TECHNICAL PREPARATION FINISHED |