WO2023177526A2 - Compositions and methods for detecting an endotoxin - Google Patents

Compositions and methods for detecting an endotoxin

Info

Publication number
WO2023177526A2
Authority
WO
WIPO (PCT)
Prior art keywords
seq
sequence
nucleic acid
protein
acid molecule
Prior art date
Application number
PCT/US2023/014214
Other languages
French (fr)
Other versions
WO2023177526A9 (en)
WO2023177526A3 (en)
Inventor
Jennifer WATSON
Richard Hatcher
Original Assignee
Watson Jennifer
Richard Hatcher
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Watson Jennifer, Richard Hatcher
Priority to PCT/US2023/014214
Publication of WO2023177526A2
Publication of WO2023177526A3
Publication of WO2023177526A9

Classifications

    • C - CHEMISTRY; METALLURGY
    • C12 - BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q - MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00 - Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/34 - Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving hydrolase
    • C12Q1/37 - Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving hydrolase involving peptidase or proteinase
    • C - CHEMISTRY; METALLURGY
    • C12 - BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N - MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 - Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 - Recombinant DNA-technology
    • C12N15/11 - DNA or RNA fragments; Modified forms thereof; Non-coding nucleic acids having a biological activity
    • C12N15/62 - DNA sequences coding for fusion proteins
    • C12N15/625 - DNA sequences coding for fusion proteins containing a sequence coding for a signal sequence
    • C - CHEMISTRY; METALLURGY
    • C12 - BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N - MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 - Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 - Recombinant DNA-technology
    • C12N15/63 - Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
    • C12N15/74 - Vectors or expression systems specially adapted for prokaryotic hosts other than E. coli, e.g. Lactobacillus, Micromonospora
    • C12N15/77 - Vectors or expression systems specially adapted for prokaryotic hosts other than E. coli, e.g. Lactobacillus, Micromonospora for Corynebacterium; for Brevibacterium
    • C - CHEMISTRY; METALLURGY
    • C12 - BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Y - ENZYMES
    • C12Y304/00 - Hydrolases acting on peptide bonds, i.e. peptidases (3.4)
    • C12Y304/21 - Serine endopeptidases (3.4.21)
    • C12Y304/21084 - Serine endopeptidases (3.4.21) limulus clotting factor C (3.4.21.84)
    • C - CHEMISTRY; METALLURGY
    • C12 - BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Y - ENZYMES
    • C12Y304/00 - Hydrolases acting on peptide bonds, i.e. peptidases (3.4)
    • C12Y304/21 - Serine endopeptidases (3.4.21)
    • C12Y304/21085 - Limulus clotting factor B (3.4.21.85)
    • C - CHEMISTRY; METALLURGY
    • C12 - BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Y - ENZYMES
    • C12Y304/00 - Hydrolases acting on peptide bonds, i.e. peptidases (3.4)
    • C12Y304/21 - Serine endopeptidases (3.4.21)
    • C12Y304/21086 - Limulus clotting enzyme (3.4.21.86)
    • G - PHYSICS
    • G01 - MEASURING; TESTING
    • G01N - INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N33/00 - Investigating or analysing materials by specific methods not covered by groups G01N1/00 - G01N31/00
    • G01N33/48 - Biological material, e.g. blood, urine; Haemocytometers
    • G01N33/50 - Chemical analysis of biological material, e.g. blood, urine; Testing involving biospecific ligand binding methods; Immunological testing
    • G01N33/68 - Chemical analysis of biological material, e.g. blood, urine; Testing involving biospecific ligand binding methods; Immunological testing involving proteins, peptides or amino acids
    • C - CHEMISTRY; METALLURGY
    • C07 - ORGANIC CHEMISTRY
    • C07K - PEPTIDES
    • C07K2319/00 - Fusion polypeptide
    • C07K2319/01 - Fusion polypeptide containing a localisation/targetting motif
    • C07K2319/02 - Fusion polypeptide containing a localisation/targetting motif containing a signal sequence
    • C - CHEMISTRY; METALLURGY
    • C07 - ORGANIC CHEMISTRY
    • C07K - PEPTIDES
    • C07K2319/00 - Fusion polypeptide
    • C07K2319/20 - Fusion polypeptide containing a tag with affinity for a non-protein ligand
    • G - PHYSICS
    • G01 - MEASURING; TESTING
    • G01N - INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N2333/00 - Assays involving biological materials from specific organisms or of a specific nature
    • G01N2333/195 - Assays involving biological materials from specific organisms or of a specific nature from bacteria
    • G01N2333/34 - Assays involving biological materials from specific organisms or of a specific nature from bacteria from Corynebacterium (G)
    • G - PHYSICS
    • G01 - MEASURING; TESTING
    • G01N - INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N2333/00 - Assays involving biological materials from specific organisms or of a specific nature
    • G01N2333/90 - Enzymes; Proenzymes
    • G01N2333/914 - Hydrolases (3)
    • G01N2333/948 - Hydrolases (3) acting on peptide bonds (3.4)
    • G01N2333/95 - Proteinases, i.e. endopeptidases (3.4.21-3.4.99)
    • G01N2333/964 - Proteinases, i.e. endopeptidases (3.4.21-3.4.99) derived from animal tissue
    • G01N2333/96425 - Proteinases, i.e. endopeptidases (3.4.21-3.4.99) derived from animal tissue from mammals
    • G01N2333/96427 - Proteinases, i.e. endopeptidases (3.4.21-3.4.99) derived from animal tissue from mammals in general
    • G01N2333/9643 - Proteinases, i.e. endopeptidases (3.4.21-3.4.99) derived from animal tissue from mammals in general with EC number
    • G01N2333/96433 - Serine endopeptidases (3.4.21)

Definitions

  • the present invention relates generally to the fields of biotechnology and infectious diseases, and more particularly it pertains to recombinant production of enzymes for detection of pyrogens and endotoxins.
  • the assay uses the hemolymph (blood) of the horseshoe crab, Limulus polyphemus (L. polyphemus), and tests for the presence of fever-producing agents of bacterial origin, e.g., endotoxins.
  • the limulus amoebocyte lysate (LAL) test method is a qualitative assay during which the L. polyphemus hemolymph lysate reacts with an endotoxin to form a gel.
  • the LAL test is considered to be reproducible, simple to conduct, specific for the presence of endotoxins, and sensitive to even picogram quantities of endotoxins.
  • the quantity of endotoxin may be determined by dilution techniques comparing gel formation of the test sample to that of a reference pyrogen.
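  • As a non-limiting illustration of the dilution technique described above, the following Python sketch reads an endpoint from gel-clot results. The function names, the dictionary layout, and the rule that the estimate equals the lowest gelling reference concentration multiplied by the highest gelling sample dilution factor are assumptions introduced here for illustration; they are not taken from the disclosure.

```python
from typing import Dict, Optional


def endpoint(results: Dict[float, bool]) -> Optional[float]:
    """Return the largest key whose test formed a gel, or None if none did."""
    positives = [key for key, gelled in results.items() if gelled]
    return max(positives) if positives else None


def estimate_endotoxin(sample: Dict[float, bool],
                       reference: Dict[float, bool]) -> Optional[float]:
    """Estimate the endotoxin concentration of an undiluted sample.

    sample:    dilution factor (1, 2, 4, ...) -> gel formed?
    reference: reference pyrogen concentration -> gel formed?
    Assumption (hypothetical rule): estimate = lowest reference
    concentration that gels * highest sample dilution that gels.
    """
    sample_endpoint = endpoint(sample)
    gelling_refs = [conc for conc, gelled in reference.items() if gelled]
    if sample_endpoint is None or not gelling_refs:
        return None  # nothing gelled, so no estimate can be made
    return min(gelling_refs) * sample_endpoint
```

  • The result carries the units of the reference concentrations; a sample that never gels returns None rather than zero, because a negative gel only bounds the quantity from above.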
  • the following non-essential publications are incorporated by reference in their entirety to aid in understanding of the official use of the LAL assay for release testing of final drug products: Levin, J, et al. Clotting cells and Limulus Amebocyte lysate: an amazing analytical tool.
  • Shuster CNJ, Barlow RB and Brockman HJ eds
  • McCullough KZ, ed. The Bacterial Endotoxins Test: A Practical Approach. 2011: 1-13.
  • the LAL assay comprises horseshoe crab lysate reagents that form a four-step coagulation cascade.
  • three serine protease zymogens (Factor C, Factor B, and Proclotting Enzyme) and one clotting protein, Coagulogen, form the enzymatic coagulation cascade that results in a coagulin gel clot in the presence of an endotoxin.
  • an endotoxin activates the Factor C zymogen and the activated Factor C subsequently activates Factor B, which converts the Proclotting Enzyme into Clotting Enzyme that cleaves Coagulogen into Coagulin, forming a gel clot.
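  • As a non-limiting illustration, the activation order described above can be sketched in Python as a chain in which each step needs both its activator and its substrate. The names STEPS, run_cascade, and gel_clot_forms, and the all-or-nothing treatment of each step, are assumptions made for this sketch; real assays are quantitative.

```python
from typing import List, Set

# (activator, substrate, product) for each step, in the order given above.
STEPS = [
    ("endotoxin", "Factor C zymogen", "activated Factor C"),
    ("activated Factor C", "Factor B zymogen", "activated Factor B"),
    ("activated Factor B", "Proclotting Enzyme", "Clotting Enzyme"),
    ("Clotting Enzyme", "Coagulogen", "Coagulin"),
]

ALL_REAGENTS = {substrate for _, substrate, _ in STEPS}


def run_cascade(has_endotoxin: bool, reagents: Set[str]) -> List[str]:
    """Return the products formed, in order, stopping at the first blocked step."""
    present = {"endotoxin"} if has_endotoxin else set()
    products = []
    for activator, substrate, product in STEPS:
        if activator not in present or substrate not in reagents:
            break  # a missing activator or reagent halts everything downstream
        present.add(product)
        products.append(product)
    return products


def gel_clot_forms(has_endotoxin: bool, reagents: Set[str]) -> bool:
    """A gel clot forms only if the cascade reaches Coagulin."""
    return "Coagulin" in run_cascade(has_endotoxin, reagents)
```

  • The sketch makes the dependency explicit: omitting any one of the four reagents, or the endotoxin trigger, prevents clot formation.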
  • the raw materials for the production of lysate reagents are harvested from wild-caught horseshoe crabs, including L. polyphemus and Tachypleus tridentatus (T. tridentatus). Wild horseshoe crab populations are in decline due to the detrimental effects of capture, blood collection, and release, poor management of harvest regulations, and habitat destruction. Commercial-scale cultivation of horseshoe crabs has not been achieved.
  • the following non-essential publications are incorporated by reference in their entirety to aid in understanding of the unsustainability of blood collection from wild-caught crabs for production of LAL assay reagents: Gauvry G. Current horseshoe crab harvesting practices cannot support global demand for TAL/LAL: The pharmaceutical and medical device industries’ role in the sustainability of horseshoe crabs. In:
  • the disclosure features expression cassettes, plasmids, and functional recombinant cascade reagents (RCRs) produced from these expression cassettes and plasmids.
  • the disclosure also features expression cassettes for functional RCRs optimized for production in Corynebacterium glutamicum (C. glutamicum).
  • the disclosure features optimized expression cassettes for production in C. glutamicum of the Factor C, Factor B, and Proclotting Enzyme serine protease zymogens, as well as optimized expression cassettes for production of the Coagulogen clotting protein.
  • the disclosure provides nucleic acid molecules, comprising expression cassettes, wherein the expression cassettes comprise, from 5’ to 3’: a promoter; a signal sequence; and a sequence encoding a cascade reagent protein.
  • the expression cassette is optimized for expression in C. glutamicum.
  • the signal sequence encodes a signal peptide.
  • the promoter drives expression of the signal sequence and the sequence encoding the cascade reagent protein.
  • the promoter comprises a nucleic acid sequence derived from a promoter of a C. glutamicum secretory gene.
  • the signal sequence comprises a nucleic acid sequence derived from a signal sequence of a C. glutamicum secretory gene.
  • the C. glutamicum secretory gene is selected from the group consisting of the cg1514 gene, the cspA gene, the cspB gene, the
  • sequence encoding the cascade reagent protein encodes
  • the sequence encoding the cascade reagent protein comprises a nucleic acid sequence derived from the genome of a horseshoe crab selected from the group consisting of Tachypleus tridentatus, Limulus polyphemus, Tachypleus gigas, and Carcinoscorpius rotundicauda (C. rotundicauda).
  • the signal sequence is located between the promoter and the sequence encoding the cascade reagent protein.
  • the expression cassette comprises a termination sequence.
  • the termination sequence is selected from the group consisting of the termination region of the Escherichia coli rrnB gene, the termination region of the Corynebacterium glutamicum cg1502 gene, the termination region of the Corynebacterium glutamicum cg3011 gene, the termination region of the Corynebacterium glutamicum cspA gene, and the termination region of the Corynebacterium glutamicum cg1338 gene.
  • the expression cassette comprises a sequence encoding a polypeptide protein tag. In some embodiments, the expression cassette comprises two or more sequences encoding polypeptide protein tags. In some embodiments, the polypeptide protein tag or polypeptide protein tags are selected from the group consisting of polyhistidine-tag, FLAG-tag, HA-tag, calmodulin-binding peptide, streptavidin-binding peptide, glutathione S-transferase, and maltose-binding protein.
  • the sequence encoding the polypeptide protein tag is located between the signal sequence and the sequence encoding the cascade reagent protein.
  • a sequence encoding a linker is located between the sequence encoding the polypeptide protein tag and the sequence encoding the cascade reagent protein.
  • the sequence encoding the polypeptide protein tag is located between the sequence encoding the cascade reagent protein and the termination sequence.
  • a sequence encoding a linker is located between the sequence encoding the cascade reagent protein and the sequence encoding the polypeptide protein tag.
  • the sequence encoding the cascade reagent protein is located between two sequences encoding polypeptide protein tags.
  • sequences encoding linkers are located between the sequence encoding the cascade reagent protein and the sequences encoding the polypeptide protein tags.
  • the linker or linkers are selected from the group consisting of flexible glycine-serine linkers, flexible glycine linkers, rigid α-helical linkers, rigid proline-rich linkers, and cleavable disulfide linkers.
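  • As a non-limiting illustration, the element orders described above (promoter, signal sequence, optional tag and linker on either side of the coding sequence, optional termination sequence) can be enumerated with a short Python sketch. The function name cassette_layout and the shorthand "N", "C", and "both" for the tag position are hypothetical labels introduced here, where "N" places the tag between the signal sequence and the coding sequence and "C" places it between the coding sequence and the termination sequence.

```python
from typing import List, Optional


def cassette_layout(tag_position: Optional[str] = None,
                    with_linker: bool = False,
                    with_terminator: bool = True) -> List[str]:
    """List the cassette elements from 5' to 3'.

    tag_position: None (no tag), "N", "C", or "both" (hypothetical shorthand).
    with_linker:  place a linker between each tag and the coding sequence.
    """
    if tag_position not in (None, "N", "C", "both"):
        raise ValueError("tag_position must be None, 'N', 'C', or 'both'")
    layout = ["promoter", "signal sequence"]
    if tag_position in ("N", "both"):
        layout.append("polypeptide protein tag")
        if with_linker:
            layout.append("linker")
    layout.append("cascade reagent protein")
    if tag_position in ("C", "both"):
        if with_linker:
            layout.append("linker")
        layout.append("polypeptide protein tag")
    if with_terminator:
        layout.append("termination sequence")
    return layout


if __name__ == "__main__":
    for position in (None, "N", "C", "both"):
        print(position, "->", " | ".join(cassette_layout(position, with_linker=True)))
```

  • Keeping the layout as an ordered list makes it straightforward to check that the signal sequence always sits between the promoter and everything that is translated.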
  • the cascade reagent protein is encoded by a nucleic acid sequence of SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 278-283, or SEQ ID NO: 325, or a sequence at least 90% identical thereto.
  • the cascade reagent protein is encoded by a nucleic acid sequence of SEQ ID NO: 3, SEQ ID NO: 4, or SEQ ID NO: 284-289, or a sequence at least 90% identical thereto.
  • the cascade reagent protein is encoded by a nucleic acid sequence of SEQ ID NO: 5 or SEQ ID NO: 290-292, or a sequence at least 90% identical thereto.
  • the cascade reagent protein is encoded by a nucleic acid sequence of SEQ ID NO: 6-8 or SEQ ID NO: 293-301, or a sequence at least 90% identical thereto.
  • the promoter is encoded by a nucleic acid sequence of any one of SEQ ID NO: 9-13, or a sequence at least 90% identical thereto.
  • the signal sequence is encoded by a nucleic acid sequence of any one of SEQ ID NO: 14-18, or a sequence at least 90% identical thereto.
  • the polypeptide protein tag is encoded by a nucleic acid sequence of any one of SEQ ID NO: 19-32, or a sequence at least 90% identical thereto.
  • the termination sequence is encoded by a nucleic acid sequence of any one of SEQ ID NO: 272-277, or a sequence at least 90% identical thereto.
  • the linker is encoded by a nucleic acid sequence of any one of SEQ ID NO: 265-271, or a sequence at least 90% identical thereto.
  • the signal sequence and cascade reagent protein are encoded by a nucleic acid sequence of any one of SEQ ID NO: 33-96, or a sequence at least 90% identical thereto.
  • the expression cassette comprises a nucleic acid sequence of any one of SEQ ID NO: 97-128 or SEQ ID NO: 322-324, or a sequence at least 90% identical thereto.
  • the disclosure provides plasmids comprising nucleic acid molecules disclosed herein.
  • the disclosure also provides cells comprising any one of the nucleic acid molecules or plasmids disclosed herein.
  • the disclosure provides methods of producing a recombinant expression system comprising contacting a C. glutamicum cell with any one of the nucleic acid molecules or plasmids disclosed herein.
  • the disclosure also provides recombinant expression systems produced by the method of contacting a C. glutamicum cell with any one of the nucleic acid molecules or plasmids disclosed herein.
  • the disclosure provides methods of expressing Factor C serine protease zymogen
  • the disclosure provides isolated, purified protein molecules, wherein the amino acid sequence is at least 75% identical to any one of SEQ ID NO: 129-256.
  • the disclosure provides kits for detecting a pyrogen or endotoxin in a sample comprising recombinant Factor C serine protease zymogen, recombinant Factor B serine protease zymogen, recombinant Proclotting Enzyme serine protease zymogen, and recombinant Coagulogen clotting protein expressed in C. glutamicum.
  • the amino acid sequence of the recombinant Factor C serine protease zymogen is at least 75% identical to any one of SEQ ID NO: 257 or SEQ ID NO: 258. In some embodiments, the amino acid sequence of the recombinant Factor B serine protease zymogen is at least 75% identical to any one of SEQ ID NO: 259 or SEQ ID NO: 260. In some embodiments, the amino acid sequence of the recombinant Proclotting Enzyme serine protease zymogen is at least 75% identical to SEQ ID NO: 261. In some embodiments, the amino acid sequence of the recombinant Coagulogen clotting protein is at least 75% identical to any one of SEQ ID NO: 262
  • the disclosure provides methods for detecting a pyrogen or endotoxin in a sample, comprising contacting the sample with an isolated, purified protein molecule wherein the amino acid sequence is at least 75% identical to any one of SEQ ID NO: 129-256.
  • the disclosure provides methods of detecting a pyrogen or endotoxin in a sample, comprising contacting the sample with the components of the kits disclosed herein.
  • FIG. 1 depicts the coagulation cascade of the present disclosure based on the coagulation cascade in the horseshoe crab amoebocyte lysate.
  • FIGS. 2A-2B depict the expression cassettes of the present disclosure.
  • FIG. 2A shows expression cassettes comprising a promoter, a signal sequence, a gene of interest, and a termination sequence, and optionally a polypeptide tag.
  • FIG. 2B shows exemplary expression cassettes according to the present invention.
  • Expression cassette number 4 (SEQ ID NO: 322) comprises the Pcg1514 promoter (SEQ ID NO: 9), the cg1514 signal sequence (SEQ ID NO: 14),
  • Expression cassette number 5 (SEQ ID NO: 323) comprises the Pcg1514 promoter (SEQ ID NO: 9), the cg1514 signal sequence (SEQ ID NO: 14), polyhistidine-tag (SEQ ID NO: 26), Factor C gene from T. tridentatus (SEQ ID NO: 325), and the rrnB terminator (SEQ ID NO: 272).
  • Expression cassette number 6 (SEQ ID NO: 324) comprises the Pcg1514 promoter (SEQ ID NO: 9), the cg1514 signal sequence (SEQ ID NO: 14), Factor C gene from T. tridentatus (SEQ ID NO: 325), polyhistidine-tag (SEQ ID NO: 26), and the rrnB terminator (SEQ ID NO: 272).
  • FIGS. 3A-3B show expression of the plasmids containing expression cassettes in
  • FIG. 3A depicts microscopy images showing untransformed gram-positive B. cereus and untransformed gram-negative E. coli (top left), untransformed, gram-positive C. glutamicum (top middle), C. glutamicum transformed with empty plasmid (top right), and C. glutamicum transformed with plasmids comprising expression cassette number 4 of FIG. 2B (SEQ ID NO: 322, bottom left), expression cassette number 6 of FIG. 2B
  • FIG. 3B depicts gel electrophoresis showing the molecular weight of plasmids containing the expression cassettes of the present disclosure.
  • lane 1 shows C. glutamicum as a negative control
  • lane 2 shows C. glutamicum expressing the pEC-pk18mob2 empty plasmid as a positive control
  • lane 3 shows C. glutamicum expressing the plasmid comprising the expression cassette number 4 of FIG. 2B (SEQ ID NO: 322)
  • lane 4 shows C. glutamicum expressing the plasmid comprising the expression cassette number 6 of FIG. 2B (SEQ ID NO: 324)
  • lane 5 shows C. glutamicum expressing the plasmid comprising the expression cassette number 5 of FIG. 2B (SEQ ID NO: 323).
  • FIG. 2 show C. glutamicum expressing the pEC-pk18mob2 empty plasmid as a negative control
  • lanes 3 and 4 show C. glutamicum expressing the plasmid comprising the expression cassette number 4 of FIG. 2B (SEQ ID NO: 322).
  • the limulus amoebocyte lysate (LAL) test method is a standard pyrogen assay employed by a variety of industries to ensure that samples are free of harmful endotoxins and pyrogens.
  • the U.S. Food and Drug Administration (FDA) approved the LAL assay for testing drugs, products, and devices, and the assay is widely used to test ingredients of pharmaceuticals during manufacturing.
  • the LAL assay is based on a coagulation cascade involving reagents harvested from the hemolymph of wild-caught horseshoe crab. Specifically, exposure of endotoxin to the serine protease zymogen Factor C initiates a cascade that activates the serine protease zymogen
  • the disclosure provides nucleic acid molecules (comprising expression cassettes) and plasmids for producing the lysate reagents Factor C, Factor B, Proclotting Enzyme, and Coagulogen, wherein the nucleic acid molecules and plasmids are optimized for expression in the generally regarded as safe (GRAS) actinobacterium Corynebacterium glutamicum (C. glutamicum).
  • the disclosure also provides kits for detecting a pyrogen or endotoxin in a sample comprising recombinant lysate reagents, methods of producing a recombinant expression system using the nucleic acid molecules and plasmids disclosed herein, and methods for detecting pyrogen or endotoxin in a sample.
  • nucleic acid molecules, plasmids, isolated, purified protein molecules, kits, and methods of the present disclosure may be used to test for contamination in a variety of industries, including pharmaceuticals (both preclinical studies and clinical applications) and biotechnologies, and settings, including healthcare providers, veterinary clinics, agriculture, food processing and service, wineries, breweries, distilleries, military, and direct-to-consumer.
  • nucleic acid molecules, plasmids, isolated, purified protein molecules, kits, and methods of the present disclosure may be used in the agriculture, food service, food processing, winery, brewery, or distillery industries to test for contamination at any point along the logistical supply chain.
  • nucleic acid molecules, plasmids, isolated, purified protein molecules, kits, and methods of the present disclosure may be used in the healthcare provider, veterinary clinic, military, and direct-to-consumer industries to test for contamination and institute organizational processes and conditions to sanitize frequently touched objects and surfaces and prevent infection.
  • the term “expression cassette” refers to a nucleic acid component of vector DNA comprising one or more transcriptional control elements (e.g., promoters, enhancers, and/or regulatory sequences and polyadenylation sequences) that direct gene expression of a sequence encoding a protein and/or polypeptide, e.g., a linear nucleic acid sequence encoding one or more transgenes that are expressed by one or more cell types.
  • “expression vector” and “plasmid” are terms of the art understood by skilled persons and refer to synthetic DNA molecules used to carry foreign genetic material into a cell.
  • “recombinant DNA” is a term of the art understood by skilled persons and refers to DNA formed by combining two or more DNA molecules from two or more different sources.
  • “recombinant protein” is a term of the art understood by skilled persons and refers to protein encoded by recombinant DNA.
  • “recombinant” is a term of the art understood by skilled persons and refers to recombined DNA, e.g., recombinant DNA, and/or artificially produced protein, e.g., recombinant protein.
  • the term “recombinant expression system” refers to a system for expressing recombinant protein in cells by transfecting cells with a DNA vector, expression vector, or plasmid.
  • “expression” is a term of the art understood by skilled persons and refers to production of large amounts of recombinant DNA and/or recombinant protein by manipulation of the genetic material.
  • “optimized expression” or “optimized for expression” refer to adaptation of some or all of nucleic acid molecules, including synthetic DNA molecules, recombinant DNA, and/or DNA vector, to the host organism to optimize synthesis and/or production of recombinant proteins. Optimization for expression may include optimizing GC content and noncoding DNA elements. Optimization for expression may include optimization based on highly expressed genes (HEG) wherein the codon usage of predicted highly expressed genes from 150 bacterial genomes under translational selection determines codon usage.
  • Optimization for expression may also include determination of codon usage based on ribosomal protein genes (RPG) or tRNA gene copy number (tRNA).
  • Optimization for expression may include general optimization based on the C. glutamicum codon usage table generated from 9,019 coding sequences representing 2,866,198 codons. Optimization for expression may include optimization based on the software OptimWiz, a proprietary codon optimization analysis tool, which may optimize for expression by modifying GC-content, mRNA secondary structure, Shine-Dalgarno sequence, RNA instability motifs, repetitive sequences, internal splice sites, and restriction enzyme recognition sites.
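  • As a non-limiting illustration of two quantities named above, GC content and a codon usage table, the following Python sketch computes both from plain DNA strings. The function names and the toy input are assumptions made for this sketch; it does not reproduce the HEG, RPG, tRNA, general, or OptimWiz methods, and it is not the procedure used to build the 9,019-sequence table.

```python
from collections import Counter
from typing import Dict, Iterable


def gc_content(seq: str) -> float:
    """Fraction of bases in seq that are G or C."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence is empty")
    return (seq.count("G") + seq.count("C")) / len(seq)


def codon_usage(coding_sequences: Iterable[str]) -> Dict[str, float]:
    """Relative frequency of each codon across in-frame coding sequences.

    Trailing bases that do not complete a codon are ignored.
    """
    counts: Counter = Counter()
    for cds in coding_sequences:
        cds = cds.upper()
        usable = len(cds) - len(cds) % 3
        for i in range(0, usable, 3):
            counts[cds[i:i + 3]] += 1
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no complete codons found")
    return {codon: n / total for codon, n in counts.items()}


if __name__ == "__main__":
    toy = ["ATGAAAAAG", "ATGAAA"]  # toy input, not real C. glutamicum genes
    print(gc_content("".join(toy)))
    print(codon_usage(toy))
```

  • A usage table of this kind is the input a codon-replacement step would consult when choosing among synonymous codons for the host.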
  • Exemplary optimization for expression of the present invention includes replacing nucleic acids of a sequence encoding a cascade reagent protein, a sequence encoding a polypeptide protein tag, and/or sequences encoding linkers with nucleic acids encoding codons based on the HEG, RPG, tRNA, general, or OptimWiz optimization methods.
  • “optimized expression” or “optimized for expression” may also refer to polypeptides or proteins encoded by nucleic acid sequences that have been optimized for expression, i.e., optimization of the coding sequence that codes for the sequence of amino acids in a protein.
  • As used herein, the term “nucleic acid”, “nucleic acid molecule”, or “nucleotide” refers to a sequence of more than one nucleotide base monomer, for example deoxyribonucleic acid (DNA) and ribonucleic acid (RNA), in a single chain, including naturally occurring and non-naturally occurring nucleotides.
  • the term “nucleotide” refers to conventional nucleotide bases, e.g., the purine and pyrimidine bases adenine (A), guanine (G), thymine (T), cytosine (C), and uracil (U).
  • a nucleic acid will generally contain sugars and phosphates connected in an alternating chain through phosphodiester linkages.
  • Nucleic acid sequences may encode polypeptides or may include sequences regulating transcription (e.g., promoters and terminators).
  • the term “polypeptide” refers to a continuous, unbranched chain of amino acids linked by peptide bonds. Amino acids incorporated into peptides are known as residues, and the term “amino acid sequence” refers to a sequence of amino acids, including naturally occurring and non-naturally occurring amino acids. Longer polypeptides are known as proteins, and the term “protein tag” is used to refer to a shorter polypeptide. Generally, polypeptides have an N-terminus, also known as the N-terminal end or amine-terminus, and a C-terminus, also known as the C-terminal end, carboxyl-terminus, or carboxy-terminus. Polypeptides may be fused to other polypeptides by combining the genes or parts of genes that encode them to produce recombinant DNA that encodes a recombinant fusion protein.
  • a protein may be fused N-terminally or C-terminally to another protein tag or domain. Fusion of a protein tag to the N-terminus of a protein results in an N-terminally tagged protein, and fusion of a protein tag to the
  • Linker sequences may encode cleavable polypeptides, which can be cleaved upon exposure to enzymes, chemical reagents, or irradiation, or non-cleavable polypeptides, including flexible polypeptide linkers composed of glycine and serine known as GS linkers, for example (Gly-Gly-Gly-Gly-Ser)n, or rigid linkers, for example proline-rich or α-helical linkers.
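  • As a non-limiting illustration, the repeat notation (Gly-Gly-Gly-Gly-Ser)n can be expanded into a one-letter amino acid sequence with a few lines of Python. The function name gs_linker and its parameters are hypothetical; only the Gly-Gly-Gly-Gly-Ser unit comes from the text above.

```python
from typing import Sequence

# One-letter codes for the two residues used in GS linkers.
THREE_TO_ONE = {"Gly": "G", "Ser": "S"}


def gs_linker(n: int,
              unit: Sequence[str] = ("Gly", "Gly", "Gly", "Gly", "Ser")) -> str:
    """Expand (unit)n into a one-letter amino acid sequence."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return "".join(THREE_TO_ONE[residue] for residue in unit) * n


if __name__ == "__main__":
    print(gs_linker(3))  # three repeats of the Gly-Gly-Gly-Gly-Ser unit
```

  • The repeat count n sets the linker length in steps of five residues, which is the usual handle for tuning the spacing between a tag and the protein it is fused to.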
  • “streptavidin-binding peptide” and “SBP” may include the 38-amino acid sequence or
  • to determine percent identity, sequences are aligned for optimal comparison purposes, and the nucleotides or amino acid residues at corresponding nucleotide positions or amino acid positions are then compared.
  • Molecules are identical at a position when a position in the first sequence is occupied by the same nucleotide or amino acid residue as the corresponding position in the second sequence.
  • the percent identity between the two sequences is a function of the number of identical positions shared by the sequences.
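  • As a non-limiting illustration, percent identity between two already-aligned sequences can be computed as below. The function names, the use of "-" as the gap character, and the choice of the full alignment length as the denominator are assumptions made for this sketch; the text above states only that percent identity is a function of the number of identical positions, and alignment tools differ in how they treat gaps.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of alignment columns where both sequences carry the same residue.

    Both inputs must already be aligned (equal length, gaps marked with `gap`).
    Columns containing a gap never count as identical.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    identical = sum(
        1 for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if a == b and a != gap
    )
    return 100.0 * identical / len(aligned_a)


def meets_threshold(aligned_a: str, aligned_b: str, minimum: float) -> bool:
    """True when the pair is at least `minimum` percent identical."""
    return percent_identity(aligned_a, aligned_b) >= minimum
```

  • A threshold check such as meets_threshold(a, b, 90.0) mirrors the "at least 90% identical" language used for the nucleic acid sequences, and 75.0 mirrors the language used for the amino acid sequences.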
  • the term “homolog” refers to a protein that has a common ancestor, and may include proteins that exhibit sequence homology, i.e., the proteins share sequence similarity.
  • the term “promoter” refers to one or more DNA sequences that regulate expression of operably linked nucleic acid sequences by facilitating binding of proteins
  • the transcription start site is the location where transcription starts at the 5’-end of the operably linked nucleic acid sequence, and the promoter generally includes consensus sequences, such as a TATA box, near the transcription start site.
  • “terminator” or “termination sequence” refer to one or more DNA sequences that regulate expression of operably linked nucleic acid sequences by facilitating termination of transcription of RNA from the DNA upstream of the terminator.
  • the termination sequence is downstream of a stop codon that signals termination of translation of the protein translated from the RNA transcribed from the DNA upstream of the stop codon.
  • the term “transgene” refers to a gene transferred from one organism to another, i.e., an exogenous nucleic acid sequence encoding a polypeptide to be expressed in a cell.
  • a transgene contains a promoter, a protein coding sequence, and a termination sequence.
  • the term “gene of interest” refers to the nucleic acid sequence encoding a protein, i.e., a protein coding sequence.
  • Exemplary genes of interest of the present invention include nucleic acid sequences encoding clotting proteins, Factor C serine proteases, Factor B serine proteases,
  • the term “signal sequence” refers to a nucleic acid sequence encoding a short peptide present at the terminus of most proteins destined for secretion via the cellular secretory pathway.
  • the term “signal peptide” refers to the polypeptide encoded by the signal sequence, and is generally present at the N-terminus of secreted proteins.
  • the term “secretory gene” refers to genes encoding proteins destined for secretion via the cellular secretory pathway.
  • pyrogen and “endotoxin” are used interchangeably and refer to causative agents responsible for biological effects incidental to therapy administered parenterally, i.e. therapies administered to the body other than through the mouth and alimentary canal.
  • Parenteral therapies including injection (e.g., subcutaneous injection, intraperitoneal injection, intrathecal injection, etc.), allow pyrogens or endotoxins to bypass the normal body defenses.
  • the host’s response to pyrogens or endotoxins include fever, shock, and other physiological responses. While the terms pyrogen and endotoxin are used interchangeably herein, not all pyrogens are endotoxins.
  • amino acid refers to naturally occurring and non-naturally occurring or synthetic amino acids. Naturally occurring, levorotatory (L-) amino acids and their abbreviations (three-letter code and one-letter code) are shown in Table 1.
  • the disclosure provides nucleic acid sequences comprising one or more expression cassettes optimized for expression in C. glutamicum.
  • the expression cassette comprises, from 5’ to 3’, a promoter, a signal sequence, and a sequence encoding a cascade reagent protein.
  • the cascade reagent protein is Factor C.
  • the cascade reagent protein is Factor B.
  • the cascade reagent protein is Proclotting Enzyme.
  • the cascade reagent protein is Coagulogen.
  • the expression cassette comprises a termination sequence, a sequence encoding a polypeptide protein tag, and/or a sequence encoding a linker.
  • the expression cassette comprises a nucleic acid sequence having at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least
  • Promoters or promoter sequences are sequences of DNA to which transcription factors bind, thereby initiating transcription of RNA from the DNA downstream of the promoter.
  • Promoters are located upstream, or toward the 5’ region of the sense strand, of the transcription start site and may include consensus sequences such as TATAAT or TTGACA. Promoters drive expression of DNA, e.g., genes or transgenes, downstream of the promoter. RNA molecules transcribed from operably linked DNA sequences adjacent to promoters may encode a protein.
  • RNA sequence helps recruit the ribosome to the messenger RNA (mRNA) to initiate protein synthesis by aligning the ribosome with the start codon and may include consensus sequences such as the Shine-Dalgarno sequence, e.g., AGGAGGU or GAGG.
  • tRNA may add amino acids in sequence as dictated by the codons, moving downstream from the translational start site.
  • the expression cassettes of the disclosure may comprise a promoter.
  • the promoter drives expression of a signal sequence and a sequence encoding a cascade reagent protein.
  • the promoter comprises a nucleic acid sequence derived from a promoter of a secretory gene.
  • the promoter comprises a nucleic acid sequence derived from a promoter of a C. glutamicum secretory gene, for example the promoters listed in Table 2.
  • the promoter may comprise a nucleic acid sequence having at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identity to the nucleic acid sequences of SEQ ID NO:
  • Signal sequences are sequences of DNA encoding a signal peptide. Signal sequences may be referred to as localization signals, localization sequences, leader sequences, or targeting signals and a signal peptide may be referred to as a transit peptide or leader peptide.
  • Signal peptides are short peptides that prompt a cell to translocate the protein, and are often present at the N-terminus of proteins destined for secretion, which may include translocation to certain organelles, secretion from the cell, or insertion into cellular membranes.
  • the expression cassettes of the disclosure may comprise a signal sequence.
  • the signal sequence may encode a signal peptide.
  • the signal sequence is located between the promoter and the sequence encoding the cascade reagent protein.
  • the signal sequence is located between the promoter and the sequence encoding a polypeptide protein tag.
  • the signal sequence comprises a nucleic acid sequence derived from a signal sequence of a secretory gene.
  • the signal sequence comprises a nucleic acid sequence derived from a signal sequence of a C. glutamicum secretory gene, for example the signal sequences listed in Table 3.
  • the core of the signal peptide may comprise a sequence of hydrophobic amino acids.
  • the sequence of hydrophobic amino acids may be about 5 to 16 residues in length, for example 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, or 16 residues in length.
  • the signal peptide may comprise a short positively charged sequence of amino acids at the N-terminus.
  • the signal peptide may comprise a sequence of amino acids recognized and cleaved by signal peptidases.
  • the signal sequence may comprise a nucleic acid sequence having at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least
  • the signal sequence may encode an amino acid sequence having at least 60%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identity to the amino acid sequences of SEQ ID NO: 302-306.
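The signal peptide features described above (a short positively charged N-terminal region and a hydrophobic core of about 5 to 16 residues) can be illustrated with a minimal Python sketch. The hydrophobic residue set, the size of the N-terminal window, and the example peptide are assumptions chosen for illustration only; they are not taken from the disclosure, and the example peptide is not one of the listed SEQ ID NOs.

```python
# Assumed residue classes (illustrative, not from the disclosure).
HYDROPHOBIC = set("AVLIMFWC")
POSITIVE = set("KR")


def longest_hydrophobic_run(peptide):
    """Return (start, length) of the longest run of hydrophobic residues."""
    best_start, best_len = 0, 0
    run_start, run_len = 0, 0
    for i, aa in enumerate(peptide.upper()):
        if aa in HYDROPHOBIC:
            if run_len == 0:
                run_start = i
            run_len += 1
            if run_len > best_len:
                best_start, best_len = run_start, run_len
        else:
            run_len = 0
    return best_start, best_len


def looks_like_signal_peptide(peptide, n_region=5, core_min=5, core_max=16):
    """Heuristic check: positive residue near the N-terminus plus a
    hydrophobic core whose length falls within core_min..core_max."""
    _, core_len = longest_hydrophobic_run(peptide)
    has_positive_n = any(aa in POSITIVE for aa in peptide[:n_region].upper())
    return has_positive_n and core_min <= core_len <= core_max


# Hypothetical peptide used only to exercise the functions.
example = "MKRLLALAVLAGASA"
print(longest_hydrophobic_run(example))
print(looks_like_signal_peptide(example))
```

This is only a rough screen. It does not model the signal peptidase cleavage site, and real signal peptide cores often tolerate occasional non-hydrophobic residues that this strict run-based check would reject.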
  • a sequence encoding a cascade reagent protein is a sequence of DNA encoding any one of the cascade reagent proteins of the LAL assay disclosed herein.
  • a protein encoded by this sequence may also be referred to as a recombinant cascade reagent (RCR), and may include any one of three recombinant protease zymogens, namely Factor C, Factor B, and Proclotting Enzyme, and a clotting protein, namely Coagulogen.
  • the expression cassettes of the disclosure may comprise a sequence encoding a cascade reagent protein.
  • the sequence encoding a cascade reagent protein may be isolated or derived from the genome of one of any horseshoe crab, for example Tachypleus tridentatus,
  • the sequence encoding a cascade reagent protein may be optimized for expression in C. glutamicum, e.g., the sequences listed in Table 4.
  • optimization for expression in C. glutamicum may include HEG, RPG, tRNA, general, or OptimWiz optimization and/or optimizing GC content of the DNA sequence.
  • sequence encoding a cascade reagent protein may be truncated or mutated from the wild type sequence.
  • the sequence encoding a cascade reagent protein may encode a recombinant protein with activity higher than, lower than, or equivalent to that of the wild type protein.
  • the sequence encoding a cascade reagent protein may encode a cascade reagent protein homolog.
  • the sequence encoding a cascade reagent protein may encode the Factor C serine protease zymogen.
  • the sequence encoding the cascade reagent protein Factor C may comprise a nucleic acid sequence having at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least
  • SEQ ID NO: 278-283 or SEQ ID NO: 325.
  • the sequence encoding a cascade reagent protein may encode the Factor B serine protease zymogen and homologs thereof, e.g., C3 and C2/Bf.
  • the sequence encoding the cascade reagent protein Factor B may comprise a nucleic acid sequence having at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least
  • the sequence encoding a cascade reagent protein may encode the Proclotting Enzyme serine protease zymogen.
  • the sequence encoding the cascade reagent protein Proclotting Enzyme may comprise a nucleic acid sequence having at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least
  • sequence encoding a cascade reagent protein may encode the Coagulogen clotting protein.
  • sequence encoding the cascade reagent protein Coagulogen may comprise a nucleic acid sequence having at least 70%, at least
  • a termination sequence, terminator, or transcription terminator is a sequence of
  • Prokaryotic transcription terminators of the present disclosure may be Rho-dependent or Rho-independent. Transcription terminators may comprise a downstream transcription stop point sequence and/or a GC-rich region of dyad symmetry followed by a poly-A sequence to promote allosteric dissociation of the transcriptional complex and/or hairpin loop formation of the transcribed mRNA and subsequent transcription termination.
  • the expression cassettes of the disclosure may comprise a termination sequence.
  • the termination sequence may be isolated or derived from the genome of one of any suitable organism, for example Escherichia coli (E. coli) or C. glutamicum.
  • the termination sequence may comprise the termination region of the E. coli rrnB gene, the termination region of the C. glutamicum cg1502 gene, the termination region of the C. glutamicum cg3011 gene, the termination region of the C. glutamicum cspA gene, or the termination region of the C. glutamicum cg1338 gene.
  • the termination sequence may be wild type or optimized for expression in C. glutamicum, e.g., the sequences listed in Table 5.
  • optimization for expression in C. glutamicum may include replacing nucleotides of the wild type termination sequence to optimize GC content for expression in C. glutamicum.
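As an illustration of the GC-content aspect of optimization mentioned above, the following Python sketch computes the GC fraction of a DNA sequence and its deviation from a target value. The function names and the target value are hypothetical; the disclosure does not specify a target GC content here, so the caller must supply one.

```python
def gc_content(seq):
    """Return the fraction of G and C bases in a DNA sequence (0.0 to 1.0)."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    gc = sum(1 for base in seq if base in "GC")
    return gc / len(seq)


def gc_deviation(seq, target):
    """Return how far the sequence's GC fraction is from a caller-supplied
    target (positive means the sequence is more GC-rich than the target)."""
    return gc_content(seq) - target


# Placeholder sequence and target, for illustration only.
print(gc_content("ATGCGCGTTA"))
print(gc_deviation("ATGCGCGTTA", target=0.5))
```

A candidate substitution in a termination sequence could be scored by comparing `gc_deviation` before and after the change, although any such change would also need to preserve the terminator's secondary structure.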
  • the termination sequence may comprise the wild type rrnB termination sequence from E. coli. In some embodiments, the termination sequence may comprise the sequence of SEQ ID NO: 272, or a sequence at least 70%, at least 75%, at least 80%, at least
  • the termination sequence may comprise the rrnB termination sequence from E. coli optimized for expression in C. glutamicum.
  • the termination sequence may comprise the sequence of SEQ ID NO: 273, or a sequence at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identical thereto.
  • the termination sequence may comprise the wild type cg1502 termination sequence from C. glutamicum. In some embodiments, the termination sequence may comprise the sequence of SEQ ID NO: 274, or a sequence at least 70%, at least
  • the termination sequence may comprise the wild type cg3011 termination sequence from C. glutamicum. In some embodiments, the termination sequence may comprise the sequence of SEQ ID NO: 275, or a sequence at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identical thereto. In some embodiments, the termination sequence may comprise the wild type cspA termination sequence from C. glutamicum.
  • the termination sequence may comprise the sequence of SEQ ID NO: 276, or a sequence at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identical thereto.
  • the termination sequence may comprise the wild type cg1338 termination sequence from C. glutamicum.
  • the termination sequence may comprise the sequence of SEQ ID NO: 277, or a sequence at least
  • a sequence encoding a polypeptide protein tag is a sequence of DNA encoding a peptide sequence, protein tag, or polypeptide protein tag.
  • a sequence encoding a polypeptide protein tag may be fused, appended, or grafted to a sequence encoding a protein, generally at either the C-terminus or N-terminus, or at both the C-terminus and the N-terminus of the protein. Less frequently a sequence encoding a polypeptide protein tag may be inserted into the sequence encoding a protein.
  • a polypeptide protein tag may be appended to a protein to aid in affinity purification from biological lysate, enhance resolution of chromatographic separation, and/or promote solubilization and proper folding of proteins prone to precipitation.
  • Polypeptide protein tags may comprise polyanionic amino acids or epitope tags.
  • the expression cassettes of the disclosure may comprise a sequence encoding a polypeptide protein tag. In some embodiments, from 5’ to 3’ the sequence encoding a polypeptide protein tag may be located between the signal sequence and the sequence encoding the cascade reagent protein. In some embodiments, from 5’ to 3’ the sequence encoding a polypeptide protein tag may be located between the sequence encoding the cascade reagent protein and the termination sequence.
  • a linker is located between the sequence encoding a polypeptide protein tag and the sequence encoding the cascade reagent protein. In some embodiments, from 5’ to 3’ a linker is located between the sequence encoding the cascade reagent protein and the sequence encoding a polypeptide protein tag.
  • two or more sequences encoding polypeptide protein tags may be located in tandem at the 5’ end or the 3’ end of the sequence encoding the cascade reagent protein.
  • the sequence encoding the cascade reagent protein may be located between two sequences encoding polypeptide protein tags, i.e., the sequences encoding polypeptide protein tags flank the sequence encoding the cascade reagent protein.
  • sequences encoding linkers are located between the sequence encoding the cascade reagent protein and the flanking sequences encoding the polypeptide protein tags.
  • the cascade reagent protein may be N-terminally tagged with the polypeptide protein tag. In some embodiments, the cascade reagent protein may be C-terminally tagged with the polypeptide protein tag. In some embodiments, the cascade reagent protein may be N-terminally or C-terminally tagged with tandem polypeptide protein tags. In some embodiments, the cascade reagent protein may be both N-terminally and C-terminally tagged with polypeptide protein tags. In some embodiments, the two or more polypeptide protein tags are identical. In some embodiments, the two or more polypeptide protein tags are not identical. In some embodiments, cleavable, flexible, and/or rigid linkers may separate the polypeptide protein tag or tags from the cascade reagent protein.
  • sequence encoding a polypeptide protein tag may encode a peptide or protein tag, for example a polyhistidine-tag, FLAG-tag, HA-tag, calmodulin-binding peptide, streptavidin-binding peptide, glutathione S-transferase, or maltose-binding protein.
  • sequence encoding a polypeptide protein tag may encode a polyhistidine-tag, also referred to as His-tag, His6 tag, poly(His) tag, or 6His, which may be about 5-10 residues in length, for example 5, 6, 7, 8, 9, or 10 residues in length, e.g., the amino acid sequence
  • sequence encoding a polypeptide protein tag may encode a FLAG-tag, also referred to as FLAG octapeptide or FLAG epitope, which may have the amino acid sequence D and may be used in tandem and with some variation in sequence identity, e.g., the 3xFLAG peptide of amino acid sequence
  • sequence encoding a polypeptide protein tag may encode an HA-tag, also referred to as the human influenza hemagglutinin tag, which may be derived from amino acids
  • sequence encoding a polypeptide protein tag may encode a calmodulin-binding peptide, also referred to as a calmodulin-binding protein peptide tag,
  • the sequence encoding a polypeptide protein tag may encode a streptavidin-binding peptide, also referred to as an SBP or streptavidin-tag, including a 38-amino acid sequence or 8-amino acid sequences of the Strep-tag system (e.g., the Strep-tag or Strep-tag II), which may have the amino acid sequence WSHPQFEK.
  • sequence encoding a polypeptide protein tag may encode a glutathione S-transferase protein, also referred to as a GST-tag, which may be about 220 amino acids in length and may be derived from a sequence encoding a wild type glutathione S-transferase.
  • the sequence encoding a polypeptide protein tag may encode a maltose binding protein, also referred to as MBP-tag or maltose tag, which may be about 370-396 amino acids in length and may be derived from the malE gene of E. coli.
  • the sequence encoding a polypeptide protein tag may be wild type or optimized for expression in C. glutamicum, e.g., the sequences listed in Table 6.
  • optimization for expression in C. glutamicum may include HEG, RPG, tRNA, general, or OptimWiz optimization and/or optimizing GC content of the DNA sequence.
  • sequence encoding a polypeptide protein tag may comprise a nucleic acid sequence having at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identity to the nucleic acid sequences of SEQ ID NO: 19-32.
  • sequence encoding a polypeptide tag may encode an amino acid sequence having at least 60%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least
  • a sequence encoding a linker is a sequence of DNA encoding a polypeptide linker.
  • Polypeptide linkers may encode cleavable, rigid, and/or flexible polypeptides.
  • Polypeptide linkers also referred to as linkers, may link functional protein domains together or release free functional domains after cleavage.
  • Linkers may be isolated from or derived from naturally-occurring multidomain proteins, or may be designed de novo. Linkers may increase stability, promote folding, increase expression, or improve biological activity of the protein domains they are fused to.
  • linkers including length, hydrophobicity, amino acid residues, and secondary structure, may vary.
  • linkers may adopt various conformations, such as β-strand, helical, coil/bend, and turns.
  • the expression cassettes of the disclosure may comprise a sequence encoding a linker.
  • the sequence encoding a linker may encode a polypeptide about 3-
  • sequence encoding a linker may be located between a 5’ sequence encoding a polypeptide protein tag and a 3’ sequence encoding a cascade reagent protein. In some embodiments, the sequence encoding a linker may be located between a 5’ sequence encoding a cascade reagent protein and a 3’ sequence encoding a polypeptide protein tag. In some embodiments, polar uncharged or charged residues are preferable amino acids of the linker.
  • the sequence encoding a linker may encode a flexible GS linker, for example (Gly)7 or (Gly)8.
  • the sequence encoding a linker may encode a rigid α-helical linker, for example or
  • the sequence encoding a linker may encode a rigid proline-rich linker, for example PAPAP, (AP)n, (KP)n, or (EP)n, wherein n is 3-4.
  • sequence encoding a linker may encode a cleavable disulfide linker, for example LEAGCKNFFPRSFTSCGSLE, or a cleavable protease linker, for example GFLG.
  • the sequence encoding a linker may be optimized for expression in C. glutamicum, e.g., the sequences listed in Table 7.
  • optimization for expression in C. glutamicum may include HEG, RPG, tRNA, general, or
  • the sequence encoding a linker may comprise a nucleic acid sequence having at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identity to the nucleic acid sequences of
  • the sequence encoding a linker may encode an amino acid sequence having at least 60%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identity to the amino acid sequences of SEQ ID NO: 315-321.
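The percent-identity thresholds recited throughout this section (e.g., at least 70% or at least 99% identity) can be made concrete with a short Python sketch. It assumes the two sequences have already been aligned by an external tool and are supplied as equal-length strings with "-" for gaps. The identity definition used here (matching columns divided by alignment columns, ignoring columns where both sequences have a gap) is one common convention and is an assumption, since the disclosure does not state which definition applies in this passage.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity of two pre-aligned sequences of equal length.

    Gap characters are '-'. Columns where both sequences have a gap are
    ignored; a gap opposite a residue counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" and b == "-":
            continue
        columns += 1
        if a == b:
            matches += 1
    if columns == 0:
        raise ValueError("alignment has no comparable columns")
    return 100.0 * matches / columns


def meets_threshold(aligned_a, aligned_b, threshold):
    """True if identity is at least the given percentage threshold."""
    return percent_identity(aligned_a, aligned_b) >= threshold


# Placeholder aligned sequences, for illustration only.
print(percent_identity("ACGTACGTAC", "ACGTACGTAA"))
print(meets_threshold("ACGTACGTAC", "ACGTACGTAA", 95))
```

The same function works for nucleic acid and amino acid alignments, since it only compares characters column by column.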
  • the expression cassette comprises from 5’ to 3’ a promoter, a signal sequence, and a sequence encoding a cascade reagent protein.
  • the expression cassette comprises SEQ ID NO: 9 (cg1514 promoter), SEQ ID NO: 14 (cg1514 signal sequence), and SEQ ID NO: 1 (HEG optimized Factor C (T. tridentatus)), SEQ ID NO: 325
  • the expression cassette comprises SEQ ID NO: 9 (cg1514 promoter), SEQ
  • the expression cassette comprises SEQ ID NO: 4 (Factor B (C. rotundicauda)).
  • the expression cassette comprises
  • SEQ ID NO: 9 (cg1514 promoter), SEQ ID NO: 14 (cg1514 signal sequence), and SEQ ID NO: 6 (Coagulogen (L. polyphemus)), SEQ ID NO: 7 (Coagulogen (T. tridentatus)), or SEQ ID NO: 8
  • the expression cassette comprises from 5’ to 3’ a promoter, a signal sequence, a sequence encoding a cascade reagent protein, and a terminator.
  • the expression cassette comprises SEQ ID NO: 9 (cg1514 promoter), SEQ ID NO:
  • the expression cassette comprises
  • the expression cassette comprises SEQ ID NO: 9 (cg1514 promoter), SEQ ID NO: 14 (cg1514 signal sequence), SEQ ID NO: 3 (Factor B (T. tridentatus)) or SEQ ID NO: 4 (Factor B (C. rotundicauda)), and SEQ ID NO: 272, SEQ ID NO: 273, SEQ ID NO: 274, SEQ ID NO: 275,
  • SEQ ID NO: 276, or SEQ ID NO: 277 wild type or optimized E. coli rrnB termination sequences
  • the expression cassette comprises SEQ ID NO: 101 (Pcg1514-cg1514ss-
  • the expression cassette comprises SEQ ID NO: 117 rotundicauda)-rrnB terminator.
  • the expression cassette comprises SEQ ID NO: 105
  • the expression cassette comprises SEQ ID NO: 9 (cg1514 promoter), SEQ ID NO: 14 (cg1514 signal sequence), SEQ ID NO: 6 (Coagulogen (L. polyphemus)), SEQ ID NO: 7 (Coagulogen (T. tridentatus)), or SEQ ID NO: 8 (Coagulogen (C. rotundicauda)), and SEQ ID NO: 272,
  • the expression cassette comprises SEQ ID NO: 109
  • the expression cassette comprises from 5’ to 3’ a promoter, a signal sequence, a sequence encoding a polypeptide protein tag, a sequence encoding a cascade reagent protein, and a terminator. In some embodiments, the expression cassette comprises from
  • SEQ ID NO: 9 (cg1514 promoter), SEQ ID NO: 34, SEQ ID NO: 37, SEQ ID NO: 39,
  • SEQ ID NO: 41, SEQ ID NO: 43, SEQ ID NO: 45, or SEQ ID NO: 47 (cg1514ss-tag-Factor C where the tag is 6His, FLAG, HA, CBP, SBP, GST, or MBP, respectively)
  • SEQ ID NO: 272 SEQ ID NO: 41, SEQ ID NO: 43, SEQ ID NO: 45, or SEQ ID NO: 47
  • the expression cassette comprises SEQ ID NO: 98 tridentatus)-rrnB terminator), SEQ ID NO: 323 (T. tridentatus version 2)-rrnB terminator), SEQ ID NO: 114 rotundicauda)-rrnB terminator). In some embodiments, the expression cassette comprises from 5’ to 3’ SEQ ID NO:
  • the expression cassette comprises
  • the expression cassette comprises from 5’ to 3’ SEQ ID NO: 9 (cg1514 promoter),
  • the expression cassette comprises SEQ ID NO: 275, SEQ ID NO: 276, or SEQ ID NO: 277 (E. coli rrnB wild type or optimized termination sequences, C. glutamicum wild type cg1502, cg3011, cspA, or cg1338 termination sequences, respectively).
  • the expression cassette comprises SEQ ID NO:
  • the expression cassette comprises from 5’ to 3’ SEQ ID NO: 9 (cg1514 promoter),
  • SBP, GST, or MBP, respectively
  • SEQ ID NO: 272 SEQ ID NO: 273, SEQ ID NO: 274,
  • the expression cassette comprises SEQ ID NO: SEQ ID NO: 122 terminator), or SEQ ID NO: 126
  • the expression cassette comprises from 5’ to 3’ a promoter, a signal sequence, a sequence encoding a cascade reagent protein, a sequence encoding a polypeptide protein tag, and a terminator.
  • the expression cassette comprises from 5’ to 3’ SEQ ID NO: 9 (cg1514 promoter), SEQ ID NO: 35, SEQ ID NO: 38, SEQ ID NO:
  • SEQ ID NO: 40 SEQ ID NO: 42, SEQ ID NO: 44, SEQ ID NO: 46, or SEQ ID NO: tag where the tag is 6His, FLAG, HA, CBP, SBP, GST, or MBP, respectively), and SEQ ID NO:
  • SEQ ID NO: 272 SEQ ID NO: 273, SEQ ID NO: 274, SEQ ID NO: 275, SEQ ID NO: 276, or SEQ ID NO:
  • the expression cassette comprises SEQ ID NO: 99 (Pcg1514-cg1514ss-Factor C (T. tridentatus)-6His-rrnB terminator), SEQ ID NO: 324 (Pcg1514-cg1514ss-Factor C (T. tridentatus version 2)-6His-rrnB terminator), SEQ ID NO: 115 (Pcg1514-cg1514ss-Factor C terminator). In some embodiments, the expression cassette comprises from 5’ to 3’ SEQ ID NO:
  • the expression cassette comprises
  • the expression cassette comprises from 5’ to 3’ SEQ ID NO: 9 (cg1514 promoter),
  • the expression cassette comprises SEQ ID NO:
  • the expression cassette comprises from 5’ to 3’ SEQ ID NO: 9 (cg1514 promoter),
  • SBP, GST, or MBP, respectively
  • SEQ ID NO: 272 SEQ ID NO: 273, SEQ ID NO: 274,
  • the expression cassette comprises SEQ ID NO: 275, SEQ ID NO: 276, or SEQ ID NO: 277 (E. coli rrnB wild type or optimized termination sequences, wild type or cg1338 termination sequences, respectively).
  • the expression cassette comprises SEQ ID NO:
  • the expression cassette comprises from 5’ to 3’ a promoter, a signal sequence, a sequence encoding a polypeptide protein tag, a sequence encoding a cascade reagent protein, a sequence encoding a polypeptide protein tag, and a terminator.
  • the expression cassette comprises from 5’ to 3’ SEQ ID NO: 9 (cg1514 promoter),
  • SEQ ID NO: 36 (cg1514ss-6His-Factor C-6His), and SEQ ID NO: 272, SEQ ID NO: 273, SEQ ID NO:
  • the expression cassette comprises
  • the expression cassette comprises from 5’ to 3’ SEQ ID NO: 9 (cg1514 promoter), SEQ ID NO: 52 (cg1514ss-6His-Factor B-6His), and SEQ ID NO: 272, SEQ ID NO:
  • the expression cassette comprises SEQ ID NO: 104 (T. tridentatus)-6His-rrnB terminator) or SEQ ID NO: 120 (C. rotundicauda)-6His-rrnB terminator). In some embodiments, the expression cassette comprises from 5’ to 3’ SEQ ID NO:
  • SEQ ID NO: 272 SEQ ID NO: 273, SEQ ID NO: 274, SEQ ID NO: 275, SEQ ID NO: 276, or SEQ ID NO:
  • the expression cassette comprises SEQ ID NO: 108 -Proclotting Enzyme (T. . In some embodiments, the expression cassette comprises from
  • SEQ ID NO: 9 (cg1514 promoter), SEQ ID NO: 84 (cg1514ss-6His-Coagulogen-6His), and SEQ ID NO: 272, SEQ ID NO: 273, SEQ ID NO: 274, SEQ ID NO: 275, SEQ ID NO: 276, or SEQ ID NO: 277 (E. coli rrnB wild type or optimized termination sequences, C. glutamicum wild type cg1502, cg3011, cspA, or cg1338 termination sequences, respectively).
  • the expression cassette comprises SEQ ID NO: 112
  • the expression cassette comprises from 5’ to 3’ a promoter, a signal sequence, a sequence encoding a cascade reagent protein, and a terminator. In some embodiments, the expression cassette comprises from 5’ to 3’ SEQ ID NO: 9 (cg1514 promoter),
  • the expression cassette comprises from 5’ to 3’
  • SEQ ID NO: 9 (cg1514 promoter), SEQ ID NO: 49 and SEQ ID NO: 272,
  • the expression cassette comprises from 5’ to 3’ SEQ ID NO: 9 (cg1514 promoter), SEQ ID NO: 65
  • the expression cassette comprises from 5’ to 3’ SEQ ID NO: 9 (cg1514 promoter), SEQ ID NO: 81 and SEQ ID NO: 272, SEQ ID NO:
  • SEQ ID NO: 273, SEQ ID NO: 274, SEQ ID NO: 275, SEQ ID NO: 276, or SEQ ID NO: 277 (E. coli rrnB wild type or optimized termination sequences, C. glutamicum wild type cg1502, cg3011, cspA, or cg1338 termination sequences, respectively).
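The 5’ to 3’ cassette layouts enumerated above (promoter, signal sequence, optional tag and linker on either side of the cascade reagent protein, optional terminator) can be sketched in Python as follows. The function name, parameter names, and the three-letter placeholder strings are hypothetical; real cassettes would use the sequences of the referenced SEQ ID NOs, and this sketch does not check reading frame or stop codon placement.

```python
def assemble_cassette(promoter, signal_sequence, cascade_protein,
                      terminator=None, n_tag=None, n_linker=None,
                      c_linker=None, c_tag=None):
    """Return (layout, sequence) for a cassette assembled 5' to 3'.

    Optional parts are skipped when not supplied. A linker is only
    accepted together with the tag it would separate from the protein.
    """
    if n_linker and not n_tag:
        raise ValueError("an N-terminal linker requires an N-terminal tag")
    if c_linker and not c_tag:
        raise ValueError("a C-terminal linker requires a C-terminal tag")

    ordered = [
        ("promoter", promoter),
        ("signal_sequence", signal_sequence),
        ("n_tag", n_tag),
        ("n_linker", n_linker),
        ("cascade_protein", cascade_protein),
        ("c_linker", c_linker),
        ("c_tag", c_tag),
        ("terminator", terminator),
    ]
    present = [(label, seq) for label, seq in ordered if seq]
    layout = [label for label, _ in present]
    sequence = "".join(seq for _, seq in present)
    return layout, sequence


# Placeholder strings stand in for real part sequences.
layout, sequence = assemble_cassette(
    "AAA", "CCC", "GGG", terminator="TTT", n_tag="CAT"
)
print(layout)
print(sequence)
```

Passing both `n_tag` and `c_tag` gives the flanking-tag arrangement, and omitting `terminator` gives the three-part cassette.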
  • the disclosure provides methods of recombinant protein expression.
  • the expression cassette is cloned into a plasmid.
  • the expression cassette may be cloned into a multiple cloning site of a plasmid using restriction enzyme cloning, Gateway cloning, or TOPO cloning.
  • the expression cassette may be Gibson assembled into a plasmid.
  • the expression cassette may be inserted into a plasmid using a combination of restriction enzyme cloning, Gateway cloning, TOPO cloning, and/or Gibson assembly.
  • nucleic acid sequences may comprise restriction enzyme recognition sites and/or recombination sequences to facilitate cloning.
  • restriction enzyme recognition sites and/or recombination sequences may be located at the 5’ and/or 3’ ends of: the promoter, the signal sequence, the sequence encoding a cascade reagent protein, the termination sequence, the sequence encoding a polypeptide tag, and the sequence encoding a linker.
  • restriction enzyme recognition sites and/or recombination sequences may be located at the 5’ and/or 3’ ends of two or more sequences encoding polypeptide protein tags and two or more sequences encoding linkers.
  • a plasmid may be a cloning vector, a transfer vector, a shuttle vector, or an expression vector.
  • a suitable plasmid may be a mobilizable E. coli-C. glutamicum shuttle vector.
  • a suitable plasmid may be the pEC-pk18mob2 plasmid.
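When restriction enzyme cloning is used, it is common practice to confirm that the chosen recognition sites occur only at the intended ends of the insert and not inside it. The Python sketch below scans a sequence for a few well-known recognition sites. The choice of EcoRI, BamHI, and HindIII is an assumption made for illustration; the disclosure does not name specific enzymes in this passage. Because these three sites are palindromic, scanning one strand is sufficient.

```python
# Recognition sites for three commonly used enzymes (illustrative choice).
SITES = {
    "EcoRI": "GAATTC",
    "BamHI": "GGATCC",
    "HindIII": "AAGCTT",
}


def find_sites(seq, sites=SITES):
    """Return {enzyme: [0-based start positions]} for sites found in seq."""
    seq = seq.upper()
    hits = {}
    for name, site in sites.items():
        positions = []
        start = seq.find(site)
        while start != -1:
            positions.append(start)
            start = seq.find(site, start + 1)
        if positions:
            hits[name] = positions
    return hits


# Placeholder insert sequence, for illustration only.
print(find_sites("ATGGAATTCCCGGATCCAA"))
```

An enzyme whose site appears inside the cassette would cut the insert, so a different enzyme or another method listed above, such as Gibson assembly, might be chosen instead.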
  • the disclosure provides methods of recombinant protein purification.
  • the RCRs of the present invention may be purified from cultures of recombinant C. glutamicum cells expressing nucleic acid molecules, including expression cassettes and plasmids.
  • the expression cassette comprises a sequence encoding a polypeptide tag fused to the 5’ end or the 3’ end of the sequence encoding a cascade reagent protein.
  • the polypeptide tag may comprise a solubilization tag that facilitates proper protein folding and prevents precipitation during purification.
  • the polypeptide tag may comprise an affinity tag that facilitates affinity purification.
  • the polypeptide tag may comprise a chromatographic tag that modulates resolution during chromatographic separation.
  • the polypeptide tag may comprise an epitope tag that facilitates antibody purification.
  • the RCR may be purified from culture supernatant or cell lysate using column chromatography.
  • the culture supernatant or cell lysate may be applied to a column, the column may be washed, and bound protein may be eluted from the column.
  • additives and chelating agents e.g., EDTA, may be incorporated into buffers during purification.
  • the tagged protein binds to the column matrix and may be eluted by competitive binding, cleavage of the protein tag, or by destabilization of the interaction between the protein tag and the column matrix, e.g., by a change of pH.
  • the RCR may be purified by fast protein liquid chromatography
  • elution fractions may be assayed for protein concentration and RCR activity and concentrated to obtain higher protein concentrations.
  • the RCR is purified to apparent homogeneity.
  • the isolated, purified protein molecule is an RCR derived from T. tridentatus, e.g., serine protease zymogen or clotting protein optimized for expression in C. glutamicum.
  • the amino acid sequence of the serine protease zymogen is at least 75% identical to SEQ ID NO: 257 (Factor C), SEQ ID NO: 259 (Factor B), or SEQ ID NO:
  • amino acid sequence of the clotting protein is at least 75% identical to SEQ ID NO: 263 (Coagulogen).
  • the isolated, purified protein molecule is an RCR derived from C. rotundicauda, including homologs thereof, e.g., the Factor B homologs C3 and C2/Bf.
  • the isolated, purified protein molecule is a serine protease zymogen or clotting protein optimized for expression in C. glutamicum.
  • the amino acid sequence of the serine protease zymogen is at least 75% identical to SEQ ID NO: 258 (Factor C) or SEQ ID NO: 260 (Factor B).
  • the amino acid sequence of the clotting protein is at least 75% identical to SEQ ID NO: 264 (Coagulogen).
  • the isolated, purified protein molecule is an N-terminal signal peptide fused to an RCR derived from T. tridentatus, e.g., a serine protease zymogen or clotting protein and optimized for expression in C. glutamicum.
  • the amino acid sequence of the RCR is at least 75% identical to SEQ ID NO: 129 (cg1514ss-Factor C), SEQ
  • the isolated, purified protein molecule is an N-terminal signal peptide fused to an RCR derived from C. rotundicauda or L. polyphemus, e.g., a serine protease zymogen or clotting protein, and optimized for expression in C. glutamicum.
  • the amino acid sequence of the RCR is at least 75% identical to SEQ ID NO: 193
  • the isolated, purified protein molecule is an N-terminal signal peptide fused to an N-terminally tagged RCR derived from T. tridentatus and optimized for expression in C. glutamicum. In some embodiments, the isolated, purified protein molecule is an
  • the amino acid sequence of the isolated, purified protein molecule is at least 75% identical to SEQ ID NO: 130, SEQ ID NO: 133,
  • the isolated, purified protein molecule is an N-terminal signal peptide fused to an N-terminally tagged Factor B derived from T. tridentatus optimized for expression in C. glutamicum.
  • the amino acid sequence of the isolated, purified protein molecule is at least 75% identical to SEQ ID NO: 146, SEQ ID NO: 149, SEQ ID NO: 151, SEQ
  • the isolated, purified protein molecule is an N-terminal signal peptide fused to an N-terminally tagged Proclotting Enzyme derived from T. tridentatus optimized for expression in C. glutamicum.
  • the amino acid sequence of the isolated, purified protein molecule is at least 75% identical to SEQ ID NO: 162, SEQ ID NO: 165, SEQ ID NO: 167, SEQ ID NO: 169, SEQ
  • the isolated, purified protein molecule is an N-terminal signal peptide fused to an N-terminally tagged Coagulogen derived from T. tridentatus optimized for expression in C. glutamicum.
  • the amino acid sequence of the isolated, purified protein molecule is at least 75% identical to SEQ ID NO: 178, SEQ ID NO: 181, SEQ ID NO: 183, SEQ ID NO: 185, SEQ ID NO:
  • SEQ ID NO: 189, or SEQ ID NO: 191 (cgl514ss-tag-Coagulogen where the tag is 6His, FLAG, HA, CBP, SBP, GST, or MBP, respectively).
  • the isolated, purified protein molecule is an N-terminal signal peptide fused to an N-terminally tagged RCR derived from C. rotundicauda and optimized for expression in C. glutamicum. In some embodiments, the isolated, purified protein molecule is an N-terminal signal peptide fused to an N-terminally tagged Factor C derived from C. rotundicauda and optimized for expression in C. glutamicum. In some embodiments, the amino acid sequence of the isolated, purified protein molecule is at least 75% identical to SEQ ID NO:
  • the isolated, purified protein molecule is an N-terminal signal peptide fused to an N-terminally tagged Factor B derived from C. rotundicauda and optimized for expression in C. glutamicum.
  • the amino acid sequence of the isolated, purified protein molecule is at least 75% identical to SEQ ID NO: 210, SEQ ID NO:
  • the isolated, purified protein molecule is an N-terminal signal peptide fused to an N-terminally tagged Coagulogen derived from L. polyphemus and optimized for expression in C. glutamicum.
  • the amino acid sequence of the isolated, purified protein molecule is at least 75% identical to SEQ ID NO: 226, SEQ ID NO: 229, SEQ ID NO: 231, SEQ ID NO: 233, SEQ ID NO: 235, SEQ ID NO: 237, or SEQ ID NO: 239 (cgl514ss-tag-Coagulogen where the tag is 6His, FLAG, HA, CBP, SBP, GST, or MBP, respectively).
  • the isolated, purified protein molecule is an N-terminal signal peptide fused to an N-terminally tagged Coagulogen derived from C. rotundicauda and optimized for expression in C. glutamicum.
  • the amino acid sequence of the isolated, purified protein molecule is at least 75% identical to SEQ ID NO: 242, SEQ ID NO: 245, SEQ ID NO: 247, SEQ ID NO: 249, SEQ ID NO: 251, SEQ ID NO: 253, or SEQ ID NO: 255 (cgl514ss-tag-Coagulogen where the tag is 6His, FLAG, HA, CBP, SBP, GST, or MBP, respectively).
  • the isolated, purified protein molecule is an N-terminal signal peptide fused to a C-terminally tagged RCR derived from T. tridentatus and optimized for expression in C. glutamicum. In some embodiments, the isolated, purified protein molecule is an
  • the amino acid sequence of the isolated, purified protein molecule is at least 75% identical to SEQ ID NO: 131, SEQ ID NO: 134, SEQ ID NO: 136, SEQ ID NO: 138, SEQ ID NO: 140, SEQ ID NO: 142, or SEQ ID NO: 144
  • the isolated, purified protein molecule is an N-terminal signal peptide fused to a C-terminally tagged Factor B derived from T. tridentatus and optimized for expression in C. glutamicum.
  • the amino acid sequence of the isolated, purified protein molecule is at least 75% identical to SEQ ID NO: 147, SEQ ID NO: 150, SEQ ID NO: 152, SEQ
  • the isolated, purified protein molecule is an N-terminal signal peptide fused to a C-terminally tagged Proclotting Enzyme derived from T. tridentatus and optimized for expression in C. glutamicum.
  • the amino acid sequence of the isolated, purified protein molecule is at least 75% identical to SEQ ID NO: 163, SEQ ID NO: 166, SEQ ID NO: 168, SEQ
  • the isolated, purified protein molecule is an N-terminal signal peptide fused to a C-terminally tagged Coagulogen derived from T. tridentatus and optimized for expression in C. glutamicum.
  • the amino acid sequence of the isolated, purified protein molecule is at least 75% identical to SEQ ID NO: 179, SEQ ID NO: 182, SEQ ID NO: 184, SEQ ID NO: 186, SEQ ID NO: 188, SEQ ID NO: 190, or SEQ ID NO: 192 (cgl514ss-Coagulogen-tag where the tag is 6His, FLAG, HA, CBP, SBP, GST, or MBP, respectively).
  • the isolated, purified protein molecule is an N-terminal signal peptide fused to a C-terminally tagged RCR derived from C. rotundicauda and optimized for expression in C. glutamicum. In some embodiments, the isolated, purified protein molecule is an N-terminal signal peptide fused to a C-terminally tagged Factor C derived from C. rotundicauda and optimized for expression in C. glutamicum. In some embodiments, the amino acid sequence of the isolated, purified protein molecule is at least 75% identical to SEQ ID NO: 195, SEQ ID
  • the isolated, purified protein molecule is an N-terminal signal peptide fused to a C-terminally tagged Factor B derived from C. rotundicauda and optimized for expression in C. glutamicum.
  • the amino acid sequence of the isolated, purified protein molecule is at least 75% identical to SEQ ID NO: 211, SEQ ID NO: 214, SEQ ID NO: 216, SEQ ID NO: 218, SEQ ID NO: 220, SEQ ID NO: 222, or SEQ ID NO: 224 (cgl514ss-
  • the isolated, purified protein molecule is an N-terminal signal peptide fused to a C-terminally tagged Coagulogen derived from L. polyphemus and optimized for expression in C. glutamicum.
  • the amino acid sequence of the isolated, purified protein molecule is at least 75% identical to SEQ ID NO: 227, SEQ ID NO: 230, SEQ ID NO: 232, SEQ
  • the isolated, purified protein molecule is an N-terminal signal peptide fused to a C-terminally tagged Coagulogen derived from C. rotundicauda and optimized for expression in C. glutamicum.
  • the amino acid sequence of the isolated, purified protein molecule is at least
  • SEQ ID NO: 252, SEQ ID NO: 254, or SEQ ID NO: 256 (cgl514ss-Coagulogen-tag where the tag is
  • the isolated, purified protein molecule is an N-terminal signal peptide fused to an N-terminally and C-terminally tagged RCR derived from T. tridentatus or C. rotundicauda optimized for expression in C. glutamicum.
  • the isolated, purified protein molecule is an N-terminal signal peptide fused to an N-terminally and C-terminally tagged Factor C derived from T. tridentatus or C. rotundicauda and optimized for expression in C. glutamicum.
  • the amino acid sequence of the isolated, purified protein molecule is at least 75% identical to SEQ ID NO: 132 (cgl514ss-6His-Factor C
  • the isolated, purified protein molecule is an N-terminal signal peptide fused to an N-terminally and C-terminally tagged Factor B derived from T. tridentatus or C. rotundicauda and optimized for expression in C. glutamicum.
  • the amino acid sequence of the isolated, purified protein molecule is at least 75% identical to SEQ ID NO:
  • the isolated, purified protein molecule is an N-terminal signal peptide fused to an N-terminally and C-terminally tagged Proclotting Enzyme derived from T. tridentatus and optimized for expression in C. glutamicum.
  • the amino acid sequence of the isolated, purified protein molecule is at least 75% identical to SEQ
  • the isolated, purified protein molecule is an N-terminal signal peptide fused to an N-terminally and C-terminally tagged Coagulogen derived from T. tridentatus, L. polyphemus or C. rotundicauda and optimized for expression in C. glutamicum.
  • the amino acid sequence of the isolated, purified protein molecule is at least 75% identical to SEQ ID NO: 180 (cgl514ss-6His-Coagulogen (T. tridentatus)-6His), SEQ ID NO: 228 (cgl514ss-6His-Coagulogen (L. polyphemus)-6His), or SEQ ID NO: 244 (cgl514ss-6His-Coagulogen (C. rotundicauda)-6His).
  • kits and methods for detecting a pyrogen or endotoxin in a sample comprise one or more of the RCR proteins of the present disclosure.
  • the kit comprises one or more of recombinant Factor C serine protease zymogen, recombinant Factor B serine protease zymogen, recombinant Proclotting
  • the kit comprises one or more RCR proteins expressed in C. glutamicum and purified to apparent homogeneity.
  • the disclosure provides methods for detecting a pyrogen or endotoxin in a sample, comprising contacting the sample with one or more RCR proteins expressed in C. glutamicum and purified to apparent homogeneity.
  • the method comprises contacting the sample with one or more of the components of the kit described herein, including recombinant
  • the method comprises contacting the sample with one or more of the components of the kit described herein in combination with a commercialized natural lysate reagent.
  • the method for detecting a pyrogen or endotoxin in a sample comprises the limited proteolysis of each protease zymogen in the coagulation cascade reaction of the LAL assay.
  • the method for detecting a pyrogen or endotoxin in a sample may comprise admixing one or more components of the kit with the sample, separating precipitated proteins from the sample, admixing one or more components of the kit with the remaining sample, and measuring coagulation. Measuring coagulation may include observing increased turbidity and viscosity.
  • the method further comprises centrifugation of the sample, sedimentation and separation of the sample, and/or removal of one or more layers or portions of the sample.
  • Expression cassettes of the present disclosure include nucleic acid molecules comprising a promoter, a signal sequence, a gene of interest, and a termination sequence, and may include a polypeptide tag.
  • the expression cassette comprises a promoter, a signal sequence, a gene of interest, and a termination sequence
  • the expression cassette comprises a promoter, a signal sequence, an N-terminally tagged gene of interest, and a termination sequence (FIG. 2A, number 1).
  • the expression cassette comprises a promoter, a signal sequence, a C-terminally tagged gene of interest, and a termination sequence (FIG. 2A, number 3).
  • RCR expression cassettes comprising the cgl514 promoter, the cgl514 signal sequence indicated by cgl514ss (SEQ ID NO: 14), the T. tridentatus Factor C gene optimized for expression in C. glutamicum (SEQ ID NO: 325), the E. coli rrnBT1T2 terminator sequence indicated by rrnB terminator (SEQ ID NO: 272), and optionally a polyhistidine-tag optimized for expression in C. glutamicum (SEQ ID NO: 26).
  • Three RCR expression cassettes were engineered to result in a secretory expression system based on the Cgl514 secreted protein of C. glutamicum by using the promoter (Pcgl514) and signal sequence (cgl514ss) of cgl514.
  • FIG. 2B shows schematic representations of the three RCR expression cassettes optimized for expression in C. glutamicum.
  • Expression cassette number 4 (SEQ ID NO: 322) comprises the Pcgl514 promoter (SEQ ID NO: 9), the cgl514 signal sequence (SEQ ID NO: 14),
  • Expression cassette number 5 comprises the Pcgl514 promoter (SEQ ID NO: 9), the cgl514 signal sequence (SEQ ID NO: 14), polyhistidine-tag (SEQ ID NO: 26), Factor C gene (SEQ ID NO: 325), and the rrnB terminator (SEQ ID NO: 272).
  • Expression cassette number 6 comprises the Pcgl514 promoter (SEQ ID NO: 9), the cgl514 signal sequence (SEQ ID NO: 14),
  • the three RCR expression cassettes comprise the nucleic acid sequences of SEQ ID NO: 322, SEQ ID NO: 323, and SEQ ID NO: 324 for expression of Factor C (FIG. 2B, number 4), N-terminally polyhistidine-tagged Factor C (FIG. 2B, number 5), and C-terminally polyhistidine-tagged Factor C (FIG. 2B, number 6), respectively.
  • the pEC-pK18mob2 plasmid is a mobilizable E. coli - C. glutamicum shuttle vector based on a mini-replicon encoding the repA and per functions of the medium copy number plasmid pGA1.
  • For plasmid expression confirmation, a single colony of each of the transformations was isolated from a fresh LBG plate (Luria Broth - Lennox's formulation supplemented with 0.5% glucose), inoculated in LBG broth and incubated at 30 °C with shaking at 200 revolutions per minute (RPM) for about 6 - 8 hours. A sample of each of the transformations was then inoculated into fresh LBG broth and incubated at 30 °C shaking at 200
  • FIG. 2B number 6 indicated by pK18mob2 - FC-CHis6 (FIG. 3A, bottom middle), and gram-positive C. glutamicum transformed with the Factor C expression cassette plasmid of FIG. 2B number 5, indicated by pK18mob2 - FC-NHis6 (FIG. 3A, bottom right).
  • Plasmids were isolated from bacteria using alkaline lysis, and samples were subjected to gel electrophoresis at 80 Volts for 120 minutes at room temperature on a 1% agarose gel in 1X TAB buffer. Safe DNA Gel Stain (Bioland Scientific) was used to visualize
  • Lane 1 shows no DNA present from the C. glutamicum negative control
  • lane 2 shows the expected molecular weight of the pEC-pK18mob2 empty plasmid expressed in C. glutamicum
  • lane 3 shows the expected molecular weight of the plasmid comprising expression cassette number 4 of FIG. 2B (SEQ ID NO: 322) expressed in C. glutamicum
  • lane 4 shows the expected molecular weight of the plasmid comprising expression cassette number 6 of FIG. 2B (SEQ ID NO: 324) expressed in C. glutamicum
  • lane 5 shows the expected molecular weight of the plasmid comprising expression cassette number 5 of FIG.
  • Example 3 Expression of recombinant Factor C in C. glutamicum [0122]
  • C. glutamicum expression cassette 4 of FIG. 2B (SEQ ID NO: 322; pK18mob2 - FC
  • kanamycin 50 mg/L was added to the culture medium as the sole antibiotic.
  • As a seed culture, cells were inoculated into 50 mL of semi-defined medium containing
  • the semi-defined medium consists of 0.5 g urea, 0.25 mg ZnSO4, 2.5 mg CaCl2 in BHI media.
  • the seed culture (40 mL) was inoculated into 400 mL of fresh semi-defined medium in a 1 L jar custom-built bioreactor. Throughout cultivation, the temperature was maintained at 30 °C and stirred with axial flow impeller at 300 RPM. Oxygen concentration was maximized by continual sterile air flow into the medium.
  • the pH was maintained at 7.0 by adding 10% v/v ammonium hydroxide solution (LabChem, Zelienople, PA) when the pH dropped below the set point of 7 or 37% hydrochloric acid (GTI Laboratory Supplies, Edna, Texas) when the pH rose above it.
  • a glucose solution (90 g in 150 mL BHI) was added to the culture in 90-second increments at a rate of 12.5 mL/hr.
  • the pellet was air-dried and resuspended in denaturing 8 M urea (pH 8.0), 300 mM NaCl, 50 mM NaH2PO4, 20 mM Tris-Cl, 1 mM EDTA, 10% glycerol, and 1% Triton X-100.
  • lanes 1 and 2 are duplicates of the C. glutamicum pK18mob2 negative control sample
  • lanes 3 and 4 are duplicates of the C. glutamicum pK18mob2 - FC sample.
  • Lanes 3 and 4 show expression of ~80 kDa and ~43 kDa polypeptides in the culture supernatant of C. glutamicum expressing pK18mob2 - FC, a plasmid which harbors cassette number 4 (SEQ ID NO: 322), referred to as C. glutamicum pK18mob2 - FC.
  • Lanes 1 and 2 do not show expression of ~80 kDa and ~43 kDa polypeptides in the culture supernatant of C. glutamicum expressing pK18mob2, an empty plasmid, referred to as C. glutamicum pK18mob2.
  • SDS-PAGE gel analysis with an 8% gel under denaturing conditions demonstrates expression of ~80 kDa and ~43 kDa polypeptides in the culture supernatant of C. glutamicum pK18mob2 - FC, corresponding to production of Factor C in C. glutamicum and secretion of the protein into the culture supernatant.
  • Embodiment 1 A nucleic acid molecule, comprising an expression cassette, wherein the expression cassette comprises, from 5’ to 3’: a. a promoter; b. a signal sequence; and c. a sequence encoding a cascade reagent protein.
  • Embodiment 2 The nucleic acid molecule of embodiment 1, wherein the expression cassette is optimized for expression in Corynebacterium glutamicum.
  • Embodiment 3 The nucleic acid molecule of embodiment 1 or 2, wherein the signal sequence encodes a signal peptide.
  • Embodiment 4 The nucleic acid molecule as in any one of embodiments 1-3, wherein the promoter drives expression of the signal sequence and the sequence encoding the cascade reagent protein.
  • Embodiment 5 The nucleic acid molecule as in any one of embodiments 1-4, wherein the promoter comprises a nucleic acid sequence derived from a promoter of a Corynebacterium glutamicum secretory gene.
  • Embodiment 6 The nucleic acid molecule as in any one of embodiments 1-5, wherein the signal sequence comprises a nucleic acid sequence derived from a signal sequence of a Corynebacterium glutamicum secretory gene.
  • Embodiment 7 The nucleic acid molecule as in embodiment 5 or 6, wherein the Corynebacterium glutamicum secretory gene is selected from the group consisting of the cgl514 gene, the cspA gene, the cspB gene, the CgR.0949 gene, and the porB gene.
  • Embodiment 8 The nucleic acid molecule as in any one of embodiments 1-7, wherein the sequence encoding the cascade reagent protein encodes Factor C serine protease zymogen, Factor B serine protease zymogen, Proclotting Enzyme serine protease zymogen, or Coagulogen clotting protein.
  • Embodiment 9 The nucleic acid molecule as in any one of embodiments 1-8, wherein the sequence encoding the cascade reagent protein comprises a nucleic acid sequence derived from the genome of a horseshoe crab selected from the group consisting of Tachypleus tridentatus, Limulus polyphemus, Tachypleus gigas, and
  • Embodiment 10 The nucleic acid molecule as in any one of embodiments 1-9, wherein the signal sequence is located between the promoter and the sequence encoding the cascade reagent protein.
  • Embodiment 11 The nucleic acid molecule as in any one of embodiments 1-10, wherein the expression cassette comprises a termination sequence.
  • Embodiment 12 The nucleic acid molecule of embodiment 11, wherein the termination sequence is selected from the group consisting of the termination region of the Escherichia coli rrnB gene, the termination region of the Corynebacterium glutamicum cgl502 gene, the termination region of the Corynebacterium glutamicum cg3011 gene, the termination region of the Corynebacterium glutamicum cspA gene, and the termination region of the Corynebacterium glutamicum cgl338 gene.
  • Embodiment 13 The nucleic acid molecule as in any one of embodiments 1-12 wherein the expression cassette comprises a sequence encoding a polypeptide protein tag.
  • Embodiment 14 The nucleic acid molecule of embodiment 13, wherein the polypeptide protein tag is selected from the group consisting of polyhistidine-tag, FLAG-tag, HA-tag, calmodulin-binding peptide, streptavidin-binding peptide, glutathione S-transferase, and maltose-binding protein.
  • Embodiment 15 The nucleic acid molecule of embodiment 13 or 14, wherein the sequence encoding a polypeptide protein tag is located between the signal sequence and the sequence encoding the cascade reagent protein.
  • Embodiment 16 The nucleic acid molecule of embodiment 15, wherein a sequence encoding a linker is located between the sequence encoding the polypeptide protein tag and the sequence encoding the cascade reagent protein.
  • Embodiment 17 The nucleic acid molecule of embodiment 13 or 14, wherein the sequence encoding the polypeptide protein tag is located between the sequence encoding the cascade reagent protein and the termination sequence.
  • Embodiment 18 The nucleic acid molecule of embodiment 17, wherein a sequence encoding a linker is located between the sequence encoding the cascade reagent protein and the sequence encoding the polypeptide protein tag.
  • Embodiment 19 The nucleic acid molecule as in any one of embodiments 1-12 wherein the expression cassette comprises two or more sequences encoding polypeptide protein tags.
  • Embodiment 20 The nucleic acid molecule of embodiment 19, wherein the polypeptide protein tags are selected from the group consisting of polyhistidine-tag, FLAG-tag, HA-tag, calmodulin-binding peptide, streptavidin-binding peptide, glutathione S-transferase, and maltose-binding protein.
  • Embodiment 21 The nucleic acid molecule of embodiment 19 or 20, wherein the sequence encoding the cascade reagent protein is located between two sequences encoding polypeptide protein tags.
  • Embodiment 22 The nucleic acid molecule of embodiment 21, wherein sequences encoding linkers are located between the sequence encoding the cascade reagent protein and the sequences encoding the polypeptide protein tags.
  • Embodiment 23 The nucleic acid molecule as in any one of embodiments 16, 18, or
  • linker or linkers are selected from the group consisting of flexible GS linkers, flexible glycine linkers, rigid α-helical linkers, rigid proline-rich linkers, and cleavable disulfide linkers.
  • Embodiment 24 The nucleic acid molecule of any one of embodiments 1-23, wherein the cascade reagent protein is encoded by a nucleic acid sequence of SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 278-283, or SEQ ID NO: 325 or a sequence at least 90% identical thereto.
  • Embodiment 25 The nucleic acid molecule of any one of embodiments 1-23, wherein the cascade reagent protein is encoded by a nucleic acid sequence of SEQ ID NO: 3, SEQ ID NO: 4, or SEQ ID NO: 284-289, or a sequence at least 90% identical thereto.
  • Embodiment 26 The nucleic acid molecule of any one of embodiments 1-23, wherein the cascade reagent protein is encoded by a nucleic acid sequence of SEQ
  • Embodiment 27 The nucleic acid molecule of any one of embodiments 1-23, wherein the cascade reagent protein is encoded by a nucleic acid sequence of SEQ
  • Embodiment 28 The nucleic acid molecule as in any one of embodiments 1-27, wherein the promoter is encoded by a nucleic acid sequence of any one of SEQ ID NO:
  • Embodiment 29 The nucleic acid molecule as in any one of embodiments 1-28, wherein the signal sequence is encoded by a nucleic acid sequence of any one of SEQ ID NO: 14-18 or a sequence at least 90% identical thereto.
  • Embodiment 30 The nucleic acid molecule as in any one of embodiments 13-29, wherein the polypeptide protein tag is encoded by a nucleic acid sequence of any one of SEQ ID NO: 19-32, or a sequence at least 90% identical thereto.
  • Embodiment 31 The nucleic acid molecule as in any one of embodiments 11-30, wherein the termination sequence is encoded by a nucleic acid sequence of any one of SEQ ID NO: 272-277, or a sequence at least 90% identical thereto.
  • Embodiment 32 The nucleic acid molecule as in any one of embodiments 16, 18, or
  • linker or linkers are encoded by a nucleic acid sequence of any one of SEQ ID NO: 265-271 or a sequence at least 90% identical thereto.
  • Embodiment 33 The nucleic acid molecule as in any one of embodiments 1-32, wherein the signal sequence and cascade reagent protein are encoded by a nucleic acid sequence of any one of SEQ ID NO: 33-96, or a sequence at least 90% identical thereto.
  • Embodiment 34 The nucleic acid molecule as in any one of embodiments 1-33, wherein the expression cassette comprises a nucleic acid sequence of any one of
  • Embodiment 35 A plasmid, comprising the nucleic acid molecule as in any one of embodiments 1-34.
  • Embodiment 36 A cell, comprising the nucleic acid molecule as in any one of embodiments 1-34 or the plasmid of embodiment 35.
  • Embodiment 37 A method of producing a recombinant expression system, the method comprising contacting a Corynebacterium glutamicum cell with a nucleic acid molecule as in any one of embodiments 1-34, or the plasmid of embodiment
  • Embodiment 38 A recombinant expression system produced by the method of embodiment 37.
  • Embodiment 39 A method of expressing Factor C serine protease zymogen, Factor
  • Coagulogen clotting protein comprising contacting a Corynebacterium glutamicum cell with a nucleic acid molecule as in any one of embodiments 1-34, or the plasmid of embodiment 35.
  • Embodiment 40 An isolated, purified protein molecule, wherein the amino acid sequence is at least 75% identical to any one of SEQ ID NO: 129-256.
  • Embodiment 41 A kit for detecting a pyrogen or endotoxin in a sample comprising recombinant Factor C serine protease zymogen, recombinant Factor B serine protease zymogen, recombinant Proclotting Enzyme serine protease zymogen, and recombinant Coagulogen clotting protein expressed in Corynebacterium glutamicum.
  • Embodiment 42 The kit for detecting a pyrogen or endotoxin in a sample of embodiment 41, wherein the amino acid sequence of the recombinant Factor C serine protease zymogen is at least 75% identical to SEQ ID NO: 257 or SEQ ID NO:
  • Embodiment 43 The kit for detecting a pyrogen or endotoxin in a sample as in any one of embodiments 41-42, wherein the amino acid sequence of the recombinant Factor B serine protease zymogen is at least 75% identical to SEQ ID NO: 259 or
  • Embodiment 44 The kit for detecting a pyrogen or endotoxin in a sample as in any one of embodiments 41-43, wherein the amino acid sequence of the recombinant Proclotting Enzyme serine protease zymogen is at least 75% identical to SEQ ID NO:
  • Embodiment 45 The kit for detecting a pyrogen or endotoxin in a sample as in any one of embodiments 41-44, wherein the amino acid sequence of the recombinant Coagulogen clotting protein is at least 75% identical to any one of SEQ ID NO:
  • Embodiment 46 A method of detecting a pyrogen or endotoxin in a sample, comprising contacting the sample with an isolated, purified protein molecule of embodiment 40.
  • Embodiment 47 A method of detecting a pyrogen or endotoxin in a sample, comprising contacting the sample with the components of the kit as in any one of embodiments 41-45.
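Several of the embodiments above are defined by a percent-identity threshold (for example, "at least 75% identical" for proteins and "at least 90% identical" for nucleic acids). The following is a minimal sketch of how such a threshold could be checked, assuming the two sequences have already been aligned to equal length with `-` marking gaps. The function names, the choice of denominator (all aligned columns, gaps included), and the toy strings are illustrative assumptions; this section does not specify an alignment method, and the strings are not sequences from the Sequence Listing.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over the columns of a pre-computed pairwise alignment.

    Assumption: identity = matching residues / aligned columns, where columns
    that are gaps in both sequences are ignored and a gap never counts as a match.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == "-" and b == "-")]
    if not columns:
        raise ValueError("alignment has no usable columns")
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


def meets_threshold(aligned_a: str, aligned_b: str, threshold: float = 75.0) -> bool:
    """True when the aligned pair is at least `threshold` percent identical."""
    return percent_identity(aligned_a, aligned_b) >= threshold


# Hypothetical toy peptides, not taken from the Sequence Listing.
reference = "MKTAYIAK"
candidate = "MKTAYLAR"  # two substitutions in eight columns
print(percent_identity(reference, candidate))  # 6 of 8 columns match
print(meets_threshold(reference, candidate))
```

In practice the alignment itself would come from a dedicated tool, and the reported identity depends on the alignment parameters and on how gaps are counted, so the denominator used here should be treated as one convention among several.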

Abstract

The disclosure provides nucleic acid molecules (comprising expression cassettes), plasmids, protein molecules, cells (comprising nucleic acid molecules), and recombinant expression systems for producing recombinant cascade reagents for the limulus amoebocyte lysate test method. Also, provided herein are kits and methods for detecting a pyrogen or endotoxin in a sample.

Description

COMPOSITIONS AND METHODS FOR DETECTING AN ENDOTOXIN
INVENTORS:
Jennifer Watson
Richard Hatcher
TITLE OF THE INVENTION
Compositions and Methods for Detecting an Endotoxin
CROSS REFERENCE TO RELATED APPLICATION
[0001] The present Application claims the benefit of priority to U.S. Provisional Application No. 63/315,513, filed on March 1, 2022, the contents of which are hereby incorporated by reference in their entirety.
INCORPORATION BY REFERENCE OF SEQUENCE LISTING PROVIDED ELECTRONICALLY
[0002] An electronic version of the Sequence Listing is filed herewith, the contents of which are incorporated by reference in their entirety. The electronic file is 571 kilobytes in size, and is titled 495-LM02_SequenceListing_ST26.txt.
BACKGROUND OF THE INVENTION
Field of the Invention
[0003] The present invention relates generally to the fields of biotechnology and infectious diseases, and more particularly it pertains to recombinant production of enzymes for detection of pyrogens and endotoxins.
Background
[0004] The standard pyrogen assay is a mandatory test for U.S. Food and Drug Administration (FDA) approval of all vaccines, intravenous pharmaceuticals, and internal medical devices to prevent contamination with endotoxins. The assay uses the hemolymph (blood) of the horseshoe crab, Limulus polyphemus (L. polyphemus) and tests for the presence of fever-producing agents of bacterial origin, e.g., endotoxins. The limulus amoebocyte lysate (LAL) test method is a qualitative assay during which the L. polyphemus hemolymph lysate reacts with an endotoxin to form a gel. The LAL test is considered to be reproducible, simple to conduct, specific for the presence of endotoxins, and sensitive to even picogram quantities of endotoxins. The quantity of endotoxin may be determined by dilution techniques comparing gel formation of the test sample to that of a reference pyrogen. The following non-essential publications are incorporated by reference in their entirety to aid in understanding of the official use of the LAL assay for release testing of final drug products: Levin, J, et al. Clotting cells and Limulus Amebocyte lysate: an amazing analytical tool. In: Shuster CNJ, Barlow RB and Brockman HJ (eds) The American horseshoe crab. 2003: 310-340; Cooper, JF. Discovery and acceptance of the bacterial endotoxins test. In: McCullough KZ (ed.) The bacterial endotoxins test: a practical approach. 2011: 1-13.
[0005] The LAL assay comprises horseshoe crab lysate reagents that form a four-step coagulation cascade. Three serine protease zymogens, namely Factor C, Factor B, and Proclotting Enzyme, and one clotting protein, Coagulogen, form the enzymatic coagulation cascade that results in a coagulin gel clot in the presence of an endotoxin. In this cascade, an endotoxin activates the Factor C zymogen and the activated Factor C subsequently activates Factor B, which converts the Proclotting Enzyme into Clotting Enzyme that cleaves Coagulogen into Coagulin, forming a gel clot.
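The four-step cascade described in paragraph [0005] can be sketched as an ordered list of activation steps. The model below is a deliberately simplified, all-or-nothing illustration: the names `CASCADE_STEPS`, `run_cascade`, and `gel_clot_forms` are hypothetical, and the sketch ignores kinetics, concentrations, and any activation routes other than the endotoxin-triggered one described above.

```python
from typing import Iterable, List

# Each step: (activator that must be present, zymogen or substrate, product formed).
CASCADE_STEPS = [
    ("endotoxin", "Factor C", "activated Factor C"),
    ("activated Factor C", "Factor B", "activated Factor B"),
    ("activated Factor B", "Proclotting Enzyme", "Clotting Enzyme"),
    ("Clotting Enzyme", "Coagulogen", "Coagulin"),
]

ALL_REAGENTS = ["Factor C", "Factor B", "Proclotting Enzyme", "Coagulogen"]


def run_cascade(endotoxin_present: bool, reagents: Iterable[str]) -> List[str]:
    """Return the products formed, in order, until a step cannot proceed."""
    available = set(reagents)
    active = {"endotoxin"} if endotoxin_present else set()
    products: List[str] = []
    for activator, zymogen, product in CASCADE_STEPS:
        if activator not in active or zymogen not in available:
            break  # the cascade stalls at the first missing component
        active.add(product)
        products.append(product)
    return products


def gel_clot_forms(endotoxin_present: bool, reagents: Iterable[str]) -> bool:
    """A gel clot is modeled as forming only when Coagulin is produced."""
    return "Coagulin" in run_cascade(endotoxin_present, reagents)


print(run_cascade(True, ALL_REAGENTS))
print(gel_clot_forms(False, ALL_REAGENTS))
```

The sketch makes one point explicit: because each step depends on the product of the previous one, omitting any single reagent, or the endotoxin trigger, prevents Coagulin formation in this model.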
[0006] The raw materials for the production of lysate reagents are harvested from wild-caught horseshoe crabs, including L. polyphemus and Tachypleus tridentatus (T. tridentatus). Wild horseshoe crab populations are in decline due to the detrimental effect of capture, blood collection, and release, poor management of harvest regulations, and habitat destruction. Commercial-scale cultivation of horseshoe crabs has not been achieved. The following non-essential publications are incorporated by reference in their entirety to aid in understanding of the unsustainability of blood collection from wild-caught crabs for production of LAL assay reagents: Gauvry G. Current Horseshoe crab harvesting practices cannot support global demand for TAL/LAL: The pharmaceutical and medical device industries’ role in the sustainability of horseshoe crabs. In: Carmichael RH, Botton ML, Shin PKS and Changing SGC (eds) Global perspectives on horseshoe crab biology, conservation and management. 2015: 475-482; Anderson RL et al., Sublethal behavioral and physiological effects of the biomedical bleeding process on the American horseshoe crab, Limulus polyphemus. Biol Bull. 2013(225): 137-151; Novitsky TJ. Biomedical implications for managing the Limulus polyphemus harvest along the northeast coast of the United States. In: Carmichael RH, Botton ML, Shin PKS and Changing SGC (eds) Global perspectives on horseshoe crab biology, conservation and management. 2015: 483-500.
[0007] Demand for lysate reagents for the LAL assay will likely continue to rise with the growth of the pharmaceutical industry, including the proliferation of biotechnology-based drugs and vaccines. The recent rapid development and deployment of vaccines against severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) in a mass vaccination campaign to address the coronavirus disease 2019 (COVID-19) pandemic demonstrates the ongoing necessity for endotoxin-free development and manufacturing of parenteral pharmaceuticals. The current reliance of the LAL assay on lysate reagents harvested from the horseshoe crab is a threat to horseshoe crab populations, the ecosystems in which the horseshoe crab lives, and humanity as the globe faces the COVID-19 pandemic and the threat of future pandemics.
[0008] Accordingly, a sustainable alternative to lysate reagents for the LAL assay is urgently needed to protect the horseshoe crab and humanity from preventable harm.
BRIEF SUMMARY OF THE INVENTION
[0009] Thus, in accordance with the present disclosure, recombinant generation of lysate reagents for the LAL assay is provided herein. The disclosure features expression cassettes, plasmids, and functional recombinant cascade reagents (RCRs) produced from these expression cassettes and plasmids. The disclosure also features expression cassettes for functional RCRs optimized for production in Corynebacterium glutamicum (C. glutamicum). The disclosure features optimized expression cassettes for production in C. glutamicum of the Factor C, Factor B, and Proclotting Enzyme serine protease zymogens, as well as optimized expression cassettes for production of the Coagulogen clotting protein.
[0010] The disclosure provides nucleic acid molecules, comprising expression cassettes, wherein the expression cassettes comprise, from 5’ to 3’: a promoter; a signal sequence; and a sequence encoding a cascade reagent protein. In some embodiments, the expression cassette is optimized for expression in C. glutamicum. In some embodiments, the signal sequence encodes a signal peptide.
[0011] In some embodiments, the promoter drives expression of the signal sequence and the sequence encoding the cascade reagent protein. In some embodiments, the promoter comprises a nucleic acid sequence derived from a promoter of a C. glutamicum secretory gene. In some embodiments, the signal sequence comprises a nucleic acid sequence derived from a signal sequence of a C. glutamicum secretory gene. In some embodiments, the C. glutamicum secretory gene is selected from the group consisting of the cg1514 gene, the cspA gene, the cspB gene, the CgR0949 gene, and the porB gene.
[0012] In some embodiments, the sequence encoding the cascade reagent protein encodes Factor C serine protease zymogen, Factor B serine protease zymogen, Proclotting Enzyme serine protease zymogen, or Coagulogen clotting protein. In some embodiments, the sequence encoding the cascade reagent protein comprises a nucleic acid sequence derived from the genome of a horseshoe crab selected from the group consisting of Tachypleus tridentatus, Limulus polyphemus, Tachypleus gigas, and Carcinoscorpius rotundicauda (C. rotundicauda).
[0013] In some embodiments, the signal sequence is located between the promoter and the sequence encoding the cascade reagent protein.
[0014] In some embodiments, the expression cassette comprises a termination sequence.
In some embodiments, the termination sequence is selected from the group consisting of the termination region of the Escherichia coli rrnB gene, the termination region of the Corynebacterium glutamicum cg1502 gene, the termination region of the Corynebacterium glutamicum cg3011 gene, the termination region of the Corynebacterium glutamicum cspA gene, and the termination region of the Corynebacterium glutamicum cg1338 gene.
[0015] In some embodiments, the expression cassette comprises a sequence encoding a polypeptide protein tag. In some embodiments, the expression cassette comprises two or more sequences encoding polypeptide protein tags. In some embodiments, the polypeptide protein tag or polypeptide protein tags are selected from the group consisting of polyhistidine-tag, FLAG-tag, HA-tag, calmodulin-binding peptide, streptavidin-binding peptide, glutathione S-transferase, and maltose-binding protein.
[0016] In some embodiments, the sequence encoding the polypeptide protein tag is located between the signal sequence and the sequence encoding the cascade reagent protein. In some embodiments, a sequence encoding a linker is located between the sequence encoding the polypeptide protein tag and the sequence encoding the cascade reagent protein.
[0017] In some embodiments, the sequence encoding the polypeptide protein tag is located between the sequence encoding the cascade reagent protein and the termination sequence. In some embodiments, a sequence encoding a linker is located between the sequence encoding the cascade reagent protein and the sequence encoding the polypeptide protein tag.
[0018] In some embodiments, the sequence encoding the cascade reagent protein is located between two sequences encoding polypeptide protein tags. In some embodiments, sequences encoding linkers are located between the sequence encoding the cascade reagent protein and the sequences encoding the polypeptide protein tags.
[0019] In some embodiments, the linker or linkers are selected from the group consisting of flexible glycine-serine linkers, flexible glycine linkers, rigid α-helical linkers, rigid proline-rich linkers, and cleavable disulfide linkers.
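The element orderings described in the preceding paragraphs (tag before or after the cascade reagent protein, with optional linkers and terminator) can be summarized in a short sketch. The function name `cassette_layout` and its parameters are hypothetical and are used only to illustrate the 5’ to 3’ arrangements; the sketch handles element names, not actual sequences.

```python
def cassette_layout(tag_position=None, use_linker=False, terminator=True):
    """Return expression cassette element names in 5' to 3' order.

    tag_position (hypothetical parameter):
      None   -> no tag
      "N"    -> tag between the signal sequence and the coding sequence
      "C"    -> tag between the coding sequence and the terminator
      "both" -> coding sequence flanked by two tags
    """
    if tag_position not in (None, "N", "C", "both"):
        raise ValueError("tag_position must be None, 'N', 'C', or 'both'")

    layout = ["promoter", "signal sequence"]
    if tag_position in ("N", "both"):
        layout.append("tag")
        if use_linker:
            # Linker sits between the tag and the cascade reagent protein.
            layout.append("linker")
    layout.append("cascade reagent protein")
    if tag_position in ("C", "both"):
        if use_linker:
            layout.append("linker")
        layout.append("tag")
    if terminator:
        layout.append("terminator")
    return layout
```

For example, `cassette_layout("N")` gives the promoter, signal sequence, tag, cascade reagent protein, terminator order, and `cassette_layout("C")` places the tag immediately upstream of the terminator.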
[0020] In some embodiments, the cascade reagent protein is encoded by a nucleic acid sequence of SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 278-283, or SEQ ID NO: 325, or a sequence at least 90% identical thereto. In some embodiments, the cascade reagent protein is encoded by a nucleic acid sequence of SEQ ID NO: 3, SEQ ID NO: 4, or SEQ ID NO: 284-289, or a sequence at least 90% identical thereto. In some embodiments, the cascade reagent protein is encoded by a nucleic acid sequence of SEQ ID NO: 5 or SEQ ID NO: 290-292, or a sequence at least 90% identical thereto. In some embodiments, the cascade reagent protein is encoded by a nucleic acid sequence of SEQ ID NO: 6-8 or SEQ ID NO: 293-301, or a sequence at least 90% identical thereto.
[0021] In some embodiments, the promoter is encoded by a nucleic acid sequence of any one of SEQ ID NO: 9-13, or a sequence at least 90% identical thereto. In some embodiments, the signal sequence is encoded by a nucleic acid sequence of any one of SEQ ID NO: 14-18, or a sequence at least 90% identical thereto. In some embodiments, the polypeptide protein tag is encoded by a nucleic acid sequence of any one of SEQ ID NO: 19-32, or a sequence at least 90% identical thereto. In some embodiments, the termination sequence is encoded by a nucleic acid sequence of any one of SEQ ID NO: 272-277, or a sequence at least 90% identical thereto. In some embodiments, the linker is encoded by a nucleic acid sequence of any one of SEQ ID NO: 265-271, or a sequence at least 90% identical thereto.
[0022] In some embodiments, the signal sequence and cascade reagent protein are encoded by a nucleic acid sequence of any one of SEQ ID NO: 33-96, or a sequence at least 90% identical thereto.
[0023] In some embodiments, the expression cassette comprises a nucleic acid sequence of any one of SEQ ID NO: 97-128 or SEQ ID NO: 322-324, or a sequence at least 90% identical thereto.
[0024] The disclosure provides plasmids comprising nucleic acid molecules disclosed herein. The disclosure also provides cells comprising any one of the nucleic acid molecules or plasmids disclosed herein.
[0025] The disclosure provides methods of producing a recombinant expression system comprising contacting a C. glutamicum cell with any one of the nucleic acid molecules or plasmids disclosed herein. The disclosure also provides recombinant expression systems produced by the method of contacting a C. glutamicum cell with any one of the nucleic acid molecules or plasmids disclosed herein.
[0026] The disclosure provides methods of expressing Factor C serine protease zymogen,
Factor B serine protease zymogen, Proclotting Enzyme serine protease zymogen, or Coagulogen clotting protein, comprising contacting a C. glutamicum cell with any one of the nucleic acid molecules or plasmids disclosed herein.
[0027] The disclosure provides isolated, purified protein molecules, wherein the amino acid sequence is at least 75% identical to any one of SEQ ID NO: 129-256.
[0028] The disclosure provides kits for detecting a pyrogen or endotoxin in a sample comprising recombinant Factor C serine protease zymogen, recombinant Factor B serine protease zymogen, recombinant Proclotting Enzyme serine protease zymogen, and recombinant Coagulogen clotting protein expressed in C. glutamicum.
[0029] In some embodiments, the amino acid sequence of the recombinant Factor C serine protease zymogen is at least 75% identical to SEQ ID NO: 257 or SEQ ID NO: 258. In some embodiments, the amino acid sequence of the recombinant Factor B serine protease zymogen is at least 75% identical to SEQ ID NO: 259 or SEQ ID NO: 260. In some embodiments, the amino acid sequence of the recombinant Proclotting Enzyme serine protease zymogen is at least 75% identical to SEQ ID NO: 261. In some embodiments, the amino acid sequence of the recombinant Coagulogen clotting protein is at least 75% identical to any one of SEQ ID NO: 262-264.
[0030] The disclosure provides methods for detecting a pyrogen or endotoxin in a sample, comprising contacting the sample with an isolated, purified protein molecule wherein the amino acid sequence is at least 75% identical to any one of SEQ ID NO: 129-256.
[0031] The disclosure provides methods of detecting a pyrogen or endotoxin in a sample, comprising contacting the sample with the components of the kits disclosed herein.
[0032] These and other embodiments are described in more detail in the detailed description below.
BRIEF DESCRIPTION OF THE DRAWINGS
[0033] FIG. 1 depicts the coagulation cascade of the present disclosure based on the coagulation cascade in the horseshoe crab amoebocyte lysate.
[0034] FIGS. 2A-2B depict the expression cassettes of the present disclosure. FIG. 2A shows expression cassettes comprising a promoter, a signal sequence, a gene of interest, and a termination sequence, and optionally a polypeptide tag. FIG. 2B shows exemplary expression cassettes according to the present invention. Expression cassette number 4 (SEQ ID NO: 322) comprises the Pcg1514 promoter (SEQ ID NO: 9), the cg1514 signal sequence (SEQ ID NO: 14), the Factor C gene from T. tridentatus (SEQ ID NO: 325), and the rrnB terminator (SEQ ID NO: 272). Expression cassette number 5 (SEQ ID NO: 323) comprises the Pcg1514 promoter (SEQ ID NO: 9), the cg1514 signal sequence (SEQ ID NO: 14), a polyhistidine-tag (SEQ ID NO: 26), the Factor C gene from T. tridentatus (SEQ ID NO: 325), and the rrnB terminator (SEQ ID NO: 272). Expression cassette number 6 (SEQ ID NO: 324) comprises the Pcg1514 promoter (SEQ ID NO: 9), the cg1514 signal sequence (SEQ ID NO: 14), the Factor C gene from T. tridentatus (SEQ ID NO: 325), a polyhistidine-tag (SEQ ID NO: 26), and the rrnB terminator (SEQ ID NO: 272).
[0035] FIGS. 3A-3B show expression of the plasmids containing expression cassettes in C. glutamicum according to the present disclosure. FIG. 3A depicts microscopy images showing untransformed gram-positive B. cereus and untransformed gram-negative E. coli (top left), untransformed, gram-positive C. glutamicum (top middle), C. glutamicum transformed with empty plasmid (top right), and C. glutamicum transformed with plasmids comprising expression cassette number 4 of FIG. 2B (SEQ ID NO: 322, bottom left), expression cassette number 6 of FIG. 2B (SEQ ID NO: 324, bottom middle), and expression cassette number 5 of FIG. 2B (SEQ ID NO: 323, bottom right). The scale bar is 10 μm. FIG. 3B depicts gel electrophoresis showing the molecular weight of plasmids containing the expression cassettes of the present disclosure. Lane 1 shows C. glutamicum as a negative control, lane 2 shows C. glutamicum expressing the pEC-pk18mob2 empty plasmid as a positive control, lane 3 shows C. glutamicum expressing the plasmid comprising the expression cassette number 4 of FIG. 2B (SEQ ID NO: 322), lane 4 shows C. glutamicum expressing the plasmid comprising the expression cassette number 6 of FIG. 2B (SEQ ID NO: 324), and lane 5 shows C. glutamicum expressing the plasmid comprising the expression cassette number 5 of FIG. 2B (SEQ ID NO: 323).
[0036] FIG. 4 depicts gel electrophoresis showing the molecular weight of polypeptides in the culture supernatant of C. glutamicum in accordance with the present disclosure. Lanes 1 and 2 show C. glutamicum expressing the pEC-pk18mob2 empty plasmid as a negative control, and lanes 3 and 4 show C. glutamicum expressing the plasmid comprising the expression cassette number 4 of FIG. 2B (SEQ ID NO: 322).
DETAILED DESCRIPTION OF CERTAIN EMBODIMENTS
[0037] The limulus amoebocyte lysate (LAL) test method is a standard pyrogen assay employed by a variety of industries to ensure that samples are free of harmful endotoxins and pyrogens. The U.S. Food and Drug Administration (FDA) approved the LAL assay for testing drugs, products, and devices, and the assay is widely used to test ingredients of pharmaceuticals during manufacturing.
[0038] The LAL assay is based on a coagulation cascade involving reagents harvested from the hemolymph of wild-caught horseshoe crabs. Specifically, exposure of the serine protease zymogen Factor C to endotoxin initiates a cascade that activates the serine protease zymogen Factor B, converts the serine protease zymogen Proclotting Enzyme into Clotting Enzyme, and ultimately cleaves Coagulogen into Coagulin to form a gel clot. The LAL assay depends on the availability of these reagents, and poor harvest management and habitat destruction threaten the horseshoe crab population and thus threaten the supply of horseshoe crab lysate reagents.
[0039] The disclosure provides nucleic acid molecules (comprising expression cassettes) and plasmids for producing the lysate reagents Factor C, Factor B, Proclotting Enzyme, and Coagulogen, wherein the nucleic acid molecules and plasmids are optimized for expression in the generally recognized as safe (GRAS) actinobacterium Corynebacterium glutamicum (C. glutamicum). Also provided are isolated, purified protein molecules encoded by the nucleic acid molecules and plasmids disclosed herein, kits for detecting a pyrogen or endotoxin in a sample comprising recombinant lysate reagents, methods of producing a recombinant expression system using the nucleic acid molecules and plasmids disclosed herein, and methods for detecting pyrogen or endotoxin in a sample.
[0040] The nucleic acid molecules, plasmids, isolated, purified protein molecules, kits, and methods of the present disclosure may be used to test for contamination in a variety of industries, including pharmaceuticals (both preclinical studies and clinical applications) and biotechnologies, and settings, including healthcare providers, veterinary clinics, agriculture, food processing and service, wineries, breweries, distilleries, military, and direct-to-consumer. In some embodiments, nucleic acid molecules, plasmids, isolated, purified protein molecules, kits, and methods of the present disclosure may be used in the agriculture, food service, food processing, winery, brewery, or distillery industries to test for contamination at any point along the logistical supply chain. In some embodiments, nucleic acid molecules, plasmids, isolated, purified protein molecules, kits, and methods of the present disclosure may be used in the healthcare provider, veterinary clinic, military, and direct-to-consumer industries to test for contamination and institute organizational processes and conditions to sanitize frequently touched objects and surfaces and prevent infection.
Definitions
[0041] Unless otherwise defined, all scientific and technical terms used in the description herein and in the appended claims have the same meaning as commonly understood by one of ordinary skill in the art. The terminology used herein is not intended to be limiting and is used for the purpose of describing particular embodiments in the description herein.
[0042] The singular forms “a,” “an,” and “the” are intended to include the plural forms as well and are consistent with the meaning of “one or more,” “at least one,” and “one or more than one,” unless the context clearly indicates otherwise.
[0043] As used herein, the term “about” when referring to a measurable value such as concentration, volume, length of time, length of a polypeptide or polynucleotide sequence, quantity, and the like, encompasses ± 20%, ± 10%, ± 5%, ± 1%, ± 0.5%, or ± 0.1% of the specified amount.
[0044] As used herein, the term “expression cassette” refers to a nucleic acid component of vector DNA comprising one or more transcriptional control elements (e.g., promoters, enhancers, and/or regulatory sequences and polyadenylation sequences) that direct gene expression of a sequence encoding a protein and/or polypeptide, e.g., a linear nucleic acid sequence encoding one or more transgenes that are expressed by one or more cell types. The terms “DNA vector,” “expression vector,” and “plasmid” are terms of the art understood by skilled persons and refer to synthetic DNA molecules used to carry foreign genetic material into a cell. The term “recombinant DNA” is a term of the art understood by skilled persons and refers to combining two or more DNA molecules from two or more different sources, and the term “recombinant protein” is a term of the art understood by skilled persons and refers to protein encoded by recombinant DNA that has been cloned into an expression vector. The term “recombinant” is a term of the art understood by skilled persons and refers to recombined DNA, e.g., recombinant DNA, and/or artificially produced protein, e.g., recombinant protein.
[0045] As used herein, the term “recombinant expression system” refers to a system for expressing recombinant protein in cells by transfecting cells with a DNA vector, expression vector, or plasmid. The term “expression” is a term of the art understood by skilled persons and refers to production of large amounts of recombinant DNA and/or recombinant protein by manipulation of the genetic material. The terms “optimized expression” or “optimized for expression” refer to adaptation of some or all of nucleic acid molecules, including synthetic DNA molecules, recombinant DNA, and/or DNA vector, to the host organism to optimize synthesis and/or production of recombinant proteins. Optimization for expression may include optimizing GC content and noncoding DNA elements. Optimization for expression may include optimization based on highly expressed genes (HEG) wherein the codon usage of predicted highly expressed genes from 150 bacterial genomes under translational selection determines codon usage.
Optimization for expression may also include determination of codon usage based on ribosomal protein genes (RPG) or tRNA gene copy number (tRNA). The HEG, RPG, and tRNA optimization techniques apply a Monte Carlo algorithm using relative codon usage frequencies of a reference set as the relative probability that a given codon will be used in the optimization process.
Optimization for expression may include general optimization based on the C. glutamicum codon usage table generated from 9,019 coding sequences representing 2,866,198 codons. Optimization for expression may include optimization based on the software OptimWiz, a proprietary codon optimization analysis tool, which may optimize for expression by modifying GC-content, mRNA secondary structure, Shine-Dalgarno sequence, RNA instability motifs, repetitive sequences, internal splice sites, and restriction enzyme recognition sites.
[0046] Exemplary optimization for expression of the present invention includes replacing nucleic acids of a sequence encoding a cascade reagent protein, a sequence encoding a polypeptide protein tag, and/or sequences encoding linkers with nucleic acids encoding codons based on the HEG, RPG, tRNA, general, or OptimWiz optimization methods.
[0047] The terms “optimized expression” or “optimized for expression” may also refer to polypeptides or proteins encoded by nucleic acid sequences that have been optimized for expression, i.e. optimization of the coding sequence that codes for the sequence of amino acids in a protein.
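The Monte Carlo approach mentioned above, in which the relative codon usage frequencies of a reference set act as the probability that a given codon is chosen, might be sketched as follows. This is a minimal illustration and not the HEG, RPG, tRNA, or OptimWiz implementation. The function name `sample_coding_sequence` is hypothetical, and the frequencies in `CODON_USAGE` are made-up placeholders, not measured C. glutamicum values.

```python
import random

# Hypothetical relative codon usage frequencies for three amino acids.
# The codon assignments follow the standard genetic code; the numbers are
# illustrative placeholders only, NOT real C. glutamicum usage data.
CODON_USAGE = {
    "M": {"ATG": 1.0},
    "K": {"AAG": 0.6, "AAA": 0.4},
    "G": {"GGC": 0.4, "GGT": 0.3, "GGA": 0.2, "GGG": 0.1},
}


def sample_coding_sequence(protein, usage=CODON_USAGE, rng=None):
    """Back-translate a protein by sampling one codon per residue.

    Each codon is drawn with probability proportional to its relative
    usage frequency in the reference table (a Monte Carlo draw).
    """
    rng = rng or random.Random()
    codons = []
    for residue in protein:
        table = usage[residue]  # raises KeyError for residues not in the table
        choices = list(table)
        weights = [table[codon] for codon in choices]
        codons.append(rng.choices(choices, weights=weights, k=1)[0])
    return "".join(codons)


# Usage: a seeded generator makes the draw repeatable.
dna = sample_coding_sequence("MKG", rng=random.Random(0))
```

Because each call samples independently, repeated runs yield different synonymous sequences; a practical workflow would presumably score many samples against further criteria such as GC content.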
[0048] As used herein, the term “nucleic acid”, “nucleic acid molecule”, or “polynucleotide” refers to a sequence of more than one nucleotide base monomer, for example deoxyribonucleic acid (DNA) and ribonucleic acid (RNA), in a single chain, including naturally occurring and non-naturally occurring nucleotides. As used herein, the term “nucleotide” refers to conventional nucleotide bases, e.g., the purine and pyrimidine bases adenine (A), guanine (G), thymine (T), cytosine (C), and uracil (U). A nucleic acid will generally contain sugars and phosphates connected in an alternating chain through phosphodiester linkages. Generally, the phosphate groups are attached to carbons at the 5’-end and the 3’-end of the sugar, imparting directionality to nucleic acids. The ends of nucleic acids are referred to as the 5’-end and 3’-end, the 5’-end is referred to as “upstream” of the 3’-end, and the 3’-end is referred to as “downstream” of the 5’-end. Nucleic acid molecules may be circular (e.g., a plasmid) or linear (e.g., a cassette). Nucleic acid sequences may encode polypeptides or may include sequences regulating transcription (e.g., promoters and terminators).
[0049] As used herein, the term “polypeptide” refers to a continuous, unbranched chain of amino acids linked by peptide bonds. Amino acids incorporated into peptides are known as residues, and the term “amino acid sequence” refers to a sequence of amino acids, including naturally occurring and non-naturally occurring amino acids. Longer polypeptides are known as proteins, and the term “protein tag” is used to refer to a shorter polypeptide. Generally, polypeptides have an N-terminus, also known as the N-terminal end or amine-terminus, and a C-terminus, also known as the C-terminal end, carboxyl-terminus, or carboxy-terminus. Polypeptides may be fused to other polypeptides by combining the genes or parts of genes that encode them to produce recombinant DNA that encodes a recombinant fusion protein. One protein tag or domain may be fused N-terminally or C-terminally to another protein tag or domain. Fusion of a protein tag to the N-terminus of a protein results in an N-terminally tagged protein, and fusion of a protein tag to the C-terminus of a protein results in a C-terminally tagged protein. Recombinant proteins, including signal polypeptides, cascade reagent proteins, and protein tags, may be fused by linker sequences to separate these domains. Linker sequences may encode cleavable polypeptides, which can be cleaved upon exposure to enzyme, chemical reagents, or irradiation, or non-cleavable polypeptides, including flexible polypeptide linkers composed of glycine and serine known as GS linkers, for example (Gly-Gly-Gly-Gly-Ser)n, or rigid linkers, for example proline-rich or α-helical linkers.
[0050] As used herein, the terms “streptavidin-binding peptide” and “SBP” may include the 38-amino acid sequence [sequence shown in Figure imgf000017_0001] or the 8-amino acid sequences of the Strep-tag system (e.g., the Strep-tag or Strep-tag II), which may have the amino acid sequence WSHPQFEK.
[0051] Calculations of “identity” between two sequences, e.g., nucleic acid or amino acid sequences, can be performed by practices commonly understood by one of ordinary skill in the art. The sequences are aligned for optimal comparison purposes, and the nucleotides or amino acid residues at corresponding nucleotide positions or amino acid positions are then compared. Molecules are identical at a position when a position in the first sequence is occupied by the same nucleotide or amino acid residue as the corresponding position in the second sequence. The percent identity between the two sequences is a function of the number of identical positions shared by the sequences. As used herein, the term “homolog” refers to a protein that shares a common ancestor with another protein, and may include proteins that exhibit sequence homology, i.e., the proteins share sequence similarity.
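The position-by-position comparison described above can be illustrated with a short sketch. It assumes the two sequences have already been aligned (gaps written as "-") and, as a labeled assumption not stated in the text, divides by the number of alignment columns that are not gaps in both sequences; other conventions for the denominator exist. The function name `percent_identity` is hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity of two pre-aligned sequences of equal length.

    Assumption: the denominator is the number of alignment columns in
    which at least one sequence has a residue (double-gap columns are
    skipped). A gap never counts as an identical position.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")

    compared = 0
    identical = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap and b == gap:
            continue  # column carries no information
        compared += 1
        if a == b:
            identical += 1  # same residue at corresponding positions
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * identical / compared
```

Under this convention, a threshold such as "at least 90% identical" would correspond to `percent_identity(a, b) >= 90.0` for the aligned pair.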
[0052] As used herein, the term “promoter” refers to one or more DNA sequences that regulate expression of operably linked nucleic acid sequences by facilitating binding of proteins
(e.g., transcription factors) that initiate transcription of RNA from the DNA downstream of the promoter. The transcription start site is the location where transcription starts at the 5’-end of the operably linked nucleic acid sequence, and the promoter generally includes consensus sequences, such as a TATA box, near the transcription start site.
[0053] As used herein, the terms “terminator” or “termination sequence” refer to one or more DNA sequences that regulate expression of operably linked nucleic acid sequences by facilitating termination of transcription of RNA from the DNA upstream of the terminator.
Generally, the termination sequence is downstream of a stop codon that signals termination of translation of the protein translated from the RNA transcribed from the DNA upstream of the stop codon.
[0054] As used herein, the term “transgene” refers to a gene transferred from one organism to another, i.e., an exogenous nucleic acid sequence encoding a polypeptide to be expressed in a cell. Generally, a transgene contains a promoter, a protein coding sequence, and a termination sequence. The term “gene of interest” refers to the nucleic acid sequence encoding a protein, i.e., a protein coding sequence. Exemplary genes of interest of the present invention include nucleic acid sequences encoding clotting proteins, Factor C serine proteases, Factor B serine proteases, Proclotting Enzyme serine proteases, Coagulogen clotting proteins, and recombinant cascade reagents (RCRs).
[0055] As used herein, the term “signal sequence” refers to a nucleic acid sequence encoding a short peptide present at the terminus of most proteins destined for secretion via the cellular secretory pathway. The term “signal peptide” refers to the polypeptide encoded by the signal sequence, and is generally present at the N-terminus of secreted proteins. The term
“secretory gene” refers to genes encoding proteins destined for secretion via the cellular secretory pathway.
[0056] As used herein, the terms “pyrogen” and “endotoxin” are used interchangeably and refer to causative agents responsible for biological effects incidental to therapy administered parenterally, i.e., therapies administered to the body other than through the mouth and alimentary canal. Parenteral therapies, including injection (e.g., subcutaneous injection, intraperitoneal injection, intrathecal injection, etc.), allow pyrogens or endotoxins to bypass the normal body defenses. The host’s responses to pyrogens or endotoxins include fever, shock, and other physiological responses. While the terms pyrogen and endotoxin are used interchangeably herein, not all pyrogens are endotoxins.
[0057] As used herein, the term “amino acid” refers to naturally occurring and non-naturally occurring or synthetic amino acids. Naturally occurring, levorotatory (L-) amino acids and their abbreviations (three-letter code and one-letter code) are shown in Table 1.
TABLE 1
Figure imgf000019_0001
Figure imgf000020_0001
Expression cassettes
[0058] The disclosure provides nucleic acid sequences comprising one or more expression cassettes optimized for expression in C. glutamicum. In some embodiments, the expression cassette comprises, from 5’ to 3’, a promoter, a signal sequence, and a sequence encoding a cascade reagent protein. In some embodiments, the cascade reagent protein is Factor C. In some embodiments, the cascade reagent protein is Factor B. In some embodiments, the cascade reagent protein is Proclotting Enzyme. In some embodiments, the cascade reagent protein is Coagulogen. In some embodiments, the expression cassette comprises a termination sequence, a sequence encoding a polypeptide protein tag, and/or a sequence encoding a linker.
[0059] In some embodiments, the expression cassette comprises a nucleic acid sequence having at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identity to the nucleic acid sequences of SEQ ID NO: 97-128 or SEQ ID NO: 322-324.
(i) Promoter
[0060] Promoters or promoter sequences are sequences of DNA to which transcription factors bind, thereby initiating transcription of RNA from the DNA downstream of the promoter.
Promoters are located upstream, or toward the 5’ region of the sense strand, of the transcription start site and may include consensus sequences such as TATAAT or TTGACA. Promoters drive expression of DNA, e.g., genes or transgenes, downstream of the promoter. RNA molecules transcribed from operably linked DNA sequences adjacent to promoters may encode a protein.
[0061] Sequences within the RNA help recruit the ribosome to the messenger RNA (mRNA) to initiate protein synthesis by aligning the ribosome with the start codon, and may include consensus sequences such as the Shine-Dalgarno sequence, e.g., AGGAGGU or GAGG. Once the ribosome is recruited, tRNAs deliver amino acids in the order dictated by the codons, moving downstream from the translational start site.
[0062] The expression cassettes of the disclosure may comprise a promoter. In some embodiments, the promoter drives expression of a signal sequence and a sequence encoding a cascade reagent protein. In some embodiments, the promoter comprises a nucleic acid sequence derived from a promoter of a secretory gene. In some embodiments, the promoter comprises a nucleic acid sequence derived from a promoter of a C. glutamicum secretory gene, for example the promoters listed in Table 2.
[0063] In some embodiments, the promoter may comprise a nucleic acid sequence having at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identity to the nucleic acid sequences of SEQ ID NO: 9-13.
TABLE 2
Figure imgf000022_0001
(ii) Signal Sequence
[0064] Signal sequences are sequences of DNA encoding a signal peptide. Signal sequences may be referred to as localization signals, localization sequences, leader sequences, or targeting signals and a signal peptide may be referred to as a transit peptide or leader peptide.
Signal peptides are short peptides that prompt a cell to translocate the protein, and are often present at the N-terminus of proteins destined for secretion, which may include translocation to certain organelles, secretion from the cell, or insertion into cellular membranes.
[0065] The expression cassettes of the disclosure may comprise a signal sequence. The signal sequence may encode a signal peptide. In some embodiments, the signal sequence is located between the promoter and the sequence encoding the cascade reagent protein. In some embodiments, the signal sequence is located between the promoter and the sequence encoding a polypeptide protein tag. In some embodiments, the signal sequence comprises a nucleic acid sequence derived from a signal sequence of a secretory gene. In some embodiments, the signal sequence comprises a nucleic acid sequence derived from a signal sequence of a C. glutamicum secretory gene, for example the signal sequences listed in Table 3.
TABLE 3
Figure imgf000023_0001
[0066] In some embodiments, the core of the signal peptide may comprise a sequence of hydrophobic amino acids. The sequence of hydrophobic amino acids may be about 5 to 16 residues in length, for example 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, or 16 residues in length. In some embodiments, the signal peptide may comprise a short positively charged sequence of amino acids at the N-terminus. In some embodiments, the signal peptide may comprise a sequence of amino acids recognized and cleaved by signal peptidases.
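The two structural features described above, a positively charged N-terminal stretch and a hydrophobic core of about 5 to 16 residues, can be checked with a short script. This is a minimal sketch under stated assumptions: the hydrophobic residue set, the five-residue N-terminal window, and the function names are illustrative choices that do not come from the disclosure, and the signal peptidase cleavage site is not evaluated.

```python
HYDROPHOBIC = set("AVLIMFW")  # assumed residue set; other definitions exist
POSITIVE = set("KR")          # lysine and arginine


def longest_hydrophobic_run(peptide: str) -> int:
    """Length of the longest uninterrupted run of hydrophobic residues."""
    longest = current = 0
    for residue in peptide.upper():
        if residue in HYDROPHOBIC:
            current += 1
            longest = max(longest, current)
        else:
            current = 0
    return longest


def looks_like_signal_peptide(peptide: str, n_region: int = 5) -> bool:
    """Crude screen: a positive residue near the N-terminus plus a 5-16 residue core."""
    has_positive_start = any(r in POSITIVE for r in peptide.upper()[:n_region])
    core_ok = 5 <= longest_hydrophobic_run(peptide) <= 16
    return has_positive_start and core_ok
```

A screen of this kind only flags candidate sequences; it does not predict whether a given peptide is actually secreted.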
[0067] In some embodiments, the signal sequence may comprise a nucleic acid sequence having at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identity to the nucleic acid sequences of SEQ ID NO: 14-18.
[0068] In some embodiments, the signal sequence may encode an amino acid sequence having at least 60%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identity to the amino acid sequences of SEQ ID NO: 302-306.
(iii) Sequence Encoding a Cascade Reagent Protein
[0069] A sequence encoding a cascade reagent protein is a sequence of DNA encoding any one of the cascade reagent proteins of the LAL assay disclosed herein. A protein encoded by this sequence may also be referred to as a recombinant cascade reagent (RCR), and may include any one of three recombinant protease zymogens, namely Factor C, Factor B, and Proclotting Enzyme, and a clotting protein, namely Coagulogen.
[0070] The expression cassettes of the disclosure may comprise a sequence encoding a cascade reagent protein. The sequence encoding a cascade reagent protein may be isolated or derived from the genome of any horseshoe crab, for example Tachypleus tridentatus, Limulus polyphemus, Tachypleus gigas, or Carcinoscorpius rotundicauda. In some embodiments, the sequence encoding a cascade reagent protein may be optimized for expression in C. glutamicum, e.g., the sequences listed in Table 4. In some embodiments, optimization for expression in C. glutamicum may include HEG, RPG, tRNA, general, or OptimWiz optimization and/or optimizing GC content of the DNA sequence.
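GC content, named above as one optimization target, is straightforward to compute for a candidate coding sequence. The following is a minimal sketch; the function name is hypothetical, and no target GC value is assumed because this section does not state one.

```python
def gc_content(dna: str) -> float:
    """Fraction of bases in `dna` that are G or C (case-insensitive)."""
    dna = dna.upper()
    if not dna:
        raise ValueError("sequence must not be empty")
    return (dna.count("G") + dna.count("C")) / len(dna)
```

Comparing this value before and after codon substitution is one simple way to see how an optimization pass shifts the base composition of a sequence.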
TABLE 4
Figure imgf000024_0001
Figure imgf000025_0001
[0071] In some embodiments, the sequence encoding a cascade reagent protein may be truncated or mutated from the wild type sequence. In some embodiments, the sequence encoding a cascade reagent protein may encode a recombinant protein with activity higher than, lower than, or equivalent to that of the wild type protein. In some embodiments the sequence encoding a cascade reagent protein may encode a cascade reagent protein homolog.
[0072] In some embodiments, the sequence encoding a cascade reagent protein may encode the Factor C serine protease zymogen. In some embodiments, the sequence encoding the cascade reagent protein Factor C may comprise a nucleic acid sequence having at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identity to the nucleic acid sequences of SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 278-283, or SEQ ID NO: 325.
[0073] In some embodiments, the sequence encoding a cascade reagent protein may encode the Factor B serine protease zymogen and homologs thereof, e.g., C3 and C2/Bf. In some embodiments, the sequence encoding the cascade reagent protein Factor B may comprise a nucleic acid sequence having at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identity to the nucleic acid sequences of SEQ ID NO: 3, SEQ ID NO: 4, or SEQ ID NO: 284-289.
[0074] In some embodiments, the sequence encoding a cascade reagent protein may encode the Proclotting Enzyme serine protease zymogen. In some embodiments, the sequence encoding the cascade reagent protein Proclotting Enzyme may comprise a nucleic acid sequence having at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identity to the nucleic acid sequences of SEQ ID NO: 5 or SEQ ID NO: 290-292.
[0075] In some embodiments, the sequence encoding a cascade reagent protein may encode the Coagulogen clotting protein. In some embodiments, the sequence encoding the cascade reagent protein Coagulogen may comprise a nucleic acid sequence having at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identity to the nucleic acid sequences of SEQ ID NO: 6-8 or SEQ ID NO: 293-301.
(iv) Termination Sequence
[0076] A termination sequence, terminator, or transcription terminator is a sequence of DNA downstream of the translational stop codon that mediates termination of transcription of operably linked nucleic acid sequences. Prokaryotic transcription terminators of the present disclosure may be Rho-dependent or Rho-independent. Transcription terminators may comprise a downstream transcription stop point sequence and/or a GC-rich region of dyad symmetry followed by a poly-A sequence to promote allosteric dissociation of the transcriptional complex and/or hairpin loop formation of the transcribed mRNA and subsequent transcription termination.
[0077] The expression cassettes of the disclosure may comprise a termination sequence. The termination sequence may be isolated or derived from the genome of any suitable organism, for example Escherichia coli (E. coli) or C. glutamicum. The termination sequence may comprise the termination region of the E. coli rrnB gene, the termination region of the C. glutamicum cgl502 gene, the termination region of the C. glutamicum cg3011 gene, the termination region of the C. glutamicum cspA gene, and the termination region of the C. glutamicum cgl338 gene. In some embodiments, the termination sequence may be wild type or optimized for expression in C. glutamicum, e.g., the sequences listed in Table 5. In some embodiments, optimization for expression in C. glutamicum may include replacing nucleotides of the wild type termination sequence to optimize GC content for expression in C. glutamicum.
TABLE 5
Figure imgf000028_0001
[0078] In some embodiments, the termination sequence may comprise the wild type rrnB termination sequence from E. coli. In some embodiments, the termination sequence may comprise the sequence of SEQ ID NO: 272, or a sequence at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identical thereto. In some embodiments, the termination sequence may comprise the rrnB termination sequence from E. coli optimized for expression in C. glutamicum. In some embodiments, the termination sequence may comprise the sequence of SEQ ID NO: 273, or a sequence at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identical thereto.
[0079] In some embodiments, the termination sequence may comprise the wild type cgl502 termination sequence from C. glutamicum. In some embodiments, the termination sequence may comprise the sequence of SEQ ID NO: 274, or a sequence at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identical thereto. In some embodiments, the termination sequence may comprise the wild type cg3011 termination sequence from C. glutamicum. In some embodiments, the termination sequence may comprise the sequence of SEQ ID NO: 275, or a sequence at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identical thereto. In some embodiments, the termination sequence may comprise the wild type cspA termination sequence from C. glutamicum. In some embodiments, the termination sequence may comprise the sequence of SEQ ID NO: 276, or a sequence at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identical thereto. In some embodiments, the termination sequence may comprise the wild type cgl338 termination sequence from C. glutamicum. In some embodiments, the termination sequence may comprise the sequence of SEQ ID NO: 277, or a sequence at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identical thereto.
(v) Sequence Encoding a Polypeptide Protein Tag
[0080] A sequence encoding a polypeptide protein tag is a sequence of DNA encoding a peptide sequence, protein tag, or polypeptide protein tag. A sequence encoding a polypeptide protein tag may be fused, appended, or grafted to a sequence encoding a protein, generally at either the C-terminus or N-terminus, or at both the C-terminus and the N-terminus of the protein. Less frequently a sequence encoding a polypeptide protein tag may be inserted into the sequence encoding a protein. A polypeptide protein tag may be appended to a protein to aid in affinity purification from biological lysate, enhance resolution of chromatographic separation, and/or promote solubilization and proper folding of proteins prone to precipitation. Polypeptide protein tags may comprise polyanionic amino acids or epitope tags.
[0081] The expression cassettes of the disclosure may comprise a sequence encoding a polypeptide protein tag. In some embodiments, from 5’ to 3’ the sequence encoding a polypeptide protein tag may be located between the signal sequence and the sequence encoding the cascade reagent protein. In some embodiments, from 5’ to 3’ the sequence encoding a polypeptide protein tag may be located between the sequence encoding the cascade reagent protein and the termination sequence. In some embodiments, from 5’ to 3’ a linker is located between the sequence encoding a polypeptide protein tag and the sequence encoding the cascade reagent protein. In some embodiments, from 5’ to 3’ a linker is located between the sequence encoding the cascade reagent protein and the sequence encoding a polypeptide protein tag.
[0082] In some embodiments, two or more sequences encoding polypeptide protein tags may be located in tandem at the 5’ end or the 3’ end of the sequence encoding the cascade reagent protein. In some embodiments, the sequence encoding the cascade reagent protein may be located between two sequences encoding polypeptide protein tags, i.e., the sequences encoding polypeptide protein tags flank the sequence encoding the cascade reagent protein. In some embodiments, sequences encoding linkers are located between the sequence encoding the cascade reagent protein and the flanking sequences encoding the polypeptide protein tags.
[0083] In some embodiments, the cascade reagent protein may be N-terminally tagged with the polypeptide protein tag. In some embodiments, the cascade reagent protein may be C-terminally tagged with the polypeptide protein tag. In some embodiments, the cascade reagent protein may be N-terminally or C-terminally tagged with tandem polypeptide protein tags. In some embodiments, the cascade reagent protein may be both N-terminally and C-terminally tagged with polypeptide protein tags. In some embodiments, the two or more polypeptide protein tags are identical. In some embodiments, the two or more polypeptide protein tags are not identical. In some embodiments, cleavable, flexible, and/or rigid linkers may separate the polypeptide protein tag or tags from the cascade reagent protein.
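The tag and linker arrangements described in the preceding paragraphs can be summarized as an ordered list of elements read from 5’ to 3’. The sketch below is illustrative only: the function name, parameter names, and element labels are hypothetical, and it covers only the case in which a single linker separates each tag block from the cascade reagent protein.

```python
from typing import List


def cassette_layout(n_tags: int = 0, c_tags: int = 0,
                    linkers: bool = False, terminator: bool = True) -> List[str]:
    """Return element names from 5' to 3' for one cassette arrangement.

    n_tags / c_tags: number of tandem tags upstream / downstream of the
    cascade reagent protein coding sequence.
    linkers: place a linker between each tag block and the protein.
    """
    if n_tags < 0 or c_tags < 0:
        raise ValueError("tag counts must be non-negative")
    layout = ["promoter", "signal sequence"]
    layout += ["tag"] * n_tags
    if linkers and n_tags:
        layout.append("linker")
    layout.append("cascade reagent protein")
    if linkers and c_tags:
        layout.append("linker")
    layout += ["tag"] * c_tags
    if terminator:
        layout.append("terminator")
    return layout
```

For example, `cassette_layout(n_tags=1, c_tags=1, linkers=True)` describes a protein flanked by tags with a linker on each side, corresponding to the flanking arrangement described above.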
[0084] In some embodiments, the sequence encoding a polypeptide protein tag may encode a peptide or protein tag, for example a polyhistidine-tag, FLAG-tag, HA-tag, calmodulin-binding peptide, streptavidin-binding peptide, glutathione S-transferase, or maltose-binding protein. In some embodiments, the sequence encoding a polypeptide protein tag may encode a polyhistidine-tag, also referred to as His-tag, His6 tag, poly(His) tag, or 6His, which may be about 5-10 residues in length, for example 5, 6, 7, 8, 9, or 10 residues in length, e.g., the amino acid sequence
Figure imgf000031_0005
In some embodiments, the sequence encoding a polypeptide protein tag may encode a FLAG-tag, also referred to as FLAG octapeptide or FLAG epitope, which may have the amino acid sequence D
Figure imgf000031_0004
and may be used in tandem and with some variation in sequence identity, e.g., the 3xFLAG peptide of amino acid sequence
Figure imgf000031_0003
In some embodiments, the sequence encoding a polypeptide protein tag may encode an HA-tag, also referred to as the human influenza hemagglutinin tag, which may be derived from amino acids 98-106 of the human influenza hemagglutinin protein and may have the amino acid sequence YPYDVPDYA. In some embodiments, the sequence encoding a polypeptide protein tag may encode a calmodulin-binding peptide, also referred to as a calmodulin-binding protein peptide tag, CBP-tag, or calmodulin-tag, which may have the amino acid sequence
Figure imgf000031_0002
In some embodiments, the sequence encoding a polypeptide protein tag may encode a streptavidin-binding peptide, also referred to as an SBP or streptavidin-tag, including a 38-amino acid sequence
Figure imgf000031_0001
or 8-amino acid sequences of the Strep-tag system (e.g., the Strep-tag or Strep-tag II), which may have the amino acid sequence WSHPQFEK. In some embodiments, the sequence encoding a polypeptide protein tag may encode a glutathione S-transferase protein, also referred to as a GST-tag, which may be about 220 amino acids in length and may be derived from a sequence encoding a wild type glutathione S-transferase. In some embodiments, the sequence encoding a polypeptide protein tag may encode a maltose binding protein, also referred to as MBP-tag or maltose tag, which may be about 370-396 amino acids in length and may be derived from the malE gene of E. coli.
[0085] In some embodiments, the sequence encoding a polypeptide protein tag may be wild type or optimized for expression in C. glutamicum, e.g., the sequences listed in Table 6. In some embodiments, optimization for expression in C. glutamicum may include HEG, RPG, tRNA, general, or OptimWiz optimization and/or optimizing GC content of the DNA sequence.
TABLE 6
Figure imgf000032_0001
[0086] In some embodiments, the sequence encoding a polypeptide protein tag may comprise a nucleic acid sequence having at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identity to the nucleic acid sequences of SEQ ID NO: 19-32.
[0087] In some embodiments, the sequence encoding a polypeptide tag may encode an amino acid sequence having at least 60%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identity to the amino acid sequences of SEQ ID NO: 307-314.
(vi) Sequence Encoding a Linker
[0088] A sequence encoding a linker is a sequence of DNA encoding a polypeptide linker. Sequences encoding linkers may encode cleavable, rigid, and/or flexible polypeptides. Polypeptide linkers, also referred to as linkers, may link functional protein domains together or release free functional domains after cleavage. Linkers may be isolated from or derived from naturally-occurring multidomain proteins, or may be designed de novo. Linkers may increase stability, promote folding, increase expression, or improve biological activity of the protein domains they are fused to. Properties of linkers, including length, hydrophobicity, amino acid residues, and secondary structure, may vary. For instance, linkers may adopt various conformations, such as β-strand, helical, coil/bend, and turns.
[0089] The expression cassettes of the disclosure may comprise a sequence encoding a linker. In some embodiments, the sequence encoding a linker may encode a polypeptide about 3-30 residues in length, for example 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, or 30 residues in length. In some embodiments, the sequence encoding a linker may be located between a 5’ sequence encoding a polypeptide protein tag and a 3’ sequence encoding a cascade reagent protein. In some embodiments, the sequence encoding a linker may be located between a 5’ sequence encoding a cascade reagent protein and a 3’ sequence encoding a polypeptide protein tag. In some embodiments, polar uncharged or charged residues are preferable amino acids of the linker.
[0090] In some embodiments, the sequence encoding a linker may encode a flexible GS linker, for example
Figure imgf000034_0004
(Gly)7, or (Gly)8. In some embodiments, the sequence encoding a linker may encode a rigid α-helical linker, for example
Figure imgf000034_0002
or
Figure imgf000034_0001
In some embodiments, the sequence encoding a linker may encode a rigid proline-rich linker, for example PAPAP, (AP)n, (KP)n, or (EP)n, wherein n is 3-4. In some embodiments, the sequence encoding a linker may encode a cleavable disulfide linker, for example LEAGCKNFFPRSFTSCGSLE, or a cleavable protease linker, for example GFLG.
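The repeat notation used above, such as (AP)n with n of 3 or 4, expands mechanically into an amino acid string, and the result can be compared against the approximately 3-30 residue range given for linkers. The following is a minimal sketch with hypothetical function names, not a tool described in the disclosure.

```python
def repeat_linker(unit: str, n: int) -> str:
    """Expand a repeat-style linker such as (AP)n into its amino acid string."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return unit * n


def within_length_range(linker: str, low: int = 3, high: int = 30) -> bool:
    """True if the linker length falls within the recited 3-30 residue range."""
    return low <= len(linker) <= high
```

For instance, `repeat_linker("AP", 3)` yields a six-residue proline-rich linker that falls inside the recited length range.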
[0091] In some embodiments, the sequence encoding a linker may be optimized for expression in C. glutamicum, e.g., the sequences listed in Table 7. In some embodiments, optimization for expression in C. glutamicum may include HEG, RPG, tRNA, general, or OptimWiz optimization and/or optimizing GC content of the DNA sequence.
TABLE 7
Figure imgf000034_0003
Figure imgf000035_0001
[0092] In some embodiments, the sequence encoding a linker may comprise a nucleic acid sequence having at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identity to the nucleic acid sequences of SEQ ID NO: 265-271.
[0093] In some embodiments, the sequence encoding a linker may encode an amino acid sequence having at least 60%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identity to the amino acid sequences of SEQ ID NO: 315-321.
(vii) Exemplary Expression Cassettes
[0094] In some embodiments, the expression cassette comprises from 5’ to 3’ a promoter, a signal sequence, and a sequence encoding a cascade reagent protein. In some embodiments, the expression cassette comprises SEQ ID NO: 9 (cgl514 promoter), SEQ ID NO: 14 (cgl514 signal sequence), and SEQ ID NO: 1 (HEG optimized Factor C (T. tridentatus)), SEQ ID NO: 325 (OptimWiz optimized Factor C (T. tridentatus)), or SEQ ID NO: 2 (Factor C (C. rotundicauda)). In some embodiments, the expression cassette comprises SEQ ID NO: 9 (cgl514 promoter), SEQ ID NO: 14 (cgl514 signal sequence), and SEQ ID NO: 3 (Factor B (T. tridentatus)) or SEQ ID NO: 4 (Factor B (C. rotundicauda)). In some embodiments, the expression cassette comprises SEQ ID NO: 9 (cgl514 promoter), SEQ ID NO: 14 (cgl514 signal sequence), and SEQ ID NO: 5 (Proclotting Enzyme (T. tridentatus)). In some embodiments, the expression cassette comprises SEQ ID NO: 9 (cgl514 promoter), SEQ ID NO: 14 (cgl514 signal sequence), and SEQ ID NO: 6 (Coagulogen (L. polyphemus)), SEQ ID NO: 7 (Coagulogen (T. tridentatus)), or SEQ ID NO: 8 (Coagulogen (C. rotundicauda)).
[0095] In some embodiments, the expression cassette comprises from 5’ to 3’ a promoter, a signal sequence, a sequence encoding a cascade reagent protein, and a terminator. In some embodiments, the expression cassette comprises SEQ ID NO: 9 (cgl514 promoter), SEQ ID NO: 14 (cgl514 signal sequence), SEQ ID NO: 1 (HEG optimized Factor C (T. tridentatus)), SEQ ID NO: 325 (OptimWiz optimized Factor C (T. tridentatus)), or SEQ ID NO: 2 (Factor C (C. rotundicauda)), and SEQ ID NO: 272 (wild type rrnB termination sequence) or SEQ ID NO: 273 (optimized rrnB termination sequence). In some embodiments, the expression cassette comprises SEQ ID NO: SEQ ID NO: 322
Figure imgf000036_0004
tridentatus version 2)-rrnB terminator), or SEQ ID NO: 113
Figure imgf000036_0001
rotundicauda)-rrnB terminator).
Figure imgf000036_0002
In some embodiments, the expression cassette comprises SEQ ID NO: 9 (cgl514 promoter), SEQ ID NO: 14 (cgl514 signal sequence), SEQ ID NO: 3 (Factor B (T. tridentatus)) or SEQ ID NO: 4 (Factor B (C. rotundicauda)), and SEQ ID NO: 272, SEQ ID NO: 273, SEQ ID NO: 274, SEQ ID NO: 275, SEQ ID NO: 276, or SEQ ID NO: 277 (wild type or optimized E. coli rrnB termination sequences, C. glutamicum wild type cgl502, cg3011, cspA, or cgl338 termination sequences, respectively). In some embodiments, the expression cassette comprises SEQ ID NO: 101 (Pcgl514-cgl514ss-Factor B (T. tridentatus)-rrnB terminator) or SEQ ID NO: 117
Figure imgf000036_0003
rotundicauda)-rrnB terminator). In some embodiments, the expression cassette comprises SEQ ID NO: 9 (cgl514 promoter), SEQ ID NO: 14 (cgl514 signal sequence), SEQ ID NO: 5 (Proclotting Enzyme from T. tridentatus), and SEQ ID NO: 272, SEQ ID NO: 273, SEQ ID NO: 274, SEQ ID NO: 275, SEQ ID NO: 276, or SEQ ID NO: 277 (wild type or optimized E. coli rrnB termination sequences, C. glutamicum wild type cgl502, cg3011, cspA, or cgl338 termination sequences, respectively). In some embodiments, the expression cassette comprises SEQ ID NO: 105 (Pcgl514-cgl514ss-Proclotting Enzyme (T. tridentatus)-rrnB terminator). In some embodiments, the expression cassette comprises SEQ ID NO: 9 (cgl514 promoter), SEQ ID NO: 14 (cgl514 signal sequence), SEQ ID NO: 6 (Coagulogen (L. polyphemus)), SEQ ID NO: 7 (Coagulogen (T. tridentatus)), or SEQ ID NO: 8 (Coagulogen (C. rotundicauda)), and SEQ ID NO: 272, SEQ ID NO: 273, SEQ ID NO: 274, SEQ ID NO: 275, SEQ ID NO: 276, or SEQ ID NO: 277 (wild type or optimized E. coli rrnB termination sequences, C. glutamicum wild type cgl502, cg3011, cspA, or cgl338 termination sequences, respectively). In some embodiments, the expression cassette comprises SEQ ID NO: 109
Figure imgf000037_0005
SEQ ID NO: 121
Figure imgf000037_0004
terminator), or SEQ ID NO: 125 rotundicauda)-rrnB terminator).
Figure imgf000037_0003
[0096] In some embodiments, the expression cassette comprises from 5’ to 3’ a promoter, a signal sequence, a sequence encoding a polypeptide protein tag, a sequence encoding a cascade reagent protein, and a terminator. In some embodiments, the expression cassette comprises from 5’ to 3’ SEQ ID NO: 9 (cgl514 promoter), SEQ ID NO: 34, SEQ ID NO: 37, SEQ ID NO: 39, SEQ ID NO: 41, SEQ ID NO: 43, SEQ ID NO: 45, or SEQ ID NO: 47 (cgl514ss-tag-Factor C where the tag is 6His, FLAG, HA, CBP, SBP, GST, or MBP, respectively), and SEQ ID NO: 272, SEQ ID NO: 273, SEQ ID NO: 274, SEQ ID NO: 275, SEQ ID NO: 276, or SEQ ID NO: 277 (E. coli rrnB wild type or optimized termination sequences, C. glutamicum wild type cgl502, cg3011, cspA, or cgl338 termination sequences, respectively). In some embodiments, the expression cassette comprises SEQ ID NO: 98 tridentatus)-rrnB
Figure imgf000037_0002
terminator), SEQ ID NO: 323
Figure imgf000037_0001
(T. tridentatus version 2)-rrnB terminator), SEQ ID NO: 114 rotundicauda)-rrnB
Figure imgf000038_0004
terminator). In some embodiments, the expression cassette comprises from 5’ to 3’ SEQ ID NO: 9 (cgl514 promoter), SEQ ID NO: 50, SEQ ID NO: 53, SEQ ID NO: 55, SEQ ID NO: 57, SEQ ID NO: 59, SEQ ID NO: 61, or SEQ ID NO: 63 (cgl514ss-tag-Factor B where the tag is 6His, FLAG, HA, CBP, SBP, GST, or MBP, respectively), and SEQ ID NO: 272, SEQ ID NO: 273, SEQ ID NO: 274, SEQ ID NO: 275, SEQ ID NO: 276, or SEQ ID NO: 277 (E. coli rrnB wild type or optimized termination sequences, C. glutamicum wild type cgl502, cg3011, cspA, or cgl338 termination sequences, respectively). In some embodiments, the expression cassette comprises SEQ ID NO terminator) or
Figure imgf000038_0003
SEQ ID NO: 118 rotundicauda)-rrnB terminator).
Figure imgf000038_0002
In some embodiments, the expression cassette comprises from 5’ to 3’ SEQ ID NO: 9 (cgl514 promoter), SEQ ID NO: 66, SEQ ID NO: 69, SEQ ID NO: 71, SEQ ID NO: 73, SEQ ID NO: 75, SEQ ID NO: 77, or SEQ ID NO: 79 g Enzyme where the tag is 6His, FLAG, HA,
Figure imgf000038_0006
CBP, SBP, GST, or MBP, respectively), and SEQ ID NO: 272, SEQ ID NO: 273, SEQ ID NO: 274, SEQ ID NO: 275, SEQ ID NO: 276, or SEQ ID NO: 277 (E. coli rrnB wild type or optimized termination sequences, C. glutamicum wild type cgl502, cg3011, cspA, or cgl338 termination sequences, respectively). In some embodiments, the expression cassette comprises SEQ ID NO: 106 Enzyme (T. tridentatus)-rrnB terminator).
Figure imgf000038_0001
In some embodiments, the expression cassette comprises from 5’ to 3’ SEQ ID NO: 9 (cgl514 promoter), SEQ ID NO: 82, SEQ ID NO: 85, SEQ ID NO: 87, SEQ ID NO: 89, SEQ ID NO: 91, SEQ ID NO: 93, or SEQ ID NO: 95 where the tag is 6His, FLAG, HA, CBP,
Figure imgf000038_0005
SBP, GST, or MBP, respectively), and SEQ ID NO: 272, SEQ ID NO: 273, SEQ ID NO: 274, SEQ ID NO: 275, SEQ ID NO: 276, or SEQ ID NO: 277 (E. coli rrnB wild type or optimized termination sequences, C. glutamicum wild type cgl502, cg3011, cspA, or cgl338 termination sequences, respectively). In some embodiments, the expression cassette comprises SEQ ID NO: SEQ ID NO: 122
Figure imgf000039_0002
Figure imgf000039_0001
terminator), or SEQ ID NO: 126
Figure imgf000039_0003
[0097] In some embodiments, the expression cassette comprises from 5’ to 3’ a promoter, a signal sequence, a sequence encoding a cascade reagent protein, a sequence encoding a polypeptide protein tag, and a terminator. In some embodiments, the expression cassette comprises from 5’ to 3’ SEQ ID NO: 9 (cgl514 promoter), SEQ ID NO: 35, SEQ ID NO: 38, SEQ ID NO: 40, SEQ ID NO: 42, SEQ ID NO: 44, SEQ ID NO: 46, or SEQ ID NO:
Figure imgf000039_0004
tag where the tag is 6His, FLAG, HA, CBP, SBP, GST, or MBP, respectively), and SEQ ID NO: 272, SEQ ID NO: 273, SEQ ID NO: 274, SEQ ID NO: 275, SEQ ID NO: 276, or SEQ ID NO: 277 (E. coli rrnB wild type or optimized termination sequences, C. glutamicum wild type cgl502, cg3011, cspA, or cgl338 termination sequences, respectively). In some embodiments, the expression cassette comprises SEQ ID NO: 99 (Pcgl514-cgl514ss-Factor C (T. tridentatus)-6His-rrnB terminator), SEQ ID NO: 324 (Pcgl514-cgl514ss-Factor C (T. tridentatus version 2)-6His-rrnB terminator), SEQ ID NO: 115 (Pcgl514-cgl514ss-Factor C
Figure imgf000039_0006
terminator). In some embodiments, the expression cassette comprises from 5’ to 3’ SEQ ID NO: 9 (cgl514 promoter), SEQ ID NO: 51, SEQ ID NO: 54, SEQ ID NO: 56, SEQ ID NO: 58, SEQ ID NO: 60, SEQ ID NO: 62, or SEQ ID NO: 64 where the tag is 6His,
Figure imgf000039_0005
FLAG, HA, CBP, SBP, GST, or MBP, respectively), and SEQ ID NO: 272, SEQ ID NO: 273, SEQ ID NO: 274, SEQ ID NO: 275, SEQ ID NO: 276, or SEQ ID NO: 277 (E. coli rrnB wild type or optimized termination sequences, C. glutamicum wild type cgl502, cg3011, cspA, or cgl338 termination sequences, respectively). In some embodiments, the expression cassette comprises SEQ ID NO: 103
Figure imgf000040_0002
terminator) or SEQ ID NO: 119
Figure imgf000040_0001
terminator). In some embodiments, the expression cassette comprises from 5’ to 3’ SEQ ID NO: 9 (cgl514 promoter), SEQ ID NO: 67, SEQ ID NO: 70, SEQ ID NO: 72, SEQ ID NO: 74, SEQ ID NO: 76, SEQ ID NO: 78, or SEQ ID NO: 80
Figure imgf000040_0014
Enzyme-tag where the tag is 6His, FLAG, HA, CBP, SBP, GST, or MBP, respectively), and SEQ ID NO: 272, SEQ ID NO: 273, SEQ ID NO: 274, SEQ ID NO: 275, SEQ ID NO: 276, or SEQ ID NO: 277 (E. coli rrnB wild type or optimized termination sequences,
Figure imgf000040_0010
type or termination
Figure imgf000040_0006
Figure imgf000040_0007
sequences, respectively). In some embodiments, the expression cassette comprises SEQ ID NO: 107 Enzyme
Figure imgf000040_0004
Figure imgf000040_0005
In some embodiments, the expression cassette comprises from 5’ to 3’ SEQ ID NO: 9 (cgl514 promoter), SEQ ID NO: 83, SEQ ID NO: 86, SEQ ID NO: 88, SEQ ID NO: 90, SEQ ID NO: 92, SEQ ID NO: 94, or SEQ ID NO: where the tag is 6His, FLAG, HA, CBP,
Figure imgf000040_0012
SBP, GST, or MBP, respectively), and SEQ ID NO: 272, SEQ ID NO: 273, SEQ ID NO: 274, SEQ ID NO: 275, SEQ ID NO: 276, or SEQ ID NO: 277 (E. coli rrnB wild type or optimized termination sequences, wild type or cgl338 termination
Figure imgf000040_0011
Figure imgf000040_0013
sequences, respectively). In some embodiments, the expression cassette comprises SEQ ID NO: 111
Figure imgf000040_0009
SEQ ID NO: 123 or SEQ ID NO: 127
Figure imgf000040_0003
Figure imgf000040_0008
[0098] In some embodiments, the expression cassette comprises from 5’ to 3’ a promoter, a signal sequence, a sequence encoding a polypeptide protein tag, a sequence encoding a cascade reagent protein, a sequence encoding a polypeptide protein tag, and a terminator. In some embodiments, the expression cassette comprises from 5’ to 3’ SEQ ID NO: 9 (cgl514 promoter), SEQ ID NO: 36 (cgl514ss-6His-Factor C-6His), and SEQ ID NO: 272, SEQ ID NO: 273, SEQ ID NO: 274, SEQ ID NO: 275, SEQ ID NO: 276, or SEQ ID NO: 277 (E. coli rrnB wild type or optimized termination sequences, C. glutamicum wild type cgl502, cg3011, cspA, or cgl338 termination sequences, respectively). In some embodiments, the expression cassette comprises SEQ ID NO: 100
Figure imgf000041_0001
tridentatus)-6His-rrnB terminator) or SEQ ID NO: 116
Figure imgf000041_0002
(C. rotundicauda)-6His-rrnB terminator). In some embodiments, the expression cassette comprises from 5’ to 3’ SEQ ID NO: 9 (cgl514 promoter), SEQ ID NO: 52 (cgl514ss-6His-Factor B-6His), and SEQ ID NO: 272, SEQ ID NO: 273, SEQ ID NO: 274, SEQ ID NO: 275, SEQ ID NO: 276, or SEQ ID NO: 277 (E. coli rrnB wild type or optimized termination sequences, C. glutamicum wild type cgl502, cg3011, cspA, or cgl338 termination sequences, respectively). In some embodiments, the expression cassette comprises SEQ ID NO: 104
Figure imgf000041_0003
(T. tridentatus)-6His-rrnB terminator) or SEQ ID NO: 120
Figure imgf000041_0004
(C. rotundicauda)-6His-rrnB terminator). In some embodiments, the expression cassette comprises from 5’ to 3’ SEQ ID NO: 9 (cgl514 promoter), SEQ ID NO: 68 -Proclotting Enzyme-6His), and
Figure imgf000041_0007
SEQ ID NO: 272, SEQ ID NO: 273, SEQ ID NO: 274, SEQ ID NO: 275, SEQ ID NO: 276, or SEQ ID NO: 277 (E. coli rrnB wild type or optimized termination sequences, C. glutamicum wild type cgl502, cg3011, cspA, or cgl338 termination sequences, respectively). In some embodiments, the expression cassette comprises SEQ ID NO: 108 -Proclotting Enzyme (T.
Figure imgf000041_0005
Figure imgf000041_0006
. In some embodiments, the expression cassette comprises from 5’ to 3’ SEQ ID NO: 9 (cgl514 promoter), SEQ ID NO: 84 (cgl514ss-6His-Coagulogen-6His), and SEQ ID NO: 272, SEQ ID NO: 273, SEQ ID NO: 274, SEQ ID NO: 275, SEQ ID NO: 276, or SEQ ID NO: 277 (E. coli rrnB wild type or optimized termination sequences, C. glutamicum wild type cgl502, cg3011, cspA, or cgl338 termination sequences, respectively). In some embodiments, the expression cassette comprises SEQ ID NO: 112
Figure imgf000042_0006
Coagulogen , SEQ ID NO: 124
Figure imgf000042_0003
Figure imgf000042_0005
Coagulogen
Figure imgf000042_0002
or SEQ ID NO: 128
Figure imgf000042_0004
6His-Coagulogen
Figure imgf000042_0001
[0099] In some embodiments, the expression cassette comprises from 5’ to 3’ a promoter, a signal sequence, a sequence encoding a cascade reagent protein, and a terminator. In some embodiments, the expression cassette comprises from 5’ to 3’ SEQ ID NO: 9 (cg1514 promoter), SEQ ID NO: 33 […], and SEQ ID NO: 272, SEQ ID NO: 273, SEQ ID NO: 274, SEQ ID NO: 275, SEQ ID NO: 276, or SEQ ID NO: 277 (E. coli rrnB wild type or optimized termination sequences, C. glutamicum wild type cg1502, cg3011, cspA, or cg1338 termination sequences, respectively). In some embodiments, the expression cassette comprises from 5’ to 3’ SEQ ID NO: 9 (cg1514 promoter), SEQ ID NO: 49 […], and SEQ ID NO: 272, SEQ ID NO: 273, SEQ ID NO: 274, SEQ ID NO: 275, SEQ ID NO: 276, or SEQ ID NO: 277 (E. coli rrnB wild type or optimized termination sequences, C. glutamicum wild type cg1502, cg3011, cspA, or cg1338 termination sequences, respectively). In some embodiments, the expression cassette comprises from 5’ to 3’ SEQ ID NO: 9 (cg1514 promoter), SEQ ID NO: 65 ([…] Proclotting Enzyme), and SEQ ID NO: 272, SEQ ID NO: 273, SEQ ID NO: 274, SEQ ID NO: 275, SEQ ID NO: 276, or SEQ ID NO: 277 (E. coli rrnB wild type or optimized termination sequences, C. glutamicum wild type cg1502, cg3011, cspA, or cg1338 termination sequences, respectively). In some embodiments, the expression cassette comprises from 5’ to 3’ SEQ ID NO: 9 (cg1514 promoter), SEQ ID NO: 81 […], and SEQ ID NO: 272, SEQ ID NO: 273, SEQ ID NO: 274, SEQ ID NO: 275, SEQ ID NO: 276, or SEQ ID NO: 277 (E. coli rrnB wild type or optimized termination sequences, C. glutamicum wild type cg1502, cg3011, cspA, or cg1338 termination sequences, respectively).
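The cassettes in this paragraph share one 5’ to 3’ order and differ only in which of six termination sequences closes the cassette. As an illustrative sketch only (Python; the class and function names are hypothetical and not part of the disclosure), the six terminator variants for one promoter and one coding region can be enumerated as follows. The SEQ ID numbers and terminator descriptions are taken from the text; the sketch treats the middle SEQ ID NO as a single coding element because the text lists one SEQ ID NO in that position.

```python
from dataclasses import dataclass

# Terminator options as listed in the text, keyed by SEQ ID NO.
TERMINATORS = {
    272: "E. coli rrnB (wild type)",
    273: "E. coli rrnB (optimized)",
    274: "C. glutamicum cg1502",
    275: "C. glutamicum cg3011",
    276: "C. glutamicum cspA",
    277: "C. glutamicum cg1338",
}


@dataclass(frozen=True)
class Cassette:
    """Hypothetical container holding SEQ ID NOs in 5' to 3' order."""
    promoter: int
    coding: int
    terminator: int

    def parts(self):
        # Order matters: promoter, then coding region, then terminator.
        return [self.promoter, self.coding, self.terminator]


def terminator_variants(promoter, coding):
    """Return one cassette per terminator option, in SEQ ID order."""
    return [Cassette(promoter, coding, t) for t in sorted(TERMINATORS)]


variants = terminator_variants(promoter=9, coding=33)
for cassette in variants:
    print(cassette.parts(), TERMINATORS[cassette.terminator])
```

The same call with coding set to 49, 65, or 81 enumerates the other cassettes named in this paragraph.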
Methods of Recombinant Protein Expression and Purification
[0100] The disclosure provides methods of recombinant protein expression. In some embodiments, the expression cassette is cloned into a plasmid. In some embodiments, the expression cassette may be cloned into a multiple cloning site of a plasmid using restriction enzyme cloning, Gateway cloning, or TOPO cloning. In some embodiments, the expression cassette may be Gibson assembled into a plasmid. In some embodiments, the expression cassette may be inserted into a plasmid using a combination of restriction enzyme cloning, Gateway cloning, TOPO cloning, and/or Gibson assembly. In some embodiments, nucleic acid sequences may comprise restriction enzyme recognition sites and/or recombination sequences to facilitate cloning. In some embodiments, restriction enzyme recognition sites and/or recombination sequences may be located at the 5’ and/or 3’ ends of: the promoter, the signal sequence, the sequence encoding a cascade reagent protein, the termination sequence, the sequence encoding a polypeptide tag, and the sequence encoding a linker. In some embodiments, restriction enzyme recognition sites and/or recombination sequences may be located at the 5’ and/or 3’ ends of two or more sequences encoding polypeptide protein tags and two or more sequences encoding linkers.
[0101] In some embodiments, a plasmid may be a cloning vector, a transfer vector, a shuttle vector, or an expression vector. In some embodiments, a suitable plasmid may be a mobilizable E. coli - C. glutamicum shuttle vector. In some embodiments, a suitable plasmid may be the pEC-pk18mob2 plasmid.
[0102] The disclosure provides methods of recombinant protein purification. In some embodiments, the RCRs of the present invention may be purified from cultures of recombinant C. glutamicum cells expressing nucleic acid molecules, including expression cassettes and plasmids. In some embodiments, the expression cassette comprises a sequence encoding a polypeptide tag fused to the 5’ end or the 3’ end of the sequence encoding a cascade reagent protein. The polypeptide tag may comprise a solubilization tag that facilitates proper protein folding and prevents precipitation during purification. The polypeptide tag may comprise an affinity tag that facilitates affinity purification. The polypeptide tag may comprise a chromatographic tag that modulates resolution during chromatographic separation. The polypeptide tag may comprise an epitope tag that facilitates antibody purification.
[0103] In some embodiments, the RCR may be purified from culture supernatant or cell lysate using column chromatography. In some embodiments, the culture supernatant or cell lysate may be applied to a column, the column may be washed, and bound protein may be eluted from the column. In some embodiments, additives and chelating agents, e.g., EDTA, may be incorporated into buffers during purification. In some embodiments, the tagged protein binds to the column matrix and may be eluted by competitive binding, cleavage of the protein tag, or by destabilization of the interaction between the protein tag and the column matrix, e.g., by a change of pH. In some embodiments, the RCR may be purified by fast protein liquid chromatography (FPLC), batch spin, or drip columns. In some embodiments, elution fractions may be assayed for protein concentration and RCR activity and concentrated to obtain higher protein concentrations. In some embodiments, the RCR is purified to apparent homogeneity.
[0104] In some embodiments, the isolated, purified protein molecule is an RCR derived from T. tridentatus, e.g., serine protease zymogen or clotting protein optimized for expression in C. glutamicum. In some embodiments, the amino acid sequence of the serine protease zymogen is at least 75% identical to SEQ ID NO: 257 (Factor C), SEQ ID NO: 259 (Factor B), or SEQ ID
NO: 261 (Proclotting Enzyme). In some embodiments, the amino acid sequence of the clotting protein is at least 75% identical to SEQ ID NO: 263 (Coagulogen).
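The "at least 75% identical" criterion used here and in the paragraphs that follow can be illustrated with a minimal sketch. It assumes the two amino acid sequences have already been aligned to equal length with "-" marking gaps, and it counts identity over aligned columns. This passage does not state which alignment method or identity definition applies, so both are assumptions; the function names and toy sequences are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of aligned columns in which both sequences carry the same residue."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Ignore columns that are gaps in both sequences.
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == "-" and b == "-")]
    if not columns:
        raise ValueError("no aligned columns to compare")
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


def meets_threshold(aligned_a: str, aligned_b: str, threshold: float = 75.0) -> bool:
    """True when identity is at or above the threshold (75% by default)."""
    return percent_identity(aligned_a, aligned_b) >= threshold


# Toy sequences (not from the sequence listing): 8 matches over 10 columns.
reference = "MKTAYIAKQR"
candidate = "MKTAYLAKQ-"
print(percent_identity(reference, candidate))  # 80.0
print(meets_threshold(reference, candidate))   # True
```

In practice the denominator (alignment length, shorter sequence, or reference length) changes the result, so the definition in force should be fixed before applying the threshold.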
[0105] In some embodiments, the isolated, purified protein molecule is an RCR derived from C. rotundicauda including homologs thereof, e.g., Factor B C3 and C2/Bf. In some embodiments, the isolated, purified protein molecule is a serine protease zymogen or clotting protein optimized for expression in C. glutamicum. In some embodiments, the amino acid sequence of the serine protease zymogen is at least 75% identical to SEQ ID NO: 258 (Factor C) or SEQ ID NO: 260 (Factor B). In some embodiments, the amino acid sequence of the clotting protein is at least 75% identical to SEQ ID NO: 264 (Coagulogen).
[0106] In some embodiments, the isolated, purified protein molecule is an N-terminal signal peptide fused to an RCR derived from T. tridentatus, e.g., a serine protease zymogen or clotting protein, and optimized for expression in C. glutamicum. In some embodiments, the amino acid sequence of the RCR is at least 75% identical to SEQ ID NO: 129 (cg1514ss-Factor C), SEQ ID NO: 145 (cg1514ss-Factor B), SEQ ID NO: 161 (cg1514ss-Proclotting Enzyme), or SEQ ID NO: 177 (cg1514ss-Coagulogen).
[0107] In some embodiments, the isolated, purified protein molecule is an N-terminal signal peptide fused to an RCR derived from C. rotundicauda or L. polyphemus, e.g., a serine protease zymogen or clotting protein, and optimized for expression in C. glutamicum. In some embodiments, the amino acid sequence of the RCR is at least 75% identical to SEQ ID NO: 193 (cg1514ss-Factor C from C. rotundicauda), SEQ ID NO: 209 (cg1514ss-Factor B from C. rotundicauda), SEQ ID NO: 225 (cg1514ss-Coagulogen (L. polyphemus)), or SEQ ID NO: 241 (cg1514ss-Coagulogen (C. rotundicauda)).
[0108] In some embodiments, the isolated, purified protein molecule is an N-terminal signal peptide fused to an N-terminally tagged RCR derived from T. tridentatus and optimized for expression in C. glutamicum. In some embodiments, the isolated, purified protein molecule is an N-terminal signal peptide fused to an N-terminally tagged Factor C derived from T. tridentatus and optimized for expression in C. glutamicum. In some embodiments, the amino acid sequence of the isolated, purified protein molecule is at least 75% identical to SEQ ID NO: 130, SEQ ID NO: 133, SEQ ID NO: 135, SEQ ID NO: 137, SEQ ID NO: 139, SEQ ID NO: 141, or SEQ ID NO: 143 (cg1514ss-tag-Factor C where the tag is 6His, FLAG, HA, CBP, SBP, GST, or MBP, respectively). In some embodiments, the isolated, purified protein molecule is an N-terminal signal peptide fused to an N-terminally tagged Factor B derived from T. tridentatus and optimized for expression in C. glutamicum. In some embodiments, the amino acid sequence of the isolated, purified protein molecule is at least 75% identical to SEQ ID NO: 146, SEQ ID NO: 149, SEQ ID NO: 151, SEQ ID NO: 153, SEQ ID NO: 155, SEQ ID NO: 157, or SEQ ID NO: 159 (cg1514ss-tag-Factor B where the tag is 6His, FLAG, HA, CBP, SBP, GST, or MBP, respectively). In some embodiments, the isolated, purified protein molecule is an N-terminal signal peptide fused to an N-terminally tagged Proclotting Enzyme derived from T. tridentatus and optimized for expression in C. glutamicum. In some embodiments, the amino acid sequence of the isolated, purified protein molecule is at least 75% identical to SEQ ID NO: 162, SEQ ID NO: 165, SEQ ID NO: 167, SEQ ID NO: 169, SEQ ID NO: 171, SEQ ID NO: 173, or SEQ ID NO: 175 (cg1514ss-tag-Proclotting Enzyme where the tag is 6His, FLAG, HA, CBP, SBP, GST, or MBP, respectively). In some embodiments, the isolated, purified protein molecule is an N-terminal signal peptide fused to an N-terminally tagged Coagulogen derived from T. tridentatus and optimized for expression in C. glutamicum. In some embodiments, the amino acid sequence of the isolated, purified protein molecule is at least 75% identical to SEQ ID NO: 178, SEQ ID NO: 181, SEQ ID NO: 183, SEQ ID NO: 185, SEQ ID NO: 187, SEQ ID NO: 189, or SEQ ID NO: 191 (cg1514ss-tag-Coagulogen where the tag is 6His, FLAG, HA, CBP, SBP, GST, or MBP, respectively).
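Each "respectively" list in this paragraph pairs seven SEQ ID NOs with seven tags in order. A short sketch of that pairing for the N-terminally tagged T. tridentatus proteins is given below; it uses only the numbers stated above, and the dictionary layout and function name are illustrative rather than part of the disclosure.

```python
# Tag order as given in each "respectively" clause.
TAGS = ["6His", "FLAG", "HA", "CBP", "SBP", "GST", "MBP"]

# SEQ ID NOs for signal peptide + N-terminal tag + protein (T. tridentatus).
N_TAGGED_SEQ_IDS = {
    "Factor C": [130, 133, 135, 137, 139, 141, 143],
    "Factor B": [146, 149, 151, 153, 155, 157, 159],
    "Proclotting Enzyme": [162, 165, 167, 169, 171, 173, 175],
    "Coagulogen": [178, 181, 183, 185, 187, 189, 191],
}


def seq_id_for(protein: str, tag: str) -> int:
    """Look up the SEQ ID NO for a given protein and N-terminal tag."""
    by_tag = dict(zip(TAGS, N_TAGGED_SEQ_IDS[protein]))
    return by_tag[tag]


print(seq_id_for("Factor B", "HA"))  # 151
```

The C-terminally tagged and C. rotundicauda or L. polyphemus lists in the following paragraphs follow the same tag order and could be added as further dictionaries.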
[0109] In some embodiments, the isolated, purified protein molecule is an N-terminal signal peptide fused to an N-terminally tagged RCR derived from C. rotundicauda and optimized for expression in C. glutamicum. In some embodiments, the isolated, purified protein molecule is an N-terminal signal peptide fused to an N-terminally tagged Factor C derived from C. rotundicauda and optimized for expression in C. glutamicum. In some embodiments, the amino acid sequence of the isolated, purified protein molecule is at least 75% identical to SEQ ID NO: 194, SEQ ID NO: 197, SEQ ID NO: 199, SEQ ID NO: 201, SEQ ID NO: 203, SEQ ID NO: 205, or SEQ ID NO: 207 (cg1514ss-tag-Factor C where the tag is 6His, FLAG, HA, CBP, SBP, GST, or MBP, respectively). In some embodiments, the isolated, purified protein molecule is an N-terminal signal peptide fused to an N-terminally tagged Factor B derived from C. rotundicauda and optimized for expression in C. glutamicum. In some embodiments, the amino acid sequence of the isolated, purified protein molecule is at least 75% identical to SEQ ID NO: 210, SEQ ID NO: 213, SEQ ID NO: 215, SEQ ID NO: 217, SEQ ID NO: 219, SEQ ID NO: 221, or SEQ ID NO: 223 (cg1514ss-tag-Factor B where the tag is 6His, FLAG, HA, CBP, SBP, GST, or MBP, respectively). In some embodiments, the isolated, purified protein molecule is an N-terminal signal peptide fused to an N-terminally tagged Coagulogen derived from L. polyphemus and optimized for expression in C. glutamicum. In some embodiments, the amino acid sequence of the isolated, purified protein molecule is at least 75% identical to SEQ ID NO: 226, SEQ ID NO: 229, SEQ ID NO: 231, SEQ ID NO: 233, SEQ ID NO: 235, SEQ ID NO: 237, or SEQ ID NO: 239 (cg1514ss-tag-Coagulogen where the tag is 6His, FLAG, HA, CBP, SBP, GST, or MBP, respectively). In some embodiments, the isolated, purified protein molecule is an N-terminal signal peptide fused to an N-terminally tagged Coagulogen derived from C. rotundicauda and optimized for expression in C. glutamicum. In some embodiments, the amino acid sequence of the isolated, purified protein molecule is at least 75% identical to SEQ ID NO: 242, SEQ ID NO: 245, SEQ ID NO: 247, SEQ ID NO: 249, SEQ ID NO: 251, SEQ ID NO: 253, or SEQ ID NO: 255 (cg1514ss-tag-Coagulogen where the tag is 6His, FLAG, HA, CBP, SBP, GST, or MBP, respectively).
[0110] In some embodiments, the isolated, purified protein molecule is an N-terminal signal peptide fused to a C-terminally tagged RCR derived from T. tridentatus and optimized for expression in C. glutamicum. In some embodiments, the isolated, purified protein molecule is an N-terminal signal peptide fused to a C-terminally tagged Factor C derived from T. tridentatus and optimized for expression in C. glutamicum. In some embodiments, the amino acid sequence of the isolated, purified protein molecule is at least 75% identical to SEQ ID NO: 131, SEQ ID NO: 134, SEQ ID NO: 136, SEQ ID NO: 138, SEQ ID NO: 140, SEQ ID NO: 142, or SEQ ID NO: 144 (cg1514ss-Factor C-tag where the tag is 6His, FLAG, HA, CBP, SBP, GST, or MBP, respectively). In some embodiments, the isolated, purified protein molecule is an N-terminal signal peptide fused to a C-terminally tagged Factor B derived from T. tridentatus and optimized for expression in C. glutamicum. In some embodiments, the amino acid sequence of the isolated, purified protein molecule is at least 75% identical to SEQ ID NO: 147, SEQ ID NO: 150, SEQ ID NO: 152, SEQ ID NO: 154, SEQ ID NO: 156, SEQ ID NO: 158, or SEQ ID NO: 160 (cg1514ss-Factor B-tag where the tag is 6His, FLAG, HA, CBP, SBP, GST, or MBP, respectively). In some embodiments, the isolated, purified protein molecule is an N-terminal signal peptide fused to a C-terminally tagged Proclotting Enzyme derived from T. tridentatus and optimized for expression in C. glutamicum. In some embodiments, the amino acid sequence of the isolated, purified protein molecule is at least 75% identical to SEQ ID NO: 163, SEQ ID NO: 166, SEQ ID NO: 168, SEQ ID NO: 170, SEQ ID NO: 172, SEQ ID NO: 174, or SEQ ID NO: 176 (cg1514ss-Proclotting Enzyme-tag where the tag is 6His, FLAG, HA, CBP, SBP, GST, or MBP, respectively). In some embodiments, the isolated, purified protein molecule is an N-terminal signal peptide fused to a C-terminally tagged Coagulogen derived from T. tridentatus and optimized for expression in C. glutamicum. In some embodiments, the amino acid sequence of the isolated, purified protein molecule is at least 75% identical to SEQ ID NO: 179, SEQ ID NO: 182, SEQ ID NO: 184, SEQ ID NO: 186, SEQ ID NO: 188, SEQ ID NO: 190, or SEQ ID NO: 192 (cg1514ss-Coagulogen-tag where the tag is 6His, FLAG, HA, CBP, SBP, GST, or MBP, respectively).
[0111] In some embodiments, the isolated, purified protein molecule is an N-terminal signal peptide fused to a C-terminally tagged RCR derived from C. rotundicauda and optimized for expression in C. glutamicum. In some embodiments, the isolated, purified protein molecule is an N-terminal signal peptide fused to a C-terminally tagged Factor C derived from C. rotundicauda and optimized for expression in C. glutamicum. In some embodiments, the amino acid sequence of the isolated, purified protein molecule is at least 75% identical to SEQ ID NO: 195, SEQ ID NO: 198, SEQ ID NO: 200, SEQ ID NO: 202, SEQ ID NO: 204, SEQ ID NO: 206, or SEQ ID NO: 208 (cg1514ss-Factor C-tag where the tag is 6His, FLAG, HA, CBP, SBP, GST, or MBP, respectively). In some embodiments, the isolated, purified protein molecule is an N-terminal signal peptide fused to a C-terminally tagged Factor B derived from C. rotundicauda and optimized for expression in C. glutamicum. In some embodiments, the amino acid sequence of the isolated, purified protein molecule is at least 75% identical to SEQ ID NO: 211, SEQ ID NO: 214, SEQ ID NO: 216, SEQ ID NO: 218, SEQ ID NO: 220, SEQ ID NO: 222, or SEQ ID NO: 224 (cg1514ss-Factor B-tag where the tag is 6His, FLAG, HA, CBP, SBP, GST, or MBP, respectively). In some embodiments, the isolated, purified protein molecule is an N-terminal signal peptide fused to a C-terminally tagged Coagulogen derived from L. polyphemus and optimized for expression in C. glutamicum. In some embodiments, the amino acid sequence of the isolated, purified protein molecule is at least 75% identical to SEQ ID NO: 227, SEQ ID NO: 230, SEQ ID NO: 232, SEQ ID NO: 234, SEQ ID NO: 236, SEQ ID NO: 238, or SEQ ID NO: 240 (cg1514ss-Coagulogen-tag where the tag is 6His, FLAG, HA, CBP, SBP, GST, or MBP, respectively). In some embodiments, the isolated, purified protein molecule is an N-terminal signal peptide fused to a C-terminally tagged Coagulogen derived from C. rotundicauda and optimized for expression in C. glutamicum. In some embodiments, the amino acid sequence of the isolated, purified protein molecule is at least 75% identical to SEQ ID NO: 243, SEQ ID NO: 246, SEQ ID NO: 248, SEQ ID NO: 250, SEQ ID NO: 252, SEQ ID NO: 254, or SEQ ID NO: 256 (cg1514ss-Coagulogen-tag where the tag is 6His, FLAG, HA, CBP, SBP, GST, or MBP, respectively).
[0112] In some embodiments, the isolated, purified protein molecule is an N-terminal signal peptide fused to an N-terminally and C-terminally tagged RCR derived from T. tridentatus or C. rotundicauda and optimized for expression in C. glutamicum. In some embodiments, the isolated, purified protein molecule is an N-terminal signal peptide fused to an N-terminally and C-terminally tagged Factor C derived from T. tridentatus or C. rotundicauda and optimized for expression in C. glutamicum. In some embodiments, the amino acid sequence of the isolated, purified protein molecule is at least 75% identical to SEQ ID NO: 132 (cg1514ss-6His-Factor C (T. tridentatus)-6His) or SEQ ID NO: 196 (cg1514ss-6His-Factor C (C. rotundicauda)-6His). In some embodiments, the isolated, purified protein molecule is an N-terminal signal peptide fused to an N-terminally and C-terminally tagged Factor B derived from T. tridentatus or C. rotundicauda and optimized for expression in C. glutamicum. In some embodiments, the amino acid sequence of the isolated, purified protein molecule is at least 75% identical to SEQ ID NO: 148 (cg1514ss-6His-Factor […]) or SEQ ID NO: 212 (cg1514ss-6His-Factor B (C. rotundicauda)-6His). In some embodiments, the isolated, purified protein molecule is an N-terminal signal peptide fused to an N-terminally and C-terminally tagged Proclotting Enzyme derived from T. tridentatus and optimized for expression in C. glutamicum. In some embodiments, the amino acid sequence of the isolated, purified protein molecule is at least 75% identical to SEQ ID NO: 164 (cg1514ss-6His-Proclotting Enzyme […]). In some embodiments, the isolated, purified protein molecule is an N-terminal signal peptide fused to an N-terminally and C-terminally tagged Coagulogen derived from T. tridentatus, L. polyphemus, or C. rotundicauda and optimized for expression in C. glutamicum. In some embodiments, the amino acid sequence of the isolated, purified protein molecule is at least 75% identical to SEQ ID NO: 180 (cg1514ss-6His-Coagulogen […]), SEQ ID NO: 228 (cg1514ss-6His-Coagulogen (L. […]), or SEQ ID NO: 244 (cg1514ss-6His-Coagulogen (C. rotundicauda)-6His).
Kits and Methods for Detecting a Pyrogen or Endotoxin in a Sample
[0113] The disclosure provides kits and methods for detecting a pyrogen or endotoxin in a sample. In some embodiments, the kit comprises one or more of the RCR proteins of the present disclosure. In some embodiments, the kit comprises one or more of recombinant Factor C serine protease zymogen, recombinant Factor B serine protease zymogen, recombinant Proclotting Enzyme serine protease zymogen, and recombinant Coagulogen clotting protein. In some embodiments, the kit comprises one or more RCR proteins expressed in C. glutamicum and purified to apparent homogeneity.
[0114] The disclosure provides methods for detecting a pyrogen or endotoxin in a sample, comprising contacting the sample with one or more RCR proteins expressed in C. glutamicum and purified to apparent homogeneity. In some embodiments, the method comprises contacting the sample with one or more of the components of the kit described herein, including recombinant Factor C serine protease zymogen, recombinant Factor B serine protease zymogen, recombinant Proclotting Enzyme serine protease zymogen, and recombinant Coagulogen clotting protein. In some embodiments, the method comprises contacting the sample with one or more of the components of the kit described herein in combination with a commercialized natural lysate reagent. In some embodiments, the method for detecting a pyrogen or endotoxin in a sample comprises the limited proteolysis of each protease zymogen in the coagulation cascade reaction of the LAL assay.
[0115] In some embodiments, the method for detecting a pyrogen or endotoxin in a sample may comprise admixing one or more components of the kit with the sample, separating precipitated proteins from the sample, admixing one or more components of the kit with the remaining sample, and measuring coagulation. Measuring coagulation may include observing increased turbidity and viscosity. In some embodiments, the method further comprises centrifugation of the sample, sedimentation and separation of the sample, and/or removal of one or more layers or portions of the sample.
EXAMPLES
[0116] The following examples are not intended to be limiting and are included herein for illustration purposes only.
Example 1: Preparation of RCR expression cassettes
[0117] Expression cassettes of the present disclosure include nucleic acid molecules comprising a promoter, a signal sequence, a gene of interest, and a termination sequence, and may include a polypeptide tag. In an exemplary embodiment of the present invention, the expression cassette comprises a promoter, a signal sequence, a gene of interest, and a termination sequence (FIG. 2A, number 1). In an embodiment, the expression cassette comprises a promoter, a signal sequence, an N-terminally tagged gene of interest, and a termination sequence (FIG. 2A, number 2). In an embodiment, the expression cassette comprises a promoter, a signal sequence, a C-terminally tagged gene of interest, and a termination sequence (FIG. 2A, number 3).
[0118] In an embodiment, standard cloning techniques were used to construct RCR expression cassettes comprising the […], the cg1514 signal sequence indicated by cg1514ss (SEQ ID NO: 14), the T. tridentatus Factor C gene optimized for expression in C. glutamicum (SEQ ID NO: 325), the E. coli rrnBT1T2 terminator sequence indicated by rrnB terminator (SEQ ID NO: 272), and optionally a polyhistidine-tag optimized for expression in C. glutamicum (SEQ ID NO: 26). Three RCR expression cassettes were engineered to result in a secretory expression system based on the Cg1514 secreted protein of C. glutamicum by using the promoter ([…]) and signal sequence (cg1514ss) of cg1514.
[0119] FIG. 2B shows schematic representations of the three RCR expression cassettes optimized for expression in C. glutamicum. Expression cassette number 4 (SEQ ID NO: 322) comprises the Pcg1514 promoter (SEQ ID NO: 9), the cg1514 signal sequence (SEQ ID NO: 14), Factor C gene (SEQ ID NO: 325), and the rrnB terminator (SEQ ID NO: 272). Expression cassette number 5 (SEQ ID NO: 323) comprises the Pcg1514 promoter (SEQ ID NO: 9), the cg1514 signal sequence (SEQ ID NO: 14), polyhistidine-tag (SEQ ID NO: 26), Factor C gene (SEQ ID NO: 325), and the rrnB terminator (SEQ ID NO: 272). Expression cassette number 6 (SEQ ID NO: 324) comprises the Pcg1514 promoter (SEQ ID NO: 9), the cg1514 signal sequence (SEQ ID NO: 14), Factor C gene (SEQ ID NO: 325), polyhistidine-tag (SEQ ID NO: 26), and the rrnB terminator (SEQ ID NO: 272). The three RCR expression cassettes comprise the nucleic acid sequences of SEQ ID NO: 322, SEQ ID NO: 323, and SEQ ID NO: 324, for expression of Factor C (FIG. 2B, number 4), N-terminally polyhistidine-tagged Factor C (FIG. 2B, number 5), and C-terminally polyhistidine-tagged Factor C (FIG. 2B, number 6), respectively.
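The three cassettes of FIG. 2B differ only in whether and where the polyhistidine-tag sits relative to the Factor C gene. The sketch below records the part order stated above and derives the tag position from it. It is an illustration only: the part keys and function names are hypothetical, while the SEQ ID numbers come from the text.

```python
# SEQ ID NOs of the individual parts named in the text.
PARTS = {
    "promoter": 9,
    "signal": 14,
    "his_tag": 26,
    "factor_c": 325,
    "terminator": 272,
}

# 5' to 3' part order for each cassette, keyed by the cassette's SEQ ID NO.
CASSETTES = {
    322: ["promoter", "signal", "factor_c", "terminator"],             # FIG. 2B, number 4
    323: ["promoter", "signal", "his_tag", "factor_c", "terminator"],  # FIG. 2B, number 5
    324: ["promoter", "signal", "factor_c", "his_tag", "terminator"],  # FIG. 2B, number 6
}


def seq_ids(cassette_id):
    """Return the part SEQ ID NOs of a cassette in 5' to 3' order."""
    return [PARTS[name] for name in CASSETTES[cassette_id]]


def tag_position(cassette_id):
    """Return 'N-terminal', 'C-terminal', or None for an untagged cassette."""
    order = CASSETTES[cassette_id]
    if "his_tag" not in order:
        return None
    before_gene = order.index("his_tag") < order.index("factor_c")
    return "N-terminal" if before_gene else "C-terminal"


for cid in CASSETTES:
    print(cid, seq_ids(cid), tag_position(cid))
```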
Example 2: Expression of recombinant expression cassettes in C. glutamicum
[0120] Each of the three RCR expression cassettes was cloned into a multiple cloning site (MCS) of the pEC-pk18mob2 plasmid, resulting in three plasmids comprising each of the three RCR expression cassettes. The pEC-pk18mob2 plasmid is a mobilizable E. coli - C. glutamicum shuttle vector based on a mini-replicon encoding the repA and per functions of the medium copy number plasmid pGA1. Each of the three plasmids, as well as pEC-pk18mob2 empty plasmid, was transformed separately into C. glutamicum. For plasmid expression confirmation, a single colony of each of the transformations was isolated from a fresh LBG plate (Luria Broth - Lennox’s formulation supplemented with 0.5% glucose), inoculated in LBG broth, and incubated at 30 °C shaking at 200 revolutions per minute (RPM) for about 6 - 8 hours. A sample of each of the transformations was then inoculated into fresh LBG broth and incubated at 30 °C shaking at 200 RPM for about 14 - 16 hours. Samples of each of the plasmid transformations were removed for Gram staining. Gram-positive bacteria (Bacillus cereus) and gram-negative bacteria (Escherichia coli K12) were used as positive and negative controls and are indicated by B. cereus and E. coli, respectively. Untransformed, rod-shaped C. glutamicum was also Gram stained. A Gram stain shows gram-positive B. cereus and gram-negative E. coli (FIG. 3A, top left), gram-positive C. glutamicum (FIG. 3A, top middle), gram-positive C. glutamicum transformed with pK18mob2 empty plasmid (FIG. 3A, top right), gram-positive C. glutamicum transformed with the Factor C expression cassette plasmid of FIG. 2B number 4, indicated by pK18mob2 - FC (FIG. 3A, bottom left), gram-positive C. glutamicum transformed with the Factor C expression cassette plasmid of FIG. 2B number 6, indicated by pK18mob2 - FC-CHis6 (FIG. 3A, bottom middle), and gram-positive C. glutamicum transformed with the Factor C expression cassette plasmid of FIG. 2B number 5, indicated by pK18mob2 - FC-NHis6 (FIG. 3A, bottom right).
[0121] Cells were harvested by centrifugation at 3500 RPM for 20 minutes at 4 °C. Supernatants of each of the controls and three experimental plasmid transformations were removed and the cell pellets were washed once with STE buffer (10 mM Tris, 10 mM NaCl, 1 mM EDTA, pH 8.0). Cell pellets were frozen at -20 °C overnight. Cell pellets were thawed and resuspended in STE buffer supplemented with 500 mM sucrose and 10 mg/mL lysozyme, then shaken at 200 RPM at 37 °C for one hour. Plasmids were isolated from bacteria using alkaline lysis, and samples were subjected to gel electrophoresis at 80 volts for 120 minutes at room temperature on a 1% agarose gel in 1X TAE buffer. Safe DNA Gel Stain (Bioland Scientific) was used to visualize DNA under blue LED light (FIG. 3B). Lane 1 shows no DNA present from the C. glutamicum negative control, lane 2 shows the expected molecular weight of the pEC-pk18mob2 empty plasmid expressed in C. glutamicum, lane 3 shows the expected molecular weight of the plasmid comprising expression cassette number 4 of FIG. 2B (SEQ ID NO: 322) expressed in C. glutamicum, lane 4 shows the expected molecular weight of the plasmid comprising expression cassette number 6 of FIG. 2B (SEQ ID NO: 324) expressed in C. glutamicum, and lane 5 shows the expected molecular weight of the plasmid comprising expression cassette number 5 of FIG. 2B (SEQ ID NO: 323) expressed in C. glutamicum.
Example 3: Expression of recombinant Factor C in C. glutamicum
[0122] C. glutamicum expression cassette 4 of FIG. 2B (SEQ ID NO: 322; pK18mob2 - FC) was cultivated in 14 mL round-bottom culture tubes containing 2.5 mL brain heart infusion (BHI; Carolina Biological Supply Company, Burlington, NC) medium at 30 °C for 48 hours at 200 RPM. In all cultivations, kanamycin (50 mg/L) was added to the culture medium as the sole antibiotic. As a seed culture, cells were inoculated into 50 mL of semi-defined medium containing 20 g/L of glucose in a 250 mL baffled flask and cultivated at 30 °C for 24 hours at 200 RPM. The semi-defined medium consists of 0.5 g urea, 0.25 mg ZnSO4, 2.5 mg CaCl2 in BHI media. The seed culture (40 mL) was inoculated into 400 mL of fresh semi-defined medium in a 1 L jar custom-built bioreactor. Throughout cultivation, the temperature was maintained at 30 °C and stirred with axial flow impeller at 300 RPM. Oxygen concentration was maximized by continual sterile air flow into the medium. The pH was maintained at 7.0 by adding 10% v/v ammonium hydroxide solution (LabChem, Zelienople, PA) when the set point dropped below 7 or 37% hydrochloric acid (GTI Laboratory Supplies, Edna, Texas) when the set point increased above 7. To prevent glucose starvation, a glucose solution (90 g in 150 mL BHI) was added to the culture in 90 second increments at a rate of 12.5 mL/hr.
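The feed and inoculation figures in this paragraph imply a few derived quantities that are not stated explicitly. The following worked calculation is a sketch under stated assumptions: it treats 12.5 mL/hr as the average volumetric feed rate, treats "90 second increments" as one dose every 90 seconds (an interpretation, not something the text spells out), and ignores changes in reactor volume. The function and key names are hypothetical.

```python
def feed_summary(glucose_g, feed_volume_ml, feed_rate_ml_per_h, dose_interval_s):
    """Derive feed quantities from the stated feed composition and rate."""
    conc_g_per_ml = glucose_g / feed_volume_ml
    return {
        "feed_conc_g_per_l": conc_g_per_ml * 1000.0,
        "glucose_g_per_h": conc_g_per_ml * feed_rate_ml_per_h,
        "hours_until_feed_exhausted": feed_volume_ml / feed_rate_ml_per_h,
        # Assumes one dose per interval; the text does not spell this out.
        "ml_per_dose": feed_rate_ml_per_h * dose_interval_s / 3600.0,
    }


# Values from the text: 90 g glucose in 150 mL BHI, 12.5 mL/hr, 90 s increments.
summary = feed_summary(90.0, 150.0, 12.5, 90.0)

# Seed culture volume relative to fresh medium volume (40 mL into 400 mL).
inoculum_fraction = 40.0 / 400.0

for key, value in summary.items():
    print(f"{key}: {value:.4g}")
print(f"inoculum_fraction: {inoculum_fraction:.2f}")
```

Under these assumptions the feed is 600 g/L glucose delivered at 7.5 g of glucose per hour, a single 150 mL batch of feed lasts 12 hours, and the seed culture corresponds to a 1:10 inoculum.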
[0123] After bioreactor cultivation for 36 hours, extracellular proteins were prepared using acetone precipitation. After centrifugation at 4500 RPM for 10 minutes at 4 °C, 75 mL of the culture supernatant was vigorously mixed with two volumes of cold acetone and incubated at -20 °C overnight. The protein samples were then precipitated by centrifugation at 13,200 RPM for 30 minutes at 4 °C. The pellet was air-dried and resuspended in denaturing 8 M urea (pH 8.0), 300 mM NaCl, 50 mM NaH2PO4, 20 mM Tris-Cl, 1 mM EDTA, 10% glycerol, and 1% Triton X-100. 60 µL of the resuspended precipitated supernatant protein was added per lane in an 8% SDS-PAGE gel. The SDS-PAGE gel was then stained in 0.025% Coomassie Brilliant Blue R-250 in 10% acetic acid at 50 °C for 15 minutes while shaking. The SDS-PAGE gel was destained overnight in 10% acetic acid with several changes of 10% acetic acid. Gels were imaged on a light table (Figure 4).
[0124] Referring to Figure 4, lanes 1 and 2 are duplicates of the C. glutamicum pK18mob2 negative control sample, and lanes 3 and 4 are duplicates of the C. glutamicum pK18mob2 - FC sample. Lanes 3 and 4 show expression of ~80 kDa and ~43 kDa polypeptides in the culture supernatant of C. glutamicum expressing pK18mob2 - FC, a plasmid which harbors cassette number 4 (SEQ ID NO: 322), referred to as C. glutamicum pK18mob2 - FC. Lanes 1 and 2 do not show expression of ~80 kDa and ~43 kDa polypeptides in the culture supernatant of C. glutamicum expressing pK18mob2, an empty plasmid, referred to as C. glutamicum pK18mob2. Factor C is a two-chain glycoprotein (Mr = 123 kDa) composed of a heavy chain (Mr = 80 kDa) and a light chain (Mr = 43 kDa). SDS-PAGE gel analysis with an 8% gel under denaturing conditions demonstrates expression of ~80 kDa and ~43 kDa polypeptides in the culture supernatant of C. glutamicum pK18mob2 - FC, corresponding to production of Factor C in C. glutamicum and extrusion of the protein into the culture supernatant.
NUMBERED EMBODIMENTS
[0125] The following list of embodiments is not intended to be limiting and is included herein for illustrative purposes. The subject matter to be claimed is not limited to the following embodiments:
Embodiment 1. A nucleic acid molecule, comprising an expression cassette, wherein the expression cassette comprises, from 5’ to 3’: a. a promoter; b. a signal sequence; and c. a sequence encoding a cascade reagent protein.
Embodiment 2. The nucleic acid molecule of embodiment 1, wherein the expression cassette is optimized for expression in Corynebacterium glutamicum.
Embodiment 3. The nucleic acid molecule of embodiment 1 or 2, wherein the signal sequence encodes a signal peptide.
Embodiment 4. The nucleic acid molecule as in any one of embodiments 1-3, wherein the promoter drives expression of the signal sequence and the sequence encoding the cascade reagent protein.
Embodiment 5. The nucleic acid molecule as in any one of embodiments 1-4, wherein the promoter comprises a nucleic acid sequence derived from a promoter of a Corynebacterium glutamicum secretory gene.
Embodiment 6. The nucleic acid molecule as in any one of embodiments 1-5, wherein the signal sequence comprises a nucleic acid sequence derived from a signal sequence of a Corynebacterium glutamicum secretory gene.
Embodiment 7. The nucleic acid molecule as in embodiment 5 or 6, wherein the Corynebacterium glutamicum secretory gene is selected from the group consisting of the cg1514 gene, the cspA gene, the cspB gene, the CgR.0949 gene, and the porB gene.
Embodiment 8. The nucleic acid molecule as in any one of embodiments 1-7, wherein the sequence encoding the cascade reagent protein encodes Factor C serine protease zymogen, Factor B serine protease zymogen, Proclotting Enzyme serine protease zymogen, or Coagulogen clotting protein.
Embodiment 9. The nucleic acid molecule as in any one of embodiments 1-8, wherein the sequence encoding the cascade reagent protein comprises a nucleic acid sequence derived from the genome of a horseshoe crab selected from the group consisting of Tachypleus tridentatus, Limulus polyphemus, Tachypleus gigas, and Carcinoscorpius rotundicauda.
Embodiment 10. The nucleic acid molecule as in any one of embodiments 1-9, wherein the signal sequence is located between the promoter and the sequence encoding the cascade reagent protein.
Embodiment 11. The nucleic acid molecule as in any one of embodiments 1-10, wherein the expression cassette comprises a termination sequence.
Embodiment 12. The nucleic acid molecule of embodiment 11, wherein the termination sequence is selected from the group consisting of the termination region of the Escherichia coli rrnB gene, the termination region of the Corynebacterium glutamicum cg1502 gene, the termination region of the Corynebacterium glutamicum cg3011 gene, the termination region of the Corynebacterium glutamicum cspA gene, and the termination region of the Corynebacterium glutamicum cg1338 gene.
Embodiment 13. The nucleic acid molecule as in any one of embodiments 1-12 wherein the expression cassette comprises a sequence encoding a polypeptide protein tag.
Embodiment 14. The nucleic acid molecule of embodiment 13, wherein the polypeptide protein tag is selected from the group consisting of polyhistidine-tag, FLAG-tag, HA-tag, calmodulin-binding peptide, streptavidin-binding peptide, glutathione S-transferase, and maltose-binding protein.
Embodiment 15. The nucleic acid molecule of embodiment 13 or 14, wherein the sequence encoding a polypeptide protein tag is located between the signal sequence and the sequence encoding the cascade reagent protein.
Embodiment 16. The nucleic acid molecule of embodiment 15, wherein a sequence encoding a linker is located between the sequence encoding the polypeptide protein tag and the sequence encoding the cascade reagent protein.
Embodiment 17. The nucleic acid molecule of embodiment 13 or 14, wherein the sequence encoding the polypeptide protein tag is located between the sequence encoding the cascade reagent protein and the termination sequence.
Embodiment 18. The nucleic acid molecule of embodiment 17, wherein a sequence encoding a linker is located between the sequence encoding the cascade reagent protein and the sequence encoding the polypeptide protein tag.
Embodiment 19. The nucleic acid molecule as in any one of embodiments 1-12 wherein the expression cassette comprises two or more sequences encoding polypeptide protein tags.
Embodiment 20. The nucleic acid molecule of embodiment 19, wherein the polypeptide protein tags are selected from the group consisting of polyhistidine-tag, FLAG-tag, HA-tag, calmodulin-binding peptide, streptavidin-binding peptide, glutathione S-transferase, and maltose-binding protein.
Embodiment 21. The nucleic acid molecule of embodiment 19 or 20, wherein the sequence encoding the cascade reagent protein is located between two sequences encoding polypeptide protein tags.
Embodiment 22. The nucleic acid molecule of embodiment 21, wherein sequences encoding linkers are located between the sequence encoding the cascade reagent protein and the sequences encoding the polypeptide protein tags.
Embodiment 23. The nucleic acid molecule as in any one of embodiments 16, 18, or 22, in which the linker or linkers are selected from the group consisting of flexible GS linkers, flexible glycine linkers, rigid α-helical linkers, rigid proline-rich linkers, and cleavable disulfide linkers.
Embodiment 24. The nucleic acid molecule of any one of embodiments 1-23, wherein the cascade reagent protein is encoded by a nucleic acid sequence of SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 278-283, or SEQ ID NO: 325, or a sequence at least 90% identical thereto.
Embodiment 25. The nucleic acid molecule of any one of embodiments 1-23, wherein the cascade reagent protein is encoded by a nucleic acid sequence of SEQ ID NO: 3, SEQ ID NO: 4, or SEQ ID NO: 284-289, or a sequence at least 90% identical thereto.
Embodiment 26. The nucleic acid molecule of any one of embodiments 1-23, wherein the cascade reagent protein is encoded by a nucleic acid sequence of SEQ ID NO: 5 or SEQ ID NO: 290-292, or a sequence at least 90% identical thereto.
Embodiment 27. The nucleic acid molecule of any one of embodiments 1-23, wherein the cascade reagent protein is encoded by a nucleic acid sequence of SEQ ID NO: 6, SEQ ID NO: 7, or SEQ ID NO: 8, or SEQ ID NO: 293-301, or a sequence at least 90% identical thereto.
Embodiment 28. The nucleic acid molecule as in any one of embodiments 1-27, wherein the promoter is encoded by a nucleic acid sequence of any one of SEQ ID NO: 9-13, or a sequence at least 90% identical thereto.
Embodiment 29. The nucleic acid molecule as in any one of embodiments 1-28, wherein the signal sequence is encoded by a nucleic acid sequence of any one of SEQ ID NO: 14-18, or a sequence at least 90% identical thereto.
Embodiment 30. The nucleic acid molecule as in any one of embodiments 13-29, wherein the polypeptide protein tag is encoded by a nucleic acid sequence of any one of SEQ ID NO: 19-32, or a sequence at least 90% identical thereto.
Embodiment 31. The nucleic acid molecule as in any one of embodiments 11-30, wherein the termination sequence is encoded by a nucleic acid sequence of any one of SEQ ID NO: 272-277, or a sequence at least 90% identical thereto.
Embodiment 32. The nucleic acid molecule as in any one of embodiments 16, 18, or 22-31, wherein the linker or linkers are encoded by a nucleic acid sequence of any one of SEQ ID NO: 265-271 or a sequence at least 90% identical thereto.
Embodiment 33. The nucleic acid molecule as in any one of embodiments 1-32, wherein the signal sequence and cascade reagent protein are encoded by a nucleic acid sequence of any one of SEQ ID NO: 33-96, or a sequence at least 90% identical thereto.
Embodiment 34. The nucleic acid molecule as in any one of embodiments 1-33, wherein the expression cassette comprises a nucleic acid sequence of any one of SEQ ID NO: 97-128, SEQ ID NO: 322-324, or a sequence at least 90% identical thereto.
Embodiment 35. A plasmid, comprising the nucleic acid molecule as in any one of embodiments 1-34.
Embodiment 36. A cell, comprising the nucleic acid molecule as in any one of embodiments 1-34 or the plasmid of embodiment 35.
Embodiment 37. A method of producing a recombinant expression system, the method comprising contacting a Corynebacterium glutamicum cell with a nucleic acid molecule as in any one of embodiments 1-34, or the plasmid of embodiment 35.
Embodiment 38. A recombinant expression system produced by the method of embodiment 37.
Embodiment 39. A method of expressing Factor C serine protease zymogen, Factor B serine protease zymogen, Proclotting Enzyme serine protease zymogen, or Coagulogen clotting protein, comprising contacting a Corynebacterium glutamicum cell with a nucleic acid molecule as in any one of embodiments 1-34, or the plasmid of embodiment 35.
Embodiment 40. An isolated, purified protein molecule, wherein the amino acid sequence is at least 75% identical to any one of SEQ ID NO: 129-256.
Embodiment 41. A kit for detecting a pyrogen or endotoxin in a sample comprising recombinant Factor C serine protease zymogen, recombinant Factor B serine protease zymogen, recombinant Proclotting Enzyme serine protease zymogen, and recombinant Coagulogen clotting protein expressed in Corynebacterium glutamicum.
Embodiment 42. The kit for detecting a pyrogen or endotoxin in a sample of embodiment 41, wherein the amino acid sequence of the recombinant Factor C serine protease zymogen is at least 75% identical to SEQ ID NO: 257 or SEQ ID NO: 258.
Embodiment 43. The kit for detecting a pyrogen or endotoxin in a sample as in any one of embodiments 41-42, wherein the amino acid sequence of the recombinant Factor B serine protease zymogen is at least 75% identical to SEQ ID NO: 259 or SEQ ID NO: 260.
Embodiment 44. The kit for detecting a pyrogen or endotoxin in a sample as in any one of embodiments 41-43, wherein the amino acid sequence of the recombinant Proclotting Enzyme serine protease zymogen is at least 75% identical to SEQ ID NO: 261.
Embodiment 45. The kit for detecting a pyrogen or endotoxin in a sample as in any one of embodiments 41-44, wherein the amino acid sequence of the recombinant Coagulogen clotting protein is at least 75% identical to any one of SEQ ID NO: 262-264.
Embodiment 46. A method of detecting a pyrogen or endotoxin in a sample, comprising contacting the sample with an isolated, purified protein molecule of embodiment 40.
Embodiment 47. A method of detecting a pyrogen or endotoxin in a sample, comprising contacting the sample with the components of the kit as in any one of embodiments 41-45.
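The 5’ to 3’ ordering recited in Embodiments 1, 10, 11, 15, and 16 can be illustrated with a short Python sketch that assembles a cassette from its elements and rejects out-of-order layouts. This is only a sketch under stated assumptions: the class and function names are hypothetical, the nucleotide strings are placeholders rather than any SEQ ID NO from the sequence listing, and only the layout with the tag between the signal sequence and the coding sequence is modeled. The layouts of Embodiments 17, 18, 21, and 22 would need a different ordering table.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Element:
    kind: str      # one of the names listed in ORDER below
    sequence: str  # nucleotide sequence, written 5' to 3'


# 5'-to-3' order for the layout of Embodiments 1, 11, 15, and 16
# (tag and linker upstream of the cascade reagent coding sequence).
ORDER = ["promoter", "signal", "tag", "linker", "cds", "terminator"]
REQUIRED = ("promoter", "signal", "cds")  # elements recited in Embodiment 1


def assemble(elements: List[Element]) -> str:
    """Concatenate elements after checking presence and 5'-to-3' order."""
    kinds = [e.kind for e in elements]
    for name in REQUIRED:
        if name not in kinds:
            raise ValueError(f"missing required element: {name}")
    if len(set(kinds)) != len(kinds):
        raise ValueError("each element kind may appear only once in this sketch")
    ranks = [ORDER.index(k) for k in kinds]  # raises ValueError on unknown kind
    if ranks != sorted(ranks):
        raise ValueError("elements are not in 5'-to-3' order")
    return "".join(e.sequence for e in elements)


# Placeholder sequences only; not real promoter, signal, or coding sequences.
cassette = assemble([
    Element("promoter", "AAA"),
    Element("signal", "CCC"),
    Element("tag", "GGG"),
    Element("linker", "TTT"),
    Element("cds", "ACG"),
    Element("terminator", "TGA"),
])
print(cassette)
```

Keeping the order in a single table makes it straightforward to add a second table for the layout in which the tag follows the coding sequence.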

Claims

What is claimed is:
1. A nucleic acid molecule, comprising an expression cassette, wherein the expression cassette comprises, from 5’ to 3’:
(i) a promoter;
(ii) a signal sequence; and
(iii) a sequence encoding a cascade reagent protein.
2. The nucleic acid molecule of claim 1, wherein the expression cassette is optimized for expression in Corynebacterium glutamicum.
3. The nucleic acid molecule of claim 1 or 2, wherein the signal sequence encodes a signal peptide.
4. The nucleic acid molecule as in any one of claims 1-3, wherein the promoter drives expression of the signal sequence and the sequence encoding the cascade reagent protein.
5. The nucleic acid molecule as in any one of claims 1-4, wherein the promoter comprises a nucleic acid sequence derived from a promoter of a Corynebacterium glutamicum secretory gene.
6. The nucleic acid molecule as in any one of claims 1-5, wherein the signal sequence comprises a nucleic acid sequence derived from a signal sequence of a Corynebacterium glutamicum secretory gene.
7. The nucleic acid molecule as in claim 5 or 6, wherein the Corynebacterium glutamicum secretory gene is selected from the group consisting of the cg1514 gene, the cspA gene, the cspB gene, the CgR0949 gene, and the porB gene.
8. The nucleic acid molecule as in any one of claims 1-7, wherein the sequence encoding the cascade reagent protein encodes Factor C serine protease zymogen, Factor B serine protease zymogen, Proclotting Enzyme serine protease zymogen, or Coagulogen clotting protein.
9. The nucleic acid molecule as in any one of claims 1-8, wherein the sequence encoding the cascade reagent protein comprises a nucleic acid sequence derived from the genome of a horseshoe crab selected from the group consisting of Tachypleus tridentatus, Limulus polyphemus, Tachypleus gigas, and Carcinoscorpius rotundicauda.
10. The nucleic acid molecule as in any one of claims 1-9, wherein the signal sequence is located between the promoter and the sequence encoding the cascade reagent protein.
11. The nucleic acid molecule as in any one of claims 1-10, wherein the expression cassette comprises a termination sequence.
12. The nucleic acid molecule of claim 11, wherein the termination sequence is selected from the group consisting of the termination region of the Escherichia coli rrnB gene, the termination region of the Corynebacterium glutamicum cg1502 gene, the termination region of the Corynebacterium glutamicum cg3011 gene, the termination region of the Corynebacterium glutamicum cspA gene, and the termination region of the Corynebacterium glutamicum cg1338 gene.
13. The nucleic acid molecule as in any one of claims 1-12 wherein the expression cassette comprises a sequence encoding a polypeptide protein tag.
14. The nucleic acid molecule of claim 13, wherein the polypeptide protein tag is selected from the group consisting of polyhistidine-tag, FLAG-tag, HA-tag, calmodulin-binding peptide, streptavidin-binding peptide, glutathione S-transferase, and maltose-binding protein.
15. The nucleic acid molecule of claim 13 or 14, wherein the sequence encoding a polypeptide protein tag is located between the signal sequence and the sequence encoding the cascade reagent protein.
16. The nucleic acid molecule of claim 15, wherein a sequence encoding a linker is located between the sequence encoding the polypeptide protein tag and the sequence encoding the cascade reagent protein.
17. The nucleic acid molecule of claim 13 or 14, wherein the sequence encoding the polypeptide protein tag is located between the sequence encoding the cascade reagent protein and the termination sequence.
18. The nucleic acid molecule of claim 17, wherein a sequence encoding a linker is located between the sequence encoding the cascade reagent protein and the sequence encoding the polypeptide protein tag.
19. The nucleic acid molecule as in any one of claims 1-12 wherein the expression cassette comprises two or more sequences encoding polypeptide protein tags.
20. The nucleic acid molecule of claim 19, wherein the polypeptide protein tags are selected from the group consisting of polyhistidine-tag, FLAG-tag, HA-tag, calmodulin-binding peptide, streptavidin-binding peptide, glutathione S-transferase, and maltose-binding protein.
21. The nucleic acid molecule of claim 19 or 20, wherein the sequence encoding the cascade reagent protein is located between two sequences encoding polypeptide protein tags.
22. The nucleic acid molecule of claim 21, wherein sequences encoding linkers are located between the sequence encoding the cascade reagent protein and the sequences encoding the polypeptide protein tags.
23. The nucleic acid molecule as in any one of claims 16, 18, or 22, in which the linker or linkers are selected from the group consisting of flexible GS linkers, flexible glycine linkers, rigid α-helical linkers, rigid proline-rich linkers, and cleavable disulfide linkers.
24. The nucleic acid molecule of any one of claims 1-23, wherein the cascade reagent protein is encoded by a nucleic acid sequence of SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 278-283, or SEQ ID NO: 325, or a sequence at least 90% identical thereto.
25. The nucleic acid molecule of any one of claims 1-23, wherein the cascade reagent protein is encoded by a nucleic acid sequence of SEQ ID NO: 3, SEQ ID NO: 4, or SEQ ID NO: 284-289, or a sequence at least 90% identical thereto.
26. The nucleic acid molecule of any one of claims 1-23, wherein the cascade reagent protein is encoded by a nucleic acid sequence of SEQ ID NO: 5 or SEQ ID NO: 290-292, or a sequence at least 90% identical thereto.
27. The nucleic acid molecule of any one of claims 1-23, wherein the cascade reagent protein is encoded by a nucleic acid sequence of SEQ ID NO: 6, SEQ ID NO: 7, or SEQ ID NO: 8, or SEQ ID NO: 293-301, or a sequence at least 90% identical thereto.
28. The nucleic acid molecule as in any one of claims 1-27, wherein the promoter is encoded by a nucleic acid sequence of any one of SEQ ID NO: 9-13, or a sequence at least 90% identical thereto.
29. The nucleic acid molecule as in any one of claims 1-28, wherein the signal sequence is encoded by a nucleic acid sequence of any one of SEQ ID NO: 14-18, or a sequence at least 90% identical thereto.
30. The nucleic acid molecule as in any one of claims 13-29, wherein the polypeptide protein tag is encoded by a nucleic acid sequence of any one of SEQ ID NO: 19-32, or a sequence at least 90% identical thereto.
31. The nucleic acid molecule as in any one of claims 11-30, wherein the termination sequence is encoded by a nucleic acid sequence of any one of SEQ ID NO: 272-277, or a sequence at least 90% identical thereto.
32. The nucleic acid molecule as in any one of claims 16, 18, or 22-31, wherein the linker or linkers are encoded by a nucleic acid sequence of any one of SEQ ID NO: 265-271 or a sequence at least 90% identical thereto.
33. The nucleic acid molecule as in any one of claims 1-32, wherein the signal sequence and cascade reagent protein are encoded by a nucleic acid sequence of any one of SEQ ID NO: 33-96, or a sequence at least 90% identical thereto.
34. The nucleic acid molecule as in any one of claims 1-33, wherein the expression cassette comprises a nucleic acid sequence of any one of SEQ ID NO: 97-128, SEQ ID NO: 322-324, or a sequence at least 90% identical thereto.
35. A plasmid, comprising the nucleic acid molecule as in any one of claims 1-34.
36. A cell, comprising the nucleic acid molecule as in any one of claims 1-34 or the plasmid of claim 35.
37. A method of producing a recombinant expression system, the method comprising contacting a Corynebacterium glutamicum cell with a nucleic acid molecule as in any one of claims 1-34, or the plasmid of claim 35.
38. A recombinant expression system produced by the method of claim 37.
39. A method of expressing Factor C serine protease zymogen, Factor B serine protease zymogen, Proclotting Enzyme serine protease zymogen, or Coagulogen clotting protein, comprising contacting a Corynebacterium glutamicum cell with a nucleic acid molecule as in any one of claims 1-34, or the plasmid of claim 35.
40. An isolated, purified protein molecule, wherein the amino acid sequence is at least 75% identical to any one of SEQ ID NO: 129-256.
41. A kit for detecting a pyrogen or endotoxin in a sample comprising recombinant Factor C serine protease zymogen, recombinant Factor B serine protease zymogen, recombinant Proclotting Enzyme serine protease zymogen, and recombinant Coagulogen clotting protein expressed in Corynebacterium glutamicum.
42. The kit for detecting a pyrogen or endotoxin in a sample of claim 41, wherein the amino acid sequence of the recombinant Factor C serine protease zymogen is at least 75% identical to SEQ ID NO: 257 or SEQ ID NO: 258.
43. The kit for detecting a pyrogen or endotoxin in a sample as in any one of claims 41-42, wherein the amino acid sequence of the recombinant Factor B serine protease zymogen is at least 75% identical to SEQ ID NO: 259 or SEQ ID NO: 260.
44. The kit for detecting a pyrogen or endotoxin in a sample as in any one of claims 41-43, wherein the amino acid sequence of the recombinant Proclotting Enzyme serine protease zymogen is at least 75% identical to SEQ ID NO: 261.
45. The kit for detecting a pyrogen or endotoxin in a sample as in any one of claims 41-44, wherein the amino acid sequence of the recombinant Coagulogen clotting protein is at least 75% identical to any one of SEQ ID NO: 262-264.
46. A method of detecting a pyrogen or endotoxin in a sample, comprising contacting the sample with an isolated, purified protein molecule of claim 40.
47. A method of detecting a pyrogen or endotoxin in a sample, comprising contacting the sample with the components of the kit as in any one of claims 41-45.
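Several claims recite a sequence "at least 90% identical" or "at least 75% identical" to a reference sequence. As an illustration only, the Python sketch below computes percent identity for two sequences that have already been aligned, using one common convention (identical columns divided by alignment length, with gap characters counted as mismatches). This section does not state which alignment method or identity convention applies, so the convention, the function names, and the example sequences are all assumptions.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity of two pre-aligned sequences ('-' marks a gap).

    Convention assumed here: matches / alignment columns, where columns
    that are gaps in both sequences are ignored and any other gap
    column counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [
        (a, b) for a, b in zip(aligned_a, aligned_b)
        if not (a == "-" and b == "-")
    ]
    if not columns:
        raise ValueError("no alignable columns")
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


def meets_threshold(aligned_a: str, aligned_b: str, threshold: float) -> bool:
    """True if percent identity is at least the given threshold."""
    return percent_identity(aligned_a, aligned_b) >= threshold


# Hypothetical 10-column alignment with one mismatch (9 of 10 identical).
print(percent_identity("ACGTACGTAC", "ACGTACGTAA"))
print(meets_threshold("ACGTACGTAC", "ACGTACGTAA", 90.0))
```

Other conventions, for example dividing by the length of the shorter sequence, can give different percentages for the same pair, so the convention should be fixed before comparing against a threshold.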

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/US2023/014214 WO2023177526A2 (en) 2022-03-01 2023-03-01 Compositions and methods for detecting an endotoxin

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
US202263315513P 2022-03-01 2022-03-01
US63/315,513 2022-03-01
PCT/US2023/014214 WO2023177526A2 (en) 2022-03-01 2023-03-01 Compositions and methods for detecting an endotoxin

Publications (3)

Publication Number Publication Date
WO2023177526A2 true WO2023177526A2 (en) 2023-09-21
WO2023177526A3 WO2023177526A3 (en) 2024-02-29
WO2023177526A9 WO2023177526A9 (en) 2024-07-18

Family

ID=88024578

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2023/014214 WO2023177526A2 (en) 2022-03-01 2023-03-01 Compositions and methods for detecting an endotoxin

Country Status (1)

Country Link
WO (1) WO2023177526A2 (en)

Also Published As

Publication number Publication date
WO2023177526A9 (en) 2024-07-18
WO2023177526A3 (en) 2024-02-29

Legal Events

Date Code Title Description
WPC Withdrawal of priority claims after completion of the technical preparations for international publication

Ref document number: 63/315,513

Country of ref document: US

Date of ref document: 20240524

Free format text: WITHDRAWN AFTER TECHNICAL PREPARATION FINISHED