WO2023177526A2

WO2023177526A2 - Compositions and methods for detecting an endotoxin

Info

Publication number: WO2023177526A2
Application number: PCT/US2023/014214
Authority: WO
Inventors: Jennifer WATSON; Richard Hatcher
Original assignee: Watson Jennifer; Richard Hatcher
Priority date: 2022-03-01
Filing date: 2023-03-01
Publication date: 2023-09-21
Also published as: WO2023177526A9; WO2023177526A3

Abstract

The disclosure provides nucleic acid molecules (comprising expression cassettes), plasmids, protein molecules, cells (comprising nucleic acid molecules), and recombinant expression systems for producing recombinant cascade reagents for the limulus amoebocyte lysate test method. Also, provided herein are kits and methods for detecting a pyrogen or endotoxin in a sample.

Description

COMPOSITIONS AND METHODS FOR DETECTING AN ENDOTOXIN

INVENTORS:

Jennifer Watson

Richard Hatcher

TITLE OF THE INVENTION

Compositions and Methods for Detecting an Endotoxin

CROSS REFERENCE TO RELATED APPLICATION

[0001] The present Application claims the benefit of priority to U.S. Provisional Application No. 63/315,513, filed on March 1, 2022, the contents of which are hereby incorporated by reference in their entirety.

INCORPORATION BY REFERENCE OF SEQUENCE LISTING PROVIDED ELECTRONICALLY

[0002] An electronic version of the Sequence Listing is filed herewith, the contents of which are incorporated by reference in their entirety. The electronic file is 571 kilobytes in size, and is titled 495-LM02_SequenceListing_ST26.txt.

BACKGROUND OF THE INVENTION

Field of the Invention

[0003] The present invention relates generally to the fields of biotechnology and infectious diseases, and more particularly it pertains to recombinant production of enzymes for detection of pyrogens and endotoxins.

Background

[0004] The standard pyrogen assay is a mandatory test for U.S. Food and Drug Administration (FDA) approval of all vaccines, intravenous pharmaceuticals, and internal medical devices to prevent contamination with endotoxins. The assay uses the hemolymph (blood) of the horseshoe crab, Limulus polyphemus (L. polyphemus), and tests for the presence of fever-producing agents of bacterial origin, e.g., endotoxins. The limulus amoebocyte lysate (LAL) test method is a qualitative assay during which the L. polyphemus hemolymph lysate reacts with an endotoxin to form a gel. The LAL test is considered to be reproducible, simple to conduct, specific for the presence of endotoxins, and sensitive to even picogram quantities of endotoxins. The quantity of endotoxin may be determined by dilution techniques comparing gel formation of the test sample to that of a reference pyrogen. The following non-essential publications are incorporated by reference in their entirety to aid in understanding of the official use of the LAL assay for release testing of final drug products: Levin, J, et al. Clotting cells and Limulus Amebocyte lysate: an amazing analytical tool. In: Shuster CNJ, Barlow RB and Brockman HJ (eds) The American horseshoe crab. 2003: 310-340; Cooper, JF. Discovery and acceptance of the bacterial endotoxins test. In: McCullough KZ (ed.) The bacterial endotoxins test: a practical approach. 2011: 1-13.

[0005] The LAL assay comprises horseshoe crab lysate reagents that form a four-step coagulation cascade. Three serine protease zymogens, namely Factor C, Factor B, and Proclotting Enzyme, and one clotting protein, Coagulogen, form the enzymatic coagulation cascade that results in a coagulin gel clot in the presence of an endotoxin. In this cascade, an endotoxin activates the Factor C zymogen and the activated Factor C subsequently activates Factor B, which converts the Proclotting Enzyme into Clotting Enzyme that cleaves Coagulogen into Coagulin, forming a gel clot.
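The order of activation in this four-step cascade can be summarized in a short sketch. The Python snippet below is only an illustrative model of the sequence of events and is not part of the disclosed methods; the names `CASCADE` and `run_cascade` are hypothetical and introduced here for illustration.

```python
# Illustrative model of the four-step coagulation cascade.
# Each entry pairs an inactive precursor with the product it becomes.
# The names CASCADE and run_cascade are hypothetical.
CASCADE = [
    ("Factor C", "activated Factor C"),
    ("Factor B", "activated Factor B"),
    ("Proclotting Enzyme", "Clotting Enzyme"),
    ("Coagulogen", "Coagulin"),
]


def run_cascade(endotoxin_present):
    """Return the products formed, in order; empty when no endotoxin is present."""
    if not endotoxin_present:
        return []
    return [product for _precursor, product in CASCADE]
```

In this simplified model the final product, Coagulin, appears only when endotoxin triggers the first step, which mirrors the gel clot readout of the assay.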

[0006] The raw materials for the production of lysate reagents are harvested from wild-caught horseshoe crabs, including L. polyphemus and Tachypleus tridentatus (T. tridentatus). Wild horseshoe crab populations are in decline due to the detrimental effect of capture, blood collection, and release, poor management of harvest regulations, and habitat destruction. Commercial-scale cultivation of horseshoe crabs has not been achieved. The following non-essential publications are incorporated by reference in their entirety to aid in understanding of the unsustainability of blood collection from wild-caught crabs for production of LAL assay reagents: Gauvry G. Current horseshoe crab harvesting practices cannot support global demand for TAL/LAL: The pharmaceutical and medical device industries’ role in the sustainability of horseshoe crabs. In: Carmichael RH, Botton ML, Shin PKS and Changing SGC (eds) Global perspectives on horseshoe crab biology, conservation and management. 2015: 475-482; Anderson RL et al., Sublethal behavioral and physiological effects of the biomedical bleeding process on the American horseshoe crab, Limulus polyphemus. Biol Bull. 2013(225): 137-151; Novitsky TJ. Biomedical implications for managing the Limulus polyphemus harvest along the northeast coast of the United States. In: Carmichael RH, Botton ML, Shin PKS and Changing SGC (eds) Global perspectives on horseshoe crab biology, conservation and management. 2015: 483-500.

[0007] Demand for lysate reagents for the LAL assay will likely continue to rise with the growth of the pharmaceutical industry, including the proliferation of biotechnology-based drugs and vaccines. The recent rapid development and deployment of vaccines against severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) in a mass vaccination campaign to address the coronavirus disease 2019 (COVID-19) pandemic demonstrates the ongoing necessity for endotoxin-free development and manufacturing of parenteral pharmaceuticals. The current reliance of the LAL assay on lysate reagents harvested from the horseshoe crab is a threat to horseshoe crab populations, the ecosystems in which the horseshoe crab lives, and humanity as the globe faced the COVID-19 pandemic and the threat of future pandemics.

[0008] Accordingly, a sustainable alternative to lysate reagents for the LAL assay is urgently needed to protect the horseshoe crab and humanity from preventable harm.

BRIEF SUMMARY OF THE INVENTION

[0009] Thus, in accordance with the present disclosure, recombinant generation of lysate reagents for the LAL assay is provided herein. The disclosure features expression cassettes, plasmids, and functional recombinant cascade reagents (RCRs) produced from these expression cassettes and plasmids. The disclosure also features expression cassettes for functional RCRs optimized for production in Corynebacterium glutamicum (C. glutamicum). The disclosure features optimized expression cassettes for production in C. glutamicum of the Factor C, Factor B, and Proclotting Enzyme serine protease zymogens, as well as optimized expression cassettes for production of the Coagulogen clotting protein.

[0010] The disclosure provides nucleic acid molecules, comprising expression cassettes, wherein the expression cassettes comprise, from 5’ to 3’: a promoter; a signal sequence; and a sequence encoding a cascade reagent protein. In some embodiments, the expression cassette is optimized for expression in C. glutamicum. In some embodiments, the signal sequence encodes a signal peptide.

[0011] In some embodiments, the promoter drives expression of the signal sequence and the sequence encoding the cascade reagent protein. In some embodiments, the promoter comprises a nucleic acid sequence derived from a promoter of a C. glutamicum secretory gene. In some embodiments, the signal sequence comprises a nucleic acid sequence derived from a signal sequence of a C. glutamicum secretory gene. In some embodiments, the C. glutamicum secretory gene is selected from the group consisting of the cg1514 gene, the cspA gene, the cspB gene, the CgR0949 gene, and the porB gene.

[0012] In some embodiments, the sequence encoding the cascade reagent protein encodes Factor C serine protease zymogen, Factor B serine protease zymogen, Proclotting Enzyme serine protease zymogen, or Coagulogen clotting protein. In some embodiments, the sequence encoding the cascade reagent protein comprises a nucleic acid sequence derived from the genome of a horseshoe crab selected from the group consisting of Tachypleus tridentatus, Limulus polyphemus, Tachypleus gigas, and Carcinoscorpius rotundicauda (C. rotundicauda).

[0013] In some embodiments, the signal sequence is located between the promoter and the sequence encoding the cascade reagent protein.

[0014] In some embodiments, the expression cassette comprises a termination sequence. In some embodiments, the termination sequence is selected from the group consisting of the termination region of the Escherichia coli rrnB gene, the termination region of the Corynebacterium glutamicum cg1502 gene, the termination region of the Corynebacterium glutamicum cg3011 gene, the termination region of the Corynebacterium glutamicum cspA gene, and the termination region of the Corynebacterium glutamicum cg1338 gene.

[0015] In some embodiments, the expression cassette comprises a sequence encoding a polypeptide protein tag. In some embodiments, the expression cassette comprises two or more sequences encoding polypeptide protein tags. In some embodiments, the polypeptide protein tag or polypeptide protein tags are selected from the group consisting of polyhistidine-tag, FLAG-tag, HA-tag, calmodulin-binding peptide, streptavidin-binding peptide, glutathione S-transferase, and maltose-binding protein.

[0016] In some embodiments, the sequence encoding the polypeptide protein tag is located between the signal sequence and the sequence encoding the cascade reagent protein. In some embodiments, a sequence encoding a linker is located between the sequence encoding the polypeptide protein tag and the sequence encoding the cascade reagent protein.

[0017] In some embodiments, the sequence encoding the polypeptide protein tag is located between the sequence encoding the cascade reagent protein and the termination sequence. In some embodiments, a sequence encoding a linker is located between the sequence encoding the cascade reagent protein and the sequence encoding the polypeptide protein tag.

[0018] In some embodiments, the sequence encoding the cascade reagent protein is located between two sequences encoding polypeptide protein tags. In some embodiments, sequences encoding linkers are located between the sequence encoding the cascade reagent protein and the sequences encoding the polypeptide protein tags.
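The element orderings described in the preceding paragraphs (promoter, signal sequence, and cascade reagent protein from 5’ to 3’, with optional tags, linkers, and termination sequence) can be illustrated with a small sketch. The function name `cassette_layout` and its parameters are hypothetical and are introduced only to show the relative positions of the elements; the sketch does not generate or represent any sequence of the disclosure.

```python
def cassette_layout(tag_position="none", linker=False, terminator=True):
    """List expression cassette elements in 5' to 3' order.

    tag_position: "none", "n" (tag between the signal sequence and the
    cascade reagent protein), "c" (tag between the cascade reagent protein
    and the termination sequence), or "both".
    linker: place a linker between each tag and the cascade reagent protein.
    """
    if tag_position not in {"none", "n", "c", "both"}:
        raise ValueError("tag_position must be 'none', 'n', 'c', or 'both'")
    elements = ["promoter", "signal sequence"]
    if tag_position in {"n", "both"}:
        elements.append("tag")
        if linker:
            elements.append("linker")
    elements.append("cascade reagent protein")
    if tag_position in {"c", "both"}:
        if linker:
            elements.append("linker")
        elements.append("tag")
    if terminator:
        elements.append("termination sequence")
    return elements
```

For example, `cassette_layout("n", linker=True)` lists the tag and linker between the signal sequence and the cascade reagent protein, while `cassette_layout("both")` places the cascade reagent protein between two tags.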

[0019] In some embodiments, the linker or linkers are selected from the group consisting of flexible glycine-serine linkers, flexible glycine linkers, rigid α-helical linkers, rigid proline-rich linkers, and cleavable disulfide linkers.

[0020] In some embodiments, the cascade reagent protein is encoded by a nucleic acid sequence of SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 278-283, or SEQ ID NO: 325, or a sequence at least 90% identical thereto. In some embodiments, the cascade reagent protein is encoded by a nucleic acid sequence of SEQ ID NO: 3, SEQ ID NO: 4, or SEQ ID NO: 284-289, or a sequence at least 90% identical thereto. In some embodiments, the cascade reagent protein is encoded by a nucleic acid sequence of SEQ ID NO: 5 or SEQ ID NO: 290-292, or a sequence at least 90% identical thereto. In some embodiments, the cascade reagent protein is encoded by a nucleic acid sequence of SEQ ID NO: 6-8 or SEQ ID NO: 293-301, or a sequence at least 90% identical thereto.

[0021] In some embodiments, the promoter is encoded by a nucleic acid sequence of any one of SEQ ID NO: 9-13, or a sequence at least 90% identical thereto. In some embodiments, the signal sequence is encoded by a nucleic acid sequence of any one of SEQ ID NO: 14-18, or a sequence at least 90% identical thereto. In some embodiments, the polypeptide protein tag is encoded by a nucleic acid sequence of any one of SEQ ID NO: 19-32, or a sequence at least 90% identical thereto. In some embodiments, the termination sequence is encoded by a nucleic acid sequence of any one of SEQ ID NO: 272-277, or a sequence at least 90% identical thereto. In some embodiments, the linker is encoded by a nucleic acid sequence of any one of SEQ ID NO: 265-271, or a sequence at least 90% identical thereto.

[0022] In some embodiments, the signal sequence and cascade reagent protein are encoded by a nucleic acid sequence of any one of SEQ ID NO: 33-96, or a sequence at least 90% identical thereto.

[0023] In some embodiments, the expression cassette comprises a nucleic acid sequence of any one of SEQ ID NO: 97-128 or SEQ ID NO: 322-324, or a sequence at least 90% identical thereto.

[0024] The disclosure provides plasmids comprising nucleic acid molecules disclosed herein. The disclosure also provides cells comprising any one of the nucleic acid molecules or plasmids disclosed herein.

[0025] The disclosure provides methods of producing a recombinant expression system comprising contacting a C. glutamicum cell with any one of the nucleic acid molecules or plasmids disclosed herein. The disclosure also provides recombinant expression systems produced by the method of contacting a C. glutamicum cell with any one of the nucleic acid molecules or plasmids disclosed herein.

[0026] The disclosure provides methods of expressing Factor C serine protease zymogen, Factor B serine protease zymogen, Proclotting Enzyme serine protease zymogen, or Coagulogen clotting protein, comprising contacting a C. glutamicum cell with any one of the nucleic acid molecules or plasmids disclosed herein.

[0027] The disclosure provides isolated, purified protein molecules, wherein the amino acid sequence is at least 75% identical to any one of SEQ ID NO: 129-256.

[0028] The disclosure provides kits for detecting a pyrogen or endotoxin in a sample comprising recombinant Factor C serine protease zymogen, recombinant Factor B serine protease zymogen, recombinant Proclotting Enzyme serine protease zymogen, and recombinant Coagulogen clotting protein expressed in C. glutamicum.

[0029] In some embodiments, the amino acid sequence of the recombinant Factor C serine protease zymogen is at least 75% identical to SEQ ID NO: 257 or SEQ ID NO: 258. In some embodiments, the amino acid sequence of the recombinant Factor B serine protease zymogen is at least 75% identical to SEQ ID NO: 259 or SEQ ID NO: 260. In some embodiments, the amino acid sequence of the recombinant Proclotting Enzyme serine protease zymogen is at least 75% identical to SEQ ID NO: 261. In some embodiments, the amino acid sequence of the recombinant Coagulogen clotting protein is at least 75% identical to any one of SEQ ID NO: 262-264.

[0030] The disclosure provides methods for detecting a pyrogen or endotoxin in a sample, comprising contacting the sample with an isolated, purified protein molecule wherein the amino acid sequence is at least 75% identical to any one of SEQ ID NO: 129-256.

[0031] The disclosure provides methods of detecting a pyrogen or endotoxin in a sample, comprising contacting the sample with the components of the kits disclosed herein.

[0032] These and other embodiments are described in more detail in the detailed description below.

BRIEF DESCRIPTION OF THE DRAWINGS

[0033] FIG. 1 depicts the coagulation cascade of the present disclosure based on the coagulation cascade in the horseshoe crab amoebocyte lysate.

[0034] FIGS. 2A-2B depict the expression cassettes of the present disclosure. FIG. 2A shows expression cassettes comprising a promoter, a signal sequence, a gene of interest, and a termination sequence, and optionally a polypeptide tag. FIG. 2B shows exemplary expression cassettes according to the present invention. Expression cassette number 4 (SEQ ID NO: 322) comprises the Pcg1514 promoter (SEQ ID NO: 9), the cg1514 signal sequence (SEQ ID NO: 14), Factor C gene from T. tridentatus (SEQ ID NO: 325), and the rrnB terminator (SEQ ID NO: 272). Expression cassette number 5 (SEQ ID NO: 323) comprises the Pcg1514 promoter (SEQ ID NO: 9), the cg1514 signal sequence (SEQ ID NO: 14), polyhistidine-tag (SEQ ID NO: 26), Factor C gene from T. tridentatus (SEQ ID NO: 325), and the rrnB terminator (SEQ ID NO: 272). Expression cassette number 6 (SEQ ID NO: 324) comprises the Pcg1514 promoter (SEQ ID NO: 9), the cg1514 signal sequence (SEQ ID NO: 14), Factor C gene from T. tridentatus (SEQ ID NO: 325), polyhistidine-tag (SEQ ID NO: 26), and the rrnB terminator (SEQ ID NO: 272).

[0035] FIGS. 3A-3B show expression of the plasmids containing expression cassettes in C. glutamicum according to the present disclosure. FIG. 3A depicts microscopy images showing untransformed gram-positive B. cereus and untransformed gram-negative E. coli (top left), untransformed, gram-positive C. glutamicum (top middle), C. glutamicum transformed with empty plasmid (top right), and C. glutamicum transformed with plasmids comprising expression cassette number 4 of FIG. 2B (SEQ ID NO: 322, bottom left), expression cassette number 6 of FIG. 2B (SEQ ID NO: 324, bottom middle), and expression cassette number 5 of FIG. 2B (SEQ ID NO: 323, bottom right). The scale bar is 10 μm. FIG. 3B depicts gel electrophoresis showing the molecular weight of plasmids containing the expression cassettes of the present disclosure. Lane 1 shows C. glutamicum as a negative control, lane 2 shows C. glutamicum expressing the pEC-pk18mob2 empty plasmid as a positive control, lane 3 shows C. glutamicum expressing the plasmid comprising the expression cassette number 4 of FIG. 2B (SEQ ID NO: 322), lane 4 shows C. glutamicum expressing the plasmid comprising the expression cassette number 6 of FIG. 2B (SEQ ID NO: 324), and lane 5 shows C. glutamicum expressing the plasmid comprising the expression cassette number 5 of FIG. 2B (SEQ ID NO: 323).

[0036] FIG. 4 depicts gel electrophoresis showing the molecular weight of polypeptides in the culture supernatant of C. glutamicum in accordance with the present disclosure. Lanes 1 and 2 show C. glutamicum expressing the pEC-pk18mob2 empty plasmid as a negative control, and lanes 3 and 4 show C. glutamicum expressing the plasmid comprising the expression cassette number 4 of FIG. 2B (SEQ ID NO: 322).

DETAILED DESCRIPTION OF CERTAIN EMBODIMENTS

[0037] The limulus amoebocyte lysate (LAL) test method is a standard pyrogen assay employed by a variety of industries to ensure that samples are free of harmful endotoxins and pyrogens. The U.S. Food and Drug Administration (FDA) approved the LAL assay for testing drugs, products, and devices, and the assay is widely used to test ingredients of pharmaceuticals during manufacturing.

[0038] The LAL assay is based on a coagulation cascade involving reagents harvested from the hemolymph of wild-caught horseshoe crabs. Specifically, exposure of the serine protease zymogen Factor C to endotoxin initiates a cascade that activates the serine protease zymogen Factor B, converts the serine protease zymogen Proclotting Enzyme into Clotting Enzyme, and ultimately cleaves Coagulogen into Coagulin to form a gel clot. The LAL assay depends on the availability of these reagents, and poor harvest management and habitat destruction threaten the horseshoe crab population and thus threaten the supply of horseshoe crab lysate reagents.

[0039] The disclosure provides nucleic acid molecules (comprising expression cassettes) and plasmids for producing the lysate reagents Factor C, Factor B, Proclotting Enzyme, and Coagulogen, wherein the nucleic acid molecules and plasmids are optimized for expression in the generally recognized as safe (GRAS) actinobacterium Corynebacterium glutamicum (C. glutamicum). Also provided are isolated, purified protein molecules encoded by the nucleic acid molecules and plasmids disclosed herein, kits for detecting a pyrogen or endotoxin in a sample comprising recombinant lysate reagents, methods of producing a recombinant expression system using the nucleic acid molecules and plasmids disclosed herein, and methods for detecting pyrogen or endotoxin in a sample.

[0040] The nucleic acid molecules, plasmids, isolated, purified protein molecules, kits, and methods of the present disclosure may be used to test for contamination in a variety of industries, including pharmaceuticals (both preclinical studies and clinical applications) and biotechnologies, and settings, including healthcare providers, veterinary clinics, agriculture, food processing and service, wineries, breweries, distilleries, military, and direct-to-consumer. In some embodiments, nucleic acid molecules, plasmids, isolated, purified protein molecules, kits, and methods of the present disclosure may be used in the agriculture, food service, food processing, winery, brewery, or distillery industries to test for contamination at any point along the logistical supply chain. In some embodiments, nucleic acid molecules, plasmids, isolated, purified protein molecules, kits, and methods of the present disclosure may be used in the healthcare provider, veterinary clinic, military, and direct-to-consumer industries to test for contamination and institute organizational processes and conditions to sanitize frequently touched objects and surfaces and prevent infection.

Definitions

[0041] Unless otherwise defined, all scientific and technical terms used in the description herein and in the appended claims have the same meaning as understood by one of ordinary skill in the art. The terminology used herein is not intended to be limiting and is used for the purpose of describing particular embodiments in the description herein.

[0042] The singular forms “a,” “an,” and “the” are intended to include the plural forms as well and are consistent with the meaning of “one or more,” “at least one,” and “one or more than one,” unless the context clearly indicates otherwise.

[0043] As used herein, the term “about” when referring to a measurable value such as concentration, volume, length of time, length of a polypeptide or polynucleotide sequence, quantity, and the like, encompasses ± 20%, ± 10%, ± 5%, ± 1%, ± 0.5%, or ± 0.1% of the specified amount.

[0044] As used herein, the term “expression cassette” refers to a nucleic acid component of vector DNA comprising one or more transcriptional control elements (e.g., promoters, enhancers, and/or regulatory sequences and polyadenylation sequences) that direct gene expression of a sequence encoding a protein and/or polypeptide, e.g., a linear nucleic acid sequence encoding one or more transgenes that are expressed by one or more cell types. The terms “DNA vector,” “expression vector,” and “plasmid” are terms of the art understood by skilled persons and refer to synthetic DNA molecules used to carry foreign genetic material into a cell. The term “recombinant DNA” is a term of the art understood by skilled persons and refers to combining two or more DNA molecules from two or more different sources, and the term “recombinant protein” is a term of the art understood by skilled persons and refers to protein encoded by recombinant DNA that has been cloned into an expression vector. The term “recombinant” is a term of the art understood by skilled persons and refers to recombined DNA, e.g., recombinant DNA, and/or artificially produced protein, e.g., recombinant protein.

[0045] As used herein, the term “recombinant expression system” refers to a system for expressing recombinant protein in cells by transfecting cells with a DNA vector, expression vector, or plasmid. The term “expression” is a term of the art understood by skilled persons and refers to production of large amounts of recombinant DNA and/or recombinant protein by manipulation of the genetic material. The terms “optimized expression” or “optimized for expression” refer to adaptation of some or all of nucleic acid molecules, including synthetic DNA molecules, recombinant DNA, and/or DNA vector, to the host organism to optimize synthesis and/or production of recombinant proteins. Optimization for expression may include optimizing GC content and noncoding DNA elements. Optimization for expression may include optimization based on highly expressed genes (HEG) wherein the codon usage of predicted highly expressed genes from 150 bacterial genomes under translational selection determines codon usage. Optimization for expression may also include determination of codon usage based on ribosomal protein genes (RPG) or tRNA gene copy number (tRNA). The HEG, RPG, and tRNA optimization techniques apply a Monte Carlo algorithm using relative codon usage frequencies of a reference set as the relative probability that a given codon will be used in the optimization process. Optimization for expression may include general optimization based on the C. glutamicum codon usage table generated from 9,019 coding sequences representing 2,866,198 codons. Optimization for expression may include optimization based on the software OptimWiz, a proprietary codon optimization analysis tool, which may optimize for expression by modifying GC-content, mRNA secondary structure, Shine-Dalgarno sequence, RNA instability motifs, repetitive sequences, internal splice sites, and restriction enzyme recognition sites.

[0046] Exemplary optimization for expression of the present invention includes replacing nucleic acids of a sequence encoding a cascade reagent protein, a sequence encoding a polypeptide protein tag, and/or sequences encoding linkers with nucleic acids encoding codons based on the HEG, RPG, tRNA, general, or OptimWiz optimization methods.
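The Monte Carlo codon selection described in paragraph [0045], in which the relative codon usage frequency of a reference set serves as the probability of choosing each codon, can be sketched as follows. This is a minimal illustration under stated assumptions and is not the implementation used by any particular optimization tool. The frequency values in `REFERENCE_USAGE` are hypothetical placeholders, not the C. glutamicum codon usage table, and the function name `sample_coding_sequence` is introduced here for illustration only.

```python
import random

# HYPOTHETICAL relative codon usage frequencies for three amino acids.
# Placeholder numbers for illustration; NOT the C. glutamicum usage table.
REFERENCE_USAGE = {
    "M": {"ATG": 1.0},
    "K": {"AAG": 0.6, "AAA": 0.4},
    "L": {"CTG": 0.3, "CTC": 0.25, "TTG": 0.2,
          "CTT": 0.15, "CTA": 0.05, "TTA": 0.05},
}


def sample_coding_sequence(protein, usage=REFERENCE_USAGE, seed=None):
    """Pick one codon per residue, weighted by relative usage frequency."""
    rng = random.Random(seed)  # seeded generator gives reproducible draws
    codons = []
    for residue in protein:
        table = usage[residue]  # raises KeyError for residues not in the table
        codon = rng.choices(list(table), weights=list(table.values()), k=1)[0]
        codons.append(codon)
    return "".join(codons)
```

Calling `sample_coding_sequence("MKL", seed=0)` returns a nine-nucleotide coding sequence that begins with ATG; repeated calls with different seeds yield different synonymous sequences, with frequently used codons chosen more often.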

[0047] The terms “optimized expression” or “optimized for expression” may also refer to polypeptides or proteins encoded by nucleic acid sequences that have been optimized for expression, i.e. optimization of the coding sequence that codes for the sequence of amino acids in a protein.

[0048] As used herein, the term “nucleic acid”, “nucleic acid molecule”, or “polynucleotide” refers to a sequence of more than one nucleotide base monomer, for example deoxyribonucleic acid (DNA) and ribonucleic acid (RNA), in a single chain, including naturally occurring and non-naturally occurring nucleotides. As used herein, the term “nucleotide” refers to conventional nucleotide bases, e.g., the purine and pyrimidine bases adenine (A), guanine (G), thymine (T), cytosine (C), and uracil (U). A nucleic acid will generally contain sugars and phosphates connected in an alternating chain through phosphodiester linkages. Generally, the phosphate groups are attached to carbons at the 5’-end and the 3’-end of the sugar, imparting directionality to nucleic acids. The ends of nucleic acids are referred to as the 5’-end and 3’-end, the 5’-end is referred to as “upstream” of the 3’-end, and the 3’-end is referred to as “downstream” of the 5’-end. Nucleic acid molecules may be circular (e.g., a plasmid) or linear (e.g., a cassette). Nucleic acid sequences may encode polypeptides or may include sequences regulating transcription (e.g., promoters and terminators).

[0049] As used herein, the term “polypeptide” refers to a continuous, unbranched chain of amino acids linked by peptide bonds. Amino acids incorporated into peptides are known as residues, and the term “amino acid sequence” refers to a sequence of amino acids, including naturally occurring and non-naturally occurring amino acids. Longer polypeptides are known as proteins, and the term “protein tag” is used to refer to a shorter polypeptide. Generally, polypeptides have an N-terminus, also known as the N-terminal end or amine-terminus, and a C-terminus, also known as the C-terminal end, carboxyl-terminus, or carboxy-terminus. Polypeptides may be fused to other polypeptides by combining the genes or parts of genes that encode them to produce recombinant DNA that encodes a recombinant fusion protein. One protein tag or domain may be fused N-terminally or C-terminally to another protein tag or domain. Fusion of a protein tag to the N-terminus of a protein results in an N-terminally tagged protein, and fusion of a protein tag to the C-terminus of a protein results in a C-terminally tagged protein. Recombinant proteins, including signal polypeptides, cascade reagent proteins, and protein tags, may be fused by linker sequences to separate these domains. Linker sequences may encode cleavable polypeptides, which can be cleaved upon exposure to enzyme, chemical reagents, or irradiation, or non-cleavable polypeptides, including flexible polypeptide linkers composed of glycine and serine known as GS linkers, for example (Gly-Gly-Gly-Gly-Ser)n, or rigid linkers, for example proline-rich or α-helical linkers.
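The (Gly-Gly-Gly-Gly-Ser)n linker notation and the N-terminal versus C-terminal tagging described above can be made concrete with a brief sketch using one-letter amino acid codes. The function names `gs_linker` and `fuse_tag` are hypothetical, and the six-histidine tag and the short protein string used in the usage note are placeholders chosen for illustration, not sequences of the disclosure.

```python
def gs_linker(n):
    """Return the one-letter sequence of a (Gly-Gly-Gly-Gly-Ser)n linker."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    return "GGGGS" * n


def fuse_tag(protein, tag, terminus, linker=""):
    """Fuse a tag to the N- or C-terminus of a protein, with an optional linker.

    Sequences are written N-terminus to C-terminus in one-letter code.
    """
    if terminus == "N":
        return tag + linker + protein
    if terminus == "C":
        return protein + linker + tag
    raise ValueError("terminus must be 'N' or 'C'")
```

For example, `fuse_tag("MKT", "HHHHHH", "N", gs_linker(1))` gives `"HHHHHHGGGGSMKT"`, an N-terminally tagged placeholder protein with one GS repeat separating the tag from the protein.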

[0050] As used herein, the terms “streptavidin-binding peptide” and “SBP” may include the 38-amino acid sequence or 8-amino acid sequences of the Strep-tag system (e.g., the Strep-tag or Strep-tag II), which may have the amino acid sequence WSHPQFEK.

[0051] Calculations of “identity” between two sequences, e.g., nucleic acid or amino acid sequences, can be performed by practices commonly understood by one of ordinary skill in the art. The sequences are aligned for optimal comparison purposes and the nucleotides or amino acid residues at corresponding nucleotide positions or amino acid positions are then compared. Molecules are identical at a position when a position in the first sequence is occupied by the same nucleotide or amino acid residue as the corresponding position in the second sequence. The percent identity between the two sequences is a function of the number of identical positions shared by the sequences. As used herein, the term “homolog” refers to a protein that shares a common ancestor with another protein, and may include proteins that exhibit sequence homology, i.e., the proteins share sequence similarity.
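As a worked illustration of the identity calculation, the sketch below counts identical positions between two sequences that have already been aligned and divides by the alignment length. This denominator is one common convention and is an assumption made here; other conventions (for example, dividing by the length of the shorter sequence) exist, and the disclosure does not specify one. The function name `percent_identity` and the `-` gap character are likewise illustrative choices.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity of two pre-aligned sequences of equal length.

    Assumption: identity = 100 * identical positions / alignment length.
    Positions where either sequence has a gap never count as identical.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("sequences must not be empty")
    matches = sum(
        1 for a, b in zip(aligned_a, aligned_b) if a == b and a != gap
    )
    return 100.0 * matches / len(aligned_a)
```

Under this convention, two eight-residue sequences that differ at one position, such as `WSHPQFEK` and `WSHPQAEK`, are 87.5% identical, which would fall below a 90% threshold but above a 75% threshold.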

[0052] As used herein, the term “promoter” refers to one or more DNA sequences that regulate expression of operably linked nucleic acid sequences by facilitating binding of proteins (e.g., transcription factors) that initiate transcription of RNA from the DNA downstream of the promoter. The transcription start site is the location where transcription starts at the 5’-end of the operably linked nucleic acid sequence, and the promoter generally includes consensus sequences, such as a TATA box, near the transcription start site.

[0053] As used herein, the terms “terminator” or “termination sequence” refer to one or more DNA sequences that regulate expression of operably linked nucleic acid sequences by facilitating termination of transcription of RNA from the DNA upstream of the terminator. Generally, the termination sequence is downstream of a stop codon that signals termination of translation of the protein translated from the RNA transcribed from the DNA upstream of the stop codon.

[0054] As used herein, the term “transgene” refers to a gene transferred from one organism to another, i.e., an exogenous nucleic acid sequence encoding a polypeptide to be expressed in a cell. Generally, a transgene contains a promoter, a protein coding sequence, and a termination sequence. The term “gene of interest” refers to the nucleic acid sequence encoding a protein, i.e., a protein coding sequence. Exemplary genes of interest of the present invention include nucleic acid sequences encoding clotting proteins, Factor C serine proteases, Factor B serine proteases, Proclotting Enzyme serine proteases, Coagulogen clotting proteins, and recombinant cascade reagents (RCRs).

[0055] As used herein, the term “signal sequence” refers to a nucleic acid sequence encoding a short peptide present at the terminus of most proteins destined for secretion via the cellular secretory pathway. The term “signal peptide” refers to the polypeptide encoded by the signal sequence, and is generally present at the N-terminus of secreted proteins. The term “secretory gene” refers to genes encoding proteins destined for secretion via the cellular secretory pathway.

[0056] As used herein, the terms “pyrogen” and “endotoxin” are used interchangeably and refer to causative agents responsible for biological effects incidental to therapy administered parenterally, i.e., therapies administered to the body other than through the mouth and alimentary canal. Parenteral therapies, including injection (e.g., subcutaneous injection, intraperitoneal injection, intrathecal injection, etc.), allow pyrogens or endotoxins to bypass the normal body defenses. The host’s responses to pyrogens or endotoxins include fever, shock, and other physiological responses. While the terms pyrogen and endotoxin are used interchangeably herein, not all pyrogens are endotoxins.

[0057] As used herein, the term “amino acid” refers to naturally occurring and non-naturally occurring or synthetic amino acids. Naturally occurring, levorotatory (L-) amino acids and their abbreviations (three-letter code and one-letter code) are shown in Table 1.

TABLE 1

Expression cassettes

[0058] The disclosure provides nucleic acid sequences comprising one or more expression cassettes optimized for expression in C. glutamicum. In some embodiments, the expression cassette comprises, from 5’ to 3’, a promoter, a signal sequence, and a sequence encoding a cascade reagent protein. In some embodiments, the cascade reagent protein is Factor C. In some embodiments, the cascade reagent protein is Factor B. In some embodiments, the cascade reagent protein is Proclotting Enzyme. In some embodiments, the cascade reagent protein is Coagulogen. In some embodiments, the expression cassette comprises a termination sequence, a sequence encoding a polypeptide protein tag, and/or a sequence encoding a linker.

[0059] In some embodiments, the expression cassette comprises a nucleic acid sequence having at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identity to the nucleic acid sequences of SEQ ID NO: 97-128 or SEQ ID NO: 322-324.
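By way of non-limiting illustration, the percent identity thresholds recited above can be checked computationally. The following is a minimal sketch, assuming the two sequences have already been aligned to equal length with "-" as the gap character, and assuming identity is counted as identical columns divided by alignment length. The function names `percent_identity` and `highest_threshold_met` are hypothetical and are not part of the disclosure; other conventions for computing identity exist and may give different values.

```python
# Illustrative sketch only: assumes pre-aligned, equal-length sequences
# and a simple "identical columns / alignment length" definition of identity.

# Thresholds recited for nucleic acid sequences in the paragraph above.
THRESHOLDS = (70, 75, 80, 85, 90, 95, 96, 97, 98, 99)


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Return percent identity over the aligned columns of two sequences."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    # Drop columns where both sequences carry a gap; they hold no information.
    columns = [
        (a, b)
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if not (a == "-" and b == "-")
    ]
    if not columns:
        raise ValueError("no aligned columns to compare")
    # A column counts as a match only if both residues are identical non-gaps.
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


def highest_threshold_met(identity: float):
    """Return the highest recited threshold met by `identity`, or None."""
    met = [t for t in THRESHOLDS if identity >= t]
    return max(met) if met else None
```

For example, two aligned ten-nucleotide sequences differing at a single position would score 90% under this definition and would therefore meet the "at least 90%" threshold but not the "at least 95%" threshold.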

(i) Promoter

[0060] Promoters or promoter sequences are sequences of DNA to which transcription factors bind, thereby initiating transcription of RNA from the DNA downstream of the promoter. Promoters are located upstream, or toward the 5’ region of the sense strand, of the transcription start site and may include consensus sequences such as TATAAT or TTGACA. Promoters drive expression of DNA, e.g., genes or transgenes, downstream of the promoter. RNA molecules transcribed from operably linked DNA sequences adjacent to promoters may encode a protein.

[0061] The RNA sequence helps recruit the ribosome to the messenger RNA (mRNA) to initiate protein synthesis by aligning the ribosome with the start codon and may include consensus sequences such as the Shine-Dalgarno sequence, e.g., AGGAGGU or GAGG. Once recruited, tRNA may add amino acids in sequence as dictated by the codons, moving downstream from the translational start site.

[0062] The expression cassettes of the disclosure may comprise a promoter. In some embodiments, the promoter drives expression of a signal sequence and a sequence encoding a cascade reagent protein. In some embodiments, the promoter comprises a nucleic acid sequence derived from a promoter of a secretory gene. In some embodiments, the promoter comprises a nucleic acid sequence derived from a promoter of a C. glutamicum secretory gene, for example the promoters listed in Table 2.

[0063] In some embodiments, the promoter may comprise a nucleic acid sequence having at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identity to the nucleic acid sequences of SEQ ID NO: 9-13.

TABLE 2

(ii) Signal Sequence

[0064] Signal sequences are sequences of DNA encoding a signal peptide. Signal sequences may be referred to as localization signals, localization sequences, leader sequences, or targeting signals and a signal peptide may be referred to as a transit peptide or leader peptide. Signal peptides are short peptides that prompt a cell to translocate the protein, and are often present at the N-terminus of proteins destined for secretion, which may include translocation to certain organelles, secretion from the cell, or insertion into cellular membranes.

[0065] The expression cassettes of the disclosure may comprise a signal sequence. The signal sequence may encode a signal peptide. In some embodiments, the signal sequence is located between the promoter and the sequence encoding the cascade reagent protein. In some embodiments, the signal sequence is located between the promoter and the sequence encoding a polypeptide protein tag. In some embodiments, the signal sequence comprises a nucleic acid sequence derived from a signal sequence of a secretory gene. In some embodiments, the signal sequence comprises a nucleic acid sequence derived from a signal sequence of a C. glutamicum secretory gene, for example the signal sequences listed in Table 3.

TABLE 3

[0066] In some embodiments, the core of the signal peptide may comprise a sequence of hydrophobic amino acids. The sequence of hydrophobic amino acids may be about 5 to 16 residues in length, for example 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, or 16 residues in length. In some embodiments, the signal peptide may comprise a short positively charged sequence of amino acids at the N-terminus. In some embodiments, the signal peptide may comprise a sequence of amino acids recognized and cleaved by signal peptidases.
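By way of non-limiting illustration, the signal peptide features described above (a positively charged N-terminal region followed by a hydrophobic core of about 5 to 16 residues) can be screened for with a short script. The following is a minimal sketch under stated assumptions: the set of residues treated as hydrophobic (`HYDROPHOBIC`), the function names, and the example peptide are hypothetical choices made for illustration and are not taken from the disclosure or its Sequence Listing.

```python
# Illustrative sketch only. The hydrophobic residue set below is an
# assumption; different hydrophobicity scales would classify residues differently.
HYDROPHOBIC = set("AVLIMFWC")
POSITIVE = set("KR")


def longest_hydrophobic_run(peptide: str) -> tuple[int, int]:
    """Return (start index, length) of the longest run of hydrophobic residues."""
    best_start, best_len = 0, 0
    run_start = None
    # A trailing sentinel character closes any run that reaches the end.
    for i, aa in enumerate(peptide.upper() + "*"):
        if aa in HYDROPHOBIC:
            if run_start is None:
                run_start = i
        elif run_start is not None:
            run_len = i - run_start
            if run_len > best_len:
                best_start, best_len = run_start, run_len
            run_start = None
    return best_start, best_len


def has_hydrophobic_core(peptide: str, min_len: int = 5, max_len: int = 16) -> bool:
    """True if the longest hydrophobic run is within the recited 5-16 residue range."""
    _, length = longest_hydrophobic_run(peptide)
    return min_len <= length <= max_len


def has_positive_n_region(peptide: str) -> bool:
    """True if at least one Lys or Arg precedes the hydrophobic core."""
    core_start, _ = longest_hydrophobic_run(peptide)
    return any(aa in POSITIVE for aa in peptide.upper()[:core_start])


# Made-up peptide for demonstration; not a sequence from the disclosure.
example = "MKKRLLLAVLLAASAQA"
```

With the made-up peptide above, the longest hydrophobic run begins at index 4 and spans 9 residues, and the residues preceding it include lysine and arginine. A screen of this kind is only a coarse filter and does not predict whether a signal peptidase will cleave the peptide.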

[0067] In some embodiments, the signal sequence may comprise a nucleic acid sequence having at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identity to the nucleic acid sequences of SEQ ID NO: 14-18.

[0068] In some embodiments, the signal sequence may encode an amino acid sequence having at least 60%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identity to the amino acid sequences of SEQ ID NO: 302-306.

(iii) Sequence Encoding a Cascade Reagent Protein

[0069] A sequence encoding a cascade reagent protein is a sequence of DNA encoding any one of the cascade reagent proteins of the LAL assay disclosed herein. A protein encoded by this sequence may also be referred to as a recombinant cascade reagent (RCR), and may include any one of three recombinant protease zymogens, namely Factor C, Factor B, and Proclotting Enzyme, and a clotting protein, namely Coagulogen.

[0070] The expression cassettes of the disclosure may comprise a sequence encoding a cascade reagent protein. The sequence encoding a cascade reagent protein may be isolated or derived from the genome of any horseshoe crab, for example Tachypleus tridentatus, Limulus polyphemus, Tachypleus gigas, or Carcinoscorpius rotundicauda. In some embodiments, the sequence encoding a cascade reagent protein may be optimized for expression in C. glutamicum, e.g., the sequences listed in Table 4. In some embodiments, optimization for expression in C. glutamicum may include HEG, RPG, tRNA, general, or OptimWiz optimization and/or optimizing GC content of the DNA sequence.

TABLE 4
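By way of non-limiting illustration, the GC content referred to in paragraph [0070] can be computed as the fraction of G and C nucleotides in a DNA sequence. The following is a minimal sketch; the function names and the `target` parameter are hypothetical, and no particular target GC content for C. glutamicum is asserted here.

```python
# Illustrative sketch only: computes GC fraction of a DNA sequence and its
# distance from a caller-supplied target value (the target is an assumption).


def gc_content(dna: str) -> float:
    """Return the fraction of G and C nucleotides in `dna` (0.0 to 1.0)."""
    seq = dna.upper()
    if not seq:
        raise ValueError("sequence is empty")
    invalid = set(seq) - set("ACGT")
    if invalid:
        raise ValueError(f"unexpected characters: {sorted(invalid)}")
    return (seq.count("G") + seq.count("C")) / len(seq)


def gc_deviation(dna: str, target: float) -> float:
    """Return the absolute difference between the GC fraction and `target`."""
    if not 0.0 <= target <= 1.0:
        raise ValueError("target must be a fraction between 0 and 1")
    return abs(gc_content(dna) - target)
```

A candidate coding sequence could be compared against alternative synonymous versions by ranking them on `gc_deviation`, although codon usage and other factors named above would also bear on the choice.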

[0071] In some embodiments, the sequence encoding a cascade reagent protein may be truncated or mutated from the wild type sequence. In some embodiments, the sequence encoding a cascade reagent protein may encode a recombinant protein with activity higher than, lower than, or equivalent to that of the wild type protein. In some embodiments the sequence encoding a cascade reagent protein may encode a cascade reagent protein homolog.

[0072] In some embodiments, the sequence encoding a cascade reagent protein may encode the Factor C serine protease zymogen. In some embodiments, the sequence encoding the cascade reagent protein Factor C may comprise a nucleic acid sequence having at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identity to the nucleic acid sequences of SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 278-283, or SEQ ID NO: 325.

[0073] In some embodiments, the sequence encoding a cascade reagent protein may encode the Factor B serine protease zymogen and homologs thereof, e.g., C3 and C2/Bf. In some embodiments, the sequence encoding the cascade reagent protein Factor B may comprise a nucleic acid sequence having at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identity to the nucleic acid sequences of SEQ ID NO: 3, SEQ ID NO: 4, or SEQ ID NO: 284-289.

[0074] In some embodiments, the sequence encoding a cascade reagent protein may encode the Proclotting Enzyme serine protease zymogen. In some embodiments, the sequence encoding the cascade reagent protein Proclotting Enzyme may comprise a nucleic acid sequence having at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identity to the nucleic acid sequences of SEQ ID NO: 5 or SEQ ID NO: 290-292.

[0075] In some embodiments, the sequence encoding a cascade reagent protein may encode the Coagulogen clotting protein. In some embodiments, the sequence encoding the cascade reagent protein Coagulogen may comprise a nucleic acid sequence having at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identity to the nucleic acid sequences of SEQ ID NO: 6-8 or SEQ ID NO: 293-301.

(iv) Termination Sequence

[0076] A termination sequence, terminator, or transcription terminator is a sequence of DNA downstream of the translational stop codon that mediates termination of transcription of operably linked nucleic acid sequences. Prokaryotic transcription terminators of the present disclosure may be Rho-dependent or Rho-independent. Transcription terminators may comprise a downstream transcription stop point sequence and/or a GC-rich region of dyad symmetry followed by a poly-A sequence to promote allosteric dissociation of the transcriptional complex and/or hairpin loop formation of the transcribed mRNA and subsequent transcription termination.

[0077] The expression cassettes of the disclosure may comprise a termination sequence. The termination sequence may be isolated or derived from the genome of any suitable organism, for example Escherichia coli (E. coli) or C. glutamicum. The termination sequence may comprise the termination region of the E. coli rrnB gene, the termination region of the C. glutamicum cgl502 gene, the termination region of the C. glutamicum cg3011 gene, the termination region of the C. glutamicum cspA gene, or the termination region of the C. glutamicum cgl338 gene. In some embodiments, the termination sequence may be wild type or optimized for expression in C. glutamicum, e.g., the sequences listed in Table 5. In some embodiments, optimization for expression in C. glutamicum may include replacing nucleotides of the wild type termination sequence to optimize GC content for expression in C. glutamicum.

TABLE 5

[0078] In some embodiments, the termination sequence may comprise the wild type rrnB termination sequence from E. coli. In some embodiments, the termination sequence may comprise the sequence of SEQ ID NO: 272, or a sequence at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identical thereto. In some embodiments, the termination sequence may comprise the rrnB termination sequence from E. coli optimized for expression in C. glutamicum. In some embodiments, the termination sequence may comprise the sequence of SEQ ID NO: 273, or a sequence at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identical thereto.

[0079] In some embodiments, the termination sequence may comprise the wild type cgl502 termination sequence from C. glutamicum. In some embodiments, the termination sequence may comprise the sequence of SEQ ID NO: 274, or a sequence at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identical thereto. In some embodiments, the termination sequence may comprise the wild type cg3011 termination sequence from C. glutamicum. In some embodiments, the termination sequence may comprise the sequence of SEQ ID NO: 275, or a sequence at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identical thereto. In some embodiments, the termination sequence may comprise the wild type cspA termination sequence from C. glutamicum. In some embodiments, the termination sequence may comprise the sequence of SEQ ID NO: 276, or a sequence at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identical thereto. In some embodiments, the termination sequence may comprise the wild type cgl338 termination sequence from C. glutamicum. In some embodiments, the termination sequence may comprise the sequence of SEQ ID NO: 277, or a sequence at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identical thereto.

(v) Sequence Encoding a Polypeptide Protein Tag

[0080] A sequence encoding a polypeptide protein tag is a sequence of DNA encoding a peptide sequence, protein tag, or polypeptide protein tag. A sequence encoding a polypeptide protein tag may be fused, appended, or grafted to a sequence encoding a protein, generally at either the C-terminus or N-terminus, or at both the C-terminus and the N-terminus of the protein. Less frequently a sequence encoding a polypeptide protein tag may be inserted into the sequence encoding a protein. A polypeptide protein tag may be appended to a protein to aid in affinity purification from biological lysate, enhance resolution of chromatographic separation, and/or promote solubilization and proper folding of proteins prone to precipitation. Polypeptide protein tags may comprise polyanionic amino acids or epitope tags.

[0081] The expression cassettes of the disclosure may comprise a sequence encoding a polypeptide protein tag. In some embodiments, from 5’ to 3’ the sequence encoding a polypeptide protein tag may be located between the signal sequence and the sequence encoding the cascade reagent protein. In some embodiments, from 5’ to 3’ the sequence encoding a polypeptide protein tag may be located between the sequence encoding the cascade reagent protein and the termination sequence. In some embodiments, from 5’ to 3’ a linker is located between the sequence encoding a polypeptide protein tag and the sequence encoding the cascade reagent protein. In some embodiments, from 5’ to 3’ a linker is located between the sequence encoding the cascade reagent protein and the sequence encoding a polypeptide protein tag.

[0082] In some embodiments, two or more sequences encoding polypeptide protein tags may be located in tandem at the 5’ end or the 3’ end of the sequence encoding the cascade reagent protein. In some embodiments, the sequence encoding the cascade reagent protein may be located between two sequences encoding polypeptide protein tags, i.e., the sequences encoding polypeptide protein tags flank the sequence encoding the cascade reagent protein. In some embodiments, sequences encoding linkers are located between the sequence encoding the cascade reagent protein and the flanking sequences encoding the polypeptide protein tags.

[0083] In some embodiments, the cascade reagent protein may be N-terminally tagged with the polypeptide protein tag. In some embodiments, the cascade reagent protein may be C-terminally tagged with the polypeptide protein tag. In some embodiments, the cascade reagent protein may be N-terminally or C-terminally tagged with tandem polypeptide protein tags. In some embodiments, the cascade reagent protein may be both N-terminally and C-terminally tagged with polypeptide protein tags. In some embodiments, the two or more polypeptide protein tags are identical. In some embodiments, the two or more polypeptide protein tags are not identical. In some embodiments, cleavable, flexible, and/or rigid linkers may separate the polypeptide protein tag or tags from the cascade reagent protein.
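By way of non-limiting illustration, the 5’ to 3’ ordering of cassette elements described in the preceding paragraphs (promoter, signal sequence, optional N-terminal tag and linker, cascade reagent protein, optional linker and C-terminal tag, optional terminator) can be represented as a simple data structure. The following is a minimal sketch; the class `CassetteDesign`, its field names, and the `layout` method are hypothetical and serve only to show the ordering. The element labels follow the hyphenated notation used elsewhere in this disclosure for exemplary cassettes.

```python
# Illustrative sketch only: models the 5' to 3' order of cassette elements
# as labels; it does not handle nucleotide sequences or reading frames.
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class CassetteDesign:
    promoter: str
    signal_sequence: str
    cascade_reagent: str
    n_tag: Optional[str] = None      # tag between signal sequence and protein
    n_linker: Optional[str] = None   # linker between N-terminal tag and protein
    c_linker: Optional[str] = None   # linker between protein and C-terminal tag
    c_tag: Optional[str] = None      # tag between protein and terminator
    terminator: Optional[str] = None

    def parts(self) -> List[str]:
        """Return the elements present, ordered from 5' to 3'."""
        # A linker only makes sense next to the tag it separates from the protein.
        if self.n_linker and not self.n_tag:
            raise ValueError("n_linker requires n_tag")
        if self.c_linker and not self.c_tag:
            raise ValueError("c_linker requires c_tag")
        ordered = [
            self.promoter,
            self.signal_sequence,
            self.n_tag,
            self.n_linker,
            self.cascade_reagent,
            self.c_linker,
            self.c_tag,
            self.terminator,
        ]
        return [p for p in ordered if p]

    def layout(self) -> str:
        """Return a hyphen-joined label of the cassette from 5' to 3'."""
        return "-".join(self.parts())


design = CassetteDesign(
    promoter="Pcgl514",
    signal_sequence="cgl514ss",
    cascade_reagent="Factor C",
    n_tag="6His",
    terminator="rrnB terminator",
)
```

Calling `design.layout()` yields `Pcgl514-cgl514ss-6His-Factor C-rrnB terminator`. Supplying both `n_tag` and `c_tag` models the flanking arrangement in which tags are present at both termini.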

[0084] In some embodiments, the sequence encoding a polypeptide protein tag may encode a peptide or protein tag, for example a polyhistidine-tag, FLAG-tag, HA-tag, calmodulin-binding peptide, streptavidin-binding peptide, glutathione S-transferase, or maltose-binding protein. In some embodiments, the sequence encoding a polypeptide protein tag may encode a polyhistidine-tag, also referred to as His-tag, His6 tag, poly(His) tag, or 6His, which may be about 5-10 residues in length, for example 5, 6, 7, 8, 9, or 10 residues in length, e.g., the amino acid sequence

In some embodiments, the sequence encoding a polypeptide protein tag may encode a

FLAG-tag, also referred to as FLAG octapeptide or FLAG epitope, which may have the amino acid sequence D

and may be used in tandem and with some variation in sequence identity, e.g., the 3xFLAG peptide of amino acid sequence

In some embodiments, the sequence encoding a polypeptide protein tag may encode an HA-tag, also referred to as the human influenza hemagglutinin tag, which may be derived from amino acids

98-106 of the human influenza hemagglutinin protein and may have the amino acid sequence

YPYDVPDYA. In some embodiments, the sequence encoding a polypeptide protein tag may encode a calmodulin-binding peptide, also referred to as a calmodulin-binding protein peptide tag,

CBP-tag, or calmodulin-tag, which may have the amino acid sequence In some embodiments, the sequence encoding a

polypeptide protein tag may encode a streptavidin-binding peptide, also referred to as an SBP or streptavidin-tag, including a 38-amino acid sequence or 8-amino acid sequences of the Strep-tag system (e.g., the Strep-tag or Strep-tag II), which may have the amino acid sequence WSHPQFEK. In some embodiments, the sequence encoding a polypeptide protein tag may encode a glutathione S-transferase protein, also referred to as a GST-tag, which may be about 220 amino acids in length and may be derived from a sequence encoding a wild type glutathione S-transferase. In some embodiments, the sequence encoding a polypeptide protein tag may encode a maltose binding protein, also referred to as MBP-tag or maltose tag, which may be about 370-396 amino acids in length and may be derived from the malE gene of E. coli.

[0085] In some embodiments, the sequence encoding a polypeptide protein tag may be wild type or optimized for expression in C. glutamicum, e.g., the sequences listed in Table 6. In some embodiments, optimization for expression in C. glutamicum may include HEG, RPG, tRNA, general, or OptimWiz optimization and/or optimizing GC content of the DNA sequence.

TABLE 6

[0086] In some embodiments, the sequence encoding a polypeptide protein tag may comprise a nucleic acid sequence having at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identity to the nucleic acid sequences of SEQ ID NO: 19-32.

[0087] In some embodiments, the sequence encoding a polypeptide tag may encode an amino acid sequence having at least 60%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identity to the amino acid sequences of SEQ ID NO: 307-314.

(vi) Sequence Encoding a Linker

[0088] A sequence encoding a linker is a sequence of DNA encoding a polypeptide linker. Polypeptide linkers may be cleavable, rigid, and/or flexible polypeptides. Polypeptide linkers, also referred to as linkers, may link functional protein domains together or release free functional domains after cleavage. Linkers may be isolated from or derived from naturally-occurring multidomain proteins, or may be designed de novo. Linkers may increase stability, promote folding, increase expression, or improve biological activity of the protein domains they are fused to. Properties of linkers, including length, hydrophobicity, amino acid residues, and secondary structure, may vary. For instance, linkers may adopt various conformations, such as β-strand, helical, coil/bend, and turns.

[0089] The expression cassettes of the disclosure may comprise a sequence encoding a linker. In some embodiments, the sequence encoding a linker may encode a polypeptide about 3-30 residues in length, for example 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, or 30 residues in length. In some embodiments, the sequence encoding a linker may be located between a 5’ sequence encoding a polypeptide protein tag and a 3’ sequence encoding a cascade reagent protein. In some embodiments, the sequence encoding a linker may be located between a 5’ sequence encoding a cascade reagent protein and a 3’ sequence encoding a polypeptide protein tag. In some embodiments, polar uncharged or charged residues are preferable amino acids of the linker.

[0090] In some embodiments, the sequence encoding a linker may encode a flexible GS linker, for example

(Gly)₇, or (Gly)₈. In some embodiments, the sequence encoding a linker may encode a rigid α-helical linker, for example

or In some embodiments, the sequence encoding a linker may

encode a rigid proline-rich linker, for example PAPAP, (AP)n, (KP)n, or (EP)n, wherein n is 3-4.

In some embodiments, the sequence encoding a linker may encode a cleavable disulfide linker, for example LEAGCKNFFPRSFTSCGSLE, or a cleavable protease linker, for example GFLG.

[0091] In some embodiments, the sequence encoding a linker may be optimized for expression in C. glutamicum, e.g., the sequences listed in Table 7. In some embodiments, optimization for expression in C. glutamicum may include HEG, RPG, tRNA, general, or OptimWiz optimization and/or optimizing GC content of the DNA sequence.

TABLE 7

[0092] In some embodiments, the sequence encoding a linker may comprise a nucleic acid sequence having at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identity to the nucleic acid sequences of SEQ ID NO: 265-271.

[0093] In some embodiments, the sequence encoding a linker may encode an amino acid sequence having at least 60%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identity to the amino acid sequences of SEQ ID NO: 315-321.

(vii) Exemplary Expression Cassettes

[0094] In some embodiments, the expression cassette comprises from 5’ to 3’ a promoter, a signal sequence, and a sequence encoding a cascade reagent protein. In some embodiments, the expression cassette comprises SEQ ID NO: 9 (cgl514 promoter), SEQ ID NO: 14 (cgl514 signal sequence), and SEQ ID NO: 1 (HEG optimized Factor C (T. tridentatus)), SEQ ID NO: 325 (OptimWiz optimized Factor C (T. tridentatus)), or SEQ ID NO: 2 (Factor C (C. rotundicauda)). In some embodiments, the expression cassette comprises SEQ ID NO: 9 (cgl514 promoter), SEQ ID NO: 14 (cgl514 signal sequence), and SEQ ID NO: 3 (Factor B (T. tridentatus)) or SEQ ID NO: 4 (Factor B (C. rotundicauda)). In some embodiments, the expression cassette comprises SEQ ID NO: 9 (cgl514 promoter), SEQ ID NO: 14 (cgl514 signal sequence), and SEQ ID NO: 5 (Proclotting Enzyme (T. tridentatus)). In some embodiments, the expression cassette comprises SEQ ID NO: 9 (cgl514 promoter), SEQ ID NO: 14 (cgl514 signal sequence), and SEQ ID NO: 6 (Coagulogen (L. polyphemus)), SEQ ID NO: 7 (Coagulogen (T. tridentatus)), or SEQ ID NO: 8 (Coagulogen (C. rotundicauda)).

[0095] In some embodiments, the expression cassette comprises from 5’ to 3’ a promoter, a signal sequence, a sequence encoding a cascade reagent protein, and a terminator. In some embodiments, the expression cassette comprises SEQ ID NO: 9 (cgl514 promoter), SEQ ID NO: 14 (cgl514 signal sequence), SEQ ID NO: 1 (HEG optimized Factor C (T. tridentatus)), SEQ ID NO: 325 (OptimWiz optimized Factor C (T. tridentatus)), or SEQ ID NO: 2 (Factor C (C. rotundicauda)), and SEQ ID NO: 272 (wild type rrnB termination sequence) or SEQ ID NO: 273 (optimized rrnB termination sequence). In some embodiments, the expression cassette comprises

SEQ ID NO: SEQ ID NO: 322

tridentatus version 2)-rmB terminator), or SEQ ID NO: 113

rotundicauda)-rmB terminator). In some embodiments, the

expression cassette comprises SEQ ID NO: 9 (cgl514 promoter), SEQ ID NO: 14 (cgl514 signal sequence), SEQ ID NO: 3 (Factor B (T. tridentatus)) or SEQ ID NO: 4 (Factor B (C. rotundicauda)), and SEQ ID NO: 272, SEQ ID NO: 273, SEQ ID NO: 274, SEQ ID NO: 275, SEQ ID NO: 276, or SEQ ID NO: 277 (wild type or optimized E. coli rrnB termination sequences, C. glutamicum wild type cgl502, cg3011, cspA, or cgl338 termination sequences, respectively). In some embodiments, the expression cassette comprises SEQ ID NO: 101 (Pcgl514-cgl514ss-Factor B (T. tridentatus)-rrnB terminator) or SEQ ID NO: 117

rotundicauda)-rrnB terminator). In some embodiments, the expression cassette comprises SEQ ID NO: 9 (cgl514 promoter), SEQ ID NO: 14 (cgl514 signal sequence), SEQ ID NO: 5 (Proclotting Enzyme from T. tridentatus), and SEQ ID NO: 272, SEQ ID NO: 273, SEQ ID NO: 274, SEQ ID NO: 275, SEQ ID NO: 276, or SEQ ID NO: 277 (wild type or optimized E. coli rrnB termination sequences, C. glutamicum wild type cgl502, cg3011, cspA, or cgl338 termination sequences, respectively). In some embodiments, the expression cassette comprises SEQ ID NO: 105 (Pcgl514-cgl514ss-Proclotting Enzyme (T. tridentatus)-rrnB terminator). In some embodiments, the expression cassette comprises SEQ ID NO: 9 (cgl514 promoter), SEQ ID NO: 14 (cgl514 signal sequence), SEQ ID NO: 6 (Coagulogen (L. polyphemus)), SEQ ID NO: 7 (Coagulogen (T. tridentatus)), or SEQ ID NO: 8 (Coagulogen (C. rotundicauda)), and SEQ ID NO: 272, SEQ ID NO: 273, SEQ ID NO: 274, SEQ ID NO: 275, SEQ ID NO: 276, or SEQ ID NO: 277 (wild type or optimized E. coli rrnB termination sequences, C. glutamicum wild type cgl502, cg3011, cspA, or cgl338 termination sequences, respectively). In some embodiments, the expression cassette comprises SEQ ID NO: 109

SEQ ID NO: 121

terminator), or SEQ ID

NO: 125 rotundicauda)-rrnB terminator).

[0096] In some embodiments, the expression cassette comprises from 5’ to 3’ a promoter, a signal sequence, a sequence encoding a polypeptide protein tag, a sequence encoding a cascade reagent protein, and a terminator. In some embodiments, the expression cassette comprises from 5’ to 3’ SEQ ID NO: 9 (cgl514 promoter), SEQ ID NO: 34, SEQ ID NO: 37, SEQ ID NO: 39, SEQ ID NO: 41, SEQ ID NO: 43, SEQ ID NO: 45, or SEQ ID NO: 47 (cgl514ss-tag-Factor C where the tag is 6His, FLAG, HA, CBP, SBP, GST, or MBP, respectively), and SEQ ID NO: 272, SEQ ID NO: 273, SEQ ID NO: 274, SEQ ID NO: 275, SEQ ID NO: 276, or SEQ ID NO: 277 (E. coli rrnB wild type or optimized termination sequences, C. glutamicum wild type cgl502, cg3011, cspA, or cgl338 termination sequences, respectively). In some embodiments, the expression cassette comprises SEQ ID NO: 98 tridentatus)-rrnB

terminator), SEQ ID NO: 323

(T. tridentatus version 2)-rrnB terminator), SEQ ID NO: 114 rotundicauda)-rrnB

terminator). In some embodiments, the expression cassette comprises from 5’ to 3’ SEQ ID NO: 9 (cgl514 promoter), SEQ ID NO: 50, SEQ ID NO: 53, SEQ ID NO: 55, SEQ ID NO: 57, SEQ ID NO: 59, SEQ ID NO: 61, or SEQ ID NO: 63 (cgl514ss-tag-Factor B where the tag is 6His, FLAG, HA, CBP, SBP, GST, or MBP, respectively), and SEQ ID NO: 272, SEQ ID NO: 273, SEQ ID NO: 274, SEQ ID NO: 275, SEQ ID NO: 276, or SEQ ID NO: 277 (E. coli rrnB wild type or optimized termination sequences, C. glutamicum wild type cgl502, cg3011, cspA, or cgl338 termination sequences, respectively). In some embodiments, the expression cassette comprises

SEQ ID NO terminator) or SEQ ID

NO: 118 rotundicauda)-rmB terminator). In some

embodiments, the expression cassette comprises from 5’ to 3’ SEQ ID NO: 9 (cgl514 promoter),

SEQ ID NO: 66, SEQ ID NO: 69, SEQ ID NO: 71, SEQ ID NO: 73, SEQ ID NO: 75, SEQ ID

NO: 77, or SEQ ID NO: 79 g Enzyme where the tag is 6His, FLAG, HA,

CBP, SBP, GST, or MBP, respectively), and SEQ ID NO: 272, SEQ ID NO: 273, SEQ ID NO:

274, SEQ ID NO: 275, SEQ ID NO: 276, or SEQ ID NO: 277 (E. coli rmB wild type or optimized termination sequences, C. glutamicum wild type cg!502, cg3011, cspA, or cg!338 termination sequences, respectively). In some embodiments, the expression cassette comprises SEQ ID NO:

106 Enzyme (7*. tridentatus)-rmB terminator). In some

SEQ ID NO: 82, SEQ ID NO: 85, SEQ ID NO: 87, SEQ ID NO: 89, SEQ ID NO: 91, SEQ ID

NO: 93, or SEQ ID NO: 95 where the tag is 6His, FLAG, HA, CBP,

SBP, GST, or MBP, respectively), and SEQ ID NO: 272, SEQ ID NO: 273, SEQ ID NO: 274,

SEQ ID NO: 275, SEQ ID NO: 276, or SEQ ID NO: 277 (E. coli rmB wild type or optimized termination sequences, C. glutamicum wild type cgl502, cg3011, cspA, or cgl338 termination sequences, respectively). In some embodiments, the expression cassette comprises SEQ ID NO: SEQ ID NO: 122

terminator), or SEQ ID NO: 126

[0097] In some embodiments, the expression cassette comprises from 5’ to 3’ a promoter, a signal sequence, a sequence encoding a cascade reagent protein, a sequence encoding a polypeptide protein tag, and a terminator. In some embodiments, the expression cassette comprises from 5’ to 3’ SEQ ID NO: 9 (cgl514 promoter), SEQ ID NO: 35, SEQ ID NO: 38, SEQ ID NO:

40, SEQ ID NO: 42, SEQ ID NO: 44, SEQ ID NO: 46, or SEQ ID NO:

tag where the tag is 6His, FLAG, HA, CBP, SBP, GST, or MBP, respectively), and SEQ ID NO: 272, SEQ ID NO: 273, SEQ ID NO: 274, SEQ ID NO: 275, SEQ ID NO: 276, or SEQ ID NO: 277 (E. coli rrnB wild type or optimized termination sequences, C. glutamicum wild type cgl502, cg3011, cspA, or cgl338 termination sequences, respectively). In some embodiments, the expression cassette comprises SEQ ID NO: 99 (Pcgl514-cgl514ss-Factor C (T. tridentatus)-6His-rrnB terminator), SEQ ID NO: 324 (Pcgl514-cgl514ss-Factor C (T. tridentatus version 2)-6His-rrnB terminator), SEQ ID NO: 115 (Pcgl514-cgl514ss-Factor C

9 (cgl514 promoter), SEQ ID NO: 51, SEQ ID NO: 54, SEQ ID NO: 56, SEQ ID NO: 58, SEQ

ID NO: 60, SEQ ID NO: 62, or SEQ ID NO: 64 where the tag is 6His,

SEQ ID NO: 274, SEQ ID NO: 275, SEQ ID NO: 276, or SEQ ID NO: 277 (£. coli rrnB wild type or optimized termination sequences, C. glutamicum wild type cg!502, cg3011, cspA, or cgl338 termination sequences, respectively). In some embodiments, the expression cassette comprises

SEQ ID NO: 103

terminator) or SEQ ID

NO: 119

terminator). In some embodiments, the expression cassette comprises from 5’ to 3’ SEQ ID NO: 9 (cg!514 promoter),

SEQ ID NO: 67, SEQ ID NO: 70, SEQ ID NO: 72, SEQ ID NO: 74, SEQ ID NO: 76, SEQ ID

NO: 78, or SEQ ID NO: 80

Enzyme-tag where the tag is 6His, FLAG, HA

274, SEQ ID NO: 275, SEQ ID NO: 276, or SEQ ID NO: 277 (E coli rrnB wild type or optimized termination sequences,

type or termination

sequences, respectively). In some embodiments, the expression cassette comprises SEQ ID NO:

107 Enzyme In some

embodiments, the expression cassette comprises from 5’ to 3’ SEQ ID NO: 9 (cgl514 promoter),

SEQ ID NO: 83, SEQ ID NO: 86, SEQ ID NO: 88, SEQ ID NO: 90, SEQ ID NO: 92, SEQ ID

NO: 94, or SEQ ID NO: where the tag is 6His, FLAG, HA, CBP,

SEQ ID NO: 275, SEQ ID NO: 276, or SEQ ID NO: 277 (E. coli rrnB wild type or optimized termination sequences, wild type or cgl338 termination

111

SEQ ID NO: 123 or SEQ ID NO: 127

[0098] In some embodiments, the expression cassette comprises from 5’ to 3’ a promoter, a signal sequence, a sequence encoding a polypeptide protein tag, a sequence encoding a cascade reagent protein, a sequence encoding a polypeptide protein tag, and a terminator. In some embodiments, the expression cassette comprises from 5’ to 3’ SEQ ID NO: 9 (cgl514 promoter), SEQ ID NO: 36 (cgl514ss-6His-Factor C-6His), and SEQ ID NO: 272, SEQ ID NO: 273, SEQ ID NO: 274, SEQ ID NO: 275, SEQ ID NO: 276, or SEQ ID NO: 277 (E. coli rrnB wild type or optimized termination sequences, C. glutamicum wild type cgl502, cg3011, cspA, or cgl338 termination sequences, respectively). In some embodiments, the expression cassette comprises

SEQ ID NO: 100

tridentatus)-6His-rrnB terminator) or

SEQ ID NO: 116

(C. rotundicauda)-6His-rrnB terminator). In some embodiments, the expression cassette comprises from 5’ to 3’ SEQ ID NO: 9 (cgl514 promoter), SEQ ID NO: 52 (cgl514ss-6His-Factor B-6His), and SEQ ID NO: 272, SEQ ID NO: 273, SEQ ID NO: 274, SEQ ID NO: 275, SEQ ID NO: 276, or SEQ ID NO: 277 (E. coli rrnB wild type or optimized termination sequences, C. glutamicum wild type cgl502, cg3011, cspA, or cgl338 termination sequences, respectively). In some embodiments, the expression cassette comprises SEQ ID NO: 104

(T. tridentatus)-6His-rrnB terminator) or SEQ ID NO: 120

(C. rotundicauda)-6His-rrnB terminator). In some embodiments, the expression cassette comprises from 5’ to 3’ SEQ ID NO: 9 (cgl514 promoter), SEQ ID NO: 68 -Proclotting Enzyme-6His), and SEQ ID NO: 272, SEQ ID NO: 273, SEQ ID NO: 274, SEQ ID NO: 275, SEQ ID NO: 276, or SEQ ID NO: 277 (E. coli rrnB wild type or optimized termination sequences, C. glutamicum wild type cgl502, cg3011, cspA, or cgl338 termination sequences, respectively). In some embodiments, the expression cassette comprises SEQ ID NO: 108 -Proclotting Enzyme (T.

. In some embodiments, the expression cassette comprises from

5’ to 3’ SEQ ID NO: 9 (cgl514 promoter), SEQ ID NO: 84 (cgl514ss-6His-Coagulogen-6His), and SEQ ID NO: 272, SEQ ID NO: 273, SEQ ID NO: 274, SEQ ID NO: 275, SEQ ID NO: 276, or SEQ ID NO: 277 (E. coli rrnB wild type or optimized termination sequences, C. glutamicum wild type cgl502, cg3011, cspA, or cgl338 termination sequences, respectively). In some embodiments, the expression cassette comprises SEQ ID NO: 112

Coagulogen , SEQ ID NO: 124

Coagulogen

or SEQ ID NO: 128

6His-Coagulogen

[0099] In some embodiments, the expression cassette comprises from 5’ to 3’ a promoter, a signal sequence, a sequence encoding a cascade reagent protein, and a terminator. In some embodiments, the expression cassette comprises from 5’ to 3’ SEQ ID NO: 9 (cgl514 promoter), SEQ ID NO: 33 and SEQ ID NO: 272, SEQ ID NO: 273, SEQ ID NO: 274, SEQ ID NO: 275, SEQ ID NO: 276, or SEQ ID NO: 277 (E. coli rrnB wild type or optimized termination sequences, C. glutamicum wild type cgl502, cg3011, cspA, or cgl338 termination sequences, respectively). In some embodiments, the expression cassette comprises from 5’ to 3’ SEQ ID NO: 9 (cgl514 promoter), SEQ ID NO: 49

and SEQ ID NO: 272, SEQ ID NO: 273, SEQ ID NO: 274, SEQ ID NO: 275, SEQ ID NO: 276, or SEQ ID NO: 277 (E. coli rrnB wild type or optimized termination sequences, C. glutamicum wild type cgl502, cg3011, cspA, or cgl338 termination sequences, respectively). In some embodiments, the expression cassette comprises from 5’ to 3’ SEQ ID NO: 9 (cgl514 promoter), SEQ ID NO: 65

Proclotting Enzyme), and SEQ ID NO: 272, SEQ ID NO: 273, SEQ ID NO: 274, SEQ ID NO: 275, SEQ ID NO: 276, or SEQ ID NO: 277 (E. coli rrnB wild type or optimized termination sequences, C. glutamicum wild type cgl502, cg3011, cspA, or cgl338 termination sequences, respectively). In some embodiments, the expression cassette comprises from 5’ to 3’ SEQ ID NO: 9 (cgl514 promoter), SEQ ID NO: 81 and SEQ ID NO: 272, SEQ ID NO: 273, SEQ ID NO: 274, SEQ ID NO: 275, SEQ ID NO: 276, or SEQ ID NO: 277 (E. coli rrnB wild type or optimized termination sequences, C. glutamicum wild type cgl502, cg3011, cspA, or cgl338 termination sequences, respectively).

Methods of Recombinant Protein Expression and Purification

[0100] The disclosure provides methods of recombinant protein expression. In some embodiments, the expression cassette is cloned into a plasmid. In some embodiments, the expression cassette may be cloned into a multiple cloning site of a plasmid using restriction enzyme cloning, Gateway cloning, or TOPO cloning. In some embodiments, the expression cassette may be Gibson assembled into a plasmid. In some embodiments, the expression cassette may be inserted into a plasmid using a combination of restriction enzyme cloning, Gateway cloning, TOPO cloning, and/or Gibson assembly. In some embodiments, nucleic acid sequences may comprise restriction enzyme recognition sites and/or recombination sequences to facilitate cloning. In some embodiments, restriction enzyme recognition sites and/or recombination sequences may be located at the 5’ and/or 3’ ends of: the promoter, the signal sequence, the sequence encoding a cascade reagent protein, the termination sequence, the sequence encoding a polypeptide tag, and the sequence encoding a linker. In some embodiments, restriction enzyme recognition sites and/or recombination sequences may be located at the 5’ and/or 3’ ends of two or more sequences encoding polypeptide protein tags and two or more sequences encoding linkers.

[0101] In some embodiments, a plasmid may be a cloning vector, a transfer vector, a shuttle vector, or an expression vector. In some embodiments, a suitable plasmid may be a mobilizable E. coli - C. glutamicum shuttle vector. In some embodiments, a suitable plasmid may be the pEC-pk18mob2 plasmid.

[0102] The disclosure provides methods of recombinant protein purification. In some embodiments, the RCRs of the present invention may be purified from cultures of recombinant C. glutamicum cells expressing nucleic acid molecules, including expression cassettes and plasmids. In some embodiments, the expression cassette comprises a sequence encoding a polypeptide tag fused to the 5’ end or the 3’ end of the sequence encoding a cascade reagent protein. The polypeptide tag may comprise a solubilization tag that facilitates proper protein folding and prevents precipitation during purification. The polypeptide tag may comprise an affinity tag that facilitates affinity purification. The polypeptide tag may comprise a chromatographic tag that modulates resolution during chromatographic separation. The polypeptide tag may comprise an epitope tag that facilitates antibody purification.

[0103] In some embodiments, the RCR may be purified from culture supernatant or cell lysate using column chromatography. In some embodiments, the culture supernatant or cell lysate may be applied to a column, the column may be washed, and bound protein may be eluted from the column. In some embodiments, additives and chelating agents, e.g., EDTA, may be incorporated into buffers during purification. In some embodiments, the tagged protein binds to the column matrix and may be eluted by competitive binding, cleavage of the protein tag, or by destabilization of the interaction between the protein tag and the column matrix, e.g., by a change of pH. In some embodiments, the RCR may be purified by fast protein liquid chromatography (FPLC), batch spin, or drip columns. In some embodiments, elution fractions may be assayed for protein concentration and RCR activity and concentrated to obtain higher protein concentrations. In some embodiments, the RCR is purified to apparent homogeneity.

[0104] In some embodiments, the isolated, purified protein molecule is an RCR derived from T. tridentatus, e.g., serine protease zymogen or clotting protein optimized for expression in C. glutamicum. In some embodiments, the amino acid sequence of the serine protease zymogen is at least 75% identical to SEQ ID NO: 257 (Factor C), SEQ ID NO: 259 (Factor B), or SEQ ID NO: 261 (Proclotting Enzyme). In some embodiments, the amino acid sequence of the clotting protein is at least 75% identical to SEQ ID NO: 263 (Coagulogen).
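By way of illustration only, the "at least 75% identical" criterion recited above can be sketched as a comparison over a pairwise alignment. The disclosure does not specify an alignment method or an identity formula, so the sketch below assumes that the two sequences have already been aligned to equal length with "-" marking gaps, and that identity is counted over columns in which neither sequence has a gap. The function names and the toy peptide strings are hypothetical and are not taken from the Sequence Listing.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over aligned columns where neither sequence has a gap.

    Assumption: the inputs are pre-aligned and of equal length. Other
    conventions (e.g., dividing by the full alignment length) give
    different values.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # skip gapped columns
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


def meets_threshold(aligned_a: str, aligned_b: str, threshold: float = 75.0) -> bool:
    """True when the aligned pair is at least `threshold` percent identical."""
    return percent_identity(aligned_a, aligned_b) >= threshold


if __name__ == "__main__":
    # Toy peptides (hypothetical), differing at one of eight positions.
    print(percent_identity("ACDEFGHK", "ACDEFGHR"))
    print(meets_threshold("ACDEFGHK", "ACDEFGHR"))
```

In practice the alignment itself would come from a dedicated alignment tool; only the thresholding step is shown here.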

[0105] In some embodiments, the isolated, purified protein molecule is an RCR derived from C. rotundicauda including homologs thereof, e.g., Factor B C3 and C2/Bf. In some embodiments, the isolated, purified protein molecule is a serine protease zymogen or clotting protein optimized for expression in C. glutamicum. In some embodiments, the amino acid sequence of the serine protease zymogen is at least 75% identical to SEQ ID NO: 258 (Factor C) or SEQ ID NO: 260 (Factor B). In some embodiments, the amino acid sequence of the clotting protein is at least 75% identical to SEQ ID NO: 264 (Coagulogen).

[0106] In some embodiments, the isolated, purified protein molecule is an N-terminal signal peptide fused to an RCR derived from T. tridentatus, e.g., a serine protease zymogen or clotting protein and optimized for expression in C. glutamicum. In some embodiments, the amino acid sequence of the RCR is at least 75% identical to SEQ ID NO: 129 (cgl514ss-Factor C), SEQ ID NO: 145 (cgl514ss-Factor B), SEQ ID NO: 161 (cgl514ss-Proclotting Enzyme), or SEQ ID NO: 177 (cgl514ss-Coagulogen).

[0107] In some embodiments, the isolated, purified protein molecule is an N-terminal signal peptide fused to an RCR derived from C. rotundicauda or L. polyphemus, e.g., a serine protease zymogen or clotting protein, and optimized for expression in C. glutamicum. In some embodiments, the amino acid sequence of the RCR is at least 75% identical to SEQ ID NO: 193 (cgl514ss-Factor C from C. rotundicauda), SEQ ID NO: 209 (cgl514ss-Factor B from C. rotundicauda), SEQ ID NO: 225 (cgl514ss-Coagulogen (L. polyphemus)), or SEQ ID NO: 241 (cgl514ss-Coagulogen (C. rotundicauda)).

[0108] In some embodiments, the isolated, purified protein molecule is an N-terminal signal peptide fused to an N-terminally tagged RCR derived from T. tridentatus and optimized for expression in C. glutamicum. In some embodiments, the isolated, purified protein molecule is an N-terminal signal peptide fused to an N-terminally tagged Factor C derived from T. tridentatus and optimized for expression in C. glutamicum. In some embodiments, the amino acid sequence of the isolated, purified protein molecule is at least 75% identical to SEQ ID NO: 130, SEQ ID NO: 133, SEQ ID NO: 135, SEQ ID NO: 137, SEQ ID NO: 139, SEQ ID NO: 141, or SEQ ID NO: 143 (cgl514ss-tag-Factor C where the tag is 6His, FLAG, HA, CBP, SBP, GST, or MBP, respectively). In some embodiments, the isolated, purified protein molecule is an N-terminal signal peptide fused to an N-terminally tagged Factor B derived from T. tridentatus and optimized for expression in C. glutamicum. In some embodiments, the amino acid sequence of the isolated, purified protein molecule is at least 75% identical to SEQ ID NO: 146, SEQ ID NO: 149, SEQ ID NO: 151, SEQ ID NO: 153, SEQ ID NO: 155, SEQ ID NO: 157, or SEQ ID NO: 159 (cgl514ss-tag-Factor B where the tag is 6His, FLAG, HA, CBP, SBP, GST, or MBP, respectively). In some embodiments, the isolated, purified protein molecule is an N-terminal signal peptide fused to an N-terminally tagged Proclotting Enzyme derived from T. tridentatus and optimized for expression in C. glutamicum. In some embodiments, the amino acid sequence of the isolated, purified protein molecule is at least 75% identical to SEQ ID NO: 162, SEQ ID NO: 165, SEQ ID NO: 167, SEQ ID NO: 169, SEQ ID NO: 171, SEQ ID NO: 173, or SEQ ID NO: 175 (cgl514ss-tag-Proclotting Enzyme where the tag is 6His, FLAG, HA, CBP, SBP, GST, or MBP, respectively). In some embodiments, the isolated, purified protein molecule is an N-terminal signal peptide fused to an N-terminally tagged Coagulogen derived from T. tridentatus and optimized for expression in C. glutamicum. In some embodiments, the amino acid sequence of the isolated, purified protein molecule is at least 75% identical to SEQ ID NO: 178, SEQ ID NO: 181, SEQ ID NO: 183, SEQ ID NO: 185, SEQ ID NO: 187, SEQ ID NO: 189, or SEQ ID NO: 191 (cgl514ss-tag-Coagulogen where the tag is 6His, FLAG, HA, CBP, SBP, GST, or MBP, respectively).

[0109] In some embodiments, the isolated, purified protein molecule is an N-terminal signal peptide fused to an N-terminally tagged RCR derived from C. rotundicauda and optimized for expression in C. glutamicum. In some embodiments, the isolated, purified protein molecule is an N-terminal signal peptide fused to an N-terminally tagged Factor C derived from C. rotundicauda and optimized for expression in C. glutamicum. In some embodiments, the amino acid sequence of the isolated, purified protein molecule is at least 75% identical to SEQ ID NO: 194, SEQ ID NO: 197, SEQ ID NO: 199, SEQ ID NO: 201, SEQ ID NO: 203, SEQ ID NO: 205, or SEQ ID NO: 207 (cgl514ss-tag-Factor C where the tag is 6His, FLAG, HA, CBP, SBP, GST, or MBP, respectively). In some embodiments, the isolated, purified protein molecule is an N-terminal signal peptide fused to an N-terminally tagged Factor B derived from C. rotundicauda and optimized for expression in C. glutamicum. In some embodiments, the amino acid sequence of the isolated, purified protein molecule is at least 75% identical to SEQ ID NO: 210, SEQ ID NO: 213, SEQ ID NO: 215, SEQ ID NO: 217, SEQ ID NO: 219, SEQ ID NO: 221, or SEQ ID NO: 223 (cgl514ss-tag-Factor B where the tag is 6His, FLAG, HA, CBP, SBP, GST, or MBP, respectively). In some embodiments, the isolated, purified protein molecule is an N-terminal signal peptide fused to an N-terminally tagged Coagulogen derived from L. polyphemus and optimized for expression in C. glutamicum. In some embodiments, the amino acid sequence of the isolated, purified protein molecule is at least 75% identical to SEQ ID NO: 226, SEQ ID NO: 229, SEQ ID NO: 231, SEQ ID NO: 233, SEQ ID NO: 235, SEQ ID NO: 237, or SEQ ID NO: 239 (cgl514ss-tag-Coagulogen where the tag is 6His, FLAG, HA, CBP, SBP, GST, or MBP, respectively). In some embodiments, the isolated, purified protein molecule is an N-terminal signal peptide fused to an N-terminally tagged Coagulogen derived from C. rotundicauda and optimized for expression in C. glutamicum. In some embodiments, the amino acid sequence of the isolated, purified protein molecule is at least 75% identical to SEQ ID NO: 242, SEQ ID NO: 245, SEQ ID NO: 247, SEQ ID NO: 249, SEQ ID NO: 251, SEQ ID NO: 253, or SEQ ID NO: 255 (cgl514ss-tag-Coagulogen where the tag is 6His, FLAG, HA, CBP, SBP, GST, or MBP, respectively).

[0110] In some embodiments, the isolated, purified protein molecule is an N-terminal signal peptide fused to a C-terminally tagged RCR derived from T. tridentatus and optimized for expression in C. glutamicum. In some embodiments, the isolated, purified protein molecule is an N-terminal signal peptide fused to a C-terminally tagged Factor C derived from T. tridentatus and optimized for expression in C. glutamicum. In some embodiments, the amino acid sequence of the isolated, purified protein molecule is at least 75% identical to SEQ ID NO: 131, SEQ ID NO: 134, SEQ ID NO: 136, SEQ ID NO: 138, SEQ ID NO: 140, SEQ ID NO: 142, or SEQ ID NO: 144 (cgl514ss-Factor C-tag where the tag is 6His, FLAG, HA, CBP, SBP, GST, or MBP, respectively). In some embodiments, the isolated, purified protein molecule is an N-terminal signal peptide fused to a C-terminally tagged Factor B derived from T. tridentatus and optimized for expression in C. glutamicum. In some embodiments, the amino acid sequence of the isolated, purified protein molecule is at least 75% identical to SEQ ID NO: 147, SEQ ID NO: 150, SEQ ID NO: 152, SEQ ID NO: 154, SEQ ID NO: 156, SEQ ID NO: 158, or SEQ ID NO: 160 (cgl514ss-Factor B-tag where the tag is 6His, FLAG, HA, CBP, SBP, GST, or MBP, respectively). In some embodiments, the isolated, purified protein molecule is an N-terminal signal peptide fused to a C-terminally tagged Proclotting Enzyme derived from T. tridentatus and optimized for expression in C. glutamicum. In some embodiments, the amino acid sequence of the isolated, purified protein molecule is at least 75% identical to SEQ ID NO: 163, SEQ ID NO: 166, SEQ ID NO: 168, SEQ ID NO: 170, SEQ ID NO: 172, SEQ ID NO: 174, or SEQ ID NO: 176 (cgl514ss-Proclotting Enzyme-tag where the tag is 6His, FLAG, HA, CBP, SBP, GST, or MBP, respectively). In some embodiments, the isolated, purified protein molecule is an N-terminal signal peptide fused to a C-terminally tagged Coagulogen derived from T. tridentatus and optimized for expression in C. glutamicum. In some embodiments, the amino acid sequence of the isolated, purified protein molecule is at least 75% identical to SEQ ID NO: 179, SEQ ID NO: 182, SEQ ID NO: 184, SEQ ID NO: 186, SEQ ID NO: 188, SEQ ID NO: 190, or SEQ ID NO: 192 (cgl514ss-Coagulogen-tag where the tag is 6His, FLAG, HA, CBP, SBP, GST, or MBP, respectively).

[0111] In some embodiments, the isolated, purified protein molecule is an N-terminal signal peptide fused to a C-terminally tagged RCR derived from C. rotundicauda and optimized for expression in C. glutamicum. In some embodiments, the isolated, purified protein molecule is an N-terminal signal peptide fused to a C-terminally tagged Factor C derived from C. rotundicauda and optimized for expression in C. glutamicum. In some embodiments, the amino acid sequence of the isolated, purified protein molecule is at least 75% identical to SEQ ID NO: 195, SEQ ID NO: 198, SEQ ID NO: 200, SEQ ID NO: 202, SEQ ID NO: 204, SEQ ID NO: 206, or SEQ ID NO: 208 (cgl514ss-Factor C-tag where the tag is 6His, FLAG, HA, CBP, SBP, GST, or MBP, respectively). In some embodiments, the isolated, purified protein molecule is an N-terminal signal peptide fused to a C-terminally tagged Factor B derived from C. rotundicauda and optimized for expression in C. glutamicum. In some embodiments, the amino acid sequence of the isolated, purified protein molecule is at least 75% identical to SEQ ID NO: 211, SEQ ID NO: 214, SEQ ID NO: 216, SEQ ID NO: 218, SEQ ID NO: 220, SEQ ID NO: 222, or SEQ ID NO: 224 (cgl514ss-Factor B-tag where the tag is 6His, FLAG, HA, CBP, SBP, GST, or MBP, respectively). In some embodiments, the isolated, purified protein molecule is an N-terminal signal peptide fused to a C-terminally tagged Coagulogen derived from L. polyphemus and optimized for expression in C. glutamicum. In some embodiments, the amino acid sequence of the isolated, purified protein molecule is at least 75% identical to SEQ ID NO: 227, SEQ ID NO: 230, SEQ ID NO: 232, SEQ ID NO: 234, SEQ ID NO: 236, SEQ ID NO: 238, or SEQ ID NO: 240 (cgl514ss-Coagulogen-tag where the tag is 6His, FLAG, HA, CBP, SBP, GST, or MBP, respectively). In some embodiments, the isolated, purified protein molecule is an N-terminal signal peptide fused to a C-terminally tagged Coagulogen derived from C. rotundicauda and optimized for expression in C. glutamicum. In some embodiments, the amino acid sequence of the isolated, purified protein molecule is at least 75% identical to SEQ ID NO: 243, SEQ ID NO: 246, SEQ ID NO: 248, SEQ ID NO: 250, SEQ ID NO: 252, SEQ ID NO: 254, or SEQ ID NO: 256 (cgl514ss-Coagulogen-tag where the tag is 6His, FLAG, HA, CBP, SBP, GST, or MBP, respectively).

[0112] In some embodiments, the isolated, purified protein molecule is an N-terminal signal peptide fused to an N-terminally and C-terminally tagged RCR derived from T. tridentatus or C. rotundicauda and optimized for expression in C. glutamicum. In some embodiments, the isolated, purified protein molecule is an N-terminal signal peptide fused to an N-terminally and C-terminally tagged Factor C derived from T. tridentatus or C. rotundicauda and optimized for expression in C. glutamicum. In some embodiments, the amino acid sequence of the isolated, purified protein molecule is at least 75% identical to SEQ ID NO: 132 (cgl514ss-6His-Factor C (T. tridentatus)-6His) or SEQ ID NO: 196 (cgl514ss-6His-Factor C (C. rotundicauda)-6His). In some embodiments, the isolated, purified protein molecule is an N-terminal signal peptide fused to an N-terminally and C-terminally tagged Factor B derived from T. tridentatus or C. rotundicauda and optimized for expression in C. glutamicum. In some embodiments, the amino acid sequence of the isolated, purified protein molecule is at least 75% identical to SEQ ID NO: 148 (cgl514ss-6His-Factor or SEQ ID NO: 212 (cgl514ss-6His-Factor B (C. rotundicauda)-6His). In some embodiments, the isolated, purified protein molecule is an N-terminal signal peptide fused to an N-terminally and C-terminally tagged Proclotting Enzyme derived from T. tridentatus and optimized for expression in C. glutamicum. In some embodiments, the amino acid sequence of the isolated, purified protein molecule is at least 75% identical to SEQ ID NO: 164 (cgl514ss-6His-Proclotting Enzyme In some embodiments, the isolated, purified protein molecule is an N-terminal signal peptide fused to an N-terminally and C-terminally tagged Coagulogen derived from T. tridentatus, L. polyphemus or C. rotundicauda and optimized for expression in C. glutamicum. In some embodiments, the amino acid sequence of the isolated, purified protein molecule is at least 75% identical to SEQ ID NO: 180 (cgl514ss-6His-Coagulogen SEQ ID NO: 228 (cgl514ss-6His-Coagulogen (L.

or SEQ ID NO: 244 (cgl514ss-6His-Coagulogen (C. rotundicauda)-6His).

Kits and Methods for Detecting a Pyrogen or Endotoxin in a Sample

[0113] The disclosure provides kits and methods for detecting a pyrogen or endotoxin in a sample. In some embodiments, the kit comprises one or more of the RCR proteins of the present disclosure. In some embodiments, the kit comprises one or more of recombinant Factor C serine protease zymogen, recombinant Factor B serine protease zymogen, recombinant Proclotting Enzyme serine protease zymogen, and recombinant Coagulogen clotting protein. In some embodiments, the kit comprises one or more RCR proteins expressed in C. glutamicum and purified to apparent homogeneity.

[0114] The disclosure provides methods for detecting a pyrogen or endotoxin in a sample, comprising contacting the sample with one or more RCR proteins expressed in C. glutamicum and purified to apparent homogeneity. In some embodiments, the method comprises contacting the sample with one or more of the components of the kit described herein, including recombinant Factor C serine protease zymogen, recombinant Factor B serine protease zymogen, recombinant Proclotting Enzyme serine protease zymogen, and recombinant Coagulogen clotting protein. In some embodiments, the method comprises contacting the sample with one or more of the components of the kit described herein in combination with a commercialized natural lysate reagent. In some embodiments, the method for detecting a pyrogen or endotoxin in a sample comprises the limited proteolysis of each protease zymogen in the coagulation cascade reaction of the LAL assay.

[0115] In some embodiments, the method for detecting a pyrogen or endotoxin in a sample may comprise admixing one or more components of the kit with the sample, separating precipitated proteins from the sample, admixing one or more components of the kit with the remaining sample, and measuring coagulation. Measuring coagulation may include observing increased turbidity and viscosity. In some embodiments, the method further comprises centrifugation of the sample, sedimentation and separation of the sample, and/or removal of one or more layers or portions of the sample.

EXAMPLES

[0116] The following examples are not intended to be limiting and are included herein for illustration purposes only.

Example 1: Preparation of RCR expression cassettes

[0117] Expression cassettes of the present disclosure include nucleic acid molecules comprising a promoter, a signal sequence, a gene of interest, and a termination sequence, and may include a polypeptide tag. In an exemplary embodiment of the present invention, the expression cassette comprises a promoter, a signal sequence, a gene of interest, and a termination sequence (FIG. 2A, number 1). In an embodiment, the expression cassette comprises a promoter, a signal sequence, an N-terminally tagged gene of interest, and a termination sequence (FIG. 2A, number 2). In an embodiment, the expression cassette comprises a promoter, a signal sequence, a C-terminally tagged gene of interest, and a termination sequence (FIG. 2A, number 3).
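For illustration, the three layouts of FIG. 2A described above can be written as ordered lists of parts read from 5’ to 3’. The sketch below is a hypothetical helper, not part of the disclosed methods; the function name, its parameters, and the part labels are placeholders chosen for this example, and no sequence data are involved.

```python
from typing import List, Optional


def cassette_parts(
    gene: str,
    tag: Optional[str] = None,
    tag_position: Optional[str] = None,
    promoter: str = "promoter",
    signal: str = "signal sequence",
    terminator: str = "termination sequence",
) -> List[str]:
    """Return the cassette parts in 5' to 3' order.

    tag_position is "N" (tag upstream of the gene) or "C" (tag downstream
    of the gene); it must be given exactly when a tag is given.
    """
    if tag is None:
        if tag_position is not None:
            raise ValueError("tag_position given without a tag")
        coding = [gene]
    elif tag_position == "N":
        coding = [tag, gene]
    elif tag_position == "C":
        coding = [gene, tag]
    else:
        raise ValueError("tag_position must be 'N' or 'C' when a tag is given")
    return [promoter, signal, *coding, terminator]


if __name__ == "__main__":
    # Layouts corresponding to FIG. 2A, numbers 1, 2, and 3.
    print(cassette_parts("gene of interest"))
    print(cassette_parts("gene of interest", tag="tag", tag_position="N"))
    print(cassette_parts("gene of interest", tag="tag", tag_position="C"))
```

The same ordering applies whichever promoter, signal sequence, tag, or terminator is chosen; only the position of the tag relative to the gene of interest changes between the three layouts.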

[0118] In an embodiment, standard cloning techniques were used to construct RCR expression cassettes comprising the cgl514 promoter indicated by Pcgl514 (SEQ ID NO: 9), the cgl514 signal sequence indicated by cgl514ss (SEQ ID NO: 14), the T. tridentatus Factor C gene optimized for expression in C. glutamicum (SEQ ID NO: 325), the E. coli rrnBT1T2 terminator sequence indicated by rrnB terminator (SEQ ID NO: 272), and optionally a polyhistidine-tag optimized for expression in C. glutamicum (SEQ ID NO: 26). Three RCR expression cassettes were engineered to result in a secretory expression system based on the Cgl514 secreted protein of C. glutamicum by using the promoter (Pcgl514) and signal sequence (cgl514ss) of cgl514.

[0119] FIG. 2B shows schematic representations of the three RCR expression cassettes optimized for expression in C. glutamicum. Expression cassette number 4 (SEQ ID NO: 322) comprises the Pcgl514 promoter (SEQ ID NO: 9), the cgl514 signal sequence (SEQ ID NO: 14), Factor C gene (SEQ ID NO: 325), and the rrnB terminator (SEQ ID NO: 272). Expression cassette number 5 (SEQ ID NO: 323) comprises the Pcgl514 promoter (SEQ ID NO: 9), the cgl514 signal sequence (SEQ ID NO: 14), polyhistidine-tag (SEQ ID NO: 26), Factor C gene (SEQ ID NO: 325), and the rrnB terminator (SEQ ID NO: 272). Expression cassette number 6 (SEQ ID NO: 324) comprises the Pcgl514 promoter (SEQ ID NO: 9), the cgl514 signal sequence (SEQ ID NO: 14), Factor C gene (SEQ ID NO: 325), polyhistidine-tag (SEQ ID NO: 26), and the rrnB terminator (SEQ ID NO: 272). The three RCR expression cassettes comprise the nucleic acid sequences of SEQ ID NO: 322, SEQ ID NO: 323, and SEQ ID NO: 324, for expression of Factor C (FIG. 2B, number 4), N-terminally polyhistidine-tagged Factor C (FIG. 2B, number 5), and C-terminally polyhistidine-tagged Factor C (FIG. 2B, number 6), respectively.

Example 2: Expression of recombinant expression cassettes in C. glutamicum

[0120] Each of the three RCR expression cassettes was cloned into a multiple cloning site (MCS) of the pEC-pk18mob2 plasmid, resulting in three plasmids, each comprising one of the three RCR expression cassettes. The pEC-pk18mob2 plasmid is a mobilizable E. coli - C. glutamicum shuttle vector based on a mini-replicon encoding the repA and per functions of the medium copy number plasmid pGA1. Each of the three plasmids, as well as the pEC-pk18mob2 empty plasmid, was transformed separately into C. glutamicum. For plasmid expression confirmation, a single colony of each of the transformations was isolated from a fresh LBG plate (Luria Broth - Lennox’s formulation supplemented with 0.5% glucose), inoculated in LBG broth and incubated at 30 °C shaking at 200 revolutions per minute (RPM) for about 6 - 8 hours. A sample of each of the transformations was then inoculated into fresh LBG broth and incubated at 30 °C shaking at 200 RPM for about 14 - 16 hours. Samples of each of the plasmid transformations were removed for Gram staining. Gram-positive bacteria (Bacillus cereus) and gram-negative bacteria (Escherichia coli K12) were used as positive and negative controls and are indicated by B. cereus and E. coli, respectively. Untransformed, rod-shaped C. glutamicum was also Gram stained. A Gram stain shows gram-positive B. cereus and gram-negative E. coli (FIG. 3A, top left), gram-positive C. glutamicum (FIG. 3A, top middle), gram-positive C. glutamicum transformed with pK18mob2 empty plasmid (FIG. 3A, top right), gram-positive C. glutamicum transformed with the Factor C expression cassette plasmid of FIG. 2B number 4, indicated by pK18mob2 - FC (FIG. 3A, bottom left), gram-positive C. glutamicum transformed with the Factor C expression cassette plasmid of FIG. 2B number 6, indicated by pK18mob2 - FC-CHis6 (FIG. 3A, bottom middle), and gram-positive C. glutamicum transformed with the Factor C expression cassette plasmid of FIG. 2B number 5, indicated by pK18mob2 - FC-NHis6 (FIG. 3A, bottom right).

[0121] Cells were harvested by centrifugation at 3500 RPM for 20 minutes at 4 °C. Supernatants of each of the controls and three experimental plasmid transformations were removed and the cell pellets were washed once with STE buffer (10 mM Tris, 10 mM NaCl, 1 mM EDTA, pH 8.0). Cell pellets were frozen at -20 °C overnight. Cell pellets were thawed and resuspended in STE buffer supplemented with 500 mM sucrose and 10 mg/mL lysozyme, then shaken at 200 RPM at 37 °C for one hour. Plasmids were isolated from bacteria using alkaline lysis, and samples were subjected to gel electrophoresis at 80 Volts for 120 minutes at room temperature on a 1% agarose gel in 1X TAE buffer. Safe DNA Gel Stain (Bioland Scientific) was used to visualize DNA under blue LED light (FIG. 3B). Lane 1 shows no DNA present from the C. glutamicum negative control, lane 2 shows the expected molecular weight of the pEC-pk18mob2 empty plasmid expressed in C. glutamicum, lane 3 shows the expected molecular weight of the plasmid comprising expression cassette number 4 of FIG. 2B (SEQ ID NO: 322) expressed in C. glutamicum, lane 4 shows the expected molecular weight of the plasmid comprising expression cassette number 6 of FIG. 2B (SEQ ID NO: 324) expressed in C. glutamicum, and lane 5 shows the expected molecular weight of the plasmid comprising expression cassette number 5 of FIG. 2B (SEQ ID NO: 323) expressed in C. glutamicum.

Example 3: Expression of recombinant Factor C in C. glutamicum

[0122] C. glutamicum harboring expression cassette 4 of FIG. 2B (SEQ ID NO: 322; pK18mob2 - FC) was cultivated in 14 mL round-bottom culture tubes containing 2.5 mL brain heart infusion (BHI; Carolina Biological Supply Company, Burlington, NC) medium at 30 °C for 48 hours at 200 RPM. In all cultivations, kanamycin (50 mg/L) was added to the culture medium as the sole antibiotic. As a seed culture, cells were inoculated into 50 mL of semi-defined medium containing 20 g/L of glucose in a 250 mL baffled flask and cultivated at 30 °C for 24 hours at 200 RPM. The semi-defined medium consists of 0.5 g urea, 0.25 mg ZnSO₄, 2.5 mg CaCl₂ in BHI media. The seed culture (40 mL) was inoculated into 400 mL of fresh semi-defined medium in a 1 L jar custom-built bioreactor. Throughout cultivation, the temperature was maintained at 30 °C and the culture was stirred with an axial flow impeller at 300 RPM. Oxygen concentration was maximized by continual sterile air flow into the medium. The pH was maintained at 7.0 by adding 10% v/v ammonium hydroxide solution (LabChem, Zelienople, PA) when the pH dropped below the set point of 7 or 37% hydrochloric acid (GTI Laboratory Supplies, Edna, Texas) when the pH increased above the set point of 7. To prevent glucose starvation, a glucose solution (90 g in 150 mL BHI) was added to the culture in 90 second increments at a rate of 12.5 mL/hr.
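As a rough check on the feed described above, the glucose delivery rate and the time needed to exhaust the feed solution follow from simple arithmetic. The sketch below rests on two assumptions that the text does not state: that "90 g in 150 mL" means 90 g of glucose per 150 mL of final solution volume, and that the incremental additions average out to a steady 12.5 mL/hr. The function and key names are hypothetical.

```python
def feed_summary(
    glucose_g: float = 90.0,
    solution_ml: float = 150.0,
    rate_ml_per_hr: float = 12.5,
) -> dict:
    """Summarize a constant-rate glucose feed.

    Assumptions (not stated in the source): solution_ml is the final
    volume of the feed solution, and the feed is treated as continuous
    at the stated average rate.
    """
    if solution_ml <= 0 or rate_ml_per_hr <= 0:
        raise ValueError("volume and rate must be positive")
    concentration = glucose_g / solution_ml          # g glucose per mL feed
    glucose_per_hr = concentration * rate_ml_per_hr  # g glucose per hour
    hours_to_empty = solution_ml / rate_ml_per_hr    # hours until feed is used up
    return {
        "concentration_g_per_ml": concentration,
        "glucose_g_per_hr": glucose_per_hr,
        "hours_to_empty": hours_to_empty,
    }


if __name__ == "__main__":
    for name, value in feed_summary().items():
        print(f"{name}: {value:.2f}")
```

Under these assumptions a single 150 mL batch of feed solution would not by itself span the 36-hour cultivation, so the feed start time or any replenishment would have to be taken from the experimental record rather than inferred here.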

[0123] After bioreactor cultivation for 36 hours, extracellular proteins were prepared using acetone precipitation. After centrifugation at 4500 RPM for 10 minutes at 4 °C, 75 mL of the culture supernatant was vigorously mixed with two volumes of cold acetone and incubated at -20 °C overnight. The protein samples were then precipitated by centrifugation at 13,200 RPM for 30 minutes at 4 °C. The pellet was air-dried and resuspended in denaturing 8 M urea (pH 8.0), 300 mM NaCl, 50 mM NaH2PO4, 20 mM Tris-Cl, 1 mM EDTA, 10% glycerol, and 1% Triton X-100. 60 µL of the resuspended precipitated supernatant protein was added per lane in an 8% SDS-PAGE gel. The SDS-PAGE gel was then stained in 0.025% Coomassie Brilliant Blue R-250 in 10% acetic acid at 50 °C for 15 minutes while shaking. The gel was destained overnight with several changes of 10% acetic acid. Gels were imaged on a light table (Figure 4).
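The precipitation step above gives the acetone amount as a volume ratio and the centrifugation speeds in RPM only. The sketch below, offered as an illustration rather than as part of the procedure, computes the acetone volume and converts RPM to relative centrifugal force using the standard relation RCF = 1.118e-5 x r x RPM^2 (r in cm). The rotor radius is not given in the text, so any radius passed to rcf_from_rpm is an assumption; both function names are hypothetical.

```python
def acetone_volume_ml(sample_ml: float, volumes: float = 2.0) -> float:
    """Volume of cold acetone for a given sample volume and volume ratio."""
    if sample_ml <= 0 or volumes <= 0:
        raise ValueError("sample volume and ratio must be positive")
    return sample_ml * volumes


def rcf_from_rpm(rpm: float, rotor_radius_cm: float) -> float:
    """Relative centrifugal force (x g) from rotor speed and radius.

    The radius is NOT stated in the source text; the caller must supply
    the value for the rotor actually used.
    """
    if rpm < 0 or rotor_radius_cm <= 0:
        raise ValueError("rpm must be non-negative and radius positive")
    return 1.118e-5 * rotor_radius_cm * rpm ** 2


if __name__ == "__main__":
    supernatant_ml = 75.0
    acetone_ml = acetone_volume_ml(supernatant_ml)
    print(f"acetone: {acetone_ml:.0f} mL, total: {supernatant_ml + acetone_ml:.0f} mL")

    assumed_radius_cm = 10.0  # hypothetical rotor radius, for illustration only
    for rpm in (4500, 13200):
        print(f"{rpm} RPM at r = {assumed_radius_cm} cm: "
              f"{rcf_from_rpm(rpm, assumed_radius_cm):.0f} x g")
```

For the 75 mL of supernatant described, two volumes corresponds to 150 mL of acetone and a combined volume of 225 mL.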

[0124] Referring to Figure 4, lanes 1 and 2 are duplicates of the C. glutamicum pK18mob2 negative control sample, and lanes 3 and 4 are duplicates of the C. glutamicum pK18mob2 - FC sample. Lanes 3 and 4 show expression of ~80 kDa and ~43 kDa polypeptides in the culture supernatant of C. glutamicum expressing pK18mob2 - FC, a plasmid which harbors cassette number 4 (SEQ ID NO: 322), referred to as C. glutamicum pK18mob2 - FC. Lanes 1 and 2 do not show expression of ~80 kDa and ~43 kDa polypeptides in the culture supernatant of C. glutamicum expressing pK18mob2, an empty plasmid, referred to as C. glutamicum pK18mob2. Factor C is a two-chain glycoprotein (M_r = 123 kDa) composed of a heavy chain (M_r = 80 kDa) and a light chain (M_r = 43 kDa). SDS-PAGE gel analysis with an 8% gel under denaturing conditions demonstrates expression of ~80 kDa and ~43 kDa polypeptides in the culture supernatant of C. glutamicum pK18mob2 - FC, corresponding to production of Factor C in C. glutamicum and extrusion of the protein into the culture supernatant.
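The band interpretation above relies on the heavy and light chains accounting for the full two-chain protein (80 kDa + 43 kDa = 123 kDa). The following sketch illustrates that reasoning in Python. The matching tolerance and the function name assign_bands are assumptions introduced for illustration; the text reports only approximate band sizes and does not describe any numerical band-calling method.

```python
from typing import Dict, Iterable, Optional

# Chain sizes as stated in the text (kDa).
EXPECTED_CHAINS_KDA: Dict[str, float] = {"heavy chain": 80.0, "light chain": 43.0}
FULL_LENGTH_KDA = 123.0


def assign_bands(
    observed_kda: Iterable[float], tolerance_kda: float = 5.0
) -> Dict[str, Optional[float]]:
    """Match each expected chain to the closest observed band within a tolerance.

    The tolerance is a hypothetical value, not taken from the source.
    Returns None for a chain that has no band within the tolerance.
    """
    bands = list(observed_kda)
    assignment: Dict[str, Optional[float]] = {}
    for name, expected in EXPECTED_CHAINS_KDA.items():
        candidates = [b for b in bands if abs(b - expected) <= tolerance_kda]
        assignment[name] = (
            min(candidates, key=lambda b: abs(b - expected)) if candidates else None
        )
    return assignment


if __name__ == "__main__":
    # The two chains together account for the full two-chain protein.
    print(sum(EXPECTED_CHAINS_KDA.values()) == FULL_LENGTH_KDA)
    # Illustrative band lists, not measured data.
    print(assign_bands([80.0, 43.0]))
    print(assign_bands([60.0]))
```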

NUMBERED EMBODIMENTS

[0125] The following list of embodiments is not intended to be limiting and is included herein for illustrative purposes. The subject matter to be claimed is not limited to the following embodiments:

Embodiment 1. A nucleic acid molecule, comprising an expression cassette, wherein the expression cassette comprises, from 5’ to 3’: a. a promoter; b. a signal sequence; and c. a sequence encoding a cascade reagent protein.

Embodiment 2. The nucleic acid molecule of embodiment 1, wherein the expression cassette is optimized for expression in Corynebacterium glutamicum.

Embodiment 3. The nucleic acid molecule of embodiment 1 or 2, wherein the signal sequence encodes a signal peptide.

Embodiment 4. The nucleic acid molecule as in any one of embodiments 1-3, wherein the promoter drives expression of the signal sequence and the sequence encoding the cascade reagent protein.

Embodiment 5. The nucleic acid molecule as in any one of embodiments 1-4, wherein the promoter comprises a nucleic acid sequence derived from a promoter of a Corynebacterium glutamicum secretory gene.

Embodiment 6. The nucleic acid molecule as in any one of embodiments 1-5, wherein the signal sequence comprises a nucleic acid sequence derived from a signal sequence of a Corynebacterium glutamicum secretory gene.

Embodiment 7. The nucleic acid molecule as in embodiment 5 or 6, wherein the Corynebacterium glutamicum secretory gene is selected from the group consisting of the cg1514 gene, the cspA gene, the cspB gene, the CgR0949 gene, and the porB gene.

Embodiment 8. The nucleic acid molecule as in any one of embodiments 1-7, wherein the sequence encoding the cascade reagent protein encodes Factor C serine protease zymogen, Factor B serine protease zymogen, Proclotting Enzyme serine protease zymogen, or Coagulogen clotting protein.

Embodiment 9. The nucleic acid molecule as in any one of embodiments 1-8, wherein the sequence encoding the cascade reagent protein comprises a nucleic acid sequence derived from the genome of a horseshoe crab selected from the group consisting of Tachypleus tridentatus, Limulus polyphemus, Tachypleus gigas, and Carcinoscorpius rotundicauda.

Embodiment 10. The nucleic acid molecule as in any one of embodiments 1-9, wherein the signal sequence is located between the promoter and the sequence encoding the cascade reagent protein.

Embodiment 11. The nucleic acid molecule as in any one of embodiments 1-10, wherein the expression cassette comprises a termination sequence.

Embodiment 12. The nucleic acid molecule of embodiment 11, wherein the termination sequence is selected from the group consisting of the termination region of the Escherichia coli rrnB gene, the termination region of the Corynebacterium glutamicum cg1502 gene, the termination region of the Corynebacterium glutamicum cg3011 gene, the termination region of the Corynebacterium glutamicum cspA gene, and the termination region of the Corynebacterium glutamicum cg1338 gene.

Embodiment 13. The nucleic acid molecule as in any one of embodiments 1-12, wherein the expression cassette comprises a sequence encoding a polypeptide protein tag.

Embodiment 14. The nucleic acid molecule of embodiment 13, wherein the polypeptide protein tag is selected from the group consisting of polyhistidine-tag, FLAG-tag, HA-tag, calmodulin-binding peptide, streptavidin-binding peptide, glutathione S-transferase, and maltose-binding protein.

Embodiment 15. The nucleic acid molecule of embodiment 13 or 14, wherein the sequence encoding a polypeptide protein tag is located between the signal sequence and the sequence encoding the cascade reagent protein.

Embodiment 16. The nucleic acid molecule of embodiment 15, wherein a sequence encoding a linker is located between the sequence encoding the polypeptide protein tag and the sequence encoding the cascade reagent protein.

Embodiment 17. The nucleic acid molecule of embodiment 13 or 14, wherein the sequence encoding the polypeptide protein tag is located between the sequence encoding the cascade reagent protein and the termination sequence.

Embodiment 18. The nucleic acid molecule of embodiment 17, wherein a sequence encoding a linker is located between the sequence encoding the cascade reagent protein and the sequence encoding the polypeptide protein tag.

Embodiment 19. The nucleic acid molecule as in any one of embodiments 1-12, wherein the expression cassette comprises two or more sequences encoding polypeptide protein tags.

Embodiment 20. The nucleic acid molecule of embodiment 19, wherein the polypeptide protein tags are selected from the group consisting of polyhistidine-tag, FLAG-tag, HA-tag, calmodulin-binding peptide, streptavidin-binding peptide, glutathione S-transferase, and maltose-binding protein.

Embodiment 21. The nucleic acid molecule of embodiment 19 or 20, wherein the sequence encoding the cascade reagent protein is located between two sequences encoding polypeptide protein tags.

Embodiment 22. The nucleic acid molecule of embodiment 21, wherein sequences encoding linkers are located between the sequence encoding the cascade reagent protein and the sequences encoding the polypeptide protein tags.

Embodiment 23. The nucleic acid molecule as in any one of embodiments 16, 18, or 22, in which the linker or linkers are selected from the group consisting of flexible GS linkers, flexible glycine linkers, rigid α-helical linkers, rigid proline-rich linkers, and cleavable disulfide linkers.

Embodiment 24. The nucleic acid molecule of any one of embodiments 1-23, wherein the cascade reagent protein is encoded by a nucleic acid sequence of SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 278-283, or SEQ ID NO: 325, or a sequence at least 90% identical thereto.

Embodiment 25. The nucleic acid molecule of any one of embodiments 1-23, wherein the cascade reagent protein is encoded by a nucleic acid sequence of SEQ ID NO: 3, SEQ ID NO: 4, or SEQ ID NO: 284-289, or a sequence at least 90% identical thereto.

Embodiment 26. The nucleic acid molecule of any one of embodiments 1-23, wherein the cascade reagent protein is encoded by a nucleic acid sequence of SEQ ID NO: 5 or SEQ ID NO: 290-292, or a sequence at least 90% identical thereto.

Embodiment 27. The nucleic acid molecule of any one of embodiments 1-23, wherein the cascade reagent protein is encoded by a nucleic acid sequence of SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, or SEQ ID NO: 293-301, or a sequence at least 90% identical thereto.

Embodiment 28. The nucleic acid molecule as in any one of embodiments 1-27, wherein the promoter is encoded by a nucleic acid sequence of any one of SEQ ID NO: 9-13, or a sequence at least 90% identical thereto.

Embodiment 29. The nucleic acid molecule as in any one of embodiments 1-28, wherein the signal sequence is encoded by a nucleic acid sequence of any one of SEQ ID NO: 14-18, or a sequence at least 90% identical thereto.

Embodiment 30. The nucleic acid molecule as in any one of embodiments 13-29, wherein the polypeptide protein tag is encoded by a nucleic acid sequence of any one of SEQ ID NO: 19-32, or a sequence at least 90% identical thereto.

Embodiment 31. The nucleic acid molecule as in any one of embodiments 11-30, wherein the termination sequence is encoded by a nucleic acid sequence of any one of SEQ ID NO: 272-277, or a sequence at least 90% identical thereto.

Embodiment 32. The nucleic acid molecule as in any one of embodiments 16, 18, or 22-31, wherein the linker or linkers are encoded by a nucleic acid sequence of any one of SEQ ID NO: 265-271, or a sequence at least 90% identical thereto.

Embodiment 33. The nucleic acid molecule as in any one of embodiments 1-32, wherein the signal sequence and cascade reagent protein are encoded by a nucleic acid sequence of any one of SEQ ID NO: 33-96, or a sequence at least 90% identical thereto.

Embodiment 34. The nucleic acid molecule as in any one of embodiments 1-33, wherein the expression cassette comprises a nucleic acid sequence of any one of SEQ ID NO: 97-128, SEQ ID NO: 322-324, or a sequence at least 90% identical thereto.

Embodiment 35. A plasmid, comprising the nucleic acid molecule as in any one of embodiments 1-34.

Embodiment 36. A cell, comprising the nucleic acid molecule as in any one of embodiments 1-34 or the plasmid of embodiment 35.

Embodiment 37. A method of producing a recombinant expression system, the method comprising contacting a Corynebacterium glutamicum cell with a nucleic acid molecule as in any one of embodiments 1-34, or the plasmid of embodiment 35.

Embodiment 38. A recombinant expression system produced by the method of embodiment 37.

Embodiment 39. A method of expressing Factor C serine protease zymogen, Factor B serine protease zymogen, Proclotting Enzyme serine protease zymogen, or Coagulogen clotting protein, comprising contacting a Corynebacterium glutamicum cell with a nucleic acid molecule as in any one of embodiments 1-34, or the plasmid of embodiment 35.

Embodiment 40. An isolated, purified protein molecule, wherein the amino acid sequence is at least 75% identical to any one of SEQ ID NO: 129-256.

Embodiment 41. A kit for detecting a pyrogen or endotoxin in a sample comprising recombinant Factor C serine protease zymogen, recombinant Factor B serine protease zymogen, recombinant Proclotting Enzyme serine protease zymogen, and recombinant Coagulogen clotting protein expressed in Corynebacterium glutamicum.

Embodiment 42. The kit for detecting a pyrogen or endotoxin in a sample of embodiment 41, wherein the amino acid sequence of the recombinant Factor C serine protease zymogen is at least 75% identical to SEQ ID NO: 257 or SEQ ID NO: 258.

Embodiment 43. The kit for detecting a pyrogen or endotoxin in a sample as in any one of embodiments 41-42, wherein the amino acid sequence of the recombinant Factor B serine protease zymogen is at least 75% identical to SEQ ID NO: 259 or SEQ ID NO: 260.

Embodiment 44. The kit for detecting a pyrogen or endotoxin in a sample as in any one of embodiments 41-43, wherein the amino acid sequence of the recombinant Proclotting Enzyme serine protease zymogen is at least 75% identical to SEQ ID NO: 261.

Embodiment 45. The kit for detecting a pyrogen or endotoxin in a sample as in any one of embodiments 41-44, wherein the amino acid sequence of the recombinant Coagulogen clotting protein is at least 75% identical to any one of SEQ ID NO: 262-264.

Embodiment 46. A method of detecting a pyrogen or endotoxin in a sample, comprising contacting the sample with an isolated, purified protein molecule of embodiment 40.

Embodiment 47. A method of detecting a pyrogen or endotoxin in a sample, comprising contacting the sample with the components of the kit as in any one of embodiments 41-45.
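Several of the embodiments above recite sequences that are "at least 90% identical" or "at least 75% identical" to a reference sequence. As an illustration only, the Python sketch below computes percent identity over two sequences that have already been aligned to the same length with "-" as the gap character. Counting matches over all alignment columns, including gapped ones, is one common convention and is an assumption here; the governing definition of percent identity is whatever the specification provides. The function names are hypothetical, and the example strings are not sequences from the Sequence Listing.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity of two pre-aligned sequences of equal length.

    Columns in which both sequences have a gap are ignored; a gap opposite
    a residue counts as a mismatch (an assumed convention).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    columns = [
        (a, b)
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if not (a == "-" and b == "-")
    ]
    if not columns:
        raise ValueError("no alignable columns")
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


def meets_identity_threshold(
    aligned_a: str, aligned_b: str, threshold: float = 90.0
) -> bool:
    """True if the aligned pair reaches the given percent-identity threshold."""
    return percent_identity(aligned_a, aligned_b) >= threshold


if __name__ == "__main__":
    reference = "ATGCATGCAT"  # illustrative strings only
    candidate = "ATGCATGCAA"
    print(percent_identity(reference, candidate))
    print(meets_identity_threshold(reference, candidate, threshold=90.0))
```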

Claims

CLAIMS

What is claimed is:

1. A nucleic acid molecule, comprising an expression cassette, wherein the expression cassette comprises, from 5’ to 3’:

(i) a promoter;

(ii) a signal sequence; and

(iii) a sequence encoding a cascade reagent protein.

2. The nucleic acid molecule of claim 1, wherein the expression cassette is optimized for expression in Corynebacterium glutamicum.

3. The nucleic acid molecule of claim 1 or 2, wherein the signal sequence encodes a signal peptide.

4. The nucleic acid molecule as in any one of claims 1-3, wherein the promoter drives expression of the signal sequence and the sequence encoding the cascade reagent protein.

5. The nucleic acid molecule as in any one of claims 1-4, wherein the promoter comprises a nucleic acid sequence derived from a promoter of a Corynebacterium glutamicum secretory gene.

6. The nucleic acid molecule as in any one of claims 1-5, wherein the signal sequence comprises a nucleic acid sequence derived from a signal sequence of a Corynebacterium glutamicum secretory gene.

7. The nucleic acid molecule as in claim 5 or 6, wherein the Corynebacterium glutamicum secretory gene is selected from the group consisting of the cg1514 gene, the cspA gene, the cspB gene, the CgR0949 gene, and the porB gene.

8. The nucleic acid molecule as in any one of claims 1-7, wherein the sequence encoding the cascade reagent protein encodes Factor C serine protease zymogen, Factor B serine protease zymogen, Proclotting Enzyme serine protease zymogen, or Coagulogen clotting protein.

9. The nucleic acid molecule as in any one of claims 1-8, wherein the sequence encoding the cascade reagent protein comprises a nucleic acid sequence derived from the genome of a horseshoe crab selected from the group consisting of Tachypleus tridentatus, Limulus polyphemus, Tachypleus gigas, and Carcinoscorpius rotundicauda.

10. The nucleic acid molecule as in any one of claims 1-9, wherein the signal sequence is located between the promoter and the sequence encoding the cascade reagent protein.

11. The nucleic acid molecule as in any one of claims 1-10, wherein the expression cassette comprises a termination sequence.

12. The nucleic acid molecule of claim 11, wherein the termination sequence is selected from the group consisting of the termination region of the Escherichia coli rrnB gene, the termination region of the Corynebacterium glutamicum cg1502 gene, the termination region of the Corynebacterium glutamicum cg3011 gene, the termination region of the Corynebacterium glutamicum cspA gene, and the termination region of the Corynebacterium glutamicum cg1338 gene.

13. The nucleic acid molecule as in any one of claims 1-12, wherein the expression cassette comprises a sequence encoding a polypeptide protein tag.

14. The nucleic acid molecule of claim 13, wherein the polypeptide protein tag is selected from the group consisting of polyhistidine-tag, FLAG-tag, HA-tag, calmodulin-binding peptide, streptavidin-binding peptide, glutathione S-transferase, and maltose-binding protein.

15. The nucleic acid molecule of claim 13 or 14, wherein the sequence encoding a polypeptide protein tag is located between the signal sequence and the sequence encoding the cascade reagent protein.

16. The nucleic acid molecule of claim 15, wherein a sequence encoding a linker is located between the sequence encoding the polypeptide protein tag and the sequence encoding the cascade reagent protein.

17. The nucleic acid molecule of claim 13 or 14, wherein the sequence encoding the polypeptide protein tag is located between the sequence encoding the cascade reagent protein and the termination sequence.

18. The nucleic acid molecule of claim 17, wherein a sequence encoding a linker is located between the sequence encoding the cascade reagent protein and the sequence encoding the polypeptide protein tag.

19. The nucleic acid molecule as in any one of claims 1-12, wherein the expression cassette comprises two or more sequences encoding polypeptide protein tags.

20. The nucleic acid molecule of claim 19, wherein the polypeptide protein tags are selected from the group consisting of polyhistidine-tag, FLAG-tag, HA-tag, calmodulin-binding peptide, streptavidin-binding peptide, glutathione S-transferase, and maltose-binding protein.

21. The nucleic acid molecule of claim 19 or 20, wherein the sequence encoding the cascade reagent protein is located between two sequences encoding polypeptide protein tags.

22. The nucleic acid molecule of claim 21, wherein sequences encoding linkers are located between the sequence encoding the cascade reagent protein and the sequences encoding the polypeptide protein tags.

23. The nucleic acid molecule as in any one of claims 16, 18, or 22, in which the linker or linkers are selected from the group consisting of flexible GS linkers, flexible glycine linkers, rigid α-helical linkers, rigid proline-rich linkers, and cleavable disulfide linkers.

24. The nucleic acid molecule of any one of claims 1-23, wherein the cascade reagent protein is encoded by a nucleic acid sequence of SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 278-283, or SEQ ID NO: 325, or a sequence at least 90% identical thereto.

25. The nucleic acid molecule of any one of claims 1-23, wherein the cascade reagent protein is encoded by a nucleic acid sequence of SEQ ID NO: 3, SEQ ID NO: 4, or SEQ ID NO: 284-289, or a sequence at least 90% identical thereto.

26. The nucleic acid molecule of any one of claims 1-23, wherein the cascade reagent protein is encoded by a nucleic acid sequence of SEQ ID NO: 5 or SEQ ID NO: 290-292, or a sequence at least 90% identical thereto.

27. The nucleic acid molecule of any one of claims 1-23, wherein the cascade reagent protein is encoded by a nucleic acid sequence of SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, or SEQ ID NO: 293-301, or a sequence at least 90% identical thereto.

28. The nucleic acid molecule as in any one of claims 1-27, wherein the promoter is encoded by a nucleic acid sequence of any one of SEQ ID NO: 9-13, or a sequence at least 90% identical thereto.

29. The nucleic acid molecule as in any one of claims 1-28, wherein the signal sequence is encoded by a nucleic acid sequence of any one of SEQ ID NO: 14-18, or a sequence at least 90% identical thereto.

30. The nucleic acid molecule as in any one of claims 13-29, wherein the polypeptide protein tag is encoded by a nucleic acid sequence of any one of SEQ ID NO: 19-32, or a sequence at least 90% identical thereto.

31. The nucleic acid molecule as in any one of claims 11-30, wherein the termination sequence is encoded by a nucleic acid sequence of any one of SEQ ID NO: 272-277, or a sequence at least 90% identical thereto.

32. The nucleic acid molecule as in any one of claims 16, 18, or 22-31, wherein the linker or linkers are encoded by a nucleic acid sequence of any one of SEQ ID NO: 265-271, or a sequence at least 90% identical thereto.

33. The nucleic acid molecule as in any one of claims 1-32, wherein the signal sequence and cascade reagent protein are encoded by a nucleic acid sequence of any one of SEQ ID NO: 33-96, or a sequence at least 90% identical thereto.

34. The nucleic acid molecule as in any one of claims 1-33, wherein the expression cassette comprises a nucleic acid sequence of any one of SEQ ID NO: 97-128, SEQ ID NO: 322-324, or a sequence at least 90% identical thereto.

35. A plasmid, comprising the nucleic acid molecule as in any one of claims 1-34.

36. A cell, comprising the nucleic acid molecule as in any one of claims 1-34 or the plasmid of claim 35.

37. A method of producing a recombinant expression system, the method comprising contacting a Corynebacterium glutamicum cell with a nucleic acid molecule as in any one of claims 1-34, or the plasmid of claim 35.

38. A recombinant expression system produced by the method of claim 37.

39. A method of expressing Factor C serine protease zymogen, Factor B serine protease zymogen, Proclotting Enzyme serine protease zymogen, or Coagulogen clotting protein, comprising contacting a Corynebacterium glutamicum cell with a nucleic acid molecule as in any one of claims 1-34, or the plasmid of claim 35.

40. An isolated, purified protein molecule, wherein the amino acid sequence is at least 75% identical to any one of SEQ ID NO: 129-256.

41. A kit for detecting a pyrogen or endotoxin in a sample comprising recombinant Factor C serine protease zymogen, recombinant Factor B serine protease zymogen, recombinant Proclotting Enzyme serine protease zymogen, and recombinant Coagulogen clotting protein expressed in Corynebacterium glutamicum.

42. The kit for detecting a pyrogen or endotoxin in a sample of claim 41, wherein the amino acid sequence of the recombinant Factor C serine protease zymogen is at least 75% identical to SEQ ID NO: 257 or SEQ ID NO: 258.

43. The kit for detecting a pyrogen or endotoxin in a sample as in any one of claims 41-42, wherein the amino acid sequence of the recombinant Factor B serine protease zymogen is at least 75% identical to SEQ ID NO: 259 or SEQ ID NO: 260.

44. The kit for detecting a pyrogen or endotoxin in a sample as in any one of claims 41-43, wherein the amino acid sequence of the recombinant Proclotting Enzyme serine protease zymogen is at least 75% identical to SEQ ID NO: 261.

45. The kit for detecting a pyrogen or endotoxin in a sample as in any one of claims 41-44, wherein the amino acid sequence of the recombinant Coagulogen clotting protein is at least 75% identical to any one of SEQ ID NO: 262-264.

46. A method of detecting a pyrogen or endotoxin in a sample, comprising contacting the sample with an isolated, purified protein molecule of claim 40.

47. A method of detecting a pyrogen or endotoxin in a sample, comprising contacting the sample with the components of the kit as in any one of claims 41-45.