WO2020210455A1 - Engineered chymotrypsins and uses thereof - Google Patents

Publication number
WO2020210455A1
Authority
WO
WIPO (PCT)
Prior art keywords
chymotrypsin
modified
asn
protein
peptide
Prior art date
Application number
PCT/US2020/027418
Other languages
French (fr)
Inventor
Navin VARADARAJAN
Balakrishnan RAMESH
Shaza ABNOUF
Original Assignee
University Of Houston System
Priority date
Filing date
Publication date
Application filed by University Of Houston System
Publication of WO2020210455A1

Classifications

    • CCHEMISTRY; METALLURGY
    • C12BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12NMICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N9/00Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
    • C12N9/14Hydrolases (3)
    • C12N9/48Hydrolases (3) acting on peptide bonds (3.4)
    • C12N9/50Proteinases, e.g. Endopeptidases (3.4.21-3.4.25)
    • C12N9/64Proteinases, e.g. Endopeptidases (3.4.21-3.4.25) derived from animal tissue
    • C12N9/6421Proteinases, e.g. Endopeptidases (3.4.21-3.4.25) derived from animal tissue from mammals
    • C12N9/6424Serine endopeptidases (3.4.21)
    • C12N9/6427Chymotrypsins (3.4.21.1; 3.4.21.2); Trypsin (3.4.21.4)
    • GPHYSICS
    • G01MEASURING; TESTING
    • G01NINVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N33/00Investigating or analysing materials by specific methods not covered by groups G01N1/00 - G01N31/00
    • G01N33/48Biological material, e.g. blood, urine; Haemocytometers
    • G01N33/50Chemical analysis of biological material, e.g. blood, urine; Testing involving biospecific ligand binding methods; Immunological testing
    • G01N33/68Chemical analysis of biological material, e.g. blood, urine; Testing involving biospecific ligand binding methods; Immunological testing involving proteins, peptides or amino acids
    • G01N33/6803General methods of protein analysis not limited to specific proteins or families of proteins
    • G01N33/6842Proteomic analysis of subsets of protein mixtures with reduced complexity, e.g. membrane proteins, phosphoproteins, organelle proteins
    • GPHYSICS
    • G01MEASURING; TESTING
    • G01NINVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N33/00Investigating or analysing materials by specific methods not covered by groups G01N1/00 - G01N31/00
    • G01N33/48Biological material, e.g. blood, urine; Haemocytometers
    • G01N33/50Chemical analysis of biological material, e.g. blood, urine; Testing involving biospecific ligand binding methods; Immunological testing
    • G01N33/68Chemical analysis of biological material, e.g. blood, urine; Testing involving biospecific ligand binding methods; Immunological testing involving proteins, peptides or amino acids
    • G01N33/6803General methods of protein analysis not limited to specific proteins or families of proteins
    • G01N33/6848Methods of protein analysis involving mass spectrometry
    • GPHYSICS
    • G01MEASURING; TESTING
    • G01NINVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N2333/00Assays involving biological materials from specific organisms or of a specific nature
    • G01N2333/90Enzymes; Proenzymes
    • G01N2333/914Hydrolases (3)
    • G01N2333/948Hydrolases (3) acting on peptide bonds (3.4)
    • G01N2333/976Trypsin; Chymotrypsin
    • GPHYSICS
    • G01MEASURING; TESTING
    • G01NINVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N2440/00Post-translational modifications [PTMs] in chemical analysis of biological material
    • G01N2440/38Post-translational modifications [PTMs] in chemical analysis of biological material addition of carbohydrates, e.g. glycosylation, glycation

Definitions

  • the present disclosure pertains to modified chymotrypsins with an altered substrate specificity that allows the modified chymotrypsin to cleave after at least one Asn residue in peptides and proteins.
  • the modified chymotrypsins of the present disclosure cleave after a single Asn residue of proteins or peptides.
  • the modified chymotrypsins of the present disclosure cleave after a plurality of Asn residues of proteins or peptides.
  • the one or more Asn residues of proteins or peptides that are cleaved by the modified chymotrypsins of the present disclosure can be in various forms.
  • the one or more Asn residues are in native form.
  • the one or more Asn residues are post-translationally modified.
  • the one or more Asn residues are glycosylated.
  • the modified chymotrypsins of the present disclosure are in isolated form.
  • the modified chymotrypsins of the present disclosure have at least one amino acid substitution relative to the native chymotrypsin.
  • the native chymotrypsin is native chymotrypsin B (chyB) from Rattus norvegicus.
  • the modified chymotrypsins of the present disclosure lack chain A. In some embodiments, the modified chymotrypsins of the present disclosure lack a signal peptide.
  • the modified chymotrypsins of the present disclosure are in active form. In some embodiments, the modified chymotrypsins of the present disclosure are in zymogen form, thereby requiring activation by trypsin.
  • nucleic acids with a nucleotide sequence that encodes a modified chymotrypsin of the present disclosure are in the form of an expression vector, such as a plasmid.
  • the expression vector is in a host cell, such as bacterial cells, yeast cells, insect cells, mammalian cells, and combinations thereof.
  • the nucleic acids of the present disclosure are codon optimized for expression in host cells.
  • Further embodiments of the present disclosure pertain to methods of cleaving a protein or peptide by utilizing the modified chymotrypsins of the present disclosure.
  • the methods of the present disclosure include contacting the protein or peptide with a modified chymotrypsin to result in cleavage after at least one Asn residue in the peptide or protein and generation of one or more fragments.
  • the one or more fragments are analyzed by mass spectrometry to generate a mass spectrum.
  • the generated mass spectrum is utilized to identify the protein or peptide.
  • FIGURE 1 provides a method of cleaving a protein or peptide by utilizing the modified chymotrypsins of the present disclosure.
  • FIGURES 2A-2D show a flow-cytometric assay for the screening of recombinant chymotrypsin with expanded specificity for Asn.
  • FIG. 2A shows the mature form of chymotrypsin (mChyB) is displayed on the surface of E. coli cells via fusion to the autotransporter, Ag43. A FLAG epitope tag is inserted C-terminal of the mchyB gene to enable monitoring of protein expression.
  • FIGS. 2B and 2D show a high-throughput, flow cytometric, single-cell screening assay. E.
  • FIG. 2C shows sequences of the Asn-sub and Tyr-sub peptide substrates used for flow-cytometry.
  • FIGURES 3A-3E show the isolation of chymotrypsin variants with altered specificity for Asn substrates.
  • FIGS. 3A-D show the substrate preferences of populations of E. coli MC1061 cells displaying either WT-mChyB or mChyB variants isolated after sequential sorting. The variants at the end of the active site library were used as templates for the shuffled library, and the variants at the end of the shuffled library were used as templates for the error-prone library to yield the final variant, designated mChyB-Asn.
  • FIG. 3E shows amino acid changes in the mChyB-Asn variants in comparison to WT-mChyB. The average hydropathy index (HI), reflective of the hydrophobic character of the protein, is also reported for each of these proteins. The net HI of the amino acids that are being compared is also listed in parentheses for comparison.
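As an illustration of how an average and a net hydropathy index can be computed, the following is a minimal sketch. It assumes the Kyte-Doolittle scale, which the text does not name, and the function names are hypothetical.

```python
# Kyte-Doolittle hydropathy values. ASSUMPTION: the source does not state
# which hydropathy scale was used for FIG. 3E.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def net_hydropathy(residues: str) -> float:
    """Sum of hydropathy values over a set of residues (hypothetical helper)."""
    return sum(KYTE_DOOLITTLE[aa] for aa in residues.upper())


def average_hydropathy(sequence: str) -> float:
    """Mean hydropathy index over all residues of a sequence."""
    if not sequence:
        raise ValueError("sequence must not be empty")
    return net_hydropathy(sequence) / len(sequence)
```

Comparing `net_hydropathy` for the residues at the mutated positions before and after substitution gives the kind of side-by-side value reported in parentheses in the figure.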
  • FIGURES 4A-4C show kinetic and biophysical characterization of rChyB-Asn.
  • FIG. 4A shows kinetic characterization of purified, soluble WT-rChyB and rChyB-Asn with aminomethyl coumarin (AMC) fluorogenic substrates. Also shown are comparative assessments of the stability of proteases (FIG. 4B) to undergo autoproteolysis or function in the presence of urea (FIG. 4C). The relative activities are measured using the cognate AMC substrates. Error bars in all panels represent standard deviation from at least triplicate measurements.
  • FIGURE 5 shows a comprehensive evaluation of the specificity and utility of rChyB-
  • FIGURE 6 shows mass spectrometric analysis of Saccharomyces cerevisiae invertase (Suc2) digested using rChyB-Asn.
  • Purified invertase was denatured, treated with the glycosidase EndoH, digested with mChyB-Asn, and characterized using LC-MS/MS.
  • the combined mass spectrometry results from three independent digests with the protease were used to map the detected glycosylation patterns.
  • the consensus NXS/T motif is underlined and all Asn are in bold. Red denotes parts of the protein sequence detected by mass spectrometry, green indicates Asn residues that were identified as being partially glycosylated.
  • Asterisks identify sites of HexNAc modification.
  • A representative example of the MS/MS spectra obtained is shown, which illustrates that the AEPILNISN peptide contains two Asn, of which only one is glycosylated.
  • the rChyB-Asn did not cut after the glycosylated Asn, but was able to mediate proteolysis after the unmodified Asn.
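To make the mass difference between the modified and unmodified forms of such a peptide concrete, here is a minimal sketch that computes monoisotopic neutral masses with and without a HexNAc residue. The residue and HexNAc masses are standard monoisotopic values; the function name is hypothetical and nothing here reproduces the search software used in the Example.

```python
# Standard monoisotopic residue masses in daltons.
MONOISOTOPIC = {
    "G": 57.021464, "A": 71.037114, "S": 87.032028, "P": 97.052764,
    "V": 99.068414, "T": 101.047679, "C": 103.009185, "L": 113.084064,
    "I": 113.084064, "N": 114.042927, "D": 115.026943, "Q": 128.058578,
    "K": 128.094963, "E": 129.042593, "M": 131.040485, "H": 137.058912,
    "F": 147.068414, "R": 156.101111, "Y": 163.063329, "W": 186.079313,
}
WATER = 18.010565    # H2O added for the free N- and C-termini
HEXNAC = 203.079373  # one N-acetylhexosamine residue (C8H13NO5)


def peptide_mass(sequence: str, n_hexnac: int = 0) -> float:
    """Monoisotopic neutral mass of a peptide carrying n_hexnac HexNAc units."""
    residues = sum(MONOISOTOPIC[aa] for aa in sequence.upper())
    return residues + WATER + n_hexnac * HEXNAC


unmodified = peptide_mass("AEPILNISN")
one_hexnac = peptide_mass("AEPILNISN", n_hexnac=1)
print(f"unmodified: {unmodified:.4f} Da, +HexNAc: {one_hexnac:.4f} Da")
```

The mass alone does not say which of the two Asn residues carries the HexNAc; that assignment comes from the fragment ions in the MS/MS spectrum.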
  • FIGURES 7A-7D show the detection of proteins and N-glycan sites within Jurkat secretome.
  • FIG. 7A shows the pathway analyses of the 2676 unique proteins identified using mass spectrometric LC-MS/MS analyses of Jurkat cell secretome digested with rChyB-Asn. Pathway and protein function classification was performed using PantherDB (www.pantherdb.org). Shown are the MS/MS spectra obtained of the proteins CALM-1 (FIG. 7B), FAM3B (FIG. 7C), and aGPCR F5 (FIG. 7D), which illustrate putative N-glycan macroheterogeneity.
  • MS post-translational modifications
  • MS selected reaction monitoring
  • Both of the aforementioned MS methods utilize proteolytic digestion to generate peptide fragments that are then detected using MS.
  • the proteomic sample is digested with trypsin, fractionated, and subjected to liquid chromatography-mass spectrometry (LC-MS/MS).
  • Trypsin and the trypsin family of enzymes are widely used as the main proteases of mass spectrometry-based proteomics. This is primarily due to the ability of these enzymes to cleave protein mixtures yielding peptides of mass in the preferred mass range for MS and with defined substrate specificity, thus providing readily interpretable and reproducible fragmentation spectra. Furthermore, these enzymes are fairly stable and can function in the presence of denaturants (e.g., urea) that assist in unfolding the target proteins that need to be proteolyzed prior to detection.
  • N-linked glycosylation represents one of the best-characterized forms of glycosylation in eukaryotes with the glycans being attached to the amide group of asparagine (Asn) residues within proteins.
  • the present disclosure pertains to modified chymotrypsins with an altered substrate specificity that allows the modified chymotrypsin to cleave after at least one Asn residue in peptides and proteins.
  • the present disclosure pertains to nucleic acids with a nucleotide sequence that encodes the modified chymotrypsins of the present disclosure.
  • the present disclosure pertains to methods of cleaving a protein or peptide by utilizing the modified chymotrypsins of the present disclosure.
  • the modified chymotrypsins, nucleic acids, and cleavage methods of the present disclosure can have numerous embodiments.
  • the modified chymotrypsins, nucleic acids, and cleavage methods of the present disclosure can also have numerous applications.
  • the modified chymotrypsins of the present disclosure have an altered substrate specificity that allows the modified chymotrypsin to cleave after at least one Asn residue in a peptide or a protein.
  • the modified chymotrypsins of the present disclosure can have various structures, forms and proteolytic activities.
  • the modified chymotrypsins of the present disclosure can have various proteolytic activities. For instance, in some embodiments, the modified chymotrypsins of the present disclosure cleave after one or more Asn residues with a k_cat/K_M of at least 0.5 M⁻¹ s⁻¹. In some embodiments, the modified chymotrypsins of the present disclosure cleave after one or more Asn residues with a k_cat/K_M of about 0.5 M⁻¹ s⁻¹. In some embodiments, the modified chymotrypsins of the present disclosure cleave after one or more Asn residues with a k_cat/K_M of between about 0.4 M⁻¹ s⁻¹ and about 1 M⁻¹ s⁻¹. The modified chymotrypsins of the present disclosure can cleave after one or more
  • the modified chymotrypsins of the present disclosure cleave at a region that is C-terminal to the one or more Asn residues. In some embodiments, the modified chymotrypsins of the present disclosure cleave at a region that is N-terminal to the one or more Asn residues. The modified chymotrypsins of the present disclosure can cleave after one or more
  • the modified chymotrypsins of the present disclosure cleave immediately after one or more Asn residues.
  • the modified chymotrypsins of the present disclosure can cleave different numbers of Asn residues of proteins or peptides. For instance, in some embodiments, the modified chymotrypsins of the present disclosure cleave after a single Asn residue of proteins or peptides. In some embodiments, the modified chymotrypsins of the present disclosure cleave after a plurality of Asn residues of proteins or peptides.
  • the one or more Asn residues of proteins or peptides that are cleaved by the modified chymotrypsins of the present disclosure can be in various forms.
  • the one or more Asn residues are in native form.
  • the one or more Asn residues are post-translationally modified.
  • the one or more Asn residues are glycosylated.
  • the one or more Asn residues are partially glycosylated.
  • the one or more Asn residues are fully glycosylated.
  • the modified chymotrypsins of the present disclosure can have various forms and structures.
  • the modified chymotrypsins of the present disclosure are in the form of a formulation.
  • the formulation includes, without limitation, suitable buffers, ions, salts, stabilizing agents, and combinations thereof.
  • the modified chymotrypsins of the present disclosure are in isolated form. In some embodiments, the modified chymotrypsins of the present disclosure have at least one amino acid substitution relative to the native chymotrypsin. In some embodiments, the native chymotrypsin is native chymotrypsin B (chyB) from Rattus norvegicus.
  • the modified chymotrypsins of the present disclosure have at least one amino acid substitution that includes an amino acid change in at least one of positions 99, 189, 192, 218, 219, 221, 222, 223, 224, or 226 of the native chymotrypsin.
  • the one or more amino acid substitutions include, without limitation, Ser189, Arg192, Arg218, Trp224, Gly226, and combinations thereof.
  • the modified chymotrypsins of the present disclosure lack chain A. In some embodiments, the modified chymotrypsins of the present disclosure lack a signal peptide. In some embodiments, the modified chymotrypsins of the present disclosure are in active form. In some embodiments, the modified chymotrypsins of the present disclosure are in zymogen form, thereby requiring activation by trypsin.
  • the modified chymotrypsins of the present disclosure are identified by SEQ ID NO: 1.
  • SEQ ID NO: 1 includes the following sequence:
  • the nucleic acids of the present disclosure include a nucleotide sequence that encodes a modified chymotrypsin of the present disclosure. As set forth in more detail herein, the nucleic acids of the present disclosure can have various forms and properties.
  • the nucleic acids of the present disclosure are in the form of an expression vector, such as a plasmid.
  • the expression vector is in a host cell.
  • the host cell includes, without limitation, bacterial cells, yeast cells, insect cells, mammalian cells, and combinations thereof.
  • the nucleic acids of the present disclosure are codon optimized for expression in host cells.
  • the host cell that contains the expression vectors of the present disclosure is a mammalian cell.
  • the mammalian cell includes, without limitation, HEK cells, CHO cells, HEK293F cells, derivatives thereof, and combinations thereof.
  • the nucleic acids of the present disclosure can include various nucleotide sequences that encode various modified chymotrypsins of the present disclosure.
  • the nucleotide sequences are identified by SEQ ID NO: 2.
  • SEQ ID NO: 2 includes the following sequence:
  • the present disclosure pertains to methods of cleaving a protein or peptide by utilizing the modified chymotrypsins of the present disclosure.
  • the methods of the present disclosure include a step of contacting the protein or peptide with a modified chymotrypsin (step 10) to result in cleavage after at least one Asn residue in the peptide or protein (step 12) and generation of one or more fragments (step 14).
  • the one or more fragments are analyzed by mass spectrometry to generate a mass spectrum (step 16).
  • the generated mass spectrum is utilized to identify the protein or peptide (step 18).
  • the cleavage methods of the present disclosure can have numerous embodiments.
  • the protein or peptide is identified by comparing the mass spectrum of the one or more fragments to a mass spectrum library of known proteins or peptides.
  • the mass spectrum library is a theoretical mass spectrum library.
  • the theoretical mass spectrum library is derived from in silico digestion.
  • the mass spectrum library is a mass spectrum library derived from enzymatic digestion of known proteins or peptides.
  • the mass spectrum library is accessed by entering the mass spectrum into a search database.
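The idea of deriving a theoretical library by in silico digestion can be sketched as follows. This is a minimal illustration, not the disclosed method: the function name, the default residue set, and the missed-cleavage handling are assumptions, and the "not before Pro" restriction follows the search rule quoted later in Example 1.4.

```python
def digest(protein: str, cleave_after: str = "N",
           missed_cleavages: int = 0, min_length: int = 1) -> list[str]:
    """Cleave C-terminal to any residue in cleave_after unless Pro follows.

    Returns the peptides with up to `missed_cleavages` skipped sites.
    """
    # Cut positions: index just past each cleavable residue not followed by Pro.
    sites = [i + 1 for i, aa in enumerate(protein[:-1])
             if aa in cleave_after and protein[i + 1] != "P"]
    bounds = [0] + sites + [len(protein)]
    peptides = []
    for start in range(len(bounds) - 1):
        last = min(start + 2 + missed_cleavages, len(bounds))
        for end in range(start + 1, last):
            peptide = protein[bounds[start]:bounds[end]]
            if len(peptide) >= min_length:
                peptides.append(peptide)
    return peptides


# Made-up sequence for illustration only.
print(digest("AANGGNPKNA", cleave_after="N"))
print(digest("AANGGNPKNA", cleave_after="YWFLMN", missed_cleavages=1))
```

Each peptide in the returned list would then be converted to predicted fragment masses to form one entry of the theoretical library.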
  • the methods of the present disclosure can have numerous advantageous applications. For instance, in some embodiments, the methods of the present disclosure may be utilized to identify or map the post-translational modification (PTM) sites of a protein or peptide.
  • the post-translational modification sites include glycosylation sites of a protein or peptide.
  • the post-translational modification sites include glycosylated Asn (e.g., partially or fully glycosylated Asn).
  • the modified chymotrypsin cleaves directly after the post-translational modification sites (e.g., directly after the glycosylated Asn residue).
  • Example 1 Engineered chymotrypsin for mass spectrometry-based detection of protein glycosylation
  • a Förster resonance energy transfer (FRET)-based multiplexed assay, previously utilized to monitor bacterial protease specificity, was adapted to the chymotrypsin system by optimizing labeling parameters such as cell density and pH to improve the dynamic range of the assay.
  • Example 1.2 Chymotrypsin variant with improved activity towards P1 Asn identified by iterative mutagenesis and FACS
  • the amino acids responsible for the substrate specificity of chymotrypsin and trypsin have been extensively investigated in the past. Residues 189, 216 and 226, which are in close proximity to the P1 substrate residue, were considered to be responsible for substrate specificity. Site-directed mutagenesis studies targeting these positions to modify enzyme specificity did not yield functional variants.
  • Loops 185-192 and 215-226 were identified as key determinants of substrate specificity based on their significance in evolutionary divergence of substrate specificity within chymotrypsin-fold serine proteases and in all the previous protein engineering efforts to modify P1 specificity of trypsin and chymotrypsin.
  • coli MC1061 to yield ~10^7 transformants.
  • the expression of mChyB was induced by the addition of arabinose and the library of cells were screened using 20 nM Asn-sub and 20 nM Tyr-sub peptides (FIG. 2C).
  • Applicant constructed an error-prone library using the DNA sequence of mChyB-Asn-v2 as the template.
  • the library of ~10^7 transformants was again screened using identical concentrations of Asn-sub and Tyr-sub. After 4 rounds of sorting, DNA sequencing of 10 random clones identified no consensus mutations.
  • mChyB-Asn with an additional mutation, Val99Met, in chain B showed the highest fluorescence.
  • Asn-sub (3.3-fold in comparison to WT-mChyB, FIG. 3D) was selected for subsequent characterization.
  • rChyB-Asn showed a 10-fold increase in the rate of product generation in comparison to wild-type chymotrypsin.
  • rChyB-Asn showed no activity towards AAPN[GlcNAc]-AMC, confirming Applicant’s hypothesis that glycosylation blocks proteolysis after Asn (FIG. 4A).
  • Example 1.4 Expanded substrate specificity of engineered chymotrypsin
  • PSM peptide to spectra matching
  • Applicant started with specific searches where the C-termini were restricted ([RK]|Not Pro for trypsin and [YWFLMN]|Not Pro for chymotrypsins) and identified 176 unique proteins with a peptide false discovery rate of <1% to create a custom database.
  • non-specific searches were performed to match spectra from WT-rChyB and rChyB-Asn to peptides within this custom database.
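One way to summarize the outcome of such a non-specific search is to tally the P1 residue (the residue immediately N-terminal to each cleaved bond) for every matched peptide. The sketch below is an illustration under stated assumptions: the function name and the example sequences are made up, and only the first occurrence of a peptide in the protein is considered.

```python
from collections import Counter


def p1_residue_counts(protein: str, peptides: list[str]) -> Counter:
    """Count P1 residues at both ends of each peptide found in `protein`."""
    counts: Counter = Counter()
    for peptide in peptides:
        start = protein.find(peptide)  # first occurrence only
        if start == -1:
            continue                   # peptide not in this protein
        end = start + len(peptide)
        if start > 0:
            # Bond cleaved to release the peptide N-terminus.
            counts[protein[start - 1]] += 1
        if end < len(protein):
            # Bond cleaved to release the peptide C-terminus.
            counts[peptide[-1]] += 1
    return counts


# Made-up protein and peptides for illustration only.
print(p1_residue_counts("MKAANGGFSTNLLK", ["GGF", "STN"]))
```

A protease with expanded Asn specificity would show a larger share of N in this tally than the wild-type enzyme digesting the same sample.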
  • proteases with orthogonal specificity for mass spectrometry-based proteomics is the generation of unique peptides in a proteomic sample that either increases the coverage of a single protein or that increases the number of unique proteins identified.
  • Example 1.5 Partially glycosylated Asn identified in a model protein using chymotrypsin
  • Glycan macroheterogeneity, defined as the heterogeneity due to sub-stoichiometric glycosylation, is significant in physiological phenomena such as glycoprotein hormonal regulation and is relevant to quality control of recombinant glycobiologics produced.
  • PSM for glycopeptides is challenging due to the microheterogeneity of glycan composition.
  • Suc2 was digested with endoglycosidase H enzyme which trims glycans to leave only the N-acetylglucosamine moiety attached to the Asn residue.
  • Suc2 was digested with rChyB-Asn for 72 hours and the digested peptides were subjected to LC-MS/MS.
  • sequons 4 and 5 overlap and only sequon 4 is glycosylated. This feature was correctly resolved using rChyB-Asn as proteolysis was observed at the C-terminus of Asn112 in sequon 5. Finally, although sequon 13 is predicted to be almost fully glycosylated, Applicant only detected the unglycosylated peptide.
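Overlapping sequons of the kind described above can be located with a lookahead pattern, so that a second NXS/T motif starting inside the first is not skipped. This is a minimal sketch: the function name and test sequence are hypothetical, and excluding Pro at the X position is the conventional definition rather than something stated in the text.

```python
import re

# Lookahead keeps matches zero-width, so overlapping sequons are all reported.
# ASSUMPTION: X is any residue except Pro (conventional N-glycosylation sequon).
SEQUON = re.compile(r"(?=(N[^P][ST]))")


def find_sequons(protein: str) -> list[tuple[int, str]]:
    """Return (1-based Asn position, motif) for every NXS/T sequon."""
    return [(m.start() + 1, m.group(1)) for m in SEQUON.finditer(protein)]


# Made-up sequence: NNS and NST overlap; NPS is excluded because X is Pro.
print(find_sequons("AANNSTGNPSK"))
```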
  • Applicant next examined the utility of ChyB-Asn for analyzing the extracellular proteome of the human Jurkat T cell line, and to identify glycan macroheterogeneity within this compartment.
  • the cell-free supernatant fraction from cells grown in serum-free media was first digested with rChyB-Asn, subsequently with Peptide-N-Glycosidase (PNGase) F and then subjected to LC-MS/MS.
  • FDR false discovery rate
  • In order to identify glycan macroheterogeneity, Applicant analyzed the dataset two ways: First, within the peptides that were identified to arise from Asn proteolysis, Applicant identified a set of 78 peptides that were: (1) part of the NXS/T sequon, and (2) annotated to be a glycosite by machine learning algorithms implemented within UniProt. These 78 peptides represent annotated glycosites that have been detected in Applicant’s experiments to not have sugar modifications, and to have been proteolyzed. This, in turn, implies that these sites within these proteins either are completely unmodified or have glycan macroheterogeneity.
  • RTPENFPCKN308, which resulted from cleavage at Asn308 of human plasminogen, is known to display glycan macroheterogeneity.
  • the presence or absence of sugar at Asn308 of human plasminogen can affect the rate of activation by tissue plasminogen activator, and the binding affinity to cell surface receptors, ultimately affecting the rate of fibrinolysis of blood clots.
  • proteases have defined substrate binding pockets that help them differentiate the many different chemical properties of the amino-acid side chains.
  • Trypsin and chymotrypsin, for example, have very similar tertiary structures and almost superimposable substrate binding pocket architecture (layout of the backbone).
  • site-directed mutational swapping of selected residues within the binding pocket of trypsin did not endow chymotrypsin-like reactivity and required the grafting of extended loops outside the substrate binding pocket to bring about the change in substrate specificity. Aided by advances in combinatorial screening, progress has been reported in the systematic engineering of the substrate specificity of a wide range of microbial proteases.
  • Applicant has utilized a two-step strategy to engineer the substrate specificity of chyB to cleave after Asn and demonstrate its utility for proteomic and Asn-N-linked glycan mapping experiments.
  • Applicant designed a mature version of chyB (mChyB) containing only chains B and C and containing the Tyr164Ala mutation that eliminates the need for autoproteolytic processing.
  • Applicant screened libraries of mChyB via fusion to the E. coli autotransporter Ag43, and by utilizing a high-throughput selection/counter-selection system.
  • Applicant isolated a variant, mChyB-Asn, that contained 9 mutations and that demonstrated both increased activity towards an Asn-containing substrate and decreased activity towards Tyr-containing substrates.
  • proteases with distinct specificities makes it likely that different peptides are generated for mass spectrometry, and therefore subjecting the same proteome or proteins to proteolysis by these orthogonal proteases increases the likelihood that complementary parts are sequenced, leading to enhanced coverage of sequence space.
  • This parallel digestion approach, using multiple proteases in parallel reactions, has been utilized for the quantification of proteomes obtained from mammalian cells and viruses, and led to the increased identification of
  • Applicant’s results are largely consistent with known mapping experiments but there are also differences. According to Applicant’s results, sequon 14 (Asn 512) is partially glycosylated since Applicant detected both peptides but prior results have claimed both partial and complete glycosylation.
  • the number of proteins identified by the aid of chymotrypsin is similar to the proteins identified using trypsin. More importantly, Applicant identified 87 proteins that are either sub-stoichiometrically glycosylated or non-glycosylated at putative N-linked glycosylation sites. For three of these proteins, Applicant identified both the glycopeptide and the unmodified peptide, confirming N-glycan macroheterogeneity.
  • chymotrypsin can serve as a proteomic tool, as demonstrated by studies in this Example. Although Applicant has demonstrated the detection of candidate glycopeptides, these likely have some false positives since it has been previously demonstrated that spontaneous deamidation of Asn can happen independently of PNGase activity in MS/MS experiments. In order to overcome this limitation, a more thorough annotation of N-glycan sites can be undertaken by the direct detection of glycopeptides as illustrated in a number of recent studies.
  • Lyophilized peptide (KAAPNGSCGRGR, N-terminal acetylated and C-terminal amidated, >70% purity, Genscript, NJ) and dyes, Atto-maleimide (Atto-TEC GmbH, Singen, Germany) and QSY21 carboxylic acid succinimidyl ester (Life Technologies, NY), were resuspended in anhydrous N,N-dimethylformamide (DMF) to a concentration of 10 mM and 10 µg/µL, respectively.
  • 20 µL of 1 M NaHCO3 solution was mixed with 50 µL of peptide solution.
  • the peptide/Atto-633 reaction mixture was added to a tube containing 100 µL of 1 M 4-dimethylaminopyridine (DMAP, Sigma Aldrich, MO) and 50 µL of QSY 21 solution.
  • the reaction mixture was diluted with 4.5 mL of water containing 10% v/v acetic acid and loaded on to the column using a 5 mL sample loop.
  • the chemical identity of the synthesized substrate was verified using ESI-MS (Rice core mass spectrometry facility, TX). The desired fractions were pooled together and lyophilized overnight.
  • substrate (Asn-AtQ21) concentration was estimated by measuring the absorbance of dilutions in Infinite 200 PRO plate reader (Tecan, Switzerland).
  • LB culture was inoculated using a glycerol stock of the library to a starting OD600 of 0.02 and grown up to an OD600 of 0.5 for induction with 100 µM arabinose at 37 °C for 2 h. Induced cells were washed with 1% sucrose solution and resuspended to an OD of 1. The washed cells were incubated with 20 nM of Tyr-BQ7 and Asn-AtQ21 at an OD of 0.01 in 1% sucrose solution containing 2 mM Tris (pH 7.5) for 10 min at 25 °C.
  • the labeled cells were analyzed using BD Biosciences FACSJazz cell sorter at an event rate of 7000/s. Sorting was performed in the labeling time window of 10-25 min as with longer incubation, cells could get labeled non-specifically. For all the libraries, the number of cells screened was at least 3-fold higher than their genetic diversity estimated based on transformation efficiency.
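The reasoning behind screening at least 3-fold more cells than the estimated diversity can be illustrated with a simple sampling estimate. This is a sketch under stated assumptions: it treats every variant as equally represented and uses a Poisson approximation, neither of which is claimed in the text, and the function name is hypothetical.

```python
import math


def expected_coverage(cells_screened: float, library_diversity: float) -> float:
    """Expected fraction of distinct variants sampled at least once.

    ASSUMPTION: all variants are equally abundant and sampling is random,
    so the chance of missing a given variant is exp(-cells / diversity).
    """
    if library_diversity <= 0:
        raise ValueError("library_diversity must be positive")
    return 1.0 - math.exp(-cells_screened / library_diversity)


# 3-fold oversampling of a library of about 10^7 transformants.
print(f"{expected_coverage(3e7, 1e7):.3f}")
```

Under these assumptions, 3-fold oversampling covers roughly 95% of the variants; unequal representation in a real library would lower that figure.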
  • Plasmid DNA was recovered from the sorted cells using Zymo miniprep kit (Zymo Research, CA) and transformed into electrocompetent E. coli cells as described previously. Transformants recovered with SOC media were directly grown in 100 mL of 2xYT media supplemented with 0.5% glucose and 25 µg/mL chloramphenicol at 37 °C for 10 h and used to seed a subculture for the next round of sorting. Ten colonies were randomly picked from the sorted population for clonal characterization using flow cytometry with Tyr-BQ7 and Asn-AtQ21 substrates. Mutations in chymotrypsin B corresponding to the clones that showed the desired phenotype on the cell sorter were identified by standard Sanger sequencing (SeqWright Inc., Houston, TX).
  • pcDNA 3.4 based plasmid vector used for the expression of chymotrypsinogens in human embryonic kidney (HEK293F) cells was a kind gift from Georgiou lab (University of Texas, Austin).
  • HF/Bsu36I digested vector for subsequent cloning by Gibson assembly was commercially purchased (IDT, IA).
  • Wild-type chymotrypsinogen gene (rChyB) was generated by overlap-extension PCR of DNA fragments obtained using rChyB-Asn and mChyB as templates. After verifying the cloned plasmids by DNA sequencing, 100 mL 2xYT culture supplemented with 200 µg/mL ampicillin was inoculated with a colony of cells harboring plasmid containing rChyB or rChyB-Asn, grown overnight at 37 °C and plasmid DNA was isolated using
  • HEK293F cells grown to a density of 2.5 × 10^6 cells/mL after three passages in a suspension media culture at 37 °C and 5% CO2 were transiently transfected with 100 µg of plasmid DNA using 80 µL of Expifectamine transfection reagent (Life Technologies, NY). Transfection enhancers provided in the kit were then added 16 h following transfection. Protein secretion was allowed for 4-5 days, usually when viability (as measured by trypan blue) was 30-50%, and cells were spun down at 4000 × g for 20 min at 4 °C to collect the supernatant.
  • chymotrypsinogen in the supernatant was verified by Western blotting using rabbit anti-His antibody (300 ng/mL, GenScript, NJ), goat anti-rabbit antibody conjugated to HRP (40 ng/mL, Jackson ImmunoResearch, PA), and chemiluminescent HRP substrate (SuperSignal West Pico, Thermo Scientific, MA).
  • Asn-sub, at a concentration of 1 mM, was incubated with activated chymotrypsin variants in Tris buffer (pH 7.5) at 37 °C for about 1 h.
  • an aliquot of the reaction mixture was analyzed on a C18 column by measuring absorbance at 630 and 670 nm corresponding to Atto633 fluorophore and QSY21 quencher respectively using the same gradient method that was previously used during substrate synthesis.
  • aliquots were also characterized by LC-MS (Mass spectrometry laboratory, Department of Chemistry, University of Houston).
  • enzyme concentration was in the range of 2.5-50 nM and substrate concentrations were varied from 20-200 µM except for suc-LLVY-AMC substrate where the highest concentration was limited to 50 µM due to poor solubility.
  • Enzyme concentration for AAPN(GlcNAc)-AMC ranged from 0-560 pM .
  • Example 1.14 Protease stability experiments
  • Protease activity was assessed by incubation of the appropriate substrate in activity buffer (50 mM Tris, 10 mM CaCl2) with either active WT rChyB or rChyB-Asn, and measuring the increase in fluorescence with time using excitation at 380 nm and emission at 460 nm wavelengths.
  • 80 nM WT rChyB or 600 nM rChyB-Asn was incubated in indicated concentrations at 25 °C for 20 minutes.
  • Example 1.15 HEK293F cells supernatant digestion
  • HEK293F cells were cultured in Freestyle 293 serum-free media (Thermo Scientific, MA) and passaged three times. Cells were then seeded at a density of 0.9 × 10⁶ cells/mL in 200 mL fresh media and the supernatant was harvested after 24 hours, lyophilized and the protein sample was dissolved in 0.5% SDS. N-linked glycosylation was trimmed by EndoH (New England Biolabs, MA) treatment following the manufacturer’s instructions.
  • the sample was reduced with 4 mM tris(2-carboxyethyl)phosphine (TCEP) at 37 °C for 15 minutes and alkylated with 4 mM iodoacetamide (IAA) for 30 minutes in the dark at room temperature.
  • the sample was purified by chloroform-methanol precipitation and resuspended in 25 µL of 6 M urea.
  • the protein concentration was measured with Nanodrop A280.
  • 5 µL (65 µg) of the sample was digested with engineered chymotrypsin, wild-type chymotrypsin, and trypsin respectively in 100 mM ammonium bicarbonate buffer at 37 °C for 72 hours.
  • the reaction was stopped by adding formic acid to a final concentration of 1%.
  • Example 1.16 Jurkat cells digestion
  • Jurkat cells were cultured in RPMI 1640 containing 10% fetal bovine serum and supplemented with 50 µg/mL gentamicin and insulin-transferrin-selenium supplement (Thermofisher) until a cell number of 6 × 10⁷ was reached.
  • Cells were re-suspended in 100 ml of serum-starved RPMI1640 and the supernatant was harvested after 24 hours.
  • Secreted proteins were concentrated using Amicon 10-kDa column and protein content was measured using BCA assay.
  • Glycopeptide enrichment, PNGase F treatment were performed by Creative
  • Jurkat secretome (8 mg) was transferred into Microcon devices YM-10 (Millipore), and washed twice with 50 mM ammonium bicarbonate. After reduction by 10 mM dithiothreitol (DTT) at 56 °C for 1 hour and alkylation by 20 mM IAA at room temperature in the dark for 1 hour, proteins were washed thrice with 50 mM ammonium bicarbonate. Trypsin-activated rChyB-Asn was added to Jurkat secretome at a ratio of 1:1 at 37 °C for 3 days. Again, the peptides were washed with 100 µL of 50 mM ammonium bicarbonate twice and lyophilized.
  • Enrichment kit (Sigma) following manufacturers’ instructions. Enriched and depleted (peptides that did not bind to the Glycocapture resin) samples were treated with PNGase F at a ratio of 1:50 overnight at 37 °C. PNGase F-treated proteins were washed with ammonium bicarbonate, lyophilized, and re-suspended in 0.1% formic acid prior to LC-MS/MS analysis.
  • Liquid-chromatography tandem mass spectrometry was performed on a ThermoFinnigan LTQ equipped with an Agilent 1290 Infinity UPLC system using Solvent A (water + 0.1% formic acid) and Solvent B (methanol + 0.1% formic acid). 5 µL (8 µg) of the sample was injected for each mass spectrometry analysis.
  • the peptides were separated in a home-packed 500 µm × 6 cm C18 reversed phase column by a 30-minute linear gradient of 20% to 100% Solvent B with a flow rate of 30 µL/min.
  • Electrospray voltage was set to 3.78 kV with a sweep, auxiliary, and sheath gas set to 0 on a standard IonMax ESI Source.
  • the capillary temperature was set to 250°C and the mass spectrometer was set for Dynamic Exclusion Data-Dependent MS/MS with the 3 highest intensity masses observed in the MS scan targeted for MS/MS fragmentation.
  • the RAW data files were converted to MGF format using the MSConvert utility program from the ProteoWizard program suite (http://proteowizard.sourceforge.net/tools.shtml). The data were first analyzed using X!
  • the identification settings were as follows: No cleavage specificity; 3.0 Da as MS1 and 0.5 Da as MS2 tolerances; fixed modifications: Carbamidomethylation of C (+57.021464 Da), variable modifications: Oxidation of M (+15.994915 Da), fixed modifications during refinement procedure: Carbamidomethylation of C (+57.021464 Da), variable modifications during refinement procedure: Acetylation of protein N-term (+42.010565 Da), Pyrolidone from E (−18.010565 Da), Pyrolidone from Q (−17.026549 Da), Pyrolidone from carbamidomethylated C (−17.026549 Da). Peptides and proteins were inferred from the spectrum identification results using PeptideShaker version 1.16.1263. Peptide Spectrum Matches (PSMs), peptides and proteins were validated at a 1.0% False Discovery Rate (FDR) estimated using the decoy hit distribution.
  • the peptide mixture was separated in a home-packed 100 µm × 10 cm column with a reverse phase ReproSil-Pur C18-AQ resin (3 µm, 120 Å) with a flow rate of 600 nL min⁻¹ by applying a linear gradient from 6 to 30% of B for 38 minutes, 30-42% for 10 minutes, 42-90% for 6 minutes, and constant 90% to complete the 60-minute program.
  • the eluted peptides were electrosprayed at a voltage of 2.2 kV with a capillary temperature of 270 °C.
  • Full MS scans were acquired in the Orbitrap with a resolution of 70,000 at m/z 400.
  • the MS raw data of rChyB-Asn was analyzed and searched against a database of human-related proteins using Byonic software v2.15.7. Similarly to HEK293F analysis, protein identification was conducted against a target/decoy database with an estimated FDR of ≤1%. A semi-specific enzymatic search against YWFFMN was set with a maximum number of missed cleavages of 2 and a peptide molecular weight tolerance of 10 ppm. MS/MS tolerances of 0.5 Da were allowed.
  • the sole fixed modification parameter was carbamidomethylation (C), and the variable modification parameters were oxidation (O), deamination (D), and the glycan modifications HexNAc(1), HexNAc(2), HexNAc(1)Fuc(1), and HexNAc(2)Fuc(1).
  • The peptide lists from the enriched and depleted fractions were combined and narrowed to proteins with a taxonomy of Homo sapiens.
  • a total of 4380 peptides belonging to 2678 proteins are annotated as secreted by either (1) Gene Ontology (GO) cellular component as either extracellular region (GO:0005576), extracellular matrix (GO:0031012), extracellular space (GO:0005615), extracellular exosome (GO:0070062), cell surface (GO:0009986), external side of plasma membrane (GO:0009897), or protein secretion (GO:0009306) or (2) proteins predicted to contain a signal peptide according to SecretomeP 2.0 or SignalP 4.1 servers or (3) proteins classified as secreted by a non-classical pathway with a SecretomeP 2.0 score exceeding 0.6.
  • Invertase (Sigma Aldrich, MO) sample was treated with EndoH to remove all glycosylation except for the core GlcNAc. Then, invertase (0.6 µg) was digested with engineered chymotrypsin at 1:1 ratio in 100 mM ammonium bicarbonate buffer and incubated at 37 °C for up to 72 hours. The reaction was stopped by adding 10% formic acid to a final concentration of 1%. 4 µL of digested peptides (70 or 130 ng) were analyzed by Bruker MicrOTOF-Q mass spectrometry as described previously.
  • the data were searched against the invertase protein sequence database created in OMSSA using the following search criteria: no enzyme, 2 missed cleavages allowed, precursor mass tolerance of 1 Da, product mass tolerance of 0.4 Da, Asn HexNAc, Asn dHexHexNAc, and deamidation of N and Q as variable modifications; the E value threshold was set to 1.000e+000.

Landscapes

  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Engineering & Computer Science (AREA)
  • Molecular Biology (AREA)
  • Chemical & Material Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Physics & Mathematics (AREA)
  • Biomedical Technology (AREA)
  • Hematology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Urology & Nephrology (AREA)
  • Immunology (AREA)
  • Biophysics (AREA)
  • Medicinal Chemistry (AREA)
  • Biotechnology (AREA)
  • Microbiology (AREA)
  • General Health & Medical Sciences (AREA)
  • Biochemistry (AREA)
  • Genetics & Genomics (AREA)
  • Cell Biology (AREA)
  • Analytical Chemistry (AREA)
  • Organic Chemistry (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • General Physics & Mathematics (AREA)
  • Pathology (AREA)
  • Food Science & Technology (AREA)
  • Wood Science & Technology (AREA)
  • Zoology (AREA)
  • General Engineering & Computer Science (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Peptides Or Proteins (AREA)
  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)

Abstract

Embodiments of the present disclosure pertain to modified chymotrypsins with an altered substrate specificity that allows the modified chymotrypsin to cleave after at least one Asn residue in peptides and proteins. Additional embodiments of the present disclosure pertain to nucleic acids with a nucleotide sequence that encodes the modified chymotrypsins of the present disclosure. Further embodiments of the present disclosure pertain to methods of cleaving a protein or peptide by contacting the protein or peptide with a modified chymotrypsin to result in cleavage and generation of one or more fragments. The one or more fragments may be analyzed by mass spectrometry to generate a mass spectrum that can be utilized to identify the protein or peptide.

Description

TITLE
ENGINEERED CHYMOTRYPSINS AND USES THEREOF
CROSS-REFERENCE TO RELATED APPLICATIONS
[0001] This application claims priority to U.S. Provisional Patent Application No. 62/831,298, filed on April 9, 2019. The entirety of the aforementioned application is incorporated herein by reference.
BACKGROUND
[0002] Current methods of utilizing proteolytic enzymes to identify peptides and proteins have numerous limitations, including loss of numerous parts of a proteome due to the size of peptides being generated (e.g., too small or too large), limited proteolytic efficiency across different proteins within complex mixtures, and limited ability to identify post-translational modifications. Numerous embodiments of the present disclosure address the aforementioned limitations.
SUMMARY
[0003] In some embodiments, the present disclosure pertains to modified chymotrypsins with an altered substrate specificity that allows the modified chymotrypsin to cleave after at least one Asn residue in peptides and proteins. In some embodiments, the modified chymotrypsins of the present disclosure cleave after a single Asn residue of proteins or peptides. In some embodiments, the modified chymotrypsins of the present disclosure cleave after a plurality of Asn residues of proteins or peptides.
[0004] The one or more Asn residues of proteins or peptides that are cleaved by the modified chymotrypsins of the present disclosure can be in various forms. For instance, in some embodiments, the one or more Asn residues are in native form. In some embodiments, the one or more Asn residues are post-translationally modified. In some embodiments, the one or more Asn residues are glycosylated.
[0005] In some embodiments, the modified chymotrypsins of the present disclosure are in isolated form. In some embodiments, the modified chymotrypsins of the present disclosure have at least one amino acid substitution relative to the native chymotrypsin. In some embodiments, the native chymotrypsin is native chymotrypsin B (chyB) from Rattus norvegicus.
[0006] In some embodiments, the modified chymotrypsins of the present disclosure lack chain A. In some embodiments, the modified chymotrypsins of the present disclosure lack a signal peptide.
[0007] In some embodiments, the modified chymotrypsins of the present disclosure are in active form. In some embodiments, the modified chymotrypsins of the present disclosure are in zymogen form, thereby requiring activation by trypsin.
[0008] Additional embodiments of the present disclosure pertain to nucleic acids with a nucleotide sequence that encodes a modified chymotrypsin of the present disclosure. In some embodiments, the nucleic acids of the present disclosure are in the form of an expression vector, such as a plasmid. In some embodiments, the expression vector is in a host cell, such as bacterial cells, yeast cells, insect cells, mammalian cells, and combinations thereof. In some embodiments, the nucleic acids of the present disclosure are codon optimized for expression in host cells.
[0009] Further embodiments of the present disclosure pertain to methods of cleaving a protein or peptide by utilizing the modified chymotrypsins of the present disclosure. In some embodiments, the methods of the present disclosure include contacting the protein or peptide with a modified chymotrypsin to result in cleavage after at least one Asn residue in the peptide or protein and generation of one or more fragments. In additional embodiments, the one or more fragments are analyzed by mass spectrometry to generate a mass spectrum. In some embodiments, the generated mass spectrum is utilized to identify the protein or peptide.
DESCRIPTION OF THE DRAWINGS
[0010] FIGURE 1 provides a method of cleaving a protein or peptide by utilizing the modified chymotrypsins of the present disclosure.
[0011] FIGURES 2A-2D show a flow-cytometric assay for the screening of recombinant chymotrypsin with expanded specificity for Asn. FIG. 2A shows the mature form of chymotrypsin (mChyB) is displayed on the surface of E. coli cells via fusion to the autotransporter, Ag43. A FLAG epitope tag is inserted C-terminal of the mchyB gene to enable monitoring of protein expression. FIGS. 2B and 2D show a high-throughput, flow cytometric, single-cell screening assay. E. coli cells expressed mChyB variants, which were displayed via fusion to Ag43, and incubated with Asn-sub (selection) and Tyr-sub (counter-selection) peptide sequences. To eliminate issues associated with protease induced toxicity, plasmids were directly isolated from cells after FACS and retransformed into fresh E. coli competent cells, which were subjected to further rounds of sorting. The FLAG epitope tag was used to normalize protease expression. FIG. 2C shows sequences of the Asn-sub and Tyr-sub peptide substrates used for flow-cytometry.
[0012] FIGURES 3A-3E show the isolation of chymotrypsin variants with altered specificity for Asn substrates. FIGS. 3A-D show the substrate preferences of populations of E. coli MC1061 cells displaying either WT-mChyB or mChyB variants isolated after sequential sorting. The variants at the end of the active site library were used as templates for the shuffled library, and the variants at the end of the shuffled library were used as templates for the error-prone library to yield the final variant, designated mChyB-Asn. FIG. 3E shows amino acid changes in the mChyB-Asn variants in comparison to WT-mChyB. The average hydropathy index (HI), reflective of the hydrophobic character of the protein, is also reported for each of these proteins. The net HI of the amino acids that are being compared is also listed in parentheses for comparison.
[0013] FIGURES 4A-4C show kinetic and biophysical characterization of rChyB-Asn. FIG. 4A shows kinetic characterization of purified, soluble WT-rChyB and rChyB-Asn with aminomethyl coumarin (AMC) fluorogenic substrates. Also shown are comparative assessments of the stability of proteases (FIG. 4B) to undergo autoproteolysis or function in the presence of urea (FIG. 4C). The relative activities are measured using the cognate AMC substrates. Error bars in all panels represent standard deviation from at least triplicate measurements.
[0014] FIGURE 5 shows a comprehensive evaluation of the specificity and utility of rChyB-Asn for proteomic experiments. Shown is a heat map representation of the substrate specificity of WT-rChyB and rChyB-Asn profiled using HEK293F cell lysates.
[0015] FIGURE 6 shows mass spectrometric analysis of Saccharomyces cerevisiae invertase (Suc2) digested using rChyB-Asn. Purified invertase was denatured, treated with the glycosidase EndoH, digested with mChyB-Asn, and characterized using LC-MS/MS. The combined mass spectrometry results from three independent digests with the protease were used to map the detected glycosylation patterns. The consensus NXS/T motif is underlined and all Asn are in bold. Red denotes parts of the protein sequence detected by mass spectrometry, and green indicates Asn residues that were identified as being partially glycosylated. Asterisks identify sites of HexNAc modification. A representative example of the MS/MS spectra obtained is shown, which illustrates that the AEPILNISN peptide contains two Asn, of which only one is glycosylated. The rChyB-Asn did not cut after the glycosylated Asn, but was able to mediate proteolysis after the unmodified Asn.
[0016] FIGURES 7A-7D show the detection of proteins and N-glycan sites within Jurkat secretome. FIG. 7A shows the pathway analyses of the 2676 unique proteins identified using mass spectrometric LC-MS/MS analyses of Jurkat cell secretome digested with rChyB-Asn. Pathway and protein function classification was performed using PantherDB (www.pantherdb.org). Shown are the MS/MS spectra obtained of the proteins CALM-1 (FIG. 7B), FAM3B (FIG. 7C), and aGPCR F5 (FIG. 7D), which illustrate putative N-glycan macroheterogeneity.
DETAILED DESCRIPTION
[0017] It is to be understood that both the foregoing general description and the following detailed description are illustrative and explanatory, and are not restrictive of the subject matter, as claimed. In this application, the use of the singular includes the plural, the word “a” or “an” means “at least one”, and the use of “or” means “and/or”, unless specifically stated otherwise. Furthermore, the use of the term “including”, as well as other forms, such as “includes” and “included”, is not limiting. Also, terms such as “element” or “component” encompass both elements or components comprising one unit and elements or components that comprise more than one unit unless specifically stated otherwise.
[0018] The section headings used herein are for organizational purposes and are not to be construed as limiting the subject matter described. All documents, or portions of documents, cited in this application, including, but not limited to, patents, patent applications, articles, books, and treatises, are hereby expressly incorporated herein by reference in their entirety for any purpose. In the event that one or more of the incorporated literature and similar materials defines a term in a manner that contradicts the definition of that term in this application, this application controls.
[0019] Despite advances in high-throughput sequencing of DNA and RNA, detailed maps of the human proteome with direct measurements of proteins and their modifications are just starting to emerge. Comprehensive characterization of biological systems enabled by quantitative differences in protein expression and abundance, cell-type specific and temporal expression patterns, protein-protein interactions and post-translational modifications (PTM), is best accomplished using mass-spectrometry (MS) proteomics.
[0020] Thus, while the analysis of the human genome has led to the identification of about 20,000 putative protein-coding genes, direct evidence for the existence of these proteins and their PTM is incomplete. The complexity of the proteome is a formidable barrier when accounting for all modifications.
[0021] For instance, it is estimated that there are more than 100,000 proteins encoded by about 20,000 genes and that the relative abundances of these proteins can vary across ten orders of magnitude. The dynamic changes in the relative abundance, modifications and even the subcellular localization of the proteome all provide formidable barriers to thorough annotation.
[0022] Mass-spectrometry (MS) is the most comprehensive and versatile tool for proteomics. Two major MS approaches have been developed: (1) shotgun proteomics, suited for discovery, including post-translational modifications (PTM); and (2) selected reaction monitoring, suited for targeted quantifications and comparisons. Both of the aforementioned MS methods utilize proteolytic digestion to generate peptide fragments that are then detected using MS.
[0023] In a typical shotgun proteomics experiment, the proteomic sample is digested with trypsin, fractionated, and subjected to liquid chromatography-mass-spectrometry (LC-MS/MS). The identification of peptides is accomplished by comparing the observed peptide and fragmentation product ion masses with theoretical mass spectra derived from in silico digestion.
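The in silico digestion step mentioned above can be illustrated with a minimal sketch. It assumes a simple rule of cutting immediately C-terminal to each residue in a chosen set; the function name `digest`, its parameters, and the default residue set (Phe, Trp, Tyr, Leu plus Asn) are hypothetical choices made for illustration and are not taken from this disclosure.

```python
def digest(sequence, cleave_after="FWYLN", missed_cleavages=0):
    """Return peptides from cutting C-terminal to each residue in cleave_after.

    The default residue set is an illustrative assumption, not a measured
    specificity profile.
    """
    # Cut positions sit immediately after each matching residue.
    cuts = [i + 1 for i, aa in enumerate(sequence) if aa in cleave_after]
    bounds = [0] + [c for c in cuts if c < len(sequence)] + [len(sequence)]
    peptides = []
    for start in range(len(bounds) - 1):
        # Allow up to `missed_cleavages` uncut sites inside one peptide.
        last = min(start + 1 + missed_cleavages, len(bounds) - 1)
        for end in range(start + 1, last + 1):
            peptides.append(sequence[bounds[start]:bounds[end]])
    return peptides


# Cutting the AEPILNISN peptide (discussed for FIGURE 6) after Asn only.
print(digest("AEPILNISN", cleave_after="N"))
```

Raising `missed_cleavages` enlarges the theoretical peptide list, which is why search engines expose it as a setting.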
[0024] Trypsin and the trypsin family of enzymes, including chymotrypsin, are widely used as the main proteases of mass-spectrometry-based proteomics. This is primarily due to the ability of these enzymes to cleave protein mixtures yielding peptides of mass in the preferred mass range for MS and with defined substrate specificity, thus providing readily interpretable and reproducible fragmentation spectra. Furthermore, these enzymes are fairly stable and can function in the presence of denaturants (e.g., urea) that assist in unfolding the target proteins that need to be proteolyzed prior to detection.
[0025] There are, however, limitations to the nearly exclusive use of trypsin and the trypsin family of enzymes in proteomics. To begin with, any single protease only covers certain regions of the proteome through the library of peptides it generates. As a result, other parts of the proteome are lost due to the size of peptides being generated (e.g., too small or too large). Moreover, proteolytic efficiency is variable across different proteins within complex mixtures.
[0026] Protein glycosylation is among the most abundant PTM found in all domains of life. Despite the considerable compositional and structural diversity of glycans, glycosylation plays important roles ranging from protein folding to cellular homeostasis. Not surprisingly, aberrant glycosylation is well-recognized in a number of diseases, including cancer. N-linked glycosylation represents one of the best-characterized forms of glycosylation in eukaryotes with the glycans being attached to the amide group of asparagine (Asn) residues within proteins.
[0027] Comprehensive annotation of the N-glycans of proteins requires quantitative identification of: (a) the Asn residues that are modified, (b) N-glycan microheterogeneity through the diversity of the glycans that are attached to these Asn residues, and (c) N-glycan macroheterogeneity through the sub-stoichiometric glycosylation of Asn residues. For any given protein, the annotation of all the Asns that are modified by glycans represents a preferred first step to N-glycan analysis.
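As a rough illustration of the first annotation step, candidate Asn sites can be listed by scanning a sequence for the consensus NXS/T motif referred to in the description of FIGURE 6. The sketch below assumes the commonly used convention that X is any residue other than proline, which is not stated in this disclosure; the name `find_sequons` is hypothetical. A sequence match only flags a candidate site and says nothing about whether the site is actually occupied.

```python
import re

# Lookahead so that overlapping motifs (e.g. "NNSS") are all reported.
# Excluding proline at X is an assumption, not a statement from the source.
SEQUON = re.compile(r"(?=N[^P][ST])")


def find_sequons(sequence):
    """Return 1-based positions of Asn residues inside an N-X-S/T motif."""
    return [m.start() + 1 for m in SEQUON.finditer(sequence)]


# The AEPILNISN peptide has two Asn, only one of which lies in a motif.
print(find_sequons("AEPILNISN"))
```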
[0028] Engineering proteolytic specificity to enable detection of post-translationally modified (PTM) amino acids has been demonstrated for the bacterial proteases OmpT, subtilisin, and more recently trypsin. However, a need exists to broaden the toolkit of proteases that can be used for mass-spectrometry based proteomics, including the development of proteases with orthogonal cleavage specificities that will dramatically alter the coverage of the proteome, such as the human proteome, and the annotation of the PTM. Embodiments of the present disclosure address this need.
[0029] In some embodiments, the present disclosure pertains to modified chymotrypsins with an altered substrate specificity that allows the modified chymotrypsin to cleave after at least one Asn residue in peptides and proteins. In additional embodiments, the present disclosure pertains to nucleic acids with a nucleotide sequence that encodes the modified chymotrypsins of the present disclosure. In further embodiments, the present disclosure pertains to methods of cleaving a protein or peptide by utilizing the modified chymotrypsins of the present disclosure.
[0030] As set forth in more detail herein, the modified chymotrypsins, nucleic acids, and cleavage methods of the present disclosure can have numerous embodiments. The modified chymotrypsins, nucleic acids, and cleavage methods of the present disclosure can also have numerous applications.
[0031] Modified Chymotrypsins
[0032] The modified chymotrypsins of the present disclosure have an altered substrate specificity that allows the modified chymotrypsin to cleave after at least one Asn residue in a peptide or a protein. As set forth in more detail here, the modified chymotrypsins of the present disclosure can have various structures, forms and proteolytic activities.
[0033] The modified chymotrypsins of the present disclosure can have various proteolytic activities. For instance, in some embodiments, the modified chymotrypsins of the present disclosure cleave after one or more Asn residues with a kcat/KM of at least 0.5 M⁻¹ s⁻¹. In some embodiments, the modified chymotrypsins of the present disclosure cleave after one or more Asn residues with a kcat/KM of about 0.5 M⁻¹ s⁻¹. In some embodiments, the modified chymotrypsins of the present disclosure cleave after one or more Asn residues with a kcat/KM of between about 0.4 M⁻¹ s⁻¹ to about 1 M⁻¹ s⁻¹.
[0034] The modified chymotrypsins of the present disclosure can cleave after one or more Asn residues of proteins or peptides in various directions. For instance, in some embodiments, the modified chymotrypsins of the present disclosure cleave at a region that is C-terminal to the one or more Asn residues. In some embodiments, the modified chymotrypsins of the present disclosure cleave at a region that is N-terminal to the one or more Asn residues.
[0035] The modified chymotrypsins of the present disclosure can cleave after one or more Asn residues of proteins or peptides in various locations. For instance, in some embodiments, the modified chymotrypsins of the present disclosure cleave immediately after one or more Asn residues.
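For readers less familiar with the kcat/KM figure of merit used above for proteolytic activity, the following sketch shows how it relates to reaction rate under the standard Michaelis-Menten model. The model choice, the function names, and all numerical values are assumptions for illustration only; none are measurements from this disclosure.

```python
def mm_rate(kcat, km, enzyme, substrate):
    """Michaelis-Menten initial rate: v = kcat * [E] * [S] / (KM + [S])."""
    return kcat * enzyme * substrate / (km + substrate)


def low_substrate_rate(kcat_over_km, enzyme, substrate):
    """Approximation valid when [S] << KM: v ~ (kcat/KM) * [E] * [S]."""
    return kcat_over_km * enzyme * substrate


# Hypothetical values: kcat in s^-1, KM and concentrations in M.
kcat, km = 1.0, 1e-3
enzyme, substrate = 1e-7, 1e-6

print(mm_rate(kcat, km, enzyme, substrate))
print(low_substrate_rate(kcat / km, enzyme, substrate))
```

When the substrate concentration is far below KM the two rates nearly coincide, which is why kcat/KM is commonly used to compare how efficiently an enzyme processes different substrates.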
[0036] The modified chymotrypsins of the present disclosure can cleave different numbers of Asn residues of proteins or peptides. For instance, in some embodiments, the modified chymotrypsins of the present disclosure cleave after a single Asn residue of proteins or peptides. In some embodiments, the modified chymotrypsins of the present disclosure cleave after a plurality of Asn residues of proteins or peptides.
[0037] The one or more Asn residues of proteins or peptides that are cleaved by the modified chymotrypsins of the present disclosure can be in various forms. For instance, in some embodiments, the one or more Asn residues are in native form. In some embodiments, the one or more Asn residues are post-translationally modified. In some embodiments, the one or more Asn residues are glycosylated. In some embodiments, the one or more Asn residues are partially glycosylated. In some embodiments, the one or more Asn residues are fully glycosylated.
[0038] The modified chymotrypsins of the present disclosure can have various forms and structures. For instance, in some embodiments, the modified chymotrypsins of the present disclosure are in the form of a formulation. In some embodiments, the formulation includes, without limitation, suitable buffers, ions, salts, stabilizing agents, and combinations thereof.
[0039] In some embodiments, the modified chymotrypsins of the present disclosure are in isolated form. In some embodiments, the modified chymotrypsins of the present disclosure have at least one amino acid substitution relative to the native chymotrypsin. In some embodiments, the native chymotrypsin is native chymotrypsin B (chyB) from Rattus norvegicus.
[0040] In some embodiments, the modified chymotrypsins of the present disclosure have at least one amino acid substitution that includes an amino acid change in at least one of positions 99, 189, 192, 218, 219, 221, 222, 223, 224, or 226 of the native chymotrypsin. In some embodiments, the one or more amino acid substitutions include, without limitation, Ser189, Arg192, Arg218, Trp224, Gly226, and combinations thereof.
[0041] In some embodiments, the modified chymotrypsins of the present disclosure lack chain A. In some embodiments, the modified chymotrypsins of the present disclosure lack a signal peptide.
[0042] In some embodiments, the modified chymotrypsins of the present disclosure are in active form. In some embodiments, the modified chymotrypsins of the present disclosure are in zymogen form, thereby requiring activation by trypsin.
[0043] In some embodiments, the modified chymotrypsins of the present disclosure are identified by SEQ ID NO: 1. SEQ ID NO: 1 includes the following sequence:
MAFLWLVSCFALVGATFGCGVPTIQPVLTGLSRIVNGEDAIPGSWPWQVSLQDKTGFHFCGGSFISEDWVVTAAHCGVKTSDVVVAGEFDQGSDEENIQVFKIAQVFKNPKFNMFTMRNDITLLKLATPAQFSETVSAVCLPNVDDDFPPGTVCATTGWGKTKYNALKTPEKLQQAALPIVSEADCKKSWGSKITDVMTCAGASGVSSCRGDSGGPLVCQKDGVWTLAGIVSWGSRECTGTWPGVYSRVTALMPWVQQILEAN.
[0044] Nucleic Acids
[0045] The nucleic acids of the present disclosure include a nucleotide sequence that encodes a modified chymotrypsin of the present disclosure. As set forth in more detail herein, the nucleic acids of the present disclosure can have various forms and properties.
[0046] In some embodiments, the nucleic acids of the present disclosure are in the form of an expression vector, such as a plasmid. In some embodiments, the expression vector is in a host cell. In some embodiments, the host cell includes, without limitation, bacterial cells, yeast cells, insect cells, mammalian cells, and combinations thereof. In some embodiments, the nucleic acids of the present disclosure are codon optimized for expression in host cells.
[0047] In some embodiments, the host cell that contains the expression vectors of the present disclosure is a mammalian cell. In some embodiments, the mammalian cell includes, without limitation, HEK cells, CHO cells, HEK293F cells, derivatives thereof, and combinations thereof.
[0048] The nucleic acids of the present disclosure can include various nucleotide sequences that encode various modified chymotrypsins of the present disclosure. For instance, in some embodiments, the nucleotide sequences are identified by SEQ ID NO: 2.
[0049] SEQ ID NO: 2 includes the following sequence:
ATGGCCTTTCTTTGGTTGGTGAGCTGTTTCGCGCTCGTGGGTGCAACTTTTGGCTGCG
GTGTTCCCACAATACAGCCTGTCCTCACCGGGCTCAGCAGAATAGTCAATGGTGAGG
ATGCTATCCCGGGTAGCTGGCCGTGGCAAGTGTCTTTGCAGGATAAAACTGGCTTTC
ATTTTTGTGGAGGTAGTCTTATTTCAGAGGACTGGGTAGTTACAGCTGCCCATTGTG
GAGTCAAGACAAGTGACGTGGTCGTGGCCGGAGAGTTCGACCAAGGTTCAGATGAA
GAGAACATACAGGTACTCAAGATTGCCCAAGTGTTCAAAAATCCCAAATTCAATAT
GTTTACGATGCGGAATGATATTACGCTGCTTAAACTTGCAACACCCGCTCAATTTAG
TGAGACCGTTTCAGCCGTTTGTCTTCCAAATGTGGATGACGACTTCCCCCCGGGGAC
TGTGTGTGCTACAACCGGGTGGGGCAAGACTAAATACAATGCCTTGAAAACCCCCG
AGAAACTTCAACAAGCGGCTCTTCCCATTGTTTCCGAGGCCGATTGTAAGAAGTCAT
GGGGAAGCAAAATTACAGACGTAATGACTTGTGCGGGAGCTTCAGGTGTGTCATCC
TGCCGCGGCGACTCTGGTGGACCCCTGGTTTGTCAAAAAGATGGTGTCTGGACCCTG
GCTGGAATCGTAAGCTGGGGATCTAGGGAATGTACAGGTACCTGGCCGGGCGTATA
TTCACGCGTGACAGCCCTCATGCCCTGGGTCCAGCAGATACTTGAGGCTAAT.
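By way of non-limiting illustration, the relationship between a coding sequence such as SEQ ID NO: 2 and its encoded polypeptide can be checked with a short translation routine. The sketch below assumes the standard genetic code; the function name `translate` is hypothetical and is not part of the disclosure. Only the first six codons and the last two codons of SEQ ID NO: 2 are used as inputs.

```python
from itertools import product

# Standard genetic code, laid out in TCAG order (first base varies slowest).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate a DNA coding sequence in frame 1; '*' marks a stop codon."""
    dna = "".join(dna.split()).upper()  # tolerate line breaks and spaces
    usable = len(dna) - len(dna) % 3    # ignore a trailing partial codon
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# First six codons of SEQ ID NO: 2.
print(translate("ATGGCCTTTCTTTGGTTG"))  # MAFLWL
```

The output matches the first six residues of SEQ ID NO: 1. Passing the full nucleotide sequence to the same function would allow a residue-by-residue comparison against SEQ ID NO: 1.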
[0050] Methods of cleaving a protein or peptide
[0051] In further embodiments, the present disclosure pertains to methods of cleaving a protein or peptide by utilizing the modified chymotrypsins of the present disclosure. In some embodiments illustrated in FIG. 1A, the methods of the present disclosure include a step of contacting the protein or peptide with a modified chymotrypsin (step 10) to result in cleavage after at least one Asn residue in the peptide or protein (step 12) and generation of one or more fragments (step 14). In additional embodiments, the one or more fragments are analyzed by mass spectrometry to generate a mass spectrum (step 16). In some embodiments, the generated mass spectrum is utilized to identify the protein or peptide (step 18). As set forth in more detail herein, the cleavage methods of the present disclosure can have numerous embodiments.
[0052] Various methods may be utilized to identify a protein or peptide that has been cleaved with the modified chymotrypsins of the present disclosure. For instance, in some embodiments, the protein or peptide is identified by comparing the mass spectrum of the one or more fragments to a mass spectrum library of known proteins or peptides. In some embodiments, the mass spectrum library is a theoretical mass spectrum library. In some embodiments, the theoretical mass spectrum library is derived from in silico digestion.
[0053] In some embodiments, the mass spectrum library is a mass spectrum library derived from enzymatic digestion of known proteins or peptides. In some embodiments, the mass spectrum library is accessed by entering the mass spectrum into a search database.
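As a minimal sketch of in silico digestion, the following routine cleaves a sequence after a chosen set of P1 residues unless the next residue is Pro, and reports the monoisotopic mass of each fragment. The default P1 set (Y, W, F, L, M, N) mirrors the search rule used for the chymotrypsins in Example 1; the function names, the example sequence, and the omission of missed cleavages and modifications are simplifying assumptions made for illustration only.

```python
# Monoisotopic residue masses (Da) of the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565  # one water per free peptide


def digest(sequence, p1_residues="YWFLMN"):
    """Cleave after each P1 residue unless the following residue is Pro."""
    fragments, start = [], 0
    for i in range(len(sequence) - 1):  # never cleave after the last residue
        if sequence[i] in p1_residues and sequence[i + 1] != "P":
            fragments.append(sequence[start:i + 1])
            start = i + 1
    fragments.append(sequence[start:])
    return fragments


def peptide_mass(peptide):
    """Monoisotopic mass of an unmodified peptide."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER


# Hypothetical sequence, used only to exercise the rule.
for fragment in digest("AANGSYPKLT"):
    print(fragment, round(peptide_mass(fragment), 4))
```

For the hypothetical input, cleavage occurs after Asn and Leu but not after Tyr, because Tyr is followed by Pro. A theoretical library for a real search would also need to account for missed cleavages and variable modifications.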
[0054] The methods of the present disclosure can have numerous advantageous applications. For instance, in some embodiments, the methods of the present disclosure may be utilized to identify or map the post-translational modification (PTM) sites of a protein or peptide. In some embodiments, the post-translational modification sites include glycosylation sites of a protein or peptide. In some embodiments, the post-translational modification sites include glycosylated Asn (e.g., partially or fully glycosylated Asn). In some embodiments, the modified chymotrypsin cleaves directly after the post-translational modification sites (e.g., directly after the glycosylated Asn residue).
[0055] Additional Embodiments
[0056] Reference will now be made to more specific embodiments of the present disclosure and experimental results that provide support for such embodiments. However, Applicants note that the disclosure below is for illustrative purposes only and is not intended to limit the scope of the claimed subject matter in any way.
[0057] Example 1. Engineered chymotrypsin for mass spectrometry-based detection of protein glycosylation
[0058] In this Example, Applicant has engineered the substrate specificity of chymotrypsin to cleave after Asn by high-throughput screening of large libraries created by comprehensive remodeling of the substrate binding pocket. The engineered variant (ChyB-Asn) demonstrated an altered substrate specificity with an expanded preference for Asn-containing substrates. Biophysical characterization confirmed that protein engineering did not compromise the stability of the enzyme, and comparison of WT-ChyB and ChyB-Asn in profiling lysates of HEK293 cells demonstrated both qualitative and quantitative differences in the nature of the peptides and proteins identified by LC-MS/MS. ChyB-Asn enabled the identification of partially glycosylated Asn sites within a model glycoprotein, and in the extracellular proteome of Jurkat T cells.
[0059] Example 1.1. Results
[0060] In order to facilitate engineering the substrate specificity of recombinant mammalian proteases, Applicant established a comprehensive methodology that allows efficient fluorescence-activated cell sorting (FACS) of large libraries on E. coli by integrating three different components: (i) a surface display system compatible with proteases that need a free N-terminus (FIG. 2A), (ii) a protease activity assay with single-cell resolution, and (iii) a genotype recovery method to facilitate iterative enrichment of the desired phenotype (FIGS. 2B and 2D). Applicant displayed an engineered version of the mature rat chymotrypsin (mChyB) by fusion to the E. coli autotransporter, Antigen 43, and characterized its enzymatic activity and specificity. A Förster resonance energy transfer (FRET) based multiplexed assay, previously utilized to monitor bacterial protease specificity, was adapted to the chymotrypsin system by optimizing the labeling parameters such as cell density and pH to improve the dynamic range of the assay. The peptide substrates contain the amino acid sequence (AAPXGS, X = Asn or Tyr) in the linker region sandwiched between the FRET pair (FIG. 2C).
[0061] Upon proteolysis by the surface-displayed enzyme, the fluorescent fragment, which carries a +3 charge, is captured locally on the bacterial cell surface due to electrostatic interactions. Individual E. coli cells displaying mChyB that have high Asn and low Tyr activities are isolated using FACS while wild-type-like and non-specific variants are rejected (FIGS. 2B and 2D). The expression of full-length chymotrypsin on the cell surface was confirmed using an antibody specific for the FLAG epitope tag.
[0062] Despite displaying chymotrypsin on the surface, sorted E. coli cells with the desired phenotype showed poor viability due to the proteolytic nature of the enzyme. In order to overcome this limitation, Applicant has recently reported a technique for the direct isolation of plasmid DNA from sorted whole cells and subsequent transformation into competent cells with recovery efficiency sufficiently high for the complete screening of large, diverse libraries. Applicant thus used direct plasmid recovery for all further library screening steps (FIGS. 2B and 2D).
[0063] Example 1.2. Chymotrypsin variant with improved activity towards P1 Asn identified by iterative mutagenesis and FACS
[0064] The amino acids responsible for the substrate specificity of chymotrypsin and trypsin have been extensively investigated in the past. Residues 189, 216 and 226, which are in close proximity to the P1 substrate residue, were considered to be responsible for substrate specificity. Site-directed mutagenesis studies targeting these positions to modify enzyme specificity did not yield functional variants.
[0065] Loops 185-192 and 215-226 were identified as key determinants of substrate specificity based on their significance in evolutionary divergence of substrate specificity within chymotrypsin-fold serine proteases and in all the previous protein engineering efforts to modify P1 specificity of trypsin and chymotrypsin. In order to attempt a comprehensive redesign of the substrate specificity of chyB, Applicant hypothesized that specific residues in these loops critical for the catalytic activity and structural stability of the enzyme are likely conserved across the protease family and hence should not be targeted for randomization. Multiple sequence alignment of amino acid sequences of 17 different proteases with chymotrypsin-like tertiary structure and differing substrate specificities, performed using MUSCLE, showed that positions 190, 191, 215, 220 and 225 were highly conserved.
[0066] As a trade-off to decrease the mutational load and increase the fraction of library diversity screened, residues 185-188, which were the farthest from the substrate analog among all the candidate residues, were not randomized. Applicant’s final list of residues targeted for randomization (positions 189, 192, 216-219, 221-224 and 226) was also confirmed using HotSpot Wizard, where the web server used the crystal structure (PDB ID: 1DLK) of δ-chymotrypsin bound to a peptidyl chloromethyl ketone (substrate analog) as the input to collate structural, functional and evolutionary information available from the public databases and identify candidate residues for mutagenesis.
[0067] By utilizing degenerate NNS (N = T, A, G, C; S = G or C) oligonucleotides targeting these 11 amino acids, a partial saturation library was constructed, cloned and transformed into E. coli MC1061 to yield ~10^7 transformants. The expression of mChyB was induced by the addition of arabinose and the library of cells was screened using 20 nM Asn-sub and 20 nM Tyr-sub peptides (FIG. 2C).
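To illustrate why the library is described as a partial saturation library, the sketch below enumerates the NNS codons, counts the amino acids and stop codons they encode under the standard genetic code, and compares the theoretical diversity at 11 randomized positions with a library of roughly ten million transformants. The variable names are hypothetical, and the transformant count is taken as an order-of-magnitude figure.

```python
from itertools import product

# Standard genetic code in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# NNS: any base at positions 1 and 2, G or C at position 3.
nns_codons = ["".join(c) for c in product("ACGT", "ACGT", "GC")]
encoded = [CODON_TABLE[c] for c in nns_codons]
amino_acids = sorted(set(encoded) - {"*"})
stop_codons = encoded.count("*")

positions = 11
dna_diversity = len(nns_codons) ** positions       # distinct DNA sequences
protein_diversity = len(amino_acids) ** positions  # distinct full-length proteins
transformants = 1e7                                # order of magnitude only

print(len(nns_codons), "codons,", len(amino_acids), "amino acids,", stop_codons, "stop")
print(f"DNA diversity:     {dna_diversity:.2e}")
print(f"Protein diversity: {protein_diversity:.2e}")
print(f"Upper bound on fraction sampled: {transformants / dna_diversity:.1e}")
```

Under these assumptions, NNS covers all 20 amino acids with 32 codons and a single stop codon (TAG), and the theoretical diversity exceeds the number of transformants by many orders of magnitude, so only a small fraction of sequence space can be screened.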
[0068] 140 cells displaying high Atto633 fluorescence and low BODIPY fluorescence were isolated after six rounds of sorting. Ten clones randomly picked from this population were found to contain identical mutations in the mChyB gene. This variant, labeled mChyB-Asn-v1, yielded 1.3-fold increased fluorescence with Asn-sub and 3.9-fold decreased fluorescence with Tyr-sub, in comparison to wild-type mChyB (FIGS. 3A-B). Of the 11 positions targeted in the library, mChyB-Asn-v1 had mutations at 10; the only exception was Gly216.
[0069] In order to remove any potential undesired mutations not necessary for the observed change in substrate specificity, a second library was constructed by backcrossing the DNA encoding mChyB-Asn-v1 with WT-mChyB. The resulting library of 10^6 transformants was screened using 20 nM Asn-sub and 20 nM Tyr-sub. During sorting of this library, Applicant decreased the stringency of counter-selection (i.e., allowed sorting of clones that displayed slightly higher activity towards Tyr-sub in comparison to the sort gate used for the initial library).
[0070] Applicant reasoned that some residual wild-type activity is essential for functional activation of mChyB (and mChyB variants), wherein the autoproteolysis of chymotrypsin at Tyr146 yields chains B and C. Complete abolishment of Tyr activity would thus likely prevent autoproteolytic processing and active enzyme formation.
[0071] After 4 rounds of FACS sorting, DNA sequencing of 10 random clones demonstrated that five amino acids (Ser189, Arg192, Arg218, Trp224, and Gly226) were conserved across all the different clones tested, suggesting that they were essential for the altered specificity. mChyB-Asn-v2, which showed the highest fluorescence with Asn-sub, had two mutations (Met189 and Val217) reverted back to Ser, as in wild-type mChyB, leading to an increase in activity towards Asn-sub by 2.2-fold in comparison to WT-mChyB (FIG. 3C). It is to be noted that the activity of mChyB-Asn-v2 towards Tyr-sub also increased, although it was still 1.8-fold lower in comparison to WT-mChyB.
[0072] To identify beneficial mutations outside the 11 positions targeted, Applicant constructed an error-prone library using the DNA sequence of mChyB-Asn-v2 as the template. The library of ~10^7 transformants was again screened using identical concentrations of Asn-sub and Tyr-sub. After 4 rounds of sorting, DNA sequencing of 10 random clones identified no consensus mutations. mChyB-Asn, with an additional mutation, Val99Met, in chain B, showed the highest fluorescence with Asn-sub (3.3-fold in comparison to WT-mChyB, FIG. 3D) and was selected for subsequent characterization.
[0073] Based on the amino acid changes introduced into the active site of chymotrypsin during iterative mutagenesis, Applicant observed a positive correlation between the polarity of the active site, quantified using the GRAVY (grand average of hydropathy) score, and engineered activity towards the P1 asparagine residue (FIG. 3E).
[0074] Example 1.3. Characterization of purified chymotrypsin with peptide substrates
[0075] To characterize the enzymes in a purified, soluble form, WT-mChyB and mChyB-Asn were expressed as inclusion bodies in E. coli MC1061 cells and subjected to refolding in the presence of urea, essentially as described previously. This procedure did not, however, yield catalytically active protein when assayed with Casein-FITC as the model substrate (not shown). Applicant recloned the WT-mChyB and mChyB-Asn constructs into a mammalian expression system using the pcDNA 3.4 vector and added: (1) the native leader peptide of chymotrypsinogen to facilitate secretion of the protease into the supernatants, and (2) the chain A from WT-mChyB that contains the trypsin activation site. This yielded the recombinant zymogens of these proteins, WT-rChyB and rChyB-Asn, that could be activated by trypsin.
[0076] Subsequent to transfection into HEK293F cells and protein expression over a period of 5 days, the supernatant containing the secreted protein was purified by His-tag affinity and cation exchange chromatography to greater than 95% purity (as determined by SDS-PAGE). Immediately prior to use, the proteases were activated by brief incubation with trypsin, and the trypsin was inactivated using TLCK inhibitor.
[0077] Reverse-phase HPLC analysis of Asn-sub substrate (1 mM) digested with activated rChyB-Asn (50 nM) showed one extra peak in addition to the substrate, and LC-MS analysis confirmed that it was the C-terminal peptide fragment upon proteolysis after Asn. Only intact Asn-sub substrate was detected when Asn-sub was co-incubated with activated WT-rChyB (up to 250 nM) even after 24 h, thus confirming the proteolytic activity of the engineered variant. Next, the kinetics of WT-rChyB and rChyB-Asn were measured using fluorogenic peptide substrates containing Tyr (LLVY-AMC), Asn (AAPN-AMC) or GlcNAc-linked Asn (AAPN[GlcNAc]-AMC) at the P1 position.
[0078] Since the solubility of the peptides did not permit Applicant to measure saturation kinetics, Applicant could not estimate individual Michaelis-Menten parameters. WT-rChyB showed very fast kinetics towards LLVY-AMC even at low enzyme concentrations (1 nM). By comparison, at an identical Tyr substrate concentration of 50 mM, rChyB-Asn, even at 50-fold higher enzyme concentration, demonstrated a decrease in the rate of product generation by 20-fold (FIG. 4A), suggesting that high wild-type-like activity was decreased by counter-selection with the Tyr-sub substrate during library screening.
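The two fold-changes reported for the Tyr substrate can be combined into an approximate per-enzyme comparison. The sketch below assumes that, at a fixed sub-saturating substrate concentration, the rate of product generation scales linearly with enzyme concentration; that proportionality and the function name are assumptions made for illustration and are not measurements from this Example.

```python
def per_enzyme_fold_change(rate_fold_change, enzyme_fold_change):
    """Rate change normalized to the amount of enzyme used.

    Assumes rate is proportional to enzyme concentration at fixed substrate.
    """
    return rate_fold_change / enzyme_fold_change


# rChyB-Asn vs WT-rChyB on LLVY-AMC: 20-fold lower rate at 50-fold more enzyme.
fold = per_enzyme_fold_change(rate_fold_change=1 / 20, enzyme_fold_change=50)
print(f"Per-enzyme activity relative to wild type: {fold:.4f}")
print(f"Approximate fold reduction: {1 / fold:.0f}")
```

Under that assumption, the activity of rChyB-Asn towards the Tyr substrate is on the order of a thousand-fold lower than that of WT-rChyB on a per-enzyme basis.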
[0079] With the AAPN-AMC substrate, rChyB-Asn showed a 10-fold increase in the rate of product generation in comparison to wild-type chymotrypsin. In parallel, rChyB-Asn showed no activity towards AAPN[GlcNAc]-AMC, confirming Applicant’s hypothesis that glycosylation blocks proteolysis after Asn (FIG. 4A).
[0080] Since proteomic experiments routinely require digestion for extended periods of time (12-48 h), and the ability to function in the presence of protein denaturants like urea, Applicant next evaluated these biophysical properties of the engineered variant in comparison to the WT enzyme. rChyB-Asn displayed enhanced resistance to autoproteolysis with a half-life of 67 h, in comparison to WT-rChyB, which had a half-life of 8 h (FIG. 4B).
[0081] In the presence of 0.5 M urea, both enzymes demonstrated only a modest reduction in activity, although at urea concentrations higher than 1 M, both enzymes demonstrated <50% activity and the reduction in activity was higher for rChyB-Asn (FIG. 4C). In aggregate, the engineered chymotrypsin demonstrated the desired altered selectivity while demonstrating excellent stability.
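The practical effect of the two half-lives over typical digestion times can be estimated with a simple decay model. The sketch below assumes first-order (single-exponential) loss of active enzyme, which is an illustrative simplification and not a kinetic model established in this Example; the function name is hypothetical.

```python
def fraction_remaining(hours, half_life_hours):
    """Fraction of active enzyme left after `hours`, assuming first-order decay."""
    return 0.5 ** (hours / half_life_hours)


HALF_LIFE_H = {"WT-rChyB": 8, "rChyB-Asn": 67}  # half-lives reported above

for hours in (12, 24, 48):
    row = ", ".join(
        f"{name}: {fraction_remaining(hours, t_half):.0%}"
        for name, t_half in HALF_LIFE_H.items()
    )
    print(f"{hours:>2} h  {row}")
```

Under this assumption, most of the wild-type enzyme would be lost within a 24 h digestion, whereas the majority of the engineered enzyme would remain active.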
[0082] Example 1.4. Expanded substrate specificity of engineered chymotrypsin
[0083] To derive a more comprehensive picture of the altered specificity of rChyB-Asn, and test its utility in a proteomic context, Applicant applied a mass-spectrometry-based approach to profile the selectivity of rChyB-Asn. Accordingly, the secreted protein (secretome) fraction harvested from HEK293F cells was concentrated and 65 µg of total protein was digested separately with activated WT-rChyB or rChyB-Asn, or trypsin (as a control) and subjected to LC-MS/MS. In a typical peptide to spectra matching (PSM) search, amino acids expected to be present in the C-termini of peptides are restricted based on the known specificity of the protease to reduce the computational effort required to mine large databases like the human proteome. When the specificity of the enzyme is unknown, a sparsely populated database is preferred to perform a non-specific PSM search.
[0084] Applicant started with specific searches where the C-termini were restricted ([RK]|Not Pro for trypsin and [YWFLMN]|Not Pro for chymotrypsins) and identified 176 unique proteins with a peptide false discovery rate of <1% to create a custom database. Next, non-specific searches were performed to match spectra from WT-rChyB and rChyB-Asn to peptides within this custom database. The P2-P2' positions in protein substrates cleaved by WT-rChyB (n=405) or rChyB-Asn (n=338) to generate the identified peptides were deduced and a heat map representation of the substrate specificity of the proteases was generated using iceLogo (FIG. 5).
[0085] Not surprisingly, WT-rChyB preferred hydrophobic amino acids at the P1 position. Unlike WT-rChyB, rChyB-Asn preferred Asn at the P1 position in addition to the hydrophobic amino acids. Thus, Applicant’s enzyme engineering efforts resulted in expanding the natural substrate preference of chymotrypsin to include asparagine.
[0086] One of the advantages of using proteases with orthogonal specificity for mass-spectrometry-based proteomics is the generation of unique peptides in a proteomic sample that either increases the coverage of a single protein or that increases the number of unique proteins identified.
[0087] Consistent with the expanded specificity of rChyB-Asn, when Applicant annotated the proteins identified in the HEK293F secretome digested with either WT-rChyB or rChyB-Asn using pathway analysis within PantherDB, Applicant observed both quantitative and qualitative differences in the nature of pathways, thus suggesting that coverage of protein mixtures was increased by the use of engineered rChyB-Asn.
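The deduction of P2-P2' positions from identified peptides can be sketched as follows. Given a parent protein and a peptide matched to it, the residue at the peptide C-terminus is P1, and the window spanning two residues on either side of the scissile bond gives P2, P1, P1' and P2'. The function name, the toy protein, and the toy peptides are hypothetical and serve only to show the bookkeeping; tallies of this kind are the usual input to a sequence-logo tool.

```python
from collections import Counter


def cleavage_window(protein, peptide, flank=2):
    """Return P2..P2' around the cleavage that created the peptide C-terminus.

    Returns None if the peptide is not found or the window runs off the
    protein (for example, when the peptide ends at the protein C-terminus).
    """
    start = protein.find(peptide)
    if start == -1:
        return None
    end = start + len(peptide)  # index of the P1' residue
    if end - flank < 0 or end + flank > len(protein):
        return None
    return protein[end - flank:end + flank]


protein = "MKTAYNGSLLWAPE"      # hypothetical parent sequence
peptides = ["TAYN", "GSLL"]     # hypothetical identified peptides

windows = [cleavage_window(protein, p) for p in peptides]
p1_counts = Counter(w[1] for w in windows if w)  # index 1 of the window is P1

print(windows)
print(p1_counts)
```

For the toy input, the two windows are YNGS and LLWA, giving one cleavage after Asn and one after Leu.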
[0088] Example 1.5. Partially glycosylated Asn identified in a model protein using chymotrypsin
[0089] Glycan macroheterogeneity, defined as the heterogeneity due to sub-stoichiometric glycosylation, is significant in physiological phenomena such as glycoprotein hormonal regulation and is relevant to quality control of recombinant glycobiologics produced. Applicant hypothesized that rChyB-Asn can be used for identifying partially occupied N-glycosites as it would generate two signature peptides: (i) a non-glycosylated peptide with an Asn residue at the C-terminus and (ii) a glycopeptide.
[0090] Here, Saccharomyces cerevisiae invertase (Suc2) was used as the model protein to test this hypothesis since it contains 14 potential glycosylation sites (based on the NXS/T sequons, X = any amino acid) and the occupancy frequency of these glycosites has been characterized previously. PSM for glycopeptides is challenging due to the microheterogeneity of glycan composition. To address this issue, Suc2 was digested with endoglycosidase H enzyme, which trims glycans to leave only the N-acetylglucosamine moiety attached to the Asn residue. Then, Suc2 was digested with rChyB-Asn for 72 hours and the digested peptides were subjected to LC-MS/MS. Using a non-specific PSM search against a custom database containing the invertase sequence, with the criterion that N-acetylglucosamine addition to Asn could be present as a variable modification, all the peptides (nascent and glycosylated) were identified.
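Candidate N-glycosylation sites of the kind described above can be located with a short pattern search. The sketch below reports the 1-based position of every Asn that begins an NXS/T sequon and uses a lookahead so that overlapping sequons are both reported. The function name and the test sequences are hypothetical; the optional exclusion of Pro at the X position reflects a commonly used convention and is offered as an option rather than as the rule applied in this Example.

```python
import re


def find_sequons(sequence, exclude_proline=False):
    """Return 1-based positions of Asn residues that start an N-X-S/T sequon."""
    middle = "[^P]" if exclude_proline else "."
    # Only the Asn is consumed, so overlapping sequons are all found.
    pattern = re.compile(rf"N(?={middle}[ST])")
    return [match.start() + 1 for match in pattern.finditer(sequence)]


print(find_sequons("ANNTSAA"))                       # overlapping sequons
print(find_sequons("ANPSA"))                         # X = Pro allowed
print(find_sequons("ANPSA", exclude_proline=True))   # X = Pro excluded
```

Applied to a full protein sequence, the returned positions give the sequon numbering against which observed peptides can be mapped.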
[0091] In order to specifically focus on glycan macroheterogeneity, Applicant analyzed the subset of peptides containing the NXS/T sequon (data not shown). Out of the 14 sequons, two of them (2 and 11), were in a region of the protein not mapped by MS/MS and hence no assignment of glycosylation could be made (FIG. 6).
[0092] For five sequons (1, 3, 4, 6, and 7), Applicant only identified the glycosylated peptide. For sequon 5, Applicant only detected the non-modified cleaved peptide. These results were consistent with the previously reported complete glycosylation at these sites. Sequons 10, 12 and 14 displayed the dual signature characteristic wherein both the uncleaved but modified and cleaved but unmodified peptides were detected and were thus correctly identified as partially occupied glycosites. By contrast, for both sequons 8 and 9 (Asn266 and Asn275), Applicant observed only the glycopeptide but not the non-glycosylated peptide, even though their occupancy frequency was measured previously to be only 10%. Closely spaced, even overlapping, NXS/T sequons frequently occur and often at least one of them remains unglycosylated due to site skipping by oligosaccharyltransferase complex.
[0093] In the invertase sequence, sequons 4 and 5 overlap and only sequon 4 is glycosylated. This feature was correctly resolved using rChyB-Asn as proteolysis was observed at the C-terminus of Asn112 in sequon 5. Finally, although sequon 13 is predicted to be almost fully glycosylated, Applicant only detected the unglycosylated peptide.
[0094] In summary, as outlined with the invertase data, the engineered substrate specificity of rChyB-Asn can be exploited for glycoproteomic analyses. Moreover, in conjunction with tryptic digestions, rChyB-Asn can be readily incorporated into high-throughput methods described previously.
[0095] Example 1.6. Large-scale analysis of N-linked glycosylation in Jurkat secretome
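The dual-signature logic used to interpret the invertase data can be written as a small decision rule: a site with only the glycopeptide observed is called glycosylated, a site with only the cleaved, unmodified peptide is called unoccupied, and a site with both is called partially occupied. The sketch below encodes that rule and applies it to the observations reported above for the 14 sequons; the function name and the category labels are hypothetical.

```python
def classify_site(glycopeptide_seen, cleaved_unmodified_seen):
    """Assign an occupancy call from the two signature peptides."""
    if glycopeptide_seen and cleaved_unmodified_seen:
        return "partially occupied"
    if glycopeptide_seen:
        return "glycosylated"
    if cleaved_unmodified_seen:
        return "unoccupied"
    return "not covered"


# Observations as described for Suc2: sequon -> (glycopeptide, cleaved unmodified)
observations = {s: (True, False) for s in (1, 3, 4, 6, 7, 8, 9)}
observations.update({s: (False, True) for s in (5, 13)})
observations.update({s: (True, True) for s in (10, 12, 14)})
observations.update({s: (False, False) for s in (2, 11)})

calls = {s: classify_site(*observations[s]) for s in sorted(observations)}
for sequon, call in calls.items():
    print(sequon, call)
```

The rule reports what was observed, not the true occupancy; as noted above for sequons 8, 9 and 13, a call can disagree with previously measured occupancy when one of the two signature peptides goes undetected.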
[0096] Applicant next examined the utility of ChyB-Asn for analyzing the extracellular proteome of the human Jurkat T cell line, and to identify glycan macroheterogeneity within this compartment. The cell-free supernatant fraction from cells grown in serum-free media was first digested with rChyB-Asn, subsequently with Peptide-N-Glycosidase (PNGase) F and then subjected to LC-MS/MS. A semi-specific database search with a false discovery rate (FDR) of 1% identified 4379 peptides mapped to 2676 extracellular proteins (FIG. 7A).
[0097] The number of proteins identified by Applicant’s engineered enzyme compares favorably with other independent studies of Jurkat extracellular proteins profiled using trypsin. Within these peptides, 19% (808) were identified based on proteolysis at Asn, further confirming the modified specificity of the engineered enzyme.
[0098] In order to identify glycan macroheterogeneity, Applicant analyzed the dataset two ways: First, within the peptides that were identified to arise from Asn proteolysis, Applicant identified a set of 78 peptides that were: (1) part of the NXS/T sequon, and (2) annotated to be a glycosite by machine learning algorithms implemented within UniProt. These 78 peptides represent annotated glycosites that have been detected in Applicant’s experiments to not have sugar modifications, and to have been proteolyzed. This, in turn, implies that these sites within these proteins either are completely unmodified or have glycan macroheterogeneity.
[0099] As an example, RTPENFPCKN318, which resulted from cleavage at Asn308 of human plasminogen, is known to display glycan macroheterogeneity. The presence or absence of sugar at Asn308 of human plasminogen can affect the rate of activation by tissue plasminogen activator, and the binding affinity to cell surface receptors, ultimately affecting the rate of fibrinolysis of blood clots.
[00100] Second, within all the peptides, Applicant identified the subset of 48 peptides with the Asn [N+1] modification signature associated with PNGase F deglycosylation. Comparison of the 48 peptides to the 78 peptides that were missing the sugar despite being annotated as a glycosite led to the identification of 3 unique protein sites wherein Applicant detected both the modified and unmodified peptides: Asn120 of FAM3B (UniProt ID P58499), Asn61 of Calmodulin-1 (UniProt ID P0DP23), and Asn315 of adhesion G-protein coupled receptor F5 (UniProt ID Q8IZF2) (FIGS. 7B-D).
[00101] Example 1.7. Discussion
[00102] Protease engineering has fascinated biochemists for decades. Most proteases have defined substrate binding pockets that help them differentiate the many different chemical properties of the amino-acid side chains. Trypsin and chymotrypsin, for example, have very similar tertiary structures and almost superimposable substrate binding pocket architecture (layout of the backbone). Despite this similarity, site-directed mutational swapping of selected residues within the binding pocket of trypsin did not endow chymotrypsin-like reactivity and required the grafting of extended loops outside the substrate binding pocket to bring about the change in substrate specificity. Aided by advances in combinatorial screening, progress has been reported in the systematic engineering of the substrate specificity of a wide range of microbial proteases.
[00103] By contrast, engineering the specificity of mammalian proteases still remains a formidable barrier with few reports of successful engineering. Comprehensive engineering of mammalian protease substrate specificity is a challenging task due to: (1) structural complexity requiring disulfide bonds, (2) the need for zymogen activation, and (3) the capacity of proteases to cleave host-proteins, resulting in cell death; or to cleave themselves leading to autoproteolysis.
[00104] Applicant has utilized a two-step strategy to engineer the substrate specificity of chyB to cleave after Asn and demonstrate its utility for proteomic and Asn-N-linked glycan mapping experiments. To facilitate a comprehensive redesign of the substrate specificity, Applicant designed a mature version of chyB (mChyB) containing only chains B and C and containing the Tyr164Ala mutation that eliminates the need for autoproteolytic processing. Next, Applicant screened libraries of mChyB via fusion to the E. coli autotransporter Ag43, and by utilizing a high-throughput selection/counter-selection system. After three complete cycles of diversification and screening assisted by flow cytometry, Applicant isolated a variant, mChyB-Asn, that contained 9 mutations, and that demonstrated both increased activity towards an Asn-containing substrate and decreased activity towards Tyr-containing substrates.
[00105] Since the E. coli based screening format utilized the surface-displayed variant, Applicant switched to a soluble expression format containing the native chain A (rChyB) and expressed these proteins in mammalian cells. Subsequent to purification, the proteolytic activity and kinetics of the purified rChyB-Asn variant were profiled using standard AMC substrates, and mass spectrometry confirmed that the variant effected site-specific cleavage after Asn.
[00106] One of the challenges in engineering proteases for proteomics applications is that the activity of proteases towards small peptide sequences, either natural or synthetically derived, does not always translate to cleavage of the same sequences in full-length proteins. Applicant specifically chose chyB as a starting scaffold for engineering efforts since it is well-documented that this family of proteases has the ability to cleave full-length proteins at their cognate recognition sites. In order to ensure that the engineered variants could be utilized for proteomics digestions, Applicant performed biophysical experiments to confirm that rChyB-Asn had the ability to function in the presence of the commonly used denaturant, urea, and had enhanced autostability in comparison to WT-rChyB. Next, Applicant evaluated the secretome of HEK293F cells subsequent to proteolytic digestion with either WT-rChyB or rChyB-Asn. These experiments allowed Applicant to map the specificity of rChyB-Asn by surveying cleavage across a vast sequence landscape and demonstrated that rChyB-Asn had expanded substrate specificity to cleave after Asn.
[00107] The availability of proteases with distinct specificities makes it likely that different peptides are generated for mass spectrometry, and therefore subjecting the same proteome or proteins to proteolysis by these orthogonal proteases increases the likelihood that complementary parts are sequenced, leading to enhanced coverage of sequence space. This approach of using multiple proteases in parallel reactions has been utilized for the quantification of proteomes obtained from mammalian cells and viruses, and has led to the increased identification of PTMs. In this context, comparative profiling of the proteins identified by either WT-rChyB or rChyB-Asn showed qualitative differences in the peptides and proteins identified. Applicant thus envisions that rChyB-Asn can be added to the toolbox of proteases available for parallel digestion reactions.
[00108] To test the utility of mChyB-Asn in mapping N-linked glycosylation sites, Applicant chose the well-studied glycoprotein Invertase since it harbors 14 sequons and up to 50% of the protein’s mass is contributed by complex sugars. Both the distribution of sugars across the sequons and the variation in the degree of glycosylation at a given sequon give rise to the heterogeneity of the protein. Since proteolysis after Asn is blocked by sugar modifications due to the large size of the sugars relative to the amino acids, peptide fragments identified by MS/MS that have a C-terminal Asn residue must carry an unmodified Asn. Using this strategy, Applicant compared its results with published results on the glycan mapping of Invertase using mixtures of different proteases.
[00109] Applicant’s results are largely consistent with known mapping experiments but there are also differences. According to Applicant’s results, sequon 14 (Asn 512) is partially glycosylated since Applicant detected both peptides but prior results have claimed both partial and complete glycosylation.
[00110] Next, Applicant validated the application of chymotrypsiN to proteomic studies of glycosylation by undertaking the mapping of the extracellular proteome of Jurkat T cells. The number of proteins identified with the aid of chymotrypsin is similar to the number of proteins identified using trypsin. More importantly, Applicant identified 87 proteins that are either sub-stoichiometrically glycosylated or non-glycosylated at putative N-linked glycosylation sites. For three of these proteins, Applicant identified both the glycopeptide and the unmodified peptide, confirming N-glycan macroheterogeneity.
[00111] In summary, Applicant envisions that chymotrypsin can serve as a proteomic tool, as demonstrated by studies in this Example. Although Applicant has demonstrated the detection of candidate glycopeptides, these likely have some false positives since it has been previously demonstrated that spontaneous deamidation of Asn can happen independently of PNGase activity in MS/MS experiments. In order to overcome this limitation, a more thorough annotation of N-glycan sites can be undertaken by the direct detection of glycopeptides as illustrated in a number of recent studies.
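The false-positive concern noted above follows from the fact that conversion of Asn to Asp produces the same mass shift whether it arises from PNGase F deglycosylation or from spontaneous deamidation. The sketch below derives that shift from monoisotopic atomic masses (the side-chain amide NH2 is replaced by OH) and shows a simple tolerance check; the function name and the tolerance value are hypothetical choices made for illustration.

```python
# Monoisotopic atomic masses (Da).
MASS = {"H": 1.00782503, "N": 14.00307401, "O": 15.99491462}

# Asn -> Asp replaces the side-chain NH2 with OH: net gain of O, loss of N and H.
DEAMIDATION_SHIFT = MASS["O"] - MASS["N"] - MASS["H"]


def looks_like_deamidation(observed_shift, tolerance=0.005):
    """True if an observed mass shift (Da) matches Asn -> Asp conversion."""
    return abs(observed_shift - DEAMIDATION_SHIFT) <= tolerance


print(f"Asn -> Asp shift: {DEAMIDATION_SHIFT:.5f} Da")
print(looks_like_deamidation(0.9840))
print(looks_like_deamidation(1.0034))  # roughly a 13C isotope spacing, for contrast
```

Because the shift is identical for both routes, mass alone cannot establish that a given Asn was previously glycosylated, which is why direct detection of glycopeptides is suggested as a more thorough approach.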
[00112] There is a wide range of PTMs that are currently sub-optimally annotated, and thus engineering proteases specific for PTMs offers a route to continuously expand the set of proteases available for proteome-wide mapping. Although here Applicant has demonstrated specificity for the unmodified parent amino acid as a mechanism to detect PTMs, PTMs like phosphorylation, due to their substoichiometric modification and labile nature, will likely require that the engineered proteases directly detect and cleave after the PTM.
[00113] Example 1.8. Construction of chymotrypsin libraries
[00114] For the construction of the active site library of chymotrypsin B (Rattus norvegicus, UniProt P07388), residues 189, 192, 216-219, 221-224 and 226 were targeted for randomization. PCR was performed using primers 1 and 2 (data not shown) and pBAD_AChy_700 as a template to obtain fragment 1 coding for 206-245 aa of chymotrypsin B with a FLAG tag and a KpnI site with overhang for efficient restriction digestion. After gel purification, fragment 1 was used as template in a PCR reaction with primers 2 and 3 such that the region coding for 193-206 aa was added to its 5’ end.
[00115] The above step was repeated using primers 2 and 4 to obtain fragment 2, coding for 183-245 aa with all the desired positions randomized with the NNS codon. Fragment 3, containing a SacI restriction site with an additional overhang, a Shine-Dalgarno sequence, and the gene coding for the Ag43 signal peptide and 16-188 aa of chymotrypsin B, was obtained using pBAD_AChy_700 as the template and primers 5 and 6. 100 fmol of gel-purified fragments 2 and 3 were first assembled together by 10 cycles of PCR and then amplified with primers 2 and 5 for the next 20 cycles. All primers were purchased from Integrated DNA Technologies (IDT).
[00116] The assembled gene library was gel-purified, digested with SacI-HF and KpnI-HF at 37 °C for 1 h, and ligated into digested pBAD_700. The ligated plasmid was dialyzed against water for 1 h and transformed into E. coli MC1061 cells by electroporation. Transformants recovered with SOC media (~4 × 10⁷) were directly grown in 100 mL of 2xYT media supplemented with 0.5% glucose and 25 µg/mL chloramphenicol at 37 °C for 10 h. 10 mL of cells were lysed, and their plasmid DNA, isolated using a QIAGEN miniprep kit, was stored at -20 °C. 300 µL aliquots of cells (OD600 ~2) in 20% glycerol were frozen using liquid nitrogen and stored at -80 °C.
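The scale of the NNS randomization described above can be illustrated with a short calculation. The following is a minimal sketch, assuming that all eleven listed positions (189, 192, 216-219, 221-224 and 226) are randomized independently with a full NNS codon; the variable names are illustrative and are not part of the disclosure.

```python
from itertools import product

# Standard genetic code with bases ordered T, C, A, G ('*' marks a stop codon).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# NNS codon: N is any base, S is C or G.
nns_codons = [a + b + c for a, b, c in product(BASES, BASES, "CG")]
encoded_amino_acids = {CODON_TABLE[c] for c in nns_codons} - {"*"}
nns_stop_codons = [c for c in nns_codons if CODON_TABLE[c] == "*"]

# Positions targeted for randomization, as listed in the text.
positions = [189, 192, *range(216, 220), *range(221, 225), 226]

# Theoretical DNA-level diversity if every position carries a full NNS codon.
dna_diversity = len(nns_codons) ** len(positions)

# Approximate number of transformants reported in the text.
transformants = 4e7
fraction_sampled = transformants / dna_diversity

print(f"NNS codons: {len(nns_codons)}, amino acids: {len(encoded_amino_acids)}, "
      f"stop codons: {nns_stop_codons}")
print(f"positions: {len(positions)}, theoretical diversity: {dna_diversity:.2e}")
print(f"fraction sampled by transformants: {fraction_sampled:.1e}")
```

Under these assumptions the theoretical sequence space is many orders of magnitude larger than the number of transformants, so any such library samples only a small fraction of the possible variants.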
[00117] To create the backcross library where positions 189, 192, 217-219, 221, 222, 224 and 226 were to contain the codon corresponding to the mChyB or mChyB-Asn-v1 sequence, primers 7 and 8 with mixed bases were used. The procedure described above was repeated with primers 1 and 4 replaced by 8 and 7, respectively. It is to be noted that library members contained amino acids other than those which correspond to mChyB or mChyB-Asn-v1 in positions where more than one mixed base was used for randomization.
[00118] For random mutagenesis of mChyB-Asn-v2, error-prone PCR was performed using the GeneMorph II kit (Agilent Technologies, CA) with the mChyB-Asn-v2 gene as a template and primers 2 and 9. Assembly with the gene fragment coding for the Ag43 signal peptide obtained using primers 5 and 10, cloning into digested pBAD_700, transformation, and cell recovery were performed as described earlier.
[00119] For all the libraries, plasmid DNA isolated from 10 randomly picked clones was sequenced (SeqWright Genomic Services, TX) to assess the genetic diversity and mutation rate for the library created using error-prone PCR. The error rate was estimated to be 0.65%.
[00120] Example 1.9. Synthesis of FRET peptide substrates
[00121] Lyophilized peptide (KAAPNGSCGRGR, N-terminal acetylated and C-terminal amidated, >70% purity, Genscript, NJ) and dyes, Atto-maleimide (Atto-TEC GmbH, Singen, Germany) and QSY21 carboxylic acid succinimidyl ester (Life Technologies, NY), were resuspended in anhydrous N,N-dimethylformamide (DMF) to a concentration of 10 mM and 10 µg/µL, respectively.
[00122] 20 µL of 1 M NaHCO₃ solution was mixed with 50 µL of peptide solution. Excess salt was pelleted, the supernatant was transferred to a fresh tube, and 50 µL of Atto-maleimide was added to it. After incubation at 25 °C for 1 h, an aliquot was analyzed on a C18 column (Cat. No. 00G-4041-E0, Phenomenex, CA) with water/acetonitrile mobile phases containing 1% acetic acid at a flow rate of 1 mL/min by measuring absorbance at wavelengths 630 and 670 nm in HPLC (Shimadzu Scientific, Kyoto, Japan). For the second reaction, the peptide/Atto-633 reaction mixture was added to a tube containing 100 µL of 1 M 4-dimethylaminopyridine (DMAP, Sigma Aldrich, MO) and 50 µL of QSY 21 solution. The reaction mixture was diluted with 4.5 mL of water containing 10% v/v acetic acid and loaded on to the column using a 5 mL sample loop. The chemical identity of the synthesized substrate was verified using ESI-MS (Rice core mass spectrometry facility, TX). The desired fractions were pooled together and lyophilized overnight.
[00123] After reconstitution in 100 µL of water, substrate (Asn-AtQ21) concentration was estimated by measuring the absorbance of dilutions in an Infinite 200 PRO plate reader (Tecan, Switzerland).
[00124] Example 1.10. Screening of chymotrypsin libraries using FACS
[00125] For protein expression, a 3 mL LB culture was inoculated using a glycerol stock of the library to a starting OD600 of 0.02 and grown to an OD600 of 0.5 before induction with 100 µM arabinose at 37 °C for 2 h. Induced cells were washed with 1% sucrose solution and resuspended to an OD of 1. The washed cells were incubated with 20 nM of Tyr-BQ7 and Asn-AtQ21 at an OD of 0.01 in 1% sucrose solution containing 2 mM Tris (pH 7.5) for 10 min at 25 °C. The labeled cells were analyzed using a BD Biosciences FACSJazz cell sorter at an event rate of 7000/s. Sorting was performed in the labeling time window of 10-25 min because, with longer incubation, cells could become labeled non-specifically. For all the libraries, the number of cells screened was at least 3-fold higher than their genetic diversity estimated based on transformation efficiency.
[00126] Plasmid DNA was recovered from the sorted cells using a Zymo miniprep kit (Zymo Research, CA) and transformed into electrocompetent E. coli cells as described previously. Transformants recovered with SOC media were directly grown in 100 mL of 2xYT media supplemented with 0.5% glucose and 25 µg/mL chloramphenicol at 37 °C for 10 h and used to seed a subculture for the next round of sorting. Ten colonies were randomly picked from the sorted population for clonal characterization using flow cytometry with Tyr-BQ7 and Asn-AtQ21 substrates. Mutations in chymotrypsin B corresponding to the clones that showed the desired phenotype on the cell sorter were identified by standard Sanger sequencing (SeqWright Inc., Houston, TX).
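The relationship between library diversity, the 3-fold oversampling criterion, the event rate, and the labeling time window can be sketched with simple arithmetic. The example below is illustrative only: it assumes a library diversity of 4 × 10⁷ (the approximate transformant count reported for the active site library) and treats the 10-25 min window as 15 min of usable sorting time per labeled batch; the function name `sort_plan` is hypothetical.

```python
import math

def sort_plan(diversity, coverage, event_rate_hz, window_min):
    """Return (events needed, labeled batches needed, total sorting hours)."""
    events_needed = diversity * coverage
    # Events that can be analyzed within one labeling window.
    events_per_batch = event_rate_hz * window_min * 60
    batches = math.ceil(events_needed / events_per_batch)
    hours = events_needed / event_rate_hz / 3600
    return events_needed, batches, hours

# Assumed inputs: 4e7 diversity, 3-fold coverage, 7000 events/s, 15 min window.
events, batches, hours = sort_plan(4e7, 3, 7000, 15)
print(f"events: {events:.2e}, batches: {batches}, sorting time: {hours:.1f} h")
```

With these assumed inputs, covering the library three-fold requires on the order of 10⁸ events, which at 7000 events/s corresponds to several hours of sorting split over multiple freshly labeled batches.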
[00127] Example 1.11. Enzyme expression and purification
[00128] Applicants use the notations mChyB and rChyB to refer to the surface display form and the soluble form of the enzymes, respectively. The pcDNA 3.4-based plasmid vector used for the expression of chymotrypsinogens in human embryonic kidney (HEK293F) cells was a kind gift from the Georgiou lab (University of Texas, Austin). A gene fragment coding for rChyB-Asn with a His6 tag, and containing a Kozak sequence (GCCACC) and sequences overlapping with the 5' and 3' ends of the AgeI-HF/Bsu36I-digested vector for subsequent cloning by Gibson assembly, was commercially purchased (IDT, IA).
[00129] The wild-type chymotrypsinogen gene (rChyB) was generated by overlap-extension PCR of DNA fragments obtained using rChyB-Asn and mChyB as templates. After verifying the cloned plasmids by DNA sequencing, a 100 mL 2xYT culture supplemented with 200 µg/mL ampicillin was inoculated with a colony of cells harboring plasmid containing rChyB or rChyB-Asn and grown overnight at 37 °C, and plasmid DNA was isolated using a QIAGEN plasmid maxi kit and QIAvac 24 plus vacuum manifold (Qiagen Inc., CA). 100 mL of HEK293F cells grown to a density of 2.5 × 10⁶ cells/mL after three passages in a suspension media culture at 37 °C and 5% CO2 were transiently transfected with 100 µg of plasmid DNA using 80 µL of Expifectamine transfection reagent (Life Technologies, NY). Transfection enhancers provided in the kit were then added 16 h following transfection. Protein secretion was allowed for 4-5 days, usually until viability (as measured by trypan blue) was 30-50%, and cells were spun down at 4000 × g for 20 min at 4 °C to collect the supernatant. The presence of chymotrypsinogen in the supernatant was verified by Western blotting using rabbit anti-His antibody (300 ng/mL, GenScript, NJ), goat anti-rabbit antibody conjugated to HRP (40 ng/mL, Jackson ImmunoResearch, PA), and chemiluminescent HRP substrate (SuperSignal West Pico, Thermo Scientific, MA). After activating 100 µL of supernatant with 1 U of proteomics-grade trypsin (Sigma Aldrich, MO) at 25 °C for 1 h, 1 µL of 5 µg/µL casein-FITC (50-100 µg FITC/mg solid, Sigma Aldrich, MO) solution was added to monitor proteolytic activity by measuring fluorescence (excitation: 488 nm, emission: 530 nm and cut-off: 515 nm) at 25 °C using a SpectraMAX Gemini EM plate reader (Molecular Devices, CA).
[00130] After ascertaining the presence of chymotrypsinogen, purification was performed using an AKTA FPLC system (GE Healthcare Bio-Sciences, PA) at 4 °C. Filtered (0.22 µm) supernatant was loaded on to the column and 12 mL of phosphate buffer was passed to elute any non-specifically bound proteins. Elution was performed by increasing the imidazole concentration from 0 to 150 mM in a linear gradient over 16 min to collect 5 mL fractions.
[00131] An aliquot of the fractions expected to contain the desired protein was exchanged to Tris buffer (50 mM Tris, 10 mM CaCl2, pH 7.5) using Zeba spin columns in a 96-well plate format (Thermo Scientific, MA), and protease activity was confirmed using casein-FITC substrate after activation. The pooled samples were further purified by cation exchange chromatography. 1 mL of concentrated protein was loaded on to a column packed with 0.8 mL of resin (GE Healthcare Bio-Sciences, PA) by injection. The purity of fractions (1 mL volume) expected to contain a majority of the protein was estimated by running samples denatured using SDS and β-mercaptoethanol in a 4-20% polyacrylamide gel (Lonza Houston Inc, TX) and subsequent Coomassie blue staining. Prior to any functional characterization, zymogens were activated using trypsin (1:100 ratio of enzyme to chymotrypsinogen) at 25 °C for 1 h and subsequently treated with 500 mM of tosyl-L-lysine chloromethyl ketone hydrochloride (TLCK, Sigma Aldrich, MO) to inhibit trypsin.
[00132] Example 1.12. HPLC characterization
[00133] Asn-sub, at concentration 1 mM, was incubated with activated chymotrypsin variants in Tris buffer (pH 7.5) at 37 °C for about 1 h. To estimate the degree of proteolysis, an aliquot of the reaction mixture was analyzed on a C18 column by measuring absorbance at 630 and 670 nm corresponding to Atto633 fluorophore and QSY21 quencher respectively using the same gradient method that was previously used during substrate synthesis. To validate the identity of the peaks observed in the chromatogram, aliquots were also characterized by LC-MS (Mass spectrometry laboratory, Department of Chemistry, University of Houston).
[00134] Example 1.13. Kinetic parameter measurements
[00135] Total protein concentration was estimated by bicinchoninic acid (BCA) colorimetric assay (Thermo Scientific, MA), where the calibration curve was generated by measuring dilutions of albumin standard (1 mg/mL, Sigma Aldrich, MO). The concentration of functional chymotrypsin after zymogen activation was determined by active-site titration against 4-Methylumbelliferyl p-trimethylammoniocinnamate chloride (MUTMAC, Sigma Aldrich, MO). Conversion of relative fluorescence units (RFU) measured at 380/445 nm (excitation/emission wavelengths) to molarity was achieved by calibrating with different concentrations (0-5 mM) of the fluorophore, 4-Methylumbelliferone, in the presence of 50 µM MUTMAC. rChyB zymogen incubated with TLCK-treated trypsin was used as a negative control. Kinetics of rChyB and rChyB-Asn towards suc-Ala-Ala-Pro-Asn-AMC (custom synthesized, Anaspec, CA), suc-Ala-Ala-Pro-Asn(GlcNAc)-AMC (custom synthesized, Anaspec, CA), and suc-Leu-Leu-Val-Tyr-AMC (RnD Systems Inc, MN) were measured. AMC fluorescence, measured at 380/460 nm, was calibrated with dilutions of unconjugated AMC stock solution. Depending on the kinetics, enzyme concentration was in the range of 2.5-50 nM and substrate concentrations were varied from 20-200 µM, except for the suc-LLVY-AMC substrate, where the highest concentration was limited to 50 µM due to poor solubility. Enzyme concentration for AAPN(GlcNAc)-AMC ranged from 0-560 µM.
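One way kinetic parameters such as kcat, KM and kcat/KM could be extracted from initial-rate data of this kind is sketched below. The text does not state which fitting procedure was used; this sketch assumes Michaelis-Menten kinetics and a Lineweaver-Burk (double-reciprocal) linear fit, which is chosen here only because it needs no external libraries. The values of `true_kcat` and `true_km` are made-up illustrative numbers, and the rates are synthetic, not measured data.

```python
def michaelis_menten(s, vmax, km):
    """Initial rate at substrate concentration s under Michaelis-Menten kinetics."""
    return vmax * s / (km + s)

def fit_lineweaver_burk(s_values, v_values):
    """Least-squares fit of 1/v = (KM/Vmax) * (1/S) + 1/Vmax; returns (Vmax, KM)."""
    xs = [1.0 / s for s in s_values]
    ys = [1.0 / v for v in v_values]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    vmax = 1.0 / intercept
    km = slope * vmax
    return vmax, km

# Hypothetical, noise-free example (not data from the disclosure).
enzyme_uM = 0.05                      # 50 nM enzyme, upper end of the stated range
true_kcat = 2.0                       # s^-1, made-up value
true_km = 150.0                       # µM, made-up value
substrate_uM = [20, 50, 100, 150, 200]
rates = [michaelis_menten(s, true_kcat * enzyme_uM, true_km) for s in substrate_uM]

vmax, km = fit_lineweaver_burk(substrate_uM, rates)
kcat = vmax / enzyme_uM               # s^-1
kcat_over_km = kcat / (km * 1e-6)     # M^-1 s^-1 (KM converted from µM to M)
print(f"kcat = {kcat:.3f} s^-1, KM = {km:.1f} µM, kcat/KM = {kcat_over_km:.0f} M^-1 s^-1")
```

With real, noisy measurements a nonlinear least-squares fit of the Michaelis-Menten equation is generally preferred, since the double-reciprocal transform amplifies error at low substrate concentrations.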
[00136] Example 1.14. Protease stability experiments
[00137] Protease activity was assessed by incubation of the appropriate substrate in activity buffer (50 mM Tris, 10 mM CaCl2) with either active WT rChyB or rChyB-Asn, and measuring the increase in fluorescence with time using excitation at 380 nm and emission at 460 nm. To assess the stability in urea, 80 nM WT rChyB or 600 nM rChyB-Asn was incubated in the indicated concentrations of urea at 25 °C for 20 minutes.
[00138] For studying auto-stability, WT rChyB and rChyB-Asn chymotrypsin were incubated at 37 °C, and aliquots were removed at various intervals and stored at 4 °C. Cleavage activity of 80 nM WT rChyB and 400 nM rChyB-Asn at time points 0, 2, 8, 25, and 50 hours was measured.
[00139] Example 1.15. HEK293F cells supernatant digestion
[00140] HEK293F cells were cultured in Freestyle 293 serum-free media (Thermo Scientific, MA) and passaged three times. Cells were then seeded at a density of 0.9 × 10⁶ cells/mL in 200 mL of fresh media, and the supernatant was harvested after 24 hours and lyophilized, and the protein sample was dissolved in 0.5% SDS. N-linked glycosylation was trimmed by EndoH (New England Biolabs, MA) treatment following the manufacturer's instructions. Then, the sample was reduced with 4 mM tris(2-carboxyethyl)phosphine (TCEP) at 37 °C for 15 minutes and alkylated with 4 mM iodoacetamide (IAA) for 30 minutes in the dark at room temperature.
[00141] The sample was purified by chloroform-methanol precipitation and resuspended in 25 µL of 6 M urea. The protein concentration was measured with Nanodrop A280. 5 µL (65 µg) of the sample was digested with engineered chymotrypsin, wild-type chymotrypsin, and trypsin, respectively, in 100 mM ammonium bicarbonate buffer at 37 °C for 72 hours. The reaction was stopped by adding formic acid to a final concentration of 1%.
[00142] Example 1.16. Jurkat cells digestion
[00143] Jurkat cells were cultured in RPMI 1640 containing 10% fetal bovine serum and supplemented with 50 µg/mL gentamicin and insulin-transferrin-selenium supplement (Thermofisher) until a cell number of 6 × 10⁷ was reached. Cells were re-suspended in 100 mL of serum-starved RPMI 1640 and the supernatant was harvested after 24 hours. Secreted proteins were concentrated using an Amicon 10-kDa column and protein content was measured using the BCA assay.
[00144] Glycopeptide enrichment and PNGase F treatment were performed by Creative Proteomics. 8 mg of Jurkat secretome was transferred into Microcon devices YM-10 (Millipore) and washed twice with 50 mM ammonium bicarbonate. After reduction by 10 mM dithiothreitol (DTT) at 56 °C for 1 hour and alkylation by 20 mM IAA at room temperature in the dark for 1 hour, proteins were washed thrice with 50 mM ammonium bicarbonate. Trypsin-activated rChyB-Asn was added to the Jurkat secretome at a ratio of 1:1 and incubated at 37 °C for 3 days. Again, the peptides were washed with 100 µL of 50 mM ammonium bicarbonate twice and lyophilized.
[00145] Glycopeptide enrichment was performed using the Protein Extract Glycopeptide Enrichment kit (Sigma) following the manufacturer's instructions. Enriched and depleted (peptides that did not bind to the Glycocapture resin) samples were treated with PNGase F at a ratio of 1:50 overnight at 37 °C. PNGase F-treated proteins were washed with ammonium bicarbonate, lyophilized, and re-suspended in 0.1% formic acid prior to LC-MS/MS analysis.
[00146] Example 1.17. Mass spectrometry
[00147] Liquid-chromatography tandem mass spectrometry (LC-MS/MS) was performed on a ThermoFinnigan LTQ equipped with an Agilent 1290 Infinity UPLC system using Solvent A (water + 0.1% formic acid) and Solvent B (methanol + 0.1% formic acid). 5 µL (8 µg) of the sample was injected for each mass spectrometry analysis. The peptides were separated in a home-packed 500 µm × 6 cm C18 reversed-phase column by a 30-minute linear gradient of 20% to 100% Solvent B with a flow rate of 30 µL/min. Electrospray voltage was set to 3.78 kV with the sweep, auxiliary, and sheath gas set to 0 on a standard IonMax ESI Source.
[00148] The capillary temperature was set to 250 °C and the mass spectrometer was set for Dynamic Exclusion Data-Dependent MS/MS with the 3 highest intensity masses observed in the MS scan targeted for MS/MS fragmentation. The RAW data files were converted to MGF format using the MSConvert utility program from the ProteoWizard program suite (http://proteowizard.sourceforge.net/tools.shtml). The data were first analyzed using X! Tandem advanced search against the human protein database using the following criteria: specific cleavage site [YWFLMN]|Not Pro for engineered and wild-type chymotrypsin, [RK]|Not Pro for trypsin, 1 missed cleavage allowed, precursor mass tolerance of +3 Da and -1 Da, product mass tolerance of 0.4 Da, carbamidomethyl cysteine as a fixed modification, and other parameters at default settings.
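The cleavage rules used in the specific search can be illustrated with a small in silico digestion. The following is a minimal sketch, not the search engine's implementation: it cleaves C-terminal to any of the listed residues unless the next residue is Pro, and enumerates peptides with up to a given number of missed cleavages. The function names are hypothetical, and the example sequence is the FRET peptide described in Example 1.9.

```python
import re

def cleavage_sites(sequence, residues="YWFLMN"):
    """Return indices i where the chain is cut between sequence[i-1] and sequence[i].

    A site follows any residue in `residues` that is not followed by Pro;
    a match at the very end of the sequence is not a cut and is skipped.
    """
    pattern = f"[{residues}](?!P)"
    return [m.end() for m in re.finditer(pattern, sequence) if m.end() < len(sequence)]

def digest(sequence, residues="YWFLMN", missed_cleavages=1):
    """List peptides produced with up to `missed_cleavages` missed cleavage sites."""
    bounds = [0] + cleavage_sites(sequence, residues) + [len(sequence)]
    peptides = []
    for i in range(len(bounds) - 1):
        last = min(i + 2 + missed_cleavages, len(bounds))
        for j in range(i + 1, last):
            peptides.append(sequence[bounds[i]:bounds[j]])
    return peptides

# Chymotrypsin-like rule with Asn included, one missed cleavage allowed.
print(digest("KAAPNGSCGRGR"))
# Trypsin-like rule: cleave after R or K, but not before Pro.
print(digest("AKPGRA", residues="RK", missed_cleavages=0))
```

Passing `residues="RK"` reproduces the trypsin rule from the same search criteria, so the two digests can be compared on identical input sequences.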
[00149] For the non-specific search, peak lists obtained from MS/MS spectra were identified using OMSSA version 2.1.9 and X!Tandem version X! Tandem Sledgehammer (2013.09.01.1). The search was conducted using SearchGUI against the custom database of proteins identified (176 592 sequences) from the previous specific search. Protein identification was conducted against a concatenated target/decoy version of the custom database. The decoy sequences were created by reversing the target sequences in SearchGUI. The identification settings were as follows: no cleavage specificity; 3.0 Da as MS1 and 0.5 Da as MS2 tolerances; fixed modifications: Carbamidomethylation of C (+57.021464 Da); variable modifications: Oxidation of M (+15.994915 Da); fixed modifications during refinement procedure: Carbamidomethylation of C (+57.021464 Da); variable modifications during refinement procedure: Acetylation of protein N-term (+42.010565 Da), Pyrolidone from E (-18.010565 Da), Pyrolidone from Q (-17.026549 Da), Pyrolidone from carbamidomethylated C (-17.026549 Da). Peptides and proteins were inferred from the spectrum identification results using PeptideShaker version 1.16.12. Peptide Spectrum Matches (PSMs), peptides and proteins were validated at a 1.0% False Discovery Rate (FDR) estimated using the decoy hit distribution.
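The target/decoy strategy described above can be sketched in a few lines. This is a simplified illustration of the general idea (reversed decoy sequences and a decoy-to-target ratio as the FDR estimate), not the algorithm implemented in SearchGUI or PeptideShaker; the function names, the `REV_` prefix, and the example scores are hypothetical.

```python
def reversed_decoys(targets):
    """Build decoy entries by reversing each target protein sequence."""
    return {f"REV_{name}": seq[::-1] for name, seq in targets.items()}

def fdr_at_threshold(psms, threshold):
    """Estimate FDR as (decoy hits) / (target hits) among PSMs scoring >= threshold.

    `psms` is a list of (score, is_decoy) tuples.
    """
    targets = sum(1 for score, is_decoy in psms if score >= threshold and not is_decoy)
    decoys = sum(1 for score, is_decoy in psms if score >= threshold and is_decoy)
    return decoys / targets if targets else 0.0

def lowest_threshold_for_fdr(psms, max_fdr=0.01):
    """Return the lowest score cut-off whose estimated FDR is within max_fdr."""
    for threshold in sorted({score for score, _ in psms}):
        if fdr_at_threshold(psms, threshold) <= max_fdr:
            return threshold
    return None

# Made-up PSM scores: high-scoring hits are mostly targets, low-scoring hits are mixed.
example_psms = ([(50, False)] * 200 + [(50, True)] * 1
                + [(10, False)] * 10 + [(10, True)] * 5)
print(reversed_decoys({"P1": "KAAPNGS"}))
print(lowest_threshold_for_fdr(example_psms, max_fdr=0.01))
```

In this toy example, accepting every PSM exceeds the 1% limit, while restricting to the higher score cut-off brings the estimated FDR under it.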
[00150] Mass spectrometry analysis of Jurkat secretome was performed commercially. Glycopeptides from Jurkat secretome were analyzed using Nanoflow UPLC: Easy nLC1000 (Thermofisher Scientific) coupled to an Orbitrap Q Exactive mass spectrometer (Thermofisher Scientific) using mobile phases A (0.1% formic acid in water) and B (0.1% formic acid in acetonitrile). The peptide mixture was separated in a home-packed 100 µm × 10 cm column with a reverse phase ReproSil-Pur C18-AQ resin (3 µm, 120 Å) with a flow rate of 600 nL/min by applying a linear gradient from 6 to 30% of B for 38 minutes, 30-42% for 10 minutes, 42-90% for 6 minutes, and constant 90% to complete the 60-minute program. The eluted peptides were electrosprayed at a voltage of 2.2 kV with a capillary temperature of 270 °C. Full MS scans were acquired in the Orbitrap with a resolution of 70,000 at m/z 400. In each Orbitrap survey scan, a full-scan spectrum was acquired in the mass range of m/z 300-1650, followed by CID on the 15 most intense peptide ions from the preview scan in the Orbitrap. The ion-trap MS analysis was performed with CID with activation q = 0.25 and an activation time of 30 ms for one microscan.
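The multi-step gradient described above can be written as a piecewise-linear program. The sketch below assumes that the segments run back to back, so that the breakpoints fall at 0, 38, 48, 54 and 60 minutes; the names `GRADIENT` and `percent_b` are illustrative.

```python
# (time in minutes, % mobile phase B) breakpoints derived from the segment durations:
# 6-30% B over 38 min, 30-42% over 10 min, 42-90% over 6 min, then 90% held to 60 min.
GRADIENT = [(0, 6), (38, 30), (48, 42), (54, 90), (60, 90)]

def percent_b(t_min, program=GRADIENT):
    """Linearly interpolate the %B setpoint at time t_min within the program."""
    if not program[0][0] <= t_min <= program[-1][0]:
        raise ValueError("time is outside the gradient program")
    for (t0, b0), (t1, b1) in zip(program, program[1:]):
        if t0 <= t_min <= t1:
            return b0 + (b1 - b0) * (t_min - t0) / (t1 - t0)

for t in (0, 19, 43, 51, 57):
    print(f"t = {t:2d} min -> {percent_b(t):.1f}% B")
```

Writing the program as explicit breakpoints makes it easy to confirm that the segment durations add up to the stated 60-minute run.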
[00151] The MS raw data of rChyB-Asn was analyzed and searched against a database of human-related proteins using Byonic software v2.15.7. Similarly to the HEK293F analysis, protein identification was conducted against a target/decoy database with an estimated FDR of <1%. A semi-specific enzymatic search against YWFLMN was set with a maximum number of missed cleavages of 2 and a peptide molecular weight tolerance of 10 ppm. MS/MS tolerances of 0.5 Da were allowed. The sole fixed modification parameter was carbamidomethylation (C), and the variable modification parameters were oxidation (O) and deamidation (D) and the glycan modifications HexNAc(1), HexNAc(2), HexNAc(1)Fuc(1), and HexNAc(2)Fuc(1). The peptide lists from the enriched and depleted fractions were combined and narrowed to proteins with a taxonomy of Homo sapiens. A total of 4380 peptides belonging to 2678 proteins are annotated as secreted by either (1) Gene Ontology (GO) cellular component as either extracellular region (GO:0005576), extracellular matrix (GO:0031012), extracellular space (GO:0005615), extracellular exosome (GO:0070062), cell surface (GO:0009986), external side of plasma membrane (GO:0009897), or protein secretion (GO:0009306) or (2) proteins predicted to contain a signal peptide according to SecretomeP 2.0 or SignalP 4.1 servers or (3) proteins classified as secreted by a non-classical pathway with a SecretomeP 2.0 score exceeding 0.6.
[00152] Example 1.18. Invertase digestion
[00153] Invertase (Sigma Aldrich, MO) sample was treated with EndoH to remove all glycosylation except for the core GlcNAc. Then, invertase (0.6 µg) was digested with engineered chymotrypsin at a 1:1 ratio in 100 mM ammonium bicarbonate buffer and incubated at 37 °C for up to 72 hours. The reaction was stopped by adding 10% formic acid to a final concentration of 1%. 4 µL of digested peptides (70 or 130 ng) were analyzed by Bruker MicrOTOF-Q mass spectrometry as described previously. The data were searched against the invertase protein sequence database created in OMSSA using the following search criteria: no enzyme, 2 missed cleavages allowed, precursor mass tolerance of 1 Da, product mass tolerance of 0.4 Da, Asn HexNAc, Asn dHexHexNAc, and deamidation of N and Q as variable modifications, and an E value threshold set to 1.000e+000.
[00154] Without further elaboration, it is believed that one skilled in the art can, using the description herein, utilize the present disclosure to its fullest extent. The embodiments described herein are to be construed as illustrative and not as constraining the remainder of the disclosure in any way whatsoever. While the embodiments have been shown and described, many variations and modifications thereof can be made by one skilled in the art without departing from the spirit and teachings of the invention. Accordingly, the scope of protection is not limited by the description set out above, but is only limited by the claims, including all equivalents of the subject matter of the claims. The disclosures of all patents, patent applications and publications cited herein are hereby incorporated herein by reference, to the extent that they provide procedural or other details consistent with and supplementary to those set forth herein.

Claims

WHAT IS CLAIMED IS:
1. A modified chymotrypsin, wherein the modified chymotrypsin has an altered substrate specificity that allows the modified chymotrypsin to cleave after at least one Asn residue in peptides and proteins.
2. The modified chymotrypsin of claim 1, wherein the modified chymotrypsin cleaves after the at least one Asn residue with a kcat/KM of at least 0.5 M⁻¹s⁻¹.
3. The modified chymotrypsin of claim 1, wherein the modified chymotrypsin cleaves after a plurality of Asn residues.
4. The modified chymotrypsin of claim 1, wherein the at least one Asn residue is in native form.
5. The modified chymotrypsin of claim 1, wherein the at least one Asn residue is post-translationally modified.
6. The modified chymotrypsin of claim 5, wherein the at least one Asn residue is glycosylated.
7. The modified chymotrypsin of claim 1, wherein the modified chymotrypsin cleaves immediately after the at least one Asn residue.
8. The modified chymotrypsin of claim 1, wherein the modified chymotrypsin has at least one amino acid substitution relative to the native chymotrypsin.
9. The modified chymotrypsin of claim 8, wherein the native chymotrypsin is native chymotrypsin B (chyB) from Rattus norvegicus.
10. The modified chymotrypsin of claim 8, wherein the at least one amino acid substitution includes an amino acid change in at least one of positions 99, 189, 192, 218, 219, 221, 222, 223, 224, or 226 of the native chymotrypsin.
11. The modified chymotrypsin of claim 8, wherein the at least one amino acid substitution is selected from the group consisting of Ser189, Arg192, Arg218, Trp224, Gly226, and combinations thereof.
12. The modified chymotrypsin of claim 1, wherein the modified chymotrypsin lacks a chain A.
13. The modified chymotrypsin of claim 1, wherein the modified chymotrypsin lacks a signal peptide.
14. The modified chymotrypsin of claim 1, wherein the modified chymotrypsin is in active form.
15. The modified chymotrypsin of claim 1, wherein the modified chymotrypsin is in zymogen form, thereby requiring activation by trypsin.
16. The modified chymotrypsin of claim 1, wherein the modified chymotrypsin is identified by SEQ ID NO: 1.
17. A method of cleaving a protein or peptide, wherein the method comprises:
contacting the protein or peptide with a modified chymotrypsin,
wherein the modified chymotrypsin has an altered substrate specificity that allows the modified chymotrypsin to cleave after at least one Asn residue in peptides and proteins; and wherein the modified chymotrypsin cleaves after at least one Asn residue in the peptide or protein to generate one or more fragments.
18. The method of claim 17, further comprising a step of analyzing the one or more fragments by mass spectrometry to generate a mass spectrum.
19. The method of claim 18, wherein the mass spectrum is utilized to identify the protein or peptide.
20. The method of claim 19, wherein the protein or peptide is identified by comparing the mass spectrum to a mass spectrum library of known proteins or peptides in order to identify the protein or peptide.
21. The method of claim 20, wherein the mass spectrum library is selected from the group consisting of a theoretical mass spectrum library, a mass spectrum library derived from enzymatic digestion of known proteins or peptides, and combinations thereof.
22. The method of claim 17, wherein the method is utilized to identify or map the post-translational modification (PTM) sites of the protein or peptide.
23. The method of claim 22, wherein the post-translational modification sites comprise glycosylation sites of the protein or peptide.
24. The method of claim 23, wherein the glycosylation sites comprise glycosylated Asn.
25. The method of claim 24, wherein the modified chymotrypsin cleaves directly after the glycosylated Asn.
26. A nucleic acid comprising a nucleotide sequence encoding a modified chymotrypsin, wherein the modified chymotrypsin has an altered substrate specificity that allows the modified chymotrypsin to cleave after at least one Asn residue in peptides and proteins.
27. The nucleic acid of claim 26, wherein the nucleic acid is in the form of an expression vector.
28. The nucleic acid of claim 27, wherein the expression vector is in a host cell selected from the group consisting of a bacterial cell, a yeast cell, an insect cell, a mammalian cell, and combinations thereof.
29. The nucleic acid of claim 26, wherein the nucleotide sequence encoding the modified chymotrypsin is identified by SEQ ID NO: 2.
PCT/US2020/027418 2019-04-09 2020-04-09 Engineered chymotrypsins and uses thereof WO2020210455A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US201962831298P 2019-04-09 2019-04-09
US62/831,298 2019-04-09

Publications (1)

Publication Number Publication Date
WO2020210455A1 true WO2020210455A1 (en) 2020-10-15

Family

ID=72750580

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2020/027418 WO2020210455A1 (en) 2019-04-09 2020-04-09 Engineered chymotrypsins and uses thereof

Country Status (1)

Country Link
WO (1) WO2020210455A1 (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2000139472A (en) * 1998-11-04 2000-05-23 Natl Food Res Inst ASPARAGINE RESIDUE-SPECIFIC ENDOPROTEASE cDNA DERIVED FROM PLANT AND GENE
US20040146938A1 (en) * 2002-10-02 2004-07-29 Jack Nguyen Methods of generating and screening for proteases with altered specificity
JP2009183278A (en) * 2008-01-09 2009-08-20 Osaka Prefecture Univ Mutant-type protease and peptide synthesis method using the peptide

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
BALAKRISHNAN RAMESH, ABNOUF SHAZA, MALI SUJINA, MOREE WILNA J., PATIL UJWAL, BARK STEVEN J., VARADARAJAN NAVIN: "Engineered ChymotrypsiN for Mass Spectrometry-Based Detection of Protein Glycosylation", ACS CHEMICAL BIOLOGY, vol. 14, no. 12, 11 November 2019 (2019-11-11), pages 2616 - 2628, XP055741980, ISSN: 1554-8929, DOI: 10.1021/acschembio.9b00506 *
BALAKRISHNAN RAMESH: "Engineering Substrate Specificity of Mammalian Proteases", DOCTOR OF PHILOSOPHY IN CHEMICAL ENGINEERING, 31 August 2015 (2015-08-31), pages 1 - 178, XP055741974 *

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20788182

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20788182

Country of ref document: EP

Kind code of ref document: A1