WO2020242765A1 - Meso-scale engineered peptides and methods of selecting - Google Patents

Meso-scale engineered peptides and methods of selecting

Publication number
WO2020242765A1
WO2020242765A1 PCT/US2020/032715 US2020032715W WO2020242765A1 WO 2020242765 A1 WO2020242765 A1 WO 2020242765A1 US 2020032715 W US2020032715 W US 2020032715W WO 2020242765 A1 WO2020242765 A1 WO 2020242765A1
Authority
WO
WIPO (PCT)
Prior art keywords
constraints
derived
reference target
engineered peptide
peptide
Prior art date
Application number
PCT/US2020/032715
Other languages
English (en)
Inventor
Matthew P. Greving
Kevin Eduard HAUSER
Andrew Morin
Jordan R. WILLIS
Original Assignee
Rubryc Therapeutics, Inc.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Rubryc Therapeutics, Inc.
Priority to EP20813167.2A (EP3977117A4)
Priority to CN202080050892.XA (CN114585918A)
Priority to JP2021570755A (JP2022535511A)
Priority to CA3142227A (CA3142227A1)
Priority to KR1020217043265A (KR20220041784A)
Publication of WO2020242765A1
Priority to US17/537,215 (US20220081472A1)

Classifications

    • G: PHYSICS
    • G16: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B: BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B40/00: ICT specially adapted for biostatistics; ICT specially adapted for bioinformatics-related machine learning or data mining, e.g. knowledge discovery or pattern finding
    • G16B40/20: Supervised data analysis
    • C: CHEMISTRY; METALLURGY
    • C07: ORGANIC CHEMISTRY
    • C07K: PEPTIDES
    • C07K1/00: General methods for the preparation of peptides, i.e. processes for the organic chemical preparation of peptides or proteins of any length
    • C07K1/10: General methods for the preparation of peptides, i.e. processes for the organic chemical preparation of peptides or proteins of any length using coupling agents
    • C: CHEMISTRY; METALLURGY
    • C07: ORGANIC CHEMISTRY
    • C07K: PEPTIDES
    • C07K14/00: Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof
    • C: CHEMISTRY; METALLURGY
    • C07: ORGANIC CHEMISTRY
    • C07K: PEPTIDES
    • C07K14/00: Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof
    • C07K14/001: Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof by chemical synthesis
    • G: PHYSICS
    • G01: MEASURING; TESTING
    • G01N: INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N33/00: Investigating or analysing materials by specific methods not covered by groups G01N1/00 - G01N31/00
    • G01N33/48: Biological material, e.g. blood, urine; Haemocytometers
    • G01N33/50: Chemical analysis of biological material, e.g. blood, urine; Testing involving biospecific ligand binding methods; Immunological testing
    • G01N33/68: Chemical analysis of biological material, e.g. blood, urine; Testing involving biospecific ligand binding methods; Immunological testing involving proteins, peptides or amino acids
    • G01N33/6803: General methods of protein analysis not limited to specific proteins or families of proteins
    • G01N33/6845: Methods of identifying protein-protein interactions in protein mixtures
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00: Machine learning
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00: Machine learning
    • G06N20/10: Machine learning using kernel methods, e.g. support vector machines [SVM]
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00: Machine learning
    • G06N20/20: Ensemble learning
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N5/00: Computing arrangements using knowledge-based models
    • G06N5/01: Dynamic search techniques; Heuristics; Dynamic trees; Branch-and-bound
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N5/00: Computing arrangements using knowledge-based models
    • G06N5/04: Inference or reasoning models
    • G: PHYSICS
    • G16: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B: BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B15/00: ICT specially adapted for analysing two-dimensional or three-dimensional molecular structures, e.g. structural or functional relations or structure alignment
    • G16B15/20: Protein or domain folding
    • G: PHYSICS
    • G16: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B: BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B5/00: ICT specially adapted for modelling or simulations in systems biology, e.g. gene-regulatory networks, protein interaction networks or metabolic networks
    • G: PHYSICS
    • G16: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B: BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B5/00: ICT specially adapted for modelling or simulations in systems biology, e.g. gene-regulatory networks, protein interaction networks or metabolic networks
    • G16B5/30: Dynamic-time models
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/04: Architecture, e.g. interconnection topology
    • G06N3/044: Recurrent networks, e.g. Hopfield networks
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/04: Architecture, e.g. interconnection topology
    • G06N3/045: Combinations of networks
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02A: TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A90/00: Technologies having an indirect contribution to adaptation to climate change
    • Y02A90/10: Information and communication technologies [ICT] supporting adaptation to climate change, e.g. for weather forecasting or climate simulation

Definitions

  • an engineered peptide wherein the engineered peptide has a molecular mass of between 1 kDa and 10 kDa, comprises up to 50 amino acids, and comprises: a combination of spatially-associated topological constraints, wherein one or more of the constraints is a reference target-derived constraint; and wherein between 10% and 98% of the amino acids of the engineered peptide meet the one or more reference target-derived constraints, wherein the amino acids that meet the one or more reference target-derived constraints have less than 8.0 Å backbone root-mean-square deviation (RMSD) structural homology with the reference target.
  • RMSD backbone root-mean-square deviation
  • the amino acids that meet the one or more reference target-derived constraints have between 10% and 90% sequence homology with the reference target. In some embodiments, they have a van der Waals surface area overlap with the reference of between 30 Å² and 3000 Å².
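The backbone RMSD criterion above can be illustrated with a minimal sketch. It assumes the two sets of backbone atom coordinates are already superimposed and paired one-to-one; no structural alignment is performed, and the function names are hypothetical rather than taken from the application.

```python
from math import sqrt

RMSD_CUTOFF_ANGSTROM = 8.0  # threshold quoted in the text above


def backbone_rmsd(coords_a, coords_b):
    """RMSD between two equal-length lists of (x, y, z) backbone atom positions.

    Assumption: the structures are already superimposed; no alignment is done here.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return sqrt(squared / len(coords_a))


def within_cutoff(rmsd, cutoff=RMSD_CUTOFF_ANGSTROM):
    """True when the RMSD is strictly below the cutoff ("less than 8.0 Å")."""
    return rmsd < cutoff
```

In practice an optimal superposition (for example with the Kabsch algorithm) would normally precede this calculation; it is left out here to keep the sketch dependency-free.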
  • the combination comprises at least two, or at least five reference target-derived constraints. In some embodiments, the combination of constraints comprises one or more constraints not derived from a reference target. In some embodiments, the one or more non-reference target-derived constraints describes a desired structural, dynamical, chemical, or functional characteristic, or any combinations thereof. In still further embodiments, one or more constraints is independently associated with a biological response or biological function.
  • At least a portion of the atoms in the engineered peptide associated with a biological response or biological function are topologically constrained to a secondary structural element in the reference target, such as a beta-sheet, or an alpha helix.
  • a method of selecting an engineered peptide comprising: identifying one or more topological characteristics of a reference target;
  • the overlap between each characteristic is independently less than or equal to 75% Mean Percentage Error (MPE) as determined by one or more of Total Topological Constraint Distance (TCD), topological clustering coefficient (TCC), Euclidean distance, power distance, Soergel distance, Canberra distance, Sorensen distance, Jaccard distance, Mahalanobis distance, Hamming distance, Quantitative Estimate of Likeness (QEL), or Chain Topology Parameter (CTP).
  • MPE Mean Percentage Error
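As a rough illustration of comparing a candidate with a reference by Mean Percentage Error, the sketch below builds pairwise atomic distance matrices and averages the per-entry percentage differences. The exact MPE formula used in the application is not given in this excerpt, so the signed textbook form is assumed, with an optional absolute variant; all function names are hypothetical.

```python
from math import dist


def distance_matrix(coords):
    """Pairwise Euclidean distances between (x, y, z) points."""
    return [[dist(a, b) for b in coords] for a in coords]


def mean_percentage_error(reference, candidate, absolute=False):
    """Mean percentage error of candidate entries relative to reference entries.

    Entries whose reference value is zero (e.g. the matrix diagonal) are skipped.
    With absolute=True the magnitudes are averaged instead (an assumed variant).
    """
    errors = []
    for ref_row, cand_row in zip(reference, candidate):
        for ref, cand in zip(ref_row, cand_row):
            if ref == 0:
                continue
            err = (cand - ref) / ref
            errors.append(abs(err) if absolute else err)
    if not errors:
        raise ValueError("no non-zero reference entries to compare")
    return 100.0 * sum(errors) / len(errors)
```

A candidate whose inter-atomic distances are all 10% larger than the reference would score an MPE of about 10%, which is within the 75% bound stated above.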
  • one or more constraints is derived from per-residue energy, per-residue interaction, per-residue fluctuation, per-residue atomic distance, per-residue chemical descriptor, per-residue solvent exposure, per-residue amino acid sequence similarity, per-residue bioinformatic descriptor, per-residue non-covalent bonding propensity, per-residue phi/psi angles, per-residue van der Waals radii, per-residue secondary structure propensity, per-residue amino acid adjacency, or per-residue amino acid contact.
  • the characteristics of one or more candidate peptides are determined by computer simulation.
  • composition comprising two or more selection steering polypeptides, wherein each polypeptide is independently a positive selection molecule comprising one or more positive steering characteristics, or a negative selection molecule comprising one or more negative steering characteristics, wherein each characteristic type is independently selected from the group consisting of: amino acid sequence, polypeptide secondary structure, molecular dynamics, chemical features, biological function, immunogenicity, reference target(s) multi-specificity, cross-species reference target reactivity, selectivity of desired reference target(s) over undesired reference target(s), selectivity of reference target(s) within a sequence and/or structurally homologous family, selectivity of reference target(s) with similar protein function, selectivity of distinct desired reference target(s) from a larger family of undesired targets with high sequence and/or structural homology, selectivity for distinct reference target alleles or mutations, selectivity for distinct reference target residue level chemical modifications, selectivity for cell type, selectivity for tissue type, selectivity for tissue environment,
  • At least one of the two or more polypeptides is a positive selection molecule, and at least one of the two or more polypeptides is a negative selection molecule. In some embodiments, at least one of the two or more polypeptides is a native protein. In certain embodiments, at least one pair of counterpart positive and negative selection molecules comprises at least one shared characteristic type, wherein the positive selection molecule comprises the positive characteristic and the negative selection molecule comprises the negative characteristic. In yet additional aspects, provided herein is a method of screening a library of binding molecules with a composition comprising two or more selection steering molecules as described herein, the method comprising subjecting a pool of candidate binding molecules to at least one round of selection, wherein each round of selection comprises:
  • the library of binding molecules is a phage library, or a cell library, such as a B-cell library or a T-cell library.
  • the method comprises two or more rounds of selection, or three or more rounds of selection. In certain embodiments, each round comprises a different set of selection molecules. In some embodiments, at least two rounds comprise the same negative selection molecule, or the same positive selection molecule, or both. In some embodiments, the method comprises analyzing the subset of the pool obtained from a round of selection prior to proceeding to the next round of selection.
  • the patent or application file contains at least one drawing executed in color.
  • FIG. 1 provides a schematic demonstrating construction of an exemplary combination of three spatially-associated topological constraints, for use in selecting an engineered peptide as described herein.
  • FIG. 2 provides a schematic of the steps involved in some exemplary methods of determining the reference-derived spatially-associated topological constraints and their use in selecting an engineered peptide (mesoscale molecule, MEM).
  • FIGS. 3A-3C provide schematics demonstrating the selection of a group of engineered peptides using the methods described herein.
  • FIG. 3A shows the extraction of spatially-associated topological information about an interface of interest in a reference, and use thereof in defining a topological constraint for use in selecting an engineered peptide.
  • FIG. 3B provides a schematic detailing the in silico screen step, demonstrating how mismatched candidates are discarded while candidates that match the topology are retained.
  • FIG. 3C presents the top 12 selected engineered peptide candidates identified.
  • FIGS. 4A-4B provide a second set of schematics demonstrating the selection of a different group of engineered peptides based on a different set of reference parameters, using the methods described herein.
  • FIG. 4A shows extraction of spatially-associated topological information and construction of a topology matrix.
  • FIG. 4B provides a list of the top 8 engineered peptide candidates selected by comparing candidates in silico to the topological constraints.
  • FIG. 5 is a schematic providing an overview of the design of an exemplary programmable in vitro selection using engineered peptides as described herein, and also using native proteins as positive (T) or negative (X) selection molecules.
  • FIGS. 6A-6H provide an overview of the selection of five engineered peptides, and their use in a programmable in vitro selection protocol for phage panning.
  • FIG. 6A demonstrates the selection of VEGF as the reference target, and identification of the portion of VEGF from which spatially-associated topological information was derived and used to construct a combination of spatially-associated topological constraints (Step 1). This combination was then used for in silico screening of candidate engineered peptides to identify positive selection molecules and negative selection molecules (Step 2). The selected candidates were further screened in silico for stabilizing cross-linking options.
  • FIG. 6B shows the analysis and identification of spatially-associated topological constraints based on the reference target (a portion of VEGF) to be used in selecting engineered peptides.
  • FIG. 6C, FIG. 6D, and FIG. 6E demonstrate the construction of a first, second, and third candidate engineered peptide, respectively, and derivation of the parameters to compare to the combination of constraints developed in FIG. 6B.
  • FIG. 6F lists the mean percentage error (MPE) for each MEM compared to the reference target, and their rank based on the MPE.
  • FIG. 6G shows how an additional set of constraints was added to the combination based on the reference target. In FIG. 6H, this additional set of constraints is used to evaluate candidate MEM 1. The MPE of this comparison was 36.6%.
  • FIG. 7A is a ribbon diagram of VEGF, with the reference section used to select engineered peptides indicated (R82-H90).
  • FIG. 7B shows ribbon diagrams of 5 candidate engineered peptides selected based on the constraints developed from the target reference in FIG. 7A. The sequences and root-mean-square inner product (RMSIP) values are listed in Table 1.
  • FIG. 7D shows the two eigenvectors that describe the two most dominant motions of the epitope in the reference target, with the x-, y-, and z-components of the ten Cα atoms in the epitope and the eigenvalues of the eigenvectors tabulated; structures show the projection of each Cα atom in the epitope along eigenvector 1 (arrows) and eigenvector 2 (arrows). Eigenvectors are orthonormal by definition.
  • FIG. 7E shows the eigenvectors describing the most dominant motion (mode) in the epitope of the reference target (left) and the MEM (right). Structures of the MEM superimposed on the epitope are shown along with the MEM variant ID and RMSIP.
  • FIG. 7F provides the eigenvectors describing the second most dominant motion (mode) in the epitope of the reference target (left) and the MEM (right). Structure of the MEM
  • FIG. 7G provides the structures of the reference target and the MEM with associated projections along the three most dominant motions (modes, eigenvectors 1-3) in relation to their location in the inner product matrix used to compute RMSIP.
  • the RMSIP equation used is shown for reference.
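The RMSIP equation itself appears only in the figure, so the sketch below uses the commonly cited definition as an assumption: the square root of the summed squared inner products between the top N modes of each structure, divided by N. Mode vectors are taken as orthonormal inputs, as stated for FIG. 7D; how they are obtained (e.g. by diagonalizing a coordinate covariance matrix) is outside this sketch, and the function names are hypothetical.

```python
from math import sqrt


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def rmsip(modes_a, modes_b):
    """Root-mean-square inner product between two sets of orthonormal modes.

    Assumed form: sqrt((1/N) * sum_i sum_j (a_i . b_j)^2), with N modes per set.
    Returns 1.0 for identical subspaces and 0.0 for orthogonal ones.
    """
    if len(modes_a) != len(modes_b) or not modes_a:
        raise ValueError("need two non-empty mode sets of equal size")
    total = sum(dot(u, v) ** 2 for u in modes_a for v in modes_b)
    return sqrt(total / len(modes_a))
```

Because the score depends only on the subspace spanned by the modes, rotating the modes within the same subspace leaves the RMSIP unchanged.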
  • FIG. 8 shows the structure ensembles and coordinate covariance matrices of the reference target (TOP) and the MEM (BOTTOM) generated from experimental data or computer simulation.
  • the epitope is the darker section on the upper right of the reference target.
  • FIG. 9 is an overview of an in vitro programmable selection design, using four engineered peptides (also called meso-scale engineered molecules, or MEMs) for positive or negative selection.
  • the atomic motion and topology scores of the MEMs are included for reference.
  • the sequences are provided as SEQ ID NOS: 1-4.
  • FIGS. 10A-10D are graphs of a binding biosensor assay using the different engineered peptides from FIG. 9 against Bevacizumab.
  • FIG. 11 is a description of eight different panning programs, seven including engineered peptides as one or more selection molecules, and an eighth program that uses conventional native proteins for selection.
  • a naive Hu scFv library was separately panned with each program.
  • FIGS. 12A and 12B are VEGF ELISA response graphs comparing the VEGF binding response against binding partners selected using the different panning programs described in FIG. 11.
  • MEM programmed in vitro selection does not significantly reduce full-length target binding propensity, with specific MEM program inputs, but not all inputs.
  • Horizontal bars indicate mean; significant difference between P12 and P7: p-value < 0.0001.
  • MEM programmed in vitro selection directs towards putative-epitope selective clones in a statistically significant manner.
  • Horizontal bars indicate mean; P12 vs. P6: p-value is 0.024; P12 vs. P9: p-value is 0.0004; P12 vs. P10: p-value is 0.049.
  • FIGS. 13A-13H are graphs demonstrating the binding of binding partners selected using the different panning programs described in FIG. 11 with the sMEM engineered peptide vs. VEGF (reference).
  • FIGS. 14A-14I are graphs demonstrating the binding of binding partners selected using the different panning programs described in FIG. 11 in a cross blocking assay of VEGF with dose-responsive competition with Bevacizumab (0 nM, 67 pM, 670 pM, 6.7 nM).
  • FIG. 15 is a graph of the distinct clones with confirmed cross-blocking
  • FIG. 16 is a summary of the binding, cross-blocking, CDR sequences and germline usage for all Fabs produced from the selection programs outlined in FIG. 11.
  • FIG. 17 and FIG. 18 are ELISA binding results for all of the Fabs listed in FIG. 17.
  • FIG. 19 shows the Bevacizumab blocking propensity score for random clones vs. those selected from the selection programs outlined in FIG. 11 (0 nM, 67 pM, 670 pM, 6.7 nM).
  • FIG. 20 summarizes the cross-blocking enrichment for a random-uniform selection of clones from across the panning programs described in FIG. 11.
  • FIG. 21 is a schematic showing how next-generation sequencing samples of the selected clones were prepared. Individual heavy and light chain sequences at constant portions of the expression vector were cloned out, using a 2 x 250 paired-end sequencing run. The ends were then joined and the reads annotated (e.g., using PyIg). The reads obtained from clones selected using each selection program are shown in the bar graphs.
  • FIG. 22 demonstrates a clonality analysis (number of distinct antibodies) of the different panning rounds, and normalized Shannon analysis.
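One common way to compute the two quantities named for FIG. 22 is sketched below: the number of distinct clones and the Shannon entropy of the clone frequency distribution, normalized by its maximum possible value. The application does not spell out its normalization in this excerpt, so dividing by the log of the number of distinct clones is an assumption, and the function name is hypothetical.

```python
from collections import Counter
from math import log


def clonality_summary(clone_ids):
    """Return (number of distinct clones, normalized Shannon entropy in [0, 1]).

    1.0 means every distinct clone is equally abundant; values near 0 mean the
    pool is dominated by a few clones. Pools with fewer than two distinct
    clones are assigned 0.0 by convention here.
    """
    counts = Counter(clone_ids)
    total = sum(counts.values())
    distinct = len(counts)
    entropy = -sum((n / total) * log(n / total) for n in counts.values())
    normalized = entropy / log(distinct) if distinct > 1 else 0.0
    return distinct, normalized
```

Each element of `clone_ids` would typically be an annotated antibody sequence identifier, one entry per sequencing read.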
  • FIG. 23 shows the clonality of the different screening programs described in FIG. 11
  • FIGS. 24A-24L are germline usage heatmaps and dimension reduction plots analyzing how the different screening rounds and programs, for round 1 (FIGS. 24A-24D), round 2 (FIGS. 24E-24H), and round 3 (FIGS. 24I-24L), shape diversity of the resulting selected pools.
  • FIGS. 25A-25B summarize the clones isolated from each selection program (S# in x-axis) and their binding to VEGF and the engineered peptide sMEM.
  • FIG. 26 is a summary of the rate of enrichment of unique mAb hits obtained from each round of each program that were confirmed to bind VEGF and cross-block Bevacizumab and which were not identified in the conventional panning not using engineered peptides (program 12).
  • FIG. 27 is a summary of the rate of enrichment of mAb hits obtained from the conventional panning program (12) which were confirmed to bind VEGF but which were not putative epitope-selective mAb hits.
  • FIG. 28 summarizes binding to sMEM or VEGF of different clones obtained from different panning programs.
  • FIG. 29 is a schematic overview of a second exemplary set of programmed in vitro selection protocols, targeting a proposed therapeutic epitope reference site on PD-L1.
  • FIG. 30 provides the modeled structure and peptide sequences of the three engineered peptides selected according to the schematic in FIG. 29. Sequences are provided as SEQ ID NOS: 5-7.
  • FIGS. 31A-31D are the atomic distance and amino acid descriptor matrices derived from the reference (FIG. 31A), and the engineered peptides sMEM (FIG. 31B), nMEM (FIG. 31C), and iMEM (FIG. 31D).
  • the mean percentage errors of the sMEM, nMEM, and iMEM topologies were 3.58%, 0.84%, and 19.3%, respectively.
  • FIGS. 31E-31G are biosensor binding graphs demonstrating the binding between the engineered peptides described in FIG. 30 with Avelumab.
  • the KD of nMEM binding with Avelumab was 43.4 µM.
  • FIGS. 32A-32C are biosensor binding graphs demonstrating the binding between the engineered peptides described in FIG. 30 with Durvalumab.
  • FIG. 33 is a summary of the different programmed in vitro selection panning programs using one or more of the engineered peptides described in FIG. 30, and a conventional panning method using native proteins (C1).
  • the engineered peptides sMEM, nMEM, and iMEM in FIG. 30 are sMEM #1, sMEM #5, and iMEM in FIG. 33.
  • FIG. 34 is a graph and summary of PD-L1 ELISA binding response for clones selected using each panning program described in FIG. 33.
  • FIG. 35 is a graph and summary of ELISA binding response against the sMEM #1 for clones selected using each panning program described in FIG. 33.
  • FIG. 36 is a graph and summary of ELISA binding response against the nMEM #5 for clones selected using each panning program described in FIG. 33.
  • FIG. 37 is a graph and summary of ELISA epitope selectivity response against PD-L1 and sMEM #1 for clones selected using each panning program described in FIG. 33.
  • FIG. 38 is a graph and summary of ELISA epitope selectivity response against PD-L1 and nMEM #5 for clones selected using each panning program described in FIG. 33.
  • FIGS. 39A-39U are diagrams comparing the different ELISA binding responses of FIGS. 34-38, demonstrating the selectivity of binding partners selected using the different programs.
  • FIG. 40 is a table summarizing the anti-PD-L1 panning ELISA hit identification criteria used to analyze clones obtained from the selection programs described in FIG. 33.
  • FIGS. 41A-41C are diagrams comparing the different ELISA binding responses to sMEM #1 and nMEM #5 compared to PD-L1 (FIGS. 41A and 41B, respectively), and sMEM #1 compared to nMEM #5 (FIG. 41C) for binding partners selected using the different panning programs described in FIG. 33.
  • FIGS. 42A-42F are diagrams comparing the different ELISA responses and confirmed Tx mAb X-blockers for all of the programs described in FIG. 33.
  • FIG. 43 summarizes the 23 distinct clones from the programs described in FIG. 33, as identified from cross-blocking hits and their sequences.
  • FIG. 44 is a chart of the confirmed cross-blocking distinct clones obtained from each of the programs described in FIG. 33.
  • FIG. 45A is a graph of the blocking propensity of randomly selected clones obtained from each of the programs described in FIG. 33. Blocking was evaluated as blocking by clones of binding of PD-L1 to Avelumab or Durvalumab. The blocking propensity was evaluated as ELISA Z-Score(sMEM1 + sMEM5 + PD-L1 - iMEM) + MAX(Avelumab Blocking Z-score, Durvalumab Blocking Z-score). FIGS. 45B and 45C summarize the blocking propensity of clones obtained from the different programs evaluated in FIG. 45A. The shaded entries in FIG. 45C were obtained using the conventional selection approach using native proteins.
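The blocking propensity expression quoted for FIG. 45A can be read in more than one way. The sketch below takes one plausible reading as an assumption: the combined ELISA signal (sMEM1 + sMEM5 + PD-L1 - iMEM) is Z-scored across the clones being compared, and the larger of the two pre-computed blocking Z-scores is added. The dictionary keys, the use of the population standard deviation, and the function names are all hypothetical.

```python
from statistics import mean, pstdev


def z_scores(values):
    """Z-score each value against the population mean and standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def blocking_propensity(clones):
    """Score each clone as Z(sMEM1 + sMEM5 + PD-L1 - iMEM) + max(blocking Z-scores).

    Each clone is a dict with raw ELISA signals under "sMEM1", "sMEM5", "PD-L1",
    "iMEM", and pre-computed blocking Z-scores under "avelumab_block_z" and
    "durvalumab_block_z" (key names are illustrative only).
    """
    combined = [c["sMEM1"] + c["sMEM5"] + c["PD-L1"] - c["iMEM"] for c in clones]
    elisa_z = z_scores(combined)
    return [
        z + max(c["avelumab_block_z"], c["durvalumab_block_z"])
        for z, c in zip(elisa_z, clones)
    ]
```

Subtracting the iMEM signal penalizes clones that also bind the negative-selection peptide, which matches the role the text assigns to negative selection molecules.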
  • FIG. 46 is a summary of the cross-blocking enrichment observed in pools of clones obtained using the programs described in FIG. 33, compared to the control (conventional approach).
  • FIG. 47 is an example of a topological matrix that can be used in the selection of an engineered peptide as described herein.
  • FIG. 48 is an example of a topological constraint chemical descriptor vector that can be used in the selection of an engineered peptide as described herein.
  • FIG. 49 is an exemplary Lx2 phi/psi matrix that can be used in the selection of an engineered peptide as described herein.
  • FIG. 50 is an exemplary SxSxM matrix for secondary structure interaction descriptors that can be used in the selection of an engineered peptide as described herein.
  • FIG. 51 is an exemplary diagram showing clusters and TCC vector for an exemplary engineered peptide that can be used in the selection of an engineered peptide as described herein.
  • FIG. 52 is an exemplary LxM topological constraint matrix that can be used in the selection of an engineered peptide as described herein.
  • FIG. 53 is an exemplary secondary structure index and lookup table that can be used in the selection of an engineered peptide as described herein.
  • FIG. 54 is another representation of the data obtained from the VEGF panning programs.
  • S1 refers to anti-VEGF Panning Program 6
  • S2 refers to anti-VEGF Panning Program 13
  • C is the conventional full length VEGF program.
  • FIG. 55 is another representation of the data provided in FIG. 24F. S1 refers to anti-VEGF Panning Program 6, S2 refers to anti-VEGF Panning Program 13, and C is the conventional full length VEGF program.
  • FIG. 56 is another representation of the data provided in FIG. 26. S1 refers to anti-VEGF Panning Program 6, S2 refers to anti-VEGF Panning Program 13, and C is the conventional full length VEGF program.
  • FIGS. 57A-57E are graphs of the VEGF (gray solid line) and cross-blocking (dotted line) binding data for selected on-epitope clones from programmed in vitro selection.
  • FIGS. 58A-58C are graphs of VEGF binding data for off-epitope selected clones from full length in vitro selection.
  • FIGS. 59A-59B summarize the CDR loop sequence diversity of antibody clone hits for anti-VEGF programmed in vitro selection (red) and conventional in vitro selection (gray).
  • FIG. 60 is a sequence alignment of clones selected using the programmable in vitro selection methods described herein, using exemplary engineered peptides as described herein.
  • the top row is an alignment of heavy chain sequences of the top five on-epitope clones selected across all programmed in vitro selection programs;
  • the second row is an alignment of heavy chain sequences of the top five off-epitope clones selected using a conventional approach, using VEGF and BSA as selection molecules;
  • the third row is an alignment of light chain sequences of the top five on-epitope clones selected across all programmed in vitro selection programs;
  • the bottom row is an alignment of light chain sequences of clones selected using the conventional approach with VEGF and BSA.
  • FIG. 61 is a schematic description of an exemplary method of engineered polypeptide design.
  • FIG. 62 is a schematic description of an exemplary method of using a machine learning model for engineered polypeptide design.
  • the engineered peptides of the present disclosure are between 1 kDa and 10 kDa, referred to herein as “meso-scale”.
  • Engineered peptides of this size may, in some embodiments, have certain advantages, such as protein-like functionality, a large theoretical space from which to select candidates, cell permeability, and/or structural and dynamical variability.
  • the methods provided herein comprise identifying a plurality of spatially-associated topological constraints, some of which may be derived from a reference target, constructing a combination of said constraints, comparing candidate peptides with said combination, and selecting a candidate that has constraints which overlap with the combination.
  • spatially-associated topological constraints different aspects of an engineered peptide can be included in the combination depending on the intended use, or desired function, or another desired characteristic. Further, not all constraints must, in some embodiments, be derived from a reference target.
  • the selected engineered peptides are not simply variations of a reference target (such as might be obtained through peptide mutagenesis or progressive modification of a single reference), but rather may have a different overall structure than the reference peptide, while still retaining desired functional characteristics and/or key substructures.
  • engineered peptides which include methods of programmable in vitro selection using one or more engineered peptides. Such selection may be used, for example, in the identification of antibodies.
  • methods of selecting an engineered peptide comprising: identifying one or more topological characteristics of a reference target;
  • the engineered peptides described herein are selected based on how closely they match a combination of spatially-associated topological constraints. This combination may also be described using the mathematical concept of a “tensor”. In such a combination (or tensor), each constraint is independently described in three-dimensional space (e.g., spatially-associated), and the combination of these constraints in three-dimensional space provides, for example, a representational “map” of different desired characteristics and their desired level (if applicable) relative to location. This map is not, in some embodiments, based on a linear or otherwise pre-determined amino acid backbone, and therefore can allow for flexibility in the structures that could fulfill the desired combination, as described.
  • the “map” includes a spatial area wherein the prescribed constraint limitations could be adequately met by two adjacent amino acids. In some embodiments, these amino acids could be directly bonded (e.g., two contiguous amino acids), while in other embodiments the amino acids are not directly bonded to each other but could be brought together in space by the folding of the peptide (e.g., are not contiguous amino acids).
  • the separate constraints themselves are also not necessarily based on structure, but could include, for example, chemical descriptors and/or functional descriptors.
  • constraints include structural descriptors, such as a desired secondary structure or amino acid residue.
  • each constraint is independently selected.
  • FIG. 1 is a schematic demonstrating the construction of a
  • the three constraints in FIG. 1 are sequence, nearest neighbor distance, and atomic motion, with nearest neighbor distance and atomic motion combined into one graphic. As shown, some constraints are mapped independent of the location of the backbone (e.g., atomic motion of certain side chains), therefore allowing for a much greater variety of structural configurations to be tried, compared to just varying one or more positions on a reference scaffold.
  • the three different constraints and their spatial descriptions are combined into a matrix (e.g., tensor), and then a series of candidate peptides can be compared with this combination to identify new engineered peptides which meet the desired criteria.
  • one or more additional non-reference derived constraints is also included in the combination. Comparison of candidate peptides with a defined combination may be done, for example, using in silico methods to evaluate the constraints of each candidate peptide against the desired
  • Said candidates which have the desired level of overlap with the prescribed combination may then be synthesized using standard peptide synthetic methods known to one of skill in the art, and evaluated.
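The in silico comparison of candidates against a combination of constraints can be illustrated with a minimal sketch. It is not the disclosed implementation: it assumes the combination is stored as a mapping from a grid position to per-constraint target values, and that a candidate overlaps at an entry when its value lies within a tolerance of the target. The channel names, the tolerance, the overlap threshold, and the functions `overlap_fraction` and `select_candidates` are all hypothetical.

```python
# Hypothetical constraint channels; any spatially-associated constraint could be used.
CHANNELS = ("hydrophobicity", "atomic_fluctuation", "net_charge")


def overlap_fraction(combination, candidate, tolerance=0.25):
    """Return the fraction of (position, channel) entries that the candidate matches.

    Both arguments map an (x, y, z) grid position to a dict of channel values.
    An entry matches when the candidate value is within `tolerance` of the target.
    Positions missing from the candidate count as mismatches.
    """
    matched = 0
    total = 0
    for position, targets in combination.items():
        values = candidate.get(position)
        for channel in CHANNELS:
            total += 1
            if values is not None and abs(values[channel] - targets[channel]) <= tolerance:
                matched += 1
    return matched / total if total else 0.0


def select_candidates(combination, candidates, min_overlap=0.8):
    """Keep the names of candidates whose overlap meets an assumed threshold."""
    return [
        name
        for name, candidate in candidates.items()
        if overlap_fraction(combination, candidate) >= min_overlap
    ]


# Toy data, invented purely for illustration.
combination = {
    (0, 0, 0): {"hydrophobicity": 0.9, "atomic_fluctuation": 0.2, "net_charge": 0.0},
    (1, 0, 0): {"hydrophobicity": 0.1, "atomic_fluctuation": 0.6, "net_charge": 1.0},
}
candidates = {
    "peptide_a": {
        (0, 0, 0): {"hydrophobicity": 0.8, "atomic_fluctuation": 0.3, "net_charge": 0.0},
        (1, 0, 0): {"hydrophobicity": 0.2, "atomic_fluctuation": 0.5, "net_charge": 1.0},
    },
    "peptide_b": {
        (0, 0, 0): {"hydrophobicity": 0.1, "atomic_fluctuation": 0.9, "net_charge": -1.0},
    },
}
selected = select_candidates(combination, candidates)
```

Because the mapping is keyed by position rather than by residue index, the sketch does not presuppose a particular backbone, which mirrors the flexibility described for the map of constraints.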
  • the combination of constraints comprises at least 3, at least 4, at least 5, at least 6, at least 7, at least 8, at least 9, at least 10, at least 11, at least 12, between 3 to 12, between 3 to 10, between 3 to 8, between 3 to 6, or 3, or 4, or 5, or 6 independently selected spatially-associated topological constraints.
  • One or more of the constraints is derived from a reference target.
  • each of the constraints is derived from a reference target.
  • at least one constraint is derived from a reference target, and the remaining constraints are not derived from the reference target.
  • between 1 and 9 constraints, between 1 and 7 constraints, between 1 and 5 constraints, or between 1 and 3 constraints are derived from a reference target, and between 1 and 9 constraints, between 1 and 7 constraints, between 1 and 5 constraints, or between 1 and 3 constraints are not derived from the reference target.
  • a series of candidate peptides is compared to said combination to identify one or more new engineered peptides which meet the desired criteria.
  • at least 5, at least 10, at least 15, at least 20, at least 25, at least 30, at least 40, at least 50, at least 60, at least 70, at least 80, at least 90, at least 100, at least 125, at least 150, at least 175, at least 200, or at least 250 or more candidate peptides are compared to the combination to identify one or more new engineered peptides which meet the desired criteria.
  • more than 250 candidate peptides, more than 300 candidate peptides, more than 400 candidate peptides, more than 500 candidate peptides, more than 600 candidate peptides, or more than 750 candidate peptides are compared, for example.
  • topological characteristic simulations are used to evaluate the topological characteristic overlap, if any, of a candidate peptide compared to the combination of constraints.
  • one or more candidate peptides are also compared to the reference target, and overlap, if any, of candidate peptide topological characteristics with reference target topological characteristics is evaluated.
  • the engineered peptide is identified from a
  • the spatially-associated topological constraints used to construct the desired combination may each be independently selected from a wide group of possible characteristics. These may include, for example, constraints describing structural, dynamical, chemical, or functional characteristics, or any combinations thereof.
  • Structural constraints may include, for example, atomic distance, amino acid sequence similarity, solvent exposure, phi angle, psi angle, secondary structure, or amino acid contact, or any combinations thereof.
  • Dynamical constraints may include, for example, atomic fluctuation, atomic energy, van der Waals radii, amino acid adjacency, or non-covalent bonding propensity.
  • Atomic energy may include, for example, pairwise attractive energy between two atoms, pairwise repulsive energy between two atoms, atom-level solvation energy, pairwise charged attraction energy between two atoms, pairwise hydrogen bonding attraction energy between two atoms, or non-covalent bonding energy, or any combinations thereof.
  • Chemical characteristics may include, for example, chemical descriptors.
  • chemical descriptors may include, for example, hydrophobicity, polarity, atomic volume, atomic radius, net charge, logP, HPLC retention, van der Waals radii, charge patterns, or H-bonding patterns, or any combinations thereof.
  • Bioinformatic descriptors may include, for example, BLOSUM similarity, pKa, zScale, Cruciani Properties, Kidera Factors, VHSE-scale, ProtFP, MS-WHIM scores, T-scale, ST-scale, Transmembrane tendency, protein buried area, helix propensity, sheet propensity, coil propensity, turn propensity, immunogenic propensity, antibody epitope occurrence, and/or protein interface occurrence, or any combinations thereof.
  • designing the constraints incorporates information about per-residue energy, per-residue interaction, per-residue fluctuation, per-residue atomic distance, per-residue chemical descriptor, per-residue solvent exposure, per-residue amino acid sequence similarity, per-residue bioinformatic descriptor, per-residue non-covalent bonding propensity, per-residue phi/psi angles, per-residue van der Waals radii, per-residue secondary structure propensity, per-residue amino acid adjacency, or per-residue amino acid contact.
  • these characteristics are used for a subset of the total residues in the reference target, or a subset of the total residues of the total combination of constraints, or a combination thereof.
  • one or more different characteristics are used for one or more different residues. That is, in some embodiments, one or more characteristics are used for a subset of residues, and at least one different characteristic is used for a different subset of residues.
  • one or more of said characteristics used to design one or more constraints is determined by computer simulation. Suitable computer simulation methods may include, for example, molecular dynamics simulations, Monte Carlo simulations, coarse-grained simulations, Gaussian network models, machine learning, or any combinations thereof.
  • multiple constraints are selected from one category.
  • the combination comprises two or more constraints that are independently a type of biological response.
  • two or more constraints are independently a type of secondary structure.
  • two or more constraints are independently a type of chemical descriptor.
  • the combination comprises no overlapping categories of constraints.
  • one or more constraints is independently associated with a biological response or biological function.
  • said constraint is a spatially defined atom(s)-level constraint, or spatially defined shape/area/volume-level constraint (such as a characteristic shape/area/volume that can be satisfied by several different atomic compositions), or a spatially defined dynamic-level constraint (such as a characteristic dynamic or set of dynamics that can be satisfied by several different atomic compositions).
  • one or more constraints is derived from a protein structure or peptide structure associated with a biological function or biological response.
  • one or more constraints is derived from an extracellular domain, such as a G protein-coupled receptor (GPCR) extracellular domain, or an ion channel extracellular domain.
  • one or more constraints is derived from a protein-protein interface junction.
  • one or more constraints is derived from a protein-peptide interface junction, such as MHC-peptide or GPCR-peptide interfaces.
  • the atoms or amino acids constrained to such a protein or peptide structure are atoms or amino acids associated with a biological function or biological response.
  • the atoms or amino acids in the engineered peptide constrained to such a protein or peptide structure are atoms or amino acids derived from a reference target.
  • one or more constraints is derived from a polymorphic region of a reference target (e.g., a region subject to allelic variation between individuals).
  • the biological response or biological function is selected from the group consisting of gene expression, metabolic activity, protein expression, cell proliferation, cell death, cytokine secretion, kinase activity, epigenetic modification, cell killing activity, inflammatory signals, chemotaxis, tissue infiltration, immune cell lineage commitment, tissue microenvironment modification, immune synapse formation, IL-2 secretion, IL-10 secretion, growth factor secretion, interferon gamma secretion, transforming growth factor beta secretion, immunoreceptor tyrosine-based activation motif activity, immunoreceptor tyrosine-based inhibition motif activity, antibody directed cell cytotoxicity, complement directed cytotoxicity, biological pathway agonism, biological pathway antagonism, biological pathway redirection, kinase cascade modification, proteolytic pathway modification, proteostasis pathway modification, protein folding/ pathways, post-translational modification pathways, metabolic pathways, gene transcription/translation, mRNA
  • the one or more atoms associated with a biological function or biological response are selected from the group consisting of carbon, oxygen, nitrogen, hydrogen, sulfur, phosphorus, sodium, potassium, zinc, manganese, magnesium, copper, iron, molybdenum, and nickel.
  • the atoms are selected from the group consisting of oxygen, nitrogen, sulfur, and hydrogen.
  • one of the constraints is one or more amino acids associated with a biological function or biological response
  • the engineered peptide comprises one or more amino acids associated with a biological function or biological response
  • the one or more amino acids are independently selected from the group consisting of the 20 proteinogenic naturally occurring amino acids, non-proteinogenic naturally occurring amino acids, and non-natural amino acids.
  • the non-natural amino acids are chemically synthesized.
  • the one or more amino acids are selected from the 20 proteinogenic naturally occurring amino acids.
  • the one or more amino acids are selected from the non-proteinogenic naturally occurring amino acids.
  • the one or more amino acids are selected from non-natural amino acids.
  • the one or more amino acids are selected from a combination of 20 proteinogenic naturally occurring amino acids, non-proteinogenic naturally occurring amino acids, and non-natural amino acids.
  • while the combination of constraints used to select an engineered peptide as described herein comprises at least one constraint derived from a reference target, in some embodiments one or more constraints of the combination are not derived from a reference target. Thus, in certain embodiments, the selected engineered peptide comprises one or more characteristics that are not shared with the reference target.
  • one or more constraints derived from the reference target and used in the combination describes the inverse of the characteristic as observed in the reference target.
  • a reference target may have a certain pattern of positive charge
  • a constraint related to charge is derived from said reference target
  • the derived constraint describes a similar pattern but of neutral charge, or of negative charge.
  • one or more inverse constraints are derived from the reference target and included in the combination. Such inverse constraints may be useful, for example, in selecting engineered peptides as control molecules for certain assays or panning methods, or as negative selection molecules in the programmable in vitro selection methods described herein.
  • the combination of spatially-defined topological constraints comprises one or more non-reference derived topological constraints.
  • the one or more non-reference derived topological constraints enforces or stabilizes one or more secondary structural elements, enforces atomic fluctuations, alters peptide total hydrophobicity, alters peptide solubility, alters peptide total charge, enables detection in a labeled or label-free assay, enables detection in an in vitro assay, enables detection in an in vivo assay, enables capture from a complex mixture, enables enzymatic processing, enables cell membrane permeability, enables binding to a secondary target, or alters immunogenicity.
  • the one or more non-reference derived topological constraints constrains one or more atoms or amino acids in the combination of constraints (or
  • the combination of constraints includes a secondary structure that was derived from the reference target, and the combination of constraints also comprises a constraint that stabilizes the secondary structural element (e.g., through additional hydrogen bonding, or hydrophobic interactions, or side chain stacking, or a salt bridge, or a disulfide bond), wherein the stabilizing constraint is not present in the reference target.
  • the combination of constraints comprises one or more atoms or amino acids that were derived from the reference target, and the combination of constraints also includes a constraint that enforces atomic fluctuations in at least a portion of the atoms or amino acids derived from the reference target, wherein the constraint is not present in the reference target.
  • one or more non-reference derived constraints is an inverse constraint.
  • two combinations of constraints are constructed to select engineered peptides with inverse characteristics.
  • a first combination of constraints will comprise one or more constraints derived from the reference target, and one or more constraints not derived from the reference target; and a second combination of constraints will comprise the same one or more constraints derived from the reference target, and the inverse of one or more of the non-reference target constraints of the first combination.
  • any suitable reference target may be used to derive one or more spatially-associated topological constraints for use in the methods provided herein.
  • the reference target is a full-length native protein.
  • the reference target is a portion of a full-length native protein.
  • the reference target is a non-native protein, or portion thereof.
  • the reference target is a cell-surface receptor, or a transmembrane protein, or a signaling protein, or a multiprotein complex, or a protein-peptide complex, or a portion thereof.
  • the reference target is a portion of a protein of interest, wherein the protein of interest is involved in a disease process in an organism, such as a human.
  • the protein of interest is involved in the growth or metastasis of cancer, or in an inflammatory disorder, and the reference target is a portion of said protein of interest that is a putative epitope.
  • the methods provided herein may be used to select one or more engineered peptides that may serve as an immunogen, and may be used to raise antibodies against a protein of interest.
  • examples of proteins of interest include PD-1, PD-L1, CD25, IL2, MIF, CXCR4, and VEGF.
  • the reference target is PD-1, PD-L1, CD25, IL2, MIF, CXCR4, or VEGF, or a portion thereof, such as an epitope.
  • the methods provided herein may be used to select one or more engineered peptides that are immunogens, and which may be used to raise one or more antibodies that specifically bind to the protein from which the reference target is derived.
  • the methods provided herein may be used to select one or more engineered peptides which in turn may be used to select one or more binding partners of a protein of interest, such as an antibody or a Fab-displaying phage.
  • the one or more constraints are determined by molecular simulation (e.g. molecular dynamics), or laboratory measurement (e.g. NMR), or a combination thereof.
  • engineered peptide candidates are, in some embodiments, generated using a computational protein design (e.g., Rosetta). In some embodiments, other methods of sampling peptide space are used. Dynamics simulations may then be carried out on the candidate engineered peptides to obtain the parameters of constraints that have been selected.
  • a covariance matrix of atomic fluctuations is generated for the reference target, covariance matrices are generated for the residues in each of the candidate engineered peptides, and these covariance matrices are compared to determine overlap.
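A covariance matrix of this kind can be computed directly from simulation snapshots. The following is a minimal sketch, assuming each snapshot has already been superimposed on a common frame and flattened into one coordinate vector (for example, the x, y, z coordinates of each CA atom). The function name `covariance_matrix`, the toy trajectory, and the choice to divide by the number of frames are illustrative assumptions.

```python
def covariance_matrix(frames):
    """Covariance of coordinate fluctuations across simulation frames.

    `frames` is a list of equal-length coordinate vectors, one per snapshot.
    Element (i, j) is the mean over frames of the product of the deviations
    of coordinates i and j from their respective mean values.
    Dividing by the number of frames (population convention) is an assumption.
    """
    n_frames = len(frames)
    n_coords = len(frames[0])
    means = [sum(frame[k] for frame in frames) / n_frames for k in range(n_coords)]
    # Fluctuation of every coordinate about its mean, frame by frame.
    deviations = [[frame[k] - means[k] for k in range(n_coords)] for frame in frames]
    return [
        [
            sum(dev[i] * dev[j] for dev in deviations) / n_frames
            for j in range(n_coords)
        ]
        for i in range(n_coords)
    ]


# Toy trajectory: three frames, two coordinates that move together.
frames = [[0.0, 1.0], [1.0, 2.0], [2.0, 3.0]]
cov = covariance_matrix(frames)
```

In the toy trajectory both coordinates shift by the same amount in every frame, so the off-diagonal covariance equals the diagonal variance, which is the signature of fully correlated motion.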
  • Principal component analysis is performed to compute the eigenvectors and eigenvalues for each covariance matrix - one covariance matrix for the reference target and one covariance matrix for each of the candidate engineered peptides - and those eigenvectors with the largest eigenvalues are retained.
  • the eigenvectors describe the most, second-most, third-most, N-most dominant motion observed in a set of simulated molecular structures. Without wishing to be bound by any theory, if a candidate engineered peptide moves like the reference target, its eigenvectors will be similar to the eigenvectors of the reference target. The similarity of eigenvectors corresponds to their components (a 3D vector centered on each CA atom) being aligned, pointing in the same direction. Exemplary eigenvector comparisons between a reference target and a candidate engineered peptide are shown in FIGS. 7D-7G.
  • this similarity between candidate engineered peptide and reference target eigenvectors is computed using the inner product of two eigenvectors.
  • the inner product value is 0 if two eigenvectors are 90 degrees to each other or 1 if the two eigenvectors point precisely in the same direction.
  • the inner product between every pair of eigenvectors, one from a candidate engineered peptide and one from the reference target, is computed. This results in a matrix of inner products, the dimensions of which are determined by the number of eigenvectors analyzed. For example, for 10 eigenvectors, the matrix of inner products is 10 by 10. This matrix of inner products can be distilled into a single value by computing the root-mean-square value of the 100 (if 10 by 10) inner products. This is the root mean square inner product (RMSIP). The equation for RMSIP is shown in FIG. 7G. From this comparison, one or more candidate engineered peptides that have similarity with the defined combination of constraints are selected.
e. Additional Steps
  • selection of one or more engineered peptides comprises one or more additional steps.
  • an engineered peptide candidate is selected based on similarity to the defined combination of spatially-associated topological constraints, as described herein, and then undergoes one or more analyses to determine one or more additional characteristics, and one or more structural adjustments to impart or enforce said desired characteristics.
  • the selected candidate is analyzed, such as through molecular dynamics simulations, to determine overall stability of the molecule and/or propensity for a particular folded structure.
  • one or more modifications are made to the engineered peptide to impart or reinforce a desired level of stability, or a desired propensity for a desired folded structure. Such modifications may include, for example, the installation of one or more cross-links (such as a disulfide bond), salt bridges, hydrogen bonding interactions, or hydrophobic interactions, or any combinations thereof.
  • the methods provided herein may further comprise assaying one or more selected engineered peptides for one or more desired characteristics, such as desired binding interactions or activity. Any suitable assay may be used, as appropriate to measure the desired characteristic.
  • engineered peptides such as engineered peptides selected through the methods described herein.
  • the engineered peptide has a molecular mass between 1 kDa and 10 kDa, and comprises up to 50 amino acids.
  • the engineered peptide has a molecular mass between 2 kDa and 10 kDa, between 3 kDa and 10 kDa, between 4 kDa and 10 kDa, between 5 kDa and 10 kDa, between 6 kDa and 10 kDa, between 7 kDa and 10 kDa, between 8 kDa and 10 kDa, between 9 kDa and 10 kDa, between 1 kDa and 9 kDa, between 1 kDa and 8 kDa, between 1 kDa and 7 kDa, between 1 kDa and 6 kDa, between 1 kDa and 5 kDa, between 1 kDa and 4 kDa, between 1 kDa and 3 kDa, or between 1 kDa and 2 kDa.
  • the engineered peptide comprises up to 45 amino acids, up to 40 amino acids, up to 35 amino acids, up to 30 amino acids, up to 25 amino acids, up to 20 amino acids, at least 5 amino acids, at least 10 amino acids, at least 15 amino acids, at least 20 amino acids, at least 25 amino acids, at least 30 amino acids, at least 35 amino acids, or at least 40 amino acids.
  • the engineered peptide comprises a combination of spatially-associated topological constraints, wherein one or more of the constraints is a reference target-derived constraint. Any constraints described herein may be used in the combination, in some embodiments. In still further embodiments, between 10% to 98% of the amino acids of the engineered peptide meet the one or more reference target-derived constraints (e.g., if the engineered peptide comprises 50 amino acids, between 5 to 49 amino acids meet the one or more reference target-derived constraints).
  • the one or more amino acids that meet the one or more reference target-derived constraints have less than 8.0 Å, less than 7.5 Å, less than 7.0 Å, less than 6.5 Å, less than 6.0 Å, less than 5.5 Å, or less than 5.0 Å backbone root-mean-square deviation (RMSD) structural homology with the reference target.
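A backbone root-mean-square deviation of this kind can be sketched as follows. This is a simplified illustration that assumes the two sets of backbone atoms are matched atom for atom and have already been superimposed; an optimal superposition step is deliberately omitted. The function names and the toy coordinates are hypothetical.

```python
from math import sqrt


def backbone_rmsd(coords_a, coords_b):
    """RMSD, in angstroms, between two equal-length lists of (x, y, z) positions.

    Assumption: the structures are already superimposed and atoms are paired
    by index, so no rotation or translation is applied here.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and the same length")
    squared = sum(
        (ax - bx) ** 2 + (ay - by) ** 2 + (az - bz) ** 2
        for (ax, ay, az), (bx, by, bz) in zip(coords_a, coords_b)
    )
    return sqrt(squared / len(coords_a))


def meets_rmsd_cutoff(coords_a, coords_b, cutoff=8.0):
    """True when the RMSD is below the cutoff (8.0 angstroms by default)."""
    return backbone_rmsd(coords_a, coords_b) < cutoff


# Toy coordinates, invented for illustration only.
reference = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0)]
candidate = [(0.0, 3.0, 0.0), (3.8, 4.0, 0.0)]
rmsd = backbone_rmsd(reference, candidate)
```

The `cutoff` argument can be lowered to any of the tighter thresholds listed above, such as 5.0.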
  • the engineered peptide has a molecular mass of between 1 kDa and 10 kDa; comprises up to 50 amino acids; comprises a combination of spatially-associated topological constraints, wherein one or more of the constraints is a reference target-derived constraint; between 10% to 98% of the amino acids of the engineered peptide meet the one or more reference target-derived constraints; and the amino acids that meet the one or more reference target-derived constraints have less than 8.0 Å backbone root-mean-square deviation (RMSD) structural homology with the reference target.
  • the amino acids of the engineered peptide that meet the one or more reference target-derived constraints have between 10% and 90% sequence homology, between 20% and 90% sequence homology, between 30% and 90% sequence homology, between 40% and 90% sequence homology, between 50% and 90% sequence homology, between 60% and 90% sequence homology, between 70% and 90% sequence homology, or between 80% and 90% sequence homology with the reference target.
  • the amino acids that meet the one or more reference target-derived constraints have a van der Waals surface area overlap with the reference of between 30 Å² to 3000 Å², or between 100 Å² to 3000 Å², or between 250 Å² to 3000 Å², or between 500 Å² to 3000 Å², or between 750 Å² to 3000 Å², or between 1000 Å² to 3000 Å², or between 1250 Å² to 3000 Å², or between 1500 Å² to 3000 Å², or between 1750 Å² to 3000 Å², or between 2000 Å² to 3000 Å², or between 2250 Å² to 3000 Å², or between 2500 Å² to 3000 Å², or between 2750 Å² to 3000 Å².
  • the combination of constraints that the engineered peptide meets may comprise two or more, three or more, four or more, five or more, six or more, or seven or more reference target-derived constraints.
  • the combination may comprise one or more constraints not derived from the reference target, as described elsewhere in the present disclosure.
  • These reference-derived constraints, and non-reference derived constraints if present, may independently be any of the constraints described herein, such as any of the structural, dynamical, chemical, or functional characteristics described herein, or any combinations thereof.
  • the engineered peptide comprises at least one structural difference when compared to the reference target.
  • structural differences may include, for example, a difference in the sequence, number of amino acid residues, total number of atoms, total hydrophilicity, total hydrophobicity, total positive charge, total negative charge, one or more secondary structures, shape factor, Zernike descriptors, van der Waals surface, structure graph nodes and edges, volumetric surface, electrostatic potential surface, hydrophobic potential surface, local diameter, local surface features, skeleton model, charge density, hydrophilic density, surface to volume ratio, amphiphilicity density, or surface roughness, or any combinations thereof.
  • the difference in one or more characteristics is at least 10%, at least 20%, at least 30%, at least 40%, at least 50%, at least 60%, at least 70%, at least 80%, at least 90%, at least 100%, or greater than 100% when compared to the characteristic in the reference target, as applicable to the type of characteristic.
  • the difference is the total number of atoms, and the engineered peptide has at least 10%, at least 20%, or at least 30% more atoms than the reference target, or at least 10%, at least 20%, or at least 30% fewer atoms than the reference target.
  • the difference is in total positive charge, and the total positive charge of the engineered peptide is at least 10%, at least 20%, at least 30%, at least 40%, or at least 50% larger (e.g., more positive) than the reference target, while in other embodiments the total positive charge of the engineered peptide is at least 10%, at least 20%, at least 30%, at least 40%, or at least 50% smaller (e.g., less positive) than the reference target.
  • the combination of spatially-defined topological constraints includes one or more secondary structural elements not present in the reference target.
  • the engineered peptide comprises one or more secondary structural elements that are not present in the reference target.
  • the combination and/or engineered peptide comprises one secondary structural element, two secondary structural elements, three secondary structural elements, four secondary structural elements, or more than four secondary structural elements not found in the reference target.
  • each secondary structural element is independently selected from the group consisting of helices, sheets, loops, turns, and coils.
  • each secondary structural element not present in the reference target is independently an α-helix, β-bridge, β-strand, 3₁₀ helix, π-helix, turn, loop, or coil.
  • the engineered peptide comprises one or more atoms, or one or more amino acids, or a combination thereof, that is associated with a biological response or a biological function.
  • the biological response or biological function is selected from the group consisting of gene expression, metabolic activity, protein expression, cell proliferation, cell death, cytokine secretion, kinase activity, epigenetic modification, cell killing activity, inflammatory signals, chemotaxis, tissue infiltration, immune cell lineage commitment, tissue microenvironment modification, immune synapse formation, IL-2 secretion, IL-10 secretion, growth factor secretion, interferon gamma secretion, transforming growth factor beta secretion, immunoreceptor tyrosine-based activation motif activity, immunoreceptor tyrosine-based inhibition motif activity, antibody directed cell cytotoxicity, complement directed cytotoxicity, biological pathway agonism, biological pathway antagonism, biological pathway redirection, kinase cascade modification, proteo
  • mRNA degradation pathways, gene methylation/acetylation pathways, histone modification pathways, epigenetic pathways, immune directed clearance, opsonization, hormone signaling, integrin pathways, membrane protein signal transduction, ion channel flux, and G protein-coupled receptor response.
  • the reference target comprises one or more atoms associated with a biological response or a biological function (such as one described herein);
  • the engineered peptide comprises one or more atoms associated with a biological response or a biological function (such as one described herein); and the atomic fluctuations of said atoms in the engineered peptide overlap with the atomic fluctuations of said atoms in the reference target.
  • the atoms themselves are different atoms, but their atomic fluctuations overlap.
  • the atoms are the same atoms, and their atomic fluctuations overlap.
  • the atoms are
  • the overlap is a root mean square inner product (RMSIP) greater than 0.25.
  • the overlap is a RMSIP greater than 0.3, greater than 0.35, greater than 0.4, greater than 0.45, greater than 0.5, greater than 0.55, greater than 0.6, greater than 0.65, greater than 0.7, greater than 0.75, greater than 0.8, greater than 0.85, greater than 0.9, or greater than 0.95.
  • the RMSIP is calculated by:
  • n is the eigenvector of the engineered peptide topological constraints
  • v is the eigenvector of the reference target topological constraints
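The RMSIP calculation can be sketched in a few lines. This sketch assumes the widely used normalization in which the sum of squared inner products is divided by the number N of retained eigenvectors, so that two identical sets of orthonormal eigenvectors give a value of 1 and mutually orthogonal sets give 0. That normalization, the function names, and the toy vectors are assumptions for illustration, not a reproduction of the disclosed equation.

```python
from math import sqrt


def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def rmsip(peptide_vectors, target_vectors):
    """RMSIP = sqrt((1/N) * sum_i sum_j (n_i . v_j)^2).

    n_i: unit-length eigenvectors of the engineered peptide.
    v_j: unit-length eigenvectors of the reference target.
    N:   number of eigenvectors retained in each set (assumed equal).
    """
    count = len(peptide_vectors)
    if count == 0 or count != len(target_vectors):
        raise ValueError("both sets must hold the same non-zero number of eigenvectors")
    total = sum(
        dot(n_i, v_j) ** 2 for n_i in peptide_vectors for v_j in target_vectors
    )
    return sqrt(total / count)


# Toy orthonormal vectors in four dimensions, invented for illustration.
e1, e2 = [1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0]
e3, e4 = [0.0, 0.0, 1.0, 0.0], [0.0, 0.0, 0.0, 1.0]

same_motion = rmsip([e1, e2], [e1, e2])      # identical subspaces
partial_motion = rmsip([e1, e3], [e1, e2])   # one shared direction
no_shared_motion = rmsip([e1, e2], [e3, e4]) # orthogonal subspaces

# Overlap test using the 0.25 threshold stated above.
overlaps = partial_motion > 0.25
```

Because each inner product is squared, the arbitrary sign of an eigenvector does not affect the result.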
  • the engineered peptide comprises atoms or amino acids (or combination thereof) associated with a biological response or biological function, and at least a portion of said atoms or amino acids or combination is derived from a reference target, and certain constraints of the set of atoms or amino acids in the engineered peptide and the set in the reference target can be described by a matrix.
  • the matrix is an LxL matrix.
  • the matrix is an SxSxM matrix.
  • the matrix is an Lx2 phi/psi angle matrix
  • the atomic fluctuations of the atoms or amino acids in the engineered peptide that are associated with a biological response or biological function are described by an LxL matrix; a portion of said atoms or amino acids are derived from the reference target; and the atomic fluctuations in the reference target of said portion are described by an LxL matrix.
  • the adjacency of each set is described by corresponding LxL matrices.
  • the mean percentage error (MPE) across all matrix elements (i, j) of the engineered peptide LxL atomic fluctuation or adjacency matrix is less than or equal to 75% relative to the corresponding (i, j) elements in the reference target atomic fluctuation or adjacency matrix, for the fraction of the engineered peptide derived from the reference target.
  • the MPE is less than 70%, less than 65%, less than 60%, less than 55%, less than 50%, less than 45%, or less than 40% relative to the corresponding elements in the reference target matrix, for the fraction of the engineered peptide derived from the reference target.
  • L is the number of amino acid positions and the (i, j) value in the atomic fluctuation matrix element is the sum of intra-molecular atomic fluctuations for the i-th and j-th amino acid respectively if the (i, j) atomic distance is less than or equal to 7 Å, or zero if the (i, j) atomic distance is greater than 7 Å or if (i, j) is on the diagonal.
  • the atomic distance can serve as a weighting factor for the atomic fluctuation matrix element (i, j) instead of a 0 or 1 multiplier.
  • the i-th and j-th atomic fluctuations and distances can be determined by molecular simulation (e.g. molecular dynamics) and/or laboratory measurement (e.g. NMR).
  • L is the number of amino acid positions and the value in adjacency matrix element (i, j) is the intra-molecular atomic distance between the i-th and j-th amino acid respectively if the atomic distance is less than or equal to 7 Å, or zero if the atomic distance is greater than 7 Å or if (i, j) is on the diagonal.
  • the atomic distance can serve as a weighting factor for the adjacency matrix element (i, j) instead of a 0 or 1 multiplier.
  • the i-th and j-th atomic distances could be determined by molecular simulation (e.g. molecular dynamics) and/or laboratory measurement (e.g. NMR).
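  • The LxL adjacency and atomic fluctuation matrices described above can be built as follows. This is a minimal sketch under stated assumptions: each amino acid is represented by a single coordinate (for example a C-alpha position), each amino acid has one scalar fluctuation value, and the function names and toy data are hypothetical.

```python
import math

CUTOFF = 7.0  # interaction threshold in angstroms, as stated in the text

def adjacency_matrix(coords, cutoff=CUTOFF):
    """L x L matrix holding the pairwise distance where it is <= cutoff.

    Elements beyond the cutoff and on the diagonal are zero.
    """
    n = len(coords)
    mat = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            d = math.dist(coords[i], coords[j])
            if d <= cutoff:
                mat[i][j] = d
    return mat

def fluctuation_matrix(coords, fluct, cutoff=CUTOFF):
    """L x L matrix holding fluct[i] + fluct[j] for pairs within the cutoff.

    Elements beyond the cutoff and on the diagonal are zero.
    """
    n = len(coords)
    mat = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j and math.dist(coords[i], coords[j]) <= cutoff:
                mat[i][j] = fluct[i] + fluct[j]
    return mat

# Hypothetical four-residue chain laid out along the x axis.
coords = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (20.0, 0.0, 0.0)]
fluct = [0.5, 0.7, 0.9, 1.2]

adj = adjacency_matrix(coords)
flu = fluctuation_matrix(coords, fluct)
```

  • In this toy chain, residues 0 and 1 are 3.8 Å apart and therefore interact, whereas residues 0 and 2 are 7.6 Å apart and contribute zero to both matrices.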
  • the atoms or amino acids associated with a response or function in the engineered peptide have a topological constraint chemical descriptor vector and a mean percentage error (MPE) less than 75% relative to the reference described by the same chemical descriptor, for the fraction of the engineered peptide derived from the reference target, wherein each i-th element in the chemical descriptor vector corresponds to an amino acid position index.
  • the MPE is less than 70%, less than 65%, less than 60%, less than 55%, less than 50%, less than 45%, or less than 40% relative to the reference described by the same chemical descriptor, for the fraction of the engineered peptide derived from the reference target.
  • An exemplary vector is presented in FIG. 48.
  • the matrix is an Lx2 phi/psi angle matrix
  • the atoms or amino acids associated with a response or function in the engineered peptide have an MPE less than 75% with respect to the reference phi/psi angles matrix in the fraction of the engineered peptide derived from the reference target, wherein L is the number of amino acid positions and phi, psi values are in dimensions (L,1) and (L,2) respectively.
  • the MPE is less than 70%, less than 65%, less than 60%, less than 55%, less than 50%, less than 45%, or less than 40% with respect to the reference phi/psi angles matrix in the fraction of the engineered peptide derived from the reference target.
  • the phi/psi values are determined by molecular simulation (e.g. molecular dynamics), knowledge-based structure prediction, or laboratory measurement (e.g. NMR).
  • An exemplary Lx2 phi/psi matrix is shown in FIG. 49.
  • the matrix is an SxSxM secondary structural element interaction matrix
  • the atoms or amino acids associated with a response or function in the engineered peptide have less than 75% mean percentage error (MPE) relative to the reference secondary structural element relationship matrix, in the fraction of the engineered peptide derived from the reference target, where S is the number of secondary structural elements and M is the number of interaction descriptors.
  • the MPE is less than 70%, less than 65%, less than 60%, less than 55%, less than 50%, less than 45%, or less than 40% relative to the reference secondary structural element relationship matrix, in the fraction of the engineered peptide derived from the reference target.
  • Interaction descriptors may include, for example, hydrogen bonding, hydrophobic packing, van der Waals interaction, ionic interaction, covalent bridge, chirality, orientation, or distance, or any combinations thereof.
  • Mean Percentage Error (MPE) for different matrices as described herein may be calculated by:
  • n is the topological constraint vector or matrix position index for the engineered peptide (eng_n) and the corresponding reference (ref_n), summed up to vector or matrix position n.
  • An example of a topological matrix is provided in FIG. 47.
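  • The MPE comparison described above can be sketched as follows. This is a minimal sketch that assumes the absolute-value form MPE = (100/n) Σ |eng_n - ref_n| / |ref_n|, applied to flattened vectors or matrices. Skipping zero-valued reference elements (which would otherwise divide by zero) is a further assumption, as are the function names and the toy phi/psi values.

```python
def flatten(matrix):
    """Flatten a nested list (matrix) into a single list of elements."""
    return [x for row in matrix for x in row]

def mean_percentage_error(eng, ref):
    """MPE in percent between two equal-length constraint vectors.

    Assumption: elements whose reference value is zero are skipped.
    """
    if len(eng) != len(ref):
        raise ValueError("engineered and reference vectors must match in length")
    terms = [abs(e - r) / abs(r) for e, r in zip(eng, ref) if r != 0]
    if not terms:
        raise ValueError("reference has no non-zero elements")
    return 100.0 * sum(terms) / len(terms)

# Hypothetical Lx2 phi/psi matrices (degrees) for L = 3 positions.
ref_phi_psi = [[-60.0, -45.0], [-120.0, 130.0], [-75.0, 150.0]]
eng_phi_psi = [[-66.0, -45.0], [-120.0, 117.0], [-60.0, 150.0]]

mpe = mean_percentage_error(flatten(eng_phi_psi), flatten(ref_phi_psi))
print(f"MPE = {mpe:.2f}%")
```

  • The same function applies to any of the matrices described herein once they are flattened in a consistent order.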
  • the engineered peptide has an MPE of less than 75% compared to the reference target. In certain embodiments, the engineered peptide has an MPE of less than 70%, less than 65%, less than 60%, less than 55%, less than 50%, less than 45%, or less than 40% compared to the reference target. In some embodiments, the MPE is determined by Total Topological Constraint Distance (TCD), topological clustering coefficient (TCC), Euclidean distance, power distance, Soergel distance, Canberra distance, Sorensen distance, Jaccard distance, Mahalanobis distance, Hamming distance, Quantitative Estimate of Likeness (QEL), or Chain Topology Parameter (CTP).
  • the engineered peptide is topologically constrained to one or more secondary structural elements.
  • the atoms or amino acids associated with a biological response or biological function in the engineered peptide are topologically constrained to one or more secondary structural elements.
  • the secondary structural element is independently a sheet, helix, turn, loop, or coil.
  • the secondary structural element is independently an α-helix, β-bridge, β-strand, 3₁₀ helix, π-helix, turn, loop, or coil.
  • one or more of the secondary structural elements to which at least a portion of the engineered peptide is topologically constrained is present in the reference target.
  • At least a portion of the engineered peptide is topologically constrained to a combination of secondary structural elements, wherein each element is independently selected from the group consisting of sheet, helix, turn, loop, and coil. In still further embodiments, each element is independently selected from the group consisting of an α-helix, β-bridge, β-strand, 3₁₀ helix, π-helix, turn, loop, and coil.
  • the secondary structural element is a parallel or anti-parallel sheet.
  • a sheet secondary structure comprises greater than or equal to 2 residues.
  • a sheet secondary structure comprises less than or equal to 50 residues.
  • a sheet secondary structure comprises between 2 and 50 residues. Sheets can be parallel or anti-parallel.
  • a parallel sheet secondary structure may be described as having two strands i, j in a parallel arrangement (N-termini of the i and j strands in the same orientation), and a pattern of hydrogen bonding of residues i:j-1, i:j+1.
  • an anti-parallel sheet secondary structure may be described as having two strands i, j in an anti-parallel arrangement (N-termini of the i and j strands in opposing orientation), and a pattern of hydrogen bonding of residues i:j.
  • the orientation and hydrogen bonding of strands can be determined by knowledge-based or molecular dynamics simulation and/or laboratory measurement.
  • the secondary structural element is a helix. Helices may be right or left handed. In some embodiments, the helix has a residue per turn (residues/turn) value of between 2.5 and 6.0, and a pitch between 3.0 Å and 9.0 Å. In some embodiments, the residues/turn and pitch are determined by knowledge-based or molecular dynamics simulation and/or laboratory measurement.
  • the secondary structural element is a turn.
  • a turn comprises between 2 and 7 residues, and 1 or more inter-residue hydrogen bonds.
  • the turn comprises 2, 3, or 4 inter-residue hydrogen bonds.
  • the turn is determined by knowledge-based or molecular dynamics simulation and/or laboratory measurement.
  • the secondary structural element is a coil.
  • the coil comprises between 2 and 20 residues and zero predicted inter-residue hydrogen bonds. In some embodiments, these coil parameters are determined by knowledge-based or molecular dynamics simulation and/or laboratory measurement.
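  • The numeric ranges given above for sheets, helices, turns, and coils can be collected into simple checks. This is a minimal sketch that assumes the stated ranges are inclusive. The function names are hypothetical, and the α-helix values used in the example (3.6 residues per turn, 5.4 Å pitch) are the commonly cited textbook figures rather than values from the specification.

```python
def is_sheet(n_residues):
    """Sheet: between 2 and 50 residues (inclusive bounds assumed)."""
    return 2 <= n_residues <= 50

def is_helix(residues_per_turn, pitch_angstrom):
    """Helix: 2.5 to 6.0 residues/turn and a pitch of 3.0 to 9.0 angstroms."""
    return 2.5 <= residues_per_turn <= 6.0 and 3.0 <= pitch_angstrom <= 9.0

def is_turn(n_residues, h_bonds):
    """Turn: 2 to 7 residues with at least one inter-residue hydrogen bond."""
    return 2 <= n_residues <= 7 and h_bonds >= 1

def is_coil(n_residues, predicted_h_bonds):
    """Coil: 2 to 20 residues with zero predicted inter-residue hydrogen bonds."""
    return 2 <= n_residues <= 20 and predicted_h_bonds == 0

print(is_helix(3.6, 5.4))   # textbook alpha-helix geometry
print(is_turn(4, 1))        # four-residue turn with one hydrogen bond
print(is_coil(12, 2))       # hydrogen bonds present, so not a coil
```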
  • the engineered peptide comprises one or more atoms or amino acids derived from the reference target, wherein said atoms or amino acids have a secondary structure. In some embodiments, these atoms or amino acids are associated with a biological response or biological function.
  • the secondary structure motif vector of the atoms or amino acids in the engineered peptide has a cosine similarity greater than 0.25 relative to the reference target secondary structure motif vector for the fraction of the engineered peptide derived from the reference target, wherein the length of the vector is the number of secondary structure motifs and the value at the i-th vector position defines the identity of the secondary structure motif (e.g. helix, sheet) derived from a lookup table.
  • each motif comprises two or more amino acids.
  • motifs include, for example, α-helix, β-bridge, β-strand, 3₁₀ helix, π-helix, turn, and loop.
  • the cosine similarity is greater than 0.3, greater than 0.35, greater than 0.4, greater than 0.45, or greater than 0.5 relative to the reference target secondary structure motif vector for the fraction of the engineered peptide derived from the reference target.
  • An exemplary secondary structure index and lookup table is provided in FIG. 53. Cosine similarity may be calculated by:
  • A is the peptide vector of secondary structure motif identifiers
  • B is the reference vector of secondary structure motif identifiers
  • n is the length of the secondary structure motif vector
  • i is the index of the i-th secondary structure motif.
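  • With A, B, n, and i as defined above, the standard cosine similarity is Σ A_i B_i / (sqrt(Σ A_i²) · sqrt(Σ B_i²)), with sums over i = 1..n. The sketch below computes it for two motif-identifier vectors. The lookup table `MOTIF_ID` is a hypothetical stand-in and is not the table of FIG. 53.

```python
import math

# Hypothetical lookup table mapping motif names to numeric identifiers.
MOTIF_ID = {"alpha-helix": 1, "beta-strand": 2, "turn": 3, "loop": 4}

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length numeric vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)

def motif_vector(motifs):
    """Translate a list of motif names into identifiers via the lookup table."""
    return [MOTIF_ID[name] for name in motifs]

reference = motif_vector(["alpha-helix", "turn", "beta-strand"])
peptide = motif_vector(["alpha-helix", "turn", "loop"])

print(cosine_similarity(reference, reference))
print(cosine_similarity(peptide, reference))
```

  • Because the identifiers are arbitrary labels, the numeric result depends on how the lookup table assigns them, so the table choice is part of the method.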
  • one or more atoms or amino acids of the engineered peptide which are derived from the reference target can be compared to the corresponding reference target atoms or amino acids using a total topological constraint distance (TCD).
  • the total TCD of said engineered peptide atoms or amino acids derived from the reference target is +/- 75% relative to the TCD of the corresponding atoms in the reference target, wherein two intra-molecule topological constraints are interacting if their pairwise distance is less than or equal to 7 Å.
  • the atoms or amino acids in the engineered peptide being compared are associated with a biological function or biological response.
  • the i-th, j-th pairwise distance of two atoms or amino acids can, in some embodiments, be determined by molecular simulation (e.g. molecular dynamics) and/or laboratory measurement (e.g. NMR).
  • An exemplary equation for calculating total topological constraint distance (TCD) is:
  • i, j are the intra-molecular position indices for amino acids (i, j)
  • S_ij is the difference between constraints S(i) and S(j)
  • Δ(i,j) = 1 if amino acids (i, j) are within the 7 Å interaction threshold (and 0 otherwise)
  • L is the number of amino acid positions in the peptide or the corresponding reference target.
  • Δ(i,j) can serve as a weighting factor for the S_ij difference instead of a 0 or 1 multiplier.
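  • One plausible reading of the definitions above can be sketched as follows. The sketch assumes the total TCD is the sum of Δ(i,j) · |S(i) - S(j)| over all pairs i < j up to L, with Δ(i,j) taken from a 7 Å distance cutoff. This summation form, the absence of any normalization, the use of one coordinate and one scalar constraint per amino acid, and all function names are assumptions rather than the specification's equation.

```python
import math

def total_tcd(constraints, coords, cutoff=7.0):
    """Sum |S(i) - S(j)| over interacting pairs (assumed form of the TCD).

    constraints: one scalar constraint value S(i) per amino acid position.
    coords: one (x, y, z) coordinate per amino acid position.
    """
    n = len(constraints)
    total = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            # Delta(i, j) is 1 within the cutoff and 0 beyond it.
            if math.dist(coords[i], coords[j]) <= cutoff:
                total += abs(constraints[i] - constraints[j])
    return total

def within_tolerance(value, reference, tol=0.75):
    """True if value lies within +/- tol (fractional) of the reference."""
    return abs(value - reference) <= tol * abs(reference)

# Hypothetical three-residue example; residues 0 and 2 are 7.6 angstroms apart.
coords = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
ref_tcd = total_tcd([1.0, 2.0, 4.0], coords)
eng_tcd = total_tcd([1.0, 3.0, 3.5], coords)

print(ref_tcd, eng_tcd, within_tolerance(eng_tcd, ref_tcd))
```

  • Replacing the 0 or 1 test with a distance-dependent weight would implement the weighting-factor variant mentioned above.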
  • one or more atoms or amino acids of the engineered peptide which are derived from the reference target can be compared to the corresponding reference target atoms or amino acids using a chain topology parameter (CTP).
  • the CTP of said engineered peptide atoms or amino acids is +/- 50% relative to the CTP of the corresponding atoms or amino acids in the reference target, wherein intra-chain topological interaction is a pairwise distance less than or equal to 7 Å.
  • the atoms or amino acids in the engineered peptide being compared are associated with a biological function or biological response.
  • the i-th, j-th pairwise distance can be determined by molecular simulation (e.g. molecular dynamics) and/or laboratory measurement (e.g. NMR).
  • An exemplary equation for evaluating CTP is:
  • Δ(i,j) can serve as a weighting factor for the S_ij difference instead of a 0 or 1 multiplier.
  • one or more atoms or amino acids of the engineered peptide which are derived from the reference target can be compared to the corresponding reference target atoms or amino acids using a quantitative estimate of likeness (QEL).
  • the QEL of said engineered peptide atoms or amino acids is +/- 50% relative to the QEL of the corresponding atoms or amino acids in the reference target.
  • the atoms or amino acids in the engineered peptide being compared are associated with a biological function or biological response.
  • one or more atoms or amino acids of the engineered peptide which are derived from the reference target can be compared to the corresponding reference target atoms or amino acids using a topological clustering coefficient (TCC) vector and a mean percentage error (MPE).
  • the MPE of the TCC vector is less than 75% relative to the TCC vector of the corresponding atoms or amino acids in the reference target, wherein each element (i) of the vector is a topological clustering coefficient for the i-th amino acid position, intra-molecule clusters are defined by an interacting edge distance less than or equal to 7 Å, and two edges: i-j, j-l from the i-th amino acid position.
  • the atoms or amino acids in the engineered peptide being compared are associated with a biological function or biological response.
  • the i-th, j-th and l-th edge distances can be determined by molecular simulation (e.g. molecular dynamics) and/or laboratory measurement (e.g. NMR).
  • An exemplary equation for evaluating the topological clustering coefficient for the i th position is:
  • S_ijl is the combination (e.g. sum) of topological constraints for the i-th, j-th and l-th amino acid
  • L is the number of amino acid positions in the peptide vector or corresponding reference target vector
  • N_c is the number of intra-molecular interacting amino acid positions for the i-th amino acid, meeting the 7 Å edge threshold and two edges: i-j, j-l from the i-th amino acid.
  • Δ(i,j), Δ(i,l) and Δ(j,l) can serve as weighting factors for the clustering coefficient vector element (i) instead of a 0 or 1 multiplier.
  • one or more atoms or amino acids of the engineered peptide which are derived from the reference target can be compared to the corresponding reference target atoms or amino acids using an LxM topological constraint matrix and mean percentage error (MPE) of: Euclidean distance, power distance, Soergel distance, Canberra distance, Sorensen distance, Jaccard distance, Mahalanobis distance, or Hamming distance across all M-dimensions.
  • LxM matrix element (l, m) contains the m-th constraint value for the l-th amino acid position, wherein L is the number of amino acid positions and M is the number of distinct topological constraints.
  • the MPE of the engineered peptide LxM matrix is less than 75% relative to the matrix of the corresponding reference target atoms or amino acids. In some embodiments, the MPE is less than 70%, less than 65%, less than 60%, less than 55%, less than 50%, or less than 45%. In some embodiments, the atoms or amino acids in the engineered peptide being compared are associated with a biological function or biological response.
  • An exemplary LxM matrix is provided in FIG. 52.
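  • Several of the distance measures named above can be computed position by position over two LxM constraint matrices. This is a minimal sketch using the usual textbook definitions of the Euclidean, Canberra, Soergel, and Hamming distances for non-negative values. Applying each metric row by row (one row per amino acid position), the handling of zero denominators, the function names, and the toy matrices are all assumptions.

```python
import math

def euclidean(a, b):
    """Square root of the summed squared differences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def canberra(a, b):
    """Sum of |x - y| / (|x| + |y|); terms with a zero denominator are skipped."""
    return sum(abs(x - y) / (abs(x) + abs(y))
               for x, y in zip(a, b) if abs(x) + abs(y) > 0)

def soergel(a, b):
    """Sum of |x - y| divided by the sum of max(x, y); 0 if that sum is zero."""
    denom = sum(max(x, y) for x, y in zip(a, b))
    if denom == 0:
        return 0.0
    return sum(abs(x - y) for x, y in zip(a, b)) / denom

def hamming(a, b):
    """Number of positions at which the two vectors differ."""
    return sum(1 for x, y in zip(a, b) if x != y)

def rowwise(metric, eng, ref):
    """Apply a metric to each pair of corresponding rows (amino acid positions)."""
    return [metric(e, r) for e, r in zip(eng, ref)]

# Hypothetical 2x3 matrices: L = 2 positions, M = 3 constraints.
ref = [[1.0, 2.0, 0.0], [3.0, 1.0, 1.0]]
eng = [[1.0, 4.0, 0.0], [3.0, 1.0, 1.0]]

for metric in (euclidean, canberra, soergel, hamming):
    print(metric.__name__, rowwise(metric, eng, ref))
```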
  • the methods include subjecting a pool of candidate binding molecules to at least one round of selection, wherein each round comprises at least one negative selection step wherein at least a portion of the pool is screened against a negative selection molecule, and at least one positive selection step wherein at least a portion of the pool is screened against a positive selection molecule.
  • the method comprises at least two rounds, at least three rounds, at least four rounds, at least five rounds, at least six rounds, at least seven rounds, at least eight rounds, at least nine rounds, at least ten rounds, or more, wherein each round independently comprises at least one negative selection step and at least one positive selection step.
  • each round independently comprises more than one negative selection step, or more than one positive selection step, or a combination thereof.
  • FIG. 5 provides an exemplary schematic detailing three rounds of selection, wherein the first and third round comprise more than one negative selection step, and the first round further comprises more than one positive selection step. As shown in the scheme, two negative selection molecules (“baits”) are used in the first round, and three negative selection molecules are used in the third round. In addition, two positive selection molecules are used in the first round.
  • each negative and positive selection molecule is independently chosen.
  • the same negative selection molecule, or the same positive selection molecule, or a combination thereof may be used in more than one round.
  • the same negative selection molecules used in round 1 are used again in round 3, with an additional third negative selection molecule also included in round 3.
  • the order of negative and positive selection steps may be, in certain embodiments, independently chosen within each round of selection.
  • the method comprises one or more rounds of selection, wherein each round comprises first a negative selection step, and then a positive selection step.
  • the method comprises one or more rounds of selection, wherein each round comprises first a positive selection step, and then a negative selection step.
  • the method comprises one or more rounds of selection, wherein each round independently comprises a negative selection step and a positive selection step, wherein in each round the negative selection step is independently before the positive selection step or after the positive selection step.
  • Such methods of selection use positive (+) and negative (-) steps to steer the library of candidate binding molecules towards and away from certain desired characteristics, such as binding specificity or binding affinity.
  • the pool of candidates can be directed in a stepwise manner to select for characteristics that are desirable and against characteristics that are undesirable.
  • the order of each step within each round, and the order of the rounds relative to each other can direct the selection in different directions.
  • a method comprising one round with (+) selection followed by (-) selection will result in a different final pool of candidates than if (-) selection is first, followed by (+) selection. Extrapolating this out to methods comprising multiple rounds, the order of selection steps may result in a different final pool of selected candidates even if the same positive and negative selection molecules are used overall.
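  • The effect of step order on the final pool can be illustrated with a toy model. This is a minimal sketch under stated assumptions: the positive step keeps a fixed number of the strongest binders to the positive selection molecule, the negative step discards candidates whose signal on the negative selection molecule exceeds a cutoff, and all candidate names, signals, and parameters are hypothetical.

```python
# Hypothetical candidates: name -> (signal on positive bait, signal on negative bait)
CANDIDATES = {
    "A": (0.95, 0.80),
    "B": (0.90, 0.10),
    "C": (0.70, 0.20),
    "D": (0.40, 0.05),
}

def positive_step(pool, keep=2):
    """Keep the `keep` strongest binders to the positive selection molecule."""
    ranked = sorted(pool, key=lambda name: CANDIDATES[name][0], reverse=True)
    return set(ranked[:keep])

def negative_step(pool, max_signal=0.5):
    """Discard candidates that bind the negative selection molecule too strongly."""
    return {name for name in pool if CANDIDATES[name][1] <= max_signal}

def run_round(pool, steps):
    """Apply the selection steps of one round in the given order."""
    for step in steps:
        pool = step(pool)
    return pool

start = set(CANDIDATES)
plus_then_minus = run_round(start, [positive_step, negative_step])
minus_then_plus = run_round(start, [negative_step, positive_step])

print(sorted(plus_then_minus))  # ['B']
print(sorted(minus_then_plus))  # ['B', 'C']
```

  • In this model, running (+) first lets the cross-reactive candidate A occupy one of the two retained slots before it is removed, so candidate C is lost. Running (-) first removes A early and C is retained.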
  • a selection molecule is used that has an inverse characteristic of another selection molecule. This may be useful, for example, to ensure that the candidate binding partners identified using the positive selection molecule (or excluded because of a negative selection molecule) were identified (or excluded) because of a desired trait (or undesired trait), not because of a separate, unrelated binding interaction.
  • an inverse selection molecule can be used that has similar or the same structure and characteristics as the selection molecule, except for the residues/structures conveying the desired trait (or undesired trait).
  • an inverse negative selection molecule may be used that has replaced the residues providing that charge pattern with uncharged residues, and/or residues of the opposite charge.
  • multiple different corresponding inverse selection molecules may be possible.
  • At least one of the selection molecules is an engineered peptide as described herein. In some embodiments, more than one engineered peptide is used. In some embodiments, each engineered peptide is independently a positive or negative selection molecule. In certain embodiments, each selection molecule used in the one or more rounds of selection is independently an engineered peptide. In other words,
  • At least one molecule that is not an engineered peptide is used as a selection molecule.
  • selection molecules that are not engineered peptides may comprise, for example, a naturally-occurring polypeptide, or a portion thereof.
  • one or more selection molecules that are not engineered peptides may comprise, for example, a non-naturally occurring polypeptide or portion thereof.
  • one or more selection molecules e.g., positive selection molecule or negative selection molecule
  • one or more selection molecules is PD-1, PD-L1, CD25, IL2, MIF, CXCR4, or VEGF, or a portion of any of these, or an antibody to any of these (such as Bevacizumab, Avelumab, or Durvalumab).
  • the positive and negative characteristics being selected for or against in each step may be selected from a variety of traits, and may be tailored depending on the desired features of the final one or more binding molecules obtained. Such desired features may depend, for example, on the intended use of the one or more binding molecules.
  • the methods provided herein are used to screen antibody candidates for one or more positive characteristics such as high specificity, and against one or more negative characteristics such as cross-reactivity. It should be understood that what is considered a positive characteristic in one context might be a negative characteristic in another context, and vice versa.
  • a positive selection molecule in one series of selection rounds may, in some embodiments, be a negative selection molecule in a different series of selection rounds, or in selecting a different type of binding molecule, or in selecting the same type of binding molecule but for a different purpose.
  • each selection characteristic is independently selected from the group consisting of amino acid sequence, polypeptide secondary structure, molecular dynamics, chemical features, biological function, immunogenicity, reference target(s) multi-specificity, cross-species reference target reactivity, selectivity of desired reference target(s) over undesired reference target(s), selectivity of reference target(s) within a sequence and/or structurally homologous family, selectivity of reference target(s) with similar protein function, selectivity of distinct desired reference target(s) from a larger family of undesired targets with high sequence and/or structural homology, selectivity for distinct reference target alleles or mutations, selectivity for distinct reference target residue level chemical modifications, selectivity for cell type, selectivity for tissue type, selectivity for tissue environment, tolerance to reference target(s) structural diversity, tolerance to reference target(s) sequence diversity, and tolerance to reference target(s) dynamics diversity.
  • each selection characteristic is a different type of selection characteristic.
  • two or more selection characteristics are different characteristics but of the same type.
  • two or more selection characteristics are polypeptide secondary structure, wherein one is a positive selection for a desired polypeptide secondary structure and one is a negative selection for an undesired polypeptide secondary structure.
  • two or more selection characteristics are selectivity for cell type, wherein a positive selection characteristic is selectivity for a specific desired cell type, and a negative selection characteristic is selectivity for a specific undesired cell type.
  • two or more, three or more, four or more, five or more, or six or more selection characteristics are of the same type.
  • composition comprising two or more selection steering polypeptides, wherein each polypeptide is independently a positive selection molecule comprising one or more positive steering characteristics, or a negative selection molecule comprising one or more negative steering characteristics.
  • Such characteristics may, in some embodiments, be selected from the group consisting of amino acid sequence, polypeptide secondary structure, molecular dynamics, chemical features, biological function, immunogenicity, reference target(s) multi-specificity, cross-species reference target reactivity, selectivity of desired reference target(s) over undesired reference target(s), selectivity of reference target(s) within a sequence and/or structurally homologous family, selectivity of reference target(s) with similar protein function, selectivity of distinct desired reference target(s) from a larger family of undesired targets with high sequence and/or structural homology, selectivity for distinct reference target alleles or mutations, selectivity for distinct reference target residue level chemical modifications, selectivity for cell type, selectivity for tissue type, selectivity for tissue environment, tolerance to reference target(s) structural diversity, tolerance to reference target(s) sequence diversity, and tolerance to reference target(s) dynamics diversity.
  • each round of selection comprises: a negative selection step of screening at least a portion of the pool against a negative selection molecule; and a positive selection step of screening at least a portion of the pool for a positive selection molecule; wherein the order of selection steps within each round, and the order of rounds, result in the selection of a different subset of the pool than an alternative order.
  • the binding partners being evaluated using the composition of selection steering polypeptides as described herein, or the methods of screening as described herein are a phage library, for example a Fab-containing phage library; or a cell library, for example a B-cell library or a T-cell library.
  • the methods comprise two or more, three or more, four or more, five or more, six or more, or seven or more rounds of selection. In some embodiments, wherein there is more than one round, each round comprises a different set of selection molecules. In other embodiments, wherein there is more than one round, at least two rounds comprise the same negative selection molecule, the same positive selection molecule, or both.
  • the method comprises analyzing the subset of the pool prior to proceeding to the next round of selection.
  • each subset pool analysis is independently selected from the group consisting of peptide/protein biosensor binding, peptide/protein ELISA, peptide library binding, cell extract binding, cell surface binding, cell activity assay, cell proliferation assay, cell death assay, enzyme activity assay, gene expression profile, protein modification assay, Western blot, and immunohistochemistry.
  • gene expression profile comprises full sequence repertoire analysis of the subset pool, such as next-generation sequencing.
  • statistical and/or informatic scoring, or machine learning training is used to evaluate one or more subsets of the pool in one or more selection rounds.
  • the identity and/or order of positive and/or negative selection molecules for a subsequent round is determined by analyzing a subset pool from one selection round. In some embodiments, statistical and/or informatic scoring, or machine learning training, is used to evaluate one or more subsets of the pool in one or more selection rounds to determine the identity and/or order of the positive and/or negative selection molecules for a subsequent round (such as the next round, or a round further along in the program).
  • the methods of selection include modifying a subset pool obtained from a selection round before proceeding to the next selection round.
  • modifications may include, for example, genetic mutation of the subset pool, genetic depletion of the subset pool (e.g., selecting a subset of the subset pool to move forward in selection), genetic enrichment of the subset pool (e.g., increasing the size of the pool), chemical modification of at least a portion of the subset pool, or enzymatic modification of at least a portion of the subset pool, or any combinations thereof.
  • statistical and/or informatic scoring, or machine learning training is used to evaluate a subset pool and determine the one or more modifications to make prior to moving the modified subset pool forward in selection.
  • such statistical and/or informatic scoring, or machine learning training is also used to determine the identity and/or order of positive and/or negative selection molecules for a subsequent round of selection.
  • binding is directly evaluated, for example by directly detecting a label on the binding partner.
  • labels may include, for example, fluorescent labels, such as a fluorophore or a fluorescent protein.
  • binding is indirectly evaluated, for example using a sandwich assay. In a sandwich assay, a binding partner binds to the selection molecule, and then a secondary labeled reagent is added to label the bound binding partner. This secondary labeled reagent is then detected.
  • sandwich assay components include His-tagged-binding partner detected with an anti-His-tag antibody or His-tag-specific fluorescent probe; a biotin-labeled binding partner detected with labeled streptavidin or labeled avidin; or an unlabeled binding partner detected with an anti-binding-partner antibody.
  • the binding partners being selected in each step are identified based on the binding signal, or dose-response, using any number of available detection methods. These detection methods may include, for example, imaging, fluorescence-activated cell sorting (FACS), mass spectrometry, or biosensors.
  • a hit threshold is defined (for example the median signal), and any candidate with a signal above that threshold is flagged as a putative hit motif.
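  • The median-based hit threshold described above can be sketched as follows. This is a minimal sketch: the function name `flag_hits`, the clone names, and the signal values are hypothetical, and a caller-supplied threshold can be passed in place of the median.

```python
from statistics import median

def flag_hits(signals, threshold=None):
    """Return the names whose signal is strictly above the hit threshold.

    signals: mapping of candidate name -> measured binding signal.
    threshold: explicit cutoff; defaults to the median of all signals.
    """
    if threshold is None:
        threshold = median(signals.values())
    return {name for name, value in signals.items() if value > threshold}

# Hypothetical binding signals (arbitrary units).
signals = {
    "clone-1": 120.0,
    "clone-2": 950.0,
    "clone-3": 300.0,
    "clone-4": 45.0,
    "clone-5": 610.0,
}

print(sorted(flag_hits(signals)))  # ['clone-2', 'clone-5']
```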
  • the engineered peptides provided herein, and identified by the methods provided herein, may be used, for example, to produce one or more antibodies.
  • the antibody is a monoclonal or polyclonal antibody.
  • provided herein is an antibody produced by immunizing an animal with an immunogen, wherein the immunogen is an engineered peptide as provided herein.
  • the animal is a human, a rabbit, a mouse, a hamster, a monkey, etc.
  • the monkey is a cynomolgus monkey, a macaque monkey, or a rhesus macaque monkey.
  • Immunizing the animal with an engineered peptide can comprise, for example, administering at least one dose of a composition comprising the peptide and optionally an adjuvant to the animal.
  • generating the antibody from an animal comprises isolating a B cell which expresses the antibody.
  • Some embodiments further comprise fusing the B cell with a myeloma cell to create a hybridoma which expresses the antibody.
  • the antibody generated using the engineered peptide can cross react with a human and a monkey, for example a cynomolgus monkey.
  • Embodiment 1-1 An engineered peptide, wherein the engineered peptide has a molecular mass of between 1 kDa and 10 kDa and comprises up to 50 amino acids, and wherein the engineered peptide comprises: a combination of spatially-associated topological constraints, wherein one or more of the constraints is a reference target-derived constraint; and
  • wherein the amino acids that meet the one or more reference target-derived constraints have less than 8.0 Å backbone root-mean-square deviation (RMSD) relative to the reference target.
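As a rough illustration of the backbone RMSD criterion in Embodiment 1-1, the sketch below computes the root-mean-square deviation between two matched sets of backbone atom coordinates and compares it with the 8.0 cutoff. It assumes the two structures have already been superimposed and that atoms are paired in order; the superposition step itself is omitted, and the coordinates and the name `backbone_rmsd` are hypothetical.

```python
import math

def backbone_rmsd(reference, candidate):
    """RMSD between paired (x, y, z) coordinates, in the unit of the inputs."""
    if len(reference) != len(candidate) or not reference:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum(
        (xr - xc) ** 2 + (yr - yc) ** 2 + (zr - zc) ** 2
        for (xr, yr, zr), (xc, yc, zc) in zip(reference, candidate)
    )
    return math.sqrt(squared / len(reference))

# Hypothetical, pre-aligned backbone coordinates (angstroms).
reference_backbone = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.0, 0.0)]
candidate_backbone = [(0.0, 0.0, 1.0), (1.5, 0.0, 1.0), (3.0, 0.0, 1.0)]

rmsd = backbone_rmsd(reference_backbone, candidate_backbone)
meets_cutoff = rmsd < 8.0  # cutoff taken from Embodiment 1-1
print(rmsd, meets_cutoff)
```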
  • Embodiment 1-2 The engineered peptide of embodiment 1-1, wherein the amino acids that meet the one or more reference target-derived constraints have between 10% and 90% sequence homology with the reference target.
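One simple way to check a 10% to 90% window like the one in Embodiment 1-2 is percent identity over aligned positions. The sketch below uses identity as a stand-in for sequence homology, which is an assumption (the text does not define how homology is scored), and it expects two pre-aligned sequences of equal length with `-` marking gaps. The sequences are hypothetical.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent of non-gap aligned columns where the residues are identical."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b) if a != "-" and b != "-"]
    if not columns:
        raise ValueError("no aligned positions to compare")
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)

# Hypothetical aligned fragments of a reference target and an engineered peptide.
reference_fragment = "ACDEFGHIKL"
peptide_fragment = "ACDQFGHLKL"

identity = percent_identity(reference_fragment, peptide_fragment)
within_window = 10.0 <= identity <= 90.0
print(identity, within_window)
```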
  • Embodiment 1-3 The engineered peptide of embodiment 1-1 or 1-2, wherein the amino acids that meet the one or more reference target-derived constraints have a van der Waals surface area overlap with the reference of between 30 Å² and 3000 Å².
  • Embodiment 1-4 The engineered peptide of any one of embodiments 1-1 to 1-3, wherein the combination comprises at least two reference target-derived constraints.
  • Embodiment 1-5 The engineered peptide of any one of embodiments 1-1 to 1-4, wherein the combination comprises at least five reference target-derived constraints.
  • Embodiment 1-6 The engineered peptide of any one of embodiments 1-1 to 1-5, wherein the combination of constraints comprises one or more constraints not derived from a reference target.
  • Embodiment 1-7 The engineered peptide of embodiment 1-6, wherein the one or more non-reference target-derived constraints describes a desired structural, dynamical, chemical, or functional characteristic, or any combinations thereof.
  • Embodiment 1-8 The engineered peptide of any one of embodiments 1-1 to 1-7, wherein the constraints are independently selected from the group consisting of: atomic distances;
  • Embodiment 1-9 The engineered peptide of any one of embodiments 1-1 to 1-8, wherein one or more constraints is independently an atomic fluctuation.
  • Embodiment 1-10 The engineered peptide of any one of embodiments 1-1 to 1-9, wherein one or more constraints is independently a chemical descriptor.
  • Embodiment 1-11 The engineered peptide of any one of embodiments 1-1 to 1-10, wherein one or more constraints is independently atomic distance.
  • Embodiment 1-12 The engineered peptide of any one of embodiments 1-1 to 1-11, wherein one or more constraints is independently secondary structure.
  • Embodiment 1-13 The engineered peptide of any one of embodiments 1-1 to 1-12, wherein one or more constraints is independently van der Waals surface.
  • Embodiment 1-14 The engineered peptide of any one of embodiments 1-1 to 1-13, wherein one or more constraints is independently associated with a biological response or biological function.
  • Embodiment 1-15 The engineered peptide of any one of embodiments 1-1 to 1-14, comprising one or more atoms associated with a biological response or biological function.
  • Embodiment 1-16 The engineered peptide of any one of embodiments 1-1 to 1-15, comprising one or more amino acids associated with a biological response or biological function.
  • Embodiment 1-17 The engineered peptide of any one of embodiments 1-14 to 1-16, wherein the biological response or biological function is selected from the group consisting of gene expression, metabolic activity, protein expression, cell proliferation, cell death, cytokine secretion, kinase activity, epigenetic modification, cell killing activity, inflammatory signals, chemotaxis, tissue infiltration, immune cell lineage commitment, tissue microenvironment modification, immune synapse formation, IL-2 secretion, IL-10 secretion, growth factor secretion, interferon gamma secretion, transforming growth factor beta secretion, immunoreceptor tyrosine-based activation motif activity, immunoreceptor tyrosine-based inhibition motif activity, antibody directed cell cytotoxicity, complement directed cytotoxicity, biological pathway agonism, biological pathway antagonism, biological pathway redirection, kinase cascade modification, proteolytic pathway modification, proteostasis pathway modification, protein folding/ pathways, post-translational modification pathways, metabolic pathways, gene transcription/translation, mRNA degradation pathways, gene methylation/acetylation pathways, histone modification pathways, epigenetic pathways, immune directed clearance, opsonization, hormone signaling, integrin pathways, membrane protein signal transduction, ion channel flux, and G-protein coupled receptor response.
  • Embodiment 1-18 The engineered peptide of embodiment 1-15, wherein the reference target comprises one or more atoms associated with a biological response or biological function, and wherein the atomic fluctuations of the one or more atoms in the engineered peptide associated with a biological response or biological function overlap with the atomic fluctuations of the one or more atoms in the reference target associated with a biological response or biological function.
  • Embodiment 1-19 The engineered peptide of embodiment 1-18, wherein the overlap is a root mean square inner product (RMSIP) greater than 0.25.
  • Embodiment 1-20 The engineered peptide of embodiment 1-19, wherein the overlap has a root mean square inner product (RMSIP) greater than 0.75.
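The RMSIP thresholds in Embodiments 1-19 and 1-20 can be made concrete with the standard definition, RMSIP = sqrt((1/n) Σᵢ Σⱼ (uᵢ · vⱼ)²), taken over the first n fluctuation modes of each molecule. The sketch below assumes the modes are supplied as orthonormal vectors of equal dimension (for example from a principal component or normal mode analysis, which is not shown); the text does not specify how the modes are obtained or how many are used, and the example vectors are hypothetical.

```python
import math

def rmsip(modes_a, modes_b):
    """Root mean square inner product between two equally sized sets of modes."""
    n = len(modes_a)
    if n == 0 or n != len(modes_b):
        raise ValueError("both mode sets must be non-empty and of equal size")
    total = 0.0
    for u in modes_a:
        for v in modes_b:
            dot = sum(x * y for x, y in zip(u, v))
            total += dot * dot
    return math.sqrt(total / n)

# Hypothetical orthonormal modes for a peptide and a reference target.
peptide_modes = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
reference_modes = [(1.0, 0.0, 0.0), (0.0, 0.0, 1.0)]

overlap = rmsip(peptide_modes, reference_modes)
print(overlap, overlap > 0.25, overlap > 0.75)
```

Identical mode sets give an RMSIP of 1, and mutually orthogonal sets give 0, so the 0.25 and 0.75 cutoffs correspond to progressively stricter similarity of the sampled motions.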
  • Embodiment 1-21 The engineered peptide of any one of embodiments 1-18 to 1-20, wherein at least a portion of the atoms in the engineered peptide associated with a biological response or biological function are topologically constrained to a secondary structural element in the reference target.
  • Embodiment 1-22 The engineered peptide of embodiment 1-21, wherein the secondary structural element is a beta-sheet.
  • Embodiment 1-23 The engineered peptide of embodiment 1-21, wherein the secondary structural element is an alpha helix.
  • Embodiment 1-24 The engineered peptide of embodiment 1-21, wherein the secondary structural element is a turn, wherein the turn comprises between 2 to 7 residues, and comprises at least one inter-residue hydrogen bond.
  • Embodiment 1-25 The engineered peptide of embodiment 1-21, wherein the secondary structural element is a coil, wherein the coil comprises between 2 to 20 residues.
  • Embodiment 1-26 The engineered peptide of embodiment 1-25, wherein the coil comprises no inter-residue hydrogen bonds.
  • Embodiment 1-27 The engineered peptide of any one of embodiments 1-21 to 1-26, wherein at least a portion of the atoms in the engineered peptide associated with a biological response or biological function are topologically constrained to a combination of two or more secondary structural elements independently selected from the group consisting of a beta-sheet, an alpha helix, a turn, and a coil.
  • Embodiment 1-28 The engineered peptide of any one of embodiments 1-1 to 1-27, wherein one or more spatially-associated topological constraints is atomic distance.
  • Embodiment 1-29 The engineered peptide of any one of embodiments 1-1 to 1-28, wherein one or more spatially-associated topological constraints is an atomic energy.
  • Embodiment 1-30 The engineered peptide of embodiment 1-29, wherein each atomic energy is independently pairwise attractive energy between two atoms, pairwise repulsive energy between two atoms, atom-level solvation energy, pairwise charged attraction energy between two atoms, pairwise hydrogen bonding attraction energy between two atoms, or non-covalent bonding energy.
  • Embodiment 1-31 The engineered peptide of any one of embodiments 1-1 to 1-30, wherein one or more spatially-associated topological constraints is a chemical descriptor.
  • Embodiment 1-32 The engineered peptide of embodiment 1-31, wherein each chemical descriptor is independently hydrophobicity, polarity, volume, net charge, logP, high performance liquid chromatography retention, or van der Waals radii.
  • Embodiment 1-33 The engineered peptide of any one of embodiments 1-1 to 1-32, wherein one or more spatially-associated topological constraints is a bioinformatic descriptor.
  • Embodiment 1-34 The engineered peptide of embodiment 1-33, wherein each bioinformatics descriptor is independently BLOSUM similarity, pKa, zScale, Cruciani Properties, Kidera Factors, VHSE-scale, ProtFP, MS-WHIM scores, T-scale, ST-scale, Transmembrane tendency, protein buried area, helix propensity, sheet propensity, coil propensity, turn propensity, immunogenic propensity, antibody epitope occurrence, or protein interface occurrence.
  • Embodiment 1-35 The engineered peptide of any one of embodiments 1-1 to 1-34, wherein one or more spatially-associated topological constraints is solvent exposure.
  • Embodiment 1-36 The engineered peptide of any one of embodiments 1-1 to 1-35, wherein at least one of the one or more reference target-derived constraints is a GPCR extracellular domain.
  • Embodiment 1-37 The engineered peptide of any one of embodiments 1-1 to 1-36, wherein at least one of the one or more reference target-derived constraints is an ion channel extracellular domain.
  • Embodiment 1-38 The engineered peptide of any one of embodiments 1-1 to 1-37, wherein at least one of the one or more reference target-derived constraints is a protein-protein or peptide-protein interface junction.
  • Embodiment 1-39 The engineered peptide of any one of embodiments 1-1 to 1-38, wherein at least one of the one or more reference target-derived constraints is derived from a polymorphic region of the target.
  • Embodiment 1-40 The engineered peptide of any one of embodiments 1-1 to 1-39, comprising one or more atoms associated with a biological response or biological function, wherein each of the one or more atoms is independently selected from the group consisting of carbon, oxygen, nitrogen, hydrogen, sulfur, phosphorus, sodium, potassium, zinc, manganese, magnesium, copper, iron, molybdenum, and nickel.
  • Embodiment 1-41 The engineered peptide of any one of embodiments 1-1 to 1-40, comprising one or more amino acids associated with a biological function or biological response, wherein each of the one or more amino acids is independently a proteinogenic naturally occurring amino acid, a non-proteinogenic naturally occurring amino acid, or a chemically synthesized non-natural amino acid.
  • Embodiment 1-42 The engineered peptide of any one of embodiments 1-1 to 1-41, wherein the engineered peptide has at least one structural difference when compared to the reference target.
  • Embodiment 1-43 The engineered peptide of embodiment 1-42, wherein the at least one structural difference is independently selected from the group consisting of sequence, number of amino acid residues, total number of atoms, total hydrophilicity, total
  • Embodiment 1-44 The engineered peptide of embodiment 1-43, wherein the difference in one or more secondary structures is the presence of one or more additional secondary structural elements in the engineered peptide compared to the reference target, wherein each additional secondary structural element is independently selected from the group consisting of alpha helices, beta-sheets, loops, turns, and coils.
  • Embodiment 1-45 The engineered peptide of any one of embodiments 1-1 to 1-44, wherein between 10% to 90% of the amino acids meet one or more non-reference target-derived topological constraints.
  • Embodiment 1-46 The engineered peptide of embodiment 1-45, wherein the one or more non-reference target-derived topological constraints enforce a pre-specified function.
  • Embodiment 1-47 The engineered peptide of embodiment 1-46, wherein: non-reference derived topological constraints enforce or stabilize secondary structural elements in the reference derived fraction of the peptide; non-reference derived topological constraints enforce atomic fluctuations in the reference derived fraction of the peptide; non-reference derived topological constraints alter peptide total hydrophobicity; non-reference derived topological constraints enable detection in a labeled or label-free assay; non-reference derived topological constraints enable detection in an in vivo assay; non-reference derived topological constraints enable capture from a complex mixture; non-reference derived topological constraints enable enzymatic processing; non-reference derived topological constraints enable cell membrane permeability; non-reference derived topological constraints enable binding to a secondary target, and
  • Embodiment 1-48 A method of selecting an engineered peptide, comprising:
  • identifying one or more topological characteristics of a reference target; designing spatially-associated constraints for each topological characteristic to produce a combination of spatially-associated topological constraints derived from the reference target; comparing spatially-associated topological characteristics of candidate peptides with the combination of spatially-associated topological constraints derived from the reference target; and selecting a candidate peptide with spatially-associated topological
  • Embodiment 1-49 The method of embodiment 1-48, wherein the overlap between each characteristic is independently less than or equal to 75% Mean Percentage Error (MPE) as determined by one or more of Total Topological Constraint Distance (TCD), topological clustering coefficient (TCC), Euclidean distance, power distance, Soergel distance, Canberra distance, Sorensen distance, Jaccard distance, Mahalanobis distance, Hamming distance, Quantitative Estimate of Likeness (QEL), or Chain Topology Parameter (CTP).
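Several of the measures named in Embodiment 1-49 have standard textbook definitions, sketched below for a reference constraint vector and a candidate characteristic vector. This is an illustration under assumptions: the text does not say how the Mean Percentage Error is combined with a given distance, the patent-specific measures (TCD, TCC, QEL, CTP) are not defined here and are not implemented, and the vectors are hypothetical. The MPE shown is the usual signed form, 100/n · Σ (reference − candidate)/reference.

```python
import math

def euclidean(ref, cand):
    return math.sqrt(sum((r - c) ** 2 for r, c in zip(ref, cand)))

def canberra(ref, cand):
    # Terms where both values are zero are skipped to avoid division by zero.
    return sum(
        abs(r - c) / (abs(r) + abs(c))
        for r, c in zip(ref, cand)
        if abs(r) + abs(c) > 0
    )

def soergel(ref, cand):
    # Defined for non-negative values with at least one non-zero entry.
    return sum(abs(r - c) for r, c in zip(ref, cand)) / sum(
        max(r, c) for r, c in zip(ref, cand)
    )

def mean_percentage_error(ref, cand):
    # Signed MPE; every reference value must be non-zero.
    return 100.0 * sum((r - c) / r for r, c in zip(ref, cand)) / len(ref)

# Hypothetical per-constraint values for a reference target and a candidate.
reference_values = [2.0, 4.0, 5.0]
candidate_values = [1.0, 4.0, 6.0]

mpe = mean_percentage_error(reference_values, candidate_values)
print(euclidean(reference_values, candidate_values))
print(canberra(reference_values, candidate_values))
print(soergel(reference_values, candidate_values))
print(mpe, abs(mpe) <= 75.0)
```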
  • Embodiment 1-50 The method of embodiment 1-48 or 1-49, wherein one or more constraints is derived from per-residue energy, per-residue interaction, per-residue fluctuation, per-residue atomic distance, per-residue chemical descriptor, per-residue solvent exposure, per-residue amino acid sequence similarity, per-residue bioinformatic descriptor, per-residue non-covalent bonding propensity, per-residue phi/psi angles, per-residue van der Waals radii, per-residue secondary structure propensity, per-residue amino acid adjacency, or per-residue amino acid contact.
  • Embodiment 1-51 The method of any one of embodiments 1-48 to 1-50, wherein the characteristics of one or more candidate peptides are determined by computer simulation.
  • Embodiment 1-52 The method of embodiment 1-51, wherein the computer simulation comprises molecular dynamics simulations, Monte Carlo simulations, coarse grained simulations, Gaussian network models, machine learning, or any combinations thereof.
  • Embodiment 1-53 The method of any one of embodiments 1-48 to 1-52, wherein the characteristics of one or more candidate peptides are determined by experimental characterization.
  • Embodiment 1-54 The method of any one of embodiments 1-48 to 1-53, wherein the amino acids meeting the one or more reference target-derived constraints have between 10% and 90% sequence homology with the reference target.
  • Embodiment 1-55 The method of any one of embodiments 1-48 to 1-54, wherein the amino acids meeting the one or more reference target-derived constraints have a van der Waals surface area overlap with the reference of between 30 Å² and 3000 Å².
  • Embodiment 1-56 The method of any one of embodiments 1-48 to 1-55, wherein the combination comprises at least two reference target-derived constraints.
  • Embodiment 1-57 The method of any one of embodiments 1-48 to 1-56, wherein the combination comprises at least five reference target-derived constraints.
  • Embodiment 1-58 The method of any one of embodiments 1-48 to 1-57, wherein the combination of constraints comprises one or more constraints not derived from a reference target.
  • Embodiment 1-59 The method of embodiment 1-58, wherein the one or more non-reference target-derived constraints describes a desired structural, dynamical, chemical, or functional characteristic, or any combinations thereof.
  • Embodiment 1-60 The method of any one of embodiments 1-48 to 1-59, wherein the constraints are independently selected from the group consisting of: atomic distances; atomic fluctuations; atomic energies; chemical descriptors; solvent exposures; amino acid sequence similarity; bioinformatic descriptors; non-covalent bonding propensity; phi angles; psi angles; van der Waals radii; secondary structure propensity; amino acid adjacency; and amino acid contact.
  • Embodiment 1-61 The method of any one of embodiments 1-48 to 1-60, wherein one or more constraints is independently an atomic fluctuation.
  • Embodiment 1-62 The method of any one of embodiments 1-48 to 1-61, wherein one or more constraints is independently a chemical descriptor.
  • Embodiment 1-63 The method of any one of embodiments 1-48 to 1-62, wherein one or more constraints is independently atomic distance.
  • Embodiment 1-64 The method of any one of embodiments 1-48 to 1-63, wherein one or more constraints is independently secondary structure.
  • Embodiment 1-65 The method of any one of embodiments 1-48 to 1-64, wherein one or more constraints is independently van der Waals surface.
  • Embodiment 1-66 The method of any one of embodiments 1-48 to 1-65, wherein one or more constraints is independently associated with a biological response or biological function.
  • Embodiment 1-67 The method of any one of embodiments 1-48 to 1-66, wherein the engineered peptide comprises one or more atoms associated with a biological response or biological function.
  • Embodiment 1-68 The method of any one of embodiments 1-48 to 1-66, wherein the engineered peptide comprises one or more amino acids associated with a biological response or biological function.
  • Embodiment 1-69 The method of any one of embodiments 1-66 to 1-68, wherein the biological response or biological function is selected from the group consisting of gene expression, metabolic activity, protein expression, cell proliferation, cell death, cytokine secretion, kinase activity, epigenetic modification, cell killing activity, inflammatory signals, chemotaxis, tissue infiltration, immune cell lineage commitment, tissue microenvironment modification, immune synapse formation, IL-2 secretion, IL-10 secretion, growth factor secretion, interferon gamma secretion, transforming growth factor beta secretion, immunoreceptor tyrosine-based activation motif activity, immunoreceptor tyrosine-based inhibition motif activity, antibody directed cell cytotoxicity, complement directed cytotoxicity, biological pathway agonism, biological pathway antagonism, biological pathway redirection, kinase cascade modification, proteolytic pathway modification, proteostasis pathway modification, protein folding/ pathways, post-translational modification pathways, metabolic pathways, gene transcription/translation, mRNA degradation pathways, gene methylation/acetylation pathways, histone modification pathways, epigenetic pathways, immune directed clearance, opsonization, hormone signaling, integrin pathways, membrane protein signal transduction, ion channel flux, and G-protein coupled receptor response.
  • Embodiment 1-70 The method of embodiment 1-66, wherein the reference target comprises one or more atoms associated with a biological response or biological function, and wherein the atomic fluctuations of the one or more atoms in the engineered peptide associated with a biological response or biological function overlap with the atomic fluctuations of the one or more atoms in the reference target associated with a biological response or biological function.
  • Embodiment 1-71 The method of embodiment 1-70, wherein the overlap is a root mean square inner product (RMSIP) greater than 0.25.
  • Embodiment 1-72 The method of embodiment 1-71, wherein the overlap has a root mean square inner product (RMSIP) greater than 0.75.
  • Embodiment 1-73 The method of any one of embodiments 1-67 to 1-69, wherein at least a portion of the atoms in the engineered peptide associated with a biological response or biological function are topologically constrained to a secondary structural element in the reference target.
  • Embodiment 1-74 The method of embodiment 1-73, wherein the secondary structural element is a beta-sheet.
  • Embodiment 1-75 The method of embodiment 1-73, wherein the secondary structural element is an alpha helix.
  • Embodiment 1-76 The method of embodiment 1-73, wherein the secondary structural element is a turn, wherein the turn comprises between 2 to 7 residues, and comprises at least one inter-residue hydrogen bond.
  • Embodiment 1-77 The method of embodiment 1-73, wherein the secondary structural element is a coil, wherein the coil comprises between 2 to 20 residues.
  • Embodiment 1-78 The method of embodiment 1-77, wherein the coil comprises no inter-residue hydrogen bonds.
  • Embodiment 1-79 The method of any one of embodiments 1-67 to 1-69, wherein at least a portion of the atoms in the engineered peptide associated with a biological response or biological function are topologically constrained to a combination of two or more secondary structural elements independently selected from the group consisting of a beta-sheet, an alpha helix, a turn, and a coil.
  • Embodiment 1-80 The method of any one of embodiments 1-48 to 1-79, wherein one or more spatially-associated topological constraints is atomic distance.
  • Embodiment 1-81 The method of any one of embodiments 1-48 to 1-80, wherein one or more spatially-associated topological constraints is an atomic energy.
  • Embodiment 1-82 The method of embodiment 1-81, wherein each atomic energy is independently pairwise attractive energy between two atoms, pairwise repulsive energy between two atoms, atom-level solvation energy, pairwise charged attraction energy between two atoms, pairwise hydrogen bonding attraction energy between two atoms, or non-covalent bonding energy.
  • Embodiment 1-83 The method of any one of embodiments 1-48 to 1-82, wherein one or more spatially-associated topological constraints is a chemical descriptor.
  • Embodiment 1-84 The method of embodiment 1-83, wherein each chemical descriptor is independently hydrophobicity, polarity, volume, net charge, logP, high performance liquid chromatography retention, or van der Waals radii.
  • Embodiment 1-85 The method of any one of embodiments 1-48 to 1-84, wherein one or more spatially-associated topological constraints is a bioinformatic descriptor.
  • Embodiment 1-86 The method of embodiment 1-85, wherein each bioinformatics descriptor is independently BLOSUM similarity, pKa, zScale, Cruciani Properties, Kidera Factors, VHSE-scale, ProtFP, MS-WHIM scores, T-scale, ST-scale, Transmembrane tendency, protein buried area, helix propensity, sheet propensity, coil propensity, turn propensity, immunogenic propensity, antibody epitope occurrence, or protein interface occurrence.
  • Embodiment 1-87 The method of any one of embodiments 1-48 to 1-86, wherein one or more spatially-associated topological constraints is solvent exposure.
  • Embodiment 1-88 The method of any one of embodiments 1-48 to 1-87, wherein at least one of the one or more reference target-derived constraints is a GPCR extracellular domain.
  • Embodiment 1-89 The method of any one of embodiments 1-48 to 1-88, wherein at least one of the one or more reference target-derived constraints is an ion channel extracellular domain.
  • Embodiment 1-90 The method of any one of embodiments 1-48 to 1-89, wherein at least one of the one or more reference target-derived constraints is a protein-protein or protein-peptide interface junction.
  • Embodiment 1-91 The method of any one of embodiments 1-48 to 1-90, wherein at least one of the one or more reference target-derived constraints is derived from a polymorphic region of the target.
  • Embodiment 1-92 The method of any one of embodiments 1-48 to 1-91, wherein the engineered peptide comprises one or more atoms associated with a biological response or biological function, wherein each of the one or more atoms is independently selected from the group consisting of carbon, oxygen, nitrogen, hydrogen, sulfur, phosphorus, sodium, potassium, zinc, manganese, magnesium, copper, iron, molybdenum, and nickel.
  • Embodiment 1-93 The method of any one of embodiments 1-48 to 1-92, wherein the engineered peptide comprises one or more amino acids associated with a biological function or biological response, wherein each of the one or more amino acids is independently a proteinogenic naturally occurring amino acid, a non-proteinogenic naturally occurring amino acid, or a chemically synthesized non-natural amino acid.
  • Embodiment 1-94 The method of any one of embodiments 1-48 to 1-93, wherein the engineered peptide has at least one structural difference when compared to the reference target.
  • Embodiment 1-95 The method of embodiment 1-94, wherein the at least one structural difference is independently selected from the group consisting of sequence, number of amino acid residues, total number of atoms, total hydrophilicity, total hydrophobicity, total positive charge, total negative charge, one or more secondary structures, shape factor,
  • Embodiment 1-96 The method of embodiment 1-95, wherein the difference in one or more secondary structures is the presence of one or more additional secondary structural elements in the engineered peptide compared to the reference target, wherein each additional secondary structural element is independently selected from the group consisting of alpha helices, beta-sheets, loops, turns, and coils.
  • Embodiment 1-97 The method of any one of embodiments 1-48 to 1-96, wherein between 10% to 90% of the amino acids of the engineered peptide meet one or more non-reference target-derived topological constraints.
  • Embodiment 1-98 The method of embodiment 1-97, wherein the one or more non-reference target-derived topological constraints enforce a pre-specified function.
  • Embodiment 1-99 The method of embodiment 1-98, wherein: non-reference derived topological constraints enforce or stabilize secondary structural elements in the reference derived fraction of the peptide; non-reference derived topological constraints enforce atomic fluctuations in the reference derived fraction of the peptide; non-reference derived topological constraints alter peptide total hydrophobicity; non-reference derived topological constraints alter peptide solubility; non-reference derived topological constraints alter peptide total charge; non-reference derived topological constraints enable detection in a labeled or label-free assay; non-reference derived topological constraints enable detection in an in vitro assay; non-reference derived topological constraints enable detection in an in vivo assay; non-reference derived topological constraints enable capture from a complex mixture; non-reference derived topological constraints enable enzymatic processing; non-reference derived topological constraints enable cell membrane permeability; non-reference derived topological constraints enable binding to a secondary target, or
  • Embodiment 1-100 A composition comprising two or more selection steering polypeptides, wherein each polypeptide is independently a positive selection molecule comprising one or more positive steering characteristics, or a negative selection molecule comprising one or more negative steering characteristics, wherein each characteristic type is independently selected from the group consisting of: amino acid sequence, polypeptide secondary structure, molecular dynamics, chemical features, biological function, immunogenicity, reference target(s) multi-specificity, cross-species reference target reactivity, selectivity of desired reference target(s) over undesired reference target(s), selectivity of reference target(s) within a sequence and/or structurally homologous family, selectivity of reference target(s) with similar protein function, selectivity of distinct desired reference target(s) from a larger family of undesired targets with high sequence and/or structural homology, selectivity for distinct reference target alleles or mutations, selectivity for distinct reference target residue level chemical modifications, selectivity for cell type, selectivity for tissue type, selectivity for tissue environment, tolerance to
  • Embodiment 1-101 The composition of embodiment 1-100, wherein at least one of the two or more polypeptides is a positive selection molecule, and at least one of the two or more polypeptides is a negative selection molecule.
  • Embodiment 1-102 The composition of embodiment 1-100 or 1-101, wherein at least one of the two or more polypeptides is a native protein.
  • Embodiment 1-103 The composition of any one of embodiments 1-100 to 1-102, comprising at least one pair of counterpart positive and negative selection molecules comprising at least one shared characteristic type, wherein the positive selection molecule comprises the positive characteristic and the negative selection molecule comprises the negative characteristic.
  • Embodiment 1-104 A method of screening a library of binding molecules with the composition of embodiment 1-100, comprising subjecting a pool of candidate binding molecules to at least one round of selection, wherein each round of selection comprises: a negative selection step of screening at least a portion of the pool against a negative selection molecule; and a positive selection step of screening at least a portion of the pool for a positive selection molecule; wherein the order of selection steps within each round, and the order of rounds, result in the selection of a different subset of the pool than an alternative order.
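The order dependence stated in Embodiment 1-104 can be illustrated with a toy model. The sketch assumes rank-based steps: a positive step keeps the strongest binders to the positive selection molecule, and a negative step discards the strongest binders to the negative selection molecule from whatever pool it receives. This mechanism, the function names, and the scores are all hypothetical; the text does not specify how each step is carried out.

```python
def positive_step(pool, scores, keep):
    """Keep the `keep` candidates with the highest positive-binding score."""
    ranked = sorted(pool, key=lambda name: scores[name][0], reverse=True)
    return set(ranked[:keep])

def negative_step(pool, scores, drop):
    """Discard the `drop` candidates with the highest negative-binding score."""
    ranked = sorted(pool, key=lambda name: scores[name][1], reverse=True)
    return set(ranked[drop:])

def run_round(pool, scores, order, keep=3, drop=2):
    """Apply the selection steps of one round in the given order."""
    for step in order:
        if step == "positive":
            pool = positive_step(pool, scores, keep)
        elif step == "negative":
            pool = negative_step(pool, scores, drop)
        else:
            raise ValueError(f"unknown step: {step}")
    return pool

# Hypothetical (positive score, negative score) per candidate binder.
scores = {
    "A": (0.9, 0.8),
    "B": (0.8, 0.1),
    "C": (0.7, 0.7),
    "D": (0.6, 0.2),
    "E": (0.3, 0.9),
    "F": (0.2, 0.05),
}

negative_first = run_round(set(scores), scores, ["negative", "positive"])
positive_first = run_round(set(scores), scores, ["positive", "negative"])
print(sorted(negative_first), sorted(positive_first))
```

Because each step acts on the pool left by the previous one, swapping the two steps yields different surviving subsets in this toy example.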
  • Embodiment I-105 The method of embodiment I-104, wherein the library of binding molecules is a phage library.
  • Embodiment I-106 The method of embodiment I-105, wherein the library of binding molecules is a cell library.
  • Embodiment I-107 The method of embodiment I-106, wherein the library of binding molecules is a B-cell library.
  • Embodiment I-109 The method of any one of embodiments I-104 to I-108, comprising two or more rounds of selection.
  • Embodiment I-110 The method of any one of embodiments I-104 to I-109, comprising three or more rounds of selection.
  • Embodiment I-111 The method of embodiment I-109 or I-110, wherein each round comprises a different set of selection molecules.
  • Embodiment I-112 The method of embodiment I-109 or I-110, wherein at least two rounds comprise the same negative selection molecule, or the same positive selection molecule, or both.
  • Embodiment I-113 The method of any one of embodiments I-109 to I-112, comprising analyzing the subset of the pool obtained from a round of selection prior to proceeding to the next round of selection.
  • Embodiment I-114 The method of embodiment I-113, wherein the subset pool analysis determines the set of positive and/or negative selection molecules used in one or more subsequent rounds of selection.
  • Embodiment I-115 The method of embodiment I-113 or I-114, wherein each subset pool analysis is independently selected from the group consisting of peptide/protein biosensor binding, peptide/protein ELISA, peptide library binding, cell extract binding, cell surface binding, cell activity assay, cell proliferation assay, cell death assay, enzyme activity assay, gene expression profile, protein modification assay, Western blot, and
  • Embodiment I-116 The method of any one of embodiments I-113 to I-115, wherein the positive, negative, or both positive and negative selection molecules used in one or more subsequent rounds of selection are determined by statistical/informatic scoring, or machine learning training, of a subset pool analysis.
  • Embodiment I-117 The method of any one of embodiments I-109 to I-116, wherein the subset pool obtained from a round of selection is modified before moving to the next selection round.
  • Embodiment I-118 The method of embodiment I-117, wherein the subset pool analysis determines the positive, negative, or both positive and negative selection molecules used in one or more subsequent rounds of selection; and modification of the subset pool before moving to the next selection round.
  • Embodiment I-119 The method of embodiment I-117 or I-118, wherein each modification is independently selected from the group consisting of genetic mutation, genetic depletion, genetic enrichment, chemical modification, and enzymatic modification.
  • Example 1: Selection of Engineered Peptides using a VEGF Epitope as the Reference Target
  • A putative therapeutic epitope of VEGF was identified as a reference target for engineered peptide selection, and its atomic distance and amino acid descriptor topology were determined (FIG. 6B).
  • The atomic distance and amino acid descriptor topology of the reference target were obtained using dynamics simulations, and a covariance matrix of atomic fluctuations was generated for the epitope in the reference target.
  • Different engineered peptide candidates were generated using computational protein design (e.g., Rosetta), dynamics simulations were performed on the candidates, and the atomic distance and amino acid descriptor topologies were determined (FIGS. 6C-6E). The mean percentage error (MPE) of each candidate topology relative to the reference topology was then compared (FIGS. 6G-6H).
  • The MPE values were: reference topology vs. candidate 1 topology: 6.03%; reference topology vs. candidate 2 topology: 6.00%; and reference topology vs. candidate 3 topology: 22.8%.
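The comparison above can be illustrated with a minimal sketch of a mean percentage error calculation. The text does not state how the topologies are flattened into numbers or whether signed or absolute errors were averaged, so the function name `mean_percentage_error`, the flat-list input, and the `absolute` flag are all assumptions made for illustration. The numbers in the example are invented and are not the values behind FIGS. 6G-6H.

```python
def mean_percentage_error(reference, candidate, absolute=True):
    """Mean percentage error of candidate values relative to reference values.

    Both inputs are equal-length sequences of non-zero numbers, e.g. a
    flattened atomic distance topology (an assumed representation).
    """
    if len(reference) != len(candidate) or len(reference) == 0:
        raise ValueError("topologies must be non-empty and of equal length")
    # Element-wise percentage deviation of the candidate from the reference.
    errors = [100.0 * (c - r) / r for r, c in zip(reference, candidate)]
    if absolute:
        # Assumption: magnitudes are averaged so deviations cannot cancel out.
        errors = [abs(e) for e in errors]
    return sum(errors) / len(errors)


# Illustrative values only.
reference_topology = [10.0, 20.0, 5.0]
candidate_topology = [11.0, 18.0, 5.0]
mpe_absolute = mean_percentage_error(reference_topology, candidate_topology)
mpe_signed = mean_percentage_error(reference_topology, candidate_topology, absolute=False)
```

Under this reading, a lower MPE indicates a candidate whose topology is closer to that of the reference target.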
  • Example 2: Selection of Engineered Peptides using a VEGF Epitope as the Reference Target
  • Engineered peptide candidates were generated using computational protein design (e.g., Rosetta) or other methods of sampling peptide space, and dynamics simulations were performed on the candidates.
  • A covariance matrix of atomic fluctuations was generated for the reference target epitope, and for the residues in the candidates corresponding to the residues in the epitope of the reference target.
  • The similarity of two eigenvectors corresponds to their components (a 3D vector centered on each CA atom) being aligned, that is, pointing in the same direction (FIGS. 7D-7G).
  • This similarity between candidate and reference target eigenvectors was computed using the inner product of two eigenvectors.
  • The inner product value is 0 if the two eigenvectors are at 90 degrees to each other, and 1 if the two eigenvectors point in precisely the same direction.
  • Principal component analysis reduces the 3L×3L-dimensional coordinate covariance matrices (L being the number of atoms) into sets of eigenvectors, Φ (reference target) and Ψ (MEM), and eigenvalues, Λ.
  • The set Φ contains N eigenvectors φ_i for the reference target and the set Ψ contains N eigenvectors ψ_j for the MEM, where the eigenvectors are ordered in their respective sets by their associated eigenvalues.
  • The eigenvector with the largest eigenvalue accounts for the largest fraction of the total coordinate covariation.
  • The inner product of each φ_i and ψ_j eigenvector is computed to compare the similarity of motion between the reference target and the MEM.
  • The root mean square of all inner product combinations of the φ_i and ψ_j eigenvectors gives the total similarity of motion of the engineered peptide candidate (MEM) to the reference target, the root mean square inner product (RMSIP) (FIG. 8).
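The principal component analysis and RMSIP steps described above can be sketched with NumPy as follows. This is an illustrative sketch, not the authors' implementation. The function names `principal_modes` and `rmsip` and the `(frames, atoms, 3)` trajectory layout are assumptions. The normalization by the number of modes N follows the common RMSIP convention, which the text does not spell out. The random trajectory is placeholder data standing in for a dynamics simulation.

```python
import numpy as np


def principal_modes(trajectory, n_modes):
    """Return the top n_modes eigenvectors of the 3L x 3L coordinate covariance.

    trajectory: array of shape (frames, L, 3) holding atom coordinates per frame.
    Result: array of shape (n_modes, 3L), ordered by decreasing eigenvalue.
    """
    frames = trajectory.reshape(trajectory.shape[0], -1)      # (frames, 3L)
    centered = frames - frames.mean(axis=0)                   # remove the mean structure
    covariance = centered.T @ centered / (frames.shape[0] - 1)
    eigenvalues, eigenvectors = np.linalg.eigh(covariance)    # eigh returns ascending order
    order = np.argsort(eigenvalues)[::-1][:n_modes]           # largest eigenvalues first
    return eigenvectors[:, order].T


def rmsip(phi, psi):
    """Root mean square inner product between two sets of N orthonormal eigenvectors."""
    overlaps = phi @ psi.T                                    # all phi_i . psi_j inner products
    return float(np.sqrt(np.sum(overlaps ** 2) / phi.shape[0]))


# Placeholder data: 50 frames of a 4-atom epitope.
rng = np.random.default_rng(0)
reference_trajectory = rng.normal(size=(50, 4, 3))
reference_modes = principal_modes(reference_trajectory, n_modes=3)
self_similarity = rmsip(reference_modes, reference_modes)
```

Because the candidate covariance matrix is built only from the residues that correspond to the epitope residues, both eigenvector sets have the same dimension and can be compared directly. A value near 1 indicates closely matching motions and a value near 0 indicates orthogonal motions.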
  • The RMSIP results from 5 candidate engineered peptides vs. the VEGF reference epitope are shown in Table 1. These data were sampled from a total simulation of 1000 candidates generated using Rosetta design with a candidate vs. reference static structure RMSD cutoff. Of the 1000 candidates, XTR-1000-T0 had the lowest Rosetta (static structure) energy (lower is more favorable), but intermediate RMSIP dynamics matching. Candidates XTR-1000-B1 and B2 had the highest dynamics-matching scores (i.e., their motions most closely matched the motions of the reference target, as computed by RMSIP). Candidates XTR-1000-W1 and W2 had the lowest dynamics-matching scores, and are shown to demonstrate the RMSIP dynamic range in this 1000-candidate data set (RMSIP range: 0.772 to 0.545).
  • Structures of the candidates aligned to the VEGF reference epitope are shown in FIG. 7B.
  • Octet/Biosensor Screening: The affinity of the different engineered peptides was evaluated on an Octet Red 384 instrument, using a single-cycle kinetics assay design. The peptides were evaluated separately, and immobilized via a biotin linker to the streptavidin-coated tip of the biosensor. The remaining open streptavidin sites were blocked with biocytin. An analyte was washed over the sensor tip and the binding of the molecules in the analyte to the peptides was recorded. For this assay, the analyte was a serial dilution of bevacizumab from 0.19 µM to 1.5 µM. Each assay was run in duplicate. Controls were also run, using buffer alone (to control for sensor drift) and a separate control of purified IgG from human ND serum (to control for non-specific IgG binding).
  • Each program used at least one engineered peptide as a selection molecule.
  • A conventional selection program was also included (VEGF as the positive target and BSA as a negative target to select against non-specific binding). 738 clones were selected for ELISA response analysis after three rounds of panning.
  • The panning protocol began with a human naive scFv library, and panning was performed in solution, with the selection molecules bound to biotin (but still in solution).
  • The starting pool was first combined with the negative selection molecule in solution, and then a streptavidin-coated substrate (e.g., magnetic beads) was applied to the mixture to bind the negative selection molecules.
  • Any phage in the pool that was bound to the negative selection molecule was therefore also bound to the streptavidin-coated support.
  • The remaining solution was removed, and this flow-through was then taken on to the positive selection step.
  • The flow-through was combined with the positive selection molecule and allowed to bind, and then a streptavidin-coated solid substrate was applied to the mixture. In this step, the bound phage were retained while the remaining unbound phage were removed. The bound phage were then eluted.
  • E. coli were transfected with the eluted phage using a 30 minute cultivation, the transfected cells were split for next-generation sequencing and DNA isolation for analysis, and the phage were then amplified for use in the subsequent panning round. For each panning program, in each round negative selection was performed first, and positive selection second.
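The round structure described above (negative selection first, flow-through carried into positive selection, eluted binders carried into the next round) can be sketched as a toy simulation. This is a deliberately simplified sketch. Binding is treated as all-or-nothing, amplification is ignored, and the clone records, the `binds` sets, and the molecule label `MEM` are hypothetical placeholders rather than data from the experiments.

```python
def run_round(pool, negative, positive):
    """One panning round: negative selection first, then positive selection."""
    # Clones captured together with the negative selection molecule are discarded;
    # everything left in solution is the flow-through.
    flow_through = [clone for clone in pool if negative not in clone["binds"]]
    # Clones captured together with the positive selection molecule are retained and eluted.
    return [clone for clone in flow_through if positive in clone["binds"]]


def run_program(pool, rounds):
    """Apply a panning program given as a list of (negative, positive) molecule pairs."""
    for negative, positive in rounds:
        pool = run_round(pool, negative, positive)
    return pool


# Hypothetical starting pool of four clones.
starting_pool = [
    {"id": "c1", "binds": {"VEGF", "MEM"}},
    {"id": "c2", "binds": {"VEGF", "BSA"}},
    {"id": "c3", "binds": {"BSA"}},
    {"id": "c4", "binds": {"MEM"}},
]

conventional = run_program(starting_pool, [("BSA", "VEGF")])
programmed = run_program(starting_pool, [("BSA", "MEM"), ("BSA", "VEGF")])
```

In a real experiment each round enriches rather than filters perfectly, so the sketch only shows how a program is composed from ordered selection steps.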
  • The candidate pools were also tested in a cross-blocking ELISA for blocking of bevacizumab:VEGF binding (dose-responsive competition with bevacizumab at 0 nM, 67 pM, 670 pM, and 6.7 nM). These results are shown in FIGS. 14A-14I and Table 2, and the total count of confirmed cross-blocking clones obtained from each program is summarized in FIG. 15. These results demonstrate that the programmable in vitro selection programs using the engineered peptides were able to isolate clones from the full clone library that cross-block bevacizumab, which shares the reference target epitope used to derive the engineered peptides.
  • FIG. 17 summarizes the binding, cross-blocking, CDR sequences, and germline usage for all Fabs produced for further testing.
  • FIG. 17 and FIG. 18 show ELISA binding results for the Fabs listed in Tables 3A and 3B.
  • Blocking Propensity = SUM(X-blocking Slope, (sMEM + VEGF) - iMEM), where X-blocking Slope, sMEM, and VEGF are robust z-scores. Scoring rationale: If a blocking response is observed, through a significant (by robust z-score) negative slope, then blocking propensity is a combination of z-scores for VEGF binding and X-blocking slope. The blocking propensity is summarized in FIG. 19, and in the table below.
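The scoring formula above can be written out as a short sketch. The text does not define how the robust z-scores were computed, so the median/MAD form with the usual 1.4826 scaling constant is an assumption, as are the function names. The text also does not say whether the X-blocking slope z-score is sign-flipped so that stronger blocking scores higher; the sketch simply sums the inputs exactly as the formula is written.

```python
import statistics


def robust_z_scores(values):
    """Robust z-scores based on the median and median absolute deviation (MAD).

    Assumed definition: z = (x - median) / (1.4826 * MAD).
    """
    values = list(values)
    median = statistics.median(values)
    mad = statistics.median([abs(v - median) for v in values])
    if mad == 0:
        raise ValueError("MAD is zero; robust z-scores are undefined")
    return [(v - median) / (1.4826 * mad) for v in values]


def blocking_propensity(z_slope, z_smem, z_vegf, z_imem):
    """SUM(X-blocking Slope, (sMEM + VEGF) - iMEM), with z-scores as inputs."""
    return z_slope + (z_smem + z_vegf) - z_imem


# Illustrative numbers only.
example_z = robust_z_scores([1.0, 2.0, 3.0, 4.0, 5.0])
example_score = blocking_propensity(z_slope=1.0, z_smem=2.0, z_vegf=3.0, z_imem=0.5)
```

Median-based scaling is a common choice for plate-style screening data because a few strong hits do not inflate the spread estimate the way they would with a standard deviation.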
  • FIG. 21 provides a schematic overview of the preparation of NGS samples. Briefly, samples were prepared by cloning out individual heavy and light chain sequences at constant portions of the expression vector. A 2 x 250 paired-end sequencing run was used, and the reads were joined and annotated with a tool such as PyIg.
  • FIGS. 24A-24L are pairing frequency comparisons and dimensional charts analyzing how the different screening rounds, for round 1 (FIGS. 24A-24D), round 2 (FIGS. 24E-24H), and round 3 (FIGS. 24I-24L), shape diversity of the resulting selected pools.
  • The engineered peptide (MEM)-programmed in vitro selection isolates distinct antibody clonotypes with more diverse germline usage vs. the conventional approach at the first round of selection.
  • MEM-based in vitro selection produces more diverse light chain germline usage at round 1 vs. full length antigen and uMEM.
  • MEM-based in vitro selection programs produce distinct heavy chain germline usage at round 2 vs. full length antigen.
  • The order and identity of the MEM used in the in vitro selection program affect heavy chain germline usage.
  • MEM-based in vitro selection programs produce distinct light chain germline usage at round 2 vs. full length antigen.
  • The order and identity of the MEM used in the in vitro selection program affect light chain germline usage.
  • MEM-based in vitro selection programs produce distinct and more diverse heavy chain germline usage at round 3 vs. full length antigen.
  • The order and identity of the MEM used in the in vitro selection program affect heavy chain germline usage and diversity.
  • MEM-based in vitro selection programs produce distinct and more diverse light chain germline usage at round 3 vs. full length antigen.
  • The order and identity of the MEM used in the in vitro selection program affect light chain germline usage and diversity.
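One way to put a number on "more diverse germline usage" is the Shannon entropy of the germline gene counts in a selected pool. The text does not say which diversity measure underlies the figures, so this metric is an assumption chosen for illustration. The function name and the germline call lists are invented for the example.

```python
import math
from collections import Counter


def shannon_diversity(germline_calls):
    """Shannon entropy (natural log) of germline gene usage in a pool of clones."""
    counts = Counter(germline_calls)
    total = sum(counts.values())
    # Sum of -p * ln(p) over the observed germline genes.
    return -sum((n / total) * math.log(n / total) for n in counts.values())


# Invented pools: one dominated by a single germline, one evenly spread over four.
focused_pool = ["IGHV1-69"] * 4
diverse_pool = ["IGHV1-69", "IGHV3-23", "IGHV4-34", "IGHV5-51"]

focused_diversity = shannon_diversity(focused_pool)
diverse_diversity = shannon_diversity(diverse_pool)
```

A pool drawn evenly from many germline genes scores higher than one dominated by a single gene, which matches the qualitative comparisons listed above.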
  • A summary of how the different phage panning programs focused Fab hits is provided in FIGS. 25A and 25B.
  • FIG. 27 summarizes off-epitope VEGF hit frequency per panning round for each program, demonstrating the conventional program identified mAb hits confirmed to bind VEGF but not putative epitope-selective mAb hits.
  • FIG. 28 summarizes the binding.
  • The ELISA responses are provided in FIGS. 42A-42F.
  • The 23 distinct clones identified from the cross-blocking hits were sequenced (via Sanger sequencing), and are listed in FIG. 43.
  • A summary of the distinct clone count of cross-blocking hits across panning programs is provided in FIG. 44.
  • These results were analyzed to determine if any of the in vitro selection programs produce a random-selection enrichment of clones that cross-block PD-L1:avelumab/durvalumab.
  • Scoring rationale: If a blocking response is observed, through a significant (by robust z-score) negative slope, then blocking propensity is a combination of z-scores for PD-L1 binding, MEM binding, and X-blocking slope, where the X-blocking z-score used is the maximum z-score of avelumab vs. durvalumab, since these therapeutic (Tx) mAbs have slightly different epitopes on the surface.
  • The scaffold blueprint may constrain the sequence of the amino acids in the engineered polypeptide to match the order of the amino acids in the reference target.
  • The sequence homology may be constrained to 100% (each amino acid in the reference target corresponds to one amino acid in the blueprint), or the sequence homology may be permitted to be lower, e.g., 10 to 90% homology.
  • The scaffold blueprints may be converted into a vector representation (FIG.
  • A machine-learning (ML) model may be trained on training data that includes representations of the scaffold blueprints and the corresponding scores.
  • The representations may be, for example, one-dimensional vectors of numbers, two-dimensional matrices of alphanumerical data, or three-dimensional tensors of normalized numbers. More specifically, in some instances, the representations are vectors including an ordered list of numbers of intervening scaffold residue positions. Such representations may be used because the order of target residues can be inferred from target structures; the representations therefore do not need to identify the amino acid identity of target-residue positions.
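The vector representation described above, an ordered list of the numbers of intervening scaffold residue positions, can be sketched as follows. The text does not specify whether leading and trailing scaffold segments are included, so counting them here is an assumption. The function names and the `S`/`T` string encoding are likewise hypothetical.

```python
def blueprint_to_vector(target_positions, total_length):
    """Encode a blueprint as counts of scaffold residues around the target residues.

    target_positions: sorted 0-based indices of target-derived residues.
    Returns [leading, gap_1, ..., gap_k-1, trailing] scaffold residue counts.
    """
    vector = [target_positions[0]]                      # scaffold residues before the first target residue
    for previous, current in zip(target_positions, target_positions[1:]):
        vector.append(current - previous - 1)           # scaffold residues between neighbouring target residues
    vector.append(total_length - target_positions[-1] - 1)  # scaffold residues after the last target residue
    return vector


def vector_to_blueprint(vector):
    """Decode the vector into a string with 'S' for scaffold and 'T' for target residues."""
    # Target residue identities are not stored; only their order is implied.
    return "T".join("S" * gap for gap in vector)


example_vector = blueprint_to_vector([3, 4, 9], total_length=12)
example_blueprint = vector_to_blueprint(example_vector)
```

Because the identity and order of the target residues are fixed by the reference target, the gap counts alone are enough to reconstruct where each target residue sits in the scaffold.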
  • The scores of the scaffold blueprints can be generated using computational protein modeling (e.g., Rosetta remodeler) that determines an energy term for each scaffold blueprint. The scores can then be calculated based on the energy terms generated by the computational protein modeling.
  • The ML model can be, for example, a boosted decision tree algorithm, an ensemble of decision trees, an extreme gradient boosting (XGBoost) model, a random forest, a support vector machine (SVM), and/or the like.
  • The ML model is then executed to generate a set of predicted scores from a set of scaffold blueprints. If a predicted score is above a desired score, the scaffold blueprint corresponding to the predicted score can be simulated by computational protein modeling to generate a ground-truth score. The ground-truth score and the predicted score can be compared to determine whether to retrain the ML model.
  • The training and executing steps may be iterated as shown in FIG. 62 until optimal/improved scaffold blueprints having the desired score are predicted. The optimal/improved scaffold blueprints are then converted into engineered peptides.
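The iterate-predict-simulate-retrain loop described above can be sketched in plain Python. Two stand-ins are used and both are assumptions. `simulate_score` replaces the expensive computational protein modeling run with a toy formula in which higher is better. `NearestNeighbourModel` replaces the tree-based or SVM models named in the text with a trivial surrogate so the example has no dependencies. The fallback of simulating the top-ranked batch when no prediction clears the threshold is also an addition for the sketch, and the model is simply refit on every pass rather than conditionally retrained.

```python
import itertools
import random


def simulate_score(blueprint):
    """Hypothetical stand-in for a modeling run; the best blueprint (3, 3, 3) scores 0."""
    return -sum((gap - 3) ** 2 for gap in blueprint)


class NearestNeighbourModel:
    """Toy surrogate model: predicts the score of the closest training blueprint."""

    def fit(self, blueprints, scores):
        self.blueprints = list(blueprints)
        self.scores = list(scores)

    def predict(self, blueprint):
        def squared_distance(other):
            return sum((a - b) ** 2 for a, b in zip(blueprint, other))
        nearest = min(range(len(self.blueprints)),
                      key=lambda i: squared_distance(self.blueprints[i]))
        return self.scores[nearest]


def design_loop(candidates, desired_score, n_initial=10, batch=10, seed=0):
    """Alternate training and prediction until a simulated score exceeds desired_score."""
    rng = random.Random(seed)
    train_x = rng.sample(candidates, n_initial)
    train_y = [simulate_score(b) for b in train_x]       # ground-truth scores
    model = NearestNeighbourModel()
    while True:
        hits = [(s, b) for b, s in zip(train_x, train_y) if s > desired_score]
        if hits:
            return max(hits)[1]                          # best confirmed blueprint
        seen = set(train_x)
        unseen = [b for b in candidates if b not in seen]
        if not unseen:
            return None                                  # nothing left to try
        model.fit(train_x, train_y)                      # retrain on all ground-truth data so far
        ranked = sorted(unseen, key=model.predict, reverse=True)
        promising = [b for b in ranked if model.predict(b) > desired_score]
        for blueprint in promising or ranked[:batch]:    # fallback batch is an assumption
            train_x.append(blueprint)
            train_y.append(simulate_score(blueprint))    # simulate to get the ground truth


candidate_blueprints = list(itertools.product(range(6), repeat=3))
best_blueprint = design_loop(candidate_blueprints, desired_score=-2)
```

The point of the surrogate is to spend expensive simulations only on blueprints the model already considers promising, while every new ground-truth score feeds back into the next round of training.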

Abstract

The invention relates to engineered peptides that comprise a combination of spatially associated topological constraints, wherein at least one constraint is derived from a reference target, and to methods of selecting said engineered peptides. The invention further relates to methods of using the engineered peptides, including as positive and/or negative selection molecules in methods of screening a library of binding molecules.
PCT/US2020/032715 2019-05-31 2020-05-13 Meso-scale engineered peptides and methods of selecting WO2020242765A1 (fr)

Priority Applications (6)

Application Number Priority Date Filing Date Title
EP20813167.2A EP3977117A4 (fr) 2019-05-31 2020-05-13 Meso-scale engineered peptides and methods of selecting
CN202080050892.XA CN114585918A (zh) 2019-05-31 2020-05-13 Meso-scale engineered peptides and methods of selecting
JP2021570755A JP2022535511A (ja) 2019-05-31 2020-05-13 Meso-scale engineered peptides and methods of selecting
CA3142227A CA3142227A1 (fr) 2019-05-31 2020-05-13 Meso-scale engineered peptides and methods of selecting
KR1020217043265A KR20220041784A (ko) 2019-05-31 2020-05-13 Meso-scale engineered peptides and methods of selecting
US17/537,215 US20220081472A1 (en) 2019-05-31 2021-11-29 Meso-scale engineered peptides and methods of selecting

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US201962855767P 2019-05-31 2019-05-31
US62/855,767 2019-05-31

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US17/537,215 Continuation US20220081472A1 (en) 2019-05-31 2021-11-29 Meso-scale engineered peptides and methods of selecting

Publications (1)

Publication Number Publication Date
WO2020242765A1 true WO2020242765A1 (fr) 2020-12-03

Family

ID=73553528

Family Applications (2)

Application Number Title Priority Date Filing Date
PCT/US2020/032724 WO2020242766A1 (fr) 2019-05-31 2020-05-13 Machine learning-based apparatus for engineering meso-scale peptides and methods and system for the same
PCT/US2020/032715 WO2020242765A1 (fr) 2019-05-31 2020-05-13 Meso-scale engineered peptides and methods of selecting

Family Applications Before (1)

Application Number Title Priority Date Filing Date
PCT/US2020/032724 WO2020242766A1 (fr) 2019-05-31 2020-05-13 Machine learning-based apparatus for engineering meso-scale peptides and methods and system for the same

Country Status (7)

Country Link
US (3) US11545238B2 (fr)
EP (2) EP3976083A4 (fr)
JP (2) JP7579812B2 (fr)
KR (2) KR20220041784A (fr)
CN (2) CN114401734A (fr)
CA (2) CA3142227A1 (fr)
WO (2) WO2020242766A1 (fr)

Also Published As

Publication number Publication date
EP3976083A1 (fr) 2022-04-06
EP3977117A1 (fr) 2022-04-06
US20230095685A1 (en) 2023-03-30
EP3976083A4 (fr) 2023-07-12
US20220081472A1 (en) 2022-03-17
CN114401734A (zh) 2022-04-26
CA3142339A1 (fr) 2020-12-03
CA3142227A1 (fr) 2020-12-03
WO2020242766A1 (fr) 2020-12-03
JP2022535511A (ja) 2022-08-09
CN114585918A (zh) 2022-06-03
EP3977117A4 (fr) 2023-08-16
KR20220041784A (ko) 2022-04-01
KR20220039659A (ko) 2022-03-29
JP2022535769A (ja) 2022-08-10
US20210166788A1 (en) 2021-06-03
US11545238B2 (en) 2023-01-03
JP7579812B2 (ja) 2024-11-08

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20813167

Country of ref document: EP

Kind code of ref document: A1

ENP Entry into the national phase

Ref document number: 2021570755

Country of ref document: JP

Kind code of ref document: A

Ref document number: 3142227

Country of ref document: CA

NENP Non-entry into the national phase

Ref country code: DE

ENP Entry into the national phase

Ref document number: 2020813167

Country of ref document: EP

Effective date: 20220103