WO2023232857A1 - Common light chain antibody libraries - Google Patents

Common light chain antibody libraries

Info

Publication number
WO2023232857A1
Authority
WO
WIPO (PCT)
Prior art keywords
hcdr1
hcdr2
human
amino acid
library
Prior art date
Application number
PCT/EP2023/064529
Other languages
French (fr)
Inventor
Jeremy Loyau
Michael Dyson
Gregory LA SALA
Original Assignee
Ichnos Sciences SA
Priority date
Filing date
Publication date
Application filed by Ichnos Sciences SA
Publication of WO2023232857A1

Classifications

    • C CHEMISTRY; METALLURGY
    • C07 ORGANIC CHEMISTRY
    • C07K PEPTIDES
    • C07K16/00 Immunoglobulins [IGs], e.g. monoclonal or polyclonal antibodies
    • C07K16/18 Immunoglobulins [IGs], e.g. monoclonal or polyclonal antibodies against material from animals or humans
    • C07K16/28 Immunoglobulins [IGs], e.g. monoclonal or polyclonal antibodies against material from animals or humans against receptors, cell surface antigens or cell surface determinants
    • C07K16/2803 Immunoglobulins [IGs], e.g. monoclonal or polyclonal antibodies against material from animals or humans against receptors, cell surface antigens or cell surface determinants against the immunoglobulin superfamily
    • C CHEMISTRY; METALLURGY
    • C07 ORGANIC CHEMISTRY
    • C07K PEPTIDES
    • C07K16/00 Immunoglobulins [IGs], e.g. monoclonal or polyclonal antibodies
    • C07K16/18 Immunoglobulins [IGs], e.g. monoclonal or polyclonal antibodies against material from animals or humans
    • C07K16/28 Immunoglobulins [IGs], e.g. monoclonal or polyclonal antibodies against material from animals or humans against receptors, cell surface antigens or cell surface determinants
    • C07K16/2803 Immunoglobulins [IGs], e.g. monoclonal or polyclonal antibodies against material from animals or humans against receptors, cell surface antigens or cell surface determinants against the immunoglobulin superfamily
    • C07K16/2818 Immunoglobulins [IGs], e.g. monoclonal or polyclonal antibodies against material from animals or humans against receptors, cell surface antigens or cell surface determinants against the immunoglobulin superfamily against CD28 or CD152
    • C CHEMISTRY; METALLURGY
    • C07 ORGANIC CHEMISTRY
    • C07K PEPTIDES
    • C07K2317/00 Immunoglobulins specific features
    • C07K2317/20 Immunoglobulins specific features characterized by taxonomic origin
    • C07K2317/21 Immunoglobulins specific features characterized by taxonomic origin from primates, e.g. man
    • C CHEMISTRY; METALLURGY
    • C07 ORGANIC CHEMISTRY
    • C07K PEPTIDES
    • C07K2317/00 Immunoglobulins specific features
    • C07K2317/30 Immunoglobulins specific features characterized by aspects of specificity or valency
    • C07K2317/33 Crossreactivity, e.g. for species or epitope, or lack of said crossreactivity
    • C CHEMISTRY; METALLURGY
    • C07 ORGANIC CHEMISTRY
    • C07K PEPTIDES
    • C07K2317/00 Immunoglobulins specific features
    • C07K2317/40 Immunoglobulins specific features characterized by post-translational modification
    • C07K2317/41 Glycosylation, sialylation, or fucosylation
    • C CHEMISTRY; METALLURGY
    • C07 ORGANIC CHEMISTRY
    • C07K PEPTIDES
    • C07K2317/00 Immunoglobulins specific features
    • C07K2317/50 Immunoglobulins specific features characterized by immunoglobulin fragments
    • C07K2317/515 Complete light chain, i.e. VL + CL
    • C CHEMISTRY; METALLURGY
    • C07 ORGANIC CHEMISTRY
    • C07K PEPTIDES
    • C07K2317/00 Immunoglobulins specific features
    • C07K2317/50 Immunoglobulins specific features characterized by immunoglobulin fragments
    • C07K2317/56 Immunoglobulins specific features characterized by immunoglobulin fragments variable (Fv) region, i.e. VH and/or VL
    • C07K2317/565 Complementarity determining region [CDR]
    • C CHEMISTRY; METALLURGY
    • C07 ORGANIC CHEMISTRY
    • C07K PEPTIDES
    • C07K2317/00 Immunoglobulins specific features
    • C07K2317/60 Immunoglobulins specific features characterized by non-natural combinations of immunoglobulin fragments
    • C07K2317/62 Immunoglobulins specific features characterized by non-natural combinations of immunoglobulin fragments comprising only variable region components
    • C07K2317/622 Single chain antibody (scFv)
    • C CHEMISTRY; METALLURGY
    • C07 ORGANIC CHEMISTRY
    • C07K PEPTIDES
    • C07K2317/00 Immunoglobulins specific features
    • C07K2317/90 Immunoglobulins specific features characterized by (pharmaco)kinetic aspects or by stability of the immunoglobulin
    • C07K2317/92 Affinity (KD), association rate (Ka), dissociation rate (Kd) or EC50 value

Definitions

  • the present invention relates generally to methods for generating a collection of diverse human HCDR1 and/or HCDR2.
  • the present invention also relates to a common light chain antibody library comprising a human HCDR1 and/or a human HCDR2 obtained by the methods disclosed herein.
  • the common light chain antibody library further comprises a human HCDR3 being a naturally occurring HCDR3.
  • Therapeutic antibodies are currently one of the fastest growing classes of drugs and are approved for the treatment of several indications, from cancer to autoimmune disease. However, the identification of therapeutically useful antibodies, namely antibodies capable of binding a specific selected target and having the desired properties, is not trivial.
  • Methods for identifying desirable antibodies may involve phage display of antibodies or antibody fragments, for example by construction of human libraries derived by amplification of nucleic acids from B cells or tissues, or by construction of synthetic or semi-synthetic libraries.
  • the antibody diversity is designed in silico and synthesized in a controlled manner.
  • Synthetic or semi-synthetic antibody libraries can be constructed by introducing randomized complementarity determining region (CDR) sequences into antibody frameworks.
  • the present disclosure is directed towards addressing such limitations and therefore aims at designing and introducing diversity in the heavy chain CDRs, to create common light chain antibody synthetic or semi-synthetic libraries which can improve the possibility of identifying and generating common light chain multispecific antibodies for therapeutic applications.

Summary of the invention

  • the present invention relates to a first method to generate a collection of diverse human HCDR1 and/or HCDR2, wherein said human HCDR1 corresponds to the Kabat positions 27 to 35, and wherein said human HCDR2 corresponds to the Kabat positions 50 to 58, comprising the steps of: a. Performing Next-Generation-Sequencing of human IgG antibody repertoire; b. Selecting unique HCDR1 and/or HCDR2 amino acid sequences of a human antibody variable heavy chain germline amino acid sequence, thus obtaining a HCDR1 and/or HCDR2 amino acid sequence dataset; c. Removing from said HCDR1 and/or HCDR2 amino acid sequence dataset:
  • the present invention also relates to a second method to generate a collection of diverse human HCDR1 and/or HCDR2, wherein said human HCDR1 corresponds to the Kabat position 25 to 36, and wherein said human HCDR2 corresponds to the Kabat position 47 to 65, comprising the steps of: a. Performing Next-Generation-Sequencing of human IgG antibody repertoire; b. Selecting unique HCDR1 and/or HCDR2 amino acid sequences for a human antibody variable heavy chain germline amino acid sequence thus obtaining a HCDR1 and/or HCDR2 amino acid sequence dataset. c.
  • said plurality of amino acids is made of the naturally occurring amino acid at each position, excluding amino acids with frequency less than 1%, Cys, and optionally Met;
  • said plurality of amino acids is made of any amino acid, excluding Cys, Met, Trp, and optionally Pro.
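The per-position rule described above (keep the naturally occurring amino acids at each position, excluding those with a frequency below 1%, Cys, and optionally Met) can be illustrated with a minimal sketch. It assumes the CDR sequences are already aligned to equal length with no gap characters; the function name `allowed_residues`, its parameters, and the example sequences are hypothetical and are not taken from the source.

```python
from collections import Counter


def allowed_residues(aligned_cdrs, min_freq=0.01, excluded=frozenset("CM")):
    """Return, for each position, a dict {amino acid: frequency} of the
    residues observed at that position, keeping only residues whose
    frequency is at least `min_freq` and that are not in `excluded`.

    Assumption: all sequences have the same length (already aligned).
    """
    if not aligned_cdrs:
        return []
    length = len(aligned_cdrs[0])
    if any(len(seq) != length for seq in aligned_cdrs):
        raise ValueError("sequences must be aligned to the same length")

    per_position = []
    for pos in range(length):
        counts = Counter(seq[pos] for seq in aligned_cdrs)
        total = sum(counts.values())
        kept = {
            aa: n / total
            for aa, n in counts.items()
            if n / total >= min_freq and aa not in excluded
        }
        per_position.append(kept)
    return per_position


# Toy input, for illustration only.
example = ["GYTF", "GYSF", "GCTF", "GYTM"]
profile = allowed_residues(example)
```

Passing `excluded=frozenset("C")` keeps Met, which corresponds to the "optionally Met" wording; the alternative rule (any amino acid except Cys, Met, Trp, and optionally Pro) would need no frequency data at all.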
  • the human antibody variable heavy chain germline of the methods disclosed herein is selected from the group comprising VH1-46 (SEQ ID NO: 1), VH1-69 (SEQ ID NO: 2) and VH3-23 (SEQ ID NO: 3).
  • Disclosed herein is also a collection of diverse human HCDR1 and/or HCDR2 obtained by the methods above.
  • the collection of diverse human HCDR1 and/or HCDR2 is obtained by the second method, wherein said HCDR1 is encoded as trimer oligonucleotides in VH1-69 germline according to the frequency of Table 3, or said HCDR1 is encoded as trimer oligonucleotides in VH1-46 germline according to the frequency of Table 4, or said HCDR1 is encoded as trimer oligonucleotides in VH3-23 germline according to the frequency of Table 5; and said HCDR2 is encoded as trimer oligonucleotides in VH1-69 germline according to the frequency of Table 6, or said HCDR2 is encoded as trimer oligonucleotides in VH1-46 germline according to the frequency of Table 7, or said HCDR2 is encoded as trimer oligonucleotides in VH3-23 germline according to the frequency of Table 8.
  • the present invention also relates to a library of antibody binding regions, wherein said antibody binding regions comprise a heavy chain variable domain comprising a human HCDR1 and/or a human HCDR2 obtained by the methods above and a common light chain variable domain.
  • the common light chain variable domain is a VK3-15 or a VK1-39 variable light chain.
  • the common light chain variable domain is VK3-15/JK1 (SEQ ID NO: 4).
  • the heavy chain variable domain of the antibody binding regions of the library disclosed herein further comprises a naturally occurring heavy chain framework region.
  • the naturally occurring heavy chain framework region is derived from a human antibody variable heavy chain germline selected from the group comprising VH1-46 (SEQ ID NO: 1), VH1-69 (SEQ ID NO: 2) and VH3-23 (SEQ ID NO: 3).
  • the heavy chain variable domain of the antibody binding regions of the library disclosed herein further comprises a human HCDR3 being a naturally occurring HCDR3 from a human IgM repertoire.
  • the library disclosed herein is a phage display library.
  • the present invention also relates to a method for identifying an antibody binding region from the library disclosed herein, wherein said antibody binding region specifically binds to an antigen target of interest comprising the steps of (i) panning on said antigen and (ii) screening said antibody binding region that specifically binds to said antigen.
  • the present invention further relates to an antibody binding region identified according to the method mentioned above for identifying an antibody binding region.
  • the present invention relates to methods for generating a collection of diverse human HCDR1 and/or HCDR2 starting from the sequences of the human IgG repertoire.
  • Sequences of the human IgG repertoire can be obtained by any technique known in the art. In specific embodiments said sequences are obtained by Next Generation Sequencing of the human IgG repertoire.
  • CDR diversity refers to the heterogeneity in the amino acid sequence of a given CDR, wherein said heterogeneity is given by variation in the amino acid type at each position of a given CDR.
  • Such heterogeneity can be naturally derived or can be designed in silico.
  • naturally occurring refers to the fact that the object can be found in nature and has not been manipulated by man.
  • a polynucleotide or polypeptide that is present in an organism (including viruses) that can be isolated from a source in nature and that has not been intentionally modified by man is naturally-occurring.
  • non-naturally occurring refers to an object that is not found in nature or that has been structurally modified or synthesized by man.
  • Stereoisomers (e.g., D-amino acids), unnatural amino acids and analogs such as α-,α-disubstituted amino acids, N-alkyl amino acids, lactic acid, and other unconventional amino acids may also be suitable components for the polypeptide chains of the binding proteins.
  • Examples of unconventional amino acids include: 4-hydroxyproline, γ-carboxyglutamate, ε-N,N,N-trimethyllysine, ε-N-acetyllysine, O-phosphoserine, N-acetylserine, N-formylmethionine, 3-methylhistidine, 5-hydroxylysine, σ-N-methylarginine, and other similar amino acids and imino acids (e.g., 4-hydroxyproline).
  • the left-hand direction is the amino terminal direction and the right-hand direction is the carboxyl-terminal direction, in accordance with standard usage and convention.
  • Naturally occurring residues may be divided into classes based on common side chain properties:
  • Conservative amino acid substitutions may involve exchange of a member of one of these classes with another member of the same class.
  • Non-conservative substitutions may involve the exchange of a member of one of these classes for a member from another class.
  • a skilled artisan will be able to determine suitable variants of the polypeptide chains of the binding proteins using well-known techniques. For example, one skilled in the art may identify suitable areas of a polypeptide chain that may be changed without destroying activity by targeting regions not believed to be important for activity. Alternatively, one skilled in the art can identify residues and portions of the molecules that are conserved among similar polypeptides. In addition, even areas that may be important for biological activity or for structure may be subject to conservative amino acid substitutions without destroying the biological activity or without adversely affecting the polypeptide structure.
  • the terms "antibody" and "immunoglobulin" as used herein are interchangeable, and they refer to whole antibodies as well as to any antigen binding fragments or single chains thereof.
  • Naturally occurring antibodies typically comprise a tetramer. Each such tetramer is typically composed of two identical pairs of polypeptide chains, each pair having one full-length "light" chain (typically having a molecular weight of about 25 kDa) and one full-length "heavy" chain (typically having a molecular weight of about 50-70 kDa).
  • the terms “heavy chain” and “light chain” as used herein refer to any immunoglobulin polypeptide having sufficient variable domain sequence to confer specificity for a target antigen.
  • each light and heavy chain typically includes a variable domain of about 100 to 110 or more amino acids that typically is responsible for antigen recognition.
  • the carboxy-terminal portion of each chain typically defines a constant domain responsible for effector function.
  • a full-length heavy chain immunoglobulin polypeptide includes a heavy chain variable domain (VH), also referred herein as variable heavy chain, and three constant domains (CH1, CH2, and CH3), wherein the VH domain is at the amino-terminus of the polypeptide and the CH3 domain is at the carboxyl-terminus.
  • a full-length light chain immunoglobulin polypeptide includes a light chain variable domain (VL), also referred herein as variable light chain, and a constant domain (CL), wherein the VL domain is at the amino-terminus of the polypeptide and the CL domain is at the carboxyl-terminus.
  • Human light chains are typically classified as kappa and lambda light chains, and human heavy chains are typically classified as mu, delta, gamma, alpha, or epsilon, and define the antibody's isotype as IgM, IgD, IgG, IgA, and IgE, respectively.
  • IgG has several subclasses, including, but not limited to, IgG1, IgG2, IgG3, and IgG4.
  • IgM has subclasses including, but not limited to, IgM1 and IgM2.
  • IgA is similarly subdivided into subclasses including, but not limited to, IgA1 and IgA2.
  • variable and constant domains typically are joined by a "J" region of about 12 or more amino acids, with the heavy chain also including a "D" region of about 10 more amino acids.
  • the variable regions of each light/heavy chain pair typically form an antigen binding site.
  • the variable domains of naturally occurring antibodies typically exhibit the same general structure of relatively conserved framework regions (FR) joined by three hypervariable regions, also called complementarity determining regions or CDRs.
  • both light and heavy chain variable domains typically comprise the domains FR1, CDR1, FR2, CDR2, FR3, CDR3, and FR4.
  • Heavy chain CDRs are herein also abbreviated as HCDR (HCDR1, HCDR2, and HCDR3), or CDRH (CDRH1, CDRH2, and CDRH3).
  • Light chain CDRs are herein also abbreviated as LCDR (LCDR1, LCDR2, and LCDR3), or CDRL (CDRL1, CDRL2, and CDRL3).
  • CDR boundary definitions may not strictly follow one of the systems described herein, but will nonetheless overlap with the Kabat CDRs, although they may be shortened or lengthened in light of prediction or experimental findings that particular residues or groups of residues or even entire CDRs do not significantly impact antigen binding.
  • the methods used herein may utilize CDRs defined according to any of these systems, although certain embodiments use Kabat or Chothia defined CDRs. Identification of predicted CDRs using the amino acid sequence is well known in the field, such as in Martin, A.C. "Protein sequence and structure analysis of antibody variable domains," In Antibody Engineering, Vol. 2.
  • the amino acid sequence of the heavy and/or light chain variable domain may be also inspected to identify the sequences of the CDRs by other conventional methods, e.g., by comparison to known amino acid sequences of other heavy and light chain variable regions to determine the regions of sequence hypervariability.
  • the numbered sequences may be aligned by eye, or by employing an alignment program such as one of the CLUSTAL suite of programs, as described in Thompson, 1994, Nucleic Acids Res. 22: 4673-80.
  • Molecular models are conventionally used to correctly delineate framework and CDR regions and thus correct the sequence-based assignments. All such alternative definitions are encompassed by the current invention and the sequences provided in this specification are not intended to exclude alternatively defined CDR sequences which may only comprise a portion of the CDR sequences provided in the sequence listing.
  • Antibody fragments include, but are not limited to, (i) the Fab fragment consisting of VL, VH, CL and CH1 domains, including Fab' and Fab'-SH, (ii) the Fd fragment consisting of the VH and CH1 domains, (iii) the Fv fragment consisting of the VL and VH domains of a single antibody; (iv) the dAb fragment (Ward ES et al., (1989) Nature, 341: 544-546) which consists of a single variable domain, (v) F(ab')2 fragments, a bivalent fragment comprising two linked Fab fragments, (vi) single chain Fv molecules (scFv), wherein a VH domain and a VL domain are linked by a peptide linker which allows the two domains to associate to form an antigen binding site (Bird RE et al., (1988) Science 242: 423-426; Huston JS et al., (1988) Proc.
  • Fc refers to a molecule comprising the sequence of a non-antigen-binding fragment resulting from digestion of an antibody or produced by other means, whether in monomeric or multimeric form, and can contain the hinge region.
  • the original immunoglobulin source of the native Fc is preferably of human origin and can be any of the immunoglobulins.
  • Fc molecules are made up of monomeric polypeptides that can be linked into dimeric or multimeric forms by covalent (i.e., disulfide bonds) and non-covalent association.
  • the number of intermolecular disulfide bonds between monomeric subunits of native Fc molecules ranges from 1 to 4 depending on class (e.g., IgG, IgA, and IgE) or subclass (e.g., IgG1, IgG2, IgG3, IgA1, IgGA2, and IgG4).
  • a Fc is a disulfide-bonded dimer resulting from papain digestion of an IgG.
  • native Fc as used herein is generic to the monomeric, dimeric, and multimeric forms.
  • a F(ab) fragment typically includes one light chain and the VH and CH1 domains of one heavy chain, wherein the VH-CH1 heavy chain portion of the F(ab) fragment cannot form a disulfide bond with another heavy chain polypeptide.
  • a F(ab) fragment can also include one light chain containing two variable domains separated by an amino acid linker and one heavy chain containing two variable domains separated by an amino acid linker and a CH1 domain.
  • a F(ab') fragment typically includes one light chain and a portion of one heavy chain that contains more of the constant region (between the CH1 and CH2 domains), such that an interchain disulfide bond can be formed between two heavy chains to form a F(ab')2 molecule.
  • antibody binding regions refers to one or more portions of an immunoglobulin or antibody variable region capable of binding an antigen(s).
  • Antibody binding regions include whole antibodies as well as any antigen binding fragments or single chains thereof.
  • the antibody binding region is, for example, an antibody light chain, or a light chain variable domain (VL), an antibody heavy chain, or a heavy chain variable domain (VH), a heavy chain Fd region, a combined antibody light and heavy chain (or variable region thereof) such as a Fab, F(ab')2, single domain, or single chain antibody (scFv), any of the antibody fragments mentioned above, or a full length antibody, for example, an IgG (e.g., an IgG1, IgG2, IgG3, or IgG4 subtype), IgA1, IgA2, IgD, IgE, or IgM antibody.
  • antigen or “target antigen” or “antigen target” or “target” as used herein refers to a molecule or a portion of a molecule that is capable of being bound by an antibody binding region, and/or that is capable of being used in an animal to produce antibodies capable of binding to an epitope of that antigen.
  • a target antigen may have one or more epitopes.
  • the binding protein is capable of competing with an intact antibody that recognizes the target antigen.
  • the antigen is bound by an antibody binding region, or generally by a binding protein, such as an antibody, via an antigen binding site, also referred herein as "binding portion" or "binding domain".
  • epitope includes any determinant, preferably a polypeptide determinant, capable of specifically binding to an immunoglobulin or T-cell receptor.
  • epitope determinants include chemically active surface groupings of molecules such as amino acids, sugar side chains, phosphoryl groups, or sulfonyl groups, and, in certain embodiments, may have specific three- dimensional structural characteristics and/or specific charge characteristics.
  • An epitope is a region of an antigen that is bound by an antibody or binding protein.
  • a binding protein is said to specifically bind an antigen when it preferentially recognizes its target antigen in a complex mixture of proteins and/or macromolecules.
  • a binding protein is said to specifically bind an antigen when the equilibrium dissociation constant is ≤ 10⁻⁸ M, more preferably when the equilibrium dissociation constant is ≤ 10⁻⁹ M, and most preferably when the dissociation constant is ≤ 10⁻¹⁰ M.
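As a worked illustration of these affinity tiers, the sketch below computes an equilibrium dissociation constant from kinetic rate constants using the standard relation KD = k_off / k_on and compares it with the three thresholds of 10⁻⁸ M, 10⁻⁹ M and 10⁻¹⁰ M. It assumes the comparison is "at or below" each threshold; the function names, tier labels, and example rate constants are hypothetical and do not come from the source.

```python
def dissociation_constant(k_off, k_on):
    """Equilibrium dissociation constant KD in mol/L.

    k_off: dissociation rate constant in 1/s
    k_on:  association rate constant in 1/(M*s)
    """
    if k_on <= 0 or k_off < 0:
        raise ValueError("k_on must be positive and k_off non-negative")
    return k_off / k_on


def binding_tier(kd_molar):
    """Map a KD value (M) onto the three tiers; labels are illustrative."""
    if kd_molar <= 1e-10:
        return "most preferred"
    if kd_molar <= 1e-9:
        return "more preferred"
    if kd_molar <= 1e-8:
        return "specific"
    return "below threshold"


# Illustrative rate constants only (not measured values).
kd = dissociation_constant(k_off=5e-4, k_on=1e5)  # about 5e-9 M
tier = binding_tier(kd)
```

A lower KD means tighter binding, so each successive tier is a stricter requirement than the one before it.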
  • the dissociation constant (KD) of a binding protein can be determined, for example, by surface plasmon resonance.
  • surface plasmon resonance analysis measures real-time binding interactions between ligand (a target antigen on a biosensor matrix) and analyte (a binding protein in solution) by surface plasmon resonance (SPR) using the BIAcore system (GE).
  • Surface plasmon analysis can also be performed by immobilizing the analyte (binding protein on a biosensor matrix) and presenting the ligand (target antigen).
  • KD refers to the dissociation constant of the interaction between a particular binding protein and a target antigen.
  • binding protein refers to the ability of a binding protein or an antigen-binding fragment thereof to bind to an antigen containing an epitope with a KD of at least about 1 x 10⁻⁶ M, 1 x 10⁻⁷ M, 1 x 10⁻⁸ M, 1 x 10⁻⁹ M, 1 x 10⁻¹⁰ M, 1 x 10⁻¹¹ M, 1 x 10⁻¹² M, or less, and/or to bind to an epitope with an affinity that is at least two-fold greater than its affinity for a nonspecific antigen.
  • the present invention relates generally to a method to generate a collection of diverse human HCDR1 and/or HCDR2, and/or HCDR3, and/or LCDR1, and/or LCDR2, and/or LCDR3, comprising the steps of: a. Performing Next-Generation-Sequencing of human IgG antibody repertoire; b. Selecting unique HCDR1 and/or HCDR2 amino acid sequences of a human antibody variable heavy chain germline amino acid sequence, thus obtaining a HCDR1 and/or HCDR2 amino acid sequence dataset; c. Removing from said HCDR1 and/or HCDR2 amino acid sequence dataset (i) sequences comprising about 60% or more amino acid substitutions per CDR compared to said human antibody variable heavy chain germline amino acid sequence (Somatic-Hyper-Mutations); (ii) sequences comprising one or more motifs of Table 1; and (iii) sequences encompassing immunogenic peptides.
  • the present invention relates to a method for generating a collection of diverse human HCDR1 and/or HCDR2 starting from the sequences of the human IgG repertoire, for instance starting from the sequences obtained by Next Generation Sequencing of the human IgG repertoire, wherein said human HCDR1 corresponds to the Kabat position 27 to 35, and wherein said human HCDR2 corresponds to the Kabat position 50 to 58.
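The Kabat position ranges used to delimit the CDRs (27 to 35 and 50 to 58 here, 25 to 36 and 47 to 65 in the second method) can be applied to a heavy chain that has already been Kabat-numbered. The sketch below is a minimal illustration: it assumes the numbering is supplied by an external tool as ordered (label, residue) pairs, and it assumes that insertion positions such as 52A belong to the range of their base number. The function name `extract_cdr` and the toy residues are hypothetical.

```python
import re

# Kabat labels are an integer optionally followed by an insertion letter, e.g. "52A".
_KABAT_LABEL = re.compile(r"^(\d+)([A-Z]?)$")


def extract_cdr(numbered_residues, start, end):
    """Concatenate the residues whose Kabat base number lies in [start, end].

    numbered_residues: iterable of (kabat_label, one_letter_residue) pairs,
    in sequence order. Numbering itself is NOT performed here.
    """
    residues = []
    for label, residue in numbered_residues:
        match = _KABAT_LABEL.match(label)
        if match is None:
            raise ValueError(f"unrecognized Kabat label: {label!r}")
        if start <= int(match.group(1)) <= end:
            residues.append(residue)
    return "".join(residues)


# Toy numbered fragment, for illustration only.
toy_vh = [("25", "S"), ("26", "G"), ("27", "Y"), ("28", "T"), ("29", "F"),
          ("30", "T"), ("31", "S"), ("32", "Y"), ("33", "Y"), ("34", "M"),
          ("35", "H"), ("36", "W")]
hcdr1_first_method = extract_cdr(toy_vh, 27, 35)
hcdr1_second_method = extract_cdr(toy_vh, 25, 36)
```

The same call with 50 to 58 or 47 to 65 would delimit HCDR2 under the first or second method, respectively.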
  • the first method disclosed herein to generate a collection of diverse human HCDR1 and/or HCDR2, wherein said human HCDR1 corresponds to the Kabat positions 27 to 35, and wherein said human HCDR2 corresponds to the Kabat positions 50 to 58, comprises a step of performing Next-Generation-Sequencing of human IgG antibody repertoire.
  • the first method disclosed herein to generate a collection of diverse human HCDR1 and/or HCDR2, wherein said human HCDR1 corresponds to the Kabat positions 27 to 35, and wherein said human HCDR2 corresponds to the Kabat positions 50 to 58 also comprises the steps of selecting unique HCDR1 and/or HCDR2 amino acid sequences of a human antibody variable heavy chain germline amino acid sequence, thus obtaining a HCDR1 and/or HCDR2 amino acid sequence dataset, and removing from said HCDR1 and/or HCDR2 amino acid sequence dataset all the sequences harboring more than about 50%, or more than about 60%, or more than about 70%, preferably more than about 60% of Somatic-Hyper-Mutations, wherein the term "Somatic-Hyper-Mutations" refers herein to any amino acid different from the amino acid present in the germline. More particularly, the method comprises the step of removing sequences comprising about 60% or more amino acid substitutions per CDR.
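The Somatic-Hyper-Mutation filter described above can be sketched as follows. The example counts, position by position, the residues that differ from the germline CDR and removes sequences in which the substituted fraction is 60% or more. It assumes each CDR has the same length as the germline CDR, so that positions can be compared directly; the function names, the default threshold parameter, and the placeholder sequences are hypothetical.

```python
def substitution_fraction(cdr, germline_cdr):
    """Fraction of positions at which `cdr` differs from `germline_cdr`.

    Assumption: both sequences have the same, non-zero length.
    """
    if not germline_cdr or len(cdr) != len(germline_cdr):
        raise ValueError("CDR and germline CDR must have equal, non-zero length")
    differences = sum(1 for a, b in zip(cdr, germline_cdr) if a != b)
    return differences / len(germline_cdr)


def remove_hypermutated(cdrs, germline_cdr, max_fraction=0.60):
    """Keep only sequences with fewer than `max_fraction` substitutions,
    i.e. drop those with about 60% or more by default."""
    return [c for c in cdrs
            if substitution_fraction(c, germline_cdr) < max_fraction]


# Placeholder germline CDR and variants, for illustration only.
germline = "STSTSTSTST"
candidates = ["STSTSTSTST",   # 0 of 10 positions substituted
              "AAAAATSTST",   # 5 of 10 substituted, kept
              "AAAAAASTST"]   # 6 of 10 substituted, removed
kept = remove_hypermutated(candidates, germline)
```

Setting `max_fraction` to 0.50 or 0.70 gives the alternative thresholds mentioned in the text.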
  • the present method comprises a step of removing from said HCDR1 and/or HCDR2 amino acid sequence dataset, sequences comprising one or more liabilities.
  • the term "liability" refers to a motif that makes the molecule prone to post-translational modification, or poly-reactivity, or self-association.
  • a liability is selected from the group comprising: unpaired cysteines, oxidation sites, N-glycosylation sites, non-canonical N-glycosylation, Asparagine/Glutamine deamidation sites, Aspartate isomerization sites, fragmentation sites, O-glycosylation sites, Lysine glycation, Tyrosine sulphation, integrin αvβ3 binding sites, integrin α4β1 binding sites, integrin α2β1 binding sites, CD11c/CD18 binding sites, hydrophobic sites.
  • a liability is one or more motifs selected from the motifs listed in Table 1.
  • the present method comprises a step of removing from said HCDR1 and/or HCDR2 amino acid sequence dataset sequences comprising one or more motifs of Table 1.
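Removing sequences that carry a liability motif amounts to a pattern scan over each CDR. Table 1 is not reproduced in this text, so the sketch below uses a small set of commonly cited example patterns (the N-X-S/T N-glycosylation sequon with X not Pro, NG/NS deamidation, DG/DS isomerization, and any Cys) purely as stand-ins; these patterns, the dictionary name, and the function names are assumptions and should not be read as the source's motif list.

```python
import re

# Stand-in patterns, NOT the motifs of Table 1 (which is not reproduced here).
EXAMPLE_LIABILITY_MOTIFS = {
    "n_glycosylation": re.compile(r"N[^P][ST]"),   # N-X-S/T sequon, X != P
    "asn_deamidation": re.compile(r"N[GS]"),
    "asp_isomerization": re.compile(r"D[GS]"),
    "cysteine": re.compile(r"C"),
}


def find_liabilities(cdr, motifs=EXAMPLE_LIABILITY_MOTIFS):
    """Return the sorted names of all motifs found anywhere in `cdr`."""
    return sorted(name for name, pattern in motifs.items()
                  if pattern.search(cdr))


def remove_liabilities(cdrs, motifs=EXAMPLE_LIABILITY_MOTIFS):
    """Keep only sequences in which no motif is found."""
    return [c for c in cdrs if not find_liabilities(c, motifs)]


# Toy sequences, for illustration only.
toy_cdrs = ["GYNFTSY", "INPSGGST", "IDGSNGY"]
clean = remove_liabilities(toy_cdrs)
```

Keeping the motifs in a dictionary means the actual Table 1 entries could be substituted without changing the filtering logic.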
  • the present method also comprises a step of removing from said CDRs sequences dataset sequences encompassing immunogenic peptides.
  • sequences encompassing immunogenic peptides are detected by the Immune Epitope Database (IEDB) CD4 T cell immunogenicity prediction tool http://tools.iedb.org/CD4episcore/ (ref: Dhanda et al.: Prediction of HLA CD4 immunogenicity in human populations, Frontiers in Immunology, 2018, 9, 1369; Paul et al.: Development and validation of a broad scheme for prediction of HLA class II restricted T cell epitopes, Journal of Immunological Methods, 2015, 422, 28-34). More particularly, the sequences encompassing immunogenic peptides were discarded by filtering with the Episcore in silico immunogenicity prediction tool, using a threshold value of 50% and the default Immune Epitope Database prediction method "IEDB".
  • the present invention relates to a method to generate a collection of diverse human HCDR1 and/or HCDR2, wherein said human HCDR1 corresponds to the Kabat positions 27 to 35, and wherein said human HCDR2 corresponds to the Kabat positions 50 to 58, comprising the steps of: a. Performing Next-Generation-Sequencing of human IgG antibody repertoire; b. Selecting unique HCDR1 and/or HCDR2 amino acid sequences of a human antibody variable heavy chain germline amino acid sequence, thus obtaining a HCDR1 and/or HCDR2 amino acid sequence dataset; c. Removing from said HCDR1 and/or HCDR2 amino acid sequence dataset (i) sequences comprising about 60% or more amino acid substitutions per CDR compared to said human antibody variable heavy chain germline amino acid sequence (Somatic-Hyper-Mutations); (ii) sequences comprising one or more motifs of Table 1; and (iii) sequences encompassing immunogenic peptides; wherein said sequences encompassing immunogenic peptides are detected by the Immune Epitope Database (IEDB) CD4 T cell immunogenicity prediction tool (http://tools.iedb.org/CD4episcore/).
  • the present invention also relates generally to a second method to generate a collection of diverse human HCDR1 and/or HCDR2, and/or HCDR3, and/or LCDR1, and/or LCDR2, and/or LCDR3, comprising the steps of: a. Performing Next-Generation-Sequencing of human IgG antibody repertoire; b. Selecting unique HCDR1 and/or HCDR2 amino acid sequences for a human antibody variable heavy chain germline amino acid sequence thus obtaining a HCDR1 and/or HCDR2 amino acid sequence dataset; c. Aligning the amino acid sequences of HCDR1 and/or HCDR2 of said HCDR1 and/or HCDR2 amino acid sequence dataset; d.
  • unique CDR sequences refers generally to CDR sequences that are all different from one another.
  • unique CDR sequence is used herein to indicate that a CDR sequence, even when present multiple times in the human IgG repertoire, is extracted, and/or used, and/or considered only once, so that unique CDR sequences are all different from one another.
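Selecting unique CDR sequences in this sense is a de-duplication step: each distinct sequence is retained once, however many times it occurs in the repertoire. A minimal sketch follows, assuming the CDRs are plain one-letter strings; the function name and the toy repertoire are hypothetical.

```python
def unique_cdrs(cdrs):
    """Return each distinct CDR sequence once, in order of first appearance."""
    # dict preserves insertion order, so the first occurrence fixes the position.
    return list(dict.fromkeys(cdrs))


# Toy repertoire, for illustration only.
repertoire = ["SYAMS", "SYAMS", "SYGMH", "SYAMS", "SYGMH", "DYAMH"]
unique = unique_cdrs(repertoire)
```

Preserving first-appearance order is a convenience of this sketch rather than a requirement of the definition; an unordered set would satisfy it equally.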
  • the present invention also relates to a collection of diverse human HCDR1 and/or HCDR2 obtained by the methods disclosed herein.
  • the present invention further relates to a library of antibody binding regions.
  • the present invention further relates to a library of antibody binding regions comprising a heavy chain variable domain comprising a human HCDR1 and/or a human HCDR2 obtained by the methods disclosed herein.
  • library of antibody binding regions refers to two or more antibody binding regions having a diversity as described herein, specifically designed according to the methods of the invention.
  • library is used herein in its broadest sense, and also may include the sub-libraries that may or may not be combined to produce libraries of the invention.
  • an antibody display technology is used to generate library of antibody binding regions, in particular to display on a system the protein of interest, i.e., the antibody binding regions, to enable the isolation of molecules with given properties.
  • the protein of interest is linked to the genetic information or, generally an anchor of said system, wherein the system is selected from a group comprising: phages, yeasts, mammalian cells, ribosomes, DNA.
  • Antibody display technologies include but are not limited to phage display (McCafferty et al., 1990; Bazan et al., 2012, Valldorf et al., 2022), yeast display (Boder ET and Wittrup KD, 1997; Weaver-Feldhaus JM et al., 2004), bacterial display (van Blarcom TJ and Harvey BR, 2009. Bacterial display of antibodies.
  • In vivo systems such as phage display, bacterial display, yeast display, mammalian display, usually rely on anchoring the expressed antibody on a cell surface protein to maintain the phenotype-genotype linkage.
  • In vitro systems anchor the transcription unit through a linker to the anchor (ribosome, puromycin or polystyrene beads) to maintain the phenotype-genotype linkage.
  • This principle of genotype phenotype coupling allows for 'barcoding' of up to several billion different protein variants of which binders can be selected via high-throughput identification in an iterative process (Valldorf et al., 2022).
  • the present invention relates to a phage display library of antibody binding regions comprising a heavy chain variable domain comprising a human HCDR1 and/or a human HCDR2 obtained by the methods disclosed herein, and a common light chain variable domain, namely a light chain variable domain that is identical in all the antibody binding regions of the phage display library.
  • the common light chain variable domain is any one selected from the germline repertoires: IGKV/IGKJ, IGLV/IGLJ (https://www.imgt.org/IMGTrepertoire/Proteins/).
  • the common light chain variable domain is a VK3-15 or a VK1-39 variable light chain. More in particular, the common light chain variable domain is VK3-15/JK1 (SEQ. ID NO: 4).
  • the phage display library of antibody binding regions disclosed herein comprises a heavy chain variable domain further comprising a heavy chain framework region.
  • framework region refers to the art recognized portions of an antibody variable region that exist between the more divergent CDR regions.
  • Such framework regions are typically referred to as frameworks 1 through 4 (FR1, FR2, FR3, and FR4) and provide a scaffold for holding, in three- dimensional space, the three CDRs found in a heavy or light chain antibody variable region, such that the CDRs can form an antigen-binding surface.
  • the framework region(s) used to support the one or more CDR sequences determined or obtained as described herein can be, e.g., naturally occurring, synthetic, semisynthetic, or combinations thereof.
  • the phage display library of antibody binding regions disclosed herein comprises a heavy chain variable domain further comprising naturally occurring heavy chain framework regions. More specifically, the naturally occurring heavy chain framework regions are derived from a VH germline listed above, e.g.
  • the library of antibody binding regions disclosed herein comprises a heavy chain variable domain further comprising a human HCDR3 being a natural occurring HCDR3 from a human IgM repertoire.
  • said natural occurring HCDR3 from a human IgM repertoire by keeping the HCDR3 cassettes disclosed herein separated based on their individual VH-families (VH1 sequences or VH3 sequences) or pooled across all VH amplified.
  • the library of antibody binding regions is a phage display library.
  • phage display library refers generally to a library generated by a phage display technique.
  • a phage display library is a collection of phages each expressing on their surface different binding proteins, such as different antibody binding regions, from one phage to another.
  • phage is used herein to refer to both bacteriophage and archaeophage.
  • Bacteriophage and archaeophage are obligate intracellular parasites (with respect to both the step of identifying a host cell to infect and to only being able to productively replicate their genome in an appropriate host cell) that infect and multiply inside bacteria/archaea by making use of some or all the host biosynthetic machinery.
  • although different bacteriophages and archaeophages may contain different materials, they all contain nucleic acids and proteins, and can, under certain circumstances, be encapsulated in a lipid membrane.
  • the nucleic acid may be either DNA or RNA (but typically not both) and it can exist in various forms, with the size of the nucleic acid depending on the phage.
  • Non-limiting examples of phage display systems that can be used in the present invention include: filamentous bacteriophages, such as f1, fd, M13; T4 phage display system; T7 phage display system and lambda phage display system.
  • Phage display libraries can be created either on phage or phagemid vector backbones (Scott JK and CF Barbas III. 2001. Phage-display vectors. In: Phage Display: A Laboratory Manual).
  • Phage vectors consist of an essentially complete phage genome, often M13 phage, into which is inserted DNA encoding the protein or peptide of interest.
  • phagemid refers to a DNA-based cloning vector, which has both bacteriophage and plasmid properties. These vectors carry, in addition to the origin of plasmid replication, an origin of replication derived from bacteriophage.
  • the phagemid is a plasmid that contains an f1 origin of replication from an f1 phage (Analysis of Genes and Genomes, John Wiley & Sons, S. 140 (2004)).
  • a phagemid can be used to clone DNA fragments and be introduced into a bacterial host by a range of techniques, such as transformation and electroporation. Phagemids are preferred both for monovalent display of antibody fragments, which allows selection by true affinity, and for the fact that transformation efficiencies in E. coli are far superior to those of phage vectors.
  • vector refers to any molecule (e.g., nucleic acid, plasmid, or virus) that is used to transfer coding information to a host cell.
  • the term “vector” includes a nucleic acid molecule that is capable of transporting another nucleic acid to which it has been linked.
  • plasmid refers to a circular double- stranded DNA molecule into which additional DNA segments may be inserted.
  • Another type of vector is a viral vector, wherein additional DNA segments may be inserted into the viral genome.
  • Certain vectors are capable of autonomous replication in a host cell into which they are introduced (e.g., bacterial vectors having a bacterial origin of replication and episomal mammalian vectors).
  • Other vectors (e.g., non-episomal mammalian vectors) can be integrated into the genome of a host cell upon introduction into the host cell and thereby are replicated along with the host genome.
  • certain vectors are capable of directing the expression of genes to which they are operatively linked.
  • Such vectors are referred to herein as "recombinant expression vectors” (or simply, “expression vectors”).
  • expression vectors of utility in recombinant DNA techniques are often in the form of plasmids.
  • plasmid and vector may be used interchangeably herein, as a plasmid is the most commonly used form of vector. This disclosure is intended to include any forms of expression vectors described above, which serve equivalent functions.
  • a phage display library can be used to screen for a binding protein of interest with a binding partner, such as an antigen target.
  • the formation of a complex between a phage, the binding protein expressed on its surface and said binding partner enables the determination as to which phage express the binding protein of interest.
  • the sequence on the vector is analyzed in order to obtain the exact sequences encoding the binding protein of interest.
  • the present invention relates to a method for identifying an antibody binding region that specifically binds to an antigen target of interest comprising the steps of (i) phage display panning on said antigen target and (ii) screening said antibody binding region that specifically binds to an antigen target of interest by contacting said antigen binding region with a target substrate.
  • panning or “biopanning” as used herein are interchangeable and refer to a sequence of steps in which a library, for instance a phage display library, is incubated with a target molecule, e.g. an antigen target; phages displaying specificity to the target bind to it, while unbound phages are washed out.
  • the specific phages are eluted and amplified in bacteria. After several rounds, amplified phages could be analyzed and further screened.
  • panning may be employed to select binders from antibody libraries.
  • Biopanning may be conducted in vitro by immobilizing pure antigens on solid surfaces such as polystyrene, or by biotinylating the pure antigens and immobilizing them on streptavidin-coated polystyrene surfaces. Biopanning may also be conducted in vitro by capturing the biotinylated antigens on streptavidin-coated magnetic microbeads. Biopanning may also be carried out against target antigens present on the surface or inside a living cell, or against antigens such as cell surface receptors stabilized in lipid bilayers. Several rounds of panning are required to enrich the specific binding subpopulation over the background. Furthermore, the small proportion of specific binders captured at each round of panning requires amplification of these binders in a host cell, such as bacteria.
  • a "phage host cell” or “host cell” or the like is a cell that can form phage from a particular type of phage genomic DNA.
  • the phage genomic DNA is introduced into the cell by infection of the cell by a phage.
  • the phage binds to a receptor molecule on the outside of the host cell and injects its genomic DNA into the host cell.
  • the phage genomic DNA is introduced into the cell using transformation or any other suitable techniques.
  • the host cell is a bacterium, more specifically an Escherichia coli (E. coli) bacterium.
  • the generated antibody binding regions are screened to select the ones that have the desired properties, such as desired binding affinity.
  • the screening comprises contacting the antigen binding region with a target substrate.
  • the screening of the expressed antibody binding region can be done by any appropriate means. For example, binding activity can be evaluated by standard immunoassay and/or affinity chromatography. Determining the ability of candidate antibodies to bind therapeutic targets can be assayed in vitro using, e.g., flow cytometry, surface plasmon resonance, and ELISA.
  • the binding regions selected from the common light chain libraries disclosed herein have advantageous developability properties, including a transient expression yield in HEK cells of between about 25 and 92 mg/mL; high solubility with a median percentage of monomers higher than 95%, preferably higher than 98%, more preferably higher than 99%, measured by SEC, and a median SEC retention time between 12 and 14 minutes; high thermostability with a melting temperature Tm higher than 80°C, e.g., equal to or higher than 82°C, 86°C, 87°C, 92°C, measured by Differential Scanning Fluorimetry (DSF).
  • the present invention also relates to a polynucleotide, e.g. an isolated polynucleotide such as an antibody binding region identified according to the method of above for identifying an antibody binding region that specifically binds to an antigen target of interest comprising the steps of (i) phage display panning on said antigen target and (ii) screening said antibody binding region that specifically binds to an antigen target of interest by contacting said antigen binding region with a target substrate.
  • antigen panning is made with any display system indicated above (e.g., phage, yeast, mammalian).
  • the panning is a phage display panning.
  • the antibody binding region can be one or more portions of an immunoglobulin or antibody variable region capable of binding an antigen(s).
  • Antibody binding regions include whole antibodies as well as any antigen binding fragments or single chains thereof as indicated above.
  • polynucleotide refers to single-stranded or double- stranded nucleic acid polymers of at least 10 nucleotides in length.
  • the nucleotides comprising the polynucleotide can be ribonucleotides or deoxyribonucleotides or a modified form of either type of nucleotide.
  • Such modifications include base modifications such as bromouridine, ribose modifications such as arabinoside and 2',3'-dideoxyribose, and internucleotide linkage modifications such as phosphorothioate, phosphorodithioate, phosphoroselenoate, phosphorodiselenoate, phosphoroanilothioate, phosphoraniladate and phosphoroamidate.
  • polynucleotide specifically includes single-stranded and double-stranded forms of DNA.
  • isolated polynucleotide is a polynucleotide of genomic, cDNA, or synthetic origin or some combination thereof, which: (1) is not associated with all or a portion of a polynucleotide in which the isolated polynucleotide is found in nature, (2) is linked to a polynucleotide to which it is not linked in nature, or (3) does not occur in nature as part of a larger sequence.
  • isolated polypeptide is one that: (1) is free of at least some other polypeptides with which it would normally be found, (2) is essentially free of other polypeptides from the same source, e.g., from the same species, (3) is expressed by a cell from a different species, (4) has been separated from at least about 50 percent of polynucleotides, lipids, carbohydrates, or other materials with which it is associated in nature, (5) is not associated (by covalent or noncovalent interaction) with portions of a polypeptide with which the "isolated polypeptide" is associated in nature, (6) is operably associated (by covalent or noncovalent interaction) with a polypeptide with which it is not associated in nature, or (7) does not occur in nature.
  • Such an isolated polypeptide can be encoded by genomic DNA, cDNA, mRNA or other RNA, of synthetic origin, or any combination thereof.
  • the isolated polypeptide is substantially free from polypeptides or other contaminants that are found in its natural environment that would interfere with its use (therapeutic, diagnostic, prophylactic, research or otherwise).
  • the binding region identified according to the methods disclosed herein is used for the development of therapeutic antibodies.
  • Said therapeutic antibody can be a full antibody, or an antibody fragment or single chain thereof as defined above.
  • the therapeutic antibody is a monoclonal antibody or antibody fragment thereof.
  • the term "monoclonal antibody” as used herein, refers to antibodies that are produced by clone cells all deriving from the same single cell, and that specifically bind the same epitope of the target antigen. When therapeutic antibodies are produced, the generation of monoclonal antibodies is preferred over polyclonal antibodies.
  • monoclonal antibodies are produced by cells originating from a single clone and all bind the same epitope, whereas polyclonal antibodies are produced by different immune cells and recognize multiple epitopes of a certain antigen.
  • Monoclonal antibodies assure batch to batch homogeneity, reduced cross-reactivity and high specificity toward the target.
  • Monoclonal antibodies can be expressed, for instance in host cells, using recombinant DNA, giving rise to a recombinant antibody.
  • the therapeutic antibody is a monospecific or multispecific antibody or antibody fragment thereof.
  • the term "monospecific antibody" as used herein, refers to any antibody or fragment having one or more binding sites, all binding the same epitope.
  • multispecific antibody refers to any antibody or fragment having more than one binding site that can bind different epitopes of the same antigen, or different antigens.
  • non-limiting examples of multispecific antibodies are bispecific, trispecific and tetraspecific antibodies.
  • the present invention further relates to an antibody binding region, e.g., an antibody binding region identified from the library disclosed herein.
  • the antibody binding region of the present invention binds to a target that is a polypeptide.
  • the target is a protein.
  • targets bound by the antibody binding region of the present invention include: ABCF1; ACVR1; ACVR1B; ACVR2; ACVR2B; ACVRL1; ADORA2A; ADRB3; Aggrecan; AGR2; AICDA; AIF1; AIG1; AKAP1; AKAP2; ALK; AMH; AMHR2; ANGPT1; ANGPT2; ANGPTL3; ANGPTL4; ANPEP; APC; APOC1; AR; AZGP1 (zinc-a-glycoprotein); B7.1; B7.2; BAD; BAFF; BAG1; BAI1; BCL2; BCL6; BDNF; BLNK; BLR1 (MDR15); BlyS; BMP1; BMP2; BMP3B (GDF10); BMP4; BM P
  • Figure 1 (A) HCDR1 and (B) HCDR2 normalized diversity scores for VH1-69 (diamonds), VH1-46 (triangles) and VH3-23 (circles) germlines. Residues showing a diversity score close to or higher than 75% of the highest calculated value, including positions 30, 31, 53, 56 and 58 of VH1-69, positions 31, 56 and 58 of VH1-46, and positions 30, 31, 55 and 56 of VH3-23, were diversified beyond the natural diversity in libraries A.
  • Figure 2 Distribution of the number of Somatic-Hyper-Mutations (i.e. substitutions compared to the germline sequence) in unique (A) HCDR1 Kabat 27-35 and (B) HCDR2 Kabat 50-58 sequences from VH1-69 (diamonds), VH1-46 (triangles) and VH3-23 (circles) extracted from human IgG repertoire. Sequences harboring more than 5 or 6 substitutions compared to the germline sequence were removed from HCDR1 or HCDR2 in libraries B, respectively.
  • HCDR1/2 cassettes were constructed using trimer oligonucleotides (library A), array-based synthesis (library B) or germline DNA template (library C).
  • Synthetic HCDR3 diversity was generated using trimer oligonucleotides (library 1) while natural VH-family specific and pooled HCDR3 diversity was harvested from 10 human donors (library 2 and 3, respectively).
  • Synthetic and semi-synthetic libraries were generated by assembling HCDR1/2 and HCDR3 DNA fragments by PCR and cloned into the pNGLEN phagemid vector.
  • HCDR3 length (amino-acid) distribution as observed in representatives for library 1 (VH1-69_A1, HCDR3 synthetic sequences), library 2 (VH1-69_A2, natural VH-family specific) and library 3 (VH1-69_A3, natural pooled sequences).
  • Figure 7 Comparison of the number of hits selected from the different HCDR1 and HCDR2 designs, including library A (generated using trimer oligonucleotides), library B (array-based synthesis) and library C (germline sequence), against CD47, CD28 and NKp46. NA: Not Applicable because this library does not exist.
  • Figure 8 Comparison of the number of clonotypes (>80% HCDR3 sequence identity) selected from the different HCDR1 and HCDR2 designs, including library A (generated using trimer oligonucleotides), library B (array-based synthesis) and library C (germline sequence), against CD47, CD28 and NKp46. NA: Not Applicable because this library does not exist.
  • Figure 9 Comparison of the percentage of hit sequences without sequence liabilities in HCDR1/2 selected from the different HCDR1 and HCDR2 designs, including library A (generated using trimer oligonucleotides) and library B (array-based synthesis), against CD47, CD28 and NKp46. NA: Not Applicable because this library does not exist, or no hits were selected.
  • Figure 10 Comparison of the number of hits and clonotypes, and percentage of hit sequences without sequence liabilities in HCDR3 selected from the different HCDR3 designs including Library 1 (synthetic), library 2 (natural VH-family specific) and library 3 (natural pooled sequences) against CD47, CD28 and NKp46.
  • Figure 22 Example of data processing for anti-NKp46 Fab epitope binning.
  • The sensorgram depicts an overlay of the signals obtained from all cycles of NKp46 and Fab 2 (analyte) binding to one immobilized Fab. Signal was normalized at the end of the antigen injection (first vertical band), the binning signal was read at the second vertical band with gating above the buffer cycles (horizontal band). Signals below the gate result from competing Fabs (no binding of the second Fab as analyte) and signals above the gate result from non-competing Fabs (binding of the second Fab).
  • Figure 23 Graphical representation of the anti-NKp46 Fabs epitope binning data and the distribution of Fabs according to their germline. Binders are represented as a circle when used as both ligand and analyte, as a square when used as analyte only (black squares: VH1-69 germline; white squares: VH1-46 germline). Competition is depicted as a straight chord and Fabs within a same bin are enclosed. Data was generated using the Carterra LSA™ instrument and the figure using Carterra Epitope™ software.
  • Figure 24 Graphical representation of the anti-NKp46 Fabs epitope binning data and the distribution of Fabs according to their HCDR1/2 origin.
  • Binders are represented as a circle when used as both ligand and analyte, as a square when used as analyte only (black squares: Trimer-based HCDR1/2; grey squares: Array-based HCDR1/2; white squares: Germline HCDR1/2). Competition is depicted as a straight chord and Fabs within a same bin are enclosed. Data was generated using the Carterra LSA™ instrument and the figure using Carterra Epitope™ software.
  • Figure 25 Graphical representation of the anti-NKp46 Fabs epitope binning data and the distribution of Fabs according to their HCDR3 origin. Binders are represented as a circle when used as both ligand and analyte, as a square when used as analyte only (black squares: Synthetic HCDR3; grey squares: VH-specific HCDR3; white squares: VH-pooled HCDR3). Competition is depicted as a straight chord and Fabs within a same bin are enclosed. Data was generated using the Carterra LSA™ instrument and the figure using Carterra Epitope™ software.
  • Example 1 Design of HCDR1 and HCDR2 libraries
  • NGS Next-Generation-Sequencing of human antibody repertoire was performed at Quintara Biosciences (US).
  • IgG variable domains were amplified from peripheral leukocytes mRNA (636170, TaKaRa) using a reverse IgG specific primer (sgatgggcccttggtggargc) (ref: Commonality despite exceptional diversity in the baseline human antibody repertoire, Bryan Briney, https://www.nature.com/articles/s41586-019-0879-y).
  • VH antibody variable heavy chain
  • TSO template switching oligo
  • VH amplification that incorporated unique molecular identifiers (read1 and read2) together with Illumina Adapter Sequences (P5 and P7, Illumina) was performed, resulting in a 500-550 bp amplicon library with the following structure: P5-Read1-TSO-UTR-Leader-VDJ-Read2-P7.
  • Read1 and Read2 sequences were subsequently used for MiSeq sequencing using a 2 x 300 bp chip.
  • Raw sequence reads corresponding to TSO-UTR-Leader-VDJ sequences were paired and analyzed using a custom NGS analysis pipeline.
  • the first strategy to introduce diversity in HCDR1 and HCDR2 took advantage of the germline-specific natural amino acid frequencies at each position with modifications.
  • Nucleotide VH sequences from IgG NGS data set were segregated according to their respective germlines upon successful alignment to reference IMGT VH nucleotide sequences.
  • unique HCDR1 and HCDR2 amino-acid sequence regions (Kabat 25-36 and Kabat 47-65, respectively) were extracted and aligned to identify natural residues frequencies in HCDR1 and HCDR2.
  • single positions mutation rates were calculated according to the occurrence of non-germline residues identified in each HCDR positions.
  • a diversity score representing their relative tolerance to residue substitution was calculated by multiplying the mutation rate with the summed number of amino acids required to explain 80% of the mutation rate.
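  • The per-position mutation rate and diversity score described above can be sketched as follows. This is one possible reading, not the pipeline actually used: it assumes equal-length aligned CDR sequences, takes the mutation rate as the fraction of unique sequences carrying a non-germline residue at a position, and counts the smallest number of most frequent substituting amino acids that together account for 80% of those mutations. The function name, the example sequences and the germline string are hypothetical.

```python
from collections import Counter


def diversity_scores(aligned_cdrs, germline, coverage=0.80):
    """Return (mutation_rate, n_residues, score) for each aligned position.

    Assumption: all sequences in `aligned_cdrs` have the same length as
    `germline`, so that index i refers to the same position everywhere.
    """
    n_seqs = len(aligned_cdrs)
    results = []
    for pos, germ_res in enumerate(germline):
        # Count the non-germline residues observed at this position.
        subs = Counter(seq[pos] for seq in aligned_cdrs if seq[pos] != germ_res)
        n_mut = sum(subs.values())
        mutation_rate = n_mut / n_seqs
        # Smallest number of most frequent substitutions that
        # together explain `coverage` of the mutations.
        covered, n_res = 0, 0
        for _, count in subs.most_common():
            if covered >= coverage * n_mut:
                break
            covered += count
            n_res += 1
        results.append((mutation_rate, n_res, mutation_rate * n_res))
    return results


# Hypothetical two-residue example: position 0 is variable, position 1 is not.
scores = diversity_scores(["SY", "TY", "TY", "NY", "AY"], germline="SY")
```

  • Dividing each score by the highest score of the CDR would give a normalized value comparable across positions, which is how the 75% threshold mentioned for Figure 1 could be applied under the same assumptions.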
  • the second strategy to introduce diversity in HCDR1 and HCDR2 used unique germline-specific HCDR1/2 sequences (Kabat 27-35 and Kabat 50-58, respectively) from the IgG NGS data set, with modifications.
  • sequences harboring more than 5 or 6 substitutions per CDR compared to the germline sequence have been removed from HCDR1 or HCDR2 libraries, respectively.
  • sequences containing sequence liabilities were removed and sequences encompassing immunogenic peptides were discarded upon filtering using the Episcore in-silico immunogenicity prediction tool with a threshold value of 50% (ref: Dhanda et.
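  • The substitution-count filter described above (more than 5 substitutions removed for HCDR1, more than 6 for HCDR2) can be sketched as below. The sketch assumes CDR sequences of the same length as the germline CDR and counts position-wise differences; the function names and the example sequences are hypothetical placeholders. The liability and immunogenicity filters are not reproduced here because their criteria are not detailed in this passage.

```python
def count_substitutions(cdr, germline):
    """Number of positions at which `cdr` differs from the germline CDR."""
    if len(cdr) != len(germline):
        raise ValueError("CDR and germline must have the same length")
    return sum(a != b for a, b in zip(cdr, germline))


def filter_by_substitutions(cdrs, germline, max_subs):
    """Keep the CDRs carrying at most `max_subs` substitutions."""
    return [c for c in cdrs if count_substitutions(c, germline) <= max_subs]


# Hypothetical germline HCDR1 and candidate sequences (placeholders only).
germline_hcdr1 = "GTFSSYAIS"
candidates = ["GTFSSYAIS", "AAAAAAAIS", "GTFSNYAIS"]
kept_hcdr1 = filter_by_substitutions(candidates, germline_hcdr1, max_subs=5)
```

  • For HCDR2 the same function would be called with max_subs=6, following the thresholds stated above.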
  • VH1-46 SEQ ID NO: 1
  • VH1-69 SEQ ID NO: 2
  • VH3-23 VH3-23
  • Table 2 Number of unique HCDR1 and HCDR2 regions from human IgG repertoire NGS analysis considered for the design of trimer oligonucleotides
  • Based on antibody structure, natural residue frequencies and diversity score analyses, residues Kabat 27-35 (referred to as HCDR1) and Kabat 50-58 (referred to as HCDR2) were considered for randomization. As described in the method, for the positions having the lower diversity score (<75% of the highest calculated value, Figure 1), residues occurring at <1% and C were excluded from the design, and M was not considered unless localized in a buried canonical position. Indeed, both C and M are high-risk Post-Translational Modification (PTM) motifs. All other amino acids were kept at naturally occurring ratios.
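  • The residue exclusion rule described above for lower-diversity positions can be sketched as follows. The sketch assumes that "kept at naturally occurring ratios" means the remaining frequencies are rescaled to sum to 1 while preserving their relative proportions; this rescaling, the function name, the parameter names and the example frequencies are all assumptions made for illustration.

```python
def design_frequencies(natural_freqs, min_freq=0.01, buried_canonical=False):
    """Apply the exclusion rule to one position and rescale what remains.

    natural_freqs: dict mapping one-letter amino acids to observed frequencies.
    buried_canonical: if True, methionine (M) is allowed at this position.
    """
    excluded = {"C"} if buried_canonical else {"C", "M"}
    kept = {
        aa: freq
        for aa, freq in natural_freqs.items()
        if freq >= min_freq and aa not in excluded
    }
    total = sum(kept.values())
    if total == 0:
        raise ValueError("no residue left after filtering")
    # Rescale so that the kept residues keep their relative ratios.
    return {aa: freq / total for aa, freq in kept.items()}


# Hypothetical observed frequencies at one position.
observed = {"S": 0.60, "T": 0.30, "C": 0.05, "M": 0.045, "W": 0.005}
designed = design_frequencies(observed)
```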
  • HCDR1 and HCDR2 amino-acid frequencies encoded in germline-specific trimer oligonucleotides are described in Table 3 to Table 8.
  • Table 8 Amino-acid frequencies of VH3-23 HCDR2 encoded in trimer oligonucleotides.
  • the human IgG repertoire NGS data set was also used to design a unique set of germline-specific HCDR1/2 sequences. Importantly, all NGS reads have been considered, even the ones appearing at negligible frequencies and that could potentially be attributed to NGS errors. Such sequences are expected to create extra diversity beyond the natural diversity and could be beneficial when generating common Light Chain (cLC) libraries where the diversity is restricted to the heavy chain.
  • Unique HCDR1 and HCDR2 sequences (Kabat 27-35 and Kabat 50-58, respectively) were identified and further filtered to optimize library functionality and binder developability. First, sequences harboring more than 60% of Somatic-Hyper-Mutations (SHMs) were removed to generate a more "naive" library (i.e.
  • Table 9 HCDR1 and HCDR2 sequences number evolution throughout the filtering process performed prior to jet-array synthesis.
  • Example 2 Synthetic and semi-synthetic phage display common Light Chain libraries generation and quality control
  • HCDR1/2 cassette was generated either by PCR assembly using E. coli codon optimized germline FR1 or FR3 as DNA template and trimer oligonucleotides (library A), or amplification of the array-based synthesized FR1-FR3 fragments (library B) or amplification of the germline FR1-FR3 sequences (library C).
  • HCDR3 synthetic diversity was introduced using a pool of trimer oligonucleotides encoding 15 HCDR3 lengths (6-20) and mimicking length-specific naturally occurring diversity at Kabat residues 95-102.
  • DNA fragments encompassing E. coli codon optimized HCDR3, IGHJ1, linker and germline VK3-15/JK1 variable light chain (VL) (SEQ. ID NO: 4) were generated by PCR and pooled to mimic natural HCDR3 length distribution (Library 1).
  • HCDR1/2 cassette and synthetic HCDR3-containing fragments were assembled by PCR, scFv were cloned into the pNGLEN (in-house modified pUC119 phagemid vector) using NcoI/NotI restriction sites and the resulting ligation product electroporated into E. coli TG1 cells.
  • Natural HCDR3 were amplified from a human IgM repertoire. Briefly, mRNA was purified from total PBMCs extracted from Buffy Coats of 10 human donors (5 males, 5 females) of Caucasian origin and aged between 19 and 69. mRNA was converted to cDNA by reverse transcription using oligo-dT. All VH and HCDR3 amplifications were performed individually from each donor and were next pooled together according to their VH-families. Firstly, PCR was performed using a set of VH-specific forward primers (ref: https://doi.org/10.1038/s42003-021-01881-0, Communications Biology) and a reverse IgM-specific primer (TGGAAGAGGCACGTTCTTTTCTTT).
  • HCDR3 cassettes were either kept separated based on their individual VH-families (VH1 sequences or VH3 sequences - library 2) or were pooled across all VH amplified (library 3).
  • the HCDR3 cassettes were associated to the HCDR1/2 cassettes, whose FR3 end was mutated to be compatible with human sequences.
  • Semi-synthetic VHs were cloned into the germline VK3-15/JK1 VL-containing pNGLEN using NcoI/XhoI restriction sites and the resulting ligation electroporated into E. coli TG1 cells.
  • the library cloning process is depicted in Figure 3.
  • E. coli TG1 cells harboring the phagemid libraries were superinfected with M13KO7 helper phage for assembly and production of recombinant phages. Phages were purified by two precipitation steps with 1/3 v/v of 20% PEG-6000, 2.5 M NaCl and resuspended in PBS.
  • PCR amplification resulted in an amplicon of approximately 400 bp further purified by gel extraction (Qiagen).
  • the resulting DNA sample was subsequently processed using magnetic SPRIselect beads (Beckman Coulter) according to the manufacturer's instructions, leading to NGS-grade DNA quality. Samples were submitted to Illumina MiSeq 2x200 bp analysis at Genewiz (amplicon-EZ services). Raw FASTQ paired-end files were processed using a custom NGS analysis pipeline. Bioinformatic analysis included calculation of the percentage of functional sequences (sequences in frame and without stop codon), CDR length distributions, CDR amino-acid frequencies and identification of CDR unique sequences.
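  • Two of the quality-control metrics listed above, the percentage of functional sequences and the CDR length distribution, can be sketched as follows. This is not the custom pipeline referred to above: it assumes that each nucleotide sequence has already been trimmed to start at the first codon of the coding region, and it treats every stop codon (TAA, TAG, TGA) as non-functional. Function names and example sequences are hypothetical.

```python
from collections import Counter

STOP_CODONS = {"TAA", "TAG", "TGA"}


def is_functional(nt_seq):
    """True if the sequence is in frame and contains no in-frame stop codon."""
    seq = nt_seq.upper()
    if len(seq) % 3 != 0:
        return False  # out of frame
    return all(seq[i:i + 3] not in STOP_CODONS for i in range(0, len(seq), 3))


def percent_functional(nt_seqs):
    """Percentage of sequences that pass `is_functional`."""
    if not nt_seqs:
        return 0.0
    return 100.0 * sum(is_functional(s) for s in nt_seqs) / len(nt_seqs)


def cdr_length_distribution(cdr_aa_seqs):
    """Map each CDR amino acid length to its number of occurrences."""
    return Counter(len(cdr) for cdr in cdr_aa_seqs)


# Hypothetical reads: two functional, one with a stop codon, one out of frame.
example_reads = ["ATGGCC", "ATGTAA", "ATGGC", "ATGGCA"]
functional_pct = percent_functional(example_reads)
```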
  • a set of 13 phage display libraries were generated to compare variable heavy chain (VH) germlines, and HCDR1/2 (diversification based on trimer oligonucleotides (library A) or array-based synthesis (library B) or germline sequence (library C)) and HCDR3 (synthetic (library 1), natural VH-family specific (library 2) or pooled HCDR3 (library 3)) diversification strategies for the selection of diverse and highly developable common Light Chain binders.
  • the libraries were assembled by PCR and cloned into the pNGLEN phagemid vector using restriction enzymes.
  • VH1-69_A1, VH1-69_A2 and VH1-69_A3 were picked as representative libraries to assess the quality of the HCDR3.
  • HCDR3 amino-acid length distribution analysis confirmed the expected diversity of the synthetic design (VH1-69_A1), encompassing lengths 6 to 20 (Figure 4).
  • Both natural VH1-family specific (VH1-69_A2) and pooled HCDR3 (VH1-69_A3) length distributions were comparable and in accordance with the natural repertoire.
  • The amino-acid content of HCDR3 was also checked, taking the most represented length (HCDR3 length 12) as an example. As illustrated in Table 18 to Table 20, the amino acid frequencies at each position were comparable.
  • Table 20 Amino-acid frequencies in HCDR3 of VH1-69_A3 library
  • Example 3 Selection of synthetic and semi-synthetic phage display common Light Chain libraries against mmCD47, hsCD28 and hsNKp46
  • Phage particles from individual scFv libraries (10^12 plaque-forming units) were blocked in PBS/BSA 3% (w/v) for 1 h at room temperature (RT).
  • Magnetic Dynabeads™ MyOne™ Streptavidin C1 beads (Invitrogen, catalog NO: 65002) were blocked in the same conditions. Phages were depleted against pre-blocked beads for 1 h at RT.
  • Phages were then incubated with 100 nM (rounds 1 and 2) or 25 nM (round 3) of biotinylated recombinant his-tagged mouse CD47 extracellular domain (ECD) (produced in-house) or biotinylated recombinant human CD28 protein (Acrobiosystems, catalog NO: H82E5) for 1 h at RT.
  • Eluted phages were used to infect 5 ml of exponentially growing E. coli TG1 cells. Infected cells were grown in 2YT medium for 1 h at 37 °C and 100 RPM, then grown in 2YT medium supplemented with 2% (w/v) glucose for 1 h at 37 °C and 240 RPM. Cells were then superinfected with the M13K07 helper phage using a multiplicity of infection (MOI) of 10 for 1 h at 37 °C and 100 RPM. Culture medium was then changed for 2YTAK (2YT medium supplemented with 100 µg/ml ampicillin and 50 µg/ml kanamycin) and cells were further cultured overnight at 30 °C and 240 RPM. The next day, 10 µl of phage-containing cell-free supernatant were used for the subsequent round of selection.
  • The panning on NKp46 followed the same procedure as above but included an additional depletion step.
  • Magnetic Protein G Dynabeads® (Invitrogen, catalog no. 10003D) were blocked in PBS/BSA 3% (w/v) and coated with human IgG1, and phages were incubated with the pre-coated beads for 1 h at RT prior to depletion on streptavidin beads. Phages were then incubated with 100 nM (rounds 1 and 2) or 25 nM (round 3) of biotinylated recombinant Fc-tagged human NKp46 protein (NC1-H82F9, Acrobiosystems).
  • scFv screening on recombinant protein by flow cytometry
  • biotinylated recombinant his-tagged mouse CD47 extracellular domain (ECD) produced in house and biotinylated recombinant Fc-tagged human NKp46 protein (Acrobiosystems, catalog NO: NC1-H82F9) coated on beads was assessed by flow cytometry.
  • E. coli colonies from the third round of selection were picked and grown in autoinduction medium (Formedium, catalog NO: AIM2YT0210) supplemented with 100 µg/ml ampicillin and 0.1% (w/v) glucose in 96-well deepwell plates, overnight at 30 °C and 260 RPM.
  • Cells were centrifuged and periplasmic extracts were obtained by resuspending the bacterial pellets in TES buffer (50 mM Tris-HCl pH 8; 1 mM EDTA pH 8; 20% sucrose) followed by incubation on ice for 30 min. Cellular debris was removed by centrifugation, and the scFv-containing supernatants were stored at 4 °C.
  • PolyAn Red4 deca-plex beads were procured (PolyAn, catalog NO: 106 52 005).
  • the beads had functionalized streptavidin on their surface and caged 10 discrete amounts of Allophycocyanin (APC) in their polymeric Polymethylmethacrylate (PMMA) matrix.
  • Biotinylated antigens were individually coated on beads at 20 µg/ml coating concentration in PBS-BSA 3% (w/v) and beads were then washed to remove unbound antigen.
  • Control beads non-antigen coated streptavidin beads
  • Equal quantities of control and antigen-coated beads were mixed in PBS-BSA 3% (w/v) and 10 µl was pipetted in each 96 well (approx.
  • Periplasmic extracts were diluted 1:8 in PBS-BSA 3% (w/v) and 10 µl was added to each well and incubated for 60 min at 4 °C.
  • 10 µl of FITC-conjugated anti-myc tag antibody 9E10 (Abcam, catalog NO: ab202008)
  • At least 2000 events were recorded on IntelliCyt® iQue Screener PLUS (Sartorius) with 4 s of sip time and 0.5 s of additional up time.
  • GM Geometric Mean
  • SPR Surface Plasmon Resonance
  • Biotinylated recombinant human CD28 protein (Acrobiosystems, catalog NO: CD8-H82E5), recombinant his-tagged mouse CD47 extracellular domain (ECD) produced in house or recombinant his-tagged human NKp46 protein (Acrobiosystems, catalog NO: NC1-H52H4) were diluted to a final concentration of 10 µg/ml in acetate buffer pH 4.5 (Cytiva Life Sciences, catalog NO: BR100530) and subsequently immobilized on flow-path 2 on the eight channels, to around 1600 resonance units (abbreviated RU), 500 RU and 1500 RU, respectively, on Series S CM5 Sensor Chips (Cytiva Life Sciences, catalog NO: BR100012) using an amine coupling kit (Cytiva Life Sciences, catalog NO: BR100050).
  • HBS-EP+ (Cytiva Life Sciences, catalog NO: BR100669) was used as running buffer.
  • Filtered periplasmic extracts were injected directly on the covalently coupled human CD28, mouse CD47 or human NKp46 Series S CM5 Sensor Chip.
  • Samples were injected on flow-paths 1 and 2 (flow-path 1 being used as reference) at a 30 µl/min flow rate for 3 min, followed by a dissociation time of 3 min in running buffer. After each binding event, the surface was regenerated with 10 mM Glycine pH 1.5 solution (Cytiva Life Sciences, catalog NO: BR100354) injected for 60 s at 30 µl/min on both flow-paths.
  • Each measurement included zero-concentration samples as well as irrelevant scFv periplasmic extracts for referencing and specificity, respectively.
  • the 13 phage display libraries were panned against 3 targets: mouse CD47, human CD28 and human NKp46. Then, 288 scFv clones from each round of the 3 panning arms were screened for binding to the target recombinant protein by flow-cytometry and Surface Plasmon Resonance. This panning campaign led to the selection of 138, 383 and 438 hits against CD47, CD28 and NKp46, respectively.
  • HCDR1, HCDR2 and HCDR3 sequences selected against each target were analyzed by calculating the Levenshtein distance for each CDR and the HCDR3 length distribution.
  • The Levenshtein distance is defined as the minimum number of amino acid substitutions, insertions or deletions required to change one sequence into the other (e.g. a Levenshtein distance of 1 corresponds to one substitution, insertion or deletion between two sequences).
  • HCDR1 and HCDR2 Levenshtein distance distributions ranged from 1 to 9 and from 1 to 10 and reached a maximum at 5 and 6, respectively, indicating very high CDR diversity (Figure 5).
  • HCDR3 is the most diverse CDR in nature and is known to play a critical role for driving the binding to the antigen.
  • antibodies harboring distant HCDR3 sequences are expected to recognize different epitopes.
  • Both the large HCDR3 Levenshtein distances and the amino acid length distribution of the unique HCDR3 selected against CD47, CD28 and NKp46 suggest a broad epitope coverage (Figure 5 and Figure 6).
  • HCDR1/2 diversification strategies were also compared for delivering unique clonotypes.
  • A clonotype was defined as a group of sequences having HCDR3 with >80% sequence identity and that could target close epitopes.
  • sequence liabilities were removed from libraries B generated using array-based synthesis.
  • The percentages of unique hits without sequence liabilities in HCDR1 and HCDR2 were compared (Table 23, Figure 9).
  • Most of the sequences originating from libraries B are free of sequence liabilities, while most of the sequences originating from the VH3-23- and VH1-46-based libraries A contain sequence liabilities.
  • 64% of the sequences from the VH1-69-based library A are free of sequence liabilities.
  • The percentage of liability-free sequences originating from libraries A was expected to depend on the VH germline, given that the germlines harbor different germline-specific natural diversity.
  • The number of unique hits and clonotypes and the percentage of liability-free sequences generated from the different libraries were then arranged to compare HCDR3 diversification strategies as a function of HCDR1/2 randomization strategies (Table 21, Table 22 and Table 23, Figure 10).
  • VH1-69 library 3 based on natural pooled HCDR3 delivered slightly more unique hits than VH1-69 library 1 based on synthetic HCDR3 when built with trimer-based HCDR1/2 (i.e. VH1-69-A3 > VH1-69-A1).
  • The inverse was observed when the libraries were built using array-based synthesis (i.e. VH1-69-B1 > VH1-69-B3).
  • Table 21 Number of unique hits selected against CD47, CD28 and NKp46 from the 13 synthetic and semi-synthetic common Light Chain libraries.
  • Table 22 Number of clonotypes (>80% HCDR3 sequence identity) selected against CD47, CD28 and NKp46 from the 13 synthetic and semi-synthetic common Light Chain libraries.
  • Table 25 Percentage of hits without HCDR3 sequence liabilities selected against CD47, CD28 and NKp46 from the 13 synthetic and semi-synthetic common Light Chain libraries.
  • Example 4 Characterization of anti-CD47 and anti-CD28 antibodies
  • Fab expression: cDNAs encoding the different antibody constant regions were gene-synthesized by GENEART (Regensburg, Germany) or Twist Biosciences (San Francisco, USA) and modified using standard molecular biology techniques. PCR products were digested with appropriate DNA restriction enzymes, purified and ligated in modified pcDNA3.1 plasmids (Invitrogen), which carried a CMV promoter and a bovine hormone polyadenylation (poly(A)). The expression vectors also carried an oriP, which is the origin of plasmid replication of Epstein-Barr virus, and the murine IgK light chain leader peptide for secretion of the encoded polypeptide chain.
  • each scFv clone in its phage library vector was used to amplify its individual VH cDNA by PCR.
  • The VH PCR product was cloned in the modified pcDNA 3.1 vector described above upstream of a cDNA encoding a human IgG1 heavy chain CH1 domain, whereas the fixed VK3-15/JK1 light chain was cloned in the modified pcDNA 3.1 vector, described above, upstream of a cDNA encoding a human kappa constant light chain domain.
  • Fab expression was performed in 24 well plates using the Expi293™ Expression System (Thermo Fisher, catalog NO: A14527), according to the manufacturer's protocol.
  • Expi293F cells were seeded at 2x10^6 viable cells in Expi293™ Expression Medium (Thermo Fisher, catalog NO: A1435101) one day prior to transfection, and incubated with orbital shaking at 37 °C, 8% CO2 and 80% humidity.
  • ExpiFectamine™ 293 transfection reagent from the ExpiFectamine™ 293 Transfection Kit (Thermo Fisher, catalog NO: A14526) and equal quantities of light chain and heavy chain vectors (1 µg plasmid DNA/mL of transfection culture volume each) were separately diluted in OptiMEM I Reduced Serum Medium (Thermo Fisher, catalog NO: 31985062). Following a 10 min incubation at room temperature, the ExpiFectamine™ 293 in OptiMEM solution was combined with the DNA dilution, and the final solution mix was added to the cells.
  • SEC size exclusion chromatography
  • SPR Surface plasmon resonance
  • Fabs were injected in single cycle kinetic analysis mode at different concentrations ranging from 6.4 to 4000 nM, in HBS-EP+ buffer (Cytiva Life Sciences, catalog NO: BR100669) at a flow rate of 30 µl/min for 3 min on flow-path 1 and 2 (flow-path 1 being used as reference). Dissociation was monitored for 10 min. After each cycle, the surface was regenerated with 60 µl of regeneration solution provided with Series S Biotin CAPture Kit (Cytiva Life Sciences, catalog NO: 28920234). Experimental data were processed using the 1:1 Langmuir kinetic fitting model. Measurements included zero-concentration samples for referencing. Chi² and residual values were used to evaluate the quality of the fit between the experimental data and individual binding models.
  • SPR Surface plasmon resonance
  • Fabs were injected in single cycle kinetic analysis mode at different concentrations ranging from 1.6 to 1000 nM, in HBS-EP+ buffer (Cytiva Life Sciences, catalog NO: BR100669) at a flow rate of 30 µl/min for 3 min on flow-path 1 and 2 (flow-path 1 being used as reference). Dissociation was monitored for 10 min. After each cycle, the surface was regenerated with 60 µl of regeneration solution provided with Series S Biotin CAPture Kit (Cytiva Life Sciences, catalog NO: 28920234). Experimental data were processed using the 1:1 Langmuir kinetic fitting model. Measurements included zero-concentration samples for referencing. Chi² and residual values were used to evaluate the quality of the fit between the experimental data and individual binding models.
  • the thermal stability of the antibodies was investigated by Differential Scanning Fluorimetry (DSF).
  • The analysis was performed using the Rotor Gene Q instrument (Qiagen). Briefly, 5 µg of protein were mixed with SYPRO® Orange 10X (Life technologies) in a final volume of 20 µl adjusted with H2O. The temperature was increased by 1°C per second from 25°C to 99°C.
  • The emission and detection wavelengths were set at 460 nm and 510 nm respectively, and the gain was set at 3.
  • Binders discovered from VH1-46, VH1-69 and VH3-23-based libraries show comparable yield (medians ranging from 48 mg/mL to 54 mg/mL), % of monomer (medians ranging from 99.04 % to 99.76 %), SEC retention time (medians around 12.7 min) and thermostability (medians ranging from 84°C to 86°C), confirming that all 3 VH germlines are suitable for generating VK3-15-based cLC binders (Table 26, Figure 11 to Figure 14).
  • Binders isolated from libraries generated using HCDR1/2 trimer oligonucleotides or an HCDR1/2 array-based synthesized cassette have comparable developability properties (Table 26, Figure 11 to Figure 14), although the libraries generated using array-based synthesis might have been expected to present better developability profiles because this approach introduces naturally existing sequence motif diversity, whereas the trimer oligonucleotide approach only mimics the natural diversity at each individual residue.
  • the binders discovered from synthetic or semi-synthetic libraries naturally or VH-family specific HCDR3 diversity
  • Semi-synthetic libraries might have been expected to be superior to synthetic libraries for the same reason as the one mentioned above regarding HCDR1 and 2 diversity.
  • KD affinity values were measured by SPR.
  • As reported in Table 27, diverse anti-CD47 cLC binders ranging from single digit nanomolar affinity to micromolar affinity were discovered. Comparing the median affinity of the binders selected from the individual libraries could be misleading because only a few Fabs were expressed from each library.
  • The data were combined according to the VH germlines, HCDR1/2 designs or HCDR3 diversification strategies. As shown in Figure 15, there are no statistically significant differences between the affinities of binders originating from the different library designs.
  • Table 27 KD values of anti-CD47 Fabs measured by SPR.
  • Table 28 KD values of anti-CD28 Fabs measured by SPR.
  • Example 5 Characterization of anti-Nkp46 antibodies
  • SPR Surface plasmon resonance
  • Fabs were injected in single cycle kinetic analysis mode at different concentrations ranging from 6.4 to 4000 nM, in HBS-EP+ buffer (Cytiva Life Sciences, catalog NO: BR100669) at a flow rate of 30 µL/min for 3 min on flow-path 1 and 2 (flow-path 1 being used as reference). Dissociation was monitored for 10 min. After each cycle, the surface was regenerated with 60 µl of regeneration solution provided with Series S Biotin CAPture Kit (Cytiva Life Sciences, catalog NO: 28920234). Experimental data were processed using the 1:1 Langmuir kinetic fitting model. Measurements included zero-concentration samples for referencing. Chi² and residual values were used to evaluate the quality of the fit between the experimental data and individual binding models.
  • A panel of anti-Nkp46 hits was reformatted to Fab format for developability and affinity assessment.
  • All individual libraries delivered Fabs with good transient expression yield in HEK cells, with medians ranging from 25 mg/mL to 92 mg/mL (Table 29), which is comparable to a well-behaved comparator control, trastuzumab Fab (72 mg/mL, average of duplicate).
  • the binders selected from libraries harboring diversity in HCDR1/2 either Trimer-based or Array-based
  • Binders selected from the semi-synthetic libraries have a better yield than those originating from the synthetic libraries.
  • The thermostability of the Fabs isolated from each library is very high, with medians ranging from 86 °C to 92 °C.
  • VH1-69-based antibodies have a slightly higher Tm than VH1-46-based antibodies (Figure 20).
  • KD affinity values were measured by SPR. As reported in Table 30, diverse anti-Nkp46 cLC binders ranging from single digit nanomolar affinity to micromolar affinity were discovered. As shown in Figure 21, there are no statistically significant differences between the affinities of binders originating from the different library designs.
  • Table 30 KD values of anti-Nkp46 Fabs measured by SPR.
  • Example 6 Epitope binning of anti-Nkp46 antibodies
  • Chip preparation - Epitope binning was performed using a Carterra LSA™ instrument equipped with an HC30M chip at 25 °C using a sandwich method. Based on the affinities measured by SPR (Biacore 8K+, Example 5 Figure 21 and Example 5 Table 30), Fabs with KD ⁇ 200 nM, or KD ⁇ 400 nM and Koff ⁇ 0.02 sec^-1, were immobilized.
  • The chip was primed with 10 mM MES pH 5.5 and activated for 5 min using N-hydroxysuccinimide and N-(3-Dimethylaminopropyl)-N'-ethylcarbodiimide hydrochloride (NHS-EDC) reagents from the Amine coupling kit (Cytiva BR100633) at the manufacturer's recommended concentration. Fabs were injected with the 96 printhead at 800 nM in NaOAc 20 mM pH 4.3 for 20 min, and the chip was then blocked using 1 M ethanolamine pH 8 for 5 min.
  • Table 31 Epitope binning of anti-Nkp46 antibodies selected from the 13 common Light Chain libraries.
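
The Levenshtein distance used in Example 3 to compare selected CDR sequences can be computed with a standard dynamic-programming routine. The Python sketch below is illustrative only: the function names and the toy CDR strings are assumptions made for this example and are not taken from the analysis pipeline used in the Examples.

```python
from collections import Counter
from itertools import combinations


def levenshtein(a: str, b: str) -> int:
    """Minimum number of substitutions, insertions or deletions turning a into b."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,               # deletion from a
                curr[j - 1] + 1,           # insertion into a
                prev[j - 1] + (ca != cb),  # substitution (free if residues match)
            ))
        prev = curr
    return prev[-1]


def pairwise_distance_counts(cdrs):
    """Count how often each Levenshtein distance occurs among unique CDR pairs."""
    unique = sorted(set(cdrs))
    return Counter(levenshtein(x, y) for x, y in combinations(unique, 2))


# Hypothetical toy sequences, for illustration only.
toy_cdrs = ["ARDY", "ARGY", "ARDYW"]
distance_counts = pairwise_distance_counts(toy_cdrs)
```

A histogram of such pairwise counts is one way to obtain distance distributions of the kind summarized in Figure 5.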

Landscapes

  • Health & Medical Sciences (AREA)
  • Immunology (AREA)
  • Chemical & Material Sciences (AREA)
  • Organic Chemistry (AREA)
  • General Health & Medical Sciences (AREA)
  • Biophysics (AREA)
  • Biochemistry (AREA)
  • Genetics & Genomics (AREA)
  • Medicinal Chemistry (AREA)
  • Molecular Biology (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Peptides Or Proteins (AREA)

Abstract

The present invention relates generally to methods for generating a collection of diverse human HCDR1 and/or HCDR2. The present invention also relates to a common light chain antibody library comprising a human HCDR1 and/or a human HCDR2 obtained by the methods disclosed herein. In a specific aspect, the common light chain antibody library further comprises a human HCDR3 being a naturally occurring HCDR3.

Description

Common light chain antibody libraries
Field of the Invention
The present invention relates generally to methods for generating a collection of diverse human HCDR1 and/or HCDR2. The present invention also relates to a common light chain antibody library comprising a human HCDR1 and/or a human HCDR2 obtained by the methods disclosed herein. In a specific aspect, the common light chain antibody library further comprises a human HCDR3 being a naturally occurring HCDR3.
Background of the Invention
Therapeutic antibodies are currently one of the fastest growing classes of drugs and are approved for the treatment of several indications, from cancer to autoimmune disease. However, the identification of therapeutically useful antibodies, namely antibodies capable of binding a specific selected target and having the desired properties, is not trivial.
Methods for identifying desirable antibodies may involve phage display of antibodies or antibody fragments, for example by construction of human libraries derived by amplification of nucleic acids from B cells or tissues, or by construction of synthetic or semi-synthetic libraries.
In synthetic antibody libraries, as well as semi-synthetic libraries which comprise synthetically designed segments, the antibody diversity is designed in silico and synthesized in a controlled manner. Synthetic or semi-synthetic antibody libraries can be constructed by introducing randomized complementarity determining region (CDR) sequences into antibody frameworks.
When multispecific antibodies having a common light chain (cLC) are developed, diversification of the CDRs of the heavy chain is of particular importance, since the binding to the target is only driven by the heavy chain variable region. Therefore, when generating common light chain multispecific antibodies, introducing diversity beyond the natural diversity is beneficial, but still remains a challenge.
The present disclosure is directed towards addressing such limitations and therefore aims at designing and introducing diversity in the heavy chain CDRs, to create common light chain antibody synthetic or semi-synthetic libraries which can improve the possibility of identifying and generating common light chain multispecific antibodies for therapeutic applications.
Summary of the Invention
The present invention relates to a first method to generate a collection of diverse human HCDR1 and/or HCDR2, wherein said human HCDR1 corresponds to the Kabat positions 27 to 35, and wherein said human HCDR2 corresponds to the Kabat positions 50 to 58, comprising the steps of:
a. Performing Next-Generation-Sequencing of human IgG antibody repertoire;
b. Selecting unique HCDR1 and/or HCDR2 amino acid sequences of a human antibody variable heavy chain germline amino acid sequence, thus obtaining a HCDR1 and/or HCDR2 amino acid sequence dataset;
c. Removing from said HCDR1 and/or HCDR2 amino acid sequence dataset:
(i) sequences comprising about 60% or more amino acid substitutions per CDR compared to said human antibody variable heavy chain germline amino acid sequence (Somatic-Hyper-Mutations);
(ii) sequences comprising one or more motifs of Table 1; and
(iii) sequences encompassing immunogenic peptides.
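
The filtering in step c of the first method can be sketched as follows. This Python example is a minimal illustration under stated assumptions: the germline CDR, the liability patterns (standing in for the motifs of Table 1, which is not reproduced here), the immunogenic peptide set and all function names are hypothetical placeholders, and sequences whose length differs from the germline CDR are simply skipped, which is a simplification of this sketch.

```python
import re


def substitution_fraction(cdr: str, germline_cdr: str) -> float:
    """Fraction of positions that differ from the germline CDR (equal lengths only)."""
    if len(cdr) != len(germline_cdr):
        raise ValueError("sequences must have the same length")
    return sum(a != b for a, b in zip(cdr, germline_cdr)) / len(germline_cdr)


def filter_cdrs(cdrs, germline_cdr, liability_patterns, immunogenic_peptides,
                max_fraction=0.6):
    """Keep unique CDRs that pass criteria (i), (ii) and (iii) of step c."""
    kept = []
    for cdr in sorted(set(cdrs)):  # step b: unique sequences
        if len(cdr) != len(germline_cdr):
            continue  # simplification of this sketch, not stated in the method
        if substitution_fraction(cdr, germline_cdr) >= max_fraction:
            continue  # (i) about 60% or more substitutions versus germline
        if any(re.search(p, cdr) for p in liability_patterns):
            continue  # (ii) contains a liability motif
        if any(pep in cdr for pep in immunogenic_peptides):
            continue  # (iii) encompasses an immunogenic peptide
        kept.append(cdr)
    return kept


# Hypothetical inputs, for illustration only.
germline = "GTFSSYAIS"
patterns = [r"N[^P][ST]", r"DG", r"NG"]  # placeholders for the Table 1 motifs
immunogenic = {"SNYA"}                   # placeholder peptide set
dataset = ["GTFSSYAIS", "GTFSTYAIS", "GTFSNYAIS", "GTFNSSAIS",
           "AAAAAAAIS", "GTFSDGAIS", "GTFSSYAIS"]
kept = filter_cdrs(dataset, germline, patterns, immunogenic)
```

In this toy dataset only the germline sequence and one single-substitution variant survive; the others are removed for carrying a placeholder motif, a placeholder immunogenic peptide, or six substitutions out of nine positions.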
The present invention also relates to a second method to generate a collection of diverse human HCDR1 and/or HCDR2, wherein said human HCDR1 corresponds to the Kabat positions 25 to 36, and wherein said human HCDR2 corresponds to the Kabat positions 47 to 65, comprising the steps of:
a. Performing Next-Generation-Sequencing of human IgG antibody repertoire;
b. Selecting unique HCDR1 and/or HCDR2 amino acid sequences for a human antibody variable heavy chain germline amino acid sequence, thus obtaining a HCDR1 and/or HCDR2 amino acid sequence dataset;
c. Aligning the amino acid sequences of HCDR1 and/or HCDR2 of said HCDR1 and/or HCDR2 amino acid sequence dataset;
d. Calculation of the frequency of each amino acid at each position of said amino acid sequences of HCDR1 and/or HCDR2;
e. Calculation of a single point mutation rate (MR) for each position of said HCDR1 and/or HCDR2, wherein MR is the frequency of a non-germline amino acid at each position of said HCDR1 and/or HCDR2;
f. Calculation of a diversity score (DS) for each position of said HCDR1 and/or HCDR2, wherein DS is the MR multiplied by the minimum number of amino acids whose summed frequency is equal to 80% of the MR;
g. Obtaining said collection of diverse human HCDR1 and/or HCDR2 by providing at each position of said HCDR1 and/or HCDR2 a plurality of amino acids, characterized in that:
- for a position having said DS lower than about 70% of the DS highest value, said plurality of amino acids is made of the naturally occurring amino acid at each position, excluding amino acids with frequency less than 1%, Cys, and optionally Met;
- for a position having said DS equal to or higher than about 70% of the DS highest value, said plurality of amino acids is made of any amino acid, excluding Cys, Met, Trp, and optionally Pro.
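
Steps d to g of the second method can be illustrated with a short Python sketch. Several points are assumptions of this example rather than statements of the method: the aligned sequences are taken to have equal length, the "minimum number of amino acids" in step f is read as the smallest number of non-germline residues (taken in decreasing order of frequency) whose cumulative frequency reaches at least 80% of the MR, the optional exclusions (Met, Pro) are not applied, and the function names and toy data are hypothetical.

```python
from collections import Counter

ALL_AA = set("ACDEFGHIKLMNPQRSTVWY")


def position_stats(aligned, germline):
    """Per position: (amino acid frequencies, mutation rate MR, diversity score DS)."""
    n = len(aligned)
    stats = []
    for pos, germ_aa in enumerate(germline):
        counts = Counter(seq[pos] for seq in aligned)
        freqs = {aa: c / n for aa, c in counts.items()}
        # Non-germline residue counts, most frequent first.
        non_germ = sorted((c for aa, c in counts.items() if aa != germ_aa),
                          reverse=True)
        mutated = sum(non_germ)
        k, cum = 0, 0
        for c in non_germ:
            if 5 * cum >= 4 * mutated:  # cumulative count already >= 80% of MR
                break
            cum += c
            k += 1
        mr = mutated / n
        stats.append((freqs, mr, mr * k))
    return stats


def design_positions(stats, ds_cutoff=0.7, min_freq=0.01):
    """Allowed amino acids per position following the two rules of step g."""
    max_ds = max(ds for _, _, ds in stats)
    design = []
    for freqs, _, ds in stats:
        if max_ds > 0 and ds >= ds_cutoff * max_ds:
            # High-diversity position: any amino acid except Cys, Met, Trp.
            design.append(ALL_AA - set("CMW"))
        else:
            # Low-diversity position: observed residues with frequency >= 1%, no Cys.
            design.append({aa for aa, f in freqs.items() if f >= min_freq} - {"C"})
    return design


# Hypothetical two-residue alignment, for illustration only.
toy_germline = "GS"
toy_aligned = ["GS"] * 5 + ["GT"] * 3 + ["GA", "GN"]
toy_stats = position_stats(toy_aligned, toy_germline)
toy_design = design_positions(toy_stats)
```

In the toy alignment the first position is invariant (MR = 0, DS = 0) and keeps only the germline residue, whereas the second position has MR = 0.5, needs two non-germline residues to cover 80% of the MR (DS = 1.0) and is therefore fully randomized apart from the excluded residues.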
In certain embodiments, the human antibody variable heavy chain germline of the methods disclosed herein is selected from the group comprising VH1-46 (SEQ ID NO: 1), VH1-69 (SEQ ID NO: 2) and VH3-23 (SEQ ID NO: 3).
Disclosed herein is also a collection of diverse human HCDR1 and/or HCDR2 obtained by the methods above.
In a particular embodiment, the collection of diverse human HCDR1 and/or HCDR2 is obtained by the second method, wherein said HCDR1 is encoded as trimer oligonucleotides in VH1-69 germline according to the frequency of Table 3, or said HCDR1 is encoded as trimer oligonucleotides in VH1-46 germline according to the frequency of Table 4, said HCDR1 is encoded as trimer oligonucleotides in VH3-23 germline according to the frequency of Table 5; and said HCDR2 is encoded as trimer oligonucleotides in VH1-69 germline according to the frequency of Table 6, or said HCDR2 is encoded as trimer oligonucleotides in VH1-46 germline according to the frequency of Table 7, and said HCDR2 is encoded as trimer oligonucleotides in VH3-23 germline according to the frequency of Table 8.
The present invention also relates to a library of antibody binding regions, wherein said antibody binding regions comprise a heavy chain variable domain comprising a human HCDR1 and/or a human HCDR2 obtained by the methods above and a common light chain variable domain. In a particular embodiment, the common light chain variable domain is a VK3-15 or a VK1-39 variable light chain.
In a more particular embodiment, the common light chain variable domain is VK3-15/JK1 (SEQ ID NO: 4).
In certain aspects, the heavy chain variable domain of the antibody binding regions of the library disclosed herein further comprises a naturally occurring heavy chain framework region.
In certain specific embodiments, the naturally occurring heavy chain framework region is derived from a human antibody variable heavy chain germline selected from the group comprising VH1-46 (SEQ ID NO: 1), VH1-69 (SEQ ID NO: 2) and VH3-23 (SEQ ID NO: 3).
In certain aspects, the heavy chain variable domain of the antibody binding regions of the library disclosed herein further comprises a human HCDR3 being a naturally occurring HCDR3 from a human IgM repertoire.
In a particular embodiment, the library disclosed herein is a phage display library.
The present invention also relates to a method for identifying an antibody binding region from the library disclosed herein, wherein said antibody binding region specifically binds to an antigen target of interest comprising the steps of (i) panning on said antigen and (ii) screening said antibody binding region that specifically binds to said antigen.
The present invention further relates to an antibody binding region identified according to the method mentioned above for identifying an antibody binding region.
As utilized in accordance with the present disclosure, the following terms, unless otherwise indicated, shall be understood to have the following meanings. Unless otherwise required by context, singular terms shall include pluralities and plural terms shall include the singular. As used in this specification and the appended claims, the singular forms "a", "an" and "the" include plural referents unless the content clearly dictates otherwise. Thus, for example, reference to "a molecule" optionally includes a combination of two or more such molecules, and the like.
Generally, nomenclature utilized in connection with, and techniques of, cell and tissue culture, molecular biology and protein and oligo- or polypeptide chemistry and hybridization described herein are those well-known and commonly used in the art. Standard techniques are used for recombinant DNA, oligonucleotide synthesis, phage display and cell culture. Enzymatic reactions and purification techniques are performed according to the manufacturer's specifications or as commonly accomplished in the art as described herein. The foregoing techniques and procedures are generally performed according to the conventional methods well known in the art and as described herein in various general and more specific references that are cited and discussed throughout the present specification. (See, e.g. Sambrook et al., Molecular Cloning. A Laboratory Manual).
It is understood that aspects and embodiments of the present disclosure described herein include "comprising," "consisting," and "consisting essentially of" aspects and embodiments.
The present invention relates to methods for generating a collection of diverse human HCDR1 and/or HCDR2 starting from the sequences of the human IgG repertoire. Sequences of the human IgG repertoire can be obtained by any technique known in the art. In specific embodiments said sequences are obtained by Next Generation Sequencing of the human IgG repertoire.
As used herein, the term "CDR diversity" refers to the heterogeneity in the amino acid sequence of a given CDR, wherein said heterogeneity is given by variation in the amino acid type at each position of a given CDR. Such heterogeneity can be naturally derived or can be designed in silico.
The term "naturally occurring" as used herein and applied to an object refers to the fact that the object can be found in nature and has not been manipulated by man. For example, a polynucleotide or polypeptide that is present in an organism (including viruses) that can be isolated from a source in nature and that has not been intentionally modified by man is naturally-occurring. Similarly, "non-naturally occurring" as used herein refers to an object that is not found in nature or that has been structurally modified or synthesized by man.
As used herein, the twenty conventional amino acids and their abbreviations follow conventional usage. Stereoisomers (e.g., D-amino acids) of the twenty conventional amino acids; unnatural amino acids and analogs such as α-, α-disubstituted amino acids, N-alkyl amino acids, lactic acid, and other unconventional amino acids may also be suitable components for the polypeptide chains of the binding proteins. Examples of unconventional amino acids include: 4-hydroxyproline, γ-carboxyglutamate, ε-N,N,N-trimethyllysine, ε-N-acetyllysine, O-phosphoserine, N-acetylserine, N-formylmethionine, 3-methylhistidine, 5-hydroxylysine, σ-N-methylarginine, and other similar amino acids and imino acids (e.g., 4-hydroxyproline). In the polypeptide notation used herein, the left-hand direction is the amino terminal direction and the right-hand direction is the carboxyl-terminal direction, in accordance with standard usage and convention.
Naturally occurring residues may be divided into classes based on common side chain properties:
(1) hydrophobic: Met, Ala, Val, Leu, Ile, Phe, Trp, Tyr, Pro;
(2) polar hydrophilic: Arg, Asn, Asp, Gln, Glu, His, Lys, Ser, Thr;
(3) aliphatic: Ala, Gly, Ile, Leu, Val, Pro;
(4) aliphatic hydrophobic: Ala, Ile, Leu, Val, Pro;
(5) neutral hydrophilic: Cys, Ser, Thr, Asn, Gln;
(6) acidic: Asp, Glu;
(7) basic: His, Lys, Arg;
(8) residues that influence chain orientation: Gly, Pro;
(9) aromatic: His, Trp, Tyr, Phe; and
(10) aromatic hydrophobic: Phe, Trp, Tyr.
Conservative amino acid substitutions may involve exchange of a member of one of these classes with another member of the same class. Non-conservative substitutions may involve the exchange of a member of one of these classes for a member from another class.
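The class-based definition above can be illustrated with a minimal Python sketch. It encodes the ten classes in one-letter code and treats a substitution as conservative when the two residues share at least one class. Because the classes overlap, this "at least one shared class" rule is an assumption made for illustration, as are the names `CLASSES`, `shared_classes`, and `is_conservative`.

```python
# Side-chain classes (1)-(10) from the list above, in one-letter code.
CLASSES = {
    "hydrophobic": set("MAVLIFWYP"),
    "polar hydrophilic": set("RNDQEHKST"),
    "aliphatic": set("AGILVP"),
    "aliphatic hydrophobic": set("AILVP"),
    "neutral hydrophilic": set("CSTNQ"),
    "acidic": set("DE"),
    "basic": set("HKR"),
    "chain orientation": set("GP"),
    "aromatic": set("HWYF"),
    "aromatic hydrophobic": set("FWY"),
}


def shared_classes(a: str, b: str) -> list[str]:
    """Return the sorted names of all classes containing both residues."""
    return sorted(
        name for name, members in CLASSES.items() if a in members and b in members
    )


def is_conservative(a: str, b: str) -> bool:
    """Assumed rule: a substitution is conservative if the two (different)
    residues share at least one of the classes listed above."""
    return a != b and bool(shared_classes(a, b))
```

Under this reading, an Asp to Glu exchange is conservative (both residues are acidic and polar hydrophilic), whereas Asp to Trp is not, since the two residues share no class.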
A skilled artisan will be able to determine suitable variants of the polypeptide chains of the binding proteins using well-known techniques. For example, one skilled in the art may identify suitable areas of a polypeptide chain that may be changed without destroying activity by targeting regions not believed to be important for activity. Alternatively, one skilled in the art can identify residues and portions of the molecules that are conserved among similar polypeptides. In addition, even areas that may be important for biological activity or for structure may be subject to conservative amino acid substitutions without destroying the biological activity or without adversely affecting the polypeptide structure.
The term "antibody" and "immunoglobulin" as used herein are interchangeable, and they refer to whole antibodies as well as to any antigen binding fragments or single chains thereof. Naturally occurring antibodies typically comprise a tetramer. Each such tetramer is typically composed of two identical pairs of polypeptide chains, each pair having one full-length "light" chain (typically having a molecular weight of about 25 kDa) and one full-length "heavy" chain (typically having a molecular weight of about 50-70 kDa). The terms "heavy chain" and "light chain" as used herein refer to any immunoglobulin polypeptide having sufficient variable domain sequence to confer specificity for a target antigen. The amino-terminal portion of each light and heavy chain typically includes a variable domain of about 100 to 110 or more amino acids that typically is responsible for antigen recognition. The carboxy-terminal portion of each chain typically defines a constant domain responsible for effector function. Thus, in a naturally occurring antibody, a full-length heavy chain immunoglobulin polypeptide includes a heavy chain variable domain (VH), also referred to herein as variable heavy chain, and three constant domains (CH1, CH2, and CH3), wherein the VH domain is at the amino-terminus of the polypeptide and the CH3 domain is at the carboxyl-terminus, and a full-length light chain immunoglobulin polypeptide includes a light chain variable domain (VL), also referred to herein as variable light chain, and a constant domain (CL), wherein the VL domain is at the amino-terminus of the polypeptide and the CL domain is at the carboxyl-terminus.
Human light chains are typically classified as kappa and lambda light chains, and human heavy chains are typically classified as mu, delta, gamma, alpha, or epsilon, and define the antibody's isotype as IgM, IgD, IgG, IgA, and IgE, respectively. IgG has several subclasses, including, but not limited to, IgG1, IgG2, IgG3, and IgG4. IgM has subclasses including, but not limited to, IgM1 and IgM2. IgA is similarly subdivided into subclasses including, but not limited to, IgA1 and IgA2. Within full-length light and heavy chains, the variable and constant domains typically are joined by a "J" region of about 12 or more amino acids, with the heavy chain also including a "D" region of about 10 more amino acids. See, e.g., FUNDAMENTAL IMMUNOLOGY (Paul, W., ed., Raven Press, 2nd ed., 1989), which is incorporated by reference in its entirety for all purposes. The variable regions of each light/heavy chain pair typically form an antigen binding site. The variable domains of naturally occurring antibodies typically exhibit the same general structure of relatively conserved framework regions (FR) joined by three hypervariable regions, also called complementarity determining regions or CDRs. The CDRs from the two chains of each pair typically are aligned by the framework regions, which may enable binding to a specific epitope. From the amino-terminus to the carboxyl-terminus, both light and heavy chain variable domains typically comprise the domains FR1, CDR1, FR2, CDR2, FR3, CDR3, and FR4. Heavy chain CDRs are herein also abbreviated as HCDR (HCDR1, HCDR2, and HCDR3), or CDRH (CDRH1, CDRH2, and CDRH3). Light chain CDRs are herein also abbreviated as LCDR (LCDR1, LCDR2, and LCDR3), or CDRL (CDRL1, CDRL2, and CDRL3).
The exact boundaries of these CDRs have been defined differently according to different systems. The system described by Kabat (Kabat et al., SEQUENCES OF PROTEINS OF IMMUNOLOGICAL INTEREST (National Institutes of Health, Bethesda, Md. (1987) and (1991))) not only provides an unambiguous residue numbering system applicable to any variable region of an antibody, but also provides precise residue boundaries defining the three CDRs. These CDRs may be referred to as Kabat CDRs. Chothia and coworkers (Chothia and Lesk, 1987, J. Mol. Biol. 196: 901-17; Chothia et al., 1989, Nature 342: 877-83) found that certain sub-portions within Kabat CDRs adopt nearly identical peptide backbone conformations, despite having great diversity at the level of amino acid sequence. These sub-portions were designated as L1, L2, and L3 or H1, H2, and H3, where the "L" and the "H" designate the light chain and the heavy chain regions, respectively. These regions may be referred to as Chothia CDRs, which have boundaries that overlap with Kabat CDRs. Other boundaries defining CDRs overlapping with the Kabat CDRs have been described by Padlan, 1995, FASEB J. 9: 133-39; MacCallum, 1996, J. Mol. Biol. 262(5): 732-45; and Lefranc, 2003, Dev. Comp. Immunol. 27: 55-77. Still other CDR boundary definitions may not strictly follow one of the systems described herein, but will nonetheless overlap with the Kabat CDRs, although they may be shortened or lengthened in light of prediction or experimental findings that particular residues or groups of residues or even entire CDRs do not significantly impact antigen binding. The methods used herein may utilize CDRs defined according to any of these systems, although certain embodiments use Kabat or Chothia defined CDRs. Identification of predicted CDRs using the amino acid sequence is well known in the field, such as in Martin, A.C. "Protein sequence and structure analysis of antibody variable domains," In Antibody Engineering, Vol. 2. Kontermann R., Dübel S., eds. Springer-Verlag, Berlin, p. 33-51 (2010). The amino acid sequence of the heavy and/or light chain variable domain may also be inspected to identify the sequences of the CDRs by other conventional methods, e.g., by comparison to known amino acid sequences of other heavy and light chain variable regions to determine the regions of sequence hypervariability. The numbered sequences may be aligned by eye, or by employing an alignment program such as one of the CLUSTAL suite of programs, as described in Thompson, 1994, Nucleic Acids Res. 22: 4673-80. Molecular models are conventionally used to correctly delineate framework and CDR regions and thus correct the sequence-based assignments. All such alternative definitions are encompassed by the current invention and the sequences provided in this specification are not intended to exclude alternatively defined CDR sequences which may only comprise a portion of the CDR sequences provided in the sequence listing.
Antibody fragments include, but are not limited to, (i) the Fab fragment consisting of VL, VH, CL and CH1 domains, including Fab' and Fab'-SH; (ii) the Fd fragment consisting of the VH and CH1 domains; (iii) the Fv fragment consisting of the VL and VH domains of a single antibody; (iv) the dAb fragment (Ward ES et al., (1989) Nature, 341: 544-546), which consists of a single variable domain; (v) F(ab')2 fragments, a bivalent fragment comprising two linked Fab fragments; (vi) single chain Fv molecules (scFv), wherein a VH domain and a VL domain are linked by a peptide linker which allows the two domains to associate to form an antigen binding site (Bird RE et al., (1988) Science 242: 423-426; Huston JS et al., (1988) Proc. Natl. Acad. Sci. USA, 85: 5879-83); (vii) bispecific single chain Fv dimers (PCT/US92/09965); (viii) "diabodies" or "triabodies", multivalent or multispecific fragments constructed by gene fusion (Tomlinson I & Holliger P (2000) Methods Enzymol. 326: 461-79; WO94/13804; Holliger P et al., (1993) Proc. Natl. Acad. Sci. USA, 90: 6444-48); and (ix) scFv genetically fused to the same or a different antibody (Coloma MJ & Morrison SL (1997) Nature Biotechnology, 15(2): 159-163).
The term "Fc" as used herein refers to a molecule comprising the sequence of a non-antigen-binding fragment resulting from digestion of an antibody or produced by other means, whether in monomeric or multimeric form, and can contain the hinge region. The original immunoglobulin source of the native Fc is preferably of human origin and can be any of the immunoglobulins. Fc molecules are made up of monomeric polypeptides that can be linked into dimeric or multimeric forms by covalent (i.e., disulfide bonds) and non-covalent association. The number of intermolecular disulfide bonds between monomeric subunits of native Fc molecules ranges from 1 to 4 depending on class (e.g., IgG, IgA, and IgE) or subclass (e.g., IgG1, IgG2, IgG3, IgA1, IgA2, and IgG4). One example of a Fc is a disulfide-bonded dimer resulting from papain digestion of an IgG. The term "native Fc" as used herein is generic to the monomeric, dimeric, and multimeric forms.
A F(ab) fragment typically includes one light chain and the VH and CH1 domains of one heavy chain, wherein the VH-CH1 heavy chain portion of the F(ab) fragment cannot form a disulfide bond with another heavy chain polypeptide. As used herein, a F(ab) fragment can also include one light chain containing two variable domains separated by an amino acid linker and one heavy chain containing two variable domains separated by an amino acid linker and a CH1 domain.
A F(ab') fragment typically includes one light chain and a portion of one heavy chain that contains more of the constant region (between the CH1 and CH2 domains), such that an interchain disulfide bond can be formed between two heavy chains to form a F(ab')2 molecule.
The term "antibody binding regions" or "antibody binding portion" refers to one or more portions of an immunoglobulin or antibody variable region capable of binding an antigen(s). Antibody binding regions include whole antibodies as well as any antigen binding fragments or single chains thereof. The antibody binding region is, for example, an antibody light chain, or a light chain variable domain (VL), an antibody heavy chain, or a heavy chain variable domain (VH), a heavy chain Fd region, a combined antibody light and heavy chain (or variable region thereof) such as a Fab, F(ab')2, single domain, or single chain antibody (scFv), any of the antibody fragments mentioned above, or a full length antibody, for example, an IgG (e.g., an IgG1, IgG2, IgG3, or IgG4 subtype), IgA1, IgA2, IgD, IgE, or IgM antibody.
The term "antigen" or "target antigen" or "antigen target" or "target" as used herein refers to a molecule or a portion of a molecule that is capable of being bound by an antibody binding region, and/or that is capable of being used in an animal to produce antibodies capable of binding to an epitope of that antigen. A target antigen may have one or more epitopes. With respect to each target antigen recognized by an antibody, the binding protein is capable of competing with an intact antibody that recognizes the target antigen. The antigen is bound by an antibody binding region, or generally by a binding protein, such as an antibody, via an antigen binding site, also referred herein as "binding portion" or "binding domain".
The term "epitope" includes any determinant, preferably a polypeptide determinant, capable of specifically binding to an immunoglobulin or T-cell receptor. In certain embodiments, epitope determinants include chemically active surface groupings of molecules such as amino acids, sugar side chains, phosphoryl groups, or sulfonyl groups, and, in certain embodiments, may have specific three-dimensional structural characteristics and/or specific charge characteristics. An epitope is a region of an antigen that is bound by an antibody or binding protein. In certain embodiments, a binding protein is said to specifically bind an antigen when it preferentially recognizes its target antigen in a complex mixture of proteins and/or macromolecules. In some embodiments, a binding protein is said to specifically bind an antigen when the equilibrium dissociation constant is < 10⁻⁸ M, more preferably when the equilibrium dissociation constant is < 10⁻⁹ M, and most preferably when the dissociation constant is < 10⁻¹⁰ M.
The dissociation constant (KD) of a binding protein can be determined, for example, by surface plasmon resonance. Generally, surface plasmon resonance (SPR) analysis measures real-time binding interactions between ligand (a target antigen on a biosensor matrix) and analyte (a binding protein in solution) using the BIAcore system (GE). Surface plasmon analysis can also be performed by immobilizing the analyte (binding protein on a biosensor matrix) and presenting the ligand (target antigen). The term "KD" as used herein refers to the dissociation constant of the interaction between a particular binding protein and a target antigen. The term "binds to" as used herein in reference to a binding protein refers to the ability of a binding protein or an antigen-binding fragment thereof to bind to an antigen containing an epitope with a KD of at least about 1 x 10⁻⁶ M, 1 x 10⁻⁷ M, 1 x 10⁻⁸ M, 1 x 10⁻⁹ M, 1 x 10⁻¹⁰ M, 1 x 10⁻¹¹ M, 1 x 10⁻¹² M, or less, and/or to bind to an epitope with an affinity that is at least two-fold greater than its affinity for a nonspecific antigen.
The present invention relates generally to a method to generate a collection of diverse human HCDR1 and/or HCDR2, and/or HCDR3, and/or LCDR1, and/or LCDR2, and/or LCDR3, comprising the steps of: a. Performing Next-Generation-Sequencing of human IgG antibody repertoire; b. Selecting unique HCDR1 and/or HCDR2 amino acid sequences of a human antibody variable heavy chain germline amino acid sequence, thus obtaining a HCDR1 and/or HCDR2 amino acid sequence dataset; c. Removing from said HCDR1 and/or HCDR2 amino acid sequence dataset: (i) sequences comprising about 60% or more amino acid substitutions per CDR compared to said human antibody variable heavy chain germline amino acid sequence (Somatic-Hyper-Mutations); (ii) sequences comprising one or more motifs of Table 1; and (iii) sequences encompassing immunogenic peptides.
In certain aspects, the present invention relates to a method for generating a collection of diverse human HCDR1 and/or HCDR2 starting from the sequences of the human IgG repertoire, for instance starting from the sequences obtained by Next Generation Sequencing of the human IgG repertoire, wherein said human HCDR1 corresponds to Kabat positions 27 to 35, and wherein said human HCDR2 corresponds to Kabat positions 50 to 58.
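As an illustration of the Kabat position ranges given above, the following Python sketch extracts HCDR1 (positions 27 to 35) and HCDR2 (positions 50 to 58) from a heavy chain variable domain that has already been Kabat-numbered by some external tool. The input format (a list of label and residue pairs), the helper names, and the choice to include insertion positions such as 52A inside the range are assumptions made for illustration only.

```python
import re

# Kabat position ranges stated in the text (inclusive).
HCDR1_RANGE = (27, 35)
HCDR2_RANGE = (50, 58)


def kabat_number(label: str) -> int:
    """Return the integer part of a Kabat label such as '52' or '52A'."""
    match = re.fullmatch(r"(\d+)([A-Z]?)", label)
    if match is None:
        raise ValueError(f"not a Kabat position label: {label!r}")
    return int(match.group(1))


def extract_cdr(numbered: list[tuple[str, str]], kabat_range: tuple[int, int]) -> str:
    """Concatenate residues whose Kabat number lies in the inclusive range.

    Assumption: insertion positions (e.g. '52A') belong to the CDR whenever
    their numeric part falls inside the range.
    """
    start, end = kabat_range
    return "".join(
        residue for label, residue in numbered if start <= kabat_number(label) <= end
    )
```

For example, a numbered fragment containing positions 49, 50, 51, 52, 52A, 53 ... 59 would yield the residues at 50 through 58, including the one at 52A, when called with `HCDR2_RANGE`.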
In a specific embodiment, the first method disclosed herein to generate a collection of diverse human HCDR1 and/or HCDR2, wherein said human HCDR1 corresponds to the Kabat positions 27 to 35, and wherein said human HCDR2 corresponds to the Kabat positions 50 to 58, comprises a step of performing Next-Generation-Sequencing of human IgG antibody repertoire.
In a more specific embodiment, the first method disclosed herein to generate a collection of diverse human HCDR1 and/or HCDR2, wherein said human HCDR1 corresponds to the Kabat positions 27 to 35, and wherein said human HCDR2 corresponds to the Kabat positions 50 to 58, also comprises the steps of selecting unique HCDR1 and/or HCDR2 amino acid sequences of a human antibody variable heavy chain germline amino acid sequence, thus obtaining a HCDR1 and/or HCDR2 amino acid sequence dataset, and removing from said HCDR1 and/or HCDR2 amino acid sequence dataset all the sequences harboring more than about 50%, or more than about 60%, or more than about 70%, preferably more than about 60% of Somatic-Hyper-Mutations, wherein the term "Somatic-Hyper-Mutations" refers herein to any amino acid different than the amino acid present in the germline. More in particular, the method comprises the step of removing sequences comprising about 60% or more amino acid substitutions per CDR compared to said human antibody variable heavy chain germline amino acid sequence (Somatic-Hyper-Mutations).
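The Somatic-Hyper-Mutation filter described above can be sketched as follows. The sketch counts, for each CDR, the fraction of positions whose amino acid differs from the germline CDR and discards sequences at or above the cutoff (0.60 by default, with the 50% and 70% alternatives available through the parameter). It assumes that each CDR has the same length as the germline CDR, so that positions can be compared directly; the function names and the example sequences are hypothetical.

```python
def substitution_fraction(cdr: str, germline: str) -> float:
    """Fraction of positions at which `cdr` differs from the germline CDR."""
    if len(cdr) != len(germline):
        # Assumption: only equal-length CDRs are compared position by position.
        raise ValueError("CDR and germline CDR must have the same length")
    differences = sum(1 for a, b in zip(cdr, germline) if a != b)
    return differences / len(germline)


def remove_hypermutated(
    cdrs: list[str], germline: str, max_fraction: float = 0.60
) -> list[str]:
    """Keep only CDRs with fewer than `max_fraction` substitutions per CDR."""
    return [c for c in cdrs if substitution_fraction(c, germline) < max_fraction]


if __name__ == "__main__":
    germline_cdr = "GGTFSSYAIS"  # hypothetical germline CDR used for illustration
    dataset = ["GGTFSSYAIS", "GGTFNNYGIS", "DDSLNNYAIS"]
    print(remove_hypermutated(dataset, germline_cdr))
```

In this toy dataset the third sequence differs from the germline at six of ten positions and is therefore removed, while the sequences with zero and three substitutions are retained.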
In another aspect, the present method comprises a step of removing from said HCDR1 and/or HCDR2 amino acid sequence dataset sequences comprising one or more liabilities. According to the present invention the term "liability" refers to a motif that makes the molecule prone to post-translational modification, or poly-reactivity, or self-association. In particular, a liability is selected from the group comprising: unpaired cysteines, oxidation sites, N-glycosylation sites, non-canonical N-glycosylation, Asparagine/Glutamine deamidation sites, Aspartate isomerization sites, fragmentation sites, O-glycosylation sites, Lysine glycation, Tyrosine sulphation, integrin αvβ3 binding sites, integrin α4β1 binding sites, integrin α2β1 binding sites, CD11c/CD18 binding sites, hydrophobic sites. More in particular, a liability is one or more motifs selected from the motifs listed in Table 1. In a more particular embodiment, the present method comprises a step of removing from said HCDR1 and/or HCDR2 amino acid sequence dataset sequences comprising one or more motifs of Table 1.
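A liability filter of this kind can be sketched as a regular-expression scan over each CDR sequence. The patterns below are commonly cited sequence liabilities chosen purely for illustration; they are assumptions and are not the motifs of Table 1, which is not reproduced in this section. The names `ILLUSTRATIVE_LIABILITIES`, `find_liabilities`, and `remove_liabilities` are likewise hypothetical.

```python
import re

# Illustrative patterns only (assumed); NOT the motifs of Table 1.
ILLUSTRATIVE_LIABILITIES = {
    "unpaired cysteine": re.compile(r"C"),
    "N-glycosylation sequon": re.compile(r"N[^P][ST]"),
    "Asn deamidation": re.compile(r"NG"),
    "Asp isomerization": re.compile(r"DG"),
    "integrin-binding RGD": re.compile(r"RGD"),
}


def find_liabilities(sequence: str) -> list[str]:
    """Return the names of all illustrative liabilities found in the sequence."""
    return [
        name
        for name, pattern in ILLUSTRATIVE_LIABILITIES.items()
        if pattern.search(sequence)
    ]


def remove_liabilities(sequences: list[str]) -> list[str]:
    """Keep only sequences that contain none of the illustrative liabilities."""
    return [s for s in sequences if not find_liabilities(s)]
```

Keeping the motif table as data rather than code means the same two functions would work unchanged once the actual motif list is substituted for the illustrative one.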
In another aspect, the present method also comprises a step of removing from said CDR sequence dataset sequences encompassing immunogenic peptides. In particular, the sequences encompassing immunogenic peptides are detected by the Immune Epitope Database (IEDB) CD4 T cell immunogenicity prediction tool http://tools.iedb.org/CD4episcore/ (ref: Dhanda et al.: Prediction of HLA CD4 immunogenicity in human populations, Frontiers in Immunology, 2018, 9, 1369; Paul et al.: Development and validation of a broad scheme for prediction of HLA class II restricted T cell epitopes, Journal of Immunological Methods, 2015, 422, 28-34). More in particular, the sequences encompassing immunogenic peptides are discarded upon filtering using the Episcore in silico immunogenicity prediction tool with a threshold value of 50% and by using the default Immune Epitope Database prediction method "IEDB".
In a more particular aspect, the present invention relates to a method to generate a collection of diverse human HCDR1 and/or HCDR2, wherein said human HCDR1 corresponds to the Kabat positions 27 to 35, and wherein said human HCDR2 corresponds to the Kabat positions 50 to 58, comprising the steps of: a. Performing Next-Generation-Sequencing of human IgG antibody repertoire; b. Selecting unique HCDR1 and/or HCDR2 amino acid sequences of a human antibody variable heavy chain germline amino acid sequence, thus obtaining a HCDR1 and/or HCDR2 amino acid sequence dataset; c. Removing from said HCDR1 and/or HCDR2 amino acid sequence dataset: (i) sequences comprising about 60% or more amino acid substitutions per CDR compared to said human antibody variable heavy chain germline amino acid sequence (Somatic-Hyper-Mutations); (ii) sequences comprising one or more motifs of Table 1; and (iii) sequences encompassing immunogenic peptides; wherein said sequences encompassing immunogenic peptides are detected by the Immune Epitope Database (IEDB) CD4 T cell immunogenicity prediction tool (http://tools.iedb.org/CD4episcore/).
The present invention also relates generally to a second method to generate a collection of diverse human HCDR1 and/or HCDR2, and/or HCDR3, and/or LCDR1, and/or LCDR2, and/or LCDR3, comprising the steps of: a. Performing Next-Generation-Sequencing of human IgG antibody repertoire; b. Selecting unique HCDR1 and/or HCDR2 amino acid sequences for a human antibody variable heavy chain germline amino acid sequence, thus obtaining a HCDR1 and/or HCDR2 amino acid sequence dataset; c. Aligning the amino acid sequences of HCDR1 and/or HCDR2 of said HCDR1 and/or HCDR2 amino acid sequence dataset; d. Calculating the frequency of each amino acid at each position of said amino acid sequences of HCDR1 and/or HCDR2; e. Calculating a single point mutation rate (MR) for each position of said HCDR1 and/or HCDR2, wherein MR is the frequency of non-germline amino acids at each position of said HCDR1 and/or HCDR2; f. Calculating a diversity score (DS) for each position of said HCDR1 and/or HCDR2, wherein DS is the MR multiplied by the minimum number of amino acids whose summed frequency is equal to 80% of the MR; g. Obtaining said collection of diverse human HCDR1 and/or HCDR2 by providing at each position of said HCDR1 and/or HCDR2 a plurality of amino acids, characterized in that: for a position having said DS lower than about 70% of the highest DS value, said plurality of amino acids is made of the naturally occurring amino acids at that position, excluding amino acids with a frequency of less than 1%, Cys, and optionally Met; for a position having said DS equal to or higher than about 70% of the highest DS value, said plurality of amino acids is made of any amino acid, excluding Cys, Met, Trp, and optionally Pro.
The present invention also relates generally to a second method to generate a collection of diverse human HCDR1 and/or HCDR2, and/or HCDR3, and/or LCDR1, and/or LCDR2, and/or LCDR3, comprising the steps of: a. Performing Next-Generation-Sequencing of human IgG antibody repertoire; b. Selecting unique HCDR1 and/or HCDR2 amino acid sequences for a human antibody variable heavy chain germline amino acid sequence, thus obtaining a HCDR1 and/or HCDR2 amino acid sequence dataset; c. Aligning the amino acid sequences of HCDR1 and/or HCDR2 of said HCDR1 and/or HCDR2 amino acid sequence dataset; d. Calculating the frequency of each amino acid at each position of said amino acid sequences of HCDR1 and/or HCDR2; e. Calculating a single point mutation rate (MR) for each position of said HCDR1 and/or HCDR2, wherein MR is the frequency of non-germline amino acids at each position of said HCDR1 and/or HCDR2; f. Calculating a diversity score (DS) for each position of said HCDR1 and/or HCDR2, wherein DS is the MR multiplied by the minimum number of amino acids whose summed frequency is equal to 60%, or 70%, or 80%, or 90% of the MR, preferably equal to 80% of the MR; g. Obtaining said collection of diverse human HCDR1 and/or HCDR2 by providing at each position of said HCDR1 and/or HCDR2 a plurality of amino acids, characterized in that: for a position having said DS lower than about 60%, or about 70%, or about 80% of the highest DS value, preferably lower than about 70% of the highest DS value, said plurality of amino acids is made of the naturally occurring amino acids at that position, excluding amino acids with a frequency of less than 1%, Cys, and optionally Met; for a position having said DS equal to or higher than about 60%, or about 70%, or about 80% of the highest DS value, preferably equal to or higher than about 70% of the highest DS value, said plurality of amino acids is made of any amino acid, excluding Cys, Met, Trp, and optionally Pro.
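The frequency, MR, and DS calculations of the second method can be sketched in Python as shown below. The sketch rests on several assumptions: the CDR sequences are already aligned and of equal length; "summed frequency equal to 80% of the MR" is read as "the fewest non-germline amino acids, taken in decreasing order of frequency, whose summed frequency reaches at least 80% of the MR"; and the optional Met and Pro exclusions are not applied. All function and parameter names are hypothetical.

```python
from collections import Counter

ALL_AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def position_frequencies(aligned: list[str]) -> list[dict[str, float]]:
    """Per-position amino acid frequencies for equal-length aligned CDRs."""
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("all aligned sequences must have the same length")
    total = len(aligned)
    return [
        {aa: count / total for aa, count in Counter(column).items()}
        for column in zip(*aligned)
    ]


def mutation_rate(freqs: dict[str, float], germline_aa: str) -> float:
    """MR: summed frequency of non-germline amino acids at one position."""
    return sum(f for aa, f in freqs.items() if aa != germline_aa)


def diversity_score(
    freqs: dict[str, float], germline_aa: str, coverage: float = 0.80
) -> float:
    """DS: MR times the fewest non-germline amino acids covering `coverage` of MR."""
    mr = mutation_rate(freqs, germline_aa)
    if mr == 0:
        return 0.0
    target = coverage * mr
    running, count = 0.0, 0
    ranked = sorted((f for aa, f in freqs.items() if aa != germline_aa), reverse=True)
    for f in ranked:
        running += f
        count += 1
        if running >= target - 1e-12:  # small tolerance for float rounding
            break
    return mr * count


def design_positions(
    aligned: list[str], germline: str, ds_cutoff: float = 0.70, min_freq: float = 0.01
) -> tuple[list[float], list[str]]:
    """Return the DS per position and the amino acids allowed at each position."""
    freqs = position_frequencies(aligned)
    scores = [diversity_score(f, g) for f, g in zip(freqs, germline)]
    highest = max(scores)
    design = []
    for f, ds in zip(freqs, scores):
        if highest > 0 and ds >= ds_cutoff * highest:
            # High-diversity position: any amino acid except Cys, Met, Trp.
            allowed = set(ALL_AMINO_ACIDS) - set("CMW")
        else:
            # Low-diversity position: observed residues at >= 1%, excluding Cys.
            allowed = {aa for aa, x in f.items() if x >= min_freq and aa != "C"}
        design.append("".join(sorted(allowed)))
    return scores, design
```

As a worked case, take ten aligned two-residue sequences with germline "SA", where the first position is S in five sequences, T in two, N in two, and G in one, and the second position is A in nine and G in one. The first position has MR = 0.5, and two amino acids (T and N) are needed to reach 80% of it, so DS = 1.0; the second has MR = 0.1 and DS = 0.1. The first position is therefore fully diversified (all amino acids except Cys, Met, Trp), while the second is restricted to A and G.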
The term "unique" CDR sequences refers generally to CDR sequences that are all different from one another. For instance, the term "unique" CDR sequence is used herein to indicate that a CDR sequence, even when present multiple times in the human IgG repertoire, is extracted, and/or used, and/or considered only once, so that unique CDR sequences are all different from one another.
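In computational terms, selecting "unique" CDR sequences amounts to de-duplication. A minimal sketch, assuming the CDRs are plain strings and that keeping the first-seen order is acceptable (the function name is hypothetical):

```python
def unique_sequences(cdrs: list[str]) -> list[str]:
    """Return each distinct CDR sequence once, in order of first appearance."""
    # dict.fromkeys keeps insertion order, so duplicates collapse to the first hit.
    return list(dict.fromkeys(cdrs))
```

A sequence observed many times in the sequencing reads thus contributes a single entry to the dataset, so that downstream per-position frequencies reflect distinct sequences rather than read counts.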
According to certain specific aspects of the present invention the antibody variable heavy chain germline used herein is a germline selected from any human antibody variable heavy chain germline (https://www.imgt.org/IMGTrepertoire/Proteins/proteinDisplays.php?species=human&latin=Homo%20sapiens&group=IGHV). For instance, selected from the group comprising: VH1-2, VH1-3, VH1-8, VH1-18, VH1-24, VH1-45, VH1-46, VH1-58, VH1-69, VH2-5, VH2-26, VH2-70, VH3-7, VH3-9, VH3-11, VH3-13, VH3-15, VH3-20, VH3-21, VH3-23, VH3-30, VH3-33, VH3-43, VH3-48, VH3-49, VH3-53, VH3-64, VH3-66, VH3-72, VH3-73, VH3-74, VH4-4, VH4-28, VH4-31, VH4-34, VH4-39, VH4-59, VH4-61, VH4-B, VH5-51, VH6-1, and VH74; preferably from the group comprising VH1-46, VH1-69, and VH3-23.
The present invention also relates to a collection of diverse human HCDR1 and/or HCDR2 obtained by the methods disclosed herein. In a particular embodiment, the invention also relates to a collection of diverse human HCDR1 and/or HCDR2, wherein said HCDR1 is encoded as trimer oligonucleotides in said VH1-69 germline according to a frequency that is at least about 70%, 75%, 80%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% identical to the frequency of Table 3, or said HCDR1 is encoded as trimer oligonucleotides in said VH1-46 germline according to a frequency that is at least about 70%, 75%, 80%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% identical to the frequency of Table 4, or said HCDR1 is encoded as trimer oligonucleotides in said VH3-23 germline according to a frequency that is at least about 70%, 75%, 80%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% identical to the frequency of Table 5; and said HCDR2 is encoded as trimer oligonucleotides in said VH1-69 germline according to a frequency that is at least about 70%, 75%, 80%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% identical to the frequency of Table 6, or said HCDR2 is encoded as trimer oligonucleotides in said VH1-46 germline according to a frequency that is at least about 70%, 75%, 80%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% identical to the frequency of Table 7, or said HCDR2 is encoded as trimer oligonucleotides in said VH3-23 germline according to a frequency that is at least about 70%, 75%, 80%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% identical to the frequency of Table 8.
The present invention further relates to a library of antibody binding regions. In particular, the present invention further relates to a library of antibody binding regions comprising a heavy chain variable domain comprising a human HCDR1 and/or a human HCDR2 obtained by the methods disclosed herein.
The term "library of antibody binding regions" refers to two or more antibody binding regions having a diversity as described herein, specifically designed according to the methods of the invention. As described throughout the specification, the term "library" is used herein in its broadest sense, and also may include the sub-libraries that may or may not be combined to produce libraries of the invention.
In the present invention, an antibody display technology is used to generate a library of antibody binding regions, in particular to display on a system the protein of interest, i.e., the antibody binding regions, to enable the isolation of molecules with given properties. Specifically, the protein of interest is linked to the genetic information or, generally, to an anchor of said system, wherein the system is selected from a group comprising: phages, yeasts, mammalian cells, ribosomes, DNA. Antibody display technologies include but are not limited to phage display (McCafferty et al., 1990; Bazan et al., 2012; Valldorf et al., 2022), yeast display (Boder ET and Wittrup KD, 1997; Weaver-Feldhaus JM et al., 2004), bacterial display (van Blarcom TJ and Harvey BR, 2009. Bacterial display of antibodies. In: Therapeutic Monoclonal Antibodies: From Bench to Clinic), mammalian display (Beerli RR et al., 2008; Tomimatsu K et al., 2013; Breous-Nystrom E et al., 2013; Horlick RA et al., 2013; Parthiban K et al., 2019), ribosome display (Hanes J and Pluckthun A, 1997), DNA display (Reiersen H et al., 2005; Sumida T et al., 2009), mRNA display (Roberts RW and Szostak JW, 1997) and bead display (Diamante L et al., 2013). In vivo systems, such as phage display, bacterial display, yeast display, and mammalian display, usually rely on anchoring the expressed antibody on a cell surface protein to maintain the phenotype-genotype linkage. In vitro systems anchor the transcription unit through a linker to the anchor (ribosome, puromycin or polystyrene beads) to maintain the phenotype-genotype linkage. This principle of genotype-phenotype coupling allows for 'barcoding' of up to several billion different protein variants, of which binders can be selected via high-throughput identification in an iterative process (Valldorf et al., 2022).
In a more particular embodiment, the present invention relates to a phage display library of antibody binding regions comprising a heavy chain variable domain comprising a human HCDR1 and/or a human HCDR2 obtained by the methods disclosed herein, and a common light chain variable domain, namely a light chain variable domain that is identical in all the antibody binding regions of the phage display library. In particular, the common light chain variable domain is any one selected from the germline repertoires: IGKV/IGKJ, IGLV/IGLJ (https://www.imgt.org/IMGTrepertoire/Proteins/). In a particular embodiment, the common light chain variable domain is a VK3-15 or a VK1-39 variable light chain. More in particular, the common light chain variable domain is VK3-15/JK1 (SEQ ID NO: 4).
In a further particular embodiment of the present invention, the phage display library of antibody binding regions disclosed herein comprises a heavy chain variable domain further comprising a heavy chain framework region. The term "framework region" refers to the art recognized portions of an antibody variable region that exist between the more divergent CDR regions. Such framework regions are typically referred to as frameworks 1 through 4 (FR1, FR2, FR3, and FR4) and provide a scaffold for holding, in three-dimensional space, the three CDRs found in a heavy or light chain antibody variable region, such that the CDRs can form an antigen-binding surface. The framework region(s) used to support the one or more CDR sequences determined or obtained as described herein can be, e.g., naturally occurring, synthetic, semisynthetic, or combinations thereof. In a preferred aspect, the phage display library of antibody binding regions disclosed herein comprises a heavy chain variable domain further comprising naturally occurring heavy chain framework regions. More specifically the naturally occurring heavy chain framework regions are derived from a VH germline listed above, e.g. selected from the group comprising VH1-2, VH1-3, VH1-8, VH1-18, VH1-24, VH1-45, VH1-46, VH1-58, VH1-69, VH2-5, VH2-26, VH2-70, VH3-7, VH3-9, VH3-11, VH3-13, VH3-15, VH3-20, VH3-21, VH3-23, VH3-30, VH3-33, VH3-43, VH3-48, VH3-49, VH3-53, VH3-64, VH3-66, VH3-72, VH3-73, VH3-74, VH4-4, VH4-28, VH4-31, VH4-34, VH4-39, VH4-59, VH4-61, VH4-B, VH5-51, VH6-1, and VH74; preferably from the group comprising VH1-46, VH1-69, and VH3-23. More specifically, selected from the group comprising VH1-46 (SEQ ID NO: 1), VH1-69 (SEQ ID NO: 2) and VH3-23 (SEQ ID NO: 3).
In one particular embodiment of the present invention, the library of antibody binding regions disclosed herein comprises a heavy chain variable domain further comprising a human HCDR3 being a naturally occurring HCDR3 from a human IgM repertoire. In a more particular embodiment, said naturally occurring HCDR3 from a human IgM repertoire is obtained by keeping the HCDR3 cassettes disclosed herein either separated based on their individual VH families (VH1 sequences or VH3 sequences) or pooled across all VH amplified.
In a preferred embodiment of the present invention the library of antibody binding regions is a phage display library. The term "phage display library" refers generally to a library generated by a phage display technique. In particular, a phage display library is a collection of phages each expressing on their surface different binding proteins, such as different antibody binding regions, from one phage to another. The term "phage" is used herein to refer to both bacteriophage and archaeophage. Bacteriophage and archaeophage are obligate intracellular parasites (with respect to both the step of identifying a host cell to infect and to only being able to productively replicate their genome in an appropriate host cell) that infect and multiply inside bacteria/archaea by making use of some or all the host biosynthetic machinery. Though different bacteriophages and archaeophages may contain different materials, they all contain nucleic acids and proteins, and can, under certain circumstances, be encapsulated in a lipid membrane. Depending upon the phage, the nucleic acid may be either DNA or RNA (but typically not both) and it can exist in various forms, with the size of the nucleic acid depending on the phage. Non-limiting examples of phage display systems that can be used in the present invention include: filamentous bacteriophages, such as f1, fd, M13; T4 phage display system; T7 phage display system and lambda phage display system.
Phage display libraries can be created either on phage or phagemid vector backbones (Scott JK and CF Barbas III. 2001. Phage-display vectors. In: Phage Display: A Laboratory Manual). Phage vectors consist of an essentially complete phage genome, often M13 phage, into which is inserted DNA encoding the protein or peptide of interest. As used herein the term "phagemid" refers to a DNA-based cloning vector, which has both bacteriophage and plasmid properties. These vectors carry, in addition to the origin of plasmid replication, an origin of replication derived from bacteriophage. In particular aspects, the phagemid is a plasmid that contains an f1 origin of replication from an f1 phage (Analysis of Genes and Genomes, John Wiley & Sons, p. 140 (2004)). Similarly to a plasmid, a phagemid can be used to clone DNA fragments and be introduced into a bacterial host by a range of techniques, such as transformation and electroporation. Phagemids are preferred both for monovalent display of antibody fragments, which allows selection by true affinity, and for the fact that transformation efficiencies in E. coli are far superior compared to the phage vectors.
The term "vector" as used herein refers to any molecule (e.g., nucleic acid, plasmid, or virus) that is used to transfer coding information to a host cell. The term "vector" includes a nucleic acid molecule that is capable of transporting another nucleic acid to which it has been linked. One type of vector is a "plasmid," which refers to a circular double-stranded DNA molecule into which additional DNA segments may be inserted. Another type of vector is a viral vector, wherein additional DNA segments may be inserted into the viral genome. Certain vectors are capable of autonomous replication in a host cell into which they are introduced (e.g., bacterial vectors having a bacterial origin of replication and episomal mammalian vectors). Other vectors (e.g., non-episomal mammalian vectors) can be integrated into the genome of a host cell upon introduction into the host cell and thereby are replicated along with the host genome. In addition, certain vectors are capable of directing the expression of genes to which they are operatively linked. Such vectors are referred to herein as "recombinant expression vectors" (or simply, "expression vectors"). In general, expression vectors of utility in recombinant DNA techniques are often in the form of plasmids. The terms "plasmid" and "vector" may be used interchangeably herein, as a plasmid is the most commonly used form of vector. This disclosure is intended to include any forms of expression vectors described above, which serve equivalent functions.
A phage display library can be used to screen for a binding protein of interest with a binding partner, such as an antigen target. The formation of a complex between a phage, the binding protein expressed on its surface and said binding partner enables the determination as to which phage express the binding protein of interest. After having selected the complex, the sequence on the vector is analyzed in order to obtain the exact sequences encoding the binding protein of interest.
In one aspect, the present invention relates to a method for identifying an antibody binding region that specifically binds to an antigen target of interest comprising the steps of (i) phage display panning on said antigen target and (ii) screening said antibody binding region that specifically binds to an antigen target of interest by contacting said antigen binding region with a target substrate.
The terms "panning" and "biopanning" as used herein are interchangeable and refer to a sequence of steps in which a library, for instance a phage display library, is incubated with a target molecule, e.g. an antigen target; phages displaying specificity to the target bind to it, while unbound phages are washed out. The specific phages are eluted and amplified in bacteria. After several rounds, amplified phages can be analyzed and further screened. Thus, panning may be employed to select binders from antibody libraries. Biopanning may be conducted in vitro by immobilizing pure antigens on solid surfaces such as polystyrene, or by biotinylating the pure antigens and immobilizing them on streptavidin-coated polystyrene surfaces. Biopanning may also be conducted in vitro by capturing the biotinylated antigens on streptavidin-coated magnetic microbeads. Biopanning may also be carried out against target antigens present on the surface or inside a living cell, or against antigens such as cell surface receptors stabilized in lipid bilayers. Several rounds of panning are required to enrich the specific binding subpopulation over the background. Furthermore, the small proportion of specific binders captured at each round of panning requires amplification of these binders in a host cell, such as in bacteria.
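The need for several rounds of panning can be illustrated with a simple calculation. The following Python sketch is not part of the disclosed method: it assumes, for illustration only, that each round multiplies the ratio of specific binders to background phages by a constant enrichment factor. The function name, the starting fraction and the enrichment factor are hypothetical values chosen for the example.

```python
def binder_fraction(initial_fraction: float,
                    enrichment_per_round: float,
                    rounds: int) -> float:
    """Fraction of specific binders in the phage pool after panning.

    Simplifying assumption (not from the source): every round multiplies
    the binder-to-background ratio by the same enrichment factor.
    """
    if not 0.0 < initial_fraction < 1.0:
        raise ValueError("initial_fraction must lie strictly between 0 and 1")
    # Work with odds (binders : background) so the result stays below 1.
    odds = initial_fraction / (1.0 - initial_fraction)
    odds *= enrichment_per_round ** rounds
    return odds / (1.0 + odds)


if __name__ == "__main__":
    # Hypothetical inputs: 1 binder per million phages, 100-fold enrichment.
    for n in range(5):
        print(n, binder_fraction(1e-6, 100.0, n))
```

Under these assumed values the binders remain a small minority after one or two rounds and only dominate the pool after about four rounds, which is consistent with the statement that several rounds are required.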
As used herein, a "phage host cell" or "host cell" or the like is a cell that can form phage from a particular type of phage genomic DNA. In some embodiments, the phage genomic DNA is introduced into the cell by infection of the cell by a phage. The phage binds to a receptor molecule on the outside of the host cell and injects its genomic DNA into the host cell. In some embodiments, the phage genomic DNA is introduced into the cell using transformation or any other suitable techniques. In certain embodiments of the present invention the host cell is a bacterium, more specifically an Escherichia coli (E. coli) bacterium.
In certain aspects of the present invention, the generated antibody binding regions are screened to select the ones that have the desired properties, such as desired binding affinity. In one embodiment, the screening comprises contacting the antigen binding region with a target substrate. The screening of the expressed antibody binding region can be done by any appropriate means. For example, binding activity can be evaluated by standard immunoassay and/or affinity chromatography. The ability of candidate antibodies to bind therapeutic targets can be assayed in vitro using, e.g., flow cytometry, surface plasmon resonance, and ELISA. In particular aspects of the present invention the binding regions selected from the common light chain libraries disclosed herein have advantageous developability properties, including a transient expression yield in HEK cells comprised between about 25 and 92 mg/mL; high solubility with median percentage of monomers higher than 95%, preferably higher than 98%, more preferably higher than 99% measured by SEC, and median SEC retention time between 12 and 14 minutes; high thermostability with melting temperature Tm higher than 80°C, e.g., equal to or higher than 82°C, 86°C, 87°C, 92°C, measured by Differential Scanning Fluorimetry (DSF).
The present invention also relates to a polynucleotide, e.g. an isolated polynucleotide such as an antibody binding region identified according to the method above for identifying an antibody binding region that specifically binds to an antigen target of interest comprising the steps of (i) phage display panning on said antigen target and (ii) screening said antibody binding region that specifically binds to an antigen target of interest by contacting said antigen binding region with a target substrate. In particular, antigen panning is performed with any display system indicated above (e.g., phage, yeast, mammalian). In a specific embodiment, the panning is a phage display panning.
According to the present invention the antibody binding region can be one or more portions of an immunoglobulin or antibody variable region capable of binding an antigen(s). Antibody binding regions include whole antibodies as well as any antigen binding fragments or single chains thereof as indicated above.
The term "polynucleotide" as used herein refers to single-stranded or double-stranded nucleic acid polymers of at least 10 nucleotides in length. In certain embodiments, the nucleotides comprising the polynucleotide can be ribonucleotides or deoxyribonucleotides or a modified form of either type of nucleotide. Such modifications include base modifications such as bromouridine, ribose modifications such as arabinoside and 2',3'-dideoxyribose, and internucleotide linkage modifications such as phosphorothioate, phosphorodithioate, phosphoroselenoate, phosphorodiselenoate, phosphoroanilothioate, phosphoraniladate and phosphoroamidate. The term "polynucleotide" specifically includes single-stranded and double-stranded forms of DNA.
An "isolated polynucleotide" is a polynucleotide of genomic, cDNA, or synthetic origin or some combination thereof, which: (1) is not associated with all or a portion of a polynucleotide in which the isolated polynucleotide is found in nature, (2) is linked to a polynucleotide to which it is not linked in nature, or (3) does not occur in nature as part of a larger sequence.
An "isolated polypeptide" is one that: (1) is free of at least some other polypeptides with which it would normally be found, (2) is essentially free of other polypeptides from the same source, e.g., from the same species, (3) is expressed by a cell from a different species, (4) has been separated from at least about 50 percent of polynucleotides, lipids, carbohydrates, or other materials with which it is associated in nature, (5) is not associated (by covalent or noncovalent interaction) with portions of a polypeptide with which the "isolated polypeptide" is associated in nature, (6) is operably associated (by covalent or noncovalent interaction) with a polypeptide with which it is not associated in nature, or (7) does not occur in nature. Such an isolated polypeptide can be encoded by genomic DNA, cDNA, mRNA or other RNA, of synthetic origin, or any combination thereof. Preferably, the isolated polypeptide is substantially free from polypeptides or other contaminants that are found in its natural environment that would interfere with its use (therapeutic, diagnostic, prophylactic, research or otherwise).
In other aspects of the present invention the binding region identified according to the methods disclosed herein is used for the development of therapeutic antibodies. Said therapeutic antibody can be a full antibody, or an antibody fragment or single chain thereof as defined above. In certain aspects of the present disclosure, the therapeutic antibody is a monoclonal antibody or antibody fragment thereof. The term "monoclonal antibody" as used herein, refers to antibodies that are produced by clone cells all deriving from the same single cell, and that specifically bind the same epitope of the target antigen. When therapeutic antibodies are produced, the generation of monoclonal antibodies is preferred over polyclonal antibodies. In fact, while monoclonal antibodies are produced by cells originating from a single clone and all bind the same epitope, polyclonal antibodies are produced by different immune cells and recognize multiple epitopes of a certain antigen. Monoclonal antibodies assure batch-to-batch homogeneity, reduced cross-reactivity and high specificity toward the target. Monoclonal antibodies can be expressed, for instance in host cells, using recombinant DNA, giving rise to a recombinant antibody. In other aspects of the present disclosure, the therapeutic antibody is a monospecific or multispecific antibody or antibody fragment thereof. The term "monospecific antibody" as used herein, refers to any antibody or fragment having one or more binding sites, all binding the same epitope. The term "multispecific antibody" as used herein, refers to any antibody or fragment having more than one binding site that can bind different epitopes of the same antigen, or different antigens. Non-limiting examples of multispecific antibodies are bispecific, trispecific and tetraspecific antibodies. The present invention further relates to an antibody binding region, e.g., an antibody binding region identified from the library disclosed herein.
In particular embodiments, the antibody binding region of the present invention binds to a target that is a polypeptide. For example, the target is a protein. Non-limiting examples of targets bound by the antibody binding region of the present invention include: ABCF1; ACVR1; ACVR1B; ACVR2; ACVR2B; ACVRL1; ADORA2A; ADRB3; Aggrecan; AGR2; AICDA; AIF1; AIG1; AKAP1; AKAP2; ALK; AMH; AMHR2; ANGPT1; ANGPT2; ANGPTL3; ANGPTL4; ANPEP; APC; APOC1; AR; AZGP1 (zinc-a-glycoprotein); B7.1; B7.2; BAD; BAFF; BAG1; BAI1; BCL2; BCL6; BDNF; BLNK; BLR1 (MDR15); BlyS; BMP1; BMP2; BMP3B (GDF10); BMP4; BMP6; BMP8; BMPR1A; BMPR1B; BMPR2; BPAG1 (plectin); BRCA1; C19orf10 (IL27w); C3; C4A; C5; C5R1; Cadherin 17; CANT1; CASP1; CASP4; CAV1; CCBP2 (D6/JAB61); CCL1 (I-309); CCL11 (eotaxin); CCL13 (MCP-4); CCL15 (MIP-1d); CCL16 (HCC-4); CCL17 (TARC); CCL18 (PARC); CCL19 (MIP-3b); CCL2 (MCP-1); MCAF; CCL20 (MIP-3a); CCL21 (MIP-2); SLC; exodus-2; CCL22 (MDC/STC-1); CCL23 (MPIF-1); CCL24 (MPIF-2/eotaxin-2); CCL25 (TECK); CCL26 (eotaxin-3); CCL27 (CTACK/ILC); CCL28; CCL3 (MIP-1a); CCL4 (MIP-1b); CCL5 (RANTES); CCL7 (MCP-3); CCL8 (mcp-2); CCNA1; CCNA2; CCND1; CCNE1; CCNE2; CCR1 (CKR1/HM145); CCR2 (mcp-1 RB/RA); CCR3 (CKR3/CMKBR3); CCR4; CCR5 (CMKBR5/ChemR13); CCR6 (CMKBR6/CKR-L3/STRL22/DRY6); CCR7 (CKR7/EBI1); CCR8 (CMKBR8/TER1/CKR-L1); CCR9 (GPR-9-6); CCRL1 (VSHK1); CCRL2 (L-CCR); CD164; CD19; CD1C; CD20; CD200; CD22; CD24; CD28; CD3; CD37; CD38; CD3E; CD3G; CD3Z; CD4; CD32b; CD40; CD40L; CD44; CD45RB; CD47; CD52; CD69; CD72; CD74; CD79A; CD79B; CD8; CD80; CD81; CD83; CD86; CD97; CD179a; CDH1 (E-cadherin); CDH10; CDH12; CDH13; CDH18; CDH19; CDH20; CDH5; CDH7; CDH8; CDH9; CDK2; CDK3; CDK4; CDK5; CDK6; CDK7; CDK9; CDKN1A (p21Wap1/Cip1); CDKN1B (p27Kip1); CDKN1C; CDKN2A (p16INK4a); CDKN2B; CDKN2C; CDKN3; CEBPB; CER1; CHGA; CHGB; Chitinase; CHST10; CKLFSF2; CKLFSF3; CKLFSF4; CKLFSF5; CKLFSF6; CKLFSF7; CKLFSF8; CLDN1; CLDN3; CLDN6; CLDN7; CLN3; CLU (clusterin); CMKLR1; CMKOR1 (RDC1);
CNR1; COL18A1; COL1A1; COL4A3; COL6A1; CR2; CRP; CSF1 (M-CSF); CSF2 (GM-CSF); CSF3 (GCSF); CTLA4; CTNNB1 (b-catenin); CTSB (cathepsin B); CX3CL1 (SCYD1); CX3CR1 (V28); CXCL1 (GRO1); CXCL10 (IP-10); CXCL11 (I-TAC/IP-9); CXCL12 (SDF1); CXCL13; CXCL14; CXCL16; CXCL2 (GRO2); CXCL3 (GRO3); CXCL5 (ENA-78/LIX); CXCL6 (GCP-2); CXCL9 (MIG); CXCR3 (GPR9/CKR-L2); CXCR4; CXCR6 (TYMSTR/STRL33/Bonzo); CYB5; CYC1; CYSLTR1; CGRP; C1q; C1r; C1; C4a; C4b; C2a; C2b; C3a; C3b; DAB2IP; DES; DKFZp451J0118; DNCL1; DPP4; E-selectin; E2F1; ECGF1; EDG1; EFNA1; EFNA3; EFNB2; EGF; EGFR; EGFRvIII; ELAC2; ENG; ENO1; ENO2; ENO3; EPHB4; EPO; ERBB2 (Her-2); EREG; ERK8; ESR1; ESR2; F3 (TF); Factor VII; Factor IX; Factor V; Factor VIIa; Factor X; Factor XII; Factor XIII; FADD; FasL; FASN; FCER1A; FCER2; Fc gamma receptor; FCGR3A; FCRL5; FGF; FGF1 (aFGF); FGF10; FGF11; FGF12; FGF12B; FGF13; FGF14; FGF16; FGF17; FGF18; FGF19; FGF2 (bFGF); FGF20; FGF21; FGF22; FGF23; FGF3 (int-2); FGF4 (HST); FGF5; FGF6 (HST-2); FGF7 (KGF); FGF8; FGF9; FGFR3; FIGF (VEGFD); FIL1 (EPSILON); FIL1 (ZETA); FLJ12584; FLJ25530; FLRT1 (fibronectin); FLT1; Folate receptor alpha; Folate receptor beta; FOS; FOSL1 (FRA-1); Fucosyl GM1; FY (DARC); GABRP (GABAa); GAGEB1; GAGEC1; GALNAC4S-6ST; GATA3; GDF5; GFI1; GGT1; GM-CSF; GloboH; GNAS1; GNRH1; GPNMB; GPR2 (CCR10); GPR20; GPR31; GPR44; GPR64; GPR81 (FKSG80); GPRC5D; GRCC10 (C10); GRP; GSN (Gelsolin); GSTP1; glycoprotein (gP) IIb/IIIa; HAVCR1; HAVCR2; HDAC4; HDAC5; HDAC7A; HDAC9; Her2; HER3; HGF; HIF1A; HIP1; histamine and histamine receptors; HLA-A; HLA-DRA; HM74; HMGB1; HMOX1; HMWMAA; HUMCYT2A; ICEBERG; ICOSL; ID2; IFN-a; IFNA1; IFNA2; IFNA4; IFNA5; IFNA6; IFNA7; IFNB1; IFN-g; IFNW1; IGBP1; IGF1; IGF1R; IGF2; IGFBP2; IGFBP3; IGFBP6; IL-1; IL-a; IL-1-b; IL10; IL10RA; IL10RB; IL11; IL11RA; IL-12; IL12A; IL12B; IL12RB1; IL12RB2; IL13; IL13RA1; IL13RA2; IL14; IL15; IL15RA; IL16; IL17; IL17B; IL17C; IL17R; IL18; IL18BP; IL18R1; IL18RAP; IL19; IL1A; IL1B;
IL1F10; IL1F5; IL1F6; IL1F7; IL1F8; IL1F9; IL1HY1; IL1R1; IL1R2; IL1RAP; IL1RAPL1; IL1RAPL2; IL1RL1; IL1RL2; IL1RN; IL2; IL20; IL20RA; IL21R; IL22; IL22R; IL22RA2; IL23; IL24; IL25; IL26; IL27; IL28A; IL28B; IL29; IL2RA; IL2RB; IL2RG; IL3; IL30; IL3RA; IL4; IL4R; IL5; IL5RA; IL6; IL6R; IL6ST (glycoprotein 130); IL7; IL7R; IL8; IL8RA; IL8RB; IL8RB; IL9; IL9R; ILK; INHA; INHBA; INSL3; INSL4; IRAK1; IRAK2; ITGA1; ITGA2; ITGA3; ITGA6 (a6 integrin); ITGAV; ITGB3; ITGB4 (b4 integrin); JAG1; JAK1; JAK3; JUN; K6HF; KAI1; KDR; KITLG; KLF5 (GC Box BP); KLF6; KLK10; KLK12; KLK13; KLK14; KLK15; KLK3; KLK4; KLK5; KLK6; KLK9; KRT1; KRT19 (Keratin19); KRT2A; KRTHB6 (hair-specific type II Keratin19); L-selectin; LAMA5; LEP (leptin); Lingo-p75; Lingo-Troy; LRP6; LPS; LTA (TNF-b); LTB; LTB4R (GPR16); LTB4R2; LTBR; LY6K; LYPD8; MACMARCKS; MAG or Omgp; MAP2K7 (c-Jun); MDK; mesothelin; MIB1; midkine; MIF; MIP-2; MKI67 (Ki-67); MMP2; MMP9; MS4A1; MSMB; MT3 (metallothionectin-III); MTSS1; MUC1 (mucin); MYC; MYD88; NCK2; neurocan; Nectin-4; NKp46; NKp44; NKp30; NKG2D; NFKB1; NFKB2; NGF; NGFB (NGF); NGFR; NgR-Lingo; NgR-Nogo66 (Nogo); NgR-p75; NgR-Troy; NME1 (NM23A); NOX5; NPPB; NR0B1; NR0B2; NR1D1; NR1D2; NR1H2; NR1H3; NR1H4; NR1I2; NR1I3; NR2C1; NR2C2; NR2E1; NR2E3; NR2F1; NR2F2; NR2F6; NR3C1; NR3C2; NR4A1; NR4A2; NR4A3; NR5A1; NR5A2; NR6A1; NRP1; NRP2; NT5E; NTN4; NY-BR-1; o-acetyl-GD2; ODZ1; OPRD1; OR51E2; P2RX7; PANX3; PAP; PART1; PATE; PAWR; PCA3; PCNA; PDGFA; PDGFB; PD1; PDL1; PDL2; PECAM1; PF4 (CXCL4); PGE2; PGF; PGR; phosphacan; PIAS2; PIK3CG; PLAC1; plasminogen activator; PLAU (uPA); PLG; PLXDC1; polysialic acid; PPBP (CXCL7); PPID; PR1; PRKCQ; PRKD1; PRL; PROC; Protein C; PROK2; PSAP; PSCA; PTAFR; PTEN; PTGS2 (COX-2); PTN; RAC2 (p21Rac2); RAGE; RARB; RGS1; RGS13; RGS3; RNF110 (ZNF144); ROBO2; S100A2; SCGB1D2 (lipophilin B); SCGB2A1 (mammaglobin 2); SCGB2A2 (mammaglobin 1); SCYE1 (endothelial Monocyte-activating cytokine); SDF2; SERPINA1; SERPINA3; SERPINB5 (maspin); SERPINE1
(PAI-1); SERPINF1; SHBG; SLA2; SLC2A2; SLC33A1; SLC34A2; SLC39A6; SLC43A1; SLIT2; SLITRK6; SPP1; SPRR1B (Spr1); ST6GAL1; STAB1; STAT6; STEAP; STEAP2; substance P; TACSTD2; TB4R2; TBX21; TCP10; TDGF1; TEK; TEM1/CD248; TEM7R; TGFA; TGFB1; TGFB1I1; TGFB2; TGFB3; TGFBI; TGFBR1; TGFBR2; TGFBR3; TH1L; THBS1 (thrombospondin-1); THBS2; THBS4; THPO; TIE (Tie-1); TIMP3; tissue factor; TLR10; TLR2; TLR3; TLR4; TLR5; TLR6; TLR7; TLR8; TLR9; TNF; TNF-a; TNFAIP2 (B94); TNFAIP3; TNFRSF11A; TNFRSF1A; TNFRSF1B; TNFRSF21; TNFRSF5; TNFRSF6 (Fas); TNFRSF7; TNFRSF8; TNFRSF9; TNFSF10 (TRAIL); TNFSF11 (TRANCE); TNFSF12 (APO3L); TNFSF13 (April); TNFSF13B; TNFSF14 (HVEM-L); TNFSF15 (VEGI); TNFSF18; TNFSF4 (OX40 ligand); TNFSF5 (CD40 ligand); TNFSF6 (FasL); TNFSF7 (CD27 ligand); TNFSF8 (CD30 ligand); TNFSF9 (4-1BB ligand); TOLLIP; Toll-like receptors; TOP2A (topoisomerase IIa); TP53; TPM1; TPM2; TRADD; TRAF1; TRAF2; TRAF3; TRAF4; TRAF5; TRAF6; TREM1; TREM2; TROP2; TRPC6; TSHR; TSLP; TWEAK; thrombomodulin; thrombin; UPK2; VEGF; VEGFB; VEGFC; versican; VHL C5; VLA-4; XCL1 (lymphotactin); XCL2 (SCM-1b); XCR1 (GPR5/CCXCR1); YY1; and ZFPM2.
Figures
Figure 1. (A) HCDR1 and (B) HCDR2 normalized diversity scores for VH1-69 (diamonds), VH1-46 (triangles) and VH3-23 (circles) germlines. Residues showing a diversity score close to or higher than 75% of the highest calculated value, including positions 30, 31, 53, 56 and 58 of VH1-69, positions 31, 56 and 58 of VH1-46 and positions 30, 31, 55 and 56 of VH3-23, were diversified beyond the natural diversity in libraries A.
Figure 2. Distribution of the number of Somatic-Hyper-Mutations (i.e. substitutions compared to the germline sequence) in unique (A) HCDR1 Kabat 27-35 and (B) HCDR2 Kabat 50-58 sequences from VH1-69 (diamonds), VH1-46 (triangles) and VH3-23 (circles) extracted from human IgG repertoire. Sequences harboring more than 5 or 6 substitutions compared to the germline sequence were removed from HCDR1 or HCDR2 in libraries B, respectively.
Figure 3. Synthetic and semi-synthetic phage display common Light Chain libraries cloning. HCDR1/2 cassettes were constructed using trimer oligonucleotides (library A), array-based synthesis (library B) or germline DNA template (library C). Synthetic HCDR3 diversity was generated using trimer oligonucleotides (library 1) while natural VH-family specific and pooled HCDR3 diversity was harvested from 10 human donors (library 2 and 3, respectively). Synthetic and semi-synthetic libraries were generated by assembling HCDR1/2 and HCDR3 DNA fragments by PCR and cloned into the pNGLEN phagemid vector.
Figure 4. HCDR3 length (amino-acid) distribution as observed in representatives for library 1 (VH1-69_A1, HCDR3 synthetic sequences), library 2 (VH1-69_A2, natural VH-family specific) and library 3 (VH1-69_A3, natural pooled sequences). As a comparison, HCDR3 length distribution of natural human IgG repertoire obtained by Next-Generation-Sequencing is shown.
Figure 5. Levenshtein distances between unique HCDR1, HCDR2 or HCDR3 from unique hits selected against CD47, CD28 and NKp46.
Figure 6. HCDR3 length (amino acid) distribution of unique HCDR3 sequences selected against CD47, CD28 and NKp46.
Figure 7. Comparison of the number of hits selected from the different HCDR1 and HCDR2 designs including library A (generated using trimer oligonucleotides) library B (array-based synthesis) and library C (germline sequence) against CD47, CD28 and NKp46. NA: Not Applicable because this library does not exist.
Figure 8. Comparison of the number of clonotypes (>80% HCDR3 sequence identity) selected from the different HCDR1 and HCDR2 designs including library A (generated using trimer oligonucleotides) library B (array-based synthesis) and library C (germline sequence) against CD47, CD28 and NKp46. NA: Not Applicable because this library does not exist.
Figure 9. Comparison of the percentage of hit sequences without sequence liabilities in HCDR1/2 selected from the different HCDR1 and HCDR2 designs including library A (generated using trimer oligonucleotides) and library B (array-based synthesis) against CD47, CD28 and NKp46. NA: Not Applicable because this library does not exist, or no hits were selected.
Figure 10. Comparison of the number of hits and clonotypes, and percentage of hit sequences without sequence liabilities in HCDR3 selected from the different HCDR3 designs including Library 1 (synthetic), library 2 (natural VH-family specific) and library 3 (natural pooled sequences) against CD47, CD28 and NKp46.
Figure 11. Yield values (shown in mg/L) for all the CD47 and CD28 Fabs expressed, segregated according to germlines (A), HCDR1/2 diversity (B) and HCDR3 origin (C). Assuming Gaussian distribution, unpaired t-test was used for reporting data significance (ns = not significant; * = P<0.05; ** = P<0.005 and *** = P<0.0005).
Figure 12. HPLC-SEC monomer peak at 214 nm (shown as % of total) for all the CD47 and CD28 Fabs expressed, segregated according to germlines (A), HCDR1/2 diversity (B) and HCDR3 origin (C). Assuming Gaussian distribution, unpaired t-test was used for reporting data significance (ns = not significant; * = P<0.05; ** = P<0.005 and *** = P<0.0005).
Figure 13. HPLC-SEC retention time (shown in minutes) for all the CD47 and CD28 Fabs expressed, segregated according to germlines (A), HCDR1/2 diversity (B) and HCDR3 origin (C). Assuming Gaussian distribution, unpaired t-test was used for reporting data significance (ns = not significant; * = P<0.05; ** = P<0.005 and *** = P<0.0005).
Figure 14. Melting temperature (Tm) measured by DSF (shown in °C) for all the CD47 and CD28 Fabs expressed, segregated according to germlines (A), HCDR1/2 diversity (B) and HCDR3 origin (C). Assuming Gaussian distribution, unpaired t-test was used for reporting data significance (ns = not significant; * = P<0.05; ** = P<0.005 and *** = P<0.0005).
Figure 15. Affinity values measured by SPR for CD47 Fabs with KD<4000 nM, segregated according to germlines (A), HCDR1/2 diversity (B) and HCDR3 origin (C). Assuming Gaussian distribution, unpaired t-test was used for reporting data significance (ns = not significant; * = P<0.05; ** = P<0.005 and *** = P<0.0005).
Figure 16. Affinity values measured by SPR for CD28 Fabs with KD<4000 nM, segregated according to germlines (A), HCDR1/2 diversity (B) and HCDR3 origin (C). Assuming Gaussian distribution, unpaired t-test was used for reporting data significance (ns = not significant; * = P<0.05; ** = P<0.005 and *** = P<0.0005).
Figure 17. Yield values (shown in mg/L) for NKp46 Fabs expressed, segregated according to germlines (A), HCDR1/2 diversity (B) and HCDR3 origin (C). Assuming Gaussian distribution, unpaired t-test was used for reporting data significance (ns = not significant; * = P<0.05; ** = P<0.005 and *** = P<0.0005).
Figure 18. HPLC-SEC monomer peak at 214 nm (shown as % of total) for NKp46 Fabs expressed, segregated according to germlines (A), HCDR1/2 diversity (B) and HCDR3 origin (C). Assuming Gaussian distribution, unpaired t-test was used for reporting data significance (ns = not significant; * = P<0.05; ** = P<0.005 and *** = P<0.0005).
Figure 19. HPLC-SEC retention time (shown in minutes) for NKp46 Fabs expressed, segregated according to germlines (A), HCDR1/2 diversity (B) and HCDR3 origin (C). Assuming Gaussian distribution, unpaired t-test was used for reporting data significance (ns = not significant; * = P<0.05; ** = P<0.005 and *** = P<0.0005).
Figure 20. Melting temperature (Tm) measured by DSF (shown in °C) for NKp46 Fabs expressed, segregated according to germlines (A), HCDR1/2 diversity (B) and HCDR3 origin (C). Assuming Gaussian distribution, unpaired t-test was used for reporting data significance (ns = not significant; * = P<0.05; ** = P<0.005 and *** = P<0.0005).
Figure 21. Affinity values measured by SPR for NKp46 Fabs with KD<4000 nM, segregated according to germlines (A), HCDR1/2 diversity (B) and HCDR3 origin (C). Assuming Gaussian distribution, unpaired t-test was used for reporting data significance (ns = not significant; * = P<0.05; ** = P<0.005 and *** = P<0.0005).
Figure 22. Example of data processing for anti-NKp46 Fab epitope binning. The sensorgram depicts an overlay of the signals obtained from all cycles of NKp46 and Fab 2 (analyte) binding to one immobilized Fab. Signal was normalized at the end of the antigen injection (first vertical band), the binning signal was read at the second vertical band with gating above the buffer cycles (horizontal band). Signals below the gate result from competing Fabs (no binding of the second Fab as analyte) and signals above the gate result from non-competing Fab (binding of the second Fab).
Figure 23. Graphical representation of the anti NKp46 Fabs epitope binning data and the distribution of Fabs according to their germline. Binders are represented as a circle when used as both ligand and analyte, as a square when used as analyte only (black squares: VH1-69 germline; white squares: VH1-46 germline). Competition is depicted as a straight cord and Fabs within a same bin are enclosed. Data was generated using Carterra LSA™ instrument and figure using Carterra Epitope™ software.
Figure 24. Graphical representation of the anti NKp46 Fabs epitope binning data and the distribution of Fabs according to their HCDR1/2 origin. Binders are represented as a circle when used as both ligand and analyte, as a square when used as analyte only (black squares: Trimer-based HCDR1/2; grey squares: Array-based HCDR1/2; white squares: Germline HCDR1/2). Competition is depicted as a straight cord and Fabs within a same bin are enclosed. Data was generated using Carterra LSA™ instrument and figure using Carterra Epitope™ software.
Figure 25. Graphical representation of the anti NKp46 Fabs epitope binning data and the distribution of Fabs according to their HCDR3 origin. Binders are represented as a circle when used as both ligand and analyte, as a square when used as analyte only (black squares: Synthetic HCDR3; grey squares: VH-specific HCDR3; white squares: VH-pooled HCDR3). Competition is depicted as a straight cord and Fabs within a same bin are enclosed. Data was generated using Carterra LSA™ instrument and figure using Carterra Epitope™ software.
Example 1: Design of HCDR1 and HCDR2 libraries
Materials and Methods
Next-Generation-Sequencing of human IgG antibody repertoire.
Next-Generation-Sequencing (NGS) of human antibody repertoire was performed at Quintara Biosciences (US). IgG variable domains were amplified from peripheral leukocytes mRNA (636170, TaKaRa) using a reverse IgG specific primer (sgatgggcccttggtggargc) (ref: Commonality despite exceptional diversity in the baseline human antibody repertoire, Bryan Briney, https://www.nature.com/articles/s41586-019-0879-y). Antibody variable heavy chain (VH) libraries for NGS were constructed using a template switching oligo (TSO) protocol (ref: Zhu YY, Machleder EM, et al. (2001) Reverse transcriptase template switching: a SMART approach for full-length cDNA library construction. Biotechniques, 30(4):892-897. // Wellenreuther R, Schupp I, et al. (2004) SMART amplification combined with cDNA size fractionation in order to obtain large full-length clones. BMC Genomics, 5(1):36. // Ramskold D, Luo S, et al. (2012) Full-length mRNA-Seq from single-cell levels of RNA and individual circulating tumor cells. Nat Biotechnol, 30(8):777-782.). VH amplification that incorporated unique molecular identifiers (Read1 and Read2) together with Illumina Adapter Sequences (P5 and P7, Illumina) was performed, resulting in a 500-550 bp amplicon library with the following structure: P5-Read1-TSO-UTR-Leader-VDJ-Read2-P7. Read1 and Read2 sequences were subsequently used for MiSeq sequencing using a 2 x 300 bp chip. Raw sequence reads corresponding to TSO-UTR-Leader-VDJ sequences were paired and analyzed using a custom NGS analysis pipeline.
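The reverse IgG specific primer given above contains the IUPAC degenerate nucleotide codes s (c or g) and r (a or g), so it stands for a small pool of oligonucleotides. The following Python sketch simply enumerates that pool; the function name is hypothetical and the code is offered only as an illustration of how the degenerate notation is read, not as part of the sequencing protocol.

```python
from itertools import product

# Standard IUPAC nucleotide codes (lower case, as written in the primer).
IUPAC = {
    "a": "a", "c": "c", "g": "g", "t": "t",
    "r": "ag", "y": "ct", "s": "cg", "w": "at", "k": "gt", "m": "ac",
    "b": "cgt", "d": "agt", "h": "act", "v": "acg", "n": "acgt",
}


def expand_degenerate(primer: str) -> list[str]:
    """Return every concrete sequence encoded by a degenerate primer."""
    options = [IUPAC[base] for base in primer.lower()]
    return ["".join(combo) for combo in product(*options)]


if __name__ == "__main__":
    variants = expand_degenerate("sgatgggcccttggtggargc")
    # Two degenerate positions with two options each give four variants.
    for seq in variants:
        print(seq)
```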
Design of trimer oligonucleotides to randomize HCDR1 and HCDR2 (library A)
The first strategy to introduce diversity in HCDR1 and HCDR2 took advantage of the germline-specific natural amino acid frequencies at each position with modifications. Nucleotide VH sequences from IgG NGS data set were segregated according to their respective germlines upon successful alignment to reference IMGT VH nucleotide sequences. After translation, unique HCDR1 and HCDR2 amino-acid sequence regions (Kabat 25-36 and Kabat 47-65, respectively) were extracted and aligned to identify natural residues frequencies in HCDR1 and HCDR2. In addition, single positions mutation rates were calculated according to the occurrence of non-germline residues identified in each HCDR positions. Similarly, for each position a diversity score representing their relative tolerance to residue substitution was calculated by multiplying the mutation rate with the summed number of amino acids required to explain 80% of the mutation rate.
For the positions having a diversity score below 75% of the highest calculated value, naturally occurring amino acids were used, except that residues occurring at <1% and C were excluded from the design, and M was not considered unless located in a buried canonical position. All other amino acids were kept at their naturally occurring proportions. The positions having a higher diversity score were "hard randomized" using all amino acids, independently of the natural diversity, with exceptions: C, M and W were removed, and P was not considered unless naturally occurring. Final diversity-incorporating designs were encoded as trimer oligonucleotides using an E. coli optimized codon set (Ella Biotech).
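As an illustration only, the diversity score and the 75% cut-off described above could be computed along the following lines. This is a minimal sketch and not the custom pipeline used in this study: the function names, the position labels and the exact reading of "number of amino acids required to explain 80% of the mutation rate" (the fewest non-germline residues, taken from most to least frequent, that together account for 80% of the mutated observations) are assumptions.

```python
from collections import Counter


def diversity_score(column, germline_residue, coverage=0.80):
    """Return (mutation_rate, n_residues, score) for one aligned CDR position.

    `column` holds the residue observed at this position in each unique CDR.
    Hypothetical helper: the source does not publish its implementation.
    """
    total = len(column)
    # Count only residues that differ from the germline residue.
    mutated = Counter(aa for aa in column if aa != germline_residue)
    n_mutated = sum(mutated.values())
    if total == 0 or n_mutated == 0:
        return 0.0, 0, 0.0
    mutation_rate = n_mutated / total
    # Add substitutions from most to least frequent until they account for
    # `coverage` of all mutated observations (assumed interpretation).
    covered, n_residues = 0, 0
    for _, count in mutated.most_common():
        covered += count
        n_residues += 1
        if covered >= coverage * n_mutated:
            break
    return mutation_rate, n_residues, mutation_rate * n_residues


def split_positions(scores, cutoff=0.75):
    """Label positions 'natural' (score < cutoff * highest) or 'hard'."""
    if not scores:
        return {}
    top = max(scores.values())
    return {
        pos: ("natural" if score < cutoff * top else "hard")
        for pos, score in scores.items()
    }
```

Positions labelled "natural" would then keep the filtered natural frequencies, while positions labelled "hard" would be hard randomized, subject to the residue exclusions listed above.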
Design of HCDR1 and HCDR2 for array-based synthesis (library B)
The second strategy to introduce diversity in HCDR1 and HCDR2 used unique germline-specific HCDR1/2 sequences (Kabat 27-35 and Kabat 50-58, respectively) from the IgG NGS data set, with modifications. First, sequences harboring more than 5 or 6 substitutions per CDR compared to the germline sequence were removed from the HCDR1 or HCDR2 libraries, respectively. Then, sequences containing sequence liabilities (reported in Table 1) were removed, and sequences encompassing immunogenic peptides were discarded upon filtering using the Episcore in-silico immunogenicity prediction tool with a threshold value of 50% (ref: Dhanda et al.: Prediction of HLA CD4 immunogenicity in human populations, Frontiers in Immunology, 2018, 9, 1369 // Paul et al.: Development and validation of a broad scheme for prediction of HLA class II restricted T cell epitopes, Journal of Immunological Methods, 2015, 422, 28-34). The resulting HCDR1/2 germline-specific sequence lists were synthetized using array-based synthesis and shuffled in an HCDR1/2 germline-specific fragment (framework (FR) 1 to framework 3) using E. coli codon optimization (Twist Biosciences).
Figure imgf000032_0001
Table 1: List of sequence liabilities removed from HCDR1 and HCDR2 sequences prior to array-based synthesis. * x ≠ P, ** y = P/S/T/A, *** Neg = D/E
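The first filtering step of the library B design (removal of CDRs carrying more than 5 or 6 substitutions relative to germline) can be sketched as follows. This is a hedged illustration: the function names are hypothetical, substitutions are counted position by position on CDRs of germline length, and the sequences shown in the usage note are placeholders rather than data from this study. The liability and Episcore filters are not reproduced here.

```python
def count_substitutions(cdr, germline):
    """Number of positions at which `cdr` differs from the germline CDR."""
    if len(cdr) != len(germline):
        raise ValueError("CDR and germline must have the same length")
    return sum(a != b for a, b in zip(cdr, germline))


def filter_by_substitutions(cdrs, germline, max_subs):
    """Keep germline-length CDRs with at most `max_subs` substitutions.

    Assumption: max_subs=5 for HCDR1 and max_subs=6 for HCDR2, following
    the "more than 5 or 6 substitutions" rule in the text.
    """
    return [
        cdr
        for cdr in cdrs
        if len(cdr) == len(germline)
        and count_substitutions(cdr, germline) <= max_subs
    ]
```

For example, with a placeholder germline string "GTFSSYAIS", calling filter_by_substitutions(candidates, "GTFSSYAIS", max_subs=5) keeps a candidate with one substitution and drops one with six.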
Results and conclusions
Design of trimer oligonucleotides to randomize HCDR1 and HCDR2
Three VH germlines, VH1-46 (SEQ ID NO: 1), VH1-69 (SEQ ID NO: 2) and VH3-23 (SEQ ID NO: 3), were selected as heavy chain frameworks because they are well represented in the human antibody repertoire and can naturally pair with the VK3-15 variable light chain (VL) that is used as the common Light Chain germline in this study. In addition, these VH germlines are known to have good developability properties and are well represented in the antibody therapeutics landscape. To develop optimal germline-specific HCDR1 and HCDR2 libraries, the human IgG repertoire was characterized by NGS. Around 3.9 million paired reads could be successfully aligned with human VH sequences. Then, unique HCDR1 and HCDR2 regions matching germline sequence length (Kabat 25-36 and Kabat 47-65, respectively) were aligned to identify natural residue frequencies at each position and calculate a diversity score. The numbers of unique HCDR1 and HCDR2 regions used in this analysis are reported in Table 2.
Figure imgf000033_0001
Table 2: Number of unique HCDR1 and HCDR2 regions from human IgG repertoire NGS analysis considered for the design of trimer oligonucleotides
Based on antibody structure, natural residue frequencies and diversity score analyses, residues Kabat 27-35 (referred to as HCDR1) and Kabat 50-58 (referred to as HCDR2) were considered for randomization. As described in the method, for the positions having the lower diversity score (<75% of the highest calculated value, Figure 1), residues occurring at <1% and C were excluded from the design, and M was not considered unless located in a buried canonical position. Indeed, both C and M are high-risk Post-Translational Modification (PTM) motifs. All other amino acids were kept at naturally occurring ratios. The positions having a higher diversity score were "hard randomized" using all amino acids with exceptions: C, M and W were removed, and P was not considered unless naturally occurring. Here, W was removed to limit the selection of binders with hydrophobic patches and P was removed from non-naturally occurring positions to prevent CDR structure perturbation. HCDR1 and HCDR2 amino-acid frequencies encoded in germline-specific trimer oligonucleotides are described in Table 3 to Table 8.
Figure imgf000033_0002
Figure imgf000034_0001
Table 3: Amino-acid frequencies of VH1-69 HCDR1 encoded in trimer oligonucleotides
Figure imgf000034_0002
Table 4: Amino-acid frequencies of VH1-46 HCDR1 encoded in trimer oligonucleotides
Figure imgf000034_0003
Figure imgf000035_0001
Table 5: Amino-acid frequencies of VH3-23 HCDR1 encoded in trimer oligonucleotides
Figure imgf000035_0002
Figure imgf000036_0001
Table 6: Amino-acid frequencies of VH1-69 HCDR2 encoded in trimer oligonucleotides
Figure imgf000036_0002
Table 7: Amino-acid frequencies of VH1-46 HCDR2 encoded in trimer oligonucleotides
Figure imgf000036_0003
Figure imgf000037_0001
Table 8: Amino-acid frequencies of VH3-23 HCDR2 encoded in trimer oligonucleotides.
Design of HCDR1 and HCDR2 for array-based synthesis
The human IgG repertoire NGS data set was also used to design a unique set of germline-specific HCDR1/2 sequences. Importantly, all NGS reads were considered, even those appearing at negligible frequencies that could potentially be attributed to NGS errors. Such sequences are expected to create extra diversity beyond the natural diversity and could be beneficial when generating common Light Chain (cLC) libraries where the diversity is restricted to the heavy chain. Unique HCDR1 and HCDR2 sequences (Kabat 27-35 and Kabat 50-58, respectively) were identified and further filtered to optimize library functionality and binder developability. First, sequences harboring more than 60% of Somatic-Hyper-Mutations (SHMs) were removed to generate a more "naive" library (i.e. removing more highly mutated CDRs), while keeping most of the naturally occurring diversity (Figure 2). As described in the method, sequences containing sequence liabilities and sequences encompassing immunogenic peptides were discarded to optimize antibody developability. These efforts resulted in an enriched number of highly functional and developable sequences that were synthetized using array-based synthesis (Table 9).
Figure imgf000037_0002
Figure imgf000038_0001
Table 9: HCDR1 and HCDR2 sequences number evolution throughout the filtering process performed prior to jet-array synthesis.
Example 2: Synthetic and semi-synthetic phage display common Light Chain libraries generation and quality control
Materials and Methods
Synthetic and semi-synthetic phage display common Light Chain libraries generation
Library generation was performed by PCR assembly of the different fragments encompassing the HCDRs diversity (Figure 4). An initial step consisted of the generation of HCDR1/2 cassette either by PCR assembly using E. coli codon optimized germline FR1 or FR3 as DNA template and trimer oligonucleotides (library A), or amplification of the array-based synthetized FR1-FR3 fragments (library B) or amplification of the germline FR1-FR3 sequences (library C).
HCDR3 synthetic diversity was introduced using a pool of trimer oligonucleotides encoding 15 HCDR3 lengths (6-20) and mimicking length-specific naturally occurring diversity at Kabat residues 95-102. DNA fragments encompassing E. coli codon optimized HCDR3, IGHJ1, linker and germline VK3-15/JK1 variable light chain (VL) (SEQ ID NO: 4) were generated by PCR and pooled to mimic the natural HCDR3 length distribution (Library 1). Finally, the HCDR1/2 cassette and synthetic HCDR3-containing fragments were assembled by PCR, scFv were cloned into pNGLEN (in-house modified pUC119 phagemid vector) using NcoI/NotI restriction sites and the resulting ligation product was electroporated into E. coli TG1 cells.
Natural HCDR3 were amplified from a human IgM repertoire. Briefly, mRNA was purified from total PBMCs extracted from Buffy Coats of 10 human donors (5 males, 5 females) of Caucasian origin and aged between 19 and 69. mRNA was converted to cDNA by reverse transcription using oligo-dT. All VH and HCDR3 amplifications were performed individually for each donor and were then pooled according to their VH-families. First, PCR was performed using a set of VH-specific forward primers (ref: https://doi.org/10.1038/s42003-021-01881-0, Communications Biology) and a reverse IgM-specific primer (TGGAAGAGGCACGTTCTTTTCTTT). Nested PCR using a forward primer encoding a consensus FR3 sequence (ref: M. Erasmus et al. Comm. Biology 2021 // P. Valadon mAbs 2013) and biotinylated multiplexed FR4 primers (ACRGTGACCAGGGTGCC, ACGGTGACCATTGTCCC, ACGGTGACCAGGGTTCC and ACGGTGACCGTGGTCCC) followed to amplify HCDR3 sequences. These HCDR3 cassettes were either kept separated based on their individual VH-families (VH1 sequences or VH3 sequences - library 2) or were pooled across all VH amplified (library 3). Upon purification using streptavidin beads, the HCDR3 cassettes were associated with the HCDR1/2 cassettes, whose FR3 end was mutated to be compatible with human sequences. Semi-synthetic VHs were cloned into the germline VK3-15/JK1 VL-containing pNGLEN using NcoI/XhoI restriction sites and the resulting ligation product was electroporated into E. coli TG1 cells. The library cloning process is depicted in Figure 3.
E. coli TG1 cells harboring the phagemid libraries were superinfected with M13KO7 helper phage for assembly and production of recombinant phages. Phages were purified by two precipitation steps with 1/3 v/v of 20% PEG-6000, 2.5 M NaCl and resuspended in PBS.
Synthetic and semi-synthetic phage display common Light Chain libraries quality control by Next-Generation-Sequencing
Purified library DNA was used as template for PCR amplification of the VH using a 5' primer binding upstream of the VH and a 3' reverse primer binding to the linker region and the beginning of the VL. PCR amplification resulted in an amplicon of approximately 400 bp that was further purified by gel extraction (Qiagen). The resulting DNA sample was subsequently processed using magnetic SPRIselect beads (Beckman Coulter) according to the manufacturer's instructions, leading to NGS-grade DNA quality. Samples were submitted to Illumina MiSeq 2x200 bp analysis at Genewiz (Amplicon-EZ services). Raw FASTQ paired-end files were processed using a custom NGS analysis pipeline. Bioinformatic analysis included calculation of the percentage of functional sequences (sequences in frame and without stop codon), CDR length distributions, CDR amino-acid frequencies and identification of unique CDR sequences.
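The functionality criterion used in this quality control (in frame and without stop codon) can be expressed compactly. The sketch below is an assumption-laden illustration rather than the custom pipeline itself: it supposes that each input is a VH coding sequence already trimmed so that its first nucleotide is in reading frame, and the function names are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def is_functional(nt_seq):
    """True if the sequence is in frame and contains no in-frame stop codon.

    Assumes the first nucleotide starts a codon (hypothetical pre-trimming).
    """
    seq = nt_seq.upper()
    if len(seq) == 0 or len(seq) % 3 != 0:
        return False  # empty or out of frame
    codons = (seq[i:i + 3] for i in range(0, len(seq), 3))
    return not any(codon in STOP_CODONS for codon in codons)


def percent_functional(nt_seqs):
    """Percentage of sequences that pass `is_functional`."""
    if not nt_seqs:
        return 0.0
    return 100.0 * sum(is_functional(s) for s in nt_seqs) / len(nt_seqs)
```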
Results and conclusions
Synthetic and semi-synthetic phage display common Light Chain libraries generation and quality control by Next-Generation-Sequencing
A set of 13 phage display libraries were generated to compare variable heavy chain (VH) germlines, and HCDR1/2 (diversification based on trimer oligonucleotides (library A) or array-based synthesis (library B) or germline sequence (library C)) and HCDR3 (synthetic (library 1), natural VH-family specific (library 2) or pooled HCDR3 (library 3)) diversification strategies for the selection of diverse and highly developable common Light Chain binders. As described in the methods and Figure 3, the libraries were assembled by PCR and cloned into the pNGLEN phagemid vector using restriction enzymes. As shown in Table 10, the individual library sizes are comparable and range from 9.32E+09 to 3.57E+10. The quality of these libraries was assessed by Next-Generation-Sequencing and this experiment revealed that more than 85% of overall sequences were functional (in frame, no STOP codon). Individual library qualities are comparable and range from 79% to 92.8% of functional sequences (Table 10).
Figure imgf000040_0001
percentage of functional sequences.
To evaluate the accuracy of the technologies used for HCDR1 and HCDR2 diversification and to enable a fair comparison between HCDR1/2 diversification strategies, theoretical and experimental diversities were compared. First, VH1-69_A2, VH1-46_A2 and VH3-23_A2 were picked as representative libraries to assess the quality of the trimer oligonucleotides encoding germline-specific HCDR1 and HCDR2 diversity. As depicted in Table 3 to Table 8 and Table 11 to Table 16, the theoretical and experimental diversities are comparable, confirming the high quality of the trimer oligonucleotides. Second, we aimed at comparing designed and experimental diversities of the libraries generated using array-based synthesis. To exclude some of the NGS errors while keeping an NGS coverage >10 (NGS reads/theoretical diversity), sequences appearing only once were removed. This analysis revealed that between 98.7% and 99.9% of the designed diversity is found in the experimental library (Table 17). In addition, the experimental libraries contained between 2.5% and 6.8% of sequences that were not designed, but these values are likely overestimated due to sequencing errors. These results confirmed the high quality of the libraries generated by array-based synthesis.
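The comparison between designed and experimental diversity for the array-based libraries can be illustrated as follows. This is a sketch under stated assumptions: the function name and return keys are hypothetical, singletons are dropped as described above, and the percentage of non-designed sequences is computed over unique retained sequences (the text does not state whether unique sequences or reads were used).

```python
from collections import Counter


def compare_to_design(observed_reads, designed, min_count=2):
    """Compare observed CDR sequences with the designed set.

    observed_reads: one CDR amino-acid string per NGS read.
    designed: iterable of designed CDR strings (theoretical diversity).
    min_count=2 removes sequences seen only once.
    """
    designed = set(designed)
    counts = Counter(observed_reads)
    kept = {seq for seq, n in counts.items() if n >= min_count}
    found = kept & designed
    return {
        # NGS reads per designed sequence, as defined in the text.
        "coverage": len(observed_reads) / len(designed) if designed else 0.0,
        "pct_design_found": 100.0 * len(found) / len(designed) if designed else 0.0,
        # Assumption: computed on unique sequences, not on reads.
        "pct_not_designed": 100.0 * len(kept - designed) / len(kept) if kept else 0.0,
    }
```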
Figure imgf000041_0001
Table 11: Amino-acid frequencies in HCDR1 of VH1-69_A2 library.
Figure imgf000041_0002
Figure imgf000042_0001
Table 12: Amino-acid frequencies in HCDR1 of VH1-46_A2 library
Figure imgf000042_0002
Table 13: Amino-acid frequencies in HCDR1 of VH3-23_A2 library
Figure imgf000043_0001
Table 14: Amino-acid frequencies in HCDR2 of VH1-69_A2 library
Figure imgf000043_0002
Figure imgf000044_0001
Table 15: Amino-acid frequencies in HCDR2 of VH1-46_A2 library
Figure imgf000044_0002
Table 16: Amino-acid frequencies in HCDR2 of VH3-23_A2 library
Figure imgf000045_0001
Finally, VH1-69_A1, VH1-69_A2 and VH1-69_A3 were picked as representative libraries to assess the quality of the HCDR3. First, HCDR3 amino-acid length distribution analysis confirmed the expected diversity of the synthetic design (VH1-69_A1), encompassing lengths 6 to 20 (Figure 4). In addition, both the natural VH1-family specific (VH1-69_A2) and pooled HCDR3 (VH1-69_A3) length distributions were comparable and in accordance with the natural repertoire. The amino-acid content of HCDR3 was also checked, taking the most represented length (HCDR3 length 12) as an example. As illustrated in Table 18 to Table 20, the amino acid frequencies at each position were comparable.
Figure imgf000046_0001
Table 18: Amino-acid frequencies in HCDR3 of VH1-69_A1 library
Figure imgf000046_0002
Figure imgf000047_0001
Table 19: Amino-acid frequencies in HCDR3 of VH1-69_A2 library
Figure imgf000047_0002
Table 20: Amino-acid frequencies in HCDR3 of VH1-69_A3 library
Example 3: Selection of synthetic and semi-synthetic phage display common Light Chain libraries against mmCD47, hsCD28 and hsNKp46
Materials and Methods
Phage-display panning
For panning on CD47 and CD28, purified phage particles from individual scFv libraries (10^12 plaque-forming units) were blocked in PBS/BSA 3% (w/v) for 1 h at room temperature (RT). Magnetic Dynabeads™ MyOne™ Streptavidin C1 beads (Invitrogen, catalog NO: 65002) were blocked in the same conditions. Phages were depleted against pre-blocked beads for 1 h at RT. Phages were then incubated with 100 nM (rounds 1 and 2) or 25 nM (round 3) of biotinylated recombinant his-tagged mouse CD47 extracellular domain (ECD) (produced in-house) or biotinylated recombinant human CD28 protein (Acrobiosystems, catalog NO: H82E5) for 1 h at RT. Antigen-bound phages were captured on streptavidin beads for 15 min at RT and beads were washed three times with PBS-Tween 0.1% and four times with PBS. Phages were eluted with citric acid 76 mM pH 2.0 for 15 min at RT and neutralized using Tris-HCl 1 M pH 8. Eluted phages were used to infect 5 ml of exponentially growing E. coli TG1 cells. Infected cells were grown in 2YT medium for 1 h at 37 °C and 100 RPM, then grown in 2YT medium supplemented with 2% (w/v) glucose for 1 h at 37 °C and 240 RPM. Cells were then superinfected with the M13KO7 helper phage using a multiplicity of infection (MOI) of 10 for 1 h at 37 °C and 100 RPM. Culture medium was then changed for 2YTAK (2YT medium supplemented with 100 µg/ml ampicillin and 50 µg/ml kanamycin) and cells were further cultured overnight (ON) at 30 °C and 240 RPM. The next day, 10 µl of phage-containing cell-free supernatant was used for the subsequent round of selection.
The panning on NKp46 followed the same procedure as above but included an additional depletion step. Magnetic Protein G Dynabeads® (Invitrogen, catalog no. 10003D) were blocked in PBS/BSA 3% (w/v) and coated with human IgG1, and phages were incubated with the pre-coated beads for 1 h at RT prior to depletion on streptavidin beads. Phages were then incubated with 100 nM (rounds 1 and 2) or 25 nM (round 3) of biotinylated recombinant Fc-tagged human NKp46 protein (NC1-H82F9, Acrobiosystems).
scFv screening on recombinant protein by flow cytometry
The binding of scFv clones to biotinylated recombinant human CD28 protein (Acrobiosystems, catalog NO: H82E5), biotinylated recombinant his-tagged mouse CD47 extracellular domain (ECD) produced in house and biotinylated recombinant Fc-tagged human NKp46 protein (Acrobiosystems, catalog NO: NC1-H82F9) coated on beads was assessed by flow cytometry.
Individual E. coli colonies from the third round of selection were picked and grown in autoinduction medium (Formedium, catalog NO: AIM2YT0210) supplemented with 100 µg/ml ampicillin and 0.1% (w/v) glucose in 96-well deepwell plates, overnight at 30 °C and 260 RPM. Cells were centrifuged and periplasmic extracts were obtained by resuspending the bacterial pellets in TES buffer (50 mM Tris-HCl pH 8; 1 mM EDTA pH 8; 20% sucrose) followed by incubation on ice for 30 min. Cellular debris was removed by centrifugation, and the scFv-containing supernatants were stored at 4 °C.
PolyAn Red4 deca-plex beads were procured (PolyAn, catalog NO: 106 52 005). The beads had functionalized streptavidin on their surface and caged 10 discrete amounts of Allophycocyanin (APC) in their polymeric Polymethylmethacrylate (PMMA) matrix. Biotinylated antigens were individually coated on beads at a 20 µg/ml coating concentration in PBS-BSA 3% (w/v) and beads were then washed to remove unbound antigen. Control beads (non-antigen coated streptavidin beads) were simultaneously processed in the same way. Equal quantities of control and antigen-coated beads were mixed in PBS-BSA 3% (w/v) and 10 µl was pipetted into each well of a 96-well plate (approx. 3000 beads/plex). Periplasmic extracts were diluted 1:8 in PBS-BSA 3% (w/v) and 10 µl was added to each well and incubated for 60 min at 4 °C. Next, 10 µl of FITC-conjugated anti-myc tag antibody 9E10 (Abcam, catalog NO: ab202008), previously diluted in PBS-BSA 3% (w/v) at 1 µg/ml, was added to each well and incubated at 4 °C for 30 min, to detect myc-tagged scFvs. At least 2000 events were recorded on an IntelliCyt® iQue Screener PLUS (Sartorius) with 4 s of sip time and 0.5 s of additional up time.
Geometric Mean (GM) was determined for each bead plex in each well in the BL1-A channel and a ratio of GMantigen to GMstreptavidin was calculated. Clones having a ratio (GMantigen/GMstreptavidin) >30 were considered specific antigen binders.
scFv screening on recombinant protein by Surface Plasmon Resonance
Surface Plasmon Resonance (SPR) analysis was used to confirm specific binding activity of the scFv clones. Measurements were performed on a Biacore 8K+ instrument (Cytiva Life Sciences) using the Biacore 8K+ Control Software at 25 °C and analyzed with the Biacore Insight Evaluation Software (v3.0). Biotinylated recombinant human CD28 protein (Acrobiosystems, catalog NO: CD8-H82E5), recombinant his-tagged mouse CD47 extracellular domain (ECD) produced in house or recombinant his-tagged human NKp46 protein (Acrobiosystems, catalog NO: NC1-H52H4) were diluted to a final concentration of 10 µg/ml in acetate buffer pH 4.5 (Cytiva Life Sciences, catalog NO: BR100530) and subsequently immobilized on flow-path 2 of the eight channels, to around 1600 resonance units (abbreviated RU), 500 RU and 1500 RU, respectively, on Series S CM5 Sensor Chips (Cytiva Life Sciences, catalog NO: BR100012) using an amine coupling kit (Cytiva Life Sciences, catalog NO: BR100050). HBS-EP+ (Cytiva Life Sciences, catalog NO: BR100669) was used as running buffer. Filtered periplasmic extracts were injected directly on the covalently coupled human CD28, mouse CD47 or human NKp46 Series S CM5 Sensor Chip. Samples were injected on flow-paths 1 and 2 (flow-path 1 being used as reference) at a 30 µl/min flow rate for 3 min, followed by a dissociation time of 3 min in running buffer. After each binding event, the surface was regenerated with 10 mM Glycine pH 1.5 solution (Cytiva Life Sciences, catalog NO: BR100354) injected for 60 s at 30 µl/min on both flow-paths. Each measurement included zero-concentration samples as well as irrelevant scFv periplasmic extracts for referencing and specificity, respectively.
Sequence analysis
Levenshtein distances of unique HCDR1, HCDR2 and HCDR3 were calculated using the R package stringdist v.0.9.6.3 (ref: M. van der Loo, J. van der Laan, R Core Team, N. Logan, C. Muir, J. Gruber, stringdist: Approximate String Matching, Fuzzy Text Search, and String Distance Functions (2021; https://CRAN.R-project.org/package=stringdist)).
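For readers who wish to reproduce this kind of analysis without R, the Levenshtein distance between CDR sequences can be computed with a short dynamic-programming routine. The sketch below is written in Python for illustration and is not the stringdist implementation used in this study; the function names are hypothetical.

```python
from itertools import combinations


def levenshtein(a, b):
    """Minimum number of substitutions, insertions or deletions turning a into b."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(
                prev[j] + 1,               # deletion from a
                curr[j - 1] + 1,           # insertion into a
                prev[j - 1] + (ca != cb),  # substitution or match
            ))
        prev = curr
    return prev[-1]


def pairwise_distances(cdrs):
    """Levenshtein distances between all pairs of unique CDR sequences."""
    unique = sorted(set(cdrs))
    return [levenshtein(a, b) for a, b in combinations(unique, 2)]
```

The list returned by pairwise_distances can be tabulated into a histogram to obtain a distance distribution for each CDR.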
Results and conclusions
In order to validate and compare the different library designs for the selection of diverse and highly developable common Light Chain binders, the 13 phage display libraries were panned against 3 targets: mouse CD47, human CD28 and human NKp46. Then, 288 scFv clones from each round of the 3 panning arms were screened for binding to the target recombinant protein by flow cytometry and Surface Plasmon Resonance. This panning campaign led to the selection of 138, 383 and 438 hits against CD47, CD28 and NKp46, respectively. To assess the overall diversity of the selected hits, unique HCDR1, HCDR2 and HCDR3 sequences selected against each target were analyzed by calculating the Levenshtein distance for each CDR and the HCDR3 length distribution. The Levenshtein distance is defined as the minimum number of amino acid substitutions, insertions or deletions required to change one sequence into the other (e.g. a Levenshtein distance of 1 corresponds to one substitution, insertion or deletion between two sequences). HCDR1 and HCDR2 Levenshtein distance distributions range from 1 to 9 and from 1 to 10 and reach a maximum at 5 and 6, respectively, indicating very high CDR diversity (Figure 5). HCDR3 is the most diverse CDR in nature and is known to play a critical role in driving binding to the antigen. In addition, antibodies harboring distant HCDR3 sequences are expected to recognize different epitopes. Both the wide range of HCDR3 Levenshtein distances and the amino acid length distribution of the unique HCDR3 selected against CD47, CD28 and NKp46 suggest a broad epitope coverage (Figure 5 and Figure 6).
The numbers of unique hits generated from the different libraries and targets were first arranged to compare HCDR1/2 diversification strategies as a function of VH germline and HCDR3 randomization strategies (Table 21, Figure 7). Libraries A, generated using trimer oligonucleotides, delivered more hits than libraries B in 6 out of 15 screening experiments, while libraries B, generated by array-based synthesis, delivered more hits than libraries A in 8 out of 15 experiments. In addition, there is no clear correlation between the best HCDR1/2 randomization strategy and VH germline or HCDR3 randomization strategies. However, we observed that libraries A were superior to libraries B for delivering binders against CD47, while libraries B were superior to libraries A for CD28 and NKp46. To conclude, both HCDR1/2 randomization strategies are comparable for selecting common light chain binders. Interestingly, libraries C harboring the germline HCDR1 and HCDR2 sequences delivered substantially fewer binders than either libraries A or libraries B, indicating that introducing diversity in HCDR1 and HCDR2 is a prerequisite to generate diverse common Light Chain binders. HCDR1/2 diversification strategies were also compared for delivering unique clonotypes. Here, we define a clonotype as a group of sequences having HCDR3 with >80% sequence identity and that could target close epitopes. The same analysis as the one described above, considering clonotypes as the diversity metric instead of unique sequences, led to the exact same conclusions, confirming comparable performance of both HCDR1/2 randomization strategies as far as diversity is concerned (Table 22, Figure 8). By design, sequence liabilities were removed from libraries B generated using array-based synthesis. To check the absence of such liabilities in the hits selected from libraries B and compare with libraries A, the percentages of unique hits without sequence liabilities in HCDR1 and HCDR2 were compared (Table 23, Figure 9).
Most of the sequences originating from libraries B are free of sequence liabilities, while most of the sequences originating from VH3-23 and VH1-46 based libraries A contain sequence liabilities. Interestingly, 64% of the sequences from the VH1-69 based library A are free of sequence liabilities. The percentage of liability-free sequences originating from libraries A was expected to depend on the VH germline, since the germlines harbor different germline-specific natural diversity.
The numbers of unique hits and clonotypes and the percentages of liability-free sequences generated from the different libraries were then arranged to compare HCDR3 diversification strategies as a function of HCDR1/2 randomization strategies (Table 21, Table 22 and Table 23, Figure 10). The VH1-69 library 3 based on natural pooled HCDR3 delivered slightly more unique hits than VH1-69 library 1 based on synthetic HCDR3 when built with trimer-based HCDR1/2 (i.e. VH1-69-A3 > VH1-69-A1). The inverse was observed when the libraries were built using array-based synthesis (i.e. VH1-69-B1 > VH1-69-B3). Interestingly, libraries 2 based on VH-family specific HCDR3 tended to deliver fewer unique hits than the two other HCDR3 designs. To conclude, we have demonstrated that semi-synthetic common Light Chain libraries can deliver as many unique hits as synthetic libraries. In addition, restricting natural diversity to VH-family specific HCDR3 may not be beneficial for generating common Light Chain binders as far as the number of unique hits is concerned. HCDR3 diversification strategies were also compared for delivering unique clonotypes. This analysis revealed that libraries 1 based on synthetic HCDR3 systematically deliver more clonotypes than libraries based on natural HCDR3, which suggests a superiority of libraries 1 regarding epitope coverage. Indeed, hit rates dropped by around 40% and 50% for libraries 2 and 3, respectively, when clonotypes were considered instead of unique sequences. This was expected because the theoretical diversity of the synthetic repertoire is much higher than the experimental diversity, which makes the generation of related variants unlikely compared to the natural repertoire. Finally, the percentages of unique hits without sequence liabilities in HCDR3 selected from the different HCDR3 diversification strategies were compared (Table 25).
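The clonotype definition given above (HCDR3 with >80% sequence identity) does not specify how identity was computed or how groups were formed. The following sketch shows one possible reading, under explicit assumptions: identity is the fraction of matching positions between HCDR3 of equal length, sequences of different lengths are never grouped, and groups are built by single linkage. All function names are hypothetical.

```python
def identity(a, b):
    """Fraction of identical positions; 0.0 if lengths differ (assumption)."""
    if len(a) != len(b) or not a:
        return 0.0
    return sum(x == y for x, y in zip(a, b)) / len(a)


def group_clonotypes(hcdr3s, threshold=0.80):
    """Single-linkage grouping of unique HCDR3 with identity > threshold."""
    seqs = sorted(set(hcdr3s))
    parent = list(range(len(seqs)))  # union-find forest over sequences

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i in range(len(seqs)):
        for j in range(i + 1, len(seqs)):
            if identity(seqs[i], seqs[j]) > threshold:
                parent[find(i)] = find(j)  # merge the two groups

    groups = {}
    for i, seq in enumerate(seqs):
        groups.setdefault(find(i), []).append(seq)
    return sorted(groups.values())
```

The number of clonotypes is then len(group_clonotypes(hcdr3s)). A different identity measure or linkage rule would change the counts, so this should be read as one plausible implementation only.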
Here, all three HCDR3 designs are comparable with 36%, 34% and 39% of liability-free unique sequences selected from synthetic, natural VH-family specific and pooled HCDR3 libraries, respectively.
Figure imgf000052_0001
Figure imgf000053_0001
Table 21: Number of unique hits selected against CD47, CD28 and NKp46 from the 13 synthetic and semisynthetic common Light Chain libraries.
Figure imgf000053_0002
Table 22: Number of clonotypes (>80% HCDR3 sequence identity) selected against CD47, CD28 and NKp46 from the 13 synthetic and semi-synthetic common Light Chain libraries.
Figure imgf000054_0001
Table 23: Percentage of hits without full length (FL) sequence liabilities selected against CD47, CD28 and NKp46 from the 13 synthetic and semi-synthetic common Light Chain libraries.
Figure imgf000055_0001
from the 13 synthetic and semi-synthetic common Light Chain libraries.
Figure imgf000056_0001
Table 25: Percentage of hits without HCDR3 sequence liabilities selected against CD47, CD28 and NKp46 from the 13 synthetic and semi-synthetic common Light Chain libraries.
Example 4: Characterization of anti-CD47 and anti-CD28 antibodies
Materials and Methods
Fab expression
cDNAs encoding the different antibody constant regions were gene synthetized by GENEART (Regensburg, Germany) or Twist Biosciences (San Francisco, USA) and modified using standard molecular biology techniques. PCR products were digested with appropriate DNA restriction enzymes, purified and ligated in modified pcDNA3.1 plasmids (Invitrogen), which carried a CMV promoter and a bovine growth hormone polyadenylation (poly(A)) signal. The expression vectors also carried an oriP, which is the origin of plasmid replication of Epstein-Barr virus, and the murine IgK light chain leader peptide for secretion of the encoded polypeptide chain. For reformatting scFv library clones into human IgG1 Fab fragments, each scFv clone in its phage library vector was used to amplify its individual VH cDNA by PCR. Next, the VH PCR product was cloned in the modified pcDNA3.1 vector described above, upstream of a cDNA encoding a human IgG1 heavy chain CH1 domain, whereas the fixed VK3-15/JK1 light chain was cloned in the modified pcDNA3.1 vector described above, upstream of a cDNA encoding a human kappa constant light chain domain. Fab expression was performed in 24-well plates using the Expi293™ Expression System (Thermo Fisher, catalog NO: A14527), according to the manufacturer's protocol. Typically, Expi293F cells were seeded at 2x10^6 viable cells in Expi293™ Expression Medium (Thermo Fisher, catalog NO: A1435101) one day prior to transfection, and incubated with orbital shaking at 37 °C, 8% CO2 and 80% humidity. ExpiFectamine™ 293 transfection reagent from the ExpiFectamine™ 293 Transfection Kit (Thermo Fisher, catalog NO: A14526) and equal quantities of light chain and heavy chain vectors (1 µg plasmid DNA/mL of transfection culture volume each) were separately diluted in OptiMEM I Reduced Serum Medium (Thermo Fisher, catalog NO: 31985062).
Following a 10 min incubation at room temperature, the ExpiFectamine™ 293 in OptiMEM solution was combined with the DNA dilution, and the final solution mix was added to the cells. A 1:10 (V:V) premix of ExpiFectamine™ 293 Transfection Enhancer 1 and 2 was added to the cells approximately 18 h post transfection and cells were further incubated for 5 days with orbital shaking at 37 °C, 8% CO2 and 80% humidity. Purification was similarly performed in 24-well plates. Cell-free culture supernatants containing the recombinant proteins were prepared by centrifugation and used for further purification. CaptureSelect™ CH1-XL Affinity Matrix (ThermoScientific, catalog NO: 1943462005) was added to the culture supernatants and incubated for 1 h at room temperature with gentle mixing. Upon centrifugation, resin-bound Fabs were harvested and washed in PBS, and the recombinant proteins were eluted with an acidic buffer (typically glycine 0.1 M pH 3.0). After neutralization with 1/10 volume of Tris-HCl pH 8.0, preparations were buffer-exchanged into PBS using Zeba™ Spin Desalting Plates (ThermoFisher, catalog NO: 89807). Absorbance at 280 nm was measured on a Synergy Neo plate reader (Synergy Neo HTS Multi Mode Reader, BioTek Instruments) and protein concentration was determined using the following formula, where ε is the molar extinction coefficient: Concentration (mg/ml) = [OD280 nm / (ε × path length)] × MW (Da).
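The concentration formula above is an application of the Beer-Lambert law: dividing the absorbance by the molar extinction coefficient and the path length gives a molar concentration, and multiplying by the molecular weight converts mol/l to g/l, which is numerically equal to mg/ml. A minimal sketch follows; the function name and the numerical values in the usage note are hypothetical and are not taken from this study.

```python
def fab_concentration_mg_per_ml(od280, ext_coeff, mw_da, path_cm=1.0):
    """Protein concentration in mg/ml from absorbance at 280 nm.

    od280: absorbance at 280 nm (blank-corrected).
    ext_coeff: molar extinction coefficient in M^-1 cm^-1.
    mw_da: molecular weight in Da (g/mol).
    path_cm: optical path length in cm.
    """
    if ext_coeff <= 0 or path_cm <= 0:
        raise ValueError("extinction coefficient and path length must be positive")
    molar = od280 / (ext_coeff * path_cm)  # mol/l
    return molar * mw_da                   # g/l, numerically equal to mg/ml
```

For instance, an absorbance of 0.7 with an assumed extinction coefficient of 70000 M^-1 cm^-1, a molecular weight of 48000 Da and a 1 cm path gives about 0.48 mg/ml. In a plate reader the effective path length depends on the well volume, so it should be measured or corrected rather than assumed.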
Size-exclusion analysis
To assess the propensity of the molecules to aggregate, Fabs were analyzed by size exclusion chromatography (SEC). Briefly, SEC was performed by Ultra High-Performance Liquid Chromatography (UHPLC) separation using a Tosoh Bioscience TSKgel UP-SW3000 column (Tosoh Bioscience, catalog NO: 0023449) at room temperature with 0.1 M sodium phosphate buffer, 0.15 M sodium chloride, pH 6.8 as eluent at 0.5 ml/min flow rate, on an ACQUITY Arc UHPLC System (Waters), monitoring at 214 nm. Chromatograms were integrated using Empower Software (Waters).
Fab binding affinities for CD47
Surface plasmon resonance (SPR) was used to measure the binding affinities of the Fab fragments for mouse CD47. Measurements were performed on a Biacore 8K+ instrument (Cytiva Life Sciences) using the Biacore 8K+ Control Software at 25 °C and analyzed with the Biacore Insight Evaluation Software (v3.0). Around 60 RU of biotinylated recombinant mouse CD47 protein (in-house produced protein, protein ID P1889) was captured on flow-path 2 of a Series S Biotin CAPture Chip (Cytiva Life Sciences, catalog NO: 28920234). Fabs were injected in single-cycle kinetic analysis mode at different concentrations ranging from 6.4 to 4000 nM, in HBS-EP+ buffer (Cytiva Life Sciences, catalog NO: BR100669) at a flow rate of 30 µl/min for 3 min on flow-paths 1 and 2 (flow-path 1 being used as reference). Dissociation was monitored for 10 min. After each cycle, the surface was regenerated with 60 µl of the regeneration solution provided with the Series S Biotin CAPture Kit (Cytiva Life Sciences, catalog NO: 28920234). Experimental data were processed using the 1:1 Langmuir kinetic fitting model. Measurements included zero-concentration samples for referencing. Chi2 and residual values were used to evaluate the quality of the fit between the experimental data and individual binding models.
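For reference, the 1:1 Langmuir kinetic model named above is commonly written as follows. The symbols are the standard textbook ones and are introduced here for illustration: R is the response, R_max the maximal binding response, C the injected Fab concentration, k_a and k_d the association and dissociation rate constants, and t_0 the start of dissociation.

```latex
\begin{align}
\frac{dR}{dt} &= k_a \, C \, (R_{\max} - R) - k_d \, R
  && \text{(association, analyte concentration } C\text{)} \\
R(t) &= R_0 \, e^{-k_d (t - t_0)}
  && \text{(dissociation, } C = 0\text{)} \\
K_D &= \frac{k_d}{k_a}
\end{align}
```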
Fab binding affinities for CD28
Surface plasmon resonance (SPR) was used to measure the binding affinities of the Fab fragments for human CD28. Measurements were performed on a Biacore 8K+ instrument (Cytiva Life Sciences) using the Biacore 8K+ Control Software at 25 °C and analyzed with the Biacore Insight Evaluation Software (v3.0). Around 20 RU of biotinylated recombinant human CD28 protein (Acro Biosystems, catalog NO: CD8-H82E5) was captured on flow-path 2 of a Series S Biotin CAPture Chip (Cytiva Life Sciences, catalog NO: 28920234). Fabs were injected in single-cycle kinetic analysis mode at different concentrations ranging from 1.6 to 1000 nM, in HBS-EP+ buffer (Cytiva Life Sciences, catalog NO: BR100669) at a flow rate of 30 µl/min for 3 min on flow-paths 1 and 2 (flow-path 1 being used as reference). Dissociation was monitored for 10 min. After each cycle, the surface was regenerated with 60 µl of the regeneration solution provided with the Series S Biotin CAPture Kit (Cytiva Life Sciences, catalog NO: 28920234). Experimental data were processed using the 1:1 Langmuir kinetic fitting model. Measurements included zero-concentration samples for referencing. Chi2 and residual values were used to evaluate the quality of the fit between the experimental data and individual binding models.
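The text gives only the end points of each concentration range. Both ranges (1.6 to 1000 nM and 6.4 to 4000 nM) are consistent with a five-point, 5-fold dilution series, which is an assumption made for the sketch below and is not stated in the document. The rate constants in the example are invented to illustrate how KD follows from a 1:1 fit.

```python
def dilution_series(top_nM, fold=5.0, n_points=5):
    """Return concentrations in ascending order, ending at top_nM."""
    return [top_nM / fold ** i for i in reversed(range(n_points))]


def kd_from_rates(ka_per_M_per_s, kd_per_s):
    """Equilibrium dissociation constant (M) for a 1:1 interaction: KD = kd / ka."""
    return kd_per_s / ka_per_M_per_s


cd28_series = dilution_series(1000.0)  # assumed series for the CD28 runs
cd47_series = dilution_series(4000.0)  # assumed series for the CD47 runs

# Hypothetical fitted rate constants: ka = 1e5 M^-1 s^-1, kd = 1e-3 s^-1.
kd_molar = kd_from_rates(1e5, 1e-3)
print([round(c, 1) for c in cd28_series], f"KD = {kd_molar * 1e9:.1f} nM")
```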
Differential Scanning Fluorimetry (DSF)
The thermal stability of the antibodies was investigated by Differential Scanning Fluorimetry (DSF). The analysis was performed using the Rotor Gene Q instrument (Qiagen). Briefly, 5 µg of protein were mixed with SYPRO® Orange 10X (Life Technologies) in a final volume of 20 µl adjusted with H2O. The temperature was increased by 1 °C per second from 25 °C to 99 °C. The excitation and detection wavelengths were set at 460 nm and 510 nm, respectively, and the gain was set at 3.
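The document does not say how the melting temperature (Tm) was extracted from the fluorescence trace. One common approach, assumed here purely for illustration, takes Tm as the temperature of the steepest fluorescence increase (the maximum of the first derivative). The melt curve below is synthetic.

```python
import math


def melting_temperature(temps, fluorescence):
    """Return the midpoint of the temperature interval with the steepest signal rise."""
    best_slope, best_tm = None, None
    for i in range(len(temps) - 1):
        slope = (fluorescence[i + 1] - fluorescence[i]) / (temps[i + 1] - temps[i])
        if best_slope is None or slope > best_slope:
            best_slope = slope
            best_tm = (temps[i] + temps[i + 1]) / 2.0
    return best_tm


# Synthetic sigmoidal melt curve from 25 to 99 degrees C with its inflection at 84.5.
temps = [float(t) for t in range(25, 100)]
signal = [1.0 / (1.0 + math.exp(-(t - 84.5) / 2.0)) for t in temps]
tm = melting_temperature(temps, signal)
print(f"Tm = {tm:.1f} C")
```

Real traces are noisy and usually need smoothing before differentiation, which this sketch omits.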
Results
To assess the developability properties and the affinity of the binders selected from the different common Light Chain libraries, a panel of unique anti-CD47 and anti-CD28 hits was reformatted to Fab format. Around 10 binders from each of the 26 panning outputs (i.e., 13 libraries panned against the two targets) were expressed. For some panning arms, only a few hits were discovered and therefore fewer than 10 Fabs were expressed.
Our developability assessment considered the level of expression in mammalian cells (yield), the propensity to aggregate and polyreactivity, as measured by the % of monomer and the retention time by SEC, respectively, and thermostability as measured by DSF. For each library, the developability data of anti-CD47 and anti-CD28 binders were compiled and are reported in Table 26. All individual libraries delivered Fabs with good transient expression yield in HEK cells, with medians ranging from 31 mg/mL to 59 mg/mL, which is comparable to a well-behaved comparator control, trastuzumab Fab (72 mg/mL, average of duplicate). All libraries also yielded molecules with high solubility, with medians >98% monomer by SEC (excluding VH1-69_C1 data because only 2 Fabs were expressed). The SEC retention times of these Fabs, with medians ranging from 12.63 min to 12.74 min (excluding VH1-69_C1 data), are comparable to that of the well-behaved comparator control trastuzumab Fab (12.79 min, average of duplicate), suggesting that the molecules have a low propensity for polyreactivity. Finally, the thermostability of the Fabs isolated from each library is very high, with medians ranging from 82 °C to 87 °C. To facilitate the comparison of the different library designs, the data were combined according to the VH germlines, the HCDR1/2 designs or the HCDR3 diversification strategies. First, binders discovered from VH1-46, VH1-69 and VH3-23-based libraries show comparable yield (medians ranging from 48 mg/mL to 54 mg/mL), % of monomer (medians ranging from 99.04 % to 99.76 %), SEC retention time (medians around 12.7 min) and thermostability (medians ranging from 84 °C to 86 °C), confirming that all 3 VH germlines are suitable for generating VK3-15-based cLC binders (Table 26, Figure 11-Figure 14).
Interestingly, binders isolated from libraries generated using HCDR1/2 trimer oligonucleotides or HCDR1/2 array-based synthesized cassettes have comparable developability properties (Table 26, Figure 11-Figure 14), although the libraries generated using array-based synthesis might have been expected to present better developability profiles, because this approach introduces naturally existing sequence motif diversity, whereas the trimer oligonucleotide approach only mimics the natural diversity at each individual residue. Similarly, the binders discovered from synthetic or semi-synthetic libraries (natural or VH-family specific HCDR3 diversity) have comparable developability characteristics. Here, semi-synthetic libraries might have been expected to be superior to synthetic libraries for the same reason as the one mentioned above regarding HCDR1 and 2 diversity. Taken together, these results demonstrate that all HCDR randomization strategies described above enable the discovery of developable binders with regard to yield, solubility, polyreactivity and thermostability. It is worth mentioning that neither forced degradation studies nor stability studies were performed on the Fabs, and that the molecules originating from libraries generated using HCDR1/2 array-based synthesis, which do not contain sequence liabilities, are expected to have a better long-term developability profile.
Table 26: Biophysical parameters of anti-CD47 and anti-CD28 antibodies selected from the 13 common Light Chain libraries reported in Table 10.
To confirm the specificity of the anti-CD47 and anti-CD28 cLC binders for their respective targets, KD affinity values were measured by SPR. As reported in Table 27, diverse anti-CD47 cLC binders ranging from single-digit nanomolar affinity to micromolar affinity were discovered. Comparing the median affinity of the binders selected from the individual libraries could be misleading because only a few Fabs were expressed from each library. To obtain a deeper data set, and to facilitate the comparison of the different library designs, the data were combined according to the VH germlines, the HCDR1/2 designs or the HCDR3 diversification strategies. As shown in Figure 15, there are no statistically significant differences between the affinities of binders originating from the different library designs. As reported in Table 28 and Figure 16, diverse anti-CD28 cLC binders ranging from double-digit nanomolar affinity to micromolar affinity were isolated, and the different library designs are comparable with regard to affinity range. To conclude, all library designs described here are suitable for generating diverse and developable cLC binders if HCDR1/2 are randomized.
Table 27: KD values of anti-CD47 Fabs measured by SPR.
Table 28: KD values of anti-CD28 Fabs measured by SPR.
Example 5: Characterization of anti-Nkp46 antibodies
Materials and Methods
Fab binding affinities for Nkp46
Surface plasmon resonance (SPR) was used to measure the binding affinities of the Fab fragments for human NKp46. Measurements were performed on a Biacore 8K+ instrument (Cytiva Life Sciences) using the Biacore 8K+ Control Software at 25 °C and analyzed with the Biacore Insight Evaluation Software (v3.0). Around 80 RU of biotinylated recombinant human NKp46 protein (Acro Biosystems, catalog NO: NC1-H82F9) was captured on flow-path 2 of a Series S Biotin CAPture Chip (Cytiva Life Sciences, catalog NO: 28920234). Fabs were injected in single-cycle kinetic analysis mode at different concentrations ranging from 6.4 to 4000 nM, in HBS-EP+ buffer (Cytiva Life Sciences, catalog NO: BR100669) at a flow rate of 30 µl/min for 3 min on flow-paths 1 and 2 (flow-path 1 being used as reference). Dissociation was monitored for 10 min. After each cycle, the surface was regenerated with 60 µl of the regeneration solution provided with the Series S Biotin CAPture Kit (Cytiva Life Sciences, catalog NO: 28920234). Experimental data were processed using the 1:1 Langmuir kinetic fitting model. Measurements included zero-concentration samples for referencing. Chi2 and residual values were used to evaluate the quality of the fit between the experimental data and individual binding models.
Results
As described in Example 4, a panel of anti-Nkp46 hits was reformatted to Fab format for developability and affinity assessment. All individual libraries delivered Fabs with good transient expression yield in HEK cells, with medians ranging from 25 mg/mL to 92 mg/mL (Table 29), which is comparable to a well-behaved comparator control, trastuzumab Fab (72 mg/mL, average of duplicate). Interestingly, we observed that the binders selected from libraries harboring diversity in HCDR1/2 (either Trimer-based or Array-based) have a better yield than the ones selected from germline-based libraries (Figure 17). In addition, binders selected from the semi-synthetic libraries have a better yield than the ones originating from the synthetic libraries. While this trend was not observed for anti-CD47 and anti-CD28 (Example 4), one could assume that natural HCDR3s fold and express better than synthetic HCDR3s owing to in-vivo preselection. All libraries also yielded molecules with high solubility, with medians >99% monomer by SEC, and the different library designs are comparable for this parameter (Figure 18). The SEC retention times of these Fabs, with medians ranging from 12.77 min to 13.29 min, are in the same range as that of the well-behaved comparator control trastuzumab Fab (12.79 min, average of duplicate), suggesting that the molecules have a low propensity for polyreactivity (Figure 19). However, it is worth mentioning that the binders having a germline VH sequence show the longest retention times. Finally, the thermostability of the Fabs isolated from each library is very high, with medians ranging from 86 °C to 92 °C. Interestingly, VH1-69-based antibodies have a slightly higher Tm than VH1-46-based antibodies (Figure 20).
Table 29: Biophysical parameters of anti-Nkp46 antibodies selected from the 13 common Light Chain libraries reported in Table 10.
To confirm the specificity of the anti-Nkp46 cLC binders, KD affinity values were measured by SPR. As reported in Table 30, diverse anti-Nkp46 cLC binders ranging from single digit nanomolar affinity to micromolar affinity were discovered. As shown in Figure 21, there are no statistical differences between the affinity of binders originating from the different library designs.
To conclude, all library designs described in this study are suitable for generating cLC binders. While each library does not deliver the same number of binders against the different targets, we observed that the VH1-46-based and VH1-69-based libraries are the most productive if they harbor HCDR1/2 diversity.
Table 30: KD values of anti-Nkp46 Fabs measured by SPR.
Example 6: Epitope binning of anti-Nkp46 antibodies
Materials and Methods
Epitope binning of anti-Nkp46 antibodies
Chip preparation - Epitope binning was performed using a Carterra LSA™ instrument equipped with an HC30M chip at 25 °C using a sandwich method. Based on the affinities measured by SPR (Biacore 8K+, Example 5 Figure 21 and Example 5 Table 30), Fabs with KD<200 nM, or with KD<400 nM and Koff<0.02 sec⁻¹, were immobilized. The chip was primed with 10 mM MES pH 5.5 and activated for 5 min using N-hydroxysuccinimide and N-(3-Dimethylaminopropyl)-N'-ethylcarbodiimide hydrochloride (NHS-EDC) reagents from the Amine coupling kit (Cytiva BR100633) at the concentration recommended by the manufacturer. Fabs were then injected with the 96 printhead at 800 nM in NaOAc 20 mM pH 4.3 for 20 min, and the chip was blocked using 1 M ethanolamine pH 8 for 5 min.
Binning cycles - Recombinant human NKp46 protein (Acro Biosystems, catalog NO: NC1-H82F9) was injected at 200 nM in 1x HBSTE (Carterra #3728) for 10 min. Fabs with KD<1 µM (as measured by Biacore 8K+, Example 5) were injected for 5 min at 800 nM in 1x HBSTE as second binder to assess competition with the immobilized Fabs. Every 10 cycles, buffer was injected over NKp46. Two regeneration cycles were performed at the end of each cycle by injecting 10 mM glycine pH 2.0 for 30 sec.
Data analysis - Sensorgrams were analyzed with the Carterra Epitope™ software. Data was de-spiked on a 5 RU height and width basis, referenced, Y-aligned before antigen injection and normalized at the end of the antigen injection. Thresholds to differentiate competitors from non-competitors were set at around 1.15 to 1.2 times the value obtained from the buffer cycles, as exemplified in Figure 22. As recommended by the manufacturer, asymmetries were then highlighted and reduced, when possible, by adjusting the gating. Bins were then visualized and arranged using the Fruchterman-Reingold (Fruchterman et al., 1991) algorithm for competition network generation.
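The thresholding step described above can be sketched as follows. This is not the Carterra Epitope™ software's implementation. It assumes that, in a sandwich assay, a second Fab whose normalized response exceeds the threshold can still bind the captured antigen (non-competitor), while a response at or below the threshold indicates competition. The Fab names and response values are hypothetical.

```python
def classify_competition(normalized_responses, buffer_value, factor=1.15):
    """Label each (immobilized Fab, injected Fab) pair using a buffer-based threshold.

    factor is the multiplier applied to the buffer-cycle value (about 1.15 to 1.2 in the text).
    """
    threshold = factor * buffer_value
    calls = {}
    for pair, response in normalized_responses.items():
        calls[pair] = "non-competitor" if response > threshold else "competitor"
    return threshold, calls


# Hypothetical normalized responses at the end of the second-binder injection.
responses = {
    ("Fab_A", "Fab_B"): 1.02,
    ("Fab_A", "Fab_C"): 1.60,
    ("Fab_B", "Fab_C"): 1.45,
}
threshold, calls = classify_competition(responses, buffer_value=1.0, factor=1.15)
for pair, call in calls.items():
    print(pair, call)
```

The competitor pairs would then serve as the edges of the competition network that a force-directed layout such as Fruchterman-Reingold arranges.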
Results
Epitope binning of anti-Nkp46 antibodies
The diversity of antibodies selected from the 13 different common Light Chain libraries was further assessed by an epitope binning experiment. Among the three targets used for the library validation, CD47 and CD28 are composed of a single Ig-like extracellular domain and Nkp46 is composed of two Ig-like extracellular domains. To demonstrate the broad epitope coverage of the selected antibodies, the largest protein, Nkp46, was chosen for the epitope binning experiment. In addition, only the highest-affinity anti-Nkp46 Fabs were tested in this experiment in order to obtain analysable data. Consequently, only a few antibodies (3 to 11 Fabs) were tested for each individual library. Interestingly, this limited number of molecules can already be arranged in 2 to 5 bins, demonstrating good epitope diversity for each individual library (Table 31). We also observed that the epitope diversity of antibodies selected from libraries using Trimer-based or Array-based HCDR1/2 diversity is comparable, with 7 and 8 bins, respectively (Table 31, Figure 24). Interestingly, antibodies selected from semi-synthetic libraries target as many epitope bins as those from fully synthetic libraries (8 and 7 bins, respectively) (Table 31, Figure 25). Finally, the antibodies from the set of 13 common light chain libraries described in this study can be arranged in 10 different bins, demonstrating a very broad epitope coverage given that only a limited number of good-affinity antibodies were selected for this experiment (Figure 23 to Figure 25).
Table 31: Epitope binning of anti-Nkp46 antibodies selected from the 13 common Light Chain libraries.

Claims
1. A method to generate a collection of diverse human HCDR1 and/or HCDR2, wherein said human HCDR1 corresponds to the Kabat positions 27 to 35, and wherein said human HCDR2 corresponds to the Kabat positions 50 to 58, comprising the steps of:
a. Performing Next-Generation-Sequencing of human IgG antibody repertoire;
b. Selecting unique HCDR1 and/or HCDR2 amino acid sequences of a human antibody variable heavy chain germline amino acid sequence, thus obtaining a HCDR1 and/or HCDR2 amino acid sequence dataset;
c. Removing from said HCDR1 and/or HCDR2 amino acid sequence dataset:
(i) sequences comprising about 60% or more amino acid substitutions per CDR compared to said human antibody variable heavy chain germline amino acid sequence (Somatic-Hyper-Mutations);
(ii) sequences comprising one or more motifs of Table 1; and
(iii) sequences encompassing immunogenic peptides.
2. The method of claim 1 wherein said sequences encompassing immunogenic peptides are detected by the Immune Epitope Database (IEDB) CD4 T cell immunogenicity prediction tool.
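The three removal criteria of claim 1 can be illustrated with a minimal sketch. Everything concrete in it is an assumption made for illustration: the toy germline and CDR strings are not sequences from this document, the motif list stands in for Table 1 (which is not reproduced in this section), and the immunogenicity check is a caller-supplied placeholder where a prediction tool such as the one named in claim 2 would be applied.

```python
def substitution_fraction(cdr, germline_cdr):
    """Fraction of positions at which the CDR differs from the germline CDR."""
    if len(cdr) != len(germline_cdr):
        raise ValueError("CDR and germline CDR must have the same length")
    return sum(a != b for a, b in zip(cdr, germline_cdr)) / len(cdr)


def filter_cdrs(cdrs, germline_cdr, liability_motifs, is_immunogenic, max_fraction=0.60):
    """Keep unique CDRs that pass the three removal criteria."""
    kept = []
    for cdr in sorted(set(cdrs)):  # unique sequences only
        if substitution_fraction(cdr, germline_cdr) >= max_fraction:
            continue  # (i) too many substitutions relative to germline
        if any(motif in cdr for motif in liability_motifs):
            continue  # (ii) contains a listed motif
        if is_immunogenic(cdr):
            continue  # (iii) flagged as immunogenic
        kept.append(cdr)
    return kept


toy_germline = "GFTFSSYAM"   # toy string, not a sequence from the patent
toy_motifs = ["NG"]          # placeholder, not the content of Table 1
observed = ["GFTFSSYAM", "GFTFSNYAM", "GFNGSSYAM", "AYSLTDHGM", "GFTFSNYAM"]
kept = filter_cdrs(observed, toy_germline, toy_motifs, is_immunogenic=lambda s: False)
print(kept)
```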
3. A method to generate a collection of diverse human HCDR1 and/or HCDR2, wherein said human HCDR1 corresponds to the Kabat positions 25 to 36, and wherein said human HCDR2 corresponds to the Kabat positions 47 to 65, comprising the steps of:
a. Performing Next-Generation-Sequencing of human IgG antibody repertoire;
b. Selecting unique HCDR1 and/or HCDR2 amino acid sequences for a human antibody variable heavy chain germline amino acid sequence, thus obtaining a HCDR1 and/or HCDR2 amino acid sequence dataset;
c. Aligning the amino acid sequences of HCDR1 and/or HCDR2 of said HCDR1 and/or HCDR2 amino acid sequence dataset;
d. Calculation of the frequency of each amino acid at each position of said amino acid sequences of HCDR1 and/or HCDR2;
e. Calculation of a single point mutation rate (MR) for each position of said HCDR1 and/or HCDR2, wherein MR is the frequency of a non-germline amino acid at each position of said HCDR1 and/or HCDR2;
f. Calculation of a diversity score (DS) for each position of said HCDR1 and/or HCDR2, wherein DS is the MR multiplied by the minimum number of amino acids whose summed frequency is equal to 80% of the MR;
g. Obtaining said collection of diverse human HCDR1 and/or HCDR2 by providing at each position of said HCDR1 and/or HCDR2 a plurality of amino acids, characterized in that:
- for a position having said DS lower than about 70% of the DS highest value, said plurality of amino acids is made of the naturally occurring amino acid at each position, excluding amino acids with frequency less than 1%, Cys, and optionally Met;
- for a position having said DS equal to or higher than about 70% of the DS highest value, said plurality of amino acids is made of any amino acid, excluding Cys, Met, Trp, and optionally Pro.
4. The method of claims 1 to 3, wherein said human antibody variable heavy chain germline is selected from the group comprising VH1-46 (SEQ ID NO: 1), VH1-69 (SEQ ID NO: 2) and VH3-23 (SEQ ID NO: 3).
5. A collection of diverse human HCDR1 and/or HCDR2 obtained by the method of claims 1 to 4.
6. A collection of diverse human HCDR1 and/or HCDR2 of claim 5 obtained by the method of claims 3 and 4, wherein said HCDR1 is encoded as trimer oligonucleotides in VH1-69 germline according to the frequency of Table 3, or said HCDR1 is encoded as trimer oligonucleotides in VH1-46 germline according to the frequency of Table 4, said HCDR1 is encoded as trimer oligonucleotides in VH3-23 germline according to the frequency of Table 5; and said HCDR2 is encoded as trimer oligonucleotides in VH1-69 germline according to the frequency of Table 6, or said HCDR2 is encoded as trimer oligonucleotides in VH1-46 germline according to the frequency of Table 7, and said HCDR2 is encoded as trimer oligonucleotides in VH3-23 germline according to the frequency of Table 8.
7. A library of antibody binding regions, wherein said antibody binding regions comprise a heavy chain variable domain comprising a human HCDR1 and/or a human HCDR2 obtained by the method of claims 1 to 4 and a common light chain variable domain.
8. The library of claim 7 wherein said common light chain variable domain is a VK3-15 or a VK1-39 variable light chain.
9. The library of claim 8 wherein said common light chain variable domain is VK3-15/JK1 (SEQ ID NO: 4).
10. The library of claims 7 to 9, wherein said heavy chain variable domain further comprises a naturally occurring heavy chain framework region.
11. The library of claims 7 to 10, wherein said naturally occurring heavy chain framework region is derived from a human antibody variable heavy chain germline selected from the group comprising VH1-46 (SEQ ID NO: 1), VH1-69 (SEQ ID NO: 2) and VH3-23 (SEQ ID NO: 3).
12. The library of claims 7 to 11, wherein said heavy chain variable domain further comprises a human HCDR3 being a naturally occurring HCDR3 from a human IgM repertoire.
13. The library of claims 7 to 12, wherein said library is a phage display library.
14. A method for identifying an antibody binding region from the library of claims 7 to 13, wherein said antibody binding region specifically binds to an antigen target of interest, comprising the steps of (i) panning on said antigen and (ii) screening said antibody binding region that specifically binds to said antigen.
15. An antibody binding region identified according to the method of claim 14.
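The mutation rate (MR) and diversity score (DS) defined in claim 3 can be sketched as below. The sketch rests on two interpretive assumptions that the claim wording does not spell out: the amino acids counted for DS are the non-germline residues, taken in order of decreasing frequency, and "equal to 80% of the MR" is read as "reaching at least 80% of the MR". The toy two-residue CDRs are invented for illustration.

```python
from collections import Counter


def position_scores(aligned_cdrs, germline_cdr, coverage=0.80):
    """Return one (MR, DS) tuple per aligned CDR position."""
    n = len(aligned_cdrs)
    scores = []
    for pos, germline_aa in enumerate(germline_cdr):
        counts = Counter(seq[pos] for seq in aligned_cdrs)
        # Counts of non-germline residues, most frequent first.
        non_germline = sorted(
            (c for aa, c in counts.items() if aa != germline_aa), reverse=True
        )
        total = sum(non_germline)
        mr = total / n
        k, cumulative = 0, 0
        for c in non_germline:
            if cumulative >= coverage * total:
                break
            cumulative += c
            k += 1
        scores.append((mr, mr * k))
    return scores


def diversification_mode(scores, cutoff=0.70):
    """'any' for positions with DS >= cutoff * max DS, otherwise 'natural'."""
    ds_max = max(ds for _, ds in scores)
    return [
        "any" if ds_max > 0 and ds >= cutoff * ds_max else "natural"
        for _, ds in scores
    ]


# Toy dataset of 10 aligned two-residue CDRs with germline "GY".
toy_cdrs = ["GY"] * 5 + ["GF"] + ["AY"] * 3 + ["SY"]
scores = position_scores(toy_cdrs, "GY")
modes = diversification_mode(scores)
print(scores, modes)
```

In this toy case the first position (MR 0.4, two residues needed to cover 80% of it) would be opened to any allowed amino acid, while the second would be restricted to the naturally observed residues.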
PCT/EP2023/064529 2022-06-03 2023-05-31 Common light chain antibody libraries WO2023232857A1 (en)

Applications Claiming Priority (6)

Application Number Priority Date Filing Date Title
EP22020261 2022-06-03
EP22020261.8 2022-06-03
EP22200562 2022-10-10
EP22200562.1 2022-10-10
EP22204648 2022-10-31
EP22204648.4 2022-10-31

Publications (1)

Publication Number Publication Date
WO2023232857A1 true WO2023232857A1 (en) 2023-12-07

Family

ID=86732426

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/EP2023/064529 WO2023232857A1 (en) 2022-06-03 2023-05-31 Common light chain antibody libraries

Country Status (1)

Country Link
WO (1) WO2023232857A1 (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO1994013804A1 (en) 1992-12-04 1994-06-23 Medical Research Council Multivalent and multispecific binding proteins, their manufacture and use
US9209965B2 (en) 2014-01-14 2015-12-08 Microsemi Semiconductor Ulc Network interface with clock recovery module on line card
WO2020014143A1 (en) * 2018-07-08 2020-01-16 Specifica Inc. Antibody libraries with maximized antibody developability characteristics

Non-Patent Citations (23)

* Cited by examiner, † Cited by third party
Title
BAI XUELIAN ET AL: "A Novel Synthetic Antibody Library with Complementarity-Determining Region Diversities Designed for an Improved Amplification Profile", INTERNATIONAL JOURNAL OF MOLECULAR SCIENCES, vol. 23, no. 11, 2 June 2022 (2022-06-02), pages 6255, XP093013662, DOI: 10.3390/ijms23116255 *
BIRD RE ET AL., SCIENCE, vol. 242, 1988, pages 423 - 426
CHOTHIA, LESK, J. MOL BIOL., vol. 196, 1987, pages 901 - 17
COLOMA MJ, MORRISON SL, NATURE BIOTECHNOLOGY, vol. 15, no. 2, 1997, pages 159 - 163
DHANDA: "Prediction of HLA CD4 immunogenicity in human populations", FRONTIERS IN IMMUNOLOGY, vol. 9, 2018, pages 1369
HOLLIGER P ET AL., PROC. NATL. ACAD. SCI. USA, vol. 90, 1993, pages 6444 - 48
HUSTON JS ET AL., PROC. NATL. ACAD. SCI. USA, vol. 85, 1988, pages 5879 - 83
LEFRANC, DEV. COMP. IMMUNOL., vol. 27, 2003, pages 55 - 77
M. ERASMUS ET AL., COMM. BIOLOGY, 2021
M. VAN DER LOO, J. VAN DER LAAN, R CORE TEAM, N. LOGAN, C. MUIR, J. GRUBER, APPROXIMATE STRING MATCHING, FUZZY TEXT SEARCH, AND STRING DISTANCE FUNCTIONS, 2021, Retrieved from the Internet <URL:https://CRAN.R-project.org/package=stringdist>
MACCALLUM, J. MOL. BIOL., vol. 262, no. 5, 1996, pages 732 - 45
MARTIN, A.C.: "In Antibody Engineering", vol. 2, 2010, SPRINGER, article "Protein sequence and structure analysis of antibody variable domains", pages: 33 - 51
P. VALADON, MABS, 2013
PADLAN, FASEB J., vol. 9, 1995, pages 133 - 39
PAUL: "Development and validation of a broad scheme for prediction of HLA class II restricted T cell epitopes", JOURNAL OF IMMUNOLOGICAL METHODS, vol. 422, 2015, pages 28 - 34, XP029239932, DOI: 10.1016/j.jim.2015.03.022
RAMSKOLD D, LUO S ET AL.: "Full-length mRNA-Seq from single-cell levels of RNA and individual circulating tumor cells", NAT BIOTECHNOL, vol. 30, no. 8, 2012, pages 777 - 782, XP037004921, DOI: 10.1038/nbt.2282
ROUET ROMAIN ET AL: "Next-Generation Sequencing of Antibody Display Repertoires", FRONTIERS IN IMMUNOLOGY, vol. 9, 2 February 2018 (2018-02-02), XP093013512, DOI: 10.3389/fimmu.2018.00118 *
THOMPSON, NUCLEIC ACIDS RES., vol. 22, 1994, pages 4673 - 80
TOMLINSON I, HOLLINGER P, METHODS ENZYMOL., vol. 326, 2000, pages 461 - 79
WARD ES ET AL., NATURE, vol. 341, 1989, pages 544 - 546
WELLENREUTHER R, SCHUPP I ET AL.: "SMART amplification combined with cDNA size fractionation in order to obtain large full-length clones", BMC GENOMICS, vol. 5, no. 1, 2004, pages 140, XP021002117, DOI: 10.1186/1471-2164-5-36
XUELIAN BAI ET AL: "A Novel Human scFv Library with Non-Combinatorial Synthetic CDR Diversity", PLOS ONE, vol. 10, no. 10, 20 October 2015 (2015-10-20), pages e0141045, XP055489793, DOI: 10.1371/journal.pone.0141045 *
ZHU YY, MACHLEDER EM ET AL.: "Reverse transcriptase template switching: a SMART approach for full-length cDNA library construction", BIOTECHNIQUES, vol. 30, no. 4, 2001, pages 892 - 897, XP001121210

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 23729754

Country of ref document: EP

Kind code of ref document: A1