WO2019015677A1

WO2019015677A1 - Animals with genetically modified immunoglobulin heavy chain

Info

Publication number: WO2019015677A1
Application number: PCT/CN2018/096493
Authority: WO
Inventors: Yuelei SHEN; Chaoshe GUO; Rui Huang; Yang BAI; Meiling Zhang; Chengzhang SHANG
Original assignee: Beijing Biocytogen Co., Ltd
Priority date: 2017-07-21
Filing date: 2018-07-20
Publication date: 2019-01-24

Abstract

Provided are genetically modified animals expressing modified immunoglobulin heavy chains, and methods of use thereof.

Description

ANIMALS WITH GENETICALLY MODIFIED IMMUNOGLOBULIN HEAVY CHAIN

CLAIM OF PRIORITY

This application claims the benefit of Chinese Patent Application No. 201710602851.7, filed on July 21, 2017, and Chinese Patent Application No. 201810796920.7, filed on July 19, 2018. The entire contents of the foregoing are incorporated herein by reference.

TECHNICAL FIELD

This disclosure relates to genetically modified animals expressing modified immunoglobulin heavy chains, and methods of use thereof.

BACKGROUND

In the development of antibody therapeutics, current antibody screening methods generally involve either (1) phage display or (2) hybridoma technology. Phage display involves displaying antibody fragments on the surface of phage to form an antibody library, from which the antibody gene is screened through repeated panning steps. In contrast, the hybridoma approach generally involves immunizing mice with an antigen, collecting the spleen cells of the mice, and fusing them with immortalized myeloma cells to obtain hybridoma cells. The hybridoma cells are selected using a selective medium and are then screened again for the antibody that specifically binds to the antigen.

The ultimate goal of antibody screening is usually to obtain antibody sequences for subsequent optimization, testing, and industrial production. The two screening methods have their own advantages and disadvantages. For example, the phage display method requires establishing a screening platform, which is complicated. The library capacity for phage display is also relatively small, and because the in vitro screening process does not involve somatic hypermutation, the affinity needs to be further optimized. The hybridoma approach has some advantages in screening (e.g., completed VDJ rearrangement in animals, somatic mutation, improved screening efficiency and affinity). However, it involves many steps; in particular, screening for positive hybridomas usually requires limiting dilution. Generally, the process needs to be repeated 3 to 4 times to confirm that the cells in a well produce the antibody that specifically binds to the antigen of interest. Thus, the hybridoma approach is typically slow, inefficient, and time consuming.

In summary, there is an urgent need for a more convenient and efficient antibody screening method.

SUMMARY

The present disclosure provides an efficient and fast antibody screening method based on a genetically engineered animal. Immune cells that express an antibody that specifically binds to the antigen of interest can be rapidly obtained and isolated from the genetically engineered animal. Sequence analysis can be directly performed on the immune cells, avoiding the tedious process of preparing the antibody library or the time consuming step of hybridoma screening, thereby greatly improving the speed and efficiency of antibody development. Thus, in one aspect, this disclosure is related to an animal model with genetically modified immunoglobulin heavy chains. The animal model can express genetically modified immunoglobulin heavy chains. The animals can be used for antibody screening. The methods described herein can greatly facilitate the development and design of new drugs, saving substantial time and cost.

In one aspect, the disclosure provides a genetically-modified, non-human animal whose genome comprises at least one chromosome comprising an insertion of a sequence encoding a membrane bound domain at an endogenous immunoglobulin heavy chain locus.

In some embodiments, the immunoglobulin heavy chain locus comprises a first 3’-UTR region for the secreted form of the immunoglobulin heavy chain and a second 3’-UTR region for the membrane bound form of the immunoglobulin heavy chain. In some embodiments, the sequence encoding the membrane bound domain is inserted before the first 3’-UTR region.

In some embodiments, the immunoglobulin heavy chain locus comprises a sequence encoding an immunoglobulin CH1 domain, an immunoglobulin CH2 domain, and an immunoglobulin CH3 domain. In some embodiments, the membrane bound domain is linked to the immunoglobulin CH3 domain.

In some embodiments, the membrane bound domain is linked to the immunoglobulin CH3 domain by an oligopeptide sequence. In some embodiments, the oligopeptide sequence is a 2A oligopeptide sequence (e.g., F2A). In some embodiments, the 2A oligopeptide sequence comprises a sequence that is at least 80%, 85%, 90%, 95%, or 99% identical to SEQ ID NO: 2.

In some embodiments, the membrane bound domain is an immunoglobulin heavy chain membrane bound domain. In some embodiments, the membrane bound domain comprises a sequence that is at least 80%, 85%, 90%, 95%, or 99% identical to SEQ ID NO: 4.

In some embodiments, the immunoglobulin heavy chain locus comprises a sequence that is at least 80%, 85%, 90%, 95%, or 99% identical to SEQ ID NO: 8.

In some embodiments, the immunoglobulin heavy chain locus comprises a sequence that encodes an amino acid sequence that is at least 80%, 85%, 90%, 95%, or 99% identical to SEQ ID NO: 9.

In some embodiments, the immunoglobulin heavy chain locus comprises a sequence that is at least 80%, 85%, 90%, 95%, or 99% identical to SEQ ID NO: 21.

In some embodiments, the immunoglobulin heavy chain locus comprises a sequence that encodes an amino acid sequence that is at least 80%, 85%, 90%, 95%, or 99% identical to SEQ ID NO: 22.

In some embodiments, the sequence is inserted in exon 4.

In some embodiments, the animal is heterozygous with respect to the insertion at the endogenous immunoglobulin heavy chain gene locus.

In some embodiments, the animal is homozygous with respect to the insertion at the endogenous immunoglobulin heavy chain gene locus.

In some embodiments, the animal does not express the endogenous immunoglobulin heavy chain.

In some embodiments, the animal is a mammal, e.g., a monkey, a rodent or a mouse. In some embodiments, the animal is a mouse.

In some embodiments, the endogenous immunoglobulin heavy chain locus is an immunoglobulin G heavy chain locus. In some embodiments, the endogenous immunoglobulin heavy chain locus is an Ighg1 locus.

In one aspect, the disclosure also relates to a genetically-modified, non-human animal whose genome comprises at least one chromosome comprising a sequence encoding an immunoglobulin heavy chain and a membrane bound domain. In some embodiments, the membrane bound domain is linked to the immunoglobulin heavy chain by a heterogeneous amino acid sequence.

In some embodiments, the sequence is operably linked to an endogenous immunoglobulin heavy chain locus regulatory element of the animal. In some embodiments, the sequence is integrated into an endogenous immunoglobulin heavy chain gene locus of the animal.

In some embodiments, the heterogeneous amino acid sequence is a 2A oligopeptide sequence. In some embodiments, the 2A oligopeptide sequence comprises a sequence that is at least 80%, 85%, 90%, 95%, or 99% identical to SEQ ID NO: 2.

In some embodiments, the membrane bound domain is an immunoglobulin heavy chain membrane bound domain.

In some embodiments, the membrane bound domain comprises a sequence that is at least 80%, 85%, 90%, 95%, or 99% identical to SEQ ID NO: 4.

In some embodiments, the sequence encodes an amino acid sequence that is at least 80%, 85%, 90%, 95%, or 99% identical to SEQ ID NO: 9. In some embodiments, the sequence encodes an amino acid sequence that is at least 80%, 85%, 90%, 95%, or 99% identical to SEQ ID NO: 22.

In some embodiments, the animal does not express endogenous immunoglobulin heavy chain.

In some embodiments, the immunoglobulin heavy chain is an immunoglobulin G heavy chain. In some embodiments, the immunoglobulin heavy chain is an IgG1 heavy chain.

In another aspect, the disclosure provides a non-human animal comprising at least one cell comprising a nucleic acid encoding a fusion polypeptide. In some embodiments, the fusion polypeptide comprises an immunoglobulin heavy chain and a membrane bound domain. In some embodiments, the membrane bound domain is linked to the immunoglobulin heavy chain by a heterogeneous amino acid sequence.

In some embodiments, the nucleic acid is operably linked to an endogenous immunoglobulin heavy chain locus regulatory element of the animal. In some embodiments, the nucleic acid is integrated into an endogenous immunoglobulin heavy chain gene locus of the animal.

In some embodiments, the animal does not express an endogenous immunoglobulin heavy chain.

In one aspect, the disclosure also provides a genetically modified non-human animal, wherein the genome of the animal comprises, from 5’ to 3’ at the endogenous immunoglobulin heavy chain locus, (a) a first DNA sequence; optionally (b) a second DNA sequence comprising an exogenous sequence; (c) a third DNA sequence; and (d) a fourth DNA sequence. In some embodiments, the first DNA sequence, the optional second DNA sequence, the third DNA sequence, and the fourth DNA sequence are linked.

In some embodiments, the first DNA sequence comprises a sequence that is at least 80% identical to an endogenous immunoglobulin heavy chain gene sequence that is located upstream of the 3’-UTR for the secreted form of the immunoglobulin heavy chain. In some embodiments, the second DNA sequence can have a length of 0 nucleotides to 300 nucleotides. In some embodiments, the third DNA sequence encodes a membrane bound domain. In some embodiments, the fourth DNA sequence comprises a sequence that is at least 80% identical to a sequence that is located downstream of the 3’-UTR for the secreted form of the immunoglobulin heavy chain.

In some embodiments, the first DNA sequence comprises exon 1, exon 2, exon 3, and/or at least 10 nucleotides from exon 4 of the immunoglobulin heavy chain gene. In some embodiments, the fourth DNA sequence comprises M1 and/or M2 of the immunoglobulin heavy chain gene.
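The 5’ to 3’ arrangement of the four DNA sequences described above can be illustrated with a minimal Python sketch. The class name, field names, and the short placeholder sequences are hypothetical and are used for illustration only; they are not sequences from this disclosure. The sketch only enforces the stated length range of 0 to 300 nucleotides for the optional second DNA sequence and concatenates the segments in order.

```python
from dataclasses import dataclass

MAX_SECOND_NT = 300  # upper length bound stated in the text for the second DNA sequence


@dataclass(frozen=True)
class ModifiedLocus:
    """Hypothetical container for the four linked DNA sequences (5' to 3')."""
    first: str        # sequence upstream of the secreted-form 3'-UTR
    third: str        # sequence encoding the membrane bound domain
    fourth: str       # sequence downstream of the secreted-form 3'-UTR
    second: str = ""  # optional exogenous sequence (e.g., encoding a 2A oligopeptide)

    def __post_init__(self):
        # The second sequence is optional, so an empty string is allowed.
        if len(self.second) > MAX_SECOND_NT:
            raise ValueError("second DNA sequence exceeds 300 nucleotides")
        for name in ("first", "third", "fourth"):
            if not getattr(self, name):
                raise ValueError(f"{name} DNA sequence must not be empty")

    def assemble(self) -> str:
        """Concatenate the segments in 5' to 3' order."""
        return self.first + self.second + self.third + self.fourth


# Placeholder sequences, for illustration only.
locus = ModifiedLocus(first="ATGGCC", second="GGATCC", third="CTGTGG", fourth="TAAGCT")
print(locus.assemble())  # ATGGCCGGATCCCTGTGGTAAGCT
```

Omitting the `second` argument models the embodiment in which the second DNA sequence has a length of 0 nucleotides.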

In some embodiments, the second DNA sequence encodes a 2A oligopeptide sequence. In some embodiments, the second DNA sequence comprises a sequence that is at least 80%, 85%, 90%, 95%, or 99% identical to SEQ ID NO: 2.

In some embodiments, the sequence encodes an amino acid sequence that is at least 80%, 85%, 90%, 95%, or 99% identical to SEQ ID NO: 9. In some embodiments, the sequence encodes an amino acid sequence that is at least 80%, 85%, 90%, 95%, or 99% identical to SEQ ID NO: 22.

In some embodiments, the endogenous immunoglobulin heavy chain locus is an immunoglobulin G heavy chain locus. In some embodiments, the endogenous immunoglobulin heavy chain locus is an Ighg1 locus.

In another aspect, the disclosure also provides methods of isolating an immune cell that expresses an antibody that specifically binds to an antigen. The methods involve immunizing the animal described herein with the antigen (e.g., provoking the immune system with the antigen); collecting a plurality of cells (e.g., spleen cells, various immune cells, B cells) from the animal; and isolating one or more cells that express an antibody that specifically binds to the antigen.

In some embodiments, the one or more cells that express an antibody that specifically binds to the antigen are isolated by fluorescence activated cell sorting (FACS). In some embodiments, the one or more cells that express an antibody that specifically binds to the antigen are isolated by magnetic-activated cell sorting (MACS).

In some embodiments, the methods further comprise sequencing a nucleic acid sequence that encodes one or more regions (e.g., complementarity-determining regions (CDRs), VH, and/or VL) of the antibody that is expressed by the isolated cell.

In one aspect, the disclosure also provides methods for making a genetically-modified, non-human animal. The methods involve inserting a sequence encoding a membrane bound domain at an endogenous immunoglobulin heavy chain locus.

In some embodiments, the locus is an immunoglobulin G heavy chain locus. In some embodiments, the inserted sequence further comprises a 2A oligonucleotide sequence.

The disclosure also provides a fusion immunoglobulin heavy chain peptide comprising one or more immunoglobulin heavy chain constant domains and a membrane bound domain. In some embodiments, the membrane bound domain is linked to the immunoglobulin heavy chain by a heterogeneous amino acid sequence.

In some embodiments, the peptide comprises an amino acid sequence that is at least 80%, 85%, 90%, 95%, or 99% identical to SEQ ID NO: 9. In some embodiments, the peptide comprises an amino acid sequence that is at least 80%, 85%, 90%, 95%, or 99% identical to SEQ ID NO: 22.

In another aspect, the disclosure provides a peptide comprising an amino acid sequence, wherein the amino acid sequence is one of the following:

(a) an amino acid sequence set forth in SEQ ID NO: 9 or 22;

(b) an amino acid sequence that is at least 90% identical to SEQ ID NO: 9 or 22;

(c) an amino acid sequence that is at least 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical to SEQ ID NO: 9 or 22;

(d) an amino acid sequence that is different from the amino acid sequence set forth in SEQ ID NO: 9 or 22 by no more than 10, 9, 8, 7, 6, 5, 4, 3, 2 or 1 amino acid; and

(e) an amino acid sequence that comprises a substitution, a deletion and/or insertion of one, two, three, four, five or more amino acids to the amino acid sequence set forth in SEQ ID NO: 9 or 22.
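As an illustration of how thresholds such as "at least 90% identical" or "no more than 10 amino acid differences" might be evaluated, the following is a minimal Python sketch. It assumes that the two sequences have already been aligned to equal length, with "-" as the gap character, and it counts identity over all alignment columns; other conventions for percent identity exist and can give different values. The function names and the toy sequences are hypothetical and are not SEQ ID NO: 9 or 22.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of alignment columns with identical residues (gaps never match)."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be pre-aligned to equal length")
    if not aligned_a:
        raise ValueError("empty alignment")
    matches = sum(1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-")
    return 100.0 * matches / len(aligned_a)


def count_differences(aligned_a: str, aligned_b: str) -> int:
    """Number of columns that differ (substitution, insertion, or deletion)."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be pre-aligned to equal length")
    return sum(1 for x, y in zip(aligned_a, aligned_b) if x != y)


# Toy 10-residue sequences with a single substitution, for illustration only.
reference = "MKTAYIAKQR"
variant = "MKTAYLAKQR"
print(percent_identity(reference, variant))   # 90.0
print(count_differences(reference, variant))  # 1
```

In practice the alignment itself would be produced by a sequence alignment tool, and the choice of denominator (alignment length versus reference length) should be stated alongside any identity value.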

In one aspect, the disclosure further provides a nucleic acid comprising a nucleotide sequence, wherein the nucleotide sequence is one of the following:

(a) a sequence that encodes the peptide described herein;

(b) SEQ ID NO: 8;

(c) SEQ ID NO: 21;

(d) a sequence that is at least 90% identical to SEQ ID NO: 8 or SEQ ID NO: 21;

(e) a sequence that is at least 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical to SEQ ID NO: 8; and

(f) a sequence that is at least 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical to SEQ ID NO: 21.

The disclosure also relates to cells or animals that have the nucleic acids or proteins described herein.

In some embodiments, the animal is a non-human mammal, e.g., a rodent. In some embodiments, the non-human mammal is a mouse.

The disclosure also relates to a cell (e.g., stem cell or embryonic stem cell) or cell line, or a primary cell culture thereof derived from the non-human mammal or an offspring thereof. The disclosure further relates to the tissue, organ or a culture thereof derived from the non-human mammal or an offspring thereof.

The disclosure further relates to a genomic DNA sequence of a genetically-modified mouse, a DNA sequence obtained by a reverse transcription of the mRNA obtained by transcription thereof is consistent with or complementary to the DNA sequence; a construct expressing the amino acid sequence thereof; a cell comprising the construct thereof; a tissue comprising the cell thereof.

The disclosure further relates to the use of the non-human mammal or an offspring thereof, or the tumor bearing non-human mammal, the animal model generated through the method as described herein in the development of a product related to an immunization process, the manufacture of an antibody, or the model system for a research in pharmacology, immunology, microbiology and medicine.

Unless otherwise defined, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. Methods and materials are described herein for use in the present invention; other, suitable methods and materials known in the art can also be used. The materials, methods, and examples are illustrative only and not intended to be limiting. All publications, patent applications, patents, sequences, database entries, and other references mentioned herein are incorporated by reference in their entirety. In case of conflict, the present specification, including definitions, will control.

Other features and advantages of the invention will be apparent from the following detailed description and figures, and from the claims.

DESCRIPTION OF DRAWINGS

FIG. 1 is a schematic diagram showing mouse Ighg1 gene and the gene locus after the modification.

FIG. 2 is a schematic diagram showing the gene modification strategy.

FIG. 3 shows the restriction enzyme digestion results of the targeting plasmid TV-4G-Ighg1 by three sets of restriction enzymes.

FIG. 4 shows the restriction enzyme digestion results of the targeting plasmid TV-4G-Ighg1-b by two sets of restriction enzymes.

FIG. 5 is a graph showing activity testing results for sgRNA1-sgRNA9 (Con is a negative control; PC is a positive control).

FIG. 6 is a schematic diagram showing the structure of pT7-sgRNA-G2 plasmid.

FIGS. 7A-7B show PCR identification results of samples collected from tails of F0 generation mice with C57BL/6 background. WT is wildtype. Mouse #5 was positive.

FIGS. 8A-8B show PCR identification results of samples collected from tails of F0 generation mice with BALB/c background. WT is wildtype. Mouse #1, #2, and #3 were positive.

FIGS. 9A-9B show PCR identification results of samples collected from tails of F1 generation mice with C57BL/6 background. WT is wildtype. Mice labeled with F1-1, F1-2, and F1-3 were positive.

FIGS. 10A-10H are flow cytometry results of spleen cells collected from mice immunized with the extracellular region of human CD27.

FIG. 11 shows magnetic-activated cell sorting (MACS) results with anti-biotin MicroBeads for spleen cells collected from mice immunized with the extracellular region of human CD27. These cells were treated with Biotinylated Human CD27 Ligand/CD70 Protein, Fc Tag before sorting.

FIG. 12 is a schematic diagram showing gene targeting strategy based on embryonic stem cells.

DETAILED DESCRIPTION

This disclosure relates to transgenic non-human animals with genetically modified immunoglobulin heavy chains, and methods of use thereof.

Immunoglobulins, also known as antibodies, are glycoprotein molecules produced by plasma cells (white blood cells). They act as a critical part of the immune response by specifically recognizing and binding to particular antigens, such as bacteria or viruses, and aiding in their destruction. The immunoglobulin heavy chain (IgH) is the large polypeptide subunit of an antibody (immunoglobulin). A typical antibody is composed of two immunoglobulin (Ig) heavy chains and two Ig light chains. Several different types of heavy chain exist that define the class or isotype of an antibody. These heavy chain types vary between different animals. All heavy chains contain a series of immunoglobulin domains, usually with one variable domain (VH) that is important for binding antigen and several constant domains (CH1, CH2, CH3, CH4, etc.).

Immunoglobulins usually occur in two main forms: soluble antibodies and membrane-bound antibodies. The membrane-bound antibodies contain a hydrophobic membrane-bound domain. The membrane-bound antibodies are typically part of B cell antigen receptor (BCR) . Antigen binding to the BCR stimulates B cells to differentiate into antibody-secreting cells. Alternative splicing in immunoglobulin heavy chain gene regulates the production of secreted antibodies or surface bound B-cell receptors in B cells.

The hybridoma technology is a widely used method for producing large numbers of identical antibodies (also called monoclonal antibodies). This process starts by injecting a mouse (or other mammal) with an antigen that provokes an immune response. B cells (a type of white blood cell) that produce antibodies that bind to the antigen are then harvested from the mouse. These isolated B cells are in turn fused with immortal B cell cancer cells, a myeloma, to produce a hybrid cell line called a hybridoma, which has both the antibody-producing ability of the B cell and the exaggerated longevity and reproductivity of the myeloma. The hybridomas can be grown in culture, each culture starting with one viable hybridoma cell, producing cultures each of which consists of genetically identical hybridomas, which produce one antibody per culture (monoclonal), rather than mixtures of different antibodies (polyclonal). The myeloma cell line that is used in this process is selected for its ability to grow in tissue culture and for an absence of antibody synthesis. The production of hybridomas and the screening of the positive clones that produce antibodies that can specifically bind to the antigen of interest are laborious and time consuming.

The present disclosure provides a transgenic non-human animal with a genetically modified immunoglobulin heavy chain. Because of the addition of the membrane bound domain to the soluble form of the immunoglobulin heavy chain, the antibodies produced by the immune cells in the transgenic animal can be retained on the surface of the immune cells. After the animal is exposed to the antigen, the immune cells display these antibodies on their surface. These immune cells can be sorted and isolated based on the binding affinity of these antibodies for the antigen. The isolated cells can then be used to make hybridomas. Alternatively, the antibody-encoding sequences of these cells can be analyzed for further modification and optimization (e.g., humanization). The methods described herein avoid manual screening of the positive hybridoma clones, which is laborious and time-consuming, and can greatly facilitate the development of antibody therapeutics.

Unless otherwise specified, the practice of the methods described herein can take advantage of the techniques of cell biology, cell culture, molecular biology, transgenic biology, microbiology, recombinant DNA and immunology. These techniques are explained in detail in the following literature, for example: Molecular Cloning A Laboratory Manual, 2nd Ed., ed. by Sambrook, Fritsch and Maniatis (Cold Spring Harbor Laboratory Press: 1989); DNA Cloning, Volumes I and II (D.N. Glover ed., 1985); Oligonucleotide Synthesis (M.J. Gait ed., 1984); Mullis et al. U.S. Pat. No. 4,683,195; Nucleic Acid Hybridization (B.D. Hames & S.J. Higgins eds. 1984); Transcription And Translation (B.D. Hames & S.J. Higgins eds. 1984); Culture Of Animal Cells (R.I. Freshney, Alan R. Liss, Inc., 1987); Immobilized Cells And Enzymes (IRL Press, 1986); B. Perbal, A Practical Guide To Molecular Cloning (1984); the series, Methods In Enzymology (J. Abelson and M. Simon, eds.-in-chief, Academic Press, Inc., New York), specifically, Vols. 154 and 155 (Wu et al. eds.) and Vol. 185, “Gene Expression Technology” (D. Goeddel, ed.); Gene Transfer Vectors For Mammalian Cells (J.H. Miller and M.P. Calos eds., 1987, Cold Spring Harbor Laboratory); Immunochemical Methods In Cell And Molecular Biology (Mayer and Walker, eds., Academic Press, London, 1987); Handbook Of Experimental Immunology, Volumes V (D.M. Weir and C.C. Blackwell, eds., 1986); and Manipulating the Mouse Embryo, (Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y., 1986); each of which is incorporated herein by reference in its entirety.

Antibodies and immunoglobulin heavy chains

Antibodies (also called immunoglobulins) are made up of two classes of polypeptide chains, light chains and heavy chains. An intact antibody typically has two heavy chains and two light chains. The heavy chains, which determine the immunoglobulin class, are of five different types, denoted by the Greek letters μ, δ, γ, ε, and α, corresponding to the classes IgM, IgD, IgG, IgE, and IgA, respectively. Light chains are of two types, denoted λ and κ. Each chain is itself divided into two functional parts, the variable (V) domain and the constant (C) domain.

An antibody can comprise two identical copies of a light chain and two identical copies of a heavy chain. The heavy chains, which each contain one variable domain (or variable region, VH) and multiple constant domains (or constant regions), bind to one another via disulfide bonding within their constant domains to form the “stem” of the antibody. The light chains, which each contain one variable domain (or variable region, VL) and one constant domain (or constant region), each bind to one heavy chain via disulfide bonding. The variable region of each light chain is aligned with the variable region of the heavy chain to which it is bound. The variable regions of both the light chains and heavy chains contain three hypervariable regions sandwiched between more conserved framework regions (FR).

These hypervariable regions, known as the complementarity determining regions (CDRs), form loops that comprise the principal antigen binding surface of the antibody. The four framework regions largely adopt a beta-sheet conformation and the CDRs form loops connecting, and in some cases forming part of, the beta-sheet structure. The CDRs in each chain are held in close proximity by the framework regions and, with the CDRs from the other chain, contribute to the formation of the antigen-binding region.

Methods for identifying the CDR regions of an antibody by analyzing the amino acid sequence of the antibody are well known, and a number of definitions of the CDRs are commonly used. The Kabat definition is based on sequence variability, and the Chothia definition is based on the location of the structural loop regions. These methods and definitions are described in, e.g., Martin, "Protein sequence and structure analysis of antibody variable domains," Antibody engineering, Springer Berlin Heidelberg, 2001. 422-439; Abhinandan, et al. "Analysis and improvements to Kabat and structurally correct numbering of antibody variable domains," Molecular immunology 45.14 (2008): 3832-3839; Wu, T.T. and Kabat, E.A. (1970) J. Exp. Med. 132: 211-250; Martin et al., Methods Enzymol. 203: 121-53 (1991); Morea et al., Biophys Chem. 68(1-3): 9-16 (Oct. 1997); Morea et al., J Mol Biol. 275(2): 269-94 (Jan. 1998); Chothia et al., Nature 342(6252): 877-83 (Dec. 1989); Ponomarenko and Bourne, BMC Structural Biology 7: 64 (2007); each of which is incorporated herein by reference in its entirety. Unless specifically indicated in the present disclosure, Kabat numbering is used in the present disclosure as a default.

The CDRs are important for recognizing an epitope of an antigen. As used herein, an “epitope” is the smallest portion of a target molecule capable of being specifically bound by the antigen binding domain of an antibody. The minimal size of an epitope may be about three, four, five, six, or seven amino acids, but these amino acids need not be in a consecutive linear sequence of the antigen’s primary structure, as the epitope may depend on an antigen’s three-dimensional configuration based on the antigen’s secondary and tertiary structure.

The heavy chain also has sub-isotypes. The sub-isotypes of the heavy chains include IgG1, IgG2, IgG2a, IgG2b, IgG3, IgG4, IgE1, IgE2, etc. The IgG subclasses (IgG1, IgG2a, IgG2b, IgG3, and IgG4) are highly conserved but differ in their constant regions, particularly in their hinges and upper CH2 domains. The sequences and differences of the IgG subclasses are known in the art, and are described, e.g., in Vidarsson, et al. "IgG subclasses and allotypes: from structure to effector functions." Frontiers in immunology 5 (2014); Irani, et al. "Molecular properties of human IgG subclasses and their implications for designing therapeutic monoclonal antibodies against infectious diseases." Molecular immunology 67.2 (2015): 171-182; Shakib, Farouk, ed. The human IgG subclasses: molecular analysis of structure, function and regulation. Elsevier, 2016; each of which is incorporated herein by reference in its entirety.

Immunoglobulins usually occur in two main forms: soluble antibodies and membrane-bound antibodies. The specific form is determined by alternative splicing in the immunoglobulin heavy chain gene. Each immunoglobulin heavy chain isotype (e.g., IGHM, IGHD, IGHG1, IGHG2, IGHG3, IGHG4, IGHA1, IGHA2, IGHE) is expressed as one of two isoforms, secreted or membrane bound, which contain S or M exons respectively (Vollmers et al. "Novel exons and splice variants in the human antibody heavy chain identified by single cell and single molecule sequencing." PLoS One 10.1 (2015): e0117050). Transcripts of all isotypes are thought to feature two exons (M1 and M2) specific to their membrane isoform, with the exception of IgA1 and IgA2 (transcribed from IGHA1 and IGHA2), which feature only one membrane exon (M).

The immunoglobulin heavy chain loci are known in the art. In these immunoglobulin heavy chain loci, two separate 3’ terminal sequences for mRNA are encoded in the genome: the first 3’ terminal sequence is involved in the secreted form, and the second 3’ terminal sequence specifies an amino acid sequence appropriate for the membrane-bound form. The synthesis of secreted and membrane-bound immunoglobulin heavy chains is generally directed by mRNAs that differ at their 3’ ends (See Alt, et al. "Synthesis of secreted and membrane-bound immunoglobulin mu heavy chains is directed by mRNAs that differ at their 3’ ends." Cell 20.2 (1980): 293-301).

The immunoglobulin G heavy chain (γ chain) gene has 4 exons (exon 1, exon 2, exon 3, and exon 4) for the constant domains (CH1, CH2, and CH3) and the hinge region (H) of the antibody, and 2 exons (M1 and M2) for the membrane-bound domain (the transmembrane domain) (FIG. 1). There are two separate 3’-UTRs (untranslated regions) in the genome: the first 3’-UTR is for the secreted form, and the second 3’-UTR is for the membrane-bound form. When the membrane-bound form of IGHG is expressed, the transcript skips the first 3’-UTR and links the coding region in exon 4 with M1 and M2. The final expression product is the membrane-bound form of the γ chain.
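The choice between the two forms described above can be summarized with a minimal Python sketch. It is a simplified model that lists only which parts of the locus appear in each transcript, as stated in the preceding paragraph; the function name and the string labels are hypothetical, and the sketch does not model the splicing mechanism itself.

```python
from typing import List

# Exons shared by both forms, as named in the text (CH1, hinge, CH2, CH3).
CONSTANT_EXONS = ["exon 1", "exon 2", "exon 3", "exon 4"]
# Exons specific to the membrane-bound form.
MEMBRANE_EXONS = ["M1", "M2"]


def transcript_parts(membrane_bound: bool) -> List[str]:
    """Return the ordered parts of the transcript for the chosen form."""
    if membrane_bound:
        # The first 3'-UTR is skipped; exon 4 coding sequence joins M1 and M2.
        return CONSTANT_EXONS + MEMBRANE_EXONS + ["second 3'-UTR"]
    # The secreted form ends with the first 3'-UTR and lacks M1 and M2.
    return CONSTANT_EXONS + ["first 3'-UTR"]


print(transcript_parts(membrane_bound=False))
print(transcript_parts(membrane_bound=True))
```

Under this model, an insertion placed before the first 3’-UTR is present in the secreted-form transcript, which is the form that otherwise lacks a membrane-bound domain.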

The mouse Ighg1 gene (Gene ID: 16017) is located in Chromosome 12 of the mouse genome, from 113326544 to 113330523 of NC_000078.6 (GRCm38.p4 (GCF_000001635.24)). For the secreted form of IGHG1, exon 1 is from 113330523 to 113330233, the first intron is from 113330232 to 113329883, exon 2 is from 113329882 to 113329844, the second intron is from 113329843 to 113329741, exon 3 is from 113329740 to 113329420, the third intron is from 113329419 to 113329299, exon 4 is from 113329298 to 113328874, and the 3’-UTR is from 113328975 to 113328874, based on transcript Ighg1-201 (ENSMUST00000103420.2). For the membrane-bound form of IGHG1, exon 1 is from 113330523 to 113330233, the first intron is from 113330232 to 113329883, exon 2 is from 113329882 to 113329844, the second intron is from 113329843 to 113329741, exon 3 is from 113329740 to 113329420, the third intron is from 113329419 to 113329299, exon 4 is from 113329298 to 113328984, the fourth intron is from 113328983 to 113327573, exon 5 (M1) is from 113327572 to 113327442, the fifth intron is from 113327441 to 113326625, exon 6 (M2) is from 113326624 to 113325240, and the 3’-UTR is from 113326540 to 113325240, based on transcript Ighg1-202 (ENSMUST00000194304.5). The relevant information for the mouse Ighg1 locus can be found on the NCBI website with Gene ID: 16017 and is described, e.g., in Casola, et al. "Tracking germinal center B cells expressing germ-line immunoglobulin γ1 transcripts by conditional gene targeting." Proceedings of the National Academy of Sciences 103.19 (2006): 7396-7401, both of which are incorporated by reference herein in their entirety.
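The coordinates listed above can be cross-checked with a short Python sketch. It takes only the exon coordinates given for the membrane-bound form (transcript Ighg1-202), computes each exon length, and derives the intron intervals that lie between consecutive exons. The variable and function names are hypothetical. Coordinates are kept in the descending order used in the text, so each start value is larger than its end value.

```python
# Exon coordinates (start, end) as listed in the text for transcript Ighg1-202.
IGHG1_MEMBRANE_EXONS = {
    "exon 1": (113330523, 113330233),
    "exon 2": (113329882, 113329844),
    "exon 3": (113329740, 113329420),
    "exon 4": (113329298, 113328984),
    "exon 5 (M1)": (113327572, 113327442),
    "exon 6 (M2)": (113326624, 113325240),
}


def exon_lengths(exons):
    """Length in nucleotides of each exon (coordinates are inclusive)."""
    return {name: start - end + 1 for name, (start, end) in exons.items()}


def introns_between(exons):
    """Intron intervals implied by consecutive exons, in descending coordinates."""
    coords = list(exons.values())
    return [
        (prev_end - 1, next_start + 1)
        for (_, prev_end), (next_start, _) in zip(coords, coords[1:])
    ]


for name, length in exon_lengths(IGHG1_MEMBRANE_EXONS).items():
    print(name, length)
for number, interval in enumerate(introns_between(IGHG1_MEMBRANE_EXONS), start=1):
    print("intron", number, interval)
```

The third derived interval runs from 113329419 to 113329299, which is the span between the end of exon 3 and the start of exon 4 as listed.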

Table 1

The mouse Ighg2b gene (Gene ID: 16016) is located on Chromosome 12 of the mouse genome, from 113304314 to 113307933 of NC_000078.6 (GRCm38.p4 (GCF_000001635.24)). For the secreted form, exon 1 is from 113307933 to 113307643, the first intron is from 113307642 to 113307327, exon 2 is from 113307326 to 113307261, the second intron is from 113307260 to 113307154, exon 3 is from 113307153 to 113306824, the third intron is from 113306823 to 113306712, exon 4 is from 113306711 to 113306285, and the 3’-UTR is from 113306388 to 113306285, based on transcript Ighg2b-201 (ENSMUST00000103418.2). With respect to the membrane-bound form, exon 1 is from 113307933 to 113307643, the first intron is from 113307642 to 113307327, exon 2 is from 113307326 to 113307261, the second intron is from 113307260 to 113307154, exon 3 is from 113307153 to 113306824, the third intron is from 113306823 to 113306712, exon 4 is from 113306711 to 113306397, the fourth intron is from 113306396 to 113305040, exon 5 is from 113305039 to 113304909, the fifth intron is from 113304908 to 113304398, exon 6 is from 113304397 to 113302965, and the 3’-UTR is from 113304313 to 113302965, based on transcript Ighg2b-202 (ENSMUST00000192188.2). The relevant information for the mouse Ighg2b locus can be found on the NCBI website with Gene ID: 16016, and is described, e.g., in Casola, et al. "Tracking germinal center B cells expressing germ-line immunoglobulin γ1 transcripts by conditional gene targeting." Proceedings of the National Academy of Sciences 103.19 (2006): 7396-7401, both of which are incorporated by reference herein in their entirety.

Table 2

The mouse Ighg3 gene (Gene ID: 380795) is located on Chromosome 12 of the mouse genome, from 113357442 to 113361232 of NC_000078.6 (GRCm38.p4 (GCF_000001635.24)). For the secreted form, exon 1 is from 113361232 to 113360942, the first intron is from 113360941 to 113360577, exon 2 is from 113360576 to 113360529, the second intron is from 113360528 to 113360430, exon 3 is from 113360429 to 113360100, the third intron is from 113360099 to 113359988, exon 4 is from 113359987 to 113359575, and the 3’-UTR is from 113359664 to 113359575, based on transcript Ighg3-201 (ENSMUST00000103423.2). With respect to the membrane-bound form, exon 1 is from 113361232 to 113360942, the first intron is from 113360941 to 113360577, exon 2 is from 113360576 to 113360529, the second intron is from 113360528 to 113360430, exon 3 is from 113360429 to 113360100, the third intron is from 113360099 to 113359988, exon 4 is from 113359987 to 113359673, the fourth intron is from 113359672 to 113358201, exon 5 is from 113358200 to 113358070, the fifth intron is from 113358069 to 113357523, exon 6 is from 113357522 to 113356224, and the 3’-UTR is from 113357438 to 113356224, based on transcript Ighg3-202 (ENSMUST00000223179.1). The relevant information for the mouse Ighg3 locus can be found on the NCBI website with Gene ID: 380795, and is described, e.g., in Casola, et al. "Tracking germinal center B cells expressing germ-line immunoglobulin γ1 transcripts by conditional gene targeting." Proceedings of the National Academy of Sciences 103.19 (2006): 7396-7401, both of which are incorporated by reference herein in their entirety.

Table 3

Immunoglobulin M heavy chain (μ) has 4 exons (exon 1, exon 2, exon 3, and exon 4) for the constant domains (CH1, CH2, CH3, and CH4), and 2 exons (M1 and M2) for the membrane-bound domain (including the transmembrane region). Similarly, there are two separate 3’-UTRs (untranslated regions) in the genome: the first 3’-UTR is for the canonical secreted form, and the second 3’-UTR is for the canonical membrane-bound form. When the membrane-bound form of IGHM is expressed, the transcript skips the first 3’-UTR and links the coding region in exon 4 with M1 and M2. The mouse IGHM gene (Gene ID: 16019) is located on Chromosome 12 of the mouse genome, from 113418826 to 113422730 of NC_000078.6 (GRCm38.p4 (GCF_000001635.24)).

Immunoglobulin A heavy chain (α) has 3 exons (exon 1, exon 2, and exon 3) for the constant domains (CH1, CH2, CH3) and the hinge region, and 1 exon (M) for the membrane-bound domain. Similarly, there are two separate 3’-UTRs (untranslated regions) in the genome: the first 3’-UTR is for the canonical secreted form, and the second 3’-UTR is for the membrane-bound form. When the membrane-bound form of IGHA is expressed, the transcript skips the first 3’-UTR and links the coding region in exon 3 with the M exon. The mouse Igha gene (Gene ID: 238447) is located on Chromosome 12 of the mouse genome, from 113256204 to 113260236 of NC_000078.6 (GRCm38.p4 (GCF_000001635.24)).

Similarly, the mouse Ighd gene (Gene ID: 380797) is located on Chromosome 12 of the mouse genome, from 113407535 to 113416324 of NC_000078.6 (GRCm38.p4 (GCF_000001635.24)).

The mouse Ighe gene (Gene ID: 380792) is located on Chromosome 12 of the mouse genome, from 113269263 to 113273248 of NC_000078.6 (GRCm38.p4 (GCF_000001635.24)).

The immunoglobulin heavy chain genes, proteins, and loci of many other species are also known in the art. For example, the gene ID for IGHG1 in Rattus norvegicus is 299354, the gene ID for IGHE in Rattus norvegicus is 299351, the gene ID for IGHD in Rattus norvegicus is 641523, the gene ID for IGHM in Danio rerio (zebrafish) is 378762, the gene ID for IGHM in Bos taurus (cattle) is 404057, the gene ID for IGHD in Pan troglodytes (chimpanzee) is 453221, the gene ID for IGHE in Bos taurus (cattle) is 505407, and the gene ID for IGHA in Bos taurus (cattle) is 281242. The relevant information for these genes (e.g., intron sequences, exon sequences, amino acid residues of these proteins) can be found, e.g., in the NCBI database, which is incorporated by reference herein in its entirety.

The present disclosure provides a transgenic non-human animal with a genetically modified immunoglobulin heavy chain. The genome of the animal can have one or two chromosomes comprising an insertion of a sequence encoding a membrane bound domain at an endogenous immunoglobulin heavy chain locus, e.g., an IGHM, IGHD, IGHG (e.g., IGHG1, IGHG2, IGHG3, IGHG4), IGHA (e.g., IGHA1, IGHA2), or IGHE locus.

In some embodiments, the immunoglobulin heavy chain locus comprises a first 3’-UTR region for the secreted form of the immunoglobulin heavy chain and a second 3’-UTR region for the membrane bound form of the immunoglobulin heavy chain. The sequence encoding the membrane bound domain can be inserted before the first 3’-UTR region (e.g., in the exons encoding CH1, CH2, CH3, and/or CH4). In some embodiments, the membrane bound domain is linked to the last immunoglobulin constant domain in the antibody (e.g., the immunoglobulin CH3 domain in IGHG or the immunoglobulin CH4 domain in IGHM).

As used herein, the “membrane bound domain” refers to a domain that has a hydrophobic amino acid sequence and can bind (or anchor) a peptide or a protein to the cell membrane. The membrane bound domain can have at least 5, 6, 7, 8, 9, 10, 20, 30, 40, 50, 60, 70, 80, 90, or 100 amino acid residues. In some embodiments, the membrane bound domain has less than 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, or 150 amino acid residues. In some embodiments, the membrane bound domain has a transmembrane region and/or a cytoplasmic region. In some embodiments, the membrane bound domain is an immunoglobulin heavy chain membrane bound domain (e.g., the membrane bound domain of the membrane bound form of IGHM, IGHD, IGHG, IGHA, IGHE, IGHG1, IGHG2, IGHG3, IGHG4, IGHA1, or IGHA2). In some embodiments, the membrane bound domain comprises a sequence that is at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical to SEQ ID NO: 4.

In some embodiments, a linker sequence can be added between the membrane bound domain and the heavy chain constant domains. These linker sequences can be any amino acid sequence of any length. In some embodiments, the linker sequence is an oligopeptide sequence. The linker sequence can have at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 20, 30, 40, or 50 amino acid residues. In some embodiments, the linker sequence can have fewer than 2, 3, 4, 5, 6, 7, 8, 9, 10, 20, 30, 40, 50, or 100 amino acid residues. In some embodiments, the linker sequence can comprise or consist of (GGGGS)_n (n=1-4), (Gly)₈, (Gly)₆, (EAAAK)_n (n=1-3), A(EAAAK)₄ALEA(EAAAK)₄A, PAPAP, multiple repeats (e.g., 5-17) of Ala-Pro, VSQTSKLTRAETVFPDV, PLGLWA, RVLAEA, EDVVCCSMSY, GGIEGRGS, TRHRQPRGWE, AGNRVRRSVG, RRRRRRRRRd, or GFLG. In some embodiments, the linker is a 2A self-cleaving peptide linker (e.g., F2A). The commonly used 2A self-cleaving peptides include P2A, T2A, E2A, F2A, BmCPV2A, and BmIFV2A. The self-cleaving peptides can provide some additional advantages: because some of these peptides are self-cleaved, a certain percentage of antibodies can still be secreted while a certain percentage of antibodies are retained on the surface of the immune cells. Thus, it is more likely that the transgenic animal can better tolerate the genetic modification and is likely to have normal physiological conditions. The amino acid sequences and nucleotide sequences for the 2A self-cleaving peptides are known in the art, and are described, e.g., in Wang, et al. "2A self-cleaving peptide-based multi-gene expression system in the silkworm Bombyx mori." Scientific reports 5 (2015): 16273, which is incorporated herein by reference in its entirety. In some embodiments, the cleavage efficiency is greater than 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, or 95%. Thus, more than 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, or 95% of the expressed antibody is in soluble form. In some embodiments, the cleavage efficiency is less than 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, or 95%. Thus, more than 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, or 95% of the expressed antibody can be in membrane bound form.
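The repeat-based linkers named above can be written out explicitly. The sketch below expands the (GGGGS)_n, (EAAAK)_n, A(EAAAK)₄ALEA(EAAAK)₄A, and Ala-Pro repeat patterns into one-letter amino acid strings; the helper and variable names are hypothetical and are used only for illustration.

```python
def repeat_linker(unit: str, n: int) -> str:
    """Expand a repeat-unit linker such as (GGGGS)_n into a one-letter sequence."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return unit * n


# (GGGGS)_n for n = 1-4 and (EAAAK)_n for n = 1-3, as listed in the text.
ggggs_linkers = [repeat_linker("GGGGS", n) for n in range(1, 5)]
eaaak_linkers = [repeat_linker("EAAAK", n) for n in range(1, 4)]

# A(EAAAK)4ALEA(EAAAK)4A written out in full.
long_eaaak_linker = (
    "A" + repeat_linker("EAAAK", 4) + "ALEA" + repeat_linker("EAAAK", 4) + "A"
)

# Ala-Pro repeats; the text gives 5 to 17 repeats as an example range.
ala_pro_linkers = [repeat_linker("AP", n) for n in range(5, 18)]

print(ggggs_linkers)
print(eaaak_linkers)
print(long_eaaak_linker, len(long_eaaak_linker))
```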

In some embodiments, the linker sequence is F2A. In some embodiments, the linker sequence is at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical to SEQ ID NO: 2.

Thus, the disclosure also provides a sequence encoding an immunoglobulin heavy chain and a membrane bound domain, wherein the membrane bound domain is linked to the immunoglobulin heavy chain by a heterogeneous amino acid sequence (e.g., a linker sequence) .

In some embodiments, the present disclosure also provides modified immunoglobulin heavy chain nucleotide sequences and/or amino acid sequences, wherein in some embodiments, at least 1%, 2%, 3%, 4%, 5%, 6%, 7%, 8%, 9%, 10%, 15%, 20%, 25%, 30%, 35%, 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% of the sequence is identical to or derived from the endogenous or wildtype immunoglobulin heavy chain nucleotide sequence and/or amino acid sequence. In some embodiments, at least 1%, 2%, 3%, 4%, 5%, 6%, 7%, 8%, 9%, 10%, 15%, 20%, 25%, 30%, 35%, 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% of the sequence is the inserted sequence, or is not identical to or derived from the endogenous or wildtype immunoglobulin heavy chain nucleotide sequence and/or amino acid sequence.

In some embodiments, the nucleic acids as described herein are operably linked to a promoter or regulatory element, e.g., an endogenous mouse promoter, an inducible promoter, an enhancer, and/or an endogenous (e.g., mouse) regulatory element.

In some embodiments, the nucleic acid sequence has at least a portion (e.g., at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 20, 30, 40, 50, 60, 70, 80, 90, 100, 150, 200, 250, 300, 350, 400, 450, 500, 550, or 600 nucleotides, e.g., contiguous or non-contiguous nucleotides) that is different from a portion of or the entire animal immunoglobulin heavy chain nucleotide sequence (e.g., mouse IGHG1 sequence).

In some embodiments, the nucleic acid sequence has at least a portion (e.g., at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 20, 30, 40, 50, 60, 70, 80, 90, 100, 150, 200, 250, 300, 350, 400, 450, 500, 550, or 600 nucleotides, e.g., contiguous or non-contiguous nucleotides) that is the same as a portion of or the entire animal immunoglobulin heavy chain nucleotide sequence (e.g., mouse IGHG1 sequence) .

In some embodiments, the immunoglobulin heavy chain locus has an addition of a sequence that has at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 20, 30, 40, 50, 60, 70, 80, 90, 100, 110, 120, 130, 140, 150, 160, 170, 180, 190, 200, 210, 220, 230, 240, 250, 260, 270, 280, 290, 300, 350, 400, 450, 500, 550, 600, or 650 exogenous nucleotides.

In some embodiments, the amino acid sequence has at least a portion (e.g., at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 20, 30, 40, 50, 60, 70, 80, 90, 100, 150, 200, 250, 300, 350, 400, 450, 500, 550, or 600 amino acid residues, e.g., contiguous or non-contiguous amino acid residues) that is different from a portion of or the entire animal immunoglobulin heavy chain amino acid sequence (e.g., mouse IGHG1 sequence) .

In some embodiments, the amino acid sequence has at least a portion (e.g., at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 20, 30, 40, 50, 60, 70, 80, 90, 100, 150, 200, 250, 300, 350, 400, 450, 500, 550, or 600 amino acid residues, e.g., contiguous or non-contiguous amino acid residues) that is the same as a portion of or the entire animal immunoglobulin heavy chain amino acid sequence (e.g., mouse IGHG1 sequence) .

In some embodiments, the encoded immunoglobulin heavy chain peptide has an addition of a sequence that has at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 20, 30, 40, 50, 60, 70, 80, 90, 100, 110, 120, 130, 140, 150, 160, 170, 180, 190, or 200 exogenous amino acids.

As used herein, the term “portion” can refer to at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 110, 120, 130, 140, 150, 160, 170, 180, 190, 200, 250, 300, 350, 400, 500, or 600 nucleotides, or at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 110, 120, 130, 140, 150, 160, 170, 180, 190, or 200 amino acid residues.

The present disclosure also provides a modified IGHG1 mouse amino acid sequence, wherein the amino acid sequence is selected from the group consisting of:

a) an amino acid sequence shown in SEQ ID NO: 9 or 22;

b) an amino acid sequence having a homology of at least 90% with, or being at least 90% identical to, the amino acid sequence shown in SEQ ID NO: 9 or 22;

c) an amino acid sequence encoded by a nucleic acid sequence, wherein the nucleic acid sequence is able to hybridize to a nucleotide sequence encoding the amino acid shown in SEQ ID NO: 9 or 22 under a low stringency condition or a strict stringency condition;

d) an amino acid sequence having a homology of at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% with, or being at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical to, the amino acid sequence shown in SEQ ID NO: 9 or 22;

e) an amino acid sequence that is different from the amino acid sequence shown in SEQ ID NO: 9 or 22 by no more than 10, 9, 8, 7, 6, 5, 4, 3, 2 or no more than 1 amino acid; or

f) an amino acid sequence that comprises a substitution, a deletion, and/or an insertion of one or more amino acids to the amino acid sequence shown in SEQ ID NO: 9 or 22.

The present disclosure also relates to a modified IGHG1 nucleic acid (e.g., DNA, RNA, or cDNA) sequence, wherein the nucleic acid sequence can be selected from the group consisting of:

a) a nucleic acid sequence as shown in SEQ ID NO: 8 or 21;

b) a nucleic acid sequence that is shown in SEQ ID NO: 3 or 45;

c) a nucleic acid sequence that is able to hybridize to the nucleotide sequence as shown in SEQ ID NO: 8 or 21 or SEQ ID NO: 3 or 45 under a low stringency condition or a strict stringency condition;

d) a nucleic acid sequence that has a homology of at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% with, or is at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical to, the nucleotide sequence as shown in SEQ ID NO: 8 or 21 or SEQ ID NO: 3 or 45;

e) a nucleic acid sequence that encodes an amino acid sequence, wherein the amino acid sequence has a homology of at least 90% with, or is at least 90% identical to, the amino acid sequence shown in SEQ ID NO: 9 or 22;

f) a nucleic acid sequence that encodes an amino acid sequence, wherein the amino acid sequence has a homology of at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% with, or is at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical to, the amino acid sequence shown in SEQ ID NO: 9 or 22;

g) a nucleic acid sequence that encodes an amino acid sequence, wherein the amino acid sequence is different from the amino acid sequence shown in SEQ ID NO: 9 or 22 by no more than 10, 9, 8, 7, 6, 5, 4, 3, 2 or no more than 1 amino acid; and/or

h) a nucleic acid sequence that encodes an amino acid sequence, wherein the amino acid sequence comprises a substitution, a deletion, and/or an insertion of one or more amino acids to the amino acid sequence shown in SEQ ID NO: 9 or 22.

The present disclosure further relates to a modified immunoglobulin heavy chain genomic DNA sequence of a mouse. The DNA sequence obtained by reverse transcription of the mRNA transcribed from the genomic DNA sequence is consistent with or complementary to a DNA sequence homologous to the sequence shown in SEQ ID NO: 8 or 21.

The disclosure also provides an amino acid sequence that has a homology of at least 90% with, or is at least 90% identical to, the sequence shown in SEQ ID NO: 9 or 22. In some embodiments, the homology with the sequence shown in SEQ ID NO: 9 or 22 is at least about 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or at least 99%. In some embodiments, the foregoing homology is at least about 60%, 61%, 62%, 63%, 64%, 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 80%, or 85%.

In some embodiments, the percentage identity with the sequence shown in SEQ ID NO: 9 or 22 is at least about 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or at least 99%. In some embodiments, the foregoing percentage identity is at least about 60%, 61%, 62%, 63%, 64%, 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 80%, or 85%.

The disclosure also provides a nucleotide sequence that has a homology of at least 90% with, or is at least 90% identical to, the sequence shown in SEQ ID NO: 8 or 21, and encodes a polypeptide that has protein activity. In some embodiments, the homology with the sequence shown in SEQ ID NO: 8 or 21 is at least about 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or at least 99%. In some embodiments, the foregoing homology is at least about 50%, 55%, 60%, 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 80%, or 85%.

In some embodiments, the percentage identity with the sequence shown in SEQ ID NO: 8 or 21 is at least about 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or at least 99%. In some embodiments, the foregoing percentage identity is at least about 50%, 55%, 60%, 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 80%, or 85%.

The disclosure also provides a nucleic acid sequence that is at least 1%, 2%, 3%, 4%, 5%, 6%, 7%, 8%, 9%, 10%, 15%, 20%, 25%, 30%, 35%, 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical to any nucleotide sequence as described herein, and an amino acid sequence that is at least 1%, 2%, 3%, 4%, 5%, 6%, 7%, 8%, 9%, 10%, 15%, 20%, 25%, 30%, 35%, 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical to any amino acid sequence as described herein. In some embodiments, the disclosure relates to nucleotide sequences encoding any peptides that are described herein, or any amino acid sequences that are encoded by any nucleotide sequences as described herein. In some embodiments, the nucleic acid sequence is less than 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 110, 120, 130, 150, 200, 250, 300, 350, 400, 500, or 600 nucleotides. In some embodiments, the amino acid sequence is less than 5, 6, 7, 8, 9, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 110, 120, 130, 140, 150, 160, 170, 180, 190, or 200 amino acid residues.

In some embodiments, the amino acid sequence (i) comprises an amino acid sequence; or (ii) consists of an amino acid sequence, wherein the amino acid sequence is any one of the sequences as described herein.

In some embodiments, the nucleic acid sequence (i) comprises a nucleic acid sequence; or (ii) consists of a nucleic acid sequence, wherein the nucleic acid sequence is any one of the sequences as described herein.

To determine the percent identity of two amino acid sequences, or of two nucleic acid sequences, the sequences are aligned for optimal comparison purposes (e.g., gaps can be introduced in one or both of a first and a second amino acid or nucleic acid sequence for optimal alignment and non-homologous sequences can be disregarded for comparison purposes). The length of a reference sequence aligned for comparison purposes is at least 80% of the length of the reference sequence, and in some embodiments is at least 90%, 95%, or 100%. The amino acid residues or nucleotides at corresponding amino acid positions or nucleotide positions are then compared. When a position in the first sequence is occupied by the same amino acid residue or nucleotide as the corresponding position in the second sequence, then the molecules are identical at that position. The percent identity between the two sequences is a function of the number of identical positions shared by the sequences, taking into account the number of gaps, and the length of each gap, which need to be introduced for optimal alignment of the two sequences. For purposes of the present disclosure, the comparison of sequences and determination of percent identity between two sequences can be accomplished using a BLOSUM62 scoring matrix with a gap penalty of 12, a gap extend penalty of 4, and a frameshift gap penalty of 5.
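The counting step of this procedure can be illustrated with a minimal sketch. It assumes the two sequences have already been aligned (for example by an alignment tool using the scoring parameters mentioned above) and are supplied as equal-length strings with "-" marking gaps. It does not perform the alignment itself, and dividing by the number of alignment columns is only one common convention; the function name is hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over the columns of a pre-computed pairwise alignment.

    Assumption: identity = identical columns / all alignment columns, so each
    gap position counts against the score. Other denominators are also in use.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    # Ignore columns where both sequences carry a gap.
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == gap and b == gap)]
    if not columns:
        raise ValueError("alignment has no comparable columns")
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


print(percent_identity("ACGT", "ACGA"))    # one mismatch in four columns
print(percent_identity("AC-GT", "ACTGT"))  # one gap in five columns
```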

The percentage of residues conserved with similar physicochemical properties (percent homology) , e.g. leucine and isoleucine, can also be used to measure sequence similarity. Families of amino acid residues having similar physicochemical properties have been defined in the art. These families include amino acids with basic side chains (e.g., lysine, arginine, histidine) , acidic side chains (e.g., aspartic acid, glutamic acid) , uncharged polar side chains (e.g., glycine, asparagine, glutamine, serine, threonine, tyrosine, cysteine) , nonpolar side chains (e.g., alanine, valine, leucine, isoleucine, proline, phenylalanine, methionine, tryptophan) , beta-branched side chains (e.g., threonine, valine, isoleucine) and aromatic side chains (e.g., tyrosine, phenylalanine, tryptophan, histidine) . The homology percentage, in many cases, is higher than the identity percentage.
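A percent homology of this kind can be sketched by treating two residues as similar when they are identical or share at least one of the families listed above. The one-letter family sets below transcribe that list; the rule "shares any family" and the function names are assumptions made for illustration, and the input is again a pre-computed alignment.

```python
# Residue families transcribed from the text, in one-letter code.
FAMILIES = [
    set("KRH"),       # basic side chains
    set("DE"),        # acidic side chains
    set("GNQSTYC"),   # uncharged polar side chains
    set("AVLIPFMW"),  # nonpolar side chains
    set("TVI"),       # beta-branched side chains
    set("YFWH"),      # aromatic side chains
]


def similar(a: str, b: str) -> bool:
    """True if the residues are identical or share at least one family (assumed rule)."""
    return a == b or any(a in fam and b in fam for fam in FAMILIES)


def percent_homology(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of alignment columns holding identical or similar residues."""
    if len(aligned_a) != len(aligned_b) or not aligned_a:
        raise ValueError("aligned sequences must be non-empty and equal in length")
    hits = sum(1 for a, b in zip(aligned_a, aligned_b)
               if gap not in (a, b) and similar(a, b))
    return 100.0 * hits / len(aligned_a)


# Leucine/isoleucine, lysine/arginine, and aspartate/glutamate are all
# non-identical but similar pairs.
print(percent_homology("LKD", "IRE"))
```

Because every identical column is also counted as similar, this measure can never be lower than the percent identity of the same alignment.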

Cells, tissues, and animals (e.g., mouse) are also provided that comprise the nucleotide sequences as described herein, as well as cells, tissues, and animals (e.g., mouse) that express a modified immunoglobulin heavy chain from an endogenous non-human locus.

Genetically modified animals

As used herein, the term “genetically-modified non-human animal” refers to a non-human animal having an exogenous sequence in at least one chromosome of the animal’s genome. As used herein, the term “exogenous sequence” refers to a sequence that is not normally present in the wildtype animal before any genetic modification. In some embodiments, the exogenous sequence can include a sequence that is derived from the animal, but is located at a location that is different from its location in the wildtype animal. In some embodiments, the exogenous sequence can be an artificial sequence or a sequence derived from a different species.

In some embodiments, one or more cells, e.g., at least 1%, 2%, 3%, 4%, 5%, 10%, 20%, 30%, 40%, or 50% of the cells of the genetically-modified non-human animal, have the exogenous DNA in their genome. The cell having exogenous DNA can be various kinds of cells, e.g., an endogenous cell, a somatic cell, an immune cell, a T cell, a B cell, an antigen presenting cell, a macrophage, a dendritic cell, a germ cell, a blastocyst, or an endogenous tumor cell. In some embodiments, genetically-modified non-human animals are provided that comprise a modified endogenous immunoglobulin heavy chain locus (e.g., immunoglobulin μ, δ, γ, ε, or α chain locus) that comprises an exogenous sequence (e.g., a linker sequence and/or a sequence encoding a membrane bound domain). The animals are generally able to pass the modification to progeny, i.e., through germline transmission.

As used herein, the term “chimeric gene” or “chimeric nucleic acid” refers to a gene or a nucleic acid, wherein two or more portions of the gene or the nucleic acid are from different sources, or at least one of the sequences of the gene or the nucleic acid does not correspond to the wildtype nucleic acid in the animal. In some embodiments, the chimeric gene or chimeric nucleic acid has at least one portion of the sequence that is derived from two or more different sources, e.g., sequences encoding different proteins or sequences encoding the same (or homologous) protein of two or more different species.

As used herein, the term “chimeric protein” or “chimeric polypeptide” refers to a protein or a polypeptide, wherein two or more portions of the protein or the polypeptide are from different sources, or at least one of the sequences of the protein or the polypeptide does not correspond to wildtype amino acid sequence in the animal. In some embodiments, the chimeric protein or the chimeric polypeptide has at least one portion of the sequence that is derived from two or more different sources, e.g., same (or homologous) proteins of different species.

As used herein, the term “modified gene” or “modified nucleic acid” refers to a sequence that has one or more modifications as compared to the wildtype gene or nucleic acid sequence.

As used herein, the term “modified peptide” or “modified protein” refers to a peptide or protein that has one or more modifications as compared to the wildtype amino acid sequence.

In some embodiments, the modified gene or the modified nucleic acid is a modified immunoglobulin heavy chain (e.g. immunoglobulin μ, δ, γ, ε or α chain) gene or a modified immunoglobulin heavy chain (e.g. immunoglobulin μ, δ, γ, ε or α chain) nucleic acid. In some embodiments, the gene or the nucleic acid has the addition of a sequence encoding a membrane-bound domain and/or a linker sequence.

In some embodiments, the gene or the nucleic acid comprises a sequence that encodes an immunoglobulin heavy chain (e.g., immunoglobulin μ, δ, γ, ε, or α chain). The encoded immunoglobulin heavy chain (e.g., immunoglobulin μ, δ, γ, ε, or α chain) proteins are functional and can bind to each other to form a dimer, and/or bind to an immunoglobulin light chain and specifically bind to the antigen of interest.

The genetically modified non-human animal can be various animals, e.g., a mouse, rat, rabbit, pig, bovine (e.g., cow, bull, buffalo) , deer, sheep, goat, chicken, cat, dog, ferret, primate (e.g., marmoset, rhesus monkey) . For the non-human animals where suitable genetically modifiable embryonic stem (ES) cells are not readily available, other methods are employed to make a non-human animal comprising the genetic modification. Such methods include, e.g., modifying a non-ES cell genome (e.g., a fibroblast or an induced pluripotent cell) and employing nuclear transfer to transfer the modified genome to a suitable cell, e.g., an oocyte, and gestating the modified cell (e.g., the modified oocyte) in a non-human animal under suitable conditions to form an embryo. These methods are known in the art, and are described, e.g., in A. Nagy, et al., “Manipulating the Mouse Embryo: A Laboratory Manual (Third Edition) , ” Cold Spring Harbor Laboratory Press, 2003, which is incorporated by reference herein in its entirety.

In one aspect, the animal is a mammal, e.g., of the superfamily Dipodoidea or Muroidea. In some embodiments, the genetically modified animal is a rodent. The rodent can be selected from a mouse, a rat, and a hamster. In some embodiments, the genetically modified animal is from a family selected from Calomyscidae (e.g., mouse-like hamsters), Cricetidae (e.g., hamster, New World rats and mice, voles), Muridae (true mice and rats, gerbils, spiny mice, crested rats), Nesomyidae (climbing mice, rock mice, white-tailed rats, Malagasy rats and mice), Platacanthomyidae (e.g., spiny dormice), and Spalacidae (e.g., mole rats, bamboo rats, and zokors). In some embodiments, the genetically modified rodent is selected from a true mouse or rat (family Muridae), a gerbil, a spiny mouse, and a crested rat. In some embodiments, the non-human animal is a mouse.

In some embodiments, the animal is a mouse of a C57BL strain, e.g., selected from C57BL/A, C57BL/An, C57BL/GrFa, C57BL/KaLwN, C57BL/6, C57BL/6J, C57BL/6ByJ, C57BL/6NJ, C57BL/10, C57BL/10ScSn, C57BL/10Cr, and C57BL/Ola. In some embodiments, the mouse is a 129 strain selected from the group consisting of a strain that is 129P1, 129P2, 129P3, 129X1, 129S1 (e.g., 129S1/SV, 129S1/SvIm), 129S2, 129S4, 129S5, 129S9/SvEvH, 129S6 (129/SvEvTac), 129S7, 129S8, 129T1, and 129T2. These mice are described, e.g., in Festing et al., Revised nomenclature for strain 129 mice, Mammalian Genome 10: 836 (1999); Auerbach et al., Establishment and Chimera Analysis of 129/SvEv- and C57BL/6-Derived Mouse Embryonic Stem Cell Lines (2000), both of which are incorporated herein by reference in their entirety. In some embodiments, the genetically modified mouse is a mix of the 129 strain and the C57BL/6 strain. In some embodiments, the mouse is a mix of the 129 strains, or a mix of the BL/6 strains. In some embodiments, the mouse is a BALB strain, e.g., BALB/c strain. In some embodiments, the mouse is a mix of a BALB strain and another strain. In some embodiments, the mouse is from a hybrid line (e.g., 50% BALB/c-50% 129S4/Sv; or 50% C57BL/6-50% 129). In some embodiments, the animal is a mouse of a strain selected from BALB/c, C57BL/A, C57BL/An, C57BL/GrFa, C57BL/KaLwN, C57BL/6, C57BL/6J, C57BL/6ByJ, C57BL/6NJ, C57BL/10, C57BL/10ScSn, C57BL/10Cr, or C57BL/Ola.

In some embodiments, the animal is a rat. The rat can be selected from a Wistar rat, an LEA strain, a Sprague Dawley strain, a Fischer strain, F344, F6, and Dark Agouti. In some embodiments, the rat strain is a mix of two or more strains selected from the group consisting of Wistar, LEA, Sprague Dawley, Fischer, F344, F6, and Dark Agouti.

The animal can have one or more other genetic modifications, and/or other modifications, that are suitable for the particular purpose for which the modified immunoglobulin heavy chain (e.g., immunoglobulin μ, δ, γ, ε, or α chain) animal is made. For example, suitable mice for maintaining a xenograft (e.g., a human cancer or tumor) can have one or more modifications that compromise, inactivate, or destroy the immune system of the non-human animal in whole or in part. Compromise, inactivation, or destruction of the immune system of the non-human animal can include, for example, destruction of hematopoietic cells and/or immune cells by chemical means (e.g., administering a toxin), physical means (e.g., irradiating the animal), and/or genetic modification (e.g., knocking out one or more genes). Non-limiting examples of such mice include, e.g., NOD mice, SCID mice, NOD/SCID mice, IL2Rγ knockout mice, NOD/SCID/γcnull mice (Ito, M. et al., NOD/SCID/γcnull mouse: an excellent recipient mouse model for engraftment of human cells, Blood 100 (9): 3175-3182, 2002), nude mice, and Rag1 and/or Rag2 knockout mice. These mice can optionally be irradiated, or otherwise treated to destroy one or more immune cell types.

The present disclosure provides genetically modified non-human animals that comprise a modification of an endogenous non-human immunoglobulin heavy chain (e.g., immunoglobulin μ, δ, γ, ε or α chain) locus. In some embodiments, the modification can comprise a nucleic acid sequence encoding at least a portion of a mature immunoglobulin heavy chain (e.g., immunoglobulin μ, δ, γ, ε or α chain) protein (e.g., at least 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, 95%, 96%, 97%, 98%, or 99% identical to the mature immunoglobulin heavy chain (e.g., immunoglobulin μ, δ, γ, ε or α chain) protein sequence). Although genetically modified cells are also provided that can comprise the modifications described herein (e.g., ES cells, somatic cells), in many embodiments, the genetically modified non-human animals comprise the modification of the endogenous immunoglobulin heavy chain (e.g., immunoglobulin μ, δ, γ, ε or α chain) locus in the germline of the animal.

Genetically modified animals can express a modified immunoglobulin heavy chain (e.g., immunoglobulin μ, δ, γ, ε or α chain) from endogenous mouse loci, wherein a sequence encoding a transmembrane domain and/or a linker sequence has been inserted into the endogenous mouse immunoglobulin heavy chain (e.g., immunoglobulin μ, δ, γ, ε or α chain) gene. The encoded amino acid sequence can be at least 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, 95%, 96%, 97%, 98%, or 99% identical to the immunoglobulin heavy chain (e.g., immunoglobulin μ, δ, γ, ε or α chain) sequence.

In some embodiments, the genetically modified mice express the modified immunoglobulin heavy chain (e.g., immunoglobulin μ, δ, γ, ε or α chain) from endogenous loci that are under control of mouse promoters and/or mouse regulatory elements. The modification at the endogenous mouse loci provides non-human animals that express the modified immunoglobulin heavy chain (e.g., immunoglobulin μ, δ, γ, ε or α chain) in appropriate cell types and in a manner that does not result in the potential pathologies observed in some other transgenic mice known in the art. The modified immunoglobulin heavy chain (e.g., immunoglobulin μ, δ, γ, ε or α chain) expressed in the animal can maintain one or more functions of the wildtype endogenous immunoglobulin heavy chain (e.g., immunoglobulin μ, δ, γ, ε or α chain) in the animal. As used herein, the term “endogenous immunoglobulin heavy chain” refers to an immunoglobulin heavy chain (e.g., immunoglobulin μ, δ, γ, ε or α chain, or IGHG1) protein that is expressed from an endogenous immunoglobulin heavy chain nucleotide sequence of the non-human animal (e.g., mouse) before any genetic modification.

The genome of the animal can comprise a sequence encoding an amino acid sequence that is at least 70%, 75%, 80%, 85%, 90%, 95%, 99%, or 100% identical to the modified immunoglobulin heavy chain described herein. In some embodiments, the genome comprises a sequence encoding an amino acid sequence that is at least 70%, 75%, 80%, 85%, 90%, 95%, 99%, or 100% identical to SEQ ID NO: 9 or 22.

In some embodiments, the exogenous sequence is inserted within the endogenous immunoglobulin heavy chain gene locus, e.g., within, before, or after, exon 1, exon 2, exon 3, exon 4, exon 5, exon 6, M1 exon, M2 exon, the first 3’-UTR, the second 3’-UTR, the 3’-UTR for the secreted form, the 3’-UTR for the membrane-bound form, the first intron, the second intron, the third intron, the fourth intron, or the fifth intron.

Furthermore, the genetically modified animal can be heterozygous with respect to the insertion at the endogenous immunoglobulin heavy chain locus, or homozygous with respect to the insertion at the endogenous immunoglobulin heavy chain locus.

The present disclosure also provides a genetically modified animal that can have at least one cell comprising a nucleic acid encoding a fusion polypeptide, wherein the fusion polypeptide comprises an immunoglobulin heavy chain and a membrane bound domain, wherein the membrane bound domain is linked to the immunoglobulin heavy chain by a heterogeneous amino acid sequence.

The present disclosure further relates to a non-human mammal generated through the method mentioned above.

In some embodiments, the non-human mammal is a rodent, and preferably, the non-human mammal is a mouse.

The present disclosure further relates to a cell or cell line, or a primary cell culture thereof, derived from the non-human mammal or an offspring thereof. The present disclosure also relates to a tissue, an organ, or a culture thereof derived from the non-human mammal or an offspring thereof.

The present disclosure also provides non-human mammals produced by any of the methods described herein. In some embodiments, a non-human mammal is provided; and the genetically modified animal contains the DNA encoding modified immunoglobulin heavy chain in the genome of the animal.

In some embodiments, the non-human mammal comprises the genetic construct as described herein (e.g., gene construct as shown in FIG. 2) . In some embodiments, a non-human mammal expressing modified immunoglobulin heavy chain is provided.

In some embodiments, the expression of modified immunoglobulin heavy chain in a genetically modified animal is controllable, as by the addition of a specific inducer or repressor substance.

Non-human mammals can be any non-human animal known in the art that can be used in the methods described herein. Preferred non-human animals are mammals (e.g., rodents). In some embodiments, the non-human mammal is a mouse.

Genetic, molecular and behavioral analyses of the non-human mammals described above can be performed. The present disclosure also relates to the progeny produced when the non-human mammal provided by the present disclosure is mated with an animal of the same or another genotype.

The present disclosure also provides a cell line or primary cell culture derived from the non-human mammal or a progeny thereof. A model based on cell culture can be prepared, for example, by the following methods. Cell cultures can be obtained by isolation from a non-human mammal; alternatively, cells can be obtained from a cell culture established using the same constructs and standard cell transfection techniques. The integration of genetic constructs containing DNA sequences encoding the modified immunoglobulin heavy chain protein can be detected by a variety of methods.

There are many analytical methods that can be used to detect the genetic modification, including methods at the nucleic acid level (e.g., PCR, RT-PCR, Southern blotting, and in situ hybridization) and methods at the protein level (e.g., histochemistry, immunoblot analysis, and in vitro binding studies). In addition, the expression level of the gene of interest can be quantified by ELISA techniques well known to those skilled in the art. Many standard analysis methods can be used to complete quantitative measurements. For example, transcription levels can be measured using RT-PCR and hybridization methods including RNase protection, Southern blot analysis, and RNA dot (RNAdot) analysis. Immunohistochemical staining, flow cytometry, and Western blot analysis can also be used to assess the presence of the modified immunoglobulin heavy chain peptide.

Vectors

The present disclosure relates to a targeting vector, comprising: a) a DNA fragment homologous to the 5’ end of a region to be altered (5’ arm), which is selected from the endogenous immunoglobulin heavy chain (e.g., immunoglobulin μ, δ, γ, ε or α chain, or IGHG1) gene genomic DNAs in the length of 100 to 10,000 nucleotides; b) a desired/donor DNA sequence encoding a donor region; and c) a second DNA fragment homologous to the 3’ end of the region to be altered (3’ arm), which is selected from the endogenous immunoglobulin heavy chain (e.g., immunoglobulin μ, δ, γ, ε or α chain, or IGHG1) gene genomic DNAs in the length of 100 to 10,000 nucleotides.

In some embodiments, a) the DNA fragment homologous to the 5’ end of a conversion region to be altered (5’ arm) is selected from the nucleotide sequences that have at least 90% homology to the NCBI accession number NC_000078.6; c) the DNA fragment homologous to the 3’ end of the region to be altered (3’ arm) is selected from the nucleotide sequences that have at least 90% homology to the NCBI accession number NC_000078.6.

In some embodiments, a) the DNA fragment homologous to the 5’end of a region to be altered (5’arm) is selected from the nucleotides from the position 113328979 to the position 113331115 of the NCBI accession number NC_000078.6; c) the DNA fragment homologous to the 3’end of the region to be altered (3’arm) is selected from the nucleotides from the position 113327084 to the position 113328975 of the NCBI accession number NC_000078.6.
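As a quick arithmetic check on the coordinates above, the following sketch computes the length of each homology arm and the number of positions lying between them. It assumes the stated positions are 1-based and inclusive (the usual NCBI convention); the variable and function names are illustrative only. The 5’ arm has the higher coordinates, which would be consistent with the gene lying on the reverse strand of the reference sequence, although the text does not state this.

```python
# Coordinates on NC_000078.6 as stated in the text (assumed 1-based, inclusive).
FIVE_PRIME_ARM = (113328979, 113331115)
THREE_PRIME_ARM = (113327084, 113328975)


def span_length(start: int, end: int) -> int:
    """Length of an inclusive 1-based coordinate range."""
    return end - start + 1


five_len = span_length(*FIVE_PRIME_ARM)
three_len = span_length(*THREE_PRIME_ARM)

# Number of reference positions lying strictly between the two arms.
gap = FIVE_PRIME_ARM[0] - THREE_PRIME_ARM[1] - 1

print(f"5' arm: {five_len} nt")
print(f"3' arm: {three_len} nt")
print(f"combined: {five_len + three_len} nt, gap between arms: {gap} nt")
```

Under these assumptions both arms fall within the 100 to 10,000 nucleotide range recited for the targeting vector.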

In some embodiments, the length of the selected genomic nucleotide sequence in the targeting vector can be more than about 3 kb, about 3.5 kb, about 4 kb, about 4.5 kb, about 5 kb, about 5.5 kb, or about 6 kb.

In some embodiments, the region to be altered is exon 1, exon 2, exon 3, exon 4, exon 5, exon 6, CH1, CH2, CH3, CH4, M1, M2, the first 3’-UTR, the second 3’-UTR, the 3’-UTR for the soluble form, the 3’-UTR for the membrane-bound form, intron 1, intron 2, intron 3, intron 4, intron 5 of immunoglobulin heavy chain (e.g. immunoglobulin μ, δ, γ, ε or α chain, or IGHG1) gene (e.g., exon 4 of mouse IGHG1 gene) .

The targeting vector can further include a selected gene marker.

In some embodiments, the inserted sequence is at least 60%, 70%, 80%, 90%, 95%, or 99% identical to SEQ ID NO: 7 or SEQ ID NO: 20.

The disclosure also relates to a cell comprising the targeting vectors as described above.

In addition, the present disclosure further relates to a non-human mammalian cell, having any one of the foregoing targeting vectors, and one or more in vitro transcripts of the construct as described herein. In some embodiments, the cell includes Cas9 mRNA or an in vitro transcript thereof.

In some embodiments, the genes in the cell are heterozygous. In some embodiments, the genes in the cell are homozygous.

In some embodiments, the non-human mammalian cell is a mouse cell. In some embodiments, the cell is a fertilized egg cell.

Methods of making genetically modified animals

Genetically modified animals can be made by several techniques that are known in the art, including, e.g., nonhomologous end-joining (NHEJ), homologous recombination (HR), zinc finger nucleases (ZFNs), transcription activator-like effector-based nucleases (TALEN), and the clustered regularly interspaced short palindromic repeats (CRISPR)-Cas system. In some embodiments, homologous recombination is used. In some embodiments, CRISPR-Cas9 genome editing is used to generate genetically modified animals. Many of these genome editing techniques are known in the art, and are described, e.g., in Yin et al., "Delivery technologies for genome editing," Nature Reviews Drug Discovery 16.6 (2017): 387-399, which is incorporated by reference in its entirety. Many other methods are also provided and can be used in genome editing, e.g., micro-injecting a genetically modified nucleus into an enucleated oocyte, and fusing an enucleated oocyte with another genetically modified cell.

Thus, in some embodiments, the disclosure provides inserting in at least one cell of the animal, at an endogenous immunoglobulin heavy chain gene locus, a sequence encoding a membrane bound domain and optionally with a linker sequence. In some embodiments, the insertion occurs in a germ cell, a somatic cell, a blastocyst, or a fibroblast, etc. The nucleus of a somatic cell or the fibroblast can be inserted into an enucleated oocyte.

FIG. 2 shows a targeting strategy for a mouse immunoglobulin heavy chain locus. In FIG. 2, the targeting strategy involves a vector comprising a 5’ homologous arm, the inserted gene fragment, and a 3’ homologous arm. The process can involve replacing the endogenous immunoglobulin heavy chain sequence with the sequence on the vector by homologous recombination. In some embodiments, cleavage upstream and downstream of the target site (e.g., by zinc finger nucleases, TALEN or CRISPR) can result in a DNA double-strand break, and homologous recombination is used to replace the endogenous immunoglobulin heavy chain sequence with the sequence in the vector.

Thus, in some embodiments, the methods for making a genetically modified cell can include the step of replacing, at an endogenous immunoglobulin heavy chain locus (or site), a nucleic acid sequence encoding a region of the endogenous immunoglobulin heavy chain with a sequence comprising the inserted sequence (e.g., exogenous sequence).

The present disclosure further provides a method for establishing an animal model with modified immunoglobulin heavy chain gene, involving the following steps:

(a) providing the cell (e.g. a fertilized egg cell) based on the methods described herein;

(b) culturing the cell in a liquid culture medium;

(c) transplanting the cultured cell to the fallopian tube or uterus of the recipient female non-human mammal, allowing the cell to develop in the uterus of the female non-human mammal;

(d) identifying germline transmission in the genetically modified offspring of the pregnant female of step (c).

In some embodiments, the non-human mammal in the foregoing method is a mouse (e.g., a C57BL/6 mouse or a BALB/c mouse) .

In some embodiments, the non-human mammal in step (c) is a female with pseudo pregnancy (or false pregnancy) .

In some embodiments, the fertilized eggs for the methods described above are C57BL/6 fertilized eggs. Other fertilized eggs that can also be used in the methods as described herein include, but are not limited to, FVB/N fertilized eggs, BALB/c fertilized eggs, DBA/1 fertilized eggs and DBA/2 fertilized eggs.

Fertilized eggs can come from any non-human animal, e.g., any non-human animal as described herein. In some embodiments, the fertilized egg cells are derived from rodents. The genetic construct can be introduced into a fertilized egg by microinjection of DNA. For example, a fertilized egg can be cultured after microinjection and then transferred to a pseudopregnant non-human animal, which then gives birth to a non-human mammal, so as to generate the non-human mammal mentioned in the methods described above.

The present disclosure also provides a genetically modified non-human animal, wherein the genome of the animal comprises, from 5’ to 3’ at the endogenous immunoglobulin heavy chain locus, (a) a first DNA sequence; optionally (b) a second DNA sequence comprising an exogenous sequence; (c) a third DNA sequence; and (d) a fourth DNA sequence, wherein the first DNA sequence, the optional second DNA sequence, the third DNA sequence, and the fourth DNA sequence are linked, wherein the first DNA sequence comprises a sequence that is at least 80% identical to an endogenous immunoglobulin heavy chain gene sequence that is located upstream of the 3’-UTR for the secretive form of the immunoglobulin heavy chain, the second DNA sequence can have a length of 0 nucleotides to 300 nucleotides, the third DNA sequence encodes a membrane bound domain, and the fourth DNA sequence comprises a sequence that is at least 80% identical to a sequence that is located downstream of the 3’-UTR for the secretive form of the immunoglobulin heavy chain.

In some embodiments, the nucleotide sequences as described herein do not overlap with each other (e.g., the first nucleotide sequence, the second nucleotide sequence, the third nucleotide sequence, and/or the fourth nucleotide sequence do not overlap) . In some embodiments, the amino acid sequences as described herein do not overlap with each other.

Because of the insertion of the membrane bound domain, the genome of the genetically modified non-human animal can comprise at least one chromosome comprising two or more sequences encoding a membrane bound domain at an endogenous immunoglobulin heavy chain locus. In some embodiments, one membrane bound domain is the endogenous membrane bound domain. In some embodiments, at least one membrane bound domain is an exogenous membrane bound domain.

Methods of using genetically modified animals

Genetically modified animals that express the modified immunoglobulin heavy chain, e.g., in a physiologically appropriate manner, provide a variety of uses that include, but are not limited to, antibody screening.

The widely used hybridoma technology starts by injecting a mouse (or other mammal) with an antigen that provokes an immune response. White blood cells (e.g., spleen cells or B cells) that produce antibodies that bind to the antigen are then harvested from the mammal. These isolated cells are in turn fused with immortal B cell cancer cells (myeloma cells) to produce a hybrid cell line called a hybridoma. The hybridomas are then screened for antibodies that can bind to the antigen of interest. This process is laborious and time consuming.

The present disclosure provides a transgenic non-human animal with genetically modified immunoglobulin heavy chain. Because of the addition of the membrane bound domain to the soluble form of the immunoglobulin heavy chain, the antibodies produced by the immune cells in the transgenic animal can be retained on the surface of the immune cells. After the animal is exposed to the antigen, the immune cells can display these antibodies on their surface. These immune cells can be sorted and isolated based on the binding affinity of these antibodies with the antigen. The isolated cells can then be used to make hybridomas. Alternatively, the sequence of these cells can be analyzed for further modification and optimization (e.g., humanization) . The methods described herein can avoid manual screening of the positive hybridoma clones, which is laborious and time-consuming, and can greatly facilitate the development of antibody therapeutics.

Thus, the disclosure provides methods of isolating an immune cell that expresses an antibody that specifically binds to an antigen. The methods involve immunizing the genetically modified animal with the antigen; collecting a plurality of immune cells from the animal; isolating one or more cells that express an antibody that specifically binds to the antigen. In some embodiments, the immune cells are cells from the spleen (e.g., B cells) .

The cells can be isolated by various sorting techniques that are known in the art, e.g., fluorescence activated cell sorting (FACS) or magnetic-activated cell sorting (MACS) . For MACS, the magnetic nanoparticles can be coated with the antigen of interest.

In some embodiments, the sequences of the antibodies are determined, e.g., by sequencing the CDRs, VH, VL region of the isolated cells. These sequences can be used for further optimization, e.g., humanization, increasing binding affinities, making scFv etc.

In some implementations, the antibody produced by the methods described herein can specifically bind to the antigen of interest with a dissociation rate (koff) of less than 0.1 s^-1, less than 0.01 s^-1, less than 0.001 s^-1, less than 0.0001 s^-1, or less than 0.00001 s^-1. In some embodiments, the dissociation rate (koff) is greater than 0.01 s^-1, greater than 0.001 s^-1, greater than 0.0001 s^-1, greater than 0.00001 s^-1, or greater than 0.000001 s^-1. In some embodiments, the kinetic association rate (kon) is greater than 1 x 10²/Ms, greater than 1 x 10³/Ms, greater than 1 x 10⁴/Ms, greater than 1 x 10⁵/Ms, or greater than 1 x 10⁶/Ms. In some embodiments, the kinetic association rate (kon) is less than 1 x 10⁵/Ms, less than 1 x 10⁶/Ms, or less than 1 x 10⁷/Ms. Affinities can be deduced from the quotient of the kinetic rate constants (KD = koff/kon). In some embodiments, KD is less than 1 x 10^-6 M, less than 1 x 10^-7 M, less than 1 x 10^-8 M, less than 1 x 10^-9 M, or less than 1 x 10^-10 M. In some embodiments, the KD is less than 50 nM, 30 nM, 20 nM, 15 nM, 10 nM, 9 nM, 8 nM, 7 nM, 6 nM, 5 nM, 4 nM, 3 nM, 2 nM, or 1 nM. In some embodiments, KD is greater than 1 x 10^-7 M, greater than 1 x 10^-8 M, greater than 1 x 10^-9 M, greater than 1 x 10^-10 M, greater than 1 x 10^-11 M, or greater than 1 x 10^-12 M.
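The relationship KD = koff/kon can be illustrated with a short calculation. The sketch below is only a worked example: the rate constants are illustrative values chosen from within the ranges recited above, not measurements from this disclosure, and the function name is hypothetical.

```python
def dissociation_constant(koff_per_s: float, kon_per_molar_per_s: float) -> float:
    """Return KD in mol/L from koff (1/s) and kon (1/(M*s)) using KD = koff / kon."""
    if kon_per_molar_per_s <= 0:
        raise ValueError("kon must be positive")
    return koff_per_s / kon_per_molar_per_s


# Illustrative values only (not data from this disclosure).
koff = 1e-3  # 1/s
kon = 1e5    # 1/(M*s)

kd = dissociation_constant(koff, kon)
print(f"KD = {kd:.1e} M ({kd * 1e9:.1f} nM)")
```

With these example inputs the quotient is about 1 x 10^-8 M, i.e., roughly 10 nM; a slower dissociation rate or a faster association rate would lower KD proportionally.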

In some embodiments, the methods described herein are used to develop antibodies that can specifically bind to programmed cell death protein 1 (PD-1), cytotoxic T-lymphocyte-associated protein 4 (CTLA-4), Lymphocyte Activating 3 (LAG-3), B And T Lymphocyte Associated (BTLA), Programmed Cell Death 1 Ligand 1 (PD-L1), CD27, CD28, CD40, CD47, CD137, CD154, T-Cell Immunoreceptor With Ig And ITIM Domains (TIGIT), T-cell Immunoglobulin and Mucin-Domain Containing-3 (TIM-3), Glucocorticoid-Induced TNFR-Related Protein (GITR), or TNF Receptor Superfamily Member 4 (TNFRSF4 or OX40).

EXAMPLES

The invention is further described in the following examples, which do not limit the scope of the invention described in the claims.

Materials and Methods

The following materials were used in the following examples.

C57BL/6 mice were purchased from the China Food and Drugs Research Institute National Rodent Experimental Animal Center.

BALB/c mice were obtained from Beijing Vital River Laboratory Animal Technology Co., Ltd.

EcoRI, BamHI, EcoRV, ScaI, SacI, NdeI, BstXI, StuI restriction enzymes were purchased from NEB (Catalog numbers: R3101M, R3136M, R0195S, R3122M, R0165S, R0111S, R0113S, R0187M) .

Ambion in vitro transcription kit was purchased from Ambion (Catalog number: AM1354) .

UCA kit was obtained from Beijing Biocytogen Co., Ltd. (Catalog number: BCG-DX-001) .

Cas9 mRNA was purchased from SIGMA (Catalog number: CAS9MRNA-1EA) .

AIO kit was obtained from Beijing Biocytogen Co., Ltd. (Catalog number: BCG-DX-004) .

The pHSG299 was purchased from Takara (Catalog number: 3299) .

Biotinylated Human CD27 Fc Tag was purchased from ACRO Biosystems (Catalog number: TN7-H82F6) .

Biotinylated Human TIGIT Fc Tag was purchased from ACRO Biosystems (Catalog number: TIT-H82F1) .

Anti-mouse CD16/32 was purchased from Biolegend (Catalog number: 101301).

Human CD27 Fc Tag was purchased from ACRO Biosystems (Catalog number: CD7-H5254) .

Human TIGIT Fc Tag was purchased from ACRO Biosystems (Catalog number: TIT-H5254) .

Biotinylated Human CD27 Ligand/CD70 Protein, Fc Tag was purchased from ACRO Biosystems (Catalog number: TN7-H82F4) .

Anti-Biotin MicroBeads were purchased from Miltenyi Biotec (Catalog number: 130-090-485).

PE Streptavidin was purchased from Biolegend (Catalog number: 405203) .

FITC anti-mouse CD19 Antibody was purchased from Biolegend (Catalog number: 101505) .

PE anti-human IgG Fc was purchased from Biolegend (Catalog number: 409303) .

Flow cytometer was purchased from BD Biosciences (model: FACS Calibur ^TM) .

EXAMPLE 1: Sequence design for modified IGHG1

A DNA fragment encoding a linker sequence F2A and a transmembrane amino acid sequence was inserted before the stop codon in the secretive form of the 3’UTR of the mouse Ighg1 gene (Gene ID: 16017) . The DNA sequence encoding F2A is SEQ ID NO: 1, and the encoded F2A amino acid sequence is set forth in SEQ ID NO: 2. The DNA sequence encoding the M1 transmembrane amino acid sequence is set forth in SEQ ID NO: 3, and the DNA sequence encoding the M2 transmembrane amino acid sequence is set forth in SEQ ID NO: 45. The amino acid sequence of the transmembrane domain is set forth in SEQ ID NO: 4.

FIG. 1 shows the schematic diagram of the engineered mouse Ighg1 gene. The targeting strategy is shown in FIG. 2. The targeting vector contains a 5’homologous arm, the inserted DNA fragment ( "A Fragment" ) , and the 3’homologous arm.

SEQ ID NO: 1

SEQ ID NO: 2

SEQ ID NO: 3

SEQ ID NO: 45

SEQ ID NO: 4:

EXAMPLE 2: Vector construction for mice with C57BL/6 background

For C57BL/6 mice, the 5’homologous arm (SEQ ID NO: 5) comprises nucleic acid 113328979-113331115 of NCBI Accession No. NC_000078.6. The 3’homologous arm comprises nucleic acid 113327084-113328975 of NCBI Accession No. NC_000078.6 with C→G mutation at position 10 (SEQ ID NO: 6) . The mutation does not change the encoded amino acid residue. The A fragment comprises F2A sequence and transmembrane sequence. The engineered mouse Ighg1 gene is shown below:

SEQ ID NO: 8 only shows the DNA sequences that are modified. The underlined sequence is the F2A sequence. The sequence with wavy line is the membrane bound domain sequence.

The amino acid sequence encoded by the modified Ighg1 gene is set forth in SEQ ID NO: 9:

In the sequence above, the underlined sequence (amino acids 325-346 of SEQ ID NO: 9) is F2A sequence. The sequence with wavy line is the membrane bound domain (including the transmembrane region) (amino acids 347-417 of SEQ ID NO: 9) .

Primers for amplifying the recombination fragments (LR, A1, A2, RR) and related sequences were designed. Among them, the LR fragment corresponds to the 5’homologous arm, the RR fragment corresponds to the 3’homologous arm, the A1 + A2 fragments correspond to F2A and the transmembrane sequence (M1 and M2) .

The primers are shown in the table below.

Table 4. Primers for recombination fragments

The recombination fragments were amplified from the genomic DNA of C57BL/6 mice. The A1 and A2 fragments were ligated by PCR (see the tables below for reaction conditions) . After sequencing, the LR, A1+A2 and RR fragments were ligated into the TV-4G plasmid in the AIO kit to obtain the TV-4G-Ighg1 vector.

Table 5. The PCR reaction system (20 μL)

Composition	Amount
10× buffer for KOD-Plus DNA polymerase	2 μL
dNTP (2 mM)	2 μL
MgSO₄ (25 mM)	0.8 μL
Upstream primer F (10 μM)	0.6 μL
Downstream primer R (10 μM)	0.6 μL
DNA template	200 ng
KOD-Plus DNA polymerase (1 U/μL)	0.6 μL
H₂O	Add to 20 μL
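The final concentrations implied by Table 5 follow from simple dilution (stock concentration × volume / total volume). The sketch below works this out and also estimates the water volume. The template stock concentration is not given in the source, so the value used here (100 ng/μL) is a hypothetical placeholder; all names are illustrative.

```python
TOTAL_UL = 20.0

# (stock concentration, unit, volume in µL), taken from Table 5.
REAGENTS = {
    "dNTP": (2.0, "mM", 2.0),
    "MgSO4": (25.0, "mM", 0.8),
    "primer F": (10.0, "µM", 0.6),
    "primer R": (10.0, "µM", 0.6),
}
BUFFER_UL = 2.0       # 10x buffer
POLYMERASE_UL = 0.6   # KOD-Plus, 1 U/µL
TEMPLATE_NG = 200.0

# Hypothetical: the source does not state the template stock concentration.
TEMPLATE_NG_PER_UL = 100.0

# Final concentration after dilution into the 20 µL reaction.
final_conc = {
    name: conc * vol / TOTAL_UL for name, (conc, _unit, vol) in REAGENTS.items()
}

template_ul = TEMPLATE_NG / TEMPLATE_NG_PER_UL
fixed_ul = BUFFER_UL + POLYMERASE_UL + sum(vol for _c, _u, vol in REAGENTS.values())
water_ul = TOTAL_UL - fixed_ul - template_ul

for name, (_conc, unit, _vol) in REAGENTS.items():
    print(f"{name}: {final_conc[name]:.2f} {unit} final")
print(f"water: {water_ul:.1f} µL (with the assumed template stock)")
```

The water volume changes with the actual template concentration; the reagent concentrations do not, since the total volume is fixed at 20 μL.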

Table 6. The PCR reaction conditions

EXAMPLE 3. Verification of TV-4G-Ighg1 vectors

Four TV-4G-Ighg1 clones were randomly selected and tested by three sets of restriction enzymes. Among them, EcoRV+ScaI should generate 2709bp+5114bp fragments; SacI should generate 2878bp+4945bp fragments; NdeI should generate 982bp+6841bp fragments.

All plasmids showed the expected results (FIG. 3). The sequences of Plasmids 1 and 4 were further confirmed by sequencing. Plasmid 4 was selected for further experiments.

EXAMPLE 4: Vector construction for mice with BALB/c background

For BALB/c mice, the 5’homologous arm comprises nucleic acid set forth in SEQ ID NO: 18. The 3’homologous arm comprises nucleic acid set forth in SEQ ID NO: 19. The A fragment comprises F2A amino acid sequence and transmembrane sequence set forth in SEQ ID NO: 20. The engineered mouse Ighg1 gene sequence is shown below:

SEQ ID NO: 21 only shows the DNA sequences that are modified. The underlined sequence is the F2A sequence. The sequence with wavy line is the membrane bound sequence.

The amino acid sequence encoded by the modified Ighg1 gene is set forth in SEQ ID NO: 22.

In the sequence above, the underlined sequence (amino acids 305-326 of SEQ ID NO: 22) is the F2A sequence. The sequence with wavy line is the membrane bound sequence (amino acids 327-397 of SEQ ID NO: 22) .

Primers for amplifying the recombination fragments (LR-b, M-b, RR-b) and related sequences were designed. Among them, the LR-b fragment corresponds to the 5’homologous arm, the RR-b fragment corresponds to the 3’homologous arm, the M-b fragments correspond to F2A and the membrane bound domain sequence (M1 and M2) .

The primers are shown in the table below.

Table 7. Primers for recombination fragments

The LR-b and RR-b recombination fragments were amplified from the genomic DNA of BALB/c mice. M-b fragment was amplified from the TV-4G-Ighg1 plasmid. The LR-b, M-b and RR-b fragments were ligated into the TV-4G plasmid in the AIO kit to obtain the TV-4G-Ighg1-b vector.

EXAMPLE 5. Verification of vectors

Six TV-4G-Ighg1-b clones were randomly selected and tested by two sets of restriction enzymes. Among them, BstXI+StuI should generate 508bp+946bp+1176bp+5203bp fragments; BamHI should generate 3595bp+4238bp fragments.

All plasmids showed the expected results (FIG. 4). The sequences of Plasmids 1 and 6 were further confirmed by sequencing. Plasmid 1 was selected for further experiments.
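One simple consistency check on the expected digestion patterns in Examples 3 and 5 is that, for a fully digested circular plasmid, the fragments from every digest should sum to the same total length. The sketch below applies that check to the fragment sizes stated in the text; the assumption of complete digestion of a circular plasmid and the function name are illustrative and not part of the source.

```python
# Expected fragment sizes (bp) as stated in Examples 3 and 5.
EXPECTED_FRAGMENTS = {
    "TV-4G-Ighg1": {
        "EcoRV+ScaI": [2709, 5114],
        "SacI": [2878, 4945],
        "NdeI": [982, 6841],
    },
    "TV-4G-Ighg1-b": {
        "BstXI+StuI": [508, 946, 1176, 5203],
        "BamHI": [3595, 4238],
    },
}


def implied_plasmid_size(digests: dict[str, list[int]]) -> int:
    """Return the plasmid length implied by the digests, or raise if they disagree."""
    totals = {name: sum(fragments) for name, fragments in digests.items()}
    if len(set(totals.values())) != 1:
        raise ValueError(f"digests imply different plasmid sizes: {totals}")
    return next(iter(totals.values()))


for vector, digests in EXPECTED_FRAGMENTS.items():
    print(vector, implied_plasmid_size(digests), "bp")
```

Each vector's digests agree with one another, which is what would be expected if the listed fragment sizes were derived from a single plasmid map.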

EXAMPLE 6: Design of sgRNA for Ighg1 gene

Several sgRNAs were designed for Ighg1 gene. The targeting site sequences on Ighg1 gene for each sgRNA are shown below:

sgRNA-1 target sequence (SEQ ID NO: 29) : 5’-ggacactgggatcatttaccagg-3’

sgRNA-2 target sequence (SEQ ID NO: 30) : 5’-ttggagccctctggtcctacagg-3’

sgRNA-3 target sequence (SEQ ID NO: 31) : 5’-ccagtgtccttggagccctctgg-3’

sgRNA-4 target sequence (SEQ ID NO: 32) : 5’-tgtaggaccagagggctccaagg-3’

sgRNA-5 target sequence (SEQ ID NO: 33) : 5’-gggatcatttaccaggagagtgg-3’

sgRNA-6 target sequence (SEQ ID NO: 34) : 5’-ggatcatttaccaggagagtggg-3’

sgRNA-7 target sequence (SEQ ID NO: 35) : 5’-agaggctcttctcagtatggtgg-3’

sgRNA-8 target sequence (SEQ ID NO: 36) : 5’-gtaaatgatcccagtgtccttgg-3’

sgRNA-9 target sequence (SEQ ID NO: 37) : 5’-cagagggctccaaggacactggg-3’
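Each listed target site is 23 nucleotides long and ends in a GG dinucleotide, which matches the common layout of a 20-nt protospacer followed by a 3-nt NGG PAM. The sketch below checks this for the nine sequences. The source does not state which Cas9 ortholog or PAM rule was used to design the sites, so the NGG rule here is an assumption, and the helper names are illustrative.

```python
import re

# Target site sequences copied from SEQ ID NO: 29-37 as listed above.
TARGETS = {
    "sgRNA-1": "ggacactgggatcatttaccagg",
    "sgRNA-2": "ttggagccctctggtcctacagg",
    "sgRNA-3": "ccagtgtccttggagccctctgg",
    "sgRNA-4": "tgtaggaccagagggctccaagg",
    "sgRNA-5": "gggatcatttaccaggagagtgg",
    "sgRNA-6": "ggatcatttaccaggagagtggg",
    "sgRNA-7": "agaggctcttctcagtatggtgg",
    "sgRNA-8": "gtaaatgatcccagtgtccttgg",
    "sgRNA-9": "cagagggctccaaggacactggg",
}

# Assumed PAM rule (NGG); not stated explicitly in the source.
NGG = re.compile(r"^[acgt]gg$")


def split_target(site: str) -> tuple[str, str]:
    """Split a site into (protospacer, PAM), assuming the PAM is the last 3 nt."""
    return site[:-3], site[-3:]


def is_valid_site(site: str) -> bool:
    """True if the site is a 20-nt protospacer followed by an NGG PAM."""
    protospacer, pam = split_target(site.lower())
    return len(protospacer) == 20 and bool(NGG.match(pam))


for name, site in TARGETS.items():
    protospacer, pam = split_target(site)
    print(name, protospacer, pam, "ok" if is_valid_site(site) else "check")
```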

EXAMPLE 7: Testing sgRNA activity

The UCA kit was used to detect the activities of the sgRNAs (FIG. 5 and Table 8). The results show that the sgRNAs had different activities. sgRNA-4 was selected for further experiments.

Forward oligonucleotide sequence and reverse oligonucleotide sequence based on sgRNA4 were synthesized.

sgRNA-4 upstream: 5’-TAGGACCAGAGGGCTCCA-3’(SEQ ID NO: 46)

Forward sequence: 5’-TAGGTAGGACCAGAGGGCTCCA-3’(SEQ ID NO: 38)

sgRNA-4 downstream: 5’-TGGAGCCCTCTGGTCCTA-3’(SEQ ID NO: 47)

Reverse sequence: 5’-AAACTGGAGCCCTCTGGTCCTA-3’(SEQ ID NO: 39)
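The four oligonucleotides above are related in a way that can be reproduced programmatically: the 18-nt upstream sequence corresponds to positions 3 to 20 of the sgRNA-4 target site, the downstream sequence is its reverse complement, and the forward and reverse oligonucleotides carry the 4-nt prefixes TAGG and AAAC, respectively. The sketch below reconstructs them from the target site. The trimming and prefixes are read off the listed sequences; the source does not explain their rationale, and the function and variable names are illustrative.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper- or lower-case DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


# sgRNA-4 target site (SEQ ID NO: 32), including the 3-nt PAM at the 3' end.
SGRNA4_TARGET = "tgtaggaccagagggctccaagg"

# Positions 3-20 of the target site (0-based slice 2:20), as in SEQ ID NO: 46.
upstream = SGRNA4_TARGET[2:20].upper()
downstream = reverse_complement(upstream)

# 4-nt prefixes as they appear in SEQ ID NO: 38 and 39.
forward_oligo = "TAGG" + upstream
reverse_oligo = "AAAC" + downstream

print("upstream  :", upstream)
print("forward   :", forward_oligo)
print("downstream:", downstream)
print("reverse   :", reverse_oligo)
```

The reconstructed strings can be compared directly against SEQ ID NO: 46, 38, 47, and 39 as listed above.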

Table 8. Activities of sgRNAs

sgRNAs	Normalized Activities
Negative control (con)	1.00
Positive control (pc)	67.74
sgRNA-1	22.01
sgRNA-2	25.19
sgRNA-3	29.57
sgRNA-4	31.44
sgRNA-5	24.87
sgRNA-6	17.78
sgRNA-7	22.03
sgRNA-8	28.63
sgRNA-9	17.94
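The selection of sgRNA-4 is consistent with the normalized activities in Table 8, where it has the highest value among the nine candidates. The short sketch below ranks the candidates and reports each activity relative to the positive control. The ranking criterion (highest normalized activity) is an assumption about how the choice was made; the source only states that sgRNA-4 was selected.

```python
# Normalized activities copied from Table 8.
POSITIVE_CONTROL = 67.74
ACTIVITIES = {
    "sgRNA-1": 22.01,
    "sgRNA-2": 25.19,
    "sgRNA-3": 29.57,
    "sgRNA-4": 31.44,
    "sgRNA-5": 24.87,
    "sgRNA-6": 17.78,
    "sgRNA-7": 22.03,
    "sgRNA-8": 28.63,
    "sgRNA-9": 17.94,
}

# Sort candidates from most to least active.
ranked = sorted(ACTIVITIES.items(), key=lambda item: item[1], reverse=True)
best_name, best_activity = ranked[0]

for name, activity in ranked:
    fraction = activity / POSITIVE_CONTROL
    print(f"{name}: {activity:.2f} ({fraction:.0%} of positive control)")
print("most active candidate:", best_name)
```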

EXAMPLE 8: Constructing pT7-sgRNA G2 plasmids

A map of pT7-sgRNA G2 vector is shown in FIG. 6.

The DNA fragment containing T7 promoter and sgRNA scaffold was synthesized, and linked to the backbone vector by restriction enzyme digestion (EcoRI and BamHI) . The target plasmid was confirmed by the sequencing results.

The DNA fragment containing the T7 promoter and sgRNA scaffold (SEQ ID NO: 40) is shown below:

EXAMPLE 9: Constructing recombinant expression vectors pT7-Ighg1-4

The forward oligonucleotide sequence and the reverse oligonucleotide sequence were respectively ligated to pT7-sgRNA G2 plasmid to obtain the expression vectors pT7-Ighg1-4. The ligation reaction was set up as follows:

Table 9. The ligation reaction mix (10 μL)

sgRNA after annealing	1 μL (0.5 μM)
pT7-sgRNA G2 vector	1 μL (10 ng)
T4 DNA Ligase	1 μL (5 U)
10× T4 DNA Ligase buffer	1 μL
50% PEG4000	1 μL
H₂O	Add to 10 μL

The ligation reaction was carried out at room temperature for 10 to 30 minutes. The ligation product was then transformed into 30 μL of TOP10 competent cells. The cells were plated on a petri dish containing kanamycin and cultured at 37 ℃ for at least 12 hours. Two clones were then selected, inoculated into LB medium with kanamycin (5 ml), and cultured at 37 ℃ and 250 rpm for at least 12 hours.

Clones were randomly selected and sequenced to verify their sequences. The vectors with correct sequences were selected for subsequent experiments.

EXAMPLE 10: Microinjection and embryo transfer using C57BL/6 mice

Cas9 mRNA, the TV-4G-Ighg1 plasmid, and the in vitro transcription products of the pT7-Ighg1-4 plasmid were pre-mixed and injected into the cytoplasm or nucleus of mouse fertilized eggs (C57BL/6 background) with a microinjection instrument. The in vitro transcription was carried out with the Ambion in vitro transcription kit according to the method provided in the product instructions. The embryo microinjection was carried out according to the method described, e.g., in A. Nagy, et al., “Manipulating the Mouse Embryo: A Laboratory Manual (Third Edition),” Cold Spring Harbor Laboratory Press, 2003. The injected fertilized eggs were then transferred to a culture medium for a short culture period, and then were transplanted into the oviduct of the recipient mouse to produce the genetically modified mice (F0 generation) with C57BL/6 background. The mouse population was further expanded by cross-mating and self-mating to establish stable mouse lines.

EXAMPLE 11: Microinjection and embryo transfer using BALB/c mice

The pre-mixed Cas9 mRNA, TV-4G-Ighg1-b plasmid, and in vitro transcription products of the pT7-Ighg1-4 plasmid were injected into the cytoplasm or nucleus of mouse fertilized eggs (BALB/c background) with a microinjection instrument. The in vitro transcription was carried out with an Ambion in vitro transcription kit according to the method provided in the product instructions. The embryo microinjection was carried out according to the method described, e.g., in A. Nagy, et al., “Manipulating the Mouse Embryo: A Laboratory Manual (Third Edition),” Cold Spring Harbor Laboratory Press, 2003. The injected fertilized eggs were then transferred to a culture medium for short-term culture and then transplanted into the oviduct of the recipient mouse to produce the genetically modified mice (F0 generation) with BALB/c background. The mouse population was further expanded by cross-mating and self-mating to establish stable mouse lines.

EXAMPLE 12: Verification of genetic modification

1. Genotype determination for F0 generation mice

PCR analysis was performed using mouse tail genomic DNA of F0 generation mice. Primer L-GT-F is located to the left of the 5’ homologous arm, primer R-GT-R is located to the right of the 3’ homologous arm, and both R-GT-F and L-GT-R are located within the A fragment.

5’end primers:

Upstream: L-GT-F (SEQ ID NO: 41)

5’-ccacatgtttaggagcctgggttgacttc-3’

Downstream: L-GT-R (SEQ ID NO: 42)

5’-gggccctgggttggactccac-3’

3’end primers:

Upstream: R-GT-F (SEQ ID NO: 43)

5’-ccggtgaaacagactttgaattttgaccttc-3’

Downstream: R-GT-R (SEQ ID NO: 44)

5’-gcacggaacaaggtacacctgggacagag-3’

If the desired sequence was inserted into the correct positions in the genome, the PCR experiment using L-GT-F/L-GT-R should produce a band at about 2290 bp, and the PCR experiment using R-GT-F/R-GT-R should produce a band at about 2304 bp.
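The band-size criterion above can be sketched in Python as follows. This is a minimal illustration only: the 100 bp tolerance, the dictionary keys, and the function names are assumptions introduced here, and in practice a band call depends on the gel and the marker used.

```python
# Expected amplicon sizes (bp) stated in the text for a correct insertion.
EXPECTED_BP = {
    "5prime": 2290,  # L-GT-F / L-GT-R
    "3prime": 2304,  # R-GT-F / R-GT-R
}


def band_matches(observed_bp, expected_bp, tolerance_bp=100):
    """True if a band was observed within the (assumed) tolerance of the expected size."""
    if observed_bp is None:  # no band on the gel
        return False
    return abs(observed_bp - expected_bp) <= tolerance_bp


def is_positive(observed):
    """A mouse is called positive only if both junction PCRs give the expected band."""
    return all(
        band_matches(observed.get(junction), size)
        for junction, size in EXPECTED_BP.items()
    )


print(is_positive({"5prime": 2300, "3prime": 2300}))  # True
print(is_positive({"5prime": 2300, "3prime": None}))  # False
```

Requiring both junction bands reflects the primer design, in which each PCR pairs one primer outside a homologous arm with one primer inside the inserted fragment.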

The results for F0 generation mice with C57BL/6 background are shown in FIGS. 7A-7B. Mouse #5 was a positive mouse with the correct insertion.

The results for F0 generation mice with BALB/c background are shown in FIGS. 8A-8B. Mice #1, #2, and #3 had the correct insertion.

The PCR conditions are shown in the table below.

Table 10. The PCR reaction (20 μL)

10× buffer	2μL
dNTP (2mM)	2μL
MgSO₄ (25mM)	0.8μL
Upstream primer (10μM)	0.6μL
Downstream primer (10μM)	0.6μL
Mouse tail genomic DNA	200ng
KOD-Plus- (1U/μL)	0.6μL
ddH₂O	Add to 20μL
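The final concentrations implied by Table 10 follow from the dilution relation C1 × V1 = C2 × V2. The short Python sketch below works them out for the 20 μL reaction; the function and variable names are illustrative, and whether the stated 2 mM dNTP stock refers to each dNTP or to the total is not specified in the table.

```python
def final_concentration(stock, volume_ul, total_ul=20.0):
    """Dilution formula: final = stock * (added volume / total volume)."""
    return stock * volume_ul / total_ul


primer_um = final_concentration(10.0, 0.6)  # each primer, in μM
dntp_mm = final_concentration(2.0, 2.0)     # dNTP, in mM
mgso4_mm = final_concentration(25.0, 0.8)   # MgSO4, in mM
polymerase_u = 1.0 * 0.6                    # KOD-Plus-, units per reaction

print(f"primer: {primer_um:.2f} uM")
print(f"dNTP: {dntp_mm:.2f} mM")
print(f"MgSO4: {mgso4_mm:.2f} mM")
print(f"polymerase: {polymerase_u:.1f} U")
```

Under these assumptions, each primer is at 0.3 μM, dNTP at 0.2 mM, and MgSO₄ at 1 mM, with 0.6 U of polymerase per reaction.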

Table 11. The PCR reaction conditions

2. Genotype determination for F1 generation mice

F0 generation mice with the correct insertion were mated with wild-type mice of the same background to obtain F1 generation mice. PCR analysis was performed on the F1 generation mouse tail genomic DNA.

The C57BL/6 F1 generation mice were obtained by mating positive F0 mice of the C57BL/6 background with C57BL/6 mice. The PCR results are shown in FIGS. 9A-9B, wherein the mice labeled F1-1, F1-2, and F1-3 were positive. The positive F1 generation mice were mated with each other to obtain F2 homozygous mice.
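The expected outcome of the F1 intercross can be illustrated with a simple Mendelian calculation. The Python sketch below assumes a single autosomal locus with equal transmission and viability of both alleles; the allele symbols ("M" for the modified allele, "w" for wild type) and the function name are introduced here only for illustration.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def cross(parent_a, parent_b):
    """Return expected offspring genotype fractions for a single-locus cross.

    Each parent is a two-character string of alleles, e.g. "Mw".
    """
    outcomes = Counter(
        "".join(sorted(pair)) for pair in product(parent_a, parent_b)
    )
    total = sum(outcomes.values())
    return {genotype: Fraction(count, total) for genotype, count in outcomes.items()}


# Positive F1 heterozygotes mated with each other.
f2 = cross("Mw", "Mw")
print(f2["MM"], f2["Mw"], f2["ww"])  # 1/4 1/2 1/4
```

Under these assumptions, about one quarter of the F2 offspring would be homozygous for the insertion, which is why the F2 generation is genotyped to identify the homozygous mice.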

The results show that the method described herein can be used to establish Ighg1 genetically engineered mice.

EXAMPLE 13: Antibody screening

Experiments were performed to illustrate how to use the genetically engineered mice prepared by the methods described herein for antibody screening.

One Ighg1 gene-modified homozygous mouse (C57BL/6 background) and one wild-type mouse were used in the experiments. The extracellular region of hCD27 protein was injected into the mice for subcutaneous immunization. Booster immunization was performed every 2-4 weeks for 2-5 months. The mice were then euthanized.

Spleen samples were collected and ground. The samples were then passed through a 70 μm cell mesh. The filtered cell suspensions were centrifuged and the supernatants were discarded. Erythrocyte lysis solution was added to the sample, which was lysed for 3-5 minutes on ice and neutralized with PBS solution. The solution was centrifuged again and the supernatants were discarded. Anti-mouse CD16/32 antibody was added. The cells were resuspended, placed on ice for 15 minutes, and then dispensed into 1.5 mL Eppendorf tubes (4 tubes in each group).

Biotinylated human CD27 Fc Tag (FIGS. 10A and 10E) and human CD27 Fc Tag (FIGS. 10C and 10G) were added. Biotinylated human TIGIT Fc Tag (FIGS. 10B and 10F) and human TIGIT Fc Tag (FIGS. 10D and 10H) were added as negative controls. The cells were incubated on ice for 15 minutes and washed repeatedly with PBS. The supernatant was removed after centrifugation.

PE Streptavidin/FITC anti-mouse CD19 Antibody (FIGS. 10A, 10B, 10E, and 10F) or PE anti-human IgG Fc/FITC anti-mouse CD19 Antibody (FIGS. 10C, 10D, 10G, and 10H) was added, and the cells were incubated on ice for 15 minutes. The cells were then washed with PBS, and the supernatants were discarded after centrifugation. The cells were then resuspended in PBS and analyzed by FACS.

The results showed that, compared with wild-type mice, the genetically modified mice showed a significant positive cell population in the Q1 region after immunization with hCD27 (FIGS. 10E and 10G). The X axis shows the signal for PE and the Y axis shows the signal for FITC.

No positive cell population appeared in the Q1 region of the wild-type control (FIGS. 10A and 10C) or the irrelevant antibody control (FIGS. 10B, 10F, 10D, and 10H).

The results indicate anti-hCD27 antibody can be detected on the surface of mouse spleen cells, demonstrating that the secreted form of the antibodies can be genetically engineered and retained on the surface of mouse spleen cells.

Furthermore, these spleen cells were sorted by magnetic-activated cell sorting (MACS) with anti-biotin MicroBeads. These cells were first treated with biotinylated Human CD27 Ligand/CD70 Protein, Fc Tag. MACS was then performed with anti-biotin MicroBeads. The sorting method enriched spleen cells that express CD27-specific antibodies (FIG. 11). Thus, the results showed that the secreted form of the antibodies can be retained on the surface of mouse spleen cells, and the spleen cells that express these antibodies can be captured. These cells can be used for subsequent expression, modification and sequence modification.

EXAMPLE 14: Methods Based On Embryonic Stem Cell Technologies

The non-human mammals described herein can also be prepared through other gene editing systems and approaches, including but not limited to: homologous recombination techniques based on embryonic stem (ES) cells, zinc finger nuclease (ZFN) techniques, transcription activator-like effector nuclease (TALEN) techniques, homing endonucleases (meganucleases), or other techniques.

Based on the Ighg1 gene as shown in FIG. 1, a targeting strategy for generating the engineered Ighg1 mouse model with embryonic stem cell technologies is developed (FIG. 12). Since the objective is to add a transmembrane sequence into the Ighg1 gene, a recombinant vector that contains a 5’ homologous arm, a 3’ homologous arm, and a sequence fragment containing the transmembrane sequence (SEQ ID NO: 3 and SEQ ID NO: 45) is designed. The vector can also contain a resistance gene for positive clone screening, such as the neomycin phosphotransferase coding sequence (Neo). Two site-specific recombination sites in the same orientation, such as Frt or LoxP sites, can be added on both sides of the resistance gene. Furthermore, a coding gene for a negative screening marker, such as the diphtheria toxin A subunit coding gene (DTA), can be constructed downstream of the 3’ homologous arm of the recombinant vector.

Vector construction can be carried out using methods known in the art, such as restriction enzyme digestion. The recombinant vector with the correct sequence can then be transfected into mouse embryonic stem cells, such as C57BL/6 mouse embryonic stem cells. The transfected cells are then screened using the positive clone marker gene, and Southern blotting can be used to identify DNA recombination. Cells from the correct positive clones (black mice) are injected into isolated blastocysts (white mice) by microinjection according to the method described in A. Nagy, et al., “Manipulating the Mouse Embryo: A Laboratory Manual (Third Edition),” Cold Spring Harbor Laboratory Press, 2003. The resulting chimeric blastocysts are transferred to a culture medium for short-term culture and then transplanted into the oviducts of recipient mice (white mice) to produce F0 generation chimeric mice (black and white). The F0 generation chimeric mice with correct gene recombination are then selected, by extracting mouse tail genomic DNA and performing PCR, for subsequent breeding and identification. The F1 generation mice are obtained by mating the F0 generation chimeric mice with wild-type mice. F1 heterozygous mice that are stably positive for gene recombination are selected by extracting mouse tail genomic DNA and performing PCR. Next, the F1 heterozygous mice are mated with each other to obtain F2 generation homozygous mice that are positive for gene recombination. In addition, the F1 heterozygous mice can also be mated with Flp or Cre mice to remove the positive clone screening marker gene (e.g., neo), and then mice homozygous for the modified Ighg1 gene can be obtained by mating these mice with each other.
The methods of genotyping and using the F1 heterozygous mice or F2 homozygous mice are similar to the methods as described in the examples above.

OTHER EMBODIMENTS

It is to be understood that while the invention has been described in conjunction with the detailed description thereof, the foregoing description is intended to illustrate and not limit the scope of the invention, which is defined by the scope of the appended claims. Other aspects, advantages, and modifications are within the scope of the following claims.

Claims

A genetically-modified, non-human animal whose genome comprises at least one chromosome comprising an insertion of a sequence encoding a membrane bound domain at an endogenous immunoglobulin heavy chain locus.
The animal of claim 1, wherein the immunoglobulin heavy chain locus comprises a first 3’-UTR region for the secretive form of the immunoglobulin heavy chain and a second 3’-UTR region for the membrane bound form of the immunoglobulin heavy chain, wherein the sequence encoding the membrane bound domain is inserted before the first 3’-UTR region.
The animal of claim 1, wherein the immunoglobulin heavy chain locus comprises a sequence encoding an immunoglobulin CH1 domain, an immunoglobulin CH2 domain, an immunoglobulin CH3 domain, wherein the membrane bound domain is linked to the immunoglobulin CH3 domain.
The animal of claim 3, wherein the membrane bound domain is linked to the immunoglobulin CH3 domain by an oligopeptide sequence.
The animal of claim 4, wherein the oligopeptide sequence is a 2A oligopeptide sequence (e.g., F2A).
The animal of any one of claims 4-5, wherein the 2A oligopeptide sequence comprises a sequence that is at least 80% identical to SEQ ID NO: 2.
The animal of any one of claims 1-6, wherein the membrane bound domain is an immunoglobulin heavy chain membrane bound domain.
The animal of claim 7, wherein the membrane bound domain comprises a sequence that is at least 80% identical to SEQ ID NO: 4.
The animal of any one of claims 1-8, wherein the immunoglobulin heavy chain locus comprises a sequence that is at least 80% identical to SEQ ID NO: 8.
The animal of any one of claims 1-8, wherein the immunoglobulin heavy chain locus comprises a sequence that encodes an amino acid sequence that is at least 80% identical to SEQ ID NO: 9.
The animal of any one of claims 1-8, wherein the immunoglobulin heavy chain locus comprises a sequence that is at least 80% identical to SEQ ID NO: 21.
The animal of any one of claims 1-8, wherein the immunoglobulin heavy chain locus comprises a sequence that encodes an amino acid sequence that is at least 80% identical to SEQ ID NO: 22.
The animal of any one of claims 1-12, wherein the sequence is inserted in exon 4.
The animal of any one of claims 1-13, wherein the animal is heterozygous with respect to the insertion at the endogenous immunoglobulin heavy chain gene locus.
The animal of any one of claims 1-13, wherein the animal is homozygous with respect to the insertion at the endogenous immunoglobulin heavy chain gene locus.
The animal of any one of claims 1-15, wherein the animal does not express endogenous immunoglobulin heavy chain.
The animal of any one of claims 1-16, wherein the animal is a mammal, e.g., a monkey, a rodent or a mouse.
The animal of any one of claims 1-17, wherein the animal is a mouse.
The animal of any one of claims 1-18, wherein the endogenous immunoglobulin heavy chain locus is immunoglobulin G heavy chain locus.
The animal of any one of claims 1-18, wherein the endogenous immunoglobulin heavy chain locus is Ighg1 locus.
A genetically-modified, non-human animal whose genome comprises at least one chromosome comprising a sequence encoding an immunoglobulin heavy chain and a membrane bound domain, wherein the membrane bound domain is linked to the immunoglobulin heavy chain by a heterogeneous amino acid sequence.
The animal of claim 21, wherein the sequence is operably linked to an endogenous immunoglobulin heavy chain locus regulatory element of the animal.
The animal of claim 21, wherein the nucleotide sequence is integrated to an endogenous immunoglobulin heavy chain gene locus of the animal.
The animal of any one of claims 21-23, wherein the heterogeneous amino acid sequence is a 2A oligopeptide sequence.
The animal of any one of claims 21-23, wherein the 2A oligopeptide sequence comprises a sequence that is at least 80% identical to SEQ ID NO: 2.
The animal of any one of claims 21-25, wherein the membrane bound domain is an immunoglobulin heavy chain membrane bound domain.
The animal of any one of claims 21-25, wherein the membrane bound domain comprises a sequence that is at least 80% identical to SEQ ID NO: 4.
The animal of any one of claims 21-25, wherein the sequence encodes an amino acid sequence that is at least 80% identical to SEQ ID NO: 9.
The animal of any one of claims 21-25, wherein the sequence encodes an amino acid sequence that is at least 80% identical to SEQ ID NO: 22.
The animal of any one of claims 21-29, wherein the animal does not express endogenous immunoglobulin heavy chain.
The animal of any one of claims 21-30, wherein the animal is a mammal, e.g., a monkey, a rodent or a mouse.
The animal of any one of claims 21-30, wherein the animal is a mouse.
The animal of any one of claims 21-32, wherein the immunoglobulin heavy chain is an immunoglobulin G heavy chain.
The animal of any one of claims 21-32, wherein the immunoglobulin heavy chain is an IgG1 heavy chain.
A non-human animal comprising at least one cell comprising a nucleic acid encoding a fusion polypeptide, wherein the fusion peptide comprises an immunoglobulin heavy chain and a membrane bound domain, wherein the membrane bound domain is linked to the immunoglobulin heavy chain by a heterogeneous amino acid sequence.
The animal of claim 35, wherein the nucleic acid is operably linked to an endogenous immunoglobulin heavy chain locus regulatory element of the animal.
The animal of claim 35, wherein the nucleic acid is integrated to an endogenous immunoglobulin heavy chain gene locus of the animal.
The animal of any one of claims 35-37, wherein the heterogeneous amino acid sequence is a 2A oligopeptide sequence.
The animal of claim 38, wherein the 2A oligopeptide sequence comprises a sequence that is at least 80% identical to SEQ ID NO: 2.
The animal of any one of claims 35-40, wherein the membrane bound domain is an immunoglobulin heavy chain membrane bound domain.
The animal of claim 40, wherein the membrane bound domain comprises a sequence that is at least 80% identical to SEQ ID NO: 4.
The animal of any one of claims 35-41, wherein the sequence encodes an amino acid sequence that is at least 80% identical to SEQ ID NO: 9.
The animal of any one of claims 35-41, wherein the sequence encodes an amino acid sequence that is at least 80% identical to SEQ ID NO: 22.
The animal of any one of claims 35-43, wherein the animal does not express an endogenous immunoglobulin heavy chain.
The animal of any one of claims 35-44, wherein the animal is a mammal, e.g., a monkey, a rodent or a mouse.
The animal of any one of claims 35-44, wherein the animal is a mouse.
The animal of any one of claims 35-46, wherein the immunoglobulin heavy chain is an immunoglobulin G heavy chain.
The animal of any one of claims 35-46, wherein the immunoglobulin heavy chain is an IgG1 heavy chain.
A genetically modified non-human animal, wherein the genome of the animal comprises from 5’ to 3’ at the endogenous immunoglobulin heavy chain locus, (a) a first DNA sequence; optionally (b) a second DNA sequence comprising an exogenous sequence; (c) a third DNA sequence; (d) a fourth DNA sequence, wherein the first DNA sequence, the optional second DNA sequence, the third DNA sequence, and the fourth DNA sequence are linked,

wherein the first DNA sequence comprises a sequence that is at least 80% identical to an endogenous immunoglobulin heavy chain gene sequence that is located upstream of 3’-UTR for the secretive form of the immunoglobulin heavy chain,

the second DNA sequence can have a length of 0 nucleotides to 300 nucleotides,

the third DNA sequence encodes a membrane bound domain,

the fourth DNA sequence comprises a sequence that is at least 80% identical to a sequence that is located downstream of 3’-UTR for the secretive form of the immunoglobulin heavy chain.
The animal of claim 49, wherein the first DNA sequence comprises exon 1, exon 2, exon 3, and/or at least 10 nucleotides from exon 4 of the immunoglobulin heavy chain gene.
The animal of claim 49 or 50, wherein the fourth DNA sequence comprises M1 and/or M2 of the immunoglobulin heavy chain gene.
The animal of any one of claims 49-51, wherein the second DNA sequence encodes a 2A oligopeptide sequence.
The animal of any one of claims 49-52, wherein the second DNA sequence comprises a sequence that is at least 80% identical to SEQ ID NO: 2.
The animal of any one of claims 49-53, wherein the membrane bound domain is an immunoglobulin heavy chain membrane bound domain.
The animal of any one of claims 49-53, wherein the membrane bound domain comprises a sequence that is at least 80% identical to SEQ ID NO: 4.
The animal of any one of claims 49-55, wherein the sequence encodes an amino acid sequence that is at least 80% identical to SEQ ID NO: 9.
The animal of any one of claims 49-55, wherein the sequence encodes an amino acid sequence that is at least 80% identical to SEQ ID NO: 22.
The animal of any one of claims 49-57, wherein the animal is a mammal, e.g., a monkey, a rodent or a mouse.
The animal of any one of claims 49-57, wherein the animal is a mouse.
The animal of any one of claims 49-59, wherein the endogenous immunoglobulin heavy chain locus is an immunoglobulin G heavy chain locus.
The animal of any one of claims 49-59, wherein the endogenous immunoglobulin heavy chain locus is an Ighg1 locus.
A method of isolating an immune cell that expresses an antibody that specifically binds to an antigen, the method comprising:

immunizing the animal of any one of claims 1-61 with the antigen;

collecting a plurality of immune cells from the animal;

isolating one or more cells that express an antibody that specifically binds to the antigen.
The method of claim 62, wherein the one or more cells that express an antibody that specifically binds to the antigen are isolated by fluorescence activated cell sorting (FACS).
The method of claim 62, wherein the one or more cells that express an antibody that specifically binds to the antigen are isolated by magnetic-activated cell sorting (MACS).
The method of claim 62, wherein the method further comprises

sequencing a nucleic acid sequence that encodes one or more complementarity-determining regions (CDRs) of the antibody that is expressed by the isolated cell.
A method for making a genetically-modified, non-human animal, comprising:

inserting a sequence encoding a membrane bound domain at an endogenous immunoglobulin heavy chain locus.
The animal of claim 66, wherein the membrane bound domain is an immunoglobulin heavy chain membrane bound domain.
The animal of claim 66, wherein the membrane bound domain comprises a sequence that is at least 80% identical to SEQ ID NO: 4.
The animal of any one of claims 66-68, wherein the locus is immunoglobulin G heavy chain locus.
The animal of any one of claims 66-69, wherein the inserted sequence further comprises a 2A oligonucleotide sequence.
A fusion immunoglobulin heavy chain peptide comprising one or more immunoglobulin heavy chain constant domains and a membrane bound domain, wherein the membrane bound domain is linked to the immunoglobulin heavy chain by a heterogeneous amino acid sequence.
The fusion peptide of claim 71, wherein the heterogeneous amino acid sequence is a 2A oligopeptide sequence.
The fusion peptide of claim 72, wherein the 2A oligopeptide sequence comprises a sequence that is at least 80% identical to SEQ ID NO: 2.
The fusion peptide of any one of claims 71-73, wherein the membrane bound domain is an immunoglobulin heavy chain membrane bound domain.
The fusion peptide of any one of claims 71-73, wherein the membrane bound domain comprises a sequence that is at least 80% identical to SEQ ID NO: 4.
The fusion peptide of any one of claims 71-73, wherein the peptide comprises an amino acid sequence that is at least 80% identical to SEQ ID NO: 9.
The fusion peptide of any one of claims 71-73, wherein the peptide comprises an amino acid sequence that is at least 80% identical to SEQ ID NO: 22.
A peptide comprising an amino acid sequence, wherein the amino acid sequence is one of the following:

(a) an amino acid sequence set forth in SEQ ID NO: 9 or 22;

(b) an amino acid sequence that is at least 90% identical to SEQ ID NO: 9 or 22;

(c) an amino acid sequence that is at least 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical to SEQ ID NO: 9 or 22;

(d) an amino acid sequence that is different from the amino acid sequence set forth in SEQ ID NO: 9 or 22 by no more than 10, 9, 8, 7, 6, 5, 4, 3, 2 or 1 amino acid; and

(e) an amino acid sequence that comprises a substitution, a deletion and/or insertion of one, two, three, four, five or more amino acids to the amino acid sequence set forth in SEQ ID NO: 9 or 22.
A nucleic acid comprising a nucleotide sequence, wherein the nucleotide sequence is one of the following:

(a) a sequence that encodes the peptide of claim 78;

(b) SEQ ID NO: 8;

(c) SEQ ID NO: 21;

(d) a sequence that is at least 90% identical to SEQ ID NO: 8 or SEQ ID NO: 21;

(e) a sequence that is at least 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical to SEQ ID NO: 8; and

(f) a sequence that is at least 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical to SEQ ID NO: 21.
A cell comprising the protein of claim 78 and/or the nucleic acid of claim 79.
An animal comprising the protein of claim 78 and/or the nucleic acid of claim 80.