WO2019037099A1 - Large scale modification of eukaryotic genome - Google Patents

Large scale modification of eukaryotic genome

Info

Publication number
WO2019037099A1
Authority
WO
WIPO (PCT)
Prior art keywords
dna fragment
unique
goi
enzyme cleavage
unique enzyme
Prior art date
Application number
PCT/CN2017/099097
Other languages
French (fr)
Inventor
Wenning Qin
Haoyi WANG
Original Assignee
Wenning Qin
Wang Haoyi
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Wenning Qin, Wang Haoyi
Priority to PCT/CN2017/099097
Publication of WO2019037099A1

Classifications

    • C - CHEMISTRY; METALLURGY
    • C12 - BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N - MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 - Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 - Recombinant DNA-technology
    • C12N15/87 - Introduction of foreign genetic material using processes not otherwise provided for, e.g. co-transformation
    • C12N15/90 - Stable introduction of foreign DNA into chromosome
    • C12N15/902 - Stable introduction of foreign DNA into chromosome using homologous recombination
    • C12N15/905 - Stable introduction of foreign DNA into chromosome using homologous recombination in yeast
    • C - CHEMISTRY; METALLURGY
    • C07 - ORGANIC CHEMISTRY
    • C07K - PEPTIDES
    • C07K16/00 - Immunoglobulins [IGs], e.g. monoclonal or polyclonal antibodies
    • C - CHEMISTRY; METALLURGY
    • C12 - BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N - MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 - Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 - Recombinant DNA-technology
    • C12N15/63 - Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
    • C12N15/79 - Vectors or expression systems specially adapted for eukaryotic hosts
    • C12N15/85 - Vectors or expression systems specially adapted for eukaryotic hosts for animal cells
    • C - CHEMISTRY; METALLURGY
    • C07 - ORGANIC CHEMISTRY
    • C07K - PEPTIDES
    • C07K2317/00 - Immunoglobulins specific features
    • C07K2317/10 - Immunoglobulins specific features characterized by their source of isolation or production
    • C07K2317/14 - Specific host cells or culture conditions, e.g. components, pH or temperature
    • C - CHEMISTRY; METALLURGY
    • C07 - ORGANIC CHEMISTRY
    • C07K - PEPTIDES
    • C07K2317/00 - Immunoglobulins specific features
    • C07K2317/20 - Immunoglobulins specific features characterized by taxonomic origin
    • C07K2317/21 - Immunoglobulins specific features characterized by taxonomic origin from primates, e.g. man
    • C - CHEMISTRY; METALLURGY
    • C12 - BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N - MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 - Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 - Recombinant DNA-technology
    • C12N15/63 - Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
    • C12N15/79 - Vectors or expression systems specially adapted for eukaryotic hosts
    • C12N15/85 - Vectors or expression systems specially adapted for eukaryotic hosts for animal cells
    • C12N15/8509 - Vectors or expression systems specially adapted for eukaryotic hosts for animal cells for producing genetically modified animals, e.g. transgenic
    • C12N2015/8518 - Vectors or expression systems specially adapted for eukaryotic hosts for animal cells for producing genetically modified animals, e.g. transgenic expressing industrially exogenous proteins, e.g. for pharmaceutical use, human insulin, blood factors, immunoglobulins, pseudoparticles
    • C - CHEMISTRY; METALLURGY
    • C12 - BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N - MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N2800/00 - Nucleic acids vectors
    • C12N2800/30 - Vector systems comprising sequences for excision in presence of a recombinase, e.g. loxP or FRT

Definitions

  • To replace a gene or gene cluster in the recipient genome with its ortholog or syntenic region from the donor genome, a shuttle vector must first be identified or constructed that carries a genomic DNA fragment of donor origin and transfers it to the recipient genome.
  • BAC bacterial artificial chromosome
  • YAC yeast artificial chromosome
  • HAC human artificial chromosome
  • transchromosome (Tomizuka et al., 1997).
  • A cosmid is a circular plasmid with capacity for a DNA insert of 45 kb (kilobase pairs). For smaller genes, this may be large enough to accommodate the entire gene, including the 5’ and 3’ regulatory sequences.
  • BAC vector is also a circular plasmid, with an average insert size of 150-350 kb. With its fair size and stable insert, BAC has been the popular choice providing human and murine genomic DNA fragments for a variety of uses, including whole genome sequencing and genome engineering.
  • YAC is also circular and larger and can hold inserts up to 2,000 kb, with an average insert size of 250-400 kb.
  • Although the insert size is appealing, owing to its large size and propagation in the highly recombinogenic yeast host, YAC clones are plagued with artifacts, including intramolecular rearrangement of the insert (insertion, deletion, inversion, etc.), ligation of more than one insert into the same vector, and combination of more than one YAC into one YAC clone.
  • The YAC library is of limited use.
  • genomic DNA can also be transferred directly from donor to recipient cells by fusion of the two cell types, in a process called microcell-mediated chromosome transfer.
  • the sizes of these “transchromosomes” can be very large, up to the length of the entire chromosome. However, what is transferred is by chance and often fragmented and in addition, propagation to the next round of cell division is not always guaranteed.
  • RPCI-11 BAC library (Osoegawa et al., 2001) that carries the ensemble of human genomic DNA fragments in the pBACe3.6 vector.
  • A blood sample was collected from a human donor, and genomic DNA was isolated and partially digested with the EcoRI restriction enzyme. The fragmented DNA was then separated on an agarose gel, and size-selected DNA fragments were cloned into the pBACe3.6 vector between the EcoRI sites (for a subset of the clones, the same donor DNA was partially digested with MboI, size selected, and ligated into the pTARBAC1 cloning vector at the BamHI site).
  • RPCI-11 consists of 1,440 plates and a total of 543,797 clones, covering the entire 3 × 10^9 base pairs of the human genome multiple times over.
  • Depending on where the EcoRI enzyme cuts, it is possible that a gene may be carried on more than one BAC clone, and a gene cluster larger than the carrying capacity of pBACe3.6 (150-350 kb) most definitely will be housed in more than one BAC clone.
  • IGH immunoglobulin heavy chain
  • MHC human major histocompatibility gene cluster.
  • the MHC gene cluster is located on the short arm of chromosome 6 at band 6p21.3 and spans about 4 Mb, out of a total of 171 Mb, or 2.3%, of the human chromosome 6 content. Genes of the MHC classes 1 and 2 are located within this region.
  • mice carrying human immunoglobulin (IG) genes have been created and proven to be of tremendous pharmaceutical value (Green, 2014) .
  • the first generation mouse models were created carrying human IG transgenes on a murine knockout background (Bruggemann et al., 1989; Green et al., 1994; Lonberg et al., 1994; Taylor et al., 1992) .
  • Although capable of producing antibodies of high affinity when challenged, these mice have major limitations owing to the early transgenic technologies used to create them. They carry a BAC- or YAC-derived transgene and, accordingly, only a partial representation of the IGV gene cluster.
  • these transgenes randomly integrate into the mouse genome and are out of the genomic context in which they naturally reside. As a consequence, they may not have retained the authentic transcriptional regulation offered particularly by the 3’ flanking sequences, which are known to be important.
  • the invention described herein provides compositions and methods to produce large DNA fragments (e.g., hundreds of kb to millions of base pairs) containing a genomic region of interest (GOI) with defined start and end points.
  • GOI genomic region of interest
  • these large DNA fragments can be used to ferry genes or blocks of genes of donor origin to the recipient genome in one or multiple transfers.
  • the megacircle can be used to transfer a gene cluster from a first/donor organism (such as the human IGH gene cluster) to the cells of a second/recipient organism (such as a model organism like mouse), and replace a target (e.g., the corresponding genomic) region in the second/recipient organism (e.g., the murine Igh locus) with a (such as its syntenic) region from the first/donor organism (e.g., human).
  • Any other gene clusters can be used in the invention, including, but not limited to, those for TCR (T cell receptor) gene cluster, MHC classes 1 and 2, and cytochrome P450 genes.
  • This method of the invention can be used to reduce large scale genome engineering to a few steps, including eliminating the need to amplify large DNA segments in a prokaryotic (such as bacteria) or eukaryotic (such as yeast) host cell, thus improving efficiency and saving time and cost.
  • the methods of the invention include the production of megacircles, which may comprise the following steps: 1. releasing a genomic DNA comprising the GOI from a host genome; 2. ligating the genomic DNA into a megacircle, with or without another DNA fragment; 3. amplifying the megacircle by RCA, generating double-stranded concatemers; 4. digesting the concatemer into monomers with compatible sticky overhangs (megastrands); and 5. self-ligating the megastrands to regenerate megacircles.
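The five steps can be illustrated with a toy string model. This is a minimal sketch, not the patented procedure: sequences are plain strings, the circle is represented by one linear copy of its sequence, the function names (`rolling_circle`, `digest`, `full_length_units`) are hypothetical, and the cut is placed immediately before the recognition site purely for simplicity. The 18 bp I-SceI recognition sequence is used only as a marker for the unique site.

```python
SITE = "TAGGGATAACAGGGTAAT"  # I-SceI recognition sequence, used here as the unique site


def rolling_circle(circle: str, start: int, rounds: int) -> str:
    """Model RCA: begin copying at `start` and go around the circle `rounds` times,
    producing a linear concatemer of repeated units."""
    rotated = circle[start:] + circle[:start]
    return rotated * rounds


def digest(linear: str, site: str = SITE) -> list[str]:
    """Cut a linear molecule immediately before every occurrence of `site`
    (toy assumption; a real enzyme cuts at a defined position inside the site)."""
    cuts = []
    pos = linear.find(site)
    while pos != -1:
        cuts.append(pos)
        pos = linear.find(site, pos + 1)
    bounds = [0] + cuts + [len(linear)]
    return [linear[a:b] for a, b in zip(bounds, bounds[1:]) if b > a]


def full_length_units(fragments: list[str], unit_len: int) -> list[str]:
    """Keep only complete monomers (megastrands); partial end pieces are dropped."""
    return [f for f in fragments if len(f) == unit_len]


# Steps 1-2: a released fragment (stand-in GOI) joined into a circle with one unique site.
circle = SITE + "ACGTTGCA" * 3
# Step 3: RCA starting at an arbitrary position, four trips around the circle.
concatemer = rolling_circle(circle, start=10, rounds=4)
# Step 4: digest at the unique site to resolve monomers.
megastrands = full_length_units(digest(concatemer), len(circle))
# Step 5: each monomer has matching ends and can be re-closed into the original circle.
print(len(megastrands), all(m == circle for m in megastrands))
```

Because copying starts at an arbitrary point, the two ends of the concatemer are partial units; only the internal units are full-length monomers.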
  • the megacircle or megastrand of the invention can, for example, serve as a donor template for homologous recombination (HR) if it contains genomic sequence homologous to a target genomic region in a recipient or host. It can also contain binding sites for nucleases such as CRISPR nucleases, TALENs, and ZFNs.
  • HR homologous recombination
  • one aspect of the invention provides a DNA fragment comprising a genomic region of interest (GOI, which can be a single complete or partial gene, or a cluster of related or unrelated genes, non-coding sequence, or any genomic region of interest), wherein the GOI is at least about 20 kb (e.g., 50 kb, 100 kb, 150 kb, 200 kb, 300 kb, 400 kb, 500 kb, 750 kb, 1 Mb, 1.25 Mb, 1.5 Mb, 1.75 Mb, 2 Mb or more), and is flanked by a first unique enzyme cleavage site at the proximal end of the genomic DNA fragment and a second unique enzyme cleavage site at the distal end of the genomic DNA fragment, wherein the DNA fragment further comprises an exogenous sequence (such as a selection marker, a recombinase target site (RTS), or a sequence from a non-donor source) between the
  • unique means that the enzyme cleavage site does not exist inside the GOI, so that when an enzyme specific for the unique enzyme cleavage site is used in digestion, the enzyme does not cleave within the GOI.
  • the unique enzyme cleavage site may well exist in other parts of the same genome.
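The definition of "unique" amounts to a simple test: the recognition sequence must not occur anywhere inside the GOI, on either strand, while occurrences elsewhere in the genome are irrelevant. A minimal sketch of that test follows, assuming an exact-match recognition sequence over the A/C/G/T alphabet and a linear GOI string; the function names are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an A/C/G/T sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def count_sites(seq: str, site: str) -> int:
    """Count occurrences of `site` in `seq` on either strand.
    A palindromic site is counted once per position, not twice."""
    seq, site = seq.upper(), site.upper()
    targets = {site, reverse_complement(site)}
    return sum(seq.startswith(t, i) for i in range(len(seq)) for t in targets)


def is_unique_for_goi(goi: str, site: str) -> bool:
    """True if an enzyme specific for `site` would not cleave within the GOI."""
    return count_sites(goi, site) == 0


goi = "AAGAATTCTTACGTACGT"                             # stand-in GOI containing one EcoRI site
print(is_unique_for_goi(goi, "GAATTC"))               # EcoRI site is present, so not usable
print(is_unique_for_goi(goi, "TAGGGATAACAGGGTAAT"))   # I-SceI site is absent from this GOI
```

Checking both strands matters for non-palindromic sites such as the I-SceI site, which can occur in either orientation.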
  • the first and the second unique enzyme cleavage sites have the same sequence, or are recognized by one (the same) cleavage enzyme, or have sticky ends compatible for ligation.
  • the DNA fragment is a double-stranded circle (e.g., linked at the first and the second unique enzyme cleavage sites).
  • the DNA fragment further comprises a first positive selection marker cassette located between the RTS and an end of the DNA fragment, wherein the GOI and the positive selection marker cassette flank the RTS.
  • the DNA fragment further comprises a heterotypic recombinase target site (htRTS), wherein the RTS and the htRTS flank the GOI.
  • htRTS heterotypic recombinase target site
  • the DNA fragment is circular, with a linkage created by linking the first and the second unique enzyme cleavage sites, wherein the linkage is between the positive selection marker cassette and the htRTS (or the RTS).
  • the DNA fragment is amplified by rolling circle amplification (RCA).
  • RCA rolling circle amplification
  • the GOI is immunoglobulin heavy chain gene cluster, immunoglobulin kappa light chain gene cluster, immunoglobulin lambda light chain gene cluster, a TCR (T-cell receptor) gene cluster, an MHC class 1 gene, an MHC class 2 gene, or a cytochrome P450 gene, or any other genomic fragment.
  • the first and/or the second unique enzyme cleavage site(s) are cleavable by a restriction endonuclease, a homing endonuclease (such as I-SceI), a CRISPR/Cas9 nuclease, a CRISPR/Cpf1 nuclease, a TALEN (transcription activator-like effector nuclease), a ZFN (zinc finger nuclease), or a Pyrococcus furiosus Argonaute (PfAgo) based artificial restriction enzyme (ARE).
  • TALEN transcription activator-like effector nuclease
  • ZFN Zinc Finger nuclease
  • PfAgo Pyrococcus furiosus Argonaute
  • one of the first and the second unique enzyme cleavage sites is a natural/pre-existing genomic sequence.
  • the RTS or the htRTS is locus of crossover in P1 (loxP), flippase recognition target (FRT), attP/attB, or mutants thereof, or the recognition site of other site-specific recombinases.
  • Another aspect of the invention provides a method of producing the DNA fragment of any one of the subject DNA fragments, the method comprising: (a) inserting, in a host genome comprising said GOI, said first unique enzyme cleavage site and said RTS proximal (or distal) to the GOI, wherein the GOI is proximal (or distal) to the second unique enzyme cleavage site, and wherein an optional first positive selection marker cassette is distal (or proximal) to said first unique enzyme cleavage site and proximal (or distal) to said RTS; (b) digesting the host genome with a first unique enzyme that cleaves said first unique enzyme cleavage site, and a second unique enzyme that cleaves said second unique enzyme cleavage site, to release the DNA fragment comprising the GOI from the host genome; (c) circularizing the released DNA fragment to form a megacircle, by ligating the ends of said released DNA fragment generated by cleavage by said first unique enzyme and said second unique
  • steps (a) and (b) can be done simultaneously, or sequentially in either order.
  • the second unique enzyme cleavage site is a naturally existing sequence. Note, however, in a similar aspect of the invention, a method is contemplated in which, in step (a), two naturally existing unique enzyme cleavage sites flank the GOI.
  • step (a) further comprises inserting a 3rd unique enzyme cleavage site between the first unique enzyme cleavage site and the RTS.
  • the second unique enzyme cleavage site together with a second positive selection marker cassette and htRTS, are inserted into said host genome distal (or proximal) to GOI, preferably the second unique enzyme cleavage site is distal (or proximal) to the htRTS and proximal (or distal) to the second positive selection marker cassette.
  • Another aspect of the invention provides a method of producing the DNA fragment of any one of the subject DNA fragments, the method comprising: (a) removing/retrieving a genomic DNA fragment comprising said GOI from a host genome, through digesting the host genome with a first enzyme specific for a first unique enzyme cleavage site and a second enzyme specific for a second unique enzyme cleavage site, wherein said first and said second unique enzyme cleavage sites are not compatible for ligation with each other; (b) ligating the retrieved genomic DNA fragment in step (a) with a linear recombinant DNA construct to produce a megacircle, wherein said linear recombinant DNA construct comprises a positive selection marker cassette flanked by a pair of heterotypic RTSs or flanked by a RTS and a 3rd unique enzyme cleavage site, and wherein the ends of the linear recombinant DNA construct are defined by said first and said second unique enzyme cleavage sites, respectively; (c) amplifying
  • said first and said second unique enzyme cleavage sites are the same, and said first and said second unique enzymes are the same.
  • the method further comprises, before amplification by RCA, treating the ligation mixture with an exonuclease to eliminate linear DNA fragments.
  • RCA is performed using phi29 (Φ29) DNA polymerase, and one or more primers each specific for the GOI.
  • RCA comprises multiple displacement amplification (MDA).
  • the method further comprises resolving the RCA concatemer product, preferably with an enzyme specific for said 3rd unique enzyme cleavage site (if present), or with said first unique enzyme or said second unique enzyme.
  • the method further comprises self-ligating the digested RCA concatemer product under conditions favoring intramolecular self-ligation (e.g., low concentration).
  • the conditions comprise emulsifying the digested RCA concatemer product to form single oil droplets, each comprising no more than one linear DNA fragment, to promote self-ligation.
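How dilute the emulsion must be for droplets to hold "no more than one linear DNA fragment" can be estimated with a simple occupancy model. The sketch below assumes fragments are distributed among droplets independently, so the number per droplet follows a Poisson distribution; that statistical model, the function name, and the example loading values are assumptions for illustration and are not stated in the source.

```python
import math


def droplet_occupancy(mean_per_droplet: float) -> dict[str, float]:
    """Poisson occupancy for a positive mean number of fragments per droplet
    (assumed model, not taken from the source)."""
    lam = mean_per_droplet
    p_empty = math.exp(-lam)
    p_single = lam * math.exp(-lam)
    p_multiple = 1.0 - p_empty - p_single
    return {
        "empty": p_empty,
        "single": p_single,
        "multiple": p_multiple,
        # Fraction of DNA-containing droplets that hold exactly one fragment.
        "single_among_occupied": p_single / (1.0 - p_empty),
    }


for lam in (1.0, 0.3, 0.1):
    occ = droplet_occupancy(lam)
    print(f"mean={lam}: single among occupied = {occ['single_among_occupied']:.3f}")
```

Under this model, lowering the average loading raises the share of occupied droplets that contain a single fragment, at the cost of leaving most droplets empty.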
  • the method further comprises treating ligation mixture of the digested RCA product with an exonuclease to eliminate linear DNA fragments.
  • step (a) is carried out by nuclease mediated homology directed repair (HDR), or homologous recombination (HR), or a combination thereof.
  • HDR nuclease mediated homology directed repair
  • Another aspect of the invention provides a method of replacing a host (e.g., mouse) genomic region in a host genome with a (syntenic) DNA fragment from a donor (e.g., human) genome, the method comprising: (a) replacing the host genomic region with a pair of heterotypic RTSs (e.g., via nuclease enabled HDR, or HR, or a combination thereof), wherein said pair of heterotypic RTSs flank an optional positive/negative selection cassette (such as hygroTK selection cassette); (b) providing the DNA fragment of any one of claims 1-11 from the donor genome, and allowing the DNA fragment from the donor genome to integrate into the host genome through one of said pair of heterotypic RTSs in the presence of site specific recombinase specific for said one of said pair of heterotypic RTSs; (c) optionally, allowing deletion of said positive/negative selection cassette (if present) through the other of said pair of homotypic RTSs.
  • Another aspect of the invention provides a method of replacing a host (e.g., mouse) genomic region from a host genome with a (syntenic) DNA fragment from a donor (e.g., human) genome, the method comprising: (a) inserting a single RTS, along with an optional positive/negative selection cassette (such as hygroTK selection cassette), proximal (or distal) to the host genomic region (e.g., via nuclease enabled HDR, or HR, or a combination thereof); (b) providing the DNA fragment of any one of claims 1-11 from the donor genome, and allowing the DNA fragment from the donor genome to integrate into the host genome through said RTS in the presence of site specific recombinase specific for said RTS; (c) optionally, allowing deletion of said positive/negative selection cassette (if present) and the host genomic region through nuclease mediated NHEJ.
  • said host genomic region is deleted prior to step (b) .
  • the method is carried out in a zygote, an oocyte, a sperm cell (spermatogonial stem cell line), or an ES cell of the host, preferably by microinjecting, electroporating, or transfecting exogenous components (e.g., CRISPR/Cas9 and guide RNAs targeting said first and said second unique enzyme cleavage sites; megacircles; recombinase protein or coding sequence thereof).
  • the positive/negative selection marker cassette comprises a (neomycin/puromycin/hygromycin/blasticidin/zeocin) resistant/TK (or HPRT) gene under the expression control of a eukaryotic promoter (such as the PGK promoter) and a polyA coding sequence.
  • Another aspect of the invention provides a mouse generated by any one of the subject methods.
  • Another aspect of the invention provides a mouse comprising in its genome an exogenous genomic DNA from a donor (e.g., a human, a mammal, or different mouse strain), wherein the exogenous genomic DNA comprises a polymorphism within the species of the donor, and wherein the exogenous genomic DNA comprises any one of the subject DNA fragments.
  • the exogenous genomic DNA comprises an immunoglobulin heavy chain gene cluster, immunoglobulin kappa light chain gene cluster, immunoglobulin lambda chain gene cluster, a TCR, an MHC class 1 gene, an MHC class 2 gene, or a cytochrome P450 gene.
  • FIG. 1 is a schematic drawing showing a first strategy (“Strategy 1”) for generating one embodiment of the megacircle of the invention by rolling circle amplification (RCA).
  • the generated megacircle can be used in RMCE.
  • Step 1: A pair of unique enzyme cleavage sites, including restriction enzyme (RE) sites or protospacer/PAM sequences for a nuclease, such as that for Cpf1, is identified as flanking the genomic region of interest/gene of interest (collectively “GOI”). Sometimes two unique enzyme cleavage sites can be identified for certain GOI. Human genomic DNA is isolated and cleaved with the nuclease or restriction enzyme (RE) to release the genomic DNA fragment from its chromosomal location.
  • RE restriction enzyme
  • a recombinant DNA construct is created to carry a positive selection marker cassette flanked by a pair of heterotypic recombinase target sites (RTSs).
  • RTSs heterotypic recombinase target sites
  • it may carry a pair of protospacer/PAM sequences or restriction sites matching those flanking the GOI as depicted.
  • the recombinant DNA construct When cleaved with the corresponding nuclease or RE, the recombinant DNA construct carries incompatible sticky ends in cis, but in trans, the sticky ends perfectly match those flanking the GOI.
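The "incompatible in cis, matching in trans" requirement can be checked mechanically. The sketch below assumes both ends carry same-polarity single-stranded overhangs written 5' to 3', so two ends can anneal when one overhang is the reverse complement of the other. The EcoRI (AATT) and BamHI (GATC) overhangs are an illustrative choice, not the enzyme pair used in the figure, and the function names are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an A/C/G/T sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def can_anneal(overhang_a: str, overhang_b: str) -> bool:
    """Two same-polarity overhangs pair if one is the reverse complement of the other."""
    return overhang_a.upper() == reverse_complement(overhang_b)


def self_circularizes(ends: tuple[str, str]) -> bool:
    """In cis: can the two ends of one molecule ligate to each other?"""
    return can_anneal(ends[0], ends[1])


def joins_in_trans(fragment: tuple[str, str], construct: tuple[str, str]) -> bool:
    """In trans: can each fragment end pair with a different construct end,
    in either relative orientation, to close a two-part circle?"""
    (f1, f2), (c1, c2) = fragment, construct
    return (can_anneal(f1, c1) and can_anneal(f2, c2)) or (
        can_anneal(f1, c2) and can_anneal(f2, c1)
    )


goi_fragment = ("AATT", "GATC")   # EcoRI-type end and BamHI-type end (illustrative)
construct = ("GATC", "AATT")      # recombinant construct cut with the same two enzymes
print(self_circularizes(goi_fragment), self_circularizes(construct))
print(joins_in_trans(goi_fragment, construct))
```

With two different overhangs, neither molecule can close on itself, so the only circle that forms contains both the GOI fragment and the construct.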
  • Step 2: The megacircle DNA serving as a template is amplified through RCA so that a large quantity of the megacircle can be generated.
  • individual megacircle units are released from the linear concatemer by digestion with an enzyme specific for a unique enzyme cleavage site, such as an I-SceI site that had been incorporated into the recombinant DNA construct earlier.
  • the resolved RCA products are then self-ligated to re-form megacircles, which can be used in RMCE.
  • FIG. 2 is a schematic drawing showing a second strategy (“Strategy 2”) for generating one embodiment of the megacircle of the invention by rolling circle amplification (RCA).
  • the generated megacircle can be used in RMCE.
  • Step 1: A positive selection cassette (+a), along with an RTS site and an I-SceI site, is inserted into the proximal end of the GOI by homologous recombination (HR), nuclease mediated homology directed repair (HDR), or a combination of both. It is followed by a second targeting event, inserting the second selection cassette (+b), along with the heterotypic RTS site and a second I-SceI site, at the distal end of the GOI.
  • Step 2: Genomic DNA is isolated, and the genomic DNA fragment carrying the GOI, flanked by the positive selection cassette and an RTS site at the proximal end and a heterotypic RTS site at the distal end, is released from its chromosomal location by I-SceI digestion. This DNA fragment is self-ligated (compatible sticky ends) to form the megacircle molecule.
  • Step 3: The megacircle is amplified by RCA, using a primer (or multiple primers) unique to the GOI. Upon completion of the RCA, the RCA concatemer is resolved to release individual “megastrand” units, each of which can self-ligate back into a megacircle for use in RMCE.
  • FIG. 3 is a schematic drawing showing the RMCE reaction using the subject megacircles described in FIGs. 1 and 2.
  • Step 1: A pair of heterotypic RTS sites is inserted into a target chromosomal location, replacing the murine syntenic region (B), either by HR, nuclease mediated HDR, or a combination of both. An optional positive/negative selection cassette is incorporated to allow selection and enrichment of the successful recombinant.
  • Step 2A: The human GOI, carried in a megacircle, is integrated into the genomic locus by recombinase mediated integration. Integration could occur through either pair of the two pairs of homotypic sites in trans, each generating a different intermediate product.
  • Step 2B: The selection cassette is excised from the locus when the other pair of homotypic RTS sites in cis, as compared to the pair used in Step 2A, is employed, and a mutant allele with the GOI replacing the selection cassette is generated. Note that in Step 2B, although the other homotypic pair in cis could also in theory recombine and regenerate the two substrates that started the RMCE reaction, this reaction is not favored/efficient, as the two sites are separated by the large GOI.
  • FIG. 4A shows the outcome of the murine Ighv, d, j & c genes when RMCE is employed to generate mice carrying a complete repertoire of the human IGHV, D & J genes.
  • FIG. 4B shows the human IGHV, D & J genes included for humanization when RMCE is employed to generate mice carrying a complete repertoire of the human IGHV, D & J genes.
  • FIG. 4C shows the overall flow scheme when RMCE is employed (with Cre/Lox as an example) to generate mice carrying a complete repertoire of the human IGHV, D & J genes.
  • FIG. 5 is a schematic drawing showing a first strategy (“Strategy 1”) for generating one embodiment of the megacircle of the invention by rolling circle amplification (RCA).
  • the generated megacircle can be used in recombinase mediated integration. This is similar to the flow scheme outlined in FIG. 1, but with only one RTS site incorporated into the megacircle.
  • FIG. 6 is a schematic drawing showing a second strategy (“Strategy 2”) for generating one embodiment of the megacircle of the invention by rolling circle amplification (RCA).
  • the generated megacircle can be used in recombinase mediated integration. This is similar to the flow scheme outlined in FIG. 2, but with only one RTS site incorporated into the megacircle. In Step 1, only the proximal end of the GOI is targeted, and an optional selection cassette, along with an RTS site and an I-SceI site, is incorporated.
  • FIG. 7 shows recombinase mediated integration.
  • Step 1: A single RTS site is inserted into the chromosomal location, along with a positive (+a)/negative (-) selection cassette, either by HR, nuclease mediated HDR, or a combination of both.
  • Step 2: The human GOI, carried in a megacircle, is integrated into the genomic locus, mediated by the corresponding site specific recombinase (SSR).
  • the positive selection cassette (+b) is different from the selection marker gene (+a) in Step 1.
  • Step 3: The murine syntenic sequence (B), along with the two selection cassettes, is eliminated from the targeted allele by non-homologous end joining (NHEJ).
  • the mutant allele now carries the human GOI replacing the murine syntenic sequence (B).
  • FIG. 8A shows the outcome of the murine Ighv, d, j & c genes when recombinase mediated integration is employed to generate mice carrying a complete repertoire of the human IGHV, D & J genes.
  • FIG. 8B shows the human IGHV, D & J genes included for humanization when recombinase mediated integration is employed to generate mice carrying a complete repertoire of the human IGHV, D & J genes.
  • FIG. 8C shows the overall flow scheme when recombinase mediated integration is employed (with Cre/Lox as an example) to generate mice carrying a complete repertoire of the human IGHV, D & J genes.
  • FIG. 9 shows overall activity for recombinase mediated integration approach.
  • the invention described herein provides a donor DNA molecule (“linear or circular DNA fragment,” also known as megastrand or megacircle, respectively) that satisfies several criteria: first, it can be large enough to accommodate the entire gene or gene cluster (in the hundreds of kb to megabase pair range), though it can be as small as a partial gene or certain non-coding sequence; second, it has a pre-determined start and end (e.g., proximal and distal ends in relation to the centromere of the chromosome from which the DNA fragment originates), marking precisely the boundary of the gene or gene cluster.
  • such a donor DNA molecule contains additional exogenous sequence, such as a recombinase target sequence (RTS), and preferably, the DNA fragment is devoid of a sequence sufficient for or required for self-replication in a host cell (e.g., the ORI sequence in plasmids etc. required for DNA self-replication in a host cell, pBR322, pUC, pSC101, or ARS).
  • RTS recombinase target sequence
  • the megastrand or megacircle of the invention can be used in numerous settings, including (but not limited to) humanizing large sections of a mouse genome with donor DNA from a single individual (such as a single human), as opposed to (smaller) donor DNAs from multiple individuals.
  • the invention described herein provides a method that generates a large genomic DNA fragment called “megastrand,” and when circularized, “megacircle,” which typically has a size in the range of hundreds of kb to millions of base pairs (FIGs. 1, 2, 5, & 6).
  • megastrand or megacircle can be generated in vitro in large quantity, free of a prokaryotic or eukaryotic vector backbone sequence, and with no need to propagate in bacterial, yeast, or other host cells.
  • appropriate restriction sites flanking the fragment are identified in the genomic DNA sequence. The two sites should be different, so that they preclude head-to-tail self-ligation of the genomic fragment in a ligation reaction once it is released from its genomic location. Genomic DNA is then isolated and cleaved with the corresponding RE to generate the genomic fragment. Alternatively, a site-specific nuclease, such as Cas9 and Cas9 nickase (Jinek et al., 2012, incorporated by reference), Cpf1 (Zetsche et al., 2015, incorporated by reference), or PfAgo (Enghiad and Zhao, 2017, ACS Synth.
  • the two species, the genomic DNA fragment and the recombinant DNA carrying matching ends, are then ligated to form a circle (megacircle) .
  • the ligation mixture can be treated with exonuclease to eliminate all fragments that remain linear. While circles may be formed from all genomic fragments bearing compatible sites (in the case of using CRISPR or PfAgo, the circle should be more specific), amplification of the target circle can be accomplished using RCA (Ali et al., 2014). If needed, megastrand DNA can also be enriched by affinity purification or separation by pulsed field gel electrophoresis before being used in a ligation reaction.
  • phi29 (Φ29) DNA polymerase (Blanco et al., 1989; Blanco and Salas, 1984) is commonly used, and one or multiple primers specific for the target genomic region is/are provided in the reaction.
  • the rate of synthesis of phi29 DNA polymerase is about 50 nucleotides per second; for a 500 kb fragment, it therefore takes about 10,000 seconds (about 3 hours) to complete one round of synthesis for a single strand.
  • MDA multiple displacement amplification
  • about 100 or more oligonucleotides can be made to hybridize to one strand and to tile the entire length of the megacircle template.
  • oligonucleotides are provided targeting the other strand.
  • the target fragment will have been amplified 1,000 times, as linear double-stranded concatemers.
  • the RCA product can then be digested with an enzyme, such as a restriction enzyme which cuts once per unit of the concatemer, and the individual unit ligated under low substrate concentration to form the circle.
  • error rate 1/10^6 to 1/10^7
  • this process potentially can be repeated multiple times.
  • random hexamer can also be used for additional amplification with higher efficiency.
  • exonuclease-resistant primer or random-hexamer primers with thiophosphate linkages for the two 3’ terminal nucleotides can be used in RCA reaction.
  • GOI can also be flanked with I-SceI sites (FIG. 2) or with an engineered nuclease cleavage site incorporated into one end of the GOI, matching the naturally occurring site at the other end of the GOI (FIG. 6).
  • HR nuclease mediated HDR
  • the advantage is that it allows a head to tail self-ligation of the megastrand DNA in cis to form a circle, as compared with the strategies outlined in FIGs. 1 and 5, which require ligation between two molecular species.
  • the final product is digested and self ligated to form a megacircle.
  • emulsion can be used to separate each molecule into a single oil droplet to promote intramolecular ligation efficiency. Linear fragment that remains will be eliminated by exonuclease treatment, improving the relative abundance of megacircle.
  • the megacircle preparation can be injected directly into the nucleus of mouse embryonic stem cells, as compared to delivery through electroporation, which may require a higher concentration. If mouse strains carrying the landing pad would have been created and zygotes available, megacircle could be delivered directly into the zygotes, along with the recombinase, to elicit RMCE or recombinase mediated integration.
  • FIGs. 1 and 5 are particularly appealing, as they do not require targeting of the human chromosome in human ES or iPS cells -the two human cell types are not known for high targeting efficiency and may be challenging to work with.
  • In addition to generating megacircle through RCA, it may also be assembled by Gibson assembly or synthesized de novo.
  • HR is a naturally occurring mechanism that repairs DNA damage and exchanges DNA content between sister chromatids during mitosis and meiosis. By exploiting this mechanism, precise changes can be introduced into the genome of eukaryotic cells (Doetschman et al., 1987; Thomas and Capecchi, 1987). Although powerful, it is a laborious and time-consuming process, more so when it comes to large scale genome engineering. For example, replacing the murine Igh locus with its human syntenic region (940 kb) using BAC targeting vectors took 9 rounds of manipulation in mouse ESCs and years of work to complete (Macdonald et al., 2014).
  • SSR binds to a short stretch of sequence, RTS, and promotes strand exchange between the two substrates to form two strand-exchanged products (O′Gorman et al., 1991; Sauer and Henderson, 1988; Thorpe and Smith, 1998; Thyagarajan et al., 2001) .
  • the target site (LoxP) is 34 bps in length, consisting of two 13 bps palindromic repeats separated by an 8 bps spacer sequence.
  • One recombinase molecule binds to one 13 bps repeat sequence and as such, each 13 bps repeat constitutes a recombinase binding element (RBE) .
  • the minimal target site is similar in structure to the LoxP site for Cre recombinase, although the 13 bps repeats are not strictly palindromic but differ by one base pair. Also the full FRT site has an additional 13 bps of repeat, located 5’ to and separated from the minimal sequence by one additional nucleotide, for a total of 48 bps. This segment is not needed for excision but essential for integration, including RMCE.
  • the target sites are called attP and attB. Each site consists of two 18 bps binding elements separated by two nucleotides of spacer sequence and the two 18 bps binding elements are not particularly palindromic. For the two RTS sites that are brought together to recombine, there are 4 recombinase molecules involved, each binding to one RBE, forming a synapse (Van Duyne, 2001) .
  • a transgene can be inserted into any genomic locus that has been engineered to carry the RTS sites. Compared with conventional transgenesis, insertion of a transgene into a safe harbor site is preferred, as performance of the transgene is more predictable. This is because the transgene can be inserted into a genomic locus that is known to support expression of the transgene and it can be done such that only one copy of the transgene is incorporated.
  • RMCE is a two-step process (Malchin et al., 2010; Schlake and Bode, 1994; Takata et al., 2011) (FIG. 3).
  • the first step by nature, is recombinase mediated integration. It occurs in trans, between the pair of homotypic RTS sites. In this process, the circular donor DNA is integrated into the recipient site, creating an intermediate product. Depending on the pair of homotypic sites involved, two different intermediate products are created which will then serve as the substrates for the next step.
  • the second step in an RMCE reaction is recombinase mediated excision. It occurs in cis, between the two homotypic recognition sites now brought in cis.
  • step 1 If it is the same pair as used in step 1 (integration) , it reproduces the two substrates that started the RMCE reaction. However, if it is the other pair, two products will be produced, with one being the desired product, now with the selection marker gene replaced with the GOI. As the enzyme is the same for the two steps, RMCE should progress seamlessly from step 1 to step 2, or in the reverse direction, reaching equilibrium as defined by the reaction.
  • the donor needs to be a circular molecule so it can be integrated into the genomic locus and it has to be large enough to accommodate a gene or gene cluster.
  • a megastrand can be produced by digesting the concatemer produced from an RCA reaction with the appropriate restriction enzyme (I-SceI in FIGs. 1, 2, 5 & 6). It can then be used directly as a large transgene or modified further for gene targeting in an HR reaction or nuclease mediated HDR.
  • RMCE and recombinase mediated integration can be used to replace the murine Ighv, d, and j genes with their human ortholog genes, using megacircle as the vehicle to transfer human syntenic sequence.
  • FIGs. 4A and 4B depict fate of the murine Igh and human IGH genes (Walter et al., 1990) , alleles, and technical elements incorporated in the process to support a RMCE scheme.
  • the murine Igh locus could be divided into 4 blocks.
  • Block 1 is 2.5 Mb and marked by Ighv1-86 at the very 5’ end and the Ighv5-1 gene at the 3’ end.
  • block 2 is 91 kb in size and includes the Adam6a and Adam6b genes, which are retained, as these two genes are important for fertility of the sperm;
  • block 3 is 54 kb in size and starts with Ighd1-1 and ends with Ighj4 genes. This region is deleted and replaced with the PGK/HygroTK type +/- cassette, bringing along the heterotypic LoxP/Lox5171 sites in preparation for RMCE; block 4 carries the Igh enhancer sequence, switch region, and constant region genes in their natural configuration and is left unperturbed.
  • a positive/negative selection cassette (Hygro/TK) , along with the LoxP/Lox5171 sites, are inserted into the murine Igh locus, replacing 54 kb of the murine Igh genes from Ighd1-1 to Ighj4. This can be accomplished by HR, nuclease mediated HDR, or a combination of both, in mouse embryonic stem cells.
  • the human IGH locus can also be divided into 3 blocks (FIG. 4B) .
  • Block 1 starts with IGHV (III) -82 gene and ends with IGHV (II) -74-1 gene;
  • block 2 is 948 kb in size and marked by IGHV3-74 at the 5’end and IGHJ6 at the 3’end.
  • Genes in this block are transferred to and replace the murine Igh locus;
  • block 3 starts with the enhancer sequence and extends to the rest of the human chromosome 14 in its natural configuration.
  • gene targeting is performed to insert a LoxP site and an I-SceI site between blocks 1 and 2, along with a positive selection cassette.
  • When targeted clone(s) are identified, they can be driven to homozygosity by a “loss of heterozygosity” protocol, in which they are cultured in a high concentration of G418. After homozygous clones are obtained, they can be modified further to carry the Lox5171 site, along with an I-SceI site and a positive selection cassette, between blocks 2 and 3, by HR, nuclease mediated HDR, or a combination of both. The IGH allele with both LoxP and Lox5171 sites incorporated and flanked by the I-SceI sites can then be digested with the I-SceI enzyme and the megastrand released. Upon circularization, the megacircle can be used as the template for megacircle amplification as outlined in FIG. 2. Alternatively, a megacircle can be prepared as outlined in FIG. 1, without gene targeting in human cells.
  • FIG. 4C illustrates how this scheme is carried out.
  • Once a sufficient quantity of megacircles is obtained, they can be electroporated into mouse ESCs that have been engineered to carry the LoxP/Lox5171 sites at the murine Igh locus, along with Cre recombinase, to initiate RMCE.
  • the human IGH sequence is incorporated into the murine Igh locus.
  • the 3’ Lox5171 site can be flanked by PiggyBac transposon sequences and, when exposed to PiggyBac transposase, the transposon is eliminated, along with the Lox5171 site. This leaves no “footprint” of genome engineering in the region between human IGH and murine constant region genes.
  • At the 5’end of the recombined allele there is a single LoxP site remaining between the murine Adam6b and the IGHV3-74 genes.
  • Megacircle can also be engineered to support recombinase mediated integration as outlined in FIG. 7.
  • To use this scheme to incorporate the human IGH genes into the murine Igh locus, a detailed plan is illustrated in FIGs. 8A, 8B, and 8C. This approach may be technically less challenging, as it requires only one integration event.
  • the murine Igh locus is modified to carry the single RTS (FIG. 8A) .
  • a Cpf1 protospacer/PAM sequence is first identified marking the 3’end of IGH block 2. This same Cpf1 protospacer/PAM sequence is then incorporated into the targeting vector and inserted into the region between blocks 1 and 2, as illustrated in FIG. 8B.
  • restriction sites flanking block 2 can also be identified and used.
  • the product from recombinase mediated integration carries the two selection cassettes 3’to the IGH genes and can be deleted using nuclease mediated NHEJ mutations flanking the region.
  • megacircle prepared according to FIG. 5 can also be used for recombinase mediated integration to insert the human IGH genes into the murine locus. In this case, a megacircle is generated by ligating the human IGH genomic fragment and a recombinant DNA construct carrying compatible ends, without the need to engineer the human IGH genes in a human cell line (ESC or iPSCs) .
  • IMGT/LIGM-DB the IMGT comprehensive database of immunoglobulin and T cell receptor nucleotide sequences. Nucleic acids research 34, D781-784.
  • Cpf1 is a single RNA-guided endonuclease of a class 2 CRISPR-Cas system. Cell 163, 759-771.

Abstract

Provided are methods and compositions/reagents for efficiently carrying out large scale modification of eukaryotic genome.

Description

LARGE SCALE MODIFICATION OF EUKARYOTIC GENOME
BACKGROUND OF THE INVENTION
Although sharing common features, animals differ from humans in genetic composition and gene function. To improve the translational value of animal models, it is often necessary to partially replace or alter their genomes and reconstitute them with human-specific biology and function. Although there has been significant improvement over the last 30 years, current genome engineering technologies are still cumbersome and laborious, limited by the size and confines of the available donor genomic DNA fragments.
To replace a gene or gene cluster in the recipient genome with its ortholog or syntenic region from the donor genome, a shuttle vector must first be identified or constructed carrying genomic DNA fragment of the donor origin and transferring it to the recipient genome. Over the years, several systems have been invented serving this need, including plasmid, cosmid (Hohn and Murray, 1977) , BAC (bacterial artificial chromosome) (Osoegawa et al., 2001) , YAC (yeast artificial chromosome) (Burke et al., 1987) , HAC (human artificial chromosome) (Hoshiya et al., 2009) , and “transchromosome” (Tomizuka et al., 1997) .
Cosmid is a circular plasmid, with capacity for DNA insert of 45 kb (kilo base pairs) . For smaller genes, this size may be sufficiently large to accommodate the entire gene, including the 5’ and 3’ regulatory sequences.
BAC vector is also a circular plasmid, with an average insert size of 150-350 kb. With its fair size and stable insert, BAC has been the popular choice providing human and murine genomic DNA fragments for a variety of uses, including whole genome sequencing and genome engineering.
YAC is also circular and larger and can hold inserts up to 2,000 kb, with an average insert size of 250-400 kb. Although insert size is appealing, owing to its large size and propagation in the highly recombinant yeast host, YAC clones are plagued with issues of artifact, including intramolecular rearrangement of the insert (insertion, deletion, and inversion, etc. ) , ligation of more than one insert into the same vector, and combination of more than one YAC into one YAC clone. As such, YAC library is of limited use.
In addition to these preexisting libraries of genomic DNA fragments, genomic DNA can also be transferred directly from donor to recipient cells by fusion of the two cell types, in a process called microcell-mediated chromosome transfer. The sizes of these “transchromosomes” can be very large, up to the length of the entire chromosome. However, what is transferred is determined by chance and is often fragmented; in addition, propagation to the next round of cell division is not always guaranteed.
In addition to their sizes, another key limitation common to these vectors is that the insert carried in each vector was generated by fragmenting the linear chromosome into smaller pieces and, as such, the start and end points of an insert cannot be controlled. For cosmid, BAC, and YAC libraries, genomic DNA from the donor is digested with a restriction enzyme. For a “transchromosome,” although transfer of an intact chromosome is possible, the chromosome is more likely to be fragmented during the microcell mediated transfer process.
One example is the RPCI-11 BAC library (Osoegawa et al., 2001), which carries the ensemble of human genomic DNA fragments in the pBACe3.6 vector. To generate the library, a blood sample was collected from a human donor, and genomic DNA was isolated and partially digested with the EcoRI restriction enzyme. The fragmented DNA was then separated on an agarose gel, and size-selected DNA fragments were cloned into the pBACe3.6 vector between the EcoRI sites (for a subset of the clones, the same donor DNA was partially digested with MboI, size selected, and ligated into the pTARBAC1 cloning vector at the BamHI site). The ligation products were subsequently transformed into DH10B cells and individual clones arrayed into 384-well microtiter plates. The entire collection of these plates is stored for archive and retrieval. For the human genomic DNA library, RPCI-11 consists of 1,440 plates and a total of 543,797 clones, covering the entire 3 × 10^9 base pairs of the human genome multiple times over. Depending on where the EcoRI enzyme cuts, a gene may be carried on more than one BAC clone, and a gene cluster larger than the carrying capacity of pBACe3.6 (150-350 kb) will most definitely be housed in more than one BAC clone.
Genes within the same gene family are often located close together in a chromosomal region, and these gene clusters vary in size but are often large. For example, the human immunoglobulin heavy chain (IGH) locus maps to the most distal band of the long arm of chromosome 14 (14q32.33) and occupies 1.3 Mb (1.3 × 10^6 bps) out of a total of 107 × 10^6 bps (or 1.2%) of chromosome 14. The IGH gene cluster consists of 123-129 IGHV (the number could vary among individuals), 27 IGHD, 9 IGHJ, and 11 IGHC genes, along with other genes that are colocalized to this region (Giudicelli et al., 2006). Another example is the human major histocompatibility (MHC) gene cluster. The MHC gene cluster is located on the short arm of chromosome 6 at band 6p21.3 and spans about 4 Mb, out of a total of 171 Mb, or 2.3%, of the human chromosome 6 content. Genes of MHC classes 1 and 2 are located within this region.
Mice carrying human immunoglobulin (IG) genes have been created and proven to be of tremendous pharmaceutical value (Green, 2014) . The first generation mouse models were created carrying human IG transgenes on a murine knockout background (Bruggemann et al., 1989; Green et al., 1994; Lonberg et al., 1994; Taylor et al., 1992) . Although capable of producing antibody of a high affinity when challenged, these mice have major limitations owing to early transgenic technologies used to create these mice. They carry the BAC or YAC derived transgene and accordingly, partial representation of the IGV gene cluster. In addition, these transgenes randomly integrate into the mouse genome and are out of the genomic context in which they naturally reside. As a consequence, they may not have retained authentic transcriptional regulation offered by particularly the 3’ flanking sequences which are known to be important.
Employing gene targeting or recombinase mediated cassette exchange (RMCE), two other IG mouse strains have also been created, which overcome the limitations found in a transgenic model. Regeneron Pharmaceuticals, using conventional gene targeting with BAC based targeting vectors, created the VelocImmune mice with precise replacement of the murine Ighv, d, and j genes with their human orthologs (Macdonald et al., 2014). In another effort, the Kymab group resorted to RMCE and successfully generated a mouse model also with precise replacement of the murine Ighv, d, and j genes with their human syntenic region (Lee et al., 2014). Both efforts involved ferrying the human IGH genes, one BAC vector at a time, into mouse embryonic stem cells, and the entire process took multiple rounds of transfer and years of work. In addition, before being used as a vector, some of the BAC clones needed to be “trimmed” to delete or add genes so that they tile and cover the entire human IGH gene locus.
Therefore, improved techniques are needed so that genomes can be engineered with a better efficiency.
SUMMARY OF THE INVENTION
The invention described herein provides compositions and methods to produce large DNA fragments (e.g., hundreds of kb to a million base pairs) containing a genomic region of interest (GOI) with defined start and end points. For most genome engineering needs, these large DNA fragments (termed “megacircle” herein) can be used to ferry genes or blocks of genes of donor origin to the recipient genome in one or multiple transfers. In one example, the megacircle can be used to transfer a gene cluster from a first/donor organism (such as the human IGH gene cluster) to the cells of a second/recipient organism (such as a model organism like mouse), and replace a target (e.g., the corresponding genomic) region in the second/recipient organism (e.g., murine Igh locus) with a (such as its syntenic) region from the first/donor organism (e.g., human). Any other gene clusters can be used in the invention, including, but not limited to, those for the TCR (T cell receptor) gene cluster, MHC classes 1 and 2, and cytochrome P450 genes. This method of the invention can be used to reduce large scale genome engineering to a few steps, including eliminating the need to amplify large DNA fragments in a prokaryotic (such as bacteria) or eukaryotic (such as yeast) host cell, thus improving efficiency and saving time and cost.
Briefly, the methods of the invention include the production of megacircles, which may comprise the following steps: 1. releasing a genomic DNA comprising the GOI from a host genome; 2. ligating the genomic DNA into a megacircle, with or without another DNA fragment; 3. amplifying the megacircle by rolling circle amplification (RCA), generating double-stranded concatemers; 4. digesting the concatemers to make monomers with compatible sticky overhangs (megastrands); and 5. self-ligating the megastrands to regenerate megacircles.
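Steps 3 to 5 of the outline above can be illustrated with a toy string model. The sketch below is an illustration only, under stated assumptions: the 20-base unit sequence, the use of the EcoRI recognition sequence (GAATTC, cut after the first base) as the single unique site per unit, and the helper names `rca_concatemer`, `digest`, and `is_rotation` are all hypothetical and are not part of the described method. Overhangs and the second strand are ignored.

```python
"""Toy model: RCA concatemer -> digestion at a once-per-unit site -> monomers."""

SITE = "GAATTC"   # assumed unique site (EcoRI recognition sequence)
CUT_OFFSET = 1    # EcoRI cuts after the first G on the top strand

# Hypothetical 20-base "megacircle" unit, written linearly from an arbitrary origin.
UNIT = "CCATGACT" + SITE + "AGGTCA"


def rca_concatemer(circle: str, copies: int) -> str:
    """Model rolling circle amplification as tandem head-to-tail repeats."""
    return circle * copies


def digest(linear: str, site: str, cut_offset: int) -> list[str]:
    """Cut a linear molecule at every occurrence of `site` (top strand only)."""
    pieces, start = [], 0
    pos = linear.find(site)
    while pos != -1:
        cut = pos + cut_offset
        pieces.append(linear[start:cut])
        start = cut
        pos = linear.find(site, pos + 1)
    pieces.append(linear[start:])
    return pieces


def is_rotation(a: str, b: str) -> bool:
    """True if circularizing `a` and `b` would give the same circle."""
    return len(a) == len(b) and a in b + b


concatemer = rca_concatemer(UNIT, copies=5)
pieces = digest(concatemer, SITE, CUT_OFFSET)
monomers = [p for p in pieces if len(p) == len(UNIT)]

# Every full-length monomer re-circularizes to the original unit.
print(len(pieces), len(monomers), all(is_rotation(m, UNIT) for m in monomers))
```

In this toy case, five tandem copies yield four full-length monomers plus two partial end pieces, and each full-length monomer is a rotation of the starting unit, which is why self-ligation of the monomers regenerates the same circle.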
The megacircle or megastrand of the invention can, for example, serve as donor template of homologous recombination (HR) if it contains genomic sequence homologous to a target genomic region in a recipient or host. It could also contain binding site for endonucleases such as CRISPR, TALEN and ZFN.
Thus one aspect of the invention provides a DNA fragment comprising a genomic region of interest (GOI, which can be a single complete or partial gene, or a cluster of related or unrelated genes, non-coding sequence, or any genomic region of interest), wherein the GOI is at least about 20 kb (e.g., 50 kb, 100 kb, 150 kb, 200 kb, 300 kb, 400 kb, 500 kb, 750 kb, 1 Mb, 1.25 Mb, 1.5 Mb, 1.75 Mb, 2 Mb or more), and is flanked by a first unique enzyme cleavage site at the proximal end of the genomic DNA fragment and a second unique enzyme cleavage site at the distal end of the genomic DNA fragment, wherein the DNA fragment further comprises an exogenous sequence (such as a selection marker, a recombinase target site (RTS), or a sequence from a non-donor source) between the proximal end of the DNA fragment and the proximal end of the GOI, or between the distal end of the DNA fragment and the distal end of the GOI; and wherein the DNA fragment is devoid of a sequence required for self-replication in a host cell (e.g., ORI, pBR322, pUC, pSC101, or ARS).
As used herein, “unique” means that the enzyme cleavage site does not exist inside the GOI, so that when an enzyme specific for the unique enzyme cleavage site is used in digestion, the enzyme does not cleave within the GOI. However, the unique enzyme cleavage site may well exist in other parts of the same genome.
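The definition of “unique” above amounts to a simple sequence test: the recognition sequence must not occur within the GOI on either strand. The following is a minimal sketch of that test, assuming exact-match recognition sequences without degenerate bases; the function names and the toy sequences are hypothetical and chosen only for illustration.

```python
"""Check whether a cleavage site is "unique" with respect to a GOI sequence."""

_COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T/N only)."""
    return "".join(_COMPLEMENT[base] for base in reversed(seq.upper()))


def is_unique_for_goi(goi: str, site: str) -> bool:
    """True if `site` occurs nowhere inside `goi` on either strand.

    The site may still occur elsewhere in the genome; only the GOI is checked.
    """
    goi, site = goi.upper(), site.upper()
    return site not in goi and reverse_complement(site) not in goi


# Toy GOI sequences (hypothetical, far shorter than any real GOI).
goi_without_site = "ATGCCGTTAGCAACGTTTGACC"
goi_with_site = "ATGCCGAATTCGCAACGT"          # contains GAATTC
goi_with_site_on_other_strand = "ATGAGACCTTGCA"  # contains GAGACC = revcomp(GGTCTC)

print(is_unique_for_goi(goi_without_site, "GAATTC"))
print(is_unique_for_goi(goi_with_site, "GAATTC"))
print(is_unique_for_goi(goi_with_site_on_other_strand, "GGTCTC"))
```

Checking the reverse complement matters for non-palindromic recognition sequences, where a site present on the opposite strand would still be cleaved.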
In certain embodiments, the first and the second unique enzyme cleavage sites have the same sequence, or are recognized by one (the same) cleavage enzyme, or have sticky ends compatible for ligation.
In certain embodiments, the DNA fragment is a double-stranded circle (e.g., linked at the first and the second unique enzyme cleavage sites) .
In certain embodiments, the DNA fragment further comprises a first positive selection marker cassette located between the RTS and an end of the DNA fragment, wherein the GOI and the positive selection marker cassette flank the RTS.
In certain embodiments, the DNA fragment further comprises a heterotypic recombinase target site (htRTS) , wherein the RTS and the htRTS flank the GOI.
In certain embodiments, the DNA fragment is circular, with a linkage created by linking the first and the second unique enzyme cleavage sites, wherein the linkage is between the positive selection marker cassette and the htRTS (or the RTS) .
In certain embodiments, the DNA fragment is amplified by rolling circle amplification (RCA) .
In certain embodiments, the GOI is immunoglobulin heavy chain gene cluster, immunoglobulin kappa light chain gene cluster, immunoglobulin lambda light chain gene cluster, a TCR (T-cell receptor) gene cluster, an MHC class 1 gene, an MHC class 2 gene, or a cytochrome P450 gene, or any other genomic fragment.
In certain embodiments, the first and/or the second unique enzyme cleavage site(s) are cleavable by a restriction endonuclease, a homing endonuclease (such as I-SceI), a CRISPR/Cas9 nuclease, a CRISPR/Cpf1 nuclease, a TALEN (transcription activator-like effector nuclease), a ZFN (Zinc Finger nuclease), or a Pyrococcus furiosus Argonaute (PfAgo) based artificial restriction enzyme (ARE).
In certain embodiments, one of the first and the second unique enzyme cleavage sites is a natural /pre-existing genomic sequence.
In certain embodiments, the RTS or the htRTS is locus of crossover in P1 (loxP), flippase recognition target (FRT), attP/attB, or mutants thereof, or the recognition site of other site-specific recombinases.
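The site lengths quoted earlier in this document for these recombinase target sites follow from their element layout, as the sketch below works through. It is an illustration under stated assumptions: the `SiteLayout` class is hypothetical; the 8 bps FRT spacer is inferred from the statement that the minimal FRT site is similar in structure to LoxP; the 38 bps total for attP/attB is a sum derived here and is not stated in the text; and the loxP sequence used is the commonly cited wild-type sequence, which is not taken from this document.

```python
"""Element arithmetic for recombinase target sites (lengths in base pairs)."""
from dataclasses import dataclass


@dataclass(frozen=True)
class SiteLayout:
    name: str
    arm_bp: int        # length of each recombinase binding element (RBE)
    spacer_bp: int     # spacer between the two RBEs
    extra_bp: int = 0  # additional bases outside the minimal site

    @property
    def minimal_bp(self) -> int:
        return 2 * self.arm_bp + self.spacer_bp

    @property
    def full_bp(self) -> int:
        return self.minimal_bp + self.extra_bp


LOXP = SiteLayout("loxP", arm_bp=13, spacer_bp=8)
# Full FRT: minimal site plus one extra 13 bps repeat and one separating nucleotide.
FRT = SiteLayout("FRT", arm_bp=13, spacer_bp=8, extra_bp=13 + 1)
ATT = SiteLayout("attP/attB", arm_bp=18, spacer_bp=2)


def reverse_complement(seq: str) -> str:
    comp = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(comp[b] for b in reversed(seq))


# Commonly cited wild-type loxP (assumption: not given in this document).
LOXP_LEFT, LOXP_SPACER, LOXP_RIGHT = "ATAACTTCGTATA", "ATGTATGC", "TATACGAAGTTAT"
LOXP_SEQ = LOXP_LEFT + LOXP_SPACER + LOXP_RIGHT

for site in (LOXP, FRT, ATT):
    print(site.name, site.minimal_bp, site.full_bp)
# The two 13 bps arms are inverted repeats of each other.
print(len(LOXP_SEQ), reverse_complement(LOXP_LEFT) == LOXP_RIGHT)
```

With one recombinase molecule per binding element, two paired sites present four binding elements in total, which matches the four recombinase molecules described for the synapse.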
Another aspect of the invention provides a method of producing the DNA fragment of any one of the subject DNA fragments, the method comprising: (a) inserting, in a host genome comprising said GOI, said first unique enzyme cleavage site and said RTS proximal (or distal) to the GOI, wherein the GOI is proximal (or distal) to the second unique enzyme cleavage site, and wherein an optional first positive selection marker cassette is distal (or proximal) to said first unique enzyme cleavage site and proximal (or distal) to said RTS; (b) digesting the host genome with a first unique enzyme that cleaves said first unique enzyme cleavage site, and a second unique enzyme that cleaves said second unique enzyme cleavage site, to release the DNA fragment comprising the GOI from the host genome; (c) circularizing the released DNA fragment to form a megacircle, by ligating the ends of said released DNA fragment generated by cleavage by said first unique enzyme and said second unique enzyme, under conditions that promote intramolecular self-ligation; (d) amplifying said megacircle by rolling circle amplification (RCA) with a primer (or multiple primers) unique to said GOI.
In certain embodiments, steps (a) and (b) can be done simultaneously, or sequentially in either order.
In certain embodiments, the second unique enzyme cleavage site is naturally existing sequence. Note, however, in a similar aspect of the invention, a method is contemplated in which in step (a) , two naturally existing unique enzyme cleavage sites flank the GOI.
In certain embodiments, step (a) further comprises inserting a 3rd unique enzyme cleavage site between the first unique enzyme cleavage site and the RTS.
In certain embodiments, the second unique enzyme cleavage site, together with a second positive selection marker cassette and htRTS, are inserted into said host genome distal (or proximal) to the GOI; preferably, the second unique enzyme cleavage site is distal (or proximal) to the htRTS and proximal (or distal) to the second positive selection marker cassette.
Another aspect of the invention provides a method of producing the DNA fragment of any one of the subject DNA fragments, the method comprising: (a) removing /retrieving a genomic DNA fragment comprising said GOI from a host genome, through digesting the host genome with a first enzyme specific for a first unique enzyme cleavage site and a second enzyme specific for a second unique enzyme cleavage site, wherein said first and said second unique enzyme cleavage sites are not compatible for ligation with each other; (b) ligating the retrieved genomic DNA fragment in step (a) with a linear recombinant DNA construct to produce a megacircle, wherein said linear recombinant DNA construct comprises a positive selection marker cassette flanked by a pair of heterotypic RTSs or flanked by a RTS and a 3rd unique enzyme cleavage site, and wherein the ends of the linear recombinant DNA construct are defined by said first and said second unique enzyme cleavage sites, respectively; (c) amplifying said megacircle by rolling circle amplification (RCA) with a primer unique to said GOI.
In certain embodiments, said first and said second unique enzyme cleavage sites are the same, and said first and said second unique enzymes are the same.
In certain embodiments, the method further comprises, before amplification by RCA, treating the ligation mixture with an exonuclease to eliminate linear DNA fragments.
In certain embodiments, RCA is performed using phi29 (Φ29) DNA polymerase, and one or more primers each specific for the GOI.
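As a rough check on the timing quoted earlier in this document for phi29 DNA polymerase (about 50 nucleotides per second, about 10,000 seconds for a 500 kb fragment), the arithmetic can be sketched as follows. The function names are hypothetical, and the expected-error estimate assumes that the quoted error rate of 1/10^6 to 1/10^7 applies per nucleotide synthesized and that errors are independent.

```python
"""Back-of-the-envelope estimates for one round of RCA synthesis."""


def synthesis_time_s(template_bp: int, rate_nt_per_s: float = 50.0) -> float:
    """Seconds needed to copy one strand of a template at a constant rate."""
    return template_bp / rate_nt_per_s


def expected_errors(template_bp: int, errors_per_million_nt: float) -> float:
    """Expected misincorporations per copy, given errors per 10^6 nucleotides."""
    return template_bp * errors_per_million_nt / 1_000_000


template_bp = 500_000  # the 500 kb example from the text
seconds = synthesis_time_s(template_bp)
hours = seconds / 3600

print(f"{seconds:.0f} s, or about {hours:.1f} h per round")
print(f"expected errors per copy at 1/10^6: {expected_errors(template_bp, 1.0):.2f}")
print(f"expected errors per copy at 1/10^7: {expected_errors(template_bp, 0.1):.2f}")
```

Under these assumptions, a single 500 kb copy would carry on average well under one error, which is consistent with using the amplified product directly as donor DNA.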
In certain embodiments, RCA comprises multiple displacement amplification (MDA) .
In certain embodiments, the method further comprises resolving the RCA concatemer product, preferably with an enzyme specific for said 3rd unique enzyme cleavage site (if present), or with said first unique enzyme or said second unique enzyme.
In certain embodiments, the method further comprises self-ligating digested RCA concatemer product under conditions favoring the formation of intramolecular self-ligation (e.g., low concentration) .
In certain embodiments, the conditions comprise emulsifying the digested RCA concatemer product to form single oil droplets, each comprising no more than one linear DNA fragment, to promote self-ligation.
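How dilute the DNA must be for most occupied droplets to hold a single fragment can be estimated with a simple loading model. The sketch below assumes that fragments are distributed among droplets independently, so that the number per droplet follows a Poisson distribution; this statistical model and the function name are assumptions introduced here for illustration and are not stated in the text.

```python
"""Poisson estimate of single-molecule occupancy in emulsion droplets."""
import math


def fraction_single_occupancy(mean_per_droplet: float) -> float:
    """Among droplets holding at least one fragment, the fraction holding exactly one.

    P(k = 1) / P(k >= 1) for a Poisson distribution with the given mean.
    """
    if mean_per_droplet <= 0:
        raise ValueError("mean_per_droplet must be positive")
    p_empty = math.exp(-mean_per_droplet)
    p_single = mean_per_droplet * p_empty
    return p_single / (1.0 - p_empty)


for mean in (1.0, 0.3, 0.1, 0.01):
    frac = fraction_single_occupancy(mean)
    print(f"mean {mean:>4} fragments/droplet -> {frac:.1%} of occupied droplets are single")
```

Under this model, lowering the average load per droplet increases the share of occupied droplets in which only intramolecular ligation is possible, at the cost of leaving more droplets empty.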
In certain embodiments, the method further comprises treating ligation mixture of the digested RCA product with an exonuclease to eliminate linear DNA fragments.
In certain embodiments, step (a) is carried out by nuclease mediated homology directed repair (HDR) , or homologous recombination (HR) , or a combination thereof.
Another aspect of the invention provides a method of replacing a host (e.g., mouse) genomic region in a host genome with a (syntenic) DNA fragment from a donor (e.g., human) genome, the method comprising: (a) replacing the host genomic region with a pair of heterotypic RTSs (e.g., via nuclease enabled HDR, or HR, or a combination thereof), wherein said pair of heterotypic RTSs flank an optional positive/negative selection cassette (such as hygroTK selection cassette); (b) providing the DNA fragment of any one of claims 1-11 from the donor genome, and allowing the DNA fragment from the donor genome to integrate into the host genome through one of said pair of heterotypic RTSs in the presence of site specific recombinase specific for said one of said pair of heterotypic RTSs; (c) optionally, allowing deletion of said positive/negative selection cassette (if present) through the other of said pair of homotypic RTSs.
Another aspect of the invention provides a method of replacing a host (e.g., mouse) genomic region from a host genome with a (syntenic) DNA fragment from a donor (e.g., human) genome, the method comprising: (a) inserting a single RTS, along with an optional positive/negative selection cassette (such as hygroTK selection cassette) , proximal (or distal) to the host genomic region (e.g., via nuclease enabled HDR, or HR, or a combination thereof) ; (b) providing the DNA fragment of any one of claims 1-11 from the donor genome, and allowing the DNA fragment from the donor genome to integrate into the host genome through said RTS in the presence of site specific recombinase specific for said RTS; (c) optionally, allowing deletion of said positive/negative selection cassette (if present) and the host genomic region through nuclease mediated NHEJ.
In certain embodiments, said host genomic region is deleted prior to step (b) .
In certain embodiments, the method is carried out in a zygote, an oocyte, a sperm cell (spermatogonial stem cell line) , or an ES cell of the host, preferably by microinjecting, electroporating, or transfecting exogenous components (e.g., CRISPR/Cas9 and guide RNAs targeting said first and said second unique enzyme cleavage sites; megacircles; recombinase protein or coding sequence thereof) .
In certain embodiments, the positive/negative selection marker cassette comprises a (neomycin/puromycin/hygromycin/blasticidin/zeocin) resistance/TK (or HPRT) gene under the expression control of a eukaryotic promoter (such as the PGK promoter) and a polyA coding sequence.
Another aspect of the invention provides a mouse generated by any one of the subject methods.
Another aspect of the invention provides a mouse comprising in its genome an exogenous genomic DNA from a donor (e.g., a human, a mammal, or different mouse strain) , wherein the exogenous genomic DNA comprises a polymorphism within the species of the donor, and wherein the exogenous genomic DNA comprises any one of the subject DNA fragments.
In certain embodiments, the exogenous genomic DNA comprises an immunoglobulin heavy chain gene cluster, immunoglobulin kappa light chain gene cluster, immunoglobulin lambda chain gene cluster, a TCR, an MHC class 1 gene, an MHC class 2 gene, or a cytochrome P450 gene.
It should be understood that any one embodiment described herein, including those only described under one aspect of the invention, can be combined with any one or more other embodiments, unless explicitly disclaimed or improper.
BRIEF DESCRIPTION OF THE DRAWINGS
FIG. 1 is a schematic drawing showing a first strategy (“Strategy 1”) for generating one embodiment of the megacircle of the invention by rolling circle amplification (RCA). The generated megacircle can be used in RMCE. Step 1: A pair of unique enzyme cleavage sites, including restriction enzyme (RE) sites or protospacer/PAM sequences for a nuclease, such as that for Cpf1, is identified as flanking the genomic region of interest/gene of interest (collectively “GOI”). Sometimes two unique enzyme cleavage sites can be identified for certain GOI. Human genomic DNA is isolated and cleaved with the nuclease or restriction enzyme (RE) to release the genomic DNA fragment from its chromosomal location. The two sites are chosen such that they are incompatible with each other, which precludes self-ligation of the linear DNA fragment into a circle in a ligation reaction. Concurrently, a recombinant DNA construct is created to carry a positive selection marker cassette flanked by a pair of heterotypic recombinase target sites (RTSs). In addition, it may carry a pair of protospacer/PAM sequences or restriction sites matching those flanking the GOI as depicted. When cleaved with the corresponding nuclease or RE, the recombinant DNA construct carries incompatible sticky ends in cis, but in trans, the sticky ends perfectly match those flanking the GOI. These two species, the digested linear genomic DNA and the recombinant DNA construct, are brought into a ligation reaction to be ligated into a “megacircle.” Step 2: The megacircle DNA serving as a template is amplified through RCA so that a large quantity of the megacircle can be generated. Upon completion of RCA, individual megacircle units are released from the linear concatemer by digestion with an enzyme specific for a unique enzyme cleavage site, such as an I-SceI site that had been incorporated into the recombinant DNA construct earlier. The resolved RCA products are then self-ligated to re-form megacircles, which can be used in RMCE.
FIG. 2 is a schematic drawing showing a second strategy (“Strategy 2”) for generating one embodiment of the megacircle of the invention by rolling circle amplification (RCA). The generated megacircle can be used in RMCE. Step 1: A positive selection cassette (+a), along with an RTS site and an I-SceI site, is inserted into the proximal end of the GOI by homologous recombination (HR), nuclease mediated homology directed repair (HDR), or a combination of both. It is followed by a second targeting event, inserting the second selection cassette (+b), along with the heterotypic RTS site and a second I-SceI site, into the distal end of the GOI. Step 2: Genomic DNA is isolated, and the genomic DNA fragment carrying the GOI, flanked by the positive selection cassette and an RTS site at the proximal end and a heterotypic RTS site at the distal end, is released from its chromosomal location by I-SceI digestion. This DNA fragment is self-ligated (compatible sticky ends) to form the megacircle molecule. Step 3: The megacircle is amplified by RCA, using a primer (or multiple primers) unique to the GOI. Upon completion of the RCA, the RCA concatemer is resolved to release individual “megastrand” units, each of which can self-ligate back into a megacircle for use in RMCE.
FIG. 3 is a schematic drawing showing the RMCE reaction using the subject megacircles described in FIGs. 1 and 2. Step 1: A pair of heterotypic RTS sites is inserted into a target chromosomal location, replacing the murine syntenic region (B), either by HR, nuclease mediated HDR, or a combination of both. An optional positive/negative selection cassette is incorporated to allow selection and enrichment of the successful recombinant. Step 2A: The human GOI, carried in a megacircle, is integrated into the genomic locus by recombinase mediated integration. Integration can occur through either of the two pairs of homotypic sites in trans, each generating a different intermediate product. Step 2B: The selection cassette is excised from the locus when the other pair of homotypic RTS sites in cis, as compared to the pair used in Step 2A, is employed, generating a mutant allele in which the GOI replaces the selection cassette. Note that in Step 2B, although the other homotypic pair in cis could also in theory recombine and regenerate the two substrates that started the RMCE reaction, this reaction is not favored/efficient, as the two sites are separated by the large GOI.
FIG. 4A shows the outcome of the murine Ighv, d, j & c genes when RMCE is employed to generate mice carrying a complete repertoire of the human IGHV, D & J genes.
FIG. 4B shows the human IGHV, D & J genes included for humanization when RMCE is employed to generate mice carrying a complete repertoire of the human IGHV, D & J genes.
FIG. 4C shows the overall flow scheme when RMCE is employed (with Cre/Lox as an example) to generate mice carrying a complete repertoire of the human IGHV, D & J genes.
FIG. 5 is a schematic drawing showing a first strategy (“Strategy 1”) for generating one embodiment of the megacircle of the invention by rolling circle amplification (RCA). The generated megacircle can be used in recombinase mediated integration. This is similar to the flow scheme outlined in FIG. 1, but with only one RTS site incorporated into the megacircle.
FIG. 6 is a schematic drawing showing a second strategy (“Strategy 2”) for generating one embodiment of the megacircle of the invention by rolling circle amplification (RCA). The generated megacircle can be used in recombinase mediated integration. This is similar to the flow scheme outlined in FIG. 2, but with only one RTS site incorporated into the megacircle. In Step 1, only the proximal end of the GOI is targeted, and an optional selection cassette, along with an RTS site and an I-SceI site, is incorporated.
FIG. 7 shows recombinase mediated integration. Step 1: A single RTS site is inserted into the chromosomal location, along with a positive (+a)/negative (-) selection cassette, either by HR, nuclease mediated HDR, or a combination of both. Step 2: The human GOI, carried in a megacircle, is integrated into the genomic locus, mediated by the corresponding site specific recombinase (SSR). The positive selection cassette (+b) is different from the selection marker gene (+a) in Step 1. Step 3: The murine syntenic sequence (B), along with the two selection cassettes, is eliminated from the targeted allele by non-homologous end joining (NHEJ). The mutant allele now carries the human GOI replacing the murine syntenic sequence (B).
FIG. 8A shows the outcome of the murine Ighv, d, j & c genes when recombinase mediated integration is employed to generate mice carrying a complete repertoire of the human IGHV, D & J genes.
FIG. 8B shows the human IGHV, D & J genes included for humanization when recombinase mediated integration is employed to generate mice carrying a complete repertoire of the human IGHV, D & J genes.
FIG. 8C shows the overall flow scheme when recombinase mediated integration is employed (with Cre/Lox as an example) to generate mice carrying a complete repertoire of the human IGHV, D & J genes.
FIG. 9 shows overall activity for recombinase mediated integration approach.
DETAILED DESCRIPTION OF THE INVENTION
1. Introduction
The invention described herein provides a donor DNA molecule (“linear or circular DNA fragment,” also known as megastrand or megacircle, respectively) that satisfies several criteria: first, it can be large enough to accommodate an entire gene or gene cluster (in the hundreds of kb to megabase pair range), though it can be as small as a partial gene or certain non-coding sequence; second, it has a pre-determined start and end (e.g., proximal and distal ends in relation to the centromere of the chromosome from which the DNA fragment originates), marking precisely the boundary of the gene or gene cluster. In addition, such a donor DNA molecule contains additional exogenous sequence, such as a recombinase target sequence (RTS), and preferably, the DNA fragment is devoid of a sequence sufficient for or required for self-replication in a host cell (e.g., the ORI sequence in plasmids etc. required for DNA self-replication in a host cell, pBR322, pUC, pSC101, or ARS).
The megastrand or megacircle of the invention can be used in numerous settings, including (but not limited to) humanizing large sections of a mouse genome with donor DNA from a single individual (such as a single human), as opposed to (smaller) donor DNAs from multiple individuals.
Aspects of the invention are described in further detail below.
2. Megacircle Generation
The invention described herein provides a method that generates a large genomic DNA fragment called a “megastrand,” and when circularized, a “megacircle,” which typically has a size in the range of hundreds of kb to megabase pairs (FIGs. 1, 2, 5, & 6). According to this method, for any specific genomic DNA fragment of interest, a megastrand or megacircle can be generated in vitro in large quantity, free of a prokaryotic or eukaryotic vector backbone sequence, and with no need to propagate it in bacterial, yeast, or other host cells.
As outlined in FIGs. 1 and 5, for the genomic DNA sequence identified, appropriate restriction sites flanking the fragment are identified. The two sites should be different so as to preclude head-to-tail self-ligation of the genomic fragment in a ligation reaction once it is released from its genomic location. Genomic DNA is then isolated and cleaved with the corresponding RE to generate the genomic fragment. Alternatively, a site specific nuclease, such as Cas9 and Cas9 nickase (Jinek et al., 2012, incorporated by reference), Cpf1 (Zetsche et al., 2015, incorporated by reference), or PfAgo (Enghiad and Zhao, 2017, ACS Synth. Biol., 6 (5), pp 752-757, incorporated herein by reference, which describes a Pyrococcus furiosus Argonaute (PfAgo) based platform for generating artificial restriction enzymes (AREs) capable of recognizing and cleaving DNA sequences at virtually any arbitrary site and generating defined sticky ends of varying length), can be used to specifically release the target genomic DNA fragment from the chromosome. Concurrently, a recombinant DNA construct is synthesized, carrying a selection marker gene flanked by restriction sites or nuclease cleavage sites matching those of the genomic fragment as depicted. The two species, the genomic DNA fragment and the recombinant DNA carrying matching ends, are then ligated to form a circle (megacircle). To enrich for the megacircle, the ligation mixture can be treated with exonuclease to eliminate all fragments that remain linear. While circles may be formed from all genomic fragments bearing compatible sites (in the case of using CRISPR or PfAgo, the circle should be more specific), amplification of the target circle can be accomplished using RCA (Ali et al., 2014). If needed, megastrand DNA can also be enriched by affinity purification or separation by pulsed field gel electrophoresis before being used in a ligation reaction.
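The end-compatibility requirement described above (ends that are incompatible in cis but matching in trans) can be illustrated with a minimal Python sketch. This is a toy model only: the `Fragment` type and the overhang labels "A" and "B" are hypothetical placeholders chosen for illustration, not actual enzyme overhangs from this disclosure, and strand orientation is ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fragment:
    """A linear DNA fragment described only by the labels of its two sticky ends."""
    name: str
    left_end: str   # hypothetical label for the overhang at the proximal end
    right_end: str  # hypothetical label for the overhang at the distal end


def can_self_ligate(fragment: Fragment) -> bool:
    """Head-to-tail self-ligation requires the fragment's own two ends to match."""
    return fragment.left_end == fragment.right_end


def can_form_two_part_circle(a: Fragment, b: Fragment) -> bool:
    """A two-species circle closes only if both junctions (a->b and b->a) match."""
    return a.right_end == b.left_end and b.right_end == a.left_end


# Illustrative labels: the GOI is released with end "A" on one side and "B" on the other;
# the recombinant construct is cut to expose the matching ends in the opposite order.
goi = Fragment("GOI", left_end="A", right_end="B")
construct = Fragment("RTS/selection construct", left_end="B", right_end="A")

print(can_self_ligate(goi))                       # neither species closes on itself
print(can_self_ligate(construct))
print(can_form_two_part_circle(goi, construct))   # together they can close a circle
```

In this simplified picture, neither species can circularize alone, while the pair can close into a single circle, which is the property the two-species ligation of FIGs. 1 and 5 relies on.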
In an RCA reaction, phi29 (φ29) DNA polymerase (Blanco et al., 1989; Blanco and Salas, 1984) is commonly used, and one or multiple primers specific for the target genomic region is/are provided in the reaction. Considering that the rate of synthesis of phi29 DNA polymerase is about 50 nucleotides per second, for a 500 kb fragment, it takes about 10,000 seconds (or about 3 hours) to complete one round of synthesis for a single strand. However, in some embodiments, using a multiple displacement amplification (MDA) strategy, about 100 or more oligonucleotides can be made to hybridize to one strand and to tile the entire length of the megacircle template. In the same reaction, another 100 or more oligonucleotides are provided targeting the other strand. After 30 hours, minimally, the target fragment will have been amplified 1,000 times, as linear double-stranded concatemers. The RCA product can then be digested with an enzyme, such as a restriction enzyme which cuts once per unit of the concatemer, and the individual units ligated under low substrate concentration to form the circle. With the high fidelity of phi29 DNA polymerase (error rate 1/10^6 to 1/10^7), this process potentially can be repeated multiple times. When the target circle is sufficiently enriched, random hexamers can also be used for additional amplification with higher efficiency. To improve efficiency, exonuclease-resistant primers or random-hexamer primers with thiophosphate linkages for the two 3’ terminal nucleotides can be used in the RCA reaction.
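The arithmetic behind the synthesis-time estimate and the fidelity figure can be checked with a short Python sketch. It uses only the numbers stated above (500 kb template, about 50 nucleotides per second, error rate between 1/10^6 and 1/10^7); the function names are illustrative, and the calculation assumes a single polymerase copying one strand at a constant rate.

```python
def synthesis_time_s(template_bp: int, rate_nt_per_s: float) -> float:
    """Seconds for one polymerase to copy one strand of the template once."""
    return template_bp / rate_nt_per_s


def expected_errors(template_bp: int, error_rate_per_nt: float) -> float:
    """Expected number of misincorporations in one copy of one strand."""
    return template_bp * error_rate_per_nt


TEMPLATE_BP = 500_000   # 500 kb megacircle, as in the text
RATE_NT_PER_S = 50.0    # approximate phi29 synthesis rate, as in the text

seconds = synthesis_time_s(TEMPLATE_BP, RATE_NT_PER_S)
hours = seconds / 3600
print(f"one round of synthesis: {seconds:.0f} s (about {hours:.1f} h)")

for error_rate in (1e-6, 1e-7):
    errors = expected_errors(TEMPLATE_BP, error_rate)
    print(f"error rate {error_rate:g}: about {errors:.2f} expected errors per 500 kb copy")
```

Under these assumptions one round takes 10,000 seconds (roughly 2.8 hours), and the expected number of misincorporations per 500 kb copy is below one, which is consistent with the statement that the amplify, digest, and re-ligate cycle can potentially be repeated.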
Alternatively, the GOI can also be flanked with I-SceI sites (FIG. 2) or with an engineered nuclease cleavage site incorporated into one end of the GOI, matching the naturally occurring site at the other end of the GOI (FIG. 6). These strategies require the use of HR, nuclease mediated HDR, or a combination of both to insert the selection cassette into an endogenous locus. However, the advantage is that they allow head-to-tail self-ligation of the megastrand DNA in cis to form a circle, as compared with the strategies outlined in FIGs. 1 and 5, which require ligation between two molecular species. After the megacircle is obtained and subjected to multiple rounds of RCA amplification, the final product is digested and self-ligated to form a megacircle. In the case that the self-ligation efficiency is low, an emulsion can be used to separate each molecule into a single oil droplet to promote intramolecular ligation. Any linear fragment that remains will be eliminated by exonuclease treatment, improving the relative abundance of megacircle.
In the case that the final concentration of megacircle is low, to support recombinase mediated integration or RMCE, the megacircle preparation can be injected directly into the nucleus of mouse embryonic stem cells, as compared to delivery through electroporation, which may require a higher concentration. If mouse strains carrying the landing pad have been created and zygotes are available, the megacircle can be delivered directly into the zygotes, along with the recombinase, to elicit RMCE or recombinase mediated integration.
The strategies outlined in FIGs. 1 and 5 are particularly appealing, as they do not require targeting of the human chromosome in human ES or iPS cells; these two human cell types are not known for high targeting efficiency and may be challenging to work with.
In addition to being generated through RCA, the megacircle may also be assembled by Gibson assembly or synthesized de novo.
3. Employing RMCE for Large Scale Genome Engineering
HR is a naturally occurring mechanism that repairs DNA damage and exchanges DNA content between sister chromatids during mitosis and meiosis. By exploiting this mechanism, precise changes can be introduced into the genome of eukaryotic cells (Doetschman et al., 1987; Thomas and Capecchi, 1987). Although powerful, it is a laborious and time-consuming process, more so when it comes to large scale genome engineering. For example, to replace the murine Igh locus with its human syntenic region (940 kb), using BAC targeting vectors, it took 9 rounds of manipulation in mouse ESCs and years of work to complete the project (Macdonald et al., 2014).
Over the years, another line of work explored RMCE to complement HR and to facilitate large scale genome engineering (Hasegawa et al., 2011; Wallace et al., 2007). By using HR to insert a “landing pad” into the Igh locus and employing iterative rounds of RMCE, the Kymab group was able to accomplish a humanization effort similar to that of the Regeneron VelocImmune mice while reducing the mouse ESC manipulations to 5 rounds (Lee et al., 2014). An added advantage of the RMCE based approach is the much higher targeting efficiency (35% overall for RMCE versus less than 1% for HR, using BAC based targeting vectors), which should translate into less screening work and a better timeline. The limiting factor in the case of RMCE, as described earlier, is the BAC vector employed. With a carrying capacity of 150-350 kb per targeting vector, compared with the 1.0 Mb of human IGHV, D, and J genes that need to be transferred, multiple rounds of transfer are necessary.
An SSR binds to a short stretch of sequence, the RTS, and promotes strand exchange between the two substrates to form two strand-exchanged products (O′Gorman et al., 1991; Sauer and Henderson, 1988; Thorpe and Smith, 1998; Thyagarajan et al., 2001). For Cre recombinase, the target site (LoxP) is 34 bp in length, consisting of two 13 bp palindromic repeats separated by an 8 bp spacer sequence. One recombinase molecule binds to one 13 bp repeat sequence and as such, each 13 bp repeat constitutes a recombinase binding element (RBE). For Flp, the minimal target site (FRT) is similar in structure to the LoxP site for Cre recombinase, although the 13 bp repeats are not strictly palindromic but differ by one base pair. Also, the full FRT site has an additional 13 bp repeat, located 5’ to and separated from the minimal sequence by one additional nucleotide, for a total of 48 bp. This segment is not needed for excision but is essential for integration, including RMCE. For φC31, the target sites are called attP and attB. Each site consists of two 18 bp binding elements separated by a two-nucleotide spacer sequence, and the two 18 bp binding elements are not particularly palindromic. For the two RTS sites that are brought together to recombine, there are 4 recombinase molecules involved, each binding to one RBE, forming a synapse (Van Duyne, 2001).
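The site architecture described above can be checked with a brief Python sketch. The wild-type LoxP sequence used here is the commonly published 34 bp sequence and is not recited in this disclosure; the helper names are illustrative. The sketch splits the site into its 13 bp arm, 8 bp spacer, and 13 bp arm, confirms that the two arms are inverted repeats of one another, and reproduces the site lengths stated in the text.

```python
# Commonly published wild-type LoxP sequence (not taken from this document).
LOXP = "ATAACTTCGTATAGCATACATTATACGAAGTTAT"

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def split_site(site: str, arm_len: int, spacer_len: int) -> tuple[str, str, str]:
    """Split a recombinase target site into (left arm, spacer, right arm)."""
    left = site[:arm_len]
    spacer = site[arm_len:arm_len + spacer_len]
    right = site[arm_len + spacer_len:]
    return left, spacer, right


left_arm, spacer, right_arm = split_site(LOXP, arm_len=13, spacer_len=8)
print(left_arm, spacer, right_arm)

# The two 13 bp recombinase binding elements are inverted repeats of each other.
print(reverse_complement(left_arm) == right_arm)

# Site lengths implied by the element sizes given in the text.
loxp_len = 13 + 8 + 13            # two arms plus spacer
full_frt_len = 13 + 1 + 13 + 8 + 13  # extra arm, one nucleotide, then the minimal site
att_len = 18 + 2 + 18             # two binding elements plus spacer
print(loxp_len, full_frt_len, att_len)
```

The asymmetric 8 bp spacer is what gives the site its orientation; heterotypic variants such as Lox5171 differ from LoxP within this spacer, which is why the sketch isolates it as a separate element.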
RMCE was first described more than two decades ago (Schlake and Bode, 1994) (FIG. 3). To start the process, two RTS sites that do not recombine with each other (heterotypic), often because of mutations introduced into the spacer sequence, are made to flank a selection cassette and are integrated into the genomic locus, often by gene targeting. This serves as the “landing pad” onto which another cassette can land and with which it can swap. To swap, a donor plasmid, or megacircle in our case, carrying a matching set of heterotypic RTS sites is then introduced. In this scheme, although the two RTS sites in cis are heterotypic, each of the two pairs of RTS sites in trans is homotypic (the same). Upon exposure to the corresponding recombinase, the two homotypic pairs of RTS sites synapse and strand exchange occurs such that the selection cassette located in the landing pad is replaced by the GOI from the donor plasmid. Employing RMCE, a transgene can be inserted into any genomic locus that has been engineered to carry the RTS sites. Compared with conventional transgenesis, insertion of a transgene into a safe harbor site is preferred, as performance of the transgene is more predictable. This is because the transgene can be inserted into a genomic locus that is known to support expression of the transgene and it can be done such that only one copy of the transgene is incorporated. It is particularly convenient if a strain that carries a landing pad in a safe harbor is available, so that for new needs, swapping can be arranged for incorporation of the GOI. The other advantage is the higher targeting efficiency of RMCE, as compared to a direct gene targeting effort, particularly for a larger GOI, as mentioned earlier.
RMCE is a two-step process (Malchin et al., 2010; Schlake and Bode, 1994; Takata et al., 2011) (FIG. 3). The first step, by nature, is recombinase mediated integration. It occurs in trans, between a pair of homotypic RTS sites. In this process, the circular donor DNA is integrated into the recipient site, creating an intermediate product. Depending on the pair of homotypic sites involved, two different intermediate products are created, which then serve as the substrates for the next step. The second step in an RMCE reaction is recombinase mediated excision. It occurs in cis, between the two homotypic recognition sites now brought in cis. If it is the same pair as used in step 1 (integration), it reproduces the two substrates that started the RMCE reaction. However, if it is the other pair, two products will be produced, with one being the desired product, now with the selection marker gene replaced with the GOI. As the enzyme is the same for the two steps, RMCE should progress seamlessly from step 1 to step 2, or in the reverse direction, reaching equilibrium as defined by the reaction.
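The two-step logic described above can be traced with a small Python sketch that treats the chromosome and the donor circle as ordered lists of named elements. This is a simplified bookkeeping model, not a description of the enzymology: site orientation and reaction efficiency are ignored, and the element names ("loxP", "lox5171", "selection", "GOI", and the flank labels) are illustrative placeholders.

```python
def integrate(chromosome: list[str], circle: list[str], site: str) -> list[str]:
    """Step 1 (in trans): insert a circular donor into the chromosome at a shared site.

    The circle is opened at `site` and spliced in, leaving two copies of `site`
    flanking the inserted donor content.
    """
    i = chromosome.index(site)
    j = circle.index(site)
    opened = circle[j:] + circle[:j]   # rotate the circle so it starts at the site
    return chromosome[:i] + opened + chromosome[i:]


def excise(chromosome: list[str], site: str) -> tuple[list[str], list[str]]:
    """Step 2 (in cis): recombine two copies of `site`, releasing the DNA between them.

    Returns (remaining chromosome, excised circle).
    """
    first = chromosome.index(site)
    second = chromosome.index(site, first + 1)
    excised_circle = chromosome[first:second]
    remaining = chromosome[:first] + chromosome[second:]
    return remaining, excised_circle


landing_pad = ["5'flank", "loxP", "selection", "lox5171", "3'flank"]
megacircle = ["loxP", "GOI", "lox5171"]

# Integration through the loxP pair gives one of the two possible intermediates.
intermediate = integrate(landing_pad, megacircle, "loxP")
print(intermediate)

# Excision through the other pair (lox5171) yields the desired exchanged allele.
product, byproduct = excise(intermediate, "lox5171")
print(product)
print(byproduct)

# Excision through the same pair (loxP) simply regenerates the starting substrates.
reverted, released = excise(intermediate, "loxP")
print(reverted == landing_pad, released == megacircle)
```

In this model, integration through either homotypic pair followed by excision through the other pair leads to the same exchanged allele, whereas excision through the same pair returns the original landing pad and donor circle, mirroring the forward and reverse paths described in the text.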
Based on this understanding, in the scenario of replacing a murine genomic region with its human syntenic sequence by RMCE or recombinase mediated integration, the donor needs to be a circular molecule so it can be integrated into the genomic locus and it has to be large enough to accommodate a gene or gene cluster. These two criteria can be met with our megacircle as described above.
For other utilities, a megastrand can be produced by digesting the concatemer produced from an RCA reaction with the appropriate restriction enzyme (I-SceI in FIGs. 1, 2, 5 & 6). It can then be used directly as a large transgene or modified further for gene targeting in an HR reaction or nuclease mediated HDR.
4. Humanization of the Immunoglobulin Heavy Chain Locus
Two different ways, RMCE and recombinase mediated integration, can be used to replace the murine Ighv, d, and j genes with their human ortholog genes, using megacircle as the vehicle to transfer human syntenic sequence.
FIGs. 4A and 4B depict the fate of the murine Igh and human IGH genes (Walter et al., 1990), alleles, and technical elements incorporated in the process to support an RMCE scheme. As illustrated in FIG. 4A, for convenience, the murine Igh locus can be divided into 4 blocks. Block 1 is 2.5 Mb and marked by Ighv1-86 at the very 5’ end and the Ighv5-1 gene at the 3’ end. This block is deleted employing, for example, CRISPR mediated NHEJ mutations; block 2 is 91 kb in size and includes the Adam6a and Adam6b genes, which are retained, as these two genes are important for sperm fertility; block 3 is 54 kb in size and starts with the Ighd1-1 gene and ends with the Ighj4 gene. This region is deleted and replaced with the PGK/HygroTK type +/- cassette, bringing along the heterotypic LoxP/Lox5171 sites in preparation for RMCE; block 4 carries the Igh enhancer sequence, switch region, and constant region genes in their natural configuration and is left unperturbed. To modify the murine Igh allele in preparation for acceptance of the human IGH genes, a positive/negative selection cassette (Hygro/TK), along with the LoxP/Lox5171 sites, is inserted into the murine Igh locus, replacing 54 kb of the murine Igh genes from Ighd1-1 to Ighj4. This can be accomplished by HR, nuclease mediated HDR, or a combination of both, in mouse embryonic stem cells.
For convenience, the human IGH locus can also be divided into 3 blocks (FIG. 4B). Block 1 starts with the IGHV(III)-82 gene and ends with the IGHV(II)-74-1 gene; block 2 is 948 kb in size and marked by IGHV3-74 at the 5’ end and IGHJ6 at the 3’ end. Genes in this block are transferred to and replace the murine Igh locus; block 3 starts with the enhancer sequence and extends to the rest of the human chromosome 14 in its natural configuration. To modify the allele in preparation for megacircle generation, gene targeting is performed to insert a LoxP site and an I-SceI site between blocks 1 and 2, along with a positive selection cassette. When targeted clones are identified, they can be driven to homozygosity by a “loss of heterozygosity” protocol, when cultured in a high concentration of G418. After homozygous clones are obtained, they can be modified further to carry the Lox5171 site, along with an I-SceI site and a positive selection cassette, between blocks 2 and 3, by HR, nuclease mediated HDR, or a combination of both. The IGH allele with both LoxP and Lox5171 sites incorporated and flanked by the I-SceI sites can then be digested with I-SceI enzyme and the megastrand released. Upon circularization, the megacircle can be used as the template for megacircle amplification as outlined in FIG. 2. Alternatively, a megacircle can be prepared as outlined in FIG. 1, without gene targeting in human cells.
FIG. 4C illustrates how this scheme is carried out. When a sufficient quantity of megacircles is obtained, they can be electroporated into mouse ESCs that have been engineered to carry the LoxP/Lox5171 sites at the murine Igh locus, along with Cre recombinase, to initiate RMCE. By RMCE, the human IGH sequence is incorporated into the murine Igh locus. If desired, the 3’ Lox5171 site can be flanked by PiggyBac transposon sequences and, when exposed to PiggyBac transposase, the transposon is eliminated, along with the Lox5171 site. This leaves no “footprint” of genome engineering in the region between the human IGH and murine constant region genes. At the 5’ end of the recombined allele, there is a single LoxP site remaining between the murine Adam6b and the IGHV3-74 genes.
A megacircle can also be engineered to support recombinase mediated integration as outlined in FIG. 7. To use this scheme to incorporate the human IGH genes into the murine Igh locus, a detailed plan is illustrated in FIGs. 8A, 8B, and 8C. This approach may be technically less challenging, as it requires only one integration event. The murine Igh locus is modified to carry the single RTS (FIG. 8A). To release the IGH genomic fragment from human chromosome 14, a Cpf1 protospacer/PAM sequence is first identified marking the 3’ end of IGH block 2. This same Cpf1 protospacer/PAM sequence is then incorporated into the targeting vector and inserted into the region between blocks 1 and 2, as illustrated in FIG. 8B. If possible, restriction sites flanking block 2 can also be identified and used. As shown in FIG. 8C, the product from recombinase mediated integration carries the two selection cassettes 3’ to the IGH genes, which can be deleted using nuclease mediated NHEJ mutations flanking the region. As an alternative, a megacircle prepared according to FIG. 5 can also be used for recombinase mediated integration to insert the human IGH genes into the murine locus. In this case, a megacircle is generated by ligating the human IGH genomic fragment and a recombinant DNA construct carrying compatible ends, without the need to engineer the human IGH genes in a human cell line (ESCs or iPSCs).
To accomplish these tasks, multiple technical elements are employed and incorporated, including homotypic and heterotypic Lox sites, I-SceI sites (or restriction sites for other rare cutter enzymes, such as other homing endonucleases), and Cpf1 protospacer/PAM sequences, based on well-known molecular biology mechanisms, including HR, RMCE, recombinase mediated integration, nuclease mediated NHEJ, nuclease mediated HDR, and rolling circle amplification. All elements can be tested individually and in a flow scheme, such as the one illustrated in FIG. 9. Of particular interest is the minicircle, which shares all technical elements with the megacircle but does not carry the large “GOI.” Successful recombinase mediated integration or RMCE by this minicircle demonstrates that the system performs as expected.
References
1. Ali, M.M., Li, F., Zhang, Z., Zhang, K., Kang, D.K., Ankrum, J.A., Le, X.C., Zhao, W., 2014. Rolling circle amplification: a versatile tool for chemical biology, materials science and medicine. Chemical Society reviews 43, 3324-3341.
2. Blanco, L., Bernad, A., Lazaro, J.M., Martin, G., Garmendia, C., Salas, M., 1989. Highly efficient DNA synthesis by the phage phi 29 DNA polymerase. Symmetrical mode of DNA replication. The Journal of biological chemistry 264, 8935-8940.
3. Blanco, L., Salas, M., 1984. Characterization and purification of a phage phi 29-encoded DNA polymerase required for the initiation of replication. Proceedings of the National Academy of Sciences of the United States of America 81, 5325-5329.
4. Bruggemann, M., Caskey, H.M., Teale, C., Waldmann, H., Williams, G.T., Surani, M.A., Neuberger, M.S., 1989. A repertoire of monoclonal antibodies with human heavy chains from transgenic mice. Proceedings of the National Academy of Sciences of the United States of America 86, 6709-6713.
5. Burke, D.T., Carle, G.F., Olson, M.V., 1987. Cloning of large segments of exogenous DNA into yeast by means of artificial chromosome vectors. Science 236, 806-812.
6. Doetschman, T., Gregg, R.G., Maeda, N., Hooper, M.L., Melton, D.W., Thompson, S., Smithies, O., 1987. Targetted correction of a mutant HPRT gene in mouse embryonic stem cells. Nature 330, 576-578.
7. Enghiad, B., Zhao, H., 2017. Programmable DNA-Guided Artificial Restriction Enzymes. ACS synthetic biology 6, 752-757.
8. Giudicelli, V., Duroux, P., Ginestoux, C., Folch, G., Jabado-Michaloud, J., Chaume, D., Lefranc, M.P., 2006. IMGT/LIGM-DB, the IMGT comprehensive database of immunoglobulin and T cell receptor nucleotide sequences. Nucleic acids research 34, D781-784.
9. Green, L.L., 2014. Transgenic mouse strains as platforms for the successful discovery and development of human therapeutic monoclonal antibodies. Current drug discovery technologies 11, 74-84.
10. Green, L.L., Hardy, M.C., Maynard-Currie, C.E., Tsuda, H., Louie, D.M., Mendez, M.J., Abderrahim, H., Noguchi, M., Smith, D.H., Zeng, Y., David, N.E., Sasai, H., Garza, D., Brenner, D.G., Hales, J.F., McGuinness, R.P., Capon, D.J., Klapholz, S., Jakobovits, A., 1994. Antigen-specific human monoclonal antibodies from mice engineered with human Ig heavy and light chain YACs. Nature genetics 7, 13-21.
11. Hasegawa, M., Kapelyukh, Y., Tahara, H., Seibler, J., Rode, A., Krueger, S., Lee, D.N., Wolf, C.R., Scheer, N., 2011. Quantitative prediction of human pregnane X receptor and cytochrome P450 3A4 mediated drug-drug interaction in a novel multiple humanized mouse line. Molecular pharmacology 80, 518-528.
12. Hohn, B., Murray, K., 1977. Packaging recombinant DNA molecules into bacteriophage particles in vitro. Proceedings of the National Academy of Sciences of the United States of America 74, 3259-3263.
13. Hoshiya, H., Kazuki, Y., Abe, S., Takiguchi, M., Kajitani, N., Watanabe, Y., Yoshino, T., Shirayoshi, Y., Higaki, K., Messina, G., Cossu, G., Oshimura, M., 2009. A highly stable and nonintegrated human artificial chromosome (HAC) containing the 2.4 Mb entire human dystrophin gene. Molecular therapy: the journal of the American Society of Gene Therapy 17, 309-317.
14. Jinek, M., Chylinski, K., Fonfara, I., Hauer, M., Doudna, J.A., Charpentier, E., 2012. A programmable dual-RNA-guided DNA endonuclease in adaptive bacterial immunity. Science 337, 816-821.
15. Lee, E.C., Liang, Q., Ali, H., Bayliss, L., Beasley, A., Bloomfield-Gerdes, T., Bonoli, L., Brown, R., Campbell, J., Carpenter, A., Chalk, S., Davis, A., England, N., Fane-Dremucheva, A., Franz, B., Germaschewski, V., Holmes, H., Holmes, S., Kirby, I., Kosmac, M., Legent, A., Lui, H., Manin, A., O′Leary, S., Paterson, J., Sciarrillo, R., Speak, A., Spensberger, D., Tuffery, L., Waddell, N., Wang, W., Wells, S., Wong, V., Wood, A., Owen, M.J., Friedrich, G.A., Bradley, A., 2014. Complete humanization of the mouse immunoglobulin loci enables efficient therapeutic antibody discovery. Nat Biotechnol 32, 356-363.
16. Lonberg, N., Taylor, L.D., Harding, F.A., Trounstine, M., Higgins, K.M., Schramm, S.R., Kuo, C.C., Mashayekh, R., Wymore, K., McCabe, J.G., et al., 1994. Antigen-specific human antibodies from mice comprising four distinct genetic modifications. Nature 368, 856-859.
17. Macdonald, L.E., Karow, M., Stevens, S., Auerbach, W., Poueymirou, W.T., Yasenchak, J., Frendewey, D., Valenzuela, D.M., Giallourakis, C.C., Alt, F.W., Yancopoulos, G.D., Murphy, A.J., 2014. Precise and in situ genetic humanization of 6 Mb of mouse immunoglobulin genes. Proceedings of the National Academy of Sciences of the United States of America 111, 5147-5152.
18. Malchin, N., Molotsky, T., Borovok, I., Voziyanov, Y., Kotlyar, A.B., Yagil, E., Kolot, M., 2010. High efficiency of a sequential recombinase-mediated cassette exchange reaction in Escherichia coli. Journal of molecular microbiology and biotechnology 19, 117-122.
19. O′Gorman, S., Fox, D.T., Wahl, G.M., 1991. Recombinase-mediated gene activation and site-specific integration in mammalian cells. Science 251, 1351-1355.
20. Osoegawa, K., Mammoser, A.G., Wu, C., Frengen, E., Zeng, C., Catanese, J.J., de Jong, P.J., 2001. A bacterial artificial chromosome library for sequencing the complete human genome. Genome research 11, 483-496.
21. Sauer, B., Henderson, N., 1988. Site-specific DNA recombination in mammalian cells by the Cre recombinase of bacteriophage P1. Proceedings of the National Academy of Sciences of the United States of America 85, 5166-5170.
22. Schlake, T., Bode, J., 1994. Use of mutated FLP recognition target (FRT) sites for the exchange of expression cassettes at defined chromosomal loci. Biochemistry 33, 12746-12751.
23. Takata, Y., Kondo, S., Goda, N., Kanegae, Y., Saito, I., 2011. Comparison of efficiency between FLPe and Cre for recombinase-mediated cassette exchange in vitro and in adenovirus vector production. Genes to cells: devoted to molecular & cellular mechanisms 16, 765-777.
24. Taylor, L.D., Carmack, C.E., Schramm, S.R., Mashayekh, R., Higgins, K.M., Kuo, C.C., Woodhouse, C., Kay, R.M., Lonberg, N., 1992. A transgenic mouse that expresses a diversity of human sequence heavy and light chain immunoglobulins. Nucleic acids research 20, 6287-6295.
25. Thomas, K.R., Capecchi, M.R., 1987. Site-directed mutagenesis by gene targeting in mouse embryo-derived stem cells. Cell 51, 503-512.
26. Thorpe, H.M., Smith, M.C., 1998. In vitro site-specific integration of bacteriophage DNA catalyzed by a recombinase of the resolvase/invertase family. Proceedings of the National Academy of Sciences of the United States of America 95, 5505-5510.
27. Thyagarajan, B., Olivares, E.C., Hollis, R.P., Ginsburg, D.S., Calos, M.P., 2001. Site-specific genomic integration in mammalian cells mediated by phage phiC31 integrase. Molecular and cellular biology 21, 3926-3934.
28. Tomizuka, K., Yoshida, H., Uejima, H., Kugoh, H., Sato, K., Ohguma, A., Hayasaka, M., Hanaoka, K., Oshimura, M., Ishida, I., 1997. Functional expression and germline transmission of a human chromosome fragment in chimaeric mice. Nature genetics 16, 133-143.
29. Van Duyne, G.D., 2001. A structural view of cre-loxp site-specific recombination. Annual review of biophysics and biomolecular structure 30, 87-104.
30. Wallace, H.A., Marques-Kranc, F., Richardson, M., Luna-Crespo, F., Sharpe, J.A., Hughes, J., Wood, W.G., Higgs, D.R., Smith, A.J., 2007. Manipulating the mouse genome to engineer precise functional syntenic replacements with human sequence. Cell 128, 197-209.
31. Walter, M.A., Surti, U., Hofker, M.H., Cox, D.W., 1990. The physical organization of the human immunoglobulin heavy chain gene complex. The EMBO journal 9, 3303-3313.
32. Zetsche, B., Gootenberg, J.S., Abudayyeh, O.O., Slaymaker, I.M., Makarova, K.S., Essletzbichler, P., Volz, S.E., Joung, J., van der Oost, J., Regev, A., Koonin, E.V., Zhang, F., 2015. Cpf1 is a single RNA-guided endonuclease of a class 2 CRISPR-Cas system. Cell 163, 759-771.
All references listed herein are incorporated by reference, including incorporation by reference at the instances where the specific reference is cited.
ABBREVIATIONS
GOI         genomic region of interest
HDR         homology directed repair
HR          homologous recombination
IG          immunoglobulin
Igh         murine immunoglobulin heavy chain gene
IGH         human immunoglobulin heavy chain gene
MHC         major histocompatibility complex
NHEJ        non homologous end joining
RBE         recombinase binding element
RCA         rolling circle amplification
RE          restriction enzyme
RMCE        recombinase mediated cassette exchange
RTS         recombinase target site
SSR         site specific recombinase

Claims (33)

  1. A DNA fragment comprising a genomic region of interest (GOI, which can be a single complete or partial gene, or a cluster of related or unrelated genes, non-encoding sequence, or any genomic region of interest), wherein the GOI is at least about 20 kb (e.g., 50 kb, 100 kb, 150 kb, 200 kb, 300 kb, 400 kb, 500 kb, 750 kb, 1 Mb, 1.25 Mb, 1.5 Mb, 1.75 Mb, 2 Mb or more), and is flanked by a first unique enzyme cleavage site at the proximal end of the genomic DNA fragment and a second unique enzyme cleavage site at the distal end of the genomic DNA fragment, wherein the DNA fragment further comprises an exogenous sequence (such as a selection marker, a recombinase target site (RTS), or a sequence from a non-donor source) between the proximal end of the DNA fragment and the proximal end of the GOI, or between the distal end of the DNA fragment and the distal end of the GOI; and wherein the DNA fragment is devoid of a sequence required for self-replication in a host cell (e.g., ORI, pBR322, pUC, pSC101, or ARS).
  2. The DNA fragment of claim 1, wherein the first and the second unique enzyme cleavage sites have the same sequence, or are recognized by one (the same) cleavage enzyme, or have sticky ends compatible for ligation.
  3. The DNA fragment of claim 1 or 2, which is a double-stranded circle (e.g., linked at the first and the second unique enzyme cleavage sites).
  4. The DNA fragment of any one of claims 1-3, further comprising a first positive selection marker cassette located between the RTS and an end of the DNA fragment, wherein the GOI and the positive selection marker cassette flank the RTS.
  5. The DNA fragment of claim 4, further comprising a heterotypic recombinase target site (htRTS), wherein the RTS and the htRTS flank the GOI.
  6. The DNA fragment of claim 5, which is circular, with a linkage created by linking the first and the second unique enzyme cleavage sites, wherein the linkage is between the positive selection marker cassette and the htRTS (or the RTS).
  7. The DNA fragment of any one of claims 2-6, wherein the DNA fragment is amplified by rolling circle amplification (RCA).
  8. The DNA fragment of any one of claims 1-7, wherein the GOI is an immunoglobulin heavy chain gene cluster, an immunoglobulin kappa light chain gene cluster, an immunoglobulin lambda light chain gene cluster, a TCR (T-cell receptor) gene cluster, an MHC class 1 gene, an MHC class 2 gene, or a cytochrome P450 gene.
  9. The DNA fragment of any one of claims 1-8, wherein the first and/or the second unique enzyme cleavage site(s) are cleavable by a restriction endonuclease, a homing endonuclease (such as I-SceI), a CRISPR/Cas9 nuclease, a CRISPR/Cpf1 nuclease, a TALEN (transcription activator-like effector nuclease), a ZFN (zinc finger nuclease), or a Pyrococcus furiosus Argonaute (PfAgo) based artificial restriction enzyme (ARE).
  10. The DNA fragment of any one of claims 1-9, wherein one of the first and the second unique enzyme cleavage sites is a natural/pre-existing genomic sequence.
  11. The DNA fragment of any one of claims 1-10, wherein the RTS or the htRTS is locus of crossover in P1 (loxP) , flippase recognition target (FRT) , or attP/attB.
  12. A method of producing the DNA fragment of any one of claims 1-11, the method comprising:
    (a) inserting, in a host genome comprising said GOI, said first unique enzyme cleavage site and said RTS proximal (or distal) to the GOI, wherein the GOI is proximal (or distal) to the second unique enzyme cleavage site, and wherein an optional first positive selection marker cassette is distal (or proximal) to said first unique enzyme cleavage site and proximal (or distal) to said RTS;
    (b) digesting the host genome with a first unique enzyme that cleaves said first unique enzyme cleavage site, and a second unique enzyme that cleaves said second unique enzyme cleavage site, to release the DNA fragment comprising the GOI from the host genome;
    (c) circularizing the released DNA fragment to form a megacircle, by ligating the ends of said released DNA fragment generated by cleavage by said first unique enzyme and said second unique enzyme, under conditions that promote intramolecular self-ligation;
    (d) amplifying said megacircle by rolling circle amplification (RCA) with a primer (or multiple primers) unique to said GOI.
  13. The method of claim 12, wherein the second unique enzyme cleavage site is a naturally existing sequence.
  14. The method of claim 12 or 13, wherein step (a) further comprises inserting a 3rd unique enzyme cleavage site between the first unique enzyme cleavage site and the RTS.
  15. The method of claim 12, wherein the second unique enzyme cleavage site, together with a second positive selection marker cassette and htRTS, are inserted into said host genome distal (or proximal) to the GOI, preferably the second unique enzyme cleavage site is distal (or proximal) to the htRTS and proximal (or distal) to the second positive selection marker cassette.
  16. A method of producing the DNA fragment of any one of claims 1-11, the method comprising:
    (a) removing/retrieving a genomic DNA fragment comprising said GOI from a host genome, through digesting the host genome with a first enzyme specific for a first unique enzyme cleavage site and a second enzyme specific for a second unique enzyme cleavage site, wherein said first and said second unique enzyme cleavage sites are not compatible for ligation with each other;
    (b) ligating the retrieved genomic DNA fragment in step (a) with a linear recombinant DNA construct to produce a megacircle, wherein said linear recombinant DNA construct comprises a positive selection marker cassette flanked by a pair of heterotypic RTSs or flanked by a RTS and a 3rd unique enzyme cleavage site, and wherein the ends of the linear recombinant DNA construct are defined by said first and said second unique enzyme cleavage sites, respectively;
    (c) amplifying said megacircle by rolling circle amplification (RCA) with a primer unique to said GOI.
  17. The method of any one of claims 12-16, wherein said first and said second unique enzyme cleavage sites are the same, and said first and said second unique enzymes are the same.
  18. The method of any one of claims 12-17, further comprising, before amplification by RCA, treating the ligation mixture with an exonuclease to eliminate linear DNA fragments.
  19. The method of any one of claims 12-18, wherein RCA is performed using phi29 (φ29) DNA polymerase, and one or more primers each specific for the GOI.
  20. The method of any one of claims 12-19, wherein RCA comprises multiple displacement amplification (MDA).
  21. The method of any one of claims 12-20, further comprising resolving the RCA concatemer product, preferably with an enzyme specific for said 3rd unique enzyme cleavage site (if present), or with said unique first enzyme or said unique second enzyme.
  22. The method of claim 21, further comprising self-ligating digested RCA concatemer product under conditions favoring intramolecular self-ligation (e.g., low concentration).
  23. The method of claim 22, wherein said conditions comprise emulsifying the digested RCA concatemer product to form single oil droplets, each comprising no more than one linear DNA fragment, to promote self-ligation.
  24. The method of claim 22 or 23, further comprising treating the ligation mixture of the digested RCA product with an exonuclease to eliminate linear DNA fragments.
  25. The method of any one of claims 12-24, wherein step (a) is carried out by nuclease mediated homology directed repair (HDR), or homologous recombination (HR), or a combination thereof.
  26. A method of replacing a host (e.g., mouse) genomic region in a host genome with a (syntenic) DNA fragment from a donor (e.g., human) genome, the method comprising:
    (a) replacing the host genomic region with a pair of heterotypic RTSs (e.g., via nuclease enabled HDR, or HR, or a combination thereof), wherein said pair of heterotypic RTSs flank an optional positive/negative selection cassette (such as hygroTK selection cassette);
    (b) providing the DNA fragment of any one of claims 1-11 from the donor genome, and allowing the DNA fragment from the donor genome to integrate into the host genome through one of said pair of heterotypic RTSs in the presence of site specific recombinase specific for said one of said pair of heterotypic RTSs;
    (c) optionally, allowing deletion of said positive/negative selection cassette (if present) through the other of said pair of homotypic RTSs.
  27. A method of replacing a host (e.g., mouse) genomic region from a host genome with a (syntenic) DNA fragment from a donor (e.g., human) genome, the method comprising:
    (a) inserting a single RTS, along with an optional positive/negative selection cassette (such as hygroTK selection cassette), proximal (or distal) to the host genomic region (e.g., via nuclease enabled HDR, or HR, or a combination thereof);
    (b) providing the DNA fragment of any one of claims 1-11 from the donor genome, and allowing the DNA fragment from the donor genome to integrate into the host genome through said RTS in the presence of site specific recombinase specific for said RTS;
    (c) optionally, allowing deletion of said positive/negative selection cassette (if present) and the host genomic region through nuclease mediated NHEJ.
  28. The method of claim 27, wherein said host genomic region is deleted prior to step (b).
  29. The method of any one of claims 26-28, which is carried out in a zygote, an oocyte, a sperm cell (spermatogonial stem cell line), or an ES cell of the host, preferably by microinjecting, electroporating, or transfecting exogenous components (e.g., CRISPR/Cas9 and guide RNAs targeting said first and said second unique enzyme cleavage sites; megacircles; recombinase protein or coding sequence thereof).
  30. The genomic DNA fragment of any one of claims 1-11, or the method of any one of claims 12-29, wherein the positive/negative selection marker cassette comprises a (neomycin/puromycin/hygromycin/blasticidin/zeocin) resistance/TK (or HPRT) gene under the expression control of a eukaryotic promoter (such as the PGK promoter) and a polyA coding sequence.
  31. A mouse generated by the method of any one of claims 26-29.
  32. A mouse comprising in its genome an exogenous genomic DNA from a donor (e.g., a human, a mammal, or different mouse strain) , wherein the exogenous genomic DNA comprises a polymorphism within the species of the donor, and wherein the exogenous genomic DNA comprises the DNA fragment of any one of claims 1-11.
  33. The mouse of claim 32, wherein the exogenous genomic DNA comprises an immunoglobulin heavy chain gene cluster, immunoglobulin kappa light chain gene cluster, immunoglobulin lambda chain gene cluster, a TCR, an MHC class 1 gene, an MHC class 2 gene, or a cytochrome P450 gene.
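
The release, circularization, amplification and resolution steps recited in claims 12, 17 and 21 can be illustrated with a toy string model. The Python sketch below is not part of the application and is not the applicants' protocol; it only shows why a concatemer copied from a circle that carries a single cleavage site resolves back into unit-length copies of the released fragment. The sequences `SITE`, `GOI` and `PRIMER`, all function names, and the rule that the enzyme cuts at the left edge of its site are assumptions made for illustration.

```python
"""Toy string model of claims 12, 17 and 21 (illustrative assumptions only)."""

SITE = "GCGGCCGC"            # stand-in 8-bp recognition sequence (assumption)
GOI = "AAACCCTTTGGGAAATTT"   # stand-in for the genomic region of interest
PRIMER = "TTTGGG"            # stand-in primer sequence found only in the GOI


def find_all(seq: str, site: str) -> list[int]:
    """Return the start index of every (possibly overlapping) match of site."""
    hits, i = [], seq.find(site)
    while i != -1:
        hits.append(i)
        i = seq.find(site, i + 1)
    return hits


def release_fragment(genome: str, site: str) -> str:
    """Claim 12(b): cut at the two flanking sites, return the piece between.

    Toy assumption: the enzyme cuts at the left edge of its site, so the
    released piece keeps exactly one copy of the site.
    """
    cuts = find_all(genome, site)
    if len(cuts) != 2:
        raise ValueError(f"expected exactly 2 flanking sites, found {len(cuts)}")
    return genome[cuts[0]:cuts[1]]


def rolling_circle(circle: str, primer: str, rounds: int) -> str:
    """Claim 12(d): copy the circle `rounds` times, starting at the primer."""
    # Search circle + circle so a primer spanning the ligation junction is found.
    start = (circle + circle).find(primer)
    if start == -1:
        raise ValueError("primer does not match the circle")
    rotated = circle[start:] + circle[:start]
    return rotated * rounds


def resolve_concatemer(concatemer: str, site: str) -> list[str]:
    """Claim 21: cut the concatemer at every site, keep full-length units."""
    cuts = find_all(concatemer, site)
    return [concatemer[a:b] for a, b in zip(cuts, cuts[1:])]


genome = "ATATAT" + SITE + GOI + SITE + "TATATA"
# The linear fragment doubles as the circle: claim 12(c) joins its two ends.
fragment = release_fragment(genome, SITE)
concatemer = rolling_circle(fragment, PRIMER, rounds=3)
monomers = resolve_concatemer(concatemer, SITE)
print(len(monomers), all(m == fragment for m in monomers))
```

The model ignores strand polarity, overhang chemistry, and the exonuclease clean-up of claims 18 and 24; partial units at the two ends of the concatemer are simply discarded.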
PCT/CN2017/099097 2017-08-25 2017-08-25 Large scale modification of eukaryotic genome WO2019037099A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/CN2017/099097 WO2019037099A1 (en) 2017-08-25 2017-08-25 Large scale modification of eukaryotic genome

Publications (1)

Publication Number Publication Date
WO2019037099A1 (en) 2019-02-28

Family

ID=65439900

Country Status (1)

Country Link
WO (1) WO2019037099A1 (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2002000875A2 (en) * 2000-06-28 2002-01-03 Protemation, Inc. Compositions and methods for generating expression vectors through site-specific recombination
WO2002088353A2 (en) * 2001-04-27 2002-11-07 Association Pour Le Developpement De La Recherche En Genetique Moleculaire (Aderegem) Method for the stable inversion of dna sequence by site-specific recombination and dna vectors and transgenic cells thereof
WO2004044150A2 (en) * 2002-11-07 2004-05-27 The United States Of America, As Represented By The Secretary Of Agriculture Systems for gene targeting and producing stable genomic transgen e insertions
WO2008101216A2 (en) * 2007-02-15 2008-08-21 The Govt. Of The Usa As Represented By The Secretary Of The Dept Of Health And Human Services Gamma satellite insulator sequences and their use in preventing gene silencing
WO2013041844A2 (en) * 2011-09-19 2013-03-28 Kymab Limited Antibodies, variable domains & chains tailored for human use
CN104334732A (en) * 2012-03-28 2015-02-04 科马布有限公司 Animals expressing human lambda immunoglobulin light chain variable domain
WO2015166272A2 (en) * 2014-05-02 2015-11-05 Iontas Limited Preparation of libraries of protein variants expressed in eukaryotic cells and use for selecting binding molecules

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112088213A (en) * 2019-04-12 2020-12-15 株式会社湖美宝 Artificial recombinant chromosome and use thereof
CN110904035A (en) * 2019-12-31 2020-03-24 南昌诺汇医药科技有限公司 Culture method for promoting spermatogonial stem cell proliferation and application thereof
CN110904035B (en) * 2019-12-31 2020-11-13 沈阳菁华医院有限公司 Culture method for promoting spermatogonial stem cell proliferation and application thereof

Similar Documents

Publication Publication Date Title
US11268109B2 (en) CHO integration sites and uses thereof
JP6486979B2 (en) Methods for modifying eukaryotic cells
JP6402368B2 (en) Methods for modifying eukaryotic cells
US10301646B2 (en) Nuclease-mediated targeting with large targeting vectors
US11535871B2 (en) Optimized gene editing utilizing a recombinant endonuclease system
US20190338274A1 (en) Methods, Cells & Organisms
Zhang et al. Large genomic fragment deletions and insertions in mouse using CRISPR/Cas9
KR20220018613A (en) Methods for breaking immunological tolerance using multiple guide rnas
WO2016080399A1 (en) Method for knock-in of dna into target region of mammalian genome, and cell
CN111500630A (en) Targeted modification of rat genome
Clark et al. A most formidable arsenal: genetic technologies for building a better mouse
WO2019037099A1 (en) Large scale modification of eukaryotic genome
Hosur et al. Programmable RNA-guided large DNA transgenesis by CRISPR/Cas9 and site-specific integrase Bxb1
CN114727592A (en) High frequency targeted animal transgenesis

Legal Events

Date Code Title Description
121 Ep: the EPO has been informed by WIPO that EP was designated in this application (Ref document number: 17922403; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
122 Ep: PCT application non-entry in European phase (Ref document number: 17922403; Country of ref document: EP; Kind code of ref document: A1)