WO2024075756A1 - Cell library and method for producing same

Publication number
WO2024075756A1
Authority
WO
WIPO (PCT)
Prior art keywords
sequence
allele
modified
cells
cell
Prior art date
Application number
PCT/JP2023/036147
Other languages
French (fr)
Japanese (ja)
Inventor
康則 相澤
知幸 大野
Original Assignee
株式会社Logomix
Priority date
Filing date
Publication date
Application filed by 株式会社Logomix
Publication of WO2024075756A1

Classifications

    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/63 Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
    • C12N15/79 Vectors or expression systems specially adapted for eukaryotic hosts
    • C12N15/80 Vectors or expression systems specially adapted for eukaryotic hosts for fungi
    • C12N15/81 Vectors or expression systems specially adapted for eukaryotic hosts for fungi for yeasts
    • C12N5/00 Undifferentiated human, animal or plant cells, e.g. cell lines; Tissues; Cultivation or maintenance thereof; Culture media therefor
    • C12N5/10 Cells modified by introduction of foreign genetic material
    • C40 COMBINATORIAL TECHNOLOGY
    • C40B COMBINATORIAL CHEMISTRY; LIBRARIES, e.g. CHEMICAL LIBRARIES
    • C40B40/00 Libraries per se, e.g. arrays, mixtures
    • C40B40/02 Libraries contained in or displayed by microorganisms, e.g. bacteria or animal cells; Libraries contained in or displayed by vectors, e.g. plasmids; Libraries containing only microorganisms or vectors

Definitions

  • the present invention relates to a cell library and a method for producing the same.
  • Patent Document 2 discloses that it has been possible to efficiently modify two or more alleles simultaneously to create large-scale deletions of several hundred kb.
  • the present disclosure provides a cell library and a method for producing the same. More specifically, the present disclosure provides a cell library containing a plurality of modified cells having a rich variety of base sequences at a specific site of one allele in a cell having multiple alleles.
  • the cell library may be a combination of aqueous compositions containing one type of modified cell.
  • a library of modified cells comprising: The library comprises a combination of a plurality of aqueous compositions, Each aqueous composition comprises one type of modified cell; Each of the modified cells has a first allele and a second allele at a locus to be modified; Each of the modified cells has a cassette containing a DNA fragment that differs from each other between the aqueous compositions at the same position of the first allele.
  • a library wherein preferably said locus of the modified cell does not contain a recognition site or recombination sequence for a site-specific recombinase.
  • each of the sequences of the cassettes is composed of one or more modified portions (A) and one or more unmodified portions (B), each of the modified portions (A) has one or more modifications selected from the group consisting of sequence insertion, deletion, and substitution, the modifications of the one or more modified portions differ between the aqueous compositions in terms of the position or content of the modification, each of the one or more unmodified portions (B) is identical to the sequence of the corresponding site before modification, the unmodified portion (B1) on the centromere side of the cassette is seamlessly linked to the adjacent sequence (C1) on the centromere side of the cassette, the unmodified portion (Bt) on the telomere side of the cassette is seamlessly linked to the adjacent sequence (C2) on the telomere side of the cassette, and the region where the adjacent sequence (C1) and the unmodified portion (B1) are linked, and the region where the unmodified portion (Bt) and the adjacent sequence (C2) are linked, constitute
  • a method for producing a library of modified cells comprising: (α) providing a group of cells having a genome including a first allele and a second allele at a locus to be modified, the first allele and the second allele each including a cassette including a selection marker gene and a target nucleic acid sequence; wherein the selection marker gene carried by the first allele and the selection marker gene carried by the second allele are distinguishably different, the target nucleic acid sequence is a target of a genome modification system and is designed so that the first allele and the second allele can be distinguishably cleaved by the genome modification system, and each selection marker gene is a negative selection marker gene that can be used for negative selection, (β) introducing into the provided group of cells: (x) a sequence-specific nucleic acid cleavage molecule that targets the unique base sequence contained in the first allele, or a genome modification system comprising a polynucleotide encoding the sequence-specific nucleic acid cleavage molecule; (y)
  • the modified sequence is a coding region for a protein
  • a library of modified cells comprising a plurality of modified cells, produced by the method according to any one of (8) to (14) above.
  • FIG. 1 shows an overview of one example of a scheme for constructing a library of the present invention.
  • FIG. 2 shows an example of a scheme for preparing a library of the first modified cell of the present invention.
  • FIG. 3 is a diagram showing an example of the positional relationship between a modified portion and a non-modified portion in a modified base sequence in a modified cell library, in which the modified base sequence contains the modified portion and the non-modified portion.
  • FIG. 4 shows the characteristics (disadvantages) of the genome modification method using site-specific recombinase.
  • the terms "polynucleotide" and "nucleic acid" are used interchangeably and refer to a nucleotide polymer in which nucleotides are linked by phosphodiester bonds.
  • a "polynucleotide" and a "nucleic acid" may be DNA, RNA, or a combination of DNA and RNA.
  • a "polynucleotide" and a "nucleic acid" may be a polymer of natural nucleotides, a polymer of natural nucleotides and non-natural nucleotides (such as analogs of natural nucleotides, nucleotides in which at least one of the base moiety, sugar moiety, and phosphate moiety is modified (e.g., phosphorothioate backbone), etc.), or a polymer of non-natural nucleotides.
  • the base sequence of a "polynucleotide" or "nucleic acid" is written in the generally accepted single letter code unless otherwise specified.
  • the base sequence is written from the 5' to the 3' side unless otherwise specified.
  • the nucleotide residues that make up a "polynucleotide" or "nucleic acid" may be written simply as adenine, thymine, cytosine, guanine, or uracil, etc., or by their single letter codes.
  • the term "gene" refers to a polynucleotide that contains at least one open reading frame that encodes a particular protein.
  • a gene can contain both exons and introns.
  • the term "polypeptide" refers to a polymer of amino acids linked by amide bonds.
  • a “polypeptide”, “peptide” or “protein” may be a polymer of natural amino acids, a polymer of natural and non-natural amino acids (e.g., chemical analogues or modified derivatives of natural amino acids), or a polymer of non-natural amino acids. Unless otherwise specified, amino acid sequences are written from the N-terminus to the C-terminus.
  • the term "allele" refers to a set of base sequences present at the same locus on a chromosomal genome.
  • a diploid cell has two alleles at the same locus, and a triploid cell has three alleles at the same locus.
  • additional alleles may be formed by an abnormal copy of the chromosome or an abnormal additional copy of the locus.
  • the terms "genome modification" and "genome editing" are used interchangeably and refer to the induction of a mutation at a desired position (target region) on a genome.
  • Genome modification may include the use of a sequence-specific nucleic acid cleaving molecule designed to cleave the target region DNA.
  • genome modification may include the use of a nuclease engineered to cleave the target region DNA.
  • genome modification may include the use of a nuclease engineered to cleave a target sequence having a specific base sequence in the target region (e.g., TALEN or ZFN).
  • genome modification may use a sequence-specific endonuclease such as a restriction enzyme having only one cleavage site in the genome, such as a meganuclease (e.g., a restriction enzyme with 16-base sequence specificity (theoretically present at a ratio of 1 in 4^16 bases), a restriction enzyme with 17-base sequence specificity (theoretically present at a ratio of 1 in 4^17 bases), and a restriction enzyme with 18-base sequence specificity (theoretically present at a ratio of 1 in 4^18 bases)) to cleave a target sequence having a specific base sequence in the target region.
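The theoretical frequencies quoted for 16-, 17-, and 18-base specificity can be checked with a short calculation. The sketch below is illustrative only: it assumes a random sequence with equal base frequencies, and the genome size constant is an assumed round figure for a haploid human genome, not a value from this document.

```python
# Illustrative sketch: assumes equal base frequencies and a random sequence.
# HUMAN_GENOME_BP is an assumed approximate haploid genome size.
HUMAN_GENOME_BP = 3_100_000_000


def expected_sites(recognition_length: int, genome_bp: int = HUMAN_GENOME_BP) -> float:
    """Expected occurrences of one specific n-base sequence on one strand."""
    return genome_bp / 4 ** recognition_length


for n in (16, 17, 18):
    print(f"{n}-base site: 1 in {4 ** n:,} bases; "
          f"expected occurrences in the genome: {expected_sites(n):.3f}")
```

Under these assumptions the expected count falls below one at 16 bases, which is consistent with describing such an enzyme as having roughly a single cleavage site in the genome.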
  • a double-stranded break is induced in the DNA of the target region by the use of a site-specific nuclease, and then the genome is repaired by endogenous processes of the cell, such as Homology-Directed Repair (HDR) and Non-Homologous End-Joining (NHEJ).
  • NHEJ is a repair method that joins the ends of double-stranded breaks without using donor DNA, and insertions and/or deletions (indels) are frequently induced during repair.
  • HDR is a repair mechanism that uses donor DNA, and it is also possible to introduce desired mutations into the target region.
  • a preferred example of a genome modification technique is the CRISPR/Cas system.
  • meganucleases include I-SceI, I-SceII, I-SceIII, I-SceIV, I-SceV, I-SceVI, I-SceVII, I-CeuI, I-CeuAIIP, I-CreI, I-CrepsbIP, I-CrepsbIIP, I-CrepsbIIIIP, I-CrepsbIVP, I-TliI, I-PpoI, PI-PspI, F-SceI, F-SceII, F-SuvI, F-TevI, F-TevII, I-AmaI, I-AniI, I-ChuI, I-CmoeI, I-CpaI, I-CpaIII, I-CsmI, I-CvuI, I-CvuAIP, I-DdiI, I-DdiIII, I-DirI, I
  • the term "target region" refers to a genomic region that is subject to genome modification.
  • “Deletion” includes deletions of one or more bases and deletions of one or more genes relative to a reference genome.
  • the deletions can be deletions of 100 bp or more, deletions of 200 bp or more, deletions of 300 bp or more, deletions of 400 bp or more, deletions of 500 bp or more, deletions of 600 bp or more, deletions of 700 bp or more, deletions of 800 bp or more, deletions of 900 bp or more, deletions of 1 kbp or more, deletions of 10 kbp or more, deletions of 50 kbp or more, deletions of 100 kbp or more, deletions of 200 kbp or more, deletions of 300 kbp or more, deletions of 400 kbp or more, deletions of 500 kbp or more, or deletions of 1 Mbp or more or less.
  • the deletions can be deletions of 1 Mbp or less.
  • the deletions can be deletions of 700 kbp or less.
  • the deletion may be a deletion of 600 kbp or less.
  • the deletion may be a deletion of 500 kbp or less.
  • the deletion may be a deletion of 10 kbp to 600 kbp.
  • the deletion may be a deletion of 100 kbp to 600 kbp.
  • the deletion may be a deletion of 100 kbp to 500 kbp.
  • the term "donor DNA" refers to DNA used to repair double-stranded DNA breaks and capable of homologous recombination with DNA surrounding a target region.
  • the donor DNA contains a base sequence upstream and a base sequence downstream of the target region (e.g., a base sequence adjacent to the target region) as homology arms.
  • a homology arm consisting of a base sequence upstream of a target region (e.g., a base sequence adjacent to the upstream side) is referred to as an "upstream homology arm", and a homology arm consisting of a base sequence downstream of a target region is referred to as a "downstream homology arm".
  • the donor DNA may contain a desired base sequence between the upstream homology arm and the downstream homology arm.
  • the length of each homology arm is preferably 300 bp or more, and is usually about 500 to 3000 bp.
  • the lengths of the upstream homology arm and the downstream homology arm may be the same or different from each other. If the target region is successfully induced to undergo homologous recombination with the donor DNA after sequence-dependent cleavage, the sequence between the upstream and downstream base sequences of the target region will be replaced with the sequence sandwiched between the upstream and downstream base sequences of the donor DNA.
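The replacement described here can be sketched on plain strings. This is a toy model under stated assumptions, not a simulation of the biology: the function names `build_donor` and `apply_hdr` are hypothetical, each arm is assumed to occur exactly once in the sequence, and the 8-base arms used in the usage note are far shorter than the 300 bp or more that the text recommends.

```python
def build_donor(genome: str, start: int, end: int, arm_len: int, insert: str = "") -> str:
    """Return upstream arm + desired sequence + downstream arm.

    genome[start:end] is the target region (0-based, end-exclusive).
    """
    if start - arm_len < 0 or end + arm_len > len(genome):
        raise ValueError("homology arm would run off the sequence")
    upstream_arm = genome[start - arm_len:start]
    downstream_arm = genome[end:end + arm_len]
    return upstream_arm + insert + downstream_arm


def apply_hdr(genome: str, donor: str, arm_len: int) -> str:
    """Replace the sequence between the two arm matches with the donor's middle part."""
    upstream_arm = donor[:arm_len]
    downstream_arm = donor[len(donor) - arm_len:]
    payload = donor[arm_len:len(donor) - arm_len]
    up_pos = genome.find(upstream_arm)
    if up_pos < 0:
        raise ValueError("upstream homology arm not found")
    down_pos = genome.find(downstream_arm, up_pos + arm_len)
    if down_pos < 0:
        raise ValueError("downstream homology arm not found")
    return genome[:up_pos + arm_len] + payload + genome[down_pos:]
```

For example, with `genome = "ATGCCGTAGGGGGGTTCAGCTA"` and the target region at positions 8 to 14, a donor built with `insert="CAT"` swaps the six G bases for `CAT`, while a donor built with an empty insert removes them and joins the flanking sequences directly.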
  • Upstream of a target region means the DNA region located on the 5' side of a reference nucleotide strand in the double-stranded DNA of the target region.
  • Downstream of a target region means the DNA located on the 3' side of the reference nucleotide strand. It is arbitrary which strand of the double strand is used as the reference nucleotide strand. However, for convenience, when the target region contains a protein coding sequence, the reference nucleotide strand is usually the sense strand.
  • a promoter is located upstream of a protein coding sequence.
  • a terminator is located downstream of a protein coding sequence.
  • the term "sequence-specific nucleic acid cleaving molecule" refers to a molecule that can recognize a specific nucleic acid sequence and cleave a nucleic acid at said specific nucleic acid sequence.
  • a sequence-specific nucleic acid cleaving molecule is a molecule that has the activity of cleaving a nucleic acid in a sequence-specific manner (sequence-specific nucleic acid cleaving activity).
  • the term "target sequence" refers to a DNA sequence in a genome that is to be cleaved by a sequence-specific nucleic acid cleaving molecule.
  • when the sequence-specific nucleic acid cleaving molecule is a Cas protein, the target sequence refers to a DNA sequence in a genome that is to be cleaved by the Cas protein.
  • when Cas9 protein is used as the Cas protein, the target sequence must be adjacent to the 5' side of a protospacer adjacent motif (PAM).
  • the target sequence is usually selected as a sequence of 17 to 30 bases (preferably 18 to 25 bases, more preferably 19 to 22 bases, and even more preferably 20 bases) adjacent to and immediately preceding the 5' side of the PAM.
  • a known design tool such as CRISPR DESIGN (crispr.mit.edu/) can be used to design the target sequence.
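As a rough illustration of the selection rule above, the following sketch lists every candidate target that sits immediately 5' of an "NGG" PAM on either strand. It is a minimal example assuming the S. pyogenes Cas9 PAM and a default target length of 20 bases; the function names are hypothetical, and the sketch does none of the off-target scoring that a real design tool performs.

```python
import re

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_targets(seq: str, target_len: int = 20):
    """Yield (strand, start, target, pam) for each target directly 5' of an NGG PAM.

    start is the 0-based position on the strand being scanned; for the "-"
    strand it is a position on the reverse complement of seq.
    """
    # A lookahead lets overlapping candidate sites all be reported.
    pattern = re.compile(rf"(?=([ACGT]{{{target_len}}})([ACGT]GG))")
    forward = seq.upper()
    for strand, strand_seq in (("+", forward), ("-", reverse_complement(forward))):
        for match in pattern.finditer(strand_seq):
            yield strand, match.start(), match.group(1), match.group(2)
```

Calling `list(find_targets(sequence))` returns the candidates in scan order, plus strand first.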
  • the term "Cas protein" refers to a CRISPR-associated protein.
  • the Cas protein forms a complex with a guide RNA and exhibits endonuclease activity or nickase activity.
  • Examples of Cas proteins include, but are not limited to, Cas9 protein, Cpf1 protein, C2c1 protein, C2c2 protein, and C2c3 protein.
  • Cas proteins include wild-type Cas proteins and their homologs (paralogs and orthologs), as well as mutants thereof, so long as they cooperate with a guide RNA to exhibit endonuclease activity or nickase activity.
  • the Cas protein is involved in the class 2 CRISPR/Cas system, more preferably in the type II CRISPR/Cas system.
  • a preferred example of the Cas protein is the Cas9 protein.
  • a preferred example of the Cas protein is the Cas3 protein.
  • the term "Cas9 protein" refers to a Cas protein involved in the type II CRISPR/Cas system.
  • the Cas9 protein forms a complex with a guide RNA and exhibits the activity of cleaving DNA in a target region in cooperation with the guide RNA.
  • the Cas9 protein includes wild-type Cas9 protein and its homologs (paralogs and orthologs), as well as mutants thereof, so long as it has the above-mentioned activity.
  • the wild-type Cas9 protein has a RuvC domain and an HNH domain as nuclease domains, but the Cas9 protein in this specification may have either the RuvC domain or the HNH domain inactivated.
  • Cas9 in which either the RuvC domain or the HNH domain is inactivated introduces a single-stranded break (nick) into double-stranded DNA. Therefore, when Cas9 in which either the RuvC domain or the HNH domain has been inactivated is used to cleave double-stranded DNA, a modified system can be constructed in which Cas9 target sequences are set for each of the sense and antisense strands, and nicks in the sense and antisense strands are generated at positions sufficiently close to each other, thereby inducing double-stranded cleavage.
  • the species of organism from which the Cas9 protein is derived is not particularly limited, but preferred examples include bacteria belonging to the genus Streptococcus, Staphylococcus, Neisseria, or Treponema. More specifically, preferred examples include Cas9 proteins derived from S. pyogenes, S. thermophilus, S. aureus, N. meningitidis, or T. denticola. In a preferred embodiment, the Cas9 protein is a Cas9 protein derived from S. pyogenes.
  • the amino acid sequences of various Cas proteins and information on their coding sequences can be obtained from various databases such as GenBank, UniProt, and Addgene.
  • the amino acid sequence of the Cas9 protein of S. pyogenes can be that registered in Addgene as plasmid number 42230.
  • An example of the amino acid sequence of the Cas9 protein of S. pyogenes is shown in SEQ ID NO:1.
  • the terms "guide RNA" and "gRNA" are used interchangeably and refer to an RNA that can form a complex with Cas protein and guide Cas protein to a target region.
  • the guide RNA comprises CRISPR RNA (crRNA) and trans-activating CRISPR RNA (tracrRNA).
  • the crRNA is involved in binding to a target region on the genome, and the tracrRNA is involved in binding to Cas protein.
  • the crRNA comprises a spacer sequence and a repeat sequence, and the spacer sequence binds to the complementary strand of the target sequence in the target region.
  • the tracrRNA comprises an anti-repeat sequence and a 3' tail sequence.
  • the anti-repeat sequence has a sequence complementary to the repeat sequence of the crRNA and forms a base pair with the repeat sequence, and the 3' tail sequence usually forms three stem loops.
  • the guide RNA may be a single guide RNA (sgRNA) in which the 5' end of the tracrRNA is linked to the 3' end of the crRNA, or the crRNA and the tracrRNA may be separate RNA molecules in which the repeat sequence and the anti-repeat sequence form base pairs.
  • the guide RNA is an sgRNA.
  • the crRNA repeat sequence and tracrRNA sequence can be appropriately selected depending on the type of Cas protein, and those derived from the same bacterial species as the Cas protein can be used.
  • the length of the sgRNA can be about 50 to 220 nucleotides (nt), preferably about 60 to 180 nt, more preferably about 80 to 120 nt.
  • the length of the crRNA can be about 25 to 70 bases including the spacer sequence, preferably about 25 to 50 nt.
  • the length of the tracrRNA can be about 10 to 130 nt, preferably about 30 to 80 nt.
  • the repeat sequence of the crRNA may be the same as that in the bacterial species from which the Cas protein is derived, or may be one in which a part of the 3' end has been deleted.
  • the tracrRNA may have the same sequence as the mature tracrRNA in the bacterial species from which the Cas protein is derived, or may be a truncated type in which the 5' end and/or the 3' end of the mature tracrRNA has been truncated.
  • the tracrRNA may be a truncated type in which about 1 to 40 nucleotide residues have been removed from the 3' end of the mature tracrRNA.
  • the tracrRNA may also be a truncated type in which about 1 to 80 nucleotide residues have been removed from the 5' end of the mature tracrRNA.
  • the tracrRNA may also be a truncated type in which, for example, about 1 to 20 nucleotide residues have been removed from the 5' end and about 1 to 40 nucleotide residues have been removed from the 3' end.
  • Various crRNA repeat sequences and tracrRNA sequences for sgRNA design have been proposed, and those skilled in the art can design sgRNAs based on known techniques (e.g., Jinek et al. (2012) Science, 337, 816-21; Mali et al.
  • the terms "protospacer adjacent motif" and "PAM" are used interchangeably and refer to a sequence recognized by the Cas protein during DNA cleavage by the Cas protein.
  • the sequence and position of the PAM vary depending on the type of Cas protein. For example, in the case of the Cas9 protein, the PAM must be immediately adjacent to the 3' side of the target sequence.
  • the sequence of the PAM corresponding to the Cas9 protein varies depending on the bacterial species from which the Cas9 protein is derived. For example, the PAM corresponding to the Cas9 protein of S. pyogenes is "NGG", the PAM corresponding to the Cas9 protein of S. thermophilus is "NNAGAA", the PAM corresponding to the Cas9 protein of S. aureus is "NNGRRT" or "NNGRR(N)", the PAM corresponding to the Cas9 protein of N. meningitidis is "NNNNGATT", and the PAM corresponding to the Cas9 protein of T. denticola is "NAAAAC" (where "R" is A or G; "N" is A, T, G, or C).
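The degenerate letters in these PAMs can be expanded mechanically. The sketch below is an illustration only: the table reflects the PAMs commonly reported for the Cas9 orthologs named in this passage, "NNGRR(N)" is written as `NNGRRN`, only the two ambiguity codes defined above ("R" and "N") are handled, and the names `PAMS`, `pam_regex`, and `matches_pam` are hypothetical.

```python
import re

# Only the ambiguity codes used in this passage are covered.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "R": "[AG]", "N": "[ACGT]"}

# Assumed lookup table of commonly reported Cas9 PAMs, keyed by source species.
PAMS = {
    "S. pyogenes": ["NGG"],
    "S. thermophilus": ["NNAGAA"],
    "S. aureus": ["NNGRRT", "NNGRRN"],
    "N. meningitidis": ["NNNNGATT"],
    "T. denticola": ["NAAAAC"],
}


def pam_regex(pam: str) -> re.Pattern:
    """Translate a degenerate PAM such as 'NNGRRT' into a compiled regex."""
    return re.compile("".join(IUPAC[base] for base in pam))


def matches_pam(species: str, seq: str) -> bool:
    """Return True if seq is a valid PAM for the Cas9 of the given species."""
    candidate = seq.upper()
    return any(pam_regex(pam).fullmatch(candidate) for pam in PAMS[species])
```

A call such as `matches_pam("S. pyogenes", "TGG")` checks a three-base candidate against "NGG"; the candidate must have the same length as the PAM to match.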
  • the terms "spacer sequence" and "guide sequence" are used interchangeably and refer to a sequence contained in a guide RNA that can bind to a complementary strand of a target sequence.
  • the spacer sequence is the same sequence as the target sequence (with the exception that T in the target sequence becomes U in the spacer sequence).
  • the spacer sequence may contain one or more base mismatches with the target sequence. When multiple base mismatches are contained, the mismatches may be located adjacent to each other or may be located distant from each other.
  • the spacer sequence may contain 1 to 5 base mismatches with the target sequence. In a particularly preferred embodiment, the spacer sequence may contain one base mismatch with the target sequence.
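The relationship between a target sequence and its spacer, and the mismatch count discussed above, can be expressed in a few lines. This is a minimal sketch with hypothetical function names; it compares position by position and does not model gaps or bulges.

```python
def spacer_from_target(target_dna: str) -> str:
    """Return the RNA spacer identical to the DNA target, with T written as U."""
    return target_dna.upper().replace("T", "U")


def count_mismatches(spacer_rna: str, target_dna: str) -> int:
    """Count positions where the spacer differs from the perfectly matching spacer."""
    expected = spacer_from_target(target_dna)
    if len(spacer_rna) != len(expected):
        raise ValueError("spacer and target must have the same length")
    return sum(a != b for a, b in zip(spacer_rna.upper(), expected))
```

A spacer would fall within the 1 to 5 mismatch range mentioned above when `1 <= count_mismatches(spacer, target) <= 5`.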
  • the spacer sequence is positioned on the 5' side of the crRNA.
  • the term "operably linked", when used with respect to a polynucleotide, means that a first base sequence is positioned sufficiently close to a second base sequence that the first base sequence can affect the second base sequence or a region under the control of the second base sequence.
  • the phrase "a polynucleotide is operably linked to a promoter" means that the polynucleotide is linked such that it is expressed under the control of the promoter.
  • the term “expressible state” refers to a state in which a polynucleotide can be transcribed in a cell into which it has been introduced.
  • the term "expression vector" refers to a vector containing a target polynucleotide and equipped with a system that allows the target polynucleotide to be expressed in a cell into which the vector is introduced.
  • the term "Cas protein expression vector" refers to a vector that can express Cas protein in a cell into which the vector is introduced.
  • the term "guide RNA expression vector" refers to a vector that can express guide RNA in a cell into which the vector is introduced.
  • sequence identity (or homology) between base sequences or amino acid sequences is determined by aligning the two base sequences or amino acid sequences, with gaps placed at insertion and deletion sites, so that the greatest number of corresponding bases or amino acids match, and then calculating the ratio of matching bases or amino acids to the entire base sequence or entire amino acid sequence, excluding gaps, in the resulting alignment.
  • Sequence identity between base sequences or amino acid sequences can be determined using various homology search software known in the art.
  • the sequence identity value (identity value) of base sequences is not particularly limited, and can be obtained, for example, by a BLAT search installed in the known homology search software UCSC Genome Browser.
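The identity calculation can be illustrated for two sequences that have already been aligned. The sketch below is one reading of the definition, stated as an assumption: alignment columns that contain a gap (written `-`) in either sequence are excluded from the denominator. The function name is hypothetical, and producing the alignment itself is left to existing search software such as the BLAT search mentioned above.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over the gap-free columns of a pairwise alignment."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    # Keep only columns where neither sequence has a gap character.
    columns = [
        (a, b)
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if a != "-" and b != "-"
    ]
    if not columns:
        raise ValueError("alignment has no gap-free columns")
    matches = sum(a == b for a, b in columns)
    return 100.0 * matches / len(columns)
```

For the alignment `ACGT-ACGT` against `ACGTTACCT`, eight columns are gap-free and seven of them match.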
  • hg38 is a reference genome released by the University of California, Santa Cruz (UCSC) in December 2013.
  • the reference genome is a reference genome created by combining various genomes, and it does not mean that there is a human having this genome.
  • the decoded fragmentary sequence information is linked to construct a continuous sequence on a computer, and the sequence of the genomic DNA of the human individual can be estimated.
  • the genomic DNA of an individual such as a human individual is usually decoded by matching the sequence of the genomic DNA of the human individual to the reference genome.
  • a position or region corresponding to a specific position or specific region of the hg38 genome sequence means a position or region linked to the specific position or specific region in the genome of another individual having a different specific sequence.
  • a position or region having a sequence characteristic of the position or region based on sequence identity corresponds to a specific position or region of the hg38 genome sequence.
  • the corresponding position can be determined by aligning the partial sequences of two genomic DNAs. Even if there is a difference in the specific sequence, the correspondence between the two genomic DNAs can be determined by aligning them if they have an orthologous relationship or sequence identity.
  • the method for preparing a library may include preparing cells.
  • the prepared cells are pre-modified cells of the present disclosure, and may be referred to as "reference cells" because they can serve as a reference for comparison with modified cells.
  • the cells may preferably be cloned cells, established cells, or immortalized cells.
  • the cells may include a single type of cell.
  • a library of cells in which only a specific region of a specific locus is replaced with a target sequence is provided. Because keeping every sequence other than the target sequence identical across the library makes the technical significance of the replacement clear, it is preferable that the cells consist of a single type of cell.
  • the single type of cell is a cloned cell.
  • the method for preparing the library may include obtaining a first intermediate cell by using the following genome modification method for the cell.
  • the first intermediate cell is a cell having a genome including a first allele and a second allele at a locus to be modified, and the first allele and the second allele each have a cassette including a selection marker gene and a target nucleic acid sequence.
  • the selection marker gene of the first allele and the selection marker gene of the second allele are distinguishably different
  • the target nucleic acid sequence is a target of the genome modification system, and is designed so that the first allele and the second allele can be distinguishably cleaved by the genome modification system
  • each selection marker gene is a negative selection marker gene that can be used for negative selection.
  • the negative selection marker gene may also be used for positive selection (e.g., a visualization marker gene, etc.).
  • the method for preparing the library may include obtaining a library of modified cells (sometimes referred to as a "library of first modified cells" or simply "first library") from the first intermediate cell.
  • the modified cells contained in the first library may be referred to as "first modified cells”.
  • the first intermediate cell can be obtained by a person skilled in the art. Although not particularly limited, it can be easily prepared by using the genome modification method described below. Obtaining a modified cell from the intermediate cell can be achieved by cleaving the first cassette or its vicinity in the presence of donor DNA for introducing a modified base sequence.
  • the method for producing a library may include obtaining a first intermediate cell using the following genome modification method for a cell.
  • the method for producing a library may include obtaining a second intermediate cell from the first intermediate cell.
  • the method for producing a library may include obtaining a library from the second intermediate cell.
  • the first intermediate cell is as described above.
  • the second intermediate cell is a cell having a genome including a first allele and a second allele at a locus to be modified, the first allele having a cassette including a selection marker gene and a target nucleic acid sequence, and the second allele not including the cassette.
  • the second intermediate cell can be produced by removing the cassette from the second allele of the first intermediate cell.
  • Removal of the cassette can be appropriately performed by a person skilled in the art using a genome modification method.
  • a genome modification system capable of cleaving a target sequence contained in a second allele in the presence of a donor DNA having an upstream homology arm capable of homologous recombination with the upstream of the cassette and a downstream homology arm capable of homologous recombination with the downstream of the cassette upon cleavage can be applied to the first intermediate cell, thereby obtaining a second intermediate cell from the first intermediate cell.
  • the method for producing a library can include obtaining a library of modified cells (sometimes referred to as a "library of second modified cells" or simply a "second library”) from the second intermediate cell.
  • the modified cells contained in the second library can be sometimes referred to as "second modified cells".
  • Obtaining the second intermediate cell from the first intermediate cell can be achieved by cleaving a sequence in or near the second cassette in the presence of a donor DNA for removing the cassette.
  • Obtaining the modified cell from the second intermediate cell can be achieved by cleaving the first cassette in the presence of a donor DNA for introducing a modified base sequence.
  • the second intermediate cell is produced by removing the cassette from the second allele of the first intermediate cell.
  • the cassette in the second allele of the first intermediate cell is removed to return the second allele to the sequence before modification, and a third intermediate cell is obtained in which the first allele has a cassette containing a selection marker gene and a target nucleic acid sequence, and the second allele has the sequence before modification.
  • a library of modified cells is then obtained by applying a donor DNA for library production to the first allele of the third intermediate cell.
  • the modified cells contained in the library of modified cells contain a modified base sequence in the first allele, and the second allele has the sequence before modification.
  • the operation of returning the second allele of the first intermediate cell to the sequence before modification can be achieved by cutting within or near the second cassette in the presence of a donor DNA consisting of an upstream homology arm capable of homologous recombination with the upstream of the second cassette and a downstream homology arm capable of homologous recombination with the downstream of the cassette.
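  • The sequence of cell states described above (first intermediate cell, second intermediate cell, library of modified cells) can be sketched as a small state model. This is an illustrative sketch only: the class `Locus`, the state labels, and the function names are hypothetical and are not part of the disclosure.

```python
from __future__ import annotations
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Locus:
    """State of the two alleles at the locus to be modified (hypothetical labels)."""
    allele1: str  # "cassette", "unmodified", or the name of a modified sequence
    allele2: str


def remove_second_cassette(cell: Locus) -> Locus:
    """First intermediate cell -> second intermediate cell.

    Models removal of the cassette from the second allele, returning
    that allele to the sequence before modification.
    """
    if cell.allele1 != "cassette" or cell.allele2 != "cassette":
        raise ValueError("expected a cassette in both alleles")
    return replace(cell, allele2="unmodified")


def build_library(cell: Locus, variants: list[str]) -> list[Locus]:
    """Second intermediate cell -> library of modified cells.

    Each variant stands for one modified base sequence introduced into
    the first allele; the second allele is left untouched.
    """
    if cell.allele1 != "cassette":
        raise ValueError("expected a cassette in the first allele")
    return [replace(cell, allele1=v) for v in variants]


first_intermediate = Locus("cassette", "cassette")
second_intermediate = remove_second_cassette(first_intermediate)
library = build_library(second_intermediate, ["variant_A", "variant_B"])
```

  • In this sketch every member of the library differs only in the first allele, which mirrors the statement that the second allele keeps the sequence before modification.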
  • the present disclosure provides a first intermediate cell, a second intermediate cell, a composition comprising these intermediate cells, a first modified cell, a first library, a second modified cell, and a second library.
  • a cut is typically made in the target region, and the target region is replaced with a sequence sandwiched between the upstream and downstream homology arms in the donor DNA. If the sequence sandwiched between the upstream and downstream homology arms does not exist, the target region is deleted (or seamlessly deleted).
  • cuts may be made at multiple locations in the target region. Typically, it is beneficial to make cuts in both the target region adjacent to the upstream homology arm and the target region adjacent to the downstream homology arm.
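  • The replacement and seamless-deletion outcomes described above can be illustrated with plain strings. The following is a toy sketch under stated assumptions: sequences are short text strings, each homology arm matches the genome exactly, and the function name `apply_hdr` is hypothetical. It does not model cleavage, repair efficiency, or any real biology.

```python
def apply_hdr(genome: str, up_arm: str, down_arm: str, payload: str = "") -> str:
    """Replace the region between the two homology arms with `payload`.

    An empty payload models a donor DNA with nothing between the arms,
    which deletes the target region seamlessly.
    """
    up_start = genome.find(up_arm)
    if up_start == -1:
        raise ValueError("upstream homology arm not found")
    up_end = up_start + len(up_arm)
    down_start = genome.find(down_arm, up_end)
    if down_start == -1:
        raise ValueError("downstream homology arm not found")
    # Keep everything up to the end of the upstream arm, insert the payload,
    # then continue from the start of the downstream arm.
    return genome[:up_end] + payload + genome[down_start:]


genome = "AAAACCCC" + "TTTTTT" + "GGGGAAAA"  # upstream flank + target region + downstream flank
replaced = apply_hdr(genome, "CCCC", "GGGG", payload="ATAT")
deleted = apply_hdr(genome, "CCCC", "GGGG")
```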
  • the target region in each of the first allele and the second allele is replaced with a cassette, and the first allele and the second allele may each have a deletion of the target region.
  • a cassette is inserted into the target region in each of the first allele and the second allele, and the first allele and the second allele need not have a deletion of the target region.
  • the cassette is inserted into a non-functional target region, and the insertion therefore does not result in a loss of function or deficiency associated with disruption of the target region.
  • the cells used in the genome modification method of this embodiment are not particularly limited, and may be cells having a haploid or diploid or higher chromosomal genome.
  • the cells may be diploid, triploid, or tetraploid or higher.
  • Examples of cells include, but are not limited to, eukaryotic cells.
  • the cells may be plant cells, animal cells, or fungal cells.
  • the animal cells may be, but are not limited to, cells of humans, non-human mammals (e.g., non-human primates such as monkeys, non-human mammals such as dogs, cats, cows, horses, sheep, goats, llamas, and rodents), birds, reptiles, amphibians, fish, and other vertebrates.
  • pluripotent cells (e.g., pluripotent stem cells such as embryonic stem cells (ES cells) and induced pluripotent stem cells (iPS cells))
  • hematopoietic stem cells, hematopoietic progenitor cells
  • bone marrow cells, spleen cells
  • immune cells (e.g., T cells, B cells, NK cells, NKT cells, macrophages, monocytes, neutrophils, eosinophils, basophils)
  • erythrocytes, megakaryocytes
  • cardiac cells, cardiomyocytes, cardiac fibroblasts, pancreatic beta cells
  • corneal cells (e.g., corneal epithelial cells and corneal endothelial cells)
  • epidermal cells, dermal cells
  • adipocytes, chondrocytes, osteocytes, osteoclasts, osteoblasts
  • mesenchymal stem cells e.g., adipose-
  • the cell may be an isolated cell, a cloned cell, or a cell line.
  • the cell may be an immortalized cell.
  • the cell is a cloned cell.
  • the cell is a cell line.
  • the cell is an immortalized cell.
  • the cell is a primary somatic cell. It will be understood by those skilled in the art that the cell is appropriately selected depending on the intended use.
  • the cells can be frozen in a cell cryoprotectant.
  • the cell cryoprotectant containing the cells can be provided in a non-frozen state or preferably in a frozen state.
  • the cell cryoprotectant containing the cells in a frozen state (also called a "freeze stock") can be used as a research cell bank (RCB), master cell bank (MCB), or working cell bank (WCB).
  • the present invention provides a research cell bank (RCB), master cell bank (MCB), or working cell bank (WCB) that includes the above-mentioned frozen stock.
  • a method including the following (a) and (b) can be used to prepare the first intermediate cell: (a) introducing the following (i) and (ii) into a cell containing two or more alleles to introduce a selection marker gene into each of the two or more alleles; (i) a sequence-specific nucleic acid cleaving molecule capable of targeting and cleaving a target region in two or more alleles of the chromosomal genome, or a genome modification system comprising a polynucleotide encoding the sequence-specific nucleic acid cleaving molecule; (ii) Two or more types of donor DNA for selection markers, each of which has an upstream homology arm having a base sequence capable of homologous recombination with a base sequence on the upstream side of the target region and a downstream homology arm having a base sequence capable of homologous recombination with a base sequence on the downstream side of the target region, and contains a base sequence of
  • In step (a), (i) and (ii) are introduced into a cell containing the chromosome.
  • Genome modification system refers to a molecular mechanism capable of modifying a desired target region.
  • the genome modification system includes a sequence-specific nucleic acid cleavage molecule that targets a target region of a chromosomal genome, or a polynucleotide that encodes the sequence-specific nucleic acid cleavage molecule. More specifically, the genome modification system can cleave at least one site, preferably two sites, in or near the target region.
  • the target region to be subjected to genome modification can be any region on the genome having one or more alleles.
  • the size of the target region is not particularly limited. In the genome modification method of this embodiment, a region of a larger size than conventionally can be modified.
  • the target region may be, for example, 10 kbp or more.
  • the target region may be, for example, 100 bp or more, 200 bp or more, 400 bp or more, 800 bp or more, 1 kbp or more, 2 kbp or more, 3 kbp or more, 4 kbp or more, 5 kbp or more, 8 kbp or more, 10 kbp or more, 20 kbp or more, 40 kbp or more, 80 kbp or more, 100 kbp or more, 200 kbp or more, 300 kbp or more, 400 kbp or more, 500 kbp or more, 600 kbp or more, 700 kbp or more, 800 kbp or more, 900 kbp or more, or 1 Mbp or more, or any of the above values or less.
  • the target region is deleted in the modified cell.
  • sequence-specific nucleic acid cleaving molecule is not particularly limited as long as it has sequence-specific nucleic acid cleaving activity, and may be a synthetic organic compound or a biopolymer compound such as a protein.
  • An example of a protein having sequence-specific nucleic acid cleavage activity is a sequence-specific endonuclease.
  • a sequence-specific endonuclease is an enzyme that can cleave nucleic acids at a specific sequence.
  • a sequence-specific endonuclease can cleave double-stranded DNA at a specific sequence.
  • Examples of sequence-specific endonucleases include, but are not limited to, zinc finger nucleases (ZFNs), transcription activator-like effector nucleases (TALENs), and Cas proteins.
  • ZFNs are artificial nucleases that contain a nucleic acid cleavage domain conjugated to a binding domain that contains a zinc finger array.
  • cleavage domains include the cleavage domain of the type II restriction enzyme FokI.
  • Zinc finger nucleases capable of cleaving a target sequence can be designed by known methods.
  • TALENs are artificial nucleases that contain a DNA-binding domain of a transcription activator-like (TAL) effector in addition to a DNA-cleavage domain (e.g., a FokI domain).
  • TALE constructs capable of cleaving a target sequence can be designed by known methods (e.g., Zhang, Feng et al. (2011) Nature Biotechnology 29 (2)).
  • the genome modification system includes a CRISPR/Cas system. That is, the genome modification system preferably includes a Cas protein and a guide RNA having a base sequence homologous to a base sequence in the target region.
  • the guide RNA may include a sequence homologous to a sequence in the target region (target sequence) as a spacer sequence.
  • the guide RNA may be capable of binding to DNA in the target region, and does not need to have a sequence completely identical to the target sequence. This binding may be formed under physiological conditions in the cell nucleus.
  • the guide RNA may include, for example, 0 to 3 base mismatches with respect to the target sequence.
  • the number of mismatches is preferably 0 to 2 bases, more preferably 0 to 1, and even more preferably no mismatches.
  • the guide RNA may be designed based on a known method.
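  • The mismatch ranges stated above can be checked with a simple position-by-position comparison. This is a minimal sketch under stated assumptions: the spacer is written in the DNA alphabet and has the same length as the target sequence, and the function names are hypothetical. As noted above, what matters in practice is whether the guide RNA can bind the target DNA, which this count does not predict.

```python
def count_mismatches(spacer: str, target: str) -> int:
    """Count positions at which the spacer differs from the target sequence."""
    if len(spacer) != len(target):
        raise ValueError("spacer and target must have the same length")
    return sum(a != b for a, b in zip(spacer.upper(), target.upper()))


def mismatch_tier(spacer: str, target: str) -> str:
    """Map the mismatch count onto the ranges given in the text."""
    n = count_mismatches(spacer, target)
    if n == 0:
        return "no mismatches"
    if n == 1:
        return "within 0 to 1"
    if n == 2:
        return "within 0 to 2"
    if n == 3:
        return "within 0 to 3"
    return "outside the stated range"


target = "GACGTTACCGGATCAAGCTT"
spacer = "GACGTTACCGGATCAAGCTA"  # differs from the target at the last base only
tier = mismatch_tier(spacer, target)
```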
  • the genome modification system is preferably a CRISPR/Cas system, and preferably includes a Cas protein and a guide RNA.
  • the Cas protein is preferably a Cas9 protein.
  • sequence-specific endonuclease may be introduced into the cell as a protein, or may be introduced into the cell as a polynucleotide encoding the sequence-specific endonuclease.
  • the mRNA of the sequence-specific endonuclease may be introduced, or an expression vector of the sequence-specific endonuclease may be introduced.
  • the coding sequence of the sequence-specific endonuclease (sequence-specific endonuclease gene) is functionally linked to a promoter.
  • the promoter is not particularly limited, and for example, various pol II promoters can be used.
  • pol II promoters include, but are not limited to, the CMV promoter, the EF1 promoter (EF1α promoter), the SV40 promoter, the MSCV promoter, the hTERT promoter, the β-actin promoter, the CAG promoter, and the CBh promoter.
  • the promoter may be an inducible promoter.
  • An inducible promoter is a promoter that can induce expression of a polynucleotide functionally linked to the promoter only in the presence of an inducer that drives the promoter.
  • Inducible promoters include promoters that induce gene expression by heating, such as heat shock promoters.
  • Inducible promoters also include promoters in which the inducer that drives the promoter is a drug.
  • drug-inducible promoters include, for example, cumate operator sequences, lambda operator sequences (e.g., 12 ⁇ Op), tetracycline-based inducible promoters, and the like.
  • Tetracycline-based inducible promoters include, for example, promoters that drive gene expression in the presence of tetracycline or a derivative thereof (e.g., doxycycline), or reverse tetracycline-controlled transactivator (rtTA). Tetracycline-based inducible promoters include, for example, the TRE3G promoter.
  • any known expression vector can be used without any particular restrictions.
  • expression vectors include plasmid vectors and viral vectors.
  • the expression vector may contain a guide RNA coding sequence (guide RNA gene) in addition to the coding sequence of the Cas protein (Cas protein gene).
  • the guide RNA coding sequence (guide RNA gene) is functionally linked to a pol III promoter.
  • pol III promoters include mouse and human U6-snRNA promoters, human H1-RNase P RNA promoters, and human valine-tRNA promoters.
  • the donor DNA for a selection marker is a donor DNA for knocking in a selection marker into a target region.
  • the donor DNA for a selection marker contains the base sequence of one or more selection marker genes between an upstream homology arm having a base sequence homologous to a base sequence adjacent to the upstream side of the target region and a downstream homology arm having a base sequence homologous to a base sequence adjacent to the downstream side of the target region.
  • the donor DNA for the selection marker may have a length of, but is not limited to, 1 kb or more, 2 kb or more, 3 kb or more, 4 kb or more, 5 kb or more, 6 kb or more, 7 kb or more, 8 kb or more, 9 kb or more, 9.5 kb or more, or 10 kb or more.
  • the donor DNA for the selection marker may have a length of, but is not limited to, 50 kb or less, 45 kb or less, 40 kb or less, 35 kb or less, 30 kb or less, 25 kb or less, 20 kb or less, 15 kb or less, 14 kb or less, 13 kb or less, 12 kb or less, 11 kb or less, 10 kb or less, 9 kb or less, 8 kb or less, 7 kb or less, 6 kb or less, 5 kb or less, or 4 kb or less.
  • a “selection marker” refers to a protein that can select cells based on the presence or absence of its expression.
  • a selection marker gene is a gene that codes for a selection marker. When a selection marker-expressing cell is selected in a cell population in which selection marker-expressing cells and non-expressing cells are mixed, the selection marker is called a "positive selection marker” or a “selection marker for positive selection”. When a selection marker-non-expressing cell is selected in a cell population in which selection marker-expressing cells and non-expressing cells are mixed, the selection marker is called a "negative selection marker” or a “selection marker for negative selection”.
  • When selection markers are different from each other, it means that they can be distinguished from each other (e.g., they are distinguishably different); for example, they can be distinguished from each other at least in physiological properties, such as the drug resistance that they confer on cells into which the selection marker is introduced, or in other physicochemical properties. In other words, when selection markers are different from each other, each selection marker can be detected in a manner distinguishable from the other selection markers, or cells can be selected with a drug in a manner distinguishable from the other selection markers.
  • the selective marker gene being unique to each type of donor DNA for selective markers means that the selective marker gene possessed by one type of donor DNA for selective markers is not contained in other types of donor DNA for selective markers, or, when contained in multiple types of donor DNA, is configured so that it is not expressed from two or more types of donor DNA at the same time.
  • the two or more types of donor DNA may be identical except for the selective marker, or may differ in the sequence and/or structure other than the selective marker.
  • the positive selection marker is not particularly limited as long as it allows the selection of cells expressing it.
  • positive selection marker genes include drug resistance genes, fluorescent protein genes, luminescent enzyme genes, and chromogenic enzyme genes.
  • the negative selection marker is not particularly limited as long as it is capable of selecting cells that do not express it.
  • negative selection marker genes include suicide genes (such as thymidine kinase), fluorescent protein genes, luminescent enzyme genes, and chromogenic enzyme genes.
  • the negative selection marker gene can be functionally linked to an inducible promoter.
  • the negative selection marker gene can be expressed only when it is desired to remove cells that have the negative selection marker gene.
  • When the negative selection marker gene has little negative effect on cell survival, such as when it is an optically detectable marker gene (visible marker gene) that is fluorescent, luminescent, or chromogenic, it may be expressed constitutively.
  • Examples of drug resistance genes include, but are not limited to, a puromycin resistance gene, a blasticidin resistance gene, a geneticin resistance gene, a neomycin resistance gene, a tetracycline resistance gene, a kanamycin resistance gene, a zeocin resistance gene, a hygromycin resistance gene, and a chloramphenicol resistance gene.
  • Examples of fluorescent protein genes include, but are not limited to, green fluorescent protein (GFP) gene, yellow fluorescent protein (YFP) gene, red fluorescent protein (RFP) gene, and the like.
  • Examples of the luminescent enzyme gene include, but are not limited to, the luciferase gene.
  • chromogenic enzyme genes include, but are not limited to, β-galactosidase gene, β-glucuronidase gene, alkaline phosphatase gene, and the like.
  • suicide genes include, but are not limited to, herpes simplex virus thymidine kinase (HSV-TK), inducible caspase 9, and the like.
  • the selection marker gene contained in the selection marker donor DNA is preferably a positive selection marker gene.
  • cells expressing the selection marker can be selected as cells in which the selection marker gene has been knocked in.
  • the upstream homology arm has a base sequence capable of homologous recombination with a base sequence upstream of the target region in the genome to be modified, for example, a base sequence homologous to a base sequence adjacent to the upstream side of the target sequence.
  • the downstream homology arm has a base sequence capable of homologous recombination with a base sequence downstream of the target region in the genome to be modified, for example, a base sequence homologous to a base sequence adjacent to the downstream side of the target sequence.
  • the length and sequence of the upstream homology arm and the downstream homology arm are not particularly limited as long as they are capable of homologous recombination with the surrounding region of the target region.
  • the upstream homology arm and the downstream homology arm do not necessarily have to completely match the upstream or downstream sequence of the target region as long as they can perform homologous recombination.
  • the upstream homology arm can be a sequence having 90% or more sequence identity (homology) with the base sequence adjacent to the upstream side of the target region, and it is preferable that the sequence identity is 92% or more, 93% or more, 94% or more, 95% or more, 96% or more, 97% or more, 98% or more, or 99% or more.
  • the downstream homology arm can be a sequence having 90% or more sequence identity (homology) with the base sequence adjacent to the downstream side of the target region, and preferably has 92% or more, 93% or more, 94% or more, 95% or more, 96% or more, 97% or more, 98% or more, or 99% or more sequence identity.
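  • The sequence identity thresholds for the homology arms can be illustrated with a simple calculation. This sketch assumes an ungapped, equal-length comparison between an arm and the flanking genomic sequence; identity is normally computed from an alignment, so this is a simplification, and the function names are hypothetical.

```python
def percent_identity(arm: str, flank: str) -> float:
    """Ungapped percent identity between a homology arm and the flanking sequence."""
    if not arm or len(arm) != len(flank):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(arm.upper(), flank.upper()))
    return 100.0 * matches / len(arm)


def arm_meets_threshold(arm: str, flank: str, threshold: float = 90.0) -> bool:
    """True if the arm reaches the given identity threshold (90% by default)."""
    return percent_identity(arm, flank) >= threshold


flank = "ACGTACGTAC"
arm = "ACGTACGTAT"  # 9 of 10 positions match the flank
identity = percent_identity(arm, flank)
```

  • With these example sequences the arm reaches the 90% threshold but not the stricter 92% level listed among the preferred values.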
  • the efficiency of allele modification can be further increased.
  • “close” can mean that the distance between the two sequences is 100 bp or less, 50 bp or less, 40 bp or less, 30 bp or less, 20 bp or less, or 10 bp or less.
  • the selection marker gene is located between the upstream homology arm and the downstream homology arm.
  • the selection marker gene is introduced into the target region by HDR. Disrupting a gene in this way is called gene knockout, and introducing a desired gene in this way is called gene knockin; it is possible to knock out one gene while knocking in another.
  • the selection marker gene is preferably functionally linked to a promoter so that it is expressed under the control of an appropriate promoter.
  • the promoter can be appropriately selected depending on the type of cell into which the donor DNA is introduced. Examples of promoters include SRα promoter, SV40 early promoter, retroviral LTR, CMV (cytomegalovirus) promoter, RSV (Rous sarcoma virus) promoter, HSV-TK (herpes simplex virus thymidine kinase) promoter, EF1α promoter, metallothionein promoter, and heat shock promoter.
  • the donor DNA for the selection marker may have any control sequence such as an enhancer, a polyA addition signal, or a terminator.
  • the donor DNA for the selection marker may have an insulator sequence.
  • An "insulator” refers to a sequence that blocks or alleviates the influence of the adjacent chromosomal environment and ensures or enhances the independence of the transcriptional regulation of the DNA sandwiched between the regions.
  • An insulator is defined by its enhancer blocking effect (the effect of blocking the effect of the enhancer on promoter activity by inserting it between an enhancer and a promoter) and its suppression effect on position effect (the effect of preventing the expression of the introduced gene from being affected by the position on the genome where it is inserted by sandwiching both sides of the introduced gene with insulators).
  • the donor DNA for the selection marker may have an insulator sequence between the upstream arm and the selection marker gene (or between the upstream arm and the promoter that controls the selection marker gene).
  • the donor DNA for the selection marker may have an insulator sequence between the downstream arm and the selection marker gene.
  • the donor DNA for the selection marker may be linear or circular, but is preferably circular.
  • the donor DNA for the selection marker is a plasmid.
  • the donor DNA for the selection marker may contain any sequence in addition to the above sequences. For example, it may contain a spacer sequence in all or part of the sequences between the upstream homology arm, the insulator, the selection marker gene, and the downstream homology arm.
  • donor DNA for selection markers is introduced into cells in a number equal to or greater than the number of alleles to be modified in the genome.
  • Different types of donor DNA for selection markers have mutually different (distinguishable) types of selection marker genes.
  • different types of donor DNA for selection markers do not have completely identical selection marker genes or sets thereof. That is, a first type of donor DNA for selection markers has a first type of selection marker gene, a second type of donor DNA for selection markers has a second type of selection marker gene, a third type of donor DNA for selection markers has a third type of selection marker gene, and so on for subsequent types of donor DNA for selection markers.
  • When there are two alleles to be modified in the genome, there are two or more types of donor DNA for selection markers.
  • one donor DNA for a selection marker may have two or more different (distinguishable) selection markers (even in this case, the different types of donor DNA for a selection marker must have different (distinguishable) types (e.g., unique) of selection marker genes).
  • the donor DNA for a selection marker does not have a recombination sequence of a site-specific recombinase (e.g., a loxP sequence recombined by Cre recombinase and its variants).
  • the method of the present invention does not use a site-specific recombinase and its recombination sequence (e.g., a loxP sequence recombined by Cre recombinase and its variants).
  • When a site-specific recombinase (e.g., Cre recombinase and its variants, which recombine loxP sequences) is used, one recombination sequence of the site-specific recombinase usually remains in the edited genome.
  • the modified genome of the cell obtained by the method of the present invention does not have a recombination sequence (which is foreign) of a site-specific recombinase.
  • the number of types of donor DNA for selection markers may be equal to or greater than the number of alleles to be targeted for genome modification, with no particular upper limit.
  • By using a number of types of donor DNA for selection markers equal to or greater than the number of alleles to be targeted for genome modification, two or more alleles can be stably modified.
  • the number of types of donor DNA for selection markers is preferably equal to the number of alleles to be targeted for genome modification or about 1 to 2 more, and more preferably equal to the number of alleles to be targeted for genome modification.
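  • The two constraints on a donor DNA set described above (at least as many donor types as alleles to be modified, and distinguishable selection marker genes for each type) can be expressed as a small check. This is a sketch under a simplifying assumption: a marker is treated as unique only if no other donor type carries it, so the alternative in which a shared marker is never expressed from two donor types at once is not modeled. The function and the donor and marker names are hypothetical.

```python
from __future__ import annotations


def validate_donor_set(donors: dict[str, set[str]], n_alleles: int) -> list[str]:
    """Return a list of problems found in a donor DNA set (empty if none).

    `donors` maps a donor type name to the selection marker genes it carries.
    """
    problems: list[str] = []
    if len(donors) < n_alleles:
        problems.append(
            f"{len(donors)} donor type(s) for {n_alleles} alleles: need at least {n_alleles}"
        )
    owner: dict[str, str] = {}
    for name, markers in donors.items():
        if not markers:
            problems.append(f"{name} carries no selection marker gene")
        for marker in sorted(markers):
            if marker in owner:
                problems.append(f"{marker} is shared by {owner[marker]} and {name}")
            else:
                owner[marker] = name
    return problems


ok = validate_donor_set({"donor_A": {"puroR"}, "donor_B": {"blastR"}}, n_alleles=2)
shared = validate_donor_set({"donor_A": {"puroR"}, "donor_B": {"puroR"}}, n_alleles=2)
too_few = validate_donor_set({"donor_A": {"puroR"}}, n_alleles=2)
```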
  • the method of introducing (i) and (ii) into cells is not particularly limited, and known methods can be used without particular limitation.
  • Examples of methods of introducing (i) and (ii) into cells include, but are not limited to, viral infection, lipofection, microinjection, calcium phosphate, DEAE-dextran, electroporation, and particle gun.
  • two or more types of donor DNA for selection marker can be knocked into two or more alleles of the target region randomly when the upstream homology arm and downstream homology arm of each are identical.
  • two or more types of donor DNA for selection markers can modify each of the two or more alleles as long as the donor DNA for selection markers has a base sequence of a homology arm that can undergo homologous recombination with the upstream and downstream sequences of the target region of each of the two or more alleles, and therefore does not need to have completely identical base sequences of the homology arms.
  • the donor DNA for selection markers may have base sequences of the upstream and downstream homology arms that are more identical to the upstream and downstream sequences of the target region of each allele (e.g., may be optimized in this way).
  • the donor DNA for the selection marker has an upstream homology arm and a downstream homology arm, and has a selection marker gene between the upstream homology arm and the downstream homology arm, and preferably may further have a target sequence for an endonuclease (a base sequence-specific nucleic acid cleavage molecule) such as a cleavage site for a meganuclease.
  • the selection marker includes a selection marker gene for positive selection and a marker gene for negative selection.
  • the selection marker includes a selection marker for positive selection, but may not include a negative selection marker gene separately from this.
  • the selection marker gene for positive selection can also be used for negative selection, and such a marker gene may include a visualization marker gene.
  • a set of two or more donor DNAs for selection markers is a combination of the above donor DNAs for selection markers, and each of them has a selectable marker gene for positive selection that can be distinguished from the others.
  • the above set may further have a target sequence for an endonuclease (base sequence-specific nucleic acid cutting molecule) such as a cleavage site of a meganuclease, and the target sequences may be different from each other, but are preferably the same (or can be cut by the same base sequence-specific nucleic acid cutting molecule).
  • the length of the donor DNA for selection markers is as described above, but may be, for example, 5 kbp or more, 8 kbp or more, or 10 kbp or more.
  • After step (a), step (b) is performed.
  • In step (b), cells in which mutually distinguishable selection marker genes, or combinations thereof, have been introduced into two or more alleles are selected based on the expression of those distinguishable selection marker genes. More specifically, in step (b), cells are selected that express all of the distinguishable selection marker genes introduced into the two or more alleles by homologous recombination of the different types of selection marker donor DNA with the two or more alleles.
  • In step (b), cells in which the different selection marker donor DNAs have been introduced and each allele has been modified are selected based on the expression of all the selection marker genes contained in the two or more selection marker donor DNAs and integrated into the chromosomal genome. In one aspect, in step (b), cells are selected based on all the selection marker genes contained in the two or more selection marker donor DNAs. In one aspect, in step (b), cells in which each allele has been modified by the introduction of a distinguishable donor DNA for selection marker are selected based on the expression of all the selection marker genes (marker genes for positive selection) that are contained in the two or more types of donor DNA for selection marker and that have been integrated into the chromosomal genome.
  • the cells obtained in step (b) have different marker genes for positive selection in each allele. In one aspect, the cells obtained in step (b) have a common marker gene for positive selection in each allele.
  • single cell cloning is not performed in step (b) (however, the method may or may not include single cell cloning after cells in which two or more alleles have been modified are selected in step (b)). In one aspect, in step (b), the selection of cells is performed based on the expression of multiple distinguishable marker genes for positive selection introduced into each allele.
  • step (b) is not performed by a method of estimating the number of modified alleles based on the expression intensity of a single selection marker gene (e.g., expression intensity or fluorescence intensity of a fluorescent protein).
  • In step (b), cells may be selected as appropriate depending on the type of selection marker gene used in step (a). In this case, cells are selected based on the expression of all of the selection marker genes used in step (a).
  • the selection marker gene when the selection marker gene is a positive selection marker gene, cells expressing all the selection marker genes that are incorporated (or have been incorporated) into the chromosomal genome to be modified can be selected, for example, cells expressing the same number of positive selection markers as the number of alleles to be modified can be selected.
  • When the positive selection marker gene is a drug resistance gene, cells expressing the positive selection marker can be selected by culturing the cells in a medium containing the drug.
  • When the positive selection marker gene is a fluorescent protein gene, a luminescent enzyme gene, or a chromogenic enzyme gene, cells expressing the positive selection marker can be selected by selecting cells that exhibit fluorescence, luminescence, or color due to the fluorescent protein, luminescent enzyme, or chromogenic enzyme.
  • When the number of alleles to be modified is n or less and n or more types of selection marker donor DNA are incorporated into the genome, at least the alleles to be modified (two or more alleles) are modified.
  • the number of alleles to be modified is n, and the corresponding number of types of donor DNA for selection markers are incorporated into the chromosomal genome, and all alleles are modified.
  • Because the number of types of donor DNA for selection markers used is equal to or greater than the number of alleles to be modified, the number of positive selection markers expressed by a cell indicates that the corresponding number of alleles have been reliably modified. From the viewpoint of increasing the efficiency of cell selection in step (b), it is preferable that the number of alleles to be modified is the same as the number of types of donor DNA for selection markers.
  • In the genome modification method of this embodiment, inducing HDR with n types of donor DNA for selection markers to modify n alleles in an n-ploid cell makes it possible to efficiently obtain cells in which all alleles possessed by the cell have been modified. Furthermore, because cells in which all alleles have been modified can be obtained reliably, cells in which the target region has been modified can be obtained efficiently even if the target region is large (e.g., 10 kbp or more). This makes large-scale genome modification possible.
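  • The selection logic of step (b), in which only cells expressing every distinguishable positive selection marker are kept, can be sketched as a set comparison. This is an illustration only: the cell identifiers, marker names, and the function `select_fully_modified` are hypothetical, and real selection is carried out by drug culture or optical detection rather than by lookup.

```python
from __future__ import annotations


def select_fully_modified(cells: dict[str, set[str]], required: set[str]) -> list[str]:
    """Return the cells that express all required positive selection markers.

    `cells` maps a cell identifier to the set of markers it expresses;
    `required` holds one distinguishable marker per allele to be modified.
    """
    return sorted(cell for cell, expressed in cells.items() if required <= expressed)


# Two alleles to be modified, two donor types with distinguishable markers.
required_markers = {"puroR", "blastR"}
population = {
    "cell_1": {"puroR", "blastR"},  # both alleles carry a marker
    "cell_2": {"puroR"},            # only one allele modified
    "cell_3": set(),                # no allele modified
}
selected = select_fully_modified(population, required_markers)
```

  • Because each marker can come from only one donor type in this sketch, expression of n distinct markers implies that n alleles were modified, which is why marker count rather than the expression intensity of a single marker is used.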
  • the donor DNA for selection marker may contain a negative selection marker gene in addition to the positive selection marker gene between the upstream homology arm and the downstream homology arm.
  • the positive selection marker gene may be a marker gene that can also be used for negative selection (a marker gene that can be used for both positive and negative selection).
  • the positive selection marker gene may be a drug resistance gene.
  • the positive selection marker gene may be a visualization marker gene that can also be used for negative selection, etc.
  • the donor DNA for the selection marker contains a negative selection marker gene in addition to a positive selection marker gene between the upstream homology arm and the downstream homology arm, and may further contain a target nucleic acid sequence.
  • the target nucleic acid sequence is a sequence that can be cleaved by the above-mentioned genome modification system.
  • the target nucleic acid sequence is preferably an allele-specific sequence, which makes it possible to cleave only the cassette of the first allele (the first cassette) or only the cassette of the second allele (the second cassette). In this way, selective editing of only one allele is possible by inducing cleavage in an allele-specific manner.
  • the donor DNA for the selection marker contains one target nucleic acid sequence between the upstream homology arm and the downstream homology arm.
  • the donor DNA for the selection marker contains a first target nucleic acid sequence and a second target nucleic acid sequence between the upstream homology arm and the downstream homology arm, and contains a selection marker gene between the first target nucleic acid sequence and the second target nucleic acid sequence.
  • Another donor DNA for a selection marker contains a third target nucleic acid sequence and a fourth target nucleic acid sequence between the upstream homology arm and the downstream homology arm, and contains a selection marker gene between the third target nucleic acid sequence and the fourth target nucleic acid sequence.
  • the first target nucleic acid sequence and the second target nucleic acid sequence may be the same or different, and the third target nucleic acid sequence and the fourth target nucleic acid sequence may be the same or different.
  • When the third target nucleic acid sequence and the fourth target nucleic acid sequence are designed not to be cleaved when the first target nucleic acid sequence and the second target nucleic acid sequence are cleaved, and/or the first target nucleic acid sequence and the second target nucleic acid sequence are designed not to be cleaved when the third target nucleic acid sequence and the fourth target nucleic acid sequence are cleaved, only one of the first cassette and the second cassette can be specifically cleaved, and that cassette can be selectively edited. It will be apparent that when the first cassette and the second cassette are edited simultaneously, the first to fourth target nucleic acid sequences may be the same.
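The design rule for the first to fourth target nucleic acid sequences can be sketched as a simple set check. This is a simplified model that treats a target as cleaved only when its label matches a supplied guide exactly; real cleavage specificity depends on the genome modification system used. The labels `T1` to `T4` and both function names are hypothetical.

```python
def is_allele_specific(cassette_targets):
    """Return True if no target sequence is shared between any two cassettes."""
    seen = set()
    for targets in cassette_targets.values():
        if seen & set(targets):
            return False
        seen |= set(targets)
    return True

def cleaved_cassettes(cassette_targets, guides):
    """Return the names of cassettes that contain at least one guided target."""
    return sorted(
        name for name, targets in cassette_targets.items()
        if set(targets) & set(guides)
    )

# First cassette flanked by T1/T2, second cassette flanked by T3/T4.
cassettes = {"first": ["T1", "T2"], "second": ["T3", "T4"]}
only_second = cleaved_cassettes(cassettes, ["T3", "T4"])
```

When all four targets are the same sequence, the check reports that the design is not allele-specific, and a single guide cleaves both cassettes, which corresponds to the simultaneous-editing case.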
  • In step (b), modified cells can be selected from a pool containing the cells obtained in step (a) without cloning the cells.
  • In this way, the time required for the process can be reduced.
  • the first intermediate cell can be obtained from the pre-modified cell by the above steps (a) and (b) (see step S1 in FIG. 1).
  • the donor DNA for the selection marker contains a positive selection marker gene and a negative selection marker gene, or a marker gene that can be used for both positive and negative selection, between the upstream homology arm and the downstream homology arm for the target sequence.
  • the donor DNA for the selection marker preferably contains two target nucleic acid sequences between the upstream homology arm and the downstream homology arm for the target sequence.
  • the positive selection marker gene and the negative selection marker gene, or the marker gene that can be used for both positive and negative selection are preferably present between the two target nucleic acid sequences. In this way, the first intermediate cell can be obtained by positive selection.
  • the first intermediate cell is a cell having a genome including a first allele and a second allele at a locus to be modified, and having a cassette including a selection marker gene and a target nucleic acid sequence in each of the first allele and the second allele.
  • the selection marker gene of the first allele and the selection marker gene of the second allele are distinguishably different.
  • the target nucleic acid sequence is a target of a genome modification system, and is designed so that the first allele and the second allele can be cleaved by the genome modification system in a distinguishable manner.
  • each selection marker gene is a negative selection marker gene that can be used for negative selection.
  • Although a positive selection marker gene is useful when obtaining the first intermediate cell, it is not necessary in the process after the first intermediate cell has been obtained. Therefore, the positive selection marker gene may be removed. The removal can be performed, for example, by a genome editing technique. In this way, the first intermediate cell does not need to have a positive selection marker.
  • a second intermediate cell may be obtained from the first intermediate cell (see step S3 in FIG. 1).
  • the second intermediate cell can be prepared by removing the second cassette from the first intermediate cell.
  • the cassette can be removed by specifically cleaving the target nucleic acid sequence inside the second cassette in the presence of a donor DNA (cassette removal donor DNA) that preferably includes an upstream homology arm capable of homologous recombination with the upstream of the second cassette and a downstream homology arm capable of homologous recombination with the downstream of the cassette.
  • the second intermediate cell thus obtained has a cassette including a selection marker gene and a target nucleic acid sequence in the first allele, but does not include the second cassette.
  • the library of the present disclosure can be prepared from the first intermediate cell and the second intermediate cell (see steps S2 and S4 in FIG. 1 for reference).
  • the first intermediate cell and the second intermediate cell (collectively referred to as "intermediate cell") have a cassette containing a selection marker gene and a target nucleic acid sequence in the first allele.
  • the first cassette can be removed and a modified base sequence can be introduced instead. That is, the first cassette can be replaced with a modified base sequence. This replacement can be performed by a genome modification system.
  • In the presence of a donor DNA for introducing a modified base sequence (or library preparation donor DNA) that includes an upstream homology arm capable of homologous recombination with the upstream of the first cassette and a downstream homology arm capable of homologous recombination with the downstream of the cassette, the target nucleic acid sequence inside the first cassette can be specifically cleaved.
  • the library preparation donor DNA has a modified base sequence between the upstream homology arm and the downstream homology arm.
  • the region sandwiched between the region where the upstream homology arm on the genome undergoes homologous recombination and the region where the downstream homology arm undergoes homologous recombination is replaced with the sequence sandwiched between the upstream homology arm and the downstream homology arm of the library construction donor DNA, so a library construction donor DNA having the replaced sequence between the upstream homology arm and the downstream homology arm can be preferably used. Therefore, the cassette can be replaced with a modified base sequence by the above operation.
  • the library construction donor DNA can be a DNA group containing various modified base sequences. In this way, the cassette of each intermediate cell can be replaced with various modified base sequences by the above operation.
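As a minimal sketch of such a DNA group, the following Python code builds a pool of donor sequences in which a replaced sequence carrying one base substitution sits between fixed homology arms. Single-base substitution is only one possible kind of modified base sequence, chosen here for brevity; the function names and the toy sequences are hypothetical.

```python
def single_base_variants(seq, alphabet="ACGT"):
    """Yield every sequence that differs from seq by exactly one substitution."""
    for i, base in enumerate(seq):
        for alt in alphabet:
            if alt != base:
                yield seq[:i] + alt + seq[i + 1:]

def build_donor_pool(upstream_arm, replaced_seq, downstream_arm):
    """Place each variant between the upstream and downstream homology arms."""
    return [
        upstream_arm + variant + downstream_arm
        for variant in single_base_variants(replaced_seq)
    ]

# Toy example: 4-base arms around a 3-base replaced sequence.
pool = build_donor_pool("AAAA", "ACG", "TTTT")
```

A replaced sequence of length L yields 3 × L single-substitution donors, so the toy example produces nine distinct donor sequences.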
  • the negative selection marker gene in the cassette is removed, so that the absence of expression of the negative selection marker gene can be used as an indicator to obtain a cell in which the cassette has been replaced with a modified base sequence.
  • the library construction donor DNA is linear (not circular). In this way, there is an advantage in the ease of preparation of the library construction donor DNA (see, for example, disadvantage 2 in FIG. 4).
  • the replacement of the modified base sequence of the cassette can be confirmed by a person skilled in the art using well-known conventional techniques, for example, by the presence or absence of cleavage by a restriction enzyme, the presence or absence of PCR amplification (e.g., junction PCR), or sequencing.
  • the modified base sequence may be 0 bases long, i.e., non-existent, but preferably 1 base or more long.
  • the modified base sequence is not particularly limited, but may be, for example, 10 to 1 million bases long, 10 to 500,000 bases long, 10 to 100,000 bases long, 10 to 20,000 bases long, 10 to 15,000 bases long, or 10 to 10,000 bases long.
  • the modified base sequence is not particularly limited, but may be, for example, 3 bases long or more, 10 bases long or more, 30 bases long or more, 50 bases long or more, or 100 bases long or more.
  • the modified base sequences of each donor DNA for library construction may be the same or different.
  • the modified base sequences of each donor DNA for library construction may independently have any of the above lengths.
  • When the intermediate cell has three or more alleles in the target region, it is sufficient that it has a unique negative marker gene that allows at least the first allele to be distinguished. In this way, cassettes other than that of the first allele can be removed, or the cassette of the first allele can be replaced with a modified base sequence while the cassettes of the other alleles are maintained. Therefore, a cell in which only the first allele has a unique negative marker gene that allows at least the first allele to be distinguished can be selected and used as the intermediate cell.
  • the method of the present disclosure does not have to solve any of the disadvantages 1 to 3 shown in FIG. 4, but preferably solves one or more of the disadvantages 1 to 3 shown in FIG. 4. Specifically, the method of the present disclosure has a higher efficiency of recombination of a foreign gene into a target genome than the Gateway™ method and the LoxP/Cre method.
  • the donor DNA for introducing a modified base sequence is linear and not circular.
  • the modified cell does not have a recognition site for a site-specific recombinase.
  • the donor DNA for introducing a modified base sequence is linear and the modified cell does not have a recognition site for a site-specific recombinase.
  • the characteristic of not having a recognition site for a site-specific recombinase is beneficial, for example, when seamlessly linking an introduction cassette and a genome.
  • the library of the present disclosure includes a combination of a plurality of aqueous compositions, wherein: each aqueous composition comprises one type of modified cell; each modified cell has a first allele and a second allele at a locus to be modified (a target locus or a locus of interest); and each modified cell has a cassette containing a DNA fragment that differs between aqueous compositions at the same position of the first allele.
  • a library can be obtained as follows. For example, the target nucleic acid sequence of the first cassette of the intermediate cell is cleaved in the presence of a library-making donor DNA containing various modified base sequences. Then, the first cassette of the intermediate cell is replaced with the modified base sequence by the DNA damage repair mechanism provided in the cell.
  • the cell having the modified base sequence can be subjected to single cell cloning.
  • Single cell cloning may include forming a large number of droplets or aqueous compositions containing one cell, subjecting them to culture, and producing a cell clone derived from one cell in the droplet or aqueous composition. In this way, a plurality (or a large number) of aqueous compositions containing cell clones are obtained. A combination of such a plurality (or a large number) of aqueous compositions can be used as a library of modified cells.
  • the first allele is missing a portion or all of the target region (initially in the genome).
  • the second allele is missing a portion or all of the target region.
  • the first allele and the second allele are missing a portion or all of the target region, more preferably all of the target region.
  • the first allele has the entire target region replaced by the first cassette.
  • the second allele has the entire target region replaced by the second cassette.
  • the first allele and the second allele have the entire target region replaced by the first cassette and the second cassette, respectively.
  • the second cassette is removed.
  • the upstream and downstream of the target region of the second allele (initially on the genome) are seamlessly linked.
  • Seamless linking means that the upstream and downstream are linked without the addition of new bases.
  • the first allele comprises a modified base sequence
  • the modified base sequence has one or more mutations selected from the group consisting of addition, insertion, substitution, and deletion of bases in a target region (also called a replaced sequence) on the original genome.
  • the modified cell can be advantageously used for comparison with the original cell (reference cell) to evaluate the effect of the mutation.
  • Because the library contains various cells that differ in their mutations, it is advantageous for evaluating the function of each mutation site in the target region by comparing cells.
  • the second allele comprises a cassette containing a selection marker.
  • the selection marker may comprise a positive selection marker gene.
  • the second allele is seamlessly linked upstream and downstream of the replaced sequence.
  • the replaced sequence of the first allele and the replaced sequence of the second allele have corresponding sequences. Having corresponding sequences means that the start and end points of the sequences are at the same position on the genome. The corresponding sequences typically have a high identity (e.g., 80% or more, 90% or more, or 95% or more) and are the same length.
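The identity criterion for corresponding sequences can be made concrete with a short position-by-position comparison. This is a sketch for equal-length sequences only, as described above; it does not perform alignment, and the function name is hypothetical.

```python
def percent_identity(seq_a, seq_b):
    """Percentage of positions at which two equal-length sequences match."""
    if len(seq_a) != len(seq_b):
        raise ValueError("corresponding sequences are expected to be the same length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return 100.0 * matches / len(seq_a)

# Two toy 10-base sequences that differ at the last position.
identity = percent_identity("ACGTACGTAC", "ACGTACGTAA")
```

In this toy case nine of ten positions match, so the two sequences meet the 80% and 90% thresholds but not the 95% threshold.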
  • each of the cassette sequences in the modified cells consists of one or more modified portions (A) and one or more unmodified portions (B) (see, for example, FIG. 3).
  • Each of the modified portions (A) has one or more modifications selected from the group consisting of sequence insertion, deletion, and substitution, and the modifications (A) of the one or more modified portions differ between each aqueous composition in terms of the position or content of the modification.
  • Each of the one or more unmodified portions (B) is identical to the sequence of the corresponding site before modification.
  • When the pre-modification sequence replaced by the insertion cassette and the sequence in the insertion cassette are aligned, the two nucleic acid sequences located at the same position are the sequences of corresponding sites.
  • the unmodified portion (B1) on the centromere side of the cassette is seamlessly linked to the adjacent sequence (C1) on the centromere side of the cassette, and the unmodified portion (Bt) on the telomere side of the cassette is seamlessly linked to the adjacent sequence (C2) on the telomere side of the cassette, and the region where the adjacent sequence (C1) and the unmodified portion (B1) are linked, and the region where the unmodified portion (Bt) and the adjacent sequence (C2) are linked may constitute the same sequence as the sequence of the corresponding region before modification.
  • an intermediate cell may be modified using a donor DNA for library production having the structure of the cassette between an upstream homology arm and a downstream homology arm.
  • the total length of the modified portion (A) may be 90% or less, 80% or less, 70% or less, 60% or less, 50% or less, 40% or less, 30% or less, 20% or less, 10% or less, or 5% or less of the length of the insertion cassette.
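One way to estimate the share of the cassette occupied by modified portions (A) is to compare the cassette with the pre-modification sequence. The sketch below uses Python's standard `difflib.SequenceMatcher` as a stand-in for a proper sequence aligner, which is an assumption made for brevity; the function names are hypothetical. Lengths are measured on the cassette, so a pure deletion contributes zero length.

```python
from difflib import SequenceMatcher

def split_portions(reference, cassette):
    """Split the cassette into modified (A) and unmodified (B) portions.

    Each portion is (tag, start, end) in cassette coordinates, where tag is
    'equal', 'replace', 'insert', or 'delete'.
    """
    matcher = SequenceMatcher(None, reference, cassette, autojunk=False)
    modified, unmodified = [], []
    for tag, _i1, _i2, j1, j2 in matcher.get_opcodes():
        target = unmodified if tag == "equal" else modified
        target.append((tag, j1, j2))
    return modified, unmodified

def modified_fraction(reference, cassette):
    """Total length of modified portions divided by the cassette length."""
    modified, _ = split_portions(reference, cassette)
    return sum(end - start for _, start, end in modified) / len(cassette)

# Toy 16-base sequences with a single substitution at position 9 (0-based).
reference_seq = "AAAACCCCGGGGTTTT"
cassette_seq = "AAAACCCCGAGGTTTT"
fraction = modified_fraction(reference_seq, cassette_seq)
```

For the toy sequences, one substituted base out of sixteen gives a modified fraction of about 6%, which would fall under the "10% or less" range listed above.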
  • a library containing modified cells of this embodiment may be a library having different mutations at different positions within the same region, and may be preferably used, for example, to investigate which mutation at which position changes cell activity, or to screen for cells having desired properties.
  • the donor DNA for introducing a modified base sequence may have the above-mentioned cassette structure between the upstream homology arm and the downstream homology arm.
  • the donor DNA for introducing a modified base sequence may be included in a library of donor DNA for introducing a modified base sequence having different cassette structures.
  • the modified base sequence in the modified cell consists entirely of mutations.
  • recombination sequence recognition site
  • a site-specific recombinase inside or outside (near) the insertion cassette.
  • no modification is made, or the base sequence is the same as that of the cell before modification. In this way, modified cells that do not contain modifications other than the desired modification can be obtained, and unanticipated effects of modifications other than the desired modification can be eliminated (see, for example, disadvantage 3 in FIG. 4).
  • the library is not particularly limited, but preferably has 4 or more types, 5 or more types, 6 or more types, 7 or more types, 8 or more types, 9 or more types, 10 or more types, 11 or more types, 12 or more types, 13 or more types, 14 or more types, 15 or more types, 16 or more types, 17 or more types, 18 or more types, 19 or more types, 20 or more types, 25 or more types, 30 or more types, 35 or more types, 40 or more types, 45 or more types, 50 or more types, 60 or more types, 70 or more types, 80 or more types, 90 or more types or more than 100, 150, 200, 300, 400, 500, 600, 700, 800, 900, 1000, 2000, 3000, 4000, 5000, 6000, 7000, 8000, 9000, or 10000 types of modified cells (or aqueous compositions containing modified cells).
  • a library typically includes a combination of multiple aqueous compositions. Each aqueous composition includes one type of cell. In some embodiments, a library includes separate combinations of multiple aqueous compositions. In some embodiments, a library may include a combination of multiple aqueous compositions as a mixture depending on the purpose. For example, when screening for cells with high cell proliferation or viability, a library may include a combination of multiple aqueous compositions as a mixture. Even in such a case, by analyzing the cells after culture, the cells with the highest proliferation or viability can be enriched and the cells with high proliferation or viability can be obtained.
  • the genomic sequences of each of the modified cells contained in a library are designed to be identical, except for the modified base sequence (or DNA fragment). In one embodiment, the genomic sequences of each of the modified cells contained in a library are substantially identical, except for the modified base sequence (or DNA fragment). Being substantially identical allows for the presence of differences due to mutations that may occur between cells after simply subculturing cloned cells 10 times in an environment suitable for cell culture (normal environment).
  • Because the elements other than the modified base sequence are the same, the library is suitable, for example, for evaluating the effects of the modified base sequence.
  • the possibility of an effect due to the cassette can be minimized.
  • the possibility of an effect due to the introduction of additional new bases into the second allele can be minimized.
  • the cassette (second cassette) in the second allele in the intermediate cell may also be replaced by the second modified base sequence.
  • the second modified sequence may be a sequence common to the modified cells (i.e., the same sequence across the modified cells) or may be a sequence that differs for each aqueous composition. In some cases, the second modified sequence may differ for each cell even in the same aqueous composition.
  • the second modified sequence may be the same as or different from the modified base sequence in the first allele (first modified base sequence).
  • the second modified base sequence may be designed in the same manner as the modified base sequence in the first allele, and may be introduced in the same manner as the introduction of the modified base sequence into the first allele.
  • the second cassette may be replaced by the second modified base sequence by specifically cleaving the vicinity of the second cassette (preferably the target nucleic acid sequence in the second cassette) in the presence of the second library preparation donor DNA.
  • the content of the second modified base sequence and the method of its introduction are the same as those of the first modified base sequence, so the explanation given above applies and is not repeated here.
  • Application example 1 is an application example to the analysis of a specific region of a genome. According to the present disclosure, it is possible to identify important bases in a specific region of a genome by introducing various mutations into the base sequence of the specific region of a genome and observing the gain or loss of function of the specific region due to the mutation.
  • Examples of the specific region include a region of unknown function, a promoter region, an enhancer region, a region corresponding to an intron, a region corresponding to a 5' untranslated region (UTR), a region corresponding to a 3' untranslated region (UTR), and a region encoding a non-coding RNA.
  • Application Example 2 is an application example for regulating the expression level of a protein or RNA.
  • a region involved or suspected to be involved in regulating the expression level of the protein or RNA, such as a transcriptional control region of the protein or RNA (including a region suspected to be involved in transcriptional control) or a translational control region of the protein (including a region suspected to be involved in translational control), is modified to regulate the expression level of the protein or RNA.
  • a modified cell in which the expression level of the protein or RNA is regulated can be obtained.
  • the regulation can be an increase or decrease in the expression level.
  • the RNA can be mRNA, tRNA, rRNA, or other non-coding RNA (e.g., microRNA).
  • Application Example 3 is an application example to a coding region of a protein or RNA. According to the present disclosure, it is possible to identify important amino acids or important sequences in the function of the protein or RNA by introducing various mutations into the region encoding the protein or RNA and observing the functional modification of the protein or RNA due to the mutation (e.g., gain or loss of function). In addition, for example, by observing the gain or loss of function of the protein or RNA due to the mutation, it is possible to obtain a mutant protein or RNA having an improved or reduced function or a new function and a modified cell expressing the mutant protein or RNA.
  • RNA may be mRNA, tRNA, rRNA, or other non-coding RNA (e.g., microRNA).
  • a modified cell expressing the mutant protein or RNA, in which the expression level is regulated can also be obtained.
  • Application example 4 is an application to screening of cells with high proliferation or viability.
  • a library containing various modified cells having different mutations in genomic regions that may be involved in the proliferation or viability of cells can be obtained.
  • Such a library may contain separate aqueous compositions containing various types of cells, or may be a mixture of aqueous compositions containing various types of cells. It is preferable that the mixture contains equal amounts of each modified cell.
  • Screening can also be performed under conditions where a specific selective pressure is applied.
  • the selective pressure is not particularly limited, but examples thereof include poor nutrition, high salt concentration, low salt concentration, high temperature, low temperature, low oxygen, and the presence of drugs (e.g., physiologically active substances such as poisons and antibiotics).
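The enrichment that occurs when a mixed library is cultured under selective pressure can be illustrated with a toy growth model. The sketch assumes simple exponential growth with a fixed per-generation growth factor for each modified cell type; the growth factors, cell labels, and function name are hypothetical and are not data from the disclosure.

```python
def enrich(initial_counts, growth_factors, generations):
    """Return the fraction of each cell type after exponential growth.

    Illustrative assumption: each cell type multiplies by a fixed factor
    per generation under the applied selective pressure.
    """
    final = {
        name: initial_counts[name] * growth_factors[name] ** generations
        for name in initial_counts
    }
    total = sum(final.values())
    return {name: count / total for name, count in final.items()}

# Mixture containing equal amounts of three hypothetical modified cells.
start = {"cell_a": 100, "cell_b": 100, "cell_c": 100}
factors = {"cell_a": 2.0, "cell_b": 1.5, "cell_c": 1.0}
fractions = enrich(start, factors, generations=10)
dominant = max(fractions, key=fractions.get)
```

Starting from equal amounts, as preferred above, means that differences in the final fractions reflect differences in proliferation or viability rather than differences in the starting composition.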
  • Modification of an existing gene in the genome can be achieved by replacing the gene with a modified gene, or by simply inserting a modified nucleotide sequence into a safe harbor region (e.g., the AAVS1 locus, the ROSA26 locus, the CLYBL locus, the CXCR4 locus, or the CCR5 locus).
  • the present invention provides a cell in which two or more alleles of a chromosomal genome have been modified, with each of the two or more alleles having a mutually different (distinguishable) selectable marker gene.
  • the cell may be a cell of a unicellular organism.
  • the cell may be an isolated cell.
  • the cell may be a cell selected from the group consisting of a pluripotent cell and a pluripotent stem cell (such as an embryonic stem cell and an induced pluripotent stem cell).
  • the cell may be a tissue stem cell.
  • the cell may be a somatic cell.
  • the cell may be a germline cell (e.g., a germ cell). In one aspect, the cell may be a cell line. In one aspect, the cell may be an immortalized cell. In one aspect, the cell may be a cancer cell. In one aspect, the cell may be a non-cancerous cell. In one aspect, the cell may be a cell of a diseased patient. In one aspect, the cell may be a cell of a healthy individual.
  • the cell may be an animal cell (e.g., a human cell), such as an insect cell (e.g., a silkworm cell), HEK293 cell, HEK293T cell, Expi293F™ cell, FreeStyle™ 293F cell, Chinese hamster ovary cell (CHO cell), CHO-S cell, CHO-K1 cell, or ExpiCHO cell, or a cell derived from these cells.
  • all alleles of the target region of the chromosomal genome are modified, and the modified regions each have a different (distinguishable) selection marker gene from each other.
  • the present invention provides a method for culturing cells in which two or more alleles of a chromosomal genome have been modified and each of the two or more alleles has a mutually different (distinguishable) selection marker gene.
  • When the selection marker genes are drug resistance marker genes, the cells can be cultured in the presence of the drug corresponding to each of the drug resistance marker genes.
  • the cells can be cultured under conditions suitable for the maintenance or growth of the cells.
  • the present invention provides a non-human organism having a chromosomal genome with two or more modified alleles, each of which has a selectable marker gene that is different from the other alleles.
  • the cell may be a cell of a unicellular organism.
  • the non-human organism is a yeast (e.g., a fission yeast or budding yeast, e.g., a species of the genus Saccharomyces, such as Saccharomyces cerevisiae, Saccharomyces carlsbergensis, Saccharomyces fragilis, Saccharomyces rouxii, a species of the genus Candida, such as Candida utilis, Candida tropicalis, a species of the genus Pichia, a species of the genus Kluyveromyces, a species of the genus Yarrowia ...
  • the non-human organism may be a yeast selected from the group consisting of yeasts of the genera Yarrowia, Hansenula, and Endomyces.
  • the non-human organism may be a filamentous fungus (e.g., Aspergillus, Trichoderma, Humicola, Acremonium, Fusarium, and Penicillium).
  • the non-human organism may be a multicellular organism.
  • the non-human organism may be a non-human animal.
  • the non-human organism may be a plant.
  • all alleles of the target region of the chromosomal genome of the non-human organism are modified, and the modified regions each have a selectable marker gene that is different from each other (distinguishable).
  • one or more of the desired genes may be contained or collected in another region of the chromosomal genome.
  • the other region may be, for example, a safe harbor region (e.g., the AAVS1 locus, the ROSA26 locus, the CLYBL locus, the CXCR4 locus, or the CCR5 locus).
  • the other region may be, for example, a region of (ii) above having a deletion.
  • a first intermediate cell is obtained from a pre-modified cell (referred to as a reference cell).
  • the process of obtaining the first intermediate cell from the pre-modified cell is called step S1.
  • a library of first modified cells (hereinafter simply referred to as the "first library") can be produced from the first intermediate cell.
  • the process of obtaining the first library from the first intermediate cell is called step S2.
  • a second intermediate cell can be obtained from the first intermediate cell.
  • the process of obtaining the second intermediate cell from the first intermediate cell is called step S3.
  • a library of second modified cells (hereinafter simply referred to as the "second library") can be produced from the second intermediate cell.
  • the process of obtaining the second library from the second intermediate cell is called step S4.
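Steps S1 to S4 can be summarized as transitions between allele states. The following Python sketch is a bookkeeping model only: it records what occupies the target region on each allele and does not model cleavage or homologous recombination. The class, function names, and state labels are hypothetical.

```python
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class Cell:
    """State of the target region on each allele (labels are hypothetical)."""
    allele1: str
    allele2: str

def step_s1(reference):
    """S1: replace the target region on both alleles with distinguishable cassettes."""
    return Cell(allele1="cassette1", allele2="cassette2")

def step_s2(first_intermediate, variants):
    """S2: replace the first cassette with each modified sequence (first library)."""
    return [replace(first_intermediate, allele1=v) for v in variants]

def step_s3(first_intermediate):
    """S3: remove the second cassette, leaving a seamless deletion."""
    return replace(first_intermediate, allele2="seamless_deletion")

def step_s4(second_intermediate, variants):
    """S4: replace the first cassette with each modified sequence (second library)."""
    return [replace(second_intermediate, allele1=v) for v in variants]

reference = Cell("target_region", "target_region")
first = step_s1(reference)
first_library = step_s2(first, ["variant1", "variant2"])
second = step_s3(first)
second_library = step_s4(second, ["variant1", "variant2"])
```

In this model the two libraries carry the same modified sequences on the first allele and differ only in the second allele, which retains its cassette in the first library and has lost the target region in the second library.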
  • Reference cells are cells to be made into a library.
  • Reference cells are, for example, eukaryotic cells, and can be used to create the library of the present disclosure.
  • Reference cells may be natural cells, or may be modified cells.
  • Reference cells are typically diploid, but may also be triploid or higher.
  • a first intermediate cell can be produced as shown in FIG. 2.
  • the target region on the genome of the reference cell is replaced with a cassette containing a selection marker gene.
  • each donor DNA contains a distinguishably different drug selection marker gene (for positive selection) and a distinguishably different visualization marker gene (for negative selection), and has CRISPR/Cas9 target sequences (gRNA1-4) at both ends.
  • drug selection is performed with two types of drugs (see step (1) in FIG.
  • Cells that survive after selection are cells (first intermediate cells) that have distinguishably different drug resistance genes in the target regions of the two alleles.
  • the first intermediate cells can be subjected to single cell cloning. It can be confirmed whether the cassette containing the selection marker gene has been inserted at the desired position.
  • the target sequences located at both ends of the cassette that replaced the paternally derived target region can be cut by the CRISPR/Cas9 system in the presence of the cassette removal donor DNA.
  • the cassette removal donor DNA consists of an upstream homology arm capable of homologous recombination with the upstream of the target region on the paternally derived allele and a downstream homology arm capable of recombination with the downstream of the target region.
  • a second intermediate cell can be obtained that has a genome in which the upstream and downstream of the target region in the paternally derived allele are seamlessly linked (i.e., the entire target region has been lost).
  • a library of second modified cells can be prepared from the second intermediate cells.
  • the target sequences gRNA3 and gRNA4 located at both ends of the maternal cassette of the second intermediate cells can be cleaved by the CRISPR/Cas9 system in the presence of donor DNA for introducing modified base sequences (donor DNA for library preparation).
  • the donor DNA for introducing modified base sequences has an upstream homology arm capable of homologous recombination with the upstream of the target region on the maternal allele and a downstream homology arm capable of recombination with the downstream of the target region, and contains a modified base sequence between the upstream homology arm and the downstream homology arm.
  • modified cells having a genome containing a modified base sequence between the upstream and downstream of the target sequence in the maternal allele can be obtained.
  • modified cells having different modified base sequences can be obtained, thereby obtaining a second library.
  • the donor DNA for introducing a modified base sequence consists of one or more modified portions (A) and one or more unmodified portions (B), and other than the modified portion (A), it can have a sequence that is the same as the sequence of the corresponding region of the genome before modification.

Landscapes

  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Genetics & Genomics (AREA)
  • Engineering & Computer Science (AREA)
  • Organic Chemistry (AREA)
  • Chemical & Material Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Biotechnology (AREA)
  • Biomedical Technology (AREA)
  • Wood Science & Technology (AREA)
  • Zoology (AREA)
  • General Engineering & Computer Science (AREA)
  • Biochemistry (AREA)
  • Microbiology (AREA)
  • Molecular Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Plant Pathology (AREA)
  • Physics & Mathematics (AREA)
  • Biophysics (AREA)
  • Mycology (AREA)
  • Cell Biology (AREA)
  • Chemical Kinetics & Catalysis (AREA)
  • General Chemical & Material Sciences (AREA)
  • Medicinal Chemistry (AREA)
  • Micro-Organisms Or Cultivation Processes Thereof (AREA)

Abstract

Provided is a method for producing a library containing gene-modified cells. This library is useful in the functional evaluation of a specific region of a genome and the introduction, etc., of various types of mutations (such as SNPs) to a specific region of a genome.

Description

Cell library and method for producing same
The present invention relates to a cell library and a method for producing the same.
Since the CRISPR/Cas system was reported as a new genome editing tool, various studies using the CRISPR/Cas system have been conducted (for example, Patent Document 1). In genome editing using the CRISPR/Cas system, the target region targeted by the guide RNA undergoes a double-strand break by the Cas9 nuclease. DNA that has undergone a double-strand break is known to be repaired by homology-directed repair (HDR) or non-homologous end joining (NHEJ). In HDR, any sequence can be incorporated into the target region by introducing donor DNA having a sequence homologous to the region surrounding the target region into a cell together with the CRISPR/Cas system.
A technique has been developed that uses genome modification technology to efficiently modify two or more alleles simultaneously by HDR (for example, Patent Document 2). Patent Document 2 discloses that large-scale deletions of several hundred kb could be introduced efficiently into two or more alleles simultaneously.
Patent Document 1: International Publication No. 2014/093661
Patent Document 2: International Publication No. 2021/206054
The present disclosure provides a cell library and a method for producing the same. More specifically, the present disclosure provides a cell library containing a plurality of modified cells that, in cells having multiple alleles, carry a rich diversity of base sequences at a specific site of one allele. The cell library may be a combination of aqueous compositions, each containing one type of modified cell.
(1) A library of modified cells, wherein:
the library comprises a combination of a plurality of aqueous compositions;
each aqueous composition contains one type of modified cell;
each of the modified cells has a first allele and a second allele at the locus to be modified; and
each of the modified cells has, at the same position of the first allele, a cassette containing a DNA fragment that differs between the aqueous compositions
{wherein, preferably, the locus of the modified cells does not contain a recognition site or recombination sequence for a site-specific recombinase}.
(2) The library according to (1) above, wherein, in each modified cell, the second allele is disrupted or deleted in part or in whole.
(3) The library described in (2) above, wherein the second allele seamlessly lacks the part or all of the sequence.
(4) The library according to any one of (1) to (3) above, wherein the sequences of each modified cell other than the cassette containing the DNA fragment are substantially identical before and after modification.
(5) The library according to any of (1) to (4) above, wherein each of the sequences of the cassettes is composed of one or more modified portions (A) and one or more unmodified portions (B), each of the modified portions (A) has one or more modifications selected from the group consisting of sequence insertion, deletion, and substitution, the modifications of the one or more modified portions differ between the aqueous compositions in terms of the position or content of the modification, each of the one or more unmodified portions (B) is identical to the sequence of the corresponding site before modification, the unmodified portion (B1) on the centromere side of the cassette is seamlessly linked to the adjacent sequence (C1) on the centromere side of the cassette, the unmodified portion (Bt) on the telomere side of the cassette is seamlessly linked to the adjacent sequence (C2) on the telomere side of the cassette, and the region where the adjacent sequence (C1) and the unmodified portion (B1) are linked, and the region where the unmodified portion (Bt) and the adjacent sequence (C2) are linked, constitute a sequence identical to the sequence of the corresponding region before modification.
(6) The library according to any one of (1) to (5) above, wherein the modified cells do not contain a target sequence for a site-specific recombinase.
(7) The library according to any one of (1) to (6) above, wherein the library contains 50 or more types of the aqueous compositions.
(8) A method for producing a library of modified cells, comprising:
(α) providing a group of cells having a genome including a first allele and a second allele at a locus to be modified, the first allele and the second allele each including a cassette including a selection marker gene and a target nucleic acid sequence;
wherein the selection marker gene carried by the first allele and the selection marker gene carried by the second allele are distinguishably different, the target nucleic acid sequence is a target of a genome modification system and is designed so that the first allele and the second allele can be distinguishably cleaved by the genome modification system, and each selection marker gene is a negative selection marker gene that can be used for negative selection,
(β) introducing the following (x) and (y) into the provided group of cells:
(x) a genome modification system comprising a sequence-specific nucleic acid cleaving molecule that targets the unique base sequence contained in the first allele, or a polynucleotide encoding the sequence-specific nucleic acid cleaving molecule;
(y) a plurality of types of second recombination donor DNAs {wherein each of the plurality of types of second recombination donor DNAs has an upstream homology arm having a base sequence homologous to the base sequence adjacent to the upstream side of the target site of (x) above, and a downstream homology arm having a base sequence homologous to the base sequence adjacent to the downstream side of the target region, and contains a modified base sequence between the upstream homology arm and the downstream homology arm, the modified base sequence differing for each second recombination donor DNA and being unique to each second recombination donor DNA}; and
(γ) after step (β), selecting cells that do not express the selection marker gene contained in the first allele,
whereby a library of modified cells comprising a plurality of cells can be obtained, wherein, in the plurality of cells obtained, the first allele has a modified base sequence unique to each cell and the second allele has a sequence common to the cells.
(9) The method according to (8) above, further comprising, before step (α):
in a cell having a genome including a first allele and a second allele at the locus to be modified, replacing the sequence to be replaced that is included in the first allele and the second allele with a cassette including a selection marker gene and a target nucleic acid sequence, thereby removing the sequence to be replaced from the first allele and the second allele.
(10) The method according to (9) above, wherein each of the modified base sequences of the first allele has one or more mutations selected from the group consisting of addition, insertion, substitution, deletion, and removal of bases relative to the sequence to be replaced of the first allele.
(11) The method according to (9) or (10) above, wherein the sequence to be modified is a protein-coding region, and the modified base sequence of the first allele has one or more mutations selected from the group consisting of addition, insertion, substitution, deletion, and removal of bases relative to the sequence to be replaced of the first allele.
(12) The method according to (10) or (11) above, wherein the modified base sequence has a sequence identity of 80% or more with the sequence to be modified.
(13) The method according to any one of (8) to (12) above, further comprising, between step (α) and step (β), removing the cassette from the second allele.
(14) The method according to (13) above, wherein the cassette is seamlessly removed.
(15) A library of modified cells comprising a plurality of modified cells, produced by the method according to any one of (8) to (14) above.
FIG. 1 shows an overview of one example of a scheme for preparing a library of the present invention.
FIG. 2 shows one example of a scheme for preparing a library of first modified cells of the present invention.
FIG. 3 is a diagram showing one example of the positional relationship between the modified portion and the unmodified portion when, in a library of modified cells, the modified base sequence contains a modified portion and an unmodified portion. In this way, a library containing diverse modified cells having various mutations at various positions in the modified base sequence can be constructed.
FIG. 4 shows the characteristics (disadvantages) of genome modification methods using site-specific recombinases.
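As an illustration of how a set of modified base sequences like the one described for FIG. 3 might be enumerated, the following minimal Python sketch generates every single-base substitution within a chosen window (playing the role of the modified portion (A)) while leaving the rest of the sequence (the unmodified portions (B)) identical to the reference. The function names, the toy reference sequence, and the restriction to substitutions are assumptions made for illustration only; the disclosure also covers insertions and deletions. The identity check compares equal-length sequences position by position, which is a simplification of the sequence identity mentioned in item (12).

```python
def single_base_variants(reference: str, start: int, end: int) -> dict[str, str]:
    """Return {variant_name: sequence} for every substitution in reference[start:end]."""
    variants = {}
    for pos in range(start, end):
        for base in "ACGT":
            if base != reference[pos]:
                # Name variants as <original base><1-based position><new base>.
                name = f"{reference[pos]}{pos + 1}{base}"
                variants[name] = reference[:pos] + base + reference[pos + 1:]
    return variants


def identity(a: str, b: str) -> float:
    """Fraction of identical positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("this simplified check needs equal-length sequences")
    return sum(x == y for x, y in zip(a, b)) / len(a)


reference = "ATGGCCAAGCTGACCTAA"                 # hypothetical 18-base region
library = single_base_variants(reference, 3, 9)  # modify positions 4 to 9 only

for name, seq in library.items():
    print(name, seq, f"{identity(reference, seq):.3f}")
```

Each position in the window contributes three variants, so under these assumptions a 17-base window would already give 51 distinct modified base sequences, more than the 50 types mentioned in item (7).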
[Definitions]
The terms "polynucleotide" and "nucleic acid" are used interchangeably and refer to a nucleotide polymer in which nucleotides are linked by phosphodiester bonds. A "polynucleotide" and a "nucleic acid" may be DNA, RNA, or a combination of DNA and RNA. A "polynucleotide" and a "nucleic acid" may be a polymer of natural nucleotides, a polymer of natural nucleotides and non-natural nucleotides (such as analogs of natural nucleotides, nucleotides in which at least one of the base moiety, sugar moiety, and phosphate moiety is modified (e.g., phosphorothioate backbone), etc.), or a polymer of non-natural nucleotides.
The base sequence of a "polynucleotide" or "nucleic acid" is written in the generally accepted single-letter code unless otherwise specified. Unless otherwise specified, base sequences are written from the 5' side to the 3' side. The nucleotide residues that make up a "polynucleotide" or "nucleic acid" may be written simply as adenine, thymine, cytosine, guanine, uracil, etc., or by their single-letter codes.
The term "gene" refers to a polynucleotide that contains at least one open reading frame encoding a particular protein. A gene can contain both exons and introns.
The terms "polypeptide", "peptide" and "protein" are used interchangeably and refer to a polymer of amino acids linked by amide bonds. A "polypeptide", "peptide" or "protein" may be a polymer of natural amino acids, a polymer of natural and non-natural amino acids (e.g., chemical analogs or modified derivatives of natural amino acids), or a polymer of non-natural amino acids. Unless otherwise specified, amino acid sequences are written from the N-terminus to the C-terminus.
The term "allele" refers to a set of base sequences present at the same locus on a chromosomal genome. In some embodiments, a diploid cell has two alleles at the same locus, and a triploid cell has three alleles at the same locus. In some embodiments, additional alleles may be formed by an abnormal copy of the chromosome or an abnormal additional copy of the locus.
The terms "genome modification" and "genome editing" are used interchangeably and refer to inducing a mutation at a desired position (target region) on a genome.
Genome modification may include the use of a sequence-specific nucleic acid cleaving molecule designed to cleave the DNA of the target region. In a preferred embodiment, genome modification may include the use of a nuclease engineered to cleave the DNA of the target region. In a preferred embodiment, genome modification may include the use of a nuclease (e.g., TALEN or ZFN) engineered to cleave a target sequence having a specific base sequence in the target region. In a preferred embodiment, to cleave a target sequence having a specific base sequence in the target region, genome modification may use a sequence-specific endonuclease such as a meganuclease or other restriction enzyme that has only one cleavage site in the genome (e.g., a restriction enzyme with 16-base sequence specificity (whose site theoretically occurs once in 4^16 bases), a restriction enzyme with 17-base sequence specificity (theoretically once in 4^17 bases), or a restriction enzyme with 18-base sequence specificity (theoretically once in 4^18 bases)). Typically, the use of a site-specific nuclease induces a double-strand break (DSB) in the DNA of the target region, after which the genome is repaired by endogenous processes of the cell such as homology-directed repair (HDR) and non-homologous end joining (NHEJ). NHEJ is a repair method that joins the ends of a double-strand break without using donor DNA, and insertions and/or deletions (indels) are induced at high frequency during repair. HDR is a repair mechanism that uses donor DNA, and it can also introduce a desired mutation into the target region. A preferred example of a genome modification technique is the CRISPR/Cas system.
Usable meganucleases include a meganuclease selected from the group consisting of I-SceI, I-SceII, I-SceIII, I-SceIV, I-SceV, I-SceVI, I-SceVII, I-CeuI, I-CeuAIIP, I-CreI, I-CrepsbIP, I-CrepsbIIP, I-CrepsbIIIP, I-CrepsbIVP, I-TliI, I-PpoI, PI-PspI, F-SceI, F-SceII, F-SuvI, F-TevI, F-TevII, I-AmaI, I-AniI, I-ChuI, I-CmoeI, I-CpaI, I-CpaII, I-CsmI, I-CvuI, I-CvuAIP, I-DdiI, I-DdiII, I-DirI, I-DmoI, I-HmuI, I-HmuII, I-HsNIP, I-LlaI, I-MsoI, I-NaaI, I-NanI, I-NclIP, I-NgrIP, I-NitI, I-NjaI, I-Nsp236IP, I-PakI, I-PboIP, I-PcuIP, I-PcuAI, I-PcuVI, I-PgrIP, I-PobIP, I-PorI, I-PorIIP, I-PbpIP, I-SpBetaIP, I-ScaI, I-SexIP, I-SneIP, I-SpomI, I-SpomCP, I-SpomIP, I-SpomIIP, I-SquIP, I-Ssp68031, I-SthPhiJP, I-SthPhiST3P, I-SthPhiSTe3bP, I-TdeIP, I-TevI, I-TevII, I-TevIII, I-UarAP, I-UarHGPAIP, I-UarHGPA13P, I-VinIP, I-ZbiIP, PI-Mtul, PI-MtuHIP, PI-MtuHIIP, PI-PfuI, PI-PfuII, PI-PkoI, PI-PkoII, PI-Rma43812IP, PI-SpBetaIP, PI-SceI, PI-TfuI, PI-TfuII, PI-ThyI, PI-TliI, and PI-TliII, and functional derivatives of these restriction enzymes, together with its cleavage site (or recognition site); preferably a meganuclease that is a restriction enzyme with a sequence specificity of 18 bases or more, together with its cleavage site (or recognition site); and in particular a meganuclease that does not cleave the genome of the cell at one or more sites, together with its cleavage site.
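The theoretical frequencies quoted above for 16-, 17- and 18-base specificity follow from the fact that a specific k-base sequence is expected about once in every 4^k bases of a random sequence with equal base frequencies. The short Python sketch below works through that arithmetic. The genome size is an assumed round figure used only for illustration, and real genomes are not random sequences, so the numbers are rough expectations rather than predictions.

```python
# Assumed haploid genome size in base pairs, for illustration only.
GENOME_SIZE_BP = 3_100_000_000


def expected_spacing(k: int) -> int:
    """Average number of bases between chance occurrences of one k-base site."""
    return 4 ** k


def expected_sites(k: int, genome_size_bp: int = GENOME_SIZE_BP) -> float:
    """Expected number of chance occurrences of one k-base site in the genome."""
    return genome_size_bp / expected_spacing(k)


for k in (16, 17, 18):
    print(
        f"{k}-base site: 1 in {expected_spacing(k):,} bases; "
        f"about {expected_sites(k):.3f} chance sites per genome"
    )
```

Under these assumptions the expected number of chance sites falls below one from 16 bases upward and drops by a factor of four with each added base, which is consistent with the stated preference for a specificity of 18 bases or more.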
The term "target region" refers to a genomic region that is subject to genome modification. "Deletion" includes a deletion of one or more bases and a deletion of one or more genes relative to a reference genome. The deletion may be a deletion of 100 bp or more, 200 bp or more, 300 bp or more, 400 bp or more, 500 bp or more, 600 bp or more, 700 bp or more, 800 bp or more, 900 bp or more, 1 kbp or more, 10 kbp or more, 50 kbp or more, 100 kbp or more, 200 kbp or more, 300 kbp or more, 400 kbp or more, 500 kbp or more, or 1 Mbp or more, or may be a smaller deletion. The deletion may be 1 Mbp or less, 700 kbp or less, 600 kbp or less, or 500 kbp or less. The deletion may be 10 kbp or more and 600 kbp or less, 100 kbp or more and 600 kbp or less, or 100 kbp or more and 500 kbp or less.
The term "donor DNA" refers to DNA that is used to repair a DNA double-strand break and is capable of homologous recombination with the DNA surrounding a target region. The donor DNA contains, as homology arms, a base sequence upstream and a base sequence downstream of the target region (e.g., base sequences adjacent to the target region). In this specification, a homology arm consisting of a base sequence upstream of the target region (e.g., a base sequence adjacent to the upstream side) may be referred to as an "upstream homology arm", and a homology arm consisting of a base sequence downstream of the target region (e.g., a base sequence adjacent to the downstream side) may be referred to as a "downstream homology arm". The donor DNA may contain a desired base sequence between the upstream homology arm and the downstream homology arm. The length of each homology arm is preferably 300 bp or more, and is usually about 500 to 3000 bp. The upstream and downstream homology arms may be the same length or different lengths. When homologous recombination with the donor DNA is successfully induced after sequence-dependent cleavage, the sequence between the upstream and downstream base sequences of the target region is replaced with the sequence flanked by the upstream and downstream base sequences of the donor DNA.
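The donor DNA layout described above (upstream homology arm, desired sequence, downstream homology arm) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a design tool: the class and function names, the 0-based half-open coordinates, and the toy genome are all hypothetical, and `recombine` models only the idealized outcome of a successful recombination event.

```python
from dataclasses import dataclass

MIN_ARM_BP = 300  # the text states that 300 bp or more is preferred


@dataclass(frozen=True)
class DonorDNA:
    upstream_arm: str
    insert: str          # desired sequence; empty for a seamless deletion
    downstream_arm: str

    def sequence(self) -> str:
        return self.upstream_arm + self.insert + self.downstream_arm


def make_donor(genome: str, target_start: int, target_end: int,
               insert: str, arm_len: int = 500) -> DonorDNA:
    """Build a donor whose arms flank genome[target_start:target_end]."""
    if arm_len < MIN_ARM_BP:
        raise ValueError("homology arms shorter than 300 bp are not preferred")
    if target_start - arm_len < 0 or target_end + arm_len > len(genome):
        raise ValueError("not enough flanking sequence for the requested arms")
    upstream = genome[target_start - arm_len:target_start]
    downstream = genome[target_end:target_end + arm_len]
    return DonorDNA(upstream, insert, downstream)


def recombine(genome: str, target_start: int, target_end: int,
              donor: DonorDNA) -> str:
    """Idealized result: the target region is replaced by the donor insert."""
    return genome[:target_start] + donor.insert + genome[target_end:]


genome = "ACGT" * 500  # toy 2000-base sequence
donor = make_donor(genome, 800, 1200, insert="TTTT")
edited = recombine(genome, 800, 1200, donor)
print(len(donor.sequence()), len(edited))
```

Passing an empty `insert` models the case in which the sequences upstream and downstream of the target region are joined directly.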
"Upstream" of a target region means the DNA region located on the 5' side of a reference nucleotide strand in the double-stranded DNA of the target region. "Downstream" of a target region means the DNA region located on the 3' side of that reference nucleotide strand. Either strand of the double strand may be chosen as the reference nucleotide strand. For convenience, however, when the target region contains a protein-coding sequence, the reference nucleotide strand is usually the sense strand. In general, a promoter is located upstream of a protein-coding sequence, and a terminator is located downstream of it.
The term "sequence-specific nucleic acid cleaving molecule" refers to a molecule that can recognize a specific nucleic acid sequence and cleave a nucleic acid at that specific nucleic acid sequence. A sequence-specific nucleic acid cleaving molecule is a molecule that has the activity of cleaving a nucleic acid in a sequence-specific manner (sequence-specific nucleic acid cleaving activity).
The term "target sequence" refers to a DNA sequence in a genome that is to be cleaved by a sequence-specific nucleic acid cleaving molecule. When the sequence-specific nucleic acid cleaving molecule is a Cas protein, the target sequence refers to a DNA sequence in a genome that is to be cleaved by the Cas protein. When a Cas9 protein is used as the Cas protein, the target sequence must be a sequence adjacent to the 5' side of a protospacer adjacent motif (PAM). The target sequence is usually selected as a sequence of 17 to 30 bases (preferably 18 to 25 bases, more preferably 19 to 22 bases, and even more preferably 20 bases) immediately adjacent to the 5' side of the PAM. A known design tool such as CRISPR DESIGN (crispr.mit.edu/) can be used to design the target sequence.
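To make the positional rule above concrete, the following Python sketch lists candidate target sequences that lie immediately 5' of a PAM. It assumes the NGG PAM commonly associated with S. pyogenes Cas9, which the text above does not specify, and it scans only the strand that is passed in. The function name and the toy sequence are hypothetical, and the sketch does not score specificity or off-target risk as a dedicated design tool would.

```python
import re


def find_targets(seq: str, length: int = 20) -> list[tuple[int, str, str]]:
    """Return (start, target, pam) for each NGG PAM with `length` bases 5' of it."""
    seq = seq.upper()
    hits = []
    # A lookahead is used so that overlapping PAMs (e.g. in "AGGG") are all found.
    for match in re.finditer(r"(?=([ACGT]GG))", seq):
        pam_start = match.start()
        if pam_start >= length:  # need a full-length target before the PAM
            target = seq[pam_start - length:pam_start]
            hits.append((pam_start - length, target, match.group(1)))
    return hits


toy = "ACGTACGTACGTACGTACGTAGGG"  # hypothetical sequence on the scanned strand
for start, target, pam in find_targets(toy):
    print(start, target, pam)
```

To cover both strands under the same assumptions, the function could be called again on the reverse complement of the sequence.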
The term "Cas protein" refers to a CRISPR-associated protein. In a preferred embodiment, the Cas protein forms a complex with a guide RNA and exhibits endonuclease activity or nickase activity. Examples of Cas proteins include, but are not limited to, Cas9 protein, Cpf1 protein, C2c1 protein, C2c2 protein, and C2c3 protein. Cas proteins include wild-type Cas proteins and their homologs (paralogs and orthologs), as well as mutants thereof, so long as they cooperate with a guide RNA to exhibit endonuclease activity or nickase activity.
In a preferred embodiment, the Cas protein is involved in a class 2 CRISPR/Cas system, more preferably in a type II CRISPR/Cas system. A preferred example of the Cas protein is the Cas9 protein. Another preferred example of the Cas protein is the Cas3 protein.
The term "Cas9 protein" refers to a Cas protein involved in the type II CRISPR/Cas system. The Cas9 protein forms a complex with a guide RNA and, in cooperation with the guide RNA, exhibits the activity of cleaving DNA in a target region. The Cas9 protein includes wild-type Cas9 protein and its homologs (paralogs and orthologs), as well as mutants thereof, so long as it has the above-mentioned activity. The wild-type Cas9 protein has a RuvC domain and an HNH domain as nuclease domains, but the Cas9 protein in this specification may be one in which either the RuvC domain or the HNH domain is inactivated. Cas9 in which either the RuvC domain or the HNH domain is inactivated introduces a single-strand break (nick) into double-stranded DNA. Therefore, when such a Cas9 is used to cleave double-stranded DNA, the modification system can be configured so that Cas9 target sequences are set on each of the sense strand and the antisense strand and the nicks in the two strands occur at positions sufficiently close to each other, thereby inducing a double-strand break.
The species of organism from which the Cas9 protein is derived is not particularly limited, but preferred examples include bacteria belonging to the genus Streptococcus, Staphylococcus, Neisseria, or Treponema. More specifically, preferred examples include Cas9 proteins derived from S. pyogenes, S. thermophilus, S. aureus, N. meningitidis, or T. denticola. In a preferred embodiment, the Cas9 protein is a Cas9 protein derived from S. pyogenes.
The amino acid sequences of various Cas proteins, and information on their coding sequences, can be obtained from databases such as GenBank, UniProt, and Addgene. For example, as the amino acid sequence of the Cas9 protein of S. pyogenes, the sequence registered in Addgene as plasmid number 42230 can be used. An example of the amino acid sequence of the Cas9 protein of S. pyogenes is shown in SEQ ID NO: 1.
The terms "guide RNA" and "gRNA" are used interchangeably and refer to an RNA that can form a complex with a Cas protein and guide the Cas protein to a target region. In a preferred embodiment, the guide RNA comprises a CRISPR RNA (crRNA) and a trans-activating CRISPR RNA (tracrRNA). The crRNA is involved in binding to the target region on the genome, and the tracrRNA is involved in binding to the Cas protein. In a preferred embodiment, the crRNA comprises a spacer sequence and a repeat sequence, and the spacer sequence binds to the complementary strand of the target sequence in the target region. In a preferred embodiment, the tracrRNA comprises an anti-repeat sequence and a 3' tail sequence. The anti-repeat sequence is complementary to the repeat sequence of the crRNA and forms base pairs with the repeat sequence, and the 3' tail sequence usually forms three stem loops.
The guide RNA may be a single guide RNA (sgRNA) in which the 5' end of the tracrRNA is linked to the 3' end of the crRNA, or the crRNA and the tracrRNA may be separate RNA molecules in which the repeat sequence and the anti-repeat sequence form base pairs. In a preferred embodiment, the guide RNA is an sgRNA.
The crRNA repeat sequence and tracrRNA sequence can be appropriately selected depending on the type of Cas protein, and those derived from the same bacterial species as the Cas protein can be used.
For example, when using Cas9 protein derived from S. pyogenes, the length of the sgRNA can be about 50 to 220 nucleotides (nt), preferably about 60 to 180 nt, more preferably about 80 to 120 nt. The length of the crRNA can be about 25 to 70 bases including the spacer sequence, preferably about 25 to 50 nt. The length of the tracrRNA can be about 10 to 130 nt, preferably about 30 to 80 nt.
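The length ranges given above for S. pyogenes Cas9 guide RNAs can be captured in a small lookup, shown here as a minimal sketch. The table name `LENGTH_TIERS`, the function name `classify_length`, and the tier labels are hypothetical and introduced only for illustration. The text qualifies every bound with "about", so the hard cut-offs used here are a simplifying assumption.

```python
# Length tiers (in nucleotides) taken from the ranges quoted in the text for
# guide RNAs used with S. pyogenes Cas9. Tiers are listed narrowest first.
# The names LENGTH_TIERS and classify_length are hypothetical.
LENGTH_TIERS = {
    "sgRNA": [("more preferred", 80, 120), ("preferred", 60, 180), ("permitted", 50, 220)],
    "crRNA": [("preferred", 25, 50), ("permitted", 25, 70)],  # includes the spacer
    "tracrRNA": [("preferred", 30, 80), ("permitted", 10, 130)],
}


def classify_length(kind: str, length_nt: int) -> str:
    """Return the narrowest tier whose range contains length_nt, else 'outside'."""
    for label, low, high in LENGTH_TIERS[kind]:
        if low <= length_nt <= high:
            return label
    return "outside"


if __name__ == "__main__":
    for kind, n in [("sgRNA", 100), ("sgRNA", 200), ("crRNA", 60), ("tracrRNA", 5)]:
        print(kind, n, classify_length(kind, n))
```

Because the stated bounds are approximate, a length reported as "outside" by this sketch is not necessarily excluded by the text.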
The repeat sequence of the crRNA may be the same as that in the bacterial species from which the Cas protein is derived, or may be one in which a part of the 3' end has been deleted. The tracrRNA may have the same sequence as the mature tracrRNA in the bacterial species from which the Cas protein is derived, or may be a truncated type in which the 5' end and/or the 3' end of the mature tracrRNA has been truncated. For example, the tracrRNA may be a truncated type in which about 1 to 40 nucleotide residues have been removed from the 3' end of the mature tracrRNA. The tracrRNA may also be a truncated type in which about 1 to 80 nucleotide residues have been removed from the 5' end of the mature tracrRNA. The tracrRNA may also be a truncated type in which, for example, about 1 to 20 nucleotide residues have been removed from the 5' end and about 1 to 40 nucleotide residues have been removed from the 3' end.
Various crRNA repeat sequences and tracrRNA sequences for sgRNA design have been proposed, and those skilled in the art can design sgRNAs based on known techniques (e.g., Jinek et al. (2012) Science, 337, 816-21; Mali et al. (2013) Science, 339: 6121, 823-6; Cong et al. (2013) Science, 339: 6121, 819-23; Hwang et al. (2013) Nat. Biotechnol. 31: 3, 227-9; Jinek et al. (2013) eLife, 2, e00471).
The terms "protospacer adjacent motif" and "PAM" are used interchangeably and refer to a sequence recognized by the Cas protein when the Cas protein cleaves DNA. The sequence and position of the PAM vary depending on the type of Cas protein. For example, in the case of the Cas9 protein, the PAM must be immediately adjacent to the 3' side of the target sequence. The sequence of the PAM corresponding to the Cas9 protein varies depending on the bacterial species from which the Cas9 protein is derived. For example, the PAM corresponding to the Cas9 protein of S. pyogenes is "NGG", the PAM corresponding to the Cas9 protein of S. thermophilus is "NNAGAA", the PAM corresponding to the Cas9 protein of S. aureus is "NNGRRT" or "NNGRR(N)", the PAM corresponding to the Cas9 protein of N. meningitidis is "NNNNGATT", and the PAM corresponding to the Cas9 protein of T. denticola is "NAAAAC" (where "R" is A or G, and "N" is A, T, G, or C).
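As an illustration of the rule that the PAM lies immediately 3' of the target sequence, the following minimal sketch scans one strand of a DNA string for the PAMs listed above. The names `PAMS`, `IUPAC`, and `find_targets` are hypothetical. The 20-nt target length is an assumption not stated in the text, "NNGRR(N)" is represented here as "NNGRRN", and only the strand that is passed in is scanned.

```python
import re

# IUPAC codes needed for the PAMs listed in the text (R = A or G; N = any base).
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "R": "[AG]", "N": "[ACGT]"}

# PAMs per source species as given in the text; "NNGRR(N)" is written "NNGRRN".
PAMS = {
    "S. pyogenes": ["NGG"],
    "S. thermophilus": ["NNAGAA"],
    "S. aureus": ["NNGRRT", "NNGRRN"],
    "N. meningitidis": ["NNNNGATT"],
    "T. denticola": ["NAAAAC"],
}


def find_targets(seq: str, species: str, target_len: int = 20):
    """Return sorted (target_start, target, pam) tuples for the given strand.

    A site is reported when a PAM starts immediately 3' of a target_len-nt
    target. target_len = 20 is an assumption made for this sketch.
    """
    seq = seq.upper()
    hits = set()
    for pam in PAMS[species]:
        # A lookahead lets overlapping PAM occurrences be reported.
        pattern = re.compile("(?=(" + "".join(IUPAC[b] for b in pam) + "))")
        for m in pattern.finditer(seq):
            pam_start = m.start()
            if pam_start >= target_len:
                start = pam_start - target_len
                hits.add((start, seq[start:pam_start], m.group(1)))
    return sorted(hits)


if __name__ == "__main__":
    demo = "ACGT" * 5 + "AGG" + "CC"
    print(find_targets(demo, "S. pyogenes"))
```

A practical design tool would also scan the reverse complement and score off-target sites; both are outside the scope of this sketch.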
The terms "spacer sequence" and "guide sequence" are used interchangeably and refer to a sequence contained in a guide RNA that can bind to a complementary strand of a target sequence. Usually, the spacer sequence is the same sequence as the target sequence (with the exception that T in the target sequence becomes U in the spacer sequence). In an embodiment of the present invention, the spacer sequence may contain one or more base mismatches with the target sequence. When multiple base mismatches are contained, the mismatches may be located adjacent to each other or may be located distant from each other. In a preferred embodiment, the spacer sequence may contain 1 to 5 base mismatches with the target sequence. In a particularly preferred embodiment, the spacer sequence may contain one base mismatch with the target sequence.
In the guide RNA, the spacer sequence is located on the 5' side of the crRNA.
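The relationship between a spacer sequence and its target sequence (identical except that T becomes U, with a limited number of mismatches tolerated) can be sketched as follows. The function names are hypothetical, and the default limit of 5 mismatches simply mirrors the upper end of the preferred range stated above.

```python
def spacer_from_target(target_dna: str) -> str:
    """Return the perfectly matching spacer: the target sequence with T replaced by U."""
    return target_dna.upper().replace("T", "U")


def count_mismatches(spacer_rna: str, target_dna: str) -> int:
    """Count positions where the spacer differs from the perfectly matching spacer."""
    if len(spacer_rna) != len(target_dna):
        raise ValueError("spacer and target must have the same length")
    expected = spacer_from_target(target_dna)
    return sum(1 for a, b in zip(spacer_rna.upper(), expected) if a != b)


def within_mismatch_limit(spacer_rna: str, target_dna: str, max_mismatches: int = 5) -> bool:
    """True if the spacer has at most max_mismatches mismatches with the target."""
    return count_mismatches(spacer_rna, target_dna) <= max_mismatches


if __name__ == "__main__":
    target = "ACGTTGCA"
    print(spacer_from_target(target))            # perfectly matching spacer
    print(count_mismatches("ACGUAGCA", target))  # spacer with one substituted base
```

This sketch counts mismatches only by position; it does not distinguish adjacent from distant mismatches, both of which the text allows.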
The term "operably linked", when used with respect to a polynucleotide, means that a first base sequence is positioned sufficiently close to a second base sequence that the first base sequence can affect the second base sequence or a region under the control of the second base sequence. For example, the statement that a polynucleotide is operably linked to a promoter means that the polynucleotide is linked so that it is expressed under the control of the promoter.
The term "expressible state" refers to a state in which a polynucleotide can be transcribed in a cell into which it has been introduced.
The term "expression vector" refers to a vector containing a target polynucleotide and equipped with a system that allows the target polynucleotide to be expressed in a cell into which the vector is introduced. For example, "Cas protein expression vector" refers to a vector that can express Cas protein in a cell into which the vector is introduced. Also, for example, "guide RNA expression vector" refers to a vector that can express guide RNA in a cell into which the vector is introduced.
In this specification, the sequence identity (or homology) between base sequences or between amino acid sequences is determined by aligning the two base sequences or amino acid sequences, with gaps placed at the sites of insertions and deletions, so that the largest number of corresponding bases or amino acids match, and then calculating the proportion of matching bases or amino acids relative to the entire base sequence or entire amino acid sequence, excluding gaps, in the resulting alignment. Sequence identity between base sequences or amino acid sequences can be determined using various homology search software known in the art. The sequence identity value (identity value) of base sequences can be obtained by, for example but without limitation, the BLAT search provided in the UCSC Genome Browser.
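The identity calculation described above can be illustrated with a minimal sketch that takes two sequences that have already been aligned, with "-" marking gaps. The function name `percent_identity` is hypothetical. Treating the denominator as the number of alignment columns in which neither sequence has a gap is one reading of "excluding gaps" and is an assumption of this sketch; tools such as BLAT may report identity differently.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over the gap-free columns of a pairwise alignment.

    Both inputs must already be aligned and therefore have equal length.
    Columns in which either sequence has a gap are excluded from the count.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == gap or y == gap:
            continue  # gap column: excluded from the denominator
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("alignment has no gap-free columns")
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Five gap-free columns, four of which match.
    print(percent_identity("ACGT-A", "ACCTTA"))
```

The sketch does not perform the alignment itself; producing the alignment that maximizes the number of matches is left to dedicated alignment software.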
In this specification, for convenience, positions in the human genome are referred to by their positions in the hg38 genome sequence, which is used as the reference genome. hg38 is a reference genome released by the University of California, Santa Cruz (UCSC) in December 2013. A reference genome is a genome for reference created by combining various genomes, and this does not mean that a human having this exact genome exists. However, by comparing fragmentary sequence information read from the genomic DNA of a human individual against the reference genome, the fragments can be joined into one continuous sequence on a computer, and the sequence of the genomic DNA of that individual can thereby be estimated. The genomic DNA of an individual, such as a human individual, is usually decoded in this way, by mapping the sequence of the individual's genomic DNA to the reference genome.
A position or region corresponding to a specific position or specific region of the hg38 genome sequence means the position or region that is associated with that specific position or region in the genome of another individual whose specific sequence differs. Specifically, a position or region that, based on sequence identity, has a sequence characteristic of the specific position or region is the position or region corresponding to that specific position or region of the hg38 genome sequence. The corresponding position can be determined by aligning partial sequences of the two genomic DNAs. Even if the specific sequences differ, the correspondence between the two genomic DNAs can be determined by alignment, provided that the sequences are orthologous or share sequence identity. In a region rich in paralogs generated by gene duplication, determining correspondence from individual sequences alone may not be sufficient to establish the true correspondence between the two genomes. This increases the difficulty of sequencing regions in which similar sequences are clustered. When determining corresponding sequences, the correspondence between the two genomes can be clarified by requiring high sequence identity. In addition, when the specific region is a large region containing multiple genes, synteny can be taken into consideration. Synteny means that the physical positional relationship of orthologs on the genome is conserved. Synteny can exist between individuals and between organisms. Therefore, the specific region can be determined by taking synteny into consideration.
[Library creation method]
The method for preparing a library may include preparing cells. The prepared cells are cells before the modification of the present disclosure and are sometimes referred to as "reference cells" because they can serve as a reference when compared with the modified cells. The cells may preferably be cloned cells, established cell lines, or immortalized cells. In a preferred embodiment, the cells may comprise a single type of cell. The library of the present disclosure provides a library of cells in which only a specific region of a specific locus has been replaced with a target sequence. Because keeping all sequences other than the target sequence identical makes it possible to clarify the technical significance of the replacement with the target sequence, it is preferable that the cells consist of a single type of cell. The single type of cell is a cloned cell.
Overview of the first production mode
The overview is shown in FIG. 1.
The method for preparing the library may include obtaining a first intermediate cell by using the following genome modification method for the cell. The first intermediate cell is a cell having a genome including a first allele and a second allele at a locus to be modified, and the first allele and the second allele each have a cassette including a selection marker gene and a target nucleic acid sequence. In the first intermediate cell, the selection marker gene of the first allele and the selection marker gene of the second allele are distinguishably different, the target nucleic acid sequence is a target of the genome modification system, and is designed so that the first allele and the second allele can be distinguishably cleaved by the genome modification system, and each selection marker gene is a negative selection marker gene that can be used for negative selection. The negative selection marker gene may also be used for positive selection (e.g., a visualization marker gene, etc.). The method for preparing the library may include obtaining a library of modified cells (sometimes referred to as a "library of first modified cells" or simply "first library") from the first intermediate cell. The modified cells contained in the first library may be referred to as "first modified cells".
The first intermediate cell can be obtained as appropriate by a person skilled in the art. Although the method is not particularly limited, the first intermediate cell can be easily prepared by using the genome modification method described below. Modified cells can be obtained from the intermediate cell by cleaving within or near the first cassette in the presence of a donor DNA for introducing a modified base sequence.
Overview of the second production mode
The overview is shown in FIG. 1.
The method for producing a library may include obtaining a first intermediate cell by applying the genome modification method described below to a cell. The method may include obtaining a second intermediate cell from the first intermediate cell, and may include obtaining a library from the second intermediate cell. Here, the first intermediate cell is as described above. The second intermediate cell is a cell having a genome that includes a first allele and a second allele at the locus to be modified, in which the first allele has a cassette including a selection marker gene and a target nucleic acid sequence and the second allele does not include the cassette. The second intermediate cell can be produced by removing the cassette from the second allele of the first intermediate cell. A person skilled in the art can remove the cassette as appropriate by a genome modification method. Specifically, the second intermediate cell can be obtained from the first intermediate cell by applying to the first intermediate cell a genome modification system capable of cleaving the target sequence contained in the second allele, in the presence of a donor DNA having an upstream homology arm capable of homologous recombination with the sequence upstream of the cassette and a downstream homology arm capable of homologous recombination with the sequence downstream of the cassette. The method for producing a library may include obtaining a library of modified cells (sometimes referred to as a "library of second modified cells" or simply a "second library") from the second intermediate cell. The modified cells contained in the second library are sometimes referred to as "second modified cells". The second intermediate cell can be obtained from the first intermediate cell by cleaving a sequence within or near the second cassette in the presence of a donor DNA for cassette removal. The modified cells can be obtained from the second intermediate cell by cleaving the first cassette in the presence of a donor DNA for introducing a modified base sequence.
Variation of the second production mode
In the second production mode, the second intermediate cell is produced by removing the cassette from the second allele of the first intermediate cell. In the variation of the second production mode, the cassette in the second allele of the first intermediate cell is removed so that the second allele is returned to its sequence before modification, yielding a third intermediate cell in which the first allele has a cassette containing a selection marker gene and a target nucleic acid sequence and the second allele has the sequence before modification. A library of modified cells is then obtained by applying a donor DNA for library production to the first allele of the third intermediate cell. In this embodiment, each modified cell contained in the library of modified cells has a modified base sequence in the first allele and the sequence before modification in the second allele. The operation of returning the second allele of the first intermediate cell to the sequence before modification can be achieved by cleaving within or near the second cassette in the presence of a donor DNA consisting of an upstream homology arm capable of homologous recombination with the sequence upstream of the second cassette and a downstream homology arm capable of homologous recombination with the sequence downstream of that cassette.
The present disclosure provides a first intermediate cell, a second intermediate cell, a composition comprising these intermediate cells, a first modified cell, a first library, a second modified cell, and a second library.
In general, in genome editing, a cut is made, typically within the target region, in a genome having the target region, in the presence of a donor DNA having an upstream homology arm capable of homologous recombination with the sequence upstream of the target region (e.g., having a complementary sequence) and a downstream homology arm capable of homologous recombination with the sequence downstream of the target region (e.g., having a complementary sequence). As a result, the target region is replaced with the sequence located between the upstream homology arm and the downstream homology arm in the donor DNA. If no sequence is present between the two homology arms, the target region is deleted (or seamlessly deleted). In genome editing, cuts may be made at multiple sites within the target region. Typically, it is beneficial to make one cut within the target region near the upstream homology arm and another within the target region near the downstream homology arm.
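The outcome of the donor-mediated replacement described above can be modeled on plain strings, as in the following minimal sketch. It models only the sequence-level result, not the cleavage or the recombination machinery. The names `make_donor` and `recombine`, the 10-nt arm length, and the toy sequences are hypothetical, and each homology arm is assumed to occur exactly once in the genome string.

```python
def make_donor(genome: str, start: int, end: int, payload: str = "", arm_len: int = 10):
    """Build a toy donor (upstream arm, payload, downstream arm) for genome[start:end].

    The arms are copied from the sequence flanking the target region.
    arm_len = 10 is an arbitrary choice for this sketch.
    """
    if start < arm_len or end + arm_len > len(genome):
        raise ValueError("not enough flanking sequence for the homology arms")
    return genome[start - arm_len:start], payload, genome[end:end + arm_len]


def recombine(genome: str, donor) -> str:
    """Replace the region between the two arms with the donor payload.

    Assumes each arm occurs once in the genome. An empty payload
    yields a seamless deletion of the target region.
    """
    upstream, payload, downstream = donor
    i = genome.find(upstream)
    if i < 0:
        raise ValueError("upstream homology arm not found")
    j = genome.find(downstream, i + len(upstream))
    if j < 0:
        raise ValueError("downstream homology arm not found")
    return genome[:i + len(upstream)] + payload + genome[j:]


if __name__ == "__main__":
    up, target, down = "ACGTACGTAC", "TTTTTTTT", "GATCGATCGA"
    genome = "GG" + up + target + down + "CC"
    print(recombine(genome, make_donor(genome, 12, 20, payload="CAT")))  # replacement
    print(recombine(genome, make_donor(genome, 12, 20)))                 # seamless deletion
```

In this model a replacement and a deletion differ only in whether the payload between the arms is empty, which mirrors the distinction drawn in the paragraph above.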
In a preferred embodiment, in the first intermediate cell, the target region in each of the first allele and the second allele has been replaced with the cassette, and the first allele and the second allele may have a deletion of the target region. In another preferred embodiment, the cassette has been inserted into the target region in each of the first allele and the second allele, and the first allele and the second allele need not have a deletion of the target region. In a preferred embodiment, the cassette is inserted into a target region that has no function, and the insertion is therefore not accompanied by a reduction or loss of function caused by disruption of the target region.
The cells used in the genome modification method of this embodiment are not particularly limited, and may be any cells having a haploid, or diploid or higher, chromosomal genome. The cells may be diploid, triploid, or tetraploid or higher. Examples of the cells include, but are not limited to, eukaryotic cells. The cells may be plant cells, animal cells, or fungal cells. The animal cells may be, but are not limited to, cells of humans, non-human mammals (e.g., non-human primates such as monkeys, dogs, cats, cows, horses, sheep, goats, llamas, and rodents), birds, reptiles, amphibians, fish, or other vertebrates.
Examples of the cells include pluripotent cells (e.g., pluripotent stem cells such as embryonic stem cells (ES cells) and induced pluripotent stem cells (iPS cells)), hematopoietic stem cells, hematopoietic progenitor cells, bone marrow cells, spleen cells, common myeloid progenitor cells, immune cells (e.g., T cells, B cells, NK cells, NKT cells, macrophages, monocytes, neutrophils, eosinophils, and basophils), erythrocytes, megakaryocytes, cardiac cells, cardiomyocytes, cardiac fibroblasts, pancreatic beta cells, corneal cells (e.g., corneal epithelial cells and corneal endothelial cells), epidermal cells, dermal cells, adipocytes, chondrocytes, osteocytes, osteoclasts, osteoblasts, mesenchymal stem cells (e.g., adipose-derived, bone marrow-derived, placenta-derived, and umbilical cord-derived), dental pulp cells, tendon cells, ligament cells, nerve cells (e.g., pyramidal cells, stellate cells, and granule cells), glial cells, Purkinje cells, retinal ganglion cells, retinal cells, optic nerve cells, and neural stem cells. In a preferred embodiment, the cells may be primary cells. In a preferred embodiment, the cells may be immortalized cells or a cell line. In a preferred embodiment, the cells are human cells.
In one embodiment, the cell may be an isolated cell, a cloned cell, or a cell line. The cell may be an immortalized cell. In a preferred embodiment, the cell is a cloned cell. In a preferred embodiment, the cell is a cell line. In a preferred embodiment, the cell is an immortalized cell. In one embodiment, the cell is a primary somatic cell. Those skilled in the art will understand that the cell is selected as appropriate according to its intended use.
In one embodiment, the cells (or the library) can be frozen in a cell cryoprotective solution. In one embodiment, the cell cryoprotective solution containing the cells can be provided in a non-frozen state or, preferably, in a frozen state. The cell cryoprotective solution containing the cells in a frozen state (also called a "frozen stock") can be used as a research cell bank (RCB), master cell bank (MCB), or working cell bank (WCB). Thus, the present invention provides a research cell bank (RCB), master cell bank (MCB), or working cell bank (WCB) that includes the above-mentioned frozen stock.
[Genome modification method]
The method described below (for example, UKiS; see International Publication No. WO 2021/206054) can be preferably used to produce the above-mentioned cells. This method is preferably used, in particular, to produce the first intermediate cell from a cell in step S1 shown in FIG. 1.
In one embodiment, in the present invention, a method including the following (a) and (b) can be used to prepare the first intermediate cell:
(a) introducing the following (i) and (ii) into a cell containing two or more alleles to introduce a selection marker gene into each of the two or more alleles;
(i) a sequence-specific nucleic acid cleaving molecule capable of targeting and cleaving a target region in two or more alleles of the chromosomal genome, or a genome modification system comprising a polynucleotide encoding the sequence-specific nucleic acid cleaving molecule;
(ii) Two or more types of donor DNA for selection markers, each of which has an upstream homology arm having a base sequence capable of homologous recombination with a base sequence on the upstream side of the target region and a downstream homology arm having a base sequence capable of homologous recombination with a base sequence on the downstream side of the target region, and contains a base sequence of a selection marker gene between the upstream homology arm and the downstream homology arm, wherein the two or more donor DNAs for selection markers each have a selection marker gene that is distinguishable from each other, the selection marker gene is unique for each type of donor DNA for selection markers, and the number of types of donor DNA for selection markers is equal to or greater than the number of alleles to be subjected to genome modification;
(b) after step (a), a step of selecting cells in which different types of donor DNA for selection markers have each undergone homologous recombination with the two or more alleles, so that a distinguishably different, unique selection marker gene has been introduced into each of the two or more alleles, and which express all of the introduced distinguishably different selection marker genes (a step for positive selection);
The method may thus be one comprising steps (a) and (b) above.
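The requirements placed on the donor DNA set in (ii) can be sketched as a small consistency check. This is only an illustrative sketch: the `DonorDNA` record, the `check_donor_set` helper, and the marker names are hypothetical, and the check simplifies "unique for each type" to "each donor carries at least one marker that no other donor carries".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DonorDNA:
    """Hypothetical record for one type of donor DNA for selection markers."""
    name: str
    upstream_arm: str
    downstream_arm: str
    markers: frozenset  # selection marker genes located between the two arms


def check_donor_set(donors, n_target_alleles):
    """Return a list of problems; an empty list means the set passes this sketch."""
    problems = []
    # The number of donor types must be equal to or greater than the allele count.
    if len(donors) < n_target_alleles:
        problems.append(
            f"{len(donors)} donor type(s) for {n_target_alleles} target allele(s)"
        )
    # Simplifying assumption: each donor needs a marker that no other donor carries.
    for donor in donors:
        others = set().union(*(o.markers for o in donors if o is not donor))
        if not (donor.markers - others):
            problems.append(f"{donor.name} has no marker unique to it")
    return problems


donor_a = DonorDNA("donor_A", "UP", "DOWN", frozenset({"PuroR"}))
donor_b = DonorDNA("donor_B", "UP", "DOWN", frozenset({"BsdR"}))
print(check_donor_set([donor_a, donor_b], n_target_alleles=2))
```

The sketch does not model the alternative described in the text in which a shared marker gene is configured so that it is not expressed from two donor types at the same time.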
(工程(a))
 工程(a)では、前記(i)及び(ii)を、染色体を含む細胞に導入する。
(Step (a))
In step (a), (i) and (ii) are introduced into a cell containing the chromosome.
<(i)ゲノム改変システム>
 「ゲノム改変システム」とは、所望の標的領域を改変することが可能な分子機構を意味する。ゲノム改変システムは、染色体ゲノムの標的領域を標的とする配列特異的核酸切断分子、又は前記配列特異的核酸切断分子をコードするポリヌクレオチドを含む。より具体的には、ゲノム改変システムは、標的領域中あるいは近傍の少なくとも1つ、好ましくは2つを切断することができる。
<(i) Genome modification system>
"Genome modification system" refers to a molecular mechanism capable of modifying a desired target region. The genome modification system includes a sequence-specific nucleic acid cleavage molecule that targets a target region of a chromosomal genome, or a polynucleotide that encodes the sequence-specific nucleic acid cleavage molecule. More specifically, the genome modification system can cleave at least one site, preferably two sites, in or near the target region.
 ゲノム改変の対象となる標的領域は、1以上のアレルを有するゲノム上の任意の領域とすることができる。標的領域のサイズは、特に限定されない。本実施形態のゲノム改変方法では、従来よりも大きなサイズの領域を改変することができる。標的領域は、例えば、10kbp以上であってもよい。標的領域は、例えば、100bp以上、200bp以上、400bp以上、800bp以上、1kbp以上、2kbp以上、3kbp以上、4kbp以上、5kbp以上、8kbp以上、10kbp以上、20kbp以上、40kbp以上、80kbp以上、100kbp以上、200kbp以上、300kbp、400kbp以上、500kbp以上、600kbp以上、700kbp以上、800kbp以上、900kbp以上、もしくは1Mbp以上、または上記のいずれかの数値以下であってもよい。ある態様では、改変された細胞において、上記標的領域が欠失している。 The target region to be subjected to genome modification can be any region on the genome having one or more alleles. The size of the target region is not particularly limited. In the genome modification method of this embodiment, a larger region than was conventionally possible can be modified. The target region may be, for example, 10 kbp or more. The target region may be, for example, 100 bp or more, 200 bp or more, 400 bp or more, 800 bp or more, 1 kbp or more, 2 kbp or more, 3 kbp or more, 4 kbp or more, 5 kbp or more, 8 kbp or more, 10 kbp or more, 20 kbp or more, 40 kbp or more, 80 kbp or more, 100 kbp or more, 200 kbp or more, 300 kbp, 400 kbp or more, 500 kbp or more, 600 kbp or more, 700 kbp or more, 800 kbp or more, 900 kbp or more, or 1 Mbp or more, or may be equal to or less than any of the above values. In one embodiment, the target region is deleted in the modified cell.
 配列特異的核酸切断分子は、配列特異的核酸切断活性を有する分子であれば、特に限定されず、合成有機化合物であってもよく、タンパク質等の生体高分子化合物であってもよい。配列特異的部位切断活性を有するタンパク質としては、例えば、配列特異的エンドヌクレアーゼが挙げられる。 The sequence-specific nucleic acid cleaving molecule is not particularly limited as long as it has sequence-specific nucleic acid cleaving activity, and may be a synthetic organic compound or a biopolymer compound such as a protein. An example of a protein having sequence-specific site cleavage activity is a sequence-specific endonuclease.
 配列特異的エンドヌクレアーゼは、所定の配列で核酸を切断することができる酵素である。配列特異的エンドヌクレアーゼは、所定の配列で、2本鎖DNAを切断することができる。配列特異的エンドヌクレアーゼとしては、特に限定されないが、例えば、ジンクフィンガーヌクレアーゼ(Zinc finger nuclease(ZFN))、TALEN(Transcription activator-like effector nuclease)、Casタンパク質等が挙げられるが、これらに限定されない。 A sequence-specific endonuclease is an enzyme that can cleave nucleic acids at a specific sequence. A sequence-specific endonuclease can cleave double-stranded DNA at a specific sequence. Sequence-specific endonucleases are not particularly limited, but examples include zinc finger nucleases (ZFNs), TALENs (transcription activator-like effector nucleases), Cas proteins, etc., but are not limited to these.
 ZFNは、ジンクフィンガーアレイを含む結合ドメインにコンジュゲートした核酸切断ドメインを含む人工ヌクレアーゼである。切断ドメインとしては、II型制限酵素FokIの切断ドメインが挙げられる。標的配列を切断可能なジンクフィンガーヌクレアーゼの設計は、公知の方法で行うことができる。 ZFNs are artificial nucleases that contain a nucleic acid cleavage domain conjugated to a binding domain that contains a zinc finger array. Examples of cleavage domains include the cleavage domain of the type II restriction enzyme FokI. Zinc finger nucleases capable of cleaving a target sequence can be designed by known methods.
 TALENは、DNA切断ドメイン(例えば、FokIドメイン)に加えて転写活性化因子様(TAL)エフェクターのDNA結合ドメインを含む人工ヌクレアーゼである。標的配列を切断可能なTALE構築物の設計は、公知の方法で行うことができる(例えば、Zhang, Feng et al. (2011) Nature Biotechnology 29 (2))。 TALENs are artificial nucleases that contain the DNA-binding domain of a transcription activator-like (TAL) effector in addition to a DNA-cleavage domain (e.g., a FokI domain). TALE constructs capable of cleaving a target sequence can be designed by known methods (e.g., Zhang, Feng et al. (2011) Nature Biotechnology 29 (2)).
 配列特異的核酸切断分子として、Casタンパク質を用いる場合、ゲノム改変システムは、CRISPR/Casシステムを含む。すなわち、ゲノム改変システムは、Casタンパク質と、標的領域内の塩基配列に相同な塩基配列を有するガイドRNAを含むことが好ましい。ガイドRNAは、スペーサー配列として、標的領域内の配列(標的配列)と相同な配列を含んでいればよい。ガイドRNAは、標的領域内のDNAに結合できるものでればよく、標的配列と完全に同一の配列を有している必要はない。この結合は、細胞核内の生理的条件下で形成されればよい。ガイドRNAは、例えば、標的配列に対して、例えば、0~3塩基のミスマッチを含むことができる。前記ミスマッチの数は、0~2塩基が好ましく、0~1がより好ましく、ミスマッチを有しないことがさらに好ましい。ガイドRNAの設計は、公知の方法に基づいて行うことができる。ゲノム改変システムは、CRISPR/Casシステムであることが好ましく、Casタンパク質とガイドRNAを含むことが好ましい。Casタンパク質は、Cas9タンパク質であることが好ましい。 When a Cas protein is used as the sequence-specific nucleic acid cleavage molecule, the genome modification system includes a CRISPR/Cas system. That is, the genome modification system preferably includes a Cas protein and a guide RNA having a base sequence homologous to a base sequence in the target region. The guide RNA may include a sequence homologous to a sequence in the target region (target sequence) as a spacer sequence. The guide RNA may be capable of binding to DNA in the target region, and does not need to have a sequence completely identical to the target sequence. This binding may be formed under physiological conditions in the cell nucleus. The guide RNA may include, for example, 0 to 3 base mismatches with respect to the target sequence. The number of mismatches is preferably 0 to 2 bases, more preferably 0 to 1, and even more preferably no mismatches. The guide RNA may be designed based on a known method. The genome modification system is preferably a CRISPR/Cas system, and preferably includes a Cas protein and a guide RNA. The Cas protein is preferably a Cas9 protein.
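The mismatch tolerance described above (0 to 3 bases, with fewer preferred) can be illustrated with a short sketch. The function names and the example sequences are hypothetical; the sketch assumes an ungapped, position-by-position comparison of a spacer against a target sequence of equal length, with U in the RNA treated as T. It is not a guide RNA design tool.

```python
def count_mismatches(spacer: str, target: str) -> int:
    """Count position-wise mismatches between a gRNA spacer and a DNA target."""
    # Treat RNA 'U' as DNA 'T' so the two sequences can be compared directly.
    s = spacer.upper().replace("U", "T")
    t = target.upper()
    if len(s) != len(t):
        raise ValueError("spacer and target must have the same length")
    return sum(a != b for a, b in zip(s, t))


def spacer_acceptable(spacer: str, target: str, max_mismatches: int = 3) -> bool:
    """True if the spacer has at most max_mismatches mismatches to the target."""
    return count_mismatches(spacer, target) <= max_mismatches


target = "GACGTTACCGGATTCAGCTA"        # hypothetical 20-nt target sequence
perfect = "GACGUUACCGGAUUCAGCUA"       # same sequence written as RNA
one_off = "GACGUUACCGGAUUCAGCUU"       # last base changed
print(count_mismatches(perfect, target), count_mismatches(one_off, target))
```

Lowering `max_mismatches` to 2, 1, or 0 corresponds to the progressively more preferred ranges given in the text.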
 配列特異的エンドヌクレアーゼは、タンパク質として細胞に導入してもよく、配列特異的エンドヌクレアーゼをコードするポリヌクレオチドとして細胞に導入してもよい。例えば、配列特異的エンドヌクレアーゼのmRNAを導入してもよく、配列特異的エンドヌクレアーゼの発現ベクターを導入してもよい。発現ベクターにおいて、配列特異的エンドヌクレアーゼのコード配列(配列特異的エンドヌクレアーゼ遺伝子)は、プロモーターに機能的に連結されている。プロモーターは、特に限定されず、例えば、pol II系プロモーターを各種使用することができる。pol II系プロモーターとしては、特に制限されないが、例えばCMVプロモーター、EF1プロモーター(EF1αプロモーター)、SV40プロモーター、MSCVプロモーター、hTERTプロモーター、βアクチンプロモーター、CAGプロモーター、CBhプロモーター等が挙げられる。 The sequence-specific endonuclease may be introduced into the cell as a protein, or may be introduced into the cell as a polynucleotide encoding the sequence-specific endonuclease. For example, the mRNA of the sequence-specific endonuclease may be introduced, or an expression vector of the sequence-specific endonuclease may be introduced. In the expression vector, the coding sequence of the sequence-specific endonuclease (sequence-specific endonuclease gene) is functionally linked to a promoter. The promoter is not particularly limited, and for example, various pol II promoters can be used. Examples of pol II promoters include, but are not limited to, the CMV promoter, the EF1 promoter (EF1α promoter), the SV40 promoter, the MSCV promoter, the hTERT promoter, the β-actin promoter, the CAG promoter, and the CBh promoter.
 プロモーターは、誘導性プロモーターであってもよい。誘導性プロモーターは、プロモーターを駆動する誘導因子の存在下でのみ、当該プロモーターに機能的に連結されたポリヌクレオチドの発現を誘導することができるプロモーターである。誘導性プロモーターとしては、ヒートショックプロモーターなどの加熱により遺伝子発現を誘導するプロモーターが挙げられる。また、誘導性プロモーターには、プロモーターを駆動する誘導因子が薬剤であるプロモーターが挙げられる。このような薬剤誘導性プロモーターとしては、例えば、Cumateオペレーター配列、λオペレーター配列(例えば、12×λOp)、テトラサイクリン系誘導性プロモーター等が挙げられる。テトラサイクリン系誘導性プロモーターとしては、例えば、テトラサイクリンもしくはその誘導体(例えば、ドキシサイクリン)、またはリバーステトラサイクリン制御性トランス活性化因子(rtTA)の存在下で遺伝子発現を駆動するプロモーターが挙げられる。テトラサイクリン系誘導性プロモーターとしては、例えば、TRE3Gプロモーターが挙げられる。 The promoter may be an inducible promoter. An inducible promoter is a promoter that can induce expression of a polynucleotide functionally linked to the promoter only in the presence of an inducer that drives the promoter. Inducible promoters include promoters that induce gene expression by heating, such as heat shock promoters. Inducible promoters also include promoters in which the inducer that drives the promoter is a drug. Such drug-inducible promoters include, for example, cumate operator sequences, lambda operator sequences (e.g., 12×λOp), tetracycline-based inducible promoters, and the like. Tetracycline-based inducible promoters include, for example, promoters that drive gene expression in the presence of tetracycline or a derivative thereof (e.g., doxycycline), or reverse tetracycline-controlled transactivator (rtTA). Tetracycline-based inducible promoters include, for example, the TRE3G promoter.
 発現ベクターは、公知のものを特に制限なく用いることができる。発現ベクターとしては、例えば、プラスミドベクター、ウイルスベクターが挙げられる。配列特異的エンドヌクレアーゼがCasタンパク質である場合、発現ベクターは、Casタンパク質のコード配列(Casタンパク質遺伝子)に加えて、ガイドRNAコード配列(ガイドRNA遺伝子)及びを含んでいてもよい。この場合、ガイドRNAコード配列(ガイドRNA遺伝子)は、pol III系プロモーターに機能的にされていることが好ましい。pol III系プロモーターとしては、例えば、マウス及びヒトのU6-snRNAプロモーター、ヒトH1-RNase P RNAプロモーター、ヒトバリン-tRNAプロモーター等が挙げられる。 Any known expression vector can be used without particular limitation. Examples of expression vectors include plasmid vectors and viral vectors. When the sequence-specific endonuclease is a Cas protein, the expression vector may contain a guide RNA coding sequence (guide RNA gene) in addition to the coding sequence of the Cas protein (Cas protein gene). In this case, the guide RNA coding sequence (guide RNA gene) is preferably functionally linked to a pol III promoter. Examples of pol III promoters include mouse and human U6-snRNA promoters, the human H1-RNase P RNA promoter, and the human valine-tRNA promoter.
<(ii)選択マーカー用ドナーDNA>
 選択マーカー用ドナーDNAは、選択マーカーを標的領域にノックインするためのドナーDNAである。選択マーカー用ドナーDNAは、標的領域の上流側に隣接する塩基配列と相同な塩基配列を有する上流ホモロジーアームと、標的領域の下流側に隣接する塩基配列と相同な塩基配列を有する下流ホモロジーアームとの間に、1以上の選択マーカー遺伝子の塩基配列を含む。
<(ii) Donor DNA for selection marker>
The donor DNA for a selection marker is a donor DNA for knocking in a selection marker into a target region. The donor DNA for a selection marker contains the base sequence of one or more selection marker genes between an upstream homology arm having a base sequence homologous to a base sequence adjacent to the upstream side of the target region and a downstream homology arm having a base sequence homologous to a base sequence adjacent to the downstream side of the target region.
 選択マーカー用ドナーDNAは、特に限定されないが、例えば、1kb以上、2kb以上、3kb以上、4kb以上、5kb以上、6kb以上、7kb以上、8kb以上、9kb以上、9.5kb以上、または10kb以上の長さを有し得る。選択マーカー用ドナーDNAは、特に限定されないが、例えば、50kb以下、45kb以下、40kb以下、35kb以下、30kb以下、25kb以下、20kb以下、15kb以下、14kb以下、13kb以下、12kb以下、11kb以下、10kb以下、9kb以下、8kb以下、7kb以下、6kb以下、5kb以下、または4kb以下の長さを有し得る。 The donor DNA for the selection marker may have a length of, but is not limited to, 1 kb or more, 2 kb or more, 3 kb or more, 4 kb or more, 5 kb or more, 6 kb or more, 7 kb or more, 8 kb or more, 9 kb or more, 9.5 kb or more, or 10 kb or more. The donor DNA for the selection marker may have a length of, but is not limited to, 50 kb or less, 45 kb or less, 40 kb or less, 35 kb or less, 30 kb or less, 25 kb or less, 20 kb or less, 15 kb or less, 14 kb or less, 13 kb or less, 12 kb or less, 11 kb or less, 10 kb or less, 9 kb or less, 8 kb or less, 7 kb or less, 6 kb or less, 5 kb or less, or 4 kb or less.
 「選択マーカー」とは、その発現の有無に基づいて、細胞を選択することができるタンパク質を意味する。選択マーカー遺伝子は、選択マーカーをコードする遺伝子である。選択マーカー発現細胞及び非発現細胞が混在する細胞集団において、選択マーカー発現細胞を選択する場合、当該選択マーカーを「ポジティブ選択マーカー」または「ポジティブ選択用の選択マーカー」という。選択マーカー発現細胞及び非発現細胞が混在する細胞集団において、選択マーカー非発現細胞を選択する場合、当該選択マーカーを「ネガティブ選択マーカー」または「ネガティブ選択用の選択マーカー」という。選択マーカーが相互に異なるとは、相互に区別できること(例えば、区別可能に異なること)を意味し、例えば、選択マーカーが導入された細胞に付与する薬剤耐性の性質などの生理学的性質またはその他の物理化学的性質において少なくとも相互に区別できることを意味する。すなわち、選択マーカーが相互に異なるとは、異なる複数の選択マーカーが他の選択マーカーと区別可能に検出できること、または他の選択マーカーとは区別可能に薬剤選択できることを意味する。また、前記選択マーカー遺伝子が選択マーカー用ドナーDNAの種類毎に固有であるとは、選択マーカー用ドナーDNAの1種が有する選択マーカー遺伝子が、上記以外の種類の選択マーカー用ドナーDNAには含まれないこと、または、複数種類のドナーDNAに含まれている場合には、同時に2種類以上のドナーDNAから発現しないように構成されていることを意味する。このとき、2種類以上のドナーDNAは、選択マーカー以外は同一であってもよく、選択マーカー以外の配列および/または構成において相違があってもよい。 A "selection marker" means a protein that allows cells to be selected based on the presence or absence of its expression. A selection marker gene is a gene encoding a selection marker. When cells expressing the selection marker are selected from a cell population in which expressing and non-expressing cells are mixed, the selection marker is called a "positive selection marker" or a "selection marker for positive selection". When cells not expressing the selection marker are selected from such a population, the selection marker is called a "negative selection marker" or a "selection marker for negative selection". Selection markers being different from each other means that they can be distinguished from each other (e.g., are distinguishably different); for example, that they can be distinguished at least in a physiological property, such as the drug resistance conferred on cells into which the selection marker is introduced, or in another physicochemical property. 
In other words, selection markers being different from each other means that each of the different selection markers can be detected distinguishably from the other selection markers, or can be used for drug selection distinguishably from the other selection markers. Furthermore, the selection marker gene being unique to each type of donor DNA for selection markers means that the selection marker gene carried by one type of donor DNA for selection markers is not contained in the other types of donor DNA for selection markers or, when it is contained in multiple types of donor DNA, that it is configured so as not to be expressed from two or more types of donor DNA at the same time. In this case, the two or more types of donor DNA may be identical except for the selection marker, or may differ in sequence and/or structure other than the selection marker.
 ポジティブ選択マーカーは、それを発現する細胞を選択可能なものであれば、特に限定されない。ポジティブ選択マーカー遺伝子としては、例えば、薬剤耐性遺伝子、蛍光タンパク質遺伝子、発光酵素遺伝子、発色酵素遺伝子等が挙げられる。 The positive selection marker is not particularly limited as long as it allows the selection of cells expressing it. Examples of positive selection marker genes include drug resistance genes, fluorescent protein genes, luminescent enzyme genes, and chromogenic enzyme genes.
 ネガティブ選択マーカーは、それを発現しない細胞を選択可能なものであれば、特に限定されない。ネガティブ選択マーカー遺伝子としては、例えば、自殺遺伝子(チミジンカイネース等)、蛍光タンパク質遺伝子、発光酵素遺伝子、発色酵素遺伝子等が挙げられる。ネガティブ選択マーカー遺伝子が、細胞の生存に負の影響を与える遺伝子(例えば、自殺遺伝子)である場合、当該ネガティブ選択マーカー遺伝子は、誘導性プロモーターに機能的に連結され得る。誘導性プロモーターに機能的に連結することで、ネガティブ選択マーカー遺伝子を有する細胞を除去したいときにのみ、ネガティブ選択マーカー遺伝子を発現させることができる。ネガティブ選択マーカー遺伝子が、蛍光、発光、および発色等の光学的に検出可能なマーカー遺伝子(可視化マーカー遺伝子;visible marker gene)である場合など、細胞の生存に負の影響が少ない場合には、恒常的に発現させてもよい。 The negative selection marker is not particularly limited as long as it is capable of selecting cells that do not express it. Examples of negative selection marker genes include suicide genes (such as thymidine kinase), fluorescent protein genes, luminescent enzyme genes, and chromogenic enzyme genes. When the negative selection marker gene is a gene that has a negative effect on cell survival (such as a suicide gene), the negative selection marker gene can be functionally linked to an inducible promoter. By functionally linking the negative selection marker gene to an inducible promoter, the negative selection marker gene can be expressed only when it is desired to remove cells that have the negative selection marker gene. When the negative selection marker gene has little negative effect on cell survival, such as when it is an optically detectable marker gene (visible marker gene) that is fluorescent, luminescent, or chromogenic, it may be expressed constitutively.
 薬剤耐性遺伝子としては、例えば、ピューロマイシン耐性遺伝子、ブラスティサイディン耐性遺伝子、ジェネティシン耐性遺伝子、ネオマイシン耐性遺伝子、テトラサイクリン耐性遺伝子、カナマイシン耐性遺伝子、ゼオシン耐性遺伝子、ハイグロマイシン耐性遺伝子、クロラムフェニコール耐性遺伝子等が挙げられるが、これらに限定されない。
 蛍光タンパク質遺伝子としては、例えば、緑色蛍光タンパク質(GFP)遺伝子、黄色蛍光タンパク質(YFP)遺伝子、赤色蛍光タンパク質(RFP)遺伝子等が挙げられるが、これらに限定されない。
 発光酵素遺伝子としては、例えば、ルシフェラーゼ遺伝子などが挙げられるが、これらに限定されない。
 発色酵素遺伝子としては、例えば、βガラクトシターゼ遺伝子、βグルクロニダーゼ遺伝子、アルカリフォスファターゼ遺伝子等が挙げられるが、これらに限定されない。
 自殺遺伝子としては、例えば、単純ヘルペスウイルスのチミジンキナーゼ(HSV-TK)、誘導性カスパーゼ9(inducible caspase 9)等が挙げられるが、これらに限定されない。
Examples of drug resistance genes include, but are not limited to, a puromycin resistance gene, a blasticidin resistance gene, a geneticin resistance gene, a neomycin resistance gene, a tetracycline resistance gene, a kanamycin resistance gene, a zeocin resistance gene, a hygromycin resistance gene, and a chloramphenicol resistance gene.
Examples of fluorescent protein genes include, but are not limited to, green fluorescent protein (GFP) gene, yellow fluorescent protein (YFP) gene, red fluorescent protein (RFP) gene, and the like.
Examples of the luminescent enzyme gene include, but are not limited to, the luciferase gene.
Examples of chromogenic enzyme genes include, but are not limited to, β-galactosidase gene, β-glucuronidase gene, alkaline phosphatase gene, and the like.
Examples of suicide genes include, but are not limited to, herpes simplex virus thymidine kinase (HSV-TK), inducible caspase 9, and the like.
 選択マーカー用ドナーDNAが有する選択マーカー遺伝子は、ポジティブ選択マーカー遺伝子であることが好ましい。すなわち、選択マーカーを発現する細胞を、選択マーカー遺伝子がノックインされた細胞として、選択することができる。 The selection marker gene contained in the selection marker donor DNA is preferably a positive selection marker gene. In other words, cells expressing the selection marker can be selected as cells in which the selection marker gene has been knocked in.
 上流ホモロジーアームは、改変対象のゲノムにおいて、標的領域の上流側の塩基配列と相同組換え可能な塩基配列を有し、例えば、標的配列の上流側に隣接する塩基配列と相同な塩基配列を有する。下流ホモロジーアームは、改変対象のゲノムにおいて、標的領域の下流側の塩基配列と相同組換え可能な塩基配列を有し、例えば、標的配列の下流側に隣接する塩基配列と相同な塩基配列を有する。上流ホモロジーアーム及び下流ホモロジーアームは、標的領域の周辺領域と相同組換可能であれば、その長さ及び配列は特に限定されない。上流ホモロジーアーム及び下流ホモロジーアームは、相同組換え可能な限り、標的領域の上流側配列若しくは下流側配列と必ずしも完全一致する必要はない。例えば、上流ホモロジーアームは、標的領域の上流側に隣接する塩基配列と90%以上の配列同一性(相同性)を有する配列であることができ、92%以上、93%以上、94%以上、95%以上、96%以上、97%以上、98%以上、又は99%以上の配列同一性を有することが好ましい。例えば、下流ホモロジーアームは、標的領域の下流側に隣接する塩基配列と90%以上の配列同一性(相同性)を有する配列であることができ、92%以上、93%以上、94%以上、95%以上、96%以上、97%以上、98%以上、又は99%以上の配列同一性を有することが好ましい。また、上流ホモロジーアームと下流ホモロジーアームとは、少なくともそのいずれかが標的領域中の切断箇所により近いとアレルの改変効率をより高め得る。ここで、「近い」とは、2つの配列の距離が100bp以下、50bp以下、40bp以下、30bp以下、20bp以下、または10bp以下であることを意味し得る。 The upstream homology arm has a base sequence capable of homologous recombination with a base sequence upstream of the target region in the genome to be modified, for example, a base sequence homologous to a base sequence adjacent to the upstream side of the target sequence. The downstream homology arm has a base sequence capable of homologous recombination with a base sequence downstream of the target region in the genome to be modified, for example, a base sequence homologous to a base sequence adjacent to the downstream side of the target sequence. The length and sequence of the upstream homology arm and the downstream homology arm are not particularly limited as long as they are capable of homologous recombination with the surrounding region of the target region. The upstream homology arm and the downstream homology arm do not necessarily have to completely match the upstream or downstream sequence of the target region as long as they can perform homologous recombination. 
For example, the upstream homology arm can be a sequence having 90% or more sequence identity (homology) with the base sequence adjacent to the upstream side of the target region, and it is preferable that the sequence identity is 92% or more, 93% or more, 94% or more, 95% or more, 96% or more, 97% or more, 98% or more, or 99% or more. For example, the downstream homology arm can be a sequence having 90% or more sequence identity (homology) with the base sequence adjacent to the downstream side of the target region, and preferably has 92% or more, 93% or more, 94% or more, 95% or more, 96% or more, 97% or more, 98% or more, or 99% or more sequence identity. In addition, if at least one of the upstream homology arm and the downstream homology arm is closer to the cleavage site in the target region, the efficiency of allele modification can be further increased. Here, "close" can mean that the distance between the two sequences is 100 bp or less, 50 bp or less, 40 bp or less, 30 bp or less, 20 bp or less, or 10 bp or less.
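The sequence-identity criterion for homology arms (90% or more, with higher values preferred) can be sketched as follows. The helper names are hypothetical, and the sketch assumes a simple ungapped comparison of an arm against a genomic flanking sequence of the same length; a real comparison would normally use an alignment tool.

```python
def percent_identity(arm: str, flank: str) -> float:
    """Ungapped percent identity between a homology arm and a genomic flank."""
    if not arm or len(arm) != len(flank):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(arm.upper(), flank.upper()))
    return 100.0 * matches / len(arm)


def arm_meets_threshold(arm: str, flank: str, threshold: float = 90.0) -> bool:
    """True if the arm reaches the identity threshold (90% by default)."""
    return percent_identity(arm, flank) >= threshold


flank = "ACGTACGTAC"          # hypothetical flanking sequence (10 nt for brevity)
arm_one_diff = "ACGTACGTAA"   # 9 of 10 positions match
print(percent_identity(arm_one_diff, flank), arm_meets_threshold(arm_one_diff, flank))
```

Passing `threshold=95.0` or `threshold=99.0` would apply the stricter preferred values listed in the text.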
 選択マーカー用ドナーDNAにおいて、選択マーカー遺伝子は、上流ホモロジーアームと下流ホモロジーアームとの間に位置する。これにより、上記(i)のゲノム改変システムと共に選択マーカー用ドナーDNAを細胞に導入した場合に、HDRにより、選択マーカー遺伝子が標的領域に導入される(これにより遺伝子が破壊される場合、遺伝子ノックアウトといい、これにより所望の遺伝子が導入される場合、遺伝子ノックインという、遺伝子をノックアウトしつつ、別の遺伝子をノックインすることもできる)。 In the donor DNA for selection markers, the selection marker gene is located between the upstream homology arm and the downstream homology arm. As a result, when the donor DNA for selection markers is introduced into a cell together with the genome modification system of (i) above, the selection marker gene is introduced into the target region by homology-directed repair (HDR). (When a gene is disrupted in this way, it is called gene knockout; when a desired gene is introduced in this way, it is called gene knock-in; a gene can also be knocked out while another gene is knocked in.)
 選択マーカー遺伝子は、適切なプロモーターの制御下で発現されるように、プロモーターに機能的に連結されていることが好ましい。プロモーターは、ドナーDNAを導入する細胞の種類に応じて適宜選択することができる。プロモーターとしては、例えば、SRαプロモーター、SV40初期プロモーター、レトロウイルスのLTR、CMV(サイトメガロウイルス)プロモーター、RSV(ラウス肉腫ウイルス)プロモーター、HSV-TK(単純ヘルペスウイルスチミジンキナーゼ)プロモーター、EF1αプロモーター、メタロチオネインプロモーター、ヒートショックプロモーター等が挙げられる。選択マーカー用ドナーDNAは、エンハンサー、ポリA付加シグナル、ターミネーター等の任意の制御配列等を有していてもよい。 The selection marker gene is preferably functionally linked to a promoter so that it is expressed under the control of an appropriate promoter. The promoter can be appropriately selected depending on the type of cell into which the donor DNA is introduced. Examples of promoters include SRα promoter, SV40 early promoter, retroviral LTR, CMV (cytomegalovirus) promoter, RSV (Rous sarcoma virus) promoter, HSV-TK (herpes simplex virus thymidine kinase) promoter, EF1α promoter, metallothionein promoter, and heat shock promoter. The donor DNA for the selection marker may have any control sequence such as an enhancer, a polyA addition signal, or a terminator.
 選択マーカー用ドナーDNAは、インスレーター配列を有していてもよい。「インスレーター」とは、隣接する染色体環境の影響を遮断または緩和し、その領域に挟まれたDNAの転写調節の独立性を保証するまたは高める配列をいう。インスレーターは、エンハンサー遮断効果(エンハンサーとプロモーターの間に挿入することにより、エンハンサーによるプロモーター活性への影響を遮断する効果)、及び位置効果の抑制作用(導入遺伝子の両側をインスレーターで挟むことにより、導入遺伝子の発現を挿入されたゲノム上の位置に影響されないようにする作用)により定義される。選択マーカー用ドナーDNAは、上流アームと選択マーカー遺伝子との間(又は上流アームと選択マーカー遺伝子を制御するプロモーターとの間)に、インスレーター配列を有していてもよい。選択マーカー用ドナーDNAは、下流アームと選択マーカー遺伝子との間に、インスレーター配列を有していてもよい。 The donor DNA for the selection marker may have an insulator sequence. An "insulator" refers to a sequence that blocks or alleviates the influence of the adjacent chromosomal environment and ensures or enhances the independence of the transcriptional regulation of the DNA sandwiched between the regions. An insulator is defined by its enhancer blocking effect (the effect of blocking the effect of the enhancer on promoter activity by inserting it between an enhancer and a promoter) and its suppression effect on position effect (the effect of preventing the expression of the introduced gene from being affected by the position on the genome where it is inserted by sandwiching both sides of the introduced gene with insulators). The donor DNA for the selection marker may have an insulator sequence between the upstream arm and the selection marker gene (or between the upstream arm and the promoter that controls the selection marker gene). The donor DNA for the selection marker may have an insulator sequence between the downstream arm and the selection marker gene.
 選択マーカー用ドナーDNAは、直鎖状であってもよく、環状であってもよいが、環状であることが好ましい。好ましくは、選択マーカー用ドナーDNAは、プラスミドである。選択マーカー用ドナーDNAは、上記の配列に加えて、任意の配列を含み得る。例えば、上流ホモロジーアーム、インスレーター、選択マーカー遺伝子、及び下流ホモロジーアームの各配列間の全て又は一部に、スペーサー配列を含んでいてもよい。 The donor DNA for the selection marker may be linear or circular, but is preferably circular. Preferably, the donor DNA for the selection marker is a plasmid. The donor DNA for the selection marker may contain any sequence in addition to the above sequences. For example, it may contain a spacer sequence in all or part of the sequences between the upstream homology arm, the insulator, the selection marker gene, and the downstream homology arm.
 工程(a)では、ゲノム改変の対象とするアレルの数と同数以上の種類の選択マーカー用ドナーDNAを細胞に導入する。異なる種類の選択マーカー用ドナーDNAは、相互に異なる(区別可能な)種類の選択マーカー遺伝子を有する。ある態様では、異なる種類の選択マーカー用ドナーDNAは、完全に同一の選択マーカー遺伝子またはそのセットを有しない。つまり、第1の種類の選択マーカー用ドナーDNAは、第1の種類の選択マーカー遺伝子を有し、第2の種類の選択マーカー用ドナーDNAは、第2の種類の選択マーカー遺伝子を有する。第3の種類の選択マーカー用ドナーDNAは、第3の種類の選択マーカー遺伝子を有し、それ以降の種類の選択マーカー用ドナーDNAについても同様である。ゲノム改変の対象とするアレルが2個である場合、選択マーカー用ドナーDNAの種類は2種類以上である。ゲノム改変の対象とするアレルが3個である場合、選択マーカー用ドナーDNAの種類は3種類以上である。ある態様では、1つの選択マーカー用ドナーDNAが、2種類以上の相互に異なる(区別可能な)選択マーカーを有していてもよい(この場合であっても、異なる種類の選択マーカー用ドナーDNAは、相互に異なる(区別可能な)種類(例えば、固有)の選択マーカー遺伝子を有していなければならない)。ある態様では、選択マーカー用ドナーDNAは、部位特異的組換え酵素の組換え配列(例えば、Creリコンビナーゼにより組換えられるloxP配列およびその変種)を有しない。また、ある態様では、本発明の方法は、部位特異的組換え酵素及びその組換え配列(例えば、Creリコンビナーゼにより組換えられるloxP配列およびその変種)を用いない。部位特異的組換え酵素を用いると、通常は、編集後のゲノムに部位特異的組換え酵素の組換え配列が1つ残存する。これに対して、ある態様では、本発明の方法で得られる細胞の改変ゲノムは、部位特異的組換え酵素の組換え配列(外来である)を有しない。 In step (a), donor DNA for selection markers is introduced into cells in a number equal to or greater than the number of alleles to be modified in the genome. Different types of donor DNA for selection markers have mutually different (distinguishable) types of selection marker genes. In one embodiment, different types of donor DNA for selection markers do not have completely identical selection marker genes or sets thereof. That is, a first type of donor DNA for selection markers has a first type of selection marker gene, a second type of donor DNA for selection markers has a second type of selection marker gene, a third type of donor DNA for selection markers has a third type of selection marker gene, and so on for subsequent types of donor DNA for selection markers. When there are two alleles to be modified in the genome, there are two or more types of donor DNA for selection markers. When there are three alleles to be modified in the genome, there are three or more types of donor DNA for selection markers. 
In one aspect, one donor DNA for a selection marker may have two or more different (distinguishable) selection markers (even in this case, the different types of donor DNA for a selection marker must have different (distinguishable) types (e.g., unique) of selection marker genes). In one aspect, the donor DNA for a selection marker does not have a recombination sequence of a site-specific recombinase (e.g., a loxP sequence recombined by Cre recombinase and its variants). In another aspect, the method of the present invention does not use a site-specific recombinase and its recombination sequence (e.g., a loxP sequence recombined by Cre recombinase and its variants). When a site-specific recombinase is used, one recombination sequence of the site-specific recombinase usually remains in the edited genome. In contrast, in one aspect, the modified genome of the cell obtained by the method of the present invention does not have a recombination sequence (which is foreign) of a site-specific recombinase.
 選択マーカー用ドナーDNAの種類数は、ゲノム改変の対象とするアレルの数と同数以上であればよく、上限は特に限定されない。ゲノム改変の対象とするアレルの数と同数以上の種類の選択マーカー用ドナーDNAを用いることで、2つ以上のアレルを安定的に改変することができる。選択マーカー用ドナーDNAの種類数は、後述の工程(b)での選択操作の観点から、ゲノム改変の対象とするアレルの数と同数又は1~2多い程度が好ましく、ゲノム改変の対象とするアレルの数と同数であることがより好ましい。 The number of types of donor DNA for selection markers may be equal to or greater than the number of alleles to be targeted for genome modification, with no particular upper limit. By using donor DNA for selection markers of a number equal to or greater than the number of alleles to be targeted for genome modification, two or more alleles can be stably modified. From the viewpoint of the selection operation in step (b) described below, the number of types of donor DNA for selection markers is preferably equal to the number of alleles to be targeted for genome modification or about 1 to 2 more, and more preferably equal to the number of alleles to be targeted for genome modification.
 上記(i)と(ii)を細胞に導入する方法は特に限定されず、公知の方法を特に制限なく用いることができる。(i)及び(ii)を細胞に導入する方法としては、例えば、ウイルス感染法、リポフェクション法、マイクロインジェクション法、カルシウムリン酸法、DEAE-デキストラン法、エレクトロポーレーション法、パーティクルガン法等が挙げられるが、これらに限定されない。上記(i)と(ii)とを細胞に導入することにより、上記(i)の配列特異的核酸切断分子により、標的領域のDNAが切断された後、HDRにより(ii)の選択マーカー用ドナーDNA中の選択マーカーが標的領域にノックインされる。この際、2種以上の選択マーカー用ドナーDNAは、それぞれの上流ホモロジーアームおよび下流ホモロジーアームが同一である場合には、ランダムに標的領域の2つ以上のアレルにノックインされ得る。但し、2種以上の選択マーカー用ドナーDNAは、2以上のアレルそれぞれの標的領域の上流配列および下流配列と相同組換え可能な塩基配列をホモロジーアームの塩基配列をそれぞれ有している限り2以上のアレルそれぞれを改変できるので、完全に同一のホモロジーアームの塩基配列を有する必要はない。ある態様では、2種以上の選択マーカー用ドナーDNAにおいては、その上流および下流ホモロジーアームの塩基配列が、それぞれのアレルの標的領域の上流配列および下流配列とより同一性の高い塩基配列を有していてもよい(例えば、そのように最適化されていてもよい)。 The method of introducing (i) and (ii) into cells is not particularly limited, and known methods can be used without particular limitation. Examples of methods of introducing (i) and (ii) into cells include, but are not limited to, viral infection, lipofection, microinjection, calcium phosphate, DEAE-dextran, electroporation, and particle gun. By introducing (i) and (ii) into cells, the DNA in the target region is cleaved by the sequence-specific nucleic acid cleavage molecule (i) above, and then the selection marker in the donor DNA for selection marker (ii) is knocked into the target region by HDR. At this time, two or more types of donor DNA for selection marker can be knocked into two or more alleles of the target region randomly when the upstream homology arm and downstream homology arm of each are identical. However, two or more types of donor DNA for selection markers can modify each of the two or more alleles as long as the donor DNA for selection markers has a base sequence of a homology arm that can undergo homologous recombination with the upstream and downstream sequences of the target region of each of the two or more alleles, and therefore does not need to have completely identical base sequences of the homology arms. 
In one embodiment, the donor DNA for selection markers may have base sequences of the upstream and downstream homology arms that are more identical to the upstream and downstream sequences of the target region of each allele (e.g., may be optimized in this way).
 選択マーカー用ドナーDNAは、ある態様では、上流ホモロジーアームおよび下流ホモロジーアームを有し、上流ホモロジーアームおよび下流ホモロジーアームの間に、選択マーカー遺伝子を有し、好ましくは、メガヌクレアーゼの切断部位などのエンドヌクレアーゼ(塩基配列特異的核酸切断分子)の標的配列をさらに有し得る。この態様において、ある好ましい態様では、選択マーカーは、ポジティブ選択用の選択マーカー遺伝子およびネガティブ選択用のマーカー遺伝子を含む。別の好ましい態様では、選択マーカーは、ポジティブ選択用の選択マーカーを含むが、これとは別にネガティブ選択マーカー遺伝子を含まなくてもよい。ある好ましい態様では、ポジティブ選択用の選択マーカー遺伝子は、ネガティブ選択用にも用いられ得、そのようなマーカー遺伝子としては、可視化マーカー遺伝子が挙げられる。
 2以上の選択マーカー用ドナーDNAのセットは、上記の選択マーカー用ドナーDNAの組合せであり、かつ、それぞれが互いに区別可能なポジティブ選択用の選択マーカー遺伝子を有している。上記セットにおいては、メガヌクレアーゼの切断部位などのエンドヌクレアーゼ(塩基配列特異的核酸切断分子)の標的配列をさらに有し得、当該標的配列は互いに異なっていてもよいが、同一である(または同一の塩基配列特異的核酸切断分子で切断できる)ことが好ましい。選択マーカー用ドナーDNAの長さは上記の通りであるが、例えば、5kbp以上、8kbp以上、または10kbp以上であり得る。
In one embodiment, the donor DNA for the selection marker has an upstream homology arm and a downstream homology arm, has a selection marker gene between the upstream homology arm and the downstream homology arm, and preferably may further have a target sequence for an endonuclease (a base sequence-specific nucleic acid cleavage molecule), such as a cleavage site for a meganuclease. In this embodiment, in one preferred aspect, the selection marker includes a selection marker gene for positive selection and a marker gene for negative selection. In another preferred aspect, the selection marker includes a selection marker for positive selection but need not separately include a negative selection marker gene. In one preferred aspect, the selection marker gene for positive selection can also be used for negative selection; examples of such a marker gene include a visible marker gene.
A set of two or more donor DNAs for selection markers is a combination of the above donor DNAs for selection markers, and each of them has a selectable marker gene for positive selection that can be distinguished from the others. The above set may further have a target sequence for an endonuclease (base sequence-specific nucleic acid cutting molecule) such as a cleavage site of a meganuclease, and the target sequences may be different from each other, but are preferably the same (or can be cut by the same base sequence-specific nucleic acid cutting molecule). The length of the donor DNA for selection markers is as described above, but may be, for example, 5 kbp or more, 8 kbp or more, or 10 kbp or more.
(Step (b))
Step (b) is performed after step (a). In step (b), cells in which distinguishably different selection marker genes, or combinations thereof, have been introduced into two or more alleles are selected based on the expression of those distinguishably different selection marker genes. More specifically, in step (b), different types of selection marker donor DNA each undergo homologous recombination with one of the two or more alleles, so that a unique, distinguishably different selection marker gene is introduced into each of those alleles, and cells expressing all of the introduced distinguishably different selection marker genes are selected. In one aspect, in step (b), cells in which each allele has been modified by the introduction of a different selection marker donor DNA are selected based on the expression of all the selection marker genes that are carried by the two or more types of selection marker donor DNA and have been integrated into the chromosomal genome. In one aspect, in step (b), cells are selected based on all the selection marker genes carried by the two or more types of selection marker donor DNA. In one aspect, in step (b), cells in which each allele has been modified by the introduction of a distinguishable selection marker donor DNA are selected based on the expression of all the selection marker genes (marker genes for positive selection) that are carried by the two or more types of selection marker donor DNA and have been integrated into the chromosomal genome. In one aspect, in the cells obtained in step (b), each allele has a different marker gene for positive selection. In one aspect, the cells obtained in step (b) have a common marker gene for positive selection in each allele. Here, in one aspect, single-cell cloning is not performed in step (b) {although single-cell cloning may or may not be performed after cells in which two or more alleles have been modified are selected in step (b)}. In one aspect, in step (b), cells are selected based on the expression of the multiple distinguishable marker genes for positive selection introduced into the respective alleles. In one aspect, step (b) is not performed by a method that estimates the number of modified alleles from the expression intensity of a single selection marker gene (e.g., the expression intensity or fluorescence intensity of a fluorescent protein). This is because, when cells are selected by estimating the number of modified alleles from the strength of expression of a single selection marker gene, the gene expression level varies from cell to cell, which makes it difficult to completely separate cells in which two or more alleles are modified from cells in which only one allele is modified; single-cell cloning would therefore be required in step (b).
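The selection rule of step (b) can be illustrated with a minimal sketch. It assumes two fluorescent positive selection markers and a fixed positivity threshold per marker; the marker names, signal values, and thresholds are hypothetical and are not taken from the disclosure. A cell is kept only when every marker is positive, so a cell that is very bright in a single marker is not mistaken for a cell with two modified alleles.

```python
from typing import Dict, List

def select_all_marker_positive(
    signals: Dict[str, Dict[str, float]],
    thresholds: Dict[str, float],
) -> List[str]:
    """Return the IDs of cells whose signal exceeds the threshold for every marker."""
    return [
        cell_id
        for cell_id, cell_signal in signals.items()
        if all(cell_signal.get(marker, 0.0) > cutoff
               for marker, cutoff in thresholds.items())
    ]

# Hypothetical per-cell marker signals (arbitrary units).
signals = {
    "cell_1": {"marker_A": 950.0, "marker_B": 880.0},  # both markers expressed
    "cell_2": {"marker_A": 1900.0, "marker_B": 5.0},   # bright, but marker_A only
    "cell_3": {"marker_A": 3.0, "marker_B": 4.0},      # neither marker expressed
}
thresholds = {"marker_A": 100.0, "marker_B": 100.0}    # hypothetical cutoffs

selected = select_all_marker_positive(signals, thresholds)
```

In this toy data set only cell_1 passes, even though cell_2 has the highest total signal; gating on the presence of each distinct marker avoids the intensity-based estimate that the text describes as unreliable.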
In step (b), cells are selected in a manner appropriate to the type of selection marker gene used in step (a). Selection is based on the expression of all of the selection marker genes used in step (a).
For example, when the selection marker genes are positive selection marker genes, cells expressing all the selection marker genes that are (or have been) integrated into the chromosomal genome to be modified can be selected; for example, cells expressing as many positive selection markers as there are alleles to be modified can be selected. When a positive selection marker gene is a drug resistance gene, cells expressing that positive selection marker can be selected by culturing the cells in a medium containing the drug. When a positive selection marker gene is a fluorescent protein gene, a luminescent enzyme gene, or a chromogenic enzyme gene, cells expressing that positive selection marker can be selected by selecting cells that exhibit the fluorescence, luminescence, or color produced by the fluorescent protein, luminescent enzyme, or chromogenic enzyme. In this step, when as many selection marker donor DNAs as there are alleles to be modified have been incorporated into the genome, that number of alleles has been modified. In an n-ploid cell, the number of alleles to be modified is n or fewer; when the number of types of selection marker donor DNA incorporated into the genome is at least that number and at most n, at least the alleles to be modified (two or more alleles) have been modified. In one embodiment, the number of alleles to be modified is n, that number of types of selection marker donor DNA is incorporated into the chromosomal genome, and all alleles are modified. In one embodiment, because this step uses at least as many types of selection marker donor DNA as there are alleles to be modified, the number of positive selection markers expressed by a cell indicates that the corresponding number of alleles has been reliably modified. From the viewpoint of increasing the efficiency of cell selection in step (b), the number of alleles to be modified is preferably the same as the number of types of selection marker donor DNA.
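The counting argument above can be written as a short sketch. It rests on an assumption that should be stated explicitly: each distinct marker type is integrated only at the target locus, one type per allele, so off-target or random integration is ignored. The function names are hypothetical.

```python
def guaranteed_modified_alleles(ploidy: int, expressed_marker_types: int) -> int:
    """Lower bound on modified alleles, assuming one distinct marker type per allele
    and integration only at the target locus (off-target integration is ignored)."""
    if ploidy < 1:
        raise ValueError("ploidy must be at least 1")
    if expressed_marker_types < 0:
        raise ValueError("expressed_marker_types must not be negative")
    # A cell cannot have more modified alleles than it has alleles.
    return min(ploidy, expressed_marker_types)

def meets_target(ploidy: int, alleles_to_modify: int, expressed_marker_types: int) -> bool:
    """True when the expressed marker types imply that the target number of alleles is modified."""
    if not 2 <= alleles_to_modify <= ploidy:
        raise ValueError("alleles_to_modify must be between 2 and the ploidy")
    return guaranteed_modified_alleles(ploidy, expressed_marker_types) >= alleles_to_modify

# Diploid cell, both alleles targeted, two distinct markers expressed.
diploid_both_modified = meets_target(ploidy=2, alleles_to_modify=2, expressed_marker_types=2)
```

With two alleles targeted in a diploid cell, a cell expressing only one marker type fails the check, which corresponds to the text's preference for using as many donor DNA types as there are alleles to be modified.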
As described above, in the genome modification method of this embodiment, HDR is induced using n types of selection marker donor DNA to modify n alleles in an n-ploid cell, which makes it possible to efficiently obtain cells in which all of the cell's alleles have been modified. Furthermore, because cells in which all alleles have been modified can be obtained reliably, cells in which the target region has been modified can be obtained efficiently even when the target region is large (e.g., 10 kbp or more). Large-scale genome modification therefore also becomes possible.
In a preferred embodiment, the donor DNA for a selection marker may contain a negative selection marker gene in addition to the positive selection marker gene between the upstream homology arm and the downstream homology arm. In a preferred embodiment, the positive selection marker gene in the donor DNA for a selection marker may be a marker gene that can also serve for negative selection (a marker gene usable for both positive and negative selection). In a preferred embodiment, the positive selection marker gene may be a drug resistance gene. In a preferred embodiment, the positive selection marker gene may be, for example, a visualization marker gene that can also serve for negative selection.
The donor DNA for a selection marker contains a negative selection marker gene in addition to a positive selection marker gene between the upstream homology arm and the downstream homology arm, and may further contain a target nucleic acid sequence. The target nucleic acid sequence is a sequence that can be cleaved by the genome modification system described above. The target nucleic acid sequence is preferably an allele-specific sequence, so that, of the cassette of the first allele (the first cassette) and the cassette of the second allele (the second cassette), only the first allele or only the second allele can be cleaved. Inducing cleavage in an allele-specific manner in this way enables selective editing of a single allele. In a preferred embodiment, the donor DNA for a selection marker contains one target nucleic acid sequence between the upstream homology arm and the downstream homology arm. In a preferred embodiment, the donor DNA for a selection marker contains a first target nucleic acid sequence and a second target nucleic acid sequence between the upstream homology arm and the downstream homology arm, and contains a selection marker gene between the first and second target nucleic acid sequences. Another donor DNA for a selection marker contains a third target nucleic acid sequence and a fourth target nucleic acid sequence between the upstream homology arm and the downstream homology arm, and contains a selection marker gene between the third and fourth target nucleic acid sequences. The first and second target nucleic acid sequences may be identical or different, and the third and fourth target nucleic acid sequences may be identical or different. For example, if the sequences are designed so that the third and fourth target nucleic acid sequences are not cleaved when the first and second target nucleic acid sequences are cleaved, and/or so that the first and second target nucleic acid sequences are not cleaved when the third and fourth target nucleic acid sequences are cleaved, only one of the first cassette and the second cassette can be specifically cleaved, and that cassette alone can be selectively edited. It will be apparent that, when the first cassette and the second cassette are edited simultaneously, the first to fourth target nucleic acid sequences may all be identical.
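The design constraint on the first to fourth target nucleic acid sequences can be sketched as a simple check. The sequences below are toy placeholders, and cleavage is modeled as an exact match between the recognized sequence and a target sequence; real sequence-specific nucleases can tolerate mismatches, so this is an illustrative simplification rather than a design tool.

```python
from typing import Dict, List, Tuple

# Hypothetical toy target sequences; real recognition sites are system-specific.
CASSETTES: Dict[str, Tuple[str, str]] = {
    "first_cassette": ("ACGTACGTAA", "ACGTACGTAA"),   # first and second target sequences
    "second_cassette": ("TTGCATGCAA", "TTGCATGCAA"),  # third and fourth target sequences
}

def cassettes_cleaved(recognized_site: str,
                      cassettes: Dict[str, Tuple[str, str]]) -> List[str]:
    """Names of cassettes carrying a target sequence identical to the recognized site."""
    return [name for name, sites in cassettes.items() if recognized_site in sites]

def is_allele_specific(cassettes: Dict[str, Tuple[str, str]]) -> bool:
    """True when no target sequence is shared between any two cassettes."""
    names = list(cassettes)
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            if set(cassettes[first]) & set(cassettes[second]):
                return False
    return True

cut_by_first_site = cassettes_cleaved("ACGTACGTAA", CASSETTES)
```

When all four target sequences are identical, `is_allele_specific` returns False and a single nuclease cleaves both cassettes, which matches the case described above in which both cassettes are edited simultaneously.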
In one embodiment, in step (b), modified cells can be selected from the pool containing the cells obtained in step (a) without cloning the cells. Omitting the cloning step can shorten the time required for the process.
The first intermediate cell can be obtained from the pre-modification cell by steps (a) and (b) above (see step S1 in FIG. 1). In step (a), the donor DNA for a selection marker contains, between the upstream homology arm and the downstream homology arm for the target sequence, a positive selection marker gene and a negative selection marker gene, or a marker gene usable for both positive and negative selection. The donor DNA for a selection marker preferably contains two target nucleic acid sequences between the upstream homology arm and the downstream homology arm for the target sequence. The positive selection marker gene and the negative selection marker gene, or the marker gene usable for both positive and negative selection, are preferably located between these two target nucleic acid sequences. In this way, the first intermediate cell can be obtained by positive selection.
The first intermediate cell is a cell whose genome includes a first allele and a second allele at the locus to be modified, and in which each of the first allele and the second allele has a cassette containing a selection marker gene and a target nucleic acid sequence. In a preferred embodiment, in the first intermediate cell, the selection marker gene of the first allele is distinguishably different from the selection marker gene of the second allele. In a preferred embodiment, in the first intermediate cell, the target nucleic acid sequence is a target of a genome modification system and is designed so that the genome modification system can cleave the first allele and the second allele distinguishably. In a preferred embodiment, in the first intermediate cell, each selection marker gene is a negative selection marker gene that can be used for negative selection. A positive selection marker gene is useful for obtaining the first intermediate cell, but it is not needed in the steps that follow. The positive selection marker gene may therefore be removed. The removal can be performed, for example, by a genome editing technique. Thus, the first intermediate cell need not have a positive selection marker gene.
A second intermediate cell may be obtained from the first intermediate cell (see step S3 in FIG. 1). The second intermediate cell can be prepared by removing the second cassette from the first intermediate cell. The cassette can be removed by specifically cleaving the target nucleic acid sequence inside the second cassette, preferably in the presence of a donor DNA (cassette removal donor DNA) that includes an upstream homology arm capable of homologous recombination with the region upstream of the second cassette and a downstream homology arm capable of homologous recombination with the region downstream of the cassette. When the target nucleic acid sequence is cleaved in the presence of a donor DNA consisting of such an upstream homology arm and downstream homology arm, the regions upstream and downstream of the cassette can also be joined seamlessly upon removal of the cassette. The second intermediate cell thus obtained has, in the first allele, a cassette containing a selection marker gene and a target nucleic acid sequence, but does not contain the second cassette.
The library of the present disclosure can be prepared from the first intermediate cell and the second intermediate cell (see steps S2 and S4 in FIG. 1). The first intermediate cell and the second intermediate cell (collectively, "intermediate cells") have, in the first allele, a cassette containing a selection marker gene and a target nucleic acid sequence. In the library preparation step, the first cassette can be removed and a modified base sequence introduced in its place; that is, the first cassette can be replaced with a modified base sequence. This replacement can be performed with a genome modification system, by specifically cleaving the target nucleic acid sequence inside the first cassette in the presence of a donor DNA for introducing a modified base sequence (or donor DNA for library preparation) that includes an upstream homology arm capable of homologous recombination with the region upstream of the first cassette and a downstream homology arm capable of homologous recombination with the region downstream of the cassette. The donor DNA for library preparation has the modified base sequence between the upstream homology arm and the downstream homology arm. In principle, the genomic region lying between the region that recombines with the upstream homology arm and the region that recombines with the downstream homology arm is replaced with the sequence lying between the upstream and downstream homology arms of the donor DNA for library preparation, so a donor DNA for library preparation that carries the desired post-replacement sequence between its homology arms is preferably used. The above operation can thus replace the cassette with the modified base sequence. The donor DNA for library preparation can be a group of DNAs containing various modified base sequences, in which case the above operation replaces the cassette in each intermediate cell with one of the various modified base sequences. When the cassette is replaced, the negative selection marker gene in the cassette is removed, so cells in which the cassette has been replaced with a modified base sequence can be obtained using the absence of expression of the negative selection marker gene as an indicator. In one embodiment, the donor DNA for library preparation is linear (not circular). This can offer an advantage in the ease of preparing the donor DNA for library preparation (see, for example, disadvantage 2 in FIG. 4). Replacement of the cassette with the modified base sequence can be confirmed by a person skilled in the art using well-known, conventional techniques, for example, by the presence or absence of cleavage by a restriction enzyme, the presence or absence of PCR amplification (e.g., junction PCR), or sequencing. The modified base sequence may be 0 bases long, i.e., absent, but is preferably 1 base or more in length. The length of the modified base sequence is not particularly limited and may be, for example, 1 to 1,000,000 bases, 1 to 500,000 bases, 1 to 100,000 bases, 1 to 20,000 bases, 1 to 15,000 bases, or 1 to 10,000 bases. The modified base sequence is not particularly limited and may have a length of, for example, 3 bases or more, 10 bases or more, 30 bases or more, 50 bases or more, or 100 bases or more. The modified base sequences of the individual donor DNAs for library preparation may be the same or different, and each may independently have any of the above lengths.
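The replacement rule described above (the genomic region between the two arm-matching regions is exchanged for whatever lies between the donor's arms) can be sketched on plain strings. This is only an illustration of the bookkeeping, not a model of HDR: the sequences are toy placeholders, the arms are assumed to match the genome exactly, and the function name is hypothetical.

```python
from typing import List

def replace_between_arms(genome: str, upstream_arm: str,
                         downstream_arm: str, payload: str) -> str:
    """Replace the region between the two arm-matching sites with the donor payload."""
    up_start = genome.find(upstream_arm)
    if up_start < 0:
        raise ValueError("upstream homology arm not found in genome")
    up_end = up_start + len(upstream_arm)
    down_start = genome.find(downstream_arm, up_end)
    if down_start < 0:
        raise ValueError("downstream homology arm not found after upstream arm")
    return genome[:up_end] + payload + genome[down_start:]

UPSTREAM, DOWNSTREAM = "AAACCC", "GGGAAA"   # toy homology arms
CASSETTE = "TTTTTTTT"                       # stands in for the first cassette
intermediate_allele = UPSTREAM + CASSETTE + DOWNSTREAM

# A small donor library: each payload is one modified base sequence.
payloads: List[str] = ["ACGT", "ACCT", ""]  # "" models a 0-base modified sequence
library_alleles = [
    replace_between_arms(intermediate_allele, UPSTREAM, DOWNSTREAM, p)
    for p in payloads
]
```

The empty payload yields the upstream and downstream regions joined directly, with no bases added or lost, which corresponds to the seamless linkage mentioned for cassette removal.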
When the intermediate cell has three or more alleles in the target region, it is sufficient that at least the first allele has a unique, distinguishable negative marker gene. This makes it possible to remove the cassettes of the alleles other than the first allele, or to replace the cassette of the first allele with a modified base sequence while maintaining the cassettes of the other alleles. Accordingly, the intermediate cell may be a cell in which the first allele alone, or at least the first allele, has a unique, distinguishable negative marker gene, and such a cell can be selected and used as the intermediate cell.
The method of the present disclosure need not solve any of disadvantages 1 to 3 shown in FIG. 4, but preferably can solve one or more of them. Specifically, the method of the present disclosure achieves a higher efficiency of recombination of a foreign gene into the target genome than the Gateway™ method and the LoxP/Cre method. In a preferred embodiment of the present disclosure, the donor DNA for introducing a modified base sequence is linear and not circular. In a preferred embodiment of the present disclosure, the modified cell has no recognition site for a site-specific recombinase. In a preferred embodiment of the method of the present disclosure, the efficiency of recombination of a foreign gene into the target genome is higher than with the Gateway™ method and the LoxP/Cre method, the donor DNA for introducing a modified base sequence is linear, and the modified cell has no recognition site for a site-specific recombinase. The absence of a recognition site for a site-specific recombinase is beneficial, for example, when seamlessly linking an introduced cassette to the genome.
The library of the present disclosure includes
a combination of a plurality of aqueous compositions, wherein
each aqueous composition contains one type of modified cell;
each modified cell has a first allele and a second allele at the locus to be modified (the target locus or locus of interest); and
each modified cell has, at the same position in the first allele, a cassette containing a DNA fragment that differs between the aqueous compositions.
Such a library can be obtained as follows. For example, the target nucleic acid sequence of the first cassette of the intermediate cell is cleaved in the presence of donor DNA for library preparation containing various modified base sequences. The cell's own DNA damage repair mechanism then replaces the first cassette of the intermediate cell with a modified base sequence. Cells having a modified base sequence can be subjected to single-cell cloning. Single-cell cloning may include forming a large number of droplets or aqueous compositions each containing one cell and culturing them so that a cell clone derived from a single cell is produced in each droplet or aqueous composition. This yields a plurality (or a large number) of aqueous compositions containing cell clones. A combination of such aqueous compositions can be used as a library of modified cells.
In one embodiment of the intermediate cell or the modified cell, the first allele lacks part or all of the target region (of the original genome). In one embodiment of the intermediate cell or the modified cell, the second allele lacks part or all of the target region. In a preferred embodiment of the intermediate cell or the modified cell, the first allele and the second allele lack part or all, more preferably all, of the target region.
In one embodiment of the first intermediate cell, the entire target region of the first allele is replaced by the first cassette. In one embodiment of the first intermediate cell, the entire target region of the second allele is replaced by the second cassette. In a preferred embodiment of the first intermediate cell, the entire target regions of the first allele and the second allele are replaced by the first cassette and the second cassette, respectively.
In the second intermediate cell, the second cassette has been removed. In a preferred embodiment, in the second intermediate cell, the regions upstream and downstream of the target region of the second allele (of the original genome) are seamlessly linked. Seamlessly linked means that the upstream and downstream regions are joined without the addition of new bases. In a seamless linkage, the upstream and downstream regions are preferably joined without either the addition or the loss of bases.
In one embodiment of the modified cell, the first allele contains a modified base sequence, and the modified base sequence has, relative to the target region (of the original genome; also called the replaced sequence), one or more mutations selected from the group consisting of addition, insertion, substitution, deletion, and removal of bases. The modified cell can be used advantageously to evaluate the effect of a mutation by comparison with the original cell (reference cell). In addition, because the library contains a variety of cells that differ in their mutations, comparison between cells is advantageous for evaluating the function of each mutated site in the target region.
In one embodiment of the first modified cell, the second allele contains a cassette containing a selection marker. The selection marker may include a positive selection marker gene. In one embodiment of the second modified cell, in the second allele, the regions upstream and downstream of the replaced sequence are seamlessly linked. In a preferred embodiment of the modified cell, the replaced sequence of the first allele and the replaced sequence of the second allele are corresponding sequences. Corresponding sequences are sequences whose start and end points lie at the same positions on the genome. Corresponding sequences typically have high identity (e.g., 80% or more, 90% or more, or 95% or more) and the same length.
 In one embodiment, each cassette sequence in the modified cells consists of one or more modified portions (A) and one or more unmodified portions (B) (see, for example, FIG. 3). Each modified portion (A) has one or more modifications selected from the group consisting of sequence insertion, deletion, and substitution, and the modifications of the one or more modified portions (A) differ among the aqueous compositions in the position or content of the modification. Each of the one or more unmodified portions (B) is identical to the sequence of the corresponding site before modification. Here, two nucleic acid sequences are sequences of corresponding sites if they are placed at the same position when the pre-modification sequence replaced by the insertion cassette is aligned with the sequence in the insertion cassette. The unmodified portion (B1) on the centromere side of the cassette is seamlessly linked to the adjacent sequence (C1) on the centromere side of the cassette, and the unmodified portion (Bt) on the telomere side of the cassette is seamlessly linked to the adjacent sequence (C2) on the telomere side of the cassette. The region in which the adjacent sequence (C1) and the unmodified portion (B1) are linked, and the region in which the unmodified portion (Bt) and the adjacent sequence (C2) are linked, may each constitute the same sequence as the corresponding region before modification. To incorporate such a cassette into DNA, an intermediate cell may be modified using a donor DNA for library production that has the structure of the cassette between an upstream homology arm and a downstream homology arm. In a preferred embodiment, the total length of the modified portions (A) may be 90% or less, 80% or less, 70% or less, 60% or less, 50% or less, 40% or less, 30% or less, 20% or less, 10% or less, or 5% or less of the length of the insertion cassette. A library containing modified cells of this embodiment may be a library having different mutations at different positions within the same region, and may preferably be used, for example, to investigate which mutation at which position changes cell activity, or to screen for cells having desired properties. The donor DNA for introducing a modified base sequence may have the above cassette structure between the upstream homology arm and the downstream homology arm. The donor DNA for introducing a modified base sequence may be included in a library of such donor DNAs having different cassette structures.
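The split of a cassette into modified portions (A) and unmodified portions (B), and the share of the cassette occupied by (A), can be illustrated with a minimal sketch. It assumes that the pre-modification sequence and the cassette sequence are already supplied as a gapped pairwise alignment of equal length (with `-` for gaps), and that the modified fraction is counted per alignment column. Both are simplifying assumptions of this example, and the function names are hypothetical.

```python
from itertools import groupby


def split_portions(original_aln, cassette_aln):
    """Split a pairwise alignment into runs of unmodified (B) and modified (A) columns.

    Returns a list of (kind, start, end) tuples with half-open column ranges.
    """
    if len(original_aln) != len(cassette_aln):
        raise ValueError("aligned sequences must have equal length")
    # True where the cassette matches the pre-modification sequence.
    same_flags = [o == c for o, c in zip(original_aln, cassette_aln)]
    portions = []
    pos = 0
    for same, run in groupby(same_flags):
        length = len(list(run))
        portions.append(("B" if same else "A", pos, pos + length))
        pos += length
    return portions


def modified_fraction(original_aln, cassette_aln):
    """Fraction of alignment columns that belong to modified portions (A)."""
    if not cassette_aln:
        raise ValueError("empty alignment")
    portions = split_portions(original_aln, cassette_aln)
    modified = sum(end - start for kind, start, end in portions if kind == "A")
    return modified / len(cassette_aln)


if __name__ == "__main__":
    original = "ACGTACGTACGTACGTACGT"
    cassette = "ACGTACGAACGTAC--ACGT"  # one substitution and one 2-base deletion
    print(split_portions(original, cassette))
    print(modified_fraction(original, cassette))
```

In this toy alignment the first and last runs are unmodified, which matches the requirement that the flanking portions (B1) and (Bt) join the adjacent sequences (C1) and (C2) seamlessly.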
 In one embodiment, the modified base sequence in the modified cell consists entirely of mutations.
 In one embodiment, no recombination sequence (recognition site) for a site-specific recombinase is present inside the insertion cassette or outside it in its vicinity. In one embodiment, nothing other than the insertion cassette is modified; that is, the rest of the genome has the same base sequence as the cell before modification. This yields modified cells that contain no modification other than the intended one, and removes any unanticipated effects of unintended modifications (see, for example, drawback 3 in FIG. 4).
 The library is not particularly limited, but preferably contains 4 or more, 5 or more, 6 or more, 7 or more, 8 or more, 9 or more, 10 or more, 11 or more, 12 or more, 13 or more, 14 or more, 15 or more, 16 or more, 17 or more, 18 or more, 19 or more, 20 or more, 25 or more, 30 or more, 35 or more, 40 or more, 45 or more, 50 or more, 60 or more, 70 or more, 80 or more, 90 or more, 100 or more, 150 or more, 200 or more, 300 or more, 400 or more, 500 or more, 600 or more, 700 or more, 800 or more, 900 or more, 1000 or more, 2000 or more, 3000 or more, 4000 or more, 5000 or more, 6000 or more, 7000 or more, 8000 or more, 9000 or more, or 10000 or more types of modified cells (or aqueous compositions containing modified cells).
 A library typically comprises a combination of a plurality of aqueous compositions. Each aqueous composition contains one type of cell. In some embodiments, the library contains the plurality of aqueous compositions separately from one another. In some embodiments, depending on the purpose, the library may contain the plurality of aqueous compositions as a mixture. For example, when screening for cells with high proliferative capacity or viability, the library may contain the aqueous compositions as a mixture. Even in that case, the cells with the highest proliferative capacity or viability become enriched during culture, so such cells can be obtained by analyzing the cells after culture.
 In one embodiment, within a single library, the genomic sequences of the modified cells contained in the library are designed to be identical except for the modified base sequence (or DNA fragment). In one embodiment, within a single library, the genomic sequences of the modified cells contained in the library are substantially identical except for the modified base sequence (or DNA fragment). "Substantially identical" tolerates differences arising from mutations that may occur between cells when cloned cells are simply passaged 10 times in an environment suitable for cell culture (a normal environment).
 Because everything other than the modified base sequence is shared, the library is suited, for example, to evaluating the effect of the modified base sequence. Removing the cassette from the second allele minimizes any possible effect of the cassette. Removing the cassette from the second allele seamlessly minimizes any possible effect of additional new bases being introduced into the second allele.
 In another embodiment, the cassette in the second allele of the intermediate cell (the second cassette) may also be replaced by a second modified base sequence. The second modified base sequence may be common to the modified cells (i.e., the same sequence across the modified cells) or may differ for each aqueous composition. In some cases, the second modified base sequence may differ from cell to cell even within the same aqueous composition. The second modified base sequence may be the same as or different from the modified base sequence in the first allele (the first modified base sequence). The second modified base sequence can be designed in the same manner as the modified base sequence in the first allele, and can be introduced in the same manner as the modified base sequence is introduced into the first allele. The second cassette can be replaced by the second modified base sequence by specifically cleaving the vicinity of the second cassette (preferably a target nucleic acid sequence within the second cassette) in the presence of a second donor DNA for library production. The content of the second modified base sequence and the method of introducing it are the same as described for the first modified base sequence; that description applies here and is not repeated.
Application Example 1
 Application Example 1 applies the present disclosure to the analysis of a specific region of a genome. According to the present disclosure, important bases in a specific region of a genome can be identified by introducing various mutations individually into the base sequence of that region and observing the gain or loss of function of the region caused by each mutation. Examples of the specific region include a region of unknown function, a promoter region, an enhancer region, a region corresponding to an intron, a region corresponding to a 5' untranslated region (UTR), a region corresponding to a 3' untranslated region (UTR), and a region encoding a non-coding RNA.
Application Example 2
 Application Example 2 applies the present disclosure to regulating the expression level of a protein or RNA. In this application example, a region that is involved, or suspected of being involved, in controlling the expression level of the protein or RNA is modified to regulate that expression level. Such regions include a transcriptional control region of the protein or RNA (including a region suspected of being involved in transcriptional control) and a translational control region of the protein (including a region suspected of being involved in translational control). Application Example 2 thereby yields modified cells in which the expression level of the protein or RNA is regulated. The regulation may be an increase or a decrease in the expression level. The RNA may be mRNA, tRNA, rRNA, or another non-coding RNA (e.g., microRNA).
Application Example 3
 Application Example 3 applies the present disclosure to a coding region of a protein or RNA. According to the present disclosure, important amino acids or important sequences for the function of a protein or RNA can be identified by introducing various mutations individually into the region encoding the protein or RNA and observing the functional change (e.g., gain or loss of function) caused by each mutation. In addition, for example, by observing the gain or loss of function of the protein or RNA caused by the mutations, it is possible to obtain a mutant protein or RNA having an improved function, a reduced function, or a new function, as well as a modified cell expressing the mutant protein or RNA. When diverse modifications are made to part or all of a protein or RNA, it is desirable that the part or all containing the modified site be linked in frame and seamlessly to the region encoding the protein or RNA. The RNA may be mRNA, tRNA, rRNA, or another non-coding RNA (e.g., microRNA). Application Example 3 can also be combined with Application Example 2 above to obtain a modified cell that expresses the mutant protein or RNA and in which its expression level is regulated.
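The requirement that a modified segment be linked in frame to the coding region can be checked mechanically. The following is a minimal sketch, assuming DNA sequences given as uppercase strings on the coding strand and the standard stop codons. The function names are hypothetical and the check is deliberately simple (it does not consider splicing or alternative genetic codes).

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard genetic code, DNA alphabet


def keeps_reading_frame(original_segment, modified_segment):
    """True if replacing the original segment changes the length by a multiple of 3."""
    return (len(modified_segment) - len(original_segment)) % 3 == 0


def has_premature_stop(cds):
    """True if any codon before the final one is a stop codon."""
    internal_codons = [cds[i:i + 3] for i in range(0, len(cds) - 3, 3)]
    return any(codon in STOP_CODONS for codon in internal_codons)


if __name__ == "__main__":
    print(keeps_reading_frame("ATGGCC", "ATGGCCGCC"))  # 3-base insertion
    print(keeps_reading_frame("ATGGCC", "ATGGCCG"))    # 1-base insertion
    print(has_premature_stop("ATGTAAGCCTGA"))
```

A donor design that fails either check would be expected to disrupt the downstream coding sequence rather than produce a point variant of the protein.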
Application Example 4
 Application Example 4 applies the present disclosure to screening for cells with high proliferative capacity or viability. According to the present disclosure, a library can be obtained that contains diverse modified cells having different mutations in a genomic region that may be involved in cell proliferation or viability. Such a library may contain the aqueous compositions of the respective cell types separately, or may be a mixture of those aqueous compositions. The mixture preferably contains the respective modified cells in equal amounts. When the mixture is cultured in an environment suitable for cell culture, or in the presence of a selective pressure on the cells, cells with high proliferative capacity or viability increase in relative concentration and cells with low proliferative capacity or viability decrease in relative concentration. After culture, the cells with high proliferative capacity or viability are therefore enriched, which makes them easier to obtain. Screening can also be performed under conditions in which a specific selective pressure is applied, which allows screening for cells with high proliferative capacity or viability under that selective pressure. The selective pressure is not particularly limited; examples include poor nutrition, high salt concentration, low salt concentration, high temperature, low temperature, low oxygen, and the presence of a drug (e.g., a physiologically active substance such as a poison or an antibiotic). An existing gene on the genome can be modified by replacing the gene with a modified gene. Alternatively, a modified base sequence can simply be inserted into a safe harbor region (e.g., the AAVS1 locus, the ROSA26 locus, the CLYBL locus, the CXCR4 locus, or the CCR5 locus).
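The change in relative concentration during pooled culture can be quantified once per-variant counts are available before and after culture (for example from sequencing the modified region, which is an assumption of this sketch rather than something the text prescribes). The example below computes a log2 enrichment score per variant. The pseudocount, the variant names, and the counts are all illustrative.

```python
import math


def relative_abundance(counts):
    """Convert raw counts into fractions of the pooled population."""
    total = sum(counts.values())
    if total <= 0:
        raise ValueError("counts must sum to a positive number")
    return {name: value / total for name, value in counts.items()}


def log2_enrichment(before, after, pseudocount=0.5):
    """log2 ratio of relative abundance after culture to that before culture.

    A pseudocount avoids division by zero for variants that drop out.
    """
    before_frac = relative_abundance(
        {name: before[name] + pseudocount for name in before}
    )
    after_frac = relative_abundance(
        {name: after.get(name, 0) + pseudocount for name in before}
    )
    return {name: math.log2(after_frac[name] / before_frac[name]) for name in before}


if __name__ == "__main__":
    # Hypothetical counts: equal representation at the start, as the text prefers.
    before = {"variant_1": 100, "variant_2": 100, "variant_3": 100}
    after = {"variant_1": 400, "variant_2": 100, "variant_3": 25}
    scores = log2_enrichment(before, after)
    for name, score in sorted(scores.items(), key=lambda item: -item[1]):
        print(f"{name}: {score:+.2f}")
```

Positive scores indicate variants that gained relative concentration in the pool, and negative scores indicate variants that lost it. Note that a variant whose absolute count is unchanged can still score negative when others expand.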
 In one embodiment, the present invention provides a cell in which two or more alleles of the chromosomal genome have been modified, each of the two or more alleles having a selection marker gene that is different (distinguishable) from the others. In one aspect, the cell may be a cell of a unicellular organism. In one aspect, the cell may be an isolated cell. In one aspect, the cell may be selected from the group consisting of pluripotent cells and pluripotent stem cells (such as embryonic stem cells and induced pluripotent stem cells). In one aspect, the cell may be a tissue stem cell. In one aspect, the cell may be a somatic cell. In one aspect, the cell may be a germline cell (e.g., a germ cell). In one aspect, the cell may be a cell line. In one aspect, the cell may be an immortalized cell. In one aspect, the cell may be a cancer cell. In one aspect, the cell may be a non-cancerous cell. In one aspect, the cell may be a cell of a patient with a disease. In one aspect, the cell may be a cell of a healthy individual. In one aspect, the cell may be an animal cell (e.g., a human cell), for example a cell selected from the group consisting of insect cells (e.g., silkworm cells), HEK293 cells, HEK293T cells, Expi293F™ cells, FreeStyle™ 293F cells, Chinese hamster ovary cells (CHO cells), CHO-S cells, CHO-K1 cells, ExpiCHO cells, and cells derived from these cells. In a preferred aspect, all alleles of the target region of the chromosomal genome are modified in the cell, and the modified regions each have a selection marker gene that is different (distinguishable) from the others.
 In one embodiment, a method is provided for culturing a cell in which two or more alleles of the chromosomal genome have been modified, each of the two or more alleles having a selection marker gene that is different (distinguishable) from the others. When the selection marker genes are drug resistance marker genes, the cell can be cultured in the presence of the drug corresponding to each drug resistance marker gene. Culture can be carried out under conditions suitable for maintaining or growing the cell.
 In one embodiment, the present invention provides a non-human organism having a chromosomal genome in which two or more alleles have been modified, each of the two or more alleles having a selection marker gene that is different (distinguishable) from the others. In one aspect, the cell may be a cell of a unicellular organism. In one aspect, the non-human organism may be a yeast, for example a fission yeast or a budding yeast, such as an organism selected from the group consisting of yeasts of the genus Saccharomyces (e.g., Saccharomyces cerevisiae, Saccharomyces carlsbergensis, Saccharomyces fragilis, and Saccharomyces rouxii), the genus Candida (e.g., Candida utilis and Candida tropicalis), the genus Pichia, the genus Kluyveromyces, the genus Yarrowia, the genus Hansenula, and the genus Endomyces. In one aspect, the non-human organism may be a filamentous fungus (e.g., Aspergillus, Trichoderma, Humicola, Acremonium, Fusarium, and Penicillium species). In one aspect, the non-human organism may be a multicellular organism. In one aspect, the non-human organism may be a non-human animal. In one aspect, the non-human organism may be a plant. In a preferred aspect, all alleles of the target region of the chromosomal genome are modified in the non-human organism, and the modified regions each have a selection marker gene that is different (distinguishable) from the others.
 In the above cell, one or more desired genes, such as genes necessary for cell survival or proliferation, may be contained in, or gathered into, another region of the chromosomal genome. The other region may be, for example, a safe harbor region (e.g., the AAVS1 locus, the ROSA26 locus, the CLYBL locus, the CXCR4 locus, or the CCR5 locus). The other region may also be, for example, the region of (ii) above having a deletion.
 As shown in FIG. 1, a first intermediate cell is obtained from a pre-modification cell (referred to as a reference cell). The step of obtaining the first intermediate cell from the pre-modification cell is called step S1. A library of first modified cells (hereinafter simply the "first library") can be produced from the first intermediate cell. The step of obtaining the first library from the first intermediate cell is called step S2. A second intermediate cell can be obtained from the first intermediate cell. The step of obtaining the second intermediate cell from the first intermediate cell is called step S3. A library of second modified cells (hereinafter simply the "second library") can be produced from the second intermediate cell. The step of obtaining the second library from the second intermediate cell is called step S4.
 A reference cell is prepared. The reference cell is the cell from which a library is to be made. The reference cell is, for example, a eukaryotic cell, and can be used to produce the library of the present disclosure. The reference cell may be a naturally occurring cell or a cell that already carries a modification. The reference cell is typically diploid, but may be triploid or of higher ploidy.
 For example, the first intermediate cell can be produced as shown in FIG. 2. Specifically, the target region on the genome of the reference cell is replaced with a cassette containing a selection marker gene. In FIG. 2, two types of donor DNA are subjected to homologous recombination with the target region. The two donor DNAs contain distinguishably different drug selection marker genes (for positive selection) and distinguishably different visualization marker genes (for negative selection), and each has CRISPR/Cas9 target sequences at both ends (gRNA1 to gRNA4 in total). After homologous recombination, drug selection with the two drugs is performed to select cells in which the fragments derived from the two donor DNAs have replaced the target sequences on the paternal and maternal chromosomes, respectively (see step (1) in FIG. 2). The cells that survive selection are cells that have distinguishably different drug resistance genes in the target regions of the two alleles (first intermediate cells). The first intermediate cells can be subjected to single-cell cloning. Insertion of the cassette containing the selection marker gene at the intended position can then be confirmed.
 Next, either the paternal or the maternal cassette is removed. For example, as shown in FIG. 2, the target sequences (gRNA1 and gRNA2) located at both ends of the cassette that replaced the paternal target region can be cleaved by the CRISPR/Cas9 system in the presence of a donor DNA for cassette removal. The donor DNA for cassette removal consists of an upstream homology arm capable of homologous recombination with the sequence upstream of the target region on the paternal allele and a downstream homology arm capable of homologous recombination with the sequence downstream of that target region. Cells in which the GFP signal has disappeared are then sorted with a cell sorter, giving a second intermediate cell whose genome has the upstream and downstream sequences of the target region seamlessly joined in the paternal allele (i.e., the entire target region has been lost).
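The selection logic of the two steps above (dual drug selection for the first intermediate cell, then sorting for loss of the paternal fluorescent signal) can be expressed as a toy model. This sketch only tracks which marker genes each allele carries. The marker names `drugR_1`, `drugR_2`, and `RFP` are placeholders, since the text names only GFP and otherwise requires the markers to be distinguishable.

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class Cell:
    """Marker genes carried at the target region of each allele."""
    paternal: FrozenSet[str]
    maternal: FrozenSet[str]

    def markers(self) -> FrozenSet[str]:
        return self.paternal | self.maternal


def survives_dual_drug_selection(cell, required=("drugR_1", "drugR_2")):
    """A cell survives only if it carries every required resistance gene."""
    return all(marker in cell.markers() for marker in required)


def is_reporter_negative(cell, reporter="GFP"):
    """True if the given fluorescent reporter is absent from both alleles."""
    return reporter not in cell.markers()


if __name__ == "__main__":
    both_replaced = Cell(frozenset({"drugR_1", "GFP"}), frozenset({"drugR_2", "RFP"}))
    one_replaced = Cell(frozenset({"drugR_1", "GFP"}), frozenset())
    paternal_removed = Cell(frozenset(), frozenset({"drugR_2", "RFP"}))

    print(survives_dual_drug_selection(both_replaced))  # first intermediate cell
    print(survives_dual_drug_selection(one_replaced))   # only one allele replaced
    print(is_reporter_negative(paternal_removed))       # second intermediate cell
```

The model makes the reason for using distinguishable markers concrete: with identical markers on both alleles, neither filter could tell which allele had been edited.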
 A library of second modified cells (the second library) can be produced from the second intermediate cell. The target sequences (gRNA3 and gRNA4) located at both ends of the maternal cassette of the second intermediate cell can be cleaved by the CRISPR/Cas9 system in the presence of a donor DNA for introducing a modified base sequence (a donor DNA for library production). This donor DNA has an upstream homology arm capable of homologous recombination with the sequence upstream of the target region on the maternal allele and a downstream homology arm capable of homologous recombination with the sequence downstream of that target region, and contains a modified base sequence between the two homology arms. This yields a modified cell whose genome contains the modified base sequence between the upstream and downstream sequences of the target region in the maternal allele. By preparing donor DNAs that carry different modified base sequences, modified cells having different modified base sequences can be obtained, and the second library is thereby obtained.
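The donor layout described above (upstream homology arm, modified base sequence, downstream homology arm, with a different modified sequence per donor) can be sketched as a small data structure. This is an illustration only. The class and function names, arm sequences, and variant names are hypothetical, and real homology arms would be far longer than the toy strings used here.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class DonorDNA:
    """One donor: a modified sequence flanked by shared homology arms."""
    name: str
    upstream_arm: str
    modified_sequence: str
    downstream_arm: str

    @property
    def sequence(self) -> str:
        return self.upstream_arm + self.modified_sequence + self.downstream_arm


def build_donor_library(
    upstream_arm: str, downstream_arm: str, variants: Dict[str, str]
) -> List[DonorDNA]:
    """Build one donor per variant; every donor shares the same two arms."""
    if len(set(variants.values())) != len(variants):
        raise ValueError("each donor must carry a distinct modified sequence")
    return [
        DonorDNA(name, upstream_arm, seq, downstream_arm)
        for name, seq in variants.items()
    ]


if __name__ == "__main__":
    library = build_donor_library(
        upstream_arm="AAAACCCC",
        downstream_arm="GGGGTTTT",
        variants={"variant_1": "ACGTACGT", "variant_2": "ACGAACGT"},
    )
    for donor in library:
        print(donor.name, donor.sequence)
```

Keeping the arms fixed and varying only the middle segment mirrors the intent that the resulting modified cells differ from one another only in the modified base sequence.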
 As described above, the donor DNA for introducing a modified base sequence consists of one or more modified portions (A) and one or more unmodified portions (B), and, apart from the modified portions (A), can have the same sequence as the corresponding region of the genome before modification.

Claims (15)

  1.  A library of modified cells, wherein:
     the library comprises a combination of a plurality of aqueous compositions;
     each aqueous composition contains one type of modified cell;
     each modified cell has a first allele and a second allele at a locus to be modified; and
     each modified cell has, at the same position in the first allele, a cassette containing a DNA fragment that differs among the aqueous compositions.
  2.  The library according to claim 1, wherein, in each modified cell, the second allele has a disruption or deletion of part or all thereof.
  3.  The library according to claim 1 or 2, wherein the second allele seamlessly lacks said part or all.
  4.  The library according to any one of claims 1 to 3, wherein, in each modified cell, the sequence other than the cassette containing the DNA fragment is substantially identical before and after modification.
  5.  The library according to any one of claims 1 to 4, wherein each of the sequences of the cassettes consists of one or more modified portions (A) and one or more unmodified portions (B); each modified portion (A) has one or more modifications selected from the group consisting of sequence insertion, deletion, and substitution; the modifications of the one or more modified portions differ among the aqueous compositions in the position or content of the modification; each of the one or more unmodified portions (B) is identical to the sequence of the corresponding site before modification; the unmodified portion (B1) on the centromere side of the cassette is seamlessly linked to the adjacent sequence (C1) on the centromere side of the cassette; the unmodified portion (Bt) on the telomere side of the cassette is seamlessly linked to the adjacent sequence (C2) on the telomere side of the cassette; and the region in which the adjacent sequence (C1) and the unmodified portion (B1) are linked, and the region in which the unmodified portion (Bt) and the adjacent sequence (C2) are linked, each constitute the same sequence as the corresponding region before modification.
  6.  The library according to any one of claims 1 to 5, wherein the modified cells do not contain a target sequence for a site-specific recombinase.
  7.  The library according to any one of claims 1 to 6, wherein the library contains 50 or more types of the aqueous compositions.
  8.  A method for producing a library of modified cells, the method comprising:
    (α) providing a group of cells each having a genome that includes a first allele and a second allele at a locus to be modified, the first allele and the second allele each having a cassette that contains a selection marker gene and a target nucleic acid sequence,
     wherein the selection marker gene of the first allele and the selection marker gene of the second allele are distinguishably different, the target nucleic acid sequence is a target of a genome modification system and is designed so that the genome modification system can cleave the first allele and the second allele distinguishably, and each selection marker gene is a negative selection marker gene usable for negative selection;
    (β) introducing the following (x) and (y) into the provided group of cells:
    (x) a genome modification system comprising a sequence-specific nucleic acid cleavage molecule that targets the unique base sequence contained in the first allele, or a polynucleotide encoding the sequence-specific nucleic acid cleavage molecule;
    (y) a plurality of types of second donor DNA for recombination {wherein each of the plurality of types of second donor DNA for recombination has an upstream homology arm having a base sequence homologous to the base sequence adjacent to the upstream side of the target site of (x) above and a downstream homology arm having a base sequence homologous to the base sequence adjacent to the downstream side of the target region, and contains a modified base sequence between the upstream homology arm and the downstream homology arm, the modified base sequence differing for each second donor DNA for recombination and being unique to each second donor DNA for recombination}; and
    (γ) after step (β), selecting cells that do not express the selection marker gene contained in the first allele,
     whereby a library of modified cells comprising a plurality of cells is obtained, wherein, in the plurality of cells obtained, the first allele has a modified base sequence unique to each cell and the second allele has a sequence common to the cells.
  9.  The method according to claim 8, further comprising, before step (α):
     in a cell having a genome that includes a first allele and a second allele at a locus to be modified, replacing a sequence to be replaced that is contained in the first allele and the second allele with a cassette containing a selection marker gene and a target nucleic acid sequence, thereby removing the sequence to be replaced from the first allele and the second allele.
  10.  The method according to claim 9,
     wherein each modified base sequence of the first allele has, relative to the sequence to be replaced of the first allele, one or more mutations selected from the group consisting of addition, insertion, substitution, deletion, and removal of bases.
  11.  The method according to any one of claims 8 to 10,
     wherein the sequence to be modified is a protein-coding region, and
     the modified base sequence of the first allele has, relative to the sequence to be replaced of the first allele, one or more mutations selected from the group consisting of addition, insertion, substitution, deletion, and removal of bases.
  12.  The method according to claim 10 or 11, wherein the modified base sequence has a sequence identity of 80% or more with the sequence to be modified.
  13.  The method according to any one of claims 8 to 12, further comprising, between step (α) and step (β),
    removing the cassette from the second allele.
  14.  The method according to claim 13, wherein the cassette is removed seamlessly.
  15.  A library of modified cells comprising a plurality of modified cells, produced by the method according to any one of claims 8 to 14.
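
The selection logic of claims 8 and 12 can be illustrated with a toy model. The sketch below is not the applicant's implementation and does not simulate any real genome editing tool. All names (`Allele`, `Cell`, `apply_donor`, `negative_select`, `percent_identity`), the marker labels, and the short sequences are hypothetical and chosen only for illustration. It assumes that a successful recombination replaces the whole cassette on the first allele, that the second allele is never cut, and that identity is computed over pre-aligned sequences of equal length.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Allele:
    # Negative selection marker carried by the cassette; None once the cassette is replaced.
    marker: Optional[str]
    # Sequence at the locus (a cassette placeholder or a modified base sequence).
    sequence: str


@dataclass(frozen=True)
class Cell:
    first: Allele
    second: Allele


def percent_identity(a: str, b: str) -> float:
    """Percent identity of two pre-aligned, equal-length sequences (no gaps)."""
    if not a or len(a) != len(b):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


def apply_donor(cell: Cell, donor: Optional[str]) -> Cell:
    """Step (beta), simplified: a donor replaces the cassette on the first allele only.

    donor=None models a cell in which no recombination took place.
    """
    if donor is None:
        return cell
    return Cell(first=Allele(marker=None, sequence=donor), second=cell.second)


def negative_select(cells: List[Cell], marker: str) -> List[Cell]:
    """Step (gamma), simplified: keep cells whose first allele no longer carries the marker."""
    return [c for c in cells if c.first.marker != marker]


# Hypothetical sequence to be modified and starting cell from step (alpha).
REFERENCE = "ACGTACGTAC"
start = Cell(Allele("marker_A", "CASSETTE_A"), Allele("marker_B", "CASSETTE_B"))

# One outcome per cell: a distinct donor sequence, or None when editing failed.
outcomes = ["ACGTACGTAA", None, "ACGTACGTTT"]

edited = [apply_donor(start, donor) for donor in outcomes]
library = negative_select(edited, "marker_A")
identities = [percent_identity(REFERENCE, c.first.sequence) for c in library]
```

In this toy run the unedited cell is discarded by the negative selection, the surviving cells each carry a different first-allele sequence, and they all share the same second allele. That is the property stated at the end of claim 8. The `identities` list corresponds to the kind of check implied by the 80% threshold in claim 12.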

PCT/JP2023/036147 2022-10-05 2023-10-04 Cell library and method for producing same WO2024075756A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP2022-160946 2022-10-05

Publications (1)

Publication Number Publication Date
WO2024075756A1 (en) 2024-04-11

Family

ID=90608255

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2023/036147 WO2024075756A1 (en) 2022-10-05 2023-10-04 Cell library and method for producing same

Country Status (1)

Country Link
WO (1) WO2024075756A1 (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2004525624A (en) * 2001-01-25 2004-08-26 エボルバ バイオテック アクティーゼルスカブ Cell collection library
JP2005503778A (en) * 2001-05-30 2005-02-10 クロモス・モレキユラー・システムズ・インコーポレーテツド Chromosome-based platform
JP2005503826A (en) * 2001-10-01 2005-02-10 ドイチェス クレブスフォルシュンクスツェントルム スチフトゥング デス エッフェントリヒェン レヒツ Method for producing protein library and method for selecting protein therefrom
JP2014529998A (en) * 2011-09-19 2014-11-17 カイマブ・リミテッド Antibodies, variable domains and chains made for human use

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application

Ref document number: 23874882

Country of ref document: EP

Kind code of ref document: A1