WO2022169235A1 - Composition for prime editing with improved editing efficiency - Google Patents

Composition for prime editing with improved editing efficiency

Info

Publication number
WO2022169235A1
Authority
WO
WIPO (PCT)
Prior art keywords
editing
prime
sequence
hype2
gene
Prior art date
Application number
PCT/KR2022/001611
Other languages
English (en)
Korean (ko)
Inventor
김형범
송명재
임정민
Original Assignee
연세대학교 산학협력단
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 연세대학교 산학협력단
Publication of WO2022169235A1

Classifications

    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/63 Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
    • C CHEMISTRY; METALLURGY
    • C07 ORGANIC CHEMISTRY
    • C07K PEPTIDES
    • C07K14/00 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/10 Processes for the isolation, preparation or purification of DNA or RNA
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/11 DNA or RNA fragments; Modified forms thereof; Non-coding nucleic acids having a biological activity
    • C12N15/113 Non-coding nucleic acids modulating the expression of genes, e.g. antisense oligonucleotides; Antisense DNA or RNA; Triplex-forming oligonucleotides; Catalytic nucleic acids, e.g. ribozymes; Nucleic acids used in co-suppression or gene silencing
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/11 DNA or RNA fragments; Modified forms thereof; Non-coding nucleic acids having a biological activity
    • C12N15/62 DNA sequences coding for fusion proteins
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/87 Introduction of foreign genetic material using processes not otherwise provided for, e.g. co-transformation
    • C12N15/90 Stable introduction of foreign DNA into chromosome
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N9/00 Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
    • C12N9/10 Transferases (2.)
    • C12N9/12 Transferases (2.) transferring phosphorus containing groups, e.g. kinases (2.7)
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N9/00 Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
    • C12N9/10 Transferases (2.)
    • C12N9/12 Transferases (2.) transferring phosphorus containing groups, e.g. kinases (2.7)
    • C12N9/1241 Nucleotidyltransferases (2.7.7)
    • C12N9/1276 RNA-directed DNA polymerase (2.7.7.49), i.e. reverse transcriptase or telomerase
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N9/00 Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
    • C12N9/14 Hydrolases (3)
    • C12N9/16 Hydrolases (3) acting on ester bonds (3.1)
    • C12N9/22 Ribonucleases RNAses, DNAses
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N2310/00 Structure or type of the nucleic acid
    • C12N2310/10 Type of nucleic acid
    • C12N2310/20 Type of nucleic acid involving clustered regularly interspaced short palindromic repeats [CRISPRs]
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Y ENZYMES
    • C12Y207/00 Transferases transferring phosphorus-containing groups (2.7)
    • C12Y207/07 Nucleotidyltransferases (2.7.7)
    • C12Y207/07049 RNA-directed DNA polymerase (2.7.7.49), i.e. telomerase or reverse-transcriptase

Definitions

  • the present invention relates to gene editing complexes and their use in prime editing.
  • Prime editing is a novel genome editing method capable of introducing genetic changes of virtually any size without the need for donor DNA or double-strand breaks (DSBs) (Anzalone, A.V. et al. Search-and-replace genome editing without double-strand breaks or donor DNA. Nature 576, 149-157 (2019)). These changes include insertions, deletions, and all 12 possible point mutations, as well as combinations of these changes.
  • DSBs double-strand breaks
  • A prime editor basically consists of a Cas9 nickase-reverse transcriptase (RT) fusion protein and a prime editing guide RNA (pegRNA).
  • the pegRNA contains a guide sequence that recognizes a target sequence, a tracrRNA scaffold sequence, a primer binding site (PBS) required for initiation of reverse transcription, and an RT template that includes the desired genetic change and is homologous to the target sequence.
  • Four types of prime editor have been developed: PE1, PE2, PE3, and PE3b.
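The pegRNA components listed above can be sketched as a small data structure. This is a minimal illustration only: the class name, field names, and sequences are hypothetical placeholders, and the 5'-to-3' order used here (guide, scaffold, RT template, PBS) is an assumption taken from the general prime editing literature, not something this text states.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PegRNA:
    """Toy pegRNA model; field names and sequences are illustrative only."""
    guide: str        # recognizes the target sequence
    scaffold: str     # tracrRNA scaffold sequence
    rt_template: str  # carries the desired change plus homology to the target
    pbs: str          # primer binding site that initiates reverse transcription

    def sequence(self) -> str:
        # Assumed 5'->3' layout: guide, scaffold, RT template, PBS.
        return self.guide + self.scaffold + self.rt_template + self.pbs


# Placeholder RNA strings, not sequences from the patent.
peg = PegRNA(
    guide="GGCCCAGACUGAGCACGUGA",
    scaffold="GUUUUAGAGCUAGAAAUAGC",
    rt_template="UCUGCCAUCAAAG",
    pbs="CGUGCUCAGUCUG",
)
print(len(peg.sequence()), "nt")
```

Keeping the four parts as separate fields makes it easy to vary one of them, for example the PBS length, while leaving the others fixed.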
  • PE2 is known to be capable of inducing prime editing in cell types of various species including liver cells.
  • PE2 is sometimes not efficient enough for gene editing. Therefore, to further improve prime editing efficiency, PE3 and PE3b can be generated by adding a single guide RNA (sgRNA) to PE2.
  • sgRNA single guide RNA
  • the present inventors, in the course of efforts to improve prime editing efficiency, modified the existing PE2 and confirmed that the gene editing efficiency of the resulting PE2 variant is significantly improved, thereby completing the present invention.
  • One aspect is to provide a gene editing complex comprising: 1) a fusion protein comprising a Cas nickase, a reverse transcriptase (RT), and a single-stranded DNA-binding domain (ssDBD); and 2) a prime editing guide RNA (pegRNA).
  • ssDBD single-stranded DNA-binding domain
  • pegRNA prime editing guide RNA
  • Another aspect is to provide a composition for gene editing comprising: 1) a fusion protein comprising a Cas nickase, a reverse transcriptase (RT), and a single-stranded DNA-binding domain (ssDBD), a polynucleotide encoding the fusion protein, or a recombinant vector comprising the polynucleotide; and 2) a prime editing guide RNA (pegRNA), or a recombinant vector comprising a polynucleotide encoding the same.
  • Another aspect is to provide a method for editing a gene of a subject, comprising introducing the composition for gene editing into isolated eukaryotic cells or eukaryotic organisms other than humans.
  • One aspect provides a gene editing complex comprising: 1) a fusion protein comprising a Cas nickase, a reverse transcriptase (RT), and a single-stranded DNA-binding domain (ssDBD); and 2) a prime editing guide RNA (pegRNA).
  • the term "gene editing" refers to a genetic engineering technique for manipulating genes or DNA using artificially engineered nucleases (gene scissors), in which gene function is lost, altered, and/or restored (modified) through the deletion, insertion, or substitution of one or more nucleotides (e.g., 1-100,000 bp, 1-10,000 bp, 1-1,000 bp, 1-100 bp, 1-70 bp, 1-50 bp, 1-30 bp, 1-20 bp, or 1-10 bp). Specifically, it may include DNA or gene knockout, knockdown, gene insertion (knock-in), gene correction, gene expression regulation, or chromosome rearrangement.
  • the gene knockout may refer to the regulation of gene activity, e.g., inactivation, by deletion, substitution, and/or insertion of all or part of a gene (e.g., one or more nucleotides).
  • the gene inactivation refers to a modification that suppresses or downregulates the expression of a gene, or that causes the gene to encode a protein that has lost its original function.
  • gene regulation may mean a change in the function of a gene resulting from, for example, structural modification of the protein caused by deletion of exon sites through simultaneous targeting of both intron sites flanking one or more exons of the target gene, expression of a dominant negative form of the protein, or expression of a competitive inhibitor secreted in soluble form.
  • the gene insertion refers to using genetic recombination technology to insert an exogenous base sequence, derived from another species or otherwise not originally present in the organism, into the genome of the organism or into a DNA sequence derived from the organism.
  • the gene editing complex is a complex for prime editing, and may specifically edit a gene through prime editing.
  • prime editing refers to a genome editing method, a fourth-generation gene scissors technology, capable of introducing a genetic change by cutting only one strand of DNA, without a DNA double-strand break.
  • the prime editing may be performed by a "prime editor (PE)" or a variant thereof.
  • the prime editor may include a Cas nickase-reverse transcriptase (RT) fusion protein and a prime editing guide RNA (pegRNA), and additional domains or proteins may be further included in the fusion protein to improve the efficiency of the prime editing.
  • the type of the prime editor includes, but is not limited to, PE1, PE2, PE3, PE3b, and the like or variants thereof.
  • the prime editor may be one in which a single-stranded DNA-binding domain (ssDBD) is additionally included in prime editor 2 (PE2); in the Examples and Experimental Examples below, this PE2 variant is named hyPE2.
  • the prime editor includes a prime editor protein (fusion protein) in which a Cas nickase, a reverse transcriptase, and a single-stranded DNA-binding domain are fused, and a prime editing guide RNA (pegRNA).
  • Prime editor as used herein narrowly refers to a prime editor protein (fusion protein) in which a Cas nickase, a single-stranded DNA-binding domain, and a reverse transcriptase are fused; broadly, it may refer to a prime editor complex in which the prime editor protein and the prime editing guide RNA form a complex.
  • the prime editor may mean only the Cas nickase-ssDBD-RT fusion protein, or may mean the Cas nickase-ssDBD-RT fusion protein together with the pegRNA.
  • introduction of the prime editor here may mean introducing only the Cas nickase-ssDBD-RT fusion protein. That is, when the pegRNA has already been introduced, the introduction of the prime editor may mean introducing only the Cas nickase-ssDBD-RT fusion protein.
  • the prime editor may refer to a Cas nickase-ssDBD-RT fusion protein.
  • the "Cas nickase" used in the prime editor has nickase activity that cuts (nicks) a single strand of DNA. The Cas protein may be one modified to have nickase activity, and any Cas protein that is delivered to a target site together with the pegRNA and has been modified so that only a single strand is specifically cleaved can be used without limitation.
  • the Cas nickase may be one or more selected from the group consisting of Cas9 nickase, SaCas9 nickase, SpCas9 nickase, Cpf1 nickase, Cas3 nickase, Cas8a-c nickase, Cas10 nickase, Cse1 nickase, Csy1 nickase, Csn2 nickase, Cas4 nickase, Csm2 nickase, Cmr5 nickase, Csf1 nickase, C2c2 nickase, NgAgo nickase, Cas12e nickase, Cas12d nickase, Cas12a nickase, Cas12b1 nickase, Cas13a nickase, Cas12c nickase, and variants thereof; it may specifically be a Cas9 nickase, and more specifically a Cas9 H840A nickase.
  • RT reverse transcriptase
  • single-stranded DNA-binding domain refers to an independently folded protein domain comprising one or more structural motifs that recognize single-stranded DNA.
  • the DNA-binding domain may recognize a specific DNA sequence or have a general affinity for DNA, and some DNA-binding domains may include a nucleic acid in a folded structure.
  • the single-stranded DNA binding domain may be Rad51 DBD or RPA70 DBD, specifically Rad51 DBD.
  • fusion protein refers to a protein artificially synthesized such that a Cas nickase, a reverse transcriptase, and a single-stranded DNA binding domain are bound.
  • the single-stranded DNA binding domain may be connected to the C-terminus of the Cas nickase and the N-terminus of the reverse transcriptase. Accordingly, the fusion protein may include a recombinant protein in which the Cas nickase, the single-stranded DNA binding domain, and the reverse transcriptase are linked in that order from the N-terminus.
  • the Cas nickase may include the amino acid sequence of SEQ ID NO: 1, and specifically, may consist of the amino acid sequence of SEQ ID NO: 1.
  • the single-stranded DNA binding domain may include the amino acid sequence of SEQ ID NO: 2, and specifically, may consist of the amino acid sequence of SEQ ID NO: 2.
  • the reverse transcriptase may include the amino acid sequence of SEQ ID NO: 3, specifically, may consist of the amino acid sequence of SEQ ID NO: 3.
  • the Cas nickase, the single-stranded DNA binding domain, and the reverse transcriptase may be directly linked to each other or linked through a linker, but may be specifically linked through a linker.
  • the linker is not particularly limited as long as the fusion protein retains its activity; for example, it may be formed from amino acids such as glycine, alanine, leucine, isoleucine, proline, serine, threonine, asparagine, aspartic acid, cysteine, glutamine, glutamic acid, lysine, and arginine, and may consist of 1 to 40 linked amino acids.
  • a linker connecting the Cas nickase and the single-stranded DNA-binding domain may be referred to as a first linker, and
  • a linker connecting the single-stranded DNA-binding domain and the reverse transcriptase may be referred to as a second linker.
  • the first linker may include the amino acid sequence of SEQ ID NO: 4, specifically, may consist of the amino acid sequence of SEQ ID NO: 4.
  • the second linker may include the amino acid sequence of SEQ ID NO: 5, specifically, may consist of the amino acid sequence of SEQ ID NO: 5.
  • the fusion protein may include a recombinant protein in which the Cas nickase, the first linker, the single-stranded DNA binding domain, the second linker, and the reverse transcriptase are linked in that order from the N-terminus.
  • the fusion protein may include the amino acid sequence of SEQ ID NO: 6, specifically, may consist of the amino acid sequence of SEQ ID NO: 6.
  • the fusion protein may further include a nuclear localization sequence (NLS).
  • NLS nuclear localization sequence
  • nuclear localization sequence or signal refers to an amino acid sequence that serves to transport a specific substance (e.g., a protein) into the cell nucleus, generally through a nuclear pore (Kalderon D, et al., Cell 39:499-509 (1984); Dingwall C, et al., J Cell Biol. 107(3):841-9 (1988)).
  • Although nuclear localization sequences are not required for gene editing activity in eukaryotes, it is believed that including such sequences enhances the activity of the system, particularly for targeting nucleic acid molecules in the nucleus.
  • the nuclear localization sequence may include the amino acid sequence of SEQ ID NO: 7, specifically, may consist of the amino acid sequence of SEQ ID NO: 7.
  • the nuclear localization sequence may be added to the N-terminus and/or C-terminus of the fusion protein, and specifically may be added to the N-terminus and C-terminus.
  • the fusion protein may include a recombinant protein in which the nuclear localization sequence, the Cas nickase, the first linker, the single-stranded DNA binding domain, the second linker, the reverse transcriptase, and the nuclear localization sequence are linked in that order from the N-terminus.
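The N-terminus-to-C-terminus domain order described above can be sketched as a simple concatenation. This is a hedged illustration: the function name, the dictionary keys, and the bracketed tokens are hypothetical placeholders standing in for the actual amino acid sequences (SEQ ID NOs: 1 to 5 and 7), which are not reproduced here.

```python
# Domain order stated in the text, N-terminus first.
DOMAIN_ORDER = (
    "NLS_N", "Cas nickase", "first linker", "ssDBD",
    "second linker", "RT", "NLS_C",
)


def assemble_fusion(parts, order=DOMAIN_ORDER):
    """Concatenate domain sequences in N->C order; fail on a missing domain."""
    missing = [name for name in order if name not in parts]
    if missing:
        raise KeyError(f"missing domains: {missing}")
    return "".join(parts[name] for name in order)


# Bracketed tokens are placeholders, not real amino acid sequences.
parts = {
    "NLS_N": "[NLS]",
    "Cas nickase": "[Cas9n]",
    "first linker": "[L1]",
    "ssDBD": "[ssDBD]",
    "second linker": "[L2]",
    "RT": "[RT]",
    "NLS_C": "[NLS]",
}
fusion = assemble_fusion(parts)
print(fusion)
```

Passing a shorter `order` tuple (for example without the two NLS entries) gives the Cas nickase, linker, ssDBD, linker, RT arrangement described for the fusion protein without nuclear localization sequences.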
  • the nuclear localization sequence, Cas nickase, first linker, single-stranded DNA binding domain, second linker, reverse transcriptase, and the fusion protein comprising some or all of them include, without limitation, not only the amino acid sequences described in the respective SEQ ID NOs but also amino acid sequences exhibiting 80% or more, specifically 90% or more, more specifically 95% or more, still more specifically 98% or more, and most specifically 99% or more homology with those sequences, provided that the amino acid sequence expresses a protein exhibiting substantially the same or corresponding biological activity as the respective protein, even where part of the sequence has been deleted, modified, substituted, or added.
  • homology refers to the degree of similarity between the amino acid sequence constituting the protein or the polynucleotide sequence encoding the same.
  • When the homology is sufficiently high, the expression product of the polynucleotide (gene) and the protein are identical or have a similar activity.
  • homology can be expressed as a percentage according to the degree of correspondence with a given amino acid sequence or polynucleotide sequence.
  • a homologous sequence having the same or similar activity to a given amino acid sequence or nucleotide sequence is expressed as "% homology".
  • Homology can be determined using standard software that calculates parameters such as score, identity, and similarity, specifically BLAST 2.0, or by comparing sequences in hybridization experiments under defined stringent conditions. Appropriate hybridization conditions are within the technical scope and are well known to those skilled in the art (e.g., J. Sambrook et al., Molecular Cloning, A Laboratory Manual, 2nd Edition, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, New York, 1989; F.M. Ausubel et al., Current Protocols in Molecular Biology, John Wiley & Sons, Inc., New York).
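As a rough illustration of expressing homology as a percentage, the sketch below computes percent identity for two sequences that have already been aligned to equal length. It is a simplified stand-in, not the BLAST 2.0 calculation: the function name and the convention of counting every column that is not a gap in both sequences are assumptions made for this example.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of alignment columns where both sequences carry the same residue.

    Inputs must already be aligned (equal length, gaps written as '-').
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to equal length")
    # Ignore columns that are gaps in both sequences.
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == gap and b == gap)]
    if not columns:
        raise ValueError("no aligned positions")
    matches = sum(1 for a, b in columns if a == b and a != gap)
    return 100.0 * matches / len(columns)


# Toy peptides that differ at one of ten positions.
print(percent_identity("MKTAYIAKQR", "MKTAYLAKQR"))
```

A variant would then be compared against thresholds such as the 80%, 90%, 95%, 98%, or 99% levels named in the text.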
  • PBS primer binding site
  • the guide sequence refers to a sequence in the guide RNA that specifies a target site, and includes a sequence that is fully or partially complementary to the target sequence.
  • the guide sequence is any polynucleotide sequence having sufficient complementarity with the target polynucleotide sequence to hybridize with the target DNA sequence and induce sequence-specific binding of the gene editing complex to the target DNA sequence.
  • the degree of complementarity between a guide sequence and its corresponding target sequence is about 50%, 60%, 75%, 80%, 85%, 90%, 95%, 97.5%, 99%, or more when optimally aligned using an appropriate alignment algorithm.
  • Optimal alignment can be determined using any algorithm suitable for aligning sequences, non-limiting examples of which include the Smith-Waterman algorithm, the Needleman-Wunsch algorithm, algorithms based on the Burrows-Wheeler Transform (e.g., the Burrows-Wheeler Aligner), ClustalW, Clustal X, BLAT, Novoalign (Novocraft Technologies), ELAND (Illumina, San Diego, CA), SOAP (available at soap.genomics.org.cn), and Maq (available at maq.sourceforge.net).
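The degree of complementarity between a guide sequence and a target can be illustrated with a minimal ungapped calculation. This sketch assumes an RNA guide and a DNA target strand of equal length, both written 5' to 3', and it does not perform the optimal alignment that the algorithms listed above provide; the function and table names are hypothetical.

```python
# Watson-Crick partner in DNA for each base of an RNA guide.
RNA_TO_DNA_PARTNER = {"A": "T", "U": "A", "G": "C", "C": "G"}


def percent_complementarity(guide_rna, target_dna):
    """Percent of guide bases that pair with the antiparallel DNA strand.

    Both strings are written 5'->3' and must have equal length (no gaps).
    """
    if not guide_rna or len(guide_rna) != len(target_dna):
        raise ValueError("sequences must be non-empty and of equal length")
    # Strands pair antiparallel, so walk the DNA strand from its 3' end.
    paired = sum(
        1
        for g, t in zip(guide_rna.upper(), reversed(target_dna.upper()))
        if RNA_TO_DNA_PARTNER.get(g) == t
    )
    return 100.0 * paired / len(guide_rna)


# Toy 4-nt example: a fully paired duplex, then one mismatch.
print(percent_complementarity("AUGC", "GCAT"))
print(percent_complementarity("AUGC", "GCAA"))
```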
  • In some embodiments, the guide sequence is about 5, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 35, 40, 45, 50, 75, or more nucleotides in length. In some embodiments, the guide sequence is about 75, 50, 45, 40, 35, 30, 25, 20, 15, 12, or fewer nucleotides in length.
  • the ability of a guide sequence to induce sequence-specific binding of a gene editing complex to a target sequence can be assessed by any suitable assay.
  • target sequence refers to a nucleotide sequence that the pegRNA is intended to target.
  • the target sequence may be a sequence expected to be targeted by pegRNA.
  • the target sequence may be a partial sequence among known genomic sequences, or a sequence to be edited by a person skilled in the art using the system of the present invention.
  • the complex is for editing a target DNA or a target gene; specifically, the complex of the present invention can edit a gene more effectively than previously known prime editors, so that the gene editing efficiency, or prime editing efficiency, is improved.
  • the prime editing complex of the present invention, comprising a fusion protein comprising Cas9 H840A, Rad51 DBD, and RT, showed a lower or similar frequency of unintended editing compared with the existing PE2, which lacks the Rad51 DBD, while its prime editing efficiency was significantly improved; it can therefore be seen to have superior efficacy to various previously known prime editors, including PE2.
  • the "prime editing efficiency" means gene editing efficiency by the prime editor. Prime editing efficiency can be calculated as a rate at which editing induced by the prime editor and pegRNA occurs without unintentional mutation in the target sequence when prime editing is performed. The prime editing efficiency may be expressed as a percentage.
  • the data on prime editing efficiency may be obtained by performing a method comprising the steps of: introducing the genome editing composition into a cell or tissue; performing deep sequencing using DNA obtained from the cell or tissue into which the composition was introduced; and analyzing the prime editing efficiency from the data obtained by the deep sequencing.
  • the method of obtaining DNA from cells or tissues into which the prime editor is introduced may be performed using various DNA isolation methods known in the art. Since each cell or tissue is expected to have gene editing in the introduced target sequence, the gene editing efficiency can be detected by sequencing the target sequence.
  • the sequencing method is not limited to a specific method as long as prime editing efficiency data can be obtained, but, for example, deep sequencing may be used.
  • the step of analyzing the prime editing efficiency from the data obtained by the deep sequencing may include calculating the prime editing efficiency.
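The efficiency calculation described above can be sketched as follows. This is a minimal example under stated assumptions: reads are taken to be amplicon strings covering the target site, and a read counts as correctly edited only if it exactly matches the expected edited amplicon, which is one simple way to require the intended edit without unintended mutations. The function name and the toy reads are hypothetical; real deep sequencing analysis involves alignment and quality filtering that are omitted here.

```python
def prime_editing_efficiency(reads, intended_amplicon):
    """Percent of reads carrying the intended edit and no other change."""
    if not reads:
        raise ValueError("no reads supplied")
    precise = sum(1 for read in reads if read == intended_amplicon)
    return 100.0 * precise / len(reads)


# Toy data: 6 unedited reads, 3 precisely edited reads,
# and 1 read with the intended edit plus an unintended mutation.
reference = "ACGTACGT"
intended = "ACGAACGT"
reads = [reference] * 6 + [intended] * 3 + ["ACGAACTT"]
print(prime_editing_efficiency(reads, intended))
```

The read carrying an extra mutation is excluded from the numerator, matching the definition that editing must occur without unintentional mutation in the target sequence.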
  • Prime editing efficiency may vary depending on the type and/or length of the pegRNA sequence and the target sequence.
  • the data on the prime editing efficiency may be provided as a data set.
  • the editing efficiency of the gene editing complex may be affected by the melting temperature of the primer binding site (PBS); specifically, the gene editing efficiency may be improved when the melting temperature of the PBS is low.
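To make the notion of PBS melting temperature concrete, the sketch below estimates it with the Wallace rule, Tm = 2(A+T) + 4(G+C) in degrees Celsius. This is only a rough rule of thumb for short oligonucleotides and is an assumption chosen for illustration; the text does not state which melting temperature calculation was used, and the function name and example sequence are hypothetical.

```python
def wallace_tm(pbs):
    """Rough melting temperature (deg C) by the Wallace rule: 2(A+T) + 4(G+C).

    U in an RNA sequence is counted like T.
    """
    seq = pbs.upper().replace("U", "T")
    if not seq or set(seq) - set("ACGT"):
        raise ValueError("sequence must contain only A, C, G, T/U")
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


# Placeholder 13-nt PBS, not a sequence from the patent.
print(wallace_tm("CGUGCUCAGUCUG"))
```

Under this approximation, a shorter or more AT-rich PBS gives a lower estimated melting temperature.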
  • Another aspect provides a composition for gene editing comprising: 1) a fusion protein comprising a Cas nickase, a reverse transcriptase (RT), and a single-stranded DNA-binding domain (ssDBD), a polynucleotide encoding the fusion protein, or a recombinant vector comprising the polynucleotide; and 2) a prime editing guide RNA (pegRNA), or a recombinant vector comprising a polynucleotide encoding the same.
  • the composition is for editing a target DNA or a target gene, and may be for editing a gene through prime editing.
  • the polynucleotide encoding the Cas nickase may include the polynucleotide sequence of SEQ ID NO: 8, and specifically, may consist of the polynucleotide sequence of SEQ ID NO: 8.
  • the polynucleotide encoding the single-stranded DNA binding domain may include the polynucleotide sequence of SEQ ID NO: 9, and specifically, may consist of the polynucleotide sequence of SEQ ID NO: 9.
  • the polynucleotide encoding the reverse transcriptase may include the polynucleotide sequence of SEQ ID NO: 10, and specifically, may consist of the polynucleotide sequence of SEQ ID NO: 10.
  • the polynucleotide encoding the first linker may include the polynucleotide sequence of SEQ ID NO: 11, and specifically, may consist of the polynucleotide sequence of SEQ ID NO: 11.
  • the polynucleotide encoding the second linker may include the polynucleotide sequence of SEQ ID NO: 12, and specifically, may consist of the polynucleotide sequence of SEQ ID NO: 12.
  • the polynucleotide encoding the fusion protein may include the polynucleotide sequence of SEQ ID NO: 13, and specifically, may consist of the polynucleotide sequence of SEQ ID NO: 13.
  • polynucleotide refers to polymers of deoxyribonucleotides or ribonucleotides that exist in single-stranded or double-stranded form. It encompasses RNA genomic sequences, DNA (gDNA and cDNA) and RNA sequences transcribed therefrom, and includes analogs of natural polynucleotides, unless otherwise specified.
  • the nucleotide sequences encoding the proteins/domains, and the nucleotide sequence encoding all or part of the fusion protein comprising them, include without limitation not only the nucleotide sequences encoding the amino acid sequences described in the respective SEQ ID NOs but also nucleotide sequences showing 80% or more, specifically 90% or more, more specifically 95% or more, still more specifically 98% or more, and most specifically 99% or more homology with those sequences, provided that they encode a protein substantially the same as or corresponding to the respective protein.
  • the polynucleotide encoding the fusion protein may include a polynucleotide encoding a protein exhibiting 80% or more, 85% or more, 90% or more, 91% or more, 92% or more, 93% or more, 94% or more, 95% or more, 96% or more, 97% or more, 98% or more, or 99% or more homology with the amino acid sequence of the fusion protein.
  • Owing to codon degeneracy, and in consideration of the codons preferred in the organism in which the protein is to be expressed, the polynucleotide encoding the fusion protein may be variously modified in its coding region within a range that does not change the amino acid sequence of the protein expressed from the coding region.
  • the polynucleotide may be included without limitation as long as it is a polynucleotide sequence encoding each protein.
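Codon degeneracy, mentioned above, means that different nucleotide sequences can encode the same amino acid sequence. The short sketch below illustrates this with a deliberately partial codon table (only the codons used in the example, taken from the standard genetic code); the function name and the toy coding sequences are hypothetical.

```python
# Partial standard genetic code: only the codons needed for this example.
CODON_TABLE = {
    "ATG": "M",
    "GCT": "A", "GCC": "A", "GCA": "A", "GCG": "A",
    "AAA": "K", "AAG": "K",
    "TAA": "*", "TAG": "*", "TGA": "*",
}


def translate(dna):
    """Translate a coding DNA string codon by codon using CODON_TABLE."""
    dna = dna.upper()
    if len(dna) % 3:
        raise ValueError("length must be a multiple of 3")
    # A codon missing from the partial table raises KeyError.
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


# Two different coding sequences that use synonymous codons.
variant_1 = "ATGGCTAAATAA"
variant_2 = "ATGGCCAAGTGA"
print(translate(variant_1), translate(variant_2))
```

Both variants translate to the same peptide, which is why a coding region can be adapted to a host organism's preferred codons without changing the expressed protein.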
  • the polynucleotide includes not only a nucleotide sequence encoding the amino acid sequence of the fusion protein, but also a sequence complementary to the sequence.
  • the complementary sequence includes not only perfectly complementary sequences but also substantially complementary sequences, meaning sequences capable of hybridizing, under stringent conditions known in the art, with a nucleotide sequence encoding the amino acid sequence of the fusion protein.
  • stringent conditions means conditions that allow specific hybridization between polynucleotides. These conditions are specifically described in the literature (e.g., J. Sambrook et al., supra). Examples include conditions under which genes with high homology, that is, homology of 40% or more, specifically 90% or more, more specifically 95% or more, still more specifically 97% or more, and particularly specifically 99% or more, hybridize with each other while genes with lower homology do not; or the wash conditions of ordinary Southern hybridization, namely washing once, specifically 2 to 3 times, at a salt concentration and temperature equivalent to 60°C, 1×SSC, 0.1% SDS, specifically 60°C, 0.1×SSC, 0.1% SDS, and more specifically 68°C, 0.1×SSC, 0.1% SDS.
  • Hybridization requires that two polynucleotides have complementary sequences, although mismatch between bases is possible depending on the stringency of hybridization.
  • complementary is used to describe the relationship between nucleotide bases capable of hybridizing to each other. For example, with respect to DNA, adenosine is complementary to thymine and cytosine is complementary to guanine. Accordingly, the present application may also include substantially similar polynucleotide sequences as well as isolated polynucleotide fragments complementary to the overall sequence.
  • polynucleotides having homology can be detected using hybridization conditions including a hybridization step at a Tm value of 55° C. and using the conditions described above.
  • the Tm value may be 60 °C, 63 °C, or 65 °C, but is not limited thereto and may be appropriately adjusted by those skilled in the art according to the purpose.
  • the appropriate stringency for hybridizing polynucleotides depends on the length of the polynucleotides and the degree of complementarity, and the parameters are well known in the art (see Sambrook et al., supra, 9.50-9.51, 11.7-11.8).
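  • The base-pairing rule described above can be illustrated with a minimal sketch. It assumes plain Watson-Crick pairing for DNA (A with T, C with G) and checks only perfect complementarity; it does not model hybridization stringency, mismatch tolerance, or Tm. The function names are hypothetical and introduced only for illustration.

```python
# Illustrative only: Watson-Crick pairing for DNA (A-T, C-G).
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def is_perfectly_complementary(a: str, b: str) -> bool:
    """True if strand b (5'->3') pairs with strand a (5'->3') at every position."""
    return reverse_complement(a) == b.upper()
```

  • A substantially complementary sequence, as defined above, would tolerate some mismatches; that case is deliberately left out of this sketch.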
  • the term "recombinant vector" may refer to a vehicle capable of delivering the polynucleotide into a cell and/or expressing the target protein that the polynucleotide encodes.
  • the recombinant vector may include a polynucleotide encoding the fusion protein or a polynucleotide comprising a pegRNA encoding sequence.
  • the vector may contain the necessary regulatory elements operably linked to the insert, ie, the polynucleotide, to allow expression of the insert when present in a cell of an individual.
  • operably linked means that a nucleic acid expression control sequence and a nucleic acid sequence encoding a protein of interest are functionally linked to perform a general function.
  • the operative linkage with the recombinant vector can be prepared and purified using genetic recombination techniques well known in the art, and site-specific DNA cleavage and ligation can be readily performed using enzymes generally known in the art.
  • the vector may include a promoter, a start codon, a stop codon and a terminator. In addition, DNA encoding a signal peptide, and/or an enhancer sequence, and/or the untranslated regions on the 5' and 3' sides of the desired gene, and/or a selectable marker region, and/or a replicable unit, etc. may be included as appropriate.
  • the promoter may be constitutive or inducible. Common examples include the lac, tac, T3 and T7 promoters for prokaryotic cells and, for eukaryotic cells, the simian virus 40 (SV40) promoter, the mouse mammary tumor virus (MMTV) promoter, human immunodeficiency virus (HIV) promoters such as the long terminal repeat (LTR) promoter of HIV, and the Moloney virus, cytomegalovirus (CMV), Epstein-Barr virus (EBV) and Rous sarcoma virus (RSV) promoters, as well as promoters derived from β-actin, human hemoglobin, human muscle creatine and human metallothionein.
  • the selectable marker is for selecting cells transformed by introducing a vector, and markers conferring a selectable phenotype such as drug resistance, auxotrophy, resistance to cytotoxic agents, or surface protein expression may be used. In an environment treated with a selective agent, only the cells expressing the selection marker survive, so that the transformed cells can be selected.
  • the vector may be a viral, cosmid or plasmid vector, but is not limited thereto.
  • the type of the vector is not particularly limited as long as it allows expression of the desired gene and production of the desired protein in various host cells such as prokaryotic and eukaryotic cells; specifically, a vector that carries a promoter with strong activity, retains strong expression, and can produce a large amount of a foreign protein in a form similar to its natural state can be used.
  • Expression vectors suitable for eukaryotic cells or eukaryotic organisms include, but are not limited to, vectors derived from SV40, bovine papillomavirus, adenovirus, adeno-associated virus, cytomegalovirus, lentivirus and retrovirus.
  • Expression vectors that can be used in bacterial hosts include, but are not limited to, bacterial plasmids such as pET21a, pET, pRSET, pBluescript, pGEX2T, pUC vectors, col E1, pCR1, pBR322, pMB9 or derivatives thereof; plasmids having a wider host range such as RP4; phage DNA exemplified by phage lambda derivatives such as λgt10, λgt11 or NM989; and other DNA phages such as M13 and filamentous single-stranded DNA phages.
  • pVL941 or the like can be used for insect cells.
  • the composition may be for editing target DNA or genes ex vivo or in vivo, and may be used for gene editing in eukaryotic cells, prokaryotic cells, eukaryotic organisms or prokaryotic organisms; specifically, it can be used for genome editing of eukaryotic cells or eukaryotic organisms.
  • the eukaryotic cells may be cells of yeast, fungi, protozoa, plants, higher plants, insects, amphibians or birds, or mammalian cells such as CHO, HeLa, HEK293, and COS-1. For example, they may be cells commonly used in the art, such as embryonic cells, stem cells, somatic cells, germ cells, cultured cells (in vitro), transplanted cells (graft cells) and primary cultured cells (in vitro and ex vivo), in vivo cells, or mammalian cells including human cells.
  • the eukaryotic organism may be a yeast, a fungus, a protozoan, a plant, a higher plant, an insect, an amphibian, a bird or a mammal (human, primate such as monkey, dog, pig, cow, sheep, goat, mouse, rat, etc.).
  • the composition can be used for prime editing of eukaryotic cells or eukaryotic organisms.
  • delivery of the polynucleotide or recombinant vector into a cell can be accomplished using various methods known in the art, for example calcium phosphate-DNA co-precipitation, DEAE-dextran-mediated transfection, polybrene-mediated transfection, electroporation, microinjection, liposome fusion, lipofectamine-mediated transfection and protoplast fusion.
  • the target object, that is, the vector, can also be delivered into a cell by infection using viral particles.
  • the vector can also be introduced into the cell by gene bombardment or the like. The introduced vector may exist as a vector itself in a cell or may be integrated into a chromosome, but is not limited thereto.
  • Another aspect provides a method of editing a gene of a subject, comprising introducing the composition for gene editing of the present invention into a eukaryotic cell or eukaryotic organism.
  • the same parts as those described above are equally applied to the above method.
  • the eukaryotic cell may be an individual or a cell isolated from a eukaryotic organism. Also, the eukaryotic organism may be non-human.
  • in the method, (i) a fusion protein comprising a Cas nickase, a reverse transcriptase (RT) and a single-stranded DNA-binding domain (ssDBD), a polynucleotide encoding the fusion protein, or a recombinant vector comprising the polynucleotide; and (ii) the prime editing guide RNA (pegRNA) or a recombinant vector comprising a polynucleotide encoding the same can be introduced simultaneously or sequentially, and the order is not limited as long as it does not affect the gene editing efficiency of the composition.
  • All steps performed in the genome editing method may be performed intracellularly or extracellularly, or in vivo or ex vivo.
  • the method of editing the genome may be performed by prime editing.
  • the gene editing complex for prime editing of the present invention and the composition for gene editing comprising the same show significantly higher gene editing efficiency than conventionally known prime editors such as PE2, so that gene editing-based therapeutics with excellent effects can be developed using the gene editing complex as an active ingredient.
  • FIG. 1 is a diagram showing the structure of the PE2 variant used in the present invention.
  • FIG. 2 is a diagram illustrating the correlation between prime editing efficiency of biological replicates for high-throughput evaluation of prime editing activity.
  • the number of target sequences is 83.
  • FIG. 3 is a diagram comparing the prime editing efficiency of hyPE2 and PE2-mid_RPA70, which are PE2 variants, normalized to the efficiency of PE2.
  • Target sequences with less than 1% PE2-induced prime editing efficiency are indicated by white dots.
  • the number of target sequences is 64.
  • FIG. 4 is a diagram comparing the prime editing efficiency of PE2 variants, hyPE2 and PE2-mid_RPA70, normalized to that of PE2, and the results for a target sequence having a PE2-induced prime editing efficiency of more than 1%.
  • the number of target sequences is 30.
  • FIG. 5 is a diagram comparing the prime editing efficiencies of PE2 variants hyPE2, PE2-N_Rad51 and PE2-C_Rad51 normalized to that of PE2, and results for a target sequence having a PE2-induced prime editing efficiency of more than 1%.
  • the number of target sequences is 32.
  • FIG. 6 is a view showing a comparison result of the prime editing efficiency of PE2 and hyPE2 in HEK293T cells. Data points represent the average prime editing efficiency of three biological replicates in each target sequence, with 0.1% added to all efficiency values to allow logarithmic scales for both x- and y-axes. The number of target sequences is 88.
  • FIG. 7 is a diagram comparing the prime editing efficiency of hyPE2 and PE2-mid_RPA70, which are PE2 variants, normalized to that of PE2 in HCT116 cells. The results are for target sequences having a PE2-induced prime editing efficiency of more than 1%. The number of target sequences is 43.
  • FIG. 8 is a diagram showing the structure of the hyPE2 linker variant and sequence information of the linker.
  • FIG. 9 is a diagram comparing the prime editing efficiency of hyPE2 and hyPE2 linker variants normalized to the efficiency of PE2.
  • the fold increase was expressed as an adjusted fold increase, and the number of pegRNAs was 82.
  • FIG. 10 is a diagram comparing the prime editing efficiency of hyPE2 against the same endogenous targets of HEK293T and HCT116 cells, normalized to that of PE2. The results are for target sequences having a PE2-induced prime editing efficiency of more than 1%. The number of pegRNAs is 31 for HEK293T and 11 for HCT116.
  • FIG. 11 is a diagram showing the results of comparing the prime editing efficiency of PE2 and hyPE2 in the endogenous region of HEK293T cells.
  • the number of pegRNAs is 63.
  • FIG. 12 is a view showing a comparison result of the prime editing efficiency of PE2 and hyPE2 in the endogenous region of HCT116 cells.
  • the number of pegRNAs is 51.
  • FIG. 13 is a diagram showing the results of comparing the prime editing efficiency of PE2 and hyPE2 in six endogenous regions of primary human dermal fibroblasts.
  • the number of pegRNAs is 51.
  • FIG. 14 is a diagram comparing the prime editing efficiency of hyPE2 in the same target sequence of primary human dermal fibroblasts normalized to that of PE2. Adjusted fold increases are plotted on the y-axis as mean ± standard deviation for three independent biological replicates.
  • FIG. 15 is a diagram comparing the prime editing efficiency and unintended editing frequency of hyPE2 in the same endogenous region of HEK293T cells normalized to the result of PE2.
  • the adjusted P-value was calculated by one-way ANOVA followed by a post-hoc analysis by Tukey's multiple comparisons, and the P-value was not recorded when the difference between the two groups was not statistically significant.
  • the number of pegRNAs is 25.
  • FIG. 16 shows off-target effects of hyPE2 and PE2 at potential off-target sites of pegRNAs 2, 3, 5 and 4 other pegRNAs targeting HEK4 in HEK293T cells. Intended edit positions are highlighted in yellow, and mismatched and overhanging nucleotides are indicated in red and blue lowercase fonts, respectively. Data are expressed as mean ± standard deviation for three independent biological replicates.
  • FIG. 17 is a diagram comparing the on-target and off-target editing frequencies of hyPE2 normalized to PE2 in the same endogenous region of HEK293T cells. P-values were calculated with a two-tailed, unpaired Student's t-test, where the number of on-target pegRNAs was 7 and the number of off-target pegRNAs was 22.
  • FIG. 18 is a diagram comparing the prime editing efficiency of hyPE2 normalized to that of PE2 in the same target sequence integrated as a lentivirus into HEK293T cells. Results for target sequences with PE2-induced prime editing efficiency greater than 1%, the number of target sequences being 423.
  • FIG. 19 is a diagram illustrating a result of comparing Spearman correlation coefficients between prediction models.
  • FIG. 20 is a diagram showing the top 10 features related to hyPE2 activity compared to PE2 activity, determined by Tree SHAP (XGBoost classifier). Dot colors indicate high (red) or low (blue) values of the relevant feature for each pegRNA.
  • FIG. 21 is a diagram showing the dependence of the fold increase in hyPE2 prime editing efficiency relative to PE2 on the melting temperature of the primer binding site (PBS). Editing efficiencies for hyPE2 and PE2 were determined from the same target sequences (Library B) incorporated lentivirally into HEK293T cells. The number of pegRNAs is 4 (<20 °C), 32 (20-30 °C), 236 (30-40 °C), 348 (40-50 °C) and 11 (≥50 °C), respectively.
  • FIG. 22 is a diagram schematically illustrating the structure of hyPE2.
  • FIG. 23 is a diagram showing the prediction results of the three-dimensional structure of hyPE2 (left) and PE2 (right) before and after the reverse transcription domain binds to the target ssDNA/pegRNA hybrid with a nick.
  • the hypothetical interaction modeling between Rad51, the target ssDNA with nick and the pegRNA primer binding site is shown in hyPE2.
  • Sequences encoding human RPA70-C and Rad51DBD were synthesized to order by GenScript.
  • To construct plasmids encoding ssDBD-PE2, the sequences were amplified by PCR and cloned into the pCMV-PE2 plasmid (Addgene, no. 132775).
  • the plasmids were named PE2-mid_RPA70, hyPE2, PE2-N_Rad51 and PE2-C_Rad51 (FIG. 1).
  • Linker variants were derived from the hyPE2 plasmid and cloned using Gibson assembly.
  • Of the 507 pegRNAs, 158 could be modified to additionally induce a silent mutation in the NGG PAM sequence; these 158 modified pegRNAs, which induce a silent PAM mutation in addition to the originally designed edit, were added to library B.
  • HEK293T cells were seeded into 100 mm dishes containing DMEM. After 15 hours, the culture medium was replaced with DMEM containing 25 µM chloroquine diphosphate, and the cells were further cultured for 5 hours.
  • the plasmid library was mixed with psPAX2 (Addgene no. 12260) and pMD2.G (Addgene no. 12259) in a molar ratio of 1.3:0.72:1.64, and HEK293T cells were co-transfected with the mixture using polyethyleneimine (PEI MAX, Polysciences). The next day, the culture medium was replaced with fresh medium.
  • the medium containing the lentivirus was collected and filtered through a Millex-HV 0.45-µm low protein-binding membrane (Millipore), then aliquoted and stored at -80 °C.
  • HEK293T cells cultured in DMEM supplemented with 10% FBS were transduced with serial dilutions of virus aliquots in the presence of 8 µg/ml polybrene (Sigma). Both non-transduced and transduced cells were cultured in DMEM supplemented with 10% FBS and 2 µg/ml of puromycin. After all non-transduced cells died, the number of viable cells in the transduced population was counted to estimate virus titer.
  • For lentiviral transduction, 1.0 × 10^6 HEK293T or HCT116 cells were seeded into 100 mm dishes and cultured overnight.
  • lentiviral libraries were transduced at an MOI of 0.3, and the next day, the culture medium was replaced with DMEM supplemented with 10% FBS and 2 µg/ml puromycin. To remove non-transduced cells, the culture was maintained under these conditions for 5 days.
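  • As a rough illustration of what an MOI of 0.3 implies, the sketch below assumes that the number of integrations per cell follows a Poisson distribution with mean equal to the MOI. This is a common simplifying assumption and is not stated in the text; the function name is hypothetical. Under that assumption, roughly a quarter of the cells are transduced, and most of those carry a single integrated copy, which is why puromycin selection is then needed to remove the untransduced majority.

```python
import math


def poisson_fraction(moi: float, k: int) -> float:
    """Probability that a cell receives exactly k integrations (Poisson model)."""
    return math.exp(-moi) * moi**k / math.factorial(k)


moi = 0.3  # value given in the text
untransduced = poisson_fraction(moi, 0)       # cells with no integration
transduced = 1.0 - untransduced               # cells with at least one integration
single_copy = poisson_fraction(moi, 1)        # cells with exactly one integration
single_among_transduced = single_copy / transduced

print(f"transduced: {transduced:.1%}")
print(f"single-copy among transduced: {single_among_transduced:.1%}")
```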
  • the PE2 variant-encoding plasmid, pcDNA BSD-encoding plasmid, and puro-eGFP-encoding plasmid were mixed in a weight ratio of 10:1:1 to make a plasmid mixture totaling 12 µg (for library A) or 24 µg (for library B), which was then used to transfect a total of 1 × 10^6 cells from cell library A or 6 × 10^6 cells from cell library B using Lipofectamine 2000 (Invitrogen). After overnight incubation, the culture medium was replaced with DMEM containing 10% FBS and 40 µg/ml blasticidin S. After 5 days of culture, transfected cells were harvested using 0.25% trypsin for genomic DNA extraction and deep sequencing.
  • HEK293T or HCT116 cells were seeded in 24-well plates and transfected at 70-80% cell density. Specifically, 750 ng of PE2-, 250 ng of pegRNA- and 100 ng of eGFP-Puro- (Addgene no. 45561) encoding plasmids were mixed and cells were co-transfected using Lipofectamine 2000 according to the manufacturer's protocol. The next day, the culture medium was replaced with DMEM supplemented with 10% FBS and 2 µg/ml puromycin. After 5 days of culture, transfected cells were harvested using 0.25% trypsin for genomic DNA extraction and deep sequencing.
  • a skin punch biopsy was performed from the participants by a dermatologist after obtaining written consent from the study participants, who were healthy people.
  • the consent procedure and research were approved by the institutional review committee of Yonsei University Health and Medical Center Severance Hospital (No. 4-2012-0028).
  • Fibroblasts obtained from skin biopsies were cultured in DMEM containing 10% FBS and penicillin/streptomycin.
  • a total of 1 × 10^6 human dermal fibroblasts were mixed with 3 µg of PE2-, 1 µg of pegRNA-, and 1 µg of eGFP-Puro encoding plasmid and electroporated using the Neon electroporation kit according to the manufacturer's protocol.
  • transfected cells were harvested using 0.25% trypsin for genomic DNA extraction and deep sequencing.
  • genomic DNA was extracted from the pelleted cells using the Wizard Genomic DNA Purification Kit (Promega) according to the manufacturer's protocol.
  • a total of 16 µg (over 16,000x coverage) of genomic DNA was PCR-amplified using 2x pfu PCR Smart mix (Solgent).
  • the resulting PCR product was combined and purified with a MEGAquick-spin total fragment DNA purification kit (iNtRON Biotechnology).
  • 20 ng of the purified product was PCR amplified using the Illumina adapter and primers containing the barcode sequence.
  • the prime-editing efficiency (i.e., intended editing frequency) of library experiments was calculated as follows using the Python script disclosed in the previously published literature 'Nature Biotechnology volume 39, pages 198-206 (2021)':
  • To identify individual pegRNA and target-sequence pairs, a 22-nt sequence consisting of an 18-nt barcode and a 4-nt sequence upstream of the barcode was used. To improve the accuracy of the analysis, pegRNA and target sequence pairs with fewer than 100 deep sequencing reads were excluded. Reads containing the desired edit but no unintended mutations in the broad target sequence including the PAM were classified as PE2-induced mutations.
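  • The filtering and classification rules above can be sketched as follows. This is not the published script and the exact formula is not reproduced in this text; the sketch simply assumes that efficiency is the percentage of reads carrying the intended edit with no unintended mutation, and it does not model any background correction. The class and field names are hypothetical.

```python
from dataclasses import dataclass

MIN_READS = 100  # pairs with fewer reads are excluded, as stated in the text


@dataclass
class PairCounts:
    pair_id: str              # hypothetical identifier of a pegRNA/target pair
    total_reads: int          # all deep sequencing reads assigned to the pair
    intended_only_reads: int  # reads with the intended edit and no unintended mutation


def editing_efficiency(counts):
    """Return {pair_id: efficiency in %}, skipping pairs below MIN_READS."""
    result = {}
    for c in counts:
        if c.total_reads < MIN_READS:
            continue  # insufficient coverage for a reliable estimate
        result[c.pair_id] = 100.0 * c.intended_only_reads / c.total_reads
    return result
```

  • In practice each read would first be assigned to a pair through its 22-nt identifier (barcode plus upstream sequence); that assignment step is omitted here.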
  • Cas-analyzer was used to evaluate the intended editing frequency, unintended editing frequency, and indel frequency in the endogenous region, which were calculated as follows:
  • nts nucleotides
  • PE2 off-target sites with up to 2 nucleotide mismatches or 1 nucleotide RNA or DNA overhangs were identified by Cas-OFFinder.
  • genomic DNA samples used to measure prime editing activity at endogenous sites described above were used as templates for PCR amplification.
  • the resulting product was purified and sequenced by MiSeq.
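  • The mismatch criterion used to pick candidate off-target sites can be illustrated with a short sketch. It is not how Cas-OFFinder is implemented; it only counts position-wise mismatches between equal-length sequences and ignores the RNA or DNA overhang (bulge) case mentioned above. All names are hypothetical.

```python
def count_mismatches(site: str, protospacer: str) -> int:
    """Number of positions at which two equal-length sequences differ."""
    if len(site) != len(protospacer):
        raise ValueError("sequences must have the same length")
    return sum(1 for a, b in zip(site.upper(), protospacer.upper()) if a != b)


def candidate_off_targets(sites, protospacer, max_mismatches=2):
    """Keep sites that differ from the protospacer by 1..max_mismatches bases."""
    return [
        s for s in sites
        if 0 < count_mismatches(s, protospacer) <= max_mismatches
    ]
```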
  • hyPE2- and PE2-induced prime editing efficiencies obtained using library B were partitioned into training and test datasets through random sampling, ensuring that pegRNA and target sequences were not shared between the two datasets.
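  • One way to obtain such a split is sketched below, as an assumption rather than the method actually used: pairs that share a pegRNA or a target sequence are first merged into groups with a union-find structure, and whole groups are then assigned at random to the test or training set. The function name, the default test fraction and the seed are hypothetical.

```python
import random


def split_pairs(pairs, test_fraction=0.2, seed=0):
    """Split (pegrna, target) tuples so no pegRNA or target spans both sets."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[ra] = rb

    # Link every pegRNA to its target; shared items merge into one group.
    for peg, tgt in pairs:
        union(("peg", peg), ("tgt", tgt))

    groups = {}
    for pair in pairs:
        groups.setdefault(find(("peg", pair[0])), []).append(pair)

    keys = list(groups)
    random.Random(seed).shuffle(keys)

    wanted = test_fraction * len(pairs)
    train, test = [], []
    for key in keys:
        # Fill the test set group by group until it reaches the wanted size.
        (test if len(test) < wanted else train).extend(groups[key])
    return train, test
```

  • Because whole groups are moved together, the realized test fraction can deviate from the requested one when groups are large.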
  • Seven existing machine learning algorithms were each trained: XGBoost (extreme gradient boosting), gradient-boosted regression tree (Boosted RT), random forest, L1-regularized linear regression (Lasso), L2-regularized linear regression (Ridge), L1L2-regularized linear regression (ElasticNet), and support vector machine (SVM).
  • the above models were implemented using the XGBoost Python package (version 1.3.3) and scikit-learn (version 0.23.2).
  • For XGBoost and gradient-boosted regression trees, we searched over 16 models selected from the following hyperparameter configurations: number of base estimators (chosen from [50, 100]), maximum depth of the individual regression estimators (chosen from [5, 10]), minimum number of samples in a leaf node (chosen from [1, 2]), and learning rate (chosen from [0.1, 0.2]).
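  • The four two-valued hyperparameters above give 2 × 2 × 2 × 2 = 16 configurations. The sketch below only enumerates that grid; model fitting and scoring are omitted. The parameter names are an assumption borrowed from scikit-learn's GradientBoostingRegressor, and XGBoost uses different names for some of them.

```python
from itertools import product

# Values taken from the text; key names are assumed (scikit-learn style).
PARAM_GRID = {
    "n_estimators": [50, 100],      # number of base estimators
    "max_depth": [5, 10],           # maximum depth of each regression estimator
    "min_samples_leaf": [1, 2],     # minimum number of samples in a leaf node
    "learning_rate": [0.1, 0.2],
}


def enumerate_configs(grid):
    """Return one dict per combination of hyperparameter values."""
    names = list(grid)
    return [
        dict(zip(names, values))
        for values in product(*(grid[name] for name in names))
    ]


configs = enumerate_configs(PARAM_GRID)
print(len(configs))  # 16
```

  • Each configuration would then be used to build and evaluate one candidate model, and the best-scoring configuration kept.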
  • RNAfold WebServer was used to predict the secondary structure of this region, and a hairpin structure was adopted for these residues.
  • the pegRNA RT-template region hybridized with the 16-nt DNA primer region was modeled manually based on the structure of the XMRV RT complexed with an RNA:DNA hybrid (PDB code: 4HKQ).
  • a three-dimensional model of Rad51 ssDBD (residues 16-85) was obtained from the N-terminal domain structure of Rad51 (PDB code: 1B22). Secondary structures were predicted using RaptorX to find putative α-helices in the flexible N- and C-terminal regions of Rad51 ssDBD, Linker A and Linker B. FIG. 23, showing the three-dimensional structure, was generated using the UCSF Chimera program, and the two linkers were displayed on the three-dimensional structure in consideration of their length and secondary structure.
  • Example 11 Data and Code Availability
  • Deep sequencing data of the present invention have been submitted to the NCBI Sequence Read Archive (SRA; https://www.ncbi.nlm.nih.gov/sra/) under accession no. SRP307854.
  • Protein structure data for predicting the structures of PE2 and hyPE2 were obtained from Protein Data Bank (https://www.rcsb.org). Information about the PDB code is as follows:
  • ssDNA single-stranded DNA
  • ssDBD ssDNA binding protein domain
  • the prime editing efficiency was determined using deep sequencing. First, a high correlation between biological replicates was confirmed (FIG. 2), and the average of the prime editing efficiencies across three biological replicates was used for analysis. Of the initial 107 pairs, 24 were excluded due to insufficient deep sequencing reads (<100 reads), and a further 19 that showed 0% editing efficiency with PE2 were excluded because they could introduce errors when normalizing hyPE2 efficiency to that of PE2.
  • the median increase was 2.4 times (range, 0-360 times) for hyPE2 and 1.5 times (range, 0-232 times) for PE2-mid_RPA70 (FIG. 3). Additionally, a higher median increase was seen for pairs with less than 1% PE2 editing efficiency than for pairs with greater than 1% PE2 editing efficiency.
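  • The normalization described above can be sketched as a simple ratio per pair, skipping pairs where the PE2 efficiency is 0% because the ratio is undefined there. The sketch assumes a plain ratio; the "adjusted fold increase" reported for some figures is not defined in this text and is not modeled. The function name is hypothetical.

```python
from statistics import median


def fold_increases(variant_eff, pe2_eff):
    """Ratio of variant to PE2 efficiency per pair; both args map pair_id -> %."""
    folds = {}
    for pair_id, base in pe2_eff.items():
        if base == 0 or pair_id not in variant_eff:
            continue  # a zero denominator cannot be normalized
        folds[pair_id] = variant_eff[pair_id] / base
    return folds
```

  • The median of the resulting values, for example `median(fold_increases(hype2, pe2).values())`, corresponds to the summary statistic quoted in the text.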
  • mutants in which Rad51 DBD was inserted into the N or C-terminal region of PE2 were prepared, and these were named PE2-N_Rad51 and PE2-C_Rad51, respectively (FIG. 1).
  • hyPE2 showed the highest overall activity (median value, 1.8 times higher than PE2 activity)
  • PE2-N_Rad51 showed lower activity than PE2 (median, 0.85 fold)
  • PE2-C_Rad51 showed slightly higher overall activity than PE2 (median, 1.1 fold) (FIG. 5).
  • hyPE2 efficiency was higher compared to PE2 in all 33 target sequences showing more than 1% PE2 efficiency.
  • Twenty of the remaining 55 target sequences (20/55, 36%), for which the prime editing efficiency by PE2 was less than 1%, showed a prime editing efficiency of greater than 1% by hyPE2, with an average efficiency of 9.1% (median 5.7%, range 1.1-29%) (FIG. 6).
  • An HCT116 cell library containing 107 pairs of pegRNA-encoding sequences and target sequences was also evaluated. As above, 26 pairs with insufficient read counts (<100 reads) and 38 pairs with PE2 efficiency <1% were excluded. Evaluation of the prime editing efficiency for the remaining 43 pairs confirmed that hyPE2 was more efficient than PE2, with a median efficiency 1.4 times (range, 0.82-2.9 times) that of PE2 (FIG. 7).
  • PE2 efficiency was higher than 1% at 31 and 11 targets in HEK293T and HCT116 cells, respectively, and at these target sites the efficiency of hyPE2 was 1.4 times (median; range, 0.89 to 2.2 times) and 1.5 times (median; range, 1.0 to 2.6 times) that of PE2 in HEK293T and HCT116 cells, respectively (FIGS. 10 to 12).
  • the prime editing efficiency by hyPE2 increased from 5.8% to 13% in HEK293T cells and from 1.1% to 2.8% in HCT116 cells.
  • hyPE2 showed higher prime editing efficiency in 5 of 6 targets, and it was confirmed that the mean and median increases in 6 targets were 2.1-fold and 2.0-fold (adjusted fold), respectively (FIGS. 13 and 14).
  • hyPE2 does not specifically induce unintended editing and off-target effects compared to PE2.
  • the data in library B was randomly partitioned into a training data set and a test data set, ensuring that neither the pegRNA nor the target sequence was shared between the two data sets.
  • the model name was named PEselector, and the model is provided at http://deepcrispr.info/PEselector .

Abstract

The present invention relates to a gene editing complex for prime editing and its use. According to one aspect, the gene editing complex for prime editing of the present invention and a prime editing composition comprising it exhibit markedly higher gene editing efficiency than conventionally known prime editors such as PE2. Thus, a highly effective gene editing-based therapeutic agent can be developed using the gene editing complex as an active ingredient.
PCT/KR2022/001611 2021-02-04 2022-01-28 Composition for prime editing having improved editing efficiency WO2022169235A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
KR10-2021-0016283 2021-02-04
KR20210016283 2021-02-04

Publications (1)

Publication Number Publication Date
WO2022169235A1 true WO2022169235A1 (fr) 2022-08-11

Family

ID=82742329

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/KR2022/001611 WO2022169235A1 (fr) 2021-02-04 2022-01-28 Composition for prime editing having improved editing efficiency

Country Status (2)

Country Link
KR (1) KR20220112698A (fr)
WO (1) WO2022169235A1 (fr)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
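KR20160034901A (ko) * 2013-06-17 2016-03-30 The Broad Institute, Inc. CRISPR-Cas double nickase systems, methods and compositions optimized for sequence manipulation
KR20180012834A (ko) * 2014-11-19 2018-02-06 Institute for Basic Science Method for regulating gene expression using Cas9 protein expressed from two vectors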

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
ANZALONE ANDREW V. ET AL: "Search-and-replace genome editing without double-strand breaks or donor DNA", NATURE, NATURE PUBLISHING GROUP UK, LONDON, vol. 576, no. 7785, 21 October 2019 (2019-10-21), London, pages 149 - 157, XP036953141, ISSN: 0028-0836, DOI: 10.1038/s41586-019-1711-4 *
MATSOUKAS IANIS G.: "Prime Editing: Genome Editing for Rare Genetic Diseases Without Double-Strand Breaks or Donor DNA", FRONTIERS IN GENETICS, vol. 11, XP055829020, DOI: 10.3389/fgene.2020.00528 *
ZHANG XIAOHUI ET AL: "Increasing the efficiency and targeting range of cytidine base editors through fusion of a single-stranded DNA-binding protein domain", NATURE CELL BIOLOGY, NATURE PUBLISHING GROUP UK, LONDON, vol. 22, no. 6, 11 May 2020 (2020-05-11), London, pages 740 - 750, XP037159237, ISSN: 1465-7392, DOI: 10.1038/s41556-020-0518-8 *

Also Published As

Publication number Publication date
KR20220112698A (ko) 2022-08-11

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22749984

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 22749984

Country of ref document: EP

Kind code of ref document: A1