WO2022169235A1 - Prime editing composition with improved editing efficiency - Google Patents

Prime editing composition with improved editing efficiency

Info

Publication number
WO2022169235A1
Authority
WO
WIPO (PCT)
Prior art keywords
editing
prime
sequence
hype2
gene
Prior art date
Application number
PCT/KR2022/001611
Other languages
French (fr)
Korean (ko)
Inventor
김형범
송명재
임정민
Original Assignee
연세대학교 산학협력단
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 연세대학교 산학협력단
Publication of WO2022169235A1

Classifications

    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/63 Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
    • C CHEMISTRY; METALLURGY
    • C07 ORGANIC CHEMISTRY
    • C07K PEPTIDES
    • C07K14/00 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/10 Processes for the isolation, preparation or purification of DNA or RNA
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/11 DNA or RNA fragments; Modified forms thereof; Non-coding nucleic acids having a biological activity
    • C12N15/113 Non-coding nucleic acids modulating the expression of genes, e.g. antisense oligonucleotides; Antisense DNA or RNA; Triplex-forming oligonucleotides; Catalytic nucleic acids, e.g. ribozymes; Nucleic acids used in co-suppression or gene silencing
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/11 DNA or RNA fragments; Modified forms thereof; Non-coding nucleic acids having a biological activity
    • C12N15/62 DNA sequences coding for fusion proteins
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/87 Introduction of foreign genetic material using processes not otherwise provided for, e.g. co-transformation
    • C12N15/90 Stable introduction of foreign DNA into chromosome
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N9/00 Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
    • C12N9/10 Transferases (2.)
    • C12N9/12 Transferases (2.) transferring phosphorus containing groups, e.g. kinases (2.7)
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N9/00 Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
    • C12N9/10 Transferases (2.)
    • C12N9/12 Transferases (2.) transferring phosphorus containing groups, e.g. kinases (2.7)
    • C12N9/1241 Nucleotidyltransferases (2.7.7)
    • C12N9/1276 RNA-directed DNA polymerase (2.7.7.49), i.e. reverse transcriptase or telomerase
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N9/00 Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
    • C12N9/14 Hydrolases (3)
    • C12N9/16 Hydrolases (3) acting on ester bonds (3.1)
    • C12N9/22 Ribonucleases RNAses, DNAses
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N2310/00 Structure or type of the nucleic acid
    • C12N2310/10 Type of nucleic acid
    • C12N2310/20 Type of nucleic acid involving clustered regularly interspaced short palindromic repeats [CRISPRs]
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Y ENZYMES
    • C12Y207/00 Transferases transferring phosphorus-containing groups (2.7)
    • C12Y207/07 Nucleotidyltransferases (2.7.7)
    • C12Y207/07049 RNA-directed DNA polymerase (2.7.7.49), i.e. telomerase or reverse-transcriptase

Definitions

  • The present invention relates to gene editing complexes and uses thereof for prime editing.
  • Prime editing is a novel genome editing method capable of introducing genetic changes of virtually any size without the need for donor DNA or double-strand breaks (DSBs) (Anzalone, A.V. et al. Search-and-replace genome editing without double-strand breaks or donor DNA. Nature 576, 149-157 (2019)). These changes include insertions, deletions, and all 12 possible point mutations, as well as combinations of these changes.
  • A prime editor basically consists of a Cas9 nickase-reverse transcriptase (RT) fusion protein and a prime editing guide RNA (pegRNA).
  • The pegRNA contains a guide sequence that recognizes the target sequence, a tracrRNA scaffold sequence, a primer binding site (PBS) required for initiation of reverse transcription, and an RT template that carries the desired genetic change and is homologous to the target sequence.
  • Four types of prime editor have been developed: PE1, PE2, PE3, and PE3b.
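As an illustration of the pegRNA components listed above, the following minimal Python sketch models a pegRNA as four concatenated parts. The 5'-to-3' order (guide sequence, scaffold, RT template, PBS) is an assumption taken from the general prime editing literature, not from this text, and all sequences are hypothetical placeholders.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PegRNA:
    """Hypothetical container for the four pegRNA components named in the text."""
    guide: str        # guide sequence recognizing the target sequence
    scaffold: str     # tracrRNA scaffold sequence
    rt_template: str  # RT template carrying the desired genetic change
    pbs: str          # primer binding site for initiating reverse transcription

    def sequence(self) -> str:
        # Assumed 5'->3' order: guide, scaffold, RT template, PBS.
        return self.guide + self.scaffold + self.rt_template + self.pbs


# Placeholder sequences for illustration only; not taken from the patent.
peg = PegRNA(guide="GGGG", scaffold="AAAA", rt_template="CCCC", pbs="UUUU")
full_length = len(peg.sequence())
```

Keeping the parts separate makes it easy to vary the PBS or RT template length independently when comparing pegRNA designs.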
  • PE2 is known to be capable of inducing prime editing in cell types of various species including liver cells.
  • PE2 is sometimes not efficient enough for gene editing. Therefore, to further improve prime editing efficiency, PE3 and PE3b can be generated by adding a single guide RNA (sgRNA) to PE2.
  • As a result of efforts to improve prime editing efficiency, the present inventors improved the existing PE2 and confirmed that the gene editing efficiency of the improved PE2 variant is significantly higher, thereby completing the present invention.
  • One aspect is to provide a gene editing complex comprising 1) a fusion protein comprising a Cas nickase, a reverse transcriptase (RT), and a single-stranded DNA-binding domain (ssDBD); and 2) a prime editing guide RNA (pegRNA).
  • Another aspect is to provide a composition for gene editing comprising 1) a fusion protein comprising a Cas nickase, a reverse transcriptase (RT), and a single-stranded DNA-binding domain (ssDBD), a polynucleotide encoding the fusion protein, or a recombinant vector comprising the polynucleotide; and 2) a prime editing guide RNA (pegRNA) or a recombinant vector comprising a polynucleotide encoding the same.
  • Another aspect is to provide a method for editing a gene of a subject, comprising introducing the composition for gene editing into isolated eukaryotic cells or eukaryotic organisms other than humans.
  • One aspect provides a gene editing complex comprising 1) a fusion protein comprising a Cas nickase, a reverse transcriptase (RT), and a single-stranded DNA-binding domain (ssDBD); and 2) a prime editing guide RNA (pegRNA).
  • The term "gene editing" refers to a genetic engineering technique for manipulating genes or DNA using artificially engineered nucleases (gene scissors), and means losing, altering, and/or restoring (modifying) gene function through deletion, insertion, substitution, etc. of one or more nucleic acid molecules (e.g., 1 - 100,000 bp, 1 - 10,000 bp, 1 - 1,000 bp, 1 - 100 bp, 1 - 70 bp, 1 - 50 bp, 1 - 30 bp, 1 - 20 bp, or 1 - 10 bp). Specifically, it may include knockout or knockdown of DNA or a gene, deletion, gene insertion (knock-in), gene correction, gene expression regulation, or chromosome rearrangement.
  • The gene knockout may refer to regulation of gene activity, e.g., inactivation, by deletion, substitution, and/or insertion of all or part of a gene (e.g., one or more nucleotides).
  • The gene inactivation refers to a modification that suppresses or downregulates expression of a gene, or that causes the gene to encode a protein that has lost its original function.
  • Gene regulation may mean a change in the function of a gene resulting from, for example, structural modification of the protein obtained by deleting an exon through simultaneous targeting of both introns flanking one or more exons of the target gene, expression of a dominant-negative form of a protein, or expression of a competitive inhibitor secreted in soluble form.
  • The gene insertion refers to inserting an exogenous base sequence, one derived from another species or not originally present in the organism, into the genome of the organism or into a DNA sequence derived from the organism by using genetic recombination technology.
  • the gene editing complex is a complex for prime editing, and may specifically edit a gene through prime editing.
  • The term "prime editing" refers to a genome editing method, based on fourth-generation gene scissors, capable of introducing a genetic change by cutting only one strand of DNA, without a DNA double-strand break.
  • the prime editing may be performed by "Prime editor (PE)” or a variant thereof.
  • The prime editor may include a Cas nickase-reverse transcriptase (RT) fusion protein and a prime editing guide RNA (pegRNA), and additional domains or proteins may be further included in the fusion protein to improve the efficiency of prime editing.
  • the type of the prime editor includes, but is not limited to, PE1, PE2, PE3, PE3b, and the like or variants thereof.
  • The prime editor may be one in which a single-stranded DNA-binding domain (ssDBD) is additionally included in prime editor 2 (PE2); in the Examples and Experimental Examples below, this variant of PE2 was named hyPE2.
  • The prime editor includes a prime editor protein (fusion protein) in which a Cas nickase, a reverse transcriptase, and a single-stranded DNA binding domain are fused, and a prime editing guide RNA (pegRNA).
  • The term "prime editor" as used herein narrowly refers to a prime editor protein (fusion protein) in which a Cas nickase, a single-stranded DNA binding domain, and a reverse transcriptase are fused; broadly, it may mean a prime editor complex formed by the prime editor protein and a prime editing guide RNA.
  • The prime editor may mean only the Cas nickase-ssDBD-RT fusion protein, or may mean the Cas nickase-ssDBD-RT fusion protein together with the pegRNA.
  • Introduction of the prime editor here may mean introducing only the Cas nickase-ssDBD-RT fusion protein. That is, when the pegRNA has already been introduced, the introduction of the prime editor may mean introducing only the Cas nickase-ssDBD-RT fusion protein.
  • The prime editor may refer to a Cas nickase-ssDBD-RT fusion protein.
  • The "Cas nickase" used in the prime editor has nickase activity that cuts (nicks) one strand of DNA, and may be a Cas protein modified to have nickase activity. Any Cas protein that is delivered to a target site together with the pegRNA and has been modified so that it specifically cleaves only a single strand can be used without limitation.
  • The Cas nickase may be one or more selected from the group consisting of Cas9 nickase, SaCas9 nickase, SpCas9 nickase, Cpf1 nickase, Cas3 nickase, Cas8a-c nickase, Cas10 nickase, Cse1 nickase, Csy1 nickase, Csn2 nickase, Cas4 nickase, Csm2 nickase, Cm5 nickase, Csf1 nickase, C2C2 nickase, NgAgo nickase, Cas12e nickase, Cas12d nickase, Cas12a nickase, Cas12b1 nickase, Cas13a nickase, Cas12c nickase, and variants thereof; it may specifically be a Cas9 nickase, and more specifically a Cas9 H840A nickase.
  • single-stranded DNA-binding domain refers to an independently folded protein domain comprising one or more structural motifs that recognize single-stranded DNA.
  • the DNA-binding domain may recognize a specific DNA sequence or have a general affinity for DNA, and some DNA-binding domains may include a nucleic acid in a folded structure.
  • the single-stranded DNA binding domain may be Rad51 DBD or RPA70 DBD, specifically Rad51 DBD.
  • fusion protein refers to a protein artificially synthesized such that a Cas nickase, a reverse transcriptase, and a single-stranded DNA binding domain are bound.
  • The single-stranded DNA binding domain may be connected to the C-terminus of the Cas nickase and the N-terminus of the reverse transcriptase. Accordingly, the fusion protein may include a recombinant protein in which a Cas nickase, a single-stranded DNA binding domain, and a reverse transcriptase are linked in this order from the N-terminus.
  • The Cas nickase may include the amino acid sequence of SEQ ID NO: 1, and specifically, may consist of the amino acid sequence of SEQ ID NO: 1.
  • The single-stranded DNA binding domain may include the amino acid sequence of SEQ ID NO: 2, and specifically, may consist of the amino acid sequence of SEQ ID NO: 2.
  • the reverse transcriptase may include the amino acid sequence of SEQ ID NO: 3, specifically, may consist of the amino acid sequence of SEQ ID NO: 3.
  • the Cas nickase, the single-stranded DNA binding domain, and the reverse transcriptase may be directly linked to each other or linked through a linker, but may be specifically linked through a linker.
  • The linker is not particularly limited as long as the fusion protein retains its activity; for example, the domains can be linked using amino acids such as glycine, alanine, leucine, isoleucine, proline, serine, threonine, asparagine, aspartic acid, cysteine, glutamine, glutamic acid, lysine, and arginine, and a linker of 1 to 40 amino acids can be used.
  • A linker connecting the Cas nickase and the single-stranded DNA-binding domain may be referred to as a first linker, and a linker connecting the single-stranded DNA-binding domain and the reverse transcriptase may be referred to as a second linker.
  • the first linker may include the amino acid sequence of SEQ ID NO: 4, specifically, may consist of the amino acid sequence of SEQ ID NO: 4.
  • the second linker may include the amino acid sequence of SEQ ID NO: 5, specifically, may consist of the amino acid sequence of SEQ ID NO: 5.
  • The fusion protein may include a recombinant protein in which the Cas nickase, the first linker, the single-stranded DNA binding domain, the second linker, and the reverse transcriptase are linked in this order from the N-terminus.
  • The fusion protein may include the amino acid sequence of SEQ ID NO: 6, and specifically, may consist of the amino acid sequence of SEQ ID NO: 6.
  • the fusion protein may further include a nuclear localization sequence (NLS).
  • NLS nuclear localization sequence
  • The term "nuclear localization sequence or signal" refers to an amino acid sequence that serves to transport a specific substance (e.g., a protein) into the cell nucleus, generally through the nuclear pore (Kalderon D, et al., Cell 39:499-509 (1984); Dingwall C, et al., J Cell Biol. 107(3):841-9 (1988)).
  • Although nuclear localization sequences are not required for gene editing activity in eukaryotes, it is believed that the inclusion of such sequences enhances the activity of the system, particularly when targeting nucleic acid molecules in the nucleus.
  • the nuclear localization sequence may include the amino acid sequence of SEQ ID NO: 7, specifically, may consist of the amino acid sequence of SEQ ID NO: 7.
  • the nuclear localization sequence may be added to the N-terminus and/or C-terminus of the fusion protein, and specifically may be added to the N-terminus and C-terminus.
  • The fusion protein may include a recombinant protein in which the nuclear localization sequence, the Cas nickase, the first linker, the single-stranded DNA binding domain, the second linker, the reverse transcriptase, and the nuclear localization sequence are linked in this order from the N-terminus.
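The N-terminus to C-terminus domain order described above can be sketched as a simple concatenation. This is only an illustration: the part names are hypothetical labels, and the one-letter strings are placeholders, not the amino acid sequences of SEQ ID NOs: 1 to 7.

```python
# Domain order from the N-terminus, as described in the text.
DOMAIN_ORDER = [
    "nls_n",        # nuclear localization sequence (N-terminal copy)
    "cas_nickase",
    "linker_1",     # first linker
    "ssdbd",        # single-stranded DNA binding domain
    "linker_2",     # second linker
    "rt",           # reverse transcriptase
    "nls_c",        # nuclear localization sequence (C-terminal copy)
]


def assemble_fusion(parts: dict) -> str:
    """Concatenate domain sequences in the stated N-to-C order."""
    missing = [name for name in DOMAIN_ORDER if name not in parts]
    if missing:
        raise KeyError(f"missing domains: {missing}")
    return "".join(parts[name] for name in DOMAIN_ORDER)


# Placeholder sequences for illustration only.
placeholder_parts = {
    "nls_n": "N", "cas_nickase": "C", "linker_1": "G",
    "ssdbd": "D", "linker_2": "S", "rt": "R", "nls_c": "N",
}
fusion = assemble_fusion(placeholder_parts)
```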
  • The nuclear localization sequence, Cas nickase, first linker, single-stranded DNA binding domain, second linker, reverse transcriptase, and the fusion protein comprising some or all of them include not only the amino acid sequences set forth in the respective SEQ ID NOs but also any amino acid sequence exhibiting 80% or more, specifically 90% or more, more specifically 95% or more, still more specifically 98% or more, and most specifically 99% or more homology with those sequences, provided that it expresses a protein exhibiting substantially the same or corresponding biological activity as the respective protein. Amino acid sequences in which part of the sequence is deleted, modified, substituted, or added are included without limitation.
  • homology refers to the degree of similarity between the amino acid sequence constituting the protein or the polynucleotide sequence encoding the same.
  • When the homology is sufficiently high, the expression products of the corresponding polynucleotides (genes) may have identical or similar activity.
  • homology can be expressed as a percentage according to the degree of correspondence with a given amino acid sequence or polynucleotide sequence.
  • A homologous sequence having the same or similar activity to a given amino acid sequence or nucleotide sequence is expressed as "% homology". It can be determined using standard software that calculates parameters such as score, identity, and similarity, specifically BLAST 2.0, or by comparing sequences in hybridization experiments under defined stringent conditions. Appropriate hybridization conditions are within the skill of the art and are well known to those skilled in the art (e.g., J. Sambrook et al., Molecular Cloning, A Laboratory Manual, 2nd Edition, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, New York, 1989; F.M. Ausubel et al., Current Protocols in Molecular Biology, John Wiley & Sons, Inc., New York).
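To make the percentage thresholds above concrete, here is a minimal Python sketch that computes percent identity between two sequences that have already been aligned to equal length (gaps written as "-"). It is not BLAST and does not compute BLAST's score or similarity; the function name and the example sequences are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of alignment columns where both sequences carry the same residue."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be pre-aligned to equal length")
    # Ignore columns that are gaps in both sequences.
    columns = [(x, y) for x, y in zip(aligned_a, aligned_b)
               if not (x == "-" and y == "-")]
    if not columns:
        raise ValueError("no comparable columns")
    matches = sum(1 for x, y in columns if x == y)
    return 100.0 * matches / len(columns)


# Hypothetical 8-residue peptides differing at one position.
identity = percent_identity("MKTAYIAK", "MKTAYLAK")
meets_80_percent = identity >= 80.0
```

Dedicated tools additionally score conservative substitutions, which is why similarity and identity can differ for the same pair of sequences.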
  • the guide sequence refers to a sequence in the guide RNA that specifies a target site, and includes a sequence that is fully or partially complementary to the target sequence.
  • The guide sequence is any polynucleotide sequence having sufficient complementarity with the target polynucleotide sequence to hybridize with the target DNA sequence and induce sequence-specific binding of the gene editing complex to the target DNA sequence.
  • The degree of complementarity between a guide sequence and its corresponding target sequence is about 50%, 60%, 75%, 80%, 85%, 90%, 95%, 97.5%, 99% or more when optimally aligned using an appropriate alignment algorithm.
  • Optimal alignment can be determined using any algorithm suitable for aligning sequences, non-limiting examples of which include the Smith-Waterman algorithm, the Needleman-Wunsch algorithm, algorithms based on the Burrows-Wheeler Transform (e.g., the Burrows-Wheeler Aligner), ClustalW, Clustal X, BLAT, Novoalign (Novocraft Technologies), ELAND (Illumina, San Diego, CA), SOAP (available at soap.genomics.org.cn), and Maq (available at maq.sourceforge.net).
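As one example of the alignment algorithms named above, the following is a minimal Python sketch of Smith-Waterman local alignment scoring. The scoring parameters (match +2, mismatch -1, linear gap -2) are illustrative assumptions, and the sketch returns only the best local score rather than the alignment itself.

```python
def smith_waterman_score(a: str, b: str, match: int = 2,
                         mismatch: int = -1, gap: int = -2) -> int:
    """Best local alignment score between a and b (linear gap penalty)."""
    previous = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        current = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diagonal = previous[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Local alignment: scores are floored at zero.
            current[j] = max(0, diagonal, previous[j] + gap, current[j - 1] + gap)
            best = max(best, current[j])
        previous = current
    return best


identical = smith_waterman_score("ACGT", "ACGT")
unrelated = smith_waterman_score("AAAA", "TTTT")
shared_core = smith_waterman_score("TTACGTT", "GGACGGG")
```

Only two rows of the score matrix are kept in memory, which is enough when the traceback is not needed.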
  • The guide sequence is about 5, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 35, 40, 45, 50, 75 or more nucleotides in length. In some embodiments, the guide sequence is about 75, 50, 45, 40, 35, 30, 25, 20, 15, 12 or fewer nucleotides in length.
  • The ability of a guide sequence to induce sequence-specific binding of a gene editing complex to a target sequence can be assessed by any suitable assay.
  • The term "target sequence" refers to the nucleotide sequence that the pegRNA is intended to target.
  • the target sequence may be a sequence expected to be targeted by pegRNA.
  • the target sequence may be a partial sequence among known genomic sequences, or a sequence to be edited by a person skilled in the art using the system of the present invention.
  • The complex is for editing a target DNA or a target gene; specifically, the complex of the present invention can edit a gene more effectively than a previously known prime editor, so that gene editing efficiency or prime editing efficiency is improved.
  • The prime editing complex of the present invention, comprising a fusion protein comprising Cas9 H840A, Rad51 DBD, and RT, showed a frequency of unintended editing lower than or similar to that of the existing PE2, which lacks Rad51 DBD, while its prime editing efficiency was significantly improved. It can therefore be seen to have superior efficacy to various previously known prime editors, including PE2.
  • the "prime editing efficiency" means gene editing efficiency by the prime editor. Prime editing efficiency can be calculated as a rate at which editing induced by the prime editor and pegRNA occurs without unintentional mutation in the target sequence when prime editing is performed. The prime editing efficiency may be expressed as a percentage.
  • The data on prime editing efficiency may be obtained by a method comprising the steps of: introducing the genome editing composition into a cell or tissue; performing deep sequencing using DNA obtained from the cell or tissue into which the composition was introduced; and analyzing the prime editing efficiency from the data obtained by the deep sequencing.
  • the method of obtaining DNA from cells or tissues into which the prime editor is introduced may be performed using various DNA isolation methods known in the art. Since each cell or tissue is expected to have gene editing in the introduced target sequence, the gene editing efficiency can be detected by sequencing the target sequence.
  • the sequencing method is not limited to a specific method as long as prime editing efficiency data can be obtained, but, for example, deep sequencing may be used.
  • the step of analyzing the prime editing efficiency from the data obtained by the deep sequencing may include calculating the prime editing efficiency.
  • Prime editing efficiency may vary depending on the type and/or length of the pegRNA sequence and the target sequence.
  • the data on the prime editing efficiency may be provided as a data set.
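The efficiency calculation described above, the rate of reads carrying the intended edit without unintended mutation, expressed as a percentage, can be sketched as follows. This assumes each deep-sequencing read has already been trimmed to the same target window; the function name and the toy reads are hypothetical and not data from the patent.

```python
def prime_editing_efficiency(reads: list, intended_sequence: str) -> float:
    """Percent of reads that exactly match the intended edited sequence.

    Requiring an exact match means a read with the intended edit plus any
    unintended mutation in the window is not counted as a correct edit.
    """
    if not reads:
        raise ValueError("no reads supplied")
    correct = sum(1 for read in reads if read == intended_sequence)
    return 100.0 * correct / len(reads)


# Toy example: wild-type window, intended edit, and an edit with an extra change.
wild_type = "ACGTACGT"
intended = "ACGTTCGT"
edited_with_extra_mutation = "ACCTTCGT"
toy_reads = [intended] * 3 + [wild_type] * 6 + [edited_with_extra_mutation]
efficiency = prime_editing_efficiency(toy_reads, intended)
```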
  • The editing efficiency of the gene editing complex may be affected by the melting temperature of the primer binding site (PBS); specifically, gene editing efficiency may be improved when the melting temperature of the PBS is low.
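The text does not state how the PBS melting temperature is calculated. As a rough illustration only, the sketch below uses the Wallace rule, Tm ≈ 2 °C × (A+T/U) + 4 °C × (G+C), which is a coarse approximation intended for short oligonucleotides; nearest-neighbor models would be more accurate. The example sequences are hypothetical.

```python
def wallace_tm(sequence: str) -> int:
    """Approximate melting temperature (deg C) by the Wallace rule.

    Assumption: used here only as a rough proxy for comparing PBS candidates.
    """
    seq = sequence.upper()
    at_count = sum(seq.count(base) for base in "ATU")
    gc_count = sum(seq.count(base) for base in "GC")
    if at_count + gc_count != len(seq):
        raise ValueError("sequence contains characters other than A, C, G, T, U")
    return 2 * at_count + 4 * gc_count


# Hypothetical PBS candidates of equal length but different GC content.
tm_balanced = wallace_tm("GCGCAUAU")
tm_gc_rich = wallace_tm("GCGCGCGC")
```

Under this approximation, shortening the PBS or lowering its GC content both reduce the estimated melting temperature.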
  • Another aspect provides a composition for gene editing comprising 1) a fusion protein comprising a Cas nickase, a reverse transcriptase (RT), and a single-stranded DNA-binding domain (ssDBD), a polynucleotide encoding the fusion protein, or a recombinant vector comprising the polynucleotide; and 2) a prime editing guide RNA (pegRNA) or a recombinant vector comprising a polynucleotide encoding the same.
  • the composition is for editing a target DNA or a target gene, and may be for editing a gene through prime editing.
  • The polynucleotide encoding the Cas nickase may include the polynucleotide sequence of SEQ ID NO: 8, and specifically, may consist of the polynucleotide sequence of SEQ ID NO: 8.
  • the polynucleotide encoding the single-stranded DNA binding domain may include the polynucleotide sequence of SEQ ID NO: 9, and specifically, may consist of the polynucleotide sequence of SEQ ID NO: 9.
  • the polynucleotide encoding the reverse transcriptase may include the polynucleotide sequence of SEQ ID NO: 10, and specifically, may consist of the polynucleotide sequence of SEQ ID NO: 10.
  • the polynucleotide encoding the first linker may include the polynucleotide sequence of SEQ ID NO: 11, and specifically, may consist of the polynucleotide sequence of SEQ ID NO: 11.
  • the polynucleotide encoding the second linker may include the polynucleotide sequence of SEQ ID NO: 12, and specifically, may consist of the polynucleotide sequence of SEQ ID NO: 12.
  • the polynucleotide encoding the fusion protein may include the polynucleotide sequence of SEQ ID NO: 13, and specifically, may consist of the polynucleotide sequence of SEQ ID NO: 13.
  • polynucleotide refers to deoxyribonucleotides or polymers of ribonucleotides that exist in single-stranded or double-stranded form. It encompasses RNA genomic sequences, DNA (gDNA and cDNA) and RNA sequences transcribed therefrom, and includes analogs of natural polynucleotides, unless otherwise specified.
  • The nucleotide sequences encoding each protein/domain, and the nucleotide sequence encoding all or part of the fusion protein comprising them, include without limitation not only the nucleotide sequences encoding the amino acid sequences set forth in the respective SEQ ID NOs but also any nucleotide sequence showing homology of 80% or more, specifically 90% or more, more specifically 95% or more, still more specifically 98% or more, and most specifically 99% or more with those sequences, provided that it encodes a protein substantially the same as or corresponding to the respective protein.
  • The polynucleotide encoding the fusion protein may include a polynucleotide encoding a protein exhibiting 80% or more, 85% or more, 90% or more, 91% or more, 92% or more, 93% or more, 94% or more, 95% or more, 96% or more, 97% or more, 98% or more, or 99% or more homology with the amino acid sequence of the corresponding SEQ ID NO.
  • Owing to codon degeneracy, and in consideration of the codons preferred in the organism in which the protein is to be expressed, the polynucleotide encoding the fusion protein may be variously modified in the coding region within a range that does not change the amino acid sequence of the protein expressed from the coding region.
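Codon degeneracy, as invoked above, means that different coding sequences can yield the same amino acid sequence. The minimal Python sketch below demonstrates this with a deliberately partial codon table (standard genetic code entries only); the two coding sequences are hypothetical examples.

```python
# Partial standard genetic code: only the codons used in this example.
CODON_TABLE = {
    "ATG": "M",
    "GCT": "A", "GCC": "A",
    "AAA": "K", "AAG": "K",
    "CTG": "L", "TTA": "L",
}


def translate(coding_sequence: str) -> str:
    """Translate a DNA coding sequence using the partial table above."""
    if len(coding_sequence) % 3 != 0:
        raise ValueError("length must be a multiple of 3")
    codons = [coding_sequence[i:i + 3] for i in range(0, len(coding_sequence), 3)]
    return "".join(CODON_TABLE[codon] for codon in codons)


# Two different nucleotide sequences encoding the same peptide.
variant_1 = "ATGGCTAAACTG"
variant_2 = "ATGGCCAAGTTA"
same_protein = translate(variant_1) == translate(variant_2)
```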
  • the polynucleotide may be included without limitation as long as it is a polynucleotide sequence encoding each protein.
  • the polynucleotide includes not only a nucleotide sequence encoding the amino acid sequence of the fusion protein, but also a sequence complementary to the sequence.
  • The complementary sequence includes not only perfectly complementary sequences but also substantially complementary sequences, meaning sequences capable of hybridizing, under stringent conditions known in the art, with, for example, the nucleotide sequence encoding the amino acid sequence of the fusion protein.
  • The term "stringent conditions" means conditions that allow specific hybridization between polynucleotides. These conditions are specifically described in the literature (e.g., J. Sambrook et al., supra). Examples include conditions under which genes with high homology, i.e., homology of 40% or more, specifically 90% or more, more specifically 95% or more, still more specifically 97% or more, and particularly specifically 99% or more, hybridize with each other while genes with lower homology do not; or the washing conditions of ordinary Southern hybridization, i.e., washing once, specifically 2 to 3 times, at a salt concentration and temperature corresponding to 60°C, 1× SSC, 0.1% SDS, specifically 60°C, 0.1× SSC, 0.1% SDS, and more specifically 68°C, 0.1× SSC, 0.1% SDS.
  • Hybridization requires that two polynucleotides have complementary sequences, although mismatch between bases is possible depending on the stringency of hybridization.
  • The term "complementary" is used to describe the relationship between nucleotide bases capable of hybridizing to each other. For example, with respect to DNA, adenine is complementary to thymine and cytosine is complementary to guanine. Accordingly, the present application may also include isolated polynucleotide fragments complementary to the overall sequence as well as substantially similar polynucleotide sequences.
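The base-pairing rule stated above (A with T, C with G) can be illustrated with a short Python sketch that computes the reverse complement of a DNA strand and checks whether two strands, both written 5' to 3', are perfectly complementary. The function names are hypothetical, and substantially (imperfectly) complementary sequences are outside the scope of this sketch.

```python
# Watson-Crick pairing for DNA: A<->T, C<->G.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA strand written 5'->3'."""
    return dna.upper().translate(_COMPLEMENT)[::-1]


def are_perfectly_complementary(strand_a: str, strand_b: str) -> bool:
    """True if the two 5'->3' strands would pair along their full length."""
    return strand_a.upper() == reverse_complement(strand_b)


example = reverse_complement("ATGC")
paired = are_perfectly_complementary("ATGC", "GCAT")
not_paired = are_perfectly_complementary("ATGC", "ATGC")
```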
  • polynucleotides having homology can be detected using hybridization conditions including a hybridization step at a Tm value of 55° C. and using the conditions described above.
  • the Tm value may be 60 °C, 63 °C, or 65 °C, but is not limited thereto and may be appropriately adjusted by those skilled in the art according to the purpose.
  • the appropriate stringency for hybridizing polynucleotides depends on the length of the polynucleotides and the degree of complementarity, and the parameters are well known in the art (see Sambrook et al., supra, 9.50-9.51, 11.7-11.8).
  • the term "recombinant vector” may refer to a medium capable of delivering the polynucleotide into a cell and/or expressing a target protein encoding it.
  • the recombinant vector may include a polynucleotide encoding the fusion protein or a polynucleotide comprising a pegRNA encoding sequence.
  • the vector may contain the necessary regulatory elements operably linked to the insert, ie, the polynucleotide, to allow expression of the insert when present in a cell of an individual.
  • operably linked means that a nucleic acid expression control sequence and a nucleic acid sequence encoding a protein of interest are functionally linked to perform a general function.
  • the operable linkage with the recombinant vector can be prepared and purified using genetic recombination techniques well known in the art, and site-specific DNA cleavage and ligation can be readily performed using enzymes generally known in the art.
  • the vector may include a promoter, a start codon, and a stop codon terminator. In addition, DNA encoding a signal peptide, and/or an enhancer sequence, and/or the untranslated regions on the 5' side and the 3' side of the desired gene, and/or a selectable marker region, and/or a replicable unit, etc. may be appropriately included.
  • the promoter may be constitutive or inducible. General promoters include the lac, tac, T3 and T7 promoters for prokaryotic cells and, for eukaryotic cells, the simian virus 40 (SV40) promoter, the mouse mammary tumor virus (MMTV) promoter, human immunodeficiency virus (HIV) promoters such as the long terminal repeat (LTR) promoter of HIV, the Moloney virus, cytomegalovirus (CMV), Epstein-Barr virus (EBV) and Rous sarcoma virus (RSV) promoters, as well as the β-actin promoter and promoters derived from human hemoglobin, human muscle creatine, human metallothionein, and the like.
  • the selectable marker is for selecting cells transformed by introducing a vector, and markers conferring a selectable phenotype such as drug resistance, auxotrophy, resistance to cytotoxic agents, or surface protein expression may be used. In an environment treated with a selective agent, only the cells expressing the selection marker survive, so that the transformed cells can be selected.
  • the vector may be a viral, cosmid or plasmid vector, but is not limited thereto.
  • the type of the vector is not particularly limited as long as it allows expression of the desired gene and production of the desired protein in various host cells such as prokaryotic and eukaryotic cells; specifically, a vector that carries a promoter showing strong activity and strong expression and that can produce a large amount of a foreign protein in a form similar to its natural state can be used.
  • Expression vectors suitable for eukaryotic cells or eukaryotic organisms include, but are not limited to, vectors derived from SV40, bovine papillomavirus, adenovirus, adeno-associated virus, cytomegalovirus, lentivirus, retrovirus, and the like.
  • Expression vectors that can be used in bacterial hosts include, but are not limited to, pET21a, pET, pRSET, pBluescript, pGEX2T, pUC vectors, col E1, pCR1, pBR322, pMB9 or derivatives thereof.
  • Bacterial plasmids, plasmids having a wider host range such as RP4, phage DNA exemplified by phage lambda derivatives such as λgt10, λgt11 or NM989, and other DNA phages such as M13 and filamentous single-stranded DNA phages may also be included.
  • pVL941 or the like can be used for insect cells.
  • the composition may be for editing target DNA or genes ex vivo or in vivo, and may be used for gene editing in eukaryotic cells, prokaryotic cells, eukaryotic organisms or prokaryotic organisms; specifically, it can be used for genome editing of eukaryotic cells or eukaryotic organisms.
  • the eukaryotic cells may be cells of yeast, mold, protozoa, plants, higher plants, insects, amphibians or birds, or mammalian cells such as CHO, HeLa, HEK293 and COS-1; for example, they may be cells commonly used in the art, such as embryonic cells, stem cells, somatic cells, germ cells, cultured cells (in vitro), transplanted cells (graft cells), primary cultured cells (in vitro and ex vivo) and in vivo cells, cells of an organism, or mammalian cells including human cells.
  • the eukaryotic organism may be a yeast, a fungus, a protozoan, a plant, a higher plant, an insect, an amphibian, a bird or a mammal (human, primate such as monkey, dog, pig, cow, sheep, goat, mouse, rat, etc.).
  • the composition can be used for prime editing of eukaryotic cells or eukaryotic organisms.
  • the polynucleotide or recombinant vector can be delivered to a cell using various methods known in the art, for example, calcium phosphate-DNA co-precipitation, DEAE-dextran-mediated transfection, polybrene-mediated transfection, electroporation, microinjection, liposome fusion, lipofectamine, protoplast fusion, and the like.
  • Alternatively, the target object, that is, the vector, can be delivered into a cell by infection using viral particles.
  • the vector can be introduced into the cell by gene bombardment or the like. The introduced vector may exist as a vector itself in a cell or may be integrated into a chromosome, but is not limited thereto.
  • Another aspect provides a method of editing a gene of a subject, comprising introducing the composition for gene editing of the present invention into a eukaryotic cell or eukaryotic organism.
  • the same parts as those described above are equally applied to the above method.
  • the eukaryotic cell may be a cell isolated from an individual or a eukaryotic organism. Also, the eukaryotic organism may be non-human.
  • the fusion protein comprising a Cas nickase, a reverse transcriptase (RT) and a single-stranded DNA-binding domain (ssDBD), a polynucleotide encoding the fusion protein, or a recombinant vector comprising the polynucleotide; and the prime editing guide RNA (pegRNA) or a recombinant vector comprising a polynucleotide encoding the same can be introduced simultaneously or sequentially, without limitation, as long as the gene editing efficiency of the composition is not affected.
  • All steps performed in the genome editing method may be performed intracellularly or extracellularly, or in vivo or ex vivo.
  • the method of editing the genome may be performed by prime editing.
  • the gene editing complex for prime editing of the present invention and the composition for gene editing comprising the same show significantly higher gene editing efficiency than conventionally known prime editors such as PE2, so that gene editing-based therapeutics with excellent effects can be developed using the gene editing complex as an active ingredient.
  • FIG. 1 is a diagram showing the structure of the PE2 variant used in the present invention.
  • FIG. 2 is a diagram illustrating the correlation between prime editing efficiency of biological replicates for high-throughput evaluation of prime editing activity.
  • the number of target sequences is 83.
  • FIG. 3 is a diagram comparing the prime editing efficiency of hyPE2 and PE2-mid_RPA70, which are PE2 variants, normalized to the efficiency of PE2.
  • Target sequences with less than 1% PE2-induced prime editing efficiency are indicated by white dots.
  • the number of target sequences is 64.
  • FIG. 4 is a diagram comparing the prime editing efficiency of PE2 variants, hyPE2 and PE2-mid_RPA70, normalized to that of PE2, and the results for a target sequence having a PE2-induced prime editing efficiency of more than 1%.
  • the number of target sequences is 30.
  • FIG. 5 is a diagram comparing the prime editing efficiencies of PE2 variants hyPE2, PE2-N_Rad51 and PE2-C_Rad51 normalized to that of PE2, and results for a target sequence having a PE2-induced prime editing efficiency of more than 1%.
  • the number of target sequences is 32.
  • FIG. 6 is a view showing a comparison result of the prime editing efficiency of PE2 and hyPE2 in HEK293T cells. Data points represent the average prime editing efficiency of three biological replicates in each target sequence, with 0.1% added to all efficiency values to allow logarithmic scales for both x- and y-axes. The number of target sequences is 88.
  • FIG. 7 is a diagram comparing the prime editing efficiency of hyPE2 and PE2-mid_RPA70, which are PE2 variants, normalized to that of PE2 in HCT116 cells. The results are for target sequences having a PE2-induced prime editing efficiency of more than 1%. The number of target sequences is 43.
  • FIG. 8 is a diagram showing the structure of the hyPE2 linker variant and sequence information of the linker.
  • FIG. 9 is a diagram comparing the prime editing efficiency of hyPE2 and hyPE2 linker variants normalized to the efficiency of PE2.
  • the fold increase was expressed as an adjusted fold increase, and the number of pegRNAs was 82.
  • FIG. 10 is a diagram comparing the prime editing efficiency of hyPE2 at the same endogenous targets in HEK293T and HCT116 cells, normalized to that of PE2. The results are for target sequences having a PE2-induced prime editing efficiency of more than 1%. The number of pegRNAs is 31 for HEK293T and 11 for HCT116.
  • FIG. 11 is a diagram showing the results of comparing the prime editing efficiency of PE2 and hyPE2 in the endogenous region of HEK293T cells.
  • the number of pegRNAs is 63.
  • FIG. 12 is a view showing a comparison result of the prime editing efficiency of PE2 and hyPE2 in the endogenous region of HCT116 cells.
  • the number of pegRNAs is 51.
  • FIG. 13 is a diagram showing the results of comparing the prime editing efficiency of PE2 and hyPE2 in six endogenous regions of primary human dermal fibroblasts.
  • the number of pegRNAs is 51.
  • FIG. 14 is a diagram comparing the prime editing efficiency of hyPE2 in the same target sequence of primary human dermal fibroblasts normalized to that of PE2. Adjusted fold increments are plotted on the y-axis and are plotted as mean ± standard deviation for three independent biological replicates.
  • FIG. 15 is a diagram comparing the prime editing efficiency and unintended editing frequency of hyPE2 in the same endogenous region of HEK293T cells normalized to the result of PE2.
  • the adjusted P-value was calculated by one-way ANOVA followed by a post-hoc analysis by Tukey's multiple comparisons, and the P-value was not recorded when the difference between the two groups was not statistically significant.
  • the number of pegRNAs is 25.
  • FIG. 16 shows off-target effects of hyPE2 and PE2 at potential off-target sites of pegRNAs 2, 3, 5 and 4 other pegRNAs targeting HEK4 in HEK293T cells. Intended edit positions are highlighted in yellow, and mismatched and overhanging nucleotides are indicated in red and blue lowercase fonts, respectively. Data are expressed as mean ± standard deviation for three independent biological replicates.
  • FIG. 17 is a diagram comparing the on-target and off-target editing frequencies of hyPE2 normalized to PE2 in the same endogenous region of HEK293T cells. P-values were calculated with a two-tailed, unpaired Student's t-test, where the number of on-target pegRNAs was 7 and the number of off-target pegRNAs was 22.
  • FIG. 18 is a diagram comparing the prime editing efficiency of hyPE2 normalized to that of PE2 in the same target sequence integrated as a lentivirus into HEK293T cells. Results for target sequences with PE2-induced prime editing efficiency greater than 1%, the number of target sequences being 423.
  • FIG. 19 is a diagram illustrating a result of comparing Spearman correlation coefficients between prediction models.
  • FIG. 20 is a diagram showing the top 10 features related to hyPE2 activity compared to PE2 activity, as determined by Tree SHAP (XGBoost classifier). Dot colors indicate high (red) or low (blue) values of the relevant feature for each pegRNA.
  • FIG. 21 is a diagram showing the dependence of the fold increase in hyPE2 prime editing efficiency relative to PE2 on the melting temperature of the primer binding site (PBS). Editing efficiencies for hyPE2 and PE2 were determined from the same target sequences (Library B) lentivirally integrated into HEK293T cells. The number of pegRNAs is 4 (<20 °C), 32 (20-30 °C), 236 (30-40 °C), 348 (40-50 °C) and 11 (≥50 °C), respectively.
  • FIG. 22 is a diagram schematically illustrating the structure of hyPE2.
  • FIG. 23 is a diagram showing the prediction results of the three-dimensional structure of hyPE2 (left) and PE2 (right) before and after the reverse transcription domain binds to the target ssDNA/pegRNA hybrid with a nick.
  • the hypothetical interaction modeling between Rad51, the target ssDNA with nick and the pegRNA primer binding site is shown in hyPE2.
  • Sequences encoding human RPA70-C and Rad51 DBD were synthesized on request by GenScript.
  • To construct plasmids encoding ssDBD-PE2 fusions, the sequences were amplified by PCR and cloned into the pCMV-PE2 plasmid (Addgene no. 132775).
  • the plasmids were named PE2-mid_RPA70, hyPE2, PE2-N_Rad51 and PE2-C_Rad51 (FIG. 1).
  • Linker variants were derived from the hyPE2 plasmid and cloned using Gibson assembly.
  • Of the 507 pegRNAs, 158 could be modified to induce a silent mutation in the NGG PAM sequence; these 158 modified pegRNAs, which induce a silent mutation in the PAM sequence in addition to the initially designed edit, were added to library B.
  • HEK293T cells were seeded into 100 mm dishes containing DMEM. After 15 hours, the culture medium was replaced with DMEM containing 25 µM chloroquine diphosphate, and the cells were further cultured for 5 hours.
  • the plasmid library was mixed with psPAX2 (Addgene no. 12260) and pMD2.G (Addgene no. 12259) at a molar ratio of 1.3:0.72:1.64, and the plasmid mixture was co-transfected into the HEK293T cells using polyethyleneimine (PEI MAX, Polysciences). The next day, the culture medium was replaced with fresh medium.
  • the medium containing the lentivirus was collected and filtered using a Millex-HV 0.45-µm low protein-binding membrane (Millipore), and the filtrate was aliquoted and stored at -80 °C.
  • HEK293T cells cultured in DMEM supplemented with 10% FBS were transduced with serial dilutions of virus aliquots in the presence of 8 µg/ml polybrene (Sigma). Both non-transduced and transduced cells were cultured in DMEM supplemented with 10% FBS and 2 µg/ml puromycin. After all non-transduced cells had died, the number of viable cells in the transduced population was counted to estimate the virus titer.
  • For lentiviral transduction, 1.0 × 10^6 HEK293T or HCT116 cells were seeded into 100 mm dishes and cultured overnight.
  • lentiviral libraries were transduced at an MOI of 0.3, and the next day the culture medium was replaced with DMEM supplemented with 10% FBS and 2 µg/ml puromycin. To remove non-transduced cells, the culture was maintained under these conditions for 5 days.
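To see what an MOI of 0.3 implies, integrations per cell are commonly approximated by a Poisson distribution. That approximation is an assumption of this sketch and is not stated in the source; the function names are likewise illustrative. Under it, the sketch estimates the fraction of cells that are transduced and, among those, the fraction carrying exactly one integrated copy.

```python
import math


def poisson_pmf(k: int, moi: float) -> float:
    """Probability that a cell receives exactly k integrations (Poisson assumption)."""
    return math.exp(-moi) * moi**k / math.factorial(k)


def transduction_summary(moi: float) -> dict:
    """Fraction of cells transduced, and fraction of those with a single copy."""
    p_none = poisson_pmf(0, moi)
    p_single = poisson_pmf(1, moi)
    transduced = 1.0 - p_none
    return {
        "transduced": transduced,
        "single_copy_among_transduced": p_single / transduced,
    }


summary = transduction_summary(0.3)
print(f"transduced: {summary['transduced']:.3f}")
print(f"single copy among transduced: {summary['single_copy_among_transduced']:.3f}")
```

Under this approximation roughly a quarter of the cells are transduced at an MOI of 0.3, and most of those carry a single pegRNA/target pair, which is consistent with selecting away the non-transduced majority with puromycin.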
  • the PE2 variant-encoding plasmid, pcDNA BSD-encoding plasmid, and puro-eGFP-encoding plasmid were mixed in a weight ratio of 10:1:1 to make a plasmid mixture totaling 12 µg (for library A) or 24 µg (for library B), which was then transfected into a total of 1 × 10^6 cells from cell library A or a total of 6 × 10^6 cells from cell library B using Lipofectamine 2000 (Invitrogen). After overnight incubation, the culture medium was replaced with DMEM containing 10% FBS and 40 µg/ml blasticidin S. After 5 days of culture, transfected cells were harvested using 0.25% trypsin for genomic DNA extraction and deep sequencing.
  • HEK293T or HCT116 cells were seeded in 24-well plates and transfected at 70-80% confluence. Specifically, 750 ng of PE2-, 250 ng of pegRNA- and 100 ng of eGFP-Puro- (Addgene no. 45561) encoding plasmids were mixed and cells were co-transfected using Lipofectamine 2000 according to the manufacturer's protocol. The next day, the culture medium was replaced with DMEM supplemented with 10% FBS and 2 µg/ml puromycin. After 5 days of culture, transfected cells were harvested using 0.25% trypsin for genomic DNA extraction and deep sequencing.
  • After written consent was obtained from the study participants, who were healthy individuals, a dermatologist performed skin punch biopsies on the participants.
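The amounts of each plasmid implied by a 10:1:1 weight ratio follow from simple proportion. The helper below is an illustrative sketch (its name is not from the source) that splits a total mass according to a weight ratio.

```python
def split_by_weight_ratio(total_ug: float, ratio: tuple) -> list:
    """Split a total plasmid mass (in µg) into components by weight ratio."""
    parts = sum(ratio)
    return [total_ug * r / parts for r in ratio]


# PE2 variant : pcDNA BSD : puro-eGFP at 10:1:1
print(split_by_weight_ratio(12, (10, 1, 1)))  # library A
print(split_by_weight_ratio(24, (10, 1, 1)))  # library B
```

For library A this corresponds to 10 µg, 1 µg and 1 µg of the three plasmids, and for library B to 20 µg, 2 µg and 2 µg.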
  • the consent procedure and research were approved by the institutional review committee of Yonsei University Health and Medical Center Severance Hospital (No. 4-2012-0028).
  • Fibroblasts obtained from skin biopsies were cultured in DMEM containing 10% FBS and penicillin/streptomycin.
  • a total of 1 × 10^6 human dermal fibroblasts were mixed with 3 µg of PE2-, 1 µg of pegRNA-, and 1 µg of eGFP-Puro encoding plasmid and electroporated using the Neon electroporation kit according to the manufacturer's protocol.
  • transfected cells were harvested using 0.25% trypsin for genomic DNA extraction and deep sequencing.
  • genomic DNA was extracted from the pelleted cells using the Wizard Genomic DNA Purification Kit (Promega) according to the manufacturer's protocol.
  • a total of 16 µg (over 16,000x coverage) of genomic DNA was PCR-amplified using 2x pfu PCR Smart mix (Solgent).
  • the resulting PCR product was combined and purified with a MEGAquick-spin total fragment DNA purification kit (iNtRON Biotechnology).
  • 20 ng of the purified product was PCR amplified using the Illumina adapter and primers containing the barcode sequence.
  • the prime-editing efficiency (i.e., intended editing frequency) of library experiments was calculated as follows using the Python script disclosed in the previously published literature 'Nature Biotechnology volume 39, pages 198-206 (2021)':
  • To identify individual pegRNA and target-sequence pairs, a 22-nt sequence consisting of an 18-nt barcode and a 4-nt sequence upstream of the barcode was used. To improve the accuracy of the analysis, pegRNA and target-sequence pairs with fewer than 100 deep sequencing reads were excluded. Reads containing the desired edit but no unintended mutations in the broad target sequence including the PAM were classified as PE2-induced mutations.
  • Cas-analyzer was used to evaluate the intended editing frequency, unintended editing frequency, and indel frequency in the endogenous region. These frequencies were calculated as follows:
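The read-count filter and the classification rule described above can be sketched as follows. This is a simplified illustration, not the published script: it assumes the efficiency is the plain percentage of reads carrying the intended edit without unintended mutations, and it does not reproduce any background correction the published script may apply. The class and field names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

MIN_READS = 100  # pairs with fewer deep sequencing reads are excluded (from the text)


@dataclass
class PairCounts:
    barcode: str              # 22-nt identifier: 4-nt upstream sequence + 18-nt barcode
    total_reads: int          # all reads assigned to this pegRNA/target pair
    intended_only_reads: int  # reads with the intended edit and no unintended mutation


def editing_efficiency(counts: PairCounts) -> Optional[float]:
    """Return the intended editing frequency in percent, or None if the pair is excluded."""
    if counts.total_reads < MIN_READS:
        return None
    return 100.0 * counts.intended_only_reads / counts.total_reads
```

Returning `None` for low-coverage pairs keeps the exclusion explicit, so downstream averaging over replicates cannot silently include them.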
  • nts nucleotides
  • PE2 off-target sites with up to 2 nucleotide mismatches or 1 nucleotide RNA or DNA overhangs were identified by Cas-OFFinder.
  • genomic DNA samples used to measure prime editing activity at endogenous sites described above were used as templates for PCR amplification.
  • the resulting product was purified and sequenced by MiSeq.
  • hyPE2- and PE2-induced prime editing efficiencies obtained using library B were partitioned into training and test datasets through random sampling, ensuring that pegRNA and target sequences were not shared between the two datasets.
  • Seven existing machine learning algorithms were each trained: XGBoost (extreme gradient boosting), gradient-boosted regression tree (Boosted RT), random forest, L1-regularized linear regression (Lasso), L2-regularized linear regression (Ridge), L1L2-regularized linear regression (ElasticNet), and SVM (support vector machine).
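One way to guarantee that no sequence is shared between the two datasets is to split by group rather than by record. The sketch below is an illustration under stated assumptions: it assumes each record can be assigned to one group key (for example, its target sequence), and the test fraction of 0.2 and the function name are hypothetical, since the source does not state the split proportions or the procedure used.

```python
import random
from collections import defaultdict


def grouped_split(records, key, test_fraction=0.2, seed=0):
    """Randomly split records so that no group key appears in both datasets."""
    groups = defaultdict(list)
    for record in records:
        groups[key(record)].append(record)

    group_ids = sorted(groups)                 # deterministic order before shuffling
    random.Random(seed).shuffle(group_ids)
    n_test = max(1, round(len(group_ids) * test_fraction))
    test_ids = set(group_ids[:n_test])

    train = [r for g in group_ids if g not in test_ids for r in groups[g]]
    test = [r for g in group_ids if g in test_ids for r in groups[g]]
    return train, test
```

Because whole groups are assigned to one side, several pegRNAs designed against the same target cannot leak between training and test data.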
  • the above models were implemented using the XGBoost Python package (version 1.3.3) and scikit-learn (version 0.23.2).
  • For XGBoost and gradient-boosted regression trees, we searched over 16 models selected from the following hyperparameter configurations: number of base estimators (chosen from [50, 100]), maximum depth of individual regression estimators (chosen from [5, 10]), minimum number of samples in a leaf node (chosen from [1, 2]), and learning rate (chosen from [0.1, 0.2]).
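Two candidate values for each of the four hyperparameters listed above give 2^4 = 16 combinations. The sketch below enumerates that grid with the standard library only. The dictionary keys follow scikit-learn's `GradientBoostingRegressor` parameter names, which is an assumption for illustration; the corresponding XGBoost parameter names may differ.

```python
from itertools import product

# Candidate values taken from the text; key names are an illustrative assumption.
GRID = {
    "n_estimators": [50, 100],      # number of base estimators
    "max_depth": [5, 10],           # maximum depth of individual regression estimators
    "min_samples_leaf": [1, 2],     # minimum number of samples in a leaf node
    "learning_rate": [0.1, 0.2],
}


def enumerate_grid(grid: dict) -> list:
    """Return every combination of hyperparameter values as a list of dicts."""
    names = list(grid)
    return [dict(zip(names, values)) for values in product(*(grid[n] for n in names))]


candidates = enumerate_grid(GRID)
print(len(candidates))  # 16
```

Each dictionary in `candidates` could then be passed as keyword arguments to a model constructor and scored on held-out data to pick the best configuration.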
  • RNAfold WebServer was used to predict the secondary structure of this region and adopted a hairpin structure for the residue.
  • the pegRNA RT-template region hybridized with the 16-nt DNA primer region was modeled manually based on the structure of the XMRV RT complexed with an RNA:DNA hybrid (PDB code: 4HKQ).
  • a three-dimensional model of Rad51 ssDBD (residues 16-85) was obtained from the N-terminal domain structure of Rad51 (PDB code: 1B22). Secondary structures were predicted using RaptorX to find putative α-helices in the flexible N- and C-terminal regions of Rad51 ssDBD, Linker A and Linker B. FIG. 23, showing the three-dimensional structure, was generated using the UCSF Chimera program, and the two linkers were displayed on the three-dimensional structure in consideration of their length and secondary structure.
  • Example 11 Data and Code Availability
  • Deep sequencing data of the present invention have been submitted to the NCBI Sequence Read Archive (SRA; https://www.ncbi.nlm.nih.gov/sra/) under accession no. SRP307854.
  • Protein structure data for predicting the structures of PE2 and hyPE2 were obtained from Protein Data Bank (https://www.rcsb.org). Information about the PDB code is as follows:
  • ssDNA single-stranded DNA
  • ssDBD ssDNA binding protein domain
  • the prime editing efficiency was determined using deep sequencing. First, a high correlation between biological replicates was confirmed (FIG. 2), and the average of the prime editing efficiencies of three biological replicates was used for analysis. Of the initial 107 pairs, 24 pairs were excluded due to insufficient deep sequencing reads (<100 reads), and 19 pairs that showed an editing efficiency of 0% for PE2 were excluded because they could introduce errors when normalizing hyPE2 efficiency to that of PE2.
  • the median increase was 2.4-fold (range, 0- to 360-fold) for hyPE2 and 1.5-fold (range, 0- to 232-fold) for PE2-mid_RPA70 (FIG. 3). Additionally, a higher median increase was seen for pairs exhibiting less than 1% editing efficiency with PE2 than for pairs exhibiting greater than 1% efficiency with PE2.
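The exclusion rules and the normalization to PE2 can be sketched as below. This is a minimal illustration that computes a plain ratio of variant efficiency to PE2 efficiency; the "adjusted fold increase" used in some figures is not defined in this section and is not reproduced here. The function name and the tuple layout are hypothetical.

```python
from statistics import median


def fold_changes(pairs, min_reads=100):
    """Return variant/PE2 efficiency ratios for pairs that pass both filters.

    pairs: iterable of (read_count, pe2_efficiency_percent, variant_efficiency_percent)
    """
    folds = []
    for reads, pe2_eff, variant_eff in pairs:
        if reads < min_reads:   # insufficient deep sequencing reads
            continue
        if pe2_eff == 0:        # ratio to PE2 is undefined when PE2 efficiency is 0%
            continue
        folds.append(variant_eff / pe2_eff)
    return folds


example = [(500, 2.0, 5.0), (50, 1.0, 9.0), (300, 0.0, 4.0), (400, 4.0, 4.0)]
print(median(fold_changes(example)))
```

In the example, the second pair is dropped for low coverage and the third because PE2 efficiency is 0%, leaving ratios of 2.5 and 1.0.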
  • mutants in which Rad51 DBD was inserted into the N or C-terminal region of PE2 were prepared, and these were named PE2-N_Rad51 and PE2-C_Rad51, respectively (FIG. 1).
  • hyPE2 showed the highest overall activity (median, 1.8-fold relative to PE2), PE2-N_Rad51 showed lower activity than PE2 (median, 0.85-fold), and PE2-C_Rad51 showed slightly higher overall activity than PE2 (median, 1.1-fold) (FIG. 5).
  • hyPE2 efficiency was higher compared to PE2 in all 33 target sequences showing more than 1% PE2 efficiency.
  • Twenty of the remaining 55 target sequences (20/55, 36%), for which the prime editing efficiency by PE2 was less than 1%, showed a prime editing efficiency of greater than 1% by hyPE2, with an average efficiency of 9.1% (median 5.7%, range 1.1-29%) (FIG. 6).
  • For the HCT116 cell library containing 107 pairs of pegRNA-encoding sequences and target sequences, similar to the above, 26 pairs with insufficient read counts (<100 reads) and 38 pairs with PE2 efficiency <1% were excluded. Evaluation of the prime editing efficiency for the remaining 43 pairs confirmed that hyPE2 was more efficient than PE2, with a median of 1.4-fold (range, 0.82- to 2.9-fold) relative to PE2 (FIG. 7).
  • PE2 efficiency was higher than 1% at 31 and 11 targets in HEK293T and HCT116 cells, respectively, and the efficiency of hyPE2 at these target sites was 1.4-fold (median; range, 0.89- to 2.2-fold) and 1.5-fold (median; range, 1.0- to 2.6-fold) higher than that of PE2 in HEK293T and HCT116 cells, respectively (FIGS. 10 to 12).
  • the prime editing efficiency by hyPE2 increased from 5.8% to 13% in HEK293T cells and from 1.1% to 2.8% in HCT116 cells.
  • hyPE2 showed higher prime editing efficiency at 5 of 6 targets, and the mean and median increases across the 6 targets were confirmed to be 2.1-fold and 2.0-fold (adjusted fold), respectively (FIGS. 13 and 14).
  • hyPE2 does not specifically induce unintended editing and off-target effects compared to PE2.
  • the data in library B was randomly partitioned into a training data set and a test data set, ensuring that neither the pegRNA nor the target sequence was shared between the two data sets.
  • the model name was named PEselector, and the model is provided at http://deepcrispr.info/PEselector .

Abstract

The present invention relates to a gene editing complex for prime editing and a use thereof. According to an aspect, the gene editing complex for prime editing of the present invention and a prime editing composition comprising the same exhibit remarkably higher gene editing efficiency than conventionally known prime editors such as PE2. Thus, a highly effective gene editing-based therapeutic agent can be developed using the gene editing complex as an active ingredient.

Description

Prime editing composition with improved editing efficiency
The present invention relates to a gene editing complex for prime editing and uses thereof.
Prime editing is an innovative new genome editing method that can introduce genetic changes of almost any size without donor DNA or double-strand breaks (DSBs) (Anzalone, A.V. et al. Search-and-replace genome editing without double-strand breaks or donor DNA. Nature 576, 149-157 (2019)). These changes include insertions, deletions, and all 12 possible point mutations, as well as combinations of these changes.
A prime editor (PE) basically consists of a Cas9 nickase-reverse transcriptase (RT) fusion protein and a prime editing guide RNA (pegRNA). The pegRNA comprises a guide sequence that recognizes the target sequence, a tracrRNA scaffold sequence, a primer binding site (PBS) required for initiation of reverse transcription, and an RT template that contains the desired genetic change and is homologous to the target sequence. Four types of prime editors have been developed: PE1, PE2, PE3, and PE3b.
Among these, PE2 is known to induce prime editing in cell types of various species, including human cells. However, in some cases PE2 shows efficiency insufficient for gene editing. To further improve prime editing efficiency, PE3 and PE3b can be generated by adding a single guide RNA (sgRNA) to PE2. However, PE3 and PE3b often induce unintended indels, so there is a risk of unexpected mutations.
Against this background, the present inventors, in an effort to improve prime editing efficiency, improved the existing PE2 and completed the present invention by confirming that the gene editing efficiency of the improved PE2 variant is significantly increased.
One aspect is to provide a gene editing complex comprising 1) a fusion protein comprising a Cas nickase, a reverse transcriptase (RT) and a single-stranded DNA-binding domain (ssDBD); and 2) a prime editing guide RNA (pegRNA).
Another aspect is to provide a composition for gene editing comprising 1) a fusion protein comprising a Cas nickase, a reverse transcriptase (RT) and a single-stranded DNA-binding domain (ssDBD), a polynucleotide encoding the fusion protein, or a recombinant vector comprising the polynucleotide; and 2) a prime editing guide RNA (pegRNA) or a recombinant vector comprising a polynucleotide encoding the same.
Still another aspect is to provide a method for editing a gene of a subject, comprising introducing the composition for gene editing into an isolated eukaryotic cell or a eukaryotic organism other than a human.
One aspect provides a gene editing complex comprising 1) a fusion protein comprising a Cas nickase, a reverse transcriptase (RT) and a single-stranded DNA-binding domain (ssDBD); and 2) a prime editing guide RNA (pegRNA).
본 명세서에서의 용어 "유전자 편집(genome editing)"은 인공적으로 조작된 핵산분해효소 혹은 유전자 가위를 이용해 유전자 또는 DNA를 조작하는 유전 공학적 기술을 의미하는 것으로서, 하나 이상의 핵산 분자(예컨대, 1 - 100,000 bp, 1 - 10,000 bp, 1 - 1,000 bp, 1 - 100 bp, 1 - 70 bp, 1 - 50 bp, 1 - 30 bp, 1 - 20 bp, 또는 1 - 10 bp)의 결실, 삽입, 치환 등에 의하여 유전자 기능을 상실, 변경, 및/또는 회복(수정)시키는 것을 의미한다. 구체적으로 DNA 또는 유전자 넉아웃, 넉다운 등의 결실, 유전자 삽입(넉인), 유전자 교정, 유전자 발현 조절 또는 염색체 재배열 등을 포함하는 것일 수 있다.As used herein, the term "gene editing" refers to a genetic engineering technique for manipulating genes or DNA using artificially engineered nucleases or gene scissors, and one or more nucleic acid molecules (eg, 1 - 100,000 bp, 1 - 10,000 bp, 1 - 1,000 bp, 1 - 100 bp, 1 - 70 bp, 1 - 50 bp, 1 - 30 bp, 1 - 20 bp, or 1 - 10 bp) deletion, insertion, substitution, etc. It means to lose, alter, and/or restore (modify) gene function. Specifically, it may include DNA or gene knockout, deletion such as knockdown, gene insertion (knock-in), gene correction, gene expression regulation, or chromosome rearrangement.
The gene knockout may refer to regulation of gene activity, for example inactivation, by deletion or substitution of all or part of a gene (e.g., one or more nucleotides) and/or by insertion of one or more nucleotides. The gene inactivation means suppression or reduction (downregulation) of expression of the gene, or modification of the gene so that it encodes a protein that has lost its original function. In addition, gene regulation may refer to a change in gene function resulting from, for example, structural modification of a protein caused by deletion of an exon region through simultaneous targeting of both intron regions flanking one or more exons of a target gene, expression of a dominant negative form of a protein, or expression of a competitive inhibitor secreted in soluble form. The gene insertion (knock-in) means insertion, by genetic recombination technology, of a foreign nucleotide sequence from another species, or one that was not originally present in the organism, into the genome of the organism or into a DNA sequence derived from the organism.
The gene editing complex is a complex for prime editing and, specifically, may edit a gene through prime editing.
As used herein, the term "prime editing" refers to a genome editing method, carried out with fourth-generation gene scissors, that can introduce a genetic change by cutting only one strand of DNA without making a DNA double-strand break.
The prime editing may be performed by a "prime editor (PE)" or a variant thereof. The prime editor may comprise a Cas nickase-reverse transcriptase (RT) fusion protein and a prime editing guide RNA (pegRNA), and an additional domain or protein may be further included in the fusion protein to improve prime editing efficiency. Types of prime editor include, but are not limited to, PE1, PE2, PE3, and PE3b, and variants thereof. In one embodiment, the prime editor may be Prime editor 2 (PE2) that further comprises a single-stranded DNA-binding domain (ssDBD); in the Examples and Experimental Examples below, this PE2 variant is named hyPE2.
The prime editor comprises a prime editor protein (fusion protein) in which a Cas nickase, a reverse transcriptase, and a single-stranded DNA-binding domain are fused, and a prime editing guide RNA (pegRNA). As used herein, "prime editor" in the narrow sense means the prime editor protein (fusion protein) in which a Cas nickase protein, a single-stranded DNA-binding domain, and a reverse transcriptase are fused, and in the broad sense may mean a prime editor complex in which the prime editor protein and the prime editing guide RNA form a complex.
As used herein, the prime editor may mean one comprising only the Cas nickase-ssDBD-RT fusion protein, or one comprising both the Cas nickase-ssDBD-RT fusion protein and a pegRNA. For example, when a pegRNA has been introduced into a cell separately, introducing the prime editor into that cell may mean introducing only the Cas nickase-ssDBD-RT fusion protein. In one embodiment, the prime editor may mean the Cas nickase-ssDBD-RT fusion protein.
The "Cas nickase" used in the prime editor has nickase activity, that is, it cuts (nicks) one strand of DNA. It may be a Cas protein modified to have nickase activity, and any Cas protein modified so that it is delivered to a target site together with the pegRNA and cuts only a single strand in a target-specific manner may be used without limitation. The Cas nickase may be one or more selected from the group consisting of Cas9 nickase, SaCas9 nickase, SpCas9 nickase, Cpf1 nickase, Cas3 nickase, Cas8a-c nickase, Cas10 nickase, Cse1 nickase, Csy1 nickase, Csn2 nickase, Cas4 nickase, Csm2 nickase, Cm5 nickase, Csf1 nickase, C2C2 nickase, NgAgo nickase, Cas12e nickase, Cas12d nickase, Cas12a nickase, Cas12b1 nickase, Cas13a nickase, Cas12c nickase, and variants thereof. Specifically, it may be a Cas9 nickase, and more specifically a Cas9 H840A nickase.
As used herein, the term "reverse transcriptase (RT)" refers to an enzyme that uses RNA as a template to synthesize new DNA complementary to it.
As used herein, the term "single-stranded DNA-binding domain (ssDBD)" refers to an independently folded protein domain comprising one or more structural motifs that recognize single-stranded DNA. The DNA-binding domain may recognize a specific DNA sequence or may have a general affinity for DNA, and some DNA-binding domains may include a nucleic acid in their folded structure.
The single-stranded DNA-binding domain may be a Rad51 DBD or an RPA70 DBD, and specifically may be a Rad51 DBD.
As used herein, "fusion protein" refers to a protein artificially synthesized so that a Cas nickase, a reverse transcriptase, and a single-stranded DNA-binding domain are joined together.
In the fusion protein, the single-stranded DNA-binding domain may be linked to the C-terminus of the Cas nickase and to the N-terminus of the reverse transcriptase. Accordingly, the fusion protein may comprise a recombinant protein in which the Cas nickase, the single-stranded DNA-binding domain, and the reverse transcriptase are linked in that order from the N-terminus.
The Cas nickase may comprise the amino acid sequence of SEQ ID NO: 1, and specifically may consist of the amino acid sequence of SEQ ID NO: 1.
The single-stranded DNA-binding domain may comprise the amino acid sequence of SEQ ID NO: 2, and specifically may consist of the amino acid sequence of SEQ ID NO: 2.
The reverse transcriptase may comprise the amino acid sequence of SEQ ID NO: 3, and specifically may consist of the amino acid sequence of SEQ ID NO: 3.
In the fusion protein, the Cas nickase, the single-stranded DNA-binding domain, and the reverse transcriptase may be linked to one another directly or through a linker, and specifically may be linked through a linker.
The linker is not particularly limited as long as it allows the fusion protein to exhibit its activity. For example, the domains may be linked using amino acids such as glycine, alanine, leucine, isoleucine, proline, serine, threonine, asparagine, aspartic acid, cysteine, glutamine, glutamic acid, lysine, and arginine, and a linker of 1 to 40 amino acids may be used.
In the fusion protein, the linker connecting the Cas nickase and the single-stranded DNA-binding domain may be referred to as a first linker, and the linker connecting the single-stranded DNA-binding domain and the reverse transcriptase may be referred to as a second linker. The first linker may comprise the amino acid sequence of SEQ ID NO: 4, and specifically may consist of the amino acid sequence of SEQ ID NO: 4. The second linker may comprise the amino acid sequence of SEQ ID NO: 5, and specifically may consist of the amino acid sequence of SEQ ID NO: 5.
Accordingly, the fusion protein may comprise a recombinant protein in which the Cas nickase, the first linker, the single-stranded DNA-binding domain, the second linker, and the reverse transcriptase are linked in that order from the N-terminus.
The fusion protein may comprise the amino acid sequence of SEQ ID NO: 6, and specifically may consist of the amino acid sequence of SEQ ID NO: 6.
The fusion protein may further comprise a nuclear localization sequence or signal (NLS).
As used herein, the term "nuclear localization sequence or signal (NLS)" refers to an amino acid sequence that serves to transport a specific substance (e.g., a protein) into the cell nucleus, generally through the nuclear pore (Kalderon D, et al., Cell 39:499-509 (1984); Dingwall C, et al., J Cell Biol. 107(3):841-849 (1988)). Although a nuclear localization sequence is not required for gene editing activity in eukaryotes, including such a sequence is believed to enhance the activity of the system, particularly for targeting nucleic acid molecules in the nucleus. The nuclear localization sequence may comprise the amino acid sequence of SEQ ID NO: 7, and specifically may consist of the amino acid sequence of SEQ ID NO: 7.
The nuclear localization sequence may be added to the N-terminus and/or the C-terminus of the fusion protein, and specifically may be added to both the N-terminus and the C-terminus. Accordingly, the fusion protein may comprise a recombinant protein in which a nuclear localization sequence, the Cas nickase, the first linker, the single-stranded DNA-binding domain, the second linker, the reverse transcriptase, and a nuclear localization sequence are linked in that order from the N-terminus.
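The domain order described above can be written down as a small sketch. The following Python snippet is only an illustration: the function name `assemble_fusion`, the label strings, and the toy peptide fragments are hypothetical placeholders, and the actual sequences are those of SEQ ID NOs: 1 to 7 in the sequence listing, which are not reproduced here.

```python
# Order of elements from the N-terminus to the C-terminus, as described in the text.
DOMAIN_ORDER = [
    "NLS",            # N-terminal nuclear localization sequence
    "Cas nickase",
    "first linker",
    "ssDBD",
    "second linker",
    "RT",
    "NLS",            # C-terminal nuclear localization sequence
]


def assemble_fusion(parts: dict, order=DOMAIN_ORDER) -> str:
    """Concatenate amino acid sequences in N-to-C order.

    `parts` maps each label in `order` to an amino acid string.
    A missing label raises KeyError rather than being skipped silently.
    """
    return "".join(parts[name] for name in order)


# Toy placeholder fragments (NOT the sequences of the sequence listing).
toy_parts = {
    "NLS": "PKK",
    "Cas nickase": "MDK",
    "first linker": "GGS",
    "ssDBD": "AMQ",
    "second linker": "SGG",
    "RT": "TLN",
}
toy_fusion = assemble_fusion(toy_parts)
```

Keeping the order in a single list makes it easy to compare alternative arrangements, for example a construct without the ssDBD and first linker, by passing a different `order`.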
As used herein, the nuclear localization sequence, the Cas nickase, the first linker, the single-stranded DNA-binding domain, the second linker, the reverse transcriptase, and a fusion protein comprising some or all of these include not only the amino acid sequences described above by their respective SEQ ID NOs, but also amino acid sequences showing 80% or more, specifically 90% or more, more specifically 95% or more, still more specifically 98% or more, and most specifically 99% or more homology to those sequences. Any amino acid sequence in which part of the sequence has been deleted, modified, substituted, or added is included without limitation, provided that it represents a protein showing biological activity substantially identical or corresponding to that of the respective protein.
As used herein, the term "homology" refers to the degree of similarity between amino acid sequences constituting proteins or between the polynucleotide sequences encoding them. When homology is sufficiently high, the expression products of the polynucleotides (genes) and the proteins may have identical or similar activity. Homology may be expressed as a percentage according to the degree of match with a given amino acid sequence or polynucleotide sequence. In the present specification, a homologous sequence having activity identical or similar to that of a given amino acid sequence or nucleotide sequence is expressed as "% homology". Homology may be determined, for example, using standard software that calculates parameters such as score, identity, and similarity, specifically BLAST 2.0, or by comparing sequences in hybridization experiments under defined stringent conditions. Appropriate hybridization conditions to be defined are within the skill of the art and may be determined by methods well known to those skilled in the art (e.g., J. Sambrook et al., Molecular Cloning, A Laboratory Manual, 2nd Edition, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, New York, 1989; F.M. Ausubel et al., Current Protocols in Molecular Biology, John Wiley & Sons, Inc., New York).
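As a rough illustration of how a percentage can be derived from two sequences, the sketch below computes percent identity over a pair of sequences that have already been aligned (gaps written as `-`). This is an assumption-laden simplification: the function name `percent_identity` is hypothetical, it does not perform the alignment itself, and it is not the BLAST 2.0 calculation; different tools use different denominators, so values may differ from those reported by standard software.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity of two pre-aligned sequences of equal length.

    Columns where both sequences have a gap are ignored; a column counts
    as a match only when both residues are identical and not gaps.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    columns = [
        (a, b)
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if not (a == "-" and b == "-")
    ]
    if not columns:
        raise ValueError("no alignment columns to compare")
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


# Toy peptides differing at one of eight aligned positions.
identity = percent_identity("MKTAYIAK", "MKTGYIAK")
```

A threshold such as "80% or more" would then be checked by comparing the returned value with 80.0.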
As used herein, the term "pegRNA (prime editing guide RNA)" refers to an RNA comprising a guide sequence that recognizes a target sequence, a tracrRNA scaffold sequence, a primer binding site (PBS) required for initiation of reverse transcription, and an RT template containing the desired genetic change.
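The four pegRNA components listed above can be modeled as a small data structure. The sketch below is illustrative only: the class name `PegRNA`, its field names, and the toy sequences are hypothetical, and the 5'-to-3' order used (guide sequence, scaffold, RT template, PBS) follows the arrangement commonly described for prime editing rather than anything stated in this passage.

```python
from dataclasses import dataclass

RNA_ALPHABET = set("ACGU")


@dataclass(frozen=True)
class PegRNA:
    """Hypothetical container for the four pegRNA components (RNA, 5'->3')."""

    spacer: str       # guide sequence that recognizes the target sequence
    scaffold: str     # tracrRNA scaffold sequence
    rt_template: str  # RT template carrying the desired genetic change
    pbs: str          # primer binding site needed to start reverse transcription

    def __post_init__(self):
        # Reject empty components and any character outside A, C, G, U.
        for name in ("spacer", "scaffold", "rt_template", "pbs"):
            seq = getattr(self, name)
            if not seq or set(seq) - RNA_ALPHABET:
                raise ValueError(f"{name} must be a non-empty RNA sequence (A/C/G/U)")

    @property
    def sequence(self) -> str:
        # Assumed order: spacer, scaffold, then a 3' extension of RT template + PBS.
        return self.spacer + self.scaffold + self.rt_template + self.pbs


# Toy components, far shorter than real ones.
toy_peg = PegRNA(spacer="GGCC", scaffold="GUUU", rt_template="ACGU", pbs="UGCA")
```

Keeping the components separate, rather than storing only the concatenated string, makes it straightforward to vary the PBS or RT template length independently when comparing designs.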
In the pegRNA, the guide sequence refers to the sequence within the guide RNA that specifies the target site, and includes a sequence fully or partially complementary to the target sequence. The guide sequence is any polynucleotide sequence having sufficient complementarity to the target polynucleotide sequence to hybridize with the target DNA sequence and direct sequence-specific binding of the gene editing complex to the target DNA sequence. In some embodiments, the degree of complementarity between the guide sequence and its corresponding target sequence, when optimally aligned using an appropriate alignment algorithm, is about 50%, 60%, 75%, 80%, 85%, 90%, 95%, 97.5%, 99%, or more. Optimal alignment may be determined using any algorithm suitable for aligning sequences, non-limiting examples of which include the Smith-Waterman algorithm, the Needleman-Wunsch algorithm, algorithms based on the Burrows-Wheeler Transform (e.g., the Burrows Wheeler Aligner), ClustalW, Clustal X, BLAT, Novoalign (Novocraft Technologies), ELAND (Illumina, San Diego, CA, USA), SOAP (available at soap.genomics.org.cn), and Maq (available at maq.sourceforge.net). In some embodiments, the guide sequence is about 5, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 35, 40, 45, 50, 75, or more nucleotides in length. In some embodiments, the guide sequence is about 75, 50, 45, 40, 35, 30, 25, 20, 15, 12, or fewer nucleotides in length. The ability of a guide sequence to direct sequence-specific binding of the gene editing complex to a target sequence may be assessed by any suitable assay.
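To make the notion of "degree of complementarity" concrete, the following sketch scores an RNA guide against the DNA strand it is meant to pair with. It is a deliberately simple, ungapped comparison of two equal-length sequences, not one of the alignment algorithms named above; the function name `percent_complementarity` and the toy sequences are hypothetical.

```python
# Watson-Crick partners of each RNA base on a DNA strand.
RNA_TO_DNA_PAIR = {"A": "T", "U": "A", "G": "C", "C": "G"}


def percent_complementarity(guide_rna: str, target_strand_dna: str) -> float:
    """Percentage of guide positions that base-pair with the target strand.

    Both sequences are written 5'->3'. Because paired strands are
    antiparallel, the target strand is reversed before the position-by-
    position comparison. No gaps or wobble pairs are considered.
    """
    if not guide_rna or len(guide_rna) != len(target_strand_dna):
        raise ValueError("sequences must be non-empty and of equal length")
    target_3_to_5 = target_strand_dna.upper()[::-1]
    paired = sum(
        1
        for g, t in zip(guide_rna.upper(), target_3_to_5)
        if RNA_TO_DNA_PAIR.get(g) == t
    )
    return 100.0 * paired / len(guide_rna)


full_match = percent_complementarity("GACU", "AGTC")  # every position pairs
one_mismatch = percent_complementarity("GACU", "AGTT")  # first guide base unpaired
```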
As used herein, the term "target sequence" refers to the nucleotide sequence that the pegRNA is intended to target. The target sequence may be a sequence that the pegRNA is expected to target. The target sequence may be a part of a known genomic sequence, or a sequence that a person skilled in the art using the system of the present invention wishes to edit.
The complex is for editing a target DNA or a target gene. Specifically, because the complex of the present invention can edit a gene more effectively than previously known prime editors, its gene editing efficiency or prime editing efficiency may be improved over that of existing prime editors.
According to one example, it was confirmed that the prime editing complex (hyPE2) of the present invention, which comprises a fusion protein comprising Cas9 H840A, a Rad51 DBD, and RT, showed markedly improved prime editing efficiency while showing a frequency of unintended edits lower than or similar to that of conventional PE2, which lacks the Rad51 DBD. It can therefore be seen that hyPE2 has superior efficacy to various previously known prime editors, including PE2.
The "prime editing efficiency" means the efficiency of gene editing by the prime editor. Prime editing efficiency may be calculated as the rate at which, when prime editing is performed, the edit induced by the prime editor and the pegRNA occurs within the target sequence without unintended mutations. The prime editing efficiency may be expressed as a percentage.
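The definition above amounts to a simple ratio. As a minimal sketch, assuming that sequencing reads covering the target site have already been classified upstream, the efficiency could be computed as follows. The function name and parameter names are hypothetical, and any further normalization the inventors may have applied (for example, background subtraction against untreated controls) is not described in this passage and is not modeled here.

```python
def prime_editing_efficiency(intended_only_reads: int, total_reads: int) -> float:
    """Prime editing efficiency as a percentage.

    intended_only_reads: reads carrying the intended edit and no unintended
        mutation within the target sequence.
    total_reads: all reads covering the target sequence.
    """
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    if not 0 <= intended_only_reads <= total_reads:
        raise ValueError("intended_only_reads must be between 0 and total_reads")
    return 100.0 * intended_only_reads / total_reads


# Toy counts, not measured data.
efficiency = prime_editing_efficiency(intended_only_reads=250, total_reads=1000)
```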
According to one embodiment, the data on prime editing efficiency may be obtained by performing a method comprising: introducing the genome editing composition into a cell or tissue; performing deep sequencing using DNA obtained from the cell or tissue into which the composition has been introduced; and analyzing prime editing efficiency from the data obtained by the deep sequencing.
DNA may be obtained from the cell or tissue into which the prime editor has been introduced using any of various DNA isolation methods known in the art. Because gene editing is expected to have occurred at the introduced target sequence in each cell or tissue, gene editing efficiency can be detected by sequencing the target sequence. The sequencing method is not limited to a specific method as long as prime editing efficiency data can be obtained; for example, deep sequencing may be used.
The step of analyzing prime editing efficiency from the data obtained by deep sequencing may include calculating the prime editing efficiency. Prime editing efficiency may vary depending on the type and/or length of the pegRNA sequence and the target sequence. The data on prime editing efficiency may be provided as a data set.
According to one example, the editing efficiency of the gene editing complex may be affected by the melting temperature of the PBS; specifically, gene editing efficiency may be improved when the PBS melting temperature is low.
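For readers who want a quick feel for how PBS melting temperature varies with sequence, the sketch below applies the Wallace rule, a textbook approximation for short oligonucleotides (2 °C per A/T and 4 °C per G/C). This is an assumption made purely for illustration: the passage does not state how melting temperature was calculated, the rule is crude, and nearest-neighbor models generally give more realistic values. The function name `wallace_tm` and the toy sequence are hypothetical.

```python
def wallace_tm(pbs: str) -> int:
    """Rough melting temperature (degrees C) of a short oligo by the Wallace rule.

    Tm = 2 * (number of A or T/U) + 4 * (number of G or C).
    U is counted like T so that RNA sequences can be passed directly.
    """
    seq = pbs.upper()
    if not seq or set(seq) - set("ACGTU"):
        raise ValueError("sequence must be non-empty and contain only A, C, G, T, U")
    weak = sum(seq.count(base) for base in "ATU")    # A-T / A-U pairs
    strong = sum(seq.count(base) for base in "GC")   # G-C pairs
    return 2 * weak + 4 * strong


toy_tm = wallace_tm("AAUUGGCC")  # 4 weak and 4 strong bases
```

Under this approximation, shortening a PBS or lowering its GC content lowers the estimated melting temperature, which is the direction the passage associates with possibly improved editing efficiency.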
Another aspect provides a composition for gene editing, comprising: 1) a fusion protein comprising a Cas nickase, a reverse transcriptase (RT), and a single-stranded DNA-binding domain (ssDBD), a polynucleotide encoding the fusion protein, or a recombinant vector comprising the polynucleotide; and 2) a prime editing guide RNA (pegRNA) or a recombinant vector comprising a polynucleotide encoding the pegRNA. The description given above applies equally to the composition.
The composition is for editing a target DNA or a target gene, and may edit a gene through prime editing.
The polynucleotide encoding the Cas nickase may comprise the polynucleotide sequence of SEQ ID NO: 8, and specifically may consist of the polynucleotide sequence of SEQ ID NO: 8.
The polynucleotide encoding the single-stranded DNA-binding domain may comprise the polynucleotide sequence of SEQ ID NO: 9, and specifically may consist of the polynucleotide sequence of SEQ ID NO: 9.
The polynucleotide encoding the reverse transcriptase may comprise the polynucleotide sequence of SEQ ID NO: 10, and specifically may consist of the polynucleotide sequence of SEQ ID NO: 10.
The polynucleotide encoding the first linker may comprise the polynucleotide sequence of SEQ ID NO: 11, and specifically may consist of the polynucleotide sequence of SEQ ID NO: 11.
The polynucleotide encoding the second linker may comprise the polynucleotide sequence of SEQ ID NO: 12, and specifically may consist of the polynucleotide sequence of SEQ ID NO: 12.
The polynucleotide encoding the fusion protein may comprise the polynucleotide sequence of SEQ ID NO: 13, and specifically may consist of the polynucleotide sequence of SEQ ID NO: 13.
As used herein, the term "polynucleotide" refers to a polymer of deoxyribonucleotides or ribonucleotides that exists in single-stranded or double-stranded form. It encompasses RNA genome sequences, DNA (gDNA and cDNA), and RNA sequences transcribed therefrom, and includes analogs of natural polynucleotides unless otherwise specified.
As used herein, the nucleotide sequences encoding the proteins/domains described above, or a fusion protein comprising all or some of them, include not only the nucleotide sequences encoding the amino acids described by their respective SEQ ID NOs, but also, without limitation, any nucleotide sequence showing 80% or more, specifically 90% or more, more specifically 95% or more, still more specifically 98% or more, and most specifically 99% or more homology to those sequences, provided that it encodes a protein showing efficacy substantially identical or corresponding to that of the respective protein.
As long as each protein has identical or corresponding biological activity, the fusion protein may be encoded by a polynucleotide encoding the amino acid sequence of the stated SEQ ID NO or a protein showing 80% or more, 85% or more, 90% or more, 91% or more, 92% or more, 93% or more, 94% or more, 95% or more, 96% or more, 97% or more, 98% or more, or 99% or more homology to that sequence.
In addition, owing to codon degeneracy, and in consideration of the codons preferred by the organism in which the protein is to be expressed, various modifications may be made to the coding region of the polynucleotide encoding the fusion protein within a range that does not change the amino acid sequence of the protein expressed from the coding region. Accordingly, any polynucleotide sequence encoding the respective proteins may be included without limitation.
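Codon degeneracy can be illustrated with a short, self-contained sketch: two different coding sequences that translate to the same peptide under the standard genetic code. The helper name `translate` and the toy sequences are hypothetical and unrelated to SEQ ID NOs: 8 to 13; the codon table is the standard code written in the conventional T, C, A, G ordering.

```python
from itertools import product

# Standard genetic code, codons enumerated as TTT, TTC, TTA, TTG, TCT, ...
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds: str) -> str:
    """Translate a DNA coding sequence (length divisible by 3) into one-letter
    amino acid codes; stop codons are written as '*'."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))


# Two toy coding sequences that differ at the nucleotide level...
variant_a = "ATGGGTTCT"
variant_b = "ATGGGCAGC"
# ...but encode the same peptide (Met-Gly-Ser).
same_protein = translate(variant_a) == translate(variant_b)
```

A codon-optimized variant of a coding region would pass the same check against the original sequence, which is the sense in which the amino acid sequence is left unchanged.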
In addition, the polynucleotide includes not only the nucleotide sequence encoding the amino acid sequence of the fusion protein but also a sequence complementary to that sequence. The complementary sequence includes not only perfectly complementary sequences but also substantially complementary sequences, meaning sequences that can hybridize, under stringent conditions known in the art, with, for example, the nucleotide sequence encoding the amino acid sequence of the fusion protein.
The "stringent conditions" mean conditions that allow specific hybridization between polynucleotides. Such conditions are described in detail in the literature (e.g., J. Sambrook et al., supra). Examples include conditions under which genes with high homology, that is, genes having 40% or more, specifically 90% or more, more specifically 95% or more, still more specifically 97% or more, and particularly specifically 99% or more homology, hybridize with each other while genes with lower homology do not; or conditions of washing once, specifically two to three times, at a salt concentration and temperature corresponding to the washing conditions of ordinary Southern hybridization, namely 60 °C, 1× SSC, 0.1% SDS, specifically 60 °C, 0.1× SSC, 0.1% SDS, and more specifically 68 °C, 0.1× SSC, 0.1% SDS.
Hybridization requires that two polynucleotides have complementary sequences, although mismatches between bases are possible depending on the stringency of hybridization. The term "complementary" is used to describe the relationship between nucleotide bases that can hybridize to each other. For example, with respect to DNA, adenosine is complementary to thymine and cytosine is complementary to guanine. Accordingly, the present application may also include isolated polynucleotide fragments complementary to the entire sequence, as well as substantially similar polynucleotide sequences.
Specifically, polynucleotides having homology can be detected using hybridization conditions that include a hybridization step at a Tm value of 55°C, together with the conditions described above. The Tm value may also be 60°C, 63°C, or 65°C, but is not limited thereto and may be adjusted appropriately by those skilled in the art according to the purpose. The appropriate stringency for hybridizing polynucleotides depends on the length of the polynucleotides and their degree of complementarity, and these variables are well known in the art (see Sambrook et al., supra, 9.50-9.51, 11.7-11.8).
As used herein, the term "recombinant vector" may refer to a vehicle capable of delivering the polynucleotide into a cell and/or expressing the target protein that it encodes. In the present specification, the recombinant vector may include a polynucleotide encoding the fusion protein or a polynucleotide comprising a pegRNA-encoding sequence. The vector may include essential regulatory elements operably linked to the insert, i.e., the polynucleotide, so that the insert can be expressed when present in a cell of an individual. The term "operably linked" means that a nucleic acid expression control sequence and a nucleic acid sequence encoding a protein of interest are functionally linked so as to perform their general function. Operable linkage to a recombinant vector can be achieved using genetic recombination techniques well known in the art, and site-specific DNA cleavage and ligation can readily be performed using enzymes generally known in the art. The vector may include a promoter, a start codon, and a stop codon terminator. In addition, it may appropriately include DNA encoding a signal peptide, and/or an enhancer sequence, and/or the 5' and 3' untranslated regions of the desired gene, and/or a selectable marker region, and/or a replicable unit.
The promoter is a general promoter and may be constitutive or inducible. Examples include, but are not limited to, the lac, tac, T3, and T7 promoters for prokaryotic cells; and, for eukaryotic cells, promoters of simian virus 40 (SV40), mouse mammary tumor virus (MMTV), human immunodeficiency virus (HIV), for example the long terminal repeat (LTR) promoter of HIV, Moloney virus, cytomegalovirus (CMV), Epstein-Barr virus (EBV), and Rous sarcoma virus (RSV), as well as the β-actin promoter and promoters derived from human hemoglobin, human muscle creatine, and human metallothionein.
The selectable marker is used to select cells transformed by introduction of the vector, and markers that confer a selectable phenotype, such as drug resistance, auxotrophy, resistance to cytotoxic agents, or expression of a surface protein, may be used. In an environment treated with a selective agent, only cells expressing the selectable marker survive, so transformed cells can be selected.
The vector may be a viral, cosmid, or plasmid vector, but is not limited thereto. The type of vector is not particularly limited as long as it can express the desired gene and produce the desired protein in the various intended cells, such as prokaryotic and eukaryotic cells. Specifically, a vector that has a promoter exhibiting strong activity and strong expression capacity, and that can produce a large amount of a foreign protein in a form similar to its natural state, may be used.
Various combinations of target cells and vectors may be used. Expression vectors suitable for eukaryotic cells or eukaryotic organisms include, but are not limited to, vectors derived from SV40, bovine papillomavirus, adenovirus, adeno-associated virus, cytomegalovirus, lentivirus, and retrovirus. Expression vectors that can be used in bacterial hosts include, but are not limited to, bacterial plasmids obtained from Escherichia coli, such as pET21a, pET, pRSET, pBluescript, pGEX2T, pUC vectors, col E1, pCR1, pBR322, pMB9, or derivatives thereof; plasmids having a broader host range, such as RP4; phage DNA exemplified by phage lambda derivatives such as λgt10, λgt11, or NM989; and other DNA phages such as M13 and filamentous single-stranded DNA phages. For insect cells, pVL941 or the like may be used.
The composition may be for editing a target DNA or gene ex vivo or in vivo, and may be used for gene editing in eukaryotic cells, prokaryotic cells, eukaryotic organisms, or prokaryotic organisms, specifically for genome editing in eukaryotic cells or eukaryotic organisms. The eukaryotic cells may be cells of yeast, fungi, protozoa, plants, higher plants, insects, amphibians, or birds, or mammalian cells such as CHO, HeLa, HEK293, and COS-1. For example, they may be cells commonly used in the art, such as embryonic cells, stem cells, somatic cells, germ cells, cultured cells (in vitro), graft cells, primary cultured cells (in vitro and ex vivo), in vivo cells, cells of an organism, or mammalian cells, including human cells. The eukaryotic organism may be a yeast, a fungus, a protozoan, a plant, a higher plant, an insect, an amphibian, a bird, or a mammal (e.g., a human, a primate such as a monkey, a dog, a pig, a cow, a sheep, a goat, a mouse, or a rat). Preferably, the composition can be used for prime editing of eukaryotic cells or eukaryotic organisms.
The polynucleotide or recombinant vector may be delivered into cells by any of various methods known in the art, for example, calcium phosphate-DNA co-precipitation, DEAE-dextran-mediated transfection, polybrene-mediated transfection, electroporation, microinjection, liposome fusion, Lipofectamine, and protoplast fusion. When a viral vector is used, the target, i.e., the vector, can be delivered into cells by infection using viral particles. The vector can also be introduced into cells by gene bombardment or the like. The introduced vector may exist in the cell as the vector itself or may be integrated into a chromosome, but is not limited thereto.
Another aspect provides a method of editing a gene of a subject, comprising introducing the composition for gene editing of the present invention into a eukaryotic cell or a eukaryotic organism. The description given above applies equally to this method.
The eukaryotic cell may be a cell isolated from an individual or a eukaryotic organism. In addition, the eukaryotic organism may be a non-human organism.
In the method, 1) a fusion protein comprising a Cas nickase, a reverse transcriptase (RT), and a single-stranded DNA-binding domain (ssDBD), a polynucleotide encoding the fusion protein, or a recombinant vector comprising the polynucleotide; and 2) a prime editing guide RNA (pegRNA) or a recombinant vector comprising a polynucleotide encoding the pegRNA may be introduced simultaneously or sequentially, and the manner of introduction is not limited as long as it does not affect the gene editing efficiency of the composition.
All steps of the genome editing method may be performed intracellularly or extracellularly, or in vivo or ex vivo.
The genome editing method may be performed by prime editing.
According to one aspect, the gene editing complex for prime editing of the present invention and the composition for gene editing comprising it exhibit significantly higher gene editing efficiency than previously known prime editors such as PE2. Accordingly, highly effective gene editing-based therapeutics can be developed using the gene editing complex as an active ingredient.
FIG. 1 is a diagram showing the structures of the PE2 variants used in the present invention.
FIG. 2 is a diagram showing the correlation between the prime editing efficiencies of biological replicates in the high-throughput evaluation of prime editing activity. The number of target sequences is 83.
FIG. 3 is a diagram comparing the prime editing efficiencies of the PE2 variants hyPE2 and PE2-mid_RPA70, normalized to the efficiency of PE2. Target sequences for which the PE2-induced prime editing efficiency was less than 1% are shown as white dots. The number of target sequences is 64.
FIG. 4 is a diagram comparing the prime editing efficiencies of the PE2 variants hyPE2 and PE2-mid_RPA70, normalized to the efficiency of PE2, for target sequences with a PE2-induced prime editing efficiency greater than 1%. The number of target sequences is 30.
FIG. 5 is a diagram comparing the prime editing efficiencies of the PE2 variants hyPE2, PE2-N_Rad51, and PE2-C_Rad51, normalized to the efficiency of PE2, for target sequences with a PE2-induced prime editing efficiency greater than 1%. The number of target sequences is 32.
FIG. 6 is a diagram comparing the prime editing efficiencies of PE2 and hyPE2 in HEK293T cells. Each data point represents the average prime editing efficiency of three biological replicates at one target sequence, and 0.1% was added to all efficiency values so that logarithmic scales could be used on both the x-axis and the y-axis. The number of target sequences is 88.
FIG. 7 is a diagram comparing the prime editing efficiencies of the PE2 variants hyPE2 and PE2-mid_RPA70 in HCT116 cells, normalized to the efficiency of PE2, for target sequences with a PE2-induced prime editing efficiency greater than 1%. The number of target sequences is 43.
FIG. 8 is a diagram showing the structures of the hyPE2 linker variants and the sequence information of the linkers.
FIG. 9 is a diagram comparing the prime editing efficiencies of hyPE2 and the hyPE2 linker variants, normalized to the efficiency of PE2. The fold increase is expressed as an adjusted fold increase, and the number of pegRNAs is 82.
FIG. 10 is a diagram comparing the prime editing efficiency of hyPE2 at the same endogenous targets in HEK293T and HCT116 cells, normalized to the efficiency of PE2, for target sequences with a PE2-induced prime editing efficiency greater than 1%. The number of pegRNAs is 31 for HEK293T and 11 for HCT116.
FIG. 11 is a diagram comparing the prime editing efficiencies of PE2 and hyPE2 at endogenous sites in HEK293T cells. The number of pegRNAs is 63.
FIG. 12 is a diagram comparing the prime editing efficiencies of PE2 and hyPE2 at endogenous sites in HCT116 cells. The number of pegRNAs is 51.
FIG. 13 is a diagram comparing the prime editing efficiencies of PE2 and hyPE2 at six endogenous sites in primary human dermal fibroblasts. The number of pegRNAs is 51.
FIG. 14 is a diagram comparing the prime editing efficiency of hyPE2 at the same target sequences in primary human dermal fibroblasts, normalized to the efficiency of PE2. The adjusted fold increase is shown on the y-axis as the mean ± standard deviation of three independent biological replicates.
FIG. 15 is a diagram comparing the prime editing efficiency and unintended editing frequency of hyPE2 at the same endogenous sites in HEK293T cells, normalized to the results for PE2. Adjusted P-values were calculated by one-way analysis of variance followed by Tukey's multiple comparisons post hoc test, and no P-value is given where the difference between two groups was not statistically significant. The number of pegRNAs is 25.
FIG. 16 is a diagram showing the off-target effects of hyPE2 and PE2 in HEK293T cells at potential off-target sites of HEK4-targeting pegRNAs 2, 3, and 5 and four other pegRNAs. The intended edit positions are highlighted in yellow, and mismatched and bulged nucleotides are shown in red and blue lowercase letters, respectively. Data are shown as the mean ± standard deviation of three independent biological replicates.
FIG. 17 is a diagram comparing the on-target and off-target editing frequencies of hyPE2 at the same endogenous sites in HEK293T cells, normalized to those of PE2. P-values were calculated using a two-tailed, unpaired Student's t-test. The number of on-target pegRNAs is 7, and the number of off-target pegRNAs is 22.
FIG. 18 is a diagram comparing the prime editing efficiency of hyPE2 at the same lentivirally integrated target sequences in HEK293T cells, normalized to the efficiency of PE2, for target sequences with a PE2-induced prime editing efficiency greater than 1%. The number of target sequences is 423.
FIG. 19 is a diagram comparing the Spearman correlation coefficients of the prediction models.
FIG. 20 is a diagram showing the top 10 features associated with hyPE2 activity relative to PE2 activity, as determined by Tree SHAP (XGBoost classifier). The dot color indicates a high (red) or low (blue) value of the relevant feature for each pegRNA.
FIG. 21 is a diagram showing how the fold increase in prime editing efficiency of hyPE2 relative to PE2 depends on the melting temperature of the primer binding site (PBS). The editing efficiencies of hyPE2 and PE2 were determined at the same lentivirally integrated target sequences in HEK293T cells (library B). The numbers of pegRNAs are 4 (<20°C), 32 (20-30°C), 236 (30-40°C), 348 (40-50°C), and 11 (≥50°C).
FIG. 22 is a diagram schematically showing the structure of hyPE2.
FIG. 23 is a diagram showing the predicted three-dimensional structures of hyPE2 (left) and PE2 (right) before and after the reverse transcriptase domain binds to the nicked target ssDNA/pegRNA hybrid. A hypothetical model of the interaction among Rad51, the nicked target ssDNA, and the pegRNA primer binding site is shown for hyPE2.
Hereinafter, the invention is described in more detail through Examples and Experimental Examples. However, these Examples and Experimental Examples are provided for illustration only, and the scope of the present invention is not limited to them.
Example 1: Construction of plasmid vectors
Sequences encoding human RPA70-C and Rad51DBD were synthesized by GenScript. To construct ssDBD-PE2-encoding plasmids, these sequences were amplified by PCR and then cloned into the pCMV-PE2 plasmid (Addgene, no. 132775). The resulting plasmids were named PE2-mid_RPA70, hyPE2, PE2-N_Rad51, and PE2-C_Rad51 (FIG. 1). The linker variants were derived from the hyPE2 plasmid and cloned using Gibson assembly.
Example 2: Construction of plasmid libraries and generation of cell libraries
In a previous study, a plasmid library of 54,836 pairs of pegRNA-encoding sequences and target sequences was generated (Nature Biotechnology volume 39, pages 198-206 (2021)). In the present invention, 107 plasmids were randomly selected from that library by picking colonies, and the selected plasmids were mixed in equimolar ratio (library A).
Next, for a broader evaluation of various types of edits at a larger number of target sequences, another library containing 665 pairs of pegRNAs and target sequences was designed and named "library B". To design library B, 100 deletion-inducing pegRNAs, 100 insertion-inducing pegRNAs, and 200 substitution-inducing pegRNAs were selected from the plasmid library of 54,836 pairs of pegRNA-encoding sequences and target sequences published in the previous study. For this selection, the pegRNAs were divided into eight strata according to the editing efficiencies published in the previous study (Nature Biotechnology volume 39, pages 198-206 (2021)) (<1%, 1-3%, 3-6%, 6-10%, 10-20%, 20-30%, 30-40%, and >40%), and a similar number of pegRNAs was randomly selected from each stratum so that pegRNAs associated with all levels of efficiency were included. The 400 pegRNAs selected in this way were combined with the 107 pegRNAs selected for library A, giving a total of 507 pegRNAs. In addition, 158 of these 507 pegRNAs could be modified to induce a silent mutation in the NGG PAM sequence, so 158 modified pegRNAs that induce a silent mutation in the PAM sequence in addition to the originally designed edit were added to library B. The total number of pegRNAs in library B was therefore 507 + 158 = 665. Each pegRNA was associated with three barcodes, so the number of oligonucleotides used to generate library B was 665 × 3 = 1,995.
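The library B design described above can be sketched in Python as follows. This is only an illustration: the half-open handling of stratum boundaries, the fixed random seed, and the helper names `stratum_index`, `stratified_sample`, and `library_b_size` are assumptions and are not taken from the specification.

```python
import random
from typing import Dict, List, Tuple

# Efficiency strata (in %) listed in the text. Treating each stratum as
# [lower, upper) is an assumption; None marks the open-ended top stratum.
STRATA = [(0, 1), (1, 3), (3, 6), (6, 10), (10, 20), (20, 30), (30, 40), (40, None)]


def stratum_index(efficiency: float) -> int:
    """Return the index of the stratum that contains the given efficiency."""
    for index, (_, upper) in enumerate(STRATA):
        if upper is None or efficiency < upper:
            return index
    raise ValueError("unreachable: the last stratum is open-ended")


def stratified_sample(efficiencies: Dict[str, float], per_stratum: int,
                      seed: int = 0) -> List[str]:
    """Randomly pick up to per_stratum pegRNA ids from every stratum."""
    rng = random.Random(seed)
    groups: Dict[int, List[str]] = {}
    for peg_id in sorted(efficiencies):
        groups.setdefault(stratum_index(efficiencies[peg_id]), []).append(peg_id)
    chosen: List[str] = []
    for index in sorted(groups):
        members = groups[index]
        chosen.extend(rng.sample(members, min(per_stratum, len(members))))
    return chosen


def library_b_size(n_deletion: int = 100, n_insertion: int = 100,
                   n_substitution: int = 200, n_library_a: int = 107,
                   n_pam_modified: int = 158,
                   barcodes_per_pegrna: int = 3) -> Tuple[int, int]:
    """Return (number of pegRNAs, number of oligonucleotides) for library B."""
    n_pegrna = (n_deletion + n_insertion + n_substitution
                + n_library_a + n_pam_modified)
    return n_pegrna, n_pegrna * barcodes_per_pegrna
```

With the default arguments, `library_b_size()` reproduces the counts stated in the text, 665 pegRNAs and 1,995 oligonucleotides.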
Next, to prepare for lentivirus production from the 107-plasmid library, 4.0 × 10^6 HEK293T cells were seeded in 100-mm dishes containing DMEM. After 15 hours, the culture medium was replaced with DMEM containing 25 μM chloroquine diphosphate, and the cells were cultured for a further 5 hours. The plasmid library was mixed with psPAX2 (Addgene no. 12260) and pMD2.G (Addgene no. 12259) at a molar ratio of 1.3:0.72:1.64, and the plasmids were then co-transfected into the HEK293T cells using polyethyleneimine (PEI MAX, Polysciences). The next day, the culture medium was replaced with fresh medium. At 48 hours after transfection, the lentivirus-containing medium was collected and filtered through a Millex-HV 0.45-μm low protein-binding membrane (Millipore), then aliquoted and stored at -80°C. To titrate the lentivirus, serial dilutions of a viral aliquot were transduced, in the presence of 8 μg/ml polybrene (Sigma), into HEK293T cells cultured in DMEM supplemented with 10% FBS. Both untransduced and transduced cells were cultured in DMEM supplemented with 10% FBS and 2 μg/ml puromycin. After all untransduced cells had died, the number of surviving cells in the transduced population was counted to estimate the viral titer.
Next, for lentiviral transduction, 1.0 × 10^6 HEK293T or HCT116 cells were seeded in 100-mm dishes and cultured overnight. To achieve more than 3,000-fold coverage relative to the number of selected pegRNA-encoding plasmids, the lentiviral library was transduced at an MOI of 0.3. The next day, the culture medium was replaced with DMEM supplemented with 10% FBS and 2 μg/ml puromycin. The culture was maintained under these conditions for 5 days to remove untransduced cells.
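The relationship between library size, coverage, and multiplicity of infection can be sketched as follows. The Poisson model of viral integration used here is a common simplification and is an assumption; the specification does not state how coverage was computed, and the helper names are hypothetical.

```python
import math


def min_transduced_cells(n_plasmids: int, coverage: int) -> int:
    """Number of transduced cells needed for the requested fold coverage."""
    return n_plasmids * coverage


def fraction_transduced(moi: float) -> float:
    """Fraction of cells with at least one integration (Poisson assumption)."""
    return 1.0 - math.exp(-moi)


def single_copy_fraction(moi: float) -> float:
    """Among transduced cells, the fraction carrying exactly one integration
    (Poisson assumption): P(k = 1) / P(k >= 1)."""
    return moi * math.exp(-moi) / fraction_transduced(moi)
```

Under this assumption, 3,000-fold coverage of 107 plasmids corresponds to 321,000 transduced cells, and at an MOI of 0.3 most transduced cells carry a single pegRNA and target-sequence pair, which is the usual reason for choosing a low MOI in pooled library experiments.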
Example 3: Delivery of PE2 or PE2 variants into the cell libraries
To deliver the PE2 variants constructed in Example 1 into cell library A or B, the PE2 variant-encoding plasmid, the pcDNA BSD-encoding plasmid, and the puro-eGFP-encoding plasmid were mixed at a weight ratio of 10:1:1 to give a plasmid mixture totaling 12 μg (for library A) or 24 μg (for library B). Lipofectamine 2000 (Invitrogen) was then used to transfect a total of 1 × 10^6 cells of cell library A or a total of 6 × 10^6 cells of cell library B. After overnight culture, the culture medium was replaced with DMEM containing 10% FBS and 40 μg/ml blasticidin S. After 5 days of culture, the transfected cells were harvested with 0.25% trypsin for genomic DNA extraction and deep sequencing.
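The per-plasmid amounts implied by the 10:1:1 weight ratio can be worked out with a short Python sketch. The function name `split_by_weight_ratio` is a hypothetical helper introduced for illustration.

```python
from typing import List, Sequence


def split_by_weight_ratio(total_ug: float, ratio: Sequence[float]) -> List[float]:
    """Split a total plasmid mass (in micrograms) according to a weight ratio."""
    ratio_sum = sum(ratio)
    return [total_ug * part / ratio_sum for part in ratio]


# Order of components: PE2 variant, pcDNA BSD, puro-eGFP (as listed in the text).
library_a_mix = split_by_weight_ratio(12, (10, 1, 1))
library_b_mix = split_by_weight_ratio(24, (10, 1, 1))
```

For library A this gives 10 μg, 1 μg, and 1 μg of the three plasmids, and for library B 20 μg, 2 μg, and 2 μg.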
Example 4: Measurement of prime editing activity at endogenous sites
To evaluate hyPE2 and PE2 activity at endogenous sites, HEK293T or HCT116 cells were seeded in 24-well plates and transfected at 70-80% confluency. Specifically, 750 ng of PE2-encoding plasmid, 250 ng of pegRNA-encoding plasmid, and 100 ng of eGFP-Puro-encoding plasmid (Addgene no. 45561) were mixed and co-transfected into the cells using Lipofectamine 2000 according to the manufacturer's protocol. The next day, the culture medium was replaced with DMEM supplemented with 10% FBS and 2 μg/ml puromycin. After 5 days of culture, the transfected cells were harvested with 0.25% trypsin for genomic DNA extraction and deep sequencing.
Next, after written consent was obtained from healthy study participants, a dermatologist performed skin punch biopsies on the participants. The consent procedure and the study were approved by the Institutional Review Board of Severance Hospital, Yonsei University Health System (No. 4-2012-0028). Fibroblasts obtained from the skin biopsies were cultured in DMEM containing 10% FBS and penicillin/streptomycin. A total of 1 × 10^6 human dermal fibroblasts were mixed with 3 μg of PE2-encoding plasmid, 1 μg of pegRNA-encoding plasmid, and 1 μg of eGFP-Puro-encoding plasmid and electroporated using the Neon electroporation kit according to the manufacturer's protocol. After 5 days of culture, the transfected cells were harvested with 0.25% trypsin for genomic DNA extraction and deep sequencing.
Example 5: Deep sequencing
Deep sequencing was performed using the method described in a previous publication (Nature Biotechnology volume 39, pages 198-206 (2021)). Specifically, genomic DNA was extracted from pelleted cells using the Wizard Genomic DNA Purification Kit (Promega) according to the manufacturer's protocol. To measure prime editing efficiencies in the library experiments, a total of 16 μg of genomic DNA (more than 16,000× coverage) was PCR-amplified using 2× pfu PCR Smart mix (Solgent). The resulting PCR products were pooled and purified with the MEGAquick-spin total fragment DNA purification kit (iNtRON Biotechnology). Next, 20 ng of the purified product was PCR-amplified using primers containing Illumina adapter and barcode sequences. To determine prime editing efficiencies at endogenous sites, ~200 ng of each genomic DNA sample was PCR-amplified in a 20 μl reaction volume. The resulting PCR products were pooled and purified. Next, 100 ng of the purified product was PCR-amplified in a 20 μl reaction volume using primers containing Illumina adapter sequences. The resulting products were purified and sequenced on a MiniSeq (Illumina).
Example 6: Analysis of prime editing activity
The prime editing efficiency (i.e., the intended editing frequency) in the library experiments was calculated using the Python script disclosed in the previously published article 'Nature Biotechnology volume 39, pages 198-206 (2021)', as follows:
[Equation 1]
Figure PCTKR2022001611-appb-img-000001
To identify individual pegRNA and target-sequence pairs, a 22-nt sequence consisting of the 18-nt barcode and the 4-nt sequence upstream of the barcode was used. To improve the accuracy of the analysis, pegRNA and target-sequence pairs with fewer than 100 deep sequencing reads were excluded. Reads that contained the intended edit and had no unintended mutations in the wide target sequence, which includes the PAM, were classified as containing PE2-induced mutations.
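The pair identification and read-count filter described above can be sketched in Python as follows. This is a minimal illustration, not the published analysis script: the function names and the `barcode_start` argument (the position of the barcode within a read) are assumptions introduced here, since the read layout is not given in this section.

```python
from collections import Counter

BARCODE_LEN = 18   # length of the barcode, as stated in the text
UPSTREAM_LEN = 4   # nucleotides taken immediately upstream of the barcode
MIN_READS = 100    # pairs with fewer reads are excluded


def pair_identifier(read, barcode_start):
    """Return the 22-nt identifier (4 nt upstream + 18-nt barcode), or None.

    barcode_start is a hypothetical parameter: the 0-based index at which
    the barcode begins in the read.
    """
    start = barcode_start - UPSTREAM_LEN
    end = barcode_start + BARCODE_LEN
    if start < 0 or end > len(read):
        return None
    return read[start:end]


def count_reads_per_pair(reads, barcode_start):
    """Count reads for each 22-nt identifier."""
    counts = Counter()
    for read in reads:
        ident = pair_identifier(read, barcode_start)
        if ident is not None:
            counts[ident] += 1
    return counts


def pairs_with_enough_reads(counts, min_reads=MIN_READS):
    """Keep only the pairs supported by at least min_reads reads."""
    return {ident: n for ident, n in counts.items() if n >= min_reads}
```

Using the 4-nt upstream sequence together with the barcode gives a longer key than the barcode alone, which makes it less likely that a sequencing error maps a read to the wrong pair.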
To evaluate the intended editing frequency, the unintended editing frequency, and the indel frequency at endogenous sites, Cas-analyzer was used, and each frequency was calculated as follows:
[Equation 2]
Figure PCTKR2022001611-appb-img-000002
[Equation 3]
Figure PCTKR2022001611-appb-img-000003
[Equation 4]
Figure PCTKR2022001611-appb-img-000004
For the analysis of unintended substitutions near the target site, a 40 nt region ranging from -10 nucleotides (nts) to +25 nts from the nick site was evaluated for substitutions, and the mean value was taken as the read count for subsequent calculations.
In some cases, to avoid the mathematical error that arises when the PE2 efficiency is 0% (division by zero) and to dampen fold increases that are not meaningful, an adjusted fold increase was used, in which 0.1% was added to both the hyPE2 and PE2 efficiencies. It was calculated as follows:
[Equation 5]
\[ \text{Adjusted fold increase} = \frac{\text{hyPE2 efficiency} + 0.1\%}{\text{PE2 efficiency} + 0.1\%} \]
For example, an increase from 0.015% to 0.15% gives an adjusted fold increase of (0.15% + 0.1%)/(0.015% + 0.1%) = 2.2-fold instead of 10-fold. In contrast, an increase from 1.5% to 15% gives (15% + 0.1%)/(1.5% + 0.1%) = 9.4-fold, which is close to 10-fold. Where the adjusted fold increase was used instead of the fold increase, this is noted in the legend of the relevant figure.
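The adjusted fold increase can be written as a short Python function. This is an illustrative sketch rather than the script used in the study; the function name `adjusted_fold_increase` and the `pseudocount` parameter are names introduced here, and efficiencies are assumed to be expressed in percent.

```python
def adjusted_fold_increase(hype2_eff, pe2_eff, pseudocount=0.1):
    """Return (hyPE2 + 0.1) / (PE2 + 0.1), with efficiencies given in percent.

    The pseudocount keeps the ratio finite when the PE2 efficiency is 0%
    and dampens fold changes between two very small efficiencies.
    """
    if hype2_eff < 0 or pe2_eff < 0:
        raise ValueError("efficiencies must be non-negative percentages")
    return (hype2_eff + pseudocount) / (pe2_eff + pseudocount)


# The two worked examples from the text.
low = adjusted_fold_increase(0.15, 0.015)   # about 2.2-fold rather than 10-fold
high = adjusted_fold_increase(15.0, 1.5)    # about 9.4-fold, close to 10-fold
# A PE2 efficiency of 0% no longer causes a division by zero.
zero = adjusted_fold_increase(2.3, 0.0)
```

The pseudocount matters most when both efficiencies are near the detection floor, where a raw ratio would be dominated by noise; for efficiencies well above 0.1% the adjusted and raw fold increases are nearly the same.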
Example 7: Measurement of prime editing activity at potential PE2 off-target sites
Potential PE2 off-target sites with up to two nucleotide mismatches or a one-nucleotide RNA or DNA bulge were identified using Cas-OFFinder.
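The mismatch criterion can be illustrated with a small Python sketch. This is not Cas-OFFinder and it does not handle RNA or DNA bulges; it only counts mismatches between two sequences of equal length. The function names and the example sequences are hypothetical.

```python
def count_mismatches(spacer, site):
    """Count positions at which two equal-length sequences differ."""
    if len(spacer) != len(site):
        raise ValueError("sequences must have the same length")
    return sum(1 for a, b in zip(spacer.upper(), site.upper()) if a != b)


def is_candidate_off_target(spacer, site, max_mismatches=2):
    """True if site differs from spacer at 1 to max_mismatches positions.

    A site with zero mismatches is the on-target sequence itself and is
    therefore not reported as an off-target candidate.
    """
    n = count_mismatches(spacer, site)
    return 0 < n <= max_mismatches


spacer = "ACGTACGTACGTACGTACGT"   # made-up 20-nt spacer
site_a = "ACGTACGTACCTACGTACGA"   # differs at two positions
site_b = "TCGTACGTACCTACGTACGA"   # differs at three positions

candidate_a = is_candidate_off_target(spacer, site_a)
candidate_b = is_candidate_off_target(spacer, site_b)
```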
To evaluate prime editing efficiency at the potential off-target sites, the genomic DNA samples used to measure prime editing activity at the endogenous sites described above were used as templates for PCR amplification. The resulting products were purified and sequenced on a MiSeq.
Example 8: Training of conventional machine learning-based models
The hyPE2- and PE2-induced prime editing efficiency data obtained using library B were divided into training and test data sets by random sampling such that no pegRNA or target sequence was shared between the two data sets. Models were trained with each of seven conventional machine learning algorithms: XGBoost (extreme gradient boosting), gradient-boosted regression trees (Boosted RT), random forest, L1-regularized linear regression (Lasso), L2-regularized linear regression (Ridge), L1L2-regularized linear regression (ElasticNet), and support vector machine (SVM). The models were implemented using the XGBoost Python package (version 1.3.3) and scikit-learn (version 0.23.2).
A set of 1,820 features, including position-independent and position-dependent nucleotides and dinucleotides, melting temperature, GC count, minimum self-folding free energy, and DeepSpCas9 score (Kim, H.K. et al. SpCas9 activity prediction by DeepSpCas9, a deep learning-based model with high generalization performance. Sci Adv 5, eaax9249 (2019)), was extracted from the wide target sequence and the PBS and RT template sequences. The MeltingTemp module (https://biopython.org/docs/1.74/api/Bio.SeqUtils.MeltingTemp.html) was used with default settings to calculate melting temperatures. Five-fold cross-validation was performed to select a model among the regularization parameter and hyperparameter configurations of each algorithm. Details for each machine learning algorithm are as follows.
For XGBoost and gradient-boosted regression trees, 16 or more models chosen from the following hyperparameter configurations were searched: the number of base estimators (chosen from [50, 100]), the maximum depth of the individual regression estimators (chosen from [5, 10]), the minimum number of samples at a leaf node (chosen from [1, 2]), and the learning rate (chosen from [0.1, 0.2]).
For random forest, 16 or more models were searched, chosen from the same hyperparameter configurations used for XGBoost except for the learning rate, together with the maximum number of features considered when looking for the best split (chosen from [all features, the square root of all features, the binary logarithm of all features]).
For L1-, L2-, and L1L2-regularized linear regression, 16 or more points evenly spaced in log space between 10^-6 and 10^6 were searched to optimize the regularization parameter.
For SVM, 16 or more models were searched over the following hyperparameters: the penalty parameter C and the kernel parameter γ, with four points evenly spaced between 10^-3 and 10^3.
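The search spaces listed above can be enumerated with the Python standard library, as in the sketch below. It only builds the candidate configurations and does not train any model. The dictionary keys borrow scikit-learn's gradient-boosting parameter names for illustration and are an assumption (XGBoost uses different names for some of these settings). Log spacing of the SVM points and a full 4 × 4 combination of C and γ are also assumptions.

```python
from itertools import product

# Boosted trees: four hyperparameters with two candidate values each.
boosting_grid = {
    "n_estimators": [50, 100],
    "max_depth": [5, 10],
    "min_samples_leaf": [1, 2],
    "learning_rate": [0.1, 0.2],
}


def expand_grid(grid):
    """Return every combination of the candidate values as a list of dicts."""
    names = list(grid)
    return [dict(zip(names, values))
            for values in product(*(grid[name] for name in names))]


def log_spaced(low_exp, high_exp, num):
    """Return num values evenly spaced in log10 space, 10**low_exp to 10**high_exp."""
    step = (high_exp - low_exp) / (num - 1)
    return [10 ** (low_exp + i * step) for i in range(num)]


boosting_models = expand_grid(boosting_grid)   # 2 * 2 * 2 * 2 combinations
alphas = log_spaced(-6, 6, 16)                 # regularization strengths
svm_points = log_spaced(-3, 3, 4)              # roughly 1e-3, 1e-1, 1e1, 1e3
svm_models = expand_grid({"C": svm_points, "gamma": svm_points})
```

With two candidate values for each of four hyperparameters, the boosted-tree grid contains 2^4 = 16 configurations, which matches the count of 16 given in the text. Each configuration would then be scored by five-fold cross-validation, for example with scikit-learn's `GridSearchCV`.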
Example 9: Three-dimensional structural modeling
The structural model of hyPE2 shown in FIG. 23 was built with the Coot program (version WinCoot 0.9.6.1). A three-dimensional model of Cas9 in complex with the guide RNA and target DNA fragments was obtained from the structure of the SpCas9 DNA adenine base editor (PDB code: 6VPC). To manually model the 3' extension of the 121-nt pegRNA (residues 83-121), the secondary structure of this region was predicted using the RNAfold WebServer, and a hairpin structure was adopted for these residues. The pegRNA RT-template region hybridized with the 16-nt DNA primer region was modeled manually based on the structure of XMRV RT in complex with an RNA:DNA hybrid (PDB code: 4HKQ). The three-dimensional model of the Rad51 ssDBD (residues 16-85) was obtained from the structure of the N-terminal domain of Rad51 (PDB code: 1B22). Secondary structures were predicted using RaptorX to find putative α-helices in the flexible N- and C-terminal regions of the Rad51 ssDBD, Linker A, and Linker B. FIG. 23, which shows the three-dimensional structure, was generated using the UCSF Chimera program, and the two linkers were drawn on the three-dimensional structure in consideration of their lengths and secondary structures.
For the schematic structural model of hyPE2 shown in Supplementary Fig. 5a, the coordinates of Cas9 (PDB 4OO8), RT (PDB 5DMQ), and Rad51 (PDB 1B22) were used. Structural images were prepared using the CueMol program (version 2.2.3.443; http://www.cuemol.org).
Example 10: Statistical and reproducibility analysis
Data are presented as the mean ± standard deviation from independent experiments. P-values were calculated, depending on the number of independent variables, by a two-tailed paired t-test, an unpaired Student's t-test, or one-way analysis of variance (ANOVA) followed by Tukey's multiple-comparison post-hoc test. The high-throughput experiments were performed independently in triplicate for library A and in duplicate for library B and the linker variants. All replicates showed similar results. The individual evaluation experiments in HEK293T cells, HCT116 cells, and human fibroblasts were performed independently in triplicate, with similar results.
Example 11: Data and code availability
The deep sequencing data of the present invention have been submitted to the NCBI Sequence Read Archive (SRA; https://www.ncbi.nlm.nih.gov/sra/) under accession no. SRP307854. The protein structure data used to predict the structures of PE2 and hyPE2 were obtained from the Protein Data Bank (https://www.rcsb.org). The PDB codes are as follows:
6VPC - SpCas9 DNA adenine base editor (https://doi.org/10.2210/pdb6VPC/pdb)
4HKQ - XMRV RT in complex with an RNA:DNA hybrid (https://doi.org/10.2210/pdb4HKQ/pdb)
1B22 - N-terminal domain of Rad51 (https://doi.org/10.2210/pdb1B22/pdb)
4OO8 - Cas9 (https://doi.org/10.2210/pdb4OO8/pdb)
5DMQ - reverse transcriptase (https://doi.org/10.2210/pdb5DMQ/pdb)
Prime editing efficiencies were calculated using the Python script published in the earlier article (Nature Biotechnology volume 39, pages 198-206 (2021)), which is available at https://github.com/hkimlab-PE/PE_SupplementaryCode.
Experimental Example 1: Development of hyPE2
During prime editing, when the Cas9 nickase domain and the pegRNA bind to the target sequence, single-stranded DNA (ssDNA) is generated and serves as the primer for reverse transcription of the pegRNA RT-template region. It was therefore hypothesized that adding an ssDNA-binding protein domain (ssDBD) would stabilize the ssDNA and thereby improve prime editing efficiency.
Accordingly, the following experiments were performed to develop hyPE2, which has a higher prime editing efficiency than PE2.
1.1: Evaluation of prime editing efficiency upon addition of a DBD
To evaluate whether adding a DNA-binding domain (DBD) to the previously developed PE2 improves prime editing efficiency, an ssDBD, either the Rad51 DBD or RPA70-C, was inserted between the Cas9 nickase and RT domains of PE2, and the resulting PE2 variants were named PE2-mid_Rad51 (hyperPE2 or hyPE2) and PE2-mid_RPA70, respectively (FIG. 1).
When the activity of these variants is evaluated at only one or two target sequences, it is difficult to draw general conclusions about their activity, because prime editing efficiency can vary greatly depending on the target sequence. In this regard, a previous study determined the activities of a total of 54,836 pairs using lentiviral libraries of pegRNA-encoding sequences and target sequences (Nature Biotechnology volume 39, pages 198-206 (2021)). Accordingly, 107 plasmids were randomly selected from plasmid library 1 of that study, which comprises 48,000 pairs. From this 107-pair plasmid library, a lentiviral library was generated and transduced into HEK293T cells, yielding a cell library named library A. In library A, the target sequences and the corresponding pegRNA-encoding sequences are integrated into the genome by the lentivirus.
Plasmids encoding hyPE2, PE2-mid_RPA70, or PE2, constructed as described above, were delivered into the cell library, and prime editing efficiencies were then determined by deep sequencing. First, a high correlation between biological replicates was confirmed (FIG. 2), and the mean prime editing efficiency of the three biological replicates was used for the analysis. Of the initial 107 pairs, 24 pairs were excluded because of insufficient deep sequencing read counts (< 100 reads), and 19 pairs were excluded because their PE2 editing efficiency was 0%, which would cause an error when normalizing the hyPE2 efficiency to the PE2 efficiency.
For the remaining 64 pairs, the prime editing efficiencies normalized to PE2 showed a median increase of 2.4-fold (range, 0- to 360-fold) for hyPE2 and 1.5-fold (range, 0- to 232-fold) for PE2-mid_RPA70 (FIG. 3). In addition, the median increase was higher for pairs with a PE2 editing efficiency below 1% than for pairs with a PE2 efficiency above 1%.
Because fold increases calculated from very low PE2 efficiencies are prone to error, an additional 34 target sequences with PE2 efficiencies below 1% were removed from the fold-increase calculation. As a result, hyPE2 and PE2-mid_RPA70 showed median increases of 1.6-fold (range, 1.1- to 11-fold) and 1.2-fold (range, 0.57- to 6.6-fold), respectively, compared with PE2 (FIG. 4). Moreover, of the 34 pairs with PE2 efficiencies between 0% and 1%, 15 and 13 pairs showed prime editing efficiencies above 1% with hyPE2 (mean efficiency 11%, median 9.1%, range 1.1-29%) and PE2-mid_RPA70 (mean efficiency 7.4%, median 5.3%, range 1.1-24%), respectively. Of the 19 pairs with 0% PE2 efficiency, 3 and 2 pairs showed efficiencies of 1% or higher with hyPE2 (1.8, 2.3, and 4.0%) and PE2-mid_RPA70 (1.2 and 1.5%), respectively.
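The exclusion steps and the median fold increase described above can be sketched in Python as follows. This is an illustration on made-up numbers, not the analysis code of the study; the record keys (`reads`, `pe2`, `variant`) and the function name are hypothetical, and efficiencies are in percent.

```python
from statistics import median

MIN_READS = 100    # pairs with fewer deep sequencing reads are excluded
PE2_FLOOR = 1.0    # pairs with PE2 efficiency below 1% are excluded


def fold_increases(pairs, min_reads=MIN_READS, pe2_floor=PE2_FLOOR):
    """Return variant/PE2 efficiency ratios for pairs that pass both filters.

    Requiring PE2 efficiency >= pe2_floor also removes pairs with 0% PE2
    efficiency, for which the ratio would be undefined.
    """
    kept = [p for p in pairs
            if p["reads"] >= min_reads and p["pe2"] >= pe2_floor]
    return [p["variant"] / p["pe2"] for p in kept]


# Made-up records for illustration only.
toy_pairs = [
    {"reads": 50, "pe2": 3.0, "variant": 6.0},    # too few reads
    {"reads": 500, "pe2": 0.0, "variant": 2.3},   # PE2 efficiency of 0%
    {"reads": 500, "pe2": 0.5, "variant": 4.0},   # PE2 efficiency below 1%
    {"reads": 500, "pe2": 2.0, "variant": 4.0},
    {"reads": 500, "pe2": 10.0, "variant": 12.0},
    {"reads": 500, "pe2": 5.0, "variant": 15.0},
]

folds = fold_increases(toy_pairs)
median_fold = median(folds)

# Pair counts reported in the text: 107 - 24 - 19 = 64, then 64 - 34 = 30.
remaining_after_first_filters = 107 - 24 - 19
remaining_after_pe2_floor = remaining_after_first_filters - 34
```

Reporting the median rather than the mean limits the influence of the few pairs with very large ratios, which is relevant here because the reported ranges extend up to 360-fold before the 1% floor is applied.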
These results show that adding the Rad51 DBD to PE2 improves prime editing efficiency; the following experiments were therefore based on the addition of the Rad51 DBD.
1.2: Evaluation of prime editing efficiency according to the insertion position of the Rad51 DBD
To evaluate prime editing efficiency according to the position at which the Rad51 DBD is inserted, the following experiment was performed.
Specifically, variants in which the Rad51 DBD was inserted at the N- or C-terminal region of PE2 were constructed and named PE2-N_Rad51 and PE2-C_Rad51, respectively (FIG. 1). At target sequences with PE2 efficiencies above 1%, the prime editing efficiencies of hyPE2 and the two variants were compared with that of PE2: hyPE2 showed the highest overall activity (median, 1.8-fold the activity of PE2), whereas PE2-N_Rad51 showed lower activity than PE2 (median, 0.85-fold) and PE2-C_Rad51 showed slightly higher overall activity than PE2 (median, 1.1-fold) (FIG. 5). In addition, hyPE2 was more efficient than PE2 at all 33 target sequences with PE2 efficiencies above 1%. Of the remaining 55 target sequences, at which the PE2 prime editing efficiency was below 1%, 20 (20/55 = 36%) showed hyPE2 prime editing efficiencies above 1%, with a mean efficiency of 9.1% (median 5.7%, range 1.1-29%) (FIG. 6).
These results show that, when the Rad51 DBD is added to PE2, the variant with the insertion between the Cas9 nickase and the RT has the highest prime editing efficiency; the following experiments were therefore based on hyPE2.
1.3: Evaluation of prime editing efficiency in another cell line
To determine whether the high prime editing efficiency of hyPE2 is also observed in other cell lines, the following experiment was performed.
Specifically, to evaluate hyPE2 activity in another cell line, an HCT116 cell library containing the 107 pairs of pegRNA-encoding sequences and target sequences was generated. As above, 26 pairs with insufficient read counts (< 100 reads) and 38 pairs with PE2 efficiencies < 1% were excluded. For the remaining 43 pairs, the efficiency of hyPE2 was a median of 1.4-fold (range, 0.82- to 2.9-fold) that of PE2, confirming that hyPE2 is more efficient than PE2 (FIG. 7).
1.4: Evaluation of prime editing efficiency according to the linker
To evaluate prime editing efficiency according to the type of linker used to insert the Rad51 DBD, the following experiment was performed.
Specifically, to evaluate the effect of the linkers on hyPE2 activity, six linker variants, named hyPE2-AA, -AB, -BB, -AY, -AX, and -XA, were constructed (FIG. 8). When their activities were compared with that of hyPE2 using cell library A by the method described above, the variants showed lower prime editing efficiencies than hyPE2 (FIG. 9).
These results show that using Linker B and Linker A to insert the Rad51 DBD into PE2 gives the highest prime editing efficiency; the following experiments were therefore based on hyPE2. The amino acid sequences of the proteins/domains and linkers contained in hyPE2 are listed in Tables 1 and 2 below.
[Table 1]
Protein/domain | Amino acid sequence | SEQ ID NO
Cas9 H840A | DKKYSIGLDIGTNSVGWAVITDEYKVPSKKFKVLGNTDRHSIKKNLIGALLFDSGETAEATRLKRTARRRYTRRKNRICYLQEIFSNEMAKVDDSFFHMKRTADGSEFESPKKKRKVDKKYSIGLDIGTNSVGWAVITDEYKVPSKKFKVLGNTDRHSIKKNLIGALLFDSGETAEATRLKRTARRRYTRRKNRICYLQEIFSNEMAKVDDSFFHRLEESFLVEEDKKHERHPIFGNIVDEVAYHEKYPTIYHLRKKLVDSTDKADLRLIYLALAHMIKFRGHFLIEGDLNPDNSDVDKLFIQLVQTYNQLFEENPINASGVDAKAILSARLSKSRRLENLIAQLPGEKKNGLFGNLIALSLGLTPNFKSNFDLAEDAKLQLSKDTYDDDLDNLLAQIGDQYADLFLAAKNLSDAILLSDILRVNTEITKAPLSASMIKRYDEHHQDLTLLKALVRQQLPEKYKEIFFDQSKNGYAGYIDGGASQEEFYKFIKPILEKMDGTEELLVKLNREDLLRKQRTFDNGSIPHQIHLGELHAILRRQEDFYPFLKDNREKIEKILTFRIPYYVGPLARGNSRFAWMTRKSEETITPWNFEEVVDKGASAQSFIERMTNFDKNLPNEKVLPKHSLLYEYFTVYNELTKVKYVTEGMRKPAFLSGEQKKAIVDLLFKTNRKVTVKQLKEDYFKKIECFDSVEISGVEDRFNASLGTYHDLLKIIKDKDFLDNEENEDILEDIVLTLTLFEDREMIEERLKTYAHLFDDKVMKQLKRRRYTGWGRLSRKLINGIRDKQSGKTILDFLKSDGFANRNFMQLIHDDSLTFKEDIQKAQVSGQGDSLHEHIANLAGSPAIKKGILQTVKVVDELVKVMGRHKPENIVIEMARENQTTQKGQKNSRERMKRIEEGIKELGSQILKEHPVENTQLQNEKLYLYYLQNGRDMYVDQELDINRLSDYDVDAIVPQSFLKDDSIDNKVLTRSDKNRGKSDNVPSEEVVKKMKNYWRQLLNAKLITQRKFDNLTKAERGGLSELDKAGFIKRQLVETRQITKHVAQILDSRMNTKYDENDKLIREVKVITLKSKLVSDFRKDFQFYKVREINNYHHAHDAYLNAVVGTALIKKYPKLESEFVYGDYKVYDVRKMIAKSEQEIGKATAKYFFYSNIMNFFKTEITLANGEIRKRPLIETNGETGEIVWDKGRDFATVRKVLSMPQVNIVKKTEVQTGGFSKESILPKRNSDKLIARKKDWDPKKYGGFDSPTVAYSVLVVAKVEKGKSKKLKSVKELLGITIMERSSFEKNPIDFLEAKGYKEVKKDLIIKLPKYSLFELENGRKRMLASAGELQKGNELALPSKYVNFLYLASHYEKLKGSPEDNEQKQLFVEQHKHYLDEIIEQISEFSKRVILADANLDKVLSAYNKHRDKPIREQAENIIHLFTLTNLGAPAAFKYFDTTIDRKRYTSTKEVLDATLIHQSITGLYETRIDLSQLGGD | 1
[Table 2]
Protein/domain | Amino acid sequence | SEQ ID NO
Rad51 ssDBD | AMQMQLEANADTSVEEESFGPQPISRLEQCGINANDVKKLEEAGFHTVEAVAYAPKKELINIKGISEAKADKILAEAAKLVPMGFTTATEFHQRRSEIIQITTGSKELDKLLQ | 2
Reverse transcriptase | TLNIEDEYRLHETSKEPDVSLGSTWLSDFPQAWAETGGMGLAVRQAPLIIPLKATSTPVSIKQYPMSQEARLGIKPHIQRLLDQGILVPCQSPWNTPLLPVKKPGTNDYRPVQDLREVNKRVEDIHPTVPNPYNLLSGLPPSHQWYTVLDLKDAFFCLRLHPTSQPLFAFEWRDPEMGISGQLTWTRLPQGFKNSPTLFNEALHRDLADFRIQHPDLILLQYVDDLLLAATSELDCQQGTRALLQTLGNLGYRASAKKAQICQKQVKYLGYLLKEGQRWLTEARKETVMGQPTPKTPRQLREFLGKAGFCRLFIPGFAEMAAPLYPLTKPGTLFNWGPDQQKAYQEIKQALLTAPALGLPDLTKPFELFVDEKQGYAKGVLTQKLGPWRRPVAYLSKKLDPVAAGWPPCLRMVAAIAVLTKDAGKLTMGQPLVILAPHAVEALVKQPPDRWLSNARMTHYQALLLDTDRVQFGPVVALNPATLLPLPEEGLQHNCLDILAEAHGTRPDLTDQPLPDADHTWYTDGSSLLQEGQRKAGAAVTTETEVIWAKALPAGTSAQRAELIALTQALKMAEGKKLNVYTDSRYAFATAHIHGEIYRRRGWLTSEGKEIKNKDEILALLKALFLPKRLSIIHCPGHQKGHSAEARGNRMADQAARKAAITETPDTSTLLIENSSPSGGSKRTADGSEFEPKKKRKV* | 3
Linker B | SGGSPKKKRKVGSSGS | 4
Linker A | SGGSSGGSSGSETPGTSESATPESSGGSSGGSS | 5
Experimental Example 2: Evaluation of hyPE2 activity at endogenous targets
To evaluate the prime editing efficiency of hyPE2 at endogenous targets, the following experiments were performed.
Specifically, the efficiencies of hyPE2 and PE2 were evaluated at 63 and 51 endogenous target sites in HEK293T and HCT116 cells, respectively. The PE2 efficiency was above 1% at 31 targets in HEK293T cells and at 11 targets in HCT116 cells; at these target sites, the efficiency of hyPE2 was 1.4-fold (median; range, 0.89- to 2.2-fold) and 1.5-fold (median; range, 1.0- to 2.6-fold) that of PE2 in HEK293T and HCT116 cells, respectively (FIGS. 10 to 12). Specifically, with hyPE2 the prime editing efficiency increased from 5.8% to 13% in HEK293T cells and from 1.1% to 2.8% in HCT116 cells.
At the remaining 32 and 40 target sites in HEK293T and HCT116 cells, respectively, where the PE2 efficiency was below 1%, 11 target sequences in HEK293T cells (11/32 = 34%) showed hyPE2-induced prime editing efficiencies above 1%, with a mean efficiency of 1.6% (median 1.4%, range 1.1-2.9%), whereas all hyPE2 efficiencies in HCT116 cells were below 1%. These results show that hyPE2 has higher overall activity than PE2.
Next, the efficiency of hyPE2 was compared with that of PE2 at six target sequences in primary human dermal fibroblasts, a therapeutically relevant cell type. Compared with PE2, hyPE2 showed higher prime editing efficiency at five of the six targets, and the mean and median increases across the six targets were 2.1-fold and 2.0-fold (adjusted fold), respectively (FIGS. 13 and 14).
Experimental Example 3: Evaluation of unintended edits and off-target effects of hyPE2
To determine whether hyPE2 induces unintended edits at endogenous target sequences at a higher level than PE2, the following experiment was performed, and the adjusted fold increase was adopted to compare the frequencies of unintended edits. When hyPE2 was used instead of PE2, the levels of unintended substitutions and other unintended edits such as indels increased slightly, but the fold increase in unintended edits was lower than or similar to the fold increase in intended edits (FIG. 15).
Next, off-target editing was evaluated by analyzing potential off-target sites of pegRNAs that showed relatively high on-target prime editing efficiencies in HEK293T cells. Specifically, potential off-target sites with up to two nucleotide mismatches or a one-nucleotide RNA or DNA bulge per pegRNA were first identified, yielding a total of six potential off-target sites for three pegRNAs. In addition, prime editing efficiencies were determined using four HEK4-targeting pegRNAs, corresponding to a total of 16 pairs of pegRNAs and off-target sites, from the initial study of prime editing (FIG. 16). Evaluation of these 22 (= 6 + 16) pairs of pegRNAs and potential off-target sites in HEK293T cells confirmed that the off-target effects of hyPE2 were at a level similar to those of PE2 (FIGS. 16 and 17).
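The mismatch criterion used above to nominate potential off-target sites can be illustrated as follows. This is a minimal sketch under stated assumptions: it handles only equal-length comparisons (up to two mismatches) and does not model the one-nucleotide RNA or DNA bulges also mentioned in the text; the sequences are hypothetical, and the function names `count_mismatches` and `candidate_off_targets` are introduced for illustration and do not come from the specification.

```python
def count_mismatches(a, b):
    """Count positions at which two equal-length sequences differ (Hamming distance)."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x != y for x, y in zip(a.upper(), b.upper()))


def candidate_off_targets(spacer, sites, max_mismatches=2):
    """Keep sites that differ from the spacer by 1..max_mismatches nucleotides.

    Sites with zero mismatches are the on-target sequence and are excluded.
    Bulges are NOT handled in this sketch.
    """
    return [s for s in sites if 0 < count_mismatches(spacer, s) <= max_mismatches]


# Hypothetical 20-nt spacer and genomic candidates (not sequences from the specification).
spacer = "ACGTACGTACGTACGTACGT"
sites = [
    "ACGTACGTACGTACGTACGT",  # 0 mismatches (on-target)
    "ACGTACGTACGTACGTACGA",  # 1 mismatch
    "TCGTACGTACGTACGTACGA",  # 2 mismatches
    "TGGTACGTACGTACGTACGA",  # 3 mismatches
]

hits = candidate_off_targets(spacer, sites)
print(hits)
```

In practice, genome-wide nomination of such sites would be done with a dedicated search tool rather than by pairwise comparison; the sketch only makes the filtering rule explicit.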
These results indicate that hyPE2 does not induce notably more unintended edits or off-target effects than PE2.
Experimental Example 4: Development of PEselector
Given that, according to the results of the above experimental examples, the hyPE2-induced increase in prime editing efficiency is greater for some pegRNAs than for others, the following experiment was performed to predict the fold increase in prime editing efficiency achieved by hyPE2.
Specifically, hyPE2- and PE2-induced prime editing efficiencies were compared by the high-throughput method described above, using 665 pairs of pegRNAs and integrated target sequences (library B). First, when the fold-increase data derived from libraries A and B were combined, the mean and median fold increases were 1.3-fold and 1.2-fold, respectively (FIG. 18).
Next, the library B data were randomly divided into a training data set and a test data set such that neither pegRNAs nor target sequences were shared between the two data sets. Using the training data set of hyPE2- and PE2-induced prime editing efficiencies determined with 568 pegRNAs, seven computational models were generated to predict the fold increase in hyPE2-induced prime editing efficiency relative to PE2; of these seven models, the SVM-based model showed the highest accuracy (FIG. 19). This model was named PEselector and is provided at http://deepcrispr.info/PEselector.
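A random split in which no target sequence (and hence no pegRNA assigned to it) appears in both data sets can be sketched as follows. This is an illustration only, not the procedure actually used for PEselector: it assumes each pegRNA belongs to exactly one target, the record layout and the function name `split_by_target` are hypothetical, and the default `test_fraction` of 0.15 is an assumption chosen to be close to the ratio implied by 568 training pegRNAs out of 665 pairs.

```python
import random


def split_by_target(records, test_fraction=0.15, seed=0):
    """Randomly split records so that every target sequence lands entirely in one set.

    Assumes each pegRNA is paired with exactly one target, so splitting on
    targets also keeps pegRNAs disjoint between the two sets.
    """
    targets = sorted({r["target"] for r in records})
    rng = random.Random(seed)
    rng.shuffle(targets)
    n_test = max(1, round(len(targets) * test_fraction))
    test_targets = set(targets[:n_test])
    train = [r for r in records if r["target"] not in test_targets]
    test = [r for r in records if r["target"] in test_targets]
    return train, test


# Hypothetical library: 20 targets with 2 pegRNAs each (not library B itself).
records = [
    {"pegRNA": f"peg{t}_{i}", "target": f"target{t}"}
    for t in range(20)
    for i in range(2)
]

train, test = split_by_target(records)
shared_targets = {r["target"] for r in train} & {r["target"] for r in test}
shared_pegrnas = {r["pegRNA"] for r in train} & {r["pegRNA"] for r in test}
print(len(train), len(test), len(shared_targets), len(shared_pegrnas))
```

Splitting on groups rather than on individual records prevents closely related examples from leaking between training and test data, which would otherwise inflate the apparent accuracy of a model.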
Experimental Example 5: Identification of factors associated with hyPE2 efficiency
To identify factors associated with the fold increase in hyPE2-induced prime editing efficiency, the Tree SHAP method (SHapley Additive exPlanations integrated with the XGBoost algorithm) was used (Lundberg, S.M. et al. From local explanations to global understanding with explainable AI for trees. Nature Machine Intelligence 2, 56-67 (2020)). Among 1,820 features, the most important factor was found to be the melting temperature of the primer-binding site (PBS) (FIG. 20). Furthermore, examination of how the fold increase in hyPE2 efficiency depends on the PBS melting temperature showed that the fold increase tends to decrease as the PBS melting temperature increases (FIG. 21), indicating that hyPE2 would be particularly useful when the PBS melting temperature is low.
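To make the notion of a low versus high PBS melting temperature concrete, the following sketch estimates the melting temperature of a short sequence with the Wallace rule, Tm = 2(A+T) + 4(G+C) in degrees Celsius. This is an assumption made purely for illustration: this section does not state how the PBS melting temperature was calculated, the Wallace rule is only a rough approximation for short oligonucleotides, and the example sequences and the function name `wallace_tm` are hypothetical.

```python
def wallace_tm(seq):
    """Rough melting temperature (deg C) by the Wallace rule: 2*(A+T) + 4*(G+C).

    RNA input is accepted by treating U as T. This is a crude estimate for
    short oligonucleotides, used here only to illustrate the trend.
    """
    seq = seq.upper().replace("U", "T")
    if not seq or set(seq) - set("ACGT"):
        raise ValueError("sequence must contain only A, C, G, T/U")
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


# Hypothetical PBS sequences of equal length (not sequences from the specification).
at_rich_pbs = "AUAUUAAU"
gc_rich_pbs = "GCGGCCGC"

print(wallace_tm(at_rich_pbs), wallace_tm(gc_rich_pbs))
```

Under this approximation, an AT-rich PBS has a lower estimated melting temperature than a GC-rich PBS of the same length, which is the regime in which the text reports the largest fold increase for hyPE2.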
Next, to identify a potential mechanism underlying the higher prime editing efficiency of hyPE2, the three-dimensional structures of PE2 and hyPE2 were predicted (FIGS. 22 and 23). Given that the Rad51 DBD can bind both ssDNA and RNA and thereby promote the formation of DNA/RNA hybrids, Rad51 may promote binding of the pegRNA to the nicked target ssDNA and thus enhance reverse transcription. This potential mechanism is consistent with the above finding that the improvement in prime editing efficiency is especially pronounced when the PBS melting temperature is low, a condition that, with PE2, may be associated with poor binding of the pegRNA PBS domain to the nicked target ssDNA.
The foregoing description of the present invention is provided for illustration, and those of ordinary skill in the art to which the present invention pertains will understand that it can readily be modified into other specific forms without changing the technical spirit or essential features of the present invention. Therefore, the embodiments described above should be understood as illustrative in all respects and not restrictive.

Claims (13)

A gene editing complex comprising:
1) a fusion protein comprising a Cas nickase, a reverse transcriptase (RT), and a single-stranded DNA-binding domain (ssDBD); and
2) a prime editing guide RNA (pegRNA).
The complex of claim 1, wherein the complex is for prime editing.
The complex of claim 1, wherein the single-stranded DNA-binding domain is a Rad51 DBD.
The complex of claim 1, wherein the single-stranded DNA-binding domain is linked to the C-terminus of the Cas nickase and to the N-terminus of the reverse transcriptase.
The complex of claim 1, wherein the Cas nickase comprises the amino acid sequence of SEQ ID NO: 1.
The complex of claim 1, wherein the reverse transcriptase comprises the amino acid sequence of SEQ ID NO: 3.
The complex of claim 1, wherein the single-stranded DNA-binding domain comprises the amino acid sequence of SEQ ID NO: 2.
The complex of claim 1, wherein the single-stranded DNA-binding domain is linked to the C-terminus of the Cas nickase via a first linker comprising the amino acid sequence of SEQ ID NO: 4.
The complex of claim 1, wherein the single-stranded DNA-binding domain is linked to the N-terminus of the reverse transcriptase via a second linker comprising the amino acid sequence of SEQ ID NO: 5.
The complex of claim 1, wherein the fusion protein further comprises a nuclear localization sequence or signal (NLS).
A composition for gene editing, comprising:
1) a fusion protein comprising a Cas nickase, a reverse transcriptase (RT), and a single-stranded DNA-binding domain (ssDBD), a polynucleotide encoding the fusion protein, or a recombinant vector comprising the polynucleotide; and
2) a prime editing guide RNA (pegRNA) or a recombinant vector comprising a polynucleotide encoding the pegRNA.
The composition of claim 11, wherein the composition is for editing a target DNA or gene ex vivo or in vivo.
A method of editing a gene of a subject, comprising introducing the composition for gene editing of claim 11 into an isolated eukaryotic cell or a non-human eukaryotic organism.
PCT/KR2022/001611 2021-02-04 2022-01-28 Prime editing composition with improved editing efficiency WO2022169235A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
KR10-2021-0016283 2021-02-04
KR20210016283 2021-02-04

Publications (1)

Publication Number Publication Date
WO2022169235A1 (en) 2022-08-11

Family

ID=82742329

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/KR2022/001611 WO2022169235A1 (en) 2021-02-04 2022-01-28 Prime editing composition with improved editing efficiency

Country Status (2)

Country Link
KR (1) KR20220112698A (en)
WO (1) WO2022169235A1 (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20160034901A (en) * 2013-06-17 2016-03-30 더 브로드 인스티튜트, 인코퍼레이티드 Optimized crispr-cas double nickase systems, methods and compositions for sequence manipulation
KR20180012834A (en) * 2014-11-19 2018-02-06 기초과학연구원 A method for regulation of gene expression by expressing Cas9 protein from the two independent vector

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
ANZALONE ANDREW V. ET AL: "Search-and-replace genome editing without double-strand breaks or donor DNA", NATURE, NATURE PUBLISHING GROUP UK, LONDON, vol. 576, no. 7785, 21 October 2019 (2019-10-21), London, pages 149 - 157, XP036953141, ISSN: 0028-0836, DOI: 10.1038/s41586-019-1711-4 *
MATSOUKAS IANIS G.: "Prime Editing: Genome Editing for Rare Genetic Diseases Without Double-Strand Breaks or Donor DNA", FRONTIERS IN GENETICS, vol. 11, XP055829020, DOI: 10.3389/fgene.2020.00528 *
ZHANG XIAOHUI ET AL: "Increasing the efficiency and targeting range of cytidine base editors through fusion of a single-stranded DNA-binding protein domain", NATURE CELL BIOLOGY, NATURE PUBLISHING GROUP UK, LONDON, vol. 22, no. 6, 11 May 2020 (2020-05-11), London, pages 740 - 750, XP037159237, ISSN: 1465-7392, DOI: 10.1038/s41556-020-0518-8 *

Also Published As

Publication number Publication date
KR20220112698A (en) 2022-08-11

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22749984

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 22749984

Country of ref document: EP

Kind code of ref document: A1