WO2023075154A1

WO2023075154A1 - Method for inactivating gene of porcine endogenous retrovirus and composition thereof

Info

Publication number: WO2023075154A1
Application number: PCT/KR2022/014077
Authority: WO
Inventors: 구옥재; 염수영
Original assignee: 주식회사 툴젠
Priority date: 2021-10-25
Filing date: 2022-09-21
Publication date: 2023-05-04
Also published as: KR20230065230A

Abstract

The present application relates to a composition capable of artificially manipulating a PERV genome existing in multiple copies in the genome of a porcine cell, and a method for inactivating PERV using same.

Description

Method for inactivating the gene of porcine endogenous retrovirus and composition thereof

This application relates to the artificial manipulation or modification of target genes present in multiple copies. More specifically, it relates to a composition capable of artificially manipulating the PERV genome existing in multiple copies in the genome of a pig cell and a method of inactivating PERV using the same.

The CRISPR-Cas system consists of a guide RNA (gRNA) having a sequence complementary to a target gene or nucleic acid and a CRISPR enzyme, a nuclease capable of cleaving the target gene or nucleic acid. The gRNA and the CRISPR enzyme form a CRISPR complex, and the target gene or nucleic acid is cut or modified by the formed CRISPR complex.

In general, a gene edited by the CRISPR-Cas system is present in 1 or 2 copies, and no major problem arises even if all copies are edited at the same time. However, when the target gene is present in multiple copies, gene editing using the CRISPR-Cas system involves simultaneous nucleic acid cleavage and repair at many sites. This can cause serious problems such as rearrangement of the genome and loss of some nucleic acid fragments, which may result in cytotoxicity and cell death.

This problem is also recognized in the field of organs for transplantation. The shortage of organs for transplantation is a major barrier to the treatment of organ failure. Although porcine organs are considered a promising solution to this shortage, the risk of transmission of porcine endogenous retrovirus (PERV) to humans remains unresolved. PERV is harmless to pigs, but when transmitted to humans it has the potential to cause various retroviral diseases. In order to solve these problems, it is necessary to produce pigs in which PERV is inactivated.

According to a recent study, PERV was confirmed to exist in 62 copies in a porcine kidney epithelial cell line (PK15), and multiplex editing was attempted with the CRISPR/Cas system. However, most of the cells died, and PERV inactivation was confirmed in only some of the cells. This was a limited success, and problems such as rearrangement of the genome due to concurrent genome editing and low efficiency still remain.

Therefore, there is a need for an efficient technique for simultaneously editing multiple copies of genes to secure potential sources of organs for transplantation.

One object of the present application is to provide a composition for artificially manipulating the gene of porcine endogenous retrovirus (PERV).

Another object of the present application is to provide a method for artificially inactivating a porcine endogenous retrovirus gene.

Another object of the present application is to provide a pig cell, a pig, and/or a pig organ produced using the above method.

In order to solve the above problems, the present application provides a method for inactivating porcine endogenous retrovirus (PERV) present in multiple copies in the pig genome, the method comprising:

introducing a composition for genetic manipulation into pig cells,

wherein the composition for genetic manipulation includes:

(a) a guide RNA capable of targeting a target sequence of a target gene of PERV or a nucleic acid sequence encoding the same;

(b) a Cas protein or a nucleic acid sequence encoding the same; and

(c) cytidine deaminase or a nucleic acid sequence encoding the same,

wherein the target gene of the PERV is at least one selected from the gag gene, the pol gene, and the env gene,

the target sequence includes at least one nucleotide sequence selected from 5'-CAG-3', 5'-CAA-3', 5'-CGA-3', 5'-CTA-3', 5'-CCA-3' and 5'-TCA-3', and is located in a continuous 10 bp to 30 bp sequence region of the PERV gene,

the Cas protein is selected from the group consisting of a Cas9 protein derived from Streptococcus pyogenes, a Cas9 protein derived from Campylobacter jejuni, a Cas9 protein derived from Streptococcus thermophilus, a Cas9 protein derived from Staphylococcus aureus, a Cas9 protein derived from Staphylococcus auricularis, and a Cpf1 protein, and

the inactivation reduces or suppresses the expression of the target gene of PERV by generating a stop codon in the target sequence,

whereby the PERV present in multiple copies is inactivated.

In this method, the stop codon may be any one or more nucleotide sequences selected from 5'-TAG-3', 5'-TAA-3' and 5'-TGA-3'.

In order to achieve the other object of the present application described above, a composition for genetic manipulation for inactivating porcine endogenous retrovirus (PERV) is provided.

The composition includes:

(a) a guide RNA comprising one or more guide domain sequences selected from SEQ ID NOs: 66 to 130, or a nucleic acid encoding the same;

(b) a Cas protein or a nucleic acid sequence encoding the same; and

(c) cytidine deaminase or a nucleic acid sequence encoding the same,

wherein the guide domain may form a complementary bond with a guide nucleic acid binding target sequence among target sequences of the PERV gene,

the guide RNA contains, in the guide domain sequence, one or more of the nucleotide sequences 5'-CAG-3', 5'-CAA-3', 5'-CGA-3', 5'-CUA-3', 5'-CCA-3' and 5'-UCA-3', and

the Cas protein is selected from the group consisting of a Cas9 protein derived from Streptococcus pyogenes, a Cas9 protein derived from Campylobacter jejuni, a Cas9 protein derived from Streptococcus thermophilus, a Cas9 protein derived from Staphylococcus aureus, a Cas9 protein derived from Staphylococcus auricularis, and a Cpf1 protein.

In addition, in order to achieve another object of the present application described above, pig cells are provided.

The pig cells include, in the genome, a porcine endogenous retrovirus (PERV) in which at least one gene selected from the gag gene, the pol gene, and the env gene is inactivated,

wherein each of the inactivated genes contains two or more stop codons,

the stop codon is any one or more nucleotide sequences selected from 5'-TAG-3', 5'-TAA-3' and 5'-TGA-3',

the inactivated PERV exists in multiple copies in the genome, and

the expression level of PERV is reduced by 50% or more compared to before inactivation. That is, a pig cell characterized in that the inactivated PERV gene is present at 50% or more compared to before inactivation can be provided.

According to this application, the following effects occur.

First, according to the present application, it is possible to provide a composition for artificially manipulating the gene of a porcine endogenous retrovirus.

Second, by using the composition provided by the present application, it is possible to provide a method for artificially inactivating a gene of a pig endogenous retrovirus.

Third, by using the method provided by the present application, it is possible to provide pig cells, pigs, and/or pig organs in which porcine endogenous retrovirus genes are artificially engineered.

FIG. 1 is a schematic diagram of the technology of the present application. PAM refers to a protospacer adjacent motif, and UGI refers to a uracil DNA glycosylase inhibitor.

FIG. 2 is a diagram showing vectors used in this application. Arrows indicate the positions of primers used to check whether the transgene used in this application is integrated.

FIG. 3 is a view confirming whether the transgene of the present application is integrated into a cell. Colonies #1 and #3 each refer to a colony derived from a single cell obtained after neomycin selection.

FIG. 4 is a diagram showing the results of confirming that a stop codon was generated in the gag gene of PERV of the present application.

FIG. 5 is a diagram showing the results of confirming that a stop codon was generated in the pol gene of PERV of the present application.

The content of the application will be described in more detail with reference to the accompanying drawings and the following description. The accompanying drawings include embodiments of the application, but do not include all embodiments. In addition, the subject matter of the present application may be embodied in various forms and is not limited to the specific implementations described below. That is, the present application can be changed in various ways and can have various embodiments. Although the configuration and characteristics of the present application are described based on the embodiments according to the present application, the present application is not limited thereto. It will be apparent to those skilled in the art to which this application belongs that various changes or modifications can be made within the spirit and scope of the present application, and such changes or modifications are therefore intended to fall within the scope of the appended claims.

Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. Methods and materials similar or equivalent to those described herein may be used in the practice or testing of the present invention. All publications, patent applications, patents and other references mentioned herein are incorporated by reference in their entirety. Additionally, the materials, methods, and examples are illustrative only and not intended to be limiting.

The term "target sequence" refers to a nucleotide sequence present in a target gene or nucleic acid, specifically, a partial nucleotide sequence of a target region in a target gene or nucleic acid, wherein the "target region" is a site in the target gene or nucleic acid that can be modified by a guide nucleic acid-editor protein. In this case, the target sequence may be a guide nucleic acid binding sequence or a guide nucleic acid non-binding sequence. In addition, the target region is a region including nucleotides in which base substitution is induced by a base modifying enzyme. The term target region may be used interchangeably with target site.

The "guide nucleic acid-binding sequence" is a nucleotide sequence having partial or complete complementarity with the guide sequence included in the guide domain of the guide nucleic acid, and can form a complementary bond with the guide sequence included in the guide domain of the guide nucleic acid. The term guide nucleic acid binding sequence may be used interchangeably with guide nucleic acid binding target sequence.

The "guide nucleic acid-non-binding sequence" is a nucleotide sequence having partial or complete homology with the guide sequence included in the guide domain of the guide nucleic acid, and cannot form a complementary bond with the guide sequence included in the guide domain of the guide nucleic acid. The term guide nucleic acid non-binding sequence may be used interchangeably with guide nucleic acid non-binding target sequence.

Hereinafter, the target sequence may be used as a term meaning both types of nucleotide sequence information. For example, in the case of a target gene, the target sequence may mean sequence information of a transcribed strand of the target gene DNA, or nucleotide sequence information of a non-transcribed strand.

The term “targeting” means complementary binding with a guide nucleic acid binding sequence among target sequences present in a target gene or nucleic acid. At this time, the complementary bond may be 100% complete complementary bond, or 70% or more and less than 100% incomplete complementary bond. Accordingly, "targeting gRNA" refers to a gRNA that complementarily binds to a guide nucleic acid binding sequence among target sequences present in a target gene or nucleic acid.
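The 70% to 100% complementarity range in this definition can be illustrated with a short calculation. The following is only a minimal sketch under stated assumptions: the function names, the position-by-position scoring, and the requirement that both sequences have equal length are illustrative choices and are not taken from the present application.

```python
# Watson-Crick partner of each RNA guide base on the DNA target strand.
RNA_TO_DNA_PARTNER = {"A": "T", "U": "A", "G": "C", "C": "G"}


def percent_complementarity(guide_rna: str, binding_seq_dna: str) -> float:
    """Percentage of guide positions that pair with the DNA binding sequence.

    Both sequences are written 5'->3'. Paired strands are antiparallel,
    so the DNA sequence is reversed before the position-wise comparison.
    """
    guide = guide_rna.upper()
    target = binding_seq_dna.upper()[::-1]
    if not guide or len(guide) != len(target):
        # Assumption of this sketch: non-empty sequences of equal length.
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(RNA_TO_DNA_PARTNER.get(g) == t for g, t in zip(guide, target))
    return 100.0 * matches / len(guide)


def is_targeting(guide_rna: str, binding_seq_dna: str, threshold: float = 70.0) -> bool:
    """True when complementarity is at least the threshold (70% in the definition)."""
    return percent_complementarity(guide_rna, binding_seq_dna) >= threshold


if __name__ == "__main__":
    # Hypothetical 10-nt guide and its perfectly complementary DNA sequence.
    print(percent_complementarity("CAGUACGGAU", "ATCCGTACTG"))
    # Same guide with one mismatched position in the DNA sequence.
    print(percent_complementarity("CAGUACGGAU", "GTCCGTACTG"))
```

Real guide selection would also weigh mismatch position and bulges, which this sketch deliberately ignores.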

The term “artificially modified or engineered or artificially engineered” means a state artificially modified, not a state as it exists in nature. Hereinafter, the non-natural, artificially engineered or modified porcine endogenous retrovirus (PERV) gene may be used interchangeably with the term artificial PERV gene. The non-natural, artificially engineered or modified PERV gene may be used in an inclusive sense covering an artificially engineered or modified gag gene, an artificially engineered or modified pol gene, and an artificially engineered or modified env gene.

The term "PERV gene" is meant to include both the RNA genome sequence of PERV and its DNA counterpart sequences, for example, a cDNA sequence reverse-transcribed from the RNA genome sequence of PERV, a double-stranded DNA sequence generated from the cDNA sequence, and a DNA sequence inserted into the genome of a host cell. Also, the PERV gene may be used interchangeably with the genome of PERV.

The term “inactivation or deactivation” refers to a state in which a target nucleic acid or gene is artificially manipulated. In particular, it includes i) a state in which a target site in a target nucleic acid or gene is cleaved and the function is reduced and/or lost; and ii) a state in which transcription and/or translation of the target nucleic acid or gene does not occur, is suppressed, and/or is reduced.

The inactivation may suppress or reduce the occurrence of a disease by inactivating a disease-causing gene or a gene having an abnormal function, thereby regulating the expression of a target gene, nucleic acid, or protein. For example, inactivation of the PERV gene should be interpreted as including all of the following: i) a state in which the target site in the PERV gene is cleaved and the function is reduced or lost; and ii) a state in which the transcription and/or translation of the target gene among the PERV genes is inhibited, so that expression of the gene or protein is reduced and/or suppressed. In particular, in the present application, inactivation of PERV should be interpreted as a state in which expression is reduced and/or suppressed by artificially generating a stop codon to inhibit transcription and/or translation of a PERV target gene.

I. SUMMARY OF TECHNICAL FEATURES OF THIS APPLICATION

The present application relates to a composition and method capable of artificially manipulating a porcine endogenous retrovirus (PERV) present in multiple copies in the pig genome, and in particular, it is characterized by inactivating PERV by forming a stop codon in a target sequence.

Specifically, within the target region of PERV in the porcine genome targeted in this application, a sequence capable of forming a stop codon through a single base substitution is identified in the target sequence, and a base modifying enzyme capable of inducing or generating the single base substitution is used.

In order to convert the sequence in the target region of PERV into a stop codon in this way, the CRISPR/Cas system and a base modifying enzyme that induces or generates a single base substitution, such as deaminase, are used together. According to the method of the present application, the gag gene, pol gene, and env gene of PERV present in the genome of a pig are used as specific target regions, and a stop codon is formed specifically in the target region, whereby PERV is inactivated.

Therefore, this application has the following technical features.

i) By using the CRISPR/Cas system, a specific gene of PERV can be targeted site-specifically.

ii) A stop codon can be artificially generated at a desired position by using a base modifying enzyme that induces or generates a single base substitution while site-specifically targeting a specific gene of PERV with the CRISPR/Cas system. The generated stop codon can act in the process of transcription or translation of a specific gene of PERV, thereby suppressing the expression of a specific gene of PERV.

iii) Targeting a specific gene of PERV provides high efficiency and accuracy specific to the target sequence, and because PERV expression is suppressed by generating a stop codon, cells are not killed even when PERV present in multiple copies is manipulated, so the method is safe.

In particular, the present application is intended to overcome problems of the prior art, such as the low efficiency of artificial manipulation caused by cells readily dying when multiple copies of PERV are simultaneously knocked out for inactivation.

A more detailed description of this is as follows.

II. Technical (operational) principle of the present application

Although organ xenotransplantation is urgently required worldwide, immunological factors, which are a major barrier, are still problematic. These problems were also found in germ-free pigs, which are in the limelight for xenotransplantation. As it has been reported that PERV present in the genome of germ-free pigs has the potential to infect human cells in vitro, many efforts are being made to solve this problem. However, PERV exists in the form of a provirus or viral particle in multiple copies on all types of pig genomes, and is difficult to control because it spreads through the germ line. Therefore, an efficient method is needed to prevent PERV from infecting human cells in vitro.

In order to solve the above problem, the present application uses a CRISPR/Cas system and a base modifying enzyme together to inactivate PERV.

In particular, the present application is characterized by the combined use of a CRISPR/Cas system for cleaving DNA (e.g., gRNA and Cas nickase) and a base modifying enzyme (e.g., deaminase).

For the principle of inactivating PERV of the present application, reference may be made to the schematic diagram of FIG. 1. In FIG. 1, a Cas9 nickase (nCas9) and cytidine deaminase are described as an example. FIG. 1 illustrates the technical principle of the present application by way of example, and the present application is not limited thereto.

Referring to Figure 1, the principle of inactivating PERV present in the pig genome is as follows.

For PERV inactivation, a CRISPR/Cas system containing gRNA and Cas9 nickase is required to target the target gene or nucleic acid of PERV. In addition, cytidine deaminase and a uracil DNA glycosylase inhibitor (UGI) are required to generate a stop codon by replacing cytosine (C) with thymine (T).

In FIG. 1, the gRNA targets a target sequence including the 5'-CAG-3' sequence and the PAM sequence, interacts with Cas9 nickase, and Cas9 nickase cuts a single strand in the target site.

In this application, in order to generate a stop codon by substituting a specific base of a target sequence of interest while maintaining the ability of Cas9 to recognize the target sequence, a nickase Cas9 (nCas9) that cleaves the strand opposite to the strand where the stop codon is generated is used.

Based on this, when cytidine deaminase is used together with Cas9 nickase, cytidine deaminase and UGI act on the strand opposite to the DNA strand cleaved by Cas9 nickase. As a result, the cytosine in the 5'-CAG-3' sequence in the target sequence is substituted with thymine, changing the sequence to 5'-TAG-3'. The target nucleic acid or gene is thus inactivated by having the 5'-TAG-3' sequence, which is a stop codon sequence, in the target sequence.
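The effect of changing 5'-CAG-3' to 5'-TAG-3' on a reading frame can be checked with a few lines of code. This is a minimal sketch, assuming that the edited triplet lies in the reading frame of the gene; the example sequence and all function names are hypothetical and are not taken from the PERV sequences or SEQ ID NOs of the present application.

```python
STOP_CODONS = {"TAG", "TAA", "TGA"}


def c_to_t(seq: str, position: int) -> str:
    """Return seq with the cytosine at the 0-based position replaced by thymine."""
    if seq[position] != "C":
        raise ValueError(f"position {position} holds {seq[position]!r}, not C")
    return seq[:position] + "T" + seq[position + 1:]


def codons(seq: str, frame: int = 0) -> list:
    """Split seq into complete triplets, starting at the given frame offset."""
    return [seq[i:i + 3] for i in range(frame, len(seq) - 2, 3)]


def first_stop_index(seq: str, frame: int = 0):
    """Index of the first in-frame stop codon, or None when there is none."""
    for index, codon in enumerate(codons(seq, frame)):
        if codon in STOP_CODONS:
            return index
    return None


if __name__ == "__main__":
    # Hypothetical coding fragment: ATG GCT CAG GAT CTG
    fragment = "ATGGCTCAGGATCTG"
    edited = c_to_t(fragment, 6)  # the C of CAG becomes T, giving TAG
    print(codons(fragment), first_stop_index(fragment))
    print(codons(edited), first_stop_index(edited))
```

The frame parameter matters in practice: the same substitution made out of frame would change a base without creating a stop codon.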

This is described in detail as follows.

The gRNA of the CRISPR/Cas system binds to a target nucleic acid or gene sequence (target in FIG. 1) and interacts with Cas9 nickase, whereby Cas9 nickase cuts a single strand in the target site (scissors in FIG. 1), and PERV is inactivated. That is, by using the CRISPR/Cas system in the present application, the cleavage is not only specific to the target gene in the porcine genome but also occurs specifically at a particular position within the target gene.

As such, since the gRNA of the CRISPR/Cas system can recognize the target gene, the gag gene, pol gene, and env gene, which play the most important role in replication among the PERV genes present in the pig genome, can be targeted. In addition, Cas9 nickase cuts the target site of the target gene, so that the CRISPR/Cas system for PERV inactivation of the present application can specifically target the desired gene.

At this time, since Cas9 nickase recognizes a PAM (Protospacer adjacent motif) sequence, the target sequence must include the PAM sequence.

A gRNA can be designed in consideration of the PAM sequence. Briefly, a gRNA can be designed by finding a region where a PAM sequence is present in the sequence of a target nucleic acid or gene, and selecting a sequence of about 20 bp adjacent to that PAM sequence.
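The design procedure outlined above (locate a PAM, then take about 20 bp next to it) can be sketched as follows. The sketch assumes an NGG PAM located immediately 3' of the selected sequence, which is the usual convention for Streptococcus pyogenes Cas9, and it scans only the strand it is given. The function name, the default length of 20, and the example sequence are illustrative assumptions.

```python
import re


def find_protospacers(dna: str, spacer_len: int = 20) -> list:
    """Return (start, protospacer, pam) for every NGG PAM on the given strand.

    Only PAMs with a full spacer_len-long sequence on their 5' side are kept.
    """
    dna = dna.upper()
    results = []
    # A lookahead is used so that overlapping PAMs (e.g. in "AGGG") are all found.
    for match in re.finditer(r"(?=([ACGT]GG))", dna):
        pam_start = match.start()
        start = pam_start - spacer_len
        if start < 0:
            continue  # not enough sequence upstream of this PAM
        results.append((start, dna[start:pam_start], match.group(1)))
    return results


if __name__ == "__main__":
    # Hypothetical sequence: 20 nt followed by a TGG PAM and three more bases.
    example = "ACGTACGTACGTACGTACGT" + "TGG" + "AAA"
    for start, protospacer, pam in find_protospacers(example):
        print(start, protospacer, pam)
```

To cover both strands, the same function could be run on the reverse complement of the input; other Cas proteins would need a different PAM pattern and, for some, a PAM on the 5' side.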

Meanwhile, the present application uses cytidine deaminase together with the CRISPR/Cas system for PERV inactivation. In particular, the CRISPR/Cas system is used in combination with base modifying enzymes (e.g., cytidine deaminase and UGI) so that cytosine is replaced by thymine and a specific sequence within the target sequence of the gene or nucleic acid targeted by the CRISPR/Cas system becomes a stop codon.

Cytidine deaminase is an enzyme having an activity of substituting thymine for cytosine, and in one embodiment, cytosine located on a strand where a PAM sequence is present in a target sequence is converted to thymine.

At this time, in order to increase the efficiency of base substitution, UGI, a protein that inhibits uracil DNA glycosylase (UDG), the enzyme that restores uracil (U) to cytosine, is additionally used along with cytidine deaminase. In this case, UGI inhibits UDG from restoring uracil, the intermediate product of deamination by cytidine deaminase, to cytosine. Then, in the process of DNA replication or mismatch repair using the strand containing uracil as a template, guanine (G) on the opposite strand is substituted with adenine (A), resulting in a single base pair substitution from a C-G pair to a T-A pair. In one embodiment, by the action of the CRISPR/Cas system, cytidine deaminase and UGI, cytosine located on the strand where the PAM sequence is present in the target sequence is substituted with thymine.
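The conversion of a C-G pair into a T-A pair through a uracil intermediate can be traced step by step. The following is a simplified sketch and not a model of the actual enzymology: it assumes that UDG is fully inhibited, that the partner strand is resynthesized once from the uracil-containing template, and that one further round of replication follows. Both strands are stored in the same index order (the partner strand is not reversed), and all names are hypothetical.

```python
# Base-pairing rule used for strand synthesis; uracil templates adenine.
PAIR = {"A": "T", "T": "A", "G": "C", "C": "G", "U": "A"}


def complement_strand(template: str) -> str:
    """Synthesize the partner strand position by position (same index order)."""
    return "".join(PAIR[base] for base in template)


def deaminate(strand: str, position: int) -> str:
    """Deaminate the cytosine at the 0-based position, leaving uracil."""
    if strand[position] != "C":
        raise ValueError("only cytosine can be deaminated in this sketch")
    return strand[:position] + "U" + strand[position + 1:]


def base_edit(edited_strand: str, position: int) -> tuple:
    """Return (edited strand, partner strand) after the C-G to T-A conversion."""
    with_uracil = deaminate(edited_strand, position)   # C -> U
    new_partner = complement_strand(with_uracil)       # A is placed opposite U (was G)
    final_strand = complement_strand(new_partner)      # A templates T on replication
    return final_strand, new_partner


if __name__ == "__main__":
    before = "CAG"
    print(before, complement_strand(before))
    after, partner = base_edit(before, 0)
    print(after, partner)
```

Tracing the triplet CAG through these steps shows the original G on the partner strand being replaced by A, which is the G-to-A substitution on the opposite strand that the paragraph above describes.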

Since cytidine deaminase is an enzyme involved in changing cytosine to thymine, a target sequence comprising a sequence that can be changed to a stop codon sequence, for example, any one or more nucleotide sequences selected from 5'-CAG-3', 5'-CAA-3', 5'-CGA-3', 5'-CTA-3', 5'-CCA-3' and 5'-TCA-3', can be used in the present application.
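Locating the listed triplets in a candidate target sequence is a simple scan. The sketch below reports every position at which one of the six triplets occurs; it does not check the reading frame, the strand, or the editing window of the deaminase, all of which would matter in practice. The function name and the example sequence are hypothetical.

```python
# Triplets listed above as sequences that can be changed to a stop codon.
CANDIDATE_TRIPLETS = ("CAG", "CAA", "CGA", "CTA", "CCA", "TCA")


def find_candidate_triplets(target_seq: str) -> list:
    """Return (position, triplet) for every listed triplet, in order of position.

    Positions are 0-based. RNA input is accepted by reading U as T.
    """
    seq = target_seq.upper().replace("U", "T")
    hits = []
    for i in range(len(seq) - 2):
        triplet = seq[i:i + 3]
        if triplet in CANDIDATE_TRIPLETS:
            hits.append((i, triplet))
    return hits


if __name__ == "__main__":
    # Hypothetical 10-nt sequence containing three of the listed triplets.
    for position, triplet in find_candidate_triplets("GGCAGTTCAA"):
        print(position, triplet)
```

Overlapping hits are reported separately, so a caller that needs in-frame candidates only could keep the hits whose position matches the frame of the gene.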

As such, when the CRISPR/Cas system and deaminase work together, the CRISPR/Cas system targets any one or more of the gag gene, pol gene, and env gene of PERV and cleaves the target site, and a stop codon is formed within the PERV gene. This inhibits transcription and/or translation of one or more of the PERV genes and reduces and/or suppresses expression of the PERV gene, thereby further enhancing the PERV inactivation effect.

Hereinafter, a composition for manipulation for artificially inactivating PERV having the above technical features of the present application and a method for inactivating PERV using the same will be described in more detail.

III. Composition for genetic engineering for PERV inactivation

One aspect disclosed by the present application relates to a composition for genetic engineering for artificially inactivating PERV described above. The composition for genetic engineering may be used for the production of artificially engineered PERV genes. In addition, the PERV gene artificially manipulated by the composition for genetic engineering may control the transfer of the PERV gene and the expression of the protein encoded by the PERV gene.

1) Target gene

The target gene targeted by the composition for genetic manipulation of the present application is PERV.

PERV exists in the form of a provirus or viral particle in multiple copies on all types of pig genomes, and is not easy to control because it spreads through the germ line. PERV refers to endogenous retroviruses that are inserted into the pig's genome and are passed on to the next generation according to Mendel's genetic laws, and are also transmitted by infection.

PERV is classified in the family Retroviridae and the genus Gammaretrovirus, and in terms of nucleotide sequence, amino acid sequence, and morphology it belongs to the gamma-retroviruses, such as murine leukemia virus and feline leukemia virus, which cause leukemia or immunodeficiency. The virus undergoes germline transmission, does not cause any symptoms in pigs, and mostly exists in a form that cannot produce actual virus particles due to gene deletion or mutation.

Replicable PERV subtypes include PERV-A, PERV-B and PERV-C.

PERV-A and PERV-B can infect human cells, pig cells, and cells of other species, while PERV-C has been reported to infect only pig cells and, partially, the human cell line HT1080 (Takeuchi et al., 1998). Some studies have confirmed that a recombinant virus based on PERV-C in which only the receptor binding site is derived from PERV-A exhibits 500-fold infectivity against human cells (Birke et al., 2004; Harrison et al., 2004; Scobie et al., 2004). In addition, it has been reported that humanized PERV, that is, PERV that has once infected human cells, has increased infectivity to human cells compared to PERV produced in pig cells.

For example, the composition for genetic manipulation of the present application may simultaneously target any two or more subtypes selected from among PERV-A, PERV-B and PERV-C.

Target genes disclosed by this application may be some genes included in the PERV gene.

The target gene disclosed by this application may be a gag gene among PERV genes.

The target gene disclosed by this application may be a pol gene among PERV genes.

The target gene disclosed by the present application may be an env gene among PERV genes.

PERV genes include the gag gene, pol gene, and env gene. These are the genes that play the most important role in PERV replication. Specifically, the gag gene encodes the core protein, the pol gene encodes the reverse transcriptase, and the env gene encodes the envelope protein.

Among these, the gag gene and pol gene, which form viral enzymes and structural proteins, show a similarity of more than 90% among PERV-A, PERV-B, and PERV-C, whereas the similarity of the env gene, which is related to host cell infection, differs greatly among the subtypes.

The target gene disclosed by the present application may be two or more genes selected from among the gag gene, the pol gene, and the env gene among the PERV genes.

For example, when the pol gene and the gag gene of PERV are targeted and inactivated, i) inactivating the pol gene region involved in reverse transcription and replication inhibits reverse transcription and replication, and ii) at the same time, inactivating the gag gene, which encodes the core protein essential for growth, can block the production of viral particles.

The target gene disclosed by this application may be a gag gene, a pol gene, and an env gene among PERV genes.

The target gene disclosed by the present application may be two or more genes selected from the gag gene, the pol gene, and the env gene among the genes of the PERV subtype.

The PERV gene may exist as a whole genome sequence or a partial nucleic acid sequence of PERV in the genome of a pig cell.

In this case, the partial nucleic acid sequence may be the entire nucleic acid sequence of one or more of the gag gene, pol gene, and env gene.

In this case, the partial nucleic acid sequence may be a partial nucleic acid sequence of one or more of the gag gene, pol gene, and env gene.

The PERV gene can be present in 1 to 100 copies within the genome of a pig cell.

In this case, the PERV gene may be a full-length genome sequence or a partial nucleic acid sequence of PERV.

At this time, 1 to 100 copies of the PERV gene in the genome of the pig cell may be contiguous and/or non-contiguous.

For example, if 60 copies of the PERV gene are present in the genome of a pig cell, 10 copies may exist contiguously and the remaining 50 copies may each exist non-contiguously. Alternatively, if 20 copies of the PERV gene exist in the genome of a pig cell, 10 copies may exist contiguously and the remaining 10 copies may also exist contiguously, while the two groups of 10 contiguous copies exist non-contiguously with respect to each other.

At this time, 1 to 100 copies in the genome of the pig cell may exist in a mixture of the full-length genome sequence and some nucleic acid sequences of PERV.

For example, if 5 copies of the PERV gene exist in the genome of a pig cell, 2 copies may exist as the full-length genome sequence of PERV, and the remaining 3 copies may exist as partial nucleic acid sequences. Alternatively, when 13 copies of the PERV gene exist in the genome of a pig cell, 5 copies may exist as the full-length genome sequence of PERV, 4 copies may exist as the entire nucleic acid sequence of the gag gene among the PERV genes, and the remaining 4 copies may exist as partial nucleic acid sequences of the pol gene among the PERV genes.

As described above, in order for the CRISPR/Cas system included in the genetic manipulation composition of the present application to operate, the target gene must contain a PAM (Protospacer adjacent motif) sequence in the target sequence.

Since the Cas protein recognizes the PAM sequence in the CRISPR/Cas system, the PAM sequence must be included in the target sequence for its operation. In particular, there is a guide nucleic acid non-binding sequence in the region adjacent to the PAM sequence, and a guide nucleic acid binding sequence is present on the opposite strand.

Thus, for the CRISPR/Cas system to work, the target sequence in the target gene can be determined based on the PAM. Because the target gene of the present application is selected so that the CRISPR/Cas system can operate, the location targeted within the target gene is PAM-specific.

In an embodiment,

The target sequence disclosed by the present application, that is, the guide nucleic acid non-binding sequence may be a sequence of 10 to 35 consecutive nucleotides adjacent to the 5' end and/or 3' end of the PAM sequence in the nucleic acid sequence of the target gene.

In this case, the PAM sequence may be, for example, one or more of the following sequences (written in the 5' to 3' direction).

NGG (N is A, T, C or G);

NNNNRYAC (N is each independently A, T, C or G, R is A or G, and Y is C or T);

NNGG (N is each independently A, T, C or G);

NNAGAAW (N is each independently A, T, C or G, and W is A or T);

NNNNGATT (N is each independently A, T, C or G);

NNGRR(T) (N is each independently A, T, C or G, and R is A or G); and

TTN (N is A, T, C or G).
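The PAM motifs listed above use degenerate base codes (N, R, Y, W), which map directly onto regular-expression character classes. The sketch below converts each motif into a pattern and reports which motifs a candidate sequence matches in full. Treating the parenthesized T in NNGRR(T) as optional is an interpretation made for this sketch, and the function names are hypothetical.

```python
import re

# Degenerate base codes used in the PAM list, as regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "N": "[ACGT]", "R": "[AG]", "Y": "[CT]", "W": "[AT]",
}

PAM_MOTIFS = ["NGG", "NNNNRYAC", "NNGG", "NNAGAAW", "NNNNGATT", "NNGRR(T)", "TTN"]


def pam_to_regex(motif: str):
    """Compile a PAM motif into a regex; parenthesized bases are optional."""
    parts = []
    for ch in motif:
        if ch == "(":
            parts.append("(?:")
        elif ch == ")":
            parts.append(")?")
        else:
            parts.append(IUPAC[ch])
    return re.compile("".join(parts))


def matching_pams(candidate: str) -> list:
    """Return the listed motifs that the whole candidate sequence satisfies."""
    candidate = candidate.upper()
    return [m for m in PAM_MOTIFS if pam_to_regex(m).fullmatch(candidate)]


if __name__ == "__main__":
    for seq in ("TGG", "TTA", "ACGAAT", "ACGAA"):
        print(seq, matching_pams(seq))
```

The same patterns could be combined with a scan over the gag, pol, or env sequence to list the positions at which each PAM occurs.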

For example, the target sequence (guide nucleic acid non-binding sequence) may be a sequence of 10 to 35 consecutive nucleotides adjacent to the 5' end and/or the 3' end of the PAM sequence in the nucleic acid sequence of the PERV gene.

As another example, the target sequence may be a sequence of 10 to 35 consecutive nucleotides adjacent to the 5' end and/or 3' end of the PAM sequence in the nucleic acid sequence of the gag gene.

As another example, the target sequence may be a sequence of 10 to 35 consecutive nucleotides adjacent to the 5' end and/or 3' end of the PAM sequence in the nucleic acid sequence of the pol gene.

As another example, the target sequence may be a sequence of 10 to 35 consecutive nucleotides adjacent to the 5' end and/or the 3' end of the PAM sequence in the nucleic acid sequence of the env gene.
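As a non-limiting illustration of locating such target sequences, the Python sketch below scans both strands of a sequence for 20-nucleotide windows whose 3' end is adjacent to an NGG PAM. It is an example only: the function names are hypothetical, the window length of 20 and the NGG PAM are assumptions chosen from the ranges and motifs given above, and the toy input is a listed target sequence followed by an invented "tgg" flank and filler bases, not an actual PERV sequence.

```python
def reverse_complement(seq: str) -> str:
    """Return the opposite DNA strand, read 5' to 3'."""
    complement = {"a": "t", "t": "a", "c": "g", "g": "c"}
    return "".join(complement[base] for base in reversed(seq.lower()))


def find_targets(gene: str, length: int = 20):
    """Yield (strand, start, target, pam) for each window followed by NGG.

    `start` is 0-based and, for the "-" strand, is counted on the
    reverse complement of `gene`, not on `gene` itself.
    """
    gene = gene.lower()
    for strand, seq in (("+", gene), ("-", reverse_complement(gene))):
        # The last usable window must leave room for a 3-nt PAM.
        for start in range(len(seq) - length - 2):
            pam = seq[start + length : start + length + 3]
            if pam[1:] == "gg":
                yield strand, start, seq[start : start + length], pam


if __name__ == "__main__":
    # Hypothetical input: target + invented PAM + filler.
    toy = "ttcaggttaagaagggacct" + "tgg" + "aaaa"
    for hit in find_targets(toy):
        print(hit)
```

For a PAM located at the 5' end of the target, as with TTN, the slicing would have to be mirrored; that case is not covered by this sketch.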

In addition, in the present application, the target sequence of the target gene must include a sequence capable of base substitution with a stop codon in the sequence. The base substitution may be effected by a base modifying enzyme that induces or generates a single base substitution.

In this case, the base modifying enzyme may be deaminase.

For example, the deaminase may include cytidine deaminase, which changes (substitutes) cytosine to thymine, adenosine deaminase, which substitutes adenine with guanine, and the like.

For convenience, cytidine deaminase, which changes cytosine to thymine, will be mainly described below.

For example, when the cytidine deaminase is operated on a target sequence, cytosine in a sequence capable of base substitution with a stop codon in the target sequence is substituted with thymine, thereby generating a stop codon.

In one embodiment, the PERV sequence targeted by the present application includes, as a stop codon target sequence, any one or more nucleotide sequences selected from 5'-CAG-3', 5'-CAA-3', 5'-CGA-3', 5'-CTA-3', 5'-CCA-3' and 5'-TCA-3'. These are described in [Table 1] and [Table 2] below.

The stop codon target sequence may be included in the guide nucleic acid non-binding sequence of the PERV target gene, in the 5' to 3' direction (Table 1), or it may be included in the guide nucleic acid binding sequence on the opposite strand. For explanation, the strand containing the guide nucleic acid non-binding sequence is referred to as the first strand, and the opposite strand (containing the guide nucleic acid binding sequence) is referred to as the second strand.

i) A stop codon target sequence included in the target sequence of the first strand, for generating a stop codon on the first strand:

The case can be considered in which the 5' to 3' strand of the PERV targeted in this application (i.e., the guide nucleic acid non-binding sequence) includes any one or more nucleotide sequences selected from 5'-CAG-3', 5'-CAA-3' and 5'-CGA-3'. In this case, when cytidine deaminase, which substitutes cytosine with thymine, acts on these sequences, they are changed to the stop codons 5'-TAG-3', 5'-TAA-3' and 5'-TGA-3', respectively (Table 1).

Stop codon target sequence included in the target sequence of the first strand		Altered sequence after cytidine deaminase treatment
5'-CAG-3'	→	5'-TAG-3'
5'-CAA-3'	→	5'-TAA-3'
5'-CGA-3'	→	5'-TGA-3'
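As a non-limiting illustration of the first-strand conversions in [Table 1], the Python sketch below models the action of cytidine deaminase as a C to T substitution at the 5' position of each triplet and checks whether the result is a stop codon. The names `STOP_CODONS` and `deaminate_first_c` are hypothetical, and treating the enzyme as a simple string substitution is a deliberate simplification.

```python
STOP_CODONS = {"TAG", "TAA", "TGA"}


def deaminate_first_c(triplet: str) -> str:
    """Model cytidine deaminase acting on the 5' cytosine of a triplet (C -> T)."""
    triplet = triplet.upper()
    if len(triplet) != 3 or triplet[0] != "C":
        raise ValueError("expected a 3-nt sequence starting with C")
    return "T" + triplet[1:]


if __name__ == "__main__":
    for target in ("CAG", "CAA", "CGA"):
        edited = deaminate_first_c(target)
        print(f"5'-{target}-3' -> 5'-{edited}-3'", edited in STOP_CODONS)
```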

ii) A stop codon target sequence included in the target sequence of the first strand, for generating a stop codon on the second strand:

From the same point of view, the case of generating a stop codon on the strand opposite the first strand of [Table 1], that is, the second strand, can be considered. In other words, the first strand may contain, as a stop codon target sequence, any one or more nucleotide sequences selected from 5'-CTA-3', 5'-CCA-3' and 5'-TCA-3'. When cytidine deaminase acts on these sequences, cytosine (C) is substituted with thymine (T), and each sequence is changed to 5'-TTA-3'. When DNA replication or repair then occurs using the first strand containing the altered sequence (5'-TTA-3') as a template, the complementary sequence on the second strand (3'-GAT-5', 3'-GGT-5' or 3'-AGT-5') is changed to 3'-AAT-5', which is complementary to the altered sequence. That is, the stop codon 5'-TAA-3' is generated on the second strand (Table 2).

Stop codon target sequence included in the target sequence of the first strand		Altered sequence in the first strand after cytidine deaminase treatment	Stop codon sequence in the second strand, complementary to the altered sequence
5'-CTA-3'	→	5'-TTA-3'	3'-AAT-5'
5'-CCA-3'	→	5'-TTA-3'	3'-AAT-5'
5'-TCA-3'	→	5'-TTA-3'	3'-AAT-5'
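As a non-limiting illustration of [Table 2], the Python sketch below substitutes every C in a first-strand triplet with T and then derives the second strand as the reverse complement, so that the result is read 5' to 3'. The function names are hypothetical, and copying of the edit onto the second strand by replication or repair is modeled simply as taking the complement.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def reverse_complement(seq: str) -> str:
    """Return the opposite strand, read 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def second_strand_after_editing(first_strand_triplet: str) -> str:
    """Apply C -> T on the first strand, then read the second strand 5' to 3'."""
    edited = first_strand_triplet.upper().replace("C", "T")
    return reverse_complement(edited)


if __name__ == "__main__":
    for target in ("CTA", "CCA", "TCA"):
        print(f"5'-{target}-3' -> second strand 5'-{second_strand_after_editing(target)}-3'")
```

The sequence 3'-AAT-5' in the table and the value 5'-TAA-3' returned here are the same second-strand sequence written in opposite directions.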

The target sequence of the present application is selected by considering both a PAM sequence, which is required for the CRISPR/Cas system to operate, and a sequence in which a stop codon can be generated by the action of a base modifying enzyme (hereinafter, a stop codon target sequence).

Therefore, as shown in [Table 1] and [Table 2], the target sequence of the present application includes one or more of 5'-CAG-3', 5'-CAA-3', 5'-CGA-3', 5'-CTA-3', 5'-CCA-3' and 5'-TCA-3'.

In one embodiment,

The target sequence disclosed by this application may be a sequence of 10 to 35 consecutive nucleotides including 5'-CAG-3' in the nucleic acid sequence of the target gene.

The target sequence disclosed by this application may be a sequence of 10 to 35 consecutive nucleotides including 5'-CAA-3' in the nucleic acid sequence of the target gene.

The target sequence disclosed by this application may be a sequence of 10 to 35 consecutive nucleotides including 5'-CGA-3' in the nucleic acid sequence of the target gene.

The target sequence disclosed by this application may be a sequence of 10 to 35 consecutive nucleotides including 5'-CTA-3' in the nucleic acid sequence of the target gene.

The target sequence disclosed by this application may be a sequence of 10 to 35 consecutive nucleotides including 5'-CCA-3' in the nucleic acid sequence of the target gene.

The target sequence disclosed by this application may be a sequence of 10 to 35 consecutive nucleotides including 5'-TCA-3' in the nucleic acid sequence of the target gene.
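As a non-limiting illustration, a candidate target sequence can be screened for the six stop codon target sequences named above. The Python sketch below is an example only: `stop_codon_target_sites` is a hypothetical helper, positions are 0-based, and the scan ignores the reading frame of the gene, which would have to be checked separately before a generated triplet could be regarded as an in-frame stop codon. The input is the 20-nucleotide sequence listed as SEQ ID NO: 1.

```python
STOP_CODON_TARGETS = ("CAG", "CAA", "CGA", "CTA", "CCA", "TCA")


def stop_codon_target_sites(target: str):
    """Return (position, triplet) for every stop codon target triplet found.

    Positions are 0-based offsets from the 5' end; all three frames are scanned.
    """
    target = target.upper()
    return [
        (i, target[i : i + 3])
        for i in range(len(target) - 2)
        if target[i : i + 3] in STOP_CODON_TARGETS
    ]


if __name__ == "__main__":
    print(stop_codon_target_sites("ttcaggttaagaagggacct"))
    # -> [(1, 'TCA'), (2, 'CAG')]
```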

Examples of target sequences that can be used in one embodiment of the present invention are summarized in [Table 3]. The target sequences listed in [Table 3] are guide nucleic acid non-binding sequences; their complementary sequences, that is, the guide nucleic acid binding sequences, can be predicted from the listed sequences.

Name	Target gene	Direction	Target (5' to 3', w/o PAM)	SEQ ID NO
PERV-gag-1	gag	+	ttcaggttaagaagggacct	SEQ ID NO: 1
PERV-gag-2	gag	+	gcagacactcttcacagccg	SEQ ID NO: 2
PERV-gag-3	gag	+	ccagaaagcctcagtggccc	SEQ ID NO: 3
PERV-gag-4	gag	+	tcagagactggaagggttac	SEQ ID NO: 4
PERV-gag-6	gag	-	agccatggtttaacccatgg	SEQ ID NO: 5
PERV-gag-7	gag	-	tcccaaccgaggcgagtcaa	SEQ ID NO: 6
PERV-gag-8	gag	-	gtcccaaccgaggcgagtca	SEQ ID NO: 7
PERV-gag-9	gag	+	tccccgaatcctggctcttg	SEQ ID NO: 8
PERV-gag-10	gag	+	gcgagagagaattctgttag	SEQ ID NO: 9
PERV-gag-11	gag	+	tccccaacgcctcacggggt	SEQ ID NO: 10
PERV-gag-12	gag	+	ccaacgcctcacggggttgg	SEQ ID NO: 11
PERV-gag-13	gag	+	gttgcaaaatgagattgaca	SEQ ID NO: 12
PERV-gag-14	gag	+	ttgcaaaatgagattgacat	SEQ ID NO: 13
PERV-pol-1	pol	+	ggagcagtttccccaagcct	SEQ ID NO: 14
PERV-pol-2	pol	+	cccacaggttattcaactga	SEQ ID NO: 15
PERV-pol-3	pol	+	acagtaccccttgagtagag	SEQ ID NO: 16
PERV-pol-4	pol	+	ggtgcaggacatacacccaa	SEQ ID NO: 17
PERV-pol-5	pol	+	ggcccagatttgcaggagag	SEQ ID NO: 18
PERV-pol-6	pol	+	cgggcagcgatggctgacgg	SEQ ID NO: 19
PERV-pol-7	pol	+	gacagtacaccctagaagac	SEQ ID NO: 20
PERV-pol-8	pol	+	agaccagttctctgagactc	SEQ ID NO: 21
PERV-pol-9	pol	+	ccagttctctgagactccgg	SEQ ID NO: 22
PERV-pol-10	pol	+	cagttctctgagactccgga	SEQ ID NO: 23
PERV-pol-11	pol	-	ccagttccgttcaggcggga	SEQ ID NO: 24
PERV-pol-12	pol	-	accagttccgttcaggcggg	SEQ ID NO: 25
PERV-pol-13	pol	-	tgtaccagttccgttcaggc	SEQ ID NO: 26
PERV-pol-14	pol	-	ctccattcgaaggcaaaaag	SEQ ID NO: 27
PERV-pol-15	pol	-	cctccatggtcctagggttt	SEQ ID NO: 28
PERV-pol-16	pol	-	tcctccatggtcctagggtt	SEQ ID NO: 29
PERV-pol-17	pol	-	cttcccagtgagcgcctggg	SEQ ID NO: 30
PERV-pol-18	pol	-	ccagtattttttagccacca	SEQ ID NO: 31
PERV-pol-19	pol	-	cttcagttgaataacctgtg	SEQ ID NO: 32
PERV-pol-20	pol	-	ttctaagcagtcctgtttgg	SEQ ID NO: 33
PERV-pol-21	pol	-	tcctagggtgtactgtcgtc	SEQ ID NO: 34
PERV-pol-22	pol	+	agcgatggctgacggaggca	SEQ ID NO: 35
PERV-pol-23	pol	+	gctcaaatttcttttgaaca	SEQ ID NO: 36
PERV-pol-24	pol	+	tcaagatatacagtcctggt	SEQ ID NO: 37
PERV-pol-25	pol	+	caagcctgggcagaaaccgc	SEQ ID NO: 38
PERV-pol-26	pol	+	tgttcaaagattaatccaac	SEQ ID NO: 39
PERV-pol-27	pol	+	gttcaaagattaatccaaca	SEQ ID NO: 40
PERV-pol-28	pol	+	aaacaagtgagagagttttt	SEQ ID NO: 41
PERV-pol-29	pol	+	aacaagtgagagagtttttg	SEQ ID NO: 42
PERV-pol-30	pol	+	cccaaaccctaggaccatgg	SEQ ID NO: 43
PERV-pol-31	pol	+	caactattgattgaggagac	SEQ ID NO: 44
PERV-pol-32	pol	+	agcgcaaaaggctgagctca	SEQ ID NO: 45
PERV-pol-33	pol	+	caagctttgcggctggccga	SEQ ID NO: 46
PERV-pol-34	pol	+	tccgagatttggaataccta	SEQ ID NO: 47
PERV-env-1	env	+	aacaggaaaatattcaaaag	SEQ ID NO: 48
PERV-env-2	env	+	tgaacaggggcccccggccc	SEQ ID NO: 49
PERV-env-3	env	+	cagcagctagagaaaggact	SEQ ID NO: 50
PERV-env-4	env	+	accaggggtggtttgaagga	SEQ ID NO: 51
PERV-env-5	env	-	ccagcttaacgtgggatgca	SEQ ID NO: 52
PERV-env-6	env	-	tttccagtctccatcgttgg	SEQ ID NO: 53
PERV-env-7	env	-	ccatttccagtctccatcgt	SEQ ID NO: 54
PERV-env-8	env	-	gcccaccacctgttataacc	SEQ ID NO: 55
PERV-env-9	env	-	accatccttcaaaccacccc	SEQ ID NO: 56
PERV-env-10	env	+	actcgaggtgttgctcctag	SEQ ID NO: 57
PERV-env-11	env	+	ccgagtgtactaccatcctg	SEQ ID NO: 58
PERV-env-12	env	+	taccaaggccttctgagcca	SEQ ID NO: 59
PERV-env-13	env	-	gtctataaggcgtttactac	SEQ ID NO: 60
PERV-env-14	env	-	gaccatgacacagaaatctt	SEQ ID NO: 61
PERV-env-15	env	-	accatccttcaaaccacccc	SEQ ID NO: 62
PERV-env-16	env	-	acccactcgttctctaacaa	SEQ ID NO: 63
PERV-env-17	env	-	cgtcagagcagaaagcaggg	SEQ ID NO: 64
PERV-env-18	env	-	tcctatgcatgtccccttcc	SEQ ID NO: 65

2) Editor protein

The composition for genetic manipulation disclosed by the present application may include an editor protein.

For a detailed description of the guide nucleic acid, editor protein, and guide nucleic acid-editor protein complex, which will be described below, reference may be made to Korean Patent Publication No. 2018-0054427 and Korean Patent Publication No. 2018-0136914.

In addition, for a detailed description of the base modifying enzyme, reference may be made to Korean Patent Publication No. 2020-0135225.

The "editor protein" refers to a peptide, polypeptide or protein capable of artificially editing a genome or nucleic acid in a target cell.

The editor protein may be a cleavage enzyme or a fusion protein containing the same.

The fusion protein may consist of a cleavage enzyme and one or more enzymes that function differently from the cleavage enzyme.

One or more enzymes that have a different function from the cleavage enzyme may be one or more enzymes selected from base modification enzymes, methylation regulatory enzymes, phosphorylation regulatory enzymes, and acetylation regulatory enzymes.

In this case, the cleavage enzyme may be a polypeptide or protein that functions to cleave nucleic acids, genes, or chromosomes.

In this case, the base modifying enzyme may be a polypeptide or protein that functions to transform a base of a nucleotide constituting a nucleic acid into another base.

In this case, the methylation-regulating enzyme may be a polypeptide or protein that functions to methylate and/or demethylate nucleotide sequences of nucleic acids, genes, or chromosomes.

In this case, the phosphorylation-regulating enzyme may be a polypeptide or protein that functions to phosphorylate and/or dephosphorylate a nucleotide sequence of a nucleic acid, gene, or chromosome.

In this case, the acetylation regulatory enzyme may be a polypeptide or protein that functions to acetylate and/or deacetylate a nucleotide sequence of a nucleic acid, gene, or chromosome.

In addition, the editor protein may further include additional domains, peptides, polypeptides or proteins.

The additional domain, peptide, polypeptide or protein may be a functional domain, peptide, polypeptide or protein having the same or different function as the functional domain, peptide, polypeptide or protein included in the editor protein.

In one embodiment, the editor protein includes a CRISPR enzyme.

In the present application, the CRISPR enzyme may be derived from Streptococcus pyogenes, Streptococcus thermophilus, Streptococcus sp., Staphylococcus aureus or Campylobacter jejuni (Cj).

The CRISPR enzyme may be Cas9, Cpf1 or nickase, but is not limited thereto.

Structural characteristics of the Cas9 are described by Hiroshi Nishimasu et al. (2014) Cell 156:935-949.

The structural characteristics of Cpf1 are described by Takashi Yamano et al. (2016) Cell 165:949-962.

The nickase refers to a CRISPR enzyme engineered or modified to cleave only one strand of the double strand of a target gene or nucleic acid. The nickase has nuclease activity that cleaves a single strand, for example, either the strand that is non-complementary to the gRNA or the strand that is complementary to it. Thus, the nuclease activity of two nickases is required to cleave the double strand.

The CRISPR enzyme recognizes or interacts with a specific nucleotide sequence in a target gene or nucleic acid, that is, a protospacer adjacent motif (PAM).

For example, the Cas9 protein recognizes a protospacer adjacent motif (PAM) sequence on a subject's genomic DNA sequence and interacts with a guide nucleic acid to cut the DNA sequence at a position adjacent to the PAM sequence. The PAM is generally understood to be determined by the origin of the CRISPR enzyme, but as research on mutants of these enzymes proceeds, the PAM may change.

For example, if the CRISPR enzyme is Streptococcus pyogenes Cas9 (SpCas9), the PAM may be 5'-NGG-3'; if the CRISPR enzyme is Streptococcus thermophilus Cas9 (StCas9), the PAM may be 5'-NNAGAAW-3' (W = A or T); in the case of Neisseria meningitidis Cas9 (NmCas9), the PAM may be 5'-NNNNGATT-3'; in the case of Campylobacter jejuni Cas9 (CjCas9), the PAM may be 5'-NNNNRYAC-3' (R = A or G, Y = C or T); and in the case of Staphylococcus aureus Cas9 (SaCas9), the PAM may be 5'-NNGRR(T)-3'. In each case, N is each independently A, T, C or G, and R is A or G. In addition, in the case of Staphylococcus auricularis Cas9 (SauriCas9), the PAM may be 5'-NNGG-3', wherein N is A, T, G or C; or A, U, G or C. As another example, when the CRISPR enzyme is Cpf1, the PAM sequence may be 5'-TTN-3' (N is A, T, C or G).

The editor protein of the present application further includes a base modification enzyme.

The base modifying enzyme may be a polypeptide or protein that functions to transform a base of a nucleotide constituting a nucleic acid into another base.

The base modifying enzyme may be deaminase.

The deaminase may be wild-type or mutant. The deaminase variant may be an enzyme having increased deaminase activity compared to wild-type deaminase.

For example, a deaminase variant may be an enzyme in which one or more amino acid sequences in deaminase are modified. At this time, the modification may be any one selected from substitution, deletion and insertion of amino acids.

The deaminase may modify (substitute) one or more bases selected from adenine (A), guanine (G), cytosine (C), thymine (T), and uracil (U) with another base.

For example, when the selected one or more bases are adenine (A), the selected bases can be modified to guanine (G), cytosine (C), thymine (T) or uracil (U). Alternatively, when the selected one or more bases are guanine (G), the selected bases may be modified to adenine (A), cytosine (C), thymine (T) or uracil (U). Alternatively, when the selected one or more bases are cytosine (C), the selected bases may be modified to adenine (A), guanine (G), thymine (T) or uracil (U). Alternatively, when the selected one or more bases are thymine (T) or uracil (U), the selected bases may be modified to adenine (A), guanine (G) or cytosine (C).

The deaminase may be adenosine deaminase, cytidine deaminase, cytosine deaminase or guanine deaminase.

For example, the deaminase may be AID, PmCDA1, AICDA, ARP2, CDA2, HIGM2, APOBEC1, APOBEC2, APOBEC3A, APOBEC3B, APOBEC3C, APOBEC3D, APOBEC3E, APOBEC3F, APOBEC3G, APOBEC3H, CDA, DCTD, AMPD1, ADAT, ADAT2, ADAR, ADAR2, ADA, TadA or GDA.

In one embodiment, the cytidine deaminase may substitute thymine for cytosine.

In another embodiment, the adenosine deaminase may replace adenine with guanine.

For example, the deaminase of the present application may be cytidine deaminase.

The cytidine deaminase may be AID, PmCDA1, AICDA, ARP2, CDA2, HIGM2, APOBEC1, APOBEC2, APOBEC3A, APOBEC3B, APOBEC3C, APOBEC3D, APOBEC3E, APOBEC3F, APOBEC3G, APOBEC3H, CDA or DCTD.

The deaminase can generate a specific codon sequence. In this case, the specific codon sequence may be any one or more stop codon sequences selected from among 5'-TAG-3', 5'-TAA-3' and 5'-TGA-3'.

In one embodiment, in the case of cytidine deaminase, if the target sequence in the target gene or nucleic acid contains any one or more nucleotide sequences selected from 5'-CAG-3', 5'-CAA-3', 5'-CGA-3', 5'-CTA-3', 5'-CCA-3' and 5'-TCA-3', cytosine is substituted with thymine, and a stop codon sequence can be created in the target gene or nucleic acid.

For example, one or more bases may be substituted with the base modifying enzyme to generate a stop codon.

For example, one or more bases may be substituted with the base modifying enzyme to induce stop codon generation.

The stop codon may be 5'-TAG-3', 5'-TAA-3' and/or 5'-TGA-3'.

The editor protein of the present application may further include a DNA glycosylase inhibitor.

The DNA glycosylase inhibitor can further increase the efficiency of base modification.

For example, the DNA glycosylase inhibitor may be a uracil glycosylase inhibitor (UGI), which inhibits the enzyme involved in restoring uracil to cytosine.

In base substitution, the deaminase and DNA glycosylase inhibitors may be involved sequentially or simultaneously.

For example, the cytidine deaminase deaminates cytosine, producing uracil as an intermediate. UGI inhibits the mechanism by which this uracil would be restored to cytosine, and as a result, the cytosine can be replaced by thymine through DNA replication or mismatch repair.

Meanwhile, for example, the editor protein of the present application may include the CRISPR enzyme and the base modifying enzyme as independent proteins, or the two may be fused to each other and included as a fusion protein. In this case, they may be expressed from separate vectors or included in one vector.

In one embodiment, the editor protein may be composed of wild-type Cas9 and cytidine deaminase.

In another embodiment, the editor protein may be composed of Cas9 nickase, cytidine deaminase and UGI.

As another example, the editor protein of the present application may be a fusion protein in which a CRISPR enzyme and a base modifying enzyme are fused to each other. In this case, the fusion protein may optionally further include a linker.

In one embodiment, the editor protein may be a fusion protein in which wild-type Cas9 and deaminase are fused.

In another embodiment, the editor protein may be composed of a Cas9 nickase and a fusion protein in which cytidine deaminase and UGI are fused.

In another embodiment, the editor protein may be a fusion protein composed of Cas9 nickase, cytidine deaminase and UGI.

3) Guide nucleic acid

The composition for genetic manipulation disclosed by the present application further includes a guide nucleic acid.

In one embodiment disclosed by the present application, the composition for genetic manipulation includes a guide nucleic acid targeting a target sequence within the PERV gene.

The "guide nucleic acid" refers to a nucleotide sequence capable of recognizing a target nucleic acid, gene or chromosome and interacting with an editor protein. In this case, the guide nucleic acid may complementarily bind to a target sequence, a gene, or a part of a nucleotide sequence in a chromosome. In addition, some nucleotide sequences of the guide nucleic acid may interact with some amino acids of the editor protein to form a guide nucleic acid-editor protein complex (ribonucleoprotein, RNP).

The guide nucleic acid may perform a function of inducing a guide nucleic acid-editor protein complex to be located in a target region of a target nucleic acid, gene, or chromosome.

The guide nucleic acid may be in the form of DNA, RNA or a DNA/RNA mixture.

The guide nucleic acid may include one or more domains.

The domain may be a functional domain such as a guide domain, a first complementary domain, a linking domain, a second complementary domain, a proximal domain, and a tail domain, but is not limited thereto.

"Guide RNA" (gRNA) refers to an RNA capable of specifically targeting the gRNA-CRISPR enzyme complex, i.e., the CRISPR complex, to a target gene or nucleic acid. In addition, the gRNA refers to a target gene- or nucleic acid-specific RNA, and can bind to a CRISPR enzyme to direct the CRISPR enzyme to a target gene or nucleic acid.

The guide RNA may include crRNA (CRISPR RNA) and/or tracrRNA (trans-activating crRNA). A form in which the crRNA and a specific site of the tracrRNA are fused may be referred to as sgRNA (single-chain guide RNA).

The crRNA is the part that binds to a target sequence, and the tracrRNA is involved in forming a complex with the editor protein.

For example, the crRNA may include a guide domain and a first complementary domain, and the tracrRNA may include a second complementary domain, a proximal domain and optionally a tail domain.

The guide domain is a domain containing a guide sequence capable of complementary binding with a target sequence (guide nucleic acid binding sequence) on a target gene or nucleic acid, and serves for specific interaction with the target gene or nucleic acid.

The first complementary domain is a nucleic acid sequence including a nucleic acid sequence complementary to the second complementary domain, and may have complementarity to the extent of forming a double strand with the second complementary domain.

The "linking domain" is a nucleic acid sequence linking two or more domains, and the linking domain may link two or more identical or different domains. The linking domain may form a covalent or non-covalent bond with two or more domains, or may link two or more domains covalently or non-covalently.

The "second complementary domain" is a nucleic acid sequence including a nucleic acid sequence complementary to the first complementary domain, and may have complementarity to the extent of forming a double strand with the first complementary domain.

The "proximal domain" may be a nucleic acid sequence located proximal to the second complementary domain.

The "tail domain" may be a nucleic acid sequence located at one or more ends of both ends of the guide nucleic acid.

For example, the composition of the present application may include a gRNA capable of targeting a target sequence in the PERV gene.

As another example, the composition of the present application may include a gRNA capable of simultaneously targeting target sequences each present in two or more subtype gene sequences selected from among PERV-A, PERV-B and PERV-C.

The gRNA may include a guide domain capable of partially or completely complementary binding with a guide nucleic acid binding sequence among target sequences of the PERV gene.

The gRNA may include a guide domain capable of partially or completely complementary binding to a guide nucleic acid binding sequence among target sequences of the gag gene, pol gene, and/or env gene.

The guide domain may be at least 70%, 75%, 80%, 85%, 90% or 95% complementary or completely complementary to the guide nucleic acid binding sequence.
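As a non-limiting illustration of the complementarity percentages above, the Python sketch below pairs a guide domain (RNA, 5' to 3') with a guide nucleic acid binding sequence (DNA, given 5' to 3') and reports the number of mismatches and the percent complementarity. The function name `complementarity` is hypothetical, only Watson-Crick pairs are counted, and the binding sequence used in the example was derived here as the reverse complement of SEQ ID NO: 1; it is not a sequence listed in the present application.

```python
# Watson-Crick pairs as (RNA base, DNA base).
RNA_DNA_PAIRS = {("a", "t"), ("u", "a"), ("g", "c"), ("c", "g")}


def complementarity(guide_rna: str, binding_dna_5to3: str):
    """Return (mismatches, percent complementarity) for an antiparallel duplex."""
    guide = guide_rna.lower()
    # Read the DNA 3' -> 5' so that position i pairs with guide position i.
    binding = binding_dna_5to3.lower()[::-1]
    if len(guide) != len(binding):
        raise ValueError("guide and binding sequence must have equal length")
    mismatches = sum((g, d) not in RNA_DNA_PAIRS for g, d in zip(guide, binding))
    percent = 100.0 * (len(guide) - mismatches) / len(guide)
    return mismatches, percent


if __name__ == "__main__":
    guide = "uucagguuaagaagggaccu"    # guide domain for SEQ ID NO: 1, T replaced by U
    binding = "aggtcccttcttaacctgaa"  # derived reverse complement of SEQ ID NO: 1
    print(complementarity(guide, binding))
```

With a 20-nucleotide guide domain, each mismatch lowers the value by 5 percentage points, so 0 to 5 mismatches correspond to 100% down to 75% complementarity.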

The guide RNA may include a guide domain that complementarily binds to a guide nucleic acid binding target sequence and includes the same sequence as a guide nucleic acid non-binding target sequence in which T is replaced with U.

Therefore, the guide RNA includes one or more nucleotide sequences selected from 5'-CAG-3', 5'-CAA-3', 5'-CGA-3', 5'-CUA-3', 5'-CCA-3' and 5'-UCA-3'.

In the present application, the part of the guide RNA sequence complementary to the guide nucleic acid binding target sequence is the guide domain region.

The guide domain may include a nucleotide sequence complementary to a guide nucleic acid binding sequence among target sequences of the PERV gene. In this case, the complementary nucleotide sequence may include 0 to 5, 0 to 4, 0 to 3, 0 to 2, or 0 to 1 mismatches.

The guide domain may include a nucleotide sequence complementary to a guide nucleic acid binding sequence among target sequences of the gag gene. In this case, the complementary nucleotide sequence may include 0 to 5, 0 to 4, 0 to 3, 0 to 2, or 0 to 1 mismatches.

The guide domain may include a nucleotide sequence complementary to a guide nucleic acid binding sequence among target sequences of the pol gene. In this case, the complementary nucleotide sequence may include 0 to 5, 0 to 4, 0 to 3, 0 to 2, or 0 to 1 mismatches.

The guide domain may include a nucleotide sequence complementary to a guide nucleic acid binding sequence among target sequences of the env gene. In this case, the complementary nucleotide sequence may include 0 to 5, 0 to 4, 0 to 3, 0 to 2, or 0 to 1 mismatches.

The gRNA may include one or more domains selected from the group consisting of a first complementary domain, a linking domain, a second complementary domain, a proximal domain, and a tail domain.

Among the guide RNA sequences, domain regions other than the guide domain region contribute to forming a complex with the Cas protein.

In the present application, the guide nucleic acid non-binding target sequence in the gag gene, pol gene and/or env gene includes a PAM and a stop codon target sequence. In addition, when the base modification enzyme operates on the guide nucleic acid non-binding target sequence, some bases of the stop codon target sequence are substituted (changed). At this time, a stop codon sequence in the target sequence may be generated by the substitution.

The PAM sequence can provide a selection criterion for a target sequence when designing a gRNA. A target position is found based on the PAM sequence, and the target sequence is designed to include a portion containing the stop codon target sequence. The guide domain sequence is then determined as the RNA sequence that is identical to the guide nucleic acid non-binding target sequence with T replaced by U.
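As a non-limiting illustration of the last step, the Python sketch below converts a guide nucleic acid non-binding target sequence (DNA) into the corresponding guide domain sequence (RNA) by replacing T with U. The function name `guide_domain_from_target` is hypothetical; the two inputs are the sequences listed as SEQ ID NO: 1 and SEQ ID NO: 14.

```python
def guide_domain_from_target(non_binding_target: str) -> str:
    """Return the guide domain (RNA) for a non-binding target sequence (DNA).

    The guide domain has the same sequence as the non-binding strand,
    with every T replaced by U.
    """
    target = non_binding_target.lower()
    unexpected = set(target) - set("acgt")
    if unexpected:
        raise ValueError(f"unexpected characters: {sorted(unexpected)}")
    return target.replace("t", "u")


if __name__ == "__main__":
    print(guide_domain_from_target("ttcaggttaagaagggacct"))  # SEQ ID NO: 1
    print(guide_domain_from_target("ggagcagtttccccaagcct"))  # SEQ ID NO: 14
```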

Hereinafter, guide domains of gRNAs that can be used in one embodiment of the present application are summarized in [Table 4].

Name	Target gene	Direction	gRNA (5' to 3', w/o PAM)	SEQ ID NO
PERV-gag-1_G	gag	+	uucagguuaagaagggaccu	SEQ ID NO: 66
PERV-gag-2_G	gag	+	gcagacacucuucacagccg	SEQ ID NO: 67
PERV-gag-3_G	gag	+	ccagaaagccucaguggccc	SEQ ID NO: 68
PERV-gag-4_G	gag	+	ucagagacuggaaggguuac	SEQ ID NO: 69
PERV-gag-6_G	gag	-	agccaugguuuaacccaugg	SEQ ID NO: 70
PERV-gag-7_G	gag	-	ucccaaccgaggcgagucaa	SEQ ID NO: 71
PERV-gag-8_G	gag	-	gucccaaccgaggcgaguca	SEQ ID NO: 72
PERV-gag-9_G	gag	+	uccccgaauccuggcucuug	SEQ ID NO: 73
PERV-gag-10_G	gag	+	gcgagagagaauucuguuag	SEQ ID NO: 74
PERV-gag-11_G	gag	+	uccccaacgccucacggggu	SEQ ID NO: 75
PERV-gag-12_G	gag	+	ccaacgccucacgggguugg	SEQ ID NO: 76
PERV-gag-13_G	gag	+	guugcaaaaugagauugaca	SEQ ID NO: 77
PERV-gag-14_G	gag	+	uugcaaaaugagauugacau	SEQ ID NO: 78
PERV-pol-1_G	pol	+	ggagcaguuuccccaagccu	SEQ ID NO: 79
PERV-pol-2_G	pol	+	cccacagguuauucaacuga	SEQ ID NO: 80
PERV-pol-3_G	pol	+	acaguaccccuugaguagag	SEQ ID NO: 81
PERV-pol-4_G	pol	+	ggugcaggacauacacccaa	SEQ ID NO: 82
PERV-pol-5_G	pol	+	ggcccagauuugcaggagag	SEQ ID NO: 83
PERV-pol-6_G	pol	+	cgggcagcgauggcugacgg	SEQ ID NO: 84
PERV-pol-7_G	pol	+	gacaguacacccuagaagac	SEQ ID NO: 85
PERV-pol-8_G	pol	+	agaccaguucucugagacuc	SEQ ID NO: 86
PERV-pol-9_G	pol	+	ccaguucucugagacuccgg	SEQ ID NO: 87
PERV-pol-10_G	pol	+	caguucucugagacuccgga	SEQ ID NO: 88
PERV-pol-11_G	pol	-	ccaguuccguucaggcggga	SEQ ID NO: 89
PERV-pol-12_G	pol	-	accaguuccguucaggcggg	SEQ ID NO: 90
PERV-pol-13_G	pol	-	uguaccaguuccguucaggc	SEQ ID NO: 91
PERV-pol-14_G	pol	-	cuccauucgaaggcaaaaag	SEQ ID NO: 92
PERV-pol-15_G	pol	-	ccuccaugguccuaggguuu	SEQ ID NO: 93
PERV-pol-16_G	pol	-	uccuccaugguccuaggguu	SEQ ID NO: 94
PERV-pol-17_G	pol	-	cuucccagugagcgccuggg	SEQ ID NO: 95
PERV-pol-18_G	pol	-	ccaguauuuuuuagccacca	SEQ ID NO: 96
PERV-pol-19_G	pol	-	cuucaguugaauaaccugug	SEQ ID NO: 97
PERV-pol-20_G	pol	-	uucuaagcaguccuguuugg	SEQ ID NO: 98
PERV-pol-21_G	pol	-	uccuaggguguacugucguc	SEQ ID NO: 99
PERV-pol-22_G	pol	+	agcgauggcugacggaggca	SEQ ID NO: 100
PERV-pol-23_G	pol	+	gcucaaauuucuuuugaaca	SEQ ID NO: 101
PERV-pol-24_G	pol	+	ucaagauauacaguccuggu	SEQ ID NO: 102
PERV-pol-25_G	pol	+	caagccugggcagaaaccgc	SEQ ID NO: 103
PERV-pol-26_G	pol	+	uguucaaagauuaauccaac	SEQ ID NO: 104
PERV-pol-27_G	pol	+	guucaaagauuaauccaaca	SEQ ID NO: 105
PERV-pol-28_GPERV-pol-28_G	polpol	++	aaacaagugagagaguuuuuaaacaagugagagaguuuuu	SEQ ID NO: 106SEQ ID NO: 106
PERV-pol-29_GPERV-pol-29_G	polpol	++	aacaagugagagaguuuuugaacaagugagagaguuuuug	SEQ ID NO: 107SEQ ID NO: 107
PERV-pol-30_GPERV-pol-30_G	polpol	++	cccaaacccuaggaccauggcccaaacccuaggaccaugg	SEQ ID NO: 108SEQ ID NO: 108
PERV-pol-31_GPERV-pol-31_G	polpol	++	caacuauugauugaggagaccaacuauugauugaggagac	SEQ ID NO: 109SEQ ID NO: 109
PERV-pol-32_GPERV-pol-32_G	polpol	++	agcgcaaaaggcugagcucaagcgcaaaaggcugagcuca	SEQ ID NO: 110SEQ ID NO: 110
PERV-pol-33_GPERV-pol-33_G	polpol	++	caagcuuugcggcuggccgacaagcuuugcggcuggccga	SEQ ID NO: 111SEQ ID NO: 111
PERV-pol-34_GPERV-pol-34_G	polpol	++	uccgagauuuggaauaccuauccgagauuuggaauccua	SEQ ID NO: 112SEQ ID NO: 112
PERV-env-1_GPERV-env-1_G	envenv	++	aacaggaaaauauucaaaagaacaggaaaauauucaaaag	SEQ ID NO: 113SEQ ID NO: 113
PERV-env-2_GPERV-env-2_G	envenv	++	ugaacaggggcccccggcccugaacaggggccccccggccc	SEQ ID NO: 114SEQ ID NO: 114
PERV-env-3_GPERV-env-3_G	envenv	++	cagcagcuagagaaaggacucagcagcuagagaaaggacu	SEQ ID NO: 115SEQ ID NO: 115
PERV-env-4_GPERV-env-4_G	envenv	++	accaggggugguuugaaggaaccaggggugguuugaagga	SEQ ID NO: 116SEQ ID NO: 116
PERV-env-5_GPERV-env-5_G	envenv	--	ccagcuuaacgugggaugcaccagcuuaacgugggaugca	SEQ ID NO: 117SEQ ID NO: 117
PERV-env-6_GPERV-env-6_G	envenv	--	uuuccagucuccaucguugguuuccagucuccaucguugg	SEQ ID NO: 118SEQ ID NO: 118
PERV-env-7_GPERV-env-7_G	envenv	--	ccauuuccagucuccaucguccauuuccagucuccaucgu	SEQ ID NO: 119SEQ ID NO: 119
PERV-env-8_GPERV-env-8_G	envenv	--	gcccaccaccuguuauaaccgcccaccaccuguuauaacc	SEQ ID NO: 120SEQ ID NO: 120
PERV-env-9_GPERV-env-9_G	envenv	--	accauccuucaaaccaccccaccauccuucaaaccacccc	SEQ ID NO: 121SEQ ID NO: 121
PERV-env-10_GPERV-env-10_G	envenv	++	acucgagguguugcuccuagacucgagguguugcuccuag	SEQ ID NO: 122SEQ ID NO: 122
PERV-env-11_GPERV-env-11_G	envenv	++	ccgaguguacuaccauccugccgaguguacuaccauccug	SEQ ID NO: 123SEQ ID NO: 123
PERV-env-12_GPERV-env-12_G	envenv	++	uaccaaggccuucugagccauaccaaggccuucugagcca	SEQ ID NO: 124SEQ ID NO: 124
PERV-env-13_GPERV-env-13_G	envenv	--	gucuauaaggcguuuacuacgucuauaaggcguuuacuac	SEQ ID NO: 125SEQ ID NO: 125
PERV-env-14_GPERV-env-14_G	envenv	--	gaccaugacacagaaaucuugaccaugacacagaaaucuu	SEQ ID NO: 126SEQ ID NO: 126
PERV-env-15_GPERV-env-15_G	envenv	--	accauccuucaaaccaccccaccauccuucaaaccacccc	SEQ ID NO: 127SEQ ID NO: 127
PERV-env-16_GPERV-env-16_G	envenv	--	acccacucguucucuaacaaaccacucguucucuaacaa	SEQ ID NO: 128SEQ ID NO: 128
PERV-env-17_GPERV-env-17_G	envenv	--	cgucagagcagaaagcagggcgucagagcagaaagcaggg	SEQ ID NO: 129SEQ ID NO: 129
PERV-env-18_GPERV-env-18_G	envenv	--	uccuaugcauguccccuuccuccuaugcaugucccccuucc	SEQ ID NO: 130SEQ ID NO: 130
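
As a non-limiting illustration, the guide sequences listed above may be checked programmatically before use. The following Python sketch assumes that each guide is written in lowercase RNA letters (a, c, g, u) and is 20 nucleotides long, which is an observation drawn from the listing rather than a stated requirement; the function names are hypothetical.

```python
RNA_BASES = set("acgu")


def check_guide(seq, expected_len=20):
    """Return the normalized guide, or raise ValueError if it looks malformed.

    expected_len=20 is an assumption based on the guide listing, not a rule
    stated in the description.
    """
    seq = seq.strip().lower()
    if len(seq) != expected_len:
        raise ValueError(f"expected {expected_len} nt, got {len(seq)}")
    unexpected = set(seq) - RNA_BASES
    if unexpected:
        raise ValueError(f"unexpected characters: {sorted(unexpected)}")
    return seq


def to_dna(seq):
    """Write an RNA guide in DNA letters (u -> T), e.g. for searching a genome."""
    return check_guide(seq).upper().replace("U", "T")


# Example with the guide of SEQ ID NO: 66 (PERV-gag-1_G)
print(to_dna("uucagguuaagaagggaccu"))
```

A guide that fails the length or alphabet check is more likely to reflect a transcription error than a deliberate design, so rejecting it early is the conservative choice in this sketch.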

4) Form of composition for genetic manipulation

The composition for genetic manipulation of the present application may include a guide nucleic acid and an editor protein.

The composition for genetic manipulation may include:

(a) a guide nucleic acid capable of targeting a target sequence within a target nucleic acid sequence, or a nucleic acid sequence encoding the same; and

(b) one or more editor proteins, or nucleic acid sequences encoding them.

The target nucleic acid may be the full length or partial nucleic acid sequence of the PERV gene.

In this case, the PERV gene may include a gag gene, a pol gene, and/or an env gene. The description of the PERV gene is as described above.

The target nucleic acid may be a gag gene.

The target nucleic acid may be a pol gene.

The target nucleic acid may be an env gene.

The target nucleic acid may be two genes selected from a gag gene, a pol gene, and an env gene.

The target nucleic acid may be a gag gene, a pol gene, and an env gene.

The description of the target sequence is the same as described above.

The target sequence may include a specific sequence.

The specific sequence may be a nucleotide sequence of 5'-CAG-3'.

The specific sequence may be a nucleotide sequence of 5'-CAA-3'.

The specific sequence may be a nucleotide sequence of 5'-CGA-3'.

The specific sequence may be a nucleotide sequence of 5'-CTA-3'.

The specific sequence may be a nucleotide sequence of 5'-CCA-3'.

The specific sequence may be a nucleotide sequence of 5'-TCA-3'.

As one specific example disclosed by the present application, the composition for genetic manipulation may include a gRNA, a CRISPR enzyme, and a base modification enzyme.

The composition for genetic manipulation may include:

(a) a gRNA capable of targeting a target sequence within a target nucleic acid sequence, or a nucleic acid sequence encoding the same;

(b) one or more CRISPR enzymes, or nucleic acid sequences encoding them; and

(c) one or more base modifying enzymes, or nucleic acid sequences encoding the same.

The target nucleic acid may be a gag gene.

The target nucleic acid may be a pol gene.

The target nucleic acid may be an env gene.

The target nucleic acid may be a gag gene, a pol gene, and an env gene.

The description of the target sequence is the same as described above.

The target sequence may include a specific sequence.

The specific sequence may be a nucleotide sequence of 5'-CAG-3'.

The specific sequence may be a nucleotide sequence of 5'-CAA-3'.

The specific sequence may be a nucleotide sequence of 5'-CGA-3'.

The specific sequence may be a nucleotide sequence of 5'-CTA-3'.

The specific sequence may be a nucleotide sequence of 5'-CCA-3'.

The specific sequence may be a nucleotide sequence of 5'-TCA-3'.

Description of the CRISPR enzyme is as described above.

The CRISPR enzyme may be Cas9 nickase.

Description of the base modifying enzyme is as described above.

The base modifying enzyme may be a deaminase.

The deaminase may be cytidine deaminase.
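
As a non-limiting illustration of why the specific sequences listed above are of interest, the following Python sketch assumes that the cytidine deaminase converts one or more C to T within the trinucleotide, and reports which stop codons (5'-TAG-3', 5'-TAA-3', 5'-TGA-3') could then appear on the same strand or on the complementary strand. The function names are hypothetical, and the reading frame and the editing window of the enzyme are not modeled.

```python
from itertools import product

STOP_CODONS = {"TAG", "TAA", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def c_to_t_products(triplet):
    """All sequences obtained by converting at least one C of the triplet to T."""
    c_positions = [i for i, base in enumerate(triplet) if base == "C"]
    results = set()
    for mask in product([False, True], repeat=len(c_positions)):
        if not any(mask):
            continue  # skip the unedited sequence
        bases = list(triplet)
        for pos, edited in zip(c_positions, mask):
            if edited:
                bases[pos] = "T"
        results.add("".join(bases))
    return results


def stop_codons_from(triplet):
    """Set of (strand, stop codon) pairs reachable by C-to-T conversion."""
    found = set()
    for edited in c_to_t_products(triplet.upper()):
        if edited in STOP_CODONS:
            found.add(("same", edited))
        rc = reverse_complement(edited)
        if rc in STOP_CODONS:
            found.add(("opposite", rc))
    return found


for seq in ["CAG", "CAA", "CGA", "CTA", "CCA", "TCA"]:
    print(seq, sorted(stop_codons_from(seq)))
```

Under these assumptions, each of the six listed trinucleotides yields at least one stop codon on one of the two strands, which is consistent with the stop codon generation described in this application.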

The CRISPR enzyme and the base modification enzyme may be fusion proteins fused to each other.

The composition for genetic manipulation may further include a DNA glycosylase inhibitor.

Description of the DNA glycosylase inhibitor is the same as described above.

The DNA glycosylase inhibitor may be a uracil glycosylase inhibitor.

The CRISPR enzyme, base modifying enzyme, and DNA glycosylase inhibitor may be included in the composition for genetic manipulation in the form of a fusion protein in which two or more of them are fused to each other.

Descriptions related to the guide nucleic acid and the editor protein are the same as described above.

The guide nucleic acid, editor protein, or guide nucleic acid-editor protein complex included in the composition for genetic manipulation disclosed in the present application may be delivered or introduced into cells in various forms.

The guide nucleic acid and editor protein may be delivered or introduced into a subject in the form of a nucleic acid-protein mixture.

The guide nucleic acid and editor protein may be delivered or introduced into a subject in the form of a guide nucleic acid-editor protein complex.

For example, the guide nucleic acid may be DNA, RNA or a mixture thereof. The editor protein may be in the form of a peptide, polypeptide or protein.

For example, the guide nucleic acid and the editor protein may be delivered or introduced into the subject as a ribonucleoprotein (RNP), that is, a guide nucleic acid-editor protein complex formed from the guide nucleic acid in RNA form and the editor protein in protein form.

The composition for genetic manipulation may be in the form of a vector or a non-vector.

In this case, one or more vectors may be constructed that contain the nucleic acid sequences encoding the guide nucleic acid, the CRISPR enzyme, and/or the base modifying enzyme, either together in a single vector or separately.

The vector may be a viral vector or a recombinant viral vector.

The virus may be a DNA virus or an RNA virus.

In this case, the DNA virus may be a double-stranded DNA (dsDNA) virus or a single-stranded DNA (ssDNA) virus.

At this time, the RNA virus may be a single-stranded RNA (ssRNA) virus.

The virus may be retrovirus, lentivirus, adenovirus, adeno-associated virus (AAV), vaccinia virus, poxvirus or herpes simplex virus, but is not limited thereto.

The vector may contain one or more regulation/control elements.

In this case, the regulatory/control elements may include promoters, enhancers, introns, polyadenylation signals, Kozak consensus sequences, internal ribosome entry sites (IRES), splice acceptors, and/or 2A sequences.

As the promoter, a suitable promoter may be used according to the control region (i.e., the nucleic acid sequence encoding the guide nucleic acid or the editor protein).

For example, useful promoters for guide nucleic acids can be H1, EF-1a, tRNA or U6 promoters. For example, useful promoters for editor proteins may be the CMV, EF-1a, EFS, MSCV, PGK or CAG promoters.

The vector may contain one or more additional domains for separation and purification.

In this case, the one or more additional domains may be tags or reporter genes for separation and purification of proteins (including peptides).

The tag includes a histidine (His) tag, a V5 tag, a FLAG tag, an influenza hemagglutinin (HA) tag, a Myc tag, a VSV-G tag and a thioredoxin (Trx) tag, and the reporter gene includes glutathione-S-transferase (GST), horseradish peroxidase (HRP), chloramphenicol acetyltransferase (CAT), beta-galactosidase, beta-glucuronidase, luciferase, green fluorescent protein (GFP), HcRed, DsRed, cyan fluorescent protein (CFP), yellow fluorescent protein (YFP) and blue fluorescent protein (BFP).

The non-vector may be naked DNA, DNA complex, mRNA, or a mixture thereof.

The non-vector can be delivered by electroporation, gene gun, sonoporation, magnetofection, transient cell compression or squeezing (e.g., Lee et al. (2012) Nano Lett., 12, 6322-6327), lipid-mediated transfection, dendrimers, nanoparticles, calcium phosphate, silica, silicate (ormosil), or combinations thereof.

For example, delivery through electroporation may be performed by mixing cells and nucleic acid sequences encoding guide nucleic acids and/or editor proteins in a cartridge, chamber, or cuvette, and applying electrical stimulation of a predetermined duration and amplitude.

As another example, a non-vector may be delivered using nanoparticles. The nanoparticles may be inorganic nanoparticles (e.g., magnetic nanoparticles, silica, etc.) or organic nanoparticles (e.g., polyethylene glycol (PEG)-coated lipids, etc.). The outer surface of the nanoparticles can be conjugated with a positively charged polymer (e.g., polyethyleneimine, polylysine, polyserine, etc.) to enable adhesion.

In certain embodiments, delivery may be performed using a lipid envelope.

In certain embodiments, delivery may be performed using exosomes. Exosomes are endogenous nano-vesicles that transport proteins and RNA, capable of delivering RNA to the brain and other target organs.

In certain embodiments, delivery may be performed using liposomes. Liposomes are spherical vesicular structures composed of a single or multilamellar lipid bilayer surrounding an inner aqueous compartment and a relatively impermeable outer lipophilic phospholipid bilayer. Liposomes can be made from several different types of lipids; phospholipids are most commonly used to create liposomes as drug carriers.

In addition, the composition for delivery of the non-vector may include some other additives.

4-1) Components for knocking in pig genome

The composition for genetic manipulation of the present application may be knocked into the pig genome in order to continuously inactivate PERV in the pig genome. When the composition for genetic manipulation is knocked into the genome, the likelihood of permanent, rather than temporary, inactivation of multi-copy PERV can be increased. Moreover, since the composition for genetic manipulation of the present application uses the CRISPR/Cas system, it not only suppresses the transcription and/or translation of the target gene of PERV but also creates a stop codon sequence in the target sequence of the target gene, so that target-specific and highly efficient PERV inactivation can be achieved.

To this end, the vector may include one or more transfer factors (transposons).

The transfer factor or a vector including the transfer factor may be located within a safe harbor.

For example, the safe harbor may be AAVS1, ROSA26, and the like.

The safe harbor enables the expression of the foreign gene or the vector containing the same to be continued without causing side effects even when the foreign gene or the vector containing the same is inserted or knocked into a specific part of the pig genome.

The transfer factor may be a piggyBac transfer factor or a Sleeping Beauty transfer factor.

The piggyBac transfer factor may include an inverted terminal repeat sequence (ITR) at both ends.

The enzyme that mediates translocation of the piggyBac transfer factor may include piggyBac transposase.

The piggyBac transfer factor can be inserted into the genome by recognizing a specific nucleotide sequence.

In this case, the specific nucleotide sequence may be a target insertion site, and the target insertion site may be TTAA.

In this case, the genome may be a genome of an animal cell including a pig cell, a human cell, a mouse cell, and a rat cell.

The piggyBac transfer factor may include an external gene or nucleic acid sequence.

The piggyBac transfer factor may include an external gene or nucleic acid sequence between ITRs at both ends.

The foreign gene or nucleic acid sequence may be a gene or nucleic acid sequence to be translocated.

The external gene or nucleic acid sequence may encode a guide nucleic acid and/or an editor protein.

At this time, the foreign gene may be inserted into the genome by a piggyBac transfer factor.

For example, when the transfer factor is a piggyBac transfer factor, the vector system may be configured as follows. The piggyBac transfer factor may include a nucleic acid sequence encoding a guide nucleic acid and an editor protein between ITRs at both ends. Alternatively, the piggyBac transfer factor may include a nucleic acid sequence encoding a guide nucleic acid or an editor protein between ITRs at both ends.

The Sleeping Beauty transfer factor may include direct repeat (DR) at both ends.

The Sleeping Beauty transfer factor may include ITR at both ends.

In this case, the ITR may include a DR.

The Sleeping Beauty transfer factor may include Sleeping Beauty transposase as an enzyme that mediates translocation.

The Sleeping Beauty transfer factor can be inserted into the genome by recognizing a specific nucleotide sequence.

In this case, the specific nucleotide sequence may be a target insertion site, and the target insertion site may be a TA.
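
As a non-limiting illustration of the target insertion sites described above (TTAA for the piggyBac transfer factor and TA for the Sleeping Beauty transfer factor), the following Python sketch lists the positions at which such a motif occurs in a given DNA sequence. The function name and the example sequence are hypothetical, and the sketch only locates candidate motifs; it does not predict where insertion actually occurs.

```python
import re


def find_insertion_sites(sequence, motif):
    """Return 0-based start positions of every occurrence of motif.

    A lookahead pattern is used so that overlapping occurrences are reported.
    """
    pattern = re.compile(f"(?={re.escape(motif.upper())})")
    return [match.start() for match in pattern.finditer(sequence.upper())]


example = "GGTTAACCTTAATA"  # made-up sequence, for illustration only
print("piggyBac (TTAA):", find_insertion_sites(example, "TTAA"))
print("Sleeping Beauty (TA):", find_insertion_sites(example, "TA"))
```

Because TA is contained in TTAA, every piggyBac candidate site in this sketch is also reported as a Sleeping Beauty candidate site.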

In this case, the genome may be a genome of an animal cell including pig cells, human cells, mouse cells, and rat cells.

The Sleeping Beauty transfer factor may include an external gene or nucleic acid sequence.

The Sleeping Beauty transfer factor may include an external gene or nucleic acid sequence between ITRs at both ends.

At this time, the foreign gene may be inserted into the genome by a Sleeping Beauty transfer factor.

For example, when the transfer factor is a Sleeping Beauty transfer factor, the vector system may be configured as follows. The Sleeping Beauty transfer factor may include a nucleic acid sequence encoding a guide nucleic acid and an editor protein between ITRs at both ends. Alternatively, the Sleeping Beauty transfer factor may include a nucleic acid sequence encoding a guide nucleic acid or an editor protein between ITRs at both ends.

In the examples of the present application, the results of FIG. 3 confirm that the composition for genetic manipulation was knocked into the pig genome. Furthermore, the results of FIGS. 4 and 5 confirm that the composition for genetic manipulation generated a stop codon in the target sequence of the PERV gene.

IV. PERV inactivation method

Another aspect disclosed by the present application relates to a method for inactivating PERV using the composition for genetic engineering. Description of the composition for genetic manipulation is the same as described above.

For example, the method may include:

introducing a composition for genetic manipulation into pig cells,

wherein the composition for genetic manipulation may include:

(a) a guide nucleic acid capable of targeting a target sequence within a target nucleic acid sequence, or a nucleic acid sequence encoding the same; and

(b) at least one editor protein selected from the group consisting of a Cas9 protein derived from Streptococcus pyogenes, a Cas9 protein derived from Campylobacter jejuni, a Cas9 protein derived from Streptococcus thermophilus, a Cas9 protein derived from Staphylococcus aureus, a Cas9 protein derived from Neisseria meningitidis, a Cas9 protein derived from Staphylococcus auricularis (SauriCas9), and a Cpf1 protein, or a nucleic acid sequence encoding the same.

In another example, the method may include:

introducing a composition for genetic manipulation into pig cells,

wherein the composition for genetic manipulation may include:

(a) a guide nucleic acid capable of targeting a target sequence within a target nucleic acid sequence, or a nucleic acid sequence encoding the same;

(b) at least one editor protein selected from the group consisting of a Cas9 protein derived from Streptococcus pyogenes, a Cas9 protein derived from Campylobacter jejuni, a Cas9 protein derived from Streptococcus thermophilus, a Cas9 protein derived from Staphylococcus aureus, a Cas9 protein derived from Neisseria meningitidis, a Cas9 protein derived from Staphylococcus auricularis (SauriCas9), and a Cpf1 protein, or a nucleic acid sequence encoding the same; and

(c) one or more deaminases, or a nucleic acid sequence encoding the same.

The introduction may be performed by at least one method selected from electroporation, liposomes, plasmids, viral vectors, nanoparticles, and a protein translocation domain (PTD) fusion protein method.

The plasmid may be a vector containing a transfer factor for genome knock-in of pigs.

The vector-related description including the transfer factor is the same as described above.

The viral vector may be at least one selected from the group consisting of retrovirus, lentivirus, adenovirus, adeno-associated virus (AAV), vaccinia virus, poxvirus, and herpes simplex virus.

In another embodiment, the composition for genetic manipulation may be delivered or introduced into a subject as a non-vector.

The description of the non-vector is as described above.

The subject may be a pig or a pig cell.

In this case, the porcine cells may include primary cultured porcine cells or commercial porcine cell lines.

In one embodiment, the method may be a pig cell production method comprising the step of introducing a composition for genetic manipulation capable of artificially manipulating the gene of PERV into the pig cell.

The porcine cell may be a porcine cell containing the PERV gene.

The porcine cell may be a porcine cell containing the full length or partial nucleic acid sequence of the PERV gene.

In this case, the full length or partial nucleic acid sequence of the PERV gene may be present in the genome of the pig cell.

In this case, the full length or partial nucleic acid sequence of the PERV gene may be present in the pig cell.

The target nucleic acid may be a gag gene.

The target nucleic acid may be a pol gene.

The target nucleic acid may be an env gene.

The target nucleic acid may be a gag gene, a pol gene, and an env gene.

In some embodiments, the method comprises extracting a cell or cell population from a pig, and modifying the cell or cell population. Culturing of the cells can occur in vitro at any stage. The engineered cell or cell population can be reintroduced into a pig or human, and furthermore, the nucleus or genome extracted from the engineered cell or cell population can be transplanted into a pig cell.

How to Determine Whether a Composition Has Been Introduced to a Subject

Optionally, the method may further include a step of confirming whether the composition for genetic manipulation has been introduced.

The confirming step may be performed using a known technique.

For example, the known technology may include a method for confirming transformation or a method for selecting transduction.

The selection method may be a method using a selectable marker.

The selectable marker may include an antibiotic resistance gene, a fluorescent protein, and the like.

For example, the antibiotic resistance gene may confer resistance to ampicillin, neomycin, kanamycin, hygromycin, gentamicin, puromycin, and the like.

For example, the fluorescent protein may be green fluorescent protein (GFP), yellow fluorescent protein (YFP), red fluorescent protein (RFP), and the like.

When using the selectable marker, the selectable marker may be included in the composition.

A selectable marker can be used as an indicator of whether the composition has been introduced into a subject. When an antibiotic resistance gene is introduced together with the composition as a selectable marker, and colonies form when the cells are cultured in a medium containing the antibiotic, it can be determined that the composition has been introduced.

For example, when the subject to which the composition is introduced is a cell and the selectable marker is an antibiotic resistance gene, the selection can be made as follows. When the cells are cultured in a medium containing the antibiotic and colonies are formed, it can be determined that the composition has been introduced into the cells.

Using the method described above, porcine cells in which PERV is inactivated on the porcine genome can be selected and obtained.

V. PERV-inactivated animal cell & animal & animal organ

Another aspect disclosed by the present application relates to a porcine cell, pig or porcine organ artificially engineered to inactivate PERV using the above method.

1) Artificially engineered pig cells

As an example, it relates to artificially engineered pig cells.

“Manipulated porcine cell or engineered porcine cell” means a pig cell that has been artificially manipulated rather than in a natural state.

An engineered porcine cell may be a porcine cell produced by genetic engineering.

The engineered porcine cell may be a porcine cell comprising an artificially engineered porcine endogenous retrovirus (PERV) gene.

The engineered porcine cell may be a porcine cell in which PERV gene transfer is inhibited.

The PERV gene transfer may be a position shift transfer on the genome in an engineered porcine cell.

The PERV gene transfer may be transfer into the genome of an engineered porcine cell and an adjacent cell, i.e., an engineered or non-engineered neighbor cell.

In this case, the adjacent cells may be animal cells including pig cells, human cells, mouse cells, and rat cells.

The engineered porcine cell may be a porcine cell in which production of a protein encoded by the PERV gene is inhibited or suppressed.

The engineered porcine cell may be a porcine cell in which production of a protein encoded by the gag gene is inhibited or suppressed.

The engineered porcine cell may be a porcine cell in which production of a protein encoded by the pol gene is inhibited or suppressed.

The engineered porcine cell may be a porcine cell in which production of a protein encoded by the env gene is inhibited or suppressed.

The engineered porcine cell may be a porcine cell in which production of two or more of the proteins encoded by the gag gene, the pol gene and the env gene is inhibited or suppressed.

As an embodiment disclosed by this specification, the engineered porcine cell may be a porcine cell comprising one or more of an artificially engineered gag gene, an artificially engineered pol gene, and an artificially engineered env gene.

The artificially engineered gag gene, artificially engineered pol gene, and artificially engineered env gene may be artificially engineered to include a specific sequence.

In this case, the specific sequence may be a stop codon.

In this case, the specific sequence may be 5'-TAG-3', 5'-TAA-3' and/or 5'-TGA-3'.

The artificially engineered gag gene may be a gene containing two or more stop codons.

The artificially engineered pol gene may be a gene containing two or more stop codons.

The artificially engineered env gene may be a gene containing two or more stop codons.

The two or more stop codons may include a first stop codon and one or more additional stop codons.

In this case, the “first stop codon” refers to a stop codon present at the 3' end of the gag gene, pol gene, and/or env gene in their natural state.

The first stop codon may be one of 5'-TAG-3', 5'-TAA-3' and 5'-TGA-3'.

In this case, the "additional stop codon" means a stop codon other than the first stop codon in the nucleic acid sequence of the artificially engineered gag gene, pol gene and/or env gene; it may be an artificially generated stop codon that does not exist in the nucleic acid sequence of the natural gag gene, pol gene and/or env gene.

The additional stop codon may have the same nucleotide sequence as the first stop codon or a different one.

The additional stop codon may be 5'-TAG-3', 5'-TAA-3' and/or 5'-TGA-3'.

For example, the additional stop codon may be located at or adjacent to the 5' end of the nucleic acid sequence of the gag gene, pol gene and/or env gene.

In another example, the additional stop codon may be located within the nucleic acid sequence of the gag gene, pol gene and/or env gene.

In another example, the additional stop codon may be located adjacent to the 3' end of the gag gene, pol gene, and/or env gene.

The one or more additional stop codons may be located within a contiguous nucleotide sequence of 1 bp to 100 bp adjacent to the 5' end and/or 3' end of a proto-spacer-adjacent motif (PAM) sequence in the nucleic acid sequence of the gag gene, pol gene and/or env gene.

In other words, the engineered porcine cell contains not only the stop codon originally (that is, naturally) present in the genome but also an artificially generated stop codon. More specifically, in the porcine cell, each of the gag gene, pol gene and env gene of PERV includes, in addition to its naturally present stop codon, an artificially generated stop codon, whereby PERV is inactivated.

As one embodiment,

The artificially engineered gag gene may include one or more additional stop codons in the nucleic acid sequence of the gag gene, in addition to the first stop codon present at the 3' end of the nucleic acid sequence of the wild-type gag gene.

The artificially engineered pol gene may include one or more additional stop codons in the nucleic acid sequence of the pol gene, in addition to the first stop codon present at the 3' end of the nucleic acid sequence of the wild-type pol gene.

The artificially engineered env gene may include one or more additional stop codons in the nucleic acid sequence of the env gene, in addition to the first stop codon present at the 3' end of the nucleic acid sequence of the wild-type env gene.
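
As a non-limiting illustration of the distinction between the first stop codon and an additional stop codon, the following Python sketch lists the in-frame stop codons of a coding sequence. The sequences are short made-up examples, not PERV sequences, and the function name is hypothetical. The wild-type example has a single stop codon at its 3' end, whereas the edited example (a single C-to-T change, CAG to TAG) has an additional, earlier stop codon.

```python
STOP_CODONS = {"TAG", "TAA", "TGA"}


def in_frame_stop_positions(cds):
    """Return 0-based start positions of in-frame stop codons (frame 0)."""
    cds = cds.upper()
    return [
        i
        for i in range(0, len(cds) - 2, 3)
        if cds[i:i + 3] in STOP_CODONS
    ]


wild_type = "ATGCAGGAATGGTAA"  # codons: ATG CAG GAA TGG TAA
edited = "ATGTAGGAATGGTAA"     # codons: ATG TAG GAA TGG TAA

print(in_frame_stop_positions(wild_type))  # only the first (natural) stop codon
print(in_frame_stop_positions(edited))     # additional stop codon plus the first one
```

In this sketch, an engineered gene is recognized by having two or more in-frame stop codons, with the last one corresponding to the first stop codon of the wild-type gene.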

In this application, more specific examples of porcine cells containing one or more artificially engineered genes are listed:

For example, the engineered pig cell is a pig cell comprising one artificially engineered gene. The one artificially engineered gene is one of an artificially engineered gag gene, an artificially engineered pol gene, and an artificially engineered env gene. In this case, the artificially engineered gene includes a first stop codon and one or more additional stop codons, and the additional stop codon may be located within a contiguous nucleotide sequence of 1 bp to 100 bp adjacent to the 5' end and/or 3' end of a 5'-NGG-3' PAM sequence (N is A, T, C or G) in the nucleic acid sequence of the artificially engineered gene.

In another example, the engineered porcine cell is a porcine cell comprising two artificially engineered genes. The two artificially engineered genes are two of an artificially engineered gag gene, an artificially engineered pol gene, and an artificially engineered env gene. In this case, each of the two artificially engineered genes includes a first stop codon and one or more additional stop codons, and the additional stop codon may be located within a contiguous nucleotide sequence of 1 bp to 100 bp adjacent to the 5' end and/or 3' end of a 5'-NGG-3' PAM sequence (N is A, T, C or G) in the nucleic acid sequence of the respective artificially engineered gene.

In another example, the engineered porcine cell is a porcine cell comprising three artificially engineered genes. The three artificially engineered genes are an artificially engineered gag gene, an artificially engineered pol gene, and an artificially engineered env gene. In this case, each of the three artificially engineered genes includes a first stop codon and one or more additional stop codons, and the additional stop codon may be located within a contiguous nucleotide sequence of 1 bp to 100 bp adjacent to the 5' end and/or 3' end of a 5'-NGG-3' PAM sequence (N is A, T, C or G) in the nucleic acid sequence of the respective artificially engineered gene.
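
As a non-limiting illustration of the positional condition described above, the following Python sketch checks whether a stop codon lies entirely within a window of up to 100 bp immediately adjacent to the 5' end or 3' end of a 5'-NGG-3' PAM sequence. The function names and the example sequence are hypothetical; only the strand as written is scanned, and the interpretation of "adjacent" as "immediately flanking the PAM" is an assumption of this sketch.

```python
import re


def pam_positions(seq):
    """0-based start positions of every 5'-NGG-3' occurrence on the given strand."""
    return [m.start() for m in re.finditer(r"(?=[ACGT]GG)", seq.upper())]


def codon_near_pam(seq, codon_start, window=100):
    """True if the 3-nt codon at codon_start lies within `window` bp of a PAM.

    The window is taken immediately 5' of the PAM or immediately 3' of it.
    """
    codon_end = codon_start + 3  # exclusive end
    for pam_start in pam_positions(seq):
        pam_end = pam_start + 3
        five_prime = (max(0, pam_start - window), pam_start)
        three_prime = (pam_end, min(len(seq), pam_end + window))
        for lo, hi in (five_prime, three_prime):
            if lo <= codon_start and codon_end <= hi:
                return True
    return False


# Made-up sequence: a TAG codon at position 0 and a single TGG PAM at position 23
example = "TAG" + "A" * 20 + "TGG" + "A" * 5
print(pam_positions(example))
print(codon_near_pam(example, 0, window=100))
print(codon_near_pam(example, 0, window=10))
```

The `window` parameter makes it possible to test narrower ranges than 100 bp when a tighter positional relationship to the PAM is of interest.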

In certain embodiments, in the porcine cell:

the inactivated PERV exists in multiple copies in the genome, and

the expression level of PERV may be reduced by 50% or more compared to before inactivation.

2) Artificially engineered transgenic pigs
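
As a non-limiting illustration of the 50% criterion, the following Python sketch computes the percentage reduction in expression from two relative expression values. The function names and the numbers are hypothetical, and the method by which expression is measured is not specified here.

```python
def percent_reduction(before, after):
    """Percentage by which expression fell relative to the value before inactivation."""
    if before <= 0:
        raise ValueError("expression before inactivation must be positive")
    return (before - after) / before * 100.0


def meets_criterion(before, after, threshold=50.0):
    """True if expression is reduced by at least `threshold` percent."""
    return percent_reduction(before, after) >= threshold


# Hypothetical relative expression values
print(percent_reduction(8.0, 2.0), meets_criterion(8.0, 2.0))
print(percent_reduction(4.0, 3.0), meets_criterion(4.0, 3.0))
```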

One aspect disclosed by this application relates to transgenic pigs.

The transgenic pig may be produced using engineered pig cells.

Description of the engineered porcine cells is as described above.

As an example, it relates to a method for producing a transgenic pig by transplanting a nucleus extracted from an engineered pig cell into an enucleated egg to produce offspring, and a transgenic pig produced thereby.

The transgenic pig may be a pig having a full-length or partial nucleic acid sequence of the artificially engineered PERV gene in its genome.

Description of the artificially engineered PERV gene is as described above.

In one embodiment, transgenic pigs may be produced by somatic cell nuclear transfer (SCNT) using the engineered porcine cells.

The method for producing a transgenic pig may include:

(a) culturing somatic cells or stem cells isolated from pig tissue and introducing a composition for genetic manipulation to prepare engineered pig cells, that is, nuclear donor cells;

(b) preparing enucleated oocytes by removing nuclei from porcine eggs;

(c) microinjecting the nucleus of the engineered porcine cell of step (a), that is, the nuclear donor cell, into the enucleated oocyte of step (b) and fusing them;

(d) activating the oocyte fused in step (c); and

(e) transplanting the activated oocyte into the oviduct of a surrogate mother.

Conventional descriptions of each step will be understood by referring to methods for producing cloned animals using conventional somatic cell nuclear transfer technology known in the art.

3) Transgenic pig organs

In addition, one aspect disclosed by the present specification relates to a transgenic pig organ.

The transgenic pig organ may be an organ obtained from a transgenic pig.

Description of the transgenic pigs is as described above.

The transgenic pig organ may be any organ that can be transplanted, such as heart, liver, kidney, and small intestine.

In this case, the transplant may be a homograft or a heterograft.

Artificial manipulation or modification of the PERV gene using the genetic manipulation composition disclosed in this application solves the problems of cytotoxicity and cell death caused by genomic rearrangement and the like, which occur when a multi-copy gene is knocked out using the existing CRISPR-Cas system, so that multiple copies of the target gene can be inactivated effectively and safely. In addition, transgenic pigs generated using the engineered pig cells disclosed herein may be useful for xenotransplantation therapy.

Hereinafter, the present invention will be described in more detail through examples.

These examples are only for explaining the present invention in more detail, and it will be apparent to those skilled in the art that the scope of the present invention is not limited by these examples.

Experimental methods and results

1. gRNA design

gRNA was designed as follows using http://www.rgenome.net/be-designer/.

Specifically, 1) among the gRNA sequences that may exist in the gag, pol, and env genes of PERV, only gRNAs capable of inducing a stop codon by C-to-T base editing were selected, and 2) each gRNA sequence was analyzed for whether it could simultaneously target PERV-A, PERV-B, and PERV-C, and only gRNAs capable of targeting all three subtypes with a single gRNA were selected. In this example, among the selected gRNAs, gRNAs located close to the start codon were finally selected and used (Table 5).
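The first selection criterion, keeping only positions where C-to-T editing can create a stop codon, can be sketched in Python as follows. This is a minimal illustration and not the BE-Designer implementation. The function names and the toy coding sequence are hypothetical, and the sketch assumes that the input is an in-frame coding sequence and that a C-to-T edit on the opposite strand reads as a G-to-A change on the coding strand.

```python
from itertools import product

STOP_CODONS = {"TAG", "TAA", "TGA"}

def edited_variants(codon, src, dst):
    """All codons reachable by changing one or more `src` bases to `dst`."""
    options = [(base, dst) if base == src else (base,) for base in codon]
    return {"".join(p) for p in product(*options)} - {codon}

def stop_inducing_sites(cds):
    """Scan an in-frame coding sequence for codons that base editing could turn into stops.

    Returns tuples of (codon_index, original_codon, stop_codon, edited_strand).
    """
    cds = cds.upper()
    sites = []
    for i in range(0, len(cds) - len(cds) % 3, 3):
        codon = cds[i:i + 3]
        if codon in STOP_CODONS:
            continue
        # C-to-T on the coding strand, e.g. CAG -> TAG
        for stop in sorted(edited_variants(codon, "C", "T") & STOP_CODONS):
            sites.append((i // 3, codon, stop, "sense"))
        # C-to-T on the opposite strand appears as G-to-A here, e.g. TGG -> TAG
        for stop in sorted(edited_variants(codon, "G", "A") & STOP_CODONS):
            sites.append((i // 3, codon, stop, "antisense"))
    return sites

# Hypothetical toy coding sequence: ATG CAG TGG AAA
for site in stop_inducing_sites("ATGCAGTGGAAA"):
    print(site)
```

On the coding strand, the codons that qualify are CAG, CAA, and CGA; TGG qualifies through the opposite strand, where it reads 5'-CCA-3'.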

In this example, guide RNAs targeting SEQ ID NO: 132 and SEQ ID NO: 139, respectively, were designed.
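The second selection criterion, a single gRNA that matches PERV-A, PERV-B, and PERV-C at once, amounts to intersecting the candidate protospacers of the three subtypes. The following Python sketch assumes an NGG PAM, scans only the given strand, and uses short made-up sequences with a 5-nt protospacer length so the example stays readable; none of the names or sequences come from the application.

```python
import re

def protospacers(seq, length=20):
    """Set of `length`-nt sequences located immediately 5' of an NGG PAM."""
    pattern = r"(?=([ACGT]{%d})[ACGT]GG)" % length
    return {m.group(1) for m in re.finditer(pattern, seq.upper())}

def shared_protospacers(subtype_seqs, length=20):
    """Protospacers present in every subtype sequence of the given mapping."""
    sets = [protospacers(seq, length) for seq in subtype_seqs.values()]
    return set.intersection(*sets) if sets else set()

# Hypothetical toy sequences standing in for the three subtype genes.
toy_subtypes = {
    "PERV-A": "TTACGTATGGCC",
    "PERV-B": "GGACGTATGGAA",
    "PERV-C": "CACGTAAGGTT",
}
print(shared_protospacers(toy_subtypes, length=5))
```

A real analysis would also scan the reverse complement and use the full 20-nt guide length; the intersection step itself would stay the same.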

2. Vector construction (All-in-one PiggyBac vector cloning)

A PiggyBac vector (#20960) and a Target-AID vector (#79620; pcDNA3.1_pCMV-nCas-PmCDA1-ugi pH1-gRNA (HPRT)) were purchased from Addgene. PERV-gag and PERV-pol targeting sgRNAs were synthesized together with the U6 promoter sequence (U6-sgGag and U6-sgPol). An All-in-one PiggyBac vector was constructed using the above vectors (FIG. 2). Primer sequences used in vector construction are as follows (Table 6):

The vector was designed so that all of the sequences encoding the guide RNAs including the guide domain sequences of SEQ ID NO: 66 and SEQ ID NO: 81 were included in the vector.

Gene	Range	Direction	Primer Sequence (5' to 3')	Length, Temp.	SEQ. ID No.
gag	1-529	1st Forward	ctggtggtctcctactgtcg	20 bp, 57-58℃	SEQ ID NO: 157
	1-529	1st Reverse	ctccaagagccaggattcgg	20 bp, 59℃	SEQ ID NO: 158
	2-285	2nd Forward	gtcttgtgcgtccttgtcta	20 bp, 54℃	SEQ ID NO: 159
	2-285	2nd Reverse	cgtaaggatatagggctcct	20 bp, 54-55℃	SEQ ID NO: 160
pol	1-495	1st Forward	ccatcactgtgttgaccctc	20 bp, 56-57℃	SEQ ID NO: 161
	1-495	1st Reverse	ggtgtaatctcaggcagaag	20 bp, 54-55℃	SEQ ID NO: 162
	2-295	2nd Forward	tatacagtcctggttggagc	20 bp, 54-56℃	SEQ ID NO: 163
	2-295	2nd Reverse	attgacctctctcaagtcct	20 bp, 54-55℃	SEQ ID NO: 164

1st In-fusion cloning: PB-MluI-NotI cloning

2nd Restriction enzyme cloning: PB-MluI-NotI & MluI-CMV-nCas9-PmCDA1-UGI-NeoR/KanR-pA-NotI

(vector size: 4352 bp, insert size: 7199 bp)

3rd In-fusion cloning: PB-CMV-nCas9-PmCDA1-UGI-NeoR/KanR-pA NotI + U6-sgGag + U6-sgPol

The abbreviations used in the 3rd In-fusion cloning step are as follows:

nCas9 (nickase Cas9), UGI (uracil-DNA glycosylase inhibitor), PmCDA1 (cytidine deaminase 1), NeoR (neomycin resistance), KanR (kanamycin resistance), pA (poly A).

3. Cell transfection

The PWG #8 cell line (Passage: 3) to be used for transfection was cultured in the following medium composition:

Dulbecco's modified Eagle medium (DMEM) (Gibco, Carlsbad, California, USA)

1% P/S, 10% Fetal bovine serum (FBS) (Gibco)

100 mM β-Mercaptoethanol (β-ME) (Sigma)

1% Non-essential amino acids (NEAA) (Gibco)

After the cultured porcine fibroblast cell line (PWG cells) was washed with PBS, the cells were prepared by treating them with trypsin. The prepared 3 × 10⁵ PWG cells were transfected with 500 ng of the All-in-one PiggyBac vector prepared in Experimental Method 2 using a Neon electroporator (ThermoFisher).

After transfection, the transfected PWG cells were cultured in fresh medium, and 4 hours later the medium was replaced with fresh medium. Two days later, the cells were treated with neomycin (2 μg/mL) for 10 days. Thereafter, colonies derived from single cells were selected and used in the experiments. Since the All-in-one PiggyBac vector contains the antibiotic resistance gene NeoR as a selection marker, cells that survive neomycin treatment are cells into which the transgene has been introduced.

4. Identification of transgene integration

The following experiment was performed to confirm whether the corresponding transgene was introduced (integrated) into the genome in cells surviving neomycin treatment.

Cellular genomic DNA was extracted using the DNeasy Blood & Tissue Kit (Qiagen, Cat No./ID: 69506), PCR was performed using the primers indicated by arrows in FIG. 2, and the bands were confirmed by loading the products on an agarose gel (FIG. 3). The primers used are as follows (Table 7):

Name	Primer Sequence (5' to 3')	SEQ. ID No.
integration_Forward	CCTCGTGCTTTACGGTATCG	SEQ ID NO: 165
integration_Reverse	ATGCTCAAGGGGCTTCATGA	SEQ ID NO: 166

In FIG. 3, colonies #1 and #3 refer to single cell-derived colonies obtained after neomycin selection. Referring to FIG. 3, bands corresponding to the transgene are clearly observed in both colonies #1 and #3.

That is, the results of FIG. 3 confirm that the transgene introduced into the cells through transfection was integrated into the genome of the cells.
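The logic of the integration PCR (a band appears only if both primer sites are present on the same template in the right orientation) can be illustrated with a short in-silico sketch. The primer sequences are those of Table 7. The template is a made-up placeholder, because the application does not give the actual amplicon sequence or size, and the helper names are hypothetical.

```python
def reverse_complement(seq):
    """Reverse complement of a DNA sequence (characters other than ACGT are kept)."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]

def amplicon_length(template, forward, reverse):
    """Expected product length for a primer pair on `template`, or None if no product."""
    template = template.upper()
    start = template.find(forward.upper())
    if start == -1:
        return None
    # The reverse primer anneals to the opposite strand, so search for its
    # reverse complement downstream of the forward primer site.
    rev_site = reverse_complement(reverse)
    end = template.find(rev_site, start + len(forward))
    if end == -1:
        return None
    return end + len(rev_site) - start

FWD = "CCTCGTGCTTTACGGTATCG"  # integration_Forward, SEQ ID NO: 165
REV = "ATGCTCAAGGGGCTTCATGA"  # integration_Reverse, SEQ ID NO: 166

# Placeholder template: 50 unknown bases between the two primer sites (assumption).
toy_template = "AAAA" + FWD + "N" * 50 + reverse_complement(REV) + "TTTT"
print(amplicon_length(toy_template, FWD, REV))   # 90 for this placeholder
print(amplicon_length("ACGT" * 10, FWD, REV))    # None: primer sites absent
```

A genome without the integrated transgene corresponds to the second call: no primer sites, hence no expected band.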

5. Identification of modifications of the target sequence

Cells in which the transgene was integrated into the genome were subjected to deep sequencing (Illumina MiSeq), and BE-Analyzer (rgenome.net) was then used to confirm whether modification had occurred in the target sequence.

The results confirming that stop codons were generated in each of the target sequences present in the gag gene and the pol gene are shown in FIGS. 4 and 5.

Referring to FIG. 4, it can be seen that 5'-CAG-3' in the target sequence of the gag gene is modified to 5'-TAG-3'.

Referring to FIG. 5, it can be seen that 5'-CAG-3' in the target sequence of the pol gene is modified to 5'-TAG-3'.

The efficiency of stop codon generation in each colony is shown in [Table 8].

Target Gene	Sample	C-to-T Substitution (%)
PERV-gag	non-treated	0.22
	Colony #1	63.15
	Colony #3	52.12
PERV-pol	non-treated	0.11
	Colony #1	54.60
	Colony #3	47.83
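One way to read Table 8 is to subtract the substitution rate of the non-treated sample from that of each colony, which separates editing from background sequencing noise. The application reports only the raw percentages; the subtraction below is an illustrative calculation added here, and the function name is hypothetical.

```python
# C-to-T substitution rates (%) copied from Table 8.
TABLE_8 = {
    "PERV-gag": {"non-treated": 0.22, "Colony #1": 63.15, "Colony #3": 52.12},
    "PERV-pol": {"non-treated": 0.11, "Colony #1": 54.60, "Colony #3": 47.83},
}

def background_corrected(table, control="non-treated"):
    """Subtract the control rate from each sample rate, per target gene."""
    corrected = {}
    for gene, samples in table.items():
        baseline = samples[control]
        corrected[gene] = {
            name: round(value - baseline, 2)
            for name, value in samples.items()
            if name != control
        }
    return corrected

for gene, samples in background_corrected(TABLE_8).items():
    for name, value in samples.items():
        print(f"{gene} {name}: {value:.2f} percentage points above background")
```

These values describe substitution frequency at the target site among sequencing reads; they are not a measurement of PERV expression.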

Through the above examples of the present application, it was confirmed that when the fusion editor protein (Cas9, cytidine deaminase, UGI) and the sgRNAs targeting the gag and pol genes, all of which are included in the transgene integrated into the genome of the transfected cells, are expressed in those cells, stop codons can be generated in the target sequences of the gag and pol genes in the genome of the cells. Therefore, the expression of PERV in cells transfected using the composition of the present application is believed to be significantly suppressed. In addition, since the fusion protein and the sgRNAs targeting the above-described PERV genes are expected to be continuously expressed, it is expected that PERV can be continuously inactivated in the future regardless of the copy number of the PERV gene.

The present application relates to a composition capable of artificially manipulating the PERV genome existing in multiple copies in the genome of a pig cell and a method of inactivating PERV using the same.

It relates to a target sequence of the PERV gene, a gRNA sequence for targeting the PERV gene, a primer sequence, and the like.

Claims

A method for inactivating a porcine endogenous retrovirus (PERV) present in multiple copies in a pig genome, the method comprising:

Introducing a composition for genetic manipulation into pig cells;

At this time, the composition for genetic manipulation,

(a) a guide RNA capable of targeting a target sequence of a target gene of PERV or a nucleic acid sequence encoding the same;

(b) a Cas protein or a nucleic acid sequence encoding the same; and

(c) cytidine deaminase or a nucleic acid sequence encoding the same;

including,

At this time, the target gene of the PERV is at least one selected from the gag gene, the pol gene, and the env gene,

The target sequence includes at least one nucleotide sequence selected from 5'-CAG-3', 5'-CAA-3', 5'-CGA-3', 5'-CTA-3', 5'-CCA-3' and 5'-TCA-3', wherein the target sequence is located in a continuous 10 bp to 30 bp sequence region of the PERV gene,

The Cas protein is selected from the group consisting of a Cas9 protein derived from Streptococcus pyogenes, a Cas9 protein derived from Campylobacter jejuni, a Cas9 protein derived from Streptococcus thermophilus, a Cas9 protein derived from Staphylococcus aureus, a Cas9 protein derived from Staphylococcus auricularis, and a Cpf1 protein,

The inactivation reduces or suppresses the expression of the target gene of PERV by generating a stop codon in the target sequence,

A method characterized by inactivating PERV present in the multiple copies.
According to claim 1,

Characterized in that the introduction is performed by one or more methods selected from electroporation, liposomes, plasmids, viral vectors, nanoparticles and PTD (Protein translocation domain) fusion protein methods.
According to claim 1,

Wherein the stop codon is any one or more nucleotide sequences selected from 5'-TAG-3', 5'-TAA-3' and 5'-TGA-3'.
According to claim 1,

Wherein the cytidine deaminase is selected from AID, PmCDA1, AICDA, ARP2, CDA2, HIGM2, APOBEC1, APOBEC2, APOBEC3A, APOBEC3B, APOBEC3C, APOBEC3D, APOBEC3E, APOBEC3F, APOBEC3G, APOBEC3H, CDA and DCTD .
According to claim 1,

The composition for genetic manipulation is characterized in that it further comprises a DNA glycosylase inhibitor.
According to claim 5,

Wherein the DNA glycosylase inhibitor is a uracil glycosylase inhibitor (UGI).
According to claim 1,

The method, characterized in that the Cas9 protein is a Cas9 nickase (nCas9).
According to claim 1,

The method, characterized in that the target sequence is any one or more selected from SEQ ID NOs: 1 to 65.
According to claim 1,

The method, characterized in that the target sequence is located within 10 bp to 25 bp adjacent to the 5' end and/or 3' end of a PAM (proto-spacer-adjacent motif) sequence present in the nucleic acid sequence of the PERV gene.
According to claim 9,

The method characterized in that the PAM sequence is one or more of the following sequences: (written in the 5' to 3' direction)

NGG (N is A, T, C or G);

NNNNRYAC (N is each independently A, T, C or G, R is A or G, and Y is C or T);

NNGG (N is A, T, C or G);

NNAGAAW (N is each independently A, T, C or G, and W is A or T);

NNNNGATT (N is each independently A, T, C or G);

NNGRR(T), where each N is independently A, T, C, or G, and R is A or G; and

TTN (where N is A, T, C, or G).
According to claim 1,

Wherein the composition for genetic manipulation is integrated into the genome of the pig cell.
According to claim 1,

The method, characterized in that the guide RNA can simultaneously target target genes present in two or more subtype gene sequences selected from among PERV-A, PERV-B and PERV-C.
A composition for genetic manipulation for inactivating porcine endogenous retrovirus (PERV),

The composition,

(a) a guide RNA comprising one or more guide domain sequences selected from SEQ ID NOs: 66 to 130, or a nucleic acid encoding the same;

(b) a Cas protein or a nucleic acid sequence encoding the same; and

(c) cytidine deaminase or a nucleic acid sequence encoding the same;

including,

At this time, the target gene of the PERV is at least one selected from the gag gene, the pol gene, and the env gene,

The guide domain may form a complementary bond with a guide nucleic acid binding target sequence among target sequences of the PERV gene,

The guide RNA contains, in the guide domain sequence, one or more nucleotide sequences selected from 5'-CAG-3', 5'-CAA-3', 5'-CGA-3', 5'-CUA-3', 5'-CCA-3' and 5'-UCA-3',

A composition characterized in that the Cas protein is selected from the group consisting of a Cas9 protein derived from Streptococcus pyogenes, a Cas9 protein derived from Campylobacter jejuni, a Cas9 protein derived from Streptococcus thermophilus, a Cas9 protein derived from Staphylococcus aureus, a Cas9 protein derived from Staphylococcus auricularis, and a Cpf1 protein.
According to claim 13,

The composition, characterized in that the cytidine deaminase is selected from AID, PmCDA1, AICDA, ARP2, CDA2, HIGM2, APOBEC1, APOBEC2, APOBEC3A, APOBEC3B, APOBEC3C, APOBEC3D, APOBEC3E, APOBEC3F, APOBEC3G, APOBEC3H, CDA and DCTD .
According to claim 13,

The composition for genetic manipulation is characterized in that it further comprises a DNA glycosylase inhibitor.
According to claim 15,

The composition, characterized in that the DNA glycosylase inhibitor is a uracil glycosylase inhibitor (UGI).
According to claim 13,

The composition, characterized in that the Cas protein is a Cas9 nickase (nCas9).
According to claim 13,

The composition, characterized in that the target sequence is located within 10 bp to 25 bp adjacent to the 5' end and/or 3' end of a PAM (proto-spacer-adjacent motif) sequence present in the nucleic acid sequence of the PERV gene.
According to claim 18,

The composition, characterized in that the PAM sequence is one or more of the following sequences: (written in the 5 'to 3' direction)

NGG (N is A, T, C or G);

NNNNRYAC (N is each independently A, T, C or G, R is A or G, and Y is C or T);

NNGG (N is A, T, C or G);

NNAGAAW (N is each independently A, T, C or G, and W is A or T);

NNNNGATT (N is each independently A, T, C or G);

NNGRR(T), where each N is independently A, T, C, or G, and R is A or G; and

TTN (where N is A, T, C, or G).
According to claim 13,

The composition is characterized in that the guide RNA, Cas protein and cytidine deaminase are all included in one vector or each nucleic acid sequence is included in an individual vector.
According to claim 20,

Wherein the vector is a viral vector or a plasmid.
According to claim 21,

The composition, characterized in that the virus is at least one selected from retrovirus, lentivirus, adenovirus, adeno-associated virus (AAV), vaccinia virus, poxvirus or herpes simplex virus.
According to claim 21,

The composition, characterized in that the vector further comprises a gene encoding a transposable element (transposon) and/or a transposase.
According to claim 23,

The composition, characterized in that the transposable element is a piggyBac transposon or a Sleeping Beauty transposon.
A pig cell comprising, in its genome, a porcine endogenous retrovirus (PERV) in which at least one gene selected from the gag gene, pol gene, and env gene is inactivated,

Wherein each of the inactivated genes contains two or more stop codons,

The stop codon is any one or more nucleotide sequences selected from 5'-TAG-3', 5'-TAA-3' and 5'-TGA-3',

The inactivated PERV exists in multiple copies in the genome, and

The expression level of PERV is reduced by 50% or more compared to before inactivation.
According to claim 25,

A pig produced using the pig cells.