CN109385425A

CN109385425A - A highly specific ABE base editing system and its application in β-hemoglobinopathies

Info

Publication number: CN109385425A
Application number: CN201811345987.5A
Authority: CN
Inventors: 松阳洲; 梁普平; 黄军就
Original assignee: Sun Yat Sen University
Current assignee: Sun Yat Sen University
Priority date: 2018-11-13
Filing date: 2018-11-13
Publication date: 2019-02-26

Abstract

The present invention provides an ABE base editing system that efficiently targets the HBG1 and HBG2 promoter regions and can be used to treat β-hemoglobinopathies. The system comprises several highly specific gRNAs that target the HBG gene promoter region. When the system is introduced into human somatic cells, it catalyzes the substitution of adenine (A) with guanine (G) at the target site and raises the expression of HBG1 and HBG2, thereby treating β-hemoglobinopathies such as β-thalassemia and sickle cell anemia. The technology has broad application prospects in gene therapy for β-hemoglobinopathies.

Description

A highly specific ABE base editing system and its application in β-hemoglobinopathies

Technical field

The present invention relates to the field of gene modification, and more particularly to a highly specific ABE base editing system and its application in gene editing and gene therapy, especially in the treatment of β-hemoglobinopathies.

Background art

β-hemoglobinopathies are genetic diseases caused by abnormal hemoglobin synthesis in red blood cells, and mainly include β-thalassemia and sickle cell anemia. Adult hemoglobin comprises three types: HbA2 (2.5%), HbF (0.5%) and HbA1 (97%). HbA1, the most abundant, consists of two α-globin subunits and two β-globin subunits. HbF consists of two α-globin subunits and two γ-globin subunits. β-hemoglobinopathies are common monogenic inherited diseases caused by mutations in the β-globin of hemoglobin, whose encoding gene is HBB (human β-globin). γ-globin is encoded by two genes, HBG1 and HBG2. Clinical studies have found that high HbF expression can alleviate the symptoms of β-thalassemia and sickle cell anemia, so raising HbF expression has become one of the approaches to treating β-hemoglobinopathies. Previous work found that mutating one A (underlined in the original) in the sequence GTGTGGGGAAGGGGCCCCCAAG, located in the HBG1 and HBG2 promoter regions, to G can raise γ-globin expression (Wienert, B. et al. KLF1 drives the expression of fetal hemoglobin in British HPFH. Blood 130, 803-807 (2017)).

The ABE base editing system consists of two parts: the TadA:TadA*:Cas9 fusion protein and a gRNA. Guided by the gRNA, the TadA:TadA*:Cas9 fusion protein binds the target site on the DNA. The DNA strand complementary to the gRNA is cut by the Cas9 nuclease, while A bases at positions 4-9 of the non-complementary strand are deaminated by the adenine deaminase (the TadA protein) to form I (inosine, whose base is hypoxanthine). As the DNA replicates, the I base is replaced by G (guanine), achieving an A-to-G base substitution.
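The A-to-G conversion described above can be illustrated with a minimal Python sketch. It is a simplification, not the method of the invention: it assumes positions are 1-based and counted from the 5' end of the protospacer (the text only gives "positions 4-9"), and it converts every A in the window, whereas real editing efficiency differs from base to base. The function name abe_edit is hypothetical, and the example site is the DNA form of the HBG guide sequence quoted in this Background section.

```python
def abe_edit(protospacer: str, window: tuple[int, int] = (4, 9)) -> str:
    """Return the protospacer with every A inside the editing window changed to G.

    Assumption: positions are 1-based and counted from the 5' end of the
    protospacer; the source text only states "positions 4-9".
    """
    start, end = window
    edited = []
    for pos, base in enumerate(protospacer.upper(), start=1):
        if base == "A" and start <= pos <= end:
            # A is deaminated to I, which is read as G after DNA replication.
            edited.append("G")
        else:
            edited.append(base)
    return "".join(edited)


# DNA form of the guide sequence GUGGGGAAGGGGCCCCCAAG quoted in the text.
site = "GTGGGGAAGGGGCCCCCAAG"
print(abe_edit(site))
```

In this sketch the two A bases at positions 7 and 8 fall inside the window and are converted, while the A bases at positions 18 and 19 are left unchanged.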

In 2017, David Liu's group at Harvard University found that the ABE base editing system, combined with a gRNA whose guide sequence is GUGGGGAAGGGGCCCCCAAG (named HBG-GX19 gRNA, SEQ ID NO.6), can mutate the A (underlined in the original) in the sequence GTGTGGGGAAGGGGCCCCCAAG to G. If the patient's own hematopoietic stem cells (HSCs) or bone marrow cells are extracted and the HBG1 and HBG2 promoter regions are edited with the adenine base editing system, the promoter sequences of HBG1 and HBG2 can be changed precisely at the native locus, raising the expression of HBG1 and HBG2; this approach is efficient and safe. Reinfusing the edited HSCs into the patient can then treat the β-hemoglobinopathy.
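As an illustration of how the HBG-GX19 guide relates to the promoter sequence, the following Python sketch converts the guide to its DNA form and locates it within the promoter fragment. The promoter fragment is taken as GTGTGGGGAAGGGGCCCCCAAG, which is an assumed reading of the sequence quoted above; the helper name rna_to_dna is hypothetical.

```python
def rna_to_dna(seq: str) -> str:
    """Convert an RNA sequence to the matching DNA sequence (U -> T)."""
    return seq.upper().replace("U", "T")


# Guide sequence of HBG-GX19 gRNA as quoted in the text (SEQ ID NO.6).
guide_rna = "GUGGGGAAGGGGCCCCCAAG"
# Assumed reading of the HBG1/HBG2 promoter fragment quoted in the text.
promoter_fragment = "GTGTGGGGAAGGGGCCCCCAAG"

protospacer = rna_to_dna(guide_rna)
# str.find returns the 0-based index of the first match, or -1 if absent.
offset = promoter_fragment.find(protospacer)
print(offset, len(protospacer))
```

Under this reading, the 20-nt guide matches the last 20 bases of the 22-nt promoter fragment.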

However, our research found that the HBG-GX19 gRNA (SEQ ID NO.6) provided by David Liu's group, used with the ABE base editing system, has off-target effects. It is therefore necessary to provide a gRNA and ABE base editing system with higher specificity for editing the HBG1 and HBG2 promoter regions, so as to up-regulate γ-globin expression, raise the HbF level, and thereby treat β-hemoglobinopathies.

Summary of the invention

The purpose of the present invention is to address the off-target effects of the above ABE base editing system targeting the HBG1 and HBG2 promoter regions by providing gRNAs with higher specificity and an ABE base editing system with higher editing efficiency. The ABE base editing system provided by the invention edits the HBG1 and HBG2 promoter regions to up-regulate γ-globin expression and raise the HbF level, thereby treating β-hemoglobinopathies. In a specific embodiment, by changing the length of the gRNA, the invention develops the more specific HBG-GGX20 (SEQ ID NO.1), HBG-GX20 (SEQ ID NO.2), HBG-GX17 (SEQ ID NO.3) and HBG-GX16 (SEQ ID NO.4), and combines them with the TadA:TadA*:Cas9 fusion protein of the ABE base editing system to specifically mutate the A (underlined in the original) in the sequence GTGTGGGGAAGGGGCCCCCAAG in the HBG1 and HBG2 promoter regions to G, thereby treating β-hemoglobinopathies.

The above purpose of the present invention is achieved through the following technical solutions:

In a first aspect, the present invention provides a gRNA targeting the HBG1 and HBG2 promoter regions, the nucleotide sequence of the gRNA comprising at least one of SEQ ID NO.1, SEQ ID NO.2, SEQ ID NO.3 and SEQ ID NO.4.

In a specific embodiment of the first aspect, the gRNA further comprises a scaffold sequence, wherein the gRNA scaffold sequence is: GUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGCUUUUUU (SEQ ID NO.7).
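The relationship between a guide sequence and the scaffold can be sketched in Python as follows. This is only an illustration: it assumes the guide is placed directly 5' of the scaffold, which the text does not state explicitly, and it uses the HBG-GX19 guide (SEQ ID NO.6) as the example because it is the only guide sequence spelled out in the text. The function name build_grna is hypothetical.

```python
# gRNA scaffold sequence as given in the text (SEQ ID NO.7), spaces removed.
SCAFFOLD = (
    "GUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAA"
    "AAGUGGCAC"
    "CGAGUCGGUGCUUUUUU"
)


def build_grna(guide: str, scaffold: str = SCAFFOLD) -> str:
    """Join a guide sequence and a scaffold into one full-length gRNA.

    Assumption: the guide sits immediately 5' of the scaffold.
    DNA input is accepted and converted to RNA (T -> U).
    """
    guide = guide.upper().replace("T", "U")
    invalid = set(guide) - set("ACGU")
    if invalid:
        raise ValueError(f"unexpected characters in guide: {sorted(invalid)}")
    return guide + scaffold


# Guide sequence of HBG-GX19 gRNA quoted in the Background (SEQ ID NO.6).
full_grna = build_grna("GUGGGGAAGGGGCCCCCAAG")
print(len(full_grna), full_grna[:30])
```

Any alternative scaffold can be passed through the scaffold parameter, in line with the statement that SEQ ID NO.7 is only one example.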

It should be understood that the gRNA sequences shown in SEQ ID NO.1, SEQ ID NO.2, SEQ ID NO.3 and SEQ ID NO.4 are gRNA guide sequences, and that SEQ ID NO.7 is one specific example that does not limit the scope of protection of the present invention; those skilled in the art may use other alternative scaffold sequences as needed, and these fall within the scope of the present invention.

In a second aspect, the present invention provides an ABE base editing system for targeted editing of the HBG1 and HBG2 promoter regions. The ABE base editing system comprises a TadA:TadA*:Cas9 fusion protein and a gRNA targeting the HBG1 and HBG2 promoter regions, the nucleotide sequence of the gRNA comprising at least one of SEQ ID NO.1, SEQ ID NO.2, SEQ ID NO.3 and SEQ ID NO.4.

In a specific embodiment of the second aspect, the TadA:TadA*:Cas9 fusion protein comprises an effector protein domain of a CRISPR/Cas system and an adenosine deaminase domain.

In a specific embodiment of the second aspect, the TadA:TadA*:Cas9 fusion protein comprises an effector protein domain of a CRISPR/Cas system, a linker peptide and an adenosine deaminase domain.

Those skilled in the art will appreciate that the TadA:TadA*:Cas9 fusion protein of the present invention is a fusion of a Cas9 effector protein with an adenosine deaminase (TadA protein for short). Those skilled in the art may, as needed, use one or more linker peptides to join one Cas9 effector protein domain to one or more TadA proteins to obtain the fusion protein; in a specific embodiment of the invention, the TadA protein is repeated once. It should be understood that the order in which the N- and C-termini of the Cas9 effector protein and the TadA protein are connected is routine in the art, and that the linker peptides include, but are not limited to, linker fragments conventional in the art, typically a GS linker.

Those skilled in the art will appreciate that, in the TadA:TadA*:Cas9 fusion protein, TadA is the abbreviation for the adenosine deaminase, TadA* is the abbreviation for a TadA mutant, and Cas9 is the Cas9 effector protein of the CRISPR/Cas system.

Further, in the effector protein domain of the CRISPR/Cas system, the Cas9 effector protein includes, but is not limited to, the following Cas proteins having no cleavage activity or only single-strand cleavage activity: Streptococcus pyogenes Cas9 (SpCas9), Staphylococcus aureus Cas9 (SaCas9), Lachnospiraceae Cpf1 (LbCpf1), Acidaminococcus Cpf1 (AsCpf1), Streptococcus thermophilus Cas9 (StCas9), Neisseria meningitidis Cas9 (NmCas9) and Francisella Cpf1 (FnCpf1).

In a specific embodiment of the second aspect, the amino acid sequence of the TadA:TadA*:Cas9 fusion protein is a sequence at least 80%, 85%, 90%, 92%, 95%, 96%, 97%, 98%, 99% or 99.5% identical to the amino acid sequence shown in SEQ ID NO.5.
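The identity thresholds above can be made concrete with a short Python sketch. It assumes the two sequences have already been aligned to equal length without gaps and counts identical positions over the full length; the text does not specify an alignment method or how gaps are scored, so this is one simple convention rather than the definition used by the invention. The function names and the short test peptides are hypothetical.

```python
def percent_identity(a: str, b: str) -> float:
    """Percentage of identical positions between two pre-aligned sequences.

    Assumption: both sequences are already aligned, gap-free and of equal
    length; real comparisons normally start from a pairwise alignment.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    if not a:
        raise ValueError("sequences must not be empty")
    matches = sum(x == y for x, y in zip(a.upper(), b.upper()))
    return 100.0 * matches / len(a)


def meets_threshold(a: str, b: str, threshold: float = 80.0) -> bool:
    """Check whether two sequences are at least `threshold` percent identical."""
    return percent_identity(a, b) >= threshold


# Hypothetical 10-residue peptides differing at one position.
print(percent_identity("MKTAYIAKQR", "MKTAYIAKQA"))
print(meets_threshold("MKTAYIAKQR", "MKTAYIAKQA", threshold=95.0))
```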

In a specific embodiment of the second aspect, the TadA:TadA*:Cas9 fusion protein may also be fused to other proteins, protein domains or protein fragments, which include, but are not limited to, one or more of the following 1)-5):

1) maltose-binding protein (MBP), S-tag, Lex A DNA-binding domain (DBD) fusions, GAL4 DNA-binding domain fusions and herpes simplex virus (HSV) VP16 protein fusions;

2) molecular labels, where the labeled TadA:TadA*:Cas9 fusion protein can be used to identify the position of the target sequence;

3) epitope tags, non-limiting examples of which include histidine (His) tags, V5 tags, FLAG tags, influenza virus hemagglutinin (HA) tags, Myc tags, VSV-G tags and thioredoxin (Trx) tags;

4) reporter genes, examples of which include, but are not limited to, glutathione-S-transferase (GST), horseradish peroxidase (HRP), chloramphenicol acetyltransferase (CAT), β-galactosidase, β-glucuronidase, luciferase, green fluorescent protein (GFP), HcRed, DsRed, cyan fluorescent protein (CFP), yellow fluorescent protein (YFP), and autofluorescent proteins including blue fluorescent protein (BFP);

5) protein domains having one or more of the following activities: methylase activity, demethylase activity, transcription activation activity, transcription repression activity, transcription release factor activity, histone modification activity, RNA cleavage activity and nucleic acid binding activity.

It should be understood that the terms "polypeptide", "peptide" and "protein" or "protein fragment" are used interchangeably herein to refer to a polymer of amino acids of any length. The polymer may be linear or branched, may comprise modified amino acids, and may be interrupted by non-amino acids. These terms also cover amino acid polymers that have been modified, for example by disulfide bond formation, glycosylation, lipidation, acetylation, phosphorylation or any other modification, such as conjugation with a labeling component.

In a specific embodiment of the second aspect, the gRNA further comprises a scaffold sequence, wherein the gRNA scaffold sequence is: GUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGCUUUUUU (SEQ ID NO.7).

In a specific embodiment of the second aspect, the gRNA may be prepared by expression from a vector, by in vitro transcription or by chemical synthesis, and may also be chemically modified to improve its editing efficiency, specificity, safety and the like.

In a third aspect, the present invention provides a non-naturally occurring or engineered composition, the composition being one or more of the following 1)-6):

1) a composition comprising one or more vectors, the one or more vectors comprising component I and component II:

Component I comprises a first regulatory element and, operably linked to the first regulatory element, a coding sequence encoding the TadA:TadA*:Cas9 fusion protein of the second aspect; component II comprises a second regulatory element and, operably linked to the second regulatory element, a coding sequence encoding the gRNA of the second aspect; wherein components I and II are located on the same or different vectors;

2) a composition comprising mRNA of the TadA:TadA*:Cas9 fusion protein of the second aspect and a vector, the vector comprising a second regulatory element and, operably linked to the second regulatory element, a coding sequence encoding the gRNA of the second aspect;

3) a composition comprising the TadA:TadA*:Cas9 fusion protein of the second aspect in protein form and a vector, the vector comprising a second regulatory element and, operably linked to the second regulatory element, a coding sequence encoding the gRNA of the second aspect;

4) a composition comprising an expression vector for the TadA:TadA*:Cas9 fusion protein of the second aspect and the gRNA of the second aspect;

5) a composition comprising mRNA of the TadA:TadA*:Cas9 fusion protein of the second aspect and the gRNA of the second aspect;

6) a composition comprising the TadA:TadA*:Cas9 fusion protein of the second aspect in protein form and the gRNA of the second aspect.

Those skilled in the art will appreciate that, in the composition of the third aspect, the TadA:TadA*:Cas9 fusion protein may be in the form of DNA, RNA or protein, and the gRNA may be in the form of DNA or RNA. In the third aspect, the TadA:TadA*:Cas9 fusion protein is either expressed under suitable conditions or provided directly, and the gRNA is either transcribed under suitable conditions or provided directly. The suitable conditions may be intracellular or an in vitro transcription/translation system.

In embodiments of the third aspect, the term "vector" refers to a nucleic acid molecule capable of transporting another nucleic acid molecule to which it is linked. Vectors include, but are not limited to, nucleic acid molecules that are single-stranded, double-stranded or partially double-stranded; nucleic acid molecules that comprise one or more free ends or no free ends (e.g., circular); nucleic acid molecules that comprise DNA, RNA or both; and other varieties of polynucleotides known in the art. One type of vector is a "plasmid", which refers to a circular double-stranded DNA loop into which additional DNA segments can be inserted, for example by standard molecular cloning techniques. Another type of vector is a viral vector, in which virally derived DNA or RNA sequences are present in the vector for packaging into a virus (e.g., retroviruses, replication-defective retroviruses, adenoviruses, replication-defective adenoviruses and adeno-associated viruses). Viral vectors also include polynucleotides carried by a virus for transfection into a host cell. Certain vectors (e.g., bacterial vectors having a bacterial origin of replication, and episomal mammalian vectors) are capable of autonomous replication in the host cell into which they are introduced. Other vectors (e.g., non-episomal mammalian vectors) are integrated into the genome of the host cell upon introduction and are thereby replicated along with the host genome. Moreover, certain vectors are capable of directing the expression of genes to which they are operatively linked; such vectors are referred to herein as "expression vectors". Expression vectors commonly used in recombinant DNA techniques are usually in the form of plasmids.

In general, "operably linked" is intended to mean that, within the vector, the nucleotide sequence of interest is linked to one or more regulatory elements in a manner that allows expression of the nucleotide sequence (optionally in an in vitro transcription/translation system, or optionally in a host cell when the vector is introduced into the host cell).

In embodiments of the third aspect, the term "expression" refers to the process by which a polynucleotide is transcribed from a DNA template (e.g., into mRNA or another RNA transcript) and/or the process by which a transcribed mRNA is subsequently translated into a peptide, polypeptide or protein. Transcripts and encoded polypeptides may be collectively referred to as "gene products". If the polynucleotide is derived from genomic DNA, expression may include splicing of the mRNA in a eukaryotic cell.

The terms "non-naturally occurring" and "engineered" are used interchangeably herein. When referring to nucleic acid molecules or polypeptides, they mean that the nucleic acid molecule or polypeptide is at least substantially free from at least one other component with which it is naturally associated in nature and as found in nature.

In a preferred embodiment of the third aspect, the first regulatory element (in embodiments of the invention, a "regulatory element" may be understood as an "expression cassette") comprises one or more pol III promoters (e.g., 1, 2, 3, 4, 5 or more pol III promoters), one or more pol II promoters (e.g., 1, 2, 3, 4, 5 or more pol II promoters), one or more pol I promoters (e.g., 1, 2, 3, 4, 5 or more pol I promoters), or combinations thereof. In addition, the first regulatory element may also be a T7 promoter, an Sp6 promoter or the like. Examples of pol III promoters include, but are not limited to, the U6 and H1 promoters. Examples of pol II promoters include, but are not limited to, the retroviral Rous sarcoma virus (RSV) LTR promoter (optionally with the RSV enhancer), the cytomegalovirus (CMV) promoter (optionally with the CMV enhancer) [see, e.g., Boshart et al., Cell 41:521-530 (1985)], the SV40 promoter, the dihydrofolate reductase promoter, the β-actin promoter, the phosphoglycerol kinase (PGK) promoter and the EF1α promoter.

In some embodiments of the third aspect, the protein coding sequence is codon optimized for expression in particular cells, such as eukaryotic cells. The eukaryotic cells may be those of or derived from a particular organism, such as a mammal, including but not limited to human, mouse, rat, rabbit, dog or non-human primate. In general, codon optimization refers to a process of modifying a nucleic acid sequence for enhanced expression in the host cells of interest by replacing at least one codon of the native sequence (e.g., about or more than about 1, 2, 3, 4, 5, 10, 15, 20, 25, 50 or more codons) with codons that are more frequently or most frequently used in the genes of that host cell, while maintaining the native amino acid sequence. Various species exhibit particular bias for certain codons of a particular amino acid. Codon bias (differences in codon usage between organisms) often correlates with the efficiency of translation of messenger RNA (mRNA), which is in turn believed to depend on, among other things, the properties of the codons being translated and the availability of particular transfer RNA (tRNA) molecules. The predominance of selected tRNAs in a cell is generally a reflection of the codons used most frequently in peptide synthesis. Accordingly, genes can be tailored for optimal gene expression in a given organism based on codon optimization. Codon usage tables are readily available, for example in the Codon Usage Database, and these tables can be adapted in a number of ways. See Nakamura, Y. et al., "Codon usage tabulated from the international DNA sequence databases: status for the year 2000", Nucl. Acids Res. 28:292 (2000). Computer algorithms for codon optimizing a particular sequence for expression in a particular host cell are also available, such as Gene Forge (Aptagen; Jacobus, PA). In some embodiments, one or more codons (e.g., 1, 2, 3, 4, 5, 10, 15, 20, 25, 50 or more, or all codons) in the sequence encoding the CRISPR enzyme correspond to the codon most frequently used for a particular amino acid.
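The codon replacement described above can be sketched in Python as follows. The sketch uses the standard genetic code and a small table of preferred codons; the entries in PREFERRED are illustrative placeholders, not values taken from the Codon Usage Database or from the invention, and the function names are hypothetical. A real workflow would load a full usage table for the host organism.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TO_AA = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Hypothetical preferred codons for a few amino acids (placeholders only).
PREFERRED = {"L": "CTG", "A": "GCC", "K": "AAG", "G": "GGC", "S": "AGC"}


def translate(cds: str) -> str:
    """Translate a coding sequence into one-letter amino acids ('*' = stop)."""
    cds = cds.upper()
    return "".join(CODON_TO_AA[cds[i:i + 3]] for i in range(0, len(cds), 3))


def codon_optimize(cds: str, preferred: dict[str, str] = PREFERRED) -> str:
    """Replace each codon with the preferred synonymous codon, if one is listed."""
    cds = cds.upper()
    if len(cds) % 3:
        raise ValueError("coding sequence length must be a multiple of 3")
    for aa, codon in preferred.items():
        # Guard: a preferred codon must encode the amino acid it replaces.
        if CODON_TO_AA[codon] != aa:
            raise ValueError(f"{codon} does not encode {aa}")
    out = []
    for i in range(0, len(cds), 3):
        codon = cds[i:i + 3]
        out.append(preferred.get(CODON_TO_AA[codon], codon))
    return "".join(out)


original = "ATGTTAGCAAAAGGTTCATAA"  # hypothetical toy coding sequence
optimized = codon_optimize(original)
print(optimized, translate(original) == translate(optimized))
```

The key property is that the nucleotide sequence changes while the encoded amino acid sequence stays the same.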

In a preferred embodiment of the invention, the second regulatory element includes, but is not limited to, one or more pol II promoters (e.g., 1, 2, 3, 4, 5 or more pol II promoters), one or more pol I promoters (e.g., 1, 2, 3, 4, 5 or more pol I promoters), or combinations thereof, as well as pol III promoters (including but not limited to the U6 and H1 promoters), the T7 promoter, the Sp6 promoter and the like.

In a preferred embodiment of the invention, the first or second regulatory element further comprises enhancers, internal ribosomal entry sites (IRES) and other expression control elements (e.g., transcription termination signals, such as polyadenylation signals and poly-U sequences). Such regulatory sequences are described, for example, in Goeddel, Gene Expression Technology: Methods in Enzymology 185, Academic Press, San Diego, California (1990). Regulatory elements include those that direct constitutive expression of a nucleotide sequence in many types of host cell and those that direct expression of the nucleotide sequence only in certain host cells (e.g., tissue-specific regulatory sequences). A tissue-specific promoter may direct expression primarily in a desired tissue of interest, such as muscle, neuron, bone, skin, blood, specific organs (e.g., liver, pancreas) or particular cell types (e.g., lymphocytes). Regulatory elements may also direct expression in a temporal-dependent manner, such as in a cell-cycle-dependent or developmental-stage-dependent manner, which may or may not also be tissue or cell-type specific.

In a preferred embodiment of the invention, the second regulatory element is a T7 promoter.

In a fourth aspect, the present invention provides a method for specifically editing the human HBG1 and HBG2 promoter regions in cells or in the human body, the method comprising delivering the ABE base editing system of the second aspect or the composition of the third aspect and bringing the ABE base editing system or composition into contact with the gene sequence of the human HBG1 and HBG2 promoter regions, thereby obtaining edited human HBG1 and HBG2 promoters.

In embodiments of the invention, the phrase "bringing the ABE base editing system or composition into contact with the gene sequence of the human HBG1 and HBG2 promoter regions" means delivering the components into an in vitro or in vivo environment; the in vitro environment is, for example, a reaction solution prepared by those skilled in the art according to the needs of the particular experiment, and the in vivo environment is, for example, the interior of a cell. The term "contact" (proximity) means that, in the in vitro or in vivo environment, each component can come into contact with the gene sequence of the human HBG1 and HBG2 promoter regions and, under certain conditions, undergo the reaction that those skilled in the art would expect.

In an embodiment, the present invention provides methods comprising delivering into cells or the human body one or more polynucleotides, one or more vectors, one or more transcripts thereof, and/or one or more proteins expressed therefrom. In some aspects, the invention further provides cells produced by such methods.

In an embodiment of the invention, the ABE base editing enzyme complexed with the gRNA is delivered into cells or the human body. Conventional viral and non-viral gene transfer methods can be used to introduce the base editing system of the second aspect or the composition of the third aspect into human cells or the human body.

Such methods can be used to administer the composition of the third aspect to cells in culture or in an individual. Non-viral vector delivery systems include DNA plasmids, RNA (e.g., a transcript of a vector described herein), proteins, naked nucleic acids, and nucleic acids complexed with a delivery vehicle such as a liposome. Viral vector delivery systems include DNA and RNA viruses, which have either episomal or integrated genomes after delivery to the cell. For a review of gene delivery systems, see Anderson, Science 256:808-813 (1992); Nabel & Felgner, TIBTECH 11:211-217 (1993); Mitani & Caskey, TIBTECH 11:162-166 (1993); Dillon, TIBTECH 11:167-175 (1993); Miller, Nature 357:455-460 (1992); Van Brunt, Biotechnology 6(10):1149-1154 (1988); Vigne, Restorative Neurology and Neuroscience 8:35-36 (1995); Kremer & Perricaudet, British Medical Bulletin 51(1):31-44 (1995); Haddada et al., in Current Topics in Microbiology and Immunology, Doerfler and Böhm (eds.) (1995); and Yu et al., Gene Therapy 1:13-26 (1994).

Non-viral methods for delivering nucleic acids include lipofection, nucleofection, microinjection, electroporation, biolistics, virosomes, liposomes, immunoliposomes, polycation or lipid:nucleic acid conjugates, naked DNA, artificial virions and agent-enhanced uptake of DNA. Lipofection is described in, e.g., U.S. Patent Nos. 5,049,386, 4,946,787 and 4,897,355, and lipofection reagents are commercially available (e.g., Transfectam™ and Lipofectin™). Cationic and neutral lipids suitable for efficient receptor-recognition lipofection of polynucleotides include those of Felgner, WO 91/17424 and WO 91/16024. Delivery can be to cells (e.g., in vitro or ex vivo administration) or to target tissues (e.g., in vivo administration).

The preparation of lipid:nucleic acid complexes (including targeted liposomes such as immunolipid complexes) is well known to those skilled in the art (see, e.g., Crystal, Science 270:404-410 (1995); Blaese et al., Cancer Gene Ther. 2:291-297 (1995); Behr et al., Bioconjugate Chem. 5:382-389 (1994); Remy et al., Bioconjugate Chem. 5:647-654 (1994); Gao et al., Gene Therapy 2:710-722 (1995); Ahmad et al., Cancer Res. 52:4817-4820 (1992); U.S. Patent Nos. 4,186,183, 4,217,344, 4,235,871, 4,261,975, 4,485,054, 4,501,728, 4,774,085, 4,837,028 and 4,946,787).

In a preferred embodiment of the invention, a eukaryotic expression vector, prokaryotic expression vector or in vitro transcription vector for the TadA:TadA*:Cas9 fusion protein of the second aspect is first constructed. The eukaryotic expression vector is then amplified and a transfection-grade, endotoxin-free preparation is made; alternatively, the prokaryotic expression vector is transformed into Escherichia coli and expression of the base editing enzyme is induced to prepare the TadA:TadA*:Cas9 fusion protein. The DNA encoding the TadA:TadA*:Cas9 fusion protein can also be cloned into an in vitro transcription vector containing a T7 or Sp6 promoter, and mRNA of the fusion protein prepared by in vitro transcription.

In a preferred embodiment of the invention, the gRNA of the second aspect is prepared as follows. The gRNA can be synthesized directly; or the coding sequence of the gRNA can be cloned into a transcription vector containing a T7 or Sp6 promoter (e.g., pDR274 from Addgene), or a T7 or Sp6 promoter can be added directly upstream of the DNA encoding the gRNA by PCR, annealing, synthesis or similar means, and the gRNA then prepared by in vitro transcription; or the coding sequence of the gRNA can be cloned into the eukaryotic expression vector pGEM-T-U6-gRNA. The pGEM-T-U6-gRNA vector is obtained by PCR-amplifying the U6 promoter and the gRNA scaffold sequence from the pX330 vector (Addgene) and ligating the PCR product into the pGEM-T vector (Promega) by TA cloning.
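The option of adding a T7 promoter upstream of the gRNA coding DNA can be sketched in Python as follows. This is an illustration under stated assumptions, not the protocol of the invention: it uses the commonly cited T7 promoter core TAATACGACTCACTATA (positions -17 to -1), assumes the transcript begins at the first base after that core, and checks that this base is G because T7 RNA polymerase initiates most efficiently at G. The function name ivt_template is hypothetical, and the example guide is the HBG-GX19 sequence quoted in the Background.

```python
# Assumed T7 promoter core (positions -17 to -1); transcription starts at +1.
T7_UPSTREAM = "TAATACGACTCACTATA"

# gRNA scaffold sequence as given in the text (SEQ ID NO.7), spaces removed.
SCAFFOLD_RNA = (
    "GUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAA"
    "AAGUGGCACCGAGUCGGUGCUUUUUU"
)


def ivt_template(guide_rna: str, scaffold_rna: str = SCAFFOLD_RNA) -> str:
    """Return the sense-strand DNA template: T7 promoter + gRNA coding DNA."""
    transcript_dna = (guide_rna + scaffold_rna).upper().replace("U", "T")
    if not transcript_dna.startswith("G"):
        # Illustrative check only: a G at +1 favors efficient T7 initiation.
        raise ValueError("transcript should start with G for T7 transcription")
    return T7_UPSTREAM + transcript_dna


template = ivt_template("GUGGGGAAGGGGCCCCCAAG")
print(template[:47])
```

A guide that does not begin with G would need an added 5' G or a different promoter; how the invention handles that case is not stated in this passage.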

In a preferred embodiment of the invention, the DNA of the base editing enzyme of the second aspect (or its transcribed mRNA or translated protein) and the expression vector for the gRNA (or the transcribed or synthesized gRNA) are introduced into human cells by means including, but not limited to, lipofection, viral infection, electroporation and microinjection.

Further, the viral vectors used for viral infection include, but are not limited to, retroviral vectors, lentiviral vectors, adenoviral vectors and adeno-associated viral vectors.

In a preferred embodiment of the invention, the human cells include, but are not limited to, bone marrow cells, hematopoietic stem cells (HSCs), embryonic stem cells (ESCs), induced pluripotent stem cells (iPSCs) and hematopoietic progenitor cells.

It should be understood that, when the ABE base editing system is brought into contact with the gene sequence of the human HBG1 and HBG2 promoter regions, the TadA:TadA*:Cas9 fusion protein binds the gRNA to form a complex. Guided by the gRNA, the fusion protein in this complex binds the target site on the DNA. The DNA strand complementary to the gRNA is cut by the Cas9 nuclease, while A bases at certain positions on the non-complementary strand are deaminated by the adenine deaminase (the TadA protein) to form I (inosine, whose base is hypoxanthine). As the DNA replicates, the I base is replaced by G (guanine), so that the A (underlined in the original) in the sequence GTGTGGGGAAGGGGCCCCCAAG in the HBG1 and HBG2 promoter regions is specifically mutated to G, thereby treating β-hemoglobinopathies.

In a fifth aspect, the present invention provides a eukaryotic host cell comprising the ABE base editing system for targeted editing of the human HBG1 and HBG2 promoter regions of the second aspect, or the non-naturally occurring or engineered composition of the third aspect.

In an embodiment of the invention, the eukaryotic host cell includes, but is not limited to, bone marrow cells, hematopoietic stem cells (HSCs), embryonic stem cells (ESCs), induced pluripotent stem cells (iPSCs) and hematopoietic progenitor cells.

In a sixth aspect, the present invention provides a kit comprising one or more of: the gRNA of the first aspect, the ABE base editing system of the second aspect, the non-naturally occurring or engineered composition of the third aspect, and the eukaryotic host cell of the fifth aspect.

In an embodiment of the invention, the kit further comprises conventional accompanying reaction reagents and/or reaction equipment. For example, the kit may provide one or more reaction or storage buffers. Reagents may be provided in a form that is usable in a particular assay, or in a form that requires the addition of one or more other components before use (e.g., in concentrated or lyophilized form). The buffer may be any buffer, including but not limited to sodium carbonate buffer, sodium bicarbonate buffer, borate buffer, Tris buffer, MOPS buffer, HEPES buffer and combinations thereof. In some embodiments, the buffer is alkaline. In some embodiments, the buffer has a pH from about 7 to about 10. In some embodiments, the kit comprises one or more oligonucleotides, and the one or more oligonucleotides comprise at least one gRNA sequence of the first aspect.

Each component of the kit of the present invention may be provided individually or in combination, and may be provided in any suitable container, such as a vial, bottle, or tube.

A seventh aspect of the present invention provides the use of one or more of the gRNA sequence provided in the first aspect, the ABE base editing system for targeted editing of the human HBG1 and HBG2 promoter regions described in the second aspect, the non-naturally occurring or engineered composition provided in the third aspect, the eukaryotic host cell provided in the fifth aspect, and the kit provided in the sixth aspect in the prevention, amelioration, and/or treatment of β-hemoglobinopathies.

In an embodiment of the present invention, the β-hemoglobinopathy includes β-thalassemia and sickle cell anemia.

An eighth aspect of the present invention provides that the ABE base editing system for targeted editing of the human HBG1 and HBG2 promoter regions described in the second aspect, the non-naturally occurring or engineered composition provided in the third aspect, and the eukaryotic host cell provided in the fifth aspect are applied by one or more of the following methods:

1) the gRNA sequence provided in the first aspect, the ABE base editing system for targeted editing of the human HBG1 and HBG2 promoter regions described in the second aspect, the non-naturally occurring or engineered composition provided in the third aspect, or the eukaryotic host cell provided in the fifth aspect is applied alone;

2) the gRNA sequence provided in the first aspect, the ABE base editing system for targeted editing of the human HBG1 and HBG2 promoter regions described in the second aspect, the non-naturally occurring or engineered composition provided in the third aspect, or the eukaryotic host cell provided in the fifth aspect is applied in combination with one or more of surgery, biological therapy, and immunotherapy;

3) by in vivo delivery, the gRNA sequence provided in the first aspect, the ABE base editing system for targeted editing of the human HBG1 and HBG2 promoter regions described in the second aspect, the non-naturally occurring or engineered composition provided in the third aspect, or the eukaryotic host cell provided in the fifth aspect is delivered directly into the patient for treatment;

4) the gRNA sequence provided in the first aspect, the ABE base editing system for targeted editing of the human HBG1 and HBG2 promoter regions described in the second aspect, or the non-naturally occurring or engineered composition provided in the third aspect is first introduced into host cells by in vitro transfection, and the host cells containing the gRNA sequence, the ABE base editing system, or the composition are then infused back into the patient for treatment;

5) the eukaryotic host cell provided in the fifth aspect is infused into the patient for treatment.

A ninth aspect of the present invention provides a method for treating β-hemoglobinopathies, comprising:

1) treating the patient with the gRNA sequence provided in the first aspect, the ABE base editing system for targeted editing of the human HBG1 and HBG2 promoter regions described in the second aspect, the non-naturally occurring or engineered composition provided in the third aspect, or the eukaryotic host cell provided in the fifth aspect alone;

2) treating the patient with the gRNA sequence provided in the first aspect, the ABE base editing system for targeted editing of the human HBG1 and HBG2 promoter regions described in the second aspect, the non-naturally occurring or engineered composition provided in the third aspect, or the eukaryotic host cell provided in the fifth aspect in combination with one or more of surgery, biological therapy, and immunotherapy;

5) infusing the eukaryotic host cell provided in the fifth aspect into the patient for treatment.

In a specific embodiment of the invention, fragments expressing the HBG-GGX20 (SEQ ID NO.1), HBG-GX20 (SEQ ID NO.2), HBG-GX17 (SEQ ID NO.3), and HBG-GX16 (SEQ ID NO.4) sequences are cloned into a gRNA expression vector, and each is individually co-transfected with the pcDNA3.1-ABE7.10 vector (SEQ ID NO.9) into 293T cells to produce the gRNA and the TadA:TadA*:Cas9 fusion protein. Under the guidance of the gRNA, the TadA:TadA*:Cas9 fusion protein binds the target site on the DNA. The DNA strand complementary to the gRNA can be cut by the Cas9 nuclease, while the A bases at positions 4-9 on the non-complementary strand can be deaminated by the adenine deaminase (the TadA protein) to form I bases. As the DNA replicates, the I (hypoxanthine, inosine) base is replaced by a G (guanine) base, which achieves the A-to-G base substitution and specifically mutates the A marked by underlining in the GTGTGGGGAAGGGGCCCCCAAG sequence of the HBG1 and HBG2 promoter regions to G. Deep sequencing comparison shows that, relative to HBG-GX19 (SEQ ID NO.6), these gRNAs target the ABE base editing system more specifically to the HBG1 and HBG2 promoter regions, accurately convert the A at the target site to G, and reduce the off-target effect of the ABE base editing system.
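The editing window mentioned above can be illustrated with a minimal Python sketch. It assumes that the protospacer is the last 20 nt of the promoter sequence quoted in the text and that position 1 is the 5' (PAM-distal) end of the non-complementary strand; both the numbering convention and the function names are illustrative assumptions, not part of the described method.

```python
# Assumed protospacer: the last 20 nt of GTGTGGGGAAGGGGCCCCCAAG, written 5'->3'
# on the non-complementary strand (an assumption made for illustration only).
PROTOSPACER = "GTGGGGAAGGGGCCCCCAAG"


def editable_adenines(protospacer, window=(4, 9)):
    """Return the 1-based positions of A bases that fall inside the window."""
    first, last = window
    return [
        pos
        for pos, base in enumerate(protospacer, start=1)
        if base == "A" and first <= pos <= last
    ]


def apply_a_to_g(protospacer, positions):
    """Return the sequence after A->G conversion at the given 1-based positions."""
    bases = list(protospacer)
    for pos in positions:
        if bases[pos - 1] != "A":
            raise ValueError(f"position {pos} is not an A")
        bases[pos - 1] = "G"
    return "".join(bases)


if __name__ == "__main__":
    sites = editable_adenines(PROTOSPACER)
    print("A positions inside the window:", sites)
    print("fully edited sequence:", apply_a_to_g(PROTOSPACER, sites))
```

The sketch only enumerates candidate positions; which of them is actually edited, and how efficiently, is an experimental result that the code does not model.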

The ABE base editing system of the present invention has the following advantages:

Compared with the previously reported HBG-GX19 (SEQ ID NO.6), the technical solution provided by the present invention targets the ABE base editing system more specifically to the promoter regions of HBG1 and HBG2, accurately converts the A at the target site to G, and reduces the off-target effect of the ABE base editing system. It thereby upregulates γ-globin expression more efficiently, raises the HbF level, and treats β-hemoglobinopathies more effectively. The high-specificity system provided by the present invention has broad application prospects in the field of gene therapy for β-hemoglobinopathies.

Brief description of the drawings

Fig. 1 shows verification of the off-target sites of HBG-GX19 gRNA by deep sequencing of PCR products. Of the 18 off-target sites, 6 could be verified and are marked with *. The 6 sites are HBG OT1, HBG OT3, HBG OT4, HBG OT5, HBG OT6, and HBG OT9. The abscissa indicates the A base editing efficiency.

Fig. 2 compares, by deep sequencing of PCR products, the editing efficiency of HBG-GGX20, HBG-GX20, HBG-GX19, HBG-GX17, and HBG-GX16 at 10 off-target sites and at the target site. The abscissa indicates the A base editing efficiency.

Fig. 3 compares the relative activity of HBG-GGX20, HBG-GX20, HBG-GX19, HBG-GX17, and HBG-GX16 at 10 off-target sites and at the target site. The relative activity is the ratio obtained by dividing the editing efficiency of each gRNA at each site by the editing activity of HBG-GX19 gRNA at the target site, and the ratios are displayed as a heat map. The figure shows that HBG-GGX20, HBG-GX17, and HBG-GX16 gRNA have higher specificity than HBG-GX19 gRNA.

Fig. 4 shows the coding nucleotide sequences of HBG-GGX20, HBG-GX20, HBG-GX19, HBG-GX17, and HBG-GX16 and the nucleotide sequence of the HBG promoter target site.
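The relative activity described for Fig. 3 can be sketched as a simple normalization. The following Python example is only an illustration: the function name, the data layout, and all numbers are hypothetical and are not measurements from this document.

```python
def relative_activity(efficiency, reference_grna="HBG-GX19", target_site="target"):
    """Divide every editing efficiency by the reference gRNA's on-target efficiency.

    `efficiency` maps gRNA name -> {site name -> editing efficiency in percent}.
    """
    reference = efficiency[reference_grna][target_site]
    if reference <= 0:
        raise ValueError("reference on-target efficiency must be positive")
    return {
        grna: {site: value / reference for site, value in sites.items()}
        for grna, sites in efficiency.items()
    }


# Hypothetical editing efficiencies (percent); NOT data from the figures.
example_efficiency = {
    "HBG-GX19": {"target": 40.0, "OT1": 10.0},
    "HBG-GX17": {"target": 30.0, "OT1": 1.0},
}

if __name__ == "__main__":
    for grna, sites in relative_activity(example_efficiency).items():
        print(grna, sites)
```

With this normalization the reference gRNA has a relative activity of 1 at the target site, so a gRNA with a high on-target ratio and low off-target ratios reads directly as more specific.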

Specific embodiment

The present invention is further described below with reference to the drawings and specific embodiments, but the embodiments do not limit the invention in any way.

Unless otherwise stated, the reagents, methods, and equipment used in the present invention are conventional in the art. Unless otherwise stated, the reagents and materials used in the following examples are commercially available. Test methods for which no specific conditions are given are generally carried out under conventional conditions or under the conditions recommended by the manufacturer.

In specific embodiments, the present invention provides gRNAs with higher specificity for editing the HBG1 and HBG2 promoter regions, together with an ABE base editing system, methods, kits, and applications thereof, in order to upregulate γ-globin expression and raise the HbF level, thereby treating β-hemoglobinopathies.

The method provided by the present invention for editing the HBG1 and HBG2 promoter regions uses the ABE base editing system or kit provided by the present invention for editing the HBG1 and HBG2 promoter regions. The method includes, but is not limited to, one or more of the following steps:

Comparative example

The steps of the comparative example of the present invention follow the article published by the David Liu group (Gaudelli, N.M. et al. Programmable base editing of A*T to G*C in genomic DNA without DNA cleavage. Nature 551, 464-471 (2017)), combined with the method invented by our group for detecting off-target effects of adenine single-base editing systems based on genome sequencing (EndoV-seq; see in particular Table 2 and Fig. 6 of the examples of the applicant's patent application CN 2018111602309 and the related experimental steps for detecting and analyzing off-target effects). The off-target effect of HBG-GX19 gRNA (SEQ ID NO.6) with the ABE base editing system was detected by gene synthesis, molecular cloning, cell experiments, nucleic acid purification, deep sequencing of PCR products, bioinformatic analysis, and related techniques, specifically as follows:

Sequencing of the combined effect of HBG-GX19 gRNA (SEQ ID NO.6) and the ABE base editing system

A fragment expressing HBG-GX19 gRNA (the nucleotide sequence of the gRNA guide sequence is SEQ ID NO.6) was cloned into the gRNA expression vector (pUC19-SpCas9-gRNA, nucleotide sequence SEQ ID NO.8, constructed in our laboratory) and co-transfected with the pcDNA3.1-ABE7.10 expression vector (synthesized by Guangzhou Ai Ji Biotechnology Co., Ltd., nucleotide sequence SEQ ID NO.9) into 293T cells, and the cells were collected after 48 h. Genomic DNA was extracted with a genomic DNA extraction kit (DNeasy Blood & Tissue Kit, Qiagen) strictly according to the manufacturer's instructions. The target site and the off-target sites were then amplified by PCR with the primers in Table 1, and the PCR products were used for deep sequencing.

As shown in Fig. 1, deep sequencing revealed that, for HBG, off-target editing occurred at 6 of the 18 off-target sites. These results indicate that the ABE base editing system guided by HBG-GX19 gRNA has off-target effects.
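As a rough illustration of how an A base editing efficiency could be read from deep sequencing data, the following Python sketch counts the fraction of reads carrying G at a chosen A position. The definition used here (edited reads divided by all reads covering the position), the function name, and the toy reads are assumptions made for illustration; the document does not specify its analysis pipeline.

```python
def a_to_g_efficiency(reads, position):
    """Percentage of reads showing G at a 1-based position whose reference base is A.

    Reads are assumed to be already aligned to the amplicon so that index 0 of
    every read corresponds to position 1 of the reference window.
    """
    covered = [read for read in reads if len(read) >= position]
    if not covered:
        raise ValueError("no read covers the requested position")
    edited = sum(1 for read in covered if read[position - 1] == "G")
    return 100.0 * edited / len(covered)


# Hypothetical aligned reads over a toy reference window "GGAAGG".
example_reads = ["GGAAGG", "GGGAGG", "GGAAGG", "GGGAGG", "GGAAGG"]

if __name__ == "__main__":
    print(a_to_g_efficiency(example_reads, position=3))
```

A real analysis would also need read alignment, quality filtering, and a control sample to separate editing from sequencing error, none of which is shown here.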

Effect example

Sequencing and comparison of the combined effect of the gRNAs provided by the present invention and the ABE base editing system

The HBG-GX19 gRNA of the comparative example was replaced with HBG-GGX20 (the nucleotide sequence of the gRNA guide sequence is SEQ ID NO.1), HBG-GX20 (the nucleotide sequence of the gRNA guide sequence is SEQ ID NO.2), HBG-GX17 (nucleotide sequence SEQ ID NO.3), or HBG-GX16 (nucleotide sequence SEQ ID NO.4). The gRNA scaffold sequence is SEQ ID NO.7: GUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGCUUUUUU (Gaudelli, N.M. et al. Programmable base editing of A*T to G*C in genomic DNA without DNA cleavage. Nature 551, 464-471 (2017)).

Their base editing effects were verified separately:

Gene sequences encoding HBG-GGX20 (the nucleotide sequence of the gRNA guide sequence is SEQ ID NO.1), HBG-GX20 (the nucleotide sequence of the gRNA guide sequence is SEQ ID NO.2), HBG-GX17 (the nucleotide sequence of the gRNA guide sequence is SEQ ID NO.3), and HBG-GX16 (the nucleotide sequence of the gRNA guide sequence is SEQ ID NO.4) were cloned into the gRNA expression vector (pUC19-SpCas9-gRNA, nucleotide sequence SEQ ID NO.8, constructed in our laboratory). Each was individually co-transfected with the pcDNA3.1-ABE7.10 vector (synthesized by Guangzhou Ai Ji Biotechnology Co., Ltd., nucleotide sequence SEQ ID NO.9) into 293T cells, and the cells were collected after 48 h. Genomic DNA was extracted with a genomic DNA extraction kit (DNeasy Blood & Tissue Kit, Qiagen) strictly according to the manufacturer's instructions. The target site and the 10 off-target sites OT1-OT10 were then amplified by PCR with the primers in Table 1, and the PCR products were used for deep sequencing.
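To make the relationship between the four guide sequences, the scaffold, and the promoter target concrete, the following Python sketch joins each guide (SEQ ID NO.1-4, as given in the sequence listing) to the scaffold (SEQ ID NO.7) and counts how many 3'-terminal bases of each guide match the promoter sequence GTGTGGGGAAGGGGCCCCCAAG. The helper names and the suffix-matching approach are illustrative assumptions, not a procedure stated in the document.

```python
TARGET_DNA = "GTGTGGGGAAGGGGCCCCCAAG"  # promoter sequence quoted in the text

# gRNA scaffold, SEQ ID NO.7
SCAFFOLD_RNA = (
    "GUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGC"
    "UUUUUU"
)

# Guide sequences, SEQ ID NO.1-4
GUIDES_RNA = {
    "HBG-GGX20": "GGGUGGGGAAGGGGCCCCCAAG",
    "HBG-GX20": "GGUGGGGAAGGGGCCCCCAAG",
    "HBG-GX17": "GGGGAAGGGGCCCCCAAG",
    "HBG-GX16": "GGGAAGGGGCCCCCAAG",
}


def matched_suffix_length(guide_rna, target_dna=TARGET_DNA):
    """Count the 3'-terminal guide bases that match the 3' end of the target."""
    guide_dna = guide_rna.replace("U", "T")
    matched = 0
    for guide_base, target_base in zip(reversed(guide_dna), reversed(target_dna)):
        if guide_base != target_base:
            break
        matched += 1
    return matched


def full_sgrna(guide_rna):
    """Join a guide sequence to the scaffold, 5' guide followed by 3' scaffold."""
    return guide_rna + SCAFFOLD_RNA


if __name__ == "__main__":
    for name, guide in GUIDES_RNA.items():
        print(name, len(guide), matched_suffix_length(guide), len(full_sgrna(guide)))
```

Note that for HBG-GX17 and HBG-GX16 the whole guide matches the target, so this simple count cannot tell a 5' G added by design apart from a genomic G.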

As shown in Fig. 3, deep sequencing revealed that HBG-GGX20 (SEQ ID NO.1), HBG-GX17 (SEQ ID NO.3), and HBG-GX16 (SEQ ID NO.4) have higher specificity than HBG-GX19 gRNA (SEQ ID NO.6). HBG-GX20 (SEQ ID NO.2), by contrast, shows both higher editing efficiency at the target site and higher off-target efficiency.

It should be pointed out that those of ordinary skill in the art can make several improvements and modifications without departing from the principle of the embodiments of the present invention, and these improvements and modifications are also regarded as falling within the protection scope of the embodiments of the present invention. When the HBG-GGX20 (SEQ ID NO.1), HBG-GX20 (SEQ ID NO.2), HBG-GX17 (SEQ ID NO.3), and HBG-GX16 (SEQ ID NO.4) gRNAs and the ABE base editing system are introduced into hematopoietic stem cells, it can be expected, on the basis of common knowledge in the art, that the promoters of the HBG1 and HBG2 genes in the hematopoietic stem cells can be edited and γ-globin expression upregulated.

Table 1. Potential off-target sites of HBG-GX19 gRNA and the primers used for deep sequencing of their PCR products

The base sequences of SEQ ID NO.8 and SEQ ID NO.9 of the present invention are given below (they are the sequences of a plasmid vector constructed in our laboratory and of a commercial plasmid vector, and are therefore not included in the subsequent sequence listing):

SEQ ID NO.8:

TCGCGCGTTTCGGTGATGACGGTGAAAACCTCTGACACATGCAGCTCCCGGAGACGGTCACAGCTTGTC TGTAAGCGGATGCCGGGAGCAGACAAGCCCGTCAGGGCGCGTCAGCGGGTGTTGGCGGGTGTCGGGGCTGGCTTAAC TATGCGGCATCAGAGCAGATTGTACTGAGAGTGCACCATATGCGGTGTGAAATACCGCACAGATGCGTAAGGAGAAA ATACCGCATCAGGCGCCATTCGCCATTCAGGCTGCGCAACTGTTGGGAAGGGCGATCGGTGCGGGCCTCTTCGCTAT TACGCCAGCTGGCGAAAGGGGGATGTGCTGCAAGGCGATTAAGTTGGGTAACGCCAGGGTTTTCCCAGTCACGACGT TGTAAAACGACGGCCAGTGAATTCGAGGGCCTATTTCCCATGATTCCTTCATATTTGCATATACGATACAAGGCTGT TAGAGAGATAATTGGAATTAATTTGACTGTAAACACAAAGATATTAGTACAAAATACGTGACGTAGAAAGTAATAAT TTCTTGGGTAGTTTGCAGTTTTAAAATTATGTTTTAAAATGGACTATCATATGCTTACCGTAACTTGAAAGTATTTC GATTTCTTGGCTTTATATATCTTGTGGAAAGGACGAAACACCGGGTCTTCGAGAAGACCTGTTTTAGAGCTAGAAAT AGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTTTTTCTAGAGTCGACC TGCAGGCATGCAAGCTTGGCGTAATCATGGTCATAGCTGTTTCCTGTGTGAAATTGTTATCCGCTCACAATTCCACA CAACATACGAGCCGGAAGCATAAAGTGTAAAGCCTGGGGTGCCTAATGAGTGAGCTAACTCACATTAATTGCGTTGC GCTCACTGCCCGCTTTCCAGTCGGGAAACCTGTCGTGCCAGCTGCATTAATGAATCGGCCAACGCGCGGGGAGAGGC GGTTTGCGTATTGGGCGCTCTTCCGCTTCCTCGCTCACTGACTCGCTGCGCTCGGTCGTTCGGCTGCGGCGAGCGGT ATCAGCTCACTCAAAGGCGGTAATACGGTTATCCACAGAATCAGGGGATAACGCAGGAAAGAACATGTGAGCAAAAG GCCAGCAAAAGGCCAGGAACCGTAAAAAGGCCGCGTTGCTGGCGTTTTTCCATAGGCTCCGCCCCCCTGACGAGCAT CACAAAAATCGACGCTCAAGTCAGAGGTGGCGAAACCCGACAGGACTATAAAGATACCAGGCGTTTCCCCCTGGAAG CTCCCTCGTGCGCTCTCCTGTTCCGACCCTGCCGCTTACCGGATACCTGTCCGCCTTTCTCCCTTCGGGAAGCGTGG CGCTTTCTCATAGCTCACGCTGTAGGTATCTCAGTTCGGTGTAGGTCGTTCGCTCCAAGCTGGGCTGTGTGCACGAA CCCCCCGTTCAGCCCGACCGCTGCGCCTTATCCGGTAACTATCGTCTTGAGTCCAACCCGGTAAGACACGACTTATC GCCACTGGCAGCAGCCACTGGTAACAGGATTAGCAGAGCGAGGTATGTAGGCGGTGCTACAGAGTTCTTGAAGTGGT GGCCTAACTACGGCTACACTAGAAGGACAGTATTTGGTATCTGCGCTCTGCTGAAGCCAGTTACCTTCGGAAAAAGA GTTGGTAGCTCTTGATCCGGCAAACAAACCACCGCTGGTAGCGGTGGTTTTTTTGTTTGCAAGCAGCAGATTACGCG CAGAAAAAAAGGATCTCAAGAAGATCCTTTGATCTTTTCTACGGGGTCTGACGCTCAGTGGAACGAAAACTCACGTT AAGGGATTTTGGTCATGAGATTATCAAAAAGGATCTTCACCTAGATCCTTTTAAATTAAAAATGAAGTTTTAAATCA 
ATCTAAAGTATATATGAGTAAACTTGGTCTGACAGTTACCAATGCTTAATCAGTGAGGCACCTATCTCAGCGATCTG TCTATTTCGTTCATCCATAGTTGCCTGACTCCCCGTCGTGTAGATAACTACGATACGGGAGGGCTTACCATCTGGCC CCAGTGCTGCAATGATACCGCGAGACCCACGCTCACCGGCTCCAGATTTATCAGCAATAAACCAGCCAGCCGGAAGG GCCGAGCGCAGAAGTGGTCCTGCAACTTTATCCGCCTCCATCCAGTCTATTAATTGTTGCCGGGAAGCTAGAGTAAG TAGTTCGCCAGTTAATAGTTTGCGCAACGTTGTTGCCATTGCTACAGGCATCGTGGTGTCACGCTCGTCGTTTGGTA TGGCTTCATTCAGCTCCGGTTCCCAACGATCAAGGCGAGTTACATGATCCCCCATGTTGTGCAAAAAAGCGGTTAGC TCCTTCGGTCCTCCGATCGTTGTCAGAAGTAAGTTGGCCGCAGTGTTATCACTCATGGTTATGGCAGCACTGCATAA TTCTCTTACTGTCATGCCATCCGTAAGATGCTTTTCTGTGACTGGTGAGTACTCAACCAAGTCATTCTGAGAATAGT GTATGCGGCGACCGAGTTGCTCTTGCCCGGCGTCAATACGGGATAATACCGCGCCACATAGCAGAACTTTAAAAGTG CTCATCATTGGAAAACGTTCTTCGGGGCGAAAACTCTCAAGGATCTTACCGCTGTTGAGATCCAGTTCGATGTAACC CACTCGTGCACCCAACTGATCTTCAGCATCTTTTACTTTCACCAGCGTTTCTGGGTGAGCAAAAACAGGAAGGCAAA ATGCCGCAAAAAAGGGAATAAGGGCGACACGGAAATGTTGAATACTCATACTCTTCCTTTTTCAATATTATTGAAGC ATTTATCAGGGTTATTGTCTCATGAGCGGATACATATTTGAATGTATTTAGAAAAATAAACAAATAGGGGTTCCGCG CACATTTCCCCGAAAAGTGCCACCTGACGTCTAAGAAACCATTATTATCATGACATTAACCTATAAAAATAGGCGTA TCACGAGGCCCTTTCGTC

SEQ ID NO.9:

AGCTTAAGTTTAAACCGCTGATCAGCCTCGACTGTGCCTTCTAGTTGCCAGCCATCTGTTGTTTGCCCC TCCCCCGTGCCTTCCTTGACCCTGGAAGGTGCCACTCCCACTGTCCTTTCCTAATAAAATGAGGAAATTGCATCGCA TTGTCTGAGTAGGTGTCATTCTATTCTGGGGGGTGGGGTGGGGCAGGACAGCAAGGGGGAGGATTGGGAAGACAATA GCAGGCATGCTGGGGATGCGGTGGGCTCTATGGCTTCTGAGGCGGAAAGAACCAGCTGGGGCTCTAGGGGGTATCCC CACGCGCCCTGTAGCGGCGCATTAAGCGCGGCGGGTGTGGTGGTTACGCGCAGCGTGACCGCTACACTTGCCAGCGC CCTAGCGCCCGCTCCTTTCGCTTTCTTCCCTTCCTTTCTCGCCACGTTCGCCGGCTTTCCCCGTCAAGCTCTAAATC GGGGGCTCCCTTTAGGGTTCCGATTTAGTGCTTTACGGCACCTCGACCCCAAAAAACTTGATTAGGGTGATGGTTCA CGTAGTGGGCCATCGCCCTGATAGACGGTTTTTCGCCCTTTGACGTTGGAGTCCACGTTCTTTAATAGTGGACTCTT GTTCCAAACTGGAACAACACTCAACCCTATCTCGGTCTATTCTTTTGATTTATAAGGGATTTTGCCGATTTCGGCCT ATTGGTTAAAAAATGAGCTGATTTAACAAAAATTTAACGCGAATTAATTCTGTGGAATGTGTGTCAGTTAGGGTGTG GAAAGTCCCCAGGCTCCCCAGCAGGCAGAAGTATGCAAAGCATGCATCTCAATTAGTCAGCAACCAGGTGTGGAAAG TCCCCAGGCTCCCCAGCAGGCAGAAGTATGCAAAGCATGCATCTCAATTAGTCAGCAACCATAGTCCCGCCCCTAAC TCCGCCCATCCCGCCCCTAACTCCGCCCAGTTCCGCCCATTCTCCGCCCCATGGCTGACTAATTTTTTTTATTTATG CAGAGGCCGAGGCCGCCTCTGCCTCTGAGCTATTCCAGAAGTAGTGAGGAGGCTTTTTTGGAGGCCTAGGCTTTTGC AAAAAGCTCCCGGGAGCTTGTATATCCATTTTCGGATCTGATCAAGAGACAGGATGAGGATCGTTTCGCATGATTGA ACAAGATGGATTGCACGCAGGTTCTCCGGCCGCTTGGGTGGAGAGGCTATTCGGCTATGACTGGGCACAACAGACAA TCGGCTGCTCTGATGCCGCCGTGTTCCGGCTGTCAGCGCAGGGGCGCCCGGTTCTTTTTGTCAAGACCGACCTGTCC GGTGCCCTGAATGAACTGCAGGACGAGGCAGCGCGGCTATCGTGGCTGGCCACGACGGGCGTTCCTTGCGCAGCTGT GCTCGACGTTGTCACTGAAGCGGGAAGGGACTGGCTGCTATTGGGCGAAGTGCCGGGGCAGGATCTCCTGTCATCTC ACCTTGCTCCTGCCGAGAAAGTATCCATCATGGCTGATGCAATGCGGCGGCTGCATACGCTTGATCCGGCTACCTGC CCATTCGACCACCAAGCGAAACATCGCATCGAGCGAGCACGTACTCGGATGGAAGCCGGTCTTGTCGATCAGGATGA TCTGGACGAAGAGCATCAGGGGCTCGCGCCAGCCGAACTGTTCGCCAGGCTCAAGGCGCGCATGCCCGACGGCGAGG ATCTCGTCGTGACCCATGGCGATGCCTGCTTGCCGAATATCATGGTGGAAAATGGCCGCTTTTCTGGATTCATCGAC TGTGGCCGGCTGGGTGTGGCGGACCGCTATCAGGACATAGCGTTGGCTACCCGTGATATTGCTGAAGAGCTTGGCGG CGAATGGGCTGACCGCTTCCTCGTGCTTTACGGTATCGCCGCTCCCGATTCGCAGCGCATCGCCTTCTATCGCCTTC 
TTGACGAGTTCTTCTGAGCGGGACTCTGGGGTTCGAAATGACCGACCAAGCGACGCCCAACCTGCCATCACGAGATT TCGATTCCACCGCCGCCTTCTATGAAAGGTTGGGCTTCGGAATCGTTTTCCGGGACGCCGGCTGGATGATCCTCCAG CGCGGGGATCTCATGCTGGAGTTCTTCGCCCACCCCAACTTGTTTATTGCAGCTTATAATGGTTACAAATAAAGCAA TAGCATCACAAATTTCACAAATAAAGCATTTTTTTCACTGCATTCTAGTTGTGGTTTGTCCAAACTCATCAATGTAT CTTATCATGTCTGTATACCGTCGACCTCTAGCTAGAGCTTGGCGTAATCATGGTCATAGCTGTTTCCTGTGTGAAAT TGTTATCCGCTCACAATTCCACACAACATACGAGCCGGAAGCATAAAGTGTAAAGCCTGGGGTGCCTAATGAGTGAG CTAACTCACATTAATTGCGTTGCGCTCACTGCCCGCTTTCCAGTCGGGAAACCTGTCGTGCCAGCTGCATTAATGAA TCGGCCAACGCGCGGGGAGAGGCGGTTTGCGTATTGGGCGCTCTTCCGCTTCCTCGCTCACTGACTCGCTGCGCTCG GTCGTTCGGCTGCGGCGAGCGGTATCAGCTCACTCAAAGGCGGTAATACGGTTATCCACAGAATCAGGGGATAACGC AGGAAAGAACATGTGAGCAAAAGGCCAGCAAAAGGCCAGGAACCGTAAAAAGGCCGCGTTGCTGGCGTTTTTCCATA GGCTCCGCCCCCCTGACGAGCATCACAAAAATCGACGCTCAAGTCAGAGGTGGCGAAACCCGACAGGACTATAAAGA TACCAGGCGTTTCCCCCTGGAAGCTCCCTCGTGCGCTCTCCTGTTCCGACCCTGCCGCTTACCGGATACCTGTCCGC CTTTCTCCCTTCGGGAAGCGTGGCGCTTTCTCATAGCTCACGCTGTAGGTATCTCAGTTCGGTGTAGGTCGTTCGCT CCAAGCTGGGCTGTGTGCACGAACCCCCCGTTCAGCCCGACCGCTGCGCCTTATCCGGTAACTATCGTCTTGAGTCC AACCCGGTAAGACACGACTTATCGCCACTGGCAGCAGCCACTGGTAACAGGATTAGCAGAGCGAGGTATGTAGGCGG TGCTACAGAGTTCTTGAAGTGGTGGCCTAACTACGGCTACACTAGAAGAACAGTATTTGGTATCTGCGCTCTGCTGA AGCCAGTTACCTTCGGAAAAAGAGTTGGTAGCTCTTGATCCGGCAAACAAACCACCGCTGGTAGCGGTTTTTTTGTT TGCAAGCAGCAGATTACGCGCAGAAAAAAAGGATCTCAAGAAGATCCTTTGATCTTTTCTACGGGGTCTGACGCTCA GTGGAACGAAAACTCACGTTAAGGGATTTTGGTCATGAGATTATCAAAAAGGATCTTCACCTAGATCCTTTTAAATT AAAAATGAAGTTTTAAATCAATCTAAAGTATATATGAGTAAACTTGGTCTGACAGTTACCAATGCTTAATCAGTGAG GCACCTATCTCAGCGATCTGTCTATTTCGTTCATCCATAGTTGCCTGACTCCCCGTCGTGTAGATAACTACGATACG GGAGGGCTTACCATCTGGCCCCAGTGCTGCAATGATACCGCGAGACCCACGCTCACCGGCTCCAGATTTATCAGCAA TAAACCAGCCAGCCGGAAGGGCCGAGCGCAGAAGTGGTCCTGCAACTTTATCCGCCTCCATCCAGTCTATTAATTGT TGCCGGGAAGCTAGAGTAAGTAGTTCGCCAGTTAATAGTTTGCGCAACGTTGTTGCCATTGCTACAGGCATCGTGGT GTCACGCTCGTCGTTTGGTATGGCTTCATTCAGCTCCGGTTCCCAACGATCAAGGCGAGTTACATGATCCCCCATGT 
TGTGCAAAAAAGCGGTTAGCTCCTTCGGTCCTCCGATCGTTGTCAGAAGTAAGTTGGCCGCAGTGTTATCACTCATG GTTATGGCAGCACTGCATAATTCTCTTACTGTCATGCCATCCGTAAGATGCTTTTCTGTGACTGGTGAGTACTCAAC CAAGTCATTCTGAGAATAGTGTATGCGGCGACCGAGTTGCTCTTGCCCGGCGTCAATACGGGATAATACCGCGCCAC ATAGCAGAACTTTAAAAGTGCTCATCATTGGAAAACGTTCTTCGGGGCGAAAACTCTCAAGGATCTTACCGCTGTTG AGATCCAGTTCGATGTAACCCACTCGTGCACCCAACTGATCTTCAGCATCTTTTACTTTCACCAGCGTTTCTGGGTG AGCAAAAACAGGAAGGCAAAATGCCGCAAAAAAGGGAATAAGGGCGACACGGAAATGTTGAATACTCATACTCTTCC TTTTTCAATATTATTGAAGCATTTATCAGGGTTATTGTCTCATGAGCGGATACATATTTGAATGTATTTAGAAAAAT AAACAAATAGGGGTTCCGCGCACATTTCCCCGAAAAGTGCCACCTGACGTCGACGGATCGGGAGATCTCCCGATCCC CTATGGTGCACTCTCAGTACAATCTGCTCTGATGCCGCATAGTTAAGCCAGTATCTGCTCCCTGCTTGTGTGTTGGA GGTCGCTGAGTAGTGCGCGAGCAAAATTTAAGCTACAACAAGGCAAGGCTTGACCGACAATTGCATGAAGAATCTGC TTAGGGTTAGGCGTTTTGCGCTGCTTCGCGATGTACGGGCCAGATATACGCGTTGACATTGATTATTGACTAGTTAT TAATAGTAATCAATTACGGGGTCATTAGTTCATAGCCCATATATGGAGTTCCGCGTTACATAACTTACGGTAAATGG CCCGCCTGGCTGACCGCCCAACGACCCCCGCCCATTGACGTCAATAATGACGTATGTTCCCATAGTAACGCCAATAG GGACTTTCCATTGACGTCAATGGGTGGAGTATTTACGGTAAACTGCCCACTTGGCAGTACATCAAGTGTATCATATG CCAAGTACGCCCCCTATTGACGTCAATGACGGTAAATGGCCCGCCTGGCATTATGCCCAGTACATGACCTTATGGGA CTTTCCTACTTGGCAGTACATCTACGTATTAGTCATCGCTATTACCATGGTGATGCGGTTTTGGCAGTACATCAATG GGCGTGGATAGCGGTTTGACTCACGGGGATTTCCAAGTCTCCACCCCATTGACGTCAATGGGAGTTTGTTTTGGCAC CAAAATCAACGGGACTTTCCAAAATGTCGTAACAACTCCGCCCCATTGACGCAAATGGGCGGTAGGCGTGTACGGTG GGAGGTCTATATAAGCAGAGCTCTCTGGCTAACTAGAGAACCCACTGCTTACTGGCTTATCGAAATTAATACGACTC ACTATAGGGAGACCCAAGCTGGCTAGCGTTTAAACGGGCCCTCTAGACTCGAGCGGCCGCCATGTCCGAAGTCGAGT TTTCCCATGAGTACTGGATGAGACACGCATTGACTCTCGCAAAGAGGGCTTGGGATGAACGCGAGGTGCCCGTGGGG GCAGTACTCGTGCATAACAATCGCGTAATCGGCGAAGGTTGGAATAGGCCGATCGGACGCCACGACCCCACTGCACA TGCGGAAATCATGGCCCTTCGACAGGGAGGGCTTGTGATGCAGAATTATCGACTTATCGATGCGACGCTGTACGTCA CGCTTGAACCTTGCGTAATGTGCGCGGGAGCTATGATTCACTCCCGCATTGGACGAGTTGTATTCGGTGCCCGCGAC GCCAAGACGGGTGCCGCAGGTTCACTGATGGACGTGCTGCATCACCCAGGCATGAACCACCGGGTAGAAATCACAGA 
AGGCATATTGGCGGACGAATGTGCGGCGCTGTTGTCCGACTTTTTTCGCATGCGGAGGCAGGAGATCAAGGCCCAGA AAAAAGCACAATCCTCTACTGACTCTGGTGGTTCTTCTGGTGGTTCTAGCGGCAGCGAGACTCCCGGGACCTCAGAG TCCGCCACACCCGAAAGTTCTGGTGGTTCTTCTGGTGGTTCTTCCGAGGTCGAATTTTCACATGAGTATTGGATGCG ACACGCCTTGACGCTCGCCAAAAGGGCGAGGGACGAACGGGAAGTTCCCGTAGGCGCCGTCCTTGTACTGAATAATC GAGTTATTGGCGAAGGTTGGAACAGGGCCATAGGACTGCATGATCCAACAGCCCATGCAGAAATCATGGCGCTCCGG CAGGGTGGCCTTGTCATGCAAAATTATAGGCTGATCGACGCGACGTTGTACGTCACCTTCGAACCTTGCGTTATGTG TGCAGGCGCTATGATACATTCAAGAATTGGGCGAGTCGTGTTTGGGGTCAGGAACGCAAAGACTGGTGCAGCCGGTT CCCTTATGGATGTGCTCCACTACCCAGGAATGAATCATCGGGTCGAGATTACAGAGGGGATACTGGCTGACGAATGC GCCGCCCTCCTGTGCTACTTCTTTCGGATGCCCAGGCAGGTGTTTAACGCACAGAAGAAAGCTCAAAGCAGTACCGA CTCTGGGGGCTCTAGTGGAGGCTCCAGCGGTTCTGAGACCCCCGGCACTAGTGAATCTGCCACTCCCGAATCATCCG GGGGATCTTCAGGGGGATCTGATAAAAAGTATTCTATTGGTTTAGCCATCGGCACTAATTCCGTTGGATGGGCTGTC ATAACCGATGAATACAAAGTACCTTCAAAGAAATTTAAGGTGTTGGGGAACACAGACCGTCATTCGATTAAAAAGAA TCTTATCGGTGCCCTCCTATTCGATAGTGGCGAAACGGCAGAGGCGACTCGCCTGAAACGAACCGCTCGGAGAAGGT ATACACGTCGCAAGAACCGAATATGTTACTTACAAGAAATTTTTAGCAATGAGATGGCCAAAGTTGACGATTCTTTC TTTCACCGTTTGGAAGAGTCCTTCCTTGTCGAAGAGGACAAGAAACATGAACGGCACCCCATCTTTGGAAACATAGT AGATGAGGTGGCATATCATGAAAAGTACCCAACGATTTATCACCTCAGAAAAAAGCTAGTTGACTCAACTGATAAAG CGGACCTGAGGTTAATCTACTTGGCTCTTGCCCATATGATAAAGTTCCGTGGGCACTTTCTCATTGAGGGTGATCTA AATCCGGACAACTCGGATGTCGACAAACTGTTCATCCAGTTAGTACAAACCTATAATCAGTTGTTTGAAGAGAACCC TATAAATGCAAGTGGCGTGGATGCGAAGGCTATTCTTAGCGCCCGCCTCTCTAAATCCCGACGGCTAGAAAACCTGA TCGCACAATTACCCGGAGAGAAGAAAAATGGGTTGTTCGGTAACCTTATAGCGCTCTCACTAGGCCTGACACCAAAT TTTAAGTCGAACTTCGACTTAGCTGAAGATGCCAAATTGCAGCTTAGTAAGGACACGTACGATGACGATCTCGACAA TCTACTGGCACAAATTGGAGATCAGTATGCGGACTTATTTTTGGCTGCCAAAAACCTTAGCGATGCAATCCTCCTAT CTGACATACTGAGAGTTAATACTGAGATTACCAAGGCGCCGTTATCCGCTTCAATGATCAAAAGGTACGATGAACAT CACCAAGACTTGACACTTCTCAAGGCCCTAGTCCGTCAGCAACTGCCTGAGAAATATAAGGAAATATTCTTTGATCA GTCGAAAAACGGGTACGCAGGTTATATTGACGGCGGAGCGAGTCAAGAGGAATTCTACAAGTTTATCAAACCCATAT 
TAGAGAAGATGGATGGGACGGAAGAGTTGCTTGTAAAACTCAATCGCGAAGATCTACTGCGAAAGCAGCGGACTTTC GACAACGGTAGCATTCCACATCAAATCCACTTAGGCGAATTGCATGCTATACTTAGAAGGCAGGAGGATTTTTATCC GTTCCTCAAAGACAATCGTGAAAAGATTGAGAAAATCCTAACCTTTCGCATACCTTACTATGTGGGACCCCTGGCCC GAGGGAACTCTCGGTTCGCATGGATGACAAGAAAGTCCGAAGAAACGATTACTCCCTGGAATTTTGAGGAAGTTGTC GATAAAGGTGCGTCAGCTCAATCGTTCATCGAGAGGATGACCAACTTTGACAAGAATTTACCGAACGAAAAAGTATT GCCTAAGCACAGTTTACTTTACGAGTATTTCACAGTGTACAATGAACTCACGAAAGTTAAGTATGTCACTGAGGGCA TGCGTAAACCCGCCTTTCTAAGCGGAGAACAGAAGAAAGCAATAGTAGATCTGTTATTCAAGACCAACCGCAAAGTG ACAGTTAAGCAATTGAAAGAGGACTACTTTAAGAAAATTGAATGCTTCGATTCTGTCGAGATCTCCGGGGTAGAAGA TCGATTTAATGCGTCACTTGGTACGTATCATGACCTCCTAAAGATAATTAAAGATAAGGACTTCCTGGATAACGAAG AGAATGAAGATATCTTAGAAGATATAGTGTTGACTCTTACCCTCTTTGAAGATCGGGAAATGATTGAGGAAAGACTA AAAACATACGCTCACCTGTTCGACGATAAGGTTATGAAACAGTTAAAGAGGCGTCGCTATACGGGCTGGGGACGCTT GTCGCGGAAACTTATCAACGGGATAAGAGACAAGCAAAGTGGTAAAACTATTCTCGATTTTCTAAAGAGCGACGGCT TCGCCAATAGGAACTTTATGCAGCTGATCCATGATGACTCTTTAACCTTCAAAGAGGATATACAAAAGGCACAGGTT TCCGGACAAGGGGACTCATTGCACGAACATATTGCGAATCTTGCTGGTTCGCCAGCCATCAAAAAGGGCATACTCCA GACAGTCAAAGTAGTGGATGAGCTAGTTAAGGTCATGGGACGTCACAAACCGGAAAACATTGTAATCGAGATGGCAC GCGAAAATCAAACGACTCAGAAGGGGCAAAAAAACAGTCGAGAGCGGATGAAGAGAATAGAAGAGGGTATTAAAGAA CTGGGCAGCCAGATCTTAAAGGAGCATCCTGTGGAAAATACCCAATTGCAGAACGAGAAACTTTACCTCTATTACCT ACAAAATGGAAGGGACATGTATGTTGATCAGGAACTGGACATAAACCGTTTATCTGATTACGACGTCGATCACATTG TACCCCAATCCTTTTTGAAGGACGATTCAATCGACAATAAAGTGCTTACACGCTCGGATAAGAACCGAGGGAAAAGT GACAATGTTCCAAGCGAGGAAGTCGTAAAGAAAATGAAGAACTATTGGCGGCAGCTCCTAAATGCGAAACTGATAAC GCAAAGAAAGTTCGATAACTTAACTAAAGCTGAGAGGGGTGGCTTGTCTGAACTTGACAAGGCCGGATTTATTAAAC GTCAGCTCGTGGAAACCCGCCAGATCACAAAGCATGTTGCCCAGATACTAGATTCCCGAATGAATACGAAATACGAC GAGAACGATAAGCTGATTCGGGAAGTCAAAGTAATCACTTTAAAGTCAAAATTGGTGTCGGACTTCAGAAAGGATTT TCAATTCTATAAAGTTAGGGAGATAAATAACTACCACCATGCGCACGACGCTTATCTTAATGCCGTCGTAGGGACCG CACTCATTAAGAAATACCCGAAGCTAGAAAGTGAGTTTGTGTATGGTGATTACAAAGTTTATGACGTCCGTAAGATG 
ATCGCGAAAAGCGAACAGGAGATAGGCAAGGCTACAGCCAAATACTTCTTTTATTCTAACATTATGAATTTCTTTAA GACGGAAATCACTCTGGCAAACGGAGAGATACGCAAACGACCTTTAATTGAAACCAATGGGGAGACAGGTGAAATCG TATGGGATAAGGGCCGGGACTTCGCGACGGTGAGAAAAGTTTTGTCCATGCCCCAAGTCAACATAGTAAAGAAAACT GAGGTGCAGACCGGAGGGTTTTCAAAGGAATCGATTCTTCCAAAAAGGAATAGTGATAAGCTCATCGCTCGTAAAAA GGACTGGGACCCGAAAAAGTACGGTGGCTTCGATAGCCCTACAGTTGCCTATTCTGTCCTAGTAGTGGCAAAAGTTG AGAAGGGAAAATCCAAGAAACTGAAGTCAGTCAAAGAATTATTGGGGATAACGATTATGGAGCGCTCGTCTTTTGAA AAGAACCCCATCGACTTCCTTGAGGCGAAAGGTTACAAGGAAGTAAAAAAGGATCTCATAATTAAACTACCAAAGTA TAGTCTGTTTGAGTTAGAAAATGGCCGAAAACGGATGTTGGCTAGCGCCGGAGAGCTTCAAAAGGGGAACGAACTCG CACTACCGTCTAAATACGTGAATTTCCTGTATTTAGCGTCCCATTACGAGAAGTTGAAAGGTTCACCTGAAGATAAC GAACAGAAGCAACTTTTTGTTGAGCAGCACAAACATTATCTCGACGAAATCATAGAGCAAATTTCGGAATTCAGTAA GAGAGTCATCCTAGCTGATGCCAATCTGGACAAAGTATTAAGCGCATACAACAAGCACAGGGATAAACCCATACGTG AGCAGGCGGAAAATATTATCCATTTGTTTACTCTTACCAACCTCGGCGCTCCAGCCGCATTCAAGTATTTTGACACA ACGATAGATCGCAAACGATACACTTCTACCAAGGAGGTGCTAGACGCGACACTGATTCACCAATCCATCACGGGATT ATATGAAACTCGGATAGATTTGTCACAGCTTGGGGGTGACTCTGGTGGTTCTCCCAAGAAGAAGAGGAAAGTCTAAA


Sequence table

<110> Sun Yat-sen University

<120> A high-specificity ABE base editing system and its application in β-hemoglobinopathies

<130> 2018

<160> 7

<170> SIPOSequenceListing 1.0

<210> 1

<211> 22

<212> RNA

<213> Artificial Sequence

<400> 1

ggguggggaa ggggccccca ag 22

<210> 2

<211> 21

<212> RNA

<213> Artificial Sequence

<400> 2

gguggggaag gggcccccaa g 21

<210> 3

<211> 18

<212> RNA

<213> Artificial Sequence

<400> 3

ggggaagggg cccccaag 18

<210> 4

<211> 17

<212> RNA

<213> Artificial Sequence

<400> 4

gggaaggggc ccccaag 17

<210> 5

<211> 1763

<212> PRT

<213> Artificial Sequence

<400> 5

Ser Glu Val Glu Phe Ser His Glu Tyr Trp Met Arg His Ala Leu Thr

1 5 10 15

Leu Ala Lys Arg Ala Trp Asp Glu Arg Glu Val Pro Val Gly Ala Val

20 25 30

Leu Val His Asn Asn Arg Val Ile Gly Glu Gly Trp Asn Arg Pro Ile

35 40 45

Gly Arg His Asp Pro Thr Ala His Ala Glu Ile Met Ala Leu Arg Gln

50 55 60

Gly Gly Leu Val Met Gln Asn Tyr Arg Leu Ile Asp Ala Thr Leu Tyr

65 70 75 80

Val Thr Leu Glu Pro Cys Val Met Cys Ala Gly Ala Met Ile His Ser

85 90 95

Arg Ile Gly Arg Val Val Phe Gly Ala Arg Asp Ala Lys Thr Gly Ala

100 105 110

Ala Gly Ser Leu Met Asp Val Leu His His Pro Gly Met Asn His Arg

115 120 125

Val Glu Ile Thr Glu Gly Ile Leu Ala Asp Glu Cys Ala Ala Leu Leu

130 135 140

Ser Asp Phe Phe Arg Met Arg Arg Gln Glu Ile Lys Ala Gln Lys Lys

145 150 155 160

Ala Gln Ser Ser Thr Asp Ser Gly Gly Ser Ser Gly Gly Ser Ser Gly

165 170 175

Ser Glu Thr Pro Gly Thr Ser Glu Ser Ala Thr Pro Glu Ser Ser Gly

180 185 190

Gly Ser Ser Gly Gly Ser Ser Glu Val Glu Phe Ser His Glu Tyr Trp

195 200 205

Met Arg His Ala Leu Thr Leu Ala Lys Arg Ala Arg Asp Glu Arg Glu

210 215 220

Val Pro Val Gly Ala Val Leu Val Leu Asn Asn Arg Val Ile Gly Glu

225 230 235 240

Gly Trp Asn Arg Ala Ile Gly Leu His Asp Pro Thr Ala His Ala Glu

245 250 255

Ile Met Ala Leu Arg Gln Gly Gly Leu Val Met Gln Asn Tyr Arg Leu

260 265 270

Ile Asp Ala Thr Leu Tyr Val Thr Phe Glu Pro Cys Val Met Cys Ala

275 280 285

Gly Ala Met Ile His Ser Arg Ile Gly Arg Val Val Phe Gly Val Arg

290 295 300

Asn Ala Lys Thr Gly Ala Ala Gly Ser Leu Met Asp Val Leu His Tyr

305 310 315 320

Pro Gly Met Asn His Arg Val Glu Ile Thr Glu Gly Ile Leu Ala Asp

325 330 335

Glu Cys Ala Ala Leu Leu Cys Tyr Phe Phe Arg Met Pro Arg Gln Val

340 345 350

Phe Asn Ala Gln Lys Lys Ala Gln Ser Ser Thr Asp Ser Gly Gly Ser

355 360 365

Ser Gly Gly Ser Ser Gly Ser Glu Thr Pro Gly Thr Ser Glu Ser Ala

370 375 380

Thr Pro Glu Ser Ser Gly Gly Ser Ser Gly Gly Ser Asp Lys Lys Tyr

385 390 395 400

Ser Ile Gly Leu Ala Ile Gly Thr Asn Ser Val Gly Trp Ala Val Ile

405 410 415

Thr Asp Glu Tyr Lys Val Pro Ser Lys Lys Phe Lys Val Leu Gly Asn

420 425 430

Thr Asp Arg His Ser Ile Lys Lys Asn Leu Ile Gly Ala Leu Leu Phe

435 440 445

Asp Ser Gly Glu Thr Ala Glu Ala Thr Arg Leu Lys Arg Thr Ala Arg

450 455 460

Arg Arg Tyr Thr Arg Arg Lys Asn Arg Ile Cys Tyr Leu Gln Glu Ile

465 470 475 480

Phe Ser Asn Glu Met Ala Lys Val Asp Asp Ser Phe Phe His Arg Leu

485 490 495

Glu Glu Ser Phe Leu Val Glu Glu Asp Lys Lys His Glu Arg His Pro

500 505 510

Ile Phe Gly Asn Ile Val Asp Glu Val Ala Tyr His Glu Lys Tyr Pro

515 520 525

Thr Ile Tyr His Leu Arg Lys Lys Leu Val Asp Ser Thr Asp Lys Ala

530 535 540

Asp Leu Arg Leu Ile Tyr Leu Ala Leu Ala His Met Ile Lys Phe Arg

545 550 555 560

Gly His Phe Leu Ile Glu Gly Asp Leu Asn Pro Asp Asn Ser Asp Val

565 570 575

Asp Lys Leu Phe Ile Gln Leu Val Gln Thr Tyr Asn Gln Leu Phe Glu

580 585 590

Glu Asn Pro Ile Asn Ala Ser Gly Val Asp Ala Lys Ala Ile Leu Ser

595 600 605

Ala Arg Leu Ser Lys Ser Arg Arg Leu Glu Asn Leu Ile Ala Gln Leu

610 615 620

Pro Gly Glu Lys Lys Asn Gly Leu Phe Gly Asn Leu Ile Ala Leu Ser

625 630 635 640

Leu Gly Leu Thr Pro Asn Phe Lys Ser Asn Phe Asp Leu Ala Glu Asp

645 650 655

Ala Lys Leu Gln Leu Ser Lys Asp Thr Tyr Asp Asp Asp Leu Asp Asn

660 665 670

Leu Leu Ala Gln Ile Gly Asp Gln Tyr Ala Asp Leu Phe Leu Ala Ala

675 680 685

Lys Asn Leu Ser Asp Ala Ile Leu Leu Ser Asp Ile Leu Arg Val Asn

690 695 700

Thr Glu Ile Thr Lys Ala Pro Leu Ser Ala Ser Met Ile Lys Arg Tyr

705 710 715 720

Asp Glu His His Gln Asp Leu Thr Leu Leu Lys Ala Leu Val Arg Gln

725 730 735

Gln Leu Pro Glu Lys Tyr Lys Glu Ile Phe Phe Asp Gln Ser Lys Asn

740 745 750

Gly Tyr Ala Gly Tyr Ile Asp Gly Gly Ala Ser Gln Glu Glu Phe Tyr

755 760 765

Lys Phe Ile Lys Pro Ile Leu Glu Lys Met Asp Gly Thr Glu Glu Leu

770 775 780

Leu Val Lys Leu Asn Arg Glu Asp Leu Leu Arg Lys Gln Arg Thr Phe

785 790 795 800

Asp Asn Gly Ser Ile Pro His Gln Ile His Leu Gly Glu Leu His Ala

805 810 815

Ile Leu Arg Arg Gln Glu Asp Phe Tyr Pro Phe Leu Lys Asp Asn Arg

820 825 830

Glu Lys Ile Glu Lys Ile Leu Thr Phe Arg Ile Pro Tyr Tyr Val Gly

835 840 845

Pro Leu Ala Arg Gly Asn Ser Arg Phe Ala Trp Met Thr Arg Lys Ser

850 855 860

Glu Glu Thr Ile Thr Pro Trp Asn Phe Glu Glu Val Val Asp Lys Gly

865 870 875 880

Ala Ser Ala Gln Ser Phe Ile Glu Arg Met Thr Asn Phe Asp Lys Asn

885 890 895

Leu Pro Asn Glu Lys Val Leu Pro Lys His Ser Leu Leu Tyr Glu Tyr

900 905 910

Phe Thr Val Tyr Asn Glu Leu Thr Lys Val Lys Tyr Val Thr Glu Gly

915 920 925

Met Arg Lys Pro Ala Phe Leu Ser Gly Glu Gln Lys Lys Ala Ile Val

930 935 940

Asp Leu Leu Phe Lys Thr Asn Arg Lys Val Thr Val Lys Gln Leu Lys

945 950 955 960

Glu Asp Tyr Phe Lys Lys Ile Glu Cys Phe Asp Ser Val Glu Ile Ser

965 970 975

Gly Val Glu Asp Arg Phe Asn Ala Ser Leu Gly Thr Tyr His Asp Leu

980 985 990

Leu Lys Ile Ile Lys Asp Lys Asp Phe Leu Asp Asn Glu Glu Asn Glu

995 1000 1005

Asp Ile Leu Glu Asp Ile Val Leu Thr Leu Thr Leu Phe Glu Asp Arg

1010 1015 1020

Glu Met Ile Glu Glu Arg Leu Lys Thr Tyr Ala His Leu Phe Asp Asp

1025 1030 1035 1040

Lys Val Met Lys Gln Leu Lys Arg Arg Arg Tyr Thr Gly Trp Gly Arg

1045 1050 1055

Leu Ser Arg Lys Leu Ile Asn Gly Ile Arg Asp Lys Gln Ser Gly Lys

1060 1065 1070

Thr Ile Leu Asp Phe Leu Lys Ser Asp Gly Phe Ala Asn Arg Asn Phe

1075 1080 1085

Met Gln Leu Ile His Asp Asp Ser Leu Thr Phe Lys Glu Asp Ile Gln

1090 1095 1100

Lys Ala Gln Val Ser Gly Gln Gly Asp Ser Leu His Glu His Ile Ala

1105 1110 1115 1120

Asn Leu Ala Gly Ser Pro Ala Ile Lys Lys Gly Ile Leu Gln Thr Val

1125 1130 1135

Lys Val Val Asp Glu Leu Val Lys Val Met Gly Arg His Lys Pro Glu

1140 1145 1150

Asn Ile Val Ile Glu Met Ala Arg Glu Asn Gln Thr Thr Gln Lys Gly

1155 1160 1165

Gln Lys Asn Ser Arg Glu Arg Met Lys Arg Ile Glu Glu Gly Ile Lys

1170 1175 1180

Glu Leu Gly Ser Gln Ile Leu Lys Glu His Pro Val Glu Asn Thr Gln

1185 1190 1195 1200

Leu Gln Asn Glu Lys Leu Tyr Leu Tyr Tyr Leu Gln Asn Gly Arg Asp

1205 1210 1215

Met Tyr Val Asp Gln Glu Leu Asp Ile Asn Arg Leu Ser Asp Tyr Asp

1220 1225 1230

Val Asp His Ile Val Pro Gln Ser Phe Leu Lys Asp Asp Ser Ile Asp

1235 1240 1245

Asn Lys Val Leu Thr Arg Ser Asp Lys Asn Arg Gly Lys Ser Asp Asn

1250 1255 1260

Val Pro Ser Glu Glu Val Val Lys Lys Met Lys Asn Tyr Trp Arg Gln

1265 1270 1275 1280

Leu Leu Asn Ala Lys Leu Ile Thr Gln Arg Lys Phe Asp Asn Leu Thr

1285 1290 1295

Lys Ala Glu Arg Gly Gly Leu Ser Glu Leu Asp Lys Ala Gly Phe Ile

1300 1305 1310

Lys Arg Gln Leu Val Glu Thr Arg Gln Ile Thr Lys His Val Ala Gln

1315 1320 1325

Ile Leu Asp Ser Arg Met Asn Thr Lys Tyr Asp Glu Asn Asp Lys Leu

1330 1335 1340

Ile Arg Glu Val Lys Val Ile Thr Leu Lys Ser Lys Leu Val Ser Asp

1345 1350 1355 1360

Phe Arg Lys Asp Phe Gln Phe Tyr Lys Val Arg Glu Ile Asn Asn Tyr

1365 1370 1375

His His Ala His Asp Ala Tyr Leu Asn Ala Val Val Gly Thr Ala Leu

1380 1385 1390

Ile Lys Lys Tyr Pro Lys Leu Glu Ser Glu Phe Val Tyr Gly Asp Tyr

1395 1400 1405

Lys Val Tyr Asp Val Arg Lys Met Ile Ala Lys Ser Glu Gln Glu Ile

1410 1415 1420

Gly Lys Ala Thr Ala Lys Tyr Phe Phe Tyr Ser Asn Ile Met Asn Phe

1425 1430 1435 1440

Phe Lys Thr Glu Ile Thr Leu Ala Asn Gly Glu Ile Arg Lys Arg Pro

1445 1450 1455

Leu Ile Glu Thr Asn Gly Glu Thr Gly Glu Ile Val Trp Asp Lys Gly

1460 1465 1470

Arg Asp Phe Ala Thr Val Arg Lys Val Leu Ser Met Pro Gln Val Asn

1475 1480 1485

Ile Val Lys Lys Thr Glu Val Gln Thr Gly Gly Phe Ser Lys Glu Ser

1490 1495 1500

Ile Leu Pro Lys Arg Asn Ser Asp Lys Leu Ile Ala Arg Lys Lys Asp

1505 1510 1515 1520

Trp Asp Pro Lys Lys Tyr Gly Gly Phe Asp Ser Pro Thr Val Ala Tyr

1525 1530 1535

Ser Val Leu Val Val Ala Lys Val Glu Lys Gly Lys Ser Lys Lys Leu

1540 1545 1550

Lys Ser Val Lys Glu Leu Leu Gly Ile Thr Ile Met Glu Arg Ser Ser

1555 1560 1565

Phe Glu Lys Asn Pro Ile Asp Phe Leu Glu Ala Lys Gly Tyr Lys Glu

1570 1575 1580

Val Lys Lys Asp Leu Ile Ile Lys Leu Pro Lys Tyr Ser Leu Phe Glu

1585 1590 1595 1600

Leu Glu Asn Gly Arg Lys Arg Met Leu Ala Ser Ala Gly Glu Leu Gln

1605 1610 1615

Lys Gly Asn Glu Leu Ala Leu Pro Ser Lys Tyr Val Asn Phe Leu Tyr

1620 1625 1630

Leu Ala Ser His Tyr Glu Lys Leu Lys Gly Ser Pro Glu Asp Asn Glu

1635 1640 1645

Gln Lys Gln Leu Phe Val Glu Gln His Lys His Tyr Leu Asp Glu Ile

1650 1655 1660

Ile Glu Gln Ile Ser Glu Phe Ser Lys Arg Val Ile Leu Ala Asp Ala

1665 1670 1675 1680

Asn Leu Asp Lys Val Leu Ser Ala Tyr Asn Lys His Arg Asp Lys Pro

1685 1690 1695

Ile Arg Glu Gln Ala Glu Asn Ile Ile His Leu Phe Thr Leu Thr Asn

1700 1705 1710

Leu Gly Ala Pro Ala Ala Phe Lys Tyr Phe Asp Thr Thr Ile Asp Arg

1715 1720 1725

Lys Arg Tyr Thr Ser Thr Lys Glu Val Leu Asp Ala Thr Leu Ile His

1730 1735 1740

Gln Ser Ile Thr Gly Leu Tyr Glu Thr Arg Ile Asp Leu Ser Gln Leu

1745 1750 1755 1760

Gly Gly Asp

<210> 6

<211> 20

<212> RNA

<213> Artificial Sequence

<400> 6

guggggaagg ggcccccaag 20

<210> 7

<211> 82

<212> RNA

<213> Artificial Sequence

<400> 7

guuuuagagc uagaaauagc aaguuaaaau aaggcuaguc cguuaucaac uugaaaaagu 60

ggcaccgagu cggugcuuuu uu 82
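As a reading aid for the two RNA entries above (SEQ ID NO.6 and SEQ ID NO.7), the following minimal Python sketch joins the grouped sequence lines, drops the running residue count at the end of each line, and checks the result against the declared <211> length. It also lists the adenine positions in SEQ ID NO.6, since adenine is the base that the ABE system converts to guanine. The helper names `parse_listing` and `adenine_positions` are hypothetical and are not part of the patent.

```python
def parse_listing(lines):
    """Join the sequence lines of one listing entry into a single string.

    Each line holds space-separated groups of residues followed by a
    cumulative residue count (for example "60" or "82"), which is dropped.
    """
    parts = []
    for line in lines:
        tokens = line.split()
        if tokens and tokens[-1].isdigit():
            tokens = tokens[:-1]  # remove the running count
        parts.append("".join(tokens))
    return "".join(parts)


def adenine_positions(rna):
    """Return the 1-based positions of adenine ('a') in an RNA string."""
    return [i for i, base in enumerate(rna, start=1) if base == "a"]


# Sequence lines copied from the listing entries above.
SEQ6_LINES = ["guggggaagg ggcccccaag 20"]
SEQ7_LINES = [
    "guuuuagagc uagaaauagc aaguuaaaau aaggcuaguc cguuaucaac uugaaaaagu 60",
    "ggcaccgagu cggugcuuuu uu 82",
]

seq6 = parse_listing(SEQ6_LINES)
seq7 = parse_listing(SEQ7_LINES)

# Declared lengths come from the <211> fields of each entry.
for name, seq, declared in (("SEQ ID NO.6", seq6, 20), ("SEQ ID NO.7", seq7, 82)):
    status = "ok" if len(seq) == declared else "MISMATCH"
    print(f"{name}: {len(seq)} nt (declared {declared}) {status}")

print("Adenine positions in SEQ ID NO.6:", adenine_positions(seq6))
```

The sketch only reports where adenines occur. Which of them is actually edited depends on the editor and the experimental conditions, which this section does not state.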

Claims

1. A gRNA targeting the HBG1 and HBG2 promoter regions, characterized in that it comprises a gRNA sequence, the nucleotide sequence of which comprises at least one of SEQ ID NO.1, SEQ ID NO.2, SEQ ID NO.3 and SEQ ID NO.4.

2. An ABE base editing system for targeted editing of the HBG1 and HBG2 promoter regions, characterized in that the ABE base editing system comprises a TadA:TadA*:Cas9 fusion protein and further comprises a gRNA targeting the HBG1 and HBG2 promoter regions, wherein the nucleotide sequence of the gRNA sequence comprises at least one of SEQ ID NO.1, SEQ ID NO.2, SEQ ID NO.3 and SEQ ID NO.4.

3. The ABE base editing system according to claim 2, characterized in that the amino acid sequence of the TadA:TadA*:Cas9 fusion protein is a sequence at least 80%, 85%, 90%, 92%, 95%, 96%, 97%, 98%, 99% or 99.5% identical to the amino acid sequence shown in SEQ ID NO.5.

4. A non-naturally occurring or engineered composition, characterized in that the composition is one or more of 1) to 6):

1) a composition comprising one or more vectors, the one or more vectors comprising component I and component II:

component I comprises a first regulatory element and, operably linked to the first regulatory element, a coding sequence encoding the TadA:TadA*:Cas9 fusion protein according to claim 2; component II comprises a second regulatory element and, operably linked to the second regulatory element, a sequence encoding the gRNA sequence according to claim 1; wherein components I and II are located on the same vector or on different vectors;

2) a composition comprising an mRNA of the TadA:TadA*:Cas9 fusion protein according to claim 2 and a vector, the vector comprising a second regulatory element and, operably linked to the second regulatory element, a sequence encoding the gRNA sequence according to claim 1;

3) a composition comprising the TadA:TadA*:Cas9 fusion protein according to claim 2 in protein form and a vector, the vector comprising a second regulatory element and, operably linked to the second regulatory element, a sequence encoding the gRNA sequence according to claim 1;

4) a composition comprising an expression vector for the TadA:TadA*:Cas9 fusion protein according to claim 2 and the gRNA according to claim 1;

5) a composition comprising an mRNA of the TadA:TadA*:Cas9 fusion protein according to claim 2 and the gRNA according to claim 1;

6) a composition comprising the TadA:TadA*:Cas9 fusion protein according to claim 2 in protein form and the gRNA according to claim 1.

5. A method for targeted editing of the HBG1 and HBG2 promoter regions in a somatic cell or in vivo, characterized in that it comprises: delivering the ABE base editing system according to claim 2 or the composition according to claim 4, so that the ABE base editing system or composition is brought into contact with the human HBG1 and HBG2 sequences.

6. A eukaryotic host cell, characterized in that it comprises the gRNA sequence provided by claim 1, the ABE base editing system according to claim 2, or the composition according to claim 5.

7. A kit for specifically up-regulating the expression of HBG1 and HBG2 in somatic cells or in the human body, characterized in that it comprises one or more of: the gRNA sequence provided by claim 1, the ABE base editing system provided by claim 2, the non-naturally occurring or engineered composition provided by claim 4, and the eukaryotic host cell provided by claim 6.

8. Use of the gRNA sequence provided by claim 1, the ABE base editing system provided by claim 2, the non-naturally occurring or engineered composition provided by claim 5, the eukaryotic host cell provided by claim 6, or the kit provided by claim 7 in the preparation of a medicament for preventing, ameliorating and/or treating, in somatic cells or in the human body, a β hemoglobinopathy caused by HBB gene mutation.

9. The use according to claim 8, characterized in that the β hemoglobinopathy includes β-thalassemia and sickle cell anemia.

10. The gRNA sequence provided by claim 1, the ABE base editing system provided by claim 2, the non-naturally occurring or engineered composition provided by claim 5, or the eukaryotic host cell provided by claim 6, applied by one or more of the following methods:

1) the gRNA sequence provided by claim 1, the ABE base editing system provided by claim 2, the non-naturally occurring or engineered composition provided by claim 5, or the eukaryotic host cell provided by claim 6 is applied alone;

2) the gRNA sequence provided by claim 1, the ABE base editing system provided by claim 2, the non-naturally occurring or engineered composition provided by claim 5, or the eukaryotic host cell provided by claim 6 is used in combination with one or more of surgery, biological therapy and immunotherapy;

3) ABE that gRNA sequence, the claim 2 provided such as such as claim 1 is provided by the way of delivering in vivo The eukaryon of non-naturally occurring or engineering composition, claim 6 offer that base editing system, claim 5 provide Host cell is directly delivered to patient's body and is treated；

4) the gRNA sequence provided by claim 1, the ABE base editing system provided by claim 2, or the non-naturally occurring or engineered composition provided by claim 5 is first introduced into host cells by in vitro transfection, and the host cells containing the gRNA sequence, the ABE base editing system or the composition are then infused back into the patient's body to carry out the treatment;

5) the eukaryotic host cell provided by claim 6 is infused into the patient's body to carry out the treatment.
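The identity thresholds recited in claim 3 (at least 80% up to 99.5% identical to SEQ ID NO.5) can be illustrated with a minimal Python sketch. It assumes the simplest possible definition of identity: an ungapped, position-by-position comparison of two sequences of equal length. This is an assumption for illustration only; the claims do not specify how identity is computed, and alignment-based methods can give different values. The function names and the single substitution in the example variant are hypothetical.

```python
# Thresholds listed in claim 3, in percent.
THRESHOLDS = (80, 85, 90, 92, 95, 96, 97, 98, 99, 99.5)


def percent_identity(reference, candidate):
    """Ungapped percent identity between two equal-length residue lists."""
    if len(reference) != len(candidate):
        raise ValueError("this sketch only compares sequences of equal length")
    if not reference:
        raise ValueError("sequences must not be empty")
    matches = sum(r == c for r, c in zip(reference, candidate))
    return 100.0 * matches / len(reference)


def thresholds_met(identity):
    """Return the claim 3 thresholds that a given identity value satisfies."""
    return [t for t in THRESHOLDS if identity >= t]


# A 16-residue stretch copied from the SEQ ID NO.5 listing (three-letter codes).
reference = "Gly Trp Asn Arg Ala Ile Gly Leu His Asp Pro Thr Ala His Ala Glu".split()

# Hypothetical variant with one substitution (Ile -> Val at position 6).
variant = list(reference)
variant[5] = "Val"

identity = percent_identity(reference, variant)
print(f"identity: {identity:.2f}%")
print("thresholds met:", thresholds_met(identity))
```

For a full-length comparison against SEQ ID NO.5, the same functions would be applied to the complete residue list, with the equal-length restriction still holding.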