CN112143753A - A set of adenine base editors and related biological materials and applications thereof - Google Patents

A set of adenine base editors and related biological materials and applications thereof

Publication number
CN112143753A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010980266.2A
Other languages
English (en)
Inventor
周焕斌
任斌
严大琦
柳浪
严芳
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Institute of Plant Protection of Chinese Academy of Agricultural Sciences
Original Assignee
Institute of Plant Protection of Chinese Academy of Agricultural Sciences
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Application filed by Institute of Plant Protection of Chinese Academy of Agricultural Sciences
Priority to CN202010980266.2A
Publication of CN112143753A

Classifications

    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/63 Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
    • C12N15/79 Vectors or expression systems specially adapted for eukaryotic hosts
    • C12N15/82 Vectors or expression systems specially adapted for eukaryotic hosts for plant cells, e.g. plant artificial chromosomes (PACs)
    • C12N15/8216 Methods for controlling, regulating or enhancing expression of transgenes in plant cells
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N9/00 Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
    • C12N9/14 Hydrolases (3)
    • C12N9/16 Hydrolases (3) acting on ester bonds (3.1)
    • C12N9/22 Ribonucleases RNAses, DNAses
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N9/00 Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
    • C12N9/14 Hydrolases (3)
    • C12N9/78 Hydrolases (3) acting on carbon to nitrogen bonds other than peptide bonds (3.5)
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Y ENZYMES
    • C12Y305/00 Hydrolases acting on carbon-nitrogen bonds, other than peptide bonds (3.5)
    • C12Y305/04 Hydrolases acting on carbon-nitrogen bonds, other than peptide bonds (3.5) in cyclic amidines (3.5.4)
    • C12Y305/04002 Adenine deaminase (3.5.4.2)
    • C CHEMISTRY; METALLURGY
    • C07 ORGANIC CHEMISTRY
    • C07K PEPTIDES
    • C07K2319/00 Fusion polypeptide
    • C07K2319/01 Fusion polypeptide containing a localisation/targetting motif
    • C07K2319/09 Fusion polypeptide containing a localisation/targetting motif containing a nuclear localisation signal

Landscapes

  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Genetics & Genomics (AREA)
  • Chemical & Material Sciences (AREA)
  • Engineering & Computer Science (AREA)
  • Organic Chemistry (AREA)
  • Wood Science & Technology (AREA)
  • Zoology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • General Engineering & Computer Science (AREA)
  • Biotechnology (AREA)
  • Biochemistry (AREA)
  • Biomedical Technology (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Microbiology (AREA)
  • Medicinal Chemistry (AREA)
  • Cell Biology (AREA)
  • Physics & Mathematics (AREA)
  • Biophysics (AREA)
  • Plant Pathology (AREA)
  • Breeding Of Plants And Reproduction By Means Of Culturing (AREA)

Abstract

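The present invention discloses a set of adenine base editors and related biological materials and applications thereof. The invention provides the use of a fusion protein in plant single-base editing. The fusion protein, named TadA‑R‑cas, contains a Cas protein and an adenosine deaminase; the adenosine deaminase is the protein whose amino acid sequence is positions 1‑167 of SEQ ID No.2. The invention is applicable not only to adenine base editors containing SpCas9 but also to those containing ScCas9, SpCas9‑NG, and SpRY, which broadens the scope of site-directed editing of plant genomes and gives plant researchers an important tool for studying gene function. The invention improves editing efficiency, mediates precise mutation of target sites, and is broadly applicable in rice cells.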

Description

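A set of adenine base editors and related biological materials and applications thereof
Technical Field
The present invention relates to a set of adenine base editors, and related biological materials and applications thereof, in the field of gene editing technology.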
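Background Art
The CRISPR/Cas9 system is a recently developed engineered nuclease technology. It is a complex of an sgRNA (single guide RNA) and the Cas9 protein, and the genome editing it mediates has become one of the most powerful tools in molecular biology: an emerging, relatively precise genetic engineering technique for modifying specific target genes in an organism's genome. Guided by the sgRNA, CRISPR/Cas9 uses its own endonuclease activity to create sequence-specific DNA double-strand breaks (DSBs) at the target site in the genome, which the organism then repairs by one of two pathways: non-homologous end joining (NHEJ) or homology-directed repair (HDR). Most mutations produced by the NHEJ pathway are nucleotide insertions or deletions that cause frameshifts, whereas HDR repairs the DSB through fragment insertion or nucleotide correction mediated by a homologous donor DNA; in either case the repair process gives rise to a gene mutation.
Base editing is a precise and efficient genome editing technology developed on the basis of the CRISPR/Cas9 system. It irreversibly replaces one target base at a specific genomic site with another base. In crop breeding and gene function research, many important agronomic traits, such as those conferred by disease resistance genes and herbicide resistance genes, are caused by single-base point mutations, so the gene knockout technology mediated by the CRISPR/Cas9 system is of very limited use. The advent of single-base editing systems overcomes this difficulty and provides strong technical support for correcting defective crop genes and for precision molecular breeding.
The adenine base editor (ABE) is one of the base editing technologies used in plants; it achieves the targeted replacement of adenine (A) with guanine (G). In principle, the nickase Cas9(D10A) (also called Cas9n) is fused to an adenosine deaminase (for example TadA7.10, a mutant of the Escherichia coli tRNA adenosine deaminase TadA). Guided by the sgRNA, the fusion protein binds the target site and deaminates the target base A within the base editing activity window to hypoxanthine (inosine, I), which is then replaced by G through DNA repair and replication, giving a targeted A-to-G substitution (A>G) (Yan Fang, Kuang Yongjie, Ren Bin, et al. High-efficient A·T to G·C base editing by Cas9n-guided tRNA adenosine deaminase in rice. Molecular Plant, 2018, 11: 631-634.). At present, plant adenine base editing is still at the first-generation stage: the adenosine deaminase in use is still TadA7.10, the one adopted when the technology was first established. Its base substitution efficiency for target adenines in plants is low, and in practice a large number of target sites that meet the conditions for base editing (the target base lies within the activity window and a suitable PAM sequence is present) still cannot be edited. Developing an efficient plant adenine base editing technology is therefore of great significance for gene function research and for correcting defective crop genes.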
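Summary of the Invention
The technical problem to be solved by the present invention is how to improve the efficiency of adenine base editing in plants, and how to achieve base editing at targets where current technology fails to produce the expected adenine base edit.
To solve the above technical problem, the present invention provides the use of a fusion protein in plant single-base editing.
In this use, the fusion protein is named TadA-R-cas and contains a Cas protein and an adenosine deaminase; the adenosine deaminase is the protein whose amino acid sequence is positions 1-167 of SEQ ID No.2, and it is named TadA-R.
In the above use, the Cas protein may be ScCas9(D10A), SpRY(D10A), SpCas9(D10A), or SpCas9-NG(D10A).
In the above use, SpCas9(D10A) is the protein whose amino acid sequence is positions 200-1567 of SEQ ID No.2, SpCas9-NG(D10A) is the protein whose amino acid sequence is positions 200-1567 of SEQ ID No.4, ScCas9(D10A) is the protein whose amino acid sequence is positions 200-1574 of SEQ ID No.6, and SpRY(D10A) is the protein whose amino acid sequence is positions 200-1567 of SEQ ID No.8.
In the above use, the fusion protein may be a protein formed by linking the adenosine deaminase, the Cas protein, and a nuclear localization signal (NLS).
In the above use, the fusion protein may specifically be TadA-R-ScCas9(D10A), TadA-R-SpRY(D10A), TadA-R-SpCas9(D10A), or TadA-R-SpCas9-NG(D10A). TadA-R-SpCas9(D10A) is the protein whose amino acid sequence is SEQ ID No.2, TadA-R-SpCas9-NG(D10A) is the protein whose amino acid sequence is SEQ ID No.4, TadA-R-ScCas9(D10A) is the protein whose amino acid sequence is SEQ ID No.6, and TadA-R-SpRY(D10A) is the protein whose amino acid sequence is SEQ ID No.8.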
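The use of biological materials related to the fusion protein TadA-R-cas in plant single-base editing (single-base editing of plant genomes) also falls within the scope of protection of the present invention. The biological material may be any one of the following:
C1) a DNA molecule encoding the fusion protein TadA-R-cas;
C2) an expression cassette containing the DNA molecule of C1);
C3) a recombinant vector containing the DNA molecule of C1);
C4) a recombinant microorganism containing the DNA molecule of C1);
C5) a recombinant vector containing the expression cassette of C2);
C6) a recombinant microorganism containing the expression cassette of C2);
C7) a recombinant microorganism containing the recombinant vector of C3).
In the above use, the DNA molecule of C1) contains the gene encoding the adenosine deaminase, and the nucleotide sequence of that gene is nucleotides 7-507 of SEQ ID No.1.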
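As a quick consistency check on these coordinates, the following Python sketch translates the first 54 nucleotides of the TadA-R coding region (positions 7-60 of SEQ ID No.1, copied from the sequence listing) and compares the result with the first 18 residues of SEQ ID No.2. The function name `translate` and the compact codon-table construction are illustrative choices and are not part of the patent.

```python
# Standard genetic code, built in the conventional T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(cds: str) -> str:
    """Translate an in-frame DNA coding sequence into one-letter amino acids."""
    cds = cds.upper().replace(" ", "")
    if len(cds) % 3:
        raise ValueError("coding sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))


# First 60 nt of SEQ ID No.1 as printed in the sequence listing.
SEQ1_HEAD = "ggatccatgt cagaagtcga gttctcccat gagtattgga tgaggcacgc cctcactctt"

head = SEQ1_HEAD.replace(" ", "")
bamhi_site = head[:6]      # positions 1-6: BamHI recognition site
tada_r_start = head[6:]    # positions 7-60: start of the TadA-R CDS

print(bamhi_site.upper())
print(translate(tada_r_start))
```

The translated residues can be compared by eye with the three-letter listing of SEQ ID No.2 (Met Ser Glu Val Glu Phe Ser His Glu Tyr Trp Met Arg His Ala Leu Thr Leu).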
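In the above use, the DNA molecule of C1) may be the gene encoding TadA-R-ScCas9(D10A), the gene encoding TadA-R-SpRY(D10A), the gene encoding TadA-R-SpCas9(D10A), or the gene encoding TadA-R-SpCas9-NG(D10A). The coding sequence (CDS) of the coding strand is SEQ ID No.1 for the TadA-R-SpCas9(D10A) gene, SEQ ID No.3 for the TadA-R-SpCas9-NG(D10A) gene, SEQ ID No.5 for the TadA-R-ScCas9(D10A) gene, and SEQ ID No.7 for the TadA-R-SpRY(D10A) gene.
In the above use, the expression cassette refers to DNA capable of expressing the fusion protein in a host cell (such as a plant cell). This DNA may include not only a promoter that initiates transcription of the fusion protein gene but also a terminator that ends its transcription. The expression cassette may further include an enhancer sequence. Promoters that can be used in the present invention include, but are not limited to, constitutive promoters; tissue-, organ-, and development-specific promoters; and inducible promoters. Examples of promoters include, but are not limited to: the maize Ubiquitin promoter; the constitutive 35S promoter of cauliflower mosaic virus; the wound-inducible promoter from tomato, leucine aminopeptidase ("LAP", Chao et al. (1999) Plant Physiology 120:979-992); the chemically inducible promoter from tobacco, pathogenesis-related protein 1 (PR1) (induced by salicylic acid and BTH (benzothiadiazole-7-carbothioic acid S-methyl ester)); the tomato proteinase inhibitor II promoter (PIN2) or the LAP promoter (both inducible by methyl jasmonate); heat shock promoters (US Patent 5,187,267); tetracycline-inducible promoters (US Patent 5,057,422); and seed-specific promoters, such as the foxtail millet seed-specific promoter pF128 (CN101063139B (Chinese patent 2007 1 0099169.7)) and seed storage protein-specific promoters (for example, the promoters of phaseolin, napin, oleosin, and soybean beta conglycin (Beachy et al. (1985) EMBO J. 4:3047-3053)). They may be used alone or in combination with other plant promoters. All references cited herein are incorporated in their entirety. Suitable transcription terminators include, but are not limited to: the Agrobacterium nopaline synthase terminator (NOS terminator), the cauliflower mosaic virus CaMV 35S terminator, the tml terminator, the pea rbcS E9 terminator, and the nopaline and octopine synthase terminators (see, for example: Odell et al. (1985) Nature 313:810; Rosenberg et al. (1987) Gene, 56:125; Guerineau et al. (1991) Mol. Gen. Genet, 262:141; Proudfoot (1991) Cell, 64:671; Sanfacon et al. Genes Dev., 5:141; Mogen et al. (1990) Plant Cell, 2:1261; Munroe et al. (1990) Gene, 91:151; Ballad et al. (1989) Nucleic Acids Res. 17:7891; Joshi et al. (1987) Nucleic Acid Res., 15:9627).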
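In one embodiment of the present invention, the expression cassette is formed by linking the Ubip promoter (nucleotide sequence SEQ ID No.9), the gene encoding the fusion protein TadA-R-cas (the rBE46b gene, whose coding-strand CDS is positions 7-4737 of SEQ ID No.1; the rBE50 gene, whose coding-strand CDS is positions 7-4737 of SEQ ID No.3; the rBE54 gene, whose coding-strand CDS is positions 7-4758 of SEQ ID No.5; or the rBE62 gene, whose coding-strand CDS is positions 7-4737 of SEQ ID No.7), and the NOS terminator (nucleotide sequence SEQ ID No.10).
In SEQ ID No.1, positions 1-6 are the BamHI recognition site, positions 7-507 the CDS of TadA-R, positions 508-603 the CDS of the linker peptide, positions 604-4707 the CDS of SpCas9(D10A), positions 4708-4734 the CDS of the NLS, positions 4735-4737 the stop codon TGA, and positions 4738-4743 the BcuI recognition site. In SEQ ID No.3, positions 1-6 are the BamHI recognition site, positions 7-507 the CDS of TadA-R, positions 508-603 the CDS of the linker peptide, positions 604-4707 the CDS of SpCas9-NG(D10A), positions 4708-4734 the CDS of the NLS, positions 4735-4737 the stop codon TGA, and positions 4738-4743 the BcuI recognition site. In SEQ ID No.5, positions 1-6 are the BamHI recognition site, positions 7-507 the CDS of TadA-R, positions 508-603 the CDS of the linker peptide, positions 604-4728 the CDS of ScCas9(D10A), positions 4729-4755 the CDS of the NLS, positions 4756-4758 the stop codon TGA, and positions 4759-4764 the BcuI recognition site. In SEQ ID No.7, positions 1-6 are the BamHI recognition site, positions 7-507 the CDS of TadA-R, positions 508-603 the CDS of the linker peptide, positions 604-4707 the CDS of SpRY(D10A), positions 4708-4734 the CDS of the NLS, positions 4735-4737 the stop codon TGA, and positions 4738-4743 the BcuI recognition site.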
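The nucleotide coordinates listed here can be cross-checked against the amino acid coordinates given elsewhere in the text for SEQ ID No.2 and SEQ ID No.6. The sketch below is only an illustration: the feature table is copied from the paragraph above, while the helper names `is_contiguous` and `nt_to_aa` are hypothetical.

```python
# Feature coordinates of SEQ ID No.1 as stated in the text (1-based, inclusive).
SEQ1_FEATURES = [
    ("BamHI site", 1, 6),
    ("TadA-R CDS", 7, 507),
    ("linker CDS", 508, 603),
    ("SpCas9(D10A) CDS", 604, 4707),
    ("NLS CDS", 4708, 4734),
    ("stop codon", 4735, 4737),
    ("BcuI site", 4738, 4743),
]
CDS_START = 7  # first nucleotide of the start codon in every rBE gene


def is_contiguous(features, total_length):
    """True if the features tile positions 1..total_length with no gap or overlap."""
    expected = 1
    for _name, start, end in features:
        if start != expected or end < start:
            return False
        expected = end + 1
    return expected == total_length + 1


def nt_to_aa(start, end, cds_start=CDS_START):
    """Map an in-frame nucleotide range to the amino acid range it encodes."""
    if (start - cds_start) % 3 or (end - start + 1) % 3:
        raise ValueError("range is not aligned to the reading frame")
    return (start - cds_start) // 3 + 1, (end - cds_start + 1) // 3


print(is_contiguous(SEQ1_FEATURES, 4743))
for name, start, end in SEQ1_FEATURES[1:5]:
    print(name, nt_to_aa(start, end))
```

Applied to SEQ ID No.5, the same mapping gives residues 200-1574 for the ScCas9(D10A) CDS (positions 604-4728) and 1575-1583 for the NLS CDS (positions 4729-4755), which agrees with the coordinates stated for SEQ ID No.6.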
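In the above use, the recombinant microorganism may specifically be a bacterium, yeast, alga, or fungus.
To solve the above technical problem, the present invention also provides a method for the site-directed mutation of A to G in a plant genome.
The method comprises the following step: a DNA molecule expressing the fusion protein and an sgRNA is introduced into a recipient plant to obtain a target plant carrying the site-directed A-to-G mutation. The target sequence of the sgRNA is 5′-N19-20PAM-3′, where N19-20 denotes 19-20 N, the PAM (protospacer adjacent motif) is 3 N, and N is A, G, C, or T.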
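The 5′-N19-20PAM-3′ layout can be illustrated with a small scanner that lists candidate target sites on one strand. This is a minimal sketch under stated assumptions: the function `find_targets`, the partial IUPAC table, and the choice to scan only the given strand are illustrative and not taken from the patent. The example input is OsGS1 target sequence 1, which the text reports for an NAG PAM.

```python
import re

# Subset of IUPAC nucleotide codes, sufficient for PAM patterns such as NGG or NAG.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "N": "[ACGT]", "R": "[AG]", "Y": "[CT]"}


def pam_regex(pattern: str) -> str:
    """Expand a PAM written with IUPAC codes into a regular expression."""
    return "".join(IUPAC[base] for base in pattern.upper())


def find_targets(seq: str, pam_pattern: str, protospacer_len: int = 20):
    """Return (1-based start, protospacer, PAM) for every match on the given strand."""
    seq = seq.upper()
    # The lookahead allows overlapping candidate sites to be reported.
    rx = re.compile(r"(?=([ACGT]{%d})(%s))" % (protospacer_len, pam_regex(pam_pattern)))
    return [(m.start() + 1, m.group(1), m.group(2)) for m in rx.finditer(seq)]


osgs1_target_1 = "GCAAGAGTACACCCTCCTCCAG"  # target sequence 1 from the text
print(find_targets(osgs1_target_1, "NAG", protospacer_len=19))
print(find_targets(osgs1_target_1, "NGG", protospacer_len=19))
```

To cover both strands of a genomic fragment, the same function can be called on the reverse complement of the input.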
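When introducing the DNA molecule expressing the fusion protein and the sgRNA into the recipient plant, PEG-mediated transformation, biolistic (gene gun) delivery, or Agrobacterium infection may be used to deliver the gene editing toolkit into rice protoplasts or callus, as those skilled in the art will readily appreciate. It is well known in the art that rice genomic DNA consists of two strands, so the target nucleotide sequence may lie on either of the two complementary strands. For example, when the target nucleotide sequence lies on the sense strand of a functional gene, and replacing an A with G at a specific site of that gene yields the desired amino acid in the encoded protein, the system can be used directly: the base substitution is made on the sense strand, changing A to G in the codon and producing a rice mutant in which gene function is "corrected". Alternatively, when the target nucleotide sequence lies on the antisense strand of a functional gene, and replacing a T with C at a specific site of that gene yields the desired amino acid, the system can also be used: the A on the antisense strand is mutated to G, so that the complementary T on the sense strand is replaced by C, changing the amino acid encoded by the codon on the sense strand and likewise producing a "corrected" rice mutant.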
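The strand logic described here, where an A>G edit on the antisense strand appears as T>C on the sense strand, can be sketched in a few lines of Python. The function names and the toy codons are hypothetical and serve only to illustrate the reasoning.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def sense_after_abe(sense: str, pos: int, edited_strand: str) -> str:
    """Sense-strand sequence after one A>G edit at 1-based sense coordinate `pos`.

    edited_strand "+": the A sits on the sense strand and becomes G.
    edited_strand "-": the A sits on the antisense strand, so the sense strand
    carries T at `pos`, which ends up as C once the edit is fixed by repair.
    """
    sense = sense.upper()
    expected, new = {"+": ("A", "G"), "-": ("T", "C")}[edited_strand]
    if sense[pos - 1] != expected:
        raise ValueError(f"expected {expected} at position {pos} of the sense strand")
    return sense[:pos - 1] + new + sense[pos:]


# Toy codons (not from the patent): ATG -> GTG via a sense-strand edit,
# TAT -> CAT via an antisense-strand edit.
print(sense_after_abe("ATG", 1, "+"))
print(sense_after_abe("TAT", 1, "-"))
print(reverse_complement("TAT"))  # antisense strand of the second codon, 5' to 3'
```

In the second case the deaminase acts on the A at the 3′ end of the antisense sequence ATA, which is why the sgRNA has to be designed against the antisense strand.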
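The use of the adenosine deaminase, or of a nucleic acid molecule encoding the adenosine deaminase, in plant single-base editing also falls within the scope of protection of the present invention.
The above fusion protein and the above biological materials also fall within the scope of protection of the present invention.
In the foregoing, the plant may be a dicotyledonous or a monocotyledonous plant. The monocotyledonous plant may be rice. The single-base editing may be the replacement of adenine (A) with guanine (G).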
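The present invention provides four rice adenine base editors: 1) the fusion protein rBE46b (also called TadA-R-SpCas9(D10A)), formed by linking the adenosine deaminase TadA-R, the Cas protein SpCas9(D10A), and the nuclear localization signal NLS; 2) the fusion protein rBE50 (also called TadA-R-SpCas9-NG(D10A)), formed by linking the adenosine deaminase TadA-R, the Cas protein SpCas9-NG(D10A), and the nuclear localization signal NLS; 3) the fusion protein rBE54 (also called TadA-R-ScCas9(D10A)), formed by linking the adenosine deaminase TadA-R, the Cas protein ScCas9(D10A), and the nuclear localization signal NLS; 4) the fusion protein rBE62 (also called TadA-R-SpRY(D10A)), formed by linking the adenosine deaminase TadA-R, the Cas protein SpRY(D10A), and the nuclear localization signal NLS. The four adenine base editors rBE46b, rBE50, rBE54, and rBE62 differ only in the Cas protein. The adenosine deaminase in all four is the protein at positions 1-167 of SEQ ID No.2, named TadA-R. Compared with the unsimplified (dimerized) adenine base editor, in which the adenosine deaminase is a dimer of the wild-type adenosine deaminase wtTadA and the mutant adenosine deaminase TadA7.10, its editing efficiency is significantly higher, and no additional ecTadA molecule needs to be supplied. This finding should make future engineering and optimization of adenine base editors more convenient.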
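Experiments show that the control base editor rBE14 (wtTadA-TadA7.10-SpCas9(D10A)-NLS) edited the target base at the NGG PAM target of OsMPK6 with an efficiency of 17.65%, whereas rBE46b (TadA-R-SpCas9(D10A)) reached 60.42% at the same target; at the NGG PAM target of OsTms9, rBE14 gave 0% and rBE46b gave 64.58%. The control base editor rBE23 (wtTadA-TadA7.10-SpCas9-NG(D10A)-NLS) edited the NGA PAM target of OsSERK2 with an efficiency of 44.19%, whereas rBE50 (TadA-R-SpCas9-NG(D10A)) reached 100%; at the NGA PAM target of OsDEP2, rBE23 gave 0% and rBE50 gave 27.08%; rBE23 gave 0% at the NGT PAM target of OsWRKY45, and rBE50 gave 89.36% at the NGA PAM target of OsWRKY45. The control base editor rBE26 (wtTadA-TadA7.10-ScCas9(D10A)-NLS) edited the NAG PAM target of OsGS1 (target sequence 1: 5′-GCAAGAGTACACCCTCCTCCAG-3′) with an efficiency of 0%, whereas rBE54 (also called TadA-R-ScCas9(D10A)) reached 25.00% at the same target; at the NAG PAM target of OsGS1 (target sequence 2: 5′-GCTCACACCAACTACAGGTGAG-3′), rBE26 gave 47.50% and rBE54 gave 97.92%. The base editor rBE62 (TadA-R-SpRY(D10A)) edited the NAA PAM target of OsMPK13 with an efficiency of 29.17% and the NAT PAM target of OsGS1 with an efficiency of 93.75%. Thus, compared with the original TadA7.10-mediated adenine base editing vectors rBE14, rBE23, and rBE26, the vectors rBE46b, rBE50, and rBE54 built on the adenosine deaminase TadA-R in the present application show markedly higher target-base editing efficiency at every target site (see Table 2). Many target sites that previously could not be edited were edited as intended by the TadA-R-mediated vectors. These data indicate that TadA-R-mediated adenine base editing is far more efficient than TadA7.10-mediated adenine base editing.
The present invention is applicable not only to adenine base editors containing SpCas9(D10A) but also to those containing ScCas9(D10A), SpCas9-NG(D10A), and SpRY(D10A). It improves the efficiency of adenine base editing in plants and, in particular, solves the problem of targets that TadA7.10-mediated base editing vectors cannot edit. It broadens the scope of site-directed editing of plant genomes and provides researchers in plant science and crop genetic improvement with an important set of tools for studying and correcting gene function. The invention improves the efficiency of adenine base editing, mediates precise base mutations at target sites, and is broadly applicable in rice cells and in plant cells more generally.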
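Brief Description of the Drawings
Figure 1 shows the vector maps of pUbi-rBE, pENTR4-sgRNA, and pUbi-rBE-sgRNA.
Figure 2 shows the adenine base editing mutations mediated by rBE14 and rBE46b in the endogenous rice genes OsMPK6 and OsTms9. In the figure, ref is the corresponding sequence of the rice reference genome, WT is the corresponding sequence of the unedited japonica rice cultivar Kitaake, and the remaining sequences are the corresponding sequences of the mutant plants.
Figure 3 shows the adenine base editing mutations mediated by rBE23 and rBE50 in the endogenous rice genes OsSERK2, OsDEP2, and OsWRKY45. In the figure, ref is the corresponding sequence of the rice reference genome, WT is the corresponding sequence of the unedited japonica rice cultivar Kitaake, and the remaining sequences are the corresponding sequences of the mutant plants.
Figure 4 shows the adenine base editing mutations mediated by rBE26 and rBE54 in the endogenous rice gene OsGS1. In the figure, ref is the corresponding sequence of the rice reference genome, WT is the corresponding sequence of the unedited japonica rice cultivar Kitaake, and the remaining sequences are the corresponding sequences of the mutant plants.
Figure 5 shows the adenine base editing mutations mediated by rBE62 in the endogenous rice genes OsGS1 and OsMPK13. In the figure, ref is the corresponding sequence of the rice reference genome, WT is the corresponding sequence of the unedited japonica rice cultivar Kitaake, and the remaining sequences are the corresponding sequences of the mutant plants.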
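Detailed Description of the Embodiments
The present invention is described in further detail below with reference to specific embodiments. The examples are given only to illustrate the invention and not to limit its scope. They may serve as a guide for further improvement by a person of ordinary skill in the art and do not limit the invention in any way.
Unless otherwise specified, the experimental methods in the following examples are conventional methods, carried out according to the techniques or conditions described in the literature in this field or according to the product instructions. Unless otherwise specified, the materials and reagents used in the following examples are commercially available.
The pUbi-Cas9 used in the following examples is kept and provided by the inventors' laboratory (H. Zhou, B. Liu, D.P. Weeks, M.H. Spalding & B. Yang. Large chromosomal deletions and heritable small genetic changes induced by CRISPR/Cas9 in rice. Nucleic Acids Res. 2014, 42(17): 10903-10914). The public may obtain this biological material from the inventors' laboratory; it is to be used only for repeating the experiments of the present invention and not for any other purpose.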
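Example 1. Site-directed mutation of A to G in the rice genome
I. Construction of rice adenine base editor expression vectors
This example provides four rice adenine base editor expression vectors of the invention, collectively pUbi-rBE (Figure 1), named pUbi-rBE46b, pUbi-rBE50, pUbi-rBE54, and pUbi-rBE62. The adenine base editor expressed by pUbi-rBE46b is the fusion protein rBE46b (also called TadA-R-SpCas9(D10A)), formed by linking the adenosine deaminase TadA-R, the Cas protein SpCas9(D10A), and the nuclear localization signal NLS. The amino acid sequence of rBE46b is SEQ ID No.2 in the sequence listing. In SEQ ID No.2, positions 1-167 are the amino acid sequence of TadA-R, positions 168-199 that of the linker peptide, positions 200-1567 that of SpCas9(D10A), and positions 1568-1576 that of the NLS. The nucleotide sequence of the chimeric rBE46b gene was codon-optimized according to rice codon usage bias, and Sangon Biotech (Shanghai) Co., Ltd. was commissioned to synthesize the 4743 bp rBE46b gene. The small fragment (Cas9) between the BamHI and BcuI recognition sites of pUbi-Cas9 was replaced with the rBE46b gene shown at positions 7-4737 of SEQ ID No.1, leaving the other nucleotides of pUbi-Cas9 unchanged, to give the rBE46b expression vector pUbi-rBE46b. In SEQ ID No.1, positions 1-6 are the BamHI recognition site, positions 7-507 the CDS of TadA-R, positions 508-603 the CDS of the linker peptide, positions 604-4707 the CDS of SpCas9(D10A), positions 4708-4734 the CDS of the NLS, positions 4735-4737 the stop codon TGA, and positions 4738-4743 the BcuI recognition site. pUbi-rBE46b contains the element attR1-ccdB-attR2 for the LR reaction.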
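The adenine base editor expressed by pUbi-rBE50 is the fusion protein rBE50 (also called TadA-R-SpCas9-NG(D10A)), formed by linking the adenosine deaminase TadA-R, the Cas protein SpCas9-NG(D10A), and the nuclear localization signal NLS. The amino acid sequence of rBE50 is SEQ ID No.4 in the sequence listing. In SEQ ID No.4, positions 1-167 are the amino acid sequence of TadA-R, positions 168-199 that of the linker peptide, positions 200-1567 that of SpCas9-NG(D10A), and positions 1568-1576 that of the NLS. The nucleotide sequence of the chimeric rBE50 gene was codon-optimized according to rice codon usage bias, and Sangon Biotech (Shanghai) Co., Ltd. was commissioned to synthesize the 4743 bp rBE50 gene. The small fragment (Cas9) between the BamHI and BcuI recognition sites of pUbi-Cas9 was replaced with the rBE50 gene shown at positions 7-4737 of SEQ ID No.3, leaving the other nucleotides of pUbi-Cas9 unchanged, to give the rBE50 expression vector pUbi-rBE50. In SEQ ID No.3, positions 1-6 are the BamHI recognition site, positions 7-507 the CDS of TadA-R, positions 508-603 the CDS of the linker peptide, positions 604-4707 the CDS of SpCas9-NG(D10A), positions 4708-4734 the CDS of the NLS, positions 4735-4737 the stop codon TGA, and positions 4738-4743 the BcuI recognition site. pUbi-rBE50 contains the element attR1-ccdB-attR2 for the LR reaction.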
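The adenine base editor expressed by pUbi-rBE54 is the fusion protein rBE54 (also called TadA-R-ScCas9(D10A)), formed by linking the adenosine deaminase TadA-R, the Cas protein ScCas9(D10A), and the nuclear localization signal NLS. The amino acid sequence of rBE54 is SEQ ID No.6 in the sequence listing. In SEQ ID No.6, positions 1-167 are the amino acid sequence of TadA-R, positions 168-199 that of the linker peptide, positions 200-1574 that of ScCas9(D10A), and positions 1575-1583 that of the NLS. The nucleotide sequence of the chimeric rBE54 gene was codon-optimized according to rice codon usage bias, and Sangon Biotech (Shanghai) Co., Ltd. was commissioned to synthesize the 4764 bp rBE54 gene. The small fragment (Cas9) between the BamHI and BcuI recognition sites of pUbi-Cas9 was replaced with the rBE54 gene shown at positions 7-4758 of SEQ ID No.5, leaving the other nucleotides of pUbi-Cas9 unchanged, to give the rBE54 expression vector pUbi-rBE54. In SEQ ID No.5, positions 1-6 are the BamHI recognition site, positions 7-507 the CDS of TadA-R, positions 508-603 the CDS of the linker peptide, positions 604-4728 the CDS of ScCas9(D10A), positions 4729-4755 the CDS of the NLS, positions 4756-4758 the stop codon TGA, and positions 4759-4764 the BcuI recognition site. pUbi-rBE54 contains the element attR1-ccdB-attR2 for the LR reaction.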
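The adenine base editor expressed by pUbi-rBE62 is the fusion protein rBE62 (also called TadA-R-SpRY(D10A)), formed by linking the adenosine deaminase TadA-R, the Cas protein SpRY(D10A), and the nuclear localization signal NLS. The amino acid sequence of rBE62 is SEQ ID No.8 in the sequence listing. In SEQ ID No.8, positions 1-167 are the amino acid sequence of TadA-R, positions 168-199 that of the linker peptide, positions 200-1567 that of SpRY(D10A), and positions 1568-1576 that of the NLS. The nucleotide sequence of the chimeric rBE62 gene was codon-optimized according to rice codon usage bias, and Sangon Biotech (Shanghai) Co., Ltd. was commissioned to synthesize the 4743 bp rBE62 gene. The small fragment (Cas9) between the BamHI and BcuI recognition sites of pUbi-Cas9 was replaced with the rBE62 gene shown at positions 7-4737 of SEQ ID No.7, leaving the other nucleotides of pUbi-Cas9 unchanged, to give the rBE62 expression vector pUbi-rBE62. In SEQ ID No.7, positions 1-6 are the BamHI recognition site, positions 7-507 the CDS of TadA-R, positions 508-603 the CDS of the linker peptide, positions 604-4707 the CDS of SpRY(D10A), positions 4708-4734 the CDS of the NLS, positions 4735-4737 the stop codon TGA, and positions 4738-4743 the BcuI recognition site. pUbi-rBE62 contains the element attR1-ccdB-attR2 for the LR reaction.
pUbi-rBE46b, pUbi-rBE50, pUbi-rBE54, and pUbi-rBE62 differ only in the gene encoding the adenine base editor. The four adenine base editors rBE46b, rBE50, rBE54, and rBE62 differ only in the Cas protein.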
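The main elements of the vectors pUbi-rBE46b, pUbi-rBE50, pUbi-rBE54, and pUbi-rBE62 are as follows: the RB T-DNA repeat (positions 13973 to 13997 of GenBank accession LC506530.1, March 20, 2020); attR1 (positions 2055 to 2174 of GenBank accession KR233518.1, August 8, 2015); the ccdB expression cassette (positions 3289 to 3594 of GenBank accession KR233518.1, August 8, 2015); attR2 (positions 3635 to 3759 of GenBank accession KR233518.1, August 8, 2015); the Ubip promoter (nucleotide sequence SEQ ID No.9); the rice adenine base editor gene (the rBE46b gene (positions 7-4737 of SEQ ID No.1), the rBE50 gene (positions 7-4737 of SEQ ID No.3), the rBE54 gene (positions 7-4758 of SEQ ID No.5), or the rBE62 gene (positions 7-4737 of SEQ ID No.7)); the NOS terminator (nucleotide sequence SEQ ID No.10); the CaMV 35S promoter (positions 10382 to 11162 of GenBank accession FJ362600.1, November 26, 2008); the hygromycin gene (GenBank accession KY420085.1, July 11, 2017); the CaMV poly(A) terminator (positions 8618 to 8792 of GenBank accession MK896900.1, September 4, 2019); and the LB T-DNA repeat (positions 3569 to 3593 of GenBank accession LC506530.1, March 20, 2020).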
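This example also provides three control rice adenine base editor expression vectors: pUbi-rBE14 as the control for pUbi-rBE46b, pUbi-rBE23 as the control for pUbi-rBE50, and pUbi-rBE26 as the control for pUbi-rBE54. The adenine base editor expressed by pUbi-rBE14 is the fusion protein rBE14 (also called wtTadA-TadA7.10-SpCas9(D10A)-NLS), formed by linking the wild-type adenosine deaminase wtTadA, the mutant adenosine deaminase TadA7.10, the Cas protein SpCas9(D10A), and the nuclear localization signal NLS. The amino acid sequences of rBE46b and rBE14 differ only in that the adenosine deaminase TadA-R of rBE46b is replaced by wtTadA-TadA7.10, a protein formed by linking wtTadA and TadA7.10; all other amino acids are identical. The rBE14 gene is the DNA molecule obtained by replacing the CDS of TadA-R (positions 7-507 of SEQ ID No.1) in the rBE46b gene (positions 7-4737 of SEQ ID No.1) with the wtTadA-TadA7.10 gene shown in SEQ ID No.12, leaving the other nucleotides of SEQ ID No.1 unchanged. SEQ ID No.12 is the gene encoding the protein wtTadA-TadA7.10, and its CDS is SEQ ID No.12; in SEQ ID No.12, positions 1-501 are the CDS of wtTadA, positions 502-597 the CDS of the linker peptide, and positions 598-1095 the CDS of TadA7.10. pUbi-rBE14 is the rBE14 expression vector obtained by replacing the rBE46b gene in pUbi-rBE46b with the rBE14 gene, leaving the other nucleotides of pUbi-rBE46b unchanged.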
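The adenine base editor expressed by pUbi-rBE23 is the fusion protein rBE23 (also called wtTadA-TadA7.10-SpCas9-NG(D10A)-NLS), formed by linking the wild-type adenosine deaminase wtTadA, the mutant adenosine deaminase TadA7.10, the Cas protein SpCas9-NG(D10A), and the nuclear localization signal NLS. The amino acid sequences of rBE50 and rBE23 differ only in that the adenosine deaminase TadA-R of rBE50 is replaced by wtTadA-TadA7.10, a protein formed by linking wtTadA and TadA7.10; all other amino acids are identical. The rBE23 gene is the DNA molecule obtained by replacing the CDS of TadA-R (positions 7-507 of SEQ ID No.3) in the rBE50 gene (positions 7-4737 of SEQ ID No.3) with the wtTadA-TadA7.10 gene shown in SEQ ID No.12, leaving the other nucleotides of SEQ ID No.3 unchanged. SEQ ID No.12 is the gene encoding the protein wtTadA-TadA7.10, and its CDS is SEQ ID No.12; in SEQ ID No.12, positions 1-501 are the CDS of wtTadA, positions 502-597 the CDS of the linker peptide, and positions 598-1095 the CDS of TadA7.10. pUbi-rBE23 is the rBE23 expression vector obtained by replacing the rBE50 gene in pUbi-rBE50 with the rBE23 gene, leaving the other nucleotides of pUbi-rBE50 unchanged.
The adenine base editor expressed by pUbi-rBE26 is the fusion protein rBE26 (also called wtTadA-TadA7.10-ScCas9(D10A)-NLS), formed by linking the wild-type adenosine deaminase wtTadA, the mutant adenosine deaminase TadA7.10, the Cas protein ScCas9(D10A), and the nuclear localization signal NLS. The amino acid sequences of rBE54 and rBE26 differ only in that the adenosine deaminase TadA-R of rBE54 is replaced by wtTadA-TadA7.10, a protein formed by linking wtTadA and TadA7.10; all other amino acids are identical. The rBE26 gene is the DNA molecule obtained by replacing the CDS of TadA-R (positions 7-507 of SEQ ID No.5) in the rBE54 gene (positions 7-4758 of SEQ ID No.5) with the wtTadA-TadA7.10 gene shown in SEQ ID No.12, leaving the other nucleotides of SEQ ID No.5 unchanged. SEQ ID No.12 is the gene encoding the protein wtTadA-TadA7.10, and its CDS is SEQ ID No.12; in SEQ ID No.12, positions 1-501 are the CDS of wtTadA, positions 502-597 the CDS of the linker peptide, and positions 598-1095 the CDS of TadA7.10. pUbi-rBE26 is the rBE26 expression vector obtained by replacing the rBE54 gene in pUbi-rBE54 with the rBE26 gene, leaving the other nucleotides of pUbi-rBE54 unchanged.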
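II. A>G replacement of target bases in endogenous rice genes using the rice adenine base editor expression vectors
1. Construction of the gene editing vectors pUbi-rBE-sgRNA for the target sequences
The genomic DNA sequences of the selected target genes (see Table 1) were obtained from the rice genome database (https://rapdb.dna.affrc.go.jp/), and target sequences were designed according to the PAM requirement of each base editor. The forward and reverse oligonucleotides for each target sequence in Table 1 (5′-N19-20PAM-3′; see Table 1 for the sequences) were synthesized by Sangon Biotech (Shanghai) Co., Ltd., phosphorylated with T4 polynucleotide kinase, and annealed to form double-stranded DNA fragments (containing the 5′-N19-20-3′ portion of the sgRNA target sequence). Each double-stranded DNA fragment was cloned into the two BtgZI sites or the two BsaI sites of the vector pENTR4-sgRNA (Figure 1; contains attL1-sgRNA expression cassette-attL2). After sequencing with primer U6p-F1 (5′-AAGAACGAACTAAGCCGGAC-3′) confirmed that the insert (containing the 5′-N19-20-3′ portion of the sgRNA target sequence) was entirely correct, the resulting plasmid was linearized with AatII, and the sgRNA expression cassette (containing the DNA encoding the sgRNA) was cloned by the Gateway LR reaction into the attR1-ccdB-attR2 site of the rice adenine base editor expression vector pUbi-rBE (Figure 1), giving the gene editing vector pUbi-rBE-sgRNA (Figure 1) for each target sequence. pUbi-rBE-sgRNA is the recombinant expression vector obtained by replacing the element attR1-ccdB-attR2 of pUbi-rBE with attB1-sgRNA expression cassette-attB2, leaving the other nucleotides of pUbi-rBE unchanged. Two base editing vectors targeting OsMPK6 were obtained: pUbi-rBE14-sgRNA-OsMPK6 and pUbi-rBE46b-sgRNA-OsMPK6. Two targeting OsTms9 were obtained: pUbi-rBE14-sgRNA-OsTms9 and pUbi-rBE46b-sgRNA-OsTms9. Two targeting OsSERK2 were obtained: pUbi-rBE50-sgRNA-OsSERK2 and pUbi-rBE23-sgRNA-OsSERK2. Two targeting OsWRKY45 were obtained: pUbi-rBE50-sgRNA-OsWRKY45 and pUbi-rBE23-sgRNA-OsWRKY45. Two targeting OsDEP2 were obtained: pUbi-rBE50-sgRNA-OsDEP2 and pUbi-rBE23-sgRNA-OsDEP2. Two targeting target sequence 1 of OsGS1 (5′-GCAAGAGTACACCCTCCTCCAG-3′) were obtained: pUbi-rBE54-sgRNA-OsGS1-1 and pUbi-rBE26-sgRNA-OsGS1-1. Two targeting target sequence 2 of OsGS1 (5′-GCTCACACCAACTACAGGTGAG-3′) were obtained: pUbi-rBE54-sgRNA-OsGS1-2 and pUbi-rBE26-sgRNA-OsGS1-2. One base editing vector targeting OsGS1 was obtained with rBE62: pUbi-rBE62-sgRNA-OsGS1. One base editing vector targeting OsMPK13 was obtained: pUbi-rBE62-sgRNA-OsMPK13.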
Table 1. Target nucleotide sequences of each target gene and their detection primers
Figure BDA0002687273060000091
Figure BDA0002687273060000101
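Note: In the oligonucleotides listed in Table 1 for synthesizing the double-stranded DNA fragments, the uppercase letters correspond to N19-20 in attB1-sgRNA expression cassette-attB2, the lowercase letters gtgt correspond to the BsaI site, and the lowercase letters tgtt correspond to the BtgZI site.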
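pENTR4-sgRNA was constructed as follows:
In the 5′ to 3′ direction, U6 promoter sequence 1, a nucleotide sequence containing two BtgZI sites, the sgRNA scaffold sequence, a (T)8 termination sequence, U6 promoter sequence 2, a nucleotide sequence containing two BsaI sites, the sgRNA scaffold sequence, and a (T)8 termination sequence were joined in order to form the sgRNA expression cassette, which was synthesized by Sangon Biotech (Shanghai) Co., Ltd. Using the synthesized gene as template, a 1 kb sgRNA expression cassette fragment (nucleotide sequence SEQ ID No.11 in the sequence listing) was amplified with the primer pair sgRNA-F (5′-GCAGGCTGTCGACTGGATCCAAGCTTAAGAACGAACTAAGCC-3′) and sgRNA-R1 (5′-CAAGAAAGCTGGGTGAATTCGATATCAAGCTTATCGATACCG-3′). Using the pENTR4 vector (Invitrogen) as template, the 2.2 kb pENTR4 backbone (the DNA fragment obtained by removing the ccdB expression cassette from pENTR4) was amplified with pENTR4-F1 (5′-CGAATTCACCCAGCTTTCTTGTACAAAGTTGGCATTATAAGA-3′) and pENTR4-R1 (5′-CTTAGTTCGTTCTTAAGCTTGGATCCAGTCGACAGCCTGCTTTTTTGTACAAAGT-3′). The sgRNA expression cassette fragment and the pENTR4 backbone were joined by in-fusion ligation using the ClonExpress II One Step Cloning Kit (Nanjing Vazyme Biotech Co., Ltd.) to give the vector pENTR4-sgRNA (Figure 1). Its two BtgZI sites or two BsaI sites are used for cloning the recognition sequence of a specific gene (the 5′-N19-20-3′ portion of the sgRNA target sequence). In SEQ ID No.11, positions 27-348 are U6 promoter sequence 1, positions 349-389 the nucleotide fragment containing two BtgZI sites, positions 390-465 the sgRNA scaffold sequence, positions 466-473 the (T)8 termination sequence, positions 474-782 U6 promoter sequence 2, positions 783-806 the nucleotide fragment containing two BsaI sites, positions 807-882 the sgRNA scaffold sequence, and positions 883-890 the (T)8 termination sequence.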
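Homology-based one-step cloning depends on the insert and the backbone sharing matching ends. The sketch below checks this for the four primers quoted above by measuring how long the 5′ stretch of each insert primer is that is exactly complementary to the 5′ stretch of the corresponding backbone primer. The helper `shared_end` is a hypothetical name, and the check is an illustration rather than the inventors' own design procedure.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def shared_end(primer_a: str, primer_b: str) -> int:
    """Length of the longest 5' stretch of primer_a that equals the reverse
    complement of the 5' stretch of primer_b of the same length."""
    for k in range(min(len(primer_a), len(primer_b)), 0, -1):
        if reverse_complement(primer_b[:k]) == primer_a[:k]:
            return k
    return 0


# Primer sequences copied from the text.
SGRNA_F = "GCAGGCTGTCGACTGGATCCAAGCTTAAGAACGAACTAAGCC"
SGRNA_R1 = "CAAGAAAGCTGGGTGAATTCGATATCAAGCTTATCGATACCG"
PENTR4_F1 = "CGAATTCACCCAGCTTTCTTGTACAAAGTTGGCATTATAAGA"
PENTR4_R1 = "CTTAGTTCGTTCTTAAGCTTGGATCCAGTCGACAGCCTGCTTTTTTGTACAAAGT"

print(shared_end(SGRNA_F, PENTR4_R1))   # junction upstream of the cassette
print(shared_end(SGRNA_R1, PENTR4_F1))  # junction downstream of the cassette
```

Both junctions carry a shared stretch at the fragment ends, which is what allows the two PCR products to be joined without restriction digestion.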
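2. Agrobacterium-mediated stable genetic transformation of rice
2.1 Rice callus induction:
Dehusked mature seeds of the japonica rice cultivar Kitaake were treated with 50% commercial 84 disinfectant for 45 min and rinsed 3-5 times with sterile water. The seeds were then transferred to a sterile Petri dish and the excess water was removed. The seeds were placed on MSD solid medium (solutes: 4.43 g/L MS powder, 30 g/L sucrose, 2 ml/L 2,4-D, 8 g/L phytagel; solvent: water; pH 5.7) and cultured in a lighted culture room for 10 days to induce callus formation. The embryo and shoot of each seed were removed, and the callus was transferred to a fresh MSD plate and cultured for 5 days before Agrobacterium transformation.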
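Because the media are specified per litre, preparing a different batch size is a simple proportional calculation. The sketch below scales the MSD solid medium recipe quoted above; the dictionary keys and the function `scale_recipe` are illustrative names, and the concentration of the 2,4-D solution is not stated in the text, so it is carried through only as a volume.

```python
# MSD solid medium, amounts per litre as given in the text.
MSD_SOLID_PER_LITRE = {
    "MS powder (g)": 4.43,
    "sucrose (g)": 30.0,
    "2,4-D solution (ml)": 2.0,   # stock concentration not stated in the text
    "phytagel (g)": 8.0,
}


def scale_recipe(per_litre: dict, volume_l: float) -> dict:
    """Scale a per-litre recipe to the requested final volume in litres."""
    if volume_l <= 0:
        raise ValueError("volume must be positive")
    return {name: round(amount * volume_l, 3) for name, amount in per_litre.items()}


for name, amount in scale_recipe(MSD_SOLID_PER_LITRE, 0.5).items():
    print(f"{name}: {amount}")
```

The pH (5.7) is adjusted on the final solution and does not scale with volume.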
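2.2 Agrobacterium transformation:
The gene editing vector pUbi-rBE-sgRNA for each target sequence from step 1 was introduced by electroporation into electrocompetent cells of Agrobacterium EHA105 (Beijing Biomed Gene Technology Co., Ltd.). The resulting Agrobacterium strains were cultured overnight for 12 hours at room temperature in TY liquid medium (solutes: 5 g/L tryptone, 3 g/L yeast extract; solvent: water; pH 7.0). The cells were collected by centrifugation and resuspended to OD600nm = 0.2 in MSD liquid medium containing 100 μM acetosyringone (MSD liquid medium, with solutes 4.43 g/L MS powder, 30 g/L sucrose, 2 ml/L 2,4-D, solvent water, pH 5.7, to which acetosyringone was added to 100 μM), ready for use.
2.3 Agrobacterium infection of rice callus:
The calli were placed in the above Agrobacterium suspensions. After soaking for 30 min, the suspension was removed, the calli were transferred to sterile absorbent paper to remove excess bacterial liquid, and then transferred to MSD plates containing 100 μM acetosyringone and cultured in the dark at room temperature for 3 days.
2.4 Selection of resistant rice callus:
After the dark culture, the calli were transferred to MSD selection medium (MSD solid medium to which Timentin and hygromycin B were added to 100 mg/L and 50 mg/L, respectively) and cultured until bright yellow resistant callus appeared on the surface of the old brown callus; the medium was changed every 2 weeks.
2.5 Differentiation and rooting of resistant callus:
The resistant calli were transferred to regeneration medium (solutes: 4.43 g/L MS powder, 30 g/L sucrose, 25 g/L sorbitol, 0.5 mg/L NAA, 3 mg/L 6-BA, 100 mg/L Timentin, 50 mg/L hygromycin B, 12 g/L agar powder; solvent: water; pH 5.7) and cultured until shoots differentiated, with the medium changed every 7-10 days. The shoots were transferred to 1/2 MS medium (solutes: 2.21 g/L MS powder, 15 g/L sucrose, 8 g/L phytagel; solvent: water; pH 5.7) to root and grow into seedlings, giving T0 transgenic rice.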
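2.6 Editing efficiency at each gene target site in T0 transgenic rice
Genomic DNA was extracted from T0 transgenic rice seedlings. Specific PCR primers were designed for the target nucleotide sequence of each gene and synthesized by Sangon Biotech (Shanghai) Co., Ltd. The genomic DNA of each sample was amplified with the specific PCR primers (see Table 1), and the PCR products were Sanger-sequenced by Sangon Biotech (Shanghai) Co., Ltd. The sequencing results were as follows.
The target-base editing efficiency of rBE14 at the NGG PAM target of OsMPK6 was 17.65%, and that of rBE46b at the same target was 60.42%. Of 68 T0 plants transformed with pUbi-rBE14-sgRNA-OsMPK6, 12 had an adenine A deaminated and replaced by guanine G, in every case the A at position 6 of the target sequence counted 5′ to 3′ (T6 in Figure 2). Of 48 T0 plants transformed with pUbi-rBE46b-sgRNA-OsMPK6, 29 had an A replaced by G; the A at positions 4 and 6 of the target sequence could be edited, with 15 plants edited at position 4 (T4 in Figure 2) and 29 plants edited at position 6 (T6 in Figure 2).
The target-base editing efficiency of rBE14 at the NGG PAM target of OsTms9 was 0%, and that of rBE46b at the same target was 64.58%. Of 54 T0 plants transformed with pUbi-rBE14-sgRNA-OsTms9, none had an A replaced by G. Of 48 T0 plants transformed with pUbi-rBE46b-sgRNA-OsTms9, 31 had an A replaced by G, in every case the A at position 6 of the target sequence counted 5′ to 3′ (T6 in Figure 2).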
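The target-base editing efficiency of rBE23 at the NGA PAM target of OsSERK2 was 44.19%, and that of rBE50 at the same target was 100%. Of 43 T0 plants transformed with pUbi-rBE23-sgRNA-OsSERK2, 19 had an A replaced by G, in every case the A at position 6 of the target sequence counted 5′ to 3′ (T6 in Figure 3). Of 48 T0 plants transformed with pUbi-rBE50-sgRNA-OsSERK2, all 48 had A replaced by G, in every case at both position 6 and position 8 of the target sequence simultaneously (T6 and T8 in Figure 3).
The target-base editing efficiency of rBE23 at the NGA PAM target of OsDEP2 was 0%, and that of rBE50 at the same target was 27.08%. Of 96 T0 plants transformed with pUbi-rBE23-sgRNA-OsDEP2, none had an A replaced by G. Of 48 T0 plants transformed with pUbi-rBE50-sgRNA-OsDEP2, 13 had an A replaced by G; the A at positions 5 and 7 of the target sequence could be edited, with 10 plants edited at position 5 (A5 in Figure 3) and 13 plants edited at position 7 (A7 in Figure 3).
The target-base editing efficiency of rBE23 at the NGT PAM target of OsWRKY45 was 0%, and that of rBE50 at the NGA PAM target of OsWRKY45 was 89.36%. Of 52 T0 plants transformed with pUbi-rBE23-sgRNA-OsWRKY45, none had an A replaced by G. Of 47 T0 plants transformed with pUbi-rBE50-sgRNA-OsWRKY45, 42 had an A replaced by G, in every case the A at position 6 of the target sequence counted 5′ to 3′ (T6 in Figure 3).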
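The target-base editing efficiency of rBE26 at the NAG PAM target of OsGS1 (target sequence 1: 5′-GCAAGAGTACACCCTCCTCCAG-3′) was 0%, and that of rBE54 at the same target was 25%. Of 36 T0 plants transformed with pUbi-rBE26-sgRNA-OsGS1-1, none had an A replaced by G. Of 48 T0 plants transformed with pUbi-rBE54-sgRNA-OsGS1-1, 12 had an A replaced by G; the A at positions 4, 7, and 10 of the target sequence could be edited, with 3 plants edited at position 4 (A4 in Figure 4), 11 plants at position 7 (A7 in Figure 4), and 12 plants at position 10 (A10 in Figure 4).
The target-base editing efficiency of rBE26 at the NAG PAM target of OsGS1 (target sequence 2: 5′-GCTCACACCAACTACAGGTGAG-3′) was 47.50%, and that of rBE54 at the same target was 97.92%. Of 40 T0 plants transformed with pUbi-rBE26-sgRNA-OsGS1-2, 19 had an A replaced by G, in every case the A at position 6 of the target sequence counted 5′ to 3′ (A6 in Figure 4). Of 48 T0 plants transformed with pUbi-rBE54-sgRNA-OsGS1-2, 47 had an A replaced by G; the A at positions 6 and 8 of the target sequence could be edited, with 47 plants edited at position 6 (A6 in Figure 4) and 46 plants edited at position 8 (A8 in Figure 4).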
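The target-base editing efficiency of rBE62 at the NAA PAM target of OsMPK13 was 29.17%. Of 48 T0 plants transformed with pUbi-rBE62-sgRNA-OsMPK13, 14 had an A replaced by G, in every case the A at position 6 of the target sequence counted 5′ to 3′ (A6 in Figure 5).
The target-base editing efficiency of rBE62 at the NAT PAM target of OsGS1 was 93.75%. Of 48 T0 plants transformed with pUbi-rBE62-sgRNA-OsGS1, 45 had an A replaced by G; the A at positions 4 and 7 of the target sequence could be edited, with 44 plants edited at position 4 (A4 in Figure 5) and 21 plants edited at position 7 (A7 in Figure 5).
In this example, editing efficiencies were compared one-to-one at the same target sites. The results show that, compared with the original TadA7.10-mediated adenine base editing vectors rBE14, rBE23, and rBE26, the vectors rBE46b, rBE50, and rBE54 built on the adenosine deaminase TadA-R in the present application give markedly higher target-base editing efficiency at every target site (see Table 2). Many target sites that previously could not be edited were edited as intended by the TadA-R-mediated vectors. These data indicate that TadA-R-mediated adenine base editing is far more efficient than TadA7.10-mediated adenine base editing.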
Table 2. Editing efficiency of each base editing vector
Figure BDA0002687273060000121
Figure BDA0002687273060000131
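The efficiencies reported in section 2.6 are the number of edited T0 plants divided by the number of T0 plants examined. The following sketch recomputes them from the plant counts given in the text; the list layout and the function `efficiency` are illustrative, and the target labels are shorthand for the vectors named above.

```python
# (editor, target, edited T0 plants, T0 plants examined), counts from section 2.6.
T0_COUNTS = [
    ("rBE14", "OsMPK6", 12, 68),
    ("rBE46b", "OsMPK6", 29, 48),
    ("rBE14", "OsTms9", 0, 54),
    ("rBE46b", "OsTms9", 31, 48),
    ("rBE23", "OsSERK2", 19, 43),
    ("rBE50", "OsSERK2", 48, 48),
    ("rBE23", "OsDEP2", 0, 96),
    ("rBE50", "OsDEP2", 13, 48),
    ("rBE23", "OsWRKY45", 0, 52),
    ("rBE50", "OsWRKY45", 42, 47),
    ("rBE26", "OsGS1-1", 0, 36),
    ("rBE54", "OsGS1-1", 12, 48),
    ("rBE26", "OsGS1-2", 19, 40),
    ("rBE54", "OsGS1-2", 47, 48),
    ("rBE62", "OsMPK13", 14, 48),
    ("rBE62", "OsGS1", 45, 48),
]


def efficiency(edited: int, total: int) -> float:
    """Editing efficiency in percent, rounded to two decimals."""
    if total <= 0:
        raise ValueError("total must be positive")
    return round(100 * edited / total, 2)


for editor, target, edited, total in T0_COUNTS:
    print(f"{editor:7s} {target:9s} {edited:2d}/{total:2d} = {efficiency(edited, total):6.2f}%")
```

The recomputed values match the percentages stated in the text for every editor and target.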
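The present invention has been described in detail above. Those skilled in the art can practice the invention over a wide range of equivalent parameters, concentrations, and conditions without departing from its spirit and scope and without undue experimentation. Although specific examples are given, it should be understood that the invention can be further modified. In short, in accordance with the principles of the invention, this application is intended to cover any variations, uses, or improvements of the invention, including departures from the scope disclosed herein that are made using conventional techniques known in the art. Some of the essential features may be applied within the scope of the appended claims.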
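Sequence Listing
<110> Institute of Plant Protection, Chinese Academy of Agricultural Sciences
<120> A set of adenine base editors and related biological materials and applications thereof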
<130> GNCFH202383
<160> 12
<170> PatentIn version 3.5
<210> 1
<211> 4743
<212> DNA
<213> Artificial sequence
<400> 1
ggatccatgt cagaagtcga gttctcccat gagtattgga tgaggcacgc cctcactctt 60
gcgaagaggg ccagggacga gagggaggtg ccggtcggtg ctgtcctggt cttgaataac 120
agggtgatag gcgaaggttg gaacagggct attggccttc atgaccctac tgctcatgcg 180
gaaatcatgg cacttagaca ggggggcctc gttatgcaaa attaccgcct gatcgacgcc 240
actctttatg tcacatttga accatgtgtt atgtgtgcgg gcgctatgat ccattcacgc 300
ataggtcgcg tggtttttgg agttcgcaac agtaaacgtg gggctgcagg ctctctgatg 360
aacgttttga attatccggg aatgaaccat agagtcgaaa tcacagaagg gattttggca 420
gacgaatgcg cggctcttct ttgtgatttt tacagaatgc cccgccaagt gtttaatgct 480
caaaagaaag cgcagagtag catcaactcg gggggatctt ctgggggctc gtctggttcc 540
gagactcccg gaacttccga gtcggcaaca cctgaatcct ccggcggctc ttcgggcgga 600
tctgacaaaa aatactcaat tggtctggct attgggacaa actctgtggg ctgggcggta 660
attaccgacg agtacaaggt gcctagtaag aaatttaaag tgctcggaaa cactgacagg 720
cactctataa agaagaacct gatcggggca ctgcttttcg actccggaga gacggcggag 780
gcgacgcgtc tcaagcgtac cgcgcgccgc aggtacacaa gaaggaagaa taggatctgc 840
tacttgcagg aaatcttcag taacgagatg gcgaaggtcg acgatagttt ctttcatcgg 900
ttggaagaat cgttcctcgt agaggaggac aaaaagcacg agcgtcaccc aatattcggg 960
aatattgttg acgaggttgc ctaccatgag aaatatccta caatatatca cctccgtaag 1020
aagcttgtcg attcaactga taaggctgat ctcagactca tctatcttgc cctcgcacat 1080
atgattaagt ttcgtggcca cttcttgatt gaaggcgacc tcaacccgga caactcagat 1140
gttgacaagc tttttataca gctcgtccag acatataacc agctgtttga agagaatccc 1200
atcaatgcga gtggggttga tgctaacgcc attttgtccg ccaggttgtc caaatctcgc 1260
agactggaaa acctgatcgc acagcttccc ggtgaaaaga aaaacgggct cttcggcaat 1320
ctcatcgcac tgtccctcgg cctcacccca aacttcaagt ctaacttcga cctggccgag 1380
gatgcgaagc tccagctgtc aaaagataca tacgacgacg atttggacaa tctgcttgcg 1440
caaataggcg accagtatgc ggacctgttc ctggctgcca aaaatctgtc agatgcaatc 1500
ctcctgtccg atatattgcg tgtgaacacc gaaatcacga aggcaccgct tagcgcatcc 1560
atgatcaaga gatacgacga gcaccatcag gacctcacac tcctcaaggc gcttgttcgt 1620
cagcagcttc ccgagaaata taaggaaatt tttttcgatc aaagcaagaa tggatatgct 1680
ggctatattg acggtggcgc ttcgcaggag gagttctata aattcattaa gccgattctg 1740
gagaagatgg acggaacgga ggagctcctc gtcaagctta accgggaaga cctgttgcgg 1800
aagcagagga cttttgataa cggctctatt ccgcaccaaa tccatctggg tgagttgcac 1860
gcaatcttga gaagacaaga ggatttctac ccgttcctta aggataacag agagaagata 1920
gaaaaaatac tgaccttcag gataccatac tatgtgggcc cactggcgcg cggaaatagt 1980
cgtttcgcat ggatgactag aaagtccgaa gaaacgatca cgccatggaa ttttgaggaa 2040
gtggtcgaca agggcgcctc tgcccagagc ttcatcgaaa ggatgaccaa ttttgacaaa 2100
aatctgccta acgaaaaggt gcttccgaag cacagcctgt tgtatgaata cttcacagtt 2160
tataacgagc tcactaaggt caagtacgtc acggagggca tgcgtaagcc tgctttcctg 2220
tctggtgaac aaaaaaaggc gattgtggac ctccttttca agacgaaccg taaagttact 2280
gtgaagcaac tgaaagagga ttactttaag aaaattgagt gcttcgacag tgtggagatt 2340
tccggtgtcg aggaccggtt taacgccagc ctgggtacgt atcatgacct gcttaaaatt 2400
atcaaggata aagatttcct ggataatgaa gagaacgaag atatactgga ggacattgtg 2460
ttgactttga ccctcttcga ggacagagag atgattgagg aaagactgaa gacctacgca 2520
cacctttttg atgacaaggt catgaaacaa ctcaagcgcc ggcgctatac tggctggggc 2580
cggctttctc gcaagctcat caatgggatt cgggataagc aatcaggcaa gacaattttg 2640
gacttcctca aatccgacgg attcgcaaat aggaatttta tgcagctgat acatgacgac 2700
tctttgacat tcaaagaaga catacagaag gctcaggtct ccggccaagg agattctttg 2760
cacgagcata tcgctaactt ggcaggtagc cccgccataa aaaagggcat tcttcaaacg 2820
gtaaaagttg ttgacgaact cgtgaaggtt atgggccgtc ataagccgga aaacattgtt 2880
attgaaatgg ctagggaaaa tcagacgacc cagaagggac agaaaaatag cagggagcgg 2940
atgaagagaa ttgaagaggg aattaaggag cttggatctc agattcttaa ggagcaccct 3000
gtggagaaca cccaacttca gaatgaaaag ctctaccttt actaccttca aaacggccgg 3060
gatatgtacg tcgatcagga acttgacatt aaccggttga gcgattatga cgttgacgct 3120
attgtgcccc aatctttcct taaagacgac tctatcgaca ataaagtgct gacgcgcagc 3180
gataaaaatc gcggtaagtc ggataatgtc ccgtcggaag aggtggttaa aaaaatgaag 3240
aactattgga ggcaactcct gaatgccaag ctgatcactc agaggaaatt cgacaatctc 3300
accaaggcag aaaggggtgg acttagcgag ctcgacaagg ccggttttat caaaagacag 3360
ctggtggaga cacgccaaat caccaaacac gttgcccaga tcctggattc gaggatgaac 3420
acgaagtatg acgagaacga caagttgatt agggaagtca aggtcatcac tttgaagtcc 3480
aagctggtga gcgactttcg caaagacttc cagttttaca aagtcaggga aattaataac 3540
taccaccacg cccacgacgc ctaccttaac gccgtggttg gcacagcact catcaagaaa 3600
taccctaagc tcgaatctga gttcgtctat ggcgactata aggtctacga cgttagaaaa 3660
atgatcgcga aatctgagca ggaaataggc aaggcaactg ccaagtactt cttctattcc 3720
aatatcatga acttttttaa gacggagatt accctggcga atggtgagat ccgcaagcgc 3780
cctttgattg agacaaacgg agaaacagga gagatcgtat gggacaaagg gcgggacttt 3840
gctactgtta ggaaggtgct ctctatgcca caagttaaca ttgtcaaaaa aactgaagtg 3900
cagacaggtg ggtttagcaa ggaatctatc ctgccgaaga ggaactctga caagctgatc 3960
gcccgcaaga aagattggga tccgaaaaag tacggaggat tcgactcccc cacagttgcg 4020
tactccgtgc ttgtcgtggc caaagtggag aagggcaagt ctaagaagct caagagcgtc 4080
aaagagttgt tggggatcac gattatggag cggtcgtctt tcgaaaagaa tccgatagat 4140
tttctcgagg ccaagggtta taaagaagtc aagaaggatc ttatcatcaa gctccctaag 4200
tactccctct ttgagcttga aaacggacgg aaaagaatgc tggcttcagc gggtgaactt 4260
cagaagggta atgaactcgc tctgccctca aaatatgtga atttccttta cctggcatca 4320
cactatgaga agcttaaggg gtctccagag gacaacgagc agaagcaact gttcgttgaa 4380
caacacaagc actaccttga cgagattatc gagcaaatca gcgagtttag caagcgcgtt 4440
atactggcag acgcaaatct tgataaggtc cttagcgcct acaacaagca tagagacaaa 4500
cccatccggg agcaggccga gaacattatt catctcttca ccttgacgaa tcttggggcc 4560
ccggccgcgt tcaagtactt cgatactacc atagacagaa agcgctatac atcgacaaag 4620
gaagttcttg acgccacgct gatccaccaa agtataacag gcctctatga gacacgcatc 4680
gacctttcgc agttgggcgg tgaccgcccc aaaaagaaga ggaaagttgg cgggtgaact 4740
agt 4743
<210> 2
<211> 1576
<212> PRT
<213> Artificial sequence
<400> 2
Met Ser Glu Val Glu Phe Ser His Glu Tyr Trp Met Arg His Ala Leu
1 5 10 15
Thr Leu Ala Lys Arg Ala Arg Asp Glu Arg Glu Val Pro Val Gly Ala
20 25 30
Val Leu Val Leu Asn Asn Arg Val Ile Gly Glu Gly Trp Asn Arg Ala
35 40 45
Ile Gly Leu His Asp Pro Thr Ala His Ala Glu Ile Met Ala Leu Arg
50 55 60
Gln Gly Gly Leu Val Met Gln Asn Tyr Arg Leu Ile Asp Ala Thr Leu
65 70 75 80
Tyr Val Thr Phe Glu Pro Cys Val Met Cys Ala Gly Ala Met Ile His
85 90 95
Ser Arg Ile Gly Arg Val Val Phe Gly Val Arg Asn Ser Lys Arg Gly
100 105 110
Ala Ala Gly Ser Leu Met Asn Val Leu Asn Tyr Pro Gly Met Asn His
115 120 125
Arg Val Glu Ile Thr Glu Gly Ile Leu Ala Asp Glu Cys Ala Ala Leu
130 135 140
Leu Cys Asp Phe Tyr Arg Met Pro Arg Gln Val Phe Asn Ala Gln Lys
145 150 155 160
Lys Ala Gln Ser Ser Ile Asn Ser Gly Gly Ser Ser Gly Gly Ser Ser
165 170 175
Gly Ser Glu Thr Pro Gly Thr Ser Glu Ser Ala Thr Pro Glu Ser Ser
180 185 190
Gly Gly Ser Ser Gly Gly Ser Asp Lys Lys Tyr Ser Ile Gly Leu Ala
195 200 205
Ile Gly Thr Asn Ser Val Gly Trp Ala Val Ile Thr Asp Glu Tyr Lys
210 215 220
Val Pro Ser Lys Lys Phe Lys Val Leu Gly Asn Thr Asp Arg His Ser
225 230 235 240
Ile Lys Lys Asn Leu Ile Gly Ala Leu Leu Phe Asp Ser Gly Glu Thr
245 250 255
Ala Glu Ala Thr Arg Leu Lys Arg Thr Ala Arg Arg Arg Tyr Thr Arg
260 265 270
Arg Lys Asn Arg Ile Cys Tyr Leu Gln Glu Ile Phe Ser Asn Glu Met
275 280 285
Ala Lys Val Asp Asp Ser Phe Phe His Arg Leu Glu Glu Ser Phe Leu
290 295 300
Val Glu Glu Asp Lys Lys His Glu Arg His Pro Ile Phe Gly Asn Ile
305 310 315 320
Val Asp Glu Val Ala Tyr His Glu Lys Tyr Pro Thr Ile Tyr His Leu
325 330 335
Arg Lys Lys Leu Val Asp Ser Thr Asp Lys Ala Asp Leu Arg Leu Ile
340 345 350
Tyr Leu Ala Leu Ala His Met Ile Lys Phe Arg Gly His Phe Leu Ile
355 360 365
Glu Gly Asp Leu Asn Pro Asp Asn Ser Asp Val Asp Lys Leu Phe Ile
370 375 380
Gln Leu Val Gln Thr Tyr Asn Gln Leu Phe Glu Glu Asn Pro Ile Asn
385 390 395 400
Ala Ser Gly Val Asp Ala Asn Ala Ile Leu Ser Ala Arg Leu Ser Lys
405 410 415
Ser Arg Arg Leu Glu Asn Leu Ile Ala Gln Leu Pro Gly Glu Lys Lys
420 425 430
Asn Gly Leu Phe Gly Asn Leu Ile Ala Leu Ser Leu Gly Leu Thr Pro
435 440 445
Asn Phe Lys Ser Asn Phe Asp Leu Ala Glu Asp Ala Lys Leu Gln Leu
450 455 460
Ser Lys Asp Thr Tyr Asp Asp Asp Leu Asp Asn Leu Leu Ala Gln Ile
465 470 475 480
Gly Asp Gln Tyr Ala Asp Leu Phe Leu Ala Ala Lys Asn Leu Ser Asp
485 490 495
Ala Ile Leu Leu Ser Asp Ile Leu Arg Val Asn Thr Glu Ile Thr Lys
500 505 510
Ala Pro Leu Ser Ala Ser Met Ile Lys Arg Tyr Asp Glu His His Gln
515 520 525
Asp Leu Thr Leu Leu Lys Ala Leu Val Arg Gln Gln Leu Pro Glu Lys
530 535 540
Tyr Lys Glu Ile Phe Phe Asp Gln Ser Lys Asn Gly Tyr Ala Gly Tyr
545 550 555 560
Ile Asp Gly Gly Ala Ser Gln Glu Glu Phe Tyr Lys Phe Ile Lys Pro
565 570 575
Ile Leu Glu Lys Met Asp Gly Thr Glu Glu Leu Leu Val Lys Leu Asn
580 585 590
Arg Glu Asp Leu Leu Arg Lys Gln Arg Thr Phe Asp Asn Gly Ser Ile
595 600 605
Pro His Gln Ile His Leu Gly Glu Leu His Ala Ile Leu Arg Arg Gln
610 615 620
Glu Asp Phe Tyr Pro Phe Leu Lys Asp Asn Arg Glu Lys Ile Glu Lys
625 630 635 640
Ile Leu Thr Phe Arg Ile Pro Tyr Tyr Val Gly Pro Leu Ala Arg Gly
645 650 655
Asn Ser Arg Phe Ala Trp Met Thr Arg Lys Ser Glu Glu Thr Ile Thr
660 665 670
Pro Trp Asn Phe Glu Glu Val Val Asp Lys Gly Ala Ser Ala Gln Ser
675 680 685
Phe Ile Glu Arg Met Thr Asn Phe Asp Lys Asn Leu Pro Asn Glu Lys
690 695 700
Val Leu Pro Lys His Ser Leu Leu Tyr Glu Tyr Phe Thr Val Tyr Asn
705 710 715 720
Glu Leu Thr Lys Val Lys Tyr Val Thr Glu Gly Met Arg Lys Pro Ala
725 730 735
Phe Leu Ser Gly Glu Gln Lys Lys Ala Ile Val Asp Leu Leu Phe Lys
740 745 750
Thr Asn Arg Lys Val Thr Val Lys Gln Leu Lys Glu Asp Tyr Phe Lys
755 760 765
Lys Ile Glu Cys Phe Asp Ser Val Glu Ile Ser Gly Val Glu Asp Arg
770 775 780
Phe Asn Ala Ser Leu Gly Thr Tyr His Asp Leu Leu Lys Ile Ile Lys
785 790 795 800
Asp Lys Asp Phe Leu Asp Asn Glu Glu Asn Glu Asp Ile Leu Glu Asp
805 810 815
Ile Val Leu Thr Leu Thr Leu Phe Glu Asp Arg Glu Met Ile Glu Glu
820 825 830
Arg Leu Lys Thr Tyr Ala His Leu Phe Asp Asp Lys Val Met Lys Gln
835 840 845
Leu Lys Arg Arg Arg Tyr Thr Gly Trp Gly Arg Leu Ser Arg Lys Leu
850 855 860
Ile Asn Gly Ile Arg Asp Lys Gln Ser Gly Lys Thr Ile Leu Asp Phe
865 870 875 880
Leu Lys Ser Asp Gly Phe Ala Asn Arg Asn Phe Met Gln Leu Ile His
885 890 895
Asp Asp Ser Leu Thr Phe Lys Glu Asp Ile Gln Lys Ala Gln Val Ser
900 905 910
Gly Gln Gly Asp Ser Leu His Glu His Ile Ala Asn Leu Ala Gly Ser
915 920 925
Pro Ala Ile Lys Lys Gly Ile Leu Gln Thr Val Lys Val Val Asp Glu
930 935 940
Leu Val Lys Val Met Gly Arg His Lys Pro Glu Asn Ile Val Ile Glu
945 950 955 960
Met Ala Arg Glu Asn Gln Thr Thr Gln Lys Gly Gln Lys Asn Ser Arg
965 970 975
Glu Arg Met Lys Arg Ile Glu Glu Gly Ile Lys Glu Leu Gly Ser Gln
980 985 990
Ile Leu Lys Glu His Pro Val Glu Asn Thr Gln Leu Gln Asn Glu Lys
995 1000 1005
Leu Tyr Leu Tyr Tyr Leu Gln Asn Gly Arg Asp Met Tyr Val Asp
1010 1015 1020
Gln Glu Leu Asp Ile Asn Arg Leu Ser Asp Tyr Asp Val Asp Ala
1025 1030 1035
Ile Val Pro Gln Ser Phe Leu Lys Asp Asp Ser Ile Asp Asn Lys
1040 1045 1050
Val Leu Thr Arg Ser Asp Lys Asn Arg Gly Lys Ser Asp Asn Val
1055 1060 1065
Pro Ser Glu Glu Val Val Lys Lys Met Lys Asn Tyr Trp Arg Gln
1070 1075 1080
Leu Leu Asn Ala Lys Leu Ile Thr Gln Arg Lys Phe Asp Asn Leu
1085 1090 1095
Thr Lys Ala Glu Arg Gly Gly Leu Ser Glu Leu Asp Lys Ala Gly
1100 1105 1110
Phe Ile Lys Arg Gln Leu Val Glu Thr Arg Gln Ile Thr Lys His
1115 1120 1125
Val Ala Gln Ile Leu Asp Ser Arg Met Asn Thr Lys Tyr Asp Glu
1130 1135 1140
Asn Asp Lys Leu Ile Arg Glu Val Lys Val Ile Thr Leu Lys Ser
1145 1150 1155
Lys Leu Val Ser Asp Phe Arg Lys Asp Phe Gln Phe Tyr Lys Val
1160 1165 1170
Arg Glu Ile Asn Asn Tyr His His Ala His Asp Ala Tyr Leu Asn
1175 1180 1185
Ala Val Val Gly Thr Ala Leu Ile Lys Lys Tyr Pro Lys Leu Glu
1190 1195 1200
Ser Glu Phe Val Tyr Gly Asp Tyr Lys Val Tyr Asp Val Arg Lys
1205 1210 1215
Met Ile Ala Lys Ser Glu Gln Glu Ile Gly Lys Ala Thr Ala Lys
1220 1225 1230
Tyr Phe Phe Tyr Ser Asn Ile Met Asn Phe Phe Lys Thr Glu Ile
1235 1240 1245
Thr Leu Ala Asn Gly Glu Ile Arg Lys Arg Pro Leu Ile Glu Thr
1250 1255 1260
Asn Gly Glu Thr Gly Glu Ile Val Trp Asp Lys Gly Arg Asp Phe
1265 1270 1275
Ala Thr Val Arg Lys Val Leu Ser Met Pro Gln Val Asn Ile Val
1280 1285 1290
Lys Lys Thr Glu Val Gln Thr Gly Gly Phe Ser Lys Glu Ser Ile
1295 1300 1305
Leu Pro Lys Arg Asn Ser Asp Lys Leu Ile Ala Arg Lys Lys Asp
1310 1315 1320
Trp Asp Pro Lys Lys Tyr Gly Gly Phe Asp Ser Pro Thr Val Ala
1325 1330 1335
Tyr Ser Val Leu Val Val Ala Lys Val Glu Lys Gly Lys Ser Lys
1340 1345 1350
Lys Leu Lys Ser Val Lys Glu Leu Leu Gly Ile Thr Ile Met Glu
1355 1360 1365
Arg Ser Ser Phe Glu Lys Asn Pro Ile Asp Phe Leu Glu Ala Lys
1370 1375 1380
Gly Tyr Lys Glu Val Lys Lys Asp Leu Ile Ile Lys Leu Pro Lys
1385 1390 1395
Tyr Ser Leu Phe Glu Leu Glu Asn Gly Arg Lys Arg Met Leu Ala
1400 1405 1410
Ser Ala Gly Glu Leu Gln Lys Gly Asn Glu Leu Ala Leu Pro Ser
1415 1420 1425
Lys Tyr Val Asn Phe Leu Tyr Leu Ala Ser His Tyr Glu Lys Leu
1430 1435 1440
Lys Gly Ser Pro Glu Asp Asn Glu Gln Lys Gln Leu Phe Val Glu
1445 1450 1455
Gln His Lys His Tyr Leu Asp Glu Ile Ile Glu Gln Ile Ser Glu
1460 1465 1470
Phe Ser Lys Arg Val Ile Leu Ala Asp Ala Asn Leu Asp Lys Val
1475 1480 1485
Leu Ser Ala Tyr Asn Lys His Arg Asp Lys Pro Ile Arg Glu Gln
1490 1495 1500
Ala Glu Asn Ile Ile His Leu Phe Thr Leu Thr Asn Leu Gly Ala
1505 1510 1515
Pro Ala Ala Phe Lys Tyr Phe Asp Thr Thr Ile Asp Arg Lys Arg
1520 1525 1530
Tyr Thr Ser Thr Lys Glu Val Leu Asp Ala Thr Leu Ile His Gln
1535 1540 1545
Ser Ile Thr Gly Leu Tyr Glu Thr Arg Ile Asp Leu Ser Gln Leu
1550 1555 1560
Gly Gly Asp Arg Pro Lys Lys Lys Arg Lys Val Gly Gly
1565 1570 1575
<210> 3
<211> 4743
<212> DNA
<213> Artificial sequence
<400> 3
ggatccatgt cagaagtcga gttctcccat gagtattgga tgaggcacgc cctcactctt 60
gcgaagaggg ccagggacga gagggaggtg ccggtcggtg ctgtcctggt cttgaataac 120
agggtgatag gcgaaggttg gaacagggct attggccttc atgaccctac tgctcatgcg 180
gaaatcatgg cacttagaca ggggggcctc gttatgcaaa attaccgcct gatcgacgcc 240
actctttatg tcacatttga accatgtgtt atgtgtgcgg gcgctatgat ccattcacgc 300
ataggtcgcg tggtttttgg agttcgcaac agtaaacgtg gggctgcagg ctctctgatg 360
aacgttttga attatccggg aatgaaccat agagtcgaaa tcacagaagg gattttggca 420
gacgaatgcg cggctcttct ttgtgatttt tacagaatgc cccgccaagt gtttaatgct 480
caaaagaaag cgcagagtag catcaactcg gggggatctt ctgggggctc gtctggttcc 540
gagactcccg gaacttccga gtcggcaaca cctgaatcct ccggcggctc ttcgggcgga 600
tctgacaaaa aatactcaat tggtctggct attgggacaa actctgtggg ctgggcggta 660
attaccgacg agtacaaggt gcctagtaag aaatttaaag tgctcggaaa cactgacagg 720
cactctataa agaagaacct gatcggggca ctgcttttcg actccggaga gacggcggag 780
gcgacgcgtc tcaagcgtac cgcgcgccgc aggtacacaa gaaggaagaa taggatctgc 840
tacttgcagg aaatcttcag taacgagatg gcgaaggtcg acgatagttt ctttcatcgg 900
ttggaagaat cgttcctcgt agaggaggac aaaaagcacg agcgtcaccc aatattcggg 960
aatattgttg acgaggttgc ctaccatgag aaatatccta caatatatca cctccgtaag 1020
aagcttgtcg attcaactga taaggctgat ctcagactca tctatcttgc cctcgcacat 1080
atgattaagt ttcgtggcca cttcttgatt gaaggcgacc tcaacccgga caactcagat 1140
gttgacaagc tttttataca gctcgtccag acatataacc agctgtttga agagaatccc 1200
atcaatgcga gtggggttga tgctaaagcc attttgtccg ccaggttgtc caaatctcgc 1260
agactggaaa acctgatcgc acagcttccc ggtgaaaaga aaaacgggct cttcggcaat 1320
ctcatcgcac tgtccctcgg cctcacccca aacttcaagt ctaacttcga cctggccgag 1380
gatgcgaagc tccagctgtc aaaagataca tacgacgacg atttggacaa tctgcttgcg 1440
caaataggcg accagtatgc ggacctgttc ctggctgcca aaaatctgtc agatgcaatc 1500
ctcctgtccg atatattgcg tgtgaacacc gaaatcacga aggcaccgct tagcgcatcc 1560
atgatcaaga gatacgacga gcaccatcag gacctcacac tcctcaaggc gcttgttcgt 1620
cagcagcttc ccgagaaata taaggaaatt tttttcgatc aaagcaagaa tggatatgct 1680
ggctatattg acggtggcgc ttcgcaggag gagttctata aattcattaa gccgattctg 1740
gagaagatgg acggaacgga ggagctcctc gtcaagctta accgggaaga cctgttgcgg 1800
aagcagagga cttttgataa cggctctatt ccgcaccaaa tccatctggg tgagttgcac 1860
gcaatcttga gaagacaaga ggatttctac ccgttcctta aggataacag agagaagata 1920
gaaaaaatac tgaccttcag gataccatac tatgtgggcc cactggcgcg cggaaatagt 1980
cgtttcgcat ggatgactag aaagtccgaa gaaacgatca cgccatggaa ttttgaggaa 2040
gtggtcgaca agggcgcctc tgcccagagc ttcatcgaaa ggatgaccaa ttttgacaaa 2100
aatctgccta acgaaaaggt gcttccgaag cacagcctgt tgtatgaata cttcacagtt 2160
tataacgagc tcactaaggt caagtacgtc acggagggca tgcgtaagcc tgctttcctg 2220
tctggtgaac aaaaaaaggc gattgtggac ctccttttca agacgaaccg taaagttact 2280
gtgaagcaac tgaaagagga ttactttaag aaaattgagt gcttcgacag tgtggagatt 2340
tccggtgtcg aggaccggtt taacgccagc ctgggtacgt atcatgacct gcttaaaatt 2400
atcaaggata aagatttcct ggataatgaa gagaacgaag atatactgga ggacattgtg 2460
ttgactttga ccctcttcga ggacagagag atgattgagg aaagactgaa gacctacgca 2520
cacctttttg atgacaaggt catgaaacaa ctcaagcgcc ggcgctatac tggctggggc 2580
cggctttctc gcaagctcat caatgggatt cgggataagc aatcaggcaa gacaattttg 2640
gacttcctca aatccgacgg attcgcaaat aggaatttta tgcagctgat acatgacgac 2700
tctttgacat tcaaagaaga catacagaag gctcaggtca gcggccaagg agattctttg 2760
cacgagcata tcgctaactt ggcaggtagc cccgccataa aaaagggcat tcttcaaacg 2820
gtaaaagttg ttgacgaact cgtgaaggtt atgggccgtc ataagccgga aaacattgtt 2880
attgaaatgg ctagggaaaa tcagacgacc cagaagggac agaaaaatag cagggagcgg 2940
atgaagagaa ttgaagaggg aattaaggag cttggatctc agattcttaa ggagcaccct 3000
gtggagaaca cccaacttca gaatgaaaag ctctaccttt actaccttca aaacggccgg 3060
gatatgtacg tcgatcagga acttgacatt aaccggttga gcgattatga cgttgaccat 3120
attgtgcccc aatctttcct taaagacgac tctatcgaca ataaagtgct gacgcgcagc 3180
gataaaaatc gcggtaagtc ggataatgtc ccgtcggaag aggtggttaa aaaaatgaag 3240
aactattgga ggcaactcct gaatgccaag ctgatcactc agaggaaatt cgacaatctc 3300
accaaggcag aaaggggtgg acttagcgag ctcgacaagg ccggttttat caaaagacag 3360
ctggtggaga cacgccaaat caccaaacac gttgcccaga tcctggattc gaggatgaac 3420
acgaagtatg acgagaacga caagttgatt agggaagtca aggtcatcac tttgaagtcc 3480
aagctggtga gcgactttcg caaagacttc cagttttaca aagtcaggga aattaataac 3540
taccaccacg cccacgacgc ctaccttaac gccgtggttg gcacagcact catcaagaaa 3600
taccctaagc tcgaatctga gttcgtctat ggcgactata aggtctacga cgttagaaaa 3660
atgatcgcga aatctgagca ggaaataggc aaggcaactg ccaagtactt cttctattcc 3720
aatatcatga acttttttaa gacggagatt accctggcga atggtgagat ccgcaagcgc 3780
cctttgattg agacaaacgg agaaacagga gagatcgtat gggacaaagg gcgggacttt 3840
gctactgtta ggaaggtgct ctctatgcca caagttaaca ttgtcaaaaa aactgaagtg 3900
cagacaggtg ggtttagcaa ggaatctatc cgcccgaaga ggaactctga caagctgatc 3960
gcccgcaaga aagattggga cccgaaaaag tacggaggat tcgtttcccc cacagttgcg 4020
tactccgtgc ttgtcgtggc caaagtggag aagggcaagt ctaagaagct caagagcgtc 4080
aaagagttgt tggggatcac gattatggag cggtcgtctt tcgaaaagaa tccgatagat 4140
tttctcgagg ccaagggtta taaagaagtc aagaaggatc ttatcatcaa gctccctaag 4200
tactccctct ttgagcttga aaacggacgg aaaagaatgc tggcttcagc gcgctttctt 4260
cagaagggta atgaactcgc tctgccctca aaatatgtga atttccttta cctggcatca 4320
cactatgaga agcttaaggg ttctccagag gacaacgagc agaagcaact gttcgttgaa 4380
caacacaagc actaccttga cgagattatc gagcaaatca gcgagtttag caagcgcgtt 4440
atactggcag acgcaaatct tgataaggtc cttagcgcct acaacaagca tagagacaaa 4500
cccatccggg agcaggccga gaacattatt catctcttca ccttgacgaa tcttggggcc 4560
ccgcgcgcgt tcaagtactt cgatactacc atagacagaa aggtctatcg ctcgacaaag 4620
gaagttcttg acgccacgct gatccaccaa agtataacag gcctctatga gacacgcatc 4680
gacctttcgc agttgggcgg tgaccgcccc aaaaagaaga ggaaagttgg cgggtgaact 4740
agt 4743
<210> 4
<211> 1576
<212> PRT
<213> Artificial sequence
<400> 4
Met Ser Glu Val Glu Phe Ser His Glu Tyr Trp Met Arg His Ala Leu
1 5 10 15
Thr Leu Ala Lys Arg Ala Arg Asp Glu Arg Glu Val Pro Val Gly Ala
20 25 30
Val Leu Val Leu Asn Asn Arg Val Ile Gly Glu Gly Trp Asn Arg Ala
35 40 45
Ile Gly Leu His Asp Pro Thr Ala His Ala Glu Ile Met Ala Leu Arg
50 55 60
Gln Gly Gly Leu Val Met Gln Asn Tyr Arg Leu Ile Asp Ala Thr Leu
65 70 75 80
Tyr Val Thr Phe Glu Pro Cys Val Met Cys Ala Gly Ala Met Ile His
85 90 95
Ser Arg Ile Gly Arg Val Val Phe Gly Val Arg Asn Ser Lys Arg Gly
100 105 110
Ala Ala Gly Ser Leu Met Asn Val Leu Asn Tyr Pro Gly Met Asn His
115 120 125
Arg Val Glu Ile Thr Glu Gly Ile Leu Ala Asp Glu Cys Ala Ala Leu
130 135 140
Leu Cys Asp Phe Tyr Arg Met Pro Arg Gln Val Phe Asn Ala Gln Lys
145 150 155 160
Lys Ala Gln Ser Ser Ile Asn Ser Gly Gly Ser Ser Gly Gly Ser Ser
165 170 175
Gly Ser Glu Thr Pro Gly Thr Ser Glu Ser Ala Thr Pro Glu Ser Ser
180 185 190
Gly Gly Ser Ser Gly Gly Ser Asp Lys Lys Tyr Ser Ile Gly Leu Ala
195 200 205
Ile Gly Thr Asn Ser Val Gly Trp Ala Val Ile Thr Asp Glu Tyr Lys
210 215 220
Val Pro Ser Lys Lys Phe Lys Val Leu Gly Asn Thr Asp Arg His Ser
225 230 235 240
Ile Lys Lys Asn Leu Ile Gly Ala Leu Leu Phe Asp Ser Gly Glu Thr
245 250 255
Ala Glu Ala Thr Arg Leu Lys Arg Thr Ala Arg Arg Arg Tyr Thr Arg
260 265 270
Arg Lys Asn Arg Ile Cys Tyr Leu Gln Glu Ile Phe Ser Asn Glu Met
275 280 285
Ala Lys Val Asp Asp Ser Phe Phe His Arg Leu Glu Glu Ser Phe Leu
290 295 300
Val Glu Glu Asp Lys Lys His Glu Arg His Pro Ile Phe Gly Asn Ile
305 310 315 320
Val Asp Glu Val Ala Tyr His Glu Lys Tyr Pro Thr Ile Tyr His Leu
325 330 335
Arg Lys Lys Leu Val Asp Ser Thr Asp Lys Ala Asp Leu Arg Leu Ile
340 345 350
Tyr Leu Ala Leu Ala His Met Ile Lys Phe Arg Gly His Phe Leu Ile
355 360 365
Glu Gly Asp Leu Asn Pro Asp Asn Ser Asp Val Asp Lys Leu Phe Ile
370 375 380
Gln Leu Val Gln Thr Tyr Asn Gln Leu Phe Glu Glu Asn Pro Ile Asn
385 390 395 400
Ala Ser Gly Val Asp Ala Lys Ala Ile Leu Ser Ala Arg Leu Ser Lys
405 410 415
Ser Arg Arg Leu Glu Asn Leu Ile Ala Gln Leu Pro Gly Glu Lys Lys
420 425 430
Asn Gly Leu Phe Gly Asn Leu Ile Ala Leu Ser Leu Gly Leu Thr Pro
435 440 445
Asn Phe Lys Ser Asn Phe Asp Leu Ala Glu Asp Ala Lys Leu Gln Leu
450 455 460
Ser Lys Asp Thr Tyr Asp Asp Asp Leu Asp Asn Leu Leu Ala Gln Ile
465 470 475 480
Gly Asp Gln Tyr Ala Asp Leu Phe Leu Ala Ala Lys Asn Leu Ser Asp
485 490 495
Ala Ile Leu Leu Ser Asp Ile Leu Arg Val Asn Thr Glu Ile Thr Lys
500 505 510
Ala Pro Leu Ser Ala Ser Met Ile Lys Arg Tyr Asp Glu His His Gln
515 520 525
Asp Leu Thr Leu Leu Lys Ala Leu Val Arg Gln Gln Leu Pro Glu Lys
530 535 540
Tyr Lys Glu Ile Phe Phe Asp Gln Ser Lys Asn Gly Tyr Ala Gly Tyr
545 550 555 560
Ile Asp Gly Gly Ala Ser Gln Glu Glu Phe Tyr Lys Phe Ile Lys Pro
565 570 575
Ile Leu Glu Lys Met Asp Gly Thr Glu Glu Leu Leu Val Lys Leu Asn
580 585 590
Arg Glu Asp Leu Leu Arg Lys Gln Arg Thr Phe Asp Asn Gly Ser Ile
595 600 605
Pro His Gln Ile His Leu Gly Glu Leu His Ala Ile Leu Arg Arg Gln
610 615 620
Glu Asp Phe Tyr Pro Phe Leu Lys Asp Asn Arg Glu Lys Ile Glu Lys
625 630 635 640
Ile Leu Thr Phe Arg Ile Pro Tyr Tyr Val Gly Pro Leu Ala Arg Gly
645 650 655
Asn Ser Arg Phe Ala Trp Met Thr Arg Lys Ser Glu Glu Thr Ile Thr
660 665 670
Pro Trp Asn Phe Glu Glu Val Val Asp Lys Gly Ala Ser Ala Gln Ser
675 680 685
Phe Ile Glu Arg Met Thr Asn Phe Asp Lys Asn Leu Pro Asn Glu Lys
690 695 700
Val Leu Pro Lys His Ser Leu Leu Tyr Glu Tyr Phe Thr Val Tyr Asn
705 710 715 720
Glu Leu Thr Lys Val Lys Tyr Val Thr Glu Gly Met Arg Lys Pro Ala
725 730 735
Phe Leu Ser Gly Glu Gln Lys Lys Ala Ile Val Asp Leu Leu Phe Lys
740 745 750
Thr Asn Arg Lys Val Thr Val Lys Gln Leu Lys Glu Asp Tyr Phe Lys
755 760 765
Lys Ile Glu Cys Phe Asp Ser Val Glu Ile Ser Gly Val Glu Asp Arg
770 775 780
Phe Asn Ala Ser Leu Gly Thr Tyr His Asp Leu Leu Lys Ile Ile Lys
785 790 795 800
Asp Lys Asp Phe Leu Asp Asn Glu Glu Asn Glu Asp Ile Leu Glu Asp
805 810 815
Ile Val Leu Thr Leu Thr Leu Phe Glu Asp Arg Glu Met Ile Glu Glu
820 825 830
Arg Leu Lys Thr Tyr Ala His Leu Phe Asp Asp Lys Val Met Lys Gln
835 840 845
Leu Lys Arg Arg Arg Tyr Thr Gly Trp Gly Arg Leu Ser Arg Lys Leu
850 855 860
Ile Asn Gly Ile Arg Asp Lys Gln Ser Gly Lys Thr Ile Leu Asp Phe
865 870 875 880
Leu Lys Ser Asp Gly Phe Ala Asn Arg Asn Phe Met Gln Leu Ile His
885 890 895
Asp Asp Ser Leu Thr Phe Lys Glu Asp Ile Gln Lys Ala Gln Val Ser
900 905 910
Gly Gln Gly Asp Ser Leu His Glu His Ile Ala Asn Leu Ala Gly Ser
915 920 925
Pro Ala Ile Lys Lys Gly Ile Leu Gln Thr Val Lys Val Val Asp Glu
930 935 940
Leu Val Lys Val Met Gly Arg His Lys Pro Glu Asn Ile Val Ile Glu
945 950 955 960
Met Ala Arg Glu Asn Gln Thr Thr Gln Lys Gly Gln Lys Asn Ser Arg
965 970 975
Glu Arg Met Lys Arg Ile Glu Glu Gly Ile Lys Glu Leu Gly Ser Gln
980 985 990
Ile Leu Lys Glu His Pro Val Glu Asn Thr Gln Leu Gln Asn Glu Lys
995 1000 1005
Leu Tyr Leu Tyr Tyr Leu Gln Asn Gly Arg Asp Met Tyr Val Asp
1010 1015 1020
Gln Glu Leu Asp Ile Asn Arg Leu Ser Asp Tyr Asp Val Asp His
1025 1030 1035
Ile Val Pro Gln Ser Phe Leu Lys Asp Asp Ser Ile Asp Asn Lys
1040 1045 1050
Val Leu Thr Arg Ser Asp Lys Asn Arg Gly Lys Ser Asp Asn Val
1055 1060 1065
Pro Ser Glu Glu Val Val Lys Lys Met Lys Asn Tyr Trp Arg Gln
1070 1075 1080
Leu Leu Asn Ala Lys Leu Ile Thr Gln Arg Lys Phe Asp Asn Leu
1085 1090 1095
Thr Lys Ala Glu Arg Gly Gly Leu Ser Glu Leu Asp Lys Ala Gly
1100 1105 1110
Phe Ile Lys Arg Gln Leu Val Glu Thr Arg Gln Ile Thr Lys His
1115 1120 1125
Val Ala Gln Ile Leu Asp Ser Arg Met Asn Thr Lys Tyr Asp Glu
1130 1135 1140
Asn Asp Lys Leu Ile Arg Glu Val Lys Val Ile Thr Leu Lys Ser
1145 1150 1155
Lys Leu Val Ser Asp Phe Arg Lys Asp Phe Gln Phe Tyr Lys Val
1160 1165 1170
Arg Glu Ile Asn Asn Tyr His His Ala His Asp Ala Tyr Leu Asn
1175 1180 1185
Ala Val Val Gly Thr Ala Leu Ile Lys Lys Tyr Pro Lys Leu Glu
1190 1195 1200
Ser Glu Phe Val Tyr Gly Asp Tyr Lys Val Tyr Asp Val Arg Lys
1205 1210 1215
Met Ile Ala Lys Ser Glu Gln Glu Ile Gly Lys Ala Thr Ala Lys
1220 1225 1230
Tyr Phe Phe Tyr Ser Asn Ile Met Asn Phe Phe Lys Thr Glu Ile
1235 1240 1245
Thr Leu Ala Asn Gly Glu Ile Arg Lys Arg Pro Leu Ile Glu Thr
1250 1255 1260
Asn Gly Glu Thr Gly Glu Ile Val Trp Asp Lys Gly Arg Asp Phe
1265 1270 1275
Ala Thr Val Arg Lys Val Leu Ser Met Pro Gln Val Asn Ile Val
1280 1285 1290
Lys Lys Thr Glu Val Gln Thr Gly Gly Phe Ser Lys Glu Ser Ile
1295 1300 1305
Arg Pro Lys Arg Asn Ser Asp Lys Leu Ile Ala Arg Lys Lys Asp
1310 1315 1320
Trp Asp Pro Lys Lys Tyr Gly Gly Phe Val Ser Pro Thr Val Ala
1325 1330 1335
Tyr Ser Val Leu Val Val Ala Lys Val Glu Lys Gly Lys Ser Lys
1340 1345 1350
Lys Leu Lys Ser Val Lys Glu Leu Leu Gly Ile Thr Ile Met Glu
1355 1360 1365
Arg Ser Ser Phe Glu Lys Asn Pro Ile Asp Phe Leu Glu Ala Lys
1370 1375 1380
Gly Tyr Lys Glu Val Lys Lys Asp Leu Ile Ile Lys Leu Pro Lys
1385 1390 1395
Tyr Ser Leu Phe Glu Leu Glu Asn Gly Arg Lys Arg Met Leu Ala
1400 1405 1410
Ser Ala Arg Phe Leu Gln Lys Gly Asn Glu Leu Ala Leu Pro Ser
1415 1420 1425
Lys Tyr Val Asn Phe Leu Tyr Leu Ala Ser His Tyr Glu Lys Leu
1430 1435 1440
Lys Gly Ser Pro Glu Asp Asn Glu Gln Lys Gln Leu Phe Val Glu
1445 1450 1455
Gln His Lys His Tyr Leu Asp Glu Ile Ile Glu Gln Ile Ser Glu
1460 1465 1470
Phe Ser Lys Arg Val Ile Leu Ala Asp Ala Asn Leu Asp Lys Val
1475 1480 1485
Leu Ser Ala Tyr Asn Lys His Arg Asp Lys Pro Ile Arg Glu Gln
1490 1495 1500
Ala Glu Asn Ile Ile His Leu Phe Thr Leu Thr Asn Leu Gly Ala
1505 1510 1515
Pro Arg Ala Phe Lys Tyr Phe Asp Thr Thr Ile Asp Arg Lys Val
1520 1525 1530
Tyr Arg Ser Thr Lys Glu Val Leu Asp Ala Thr Leu Ile His Gln
1535 1540 1545
Ser Ile Thr Gly Leu Tyr Glu Thr Arg Ile Asp Leu Ser Gln Leu
1550 1555 1560
Gly Gly Asp Arg Pro Lys Lys Lys Arg Lys Val Gly Gly
1565 1570 1575
<210> 5
<211> 4764
<212> DNA
<213> Artificial sequence
<400> 5
ggatccatgt cagaagtcga gttctcccat gagtattgga tgaggcacgc cctcactctt 60
gcgaagaggg ccagggacga gagggaggtg ccggtcggtg ctgtcctggt cttgaataac 120
agggtgatag gcgaaggttg gaacagggct attggccttc atgaccctac tgctcatgcg 180
gaaatcatgg cacttagaca ggggggcctc gttatgcaaa attaccgcct gatcgacgcc 240
actctttatg tcacatttga accatgtgtt atgtgtgcgg gcgctatgat ccattcacgc 300
ataggtcgcg tggtttttgg agttcgcaac agtaaacgtg gggctgcagg ctctctgatg 360
aacgttttga attatccggg aatgaaccat agagtcgaaa tcacagaagg gattttggca 420
gacgaatgcg cggctcttct ttgtgatttt tacagaatgc cccgccaagt gtttaatgct 480
caaaagaaag cgcagagtag catcaactcg gggggatctt ctgggggctc gtctggttcc 540
gagactcccg gaacttccga gtcggcaaca cctgaatcct ccggcggctc ttcgggcgga 600
tctgagaaaa aatactcaat tggtctggct attggaacca attcggttgg gtgggcagtc 660
ataaccgatg actataaagt tccgagcaaa aaatttaagg tccttggtaa taccaacagg 720
aaaagcataa aaaagaatct gatgggtgct ttgctgttcg attcaggtga gacagccgag 780
gctacccggc ttaagcggac cgctcgcaga aggtacaccc ggagaaaaaa tcgcatccgc 840
tatctccagg aaattttcgc gaatgaaatg gcaaagttgg acgatagttt cttccagagg 900
ctggaagaat ccttccttgt cgaagaagat aagaaaaacg agagacaccc tatcttcgga 960
aacctggcag acgaagtggc gtaccataga aactacccta cgatttatca tctcaggaaa 1020
aagctggcag attcaccgga gaaagccgac ctcaggttga tatacttggc actcgcgcac 1080
attattaaat ttagaggtca cttccttatc gaagggaaac tgaatgcaga aaactcggat 1140
gttgctaaac ttttttatca gttgatacaa acttacaatc agctgtttga agaatcccct 1200
ttggacgaaa tcgaggttga tgctaagggc attctttctg ctaggttgtc aaagagcaaa 1260
aggctcgaaa agctcattgc tgtctttccc aacgaaaaga agaatggact ttttgggaac 1320
attatagctc ttgccctcgg cctgactcca aacttcaaaa gcaactttga tttgactgag 1380
gacgccaaac tccaattgtc aaaggatact tacgatgacg acctggacga actcttgggt 1440
cagatcgggg atcaatacgc ggatcttttc agtgctgcaa agaatctctc cgacgctatt 1500
cttctttcag acatcctgcg ctcaaatagt gaggtcacta aggctccgtt gtccgcgtcg 1560
atggttaaac ggtatgatga acatcaccag gacctcgcgc ttctgaaaac actcgtccgg 1620
caacagttcc ctgaaaagta tgcagaaata ttcaaagacg acacaaaaaa tggttacgct 1680
gggtacgtcg ggattggcat caagcataga aaacggacta ctaaacttgc tacccaagag 1740
gagttctaca agtttattaa gccaatcctg gaaaaaatgg atggcgcgga agaactcctt 1800
gccaagttga atagggatga cctcctccgg aagcaacgca cttttgacaa cggctctatc 1860
ccgcatcaga ttcacttgaa agagttgcac gcaatactcc gccgccaaga ggaattttac 1920
ccatttctca aggagaacag ggagaaaata gagaaaatct tgacgttcag gattccttac 1980
tatgtggggc ctcttgctcg gggtaattct cgctttgcct ggttgacaag aaaatctgaa 2040
gaagctatca ccccgtggaa tttcgaagaa gtcgttgata aaggcgccag cgctcaatct 2100
ttcattgagc ggatgacaaa cttcgacgag cagttgccga ataaaaaggt tctgccaaag 2160
cactcactgc tttatgagta ttttaccgtc tacaacgagt tgacgaaggt caaatacgtg 2220
actgagagga tgcggaaacc tgagtttttg tctggtgagc agaagaaagc cattgttgac 2280
cttcttttca agaccaaccg gaaggtgact gttaagcaac tcaaggaaga ttatttcaag 2340
aaaattgaat gcttcgactc cgttgagata ataggtgttg aggaccgctt caatgcgtca 2400
ctcggaacct atcacgactt gctcaaaata atcaaggaca aagactttct tgataacgaa 2460
gaaaatgaag acatattgga ggatatagtg ctcaccctta cattgttcga ggacagagaa 2520
atgatcgagg agcggcttaa gacctacgcg catctgttcg atgataaggt tatgaagcag 2580
ctgaagagga gacattacac gggttggggc cggctttcca ggaagatgat taacggtatc 2640
cgggataaac agtcaggaaa aactatactg gactttttga aatcagacgg tttctcaaac 2700
agaaacttca tgcaattgat tcatgacgat agtcttactt ttaaagagga aatcgagaag 2760
gcgcaagtga gcggacaagg agactcgctg cacgagcaaa tcgccgacct ggctgggtcg 2820
ccggctataa agaagggtat attgcagacc gtcaaaatcg tggacgagct ggtgaaggtt 2880
atggggcaca aacctgaaaa tattgttatt gagatggcta gggagaatca gactactacg 2940
aagggattgc aacagtctcg cgagcgcaag aaaaggatcg aggaaggtat taaggaactt 3000
gaatcccaga tactcaagga gaatcccgtc gagaacacac aacttcagaa cgaaaaactc 3060
tatctttact atcttcaaaa tggcagagat atgtatgtgg accaagagct ggatattaat 3120
aggctctctg attacgatgt tgaccatatc gtgccgcagt catttattaa agatgactct 3180
attgataaca aggtcctcac tcgctccgtc gaaaatcgcg gtaaatcaga caatgtcccc 3240
tcggaggaag tcgtgaagaa aatgaagaac tactggaggc agctgcttaa cgcaaagttg 3300
attactcagc gcaagtttga caacttgaca aaggccgaga ggggaggact ctctgaggcg 3360
gacaaggcag gtttcatcaa gcgccaactc gtcgagacac ggcagataac caaacacgtc 3420
gcaaggatat tggatagcag aatgaacaca aagagagata agaacgacaa accaatacgc 3480
gaagtgaaag tcatcacatt gaagtccaaa ttggttagtg atttccgcaa ggacttccaa 3540
ctgtacaaag tgagagacat caacaactac catcatgctc acgatgcata tctgaatgct 3600
gtcgtcggca cagctcttat aaagaaatac ccgaaactcg aatcggagtt cgtttatggg 3660
gattataagg tttatgacgt taggaagatg attgccaagt cagaacaaga aatcgggaag 3720
gctacagcga aacgcttttt ttattcgaac ataatgaatt tctttaaaac ggaggtcaaa 3780
cttgcgaacg gggaaatccg gaaacgcccg cttatcgaga caaatggaga aacaggtgaa 3840
gtcgtgtgga ataaagaaaa ggacttcgcc accgttcgga aagttctcgc catgccgcag 3900
gtcaacattg tcaagaaaac ggaggtccaa accgggggct tctccaagga atccattctc 3960
tcaaagaggg agagtgcaaa gctcatacct aggaagaagg gttgggacac acgcaaatac 4020
ggcgggtttg gcagtcccac ggtggcatac tctatccttg tggtcgccaa agtcgaaaag 4080
ggcaaggcga aaaaattgaa gagcgttaaa gtgcttgtcg ggatcaccat aatggagaag 4140
ggctcctacg agaaggaccc tatcgggttc ttggaagcga agggttataa agacattaag 4200
aaagagctga tcttcaaatt gccgaaatac agcctgttcg aactggagaa cggcaggcgg 4260
cgcatgttgg cgagtgccac cgagcttcag aaggctaatg agcttgtttt gccgcagcat 4320
ctcgtccgcc tcctctatta tacgcaaaat attagtgcta ctactgggtc aaataacctc 4380
ggatatattg aacaacatag ggaggagttt aaggagatat ttgagaaaat catagacttc 4440
tctgaaaagt atatactgaa aaataaggtg aactccaatc tcaagtcttc ctttgacgaa 4500
cagtttgctg tgtcggactc catacttctc agcaattctt tcgtttccct gttgaaatat 4560
acgtcatttg gcgcttccgg gggatttacc tttcttgatc ttgacgttaa acagggtagg 4620
ctcagatacc agactgtcac ggaagtgctc gatgccactc ttatatacca atcaattacg 4680
ggcctgtacg aaacgcggac agatttgtcc cagctcggcg gcgaccggcc aaagaagaag 4740
cggaaagtcg gaggctgaac tagt 4764
<210> 6
<211> 1583
<212> PRT
<213> Artificial sequence
<400> 6
Met Ser Glu Val Glu Phe Ser His Glu Tyr Trp Met Arg His Ala Leu
1 5 10 15
Thr Leu Ala Lys Arg Ala Arg Asp Glu Arg Glu Val Pro Val Gly Ala
20 25 30
Val Leu Val Leu Asn Asn Arg Val Ile Gly Glu Gly Trp Asn Arg Ala
35 40 45
Ile Gly Leu His Asp Pro Thr Ala His Ala Glu Ile Met Ala Leu Arg
50 55 60
Gln Gly Gly Leu Val Met Gln Asn Tyr Arg Leu Ile Asp Ala Thr Leu
65 70 75 80
Tyr Val Thr Phe Glu Pro Cys Val Met Cys Ala Gly Ala Met Ile His
85 90 95
Ser Arg Ile Gly Arg Val Val Phe Gly Val Arg Asn Ser Lys Arg Gly
100 105 110
Ala Ala Gly Ser Leu Met Asn Val Leu Asn Tyr Pro Gly Met Asn His
115 120 125
Arg Val Glu Ile Thr Glu Gly Ile Leu Ala Asp Glu Cys Ala Ala Leu
130 135 140
Leu Cys Asp Phe Tyr Arg Met Pro Arg Gln Val Phe Asn Ala Gln Lys
145 150 155 160
Lys Ala Gln Ser Ser Ile Asn Ser Gly Gly Ser Ser Gly Gly Ser Ser
165 170 175
Gly Ser Glu Thr Pro Gly Thr Ser Glu Ser Ala Thr Pro Glu Ser Ser
180 185 190
Gly Gly Ser Ser Gly Gly Ser Glu Lys Lys Tyr Ser Ile Gly Leu Ala
195 200 205
Ile Gly Thr Asn Ser Val Gly Trp Ala Val Ile Thr Asp Asp Tyr Lys
210 215 220
Val Pro Ser Lys Lys Phe Lys Val Leu Gly Asn Thr Asn Arg Lys Ser
225 230 235 240
Ile Lys Lys Asn Leu Met Gly Ala Leu Leu Phe Asp Ser Gly Glu Thr
245 250 255
Ala Glu Ala Thr Arg Leu Lys Arg Thr Ala Arg Arg Arg Tyr Thr Arg
260 265 270
Arg Lys Asn Arg Ile Arg Tyr Leu Gln Glu Ile Phe Ala Asn Glu Met
275 280 285
Ala Lys Leu Asp Asp Ser Phe Phe Gln Arg Leu Glu Glu Ser Phe Leu
290 295 300
Val Glu Glu Asp Lys Lys Asn Glu Arg His Pro Ile Phe Gly Asn Leu
305 310 315 320
Ala Asp Glu Val Ala Tyr His Arg Asn Tyr Pro Thr Ile Tyr His Leu
325 330 335
Arg Lys Lys Leu Ala Asp Ser Pro Glu Lys Ala Asp Leu Arg Leu Ile
340 345 350
Tyr Leu Ala Leu Ala His Ile Ile Lys Phe Arg Gly His Phe Leu Ile
355 360 365
Glu Gly Lys Leu Asn Ala Glu Asn Ser Asp Val Ala Lys Leu Phe Tyr
370 375 380
Gln Leu Ile Gln Thr Tyr Asn Gln Leu Phe Glu Glu Ser Pro Leu Asp
385 390 395 400
Glu Ile Glu Val Asp Ala Lys Gly Ile Leu Ser Ala Arg Leu Ser Lys
405 410 415
Ser Lys Arg Leu Glu Lys Leu Ile Ala Val Phe Pro Asn Glu Lys Lys
420 425 430
Asn Gly Leu Phe Gly Asn Ile Ile Ala Leu Ala Leu Gly Leu Thr Pro
435 440 445
Asn Phe Lys Ser Asn Phe Asp Leu Thr Glu Asp Ala Lys Leu Gln Leu
450 455 460
Ser Lys Asp Thr Tyr Asp Asp Asp Leu Asp Glu Leu Leu Gly Gln Ile
465 470 475 480
Gly Asp Gln Tyr Ala Asp Leu Phe Ser Ala Ala Lys Asn Leu Ser Asp
485 490 495
Ala Ile Leu Leu Ser Asp Ile Leu Arg Ser Asn Ser Glu Val Thr Lys
500 505 510
Ala Pro Leu Ser Ala Ser Met Val Lys Arg Tyr Asp Glu His His Gln
515 520 525
Asp Leu Ala Leu Leu Lys Thr Leu Val Arg Gln Gln Phe Pro Glu Lys
530 535 540
Tyr Ala Glu Ile Phe Lys Asp Asp Thr Lys Asn Gly Tyr Ala Gly Tyr
545 550 555 560
Val Gly Ile Gly Ile Lys His Arg Lys Arg Thr Thr Lys Leu Ala Thr
565 570 575
Gln Glu Glu Phe Tyr Lys Phe Ile Lys Pro Ile Leu Glu Lys Met Asp
580 585 590
Gly Ala Glu Glu Leu Leu Ala Lys Leu Asn Arg Asp Asp Leu Leu Arg
595 600 605
Lys Gln Arg Thr Phe Asp Asn Gly Ser Ile Pro His Gln Ile His Leu
610 615 620
Lys Glu Leu His Ala Ile Leu Arg Arg Gln Glu Glu Phe Tyr Pro Phe
625 630 635 640
Leu Lys Glu Asn Arg Glu Lys Ile Glu Lys Ile Leu Thr Phe Arg Ile
645 650 655
Pro Tyr Tyr Val Gly Pro Leu Ala Arg Gly Asn Ser Arg Phe Ala Trp
660 665 670
Leu Thr Arg Lys Ser Glu Glu Ala Ile Thr Pro Trp Asn Phe Glu Glu
675 680 685
Val Val Asp Lys Gly Ala Ser Ala Gln Ser Phe Ile Glu Arg Met Thr
690 695 700
Asn Phe Asp Glu Gln Leu Pro Asn Lys Lys Val Leu Pro Lys His Ser
705 710 715 720
Leu Leu Tyr Glu Tyr Phe Thr Val Tyr Asn Glu Leu Thr Lys Val Lys
725 730 735
Tyr Val Thr Glu Arg Met Arg Lys Pro Glu Phe Leu Ser Gly Glu Gln
740 745 750
Lys Lys Ala Ile Val Asp Leu Leu Phe Lys Thr Asn Arg Lys Val Thr
755 760 765
Val Lys Gln Leu Lys Glu Asp Tyr Phe Lys Lys Ile Glu Cys Phe Asp
770 775 780
Ser Val Glu Ile Ile Gly Val Glu Asp Arg Phe Asn Ala Ser Leu Gly
785 790 795 800
Thr Tyr His Asp Leu Leu Lys Ile Ile Lys Asp Lys Asp Phe Leu Asp
805 810 815
Asn Glu Glu Asn Glu Asp Ile Leu Glu Asp Ile Val Leu Thr Leu Thr
820 825 830
Leu Phe Glu Asp Arg Glu Met Ile Glu Glu Arg Leu Lys Thr Tyr Ala
835 840 845
His Leu Phe Asp Asp Lys Val Met Lys Gln Leu Lys Arg Arg His Tyr
850 855 860
Thr Gly Trp Gly Arg Leu Ser Arg Lys Met Ile Asn Gly Ile Arg Asp
865 870 875 880
Lys Gln Ser Gly Lys Thr Ile Leu Asp Phe Leu Lys Ser Asp Gly Phe
885 890 895
Ser Asn Arg Asn Phe Met Gln Leu Ile His Asp Asp Ser Leu Thr Phe
900 905 910
Lys Glu Glu Ile Glu Lys Ala Gln Val Ser Gly Gln Gly Asp Ser Leu
915 920 925
His Glu Gln Ile Ala Asp Leu Ala Gly Ser Pro Ala Ile Lys Lys Gly
930 935 940
Ile Leu Gln Thr Val Lys Ile Val Asp Glu Leu Val Lys Val Met Gly
945 950 955 960
His Lys Pro Glu Asn Ile Val Ile Glu Met Ala Arg Glu Asn Gln Thr
965 970 975
Thr Thr Lys Gly Leu Gln Gln Ser Arg Glu Arg Lys Lys Arg Ile Glu
980 985 990
Glu Gly Ile Lys Glu Leu Glu Ser Gln Ile Leu Lys Glu Asn Pro Val
995 1000 1005
Glu Asn Thr Gln Leu Gln Asn Glu Lys Leu Tyr Leu Tyr Tyr Leu
1010 1015 1020
Gln Asn Gly Arg Asp Met Tyr Val Asp Gln Glu Leu Asp Ile Asn
1025 1030 1035
Arg Leu Ser Asp Tyr Asp Val Asp His Ile Val Pro Gln Ser Phe
1040 1045 1050
Ile Lys Asp Asp Ser Ile Asp Asn Lys Val Leu Thr Arg Ser Val
1055 1060 1065
Glu Asn Arg Gly Lys Ser Asp Asn Val Pro Ser Glu Glu Val Val
1070 1075 1080
Lys Lys Met Lys Asn Tyr Trp Arg Gln Leu Leu Asn Ala Lys Leu
1085 1090 1095
Ile Thr Gln Arg Lys Phe Asp Asn Leu Thr Lys Ala Glu Arg Gly
1100 1105 1110
Gly Leu Ser Glu Ala Asp Lys Ala Gly Phe Ile Lys Arg Gln Leu
1115 1120 1125
Val Glu Thr Arg Gln Ile Thr Lys His Val Ala Arg Ile Leu Asp
1130 1135 1140
Ser Arg Met Asn Thr Lys Arg Asp Lys Asn Asp Lys Pro Ile Arg
1145 1150 1155
Glu Val Lys Val Ile Thr Leu Lys Ser Lys Leu Val Ser Asp Phe
1160 1165 1170
Arg Lys Asp Phe Gln Leu Tyr Lys Val Arg Asp Ile Asn Asn Tyr
1175 1180 1185
His His Ala His Asp Ala Tyr Leu Asn Ala Val Val Gly Thr Ala
1190 1195 1200
Leu Ile Lys Lys Tyr Pro Lys Leu Glu Ser Glu Phe Val Tyr Gly
1205 1210 1215
Asp Tyr Lys Val Tyr Asp Val Arg Lys Met Ile Ala Lys Ser Glu
1220 1225 1230
Gln Glu Ile Gly Lys Ala Thr Ala Lys Arg Phe Phe Tyr Ser Asn
1235 1240 1245
Ile Met Asn Phe Phe Lys Thr Glu Val Lys Leu Ala Asn Gly Glu
1250 1255 1260
Ile Arg Lys Arg Pro Leu Ile Glu Thr Asn Gly Glu Thr Gly Glu
1265 1270 1275
Val Val Trp Asn Lys Glu Lys Asp Phe Ala Thr Val Arg Lys Val
1280 1285 1290
Leu Ala Met Pro Gln Val Asn Ile Val Lys Lys Thr Glu Val Gln
1295 1300 1305
Thr Gly Gly Phe Ser Lys Glu Ser Ile Leu Ser Lys Arg Glu Ser
1310 1315 1320
Ala Lys Leu Ile Pro Arg Lys Lys Gly Trp Asp Thr Arg Lys Tyr
1325 1330 1335
Gly Gly Phe Gly Ser Pro Thr Val Ala Tyr Ser Ile Leu Val Val
1340 1345 1350
Ala Lys Val Glu Lys Gly Lys Ala Lys Lys Leu Lys Ser Val Lys
1355 1360 1365
Val Leu Val Gly Ile Thr Ile Met Glu Lys Gly Ser Tyr Glu Lys
1370 1375 1380
Asp Pro Ile Gly Phe Leu Glu Ala Lys Gly Tyr Lys Asp Ile Lys
1385 1390 1395
Lys Glu Leu Ile Phe Lys Leu Pro Lys Tyr Ser Leu Phe Glu Leu
1400 1405 1410
Glu Asn Gly Arg Arg Arg Met Leu Ala Ser Ala Thr Glu Leu Gln
1415 1420 1425
Lys Ala Asn Glu Leu Val Leu Pro Gln His Leu Val Arg Leu Leu
1430 1435 1440
Tyr Tyr Thr Gln Asn Ile Ser Ala Thr Thr Gly Ser Asn Asn Leu
1445 1450 1455
Gly Tyr Ile Glu Gln His Arg Glu Glu Phe Lys Glu Ile Phe Glu
1460 1465 1470
Lys Ile Ile Asp Phe Ser Glu Lys Tyr Ile Leu Lys Asn Lys Val
1475 1480 1485
Asn Ser Asn Leu Lys Ser Ser Phe Asp Glu Gln Phe Ala Val Ser
1490 1495 1500
Asp Ser Ile Leu Leu Ser Asn Ser Phe Val Ser Leu Leu Lys Tyr
1505 1510 1515
Thr Ser Phe Gly Ala Ser Gly Gly Phe Thr Phe Leu Asp Leu Asp
1520 1525 1530
Val Lys Gln Gly Arg Leu Arg Tyr Gln Thr Val Thr Glu Val Leu
1535 1540 1545
Asp Ala Thr Leu Ile Tyr Gln Ser Ile Thr Gly Leu Tyr Glu Thr
1550 1555 1560
Arg Thr Asp Leu Ser Gln Leu Gly Gly Asp Arg Pro Lys Lys Lys
1565 1570 1575
Arg Lys Val Gly Gly
1580
<210> 7
<211> 4743
<212> DNA
<213> Artificial sequence
<400> 7
ggatccatgt cagaagtcga gttctcccat gagtattgga tgaggcacgc cctcactctt 60
gcgaagaggg ccagggacga gagggaggtg ccggtcggtg ctgtcctggt cttgaataac 120
agggtgatag gcgaaggttg gaacagggct attggccttc atgaccctac tgctcatgcg 180
gaaatcatgg cacttagaca ggggggcctc gttatgcaaa attaccgcct gatcgacgcc 240
actctttatg tcacatttga accatgtgtt atgtgtgcgg gcgctatgat ccattcacgc 300
ataggtcgcg tggtttttgg agttcgcaac agtaaacgtg gggctgcagg ctctctgatg 360
aacgttttga attatccggg aatgaaccat agagtcgaaa tcacagaagg gattttggca 420
gacgaatgcg cggctcttct ttgtgatttt tacagaatgc cccgccaagt gtttaatgct 480
caaaagaaag cgcagagtag catcaactcg gggggatctt ctgggggctc gtctggttcc 540
gagactcccg gaacttccga gtcggcaaca cctgaatcct ccggcggctc ttcgggcgga 600
tctgacaaaa aatactcaat tggtctggct attgggacaa actctgtggg ctgggcggta 660
attaccgacg agtacaaggt gcctagtaag aaatttaaag tgctcggaaa cactgacagg 720
cactctataa agaagaacct gatcggggca ctgcttttcg actccggaga gacggcggag 780
aggacgcgtc tcaagcgtac cgcgcgccgc aggtacacaa gaaggaagaa taggatctgc 840
tacttgcagg aaatcttcag taacgagatg gcgaaggtcg acgatagttt ctttcatcgg 900
ttggaagaat cgttcctcgt agaggaggac aaaaagcacg agcgtcaccc aatattcggg 960
aatattgttg acgaggttgc ctaccatgag aaatatccta caatatatca cctccgtaag 1020
aagcttgtcg attcaactga taaggctgat ctcagactca tctatcttgc cctcgcacat 1080
atgattaagt ttcgtggcca cttcttgatt gaaggcgacc tcaacccgga caactcagat 1140
gttgacaagc tttttataca gctcgtccag acatataacc agctgtttga agagaatccc 1200
atcaatgcga gtggggttga tgctaaagcc attttgtccg ccaggttgtc caaatctcgc 1260
agactggaaa acctgatcgc acagcttccc ggtgaaaaga aaaacgggct cttcggcaat 1320
ctcatcgcac tgtccctcgg cctcacccca aacttcaagt ctaacttcga cctggccgag 1380
gatgcgaagc tccagctgtc aaaagataca tacgacgacg atttggacaa tctgcttgcg 1440
caaataggcg accagtatgc ggacctgttc ctggctgcca aaaatctgtc agatgcaatc 1500
ctcctgtccg atatattgcg tgtgaacacc gaaatcacga aggcaccgct tagcgcatcc 1560
atgatcaaga gatacgacga gcaccatcag gacctcacac tcctcaaggc gcttgttcgt 1620
cagcagcttc ccgagaaata taaggaaatt tttttcgatc aaagcaagaa tggatatgct 1680
ggctatattg acggtggcgc ttcgcaggag gagttctata aattcattaa gccgattctg 1740
gagaagatgg acggaacgga ggagctcctc gtcaagctta accgggaaga cctgttgcgg 1800
aagcagagga cttttgataa cggctctatt ccgcaccaaa tccatctggg tgagttgcac 1860
gcaatcttga gaagacaaga ggatttctac ccgttcctta aggataacag agagaagata 1920
gaaaaaatac tgaccttcag gataccatac tatgtgggcc cactggcgcg cggaaatagt 1980
cgtttcgcat ggatgactag aaagtccgaa gaaacgatca cgccatggaa ttttgaggaa 2040
gtggtcgaca agggcgcctc tgcccagagc ttcatcgaaa ggatgaccaa ttttgacaaa 2100
aatctgccta acgaaaaggt gcttccgaag cacagcctgt tgtatgaata cttcacagtt 2160
tataacgagc tcactaaggt caagtacgtc acggagggca tgcgtaagcc tgctttcctg 2220
tctggtgaac aaaaaaaggc gattgtggac ctccttttca agacgaaccg taaagttact 2280
gtgaagcaac tgaaagagga ttactttaag aaaattgagt gcttcgacag tgtggagatt 2340
tccggtgtcg aggaccggtt taacgccagc ctgggtacgt atcatgacct gcttaaaatt 2400
atcaaggata aagatttcct ggataatgaa gagaacgaag atatactgga ggacattgtg 2460
ttgactttga ccctcttcga ggacagagag atgattgagg aaagactgaa gacctacgca 2520
cacctttttg atgacaaggt catgaaacaa ctcaagcgcc ggcgctatac tggctggggc 2580
cggctttctc gcaagctcat caatgggatt cgggataagc aatcaggcaa gacaattttg 2640
gacttcctca aatccgacgg attcgcaaat aggaatttta tgcagctgat acatgacgac 2700
tctttgacat tcaaagaaga catacagaag gctcaggtct ccggccaagg agattctttg 2760
cacgagcata tcgctaactt ggcaggtagc cccgccataa aaaagggcat tcttcaaacg 2820
gtaaaagttg ttgacgaact cgtgaaggtt atgggccgtc ataagccgga aaacattgtt 2880
attgaaatgg ctagggaaaa tcagacgacc cagaagggac agaaaaatag cagggagcgg 2940
atgaagagaa ttgaagaggg aattaaggag cttggatctc agattcttaa ggagcaccct 3000
gtggagaaca cccaacttca gaatgaaaag ctctaccttt actaccttca aaacggccgg 3060
gatatgtacg tcgatcagga acttgacatt aaccggttga gcgattatga cgttgaccat 3120
attgtgcccc aatctttcct taaagacgac tctatcgaca ataaagtgct gacgcgcagc 3180
gataaaaatc gcggtaagtc ggataatgtc ccgtcggaag aggtggttaa aaaaatgaag 3240
aactattgga ggcaactcct gaatgccaag ctgatcactc agaggaaatt cgacaatctc 3300
accaaggcag aaaggggtgg acttagcgag ctcgacaagg ccggttttat caaaagacag 3360
ctggtggaga cacgccaaat caccaaacac gttgcccaga tcctggattc gaggatgaac 3420
acgaagtatg acgagaacga caagttgatt agggaagtca aggtcatcac tttgaagtcc 3480
aagctggtga gcgactttcg caaagacttc cagttttaca aagtcaggga aattaataac 3540
taccaccacg cccacgacgc ctaccttaac gccgtggttg gcacagcact catcaagaaa 3600
taccctaagc tcgaatctga gttcgtctat ggcgactata aggtctacga cgttagaaaa 3660
atgatcgcga aatctgagca ggaaataggc aaggcaactg ccaagtactt cttctattcc 3720
aatatcatga acttttttaa gacggagatt accctggcga atggtgagat ccgcaagcgc 3780
cctttgattg agacaaacgg agaaacagga gagatcgtat gggacaaagg gcgggacttt 3840
gctactgtta ggaaggtgct ctctatgcca caagttaaca ttgtcaaaaa aactgaagtg 3900
cagacaggtg ggtttagcaa ggaatctatc aggccgaaga ggaactctga caagctgatc 3960
gcccgcaaga aagattggga cccgaaaaag tacggaggat tcttgtggcc cacagttgcg 4020
tactccgtgc ttgtcgtggc caaagtggag aagggcaagt ctaagaagct caagagcgtc 4080
aaagagttgt tggggatcac gattatggag cggtcgtctt tcgaaaagaa tccgatagat 4140
tttctcgagg ccaagggtta taaagaagtc aagaaggatc ttatcatcaa gctccctaag 4200
tactccctct ttgagcttga aaacggacgg aaaagaatgc tggcttcagc gaagcagctt 4260
cagaagggta atgaactcgc tctgccctca aaatatgtga atttccttta cctggcatca 4320
cactatgaga agcttaaggg gtctccagag gacaacgagc agaagcaact gttcgttgaa 4380
caacacaagc actaccttga cgagattatc gagcaaatca gcgagtttag caagcgcgtt 4440
atactggcag acgcaaatct tgataaggtc cttagcgcct acaacaagca tagagacaaa 4500
cccatccggg agcaggccga gaacattatt catctcttca ccttgacgag gcttggggcc 4560
ccgagagcgt tcaagtactt cgatactacc atagacccaa agcaatatcg gtcgacaaag 4620
gaagttcttg acgccacgct gatccaccaa agtataacag gcctctatga gacacgcatc 4680
gacctttcgc agttgggcgg tgaccgcccc aaaaagaaga ggaaagttgg cgggtgaact 4740
agt 4743
<210> 8
<211> 1576
<212> PRT
<213> Artificial sequence
<400> 8
Met Ser Glu Val Glu Phe Ser His Glu Tyr Trp Met Arg His Ala Leu
1 5 10 15
Thr Leu Ala Lys Arg Ala Arg Asp Glu Arg Glu Val Pro Val Gly Ala
20 25 30
Val Leu Val Leu Asn Asn Arg Val Ile Gly Glu Gly Trp Asn Arg Ala
35 40 45
Ile Gly Leu His Asp Pro Thr Ala His Ala Glu Ile Met Ala Leu Arg
50 55 60
Gln Gly Gly Leu Val Met Gln Asn Tyr Arg Leu Ile Asp Ala Thr Leu
65 70 75 80
Tyr Val Thr Phe Glu Pro Cys Val Met Cys Ala Gly Ala Met Ile His
85 90 95
Ser Arg Ile Gly Arg Val Val Phe Gly Val Arg Asn Ser Lys Arg Gly
100 105 110
Ala Ala Gly Ser Leu Met Asn Val Leu Asn Tyr Pro Gly Met Asn His
115 120 125
Arg Val Glu Ile Thr Glu Gly Ile Leu Ala Asp Glu Cys Ala Ala Leu
130 135 140
Leu Cys Asp Phe Tyr Arg Met Pro Arg Gln Val Phe Asn Ala Gln Lys
145 150 155 160
Lys Ala Gln Ser Ser Ile Asn Ser Gly Gly Ser Ser Gly Gly Ser Ser
165 170 175
Gly Ser Glu Thr Pro Gly Thr Ser Glu Ser Ala Thr Pro Glu Ser Ser
180 185 190
Gly Gly Ser Ser Gly Gly Ser Asp Lys Lys Tyr Ser Ile Gly Leu Ala
195 200 205
Ile Gly Thr Asn Ser Val Gly Trp Ala Val Ile Thr Asp Glu Tyr Lys
210 215 220
Val Pro Ser Lys Lys Phe Lys Val Leu Gly Asn Thr Asp Arg His Ser
225 230 235 240
Ile Lys Lys Asn Leu Ile Gly Ala Leu Leu Phe Asp Ser Gly Glu Thr
245 250 255
Ala Glu Arg Thr Arg Leu Lys Arg Thr Ala Arg Arg Arg Tyr Thr Arg
260 265 270
Arg Lys Asn Arg Ile Cys Tyr Leu Gln Glu Ile Phe Ser Asn Glu Met
275 280 285
Ala Lys Val Asp Asp Ser Phe Phe His Arg Leu Glu Glu Ser Phe Leu
290 295 300
Val Glu Glu Asp Lys Lys His Glu Arg His Pro Ile Phe Gly Asn Ile
305 310 315 320
Val Asp Glu Val Ala Tyr His Glu Lys Tyr Pro Thr Ile Tyr His Leu
325 330 335
Arg Lys Lys Leu Val Asp Ser Thr Asp Lys Ala Asp Leu Arg Leu Ile
340 345 350
Tyr Leu Ala Leu Ala His Met Ile Lys Phe Arg Gly His Phe Leu Ile
355 360 365
Glu Gly Asp Leu Asn Pro Asp Asn Ser Asp Val Asp Lys Leu Phe Ile
370 375 380
Gln Leu Val Gln Thr Tyr Asn Gln Leu Phe Glu Glu Asn Pro Ile Asn
385 390 395 400
Ala Ser Gly Val Asp Ala Lys Ala Ile Leu Ser Ala Arg Leu Ser Lys
405 410 415
Ser Arg Arg Leu Glu Asn Leu Ile Ala Gln Leu Pro Gly Glu Lys Lys
420 425 430
Asn Gly Leu Phe Gly Asn Leu Ile Ala Leu Ser Leu Gly Leu Thr Pro
435 440 445
Asn Phe Lys Ser Asn Phe Asp Leu Ala Glu Asp Ala Lys Leu Gln Leu
450 455 460
Ser Lys Asp Thr Tyr Asp Asp Asp Leu Asp Asn Leu Leu Ala Gln Ile
465 470 475 480
Gly Asp Gln Tyr Ala Asp Leu Phe Leu Ala Ala Lys Asn Leu Ser Asp
485 490 495
Ala Ile Leu Leu Ser Asp Ile Leu Arg Val Asn Thr Glu Ile Thr Lys
500 505 510
Ala Pro Leu Ser Ala Ser Met Ile Lys Arg Tyr Asp Glu His His Gln
515 520 525
Asp Leu Thr Leu Leu Lys Ala Leu Val Arg Gln Gln Leu Pro Glu Lys
530 535 540
Tyr Lys Glu Ile Phe Phe Asp Gln Ser Lys Asn Gly Tyr Ala Gly Tyr
545 550 555 560
Ile Asp Gly Gly Ala Ser Gln Glu Glu Phe Tyr Lys Phe Ile Lys Pro
565 570 575
Ile Leu Glu Lys Met Asp Gly Thr Glu Glu Leu Leu Val Lys Leu Asn
580 585 590
Arg Glu Asp Leu Leu Arg Lys Gln Arg Thr Phe Asp Asn Gly Ser Ile
595 600 605
Pro His Gln Ile His Leu Gly Glu Leu His Ala Ile Leu Arg Arg Gln
610 615 620
Glu Asp Phe Tyr Pro Phe Leu Lys Asp Asn Arg Glu Lys Ile Glu Lys
625 630 635 640
Ile Leu Thr Phe Arg Ile Pro Tyr Tyr Val Gly Pro Leu Ala Arg Gly
645 650 655
Asn Ser Arg Phe Ala Trp Met Thr Arg Lys Ser Glu Glu Thr Ile Thr
660 665 670
Pro Trp Asn Phe Glu Glu Val Val Asp Lys Gly Ala Ser Ala Gln Ser
675 680 685
Phe Ile Glu Arg Met Thr Asn Phe Asp Lys Asn Leu Pro Asn Glu Lys
690 695 700
Val Leu Pro Lys His Ser Leu Leu Tyr Glu Tyr Phe Thr Val Tyr Asn
705 710 715 720
Glu Leu Thr Lys Val Lys Tyr Val Thr Glu Gly Met Arg Lys Pro Ala
725 730 735
Phe Leu Ser Gly Glu Gln Lys Lys Ala Ile Val Asp Leu Leu Phe Lys
740 745 750
Thr Asn Arg Lys Val Thr Val Lys Gln Leu Lys Glu Asp Tyr Phe Lys
755 760 765
Lys Ile Glu Cys Phe Asp Ser Val Glu Ile Ser Gly Val Glu Asp Arg
770 775 780
Phe Asn Ala Ser Leu Gly Thr Tyr His Asp Leu Leu Lys Ile Ile Lys
785 790 795 800
Asp Lys Asp Phe Leu Asp Asn Glu Glu Asn Glu Asp Ile Leu Glu Asp
805 810 815
Ile Val Leu Thr Leu Thr Leu Phe Glu Asp Arg Glu Met Ile Glu Glu
820 825 830
Arg Leu Lys Thr Tyr Ala His Leu Phe Asp Asp Lys Val Met Lys Gln
835 840 845
Leu Lys Arg Arg Arg Tyr Thr Gly Trp Gly Arg Leu Ser Arg Lys Leu
850 855 860
Ile Asn Gly Ile Arg Asp Lys Gln Ser Gly Lys Thr Ile Leu Asp Phe
865 870 875 880
Leu Lys Ser Asp Gly Phe Ala Asn Arg Asn Phe Met Gln Leu Ile His
885 890 895
Asp Asp Ser Leu Thr Phe Lys Glu Asp Ile Gln Lys Ala Gln Val Ser
900 905 910
Gly Gln Gly Asp Ser Leu His Glu His Ile Ala Asn Leu Ala Gly Ser
915 920 925
Pro Ala Ile Lys Lys Gly Ile Leu Gln Thr Val Lys Val Val Asp Glu
930 935 940
Leu Val Lys Val Met Gly Arg His Lys Pro Glu Asn Ile Val Ile Glu
945 950 955 960
Met Ala Arg Glu Asn Gln Thr Thr Gln Lys Gly Gln Lys Asn Ser Arg
965 970 975
Glu Arg Met Lys Arg Ile Glu Glu Gly Ile Lys Glu Leu Gly Ser Gln
980 985 990
Ile Leu Lys Glu His Pro Val Glu Asn Thr Gln Leu Gln Asn Glu Lys
995 1000 1005
Leu Tyr Leu Tyr Tyr Leu Gln Asn Gly Arg Asp Met Tyr Val Asp
1010 1015 1020
Gln Glu Leu Asp Ile Asn Arg Leu Ser Asp Tyr Asp Val Asp His
1025 1030 1035
Ile Val Pro Gln Ser Phe Leu Lys Asp Asp Ser Ile Asp Asn Lys
1040 1045 1050
Val Leu Thr Arg Ser Asp Lys Asn Arg Gly Lys Ser Asp Asn Val
1055 1060 1065
Pro Ser Glu Glu Val Val Lys Lys Met Lys Asn Tyr Trp Arg Gln
1070 1075 1080
Leu Leu Asn Ala Lys Leu Ile Thr Gln Arg Lys Phe Asp Asn Leu
1085 1090 1095
Thr Lys Ala Glu Arg Gly Gly Leu Ser Glu Leu Asp Lys Ala Gly
1100 1105 1110
Phe Ile Lys Arg Gln Leu Val Glu Thr Arg Gln Ile Thr Lys His
1115 1120 1125
Val Ala Gln Ile Leu Asp Ser Arg Met Asn Thr Lys Tyr Asp Glu
1130 1135 1140
Asn Asp Lys Leu Ile Arg Glu Val Lys Val Ile Thr Leu Lys Ser
1145 1150 1155
Lys Leu Val Ser Asp Phe Arg Lys Asp Phe Gln Phe Tyr Lys Val
1160 1165 1170
Arg Glu Ile Asn Asn Tyr His His Ala His Asp Ala Tyr Leu Asn
1175 1180 1185
Ala Val Val Gly Thr Ala Leu Ile Lys Lys Tyr Pro Lys Leu Glu
1190 1195 1200
Ser Glu Phe Val Tyr Gly Asp Tyr Lys Val Tyr Asp Val Arg Lys
1205 1210 1215
Met Ile Ala Lys Ser Glu Gln Glu Ile Gly Lys Ala Thr Ala Lys
1220 1225 1230
Tyr Phe Phe Tyr Ser Asn Ile Met Asn Phe Phe Lys Thr Glu Ile
1235 1240 1245
Thr Leu Ala Asn Gly Glu Ile Arg Lys Arg Pro Leu Ile Glu Thr
1250 1255 1260
Asn Gly Glu Thr Gly Glu Ile Val Trp Asp Lys Gly Arg Asp Phe
1265 1270 1275
Ala Thr Val Arg Lys Val Leu Ser Met Pro Gln Val Asn Ile Val
1280 1285 1290
Lys Lys Thr Glu Val Gln Thr Gly Gly Phe Ser Lys Glu Ser Ile
1295 1300 1305
Arg Pro Lys Arg Asn Ser Asp Lys Leu Ile Ala Arg Lys Lys Asp
1310 1315 1320
Trp Asp Pro Lys Lys Tyr Gly Gly Phe Leu Trp Pro Thr Val Ala
1325 1330 1335
Tyr Ser Val Leu Val Val Ala Lys Val Glu Lys Gly Lys Ser Lys
1340 1345 1350
Lys Leu Lys Ser Val Lys Glu Leu Leu Gly Ile Thr Ile Met Glu
1355 1360 1365
Arg Ser Ser Phe Glu Lys Asn Pro Ile Asp Phe Leu Glu Ala Lys
1370 1375 1380
Gly Tyr Lys Glu Val Lys Lys Asp Leu Ile Ile Lys Leu Pro Lys
1385 1390 1395
Tyr Ser Leu Phe Glu Leu Glu Asn Gly Arg Lys Arg Met Leu Ala
1400 1405 1410
Ser Ala Lys Gln Leu Gln Lys Gly Asn Glu Leu Ala Leu Pro Ser
1415 1420 1425
Lys Tyr Val Asn Phe Leu Tyr Leu Ala Ser His Tyr Glu Lys Leu
1430 1435 1440
Lys Gly Ser Pro Glu Asp Asn Glu Gln Lys Gln Leu Phe Val Glu
1445 1450 1455
Gln His Lys His Tyr Leu Asp Glu Ile Ile Glu Gln Ile Ser Glu
1460 1465 1470
Phe Ser Lys Arg Val Ile Leu Ala Asp Ala Asn Leu Asp Lys Val
1475 1480 1485
Leu Ser Ala Tyr Asn Lys His Arg Asp Lys Pro Ile Arg Glu Gln
1490 1495 1500
Ala Glu Asn Ile Ile His Leu Phe Thr Leu Thr Arg Leu Gly Ala
1505 1510 1515
Pro Arg Ala Phe Lys Tyr Phe Asp Thr Thr Ile Asp Pro Lys Gln
1520 1525 1530
Tyr Arg Ser Thr Lys Glu Val Leu Asp Ala Thr Leu Ile His Gln
1535 1540 1545
Ser Ile Thr Gly Leu Tyr Glu Thr Arg Ile Asp Leu Ser Gln Leu
1550 1555 1560
Gly Gly Asp Arg Pro Lys Lys Lys Arg Lys Val Gly Gly
1565 1570 1575
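
As a reading aid, the correspondence between SEQ ID No.7 (DNA) and SEQ ID No.8 (protein) can be spot-checked with a short script. The sketch below is illustrative only and is not part of the patent. It assumes the standard genetic code, uses only the first 60 nucleotides of SEQ ID No.7 as printed above, and assumes the reading frame starts at the atg at position 7, immediately after the leading ggatcc. The helper names `slice_1based` and `translate` are hypothetical.

```python
# Illustrative sketch, not part of the patent text.
# Standard genetic code, built from the conventional t/c/a/g codon ordering.
BASES = "tcag"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def slice_1based(seq: str, start: int, end: int) -> str:
    """Return positions start..end (1-based, inclusive), as used in sequence listings."""
    return seq[start - 1:end]


def translate(dna: str) -> str:
    """Translate DNA codon by codon; a trailing partial codon is ignored."""
    dna = dna.lower()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# First 60 nucleotides of SEQ ID No.7, with the grouping spaces removed.
SEQ7_HEAD = "ggatccatgtcagaagtcgagttctcccatgagtattggatgaggcacgccctcactctt"

if __name__ == "__main__":
    # Positions 7-60 cover 54 nucleotides, i.e. 18 codons.
    print(translate(slice_1based(SEQ7_HEAD, 7, 60)))
```

In one-letter code the result reads MSEVEFSHEYWMRHALTL, which matches the first 18 residues listed for SEQ ID No.8 (Met Ser Glu Val Glu Phe Ser His Glu Tyr Trp Met Arg His Ala Leu Thr Leu).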
<210> 9
<211> 1765
<212> DNA
<213> Artificial sequence
<400> 9
gcagcgtgac ccggtcgtgc ccctctctag agataatgag cattgcatgt ctaagttata 60
aaaaattacc acatattttt tttgtcacac ttgtttgaag tgcagtttat ctatctttat 120
acatatattt aaactttact ctacgaataa tataatctat agtactacaa taatatcagt 180
gttttagaga atcatataaa tgaacagtta gacatggtct aaaggacaat tgagtatttt 240
gacaacagga ctctacagtt ttatcttttt agtgtgcatg tgttctcctt tttttttgca 300
aatagcttca cctatataat acttcatcca ttttattagt acatccattt agggtttagg 360
gttaatggtt tttatagact aattttttta gtacatctat tttattctat tttagcctct 420
aaattaagaa aactaaaact ctattttagt ttttttattt aataatttag atataaaata 480
gaataaaata aagtgactaa aaattaaaca aatacccttt aagaaattaa aaaaactaag 540
gaaacatttt tcttgtttcg agtagataat gccagcctgt taaacgccgt cgacgagtct 600
aacggacacc aaccagcgaa ccagcagcgt cgcgtcgggc caagcgaagc agacggcacg 660
gcatctctgt cgctgcctct ggacccctct cgagagttcc gctccaccgt tggacttgct 720
ccgctgtcgg catccagaaa ttgcgtggcg gagcggcaga cgtgagccgg cacggcaggc 780
ggcctcctcc tcctctcacg gcacggcagc tacgggggat tcctttccca ccgctccttc 840
gctttccctt cctcgcccgc cgtaataaat agacaccccc tccacaccct ctttccccaa 900
cctcgtgttg ttcggagcgc acacacacac aaccagatct cccccaaatc cacccgtcgg 960
cacctccgct tcaaggtacg ccgctcgtcc tccccccccc cccctctcta ccttctctag 1020
atcggcgttc cggtccatgg ttagggcccg gtagttctac ttctgttcat gtttgtgtta 1080
gatccgtgtt tgtgttagat ccgtgctgct agcgttcgta cacggatgcg acctgtacgt 1140
cagacacgtt ctgattgcta acttgccagt gtttctcttt ggggaatcct gggatggctc 1200
tagccgttcc gcagacggga tcgatttcat gatttttttt gtttcgttgc atagggtttg 1260
gtttgccctt ttcctttatt tcaatatatg ccgtgcactt gtttgtcggg tcatcttttc 1320
atgctttttt tttgtcttgg ttgtgatgat gtggtgtggt tgggcggtcg ttcattcgtt 1380
ctagatcgga gtagaatact gtttcaaact acctggtgta tttattaatt ttggaactgt 1440
atgtgtgtgt catacatctt catagttacg agtttaagat ggatggaaat atcgatctag 1500
gataggtata catgttgatg tgggttttac tgatgcatat acatgatggc atatgcagca 1560
tctattcata tgctctaacc ttgagtacct atctattata ataaacaagt atgttttata 1620
attattttga tcttgatata cttggatgat ggcatatgca gcagctatat gtggattttt 1680
ttagccctgc cttcatacgc tatttatttg cttggtactg tttcttttgt cgatgctcac 1740
cctgttgttt ggtgttactt ctgca 1765
<210> 10
<211> 253
<212> DNA
<213> Artificial sequence
<400> 10
gatcgttcaa acatttggca ataaagtttc ttaagattga atcctgttgc cggtcttgcg 60
atgattatca tataatttct gttgaattac gttaagcatg taataattaa catgtaatgc 120
atgacgttat ttatgagatg ggtttttatg attagagtcc cgcaattata catttaatac 180
gcgatagaaa acaaaatata gcgcgcaaac taggataaat tatcgcgcgc ggtgtcatct 240
atgttactag atc 253
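
Each record in this listing declares its length in the `<211>` field, so the printed residues can be checked against it. The following is a minimal sketch, not part of the patent, that parses one nucleotide record (SEQ ID No.10, copied from above) and compares the declared and actual lengths. The function name `parse_record` is hypothetical, and the parser handles only DNA records laid out like the ones here: numeric identifier lines followed by lowercase bases with a running count at the end of each line.

```python
# Illustrative sketch, not part of the patent text.
import re

RECORD = """\
<210> 10
<211> 253
<212> DNA
<213> Artificial sequence
<400> 10
gatcgttcaa acatttggca ataaagtttc ttaagattga atcctgttgc cggtcttgcg 60
atgattatca tataatttct gttgaattac gttaagcatg taataattaa catgtaatgc 120
atgacgttat ttatgagatg ggtttttatg attagagtcc cgcaattata catttaatac 180
gcgatagaaa acaaaatata gcgcgcaaac taggataaat tatcgcgcgc ggtgtcatct 240
atgttactag atc 253
"""


def parse_record(text: str):
    """Split a nucleotide record into its <NNN> fields and the joined sequence."""
    fields = {}
    chunks = []
    for raw in text.strip().splitlines():
        line = raw.strip()
        tag = re.match(r"<(\d{3})>\s*(.*)", line)
        if tag:
            fields[tag.group(1)] = tag.group(2).strip()
        elif line:
            # Drop grouping spaces and the running position count.
            chunks.append(re.sub(r"[^a-z]", "", line.lower()))
    return fields, "".join(chunks)


if __name__ == "__main__":
    fields, seq = parse_record(RECORD)
    print("declared:", fields["211"], "actual:", len(seq))
```

For this record the declared and counted lengths agree at 253 nucleotides. Protein records would need a different tokenizer, since they use three-letter residue codes.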
<210> 11
<211> 990
<212> DNA
<213> Artificial sequence
<400> 11
gcaggctgtc gactggatcc aagcttaaga acgaactaag ccggacaaaa aaaggagcac 60
atatacaaac cggttttatt catgaatggt cacgatggat gatggggctc agacttgagc 120
tacgaggccg caggcgagag aagcctagtg tgctctctgc ttgtttgggc cgtaacggag 180
gatacggccg acgagcgtgt actaccgcgc gggatgccgc tgggcgctgc gggggccgtt 240
ggatggggat cggtgggtcg cgggagcgtt gaggggagac aggtttagta ccacctcgcc 300
taccgaacaa tgaagaaccc accttataac cccgcgcgct gccgcttgtg ttggctagga 360
tccatcgcag tcagcgatga gtacagcaag ttttagagct agaaatagca agttaaaata 420
aggctagtcc gttatcaact tgaaaaagtg gcaccgagtc ggtgcttttt tttgagattt 480
ccaaccaggt ccctggagcc catagtctag taacggccgc cagtgtgctg gaattgccct 540
tggatcatga accaacggcc tggctgtatt tggtggttgt gtagggagat ggggagaaga 600
aaagcccgat tctcttcgct gtgatgggct ggatgcatgc gggggagcgg gaggcccaag 660
tacgtgcacg gtgagcggcc cacagggcga gtgtgagcgc gagaggcggg aggaacagtt 720
tagtaccaca ttgcccagct aactcgaacg cgaccaactt ataaacccgc gcgctgtcgc 780
ttgtgtagag accaaaggag gtctcagttt tagagctaga aatagcaagt taaaataagg 840
ctagtccgtt atcaacttga aaaagtggca ccgagtcggt gctttttttt gtcccttcga 900
agggcaattc tgcagatatc catcacactg gcggccgctc gaggtcgacg gtatcgataa 960
gcttgatatc gaattcaccc agctttcttg 990
<210> 12
<211> 1095
<212> DNA
<213> Artificial sequence
<400> 12
atgtccgaag tggaatttag ccatgaatat tggatgcggc acgccctcac gcttgccaag 60
agagcctggg atgagaggga ggttcccgtc ggtgccgtgt tggtccataa caacagggtg 120
attggggaag gatggaacag acccattggg cgccatgatc caactgccca tgcagagatt 180
atggcgctca ggcaaggggg gttggttatg caaaactacc ggcttattga cgcaaccctg 240
tatgtcaccc ttgaaccctg tgttatgtgc gcgggggcca tgatacactc tcggataggg 300
cgggtggtgt tcggggctcg ggatgctaag accggagctg ctggttccct catggatgtc 360
ttgcatcatc ctggtatgaa ccatagagtc gagattactg aaggcattct cgcagacgaa 420
tgcgctgccc ttctctcaga tttctttaga atgcgcagac aggaaataaa ggctcaaaaa 480
aaagcacaga gttccacgga ttccggcggg tcgagcggtg gcagctccgg ctccgagaca 540
cccggtacga gtgaatccgc tacgcccgaa tcctcggggg gaagctctgg aggctcatca 600
gaagtcgagt tctcccatga gtattggatg aggcacgccc tcactcttgc gaagagggcc 660
agggacgaga gggaggtgcc ggtcggtgct gtcctggtct tgaataacag ggtgataggc 720
gaaggttgga acagggctat tggccttcat gaccctactg ctcatgcgga aatcatggca 780
cttagacagg ggggcctcgt tatgcaaaat taccgcctga tcgacgccac tctttatgtc 840
acatttgaac catgtgttat gtgtgcgggc gctatgatcc attcacgcat aggtcgcgtg 900
gtttttggag ttcgcaacgc gaaaacaggg gctgcaggct ctctgatgga cgttttgcac 960
tatccgggaa tgaaccatag agtcgaaatc acagaaggga ttttggcaga cgaatgcgcg 1020
gctcttcttt gttatttttt cagaatgccc cgccaagtgt ttaatgctca aaagaaagcg 1080
cagagtagca cagac 1095

Claims (10)

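1. Use of a fusion protein in single-base editing in plants, wherein the fusion protein is named TadA-R-cas and comprises a Cas protein and an adenine deaminase, the adenine deaminase being a protein whose amino acid sequence is positions 1-167 of SEQ ID No.2.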
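2. The use according to claim 1, characterized in that the Cas protein is ScCas9(D10A), SpRY(D10A), SpCas9(D10A) or SpCas9-NG(D10A); the SpCas9(D10A) is a protein whose amino acid sequence is positions 200-1567 of SEQ ID No.2, the SpCas9-NG(D10A) is a protein whose amino acid sequence is positions 200-1567 of SEQ ID No.4, the ScCas9(D10A) is a protein whose amino acid sequence is positions 200-1574 of SEQ ID No.6, and the SpRY(D10A) is a protein whose amino acid sequence is positions 200-1567 of SEQ ID No.8.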
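3. The use according to claim 1 or 2, characterized in that the fusion protein is a protein formed by linking the adenine deaminase, the Cas protein and a nuclear localization signal.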
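4. The use according to any one of claims 1-3, characterized in that the fusion protein is TadA-R-ScCas9(D10A), TadA-R-SpRY(D10A), TadA-R-SpCas9(D10A) or TadA-R-SpCas9-NG(D10A); the TadA-R-SpCas9(D10A) is a protein whose amino acid sequence is SEQ ID No.2, the TadA-R-SpCas9-NG(D10A) is a protein whose amino acid sequence is SEQ ID No.4, the TadA-R-ScCas9(D10A) is a protein whose amino acid sequence is SEQ ID No.6, and the TadA-R-SpRY(D10A) is a protein whose amino acid sequence is SEQ ID No.8.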
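5. Use of a biological material related to the fusion protein of any one of claims 1-4 in single-base editing in plants, the biological material being any one of the following:
C1) a DNA molecule encoding the fusion protein of any one of claims 1-4;
C2) an expression cassette containing the DNA molecule of C1);
C3) a recombinant vector containing the DNA molecule of C1);
C4) a recombinant microorganism containing the DNA molecule of C1);
C5) a recombinant vector containing the expression cassette of C2);
C6) a recombinant microorganism containing the expression cassette of C2);
C7) a recombinant microorganism containing the recombinant vector of C3).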
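6. The use according to claim 5, characterized in that the DNA molecule of C1) contains a gene encoding the adenine deaminase, the nucleotide sequence of which is nucleotides 7-507 of SEQ ID No.1.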
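7. The use according to claim 6, characterized in that the DNA molecule of C1) is the gene encoding TadA-R-ScCas9(D10A), the gene encoding TadA-R-SpRY(D10A), the gene encoding TadA-R-SpCas9(D10A) or the gene encoding TadA-R-SpCas9-NG(D10A), each as defined in claim 4; the coding sequence of the coding strand of the gene encoding TadA-R-SpCas9(D10A) is SEQ ID No.1, that of the gene encoding TadA-R-SpCas9-NG(D10A) is SEQ ID No.3, that of the gene encoding TadA-R-ScCas9(D10A) is SEQ ID No.5, and that of the gene encoding TadA-R-SpRY(D10A) is SEQ ID No.7.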
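8. A method for site-directed mutation of A to G in a plant genome, comprising the following step: introducing into a recipient plant a DNA molecule expressing the fusion protein of any one of claims 1-4 and an sgRNA, to obtain a target plant containing the site-directed A-to-G mutation; the target sequence of the sgRNA is 5′-N19-20PAM-3′, where N19-20 denotes 19-20 N, the PAM is 3 N, and each N is A, G, C or T.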
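9. Use, in single-base editing in plants, of the adenine deaminase recited in the use of any one of claims 1-4, or of a nucleic acid molecule encoding said adenine deaminase.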
10. The fusion protein of any one of claims 1-4 or the biological material of claim 5.
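
The target layout in claim 8 (a 19-20 nt protospacer followed by a 3 nt PAM) can be made concrete with a small scanner. The sketch below is illustrative only and is not the applicants' method: it walks the forward strand of a DNA string, takes each window of `spacer_len` bases plus the next three bases as the PAM, and reports where adenines sit in the protospacer. The names `Target` and `candidate_targets`, the toy input sequence, and the optional `pam_pattern` filter are all assumptions. In particular, the `[ACGT]GG` pattern used in the example is the PAM commonly reported for SpCas9 and is not stated in the claims, which define the PAM only as three N. The sketch does not model an editing window, editing efficiency, or the reverse strand.

```python
# Illustrative sketch, not part of the patent text.
import re
from dataclasses import dataclass
from typing import List


@dataclass
class Target:
    start: int               # 1-based position of the first protospacer base
    protospacer: str
    pam: str
    a_positions: List[int]   # 1-based positions of A within the protospacer


def candidate_targets(seq: str, spacer_len: int = 20,
                      pam_pattern: str = "[ACGT]{3}") -> List[Target]:
    """List forward-strand 5'-N(spacer_len)-PAM-3' windows that contain at least one A."""
    seq = seq.upper()
    hits = []
    for i in range(len(seq) - spacer_len - 3 + 1):
        protospacer = seq[i:i + spacer_len]
        pam = seq[i + spacer_len:i + spacer_len + 3]
        if not re.fullmatch(pam_pattern, pam):
            continue
        a_positions = [k + 1 for k, base in enumerate(protospacer) if base == "A"]
        if a_positions:
            hits.append(Target(i + 1, protospacer, pam, a_positions))
    return hits


if __name__ == "__main__":
    toy = "ACGTACGTACGTACGTACGTAGGTT"  # made-up 25 nt sequence
    for hit in candidate_targets(toy, pam_pattern="[ACGT]GG"):
        print(hit.start, hit.protospacer, hit.pam, hit.a_positions)
```

Leaving `pam_pattern` at its default accepts any three-base PAM, which mirrors the claim wording; a stricter pattern narrows the list to sites compatible with a particular Cas variant.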
CN202010980266.2A 2020-09-17 2020-09-17 A set of adenine base editors and related biological materials and applications thereof Pending CN112143753A (zh)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010980266.2A CN112143753A (zh) 2020-09-17 2020-09-17 A set of adenine base editors and related biological materials and applications thereof

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010980266.2A CN112143753A (zh) 2020-09-17 2020-09-17 A set of adenine base editors and related biological materials and applications thereof

Publications (1)

Publication Number Publication Date
CN112143753A (zh) 2020-12-29

Family

ID=73894021

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010980266.2A Pending CN112143753A (zh) 2020-09-17 2020-09-17 A set of adenine base editors and related biological materials and applications thereof

Country Status (1)

Country Link
CN (1) CN112143753A (zh)

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
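CN113699135A (zh) * 2021-08-10 2021-11-26 Science and Technology Research Institute of the National Health Commission PAM-unrestricted adenine base editor fusion protein and application thereof
CN114045277A (zh) * 2021-10-21 2022-02-15 Fudan University Base editor and construction method and application thereof
CN114438110A (zh) * 2022-01-25 2022-05-06 ZJU-Hangzhou Global Scientific and Technological Innovation Center Precise PAM-unrestricted adenine base editor and construction method thereof
CN114560946A (zh) * 2020-11-27 2022-05-31 East China Normal University PAM-unrestricted adenine single-base editing product, method and application
CN114606227A (zh) * 2022-02-22 2022-06-10 Fudan University High-precision adenine base editor and application thereof
CN114835818A (zh) * 2022-03-17 2022-08-02 Jiangnan University Gene editing fusion protein, adenine base editor constructed therefrom, and application thereof
WO2023036189A1 (zh) * 2021-09-07 2023-03-16 East China Normal University Adenine deaminase, adenine base editor comprising same, and application thereof
WO2023125814A1 (zh) * 2021-12-29 2023-07-06 East China Normal University Adenine deaminase and application thereof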
WO2023163806A1 (en) * 2022-02-22 2023-08-31 Massachusetts Institute Of Technology Engineered nucleases and methods of use thereof
WO2023169454A1 (zh) * 2022-03-08 2023-09-14 Institute of Genetics and Developmental Biology, Chinese Academy of Sciences Adenine deaminase and use thereof in base editing

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
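CN109652439A (zh) * 2018-12-27 2019-04-19 Yichun University Method for improving broad-spectrum rice blast resistance in rice using a CRISPR/Cas9-mediated adenine base editing system
CN110029096A (zh) * 2019-05-09 2019-07-19 ShanghaiTech University Adenine base editing tool and use thereof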

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
FANG YAN et al.: "Highly Efficient A·T to G·C Base Editing by Cas9n-Guided tRNA Adenosine Deaminase in Rice", MOL PLANT *
FANG YAN et al.: "Highly Efficient A•T to G•C Base Editing by Cas9n-Guided tRNA Adenosine Deaminase in Rice", MOL PLANT *
MICHELLE F RICHTER et al.: "Phage-assisted evolution of an adenine base editor with improved Cas domain compatibility and activity", NAT BIOTECHNOL *

Cited By (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
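CN114560946A (zh) * 2020-11-27 2022-05-31 East China Normal University PAM-unrestricted adenine single-base editing product, method and application
WO2023015759A1 (zh) * 2021-08-10 2023-02-16 Science and Technology Research Institute of the National Health Commission PAM-unrestricted adenine base editor fusion protein and application thereof
CN113699135A (zh) * 2021-08-10 2021-11-26 Science and Technology Research Institute of the National Health Commission PAM-unrestricted adenine base editor fusion protein and application thereof
CN113699135B (zh) * 2021-08-10 2022-05-24 Science and Technology Research Institute of the National Health Commission PAM-unrestricted adenine base editor fusion protein and application thereof
WO2023036189A1 (zh) * 2021-09-07 2023-03-16 East China Normal University Adenine deaminase, adenine base editor comprising same, and application thereof
CN114045277A (zh) * 2021-10-21 2022-02-15 Fudan University Base editor and construction method and application thereof
WO2023125814A1 (zh) * 2021-12-29 2023-07-06 East China Normal University Adenine deaminase and application thereof
CN114438110A (zh) * 2022-01-25 2022-05-06 ZJU-Hangzhou Global Scientific and Technological Innovation Center Precise PAM-unrestricted adenine base editor and construction method thereof
CN114438110B (zh) * 2022-01-25 2023-08-04 ZJU-Hangzhou Global Scientific and Technological Innovation Center Precise PAM-unrestricted adenine base editor and construction method thereof
CN114606227A (zh) * 2022-02-22 2022-06-10 Fudan University High-precision adenine base editor and application thereof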
WO2023163806A1 (en) * 2022-02-22 2023-08-31 Massachusetts Institute Of Technology Engineered nucleases and methods of use thereof
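CN114606227B (zh) * 2022-02-22 2024-03-08 Fudan University High-precision adenine base editor and application thereof
WO2023169454A1 (zh) * 2022-03-08 2023-09-14 Institute of Genetics and Developmental Biology, Chinese Academy of Sciences Adenine deaminase and use thereof in base editing
CN114835818A (zh) * 2022-03-17 2022-08-02 Jiangnan University Gene editing fusion protein, adenine base editor constructed therefrom, and application thereof
CN114835818B (zh) * 2022-03-17 2024-03-22 Jiangnan University Gene editing fusion protein, adenine base editor constructed therefrom, and application thereof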

Similar Documents

Publication Publication Date Title
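CN112143753A (zh) A set of adenine base editors and related biological materials and applications thereof
CN112126637B (zh) Adenosine deaminase and related biological materials and applications thereof
CN107177625B (zh) Artificial vector system for site-directed mutagenesis and site-directed mutagenesis method
CN109652422B (zh) Efficient single-base editing system OsSpCas9-eCDA and application thereof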
US8597950B2 (en) Two-component RNA virus-derived plant expression system
CN110526993B (zh) Nucleic acid construct for gene editing
CA2976387A1 (en) Soybean u6 small nuclear rna gene promoters and their use in constitutive expression of small rna genes in plants
Yamchi et al. Proline accumulation in transgenic tobacco as a result of expression of Arabidopsis Δ 1-pyrroline-5-carboxylate synthetase (P5CS) during osmotic stress
IE913884A1 (en) Plasmids for the production of transgenic plants that are¹modified in habit and yield
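CN111116725B (zh) Application of gene Os11g0682000 and its encoded protein in regulating resistance to rice bacterial blight
CN109929019B (zh) Plant saline-alkali tolerance-related protein GsERF7, its coding gene and application
CN112457380A (zh) Protein regulating plant fruit shape and/or juice content, and related biological materials and applications thereof
CN112080513A (zh) A set of rice artificial genome editing systems with expanded editing range and application thereof
CN115466747B (zh) Glycosyltransferase ZmKOB1 gene and its application in regulating maize ear seed-setting traits or development
CN114349833B (zh) Application of calmodulin-binding protein cold12 in regulating plant cold tolerance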
US20230313212A1 (en) Plastid transformation by complementation of nuclear mutations
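CN114752573B (zh) Use of rice OsGA20ox2 protein and its coding gene in improving plant resistance to abiotic stress
CN110684114A (zh) Application of plant stress tolerance-related protein TaBAKL in regulating plant stress tolerance
CN114349832B (zh) Application of calmodulin-binding protein cold13 in regulating plant cold tolerance
CN115851784B (zh) Plant cytosine base editing system constructed using an Lbcpf1 variant and application thereof
CN116286742B (zh) CasD protein, CRISPR/CasD gene editing system and application thereof in plant gene editing
CN114672513B (zh) Gene editing system and application thereof
CN114752620B (zh) Application of ZmGW3 protein and its gene in regulating maize kernel development
CN114196644B (zh) Protein palmitoyltransferase dhhc16 and its application in improving rice salt tolerance
KR100468624B1 (ko) Method for plastid transformation using microbial recombinase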

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20201229