CN114908116B - Method for precise multiplex gene editing in rice using surrogate prime editors - Google Patents

Method for precise multiplex gene editing in rice using surrogate prime editors

Info

Publication number
CN114908116B
Authority
CN
China
Prior art keywords
leu
sequence
lys
trna
gene
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202210466252.8A
Other languages
English (en)
Other versions
CN114908116A (zh)
Inventor
夏兰琴
李慧园
朱紫薇
李少雅
闫磊
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Institute of Crop Sciences of Chinese Academy of Agricultural Sciences
Original Assignee
Institute of Crop Sciences of Chinese Academy of Agricultural Sciences
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Institute of Crop Sciences of Chinese Academy of Agricultural Sciences
Priority to CN202210466252.8A
Publication of CN114908116A
Application granted
Publication of CN114908116B
Legal status: Active
Anticipated expiration

Classifications

    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N9/00 Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
    • C12N9/14 Hydrolases (3)
    • C12N9/16 Hydrolases (3) acting on ester bonds (3.1)
    • C12N9/22 Ribonucleases RNAses, DNAses
    • C CHEMISTRY; METALLURGY
    • C07 ORGANIC CHEMISTRY
    • C07K PEPTIDES
    • C07K14/00 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof
    • C07K14/415 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from plants
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/11 DNA or RNA fragments; Modified forms thereof; Non-coding nucleic acids having a biological activity
    • C12N15/113 Non-coding nucleic acids modulating the expression of genes, e.g. antisense oligonucleotides; Antisense DNA or RNA; Triplex-forming oligonucleotides; Catalytic nucleic acids, e.g. ribozymes; Nucleic acids used in co-suppression or gene silencing
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/63 Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
    • C12N15/65 Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression using markers
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/63 Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
    • C12N15/79 Vectors or expression systems specially adapted for eukaryotic hosts
    • C12N15/82 Vectors or expression systems specially adapted for eukaryotic hosts for plant cells, e.g. plant artificial chromosomes (PACs)
    • C12N15/8216 Methods for controlling, regulating or enhancing expression of transgenes in plant cells
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/63 Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
    • C12N15/79 Vectors or expression systems specially adapted for eukaryotic hosts
    • C12N15/82 Vectors or expression systems specially adapted for eukaryotic hosts for plant cells, e.g. plant artificial chromosomes (PACs)
    • C12N15/8216 Methods for controlling, regulating or enhancing expression of transgenes in plant cells
    • C12N15/8218 Antisense, co-suppression, viral induced gene silencing [VIGS], post-transcriptional induced gene silencing [PTGS]
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/63 Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
    • C12N15/79 Vectors or expression systems specially adapted for eukaryotic hosts
    • C12N15/82 Vectors or expression systems specially adapted for eukaryotic hosts for plant cells, e.g. plant artificial chromosomes (PACs)
    • C12N15/8241 Phenotypically and genetically modified plants via recombinant DNA technology
    • C12N15/8261 Phenotypically and genetically modified plants via recombinant DNA technology with agronomic (input) traits, e.g. crop yield
    • C12N15/8271 Phenotypically and genetically modified plants via recombinant DNA technology with agronomic (input) traits, e.g. crop yield for stress resistance, e.g. heavy metal resistance
    • C12N15/8274 Phenotypically and genetically modified plants via recombinant DNA technology with agronomic (input) traits, e.g. crop yield for stress resistance, e.g. heavy metal resistance for herbicide resistance
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N9/00 Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
    • C12N9/10 Transferases (2.)
    • C12N9/12 Transferases (2.) transferring phosphorus containing groups, e.g. kinases (2.7)
    • C12N9/1241 Nucleotidyltransferases (2.7.7)
    • C12N9/1276 RNA-directed DNA polymerase (2.7.7.49), i.e. reverse transcriptase or telomerase
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Y ENZYMES
    • C12Y207/00 Transferases transferring phosphorus-containing groups (2.7)
    • C12Y207/07 Nucleotidyltransferases (2.7.7)
    • C12Y207/07049 RNA-directed DNA polymerase (2.7.7.49), i.e. telomerase or reverse-transcriptase
    • C CHEMISTRY; METALLURGY
    • C07 ORGANIC CHEMISTRY
    • C07K PEPTIDES
    • C07K2319/00 Fusion polypeptide
    • C07K2319/01 Fusion polypeptide containing a localisation/targetting motif
    • C07K2319/09 Fusion polypeptide containing a localisation/targetting motif containing a nuclear localisation signal
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N2310/00 Structure or type of the nucleic acid
    • C12N2310/10 Type of nucleic acid
    • C12N2310/20 Type of nucleic acid involving clustered regularly interspaced short palindromic repeats [CRISPRs]
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N2830/00 Vector systems having a special element relevant for transcription
    • C12N2830/34 Vector systems having a special element relevant for transcription being a transcription initiation element
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N2830/00 Vector systems having a special element relevant for transcription
    • C12N2830/50 Vector systems having a special element relevant for transcription regulating RNA stability, not being an intron, e.g. poly A signal

Landscapes

  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Genetics & Genomics (AREA)
  • Chemical & Material Sciences (AREA)
  • Engineering & Computer Science (AREA)
  • Organic Chemistry (AREA)
  • Zoology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Wood Science & Technology (AREA)
  • General Engineering & Computer Science (AREA)
  • Molecular Biology (AREA)
  • Biomedical Technology (AREA)
  • Biotechnology (AREA)
  • General Health & Medical Sciences (AREA)
  • Biochemistry (AREA)
  • Microbiology (AREA)
  • Biophysics (AREA)
  • Physics & Mathematics (AREA)
  • Plant Pathology (AREA)
  • Medicinal Chemistry (AREA)
  • Cell Biology (AREA)
  • Virology (AREA)
  • Botany (AREA)
  • Gastroenterology & Hepatology (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)

Abstract

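The invention discloses a method for precise multiplex gene editing in rice using surrogate prime editors. The invention provides a vector that expresses an antibiotic-resistance selectable marker protein and a fusion protein formed by fusing a Cas9 nickase with a reverse transcriptase or a variant thereof, and that contains DNA fragment A. DNA fragment A contains: 1) a promoter; 2) polyA and terminator sequences; and, located between 1) and 2), DNA sequence I and an insertion site. DNA sequence I expresses a guide RNA set that targets the OsALS gene and mutates serine 627 of the OsALS-encoded protein to isoleucine. The guide RNA set is a single tandem array consisting, from 5' to 3', of tRNA, pegRNA, tRNA, sgRNA and tRNA. The surrogate prime editors of the invention (PE3-AS, PE3-DS) greatly improve the efficiency of prime-editing-mediated precise gene editing and enable efficient prime-editing-mediated precise multiplex gene editing in rice.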

Description

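Method for precise multiplex gene editing in rice using surrogate prime editors
Technical Field
The invention relates to the field of biotechnology, and in particular to a method for precise multiplex gene editing in rice using surrogate prime editors.
Background Art
Genome editing has become an important tool for functional validation of key genes and for genetic improvement of crops. Because it is simple to use, inexpensive and capable of multiplex editing, the CRISPR/Cas9 system shows broad promise in crop breeding. CRISPR/Cas9-based crop improvement falls into three main types: targeted knockout, targeted modification, and targeted integration of foreign genes. CRISPR/Cas9-mediated targeted knockout usually produces random insertions and deletions in the genome, whereas the differences between superior alleles of different crop varieties are usually caused by the insertion of, or a difference in, one or more specific bases in the promoter or coding region of a gene; the target gene therefore needs to be modified precisely. Base editing is relatively efficient, but it can only convert single bases at specific genomic sites and is constrained by its editing window. CRISPR/Cas9-mediated homologous recombination can achieve precise editing at any genomic position, but it occurs at low efficiency in plants. An efficient precise gene editing system therefore urgently needs to be established in crops and applied to their precise improvement.
Previously, the team of Professor David R. Liu at Harvard University fused the Cas9 nickase nCas9(H840A) to a reverse transcriptase mutant (Engineered M-MLV-RT) and developed, in mammals, a series of new precise genome editing systems, the prime editors. In a prime editing system, the pegRNA (prime editing guide RNA) carries a primer binding site (PBS) at the 3' end of the sgRNA scaffold; the PBS anneals to the non-target strand nicked by nCas9, and the reverse transcription (RT) template on the pegRNA, which carries the desired mutation, is then used as the template for extension, producing single-stranded DNA containing the desired mutation. The cell then introduces the mutation into the genome through DNA damage repair and replication. In addition, introducing an sgRNA that nicks the non-edited strand helps increase prime editing efficiency. The system achieves targeted insertions, deletions, and all types of base transitions and transversions without DNA double-strand breaks or a donor DNA repair template, offering a way to improve the efficiency of precise editing in crops.
Given the great potential of prime editing for precise genome modification, several laboratories have explored its feasibility and effectiveness in plants. Compared with human cells, prime editing efficiency in plant cells is generally low and strongly target-dependent, which largely limits the broad application of prime editing in crops. Moreover, to date prime editing has only achieved precise editing of a single gene at a time in animals and plants; simultaneous prime-editing-mediated precise editing of multiple superior alleles has not been reported.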
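Summary of the Invention
The object of the invention is to provide a method for precise multiplex gene editing in rice using surrogate prime editors.
In a first aspect, the invention claims a vector, designated vector A.
Vector A is capable of expressing (A1) and (A2) below and contains (A3) below:
(A1) an antibiotic-resistance selectable marker protein;
(A2) a fusion protein formed by fusing a1) and a2):
a1) a Cas9 nickase or a variant thereof;
a2) a reverse transcriptase or a variant thereof;
(A3) DNA fragment A, which contains:
b1) a promoter;
b2) polyA and terminator sequences;
b3) DNA sequence I, located between b1) and b2); DNA sequence I is capable of expressing a guide RNA set that targets the OsALS gene and mutates serine 627 of the OsALS-encoded protein to isoleucine;
b4) an insertion site, located between b1) and b2); the insertion site is used for inserting DNA sequence II; DNA sequence II is capable of expressing one or more guide RNA sets directed against one or more target genes;
The order of b3) and b4) is not fixed: either b3) or b4) may come first.
Each guide RNA set is a single tandem array consisting, from 5' to 3', of tRNA, pegRNA, tRNA, sgRNA and tRNA. The tRNAs are self-cleaved in vivo, releasing the individual pegRNAs and sgRNAs that they link.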
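In a second aspect, the invention claims a vector, designated vector B.
Vector B is capable of expressing (B1) below and contains (B2) and (B3) below:
(B1) a fusion protein formed by fusing a1) and a2):
a1) a Cas9 nickase or a variant thereof;
a2) a reverse transcriptase or a variant thereof;
(B2) an expression cassette containing a mutant hygromycin-resistance selectable marker gene; the mutant is obtained by changing the codon encoding tyrosine 46 of the wild-type hygromycin-resistance selectable marker gene to a stop codon;
(B3) DNA fragment B, which contains:
c1) a promoter;
c2) polyA and terminator sequences;
c3) DNA sequence I, located between c1) and c2); DNA sequence I is capable of expressing a guide RNA set that targets the OsALS gene and mutates serine 627 of the OsALS-encoded protein to isoleucine;
c4) an insertion site, located between c1) and c2); the insertion site is used for inserting DNA sequence II; DNA sequence II is capable of expressing one or more guide RNA sets directed against one or more target genes;
c5) DNA sequence III, located between c1) and c2); DNA sequence III is capable of expressing a guide RNA set that targets the mutant hygromycin-resistance selectable marker gene of (B2) and reverts the stop codon at position 46 of that mutant to a tyrosine codon;
The order of c3), c4) and c5) is not fixed.
Each guide RNA set is a single tandem array consisting, from 5' to 3', of tRNA, pegRNA, tRNA, sgRNA and tRNA. The tRNAs are self-cleaved in vivo, releasing the individual pegRNAs and sgRNAs that they link.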
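In the first and second aspects above, vector A and vector B may each be a circular vector.
In the first and second aspects above, if vector A or vector B contains several guide RNA sets, two adjacent guide RNA sets may share the tRNA at their junction.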
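The tandem layout with shared junction tRNAs can be sketched in a few lines of Python. This is an illustrative sketch only: the function name `build_array` is hypothetical, and the elements are text labels rather than real sequences. It shows that two adjacent guide RNA sets need five tRNAs rather than six once the junction tRNA is shared.

```python
def build_array(targets):
    """Return the 5'->3' element order of a tandem guide RNA array.

    Each target contributes tRNA-pegRNA-tRNA-sgRNA-tRNA. Adjacent sets
    share the tRNA at their junction, so that tRNA is emitted only once.
    """
    elements = ["tRNA"]  # leading tRNA of the first set
    for gene in targets:
        elements += [f"pegRNA({gene})", "tRNA", f"sgRNA({gene})", "tRNA"]
    return elements


array = build_array(["hptII", "OsALS"])
print("-".join(array))
# tRNA-pegRNA(hptII)-tRNA-sgRNA(hptII)-tRNA-pegRNA(OsALS)-tRNA-sgRNA(OsALS)-tRNA
print(array.count("tRNA"))  # 5 tRNAs for two sets (2 * 2 + 1)
```

In general, n guide RNA sets joined this way contain 2n + 1 tRNAs.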
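In the first and second aspects above, the Cas9 nickase may be nCas9(H840A); the reverse transcriptase may be an M-MLV-RT (Moloney murine leukemia virus reverse transcriptase) mutant (mutations: H9Y, D200N, T306K, W313F, T330P, L603W).
Further, a nuclear localization signal (NLS) may be attached to each end of the fusion protein.
Further, in the fusion protein, a1) and a2) may be joined by a linker.
In the vector, the fusion protein is expressed from an expression cassette containing the gene encoding the fusion protein. The expression product of this cassette consists, from N-terminus to C-terminus, of an NLS, nCas9(H840A), a linker peptide, the M-MLV-RT mutant and an NLS. Further, the amino acid sequence of the expression product is shown in SEQ ID No.3, in which positions 1-7 are the NLS, positions 16-1382 are nCas9(H840A), positions 1383-1415 are the linker peptide, positions 1416-2092 are the M-MLV-RT mutant, and positions 2107-2122 are the NLS. The expression cassette containing the gene encoding the fusion protein consists, from 5' to 3', of the maize enhanced Ubiquitin promoter, the DNA sequence encoding the NLS, the gene encoding nCas9(H840A), the DNA sequence encoding the linker peptide, the gene encoding the M-MLV-RT mutant, the DNA sequence encoding the NLS, PolyA, and the pea Rubisco small subunit E9 terminator (Ubi-NLS-nCas9(H840A)-Linker(33aa)-M-MLV-RT-NLS-PolyA-E9).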
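The domain coordinates stated for SEQ ID No.3 can be checked with a short script. The sketch below is only a bookkeeping aid, with hypothetical helper names (`length`, `gaps`); it computes each domain's length and lists the stretches of the fusion protein that the text leaves unannotated.

```python
# Domain coordinates (1-based, inclusive) as stated for SEQ ID No.3.
DOMAINS = [
    ("NLS (N-terminal)", 1, 7),
    ("nCas9(H840A)", 16, 1382),
    ("linker peptide", 1383, 1415),
    ("M-MLV-RT mutant", 1416, 2092),
    ("NLS (C-terminal)", 2107, 2122),
]


def length(start, end):
    """Length of a 1-based inclusive interval."""
    return end - start + 1


def gaps(domains):
    """Return intervals between consecutive domains that carry no annotation."""
    out = []
    for (_, _, prev_end), (_, next_start, _) in zip(domains, domains[1:]):
        if next_start > prev_end + 1:
            out.append((prev_end + 1, next_start - 1))
    return out


for name, start, end in DOMAINS:
    print(f"{name}: {length(start, end)} aa")

print("unannotated:", gaps(DOMAINS))
```

The linker interval comes out at 33 residues, which agrees with the "Linker(33aa)" label in the cassette name.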
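Further, the sequence (reverse complement) of the expression cassette containing the gene encoding the fusion protein is shown at positions 14-9317 of SEQ ID No.1 (corresponding to positions 14-9317 of SEQ ID No.2).
In DNA fragment A and DNA fragment B, the promoter is the rice constitutive Actin promoter and the terminator sequence is the Nos terminator.
In the guide RNA set expressed by DNA sequence I, the pegRNA sequence is positions 10809-10929 of SEQ ID No.1 (corresponding to positions 11187-11307 of SEQ ID No.2), and the sgRNA is positions 11007-11102 of SEQ ID No.1 (corresponding to positions 11385-11480 of SEQ ID No.2).
Further, the sequence of the guide RNA set expressed by DNA sequence I is positions 10732-11179 of SEQ ID No.1 (corresponding to positions 11110-11557 of SEQ ID No.2).
In the guide RNA set expressed by DNA sequence III, the pegRNA sequence is positions 10809-10936 of SEQ ID No.2, and the sgRNA is positions 11014-11109 of SEQ ID No.2.
Further, the sequence of the guide RNA set expressed by DNA sequence III is positions 10732-11186 of SEQ ID No.2.
In (A1), the antibiotic-resistance selectable marker protein is expressed from an expression cassette containing its coding gene; in that cassette, the coding gene is driven by the 35S promoter.
Further, the gene encoding the antibiotic-resistance selectable marker protein is a hygromycin-resistance selectable marker gene, whose sequence is the reverse complement of positions 18260-19285 of SEQ ID No.1.
Still further, the reverse complement of the expression cassette containing the gene encoding the antibiotic-resistance selectable marker protein is shown at positions 18046-20029 of SEQ ID No.1.
In (B2), in the expression cassette containing the mutant hygromycin-resistance selectable marker gene, the mutant gene is driven by the 35S promoter.
Further, the sequence of the expression cassette containing the mutant hygromycin-resistance selectable marker gene is shown at positions 18424-20407 of SEQ ID No.2.
In specific embodiments of the invention, the sequence of vector A is shown in SEQ ID No.1 and the sequence of vector B is shown in SEQ ID No.2.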
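In a third aspect, the invention claims use of the vector of the first or second aspect above in gene editing of a recipient plant.
In a fourth aspect, the invention claims a method for gene editing of a recipient plant.
The claimed method for gene editing of a recipient plant may comprise the following steps: inserting DNA sequence II of the first or second aspect above into the insertion site of the vector of the first or second aspect to obtain a recombinant vector; and introducing the recombinant vector into a recipient plant, thereby editing genes of the recipient plant.
In the third or fourth aspect, the recipient plant may be a monocot;
further, the monocot may be a plant of the family Poaceae;
still further, the Poaceae plant may be a plant of the genus Oryza;
more specifically, the Oryza plant may be rice.
In the third or fourth aspect, the gene editing (precise editing) may be multiplex gene editing;
further, multiplex gene editing means simultaneously editing two or more genes, for example 2-4 genes.
In all of the above aspects, "several" is to be understood as two or more, for example 2-4.
Experiments show that the invention provides two surrogate prime editing systems for prime editing of endogenous rice genes: a single-surrogate system based on OsALSS627I, and a double-surrogate system based on HygromycinY46* and OsALSS627I. Both greatly increase editing efficiency. The OsALSS627I-based single-surrogate prime editor increases precise editing efficiency about 14-fold, and the double-surrogate system increases it by up to about 50-fold. In addition, the invention used the double-surrogate system to precisely edit several endogenous genes simultaneously. In short, the surrogate prime editors developed here for precise multiplex gene editing greatly extend the capacity of prime editing for simultaneous improvement of multiple traits in rice, and may be extended to other crops in the future.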
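Brief Description of the Drawings
Figure 1 shows the design of the pegRNAs and sgRNAs for the genes involved in the invention.
Figure 2 shows schematic maps of the vectors of the invention.
Figure 3 is a flow chart of the tissue culture procedure.
Figure 4 is a flow chart of tissue culture for the OsSPL14 gene.
Figure 5 shows sequencing chromatograms of the edited regions of the target genes.
Detailed Description
The invention is described in further detail below with reference to specific embodiments. The examples are given only to illustrate the invention, not to limit its scope. They may serve as a guide for further improvement by a person of ordinary skill in the art and do not limit the invention in any way.
Unless otherwise specified, the experimental methods in the following examples are conventional methods carried out according to techniques or conditions described in the literature of the field or according to the product instructions. Unless otherwise specified, the materials and reagents used in the following examples are commercially available.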
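Example 1. Precise multiplex gene editing in rice using surrogate prime editors
I. Materials and methods
(I) Experimental materials
The rice material used for transformation was Zhonghua 11, provided by the Institute of Crop Sciences of Chinese Academy of Agricultural Sciences.
(II) Vector construction
1. Construction and use of the PE3 base vector
(1) Construction of the PE3 base vector
The full sequence of vector pCXUN-Ubi-NLS-nCas9(H840A)-Linker(33aa)-M-MLV-RT-NLS-PolyA-E9 is shown in SEQ ID No.4. In it, the reverse complement of positions 7327-9317 is the maize enhanced promoter Ubi; the reverse complement of positions 7216-7236 is the DNA sequence encoding the SV40 NLS; the reverse complement of positions 3091-7191 is the gene encoding nCas9(H840A); the reverse complement of positions 2992-3090 is the DNA sequence encoding Linker(33aa); the reverse complement of positions 961-2991 is the gene encoding the M-MLV-RT mutant (H9Y, D200N, T306K, W313F, T330P, L603W); the reverse complement of positions 871-918 is the DNA sequence encoding the NLS; the reverse complement of positions 649-863 is the PolyA sequence; and the reverse complement of positions 14-648 is the E9 terminator.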
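A quick consistency check on these coordinates: each protein-coding feature should span a whole number of codons, and the codon counts should agree with the amino-acid ranges given earlier for SEQ ID No.3. The sketch below uses a hypothetical helper, `codon_count`, and only the coordinates quoted in the text.

```python
# Nucleotide coordinates (1-based, inclusive) quoted for SEQ ID No.4.
CODING_FEATURES = {
    "SV40 NLS": (7216, 7236),
    "nCas9(H840A)": (3091, 7191),
    "Linker(33aa)": (2992, 3090),
    "M-MLV-RT mutant": (961, 2991),
    "NLS": (871, 918),
}


def codon_count(start, end):
    """Number of codons in a 1-based inclusive nucleotide interval."""
    nucleotides = end - start + 1
    if nucleotides % 3 != 0:
        raise ValueError(f"{start}-{end} is not a whole number of codons")
    return nucleotides // 3


for name, (start, end) in CODING_FEATURES.items():
    print(f"{name}: {end - start + 1} nt = {codon_count(start, end)} codons")
```

The counts (7, 1367, 33, 677 and 16 codons) match the lengths of the corresponding amino-acid ranges in SEQ ID No.3 (1-7, 16-1382, 1383-1415, 1416-2092 and 2107-2122).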
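Using primer pair APN-Fty/APN-Rty (sequences in Table 1) and a synthesized fragment (SEQ ID No.5) as the template, PCR amplification yielded an adapter-flanked expression cassette containing the Actin promoter, polyA and the Nos terminator. The cassette was ligated, using a homologous recombinase, into vector pCXUN-Ubi-NLS-nCas9(H840A)-Linker(33aa)-M-MLV-RT-NLS-PolyA-E9 that had been double-digested with HindIII and PmeI. After sequence verification, this gave the base vector pCXUN-Ubi-NLS-nCas9(H840A)-Linker(33aa)-M-MLV-RT-NLS-PolyA-E9-Actin-polyA-Nos (designated the PE3 base vector). Insertion of the cassette destroys the original HindIII and PmeI sites.
(2) Using the PE3 base vector to construct recombinant vectors for editing target genes
The tRNA-pegRNA-tRNA-sgRNA-tRNA DNA sequence for a target gene (which differs from the hptII-targeting tRNA-pegRNA-tRNA-sgRNA-tRNA fragment described below only in its pegRNA and sgRNA sequences; the pegRNA and sgRNA designs for the different target genes are shown in Figure 1) was amplified by overlap PCR (primers in Table 1) and inserted, using a homologous recombinase, into the HindIII-digested vector pCXUN-Ubi-NLS-nCas9(H840A)-Linker(33aa)-M-MLV-RT-NLS-PolyA-E9-Actin-polyA-Nos. After sequence verification, this gave pCXUN-Ubi-NLS-nCas9(H840A)-Linker(33aa)-M-MLV-RT-NLS-PolyA-E9-Actin-tRNA-pegRNA-tRNA-sgRNA-tRNA-polyA-Nos. The vector structure is shown in Figure 2 (PE3 vector, original vector).
When there are several target genes, or several target sites in one target gene, several tandem tRNA-pegRNA-tRNA-sgRNA-tRNA DNA sequences may be inserted. In the final vector, if two or more tRNA-pegRNA-tRNA-sgRNA-tRNA fragments are directly joined, the tRNA at each junction is shared.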
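2. Construction and use of the HygromycinY46* single-surrogate base vector
(1) Construction of the HygromycinY46* single-surrogate base vector
First, the hptII gene on the PE3 base vector was mutated. PCR with primer pairs Hpt-KPN-Fty/Hpt-mutant-R1 and Hpt-mutant-F2/Hpt-Rsr-Rty changed Tyr46 (TAT) to the stop codon TAG. The two first-round PCR products were mixed at a molar ratio of 1:1 and used as the template for overlap PCR with primers Hpt-KPN-Fty/Hpt-Rsr-Rty, yielding the Hpt-kpn-Rsr fragment. This fragment was ligated, using a homologous recombinase, into the PE3 base vector double-digested with KpnI and RsrII. After sequence verification, this gave pCXUN-Ubi-NLS-nCas9(H840A)-Linker(33aa)-M-MLV-RT-NLS-PolyA-E9-Actin-polyA-Nos-hptII-mutant (designated the mhptII vector). Rice calli acquire hygromycin resistance during selection only after the hptII gene has been precisely repaired (TAG to TAC; to rule out contamination, the repaired codon TAC is a synonymous variant of the wild-type Tyr codon TAT). Primer sequences are listed in Table 1.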
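The codon logic of this surrogate marker can be made explicit with a small check. The sketch below uses a deliberately minimal codon lookup that covers only the three codons named in the text (under the standard genetic code); `point_differences` is a hypothetical helper.

```python
# Minimal codon lookup covering only the codons discussed in the text
# (standard genetic code).
CODON_TABLE = {"TAT": "Tyr", "TAC": "Tyr", "TAG": "Stop"}


def point_differences(a, b):
    """Count positions at which two equal-length codons differ."""
    return sum(x != y for x, y in zip(a, b))


wild_type, mutant, repaired = "TAT", "TAG", "TAC"

# Inactivating the marker and repairing it each take one base change.
print(point_differences(wild_type, mutant))   # 1
print(point_differences(mutant, repaired))    # 1

# The repaired codon encodes Tyr again but differs from the wild-type
# codon, so a repaired allele can be told apart from wild-type hptII.
print(CODON_TABLE[repaired] == CODON_TABLE[wild_type])  # True
print(repaired != wild_type)                             # True
```

Because the repaired allele carries TAC rather than TAT, a hygromycin-resistant callus with TAC at codon 46 is evidence of a prime editing event rather than carry-over of the wild-type gene.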
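The tRNA-pegRNA-tRNA-sgRNA-tRNA fragment was obtained by overlap PCR. In the first round, primer pairs PE3-F1/hpt-R1, hpt-F2/hpt-R2, hpt-F3/hpt-R3 and hpt-F4/PE3-Rty were used with a synthesized fragment (SEQ ID No.6) as the template, yielding adapter-flanked tRNA, pegRNA, tRNA and sgRNA+tRNA fragments, respectively. In the second round, the tRNA, pegRNA and tRNA fragments from the first round were mixed at a molar ratio of 1:1:1 and used as the template with primers PE3-Fty/hpt-R3d, yielding an adapter-flanked tRNA-pegRNA-tRNA fragment. Finally, the sgRNA+tRNA fragment from the first round and the tRNA-pegRNA-tRNA fragment from the second round were mixed at a molar ratio of 1:1 and used as the template for overlap PCR with primer pair PE3-Fty/PE3-Rty, yielding the adapter-flanked tRNA-pegRNA-tRNA-sgRNA-tRNA fragment targeting the hptII gene. Primer sequences are listed in Table 1.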
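At the sequence level, the outcome of overlap PCR is that fragments sharing identical ends are fused at the shared overlap. The sketch below models only that string-level result. It says nothing about primer design, annealing temperature or polymerase behaviour, and the function names and toy sequences are hypothetical, not taken from the patent.

```python
def join_by_overlap(left, right, min_overlap=8):
    """Join two fragments whose ends share an identical overlap.

    Finds the longest suffix of `left` that equals a prefix of `right`
    (at least `min_overlap` bases) and merges the two at that overlap.
    """
    max_len = min(len(left), len(right))
    for size in range(max_len, min_overlap - 1, -1):
        if left[-size:] == right[:size]:
            return left + right[size:]
    raise ValueError("fragments share no sufficient end overlap")


def assemble(fragments, min_overlap=8):
    """Fuse an ordered list of fragments pairwise, left to right."""
    product = fragments[0]
    for fragment in fragments[1:]:
        product = join_by_overlap(product, fragment, min_overlap)
    return product


# Toy fragments (hypothetical sequences) with 10-base adapters at the joins.
adapter_1 = "CATGCATGAA"
adapter_2 = "TTGGCCAATC"
frag_a = "ACGTACGTAC" + adapter_1
frag_b = adapter_1 + "GGGTTTAAAC" + adapter_2
frag_c = adapter_2 + "TCTAGAGGCC"

product = assemble([frag_a, frag_b, frag_c])
print(product)
print(len(product))  # 50: each shared adapter appears once in the product
```

The staged scheme in the text (three fragments fused first, then the fourth added) corresponds to calling `assemble` on the first three fragments and then joining the result with the last one.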
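The adapter-flanked tRNA-pegRNA-tRNA-sgRNA-tRNA DNA sequence targeting the hptII gene (positions 10707-11186 of SEQ ID No.2 + AGCTTGTCGAGGCTGAGTAAGG) was inserted, using a homologous recombinase, into HindIII-digested pCXUN-Ubi-NLS-nCas9(H840A)-Linker(33aa)-M-MLV-RT-NLS-PolyA-E9-Actin-polyA-Nos-hptII-mutant. After sequence verification, this gave pCXUN-Ubi-NLS-nCas9(H840A)-Linker(33aa)-M-MLV-RT-NLS-PolyA-E9-Actin-tRNA-pegRNA(hptII)-tRNA-sgRNA(hptII)-tRNA-polyA-Nos-hptII-mutant (designated the HygromycinY46* base vector).
(2) Using the HygromycinY46* single-surrogate base vector to construct recombinant vectors for editing target genes
The tRNA-pegRNA-tRNA-sgRNA-tRNA DNA sequence for another target gene (which differs from the hptII-targeting tRNA-pegRNA-tRNA-sgRNA-tRNA fragment only in its pegRNA and sgRNA sequences; the pegRNA and sgRNA designs for the different target genes are shown in Figure 1) was obtained by overlap PCR (primers in Table 1) and inserted, using a homologous recombinase, into the HindIII-digested HygromycinY46* single-surrogate base vector. After sequence verification, this gave pCXUN-Ubi-NLS-nCas9(H840A)-Linker(33aa)-M-MLV-RT-NLS-PolyA-E9-Actin-tRNA-pegRNA(hptII)-tRNA-sgRNA(hptII)-tRNA-pegRNA-tRNA-sgRNA-tRNA-polyA-Nos-hptII-mutant. The vector structure is shown in Figure 2 (PE3-HS vector (HPTY46* single-surrogate system)).
When there are several target genes, or several target sites in one target gene, several tandem tRNA-pegRNA-tRNA-sgRNA-tRNA DNA sequences may be inserted. In the final vector, if two or more tRNA-pegRNA-tRNA-sgRNA-tRNA fragments are directly joined, the tRNA at each junction is shared.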
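3. Construction and use of the OsALSS627I single-surrogate base vector
(1) Construction of the OsALSS627I single-surrogate base vector
The tRNA-pegRNA-tRNA-sgRNA-tRNA fragment was obtained by overlap PCR. In the first round, primer pairs PE3-F1/ALS-R1, ALS-F2/ALS-R2, ALS-F3/ALS-R3 and ALS-F4/PE3-Rty were used with a synthesized fragment (SEQ ID No.6) as the template, yielding adapter-flanked tRNA, pegRNA, tRNA and sgRNA+tRNA fragments, respectively. In the second round, the tRNA, pegRNA and tRNA fragments from the first round were mixed at a molar ratio of 1:1:1 and used as the template with primers PE3-Fty/ALS-R3d, yielding an adapter-flanked tRNA-pegRNA-tRNA fragment. Finally, the sgRNA+tRNA fragment from the first round and the tRNA-pegRNA-tRNA fragment from the second round were mixed at a molar ratio of 1:1 and used as the template for overlap PCR with primer pair PE3-Fty/PE3-Rty, yielding the adapter-flanked tRNA-pegRNA-tRNA-sgRNA-tRNA fragment targeting the OsALS gene. Primer sequences are listed in Table 1.
The adapter-flanked tRNA-pegRNA-tRNA-sgRNA-tRNA DNA sequence targeting the endogenous rice OsALS gene (positions 10707-11201 of SEQ ID No.1, corresponding to GTGCGGAGCTTTTTTGTAGGTAGAC + positions 11110-11579 of SEQ ID No.2), obtained by overlap PCR, was inserted, using a homologous recombinase, into the HindIII-digested PE3 base vector. After sequence verification, this gave the OsALSS627I single-surrogate base vector.
The full sequence of the OsALSS627I single-surrogate base vector is shown in SEQ ID No.1. In it, positions 14-648 are the reverse complement of the E9 terminator; positions 649-863 are the reverse complement of polyA; positions 871-918 are the reverse complement of the NLS; positions 961-2991 are the reverse complement of the gene encoding the M-MLV-RT mutant (H9Y, D200N, T306K, W313F, T330P, L603W); positions 2992-3090 are the reverse complement of the DNA sequence encoding the linker (33 aa); positions 3091-7191 are the reverse complement of the gene encoding nCas9(H840A); positions 7216-7236 are the reverse complement of the SV40 NLS; positions 7327-9317 are the reverse complement of the maize enhanced promoter Ubi; positions 9331-10731 are the Actin promoter; positions 10732-10808 are a tRNA sequence; positions 10809-10929 are the pegRNA(OsALS) sequence; positions 10930-11006 are a tRNA sequence; positions 11007-11102 are the sgRNA(OsALS) sequence; positions 11103-11179 are a tRNA sequence; positions 11185-11399 are the reverse complement of polyA; positions 11400-11652 are the Nos terminator; positions 18046-18220 are the reverse complement of the CaMV polyA signal; positions 18260-19285 are the reverse complement of the hptII gene; and positions 19352-20029 are the reverse complement of the 35S promoter.
(2) Using the OsALSS627I single-surrogate base vector to construct recombinant vectors for editing target genes
The tRNA-pegRNA-tRNA-sgRNA-tRNA DNA sequence for another target gene (which differs from the OsALS-targeting tRNA-pegRNA-tRNA-sgRNA-tRNA fragment only in its pegRNA and sgRNA sequences; the pegRNA and sgRNA designs for the different target genes are shown in Figure 1) was obtained by overlap PCR (primers in Table 1), inserted, using a homologous recombinase, into the HindIII-digested OsALSS627I single-surrogate base vector, and then verified by sequencing. The resulting vector structure is shown in Figure 2 (PE3-AS vector (OsALSS627I single-surrogate system)).
When there are several target genes, or several target sites in one target gene, several tandem tRNA-pegRNA-tRNA-sgRNA-tRNA DNA sequences may be inserted. In the final vector, if two or more tRNA-pegRNA-tRNA-sgRNA-tRNA fragments are directly joined, the tRNA at each junction is shared.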
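4. Construction and use of the HygromycinY46* and OsALSS627I double-surrogate base vector
(1) Construction of the HygromycinY46* and OsALSS627I double-surrogate base vector
The adapter-flanked pegRNA-tRNA-sgRNA-tRNA DNA sequence targeting the endogenous rice OsALS gene (positions 10787-11201 of SEQ ID No.1, corresponding to positions 11165-11579 of SEQ ID No.2) was PCR-amplified with primer pair ALS-HS-Fty/PE3-Rty (primers in Table 1), using the adapter-flanked OsALS tRNA-pegRNA-tRNA-sgRNA-tRNA DNA sequence (positions 10707-11201 of SEQ ID No.1, corresponding to GTGCGGAGCTTTTTTGTAGGTAGAC + positions 11110-11579 of SEQ ID No.2) as the template. The product was inserted, using a homologous recombinase, into the HindIII-digested HygromycinY46* single-surrogate base vector. After sequence verification, this gave the HygromycinY46* and OsALSS627I double-surrogate base vector.
The full sequence of the HygromycinY46* and OsALSS627I double-surrogate base vector is shown in SEQ ID No.2. In it, positions 14-648 are the reverse complement of the E9 terminator; positions 649-863 are the reverse complement of polyA; positions 871-918 are the reverse complement of the DNA sequence encoding the NLS; positions 961-2991 are the reverse complement of the gene encoding the M-MLV-RT mutant (H9Y, D200N, T306K, W313F, T330P, L603W); positions 2992-3090 are the reverse complement of the DNA sequence encoding the linker (33 aa); positions 3091-7191 are the reverse complement of the gene encoding nCas9(H840A); positions 7216-7236 are the reverse complement of the SV40 NLS; positions 7327-9317 are the reverse complement of the maize enhanced promoter Ubi; positions 9331-10731 are the Actin promoter; positions 10732-10808 are a tRNA sequence; positions 10809-10936 are the pegRNA(hptII) sequence; positions 10937-11013 are a tRNA sequence; positions 11014-11109 are the sgRNA(hptII) sequence; positions 11110-11186 are a tRNA sequence; positions 11187-11307 are the pegRNA(OsALS) sequence; positions 11308-11384 are a tRNA sequence; positions 11385-11480 are the sgRNA(OsALS) sequence; positions 11481-11557 are a tRNA sequence; positions 11563-11777 are polyA; positions 11778-12030 are the Nos terminator; positions 18424-18598 are the reverse complement of the CaMV polyA signal; positions 18638-19663 are the reverse complement of the mhptII coding gene (in which Tyr46 (TAT) is changed to the stop codon TAG); and positions 19730-20407 are the reverse complement of the 35S promoter.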
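The coordinates given for SEQ ID No.1 and SEQ ID No.2 can be cross-checked: every feature downstream of the first tRNA should be shifted by the length of the extra pegRNA(hptII)-tRNA-sgRNA(hptII)-tRNA block present only in the double-surrogate vector. The sketch below uses only coordinates quoted in the text; the helper names are hypothetical.

```python
# Feature coordinates (1-based, inclusive) quoted in the text.
SEQ1 = {  # OsALSS627I single-surrogate base vector (SEQ ID No.1)
    "pegRNA(OsALS)": (10809, 10929),
    "sgRNA(OsALS)": (11007, 11102),
    "Nos terminator": (11400, 11652),
    "hptII": (18260, 19285),
    "35S promoter": (19352, 20029),
}
SEQ2 = {  # double-surrogate base vector (SEQ ID No.2)
    "pegRNA(OsALS)": (11187, 11307),
    "sgRNA(OsALS)": (11385, 11480),
    "Nos terminator": (11778, 12030),
    "hptII": (18638, 19663),  # the mhptII mutant in SEQ ID No.2
    "35S promoter": (19730, 20407),
}
# Block present only in SEQ ID No.2: pegRNA(hptII)-tRNA-sgRNA(hptII)-tRNA.
HPTII_BLOCK = (10809, 11186)


def span(interval):
    """Length of a 1-based inclusive interval."""
    return interval[1] - interval[0] + 1


def offsets(a, b):
    """Shift of each feature's start coordinate between two vectors."""
    return {name: b[name][0] - a[name][0] for name in a}


print(offsets(SEQ1, SEQ2))
print(span(HPTII_BLOCK))
```

Every offset equals 378, the length of the inserted hptII block (128 + 77 + 96 + 77 nucleotides), and each feature keeps the same length in both vectors.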
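(2) Using the HygromycinY46* and OsALSS627I double-surrogate base vector to construct recombinant vectors for editing target genes
The tRNA-pegRNA-tRNA-sgRNA-tRNA fragment for another target gene was amplified by overlap PCR and inserted, using a homologous recombinase, into the HindIII-digested HygromycinY46* and OsALSS627I double-surrogate base vector. The resulting vector structure is shown in Figure 2 (PE3-DS vector (double-surrogate system)).
When there are several target genes, or several target sites in one target gene, several tandem tRNA-pegRNA-tRNA-sgRNA-tRNA DNA sequences may be inserted. In the final vector, if two or more tRNA-pegRNA-tRNA-sgRNA-tRNA fragments are directly joined, the tRNA at each junction is shared.
(III) Generation of transgenic rice
1. Plump Zhonghua 11 rice seeds were selected, dehusked, sterilized, disinfected and rinsed, then placed evenly on sterilized N6 solid medium (Phytotech, C167) containing 3.5 mg/L 2,4-D and cultured in the dark at 28 °C for 30-40 d to induce callus.
2. The calli from step 1 were given osmotic treatment for 4-6 h on N6 medium containing 0.3 M mannitol and 0.3 M sorbitol. The plasmid to be introduced was then delivered into the rice calli by particle bombardment with 0.6 μm gold particles at a pressure of 900 psi. After bombardment, the calli were kept on N6 medium containing 0.3 M mannitol and 0.3 M sorbitol, first in a 32 °C incubator for 6 h and then at 30 °C for 10 h, and were then transferred to N6 selection medium (N6 solid medium containing 0.5 mg/L 2,4-D and 50 mg/L hygromycin) and cultured continuously in the dark at 30 °C for 2 weeks.
3. Well-growing, light-yellow resistant calli were transferred with sterile forceps to second-round selection medium. The PE3 original vector group and the PE3-HS vector group were moved to N6 solid medium containing 0.5 mg/L 2,4-D and 50 mg/L hygromycin; for the PE3-AS and PE3-DS vector groups the medium was additionally supplemented with 0.65 μmol/L bispyribac-sodium (BS). Culture was continued in the dark at 30 °C for 2-3 weeks.
4. After step 3, vigorously growing calli were transferred to MS regeneration medium (Phytotech, M524) containing 0.02 mg/L NAA and 2 mg/L kinetin and cultured under continuous light at 28 °C.
5. After step 4, when the differentiated seedlings reached 2 to 5 cm, they were transferred to MS solid medium (Phytotech, M519) and cultured in the light at 28 °C for 2 to 4 weeks, then moved to rooting medium and grown in a greenhouse (28-30 °C, 16 h light/8 h dark) to give T0 plants. The T0 plants were genotyped, and plants with the desired genotype were transplanted to soil and seeds were harvested from individual plants (Figure 3).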
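(IV) Genotyping of transgenic rice
Genomic DNA was extracted from the plants to be tested and used as the template for PCR amplification of the target region; plant genotypes were determined by Sanger sequencing. The primers used are listed in Table 1.
Table 1. Primers used in the invention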
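III. Results and analysis
The invention proposes the hypothesis that enriching for precise editing events at hptII or OsALS in the surrogate prime editors further enriches for cells with active prime editing. Precisely correcting the stop codon (*) of hptII in the prime editing vector back to Tyr46 restores hygromycin resistance to rice calli, and this serves as the HygromycinY46*-based single-surrogate prime editing system (PE3-HS). In addition, because the hptII mutant (mhptII) is a foreign gene, the invention also adds a surrogate system based on the endogenous allele OsALSS627I (PE3-AS) (Figure 2); OsALS is both an endogenous gene and a surrogate for enriching precise editing events at other endogenous target genes. Mutation of serine 627 of OsALS to isoleucine (Ser-to-Ile) confers resistance to the herbicide bispyribac-sodium (BS), so BS can be used as a selective agent during callus induction and selection. The invention further reasons that when the HygromycinY46*- and OsALSS627I-based surrogate prime editors are used together as a double-surrogate prime editor (PE3-DS), active cells are enriched more strongly than with a single surrogate, so the double-surrogate prime editor enriches precise editing events at endogenous genes more effectively.
To test this hypothesis, several endogenous genes were chosen as prime editing targets (Figure 1), including OsSPL14 (also known as OsIPA1), OsDHDPS, OsNR2 and OsEPSPS. The detailed design of the pegRNA and sgRNA for each endogenous gene is shown in Figure 1. To achieve multiplex prime editing, a tRNA strategy was used to produce a tandem set of pegRNAs and sgRNAs simultaneously.
The invention tested the efficiency of the three surrogate systems for prime editing of the endogenous genes OsSPL14, OsDHDPS and OsNR2. As shown in Figure 4, for prime editing of OsSPL14, calli transformed with the mhptII vector (CK) yielded almost no regenerated plants after 4 weeks of culture on induction medium supplemented with hygromycin alone or with both hygromycin and BS. With the surrogate systems, by contrast, the resistant calli obtained after 4 weeks on induction medium containing hygromycin, or hygromycin and BS, all differentiated into regenerated plants (Figure 4), and the false-positive rate among the plants was low. This indicates that the surrogate systems effectively enrich precise editing events at endogenous genes. Further study showed that selection with PE3-AS and PE3-DS was more effective than with PE3-HS.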
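For the OsSPL14 gene, the PE3 original vector and the PE3-HS, PE3-AS and PE3-DS vectors were each used to transform 200 rice calli by particle bombardment, and all plants regenerated from one callus were treated as a single event. The PE3 original vector, PE3-HS, PE3-AS and PE3-DS yielded 101, 95, 20 and 73 independent clusters of regenerated plants, respectively (Table 2). Genomic DNA of these plants was PCR-amplified with primers OsSPL14-F/OsSPL14-R (Table 1), the PCR products were Sanger-sequenced, and the genotype of each independent line was determined from the sequencing results. Transformation with the PE3 original vector, PE3-HS, PE3-AS and PE3-DS yielded 1, 2, 2 and 9 clusters of precisely edited plants, respectively, corresponding to precise editing efficiencies of 1.0%, 2.1%, 10.0% and 12.3%. Compared with the PE3 original vector, the surrogate systems PE3-HS, PE3-AS and PE3-DS thus increased prime editing efficiency about 2-fold, 10-fold and 12-fold, respectively, with the double-surrogate system being the most efficient (Table 2). Notably, of the 9 clusters of precisely edited plants obtained with the double-surrogate system, 3 were heterozygous and 6 were biallelic (Figure 5), with one allele precisely edited and the other partially precisely edited. In these 9 clusters, the great majority of edited plants were homozygous or heterozygous for the hptII and OsALSS627I edits, and only a few plants were escapes, which may be because no selection was applied in the regeneration medium (data not shown).
For the OsDHDPS gene, the PE3 original vector and the PE3-HS, PE3-AS and PE3-DS vectors were each used to transform 200 rice calli by particle bombardment, and all plants regenerated from one callus were treated as a single event. The PE3 original vector, PE3-HS, PE3-AS and PE3-DS yielded 94, 80, 35 and 24 independent clusters of regenerated plants, respectively (Table 2). Genomic DNA of these plants was PCR-amplified with primers OsDHDPS-F/OsDHDPS-R (Table 1), the PCR products were Sanger-sequenced, and the genotype of each independent line was determined from the sequencing results. Transformation with the PE3 original vector, PE3-HS, PE3-AS and PE3-DS yielded 0, 1, 5 and 13 clusters of precisely edited plants, respectively, corresponding to precise editing efficiencies of 0%, 1.3%, 14.3% and 54.2% (Table 2). Compared with the PE3 original vector, the surrogate systems PE3-HS, PE3-AS and PE3-DS increased prime editing efficiency about 2-fold, about 14-fold and about 50-fold, respectively. Again, the double-surrogate system was the most efficient (Table 2).
For OsNR2, the PE3 original vector and the PE3-HS, PE3-AS and PE3-DS vectors were each used to transform 200 rice calli by particle bombardment, and all plants regenerated from one callus were treated as a single event. The PE3 original vector, PE3-HS, PE3-AS and PE3-DS yielded 102, 48, 41 and 31 independent clusters of regenerated plants, respectively (Table 2). Genomic DNA of these plants was PCR-amplified with primers OsNR2-F/OsNR2-R (Table 1), the PCR products were Sanger-sequenced, and the genotype of each independent line was determined from the sequencing results. No precisely edited line was obtained with the PE3 original vector, whereas PE3-HS, PE3-AS and PE3-DS each yielded 1 cluster of precisely edited plants, corresponding to precise editing efficiencies of 2.1%, 2.4% and 3.2%, respectively (Table 2). The relatively low editing efficiency at OsNR2 suggests that prime editing efficiency also depends on intrinsic properties of the target endogenous gene. These results show that the surrogate systems achieve precise editing at target genes (such as OsDHDPS and OsNR2) that the original vector failed to edit. Among the surrogate prime editing systems, the double-surrogate system is the most effective and the least time- and labor-intensive route to precise editing.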
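The efficiencies quoted above are simple ratios of precisely edited clusters to regenerated clusters, and can be recomputed from the counts in the text. The sketch below is illustrative; `efficiency` and `fold_over_baseline` are hypothetical helper names. Where the PE3 baseline is zero a ratio is undefined, so the sketch returns `None` rather than a number.

```python
# (precisely edited clusters, regenerated clusters) per vector, from the text.
RESULTS = {
    "OsSPL14": {"PE3": (1, 101), "PE3-HS": (2, 95), "PE3-AS": (2, 20), "PE3-DS": (9, 73)},
    "OsDHDPS": {"PE3": (0, 94), "PE3-HS": (1, 80), "PE3-AS": (5, 35), "PE3-DS": (13, 24)},
    "OsNR2": {"PE3": (0, 102), "PE3-HS": (1, 48), "PE3-AS": (1, 41), "PE3-DS": (1, 31)},
}


def efficiency(edited, total):
    """Precise editing efficiency in percent."""
    return 100.0 * edited / total


def fold_over_baseline(gene, vector):
    """Efficiency relative to the PE3 original vector, or None if undefined."""
    baseline = efficiency(*RESULTS[gene]["PE3"])
    if baseline == 0:
        return None  # no edited plants with PE3: the ratio cannot be computed
    return efficiency(*RESULTS[gene][vector]) / baseline


for gene, vectors in RESULTS.items():
    for vector, (edited, total) in vectors.items():
        print(f"{gene} {vector}: {edited}/{total} = {efficiency(edited, total):.2f}%")

print(fold_over_baseline("OsSPL14", "PE3-DS"))
```

For OsSPL14 the recomputed ratios are close to the roughly 2-fold, 10-fold and 12-fold improvements stated in the text.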
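Table 2. Precise editing efficiency at different target genes with different surrogate prime editors
The invention used the double-surrogate system for multiplex prime editing in rice, for example in the combinations OsSPL14 with OsALS, OsDHDPS with OsALS, and OsNR2 with OsALS; the efficiencies at which both endogenous genes were precisely edited simultaneously were 9.6% (7/73), 45.8% (11/24) and 3.2% (1/31), respectively (Table 3). The double-surrogate system was also used for simultaneous precise editing of the following combinations of endogenous genes: OsSPL14 with OsDHDPS, OsDHDPS with OsSPL14, OsSPL14 with OsEPSPS, OsSPL14 with OsVQ25, OsSPL14 with OsCYP71A1, and OsDHDPS with OsVQ25. Given precise editing of OsALS, the efficiencies of simultaneous multiplex editing (precise editing/gene knockout) in these combinations were 4.7% (3/64), 4.8% (3/62), 3.8% (2/52), 1.8% (1/56), 6.7% (1/15) and 4.0% (1/25), respectively (Table 3). The invention also tested the ability of the PE3-AS system to precisely edit the two endogenous genes OsSPL14 and OsDHDPS simultaneously; the multiplex precise editing efficiency was 2.6% (3/115), lower than with the double-surrogate system.
Table 3. Efficiency of multiplex precise editing mediated by the surrogate editing systems
Note: in the table, Ho: homozygous; He: heterozygous; Bi: biallelic; d12: 12-bp deletion; i44: 44-bp insertion.
To assess the specificity of the surrogate prime editing systems in rice, potential off-target sites for each target (predicted with CRISPR-GE, http://skl.scau.edu.cn/) (Table 4) were analyzed using the corresponding primers in Table 1. No off-target editing was detected at the predicted off-target sites.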
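Potential off-target sites are typically ranked by how many bases differ from the on-target protospacer. The sketch below shows that comparison for a pair of made-up 20-nt sequences. It is not the CRISPR-GE algorithm, the sequences are hypothetical rather than taken from Table 4, and `mismatch_positions` is an illustrative name.

```python
def mismatch_positions(on_target, candidate):
    """Return 1-based positions at which two equal-length sites differ."""
    if len(on_target) != len(candidate):
        raise ValueError("sites must have the same length")
    pairs = zip(on_target.upper(), candidate.upper())
    return [i + 1 for i, (a, b) in enumerate(pairs) if a != b]


# Hypothetical 20-nt protospacer and candidate off-target site.
on_target = "GACCTGAAGTTCATCTGCAC"
candidate = "GACCTGAAGTTCATGTGCAT"

positions = mismatch_positions(on_target, candidate)
print(positions)       # [15, 20]
print(len(positions))  # 2 mismatches
```

Sites with few mismatches are the ones usually amplified and sequenced to confirm the absence of off-target edits.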
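Table 4. Analysis of potential off-target sites
Note: in the column "potential off-target site sequence", bold type marks mismatched bases and underlining marks the PAM.
In summary, using the OsALSS627I-based single-surrogate prime editor (PE3-AS) and the HygromycinY46*- and OsALSS627I-based double-surrogate prime editor (PE3-DS), the invention significantly increases prime editing efficiency in rice. On this basis it establishes an efficient multiplex prime editing system for plants, achieves efficient precise multiplex gene editing in rice, and further extends the use of prime editing in crop breeding by pyramiding multiple genes.
The invention has been described in detail above. Those skilled in the art can practice the invention over a wide range of equivalent parameters, concentrations and conditions without departing from its spirit and scope and without undue experimentation. Although specific embodiments are given, it should be understood that the invention can be further modified. In short, in accordance with the principles of the invention, this application is intended to cover any variations, uses or adaptations of the invention, including departures from the scope disclosed in this application that are made using conventional techniques known in the art. Some of the essential features may be applied within the scope of the appended claims.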
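<110> Institute of Crop Sciences of Chinese Academy of Agricultural Sciences
<120> Method for precise multiplex gene editing in rice using surrogate prime editors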
<130> GNCLN221511
<160> 6
<170> PatentIn version 3.5
<210> 1
<211> 20343
<212> DNA
<213> Artificial sequence
<400> 1
gaattcgggt accgttgtca atcaattggc aagtcataaa atgcattaaa aaatattttc 60
atactcaact acaaatccat gagtataact ataattataa agcaatgatt agaatctgac 120
aaggattctg gaaaattaca taaaggaaag ttcataaatg tctaaaacac aagaggacat 180
acttgtattc agtaacattt gcagcttttc taggtctgaa aatatatttg ttgcctagtg 240
aataagcata atggtacaac tacaagtgtt ttactcctca tattaacttc ggtcattaga 300
ggccacgatt tgacacattt ttactcaaaa caaaatgttt gcatatctct tataatttca 360
aattcaacac acaacaaata agagaaaaaa caaataatat taatttgaga atgaacaaaa 420
ggaccatatc attcattaac tcttctccat ccatttccat ttcacagttc gatagcgaaa 480
accgaataaa aaacacagta aattacaagc acaacaaatg gtacaagaaa aacagttttc 540
ccaatgccat aatactcaaa ctcagtagga ttctggtgtg tgcgcaatga aactgatgca 600
ttgaacttga cgaacgttgt cgaaaccgat gatacgaacg aaagctctgg ggaaattcga 660
gctctttaaa tttttttttt tttttttttt ttttgttaaa tttttttttt tttttttttt 720
ttttgttaaa tttttttttt tttttttttt tttttgttaa cttgatgtcc gaaaacaaaa 780
ctgaaagaac acagtaaatt acaagcagaa caatggcttt tccaatgcca taatactcaa 840
agttaacctt actcagcctc gacgagctca cttcttcttc ttcgcctgcc ccgccttctt 900
cgtcgccgct ggccgcttct cgaactcgct gccgtcagct gtgcgcttgc tgccgccgga 960
gggggagctg ttctcgatga ggagggtgga ggtgtctggg gtctcggtga tagcggcctt 1020
gcgggcggct tggtcggcca ttctgttgcc gcgggcctca gcggagtggc ccttctggtg 1080
gcccgggcaa tggatgatgg agaggcgctt cgggaggaag agggccttga ggagcgcgag 1140
gatctcgtcc ttgttcttga tctccttgcc ctcggaggtg agccagcccc tcctcctgta 1200
gatctcgccg tggatgtggg cggtggcgaa ggcgtagcgg gagtcggtgt acacattgag 1260
cttcttgccc tcggccatct tcagggcttg ggtgagggca atcagctcgg cgcgttgagc 1320
tgaggtgccg gctgggaggg ccttggccca aatcacctcg gtctcggttg tcacagcggc 1380
gccggccttc ctttggccct cctggaggag ggaggagccg tcggtgtacc aggtgtggtc 1440
ggcgtctggg aggggctggt cggtgagatc cggtctggtg ccgtgggcct cggcgaggat 1500
gtcgaggcag ttgtgttgga ggccctcctc agggagtggg aggagggtgg ccgggttgag 1560
ggcgacaaca gggccgaatt ggacgcggtc ggtgtcgagg aggagggcct ggtagtgggt 1620
catgcgggcg ttggagagcc atctgtcagg aggctgctta acgagggcct cgacggcgtg 1680
tggggcgagg atgaccagcg gttggcccat ggtgagcttg ccggcgtcct tggtgagcac 1740
ggcaatggcc gccaccattc tgaggcatgg tggccagccg gcggccacgg gatcgagctt 1800
cttggagaga taagccaccg ggcgcctcca tgggcccagc ttctgggtga ggacgccctt 1860
ggcgtagcct tgcttctcgt ccacgaagag ctcgaagggc ttggtgaggt ctgggaggcc 1920
gagggccggg gcggtgagca gggcctgctt gatctcctgg taggccttct gttggtctgg 1980
gccccagttg aagagtgtgc ctggcttggt gagtggatag agcggcgcgg ccatctccgc 2040
gaagcccgga atgaagaggc ggcagaagcc ggccttgccg aggaactcgc ggagttgtct 2100
cggtgtcttc ggggttggct ggcccatcac tgtctccttt ctggcctcgg tgagccagcg 2160
ttggccctcc ttgaggaggt agccgaggta cttgacttgc ttctggcaga tctgggcctt 2220
cttagcggag gcgcggtagc cgaggttgcc gagggtttgg aggagggctc ttgtgccctg 2280
ttggcagtcg agctctgagg tcgcggcgag gaggaggtcg tccacgtact ggaggaggat 2340
gaggtccggg tgctgaatcc tgaagtcggc gaggtcgcgg tggagggcct cgttgaagag 2400
ggtgggggag ttcttgaagc cttgcgggag gcgggtccac gtgagttggc cggagatgcc 2460
catctcaggg tcgcgccact cgaaggcgaa gagtggctgg gaggttgggt ggaggcggag 2520
gcagaagaag gcgtccttga ggtcgagcac ggtgtaccat tggtgggatg gcgggaggcc 2580
ggagaggagg ttgtatgggt ttgggacggt tgggtggatg tcctcaacgc gcttattgac 2640
ctcgcggagg tcctgcacgg ggcggtagtc gttggtgcct ggcttcttca cggggaggag 2700
tggggtgttc cacggggatt ggcagggcac gaggatgcct tgatcgagga ggcgctggat 2760
gtgtggctta atgccgagtc tggcctcctg ggacattggg tattgcttaa tggagaccgg 2820
tgtggaggtc gccttgagcg ggatgatgag tggagcctgg cgcacagcga ggcccatgcc 2880
gcctgtctcg gcccaggcct gagggaagtc ggagagccag gtagagccga gagacacgtc 2940
aggctccttg gatgtctcgt ggagtctgta ctcgtcctcg atgttgaggg tagaagagcc 3000
gccagaggag ccgccggagg actcaggggt ggcggactcg gaggtgccag gggtctcaga 3060
gccggatgag ccgccggaag agccgccgga gtcgcccccg agctgagaca ggtcgatgcg 3120
cgtctcgtag aggccggtaa tcgactggtg gatgagggtc gcgtccagga cctccttagt 3180
gcttgtgtac ctcttgcgat cgatagttgt gtcgaagtac ttgaaagcag caggggcgcc 3240
gaggttcgtc agggtgaaga gatgaatgat attctcagcc tgctccctga ttggcttgtc 3300
gcggtgcttg ttgtacgcgg agaggacctt atccagattc gcgtcggcca ggatcacgcg 3360
cttggagaac tcggaaatct gctcaatgat ctcgtcgagg taatgcttgt gctgctcgac 3420
gaacagctgc ttctgctcgt tgtcctcggg gctgcccttg agcttctcgt agtgggaggc 3480
caggtagagg aagttcacat acttggacgg cagagccagc tcgttcccct tctgcagctc 3540
gccagcggaa gccagcatcc gcttcctgcc gttctccagc tcgaagagtg agtacttggg 3600
gagcttaatg atcaggtcct tcttcacctc cttgtagccc ttcgcctcca ggaaatcgat 3660
cgggttcttc tcgaagctgg agcgctccat aatcgtgatc cccagcagct ccttcacgct 3720
cttgagcttc ttggacttgc ccttctcaac cttcgccaca accaggaccg agtaggccac 3780
agtggggctg tcgaacccgc cgtacttctt cggatcccag tccttcttgc gggcgatgag 3840
cttgtcgctg ttccgcttag gcagaattga ctccttagag aacccgccag tctggacctc 3900
tgtcttcttg acgatattca cttgtggcat ggagagaacc ttcctgacgg tcgcgaaatc 3960
cctgcccttg tcccacacga tctcccccgt ctcgccgttc gtctcgatga gggggcgctt 4020
ccggatctcg ccattggcca gagtgatctc tgtcttgaag aaattcataa tgttagagta 4080
gaagaagtac ttggcggtag ccttgccaat ctcctgctcc gacttggcga tcatcttcct 4140
cacatcgtaa accttgtagt ccccgtacac gaactcgctc tcgagctttg ggtacttctt 4200
gatcagagct gtgccgacca ccgcgttcag gtacgcgtca tgggcatggt ggtaattgtt 4260
gatctcccga accttgtaga actggaaatc cttcctgaag tcggagacga gctttgactt 4320
cagggtgatg accttcacct cgcggatcag cttgtcattc tcatcgtact tagtgttcat 4380
ccgtgagtcg agaatctgcg caacgtgctt agtgatctgc cgtgtctcga ccagctgcct 4440
cttgatgaag cccgccttgt ccagctcaga gagcccgccc ctctcagcct ttgtgaggtt 4500
atcgaacttc cgctgcgtga tcagcttggc attcaggagc tggcgccagt agttcttcat 4560
cttcttaacg acctcctctg aaggaacatt atcagacttg ccccggttct tgtccgacct 4620
ggtgaggacc ttgttgtcaa tggagtcatc cttcaggaat gactgtggaa cgatagcatc 4680
gacgtcgtaa tcgctgagcc tgttaatatc cagctcctgg tccacataca tatcgcggcc 4740
attctggagg tagtacaggt agagcttctc attctgcagc tgcgtgttct ccaccgggtg 4800
ctccttgagg atctgggacc ccagctcctt aatgccctcc tcgatcctct tcatcctctc 4860
gcgtgagttc ttctggccct tctgcgtggt ctgattctcc cgggccatct caatgacgat 4920
gttctcaggc ttgtgcctgc ccatgacctt caccagctcg tccacaacct tcacggtctg 4980
cagaatcccc ttcttgatag ctggcgagcc agcgaggttc gcgatatgct cgtgcagcga 5040
gtccccctgg ccgctcacct gagccttctg gatatcctcc ttgaatgtga ggctgtcatc 5100
gtgaatcagc tgcatgaaat tgcggttcgc gaagccatcg ctcttcagga agtcgaggat 5160
cgtcttcccg gactgcttgt cccgaatgcc gttgatgagc ttcctgctca gcctccccca 5220
gccggtgtac ctcctcctct tgagctgctt catgaccttg tcatcgaaga gatgggcgta 5280
agtcttcagg cgctcctcga tcatctcccg gtcctcgaac agagtgagtg tcagcacaat 5340
gtcctcgagg atatcctcat tctcctcgtt gtccaggaag tccttatcct taatgatctt 5400
caggagatcg tggtaggtcc ccagggaggc gttgaagcgg tcctcaacgc cagagatctc 5460
gaccgaatcg aagcactcaa tcttcttgaa gtagtcctcc ttgagctgct taaccgtgac 5520
cttccggttg gtcttgaaca ggaggtccac gatggccttc ttctgctccc cagacaggaa 5580
agccggcttc ctcatgccct cggtcacata cttcacctta gtcagctcgt tgtagactgt 5640
gaagtactcg tacaggagcg agtgcttagg gagcaccttc tcatttggca ggttcttgtc 5700
gaaattcgtc atcctctcga tgaacgactg agcgctagcg cccttgtcga ccacctcctc 5760
gaagttccac ggcgtgatcg tctcctctga cttgcgggtc atccaagcga agcgggagtt 5820
gcccctagcg agtgggccga cgtagtacgg gatcctgaaa gtcagaatct tctcgatctt 5880
ctcgcggtta tccttgagga aagggtagaa gtcctcctgc ctcctcagga tagcgtgcag 5940
ctccccgaga tgaatctggt gtgggatgct gccgttatcg aatgtccgct gcttcctcag 6000
gaggtcctcg cgattgagct tcaccagcag ctcctccgtg ccgtccatct tctccagaat 6060
cggcttgatg aacttgtaga actcctcctg agaggccccg ccgtcaatgt acccagcgta 6120
gccgttcttc gactgatcga agaagatctc cttgtacttc tcggggagct gctgcctgac 6180
cagcgccttc aggagggtca gatcctgatg gtgctcgtcg tagcgcttga tcatggaggc 6240
tgagagcgga gccttcgtaa tctcggtgtt caccctgaga atatcagaca ggaggatggc 6300
gtccgacaga ttcttggcag cgaggaacag gtccgcgtac tgatcgccga tctgggccag 6360
gaggttatcc aggtcatcgt cgtatgtgtc cttggagagc tgcagcttgg cgtcctcagc 6420
gagatcgaaa ttcgacttga agttgggcgt gagccccagg ctgagcgcaa tgagattccc 6480
gaacaggccg ttcttcttct cgcccggcag ctgggcgatc aggttctcga ggcgccgaga 6540
cttcgagagc ctagcggaca ggatagcctt cgcgtcgacg cctgacgcat taatggggtt 6600
ctcctcgaag agctggttgt acgtctgcac gagctggatg aacagcttgt caacatcgct 6660
attgtccggg ttgagatccc cctcgatcag gaaatggccc ctgaacttaa tcatgtgggc 6720
cagagcgagg tagatcaggc ggaggtccgc cttatctgtg gagtccacga gcttcttccg 6780
cagatggtag atcgtagggt acttctcgtg gtaggcaacc tcgtcgacaa tgttgccgaa 6840
gattggatgc cgctcgtgct tcttatcctc ctccacgagg aatgactcct ccagcctgtg 6900
gaagaaagaa tcgtcaacct tcgccatctc gttggagaaa atctcctgca ggtagcagat 6960
gcgattcttc ctgcgcgtgt accgcctgcg ggcggtgcgc ttgagccgcg tagcctcagc 7020
cgtctcgccg ctgtcgaaca ggagagcgcc aatgagattc ttcttgatgg aatgccgatc 7080
ggtgttgccc aggaccttga acttctttga gggcaccttg tactcgtcgg tgatcacggc 7140
ccagccaaca gagttagtcc caatatcgag gccgatcgag tacttcttgt cagcagctgg 7200
caccccgtgg atgccaacct tcctcttctt cttcggagcc atcttgtcat catcatcctt 7260
gtaatcaatg tcgtggtcct tgtaatcccc gtcgtggtcc ttgtaatcca tctagtaggc 7320
ctagggctgc agaagtaaca ccaaacaaca gggtgagcat cgacaaaaga aacagtacca 7380
agcaaataaa tagcgtatga aggcagggct aaaaaaatcc acatatagct gctgcatatg 7440
ccatcatcca agtatatcaa gatcaaaata attataaaac atacttgttt attataatag 7500
ataggtactc aaggttagag catatgaata gatgctgcat atgccatcat gtatatgcat 7560
cagtaaaacc cacatcaaca tgtataccta tcctagatcg atatttccat ccatcttaaa 7620
ctcgtaacta tgaagatgta tgacacacac atacagttcc aaaattaata aatacaccag 7680
gtagtttgaa acagtattct actccgatct agaacgaatg aacgaccgcc caaccacacc 7740
acatcatcac aaccaagcga acaaaaagca tctctgtata tgcatcagta aaacccgcat 7800
caacatgtat acctatccta gatcgatatt tccatccatc atcttcaatt cgtaactatg 7860
aatatgtatg gcacacacat acagatccaa aattaataaa tccaccaggt agtttgaaac 7920
agaattctac tccgatctag aacgaccgcc caaccagacc acatcatcac aaccaagaca 7980
aaaaaaagca tgaaaagatg acccgacaaa caagtgcacg gcatatattg aaataaagga 8040
aaagggcaaa ccaaacccta tgcaacgaaa caaaaaaaat catgaaatcg atcccgtctg 8100
cggaacggct agagccatcc caggattccc caaagagaaa cactggcaag ttagcaatca 8160
gaacgtgtct gacgtacagg tcgcatccgt gtacgaacgc tagcagcacg gatctaacac 8220
aaacacggat ctaacacaaa catgaacaga agtagaacta ccgggcccta accatggacc 8280
ggaacgccga tctagagaag gtagagaggg gggggggggg aggacgagcg gcgtaccttg 8340
aagcggaggt gccgacgggt ggatttgggg gagatctggt tgtgtgtgtg tgcgctccga 8400
acaacacgag gttggggaaa gagggtgtgg agggggtgtc tatttattac ggcgggcgag 8460
gaagggaaag cgaaggagcg gtgggaaagg aatcccccgt agctgccgtg ccgtgagagg 8520
aggaggaggc cgcctgccgt gccggctcac gtctgccgct ccgccacgca tttctggatg 8580
ccgacagcgg agcaagtcca acggtggagc ggaactctcg agaggggtcc agaggcagcg 8640
acagagatgc cgtgccgtct gcttcgcttg gcccgacgcg acgctgctgg ttcgctggtt 8700
ggtgtccgtt agactcgtcg acggcgttta acaggctggc attatctact cgaaacaaga 8760
aaaatgtttc cttagttttt ttaatttctt aaagggtatt tgtttaattt ttagtcactt 8820
tattttattc tattttatat ctaaattatt aaataaaaaa actaaaatag agttttagtt 8880
ttcttaattt agaggctaaa atagaataaa atagatgtac taaaaaaatt agtctataaa 8940
aaccattaac cctaaaccct aaatggatgt actaataaaa tggatgaagt attatatagg 9000
tgaagctatt tgcaaaaaaa aaggagaaca catgcacact aaaaagataa aactgtagag 9060
tcctgttgtc aaaatactca attgtccttt agaccatgtc taactgttca tttatatgat 9120
tctctaaaac actgatatta ttgtagtact atagattata ttattcgtag agtaaagttt 9180
aaatatatgt ataaagatag ataaactgca cttcaaacaa gtgtgacaaa aaaaatatgt 9240
ggtaattttt tataacttag acatgcaatg ctcattatct ctagagaggg gcacgaccgg 9300
gtcacgctgc actgcaggaa ttcgatatca tcgaggtcat tcatatgctt gagaagagag 9360
tcgggatagt ccaaaataaa acaaaggtaa gattacctgg tcaaaagtga aaacatcagt 9420
taaaaggtgg tataaagtaa aatatcggta ataaaaggtg gcccaaagtg aaatttactc 9480
ttttctacta ttataaaaat tgaggatgtt tttgtcggta ctttgatacg tcatttttgt 9540
atgaattggt ttttaagttt attcgctttt ggaaatgcat atctgtattt gagtcgggtt 9600
ttaagttcgt ttgcttttgt aaatacagag ggatttgtat aagaaatatc tttaaaaaaa 9660
cccatatgct aatttgacat aatttttgag aaaaatatat attcaggcga attctcacaa 9720
tgaacaataa taagattaaa atagctttcc cccgttgcag cgcatgggta ttttttctag 9780
taaaaataaa agataaactt agactcaaaa catttacaaa aacaacccct aaagttccta 9840
aagcccaaag tgctatccac gatccatagc aagcccagcc caacccaacc caacccaacc 9900
caccccagtc cagccaactg gacaatagtc tccacacccc cccactatca ccgtgagttg 9960
tccgcacgca ccgcacgtct cgcagccaaa aaaaaaaaaa gaaagaaaaa aaagaaaaag 10020
aaaaaacagc aggtgggtcc gggtcgtggg ggccggaaac gcgaggagga tcgcgagcca 10080
gcgacgaggc cggccctccc tccgcttcca aagaaacgcc ccccatcgcc actatataca 10140
tacccccccc tctcctccca tccccccaac cctaccacca ccaccaccac cacctccacc 10200
tcctcccccc tcgctgccgg acgacgagct cctcccccct ccccctccgc cgccgccgcg 10260
ccggtaacca ccccgcccct ctcctctttc tttctccgtt ttttttttcc gtctcggtct 10320
cgatctttgg ccttggtagt ttgggtgggc gagaggcggc ttcgtgcgcg cccagatcgg 10380
tgcgcgggag gggcgggatc tcgcggctgg ggctctcgcc ggcgtggatc cggcccggat 10440
ctcgcgggga atggggctct cggatgtaga tctgcgatcc gccgttgttg ggggagatga 10500
tggggggttt aaaatttccg ccatgctaaa caagatcagg aagaggggaa aagggcacta 10560
tggtttatat ttttatatat ttctgctgct tcgtcaggct tagatgtgct agatctttct 10620
ttcttctttt tgtgggtaga atttgaatcc ctcagcattg ttcatcggta gtttttcttt 10680
tcatgatttg tgacaaatgc agcctcgtgc ggagcttttt tgtaggtaga caacaaagca 10740
ccagtggtct agtggtagaa tagtaccctg ccacggtaca gacccgggtt cgattcccgg 10800
ctggtgcatc cttgaatgcg cccccactgt tttagagcta gaaatagcaa gttaaaataa 10860
ggctagtccg ttatcaactt gaaaaagtgg caccgagtcg gtgcatgatc ccgatcgggg 10920
gcgcattcaa acaaagcacc agtggtctag tggtagaata gtaccctgcc acggtacaga 10980
cccgggttcg attcccggct ggtgcagact ccagggccat acttgtgttt tagagctaga 11040
aatagcaagt taaaataagg ctagtccgtt atcaacttga aaaagtggca ccgagtcggt 11100
gcaacaaagc accagtggtc tagtggtaga atagtaccct gccacggtac agacccgggt 11160
tcgattcccg gctggtgcaa gcttgtcgag gctgagtaag gttaactttg agtattatgg 11220
cattggaaaa gccattgttc tgcttgtaat ttactgtgtt ctttcagttt tgttttcgga 11280
catcaagtta acaaaaaaaa aaaaaaaaaa aaaaaaattt aacaaaaaaa aaaaaaaaaa 11340
aaaaaaattt aacaaaaaaa aaaaaaaaaa aaaaaaattt aaagagctcg aatttccccg 11400
atcgttcaaa catttggcaa taaagtttct taagattgaa tcctgttgcc ggtcttgcga 11460
tgattatcat ataatttctg ttgaattacg ttaagcatgt aataattaac atgtaatgca 11520
tgacgttatt tatgagatgg gtttttatga ttagagtccc gcaattatac atttaatacg 11580
cgatagaaaa caaaatatag cgcgcaaact aggataaatt atcgcgcgcg gtgtcatcta 11640
tgttactaga tcaaactatc agtgtttgac aggatatatt ggcgggtaaa cctaagagaa 11700
aagagcgttt attagaataa cggatattta aaagggcgtg aaaaggttta tccgttcgtc 11760
catttgtatg tgcatgccaa ccacagggtt cccctcggga tcaaagtact ttgatccaac 11820
ccctccgctg ctatagtgca gtcggcttct gacgttcagt gcagccgtct tctgaaaacg 11880
acatgtcgca caagtcctaa gttacgcgac aggctgccgc cctgcccttt tcctggcgtt 11940
ttcttgtcgc gtgttttagt cgcataaagt agaatacttg cgactagaac cggagacatt 12000
acgccatgaa caagagcgcc gccgctggcc tgctgggcta tgcccgcgtc agcaccgacg 12060
accaggactt gaccaaccaa cgggccgaac tgcacgcggc cggctgcacc aagctgtttt 12120
ccgagaagat caccggcacc aggcgcgacc gcccggagct ggccaggatg cttgaccacc 12180
tagccctggc gacgttgtga cagtgaccag gctagaccgc ctggcccgca gcacccgcga 12240
cctactggac attgccgagc gcatccagga ggccggcgcg ggcctgcgta gcctggcaga 12300
gccgtgggcc gacaccacca cgccggccgg ccgcatggtg ttgaccgtgt tcgccggcat 12360
tgccgagttc gagcgttccc taatcatcga ccgcacccgg agcgggcgcg aggccgccaa 12420
ggcccgaggc gtgaagtttg gcccccgccc taccctcacc ccggcacaga tcgcgcacgc 12480
ccgcgagctg atcgaccagg aaggccgcac cgtgaaagag gcggctgcac tgcttggcgt 12540
gcatcgctcg accctgtacc gcgcacttga gcgcagcgag gaagtgacgc ccaccgaggc 12600
caggcggcgc ggtgccttcc gtgaggacgc attgaccgag gccgacgccc tggcggccgc 12660
cgagaatgaa cgccaagagg aacaagcatg aaaccgcacc aggacggcca ggacgaaccg 12720
tttttcatta ccgaagagat cgaggcggag atgatcgcgg ccgggtacgt gttcgagccg 12780
cccgcgcacg tctcaaccgt gcggctgcat gaaatcctgg ccggtttgtc tgatgccaag 12840
ctggcggcct ggccggccag cttggccgct gaagaaaccg agcgccgccg tctaaaaagg 12900
tgatgtgtat ttgagtaaaa cagcttgcgt catgcggtcg ctgcgtatat gatgcgatga 12960
gtaaataaac aaatacgcaa ggggaacgca tgaaggttat cgctgtactt aaccagaaag 13020
gcgggtcagg caagacgacc atcgcaaccc atctagcccg cgccctgcaa ctcgccgggg 13080
ccgatgttct gttagtcgat tccgatcccc agggcagtgc ccgcgattgg gcggccgtgc 13140
gggaagatca accgctaacc gttgtcggca tcgaccgccc gacgattgac cgcgacgtga 13200
aggccatcgg ccggcgcgac ttcgtagtga tcgacggagc gccccaggcg gcggacttgg 13260
ctgtgtccgc gatcaaggca gccgacttcg tgctgattcc ggtgcagcca agcccttacg 13320
acatatgggc aaccgccgac ctggtggagc tggttaagca gcgcattgag gtcacggatg 13380
gaaggctaca agcggccttt gtcgtgtcgc gggcgatcaa aggcacgcgc atcggcggtg 13440
aggttgccga ggcgctggcc gggtacgagc tgcccattct tgagtcccgt atcacgcagc 13500
gcgtgagcta cccaggcact gccgccgccg gcacaaccgt tcttgaatca gaacccgagg 13560
gcgacgctgc ccgcgaggtc caggcgctgg ccgctgaaat taaatcaaaa ctcatttgag 13620
ttaatgaggt aaagagaaaa tgagcaaaag cacaaacacg ctaagtgccg gccgtccgag 13680
cgcacgcagc agcaaggctg caacgttggc cagcctggca gacacgccag ccatgaagcg 13740
ggtcaacttt cagttgccgg cggaggatca caccaagctg aagatgtacg cggtacgcca 13800
aggcaagacc attaccgagc tgctatctga atacatcgcg cagctaccag agtaaatgag 13860
caaatgaata aatgagtaga tgaattttag cggctaaagg aggcggcatg gaaaatcaag 13920
aacaaccagg caccgacgcc gtggaatgcc ccatgtgtgg aggaacgggc ggttggccag 13980
gcgtaagcgg ctgggttgtc tgccggccct gcaatggcac tggaaccccc aagcccgagg 14040
aatcggcgtg acggtcgcaa accatccggc ccggtacaaa tcggcgcggc gctgggtgat 14100
gacctggtgg agaagttgaa ggccgcgcag gccgcccagc ggcaacgcat cgaggcagaa 14160
gcacgccccg gtgaatcgtg gcaagcggcc gctgatcgaa tccgcaaaga atcccggcaa 14220
ccgccggcag ccggtgcgcc gtcgattagg aagccgccca agggcgacga gcaaccagat 14280
tttttcgttc cgatgctcta tgacgtgggc acccgcgata gtcgcagcat catggacgtg 14340
gccgttttcc gtctgtcgaa gcgtgaccga cgagctggcg aggtgatccg ctacgagctt 14400
ccagacgggc acgtagaggt ttccgcaggg ccggccggca tggccagtgt gtgggattac 14460
gacctggtac tgatggcggt ttcccatcta accgaatcca tgaaccgata ccgggaaggg 14520
aagggagaca agcccggccg cgtgttccgt ccacacgttg cggacgtact caagttctgc 14580
cggcgagccg atggcggaaa gcagaaagac gacctggtag aaacctgcat tcggttaaac 14640
accacgcacg ttgccatgca gcgtacgaag aaggccaaga acggccgcct ggtgacggta 14700
tccgagggtg aagccttgat tagccgctac aagatcgtaa agagcgaaac cgggcggccg 14760
gagtacatcg agatcgagct agctgattgg atgtaccgcg agatcacaga aggcaagaac 14820
ccggacgtgc tgacggttca ccccgattac tttttgatcg atcccggcat cggccgtttt 14880
ctctaccgcc tggcacgccg cgccgcaggc aaggcagaag ccagatggtt gttcaagacg 14940
atctacgaac gcagtggcag cgccggagag ttcaagaagt tctgtttcac cgtgcgcaag 15000
ctgatcgggt caaatgacct gccggagtac gatttgaagg aggaggcggg gcaggctggc 15060
ccgatcctag tcatgcgcta ccgcaacctg atcgagggcg aagcatccgc cggttcctaa 15120
tgtacggagc agatgctagg gcaaattgcc ctagcagggg aaaaaggtcg aaaaggtctc 15180
tttcctgtgg atagcacgta cattgggaac ccaaagccgt acattgggaa ccggaacccg 15240
tacattggga acccaaagcc gtacattggg aaccggtcac acatgtaagt gactgatata 15300
aaagagaaaa aaggcgattt ttccgcctaa aactctttaa aacttattaa aactcttaaa 15360
acccgcctgg cctgtgcata actgtctggc cagcgcacag ccgaagagct gcaaaaagcg 15420
cctacccttc ggtcgctgcg ctccctacgc cccgccgctt cgcgtcggcc tatcgcggcc 15480
gctggccgct caaaaatggc tggcctacgg ccaggcaatc taccagggcg cggacaagcc 15540
gcgccgtcgc cactcgaccg ccggcgccca catcaaggca ccctgcctcg cgcgtttcgg 15600
tgatgacggt gaaaacctct gacacatgca gctcccggag acggtcacag cttgtctgta 15660
agcggatgcc gggagcagac aagcccgtca gggcgcgtca gcgggtgttg gcgggtgtcg 15720
gggcgcagcc atgacccagt cacgtagcga tagcggagtg tatactggct taactatgcg 15780
gcatcagagc agattgtact gagagtgcac catatgcggt gtgaaatacc gcacagatgc 15840
gtaaggagaa aataccgcat caggcgctct tccgcttcct cgctcactga ctcgctgcgc 15900
tcggtcgttc ggctgcggcg agcggtatca gctcactcaa aggcggtaat acggttatcc 15960
acagaatcag gggataacgc aggaaagaac atgtgagcaa aaggccagca aaaggccagg 16020
aaccgtaaaa aggccgcgtt gctggcgttt ttccataggc tccgcccccc tgacgagcat 16080
cacaaaaatc gacgctcaag tcagaggtgg cgaaacccga caggactata aagataccag 16140
gcgtttcccc ctggaagctc cctcgtgcgc tctcctgttc cgaccctgcc gcttaccgga 16200
tacctgtccg cctttctccc ttcgggaagc gtggcgcttt ctcatagctc acgctgtagg 16260
tatctcagtt cggtgtaggt cgttcgctcc aagctgggct gtgtgcacga accccccgtt 16320
cagcccgacc gctgcgcctt atccggtaac tatcgtcttg agtccaaccc ggtaagacac 16380
gacttatcgc cactggcagc agccactggt aacaggatta gcagagcgag gtatgtaggc 16440
ggtgctacag agttcttgaa gtggtggcct aactacggct acactagaag gacagtattt 16500
ggtatctgcg ctctgctgaa gccagttacc ttcggaaaaa gagttggtag ctcttgatcc 16560
ggcaaacaaa ccaccgctgg tagcggtggt ttttttgttt gcaagcagca gattacgcgc 16620
agaaaaaaag gatctcaaga agatcctttg atcttttcta cggggtctga cgctcagtgg 16680
aacgaaaact cacgttaagg gattttggtc atgcattcta ggtactaaaa caattcatcc 16740
agtaaaatat aatattttat tttctcccaa tcaggcttga tccccagtaa gtcaaaaaat 16800
agctcgacat actgttcttc cccgatatcc tccctgatcg accggacgca gaaggcaatg 16860
tcataccact tgtccgccct gccgcttctc ccaagatcaa taaagccact tactttgcca 16920
tctttcacaa agatgttgct gtctcccagg tcgccgtggg aaaagacaag ttcctcttcg 16980
ggcttttccg tctttaaaaa atcatacagc tcgcgcggat ctttaaatgg agtgtcttct 17040
tcccagtttt cgcaatccac atcggccaga tcgttattca gtaagtaatc caattcggct 17100
aagcggctgt ctaagctatt cgtataggga caatccgata tgtcgatgga gtgaaagagc 17160
ctgatgcact ccgcatacag ctcgataatc ttttcagggc tttgttcatc ttcatactct 17220
tccgagcaaa ggacgccatc ggcctcactc atgagcagat tgctccagcc atcatgccgt 17280
tcaaagtgca ggacctttgg aacaggcagc tttccttcca gccatagcat catgtccttt 17340
tcccgttcaa catcataggt ggtcccttta taccggctgt ccgtcatttt taaatatagg 17400
ttttcatttt ctcccaccag cttatatacc ttagcaggag acattccttc cgtatctttt 17460
acgcagcggt atttttcgat cagttttttc aattccggtg atattctcat tttagccatt 17520
tattatttcc ttcctctttt ctacagtatt taaagatacc ccaagaagct aattataaca 17580
agacgaactc caattcactg ttccttgcat tctaaaacct taaataccag aaaacagctt 17640
tttcaaagtt gttttcaaag ttggcgtata acatagtatc gacggagccg attttgaaac 17700
cgcggtgatc acaggcagca acgctctgtc atcgttacaa tcaacatgct accctccgcg 17760
agatcatccg tgtttcaaac ccggcagctt agttgccgtt cttccgaata gcatcggtaa 17820
catgagcaaa gtctgccgcc ttacaacggc tctcccgctg acgccgtccc ggactgatgg 17880
gctgcctgta tcgagtggtg attttgtgcc gagctgccgg tcggggagct gttggctggc 17940
tggtggcagg atatattgtg gtgtaaacaa attgacgctt agacaactta ataacacatt 18000
gcggacgttt ttaatgtact gaattaacgc cgaattaatt cgggggatct ggattttagt 18060
actggatttt ggttttagga attagaaatt ttattgatag aagtatttta caaatacaaa 18120
tacatactaa gggtttctta tatgctcaac acatgagcga aaccctatag gaaccctaat 18180
tcccttatct gggaactact cacacattat tatggagaaa ctcgagcttg tcgatcgaca 18240
gatccggtcg gcatctactc tatttctttg ccctcggacg agtgctgggg cgtcggtttc 18300
cactatcggc gagtacttct acacagccat cggtccagac ggccgcgctt ctgcgggcga 18360
tttgtgtacg cccgacagtc ccggctccgg atcggacgat tgcgtcgcat cgaccctgcg 18420
cccaagctgc atcatcgaaa ttgccgtcaa ccaagctctg atagagttgg tcaagaccaa 18480
tgcggagcat atacgcccgg agtcgtggcg atcctgcaag ctccggatgc ctccgctcga 18540
agtagcgcgt ctgctgctcc atacaagcca accacggcct ccagaagaag atgttggcga 18600
cctcgtattg ggaatccccg aacatcgcct cgctccagtc aatgaccgct gttatgcggc 18660
cattgtccgt caggacattg ttggagccga aatccgcgtg cacgaggtgc cggacttcgg 18720
ggcagtcctc ggcccaaagc atcagctcat cgagagcctg cgcgacggac gcactgacgg 18780
tgtcgtccat cacagtttgc cagtgataca catggggatc agcaatcgcg catatgaaat 18840
cacgccatgt agtgtattga ccgattcctt gcggtccgaa tgggccgaac ccgctcgtct 18900
ggctaagatc ggccgcagcg atcgcatcca tagcctccgc gaccggttgt agaacagcgg 18960
gcagttcggt ttcaggcagg tcttgcaacg tgacaccctg tgcacggcgg gagatgcaat 19020
aggtcaggct ctcgctaaac tccccaatgt caagcacttc cggaatcggg agcgcggccg 19080
atgcaaagtg ccgataaaca taacgatctt tgtagaaacc atcggcgcag ctatttaccc 19140
gcaggacata tccacgccct cctacatcga agctgaaagc acgagattct tcgccctccg 19200
agagctgcat caggtcggag acgctgtcga acttttcgat cagaaacttc tcgacagacg 19260
tcgcggtgag ttcaggcttt ttcatatctc attgcccccc ggatctgcga aagctcgaga 19320
gagatagatt tgtagagaga gactggtgat ttcagcgtgt cctctccaaa tgaaatgaac 19380
ttccttatat agaggaaggt cttgcgaagg atagtgggat tgtgcgtcat cccttacgtc 19440
agtggagata tcacatcaat ccacttgctt tgaagacgtg gttggaacgt cttctttttc 19500
cacgatgctc ctcgtgggtg ggggtccatc tttgggacca ctgtcggcag aggcatcttg 19560
aacgatagcc tttcctttat cgcaatgatg gcatttgtag gtgccacctt ccttttctac 19620
tgtccttttg atgaagtgac agatagctgg gcaatggaat ccgaggaggt ttcccgatat 19680
taccctttgt tgaaaagtct caatagccct ttggtcttct gagactgtat ctttgatatt 19740
cttggagtag acgagagtgt cgtgctccac catgttatca catcaatcca cttgctttga 19800
agacgtggtt ggaacgtctt ctttttccac gatgctcctc gtgggtgggg gtccatcttt 19860
gggaccactg tcggcagagg catcttgaac gatagccttt cctttatcgc aatgatggca 19920
tttgtaggtg ccaccttcct tttctactgt ccttttgatg aagtgacaga tagctgggca 19980
atggaatccg aggaggtttc ccgatattac cctttgttga aaagtctcaa tagccctttg 20040
gtcttctgag actgtatctt tgatattctt ggagtagacg agagtgtcgt gctccaccat 20100
gttggcaagc tgctctagcc aatacgcaaa ccgcctctcc ccgcgcgttg gccgattcat 20160
taatgcagct ggcacgacag gtttcccgac tggaaagcgg gcagtgagcg caacgcaatt 20220
aatgtgagtt agctcactca ttaggcaccc caggctttac actttatgct tccggctcgt 20280
atgttgtgtg gaattgtgag cggataacaa tttcacacag gaaacagcta tgaccatgat 20340
tac 20343
<210> 2
<211> 20721
<212> DNA
<213> Artificial sequence
<400> 2
gaattcgggt accgttgtca atcaattggc aagtcataaa atgcattaaa aaatattttc 60
atactcaact acaaatccat gagtataact ataattataa agcaatgatt agaatctgac 120
aaggattctg gaaaattaca taaaggaaag ttcataaatg tctaaaacac aagaggacat 180
acttgtattc agtaacattt gcagcttttc taggtctgaa aatatatttg ttgcctagtg 240
aataagcata atggtacaac tacaagtgtt ttactcctca tattaacttc ggtcattaga 300
ggccacgatt tgacacattt ttactcaaaa caaaatgttt gcatatctct tataatttca 360
aattcaacac acaacaaata agagaaaaaa caaataatat taatttgaga atgaacaaaa 420
ggaccatatc attcattaac tcttctccat ccatttccat ttcacagttc gatagcgaaa 480
accgaataaa aaacacagta aattacaagc acaacaaatg gtacaagaaa aacagttttc 540
ccaatgccat aatactcaaa ctcagtagga ttctggtgtg tgcgcaatga aactgatgca 600
ttgaacttga cgaacgttgt cgaaaccgat gatacgaacg aaagctctgg ggaaattcga 660
gctctttaaa tttttttttt tttttttttt ttttgttaaa tttttttttt tttttttttt 720
ttttgttaaa tttttttttt tttttttttt tttttgttaa cttgatgtcc gaaaacaaaa 780
ctgaaagaac acagtaaatt acaagcagaa caatggcttt tccaatgcca taatactcaa 840
agttaacctt actcagcctc gacgagctca cttcttcttc ttcgcctgcc ccgccttctt 900
cgtcgccgct ggccgcttct cgaactcgct gccgtcagct gtgcgcttgc tgccgccgga 960
gggggagctg ttctcgatga ggagggtgga ggtgtctggg gtctcggtga tagcggcctt 1020
gcgggcggct tggtcggcca ttctgttgcc gcgggcctca gcggagtggc ccttctggtg 1080
gcccgggcaa tggatgatgg agaggcgctt cgggaggaag agggccttga ggagcgcgag 1140
gatctcgtcc ttgttcttga tctccttgcc ctcggaggtg agccagcccc tcctcctgta 1200
gatctcgccg tggatgtggg cggtggcgaa ggcgtagcgg gagtcggtgt acacattgag 1260
cttcttgccc tcggccatct tcagggcttg ggtgagggca atcagctcgg cgcgttgagc 1320
tgaggtgccg gctgggaggg ccttggccca aatcacctcg gtctcggttg tcacagcggc 1380
gccggccttc ctttggccct cctggaggag ggaggagccg tcggtgtacc aggtgtggtc 1440
ggcgtctggg aggggctggt cggtgagatc cggtctggtg ccgtgggcct cggcgaggat 1500
gtcgaggcag ttgtgttgga ggccctcctc agggagtggg aggagggtgg ccgggttgag 1560
ggcgacaaca gggccgaatt ggacgcggtc ggtgtcgagg aggagggcct ggtagtgggt 1620
catgcgggcg ttggagagcc atctgtcagg aggctgctta acgagggcct cgacggcgtg 1680
tggggcgagg atgaccagcg gttggcccat ggtgagcttg ccggcgtcct tggtgagcac 1740
ggcaatggcc gccaccattc tgaggcatgg tggccagccg gcggccacgg gatcgagctt 1800
cttggagaga taagccaccg ggcgcctcca tgggcccagc ttctgggtga ggacgccctt 1860
ggcgtagcct tgcttctcgt ccacgaagag ctcgaagggc ttggtgaggt ctgggaggcc 1920
gagggccggg gcggtgagca gggcctgctt gatctcctgg taggccttct gttggtctgg 1980
gccccagttg aagagtgtgc ctggcttggt gagtggatag agcggcgcgg ccatctccgc 2040
gaagcccgga atgaagaggc ggcagaagcc ggccttgccg aggaactcgc ggagttgtct 2100
cggtgtcttc ggggttggct ggcccatcac tgtctccttt ctggcctcgg tgagccagcg 2160
ttggccctcc ttgaggaggt agccgaggta cttgacttgc ttctggcaga tctgggcctt 2220
cttagcggag gcgcggtagc cgaggttgcc gagggtttgg aggagggctc ttgtgccctg 2280
ttggcagtcg agctctgagg tcgcggcgag gaggaggtcg tccacgtact ggaggaggat 2340
gaggtccggg tgctgaatcc tgaagtcggc gaggtcgcgg tggagggcct cgttgaagag 2400
ggtgggggag ttcttgaagc cttgcgggag gcgggtccac gtgagttggc cggagatgcc 2460
catctcaggg tcgcgccact cgaaggcgaa gagtggctgg gaggttgggt ggaggcggag 2520
gcagaagaag gcgtccttga ggtcgagcac ggtgtaccat tggtgggatg gcgggaggcc 2580
ggagaggagg ttgtatgggt ttgggacggt tgggtggatg tcctcaacgc gcttattgac 2640
ctcgcggagg tcctgcacgg ggcggtagtc gttggtgcct ggcttcttca cggggaggag 2700
tggggtgttc cacggggatt ggcagggcac gaggatgcct tgatcgagga ggcgctggat 2760
gtgtggctta atgccgagtc tggcctcctg ggacattggg tattgcttaa tggagaccgg 2820
tgtggaggtc gccttgagcg ggatgatgag tggagcctgg cgcacagcga ggcccatgcc 2880
gcctgtctcg gcccaggcct gagggaagtc ggagagccag gtagagccga gagacacgtc 2940
aggctccttg gatgtctcgt ggagtctgta ctcgtcctcg atgttgaggg tagaagagcc 3000
gccagaggag ccgccggagg actcaggggt ggcggactcg gaggtgccag gggtctcaga 3060
gccggatgag ccgccggaag agccgccgga gtcgcccccg agctgagaca ggtcgatgcg 3120
cgtctcgtag aggccggtaa tcgactggtg gatgagggtc gcgtccagga cctccttagt 3180
gcttgtgtac ctcttgcgat cgatagttgt gtcgaagtac ttgaaagcag caggggcgcc 3240
gaggttcgtc agggtgaaga gatgaatgat attctcagcc tgctccctga ttggcttgtc 3300
gcggtgcttg ttgtacgcgg agaggacctt atccagattc gcgtcggcca ggatcacgcg 3360
cttggagaac tcggaaatct gctcaatgat ctcgtcgagg taatgcttgt gctgctcgac 3420
gaacagctgc ttctgctcgt tgtcctcggg gctgcccttg agcttctcgt agtgggaggc 3480
caggtagagg aagttcacat acttggacgg cagagccagc tcgttcccct tctgcagctc 3540
gccagcggaa gccagcatcc gcttcctgcc gttctccagc tcgaagagtg agtacttggg 3600
gagcttaatg atcaggtcct tcttcacctc cttgtagccc ttcgcctcca ggaaatcgat 3660
cgggttcttc tcgaagctgg agcgctccat aatcgtgatc cccagcagct ccttcacgct 3720
cttgagcttc ttggacttgc ccttctcaac cttcgccaca accaggaccg agtaggccac 3780
agtggggctg tcgaacccgc cgtacttctt cggatcccag tccttcttgc gggcgatgag 3840
cttgtcgctg ttccgcttag gcagaattga ctccttagag aacccgccag tctggacctc 3900
tgtcttcttg acgatattca cttgtggcat ggagagaacc ttcctgacgg tcgcgaaatc 3960
cctgcccttg tcccacacga tctcccccgt ctcgccgttc gtctcgatga gggggcgctt 4020
ccggatctcg ccattggcca gagtgatctc tgtcttgaag aaattcataa tgttagagta 4080
gaagaagtac ttggcggtag ccttgccaat ctcctgctcc gacttggcga tcatcttcct 4140
cacatcgtaa accttgtagt ccccgtacac gaactcgctc tcgagctttg ggtacttctt 4200
gatcagagct gtgccgacca ccgcgttcag gtacgcgtca tgggcatggt ggtaattgtt 4260
gatctcccga accttgtaga actggaaatc cttcctgaag tcggagacga gctttgactt 4320
cagggtgatg accttcacct cgcggatcag cttgtcattc tcatcgtact tagtgttcat 4380
ccgtgagtcg agaatctgcg caacgtgctt agtgatctgc cgtgtctcga ccagctgcct 4440
cttgatgaag cccgccttgt ccagctcaga gagcccgccc ctctcagcct ttgtgaggtt 4500
atcgaacttc cgctgcgtga tcagcttggc attcaggagc tggcgccagt agttcttcat 4560
cttcttaacg acctcctctg aaggaacatt atcagacttg ccccggttct tgtccgacct 4620
ggtgaggacc ttgttgtcaa tggagtcatc cttcaggaat gactgtggaa cgatagcatc 4680
gacgtcgtaa tcgctgagcc tgttaatatc cagctcctgg tccacataca tatcgcggcc 4740
attctggagg tagtacaggt agagcttctc attctgcagc tgcgtgttct ccaccgggtg 4800
ctccttgagg atctgggacc ccagctcctt aatgccctcc tcgatcctct tcatcctctc 4860
gcgtgagttc ttctggccct tctgcgtggt ctgattctcc cgggccatct caatgacgat 4920
gttctcaggc ttgtgcctgc ccatgacctt caccagctcg tccacaacct tcacggtctg 4980
cagaatcccc ttcttgatag ctggcgagcc agcgaggttc gcgatatgct cgtgcagcga 5040
gtccccctgg ccgctcacct gagccttctg gatatcctcc ttgaatgtga ggctgtcatc 5100
gtgaatcagc tgcatgaaat tgcggttcgc gaagccatcg ctcttcagga agtcgaggat 5160
cgtcttcccg gactgcttgt cccgaatgcc gttgatgagc ttcctgctca gcctccccca 5220
gccggtgtac ctcctcctct tgagctgctt catgaccttg tcatcgaaga gatgggcgta 5280
agtcttcagg cgctcctcga tcatctcccg gtcctcgaac agagtgagtg tcagcacaat 5340
gtcctcgagg atatcctcat tctcctcgtt gtccaggaag tccttatcct taatgatctt 5400
caggagatcg tggtaggtcc ccagggaggc gttgaagcgg tcctcaacgc cagagatctc 5460
gaccgaatcg aagcactcaa tcttcttgaa gtagtcctcc ttgagctgct taaccgtgac 5520
cttccggttg gtcttgaaca ggaggtccac gatggccttc ttctgctccc cagacaggaa 5580
agccggcttc ctcatgccct cggtcacata cttcacctta gtcagctcgt tgtagactgt 5640
gaagtactcg tacaggagcg agtgcttagg gagcaccttc tcatttggca ggttcttgtc 5700
gaaattcgtc atcctctcga tgaacgactg agcgctagcg cccttgtcga ccacctcctc 5760
gaagttccac ggcgtgatcg tctcctctga cttgcgggtc atccaagcga agcgggagtt 5820
gcccctagcg agtgggccga cgtagtacgg gatcctgaaa gtcagaatct tctcgatctt 5880
ctcgcggtta tccttgagga aagggtagaa gtcctcctgc ctcctcagga tagcgtgcag 5940
ctccccgaga tgaatctggt gtgggatgct gccgttatcg aatgtccgct gcttcctcag 6000
gaggtcctcg cgattgagct tcaccagcag ctcctccgtg ccgtccatct tctccagaat 6060
cggcttgatg aacttgtaga actcctcctg agaggccccg ccgtcaatgt acccagcgta 6120
gccgttcttc gactgatcga agaagatctc cttgtacttc tcggggagct gctgcctgac 6180
cagcgccttc aggagggtca gatcctgatg gtgctcgtcg tagcgcttga tcatggaggc 6240
tgagagcgga gccttcgtaa tctcggtgtt caccctgaga atatcagaca ggaggatggc 6300
gtccgacaga ttcttggcag cgaggaacag gtccgcgtac tgatcgccga tctgggccag 6360
gaggttatcc aggtcatcgt cgtatgtgtc cttggagagc tgcagcttgg cgtcctcagc 6420
gagatcgaaa ttcgacttga agttgggcgt gagccccagg ctgagcgcaa tgagattccc 6480
gaacaggccg ttcttcttct cgcccggcag ctgggcgatc aggttctcga ggcgccgaga 6540
cttcgagagc ctagcggaca ggatagcctt cgcgtcgacg cctgacgcat taatggggtt 6600
ctcctcgaag agctggttgt acgtctgcac gagctggatg aacagcttgt caacatcgct 6660
attgtccggg ttgagatccc cctcgatcag gaaatggccc ctgaacttaa tcatgtgggc 6720
cagagcgagg tagatcaggc ggaggtccgc cttatctgtg gagtccacga gcttcttccg 6780
cagatggtag atcgtagggt acttctcgtg gtaggcaacc tcgtcgacaa tgttgccgaa 6840
gattggatgc cgctcgtgct tcttatcctc ctccacgagg aatgactcct ccagcctgtg 6900
gaagaaagaa tcgtcaacct tcgccatctc gttggagaaa atctcctgca ggtagcagat 6960
gcgattcttc ctgcgcgtgt accgcctgcg ggcggtgcgc ttgagccgcg tagcctcagc 7020
cgtctcgccg ctgtcgaaca ggagagcgcc aatgagattc ttcttgatgg aatgccgatc 7080
ggtgttgccc aggaccttga acttctttga gggcaccttg tactcgtcgg tgatcacggc 7140
ccagccaaca gagttagtcc caatatcgag gccgatcgag tacttcttgt cagcagctgg 7200
caccccgtgg atgccaacct tcctcttctt cttcggagcc atcttgtcat catcatcctt 7260
gtaatcaatg tcgtggtcct tgtaatcccc gtcgtggtcc ttgtaatcca tctagtaggc 7320
ctagggctgc agaagtaaca ccaaacaaca gggtgagcat cgacaaaaga aacagtacca 7380
agcaaataaa tagcgtatga aggcagggct aaaaaaatcc acatatagct gctgcatatg 7440
ccatcatcca agtatatcaa gatcaaaata attataaaac atacttgttt attataatag 7500
ataggtactc aaggttagag catatgaata gatgctgcat atgccatcat gtatatgcat 7560
cagtaaaacc cacatcaaca tgtataccta tcctagatcg atatttccat ccatcttaaa 7620
ctcgtaacta tgaagatgta tgacacacac atacagttcc aaaattaata aatacaccag 7680
gtagtttgaa acagtattct actccgatct agaacgaatg aacgaccgcc caaccacacc 7740
acatcatcac aaccaagcga acaaaaagca tctctgtata tgcatcagta aaacccgcat 7800
caacatgtat acctatccta gatcgatatt tccatccatc atcttcaatt cgtaactatg 7860
aatatgtatg gcacacacat acagatccaa aattaataaa tccaccaggt agtttgaaac 7920
agaattctac tccgatctag aacgaccgcc caaccagacc acatcatcac aaccaagaca 7980
aaaaaaagca tgaaaagatg acccgacaaa caagtgcacg gcatatattg aaataaagga 8040
aaagggcaaa ccaaacccta tgcaacgaaa caaaaaaaat catgaaatcg atcccgtctg 8100
cggaacggct agagccatcc caggattccc caaagagaaa cactggcaag ttagcaatca 8160
gaacgtgtct gacgtacagg tcgcatccgt gtacgaacgc tagcagcacg gatctaacac 8220
aaacacggat ctaacacaaa catgaacaga agtagaacta ccgggcccta accatggacc 8280
ggaacgccga tctagagaag gtagagaggg gggggggggg aggacgagcg gcgtaccttg 8340
aagcggaggt gccgacgggt ggatttgggg gagatctggt tgtgtgtgtg tgcgctccga 8400
acaacacgag gttggggaaa gagggtgtgg agggggtgtc tatttattac ggcgggcgag 8460
gaagggaaag cgaaggagcg gtgggaaagg aatcccccgt agctgccgtg ccgtgagagg 8520
aggaggaggc cgcctgccgt gccggctcac gtctgccgct ccgccacgca tttctggatg 8580
ccgacagcgg agcaagtcca acggtggagc ggaactctcg agaggggtcc agaggcagcg 8640
acagagatgc cgtgccgtct gcttcgcttg gcccgacgcg acgctgctgg ttcgctggtt 8700
ggtgtccgtt agactcgtcg acggcgttta acaggctggc attatctact cgaaacaaga 8760
aaaatgtttc cttagttttt ttaatttctt aaagggtatt tgtttaattt ttagtcactt 8820
tattttattc tattttatat ctaaattatt aaataaaaaa actaaaatag agttttagtt 8880
ttcttaattt agaggctaaa atagaataaa atagatgtac taaaaaaatt agtctataaa 8940
aaccattaac cctaaaccct aaatggatgt actaataaaa tggatgaagt attatatagg 9000
tgaagctatt tgcaaaaaaa aaggagaaca catgcacact aaaaagataa aactgtagag 9060
tcctgttgtc aaaatactca attgtccttt agaccatgtc taactgttca tttatatgat 9120
tctctaaaac actgatatta ttgtagtact atagattata ttattcgtag agtaaagttt 9180
aaatatatgt ataaagatag ataaactgca cttcaaacaa gtgtgacaaa aaaaatatgt 9240
ggtaattttt tataacttag acatgcaatg ctcattatct ctagagaggg gcacgaccgg 9300
gtcacgctgc actgcaggaa ttcgatatca tcgaggtcat tcatatgctt gagaagagag 9360
tcgggatagt ccaaaataaa acaaaggtaa gattacctgg tcaaaagtga aaacatcagt 9420
taaaaggtgg tataaagtaa aatatcggta ataaaaggtg gcccaaagtg aaatttactc 9480
ttttctacta ttataaaaat tgaggatgtt tttgtcggta ctttgatacg tcatttttgt 9540
atgaattggt ttttaagttt attcgctttt ggaaatgcat atctgtattt gagtcgggtt 9600
ttaagttcgt ttgcttttgt aaatacagag ggatttgtat aagaaatatc tttaaaaaaa 9660
cccatatgct aatttgacat aatttttgag aaaaatatat attcaggcga attctcacaa 9720
tgaacaataa taagattaaa atagctttcc cccgttgcag cgcatgggta ttttttctag 9780
taaaaataaa agataaactt agactcaaaa catttacaaa aacaacccct aaagttccta 9840
aagcccaaag tgctatccac gatccatagc aagcccagcc caacccaacc caacccaacc 9900
caccccagtc cagccaactg gacaatagtc tccacacccc cccactatca ccgtgagttg 9960
tccgcacgca ccgcacgtct cgcagccaaa aaaaaaaaaa gaaagaaaaa aaagaaaaag 10020
aaaaaacagc aggtgggtcc gggtcgtggg ggccggaaac gcgaggagga tcgcgagcca 10080
gcgacgaggc cggccctccc tccgcttcca aagaaacgcc ccccatcgcc actatataca 10140
tacccccccc tctcctccca tccccccaac cctaccacca ccaccaccac cacctccacc 10200
tcctcccccc tcgctgccgg acgacgagct cctcccccct ccccctccgc cgccgccgcg 10260
ccggtaacca ccccgcccct ctcctctttc tttctccgtt ttttttttcc gtctcggtct 10320
cgatctttgg ccttggtagt ttgggtgggc gagaggcggc ttcgtgcgcg cccagatcgg 10380
tgcgcgggag gggcgggatc tcgcggctgg ggctctcgcc ggcgtggatc cggcccggat 10440
ctcgcgggga atggggctct cggatgtaga tctgcgatcc gccgttgttg ggggagatga 10500
tggggggttt aaaatttccg ccatgctaaa caagatcagg aagaggggaa aagggcacta 10560
tggtttatat ttttatatat ttctgctgct tcgtcaggct tagatgtgct agatctttct 10620
ttcttctttt tgtgggtaga atttgaatcc ctcagcattg ttcatcggta gtttttcttt 10680
tcatgatttg tgacaaatgc agcctcgtgc ggagcttttt tgtaggtaga caacaaagca 10740
ccagtggtct agtggtagaa tagtaccctg ccacggtaca gacccgggtt cgattcccgg 10800
ctggtgcacg atgtaggagg gcgtggatgt tttagagcta gaaatagcaa gttaaaataa 10860
ggctagtccg ttatcaactt gaaaaagtgg caccgagtcg gtgctttacc cgcaggacgt 10920
atccacgccc tcctacaaca aagcaccagt ggtctagtgg tagaatagta ccctgccacg 10980
gtacagaccc gggttcgatt cccggctggt gcactccgag agctgcatca ggtgttttag 11040
agctagaaat agcaagttaa aataaggcta gtccgttatc aacttgaaaa agtggcaccg 11100
agtcggtgca acaaagcacc agtggtctag tggtagaata gtaccctgcc acggtacaga 11160
cccgggttcg attcccggct ggtgcatcct tgaatgcgcc cccactgttt tagagctaga 11220
aatagcaagt taaaataagg ctagtccgtt atcaacttga aaaagtggca ccgagtcggt 11280
gcatgatccc gatcgggggc gcattcaaac aaagcaccag tggtctagtg gtagaatagt 11340
accctgccac ggtacagacc cgggttcgat tcccggctgg tgcagactcc agggccatac 11400
ttgtgtttta gagctagaaa tagcaagtta aaataaggct agtccgttat caacttgaaa 11460
aagtggcacc gagtcggtgc aacaaagcac cagtggtcta gtggtagaat agtaccctgc 11520
cacggtacag acccgggttc gattcccggc tggtgcaagc ttgtcgaggc tgagtaaggt 11580
taactttgag tattatggca ttggaaaagc cattgttctg cttgtaattt actgtgttct 11640
ttcagttttg ttttcggaca tcaagttaac aaaaaaaaaa aaaaaaaaaa aaaaatttaa 11700
caaaaaaaaa aaaaaaaaaa aaaaatttaa caaaaaaaaa aaaaaaaaaa aaaaatttaa 11760
agagctcgaa tttccccgat cgttcaaaca tttggcaata aagtttctta agattgaatc 11820
ctgttgccgg tcttgcgatg attatcatat aatttctgtt gaattacgtt aagcatgtaa 11880
taattaacat gtaatgcatg acgttattta tgagatgggt ttttatgatt agagtcccgc 11940
aattatacat ttaatacgcg atagaaaaca aaatatagcg cgcaaactag gataaattat 12000
cgcgcgcggt gtcatctatg ttactagatc aaactatcag tgtttgacag gatatattgg 12060
cgggtaaacc taagagaaaa gagcgtttat tagaataacg gatatttaaa agggcgtgaa 12120
aaggtttatc cgttcgtcca tttgtatgtg catgccaacc acagggttcc cctcgggatc 12180
aaagtacttt gatccaaccc ctccgctgct atagtgcagt cggcttctga cgttcagtgc 12240
agccgtcttc tgaaaacgac atgtcgcaca agtcctaagt tacgcgacag gctgccgccc 12300
tgcccttttc ctggcgtttt cttgtcgcgt gttttagtcg cataaagtag aatacttgcg 12360
actagaaccg gagacattac gccatgaaca agagcgccgc cgctggcctg ctgggctatg 12420
cccgcgtcag caccgacgac caggacttga ccaaccaacg ggccgaactg cacgcggccg 12480
gctgcaccaa gctgttttcc gagaagatca ccggcaccag gcgcgaccgc ccggagctgg 12540
ccaggatgct tgaccaccta gccctggcga cgttgtgaca gtgaccaggc tagaccgcct 12600
ggcccgcagc acccgcgacc tactggacat tgccgagcgc atccaggagg ccggcgcggg 12660
cctgcgtagc ctggcagagc cgtgggccga caccaccacg ccggccggcc gcatggtgtt 12720
gaccgtgttc gccggcattg ccgagttcga gcgttcccta atcatcgacc gcacccggag 12780
cgggcgcgag gccgccaagg cccgaggcgt gaagtttggc ccccgcccta ccctcacccc 12840
ggcacagatc gcgcacgccc gcgagctgat cgaccaggaa ggccgcaccg tgaaagaggc 12900
ggctgcactg cttggcgtgc atcgctcgac cctgtaccgc gcacttgagc gcagcgagga 12960
agtgacgccc accgaggcca ggcggcgcgg tgccttccgt gaggacgcat tgaccgaggc 13020
cgacgccctg gcggccgccg agaatgaacg ccaagaggaa caagcatgaa accgcaccag 13080
gacggccagg acgaaccgtt tttcattacc gaagagatcg aggcggagat gatcgcggcc 13140
gggtacgtgt tcgagccgcc cgcgcacgtc tcaaccgtgc ggctgcatga aatcctggcc 13200
ggtttgtctg atgccaagct ggcggcctgg ccggccagct tggccgctga agaaaccgag 13260
cgccgccgtc taaaaaggtg atgtgtattt gagtaaaaca gcttgcgtca tgcggtcgct 13320
gcgtatatga tgcgatgagt aaataaacaa atacgcaagg ggaacgcatg aaggttatcg 13380
ctgtacttaa ccagaaaggc gggtcaggca agacgaccat cgcaacccat ctagcccgcg 13440
ccctgcaact cgccggggcc gatgttctgt tagtcgattc cgatccccag ggcagtgccc 13500
gcgattgggc ggccgtgcgg gaagatcaac cgctaaccgt tgtcggcatc gaccgcccga 13560
cgattgaccg cgacgtgaag gccatcggcc ggcgcgactt cgtagtgatc gacggagcgc 13620
cccaggcggc ggacttggct gtgtccgcga tcaaggcagc cgacttcgtg ctgattccgg 13680
tgcagccaag cccttacgac atatgggcaa ccgccgacct ggtggagctg gttaagcagc 13740
gcattgaggt cacggatgga aggctacaag cggcctttgt cgtgtcgcgg gcgatcaaag 13800
gcacgcgcat cggcggtgag gttgccgagg cgctggccgg gtacgagctg cccattcttg 13860
agtcccgtat cacgcagcgc gtgagctacc caggcactgc cgccgccggc acaaccgttc 13920
ttgaatcaga acccgagggc gacgctgccc gcgaggtcca ggcgctggcc gctgaaatta 13980
aatcaaaact catttgagtt aatgaggtaa agagaaaatg agcaaaagca caaacacgct 14040
aagtgccggc cgtccgagcg cacgcagcag caaggctgca acgttggcca gcctggcaga 14100
cacgccagcc atgaagcggg tcaactttca gttgccggcg gaggatcaca ccaagctgaa 14160
gatgtacgcg gtacgccaag gcaagaccat taccgagctg ctatctgaat acatcgcgca 14220
gctaccagag taaatgagca aatgaataaa tgagtagatg aattttagcg gctaaaggag 14280
gcggcatgga aaatcaagaa caaccaggca ccgacgccgt ggaatgcccc atgtgtggag 14340
gaacgggcgg ttggccaggc gtaagcggct gggttgtctg ccggccctgc aatggcactg 14400
gaacccccaa gcccgaggaa tcggcgtgac ggtcgcaaac catccggccc ggtacaaatc 14460
ggcgcggcgc tgggtgatga cctggtggag aagttgaagg ccgcgcaggc cgcccagcgg 14520
caacgcatcg aggcagaagc acgccccggt gaatcgtggc aagcggccgc tgatcgaatc 14580
cgcaaagaat cccggcaacc gccggcagcc ggtgcgccgt cgattaggaa gccgcccaag 14640
ggcgacgagc aaccagattt tttcgttccg atgctctatg acgtgggcac ccgcgatagt 14700
cgcagcatca tggacgtggc cgttttccgt ctgtcgaagc gtgaccgacg agctggcgag 14760
gtgatccgct acgagcttcc agacgggcac gtagaggttt ccgcagggcc ggccggcatg 14820
gccagtgtgt gggattacga cctggtactg atggcggttt cccatctaac cgaatccatg 14880
aaccgatacc gggaagggaa gggagacaag cccggccgcg tgttccgtcc acacgttgcg 14940
gacgtactca agttctgccg gcgagccgat ggcggaaagc agaaagacga cctggtagaa 15000
acctgcattc ggttaaacac cacgcacgtt gccatgcagc gtacgaagaa ggccaagaac 15060
ggccgcctgg tgacggtatc cgagggtgaa gccttgatta gccgctacaa gatcgtaaag 15120
agcgaaaccg ggcggccgga gtacatcgag atcgagctag ctgattggat gtaccgcgag 15180
atcacagaag gcaagaaccc ggacgtgctg acggttcacc ccgattactt tttgatcgat 15240
cccggcatcg gccgttttct ctaccgcctg gcacgccgcg ccgcaggcaa ggcagaagcc 15300
agatggttgt tcaagacgat ctacgaacgc agtggcagcg ccggagagtt caagaagttc 15360
tgtttcaccg tgcgcaagct gatcgggtca aatgacctgc cggagtacga tttgaaggag 15420
gaggcggggc aggctggccc gatcctagtc atgcgctacc gcaacctgat cgagggcgaa 15480
gcatccgccg gttcctaatg tacggagcag atgctagggc aaattgccct agcaggggaa 15540
aaaggtcgaa aaggtctctt tcctgtggat agcacgtaca ttgggaaccc aaagccgtac 15600
attgggaacc ggaacccgta cattgggaac ccaaagccgt acattgggaa ccggtcacac 15660
atgtaagtga ctgatataaa agagaaaaaa ggcgattttt ccgcctaaaa ctctttaaaa 15720
cttattaaaa ctcttaaaac ccgcctggcc tgtgcataac tgtctggcca gcgcacagcc 15780
gaagagctgc aaaaagcgcc tacccttcgg tcgctgcgct ccctacgccc cgccgcttcg 15840
cgtcggccta tcgcggccgc tggccgctca aaaatggctg gcctacggcc aggcaatcta 15900
ccagggcgcg gacaagccgc gccgtcgcca ctcgaccgcc ggcgcccaca tcaaggcacc 15960
ctgcctcgcg cgtttcggtg atgacggtga aaacctctga cacatgcagc tcccggagac 16020
ggtcacagct tgtctgtaag cggatgccgg gagcagacaa gcccgtcagg gcgcgtcagc 16080
gggtgttggc gggtgtcggg gcgcagccat gacccagtca cgtagcgata gcggagtgta 16140
tactggctta actatgcggc atcagagcag attgtactga gagtgcacca tatgcggtgt 16200
gaaataccgc acagatgcgt aaggagaaaa taccgcatca ggcgctcttc cgcttcctcg 16260
ctcactgact cgctgcgctc ggtcgttcgg ctgcggcgag cggtatcagc tcactcaaag 16320
gcggtaatac ggttatccac agaatcaggg gataacgcag gaaagaacat gtgagcaaaa 16380
ggccagcaaa aggccaggaa ccgtaaaaag gccgcgttgc tggcgttttt ccataggctc 16440
cgcccccctg acgagcatca caaaaatcga cgctcaagtc agaggtggcg aaacccgaca 16500
ggactataaa gataccaggc gtttccccct ggaagctccc tcgtgcgctc tcctgttccg 16560
accctgccgc ttaccggata cctgtccgcc tttctccctt cgggaagcgt ggcgctttct 16620
catagctcac gctgtaggta tctcagttcg gtgtaggtcg ttcgctccaa gctgggctgt 16680
gtgcacgaac cccccgttca gcccgaccgc tgcgccttat ccggtaacta tcgtcttgag 16740
tccaacccgg taagacacga cttatcgcca ctggcagcag ccactggtaa caggattagc 16800
agagcgaggt atgtaggcgg tgctacagag ttcttgaagt ggtggcctaa ctacggctac 16860
actagaagga cagtatttgg tatctgcgct ctgctgaagc cagttacctt cggaaaaaga 16920
gttggtagct cttgatccgg caaacaaacc accgctggta gcggtggttt ttttgtttgc 16980
aagcagcaga ttacgcgcag aaaaaaagga tctcaagaag atcctttgat cttttctacg 17040
gggtctgacg ctcagtggaa cgaaaactca cgttaaggga ttttggtcat gcattctagg 17100
tactaaaaca attcatccag taaaatataa tattttattt tctcccaatc aggcttgatc 17160
cccagtaagt caaaaaatag ctcgacatac tgttcttccc cgatatcctc cctgatcgac 17220
cggacgcaga aggcaatgtc ataccacttg tccgccctgc cgcttctccc aagatcaata 17280
aagccactta ctttgccatc tttcacaaag atgttgctgt ctcccaggtc gccgtgggaa 17340
aagacaagtt cctcttcggg cttttccgtc tttaaaaaat catacagctc gcgcggatct 17400
ttaaatggag tgtcttcttc ccagttttcg caatccacat cggccagatc gttattcagt 17460
aagtaatcca attcggctaa gcggctgtct aagctattcg tatagggaca atccgatatg 17520
tcgatggagt gaaagagcct gatgcactcc gcatacagct cgataatctt ttcagggctt 17580
tgttcatctt catactcttc cgagcaaagg acgccatcgg cctcactcat gagcagattg 17640
ctccagccat catgccgttc aaagtgcagg acctttggaa caggcagctt tccttccagc 17700
catagcatca tgtccttttc ccgttcaaca tcataggtgg tccctttata ccggctgtcc 17760
gtcattttta aatataggtt ttcattttct cccaccagct tatatacctt agcaggagac 17820
attccttccg tatcttttac gcagcggtat ttttcgatca gttttttcaa ttccggtgat 17880
attctcattt tagccattta ttatttcctt cctcttttct acagtattta aagatacccc 17940
aagaagctaa ttataacaag acgaactcca attcactgtt ccttgcattc taaaacctta 18000
aataccagaa aacagctttt tcaaagttgt tttcaaagtt ggcgtataac atagtatcga 18060
cggagccgat tttgaaaccg cggtgatcac aggcagcaac gctctgtcat cgttacaatc 18120
aacatgctac cctccgcgag atcatccgtg tttcaaaccc ggcagcttag ttgccgttct 18180
tccgaatagc atcggtaaca tgagcaaagt ctgccgcctt acaacggctc tcccgctgac 18240
gccgtcccgg actgatgggc tgcctgtatc gagtggtgat tttgtgccga gctgccggtc 18300
ggggagctgt tggctggctg gtggcaggat atattgtggt gtaaacaaat tgacgcttag 18360
acaacttaat aacacattgc ggacgttttt aatgtactga attaacgccg aattaattcg 18420
ggggatctgg attttagtac tggattttgg ttttaggaat tagaaatttt attgatagaa 18480
gtattttaca aatacaaata catactaagg gtttcttata tgctcaacac atgagcgaaa 18540
ccctatagga accctaattc ccttatctgg gaactactca cacattatta tggagaaact 18600
cgagcttgtc gatcgacaga tccggtcggc atctactcta tttctttgcc ctcggacgag 18660
tgctggggcg tcggtttcca ctatcggcga gtacttctac acagccatcg gtccagacgg 18720
ccgcgcttct gcgggcgatt tgtgtacgcc cgacagtccc ggctccggat cggacgattg 18780
cgtcgcatcg accctgcgcc caagctgcat catcgaaatt gccgtcaacc aagctctgat 18840
agagttggtc aagaccaatg cggagcatat acgcccggag tcgtggcgat cctgcaagct 18900
ccggatgcct ccgctcgaag tagcgcgtct gctgctccat acaagccaac cacggcctcc 18960
agaagaagat gttggcgacc tcgtattggg aatccccgaa catcgcctcg ctccagtcaa 19020
tgaccgctgt tatgcggcca ttgtccgtca ggacattgtt ggagccgaaa tccgcgtgca 19080
cgaggtgccg gacttcgggg cagtcctcgg cccaaagcat cagctcatcg agagcctgcg 19140
cgacggacgc actgacggtg tcgtccatca cagtttgcca gtgatacaca tggggatcag 19200
caatcgcgca tatgaaatca cgccatgtag tgtattgacc gattccttgc ggtccgaatg 19260
ggccgaaccc gctcgtctgg ctaagatcgg ccgcagcgat cgcatccata gcctccgcga 19320
ccggttgtag aacagcgggc agttcggttt caggcaggtc ttgcaacgtg acaccctgtg 19380
cacggcggga gatgcaatag gtcaggctct cgctaaactc cccaatgtca agcacttccg 19440
gaatcgggag cgcggccgat gcaaagtgcc gataaacata acgatctttg tagaaaccat 19500
cggcgcagct atttacccgc aggacctatc cacgccctcc tacatcgaag ctgaaagcac 19560
gagattcttc gccctccgag agctgcatca ggtcggagac gctgtcgaac ttttcgatca 19620
gaaacttctc gacagacgtc gcggtgagtt caggcttttt catatctcat tgccccccgg 19680
atctgcgaaa gctcgagaga gatagatttg tagagagaga ctggtgattt cagcgtgtcc 19740
tctccaaatg aaatgaactt ccttatatag aggaaggtct tgcgaaggat agtgggattg 19800
tgcgtcatcc cttacgtcag tggagatatc acatcaatcc acttgctttg aagacgtggt 19860
tggaacgtct tctttttcca cgatgctcct cgtgggtggg ggtccatctt tgggaccact 19920
gtcggcagag gcatcttgaa cgatagcctt tcctttatcg caatgatggc atttgtaggt 19980
gccaccttcc ttttctactg tccttttgat gaagtgacag atagctgggc aatggaatcc 20040
gaggaggttt cccgatatta ccctttgttg aaaagtctca atagcccttt ggtcttctga 20100
gactgtatct ttgatattct tggagtagac gagagtgtcg tgctccacca tgttatcaca 20160
tcaatccact tgctttgaag acgtggttgg aacgtcttct ttttccacga tgctcctcgt 20220
gggtgggggt ccatctttgg gaccactgtc ggcagaggca tcttgaacga tagcctttcc 20280
tttatcgcaa tgatggcatt tgtaggtgcc accttccttt tctactgtcc ttttgatgaa 20340
gtgacagata gctgggcaat ggaatccgag gaggtttccc gatattaccc tttgttgaaa 20400
agtctcaata gccctttggt cttctgagac tgtatctttg atattcttgg agtagacgag 20460
agtgtcgtgc tccaccatgt tggcaagctg ctctagccaa tacgcaaacc gcctctcccc 20520
gcgcgttggc cgattcatta atgcagctgg cacgacaggt ttcccgactg gaaagcgggc 20580
agtgagcgca acgcaattaa tgtgagttag ctcactcatt aggcacccca ggctttacac 20640
tttatgcttc cggctcgtat gttgtgtgga attgtgagcg gataacaatt tcacacagga 20700
aacagctatg accatgatta c 20721
<210> 3
<211> 2122
<212> PRT
<213> Artificial sequence
<400> 3
Pro Lys Lys Lys Arg Lys Val Gly Ile His Gly Val Pro Ala Ala Asp
1 5 10 15
Lys Lys Tyr Ser Ile Gly Leu Asp Ile Gly Thr Asn Ser Val Gly Trp
20 25 30
Ala Val Ile Thr Asp Glu Tyr Lys Val Pro Ser Lys Lys Phe Lys Val
35 40 45
Leu Gly Asn Thr Asp Arg His Ser Ile Lys Lys Asn Leu Ile Gly Ala
50 55 60
Leu Leu Phe Asp Ser Gly Glu Thr Ala Glu Ala Thr Arg Leu Lys Arg
65 70 75 80
Thr Ala Arg Arg Arg Tyr Thr Arg Arg Lys Asn Arg Ile Cys Tyr Leu
85 90 95
Gln Glu Ile Phe Ser Asn Glu Met Ala Lys Val Asp Asp Ser Phe Phe
100 105 110
His Arg Leu Glu Glu Ser Phe Leu Val Glu Glu Asp Lys Lys His Glu
115 120 125
Arg His Pro Ile Phe Gly Asn Ile Val Asp Glu Val Ala Tyr His Glu
130 135 140
Lys Tyr Pro Thr Ile Tyr His Leu Arg Lys Lys Leu Val Asp Ser Thr
145 150 155 160
Asp Lys Ala Asp Leu Arg Leu Ile Tyr Leu Ala Leu Ala His Met Ile
165 170 175
Lys Phe Arg Gly His Phe Leu Ile Glu Gly Asp Leu Asn Pro Asp Asn
180 185 190
Ser Asp Val Asp Lys Leu Phe Ile Gln Leu Val Gln Thr Tyr Asn Gln
195 200 205
Leu Phe Glu Glu Asn Pro Ile Asn Ala Ser Gly Val Asp Ala Lys Ala
210 215 220
Ile Leu Ser Ala Arg Leu Ser Lys Ser Arg Arg Leu Glu Asn Leu Ile
225 230 235 240
Ala Gln Leu Pro Gly Glu Lys Lys Asn Gly Leu Phe Gly Asn Leu Ile
245 250 255
Ala Leu Ser Leu Gly Leu Thr Pro Asn Phe Lys Ser Asn Phe Asp Leu
260 265 270
Ala Glu Asp Ala Lys Leu Gln Leu Ser Lys Asp Thr Tyr Asp Asp Asp
275 280 285
Leu Asp Asn Leu Leu Ala Gln Ile Gly Asp Gln Tyr Ala Asp Leu Phe
290 295 300
Leu Ala Ala Lys Asn Leu Ser Asp Ala Ile Leu Leu Ser Asp Ile Leu
305 310 315 320
Arg Val Asn Thr Glu Ile Thr Lys Ala Pro Leu Ser Ala Ser Met Ile
325 330 335
Lys Arg Tyr Asp Glu His His Gln Asp Leu Thr Leu Leu Lys Ala Leu
340 345 350
Val Arg Gln Gln Leu Pro Glu Lys Tyr Lys Glu Ile Phe Phe Asp Gln
355 360 365
Ser Lys Asn Gly Tyr Ala Gly Tyr Ile Asp Gly Gly Ala Ser Gln Glu
370 375 380
Glu Phe Tyr Lys Phe Ile Lys Pro Ile Leu Glu Lys Met Asp Gly Thr
385 390 395 400
Glu Glu Leu Leu Val Lys Leu Asn Arg Glu Asp Leu Leu Arg Lys Gln
405 410 415
Arg Thr Phe Asp Asn Gly Ser Ile Pro His Gln Ile His Leu Gly Glu
420 425 430
Leu His Ala Ile Leu Arg Arg Gln Glu Asp Phe Tyr Pro Phe Leu Lys
435 440 445
Asp Asn Arg Glu Lys Ile Glu Lys Ile Leu Thr Phe Arg Ile Pro Tyr
450 455 460
Tyr Val Gly Pro Leu Ala Arg Gly Asn Ser Arg Phe Ala Trp Met Thr
465 470 475 480
Arg Lys Ser Glu Glu Thr Ile Thr Pro Trp Asn Phe Glu Glu Val Val
485 490 495
Asp Lys Gly Ala Ser Ala Gln Ser Phe Ile Glu Arg Met Thr Asn Phe
500 505 510
Asp Lys Asn Leu Pro Asn Glu Lys Val Leu Pro Lys His Ser Leu Leu
515 520 525
Tyr Glu Tyr Phe Thr Val Tyr Asn Glu Leu Thr Lys Val Lys Tyr Val
530 535 540
Thr Glu Gly Met Arg Lys Pro Ala Phe Leu Ser Gly Glu Gln Lys Lys
545 550 555 560
Ala Ile Val Asp Leu Leu Phe Lys Thr Asn Arg Lys Val Thr Val Lys
565 570 575
Gln Leu Lys Glu Asp Tyr Phe Lys Lys Ile Glu Cys Phe Asp Ser Val
580 585 590
Glu Ile Ser Gly Val Glu Asp Arg Phe Asn Ala Ser Leu Gly Thr Tyr
595 600 605
His Asp Leu Leu Lys Ile Ile Lys Asp Lys Asp Phe Leu Asp Asn Glu
610 615 620
Glu Asn Glu Asp Ile Leu Glu Asp Ile Val Leu Thr Leu Thr Leu Phe
625 630 635 640
Glu Asp Arg Glu Met Ile Glu Glu Arg Leu Lys Thr Tyr Ala His Leu
645 650 655
Phe Asp Asp Lys Val Met Lys Gln Leu Lys Arg Arg Arg Tyr Thr Gly
660 665 670
Trp Gly Arg Leu Ser Arg Lys Leu Ile Asn Gly Ile Arg Asp Lys Gln
675 680 685
Ser Gly Lys Thr Ile Leu Asp Phe Leu Lys Ser Asp Gly Phe Ala Asn
690 695 700
Arg Asn Phe Met Gln Leu Ile His Asp Asp Ser Leu Thr Phe Lys Glu
705 710 715 720
Asp Ile Gln Lys Ala Gln Val Ser Gly Gln Gly Asp Ser Leu His Glu
725 730 735
His Ile Ala Asn Leu Ala Gly Ser Pro Ala Ile Lys Lys Gly Ile Leu
740 745 750
Gln Thr Val Lys Val Val Asp Glu Leu Val Lys Val Met Gly Arg His
755 760 765
Lys Pro Glu Asn Ile Val Ile Glu Met Ala Arg Glu Asn Gln Thr Thr
770 775 780
Gln Lys Gly Gln Lys Asn Ser Arg Glu Arg Met Lys Arg Ile Glu Glu
785 790 795 800
Gly Ile Lys Glu Leu Gly Ser Gln Ile Leu Lys Glu His Pro Val Glu
805 810 815
Asn Thr Gln Leu Gln Asn Glu Lys Leu Tyr Leu Tyr Tyr Leu Gln Asn
820 825 830
Gly Arg Asp Met Tyr Val Asp Gln Glu Leu Asp Ile Asn Arg Leu Ser
835 840 845
Asp Tyr Asp Val Asp Ala Ile Val Pro Gln Ser Phe Leu Lys Asp Asp
850 855 860
Ser Ile Asp Asn Lys Val Leu Thr Arg Ser Asp Lys Asn Arg Gly Lys
865 870 875 880
Ser Asp Asn Val Pro Ser Glu Glu Val Val Lys Lys Met Lys Asn Tyr
885 890 895
Trp Arg Gln Leu Leu Asn Ala Lys Leu Ile Thr Gln Arg Lys Phe Asp
900 905 910
Asn Leu Thr Lys Ala Glu Arg Gly Gly Leu Ser Glu Leu Asp Lys Ala
915 920 925
Gly Phe Ile Lys Arg Gln Leu Val Glu Thr Arg Gln Ile Thr Lys His
930 935 940
Val Ala Gln Ile Leu Asp Ser Arg Met Asn Thr Lys Tyr Asp Glu Asn
945 950 955 960
Asp Lys Leu Ile Arg Glu Val Lys Val Ile Thr Leu Lys Ser Lys Leu
965 970 975
Val Ser Asp Phe Arg Lys Asp Phe Gln Phe Tyr Lys Val Arg Glu Ile
980 985 990
Asn Asn Tyr His His Ala His Asp Ala Tyr Leu Asn Ala Val Val Gly
995 1000 1005
Thr Ala Leu Ile Lys Lys Tyr Pro Lys Leu Glu Ser Glu Phe Val
1010 1015 1020
Tyr Gly Asp Tyr Lys Val Tyr Asp Val Arg Lys Met Ile Ala Lys
1025 1030 1035
Ser Glu Gln Glu Ile Gly Lys Ala Thr Ala Lys Tyr Phe Phe Tyr
1040 1045 1050
Ser Asn Ile Met Asn Phe Phe Lys Thr Glu Ile Thr Leu Ala Asn
1055 1060 1065
Gly Glu Ile Arg Lys Arg Pro Leu Ile Glu Thr Asn Gly Glu Thr
1070 1075 1080
Gly Glu Ile Val Trp Asp Lys Gly Arg Asp Phe Ala Thr Val Arg
1085 1090 1095
Lys Val Leu Ser Met Pro Gln Val Asn Ile Val Lys Lys Thr Glu
1100 1105 1110
Val Gln Thr Gly Gly Phe Ser Lys Glu Ser Ile Leu Pro Lys Arg
1115 1120 1125
Asn Ser Asp Lys Leu Ile Ala Arg Lys Lys Asp Trp Asp Pro Lys
1130 1135 1140
Lys Tyr Gly Gly Phe Asp Ser Pro Thr Val Ala Tyr Ser Val Leu
1145 1150 1155
Val Val Ala Lys Val Glu Lys Gly Lys Ser Lys Lys Leu Lys Ser
1160 1165 1170
Val Lys Glu Leu Leu Gly Ile Thr Ile Met Glu Arg Ser Ser Phe
1175 1180 1185
Glu Lys Asn Pro Ile Asp Phe Leu Glu Ala Lys Gly Tyr Lys Glu
1190 1195 1200
Val Lys Lys Asp Leu Ile Ile Lys Leu Pro Lys Tyr Ser Leu Phe
1205 1210 1215
Glu Leu Glu Asn Gly Arg Lys Arg Met Leu Ala Ser Ala Gly Glu
1220 1225 1230
Leu Gln Lys Gly Asn Glu Leu Ala Leu Pro Ser Lys Tyr Val Asn
1235 1240 1245
Phe Leu Tyr Leu Ala Ser His Tyr Glu Lys Leu Lys Gly Ser Pro
1250 1255 1260
Glu Asp Asn Glu Gln Lys Gln Leu Phe Val Glu Gln His Lys His
1265 1270 1275
Tyr Leu Asp Glu Ile Ile Glu Gln Ile Ser Glu Phe Ser Lys Arg
1280 1285 1290
Val Ile Leu Ala Asp Ala Asn Leu Asp Lys Val Leu Ser Ala Tyr
1295 1300 1305
Asn Lys His Arg Asp Lys Pro Ile Arg Glu Gln Ala Glu Asn Ile
1310 1315 1320
Ile His Leu Phe Thr Leu Thr Asn Leu Gly Ala Pro Ala Ala Phe
1325 1330 1335
Lys Tyr Phe Asp Thr Thr Ile Asp Arg Lys Arg Tyr Thr Ser Thr
1340 1345 1350
Lys Glu Val Leu Asp Ala Thr Leu Ile His Gln Ser Ile Thr Gly
1355 1360 1365
Leu Tyr Glu Thr Arg Ile Asp Leu Ser Gln Leu Gly Gly Asp Ser
1370 1375 1380
Gly Gly Ser Ser Gly Gly Ser Ser Gly Ser Glu Thr Pro Gly Thr
1385 1390 1395
Ser Glu Ser Ala Thr Pro Glu Ser Ser Gly Gly Ser Ser Gly Gly
1400 1405 1410
Ser Ser Thr Leu Asn Ile Glu Asp Glu Tyr Arg Leu His Glu Thr
1415 1420 1425
Ser Lys Glu Pro Asp Val Ser Leu Gly Ser Thr Trp Leu Ser Asp
1430 1435 1440
Phe Pro Gln Ala Trp Ala Glu Thr Gly Gly Met Gly Leu Ala Val
1445 1450 1455
Arg Gln Ala Pro Leu Ile Ile Pro Leu Lys Ala Thr Ser Thr Pro
1460 1465 1470
Val Ser Ile Lys Gln Tyr Pro Met Ser Gln Glu Ala Arg Leu Gly
1475 1480 1485
Ile Lys Pro His Ile Gln Arg Leu Leu Asp Gln Gly Ile Leu Val
1490 1495 1500
Pro Cys Gln Ser Pro Trp Asn Thr Pro Leu Leu Pro Val Lys Lys
1505 1510 1515
Pro Gly Thr Asn Asp Tyr Arg Pro Val Gln Asp Leu Arg Glu Val
1520 1525 1530
Asn Lys Arg Val Glu Asp Ile His Pro Thr Val Pro Asn Pro Tyr
1535 1540 1545
Asn Leu Leu Ser Gly Leu Pro Pro Ser His Gln Trp Tyr Thr Val
1550 1555 1560
Leu Asp Leu Lys Asp Ala Phe Phe Cys Leu Arg Leu His Pro Thr
1565 1570 1575
Ser Gln Pro Leu Phe Ala Phe Glu Trp Arg Asp Pro Glu Met Gly
1580 1585 1590
Ile Ser Gly Gln Leu Thr Trp Thr Arg Leu Pro Gln Gly Phe Lys
1595 1600 1605
Asn Ser Pro Thr Leu Phe Asn Glu Ala Leu His Arg Asp Leu Ala
1610 1615 1620
Asp Phe Arg Ile Gln His Pro Asp Leu Ile Leu Leu Gln Tyr Val
1625 1630 1635
Asp Asp Leu Leu Leu Ala Ala Thr Ser Glu Leu Asp Cys Gln Gln
1640 1645 1650
Gly Thr Arg Ala Leu Leu Gln Thr Leu Gly Asn Leu Gly Tyr Arg
1655 1660 1665
Ala Ser Ala Lys Lys Ala Gln Ile Cys Gln Lys Gln Val Lys Tyr
1670 1675 1680
Leu Gly Tyr Leu Leu Lys Glu Gly Gln Arg Trp Leu Thr Glu Ala
1685 1690 1695
Arg Lys Glu Thr Val Met Gly Gln Pro Thr Pro Lys Thr Pro Arg
1700 1705 1710
Gln Leu Arg Glu Phe Leu Gly Lys Ala Gly Phe Cys Arg Leu Phe
1715 1720 1725
Ile Pro Gly Phe Ala Glu Met Ala Ala Pro Leu Tyr Pro Leu Thr
1730 1735 1740
Lys Pro Gly Thr Leu Phe Asn Trp Gly Pro Asp Gln Gln Lys Ala
1745 1750 1755
Tyr Gln Glu Ile Lys Gln Ala Leu Leu Thr Ala Pro Ala Leu Gly
1760 1765 1770
Leu Pro Asp Leu Thr Lys Pro Phe Glu Leu Phe Val Asp Glu Lys
1775 1780 1785
Gln Gly Tyr Ala Lys Gly Val Leu Thr Gln Lys Leu Gly Pro Trp
1790 1795 1800
Arg Arg Pro Val Ala Tyr Leu Ser Lys Lys Leu Asp Pro Val Ala
1805 1810 1815
Ala Gly Trp Pro Pro Cys Leu Arg Met Val Ala Ala Ile Ala Val
1820 1825 1830
Leu Thr Lys Asp Ala Gly Lys Leu Thr Met Gly Gln Pro Leu Val
1835 1840 1845
Ile Leu Ala Pro His Ala Val Glu Ala Leu Val Lys Gln Pro Pro
1850 1855 1860
Asp Arg Trp Leu Ser Asn Ala Arg Met Thr His Tyr Gln Ala Leu
1865 1870 1875
Leu Leu Asp Thr Asp Arg Val Gln Phe Gly Pro Val Val Ala Leu
1880 1885 1890
Asn Pro Ala Thr Leu Leu Pro Leu Pro Glu Glu Gly Leu Gln His
1895 1900 1905
Asn Cys Leu Asp Ile Leu Ala Glu Ala His Gly Thr Arg Pro Asp
1910 1915 1920
Leu Thr Asp Gln Pro Leu Pro Asp Ala Asp His Thr Trp Tyr Thr
1925 1930 1935
Asp Gly Ser Ser Leu Leu Gln Glu Gly Gln Arg Lys Ala Gly Ala
1940 1945 1950
Ala Val Thr Thr Glu Thr Glu Val Ile Trp Ala Lys Ala Leu Pro
1955 1960 1965
Ala Gly Thr Ser Ala Gln Arg Ala Glu Leu Ile Ala Leu Thr Gln
1970 1975 1980
Ala Leu Lys Met Ala Glu Gly Lys Lys Leu Asn Val Tyr Thr Asp
1985 1990 1995
Ser Arg Tyr Ala Phe Ala Thr Ala His Ile His Gly Glu Ile Tyr
2000 2005 2010
Arg Arg Arg Gly Trp Leu Thr Ser Glu Gly Lys Glu Ile Lys Asn
2015 2020 2025
Lys Asp Glu Ile Leu Ala Leu Leu Lys Ala Leu Phe Leu Pro Lys
2030 2035 2040
Arg Leu Ser Ile Ile His Cys Pro Gly His Gln Lys Gly His Ser
2045 2050 2055
Ala Glu Ala Arg Gly Asn Arg Met Ala Asp Gln Ala Ala Arg Lys
2060 2065 2070
Ala Ala Ile Thr Glu Thr Pro Asp Thr Ser Thr Leu Leu Ile Glu
2075 2080 2085
Asn Ser Ser Pro Ser Gly Gly Ser Lys Arg Thr Ala Asp Gly Ser
2090 2095 2100
Glu Phe Glu Lys Arg Pro Ala Ala Thr Lys Lys Ala Gly Gln Ala
2105 2110 2115
Lys Lys Lys Lys
2120
<210> 4
<211> 18234
<212> DNA
<213> Artificial sequence
<400> 4
gaattcgggt accgttgtca atcaattggc aagtcataaa atgcattaaa aaatattttc 60
atactcaact acaaatccat gagtataact ataattataa agcaatgatt agaatctgac 120
aaggattctg gaaaattaca taaaggaaag ttcataaatg tctaaaacac aagaggacat 180
acttgtattc agtaacattt gcagcttttc taggtctgaa aatatatttg ttgcctagtg 240
aataagcata atggtacaac tacaagtgtt ttactcctca tattaacttc ggtcattaga 300
ggccacgatt tgacacattt ttactcaaaa caaaatgttt gcatatctct tataatttca 360
aattcaacac acaacaaata agagaaaaaa caaataatat taatttgaga atgaacaaaa 420
ggaccatatc attcattaac tcttctccat ccatttccat ttcacagttc gatagcgaaa 480
accgaataaa aaacacagta aattacaagc acaacaaatg gtacaagaaa aacagttttc 540
ccaatgccat aatactcaaa ctcagtagga ttctggtgtg tgcgcaatga aactgatgca 600
ttgaacttga cgaacgttgt cgaaaccgat gatacgaacg aaagctctgg ggaaattcga 660
gctctttaaa tttttttttt tttttttttt ttttgttaaa tttttttttt tttttttttt 720
ttttgttaaa tttttttttt tttttttttt tttttgttaa cttgatgtcc gaaaacaaaa 780
ctgaaagaac acagtaaatt acaagcagaa caatggcttt tccaatgcca taatactcaa 840
agttaacctt actcagcctc gacgagctca cttcttcttc ttcgcctgcc ccgccttctt 900
cgtcgccgct ggccgcttct cgaactcgct gccgtcagct gtgcgcttgc tgccgccgga 960
gggggagctg ttctcgatga ggagggtgga ggtgtctggg gtctcggtga tagcggcctt 1020
gcgggcggct tggtcggcca ttctgttgcc gcgggcctca gcggagtggc ccttctggtg 1080
gcccgggcaa tggatgatgg agaggcgctt cgggaggaag agggccttga ggagcgcgag 1140
gatctcgtcc ttgttcttga tctccttgcc ctcggaggtg agccagcccc tcctcctgta 1200
gatctcgccg tggatgtggg cggtggcgaa ggcgtagcgg gagtcggtgt acacattgag 1260
cttcttgccc tcggccatct tcagggcttg ggtgagggca atcagctcgg cgcgttgagc 1320
tgaggtgccg gctgggaggg ccttggccca aatcacctcg gtctcggttg tcacagcggc 1380
gccggccttc ctttggccct cctggaggag ggaggagccg tcggtgtacc aggtgtggtc 1440
ggcgtctggg aggggctggt cggtgagatc cggtctggtg ccgtgggcct cggcgaggat 1500
gtcgaggcag ttgtgttgga ggccctcctc agggagtggg aggagggtgg ccgggttgag 1560
ggcgacaaca gggccgaatt ggacgcggtc ggtgtcgagg aggagggcct ggtagtgggt 1620
catgcgggcg ttggagagcc atctgtcagg aggctgctta acgagggcct cgacggcgtg 1680
tggggcgagg atgaccagcg gttggcccat ggtgagcttg ccggcgtcct tggtgagcac 1740
ggcaatggcc gccaccattc tgaggcatgg tggccagccg gcggccacgg gatcgagctt 1800
cttggagaga taagccaccg ggcgcctcca tgggcccagc ttctgggtga ggacgccctt 1860
ggcgtagcct tgcttctcgt ccacgaagag ctcgaagggc ttggtgaggt ctgggaggcc 1920
gagggccggg gcggtgagca gggcctgctt gatctcctgg taggccttct gttggtctgg 1980
gccccagttg aagagtgtgc ctggcttggt gagtggatag agcggcgcgg ccatctccgc 2040
gaagcccgga atgaagaggc ggcagaagcc ggccttgccg aggaactcgc ggagttgtct 2100
cggtgtcttc ggggttggct ggcccatcac tgtctccttt ctggcctcgg tgagccagcg 2160
ttggccctcc ttgaggaggt agccgaggta cttgacttgc ttctggcaga tctgggcctt 2220
cttagcggag gcgcggtagc cgaggttgcc gagggtttgg aggagggctc ttgtgccctg 2280
ttggcagtcg agctctgagg tcgcggcgag gaggaggtcg tccacgtact ggaggaggat 2340
gaggtccggg tgctgaatcc tgaagtcggc gaggtcgcgg tggagggcct cgttgaagag 2400
ggtgggggag ttcttgaagc cttgcgggag gcgggtccac gtgagttggc cggagatgcc 2460
catctcaggg tcgcgccact cgaaggcgaa gagtggctgg gaggttgggt ggaggcggag 2520
gcagaagaag gcgtccttga ggtcgagcac ggtgtaccat tggtgggatg gcgggaggcc 2580
ggagaggagg ttgtatgggt ttgggacggt tgggtggatg tcctcaacgc gcttattgac 2640
ctcgcggagg tcctgcacgg ggcggtagtc gttggtgcct ggcttcttca cggggaggag 2700
tggggtgttc cacggggatt ggcagggcac gaggatgcct tgatcgagga ggcgctggat 2760
gtgtggctta atgccgagtc tggcctcctg ggacattggg tattgcttaa tggagaccgg 2820
tgtggaggtc gccttgagcg ggatgatgag tggagcctgg cgcacagcga ggcccatgcc 2880
gcctgtctcg gcccaggcct gagggaagtc ggagagccag gtagagccga gagacacgtc 2940
aggctccttg gatgtctcgt ggagtctgta ctcgtcctcg atgttgaggg tagaagagcc 3000
gccagaggag ccgccggagg actcaggggt ggcggactcg gaggtgccag gggtctcaga 3060
gccggatgag ccgccggaag agccgccgga gtcgcccccg agctgagaca ggtcgatgcg 3120
cgtctcgtag aggccggtaa tcgactggtg gatgagggtc gcgtccagga cctccttagt 3180
gcttgtgtac ctcttgcgat cgatagttgt gtcgaagtac ttgaaagcag caggggcgcc 3240
gaggttcgtc agggtgaaga gatgaatgat attctcagcc tgctccctga ttggcttgtc 3300
gcggtgcttg ttgtacgcgg agaggacctt atccagattc gcgtcggcca ggatcacgcg 3360
cttggagaac tcggaaatct gctcaatgat ctcgtcgagg taatgcttgt gctgctcgac 3420
gaacagctgc ttctgctcgt tgtcctcggg gctgcccttg agcttctcgt agtgggaggc 3480
caggtagagg aagttcacat acttggacgg cagagccagc tcgttcccct tctgcagctc 3540
gccagcggaa gccagcatcc gcttcctgcc gttctccagc tcgaagagtg agtacttggg 3600
gagcttaatg atcaggtcct tcttcacctc cttgtagccc ttcgcctcca ggaaatcgat 3660
cgggttcttc tcgaagctgg agcgctccat aatcgtgatc cccagcagct ccttcacgct 3720
cttgagcttc ttggacttgc ccttctcaac cttcgccaca accaggaccg agtaggccac 3780
agtggggctg tcgaacccgc cgtacttctt cggatcccag tccttcttgc gggcgatgag 3840
cttgtcgctg ttccgcttag gcagaattga ctccttagag aacccgccag tctggacctc 3900
tgtcttcttg acgatattca cttgtggcat ggagagaacc ttcctgacgg tcgcgaaatc 3960
cctgcccttg tcccacacga tctcccccgt ctcgccgttc gtctcgatga gggggcgctt 4020
ccggatctcg ccattggcca gagtgatctc tgtcttgaag aaattcataa tgttagagta 4080
gaagaagtac ttggcggtag ccttgccaat ctcctgctcc gacttggcga tcatcttcct 4140
cacatcgtaa accttgtagt ccccgtacac gaactcgctc tcgagctttg ggtacttctt 4200
gatcagagct gtgccgacca ccgcgttcag gtacgcgtca tgggcatggt ggtaattgtt 4260
gatctcccga accttgtaga actggaaatc cttcctgaag tcggagacga gctttgactt 4320
cagggtgatg accttcacct cgcggatcag cttgtcattc tcatcgtact tagtgttcat 4380
ccgtgagtcg agaatctgcg caacgtgctt agtgatctgc cgtgtctcga ccagctgcct 4440
cttgatgaag cccgccttgt ccagctcaga gagcccgccc ctctcagcct ttgtgaggtt 4500
atcgaacttc cgctgcgtga tcagcttggc attcaggagc tggcgccagt agttcttcat 4560
cttcttaacg acctcctctg aaggaacatt atcagacttg ccccggttct tgtccgacct 4620
ggtgaggacc ttgttgtcaa tggagtcatc cttcaggaat gactgtggaa cgatagcatc 4680
gacgtcgtaa tcgctgagcc tgttaatatc cagctcctgg tccacataca tatcgcggcc 4740
attctggagg tagtacaggt agagcttctc attctgcagc tgcgtgttct ccaccgggtg 4800
ctccttgagg atctgggacc ccagctcctt aatgccctcc tcgatcctct tcatcctctc 4860
gcgtgagttc ttctggccct tctgcgtggt ctgattctcc cgggccatct caatgacgat 4920
gttctcaggc ttgtgcctgc ccatgacctt caccagctcg tccacaacct tcacggtctg 4980
cagaatcccc ttcttgatag ctggcgagcc agcgaggttc gcgatatgct cgtgcagcga 5040
gtccccctgg ccgctcacct gagccttctg gatatcctcc ttgaatgtga ggctgtcatc 5100
gtgaatcagc tgcatgaaat tgcggttcgc gaagccatcg ctcttcagga agtcgaggat 5160
cgtcttcccg gactgcttgt cccgaatgcc gttgatgagc ttcctgctca gcctccccca 5220
gccggtgtac ctcctcctct tgagctgctt catgaccttg tcatcgaaga gatgggcgta 5280
agtcttcagg cgctcctcga tcatctcccg gtcctcgaac agagtgagtg tcagcacaat 5340
gtcctcgagg atatcctcat tctcctcgtt gtccaggaag tccttatcct taatgatctt 5400
caggagatcg tggtaggtcc ccagggaggc gttgaagcgg tcctcaacgc cagagatctc 5460
gaccgaatcg aagcactcaa tcttcttgaa gtagtcctcc ttgagctgct taaccgtgac 5520
cttccggttg gtcttgaaca ggaggtccac gatggccttc ttctgctccc cagacaggaa 5580
agccggcttc ctcatgccct cggtcacata cttcacctta gtcagctcgt tgtagactgt 5640
gaagtactcg tacaggagcg agtgcttagg gagcaccttc tcatttggca ggttcttgtc 5700
gaaattcgtc atcctctcga tgaacgactg agcgctagcg cccttgtcga ccacctcctc 5760
gaagttccac ggcgtgatcg tctcctctga cttgcgggtc atccaagcga agcgggagtt 5820
gcccctagcg agtgggccga cgtagtacgg gatcctgaaa gtcagaatct tctcgatctt 5880
ctcgcggtta tccttgagga aagggtagaa gtcctcctgc ctcctcagga tagcgtgcag 5940
ctccccgaga tgaatctggt gtgggatgct gccgttatcg aatgtccgct gcttcctcag 6000
gaggtcctcg cgattgagct tcaccagcag ctcctccgtg ccgtccatct tctccagaat 6060
cggcttgatg aacttgtaga actcctcctg agaggccccg ccgtcaatgt acccagcgta 6120
gccgttcttc gactgatcga agaagatctc cttgtacttc tcggggagct gctgcctgac 6180
cagcgccttc aggagggtca gatcctgatg gtgctcgtcg tagcgcttga tcatggaggc 6240
tgagagcgga gccttcgtaa tctcggtgtt caccctgaga atatcagaca ggaggatggc 6300
gtccgacaga ttcttggcag cgaggaacag gtccgcgtac tgatcgccga tctgggccag 6360
gaggttatcc aggtcatcgt cgtatgtgtc cttggagagc tgcagcttgg cgtcctcagc 6420
gagatcgaaa ttcgacttga agttgggcgt gagccccagg ctgagcgcaa tgagattccc 6480
gaacaggccg ttcttcttct cgcccggcag ctgggcgatc aggttctcga ggcgccgaga 6540
cttcgagagc ctagcggaca ggatagcctt cgcgtcgacg cctgacgcat taatggggtt 6600
ctcctcgaag agctggttgt acgtctgcac gagctggatg aacagcttgt caacatcgct 6660
attgtccggg ttgagatccc cctcgatcag gaaatggccc ctgaacttaa tcatgtgggc 6720
cagagcgagg tagatcaggc ggaggtccgc cttatctgtg gagtccacga gcttcttccg 6780
cagatggtag atcgtagggt acttctcgtg gtaggcaacc tcgtcgacaa tgttgccgaa 6840
gattggatgc cgctcgtgct tcttatcctc ctccacgagg aatgactcct ccagcctgtg 6900
gaagaaagaa tcgtcaacct tcgccatctc gttggagaaa atctcctgca ggtagcagat 6960
gcgattcttc ctgcgcgtgt accgcctgcg ggcggtgcgc ttgagccgcg tagcctcagc 7020
cgtctcgccg ctgtcgaaca ggagagcgcc aatgagattc ttcttgatgg aatgccgatc 7080
ggtgttgccc aggaccttga acttctttga gggcaccttg tactcgtcgg tgatcacggc 7140
ccagccaaca gagttagtcc caatatcgag gccgatcgag tacttcttgt cagcagctgg 7200
caccccgtgg atgccaacct tcctcttctt cttcggagcc atcttgtcat catcatcctt 7260
gtaatcaatg tcgtggtcct tgtaatcccc gtcgtggtcc ttgtaatcca tctagtaggc 7320
ctagggctgc agaagtaaca ccaaacaaca gggtgagcat cgacaaaaga aacagtacca 7380
agcaaataaa tagcgtatga aggcagggct aaaaaaatcc acatatagct gctgcatatg 7440
ccatcatcca agtatatcaa gatcaaaata attataaaac atacttgttt attataatag 7500
ataggtactc aaggttagag catatgaata gatgctgcat atgccatcat gtatatgcat 7560
cagtaaaacc cacatcaaca tgtataccta tcctagatcg atatttccat ccatcttaaa 7620
ctcgtaacta tgaagatgta tgacacacac atacagttcc aaaattaata aatacaccag 7680
gtagtttgaa acagtattct actccgatct agaacgaatg aacgaccgcc caaccacacc 7740
acatcatcac aaccaagcga acaaaaagca tctctgtata tgcatcagta aaacccgcat 7800
caacatgtat acctatccta gatcgatatt tccatccatc atcttcaatt cgtaactatg 7860
aatatgtatg gcacacacat acagatccaa aattaataaa tccaccaggt agtttgaaac 7920
agaattctac tccgatctag aacgaccgcc caaccagacc acatcatcac aaccaagaca 7980
aaaaaaagca tgaaaagatg acccgacaaa caagtgcacg gcatatattg aaataaagga 8040
aaagggcaaa ccaaacccta tgcaacgaaa caaaaaaaat catgaaatcg atcccgtctg 8100
cggaacggct agagccatcc caggattccc caaagagaaa cactggcaag ttagcaatca 8160
gaacgtgtct gacgtacagg tcgcatccgt gtacgaacgc tagcagcacg gatctaacac 8220
aaacacggat ctaacacaaa catgaacaga agtagaacta ccgggcccta accatggacc 8280
ggaacgccga tctagagaag gtagagaggg gggggggggg aggacgagcg gcgtaccttg 8340
aagcggaggt gccgacgggt ggatttgggg gagatctggt tgtgtgtgtg tgcgctccga 8400
acaacacgag gttggggaaa gagggtgtgg agggggtgtc tatttattac ggcgggcgag 8460
gaagggaaag cgaaggagcg gtgggaaagg aatcccccgt agctgccgtg ccgtgagagg 8520
aggaggaggc cgcctgccgt gccggctcac gtctgccgct ccgccacgca tttctggatg 8580
ccgacagcgg agcaagtcca acggtggagc ggaactctcg agaggggtcc agaggcagcg 8640
acagagatgc cgtgccgtct gcttcgcttg gcccgacgcg acgctgctgg ttcgctggtt 8700
ggtgtccgtt agactcgtcg acggcgttta acaggctggc attatctact cgaaacaaga 8760
aaaatgtttc cttagttttt ttaatttctt aaagggtatt tgtttaattt ttagtcactt 8820
tattttattc tattttatat ctaaattatt aaataaaaaa actaaaatag agttttagtt 8880
ttcttaattt agaggctaaa atagaataaa atagatgtac taaaaaaatt agtctataaa 8940
aaccattaac cctaaaccct aaatggatgt actaataaaa tggatgaagt attatatagg 9000
tgaagctatt tgcaaaaaaa aaggagaaca catgcacact aaaaagataa aactgtagag 9060
tcctgttgtc aaaatactca attgtccttt agaccatgtc taactgttca tttatatgat 9120
tctctaaaac actgatatta ttgtagtact atagattata ttattcgtag agtaaagttt 9180
aaatatatgt ataaagatag ataaactgca cttcaaacaa gtgtgacaaa aaaaatatgt 9240
ggtaattttt tataacttag acatgcaatg ctcattatct ctagagaggg gcacgaccgg 9300
gtcacgctgc actgcaggaa ttcgatatca agcttggcac tggccgtcgt tttacaacgt 9360
cgtgactggg aaaaccctgg cgttacccaa cttaatcgcc ttgcagcaca tccccctttc 9420
gccagctggc gtaatagcga agaggcccgc accgatcgcc cttcccaaca gttgcgcagc 9480
ctgaatggcg aatgctagag cagcttgagc ttggatcaga ttgtcgtttc ccgccttcag 9540
tttaaactat cagtgtttga caggatatat tggcgggtaa acctaagaga aaagagcgtt 9600
tattagaata acggatattt aaaagggcgt gaaaaggttt atccgttcgt ccatttgtat 9660
gtgcatgcca accacagggt tcccctcggg atcaaagtac tttgatccaa cccctccgct 9720
gctatagtgc agtcggcttc tgacgttcag tgcagccgtc ttctgaaaac gacatgtcgc 9780
acaagtccta agttacgcga caggctgccg ccctgccctt ttcctggcgt tttcttgtcg 9840
cgtgttttag tcgcataaag tagaatactt gcgactagaa ccggagacat tacgccatga 9900
acaagagcgc cgccgctggc ctgctgggct atgcccgcgt cagcaccgac gaccaggact 9960
tgaccaacca acgggccgaa ctgcacgcgg ccggctgcac caagctgttt tccgagaaga 10020
tcaccggcac caggcgcgac cgcccggagc tggccaggat gcttgaccac ctagccctgg 10080
cgacgttgtg acagtgacca ggctagaccg cctggcccgc agcacccgcg acctactgga 10140
cattgccgag cgcatccagg aggccggcgc gggcctgcgt agcctggcag agccgtgggc 10200
cgacaccacc acgccggccg gccgcatggt gttgaccgtg ttcgccggca ttgccgagtt 10260
cgagcgttcc ctaatcatcg accgcacccg gagcgggcgc gaggccgcca aggcccgagg 10320
cgtgaagttt ggcccccgcc ctaccctcac cccggcacag atcgcgcacg cccgcgagct 10380
gatcgaccag gaaggccgca ccgtgaaaga ggcggctgca ctgcttggcg tgcatcgctc 10440
gaccctgtac cgcgcacttg agcgcagcga ggaagtgacg cccaccgagg ccaggcggcg 10500
cggtgccttc cgtgaggacg cattgaccga ggccgacgcc ctggcggccg ccgagaatga 10560
acgccaagag gaacaagcat gaaaccgcac caggacggcc aggacgaacc gtttttcatt 10620
accgaagaga tcgaggcgga gatgatcgcg gccgggtacg tgttcgagcc gcccgcgcac 10680
gtctcaaccg tgcggctgca tgaaatcctg gccggtttgt ctgatgccaa gctggcggcc 10740
tggccggcca gcttggccgc tgaagaaacc gagcgccgcc gtctaaaaag gtgatgtgta 10800
tttgagtaaa acagcttgcg tcatgcggtc gctgcgtata tgatgcgatg agtaaataaa 10860
caaatacgca aggggaacgc atgaaggtta tcgctgtact taaccagaaa ggcgggtcag 10920
gcaagacgac catcgcaacc catctagccc gcgccctgca actcgccggg gccgatgttc 10980
tgttagtcga ttccgatccc cagggcagtg cccgcgattg ggcggccgtg cgggaagatc 11040
aaccgctaac cgttgtcggc atcgaccgcc cgacgattga ccgcgacgtg aaggccatcg 11100
gccggcgcga cttcgtagtg atcgacggag cgccccaggc ggcggacttg gctgtgtccg 11160
cgatcaaggc agccgacttc gtgctgattc cggtgcagcc aagcccttac gacatatggg 11220
caaccgccga cctggtggag ctggttaagc agcgcattga ggtcacggat ggaaggctac 11280
aagcggcctt tgtcgtgtcg cgggcgatca aaggcacgcg catcggcggt gaggttgccg 11340
aggcgctggc cgggtacgag ctgcccattc ttgagtcccg tatcacgcag cgcgtgagct 11400
acccaggcac tgccgccgcc ggcacaaccg ttcttgaatc agaacccgag ggcgacgctg 11460
cccgcgaggt ccaggcgctg gccgctgaaa ttaaatcaaa actcatttga gttaatgagg 11520
taaagagaaa atgagcaaaa gcacaaacac gctaagtgcc ggccgtccga gcgcacgcag 11580
cagcaaggct gcaacgttgg ccagcctggc agacacgcca gccatgaagc gggtcaactt 11640
tcagttgccg gcggaggatc acaccaagct gaagatgtac gcggtacgcc aaggcaagac 11700
cattaccgag ctgctatctg aatacatcgc gcagctacca gagtaaatga gcaaatgaat 11760
aaatgagtag atgaatttta gcggctaaag gaggcggcat ggaaaatcaa gaacaaccag 11820
gcaccgacgc cgtggaatgc cccatgtgtg gaggaacggg cggttggcca ggcgtaagcg 11880
gctgggttgt ctgccggccc tgcaatggca ctggaacccc caagcccgag gaatcggcgt 11940
gacggtcgca aaccatccgg cccggtacaa atcggcgcgg cgctgggtga tgacctggtg 12000
gagaagttga aggccgcgca ggccgcccag cggcaacgca tcgaggcaga agcacgcccc 12060
ggtgaatcgt ggcaagcggc cgctgatcga atccgcaaag aatcccggca accgccggca 12120
gccggtgcgc cgtcgattag gaagccgccc aagggcgacg agcaaccaga ttttttcgtt 12180
ccgatgctct atgacgtggg cacccgcgat agtcgcagca tcatggacgt ggccgttttc 12240
cgtctgtcga agcgtgaccg acgagctggc gaggtgatcc gctacgagct tccagacggg 12300
cacgtagagg tttccgcagg gccggccggc atggccagtg tgtgggatta cgacctggta 12360
ctgatggcgg tttcccatct aaccgaatcc atgaaccgat accgggaagg gaagggagac 12420
aagcccggcc gcgtgttccg tccacacgtt gcggacgtac tcaagttctg ccggcgagcc 12480
gatggcggaa agcagaaaga cgacctggta gaaacctgca ttcggttaaa caccacgcac 12540
gttgccatgc agcgtacgaa gaaggccaag aacggccgcc tggtgacggt atccgagggt 12600
gaagccttga ttagccgcta caagatcgta aagagcgaaa ccgggcggcc ggagtacatc 12660
gagatcgagc tagctgattg gatgtaccgc gagatcacag aaggcaagaa cccggacgtg 12720
ctgacggttc accccgatta ctttttgatc gatcccggca tcggccgttt tctctaccgc 12780
ctggcacgcc gcgccgcagg caaggcagaa gccagatggt tgttcaagac gatctacgaa 12840
cgcagtggca gcgccggaga gttcaagaag ttctgtttca ccgtgcgcaa gctgatcggg 12900
tcaaatgacc tgccggagta cgatttgaag gaggaggcgg ggcaggctgg cccgatccta 12960
gtcatgcgct accgcaacct gatcgagggc gaagcatccg ccggttccta atgtacggag 13020
cagatgctag ggcaaattgc cctagcaggg gaaaaaggtc gaaaaggtct ctttcctgtg 13080
gatagcacgt acattgggaa cccaaagccg tacattggga accggaaccc gtacattggg 13140
aacccaaagc cgtacattgg gaaccggtca cacatgtaag tgactgatat aaaagagaaa 13200
aaaggcgatt tttccgccta aaactcttta aaacttatta aaactcttaa aacccgcctg 13260
gcctgtgcat aactgtctgg ccagcgcaca gccgaagagc tgcaaaaagc gcctaccctt 13320
cggtcgctgc gctccctacg ccccgccgct tcgcgtcggc ctatcgcggc cgctggccgc 13380
tcaaaaatgg ctggcctacg gccaggcaat ctaccagggc gcggacaagc cgcgccgtcg 13440
ccactcgacc gccggcgccc acatcaaggc accctgcctc gcgcgtttcg gtgatgacgg 13500
tgaaaacctc tgacacatgc agctcccgga gacggtcaca gcttgtctgt aagcggatgc 13560
cgggagcaga caagcccgtc agggcgcgtc agcgggtgtt ggcgggtgtc ggggcgcagc 13620
catgacccag tcacgtagcg atagcggagt gtatactggc ttaactatgc ggcatcagag 13680
cagattgtac tgagagtgca ccatatgcgg tgtgaaatac cgcacagatg cgtaaggaga 13740
aaataccgca tcaggcgctc ttccgcttcc tcgctcactg actcgctgcg ctcggtcgtt 13800
cggctgcggc gagcggtatc agctcactca aaggcggtaa tacggttatc cacagaatca 13860
ggggataacg caggaaagaa catgtgagca aaaggccagc aaaaggccag gaaccgtaaa 13920
aaggccgcgt tgctggcgtt tttccatagg ctccgccccc ctgacgagca tcacaaaaat 13980
cgacgctcaa gtcagaggtg gcgaaacccg acaggactat aaagatacca ggcgtttccc 14040
cctggaagct ccctcgtgcg ctctcctgtt ccgaccctgc cgcttaccgg atacctgtcc 14100
gcctttctcc cttcgggaag cgtggcgctt tctcatagct cacgctgtag gtatctcagt 14160
tcggtgtagg tcgttcgctc caagctgggc tgtgtgcacg aaccccccgt tcagcccgac 14220
cgctgcgcct tatccggtaa ctatcgtctt gagtccaacc cggtaagaca cgacttatcg 14280
ccactggcag cagccactgg taacaggatt agcagagcga ggtatgtagg cggtgctaca 14340
gagttcttga agtggtggcc taactacggc tacactagaa ggacagtatt tggtatctgc 14400
gctctgctga agccagttac cttcggaaaa agagttggta gctcttgatc cggcaaacaa 14460
accaccgctg gtagcggtgg tttttttgtt tgcaagcagc agattacgcg cagaaaaaaa 14520
ggatctcaag aagatccttt gatcttttct acggggtctg acgctcagtg gaacgaaaac 14580
tcacgttaag ggattttggt catgcattct aggtactaaa acaattcatc cagtaaaata 14640
taatatttta ttttctccca atcaggcttg atccccagta agtcaaaaaa tagctcgaca 14700
tactgttctt ccccgatatc ctccctgatc gaccggacgc agaaggcaat gtcataccac 14760
ttgtccgccc tgccgcttct cccaagatca ataaagccac ttactttgcc atctttcaca 14820
aagatgttgc tgtctcccag gtcgccgtgg gaaaagacaa gttcctcttc gggcttttcc 14880
gtctttaaaa aatcatacag ctcgcgcgga tctttaaatg gagtgtcttc ttcccagttt 14940
tcgcaatcca catcggccag atcgttattc agtaagtaat ccaattcggc taagcggctg 15000
tctaagctat tcgtataggg acaatccgat atgtcgatgg agtgaaagag cctgatgcac 15060
tccgcataca gctcgataat cttttcaggg ctttgttcat cttcatactc ttccgagcaa 15120
aggacgccat cggcctcact catgagcaga ttgctccagc catcatgccg ttcaaagtgc 15180
aggacctttg gaacaggcag ctttccttcc agccatagca tcatgtcctt ttcccgttca 15240
acatcatagg tggtcccttt ataccggctg tccgtcattt ttaaatatag gttttcattt 15300
tctcccacca gcttatatac cttagcagga gacattcctt ccgtatcttt tacgcagcgg 15360
tatttttcga tcagtttttt caattccggt gatattctca ttttagccat ttattatttc 15420
cttcctcttt tctacagtat ttaaagatac cccaagaagc taattataac aagacgaact 15480
ccaattcact gttccttgca ttctaaaacc ttaaatacca gaaaacagct ttttcaaagt 15540
tgttttcaaa gttggcgtat aacatagtat cgacggagcc gattttgaaa ccgcggtgat 15600
cacaggcagc aacgctctgt catcgttaca atcaacatgc taccctccgc gagatcatcc 15660
gtgtttcaaa cccggcagct tagttgccgt tcttccgaat agcatcggta acatgagcaa 15720
agtctgccgc cttacaacgg ctctcccgct gacgccgtcc cggactgatg ggctgcctgt 15780
atcgagtggt gattttgtgc cgagctgccg gtcggggagc tgttggctgg ctggtggcag 15840
gatatattgt ggtgtaaaca aattgacgct tagacaactt aataacacat tgcggacgtt 15900
tttaatgtac tgaattaacg ccgaattaat tcgggggatc tggattttag tactggattt 15960
tggttttagg aattagaaat tttattgata gaagtatttt acaaatacaa atacatacta 16020
agggtttctt atatgctcaa cacatgagcg aaaccctata ggaaccctaa ttcccttatc 16080
tgggaactac tcacacatta ttatggagaa actcgagctt gtcgatcgac agatccggtc 16140
ggcatctact ctatttcttt gccctcggac gagtgctggg gcgtcggttt ccactatcgg 16200
cgagtacttc tacacagcca tcggtccaga cggccgcgct tctgcgggcg atttgtgtac 16260
gcccgacagt cccggctccg gatcggacga ttgcgtcgca tcgaccctgc gcccaagctg 16320
catcatcgaa attgccgtca accaagctct gatagagttg gtcaagacca atgcggagca 16380
tatacgcccg gagtcgtggc gatcctgcaa gctccggatg cctccgctcg aagtagcgcg 16440
tctgctgctc catacaagcc aaccacggcc tccagaagaa gatgttggcg acctcgtatt 16500
gggaatcccc gaacatcgcc tcgctccagt caatgaccgc tgttatgcgg ccattgtccg 16560
tcaggacatt gttggagccg aaatccgcgt gcacgaggtg ccggacttcg gggcagtcct 16620
cggcccaaag catcagctca tcgagagcct gcgcgacgga cgcactgacg gtgtcgtcca 16680
tcacagtttg ccagtgatac acatggggat cagcaatcgc gcatatgaaa tcacgccatg 16740
tagtgtattg accgattcct tgcggtccga atgggccgaa cccgctcgtc tggctaagat 16800
cggccgcagc gatcgcatcc atagcctccg cgaccggttg tagaacagcg ggcagttcgg 16860
tttcaggcag gtcttgcaac gtgacaccct gtgcacggcg ggagatgcaa taggtcaggc 16920
tctcgctaaa ctccccaatg tcaagcactt ccggaatcgg gagcgcggcc gatgcaaagt 16980
gccgataaac ataacgatct ttgtagaaac catcggcgca gctatttacc cgcaggacat 17040
atccacgccc tcctacatcg aagctgaaag cacgagattc ttcgccctcc gagagctgca 17100
tcaggtcgga gacgctgtcg aacttttcga tcagaaactt ctcgacagac gtcgcggtga 17160
gttcaggctt tttcatatct cattgccccc cggatctgcg aaagctcgag agagatagat 17220
ttgtagagag agactggtga tttcagcgtg tcctctccaa atgaaatgaa cttccttata 17280
tagaggaagg tcttgcgaag gatagtggga ttgtgcgtca tcccttacgt cagtggagat 17340
atcacatcaa tccacttgct ttgaagacgt ggttggaacg tcttcttttt ccacgatgct 17400
cctcgtgggt gggggtccat ctttgggacc actgtcggca gaggcatctt gaacgatagc 17460
ctttccttta tcgcaatgat ggcatttgta ggtgccacct tccttttcta ctgtcctttt 17520
gatgaagtga cagatagctg ggcaatggaa tccgaggagg tttcccgata ttaccctttg 17580
ttgaaaagtc tcaatagccc tttggtcttc tgagactgta tctttgatat tcttggagta 17640
gacgagagtg tcgtgctcca ccatgttatc acatcaatcc acttgctttg aagacgtggt 17700
tggaacgtct tctttttcca cgatgctcct cgtgggtggg ggtccatctt tgggaccact 17760
gtcggcagag gcatcttgaa cgatagcctt tcctttatcg caatgatggc atttgtaggt 17820
gccaccttcc ttttctactg tccttttgat gaagtgacag atagctgggc aatggaatcc 17880
gaggaggttt cccgatatta ccctttgttg aaaagtctca atagcccttt ggtcttctga 17940
gactgtatct ttgatattct tggagtagac gagagtgtcg tgctccacca tgttggcaag 18000
ctgctctagc caatacgcaa accgcctctc cccgcgcgtt ggccgattca ttaatgcagc 18060
tggcacgaca ggtttcccga ctggaaagcg ggcagtgagc gcaacgcaat taatgtgagt 18120
tagctcactc attaggcacc ccaggcttta cactttatgc ttccggctcg tatgttgtgt 18180
ggaattgtga gcggataaca atttcacaca ggaaacagct atgaccatga ttac 18234
<210> 5
<211> 1876
<212> DNA
<213> Artificial sequence
<400> 5
tcgaggtcat tcatatgctt gagaagagag tcgggatagt ccaaaataaa acaaaggtaa 60
gattacctgg tcaaaagtga aaacatcagt taaaaggtgg tataaagtaa aatatcggta 120
ataaaaggtg gcccaaagtg aaatttactc ttttctacta ttataaaaat tgaggatgtt 180
tttgtcggta ctttgatacg tcatttttgt atgaattggt ttttaagttt attcgctttt 240
ggaaatgcat atctgtattt gagtcgggtt ttaagttcgt ttgcttttgt aaatacagag 300
ggatttgtat aagaaatatc tttaaaaaaa cccatatgct aatttgacat aatttttgag 360
aaaaatatat attcaggcga attctcacaa tgaacaataa taagattaaa atagctttcc 420
cccgttgcag cgcatgggta ttttttctag taaaaataaa agataaactt agactcaaaa 480
catttacaaa aacaacccct aaagttccta aagcccaaag tgctatccac gatccatagc 540
aagcccagcc caacccaacc caacccaacc caccccagtc cagccaactg gacaatagtc 600
tccacacccc cccactatca ccgtgagttg tccgcacgca ccgcacgtct cgcagccaaa 660
aaaaaaaaaa gaaagaaaaa aaagaaaaag aaaaaacagc aggtgggtcc gggtcgtggg 720
ggccggaaac gcgaggagga tcgcgagcca gcgacgaggc cggccctccc tccgcttcca 780
aagaaacgcc ccccatcgcc actatataca tacccccccc tctcctccca tccccccaac 840
cctaccacca ccaccaccac cacctccacc tcctcccccc tcgctgccgg acgacgagct 900
cctcccccct ccccctccgc cgccgccgcg ccggtaacca ccccgcccct ctcctctttc 960
tttctccgtt ttttttttcc gtctcggtct cgatctttgg ccttggtagt ttgggtgggc 1020
gagaggcggc ttcgtgcgcg cccagatcgg tgcgcgggag gggcgggatc tcgcggctgg 1080
ggctctcgcc ggcgtggatc cggcccggat ctcgcgggga atggggctct cggatgtaga 1140
tctgcgatcc gccgttgttg ggggagatga tggggggttt aaaatttccg ccatgctaaa 1200
caagatcagg aagaggggaa aagggcacta tggtttatat ttttatatat ttctgctgct 1260
tcgtcaggct tagatgtgct agatctttct ttcttctttt tgtgggtaga atttgaatcc 1320
ctcagcattg ttcatcggta gtttttcttt tcatgatttg tgacaaatgc agcctcgtgc 1380
ggagcttttt tgtaggtaga caaagcttgt cgaggctgag taaggttaac tttgagtatt 1440
atggcattgg aaaagccatt gttctgcttg taatttactg tgttctttca gttttgtttt 1500
cggacatcaa gttaacaaaa aaaaaaaaaa aaaaaaaaaa atttaacaaa aaaaaaaaaa 1560
aaaaaaaaaa atttaacaaa aaaaaaaaaa aaaaaaaaaa atttaaagag ctcgaatttc 1620
cccgatcgtt caaacatttg gcaataaagt ttcttaagat tgaatcctgt tgccggtctt 1680
gcgatgatta tcatataatt tctgttgaat tacgttaagc atgtaataat taacatgtaa 1740
tgcatgacgt tatttatgag atgggttttt atgattagag tcccgcaatt atacatttaa 1800
tacgcgatag aaaacaaaat atagcgcgca aactaggata aattatcgcg cgcggtgtca 1860
tctatgttac tagatc 1876
<210> 6
<211> 175
<212> DNA
<213> Artificial sequence
<400> 6
gttttagagc tagaaatagc aagttaaaat aaggctagtc cgttatcaac ttgaaaaagt 60
ggcaccgagt cggtgcaaca aagcaccagt ggtctagtgg tagaatagta ccctgccacg 120
gtacagaccc gggttcgatt cccggctggt gcaagcttgt cgaggctgag taagg 175
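The listing above uses the usual sequence-listing layout: groups of ten bases separated by spaces, with a running base count at the end of each line. As a minimal sketch (the helper name `parse_listing_block` is an assumption made for illustration and is not part of the patent), the following Python snippet strips those counters from the three printed lines of SEQ ID No.6 and checks the result against the length declared in the <211> field.

```python
def parse_listing_block(lines):
    """Join sequence-listing lines into one string, dropping the trailing running count."""
    parts = []
    for line in lines:
        tokens = line.split()
        # The last token on each line is the cumulative base count, not sequence.
        if tokens and tokens[-1].isdigit():
            tokens = tokens[:-1]
        parts.append("".join(tokens))
    return "".join(parts)


# SEQ ID No.6 exactly as printed in the listing (<211> 175).
seq6_lines = [
    "gttttagagc tagaaatagc aagttaaaat aaggctagtc cgttatcaac ttgaaaaagt 60",
    "ggcaccgagt cggtgcaaca aagcaccagt ggtctagtgg tagaatagta ccctgccacg 120",
    "gtacagaccc gggttcgatt cccggctggt gcaagcttgt cgaggctgag taagg 175",
]

seq6 = parse_listing_block(seq6_lines)
declared_length = 175  # value of the <211> field for SEQ ID No.6

# The parsed sequence should match the declared length and contain only a/c/g/t.
assert len(seq6) == declared_length
assert set(seq6) <= set("acgt")
```

The same check can be applied to the longer entries to catch lines lost or truncated during extraction.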

Claims (12)

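1. A vector, designated vector A, characterized in that: vector A expresses the following (A1) and (A2) and contains the following (A3);
(A1) an antibiotic resistance selectable marker protein;
(A2) a fusion protein formed by fusing a1) and a2);
a1) a Cas9 nickase;
a2) a reverse transcriptase;
the sequence of the expression cassette containing the gene encoding the fusion protein is as shown at positions 14-9317 of SEQ ID No.1;
(A3) DNA fragment A; DNA fragment A contains:
b1) a promoter;
b2) polyA and a terminator sequence;
b3) DNA sequence I located between b1) and b2); DNA sequence I expresses a guide RNA set that targets the OsALS gene and mutates the serine at position 627 of the protein encoded by the OsALS gene to isoleucine; the sequence of the guide RNA set expressed by DNA sequence I is positions 10732-11179 of SEQ ID No.1;
b4) an insertion site located between b1) and b2); the insertion site is used for inserting DNA sequence II; DNA sequence II expresses one or several guide RNA sets directed against one or several target genes;
each guide RNA set is a single tandem array consisting, in order from the 5' end to the 3' end, of tRNA, pegRNA, tRNA, sgRNA and tRNA.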
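Claim 1 identifies its elements by position ranges within SEQ ID No.1. The sketch below shows how such ranges could be turned into lengths and string slices, assuming the positions are 1-based and inclusive (the usual convention for sequence listings, but an assumption here). The function names and the toy sequence are hypothetical placeholders, not data from the patent.

```python
def region_length(start, end):
    """Length of a range given as 1-based inclusive positions."""
    return end - start + 1


def claimed_region(sequence, start, end):
    """Return positions start..end of `sequence`, counted 1-based and inclusive."""
    if not (1 <= start <= end <= len(sequence)):
        raise ValueError("positions fall outside the sequence")
    # Python slices are 0-based and end-exclusive, hence start - 1.
    return sequence[start - 1:end]


# Ranges recited in claim 1 for SEQ ID No.1.
fusion_cassette = (14, 9317)     # expression cassette of the fusion protein
als_guide_set = (10732, 11179)   # guide RNA set targeting OsALS

print(region_length(*fusion_cassette))  # 9304
print(region_length(*als_guide_set))    # 448

# Toy string standing in for a real sequence (placeholder only).
toy = "aaccggttaacc"
print(claimed_region(toy, 3, 6))        # ccgg
```

With the full SEQ ID No.1 loaded as a string, `claimed_region(seq1, 10732, 11179)` would return the 448-base stretch recited in b3) under the same numbering assumption.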
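2. A vector, designated vector B, characterized in that: vector B expresses the following (B1) and contains the following (B2) and (B3);
(B1) a fusion protein formed by fusing a1) and a2);
a1) a Cas9 nickase;
a2) a reverse transcriptase;
the sequence of the expression cassette containing the gene encoding the fusion protein is as shown at positions 14-9317 of SEQ ID No.1;
(B2) an expression cassette containing a hygromycin resistance selectable marker gene mutant; the hygromycin resistance selectable marker gene mutant is obtained by mutating the codon encoding the tyrosine at position 46 in the wild-type hygromycin resistance selectable marker gene to a stop codon;
(B3) DNA fragment B; DNA fragment B contains:
c1) a promoter;
c2) polyA and a terminator sequence;
c3) DNA sequence I located between c1) and c2); DNA sequence I expresses a guide RNA set that targets the OsALS gene and mutates the serine at position 627 of the protein encoded by the OsALS gene to isoleucine; the sequence of the guide RNA set expressed by DNA sequence I is positions 10732-11179 of SEQ ID No.1;
c4) an insertion site located between c1) and c2); the insertion site is used for inserting DNA sequence II; DNA sequence II expresses one or several guide RNA sets directed against one or several target genes;
c5) DNA sequence III located between c1) and c2); DNA sequence III expresses a guide RNA set that targets the hygromycin resistance selectable marker gene mutant of (B2) and restores the stop codon at the position encoding amino acid 46 of the hygromycin resistance selectable marker gene mutant to isoleucine; the sequence of the guide RNA set expressed by DNA sequence III is positions 10732-11186 of SEQ ID No.2;
each guide RNA set is a single tandem array consisting, in order from the 5' end to the 3' end, of tRNA, pegRNA, tRNA, sgRNA and tRNA.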
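The marker in (B2) is disabled by a stop codon at position 46, and the guide RNA set in c5) is claimed to replace that stop codon with an isoleucine codon. The following sketch illustrates that on/off logic by translating a toy open reading frame with the standard genetic code. The ORF, the helper names and the specific codons chosen (`TAG` for the stop, `ATT` for isoleucine) are assumptions for illustration; the claims do not state which codons are used, and the toy ORF is not the hygromycin resistance gene.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def translate(orf):
    """Translate an ORF codon by codon, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(orf) - len(orf) % 3, 3):
        aa = CODON_TABLE[orf[i:i + 3].upper()]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


def set_codon(orf, position, codon):
    """Replace the codon at the 1-based amino-acid `position` with `codon`."""
    start = (position - 1) * 3
    return orf[:start] + codon + orf[start + 3:]


# Toy ORF (hypothetical): Met, 44 x Ala, Tyr at position 46, 4 x Gly, stop.
wild_type = "ATG" + "GCT" * 44 + "TAT" + "GGT" * 4 + "TAA"

mutant = set_codon(wild_type, 46, "TAG")    # premature stop: marker inactive
restored = set_codon(mutant, 46, "ATT")     # stop replaced by an Ile codon

print(len(translate(wild_type)))  # 50
print(len(translate(mutant)))     # 45, truncated before position 46
print(translate(restored)[45])    # I
```

In this toy model only the restored ORF yields a full-length product again, which is the property that would let restored resistance report that editing took place.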
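3. The vector according to claim 1 or 2, characterized in that: in DNA fragment A and in DNA fragment B, the promoter is in each case the rice constitutive Actin promoter; and/or the terminator sequence is in each case the Nos terminator sequence.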
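4. The vector according to claim 1 or 2, characterized in that: in (A1), the antibiotic resistance selectable marker protein is expressed from an expression cassette containing the gene encoding the antibiotic resistance selectable marker protein; in that expression cassette, expression of the gene encoding the antibiotic resistance selectable marker protein is driven by the 35S promoter;
and/or
in (B2), in the expression cassette containing the hygromycin resistance selectable marker gene mutant, expression of the hygromycin resistance selectable marker gene mutant is driven by the 35S promoter.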
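5. The vector according to claim 4, characterized in that: in (A1), the sequence of the expression cassette containing the gene encoding the antibiotic resistance selectable marker protein is as shown at positions 18046-20029 of SEQ ID No.1;
and/or
in (B2), the sequence of the expression cassette containing the hygromycin resistance selectable marker gene mutant is as shown at positions 18424-20407 of SEQ ID No.2.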
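6. The vector according to claim 1 or 2, characterized in that: the sequence of vector A is as shown in SEQ ID No.1; and/or
the sequence of vector B is as shown in SEQ ID No.2.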
7. Use of the vector according to any one of claims 1-6 in gene editing of a recipient plant.
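8. A method for gene editing of a recipient plant, comprising the following steps: inserting the DNA sequence II of any one of claims 1-6 into the insertion site of the vector according to any one of claims 1-6 to obtain a recombinant vector; and introducing the recombinant vector into the recipient plant, thereby achieving gene editing of the recipient plant.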
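The construction step of claim 8 can be pictured as string assembly: each guide RNA set is ordered tRNA, pegRNA, tRNA, sgRNA, tRNA from 5' to 3', one or several sets make up DNA sequence II, and DNA sequence II is placed at the insertion site. The sketch below is illustrative only. All names and placeholder strings are hypothetical, it assumes the same tRNA unit is reused at every tRNA position, and it simply concatenates sets, since the claims do not say whether adjacent sets share a tRNA unit.

```python
def build_guide_set(trna, pegrna, sgrna):
    """Order the units 5' to 3' as tRNA, pegRNA, tRNA, sgRNA, tRNA."""
    return trna + pegrna + trna + sgrna + trna


def build_sequence_ii(targets, trna):
    """Concatenate one guide RNA set per target gene (assumption: plain concatenation)."""
    return "".join(build_guide_set(trna, t["pegRNA"], t["sgRNA"]) for t in targets)


def insert_at_site(vector, site_index, insert):
    """Insert `insert` into `vector` before the 0-based index `site_index`."""
    return vector[:site_index] + insert + vector[site_index:]


# Placeholder labels instead of real sequences, so the order is easy to read.
targets = [
    {"pegRNA": "[peg1]", "sgRNA": "[sg1]"},
    {"pegRNA": "[peg2]", "sgRNA": "[sg2]"},
]
sequence_ii = build_sequence_ii(targets, trna="[tRNA]")

# Toy vector: promoter and terminator labels with the site between them.
vector = "[promoter][terminator]"
recombinant = insert_at_site(vector, len("[promoter]"), sequence_ii)
print(recombinant)
```

A real construct would use the actual unit sequences and the cloning site of the vector; the sketch only fixes the ordering stated in the claims.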
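9. The method according to claim 8, characterized in that: the recipient plant is a monocotyledonous plant;
and/or
the gene editing is multigene editing.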
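10. The method according to claim 9, characterized in that:
the monocotyledonous plant is a plant of the family Poaceae;
and/or
the multigene editing is simultaneous editing of two or more genes.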
11. The method according to claim 10, characterized in that:
the plant of the family Poaceae is a plant of the genus Oryza.
12. The method according to claim 11, characterized in that:
the plant of the genus Oryza is rice.
CN202210466252.8A 2022-04-29 2022-04-29 Method for precise multigene editing in rice by means of a surrogate prime editor Active CN114908116B (zh)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210466252.8A CN114908116B (zh) 2022-04-29 2022-04-29 Method for precise multigene editing in rice by means of a surrogate prime editor

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210466252.8A CN114908116B (zh) 2022-04-29 2022-04-29 Method for precise multigene editing in rice by means of a surrogate prime editor

Publications (2)

Publication Number Publication Date
CN114908116A CN114908116A (zh) 2022-08-16
CN114908116B true CN114908116B (zh) 2024-05-10

Family

ID=82765470

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210466252.8A Active CN114908116B (zh) 2022-04-29 2022-04-29 Method for precise multigene editing in rice by means of a surrogate prime editor

Country Status (1)

Country Link
CN (1) CN114908116B (zh)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
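CN110951743A (zh) * 2019-12-31 2020-04-03 Beijing Academy of Agriculture and Forestry Sciences Method for improving plant gene replacement efficiency
CN111378051A (zh) * 2020-03-25 2020-07-07 Beijing Academy of Agriculture and Forestry Sciences Pe-p2 prime editing system and its use in genome base editing
WO2021165508A1 (en) * 2020-02-21 2021-08-26 Biogemma Prime editing technology for plant genome engineering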

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110157727A (zh) * 2017-12-21 2019-08-23 Institute of Genetics and Developmental Biology, Chinese Academy of Sciences Plant base editing method

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
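Precise Modifications of Both Exogenous and Endogenous Genes in Rice by Prime Editing; Huiyuan Li et al.; Mol Plant; pp. 671-673 *
Creation of OsCOL9 rice mutants using CRISPR/Cas9 technology; Liu Wei; Liu Hao; Dong Shuangyu; Gu Fengwei; Chen Zhiqiang; Wang Jiafeng; Wang Hui; Acta Agriculturae Boreali-Sinica (No. 04); abstract *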

Also Published As

Publication number Publication date
CN114908116A (zh) 2022-08-16

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant