CN106701788B - Pentostatin and Ara-A (arabinofuranosyladenine) biosynthetic gene cluster and its application - Google Patents

Pentostatin and Ara-A (arabinofuranosyladenine) biosynthetic gene cluster and its application

Publication number
CN106701788B
Authority
CN
China
Prior art keywords
ala
leu
val
gly
arg
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201611181302.9A
Other languages
English (en)
Other versions
CN106701788A (zh)
Inventor
陈文青
巫攀
邓子新
万丹
徐顾丹
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Wuhan University WHU
Original Assignee
Wuhan University WHU
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Wuhan University WHU
Priority to CN201611181302.9A
Publication of CN106701788A
Application granted
Publication of CN106701788B

Classifications

    • C07K14/195 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from bacteria
    • C12N9/0006 Oxidoreductases (1.) acting on CH-OH groups as donors (1.1)
    • C12N9/1077 Pentosyltransferases (2.4.2)
    • C12N9/14 Hydrolases (3)
    • C12N9/93 Ligases (6)
    • C12P19/38 Nucleosides
    • C12P19/40 Nucleosides having a condensed ring system containing a six-membered ring having two nitrogen atoms in the same ring, e.g. purine nucleosides
    • C12Y101/01 Oxidoreductases acting on the CH-OH group of donors (1.1) with NAD+ or NADP+ as acceptor (1.1.1)
    • C12Y204/02001 Purine-nucleoside phosphorylase (2.4.2.1)
    • C12Y204/02007 Adenine phosphoribosyltransferase (2.4.2.7)
    • C12Y303/01001 Adenosylhomocysteinase (3.3.1.1)
    • C12Y603/02006 Phosphoribosylaminoimidazolesuccinocarboxamide synthase (6.3.2.6)

Abstract

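The invention relates to the cloning, sequencing, analysis, functional study and application of the biosynthetic gene cluster for two natural products made by Streptomyces antibioticus: pentostatin, used to treat lymphocytic leukemia, and arabinofuranosyladenine (Ara-A), which is active against herpes virus and varicella virus. The biosynthetic genes for the two compounds lie in the same gene cluster but are independent of each other. The whole cluster contains 10 genes: 3 genes related to pentostatin synthesis, 5 genes related to Ara-A synthesis, and 2 genes related to the transport and regulation of both. Genetic manipulation of these biosynthetic genes can block or increase the biosynthesis of pentostatin and Ara-A. The genes and proteins provided by the invention can be used for genetic engineering of these compounds, protein expression and enzyme-catalysed reactions, and can also be used to search for and discover compounds, genes or proteins of use in medicine, industry or agriculture.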

Description

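Pentostatin and Ara-A biosynthetic gene cluster and its application
Technical Field
The invention belongs to the field of microbial gene resources and genetic engineering. It specifically relates to the cloning, sequence analysis, in vivo functional verification, in vitro biochemical study and application of the biosynthetic gene cluster of the nucleoside antibiotics pentostatin (PTN) and arabinofuranosyladenine (Ara-A).
Background Art
In 1967, the purine nucleoside antibiotic Ara-A was first isolated from the fermentation broth of Streptomyces antibioticus NRRL 3238 (S. antibioticus NRRL 3238) (Biochemistry, 1972, 11, 911-916). Ara-A has broad activity against DNA viruses and has been used to treat herpes simplex encephalitis, neonatal herpes, herpes zoster and chronic myeloid leukemia (Biochemistry, 1972, 11, 911-916; Trends Pharmacol Sci, 2010, 31, 255-265). Ara-A is very readily converted to arabinofuranosylhypoxanthine (Ara-I) by adenosine deaminase (ADA), an enzyme that is widespread in cells; earlier researchers also isolated pentostatin (PTN), a potent inhibitor of adenosine deaminase, from the fermentation broth of the same strain, S. antibioticus NRRL 3238. Pentostatin is likewise a purine nucleoside antibiotic and is used clinically to treat acute T-cell lymphoblastic leukemia, chronic lymphocytic leukemia, cutaneous T-cell lymphoma and hairy cell leukemia (J Antibiot (Tokyo), 1992, 45, 1914-1918).
In terms of chemical structure, Ara-A is a structural analogue of adenosine and differs from it in the inverted stereochemistry of the hydroxyl group at C-2'; pentostatin contains an unusual 1,3-diaza seven-membered ring (1,3-diazepine). Earlier isotope feeding experiments indicated that the epimerization of the C-2' hydroxyl in Ara-A biosynthesis arises from an exchange of the hydroxyl groups at C-2' and C-3' (Arch Biochem Biophys, 1989, 270, 374-382). Isotope labeling experiments likewise established that pentostatin biosynthesis uses adenosine as precursor, with C-1 of D-ribose inserted between C-6 and N-1 of the purine ring to form the 1,3-diazepine ring (Biochemistry, 1984, 23, 904-907; Biochemistry, 1987, 26, 5636-5641); it was also proposed that the initial stage of pentostatin biosynthesis resembles the L-histidine biosynthetic pathway (Biochemistry, 1988, 27, 5790-5795). Although Ara-A and pentostatin are widely used clinically and their chemical synthesis has been well studied, little has been learned over the past decades about the biosynthesis of these two purine nucleoside antibiotics.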
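Summary of the Invention
To overcome the shortcomings of the prior art, the invention takes as target molecules the purine nucleoside antibiotics pentostatin and Ara-A, natural products with activity against DNA viruses produced by a strain of Streptomyces antibioticus (S. antibioticus NRRL 3238). Their biosynthetic gene cluster was cloned, and sequence analysis, functional verification and in vitro biochemical experiments revealed that a single gene cluster contains two independent biosynthetic pathways. The whole gene cluster comprises 10 genes, and its nucleotide sequence is given at positions 20237-28760 of SEQ ID NO: 1. The genes responsible for pentostatin biosynthesis are penA, penB and penC (3 genes); the genes responsible for Ara-A biosynthesis are penD, penG, penH, penI and penJ (5 genes); and the genes responsible for the transport and regulation of pentostatin and Ara-A are penE and penF (2 genes).
The invention also provides a nucleotide sequence encoding an ATP phosphoribosyltransferase, whose encoded amino acid sequence is SEQ ID NO. 2; the gene is named penA and is located at bases 26377-27285 of SEQ ID NO. 1.
The invention also provides a nucleotide sequence encoding an NADP(H)-dependent short-chain dehydrogenase, whose encoded amino acid sequence is SEQ ID NO. 3; the gene is named penB and is located at bases 25663-26367 of SEQ ID NO. 1.
The invention also provides a nucleotide sequence encoding a SAICAR synthetase, whose encoded amino acid sequence is SEQ ID NO. 4; the gene is named penC and is located at bases 24933-25676 of SEQ ID NO. 1.
The invention also provides a nucleotide sequence encoding a SAH hydrolase, whose encoded amino acid sequence is SEQ ID NO. 5; the gene is named penD and is located at bases 23743-24933 of SEQ ID NO. 1.
The invention also provides a nucleotide sequence encoding a membrane transport protein, whose encoded amino acid sequence is SEQ ID NO. 6; the gene is named penE and is located at bases 22467-23672 of SEQ ID NO. 1.
The invention also provides a nucleotide sequence encoding a nucleoside phosphorylase, whose encoded amino acid sequence is SEQ ID NO. 7; the gene is named penF and is located at bases 21394-22470 of SEQ ID NO. 1.
The invention also provides a nucleotide sequence encoding a SAH hydrolase, whose encoded amino acid sequence is SEQ ID NO. 8; the gene is named penG and is located at bases 20237-21400 of SEQ ID NO. 1.
The invention also provides a nucleotide sequence encoding an oxidoreductase, whose encoded amino acid sequence is SEQ ID NO. 9; the gene is named penH and is located at bases 28119-28760 of SEQ ID NO. 1.
The invention also provides a nucleotide sequence encoding an oxidoreductase, whose encoded amino acid sequence is SEQ ID NO. 10; the gene is named penI and is located at bases 27743-28312 of SEQ ID NO. 1.
The invention also provides a nucleotide sequence encoding an oxidoreductase, whose encoded amino acid sequence is SEQ ID NO. 11; the gene is named penJ and is located at bases 27289-27774 of SEQ ID NO. 1.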
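The gene coordinates listed above can be collected into a small lookup table. The following is a minimal Python sketch, not part of the patent: it assumes the coordinates are 1-based and inclusive (the usual convention in sequence listings), and the helper names `gene_length`, `cluster_span` and `extract` are hypothetical. It checks that each stated interval is a whole number of codons and that the ten genes together span the stated cluster region, positions 20237-28760 of SEQ ID NO. 1.

```python
# Gene coordinates in SEQ ID NO. 1 as stated in the text
# (assumed 1-based and inclusive; strand orientation is not modelled here).
GENES = {
    "penA": (26377, 27285),
    "penB": (25663, 26367),
    "penC": (24933, 25676),
    "penD": (23743, 24933),
    "penE": (22467, 23672),
    "penF": (21394, 22470),
    "penG": (20237, 21400),
    "penH": (28119, 28760),
    "penI": (27743, 28312),
    "penJ": (27289, 27774),
}


def gene_length(start, end):
    """Length in bp of an inclusive 1-based interval."""
    return end - start + 1


def cluster_span(genes):
    """Smallest start and largest end over all genes."""
    starts = [start for start, _ in genes.values()]
    ends = [end for _, end in genes.values()]
    return min(starts), max(ends)


def extract(sequence, start, end):
    """Slice an inclusive 1-based interval out of a sequence string."""
    return sequence[start - 1:end]


if __name__ == "__main__":
    for name, (start, end) in sorted(GENES.items(), key=lambda kv: kv[1]):
        length = gene_length(start, end)
        print(f"{name}: {start}-{end}, {length} bp, {length // 3} codons")
    first, last = cluster_span(GENES)
    print(f"cluster: {first}-{last}, {gene_length(first, last)} bp")
```

With the full SEQ ID NO. 1 loaded as a string, `extract(seq, *GENES["penB"])` would return the penB interval as written in the listing; whether a gene is encoded on the reverse strand would still need to be determined separately.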
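Starting from the cloning of the biosynthetic gene cluster, the biosynthesis was studied with a combination of microbiology, molecular biology, biochemistry and organic chemistry. Study of the biosynthetic mechanism revealed that one gene cluster contains two independent biosynthetic pathways. This unusual arrangement embodies a mechanism in which one product protects the other, as well as the enzymatic mechanisms by which unusual chemical structures, including that of pentostatin, are formed.
On this basis, using the principles of metabolic engineering and rational modification of the biosynthetic pathway by combinatorial biology, new drugs can be explored that are structurally stable, more active and producible in quantity by microbial fermentation.
Applications of the pentostatin and Ara-A biosynthetic gene cluster of the invention include (but are not limited to):
(1) The invention also provides a route to microorganisms in which the pentostatin and Ara-A biosynthetic gene cluster is disrupted, wherein at least one of the genes comprises a nucleotide sequence from SEQ ID NO. 1.
(2) Cloned DNA comprising the nucleotide sequence provided by the invention, or at least part of it, can be used to locate further library plasmids in the genomic library of S. antibioticus NRRL 3238. These library plasmids contain at least part of the sequence of the invention.
(3) The nucleotide sequence provided by the invention, or at least part of it, can be modified or mutated. The routes include insertion, substitution or deletion, polymerase chain reaction, error-prone polymerase chain reaction, site-specific mutagenesis, religation of different sequences, directed evolution using different parts of the sequence or homologous sequences from other sources, and mutagenesis by ultraviolet light or chemical reagents.
(4) Cloned genes comprising the nucleotide sequence provided by the invention, or at least part of it, can be expressed in a heterologous host through a suitable expression system to obtain the corresponding enzymes, or to obtain higher biological activity or yield. Such hosts include Streptomyces, Pseudomonas, Escherichia coli, Bacillus, yeast, plants and animals.
(5) The amino acid sequences provided by the invention can be used to isolate the desired proteins and to prepare antibodies.
(6) Polypeptides comprising the amino acid sequences provided by the invention, or at least part of them, may retain biological activity or even gain new biological activity after certain amino acids are removed or replaced, or may show improved yield, optimized protein kinetics or other desired properties.
(7) Genes or gene clusters comprising the nucleotide sequence provided by the invention, or at least part of it, can be expressed in a heterologous host, and their function in the host's metabolic network can be studied by DNA chip technology.
(8) Genes or gene clusters comprising the nucleotide sequence provided by the invention, or at least part of it, can be used to construct recombinant plasmids by genetic recombination to obtain new biosynthetic pathways; new pathways can also be obtained by insertion, substitution, deletion or inactivation.
(9) The protein encoded by a nucleotide sequence provided by the invention can catalyse the conversion of 6-keto pentostatin (6-keto PTN) to pentostatin, and can be recombined with all or part of the biosynthetic pathways of other natural products to obtain new purine nucleoside compounds.
(10) The protein encoded by a nucleotide sequence provided by the invention can catalyse the conversion of adenosine and its structural analogues to inosine and its structural analogues, and can be recombined with all or part of the biosynthetic pathways of other natural products to obtain new purine nucleoside compounds.
Accordingly, recombinant vectors, expression cassettes, transgenic cell lines and recombinant strains containing the above gene cluster also fall within the scope of protection of the invention.
The use of the above proteins, gene cluster, recombinant vectors, expression cassettes, transgenic cell lines or recombinant strains in the synthesis of pentostatin and/or Ara-A also falls within the scope of protection of the invention.
Another object of the invention is to provide a method for synthesizing pentostatin or Ara-A.
The method provided by the invention comprises fermenting the above recombinant strain and collecting the fermentation product to obtain pentostatin and/or Ara-A.
In summary, the information provided by the invention on all genes and proteins related to pentostatin and Ara-A biosynthesis helps in understanding the biosynthetic mechanism of purine nucleoside antibiotics and provides material and knowledge for further genetic engineering. The genes and proteins provided by the invention can also be used to search for and discover compounds, genes or proteins of use in medicine, industry or agriculture.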
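Brief Description of the Drawings
Figure 1: Chemical structures of pentostatin and Ara-A.
Figure 2: A) Genetic organization of the pentostatin and Ara-A biosynthetic gene cluster. B) Schematic of the disruption of the pentostatin and Ara-A biosynthetic gene cluster and high-performance liquid chromatography (HPLC) analysis of the fermentation products; WT, wild type; ST, standard; TD3, mutant strain.
Figure 3: Schematic of the disruption of the pentostatin and Ara-A biosynthetic genes and liquid chromatography-mass spectrometry (LC-MS) analysis of the fermentation products.
Figure 4: Proposed biosynthetic pathways of pentostatin and Ara-A.
Figure 5: Functional analysis of the oxidoreductase.
A) SDS-PAGE analysis of the PenB protein; B) scheme of the PenB biochemical reaction; C) LC-MS analysis of the PenB reaction with PTN as substrate; D) LC-MS analysis of the PenB reaction with 6-keto PTN as substrate.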
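Detailed Description of the Embodiments
The features and advantages of the invention can be further understood from the following detailed description together with the drawings. The examples provided merely illustrate the method of the invention and do not limit the rest of the disclosure in any way.
Unless otherwise specified, the experimental methods used in the following examples are conventional methods.
Unless otherwise specified, the materials and reagents used in the following examples are commercially available.
1. Cloning and analysis of the pentostatin and Ara-A biosynthetic gene cluster:
Total DNA of S. antibioticus NRRL 3238 was first extracted and subjected to whole-genome scanning sequencing, and a total genomic library of S. antibioticus NRRL 3238 was constructed with pOJ446 as the vector. According to the reports of Hanvey et al. in 1984 and 1987, adenosine is the direct biosynthetic precursor of pentostatin: C-1 of D-ribose is inserted between C-6 and N-1 of the purine ring to form the 1,3-diazepine ring, and the pathway before the C-1 insertion resembles the L-histidine biosynthetic pathway. The hisG genes from Streptomyces lividans TK24 and Escherichia coli, which encode the ATP phosphoribosyltransferase of L-histidine biosynthesis, were used as probes in sequence comparisons against the sequencing results for S. antibioticus NRRL 3238 total DNA, and a homologous encoded protein was found. Two primer pairs, 1F: TCAGACCACGCACAGGGAA, 1R: TGGCGTCTTGGTCCACTGTCT and 2F: GTGCGACCAGCCTTCCAGT, 2R: TGGCTCGTCTGTCCACTCGTC, were used to screen about 2000 clones of the total genomic library, and 11 cosmids containing penA, the hisG homologue, were isolated. After DNA sequencing and sequence analysis, one cosmid covering a 39 kb chromosomal DNA region was selected, which established the approximate location of the pentostatin biosynthetic gene cluster in the genome. The cosmid containing the pentostatin biosynthetic gene cluster was analysed further, and SacI restriction sites were used to narrow down the DNA region containing the cluster; the resulting target cluster fragment was cloned into vector pJTU2463. Mutant strains obtained by disrupting the target cluster fragment produce neither pentostatin nor Ara-A. A 9 kb chromosomal region was analysed by DNA sequencing, and bioinformatic analysis showed that it contains 12 open reading frames. Detailed results are listed in Table 1.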
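As a quick sanity check on the four screening primers quoted above, their length, GC content and a rough melting temperature can be computed directly from the sequences. This is an illustrative Python sketch only: the Wallace rule (2 °C per A/T, 4 °C per G/C) is a coarse rule of thumb chosen here for simplicity, not a method described in the patent, and the function names are hypothetical.

```python
# Primer sequences exactly as quoted in the text.
PRIMERS = {
    "1F": "TCAGACCACGCACAGGGAA",
    "1R": "TGGCGTCTTGGTCCACTGTCT",
    "2F": "GTGCGACCAGCCTTCCAGT",
    "2R": "TGGCTCGTCTGTCCACTCGTC",
}

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def gc_count(seq):
    """Number of G and C bases in the sequence."""
    return sum(base in "GC" for base in seq.upper())


def gc_fraction(seq):
    """Fraction of bases that are G or C."""
    return gc_count(seq) / len(seq)


def wallace_tm(seq):
    """Rough melting temperature in degrees C by the Wallace rule (assumption)."""
    gc = gc_count(seq)
    at = len(seq) - gc
    return 2 * at + 4 * gc


def reverse_complement(seq):
    """Reverse complement of an unambiguous DNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


if __name__ == "__main__":
    for name, seq in PRIMERS.items():
        print(f"{name}: {len(seq)} nt, GC {gc_fraction(seq):.0%}, "
              f"Tm ~{wallace_tm(seq)} C, revcomp {reverse_complement(seq)}")
```

To locate a primer pair in SEQ ID NO. 1, one would search the sequence for the forward primer and for `reverse_complement` of the reverse primer; the positions are not stated in the text, so none are assumed here.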
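Table 1: Functional analysis of the genes and encoded proteins of the pentostatin and Ara-A biosynthetic gene cluster
a NCBI accession numbers are given in parentheses
b Open reading frames outside the pentostatin and Ara-A biosynthetic gene cluster
2. Determination of the boundaries of the pentostatin and Ara-A biosynthetic gene cluster:
Disruption of gene orf-1 did not affect the production of pentostatin or Ara-A. On this basis, and from functional analysis of the encoded proteins, the pentostatin and Ara-A biosynthetic gene cluster was determined to extend from gene penA to penH, covering a 9 kb chromosomal region that contains 10 open reading frames. The whole cluster comprises 10 genes: 3 genes related to pentostatin synthesis, 5 genes related to Ara-A synthesis, and 2 genes related to the transport and regulation of both. See Figure 2. Disruption of the above gene cluster gave mutant strain TD3; HPLC analysis of its fermentation broth confirmed that this mutant produces neither pentostatin nor Ara-A. See Figure 2.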
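3. In vivo functional determination of the genes related to pentostatin and Ara-A biosynthesis:
Each gene of the minimal pentostatin and Ara-A gene cluster was knocked out in turn, and the constructs were transferred by conjugation into host strain CXR14 for heterologous expression. CXR14 is a Streptomyces strain derived from an industrial polyoxin producer by deletion of a large fragment, and it does not itself produce pentostatin or Ara-A. The fermentation broths of the mutant strains were analysed by liquid chromatography-mass spectrometry (LC-MS), which established that penA, penB and penC (3 genes) are related to pentostatin biosynthesis, that penD, penG, penH, penI and penJ (5 genes) are related to Ara-A biosynthesis, and that penE and penF (2 genes) act as transport and regulatory genes. See Figure 3.
4. Biosynthesis of pentostatin and Ara-A
The function of each gene in the pentostatin and Ara-A biosynthetic gene cluster was analysed in vivo. Pentostatin biosynthesis is related to L-histidine biosynthesis. Starting from ATP or dATP, the ATP phosphoribosyltransferase converts ATP or dATP into phosphoribosyl-ATP (PR-ATP) or phosphoribosyl-dATP (PR-dATP) (compound 1). Three proteins of the L-histidine biosynthetic pathway, the phosphoribosyl-AMP cyclohydrolase HisI, the phosphoribosyl-ATP pyrophosphatase HisE and the phosphoribosyl isomerase HisA, then produce compound 2. A series of reactions catalysed by the SAICAR synthetase gives compound 5, a dephosphorylation step gives 6-keto pentostatin (compound 6), and finally the short-chain dehydrogenase PenB converts 6-keto pentostatin to pentostatin. See Figure 4A. Ara-A biosynthesis starts from S-adenosyl-L-homocysteine (SAH): the SAH hydrolases PenD and PenG jointly catalyse and regulate the formation of adenosine, and the 2'-hydroxyl of adenosine is then epimerized under the joint action of the oxidoreductases PenH, PenI and PenJ to give Ara-A. See Figure 4B.
5. In vitro functional verification of PenB, the NADP(H)-dependent short-chain dehydrogenase encoded by the pentostatin biosynthetic gene penB:
The penB gene was amplified by PCR, cloned into an expression vector, heterologously overexpressed in E. coli and purified; SDS-PAGE analysis showed that fairly pure PenB protein was obtained. Bioinformatic analysis indicated that PenB is an NADP(H)-dependent short-chain dehydrogenase. With pentostatin as substrate, mass spectrometric analysis of the in vitro PenB reaction showed that 6-keto pentostatin (6-keto PTN) was formed. 6-keto pentostatin was then prepared on a larger scale, isolated and purified, and its structure and purity were confirmed by mass spectrometry and NMR. With 6-keto pentostatin as substrate, mass spectrometric analysis of the PenB reaction showed that pentostatin was formed. PenB can therefore convert pentostatin to 6-keto pentostatin, and the reaction is a reversible redox reaction; as the last step of the pentostatin biosynthetic pathway, it ultimately yields pentostatin. See Figure 5.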
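Further examples are provided below. They help in understanding the invention and serve only as illustration, without limiting its scope of application.
Experimental methods in the following examples for which specific conditions are not given generally follow conventional conditions, such as those described in Sambrook et al., Molecular Cloning: A Laboratory Manual (New York: Cold Spring Harbor Laboratory Press, 1989) or in Practical Streptomyces Genetics, or the conditions recommended by the manufacturer. Unless otherwise stated, percentages and parts are by weight.
Unless otherwise specified, the materials and reagents used in the following examples are commercially available.
[Example 1] Extraction of total DNA from the pentostatin and Ara-A producer S. antibioticus NRRL 3238:
30 μL of S. antibioticus NRRL 3238 spores were inoculated into 50 mL of TSBY medium and cultured at 30 °C, 200 rpm for 24-36 h until the medium became turbid. The 50 mL culture was centrifuged at 4000 rpm, 4 °C for 10 min, the supernatant was discarded and the mycelium was collected. The mycelium was resuspended in 25 mL of 10.3% sucrose solution, mixed by shaking to wash the cells, centrifuged (4000 rpm, 4 °C, 10 min) and the supernatant discarded. The mycelium was then resuspended in 15 mL of SET buffer, mixed by shaking and centrifuged (4000 rpm, 4 °C, 10 min), and this wash was repeated twice. The mycelium was resuspended in 10 mL of SET buffer and mixed, 50 μL of lysozyme solution (100 mg/mL) was added and the sample was incubated in a 37 °C water bath for 30 min. Next, 280 μL of proteinase K solution (50 mg/mL) was added and mixed, followed by 600 μL of 10% SDS; after mixing by inversion the sample was incubated at 55 °C for 4 h, mixing by inversion every 15 min and adding 100 μL of proteinase K solution every 30 min, until the mycelium had lysed and the solution was clear. 4 mL of 5 M NaCl was then added and mixed by inversion, and the lysate was left at room temperature to cool to about 37 °C. 10 mL of chloroform was added and mixed by inversion until milky white, the sample was centrifuged (4000 rpm, 4 °C, 10 min), and the supernatant was removed and mixed with 0.6 volume of isopropanol. The flocculent DNA that precipitated was carefully spooled out, washed twice with 75% ethanol, air-dried in a ventilated place and dissolved in an appropriate volume of ultrapure water. The size of the DNA was checked by nucleic acid electrophoresis (40 V, 12 h), and its concentration and OD values were measured on a Nanodrop 2000 to assess the purity of the extracted total DNA.
[Example 2] Construction of a genomic library of the pentostatin and Ara-A producer S. antibioticus NRRL 3238
The amount of Sau3AI to use was first determined in a series of dilution experiments. A 500 μL digestion mix (5 μL of 0.1 U/μL Sau3AI and 495 μL of DNA diluted in 10X buffer) was prepared as solution 1. 300 μL of solution 1 was mixed with 100 μL of DNA to give solution 2 (final Sau3AI concentration 0.075 U/100 μL); 200 μL of solution 1 was mixed with 200 μL of DNA to give solution 3 (0.05 U/100 μL); 200 μL of solution 2 was mixed with 200 μL of DNA to give solution 4 (0.0375 U/100 μL); and 200 μL of solution 4 was mixed with 200 μL of DNA to give solution 5 (0.01875 U/100 μL). All solutions were mixed and kept on ice, then incubated together in a 37 °C water bath for 1 h and immediately returned to ice. Samples (5 μL) were run on a 12 cm 1% agarose gel at 30 V for 18 h, with control DNA and λ mix as markers, and the quality of the digestion was checked on a gel imager. Based on this pilot digestion, a suitable dilution was chosen for pulsed-field gel electrophoresis (conditions: pump temperature 16 °C, run time 16 h, voltage 6.0 V, angle 120°, switch times 1 s and 6 s). The sample was loaded onto a pre-prepared 1% low-melting-point agarose gel. After the run, the gel was examined under long-wave UV light and the gel slice containing fragments of about 48 kb was recovered and placed in 10X β-agarase I reaction buffer. The slice was incubated at 65 °C until the gel had dissolved completely and cooled to 42 °C; β-agarase was added (1 μL of enzyme per 100 μL), and the mixture was incubated at 42 °C for 1 h and then at 65 °C for 15 min to inactivate the enzyme. After centrifugation at 12000 rpm for 15 min, the supernatant was mixed thoroughly with 1/10 volume of freshly prepared 3 M sodium acetate and 2 volumes of isopropanol, left at room temperature for 10 min and centrifuged at 12000 rpm for 15 min. The supernatant was discarded and the pellet was washed twice with 75% ethanol, dried at room temperature and dissolved in an appropriate volume of water. The quality of the recovered fragments was checked by electrophoresis, and the fragments were dephosphorylated and cloned into vector pJTU2463b.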
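The final enzyme concentrations quoted for the Sau3AI pilot dilution series at the start of Example 2 follow from simple volume arithmetic, which the sketch below reproduces. It is illustrative Python rather than part of the protocol; the function names are hypothetical, and exact fractions are used so that the results can be compared with the stated values without rounding error.

```python
from fractions import Fraction


def dilute(conc, vol_stock_ul, vol_dna_ul):
    """Concentration after mixing a stock with enzyme-free DNA solution."""
    return conc * Fraction(vol_stock_ul, vol_stock_ul + vol_dna_ul)


def sau3ai_series():
    """Sau3AI concentration of solutions 1-5, in units per 100 uL."""
    # Solution 1: 5 uL of 0.1 U/uL enzyme in a total volume of 500 uL.
    s1 = Fraction(5) * Fraction(1, 10) / 500 * 100
    s2 = dilute(s1, 300, 100)  # 300 uL solution 1 + 100 uL DNA
    s3 = dilute(s1, 200, 200)  # 200 uL solution 1 + 200 uL DNA
    s4 = dilute(s2, 200, 200)  # 200 uL solution 2 + 200 uL DNA
    s5 = dilute(s4, 200, 200)  # 200 uL solution 4 + 200 uL DNA
    return {1: s1, 2: s2, 3: s3, 4: s4, 5: s5}


if __name__ == "__main__":
    for number, conc in sau3ai_series().items():
        print(f"solution {number}: {float(conc)} U per 100 uL")
```

The computed values agree with the concentrations given in the text for solutions 2 to 5 (0.075, 0.05, 0.0375 and 0.01875 U per 100 μL).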
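Preparation of vector pJTU2463b: the plasmid was extracted, checked by gel electrophoresis, digested with HpaI alone, dephosphorylated (to prevent self-ligation) and then digested with BamHI, giving DNA fragments of 7 kb and 2 kb; the 7 kb fragment was recovered from the gel.
Ligation of total DNA fragments with vector pJTU2463b: the prepared vector (15 ng/μL) and the recovered, dephosphorylated total DNA fragments (46.2 ng/μL) were ligated at a ratio of 1:3. Ligation mix: vector 4 μL, DNA fragments 13 μL, T4 buffer 2 μL and T4 ligase (NEB) 1 μL, made up to 20 μL with ultrapure water. The mix was incubated at 16 °C for 12 h and then at 70 °C for 10 min to inactivate the enzyme. The ligation product was checked by gel electrophoresis before the subsequent steps.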
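The text does not say whether the 1:3 vector-to-insert ratio is by volume, mass or moles, so it can help to compute all three from the stated concentrations and volumes. The sketch below is a hedged illustration in Python: the average mass of 660 g/mol per base pair is a standard approximation, the 7 kb vector and roughly 48 kb insert lengths are taken from the preceding paragraphs, and the function names are hypothetical.

```python
AVG_BP_MASS = 660.0  # g/mol per base pair of double-stranded DNA (approximation)

# Component volumes of the ligation mix, in microlitres, as listed in the text.
VOLUMES_UL = {"vector": 4, "insert": 13, "T4 buffer": 2, "T4 ligase": 1}


def mass_ng(conc_ng_per_ul, volume_ul):
    """DNA mass in ng from concentration and volume."""
    return conc_ng_per_ul * volume_ul


def pmol(mass_in_ng, length_bp):
    """Approximate picomoles of double-stranded DNA molecules."""
    return mass_in_ng * 1000.0 / (AVG_BP_MASS * length_bp)


vector_ng = mass_ng(15, VOLUMES_UL["vector"])    # 15 ng/uL vector
insert_ng = mass_ng(46.2, VOLUMES_UL["insert"])  # 46.2 ng/uL insert

# Assumed lengths: 7 kb vector fragment, ~48 kb size-selected insert.
vector_pmol = pmol(vector_ng, 7000)
insert_pmol = pmol(insert_ng, 48000)

if __name__ == "__main__":
    print(f"volume ratio  1:{VOLUMES_UL['insert'] / VOLUMES_UL['vector']:.2f}")
    print(f"mass ratio    1:{insert_ng / vector_ng:.2f}")
    print(f"molar ratio   1:{insert_pmol / vector_pmol:.2f}")
    print(f"listed volumes sum to {sum(VOLUMES_UL.values())} uL")
```

Note that the four listed components already add up to 20 μL, so with these volumes no additional water would be needed to reach the stated reaction volume.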
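Library packaging: 25 μL of packaging extract taken from the -80 °C freezer was mixed with 10 μL of ligation product and incubated at 30 °C for 90 min; a further 25 μL of packaging extract was added and incubation was continued at 30 °C for 90 min. The mixture was diluted to 1 mL with PDB, and 25 μL of chloroform was added.
Preparation of EPI300 competent cells: cells were streaked for single colonies on an LB agar plate; a single colony was picked into 5 mL of LB and grown overnight at 37 °C. The culture was transferred at 1% into 50 mL of LB supplemented with 500 μL of 1 M MgSO4 and grown at 37 °C to OD600 = 0.85.
Transfection: 10 μL of the packaged product was diluted to 1 mL with PDB. 10 μL of the diluted packaged product was mixed with 100 μL of the prepared EPI300 cells, incubated at 37 °C for 20 min, spread on LA plates containing apramycin and grown at 37 °C for 12 h. Single clones were picked into 96-well plates containing LB medium, grown at 37 °C for 24 h, mixed with an equal volume of 40% glycerol and stored at -80 °C.
[Example 3] Fermentation conditions for the pentostatin and Ara-A producer S. antibioticus NRRL 3238 and HPLC detection conditions for the products
Spores of S. antibioticus NRRL 3238 were inoculated into seed medium and cultured at 28 °C, 220 rpm for 48 h, then transferred at a 4% inoculum into fermentation shake flasks and cultured at 28 °C, 220 rpm for 6 days. The fermentation broth was collected and centrifuged at 12000 rpm for 20 min, and the supernatant was passed through a 0.22 μm microporous filter for HPLC and LC-MS analysis.
HPLC conditions: phase A was ultrapure water containing 0.15% trifluoroacetic acid (TFA) and phase B was methanol. Phase A started at 95% and was reduced by gradient elution to 80% over 30 min; at 31 min phase A was switched to 10% and held at this level until 45 min; at 46 min phase A was returned to 95% and held until 65 min. The flow rate was 0.5 mL/min, the detection wavelength 270 nm and the column temperature 30 °C.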
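The HPLC gradient described above can be written as a table of time points and interpolated to give the mobile-phase composition at any time in the run. This is a minimal Python sketch under one stated assumption: the changes between 30 and 31 min and between 45 and 46 min are treated as one-minute linear ramps, which the text does not specify. The names `GRADIENT`, `percent_a` and `percent_b` are hypothetical.

```python
# (time in min, percent phase A); phase A is water + 0.15% TFA, phase B is methanol.
GRADIENT = [(0, 95), (30, 80), (31, 10), (45, 10), (46, 95), (65, 95)]


def percent_a(t_min):
    """Percent of phase A at a given time, by linear interpolation."""
    if not GRADIENT[0][0] <= t_min <= GRADIENT[-1][0]:
        raise ValueError("time outside the 0-65 min run")
    for (t0, a0), (t1, a1) in zip(GRADIENT, GRADIENT[1:]):
        if t0 <= t_min <= t1:
            return a0 + (a1 - a0) * (t_min - t0) / (t1 - t0)
    raise AssertionError("unreachable: gradient table covers the whole run")


def percent_b(t_min):
    """Percent of phase B (methanol); the two phases sum to 100%."""
    return 100 - percent_a(t_min)


if __name__ == "__main__":
    for t in (0, 15, 30, 31, 40, 46, 65):
        print(f"{t:>2} min: A {percent_a(t):5.1f}%  B {percent_b(t):5.1f}%")
```

A table like this maps directly onto the gradient tables used by most HPLC control software, although the exact format depends on the instrument.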
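[Example 4] Conjugation method for the pentostatin and Ara-A producer S. antibioticus NRRL 3238 and for heterologous expression
The plasmid to be transferred was first transformed into competent E. coli ET12567/pUZ8002 cells. Transformants were verified, and a positive single clone was inoculated into 5 mL of LB and grown overnight at 37 °C; the culture was then inoculated at 10% into 5 mL of LB and grown at 37 °C for 3-5 h. Spores of the host Streptomyces were centrifuged at 5000 rpm for 3 min, the supernatant was discarded, and the spores were washed twice with ultrapure water, centrifuged at 5000 rpm for 3 min and the supernatant discarded. 700 μL of TES was added and mixed, and the spores were heat-shocked for 5 or 10 min at a temperature (45 °C or 50 °C) chosen according to the recipient Streptomyces. 700 μL of 2x spore pre-germination medium was then added and the spores were incubated at 30 °C for 3-5 h.
The E. coli culture was centrifuged at 4 °C, 4000 rpm for 3 min, the supernatant was discarded, and the cells were washed twice with 20 mL of LB, centrifuged and resuspended in 1 mL of LB medium. The prepared recipient spores were centrifuged at 5000 rpm for 3 min, the supernatant was discarded and the spores were washed twice with LB. The treated E. coli cells and spores were mixed and spread on MS plates. After 24 h the plates were overlaid with 1 mL of ultrapure water containing an appropriate amount of the selective antibiotics and incubated at 30 °C for several days until exconjugants appeared.
[Example 5] PCR-targeting on the cosmid containing the pentostatin and Ara-A biosynthetic gene cluster
Cosmid pWUH1106, which contains the pentostatin and Ara-A biosynthetic gene cluster, was transformed into competent E. coli BW25113/pIJ790 cells and grown overnight at 30 °C. A single colony was picked into 5 mL of LB medium (containing apramycin and chloramphenicol) and grown overnight at 30 °C. The culture was transferred at 1% into 50 mL of LB, L-arabinose (1 M) was added at the same time, and the culture was grown at 30 °C for 3 h to an OD600 of 0.4-0.6. The cells were collected by centrifugation at 4 °C, 4000 rpm for 5 min, washed three times with 10% glycerol, resuspended in 200 μL of 10% glycerol and dispensed in 50 μL aliquots for use. 5 μL of the prepared fragment was mixed with the competent cells and electroporated in a cuvette (200 Ω, 25 μF, 2.5 kV). The cells were pre-incubated at 37 °C for 30 min, spread on LA plates and grown at 37 °C for 8 h; the transformants that appeared were then verified.
[Example 6] Recombinant production, overexpression, isolation and purification of proteins encoded by genes related to pentostatin and Ara-A biosynthesis
The target gene was amplified by PCR, verified by DNA sequencing and cloned into an expression vector, and the expression plasmid was transformed into E. coli BL21(DE)/pLysE. A positive single clone was picked into 5 mL of LB medium and grown overnight at 37 °C, transferred at 1% into 500 mL of LB and grown at 37 °C to an OD600 of 0.5-0.8. Expression was induced with IPTG (final concentration 0.1-0.2 mM) and the culture was grown at 18 °C for 20 h; the cells were collected by centrifugation at 6000 rpm for 5 min.
An appropriate volume (20-30 mL) of lysis buffer was added to the collected cells, which were resuspended by shaking and disrupted with an ultrasonic cell disruptor. The lysate was centrifuged at 4 °C, 12000 rpm for 30 min and the supernatant was collected. At 4 °C, the supernatant was loaded onto a gravity column packed with nickel resin and eluted with Tris buffer containing imidazole at different concentrations (20-200 mM). The fractions eluted at each concentration were analysed by SDS-PAGE, and the purer protein samples were collected.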
SEQUENCE LISTING
<110> Wuhan University
<120> Pentostatin and Ara-A biosynthetic gene cluster and its application
<160> 13
<170> PatentIn version 3.3
<210> 1
<211> 37740
<212> DNA
<213> Streptomyces antibioticus NRRL 3238
<400> 1
gtcggtcagc ccgccccctt gcgcatacct catacagtcg gcataccgca gggatcacga 60
ggcgtcagcc ccctacgaca ccacgctttg aaggtcagta gacggctttc tcttcgctgc 120
tgtgctgctt cgggccggta tccaccctgt cttgtgatcg ctgcggagtg acaatcgttc 180
ggttcgggca ctctttcgag ggactgcaag aaccctgcct cgcctctagc ctgttctcat 240
gtgagtcacc ccccgccttg catatgcaaa aaacggcagg ggtgtctcgg gaattgctct 300
gctcgaacag gtgaggttcg ggccggtctc gttgacggaa gtgggttctg tcatggacgt 360
acgcaatggc atccgtgcga gcgcgatcaa ggaagcgagg tgggtcaaaa gctcggcaag 420
tcagggcgtg ggaaactgcg ttgaggtaac cgacgtgacc ggcgtcggtg tggcggtgcg 480
caactcccgc ttccctcacg gacccgccct ggtcttcact ccggaagaga tcaaggcgtt 540
cctggccgga gccaagtcgg gggagttcga ccacttcgcc ggctgacggc cagggcttcg 600
ggctgaagac gaatcggcgg acgtacggac tccacgctgg gatactggtg cgccagtgga 660
ccagtctcga ggagtgccgc atgtccgccg atcgttatgg gctggcgtcg cagtccgttc 720
cggcgcagga ggccgacgtg tgccgtggtg ccgcgtcgcg caacctgggc aacttcctga 780
gagagctgcg cacccgcaga gggctgacga tggcgcaggc caaacacgcc atccgtgggt 840
ccgtttccaa gatcagcaga ctggagcgag gggagagtcc tcccaaggaa caggacgtgt 900
gggacctcgc ccgtctctac ggtgcatccg gcggcgagct ggaggacatc cacgacctgc 960
tgcggcaggt acggcaggag acacgtggca gccagttctc ggacgtcact ccggacttcc 1020
tgcggcgcct gatcgaactg gaaaaatccg cagtccacat ccgggcatac gagaactgcg 1080
tggtccctgg cctgctgcag acccgggaat acgccaaggc tgtgatcgag gccgcgatac 1140
cggaggccga cgaggtcacc gtcagccggc acttgcaggt ccgcaaagag cgatgggagc 1200
tgtacagcaa cgcccaggac accgagatca ccgccgttct cgacgaaggg gtcctgcgca 1260
ggatggtcgg aggcccagag atcatgctgg gccagctccg gcacctcagt gtggctgccg 1320
gcgaggacat gccgaatgtc aacatccggg tcataccgtt cgaccggggg ctcagcacgg 1380
ctcccagctt cccgatcact cacctgacct tcaaaaacgg cggcccgccg gagatggtgt 1440
acctggaatt gctggattcc gcgaactaca tcacggagac ggccaaactc gcccagtacc 1500
ggctggtcct ggacaggaca cttgacgcgg cgctggggcg cagggaaagc gttgccttcc 1560
ttgaaaaaat gatcaaggaa gtgaagaccc gcctccccgc gtcgcgttag caggctgcgg 1620
cgggggagcg gtggaactcc cgtacgacat ggtgtcgttg gccggcgtca gctgatgggc 1680
acgcggccga ccccgcccca ctcgacccag tcgtcggcct tggggcgggg catgagctca 1740
ctgtccggcc gccagtccgt aacatcacac aagccggggt cgatgatgtc caggcccagt 1800
gcgtcgaagt aggcgcggac ttcgtcggct gtccggacgc gcccccactt gccgcccgtg 1860
gcccggtcca tcaagtcggt cacccggttg cggatatcag cgcggtcgct gacgagttga 1920
cagatgacga agaagctgcc cggcttcagc cgcgctttga cggcggcaag catcggcagt 1980
acgtcctcgt ccttgaggca gtgaaccacg gacacgaaca gtgcggcggt gcactcgtcc 2040
ttacggatga gccgcctgac ctccggggcg tcgaaaatcg cttccgtctc ccgcatgtct 2100
gcccgtaaca ccgcgacgcg atcattctga tccaggcgac tacgcccatg ggccagcacg 2160
atcgggtcgg tgtccaggta gacgatccgg caagagcggt ccaccctttg cgccacttgg 2220
tggacgttgt cctgcgtggg caggccggag ccgtggtcga ggaactgacg gatcccgtac 2280
tcctgcgcca ggacccggac gactctgcgg aggaacgccc ggttgttgcg ggcgaggact 2340
ttcgagctcg gggcgatgcg cagcagatcc tcgcacgcct ttcggtcggc cgcgtagttg 2400
tccaccccgt ccagcagcca gtcgtacata cgcgccacgc tgggggtccg ggtgtcgatg 2460
gcgtcggaaa tgggtttctc ctcacgggtc atcggccccc ggcgcgagat ccggcgaaga 2520
tctgaggggc tggaaacccc catgctagag cgctacttgg cacacgtgca gccccacacg 2580
ctcctcatac aggccaatga agcagcgggc gagaatcacc gatgaaccgg tagcgaaatt 2640
atgcgccggc gtgagcgccg cagcgcccac gccgacgagg cgcgccggtc ggctgcccgt 2700
gcgcaccctg cggcgaactg gagcggaccg gtgaggaaag cgtccccccg cgaggacgcg 2760
gtgcacggcg ggatcggctt acgcggtagg caatgatcgg ctcagcccgc agcggcgggc 2820
acgcccatga gaccgcaccc cgcagccgcc cgtccctacc cccgcgctcc cagcgtcagc 2880
aactgcctcg gccgtctgct ccttcctcca cctgaagccc catgtgagac atggcccgca 2940
ccagacgacg actgtcggca agcctgcgac gggcgccgca gagcaaagca ccgatgtccc 3000
ccttgttcag ccccatcacc gtggcggtct cagacaggga gacaccgata taggcatgaa 3060
gcaccacggc atcagccatc acagggcgta gaacgcaatg agcacaagca ggtcggcaag 3120
aaggacaccg tgcgctgtcg gcaactcgct gacggagcag tccccatgcc aacgcggcag 3180
cggacgggct gctcagcgct gtcgtccaca cgaacgcgat atcgtcgaag gcggcactca 3240
cgacgtccgt agccgcggtg ctcctaggga tgcgagcctg cgcataaagc aggtagagct 3300
tgtggtagtt ctcataaaag gcccggtagt ccacacgccg cgctacctcg gcgcgacatg 3360
cccgggaagg tggtatcacg tccagcttcc ttcactcgct gggacattcg atgcctcgcc 3420
ccttagcgcc caggtccgcg ggtctagccg aggaggaagc cgccgcaggg gaagtggtga 3480
gggctggcgg aaaaggtgca tcgaacgcca tcccacagat tcctcagaag gctcggatcg 3540
gttcgcagtg cggacgccaa ggtggcgacg gtgttccaag cgggcacgca ccggccgagc 3600
aggatggctt cgagcctgtc cgccgacacg cctgtaagac tggccacctc atcaagcgag 3660
ggctggcccg cgaccagaag catcccccgc agcgccgccg ccagacggcc cgccgcctcc 3720
tcgacagacg ctggcggcgg gctgctggcc tgcagggaca cctgccacag aaagtgaaga 3780
tcctccgaac ggccgcccag ttgttccacc aggccccgca tccggtccca gggcaggatg 3840
tcctcgcccg ccaacgcgcg gtcagcctcg tcgaaggaaa gtccggcgcg cgcggccgcg 3900
gtggcgacct gtgcagccga agcggtccca ggcaactgcc tgagggcacc gacgagcatg 3960
tcggttcctg agcccaaccg tgccgcggcc ggagggcctg tcgtccggcc gagcgccgag 4020
ggggcaacgt cctgcgcaag gcgccgcaca cgctgcggat tccaccgggc ccgggccgat 4080
ctctcgctta cccccgcagc ggccgccaca tcggcccagt cctgaccacg ctccaggcgt 4140
tcccgtactg tgctaccgac gaggcattcc gacaaccgtc ccaagcggct gaccgtatcc 4200
agtaactccc ctaaggggcc agacagttct gtgctgagcc ggctggtgag ccggtacatc 4260
tcttcgatca acgcagcagc aactacctgc cctttgtcac ccgtcgagat cttcgctcgg 4320
tcccgctggc gctgtgcctg gcgacgacat cgttcgccgc agtacagccg aggccggccg 4380
ggccccgtac gctgaagaaa ttcgtgcccg caacccgcac atcgggcagt cagtggcgct 4440
tccacgccgc acccttccta ctttgcgtct tccaggtcgc agccgagccg gctgcgacct 4500
gggactgcac tagtggccgg cggggccgag ggcgccgcaa ggcgcctcgg ccccgccggg 4560
tgatcaggtg aggctgatcg ggccgatctg caccacgact tccaccgagg ggaagccggg 4620
cgtcacagtg acttggccgg gtgacgtcac caggctgccg caggccgcca gaaaggccag 4680
caaaatcacg atcaggacca cagtcggccg cagcggcacg gtttgcagcg cggcactctg 4740
cgggatactg atgcaggtag ccgcgcgacc attcacgtat ccgcggccta ccggattgaa 4800
catgggtagc tccttgttcg attattggcg gccccggctg cggatcgcac ccgcgccggg 4860
gccgttgagc acccgtcgtg tccgacggat gatgcgaatg tagccgaatc ctgaggcccc 4920
gttgagacga tcccgtccac gcttgcaccg cttccgctga tgcaacgcac gatgggaatg 4980
gatatgacgc agagtaaaca aaaatccagg cagcgcggtt ggccgactgc atatgccgga 5040
gagcttttca tttccctttc gggcctcatg ccgatcgcct tcggtcatta ccagagaaga 5100
aggctcagga tgggcctttc atgcctgagg gcgtaacccg ttgagcgggc cctgggcggt 5160
ctggtcatga ctcaccggag ccactcgatc gggtggcccg tcctggggag gtgccgtctt 5220
gggtgccctg caaatgcaac agacaaattg gttaaatggg acagcttcgg gaggtgggga 5280
gaatggagcg ggaggtgttc gccgcgtttg ttctcgcctg ccagggcgcg ccgtcccccg 5340
ccagtctgcc catcaaggac cgccgaagac tcgaacaacc gaaaggcggc aaccgcttcg 5400
ttgcccaccc ctgaagtggg catctcgtcc ctggagtccg gaacgagcca agctccagcg 5460
gccagatgac ccacgtcccc gactcagctc aggtcgggct tcctcgggca gggctgtgtt 5520
gccacgtcaa ctgagcctag gggaaccagt ccttgccgag tgcccgtggg gcagggccca 5580
ggtcacctcg tcggtgtcca ggcggtggtg ccaagaggtc agcacgtggc tggcgaccaa 5640
ccggtcgaga agctcggcag tctgatgggg agagtggccg cacagacggg ccagcacatc 5700
catctcgatg gtgtgcgcgg cgcgaggcga caggtgcgcg gccaggacga gcgcaaccag 5760
ccgtaccgcg ggaggcgtgg cacccggaag ggccggcggc ccgggccgca gggcccagtg 5820
agcggcccgg cgccgagcag tgcggccggg cgcctggtcc cggaccgcgg catcgagcaa 5880
ctgcaccgtc accggtgcgg ccctgaggtc cggtggaagc aaccagcggc cgtgagccaa 5940
ctcctcccag gcctcggcgc gtccgcgcag ccgcatgccg cgcagcaagc ctgcgggcag 6000
ccggacaaga ccgcgggtgt cggcccgcag cgcgcattgc agagccagca gccggccggc 6060
gggagaggtg cacctcggaa gcgcggtggc caggtaggtg agcatctccc gcacgcgcat 6120
tccttcctcc gcggcgggcc ggtgcacgcg gtgtgcagcc gcggccgctg caggcggtga 6180
agcgggggag tgcggagtgc cgagcacagt gcaggggacg acggaggtgt tgccggtcgc 6240
ggccgcgcag gccgcgcagg tgtcggccag ccgccagacg cggccgccgt cgtcgcgggt 6300
gagggccagc aggacaggac cggcgcaccc acggtgacgc cgatgccaca ggcagccaag 6360
cgaccggcac tggcagatcc gcagatgcgc gggcagcgca tcggcgcggg cgtgacaggc 6420
cagatgagcc agggcggcgg accttgcaga cgccgccaca ggggaaaccg tgtgattggt 6480
gcagcggacg cagaccagtg caggcccacc cggccgcggg cgcaattcca ccgtccagac 6540
gcgccgtacc gcggcgccgg cgagcctcat gggctggctc tcctcctgtc ttgtccctgg 6600
gaccgggctg ccgcacagcg ggcacccgga gcggacacgc gttcatgcag accagtgcac 6660
cgtcactgtc cgcgatcctc cgaggcaaat agtgcagaag tctgcactga cagggcagag 6720
gtctgcaccg tcggatggtg gagtgacgat catcccgccc gaccccaacc tcaccgccct 6780
gcgagtcgaa ctcgcacggc tacggggtca gcgcggatgg accttcgacg aactcgccga 6840
acgcagcggc ctggccaggc gcaccctcat cgacctcgaa cacggccgca ccaccggcag 6900
catcaccacc tggcacgccc tggcccacgc cttcgacgtc cccatcgaac gcttcctcac 6960
ctcactgtgc gaaggccacc cagcccccag cgcctcacag ccttgacccc caccaccgca 7020
agcccctgcc ggcgccgacg agaatcaccc gcacaggcga gacgccctcg aacgacagaa 7080
ccccaccccg gccacacgtt cccacccact gcgccgatcg cgtgacatcc acgaaagccg 7140
ccaccaccga ccggcgcgcc acctggcacg gtcgccgcca tgcgccccca caacccaccc 7200
gccggccgtc actgggacgg cagcccccgc catctccctc cccgccgctc cctgcagatc 7260
gccacacaca gcccgagccg actcctgcgc gcgggccaga ccggccgctg ccgccactgc 7320
ggcaaccgca tcgactggta cccgcgatcc gacggccggc ccatcgccct gcaccccgcg 7380
gaagtgccca ccaccagcgt ctccgccacc ggccgctggc acctcagcag cggcgtcgcc 7440
tacccccacg atgacggcag ccggtggtgc cgcatcccgc acgccgccct ctgcccccac 7500
cagcccccgg acccacacac cgcaagcacc ccgcttgcaa cactgcgtcg cgaactcagc 7560
ctgcgcacca gacgcctgat cgacacgggc gccttcaccc cggacaccca cccgcccgcc 7620
accgcagggc cggccggaca caacgagcgg cccacccgcc ccgtcgtccg gatccttctc 7680
atcaactacc tcgccgaaag ccccatcgac accctccgct gcgtcgccca gaccatccac 7740
cgagaccgct gcacacacct cctgcccgac cagaccccag gacgctggat cctgctgccc 7800
acccagccag ccagcggtca actcaccctg cccgatacca gcatggcggt ctacgacctc 7860
agccacctgc cctacgcaga ccaactacgc tggcgcgccc aacgctgcac catccacgcc 7920
gcctctaccg cagccgccga cttggccctg gccggctggg agcccttcga ccccctcctg 7980
cacgccgatc acatccgcac aacgctgccc acgcccaccc ggcaccaccc cagcgcacgg 8040
tgaccgaccc gccgtatcga ccccgccagc actcacaacg tcgcgcgacg acggatcacc 8100
ggccacccac cccggcgcgc cacctcgctg agggcatcgc caaagccggg caccccgccg 8160
gtctgctcgc ctgctacgcc gccgactcgg tgccccggcc acccaccgac acgatcacca 8220
cacccgaaag ccgcacgcaa caaggaaact cccgggcccc tccccgcctg ccgccccggc 8280
cggccttacg gcctacgcac ggcccgaagg caccacttca agagagctcc ggccgaccga 8340
cgatgcgccg gagcacagcc gaaccacgcc gagcagcgca ctcgtactcc gtccaccagt 8400
tgttcgcgca cgcgagcccc gtacaccctc acacccgtcg gacaagcccc acccatccct 8460
gcaaggaact ctgagatgaa gcccaccgac gagcaaacag ccgccctgga cgccttccgg 8520
gccggtgaac acctcaccct gcaagccgga gccggcaccg gcaagaccac cctgctggcc 8580
atgctcgccc gcacgacttc gcgctgcggc aagtacctcg ccttcaaccg ggccatcgcc 8640
caagaggcca ccgcacgctt cccacgcaca gtccagtgca agaccgccca cgcgctcgcc 8700
tacgccgccg tcggccaccg ctacaccagc cgcctgaacg ccccccgccg ccccgcatgg 8760
caagccggac aagccctcgg cctcaccaaa gccatccgta tcggcgaacg cgacatctcc 8820
caacgcgccc tgtccaacgc cctcctgcgc accatcaccc gcttctgcca taccgccgac 8880
gagacgatca cccaccacca cgtgccgaaa ctacgcggcc tggaagacgc aggcatgcac 8940
cgcgaactcg ccacccacat cctgcccgcc gcccggaaag cctggaccga cctgcagaac 9000
cccgacgacg gccaggtccg cttcgaacac gaccactacc tcaagatctg ggccctcggc 9060
cgaccgcgca tcgaagccga atacctgctc ctggacgaag cccaggacac caaccccgtc 9120
gtggagaaag tcttcctcgc tcaacgcgac cacgcccagc tcgtcatggt cggcgactca 9180
gcccaggcca tctaccaatg gcgcggcgcc aaagacgtca tgacagcctt caacggcacc 9240
agactgaccc tgtcacagtc cttccgcttc ggaccccgcc tcgccgagga agccaaccgc 9300
tggctccacc tggccgacgc ccccatccgg ctcaccggca cccccaccgt gcctactgaa 9360
atcggcctca tcaccagccc cgacgccgta ctgtgccgca ccaacgtcgg cgccatggcc 9420
cacgtcatga acctcatgaa caccggacac caggtcgccc tgactggagg aggagacacc 9480
ctctacgccc tcgcccaggc agcacgcgac ctgaaagaag gccgccgcac ccaccacccc 9540
gaactgatcc tcttcccctc ctggggcgac ctgcaggact acgccgccca cgaccccgca 9600
ggacgagacc tgcaaccact ggtgaacctc gtcgacaccc acggcaccga cgccatcctc 9660
accgccgtca cccgactcgt gcccgaacca caagcccaag tcaccatctc caccgcccac 9720
aaagccaaag gaagagaatg gccccgcgta ctcatcgcgg acgacttccc ccgccccaaa 9780
gagcaccaac cagaagaccc caacagcccc gcagcgccac ccgacccgat cgacgacgcc 9840
gaagcccgcc tggcctacgt agccgtcacc agagcccacc gacgcctgga cctcggcggc 9900
ctggcctgga tccatgaaac ccgttgacga agaccagcaa cctgctcaag ctactccgac 9960
cagaccagcc ccagcatgtc ggccacccgc agcaccgacc gactggtgca gttcctgagc 10020
ccggccaccc gcctgtcagc cgtccgtcgc gacgaccagc cgtgtcatcc cggacctatg 10080
cgagcggcag ctcggcgacg aaagtctccc gcacgatgcg cactgcccac cgctacactg 10140
cgcctccttg aaagcctccc ttagcaatcc gtttctgccc gcttccaagg agcgctcgtg 10200
acgaccctgc ctcccgctcc acgtccctgc caatattgcc cgtaccgtct ggatgtcccc 10260
tcaggtgtct ggtcggcgga agagtacgca aagctgccga cgtacgacag gcccacacct 10320
gagcagcctg ccaagctctt ccagtgccac cagcacgacc acgacagcgg ccgcgcccgt 10380
gtgtgcggag ggtgggccgg atgccatgac ggagacgagt tgttggcgct gcgcgtggcc 10440
accatcgctg gagagatcgc cgtggagacg gctcaggcga tccgggacta cgcctccccg 10500
gtaccactat tcgcctccgg cgaggaggca gctgttcacg gtatgcggga gatcctcaat 10560
cccggcccgg acgcccgccg agccatcgac aagatcagcc gcacgcgcac cgacctcacc 10620
tagagccggg caggcggtga ggcccggggc acgttccggc tggttcagct aggtttcacg 10680
caaaacgctc aaacgggcag caaggacggg gagttctgat gcagtgagag cgccgacgag 10740
cagggtgttc agcatggctt tgacggccgc tgcgacgggc ccggaccgac gaagccccca 10800
gctccggaac gtgtccttgg acctgttggg ccgccccggt gagtgctggg catatcatcc 10860
tgcctgcccg cagcgaagag caccgtgcca tctgggacga tcgggcagca cgcgcaacgg 10920
ccttcgggag ggaaacgacc atgaagcgaa ccgtccggcc cgggcaggtt ctgcgagacc 10980
tcgcacctga catggtcgcc cgggaccggc gtctgcgcgt gctggccctg ggggacgacg 11040
ggcgcgccga atgcctggtc gtccacgacc acggcggcag cacgggacgc aaaccctcca 11100
tcaagatcga cgccttggcg tcgccgtcga agttcgaact cgtcgaggag gcagacgacg 11160
tcaccgccga cctccggtac acgcggctgc tggccgcgat taccgccgtc caccgcccag 11220
gtgctacccc ggccgattac gcccgcgccg ccttcgacac cctgaactag cggccagaaa 11280
cgacccggtg tccgcgccgt gaccgacgtt caccagctcc gtgctcgact ggacaggttc 11340
gccgtagcct cgggcccctc tgaaagagaa cctgcagcag ctcgacacga cgagtgagct 11400
gatgcaactg ctccggccga gccggtgtca ccgactccgc ggaagaactt cgctgggacg 11460
cggtgcacgt gcgttgtacg tcggcaaacg ggtgataagg gaccgctgtc agggtttggc 11520
ggtcaggctg gcgaacaggc ggtcttgcac agagcgtttc agccggcaca ggggcggctc 11580
ctggacggtc ggcgtgggcg gaggcggccc gaatctggta cggctgctgg aaaggtgccg 11640
gtgaaaggac agcggcgagg acatccgcga gttgagcgtc aacaccgcca ctcggtgatc 11700
atcctgcgcg tgacgaaagc cgcccatacg gcccggatgc gcacgggccg agtacctgct 11760
cgtggcacgg ctgcaacgac tgcgtgagca ctgggcaggt tgcggtcgtg ccggggcgga 11820
ctggctgtcg gcgctgaacg accgcgtctt gggggcgagg tggccggggc ggtggccgcg 11880
tgcacaccgg actgagcggc gtcgaccgcg cgtgtcaggc cctggtcgag gccgggaggc 11940
ggaaatcggc gggccttccg gttccgccct gctccggacg ctcaccaggc gtgtcaccag 12000
cgctggactg tcccctttct ggctgcctgc acaggttgct cggccgatgg gcggccgccg 12060
agggtggctt gcttcccggt gtcctgctgt ccgtgtcgtc cactagggtt catgatcgcg 12120
aacgcggtgg ttgtatgccc gtggtctgag agggagcgcg ggggagaggc aacggggatg 12180
ggctacacga ttccgggctg gctggatgag gtcctggact tcatcggtat caacttcccg 12240
aacgtggacg aggacgacta tcgcgaaatg gctgatgcca tgcgcgattt cgcggacaag 12300
ttcgaagggc atggcgccga cgcgcacaag gccgtctccc ggatcctgtc ctcctcccag 12360
ggctgggccg tcgacgcgat ggacaagcac tggaatcagg tgaaggccgg tcacctggag 12420
aagcttcccg agctggctcg cctgttcgcc aacgcgtgcg acgccctggc cgacatcgtc 12480
ttctggatga agaggaaggc cgagaccgag ctggcggtca tggccggctc ggtgggtctg 12540
tcgatcgggc ttgcctgggt gaccggcggt ctgtcagcgg ttctcggtgc ggctgagatc 12600
acggccatgc ggcaggcggt caagcggatc atcgatgagg ccgcggaccg gatcgtcgac 12660
gaggtgatcg cccagctcac cgagccggtc aacgccaagc tggaggcgat ggtcgaggac 12720
atggttctcg accttgcgga tgacgctttc tccatgccgc cgacggccgg cagcggcgct 12780
ggacacgatg ccaagggcgg gcacggcgca atgcagctcg cctcagccgg cggtgcgggt 12840
ggccacaacg gtgacgcggg gaagaccacc aagatcgacc acgtcgagtt cgaaaacggt 12900
gcgggaaagg tctcccgaca cggtgacgag ctgcacctgg ccgcgagtgc gccactgcgc 12960
cgggcgcgag gcgcgttcgg aaagagcaag ggccgcgatc cgttcactca gatcttcgac 13020
acggtgctgc acggcgcgct caagggctcc gagaaggcgc tgaagaggat agccacacac 13080
atcaccaaca ccgtccccga ccgggtgaag gcgacctccc gcctgcacaa gggcatcgac 13140
caggacgtcc ggagcaagct cgacgccatc cgcctgggcg ataaggacgg cggcaccggc 13200
cgggacgggc tgccgggtat ccccgggcag cacaggaagt ccgatgacgc ctacagcaag 13260
ccctccccgc tcaccggcgc gaaggacgac ccgcggcgcc atgcgatccc gttgacgaac 13320
aagacctgtg agaacgaccc ggtcgatgtc gcgaccggag aaatgacgct gccgtgcacc 13380
gatctgtcgc tgcccggtgt cctgccgctc gttctgcgcc gtacgcacct gtcggactac 13440
cggtacgggc agtggtacgg ccgcagctgg gcctccaccc tggacgagcg ttttgaactc 13500
gacccgctgg ggcagggcgc ggtctgggcc cgcgaggacg gctcgctgct cgtctacccc 13560
catctgcctg cggccgacga cccagcggga gtgatgccgc tcgagggccc gcgcctggcg 13620
ttgcggcatg atggtgacga caacggcacc atcacgtact gcatatccga cccggccggc 13680
ggatggacac gttcgttcac cgggagcccc tacttcgcct cgccctccta ctggctcacc 13740
gcgatcgagg accgcaacgg caaccggatc gtcttccacc gcgacggcaa cggcgccccg 13800
gctgccgcct cgcacagcgg cggctaccag gtgactttct cggtctccga cgaccggatc 13860
cagaaactgg ccctgcgcac tcccgagggg ccgcactcgg tgctgcgcta cggatacgac 13920
ccacagggca acctggagac cgtcatcaac tccagcggac tgccgctgcg ctacacctac 13980
gacgacactc gcatcacggc ctggaccgac cgcaacgact ccaccttccg ctacgtctac 14040
gacgacgagg gacgcgtcgt gcgcacggtc ggcccggacg gcatcttgtc ctccaccttc 14100
acctacaccc ggcaccccga caccggcgac aagatcaccc gctacacgga ttccaccggt 14160
gccaccagca cctactacct caacagcgcg ctgcaggtcg tcgccaagac cgacccgctg 14220
ggacacacca cccacatccg ctacgacgac cacgaccgga tacaggccca caccgacccg 14280
ctgggtgcca cgacgtacta cgagcgcgac ccacgcggca acctgaccgg cctgcggacc 14340
gctgacggag ccttcacaca cgccgcctac aacgagcggg acctgcccgt caccgtcacc 14400
gaacgcggcg gtgccaccag ccacttcgaa tacgacaccc gaggcaaccg cacggccgcc 14460
gtcactcccg acggcgcccg caccgaatac acctacaacg acctgggaca cgtcatcgcg 14520
atccgcaatg cgctcggtga cgtcacccgg atcaccacca acgcagccgg cctcccgatc 14580
ggcatcaccg cccccaacgg cgccaccacc accctggtcc gtgacccctt cggccgcgtc 14640
accgaggcca ccgaccccct gggcaacacc ctgaaccagg gatggaccac cgaaggacgc 14700
ctcgcgtggc gacgactgcc cgacgccagc cgggaagaat ggacgtggga cggcgaaggc 14760
aacctgacca gccacaccga ccgcatgggc cgcaccaccc accacaccgt cacccacttc 14820
gacaaaccct ccgccaccac cacaaccaac ggagccgact accgcttcac ccacgacacc 14880
gaactacgcc tcaccacggt gaccaacgcc gccgggctgc agtggaccta cacctacgac 14940
gcagccggcc gcctcacctc cgagaccgac ttcgacggcc gcaccatcac ctacgagcac 15000
gacgcgctcg gacgcctgac ccgccgcacc aacgccgccg gacaaaccct caccttcgaa 15060
cgcgacatcc tgggccgggt cacgcaccta cgccatgacg acggctcgac ctccaccttc 15120
acccgcgacg acagcggcca tgtcacccgc atcaccaacc cccacgccac catcgacctg 15180
acccgggaca ccgccggccg catcatctcc gagaccgtca acggcgccac cacccgctac 15240
gcctacgacc cactcggccg ccgcacacac cgccagaccc ccacaggcgc caccagcacc 15300
ctcacctacg acaccaacgg cctggcctcg tacaccagcg gcgagcacac cttcgccttc 15360
gaacgcgacg ccctcggccg cgagaccacc cgcaccctgg acggaacacc caccctgcac 15420
cacacctggg acagcgtggg ccgcatcctc acccaaaccc tgcccacgtc ccagcacggc 15480
ccgatcgagc ggtccttcac ctaccagccc gacggcaccc tcacccgcgt agaggacagc 15540
ctcaccggag aacgcaccta cacactcgac gccgccagcc gaatcaccgc cgtccacgcc 15600
cgcggctgga gcgaaaccta cgcctacaac gccgcgggcg acctcaccca cagctccctg 15660
cccgaacccg cccccggcca gcaccacacc ggccccgtcc actacaccgg cagccgcctg 15720
acaacggccg gacgcaccca cgaccactac gacgcccagg gccgcgtcat ccgccgccag 15780
accaccaccc tcagcggcaa aaccctcacc tggcacttca cctggaacgc cgaagaccgc 15840
ctcacgcacg tcaccacacc ccaccacggg cgctggcact acctctacga cgccctgggc 15900
cgccgcatcg ccaaatgccg cctggacgac aacgaccgcg tcctcgagcg catcacctac 15960
acctgggacg gcgcccagct cgccgaacag cacaccgacg gcatttccct gacgtgggac 16020
tacctggggc agcaccccct ggcccaacgc gaaaccaaaa ccgccggcca gcaggaggcc 16080
atcgaccggc gcttcttcgc catcgtcgcc gacctctccg gcgcccccag cgaactcatc 16140
gcccccgacg ggaccatcgc ctggcgcgcc cgcagcaccg cctggggagc cacccagtgg 16200
aaccgagact gcaccgccta cacgcccctg cgctacccgg gccagcactt cgatccggaa 16260
accggcctgc actacaacgt caaccgctac tacgacccct gcctgggccg ctacctcacc 16320
cccgacccgc tcggactggc tcccgccgcc aaccactacg cgtacgtccc caacccgttc 16380
accctcaccg acccgctcgg actcgccggg tgcaccgccg accccacctg gggcggaaag 16440
gtcgtgtgga tacgggacga acacggacgg ccctacgaaa tgcacgccac catcacccgc 16500
gacatgatcg gccagggaac cgacgccaac gccgccctgc gcccgcccgg cttcgtccat 16560
ggcaccaggc acaaccaggc acgcggacac atgctcgcac agatactcgg cggctccgga 16620
gacaccctgg acaacctctt caccatcacc cagaacccca cgaactcccc gcacatgaga 16680
gacctggaac tgcggatccg ggacgccgtc ctgggcttgg acgaccgccc cggagagatc 16740
gttcagtaca gcgtctatct tgagtacacg gacgacgaga agacctctgt accgaagtgg 16800
atcaccatgg aagccgacgg caaccgcggc ttccacctcg cagcagacct cgagaaccca 16860
gaccatgccg cacagcagat ccgacgcagg gacggaatcg aatgagccct cttgagcgcc 16920
tcacagaact ctgcccccca ccccccaccg agcagccacc ggtaaactgg ccgagcgtcg 16980
aatccagact tggcctgcgc cttcccgagg actacaagag actcaccgcg acctacgacc 17040
cgggacgctt cgcgaactac ctctggatct acgacccccg gcacacctcg gtccacgtca 17100
acctcgtcgg ccccgcgacc gaacgcattc gggaacaaat gcgttcagac cacgcacagg 17160
gaatctaccc atcacctgtc agcccagagc tgctcctgcc ctgcggtgcg acagacaacg 17220
gcgaatatct cttctgggtc accacccccc gggatgaccc ggatgcctgg acgatcgtcg 17280
tcaacgaggc acgcggacca cgctggttca cctacgacgg caacctgacc cagttcctcg 17340
cctccgtcct cagcggcgac accaccgttc cacaattccc caaggacctc ctccaaagcg 17400
gtaccggctt cgatccctca cgcctgaacg agtggtcccc gcccctgcca ccagttcgtc 17460
cacccaccga ccccgaggcc atccgagcat gggcgcgcgc gaacggacac gacgtcccga 17520
tgcgcggacg cattcccgca cgagtccgac aagcatggga acaagcacac agaaacaact 17580
gaccacagca aagcgccggt catcgagcaa cggtcgcacc aaaacacaac gaccagacag 17640
tggaccaaga cgccagaggc caacagggcc gcctccggca agccacagcg aaaacctatg 17700
cctcgaccct gcagaccgcc gcgactgagc cacgcggtgc cggtcgccgg gccgacatcg 17760
cgcccggccg ccgcaccggc cgtggacatc gacaacgccg accagacccg cccgatgcac 17820
ctgcagacca tcgctcgaat cccctgaacg ggtagagact gcggtttccc gggcgcgtcg 17880
tacggttccg ccgctacgaa gagcacgacc agtgatctcc gaaaggactc cgtcgtgcgc 17940
ttcgcccgga ccgccgccct caccgccgcc ctcgcagccc tgctcgtccc cgccaccgcc 18000
catgccaccc cggtcgctga tcccaccggg cgcgccgcgg gtcagaccgt caccttgccg 18060
gtgcgcgacg ccctggctgc tctgcccgtc cgcgatgagg accgtaccgg gtatgagcgc 18120
acggcgttca agcactgggt ggacgcagac aaggacggct gcaacacgag ggccgaggtg 18180
ctgaaggccg aggccgtcac cgcgcccgag cagggcgcga actgccggct cagtggcggc 18240
cgctggtatt cgccgtacga cgaccgctat atcgccggac ccagcggtct ggatatcgac 18300
cacttggtcc cgctcgccga ggcctgggat tccggcgcct ctgcctggtc ggccgcgcag 18360
cggcaggcgt acgccaacga cctgggcgac gagagggcgt tgatcgcggt gtcggctgcg 18420
tcgaaccggt cgaaggcgga ccaggatccg gcgacgtggc tgccgccgac cgtcggctac 18480
cgctgtcagt acgtcaccga ctgggtcgcc gacaagacgc gctgggacct gagcatcgac 18540
cgcggcgaag aaatcgccct gtcccagacc ctgagccgct gcccgaacgc gccggtcacc 18600
gtcgctctcg cccggtaggg ccgggccgtt gcgatcagcg catcggttca gcctgtcgtg 18660
gtggtgcggt cagtcctgca ccggcaggtc ggtggtgtag tccgcgctga gtctcgtgcg 18720
ataaagggtg atacaggaag cgagggctag gcatcgtaca agggcggccc ggcttcctga 18780
gcagcccttg tactgtctaa atccgagtag ggacaagacc cgtgtgctac ggaccggacc 18840
acgaatagac aagagctcac gtttcctgct cttggagccg cccgagtgag gccgtcgcac 18900
cggcaggcga ctgtcatccg acggggcggg tggactgccg aatcaccagg tccgtccgta 18960
cctgtttgag cggctccaag gtctccccac gcaacatcgc cgccatggcg tcgaccccca 19020
gtctgcccag ctcctcgccg tgcagatcga ccgtcgtcag gccgggaagc aacaggccgg 19080
cgatgtctgt gttgtcgatg cccacgacgg atatgtcatc gggcacacgc atgccgagcg 19140
cggcggcggc gtggtagacg ccggcggcgg cgacgtcatc gtcgcacacc accgcgcggg 19200
gcggtgtgcg atcgctcagc agcttgcgtg ccgtgtccag agtcggcctg agcgtctctt 19260
cgagaggtat ggagaactcg gtcacccgta ggtcgcgggt ggcgtcctcg aacccggcct 19320
gccgggaccg gaaactgtag gaggaccgcc ggtagcgcag gtggccgatc gagcggtgcc 19380
cgaggtccgt caggtgctcg acggccgtac gcatgccgcc cgccacgtcg agaccgacga 19440
ccagtcggtt cggtgcggac agagccgggt cggagtcgat gaggaccgtg ggcacgaagg 19500
gcggcagttc gtcgatctgc cggtcgttcg gcgagcagat catcaagccg tcgaactggt 19560
tcgcggagag cacccgtgcc atcgttccgc tgttccaact ggaactggcc acgaccgtca 19620
ggccgtgggc ctcggctgcg tcatgggcgc tggccagcac ctgggcgaat aagggccccc 19680
ggatgttggg cacggccagg agtactaaac cggtgcgtcc tacgcggagt tggcgggctg 19740
ccgcctgagg ccggtagccg agcctctcgc cggcgtcccg cacccgcttc tccgtggccg 19800
cggagacacg tttctctccc ttccccgaga acaccagcga cacgctggcc tgggagaccc 19860
ccgcgagccg cgccacgtcg cgagaggtgg gccgtcgggg cgtcgaagcg gtgctccact 19920
cggacaacgg gagttctccg ggttccatac gttcttcgcc ggacgcgtcc tcgcgacggg 19980
aaccggagac catggaccac cttcccaagg gaaagtcact gtgcatctgc tgtatggctt 20040
gctgaagtcg gtgcgtgcgg tgcttaccgc gagtagaggc taccggttgt tccgcggctt 20100
tccagcgtgc tgtaaccggc ggcggtggtt ggctcgaaac ggcttgcagt aggcagtgtt 20160
acgtataacc ttgcttgccg accgcctgta cgcggccata cctcacccgc tgccttcaat 20220
cgattaggag cgtttcgtga cgaacatgct cagcccggcc gcacccgcca cccggttggg 20280
gaccgggcac gctgctcgcg ccgggaccct gccggtgctg cagagcgccc aggcccgctg 20340
gtccatgccg cccgcccgcc tggtactcgt cacgcacctg ctcgacacag cgatcccctt 20400
tgtccggttg ttcgagcgct gtatggacct ggtcagggtg gtgcccgtcc cctacagcgc 20460
gcagcccgaa gcgctcgccc ggctcgatga cctcccgatc acggtgcccg aatcgatcgg 20520
ggaggtcgga gcggtcgccg tgcgcgacgc ggagcgtgcg gcccgtgaga gtgaacttcc 20580
cgtggtcata caggaggtgg gcggatactg cgccgatgcg gtcggccggc tcgcccagtt 20640
cccgaacgtg cggggcgtcg tggaggacac caaacagggg caatggcggt acgaacggaa 20700
catgccgcta ccgctgcccg tcttcaccat cgcggacagc ccactcaagg cgctggagga 20760
cgtacaggtg ggccgctcgg tggcctacag cgtcgagcgg ttattacgcc tgcgcttcta 20820
ccgactgctc agcgaacgac gggtactggt gctcggctac ggcggcatcg gtacggcctt 20880
ggccgaacac ctgcggcgga ctggtgccca ggtcgcggtc tacgacccgg acgaggtgcg 20940
gatgtcggcc gccgtggtgc acggcttccg ggtgggggcg cgggaggatc tgctgggctg 21000
ggcggaggcg atcgtcggtg tctcgggaca ccgcgcgctc accgtcgagg atctccccct 21060
gctgcgggac ggcgtcgtac tggccagcgg cagttccaag caggtggagt tcgatgtcga 21120
agggatatgc cgcagtgccg acaccctcgt cgaagccgac gaggtcatgg agctgcaagt 21180
cgcgaaccgc acggtgtact tgctcaacca cggcaagccg gtgaacttcc tggaacagag 21240
catcctcggc tccgtgctgg acctggtcta caccgagctg tatctgtgca cgcgggagct 21300
ggtcgggcgc gtgtggagcc cggggctgca ccgcctcgac cccgggattc agcaggaact 21360
cgcccaacag tggcgtgagg aatacgggcg gcagtggtga cgcttcccga ccgggtccgc 21420
gcgcacgtcc tggccgactt cgccactgcc gacccagcgc acgacatcca tcacctcgac 21480
cgggtcgcgg ctttggccgg ggacatcgcg gtactactgg gtgccgatcc ccagaccgcc 21540
caggtcgccg cgtacgttca cgactaccac cgggtggagg aggccaggca ggggcggcgg 21600
cccatccgtc ccgaggaggc acgctccgcc gtactggacg tcctcgaacg gtccgaggtg 21660
ccggagaagc tgcacggaac gatcctgcgc gccgtcgagc tgaccgggcg ttaccgcttc 21720
ggcggcgacg aactcgacgg cgaggacctg atcgccgccg cggtgcacga cgcggacaat 21780
ctggacgcca tgggcgccgt cggcgtcgga cgcgccttcg cattcggtgg gttgctcggg 21840
gagccgctgt gggagcccgc cgccgggctg aaggagctgt acacggaagg cgagacgtct 21900
tcggtcctgg cccacctgta cgagaaactc gtccatcttg agaaggacat gctcacggag 21960
ccggccaggc gcctggccgc cgagcgtgcc ttccaactgc accggttcgc cgcggagttc 22020
cgttcacagt ggggggagga ggacgtcgtc tcccattccg gaggtacccg cgtccactgg 22080
gacccgctca cccgcttcct ggccgtgacc cagccggagc ccgacggcat caccgtgaca 22140
cacatcggct tccgaggaca ggcgatgctg gccttcgacg aacagggaca gcctgtcgga 22200
gtggacctgt tgggtgcgcc ggaggcgctc acacactgcg tacctcacgc gcagcgctca 22260
agggcatggg tcgcggacgc ggcgggtggg tggttgttgg acgcggaggc cgatgtcgtg 22320
tggatcagta tcagcgaagc gcccgtccgc cgtcgtctca ccgccgtcgg tgacatcgag 22380
gtgcagctgc gcgagggcaa actggccacg ctgcggatgc acctgaccga ggaagccccg 22440
gtttccgtcg gcggcgaggg cgccccatga gttatgtgag cctgctgcga agtccccacg 22500
ctgcccggtt gctgctaggc accctcgtcg ggcggctgcc gagcgccatg gccgccgtcg 22560
cgatccccct ggcactgcgc gacgcgggcg cgccgtacgg gttcatcggg cttgccgtcg 22620
gggcgttcgc catcgcggct gccgtggcag ggcccctgct cggccgcctc gtggaccata 22680
tcggccagcc cctggtgctg ctggggacgg ctgtgctggc gggttgcggg tttgtggtga 22740
tcgcggcggc cgttgaccag caggccggcg tcctggtcgg cgcggccatc gcaggggcgg 22800
cgaccccgcc gctggagcct tgcctgcggg cactgtggcc ggagatcgtc gatgccgaag 22860
aactggagtc ggcctacgcc ctggactccg cttcccagca actgatcttc gtgggcgggc 22920
cgctgatcgt cgccggctgc gtctccgtcg cctcacccgt gggagccttg tgggccgccg 22980
ccctgctggg gctcacgggg gtcctcgtcg tagccactgc ggctcccgcc cgggcctggc 23040
gggcccctgc ccgtcaggcc gactggctcg gcccgatgcg cagccgcagt ctcgttgtgc 23100
tgctgatcag cctcaccggt gtgggcgtgg ccatcggcac actcaatgtc gttgtcgtgg 23160
cctacgcgga ggagcaccgg ctgcccgggg gcgccccgac cctcctgacg ctgaacgctt 23220
tcggtgctct gatcggggga ctcgtctacg gtgccgtaca ccgctggccc gtaccccctg 23280
cgcggcggac cctgctgctg gccgtcggcc tggcggtgag ctacgcactc ctgtgtctgt 23340
tgcccgcgcc gccattgatg gcctgtctga tgctactgac cgggctgttc ctcgcaccga 23400
cgctgacggt ctccttcgtc ctggtcggag aactcgcccc gacgggcacc gtcaccgagg 23460
ccttcgcctg gctggtgacg ctgatgacat cgggctcggc gctagggtcc gcagccgtcg 23520
ggctggtcct ggagcaccgc ggtcccacct gggcggcggc ctgcggcgtc ctgggtctca 23580
cgatgagcgt cctcatcctc ctagccggac agtcccggct ggattcggac cagcgtacgg 23640
agaccaaggc gtcggctgtc cctgcggcat gaaccgcacc gtggctcggc gcctcggccg 23700
ccataaccgt gatcttggaa gcaggtggac agtgataggg ggatggtgcg gatgccgacc 23760
atggaatggg tagccggcag ttgccgcctg ctggcctcga ccgcagccga tttcgagcgc 23820
gaccggccct tcgaggggct gcggatcggc gcggcgatcc atctggaacc caagaccgcg 23880
accctgctga tggtgctggc gcggggcggg gcggacgtcg tcgcgaccgg aaaccttggg 23940
acctcacagg gcgccaccct gaccttcctg agagaacagg gaatcaccgt catcggcgac 24000
cgcacccgcg atccacaggc ccaggacgcg tgcctgcgcc aggtcctggc gaccaggccc 24060
gacctgctgc tggacaacgg cggtgacctg ttcctgcggt atctcgacgc cccgtacgaa 24120
gggcttcggg gaggaacgga ggaaaccacc tcggggcggg cgcagctgat gccggtgcgt 24180
gaccggatca agcgcccggt gctggtgatc aacgacagtc cgatcaagca gttcgcagaa 24240
aacacccatg cggtcgggca gagcgtgctg gagtcgttcc tgcgcatcac caaccgggcc 24300
accaacggac ggcgtgtcac cgtcgtcggc tacggcgcct gtgggcgtgg catcgccaag 24360
aacttcgccc atgcgcatgc ctgtgtcgct gtgtgggacg tcgatcccgt tcgccgcctg 24420
gaggccctgt tcgacggcta cgcggtcccc ggccggccgg aggcactggc gtcagccgac 24480
atcatcgtga cctccacggg ccacccgggc atcatcaccg cggatgacct gtacctgctg 24540
tcggacggcg tgatcctggt gaacgcgggc cacctccctt gggagatcga cgtgcccgga 24600
ctgctcgccc atcccaaagt actgacgtgc accgagccgg ccgaaggact gcagaccctg 24660
acgacgacta gcggcgcccg cgtcaacatc ctcaccgaag gccacatggt gaatctgaac 24720
ggccctcggc ccctggggaa ctccgtagag tccatggacc tcgggttcac cctgcaggcc 24780
cgctgcctgg aagccatcgc cacgggccgt gtccctgccg accagtgcgt cgtccccgtc 24840
ccgcccgaga tcgacgcccg ggtggcgcag gcatatgtgg atctggcagg cgagaaacga 24900
ccgatgaacg cataccccgc agaggagcag tagtgaagca atccgtggag gcggacaact 24960
ctcccatcga agaccttctc tcctggaagc ggggcaggcc cgcggacatc gaggggcgaa 25020
gcaaaaaact gtggctgctg cccgacggcc tgtgtctcat cgagatcatt ccctcactgc 25080
gcagcttcac ctacgaccgg gacgagctgg ttgaggaaac cggcccgtta cgactcgact 25140
tctacgaacg ggccgccgcg aaactggccg atgcgggaat ccgtacggcc ttctcccgcc 25200
gcatctccgc cacctgttac gtggccgagt accaccctgc gccgcccttc gaggtcgtcg 25260
tgaaaaaccg tgcggtcggc tccaccctcg tgaagtatcc ggggctgttc caggaaaacc 25320
agccgctgcc cgcgccggtg gtgaaattcg actaccgggt cgatccagag gaccagccca 25380
tcggcgagga ctatctgcgc gcgctgggcc ttccggtcga ggaactgcgc cggcaggcgc 25440
tggcggtgaa caccacgctg cgcgactggc tccaccccgt ggaactgtgg gacttctgcc 25500
tcatcttcgg attctccgac gaccacgagc cggtgctgat atccgaggtg tcgcaggact 25560
gcatgcgcct gcgtcactcc gacggatccc cgttggacaa ggacctgttc cgcagaggag 25620
cgtcacagga aacgatcact tcgcaatggc ggaggctgtt cgatggcctc ggctgaccac 25680
ccggtggcct tgctcaccgg tacgaaccgc ggaagcggca gaagcatcgc ccgggagctg 25740
catgcgcggg gctaccgcat tttctccctc aaccgcacgc tgaccggtga ggaatggctc 25800
catgaagagc gatgcgacct tgccgacccc gagcagatca ggggcggtgt tgcccgggtg 25860
ctcgcgacgg caggccggct caacgtgtgc gtttccaatg cggtcgaccg tgtcctggac 25920
ccgatcgccg acatgcgctg ggaggattgg gacaggtccc tggcggtgaa tctcagcgcg 25980
aacttccatc tgacccaggc tgtgctgccg gcgctgagat ccggtgacgg gctcatcgtc 26040
ttcatgggaa gccatgccgc tacacggtat ttcgagggcg gtgccgccta cagtgccgct 26100
aaagccgctc tgtccgcctt cgtggagacg cttctgatgg aagaacggaa caacggcgtg 26160
cgggcgtgcc tcgtctcgcc gggcgccatc gccaacctgg acggggacgt ggacccccac 26220
aagatgacga ctcatgcggt cgccaaggcc gttgtctcga tcatcgccga tttccccagg 26280
gatctgctgg tgggggagat ggagatcagg ccggcagcgc tccccgaacg ccctgtcacc 26340
ggaatcgacc ggctgctgca cgtctaggag gacagcatgc tctccctcgc actgcccaag 26400
ggatcttcgc tcgaacagcg gacgctgggc ttgttcgcgg cggccggcct gcaggtgacc 26460
aggccgtccg agcgcgccta ccgcggcacc atcacctacg gcggccccat acgggtcgcc 26520
ttcttcaagc cacgagaaat ccccctggtc gtcgcggcgg gggtgttcga tgccgggctg 26580
accggcgccg actggatcga ggagaccggg gccaaggtgg agagcgtggt gtccttcacc 26640
tactccaaga ccacggactc gccctggcgg gtggtgctgg cggttccggc cgacgacgcg 26700
gcccggaccg tgcaggatct cggccccggc acacgtatcg ccaccgagta ccccaccatc 26760
gcccgccggt tcctccaaga tgagggaatc caggcggagg tggtccattc gtacggcgcc 26820
accgaggcga agatcccgga actggccgat gccatcgtgg acgtcgtcga gacgggttcc 26880
tcactgcacc acaacggcct gcggatcatc accacgatcc gcacctgcgc cccatggctc 26940
atcgccagtc ctgaggcgtg gtgcgacgcc gaccggcttc gacggatcca ggggatggcc 27000
cggctgctgg acgccgctca cgcacagtcc gcccaggcgc tgctgaccgt acgcgtgccc 27060
acgcactgtc tggaccgggt agtccgctcg atgcccgagc gttcctggcg cgtcggggcc 27120
gacctgcacc acaccgacct tgtgatcgtc caagggctgg cacagcgtgc agggctgccc 27180
gacgtcatcg accgtctgct gggagcaggc gcgatcgaag tcacccagac cgacggtggc 27240
atgaccacac cggcgttccc ctcggacacc catgtgcgag cgtgatcact actgagacgc 27300
cggcctcggc cggacccggg tccggccgag gccggcgtct cgcacggacg cggccgggcc 27360
gcctgacgag cagaacgcgc tcgagtcggc cgcagggaca ggtccggtgc ccgtacggat 27420
gccatggtga cgacggagcg cggacagggc cgcgaggcgg ccgcacatgc cgtggaccga 27480
cggcccgggc ggtgtgggtg tggcggcgga gcacagatac aggccgcgca gcggagtgga 27540
ataaggatcc cagcgcaggg tcgggcgagc cacggactgg cgcaggggca gggcgtcggc 27600
gccgatgtcg ccaccgacgt ggttcggatc gtactcctcg aaacgggcga ccgatcggcc 27660
ttgggcggcg atgatcgtgt cggcgaaccc gagtgcgtac tcctcgatcc gttcgcggat 27720
cagccggacg ggttcacacg tgtcaccgtt gggcacatgg gcgtaggccg acacgggccg 27780
tttgccttgc gctggccagg gacggatcgg ccaccgcggg atcgacgacc agcgtcaagg 27840
ggtcggggac gcgctcgccc gcggcggtcg cggtctcccg ccggatgatg tccgcccgtg 27900
tgccgccgag gtgcacggtt ccggtccgtc ccacgagcgg gtcgccccag gggatggcgc 27960
cgctgacgag gaagtcggcc ttggccgctc cggggcgtac cggtagcgac ctaagacccg 28020
ccggtagcgg gcgggcagcc gttgccctgc gagggcaagg gcctccttgg ggccgacgtc 28080
cagcaggacg agcggggcgt tcccccaact ccgccaggtc acggacgtgg tgcccggtgt 28140
gcaccgtgcc gccgtgcgcg ttgtgggcca gcagcagtgc caccaccgtg gaggccacag 28200
agggaagctt gccgacggcg tgcgcggcaa ccccggtcag caaggccctg gcctgcggag 28260
tcgtgaacgg gttgtggcgc gtggcgtggg ccggcacccg tgaggtgaac accagcggga 28320
tgatcggctc gcgccaaggc ccgctgactg ccgagcacta ggtcgaccac ggcggtggag 28380
cgggacacca gaggccgtac gaggcgttcg cacctgggcc cgtccgcacc gagccggctg 28440
gcggtacggg cgaggtcacg ccaggcggcg gtcgcccggc caccgggcag gggatgggcg 28500
tagggcactt ccggcggcag gagccgcacc ccgcgggcag gcaggtcgaa gcggcggaag 28560
aacgccgagg cagcggccat agggtggacg gcgaagcaca cgtcgtgccg gacctcgctg 28620
tcgaacaggg ctgtggtgcg cagtccgccg ccgacggtat cggcccgttc atgaagcgag 28680
gccttgaggc ccgcccgggc cagagtgacg gatgctgcaa gcccgttggg cccggtgccc 28740
acgatcgcag cgtctgtcat gccgtacctc ccctgtggtg tgcctgcggg cggaggccgc 28800
tcatgtgtcc ttggcataga cgctcacgaa catcgcgccc tgcggtgtgt agggggccga 28860
gccccagtcg gagtagcggc cttcaggcct gagcccggcg gcggcggcgt aggcgtcaac 28920
ctcctccggt gaggtgaggc ggctcctctc cgtgccgacc cgcgaagtgc cgtccgcctc 28980
gaaccagatg tgcgagcagt gccacaggtc tccctcggga accagcgtcg agtgggtctg 29040
caggccggtg ccggggtcgg ggtagggaac gaagtacgtc atccgttgct gctctccgtg 29100
ccaggcgagg atggcgggct tgttgtgtgt ctccaccacc aggacgcccc cgggcgccag 29160
ccgttcggcc gcccggctga cggcctgttg ctgatcgacg ggggcgagca gcatggacag 29220
ggtgctgcac acgcagtaga caagcccgta ctgacgatcg tcggtatagg tgcggatgtc 29280
accgcgcacc cctttgacgt cgacaccggc ggccgccacg tccttctcga gctccgagag 29340
catctccggc gaggagtcca ctcccacgac ttcgccggtc tcacgggcaa gcgggatggc 29400
gatacgcccc gttcccacgc ccatctccag cgcgccgagc ccgtggttcg gatgcaggga 29460
cgccagcttc ttcgccgtaa ggtcggcctg gccgcccttg gggaagatcc ggtcgtacca 29520
tccggcgaac tgacgtccgt aagtgctgtc gtccttactc gtcatgcctg cttctctctc 29580
aggacgtcag gaggcgtccg cgaagcccgc gcggtggccg gccgtcggac gcgcggatgg 29640
gcgggggcgg aggccgggtc catcggcgtg tacggcgaga cgaagagcgt ctcagggctg 29700
gtggcggtgt cgtgcccggt ggcgcgtgcc agagcgaatt ggaacccttt gacgtcaccc 29760
gaggcttcgg tgcgcacctg ctcccgcaga tacgcggaga tcacgcgata ggcgggcgtg 29820
cggaccacct cggagaaacc cttgtcgagt gtctggagca tctcactgca gatatcgaac 29880
accgcctttt ctgttcgtcg gccgggaaac cagaagatct ggtgggcccg ccgcgcggct 29940
atcaggtcga ccgcatgcca gggagatgtc tcgcctgcgg tgttcagtgt gcggtaaaag 30000
aagtggtaat catgcaccgc cggttccggg gcgaagaatc gccagttcgg cagcagcgca 30060
aacgtgtcct tgcgctgaag ccggttgaac accgggtgcg ggtgctgcac ggcgagcgtg 30120
ccgatcagca tggccgcgcc cgcgatccgc gcgatcgcgc cagggccggt gaaggcgttc 30180
ttcacatgct gattcattgc gccgcctctt ccgtgatacg cgcctgcggt atgtcagccg 30240
cctggatctc gttgaggaac acgatgatcc ggcggcccac ctcattggcg aactgggcgc 30300
cggtcagcag cgagtcatgg tcggcgccct ccagcacgaa gctcttcacc tggtgtccct 30360
gagcccggtg cagctcggcc agttcgttgt gcatcaacag ctgctccggg tcccggtcga 30420
cagtctgctg cgcagagatc accagggcat gtgctgggat ggcgggcagc ttcccgtcga 30480
agccccggaa ctcctgctcg acggcggccc attcacggcg gccggcctcc cacagacgtg 30540
cgtcggcgta ctgggcgaac acacgggtac ggcagtcggc gggcagcgcc ctcacccagg 30600
ccggccgcgc caggagaacg ccgagtccgg cattgaggga ccgcaccatg gtgttcatca 30660
caagcacaag gtccttggcg gacctgctct gttgacttga ccggttcagt tcgtcggggt 30720
gagaggagtc gagatagacg atgccctgca gccggtctcc cagatctgcg gcggcccggc 30780
gggccagttc accaccgagt gagtggccga ccaacacgac cttgcggcct tccgggacaa 30840
cagcgcgggt caggtcaacc aggtcgtcga ccgattccgc cagccgatag ggagaggtgc 30900
tccggtactt gctgccggcg tatccggctc gggcgtaggt gaccacgccg tacccgcttt 30960
cccgggtgat cttctccgtg atccacgcga agtgctctga cgtggagacc aggccgttga 31020
caaagacgag gaccgggagt gcggggtcgt caccctcgct gcgctcgtac tggagctcgt 31080
tttggtgacg tgtgctcagg tacttggagt gcgaccagcc ttccagtgtg cgcatgcgcc 31140
gctggaccgt gaccacaccg gccatggtca ccgctccggc cagcagcgcc acgccggtgt 31200
acagggcgcg gtcgtcacgg tcggccaccg acggatgtgt cctcggggcc gaggtgtagg 31260
cgaccatcgg gtgcatggag gtgaaggcgc tgacgaaccg gcccagcccc atgaccaagc 31320
cgttcgcgac gtggaacgcg cctgcgccgg cgatcagcgg tcgagcgagt tttccgccgg 31380
ccaggtaggc ggccgggaag aggcactcaa gcgcgagaac cccgtgcgtg aggcacttcg 31440
cggccttcgg atgttttttc gccagctgga agacgggctc atgaccatac gtcgccgttc 31500
gcatgatgcc gctgagcgcg gaggcgtctc tccaggacct gctcagcagt ttcatccaac 31560
cggacacgac atacgaagtg ttggcctgca gcgccacgta ccaaagcagg gcgtcctggg 31620
tctgcggacg agtggacaga cgagccatgc cagtggccgt ctgtaccagg accgatacct 31680
ggtcggagcc gtcggtgccg taccggtgtc gcgcatggag tgcggcggtg gtgacaccga 31740
ggaagaggtt gcccgccccc cgccaacgtg cctggcccgg cagcagcaag cccacgctga 31800
ccgcggcccg ggccacatgc accgccgccg tggtcttctc cccgctgatg acgtcgagga 31860
acttgcgggt caaggggggg ccgtacccgt cacgggccat cgtgggccag tggtggagtc 31920
cgccccgcct ggtctgcctt cgctgggtca ggtactccag agaggaggtc agcgaggtga 31980
tggcggacag gcgctccgag actccgatgg cttgcgaccg ggtcacagag atcggtccgg 32040
ttagcgcgga tacgattctg gtaggcagtt tcacgagtgc ttcctcgcct tttaccggtg 32100
aactggtctt cgttccgtga tcggggccgc ccgccctgcc gggccagaaa ctggtctctg 32160
gcccggcagg gttctggtct gcttatacgg agtcgagagc gttggccccg cgaatcattt 32220
cccgatggcg aacgtgaggc agaagttcgc cgacggggtc gccagcacgg tgacgccctg 32280
ctctaccagt acggtgccct cggcccgctc gatcgggacc tcgtcgaggc gggcatcttc 32340
gtttatgaag gcggtgatgt cgttcacgat ttccccttgg ggtagtcgga gagtttggaa 32400
tttcggctga accagacgtc agccgacgac atggccagcg aagacaggtt atacgtaaca 32460
ctcggccggg cggatcgttt cgggccactc tcgccgtcgg ccaagccgtg cgttggtgca 32520
cccattcgtc ctgatgtggc atgctgacgg tcaatgggtg cttcggtagc cgcggctcgt 32580
aggttgatca gcttgttcca tcctcgcggt gcccgccgag tgaccacggg ccgccaagta 32640
cggcctcatg tcgtcgacgg ctgaaagtgg ttctgacctc ggttccggat actgttgatg 32700
atcttcgctc ggtcgggtga aggggcgggt tctgcgcgat gacgataggc ccggggctgg 32760
gttgggtgat tccgccgctg acggtgggga gtgtctggtc aacagcggat ccggccctgt 32820
ggcccgcccc cgacgagcgc cgttcccgcg cgcttgaata gcaggaagtc cggcaaacag 32880
cggggacggg tgccttgcga tcaggcgttg gacctagggc tcgcagagaa tcaaggtgtt 32940
gcttcccgtg atgcggctcg ttgttgtgat atgagcgtct ccgacgaggt ccgtggccaa 33000
ctcgctgtga agttcgaggt gttattccca catctggatg agcggcagcg gcggctgctg 33060
atgggggccg aggcccggat cctgggccac ggcggtgtcc gggcggccgc acgcgcggcc 33120
cggggcagcg aggcgacggt ccgcaagggt gtggaggagc tggaggccgg cgagggacct 33180
ctgggccggg tacgcaggcc cggtggtggc cggaagaggt ccgcagatct tgatccgggt 33240
cttcggcccg cgttgctggc cctggtggaa cctgatgagc ggggtgatcc gatgtcgccg 33300
ctgcgctgga cggggaagtc gaccaggagc ctggcggccg tgctcacccg tcaggggcac 33360
cgggtgagcg cggacacggt cggggacctg ctgcgggagg agggcttcag cctgcaggcc 33420
ggtgccaaga ccctcgaggg caggcagcac ccggaccgcg acgcccagtt ccgctacatc 33480
aacgagcagg ccagaaagca catggacgcc ggtcagccgg tgatcagcgt ggatgcgaag 33540
aagaaggaac tggtcggcga ctacaagaac gcgggccgcc agtggcggcc ggccggtgag 33600
ccggccctgg tcaagacaca cgacttcctg gaccggcacg ggccgggcaa ggcgataccc 33660
tacgggatct atgacaccgc cgcgttcgcc gtcgcctcca tccgccgctg gtggcaggcc 33720
cggggccggc acgactacca ggccgccacc cgtctgctga tcacagcaga cgcaggcggg 33780
tccaacggct accgcacccg cgcctggaag accgaactcg ccgccctggc cgccgagacg 33840
ggcctggacg tcacggtctg ccacatgccg ccgggcacat cgaagtggaa caagatcgag 33900
caccggctgt tctgccacat ctccatgaac tggcgcggtc gcccgctgaa cagccacgac 33960
gtcatcgtga acagcatcgc ggcaaccacc acccgcaccg ggctgaccgt ccacgccgaa 34020
ctcgacccgg gtacctacga caccggcatc aagatcaccg acagcgatat cggcaccctg 34080
cccgtgcacc ggcaccgctt ccacggcgac tggaactaca ccctccaccc cttacctcgc 34140
gacaccacca gcgcagacgc gaccccacag ccggccaacg gcccgtcccc gcagccccgc 34200
accgggtccc tacgcaaccc ggagctgacc ggcatgcccg aagcgacgct ggacgaactc 34260
atcggccaac tcgccccggc cctcgatgag ttacgcgagc aaggacggct ccggcagcga 34320
gggggtgaac gcatccgtgc tcgcggcgct ggagccaagg acaagctgac tactgccgac 34380
agagtcctgg ccaccgtgct ctacctgcgc aaacttggca cccgcgatct cctcgcccaa 34440
ctcttcggag tcaacagggg cggcacccag cgccgtacac gggaaagacg agacggagtt 34500
ggacacggcg tgaccaccga gacccaggac ctctacgtcg aggaccacga tctcctcatc 34560
gaccccatgc tcgaaggcgt catggagttc cccgaatact ccaccatcac caacagcctc 34620
atcggcttcc acaaagactt cggcaccatc ctgaccggct gcgcccgcgg ctacgcccgc 34680
gtcaccgtcg aaacccacac cgaacagccc ccgctggaca ccggcggctg ggacgacgtc 34740
ctggacctct ccgccgacct cacccgaggc cgcgccttcc tcgccagcta cgacaaggcg 34800
ctcaccacca acctcgcctt cgacggcccc ggaacctacc ggatccgcat ccacgcccgg 34860
ggccgcgccg acgtgaaaga cgccgccctg ccccgaaaac cccgccccac caagaccagc 34920
cctccgcccg agacgtacct cgtgcaggtg tggaaggccc ccaccgcccc cgagcagatc 34980
cacaagagca gcacccaccc caccctgccg gcagccgact acaccggacc cggcacctcg 35040
ccctaccagc cggccgtcga cgcccccgca atccactccg aaggcggcgg gatcgtcctg 35100
gtccgcacct gcttcaccga cccccaggcc tggcaccacc tggtggactt catcgaccac 35160
ggcggcgaag acggcaccgc catcgacgtc acccccatcg acaaccccgc ctacgacggg 35220
ctgaccgaag accagctgcg caccctgatc gaccgtgacg aggacgactg gccccaccac 35280
agcgtcctcc tcgtcgccga ccagcaggcc ctcacctcac ccgacctgac cctcctcgcc 35340
gtcgacaacc cacccggtga ccccgcaccg tccttccgta tcccaaagca gcacctggaa 35400
agcttcgtga tcaacatgga cctggggaac accgacttcc tcgagtggtc ccgcgacgcg 35460
gatgacgacg gcatctaccg agaaaccccc ggcgacgcat aaaccacgac ccccggcccg 35520
cagccggacc ggcactccgc cgaactggcc ggcgccccgc tgccgcgggg gcctgctgcc 35580
cggcatcacc gcgcgggcgc acaccagacc ggcgccggtg ccgcccggcc cactgagctg 35640
agacaagtcg acgcccgccg cacccggctg ggtgaattca tcgtgccgac ggaaccctcg 35700
cggggaccgc aacgtggtgt tgtacgtgaa cccgatcaac gagacgcacg tgagggagcc 35760
ggggctcgtc gtggtcgaca tcgcggccgc cgacgacgca accgccttcg ctttccagca 35820
gatgctcgcc ggccggtggg cgacagcgac agcgcagcgc acgacgcgcg accccggaca 35880
gcccggcgtt cggctgcgct gctacgtcga cctgcaccag gaacccaacc gcccgtagtt 35940
gagacggcca cggcctgaga cggctcatac cgaggtgcca tcctcgccaa acagcggtac 36000
ctgcgccacg tcgatatccg tgcggatgcg gtgcggccgg acggcatgcc tccaccgccc 36060
cgcgctgtcc cgcgcgacgt cgacggccac ctccatcacc acgtcgggct ccaccaggag 36120
gacatcaagc gcgcgctgtg ttccccagcc cgccgagaac gtccacccct gccacggatg 36180
cgcacccatc ggcggggtca gcaggtcggc cagggcgcgg ccggcggcct gggacagtgt 36240
agtgctgcgc ccggtgtact gcaggcggcc ctgcgcgtcg tagcggccca gcagcaccgt 36300
ccggggcgcc gtgagggaac cggtgacggc accaacgatg gcttcggtgg tggcccgcac 36360
cttgtacttc cgccatgatc gagagcccct gacgtagggc tcgtccagcc gtttgaagca 36420
cagtccctcc agtccggccg ccgtccactc cagccacccc ctcgctacgg cggggtcggt 36480
ggtcgacggg cacagcgtca gcggcgctgc caaaccgtgg tcggcgaaca gcgcctccag 36540
cgccgcgcgc cgctgcgcat acggccagcc ggtcacattc cggcccgcgt gcaccagatc 36600
gaagaccacg tagtgcgccg gccattgccg ggccgcatcg actgctccgg ttccgcgcag 36660
ggcgaggcgc tgctggagcc gctcgaacgc gagccggccg gactcccaca ccacgagctc 36720
gccgtccagg ccggtgtccg cgggcagctg cgcgagggcc gcctcccgga tctcgggaaa 36780
cgccgaggtc atgtccgtcc cctgcctcga gcgcagcagc acccggccgc ccgcgtgcac 36840
ggcgagctgg gcgcggaagc cgtcccactt cggctcggcg gcccacccgg cggacagagc 36900
ggggctgtcc acggccacgg tcagcatcgg ctcgggcaag gaccaggtca tgggtgatgc 36960
cttccccgcc ggccgccagg agtggtgacc gtcaccctcg cccggtaggg cccggcccct 37020
tcgggcgatg gtgtcggtac cgcctccgcg agaccatgct gggcgaccga cagcgtcgca 37080
gcgtcatgta cggccacccc cggtcgaccg ggggaaaggg tttctgcccg gcgctagcgg 37140
cgggctccgt acggacgtgg ttaccctcgg ccatgcgcac gacccgcctc cggaacgctg 37200
gtagcccggg cgagtatggc ggtgcggcgg acgcacttca gggcggtcct ctgcgccgag 37260
tcgtcagtct cacagagccc gccaccgggt cccgccgcgc gcgggggagt gcaaaggtgg 37320
gccgccagga ggtctcgacc tcctggcggc ccaccaggcc cgaagcactg acgactgcgg 37380
gttttcgggc tgttcggcgc tacggtgttg ctgtcgcggt caggtcgcac gcggtgcggg 37440
cccactcttt gtagggttcc acgcggtcgg tgagcaccgc gtccacgccg gtggccttcg 37500
ccttttccca cgccggcggc cagttcgccg tccagacata gaccttcagc ccggcggagt 37560
gccagttccg gaccgtaccg ggggtggcga ggaaccggga gacgttcaac gccgtgccgt 37620
accgcgccgc ttccttgccc gtgatcggcg tcttctcgtg ggtgagcgcg gtctgcagtt 37680
cgggcgcggc ccggcgtacg gcgtcgaccg catccgcgtt gaagctgtgc acgatgacgt 37740
<210> 2
<211> 302
<212> PRT
<213> Streptomyces antibioticus NRRL 3238
<400> 2
Met Leu Ser Leu Ala Leu Pro Lys Gly Ser Ser Leu Glu Gln Arg Thr
1 5 10 15
Leu Gly Leu Phe Ala Ala Ala Gly Leu Gln Val Thr Arg Pro Ser Glu
20 25 30
Arg Ala Tyr Arg Gly Thr Ile Thr Tyr Gly Gly Pro Ile Arg Val Ala
35 40 45
Phe Phe Lys Pro Arg Glu Ile Pro Leu Val Val Ala Ala Gly Val Phe
50 55 60
Asp Ala Gly Leu Thr Gly Ala Asp Trp Ile Glu Glu Thr Gly Ala Lys
65 70 75 80
Val Glu Ser Val Val Ser Phe Thr Tyr Ser Lys Thr Thr Asp Ser Pro
85 90 95
Trp Arg Val Val Leu Ala Val Pro Ala Asp Asp Ala Ala Arg Thr Val
100 105 110
Gln Asp Leu Gly Pro Gly Thr Arg Ile Ala Thr Glu Tyr Pro Thr Ile
115 120 125
Ala Arg Arg Phe Leu Gln Asp Glu Gly Ile Gln Ala Glu Val Val His
130 135 140
Ser Tyr Gly Ala Thr Glu Ala Lys Ile Pro Glu Leu Ala Asp Ala Ile
145 150 155 160
Val Asp Val Val Glu Thr Gly Ser Ser Leu His His Asn Gly Leu Arg
165 170 175
Ile Ile Thr Thr Ile Arg Thr Cys Ala Pro Trp Leu Ile Ala Ser Pro
180 185 190
Glu Ala Trp Cys Asp Ala Asp Arg Leu Arg Arg Ile Gln Gly Met Ala
195 200 205
Arg Leu Leu Asp Ala Ala His Ala Gln Ser Ala Gln Ala Leu Leu Thr
210 215 220
Val Arg Val Pro Thr His Cys Leu Asp Arg Val Val Arg Ser Met Pro
225 230 235 240
Glu Arg Ser Trp Arg Val Gly Ala Asp Leu His His Thr Asp Leu Val
245 250 255
Ile Val Gln Gly Leu Ala Gln Arg Ala Gly Leu Pro Asp Val Ile Asp
260 265 270
Arg Leu Leu Gly Ala Gly Ala Ile Glu Val Thr Gln Thr Asp Gly Gly
275 280 285
Met Thr Thr Pro Ala Phe Pro Ser Asp Thr His Val Arg Ala
290 295 300
<210> 3
<211> 234
<212> PRT
<213> Streptomyces antibioticus NRRL 3238
<400> 3
Met Ala Ser Ala Asp His Pro Val Ala Leu Leu Thr Gly Thr Asn Arg
1 5 10 15
Gly Ser Gly Arg Ser Ile Ala Arg Glu Leu His Ala Arg Gly Tyr Arg
20 25 30
Ile Phe Ser Leu Asn Arg Thr Leu Thr Gly Glu Glu Trp Leu His Glu
35 40 45
Glu Arg Cys Asp Leu Ala Asp Pro Glu Gln Ile Arg Gly Gly Val Ala
50 55 60
Arg Val Leu Ala Thr Ala Gly Arg Leu Asn Val Cys Val Ser Asn Ala
65 70 75 80
Val Asp Arg Val Leu Asp Pro Ile Ala Asp Met Arg Trp Glu Asp Trp
85 90 95
Asp Arg Ser Leu Ala Val Asn Leu Ser Ala Asn Phe His Leu Thr Gln
100 105 110
Ala Val Leu Pro Ala Leu Arg Ser Gly Asp Gly Leu Ile Val Phe Met
115 120 125
Gly Ser His Ala Ala Thr Arg Tyr Phe Glu Gly Gly Ala Ala Tyr Ser
130 135 140
Ala Ala Lys Ala Ala Leu Ser Ala Phe Val Glu Thr Leu Leu Met Glu
145 150 155 160
Glu Arg Asn Asn Gly Val Arg Ala Cys Leu Val Ser Pro Gly Ala Ile
165 170 175
Ala Asn Leu Asp Gly Asp Val Asp Pro His Lys Met Thr Thr His Ala
180 185 190
Val Ala Lys Ala Val Val Ser Ile Ile Ala Asp Phe Pro Arg Asp Leu
195 200 205
Leu Val Gly Glu Met Glu Ile Arg Pro Ala Ala Leu Pro Glu Arg Pro
210 215 220
Val Thr Gly Ile Asp Arg Leu Leu His Val
225 230
<210> 4
<211> 247
<212> PRT
<213> Streptomyces antibioticus NRRL 3238
<400> 4
Met Lys Gln Ser Val Glu Ala Asp Asn Ser Pro Ile Glu Asp Leu Leu
1 5 10 15
Ser Trp Lys Arg Gly Arg Pro Ala Asp Ile Glu Gly Arg Ser Lys Lys
20 25 30
Leu Trp Leu Leu Pro Asp Gly Leu Cys Leu Ile Glu Ile Ile Pro Ser
35 40 45
Leu Arg Ser Phe Thr Tyr Asp Arg Asp Glu Leu Val Glu Glu Thr Gly
50 55 60
Pro Leu Arg Leu Asp Phe Tyr Glu Arg Ala Ala Ala Lys Leu Ala Asp
65 70 75 80
Ala Gly Ile Arg Thr Ala Phe Ser Arg Arg Ile Ser Ala Thr Cys Tyr
85 90 95
Val Ala Glu Tyr His Pro Ala Pro Pro Phe Glu Val Val Val Lys Asn
100 105 110
Arg Ala Val Gly Ser Thr Leu Val Lys Tyr Pro Gly Leu Phe Gln Glu
115 120 125
Asn Gln Pro Leu Pro Ala Pro Val Val Lys Phe Asp Tyr Arg Val Asp
130 135 140
Pro Glu Asp Gln Pro Ile Gly Glu Asp Tyr Leu Arg Ala Leu Gly Leu
145 150 155 160
Pro Val Glu Glu Leu Arg Arg Gln Ala Leu Ala Val Asn Thr Thr Leu
165 170 175
Arg Asp Trp Leu His Pro Val Glu Leu Trp Asp Phe Cys Leu Ile Phe
180 185 190
Gly Phe Ser Asp Asp His Glu Pro Val Leu Ile Ser Glu Val Ser Gln
195 200 205
Asp Cys Met Arg Leu Arg His Ser Asp Gly Ser Pro Leu Asp Lys Asp
210 215 220
Leu Phe Arg Arg Gly Ala Ser Gln Glu Thr Ile Thr Ser Gln Trp Arg
225 230 235 240
Arg Leu Phe Asp Gly Leu Gly
245
<210> 5
<211> 396
<212> PRT
<213> Streptomyces antibioticus NRRL 3238
<400> 5
Met Val Arg Met Pro Thr Met Glu Trp Val Ala Gly Ser Cys Arg Leu
1 5 10 15
Leu Ala Ser Thr Ala Ala Asp Phe Glu Arg Asp Arg Pro Phe Glu Gly
20 25 30
Leu Arg Ile Gly Ala Ala Ile His Leu Glu Pro Lys Thr Ala Thr Leu
35 40 45
Leu Met Val Leu Ala Arg Gly Gly Ala Asp Val Val Ala Thr Gly Asn
50 55 60
Leu Gly Thr Ser Gln Gly Ala Thr Leu Thr Phe Leu Arg Glu Gln Gly
65 70 75 80
Ile Thr Val Ile Gly Asp Arg Thr Arg Asp Pro Gln Ala Gln Asp Ala
85 90 95
Cys Leu Arg Gln Val Leu Ala Thr Arg Pro Asp Leu Leu Leu Asp Asn
100 105 110
Gly Gly Asp Leu Phe Leu Arg Tyr Leu Asp Ala Pro Tyr Glu Gly Leu
115 120 125
Arg Gly Gly Thr Glu Glu Thr Thr Ser Gly Arg Ala Gln Leu Met Pro
130 135 140
Val Arg Asp Arg Ile Lys Arg Pro Val Leu Val Ile Asn Asp Ser Pro
145 150 155 160
Ile Lys Gln Phe Ala Glu Asn Thr His Ala Val Gly Gln Ser Val Leu
165 170 175
Glu Ser Phe Leu Arg Ile Thr Asn Arg Ala Thr Asn Gly Arg Arg Val
180 185 190
Thr Val Val Gly Tyr Gly Ala Cys Gly Arg Gly Ile Ala Lys Asn Phe
195 200 205
Ala His Ala His Ala Cys Val Ala Val Trp Asp Val Asp Pro Val Arg
210 215 220
Arg Leu Glu Ala Leu Phe Asp Gly Tyr Ala Val Pro Gly Arg Pro Glu
225 230 235 240
Ala Leu Ala Ser Ala Asp Ile Ile Val Thr Ser Thr Gly His Pro Gly
245 250 255
Ile Ile Thr Ala Asp Asp Leu Tyr Leu Leu Ser Asp Gly Val Ile Leu
260 265 270
Val Asn Ala Gly His Leu Pro Trp Glu Ile Asp Val Pro Gly Leu Leu
275 280 285
Ala His Pro Lys Val Leu Thr Cys Thr Glu Pro Ala Glu Gly Leu Gln
290 295 300
Thr Leu Thr Thr Thr Ser Gly Ala Arg Val Asn Ile Leu Thr Glu Gly
305 310 315 320
His Met Val Asn Leu Asn Gly Pro Arg Pro Leu Gly Asn Ser Val Glu
325 330 335
Ser Met Asp Leu Gly Phe Thr Leu Gln Ala Arg Cys Leu Glu Ala Ile
340 345 350
Ala Thr Gly Arg Val Pro Ala Asp Gln Cys Val Val Pro Val Pro Pro
355 360 365
Glu Ile Asp Ala Arg Val Ala Gln Ala Tyr Val Asp Leu Ala Gly Glu
370 375 380
Lys Arg Pro Met Asn Ala Tyr Pro Ala Glu Glu Gln
385 390 395
<210> 6
<211> 401
<212> PRT
<213> Streptomyces antibioticus NRRL 3238
<400> 6
Met Ser Tyr Val Ser Leu Leu Arg Ser Pro His Ala Ala Arg Leu Leu
1 5 10 15
Leu Gly Thr Leu Val Gly Arg Leu Pro Ser Ala Met Ala Ala Val Ala
20 25 30
Ile Pro Leu Ala Leu Arg Asp Ala Gly Ala Pro Tyr Gly Phe Ile Gly
35 40 45
Leu Ala Val Gly Ala Phe Ala Ile Ala Ala Ala Val Ala Gly Pro Leu
50 55 60
Leu Gly Arg Leu Val Asp His Ile Gly Gln Pro Leu Val Leu Leu Gly
65 70 75 80
Thr Ala Val Leu Ala Gly Cys Gly Phe Val Val Ile Ala Ala Ala Val
85 90 95
Asp Gln Gln Ala Gly Val Leu Val Gly Ala Ala Ile Ala Gly Ala Ala
100 105 110
Thr Pro Pro Leu Glu Pro Cys Leu Arg Ala Leu Trp Pro Glu Ile Val
115 120 125
Asp Ala Glu Glu Leu Glu Ser Ala Tyr Ala Leu Asp Ser Ala Ser Gln
130 135 140
Gln Leu Ile Phe Val Gly Gly Pro Leu Ile Val Ala Gly Cys Val Ser
145 150 155 160
Val Ala Ser Pro Val Gly Ala Leu Trp Ala Ala Ala Leu Leu Gly Leu
165 170 175
Thr Gly Val Leu Val Val Ala Thr Ala Ala Pro Ala Arg Ala Trp Arg
180 185 190
Ala Pro Ala Arg Gln Ala Asp Trp Leu Gly Pro Met Arg Ser Arg Ser
195 200 205
Leu Val Val Leu Leu Ile Ser Leu Thr Gly Val Gly Val Ala Ile Gly
210 215 220
Thr Leu Asn Val Val Val Val Ala Tyr Ala Glu Glu His Arg Leu Pro
225 230 235 240
Gly Gly Ala Pro Thr Leu Leu Thr Leu Asn Ala Phe Gly Ala Leu Ile
245 250 255
Gly Gly Leu Val Tyr Gly Ala Val His Arg Trp Pro Val Pro Pro Ala
260 265 270
Arg Arg Thr Leu Leu Leu Ala Val Gly Leu Ala Val Ser Tyr Ala Leu
275 280 285
Leu Cys Leu Leu Pro Ala Pro Pro Leu Met Ala Cys Leu Met Leu Leu
290 295 300
Thr Gly Leu Phe Leu Ala Pro Thr Leu Thr Val Ser Phe Val Leu Val
305 310 315 320
Gly Glu Leu Ala Pro Thr Gly Thr Val Thr Glu Ala Phe Ala Trp Leu
325 330 335
Val Thr Leu Met Thr Ser Gly Ser Ala Leu Gly Ser Ala Ala Val Gly
340 345 350
Leu Val Leu Glu His Arg Gly Pro Thr Trp Ala Ala Ala Cys Gly Val
355 360 365
Leu Gly Leu Thr Met Ser Val Leu Ile Leu Leu Ala Gly Gln Ser Arg
370 375 380
Leu Asp Ser Asp Gln Arg Thr Glu Thr Lys Ala Ser Ala Val Pro Ala
385 390 395 400
Ala
<210> 7
<211> 358
<212> PRT
<213> Streptomyces antibioticus NRRL 3238
<400> 7
Met Val Thr Leu Pro Asp Arg Val Arg Ala His Val Leu Ala Asp Phe
1 5 10 15
Ala Thr Ala Asp Pro Ala His Asp Ile His His Leu Asp Arg Val Ala
20 25 30
Ala Leu Ala Gly Asp Ile Ala Val Leu Leu Gly Ala Asp Pro Gln Thr
35 40 45
Ala Gln Val Ala Ala Tyr Val His Asp Tyr His Arg Val Glu Glu Ala
50 55 60
Arg Gln Gly Arg Arg Pro Ile Arg Pro Glu Glu Ala Arg Ser Ala Val
65 70 75 80
Leu Asp Val Leu Glu Arg Ser Glu Val Pro Glu Lys Leu His Gly Thr
85 90 95
Ile Leu Arg Ala Val Glu Leu Thr Gly Arg Tyr Arg Phe Gly Gly Asp
100 105 110
Glu Leu Asp Gly Glu Asp Leu Ile Ala Ala Ala Val His Asp Ala Asp
115 120 125
Asn Leu Asp Ala Met Gly Ala Val Gly Val Gly Arg Ala Phe Ala Phe
130 135 140
Gly Gly Leu Leu Gly Glu Pro Leu Trp Glu Pro Ala Ala Gly Leu Lys
145 150 155 160
Glu Leu Tyr Thr Glu Gly Glu Thr Ser Ser Val Leu Ala His Leu Tyr
165 170 175
Glu Lys Leu Val His Leu Glu Lys Asp Met Leu Thr Glu Pro Ala Arg
180 185 190
Arg Leu Ala Ala Glu Arg Ala Phe Gln Leu His Arg Phe Ala Ala Glu
195 200 205
Phe Arg Ser Gln Trp Gly Glu Glu Asp Val Val Ser His Ser Gly Gly
210 215 220
Thr Arg Val His Trp Asp Pro Leu Thr Arg Phe Leu Ala Val Thr Gln
225 230 235 240
Pro Glu Pro Asp Gly Ile Thr Val Thr His Ile Gly Phe Arg Gly Gln
245 250 255
Ala Met Leu Ala Phe Asp Glu Gln Gly Gln Pro Val Gly Val Asp Leu
260 265 270
Leu Gly Ala Pro Glu Ala Leu Thr His Cys Val Pro His Ala Gln Arg
275 280 285
Ser Arg Ala Trp Val Ala Asp Ala Ala Gly Gly Trp Leu Leu Asp Ala
290 295 300
Glu Ala Asp Val Val Trp Ile Ser Ile Ser Glu Ala Pro Val Arg Arg
305 310 315 320
Arg Leu Thr Ala Val Gly Asp Ile Glu Val Gln Leu Arg Glu Gly Lys
325 330 335
Leu Ala Thr Leu Arg Met His Leu Thr Glu Glu Ala Pro Val Ser Val
340 345 350
Gly Gly Glu Gly Ala Pro
355
<210> 8
<211> 387
<212> PRT
<213> Streptomyces antibioticus NRRL 3238
<400> 8
Met Thr Asn Met Leu Ser Pro Ala Ala Pro Ala Thr Arg Leu Gly Thr
1 5 10 15
Gly His Ala Ala Arg Ala Gly Thr Leu Pro Val Leu Gln Ser Ala Gln
20 25 30
Ala Arg Trp Ser Met Pro Pro Ala Arg Leu Val Leu Val Thr His Leu
35 40 45
Leu Asp Thr Ala Ile Pro Phe Val Arg Leu Phe Glu Arg Cys Met Asp
50 55 60
Leu Val Arg Val Val Pro Val Pro Tyr Ser Ala Gln Pro Glu Ala Leu
65 70 75 80
Ala Arg Leu Asp Asp Leu Pro Ile Thr Val Pro Glu Ser Ile Gly Glu
85 90 95
Val Gly Ala Val Ala Val Arg Asp Ala Glu Arg Ala Ala Arg Glu Ser
100 105 110
Glu Leu Pro Val Val Ile Gln Glu Val Gly Gly Tyr Cys Ala Asp Ala
115 120 125
Val Gly Arg Leu Ala Gln Phe Pro Asn Val Arg Gly Val Val Glu Asp
130 135 140
Thr Lys Gln Gly Gln Trp Arg Tyr Glu Arg Asn Met Pro Leu Pro Leu
145 150 155 160
Pro Val Phe Thr Ile Ala Asp Ser Pro Leu Lys Ala Leu Glu Asp Val
165 170 175
Gln Val Gly Arg Ser Val Ala Tyr Ser Val Glu Arg Leu Leu Arg Leu
180 185 190
Arg Phe Tyr Arg Leu Leu Ser Glu Arg Arg Val Leu Val Leu Gly Tyr
195 200 205
Gly Gly Ile Gly Thr Ala Leu Ala Glu His Leu Arg Arg Thr Gly Ala
210 215 220
Gln Val Ala Val Tyr Asp Pro Asp Glu Val Arg Met Ser Ala Ala Val
225 230 235 240
Val His Gly Phe Arg Val Gly Ala Arg Glu Asp Leu Leu Gly Trp Ala
245 250 255
Glu Ala Ile Val Gly Val Ser Gly His Arg Ala Leu Thr Val Glu Asp
260 265 270
Leu Pro Leu Leu Arg Asp Gly Val Val Leu Ala Ser Gly Ser Ser Lys
275 280 285
Gln Val Glu Phe Asp Val Glu Gly Ile Cys Arg Ser Ala Asp Thr Leu
290 295 300
Val Glu Ala Asp Glu Val Met Glu Leu Gln Val Ala Asn Arg Thr Val
305 310 315 320
Tyr Leu Leu Asn His Gly Lys Pro Val Asn Phe Leu Glu Gln Ser Ile
325 330 335
Leu Gly Ser Val Leu Asp Leu Val Tyr Thr Glu Leu Tyr Leu Cys Thr
340 345 350
Arg Glu Leu Val Gly Arg Val Trp Ser Pro Gly Leu His Arg Leu Asp
355 360 365
Pro Gly Ile Gln Gln Glu Leu Ala Gln Gln Trp Arg Glu Glu Tyr Gly
370 375 380
Arg Gln Trp
385
<210> 9
<211> 213
<212> PRT
<213> Streptomyces antibioticus NRRL 3238
<400> 9
Met Thr Asp Ala Ala Ile Val Gly Thr Gly Pro Asn Gly Leu Ala Ala
1 5 10 15
Ser Val Thr Leu Ala Arg Ala Gly Leu Lys Ala Ser Leu His Glu Arg
20 25 30
Ala Asp Thr Val Gly Gly Gly Leu Arg Thr Thr Ala Leu Phe Asp Ser
35 40 45
Glu Val Arg His Asp Val Cys Phe Ala Val His Pro Met Ala Ala Ala
50 55 60
Ser Ala Phe Phe Arg Arg Phe Asp Leu Pro Ala Arg Gly Val Arg Leu
65 70 75 80
Leu Pro Pro Glu Val Pro Tyr Ala His Pro Leu Pro Gly Gly Arg Ala
85 90 95
Thr Ala Ala Trp Arg Asp Leu Ala Arg Thr Ala Ser Arg Leu Gly Ala
100 105 110
Asp Gly Pro Arg Cys Glu Arg Leu Val Arg Pro Leu Val Ser Arg Ser
115 120 125
Thr Ala Val Val Asp Leu Val Leu Gly Ser Gln Arg Ala Leu Ala Arg
130 135 140
Ala Asp His Pro Ala Gly Val His Leu Thr Gly Ala Gly Pro Arg His
145 150 155 160
Ala Pro Gln Pro Val His Asp Ser Ala Gly Gln Gly Leu Ala Asp Arg
165 170 175
Gly Cys Arg Ala Arg Arg Arg Gln Ala Ser Leu Cys Gly Leu His Gly
180 185 190
Gly Gly Thr Ala Ala Gly Pro Gln Arg Ala Arg Arg His Gly Ala His
195 200 205
Arg Ala Pro Arg Pro
210
<210> 10
<211> 189
<212> PRT
<213> Streptomyces antibioticus NRRL 3238
<400> 10
Met Phe Thr Ser Arg Val Pro Ala His Ala Thr Arg His Asn Pro Phe
1 5 10 15
Thr Thr Pro Gln Ala Arg Ala Leu Leu Thr Gly Val Ala Ala His Ala
20 25 30
Val Gly Lys Leu Pro Ser Val Ala Ser Thr Val Val Ala Leu Leu Leu
35 40 45
Ala His Asn Ala His Gly Gly Thr Val His Thr Gly His His Val Arg
50 55 60
Asp Leu Ala Glu Leu Gly Glu Arg Pro Ala Arg Pro Ala Gly Arg Arg
65 70 75 80
Pro Gln Gly Gly Pro Cys Pro Arg Arg Ala Thr Ala Ala Arg Pro Leu
85 90 95
Pro Ala Gly Leu Arg Ser Leu Pro Val Arg Pro Gly Ala Ala Lys Ala
100 105 110
Asp Phe Leu Val Ser Gly Ala Ile Pro Trp Gly Asp Pro Leu Val Gly
115 120 125
Arg Thr Gly Thr Val His Leu Gly Gly Thr Arg Ala Asp Ile Ile Arg
130 135 140
Arg Glu Thr Ala Thr Ala Ala Gly Glu Arg Val Pro Asp Pro Leu Thr
145 150 155 160
Leu Val Val Asp Pro Ala Val Ala Asp Pro Ser Leu Ala Ser Ala Arg
165 170 175
Gln Thr Ala Arg Val Gly Leu Arg Pro Cys Ala Gln Arg
180 185
<210> 11
<211> 161
<212> PRT
<213> Streptomyces antibioticus NRRL 3238
<400> 11
Met Ser Ala Tyr Ala His Val Pro Asn Gly Asp Thr Cys Glu Pro Val
1 5 10 15
Arg Leu Ile Arg Glu Arg Ile Glu Glu Tyr Ala Leu Gly Phe Ala Asp
20 25 30
Thr Ile Ile Ala Ala Gln Gly Arg Ser Val Ala Arg Phe Glu Glu Tyr
35 40 45
Asp Pro Asn His Val Gly Gly Asp Ile Gly Ala Asp Ala Leu Pro Leu
50 55 60
Arg Gln Ser Val Ala Arg Pro Thr Leu Arg Trp Asp Pro Tyr Ser Thr
65 70 75 80
Pro Leu Arg Gly Leu Tyr Leu Cys Ser Ala Ala Thr Pro Thr Pro Pro
85 90 95
Gly Pro Ser Val His Gly Met Cys Gly Arg Leu Ala Ala Leu Ser Ala
100 105 110
Leu Arg Arg His His Gly Ile Arg Thr Gly Thr Gly Pro Val Pro Ala
115 120 125
Ala Asp Ser Ser Ala Phe Cys Ser Ser Gly Gly Pro Ala Ala Ser Val
130 135 140
Arg Asp Ala Gly Leu Gly Arg Thr Arg Val Arg Pro Arg Pro Ala Ser
145 150 155 160
Gln
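The protein records in this listing alternate rows of three-letter residue codes with rows of position numbers, and each record declares its length in the `<211>` field. As a minimal sketch of how such a record could be read and checked, the Python below parses the SEQ ID NO: 11 record and compares the number of residues found with the declared length. The function name `parse_prt_record` and the constant `RECORD` are illustrative assumptions and are not part of the patent.

```python
import re

# Three-letter to one-letter codes for the 20 standard amino acids.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}

# SEQ ID NO: 11 as it appears in the listing (hypothetical constant name).
RECORD = """
<210> 11
<211> 161
<212> PRT
<213> Streptomyces antibioticus NRRL 3238
<400> 11
Met Ser Ala Tyr Ala His Val Pro Asn Gly Asp Thr Cys Glu Pro Val
1 5 10 15
Arg Leu Ile Arg Glu Arg Ile Glu Glu Tyr Ala Leu Gly Phe Ala Asp
20 25 30
Thr Ile Ile Ala Ala Gln Gly Arg Ser Val Ala Arg Phe Glu Glu Tyr
35 40 45
Asp Pro Asn His Val Gly Gly Asp Ile Gly Ala Asp Ala Leu Pro Leu
50 55 60
Arg Gln Ser Val Ala Arg Pro Thr Leu Arg Trp Asp Pro Tyr Ser Thr
65 70 75 80
Pro Leu Arg Gly Leu Tyr Leu Cys Ser Ala Ala Thr Pro Thr Pro Pro
85 90 95
Gly Pro Ser Val His Gly Met Cys Gly Arg Leu Ala Ala Leu Ser Ala
100 105 110
Leu Arg Arg His His Gly Ile Arg Thr Gly Thr Gly Pro Val Pro Ala
115 120 125
Ala Asp Ser Ser Ala Phe Cys Ser Ser Gly Gly Pro Ala Ala Ser Val
130 135 140
Arg Asp Ala Gly Leu Gly Arg Thr Arg Val Arg Pro Arg Pro Ala Ser
145 150 155 160
Gln
"""

def parse_prt_record(text):
    """Return (seq_id, declared_length, one_letter_sequence) for one record."""
    fields = {}
    residues = []
    for line in text.strip().splitlines():
        line = line.strip()
        tag = re.match(r"<(\d{3})>\s*(.*)", line)
        if tag:
            # Numeric identifier line such as "<211> 161".
            fields[tag.group(1)] = tag.group(2)
            continue
        tokens = line.split()
        # Keep rows made only of residue codes; position rows are skipped.
        if tokens and all(t in THREE_TO_ONE for t in tokens):
            residues.extend(THREE_TO_ONE[t] for t in tokens)
    return int(fields["210"]), int(fields["211"]), "".join(residues)

seq_id, declared, sequence = parse_prt_record(RECORD)
print(seq_id, declared, len(sequence))
```

The same function should work on the other PRT records in the listing, provided they use only the 20 standard residue codes.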
<210> 12
<211> 1071
<212> DNA
<213> Streptomyces antibioticus NRRL 3238
<400> 12
atgaccgcgt cccgcatcga caccgagacc ctccgccggc ttcccaaggc cgtcctgcac 60
gaccacctcg acggcggcct gcgccccgcc accgtggtgg aactcgccgc cgcggtcggc 120
cacaccctgc ccaccaccga ccccgacgag ctggccgcct ggtacgtcga ggccgccaac 180
tccggcgacc tggtccgcta catcgccacc ttcgagcaca ccctcgccgt catgcagacc 240
cgcgagggcc tgctgcgcac cgccgaggag tacgtcctcg acctcgccgc cgacggagtc 300
gtctacgcgg aggtgcgcta cgcccccgag ctgatgctca agggcggact caccctgacc 360
gaggtcgtcg aggccgtcca ggagggcctg gccgccggca tggcgaaggc cgcggcggcc 420
ggcacccccg tccgggtcgg caccctgctg tgcggcatgc gcatgttcga ccgggtccgg 480
gaggccgccg gactggccgt cgcctaccgg gacgccggtg tcgtcggctt cgacatcgcc 540
ggagccgagg acggcttccc gcccgccgac cacctcgacg ccttcgcgta cctgcgcgcc 600
gagagcatgc ccttcaccat ccacgccggc gaggcgtacg gcctgcccag catccaccag 660
gcgctccagg tgtgcggcgc ccagcgcatc ggccacggag tgcgcctgac cgaggacatc 720
gtggacggca agctcggccg gctcgcctcc tgggtgcgcg accgccggat cgccctggag 780
atgtgcccca cctccaacct ccagaccggc tgcgccacct cgatcgccga gcaccccatc 840
accgccctga aggacctggg cttccgggtc accctgaaca ccgacaaccg cctggtgtcg 900
gggacgacga tgacccgtga gatgtccctg ctggtggagc aggccggctg gacggtggag 960
gacctgcgca cggtcaccgt gaacgccctc aagagcgcgt tcgtcccgtt cgacgagcgc 1020
acggccctga tcgaggacgt ggtcctgccg ggttacgccg ccgcgctctg a 1071
<210> 13
<211> 356
<212> PRT
<213> Streptomyces antibioticus NRRL 3238
<400> 13
Met Thr Ala Ser Arg Ile Asp Thr Glu Thr Leu Arg Arg Leu Pro Lys
1 5 10 15
Ala Val Leu His Asp His Leu Asp Gly Gly Leu Arg Pro Ala Thr Val
20 25 30
Val Glu Leu Ala Ala Ala Val Gly His Thr Leu Pro Thr Thr Asp Pro
35 40 45
Asp Glu Leu Ala Ala Trp Tyr Val Glu Ala Ala Asn Ser Gly Asp Leu
50 55 60
Val Arg Tyr Ile Ala Thr Phe Glu His Thr Leu Ala Val Met Gln Thr
65 70 75 80
Arg Glu Gly Leu Leu Arg Thr Ala Glu Glu Tyr Val Leu Asp Leu Ala
85 90 95
Ala Asp Gly Val Val Tyr Ala Glu Val Arg Tyr Ala Pro Glu Leu Met
100 105 110
Leu Lys Gly Gly Leu Thr Leu Thr Glu Val Val Glu Ala Val Gln Glu
115 120 125
Gly Leu Ala Ala Gly Met Ala Lys Ala Ala Ala Ala Gly Thr Pro Val
130 135 140
Arg Val Gly Thr Leu Leu Cys Gly Met Arg Met Phe Asp Arg Val Arg
145 150 155 160
Glu Ala Ala Gly Leu Ala Val Ala Tyr Arg Asp Ala Gly Val Val Gly
165 170 175
Phe Asp Ile Ala Gly Ala Glu Asp Gly Phe Pro Pro Ala Asp His Leu
180 185 190
Asp Ala Phe Ala Tyr Leu Arg Ala Glu Ser Met Pro Phe Thr Ile His
195 200 205
Ala Gly Glu Ala Tyr Gly Leu Pro Ser Ile His Gln Ala Leu Gln Val
210 215 220
Cys Gly Ala Gln Arg Ile Gly His Gly Val Arg Leu Thr Glu Asp Ile
225 230 235 240
Val Asp Gly Lys Leu Gly Arg Leu Ala Ser Trp Val Arg Asp Arg Arg
245 250 255
Ile Ala Leu Glu Met Cys Pro Thr Ser Asn Leu Gln Thr Gly Cys Ala
260 265 270
Thr Ser Ile Ala Glu His Pro Ile Thr Ala Leu Lys Asp Leu Gly Phe
275 280 285
Arg Val Thr Leu Asn Thr Asp Asn Arg Leu Val Ser Gly Thr Thr Met
290 295 300
Thr Arg Glu Met Ser Leu Leu Val Glu Gln Ala Gly Trp Thr Val Glu
305 310 315 320
Asp Leu Arg Thr Val Thr Val Asn Ala Leu Lys Ser Ala Phe Val Pro
325 330 335
Phe Asp Glu Arg Thr Ala Leu Ile Glu Asp Val Val Leu Pro Gly Tyr
340 345 350
Ala Ala Ala Leu
355
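SEQ ID NO: 12 (1071 nucleotides) and SEQ ID NO: 13 (356 amino acids) appear to be a coding sequence and its translation, since 1071 bases give 357 codons, or 356 residues plus a stop codon. The sketch below, written under that assumption, translates the first and last rows of SEQ ID NO: 12 with the standard codon table and lets the result be compared with the start and end of SEQ ID NO: 13. The names `translate`, `FIRST_ROW` and `LAST_ROW` are hypothetical helpers introduced only for this illustration.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (third base fastest).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def translate(dna):
    """Translate a DNA string in frame 1, stopping at the first stop codon."""
    dna = "".join(dna.split()).upper()  # drop the spaces between 10-base groups
    protein = []
    for i in range(0, len(dna) - len(dna) % 3, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)

# Bases 1-60 and 1021-1071 of SEQ ID NO: 12, copied from the listing.
# Base 1021 starts a codon because 1020 is a multiple of 3.
FIRST_ROW = "atgaccgcgt cccgcatcga caccgagacc ctccgccggc ttcccaaggc cgtcctgcac"
LAST_ROW = "acggccctga tcgaggacgt ggtcctgccg ggttacgccg ccgcgctctg a"

print(translate(FIRST_ROW))  # compare with residues 1-20 of SEQ ID NO: 13
print(translate(LAST_ROW))   # compare with residues 341-356 of SEQ ID NO: 13
print((1071 - 3) // 3)       # residues expected from a 1071-base ORF with stop
```

Feeding the complete 1071-base sequence to `translate` would extend the comparison to all 356 residues.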

Claims (6)

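1. A biosynthetic gene cluster for the antibiotics pentostatin and vidarabine, characterized in that the nucleotide sequence of the gene cluster is shown at positions 20237-28760 of SEQ ID NO: 1, and the genes related to pentostatin and vidarabine biosynthesis comprised in the gene cluster are specifically:
genes responsible for pentostatin biosynthesis, namely the three genes penA, penB and penC:
penA is located at bases 26377-27285 of SEQ ID NO: 1, is 909 base pairs long, and encodes an ATP phosphoribosyltransferase of 302 amino acids, with the amino acid sequence of SEQ ID NO: 2;
penB is located at bases 25663-26367 of SEQ ID NO: 1, is 705 base pairs long, and encodes a short-chain dehydrogenase of 234 amino acids, with the amino acid sequence of SEQ ID NO: 3;
penC is located at bases 24933-25676 of SEQ ID NO: 1, is 744 base pairs long, and encodes a SAICAR synthetase of 247 amino acids, with the amino acid sequence of SEQ ID NO: 4;
genes responsible for vidarabine biosynthesis, namely the five genes penD, penG, penH, penI and penJ:
penD is located at bases 23743-24933 of SEQ ID NO: 1, is 1191 base pairs long, and encodes an SAH hydrolase of 396 amino acids, with the amino acid sequence of SEQ ID NO: 5;
penG is located at bases 20237-21400 of SEQ ID NO: 1, is 1164 base pairs long, and encodes an SAH hydrolase of 387 amino acids, with the amino acid sequence of SEQ ID NO: 8;
penH is located at bases 28119-28760 of SEQ ID NO: 1, is 642 base pairs long, and encodes an oxidoreductase of 213 amino acids, with the amino acid sequence of SEQ ID NO: 9;
penI is located at bases 27743-28312 of SEQ ID NO: 1, is 570 base pairs long, and encodes an oxidoreductase of 189 amino acids, with the amino acid sequence of SEQ ID NO: 10;
penJ is located at bases 27289-27774 of SEQ ID NO: 1, is 486 base pairs long, and encodes an oxidoreductase of 161 amino acids, with the amino acid sequence of SEQ ID NO: 11;
genes responsible for the transport and regulation of pentostatin and vidarabine, namely the two genes penE and penF:
penE is located at bases 22467-23672 of SEQ ID NO: 1, is 1206 base pairs long, and encodes a membrane transporter of 401 amino acids, with the amino acid sequence of SEQ ID NO: 6;
penF is located at bases 21394-22470 of SEQ ID NO: 1, is 1077 base pairs long, and encodes a nucleoside phosphorylase of 358 amino acids, with the amino acid sequence of SEQ ID NO: 7.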
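2. A gene penB encoding a cofactor NADP(H)-dependent short-chain dehydrogenase, characterized in that its nucleotide sequence is shown at bases 25663-26367 of SEQ ID NO: 1.
3. The cofactor NADP(H)-dependent short-chain dehydrogenase PenB encoded by the gene penB of claim 2.
4. Use of the cofactor NADP(H)-dependent short-chain dehydrogenase PenB of claim 3 in catalyzing the interconversion of pentostatin and 6-keto-pentostatin.
5. A recombinant vector, expression cassette, transgenic cell line or recombinant bacterium containing the pentostatin and vidarabine biosynthetic gene cluster of claim 1.
6. Use of the recombinant vector, expression cassette, transgenic cell line or recombinant bacterium of claim 5 in the synthesis of pentostatin and/or vidarabine.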
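The coordinates, base-pair lengths and amino-acid lengths given for the ten genes in claim 1 can be cross-checked with simple arithmetic. The sketch below is an illustration only: it assumes 1-based inclusive coordinates and that each stated gene length includes one stop codon, and the names `GENES`, `is_consistent` and `overlaps` are hypothetical. The overlap calculation works on coordinates alone and says nothing about strand or reading frame.

```python
# (gene, start, end, stated length in bp, stated length in aa), from claim 1.
GENES = [
    ("penA", 26377, 27285, 909, 302),
    ("penB", 25663, 26367, 705, 234),
    ("penC", 24933, 25676, 744, 247),
    ("penD", 23743, 24933, 1191, 396),
    ("penE", 22467, 23672, 1206, 401),
    ("penF", 21394, 22470, 1077, 358),
    ("penG", 20237, 21400, 1164, 387),
    ("penH", 28119, 28760, 642, 213),
    ("penI", 27743, 28312, 570, 189),
    ("penJ", 27289, 27774, 486, 161),
]

def is_consistent(gene):
    """Check span, codon divisibility and protein length for one gene."""
    _, start, end, bp, aa = gene
    span = end - start + 1  # inclusive 1-based coordinates
    # Assumption: the span ends with a stop codon, so residues = codons - 1.
    return span == bp and span % 3 == 0 and span // 3 - 1 == aa

def overlaps(genes):
    """Map each pair of neighbouring genes to the number of shared bases."""
    ordered = sorted(genes, key=lambda g: g[1])
    shared = {}
    for left, right in zip(ordered, ordered[1:]):
        n = left[2] - right[1] + 1
        if n > 0:
            shared[(left[0], right[0])] = n
    return shared

for gene in GENES:
    print(gene[0], "consistent" if is_consistent(gene) else "inconsistent")

cluster_start = min(g[1] for g in GENES)
cluster_end = max(g[2] for g in GENES)
print(cluster_start, cluster_end, overlaps(GENES))
```

The smallest start and largest end reproduce the cluster boundaries 20237-28760 stated at the beginning of claim 1.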
CN201611181302.9A 2016-12-20 2016-12-20 Pentostatin and vidarabine biosynthetic gene cluster and use thereof Active CN106701788B (zh)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201611181302.9A CN106701788B (zh) 2016-12-20 2016-12-20 Pentostatin and vidarabine biosynthetic gene cluster and use thereof

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201611181302.9A CN106701788B (zh) 2016-12-20 2016-12-20 Pentostatin and vidarabine biosynthetic gene cluster and use thereof

Publications (2)

Publication Number Publication Date
CN106701788A (zh) 2017-05-24
CN106701788B (zh) 2019-10-25

Family

ID=58938721

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201611181302.9A Active CN106701788B (zh) 2016-12-20 2016-12-20 Pentostatin and vidarabine biosynthetic gene cluster and use thereof

Country Status (1)

Country Link
CN (1) CN106701788B (zh)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
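CN110777155B (zh) * 2019-11-22 2021-08-10 Wuhan University Minimycin biosynthetic gene cluster, recombinant bacterium and use thereof
CN113444724B (zh) * 2021-05-10 2023-03-17 Southwest University Promoter, recombinant vector and use thereof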
CN113354718B (zh) * 2021-06-21 2023-06-02 Chongqing Academy of Animal Sciences Pinensin (哌尼生素) precursor, expression cassette and preparation method thereof

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
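CN102234673A (zh) * 2010-04-29 2011-11-09 Shanghai Institute of Pharmaceutical Industry Fermentation medium and fermentation method for producing pentostatin by fermentation of Streptomyces antibioticus
CN104946707A (zh) * 2014-03-31 2015-09-30 Tianjin Institute of Industrial Biotechnology, Chinese Academy of Sciences Fermentation medium for preparing pentostatin by fermentation and fermentation preparation method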

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
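Fermentation optimization of pentostatin biosynthesis; Li Xiaohui (李晓辉); China Master's Theses Full-text Database, Engineering Science and Technology I; 2015-06-15 (No. 06); pp. B018-25 *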

Also Published As

Publication number Publication date
CN106701788A (zh) 2017-05-24

Similar Documents

Publication Publication Date Title
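KR101659101B1 (ko) Identification and use of bacterial [2Fe-2S] dihydroxy-acid dehydratases
DK2588616T3 (en) PROCEDURE FOR MAKING A RELATIONSHIP OF INTEREST
CN1500146A (zh) Genome of Bifidobacterium
KR20130117753A (ko) Recombinant host cells comprising phosphoketolases
CN106701788B (zh) Pentostatin and vidarabine biosynthetic gene cluster and use thereof
KR20210018219A (ko) Manipulation of genes involved in signal transduction to control fungal morphology during fermentation and production
CN111534493B (zh) Purine nucleoside phosphorylase mutant, gene and use
KR20180093083A (ko) Kelimycin (carrimycin) biosynthetic gene cluster
KR20230111189A (ko) Reprogrammable IscB nucleases and uses thereof
CN101275141A (zh) Biosynthetic gene cluster of azinomycin
KR20200010285A (ko) Genomic engineering of biosynthetic pathways leading to increased NADPH
CN101157929A (zh) Biosynthetic gene cluster of saframycin
KR20200134333A (ko) Engineered biosynthetic pathways for production of histamine by fermentation
CN106676115B (zh) 2'-Chloropentostatin and 2'-amino-2'-deoxyadenosine biosynthetic gene cluster and use thereof
CN109790558A (zh) Method
Chen et al. Twenty years hunting for sulfur in DNA
Jarling et al. Isolation of mak1 from Actinoplanes missouriensis and evidence that Pep2 from Streptomyces coelicolor is a maltokinase
KR101189475B1 (ko) Genes and proteins responsible for the biosynthesis of tricyclic compounds
KR20230012530A (ko) Improved methods for the production of isoprenoids
CN108624544B (zh) Acarbose engineered bacterium, preparation method and use thereof
KR100861771B1 (ko) Valiolone synthase for validamycin biosynthesis and method for preparing same
CN113462704B (zh) Biosynthetic gene cluster of the plant cytokinin angustmycin, biological materials thereof, and use in synthesizing angustmycin
KR102114010B1 (ko) Cellulose- and/or hemicellulose-degrading enzymes from a black rot pathogen (흑부병) and uses thereof
CN110551739A (zh) Pyrazomycin biosynthetic gene cluster, recombinant bacterium and use thereof
US6210935B1 (en) Staurosporin biosynthesis gene clusters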

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant