CN109337904B - Genome editing system and method based on C2c1 nuclease - Google Patents

Publication number
CN109337904B
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201811300251.6A
Other languages
English (en)
Other versions
CN109337904A (zh)
Inventor
李伟
周琪
滕飞
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Institute of Zoology of CAS
Original Assignee
Institute of Zoology of CAS
Application filed by Institute of Zoology of CAS
Priority to CN201811300251.6A (patent CN109337904B)
Priority to CN202011431478.1A (patent CN112961853A)
Priority to JP2021523583A (patent JP7361109B2)
Priority to PCT/CN2018/118458 (patent WO2020087631A1)
Priority to EP18938930.7A (patent EP3929292A4)
Publication of CN109337904A
Application granted
Publication of CN109337904B

Classifications

    • A61P25/16 Anti-Parkinson drugs
    • A61K31/7105 Natural ribonucleic acids, i.e. containing only riboses attached to adenine, guanine, cytosine or uracil and having 3'-5' phosphodiester links
    • A61P25/00 Drugs for disorders of the nervous system
    • A61P25/18 Antipsychotics, i.e. neuroleptics; Drugs for mania or schizophrenia
    • A61P25/28 Drugs for disorders of the nervous system for treating neurodegenerative disorders of the central nervous system, e.g. nootropic agents, cognition enhancers, drugs for treating Alzheimer's disease or other forms of dementia
    • A61P25/34 Tobacco-abuse
    • A61P27/02 Ophthalmic agents
    • A61P29/00 Non-central analgesic, antipyretic or antiinflammatory agents, e.g. antirheumatic agents; Non-steroidal antiinflammatory drugs [NSAID]
    • A61P35/00 Antineoplastic agents
    • A61P9/00 Drugs for disorders of the cardiovascular system
    • C12N15/113 Non-coding nucleic acids modulating the expression of genes, e.g. antisense oligonucleotides; Antisense DNA or RNA; Triplex-forming oligonucleotides; Catalytic nucleic acids, e.g. ribozymes; Nucleic acids used in co-suppression or gene silencing
    • C12N15/85 Vectors or expression systems specially adapted for eukaryotic hosts for animal cells
    • C12N15/907 Stable introduction of foreign DNA into chromosome using homologous recombination in mammalian cells
    • C12N9/22 Ribonucleases (RNases); Deoxyribonucleases (DNases)
    • C12N2310/10 Type of nucleic acid
    • C12N2310/20 Type of nucleic acid involving clustered regularly interspaced short palindromic repeats [CRISPRs]

Abstract

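The present invention relates to the field of genetic engineering. Specifically, the invention relates to genome editing systems and methods based on C2c1 nucleases. The invention also relates to artificial guide RNAs that can be combined with different C2c1 nucleases for genome editing.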

Description

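Genome editing system and method based on C2c1 nuclease
Technical Field
The present invention relates to the field of genetic engineering. Specifically, the invention relates to genome editing systems and methods based on C2c1 nucleases. The invention also relates to artificial guide RNAs that can be combined with different C2c1 nucleases for genome editing.
Background of the Invention
With the advent of the CRISPR-Cas (Clustered Regularly Interspaced Short Palindromic Repeats and CRISPR-associated proteins) system, precise genome editing has become one of the most closely watched fields because of its bright prospects in gene therapy. To date, three types of CRISPR-Cas systems have been successfully harnessed for mammalian genome engineering: type II Cas9 (Cong, L. et al. Science 339, 819-823 (2013); Mali, P. et al. Science 339, 823-826 (2013)), type V-A Cpf1 (Zetsche, B. et al. Cell 163, 759-771 (2015)) and type V-B C2c1. For type II and type V CRISPR-Cas systems, the guide RNA and the Cas effector protein are the two core components of target DNA recognition and cleavage (Wright, A.V., Nunez, J.K. & Doudna, J.A. Cell 164, 29-44 (2016); Shmakov, S. et al. Nat Rev Microbiol 15, 169-182 (2017)). Previous studies have shown that in closely related Cas9 systems (Fonfara, I. et al. Nucleic Acids Res 42, 2577-2590 (2014)) and Cpf1 systems (Zetsche, B. et al. Cell 163, 759-771 (2015)), the dual RNA (crRNA and tracrRNA) and the protein component are interchangeable and can be preliminarily optimized (Nishimasu, H. et al. Cell 156, 935-949 (2014); Zalatan, J.G. et al. Cell 160, 339-350 (2015)). Although many emerging CRISPR-Cas systems and studies have promoted the broad application of CRISPR-Cas systems (Wright, A.V., Nunez, J.K. & Doudna, J.A. Cell 164, 29-44 (2016); Shmakov, S. et al. Nat Rev Microbiol 15, 169-182 (2017)), little is known about how to redesign, or even synthesize de novo, these enzyme-based genome editing systems.
The type V-B CRISPR-C2c1 system is an emerging and promising genetic engineering technology. However, few C2c1 nucleases are available for mammalian genome editing, which greatly limits its application. There remains a need in the art for new C2c1 nuclease-based genome editing systems that can be used for mammalian genome editing.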
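Summary of the Invention
In one aspect, the invention provides a genome editing system for site-directed modification of a target sequence in the genome of a cell, comprising at least one of the following i) to v):
i) a C2c1 protein or a variant thereof, and a guide RNA;
ii) an expression construct comprising a nucleotide sequence encoding a C2c1 protein or a variant thereof, and a guide RNA;
iii) a C2c1 protein or a variant thereof, and an expression construct comprising a nucleotide sequence encoding a guide RNA;
iv) an expression construct comprising a nucleotide sequence encoding a C2c1 protein or a variant thereof, and an expression construct comprising a nucleotide sequence encoding a guide RNA;
v) an expression construct comprising a nucleotide sequence encoding a C2c1 protein or a variant thereof and a nucleotide sequence encoding a guide RNA;
wherein the guide RNA is capable of forming a complex with the C2c1 protein or variant thereof and of targeting the C2c1 protein ortholog or variant thereof to the target sequence in the genome of the cell.
In some embodiments, the C2c1 protein is the AaC2c1 protein from Alicyclobacillus acidiphilus, the AkC2c1 protein from Alicyclobacillus kakegawensis, the AmC2c1 protein from Alicyclobacillus macrosporangiidus, the BhC2c1 protein from Bacillus hisashii, the BsC2c1 protein from the genus Bacillus, the Bs3C2c1 protein from the genus Bacillus, the DiC2c1 protein from Desulfovibrio inopinatus, the LsC2c1 protein from Laceyella sediminis, the SbC2c1 protein from Spirochaetes bacterium, or the TcC2c1 protein from Tuberibacillus calidus. For example, the C2c1 protein is the AaC2c1 protein from Alicyclobacillus acidiphilus NBRC 100859, the AkC2c1 protein from Alicyclobacillus kakegawensis NBRC 103104, the AmC2c1 protein from Alicyclobacillus macrosporangiidus strain DSM 17980, the BhC2c1 protein from Bacillus hisashii strain C4, the BsC2c1 protein from Bacillus sp. NSP2.1, the Bs3C2c1 protein from Bacillus sp. V3-13 contig_40, the DiC2c1 protein from Desulfovibrio inopinatus DSM 10711, the LsC2c1 protein from Laceyella sediminis strain RHA1, the SbC2c1 protein from Spirochaetes bacterium GWB1_27_13, or the TcC2c1 protein from Tuberibacillus calidus DSM 17572.
In a second aspect, the invention provides a method for site-directed modification of a target sequence in the genome of a cell, comprising introducing the genome editing system of the invention into the cell.
In a third aspect, the invention provides a method of treating a disease in a subject in need thereof, comprising delivering to the subject an effective amount of the genome editing system of the invention to modify a gene associated with the disease in the subject.
In a fourth aspect, the invention provides use of the genome editing system of the invention in the preparation of a pharmaceutical composition for treating a disease in a subject in need thereof, wherein the genome editing system is used to modify a gene associated with the disease in the subject.
In a fifth aspect, the invention provides a kit for use in the methods of the invention, comprising the genome editing system of the invention and instructions for use.
In a sixth aspect, the invention provides a pharmaceutical composition for treating a disease in a subject in need thereof, comprising the genome editing system of the invention and a pharmaceutically acceptable carrier, wherein the genome editing system is used to modify a gene associated with the disease in the subject.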
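Brief Description of the Drawings
Figure 1. Phylogenetic tree of the non-redundant C2c1 orthologs selected for genome editing tests, and their loci.
(a) Neighbor-joining phylogenetic tree showing the evolutionary relationships of the tested C2c1 orthologs. (b) Maps of the bacterial loci corresponding to the 8 C2c1 proteins highlighted in (a). Simulated co-folding of the crRNA DR and the putative tracrRNA shows a stable secondary structure. DR, direct repeat. The number of spacers in each bacterial genome is indicated above or below its CRISPR array.
Figure 2. Protein alignment of C2c1 orthologs: multiple sequence alignment of the amino acid sequences of the 10 tested C2c1 orthologs. Conserved residues are highlighted with a red background; conservative mutations are highlighted with an outline and red font.
Figure 3. C2c1 ortholog-mediated genome targeting in human 293T cells.
(a) T7EI assay results showing the genome targeting activity, in the human genome, of eight C2c1 proteins combined with their cognate sgRNAs. Triangles indicate cleaved bands. (b) T7EI assay results showing simultaneous multiplex genome targeting in human 293T cells mediated by Bs3C2c1 combined with its cognate sgRNA (Bs3sgRNA). (c) Sanger sequencing showing representative insertions and deletions (indels) induced by Bs3C2c1 combined with Bs3sgRNA. PAM and protospacer sequences are colored red and blue, respectively. Deletions and insertions are indicated by purple dashes and green lowercase characters, respectively.
Figure 4. C2c1 proteins for RNA-guided genome editing.
(a) Graphical overview of the 10 C2c1 orthologs tested in the invention. Their sizes (number of amino acids) are shown. (b) T7EI assay results showing the genome targeting activity in human 293T cells of eight C2c1 orthologs guided by their cognate sgRNAs. Triangles indicate cleaved bands. (c-d) T7EI assay results showing the genome targeting activity in human 293T cells of eight C2c1 orthologs guided by AasgRNA (c) and AksgRNA (d). Triangles indicate cleaved bands.
Figure 5. DNA alignment of C2c1 sgRNAs: multiple sequence alignment of the DNA sequences of the 8 sgRNAs derived from the 10 tested C2c1 loci.
Figure 6. Interchangeability between different C2c1 orthologs and sgRNAs.
T7EI assay results showing the genome targeting activity in human 293T cells of eight C2c1 orthologs guided by AasgRNA (a), AksgRNA (b), AmsgRNA (c), Bs3sgRNA (d) and LssgRNA (e). Red triangles indicate cleaved bands.
Figure 7. Artificial sgRNA-mediated multiplex genome targeting.
(a) Maps of the bacterial loci corresponding to DiC2c1 and TcC2c1. Neither C2c1 locus has a CRISPR array. (b-c) T7EI assay results showing the genome targeting activity in human 293T cells of AaC2c1, DiC2c1 and TcC2c1 guided by AasgRNA (b) and AksgRNA (c). Triangles indicate cleaved bands. (d) T7EI assay results showing simultaneous multiplex genome targeting in human 293T cells mediated by TcC2c1 combined with AksgRNA. (e) Schematic illustrating the secondary structure of artificial sgRNA scaffold 13 (artgRNA13). (f) T7EI assay results showing simultaneous multiplex genome targeting in human 293T cells mediated by TcC2c1 combined with artgRNA13.
Figure 8. Different sgRNAs guide C2c1 for genome editing.
T7EI assay results showing the genome targeting activity in human 293T cells of AaC2c1, DiC2c1 and TcC2c1 guided by AasgRNA (a), AksgRNA (b), AmsgRNA (c), Bs3sgRNA (d) and LssgRNA (e). Triangles indicate cleaved bands.
Figure 9. TcC2c1-mediated multiplex genome editing.
(a) T7EI assay results showing simultaneous multiplex genome targeting in human 293T cells mediated by TcC2c1 combined with AmsgRNA. (b-c) Sanger sequencing showing representative indels induced by TcC2c1 combined with AksgRNA (b) and AmsgRNA (c). PAM and protospacer sequences are colored red and blue, respectively. Deletions and insertions are indicated by purple dashes and green lowercase characters, respectively.
Figure 10. Artificial sgRNAs guide TcC2c1 for genome editing.
(a) Schematic illustrating the secondary structures of 36 artificial sgRNA (artgRNA) scaffolds (scaffolds 1-12 and 14-37). (b) T7EI assay results showing the genome targeting activity in human 293T cells of TcC2c1 guided by artsgRNAs. Triangles indicate cleaved bands. (c) T7EI assay results showing simultaneous multiplex genome targeting in human 293T cells mediated by AaC2c1 combined with artgRNA13.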
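Detailed Description of the Invention
Definitions
In the present invention, unless otherwise indicated, the scientific and technical terms used herein have the meanings commonly understood by those skilled in the art. The terms and laboratory procedures relating to protein and nucleic acid chemistry, molecular biology, cell and tissue culture, microbiology and immunology used herein are those widely used in the respective fields. For example, the standard recombinant DNA and molecular cloning techniques used in the invention are well known to those skilled in the art and are described more fully in: Sambrook, J., Fritsch, E.F. and Maniatis, T., Molecular Cloning: A Laboratory Manual; Cold Spring Harbor Laboratory Press: Cold Spring Harbor, 1989 (hereinafter "Sambrook"). To aid understanding of the invention, definitions and explanations of relevant terms are provided below.
In one aspect, the invention provides a genome editing system for site-directed modification of a target sequence in the genome of a cell, comprising at least one of the following i) to v):
i) a C2c1 protein or a variant thereof, and a guide RNA;
ii) an expression construct comprising a nucleotide sequence encoding a C2c1 protein or a variant thereof, and a guide RNA;
iii) a C2c1 protein or a variant thereof, and an expression construct comprising a nucleotide sequence encoding a guide RNA;
iv) an expression construct comprising a nucleotide sequence encoding a C2c1 protein or a variant thereof, and an expression construct comprising a nucleotide sequence encoding a guide RNA;
v) an expression construct comprising a nucleotide sequence encoding a C2c1 protein or a variant thereof and a nucleotide sequence encoding a guide RNA;
wherein the guide RNA is capable of forming a complex with the C2c1 protein or variant thereof and of targeting the C2c1 protein or variant thereof to the target sequence in the genome of the cell. In some embodiments, the targeting results in the substitution, deletion and/or addition of one or more nucleotides in the target sequence.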
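"Genome" as used herein encompasses not only the chromosomal DNA present in the nucleus but also the organellar DNA present in subcellular components of the cell (e.g., mitochondria, plastids).
"C2c1 nuclease", "C2c1 protein" and "C2c1" are used interchangeably herein and refer to an RNA-guided nuclease comprising a C2c1 protein or a fragment thereof. C2c1 has guide RNA-mediated DNA binding activity as well as DNA cleavage activity; guided by a guide RNA, it can target and cleave a DNA target sequence to form a DNA double-strand break (DSB). The DSB activates the cell's intrinsic repair mechanisms, non-homologous end joining (NHEJ) and homologous recombination (HR), to repair the DNA damage, and site-directed editing of the specific DNA sequence takes place during this repair.
In some embodiments, the C2c1 protein is the AaC2c1 protein from Alicyclobacillus acidiphilus, the AkC2c1 protein from Alicyclobacillus kakegawensis, the AmC2c1 protein from Alicyclobacillus macrosporangiidus, the BhC2c1 protein from Bacillus hisashii, the BsC2c1 protein from the genus Bacillus, the Bs3C2c1 protein from the genus Bacillus, the DiC2c1 protein from Desulfovibrio inopinatus, the LsC2c1 protein from Laceyella sediminis, the SbC2c1 protein from Spirochaetes bacterium, or the TcC2c1 protein from Tuberibacillus calidus.
For example, the C2c1 protein is the AaC2c1 protein from Alicyclobacillus acidiphilus NBRC 100859, the AkC2c1 protein from Alicyclobacillus kakegawensis NBRC 103104, the AmC2c1 protein from Alicyclobacillus macrosporangiidus strain DSM 17980, the BhC2c1 protein from Bacillus hisashii strain C4, the BsC2c1 protein from Bacillus sp. NSP2.1, the Bs3C2c1 protein from Bacillus sp. V3-13 contig_40, the DiC2c1 protein from Desulfovibrio inopinatus DSM 10711, the LsC2c1 protein from Laceyella sediminis strain RHA1, the SbC2c1 protein from Spirochaetes bacterium GWB1_27_13, or the TcC2c1 protein from Tuberibacillus calidus DSM 17572.
In some embodiments of the invention, the C2c1 protein is a C2c1 protein whose native locus has no CRISPR array. In some embodiments, the C2c1 protein whose native locus has no CRISPR array is the DiC2c1 or TcC2c1 protein.
In some embodiments, the C2c1 protein comprises the amino acid sequence shown in any one of SEQ ID NO: 1-10. For example, the AaC2c1, AkC2c1, AmC2c1, BhC2c1, BsC2c1, Bs3C2c1, DiC2c1, LsC2c1, SbC2c1 and TcC2c1 proteins comprise the amino acid sequences shown in SEQ ID NO: 1-10, respectively.
In some embodiments, the variant of the C2c1 protein comprises an amino acid sequence having at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98% or at least 99% sequence identity to the corresponding wild-type C2c1 protein (e.g., wild-type AaC2c1, AkC2c1, AmC2c1, BhC2c1, BsC2c1, Bs3C2c1, DiC2c1, LsC2c1, SbC2c1 or TcC2c1 protein), and has the genome editing and/or targeting activity of that wild-type C2c1 protein.
In some embodiments, the variant of the C2c1 protein comprises an amino acid sequence having one or more amino acid residue substitutions, deletions or additions relative to the corresponding wild-type C2c1 protein (e.g., wild-type AaC2c1, AkC2c1, AmC2c1, BhC2c1, BsC2c1, Bs3C2c1, DiC2c1, LsC2c1, SbC2c1 or TcC2c1 protein), and has the genome editing and/or targeting activity of that wild-type C2c1 protein. For example, the variant comprises an amino acid sequence having 1, 2, 3, 4, 5, 6, 7, 8, 9 or 10 amino acid residue substitutions, deletions or additions relative to the wild-type C2c1 protein. In some embodiments, the amino acid substitutions are conservative substitutions.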
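"Polypeptide", "peptide" and "protein" are used interchangeably in the invention and refer to a polymer of amino acid residues. The terms apply to amino acid polymers in which one or more amino acid residues are artificial chemical analogues of the corresponding naturally occurring amino acids, as well as to naturally occurring amino acid polymers. The terms "polypeptide", "peptide", "amino acid sequence" and "protein" may also include modified forms, including but not limited to glycosylation, lipid attachment, sulfation, γ-carboxylation of glutamic acid residues, hydroxylation and ADP-ribosylation.
Sequence "identity" has its art-recognized meaning, and the percentage of sequence identity between two nucleic acid or polypeptide molecules or regions can be calculated using published techniques. Sequence identity can be measured along the full length of a polynucleotide or polypeptide or along a region of the molecule. (See, e.g.: Computational Molecular Biology, Lesk, A.M., ed., Oxford University Press, New York, 1988; Biocomputing: Informatics and Genome Projects, Smith, D.W., ed., Academic Press, New York, 1993; Computer Analysis of Sequence Data, Part I, Griffin, A.M., and Griffin, H.G., eds., Humana Press, New Jersey, 1994; Sequence Analysis in Molecular Biology, von Heinje, G., Academic Press, 1987; and Sequence Analysis Primer, Gribskov, M. and Devereux, J., eds., M Stockton Press, New York, 1991). Although there are many methods for measuring the identity between two polynucleotides or polypeptides, the term "identity" is well known to skilled artisans (Carrillo, H. & Lipman, D., SIAM J Applied Math 48:1073 (1988)).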
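As an illustration of how a percent identity value, such as the thresholds for C2c1 variants, might be computed, the following minimal Python sketch counts matching columns in two sequences that have already been aligned. The function name and the choice of denominator (all columns except those that are gaps in both sequences) are assumptions made for this example; the patent does not prescribe a particular formula, and established alignment tools may define identity differently.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity of two pre-aligned sequences ('-' marks a gap).

    Assumption for this sketch: columns that are gaps in both sequences
    are skipped; a gap opposite a residue counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" and b == "-":
            continue  # uninformative column
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


# Hypothetical toy peptides, not taken from SEQ ID NO: 1-10.
print(percent_identity("MKLA", "MKIA"))
```

A variant would meet the "at least 80%" threshold if this value, computed against the wild-type sequence under whichever identity definition is adopted, is 80 or higher.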
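Suitable conservative amino acid substitutions in peptides or proteins are known to those skilled in the art and can generally be made without altering the biological activity of the resulting molecule. In general, those skilled in the art recognize that single amino acid substitutions in non-essential regions of a polypeptide do not substantially alter biological activity (see, e.g., Watson et al., Molecular Biology of the Gene, 4th Edition, 1987, The Benjamin/Cummings Pub. Co., p. 224).
In some embodiments, the variant of the C2c1 protein comprises a nuclease-dead C2c1 protein (dC2c1). A nuclease-dead C2c1 protein is a C2c1 protein that retains guide RNA-mediated DNA binding activity but lacks double-stranded DNA cleavage activity. In some embodiments, the nuclease-dead C2c1 protein encompasses a C2c1 nickase, which cleaves only one strand of the double-stranded target DNA.
In some embodiments, the variant of the C2c1 protein is a fusion protein of a nuclease-dead C2c1 protein and a deaminase. For example, the nuclease-dead C2c1 protein and the deaminase in the fusion protein may be joined by a linker, such as a peptide linker.
As used in the invention, "deaminase" refers to an enzyme that catalyzes a deamination reaction. In some embodiments of the invention, the deaminase is a cytosine deaminase, which can accept single-stranded DNA as a substrate and can catalyze the deamination of cytidine or deoxycytidine to uracil or deoxyuracil, respectively. In some embodiments of the invention, the deaminase is an adenine deaminase, which can accept single-stranded DNA as a substrate and can catalyze the conversion of adenosine or deoxyadenosine (A) to inosine (I). By using a fusion protein of a nuclease-dead C2c1 protein and a deaminase, base editing in the target DNA sequence can be achieved, for example a C-to-T or an A-to-G conversion.
In some embodiments of the invention, the C2c1 protein or variant thereof in the genome editing system of the invention may further comprise a nuclear localization sequence (NLS). In general, the one or more NLSs in the C2c1 protein or variant thereof should be strong enough to drive accumulation of the C2c1 protein or variant thereof in the nucleus in an amount sufficient for its gene editing function. In general, the strength of nuclear localization activity is determined by the number and position of NLSs in the C2c1 protein or variant thereof, by the particular NLS or NLSs used, or by a combination of these factors.
In some embodiments of the invention, the NLS of the C2c1 protein or variant thereof in the genome editing system of the invention may be located at the N-terminus and/or the C-terminus. In some embodiments, the C2c1 protein or variant thereof comprises about 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 or more NLSs. In some embodiments, the C2c1 protein or variant thereof comprises about 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 or more NLSs at or near the N-terminus. In some embodiments, the C2c1 protein or variant thereof comprises about 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 or more NLSs at or near the C-terminus. In some embodiments, the C2c1 protein or variant thereof comprises a combination of these, such as one or more NLSs at the N-terminus and one or more NLSs at the C-terminus. When more than one NLS is present, each may be selected independently of the others. In some embodiments of the invention, the C2c1 protein or variant thereof comprises 2 NLSs, for example one at the N-terminus and one at the C-terminus.
In general, an NLS consists of one or more short sequences of positively charged lysines or arginines exposed on the protein surface, but other types of NLS are also known. Non-limiting examples of NLSs include KKRKV, PKKKRKV and SGGSPKKKRKV.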
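The NLS arrangements described above (a chosen number of NLS copies at the N-terminus and/or C-terminus) can be sketched at the amino acid sequence level as follows. This is only an illustrative string operation under simplifying assumptions: the function name is hypothetical, the initiator methionine and any linkers between the NLS and the protein are ignored, and the toy protein sequence is a placeholder rather than a real C2c1 sequence.

```python
def add_nls(protein: str, nls: str = "PKKKRKV", n_term: int = 1, c_term: int = 1) -> str:
    """Return the protein sequence flanked by the requested number of NLS copies.

    Assumption for this sketch: NLS copies are fused directly, with no linker.
    """
    if n_term < 0 or c_term < 0:
        raise ValueError("NLS copy numbers must be non-negative")
    return nls * n_term + protein + nls * c_term


# Placeholder peptide with one NLS at each terminus, as in the 2-NLS embodiment.
print(add_nls("MKLA"))
```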
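In addition, depending on the location of the DNA to be edited, the C2c1 protein or variant thereof of the invention may also comprise other localization sequences, such as a cytoplasmic localization sequence, a chloroplast localization sequence or a mitochondrial localization sequence.
In some embodiments of the invention, the target sequence is 18-35 nucleotides long, preferably 20 nucleotides. In some embodiments of the invention, the target sequence is flanked at its 5' end by a PAM (protospacer adjacent motif) sequence selected from 5'-TTTN-3', 5'-ATTN-3', 5'-GTTN-3', 5'-CTTN-3', 5'-TTC-3', 5'-TTG-3', 5'-TTA-3', 5'-TTT-3', 5'-TAN-3', 5'-TGN-3', 5'-TCN-3' and 5'-ATC-3', where N is selected from A, G, C and T.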
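The target site rule above (a listed PAM immediately 5' of an 18-35 nt target, 20 nt preferred) can be sketched as a simple scan. The code below is a hedged illustration rather than the inventors' tool: the function name and return format are assumptions, only the strand supplied is scanned, and no off-target or efficiency scoring is attempted.

```python
import re

# PAM motifs listed in the text; N stands for any of A, G, C, T.
PAMS = ["TTTN", "ATTN", "GTTN", "CTTN", "TTC", "TTG",
        "TTA", "TTT", "TAN", "TGN", "TCN", "ATC"]


def find_targets(seq, pam, target_len=20):
    """Return (pam_start, pam_found, target) for each PAM hit on the given strand."""
    if pam not in PAMS:
        raise ValueError("PAM is not one of the motifs listed in the text")
    if not 18 <= target_len <= 35:
        raise ValueError("target length must be 18-35 nt")
    seq = seq.upper()
    pattern = re.compile(pam.replace("N", "[ACGT]"))
    hits = []
    last_start = len(seq) - len(pam) - target_len
    for i in range(last_start + 1):
        pam_end = i + len(pam)
        # The target is the stretch immediately 3' of the PAM on this strand.
        if pattern.fullmatch(seq, i, pam_end):
            hits.append((i, seq[i:pam_end], seq[pam_end:pam_end + target_len]))
    return hits


# Hypothetical test sequence, not a real genomic locus.
example = "GG" + "TTC" + "ACGTACGTACGTACGTACGT" + "AA"
print(find_targets(example, "TTC"))
```

To cover both strands, the same function could be run on the reverse complement of the input; that step is left out to keep the sketch short.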
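In the invention, the target sequence to be modified may be located anywhere in the genome, for example within a functional gene such as a protein-coding gene, or in a gene expression regulatory region such as a promoter or enhancer region, thereby modifying the function of the gene or its expression. Substitutions, deletions and/or additions in the genomic target sequence can be detected by T7EI, PCR/RE or sequencing methods.
"Guide RNA" and "gRNA" are used interchangeably herein. A guide RNA typically consists of a crRNA and a tracrRNA molecule that form a complex through partial complementarity, where the crRNA comprises a sequence that is sufficiently identical to the target sequence to hybridize with the complement of the target sequence and to direct sequence-specific binding of the CRISPR complex (C2c1 + crRNA + tracrRNA) to the target sequence. However, a single guide RNA (sgRNA), which combines the features of the crRNA and the tracrRNA, can also be designed and used.
In some embodiments of the invention, the guide RNA is an sgRNA. In some specific embodiments, the sgRNA is encoded by a nucleic acid sequence selected from one of the following: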
5’-GTCTAAAGGACAGAATTTTTCAACGGGTGTGCCAATGGCCACTTTCCAGGTGGCAAAGCCCGTTGAACTTCTCAAAAAGAACGCTCGCTCAGTGTTCTGACGTCGGATCACTGAGCGAGCGATCTGAGAAGTGGCAC-Nx-3’(AasgRNA);
5’-tcgtctataGGACGGCGAGGACAACGGGAAGTGCCAATGTGCTCTTTCCAAGAGCAAACACCCCGTTGGCTTCAAGATGACCGCTCGCTCAGCGATCTGACAACGGATCGCTGAGCGAGCGGTCTGAGAAGTGGCAC-Nx-3’(AksgRNA1);
5’-ggaattgccgatctaTAGGACGGCAGATTCAACGGGATGTGCCAATGCACTCTTTCCAGGAGTGAACACCCCGTTGGCTTCAACATGATCGCCCGCTCAACGGTCCGATGTCGGATCGTTGAGCGGGCGATCTGAGAAGTGGCAC-Nx-3’(AmsgRNA1);
5’-GAGGTTCTGTCTTTTGGTCAGGACAACCGTCTAGCTATAAGTGCTGCAGGGTGTGAGAAACTCCTATTGCTGGACGATGTCTCTTTTATTTCTTTTTTCTTGGATGTCCAAGAAAAAAGAAATGATACGAGGCATTAGCAC-Nx-3’(BhsgRNA);
5’-CCATAAGTCGACTTACATATCCGTGCGTGTGCATTATGGGCCCATCCACAGGTCTATTCCCACGGATAATCACGACTTTCCACTAAGCTTTCGAATGTTCGAAAGCTTAGTGGAAAGCTTCGTGGTTAGCAC-Nx-3’(BssgRNA);
5’-GGTGACCTATAGGGTCAATGAATCTGTGCGTGTGCCATAAGTAATTAAAAATTACCCACCACAGGATTATCTTATTTCTGCTAAGTGTTTAGTTGCCTGAATACTTAGCAGAAATAATGATGATTGGCAC-Nx-3’(Bs3sgRNA);
5’-GGCAAAGAATACTGTGCGTGTGCTAAGGATGGAAAAAATCCATTCAACCACAGGATTACATTATTTATCTAATCACTTAAATCTTTAAGTGATTAGATGAATTAAATGTGATTAGCAC-Nx-3’(LssgRNA); or
5’-GTCTTAGGGTATATCCCAAATTTGTCTTAGTATGTGCATTGCTTACAGCGACAACTAAGGTTTGTTTATCTTTTTTTTACATTGTAAGATGTTTTACATTATAAAAAGAAGATAATCTTATTGCAC-Nx-3’(SbsgRNA);
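where Nx denotes a nucleotide sequence (spacer sequence) of X consecutive nucleotides, each N is independently selected from A, G, C and T, and X is an integer with 18 ≤ X ≤ 35. Preferably, X = 20. In some embodiments, the sequence Nx (spacer sequence) is capable of specifically hybridizing to the complement of the target sequence. The sequence of the sgRNA other than Nx is the scaffold sequence of the sgRNA. In some embodiments, the sgRNA comprises a scaffold sequence encoded by the nucleotide sequence of any one of SEQ ID NO: 31-38.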
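The "scaffold followed by Nx" layout described above can be sketched as a small helper that appends a spacer to a scaffold and enforces the stated length and alphabet constraints. The function name is hypothetical, and the scaffold used in the usage line is a short placeholder; in practice the scaffold argument would be one of the scaffold-encoding sequences listed in the text.

```python
VALID_BASES = set("ACGT")


def build_sgrna_dna(scaffold: str, spacer: str) -> str:
    """Return the DNA sequence encoding an sgRNA: 5'-scaffold-spacer-3'.

    The spacer (Nx) must be 18-35 nt of A/C/G/T, as stated in the text.
    """
    scaffold = scaffold.upper()  # some listed scaffolds contain lowercase bases
    spacer = spacer.upper()
    if not 18 <= len(spacer) <= 35:
        raise ValueError("spacer length X must satisfy 18 <= X <= 35")
    if set(scaffold) - VALID_BASES or set(spacer) - VALID_BASES:
        raise ValueError("sequences may contain only A, C, G and T")
    return scaffold + spacer


# "GTGGCAC" is a placeholder standing in for a full scaffold sequence.
print(build_sgrna_dna("GTGGCAC", "ACGTACGTACGTACGTACGT"))
```

The alphabet check also helps catch transcription errors in a scaffold string, since any character outside A, C, G and T is rejected.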
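The invention surprisingly found that the C2c1 proteins and guide RNAs of different C2c1 systems can be used interchangeably, which makes it possible to design universal guide RNAs artificially.
Accordingly, in another aspect, the invention provides an artificial sgRNA encoded by a nucleotide sequence selected from: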
5’-GGTCTAAAGGACAGAATTTTTCAACGGGTGTGCCAATGGCCACTTTCCAGGTGGCAAAGCCCGTTGAACTTCAAGCGAAGTGGCAC-Nx-3’(artsgRNA1);
5’-GGTCTAAAGGACAGAAGACAACGGGAAGTGCCAATGTGCTCTTTCCAAGAGCAAACACCCCGTTGACTTCAAGCGAAGTGGCAC-Nx-3’(artsgRNA2);
5’-GGTCTAAAGGACAGAAAATCTGTGCGTGTGCCATAAGTAATTAAAAATTACCCACCACAGACTTCAAGCGAAGTGGCAC-Nx-3’(artsgRNA3);
5’-GGTCGTCTATAGGACGGCGAGTTTTTCAACGGGTGTGCCAATGGCCACTTTCCAGGTGGCAAAGCCCGTTGAACTTCAAGCGAAGTGGCAC-Nx-3’(artsgRNA4);
5’-GGTCGTCTATAGGACGGCGAGGACAACGGGAAGTGCCAATGTGCTCTTTCCAAGAGCAAACACCCCGTTGACTTCAAGCGAAGTGGCAC-Nx-3’(artsgRNA5);
5’-GGTCGTCTATAGGACGGCGAGAATCTGTGCGTGTGCCATAAGTAATTAAAAATTACCCACCACAGACTTCAAGCGAAGTGGCAC-Nx-3’(artsgRNA6);
5’-GGTGACCTATAGGGTCAATGTTTTTCAACGGGTGTGCCAATGGCCACTTTCCAGGTGGCAAAGCCCGTTGAACTTCAAGCGAAGTGGCAC-Nx-3’(artsgRNA7);
5’-GGTGACCTATAGGGTCAATGGACAACGGGAAGTGCCAATGTGCTCTTTCCAAGAGCAAACACCCCGTTGACTTCAAGCGAAGTGGCAC-Nx-3’(artsgRNA8);
5’-GGTGACCTATAGGGTCAATGAATCTGTGCGTGTGCCATAAGTAATTAAAAATTACCCACCACAGACTTCAAGCGAAGTGGCAC-Nx-3’(artsgRNA9);
5’-GGTCTAAAGGACAGAATTTTTCAACGGGTGTGCCAATGGCCACTTTCCAGGTGGCAAAGCCCGTTGAGCTTCAAAGAAGTGGCAC-Nx-3’(artsgRNA10);
5’-GGTCTAAAGGACAGAAGACAACGGGAAGTGCCAATGTGCTCTTTCCAAGAGCAAACACCCCGTTGGCTTCAAAGAAGTGGCAC-Nx-3’(artsgRNA11);
5’-GGTCTAAAGGACAGAAAATCTGTGCGTGTGCCATAAGTAATTAAAAATTACCCACCACAGGCTTCAAAGAAGTGGCAC-Nx-3’(artsgRNA12);
5’-GGTCGTCTATAGGACGGCGAGTTTTTCAACGGGTGTGCCAATGGCCACTTTCCAGGTGGCAAAGCCCGTTGAGCTTCAAAGAAGTGGCAC-Nx-3’(artsgRNA13);
5’-GGTCGTCTATAGGACGGCGAGGACAACGGGAAGTGCCAATGTGCTCTTTCCAAGAGCAAACACCCCGTTGGCTTCAAAGAAGTGGCAC-Nx-3’(artsgRNA14);
5’-GGTCGTCTATAGGACGGCGAGAATCTGTGCGTGTGCCATAAGTAATTAAAAATTACCCACCACAGGCTTCAAAGAAGTGGCAC-Nx-3’(artsgRNA15);
5’-GGTGACCTATAGGGTCAATGTTTTTCAACGGGTGTGCCAATGGCCACTTTCCAGGTGGCAAAGCCCGTTGAGCTTCAAAGAAGTGGCAC-Nx-3’(artsgRNA16);
5’-GGTGACCTATAGGGTCAATGGACAACGGGAAGTGCCAATGTGCTCTTTCCAAGAGCAAACACCCCGTTGGCTTCAAAGAAGTGGCAC-Nx-3’(artsgRNA17);
5’-GGTGACCTATAGGGTCAATGAATCTGTGCGTGTGCCATAAGTAATTAAAAATTACCCACCACAGGCTTCAAAGAAGTGGCAC-Nx-3’(artsgRNA18);
5’-GGTCTAAAGGACAGAATTTTTCAACGGGTGTGCCAATGGCCACTTTCCAGGTGGCAAAGCCCGTTGAGATTATCTATGATGATTGGCAC-Nx-3’(artsgRNA19);
5’-GGTCTAAAGGACAGAAGACAACGGGAAGTGCCAATGTGCTCTTTCCAAGAGCAAACACCCCGTTGGATTATCTATGATGATTGGCAC-Nx-3’(artsgRNA20);
5’-GGTCTAAAGGACAGAAAATCTGTGCGTGTGCCATAAGTAATTAAAAATTACCCACCACAGGATTATCTATGATGATTGGCAC-Nx-3’(artsgRNA21);
5’-GGTCGTCTATAGGACGGCGAGTTTTTCAACGGGTGTGCCAATGGCCACTTTCCAGGTGGCAAAGCCCGTTGAGATTATCTATGATGATTGGCAC-Nx-3’(artsgRNA22);
5’-GGTCGTCTATAGGACGGCGAGGACAACGGGAAGTGCCAATGTGCTCTTTCCAAGAGCAAACACCCCGTTGGATTATCTATGATGATTGGCAC-Nx-3’(artsgRNA23);
5’-GGTCGTCTATAGGACGGCGAGAATCTGTGCGTGTGCCATAAGTAATTAAAAATTACCCACCACAGGATTATCTATGATGATTGGCAC-Nx-3’(artsgRNA24);
5’-GGTGACCTATAGGGTCAATGTTTTTCAACGGGTGTGCCAATGGCCACTTTCCAGGTGGCAAAGCCCGTTGAGATTATCTATGATGATTGGCAC-Nx-3’(artsgRNA25);
5’-GGTGACCTATAGGGTCAATGGACAACGGGAAGTGCCAATGTGCTCTTTCCAAGAGCAAACACCCCGTTGGATTATCTATGATGATTGGCAC-Nx-3’(artsgRNA26);
5’-GGTGACCTATAGGGTCAATGAATCTGTGCGTGTGCCATAAGTAATTAAAAATTACCCACCACAGGATTATCTATGATGATTGGCAC-Nx-3’(artsgRNA27);
5’-GGTCTAAAGGACAGAACAACGGGATGTGCCAATGCACTCTTTCCAGGAGTGAACACCCCGTTGACTTCAAGCGAAGTGGCAC-Nx-3’(artsgRNA28);
5’-GGTCGTCTATAGGACGGCGAGCAACGGGATGTGCCAATGCACTCTTTCCAGGAGTGAACACCCCGTTGACTTCAAGCGAAGTGGCAC-Nx-3’(artsgRNA29);
5’-GGAATTGCCGATCTATAGGACGGCAGATTTTTTTCAACGGGTGTGCCAATGGCCACTTTCCAGGTGGCAAAGCCCGTTGAACTTCAAGCGAAGTGGCAC-Nx-3’(artsgRNA30);
5’-GGAATTGCCGATCTATAGGACGGCAGATTGACAACGGGAAGTGCCAATGTGCTCTTTCCAAGAGCAAACACCCCGTTGACTTCAAGCGAAGTGGCAC-Nx-3’(artsgRNA31);
5’-GGAATTGCCGATCTATAGGACGGCAGATTCAACGGGATGTGCCAATGCACTCTTTCCAGGAGTGAACACCCCGTTGACTTCAAGCGAAGTGGCAC-Nx-3’(artsgRNA32);
5’-GGTCTAAAGGACAGAACAACGGGATGTGCCAATGCACTCTTTCCAGGAGTGAACACCCCGTTGGCTTCAAAGAAGTGGCAC-Nx-3’(artsgRNA33);
5’-GGTCGTCTATAGGACGGCGAGCAACGGGATGTGCCAATGCACTCTTTCCAGGAGTGAACACCCCGTTGGCTTCAAAGAAGTGGCAC-Nx-3’(artsgRNA34);
5’-GGAATTGCCGATCTATAGGACGGCAGATTTTTTTCAACGGGTGTGCCAATGGCCACTTTCCAGGTGGCAAAGCCCGTTGAGCTTCAAAGAAGTGGCAC-Nx-3’(artsgRNA35);
5’-GGAATTGCCGATCTATAGGACGGCAGATTGACAACGGGAAGTGCCAATGTGCTCTTTCCAAGAGCAAACACCCCGTTGGCTTCAAAGAAGTGGCAC-Nx-3’(artsgRNA36); or
5’-GGAATTGCCGATCTATAGGACGGCAGATTCAACGGGATGTGCCAATGCACTCTTTCCAGGAGTGAACACCCCGTTGGCTTCAAAGAAGTGGCAC-Nx-3’(artsgRNA37),
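where Nx denotes a nucleotide sequence (spacer sequence) of X consecutive nucleotides, each N is independently selected from A, G, C and T, and X is an integer with 18 ≤ X ≤ 35. Preferably, X = 20. In some embodiments, the sequence Nx (spacer sequence) is capable of specifically hybridizing to the complement of the target sequence. The sequence of the sgRNA other than Nx is the scaffold sequence of the sgRNA.
In some embodiments, the artificial sgRNA comprises a scaffold sequence encoded by the nucleotide sequence of any one of SEQ ID NO: 39-75.
In some embodiments, the guide RNA in the genome editing system of the invention is the artificial sgRNA of the invention.
To obtain efficient expression in the target cell, in some embodiments of the invention the nucleotide sequence encoding the C2c1 protein or variant thereof is codon-optimized for the organism from which the cell to be genome-edited is derived.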
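Codon optimization refers to a method of modifying a nucleic acid sequence to enhance expression in a host cell of interest by replacing at least one codon of the native sequence (e.g., about or more than about 1, 2, 3, 4, 5, 10, 15, 20, 25, 50 or more codons) with codons that are used more frequently or most frequently in the genes of the host cell, while maintaining the native amino acid sequence. Different species exhibit particular preferences for certain codons of a particular amino acid. Codon bias (differences in codon usage between organisms) often correlates with the translation efficiency of messenger RNA (mRNA), which in turn is believed to depend on the properties of the codons being translated and on the availability of particular transfer RNA (tRNA) molecules. The predominance of selected tRNAs in a cell generally reflects the codons most frequently used in peptide synthesis. Genes can therefore be tailored for optimal expression in a given organism on the basis of codon optimization. Codon usage tables are readily available, for example in the Codon Usage Database at www.kazusa.or.jp/codon/, and these tables can be adapted in different ways. See Nakamura, Y. et al., "Codon usage tabulated from the international DNA sequence databases: status for the year 2000", Nucl. Acids Res. 28:292 (2000).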
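The core idea of codon optimization, replacing synonymous codons with a host-preferred codon while keeping the encoded amino acids unchanged, can be sketched as below. This is a deliberately tiny illustration: it covers only five amino acids, the "preferred codon" choices are assumptions made for the example and should be replaced by values from a real codon usage table for the host of interest, and practical optimizers also consider factors such as GC content and repeats.

```python
# Partial standard genetic code, limited to the amino acids used in this toy example.
CODON_TO_AA = {
    "ATG": "M",
    "TTA": "L", "TTG": "L", "CTT": "L", "CTC": "L", "CTA": "L", "CTG": "L",
    "AAA": "K", "AAG": "K",
    "GAA": "E", "GAG": "E",
    "GCT": "A", "GCC": "A", "GCA": "A", "GCG": "A",
}

# Illustrative preferred codons (assumption; take real values from a usage table).
PREFERRED = {"M": "ATG", "L": "CTG", "K": "AAG", "E": "GAG", "A": "GCC"}


def translate(cds: str) -> str:
    """Translate a coding sequence; codons outside the toy table become 'X'."""
    cds = cds.upper()
    return "".join(CODON_TO_AA.get(cds[i:i + 3], "X") for i in range(0, len(cds), 3))


def optimize(cds: str) -> str:
    """Swap each known codon for the preferred synonymous codon."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    out = []
    for i in range(0, len(cds), 3):
        codon = cds[i:i + 3]
        aa = CODON_TO_AA.get(codon)
        out.append(PREFERRED.get(aa, codon))  # unknown codons are left unchanged
    return "".join(out)


original = "ATGTTAAAAGAAGCT"  # hypothetical 5-codon sequence encoding M-L-K-E-A
optimized = optimize(original)
print(optimized, translate(original) == translate(optimized))
```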
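The organism from which the cell to be genome-edited by the system of the invention is derived is preferably a eukaryote, including but not limited to mammals such as humans, mice, rats, monkeys, dogs, pigs, sheep, cattle and cats; poultry such as chickens, ducks and geese; and plants, including monocotyledons and dicotyledons, such as rice, maize, wheat, sorghum, barley, soybean, peanut and Arabidopsis.
In some specific embodiments of the invention, the nucleotide sequence encoding the C2c1 protein or variant thereof is codon-optimized for human. In some specific embodiments, the codon-optimized nucleotide sequences encoding the AaC2c1, AkC2c1, AmC2c1, BhC2c1, BsC2c1, Bs3C2c1, DiC2c1, LsC2c1, SbC2c1 and TcC2c1 proteins are selected from SEQ ID NO: 11-20, respectively.
According to some embodiments of the invention, in the expression construct of the system of the invention, the nucleotide sequence encoding the C2c1 protein or variant thereof and/or the nucleotide sequence encoding the guide RNA is operably linked to an expression control element such as a promoter.
As used in the invention, "expression construct" refers to a vector, such as a recombinant vector, suitable for expressing a nucleotide sequence of interest in an organism. "Expression" refers to the production of a functional product. For example, expression of a nucleotide sequence may refer to transcription of the nucleotide sequence (e.g., to produce mRNA or functional RNA) and/or translation of RNA into a precursor or mature protein. The "expression construct" of the invention may be a linear nucleic acid fragment, a circular plasmid, a viral vector or, in some embodiments, an RNA capable of being translated (e.g., mRNA).
The "expression construct" of the invention may comprise regulatory sequences and nucleotide sequences of interest from different sources, or regulatory sequences and nucleotide sequences of interest from the same source but arranged in a manner different from that normally found in nature.
"Regulatory sequence" and "regulatory element" are used interchangeably and refer to a nucleotide sequence that is located upstream (5' non-coding sequence), within or downstream (3' non-coding sequence) of a coding sequence and that influences the transcription, RNA processing or stability, or translation of the associated coding sequence. Regulatory sequences may include, but are not limited to, promoters, translation leader sequences, introns and polyadenylation recognition sequences.
"Promoter" refers to a nucleic acid fragment capable of controlling the transcription of another nucleic acid fragment. In some embodiments of the invention, the promoter is a promoter capable of controlling gene transcription in a cell, whether or not it is derived from that cell. The promoter may be a constitutive, tissue-specific, developmentally regulated or inducible promoter.
"Constitutive promoter" refers to a promoter that generally causes a gene to be expressed in most cell types under most conditions. "Tissue-specific promoter" and "tissue-preferred promoter" are used interchangeably and refer to a promoter that is expressed predominantly, but not necessarily exclusively, in one tissue or organ, and that may also be expressed in one specific cell or cell type. "Developmentally regulated promoter" refers to a promoter whose activity is determined by developmental events. An "inducible promoter" selectively expresses an operably linked DNA sequence in response to an endogenous or exogenous stimulus (environmental, hormonal, chemical signals, etc.).
As used herein, the term "operably linked" means that a regulatory element (for example, but not limited to, a promoter sequence or a transcription termination sequence) is linked to a nucleic acid sequence (for example, a coding sequence or an open reading frame) such that transcription of the nucleotide sequence is controlled and regulated by the transcriptional regulatory element. Techniques for operably linking a regulatory element region to a nucleic acid molecule are known in the art.
Examples of promoters usable in the invention include, but are not limited to, polymerase (pol) I, pol II and pol III promoters. Examples of pol I promoters include the chicken RNA pol I promoter. Examples of pol II promoters include, but are not limited to, the cytomegalovirus immediate early (CMV) promoter, the Rous sarcoma virus long terminal repeat (RSV-LTR) promoter and the simian virus 40 (SV40) immediate early promoter. Examples of pol III promoters include the U6 and H1 promoters. Inducible promoters such as the metallothionein promoter may be used. Other examples of promoters include the T7 phage promoter, the T3 phage promoter, the β-galactosidase promoter and the Sp6 phage promoter. For use in plants, the promoter may be the cauliflower mosaic virus 35S promoter, the maize Ubi-1 promoter, the wheat U6 promoter, the rice U3 promoter, the maize U3 promoter or the rice actin promoter.
The cell that can be genome-edited by the system of the invention is preferably a eukaryotic cell, including but not limited to cells of mammals such as humans, mice, rats, monkeys, dogs, pigs, sheep, cattle and cats; cells of poultry such as chickens, ducks and geese; and plant cells, including monocotyledon and dicotyledon cells, such as cells of rice, maize, wheat, sorghum, barley, soybean, peanut and Arabidopsis. In some embodiments of the invention, the cell is a eukaryotic cell, preferably a mammalian cell, more preferably a human cell.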
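In another aspect, the invention provides a method of modifying a target sequence in the genome of a cell, comprising introducing the genome editing system of the invention into the cell, whereby the guide RNA targets the C2c1 protein or variant thereof to the target sequence in the genome of the cell. In some embodiments, the targeting results in the substitution, deletion and/or addition of one or more nucleotides in the target sequence.
"Introducing" a nucleic acid molecule (e.g., a plasmid, linear nucleic acid fragment or RNA) or protein of the genome editing system of the invention into a cell means transforming the cell with the nucleic acid or protein so that the nucleic acid or protein can function in the cell. "Transformation" as used in the invention includes stable transformation and transient transformation. "Stable transformation" refers to the introduction of an exogenous nucleotide sequence into the genome, resulting in stable inheritance of the exogenous gene. Once stably transformed, the exogenous nucleic acid sequence is stably integrated into the genome of the organism and of any of its successive generations. "Transient transformation" refers to the introduction of a nucleic acid molecule or protein into a cell, where it performs its function without stable inheritance of an exogenous gene. In transient transformation, the exogenous nucleic acid sequence is not integrated into the genome.
Methods that can be used to introduce the genome editing system of the invention into a cell include, but are not limited to, calcium phosphate transfection, protoplast fusion, electroporation, lipofection, microinjection, viral infection (e.g., with baculovirus, vaccinia virus, adenovirus, adeno-associated virus, lentivirus and other viruses), biolistic (gene gun) delivery, PEG-mediated protoplast transformation and Agrobacterium-mediated transformation.
In some embodiments, the method is performed in vitro. For example, the cell is an isolated cell. In some embodiments, the cell is a CAR-T cell. In some embodiments, the cell is an induced embryonic stem cell.
In other embodiments, the method may also be performed in vivo. For example, the cell is a cell within an organism, and the system of the invention can be introduced into the cell in vivo by, for example, virus-mediated methods. For example, the cell may be a tumor cell in a patient.
In another aspect, the invention also provides a method of producing a genetically modified cell, comprising introducing the genome editing system of the invention into a cell, whereby the guide RNA targets the C2c1 protein or variant thereof to the target sequence in the genome of the cell. In some embodiments, the targeting results in the substitution, deletion and/or addition of one or more nucleotides in the target sequence.
In another aspect, the invention also provides a genetically modified organism comprising a genetically modified cell produced by the method of the invention or a progeny thereof.
As used herein, "organism" includes any organism suitable for genome editing, preferably a eukaryote. Examples of organisms include, but are not limited to, mammals such as humans, mice, rats, monkeys, dogs, pigs, sheep, cattle and cats; poultry such as chickens, ducks and geese; and plants, including monocotyledons and dicotyledons, such as rice, maize, wheat, sorghum, barley, soybean, peanut and Arabidopsis. In some embodiments of the invention, the organism is a eukaryote, preferably a mammal, more preferably a human.
As used herein, "genetically modified organism" or "genetically modified cell" means an organism or cell that comprises within its genome an exogenous polynucleotide or a modified gene or expression control sequence. For example, the exogenous polynucleotide can be stably integrated into the genome of the organism or cell and inherited over successive generations. The exogenous polynucleotide may be integrated into the genome alone or as part of a recombinant DNA construct. A modified gene or expression control sequence is one in which the sequence in the genome of the organism or cell contains single or multiple deoxynucleotide substitutions, deletions and additions. "Exogenous" with respect to a sequence means a sequence from a foreign species or, if from the same species, a sequence that has been significantly altered in composition and/or genomic locus from its native form by deliberate human intervention.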
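In another aspect, the invention provides a gene expression regulation system based on the nuclease-dead C2c1 protein of the invention. Although this system does not alter the sequence of the target gene, it is also defined as a genome editing system within the scope of this document.
In some embodiments, the gene expression regulation system of the invention is a gene repression or silencing system, which may comprise one of the following:
i) a nuclease-dead C2c1 protein, or a fusion protein thereof with a transcriptional repressor protein, and a guide RNA;
ii) an expression construct comprising a nucleotide sequence encoding a nuclease-dead C2c1 protein, or a fusion protein thereof with a transcriptional repressor protein, and a guide RNA;
iii) a nuclease-dead C2c1 protein, or a fusion protein thereof with a transcriptional repressor protein, and an expression construct comprising a nucleotide sequence encoding a guide RNA;
iv) an expression construct comprising a nucleotide sequence encoding a nuclease-dead C2c1 protein, or a fusion protein thereof with a transcriptional repressor protein, and an expression construct comprising a nucleotide sequence encoding a guide RNA; or
v) an expression construct comprising a nucleotide sequence encoding a nuclease-dead C2c1 protein, or a fusion protein thereof with a transcriptional repressor protein, and a nucleotide sequence encoding a guide RNA.
The nuclease-dead C2c1 protein and the guide RNA are as defined above. Selection of the transcriptional repressor protein is within the skill of those in the art.
As used herein, gene repression or silencing refers to the down-regulation or elimination of gene expression levels, preferably at the transcriptional level.
However, the gene expression regulation system of the invention may also use a fusion protein of a nuclease-dead C2c1 protein and a transcriptional activator protein. In this case, the gene expression regulation system is a gene expression activation system. For example, the gene expression activation system of the invention may comprise one of the following:
i) a fusion protein of a nuclease-dead C2c1 protein and a transcriptional activator protein, and a guide RNA;
ii) an expression construct comprising a nucleotide sequence encoding a fusion protein of a nuclease-dead C2c1 protein and a transcriptional activator protein, and a guide RNA;
iii) a fusion protein of a nuclease-dead C2c1 protein and a transcriptional activator protein, and an expression construct comprising a nucleotide sequence encoding a guide RNA;
iv) an expression construct comprising a nucleotide sequence encoding a fusion protein of a nuclease-dead C2c1 protein and a transcriptional activator protein, and an expression construct comprising a nucleotide sequence encoding a guide RNA; or
v) an expression construct comprising a nucleotide sequence encoding a fusion protein of a nuclease-dead C2c1 protein and a transcriptional activator protein and a nucleotide sequence encoding a guide RNA.
The nuclease-dead C2c1 protein and the guide RNA are as defined above. Selection of the transcriptional activator protein is within the skill of those in the art.
As used herein, gene activation refers to the up-regulation of gene expression levels, preferably at the transcriptional level.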
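In another aspect, the invention also encompasses the use of the genome editing system of the invention in the treatment of disease.
Modification of a disease-associated gene by the genome editing system of the invention can achieve up-regulation, down-regulation, inactivation, activation or mutation correction of the disease-associated gene, and thereby the prevention and/or treatment of the disease. For example, the target sequence in the invention may be located within the protein-coding region of a disease-associated gene, or in a gene expression regulatory region such as a promoter or enhancer region, so that the function or the expression of the disease-associated gene can be modified.
A "disease-associated" gene refers to any gene that produces a transcription or translation product at an abnormal level or in an abnormal form in cells derived from disease-affected tissue, compared with non-disease control tissues or cells. Where altered expression correlates with the onset and/or progression of the disease, it may be a gene expressed at an abnormally high level or a gene expressed at an abnormally low level. A disease-associated gene also refers to a gene having one or more mutations or genetic variations that are directly responsible for the etiology of a disease or that are in linkage disequilibrium with one or more genes responsible for it. The transcribed or translated product may be known or unknown, and may be at a normal or abnormal level.
Accordingly, the invention also provides a method of treating a disease in a subject in need thereof, comprising delivering to the subject an effective amount of the genome editing system of the invention to modify a gene associated with the disease.
The invention also provides use of the genome editing system of the invention in the preparation of a pharmaceutical composition for treating a disease in a subject in need thereof, wherein the genome editing system is used to modify a gene associated with the disease.
The invention also provides a pharmaceutical composition for treating a disease in a subject in need thereof, comprising the genome editing system of the invention and a pharmaceutically acceptable carrier, wherein the genome editing system is used to modify a gene associated with the disease.
In some embodiments, the subject is a mammal, such as a human.
Examples of the disease include, but are not limited to, tumors, inflammation, Parkinson's disease, cardiovascular disease, Alzheimer's disease, autism, drug addiction, age-related macular degeneration, schizophrenia and hereditary diseases.
In yet another aspect, the scope of the invention also includes a kit for use in the methods of the invention, comprising the genome editing system of the invention and instructions for use. A kit generally includes a label indicating the intended use and/or method of use of the kit contents. The term label includes any written or recorded material provided on the kit, with the kit or otherwise accompanying the kit.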
实施例
为了便于理解本发明,下面将参照相关具体实施例及附图对本发明进行更全面的描述。附图中给出了本发明的较佳实施例。但是,本发明可以以许多不同的形式来实现,并不限于本文所描述的实施例。相反地,提供这些实施例的目的是使对本发明的公开内容的理解更加透彻全面。
材料与方法
1.DNA操作
根据Molecular Cloning:A Laboratory Manual并进行一些修改进行DNA操作,包括DNA制备、消化、连接、扩增、合成、纯化、琼脂糖凝胶电泳等。简而言之,通过连接退火的寡核苷酸(表1)到BsaI消化的pUC19-U6-sgRNA(SEQ ID NO:23-30)载体中构建用于细胞转染测定的靶向sgRNA。
Table 1. Target sequences in the human genome.
Figure BDA0001851846150000151
Figure BDA0001851846150000161
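2. De novo gene synthesis and plasmid construction
New CRISPR-C2c1 orthologs were identified using the PSI-BLAST program (Altschul, S.F. et al., Nucleic Acids Res 25, 3389-3402 (1997)). Their coding sequences were humanized (Grote, A. et al., Nucleic Acids Res 33, W526-531 (2005)), and the oligonucleotides for C2c1 gene and sgRNA synthesis were designed with the GeneDesign program (Richardson, S.M. et al., Genome Res 16, 550-556 (2006)). Each C2c1 gene was synthesized as described in the literature (Li, G. et al., Methods Mol Biol 1073, 9-17 (2013)). The purified products were assembled into expression vectors by in vitro homologous recombination using NEBuilder HiFi DNA Assembly Master Mix (NEB). The pCAG-2AeGFP (SEQ ID NO: 21) and pUC19-U6 (SEQ ID NO: 22) vectors were used for mammalian expression of the C2c1 proteins and the sgRNAs, respectively.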
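3. Cell culture and transfection
Human embryonic kidney 293T cells were cultured in Dulbecco's Modified Eagle Medium (DMEM, Gibco) supplemented with 10% fetal bovine serum (FBS, Gibco) and 1% antibiotic-antimycotic (Gibco) at 37 °C with 5% CO2. The 293T cells were transfected with Lipofectamine LTX (Invitrogen) following the manufacturer's recommended protocol. For each well of a 48-well plate, a total of 400 ng of plasmid (C2c1:sgRNA = 2:1) was used. Cells were harvested 48 hours after transfection and used directly for genomic DNA extraction.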
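As an illustration of the plasmid amounts above, the following minimal Python sketch splits the 400 ng per well between the C2c1 and sgRNA plasmids. It assumes the 2:1 ratio is a mass ratio, which the text does not state explicitly; the function name `split_plasmid_mass` is hypothetical.

```python
def split_plasmid_mass(total_ng: float, c2c1_parts: float = 2.0, sgrna_parts: float = 1.0):
    """Split a total plasmid mass between C2c1 and sgRNA plasmids.

    Assumption: the C2c1:sgRNA ratio is interpreted as a mass ratio.
    Returns (c2c1_ng, sgrna_ng).
    """
    if total_ng < 0 or c2c1_parts <= 0 or sgrna_parts <= 0:
        raise ValueError("total mass must be non-negative and ratio parts positive")
    parts = c2c1_parts + sgrna_parts
    return total_ng * c2c1_parts / parts, total_ng * sgrna_parts / parts


if __name__ == "__main__":
    c2c1_ng, sgrna_ng = split_plasmid_mass(400.0)  # one well of a 48-well plate
    print(f"C2c1 plasmid: {c2c1_ng:.1f} ng, sgRNA plasmid: {sgrna_ng:.1f} ng")
```

Under this assumption, each well receives roughly 267 ng of C2c1 plasmid and 133 ng of sgRNA plasmid.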
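4. T7 endonuclease I (T7EI) assay and Sanger sequencing
Harvested cells were lysed directly in Buffer L (Bimake) supplemented with proteinase K, incubated at 55 °C for 3 hours, and then inactivated at 95 °C for 10 minutes. The genomic region surrounding the C2c1 target site of each gene was amplified by PCR. 200-400 ng of PCR product was mixed with ddH2O to a final volume of 10 μL and re-annealed according to a previously described method to form heteroduplexes. After re-annealing, the products were treated with 1/10 volume of NEBuffer 2.1 and 0.2 μL of T7EI (NEB) at 37 °C for 30 minutes and analyzed on a 3% agarose gel. Indels were quantified based on relative band intensities (Cong, L. et al., Science 339, 819-823 (2013)). Mutated products identified by the T7EI assay were cloned into a TA cloning vector and transformed into competent E. coli (Transgen Biotech). After overnight culture, colonies were picked at random and sequenced.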
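The text does not spell out how indels are quantified from band intensities. A minimal Python sketch follows, assuming the formula commonly used with this assay: indel (%) = 100 × (1 − √(1 − f_cut)), where f_cut = (b + c) / (a + b + c), a is the intensity of the undigested PCR product, and b and c are the intensities of the two cleavage products. The function name and the example intensities are hypothetical.

```python
import math


def t7ei_indel_percent(uncut: float, cut1: float, cut2: float) -> float:
    """Estimate indel frequency (%) from T7EI gel band intensities.

    uncut: integrated intensity of the undigested PCR product (a)
    cut1, cut2: integrated intensities of the two cleavage products (b, c)
    Assumption: indel % = 100 * (1 - sqrt(1 - (b + c) / (a + b + c))).
    """
    if min(uncut, cut1, cut2) < 0:
        raise ValueError("band intensities must be non-negative")
    total = uncut + cut1 + cut2
    if total <= 0:
        raise ValueError("at least one band must have a positive intensity")
    fraction_cleaved = (cut1 + cut2) / total
    return 100.0 * (1.0 - math.sqrt(1.0 - fraction_cleaved))


if __name__ == "__main__":
    # Hypothetical intensities in arbitrary densitometry units.
    print(f"{t7ei_indel_percent(75.0, 15.0, 10.0):.1f}% indels")
```

With the hypothetical intensities above, a quarter of the signal lies in the cleaved bands, which corresponds to about 13.4% indels. The square root reflects that heteroduplexes form only when a mutant strand re-anneals with a non-identical strand.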
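Example 1. Identification of new C2c1 proteins
Six representative C2c1 proteins from different bacteria, together with four previously reported C2c1 orthologs, were selected and synthesized de novo for genome editing in human embryonic kidney 293T cells (Figs. 1 and 2, and SEQ ID NO: 1-10). Among these 10 C2c1 orthologs, the C2c1 from D. inopinatus (DiC2c1) and from T. calidus (TcC2c1) had neither a predictable precursor CRISPR RNA (pre-crRNA) nor a trans-activating crRNA (tracrRNA) (Fig. 1b), suggesting that these two C2c1 proteins might not be suitable for genome editing applications.
For mammalian genome editing, 293T cells were co-transfected with an individual C2c1 enzyme and its cognate single guide RNA (sgRNA) targeting human endogenous loci containing an appropriate PAM (Fig. 1). The T7 endonuclease I (T7EI) assay showed that, in addition to AaC2c1 and AkC2c1, which the inventors had identified previously, four new C2c1 orthologs (AmC2c1, BhC2c1, Bs3C2c1, and LsC2c1) robustly edited the human genome, although their targeting efficiencies varied among orthologs and among target sites (Fig. 1b and Fig. 3a). Multiplex genome editing was also achieved with Bs3C2c1 simply by using multiple sgRNAs, editing four sites in the human genome simultaneously (Fig. 3b, c). These newly identified C2c1 orthologs expand the options for C2c1-based genome editing.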
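Example 2. Interchangeability between different C2c1 proteins and dual RNAs
To investigate the interchangeability between the dual RNA (crRNA and tracrRNA) and the protein component in C2c1 systems, the conservation of both the C2c1 proteins and the dual RNAs was analyzed first. In addition to the conserved amino acid sequences of the C2c1 orthologs (Fig. 4a and Fig. 2), the DNA sequences and secondary structures of the pre-crRNA:tracrRNA duplexes were also highly conserved (Fig. 1b and Fig. 5). Next, genome editing was performed in 293T cells with each of the 8 C2c1 orthologs complexed with each of the sgRNAs from the 8 C2c1 systems. As shown by the T7EI assay, sgRNAs derived from the AaC2c1, AkC2c1, AmC2c1, Bs3C2c1, and LsC2c1 loci could substitute for the original sgRNA in mammalian genome editing, although activity varied among the different C2c1 orthologs and sgRNAs (Fig. 4c, d and Fig. 6). These results demonstrate interchangeability between different C2c1 proteins and dual RNAs from different C2c1 loci.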
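Example 3. Genome editing with C2c1 orthologs whose native loci lack a CRISPR array
The present invention further selected two C2c1 orthologs whose loci carry no CRISPR array, DiC2c1 and TcC2c1, for subsequent experiments (Fig. 7a). Because their loci carry no CRISPR array, the sequences of their crRNA:tracrRNA duplexes cannot be predicted. 293T cells were co-transfected with DiC2c1, TcC2c1, or AaC2c1, each combined with sgRNAs that were derived from the loci of the other 8 C2c1 orthologs and targeted different genomic sites. The T7EI assay showed that sgRNAs derived from AaC2c1, AkC2c1, AmC2c1, Bs3C2c1, and LsC2c1 enabled TcC2c1 to robustly edit the human genome (Fig. 7b, c and Fig. 8). In addition, AasgRNA or AksgRNA enabled TcC2c1 to perform multiplex genome editing (Fig. 7d and Fig. 9). These results show that the interchangeability between C2c1 proteins and dual RNAs from different systems makes it possible to edit mammalian genomes with C2c1 orthologs whose native loci have no CRISPR array.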
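Example 4. Design of artificial sgRNAs for C2c1-mediated genome editing
The interchangeability between C2c1 proteins and dual RNAs across different C2c1 systems makes it feasible to design new artificial sgRNA (artsgRNA) scaffolds to facilitate C2c1-mediated genome editing. Given the conservation of DNA sequence and secondary structure among the C2c1 orthologs (Figs. 1b and 3), 37 sgRNA scaffolds (SEQ ID NO: 39-75) were designed and synthesized de novo to target the human CCR5 locus (Fig. 7e, Fig. 10a). The T7EI assay showed that 22 of the artsgRNA scaffolds worked efficiently (Fig. 10b). To verify the general applicability of artsgRNAs, artsgRNA13 was used to guide TcC2c1 or AaC2c1 in multiplex genome editing (Fig. 10a). The T7EI assay showed that artsgRNA13 supported multiplex genome editing by both TcC2c1 and AaC2c1 (Fig. 7f and Fig. 10c). These results show that designing and synthesizing artsgRNAs can facilitate C2c1-mediated genome editing, especially multiplex genome editing.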
Table 2. Sequences and related information referred to in the present invention
Figure BDA0001851846150000181
Figure BDA0001851846150000191
Sequence Listing
<110> Institute of Zoology, Chinese Academy of Sciences
<120> Genome editing system and method based on C2c1 nuclease
<130> TC2537
<160> 80
<170> PatentIn version 3.5
<210> 1
<211> 1129
<212> PRT
<213> Alicyclobacillus acidiphilus
<400> 1
Met Ala Val Lys Ser Met Lys Val Lys Leu Arg Leu Asp Asn Met Pro
1 5 10 15
Glu Ile Arg Ala Gly Leu Trp Lys Leu His Thr Glu Val Asn Ala Gly
20 25 30
Val Arg Tyr Tyr Thr Glu Trp Leu Ser Leu Leu Arg Gln Glu Asn Leu
35 40 45
Tyr Arg Arg Ser Pro Asn Gly Asp Gly Glu Gln Glu Cys Tyr Lys Thr
50 55 60
Ala Glu Glu Cys Lys Ala Glu Leu Leu Glu Arg Leu Arg Ala Arg Gln
65 70 75 80
Val Glu Asn Gly His Cys Gly Pro Ala Gly Ser Asp Asp Glu Leu Leu
85 90 95
Gln Leu Ala Arg Gln Leu Tyr Glu Leu Leu Val Pro Gln Ala Ile Gly
100 105 110
Ala Lys Gly Asp Ala Gln Gln Ile Ala Arg Lys Phe Leu Ser Pro Leu
115 120 125
Ala Asp Lys Asp Ala Val Gly Gly Leu Gly Ile Ala Lys Ala Gly Asn
130 135 140
Lys Pro Arg Trp Val Arg Met Arg Glu Ala Gly Glu Pro Gly Trp Glu
145 150 155 160
Glu Glu Lys Ala Lys Ala Glu Ala Arg Lys Ser Thr Asp Arg Thr Ala
165 170 175
Asp Val Leu Arg Ala Leu Ala Asp Phe Gly Leu Lys Pro Leu Met Arg
180 185 190
Val Tyr Thr Asp Ser Asp Met Ser Ser Val Gln Trp Lys Pro Leu Arg
195 200 205
Lys Gly Gln Ala Val Arg Thr Trp Asp Arg Asp Met Phe Gln Gln Ala
210 215 220
Ile Glu Arg Met Met Ser Trp Glu Ser Trp Asn Gln Arg Val Gly Glu
225 230 235 240
Ala Tyr Ala Lys Leu Val Glu Gln Lys Ser Arg Phe Glu Gln Lys Asn
245 250 255
Phe Val Gly Gln Glu His Leu Val Gln Leu Val Asn Gln Leu Gln Gln
260 265 270
Asp Met Lys Glu Ala Ser His Gly Leu Glu Ser Lys Glu Gln Thr Ala
275 280 285
His Tyr Leu Thr Gly Arg Ala Leu Arg Gly Ser Asp Lys Val Phe Glu
290 295 300
Lys Trp Glu Lys Leu Asp Pro Asp Ala Pro Phe Asp Leu Tyr Asp Thr
305 310 315 320
Glu Ile Lys Asn Val Gln Arg Arg Asn Thr Arg Arg Phe Gly Ser His
325 330 335
Asp Leu Phe Ala Lys Leu Ala Glu Pro Lys Tyr Gln Ala Leu Trp Arg
340 345 350
Glu Asp Ala Ser Phe Leu Thr Arg Tyr Ala Val Tyr Asn Ser Ile Val
355 360 365
Arg Lys Leu Asn His Ala Lys Met Phe Ala Thr Phe Thr Leu Pro Asp
370 375 380
Ala Thr Ala His Pro Ile Trp Thr Arg Phe Asp Lys Leu Gly Gly Asn
385 390 395 400
Leu His Gln Tyr Thr Phe Leu Phe Asn Glu Phe Gly Glu Gly Arg His
405 410 415
Ala Ile Arg Phe Gln Lys Leu Leu Thr Val Glu Asp Gly Val Ala Lys
420 425 430
Glu Val Asp Asp Val Thr Val Pro Ile Ser Met Ser Ala Gln Leu Asp
435 440 445
Asp Leu Leu Pro Arg Asp Pro His Glu Leu Val Ala Leu Tyr Phe Gln
450 455 460
Asp Tyr Gly Ala Glu Gln His Leu Ala Gly Glu Phe Gly Gly Ala Lys
465 470 475 480
Ile Gln Tyr Arg Arg Asp Gln Leu Asn His Leu His Ala Arg Arg Gly
485 490 495
Ala Arg Asp Val Tyr Leu Asn Leu Ser Val Arg Val Gln Ser Gln Ser
500 505 510
Glu Ala Arg Gly Glu Arg Arg Pro Pro Tyr Ala Ala Val Phe Arg Leu
515 520 525
Val Gly Asp Asn His Arg Ala Phe Val His Phe Asp Lys Leu Ser Asp
530 535 540
Tyr Leu Ala Glu His Pro Asp Asp Gly Lys Leu Gly Ser Glu Gly Leu
545 550 555 560
Leu Ser Gly Leu Arg Val Met Ser Val Asp Leu Gly Leu Arg Thr Ser
565 570 575
Ala Ser Ile Ser Val Phe Arg Val Ala Arg Lys Asp Glu Leu Lys Pro
580 585 590
Asn Ser Glu Gly Arg Val Pro Phe Cys Phe Pro Ile Glu Gly Asn Glu
595 600 605
Asn Leu Val Ala Val His Glu Arg Ser Gln Leu Leu Lys Leu Pro Gly
610 615 620
Glu Thr Glu Ser Lys Asp Leu Arg Ala Ile Arg Glu Glu Arg Gln Arg
625 630 635 640
Thr Leu Arg Gln Leu Arg Thr Gln Leu Ala Tyr Leu Arg Leu Leu Val
645 650 655
Arg Cys Gly Ser Glu Asp Val Gly Arg Arg Glu Arg Ser Trp Ala Lys
660 665 670
Leu Ile Glu Gln Pro Met Asp Ala Asn Gln Met Thr Pro Asp Trp Arg
675 680 685
Glu Ala Phe Glu Asp Glu Leu Gln Lys Leu Lys Ser Leu Tyr Gly Ile
690 695 700
Cys Gly Asp Arg Glu Trp Thr Glu Ala Val Tyr Glu Ser Val Arg Arg
705 710 715 720
Val Trp Arg His Met Gly Lys Gln Val Arg Asp Trp Arg Lys Asp Val
725 730 735
Arg Ser Gly Glu Arg Pro Lys Ile Arg Gly Tyr Gln Lys Asp Val Val
740 745 750
Gly Gly Asn Ser Ile Glu Gln Ile Glu Tyr Leu Glu Arg Gln Tyr Lys
755 760 765
Phe Leu Lys Ser Trp Ser Phe Phe Gly Lys Val Ser Gly Gln Val Ile
770 775 780
Arg Ala Glu Lys Gly Ser Arg Phe Ala Ile Thr Leu Arg Glu His Ile
785 790 795 800
Asp His Ala Lys Glu Asp Arg Leu Lys Lys Leu Ala Asp Arg Ile Ile
805 810 815
Met Glu Ala Leu Gly Tyr Val Tyr Ala Leu Asp Asp Glu Arg Gly Lys
820 825 830
Gly Lys Trp Val Ala Lys Tyr Pro Pro Cys Gln Leu Ile Leu Leu Glu
835 840 845
Glu Leu Ser Glu Tyr Gln Phe Asn Asn Asp Arg Pro Pro Ser Glu Asn
850 855 860
Asn Gln Leu Met Gln Trp Ser His Arg Gly Val Phe Gln Glu Leu Leu
865 870 875 880
Asn Gln Ala Gln Val His Asp Leu Leu Val Gly Thr Met Tyr Ala Ala
885 890 895
Phe Ser Ser Arg Phe Asp Ala Arg Thr Gly Ala Pro Gly Ile Arg Cys
900 905 910
Arg Arg Val Pro Ala Arg Cys Ala Arg Glu Gln Asn Pro Glu Pro Phe
915 920 925
Pro Trp Trp Leu Asn Lys Phe Val Ala Glu His Lys Leu Asp Gly Cys
930 935 940
Pro Leu Arg Ala Asp Asp Leu Ile Pro Thr Gly Glu Gly Glu Phe Phe
945 950 955 960
Val Ser Pro Phe Ser Ala Glu Glu Gly Asp Phe His Gln Ile His Ala
965 970 975
Asp Leu Asn Ala Ala Gln Asn Leu Gln Arg Arg Leu Trp Ser Asp Phe
980 985 990
Asp Ile Ser Gln Ile Arg Leu Arg Cys Asp Trp Gly Glu Val Asp Gly
995 1000 1005
Glu Pro Val Leu Ile Pro Arg Thr Thr Gly Lys Arg Thr Ala Asp
1010 1015 1020
Ser Tyr Gly Asn Lys Val Phe Tyr Thr Lys Thr Gly Val Thr Tyr
1025 1030 1035
Tyr Glu Arg Glu Arg Gly Lys Lys Arg Arg Lys Val Phe Ala Gln
1040 1045 1050
Glu Glu Leu Ser Glu Glu Glu Ala Glu Leu Leu Val Glu Ala Asp
1055 1060 1065
Glu Ala Arg Glu Lys Ser Val Val Leu Met Arg Asp Pro Ser Gly
1070 1075 1080
Ile Ile Asn Arg Gly Asp Trp Thr Arg Gln Lys Glu Phe Trp Ser
1085 1090 1095
Met Val Asn Gln Arg Ile Glu Gly Tyr Leu Val Lys Gln Ile Arg
1100 1105 1110
Ser Arg Val Arg Leu Gln Glu Ser Ala Cys Glu Asn Thr Gly Asp
1115 1120 1125
Ile
<210> 2
<211> 1147
<212> PRT
<213> Alicyclobacillus kakegawensis
<400> 2
Met Ala Val Lys Ser Ile Lys Val Lys Leu Arg Leu Ser Glu Cys Pro
1 5 10 15
Asp Ile Leu Ala Gly Met Trp Gln Leu His Arg Ala Thr Asn Ala Gly
20 25 30
Val Arg Tyr Tyr Thr Glu Trp Val Ser Leu Met Arg Gln Glu Ile Leu
35 40 45
Tyr Ser Arg Gly Pro Asp Gly Gly Gln Gln Cys Tyr Met Thr Ala Glu
50 55 60
Asp Cys Gln Arg Glu Leu Leu Arg Arg Leu Arg Asn Arg Gln Leu His
65 70 75 80
Asn Gly Arg Gln Asp Gln Pro Gly Thr Asp Ala Asp Leu Leu Ala Ile
85 90 95
Ser Arg Arg Leu Tyr Glu Ile Leu Val Leu Gln Ser Ile Gly Lys Arg
100 105 110
Gly Asp Ala Gln Gln Ile Ala Ser Ser Phe Leu Ser Pro Leu Val Asp
115 120 125
Pro Asn Ser Lys Gly Gly Arg Gly Glu Ala Lys Ser Gly Arg Lys Pro
130 135 140
Ala Trp Gln Lys Met Arg Asp Gln Gly Asp Pro Arg Trp Val Ala Ala
145 150 155 160
Arg Glu Lys Tyr Glu Gln Arg Lys Ala Val Asp Pro Ser Lys Glu Ile
165 170 175
Leu Asn Ser Leu Asp Ala Leu Gly Leu Arg Pro Leu Phe Ala Val Phe
180 185 190
Thr Glu Thr Tyr Arg Ser Gly Val Asp Trp Lys Pro Leu Gly Lys Ser
195 200 205
Gln Gly Val Arg Thr Trp Asp Arg Asp Met Phe Gln Gln Ala Leu Glu
210 215 220
Arg Leu Met Ser Trp Glu Ser Trp Asn Arg Arg Val Gly Glu Glu Tyr
225 230 235 240
Ala Arg Leu Phe Gln Gln Lys Met Lys Phe Glu Gln Glu His Phe Ala
245 250 255
Glu Gln Ser His Leu Val Lys Leu Ala Arg Ala Leu Glu Ala Asp Met
260 265 270
Arg Ala Ala Ser Gln Gly Phe Glu Ala Lys Arg Gly Thr Ala His Gln
275 280 285
Ile Thr Arg Arg Ala Leu Arg Gly Ala Asp Arg Val Phe Glu Ile Trp
290 295 300
Lys Ser Ile Pro Glu Glu Ala Leu Phe Ser Gln Tyr Asp Glu Val Ile
305 310 315 320
Arg Gln Val Gln Ala Glu Lys Arg Arg Asp Phe Gly Ser His Asp Leu
325 330 335
Phe Ala Lys Leu Ala Glu Pro Lys Tyr Gln Pro Leu Trp Arg Ala Asp
340 345 350
Glu Thr Phe Leu Thr Arg Tyr Ala Leu Tyr Asn Gly Val Leu Arg Asp
355 360 365
Leu Glu Lys Ala Arg Gln Phe Ala Thr Phe Thr Leu Pro Asp Ala Cys
370 375 380
Val Asn Pro Ile Trp Thr Arg Phe Glu Ser Ser Gln Gly Ser Asn Leu
385 390 395 400
His Lys Tyr Glu Phe Leu Phe Asp His Leu Gly Pro Gly Arg His Ala
405 410 415
Val Arg Phe Gln Arg Leu Leu Val Val Glu Ser Glu Gly Ala Lys Glu
420 425 430
Arg Asp Ser Val Val Val Pro Val Ala Pro Ser Gly Gln Leu Asp Lys
435 440 445
Leu Val Leu Arg Glu Glu Glu Lys Ser Ser Val Ala Leu His Leu His
450 455 460
Asp Thr Ala Arg Pro Asp Gly Phe Met Ala Glu Trp Ala Gly Ala Lys
465 470 475 480
Leu Gln Tyr Glu Arg Ser Thr Leu Ala Arg Lys Ala Arg Arg Asp Lys
485 490 495
Gln Gly Met Arg Ser Trp Arg Arg Gln Pro Ser Met Leu Met Ser Ala
500 505 510
Ala Gln Met Leu Glu Asp Ala Lys Gln Ala Gly Asp Val Tyr Leu Asn
515 520 525
Ile Ser Val Arg Val Lys Ser Pro Ser Glu Val Arg Gly Gln Arg Arg
530 535 540
Pro Pro Tyr Ala Ala Leu Phe Arg Ile Asp Asp Lys Gln Arg Arg Val
545 550 555 560
Thr Val Asn Tyr Asn Lys Leu Ser Ala Tyr Leu Glu Glu His Pro Asp
565 570 575
Lys Gln Ile Pro Gly Ala Pro Gly Leu Leu Ser Gly Leu Arg Val Met
580 585 590
Ser Val Asp Leu Gly Leu Arg Thr Ser Ala Ser Ile Ser Val Phe Arg
595 600 605
Val Ala Lys Lys Glu Glu Val Glu Ala Leu Gly Asp Gly Arg Pro Pro
610 615 620
His Tyr Tyr Pro Ile His Gly Thr Asp Asp Leu Val Ala Val His Glu
625 630 635 640
Arg Ser His Leu Ile Gln Met Pro Gly Glu Thr Glu Thr Lys Gln Leu
645 650 655
Arg Lys Leu Arg Glu Glu Arg Gln Ala Val Leu Arg Pro Leu Phe Ala
660 665 670
Gln Leu Ala Leu Leu Arg Leu Leu Val Arg Cys Gly Ala Ala Asp Glu
675 680 685
Arg Ile Arg Thr Arg Ser Trp Gln Arg Leu Thr Lys Gln Gly Arg Glu
690 695 700
Phe Thr Lys Arg Leu Thr Pro Ser Trp Arg Glu Ala Leu Glu Leu Glu
705 710 715 720
Leu Thr Arg Leu Glu Ala Tyr Cys Gly Arg Val Pro Asp Asp Glu Trp
725 730 735
Ser Arg Ile Val Asp Arg Thr Val Ile Ala Leu Trp Arg Arg Met Gly
740 745 750
Lys Gln Val Arg Asp Trp Arg Lys Gln Val Lys Ser Gly Ala Lys Val
755 760 765
Lys Val Lys Gly Tyr Gln Leu Asp Val Val Gly Gly Asn Ser Leu Ala
770 775 780
Gln Ile Asp Tyr Leu Glu Gln Gln Tyr Lys Phe Leu Arg Arg Trp Ser
785 790 795 800
Phe Phe Ala Arg Ala Ser Gly Leu Val Val Arg Ala Asp Arg Glu Ser
805 810 815
His Phe Ala Val Ala Leu Arg Gln His Ile Glu Asn Ala Lys Arg Asp
820 825 830
Arg Leu Lys Lys Leu Ala Asp Arg Ile Leu Met Glu Ala Leu Gly Tyr
835 840 845
Val Tyr Glu Ala Ser Gly Pro Arg Glu Gly Gln Trp Thr Ala Gln His
850 855 860
Pro Pro Cys Gln Leu Ile Ile Leu Glu Glu Leu Ser Ala Tyr Arg Phe
865 870 875 880
Ser Asp Asp Arg Pro Pro Ser Glu Asn Ser Lys Leu Met Ala Trp Gly
885 890 895
His Arg Gly Ile Leu Glu Glu Leu Val Asn Gln Ala Gln Val His Asp
900 905 910
Val Leu Val Gly Thr Val Tyr Ala Ala Phe Ser Ser Arg Phe Asp Ala
915 920 925
Arg Thr Gly Ala Pro Gly Val Arg Cys Arg Arg Val Pro Ala Arg Phe
930 935 940
Val Gly Ala Thr Val Asp Asp Ser Leu Pro Leu Trp Leu Thr Glu Phe
945 950 955 960
Leu Asp Lys His Arg Leu Asp Lys Asn Leu Leu Arg Pro Asp Asp Val
965 970 975
Ile Pro Thr Gly Glu Gly Glu Phe Leu Val Ser Pro Cys Gly Glu Glu
980 985 990
Ala Ala Arg Val Arg Gln Val His Ala Asp Ile Asn Ala Ala Gln Asn
995 1000 1005
Leu Gln Arg Arg Leu Trp Gln Asn Phe Asp Ile Thr Glu Leu Arg
1010 1015 1020
Leu Arg Cys Asp Val Lys Met Gly Gly Glu Gly Thr Val Leu Val
1025 1030 1035
Pro Arg Val Asn Asn Ala Arg Ala Lys Gln Leu Phe Gly Lys Lys
1040 1045 1050
Val Leu Val Ser Gln Asp Gly Val Thr Phe Phe Glu Arg Ser Gln
1055 1060 1065
Thr Gly Gly Lys Pro His Ser Glu Lys Gln Thr Asp Leu Thr Asp
1070 1075 1080
Lys Glu Leu Glu Leu Ile Ala Glu Ala Asp Glu Ala Arg Ala Lys
1085 1090 1095
Ser Val Val Leu Phe Arg Asp Pro Ser Gly His Ile Gly Lys Gly
1100 1105 1110
His Trp Ile Arg Gln Arg Glu Phe Trp Ser Leu Val Lys Gln Arg
1115 1120 1125
Ile Glu Ser His Thr Ala Glu Arg Ile Arg Val Arg Gly Val Gly
1130 1135 1140
Ser Ser Leu Asp
1145
<210> 3
<211> 1146
<212> PRT
<213> Alicyclobacillus macrosporangiidus
<400> 3
Met Asn Val Ala Val Lys Ser Ile Lys Val Lys Leu Met Leu Gly His
1 5 10 15
Leu Pro Glu Ile Arg Glu Gly Leu Trp His Leu His Glu Ala Val Asn
20 25 30
Leu Gly Val Arg Tyr Tyr Thr Glu Trp Leu Ala Leu Leu Arg Gln Gly
35 40 45
Asn Leu Tyr Arg Arg Gly Lys Asp Gly Ala Gln Glu Cys Tyr Met Thr
50 55 60
Ala Glu Gln Cys Arg Gln Glu Leu Leu Val Arg Leu Arg Asp Arg Gln
65 70 75 80
Lys Arg Asn Gly His Thr Gly Asp Pro Gly Thr Asp Glu Glu Leu Leu
85 90 95
Gly Val Ala Arg Arg Leu Tyr Glu Leu Leu Val Pro Gln Ser Val Gly
100 105 110
Lys Lys Gly Gln Ala Gln Met Leu Ala Ser Gly Phe Leu Ser Pro Leu
115 120 125
Ala Asp Pro Lys Ser Glu Gly Gly Lys Gly Thr Ser Lys Ser Gly Arg
130 135 140
Lys Pro Ala Trp Met Gly Met Lys Glu Ala Gly Asp Ser Arg Trp Val
145 150 155 160
Glu Ala Lys Ala Arg Tyr Glu Ala Asn Lys Ala Lys Asp Pro Thr Lys
165 170 175
Gln Val Ile Ala Ser Leu Glu Met Tyr Gly Leu Arg Pro Leu Phe Asp
180 185 190
Val Phe Thr Glu Thr Tyr Lys Thr Ile Arg Trp Met Pro Leu Gly Lys
195 200 205
His Gln Gly Val Arg Ala Trp Asp Arg Asp Met Phe Gln Gln Ser Leu
210 215 220
Glu Arg Leu Met Ser Trp Glu Ser Trp Asn Glu Arg Val Gly Ala Glu
225 230 235 240
Phe Ala Arg Leu Val Asp Arg Arg Asp Arg Phe Arg Glu Lys His Phe
245 250 255
Thr Gly Gln Glu His Leu Val Ala Leu Ala Gln Arg Leu Glu Gln Glu
260 265 270
Met Lys Glu Ala Ser Pro Gly Phe Glu Ser Lys Ser Ser Gln Ala His
275 280 285
Arg Ile Thr Lys Arg Ala Leu Arg Gly Ala Asp Gly Ile Ile Asp Asp
290 295 300
Trp Leu Lys Leu Ser Glu Gly Glu Pro Val Asp Arg Phe Asp Glu Ile
305 310 315 320
Leu Arg Lys Arg Gln Ala Gln Asn Pro Arg Arg Phe Gly Ser His Asp
325 330 335
Leu Phe Leu Lys Leu Ala Glu Pro Val Phe Gln Pro Leu Trp Arg Glu
340 345 350
Asp Pro Ser Phe Leu Ser Arg Trp Ala Ser Tyr Asn Glu Val Leu Asn
355 360 365
Lys Leu Glu Asp Ala Lys Gln Phe Ala Thr Phe Thr Leu Pro Ser Pro
370 375 380
Cys Ser Asn Pro Val Trp Ala Arg Phe Glu Asn Ala Glu Gly Thr Asn
385 390 395 400
Ile Phe Lys Tyr Asp Phe Leu Phe Asp His Phe Gly Lys Gly Arg His
405 410 415
Gly Val Arg Phe Gln Arg Met Ile Val Met Arg Asp Gly Val Pro Thr
420 425 430
Glu Val Glu Gly Ile Val Val Pro Ile Ala Pro Ser Arg Gln Leu Asp
435 440 445
Ala Leu Ala Pro Asn Asp Ala Ala Ser Pro Ile Asp Val Phe Val Gly
450 455 460
Asp Pro Ala Ala Pro Gly Ala Phe Arg Gly Gln Phe Gly Gly Ala Lys
465 470 475 480
Ile Gln Tyr Arg Arg Ser Ala Leu Val Arg Lys Gly Arg Arg Glu Glu
485 490 495
Lys Ala Tyr Leu Cys Gly Phe Arg Leu Pro Ser Gln Arg Arg Thr Gly
500 505 510
Thr Pro Ala Asp Asp Ala Gly Glu Val Phe Leu Asn Leu Ser Leu Arg
515 520 525
Val Glu Ser Gln Ser Glu Gln Ala Gly Arg Arg Asn Pro Pro Tyr Ala
530 535 540
Ala Val Phe His Ile Ser Asp Gln Thr Arg Arg Val Ile Val Arg Tyr
545 550 555 560
Gly Glu Ile Glu Arg Tyr Leu Ala Glu His Pro Asp Thr Gly Ile Pro
565 570 575
Gly Ser Arg Gly Leu Thr Ser Gly Leu Arg Val Met Ser Val Asp Leu
580 585 590
Gly Leu Arg Thr Ser Ala Ala Ile Ser Val Phe Arg Val Ala His Arg
595 600 605
Asp Glu Leu Thr Pro Asp Ala His Gly Arg Gln Pro Phe Phe Phe Pro
610 615 620
Ile His Gly Met Asp His Leu Val Ala Leu His Glu Arg Ser His Leu
625 630 635 640
Ile Arg Leu Pro Gly Glu Thr Glu Ser Lys Lys Val Arg Ser Ile Arg
645 650 655
Glu Gln Arg Leu Asp Arg Leu Asn Arg Leu Arg Ser Gln Met Ala Ser
660 665 670
Leu Arg Leu Leu Val Arg Thr Gly Val Leu Asp Glu Gln Lys Arg Asp
675 680 685
Arg Asn Trp Glu Arg Leu Gln Ser Ser Met Glu Arg Gly Gly Glu Arg
690 695 700
Met Pro Ser Asp Trp Trp Asp Leu Phe Gln Ala Gln Val Arg Tyr Leu
705 710 715 720
Ala Gln His Arg Asp Ala Ser Gly Glu Ala Trp Gly Arg Met Val Gln
725 730 735
Ala Ala Val Arg Thr Leu Trp Arg Gln Leu Ala Lys Gln Val Arg Asp
740 745 750
Trp Arg Lys Glu Val Arg Arg Asn Ala Asp Lys Val Lys Ile Arg Gly
755 760 765
Ile Ala Arg Asp Val Pro Gly Gly His Ser Leu Ala Gln Leu Asp Tyr
770 775 780
Leu Glu Arg Gln Tyr Arg Phe Leu Arg Ser Trp Ser Ala Phe Ser Val
785 790 795 800
Gln Ala Gly Gln Val Val Arg Ala Glu Arg Asp Ser Arg Phe Ala Val
805 810 815
Ala Leu Arg Glu His Ile Asp Asn Gly Lys Lys Asp Arg Leu Lys Lys
820 825 830
Leu Ala Asp Arg Ile Leu Met Glu Ala Leu Gly Tyr Val Tyr Val Thr
835 840 845
Asp Gly Arg Arg Ala Gly Gln Trp Gln Ala Val Tyr Pro Pro Cys Gln
850 855 860
Leu Val Leu Leu Glu Glu Leu Ser Glu Tyr Arg Phe Ser Asn Asp Arg
865 870 875 880
Pro Pro Ser Glu Asn Ser Gln Leu Met Val Trp Ser His Arg Gly Val
885 890 895
Leu Glu Glu Leu Ile His Gln Ala Gln Val His Asp Val Leu Val Gly
900 905 910
Thr Ile Pro Ala Ala Phe Ser Ser Arg Phe Asp Ala Arg Thr Gly Ala
915 920 925
Pro Gly Ile Arg Cys Arg Arg Val Pro Ser Ile Pro Leu Lys Asp Ala
930 935 940
Pro Ser Ile Pro Ile Trp Leu Ser His Tyr Leu Lys Gln Thr Glu Arg
945 950 955 960
Asp Ala Ala Ala Leu Arg Pro Gly Glu Leu Ile Pro Thr Gly Asp Gly
965 970 975
Glu Phe Leu Val Thr Pro Ala Gly Arg Gly Ala Ser Gly Val Arg Val
980 985 990
Val His Ala Asp Ile Asn Ala Ala His Asn Leu Gln Arg Arg Leu Trp
995 1000 1005
Glu Asn Phe Asp Leu Ser Asp Ile Arg Val Arg Cys Asp Arg Arg
1010 1015 1020
Glu Gly Lys Asp Gly Thr Val Val Leu Ile Pro Arg Leu Thr Asn
1025 1030 1035
Gln Arg Val Lys Glu Arg Tyr Ser Gly Val Ile Phe Thr Ser Glu
1040 1045 1050
Asp Gly Val Ser Phe Thr Val Gly Asp Ala Lys Thr Arg Arg Arg
1055 1060 1065
Ser Ser Ala Ser Gln Gly Glu Gly Asp Asp Leu Ser Asp Glu Glu
1070 1075 1080
Gln Glu Leu Leu Ala Glu Ala Asp Asp Ala Arg Glu Arg Ser Val
1085 1090 1095
Val Leu Phe Arg Asp Pro Ser Gly Phe Val Asn Gly Gly Arg Trp
1100 1105 1110
Thr Ala Gln Arg Ala Phe Trp Gly Met Val His Asn Arg Ile Glu
1115 1120 1125
Thr Leu Leu Ala Glu Arg Phe Ser Val Ser Gly Ala Ala Glu Lys
1130 1135 1140
Val Arg Gly
1145
<210> 4
<211> 1108
<212> PRT
<213> Bacillus hisashii
<400> 4
Met Ala Thr Arg Ser Phe Ile Leu Lys Ile Glu Pro Asn Glu Glu Val
1 5 10 15
Lys Lys Gly Leu Trp Lys Thr His Glu Val Leu Asn His Gly Ile Ala
20 25 30
Tyr Tyr Met Asn Ile Leu Lys Leu Ile Arg Gln Glu Ala Ile Tyr Glu
35 40 45
His His Glu Gln Asp Pro Lys Asn Pro Lys Lys Val Ser Lys Ala Glu
50 55 60
Ile Gln Ala Glu Leu Trp Asp Phe Val Leu Lys Met Gln Lys Cys Asn
65 70 75 80
Ser Phe Thr His Glu Val Asp Lys Asp Glu Val Phe Asn Ile Leu Arg
85 90 95
Glu Leu Tyr Glu Glu Leu Val Pro Ser Ser Val Glu Lys Lys Gly Glu
100 105 110
Ala Asn Gln Leu Ser Asn Lys Phe Leu Tyr Pro Leu Val Asp Pro Asn
115 120 125
Ser Gln Ser Gly Lys Gly Thr Ala Ser Ser Gly Arg Lys Pro Arg Trp
130 135 140
Tyr Asn Leu Lys Ile Ala Gly Asp Pro Ser Trp Glu Glu Glu Lys Lys
145 150 155 160
Lys Trp Glu Glu Asp Lys Lys Lys Asp Pro Leu Ala Lys Ile Leu Gly
165 170 175
Lys Leu Ala Glu Tyr Gly Leu Ile Pro Leu Phe Ile Pro Tyr Thr Asp
180 185 190
Ser Asn Glu Pro Ile Val Lys Glu Ile Lys Trp Met Glu Lys Ser Arg
195 200 205
Asn Gln Ser Val Arg Arg Leu Asp Lys Asp Met Phe Ile Gln Ala Leu
210 215 220
Glu Arg Phe Leu Ser Trp Glu Ser Trp Asn Leu Lys Val Lys Glu Glu
225 230 235 240
Tyr Glu Lys Val Glu Lys Glu Tyr Lys Thr Leu Glu Glu Arg Ile Lys
245 250 255
Glu Asp Ile Gln Ala Leu Lys Ala Leu Glu Gln Tyr Glu Lys Glu Arg
260 265 270
Gln Glu Gln Leu Leu Arg Asp Thr Leu Asn Thr Asn Glu Tyr Arg Leu
275 280 285
Ser Lys Arg Gly Leu Arg Gly Trp Arg Glu Ile Ile Gln Lys Trp Leu
290 295 300
Lys Met Asp Glu Asn Glu Pro Ser Glu Lys Tyr Leu Glu Val Phe Lys
305 310 315 320
Asp Tyr Gln Arg Lys His Pro Arg Glu Ala Gly Asp Tyr Ser Val Tyr
325 330 335
Glu Phe Leu Ser Lys Lys Glu Asn His Phe Ile Trp Arg Asn His Pro
340 345 350
Glu Tyr Pro Tyr Leu Tyr Ala Thr Phe Cys Glu Ile Asp Lys Lys Lys
355 360 365
Lys Asp Ala Lys Gln Gln Ala Thr Phe Thr Leu Ala Asp Pro Ile Asn
370 375 380
His Pro Leu Trp Val Arg Phe Glu Glu Arg Ser Gly Ser Asn Leu Asn
385 390 395 400
Lys Tyr Arg Ile Leu Thr Glu Gln Leu His Thr Glu Lys Leu Lys Lys
405 410 415
Lys Leu Thr Val Gln Leu Asp Arg Leu Ile Tyr Pro Thr Glu Ser Gly
420 425 430
Gly Trp Glu Glu Lys Gly Lys Val Asp Ile Val Leu Leu Pro Ser Arg
435 440 445
Gln Phe Tyr Asn Gln Ile Phe Leu Asp Ile Glu Glu Lys Gly Lys His
450 455 460
Ala Phe Thr Tyr Lys Asp Glu Ser Ile Lys Phe Pro Leu Lys Gly Thr
465 470 475 480
Leu Gly Gly Ala Arg Val Gln Phe Asp Arg Asp His Leu Arg Arg Tyr
485 490 495
Pro His Lys Val Glu Ser Gly Asn Val Gly Arg Ile Tyr Phe Asn Met
500 505 510
Thr Val Asn Ile Glu Pro Thr Glu Ser Pro Val Ser Lys Ser Leu Lys
515 520 525
Ile His Arg Asp Asp Phe Pro Lys Val Val Asn Phe Lys Pro Lys Glu
530 535 540
Leu Thr Glu Trp Ile Lys Asp Ser Lys Gly Lys Lys Leu Lys Ser Gly
545 550 555 560
Ile Glu Ser Leu Glu Ile Gly Leu Arg Val Met Ser Ile Asp Leu Gly
565 570 575
Gln Arg Gln Ala Ala Ala Ala Ser Ile Phe Glu Val Val Asp Gln Lys
580 585 590
Pro Asp Ile Glu Gly Lys Leu Phe Phe Pro Ile Lys Gly Thr Glu Leu
595 600 605
Tyr Ala Val His Arg Ala Ser Phe Asn Ile Lys Leu Pro Gly Glu Thr
610 615 620
Leu Val Lys Ser Arg Glu Val Leu Arg Lys Ala Arg Glu Asp Asn Leu
625 630 635 640
Lys Leu Met Asn Gln Lys Leu Asn Phe Leu Arg Asn Val Leu His Phe
645 650 655
Gln Gln Phe Glu Asp Ile Thr Glu Arg Glu Lys Arg Val Thr Lys Trp
660 665 670
Ile Ser Arg Gln Glu Asn Ser Asp Val Pro Leu Val Tyr Gln Asp Glu
675 680 685
Leu Ile Gln Ile Arg Glu Leu Met Tyr Lys Pro Tyr Lys Asp Trp Val
690 695 700
Ala Phe Leu Lys Gln Leu His Lys Arg Leu Glu Val Glu Ile Gly Lys
705 710 715 720
Glu Val Lys His Trp Arg Lys Ser Leu Ser Asp Gly Arg Lys Gly Leu
725 730 735
Tyr Gly Ile Ser Leu Lys Asn Ile Asp Glu Ile Asp Arg Thr Arg Lys
740 745 750
Phe Leu Leu Arg Trp Ser Leu Arg Pro Thr Glu Pro Gly Glu Val Arg
755 760 765
Arg Leu Glu Pro Gly Gln Arg Phe Ala Ile Asp Gln Leu Asn His Leu
770 775 780
Asn Ala Leu Lys Glu Asp Arg Leu Lys Lys Met Ala Asn Thr Ile Ile
785 790 795 800
Met His Ala Leu Gly Tyr Cys Tyr Asp Val Arg Lys Lys Lys Trp Gln
805 810 815
Ala Lys Asn Pro Ala Cys Gln Ile Ile Leu Phe Glu Asp Leu Ser Asn
820 825 830
Tyr Asn Pro Tyr Glu Glu Arg Ser Arg Phe Glu Asn Ser Lys Leu Met
835 840 845
Lys Trp Ser Arg Arg Glu Ile Pro Arg Gln Val Ala Leu Gln Gly Glu
850 855 860
Ile Tyr Gly Leu Gln Val Gly Glu Val Gly Ala Gln Phe Ser Ser Arg
865 870 875 880
Phe His Ala Lys Thr Gly Ser Pro Gly Ile Arg Cys Ser Val Val Thr
885 890 895
Lys Glu Lys Leu Gln Asp Asn Arg Phe Phe Lys Asn Leu Gln Arg Glu
900 905 910
Gly Arg Leu Thr Leu Asp Lys Ile Ala Val Leu Lys Glu Gly Asp Leu
915 920 925
Tyr Pro Asp Lys Gly Gly Glu Lys Phe Ile Ser Leu Ser Lys Asp Arg
930 935 940
Lys Cys Val Thr Thr His Ala Asp Ile Asn Ala Ala Gln Asn Leu Gln
945 950 955 960
Lys Arg Phe Trp Thr Arg Thr His Gly Phe Tyr Lys Val Tyr Cys Lys
965 970 975
Ala Tyr Gln Val Asp Gly Gln Thr Val Tyr Ile Pro Glu Ser Lys Asp
980 985 990
Gln Lys Gln Lys Ile Ile Glu Glu Phe Gly Glu Gly Tyr Phe Ile Leu
995 1000 1005
Lys Asp Gly Val Tyr Glu Trp Val Asn Ala Gly Lys Leu Lys Ile
1010 1015 1020
Lys Lys Gly Ser Ser Lys Gln Ser Ser Ser Glu Leu Val Asp Ser
1025 1030 1035
Asp Ile Leu Lys Asp Ser Phe Asp Leu Ala Ser Glu Leu Lys Gly
1040 1045 1050
Glu Lys Leu Met Leu Tyr Arg Asp Pro Ser Gly Asn Val Phe Pro
1055 1060 1065
Ser Asp Lys Trp Met Ala Ala Gly Val Phe Phe Gly Lys Leu Glu
1070 1075 1080
Arg Ile Leu Ile Ser Lys Leu Thr Asn Gln Tyr Ser Ile Ser Thr
1085 1090 1095
Ile Glu Asp Asp Ser Ser Lys Gln Ser Met
1100 1105
<210> 5
<211> 1108
<212> PRT
<213> Bacillus
<400> 5
Met Ala Ile Arg Ser Ile Lys Leu Lys Leu Lys Thr His Thr Gly Pro
1 5 10 15
Glu Ala Gln Asn Leu Arg Lys Gly Ile Trp Arg Thr His Arg Leu Leu
20 25 30
Asn Glu Gly Val Ala Tyr Tyr Met Lys Met Leu Leu Leu Phe Arg Gln
35 40 45
Glu Ser Thr Gly Glu Arg Pro Lys Glu Glu Leu Gln Glu Glu Leu Ile
50 55 60
Cys His Ile Arg Glu Gln Gln Gln Arg Asn Gln Ala Asp Lys Asn Thr
65 70 75 80
Gln Ala Leu Pro Leu Asp Lys Ala Leu Glu Ala Leu Arg Gln Leu Tyr
85 90 95
Glu Leu Leu Val Pro Ser Ser Val Gly Gln Ser Gly Asp Ala Gln Ile
100 105 110
Ile Ser Arg Lys Phe Leu Ser Pro Leu Val Asp Pro Asn Ser Glu Gly
115 120 125
Gly Lys Gly Thr Ser Lys Ala Gly Ala Lys Pro Thr Trp Gln Lys Lys
130 135 140
Lys Glu Ala Asn Asp Pro Thr Trp Glu Gln Asp Tyr Glu Lys Trp Lys
145 150 155 160
Lys Arg Arg Glu Glu Asp Pro Thr Ala Ser Val Ile Thr Thr Leu Glu
165 170 175
Glu Tyr Gly Ile Arg Pro Ile Phe Pro Leu Tyr Thr Asn Thr Val Thr
180 185 190
Asp Ile Ala Trp Leu Pro Leu Gln Ser Asn Gln Phe Val Arg Thr Trp
195 200 205
Asp Arg Asp Met Leu Gln Gln Ala Ile Glu Arg Leu Leu Ser Trp Glu
210 215 220
Ser Trp Asn Lys Arg Val Gln Glu Glu Tyr Ala Lys Leu Lys Glu Lys
225 230 235 240
Met Ala Gln Leu Asn Glu Gln Leu Glu Gly Gly Gln Glu Trp Ile Ser
245 250 255
Leu Leu Glu Gln Tyr Glu Glu Asn Arg Glu Arg Glu Leu Arg Glu Asn
260 265 270
Met Thr Ala Ala Asn Asp Lys Tyr Arg Ile Thr Lys Arg Gln Met Lys
275 280 285
Gly Trp Asn Glu Leu Tyr Glu Leu Trp Ser Thr Phe Pro Ala Ser Ala
290 295 300
Ser His Glu Gln Tyr Lys Glu Ala Leu Lys Arg Val Gln Gln Arg Leu
305 310 315 320
Arg Gly Arg Phe Gly Asp Ala His Phe Phe Gln Tyr Leu Met Glu Glu
325 330 335
Lys Asn Arg Leu Ile Trp Lys Gly Asn Pro Gln Arg Ile His Tyr Phe
340 345 350
Val Ala Arg Asn Glu Leu Thr Lys Arg Leu Glu Glu Ala Lys Gln Ser
355 360 365
Ala Thr Met Thr Leu Pro Asn Ala Arg Lys His Pro Leu Trp Val Arg
370 375 380
Phe Asp Ala Arg Gly Gly Asn Leu Gln Asp Tyr Tyr Leu Thr Ala Glu
385 390 395 400
Ala Asp Lys Pro Arg Ser Arg Arg Phe Val Thr Phe Ser Gln Leu Ile
405 410 415
Trp Pro Ser Glu Ser Gly Trp Met Glu Lys Lys Asp Val Glu Val Glu
420 425 430
Leu Ala Leu Ser Arg Gln Phe Tyr Gln Gln Val Lys Leu Leu Lys Asn
435 440 445
Asp Lys Gly Lys Gln Lys Ile Glu Phe Lys Asp Lys Gly Ser Gly Ser
450 455 460
Thr Phe Asn Gly His Leu Gly Gly Ala Lys Leu Gln Leu Glu Arg Gly
465 470 475 480
Asp Leu Glu Lys Glu Glu Lys Asn Phe Glu Asp Gly Glu Ile Gly Ser
485 490 495
Val Tyr Leu Asn Val Val Ile Asp Phe Glu Pro Leu Gln Glu Val Lys
500 505 510
Asn Gly Arg Val Gln Ala Pro Tyr Gly Gln Val Leu Gln Leu Ile Arg
515 520 525
Arg Pro Asn Glu Phe Pro Lys Val Thr Thr Tyr Lys Ser Glu Gln Leu
530 535 540
Val Glu Trp Ile Lys Ala Ser Pro Gln His Ser Ala Gly Val Glu Ser
545 550 555 560
Leu Ala Ser Gly Phe Arg Val Met Ser Ile Asp Leu Gly Leu Arg Ala
565 570 575
Ala Ala Ala Thr Ser Ile Phe Ser Val Glu Glu Ser Ser Asp Lys Asn
580 585 590
Ala Ala Asp Phe Ser Tyr Trp Ile Glu Gly Thr Pro Leu Val Ala Val
595 600 605
His Gln Arg Ser Tyr Met Leu Arg Leu Pro Gly Glu Gln Val Glu Lys
610 615 620
Gln Val Met Glu Lys Arg Asp Glu Arg Phe Gln Leu His Gln Arg Val
625 630 635 640
Lys Phe Gln Ile Arg Val Leu Ala Gln Ile Met Arg Met Ala Asn Lys
645 650 655
Gln Tyr Gly Asp Arg Trp Asp Glu Leu Asp Ser Leu Lys Gln Ala Val
660 665 670
Glu Gln Lys Lys Ser Pro Leu Asp Gln Thr Asp Arg Thr Phe Trp Glu
675 680 685
Gly Ile Val Cys Asp Leu Thr Lys Val Leu Pro Arg Asn Glu Ala Asp
690 695 700
Trp Glu Gln Ala Val Val Gln Ile His Arg Lys Ala Glu Glu Tyr Val
705 710 715 720
Gly Lys Ala Val Gln Ala Trp Arg Lys Arg Phe Ala Ala Asp Glu Arg
725 730 735
Lys Gly Ile Ala Gly Leu Ser Met Trp Asn Ile Glu Glu Leu Glu Gly
740 745 750
Leu Arg Lys Leu Leu Ile Ser Trp Ser Arg Arg Thr Arg Asn Pro Gln
755 760 765
Glu Val Asn Arg Phe Glu Arg Gly His Thr Ser His Gln Arg Leu Leu
770 775 780
Thr His Ile Gln Asn Val Lys Glu Asp Arg Leu Lys Gln Leu Ser His
785 790 795 800
Ala Ile Val Met Thr Ala Leu Gly Tyr Val Tyr Asp Glu Arg Lys Gln
805 810 815
Glu Trp Cys Ala Glu Tyr Pro Ala Cys Gln Val Ile Leu Phe Glu Asn
820 825 830
Leu Ser Gln Tyr Arg Ser Asn Leu Asp Arg Ser Thr Lys Glu Asn Ser
835 840 845
Thr Leu Met Lys Trp Ala His Arg Ser Ile Pro Lys Tyr Val His Met
850 855 860
Gln Ala Glu Pro Tyr Gly Ile Gln Ile Gly Asp Val Arg Ala Glu Tyr
865 870 875 880
Ser Ser Arg Phe Tyr Ala Lys Thr Gly Thr Pro Gly Ile Arg Cys Lys
885 890 895
Lys Val Arg Gly Gln Asp Leu Gln Gly Arg Arg Phe Glu Asn Leu Gln
900 905 910
Lys Arg Leu Val Asn Glu Gln Phe Leu Thr Glu Glu Gln Val Lys Gln
915 920 925
Leu Arg Pro Gly Asp Ile Val Pro Asp Asp Ser Gly Glu Leu Phe Met
930 935 940
Thr Leu Thr Asp Gly Ser Gly Ser Lys Glu Val Val Phe Leu Gln Ala
945 950 955 960
Asp Ile Asn Ala Ala His Asn Leu Gln Lys Arg Phe Trp Gln Arg Tyr
965 970 975
Asn Glu Leu Phe Lys Val Ser Cys Arg Val Ile Val Arg Asp Glu Glu
980 985 990
Glu Tyr Leu Val Pro Lys Thr Lys Ser Val Gln Ala Lys Leu Gly Lys
995 1000 1005
Gly Leu Phe Val Lys Lys Ser Asp Thr Ala Trp Lys Asp Val Tyr
1010 1015 1020
Val Trp Asp Ser Gln Ala Lys Leu Lys Gly Lys Thr Thr Phe Thr
1025 1030 1035
Glu Glu Ser Glu Ser Pro Glu Gln Leu Glu Asp Phe Gln Glu Ile
1040 1045 1050
Ile Glu Glu Ala Glu Glu Ala Lys Gly Thr Tyr Arg Thr Leu Phe
1055 1060 1065
Arg Asp Pro Ser Gly Val Phe Phe Pro Glu Ser Val Trp Tyr Pro
1070 1075 1080
Gln Lys Asp Phe Trp Gly Glu Val Lys Arg Lys Leu Tyr Gly Lys
1085 1090 1095
Leu Arg Glu Arg Phe Leu Thr Lys Ala Arg
1100 1105
<210> 6
<211> 1112
<212> PRT
<213> Bacillus
<400> 6
Met Ala Ile Arg Ser Ile Lys Leu Lys Met Lys Thr Asn Ser Gly Thr
1 5 10 15
Asp Ser Ile Tyr Leu Arg Lys Ala Leu Trp Arg Thr His Gln Leu Ile
20 25 30
Asn Glu Gly Ile Ala Tyr Tyr Met Asn Leu Leu Thr Leu Tyr Arg Gln
35 40 45
Glu Ala Ile Gly Asp Lys Thr Lys Glu Ala Tyr Gln Ala Glu Leu Ile
50 55 60
Asn Ile Ile Arg Asn Gln Gln Arg Asn Asn Gly Ser Ser Glu Glu His
65 70 75 80
Gly Ser Asp Gln Glu Ile Leu Ala Leu Leu Arg Gln Leu Tyr Glu Leu
85 90 95
Ile Ile Pro Ser Ser Ile Gly Glu Ser Gly Asp Ala Asn Gln Leu Gly
100 105 110
Asn Lys Phe Leu Tyr Pro Leu Val Asp Pro Asn Ser Gln Ser Gly Lys
115 120 125
Gly Thr Ser Asn Ala Gly Arg Lys Pro Arg Trp Lys Arg Leu Lys Glu
130 135 140
Glu Gly Asn Pro Asp Trp Glu Leu Glu Lys Lys Lys Asp Glu Glu Arg
145 150 155 160
Lys Ala Lys Asp Pro Thr Val Lys Ile Phe Asp Asn Leu Asn Lys Tyr
165 170 175
Gly Leu Leu Pro Leu Phe Pro Leu Phe Thr Asn Ile Gln Lys Asp Ile
180 185 190
Glu Trp Leu Pro Leu Gly Lys Arg Gln Ser Val Arg Lys Trp Asp Lys
195 200 205
Asp Met Phe Ile Gln Ala Ile Glu Arg Leu Leu Ser Trp Glu Ser Trp
210 215 220
Asn Arg Arg Val Ala Asp Glu Tyr Lys Gln Leu Lys Glu Lys Thr Glu
225 230 235 240
Ser Tyr Tyr Lys Glu His Leu Thr Gly Gly Glu Glu Trp Ile Glu Lys
245 250 255
Ile Arg Lys Phe Glu Lys Glu Arg Asn Met Glu Leu Glu Lys Asn Ala
260 265 270
Phe Ala Pro Asn Asp Gly Tyr Phe Ile Thr Ser Arg Gln Ile Arg Gly
275 280 285
Trp Asp Arg Val Tyr Glu Lys Trp Ser Lys Leu Pro Glu Ser Ala Ser
290 295 300
Pro Glu Glu Leu Trp Lys Val Val Ala Glu Gln Gln Asn Lys Met Ser
305 310 315 320
Glu Gly Phe Gly Asp Pro Lys Val Phe Ser Phe Leu Ala Asn Arg Glu
325 330 335
Asn Arg Asp Ile Trp Arg Gly His Ser Glu Arg Ile Tyr His Ile Ala
340 345 350
Ala Tyr Asn Gly Leu Gln Lys Lys Leu Ser Arg Thr Lys Glu Gln Ala
355 360 365
Thr Phe Thr Leu Pro Asp Ala Ile Glu His Pro Leu Trp Ile Arg Tyr
370 375 380
Glu Ser Pro Gly Gly Thr Asn Leu Asn Leu Phe Lys Leu Glu Glu Lys
385 390 395 400
Gln Lys Lys Asn Tyr Tyr Val Thr Leu Ser Lys Ile Ile Trp Pro Ser
405 410 415
Glu Glu Lys Trp Ile Glu Lys Glu Asn Ile Glu Ile Pro Leu Ala Pro
420 425 430
Ser Ile Gln Phe Asn Arg Gln Ile Lys Leu Lys Gln His Val Lys Gly
435 440 445
Lys Gln Glu Ile Ser Phe Ser Asp Tyr Ser Ser Arg Ile Ser Leu Asp
450 455 460
Gly Val Leu Gly Gly Ser Arg Ile Gln Phe Asn Arg Lys Tyr Ile Lys
465 470 475 480
Asn His Lys Glu Leu Leu Gly Glu Gly Asp Ile Gly Pro Val Phe Phe
485 490 495
Asn Leu Val Val Asp Val Ala Pro Leu Gln Glu Thr Arg Asn Gly Arg
500 505 510
Leu Gln Ser Pro Ile Gly Lys Ala Leu Lys Val Ile Ser Ser Asp Phe
515 520 525
Ser Lys Val Ile Asp Tyr Lys Pro Lys Glu Leu Met Asp Trp Met Asn
530 535 540
Thr Gly Ser Ala Ser Asn Ser Phe Gly Val Ala Ser Leu Leu Glu Gly
545 550 555 560
Met Arg Val Met Ser Ile Asp Met Gly Gln Arg Thr Ser Ala Ser Val
565 570 575
Ser Ile Phe Glu Val Val Lys Glu Leu Pro Lys Asp Gln Glu Gln Lys
580 585 590
Leu Phe Tyr Ser Ile Asn Asp Thr Glu Leu Phe Ala Ile His Lys Arg
595 600 605
Ser Phe Leu Leu Asn Leu Pro Gly Glu Val Val Thr Lys Asn Asn Lys
610 615 620
Gln Gln Arg Gln Glu Arg Arg Lys Lys Arg Gln Phe Val Arg Ser Gln
625 630 635 640
Ile Arg Met Leu Ala Asn Val Leu Arg Leu Glu Thr Lys Lys Thr Pro
645 650 655
Asp Glu Arg Lys Lys Ala Ile His Lys Leu Met Glu Ile Val Gln Ser
660 665 670
Tyr Asp Ser Trp Thr Ala Ser Gln Lys Glu Val Trp Glu Lys Glu Leu
675 680 685
Asn Leu Leu Thr Asn Met Ala Ala Phe Asn Asp Glu Ile Trp Lys Glu
690 695 700
Ser Leu Val Glu Leu His His Arg Ile Glu Pro Tyr Val Gly Gln Ile
705 710 715 720
Val Ser Lys Trp Arg Lys Gly Leu Ser Glu Gly Arg Lys Asn Leu Ala
725 730 735
Gly Ile Ser Met Trp Asn Ile Asp Glu Leu Glu Asp Thr Arg Arg Leu
740 745 750
Leu Ile Ser Trp Ser Lys Arg Ser Arg Thr Pro Gly Glu Ala Asn Arg
755 760 765
Ile Glu Thr Asp Glu Pro Phe Gly Ser Ser Leu Leu Gln His Ile Gln
770 775 780
Asn Val Lys Asp Asp Arg Leu Lys Gln Met Ala Asn Leu Ile Ile Met
785 790 795 800
Thr Ala Leu Gly Phe Lys Tyr Asp Lys Glu Glu Lys Asp Arg Tyr Lys
805 810 815
Arg Trp Lys Glu Thr Tyr Pro Ala Cys Gln Ile Ile Leu Phe Glu Asn
820 825 830
Leu Asn Arg Tyr Leu Phe Asn Leu Asp Arg Ser Arg Arg Glu Asn Ser
835 840 845
Arg Leu Met Lys Trp Ala His Arg Ser Ile Pro Arg Thr Val Ser Met
850 855 860
Gln Gly Glu Met Phe Gly Leu Gln Val Gly Asp Val Arg Ser Glu Tyr
865 870 875 880
Ser Ser Arg Phe His Ala Lys Thr Gly Ala Pro Gly Ile Arg Cys His
885 890 895
Ala Leu Thr Glu Glu Asp Leu Lys Ala Gly Ser Asn Thr Leu Lys Arg
900 905 910
Leu Ile Glu Asp Gly Phe Ile Asn Glu Ser Glu Leu Ala Tyr Leu Lys
915 920 925
Lys Gly Asp Ile Ile Pro Ser Gln Gly Gly Glu Leu Phe Val Thr Leu
930 935 940
Ser Lys Arg Tyr Lys Lys Asp Ser Asp Asn Asn Glu Leu Thr Val Ile
945 950 955 960
His Ala Asp Ile Asn Ala Ala Gln Asn Leu Gln Lys Arg Phe Trp Gln
965 970 975
Gln Asn Ser Glu Val Tyr Arg Val Pro Cys Gln Leu Ala Arg Met Gly
980 985 990
Glu Asp Lys Leu Tyr Ile Pro Lys Ser Gln Thr Glu Thr Ile Lys Lys
995 1000 1005
Tyr Phe Gly Lys Gly Ser Phe Val Lys Asn Asn Thr Glu Gln Glu
1010 1015 1020
Val Tyr Lys Trp Glu Lys Ser Glu Lys Met Lys Ile Lys Thr Asp
1025 1030 1035
Thr Thr Phe Asp Leu Gln Asp Leu Asp Gly Phe Glu Asp Ile Ser
1040 1045 1050
Lys Thr Ile Glu Leu Ala Gln Glu Gln Gln Lys Lys Tyr Leu Thr
1055 1060 1065
Met Phe Arg Asp Pro Ser Gly Tyr Phe Phe Asn Asn Glu Thr Trp
1070 1075 1080
Arg Pro Gln Lys Glu Tyr Trp Ser Ile Val Asn Asn Ile Ile Lys
1085 1090 1095
Ser Cys Leu Lys Lys Lys Ile Leu Ser Asn Lys Val Glu Leu
1100 1105 1110
<210> 7
<211> 1149
<212> PRT
<213> Desulfovibrio inopinatus
<400> 7
Met Pro Thr Arg Thr Ile Asn Leu Lys Leu Val Leu Gly Lys Asn Pro
1 5 10 15
Glu Asn Ala Thr Leu Arg Arg Ala Leu Phe Ser Thr His Arg Leu Val
20 25 30
Asn Gln Ala Thr Lys Arg Ile Glu Glu Phe Leu Leu Leu Cys Arg Gly
35 40 45
Glu Ala Tyr Arg Thr Val Asp Asn Glu Gly Lys Glu Ala Glu Ile Pro
50 55 60
Arg His Ala Val Gln Glu Glu Ala Leu Ala Phe Ala Lys Ala Ala Gln
65 70 75 80
Arg His Asn Gly Cys Ile Ser Thr Tyr Glu Asp Gln Glu Ile Leu Asp
85 90 95
Val Leu Arg Gln Leu Tyr Glu Arg Leu Val Pro Ser Val Asn Glu Asn
100 105 110
Asn Glu Ala Gly Asp Ala Gln Ala Ala Asn Ala Trp Val Ser Pro Leu
115 120 125
Met Ser Ala Glu Ser Glu Gly Gly Leu Ser Val Tyr Asp Lys Val Leu
130 135 140
Asp Pro Pro Pro Val Trp Met Lys Leu Lys Glu Glu Lys Ala Pro Gly
145 150 155 160
Trp Glu Ala Ala Ser Gln Ile Trp Ile Gln Ser Asp Glu Gly Gln Ser
165 170 175
Leu Leu Asn Lys Pro Gly Ser Pro Pro Arg Trp Ile Arg Lys Leu Arg
180 185 190
Ser Gly Gln Pro Trp Gln Asp Asp Phe Val Ser Asp Gln Lys Lys Lys
195 200 205
Gln Asp Glu Leu Thr Lys Gly Asn Ala Pro Leu Ile Lys Gln Leu Lys
210 215 220
Glu Met Gly Leu Leu Pro Leu Val Asn Pro Phe Phe Arg His Leu Leu
225 230 235 240
Asp Pro Glu Gly Lys Gly Val Ser Pro Trp Asp Arg Leu Ala Val Arg
245 250 255
Ala Ala Val Ala His Phe Ile Ser Trp Glu Ser Trp Asn His Arg Thr
260 265 270
Arg Ala Glu Tyr Asn Ser Leu Lys Leu Arg Arg Asp Glu Phe Glu Ala
275 280 285
Ala Ser Asp Glu Phe Lys Asp Asp Phe Thr Leu Leu Arg Gln Tyr Glu
290 295 300
Ala Lys Arg His Ser Thr Leu Lys Ser Ile Ala Leu Ala Asp Asp Ser
305 310 315 320
Asn Pro Tyr Arg Ile Gly Val Arg Ser Leu Arg Ala Trp Asn Arg Val
325 330 335
Arg Glu Glu Trp Ile Asp Lys Gly Ala Thr Glu Glu Gln Arg Val Thr
340 345 350
Ile Leu Ser Lys Leu Gln Thr Gln Leu Arg Gly Lys Phe Gly Asp Pro
355 360 365
Asp Leu Phe Asn Trp Leu Ala Gln Asp Arg His Val His Leu Trp Ser
370 375 380
Pro Arg Asp Ser Val Thr Pro Leu Val Arg Ile Asn Ala Val Asp Lys
385 390 395 400
Val Leu Arg Arg Arg Lys Pro Tyr Ala Leu Met Thr Phe Ala His Pro
405 410 415
Arg Phe His Pro Arg Trp Ile Leu Tyr Glu Ala Pro Gly Gly Ser Asn
420 425 430
Leu Arg Gln Tyr Ala Leu Asp Cys Thr Glu Asn Ala Leu His Ile Thr
435 440 445
Leu Pro Leu Leu Val Asp Asp Ala His Gly Thr Trp Ile Glu Lys Lys
450 455 460
Ile Arg Val Pro Leu Ala Pro Ser Gly Gln Ile Gln Asp Leu Thr Leu
465 470 475 480
Glu Lys Leu Glu Lys Lys Lys Asn Arg Leu Tyr Tyr Arg Ser Gly Phe
485 490 495
Gln Gln Phe Ala Gly Leu Ala Gly Gly Ala Glu Val Leu Phe His Arg
500 505 510
Pro Tyr Met Glu His Asp Glu Arg Ser Glu Glu Ser Leu Leu Glu Arg
515 520 525
Pro Gly Ala Val Trp Phe Lys Leu Thr Leu Asp Val Ala Thr Gln Ala
530 535 540
Pro Pro Asn Trp Leu Asp Gly Lys Gly Arg Val Arg Thr Pro Pro Glu
545 550 555 560
Val His His Phe Lys Thr Ala Leu Ser Asn Lys Ser Lys His Thr Arg
565 570 575
Thr Leu Gln Pro Gly Leu Arg Val Leu Ser Val Asp Leu Gly Met Arg
580 585 590
Thr Phe Ala Ser Cys Ser Val Phe Glu Leu Ile Glu Gly Lys Pro Glu
595 600 605
Thr Gly Arg Ala Phe Pro Val Ala Asp Glu Arg Ser Met Asp Ser Pro
610 615 620
Asn Lys Leu Trp Ala Lys His Glu Arg Ser Phe Lys Leu Thr Leu Pro
625 630 635 640
Gly Glu Thr Pro Ser Arg Lys Glu Glu Glu Glu Arg Ser Ile Ala Arg
645 650 655
Ala Glu Ile Tyr Ala Leu Lys Arg Asp Ile Gln Arg Leu Lys Ser Leu
660 665 670
Leu Arg Leu Gly Glu Glu Asp Asn Asp Asn Arg Arg Asp Ala Leu Leu
675 680 685
Glu Gln Phe Phe Lys Gly Trp Gly Glu Glu Asp Val Val Pro Gly Gln
690 695 700
Ala Phe Pro Arg Ser Leu Phe Gln Gly Leu Gly Ala Ala Pro Phe Arg
705 710 715 720
Ser Thr Pro Glu Leu Trp Arg Gln His Cys Gln Thr Tyr Tyr Asp Lys
725 730 735
Ala Glu Ala Cys Leu Ala Lys His Ile Ser Asp Trp Arg Lys Arg Thr
740 745 750
Arg Pro Arg Pro Thr Ser Arg Glu Met Trp Tyr Lys Thr Arg Ser Tyr
755 760 765
His Gly Gly Lys Ser Ile Trp Met Leu Glu Tyr Leu Asp Ala Val Arg
770 775 780
Lys Leu Leu Leu Ser Trp Ser Leu Arg Gly Arg Thr Tyr Gly Ala Ile
785 790 795 800
Asn Arg Gln Asp Thr Ala Arg Phe Gly Ser Leu Ala Ser Arg Leu Leu
805 810 815
His His Ile Asn Ser Leu Lys Glu Asp Arg Ile Lys Thr Gly Ala Asp
820 825 830
Ser Ile Val Gln Ala Ala Arg Gly Tyr Ile Pro Leu Pro His Gly Lys
835 840 845
Gly Trp Glu Gln Arg Tyr Glu Pro Cys Gln Leu Ile Leu Phe Glu Asp
850 855 860
Leu Ala Arg Tyr Arg Phe Arg Val Asp Arg Pro Arg Arg Glu Asn Ser
865 870 875 880
Gln Leu Met Gln Trp Asn His Arg Ala Ile Val Ala Glu Thr Thr Met
885 890 895
Gln Ala Glu Leu Tyr Gly Gln Ile Val Glu Asn Thr Ala Ala Gly Phe
900 905 910
Ser Ser Arg Phe His Ala Ala Thr Gly Ala Pro Gly Val Arg Cys Arg
915 920 925
Phe Leu Leu Glu Arg Asp Phe Asp Asn Asp Leu Pro Lys Pro Tyr Leu
930 935 940
Leu Arg Glu Leu Ser Trp Met Leu Gly Asn Thr Lys Val Glu Ser Glu
945 950 955 960
Glu Glu Lys Leu Arg Leu Leu Ser Glu Lys Ile Arg Pro Gly Ser Leu
965 970 975
Val Pro Trp Asp Gly Gly Glu Gln Phe Ala Thr Leu His Pro Lys Arg
980 985 990
Gln Thr Leu Cys Val Ile His Ala Asp Met Asn Ala Ala Gln Asn Leu
995 1000 1005
Gln Arg Arg Phe Phe Gly Arg Cys Gly Glu Ala Phe Arg Leu Val
1010 1015 1020
Cys Gln Pro His Gly Asp Asp Val Leu Arg Leu Ala Ser Thr Pro
1025 1030 1035
Gly Ala Arg Leu Leu Gly Ala Leu Gln Gln Leu Glu Asn Gly Gln
1040 1045 1050
Gly Ala Phe Glu Leu Val Arg Asp Met Gly Ser Thr Ser Gln Met
1055 1060 1065
Asn Arg Phe Val Met Lys Ser Leu Gly Lys Lys Lys Ile Lys Pro
1070 1075 1080
Leu Gln Asp Asn Asn Gly Asp Asp Glu Leu Glu Asp Val Leu Ser
1085 1090 1095
Val Leu Pro Glu Glu Asp Asp Thr Gly Arg Ile Thr Val Phe Arg
1100 1105 1110
Asp Ser Ser Gly Ile Phe Phe Pro Cys Asn Val Trp Ile Pro Ala
1115 1120 1125
Lys Gln Phe Trp Pro Ala Val Arg Ala Met Ile Trp Lys Val Met
1130 1135 1140
Ala Ser His Ser Leu Gly
1145
<210> 8
<211> 1090
<212> PRT
<213> Laceyella sediminis
<400> 8
Met Ser Ile Arg Ser Phe Lys Leu Lys Ile Lys Thr Lys Ser Gly Val
1 5 10 15
Asn Ala Glu Glu Leu Arg Arg Gly Leu Trp Arg Thr His Gln Leu Ile
20 25 30
Asn Asp Gly Ile Ala Tyr Tyr Met Asn Trp Leu Val Leu Leu Arg Gln
35 40 45
Glu Asp Leu Phe Ile Arg Asn Glu Glu Thr Asn Glu Ile Glu Lys Arg
50 55 60
Ser Lys Glu Glu Ile Gln Gly Glu Leu Leu Glu Arg Val His Lys Gln
65 70 75 80
Gln Gln Arg Asn Gln Trp Ser Gly Glu Val Asp Asp Gln Thr Leu Leu
85 90 95
Gln Thr Leu Arg His Leu Tyr Glu Glu Ile Val Pro Ser Val Ile Gly
100 105 110
Lys Ser Gly Asn Ala Ser Leu Lys Ala Arg Phe Phe Leu Gly Pro Leu
115 120 125
Val Asp Pro Asn Asn Lys Thr Thr Lys Asp Val Ser Lys Ser Gly Pro
130 135 140
Thr Pro Lys Trp Lys Lys Met Lys Asp Ala Gly Asp Pro Asn Trp Val
145 150 155 160
Gln Glu Tyr Glu Lys Tyr Met Ala Glu Arg Gln Thr Leu Val Arg Leu
165 170 175
Glu Glu Met Gly Leu Ile Pro Leu Phe Pro Met Tyr Thr Asp Glu Val
180 185 190
Gly Asp Ile His Trp Leu Pro Gln Ala Ser Gly Tyr Thr Arg Thr Trp
195 200 205
Asp Arg Asp Met Phe Gln Gln Ala Ile Glu Arg Leu Leu Ser Trp Glu
210 215 220
Ser Trp Asn Arg Arg Val Arg Glu Arg Arg Ala Gln Phe Glu Lys Lys
225 230 235 240
Thr His Asp Phe Ala Ser Arg Phe Ser Glu Ser Asp Val Gln Trp Met
245 250 255
Asn Lys Leu Arg Glu Tyr Glu Ala Gln Gln Glu Lys Ser Leu Glu Glu
260 265 270
Asn Ala Phe Ala Pro Asn Glu Pro Tyr Ala Leu Thr Lys Lys Ala Leu
275 280 285
Arg Gly Trp Glu Arg Val Tyr His Ser Trp Met Arg Leu Asp Ser Ala
290 295 300
Ala Ser Glu Glu Ala Tyr Trp Gln Glu Val Ala Thr Cys Gln Thr Ala
305 310 315 320
Met Arg Gly Glu Phe Gly Asp Pro Ala Ile Tyr Gln Phe Leu Ala Gln
325 330 335
Lys Glu Asn His Asp Ile Trp Arg Gly Tyr Pro Glu Arg Val Ile Asp
340 345 350
Phe Ala Glu Leu Asn His Leu Gln Arg Glu Leu Arg Arg Ala Lys Glu
355 360 365
Asp Ala Thr Phe Thr Leu Pro Asp Ser Val Asp His Pro Leu Trp Val
370 375 380
Arg Tyr Glu Ala Pro Gly Gly Thr Asn Ile His Gly Tyr Asp Leu Val
385 390 395 400
Gln Asp Thr Lys Arg Asn Leu Thr Leu Ile Leu Asp Lys Phe Ile Leu
405 410 415
Pro Asp Glu Asn Gly Ser Trp His Glu Val Lys Lys Val Pro Phe Ser
420 425 430
Leu Ala Lys Ser Lys Gln Phe His Arg Gln Val Trp Leu Gln Glu Glu
435 440 445
Gln Lys Gln Lys Lys Arg Glu Val Val Phe Tyr Asp Tyr Ser Thr Asn
450 455 460
Leu Pro His Leu Gly Thr Leu Ala Gly Ala Lys Leu Gln Trp Asp Arg
465 470 475 480
Asn Phe Leu Asn Lys Arg Thr Gln Gln Gln Ile Glu Glu Thr Gly Glu
485 490 495
Ile Gly Lys Val Phe Phe Asn Ile Ser Val Asp Val Arg Pro Ala Val
500 505 510
Glu Val Lys Asn Gly Arg Leu Gln Asn Gly Leu Gly Lys Ala Leu Thr
515 520 525
Val Leu Thr His Pro Asp Gly Thr Lys Ile Val Thr Gly Trp Lys Ala
530 535 540
Glu Gln Leu Glu Lys Trp Val Gly Glu Ser Gly Arg Val Ser Ser Leu
545 550 555 560
Gly Leu Asp Ser Leu Ser Glu Gly Leu Arg Val Met Ser Ile Asp Leu
565 570 575
Gly Gln Arg Thr Ser Ala Thr Val Ser Val Phe Glu Ile Thr Lys Glu
580 585 590
Ala Pro Asp Asn Pro Tyr Lys Phe Phe Tyr Gln Leu Glu Gly Thr Glu
595 600 605
Leu Phe Ala Val His Gln Arg Ser Phe Leu Leu Ala Leu Pro Gly Glu
610 615 620
Asn Pro Pro Gln Lys Ile Lys Gln Met Arg Glu Ile Arg Trp Lys Glu
625 630 635 640
Arg Asn Arg Ile Lys Gln Gln Val Asp Gln Leu Ser Ala Ile Leu Arg
645 650 655
Leu His Lys Lys Val Asn Glu Asp Glu Arg Ile Gln Ala Ile Asp Lys
660 665 670
Leu Leu Gln Lys Val Ala Ser Trp Gln Leu Asn Glu Glu Ile Ala Thr
675 680 685
Ala Trp Asn Gln Ala Leu Ser Gln Leu Tyr Ser Lys Ala Lys Glu Asn
690 695 700
Asp Leu Gln Trp Asn Gln Ala Ile Lys Asn Ala His His Gln Leu Glu
705 710 715 720
Pro Val Val Gly Lys Gln Ile Ser Leu Trp Arg Lys Asp Leu Ser Thr
725 730 735
Gly Arg Gln Gly Ile Ala Gly Leu Ser Leu Trp Ser Ile Glu Glu Leu
740 745 750
Glu Ala Thr Lys Lys Leu Leu Thr Arg Trp Ser Lys Arg Ser Arg Glu
755 760 765
Pro Gly Val Val Lys Arg Ile Glu Arg Phe Glu Thr Phe Ala Lys Gln
770 775 780
Ile Gln His His Ile Asn Gln Val Lys Glu Asn Arg Leu Lys Gln Leu
785 790 795 800
Ala Asn Leu Ile Val Met Thr Ala Leu Gly Tyr Lys Tyr Asp Gln Glu
805 810 815
Gln Lys Lys Trp Ile Glu Val Tyr Pro Ala Cys Gln Val Val Leu Phe
820 825 830
Glu Asn Leu Arg Ser Tyr Arg Phe Ser Tyr Glu Arg Ser Arg Arg Glu
835 840 845
Asn Lys Lys Leu Met Glu Trp Ser His Arg Ser Ile Pro Lys Leu Val
850 855 860
Gln Met Gln Gly Glu Leu Phe Gly Leu Gln Val Ala Asp Val Tyr Ala
865 870 875 880
Ala Tyr Ser Ser Arg Tyr His Gly Arg Thr Gly Ala Pro Gly Ile Arg
885 890 895
Cys His Ala Leu Thr Glu Ala Asp Leu Arg Asn Glu Thr Asn Ile Ile
900 905 910
His Glu Leu Ile Glu Ala Gly Phe Ile Lys Glu Glu His Arg Pro Tyr
915 920 925
Leu Gln Gln Gly Asp Leu Val Pro Trp Ser Gly Gly Glu Leu Phe Ala
930 935 940
Thr Leu Gln Lys Pro Tyr Asp Asn Pro Arg Ile Leu Thr Leu His Ala
945 950 955 960
Asp Ile Asn Ala Ala Gln Asn Ile Gln Lys Arg Phe Trp His Pro Ser
965 970 975
Met Trp Phe Arg Val Asn Cys Glu Ser Val Met Glu Gly Glu Ile Val
980 985 990
Thr Tyr Val Pro Lys Asn Lys Thr Val His Lys Lys Gln Gly Lys Thr
995 1000 1005
Phe Arg Phe Val Lys Val Glu Gly Ser Asp Val Tyr Glu Trp Ala
1010 1015 1020
Lys Trp Ser Lys Asn Arg Asn Lys Asn Thr Phe Ser Ser Ile Thr
1025 1030 1035
Glu Arg Lys Pro Pro Ser Ser Met Ile Leu Phe Arg Asp Pro Ser
1040 1045 1050
Gly Thr Phe Phe Lys Glu Gln Glu Trp Val Glu Gln Lys Thr Phe
1055 1060 1065
Trp Gly Lys Val Gln Ser Met Ile Gln Ala Tyr Met Lys Lys Thr
1070 1075 1080
Ile Val Gln Arg Met Glu Glu
1085 1090
<210> 9
<211> 1119
<212> PRT
<213> Spirochaetes
<400> 9
Met Ser Phe Thr Ile Ser Tyr Pro Phe Lys Leu Ile Ile Lys Asn Lys
1 5 10 15
Asp Glu Ala Lys Ala Leu Leu Asp Thr His Gln Tyr Met Asn Glu Gly
20 25 30
Val Lys Tyr Tyr Leu Glu Lys Leu Leu Met Phe Arg Gln Glu Lys Ile
35 40 45
Phe Ile Gly Glu Asp Glu Thr Gly Lys Arg Ile Tyr Ile Glu Glu Thr
50 55 60
Glu Tyr Lys Lys Gln Ile Glu Glu Phe Tyr Leu Ile Lys Lys Thr Glu
65 70 75 80
Leu Gly Arg Asn Leu Thr Leu Thr Leu Asp Glu Phe Lys Thr Leu Met
85 90 95
Arg Glu Leu Tyr Ile Cys Leu Val Ser Ser Ser Met Glu Asn Lys Lys
100 105 110
Gly Phe Pro Asn Ala Gln Gln Ala Ser Leu Asn Ile Phe Ser Pro Leu
115 120 125
Phe Asp Ala Glu Ser Lys Gly Tyr Ile Leu Lys Glu Glu Asn Asn Asn
130 135 140
Ile Ser Leu Ile His Lys Asp Tyr Gly Lys Ile Leu Leu Lys Arg Leu
145 150 155 160
Arg Asp Asn Asn Leu Ile Pro Ile Phe Thr Lys Phe Thr Asp Ile Lys
165 170 175
Lys Ile Thr Ala Lys Leu Ser Pro Thr Ala Leu Asp Arg Met Ile Phe
180 185 190
Ala Gln Ala Ile Glu Lys Leu Leu Ser Tyr Glu Ser Trp Cys Lys Leu
195 200 205
Met Ile Lys Glu Arg Phe Asp Lys Glu Val Lys Ile Lys Glu Leu Glu
210 215 220
Asn Lys Cys Glu Asn Lys Gln Glu Arg Asp Lys Ile Phe Glu Ile Leu
225 230 235 240
Glu Lys Tyr Glu Glu Glu Arg Gln Lys Thr Phe Glu Gln Asp Ser Gly
245 250 255
Phe Ala Lys Lys Gly Lys Phe Tyr Ile Thr Gly Arg Met Leu Lys Gly
260 265 270
Phe Asp Glu Ile Lys Glu Lys Trp Leu Lys Glu Lys Asp Arg Ser Glu
275 280 285
Gln Asn Leu Ile Asn Ile Leu Asn Lys Tyr Gln Thr Asp Asn Ser Lys
290 295 300
Leu Val Gly Asp Arg Asn Leu Phe Glu Phe Ile Ile Lys Leu Glu Asn
305 310 315 320
Gln Cys Leu Trp Asn Gly Asp Ile Asp Tyr Leu Lys Ile Lys Arg Asp
325 330 335
Ile Asn Lys Asn Gln Ile Trp Leu Asp Arg Pro Glu Met Pro Arg Phe
340 345 350
Thr Met Pro Asp Phe Lys Lys His Pro Leu Trp Tyr Arg Tyr Glu Asp
355 360 365
Pro Ser Asn Ser Asn Phe Arg Asn Tyr Lys Ile Glu Val Val Lys Asp
370 375 380
Glu Asn Tyr Ile Thr Ile Pro Leu Ile Thr Glu Arg Asn Asn Glu Tyr
385 390 395 400
Phe Glu Glu Asn Tyr Thr Phe Asn Leu Ala Lys Leu Lys Lys Leu Ser
405 410 415
Glu Asn Ile Thr Phe Ile Pro Lys Ser Lys Asn Lys Glu Phe Glu Phe
420 425 430
Ile Asp Ser Asn Asp Glu Glu Glu Asp Lys Lys Asp Gln Lys Lys Ser
435 440 445
Lys Gln Tyr Ile Lys Tyr Cys Asp Thr Ala Lys Asn Thr Ser Tyr Gly
450 455 460
Lys Ser Gly Gly Ile Arg Leu Tyr Phe Asn Arg Asn Glu Leu Glu Asn
465 470 475 480
Tyr Lys Asp Gly Lys Lys Met Asp Ser Tyr Thr Val Phe Thr Leu Ser
485 490 495
Ile Arg Asp Tyr Lys Ser Leu Phe Ala Lys Glu Lys Leu Gln Pro Gln
500 505 510
Ile Phe Asn Thr Val Asp Asn Lys Ile Thr Ser Leu Lys Ile Gln Lys
515 520 525
Lys Phe Gly Asn Glu Glu Gln Thr Asn Phe Leu Ser Tyr Phe Thr Gln
530 535 540
Asn Gln Ile Thr Lys Lys Asp Trp Met Asp Glu Lys Thr Phe Gln Asn
545 550 555 560
Val Lys Glu Leu Asn Glu Gly Ile Arg Val Leu Ser Val Asp Leu Gly
565 570 575
Gln Arg Phe Phe Ala Ala Val Ser Cys Phe Glu Ile Met Ser Glu Ile
580 585 590
Asp Asn Asn Lys Leu Phe Phe Asn Leu Asn Asp Gln Asn His Lys Ile
595 600 605
Ile Arg Ile Asn Asp Lys Asn Tyr Tyr Ala Lys His Ile Tyr Ser Lys
610 615 620
Thr Ile Lys Leu Ser Gly Glu Asp Asp Asp Leu Tyr Lys Glu Arg Lys
625 630 635 640
Ile Asn Lys Asn Tyr Lys Leu Ser Tyr Gln Glu Arg Lys Asn Lys Ile
645 650 655
Gly Ile Phe Thr Arg Gln Ile Asn Lys Leu Asn Gln Leu Leu Lys Ile
660 665 670
Ile Arg Asn Asp Glu Ile Asp Lys Glu Lys Phe Lys Glu Leu Ile Glu
675 680 685
Thr Thr Lys Arg Tyr Val Lys Asn Thr Tyr Asn Asp Gly Ile Ile Asp
690 695 700
Trp Asn Asn Val Asp Asn Lys Ile Leu Ser Tyr Glu Asn Lys Glu Asp
705 710 715 720
Val Ile Asn Leu His Lys Glu Leu Asp Lys Lys Leu Glu Ile Asp Phe
725 730 735
Lys Glu Phe Ile Arg Glu Cys Arg Lys Pro Ile Phe Arg Ser Gly Gly
740 745 750
Leu Ser Met Gln Arg Ile Asp Phe Leu Glu Lys Leu Asn Lys Leu Lys
755 760 765
Arg Lys Trp Val Ala Arg Thr Gln Lys Ser Ala Glu Ser Ile Val Leu
770 775 780
Thr Pro Lys Phe Gly Tyr Lys Leu Lys Glu His Ile Asn Glu Leu Lys
785 790 795 800
Asp Asn Arg Val Lys Gln Gly Val Asn Tyr Ile Leu Met Thr Ala Leu
805 810 815
Gly Tyr Ile Lys Asp Asn Glu Ile Lys Asn Asp Ser Lys Lys Lys Gln
820 825 830
Lys Glu Asp Trp Val Lys Lys Asn Arg Ala Cys Gln Ile Ile Leu Met
835 840 845
Glu Lys Leu Thr Glu Tyr Thr Phe Ala Glu Asp Arg Pro Arg Glu Glu
850 855 860
Asn Ser Lys Leu Arg Met Trp Ser His Arg Gln Ile Phe Asn Phe Leu
865 870 875 880
Gln Gln Lys Ala Ser Leu Trp Gly Ile Leu Val Gly Asp Val Phe Ala
885 890 895
Pro Tyr Thr Ser Lys Cys Leu Ser Asp Asn Asn Ala Pro Gly Ile Arg
900 905 910
Cys His Gln Val Thr Lys Lys Asp Leu Ile Asp Asn Ser Trp Phe Leu
915 920 925
Lys Ile Val Val Lys Asp Asp Ala Phe Cys Asp Leu Ile Glu Ile Asn
930 935 940
Lys Glu Asn Val Lys Asn Lys Ser Ile Lys Ile Asn Asp Ile Leu Pro
945 950 955 960
Leu Arg Gly Gly Glu Leu Phe Ala Ser Ile Lys Asp Gly Lys Leu His
965 970 975
Ile Val Gln Ala Asp Ile Asn Ala Ser Arg Asn Ile Ala Lys Arg Phe
980 985 990
Leu Ser Gln Ile Asn Pro Phe Arg Val Val Leu Lys Lys Asp Lys Asp
995 1000 1005
Glu Thr Phe His Leu Lys Asn Glu Pro Asn Tyr Leu Lys Asn Tyr
1010 1015 1020
Tyr Ser Ile Leu Asn Phe Val Pro Thr Asn Glu Glu Leu Thr Phe
1025 1030 1035
Phe Lys Val Glu Glu Asn Lys Asp Ile Lys Pro Thr Lys Arg Ile
1040 1045 1050
Lys Met Asp Lys His Glu Lys Glu Ser Thr Asp Glu Gly Asp Asp
1055 1060 1065
Tyr Ser Lys Asn Gln Ile Ala Leu Phe Arg Asp Asp Ser Gly Ile
1070 1075 1080
Phe Phe Asp Lys Ser Leu Trp Val Asp Gly Lys Ile Phe Trp Ser
1085 1090 1095
Val Val Lys Asn Lys Met Thr Lys Leu Leu Arg Glu Arg Asn Asn
1100 1105 1110
Lys Lys Asn Gly Ser Lys
1115
<210> 10
<211> 1142
<212> PRT
<213> Tuberibacillus calidus
<400> 10
Met Asn Ile His Leu Lys Glu Leu Ile Arg Met Ala Thr Lys Ser Phe
1 5 10 15
Ile Leu Lys Met Lys Thr Lys Asn Asn Pro Gln Leu Arg Leu Ser Leu
20 25 30
Trp Lys Thr His Glu Leu Phe Asn Phe Gly Val Ala Tyr Tyr Met Asp
35 40 45
Leu Leu Ser Leu Phe Arg Gln Lys Asp Leu Tyr Met His Asn Asp Glu
50 55 60
Asp Pro Asp His Pro Val Val Leu Lys Lys Glu Glu Ile Gln Glu Arg
65 70 75 80
Leu Trp Met Lys Val Arg Glu Thr Gln Gln Lys Asn Gly Phe His Gly
85 90 95
Glu Val Ser Lys Asp Glu Val Leu Glu Thr Leu Arg Ala Leu Tyr Glu
100 105 110
Glu Leu Val Pro Ser Ala Val Gly Lys Ser Gly Glu Ala Asn Gln Ile
115 120 125
Ser Asn Lys Tyr Leu Tyr Pro Leu Thr Asp Pro Ala Ser Gln Ser Gly
130 135 140
Lys Gly Thr Ala Asn Ser Gly Arg Lys Pro Arg Trp Lys Lys Leu Lys
145 150 155 160
Glu Ala Gly Asp Pro Ser Trp Lys Asp Ala Tyr Glu Lys Trp Glu Lys
165 170 175
Glu Arg Gln Glu Asp Pro Lys Leu Lys Ile Leu Ala Ala Leu Gln Ser
180 185 190
Phe Gly Leu Ile Pro Leu Phe Arg Pro Phe Thr Glu Asn Asp His Lys
195 200 205
Ala Val Ile Ser Val Lys Trp Met Pro Lys Ser Lys Asn Gln Ser Val
210 215 220
Arg Lys Phe Asp Lys Asp Met Phe Asn Gln Ala Ile Glu Arg Phe Leu
225 230 235 240
Ser Trp Glu Ser Trp Asn Glu Lys Val Ala Glu Asp Tyr Glu Lys Thr
245 250 255
Val Ser Ile Tyr Glu Ser Leu Gln Lys Glu Leu Lys Gly Ile Ser Thr
260 265 270
Lys Ala Phe Glu Ile Met Glu Arg Val Glu Lys Ala Tyr Glu Ala His
275 280 285
Leu Arg Glu Ile Thr Phe Ser Asn Ser Thr Tyr Arg Ile Gly Asn Arg
290 295 300
Ala Ile Arg Gly Trp Thr Glu Ile Val Lys Lys Trp Met Lys Leu Asp
305 310 315 320
Pro Ser Ala Pro Gln Gly Asn Tyr Leu Asp Val Val Lys Asp Tyr Gln
325 330 335
Arg Arg His Pro Arg Glu Ser Gly Asp Phe Lys Leu Phe Glu Leu Leu
340 345 350
Ser Arg Pro Glu Asn Gln Ala Ala Trp Arg Glu Tyr Pro Glu Phe Leu
355 360 365
Pro Leu Tyr Val Lys Tyr Arg His Ala Glu Gln Arg Met Lys Thr Ala
370 375 380
Lys Lys Gln Ala Thr Phe Thr Leu Cys Asp Pro Ile Arg His Pro Leu
385 390 395 400
Trp Val Arg Tyr Glu Glu Arg Ser Gly Thr Asn Leu Asn Lys Tyr Arg
405 410 415
Leu Ile Met Asn Glu Lys Glu Lys Val Val Gln Phe Asp Arg Leu Ile
420 425 430
Cys Leu Asn Ala Asp Gly His Tyr Glu Glu Gln Glu Asp Val Thr Val
435 440 445
Pro Leu Ala Pro Ser Gln Gln Phe Asp Asp Gln Ile Lys Phe Ser Ser
450 455 460
Glu Asp Thr Gly Lys Gly Lys His Asn Phe Ser Tyr Tyr His Lys Gly
465 470 475 480
Ile Asn Tyr Glu Leu Lys Gly Thr Leu Gly Gly Ala Arg Ile Gln Phe
485 490 495
Asp Arg Glu His Leu Leu Arg Arg Gln Gly Val Lys Ala Gly Asn Val
500 505 510
Gly Arg Ile Phe Leu Asn Val Thr Leu Asn Ile Glu Pro Met Gln Pro
515 520 525
Phe Ser Arg Ser Gly Asn Leu Gln Thr Ser Val Gly Lys Ala Leu Lys
530 535 540
Val Tyr Val Asp Gly Tyr Pro Lys Val Val Asn Phe Lys Pro Lys Glu
545 550 555 560
Leu Thr Glu His Ile Lys Glu Ser Glu Lys Asn Thr Leu Thr Leu Gly
565 570 575
Val Glu Ser Leu Pro Thr Gly Leu Arg Val Met Ser Val Asp Leu Gly
580 585 590
Gln Arg Gln Ala Ala Ala Ile Ser Ile Phe Glu Val Val Ser Glu Lys
595 600 605
Pro Asp Asp Asn Lys Leu Phe Tyr Pro Val Lys Asp Thr Asp Leu Phe
610 615 620
Ala Val His Arg Thr Ser Phe Asn Ile Lys Leu Pro Gly Glu Lys Arg
625 630 635 640
Thr Glu Arg Arg Met Leu Glu Gln Gln Lys Arg Asp Gln Ala Ile Arg
645 650 655
Asp Leu Ser Arg Lys Leu Lys Phe Leu Lys Asn Val Leu Asn Met Gln
660 665 670
Lys Leu Glu Lys Thr Asp Glu Arg Glu Lys Arg Val Asn Arg Trp Ile
675 680 685
Lys Asp Arg Glu Arg Glu Glu Glu Asn Pro Val Tyr Val Gln Glu Phe
690 695 700
Glu Met Ile Ser Lys Val Leu Tyr Ser Pro His Ser Val Trp Val Asp
705 710 715 720
Gln Leu Lys Ser Ile His Arg Lys Leu Glu Glu Gln Leu Gly Lys Glu
725 730 735
Ile Ser Lys Trp Arg Gln Ser Ile Ser Gln Gly Arg Gln Gly Val Tyr
740 745 750
Gly Ile Ser Leu Lys Asn Ile Glu Asp Ile Glu Lys Thr Arg Arg Leu
755 760 765
Leu Phe Arg Trp Ser Met Arg Pro Glu Asn Pro Gly Glu Val Lys Gln
770 775 780
Leu Gln Pro Gly Glu Arg Phe Ala Ile Asp Gln Gln Asn His Leu Asn
785 790 795 800
His Leu Lys Asp Asp Arg Ile Lys Lys Leu Ala Asn Gln Ile Val Met
805 810 815
Thr Ala Leu Gly Tyr Arg Tyr Asp Gly Lys Arg Lys Lys Trp Ile Ala
820 825 830
Lys His Pro Ala Cys Gln Leu Val Leu Phe Glu Asp Leu Ser Arg Tyr
835 840 845
Ala Phe Tyr Asp Glu Arg Ser Arg Leu Glu Asn Arg Asn Leu Met Arg
850 855 860
Trp Ser Arg Arg Glu Ile Pro Lys Gln Val Ala Gln Ile Gly Gly Leu
865 870 875 880
Tyr Gly Leu Leu Val Gly Glu Val Gly Ala Gln Tyr Ser Ser Arg Phe
885 890 895
His Ala Lys Ser Gly Ala Pro Gly Ile Arg Cys Arg Val Val Lys Glu
900 905 910
His Glu Leu Tyr Ile Thr Glu Gly Gly Gln Lys Val Arg Asn Gln Lys
915 920 925
Phe Leu Asp Ser Leu Val Glu Asn Asn Ile Ile Glu Pro Asp Asp Ala
930 935 940
Arg Arg Leu Glu Pro Gly Asp Leu Ile Arg Asp Gln Gly Gly Asp Lys
945 950 955 960
Phe Ala Thr Leu Asp Glu Arg Gly Glu Leu Val Ile Thr His Ala Asp
965 970 975
Ile Asn Ala Ala Gln Asn Leu Gln Lys Arg Phe Trp Thr Arg Thr His
980 985 990
Gly Leu Tyr Arg Ile Arg Cys Glu Ser Arg Glu Ile Lys Asp Ala Val
995 1000 1005
Val Leu Val Pro Ser Asp Lys Asp Gln Lys Glu Lys Met Glu Asn
1010 1015 1020
Leu Phe Gly Ile Gly Tyr Leu Gln Pro Phe Lys Gln Glu Asn Asp
1025 1030 1035
Val Tyr Lys Trp Val Lys Gly Glu Lys Ile Lys Gly Lys Lys Thr
1040 1045 1050
Ser Ser Gln Ser Asp Asp Lys Glu Leu Val Ser Glu Ile Leu Gln
1055 1060 1065
Glu Ala Ser Val Met Ala Asp Glu Leu Lys Gly Asn Arg Lys Thr
1070 1075 1080
Leu Phe Arg Asp Pro Ser Gly Tyr Val Phe Pro Lys Asp Arg Trp
1085 1090 1095
Tyr Thr Gly Gly Arg Tyr Phe Gly Thr Leu Glu His Leu Leu Lys
1100 1105 1110
Arg Lys Leu Ala Glu Arg Arg Leu Phe Asp Gly Gly Ser Ser Arg
1115 1120 1125
Arg Gly Leu Phe Asn Gly Thr Asp Ser Asn Thr Asn Val Glu
1130 1135 1140
<210> 11
<211> 3387
<212> DNA
<213> Alicyclobacillus acidiphilus
<400> 11
atggccgtga agagcatgaa ggtgaagctg cgcctggaca acatgcccga gatccgcgcc 60
ggcctgtgga agctgcacac cgaggtgaac gccggcgtgc gctactacac cgagtggctg 120
agcctgctgc gccaggagaa cctgtaccgc cgcagcccca acggcgacgg cgagcaggag 180
tgctacaaga ccgccgagga gtgcaaggcc gagctgctgg agcgcctgcg cgcccgccag 240
gtggagaacg gccactgcgg ccccgccggc agcgacgacg agctgctgca gctggcccgc 300
cagctgtacg agctgctggt gccccaggcc atcggcgcca agggcgacgc ccagcagatc 360
gcccgcaagt tcctgagccc cctggccgac aaggacgccg tgggcggcct gggcatcgcc 420
aaggccggca acaagccccg ctgggtgcgc atgcgcgagg ccggcgagcc cggctgggag 480
gaggagaagg ccaaggccga ggcccgcaag agcaccgacc gcaccgccga cgtgctgcgc 540
gccctggccg acttcggcct gaagcccctg atgcgcgtgt acaccgacag cgacatgagc 600
agcgtgcagt ggaagcccct gcgcaagggc caggccgtgc gcacctggga ccgcgacatg 660
ttccagcagg ccatcgagcg catgatgagc tgggagagct ggaaccagcg cgtgggcgag 720
gcctacgcca agctggtgga gcagaagagc cgcttcgagc agaagaactt cgtgggccag 780
gagcacctgg tgcagctggt gaaccagctg cagcaggaca tgaaggaggc cagccacggc 840
ctggagagca aggagcagac cgcccactac ctgaccggcc gcgccctgcg cggcagcgac 900
aaggtgttcg agaagtggga gaagctggac cccgacgccc ccttcgacct gtacgacacc 960
gagatcaaga acgtgcagcg ccgcaacacc cgccgcttcg gcagccacga cctgttcgcc 1020
aagctggccg agcccaagta ccaggccctg tggcgcgagg acgccagctt cctgacccgc 1080
tacgccgtgt acaacagcat cgtgcgcaag ctgaaccacg ccaagatgtt cgccaccttc 1140
accctgcccg acgccaccgc ccaccccatc tggacccgct tcgacaagct gggcggcaac 1200
ctgcaccagt acaccttcct gttcaacgag ttcggcgagg gccgccacgc catccgcttc 1260
cagaagctgc tgaccgtgga ggacggcgtg gccaaggagg tggacgacgt gaccgtgccc 1320
atcagcatga gcgcccagct ggacgacctg ctgccccgcg acccccacga gctggtggcc 1380
ctgtacttcc aggactacgg cgccgagcag cacctggccg gcgagttcgg cggcgccaag 1440
atccagtacc gccgcgacca gctgaaccac ctgcacgccc gccgcggcgc ccgcgacgtg 1500
tacctgaacc tgagcgtgcg cgtgcagagc cagagcgagg cccgcggcga gcgccgcccc 1560
ccctacgccg ccgtgttccg cctggtgggc gacaaccacc gcgccttcgt gcacttcgac 1620
aagctgagcg actacctggc cgagcacccc gacgacggca agctgggcag cgagggcctg 1680
ctgagcggcc tgcgcgtgat gagcgtggac ctgggcctgc gcaccagcgc cagcatcagc 1740
gtgttccgcg tggcccgcaa ggacgagctg aagcccaaca gcgagggccg cgtgcccttc 1800
tgcttcccca tcgagggcaa cgagaacctg gtggccgtgc acgagcgcag ccagctgctg 1860
aagctgcccg gcgagaccga gagcaaggac ctgcgcgcca tccgcgagga gcgccagcgc 1920
accctgcgcc agctgcgcac ccagctggcc tacctgcgcc tgctggtgcg ctgcggcagc 1980
gaggacgtgg gccgccgcga gcgcagctgg gccaagctga tcgagcagcc catggacgcc 2040
aaccagatga cccccgactg gcgcgaggcc ttcgaggacg agctgcagaa gctgaagagc 2100
ctgtacggca tctgcggcga ccgcgagtgg accgaggccg tgtacgagag cgtgcgccgc 2160
gtgtggcgcc acatgggcaa gcaggtgcgc gactggcgca aggacgtgcg cagcggcgag 2220
cgccccaaga tccgcggcta ccagaaggac gtggtgggcg gcaacagcat cgagcagatc 2280
gagtacctgg agcgccagta caagttcctg aagagctgga gcttcttcgg caaggtgagc 2340
ggccaggtga tccgcgccga gaagggcagc cgcttcgcca tcaccctgcg cgagcacatc 2400
gaccacgcca aggaggaccg cctgaagaag ctggccgacc gcatcatcat ggaggccctg 2460
ggctacgtgt acgccctgga cgacgagcgc ggcaagggca agtgggtggc caagtacccc 2520
ccctgccagc tgatcctgct ggaggagctg agcgagtacc agttcaacaa cgaccgcccc 2580
cccagcgaga acaaccagct gatgcagtgg agccaccgcg gcgtgttcca ggagctgctg 2640
aaccaggccc aggtgcacga cctgctggtg ggcaccatgt acgccgcctt cagcagccgc 2700
ttcgacgccc gcaccggcgc ccccggcatc cgctgccgcc gcgtgcccgc ccgctgcgcc 2760
cgcgagcaga accccgagcc cttcccctgg tggctgaaca agttcgtggc cgagcacaag 2820
ctggacggct gccccctgcg cgccgacgac ctgatcccca ccggcgaggg cgagttcttc 2880
gtgagcccct tcagcgccga ggagggcgac ttccaccaga tccacgccga cctgaacgcc 2940
gcccagaacc tgcagcgccg cctgtggagc gacttcgaca tcagccagat ccgcctgcgc 3000
tgcgactggg gcgaggtgga cggcgagccc gtgctgatcc cccgcaccac cggcaagcgc 3060
accgccgaca gctacggcaa caaggtgttc tacaccaaga ccggcgtgac ctactacgag 3120
cgcgagcgcg gcaagaagcg ccgcaaggtg ttcgcccagg aggagctgag cgaggaggag 3180
gccgagctgc tggtggaggc cgacgaggcc cgcgagaaga gcgtggtgct gatgcgcgac 3240
cccagcggca tcatcaaccg cggcgactgg acccgccaga aggagttctg gagcatggtg 3300
aaccagcgca tcgagggcta cctggtgaag cagatccgca gccgcgtgcg cctgcaggag 3360
agcgcctgcg agaacaccgg cgacatc 3387
<210> 12
<211> 3441
<212> DNA
<213> Alicyclobacillus kakegawensis
<400> 12
atggccgtga agagcatcaa ggtgaagctg cgcctgagcg agtgccccga catcctggcc 60
ggcatgtggc agctgcaccg cgccaccaac gccggcgtgc gctactacac cgagtgggtg 120
agcctgatgc gccaggagat cctgtacagc cgcggccccg acggcggcca gcagtgctac 180
atgaccgccg aggactgcca gcgcgagctg ctgcgccgcc tgcgcaaccg ccagctgcac 240
aacggccgcc aggaccagcc cggcaccgac gccgacctgc tggccatcag ccgccgcctg 300
tacgagatcc tggtgctgca gagcatcggc aagcgcggcg acgcccagca gatcgccagc 360
agcttcctga gccccctggt ggaccccaac agcaagggcg gccgcggcga ggccaagagc 420
ggccgcaagc ccgcctggca gaagatgcgc gaccagggcg acccccgctg ggtggccgcc 480
cgcgagaagt acgagcagcg caaggccgtg gaccccagca aggagatcct gaacagcctg 540
gacgccctgg gcctgcgccc cctgttcgcc gtgttcaccg agacctaccg cagcggcgtg 600
gactggaagc ccctgggcaa gagccagggc gtgcgcacct gggaccgcga catgttccag 660
caggccctgg agcgcctgat gagctgggag agctggaacc gccgcgtggg cgaggagtac 720
gcccgcctgt tccagcagaa gatgaagttc gagcaggagc acttcgccga gcagagccac 780
ctggtgaagc tggcccgcgc cctggaggcc gacatgcgcg ccgccagcca gggcttcgag 840
gccaagcgcg gcaccgccca ccagatcacc cgccgcgccc tgcgcggcgc cgaccgcgtg 900
ttcgagatct ggaagagcat ccccgaggag gccctgttca gccagtacga cgaggtgatc 960
cgccaggtgc aggccgagaa gcgccgcgac ttcggcagcc acgacctgtt cgccaagctg 1020
gccgagccca agtaccagcc cctgtggcgc gccgacgaga ccttcctgac ccgctacgcc 1080
ctgtacaacg gcgtgctgcg cgacctggag aaggcccgcc agttcgccac cttcaccctg 1140
cccgacgcct gcgtgaaccc catctggacc cgcttcgaga gcagccaggg cagcaacctg 1200
cacaagtacg agttcctgtt cgaccacctg ggccccggcc gccacgccgt gcgcttccag 1260
cgcctgctgg tggtggagag cgagggcgcc aaggagcgcg acagcgtggt ggtgcccgtg 1320
gcccccagcg gccagctgga caagctggtg ctgcgcgagg aggagaagag cagcgtggcc 1380
ctgcacctgc acgacaccgc ccgccccgac ggcttcatgg ccgagtgggc cggcgccaag 1440
ctgcagtacg agcgcagcac cctggcccgc aaggcccgcc gcgacaagca gggcatgcgc 1500
agctggcgcc gccagcccag catgctgatg agcgccgccc agatgctgga ggacgccaag 1560
caggccggcg acgtgtacct gaacatcagc gtgcgcgtga agagccccag cgaggtgcgc 1620
ggccagcgcc gcccccccta cgccgccctg ttccgcatcg acgacaagca gcgccgcgtg 1680
accgtgaact acaacaagct gagcgcctac ctggaggagc accccgacaa gcagatcccc 1740
ggcgcccccg gcctgctgag cggcctgcgc gtgatgagcg tggacctggg cctgcgcacc 1800
agcgccagca tcagcgtgtt ccgcgtggcc aagaaggagg aggtggaggc cctgggcgac 1860
ggccgccccc cccactacta ccccatccac ggcaccgacg acctggtggc cgtgcacgag 1920
cgcagccacc tgatccagat gcccggcgag accgagacca agcagctgcg caagctgcgc 1980
gaggagcgcc aggccgtgct gcgccccctg ttcgcccagc tggccctgct gcgcctgctg 2040
gtgcgctgcg gcgccgccga cgagcgcatc cgcacccgca gctggcagcg cctgaccaag 2100
cagggccgcg agttcaccaa gcgcctgacc cccagctggc gcgaggccct ggagctggag 2160
ctgacccgcc tggaggccta ctgcggccgc gtgcccgacg acgagtggag ccgcatcgtg 2220
gaccgcaccg tgatcgccct gtggcgccgc atgggcaagc aggtgcgcga ctggcgcaag 2280
caggtgaaga gcggcgccaa ggtgaaggtg aagggctacc agctggacgt ggtgggcggc 2340
aacagcctgg cccagatcga ctacctggag cagcagtaca agttcctgcg ccgctggagc 2400
ttcttcgccc gcgccagcgg cctggtggtg cgcgccgacc gcgagagcca cttcgccgtg 2460
gccctgcgcc agcacatcga gaacgccaag cgcgaccgcc tgaagaagct ggccgaccgc 2520
atcctgatgg aggccctggg ctacgtgtac gaggccagcg gcccccgcga gggccagtgg 2580
accgcccagc accccccctg ccagctgatc atcctggagg agctgagcgc ctaccgcttc 2640
agcgacgacc gcccccccag cgagaacagc aagctgatgg cctggggcca ccgcggcatc 2700
ctggaggagc tggtgaacca ggcccaggtg cacgacgtgc tggtgggcac cgtgtacgcc 2760
gccttcagca gccgcttcga cgcccgcacc ggcgcccccg gcgtgcgctg ccgccgcgtg 2820
cccgcccgct tcgtgggcgc caccgtggac gacagcctgc ccctgtggct gaccgagttc 2880
ctggacaagc accgcctgga caagaacctg ctgcgccccg acgacgtgat ccccaccggc 2940
gagggcgagt tcctggtgag cccctgcggc gaggaggccg cccgcgtgcg ccaggtgcac 3000
gccgacatca acgccgccca gaacctgcag cgccgcctgt ggcagaactt cgacatcacc 3060
gagctgcgcc tgcgctgcga cgtgaagatg ggcggcgagg gcaccgtgct ggtgccccgc 3120
gtgaacaacg cccgcgccaa gcagctgttc ggcaagaagg tgctggtgag ccaggacggc 3180
gtgaccttct tcgagcgcag ccagaccggc ggcaagcccc acagcgagaa gcagaccgac 3240
ctgaccgaca aggagctgga gctgatcgcc gaggccgacg aggcccgcgc caagagcgtg 3300
gtgctgttcc gcgaccccag cggccacatc ggcaagggcc actggatccg ccagcgcgag 3360
ttctggagcc tggtgaagca gcgcatcgag agccacaccg ccgagcgcat ccgcgtgcgc 3420
ggcgtgggca gcagcctgga c 3441
<210> 13
<211> 3438
<212> DNA
<213> Alicyclobacillus macrosporangiidus
<400> 13
atgaacgtgg ccgtgaagag catcaaggtg aagctgatgc tgggccacct gcccgagatc 60
cgcgagggcc tgtggcacct gcacgaggcc gtgaacctgg gcgtgcgcta ctacaccgag 120
tggctggccc tgctgcgcca gggcaacctg taccgccgcg gcaaggacgg cgcccaggag 180
tgctacatga ccgccgagca gtgccgccag gagctgctgg tgcgcctgcg cgaccgccag 240
aagcgcaacg gccacaccgg cgaccccggc accgacgagg agctgctggg cgtggcccgc 300
cgcctgtacg agctgctggt gccccagagc gtgggcaaga agggccaggc ccagatgctg 360
gccagcggct tcctgagccc cctggccgac cccaagagcg agggcggcaa gggcaccagc 420
aagagcggcc gcaagcccgc ctggatgggc atgaaggagg ccggcgacag ccgctgggtg 480
gaggccaagg cccgctacga ggccaacaag gccaaggacc ccaccaagca ggtgatcgcc 540
agcctggaga tgtacggcct gcgccccctg ttcgacgtgt tcaccgagac ctacaagacc 600
atccgctgga tgcccctggg caagcaccag ggcgtgcgcg cctgggaccg cgacatgttc 660
cagcagagcc tggagcgcct gatgagctgg gagagctgga acgagcgcgt gggcgccgag 720
ttcgcccgcc tggtggaccg ccgcgaccgc ttccgcgaga agcacttcac cggccaggag 780
cacctggtgg ccctggccca gcgcctggag caggagatga aggaggccag ccccggcttc 840
gagagcaaga gcagccaggc ccaccgcatc accaagcgcg ccctgcgcgg cgccgacggc 900
atcatcgacg actggctgaa gctgagcgag ggcgagcccg tggaccgctt cgacgagatc 960
ctgcgcaagc gccaggccca gaacccccgc cgcttcggca gccacgacct gttcctgaag 1020
ctggccgagc ccgtgttcca gcccctgtgg cgcgaggacc ccagcttcct gagccgctgg 1080
gccagctaca acgaggtgct gaacaagctg gaggacgcca agcagttcgc caccttcacc 1140
ctgcccagcc cctgcagcaa ccccgtgtgg gcccgcttcg agaacgccga gggcaccaac 1200
atcttcaagt acgacttcct gttcgaccac ttcggcaagg gccgccacgg cgtgcgcttc 1260
cagcgcatga tcgtgatgcg cgacggcgtg cccaccgagg tggagggcat cgtggtgccc 1320
atcgccccca gccgccagct ggacgccctg gcccccaacg acgccgccag ccccatcgac 1380
gtgttcgtgg gcgaccccgc cgcccccggc gccttccgcg gccagttcgg cggcgccaag 1440
atccagtacc gccgcagcgc cctggtgcgc aagggccgcc gcgaggagaa ggcctacctg 1500
tgcggcttcc gcctgcccag ccagcgccgc accggcaccc ccgccgacga cgccggcgag 1560
gtgttcctga acctgagcct gcgcgtggag agccagagcg agcaggccgg ccgccgcaac 1620
cccccctacg ccgccgtgtt ccacatcagc gaccagaccc gccgcgtgat cgtgcgctac 1680
ggcgagatcg agcgctacct ggccgagcac cccgacaccg gcatccccgg cagccgcggc 1740
ctgaccagcg gcctgcgcgt gatgagcgtg gacctgggcc tgcgcaccag cgccgccatc 1800
agcgtgttcc gcgtggccca ccgcgacgag ctgacccccg acgcccacgg ccgccagccc 1860
ttcttcttcc ccatccacgg catggaccac ctggtggccc tgcacgagcg cagccacctg 1920
atccgcctgc ccggcgagac cgagagcaag aaggtgcgca gcatccgcga gcagcgcctg 1980
gaccgcctga accgcctgcg cagccagatg gccagcctgc gcctgctggt gcgcaccggc 2040
gtgctggacg agcagaagcg cgaccgcaac tgggagcgcc tgcagagcag catggagcgc 2100
ggcggcgagc gcatgcccag cgactggtgg gacctgttcc aggcccaggt gcgctacctg 2160
gcccagcacc gcgacgccag cggcgaggcc tggggccgca tggtgcaggc cgccgtgcgc 2220
accctgtggc gccagctggc caagcaggtg cgcgactggc gcaaggaggt gcgccgcaac 2280
gccgacaagg tgaagatccg cggcatcgcc cgcgacgtgc ccggcggcca cagcctggcc 2340
cagctggact acctggagcg ccagtaccgc ttcctgcgca gctggagcgc cttcagcgtg 2400
caggccggcc aggtggtgcg cgccgagcgc gacagccgct tcgccgtggc cctgcgcgag 2460
cacatcgaca acggcaagaa ggaccgcctg aagaagctgg ccgaccgcat cctgatggag 2520
gccctgggct acgtgtacgt gaccgacggc cgccgcgccg gccagtggca ggccgtgtac 2580
cccccctgcc agctggtgct gctggaggag ctgagcgagt accgcttcag caacgaccgc 2640
ccccccagcg agaacagcca gctgatggtg tggagccacc gcggcgtgct ggaggagctg 2700
atccaccagg cccaggtgca cgacgtgctg gtgggcacca tccccgccgc cttcagcagc 2760
cgcttcgacg cccgcaccgg cgcccccggc atccgctgcc gccgcgtgcc cagcatcccc 2820
ctgaaggacg cccccagcat ccccatctgg ctgagccact acctgaagca gaccgagcgc 2880
gacgccgccg ccctgcgccc cggcgagctg atccccaccg gcgacggcga gttcctggtg 2940
acccccgccg gccgcggcgc cagcggcgtg cgcgtggtgc acgccgacat caacgccgcc 3000
cacaacctgc agcgccgcct gtgggagaac ttcgacctga gcgacatccg cgtgcgctgc 3060
gaccgccgcg agggcaagga cggcaccgtg gtgctgatcc cccgcctgac caaccagcgc 3120
gtgaaggagc gctacagcgg cgtgatcttc accagcgagg acggcgtgag cttcaccgtg 3180
ggcgacgcca agacccgccg ccgcagcagc gccagccagg gcgagggcga cgacctgagc 3240
gacgaggagc aggagctgct ggccgaggcc gacgacgccc gcgagcgcag cgtggtgctg 3300
ttccgcgacc ccagcggctt cgtgaacggc ggccgctgga ccgcccagcg cgccttctgg 3360
ggcatggtgc acaaccgcat cgagaccctg ctggccgagc gcttcagcgt gagcggcgcc 3420
gccgagaagg tgcgcggc 3438
<210> 14
<211> 3324
<212> DNA
<213> Bacillus hisashii
<400> 14
atggccaccc gcagcttcat cctgaagatc gagcccaacg aggaggtgaa gaagggcctg 60
tggaagaccc acgaggtgct gaaccacggc atcgcctact acatgaacat cctgaagctg 120
atccgccagg aggccatcta cgagcaccac gagcaggacc ccaagaaccc caagaaggtg 180
agcaaggccg agatccaggc cgagctgtgg gacttcgtgc tgaagatgca gaagtgcaac 240
agcttcaccc acgaggtgga caaggacgag gtgttcaaca tcctgcgcga gctgtacgag 300
gagctggtgc ccagcagcgt ggagaagaag ggcgaggcca accagctgag caacaagttc 360
ctgtaccccc tggtggaccc caacagccag agcggcaagg gcaccgccag cagcggccgc 420
aagccccgct ggtacaacct gaagatcgcc ggcgacccca gctgggagga ggagaagaag 480
aagtgggagg aggacaagaa gaaggacccc ctggccaaga tcctgggcaa gctggccgag 540
tacggcctga tccccctgtt catcccctac accgacagca acgagcccat cgtgaaggag 600
atcaagtgga tggagaagag ccgcaaccag agcgtgcgcc gcctggacaa ggacatgttc 660
atccaggccc tggagcgctt cctgagctgg gagagctgga acctgaaggt gaaggaggag 720
tacgagaagg tggagaagga gtacaagacc ctggaggagc gcatcaagga ggacatccag 780
gccctgaagg ccctggagca gtacgagaag gagcgccagg agcagctgct gcgcgacacc 840
ctgaacacca acgagtaccg cctgagcaag cgcggcctgc gcggctggcg cgagatcatc 900
cagaagtggc tgaagatgga cgagaacgag cccagcgaga agtacctgga ggtgttcaag 960
gactaccagc gcaagcaccc ccgcgaggcc ggcgactaca gcgtgtacga gttcctgagc 1020
aagaaggaga accacttcat ctggcgcaac caccccgagt acccctacct gtacgccacc 1080
ttctgcgaga tcgacaagaa gaagaaggac gccaagcagc aggccacctt caccctggcc 1140
gaccccatca accaccccct gtgggtgcgc ttcgaggagc gcagcggcag caacctgaac 1200
aagtaccgca tcctgaccga gcagctgcac accgagaagc tgaagaagaa gctgaccgtg 1260
cagctggacc gcctgatcta ccccaccgag agcggcggct gggaggagaa gggcaaggtg 1320
gacatcgtgc tgctgcccag ccgccagttc tacaaccaga tcttcctgga catcgaggag 1380
aagggcaagc acgccttcac ctacaaggac gagagcatca agttccccct gaagggcacc 1440
ctgggcggcg cccgcgtgca gttcgaccgc gaccacctgc gccgctaccc ccacaaggtg 1500
gagagcggca acgtgggccg catctacttc aacatgaccg tgaacatcga gcccaccgag 1560
agccccgtga gcaagagcct gaagatccac cgcgacgact tccccaaggt ggtgaacttc 1620
aagcccaagg agctgaccga gtggatcaag gacagcaagg gcaagaagct gaagagcggc 1680
atcgagagcc tggagatcgg cctgcgcgtg atgagcatcg acctgggcca gcgccaggcc 1740
gccgccgcca gcatcttcga ggtggtggac cagaagcccg acatcgaggg caagctgttc 1800
ttccccatca agggcaccga gctgtacgcc gtgcaccgcg ccagcttcaa catcaagctg 1860
cccggcgaga ccctggtgaa gagccgcgag gtgctgcgca aggcccgcga ggacaacctg 1920
aagctgatga accagaagct gaacttcctg cgcaacgtgc tgcacttcca gcagttcgag 1980
gacatcaccg agcgcgagaa gcgcgtgacc aagtggatca gccgccagga gaacagcgac 2040
gtgcccctgg tgtaccagga cgagctgatc cagatccgcg agctgatgta caagccctac 2100
aaggactggg tggccttcct gaagcagctg cacaagcgcc tggaggtgga gatcggcaag 2160
gaggtgaagc actggcgcaa gagcctgagc gacggccgca agggcctgta cggcatcagc 2220
ctgaagaaca tcgacgagat cgaccgcacc cgcaagttcc tgctgcgctg gagcctgcgc 2280
cccaccgagc ccggcgaggt gcgccgcctg gagcccggcc agcgcttcgc catcgaccag 2340
ctgaaccacc tgaacgccct gaaggaggac cgcctgaaga agatggccaa caccatcatc 2400
atgcacgccc tgggctactg ctacgacgtg cgcaagaaga agtggcaggc caagaacccc 2460
gcctgccaga tcatcctgtt cgaggacctg agcaactaca acccctacga ggagcgcagc 2520
cgcttcgaga acagcaagct gatgaagtgg agccgccgcg agatcccccg ccaggtggcc 2580
ctgcagggcg agatctacgg cctgcaggtg ggcgaggtgg gcgcccagtt cagcagccgc 2640
ttccacgcca agaccggcag ccccggcatc cgctgcagcg tggtgaccaa ggagaagctg 2700
caggacaacc gcttcttcaa gaacctgcag cgcgagggcc gcctgaccct ggacaagatc 2760
gccgtgctga aggagggcga cctgtacccc gacaagggcg gcgagaagtt catcagcctg 2820
agcaaggacc gcaagtgcgt gaccacccac gccgacatca acgccgccca gaacctgcag 2880
aagcgcttct ggacccgcac ccacggcttc tacaaggtgt actgcaaggc ctaccaggtg 2940
gacggccaga ccgtgtacat ccccgagagc aaggaccaga agcagaagat catcgaggag 3000
ttcggcgagg gctacttcat cctgaaggac ggcgtgtacg agtgggtgaa cgccggcaag 3060
ctgaagatca agaagggcag cagcaagcag agcagcagcg agctggtgga cagcgacatc 3120
ctgaaggaca gcttcgacct ggccagcgag ctgaagggcg agaagctgat gctgtaccgc 3180
gaccccagcg gcaacgtgtt ccccagcgac aagtggatgg ccgccggcgt gttcttcggc 3240
aagctggagc gcatcctgat cagcaagctg accaaccagt acagcatcag caccatcgag 3300
gacgacagca gcaagcagag catg 3324
<210> 15
<211> 3324
<212> DNA
<213> Bacillus
<400> 15
atggccatcc gcagcatcaa gctgaagctg aagacccaca ccggccccga ggcccagaac 60
ctgcgcaagg gcatctggcg cacccaccgc ctgctgaacg agggcgtggc ctactacatg 120
aagatgctgc tgctgttccg ccaggagagc accggcgagc gccccaagga ggagctgcag 180
gaggagctga tctgccacat ccgcgagcag cagcagcgca accaggccga caagaacacc 240
caggccctgc ccctggacaa ggccctggag gccctgcgcc agctgtacga gctgctggtg 300
cccagcagcg tgggccagag cggcgacgcc cagatcatca gccgcaagtt cctgagcccc 360
ctggtggacc ccaacagcga gggcggcaag ggcaccagca aggccggcgc caagcccacc 420
tggcagaaga agaaggaggc caacgacccc acctgggagc aggactacga gaagtggaag 480
aagcgccgcg aggaggaccc caccgccagc gtgatcacca ccctggagga gtacggcatc 540
cgccccatct tccccctgta caccaacacc gtgaccgaca tcgcctggct gcccctgcag 600
agcaaccagt tcgtgcgcac ctgggaccgc gacatgctgc agcaggccat cgagcgcctg 660
ctgagctggg agagctggaa caagcgcgtg caggaggagt acgccaagct gaaggagaag 720
atggcccagc tgaacgagca gctggagggc ggccaggagt ggatcagcct gctggagcag 780
tacgaggaga accgcgagcg cgagctgcgc gagaacatga ccgccgccaa cgacaagtac 840
cgcatcacca agcgccagat gaagggctgg aacgagctgt acgagctgtg gagcaccttc 900
cccgccagcg ccagccacga gcagtacaag gaggccctga agcgcgtgca gcagcgcctg 960
cgcggccgct tcggcgacgc ccacttcttc cagtacctga tggaggagaa gaaccgcctg 1020
atctggaagg gcaaccccca gcgcatccac tacttcgtgg cccgcaacga gctgaccaag 1080
cgcctggagg aggccaagca gagcgccacc atgaccctgc ccaacgcccg caagcacccc 1140
ctgtgggtgc gcttcgacgc ccgcggcggc aacctgcagg actactacct gaccgccgag 1200
gccgacaagc cccgcagccg ccgcttcgtg accttcagcc agctgatctg gcccagcgag 1260
agcggctgga tggagaagaa ggacgtggag gtggagctgg ccctgagccg ccagttctac 1320
cagcaggtga agctgctgaa gaacgacaag ggcaagcaga agatcgagtt caaggacaag 1380
ggcagcggca gcaccttcaa cggccacctg ggcggcgcca agctgcagct ggagcgcggc 1440
gacctggaga aggaggagaa gaacttcgag gacggcgaga tcggcagcgt gtacctgaac 1500
gtggtgatcg acttcgagcc cctgcaggag gtgaagaacg gccgcgtgca ggccccctac 1560
ggccaggtgc tgcagctgat ccgccgcccc aacgagttcc ccaaggtgac cacctacaag 1620
agcgagcagc tggtggagtg gatcaaggcc agcccccagc acagcgccgg cgtggagagc 1680
ctggccagcg gcttccgcgt gatgagcatc gacctgggcc tgcgcgccgc cgccgccacc 1740
agcatcttca gcgtggagga gagcagcgac aagaacgccg ccgacttcag ctactggatc 1800
gagggcaccc ccctggtggc cgtgcaccag cgcagctaca tgctgcgcct gcccggcgag 1860
caggtggaga agcaggtgat ggagaagcgc gacgagcgct tccagctgca ccagcgcgtg 1920
aagttccaga tccgcgtgct ggcccagatc atgcgcatgg ccaacaagca gtacggcgac 1980
cgctgggacg agctggacag cctgaagcag gccgtggagc agaagaagag ccccctggac 2040
cagaccgacc gcaccttctg ggagggcatc gtgtgcgacc tgaccaaggt gctgccccgc 2100
aacgaggccg actgggagca ggccgtggtg cagatccacc gcaaggccga ggagtacgtg 2160
ggcaaggccg tgcaggcctg gcgcaagcgc ttcgccgccg acgagcgcaa gggcatcgcc 2220
ggcctgagca tgtggaacat cgaggagctg gagggcctgc gcaagctgct gatcagctgg 2280
agccgccgca cccgcaaccc ccaggaggtg aaccgcttcg agcgcggcca caccagccac 2340
cagcgcctgc tgacccacat ccagaacgtg aaggaggacc gcctgaagca gctgagccac 2400
gccatcgtga tgaccgccct gggctacgtg tacgacgagc gcaagcagga gtggtgcgcc 2460
gagtaccccg cctgccaggt gatcctgttc gagaacctga gccagtaccg cagcaacctg 2520
gaccgcagca ccaaggagaa cagcaccctg atgaagtggg cccaccgcag catccccaag 2580
tacgtgcaca tgcaggccga gccctacggc atccagatcg gcgacgtgcg cgccgagtac 2640
agcagccgct tctacgccaa gaccggcacc cccggcatcc gctgcaagaa ggtgcgcggc 2700
caggacctgc agggccgccg cttcgagaac ctgcagaagc gcctggtgaa cgagcagttc 2760
ctgaccgagg agcaggtgaa gcagctgcgc cccggcgaca tcgtgcccga cgacagcggc 2820
gagctgttca tgaccctgac cgacggcagc ggcagcaagg aggtggtgtt cctgcaggcc 2880
gacatcaacg ccgcccacaa cctgcagaag cgcttctggc agcgctacaa cgagctgttc 2940
aaggtgagct gccgcgtgat cgtgcgcgac gaggaggagt acctggtgcc caagaccaag 3000
agcgtgcagg ccaagctggg caagggcctg ttcgtgaaga agagcgacac cgcctggaag 3060
gacgtgtacg tgtgggacag ccaggccaag ctgaagggca agaccacctt caccgaggag 3120
agcgagagcc ccgagcagct ggaggacttc caggagatca tcgaggaggc cgaggaggcc 3180
aagggcacct accgcaccct gttccgcgac cccagcggcg tgttcttccc cgagagcgtg 3240
tggtaccccc agaaggactt ctggggcgag gtgaagcgca agctgtacgg caagctgcgc 3300
gagcgcttcc tgaccaaggc ccgc 3324
<210> 16
<211> 3336
<212> DNA
<213> Bacillus
<400> 16
atggccatcc gcagcatcaa gctgaagatg aagaccaaca gcggcaccga cagcatctac 60
ctgcgcaagg ccctgtggcg cacccaccag ctgatcaacg agggcatcgc ctactacatg 120
aacctgctga ccctgtaccg ccaggaggcc atcggcgaca agaccaagga ggcctaccag 180
gccgagctga tcaacatcat ccgcaaccag cagcgcaaca acggcagcag cgaggagcac 240
ggcagcgacc aggagatcct ggccctgctg cgccagctgt acgagctgat catccccagc 300
agcatcggcg agagcggcga cgccaaccag ctgggcaaca agttcctgta ccccctggtg 360
gaccccaaca gccagagcgg caagggcacc agcaacgccg gccgcaagcc ccgctggaag 420
cgcctgaagg aggagggcaa ccccgactgg gagctggaga agaagaagga cgaggagcgc 480
aaggccaagg accccaccgt gaagatcttc gacaacctga acaagtacgg cctgctgccc 540
ctgttccccc tgttcaccaa catccagaag gacatcgagt ggctgcccct gggcaagcgc 600
cagagcgtgc gcaagtggga caaggacatg ttcatccagg ccatcgagcg cctgctgagc 660
tgggagagct ggaaccgccg cgtggccgac gagtacaagc agctgaagga gaagaccgag 720
agctactaca aggagcacct gaccggcggc gaggagtgga tcgagaagat ccgcaagttc 780
gagaaggagc gcaacatgga gctggagaag aacgccttcg cccccaacga cggctacttc 840
atcaccagcc gccagatccg cggctgggac cgcgtgtacg agaagtggag caagctgccc 900
gagagcgcca gccccgagga gctgtggaag gtggtggccg agcagcagaa caagatgagc 960
gagggcttcg gcgaccccaa ggtgttcagc ttcctggcca accgcgagaa ccgcgacatc 1020
tggcgcggcc acagcgagcg catctaccac atcgccgcct acaacggcct gcagaagaag 1080
ctgagccgca ccaaggagca ggccaccttc accctgcccg acgccatcga gcaccccctg 1140
tggatccgct acgagagccc cggcggcacc aacctgaacc tgttcaagct ggaggagaag 1200
cagaagaaga actactacgt gaccctgagc aagatcatct ggcccagcga ggagaagtgg 1260
atcgagaagg agaacatcga gatccccctg gcccccagca tccagttcaa ccgccagatc 1320
aagctgaagc agcacgtgaa gggcaagcag gagatcagct tcagcgacta cagcagccgc 1380
atcagcctgg acggcgtgct gggcggcagc cgcatccagt tcaaccgcaa gtacatcaag 1440
aaccacaagg agctgctggg cgagggcgac atcggccccg tgttcttcaa cctggtggtg 1500
gacgtggccc ccctgcagga gacccgcaac ggccgcctgc agagccccat cggcaaggcc 1560
ctgaaggtga tcagcagcga cttcagcaag gtgatcgact acaagcccaa ggagctgatg 1620
gactggatga acaccggcag cgccagcaac agcttcggcg tggccagcct gctggagggc 1680
atgcgcgtga tgagcatcga catgggccag cgcaccagcg ccagcgtgag catcttcgag 1740
gtggtgaagg agctgcccaa ggaccaggag cagaagctgt tctacagcat caacgacacc 1800
gagctgttcg ccatccacaa gcgcagcttc ctgctgaacc tgcccggcga ggtggtgacc 1860
aagaacaaca agcagcagcg ccaggagcgc cgcaagaagc gccagttcgt gcgcagccag 1920
atccgcatgc tggccaacgt gctgcgcctg gagaccaaga agacccccga cgagcgcaag 1980
aaggccatcc acaagctgat ggagatcgtg cagagctacg acagctggac cgccagccag 2040
aaggaggtgt gggagaagga gctgaacctg ctgaccaaca tggccgcctt caacgacgag 2100
atctggaagg agagcctggt ggagctgcac caccgcatcg agccctacgt gggccagatc 2160
gtgagcaagt ggcgcaaggg cctgagcgag ggccgcaaga acctggccgg catcagcatg 2220
tggaacatcg acgagctgga ggacacccgc cgcctgctga tcagctggag caagcgcagc 2280
cgcacccccg gcgaggccaa ccgcatcgag accgacgagc ccttcggcag cagcctgctg 2340
cagcacatcc agaacgtgaa ggacgaccgc ctgaagcaga tggccaacct gatcatcatg 2400
accgccctgg gcttcaagta cgacaaggag gagaaggacc gctacaagcg ctggaaggag 2460
acctaccccg cctgccagat catcctgttc gagaacctga accgctacct gttcaacctg 2520
gaccgcagcc gccgcgagaa cagccgcctg atgaagtggg cccaccgcag catcccccgc 2580
accgtgagca tgcagggcga gatgttcggc ctgcaggtgg gcgacgtgcg cagcgagtac 2640
agcagccgct tccacgccaa gaccggcgcc cccggcatcc gctgccacgc cctgaccgag 2700
gaggacctga aggccggcag caacaccctg aagcgcctga tcgaggacgg cttcatcaac 2760
gagagcgagc tggcctacct gaagaagggc gacatcatcc ccagccaggg cggcgagctg 2820
ttcgtgaccc tgagcaagcg ctacaagaag gacagcgaca acaacgagct gaccgtgatc 2880
cacgccgaca tcaacgccgc ccagaacctg cagaagcgct tctggcagca gaacagcgag 2940
gtgtaccgcg tgccctgcca gctggcccgc atgggcgagg acaagctgta catccccaag 3000
agccagaccg agaccatcaa gaagtacttc ggcaagggca gcttcgtgaa gaacaacacc 3060
gagcaggagg tgtacaagtg ggagaagagc gagaagatga agatcaagac cgacaccacc 3120
ttcgacctgc aggacctgga cggcttcgag gacatcagca agaccatcga gctggcccag 3180
gagcagcaga agaagtacct gaccatgttc cgcgacccca gcggctactt cttcaacaac 3240
gagacctggc gcccccagaa ggagtactgg agcatcgtga acaacatcat caagagctgc 3300
ctgaagaaga agatcctgag caacaaggtg gagctg 3336
<210> 17
<211> 3447
<212> DNA
<213> Desulfovibrio inopinatus
<400> 17
atgcccaccc gcaccatcaa cctgaagctg gtgctgggca agaaccccga gaacgccacc 60
ctgcgccgcg ccctgttcag cacccaccgc ctggtgaacc aggccaccaa gcgcatcgag 120
gagttcctgc tgctgtgccg cggcgaggcc taccgcaccg tggacaacga gggcaaggag 180
gccgagatcc cccgccacgc cgtgcaggag gaggccctgg ccttcgccaa ggccgcccag 240
cgccacaacg gctgcatcag cacctacgag gaccaggaga tcctggacgt gctgcgccag 300
ctgtacgagc gcctggtgcc cagcgtgaac gagaacaacg aggccggcga cgcccaggcc 360
gccaacgcct gggtgagccc cctgatgagc gccgagagcg agggcggcct gagcgtgtac 420
gacaaggtgc tggacccccc ccccgtgtgg atgaagctga aggaggagaa ggcccccggc 480
tgggaggccg ccagccagat ctggatccag agcgacgagg gccagagcct gctgaacaag 540
cccggcagcc ccccccgctg gatccgcaag ctgcgcagcg gccagccctg gcaggacgac 600
ttcgtgagcg accagaagaa gaagcaggac gagctgacca agggcaacgc ccccctgatc 660
aagcagctga aggagatggg cctgctgccc ctggtgaacc ccttcttccg ccacctgctg 720
gaccccgagg gcaagggcgt gagcccctgg gaccgcctgg ccgtgcgcgc cgccgtggcc 780
cacttcatca gctgggagag ctggaaccac cgcacccgcg ccgagtacaa cagcctgaag 840
ctgcgccgcg acgagttcga ggccgccagc gacgagttca aggacgactt caccctgctg 900
cgccagtacg aggccaagcg ccacagcacc ctgaagagca tcgccctggc cgacgacagc 960
aacccctacc gcatcggcgt gcgcagcctg cgcgcctgga accgcgtgcg cgaggagtgg 1020
atcgacaagg gcgccaccga ggagcagcgc gtgaccatcc tgagcaagct gcagacccag 1080
ctgcgcggca agttcggcga ccccgacctg ttcaactggc tggcccagga ccgccacgtg 1140
cacctgtgga gcccccgcga cagcgtgacc cccctggtgc gcatcaacgc cgtggacaag 1200
gtgctgcgcc gccgcaagcc ctacgccctg atgaccttcg cccacccccg cttccacccc 1260
cgctggatcc tgtacgaggc ccccggcggc agcaacctgc gccagtacgc cctggactgc 1320
accgagaacg ccctgcacat caccctgccc ctgctggtgg acgacgccca cggcacctgg 1380
atcgagaaga agatccgcgt gcccctggcc cccagcggcc agatccagga cctgaccctg 1440
gagaagctgg agaagaagaa gaaccgcctg tactaccgca gcggcttcca gcagttcgcc 1500
ggcctggccg gcggcgccga ggtgctgttc caccgcccct acatggagca cgacgagcgc 1560
agcgaggaga gcctgctgga gcgccccggc gccgtgtggt tcaagctgac cctggacgtg 1620
gccacccagg ccccccccaa ctggctggac ggcaagggcc gcgtgcgcac cccccccgag 1680
gtgcaccact tcaagaccgc cctgagcaac aagagcaagc acacccgcac cctgcagccc 1740
ggcctgcgcg tgctgagcgt ggacctgggc atgcgcacct tcgccagctg cagcgtgttc 1800
gagctgatcg agggcaagcc cgagaccggc cgcgccttcc ccgtggccga cgagcgcagc 1860
atggacagcc ccaacaagct gtgggccaag cacgagcgca gcttcaagct gaccctgccc 1920
ggcgagaccc ccagccgcaa ggaggaggag gagcgcagca tcgcccgcgc cgagatctac 1980
gccctgaagc gcgacatcca gcgcctgaag agcctgctgc gcctgggcga ggaggacaac 2040
gacaaccgcc gcgacgccct gctggagcag ttcttcaagg gctggggcga ggaggacgtg 2100
gtgcccggcc aggccttccc ccgcagcctg ttccagggcc tgggcgccgc ccccttccgc 2160
agcacccccg agctgtggcg ccagcactgc cagacctact acgacaaggc cgaggcctgc 2220
ctggccaagc acatcagcga ctggcgcaag cgcacccgcc cccgccccac cagccgcgag 2280
atgtggtaca agacccgcag ctaccacggc ggcaagagca tctggatgct ggagtacctg 2340
gacgccgtgc gcaagctgct gctgagctgg agcctgcgcg gccgcaccta cggcgccatc 2400
aaccgccagg acaccgcccg cttcggcagc ctggccagcc gcctgctgca ccacatcaac 2460
agcctgaagg aggaccgcat caagaccggc gccgacagca tcgtgcaggc cgcccgcggc 2520
tacatccccc tgccccacgg caagggctgg gagcagcgct acgagccctg ccagctgatc 2580
ctgttcgagg acctggcccg ctaccgcttc cgcgtggacc gcccccgccg cgagaacagc 2640
cagctgatgc agtggaacca ccgcgccatc gtggccgaga ccaccatgca ggccgagctg 2700
tacggccaga tcgtggagaa caccgccgcc ggcttcagca gccgcttcca cgccgccacc 2760
ggcgcccccg gcgtgcgctg ccgcttcctg ctggagcgcg acttcgacaa cgacctgccc 2820
aagccctacc tgctgcgcga gctgagctgg atgctgggca acaccaaggt ggagagcgag 2880
gaggagaagc tgcgcctgct gagcgagaag atccgccccg gcagcctggt gccctgggac 2940
ggcggcgagc agttcgccac cctgcacccc aagcgccaga ccctgtgcgt gatccacgcc 3000
gacatgaacg ccgcccagaa cctgcagcgc cgcttcttcg gccgctgcgg cgaggccttc 3060
cgcctggtgt gccagcccca cggcgacgac gtgctgcgcc tggccagcac ccccggcgcc 3120
cgcctgctgg gcgccctgca gcagctggag aacggccagg gcgccttcga gctggtgcgc 3180
gacatgggca gcaccagcca gatgaaccgc ttcgtgatga agagcctggg caagaagaag 3240
atcaagcccc tgcaggacaa caacggcgac gacgagctgg aggacgtgct gagcgtgctg 3300
cccgaggagg acgacaccgg ccgcatcacc gtgttccgcg acagcagcgg catcttcttc 3360
ccctgcaacg tgtggatccc cgccaagcag ttctggcccg ccgtgcgcgc catgatctgg 3420
aaggtgatgg ccagccacag cctgggc 3447
<210> 18
<211> 3270
<212> DNA
<213> Laceyella sediminis
<400> 18
atgagcatcc gcagcttcaa gctgaagatc aagaccaaga gcggcgtgaa cgccgaggag 60
ctgcgccgcg gcctgtggcg cacccaccag ctgatcaacg acggcatcgc ctactacatg 120
aactggctgg tgctgctgcg ccaggaggac ctgttcatcc gcaacgagga gaccaacgag 180
atcgagaagc gcagcaagga ggagatccag ggcgagctgc tggagcgcgt gcacaagcag 240
cagcagcgca accagtggag cggcgaggtg gacgaccaga ccctgctgca gaccctgcgc 300
cacctgtacg aggagatcgt gcccagcgtg atcggcaaga gcggcaacgc cagcctgaag 360
gcccgcttct tcctgggccc cctggtggac cccaacaaca agaccaccaa ggacgtgagc 420
aagagcggcc ccacccccaa gtggaagaag atgaaggacg ccggcgaccc caactgggtg 480
caggagtacg agaagtacat ggccgagcgc cagaccctgg tgcgcctgga ggagatgggc 540
ctgatccccc tgttccccat gtacaccgac gaggtgggcg acatccactg gctgccccag 600
gccagcggct acacccgcac ctgggaccgc gacatgttcc agcaggccat cgagcgcctg 660
ctgagctggg agagctggaa ccgccgcgtg cgcgagcgcc gcgcccagtt cgagaagaag 720
acccacgact tcgccagccg cttcagcgag agcgacgtgc agtggatgaa caagctgcgc 780
gagtacgagg cccagcagga gaagagcctg gaggagaacg ccttcgcccc caacgagccc 840
tacgccctga ccaagaaggc cctgcgcggc tgggagcgcg tgtaccacag ctggatgcgc 900
ctggacagcg ccgccagcga ggaggcctac tggcaggagg tggccacctg ccagaccgcc 960
atgcgcggcg agttcggcga ccccgccatc taccagttcc tggcccagaa ggagaaccac 1020
gacatctggc gcggctaccc cgagcgcgtg atcgacttcg ccgagctgaa ccacctgcag 1080
cgcgagctgc gccgcgccaa ggaggacgcc accttcaccc tgcccgacag cgtggaccac 1140
cccctgtggg tgcgctacga ggcccccggc ggcaccaaca tccacggcta cgacctggtg 1200
caggacacca agcgcaacct gaccctgatc ctggacaagt tcatcctgcc cgacgagaac 1260
ggcagctggc acgaggtgaa gaaggtgccc ttcagcctgg ccaagagcaa gcagttccac 1320
cgccaggtgt ggctgcagga ggagcagaag cagaagaagc gcgaggtggt gttctacgac 1380
tacagcacca acctgcccca cctgggcacc ctggccggcg ccaagctgca gtgggaccgc 1440
aacttcctga acaagcgcac ccagcagcag atcgaggaga ccggcgagat cggcaaggtg 1500
ttcttcaaca tcagcgtgga cgtgcgcccc gccgtggagg tgaagaacgg ccgcctgcag 1560
aacggcctgg gcaaggccct gaccgtgctg acccaccccg acggcaccaa gatcgtgacc 1620
ggctggaagg ccgagcagct ggagaagtgg gtgggcgaga gcggccgcgt gagcagcctg 1680
ggcctggaca gcctgagcga gggcctgcgc gtgatgagca tcgacctggg ccagcgcacc 1740
agcgccaccg tgagcgtgtt cgagatcacc aaggaggccc ccgacaaccc ctacaagttc 1800
ttctaccagc tggagggcac cgagctgttc gccgtgcacc agcgcagctt cctgctggcc 1860
ctgcccggcg agaacccccc ccagaagatc aagcagatgc gcgagatccg ctggaaggag 1920
cgcaaccgca tcaagcagca ggtggaccag ctgagcgcca tcctgcgcct gcacaagaag 1980
gtgaacgagg acgagcgcat ccaggccatc gacaagctgc tgcagaaggt ggccagctgg 2040
cagctgaacg aggagatcgc caccgcctgg aaccaggccc tgagccagct gtacagcaag 2100
gccaaggaga acgacctgca gtggaaccag gccatcaaga acgcccacca ccagctggag 2160
cccgtggtgg gcaagcagat cagcctgtgg cgcaaggacc tgagcaccgg ccgccagggc 2220
atcgccggcc tgagcctgtg gagcatcgag gagctggagg ccaccaagaa gctgctgacc 2280
cgctggagca agcgcagccg cgagcccggc gtggtgaagc gcatcgagcg cttcgagacc 2340
ttcgccaagc agatccagca ccacatcaac caggtgaagg agaaccgcct gaagcagctg 2400
gccaacctga tcgtgatgac cgccctgggc tacaagtacg accaggagca gaagaagtgg 2460
atcgaggtgt accccgcctg ccaggtggtg ctgttcgaga acctgcgcag ctaccgcttc 2520
agctacgagc gcagccgccg cgagaacaag aagctgatgg agtggagcca ccgcagcatc 2580
cccaagctgg tgcagatgca gggcgagctg ttcggcctgc aggtggccga cgtgtacgcc 2640
gcctacagca gccgctacca cggccgcacc ggcgcccccg gcatccgctg ccacgccctg 2700
accgaggccg acctgcgcaa cgagaccaac atcatccacg agctgatcga ggccggcttc 2760
atcaaggagg agcaccgccc ctacctgcag cagggcgacc tggtgccctg gagcggcggc 2820
gagctgttcg ccaccctgca gaagccctac gacaaccccc gcatcctgac cctgcacgcc 2880
gacatcaacg ccgcccagaa catccagaag cgcttctggc accccagcat gtggttccgc 2940
gtgaactgcg agagcgtgat ggagggcgag atcgtgacct acgtgcccaa gaacaagacc 3000
gtgcacaaga agcagggcaa gaccttccgc ttcgtgaagg tggagggcag cgacgtgtac 3060
gagtgggcca agtggagcaa gaaccgcaac aagaacacct tcagcagcat caccgagcgc 3120
aagcccccca gcagcatgat cctgttccgc gaccccagcg gcaccttctt caaggagcag 3180
gagtgggtgg agcagaagac cttctggggc aaggtgcaga gcatgatcca ggcctacatg 3240
aagaagacca tcgtgcagcg catggaggag 3270
<210> 19
<211> 3357
<212> DNA
<213> Spirochaetes
<400> 19
atgagcttca ccatcagcta ccccttcaag ctgatcatca agaacaagga cgaggccaag 60
gccctgctgg acacccacca gtacatgaac gagggcgtga agtactacct ggagaagctg 120
ctgatgttcc gccaggagaa gatcttcatc ggcgaggacg agaccggcaa gcgcatctac 180
atcgaggaga ccgagtacaa gaagcagatc gaggagttct acctgatcaa gaagaccgag 240
ctgggccgca acctgaccct gaccctggac gagttcaaga ccctgatgcg cgagctgtac 300
atctgcctgg tgagcagcag catggagaac aagaagggct tccccaacgc ccagcaggcc 360
agcctgaaca tcttcagccc cctgttcgac gccgagagca agggctacat cctgaaggag 420
gagaacaaca acatcagcct gatccacaag gactacggca agatcctgct gaagcgcctg 480
cgcgacaaca acctgatccc catcttcacc aagttcaccg acatcaagaa gatcaccgcc 540
aagctgagcc ccaccgccct ggaccgcatg atcttcgccc aggccatcga gaagctgctg 600
agctacgaga gctggtgcaa gctgatgatc aaggagcgct tcgacaagga ggtgaagatc 660
aaggagctgg agaacaagtg cgagaacaag caggagcgcg acaagatctt cgagatcctg 720
gagaagtacg aggaggagcg ccagaagacc ttcgagcagg acagcggctt cgccaagaag 780
ggcaagttct acatcaccgg ccgcatgctg aagggcttcg acgagatcaa ggagaagtgg 840
ctgaaggaga aggaccgcag cgagcagaac ctgatcaaca tcctgaacaa gtaccagacc 900
gacaacagca agctggtggg cgaccgcaac ctgttcgagt tcatcatcaa gctggagaac 960
cagtgcctgt ggaacggcga catcgactac ctgaagatca agcgcgacat caacaagaac 1020
cagatctggc tggaccgccc cgagatgccc cgcttcacca tgcccgactt caagaagcac 1080
cccctgtggt accgctacga ggaccccagc aacagcaact tccgcaacta caagatcgag 1140
gtggtgaagg acgagaacta catcaccatc cccctgatca ccgagcgcaa caacgagtac 1200
ttcgaggaga actacacctt caacctggcc aagctgaaga agctgagcga gaacatcacc 1260
ttcatcccca agagcaagaa caaggagttc gagttcatcg acagcaacga cgaggaggag 1320
gacaagaagg accagaagaa gagcaagcag tacatcaagt actgcgacac cgccaagaac 1380
accagctacg gcaagagcgg cggcatccgc ctgtacttca accgcaacga gctggagaac 1440
tacaaggacg gcaagaagat ggacagctac accgtgttca ccctgagcat ccgcgactac 1500
aagagcctgt tcgccaagga gaagctgcag ccccagatct tcaacaccgt ggacaacaag 1560
atcaccagcc tgaagatcca gaagaagttc ggcaacgagg agcagaccaa cttcctgagc 1620
tacttcaccc agaaccagat caccaagaag gactggatgg acgagaagac cttccagaac 1680
gtgaaggagc tgaacgaggg catccgcgtg ctgagcgtgg acctgggcca gcgcttcttc 1740
gccgccgtga gctgcttcga gatcatgagc gagatcgaca acaacaagct gttcttcaac 1800
ctgaacgacc agaaccacaa gatcatccgc atcaacgaca agaactacta cgccaagcac 1860
atctacagca agaccatcaa gctgagcggc gaggacgacg acctgtacaa ggagcgcaag 1920
atcaacaaga actacaagct gagctaccag gagcgcaaga acaagatcgg catcttcacc 1980
cgccagatca acaagctgaa ccagctgctg aagatcatcc gcaacgacga gatcgacaag 2040
gagaagttca aggagctgat cgagaccacc aagcgctacg tgaagaacac ctacaacgac 2100
ggcatcatcg actggaacaa cgtggacaac aagatcctga gctacgagaa caaggaggac 2160
gtgatcaacc tgcacaagga gctggacaag aagctggaga tcgacttcaa ggagttcatc 2220
cgcgagtgcc gcaagcccat cttccgcagc ggcggcctga gcatgcagcg catcgacttc 2280
ctggagaagc tgaacaagct gaagcgcaag tgggtggccc gcacccagaa gagcgccgag 2340
agcatcgtgc tgacccccaa gttcggctac aagctgaagg agcacatcaa cgagctgaag 2400
gacaaccgcg tgaagcaggg cgtgaactac atcctgatga ccgccctggg ctacatcaag 2460
gacaacgaga tcaagaacga cagcaagaag aagcagaagg aggactgggt gaagaagaac 2520
cgcgcctgcc agatcatcct gatggagaag ctgaccgagt acaccttcgc cgaggaccgc 2580
ccccgcgagg agaacagcaa gctgcgcatg tggagccacc gccagatctt caacttcctg 2640
cagcagaagg ccagcctgtg gggcatcctg gtgggcgacg tgttcgcccc ctacaccagc 2700
aagtgcctga gcgacaacaa cgcccccggc atccgctgcc accaggtgac caagaaggac 2760
ctgatcgaca acagctggtt cctgaagatc gtggtgaagg acgacgcctt ctgcgacctg 2820
atcgagatca acaaggagaa cgtgaagaac aagagcatca agatcaacga catcctgccc 2880
ctgcgcggcg gcgagctgtt cgccagcatc aaggacggca agctgcacat cgtgcaggcc 2940
gacatcaacg ccagccgcaa catcgccaag cgcttcctga gccagatcaa ccccttccgc 3000
gtggtgctga agaaggacaa ggacgagacc ttccacctga agaacgagcc caactacctg 3060
aagaactact acagcatcct gaacttcgtg cccaccaacg aggagctgac cttcttcaag 3120
gtggaggaga acaaggacat caagcccacc aagcgcatca agatggacaa gcacgagaag 3180
gagagcaccg acgagggcga cgactacagc aagaaccaga tcgccctgtt ccgcgacgac 3240
agcggcatct tcttcgacaa gagcctgtgg gtggacggca agatcttctg gagcgtggtg 3300
aagaacaaga tgaccaagct gctgcgcgag cgcaacaaca agaagaacgg cagcaag 3357
<210> 20
<211> 3426
<212> DNA
<213> Tuberibacillus calidus
<400> 20
atgaacatcc acctgaagga gctgatccgc atggccacca agagcttcat cctgaagatg 60
aagaccaaga acaaccccca gctgcgcctg agcctgtgga agacccacga gctgttcaac 120
ttcggcgtgg cctactacat ggacctgctg agcctgttcc gccagaagga cctgtacatg 180
cacaacgacg aggaccccga ccaccccgtg gtgctgaaga aggaggagat ccaggagcgc 240
ctgtggatga aggtgcgcga gacccagcag aagaacggct tccacggcga ggtgagcaag 300
gacgaggtgc tggagaccct gcgcgccctg tacgaggagc tggtgcccag cgccgtgggc 360
aagagcggcg aggccaacca gatcagcaac aagtacctgt accccctgac cgaccccgcc 420
agccagagcg gcaagggcac cgccaacagc ggccgcaagc cccgctggaa gaagctgaag 480
gaggccggcg accccagctg gaaggacgcc tacgagaagt gggagaagga gcgccaggag 540
gaccccaagc tgaagatcct ggccgccctg cagagcttcg gcctgatccc cctgttccgc 600
cccttcaccg agaacgacca caaggccgtg atcagcgtga agtggatgcc caagagcaag 660
aaccagagcg tgcgcaagtt cgacaaggac atgttcaacc aggccatcga gcgcttcctg 720
agctgggaga gctggaacga gaaggtggcc gaggactacg agaagaccgt gagcatctac 780
gagagcctgc agaaggagct gaagggcatc agcaccaagg ccttcgagat catggagcgc 840
gtggagaagg cctacgaggc ccacctgcgc gagatcacct tcagcaacag cacctaccgc 900
atcggcaacc gcgccatccg cggctggacc gagatcgtga agaagtggat gaagctggac 960
cccagcgccc cccagggcaa ctacctggac gtggtgaagg actaccagcg ccgccacccc 1020
cgcgagagcg gcgacttcaa gctgttcgag ctgctgagcc gccccgagaa ccaggccgcc 1080
tggcgcgagt accccgagtt cctgcccctg tacgtgaagt accgccacgc cgagcagcgc 1140
atgaagaccg ccaagaagca ggccaccttc accctgtgcg accccatccg ccaccccctg 1200
tgggtgcgct acgaggagcg cagcggcacc aacctgaaca agtaccgcct gatcatgaac 1260
gagaaggaga aggtggtgca gttcgaccgc ctgatctgcc tgaacgccga cggccactac 1320
gaggagcagg aggacgtgac cgtgcccctg gcccccagcc agcagttcga cgaccagatc 1380
aagttcagca gcgaggacac cggcaagggc aagcacaact tcagctacta ccacaagggc 1440
atcaactacg agctgaaggg caccctgggc ggcgcccgca tccagttcga ccgcgagcac 1500
ctgctgcgcc gccagggcgt gaaggccggc aacgtgggcc gcatcttcct gaacgtgacc 1560
ctgaacatcg agcccatgca gcccttcagc cgcagcggca acctgcagac cagcgtgggc 1620
aaggccctga aggtgtacgt ggacggctac cccaaggtgg tgaacttcaa gcccaaggag 1680
ctgaccgagc acatcaagga gagcgagaag aacaccctga ccctgggcgt ggagagcctg 1740
cccaccggcc tgcgcgtgat gagcgtggac ctgggccagc gccaggccgc cgccatcagc 1800
atcttcgagg tggtgagcga gaagcccgac gacaacaagc tgttctaccc cgtgaaggac 1860
accgacctgt tcgccgtgca ccgcaccagc ttcaacatca agctgcccgg cgagaagcgc 1920
accgagcgcc gcatgctgga gcagcagaag cgcgaccagg ccatccgcga cctgagccgc 1980
aagctgaagt tcctgaagaa cgtgctgaac atgcagaagc tggagaagac cgacgagcgc 2040
gagaagcgcg tgaaccgctg gatcaaggac cgcgagcgcg aggaggagaa ccccgtgtac 2100
gtgcaggagt tcgagatgat cagcaaggtg ctgtacagcc cccacagcgt gtgggtggac 2160
cagctgaaga gcatccaccg caagctggag gagcagctgg gcaaggagat cagcaagtgg 2220
cgccagagca tcagccaggg ccgccagggc gtgtacggca tcagcctgaa gaacatcgag 2280
gacatcgaga agacccgccg cctgctgttc cgctggagca tgcgccccga gaaccccggc 2340
gaggtgaagc agctgcagcc cggcgagcgc ttcgccatcg accagcagaa ccacctgaac 2400
cacctgaagg acgaccgcat caagaagctg gccaaccaga tcgtgatgac cgccctgggc 2460
taccgctacg acggcaagcg caagaagtgg atcgccaagc accccgcctg ccagctggtg 2520
ctgttcgagg acctgagccg ctacgccttc tacgacgagc gcagccgcct ggagaaccgc 2580
aacctgatgc gctggagccg ccgcgagatc cccaagcagg tggcccagat cggcggcctg 2640
tacggcctgc tggtgggcga ggtgggcgcc cagtacagca gccgcttcca cgccaagagc 2700
ggcgcccccg gcatccgctg ccgcgtggtg aaggagcacg agctgtacat caccgagggc 2760
ggccagaagg tgcgcaacca gaagttcctg gacagcctgg tggagaacaa catcatcgag 2820
cccgacgacg cccgccgcct ggagcccggc gacctgatcc gcgaccaggg cggcgacaag 2880
ttcgccaccc tggacgagcg cggcgagctg gtgatcaccc acgccgacat caacgccgcc 2940
cagaacctgc agaagcgctt ctggacccgc acccacggcc tgtaccgcat ccgctgcgag 3000
agccgcgaga tcaaggacgc cgtggtgctg gtgcccagcg acaaggacca gaaggagaag 3060
atggagaacc tgttcggcat cggctacctg cagcccttca agcaggagaa cgacgtgtac 3120
aagtgggtga agggcgagaa gatcaagggc aagaagacca gcagccagag cgacgacaag 3180
gagctggtga gcgagatcct gcaggaggcg agcgtgatgg ccgacgagct gaagggcaac 3240
cgcaagaccc tgttccgcga ccccagcggc tacgtgttcc ccaaggaccg ctggtacacc 3300
ggcggccgct acttcggcac cctggagcac ctgctgaagc gcaagctggc cgagcgccgc 3360
ctgttcgacg gcggcagcag ccgccgcggc ctgttcaacg gcaccgacag caacaccaac 3420
gtggag 3426
<210> 21
<211> 2870
<212> DNA
<213> Artificial Sequence
<220>
<223> pCAG-2AeGFP partial sequence (CAG-NLS-XmaI-NheI-NLS-T2A-eGFP-SV40)
<400> 21
gacattgatt attgactagt tattaatagt aatcaattac ggggtcatta gttcatagcc 60
catatatgga gttccgcgtt acataactta cggtaaatgg cccgcctggc tgaccgccca 120
acgacccccg cccattgacg tcaataatga cgtatgttcc catagtaacg ccaataggga 180
ctttccattg acgtcaatgg gtggactatt tacggtaaac tgcccacttg gcagtacatc 240
aagtgtatca tatgccaagt acgcccccta ttgacgtcaa tgacggtaaa tggcccgcct 300
ggcattatgc ccagtacatg accttatggg actttcctac ttggcagtac atctacgtat 360
tagtcatcgc tattaccatg ggtcgaggtg agccccacgt tctgcttcac tctccccatc 420
tcccccccct ccccaccccc aattttgtat ttatttattt tttaattatt ttgtgcagcg 480
atgggggcgg gggggggggg ggcgcgcgcc aggcggggcg gggcggggcg aggggcgggg 540
cggggcgagg cggagaggtg cggcggcagc caatcagagc ggcgcgctcc gaaagtttcc 600
ttttatggcg aggcggcggc ggcggcggcc ctataaaaag cgaagcgcgc ggcgggcggg 660
agtcgctgcg ttgccttcgc cccgtgcccc gctccgcgcc gcctcgcgcc gcccgccccg 720
gctctgactg accgcgttac tcccacaggt gagcgggcgg gacggccctt ctcctccggg 780
ctgtaattag cgcttggttt aatgacggct cgtttctttt ctgtggctgc gtgaaagcct 840
taaagggctc cgggagggcc ctttgtgcgg gggggagcgg ctcggggggt gcgtgcgtgt 900
gtgtgtgcgt ggggagcgcc gcgtgcggcc cgcgctgccc ggcggctgtg agcgctgcgg 960
gcgcggcgcg gggctttgtg cgctccgcgt gtgcgcgagg ggagcgcggc cgggggcggt 1020
gccccgcggt gcgggggggc tgcgagggga acaaaggctg cgtgcggggt gtgtgcgtgg 1080
gggggtgagc agggggtgtg ggcgcggcgg tcgggctgta acccccccct gcacccccct 1140
ccccgagttg ctgagcacgg cccggcttcg ggtgcggggc tccgtacggg gcgtggcgcg 1200
gggctcgccg tgccgggcgg ggggtggcgg caggtggggg tgccgggcgg ggcggggccg 1260
cctcgggccg gggagggctc gggggagggg cgcggcggcc cccggagcgc cggcggctgt 1320
cgaggcgcgg cgagccgcag ccattgcctt ttatggtaat cgtgcgagag ggcgcaggga 1380
cttcctttgt cccaaatctg tgcggagccg aaatctggga ggcgccgccg caccccctct 1440
agcgggcgcg gggcgaagcg gtgcggcgcc ggcaggaagg aaatgggcgg ggagggcctt 1500
cgtgcgtcgc cgcgccgccg tccccttctc catctccagc ctcggggctg tccgcagggg 1560
gacggctgcc ttcggggggg acggggcagg gcggggttcg gcttctggcg tgtgaccggc 1620
ggctctagcg cctctgctaa ccatgttcat gccttcttct ttttcctaca gctcctgggc 1680
aacgtgctgg ttattgtgct gtctcatcat tttggcaaag ctagtgaatt ctaatacgac 1740
tcactatagg ccgccaccat gcccaagaag aagaggaagg ttcccggggc tagcccaaag 1800
aagaagagga aagtctctag atacccttat gatgttccag attatgccgg atacccatac 1860
gatgtccctg actatgcagg ctcctaccct tatgacgtcc cagactacgc cggatccagg 1920
tccggcggcg gagagggcag aggaagtctt ctaacatgcg gtgacgtgga ggagaatccc 1980
ggcccaatgg tgagcaaggg cgaggagctg ttcaccgggg tggtgcccat cctggtcgag 2040
ctggacggcg acgtaaacgg ccacaagttc agcgtgtccg gcgagggcga gggcgatgcc 2100
acctacggca agctgaccct gaagttcatc tgcaccaccg gcaagctgcc cgtgccctgg 2160
cccaccctcg tgaccaccct gacctacggc gtgcagtgct tcagccgcta ccccgaccac 2220
atgaagcagc acgacttctt caagtccgcc atgcccgaag gctacgtcca ggagcgcacc 2280
atcttcttca aggacgacgg caactacaag acccgcgccg aggtgaagtt cgagggcgac 2340
accctggtga accgcatcga gctgaagggc atcgacttca aggaggacgg caacatcctg 2400
gggcacaagc tggagtacaa ctacaacagc cacaacgtct atatcatggc cgacaagcag 2460
aagaacggca tcaaggtgaa cttcaagatc cgccacaaca tcgaggacgg cagcgtgcag 2520
ctcgccgacc actaccagca gaacaccccc atcggcgacg gccccgtgct gctgcccgac 2580
aaccactacc tgagcaccca gtccgccctg agcaaagacc ccaacgagaa gcgcgatcac 2640
atggtcctgc tggagttcgt gaccgccgcc gggatcactc tcggcatgga cgagctgtac 2700
aagtaactgc agcgcgggga tctcatgctg gagttcttcg cccaccccaa cttgtttatt 2760
gcagcttata atggttacaa ataaagcaat agcatcacaa atttcacaaa taaagcattt 2820
ttttcactgc attctagttg tggtttgtcc aaactcatca atgtatctta 2870
<210> 22
<211> 410
<212> DNA
<213> Artificial Sequence
<220>
<223> pUC19-U6 partial sequence (U6-BsaI-HindIII)
<220>
<221> misc_feature
<222> (283)..(289)
<223> n is a, c, g, or t
<220>
<221> misc_feature
<222> (297)..(375)
<223> n is a, c, g, or t
<400> 22
tgtaaaacga cggccagtga attcgagggc ctatttccca tgattccttc atatttgcat 60
atacgataca aggctgttag agagataatt ggaattaatt tgactgtaaa cacaaagata 120
ttagtacaaa atacgtgacg tagaaagtaa taatttcttg ggtagtttgc agttttaaaa 180
ttatgtttta aaatggacta tcatatgctt accgtaactt gaaagtattt cgatttcttg 240
gctttatata tcttgtggaa aggacgaaac accggagaga ccnnnnnnng gtctcannnn 300
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 360
nnnnnnnnnn nnnnnaagct tggcgtaatc atggtcatag ctgtttcctg 410
<210> 23
<211> 476
<212> DNA
<213> Artificial Sequence
<220>
<223> pUC19-U6-AasgRNA partial sequence (U6-AasgRNA_scaffold-BsaI-BsaI-terminator)
<400> 23
tgtaaaacga cggccagtga attcgagggc ctatttccca tgattccttc atatttgcat 60
atacgataca aggctgttag agagataatt ggaattaatt tgactgtaaa cacaaagata 120
ttagtacaaa atacgtgacg tagaaagtaa taatttcttg ggtagtttgc agttttaaaa 180
ttatgtttta aaatggacta tcatatgctt accgtaactt gaaagtattt cgatttcttg 240
gctttatata tcttgtggaa aggacgaaac accgggtcta aaggacagaa tttttcaacg 300
ggtgtgccaa tggccacttt ccaggtggca aagcccgttg aacttctcaa aaagaacgct 360
cgctcagtgt tctgacgtcg gatcactgag cgagcgatct gagaagtggc acagagaccg 420
agagagggtc tcattttttt taagcttggc gtaatcatgg tcatagctgt ttcctg 476
<210> 24
<211> 476
<212> DNA
<213> Artificial Sequence
<220>
<223> pUC19-U6-AksgRNA partial sequence (U6-AksgRNA1_scaffold-BsaI-BsaI-terminator)
<400> 24
tgtaaaacga cggccagtga attcgagggc ctatttccca tgattccttc atatttgcat 60
atacgataca aggctgttag agagataatt ggaattaatt tgactgtaaa cacaaagata 120
ttagtacaaa atacgtgacg tagaaagtaa taatttcttg ggtagtttgc agttttaaaa 180
ttatgtttta aaatggacta tcatatgctt accgtaactt gaaagtattt cgatttcttg 240
gctttatata tcttgtggaa aggacgaaac accggtcgtc tataggacgg cgaggacaac 300
gggaagtgcc aatgtgctct ttccaagagc aaacaccccg ttggcttcaa gatgaccgct 360
cgctcagcga tctgacaacg gatcgctgag cgagcggtct gagaagtggc acagagaccg 420
agagagggtc tcattttttt taagcttggc gtaatcatgg tcatagctgt ttcctg 476
<210> 25
<211> 484
<212> DNA
<213> Artificial Sequence
<220>
<223> pUC19-U6-AmsgRNA partial sequence (U6-AmsgRNA1_scaffold-BsaI-BsaI-terminator)
<400> 25
tgtaaaacga cggccagtga attcgagggc ctatttccca tgattccttc atatttgcat 60
atacgataca aggctgttag agagataatt ggaattaatt tgactgtaaa cacaaagata 120
ttagtacaaa atacgtgacg tagaaagtaa taatttcttg ggtagtttgc agttttaaaa 180
ttatgtttta aaatggacta tcatatgctt accgtaactt gaaagtattt cgatttcttg 240
gctttatata tcttgtggaa aggacgaaac accggggaat tgccgatcta taggacggca 300
gattcaacgg gatgtgccaa tgcactcttt ccaggagtga acaccccgtt ggcttcaaca 360
tgatcgcccg ctcaacggtc cgatgtcgga tcgttgagcg ggcgatctga gaagtggcac 420
agagaccgag agagggtctc atttttttta agcttggcgt aatcatggtc atagctgttt 480
cctg 484
<210> 26
<211> 439
<212> DNA
<213> Artificial Sequence
<220>
<223> pUC19-U6-BhsgRNA partial sequence (U6-BhsgRNA1_scaffold-BsaI-BsaI-terminator)
<400> 26
tgtaaaacga cggccagtga attcgagggc ctatttccca tgattccttc atatttgcat 60
atacgataca aggctgttag agagataatt ggaattaatt tgactgtaaa cacaaagata 120
ttagtacaaa atacgtgacg tagaaagtaa taatttcttg ggtagtttgc agttttaaaa 180
ttatgtttta aaatggacta tcatatgctt accgtaactt gaaagtattt cgatttcttg 240
gctttatata tcttgtggaa aggacgaaac accgggaggt tctgtctttt ggtcaggaca 300
accgtctagc tataagtgct gcagggtgtg agaaactcct attgctggac gatgtctctt 360
acgaggcatt agcacagaga ccgagagagg gtctcatttt ttttaagctt ggcgtaatca 420
tggtcatagc tgtttcctg 439
<210> 27
<211> 471
<212> DNA
<213> Artificial Sequence
<220>
<223> pUC19-U6-BssgRNA partial sequence (U6-BssgRNA1_scaffold-BsaI-BsaI-terminator)
<400> 27
tgtaaaacga cggccagtga attcgagggc ctatttccca tgattccttc atatttgcat 60
atacgataca aggctgttag agagataatt ggaattaatt tgactgtaaa cacaaagata 120
ttagtacaaa atacgtgacg tagaaagtaa taatttcttg ggtagtttgc agttttaaaa 180
ttatgtttta aaatggacta tcatatgctt accgtaactt gaaagtattt cgatttcttg 240
gctttatata tcttgtggaa aggacgaaac accggccata agtcgactta catatccgtg 300
cgtgtgcatt atgggcccat ccacaggtct attcccacgg ataatcacga ctttccacta 360
agctttcgaa tgttcgaaag cttagtggaa agcttcgtgg ttagcacaga gaccgagaga 420
gggtctcatt ttttttaagc ttggcgtaat catggtcata gctgtttcct g 471
<210> 28
<211> 469
<212> DNA
<213> Artificial Sequence
<220>
<223> pUC19-U6-Bs3sgRNA partial sequence (U6-BssgRNA1_scaffold-BsaI-BsaI-terminator)
<400> 28
tgtaaaacga cggccagtga attcgagggc ctatttccca tgattccttc atatttgcat 60
atacgataca aggctgttag agagataatt ggaattaatt tgactgtaaa cacaaagata 120
ttagtacaaa atacgtgacg tagaaagtaa taatttcttg ggtagtttgc agttttaaaa 180
ttatgtttta aaatggacta tcatatgctt accgtaactt gaaagtattt cgatttcttg 240
gctttatata tcttgtggaa aggacgaaac accggggtga cctatagggt caatgaatct 300
gtgcgtgtgc cataagtaat taaaaattac ccaccacagg attatcttat ttctgctaag 360
tgtttagttg cctgaatact tagcagaaat aatgatgatt ggcacagaga ccgagagagg 420
gtctcatttt ttttaagctt ggcgtaatca tggtcatagc tgtttcctg 469
<210> 29
<211> 457
<212> DNA
<213> Artificial Sequence
<220>
<223> pUC19-U6-LssgRNA partial sequence (U6-BssgRNA1_scaffold-BsaI-BsaI-terminator)
<400> 29
tgtaaaacga cggccagtga attcgagggc ctatttccca tgattccttc atatttgcat 60
atacgataca aggctgttag agagataatt ggaattaatt tgactgtaaa cacaaagata 120
ttagtacaaa atacgtgacg tagaaagtaa taatttcttg ggtagtttgc agttttaaaa 180
ttatgtttta aaatggacta tcatatgctt accgtaactt gaaagtattt cgatttcttg 240
gctttatata tcttgtggaa aggacgaaac accggggcaa agaatactgt gcgtgtgcta 300
aggatggaaa aaatccattc aaccacagga ttacattatt tatctaatca cttaaatctt 360
taagtgatta gatgaattaa atgtgattag cacagagacc gagagagggt ctcatttttt 420
ttaagcttgg cgtaatcatg gtcatagctg tttcctg 457
<210> 30
<211> 465
<212> DNA
<213> Artificial Sequence
<220>
<223> pUC19-U6-SbsgRNA partial sequence (U6-BssgRNA1_scaffold-BsaI-BsaI-terminator)
<400> 30
tgtaaaacga cggccagtga attcgagggc ctatttccca tgattccttc atatttgcat 60
atacgataca aggctgttag agagataatt ggaattaatt tgactgtaaa cacaaagata 120
ttagtacaaa atacgtgacg tagaaagtaa taatttcttg ggtagtttgc agttttaaaa 180
ttatgtttta aaatggacta tcatatgctt accgtaactt gaaagtattt cgatttcttg 240
gctttatata tcttgtggaa aggacgaaac accgggtctt agggtatatc ccaaatttgt 300
cttagtatgt gcattgctta cagcgacaac taaggtttgt ttatcttttt tttacattgt 360
aagatgtttt acattataaa aagaagataa tcttattgca cagagaccga gagagggtct 420
catttttttt aagcttggcg taatcatggt catagctgtt tcctg 465
<210> 31
<211> 137
<212> DNA
<213> Artificial Sequence
<220>
<223> AasgRNA_scaffold
<400> 31
gtctaaagga cagaattttt caacgggtgt gccaatggcc actttccagg tggcaaagcc 60
cgttgaactt ctcaaaaaga acgctcgctc agtgttctga cgtcggatca ctgagcgagc 120
gatctgagaa gtggcac 137
<210> 32
<211> 137
<212> DNA
<213> Artificial Sequence
<220>
<223> AksgRNA1_scaffold
<400> 32
tcgtctatag gacggcgagg acaacgggaa gtgccaatgt gctctttcca agagcaaaca 60
ccccgttggc ttcaagatga ccgctcgctc agcgatctga caacggatcg ctgagcgagc 120
ggtctgagaa gtggcac 137
<210> 33
<211> 145
<212> DNA
<213> Artificial Sequence
<220>
<223> AmsgRNA1_scaffold
<400> 33
ggaattgccg atctatagga cggcagattc aacgggatgt gccaatgcac tctttccagg 60
agtgaacacc ccgttggctt caacatgatc gcccgctcaa cggtccgatg tcggatcgtt 120
gagcgggcga tctgagaagt ggcac 145
<210> 34
<211> 100
<212> DNA
<213> Artificial Sequence
<220>
<223> BhsgRNA_scaffold
<400> 34
gaggttctgt cttttggtca ggacaaccgt ctagctataa gtgctgcagg gtgtgagaaa 60
ctcctattgc tggacgatgt ctcttttatt tcttttttct tggatgtcca agaaaaaaga 120
aatgatacga ggcattagca c 141
<210> 35
<211> 132
<212> DNA
<213> Artificial Sequence
<220>
<223> BssgRNA_scaffold
<400> 35
ccataagtcg acttacatat ccgtgcgtgt gcattatggg cccatccaca ggtctattcc 60
cacggataat cacgactttc cactaagctt tcgaatgttc gaaagcttag tggaaagctt 120
cgtggttagc ac 132
<210> 36
<211> 130
<212> DNA
<213> Artificial Sequence
<220>
<223> Bs3sgRNA_scaffold
<400> 36
ggtgacctat agggtcaatg aatctgtgcg tgtgccataa gtaattaaaa attacccacc 60
acaggattat cttatttctg ctaagtgttt agttgcctga atacttagca gaaataatga 120
tgattggcac 130
<210> 37
<211> 118
<212> DNA
<213> Artificial Sequence
<220>
<223> LssgRNA_scaffold
<400> 37
ggcaaagaat actgtgcgtg tgctaaggat ggaaaaaatc cattcaacca caggattaca 60
ttatttatct aatcacttaa atctttaagt gattagatga attaaatgtg attagcac 118
<210> 38
<211> 126
<212> DNA
<213> Artificial Sequence
<220>
<223> SbsgRNA_scaffold
<400> 38
gtcttagggt atatcccaaa tttgtcttag tatgtgcatt gcttacagcg acaactaagg 60
tttgtttatc ttttttttac attgtaagat gttttacatt ataaaaagaa gataatctta 120
ttgcac 126
<210> 39
<211> 86
<212> DNA
<213> Artificial Sequence
<220>
<223> artsgRA1 scaffold
<400> 39
ggtctaaagg acagaatttt tcaacgggtg tgccaatggc cactttccag gtggcaaagc 60
ccgttgaact tcaagcgaag tggcac 86
<210> 40
<211> 84
<212> DNA
<213> Artificial Sequence
<220>
<223> artsgRA2 scaffold
<400> 40
ggtctaaagg acagaagaca acgggaagtg ccaatgtgct ctttccaaga gcaaacaccc 60
cgttgacttc aagcgaagtg gcac 84
<210> 41
<211> 79
<212> DNA
<213> Artificial Sequence
<220>
<223> artsgRA3 scaffold
<400> 41
ggtctaaagg acagaaaatc tgtgcgtgtg ccataagtaa ttaaaaatta cccaccacag 60
acttcaagcg aagtggcac 79
<210> 42
<211> 91
<212> DNA
<213> Artificial Sequence
<220>
<223> artsgRA4 scaffold
<400> 42
ggtcgtctat aggacggcga gtttttcaac gggtgtgcca atggccactt tccaggtggc 60
aaagcccgtt gaacttcaag cgaagtggca c 91
<210> 43
<211> 89
<212> DNA
<213> Artificial Sequence
<220>
<223> artsgRA5 scaffold
<400> 43
ggtcgtctat aggacggcga ggacaacggg aagtgccaat gtgctctttc caagagcaaa 60
caccccgttg acttcaagcg aagtggcac 89
<210> 44
<211> 84
<212> DNA
<213> Artificial Sequence
<220>
<223> artsgRA6 scaffold
<400> 44
ggtcgtctat aggacggcga gaatctgtgc gtgtgccata agtaattaaa aattacccac 60
cacagacttc aagcgaagtg gcac 84
<210> 45
<211> 90
<212> DNA
<213> Artificial Sequence
<220>
<223> artsgRA7 scaffold
<400> 45
ggtgacctat agggtcaatg tttttcaacg ggtgtgccaa tggccacttt ccaggtggca 60
aagcccgttg aacttcaagc gaagtggcac 90
<210> 46
<211> 88
<212> DNA
<213> Artificial Sequence
<220>
<223> artsgRA8 scaffold
<400> 46
ggtgacctat agggtcaatg gacaacggga agtgccaatg tgctctttcc aagagcaaac 60
accccgttga cttcaagcga agtggcac 88
<210> 47
<211> 83
<212> DNA
<213> Artificial Sequence
<220>
<223> artsgRA9 scaffold
<400> 47
ggtgacctat agggtcaatg aatctgtgcg tgtgccataa gtaattaaaa attacccacc 60
acagacttca agcgaagtgg cac 83
<210> 48
<211> 85
<212> DNA
<213> Artificial Sequence
<220>
<223> artsgRA10 scaffold
<400> 48
ggtctaaagg acagaatttt tcaacgggtg tgccaatggc cactttccag gtggcaaagc 60
ccgttgagct tcaaagaagt ggcac 85
<210> 49
<211> 83
<212> DNA
<213> Artificial Sequence
<220>
<223> artsgRA11 scaffold
<400> 49
ggtctaaagg acagaagaca acgggaagtg ccaatgtgct ctttccaaga gcaaacaccc 60
cgttggcttc aaagaagtgg cac 83
<210> 50
<211> 78
<212> DNA
<213> Artificial Sequence
<220>
<223> artsgRA12 scaffold
<400> 50
ggtctaaagg acagaaaatc tgtgcgtgtg ccataagtaa ttaaaaatta cccaccacag 60
gcttcaaaga agtggcac 78
<210> 51
<211> 90
<212> DNA
<213> Artificial Sequence
<220>
<223> artsgRA13 scaffold
<400> 51
ggtcgtctat aggacggcga gtttttcaac gggtgtgcca atggccactt tccaggtggc 60
aaagcccgtt gagcttcaaa gaagtggcac 90
<210> 52
<211> 88
<212> DNA
<213> Artificial Sequence
<220>
<223> artsgRA14 scaffold
<400> 52
ggtcgtctat aggacggcga ggacaacggg aagtgccaat gtgctctttc caagagcaaa 60
caccccgttg gcttcaaaga agtggcac 88
<210> 53
<211> 83
<212> DNA
<213> Artificial Sequence
<220>
<223> artsgRA15 scaffold
<400> 53
ggtcgtctat aggacggcga gaatctgtgc gtgtgccata agtaattaaa aattacccac 60
cacaggcttc aaagaagtgg cac 83
<210> 54
<211> 89
<212> DNA
<213> Artificial Sequence
<220>
<223> artsgRA16 scaffold
<400> 54
ggtgacctat agggtcaatg tttttcaacg ggtgtgccaa tggccacttt ccaggtggca 60
aagcccgttg agcttcaaag aagtggcac 89
<210> 55
<211> 87
<212> DNA
<213> Artificial Sequence
<220>
<223> artsgRA17 scaffold
<400> 55
ggtgacctat agggtcaatg gacaacggga agtgccaatg tgctctttcc aagagcaaac 60
accccgttgg cttcaaagaa gtggcac 87
<210> 56
<211> 82
<212> DNA
<213> Artificial Sequence
<220>
<223> artsgRA18 scaffold
<400> 56
ggtgacctat agggtcaatg aatctgtgcg tgtgccataa gtaattaaaa attacccacc 60
acaggcttca aagaagtggc ac 82
<210> 57
<211> 89
<212> DNA
<213> Artificial Sequence
<220>
<223> artsgRA19 scaffold
<400> 57
ggtctaaagg acagaatttt tcaacgggtg tgccaatggc cactttccag gtggcaaagc 60
ccgttgagat tatctatgat gattggcac 89
<210> 58
<211> 87
<212> DNA
<213> Artificial Sequence
<220>
<223> artsgRA20 scaffold
<400> 58
ggtctaaagg acagaagaca acgggaagtg ccaatgtgct ctttccaaga gcaaacaccc 60
cgttggatta tctatgatga ttggcac 87
<210> 59
<211> 82
<212> DNA
<213> Artificial Sequence
<220>
<223> artsgRA21 scaffold
<400> 59
ggtctaaagg acagaaaatc tgtgcgtgtg ccataagtaa ttaaaaatta cccaccacag 60
gattatctat gatgattggc ac 82
<210> 60
<211> 94
<212> DNA
<213> Artificial Sequence
<220>
<223> artsgRA22 scaffold
<400> 60
ggtcgtctat aggacggcga gtttttcaac gggtgtgcca atggccactt tccaggtggc 60
aaagcccgtt gagattatct atgatgattg gcac 94
<210> 61
<211> 92
<212> DNA
<213> Artificial Sequence
<220>
<223> artsgRA23 scaffold
<400> 61
ggtcgtctat aggacggcga ggacaacggg aagtgccaat gtgctctttc caagagcaaa 60
caccccgttg gattatctat gatgattggc ac 92
<210> 62
<211> 87
<212> DNA
<213> Artificial Sequence
<220>
<223> artsgRA24 scaffold
<400> 62
ggtcgtctat aggacggcga gaatctgtgc gtgtgccata agtaattaaa aattacccac 60
cacaggatta tctatgatga ttggcac 87
<210> 63
<211> 93
<212> DNA
<213> Artificial Sequence
<220>
<223> artsgRA25 scaffold
<400> 63
ggtgacctat agggtcaatg tttttcaacg ggtgtgccaa tggccacttt ccaggtggca 60
aagcccgttg agattatcta tgatgattgg cac 93
<210> 64
<211> 91
<212> DNA
<213> Artificial Sequence
<220>
<223> artsgRA26 scaffold
<400> 64
ggtgacctat agggtcaatg gacaacggga agtgccaatg tgctctttcc aagagcaaac 60
accccgttgg attatctatg atgattggca c 91
<210> 65
<211> 86
<212> DNA
<213> Artificial Sequence
<220>
<223> artsgRA27 scaffold
<400> 65
ggtgacctat agggtcaatg aatctgtgcg tgtgccataa gtaattaaaa attacccacc 60
acaggattat ctatgatgat tggcac 86
<210> 66
<211> 82
<212> DNA
<213> Artificial Sequence
<220>
<223> artsgRA28 scaffold
<400> 66
ggtctaaagg acagaacaac gggatgtgcc aatgcactct ttccaggagt gaacaccccg 60
ttgacttcaa gcgaagtggc ac 82
<210> 67
<211> 87
<212> DNA
<213> Artificial Sequence
<220>
<223> artsgRA29 scaffold
<400> 67
ggtcgtctat aggacggcga gcaacgggat gtgccaatgc actctttcca ggagtgaaca 60
ccccgttgac ttcaagcgaa gtggcac 87
<210> 68
<211> 99
<212> DNA
<213> Artificial Sequence
<220>
<223> artsgRA30 scaffold
<400> 68
ggaattgccg atctatagga cggcagattt ttttcaacgg gtgtgccaat ggccactttc 60
caggtggcaa agcccgttga acttcaagcg aagtggcac 99
<210> 69
<211> 97
<212> DNA
<213> Artificial Sequence
<220>
<223> artsgRA31 scaffold
<400> 69
ggaattgccg atctatagga cggcagattg acaacgggaa gtgccaatgt gctctttcca 60
agagcaaaca ccccgttgac ttcaagcgaa gtggcac 97
<210> 70
<211> 95
<212> DNA
<213> Artificial Sequence
<220>
<223> artsgRA32 scaffold
<400> 70
ggaattgccg atctatagga cggcagattc aacgggatgt gccaatgcac tctttccagg 60
agtgaacacc ccgttgactt caagcgaagt ggcac 95
<210> 71
<211> 81
<212> DNA
<213> Artificial Sequence
<220>
<223> artsgRA33 scaffold
<400> 71
ggtctaaagg acagaacaac gggatgtgcc aatgcactct ttccaggagt gaacaccccg 60
ttggcttcaa agaagtggca c 81
<210> 72
<211> 86
<212> DNA
<213> Artificial Sequence
<220>
<223> artsgRA34 scaffold
<400> 72
ggtcgtctat aggacggcga gcaacgggat gtgccaatgc actctttcca ggagtgaaca 60
ccccgttggc ttcaaagaag tggcac 86
<210> 73
<211> 98
<212> DNA
<213> Artificial Sequence
<220>
<223> artsgRA35 scaffold
<400> 73
ggaattgccg atctatagga cggcagattt ttttcaacgg gtgtgccaat ggccactttc 60
caggtggcaa agcccgttga gcttcaaaga agtggcac 98
<210> 74
<211> 96
<212> DNA
<213> Artificial Sequence
<220>
<223> artsgRA36 scaffold
<400> 74
ggaattgccg atctatagga cggcagattg acaacgggaa gtgccaatgt gctctttcca 60
agagcaaaca ccccgttggc ttcaaagaag tggcac 96
<210> 75
<211> 94
<212> DNA
<213> Artificial Sequence
<220>
<223> artsgRA37 scaffold
<400> 75
ggaattgccg atctatagga cggcagattc aacgggatgt gccaatgcac tctttccagg 60
agtgaacacc ccgttggctt caaagaagtg gcac 94
<210> 76
<211> 20
<212> DNA
<213> Artificial Sequence
<220>
<223> CCR5 site 10
<400> 76
tccttctcct gaacaccttc 20
<210> 77
<211> 20
<212> DNA
<213> Artificial Sequence
<220>
<223> CCR5 site 28
<400> 77
tttggcctga ataattgcag 20
<210> 78
<211> 20
<212> DNA
<213> Artificial Sequence
<220>
<223> DNMT1 site 16
<400> 78
cccttcagct aaaataaagg 20
<210> 79
<211> 20
<212> DNA
<213> Artificial Sequence
<220>
<223> RNF2 site 8
<400> 79
tagtcatggt gttcttcaac 20
<210> 80
<211> 20
<212> DNA
<213> Artificial Sequence
<220>
<223> Site 5
<400> 80
gctctcaaga cccacaatcc 20
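
The records above follow the WIPO ST.25-style layout: `<210>` gives the SEQ ID number, `<211>` the declared length, and `<400>` introduces the residues, with a running position count at the end of each sequence line. The following is a minimal Python sketch, assuming that layout and lowercase nucleotide records only, of how a declared length could be checked against the residues actually listed. The function names `parse_records` and `length_mismatches` are illustrative and are not part of the patent.

```python
import re

# Two records copied from the listing above (SEQ ID NO: 76 and 80).
LISTING = """\
<210> 76
<211> 20
<212> DNA
<213> Artificial Sequence
<220>
<223> CCR5 site 10
<400> 76
tccttctcct gaacaccttc 20
<210> 80
<211> 20
<212> DNA
<213> Artificial Sequence
<220>
<223> Site 5
<400> 80
gctctcaaga cccacaatcc 20
"""


def parse_records(text):
    """Return {seq_id: (declared_length, residues)} for an ST.25-style listing."""
    records = {}
    seq_id = declared = None
    in_sequence = False
    for line in text.splitlines():
        line = line.strip()
        tag = re.match(r"<(\d{3})>\s*(.*)", line)
        if tag:
            code, value = tag.groups()
            in_sequence = code == "400"
            if code == "210":
                seq_id, declared = int(value), None
            elif code == "211":
                declared = int(value)
            elif code == "400":
                records[seq_id] = (declared, "")
        elif in_sequence and line:
            # Keep only residue letters; this drops spaces and the position count.
            residues = "".join(re.findall(r"[a-z]+", line))
            length, seq = records[seq_id]
            records[seq_id] = (length, seq + residues)
    return records


def length_mismatches(records):
    """List (seq_id, declared, counted) wherever <211> disagrees with the residues."""
    return [(i, d, len(s)) for i, (d, s) in sorted(records.items()) if d != len(s)]


records = parse_records(LISTING)
print(length_mismatches(records))  # [] for the two records above
```

Running the same check over a full listing would flag any record whose `<211>` value differs from the number of residues printed under `<400>`.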

Claims (14)

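1. A genome editing system for site-directed modification of a target sequence in the genome of a cell, comprising at least one of the following i) to v):
i) a C2c1 protein, and a guide RNA;
ii) an expression construct comprising a nucleotide sequence encoding a C2c1 protein, and a guide RNA;
iii) a C2c1 protein, and an expression construct comprising a nucleotide sequence encoding a guide RNA;
iv) an expression construct comprising a nucleotide sequence encoding a C2c1 protein, and an expression construct comprising a nucleotide sequence encoding a guide RNA;
v) an expression construct comprising a nucleotide sequence encoding a C2c1 protein and a nucleotide sequence encoding a guide RNA;
wherein the guide RNA is capable of forming a complex with the C2c1 protein and targeting the C2c1 protein to the target sequence in the genome of the cell, wherein the C2c1 protein is a TcC2c1 protein from Tuberibacillus calidus,
wherein the guide RNA is an sgRNA and comprises an sgRNA scaffold encoded by a nucleotide sequence selected from SEQ ID NOs: 31-38 or 39-75.
2. The system of claim 1, wherein the C2c1 protein is a TcC2c1 protein from Tuberibacillus calidus DSM 17572.
3. The system of claim 1, wherein the C2c1 protein comprises the amino acid sequence set forth in SEQ ID NO: 10.
4. The system of claim 1, wherein the nucleotide sequence encoding the C2c1 protein is codon-optimized.
5. The system of claim 4, wherein the nucleotide sequence encoding the C2c1 protein is set forth in SEQ ID NO: 20.
6. A method, for non-therapeutic purposes, of site-directed modification of a target sequence in the genome of a cell, comprising introducing the system of any one of claims 1-5 into the cell.
7. The method of claim 6, wherein the cell is from a mammal, poultry, or a plant.
8. The method of claim 7, wherein the mammal is selected from human, mouse, rat, monkey, dog, pig, sheep, cattle, and cat; the poultry is selected from chicken, duck, and goose; and the plant is selected from rice, maize, wheat, sorghum, barley, soybean, peanut, and Arabidopsis thaliana.
9. The method of claim 6, wherein the system is introduced into the cell by a method selected from: calcium phosphate transfection, protoplast fusion, electroporation, lipofection, microinjection, viral infection, biolistic (gene gun) delivery, PEG-mediated protoplast transformation, and Agrobacterium-mediated transformation.
10. Use of the genome editing system of any one of claims 1-5 in the preparation of a pharmaceutical composition for treating a disease in a subject in need thereof, wherein the genome editing system is used to modify a gene associated with the disease in the subject.
11. A pharmaceutical composition for treating a disease in a subject in need thereof, comprising the genome editing system of any one of claims 1-5 and a pharmaceutically acceptable carrier, wherein the genome editing system is used to modify a gene associated with the disease in the subject.
12. The use of claim 10 or the pharmaceutical composition of claim 11, wherein the subject is a mammal.
13. The use of claim 10 or the pharmaceutical composition of claim 11, wherein the subject is a human.
14. The use of claim 10 or the pharmaceutical composition of claim 11, wherein the disease is selected from tumors, inflammation, Parkinson's disease, cardiovascular disease, Alzheimer's disease, autism, drug addiction, age-related macular degeneration, schizophrenia, and hereditary diseases.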
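
Claims 4 and 5 refer to a codon-optimized coding sequence, SEQ ID NO: 20. As a small illustration of how such a coding sequence is read in codons, the Python sketch below translates the first 30 nucleotides and the last 6 nucleotides of SEQ ID NO: 20, as printed in the sequence listing, using the standard genetic code. The helper name `translate` is illustrative. SEQ ID NO: 10 is not reproduced in this section, so the sketch makes no claim about agreement with it.

```python
from itertools import product

# Standard genetic code, codons ordered with bases t, c, a, g at each position.
BASES = "tcag"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna):
    """Translate a DNA string codon by codon; spaces and case are ignored."""
    dna = "".join(dna.lower().split())
    usable = len(dna) - len(dna) % 3  # ignore a trailing partial codon
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# First 30 nt and last 6 nt of SEQ ID NO: 20, copied from the listing.
head = "atgaacatcc acctgaagga gctgatccgc"
tail = "gtggag"

print(translate(head))  # MNIHLKELIR
print(translate(tail))  # VE
print(3426 // 3)        # 1142 codons for the declared length of 3426 nt
```

The declared length of 3426 nt is divisible by three, and the listed sequence ends on the codon `gag` rather than on a stop codon.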
CN201811300251.6A 2018-11-02 2018-11-02 Genome editing system and method based on C2c1 nuclease Active CN109337904B (zh)

Priority Applications (5)

Application Number Priority Date Filing Date Title
CN201811300251.6A CN109337904B (zh) 2018-11-02 2018-11-02 Genome editing system and method based on C2c1 nuclease
CN202011431478.1A CN112961853A (zh) 2018-11-02 2018-11-02 Genome editing system and method based on C2c1 nuclease
JP2021523583A JP7361109B2 (ja) 2018-11-02 2018-11-30 System and method for genome editing based on C2c1 nuclease
PCT/CN2018/118458 WO2020087631A1 (zh) 2018-11-02 2018-11-30 Genome editing system and method based on C2c1 nuclease
EP18938930.7A EP3929292A4 (en) 2018-11-02 2018-11-30 SYSTEM AND METHODS FOR GENOME EDITING BASED ON C2C1 NUCLEASES

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811300251.6A CN109337904B (zh) 2018-11-02 2018-11-02 Genome editing system and method based on C2c1 nuclease

Related Child Applications (1)

Application Number Title Priority Date Filing Date
CN202011431478.1A Division CN112961853A (zh) 2018-11-02 2018-11-02 Genome editing system and method based on C2c1 nuclease

Publications (2)

Publication Number Publication Date
CN109337904A CN109337904A (zh) 2019-02-15
CN109337904B true CN109337904B (zh) 2020-12-25

Family

ID=65313826

Family Applications (2)

Application Number Title Priority Date Filing Date
CN202011431478.1A Pending CN112961853A (zh) 2018-11-02 2018-11-02 Genome editing system and method based on C2c1 nuclease
CN201811300251.6A Active CN109337904B (zh) 2018-11-02 2018-11-02 Genome editing system and method based on C2c1 nuclease

Family Applications Before (1)

Application Number Title Priority Date Filing Date
CN202011431478.1A Pending CN112961853A (zh) 2018-11-02 2018-11-02 Genome editing system and method based on C2c1 nuclease

Country Status (4)

Country Link
EP (1) EP3929292A4 (zh)
JP (1) JP7361109B2 (zh)
CN (2) CN112961853A (zh)
WO (1) WO2020087631A1 (zh)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110862935B (zh) * 2019-08-05 2021-04-06 浙江佳河绿环境科技有限公司 A thermotolerant Bacillus strain (宫本寿芽孢杆菌) and a high-temperature compost fermentation method
US11851702B2 (en) 2020-03-23 2023-12-26 The Broad Institute, Inc. Rapid diagnostics
CN116334037A (zh) * 2020-11-11 2023-06-27 山东舜丰生物科技有限公司 Novel Cas enzymes and systems and uses thereof

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2016205749A9 (en) * 2015-06-18 2017-01-19 The Broad Institute Inc. Novel crispr enzymes and systems

Family Cites Families (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
AU2010328121A1 (en) * 2009-12-09 2012-06-07 Danisco Us Inc. Compositions and methods comprising protease variants
KR20180008572A (ko) * 2015-05-15 2018-01-24 파이어니어 하이 부렛드 인터내쇼날 인코포레이팃드 Rapid characterization of Cas endonuclease systems, PAM sequences and guide RNA elements
CN108513579B (zh) * 2015-10-09 2022-10-04 孟山都技术公司 Novel RNA-guided nucleases and uses thereof
GB201618507D0 (en) * 2016-11-02 2016-12-14 Stichting Voor De Technische Wetenschappen And Wageningen Univ Microbial genome editing
CN108277231B (zh) * 2017-01-06 2021-02-02 中国科学院分子植物科学卓越创新中心 A CRISPR system for Corynebacterium genome editing
JP2020508693A (ja) * 2017-03-06 2020-03-26 インスティテュート フォー ベーシック サイエンスInstitute For Basic Science Composition for genome editing comprising C2c1 endonuclease and genome editing method using same
WO2019127087A1 (zh) 2017-12-27 2019-07-04 中国科学院动物研究所 Genome editing systems and methods
WO2020033601A1 (en) 2018-08-07 2020-02-13 The Broad Institute, Inc. Novel cas12b enzymes and systems

Also Published As

Publication number Publication date
EP3929292A4 (en) 2023-03-01
JP2022512868A (ja) 2022-02-07
JP7361109B2 (ja) 2023-10-13
EP3929292A1 (en) 2021-12-29
CN112961853A (zh) 2021-06-15
WO2020087631A1 (zh) 2020-05-07
CN109337904A (zh) 2019-02-15

Similar Documents

Publication Publication Date Title
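KR102084186B1 (ko) Method for identifying off-target sites of base editing by DNA single-strand breaks
KR102606680B1 (ko) S. pyogenes Cas9 mutant genes and polypeptides encoded by same
CN107109422B (zh) Genome editing using split Cas9 expressed from two vectors
KR102598856B1 (ko) Engineered CRISPR-Cas9 nucleases with altered PAM specificity
KR102210322B1 (ko) Use of RNA-guided FokI nucleases (RFNs) to increase specificity of RNA-guided genome editing
AU2022200130B2 (en) Engineered Cas9 systems for eukaryotic genome modification
CN113015797A (zh) RNA-guided nucleases, active fragments and variants thereof, and methods of use
CN110799525A (zh) Variants of CPF1 (CAS12a) with altered PAM specificity
CN112105728B (zh) CRISPR/Cas effector proteins and systems
CN113881652B (zh) Novel Cas enzymes and systems and uses thereof
CN109337904B (zh) Genome editing system and method based on C2c1 nuclease
CN113015798B (zh) CRISPR-Cas12a enzymes and systems
CN113652445A (zh) Genome editing systems and methods
CN116096892A (zh) Enzymes with RuvC domains
KR20220062289A (ko) RNA-guided nucleases, active fragments and variants thereof, and methods of use
KR20230014700A (ko) RNA-guided nucleases, active fragments and variants thereof, and methods of use
CA3228222A1 (en) Class ii, type v crispr systems
CN115667283A (zh) RNA-guided kilobase-scale genome recombineering
CN115380111A (zh) Compositions, systems, and methods for base diversification
CN112266418A (zh) Improved genome editing system and use thereof
AU2021355838A1 (en) Technique for modifying target nucleotide sequence using crispr-type i-d system
EP4209589A1 (en) Miniaturized cytidine deaminase-containing complex for modifying double-stranded dna
WO2023039434A1 (en) Systems and methods for transposing cargo nucleotide sequences
KR20230058482A (ko) Method for editing target DNA, method for producing cells with edited target DNA, and DNA editing system used therefor
CN115678913A (zh) Use of epigenetic factors to optimize gene editing tools in eukaryotic cells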

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant