CN109837328B - Nucleic acid detection method - Google Patents

Publication number
CN109837328B
Authority
CN
China
Prior art keywords
leu
glu
lys
arg
ala
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201811453278.9A
Other languages
English (en)
Other versions
CN109837328A (zh
Inventor
李伟
周琪
滕飞
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Institute of Zoology of CAS
Original Assignee
Institute of Zoology of CAS
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Institute of Zoology of CAS filed Critical Institute of Zoology of CAS
Publication of CN109837328A publication Critical patent/CN109837328A/zh
Application granted granted Critical
Publication of CN109837328B publication Critical patent/CN109837328B/zh
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • CCHEMISTRY; METALLURGY
    • C12BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12QMEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6844Nucleic acid amplification reactions
    • CCHEMISTRY; METALLURGY
    • C12BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12NMICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N9/00Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
    • C12N9/14Hydrolases (3)
    • C12N9/16Hydrolases (3) acting on ester bonds (3.1)
    • C12N9/22Ribonucleases RNAses, DNAses
    • CCHEMISTRY; METALLURGY
    • C12BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12QMEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6806Preparing nucleic acids for analysis, e.g. for polymerase chain reaction [PCR] assay
    • CCHEMISTRY; METALLURGY
    • C12BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12QMEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6813Hybridisation assays
    • C12Q1/6816Hybridisation assays characterised by the detection means
    • C12Q1/6818Hybridisation assays characterised by the detection means involving interaction of two or more labels, e.g. resonant energy transfer
    • CCHEMISTRY; METALLURGY
    • C12BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12QMEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/70Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving virus or bacteriophage
    • C12Q1/701Specific hybridization probes
    • C12Q1/708Specific hybridization probes for papilloma

Landscapes

  • Chemical & Material Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Organic Chemistry (AREA)
  • Health & Medical Sciences (AREA)
  • Wood Science & Technology (AREA)
  • Engineering & Computer Science (AREA)
  • Zoology (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Molecular Biology (AREA)
  • Genetics & Genomics (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • General Health & Medical Sciences (AREA)
  • Analytical Chemistry (AREA)
  • Biotechnology (AREA)
  • Microbiology (AREA)
  • Biochemistry (AREA)
  • General Engineering & Computer Science (AREA)
  • Immunology (AREA)
  • Biophysics (AREA)
  • Physics & Mathematics (AREA)
  • Chemical Kinetics & Catalysis (AREA)
  • Medicinal Chemistry (AREA)
  • Biomedical Technology (AREA)
  • Virology (AREA)
  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)
  • Enzymes And Modification Thereof (AREA)

Abstract

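The present invention relates to the field of genetic engineering. Specifically, it relates to a novel CRISPR-based nucleic acid detection method. More specifically, it relates to a Cas12b-mediated DNA detection method and related kits.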

Description

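Nucleic acid detection method
Technical Field
The present invention relates to the field of genetic engineering. Specifically, it relates to a novel CRISPR-based nucleic acid detection method. More specifically, it relates to a Cas12b-mediated DNA detection method and related kits.
Background Art
Rapid, portable nucleic acid detection is expected to play an important role in clinical diagnostics and quarantine inspection. The CRISPR nucleases Cas12a and Cas13, owing to their trans-cleavage activity on ssDNA or ssRNA, have been developed for rapid detection of DNA and RNA with high sensitivity and specificity. The CRISPR-Cas13-based RNA detection platform is called SHERLOCK, and the Cas12a-based DNA detection platform is called DETECTR.
Developing further nucleic acid detection platforms with higher sensitivity and higher specificity is of great importance in the art.
Summary of the Invention
The present invention provides a Cas12b-based nucleic acid detection method, termed CDetection, which detects DNA with higher specificity than Cas12a and with sensitivity down to the sub-attomolar (attomolar-scale) level. The invention also provides enhanced CDetection (eCDetection), which can distinguish two target DNAs that differ by only a single-nucleotide polymorphism. The CRISPR-Cas12b-based CDetection technology provides a rapid and simple DNA detection method suitable for a variety of applications in health care and biotechnology.
Brief Description of the Drawings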
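Figure 1. Conserved target recognition in Cas12 proteins activates non-specific single-stranded DNA (ssDNA) cleavage. (a) Schematic showing that Cas12b has canonical dsDNA target recognition and cleavage as well as non-canonical collateral ssDNA cleavage. (b) Time course of M13mp18 ssDNA substrate cleavage by purified AaCas12b, ArCas12a, HkCas12a, PrCas12a and SpCas9 bound to a guide RNA complementary to M13 phage (on-target gRNA, OT-gRNA). (c) Time course of M13mp18 ssDNA substrate cleavage by purified AaCas12b, ArCas12a, HkCas12a, PrCas12a and SpCas9 bound to a gRNA with no sequence homology to M13 phage (non-target gRNA, NT-gRNA) together with its complementary ssDNA activator.
Figure 2. The RuvC domain of Cas12b is responsible for ssDNA trans-cleavage. (a-b) Time courses of M13mp18 ssDNA substrate and pUC19 dsDNA cleavage by purified WT AaCas12b and RuvC catalytic mutants (R785A or D977A) bound to on-target gRNA (OT-gRNA) or non-target gRNA (NT-gRNA) (a), or bound to NT-gRNA together with its complementary ssDNA activator (b). The non-target gRNA has no sequence homology to M13 phage or pUC19.
Figure 3. Preference of Cas12b-mediated trans-activated cleavage for non-specific ssDNA. (a) Nucleotide preference of AaCas12b-mediated trans-cleavage of ssDNA reporters containing A, T, G or C homopolymers. AaCas12b was incubated with sgRNA targeting synthetic ssDNA target 1 or target 2. Error bars indicate the standard error of the mean (s.e.m.), n=3. RFU, relative fluorescence units. (b) Effect of various buffers on Cas12b-mediated trans-activated cleavage. AaCas12b was incubated with sgRNA targeting synthetic ssDNA target 1. Error bars indicate s.e.m., n=3. (c) Trans-cleavage activity of AaCas12b with on-target ssDNA (OT-ssDNA), non-target ssDNA (NT-ssDNA), on-target dsDNA (OT-dsDNA) and non-target dsDNA (NT-dsDNA) as activators. Error bars indicate s.e.m., n=3.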
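Figure 4. Specificity and sensitivity of Cas12b-mediated DNA detection. (a) Trans-cleavage activity of AaCas12b with ssDNA or dsDNA activators carrying the indicated single mismatches. Error bars indicate s.e.m., n=3. RFU, relative fluorescence units. PT, perfectly matched target. mPAM, mutated PAM. (b) Trans-cleavage activity of AaCas12b with ssDNA or dsDNA activators carrying the indicated consecutive mismatches. Error bars indicate s.e.m., n=3. (c) Comparison of the specificity of AaCas12b, PrCas12a and LbCas12a in discriminating dsDNA, using synthetic HPV16 activators. Error bars indicate s.e.m., n=3. (d) Comparison of the trans-cleavage activity of AaC2c1 (that is, AaCas12b) with dsDNA activators and the preamplification-enhanced trans-cleavage activity (CDetection). AaCas12b was incubated with sgRNA targeting synthetic target 1 dsDNA. Error bars indicate s.e.m., n=3. RPA, recombinase polymerase amplification. (e) Maximum fluorescence signal obtained from AaCas12b-, PrCas12a- and LbCas12a-based DNA detection with RPA preamplification. Cas12 proteins were incubated with gRNA targeting synthetic HPV16 dsDNA mixed with background genomic DNA. Error bars indicate s.e.m., n=3. (f) Fluorescence time courses obtained from AaCas12b- and LbCas12a-based DNA detection with RPA preamplification. Cas12 proteins were incubated with gRNA targeting synthetic HPV16 dsDNA diluted in human plasma to a final concentration of 10^-18 M. Error bars indicate s.e.m., n=3.
Figure 5. Specificity and sensitivity of AaCas12b-mediated DNA detection. (a) Comparison of AaCas12b trans-cleavage activity with ssDNA or dsDNA activators without preamplification. AaCas12b was incubated with sgRNA targeting synthetic ssDNA or dsDNA. Error bars indicate s.e.m., n=3. (b) AaCas12b detects the presence of CaMV DNA. (c) AaCas12b discriminates two closely related synthetic HPV sequences that differ by six nucleotide polymorphisms. Error bars indicate s.e.m., n=3. (d) Comparison of the specificity of AaCas12b, PrCas12a and LbCas12a in discriminating dsDNA, using synthetic HPV18 activators. Error bars indicate s.e.m., n=3. (e) Comparison of the trans-cleavage activity of AaCas12b with dsDNA activators and the preamplification-enhanced trans-cleavage activity (CDetection). AaCas12b was incubated with sgRNA targeting synthetic ssDNA or dsDNA of target 2. Error bars indicate s.e.m., n=3.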
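Figure 6. CDetection achieves sub-attomolar sensitivity in DNA detection. (a) CDetection achieves sub-attomolar sensitivity (the value is given only as an inline image, GDA0003076542300000021) when detecting HPV16 or HPV18 dsDNA mixed with background genomic DNA. Error bars indicate the standard error of the mean (s.e.m.), n=3. RFU, relative fluorescence units. (b) Fluorescence time courses obtained from AaCas12b- and LbCas12a-based DNA detection with RPA preamplification. Cas12 proteins were mixed with gRNA targeting synthetic HPV18 dsDNA diluted in human plasma to a final concentration of 10^-18 M. Error bars indicate s.e.m., n=3.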
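Figure 7. Applications of CDetection. (a) Schematic of ABO blood genotyping by CDetection. Six common ABO alleles and three targeting sgRNAs are shown. Each sgRNA identifies its corresponding allele by a detectable signal. If none of the sgRNAs produces a signal, the allele is A101 or A201. (b) Fluorescence signals obtained from ABO blood genotyping with CDetection. CDetection failed to distinguish two dsDNA activators differing by only a single-base polymorphism (on-B101 vs. off-B101, on-O01 vs. off-O01, on-O02/03 vs. off-O02/03). Error bars indicate the standard error of the mean (s.e.m.), n=3. RFU, relative fluorescence units. (c) Schematic of the development of eCDetection by introducing a tuned gRNA (tgRNA). CDetection with sgRNA cannot distinguish two dsDNA activators that differ by a single mismatch, whereas eCDetection with tgRNA (which carries a mismatch in the spacer) achieves DNA detection at single-nucleotide resolution.
Figure 8. Broad applications of enhanced CDetection (eCDetection). (a) eCDetection with tgRNA specifically achieves ABO blood genotyping at single-nucleotide resolution. Error bars indicate s.e.m., n=3. RFU, relative fluorescence units. tgRNA, tuned gRNA. (b) and (c) Upper panels show the sequence differences between the BRCA1 gene and the targeting sgRNA and tgRNA. Lower panels show, as maximum fluorescence signal, the specificity of CDetection without RPA in detecting the human BRCA1 mutations 3232A>G (b) and 3537A>G (c) with sgRNA and tgRNA. (d) Fluorescence time courses showing the sensitivity and specificity of CDetection with RPA in detecting the human BRCA1 mutation 3232A>G with tgRNA (3232-1). Wild-type BRCA1 or BRCA1 3232A>G dsDNA was diluted in human plasma to a final concentration of 10^-18 M. Error bars indicate s.e.m., n=3.
Figure 9. Accurate DNA detection with the CDetection platform. (a) Upper panel shows the sequence differences between the TP53 gene and the targeting sgRNA and tgRNA. Lower panel shows, as maximum fluorescence signal, the specificity of detecting the human TP53 mutation 856G>A with sgRNA and tgRNA. Error bars indicate s.e.m., n=3. RFU, relative fluorescence units. tgRNA, tuned gRNA. (b) and (c) Upper panels show the sequence differences between the BRCA1 gene and the targeting sgRNA and tgRNA. Lower panels show, as maximum fluorescence signal, the specificity of detecting the human BRCA1 mutations 3232A>G (b) and 3537A>G (c) with sgRNA and tgRNA. (d) Fluorescence time courses showing the sensitivity and specificity of CDetection with RPA in detecting the human BRCA1 mutation 3232A>G with tgRNA (3232-1). Wild-type BRCA1 or BRCA1 3232A>G dsDNA was diluted in human plasma to a final concentration of 10^-16 M. Error bars indicate s.e.m., n=3.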
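Figure 10. Rapid and accurate broad diagnostic applications of the CDetection platform. Genomic DNA is first obtained from different samples by direct lysis for clinical diagnosis and quantification. Depending on the purpose, DNA with or without RPA preamplification is detected by CDetection using sgRNA or tgRNA.
Figure 11. Phylogenetic tree and loci of the non-redundant C2c1 orthologs selected for genome-editing tests. (a) Neighbor-joining phylogenetic tree showing the evolutionary relationships of the tested C2c1 orthologs. (b) Maps of the bacterial loci corresponding to the 8 C2c1 proteins highlighted in (a). Simulated co-folding of the crRNA DR and the putative tracrRNA shows a stable secondary structure. DR, direct repeat. The number of spacers in each bacterial genome is indicated above or below its CRISPR array.
Figure 12. Protein alignment of C2c1 orthologs: multiple sequence alignment of the amino acid sequences of the 10 tested C2c1 orthologs. Conserved residues are highlighted with a red background; conservative mutations are highlighted with an outline and red font.
Figure 13. C2c1 ortholog-mediated genome targeting in human 293T cells. (a) T7EI assay results showing the genome-targeting activity in the human genome of eight C2c1 proteins bound to their cognate sgRNAs. Triangles indicate cleaved bands. (b) T7EI assay results showing simultaneous multiplex genome targeting in human 293T cells mediated by Bs3C2c1 bound to its cognate sgRNA (Bs3sgRNA). (c) Sanger sequencing showing representative indels induced by Bs3C2c1 bound to Bs3sgRNA. PAM and protospacer sequences are colored red and blue, respectively. Deletions and insertions are indicated by purple dashes and green lowercase letters, respectively.
Figure 14. C2c1 proteins for RNA-guided genome editing. (a) Graphical overview of the 10 C2c1 orthologs tested in the present invention, with their sizes (number of amino acids) shown. (b) T7EI assay results showing the genome-targeting activity in human 293T cells of eight C2c1 orthologs guided by their cognate sgRNAs. Triangles indicate cleaved bands. (c-d) T7EI assay results showing the genome-targeting activity in human 293T cells of eight C2c1 orthologs guided by AasgRNA (c) and AksgRNA (d). Triangles indicate cleaved bands.
Figure 15. DNA alignment of C2c1 sgRNAs: multiple sequence alignment of the DNA sequences of the 8 tested sgRNAs derived from 10 C2c1 loci.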
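Figure 16. Interchangeability between different C2c1 orthologs and sgRNAs. T7EI assay results showing the genome-targeting activity in human 293T cells of eight C2c1 orthologs guided by AasgRNA (a), AksgRNA (b), AmsgRNA (c), Bs3sgRNA (d) and LssgRNA (e). Red triangles indicate cleaved bands.
Figure 17. Artificial sgRNA-mediated multiplex genome targeting. (a) Maps of the bacterial loci corresponding to DiC2c1 and TcC2c1. Neither C2c1 locus has a CRISPR array. (b-c) T7EI assay results showing the genome-targeting activity in human 293T cells of AaC2c1, DiC2c1 and TcC2c1 guided by AasgRNA (b) and AksgRNA (c). Triangles indicate cleaved bands. (d) T7EI assay results showing simultaneous multiplex genome targeting in human 293T cells mediated by TcC2c1 bound to AksgRNA. (e) Schematic of the secondary structure of artificial sgRNA scaffold 13 (artgRNA13). (f) T7EI assay results showing simultaneous multiplex genome targeting in human 293T cells mediated by TcC2c1 bound to artgRNA13.
Figure 18. Different sgRNAs guide C2c1 for genome editing. T7EI assay results showing the genome-targeting activity in human 293T cells of AaC2c1, DiC2c1 and TcC2c1 guided by AasgRNA (a), AksgRNA (b), AmsgRNA (c), Bs3sgRNA (d) and LssgRNA (e). Triangles indicate cleaved bands.
Figure 19. TcC2c1-mediated multiplex genome editing. (a) T7EI assay results showing simultaneous multiplex genome targeting in human 293T cells mediated by TcC2c1 bound to AmsgRNA. (b-c) Sanger sequencing showing representative indels induced by TcC2c1 bound to AksgRNA (b) and AmsgRNA (c). PAM and protospacer sequences are colored red and blue, respectively. Deletions and insertions are indicated by purple dashes and green lowercase letters, respectively.
Figure 20. Artificial sgRNAs guide TcC2c1 for genome editing. (a) Schematic of the secondary structures of 36 artificial sgRNA (artgRNA) scaffolds (scaffolds 1-12 and 14-37). (b) T7EI assay results showing the genome-targeting activity in human 293T cells of TcC2c1 guided by artsgRNAs. Triangles indicate cleaved bands. (c) T7EI assay results showing simultaneous multiplex genome targeting in human 293T cells mediated by AaC2c1 bound to artgRNA13.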
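Detailed Description of the Invention
In the present invention, unless otherwise stated, the scientific and technical terms used herein have the meanings commonly understood by those skilled in the art. The terms and laboratory procedures relating to protein and nucleic acid chemistry, molecular biology, cell and tissue culture, microbiology and immunology used herein are those widely used in the respective fields. For example, the standard recombinant DNA and molecular cloning techniques used in the present invention are well known to those skilled in the art and are described more fully in: Sambrook, J., Fritsch, E.F. and Maniatis, T., Molecular Cloning: A Laboratory Manual; Cold Spring Harbor Laboratory Press: Cold Spring Harbor, 1989 (hereinafter “Sambrook”). For a better understanding of the invention, definitions and explanations of relevant terms are provided below.
In a first aspect, the present invention provides a method for detecting the presence and/or amount of a target nucleic acid molecule in a biological sample, the method comprising the following steps:
(a) contacting the biological sample with: i) a Cas12b protein, ii) a gRNA directed against a target sequence in the target nucleic acid molecule, and iii) a single-stranded DNA reporter that produces a detectable signal upon cleavage, thereby forming a reaction mixture;
(b) detecting the presence and/or level of the detectable signal produced in the reaction mixture,
wherein the presence and/or level of the detectable signal represents the presence and/or amount of the target nucleic acid molecule.
In some embodiments, the target nucleic acid molecule is a double-stranded DNA molecule. In some embodiments, the target nucleic acid molecule is a single-stranded DNA molecule. The target nucleic acid molecule may be genomic DNA, cDNA, viral DNA and the like, or a fragment thereof.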
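“Cas12b”, “Cas12b nuclease”, “Cas12b protein”, “C2c1”, “C2c1 nuclease” and “C2c1 protein” are used interchangeably herein and refer to an RNA-guided, sequence-specific nuclease from a microbial CRISPR system. Guided by a guide RNA, Cas12b targets and cleaves a DNA target sequence to form a DNA double-strand break (DSB); this is also called its canonical dsDNA cleavage activity. More importantly, after the Cas12b-gRNA complex recognizes and binds the corresponding target DNA sequence, its non-specific single-stranded DNA cleavage activity is activated; this is also called its non-canonical collateral ssDNA cleavage activity. Using this collateral activity to cleave a single-stranded DNA reporter that produces a detectable signal upon cleavage reports the presence and/or amount of the target DNA. Herein, a DNA molecule that is recognized and bound by the Cas12b-gRNA complex and thereby activates the non-specific ssDNA cleavage activity of Cas12b is also called an “activator”.
In some embodiments, the Cas12b protein is the AaCas12b protein from Alicyclobacillus acidiphilus, the AkCas12b protein from Alicyclobacillus kakegawensis, the AmCas12b protein from Alicyclobacillus macrosporangiidus, the BhCas12b protein from Bacillus hisashii, the BsCas12b protein from the genus Bacillus, the Bs3Cas12b protein from the genus Bacillus, the DiCas12b protein from Desulfovibrio inopinatus, the LsCas12b protein from Laceyella sediminis, the SbCas12b protein from Spirochaetes bacterium, or the TcCas12b protein from Tuberibacillus calidus. In some preferred embodiments, the Cas12b protein is the Cas12b protein from Alicyclobacillus acidiphilus (AaCas12b). The applicant has established that these Cas12b proteins can be used for genome editing in mammals and also in the nucleic acid detection method of the invention.
For example, the Cas12b protein is the AaCas12b protein from Alicyclobacillus acidiphilus NBRC 100859, the AkCas12b protein from Alicyclobacillus kakegawensis NBRC 103104, the AmCas12b protein from Alicyclobacillus macrosporangiidus strain DSM 17980, the BhCas12b protein from Bacillus hisashii strain C4, the BsCas12b protein from Bacillus sp. NSP2.1, the Bs3Cas12b protein from Bacillus sp. V3-13 contig_40, the DiCas12b protein from Desulfovibrio inopinatus DSM 10711, the LsCas12b protein from Laceyella sediminis strain RHA1, the SbCas12b protein from Spirochaetes bacterium GWB1_27_13, or the TcCas12b protein from Tuberibacillus calidus DSM 17572. In some preferred embodiments, the Cas12b protein is the Cas12b protein from Alicyclobacillus acidiphilus NBRC 100859.
The Cas12b locus of Alicyclobacillus acidiphilus lacks a sequenced direct repeat (DR) array, so a person skilled in the art would assume that it cannot be used for gene editing and would skip it when screening CRISPR nucleases. However, the inventors surprisingly found that the Cas12b protein from Alicyclobacillus acidiphilus likewise has canonical targeted dsDNA cleavage activity and non-canonical collateral ssDNA cleavage activity, and can therefore be used for gene editing and nucleic acid detection. Similarly, some of the other identified Cas12b proteins, such as DiCas12b or TcCas12b, can unexpectedly also be used in the invention even though their native loci have no CRISPR array.
In some embodiments of the invention, the Cas12b protein is one whose native locus has no CRISPR array. In some embodiments, the Cas12b protein whose native locus has no CRISPR array is AaCas12b, DiCas12b or TcCas12b.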
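In some embodiments, the Cas12b protein comprises an amino acid sequence having at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98% or at least 99% sequence identity to any one of SEQ ID NO: 1-10. In some embodiments, the amino acid sequence of the Cas12b protein has one or more amino acid substitutions, deletions or additions relative to any one of SEQ ID NO: 1-10. For example, the Cas12b protein has an amino acid sequence with 1, 2, 3, 4, 5, 6, 7, 8, 9 or 10 amino acid substitutions, deletions or additions relative to any one of SEQ ID NO: 1-10. In some embodiments, the amino acid substitutions are conservative substitutions. In some embodiments, the Cas12b protein comprises the amino acid sequence shown in any one of SEQ ID NO: 1-10. For example, the AaCas12b, AkCas12b, AmCas12b, BhCas12b, BsCas12b, Bs3Cas12b, DiCas12b, LsCas12b, SbCas12b and TcCas12b proteins comprise the amino acid sequences shown in SEQ ID NO: 1-10, respectively. In some preferred embodiments, the Cas12b protein comprises the amino acid sequence shown in SEQ ID NO: 1.
The inventors have demonstrated that the RuvC domain of the Cas12b protein is critical for its non-canonical collateral ssDNA cleavage activity. In some embodiments, the Cas12b protein comprises the RuvC domain of a wild-type Cas12b protein, for example one comprising the amino acid sequence shown in any one of SEQ ID NO: 1-10. A person skilled in the art can readily identify the RuvC domain of a Cas12b protein, for example with tools provided by NCBI.
Sequence “identity” has its art-recognized meaning, and the percentage of sequence identity between two nucleic acid or polypeptide molecules or regions can be calculated using published techniques. Sequence identity can be measured along the full length of a polynucleotide or polypeptide or along a region of the molecule. (See, e.g.: Computational Molecular Biology, Lesk, A.M., ed., Oxford University Press, New York, 1988; Biocomputing: Informatics and Genome Projects, Smith, D.W., ed., Academic Press, New York, 1993; Computer Analysis of Sequence Data, Part I, Griffin, A.M., and Griffin, H.G., eds., Humana Press, New Jersey, 1994; Sequence Analysis in Molecular Biology, von Heijne, G., Academic Press, 1987; and Sequence Analysis Primer, Gribskov, M. and Devereux, J., eds., M Stockton Press, New York, 1991). While there are many methods for measuring identity between two polynucleotides or polypeptides, the term “identity” is well known to skilled persons (Carrillo, H. & Lipman, D., SIAM J Applied Math 48:1073 (1988)).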
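As a rough illustration of how a percent-identity figure such as “at least 80%” can be computed, the following minimal Python sketch scores two sequences that have already been aligned (gaps written as "-"). The function name `percent_identity` and the choice of denominator (all aligned columns except those where both sequences have a gap) are assumptions made for this example; alignment tools use differing conventions, and the patent does not prescribe one.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over the columns of two pre-aligned sequences.

    Assumption for this sketch: columns where both sequences carry a gap
    are ignored, and a gap opposite a residue counts as a non-match.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Keep every column except those that are gaps in both sequences.
    columns = [
        (a, b)
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if not (a == "-" and b == "-")
    ]
    if not columns:
        raise ValueError("no aligned columns to compare")
    # A column matches only when both positions hold the same residue.
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)
```

With this convention, two five-residue sequences that differ at one position score 80.0, the lowest threshold named in the text.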
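Suitable conservative amino acid substitutions in peptides or proteins are known to those skilled in the art and can generally be made without altering the biological activity of the resulting molecule. In general, those skilled in the art recognize that single amino acid substitutions in non-essential regions of a polypeptide do not substantially alter biological activity (see, e.g., Watson et al., Molecular Biology of the Gene, 4th Edition, 1987, The Benjamin/Cummings Pub. Co., p. 224).
In particular, those skilled in the art will understand that Cas12b proteins from different strains of the same bacterial species may differ somewhat in amino acid sequence yet perform essentially the same function.
In some embodiments, the Cas12b protein is recombinantly produced. In some embodiments, the Cas12b protein further contains a fusion tag, for example a tag for isolation and/or purification of the Cas12b protein. Methods for recombinant protein production are known in the art, as are a variety of tags that can be used to isolate and/or purify proteins, including but not limited to His tags and GST tags. In general, these tags do not alter the activity of the protein of interest.
“Guide RNA” and “gRNA” are used interchangeably herein. A guide RNA usually consists of a crRNA and a tracrRNA that are partially complementary and form a complex, where the crRNA contains a sequence with sufficient identity to the target sequence to hybridize with the complement of the target sequence and to direct sequence-specific binding of the CRISPR complex (CRISPR nuclease + crRNA + tracrRNA) to that target sequence. However, a single guide RNA (sgRNA) that combines the features of the crRNA and the tracrRNA can be designed and used. The gRNAs for different CRISPR nucleases differ. For example, Cas9 and Cas12b usually require both a crRNA and a tracrRNA, whereas Cas12a (Cpf1) requires only a crRNA.
“A gRNA directed against a target sequence of a target nucleic acid molecule” means that the gRNA can specifically recognize the target sequence. For example, in some embodiments (where the target nucleic acid molecule is double-stranded DNA), the gRNA comprises a spacer sequence capable of specifically hybridizing with the complement of the target sequence. In some embodiments (where the target nucleic acid molecule is single-stranded DNA), the gRNA comprises a spacer sequence capable of specifically hybridizing with the target sequence.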
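The two orientation cases just described can be sketched in code. This is a minimal illustration, assuming the target sequence is written 5'→3' as DNA and that perfect Watson-Crick pairing is wanted; the function name `spacer_rna` and the example sequences are hypothetical and not taken from the patent.

```python
# Complement table for uppercase DNA bases.
_COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def spacer_rna(target: str, double_stranded: bool) -> str:
    """Return a spacer (as RNA, 5'->3') for a target written 5'->3' as DNA.

    double_stranded=True : the spacer pairs with the complement of the
                           target, so it has the same sequence as the target.
    double_stranded=False: the spacer pairs with the target itself, so it is
                           the reverse complement of the target.
    """
    target = target.upper()
    if not target or any(base not in _COMPLEMENT for base in target):
        raise ValueError("target must be a non-empty string of A, C, G, T")
    if double_stranded:
        dna = target
    else:
        dna = "".join(_COMPLEMENT[base] for base in reversed(target))
    # Transcribe the DNA-form spacer into RNA.
    return dna.replace("T", "U")
```

For a toy target `ATGC`, the dsDNA case gives `AUGC` and the ssDNA case gives `GCAU`.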
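The A. acidiphilus CRISPR locus has no direct repeat (DR) array, so AaCas12b has no corresponding crRNA of its own. However, the inventors found that AaCas12b can use the corresponding gRNA of Cas12b proteins from other organisms. For example, AaCas12b can use its own tracrRNA together with a crRNA sequence from the CRISPR locus of A. acidoterrestris as its gRNA. The inventors optimized the gRNAs usable with AaCas12b.
In some embodiments of the invention, the guide RNA is a complex formed by a partially complementary crRNA and tracrRNA. In some embodiments, the tracrRNA is encoded by the following nucleic acid sequence: 5’-GTCTAAAGGACAGAATTTTTCAACGGGTGTGCCAATGGCCACTTTCCAGGTGGCAAAGCCCGTTGAACTTCTCAAAAAGAACGCTCGCTCAGTGTTCTGAC-3’. In some embodiments, the crRNA is encoded by the following nucleic acid sequence: 5’-GTCGGATCACTGAGCGAGCGATCTGAGAAGTGGCAC-Nx-3’, where Nx denotes a nucleotide sequence of X consecutive nucleotides, each N being independently selected from A, G, C and T, and X is an integer with 18≤X≤35. Preferably, X=20. In some embodiments, Nx is a spacer sequence capable of specifically hybridizing with the complement of the target sequence (where the target nucleic acid molecule is double-stranded DNA). In some embodiments, Nx is a spacer sequence capable of specifically hybridizing with the target sequence (where the target nucleic acid molecule is single-stranded DNA).
In some embodiments of the invention, the guide RNA is an sgRNA. In some embodiments, the sgRNA consists of a scaffold sequence at the 5’ end and a spacer sequence at the 3’ end. The spacer sequence can specifically hybridize with the target sequence or with the complement of the target sequence. The spacer sequence is usually 18 to 35 nucleotides long, preferably 20 nucleotides.
In some specific embodiments, the sgRNA is encoded by a nucleic acid sequence selected from one of the following: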
5’-GTCTAAAGGACAGAATTTTTCAACGGGTGTGCCAATGGCCACTTTCCAGGTGGCAAAGCCCGTTGAACTTCTCAAAAAGAACGCTCGCTCAGTGTTCTGACGTCGGATCACTGAGCGAGCGATCTGAGAAGTGGCAC-Nx-3’;
5’-AACTGTCTAAAGGACAGAATTTTTCAACGGGTGTGCCAATGGCCACTTTCCAGGTGGCAAAGCCCGTTGAACTTCTCAAAAAGAACGCTCGCTCAGTGTTCTGACGTCGGATCACTGAGCGAGCGATCTGAGAAGTGGCAC-Nx-3’;
5’-CTGTCTAAAGGACAGAATTTTTCAACGGGTGTGCCAATGGCCACTTTCCAGGTGGCAAAGCCCGTTGAACTTCTCAAAAAGAACGCTCGCTCAGTGTTCTGACGTCGGATCACTGAGCGAGCGATCTGAGAAGTGGCAC-Nx-3’;
5’-GTCTAAAGGACAGAATTTTTCAACGGGTGTGCCAATGGCCACTTTCCAGGTGGCAAAGCCCGTTGAACTTCTCAAAAAGAACGCTCGCTCAGTGTTATCACTGAGCGAGCGATCTGAGAAGTGGCAC-Nx-3’;
5’-GTCTAAAGGACAGAATTTTTCAACGGGTGTGCCAATGGCCACTTTCCAGGTGGCAAAGCCCGTTGAACTTCTCAAAAAGAACGATCTGAGAAGTGGCAC-Nx-3’;
5’-GTCTAAAGGACAGAATTTTTCAACGGGTGTGCCAATGGCCACTTTCCAGGTGGCAAAGCCCGTTGAACTTCTCAAAAAGCTGAGAAGTGGCAC-Nx-3’;
5’-GTCTAAAGGACAGAATTTTTCAACGGGTGTGCCAATGGCCACTTTCCAGGTGGCAAAGCCCGTTGAACTTCTCAAAGCTGAGAAGTGGCAC-Nx-3’;
5’-GTCTAAAGGACAGAATTTTTCAACGGGTGTGCCAATGGCCACTTTCCAGGTGGCAAAGCCCGTTGAACTTCTCAAAACTGAGAAGTGGCAC-Nx-3’;
5’-GTCTAAAGGACAGAATTTTTCAACGGGTGTGCCAATGGCCACTTTCCAGGTGGCAAAGCCCGTTGAACTTCTCAAGCGAGAAGTGGCAC-Nx-3’;
5’-GTCTAAAGGACAGAATTTTTCAACGGGTGTGCCAATGGCCACTTTCCAGGTGGCAAAGCCCGTTGAACTTCTAAGCAGAAGTGGCAC-Nx-3’; and
5’-GTCTAAAGGACAGAATTTTTCAACGGGTGTGCCAATGGCCACTTTCCAGGTGGCAAAGCCCGTTGAACTTCAAGCGAAGTGGCAC-Nx-3’;
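where Nx denotes a nucleotide sequence of X consecutive nucleotides, each N being independently selected from A, G, C and T, and X is an integer with 18≤X≤35. Preferably, X=20. In some embodiments, Nx is a spacer sequence capable of specifically hybridizing with the complement of the target sequence (where the target nucleic acid molecule is double-stranded DNA). In some embodiments, Nx is a spacer sequence capable of specifically hybridizing with the target sequence (where the target nucleic acid molecule is single-stranded DNA). In some embodiments, the sgRNA comprises a scaffold sequence encoded by the nucleotide sequence of any one of SEQ ID NO: 11-21.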
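The “scaffold + Nx” layout above can be illustrated with a short Python sketch that appends a spacer to a scaffold and enforces the stated length range (18≤X≤35). The scaffold constant is the shortest sequence in the list above; the function name `sgrna_template` and the example spacer are hypothetical, and the sketch only assembles the encoding DNA string without predicting whether a given guide is active.

```python
# Shortest scaffold from the list above, written as the encoding DNA (5'->3').
SCAFFOLD = (
    "GTCTAAAGGACAGAATTTTTCAACGGGTGTGCCAATGGCCACTTTCCAGGTGGCAAAGCCCGTTGAACTTC"
    "AAGCGAAGTGGCAC"
)

MIN_SPACER, MAX_SPACER = 18, 35  # allowed spacer length X, inclusive


def sgrna_template(spacer: str, scaffold: str = SCAFFOLD) -> str:
    """Return the DNA encoding an sgRNA: 5'-scaffold-spacer-3'."""
    spacer = spacer.upper()
    if not MIN_SPACER <= len(spacer) <= MAX_SPACER:
        raise ValueError(
            f"spacer length {len(spacer)} is outside {MIN_SPACER}-{MAX_SPACER}"
        )
    if set(spacer) - set("ACGT"):
        raise ValueError("spacer may contain only A, C, G and T")
    return scaffold + spacer
```

For example, `sgrna_template("ACGTACGTACGTACGTACGT")` returns the scaffold followed by the 20-nt placeholder spacer, matching the preferred X=20.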
In some specific embodiments, the sgRNA is encoded by a nucleic acid sequence selected from one of the following:
5’-GTCTAAAGGACAGAATTTTTCAACGGGTGTGCCAATGGCCACTTTCCAGGTGGCAAAGCCCGTTGAACTTCTCAAAAAGAACGCTCGCTCAGTGTTCTGACGTCGGATCACTGAGCGAGCGATCTGAGAAGTGGCAC-Nx-3’(AasgRNA);
5’-tcgtctataGGACGGCGAGGACAACGGGAAGTGCCAATGTGCTCTTTCCAAGAGCAAACACCCCGTTGGCTTCAAGATGACCGCTCGCTCAGCGATCTGACAACGGATCGCTGAGCGAGCGGTCTGAGAAGTGGCAC-Nx-3’(AksgRNA1);
5’-ggaattgccgatctaTAGGACGGCAGATTCAACGGGATGTGCCAATGCACTCTTTCCAGGAGTGAACACCCCGTTGGCTTCAACATGATCGCCCGCTCAACGGTCCGATGTCGGATCGTTGAGCGGGCGATCTGAGAAGTGGCAC-Nx-3’(AmsgRNA1);
5’-GAGGTTCTGTCTTTTGGTCAGGACAACCGTCTAGCTATAAGTGCTGCAGGGTGTGAGAAACTCCTATTGCTGGACGATGTCTCTTTTATTTCTTTTTTCTTGGATGTCCAAGAAAAAAGAAATGATACGAGGCATTAGCAC-Nx-3’(BhsgRNA);
5’-CCATAAGTCGACTTACATATCCGTGCGTGTGCATTATGGGCCCATCCACAGGTCTATTCCCACGGATAATCACGACTTTCCACTAAGCTTTCGAATGTTCGAAAGCTTAGTGGAAAGCTTCGTGGTTAGCAC-Nx-3’(BssgRNA);
5’-GGTGACCTATAGGGTCAATGAATCTGTGCGTGTGCCATAAGTAATTAAAAATTACCCACCACAGGATTATCTTATTTCTGCTAAGTGTTTAGTTGCCTGAATACTTAGCAGAAATAATGATGATTGGCAC-Nx-3’(Bs3sgRNA);
5’-GGCAAAGAATACTGTGCGTGTGCTAAGGATGGAAAAAATCCATTCAACCACAGGATTACATTATTTATCTAATCACTTAAATCTTTAAGTGATTAGATGAATTAAATGTGATTAGCAC-Nx-3’(LssgRNA); or
5’-GTCTTAGGGTATATCCCAAATTTGTCTTAGTATGTGCATTGCTTACAGCGACAACTAAGGTTTGTTTATCTTTTTTTTACATTGTAAGATGTTTTACATTATAAAAAGAAGATAATCTTATTGCAC-Nx-3’(SbsgRNA);
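where Nx denotes a nucleotide sequence of X consecutive nucleotides (the spacer sequence), each N being independently selected from A, G, C and T, and X is an integer with 18≤X≤35. Preferably, X=20. In some embodiments, the sequence Nx (the spacer sequence) can specifically hybridize with the complement of the target sequence. The part of the sgRNA other than Nx is the sgRNA scaffold sequence. In some embodiments, the sgRNA comprises a scaffold sequence encoded by the nucleotide sequence of any one of SEQ ID NO: 22-29.
The inventors surprisingly found that the Cas12b proteins and guide RNAs of different Cas12b systems are interchangeable, which makes it possible to design universal artificial guide RNAs.
Thus, in some embodiments, the sgRNA is an artificial sgRNA encoded by a nucleotide sequence selected from the following: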
5’-GGTCTAAAGGACAGAATTTTTCAACGGGTGTGCCAATGGCCACTTTCCAGGTGGCAAAGCCCGTTGAACTTCAAGCGAAGTGGCAC-Nx-3’(artsgRNA1);
5’-GGTCTAAAGGACAGAAGACAACGGGAAGTGCCAATGTGCTCTTTCCAAGAGCAAACACCCCGTTGACTTCAAGCGAAGTGGCAC-Nx-3’(artsgRNA2);
5’-GGTCTAAAGGACAGAAAATCTGTGCGTGTGCCATAAGTAATTAAAAATTACCCACCACAGACTTCAAGCGAAGTGGCAC-Nx-3’(artsgRNA3);
5’-GGTCGTCTATAGGACGGCGAGTTTTTCAACGGGTGTGCCAATGGCCACTTTCCAGGTGGCAAAGCCCGTTGAACTTCAAGCGAAGTGGCAC-Nx-3’(artsgRNA4);
5’-GGTCGTCTATAGGACGGCGAGGACAACGGGAAGTGCCAATGTGCTCTTTCCAAGAGCAAACACCCCGTTGACTTCAAGCGAAGTGGCAC-Nx-3’(artsgRNA5);
5’-GGTCGTCTATAGGACGGCGAGAATCTGTGCGTGTGCCATAAGTAATTAAAAATTACCCACCACAGACTTCAAGCGAAGTGGCAC-Nx-3’(artsgRNA6);
5’-GGTGACCTATAGGGTCAATGTTTTTCAACGGGTGTGCCAATGGCCACTTTCCAGGTGGCAAAGCCCGTTGAACTTCAAGCGAAGTGGCAC-Nx-3’(artsgRNA7);
5’-GGTGACCTATAGGGTCAATGGACAACGGGAAGTGCCAATGTGCTCTTTCCAAGAGCAAACACCCCGTTGACTTCAAGCGAAGTGGCAC-Nx-3’(artsgRNA8);
5’-GGTGACCTATAGGGTCAATGAATCTGTGCGTGTGCCATAAGTAATTAAAAATTACCCACCACAGACTTCAAGCGAAGTGGCAC-Nx-3’(artsgRNA9);
5’-GGTCTAAAGGACAGAATTTTTCAACGGGTGTGCCAATGGCCACTTTCCAGGTGGCAAAGCCCGTTGAGCTTCAAAGAAGTGGCAC-Nx-3’(artsgRNA10);
5’-GGTCTAAAGGACAGAAGACAACGGGAAGTGCCAATGTGCTCTTTCCAAGAGCAAACACCCCGTTGGCTTCAAAGAAGTGGCAC-Nx-3’(artsgRNA11);
5’-GGTCTAAAGGACAGAAAATCTGTGCGTGTGCCATAAGTAATTAAAAATTACCCACCACAGGCTTCAAAGAAGTGGCAC-Nx-3’(artsgRNA12);
5’-GGTCGTCTATAGGACGGCGAGTTTTTCAACGGGTGTGCCAATGGCCACTTTCCAGGTGGCAAAGCCCGTTGAGCTTCAAAGAAGTGGCAC-Nx-3’(artsgRNA13);
5’-GGTCGTCTATAGGACGGCGAGGACAACGGGAAGTGCCAATGTGCTCTTTCCAAGAGCAAACACCCCGTTGGCTTCAAAGAAGTGGCAC-Nx-3’(artsgRNA14);
5’-GGTCGTCTATAGGACGGCGAGAATCTGTGCGTGTGCCATAAGTAATTAAAAATTACCCACCACAGGCTTCAAAGAAGTGGCAC-Nx-3’(artsgRNA15);
5’-GGTGACCTATAGGGTCAATGTTTTTCAACGGGTGTGCCAATGGCCACTTTCCAGGTGGCAAAGCCCGTTGAGCTTCAAAGAAGTGGCAC-Nx-3’(artsgRNA16);
5’-GGTGACCTATAGGGTCAATGGACAACGGGAAGTGCCAATGTGCTCTTTCCAAGAGCAAACACCCCGTTGGCTTCAAAGAAGTGGCAC-Nx-3’(artsgRNA17);
5’-GGTGACCTATAGGGTCAATGAATCTGTGCGTGTGCCATAAGTAATTAAAAATTACCCACCACAGGCTTCAAAGAAGTGGCAC-Nx-3’(artsgRNA18);
5’-GGTCTAAAGGACAGAATTTTTCAACGGGTGTGCCAATGGCCACTTTCCAGGTGGCAAAGCCCGTTGAGATTATCTATGATGATTGGCAC-Nx-3’(artsgRNA19);
5’-GGTCTAAAGGACAGAAGACAACGGGAAGTGCCAATGTGCTCTTTCCAAGAGCAAACACCCCGTTGGATTATCTATGATGATTGGCAC-Nx-3’(artsgRNA20);
5’-GGTCTAAAGGACAGAAAATCTGTGCGTGTGCCATAAGTAATTAAAAATTACCCACCACAGGATTATCTATGATGATTGGCAC-Nx-3’(artsgRNA21);
5’-GGTCGTCTATAGGACGGCGAGTTTTTCAACGGGTGTGCCAATGGCCACTTTCCAGGTGGCAAAGCCCGTTGAGATTATCTATGATGATTGGCAC-Nx-3’(artsgRNA22);
5’-GGTCGTCTATAGGACGGCGAGGACAACGGGAAGTGCCAATGTGCTCTTTCCAAGAGCAAACACCCCGTTGGATTATCTATGATGATTGGCAC-Nx-3’(artsgRNA23);
5’-GGTCGTCTATAGGACGGCGAGAATCTGTGCGTGTGCCATAAGTAATTAAAAATTACCCACCACAGGATTATCTATGATGATTGGCAC-Nx-3’(artsgRNA24);
5’-GGTGACCTATAGGGTCAATGTTTTTCAACGGGTGTGCCAATGGCCACTTTCCAGGTGGCAAAGCCCGTTGAGATTATCTATGATGATTGGCAC-Nx-3’(artsgRNA25);
5’-GGTGACCTATAGGGTCAATGGACAACGGGAAGTGCCAATGTGCTCTTTCCAAGAGCAAACACCCCGTTGGATTATCTATGATGATTGGCAC-Nx-3’(artsgRNA26);
5’-GGTGACCTATAGGGTCAATGAATCTGTGCGTGTGCCATAAGTAATTAAAAATTACCCACCACAGGATTATCTATGATGATTGGCAC-Nx-3’(artsgRNA27);
5’-GGTCTAAAGGACAGAACAACGGGATGTGCCAATGCACTCTTTCCAGGAGTGAACACCCCGTTGACTTCAAGCGAAGTGGCAC-Nx-3’(artsgRNA28);
5’-GGTCGTCTATAGGACGGCGAGCAACGGGATGTGCCAATGCACTCTTTCCAGGAGTGAACACCCCGTTGACTTCAAGCGAAGTGGCAC-Nx-3’(artsgRNA29);
5’-GGAATTGCCGATCTATAGGACGGCAGATTTTTTTCAACGGGTGTGCCAATGGCCACTTTCCAGGTGGCAAAGCCCGTTGAACTTCAAGCGAAGTGGCAC-Nx-3’(artsgRNA30);
5’-GGAATTGCCGATCTATAGGACGGCAGATTGACAACGGGAAGTGCCAATGTGCTCTTTCCAAGAGCAAACACCCCGTTGACTTCAAGCGAAGTGGCAC-Nx-3’(artsgRNA31);
5’-GGAATTGCCGATCTATAGGACGGCAGATTCAACGGGATGTGCCAATGCACTCTTTCCAGGAGTGAACACCCCGTTGACTTCAAGCGAAGTGGCAC-Nx-3’(artsgRNA32);
5’-GGTCTAAAGGACAGAACAACGGGATGTGCCAATGCACTCTTTCCAGGAGTGAACACCCCGTTGGCTTCAAAGAAGTGGCAC-Nx-3’(artsgRNA33);
5’-GGTCGTCTATAGGACGGCGAGCAACGGGATGTGCCAATGCACTCTTTCCAGGAGTGAACACCCCGTTGGCTTCAAAGAAGTGGCAC-Nx-3’(artsgRNA34);
5’-GGAATTGCCGATCTATAGGACGGCAGATTTTTTTCAACGGGTGTGCCAATGGCCACTTTCCAGGTGGCAAAGCCCGTTGAGCTTCAAAGAAGTGGCAC-Nx-3’(artsgRNA35);
5’-GGAATTGCCGATCTATAGGACGGCAGATTGACAACGGGAAGTGCCAATGTGCTCTTTCCAAGAGCAAACACCCCGTTGGCTTCAAAGAAGTGGCAC-Nx-3’(artsgRNA36); or
5’-GGAATTGCCGATCTATAGGACGGCAGATTCAACGGGATGTGCCAATGCACTCTTTCCAGGAGTGAACACCCCGTTGGCTTCAAAGAAGTGGCAC-Nx-3’(artsgRNA37),
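where Nx denotes a nucleotide sequence of X consecutive nucleotides (the spacer sequence), each N being independently selected from A, G, C and T, and X is an integer with 18≤X≤35. Preferably, X=20. In some embodiments, the sequence Nx (the spacer sequence) can specifically hybridize with the complement of the target sequence. The part of the sgRNA other than Nx is the sgRNA scaffold sequence.
In some embodiments, the artificial sgRNA comprises a scaffold sequence encoded by the nucleotide sequence of any one of SEQ ID NO: 30-66.
In some embodiments, the spacer sequence of the gRNA is designed to match the target sequence or its complement perfectly. In some embodiments, the spacer sequence of the gRNA is designed to have at least one nucleotide mismatch, for example one nucleotide mismatch, with the target sequence or its complement. Such a gRNA is also called a tuned gRNA, and the mismatched nucleotide is also called the tuning site. By exploiting differences in how well Cas12b tolerates different mismatches between the sgRNA and the target, a gRNA designed with a nucleotide mismatch to the target sequence or its complement can discriminate single-nucleotide polymorphic variation in the target sequence. In some embodiments, the position of the at least one nucleotide mismatch differs from the position of the single-nucleotide polymorphism. For example, a tuned sgRNA has one nucleotide mismatch with target sequence 1 at position 1, while target sequence 1 and target sequence 2 differ by a single-nucleotide polymorphism at position 2, so the tuned sgRNA has two nucleotide mismatches with target sequence 2. Because Cas12b tolerates different numbers of mismatches differently, a detectable signal is produced only when target sequence 1 is present (one mismatch) and not with target sequence 2 (two mismatches), so that target sequence 1 and target sequence 2, which differ by a single-nucleotide polymorphism, can be distinguished. A person skilled in the art can screen for suitable tuning sites for a given target sequence.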
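The mismatch-counting argument behind the tuned gRNA can be sketched as follows. This is a simplified model, not the patent's method: it assumes the spacer and both targets are written in the same orientation as DNA strings of equal length, and it treats “at most one mismatch gives a signal” as a fixed rule, whereas real tolerance depends on mismatch position and identity and has to be screened experimentally. All names and sequences are hypothetical.

```python
def count_mismatches(spacer: str, target: str) -> int:
    """Number of positions at which spacer and target differ."""
    if len(spacer) != len(target):
        raise ValueError("spacer and target must have the same length")
    return sum(1 for s, t in zip(spacer.upper(), target.upper()) if s != t)


def make_tuned_spacer(target: str, tuning_pos: int, base: str) -> str:
    """Copy the target and introduce one deliberate mismatch (the tuning site)."""
    target, base = target.upper(), base.upper()
    if base not in "ACGT" or len(base) != 1:
        raise ValueError("base must be one of A, C, G, T")
    if target[tuning_pos] == base:
        raise ValueError("tuning base must differ from the target base")
    return target[:tuning_pos] + base + target[tuning_pos + 1:]


def predicted_signal(spacer: str, target: str, tolerated: int = 1) -> bool:
    """Simplified rule: signal only if mismatches do not exceed `tolerated`."""
    return count_mismatches(spacer, target) <= tolerated


# Hypothetical 20-nt targets differing by one SNP at index 10.
TARGET_1 = "ACGTACGTACGTACGTACGT"
TARGET_2 = TARGET_1[:10] + "A" + TARGET_1[11:]

# Tuning site at index 5, away from the SNP position.
TUNED = make_tuned_spacer(TARGET_1, 5, "T")
```

Under these assumptions the tuned spacer has one mismatch with `TARGET_1` (signal predicted) and two with `TARGET_2` (no signal predicted), while an untuned spacer identical to `TARGET_1` has zero and one mismatch respectively and so would not separate the two targets under the same rule.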
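In the present invention, the part of the gRNA other than the spacer sequence is also called the gRNA scaffold.
In some embodiments, the gRNA is produced by in vitro transcription. In some embodiments, the gRNA is produced by chemical synthesis.
In some embodiments of the invention, the target nucleic acid molecule comprises a characteristic target sequence of 18-35 nucleotides, preferably 20 nucleotides. In some embodiments of the invention, particularly those involving double-stranded DNA detection, the target sequence is immediately preceded at its 5’ end by a PAM (protospacer adjacent motif) sequence selected from 5’-TTTN-3’, 5’-ATTN-3’, 5’-GTTN-3’, 5’-CTTN-3’, 5’-TTC-3’, 5’-TTG-3’, 5’-TTA-3’, 5’-TTT-3’, 5’-TAN-3’, 5’-TGN-3’, 5’-TCN-3’ and 5’-ATC-3’, preferably 5’-TTTN-3’.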
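A simple way to picture the PAM requirement is a scan for candidate target sequences that sit immediately 3' of a PAM on a given strand. The sketch below is illustrative only: it searches a single strand as written, treats N as any base, and uses hypothetical function names and a made-up example sequence.

```python
# PAM motifs listed in the text; N matches any base.
PAMS = [
    "TTTN", "ATTN", "GTTN", "CTTN", "TTC", "TTG",
    "TTA", "TTT", "TAN", "TGN", "TCN", "ATC",
]


def pam_matches(pam: str, site: str) -> bool:
    """True if `site` matches `pam`, with N in the PAM acting as a wildcard."""
    return len(pam) == len(site) and all(
        p == "N" or p == s for p, s in zip(pam, site)
    )


def find_targets(seq: str, pam: str = "TTTN", target_len: int = 20):
    """List (start, target) pairs where a PAM directly precedes the target.

    `start` is the 0-based index of the first target base in `seq`.
    Only the strand given in `seq` is scanned.
    """
    seq = seq.upper()
    hits = []
    for i in range(len(seq) - len(pam) - target_len + 1):
        if pam_matches(pam, seq[i:i + len(pam)]):
            start = i + len(pam)
            hits.append((start, seq[start:start + target_len]))
    return hits


# Made-up example: 2 nt of context, a TTTA PAM, a 20-nt target, 2 nt of context.
EXAMPLE = "GG" + "TTTA" + "ACGTACGTACGTACGTACGT" + "CC"
```

Here `find_targets(EXAMPLE)` yields a single hit starting at index 6. Scanning the reverse-complement strand and checking the other PAMs in `PAMS` would follow the same pattern.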
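The “single-stranded DNA reporter that produces a detectable signal upon cleavage” may, for example, carry a fluorophore and its quencher at the two ends of the single-stranded DNA. While the single-stranded DNA is uncleaved, the quencher keeps the fluorophore from fluorescing. When the Cas12b-gRNA complex is activated by the target nucleic acid molecule and cleaves the single-stranded DNA of the reporter through its non-canonical collateral ssDNA cleavage activity, the fluorophore is released and fluoresces. Suitable fluorophores, their corresponding quenchers and methods of labeling nucleic acid molecules with them are known in the art. Suitable fluorophores include but are not limited to FAM, TEX, HEX, Cy3 and Cy5. Suitable quenchers include but are not limited to BHQ1, BHQ2, BHQ3 and TAMRA. Suitable fluorophore-quencher pairs include but are not limited to FAM-BHQ1, TEX-BHQ2, Cy5-BHQ3, Cy3-BHQ1 and FAM-TAMRA. Thus, in some embodiments, the detectable signal is a fluorescence signal. In some embodiments, the fluorophore is FAM and the quencher is BHQ1.
The single-stranded DNA in the reporter may be about 2-100 nucleotides long, for example 2-5, 2-10, 2-15, 2-20, 2-25, 2-30, 2-40 or more nucleotides. It may comprise any sequence, except, in some embodiments, polyG (polyguanylate). In some embodiments, the single-stranded DNA in the reporter may be selected from polyA (polyadenylate), polyC (polycytidylate) and polyT (polythymidylate).
In some specific embodiments, the single-stranded DNA reporter is selected from 5’-FAM-AAAAA-BHQ1-3’, 5’-FAM-TTTTT-BHQ1-3’ and 5’-FAM-CCCCC-BHQ1-3’.
Some embodiments of the method of the invention further comprise, before step (a), a step of amplifying the nucleic acid molecules in the biological sample. The amplification includes but is not limited to PCR amplification or recombinase polymerase amplification (RPA). Preferably, the amplification is recombinase polymerase amplification.
In some embodiments, the recombinase polymerase amplification is run for about 10 minutes to about 60 minutes.
In some embodiments, the Cas12b protein is pre-complexed with the gRNA to form a Cas12b-gRNA complex before contact with the biological sample.
In some embodiments, the reaction of step (a) is run for about 20 minutes to about 180 minutes, for example about 20 minutes, about 30 minutes, about 40 minutes, about 50 minutes, about 60 minutes, about 90 minutes or about 120 minutes, or any time in between.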
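In some embodiments, step (a) is carried out in a suitable buffer. For example, the buffer is NEBuffer™ 2, NEBuffer™ 2.1 or a third buffer whose name is given only as an inline image (GDA0003076542300000131) followed by the word “Buffer”. In some embodiments, the buffer contains final concentrations of 50 mM NaCl, 10 mM Tris-HCl, 10 mM MgCl2 and 1 mM DTT, pH 7.9. In some embodiments, the buffer contains final concentrations of 50 mM NaCl, 10 mM Tris-HCl, 10 mM MgCl2 and 100 μg/ml BSA, pH 7.9. In some embodiments, the buffer contains final concentrations of 50 mM potassium acetate, 20 mM Tris-acetate, 10 mM magnesium acetate and 100 μg/ml BSA, pH 7.9.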
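Biological samples usable in the method of the invention include but are not limited to whole blood, plasma, serum, cerebrospinal fluid, urine, feces, and cell or tissue extracts. The biological sample also encompasses nucleic acid samples extracted from cells or tissues.
The scope of the invention also includes kits for use in the method of the invention, comprising reagents for carrying out the method and instructions for use. For example, the kit may include a Cas12b protein (e.g. a Cas12b protein of the invention), a gRNA (e.g. one comprising a gRNA scaffold of the invention) or reagents for producing a gRNA (e.g. one comprising a gRNA scaffold of the invention), a single-stranded DNA reporter (e.g. a single-stranded DNA reporter of the invention), a suitable buffer and/or nucleic acid amplification reagents. The kit generally includes a label indicating the intended use and/or method of use of its contents. The term label includes any written or recorded material supplied on or with the kit, or otherwise accompanying it.
The invention also provides the use of a Cas12b protein as defined above and/or a gRNA comprising a scaffold of the invention and/or reagents for producing a gRNA comprising a scaffold of the invention in the preparation of a kit for use in the method of the invention.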
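Examples
Experimental materials and methods
Protein expression and purification
SpCas9 and LbCas12a proteins were purchased commercially (NEB). AaCas12b, ArCas12a, HkCas12a and PrCas12a proteins were purified as previously reported. Briefly, BPK2014-Cas12-His10 proteins were expressed in E. coli strain BL21 (λDE3), with expression induced by 0.5 mM IPTG at 16 °C for 16 hours. Cell pellets were harvested and lysed, and the proteins were purified with His60 Ni Superflow Resin (Takara). The purified Cas12 proteins were dialyzed, concentrated and finally quantified with a BCA protein assay kit (Thermo Fisher).
Nucleic acid preparation
DNA oligonucleotides were purchased commercially (Genscript). Double-stranded DNA activators were obtained by PCR amplification and purified with the Oligo Clean & Concentrator Kit (ZYMO Research). Guide RNAs were transcribed in vitro with the HiScribe™ T7 High Yield RNA Synthesis Kit (NEB) and purified with the MicroElute RNA Clean Up Kit (Omega).
The background genomic DNA used in the indicated reactions was purified from human embryonic kidney 293T cells with the Mouse Direct PCR Kit (Bimake). To mimic cell-free DNA (cfDNA), dsDNA was diluted into human plasma (Thermo Fisher) at the indicated concentrations.
Fluorophore-quencher (FQ)-labeled reporter assay
Detection assays were run as 20 μl reactions in Corning 384-well polystyrene NBS microplates with 30 nM Cas12, 36 nM gRNA, 40 nM activator (unless otherwise stated) mixed with 40 ng of background genomic DNA (in the indicated reactions), 200 nM custom-synthesized homopolymer ssDNA FQ reporter (Table 1) and NEBuffer™ 2 (unless otherwise stated). Reactions were incubated at 37 °C for the indicated times in a fluorescence plate reader (BioTek Synergy 4), with fluorescence kinetics measured every 5 minutes (λex = 485 nm; λem = 528 nm; gain = 61). Fluorescence results were analyzed with SigmaPlot software.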
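The final concentrations in this 20 μl assay translate into pipetting volumes through the usual dilution relation C1·V1 = C2·V2. The sketch below works through that arithmetic. The final concentrations and reaction volume come from the protocol above, while the stock concentrations and the function name are hypothetical placeholders chosen only to make the example concrete.

```python
REACTION_UL = 20.0  # reaction volume from the protocol, in microliters

# Final concentrations from the protocol, in nM.
FINAL_NM = {"Cas12": 30.0, "gRNA": 36.0, "activator": 40.0, "FQ reporter": 200.0}

# Hypothetical stock concentrations in nM (10x each); not from the patent.
STOCK_NM = {"Cas12": 300.0, "gRNA": 360.0, "activator": 400.0, "FQ reporter": 2000.0}


def stock_volume_ul(final_nm: float, stock_nm: float,
                    reaction_ul: float = REACTION_UL) -> float:
    """Volume of stock needed so that the reaction reaches `final_nm`.

    Uses C1 * V1 = C2 * V2, i.e. V1 = C2 * V2 / C1.
    """
    if stock_nm <= final_nm:
        raise ValueError("stock must be more concentrated than the final mix")
    return final_nm * reaction_ul / stock_nm


# Volume of each component for one reaction, with the assumed stocks.
VOLUMES_UL = {
    name: stock_volume_ul(FINAL_NM[name], STOCK_NM[name]) for name in FINAL_NM
}
```

With these assumed 10x stocks each component takes 2 μl, leaving 12 μl for buffer, background DNA and water. When a large volume of preamplified sample is added instead, the stocks would need to be correspondingly more concentrated.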
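Recombinase polymerase amplification (RPA) reaction
RPA reactions were performed with TwistAmp Basic (TwistDx) according to the manufacturer's protocol. 50 μl RPA reactions containing varying amounts of input DNA were incubated at 37 °C for 10 minutes. 16 μl of RPA product was then transferred directly into the 20 μl detection assay described above.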
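Example 1. Characterization of the trans ssDNA cleavage activity of Cas12b
The CRISPR-Cas12b nuclease from Alicyclobacillus acidiphilus NBRC 100859 (AaCas12b; amino acid sequence shown in SEQ ID NO: 1) has been used for mammalian genome editing because of its canonical targeted dsDNA cleavage (Figure 1a). To characterize the non-canonical trans (collateral) cleavage activity of Cas12b, in vitro ssDNA cleavage assays were performed with Cas12b, Cas12a and Cas9, each combined with its corresponding guide RNA (gRNA). AaCas12b and the Cas12a proteins (ArCas12a, HkCas12a and PrCas12a) induced rapid degradation of single-stranded M13 phage DNA in the presence of an M13-targeting sgRNA (OT-gRNA), whereas SpCas9 did not (Figure 1b). Likewise, AaCas12b and Cas12a degraded M13 in the presence of a non-target gRNA with no sequence homology to the M13 phage genome together with its complementary ssDNA “activator” (Figure 1c). The collateral ssDNA cleavage activity was abolished in the catalytically inactive variants (carrying the R785A or D977A substitution), showing that this activity depends on the RuvC domain (Figure 2). These results show that the AaCas12b-sgRNA complex acquires non-specific ssDNA trans-cleavage activity once triggered by DNA complementary to the guide sequence.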
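Example 2. Development of a Cas12b-mediated DNA detection system
To develop a Cas12b-mediated DNA detection system, the cleavage preference of the AaCas12b-sgRNA complex for fluorophore-quencher (FQ)-labeled homopolymer reporters was analyzed first. AaCas12b preferred poly-thymidylate (poly T), as well as poly-adenylate (poly A) and poly-cytidylate (poly C), whereas poly-guanylate (poly G) was essentially not cleaved (Figure 3a). Cleavage efficiency could also be optimized and was best in NEBuffer™ 2 (Figure 3b). AaCas12b-mediated cleavage assays were then run with the poly T reporter in NEBuffer™ 2. With sgRNA-complementary OT-ssDNA and OT-dsDNA (OT, on-target) or sgRNA-non-complementary NT-ssDNA and NT-dsDNA (NT, non-target) as activators, the OT-ssDNA and OT-dsDNA activators were found to trigger AaCas12b cleavage of the FQ reporter, although OT-dsDNA was less efficient (Figure 3c).
Next, the specificity of trans-cleavage activation was tested with ssDNA or dsDNA activators carrying various mismatches. The PAM sequence was essential for AaCas12b trans-cleavage triggered by dsDNA activators but was not required for ssDNA activators (Figure 4a, b). Mismatches between a dsDNA activator and the sgRNA impaired or even abolished trans-cleavage activity, whereas ssDNA activators tolerated mismatches better (Figure 4a, b).
The sensitivity of the AaCas12b-sgRNA-activator system was then determined: without preamplification, AaCas12b produced no detectable signal at ssDNA activator input concentrations <1.6 nM or dsDNA activator input concentrations <8 nM (Figure 5a).
Because dsDNA activators gave higher specificity (Figure 4a, b), the AaCas12b-sgRNA-dsDNA activation system was designed as a DNA detection platform (Cas12b-based DNA detection, CDetection). One cauliflower mosaic virus (CaMV) dsDNA and two human papillomavirus (HPV16 and HPV18) dsDNAs were synthesized as activators for the detection reactions. At activator input concentrations ≥10 nM, AaCas12b-sgRNA not only produced a detectable signal (Figure 5b, c) but also discriminated the two dsDNA viruses HPV16 and HPV18 (Figure 5c).
Compared with Cas12a-based DNA detection, AaCas12b showed higher detection sensitivity at both detection sites, so CDetection gave higher signal levels and lower background (Figures 4c and 5d).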
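To improve sensitivity, preamplification by recombinase polymerase amplification (RPA) was performed, and single-molecule detection was achieved at a level given only as an inline image (GDA0003076542300000162) (Figures 4d and 5e). These results show that Cas12b-mediated DNA detection (CDetection) achieves DNA detection with high specificity and attomolar sensitivity, and that it is more sensitive than the Cas12a-based detection platform (Figure 4e).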
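To put attomolar concentrations into perspective, the expected number of molecules in a sample is the molar concentration times Avogadro's number times the volume. The short sketch below does this arithmetic; the function name and the example volume are illustrative assumptions rather than values taken from the experiments.

```python
AVOGADRO = 6.02214076e23  # molecules per mole (exact SI definition)


def expected_copies(molar: float, volume_l: float) -> float:
    """Expected number of molecules at `molar` (mol/L) in `volume_l` liters."""
    if molar < 0 or volume_l < 0:
        raise ValueError("concentration and volume must be non-negative")
    return molar * AVOGADRO * volume_l


# Example: 1 aM (1e-18 M) in 1 microliter (1e-6 L).
COPIES_PER_UL_AT_1_AM = expected_copies(1e-18, 1e-6)
```

At 1 aM a microliter holds about 0.6 molecules on average, which is why detection at this concentration is described in terms of single molecules and why the sampled volume matters.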
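Example 3. Development of an enhanced Cas12b-mediated DNA detection system
To extend and model the use of CDetection in molecular diagnostics, synthetic HPV dsDNA was diluted into human genomic DNA. CDetection detected the infectious viral target at sub-attomolar levels (the value is given only as an inline image, GDA0003076542300000161) (Figure 6a).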
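The high sensitivity of AaCas12b in human plasma prompted a test of CDetection in cfDNA-based non-invasive diagnostics. Although earlier cfDNA analyses reached a sensitivity of 1 in 108, those methods require relatively large amounts of cfDNA (5-10 ng/ml plasma) and are time-consuming. To demonstrate the advantage of CDetection in cfDNA detection, HPV dsDNA was diluted into human plasma and the sensitivity of the new method was tested. CDetection detected HPV DNA in human plasma at concentrations as low as a value given only as an inline image (GDA0003076542300000163) (Figures 4f and 6b), indicating that the presence of an infectious virus can be detected rapidly from just one drop of blood.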
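To extend CDetection to precision diagnostics, we designed an experiment that uses three targeting sgRNAs and the corresponding dsDNA activators (on- and off-activators) to identify six common human ABO alleles. In theory, CDetection with each of the three sgRNAs identifies O01, O02/O03 and B101, respectively. If no fluorescence signal is detected with any of the sgRNAs, the allele should be A101/A201 (Figure 7a). The results showed that CDetection could not distinguish the different ABO alleles, because the fluorescence signals from the on- and off-dsDNA activator groups were hard to tell apart (Figure 7b).
To increase the specificity of CDetection, a tuned guide RNA (tgRNA) carrying a single-nucleotide mismatch in the spacer sequence was designed, which turns two similar targets differing by a single nucleotide from indistinguishable into distinguishable (Figure 7c).
To demonstrate the single-base resolution of this enhanced CDetection (eCDetection), ABO blood genotyping was performed. eCDetection determined blood type with high accuracy, whereas CDetection could not (Figure 8a).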
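The allele-calling rule used in this ABO example (one guide per allele group, with no signal from any guide implying A101/A201) can be written as a small decision function. This is a sketch under stated assumptions: the guide labels, the fluorescence threshold and the handling of multiple positive guides as “ambiguous” are hypothetical choices for illustration and are not specified in the patent.

```python
from typing import Dict

# Hypothetical guide labels mapped to the allele group each one reports.
GUIDE_TO_ALLELE = {
    "guide-O01": "O01",
    "guide-O02/O03": "O02/O03",
    "guide-B101": "B101",
}


def call_abo_allele(signal: Dict[str, float], threshold: float) -> str:
    """Call an ABO allele group from per-guide fluorescence readings.

    A guide is positive when its signal is at or above `threshold`
    (the threshold is an assumed, user-chosen cutoff).
    """
    missing = set(GUIDE_TO_ALLELE) - set(signal)
    if missing:
        raise ValueError(f"missing readings for: {sorted(missing)}")
    positive = [
        allele
        for guide, allele in GUIDE_TO_ALLELE.items()
        if signal[guide] >= threshold
    ]
    if not positive:
        return "A101/A201"  # no guide gave a signal
    if len(positive) == 1:
        return positive[0]
    return "ambiguous"  # more than one guide positive: not resolved here
```

For instance, readings of 50, 40 and 900 RFU for the three guides with a cutoff of 500 RFU would be called `B101`, while three sub-threshold readings would be called `A101/A201`.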
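Sequencing and probe-based detection are the two main methods for detecting diseases caused by point mutations. However, sequencing is expensive and time-consuming, and its sensitivity depends on sequencing depth, while probe-based methods are poorly sensitive to single-base mutations. Because the eCDetection method of the invention is highly specific and sensitive, it can be used to detect low-frequency single-base mutations in the human genome.
The cancer-associated TP53 856G>A mutation was chosen to test feasibility. CDetection with a selected tgRNA accurately distinguished the point-mutant allele from the wild-type allele (Figure 9a).
In addition, the CDetection platform was applied to two hotspots of the breast cancer-associated BRCA1 gene (3232A>G and 3537A>G). CDetection with selected tgRNAs (tgRNA-3232-1 and tgRNA-3537-4) discriminated the point mutations well, whereas sgRNA barely supported point-mutation detection (Figures 8b, c and 9b, c).
Furthermore, to mimic early clinical detection of primary disease from cfDNA by CDetection, BRCA1 3232A>G dsDNA was diluted into human plasma. CDetection achieved point-mutation detection at single-base resolution (Figures 8d and 9d). The eCDetection of the invention enables rapid DNA detection at single-base resolution in clinical research.
In summary, the invention provides the CDetection platform, which is based on the non-canonical collateral ssDNA cleavage of the Cas12b nuclease and detects DNA with attomolar sensitivity. Combined with tuned gRNA, an enhanced version (eCDetection) was developed that achieves single-nucleotide resolution. The CDetection and eCDetection platforms of the invention make nucleic acid detection easier in a variety of molecular diagnostic applications and allow genotyping in clinical research (Figure 10).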
Example 4. Identification of other Cas12b proteins
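Six representative Cas12b proteins from different bacteria, together with four previously reported Cas12b orthologs, were selected and synthesized de novo for genome editing in human embryonic kidney 293T cells (Figs. 11 and 12, and SEQ ID NOs: 1-10). Among these 10 Cas12b orthologs, the Cas12b proteins from D. inopinatus (DiCas12b) and T. calidus (TcCas12b) had neither a predictable precursor CRISPR RNA (pre-crRNA) nor a trans-activating crRNA (tracrRNA) (Fig. 11b), suggesting that these two Cas12b proteins might not be suitable for genome editing applications.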
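For mammalian genome editing, 293T cells were co-transfected with each individual Cas12b enzyme and its cognate single guide RNA (sgRNA) targeting human endogenous loci containing an appropriate PAM (Fig. 11). The results of the T7 endonuclease I (T7EI) assay showed that AaCas12b, AkCas12b, AmCas12b, BhCas12b, Bs3Cas12b and LsCas12b robustly edited the human genome, although their targeting efficiencies varied among orthologs and among target sites (Fig. 11b and Fig. 13a). Multiplex genome editing was also achieved with Bs3Cas12b simply by using multiple sgRNAs, editing four sites in the human genome simultaneously (Fig. 13b, c). These newly identified functional Cas12b orthologs expand the options for Cas12b-based genome editing as well as for Cas12b-based DNA detection.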
Example 5. Interchangeability of different Cas12b proteins and dual-RNAs
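To investigate the interchangeability between the dual-RNA (crRNA and tracrRNA) and the protein component in Cas12b systems, the conservation of both the Cas12b proteins and the dual-RNAs was first analyzed. In addition to the conserved amino acid sequences of the Cas12b orthologs (Fig. 14a and Fig. 12), the DNA sequences of the pre-crRNA:tracrRNA duplexes and their secondary structures were also highly conserved (Fig. 11b and Fig. 15). Next, eight Cas12b orthologs, each complexed with the sgRNA derived from each of the eight Cas12b systems, were used for genome editing in 293T cells. As shown by the T7EI assay, sgRNAs derived from the AaCas12b, AkCas12b, AmCas12b, Bs3Cas12b and LsCas12b loci could replace the original sgRNA for mammalian genome editing, although activity varied among the different Cas12b orthologs and sgRNAs (Fig. 14c, d and Fig. 16). These results demonstrate the interchangeability between different Cas12b proteins and dual-RNAs from different Cas12b loci.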
Example 6. Genome editing using Cas12b proteins whose native loci lack a CRISPR array
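The present invention further selected two Cas12b orthologs whose loci carry no CRISPR array, DiCas12b and TcCas12b, for subsequent experiments (Fig. 17a). Because their loci carry no CRISPR array, the sequences of their crRNA:tracrRNA duplexes cannot be predicted. DiCas12b and TcCas12b, as well as AaCas12b, were each co-transfected into 293T cells in combination with sgRNAs derived from the loci of the other eight Cas12b orthologs and targeting different genomic sites. The T7EI assay showed that sgRNAs derived from AaCas12b, AkCas12b, AmCas12b, Bs3Cas12b and LsCas12b enabled TcCas12b to robustly edit the human genome (Fig. 17b, c and Fig. 18). In addition, AasgRNA or AksgRNA enabled TcCas12b to perform multiplex genome editing (Fig. 17d and Fig. 19). These results show that the interchangeability between Cas12b proteins and dual-RNAs from different systems makes it possible to edit mammalian genomes with Cas12b orthologs whose native loci lack a CRISPR array.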
Example 7. Design of artificial sgRNAs for Cas12b-mediated genome editing
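The interchangeability between Cas12b proteins and dual-RNAs in different Cas12b systems facilitates the design of new artificial sgRNA (artsgRNA) scaffolds to promote Cas12b-mediated genome editing. Considering the conservation of DNA sequences and secondary structures among the Cas12b orthologs (Fig. 11b and 13), 37 sgRNA scaffolds (SEQ ID NOs: 30-66) were designed and synthesized de novo to target the human CCR5 locus (Fig. 17e, Fig. 20a). The T7EI assay showed that 22 artsgRNA scaffolds worked efficiently. To verify the general applicability of the artsgRNAs, artsgRNA13 was used to guide TcCas12b or AaCas12b for multiplex genome editing (Fig. 20a). The T7EI assay showed that artsgRNA13 promoted multiplex genome editing by both TcCas12b and AaCas12b (Fig. 17f and Fig. 20c). These results show that designing and synthesizing artsgRNAs can promote Cas12b-mediated genome editing, in particular multiplex genome editing.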
Table 1. Nucleic acid sequences used in the Examples (spacer sequences are underlined; mismatched nucleotides in tgRNAs are in bold italics)
Figure GDA0003076542300000181
Figure GDA0003076542300000191
Figure GDA0003076542300000201
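Table 2. Components of the buffers tested in Cas12b-mediated DNA detection. Buffers 1, 3, 4 and 5 are from previous reports, whereas buffers 2, 6, 7 and 8 are commercial buffers. 1: Nuclease Assay Buffer; 2: NEBuffer™ 3.1; 3: Cas12a Binding Buffer; 4: Cas13 Buffer; 5: Cas12a Buffer; 6: NEBuffer™ 2; 7: NEBuffer™ 2.1; 8: [Figure GDA0003076542300000202] Buffer.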
Figure GDA0003076542300000203
Table 3. Sequence listing of the present invention and the corresponding sequence information
Figure GDA0003076542300000204
Figure GDA0003076542300000211
Figure GDA0003076542300000221
Sequence Listing
<110> Institute of Zoology, Chinese Academy of Sciences
<120> Nucleic acid detection method
<130> TC2737
<150> 201811099146.0
<151> 2018-09-20
<160> 66
<170> PatentIn version 3.5
<210> 1
<211> 1129
<212> PRT
<213> Alicyclobacillus acidiphilus
<400> 1
Met Ala Val Lys Ser Met Lys Val Lys Leu Arg Leu Asp Asn Met Pro
1 5 10 15
Glu Ile Arg Ala Gly Leu Trp Lys Leu His Thr Glu Val Asn Ala Gly
20 25 30
Val Arg Tyr Tyr Thr Glu Trp Leu Ser Leu Leu Arg Gln Glu Asn Leu
35 40 45
Tyr Arg Arg Ser Pro Asn Gly Asp Gly Glu Gln Glu Cys Tyr Lys Thr
50 55 60
Ala Glu Glu Cys Lys Ala Glu Leu Leu Glu Arg Leu Arg Ala Arg Gln
65 70 75 80
Val Glu Asn Gly His Cys Gly Pro Ala Gly Ser Asp Asp Glu Leu Leu
85 90 95
Gln Leu Ala Arg Gln Leu Tyr Glu Leu Leu Val Pro Gln Ala Ile Gly
100 105 110
Ala Lys Gly Asp Ala Gln Gln Ile Ala Arg Lys Phe Leu Ser Pro Leu
115 120 125
Ala Asp Lys Asp Ala Val Gly Gly Leu Gly Ile Ala Lys Ala Gly Asn
130 135 140
Lys Pro Arg Trp Val Arg Met Arg Glu Ala Gly Glu Pro Gly Trp Glu
145 150 155 160
Glu Glu Lys Ala Lys Ala Glu Ala Arg Lys Ser Thr Asp Arg Thr Ala
165 170 175
Asp Val Leu Arg Ala Leu Ala Asp Phe Gly Leu Lys Pro Leu Met Arg
180 185 190
Val Tyr Thr Asp Ser Asp Met Ser Ser Val Gln Trp Lys Pro Leu Arg
195 200 205
Lys Gly Gln Ala Val Arg Thr Trp Asp Arg Asp Met Phe Gln Gln Ala
210 215 220
Ile Glu Arg Met Met Ser Trp Glu Ser Trp Asn Gln Arg Val Gly Glu
225 230 235 240
Ala Tyr Ala Lys Leu Val Glu Gln Lys Ser Arg Phe Glu Gln Lys Asn
245 250 255
Phe Val Gly Gln Glu His Leu Val Gln Leu Val Asn Gln Leu Gln Gln
260 265 270
Asp Met Lys Glu Ala Ser His Gly Leu Glu Ser Lys Glu Gln Thr Ala
275 280 285
His Tyr Leu Thr Gly Arg Ala Leu Arg Gly Ser Asp Lys Val Phe Glu
290 295 300
Lys Trp Glu Lys Leu Asp Pro Asp Ala Pro Phe Asp Leu Tyr Asp Thr
305 310 315 320
Glu Ile Lys Asn Val Gln Arg Arg Asn Thr Arg Arg Phe Gly Ser His
325 330 335
Asp Leu Phe Ala Lys Leu Ala Glu Pro Lys Tyr Gln Ala Leu Trp Arg
340 345 350
Glu Asp Ala Ser Phe Leu Thr Arg Tyr Ala Val Tyr Asn Ser Ile Val
355 360 365
Arg Lys Leu Asn His Ala Lys Met Phe Ala Thr Phe Thr Leu Pro Asp
370 375 380
Ala Thr Ala His Pro Ile Trp Thr Arg Phe Asp Lys Leu Gly Gly Asn
385 390 395 400
Leu His Gln Tyr Thr Phe Leu Phe Asn Glu Phe Gly Glu Gly Arg His
405 410 415
Ala Ile Arg Phe Gln Lys Leu Leu Thr Val Glu Asp Gly Val Ala Lys
420 425 430
Glu Val Asp Asp Val Thr Val Pro Ile Ser Met Ser Ala Gln Leu Asp
435 440 445
Asp Leu Leu Pro Arg Asp Pro His Glu Leu Val Ala Leu Tyr Phe Gln
450 455 460
Asp Tyr Gly Ala Glu Gln His Leu Ala Gly Glu Phe Gly Gly Ala Lys
465 470 475 480
Ile Gln Tyr Arg Arg Asp Gln Leu Asn His Leu His Ala Arg Arg Gly
485 490 495
Ala Arg Asp Val Tyr Leu Asn Leu Ser Val Arg Val Gln Ser Gln Ser
500 505 510
Glu Ala Arg Gly Glu Arg Arg Pro Pro Tyr Ala Ala Val Phe Arg Leu
515 520 525
Val Gly Asp Asn His Arg Ala Phe Val His Phe Asp Lys Leu Ser Asp
530 535 540
Tyr Leu Ala Glu His Pro Asp Asp Gly Lys Leu Gly Ser Glu Gly Leu
545 550 555 560
Leu Ser Gly Leu Arg Val Met Ser Val Asp Leu Gly Leu Arg Thr Ser
565 570 575
Ala Ser Ile Ser Val Phe Arg Val Ala Arg Lys Asp Glu Leu Lys Pro
580 585 590
Asn Ser Glu Gly Arg Val Pro Phe Cys Phe Pro Ile Glu Gly Asn Glu
595 600 605
Asn Leu Val Ala Val His Glu Arg Ser Gln Leu Leu Lys Leu Pro Gly
610 615 620
Glu Thr Glu Ser Lys Asp Leu Arg Ala Ile Arg Glu Glu Arg Gln Arg
625 630 635 640
Thr Leu Arg Gln Leu Arg Thr Gln Leu Ala Tyr Leu Arg Leu Leu Val
645 650 655
Arg Cys Gly Ser Glu Asp Val Gly Arg Arg Glu Arg Ser Trp Ala Lys
660 665 670
Leu Ile Glu Gln Pro Met Asp Ala Asn Gln Met Thr Pro Asp Trp Arg
675 680 685
Glu Ala Phe Glu Asp Glu Leu Gln Lys Leu Lys Ser Leu Tyr Gly Ile
690 695 700
Cys Gly Asp Arg Glu Trp Thr Glu Ala Val Tyr Glu Ser Val Arg Arg
705 710 715 720
Val Trp Arg His Met Gly Lys Gln Val Arg Asp Trp Arg Lys Asp Val
725 730 735
Arg Ser Gly Glu Arg Pro Lys Ile Arg Gly Tyr Gln Lys Asp Val Val
740 745 750
Gly Gly Asn Ser Ile Glu Gln Ile Glu Tyr Leu Glu Arg Gln Tyr Lys
755 760 765
Phe Leu Lys Ser Trp Ser Phe Phe Gly Lys Val Ser Gly Gln Val Ile
770 775 780
Arg Ala Glu Lys Gly Ser Arg Phe Ala Ile Thr Leu Arg Glu His Ile
785 790 795 800
Asp His Ala Lys Glu Asp Arg Leu Lys Lys Leu Ala Asp Arg Ile Ile
805 810 815
Met Glu Ala Leu Gly Tyr Val Tyr Ala Leu Asp Asp Glu Arg Gly Lys
820 825 830
Gly Lys Trp Val Ala Lys Tyr Pro Pro Cys Gln Leu Ile Leu Leu Glu
835 840 845
Glu Leu Ser Glu Tyr Gln Phe Asn Asn Asp Arg Pro Pro Ser Glu Asn
850 855 860
Asn Gln Leu Met Gln Trp Ser His Arg Gly Val Phe Gln Glu Leu Leu
865 870 875 880
Asn Gln Ala Gln Val His Asp Leu Leu Val Gly Thr Met Tyr Ala Ala
885 890 895
Phe Ser Ser Arg Phe Asp Ala Arg Thr Gly Ala Pro Gly Ile Arg Cys
900 905 910
Arg Arg Val Pro Ala Arg Cys Ala Arg Glu Gln Asn Pro Glu Pro Phe
915 920 925
Pro Trp Trp Leu Asn Lys Phe Val Ala Glu His Lys Leu Asp Gly Cys
930 935 940
Pro Leu Arg Ala Asp Asp Leu Ile Pro Thr Gly Glu Gly Glu Phe Phe
945 950 955 960
Val Ser Pro Phe Ser Ala Glu Glu Gly Asp Phe His Gln Ile His Ala
965 970 975
Asp Leu Asn Ala Ala Gln Asn Leu Gln Arg Arg Leu Trp Ser Asp Phe
980 985 990
Asp Ile Ser Gln Ile Arg Leu Arg Cys Asp Trp Gly Glu Val Asp Gly
995 1000 1005
Glu Pro Val Leu Ile Pro Arg Thr Thr Gly Lys Arg Thr Ala Asp
1010 1015 1020
Ser Tyr Gly Asn Lys Val Phe Tyr Thr Lys Thr Gly Val Thr Tyr
1025 1030 1035
Tyr Glu Arg Glu Arg Gly Lys Lys Arg Arg Lys Val Phe Ala Gln
1040 1045 1050
Glu Glu Leu Ser Glu Glu Glu Ala Glu Leu Leu Val Glu Ala Asp
1055 1060 1065
Glu Ala Arg Glu Lys Ser Val Val Leu Met Arg Asp Pro Ser Gly
1070 1075 1080
Ile Ile Asn Arg Gly Asp Trp Thr Arg Gln Lys Glu Phe Trp Ser
1085 1090 1095
Met Val Asn Gln Arg Ile Glu Gly Tyr Leu Val Lys Gln Ile Arg
1100 1105 1110
Ser Arg Val Arg Leu Gln Glu Ser Ala Cys Glu Asn Thr Gly Asp
1115 1120 1125
Ile
<210> 2
<211> 1147
<212> PRT
<213> Alicyclobacillus kakegawensis
<400> 2
Met Ala Val Lys Ser Ile Lys Val Lys Leu Arg Leu Ser Glu Cys Pro
1 5 10 15
Asp Ile Leu Ala Gly Met Trp Gln Leu His Arg Ala Thr Asn Ala Gly
20 25 30
Val Arg Tyr Tyr Thr Glu Trp Val Ser Leu Met Arg Gln Glu Ile Leu
35 40 45
Tyr Ser Arg Gly Pro Asp Gly Gly Gln Gln Cys Tyr Met Thr Ala Glu
50 55 60
Asp Cys Gln Arg Glu Leu Leu Arg Arg Leu Arg Asn Arg Gln Leu His
65 70 75 80
Asn Gly Arg Gln Asp Gln Pro Gly Thr Asp Ala Asp Leu Leu Ala Ile
85 90 95
Ser Arg Arg Leu Tyr Glu Ile Leu Val Leu Gln Ser Ile Gly Lys Arg
100 105 110
Gly Asp Ala Gln Gln Ile Ala Ser Ser Phe Leu Ser Pro Leu Val Asp
115 120 125
Pro Asn Ser Lys Gly Gly Arg Gly Glu Ala Lys Ser Gly Arg Lys Pro
130 135 140
Ala Trp Gln Lys Met Arg Asp Gln Gly Asp Pro Arg Trp Val Ala Ala
145 150 155 160
Arg Glu Lys Tyr Glu Gln Arg Lys Ala Val Asp Pro Ser Lys Glu Ile
165 170 175
Leu Asn Ser Leu Asp Ala Leu Gly Leu Arg Pro Leu Phe Ala Val Phe
180 185 190
Thr Glu Thr Tyr Arg Ser Gly Val Asp Trp Lys Pro Leu Gly Lys Ser
195 200 205
Gln Gly Val Arg Thr Trp Asp Arg Asp Met Phe Gln Gln Ala Leu Glu
210 215 220
Arg Leu Met Ser Trp Glu Ser Trp Asn Arg Arg Val Gly Glu Glu Tyr
225 230 235 240
Ala Arg Leu Phe Gln Gln Lys Met Lys Phe Glu Gln Glu His Phe Ala
245 250 255
Glu Gln Ser His Leu Val Lys Leu Ala Arg Ala Leu Glu Ala Asp Met
260 265 270
Arg Ala Ala Ser Gln Gly Phe Glu Ala Lys Arg Gly Thr Ala His Gln
275 280 285
Ile Thr Arg Arg Ala Leu Arg Gly Ala Asp Arg Val Phe Glu Ile Trp
290 295 300
Lys Ser Ile Pro Glu Glu Ala Leu Phe Ser Gln Tyr Asp Glu Val Ile
305 310 315 320
Arg Gln Val Gln Ala Glu Lys Arg Arg Asp Phe Gly Ser His Asp Leu
325 330 335
Phe Ala Lys Leu Ala Glu Pro Lys Tyr Gln Pro Leu Trp Arg Ala Asp
340 345 350
Glu Thr Phe Leu Thr Arg Tyr Ala Leu Tyr Asn Gly Val Leu Arg Asp
355 360 365
Leu Glu Lys Ala Arg Gln Phe Ala Thr Phe Thr Leu Pro Asp Ala Cys
370 375 380
Val Asn Pro Ile Trp Thr Arg Phe Glu Ser Ser Gln Gly Ser Asn Leu
385 390 395 400
His Lys Tyr Glu Phe Leu Phe Asp His Leu Gly Pro Gly Arg His Ala
405 410 415
Val Arg Phe Gln Arg Leu Leu Val Val Glu Ser Glu Gly Ala Lys Glu
420 425 430
Arg Asp Ser Val Val Val Pro Val Ala Pro Ser Gly Gln Leu Asp Lys
435 440 445
Leu Val Leu Arg Glu Glu Glu Lys Ser Ser Val Ala Leu His Leu His
450 455 460
Asp Thr Ala Arg Pro Asp Gly Phe Met Ala Glu Trp Ala Gly Ala Lys
465 470 475 480
Leu Gln Tyr Glu Arg Ser Thr Leu Ala Arg Lys Ala Arg Arg Asp Lys
485 490 495
Gln Gly Met Arg Ser Trp Arg Arg Gln Pro Ser Met Leu Met Ser Ala
500 505 510
Ala Gln Met Leu Glu Asp Ala Lys Gln Ala Gly Asp Val Tyr Leu Asn
515 520 525
Ile Ser Val Arg Val Lys Ser Pro Ser Glu Val Arg Gly Gln Arg Arg
530 535 540
Pro Pro Tyr Ala Ala Leu Phe Arg Ile Asp Asp Lys Gln Arg Arg Val
545 550 555 560
Thr Val Asn Tyr Asn Lys Leu Ser Ala Tyr Leu Glu Glu His Pro Asp
565 570 575
Lys Gln Ile Pro Gly Ala Pro Gly Leu Leu Ser Gly Leu Arg Val Met
580 585 590
Ser Val Asp Leu Gly Leu Arg Thr Ser Ala Ser Ile Ser Val Phe Arg
595 600 605
Val Ala Lys Lys Glu Glu Val Glu Ala Leu Gly Asp Gly Arg Pro Pro
610 615 620
His Tyr Tyr Pro Ile His Gly Thr Asp Asp Leu Val Ala Val His Glu
625 630 635 640
Arg Ser His Leu Ile Gln Met Pro Gly Glu Thr Glu Thr Lys Gln Leu
645 650 655
Arg Lys Leu Arg Glu Glu Arg Gln Ala Val Leu Arg Pro Leu Phe Ala
660 665 670
Gln Leu Ala Leu Leu Arg Leu Leu Val Arg Cys Gly Ala Ala Asp Glu
675 680 685
Arg Ile Arg Thr Arg Ser Trp Gln Arg Leu Thr Lys Gln Gly Arg Glu
690 695 700
Phe Thr Lys Arg Leu Thr Pro Ser Trp Arg Glu Ala Leu Glu Leu Glu
705 710 715 720
Leu Thr Arg Leu Glu Ala Tyr Cys Gly Arg Val Pro Asp Asp Glu Trp
725 730 735
Ser Arg Ile Val Asp Arg Thr Val Ile Ala Leu Trp Arg Arg Met Gly
740 745 750
Lys Gln Val Arg Asp Trp Arg Lys Gln Val Lys Ser Gly Ala Lys Val
755 760 765
Lys Val Lys Gly Tyr Gln Leu Asp Val Val Gly Gly Asn Ser Leu Ala
770 775 780
Gln Ile Asp Tyr Leu Glu Gln Gln Tyr Lys Phe Leu Arg Arg Trp Ser
785 790 795 800
Phe Phe Ala Arg Ala Ser Gly Leu Val Val Arg Ala Asp Arg Glu Ser
805 810 815
His Phe Ala Val Ala Leu Arg Gln His Ile Glu Asn Ala Lys Arg Asp
820 825 830
Arg Leu Lys Lys Leu Ala Asp Arg Ile Leu Met Glu Ala Leu Gly Tyr
835 840 845
Val Tyr Glu Ala Ser Gly Pro Arg Glu Gly Gln Trp Thr Ala Gln His
850 855 860
Pro Pro Cys Gln Leu Ile Ile Leu Glu Glu Leu Ser Ala Tyr Arg Phe
865 870 875 880
Ser Asp Asp Arg Pro Pro Ser Glu Asn Ser Lys Leu Met Ala Trp Gly
885 890 895
His Arg Gly Ile Leu Glu Glu Leu Val Asn Gln Ala Gln Val His Asp
900 905 910
Val Leu Val Gly Thr Val Tyr Ala Ala Phe Ser Ser Arg Phe Asp Ala
915 920 925
Arg Thr Gly Ala Pro Gly Val Arg Cys Arg Arg Val Pro Ala Arg Phe
930 935 940
Val Gly Ala Thr Val Asp Asp Ser Leu Pro Leu Trp Leu Thr Glu Phe
945 950 955 960
Leu Asp Lys His Arg Leu Asp Lys Asn Leu Leu Arg Pro Asp Asp Val
965 970 975
Ile Pro Thr Gly Glu Gly Glu Phe Leu Val Ser Pro Cys Gly Glu Glu
980 985 990
Ala Ala Arg Val Arg Gln Val His Ala Asp Ile Asn Ala Ala Gln Asn
995 1000 1005
Leu Gln Arg Arg Leu Trp Gln Asn Phe Asp Ile Thr Glu Leu Arg
1010 1015 1020
Leu Arg Cys Asp Val Lys Met Gly Gly Glu Gly Thr Val Leu Val
1025 1030 1035
Pro Arg Val Asn Asn Ala Arg Ala Lys Gln Leu Phe Gly Lys Lys
1040 1045 1050
Val Leu Val Ser Gln Asp Gly Val Thr Phe Phe Glu Arg Ser Gln
1055 1060 1065
Thr Gly Gly Lys Pro His Ser Glu Lys Gln Thr Asp Leu Thr Asp
1070 1075 1080
Lys Glu Leu Glu Leu Ile Ala Glu Ala Asp Glu Ala Arg Ala Lys
1085 1090 1095
Ser Val Val Leu Phe Arg Asp Pro Ser Gly His Ile Gly Lys Gly
1100 1105 1110
His Trp Ile Arg Gln Arg Glu Phe Trp Ser Leu Val Lys Gln Arg
1115 1120 1125
Ile Glu Ser His Thr Ala Glu Arg Ile Arg Val Arg Gly Val Gly
1130 1135 1140
Ser Ser Leu Asp
1145
<210> 3
<211> 1146
<212> PRT
<213> Alicyclobacillus macrosporangiidus
<400> 3
Met Asn Val Ala Val Lys Ser Ile Lys Val Lys Leu Met Leu Gly His
1 5 10 15
Leu Pro Glu Ile Arg Glu Gly Leu Trp His Leu His Glu Ala Val Asn
20 25 30
Leu Gly Val Arg Tyr Tyr Thr Glu Trp Leu Ala Leu Leu Arg Gln Gly
35 40 45
Asn Leu Tyr Arg Arg Gly Lys Asp Gly Ala Gln Glu Cys Tyr Met Thr
50 55 60
Ala Glu Gln Cys Arg Gln Glu Leu Leu Val Arg Leu Arg Asp Arg Gln
65 70 75 80
Lys Arg Asn Gly His Thr Gly Asp Pro Gly Thr Asp Glu Glu Leu Leu
85 90 95
Gly Val Ala Arg Arg Leu Tyr Glu Leu Leu Val Pro Gln Ser Val Gly
100 105 110
Lys Lys Gly Gln Ala Gln Met Leu Ala Ser Gly Phe Leu Ser Pro Leu
115 120 125
Ala Asp Pro Lys Ser Glu Gly Gly Lys Gly Thr Ser Lys Ser Gly Arg
130 135 140
Lys Pro Ala Trp Met Gly Met Lys Glu Ala Gly Asp Ser Arg Trp Val
145 150 155 160
Glu Ala Lys Ala Arg Tyr Glu Ala Asn Lys Ala Lys Asp Pro Thr Lys
165 170 175
Gln Val Ile Ala Ser Leu Glu Met Tyr Gly Leu Arg Pro Leu Phe Asp
180 185 190
Val Phe Thr Glu Thr Tyr Lys Thr Ile Arg Trp Met Pro Leu Gly Lys
195 200 205
His Gln Gly Val Arg Ala Trp Asp Arg Asp Met Phe Gln Gln Ser Leu
210 215 220
Glu Arg Leu Met Ser Trp Glu Ser Trp Asn Glu Arg Val Gly Ala Glu
225 230 235 240
Phe Ala Arg Leu Val Asp Arg Arg Asp Arg Phe Arg Glu Lys His Phe
245 250 255
Thr Gly Gln Glu His Leu Val Ala Leu Ala Gln Arg Leu Glu Gln Glu
260 265 270
Met Lys Glu Ala Ser Pro Gly Phe Glu Ser Lys Ser Ser Gln Ala His
275 280 285
Arg Ile Thr Lys Arg Ala Leu Arg Gly Ala Asp Gly Ile Ile Asp Asp
290 295 300
Trp Leu Lys Leu Ser Glu Gly Glu Pro Val Asp Arg Phe Asp Glu Ile
305 310 315 320
Leu Arg Lys Arg Gln Ala Gln Asn Pro Arg Arg Phe Gly Ser His Asp
325 330 335
Leu Phe Leu Lys Leu Ala Glu Pro Val Phe Gln Pro Leu Trp Arg Glu
340 345 350
Asp Pro Ser Phe Leu Ser Arg Trp Ala Ser Tyr Asn Glu Val Leu Asn
355 360 365
Lys Leu Glu Asp Ala Lys Gln Phe Ala Thr Phe Thr Leu Pro Ser Pro
370 375 380
Cys Ser Asn Pro Val Trp Ala Arg Phe Glu Asn Ala Glu Gly Thr Asn
385 390 395 400
Ile Phe Lys Tyr Asp Phe Leu Phe Asp His Phe Gly Lys Gly Arg His
405 410 415
Gly Val Arg Phe Gln Arg Met Ile Val Met Arg Asp Gly Val Pro Thr
420 425 430
Glu Val Glu Gly Ile Val Val Pro Ile Ala Pro Ser Arg Gln Leu Asp
435 440 445
Ala Leu Ala Pro Asn Asp Ala Ala Ser Pro Ile Asp Val Phe Val Gly
450 455 460
Asp Pro Ala Ala Pro Gly Ala Phe Arg Gly Gln Phe Gly Gly Ala Lys
465 470 475 480
Ile Gln Tyr Arg Arg Ser Ala Leu Val Arg Lys Gly Arg Arg Glu Glu
485 490 495
Lys Ala Tyr Leu Cys Gly Phe Arg Leu Pro Ser Gln Arg Arg Thr Gly
500 505 510
Thr Pro Ala Asp Asp Ala Gly Glu Val Phe Leu Asn Leu Ser Leu Arg
515 520 525
Val Glu Ser Gln Ser Glu Gln Ala Gly Arg Arg Asn Pro Pro Tyr Ala
530 535 540
Ala Val Phe His Ile Ser Asp Gln Thr Arg Arg Val Ile Val Arg Tyr
545 550 555 560
Gly Glu Ile Glu Arg Tyr Leu Ala Glu His Pro Asp Thr Gly Ile Pro
565 570 575
Gly Ser Arg Gly Leu Thr Ser Gly Leu Arg Val Met Ser Val Asp Leu
580 585 590
Gly Leu Arg Thr Ser Ala Ala Ile Ser Val Phe Arg Val Ala His Arg
595 600 605
Asp Glu Leu Thr Pro Asp Ala His Gly Arg Gln Pro Phe Phe Phe Pro
610 615 620
Ile His Gly Met Asp His Leu Val Ala Leu His Glu Arg Ser His Leu
625 630 635 640
Ile Arg Leu Pro Gly Glu Thr Glu Ser Lys Lys Val Arg Ser Ile Arg
645 650 655
Glu Gln Arg Leu Asp Arg Leu Asn Arg Leu Arg Ser Gln Met Ala Ser
660 665 670
Leu Arg Leu Leu Val Arg Thr Gly Val Leu Asp Glu Gln Lys Arg Asp
675 680 685
Arg Asn Trp Glu Arg Leu Gln Ser Ser Met Glu Arg Gly Gly Glu Arg
690 695 700
Met Pro Ser Asp Trp Trp Asp Leu Phe Gln Ala Gln Val Arg Tyr Leu
705 710 715 720
Ala Gln His Arg Asp Ala Ser Gly Glu Ala Trp Gly Arg Met Val Gln
725 730 735
Ala Ala Val Arg Thr Leu Trp Arg Gln Leu Ala Lys Gln Val Arg Asp
740 745 750
Trp Arg Lys Glu Val Arg Arg Asn Ala Asp Lys Val Lys Ile Arg Gly
755 760 765
Ile Ala Arg Asp Val Pro Gly Gly His Ser Leu Ala Gln Leu Asp Tyr
770 775 780
Leu Glu Arg Gln Tyr Arg Phe Leu Arg Ser Trp Ser Ala Phe Ser Val
785 790 795 800
Gln Ala Gly Gln Val Val Arg Ala Glu Arg Asp Ser Arg Phe Ala Val
805 810 815
Ala Leu Arg Glu His Ile Asp Asn Gly Lys Lys Asp Arg Leu Lys Lys
820 825 830
Leu Ala Asp Arg Ile Leu Met Glu Ala Leu Gly Tyr Val Tyr Val Thr
835 840 845
Asp Gly Arg Arg Ala Gly Gln Trp Gln Ala Val Tyr Pro Pro Cys Gln
850 855 860
Leu Val Leu Leu Glu Glu Leu Ser Glu Tyr Arg Phe Ser Asn Asp Arg
865 870 875 880
Pro Pro Ser Glu Asn Ser Gln Leu Met Val Trp Ser His Arg Gly Val
885 890 895
Leu Glu Glu Leu Ile His Gln Ala Gln Val His Asp Val Leu Val Gly
900 905 910
Thr Ile Pro Ala Ala Phe Ser Ser Arg Phe Asp Ala Arg Thr Gly Ala
915 920 925
Pro Gly Ile Arg Cys Arg Arg Val Pro Ser Ile Pro Leu Lys Asp Ala
930 935 940
Pro Ser Ile Pro Ile Trp Leu Ser His Tyr Leu Lys Gln Thr Glu Arg
945 950 955 960
Asp Ala Ala Ala Leu Arg Pro Gly Glu Leu Ile Pro Thr Gly Asp Gly
965 970 975
Glu Phe Leu Val Thr Pro Ala Gly Arg Gly Ala Ser Gly Val Arg Val
980 985 990
Val His Ala Asp Ile Asn Ala Ala His Asn Leu Gln Arg Arg Leu Trp
995 1000 1005
Glu Asn Phe Asp Leu Ser Asp Ile Arg Val Arg Cys Asp Arg Arg
1010 1015 1020
Glu Gly Lys Asp Gly Thr Val Val Leu Ile Pro Arg Leu Thr Asn
1025 1030 1035
Gln Arg Val Lys Glu Arg Tyr Ser Gly Val Ile Phe Thr Ser Glu
1040 1045 1050
Asp Gly Val Ser Phe Thr Val Gly Asp Ala Lys Thr Arg Arg Arg
1055 1060 1065
Ser Ser Ala Ser Gln Gly Glu Gly Asp Asp Leu Ser Asp Glu Glu
1070 1075 1080
Gln Glu Leu Leu Ala Glu Ala Asp Asp Ala Arg Glu Arg Ser Val
1085 1090 1095
Val Leu Phe Arg Asp Pro Ser Gly Phe Val Asn Gly Gly Arg Trp
1100 1105 1110
Thr Ala Gln Arg Ala Phe Trp Gly Met Val His Asn Arg Ile Glu
1115 1120 1125
Thr Leu Leu Ala Glu Arg Phe Ser Val Ser Gly Ala Ala Glu Lys
1130 1135 1140
Val Arg Gly
1145
<210> 4
<211> 1108
<212> PRT
<213> Bacillus hisashii
<400> 4
Met Ala Thr Arg Ser Phe Ile Leu Lys Ile Glu Pro Asn Glu Glu Val
1 5 10 15
Lys Lys Gly Leu Trp Lys Thr His Glu Val Leu Asn His Gly Ile Ala
20 25 30
Tyr Tyr Met Asn Ile Leu Lys Leu Ile Arg Gln Glu Ala Ile Tyr Glu
35 40 45
His His Glu Gln Asp Pro Lys Asn Pro Lys Lys Val Ser Lys Ala Glu
50 55 60
Ile Gln Ala Glu Leu Trp Asp Phe Val Leu Lys Met Gln Lys Cys Asn
65 70 75 80
Ser Phe Thr His Glu Val Asp Lys Asp Glu Val Phe Asn Ile Leu Arg
85 90 95
Glu Leu Tyr Glu Glu Leu Val Pro Ser Ser Val Glu Lys Lys Gly Glu
100 105 110
Ala Asn Gln Leu Ser Asn Lys Phe Leu Tyr Pro Leu Val Asp Pro Asn
115 120 125
Ser Gln Ser Gly Lys Gly Thr Ala Ser Ser Gly Arg Lys Pro Arg Trp
130 135 140
Tyr Asn Leu Lys Ile Ala Gly Asp Pro Ser Trp Glu Glu Glu Lys Lys
145 150 155 160
Lys Trp Glu Glu Asp Lys Lys Lys Asp Pro Leu Ala Lys Ile Leu Gly
165 170 175
Lys Leu Ala Glu Tyr Gly Leu Ile Pro Leu Phe Ile Pro Tyr Thr Asp
180 185 190
Ser Asn Glu Pro Ile Val Lys Glu Ile Lys Trp Met Glu Lys Ser Arg
195 200 205
Asn Gln Ser Val Arg Arg Leu Asp Lys Asp Met Phe Ile Gln Ala Leu
210 215 220
Glu Arg Phe Leu Ser Trp Glu Ser Trp Asn Leu Lys Val Lys Glu Glu
225 230 235 240
Tyr Glu Lys Val Glu Lys Glu Tyr Lys Thr Leu Glu Glu Arg Ile Lys
245 250 255
Glu Asp Ile Gln Ala Leu Lys Ala Leu Glu Gln Tyr Glu Lys Glu Arg
260 265 270
Gln Glu Gln Leu Leu Arg Asp Thr Leu Asn Thr Asn Glu Tyr Arg Leu
275 280 285
Ser Lys Arg Gly Leu Arg Gly Trp Arg Glu Ile Ile Gln Lys Trp Leu
290 295 300
Lys Met Asp Glu Asn Glu Pro Ser Glu Lys Tyr Leu Glu Val Phe Lys
305 310 315 320
Asp Tyr Gln Arg Lys His Pro Arg Glu Ala Gly Asp Tyr Ser Val Tyr
325 330 335
Glu Phe Leu Ser Lys Lys Glu Asn His Phe Ile Trp Arg Asn His Pro
340 345 350
Glu Tyr Pro Tyr Leu Tyr Ala Thr Phe Cys Glu Ile Asp Lys Lys Lys
355 360 365
Lys Asp Ala Lys Gln Gln Ala Thr Phe Thr Leu Ala Asp Pro Ile Asn
370 375 380
His Pro Leu Trp Val Arg Phe Glu Glu Arg Ser Gly Ser Asn Leu Asn
385 390 395 400
Lys Tyr Arg Ile Leu Thr Glu Gln Leu His Thr Glu Lys Leu Lys Lys
405 410 415
Lys Leu Thr Val Gln Leu Asp Arg Leu Ile Tyr Pro Thr Glu Ser Gly
420 425 430
Gly Trp Glu Glu Lys Gly Lys Val Asp Ile Val Leu Leu Pro Ser Arg
435 440 445
Gln Phe Tyr Asn Gln Ile Phe Leu Asp Ile Glu Glu Lys Gly Lys His
450 455 460
Ala Phe Thr Tyr Lys Asp Glu Ser Ile Lys Phe Pro Leu Lys Gly Thr
465 470 475 480
Leu Gly Gly Ala Arg Val Gln Phe Asp Arg Asp His Leu Arg Arg Tyr
485 490 495
Pro His Lys Val Glu Ser Gly Asn Val Gly Arg Ile Tyr Phe Asn Met
500 505 510
Thr Val Asn Ile Glu Pro Thr Glu Ser Pro Val Ser Lys Ser Leu Lys
515 520 525
Ile His Arg Asp Asp Phe Pro Lys Val Val Asn Phe Lys Pro Lys Glu
530 535 540
Leu Thr Glu Trp Ile Lys Asp Ser Lys Gly Lys Lys Leu Lys Ser Gly
545 550 555 560
Ile Glu Ser Leu Glu Ile Gly Leu Arg Val Met Ser Ile Asp Leu Gly
565 570 575
Gln Arg Gln Ala Ala Ala Ala Ser Ile Phe Glu Val Val Asp Gln Lys
580 585 590
Pro Asp Ile Glu Gly Lys Leu Phe Phe Pro Ile Lys Gly Thr Glu Leu
595 600 605
Tyr Ala Val His Arg Ala Ser Phe Asn Ile Lys Leu Pro Gly Glu Thr
610 615 620
Leu Val Lys Ser Arg Glu Val Leu Arg Lys Ala Arg Glu Asp Asn Leu
625 630 635 640
Lys Leu Met Asn Gln Lys Leu Asn Phe Leu Arg Asn Val Leu His Phe
645 650 655
Gln Gln Phe Glu Asp Ile Thr Glu Arg Glu Lys Arg Val Thr Lys Trp
660 665 670
Ile Ser Arg Gln Glu Asn Ser Asp Val Pro Leu Val Tyr Gln Asp Glu
675 680 685
Leu Ile Gln Ile Arg Glu Leu Met Tyr Lys Pro Tyr Lys Asp Trp Val
690 695 700
Ala Phe Leu Lys Gln Leu His Lys Arg Leu Glu Val Glu Ile Gly Lys
705 710 715 720
Glu Val Lys His Trp Arg Lys Ser Leu Ser Asp Gly Arg Lys Gly Leu
725 730 735
Tyr Gly Ile Ser Leu Lys Asn Ile Asp Glu Ile Asp Arg Thr Arg Lys
740 745 750
Phe Leu Leu Arg Trp Ser Leu Arg Pro Thr Glu Pro Gly Glu Val Arg
755 760 765
Arg Leu Glu Pro Gly Gln Arg Phe Ala Ile Asp Gln Leu Asn His Leu
770 775 780
Asn Ala Leu Lys Glu Asp Arg Leu Lys Lys Met Ala Asn Thr Ile Ile
785 790 795 800
Met His Ala Leu Gly Tyr Cys Tyr Asp Val Arg Lys Lys Lys Trp Gln
805 810 815
Ala Lys Asn Pro Ala Cys Gln Ile Ile Leu Phe Glu Asp Leu Ser Asn
820 825 830
Tyr Asn Pro Tyr Glu Glu Arg Ser Arg Phe Glu Asn Ser Lys Leu Met
835 840 845
Lys Trp Ser Arg Arg Glu Ile Pro Arg Gln Val Ala Leu Gln Gly Glu
850 855 860
Ile Tyr Gly Leu Gln Val Gly Glu Val Gly Ala Gln Phe Ser Ser Arg
865 870 875 880
Phe His Ala Lys Thr Gly Ser Pro Gly Ile Arg Cys Ser Val Val Thr
885 890 895
Lys Glu Lys Leu Gln Asp Asn Arg Phe Phe Lys Asn Leu Gln Arg Glu
900 905 910
Gly Arg Leu Thr Leu Asp Lys Ile Ala Val Leu Lys Glu Gly Asp Leu
915 920 925
Tyr Pro Asp Lys Gly Gly Glu Lys Phe Ile Ser Leu Ser Lys Asp Arg
930 935 940
Lys Cys Val Thr Thr His Ala Asp Ile Asn Ala Ala Gln Asn Leu Gln
945 950 955 960
Lys Arg Phe Trp Thr Arg Thr His Gly Phe Tyr Lys Val Tyr Cys Lys
965 970 975
Ala Tyr Gln Val Asp Gly Gln Thr Val Tyr Ile Pro Glu Ser Lys Asp
980 985 990
Gln Lys Gln Lys Ile Ile Glu Glu Phe Gly Glu Gly Tyr Phe Ile Leu
995 1000 1005
Lys Asp Gly Val Tyr Glu Trp Val Asn Ala Gly Lys Leu Lys Ile
1010 1015 1020
Lys Lys Gly Ser Ser Lys Gln Ser Ser Ser Glu Leu Val Asp Ser
1025 1030 1035
Asp Ile Leu Lys Asp Ser Phe Asp Leu Ala Ser Glu Leu Lys Gly
1040 1045 1050
Glu Lys Leu Met Leu Tyr Arg Asp Pro Ser Gly Asn Val Phe Pro
1055 1060 1065
Ser Asp Lys Trp Met Ala Ala Gly Val Phe Phe Gly Lys Leu Glu
1070 1075 1080
Arg Ile Leu Ile Ser Lys Leu Thr Asn Gln Tyr Ser Ile Ser Thr
1085 1090 1095
Ile Glu Asp Asp Ser Ser Lys Gln Ser Met
1100 1105
<210> 5
<211> 1108
<212> PRT
<213> Bacillus
<400> 5
Met Ala Ile Arg Ser Ile Lys Leu Lys Leu Lys Thr His Thr Gly Pro
1 5 10 15
Glu Ala Gln Asn Leu Arg Lys Gly Ile Trp Arg Thr His Arg Leu Leu
20 25 30
Asn Glu Gly Val Ala Tyr Tyr Met Lys Met Leu Leu Leu Phe Arg Gln
35 40 45
Glu Ser Thr Gly Glu Arg Pro Lys Glu Glu Leu Gln Glu Glu Leu Ile
50 55 60
Cys His Ile Arg Glu Gln Gln Gln Arg Asn Gln Ala Asp Lys Asn Thr
65 70 75 80
Gln Ala Leu Pro Leu Asp Lys Ala Leu Glu Ala Leu Arg Gln Leu Tyr
85 90 95
Glu Leu Leu Val Pro Ser Ser Val Gly Gln Ser Gly Asp Ala Gln Ile
100 105 110
Ile Ser Arg Lys Phe Leu Ser Pro Leu Val Asp Pro Asn Ser Glu Gly
115 120 125
Gly Lys Gly Thr Ser Lys Ala Gly Ala Lys Pro Thr Trp Gln Lys Lys
130 135 140
Lys Glu Ala Asn Asp Pro Thr Trp Glu Gln Asp Tyr Glu Lys Trp Lys
145 150 155 160
Lys Arg Arg Glu Glu Asp Pro Thr Ala Ser Val Ile Thr Thr Leu Glu
165 170 175
Glu Tyr Gly Ile Arg Pro Ile Phe Pro Leu Tyr Thr Asn Thr Val Thr
180 185 190
Asp Ile Ala Trp Leu Pro Leu Gln Ser Asn Gln Phe Val Arg Thr Trp
195 200 205
Asp Arg Asp Met Leu Gln Gln Ala Ile Glu Arg Leu Leu Ser Trp Glu
210 215 220
Ser Trp Asn Lys Arg Val Gln Glu Glu Tyr Ala Lys Leu Lys Glu Lys
225 230 235 240
Met Ala Gln Leu Asn Glu Gln Leu Glu Gly Gly Gln Glu Trp Ile Ser
245 250 255
Leu Leu Glu Gln Tyr Glu Glu Asn Arg Glu Arg Glu Leu Arg Glu Asn
260 265 270
Met Thr Ala Ala Asn Asp Lys Tyr Arg Ile Thr Lys Arg Gln Met Lys
275 280 285
Gly Trp Asn Glu Leu Tyr Glu Leu Trp Ser Thr Phe Pro Ala Ser Ala
290 295 300
Ser His Glu Gln Tyr Lys Glu Ala Leu Lys Arg Val Gln Gln Arg Leu
305 310 315 320
Arg Gly Arg Phe Gly Asp Ala His Phe Phe Gln Tyr Leu Met Glu Glu
325 330 335
Lys Asn Arg Leu Ile Trp Lys Gly Asn Pro Gln Arg Ile His Tyr Phe
340 345 350
Val Ala Arg Asn Glu Leu Thr Lys Arg Leu Glu Glu Ala Lys Gln Ser
355 360 365
Ala Thr Met Thr Leu Pro Asn Ala Arg Lys His Pro Leu Trp Val Arg
370 375 380
Phe Asp Ala Arg Gly Gly Asn Leu Gln Asp Tyr Tyr Leu Thr Ala Glu
385 390 395 400
Ala Asp Lys Pro Arg Ser Arg Arg Phe Val Thr Phe Ser Gln Leu Ile
405 410 415
Trp Pro Ser Glu Ser Gly Trp Met Glu Lys Lys Asp Val Glu Val Glu
420 425 430
Leu Ala Leu Ser Arg Gln Phe Tyr Gln Gln Val Lys Leu Leu Lys Asn
435 440 445
Asp Lys Gly Lys Gln Lys Ile Glu Phe Lys Asp Lys Gly Ser Gly Ser
450 455 460
Thr Phe Asn Gly His Leu Gly Gly Ala Lys Leu Gln Leu Glu Arg Gly
465 470 475 480
Asp Leu Glu Lys Glu Glu Lys Asn Phe Glu Asp Gly Glu Ile Gly Ser
485 490 495
Val Tyr Leu Asn Val Val Ile Asp Phe Glu Pro Leu Gln Glu Val Lys
500 505 510
Asn Gly Arg Val Gln Ala Pro Tyr Gly Gln Val Leu Gln Leu Ile Arg
515 520 525
Arg Pro Asn Glu Phe Pro Lys Val Thr Thr Tyr Lys Ser Glu Gln Leu
530 535 540
Val Glu Trp Ile Lys Ala Ser Pro Gln His Ser Ala Gly Val Glu Ser
545 550 555 560
Leu Ala Ser Gly Phe Arg Val Met Ser Ile Asp Leu Gly Leu Arg Ala
565 570 575
Ala Ala Ala Thr Ser Ile Phe Ser Val Glu Glu Ser Ser Asp Lys Asn
580 585 590
Ala Ala Asp Phe Ser Tyr Trp Ile Glu Gly Thr Pro Leu Val Ala Val
595 600 605
His Gln Arg Ser Tyr Met Leu Arg Leu Pro Gly Glu Gln Val Glu Lys
610 615 620
Gln Val Met Glu Lys Arg Asp Glu Arg Phe Gln Leu His Gln Arg Val
625 630 635 640
Lys Phe Gln Ile Arg Val Leu Ala Gln Ile Met Arg Met Ala Asn Lys
645 650 655
Gln Tyr Gly Asp Arg Trp Asp Glu Leu Asp Ser Leu Lys Gln Ala Val
660 665 670
Glu Gln Lys Lys Ser Pro Leu Asp Gln Thr Asp Arg Thr Phe Trp Glu
675 680 685
Gly Ile Val Cys Asp Leu Thr Lys Val Leu Pro Arg Asn Glu Ala Asp
690 695 700
Trp Glu Gln Ala Val Val Gln Ile His Arg Lys Ala Glu Glu Tyr Val
705 710 715 720
Gly Lys Ala Val Gln Ala Trp Arg Lys Arg Phe Ala Ala Asp Glu Arg
725 730 735
Lys Gly Ile Ala Gly Leu Ser Met Trp Asn Ile Glu Glu Leu Glu Gly
740 745 750
Leu Arg Lys Leu Leu Ile Ser Trp Ser Arg Arg Thr Arg Asn Pro Gln
755 760 765
Glu Val Asn Arg Phe Glu Arg Gly His Thr Ser His Gln Arg Leu Leu
770 775 780
Thr His Ile Gln Asn Val Lys Glu Asp Arg Leu Lys Gln Leu Ser His
785 790 795 800
Ala Ile Val Met Thr Ala Leu Gly Tyr Val Tyr Asp Glu Arg Lys Gln
805 810 815
Glu Trp Cys Ala Glu Tyr Pro Ala Cys Gln Val Ile Leu Phe Glu Asn
820 825 830
Leu Ser Gln Tyr Arg Ser Asn Leu Asp Arg Ser Thr Lys Glu Asn Ser
835 840 845
Thr Leu Met Lys Trp Ala His Arg Ser Ile Pro Lys Tyr Val His Met
850 855 860
Gln Ala Glu Pro Tyr Gly Ile Gln Ile Gly Asp Val Arg Ala Glu Tyr
865 870 875 880
Ser Ser Arg Phe Tyr Ala Lys Thr Gly Thr Pro Gly Ile Arg Cys Lys
885 890 895
Lys Val Arg Gly Gln Asp Leu Gln Gly Arg Arg Phe Glu Asn Leu Gln
900 905 910
Lys Arg Leu Val Asn Glu Gln Phe Leu Thr Glu Glu Gln Val Lys Gln
915 920 925
Leu Arg Pro Gly Asp Ile Val Pro Asp Asp Ser Gly Glu Leu Phe Met
930 935 940
Thr Leu Thr Asp Gly Ser Gly Ser Lys Glu Val Val Phe Leu Gln Ala
945 950 955 960
Asp Ile Asn Ala Ala His Asn Leu Gln Lys Arg Phe Trp Gln Arg Tyr
965 970 975
Asn Glu Leu Phe Lys Val Ser Cys Arg Val Ile Val Arg Asp Glu Glu
980 985 990
Glu Tyr Leu Val Pro Lys Thr Lys Ser Val Gln Ala Lys Leu Gly Lys
995 1000 1005
Gly Leu Phe Val Lys Lys Ser Asp Thr Ala Trp Lys Asp Val Tyr
1010 1015 1020
Val Trp Asp Ser Gln Ala Lys Leu Lys Gly Lys Thr Thr Phe Thr
1025 1030 1035
Glu Glu Ser Glu Ser Pro Glu Gln Leu Glu Asp Phe Gln Glu Ile
1040 1045 1050
Ile Glu Glu Ala Glu Glu Ala Lys Gly Thr Tyr Arg Thr Leu Phe
1055 1060 1065
Arg Asp Pro Ser Gly Val Phe Phe Pro Glu Ser Val Trp Tyr Pro
1070 1075 1080
Gln Lys Asp Phe Trp Gly Glu Val Lys Arg Lys Leu Tyr Gly Lys
1085 1090 1095
Leu Arg Glu Arg Phe Leu Thr Lys Ala Arg
1100 1105
<210> 6
<211> 1112
<212> PRT
<213> Bacillus
<400> 6
Met Ala Ile Arg Ser Ile Lys Leu Lys Met Lys Thr Asn Ser Gly Thr
1 5 10 15
Asp Ser Ile Tyr Leu Arg Lys Ala Leu Trp Arg Thr His Gln Leu Ile
20 25 30
Asn Glu Gly Ile Ala Tyr Tyr Met Asn Leu Leu Thr Leu Tyr Arg Gln
35 40 45
Glu Ala Ile Gly Asp Lys Thr Lys Glu Ala Tyr Gln Ala Glu Leu Ile
50 55 60
Asn Ile Ile Arg Asn Gln Gln Arg Asn Asn Gly Ser Ser Glu Glu His
65 70 75 80
Gly Ser Asp Gln Glu Ile Leu Ala Leu Leu Arg Gln Leu Tyr Glu Leu
85 90 95
Ile Ile Pro Ser Ser Ile Gly Glu Ser Gly Asp Ala Asn Gln Leu Gly
100 105 110
Asn Lys Phe Leu Tyr Pro Leu Val Asp Pro Asn Ser Gln Ser Gly Lys
115 120 125
Gly Thr Ser Asn Ala Gly Arg Lys Pro Arg Trp Lys Arg Leu Lys Glu
130 135 140
Glu Gly Asn Pro Asp Trp Glu Leu Glu Lys Lys Lys Asp Glu Glu Arg
145 150 155 160
Lys Ala Lys Asp Pro Thr Val Lys Ile Phe Asp Asn Leu Asn Lys Tyr
165 170 175
Gly Leu Leu Pro Leu Phe Pro Leu Phe Thr Asn Ile Gln Lys Asp Ile
180 185 190
Glu Trp Leu Pro Leu Gly Lys Arg Gln Ser Val Arg Lys Trp Asp Lys
195 200 205
Asp Met Phe Ile Gln Ala Ile Glu Arg Leu Leu Ser Trp Glu Ser Trp
210 215 220
Asn Arg Arg Val Ala Asp Glu Tyr Lys Gln Leu Lys Glu Lys Thr Glu
225 230 235 240
Ser Tyr Tyr Lys Glu His Leu Thr Gly Gly Glu Glu Trp Ile Glu Lys
245 250 255
Ile Arg Lys Phe Glu Lys Glu Arg Asn Met Glu Leu Glu Lys Asn Ala
260 265 270
Phe Ala Pro Asn Asp Gly Tyr Phe Ile Thr Ser Arg Gln Ile Arg Gly
275 280 285
Trp Asp Arg Val Tyr Glu Lys Trp Ser Lys Leu Pro Glu Ser Ala Ser
290 295 300
Pro Glu Glu Leu Trp Lys Val Val Ala Glu Gln Gln Asn Lys Met Ser
305 310 315 320
Glu Gly Phe Gly Asp Pro Lys Val Phe Ser Phe Leu Ala Asn Arg Glu
325 330 335
Asn Arg Asp Ile Trp Arg Gly His Ser Glu Arg Ile Tyr His Ile Ala
340 345 350
Ala Tyr Asn Gly Leu Gln Lys Lys Leu Ser Arg Thr Lys Glu Gln Ala
355 360 365
Thr Phe Thr Leu Pro Asp Ala Ile Glu His Pro Leu Trp Ile Arg Tyr
370 375 380
Glu Ser Pro Gly Gly Thr Asn Leu Asn Leu Phe Lys Leu Glu Glu Lys
385 390 395 400
Gln Lys Lys Asn Tyr Tyr Val Thr Leu Ser Lys Ile Ile Trp Pro Ser
405 410 415
Glu Glu Lys Trp Ile Glu Lys Glu Asn Ile Glu Ile Pro Leu Ala Pro
420 425 430
Ser Ile Gln Phe Asn Arg Gln Ile Lys Leu Lys Gln His Val Lys Gly
435 440 445
Lys Gln Glu Ile Ser Phe Ser Asp Tyr Ser Ser Arg Ile Ser Leu Asp
450 455 460
Gly Val Leu Gly Gly Ser Arg Ile Gln Phe Asn Arg Lys Tyr Ile Lys
465 470 475 480
Asn His Lys Glu Leu Leu Gly Glu Gly Asp Ile Gly Pro Val Phe Phe
485 490 495
Asn Leu Val Val Asp Val Ala Pro Leu Gln Glu Thr Arg Asn Gly Arg
500 505 510
Leu Gln Ser Pro Ile Gly Lys Ala Leu Lys Val Ile Ser Ser Asp Phe
515 520 525
Ser Lys Val Ile Asp Tyr Lys Pro Lys Glu Leu Met Asp Trp Met Asn
530 535 540
Thr Gly Ser Ala Ser Asn Ser Phe Gly Val Ala Ser Leu Leu Glu Gly
545 550 555 560
Met Arg Val Met Ser Ile Asp Met Gly Gln Arg Thr Ser Ala Ser Val
565 570 575
Ser Ile Phe Glu Val Val Lys Glu Leu Pro Lys Asp Gln Glu Gln Lys
580 585 590
Leu Phe Tyr Ser Ile Asn Asp Thr Glu Leu Phe Ala Ile His Lys Arg
595 600 605
Ser Phe Leu Leu Asn Leu Pro Gly Glu Val Val Thr Lys Asn Asn Lys
610 615 620
Gln Gln Arg Gln Glu Arg Arg Lys Lys Arg Gln Phe Val Arg Ser Gln
625 630 635 640
Ile Arg Met Leu Ala Asn Val Leu Arg Leu Glu Thr Lys Lys Thr Pro
645 650 655
Asp Glu Arg Lys Lys Ala Ile His Lys Leu Met Glu Ile Val Gln Ser
660 665 670
Tyr Asp Ser Trp Thr Ala Ser Gln Lys Glu Val Trp Glu Lys Glu Leu
675 680 685
Asn Leu Leu Thr Asn Met Ala Ala Phe Asn Asp Glu Ile Trp Lys Glu
690 695 700
Ser Leu Val Glu Leu His His Arg Ile Glu Pro Tyr Val Gly Gln Ile
705 710 715 720
Val Ser Lys Trp Arg Lys Gly Leu Ser Glu Gly Arg Lys Asn Leu Ala
725 730 735
Gly Ile Ser Met Trp Asn Ile Asp Glu Leu Glu Asp Thr Arg Arg Leu
740 745 750
Leu Ile Ser Trp Ser Lys Arg Ser Arg Thr Pro Gly Glu Ala Asn Arg
755 760 765
Ile Glu Thr Asp Glu Pro Phe Gly Ser Ser Leu Leu Gln His Ile Gln
770 775 780
Asn Val Lys Asp Asp Arg Leu Lys Gln Met Ala Asn Leu Ile Ile Met
785 790 795 800
Thr Ala Leu Gly Phe Lys Tyr Asp Lys Glu Glu Lys Asp Arg Tyr Lys
805 810 815
Arg Trp Lys Glu Thr Tyr Pro Ala Cys Gln Ile Ile Leu Phe Glu Asn
820 825 830
Leu Asn Arg Tyr Leu Phe Asn Leu Asp Arg Ser Arg Arg Glu Asn Ser
835 840 845
Arg Leu Met Lys Trp Ala His Arg Ser Ile Pro Arg Thr Val Ser Met
850 855 860
Gln Gly Glu Met Phe Gly Leu Gln Val Gly Asp Val Arg Ser Glu Tyr
865 870 875 880
Ser Ser Arg Phe His Ala Lys Thr Gly Ala Pro Gly Ile Arg Cys His
885 890 895
Ala Leu Thr Glu Glu Asp Leu Lys Ala Gly Ser Asn Thr Leu Lys Arg
900 905 910
Leu Ile Glu Asp Gly Phe Ile Asn Glu Ser Glu Leu Ala Tyr Leu Lys
915 920 925
Lys Gly Asp Ile Ile Pro Ser Gln Gly Gly Glu Leu Phe Val Thr Leu
930 935 940
Ser Lys Arg Tyr Lys Lys Asp Ser Asp Asn Asn Glu Leu Thr Val Ile
945 950 955 960
His Ala Asp Ile Asn Ala Ala Gln Asn Leu Gln Lys Arg Phe Trp Gln
965 970 975
Gln Asn Ser Glu Val Tyr Arg Val Pro Cys Gln Leu Ala Arg Met Gly
980 985 990
Glu Asp Lys Leu Tyr Ile Pro Lys Ser Gln Thr Glu Thr Ile Lys Lys
995 1000 1005
Tyr Phe Gly Lys Gly Ser Phe Val Lys Asn Asn Thr Glu Gln Glu
1010 1015 1020
Val Tyr Lys Trp Glu Lys Ser Glu Lys Met Lys Ile Lys Thr Asp
1025 1030 1035
Thr Thr Phe Asp Leu Gln Asp Leu Asp Gly Phe Glu Asp Ile Ser
1040 1045 1050
Lys Thr Ile Glu Leu Ala Gln Glu Gln Gln Lys Lys Tyr Leu Thr
1055 1060 1065
Met Phe Arg Asp Pro Ser Gly Tyr Phe Phe Asn Asn Glu Thr Trp
1070 1075 1080
Arg Pro Gln Lys Glu Tyr Trp Ser Ile Val Asn Asn Ile Ile Lys
1085 1090 1095
Ser Cys Leu Lys Lys Lys Ile Leu Ser Asn Lys Val Glu Leu
1100 1105 1110
<210> 7
<211> 1149
<212> PRT
<213> Desulfovibrio inopinatus
<400> 7
Met Pro Thr Arg Thr Ile Asn Leu Lys Leu Val Leu Gly Lys Asn Pro
1 5 10 15
Glu Asn Ala Thr Leu Arg Arg Ala Leu Phe Ser Thr His Arg Leu Val
20 25 30
Asn Gln Ala Thr Lys Arg Ile Glu Glu Phe Leu Leu Leu Cys Arg Gly
35 40 45
Glu Ala Tyr Arg Thr Val Asp Asn Glu Gly Lys Glu Ala Glu Ile Pro
50 55 60
Arg His Ala Val Gln Glu Glu Ala Leu Ala Phe Ala Lys Ala Ala Gln
65 70 75 80
Arg His Asn Gly Cys Ile Ser Thr Tyr Glu Asp Gln Glu Ile Leu Asp
85 90 95
Val Leu Arg Gln Leu Tyr Glu Arg Leu Val Pro Ser Val Asn Glu Asn
100 105 110
Asn Glu Ala Gly Asp Ala Gln Ala Ala Asn Ala Trp Val Ser Pro Leu
115 120 125
Met Ser Ala Glu Ser Glu Gly Gly Leu Ser Val Tyr Asp Lys Val Leu
130 135 140
Asp Pro Pro Pro Val Trp Met Lys Leu Lys Glu Glu Lys Ala Pro Gly
145 150 155 160
Trp Glu Ala Ala Ser Gln Ile Trp Ile Gln Ser Asp Glu Gly Gln Ser
165 170 175
Leu Leu Asn Lys Pro Gly Ser Pro Pro Arg Trp Ile Arg Lys Leu Arg
180 185 190
Ser Gly Gln Pro Trp Gln Asp Asp Phe Val Ser Asp Gln Lys Lys Lys
195 200 205
Gln Asp Glu Leu Thr Lys Gly Asn Ala Pro Leu Ile Lys Gln Leu Lys
210 215 220
Glu Met Gly Leu Leu Pro Leu Val Asn Pro Phe Phe Arg His Leu Leu
225 230 235 240
Asp Pro Glu Gly Lys Gly Val Ser Pro Trp Asp Arg Leu Ala Val Arg
245 250 255
Ala Ala Val Ala His Phe Ile Ser Trp Glu Ser Trp Asn His Arg Thr
260 265 270
Arg Ala Glu Tyr Asn Ser Leu Lys Leu Arg Arg Asp Glu Phe Glu Ala
275 280 285
Ala Ser Asp Glu Phe Lys Asp Asp Phe Thr Leu Leu Arg Gln Tyr Glu
290 295 300
Ala Lys Arg His Ser Thr Leu Lys Ser Ile Ala Leu Ala Asp Asp Ser
305 310 315 320
Asn Pro Tyr Arg Ile Gly Val Arg Ser Leu Arg Ala Trp Asn Arg Val
325 330 335
Arg Glu Glu Trp Ile Asp Lys Gly Ala Thr Glu Glu Gln Arg Val Thr
340 345 350
Ile Leu Ser Lys Leu Gln Thr Gln Leu Arg Gly Lys Phe Gly Asp Pro
355 360 365
Asp Leu Phe Asn Trp Leu Ala Gln Asp Arg His Val His Leu Trp Ser
370 375 380
Pro Arg Asp Ser Val Thr Pro Leu Val Arg Ile Asn Ala Val Asp Lys
385 390 395 400
Val Leu Arg Arg Arg Lys Pro Tyr Ala Leu Met Thr Phe Ala His Pro
405 410 415
Arg Phe His Pro Arg Trp Ile Leu Tyr Glu Ala Pro Gly Gly Ser Asn
420 425 430
Leu Arg Gln Tyr Ala Leu Asp Cys Thr Glu Asn Ala Leu His Ile Thr
435 440 445
Leu Pro Leu Leu Val Asp Asp Ala His Gly Thr Trp Ile Glu Lys Lys
450 455 460
Ile Arg Val Pro Leu Ala Pro Ser Gly Gln Ile Gln Asp Leu Thr Leu
465 470 475 480
Glu Lys Leu Glu Lys Lys Lys Asn Arg Leu Tyr Tyr Arg Ser Gly Phe
485 490 495
Gln Gln Phe Ala Gly Leu Ala Gly Gly Ala Glu Val Leu Phe His Arg
500 505 510
Pro Tyr Met Glu His Asp Glu Arg Ser Glu Glu Ser Leu Leu Glu Arg
515 520 525
Pro Gly Ala Val Trp Phe Lys Leu Thr Leu Asp Val Ala Thr Gln Ala
530 535 540
Pro Pro Asn Trp Leu Asp Gly Lys Gly Arg Val Arg Thr Pro Pro Glu
545 550 555 560
Val His His Phe Lys Thr Ala Leu Ser Asn Lys Ser Lys His Thr Arg
565 570 575
Thr Leu Gln Pro Gly Leu Arg Val Leu Ser Val Asp Leu Gly Met Arg
580 585 590
Thr Phe Ala Ser Cys Ser Val Phe Glu Leu Ile Glu Gly Lys Pro Glu
595 600 605
Thr Gly Arg Ala Phe Pro Val Ala Asp Glu Arg Ser Met Asp Ser Pro
610 615 620
Asn Lys Leu Trp Ala Lys His Glu Arg Ser Phe Lys Leu Thr Leu Pro
625 630 635 640
Gly Glu Thr Pro Ser Arg Lys Glu Glu Glu Glu Arg Ser Ile Ala Arg
645 650 655
Ala Glu Ile Tyr Ala Leu Lys Arg Asp Ile Gln Arg Leu Lys Ser Leu
660 665 670
Leu Arg Leu Gly Glu Glu Asp Asn Asp Asn Arg Arg Asp Ala Leu Leu
675 680 685
Glu Gln Phe Phe Lys Gly Trp Gly Glu Glu Asp Val Val Pro Gly Gln
690 695 700
Ala Phe Pro Arg Ser Leu Phe Gln Gly Leu Gly Ala Ala Pro Phe Arg
705 710 715 720
Ser Thr Pro Glu Leu Trp Arg Gln His Cys Gln Thr Tyr Tyr Asp Lys
725 730 735
Ala Glu Ala Cys Leu Ala Lys His Ile Ser Asp Trp Arg Lys Arg Thr
740 745 750
Arg Pro Arg Pro Thr Ser Arg Glu Met Trp Tyr Lys Thr Arg Ser Tyr
755 760 765
His Gly Gly Lys Ser Ile Trp Met Leu Glu Tyr Leu Asp Ala Val Arg
770 775 780
Lys Leu Leu Leu Ser Trp Ser Leu Arg Gly Arg Thr Tyr Gly Ala Ile
785 790 795 800
Asn Arg Gln Asp Thr Ala Arg Phe Gly Ser Leu Ala Ser Arg Leu Leu
805 810 815
His His Ile Asn Ser Leu Lys Glu Asp Arg Ile Lys Thr Gly Ala Asp
820 825 830
Ser Ile Val Gln Ala Ala Arg Gly Tyr Ile Pro Leu Pro His Gly Lys
835 840 845
Gly Trp Glu Gln Arg Tyr Glu Pro Cys Gln Leu Ile Leu Phe Glu Asp
850 855 860
Leu Ala Arg Tyr Arg Phe Arg Val Asp Arg Pro Arg Arg Glu Asn Ser
865 870 875 880
Gln Leu Met Gln Trp Asn His Arg Ala Ile Val Ala Glu Thr Thr Met
885 890 895
Gln Ala Glu Leu Tyr Gly Gln Ile Val Glu Asn Thr Ala Ala Gly Phe
900 905 910
Ser Ser Arg Phe His Ala Ala Thr Gly Ala Pro Gly Val Arg Cys Arg
915 920 925
Phe Leu Leu Glu Arg Asp Phe Asp Asn Asp Leu Pro Lys Pro Tyr Leu
930 935 940
Leu Arg Glu Leu Ser Trp Met Leu Gly Asn Thr Lys Val Glu Ser Glu
945 950 955 960
Glu Glu Lys Leu Arg Leu Leu Ser Glu Lys Ile Arg Pro Gly Ser Leu
965 970 975
Val Pro Trp Asp Gly Gly Glu Gln Phe Ala Thr Leu His Pro Lys Arg
980 985 990
Gln Thr Leu Cys Val Ile His Ala Asp Met Asn Ala Ala Gln Asn Leu
995 1000 1005
Gln Arg Arg Phe Phe Gly Arg Cys Gly Glu Ala Phe Arg Leu Val
1010 1015 1020
Cys Gln Pro His Gly Asp Asp Val Leu Arg Leu Ala Ser Thr Pro
1025 1030 1035
Gly Ala Arg Leu Leu Gly Ala Leu Gln Gln Leu Glu Asn Gly Gln
1040 1045 1050
Gly Ala Phe Glu Leu Val Arg Asp Met Gly Ser Thr Ser Gln Met
1055 1060 1065
Asn Arg Phe Val Met Lys Ser Leu Gly Lys Lys Lys Ile Lys Pro
1070 1075 1080
Leu Gln Asp Asn Asn Gly Asp Asp Glu Leu Glu Asp Val Leu Ser
1085 1090 1095
Val Leu Pro Glu Glu Asp Asp Thr Gly Arg Ile Thr Val Phe Arg
1100 1105 1110
Asp Ser Ser Gly Ile Phe Phe Pro Cys Asn Val Trp Ile Pro Ala
1115 1120 1125
Lys Gln Phe Trp Pro Ala Val Arg Ala Met Ile Trp Lys Val Met
1130 1135 1140
Ala Ser His Ser Leu Gly
1145
<210> 8
<211> 1090
<212> PRT
<213> Laceyella sediminis
<400> 8
Met Ser Ile Arg Ser Phe Lys Leu Lys Ile Lys Thr Lys Ser Gly Val
1 5 10 15
Asn Ala Glu Glu Leu Arg Arg Gly Leu Trp Arg Thr His Gln Leu Ile
20 25 30
Asn Asp Gly Ile Ala Tyr Tyr Met Asn Trp Leu Val Leu Leu Arg Gln
35 40 45
Glu Asp Leu Phe Ile Arg Asn Glu Glu Thr Asn Glu Ile Glu Lys Arg
50 55 60
Ser Lys Glu Glu Ile Gln Gly Glu Leu Leu Glu Arg Val His Lys Gln
65 70 75 80
Gln Gln Arg Asn Gln Trp Ser Gly Glu Val Asp Asp Gln Thr Leu Leu
85 90 95
Gln Thr Leu Arg His Leu Tyr Glu Glu Ile Val Pro Ser Val Ile Gly
100 105 110
Lys Ser Gly Asn Ala Ser Leu Lys Ala Arg Phe Phe Leu Gly Pro Leu
115 120 125
Val Asp Pro Asn Asn Lys Thr Thr Lys Asp Val Ser Lys Ser Gly Pro
130 135 140
Thr Pro Lys Trp Lys Lys Met Lys Asp Ala Gly Asp Pro Asn Trp Val
145 150 155 160
Gln Glu Tyr Glu Lys Tyr Met Ala Glu Arg Gln Thr Leu Val Arg Leu
165 170 175
Glu Glu Met Gly Leu Ile Pro Leu Phe Pro Met Tyr Thr Asp Glu Val
180 185 190
Gly Asp Ile His Trp Leu Pro Gln Ala Ser Gly Tyr Thr Arg Thr Trp
195 200 205
Asp Arg Asp Met Phe Gln Gln Ala Ile Glu Arg Leu Leu Ser Trp Glu
210 215 220
Ser Trp Asn Arg Arg Val Arg Glu Arg Arg Ala Gln Phe Glu Lys Lys
225 230 235 240
Thr His Asp Phe Ala Ser Arg Phe Ser Glu Ser Asp Val Gln Trp Met
245 250 255
Asn Lys Leu Arg Glu Tyr Glu Ala Gln Gln Glu Lys Ser Leu Glu Glu
260 265 270
Asn Ala Phe Ala Pro Asn Glu Pro Tyr Ala Leu Thr Lys Lys Ala Leu
275 280 285
Arg Gly Trp Glu Arg Val Tyr His Ser Trp Met Arg Leu Asp Ser Ala
290 295 300
Ala Ser Glu Glu Ala Tyr Trp Gln Glu Val Ala Thr Cys Gln Thr Ala
305 310 315 320
Met Arg Gly Glu Phe Gly Asp Pro Ala Ile Tyr Gln Phe Leu Ala Gln
325 330 335
Lys Glu Asn His Asp Ile Trp Arg Gly Tyr Pro Glu Arg Val Ile Asp
340 345 350
Phe Ala Glu Leu Asn His Leu Gln Arg Glu Leu Arg Arg Ala Lys Glu
355 360 365
Asp Ala Thr Phe Thr Leu Pro Asp Ser Val Asp His Pro Leu Trp Val
370 375 380
Arg Tyr Glu Ala Pro Gly Gly Thr Asn Ile His Gly Tyr Asp Leu Val
385 390 395 400
Gln Asp Thr Lys Arg Asn Leu Thr Leu Ile Leu Asp Lys Phe Ile Leu
405 410 415
Pro Asp Glu Asn Gly Ser Trp His Glu Val Lys Lys Val Pro Phe Ser
420 425 430
Leu Ala Lys Ser Lys Gln Phe His Arg Gln Val Trp Leu Gln Glu Glu
435 440 445
Gln Lys Gln Lys Lys Arg Glu Val Val Phe Tyr Asp Tyr Ser Thr Asn
450 455 460
Leu Pro His Leu Gly Thr Leu Ala Gly Ala Lys Leu Gln Trp Asp Arg
465 470 475 480
Asn Phe Leu Asn Lys Arg Thr Gln Gln Gln Ile Glu Glu Thr Gly Glu
485 490 495
Ile Gly Lys Val Phe Phe Asn Ile Ser Val Asp Val Arg Pro Ala Val
500 505 510
Glu Val Lys Asn Gly Arg Leu Gln Asn Gly Leu Gly Lys Ala Leu Thr
515 520 525
Val Leu Thr His Pro Asp Gly Thr Lys Ile Val Thr Gly Trp Lys Ala
530 535 540
Glu Gln Leu Glu Lys Trp Val Gly Glu Ser Gly Arg Val Ser Ser Leu
545 550 555 560
Gly Leu Asp Ser Leu Ser Glu Gly Leu Arg Val Met Ser Ile Asp Leu
565 570 575
Gly Gln Arg Thr Ser Ala Thr Val Ser Val Phe Glu Ile Thr Lys Glu
580 585 590
Ala Pro Asp Asn Pro Tyr Lys Phe Phe Tyr Gln Leu Glu Gly Thr Glu
595 600 605
Leu Phe Ala Val His Gln Arg Ser Phe Leu Leu Ala Leu Pro Gly Glu
610 615 620
Asn Pro Pro Gln Lys Ile Lys Gln Met Arg Glu Ile Arg Trp Lys Glu
625 630 635 640
Arg Asn Arg Ile Lys Gln Gln Val Asp Gln Leu Ser Ala Ile Leu Arg
645 650 655
Leu His Lys Lys Val Asn Glu Asp Glu Arg Ile Gln Ala Ile Asp Lys
660 665 670
Leu Leu Gln Lys Val Ala Ser Trp Gln Leu Asn Glu Glu Ile Ala Thr
675 680 685
Ala Trp Asn Gln Ala Leu Ser Gln Leu Tyr Ser Lys Ala Lys Glu Asn
690 695 700
Asp Leu Gln Trp Asn Gln Ala Ile Lys Asn Ala His His Gln Leu Glu
705 710 715 720
Pro Val Val Gly Lys Gln Ile Ser Leu Trp Arg Lys Asp Leu Ser Thr
725 730 735
Gly Arg Gln Gly Ile Ala Gly Leu Ser Leu Trp Ser Ile Glu Glu Leu
740 745 750
Glu Ala Thr Lys Lys Leu Leu Thr Arg Trp Ser Lys Arg Ser Arg Glu
755 760 765
Pro Gly Val Val Lys Arg Ile Glu Arg Phe Glu Thr Phe Ala Lys Gln
770 775 780
Ile Gln His His Ile Asn Gln Val Lys Glu Asn Arg Leu Lys Gln Leu
785 790 795 800
Ala Asn Leu Ile Val Met Thr Ala Leu Gly Tyr Lys Tyr Asp Gln Glu
805 810 815
Gln Lys Lys Trp Ile Glu Val Tyr Pro Ala Cys Gln Val Val Leu Phe
820 825 830
Glu Asn Leu Arg Ser Tyr Arg Phe Ser Tyr Glu Arg Ser Arg Arg Glu
835 840 845
Asn Lys Lys Leu Met Glu Trp Ser His Arg Ser Ile Pro Lys Leu Val
850 855 860
Gln Met Gln Gly Glu Leu Phe Gly Leu Gln Val Ala Asp Val Tyr Ala
865 870 875 880
Ala Tyr Ser Ser Arg Tyr His Gly Arg Thr Gly Ala Pro Gly Ile Arg
885 890 895
Cys His Ala Leu Thr Glu Ala Asp Leu Arg Asn Glu Thr Asn Ile Ile
900 905 910
His Glu Leu Ile Glu Ala Gly Phe Ile Lys Glu Glu His Arg Pro Tyr
915 920 925
Leu Gln Gln Gly Asp Leu Val Pro Trp Ser Gly Gly Glu Leu Phe Ala
930 935 940
Thr Leu Gln Lys Pro Tyr Asp Asn Pro Arg Ile Leu Thr Leu His Ala
945 950 955 960
Asp Ile Asn Ala Ala Gln Asn Ile Gln Lys Arg Phe Trp His Pro Ser
965 970 975
Met Trp Phe Arg Val Asn Cys Glu Ser Val Met Glu Gly Glu Ile Val
980 985 990
Thr Tyr Val Pro Lys Asn Lys Thr Val His Lys Lys Gln Gly Lys Thr
995 1000 1005
Phe Arg Phe Val Lys Val Glu Gly Ser Asp Val Tyr Glu Trp Ala
1010 1015 1020
Lys Trp Ser Lys Asn Arg Asn Lys Asn Thr Phe Ser Ser Ile Thr
1025 1030 1035
Glu Arg Lys Pro Pro Ser Ser Met Ile Leu Phe Arg Asp Pro Ser
1040 1045 1050
Gly Thr Phe Phe Lys Glu Gln Glu Trp Val Glu Gln Lys Thr Phe
1055 1060 1065
Trp Gly Lys Val Gln Ser Met Ile Gln Ala Tyr Met Lys Lys Thr
1070 1075 1080
Ile Val Gln Arg Met Glu Glu
1085 1090
<210> 9
<211> 1119
<212> PRT
<213> Spirochaetes
<400> 9
Met Ser Phe Thr Ile Ser Tyr Pro Phe Lys Leu Ile Ile Lys Asn Lys
1 5 10 15
Asp Glu Ala Lys Ala Leu Leu Asp Thr His Gln Tyr Met Asn Glu Gly
20 25 30
Val Lys Tyr Tyr Leu Glu Lys Leu Leu Met Phe Arg Gln Glu Lys Ile
35 40 45
Phe Ile Gly Glu Asp Glu Thr Gly Lys Arg Ile Tyr Ile Glu Glu Thr
50 55 60
Glu Tyr Lys Lys Gln Ile Glu Glu Phe Tyr Leu Ile Lys Lys Thr Glu
65 70 75 80
Leu Gly Arg Asn Leu Thr Leu Thr Leu Asp Glu Phe Lys Thr Leu Met
85 90 95
Arg Glu Leu Tyr Ile Cys Leu Val Ser Ser Ser Met Glu Asn Lys Lys
100 105 110
Gly Phe Pro Asn Ala Gln Gln Ala Ser Leu Asn Ile Phe Ser Pro Leu
115 120 125
Phe Asp Ala Glu Ser Lys Gly Tyr Ile Leu Lys Glu Glu Asn Asn Asn
130 135 140
Ile Ser Leu Ile His Lys Asp Tyr Gly Lys Ile Leu Leu Lys Arg Leu
145 150 155 160
Arg Asp Asn Asn Leu Ile Pro Ile Phe Thr Lys Phe Thr Asp Ile Lys
165 170 175
Lys Ile Thr Ala Lys Leu Ser Pro Thr Ala Leu Asp Arg Met Ile Phe
180 185 190
Ala Gln Ala Ile Glu Lys Leu Leu Ser Tyr Glu Ser Trp Cys Lys Leu
195 200 205
Met Ile Lys Glu Arg Phe Asp Lys Glu Val Lys Ile Lys Glu Leu Glu
210 215 220
Asn Lys Cys Glu Asn Lys Gln Glu Arg Asp Lys Ile Phe Glu Ile Leu
225 230 235 240
Glu Lys Tyr Glu Glu Glu Arg Gln Lys Thr Phe Glu Gln Asp Ser Gly
245 250 255
Phe Ala Lys Lys Gly Lys Phe Tyr Ile Thr Gly Arg Met Leu Lys Gly
260 265 270
Phe Asp Glu Ile Lys Glu Lys Trp Leu Lys Glu Lys Asp Arg Ser Glu
275 280 285
Gln Asn Leu Ile Asn Ile Leu Asn Lys Tyr Gln Thr Asp Asn Ser Lys
290 295 300
Leu Val Gly Asp Arg Asn Leu Phe Glu Phe Ile Ile Lys Leu Glu Asn
305 310 315 320
Gln Cys Leu Trp Asn Gly Asp Ile Asp Tyr Leu Lys Ile Lys Arg Asp
325 330 335
Ile Asn Lys Asn Gln Ile Trp Leu Asp Arg Pro Glu Met Pro Arg Phe
340 345 350
Thr Met Pro Asp Phe Lys Lys His Pro Leu Trp Tyr Arg Tyr Glu Asp
355 360 365
Pro Ser Asn Ser Asn Phe Arg Asn Tyr Lys Ile Glu Val Val Lys Asp
370 375 380
Glu Asn Tyr Ile Thr Ile Pro Leu Ile Thr Glu Arg Asn Asn Glu Tyr
385 390 395 400
Phe Glu Glu Asn Tyr Thr Phe Asn Leu Ala Lys Leu Lys Lys Leu Ser
405 410 415
Glu Asn Ile Thr Phe Ile Pro Lys Ser Lys Asn Lys Glu Phe Glu Phe
420 425 430
Ile Asp Ser Asn Asp Glu Glu Glu Asp Lys Lys Asp Gln Lys Lys Ser
435 440 445
Lys Gln Tyr Ile Lys Tyr Cys Asp Thr Ala Lys Asn Thr Ser Tyr Gly
450 455 460
Lys Ser Gly Gly Ile Arg Leu Tyr Phe Asn Arg Asn Glu Leu Glu Asn
465 470 475 480
Tyr Lys Asp Gly Lys Lys Met Asp Ser Tyr Thr Val Phe Thr Leu Ser
485 490 495
Ile Arg Asp Tyr Lys Ser Leu Phe Ala Lys Glu Lys Leu Gln Pro Gln
500 505 510
Ile Phe Asn Thr Val Asp Asn Lys Ile Thr Ser Leu Lys Ile Gln Lys
515 520 525
Lys Phe Gly Asn Glu Glu Gln Thr Asn Phe Leu Ser Tyr Phe Thr Gln
530 535 540
Asn Gln Ile Thr Lys Lys Asp Trp Met Asp Glu Lys Thr Phe Gln Asn
545 550 555 560
Val Lys Glu Leu Asn Glu Gly Ile Arg Val Leu Ser Val Asp Leu Gly
565 570 575
Gln Arg Phe Phe Ala Ala Val Ser Cys Phe Glu Ile Met Ser Glu Ile
580 585 590
Asp Asn Asn Lys Leu Phe Phe Asn Leu Asn Asp Gln Asn His Lys Ile
595 600 605
Ile Arg Ile Asn Asp Lys Asn Tyr Tyr Ala Lys His Ile Tyr Ser Lys
610 615 620
Thr Ile Lys Leu Ser Gly Glu Asp Asp Asp Leu Tyr Lys Glu Arg Lys
625 630 635 640
Ile Asn Lys Asn Tyr Lys Leu Ser Tyr Gln Glu Arg Lys Asn Lys Ile
645 650 655
Gly Ile Phe Thr Arg Gln Ile Asn Lys Leu Asn Gln Leu Leu Lys Ile
660 665 670
Ile Arg Asn Asp Glu Ile Asp Lys Glu Lys Phe Lys Glu Leu Ile Glu
675 680 685
Thr Thr Lys Arg Tyr Val Lys Asn Thr Tyr Asn Asp Gly Ile Ile Asp
690 695 700
Trp Asn Asn Val Asp Asn Lys Ile Leu Ser Tyr Glu Asn Lys Glu Asp
705 710 715 720
Val Ile Asn Leu His Lys Glu Leu Asp Lys Lys Leu Glu Ile Asp Phe
725 730 735
Lys Glu Phe Ile Arg Glu Cys Arg Lys Pro Ile Phe Arg Ser Gly Gly
740 745 750
Leu Ser Met Gln Arg Ile Asp Phe Leu Glu Lys Leu Asn Lys Leu Lys
755 760 765
Arg Lys Trp Val Ala Arg Thr Gln Lys Ser Ala Glu Ser Ile Val Leu
770 775 780
Thr Pro Lys Phe Gly Tyr Lys Leu Lys Glu His Ile Asn Glu Leu Lys
785 790 795 800
Asp Asn Arg Val Lys Gln Gly Val Asn Tyr Ile Leu Met Thr Ala Leu
805 810 815
Gly Tyr Ile Lys Asp Asn Glu Ile Lys Asn Asp Ser Lys Lys Lys Gln
820 825 830
Lys Glu Asp Trp Val Lys Lys Asn Arg Ala Cys Gln Ile Ile Leu Met
835 840 845
Glu Lys Leu Thr Glu Tyr Thr Phe Ala Glu Asp Arg Pro Arg Glu Glu
850 855 860
Asn Ser Lys Leu Arg Met Trp Ser His Arg Gln Ile Phe Asn Phe Leu
865 870 875 880
Gln Gln Lys Ala Ser Leu Trp Gly Ile Leu Val Gly Asp Val Phe Ala
885 890 895
Pro Tyr Thr Ser Lys Cys Leu Ser Asp Asn Asn Ala Pro Gly Ile Arg
900 905 910
Cys His Gln Val Thr Lys Lys Asp Leu Ile Asp Asn Ser Trp Phe Leu
915 920 925
Lys Ile Val Val Lys Asp Asp Ala Phe Cys Asp Leu Ile Glu Ile Asn
930 935 940
Lys Glu Asn Val Lys Asn Lys Ser Ile Lys Ile Asn Asp Ile Leu Pro
945 950 955 960
Leu Arg Gly Gly Glu Leu Phe Ala Ser Ile Lys Asp Gly Lys Leu His
965 970 975
Ile Val Gln Ala Asp Ile Asn Ala Ser Arg Asn Ile Ala Lys Arg Phe
980 985 990
Leu Ser Gln Ile Asn Pro Phe Arg Val Val Leu Lys Lys Asp Lys Asp
995 1000 1005
Glu Thr Phe His Leu Lys Asn Glu Pro Asn Tyr Leu Lys Asn Tyr
1010 1015 1020
Tyr Ser Ile Leu Asn Phe Val Pro Thr Asn Glu Glu Leu Thr Phe
1025 1030 1035
Phe Lys Val Glu Glu Asn Lys Asp Ile Lys Pro Thr Lys Arg Ile
1040 1045 1050
Lys Met Asp Lys His Glu Lys Glu Ser Thr Asp Glu Gly Asp Asp
1055 1060 1065
Tyr Ser Lys Asn Gln Ile Ala Leu Phe Arg Asp Asp Ser Gly Ile
1070 1075 1080
Phe Phe Asp Lys Ser Leu Trp Val Asp Gly Lys Ile Phe Trp Ser
1085 1090 1095
Val Val Lys Asn Lys Met Thr Lys Leu Leu Arg Glu Arg Asn Asn
1100 1105 1110
Lys Lys Asn Gly Ser Lys
1115
<210> 10
<211> 1142
<212> PRT
<213> Tuberibacillus calidus
<400> 10
Met Asn Ile His Leu Lys Glu Leu Ile Arg Met Ala Thr Lys Ser Phe
1 5 10 15
Ile Leu Lys Met Lys Thr Lys Asn Asn Pro Gln Leu Arg Leu Ser Leu
20 25 30
Trp Lys Thr His Glu Leu Phe Asn Phe Gly Val Ala Tyr Tyr Met Asp
35 40 45
Leu Leu Ser Leu Phe Arg Gln Lys Asp Leu Tyr Met His Asn Asp Glu
50 55 60
Asp Pro Asp His Pro Val Val Leu Lys Lys Glu Glu Ile Gln Glu Arg
65 70 75 80
Leu Trp Met Lys Val Arg Glu Thr Gln Gln Lys Asn Gly Phe His Gly
85 90 95
Glu Val Ser Lys Asp Glu Val Leu Glu Thr Leu Arg Ala Leu Tyr Glu
100 105 110
Glu Leu Val Pro Ser Ala Val Gly Lys Ser Gly Glu Ala Asn Gln Ile
115 120 125
Ser Asn Lys Tyr Leu Tyr Pro Leu Thr Asp Pro Ala Ser Gln Ser Gly
130 135 140
Lys Gly Thr Ala Asn Ser Gly Arg Lys Pro Arg Trp Lys Lys Leu Lys
145 150 155 160
Glu Ala Gly Asp Pro Ser Trp Lys Asp Ala Tyr Glu Lys Trp Glu Lys
165 170 175
Glu Arg Gln Glu Asp Pro Lys Leu Lys Ile Leu Ala Ala Leu Gln Ser
180 185 190
Phe Gly Leu Ile Pro Leu Phe Arg Pro Phe Thr Glu Asn Asp His Lys
195 200 205
Ala Val Ile Ser Val Lys Trp Met Pro Lys Ser Lys Asn Gln Ser Val
210 215 220
Arg Lys Phe Asp Lys Asp Met Phe Asn Gln Ala Ile Glu Arg Phe Leu
225 230 235 240
Ser Trp Glu Ser Trp Asn Glu Lys Val Ala Glu Asp Tyr Glu Lys Thr
245 250 255
Val Ser Ile Tyr Glu Ser Leu Gln Lys Glu Leu Lys Gly Ile Ser Thr
260 265 270
Lys Ala Phe Glu Ile Met Glu Arg Val Glu Lys Ala Tyr Glu Ala His
275 280 285
Leu Arg Glu Ile Thr Phe Ser Asn Ser Thr Tyr Arg Ile Gly Asn Arg
290 295 300
Ala Ile Arg Gly Trp Thr Glu Ile Val Lys Lys Trp Met Lys Leu Asp
305 310 315 320
Pro Ser Ala Pro Gln Gly Asn Tyr Leu Asp Val Val Lys Asp Tyr Gln
325 330 335
Arg Arg His Pro Arg Glu Ser Gly Asp Phe Lys Leu Phe Glu Leu Leu
340 345 350
Ser Arg Pro Glu Asn Gln Ala Ala Trp Arg Glu Tyr Pro Glu Phe Leu
355 360 365
Pro Leu Tyr Val Lys Tyr Arg His Ala Glu Gln Arg Met Lys Thr Ala
370 375 380
Lys Lys Gln Ala Thr Phe Thr Leu Cys Asp Pro Ile Arg His Pro Leu
385 390 395 400
Trp Val Arg Tyr Glu Glu Arg Ser Gly Thr Asn Leu Asn Lys Tyr Arg
405 410 415
Leu Ile Met Asn Glu Lys Glu Lys Val Val Gln Phe Asp Arg Leu Ile
420 425 430
Cys Leu Asn Ala Asp Gly His Tyr Glu Glu Gln Glu Asp Val Thr Val
435 440 445
Pro Leu Ala Pro Ser Gln Gln Phe Asp Asp Gln Ile Lys Phe Ser Ser
450 455 460
Glu Asp Thr Gly Lys Gly Lys His Asn Phe Ser Tyr Tyr His Lys Gly
465 470 475 480
Ile Asn Tyr Glu Leu Lys Gly Thr Leu Gly Gly Ala Arg Ile Gln Phe
485 490 495
Asp Arg Glu His Leu Leu Arg Arg Gln Gly Val Lys Ala Gly Asn Val
500 505 510
Gly Arg Ile Phe Leu Asn Val Thr Leu Asn Ile Glu Pro Met Gln Pro
515 520 525
Phe Ser Arg Ser Gly Asn Leu Gln Thr Ser Val Gly Lys Ala Leu Lys
530 535 540
Val Tyr Val Asp Gly Tyr Pro Lys Val Val Asn Phe Lys Pro Lys Glu
545 550 555 560
Leu Thr Glu His Ile Lys Glu Ser Glu Lys Asn Thr Leu Thr Leu Gly
565 570 575
Val Glu Ser Leu Pro Thr Gly Leu Arg Val Met Ser Val Asp Leu Gly
580 585 590
Gln Arg Gln Ala Ala Ala Ile Ser Ile Phe Glu Val Val Ser Glu Lys
595 600 605
Pro Asp Asp Asn Lys Leu Phe Tyr Pro Val Lys Asp Thr Asp Leu Phe
610 615 620
Ala Val His Arg Thr Ser Phe Asn Ile Lys Leu Pro Gly Glu Lys Arg
625 630 635 640
Thr Glu Arg Arg Met Leu Glu Gln Gln Lys Arg Asp Gln Ala Ile Arg
645 650 655
Asp Leu Ser Arg Lys Leu Lys Phe Leu Lys Asn Val Leu Asn Met Gln
660 665 670
Lys Leu Glu Lys Thr Asp Glu Arg Glu Lys Arg Val Asn Arg Trp Ile
675 680 685
Lys Asp Arg Glu Arg Glu Glu Glu Asn Pro Val Tyr Val Gln Glu Phe
690 695 700
Glu Met Ile Ser Lys Val Leu Tyr Ser Pro His Ser Val Trp Val Asp
705 710 715 720
Gln Leu Lys Ser Ile His Arg Lys Leu Glu Glu Gln Leu Gly Lys Glu
725 730 735
Ile Ser Lys Trp Arg Gln Ser Ile Ser Gln Gly Arg Gln Gly Val Tyr
740 745 750
Gly Ile Ser Leu Lys Asn Ile Glu Asp Ile Glu Lys Thr Arg Arg Leu
755 760 765
Leu Phe Arg Trp Ser Met Arg Pro Glu Asn Pro Gly Glu Val Lys Gln
770 775 780
Leu Gln Pro Gly Glu Arg Phe Ala Ile Asp Gln Gln Asn His Leu Asn
785 790 795 800
His Leu Lys Asp Asp Arg Ile Lys Lys Leu Ala Asn Gln Ile Val Met
805 810 815
Thr Ala Leu Gly Tyr Arg Tyr Asp Gly Lys Arg Lys Lys Trp Ile Ala
820 825 830
Lys His Pro Ala Cys Gln Leu Val Leu Phe Glu Asp Leu Ser Arg Tyr
835 840 845
Ala Phe Tyr Asp Glu Arg Ser Arg Leu Glu Asn Arg Asn Leu Met Arg
850 855 860
Trp Ser Arg Arg Glu Ile Pro Lys Gln Val Ala Gln Ile Gly Gly Leu
865 870 875 880
Tyr Gly Leu Leu Val Gly Glu Val Gly Ala Gln Tyr Ser Ser Arg Phe
885 890 895
His Ala Lys Ser Gly Ala Pro Gly Ile Arg Cys Arg Val Val Lys Glu
900 905 910
His Glu Leu Tyr Ile Thr Glu Gly Gly Gln Lys Val Arg Asn Gln Lys
915 920 925
Phe Leu Asp Ser Leu Val Glu Asn Asn Ile Ile Glu Pro Asp Asp Ala
930 935 940
Arg Arg Leu Glu Pro Gly Asp Leu Ile Arg Asp Gln Gly Gly Asp Lys
945 950 955 960
Phe Ala Thr Leu Asp Glu Arg Gly Glu Leu Val Ile Thr His Ala Asp
965 970 975
Ile Asn Ala Ala Gln Asn Leu Gln Lys Arg Phe Trp Thr Arg Thr His
980 985 990
Gly Leu Tyr Arg Ile Arg Cys Glu Ser Arg Glu Ile Lys Asp Ala Val
995 1000 1005
Val Leu Val Pro Ser Asp Lys Asp Gln Lys Glu Lys Met Glu Asn
1010 1015 1020
Leu Phe Gly Ile Gly Tyr Leu Gln Pro Phe Lys Gln Glu Asn Asp
1025 1030 1035
Val Tyr Lys Trp Val Lys Gly Glu Lys Ile Lys Gly Lys Lys Thr
1040 1045 1050
Ser Ser Gln Ser Asp Asp Lys Glu Leu Val Ser Glu Ile Leu Gln
1055 1060 1065
Glu Ala Ser Val Met Ala Asp Glu Leu Lys Gly Asn Arg Lys Thr
1070 1075 1080
Leu Phe Arg Asp Pro Ser Gly Tyr Val Phe Pro Lys Asp Arg Trp
1085 1090 1095
Tyr Thr Gly Gly Arg Tyr Phe Gly Thr Leu Glu His Leu Leu Lys
1100 1105 1110
Arg Lys Leu Ala Glu Arg Arg Leu Phe Asp Gly Gly Ser Ser Arg
1115 1120 1125
Arg Gly Leu Phe Asn Gly Thr Asp Ser Asn Thr Asn Val Glu
1130 1135 1140
<210> 11
<211> 137
<212> DNA
<213> Artificial Sequence
<220>
<223> optimized sgRNA scaffold
<400> 11
gtctaaagga cagaattttt caacgggtgt gccaatggcc actttccagg tggcaaagcc 60
cgttgaactt ctcaaaaaga acgctcgctc agtgttctga cgtcggatca ctgagcgagc 120
gatctgagaa gtggcac 137
<210> 12
<211> 141
<212> DNA
<213> Artificial Sequence
<220>
<223> optimized sgRNA scaffold
<400> 12
aactgtctaa aggacagaat ttttcaacgg gtgtgccaat ggccactttc caggtggcaa 60
agcccgttga acttctcaaa aagaacgctc gctcagtgtt ctgacgtcgg atcactgagc 120
gagcgatctg agaagtggca c 141
<210> 13
<211> 139
<212> DNA
<213> Artificial Sequence
<220>
<223> optimized sgRNA scaffold
<400> 13
ctgtctaaag gacagaattt ttcaacgggt gtgccaatgg ccactttcca ggtggcaaag 60
cccgttgaac ttctcaaaaa gaacgctcgc tcagtgttct gacgtcggat cactgagcga 120
gcgatctgag aagtggcac 139
<210> 14
<211> 127
<212> DNA
<213> Artificial Sequence
<220>
<223> optimized sgRNA scaffold
<400> 14
gtctaaagga cagaattttt caacgggtgt gccaatggcc actttccagg tggcaaagcc 60
cgttgaactt ctcaaaaaga acgctcgctc agtgttatca ctgagcgagc gatctgagaa 120
gtggcac 127
<210> 15
<211> 99
<212> DNA
<213> Artificial Sequence
<220>
<223> optimized sgRNA scaffold
<400> 15
gtctaaagga cagaattttt caacgggtgt gccaatggcc actttccagg tggcaaagcc 60
cgttgaactt ctcaaaaaga acgatctgag aagtggcac 99
<210> 16
<211> 93
<212> DNA
<213> Artificial Sequence
<220>
<223> optimized sgRNA scaffold
<400> 16
gtctaaagga cagaattttt caacgggtgt gccaatggcc actttccagg tggcaaagcc 60
cgttgaactt ctcaaaaagc tgagaagtgg cac 93
<210> 17
<211> 91
<212> DNA
<213> Artificial Sequence
<220>
<223> optimized sgRNA scaffold
<400> 17
gtctaaagga cagaattttt caacgggtgt gccaatggcc actttccagg tggcaaagcc 60
cgttgaactt ctcaaagctg agaagtggca c 91
<210> 18
<211> 91
<212> DNA
<213> Artificial Sequence
<220>
<223> optimized sgRNA scaffold
<400> 18
gtctaaagga cagaattttt caacgggtgt gccaatggcc actttccagg tggcaaagcc 60
cgttgaactt ctcaaaactg agaagtggca c 91
<210> 19
<211> 89
<212> DNA
<213> Artificial Sequence
<220>
<223> optimized sgRNA scaffold
<400> 19
gtctaaagga cagaattttt caacgggtgt gccaatggcc actttccagg tggcaaagcc 60
cgttgaactt ctcaagcgag aagtggcac 89
<210> 20
<211> 87
<212> DNA
<213> Artificial Sequence
<220>
<223> optimized sgRNA scaffold
<400> 20
gtctaaagga cagaattttt caacgggtgt gccaatggcc actttccagg tggcaaagcc 60
cgttgaactt ctaagcagaa gtggcac 87
<210> 21
<211> 85
<212> DNA
<213> Artificial Sequence
<220>
<223> optimized sgRNA scaffold
<400> 21
gtctaaagga cagaattttt caacgggtgt gccaatggcc actttccagg tggcaaagcc 60
cgttgaactt caagcgaagt ggcac 85
<210> 22
<211> 137
<212> DNA
<213> Artificial Sequence
<220>
<223> AasgRNA scaffold
<400> 22
gtctaaagga cagaattttt caacgggtgt gccaatggcc actttccagg tggcaaagcc 60
cgttgaactt ctcaaaaaga acgctcgctc agtgttctga cgtcggatca ctgagcgagc 120
gatctgagaa gtggcac 137
<210> 23
<211> 137
<212> DNA
<213> Artificial Sequence
<220>
<223> AksgRNA1 scaffold
<400> 23
tcgtctatag gacggcgagg acaacgggaa gtgccaatgt gctctttcca agagcaaaca 60
ccccgttggc ttcaagatga ccgctcgctc agcgatctga caacggatcg ctgagcgagc 120
ggtctgagaa gtggcac 137
<210> 24
<211> 145
<212> DNA
<213> Artificial Sequence
<220>
<223> AmsgRNA1 scaffold
<400> 24
ggaattgccg atctatagga cggcagattc aacgggatgt gccaatgcac tctttccagg 60
agtgaacacc ccgttggctt caacatgatc gcccgctcaa cggtccgatg tcggatcgtt 120
gagcgggcga tctgagaagt ggcac 145
<210> 25
<211> 141
<212> DNA
<213> Artificial Sequence
<220>
<223> BhsgRNA scaffold
<400> 25
gaggttctgt cttttggtca ggacaaccgt ctagctataa gtgctgcagg gtgtgagaaa 60
ctcctattgc tggacgatgt ctcttttatt tcttttttct tggatgtcca agaaaaaaga 120
aatgatacga ggcattagca c 141
<210> 26
<211> 132
<212> DNA
<213> Artificial Sequence
<220>
<223> BssgRNA scaffold
<400> 26
ccataagtcg acttacatat ccgtgcgtgt gcattatggg cccatccaca ggtctattcc 60
cacggataat cacgactttc cactaagctt tcgaatgttc gaaagcttag tggaaagctt 120
cgtggttagc ac 132
<210> 27
<211> 130
<212> DNA
<213> Artificial Sequence
<220>
<223> Bs3sgRNA scaffold
<400> 27
ggtgacctat agggtcaatg aatctgtgcg tgtgccataa gtaattaaaa attacccacc 60
acaggattat cttatttctg ctaagtgttt agttgcctga atacttagca gaaataatga 120
tgattggcac 130
<210> 28
<211> 118
<212> DNA
<213> Artificial Sequence
<220>
<223> LssgRNA scaffold
<400> 28
ggcaaagaat actgtgcgtg tgctaaggat ggaaaaaatc cattcaacca caggattaca 60
ttatttatct aatcacttaa atctttaagt gattagatga attaaatgtg attagcac 118
<210> 29
<211> 126
<212> DNA
<213> Artificial Sequence
<220>
<223> SbsgRNA scaffold
<400> 29
gtcttagggt atatcccaaa tttgtcttag tatgtgcatt gcttacagcg acaactaagg 60
tttgtttatc ttttttttac attgtaagat gttttacatt ataaaaagaa gataatctta 120
ttgcac 126
<210> 30
<211> 86
<212> DNA
<213> Artificial Sequence
<220>
<223> artsgRNA1 scaffold
<400> 30
ggtctaaagg acagaatttt tcaacgggtg tgccaatggc cactttccag gtggcaaagc 60
ccgttgaact tcaagcgaag tggcac 86
<210> 31
<211> 84
<212> DNA
<213> Artificial Sequence
<220>
<223> artsgRNA2 scaffold
<400> 31
ggtctaaagg acagaagaca acgggaagtg ccaatgtgct ctttccaaga gcaaacaccc 60
cgttgacttc aagcgaagtg gcac 84
<210> 32
<211> 79
<212> DNA
<213> Artificial Sequence
<220>
<223> artsgRNA3 scaffold
<400> 32
ggtctaaagg acagaaaatc tgtgcgtgtg ccataagtaa ttaaaaatta cccaccacag 60
acttcaagcg aagtggcac 79
<210> 33
<211> 91
<212> DNA
<213> Artificial Sequence
<220>
<223> artsgRNA4 scaffold
<400> 33
ggtcgtctat aggacggcga gtttttcaac gggtgtgcca atggccactt tccaggtggc 60
aaagcccgtt gaacttcaag cgaagtggca c 91
<210> 34
<211> 89
<212> DNA
<213> Artificial Sequence
<220>
<223> artsgRNA5 scaffold
<400> 34
ggtcgtctat aggacggcga ggacaacggg aagtgccaat gtgctctttc caagagcaaa 60
caccccgttg acttcaagcg aagtggcac 89
<210> 35
<211> 84
<212> DNA
<213> Artificial Sequence
<220>
<223> artsgRNA6 scaffold
<400> 35
ggtcgtctat aggacggcga gaatctgtgc gtgtgccata agtaattaaa aattacccac 60
cacagacttc aagcgaagtg gcac 84
<210> 36
<211> 90
<212> DNA
<213> Artificial Sequence
<220>
<223> artsgRNA7 scaffold
<400> 36
ggtgacctat agggtcaatg tttttcaacg ggtgtgccaa tggccacttt ccaggtggca 60
aagcccgttg aacttcaagc gaagtggcac 90
<210> 37
<211> 88
<212> DNA
<213> Artificial Sequence
<220>
<223> artsgRNA8 scaffold
<400> 37
ggtgacctat agggtcaatg gacaacggga agtgccaatg tgctctttcc aagagcaaac 60
accccgttga cttcaagcga agtggcac 88
<210> 38
<211> 83
<212> DNA
<213> Artificial Sequence
<220>
<223> artsgRNA9 scaffold
<400> 38
ggtgacctat agggtcaatg aatctgtgcg tgtgccataa gtaattaaaa attacccacc 60
acagacttca agcgaagtgg cac 83
<210> 39
<211> 85
<212> DNA
<213> Artificial Sequence
<220>
<223> artsgRNA10 scaffold
<400> 39
ggtctaaagg acagaatttt tcaacgggtg tgccaatggc cactttccag gtggcaaagc 60
ccgttgagct tcaaagaagt ggcac 85
<210> 40
<211> 83
<212> DNA
<213> Artificial Sequence
<220>
<223> artsgRNA11 scaffold
<400> 40
ggtctaaagg acagaagaca acgggaagtg ccaatgtgct ctttccaaga gcaaacaccc 60
cgttggcttc aaagaagtgg cac 83
<210> 41
<211> 78
<212> DNA
<213> Artificial Sequence
<220>
<223> artsgRNA12 scaffold
<400> 41
ggtctaaagg acagaaaatc tgtgcgtgtg ccataagtaa ttaaaaatta cccaccacag 60
gcttcaaaga agtggcac 78
<210> 42
<211> 90
<212> DNA
<213> Artificial Sequence
<220>
<223> artsgRNA13 scaffold
<400> 42
ggtcgtctat aggacggcga gtttttcaac gggtgtgcca atggccactt tccaggtggc 60
aaagcccgtt gagcttcaaa gaagtggcac 90
<210> 43
<211> 88
<212> DNA
<213> Artificial Sequence
<220>
<223> artsgRNA14 scaffold
<400> 43
ggtcgtctat aggacggcga ggacaacggg aagtgccaat gtgctctttc caagagcaaa 60
caccccgttg gcttcaaaga agtggcac 88
<210> 44
<211> 83
<212> DNA
<213> Artificial Sequence
<220>
<223> artsgRNA15 scaffold
<400> 44
ggtcgtctat aggacggcga gaatctgtgc gtgtgccata agtaattaaa aattacccac 60
cacaggcttc aaagaagtgg cac 83
<210> 45
<211> 89
<212> DNA
<213> Artificial Sequence
<220>
<223> artsgRNA16 scaffold
<400> 45
ggtgacctat agggtcaatg tttttcaacg ggtgtgccaa tggccacttt ccaggtggca 60
aagcccgttg agcttcaaag aagtggcac 89
<210> 46
<211> 87
<212> DNA
<213> Artificial Sequence
<220>
<223> artsgRNA17 scaffold
<400> 46
ggtgacctat agggtcaatg gacaacggga agtgccaatg tgctctttcc aagagcaaac 60
accccgttgg cttcaaagaa gtggcac 87
<210> 47
<211> 82
<212> DNA
<213> Artificial Sequence
<220>
<223> artsgRNA18 scaffold
<400> 47
ggtgacctat agggtcaatg aatctgtgcg tgtgccataa gtaattaaaa attacccacc 60
acaggcttca aagaagtggc ac 82
<210> 48
<211> 89
<212> DNA
<213> Artificial Sequence
<220>
<223> artsgRNA19 scaffold
<400> 48
ggtctaaagg acagaatttt tcaacgggtg tgccaatggc cactttccag gtggcaaagc 60
ccgttgagat tatctatgat gattggcac 89
<210> 49
<211> 87
<212> DNA
<213> Artificial Sequence
<220>
<223> artsgRNA20 scaffold
<400> 49
ggtctaaagg acagaagaca acgggaagtg ccaatgtgct ctttccaaga gcaaacaccc 60
cgttggatta tctatgatga ttggcac 87
<210> 50
<211> 82
<212> DNA
<213> Artificial Sequence
<220>
<223> artsgRNA21 scaffold
<400> 50
ggtctaaagg acagaaaatc tgtgcgtgtg ccataagtaa ttaaaaatta cccaccacag 60
gattatctat gatgattggc ac 82
<210> 51
<211> 94
<212> DNA
<213> Artificial Sequence
<220>
<223> artsgRNA22 scaffold
<400> 51
ggtcgtctat aggacggcga gtttttcaac gggtgtgcca atggccactt tccaggtggc 60
aaagcccgtt gagattatct atgatgattg gcac 94
<210> 52
<211> 92
<212> DNA
<213> Artificial Sequence
<220>
<223> artsgRNA23 scaffold
<400> 52
ggtcgtctat aggacggcga ggacaacggg aagtgccaat gtgctctttc caagagcaaa 60
caccccgttg gattatctat gatgattggc ac 92
<210> 53
<211> 87
<212> DNA
<213> Artificial Sequence
<220>
<223> artsgRNA24 scaffold
<400> 53
ggtcgtctat aggacggcga gaatctgtgc gtgtgccata agtaattaaa aattacccac 60
cacaggatta tctatgatga ttggcac 87
<210> 54
<211> 93
<212> DNA
<213> Artificial Sequence
<220>
<223> artsgRNA25 scaffold
<400> 54
ggtgacctat agggtcaatg tttttcaacg ggtgtgccaa tggccacttt ccaggtggca 60
aagcccgttg agattatcta tgatgattgg cac 93
<210> 55
<211> 91
<212> DNA
<213> Artificial Sequence
<220>
<223> artsgRNA26 scaffold
<400> 55
ggtgacctat agggtcaatg gacaacggga agtgccaatg tgctctttcc aagagcaaac 60
accccgttgg attatctatg atgattggca c 91
<210> 56
<211> 86
<212> DNA
<213> Artificial Sequence
<220>
<223> artsgRNA27 scaffold
<400> 56
ggtgacctat agggtcaatg aatctgtgcg tgtgccataa gtaattaaaa attacccacc 60
acaggattat ctatgatgat tggcac 86
<210> 57
<211> 82
<212> DNA
<213> Artificial Sequence
<220>
<223> artsgRNA28 scaffold
<400> 57
ggtctaaagg acagaacaac gggatgtgcc aatgcactct ttccaggagt gaacaccccg 60
ttgacttcaa gcgaagtggc ac 82
<210> 58
<211> 87
<212> DNA
<213> Artificial Sequence
<220>
<223> artsgRNA29 scaffold
<400> 58
ggtcgtctat aggacggcga gcaacgggat gtgccaatgc actctttcca ggagtgaaca 60
ccccgttgac ttcaagcgaa gtggcac 87
<210> 59
<211> 99
<212> DNA
<213> Artificial Sequence
<220>
<223> artsgRNA30 scaffold
<400> 59
ggaattgccg atctatagga cggcagattt ttttcaacgg gtgtgccaat ggccactttc 60
caggtggcaa agcccgttga acttcaagcg aagtggcac 99
<210> 60
<211> 97
<212> DNA
<213> Artificial Sequence
<220>
<223> artsgRNA31 scaffold
<400> 60
ggaattgccg atctatagga cggcagattg acaacgggaa gtgccaatgt gctctttcca 60
agagcaaaca ccccgttgac ttcaagcgaa gtggcac 97
<210> 61
<211> 95
<212> DNA
<213> Artificial Sequence
<220>
<223> artsgRNA32 scaffold
<400> 61
ggaattgccg atctatagga cggcagattc aacgggatgt gccaatgcac tctttccagg 60
agtgaacacc ccgttgactt caagcgaagt ggcac 95
<210> 62
<211> 81
<212> DNA
<213> Artificial Sequence
<220>
<223> artsgRNA33 scaffold
<400> 62
ggtctaaagg acagaacaac gggatgtgcc aatgcactct ttccaggagt gaacaccccg 60
ttggcttcaa agaagtggca c 81
<210> 63
<211> 86
<212> DNA
<213> Artificial Sequence
<220>
<223> artsgRNA34 scaffold
<400> 63
ggtcgtctat aggacggcga gcaacgggat gtgccaatgc actctttcca ggagtgaaca 60
ccccgttggc ttcaaagaag tggcac 86
<210> 64
<211> 98
<212> DNA
<213> Artificial Sequence
<220>
<223> artsgRNA35 scaffold
<400> 64
ggaattgccg atctatagga cggcagattt ttttcaacgg gtgtgccaat ggccactttc 60
caggtggcaa agcccgttga gcttcaaaga agtggcac 98
<210> 65
<211> 96
<212> DNA
<213> Artificial Sequence
<220>
<223> artsgRNA36 scaffold
<400> 65
ggaattgccg atctatagga cggcagattg acaacgggaa gtgccaatgt gctctttcca 60
agagcaaaca ccccgttggc ttcaaagaag tggcac 96
<210> 66
<211> 94
<212> DNA
<213> Artificial Sequence
<220>
<223> artsgRNA37 scaffold
<400> 66
ggaattgccg atctatagga cggcagattc aacgggatgt gccaatgcac tctttccagg 60
agtgaacacc ccgttggctt caaagaagtg gcac 94
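
Each record in the sequence listing above declares its length in the `<211>` field and ends each residue row with a running position count. The following is a minimal sketch, not part of the patent, of how such a record could be parsed and its declared length checked against the residues actually present. The function name `parse_records` and the `SAMPLE` string (a copy of the SEQ ID NO: 41 record) are illustrative assumptions.

```python
import re

# Copy of the SEQ ID NO: 41 record from the listing, used only as sample input.
SAMPLE = """<210> 41
<211> 78
<212> DNA
<213> Artificial Sequence
<220>
<223> artsgRNA12 scaffold
<400> 41
ggtctaaagg acagaaaatc tgtgcgtgtg ccataagtaa ttaaaaatta cccaccacag 60
gcttcaaaga agtggcac 78
"""


def parse_records(text):
    """Return {seq_id: (declared_length, sequence)} for ST.25-style records."""
    records = {}
    for chunk in text.split("<210>")[1:]:
        lines = chunk.strip().splitlines()
        seq_id = int(lines[0].strip())
        declared = None
        residues = []
        in_sequence = False
        for line in lines[1:]:
            line = line.strip()
            if line.startswith("<211>"):
                declared = int(line[5:].strip())
            elif line.startswith("<400>"):
                in_sequence = True
            elif in_sequence and not line.startswith("<"):
                # Keep only residue letters; this drops the spaces between
                # 10-mers and the running position count at the end of a row.
                residues.append(re.sub(r"[^a-z]", "", line.lower()))
        records[seq_id] = (declared, "".join(residues))
    return records


records = parse_records(SAMPLE)
declared_length, sequence = records[41]
print(declared_length == len(sequence))
```

A mismatch between the two values would point to a transcription or extraction error in that record rather than to anything about the underlying scaffold.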

Claims (36)

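1. A non-diagnostic method for detecting the presence and/or amount of a target nucleic acid molecule in a biological sample, the method comprising the following steps:
(a) contacting the biological sample with: i) a Cas12b protein, ii) a gRNA directed against a target sequence in the target nucleic acid molecule, and iii) a single-stranded DNA reporter molecule that produces a detectable signal upon cleavage, thereby forming a reaction mixture;
(b) detecting the presence and/or level of the detectable signal produced in the reaction mixture;
wherein the presence and/or level of the detectable signal represents the presence and/or amount of the target nucleic acid molecule;
wherein the Cas12b protein is the AaCas12b protein from Alicyclobacillus acidiphilus, and its amino acid sequence is as shown in SEQ ID NO:1;
wherein the single-stranded DNA in the single-stranded DNA reporter molecule is not polyguanylic acid (poly-G).
2. The method of claim 1, wherein the target nucleic acid molecule is a double-stranded DNA molecule or a single-stranded DNA molecule.
3. The method of claim 1, wherein the gRNA is an sgRNA.
4. The method of claim 3, wherein the sgRNA comprises a scaffold sequence encoded by a nucleic acid sequence selected from one of SEQ ID NOs: 11-66.
5. The method of claim 4, wherein the sgRNA comprises a spacer sequence located at the 3' end of the scaffold sequence, the spacer sequence specifically hybridizing to the target sequence or to the complement of the target sequence.
6. The method of claim 5, wherein the spacer sequence has at least one nucleotide mismatch with the target sequence.
7. The method of any one of claims 1-6, wherein the single-stranded DNA reporter molecule comprises a fluorophore and a corresponding quencher at its two ends, respectively.
8. The method of claim 7, wherein the fluorophore is selected from FAM, TEX, HEX, Cy3, or Cy5, and the quencher is selected from BHQ1, BHQ2, BHQ3, or TAMRA.
9. The method of any one of claims 1-6, wherein the single-stranded DNA reporter molecule is 2 to 100 nucleotides in length.
10. The method of any one of claims 1-6, wherein the single-stranded DNA reporter molecule is polyadenylic acid, polycytidylic acid, or polythymidylic acid.
11. The method of any one of claims 1-6, further comprising a step of amplifying the nucleic acid molecules in the biological sample before step (a).
12. The method of claim 11, wherein the amplification is PCR amplification or recombinase polymerase amplification.
13. The method of claim 12, wherein the recombinase polymerase amplification is carried out for 10 to 60 minutes.
14. The method of any one of claims 1-6, wherein, before being contacted with the biological sample, the Cas12b protein has been pre-complexed with the gRNA to form a Cas12b-gRNA complex.
15. The method of any one of claims 1-6, wherein the reaction of step (a) is carried out for 20 to 180 minutes.
16. The method of any one of claims 1-6, wherein step (a) is carried out in NEBuffer™ 2, NEBuffer™ 2.1, or Cutsmart® Buffer.
17. The method of any one of claims 1-6, wherein the biological sample is selected from whole blood, plasma, serum, cerebrospinal fluid, urine, feces, and cell or tissue extracts.
18. The method of claim 17, wherein the biological sample is a nucleic acid sample extracted from cells or tissue.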
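19. Use of the AaCas12b protein from Alicyclobacillus acidiphilus in the preparation of a reagent or kit, the reagent or kit being for detecting the presence and/or amount of a target nucleic acid molecule in a biological sample by a method comprising the following steps:
(a) contacting the biological sample with: i) the AaCas12b protein, ii) a gRNA directed against a target sequence in the target nucleic acid molecule, and iii) a single-stranded DNA reporter molecule that produces a detectable signal upon cleavage, thereby forming a reaction mixture;
(b) detecting the presence and/or level of the detectable signal produced in the reaction mixture;
wherein the presence and/or level of the detectable signal represents the presence and/or amount of the target nucleic acid molecule;
wherein the amino acid sequence of the AaCas12b protein from Alicyclobacillus acidiphilus is as shown in SEQ ID NO:1;
wherein the single-stranded DNA in the single-stranded DNA reporter molecule is not polyguanylic acid (poly-G).
20. The use of claim 19, wherein the target nucleic acid molecule is a double-stranded DNA molecule or a single-stranded DNA molecule.
21. The use of claim 19, wherein the gRNA is an sgRNA.
22. The use of claim 21, wherein the sgRNA comprises a scaffold sequence encoded by a nucleic acid sequence selected from one of SEQ ID NOs: 11-66.
23. The use of claim 22, wherein the sgRNA comprises a spacer sequence located at the 3' end of the scaffold sequence, the spacer sequence specifically hybridizing to the target sequence or to the complement of the target sequence.
24. The use of claim 23, wherein the spacer sequence has at least one nucleotide mismatch with the target sequence.
25. The use of any one of claims 19-24, wherein the single-stranded DNA reporter molecule comprises a fluorophore and a corresponding quencher at its two ends, respectively.
26. The use of claim 25, wherein the fluorophore is selected from FAM, TEX, HEX, Cy3, or Cy5, and the quencher is selected from BHQ1, BHQ2, BHQ3, or TAMRA.
27. The use of any one of claims 19-24, wherein the single-stranded DNA reporter molecule is 2 to 100 nucleotides in length.
28. The use of any one of claims 19-24, wherein the single-stranded DNA reporter molecule is polyadenylic acid, polycytidylic acid, or polythymidylic acid.
29. The use of any one of claims 19-24, further comprising a step of amplifying the nucleic acid molecules in the biological sample before step (a).
30. The use of claim 29, wherein the amplification is PCR amplification or recombinase polymerase amplification.
31. The use of claim 30, wherein the recombinase polymerase amplification is carried out for 10 to 60 minutes.
32. The use of any one of claims 19-24, wherein, before being contacted with the biological sample, the Cas12b protein has been pre-complexed with the gRNA to form a Cas12b-gRNA complex.
33. The use of any one of claims 19-24, wherein the reaction of step (a) is carried out for 20 to 180 minutes.
34. The use of any one of claims 19-24, wherein step (a) is carried out in NEBuffer™ 2, NEBuffer™ 2.1, or Cutsmart® Buffer.
35. The use of any one of claims 19-24, wherein the biological sample is selected from whole blood, plasma, serum, cerebrospinal fluid, urine, feces, and cell or tissue extracts.
36. The use of claim 35, wherein the biological sample is a nucleic acid sample extracted from cells or tissue.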
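
The structural constraints in the claims can be illustrated with a short sketch. It is not taken from the patent. It assumes an sgRNA is obtained by transcribing a DNA-encoded scaffold to RNA and appending the spacer at the 3' end (claims 4-5), and it checks an ssDNA reporter against the poly-G exclusion of claim 1 together with the 2 to 100 nucleotide range of claim 9. The function names and the example spacer are hypothetical.

```python
def dna_to_rna(dna):
    """Transcribe a DNA-encoded sequence (sense strand) to RNA letters."""
    return dna.replace(" ", "").upper().replace("T", "U")


def build_sgrna(scaffold_dna, spacer):
    """Join scaffold and spacer, with the spacer at the 3' end of the scaffold."""
    return dna_to_rna(scaffold_dna) + dna_to_rna(spacer)


def reporter_in_scope(reporter_dna):
    """Check an ssDNA reporter: valid bases, 2-100 nt long, and not poly-G."""
    seq = reporter_dna.replace(" ", "").upper()
    if not seq or set(seq) - set("ACGT"):
        return False
    if not 2 <= len(seq) <= 100:
        return False
    return set(seq) != {"G"}


# Scaffold copied from SEQ ID NO: 41; the spacer below is a made-up example.
scaffold_41 = (
    "ggtctaaagg acagaaaatc tgtgcgtgtg ccataagtaa ttaaaaatta cccaccacag "
    "gcttcaaaga agtggcac"
)
hypothetical_spacer = "GACTAGCTAGCTAGCTAGCT"
sgrna = build_sgrna(scaffold_41, hypothetical_spacer)

print(len(sgrna))
print(reporter_in_scope("TTTTT"), reporter_in_scope("GGGGG"))
```

The check combines an independent-claim limitation (no poly-G) with a dependent-claim limitation (length), so it is stricter than claim 1 alone; the length test can be dropped if only the independent claim is of interest.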
CN201811453278.9A 2018-09-20 2018-11-30 Nucleic acid detection method Active CN109837328B (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201811099146 2018-09-20
CN2018110991460 2018-09-20

Publications (2)

Publication Number Publication Date
CN109837328A (zh) 2019-06-04
CN109837328B (zh) 2021-07-27

Family

ID=66883179

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811453278.9A Active CN109837328B (zh) 2018-09-20 2018-11-30 Nucleic acid detection method

Country Status (5)

Country Link
US (1) US20210381038A1 (zh)
EP (1) EP4023766B1 (zh)
JP (1) JP2022501039A (zh)
CN (1) CN109837328B (zh)
WO (1) WO2020056924A1 (zh)

Families Citing this family (22)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11584955B2 (en) * 2017-07-14 2023-02-21 Shanghai Tolo Biotechnology Company Limited Application of Cas protein, method for detecting target nucleic acid molecule and kit
CA3106035A1 (en) * 2018-08-07 2020-02-13 The Broad Institute, Inc. Cas12b enzymes and systems
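KR20220062289A (ko) * 2019-08-12 2022-05-16 라이프에디트 테라퓨틱스, 인크. RNA-guided nucleases and active fragments and variants thereof and methods of use
CN111690717B (zh) * 2020-04-30 2023-05-30 山东舜丰生物科技有限公司 Method and system for detecting target nucleic acid based on CRISPR technology
CN111996236B (zh) * 2020-05-29 2021-06-29 山东舜丰生物科技有限公司 Method for detecting target nucleic acid based on CRISPR technology
CN111690773B (zh) * 2020-06-17 2021-08-20 山东舜丰生物科技有限公司 Method and system for detecting target nucleic acid using a novel Cas enzyme
CN111778230A (zh) * 2020-07-17 2020-10-16 山东舜丰生物科技有限公司 Buffer system suitable for Cas12 protein and application thereof
CN112301016B (zh) * 2020-07-23 2023-09-08 广州美格生物科技有限公司 Application of novel mlCas12a protein in nucleic acid detection
CN116024314A (zh) * 2020-09-18 2023-04-28 山东舜丰生物科技有限公司 Method for multiplex detection of target nucleic acids based on CRISPR technology
CN114507716A (zh) * 2020-11-16 2022-05-17 北京迅识科技有限公司 Method for detecting target nucleic acid in a sample
CN113308451B (zh) * 2020-12-07 2023-07-25 中国科学院动物研究所 Engineered Cas effector proteins and methods of use thereof
CN112877410B (zh) * 2020-12-30 2022-09-13 东北大学 Optimized CRISPR-mediated nucleic acid detection system and detection method thereof
NO347409B1 (no) * 2021-04-11 2023-10-23 Glomsroed Mikkel Ante Method for detection of microorganisms or genetic defects
CN113122554A (zh) * 2021-04-14 2021-07-16 河北科技大学 Nucleic acid molecule for efficient expression of Bs3Cas12b protein in Arabidopsis thaliana and its application in genome editing
CN117337326A (zh) 2021-05-27 2024-01-02 中国科学院动物研究所 Engineered Cas12i nucleases, effector proteins, and uses thereof
CN113373130B (zh) * 2021-05-31 2023-12-22 复旦大学 Cas12 protein, gene editing system containing Cas12 protein, and applications
CN116042848A (zh) * 2021-10-28 2023-05-02 北京干细胞与再生医学研究院 Method for determining the presence of a graft in an animal by blood sample testing
CN113913509A (zh) * 2021-11-18 2022-01-11 江苏博嘉生物医学科技有限公司 Detection kit and detection method for gene mutations of molecular markers related to stroke risk screening
WO2023114090A2 (en) * 2021-12-13 2023-06-22 Labsimply, Inc. Signal boost cascade assay
CN114231530B (zh) * 2021-12-20 2024-03-15 大连理工大学 Cas12a-CcrRNA system based on regulation by a nucleic acid ribozyme and circular guide RNA, and application thereof
WO2024046307A1 (zh) * 2022-08-29 2024-03-07 北京迅识科技有限公司 Mutated type V CRISPR enzyme and application thereof
CN115820818B (zh) * 2022-12-13 2024-02-23 博迪泰(厦门)生物科技有限公司 One-step nucleic acid detection method and application thereof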

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
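CN107488710A (zh) * 2017-07-14 2017-12-19 上海吐露港生物科技有限公司 Use of a Cas protein, and method and kit for detecting target nucleic acid molecules
CN107784200A (zh) * 2016-08-26 2018-03-09 深圳华大基因研究院 Method and apparatus for screening novel CRISPR-Cas systems
CN110551800A (zh) * 2018-06-03 2019-12-10 上海吐露港生物科技有限公司 Use of a high-temperature-resistant Cas protein, and method and kit for detecting target nucleic acid molecules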

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2018112098A1 (en) * 2016-12-13 2018-06-21 University Of Pittsburgh - Of The Commonwealth System Of Higher Education Methods of treating cells containing fusion genes by genomic targeting
CN108277231B (zh) * 2017-01-06 2021-02-02 中国科学院分子植物科学卓越创新中心 CRISPR system for Corynebacterium genome editing
US10253365B1 (en) * 2017-11-22 2019-04-09 The Regents Of The University Of California Type V CRISPR/Cas effector proteins for cleaving ssDNAs and detecting target DNAs
WO2019127087A1 (zh) * 2017-12-27 2019-07-04 中国科学院动物研究所 Genome editing system and method
CN112543812A (zh) * 2018-06-26 2021-03-23 麻省理工学院 Amplification methods, systems, and diagnostics based on CRISPR effector systems

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
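CN107784200A (zh) * 2016-08-26 2018-03-09 深圳华大基因研究院 Method and apparatus for screening novel CRISPR-Cas systems
CN107488710A (zh) * 2017-07-14 2017-12-19 上海吐露港生物科技有限公司 Use of a Cas protein, and method and kit for detecting target nucleic acid molecules
CN110551800A (zh) * 2018-06-03 2019-12-10 上海吐露港生物科技有限公司 Use of a high-temperature-resistant Cas protein, and method and kit for detecting target nucleic acid molecules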

Non-Patent Citations (5)

* Cited by examiner, † Cited by third party
Title
Alicyclobacillus acidiphilus NBRC 100859, NZ_BCQI01000053.1; Hosoyama, A. et al.; NCBI GenBank; 20170416; pp. 1-2 *
CRISPR-Cas12a target binding unleashes indiscriminate single-stranded DNase activity; Janice S. Chen et al.; Science; 20180215; Vol. 360 (No. 6387); pp. 436-439 *
CRISPR-Cas12b-assisted nucleic acid detection platform; Linxian Li et al.; bioRxiv; 20180706; pp. 1-12 *
Hosoyama, A. et al. Alicyclobacillus acidiphilus NBRC 100859, NZ_BCQI01000053.1. NCBI GenBank. 2017. *
Linxian Li et al. CRISPR-Cas12b-assisted nucleic acid detection platform. bioRxiv. 2018. *

Also Published As

Publication number Publication date
US20210381038A1 (en) 2021-12-09
EP4023766A1 (en) 2022-07-06
JP2022501039A (ja) 2022-01-06
CN109837328A (zh) 2019-06-04
EP4023766B1 (en) 2024-04-03
WO2020056924A1 (zh) 2020-03-26

Similar Documents

Publication Publication Date Title
CN109837328B (zh) Nucleic acid detection method
US10704091B2 (en) Genotyping by next-generation sequencing
CN108699598B (zh) 用于分析修饰的核苷酸的组合物和方法
US10787702B2 (en) Thermolabile exonucleases
US20020132259A1 (en) Mutation detection using MutS and RecA
AU2002311761A1 (en) Mutation detection using MutS and RecA
US20040224336A1 (en) RecA-assisted specific oligonucleotide extension method for detecting mutations, SNPs and specific sequences
KR20220131939A (ko) Improved detection assay
Niu et al. Highly sensitive detection method for HV69-70del in SARS-CoV-2 alpha and omicron variants based on CRISPR/Cas13a
JP2005507668A (ja) RecA-assisted detection of mutations, single nucleotide polymorphisms, and specific sequences
CN117210437A (zh) Identification of two gene editing tool enzymes and their application in nucleic acid detection
WO2024112441A1 (en) Double-stranded dna deaminases and uses thereof
WO2023148235A1 (en) Methods of enriching nucleic acids
JP2015528281A (ja) Novel DNA polymerase with expanded substrate range
CN110195102A (zh) Method for genotyping β-thalassemia
Liu et al. A-Star, an Argonaute-directed System for Rare SNV Enrichment and Detection
Liu et al. Argonaute-mediated system for supersensitive and multiplexed detection of rare mutations
US20100196893A1 (en) Method for genotyping DNA tandem repeat sequences

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant