CN110272881B - High-specificity truncated variants TSpCas9-V1/V2 of the endonuclease SpCas9 and uses thereof - Google Patents

Publication number
CN110272881B
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910581265.8A
Other languages
English (en)
Other versions
CN110272881A (zh)
Inventor
黄强
薛冬梅
汤洪海
朱海霞
杜文豪
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Fudan University
Original Assignee
Fudan University

Classifications

    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/87 Introduction of foreign genetic material using processes not otherwise provided for, e.g. co-transformation
    • C12N15/90 Stable introduction of foreign DNA into chromosome
    • C12N15/902 Stable introduction of foreign DNA into chromosome using homologous recombination
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N9/00 Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
    • C12N9/14 Hydrolases (3)
    • C12N9/16 Hydrolases (3) acting on ester bonds (3.1)
    • C12N9/22 Ribonucleases RNAses, DNAses

Landscapes

  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Genetics & Genomics (AREA)
  • Chemical & Material Sciences (AREA)
  • Engineering & Computer Science (AREA)
  • Organic Chemistry (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Molecular Biology (AREA)
  • Wood Science & Technology (AREA)
  • Zoology (AREA)
  • Biomedical Technology (AREA)
  • Biotechnology (AREA)
  • General Engineering & Computer Science (AREA)
  • Microbiology (AREA)
  • Biochemistry (AREA)
  • General Health & Medical Sciences (AREA)
  • Medicinal Chemistry (AREA)
  • Mycology (AREA)
  • Physics & Mathematics (AREA)
  • Biophysics (AREA)
  • Plant Pathology (AREA)
  • Micro-Organisms Or Cultivation Processes Thereof (AREA)
  • Enzymes And Modification Thereof (AREA)

Abstract

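The invention belongs to the technical field of protein engineering and specifically relates to truncated, high-specificity variants of the CRISPR nuclease SpCas9 from Streptococcus pyogenes and their use. The CRISPR-Cas9 nucleases of the invention (TSpCas9-V1/V2) belong to the CRISPR-Cas9 system. In TSpCas9-V1, the amino acid H at position 863 of the truncated CRISPR-Cas9 nuclease (TSpCas9) is mutated to N. In TSpCas9-V2, the amino acid H at position 862 of TSpCas9 is mutated to A and the amino acid H at position 863 is mutated to N. These truncated high-specificity variants have gene-editing activity comparable to that of the wild-type CRISPR-Cas9 nuclease and reduce off-target events in gene editing, so they can be used for precise editing at specific positions of genomic DNA fragments.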

Description

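High-specificity truncated variants TSpCas9-V1/V2 of the endonuclease SpCas9 and uses thereof
Technical Field
The invention belongs to the technical field of protein engineering and specifically relates to the high-specificity truncated variants TSpCas9-V1/V2 of SpCas9, the CRISPR-Cas9 nuclease from Streptococcus pyogenes, and to their use in gene editing.
Background Art
The CRISPR/Cas system makes it convenient to cut and edit specific genes in a genome. It is a recently discovered, revolutionary gene-editing technology that has developed especially rapidly in recent years. The CRISPR-Cas9 nuclease SpCas9 from Streptococcus pyogenes is currently the most widely used CRISPR nuclease [1]. Because it is easy to use, efficient, highly specific, and versatile, it has been applied in many fields, including medical research and biotechnology, for example the rapid construction of cell or animal models, the convenient screening of functional genes, and the treatment of some genetic diseases [2-6].
Although the advantages of the CRISPR-Cas9 system have driven its broad adoption, many problems remain to be solved. For example, SpCas9 is too large to be delivered efficiently and in a targeted manner into the body or into cells by viral vectors, which limits its development in clinical therapy [7]. In addition, when SpCas9 is guided by an sgRNA to a specific target sequence for cleavage, the sgRNA may partially pair with other genomic sites that resemble the target sequence and thereby activate Cas9 to cut non-target DNA. It is therefore difficult to target a specific disease-gene locus precisely, off-target effects arise, and the use of the technology in precision medicine is limited [8-10].
In view of these problems, it is necessary to find a Cas9 nuclease that is smaller and more specific, so that DNA fragments can be edited precisely and effectively in vivo.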
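Summary of the Invention
The object of the invention is to provide truncated high-specificity variants of SpCas9, the CRISPR-Cas9 nuclease from Streptococcus pyogenes, and uses thereof.
The invention provides two truncated high-specificity variants of SpCas9, the CRISPR-Cas9 nuclease from Streptococcus pyogenes. The first, designated the TSpCas9-V1 nuclease, is obtained by mutating the amino acid H at position 863 of the truncated CRISPR-Cas9 nuclease (TSpCas9) to N. The second, designated the TSpCas9-V2 nuclease, is obtained by mutating the amino acid H at position 862 of TSpCas9 to A and the amino acid H at position 863 to N. The two variants are referred to together as TSpCas9-V1/V2. They have gene-editing activity comparable to that of the wild-type CRISPR-Cas9 nuclease but higher specificity, and can achieve precise editing.
The nucleotide sequence and amino acid sequence of the wild-type SpCas9 nuclease are shown in SEQ ID NO.1 and SEQ ID NO.2, respectively.
The nucleotide sequence and amino acid sequence of the truncated TSpCas9 nuclease are shown in SEQ ID NO.3 and SEQ ID NO.4, respectively.
The nucleotide sequence and amino acid sequence of the truncated high-specificity TSpCas9-V1 nuclease are shown in SEQ ID NO.5 and SEQ ID NO.6, respectively; its similarity to wild-type SpCas9 is above 90%.
The nucleotide sequence and amino acid sequence of the truncated high-specificity TSpCas9-V2 nuclease are shown in SEQ ID NO.7 and SEQ ID NO.8, respectively; its similarity to wild-type SpCas9 is above 90%.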
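The sequence lengths can be cross-checked with simple arithmetic. The sketch below uses only the <211> lengths given in the sequence listing (4104 and 3744 nt; 1368 and 1248 aa). Treating "similarity" as identical residues divided by the wild-type length is an assumption made for illustration, not the method stated in the patent, and the function names are hypothetical.

```python
# Lengths taken from the <211> fields of the sequence listing.
WT_NT, WT_AA = 4104, 1368   # SEQ ID NO.1 / NO.2 (wild-type SpCas9)
T_NT, T_AA = 3744, 1248     # SEQ ID NO.3 / NO.4 (truncated TSpCas9)


def deleted_residues(wt_nt: int, trunc_nt: int) -> int:
    """Number of codons removed by an in-frame deletion."""
    diff = wt_nt - trunc_nt
    if diff % 3 != 0:
        raise ValueError("deletion is not in frame")
    return diff // 3


def identity_to_wild_type(trunc_aa: int, substitutions: int, wt_aa: int) -> float:
    """Assumed metric: identical residues divided by wild-type length."""
    return (trunc_aa - substitutions) / wt_aa


if __name__ == "__main__":
    print("residues deleted:", deleted_residues(WT_NT, T_NT))
    # V1 carries one substitution, V2 carries two.
    for name, subs in (("TSpCas9-V1", 1), ("TSpCas9-V2", 2)):
        pct = 100 * identity_to_wild_type(T_AA, subs, WT_AA)
        print(name, f"{pct:.1f}%")
```

Under this assumed metric the truncation removes 120 residues and both variants come out at about 91%, which is consistent with the "above 90%" statement.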
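The invention also provides a polynucleotide sequence that can be transcribed and translated into the CRISPR-Cas9 nuclease (TSpCas9-V1/V2).
The invention also provides an expression vector containing the above polynucleotide sequence.
The invention also provides a host cell that can be transformed with the above expression vector.
The invention also provides a method for preparing the CRISPR-Cas9 nuclease (TSpCas9-V1/V2), comprising the following steps: first, constructing an expression vector carrying the polynucleotide sequence of the CRISPR-Cas9 nuclease; then, transforming the expression vector into host cells and screening and picking single clones; finally, inducing expression in the single clones and isolating the CRISPR-Cas9 nuclease from the expression product by methods such as affinity chromatography and ion exchange.
The CRISPR-Cas9 nucleases, polynucleotide sequences, and expression vectors provided by the invention can all serve as tools for editing genomic DNA and be used for editing genomic DNA fragments.
In the invention, the gene editing may be single-site editing or multi-site editing at two or more sites.
The editing operations include deletion, mutation, insertion, inversion, transposition, duplication, or translocation.
In the invention, the CRISPR-Cas9 editing tool includes a guide sgRNA that matches the target DNA fragment.
The CRISPR-Cas9 nuclease is combined with an sgRNA capable of guiding it, so as to edit the gene of interest.
In the invention, a vector containing the polynucleotide sequence encoding the CRISPR-Cas9 nuclease is introduced into a host cell together with the matching guide sgRNA to edit a gene.
In the invention, the single-site or multi-site gene editing includes cleaving double-stranded DNA with the CRISPR-Cas9 nuclease and repairing the resulting break through the repair system of the host cell.
In the invention, the single-site or multi-site gene editing changes the base-mutation characteristics at the edited site or sites.
Compared with the prior art, the nucleases of the invention (TSpCas9-V1/V2) belong to the CRISPR-Cas9 immune system and have the nucleotide and amino acid sequences of SEQ ID NO.5 and SEQ ID NO.6 (TSpCas9-V1) and of SEQ ID NO.7 and SEQ ID NO.8 (TSpCas9-V2), respectively. They improve the specificity of gene editing, enable precise editing at specific positions of genomic DNA fragments, and have potential value for precision biomedical applications.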
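Brief Description of the Drawings
Figure 1 shows the construction of the Pet21-6His-TEV-TSpCas9-V1/V2 plasmids.
Figure 2 shows the screening and culture of the Pet21-6His-TEV-TSpCas9-V1/V2 plasmids.
Figure 3 shows the sequencing of the plasmids containing TSpCas9-V1/V2.
Figure 4 shows the purification method for the TSpCas9-V1/V2 target proteins.
Figure 5 shows the purification process for obtaining the TSpCas9-V1/V2 target proteins.
Figure 6 shows the electrophoretic identification of the purified CRISPR-Cas9 target proteins.
Figure 7 shows the on-target and off-target sequences of the sgRNAs.
Figure 8 shows the in vitro cleavage activity of wild-type SpCas9, TSpCas9, and TSpCas9-V1/V2.
Figure 9 shows the in vitro on-target and off-target effects of wild-type SpCas9.
Figure 10 shows the in vitro on-target and off-target effects of truncated TSpCas9.
Figure 11 shows the in vitro on-target and off-target effects of the truncated variant TSpCas9-V1.
Figure 12 shows the in vitro on-target and off-target effects of the truncated variant TSpCas9-V2.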
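Detailed Description
The invention is further described below through specific examples.
Unless otherwise specified, the experimental methods used in the following examples are conventional methods.
Unless otherwise specified, the materials, reagents, and the like used in the following examples are commercially available.
I. CRISPR-Cas9 nuclease
In the TSpCas9-V1 nuclease of the invention, the amino acid H at position 863 of the truncated CRISPR-Cas9 nuclease (TSpCas9) is mutated to N. In the TSpCas9-V2 nuclease, the amino acid H at position 862 of TSpCas9 is mutated to A and the amino acid H at position 863 is mutated to N. Both belong to the CRISPR-Cas9 system and have gene-editing activity comparable to that of the wild-type CRISPR-Cas9 nuclease, but target the editing site more specifically than the wild type and achieve precise editing.
II. Polynucleotide sequence encoding the CRISPR-Cas9 nuclease
The polynucleotide sequence that is transcribed and translated into the CRISPR-Cas9 nuclease (TSpCas9-V1/V2) may be DNA or RNA. The DNA may be plasmid DNA, genomic DNA, or synthetic DNA.
The polynucleotide sequence encoding the CRISPR-Cas9 nuclease (TSpCas9-V1/V2) can be prepared using molecular biology techniques familiar to researchers or technicians in the field, including but not limited to recombinant DNA technology and chemical synthesis.
III. Expression vector
The expression vector contains the polynucleotide sequence encoding the CRISPR-Cas9 nuclease (TSpCas9-V1/V2). It can be constructed by molecular biology methods familiar to researchers or technicians, including recombinant DNA technology and DNA synthesis. In essence, the DNA of the CRISPR-Cas9 nuclease (TSpCas9-V1/V2) is operably ligated into a cloning site of the vector, and the target protein TSpCas9-V1/V2 is then expressed through transcription and translation.
IV. Host cell
The host cell is the cell transformed with the recombinant plasmid that expresses the CRISPR-Cas9 nuclease. Host cells mainly include prokaryotic cells (such as bacteria), lower eukaryotic cells (such as yeast), and higher eukaryotic cells (such as mammalian cells). Commonly used host cells include Escherichia coli DH5α, Pichia pastoris, HEK293, CHO, and HeLa cells.
V. Uses of the CRISPR-Cas9 nuclease (TSpCas9-V1/V2), the nucleotide sequence encoding it, and the expression vector
The CRISPR-Cas9 nuclease (TSpCas9-V1/V2) of the invention, the polynucleotide sequence encoding it, and the expression vector can be used to edit genomic DNA fragments or to prepare gene-editing tools. Editing with the CRISPR-Cas9 nuclease (TSpCas9-V1/V2) includes single-site and multi-site editing, and the editing operations include deletion, mutation, insertion, inversion, transposition, duplication, or translocation.
VI. Gene-editing tool and method
The gene-editing tool of the invention belongs to the CRISPR-Cas9 system. Guided by a specific sgRNA, CRISPR-Cas9 (TSpCas9-V1/V2) can cleave the substrate DNA between the 3rd and 4th bases upstream of the PAM (NGG) site in the target DNA fragment. The editing can be carried out in vivo or in vitro. With a single sgRNA, single-site editing can be performed; with two or more sgRNAs, multi-site editing can be performed.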
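The cleavage position described here (between the 3rd and 4th bases upstream of an NGG PAM) can be sketched in a few lines. This is a minimal illustration on the forward strand only; the function name, the 20-nt protospacer, and the toy DNA are hypothetical and do not come from the patent.

```python
import re
from typing import List


def find_cut_sites(dna: str, protospacer: str) -> List[int]:
    """Return 0-based indices i such that the blunt cut falls between
    dna[i-1] and dna[i], for every forward-strand occurrence of the
    protospacer that is immediately followed by an NGG PAM."""
    dna = dna.upper()
    protospacer = protospacer.upper()
    # Lookahead so that overlapping occurrences are also reported.
    pattern = re.compile("(?=" + re.escape(protospacer) + "[ACGT]GG)")
    cuts = []
    for match in pattern.finditer(dna):
        pam_start = match.start() + len(protospacer)
        cuts.append(pam_start - 3)  # three bases remain between cut and PAM
    return cuts


if __name__ == "__main__":
    spacer = "GATTACAGATTACAGATTAC"       # hypothetical 20-nt protospacer
    toy = "TTTT" + spacer + "AGG" + "CCCC"  # hypothetical target with PAM
    site = find_cut_sites(toy, spacer)[0]
    print(toy[:site], "|", toy[site:])
```

A full tool would also scan the reverse strand and handle ambiguous bases; those cases are left out to keep the sketch short.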
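As illustrated in some embodiments of the invention, the CRISPR-Cas9 nuclease (TSpCas9-V1/V2), guided by sgRNA, can cleave substrate DNA (920 bp) in vitro, giving products of 760 bp and 260 bp.
In the invention, Cas9 may be used as an abbreviation of CRISPR-Cas9 nuclease and has the same meaning. The truncated high-specificity TSpCas9-V1 nuclease of the invention is the protein expressed after the amino acid H at position 863 of the truncated CRISPR-Cas9 nuclease (TSpCas9) is mutated to N. The TSpCas9-V2 nuclease is the protein expressed after the amino acid H at position 862 of TSpCas9 is mutated to A and the amino acid H at position 863 is mutated to N.
Before the specific embodiments are described further, it should be understood that the scope of protection of the invention is not limited to the specific embodiments below, and that the terminology used in the examples serves to describe particular embodiments and not to limit the scope of protection. Test methods for which no specific conditions are given in the following examples are generally carried out under conventional conditions or under the conditions recommended by the manufacturers.
Where numerical ranges are given in the examples, it should be understood that, unless otherwise stated, both endpoints of each range and any value between them may be selected. Unless otherwise defined, all technical and scientific terms used herein have the meanings commonly understood by those skilled in the art. In addition to the specific methods, equipment, and materials used in the examples, any prior-art methods, equipment, and materials similar or equivalent to those described in the examples may be used to practice the invention, based on the skilled person's knowledge of the prior art and the disclosure herein.
Unless otherwise stated, the experimental, detection, and preparation methods disclosed herein use conventional techniques of molecular biology, biochemistry, recombinant DNA technology, and related fields.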
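Example 1: Construction of plasmids for the CRISPR-Cas9 nucleases (TSpCas9-V1/V2)
1. Design of the variants
Using the pet21-6His-TEV-TSpCas9 plasmid, i.e. SEQ ID NO.3 (3744 bp), as the template, base C at position 2587 was mutated to A, or bases CACC at positions 2584-2587 were mutated to GCCA, with all other bases left unchanged. The resulting constructs were named TSpCas9-V1 and TSpCas9-V2, respectively. The design is shown in Figure 1, and the steps are briefly as follows:
Point mutations were introduced into the plasmid Pet21-6His-TEV-TSpCas9 (corresponding to SEQ ID NO.3) with the primer pairs F2587/R2587 and F2584/R2584, respectively, and the plasmid template was then digested to give the target products TSpCas9-V1/V2.
(1) Site-directed mutagenesis kit
The Fast Site-Directed Mutagenesis Kit was ordered from Tiangen Biotech (Beijing) Co., Ltd.
(2) Primers
All primers were ordered from Sangon Biotech (Shanghai) Co., Ltd. Their sequences are as follows:
Forward and reverse primers for mutating base C at position 2587 to A:
F2587: GATCAACAATTACCACAATGCGCATGATGCC (SEQ ID NO.9)
R2587: TGTGGTAATTGTTGATCTCTCTCACCTTATA (SEQ ID NO.10)
Forward and reverse primers for mutating bases CACC at positions 2584-2587 to GCCA:
F2584: GATCAACAATTACGCCAATGCGCATGATGCCTAC (SEQ ID NO.11)
R2584: GTAGGCATCATGCGCATTGGCGTAATTGTTGATC (SEQ ID NO.12).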
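The link between the nucleotide changes and the amino acid substitutions can be checked directly. The sketch below copies the window at nucleotides 2571-2604 from SEQ ID NO.3 in the sequence listing, applies both substitutions, translates codons 861-863, and compares the result with the listed primers. The helper names and the reduced codon table are illustrative assumptions, not part of the patent.

```python
# Window of SEQ ID NO.3 covering nucleotides 2571-2604 (1-based).
WINDOW_START = 2571
WT_WINDOW = "GATCAACAATTACCACCATGCGCATGATGCCTAC"

# Primers as listed (SEQ ID NO.9, NO.11, NO.12).
F2587 = "GATCAACAATTACCACAATGCGCATGATGCC"
F2584 = "GATCAACAATTACGCCAATGCGCATGATGCCTAC"
R2584 = "GTAGGCATCATGCGCATTGGCGTAATTGTTGATC"

# Only the codons needed here; a full table would come from a library.
CODONS = {"TAC": "Y", "CAC": "H", "CAT": "H", "AAT": "N", "GCC": "A"}


def substitute(seq: str, first: int, last: int, new: str) -> str:
    """Replace 1-based positions first..last of SEQ ID NO.3 inside the window."""
    i, j = first - WINDOW_START, last - WINDOW_START + 1
    return seq[:i] + new + seq[j:]


def residue_number(nt_pos: int) -> int:
    """1-based residue whose codon contains a 1-based nucleotide position."""
    return (nt_pos - 1) // 3 + 1


def residues_861_to_863(seq: str) -> str:
    i = 2581 - WINDOW_START  # nucleotide 2581 starts codon 861
    return "".join(CODONS[seq[k:k + 3]] for k in range(i, i + 9, 3))


def reverse_complement(seq: str) -> str:
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


v1 = substitute(WT_WINDOW, 2587, 2587, "A")      # C2587A
v2 = substitute(WT_WINDOW, 2584, 2587, "GCCA")   # CACC -> GCCA

if __name__ == "__main__":
    print(residue_number(2584), residue_number(2587))
    print(residues_861_to_863(WT_WINDOW), residues_861_to_863(v1),
          residues_861_to_863(v2))
    print(v1.startswith(F2587), v2 == F2584, reverse_complement(F2584) == R2584)
```

Run as a script, this prints `862 863`, then `YHH YHN YAN`, then `True True True`: C2587A gives H863N, the four-base change gives H862A plus H863N, and F2584/R2584 are exact reverse complements of each other.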
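The site-directed mutagenesis reaction system was as follows:
Figure BDA0002113242960000051
PCR conditions:
Figure BDA0002113242960000061
The site-directed mutagenesis reaction system was as follows:
Figure BDA0002113242960000062
PCR conditions:
Figure BDA0002113242960000063
Figure BDA0002113242960000071
After the reaction had been incubated at 37°C for 1 hour, it was transformed into Escherichia coli DH5α (purchased from TIANGEN) and cultured overnight at 37°C to select single TSpCas9-V1/V2 clones, as shown in Figure 2. Their nucleotide and amino acid sequences are SEQ ID NO.5 and SEQ ID NO.6 (TSpCas9-V1) and SEQ ID NO.7 and SEQ ID NO.8 (TSpCas9-V2). The bacteria were then cultured.
(3) Plasmid extraction kit
The TIANprep Mini Plasmid Kit was ordered from Tiangen Biotech (Beijing) Co., Ltd.
Plasmids were extracted according to the instructions of the Tiangen mini plasmid kit.
(4) Sequencing of samples
Eight sample plasmids were sequenced by Sanger (first-generation) sequencing at Shanghai Jieli Biotechnology Co., Ltd. (上海杰李生物技术有限公司). Sample plasmids 1 and 6 were found to be correctly constructed; the results are shown in Figure 3.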
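Example 2: Preparation of the CRISPR-Cas9 (TSpCas9-V1/V2) nucleases
2. Protein expression and purification
2.1 Protein expression
(1) Open the clean bench, wipe the work surface and all tools and consumables with cotton balls soaked in 75% ethanol, irradiate with the UV lamp for 20 min, and start the fan;
(2) With a pipette, transfer 10 μl of Rosetta (DE3) culture (purchased from TIANGEN) expressing Pet21-6His-TEV-TSpCas9-V1/V2 into 6 ml of LB liquid medium containing two antibiotics (Amp and Cm), and culture overnight at 37°C with shaking at 200 r/min;
(3) Transfer the overnight culture at a volume ratio of 1:100 into 500 ml of LB liquid medium (purchased from Sangon) containing both antibiotics, and culture at 37°C with shaking at 200 r/min. Monitor the OD of the culture throughout;
(4) When the OD of the culture reaches the range 0.4 to 0.8, add the inducer IPTG to a final concentration of 0.1 mM, then culture at 16°C with shaking at 200 r/min for 20 h;
(5) Collect the culture, centrifuge at 5000 r/min for 5 min to pellet the cells, discard the supernatant, and weigh the Pet21-6His-TEV-TSpCas9-V1/V2 cell pellet.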
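The volumes implied by the expression steps above follow from the stated ratios. In the sketch below, the 1:100 inoculation ratio, the 500 ml culture, and the 0.1 mM final IPTG concentration come from the text; the 1 M IPTG stock concentration and the function names are assumptions for illustration only.

```python
def inoculum_volume_ml(culture_ml: float, dilution: int = 100) -> float:
    """Volume of overnight culture for a 1:dilution (v/v) inoculation."""
    return culture_ml / dilution


def stock_volume_ul(final_mM: float, culture_ml: float, stock_mM: float) -> float:
    """C1*V1 = C2*V2, returned in microlitres.
    The small volume contributed by the stock itself is ignored."""
    return final_mM * culture_ml / stock_mM * 1000.0


if __name__ == "__main__":
    culture_ml = 500.0        # from the protocol
    iptg_stock_mM = 1000.0    # assumed 1 M stock, not stated in the patent
    print("inoculum (ml):", inoculum_volume_ml(culture_ml))
    print("IPTG stock (ul):", round(stock_volume_ul(0.1, culture_ml, iptg_stock_mM), 1))
```

With these inputs the inoculum is 5 ml and the IPTG addition is about 50 μl; a different stock concentration scales the second number proportionally.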
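2.2 Protein purification
The protein was purified mainly by nickel-column affinity chromatography, as shown in Figure 4. The purification comprises cell disruption, collection of the protein sample by centrifugation, co-incubation of the protein sample with the nickel resin, and elution of the target protein, as shown in Figure 5. The detailed steps are as follows:
(1) Add to the cells pre-chilled lysis buffer containing PMSF at a final concentration of 0.1 mM (20 mM HEPES, 500 mM KCl, pH 7.5; 5 ml per 1 g of cells), resuspend the pellet by vortexing until evenly dispersed, and disrupt the cells with an ultrasonic cell disruptor (3 s on, 3 s off, 10 min per round, two rounds), keeping the sample on ice throughout;
(2) Add RNase (Sangon) to a final concentration of 10 μg/ml and DNase I (Sangon) to 5 μg/ml, incubate on ice for 30 min, centrifuge at 10000 r/min and 4°C for 45 to 60 min, and collect the supernatant;
(3) Incubate the supernatant with Qiagen Ni-NTA resin pre-equilibrated with equilibration buffer (20 mM HEPES, 500 mM KCl, 1% sucrose, pH 7.5), on ice with shaking (150 r/min); after 1.5 h, let it stand until the Qiagen Ni-NTA resin settles;
(4) Load the Qiagen Ni-NTA resin into a gravity column and, under monitoring with a BioLogic LP system, wash it at a flow rate of 2 ml/min with equilibration buffer and elution buffer (20 mM HEPES, 500 mM KCl, 500 mM imidazole, 1% sucrose, pH 7.5) at imidazole concentrations of 20, 30, 40, 50, 100, 250, and 500 mM, and collect the protein;
(5) Run the protein fractions obtained at the different imidazole concentrations on SDS-PAGE (purchased from EpiZyme Scientific), stain with Coomassie Brilliant Blue, destain, and examine the expression of the target protein and its binding to the column.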
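The buffer quantities in the purification steps above can be worked out with a small helper. The 5 ml per gram lysis ratio, the 500 mM imidazole elution buffer, and the step concentrations come from the text. Preparing each step by mixing imidazole-free equilibration buffer with elution buffer, and the 10 ml volume per step, are assumptions for illustration; the patent does not state how the steps were prepared.

```python
from typing import Tuple

IMIDAZOLE_STEPS_mM = [20, 30, 40, 50, 100, 250, 500]  # from the protocol


def lysis_buffer_ml(pellet_g: float, ml_per_g: float = 5.0) -> float:
    """Lysis buffer volume at the stated 5 ml per gram of wet cells."""
    return pellet_g * ml_per_g


def imidazole_mix(target_mM: float, total_ml: float,
                  elution_mM: float = 500.0) -> Tuple[float, float]:
    """Return (equilibration_ml, elution_ml) that give target_mM imidazole,
    assuming the equilibration buffer contains no imidazole."""
    if not 0 <= target_mM <= elution_mM:
        raise ValueError("target must lie between 0 and the elution buffer")
    elution_ml = total_ml * target_mM / elution_mM
    return total_ml - elution_ml, elution_ml


if __name__ == "__main__":
    print("lysis buffer for a 3 g pellet (ml):", lysis_buffer_ml(3.0))
    for step in IMIDAZOLE_STEPS_mM:
        eq_ml, el_ml = imidazole_mix(step, total_ml=10.0)  # 10 ml is hypothetical
        print(f"{step:>3} mM: {eq_ml:.1f} ml equilibration + {el_ml:.1f} ml elution")
```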
The purification results for the TSpCas9-V1/V2 proteins are shown in Figure 6: few contaminating bands are present, and the target proteins are relatively pure.
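Example 3: Testing the cleavage activity of the CRISPR-Cas9 (TSpCas9-V1/V2) nucleases
3. Activity assay of the variants
The substrate DNA (SEQ ID NO.13) was amplified by conventional PCR with primers QG-F and QG-R and then recovered by gel extraction.
(1) Amplification kit
The Fast HiFidelity PCR Kit was ordered from Tiangen Biotech (Beijing) Co., Ltd.
(2) Primers
All primers were ordered from Sangon Biotech (Shanghai) Co., Ltd. Their sequences are as follows:
QG-F: TAGTCCTGTCGGGTTTCG (SEQ ID NO.14)
QG-R: TTCCATTCGCCATTCAGG (SEQ ID NO.15).
The reaction system and amplification conditions were as follows:
Amplification system:
Figure BDA0002113242960000081
Figure BDA0002113242960000091
PCR conditions:
Figure BDA0002113242960000092
(3) Gel extraction kit
The AxyPrepTM DNA Gel Extraction Kit was ordered from Axygen. Gel extraction was carried out according to its instructions and yielded relatively pure substrate DNA (SEQ ID NO.13).
Cas9 and sgRNA were mixed in equimolar amounts, and the substrate DNA was adjusted, as the experiment required, to 0.2 to 1 times the molar amount of Cas9. The reaction system was as follows:
Figure BDA0002113242960000093
The reaction was incubated at 37°C for 1 h and then heated at 70°C for 10 min, and the in vitro cleavage activity of the target proteins TSpCas9-V1/V2 was assessed by agarose gel electrophoresis. The results are shown in Figure 8. Like wild-type SpCas9 and truncated TSpCas9, both TSpCas9-V1 and TSpCas9-V2 cleave the substrate DNA (SEQ ID NO.13) in vitro (lanes 5 and 6), producing product 1 and product 2. Although the in vitro DNA cleavage activity of TSpCas9-V1/V2 is not stronger than that of wild-type SpCas9, TSpCas9-V1/V2 are smaller than wild-type SpCas9, which is a considerable advantage for delivery by adeno-associated virus (AAV).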
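Setting the substrate at 0.2 to 1 times the molar amount of Cas9 requires converting between moles and mass of the 920 bp substrate. The sketch below uses the common approximation of about 650 g/mol per base pair of double-stranded DNA, which is an assumption and not a value from the patent; the 1 pmol Cas9 input and the function names are likewise hypothetical.

```python
from typing import Tuple

BP_MASS_G_PER_MOL = 650.0  # approximate average mass of one dsDNA base pair (assumption)


def dsdna_ng(pmol: float, length_bp: int) -> float:
    """Mass in nanograms of pmol picomoles of a dsDNA fragment."""
    return pmol * length_bp * BP_MASS_G_PER_MOL / 1000.0


def substrate_range_ng(cas9_pmol: float, length_bp: int = 920,
                       low: float = 0.2, high: float = 1.0) -> Tuple[float, float]:
    """Substrate mass range for low..high molar equivalents of Cas9."""
    return dsdna_ng(cas9_pmol * low, length_bp), dsdna_ng(cas9_pmol * high, length_bp)


if __name__ == "__main__":
    lo_ng, hi_ng = substrate_range_ng(cas9_pmol=1.0)  # 1 pmol Cas9 is hypothetical
    print(f"substrate for 1 pmol Cas9: {lo_ng:.0f} to {hi_ng:.0f} ng")
```

The sgRNA amount needs no conversion here because it is added in the same molar amount as Cas9.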
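Example 4: Method for evaluating in vitro off-target activity of CRISPR-Cas9 (TSpCas9-V1/V2)
4. Off-target assay
Different sgRNAs (Nos. 1 to 8 in Figure 7) were used to test the in vitro cleavage activity of TSpCas9-V1/V2 and thereby evaluate their off-target effects. The reaction system was as follows:
Figure BDA0002113242960000101
First, the in vitro off-target effect of wild-type SpCas9 was evaluated. As shown in Figure 9, compared with SpCas9 guided by sgRNA No. 0 (lane 3), SpCas9 guided by each of sgRNAs Nos. 1 to 8 could cleave the substrate DNA (SEQ ID NO.13) in vitro (lanes 4 to 11), producing product 1 and product 2. Only some of the sgRNAs guided strongly: cleavage guided by sgRNAs Nos. 1 to 4 was relatively strong (lanes 4 to 7), and cleavage guided by sgRNAs Nos. 5 to 8 was relatively weak (lanes 8 to 11). Nevertheless, the result shows that wild-type SpCas9 has a pronounced off-target effect in vitro, particularly when guided by sgRNAs Nos. 1 to 4.
Second, the in vitro off-target effect of truncated TSpCas9 was evaluated. As shown in Figure 10, compared with TSpCas9 guided by sgRNA No. 0 (lane 3), truncated TSpCas9 guided by each of sgRNAs Nos. 1 to 8 could cleave the substrate DNA (SEQ ID NO.13) in vitro (lanes 4 to 11), producing product 1 and product 2. Cleavage guided by sgRNAs Nos. 1 to 4 was relatively strong, which shows that truncated TSpCas9 still has a pronounced off-target effect.
Third, the in vitro off-target effect of truncated TSpCas9-V1 was evaluated. As shown in Figure 11, compared with TSpCas9-V1 guided by sgRNA No. 0 (lane 3), TSpCas9-V1 guided by sgRNAs Nos. 1, 2, and 4 cleaved the substrate DNA (SEQ ID NO.13) in vitro to a clearly detectable extent (lanes 4, 5, and 7), producing product 1 and product 2. Cleavage guided by the remaining sgRNAs was very weak, and with sgRNAs Nos. 5 to 8 TSpCas9-V1 showed almost no cleavage. This shows that TSpCas9-V1 reduces the off-target effect.
Finally, the in vitro off-target effect of truncated TSpCas9-V2 was evaluated. As shown in Figure 12, compared with TSpCas9-V2 guided by sgRNA No. 0 (lane 3), cleavage of the substrate DNA (SEQ ID NO.13) by TSpCas9-V2 guided by sgRNAs Nos. 1 and 4 was markedly weaker (lanes 4 and 7), producing product 1 and product 2. With the remaining sgRNAs TSpCas9-V2 showed almost no cleavage in vitro. This shows that TSpCas9-V2 reduces the off-target effect.
When the sgRNA is fully complementary to the substrate DNA, TSpCas9-V1/V2 retain the cleavage activity of the wild-type SpCas9 nuclease. When there are two base mismatches between the sgRNA and the substrate DNA, TSpCas9-V1/V2 tolerate the mismatches less than wild-type SpCas9 and truncated TSpCas9 do, and their in vitro cleavage specificity is higher. In other words, TSpCas9-V1/V2 not only have the gene-editing function of the wild-type CRISPR-Cas9 nuclease but also target the editing site more specifically than the wild-type nuclease and achieve precise editing. They are also smaller and therefore easier to deliver by adeno-associated virus (AAV), a further advantage over wild-type SpCas9 that is of potential value for future clinical application of the CRISPR-Cas9 system.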
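The notion of a guide that differs from the target by two bases can be made concrete with a mismatch count. The sequences below are hypothetical placeholders, since the actual sgRNA sequences appear only in Figure 7; the function name is also illustrative.

```python
from typing import List


def mismatch_positions(guide: str, target: str) -> List[int]:
    """1-based positions at which a guide spacer differs from the target
    protospacer. RNA 'U' is treated as DNA 'T' for the comparison."""
    g = guide.upper().replace("U", "T")
    t = target.upper().replace("U", "T")
    if len(g) != len(t):
        raise ValueError("guide and target must have the same length")
    return [i + 1 for i, (a, b) in enumerate(zip(g, t)) if a != b]


if __name__ == "__main__":
    target = "GATTACAGATTACAGATTAC"      # hypothetical protospacer
    on_target = "GAUUACAGAUUACAGAUUAC"   # hypothetical fully matched guide (RNA)
    off_target = "GAAUACAGAUUACAGAUAAC"  # hypothetical guide with two changes
    print(mismatch_positions(on_target, target))
    print(mismatch_positions(off_target, target))
```

A guide set like the one in Figure 7 could be summarized this way to relate the number and position of mismatches to the band intensities seen on the gels.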
References
[1]. Doudna, J.A. and E. Charpentier, Genome editing. The new frontier of genome engineering with CRISPR-Cas9. Science, 2014. 346(6213): p. 1258096.
[2]. Hsu, P.D., E.S. Lander and F. Zhang, Development and applications of CRISPR-Cas9 for genome engineering. Cell, 2014. 157(6): p. 1262-78.
[3]. Suzuki, K., et al., In vivo genome editing via CRISPR/Cas9 mediated homology-independent targeted integration. Nature, 2016. 540(7631): p. 144-149.
[4]. Jinek, M., et al., RNA-programmed genome editing in human cells. Elife, 2013. 2: p. e00471.
[5]. Hille, F., et al., The Biology of CRISPR-Cas: Backward and Forward. Cell, 2018. 172(6): p. 1239-1259.
[6]. Karginov, F.V. and G.J. Hannon, The CRISPR system: small RNA-guided defense in bacteria and archaea. Mol Cell, 2010. 37(1): p. 7-19.
[7]. Niewoehner, J., et al., Increased brain penetration and potency of a therapeutic antibody using a monovalent molecular shuttle. Neuron, 2014. 81(1): p. 49-60.
[8]. Kleinstiver, B.P., et al., High-fidelity CRISPR-Cas9 nucleases with no detectable genome-wide off-target effects. Nature, 2016. 529(7587): p. 490-5.
[9]. Pattanayak, V., et al., High-throughput profiling of off-target DNA cleavage reveals RNA-programmed Cas9 nuclease specificity. Nat Biotechnol, 2013. 31(9): p. 839-43.
[10]. Jamal, M., et al., Keeping CRISPR/Cas on-Target. Curr Issues Mol Biol, 2016. 20: p. 1-12.
Sequence Listing
<110> Fudan University
<120> High-specificity truncated variants TSpCas9-V1/V2 of the endonuclease SpCas9 and uses thereof
<130> 2019.06.28
<160> 16
<170> SIPOSequenceListing 1.0
<210> 1
<211> 4104
<212> DNA
<213> Streptococcus pyogenes serotype M1
<400> 1
ggcgacaaga agtactccat tgggctcgat atcggcacaa acagcgtcgg ctgggccgtc 60
attacggacg agtacaaggt gccgagcaaa aaattcaaag ttctgggcaa taccgatcgc 120
cacagcataa agaagaacct cattggcgcc ctcctgttcg actccgggga gacggccgaa 180
gccacgcggc tcaaaagaac agcacggcgc agatataccc gcagaaagaa tcggatctgc 240
tacctgcagg agatctttag taatgagatg gctaaggtgg atgactcttt cttccatagg 300
ctggaggagt cctttttggt ggaggaggat aaaaagcacg agcgccaccc aatctttggc 360
aatatcgtgg acgaggtggc gtaccatgaa aagtacccaa ccatatatca tctgaggaag 420
aagcttgtag acagtactga taaggctgac ttgcggttga tctatctcgc gctggcgcat 480
atgatcaaat ttcggggaca cttcctcatc gagggggacc tgaacccaga caacagcgat 540
gtcgacaaac tctttatcca actggttcag acttacaatc agcttttcga agagaacccg 600
atcaacgcat ccggagttga cgccaaagca atcctgagcg ctaggctgtc caaatcccgg 660
cggctcgaaa acctcatcgc acagctccct ggggagaaga agaacggcct gtttggtaat 720
cttatcgccc tgtcactcgg gctgaccccc aactttaaat ctaacttcga cctggccgaa 780
gatgccaagc ttcaactgag caaagacacc tacgatgatg atctcgacaa tctgctggcc 840
cagatcggcg accagtacgc agaccttttt ttggcggcaa agaacctgtc agacgccatt 900
ctgctgagtg atattctgcg agtgaacacg gagatcacca aagctccgct gagcgctagt 960
atgatcaagc gctatgatga gcaccaccaa gacttgactt tgctgaaggc ccttgtcaga 1020
cagcaactgc ctgagaagta caaggaaatt ttcttcgatc agtctaaaaa tggctacgcc 1080
ggatacattg acggcggagc aagccaggag gaattttaca aatttattaa gcccatcttg 1140
gaaaaaatgg acggcaccga ggagctgctg gtaaagctta acagagaaga tctgttgcgc 1200
aaacagcgca ctttcgacaa tggaagcatc ccccaccaga ttcacctggg cgaactgcac 1260
gctatcctca ggcggcaaga ggatttctac ccctttttga aagataacag ggaaaagatt 1320
gagaaaatcc tcacatttcg gataccctac tatgtaggcc ccctcgcccg gggaaattcc 1380
agattcgcgt ggatgactcg caaatcagaa gagaccatca ctccctggaa cttcgaggaa 1440
gtcgtggata agggggcctc tgcccagtcc ttcatcgaaa ggatgactaa ctttgataaa 1500
aatctgccta acgaaaaggt gcttcctaaa cactctctgc tgtacgagta cttcacagtt 1560
tataacgagc tcaccaaggt caaatacgtc acagaaggga tgagaaagcc agcattcctg 1620
tctggagagc agaagaaagc tatcgtggac ctcctcttca agacgaaccg gaaagttacc 1680
gtgaaacagc tcaaagaaga ctatttcaaa aagattgaat gtttcgactc tgttgaaatc 1740
agcggagtgg aggatcgctt caacgcatcc ctgggaacgt atcacgatct cctgaaaatc 1800
attaaagaca aggacttcct ggacaatgag gagaacgagg acattcttga ggacattgtc 1860
ctcaccctta cgttgtttga agatagggag atgattgaag aacgcttgaa aacttacgct 1920
catctcttcg acgacaaagt catgaaacag ctcaagaggc gccgatatac aggatggggg 1980
cggctgtcaa gaaaactgat caatgggatc cgagacaagc agagtggaaa gacaatcctg 2040
gattttctta agtccgatgg atttgccaac cggaacttca tgcagttgat ccatgatgac 2100
tctctcacct ttaaggagga catccagaaa gcacaagttt ctggccaggg ggacagtctt 2160
cacgagcaca tcgctaatct tgcaggtagc ccagctatca aaaagggaat actgcagacc 2220
gttaaggtcg tggatgaact cgtcaaagta atgggaaggc ataagcccga gaatatcgtt 2280
atcgagatgg cccgagagaa ccaaactacc cagaagggac agaagaacag tagggaaagg 2340
atgaagagga ttgaagaggg tataaaagaa ctggggtccc aaatccttaa ggaacaccca 2400
gttgaaaaca cccagcttca gaatgagaag ctctacctgt actacctgca gaacggcagg 2460
gacatgtacg tggatcagga actggacatc aatcggctct ccgactacga cgtggatcat 2520
atcgtgcccc agtcttttct caaagatgat tctattgata ataaagtgtt gacaagatcc 2580
gataaaaata gagggaagag tgataacgtc ccctcagaag aagttgtcaa gaaaatgaaa 2640
aattattggc ggcagctgct gaacgccaaa ctgatcacac aacggaagtt cgataatctg 2700
actaaggctg aacgaggtgg cctgtctgag ttggataaag caggcttcat caaaaggcag 2760
cttgttgaga cacgccagat caccaagcac gtggcccaaa ttctcgattc acgcatgaac 2820
accaagtacg atgaaaatga caaactgatt cgagaggtga aagttattac tctgaagtct 2880
aagctggtct cagatttcag aaaggacttt cagttttata aggtgagaga gatcaacaat 2940
taccaccatg cgcatgatgc ctacctgaat gcagtggtag gcactgcact tatcaaaaaa 3000
tatcccaagc ttgaatctga atttgtttac ggagactata aagtgtacga tgttaggaaa 3060
atgatcgcaa agtctgagca ggaaataggc aaggccaccg ctaagtactt cttttacagc 3120
aatattatga attttttcaa gaccgagatt acactggcca atggagagat tcggaagcga 3180
ccacttatcg aaacaaacgg agaaacagga gaaatcgtgt gggacaaggg tagggatttc 3240
gcgacagtcc ggaaggtcct gtccatgccg caggtgaaca tcgttaaaaa gaccgaagta 3300
cagaccggag gcttctccaa ggaaagtatc ctcccgaaaa ggaacagcga caagctgatc 3360
gcacgcaaaa aagattggga ccccaagaaa tacggcggat tcgattctcc tacagtcgct 3420
tacagtgtac tggttgtggc caaagtggag aaagggaagt ctaaaaaact caaaagcgtc 3480
aaggaactgc tgggcatcac aatcatggag cgatcaagct tcgaaaaaaa ccccatcgac 3540
tttctcgagg cgaaaggata taaagaggtc aaaaaagacc tcatcattaa gcttcccaag 3600
tactctctct ttgagcttga aaacggccgg aaacgaatgc tcgctagtgc gggcgagctg 3660
cagaaaggta acgagctggc actgccctct aaatacgtta atttcttgta tctggccagc 3720
cactatgaaa agctcaaagg gtctcccgaa gataatgagc agaagcagct gttcgtggaa 3780
caacacaaac actaccttga tgagatcatc gagcaaataa gcgaattctc caaaagagtg 3840
atcctcgccg acgctaacct cgataaggtg ctttctgctt acaataagca cagggataag 3900
cccatcaggg agcaggcaga aaacattatc cacttgttta ctctgaccaa cttgggcgcg 3960
cctgcagcct tcaagtactt cgacaccacc atagacagaa agcggtacac ctctacaaag 4020
gaggtcctgg acgccacact gattcatcag tcaattacgg ggctctatga aacaagaatc 4080
gacctctctc agctcggtgg agac 4104
<210> 2
<211> 1368
<212> PRT
<213> Streptococcus pyogenes serotype M1
<400> 2
Gly Asp Lys Lys Tyr Ser Ile Gly Leu Asp Ile Gly Thr Asn Ser Val
1 5 10 15
Gly Trp Ala Val Ile Thr Asp Glu Tyr Lys Val Pro Ser Lys Lys Phe
20 25 30
Lys Val Leu Gly Asn Thr Asp Arg His Ser Ile Lys Lys Asn Leu Ile
35 40 45
Gly Ala Leu Leu Phe Asp Ser Gly Glu Thr Ala Glu Ala Thr Arg Leu
50 55 60
Lys Arg Thr Ala Arg Arg Arg Tyr Thr Arg Arg Lys Asn Arg Ile Cys
65 70 75 80
Tyr Leu Gln Glu Ile Phe Ser Asn Glu Met Ala Lys Val Asp Asp Ser
85 90 95
Phe Phe His Arg Leu Glu Glu Ser Phe Leu Val Glu Glu Asp Lys Lys
100 105 110
His Glu Arg His Pro Ile Phe Gly Asn Ile Val Asp Glu Val Ala Tyr
115 120 125
His Glu Lys Tyr Pro Thr Ile Tyr His Leu Arg Lys Lys Leu Val Asp
130 135 140
Ser Thr Asp Lys Ala Asp Leu Arg Leu Ile Tyr Leu Ala Leu Ala His
145 150 155 160
Met Ile Lys Phe Arg Gly His Phe Leu Ile Glu Gly Asp Leu Asn Pro
165 170 175
Asp Asn Ser Asp Val Asp Lys Leu Phe Ile Gln Leu Val Gln Thr Tyr
180 185 190
Asn Gln Leu Phe Glu Glu Asn Pro Ile Asn Ala Ser Gly Val Asp Ala
195 200 205
Lys Ala Ile Leu Ser Ala Arg Leu Ser Lys Ser Arg Arg Leu Glu Asn
210 215 220
Leu Ile Ala Gln Leu Pro Gly Glu Lys Lys Asn Gly Leu Phe Gly Asn
225 230 235 240
Leu Ile Ala Leu Ser Leu Gly Leu Thr Pro Asn Phe Lys Ser Asn Phe
245 250 255
Asp Leu Ala Glu Asp Ala Lys Leu Gln Leu Ser Lys Asp Thr Tyr Asp
260 265 270
Asp Asp Leu Asp Asn Leu Leu Ala Gln Ile Gly Asp Gln Tyr Ala Asp
275 280 285
Leu Phe Leu Ala Ala Lys Asn Leu Ser Asp Ala Ile Leu Leu Ser Asp
290 295 300
Ile Leu Arg Val Asn Thr Glu Ile Thr Lys Ala Pro Leu Ser Ala Ser
305 310 315 320
Met Ile Lys Arg Tyr Asp Glu His His Gln Asp Leu Thr Leu Leu Lys
325 330 335
Ala Leu Val Arg Gln Gln Leu Pro Glu Lys Tyr Lys Glu Ile Phe Phe
340 345 350
Asp Gln Ser Lys Asn Gly Tyr Ala Gly Tyr Ile Asp Gly Gly Ala Ser
355 360 365
Gln Glu Glu Phe Tyr Lys Phe Ile Lys Pro Ile Leu Glu Lys Met Asp
370 375 380
Gly Thr Glu Glu Leu Leu Val Lys Leu Asn Arg Glu Asp Leu Leu Arg
385 390 395 400
Lys Gln Arg Thr Phe Asp Asn Gly Ser Ile Pro His Gln Ile His Leu
405 410 415
Gly Glu Leu His Ala Ile Leu Arg Arg Gln Glu Asp Phe Tyr Pro Phe
420 425 430
Leu Lys Asp Asn Arg Glu Lys Ile Glu Lys Ile Leu Thr Phe Arg Ile
435 440 445
Pro Tyr Tyr Val Gly Pro Leu Ala Arg Gly Asn Ser Arg Phe Ala Trp
450 455 460
Met Thr Arg Lys Ser Glu Glu Thr Ile Thr Pro Trp Asn Phe Glu Glu
465 470 475 480
Val Val Asp Lys Gly Ala Ser Ala Gln Ser Phe Ile Glu Arg Met Thr
485 490 495
Asn Phe Asp Lys Asn Leu Pro Asn Glu Lys Val Leu Pro Lys His Ser
500 505 510
Leu Leu Tyr Glu Tyr Phe Thr Val Tyr Asn Glu Leu Thr Lys Val Lys
515 520 525
Tyr Val Thr Glu Gly Met Arg Lys Pro Ala Phe Leu Ser Gly Glu Gln
530 535 540
Lys Lys Ala Ile Val Asp Leu Leu Phe Lys Thr Asn Arg Lys Val Thr
545 550 555 560
Val Lys Gln Leu Lys Glu Asp Tyr Phe Lys Lys Ile Glu Cys Phe Asp
565 570 575
Ser Val Glu Ile Ser Gly Val Glu Asp Arg Phe Asn Ala Ser Leu Gly
580 585 590
Thr Tyr His Asp Leu Leu Lys Ile Ile Lys Asp Lys Asp Phe Leu Asp
595 600 605
Asn Glu Glu Asn Glu Asp Ile Leu Glu Asp Ile Val Leu Thr Leu Thr
610 615 620
Leu Phe Glu Asp Arg Glu Met Ile Glu Glu Arg Leu Lys Thr Tyr Ala
625 630 635 640
His Leu Phe Asp Asp Lys Val Met Lys Gln Leu Lys Arg Arg Arg Tyr
645 650 655
Thr Gly Trp Gly Arg Leu Ser Arg Lys Leu Ile Asn Gly Ile Arg Asp
660 665 670
Lys Gln Ser Gly Lys Thr Ile Leu Asp Phe Leu Lys Ser Asp Gly Phe
675 680 685
Ala Asn Arg Asn Phe Met Gln Leu Ile His Asp Asp Ser Leu Thr Phe
690 695 700
Lys Glu Asp Ile Gln Lys Ala Gln Val Ser Gly Gln Gly Asp Ser Leu
705 710 715 720
His Glu His Ile Ala Asn Leu Ala Gly Ser Pro Ala Ile Lys Lys Gly
725 730 735
Ile Leu Gln Thr Val Lys Val Val Asp Glu Leu Val Lys Val Met Gly
740 745 750
Arg His Lys Pro Glu Asn Ile Val Ile Glu Met Ala Arg Glu Asn Gln
755 760 765
Thr Thr Gln Lys Gly Gln Lys Asn Ser Arg Glu Arg Met Lys Arg Ile
770 775 780
Glu Glu Gly Ile Lys Glu Leu Gly Ser Gln Ile Leu Lys Glu His Pro
785 790 795 800
Val Glu Asn Thr Gln Leu Gln Asn Glu Lys Leu Tyr Leu Tyr Tyr Leu
805 810 815
Gln Asn Gly Arg Asp Met Tyr Val Asp Gln Glu Leu Asp Ile Asn Arg
820 825 830
Leu Ser Asp Tyr Asp Val Asp His Ile Val Pro Gln Ser Phe Leu Lys
835 840 845
Asp Asp Ser Ile Asp Asn Lys Val Leu Thr Arg Ser Asp Lys Asn Arg
850 855 860
Gly Lys Ser Asp Asn Val Pro Ser Glu Glu Val Val Lys Lys Met Lys
865 870 875 880
Asn Tyr Trp Arg Gln Leu Leu Asn Ala Lys Leu Ile Thr Gln Arg Lys
885 890 895
Phe Asp Asn Leu Thr Lys Ala Glu Arg Gly Gly Leu Ser Glu Leu Asp
900 905 910
Lys Ala Gly Phe Ile Lys Arg Gln Leu Val Glu Thr Arg Gln Ile Thr
915 920 925
Lys His Val Ala Gln Ile Leu Asp Ser Arg Met Asn Thr Lys Tyr Asp
930 935 940
Glu Asn Asp Lys Leu Ile Arg Glu Val Lys Val Ile Thr Leu Lys Ser
945 950 955 960
Lys Leu Val Ser Asp Phe Arg Lys Asp Phe Gln Phe Tyr Lys Val Arg
965 970 975
Glu Ile Asn Asn Tyr His His Ala His Asp Ala Tyr Leu Asn Ala Val
980 985 990
Val Gly Thr Ala Leu Ile Lys Lys Tyr Pro Lys Leu Glu Ser Glu Phe
995 1000 1005
Val Tyr Gly Asp Tyr Lys Val Tyr Asp Val Arg Lys Met Ile Ala Lys
1010 1015 1020
Ser Glu Gln Glu Ile Gly Lys Ala Thr Ala Lys Tyr Phe Phe Tyr Ser
1025 1030 1035 1040
Asn Ile Met Asn Phe Phe Lys Thr Glu Ile Thr Leu Ala Asn Gly Glu
1045 1050 1055
Ile Arg Lys Arg Pro Leu Ile Glu Thr Asn Gly Glu Thr Gly Glu Ile
1060 1065 1070
Val Trp Asp Lys Gly Arg Asp Phe Ala Thr Val Arg Lys Val Leu Ser
1075 1080 1085
Met Pro Gln Val Asn Ile Val Lys Lys Thr Glu Val Gln Thr Gly Gly
1090 1095 1100
Phe Ser Lys Glu Ser Ile Leu Pro Lys Arg Asn Ser Asp Lys Leu Ile
1105 1110 1115 1120
Ala Arg Lys Lys Asp Trp Asp Pro Lys Lys Tyr Gly Gly Phe Asp Ser
1125 1130 1135
Pro Thr Val Ala Tyr Ser Val Leu Val Val Ala Lys Val Glu Lys Gly
1140 1145 1150
Lys Ser Lys Lys Leu Lys Ser Val Lys Glu Leu Leu Gly Ile Thr Ile
1155 1160 1165
Met Glu Arg Ser Ser Phe Glu Lys Asn Pro Ile Asp Phe Leu Glu Ala
1170 1175 1180
Lys Gly Tyr Lys Glu Val Lys Lys Asp Leu Ile Ile Lys Leu Pro Lys
1185 1190 1195 1200
Tyr Ser Leu Phe Glu Leu Glu Asn Gly Arg Lys Arg Met Leu Ala Ser
1205 1210 1215
Ala Gly Glu Leu Gln Lys Gly Asn Glu Leu Ala Leu Pro Ser Lys Tyr
1220 1225 1230
Val Asn Phe Leu Tyr Leu Ala Ser His Tyr Glu Lys Leu Lys Gly Ser
1235 1240 1245
Pro Glu Asp Asn Glu Gln Lys Gln Leu Phe Val Glu Gln His Lys His
1250 1255 1260
Tyr Leu Asp Glu Ile Ile Glu Gln Ile Ser Glu Phe Ser Lys Arg Val
1265 1270 1275 1280
Ile Leu Ala Asp Ala Asn Leu Asp Lys Val Leu Ser Ala Tyr Asn Lys
1285 1290 1295
His Arg Asp Lys Pro Ile Arg Glu Gln Ala Glu Asn Ile Ile His Leu
1300 1305 1310
Phe Thr Leu Thr Asn Leu Gly Ala Pro Ala Ala Phe Lys Tyr Phe Asp
1315 1320 1325
Thr Thr Ile Asp Arg Lys Arg Tyr Thr Ser Thr Lys Glu Val Leu Asp
1330 1335 1340
Ala Thr Leu Ile His Gln Ser Ile Thr Gly Leu Tyr Glu Thr Arg Ile
1345 1350 1355 1360
Asp Leu Ser Gln Leu Gly Gly Asp
1365
<210> 3
<211> 3744
<212> DNA
<213> Streptococcus pyogenes serotype M1
<400> 3
ggcgacaaga agtactccat tgggctcgat atcggcacaa acagcgtcgg ctgggccgtc 60
attacggacg agtacaaggt gccgagcaaa aaattcaaag ttctgggcaa taccgatcgc 120
cacagcataa agaagaacct cattggcgcc ctcctgttcg actccgggga gacggccgaa 180
gccacgcggc tcaaaagaac agcacggcgc agatataccc gcagaaagaa tcggatctgc 240
tacctgcagg agatctttag taatgagatg gctaaggtgg atgactcttt cttccatagg 300
ctggaggagt cctttttggt ggaggaggat aaaaagcacg agcgccaccc aatctttggc 360
aatatcgtgg acgaggtggc gtaccatgaa aagtacccaa ccatatatca tctgaggaag 420
aagcttgtag acagtactga taaggctgac ttgcggttga tctatctcgc gctggcgcat 480
atgatcaaat ttcggggaca cttcctcatc gagggggacc tgaacccaga caacagcatt 540
ctgctgagtg atattctgcg agtgaacacg gagatcacca aagctccgct gagcgctagt 600
atgatcaagc gctatgatga gcaccaccaa gacttgactt tgctgaaggc ccttgtcaga 660
cagcaactgc ctgagaagta caaggaaatt ttcttcgatc agtctaaaaa tggctacgcc 720
ggatacattg acggcggagc aagccaggag gaattttaca aatttattaa gcccatcttg 780
gaaaaaatgg acggcaccga ggagctgctg gtaaagctta acagagaaga tctgttgcgc 840
aaacagcgca ctttcgacaa tggaagcatc ccccaccaga ttcacctggg cgaactgcac 900
gctatcctca ggcggcaaga ggatttctac ccctttttga aagataacag ggaaaagatt 960
gagaaaatcc tcacatttcg gataccctac tatgtaggcc ccctcgcccg gggaaattcc 1020
agattcgcgt ggatgactcg caaatcagaa gagaccatca ctccctggaa cttcgaggaa 1080
gtcgtggata agggggcctc tgcccagtcc ttcatcgaaa ggatgactaa ctttgataaa 1140
aatctgccta acgaaaaggt gcttcctaaa cactctctgc tgtacgagta cttcacagtt 1200
tataacgagc tcaccaaggt caaatacgtc acagaaggga tgagaaagcc agcattcctg 1260
tctggagagc agaagaaagc tatcgtggac ctcctcttca agacgaaccg gaaagttacc 1320
gtgaaacagc tcaaagaaga ctatttcaaa aagattgaat gtttcgactc tgttgaaatc 1380
agcggagtgg aggatcgctt caacgcatcc ctgggaacgt atcacgatct cctgaaaatc 1440
attaaagaca aggacttcct ggacaatgag gagaacgagg acattcttga ggacattgtc 1500
ctcaccctta cgttgtttga agatagggag atgattgaag aacgcttgaa aacttacgct 1560
catctcttcg acgacaaagt catgaaacag ctcaagaggc gccgatatac aggatggggg 1620
cggctgtcaa gaaaactgat caatgggatc cgagacaagc agagtggaaa gacaatcctg 1680
gattttctta agtccgatgg atttgccaac cggaacttca tgcagttgat ccatgatgac 1740
tctctcacct ttaaggagga catccagaaa gcacaagttt ctggccaggg ggacagtctt 1800
cacgagcaca tcgctaatct tgcaggtagc ccagctatca aaaagggaat actgcagacc 1860
gttaaggtcg tggatgaact cgtcaaagta atgggaaggc ataagcccga gaatatcgtt 1920
atcgagatgg cccgagagaa ccaaactacc cagaagggac agaagaacag tagggaaagg 1980
atgaagagga ttgaagaggg tataaaagaa ctggggtccc aaatccttaa ggaacaccca 2040
gttgaaaaca cccagcttca gaatgagaag ctctacctgt actacctgca gaacggcagg 2100
gacatgtacg tggatcagga actggacatc aatcggctct ccgactacga cgtggatcat 2160
atcgtgcccc agtcttttct caaagatgat tctattgata ataaagtgtt gacaagatcc 2220
gataaaaata gagggaagag tgataacgtc ccctcagaag aagttgtcaa gaaaatgaaa 2280
aattattggc ggcagctgct gaacgccaaa ctgatcacac aacggaagtt cgataatctg 2340
actaaggctg aacgaggtgg cctgtctgag ttggataaag caggcttcat caaaaggcag 2400
cttgttgaga cacgccagat caccaagcac gtggcccaaa ttctcgattc acgcatgaac 2460
accaagtacg atgaaaatga caaactgatt cgagaggtga aagttattac tctgaagtct 2520
aagctggtct cagatttcag aaaggacttt cagttttata aggtgagaga gatcaacaat 2580
taccaccatg cgcatgatgc ctacctgaat gcagtggtag gcactgcact tatcaaaaaa 2640
tatcccaagc ttgaatctga atttgtttac ggagactata aagtgtacga tgttaggaaa 2700
atgatcgcaa agtctgagca ggaaataggc aaggccaccg ctaagtactt cttttacagc 2760
aatattatga attttttcaa gaccgagatt acactggcca atggagagat tcggaagcga 2820
ccacttatcg aaacaaacgg agaaacagga gaaatcgtgt gggacaaggg tagggatttc 2880
gcgacagtcc ggaaggtcct gtccatgccg caggtgaaca tcgttaaaaa gaccgaagta 2940
cagaccggag gcttctccaa ggaaagtatc ctcccgaaaa ggaacagcga caagctgatc 3000
gcacgcaaaa aagattggga ccccaagaaa tacggcggat tcgattctcc tacagtcgct 3060
tacagtgtac tggttgtggc caaagtggag aaagggaagt ctaaaaaact caaaagcgtc 3120
aaggaactgc tgggcatcac aatcatggag cgatcaagct tcgaaaaaaa ccccatcgac 3180
tttctcgagg cgaaaggata taaagaggtc aaaaaagacc tcatcattaa gcttcccaag 3240
tactctctct ttgagcttga aaacggccgg aaacgaatgc tcgctagtgc gggcgagctg 3300
cagaaaggta acgagctggc actgccctct aaatacgtta atttcttgta tctggccagc 3360
cactatgaaa agctcaaagg gtctcccgaa gataatgagc agaagcagct gttcgtggaa 3420
caacacaaac actaccttga tgagatcatc gagcaaataa gcgaattctc caaaagagtg 3480
atcctcgccg acgctaacct cgataaggtg ctttctgctt acaataagca cagggataag 3540
cccatcaggg agcaggcaga aaacattatc cacttgttta ctctgaccaa cttgggcgcg 3600
cctgcagcct tcaagtactt cgacaccacc atagacagaa agcggtacac ctctacaaag 3660
gaggtcctgg acgccacact gattcatcag tcaattacgg ggctctatga aacaagaatc 3720
gacctctctc agctcggtgg agac 3744
<210> 4
<211> 1248
<212> PRT
<213> Streptococcus pyogenes serotype M1
<400> 4
Gly Asp Lys Lys Tyr Ser Ile Gly Leu Asp Ile Gly Thr Asn Ser Val
1 5 10 15
Gly Trp Ala Val Ile Thr Asp Glu Tyr Lys Val Pro Ser Lys Lys Phe
20 25 30
Lys Val Leu Gly Asn Thr Asp Arg His Ser Ile Lys Lys Asn Leu Ile
35 40 45
Gly Ala Leu Leu Phe Asp Ser Gly Glu Thr Ala Glu Ala Thr Arg Leu
50 55 60
Lys Arg Thr Ala Arg Arg Arg Tyr Thr Arg Arg Lys Asn Arg Ile Cys
65 70 75 80
Tyr Leu Gln Glu Ile Phe Ser Asn Glu Met Ala Lys Val Asp Asp Ser
85 90 95
Phe Phe His Arg Leu Glu Glu Ser Phe Leu Val Glu Glu Asp Lys Lys
100 105 110
His Glu Arg His Pro Ile Phe Gly Asn Ile Val Asp Glu Val Ala Tyr
115 120 125
His Glu Lys Tyr Pro Thr Ile Tyr His Leu Arg Lys Lys Leu Val Asp
130 135 140
Ser Thr Asp Lys Ala Asp Leu Arg Leu Ile Tyr Leu Ala Leu Ala His
145 150 155 160
Met Ile Lys Phe Arg Gly His Phe Leu Ile Glu Gly Asp Leu Asn Pro
165 170 175
Asp Asn Ser Ile Leu Leu Ser Asp Ile Leu Arg Val Asn Thr Glu Ile
180 185 190
Thr Lys Ala Pro Leu Ser Ala Ser Met Ile Lys Arg Tyr Asp Glu His
195 200 205
His Gln Asp Leu Thr Leu Leu Lys Ala Leu Val Arg Gln Gln Leu Pro
210 215 220
Glu Lys Tyr Lys Glu Ile Phe Phe Asp Gln Ser Lys Asn Gly Tyr Ala
225 230 235 240
Gly Tyr Ile Asp Gly Gly Ala Ser Gln Glu Glu Phe Tyr Lys Phe Ile
245 250 255
Lys Pro Ile Leu Glu Lys Met Asp Gly Thr Glu Glu Leu Leu Val Lys
260 265 270
Leu Asn Arg Glu Asp Leu Leu Arg Lys Gln Arg Thr Phe Asp Asn Gly
275 280 285
Ser Ile Pro His Gln Ile His Leu Gly Glu Leu His Ala Ile Leu Arg
290 295 300
Arg Gln Glu Asp Phe Tyr Pro Phe Leu Lys Asp Asn Arg Glu Lys Ile
305 310 315 320
Glu Lys Ile Leu Thr Phe Arg Ile Pro Tyr Tyr Val Gly Pro Leu Ala
325 330 335
Arg Gly Asn Ser Arg Phe Ala Trp Met Thr Arg Lys Ser Glu Glu Thr
340 345 350
Ile Thr Pro Trp Asn Phe Glu Glu Val Val Asp Lys Gly Ala Ser Ala
355 360 365
Gln Ser Phe Ile Glu Arg Met Thr Asn Phe Asp Lys Asn Leu Pro Asn
370 375 380
Glu Lys Val Leu Pro Lys His Ser Leu Leu Tyr Glu Tyr Phe Thr Val
385 390 395 400
Tyr Asn Glu Leu Thr Lys Val Lys Tyr Val Thr Glu Gly Met Arg Lys
405 410 415
Pro Ala Phe Leu Ser Gly Glu Gln Lys Lys Ala Ile Val Asp Leu Leu
420 425 430
Phe Lys Thr Asn Arg Lys Val Thr Val Lys Gln Leu Lys Glu Asp Tyr
435 440 445
Phe Lys Lys Ile Glu Cys Phe Asp Ser Val Glu Ile Ser Gly Val Glu
450 455 460
Asp Arg Phe Asn Ala Ser Leu Gly Thr Tyr His Asp Leu Leu Lys Ile
465 470 475 480
Ile Lys Asp Lys Asp Phe Leu Asp Asn Glu Glu Asn Glu Asp Ile Leu
485 490 495
Glu Asp Ile Val Leu Thr Leu Thr Leu Phe Glu Asp Arg Glu Met Ile
500 505 510
Glu Glu Arg Leu Lys Thr Tyr Ala His Leu Phe Asp Asp Lys Val Met
515 520 525
Lys Gln Leu Lys Arg Arg Arg Tyr Thr Gly Trp Gly Arg Leu Ser Arg
530 535 540
Lys Leu Ile Asn Gly Ile Arg Asp Lys Gln Ser Gly Lys Thr Ile Leu
545 550 555 560
Asp Phe Leu Lys Ser Asp Gly Phe Ala Asn Arg Asn Phe Met Gln Leu
565 570 575
Ile His Asp Asp Ser Leu Thr Phe Lys Glu Asp Ile Gln Lys Ala Gln
580 585 590
Val Ser Gly Gln Gly Asp Ser Leu His Glu His Ile Ala Asn Leu Ala
595 600 605
Gly Ser Pro Ala Ile Lys Lys Gly Ile Leu Gln Thr Val Lys Val Val
610 615 620
Asp Glu Leu Val Lys Val Met Gly Arg His Lys Pro Glu Asn Ile Val
625 630 635 640
Ile Glu Met Ala Arg Glu Asn Gln Thr Thr Gln Lys Gly Gln Lys Asn
645 650 655
Ser Arg Glu Arg Met Lys Arg Ile Glu Glu Gly Ile Lys Glu Leu Gly
660 665 670
Ser Gln Ile Leu Lys Glu His Pro Val Glu Asn Thr Gln Leu Gln Asn
675 680 685
Glu Lys Leu Tyr Leu Tyr Tyr Leu Gln Asn Gly Arg Asp Met Tyr Val
690 695 700
Asp Gln Glu Leu Asp Ile Asn Arg Leu Ser Asp Tyr Asp Val Asp His
705 710 715 720
Ile Val Pro Gln Ser Phe Leu Lys Asp Asp Ser Ile Asp Asn Lys Val
725 730 735
Leu Thr Arg Ser Asp Lys Asn Arg Gly Lys Ser Asp Asn Val Pro Ser
740 745 750
Glu Glu Val Val Lys Lys Met Lys Asn Tyr Trp Arg Gln Leu Leu Asn
755 760 765
Ala Lys Leu Ile Thr Gln Arg Lys Phe Asp Asn Leu Thr Lys Ala Glu
770 775 780
Arg Gly Gly Leu Ser Glu Leu Asp Lys Ala Gly Phe Ile Lys Arg Gln
785 790 795 800
Leu Val Glu Thr Arg Gln Ile Thr Lys His Val Ala Gln Ile Leu Asp
805 810 815
Ser Arg Met Asn Thr Lys Tyr Asp Glu Asn Asp Lys Leu Ile Arg Glu
820 825 830
Val Lys Val Ile Thr Leu Lys Ser Lys Leu Val Ser Asp Phe Arg Lys
835 840 845
Asp Phe Gln Phe Tyr Lys Val Arg Glu Ile Asn Asn Tyr His His Ala
850 855 860
His Asp Ala Tyr Leu Asn Ala Val Val Gly Thr Ala Leu Ile Lys Lys
865 870 875 880
Tyr Pro Lys Leu Glu Ser Glu Phe Val Tyr Gly Asp Tyr Lys Val Tyr
885 890 895
Asp Val Arg Lys Met Ile Ala Lys Ser Glu Gln Glu Ile Gly Lys Ala
900 905 910
Thr Ala Lys Tyr Phe Phe Tyr Ser Asn Ile Met Asn Phe Phe Lys Thr
915 920 925
Glu Ile Thr Leu Ala Asn Gly Glu Ile Arg Lys Arg Pro Leu Ile Glu
930 935 940
Thr Asn Gly Glu Thr Gly Glu Ile Val Trp Asp Lys Gly Arg Asp Phe
945 950 955 960
Ala Thr Val Arg Lys Val Leu Ser Met Pro Gln Val Asn Ile Val Lys
965 970 975
Lys Thr Glu Val Gln Thr Gly Gly Phe Ser Lys Glu Ser Ile Leu Pro
980 985 990
Lys Arg Asn Ser Asp Lys Leu Ile Ala Arg Lys Lys Asp Trp Asp Pro
995 1000 1005
Lys Lys Tyr Gly Gly Phe Asp Ser Pro Thr Val Ala Tyr Ser Val Leu
1010 1015 1020
Val Val Ala Lys Val Glu Lys Gly Lys Ser Lys Lys Leu Lys Ser Val
1025 1030 1035 1040
Lys Glu Leu Leu Gly Ile Thr Ile Met Glu Arg Ser Ser Phe Glu Lys
1045 1050 1055
Asn Pro Ile Asp Phe Leu Glu Ala Lys Gly Tyr Lys Glu Val Lys Lys
1060 1065 1070
Asp Leu Ile Ile Lys Leu Pro Lys Tyr Ser Leu Phe Glu Leu Glu Asn
1075 1080 1085
Gly Arg Lys Arg Met Leu Ala Ser Ala Gly Glu Leu Gln Lys Gly Asn
1090 1095 1100
Glu Leu Ala Leu Pro Ser Lys Tyr Val Asn Phe Leu Tyr Leu Ala Ser
1105 1110 1115 1120
His Tyr Glu Lys Leu Lys Gly Ser Pro Glu Asp Asn Glu Gln Lys Gln
1125 1130 1135
Leu Phe Val Glu Gln His Lys His Tyr Leu Asp Glu Ile Ile Glu Gln
1140 1145 1150
Ile Ser Glu Phe Ser Lys Arg Val Ile Leu Ala Asp Ala Asn Leu Asp
1155 1160 1165
Lys Val Leu Ser Ala Tyr Asn Lys His Arg Asp Lys Pro Ile Arg Glu
1170 1175 1180
Gln Ala Glu Asn Ile Ile His Leu Phe Thr Leu Thr Asn Leu Gly Ala
1185 1190 1195 1200
Pro Ala Ala Phe Lys Tyr Phe Asp Thr Thr Ile Asp Arg Lys Arg Tyr
1205 1210 1215
Thr Ser Thr Lys Glu Val Leu Asp Ala Thr Leu Ile His Gln Ser Ile
1220 1225 1230
Thr Gly Leu Tyr Glu Thr Arg Ile Asp Leu Ser Gln Leu Gly Gly Asp
1235 1240 1245
<210> 5
<211> 3744
<212> DNA
<213> Streptococcus pyogenes serotype M1
<400> 5
ggcgacaaga agtactccat tgggctcgat atcggcacaa acagcgtcgg ctgggccgtc 60
attacggacg agtacaaggt gccgagcaaa aaattcaaag ttctgggcaa taccgatcgc 120
cacagcataa agaagaacct cattggcgcc ctcctgttcg actccgggga gacggccgaa 180
gccacgcggc tcaaaagaac agcacggcgc agatataccc gcagaaagaa tcggatctgc 240
tacctgcagg agatctttag taatgagatg gctaaggtgg atgactcttt cttccatagg 300
ctggaggagt cctttttggt ggaggaggat aaaaagcacg agcgccaccc aatctttggc 360
aatatcgtgg acgaggtggc gtaccatgaa aagtacccaa ccatatatca tctgaggaag 420
aagcttgtag acagtactga taaggctgac ttgcggttga tctatctcgc gctggcgcat 480
atgatcaaat ttcggggaca cttcctcatc gagggggacc tgaacccaga caacagcatt 540
ctgctgagtg atattctgcg agtgaacacg gagatcacca aagctccgct gagcgctagt 600
atgatcaagc gctatgatga gcaccaccaa gacttgactt tgctgaaggc ccttgtcaga 660
cagcaactgc ctgagaagta caaggaaatt ttcttcgatc agtctaaaaa tggctacgcc 720
ggatacattg acggcggagc aagccaggag gaattttaca aatttattaa gcccatcttg 780
gaaaaaatgg acggcaccga ggagctgctg gtaaagctta acagagaaga tctgttgcgc 840
aaacagcgca ctttcgacaa tggaagcatc ccccaccaga ttcacctggg cgaactgcac 900
gctatcctca ggcggcaaga ggatttctac ccctttttga aagataacag ggaaaagatt 960
gagaaaatcc tcacatttcg gataccctac tatgtaggcc ccctcgcccg gggaaattcc 1020
agattcgcgt ggatgactcg caaatcagaa gagaccatca ctccctggaa cttcgaggaa 1080
gtcgtggata agggggcctc tgcccagtcc ttcatcgaaa ggatgactaa ctttgataaa 1140
aatctgccta acgaaaaggt gcttcctaaa cactctctgc tgtacgagta cttcacagtt 1200
tataacgagc tcaccaaggt caaatacgtc acagaaggga tgagaaagcc agcattcctg 1260
tctggagagc agaagaaagc tatcgtggac ctcctcttca agacgaaccg gaaagttacc 1320
gtgaaacagc tcaaagaaga ctatttcaaa aagattgaat gtttcgactc tgttgaaatc 1380
agcggagtgg aggatcgctt caacgcatcc ctgggaacgt atcacgatct cctgaaaatc 1440
attaaagaca aggacttcct ggacaatgag gagaacgagg acattcttga ggacattgtc 1500
ctcaccctta cgttgtttga agatagggag atgattgaag aacgcttgaa aacttacgct 1560
catctcttcg acgacaaagt catgaaacag ctcaagaggc gccgatatac aggatggggg 1620
cggctgtcaa gaaaactgat caatgggatc cgagacaagc agagtggaaa gacaatcctg 1680
gattttctta agtccgatgg atttgccaac cggaacttca tgcagttgat ccatgatgac 1740
tctctcacct ttaaggagga catccagaaa gcacaagttt ctggccaggg ggacagtctt 1800
cacgagcaca tcgctaatct tgcaggtagc ccagctatca aaaagggaat actgcagacc 1860
gttaaggtcg tggatgaact cgtcaaagta atgggaaggc ataagcccga gaatatcgtt 1920
atcgagatgg cccgagagaa ccaaactacc cagaagggac agaagaacag tagggaaagg 1980
atgaagagga ttgaagaggg tataaaagaa ctggggtccc aaatccttaa ggaacaccca 2040
gttgaaaaca cccagcttca gaatgagaag ctctacctgt actacctgca gaacggcagg 2100
gacatgtacg tggatcagga actggacatc aatcggctct ccgactacga cgtggatcat 2160
atcgtgcccc agtcttttct caaagatgat tctattgata ataaagtgtt gacaagatcc 2220
gataaaaata gagggaagag tgataacgtc ccctcagaag aagttgtcaa gaaaatgaaa 2280
aattattggc ggcagctgct gaacgccaaa ctgatcacac aacggaagtt cgataatctg 2340
actaaggctg aacgaggtgg cctgtctgag ttggataaag caggcttcat caaaaggcag 2400
cttgttgaga cacgccagat caccaagcac gtggcccaaa ttctcgattc acgcatgaac 2460
accaagtacg atgaaaatga caaactgatt cgagaggtga aagttattac tctgaagtct 2520
aagctggtct cagatttcag aaaggacttt cagttttata aggtgagaga gatcaacaat 2580
taccacaatg cgcatgatgc ctacctgaat gcagtggtag gcactgcact tatcaaaaaa 2640
tatcccaagc ttgaatctga atttgtttac ggagactata aagtgtacga tgttaggaaa 2700
atgatcgcaa agtctgagca ggaaataggc aaggccaccg ctaagtactt cttttacagc 2760
aatattatga attttttcaa gaccgagatt acactggcca atggagagat tcggaagcga 2820
ccacttatcg aaacaaacgg agaaacagga gaaatcgtgt gggacaaggg tagggatttc 2880
gcgacagtcc ggaaggtcct gtccatgccg caggtgaaca tcgttaaaaa gaccgaagta 2940
cagaccggag gcttctccaa ggaaagtatc ctcccgaaaa ggaacagcga caagctgatc 3000
gcacgcaaaa aagattggga ccccaagaaa tacggcggat tcgattctcc tacagtcgct 3060
tacagtgtac tggttgtggc caaagtggag aaagggaagt ctaaaaaact caaaagcgtc 3120
aaggaactgc tgggcatcac aatcatggag cgatcaagct tcgaaaaaaa ccccatcgac 3180
tttctcgagg cgaaaggata taaagaggtc aaaaaagacc tcatcattaa gcttcccaag 3240
tactctctct ttgagcttga aaacggccgg aaacgaatgc tcgctagtgc gggcgagctg 3300
cagaaaggta acgagctggc actgccctct aaatacgtta atttcttgta tctggccagc 3360
cactatgaaa agctcaaagg gtctcccgaa gataatgagc agaagcagct gttcgtggaa 3420
caacacaaac actaccttga tgagatcatc gagcaaataa gcgaattctc caaaagagtg 3480
atcctcgccg acgctaacct cgataaggtg ctttctgctt acaataagca cagggataag 3540
cccatcaggg agcaggcaga aaacattatc cacttgttta ctctgaccaa cttgggcgcg 3600
cctgcagcct tcaagtactt cgacaccacc atagacagaa agcggtacac ctctacaaag 3660
gaggtcctgg acgccacact gattcatcag tcaattacgg ggctctatga aacaagaatc 3720
gacctctctc agctcggtgg agac 3744
<210> 6
<211> 1248
<212> PRT
<213> Streptococcus pyogenes serotype M1
<400> 6
Gly Asp Lys Lys Tyr Ser Ile Gly Leu Asp Ile Gly Thr Asn Ser Val
1 5 10 15
Gly Trp Ala Val Ile Thr Asp Glu Tyr Lys Val Pro Ser Lys Lys Phe
20 25 30
Lys Val Leu Gly Asn Thr Asp Arg His Ser Ile Lys Lys Asn Leu Ile
35 40 45
Gly Ala Leu Leu Phe Asp Ser Gly Glu Thr Ala Glu Ala Thr Arg Leu
50 55 60
Lys Arg Thr Ala Arg Arg Arg Tyr Thr Arg Arg Lys Asn Arg Ile Cys
65 70 75 80
Tyr Leu Gln Glu Ile Phe Ser Asn Glu Met Ala Lys Val Asp Asp Ser
85 90 95
Phe Phe His Arg Leu Glu Glu Ser Phe Leu Val Glu Glu Asp Lys Lys
100 105 110
His Glu Arg His Pro Ile Phe Gly Asn Ile Val Asp Glu Val Ala Tyr
115 120 125
His Glu Lys Tyr Pro Thr Ile Tyr His Leu Arg Lys Lys Leu Val Asp
130 135 140
Ser Thr Asp Lys Ala Asp Leu Arg Leu Ile Tyr Leu Ala Leu Ala His
145 150 155 160
Met Ile Lys Phe Arg Gly His Phe Leu Ile Glu Gly Asp Leu Asn Pro
165 170 175
Asp Asn Ser Ile Leu Leu Ser Asp Ile Leu Arg Val Asn Thr Glu Ile
180 185 190
Thr Lys Ala Pro Leu Ser Ala Ser Met Ile Lys Arg Tyr Asp Glu His
195 200 205
His Gln Asp Leu Thr Leu Leu Lys Ala Leu Val Arg Gln Gln Leu Pro
210 215 220
Glu Lys Tyr Lys Glu Ile Phe Phe Asp Gln Ser Lys Asn Gly Tyr Ala
225 230 235 240
Gly Tyr Ile Asp Gly Gly Ala Ser Gln Glu Glu Phe Tyr Lys Phe Ile
245 250 255
Lys Pro Ile Leu Glu Lys Met Asp Gly Thr Glu Glu Leu Leu Val Lys
260 265 270
Leu Asn Arg Glu Asp Leu Leu Arg Lys Gln Arg Thr Phe Asp Asn Gly
275 280 285
Ser Ile Pro His Gln Ile His Leu Gly Glu Leu His Ala Ile Leu Arg
290 295 300
Arg Gln Glu Asp Phe Tyr Pro Phe Leu Lys Asp Asn Arg Glu Lys Ile
305 310 315 320
Glu Lys Ile Leu Thr Phe Arg Ile Pro Tyr Tyr Val Gly Pro Leu Ala
325 330 335
Arg Gly Asn Ser Arg Phe Ala Trp Met Thr Arg Lys Ser Glu Glu Thr
340 345 350
Ile Thr Pro Trp Asn Phe Glu Glu Val Val Asp Lys Gly Ala Ser Ala
355 360 365
Gln Ser Phe Ile Glu Arg Met Thr Asn Phe Asp Lys Asn Leu Pro Asn
370 375 380
Glu Lys Val Leu Pro Lys His Ser Leu Leu Tyr Glu Tyr Phe Thr Val
385 390 395 400
Tyr Asn Glu Leu Thr Lys Val Lys Tyr Val Thr Glu Gly Met Arg Lys
405 410 415
Pro Ala Phe Leu Ser Gly Glu Gln Lys Lys Ala Ile Val Asp Leu Leu
420 425 430
Phe Lys Thr Asn Arg Lys Val Thr Val Lys Gln Leu Lys Glu Asp Tyr
435 440 445
Phe Lys Lys Ile Glu Cys Phe Asp Ser Val Glu Ile Ser Gly Val Glu
450 455 460
Asp Arg Phe Asn Ala Ser Leu Gly Thr Tyr His Asp Leu Leu Lys Ile
465 470 475 480
Ile Lys Asp Lys Asp Phe Leu Asp Asn Glu Glu Asn Glu Asp Ile Leu
485 490 495
Glu Asp Ile Val Leu Thr Leu Thr Leu Phe Glu Asp Arg Glu Met Ile
500 505 510
Glu Glu Arg Leu Lys Thr Tyr Ala His Leu Phe Asp Asp Lys Val Met
515 520 525
Lys Gln Leu Lys Arg Arg Arg Tyr Thr Gly Trp Gly Arg Leu Ser Arg
530 535 540
Lys Leu Ile Asn Gly Ile Arg Asp Lys Gln Ser Gly Lys Thr Ile Leu
545 550 555 560
Asp Phe Leu Lys Ser Asp Gly Phe Ala Asn Arg Asn Phe Met Gln Leu
565 570 575
Ile His Asp Asp Ser Leu Thr Phe Lys Glu Asp Ile Gln Lys Ala Gln
580 585 590
Val Ser Gly Gln Gly Asp Ser Leu His Glu His Ile Ala Asn Leu Ala
595 600 605
Gly Ser Pro Ala Ile Lys Lys Gly Ile Leu Gln Thr Val Lys Val Val
610 615 620
Asp Glu Leu Val Lys Val Met Gly Arg His Lys Pro Glu Asn Ile Val
625 630 635 640
Ile Glu Met Ala Arg Glu Asn Gln Thr Thr Gln Lys Gly Gln Lys Asn
645 650 655
Ser Arg Glu Arg Met Lys Arg Ile Glu Glu Gly Ile Lys Glu Leu Gly
660 665 670
Ser Gln Ile Leu Lys Glu His Pro Val Glu Asn Thr Gln Leu Gln Asn
675 680 685
Glu Lys Leu Tyr Leu Tyr Tyr Leu Gln Asn Gly Arg Asp Met Tyr Val
690 695 700
Asp Gln Glu Leu Asp Ile Asn Arg Leu Ser Asp Tyr Asp Val Asp His
705 710 715 720
Ile Val Pro Gln Ser Phe Leu Lys Asp Asp Ser Ile Asp Asn Lys Val
725 730 735
Leu Thr Arg Ser Asp Lys Asn Arg Gly Lys Ser Asp Asn Val Pro Ser
740 745 750
Glu Glu Val Val Lys Lys Met Lys Asn Tyr Trp Arg Gln Leu Leu Asn
755 760 765
Ala Lys Leu Ile Thr Gln Arg Lys Phe Asp Asn Leu Thr Lys Ala Glu
770 775 780
Arg Gly Gly Leu Ser Glu Leu Asp Lys Ala Gly Phe Ile Lys Arg Gln
785 790 795 800
Leu Val Glu Thr Arg Gln Ile Thr Lys His Val Ala Gln Ile Leu Asp
805 810 815
Ser Arg Met Asn Thr Lys Tyr Asp Glu Asn Asp Lys Leu Ile Arg Glu
820 825 830
Val Lys Val Ile Thr Leu Lys Ser Lys Leu Val Ser Asp Phe Arg Lys
835 840 845
Asp Phe Gln Phe Tyr Lys Val Arg Glu Ile Asn Asn Tyr His Asn Ala
850 855 860
His Asp Ala Tyr Leu Asn Ala Val Val Gly Thr Ala Leu Ile Lys Lys
865 870 875 880
Tyr Pro Lys Leu Glu Ser Glu Phe Val Tyr Gly Asp Tyr Lys Val Tyr
885 890 895
Asp Val Arg Lys Met Ile Ala Lys Ser Glu Gln Glu Ile Gly Lys Ala
900 905 910
Thr Ala Lys Tyr Phe Phe Tyr Ser Asn Ile Met Asn Phe Phe Lys Thr
915 920 925
Glu Ile Thr Leu Ala Asn Gly Glu Ile Arg Lys Arg Pro Leu Ile Glu
930 935 940
Thr Asn Gly Glu Thr Gly Glu Ile Val Trp Asp Lys Gly Arg Asp Phe
945 950 955 960
Ala Thr Val Arg Lys Val Leu Ser Met Pro Gln Val Asn Ile Val Lys
965 970 975
Lys Thr Glu Val Gln Thr Gly Gly Phe Ser Lys Glu Ser Ile Leu Pro
980 985 990
Lys Arg Asn Ser Asp Lys Leu Ile Ala Arg Lys Lys Asp Trp Asp Pro
995 1000 1005
Lys Lys Tyr Gly Gly Phe Asp Ser Pro Thr Val Ala Tyr Ser Val Leu
1010 1015 1020
Val Val Ala Lys Val Glu Lys Gly Lys Ser Lys Lys Leu Lys Ser Val
1025 1030 1035 1040
Lys Glu Leu Leu Gly Ile Thr Ile Met Glu Arg Ser Ser Phe Glu Lys
1045 1050 1055
Asn Pro Ile Asp Phe Leu Glu Ala Lys Gly Tyr Lys Glu Val Lys Lys
1060 1065 1070
Asp Leu Ile Ile Lys Leu Pro Lys Tyr Ser Leu Phe Glu Leu Glu Asn
1075 1080 1085
Gly Arg Lys Arg Met Leu Ala Ser Ala Gly Glu Leu Gln Lys Gly Asn
1090 1095 1100
Glu Leu Ala Leu Pro Ser Lys Tyr Val Asn Phe Leu Tyr Leu Ala Ser
1105 1110 1115 1120
His Tyr Glu Lys Leu Lys Gly Ser Pro Glu Asp Asn Glu Gln Lys Gln
1125 1130 1135
Leu Phe Val Glu Gln His Lys His Tyr Leu Asp Glu Ile Ile Glu Gln
1140 1145 1150
Ile Ser Glu Phe Ser Lys Arg Val Ile Leu Ala Asp Ala Asn Leu Asp
1155 1160 1165
Lys Val Leu Ser Ala Tyr Asn Lys His Arg Asp Lys Pro Ile Arg Glu
1170 1175 1180
Gln Ala Glu Asn Ile Ile His Leu Phe Thr Leu Thr Asn Leu Gly Ala
1185 1190 1195 1200
Pro Ala Ala Phe Lys Tyr Phe Asp Thr Thr Ile Asp Arg Lys Arg Tyr
1205 1210 1215
Thr Ser Thr Lys Glu Val Leu Asp Ala Thr Leu Ile His Gln Ser Ile
1220 1225 1230
Thr Gly Leu Tyr Glu Thr Arg Ile Asp Leu Ser Gln Leu Gly Gly Asp
1235 1240 1245
<210> 7
<211> 3744
<212> DNA
<213> Streptococcus pyogenes serotype M1
<400> 7
ggcgacaaga agtactccat tgggctcgat atcggcacaa acagcgtcgg ctgggccgtc 60
attacggacg agtacaaggt gccgagcaaa aaattcaaag ttctgggcaa taccgatcgc 120
cacagcataa agaagaacct cattggcgcc ctcctgttcg actccgggga gacggccgaa 180
gccacgcggc tcaaaagaac agcacggcgc agatataccc gcagaaagaa tcggatctgc 240
tacctgcagg agatctttag taatgagatg gctaaggtgg atgactcttt cttccatagg 300
ctggaggagt cctttttggt ggaggaggat aaaaagcacg agcgccaccc aatctttggc 360
aatatcgtgg acgaggtggc gtaccatgaa aagtacccaa ccatatatca tctgaggaag 420
aagcttgtag acagtactga taaggctgac ttgcggttga tctatctcgc gctggcgcat 480
atgatcaaat ttcggggaca cttcctcatc gagggggacc tgaacccaga caacagcatt 540
ctgctgagtg atattctgcg agtgaacacg gagatcacca aagctccgct gagcgctagt 600
atgatcaagc gctatgatga gcaccaccaa gacttgactt tgctgaaggc ccttgtcaga 660
cagcaactgc ctgagaagta caaggaaatt ttcttcgatc agtctaaaaa tggctacgcc 720
ggatacattg acggcggagc aagccaggag gaattttaca aatttattaa gcccatcttg 780
gaaaaaatgg acggcaccga ggagctgctg gtaaagctta acagagaaga tctgttgcgc 840
aaacagcgca ctttcgacaa tggaagcatc ccccaccaga ttcacctggg cgaactgcac 900
gctatcctca ggcggcaaga ggatttctac ccctttttga aagataacag ggaaaagatt 960
gagaaaatcc tcacatttcg gataccctac tatgtaggcc ccctcgcccg gggaaattcc 1020
agattcgcgt ggatgactcg caaatcagaa gagaccatca ctccctggaa cttcgaggaa 1080
gtcgtggata agggggcctc tgcccagtcc ttcatcgaaa ggatgactaa ctttgataaa 1140
aatctgccta acgaaaaggt gcttcctaaa cactctctgc tgtacgagta cttcacagtt 1200
tataacgagc tcaccaaggt caaatacgtc acagaaggga tgagaaagcc agcattcctg 1260
tctggagagc agaagaaagc tatcgtggac ctcctcttca agacgaaccg gaaagttacc 1320
gtgaaacagc tcaaagaaga ctatttcaaa aagattgaat gtttcgactc tgttgaaatc 1380
agcggagtgg aggatcgctt caacgcatcc ctgggaacgt atcacgatct cctgaaaatc 1440
attaaagaca aggacttcct ggacaatgag gagaacgagg acattcttga ggacattgtc 1500
ctcaccctta cgttgtttga agatagggag atgattgaag aacgcttgaa aacttacgct 1560
catctcttcg acgacaaagt catgaaacag ctcaagaggc gccgatatac aggatggggg 1620
cggctgtcaa gaaaactgat caatgggatc cgagacaagc agagtggaaa gacaatcctg 1680
gattttctta agtccgatgg atttgccaac cggaacttca tgcagttgat ccatgatgac 1740
tctctcacct ttaaggagga catccagaaa gcacaagttt ctggccaggg ggacagtctt 1800
cacgagcaca tcgctaatct tgcaggtagc ccagctatca aaaagggaat actgcagacc 1860
gttaaggtcg tggatgaact cgtcaaagta atgggaaggc ataagcccga gaatatcgtt 1920
atcgagatgg cccgagagaa ccaaactacc cagaagggac agaagaacag tagggaaagg 1980
atgaagagga ttgaagaggg tataaaagaa ctggggtccc aaatccttaa ggaacaccca 2040
gttgaaaaca cccagcttca gaatgagaag ctctacctgt actacctgca gaacggcagg 2100
gacatgtacg tggatcagga actggacatc aatcggctct ccgactacga cgtggatcat 2160
atcgtgcccc agtcttttct caaagatgat tctattgata ataaagtgtt gacaagatcc 2220
gataaaaata gagggaagag tgataacgtc ccctcagaag aagttgtcaa gaaaatgaaa 2280
aattattggc ggcagctgct gaacgccaaa ctgatcacac aacggaagtt cgataatctg 2340
actaaggctg aacgaggtgg cctgtctgag ttggataaag caggcttcat caaaaggcag 2400
cttgttgaga cacgccagat caccaagcac gtggcccaaa ttctcgattc acgcatgaac 2460
accaagtacg atgaaaatga caaactgatt cgagaggtga aagttattac tctgaagtct 2520
aagctggtct cagatttcag aaaggacttt cagttttata aggtgagaga gatcaacaat 2580
tacgccaatg cgcatgatgc ctacctgaat gcagtggtag gcactgcact tatcaaaaaa 2640
tatcccaagc ttgaatctga atttgtttac ggagactata aagtgtacga tgttaggaaa 2700
atgatcgcaa agtctgagca ggaaataggc aaggccaccg ctaagtactt cttttacagc 2760
aatattatga attttttcaa gaccgagatt acactggcca atggagagat tcggaagcga 2820
ccacttatcg aaacaaacgg agaaacagga gaaatcgtgt gggacaaggg tagggatttc 2880
gcgacagtcc ggaaggtcct gtccatgccg caggtgaaca tcgttaaaaa gaccgaagta 2940
cagaccggag gcttctccaa ggaaagtatc ctcccgaaaa ggaacagcga caagctgatc 3000
gcacgcaaaa aagattggga ccccaagaaa tacggcggat tcgattctcc tacagtcgct 3060
tacagtgtac tggttgtggc caaagtggag aaagggaagt ctaaaaaact caaaagcgtc 3120
aaggaactgc tgggcatcac aatcatggag cgatcaagct tcgaaaaaaa ccccatcgac 3180
tttctcgagg cgaaaggata taaagaggtc aaaaaagacc tcatcattaa gcttcccaag 3240
tactctctct ttgagcttga aaacggccgg aaacgaatgc tcgctagtgc gggcgagctg 3300
cagaaaggta acgagctggc actgccctct aaatacgtta atttcttgta tctggccagc 3360
cactatgaaa agctcaaagg gtctcccgaa gataatgagc agaagcagct gttcgtggaa 3420
caacacaaac actaccttga tgagatcatc gagcaaataa gcgaattctc caaaagagtg 3480
atcctcgccg acgctaacct cgataaggtg ctttctgctt acaataagca cagggataag 3540
cccatcaggg agcaggcaga aaacattatc cacttgttta ctctgaccaa cttgggcgcg 3600
cctgcagcct tcaagtactt cgacaccacc atagacagaa agcggtacac ctctacaaag 3660
gaggtcctgg acgccacact gattcatcag tcaattacgg ggctctatga aacaagaatc 3720
gacctctctc agctcggtgg agac 3744
<210> 8
<211> 1248
<212> PRT
<213> Streptococcus pyogenes serotype M1
<400> 8
Gly Asp Lys Lys Tyr Ser Ile Gly Leu Asp Ile Gly Thr Asn Ser Val
1 5 10 15
Gly Trp Ala Val Ile Thr Asp Glu Tyr Lys Val Pro Ser Lys Lys Phe
20 25 30
Lys Val Leu Gly Asn Thr Asp Arg His Ser Ile Lys Lys Asn Leu Ile
35 40 45
Gly Ala Leu Leu Phe Asp Ser Gly Glu Thr Ala Glu Ala Thr Arg Leu
50 55 60
Lys Arg Thr Ala Arg Arg Arg Tyr Thr Arg Arg Lys Asn Arg Ile Cys
65 70 75 80
Tyr Leu Gln Glu Ile Phe Ser Asn Glu Met Ala Lys Val Asp Asp Ser
85 90 95
Phe Phe His Arg Leu Glu Glu Ser Phe Leu Val Glu Glu Asp Lys Lys
100 105 110
His Glu Arg His Pro Ile Phe Gly Asn Ile Val Asp Glu Val Ala Tyr
115 120 125
His Glu Lys Tyr Pro Thr Ile Tyr His Leu Arg Lys Lys Leu Val Asp
130 135 140
Ser Thr Asp Lys Ala Asp Leu Arg Leu Ile Tyr Leu Ala Leu Ala His
145 150 155 160
Met Ile Lys Phe Arg Gly His Phe Leu Ile Glu Gly Asp Leu Asn Pro
165 170 175
Asp Asn Ser Ile Leu Leu Ser Asp Ile Leu Arg Val Asn Thr Glu Ile
180 185 190
Thr Lys Ala Pro Leu Ser Ala Ser Met Ile Lys Arg Tyr Asp Glu His
195 200 205
His Gln Asp Leu Thr Leu Leu Lys Ala Leu Val Arg Gln Gln Leu Pro
210 215 220
Glu Lys Tyr Lys Glu Ile Phe Phe Asp Gln Ser Lys Asn Gly Tyr Ala
225 230 235 240
Gly Tyr Ile Asp Gly Gly Ala Ser Gln Glu Glu Phe Tyr Lys Phe Ile
245 250 255
Lys Pro Ile Leu Glu Lys Met Asp Gly Thr Glu Glu Leu Leu Val Lys
260 265 270
Leu Asn Arg Glu Asp Leu Leu Arg Lys Gln Arg Thr Phe Asp Asn Gly
275 280 285
Ser Ile Pro His Gln Ile His Leu Gly Glu Leu His Ala Ile Leu Arg
290 295 300
Arg Gln Glu Asp Phe Tyr Pro Phe Leu Lys Asp Asn Arg Glu Lys Ile
305 310 315 320
Glu Lys Ile Leu Thr Phe Arg Ile Pro Tyr Tyr Val Gly Pro Leu Ala
325 330 335
Arg Gly Asn Ser Arg Phe Ala Trp Met Thr Arg Lys Ser Glu Glu Thr
340 345 350
Ile Thr Pro Trp Asn Phe Glu Glu Val Val Asp Lys Gly Ala Ser Ala
355 360 365
Gln Ser Phe Ile Glu Arg Met Thr Asn Phe Asp Lys Asn Leu Pro Asn
370 375 380
Glu Lys Val Leu Pro Lys His Ser Leu Leu Tyr Glu Tyr Phe Thr Val
385 390 395 400
Tyr Asn Glu Leu Thr Lys Val Lys Tyr Val Thr Glu Gly Met Arg Lys
405 410 415
Pro Ala Phe Leu Ser Gly Glu Gln Lys Lys Ala Ile Val Asp Leu Leu
420 425 430
Phe Lys Thr Asn Arg Lys Val Thr Val Lys Gln Leu Lys Glu Asp Tyr
435 440 445
Phe Lys Lys Ile Glu Cys Phe Asp Ser Val Glu Ile Ser Gly Val Glu
450 455 460
Asp Arg Phe Asn Ala Ser Leu Gly Thr Tyr His Asp Leu Leu Lys Ile
465 470 475 480
Ile Lys Asp Lys Asp Phe Leu Asp Asn Glu Glu Asn Glu Asp Ile Leu
485 490 495
Glu Asp Ile Val Leu Thr Leu Thr Leu Phe Glu Asp Arg Glu Met Ile
500 505 510
Glu Glu Arg Leu Lys Thr Tyr Ala His Leu Phe Asp Asp Lys Val Met
515 520 525
Lys Gln Leu Lys Arg Arg Arg Tyr Thr Gly Trp Gly Arg Leu Ser Arg
530 535 540
Lys Leu Ile Asn Gly Ile Arg Asp Lys Gln Ser Gly Lys Thr Ile Leu
545 550 555 560
Asp Phe Leu Lys Ser Asp Gly Phe Ala Asn Arg Asn Phe Met Gln Leu
565 570 575
Ile His Asp Asp Ser Leu Thr Phe Lys Glu Asp Ile Gln Lys Ala Gln
580 585 590
Val Ser Gly Gln Gly Asp Ser Leu His Glu His Ile Ala Asn Leu Ala
595 600 605
Gly Ser Pro Ala Ile Lys Lys Gly Ile Leu Gln Thr Val Lys Val Val
610 615 620
Asp Glu Leu Val Lys Val Met Gly Arg His Lys Pro Glu Asn Ile Val
625 630 635 640
Ile Glu Met Ala Arg Glu Asn Gln Thr Thr Gln Lys Gly Gln Lys Asn
645 650 655
Ser Arg Glu Arg Met Lys Arg Ile Glu Glu Gly Ile Lys Glu Leu Gly
660 665 670
Ser Gln Ile Leu Lys Glu His Pro Val Glu Asn Thr Gln Leu Gln Asn
675 680 685
Glu Lys Leu Tyr Leu Tyr Tyr Leu Gln Asn Gly Arg Asp Met Tyr Val
690 695 700
Asp Gln Glu Leu Asp Ile Asn Arg Leu Ser Asp Tyr Asp Val Asp His
705 710 715 720
Ile Val Pro Gln Ser Phe Leu Lys Asp Asp Ser Ile Asp Asn Lys Val
725 730 735
Leu Thr Arg Ser Asp Lys Asn Arg Gly Lys Ser Asp Asn Val Pro Ser
740 745 750
Glu Glu Val Val Lys Lys Met Lys Asn Tyr Trp Arg Gln Leu Leu Asn
755 760 765
Ala Lys Leu Ile Thr Gln Arg Lys Phe Asp Asn Leu Thr Lys Ala Glu
770 775 780
Arg Gly Gly Leu Ser Glu Leu Asp Lys Ala Gly Phe Ile Lys Arg Gln
785 790 795 800
Leu Val Glu Thr Arg Gln Ile Thr Lys His Val Ala Gln Ile Leu Asp
805 810 815
Ser Arg Met Asn Thr Lys Tyr Asp Glu Asn Asp Lys Leu Ile Arg Glu
820 825 830
Val Lys Val Ile Thr Leu Lys Ser Lys Leu Val Ser Asp Phe Arg Lys
835 840 845
Asp Phe Gln Phe Tyr Lys Val Arg Glu Ile Asn Asn Tyr Ala Asn Ala
850 855 860
His Asp Ala Tyr Leu Asn Ala Val Val Gly Thr Ala Leu Ile Lys Lys
865 870 875 880
Tyr Pro Lys Leu Glu Ser Glu Phe Val Tyr Gly Asp Tyr Lys Val Tyr
885 890 895
Asp Val Arg Lys Met Ile Ala Lys Ser Glu Gln Glu Ile Gly Lys Ala
900 905 910
Thr Ala Lys Tyr Phe Phe Tyr Ser Asn Ile Met Asn Phe Phe Lys Thr
915 920 925
Glu Ile Thr Leu Ala Asn Gly Glu Ile Arg Lys Arg Pro Leu Ile Glu
930 935 940
Thr Asn Gly Glu Thr Gly Glu Ile Val Trp Asp Lys Gly Arg Asp Phe
945 950 955 960
Ala Thr Val Arg Lys Val Leu Ser Met Pro Gln Val Asn Ile Val Lys
965 970 975
Lys Thr Glu Val Gln Thr Gly Gly Phe Ser Lys Glu Ser Ile Leu Pro
980 985 990
Lys Arg Asn Ser Asp Lys Leu Ile Ala Arg Lys Lys Asp Trp Asp Pro
995 1000 1005
Lys Lys Tyr Gly Gly Phe Asp Ser Pro Thr Val Ala Tyr Ser Val Leu
1010 1015 1020
Val Val Ala Lys Val Glu Lys Gly Lys Ser Lys Lys Leu Lys Ser Val
1025 1030 1035 1040
Lys Glu Leu Leu Gly Ile Thr Ile Met Glu Arg Ser Ser Phe Glu Lys
1045 1050 1055
Asn Pro Ile Asp Phe Leu Glu Ala Lys Gly Tyr Lys Glu Val Lys Lys
1060 1065 1070
Asp Leu Ile Ile Lys Leu Pro Lys Tyr Ser Leu Phe Glu Leu Glu Asn
1075 1080 1085
Gly Arg Lys Arg Met Leu Ala Ser Ala Gly Glu Leu Gln Lys Gly Asn
1090 1095 1100
Glu Leu Ala Leu Pro Ser Lys Tyr Val Asn Phe Leu Tyr Leu Ala Ser
1105 1110 1115 1120
His Tyr Glu Lys Leu Lys Gly Ser Pro Glu Asp Asn Glu Gln Lys Gln
1125 1130 1135
Leu Phe Val Glu Gln His Lys His Tyr Leu Asp Glu Ile Ile Glu Gln
1140 1145 1150
Ile Ser Glu Phe Ser Lys Arg Val Ile Leu Ala Asp Ala Asn Leu Asp
1155 1160 1165
Lys Val Leu Ser Ala Tyr Asn Lys His Arg Asp Lys Pro Ile Arg Glu
1170 1175 1180
Gln Ala Glu Asn Ile Ile His Leu Phe Thr Leu Thr Asn Leu Gly Ala
1185 1190 1195 1200
Pro Ala Ala Phe Lys Tyr Phe Asp Thr Thr Ile Asp Arg Lys Arg Tyr
1205 1210 1215
Thr Ser Thr Lys Glu Val Leu Asp Ala Thr Leu Ile His Gln Ser Ile
1220 1225 1230
Thr Gly Leu Tyr Glu Thr Arg Ile Asp Leu Ser Gln Leu Gly Gly Asp
1235 1240 1245
<210> 9
<211> 31
<212> DNA
<213> Artificial Sequence
<400> 9
gatcaacaat taccacaatg cgcatgatgc c 31
<210> 10
<211> 31
<212> DNA
<213> Artificial Sequence
<400> 10
tgtggtaatt gttgatctct ctcaccttat a 31
<210> 11
<211> 34
<212> DNA
<213> Artificial Sequence
<400> 11
gatcaacaat tacgccaatg cgcatgatgc ctac 34
<210> 12
<211> 34
<212> DNA
<213> Artificial Sequence
<400> 12
gtaggcatca tgcgcattgg cgtaattgtt gatc 34
<210> 13
<211> 920
<212> DNA
<213> Artificial Sequence
<400> 13
tagtcctgtc gggtttcgcc acctctgact tgagcgtcga tttttgtgat gctcgtcagg 60
ggggcggagc ctatggaaaa acgccagcaa cgcggccttt ttacggttcc tggccttttg 120
ctggcctttt gctcacatgt tctttcctgc gttatcccct gattctgtgg ataaccgtat 180
taccgccttt gagtgagctg ataccgctcg ccgcagccga acgaccgagc gcagcgagtc 240
agtgagcgag gaagcggaag agcgcccaat acgcaaaccg cctctccccg cgcgttggcc 300
gattcattaa tgcagctggc acgacaggtt tcccgactgg aaagcgggca gtgagcgcaa 360
cgcaattaat gtgagttagc tcactcatta ggcaccccag gctttacact ttatgcttcc 420
ggctcgtatg ttgtgtggaa ttgtgagcgg ataacaattt cacacaggaa acagctatga 480
ccatgattac gccaagctcg aaattaaccc tcactaaagg gaacaaaagc tggagctcca 540
ccgcggtggc ggccgctcta gaactagtgg atcccccggg ctgcaggaat tcgatatcaa 600
gcttatcgat taccgctcca gtcgttcatg aggttagagc tagaaatagc aagttaaaat 660
aaggctagtc cgttatcaac ttgaaaaagt ggcaccgagt cggtgctctc gagggggggc 720
ccggtaccca attcgcccta tagtgagtcg tattacaatt cactggccgt cgttttacaa 780
cgtcgtgact gggaaaaccc tggcgttacc caacttaatc gccttgcagc acatccccct 840
ttcgccagct ggcgtaatag cgaagaggcc cgcaccgatc gcccttccca acagttgcgc 900
agcctgaatg gcgaatggaa 920
<210> 14
<211> 18
<212> DNA
<213> Artificial Sequence
<400> 14
tagtcctgtc gggtttcg 18
<210> 15
<211> 18
<212> DNA
<213> Artificial Sequence
<400> 15
ttccattcgc cattcagg 18
<210> 16
<211> 3046
<212> DNA
<213> Artificial Sequence
<400> 16
gcacttttcg gggaaatgtg cgcggaaccc ctatttgttt atttttctaa atacattcaa 60
atatgtatcc gctcatgaga caataaccct gataaatgct tcaataatat tgaaaaagga 120
agagtatgag tattcaacat ttccgtgtcg cccttattcc cttttttgcg gcattttgcc 180
ttcctgtttt tgctcaccca gaaacgctgg tgaaagtaaa agatgctgaa gatcagttgg 240
gtgcacgagt gggttacatc gaactggatc tcaacagcgg taagatcctt gagagttttc 300
gccccgaaga acgttttcca atgatgagca cttttaaagt tctgctatgt ggcgcggtat 360
tatcccgtat tgacgccggg caagagcaac tcggtcgccg catacactat tctcagaatg 420
acttggttga gtactcacca gtcacagaaa agcatcttac ggatggcatg acagtaagag 480
aattatgcag tgctgccata accatgagtg ataacactgc ggccaactta cttctgacaa 540
cgatcggagg accgaaggag ctaaccgctt ttttgcacaa catgggggat catgtaactc 600
gccttgatcg ttgggaaccg gagctgaatg aagccatacc aaacgacgag cgtgacacca 660
cgatgcctgt agcaatggca acaacgttgc gcaaactatt aactggcgaa ctacttactc 720
tagcttcccg gcaacaatta atagactgga tggaggcgga taaagttgca ggaccacttc 780
tgcgctcggc ccttccggct ggctggttta ttgctgataa atctggagcc ggtgagcgtg 840
ggtctcgcgg tatcattgca gcactggggc cagatggtaa gccctcccgt atcgtagtta 900
tctacacgac ggggagtcag gcaactatgg atgaacgaaa tagacagatc gctgagatag 960
gtgcctcact gattaagcat tggtaactgt cagaccaagt ttactcatat atactttaga 1020
ttgatttaaa acttcatttt taatttaaaa ggatctaggt gaagatcctt tttgataatc 1080
tcatgaccaa aatcccttaa cgtgagtttt cgttccactg agcgtcagac cccgtagaaa 1140
agatcaaagg atcttcttga gatccttttt ttctgcgcgt aatctgctgc ttgcaaacaa 1200
aaaaaccacc gctaccagcg gtggtttgtt tgccggatca agagctacca actctttttc 1260
cgaaggtaac tggcttcagc agagcgcaga taccaaatac tgtccttcta gtgtagccgt 1320
agttaggcca ccacttcaag aactctgtag caccgcctac atacctcgct ctgctaatcc 1380
tgttaccagt ggctgctgcc agtggcgata agtcgtgtct taccgggttg gactcaagac 1440
gatagttacc ggataaggcg cagcggtcgg gctgaacggg gggttcgtgc acacagccca 1500
gcttggagcg aacgacctac accgaactga gatacctaca gcgtgagcta tgagaaagcg 1560
ccacgcttcc cgaagggaga aaggcggaca ggtatccggt aagcggcagg gtcggaacag 1620
gagagcgcac gagggagctt ccagggggaa acgcctggta tctttatagt cctgtcgggt 1680
ttcgccacct ctgacttgag cgtcgatttt tgtgatgctc gtcagggggg cggagcctat 1740
ggaaaaacgc cagcaacgcg gcctttttac ggttcctggc cttttgctgg ccttttgctc 1800
acatgttctt tcctgcgtta tcccctgatt ctgtggataa ccgtattacc gcctttgagt 1860
gagctgatac cgctcgccgc agccgaacga ccgagcgcag cgagtcagtg agcgaggaag 1920
cggaagagcg cccaatacgc aaaccgcctc tccccgcgcg ttggccgatt cattaatgca 1980
gctggcacga caggtttccc gactggaaag cgggcagtga gcgcaacgca attaatgtga 2040
gttagctcac tcattaggca ccccaggctt tacactttat gcttccggct cgtatgttgt 2100
gtggaattgt gagcggataa caatttcaca caggaaacag ctatgaccat gattacgcca 2160
agctcgaaat taaccctcac taaagggaac aaaagctgga gctccaccgc ggtggcggcc 2220
gctctagaac tagtggatcc cccgggctgc aggaattcga tatcaagctt atcgattacc 2280
gctccagtcg ttcatgaggt tagagctaga aatagcaagt taaaataagg ctagtccgtt 2340
atcaacttga aaaagtggca ccgagtcggt gctctcgagg gggggcccgg tacccaattc 2400
gccctatagt gagtcgtatt acaattcact ggccgtcgtt ttacaacgtc gtgactggga 2460
aaaccctggc gttacccaac ttaatcgcct tgcagcacat ccccctttcg ccagctggcg 2520
taatagcgaa gaggcccgca ccgatcgccc ttcccaacag ttgcgcagcc tgaatggcga 2580
atggaaattg taagcgttaa tattttgtta aaattcgcgt taaatttttg ttaaatcagc 2640
tcatttttta accaataggc cgaaatcggc aaaatccctt ataaatcaaa agaatagacc 2700
gagatagggt tgagtgttgt tccagtttgg aacaagagtc cactattaaa gaacgtggac 2760
tccaacgtca aagggcgaaa aaccgtctat cagggcgatg gcccactacg tgaaccatca 2820
ccctaatcaa gttttttggg gtcgaggtgc cgtaaagcac taaatcggaa ccctaaaggg 2880
agcccccgat ttagagcttg acggggaaag ccggcgaacg tggcgagaaa ggaagggaag 2940
aaagcgaaag gagcgggcgc tagggcgctg gcaagtgtag cggtcacgct gcgcgtaacc 3000
accacacccg ccgcgcttaa tgcgccgcta cagggcgcgt caggtg 3046
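
The short oligonucleotide entries in the sequence listing above can be cross-checked against one another. The Python sketch below tests four relationships that the sequences themselves suggest: SEQ ID NO. 11 and 12 are exact reverse complements, SEQ ID NO. 9 and 10 overlap by 17 nt when one is reverse-complemented, SEQ ID NO. 14 matches the 5' end of SEQ ID NO. 13, and SEQ ID NO. 15 is the reverse complement of its 3' end. Reading SEQ ID NO. 9 to 12 as mutagenesis primers and SEQ ID NO. 14 and 15 as amplification primers is an inference from these relationships, not a statement made in the listing. The function and variable names are hypothetical.

```python
# Oligonucleotides copied from the sequence listing (spaces removed).
SEQ9 = "gatcaacaattaccacaatgcgcatgatgcc"
SEQ10 = "tgtggtaattgttgatctctctcaccttata"
SEQ11 = "gatcaacaattacgccaatgcgcatgatgcctac"
SEQ12 = "gtaggcatcatgcgcattggcgtaattgttgatc"
SEQ14 = "tagtcctgtcgggtttcg"
SEQ15 = "ttccattcgccattcagg"
# Only the two ends of the 920-nt SEQ ID NO. 13 are needed for this check.
SEQ13_HEAD = "tagtcctgtcgggtttcgccacctctgact"  # first 30 nt
SEQ13_TAIL = "agcctgaatggcgaatggaa"            # last 20 nt

_COMPLEMENT = str.maketrans("acgt", "tgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a lowercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def check_oligos() -> dict:
    """Collect the pairing relationships between the oligos as booleans."""
    return {
        "seq11_seq12_fully_complementary": reverse_complement(SEQ11) == SEQ12,
        "seq9_seq10_share_17nt_overlap": reverse_complement(SEQ10).endswith(SEQ9[:17]),
        "seq14_matches_start_of_seq13": SEQ13_HEAD.startswith(SEQ14),
        "seq15_anneals_to_end_of_seq13": SEQ13_TAIL.endswith(reverse_complement(SEQ15)),
    }


if __name__ == "__main__":
    for name, ok in check_oligos().items():
        print(f"{name}: {ok}")
```

A check of this kind only confirms that the listed sequences are internally consistent; it says nothing about annealing temperature or experimental performance.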

Claims (9)

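1. Truncated high-specificity variants of the CRISPR-Cas9 nuclease SpCas9 derived from Streptococcus pyogenes, characterized in that there are two variants: in one, the amino acid H at position 863 of the truncated CRISPR-Cas9 (TSpCas9) nuclease is mutated to N, and the variant is designated the TSpCas9-V1 nuclease; in the other, the amino acid H at position 862 of the truncated CRISPR-Cas9 (TSpCas9) nuclease is mutated to A and the amino acid H at position 863 is mutated to N, and the variant is designated the TSpCas9-V2 nuclease; the two truncated high-specificity variants are together designated TSpCas9-V1/V2; they have gene-editing activity comparable to that of the wild-type CRISPR-Cas9 nuclease but higher specificity than the wild type, enabling precise editing;
the nucleotide sequence and amino acid sequence of the wild-type SpCas9 nuclease are shown in SEQ ID NO. 1 and SEQ ID NO. 2, respectively;
the nucleotide sequence and amino acid sequence of the truncated TSpCas9 nuclease are shown in SEQ ID NO. 3 and SEQ ID NO. 4, respectively;
the nucleotide sequence and amino acid sequence of the truncated high-specificity TSpCas9-V1 nuclease are shown in SEQ ID NO. 5 and SEQ ID NO. 6, respectively;
the nucleotide sequence and amino acid sequence of the truncated high-specificity TSpCas9-V2 nuclease are shown in SEQ ID NO. 7 and SEQ ID NO. 8, respectively.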
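2. A polynucleotide sequence that can be transcribed and translated into the TSpCas9-V1 of claim 1, or transcribed and translated into the TSpCas9-V2 of claim 1.
3. An expression vector containing the polynucleotide sequence of claim 2.
4. A method for preparing the truncated high-specificity variant of claim 1, characterized in that the specific steps are: first, constructing an expression vector for the polynucleotide sequence of the CRISPR-Cas9 nuclease; then, transforming the expression vector into host cells, and screening and picking a single clone; finally, inducing expression in the single clone, and isolating the CRISPR-Cas9 nuclease from the expression product by affinity chromatography and ion exchange.
5. Use of the truncated high-specificity variant of claim 1, the polynucleotide sequence of claim 2, or the expression vector of claim 3 as an editing tool for editing genomic DNA, in the editing of genomic DNA fragments.
6. The use according to claim 5, wherein the editing of genomic DNA is single-site editing, or multi-site editing with two or more editing sites; the means of editing include deletion, mutation, insertion, inversion, transposition, duplication, or translocation.
7. The use according to claim 6, wherein the CRISPR-Cas9 editing tool includes a guide sgRNA matching the target DNA fragment; the CRISPR-Cas9 nuclease is combined with the sgRNA capable of guiding it to edit the target gene.
8. The use according to claim 5, characterized in that the expression vector of claim 3 and a matching guide sgRNA are introduced together into host cells to edit a gene.
9. The use according to claim 7, characterized in that the single-site or multi-site gene editing uses the TSpCas9-V1 or TSpCas9-V2 nuclease of claim 1 to cleave double-stranded DNA, and the resulting break is repaired by the repair system of the host cell.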
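The substitutions recited in claim 1 (H863N for TSpCas9-V1; H862A together with H863N for TSpCas9-V2) can be illustrated with a minimal Python sketch. The function name `apply_substitutions` and the placeholder sequence `toy_tspcas9` are hypothetical and introduced only for illustration; the placeholder is not SEQ ID NO.4, and in practice the TSpCas9 amino acid sequence from the sequence listing would be substituted for it.

```python
def apply_substitutions(protein, substitutions):
    """Apply point substitutions such as 'H863N' (1-based positions).

    Each substitution is checked against the residue currently at that
    position, so a numbering mismatch is reported instead of silently
    producing the wrong variant.
    """
    residues = list(protein)
    for sub in substitutions:
        original, position, replacement = sub[0], int(sub[1:-1]), sub[-1]
        index = position - 1  # convert 1-based position to 0-based index
        if not 0 <= index < len(residues):
            raise ValueError(f"{sub}: position {position} is outside the sequence")
        if residues[index] != original:
            raise ValueError(
                f"{sub}: expected {original} at position {position}, "
                f"found {residues[index]}"
            )
        residues[index] = replacement
    return "".join(residues)


# Hypothetical placeholder, NOT the real TSpCas9 sequence: 870 residues
# with histidine (H) at positions 862 and 863, glycine (G) elsewhere.
toy_tspcas9 = "G" * 861 + "HH" + "G" * 7

v1 = apply_substitutions(toy_tspcas9, ["H863N"])           # TSpCas9-V1 pattern
v2 = apply_substitutions(toy_tspcas9, ["H862A", "H863N"])  # TSpCas9-V2 pattern

print(v1[860:864])  # GHNG
print(v2[860:864])  # GANG
```

Checking the expected original residue before substituting is a deliberate choice in this sketch: position numbering differs between the wild-type and truncated sequences, so the check flags a substitution applied against the wrong reference.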
CN201910581265.8A 2019-06-29 2019-06-29 High-specificity truncated variants TSpCas9-V1/V2 of endonuclease SpCas9 and applications thereof Active CN110272881B (zh)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910581265.8A CN110272881B (zh) 2019-06-29 2019-06-29 High-specificity truncated variants TSpCas9-V1/V2 of endonuclease SpCas9 and applications thereof

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910581265.8A CN110272881B (zh) 2019-06-29 2019-06-29 High-specificity truncated variants TSpCas9-V1/V2 of endonuclease SpCas9 and applications thereof

Publications (2)

Publication Number Publication Date
CN110272881A (zh) 2019-09-24
CN110272881B (zh) 2021-04-30

Family

ID=67962692

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910581265.8A Active CN110272881B (zh) High-specificity truncated variants TSpCas9-V1/V2 of endonuclease SpCas9 and applications thereof 2019-06-29 2019-06-29

Country Status (1)

Country Link
CN (1) CN110272881B (zh)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112538471B (zh) * 2020-12-28 2023-12-12 Southern Medical University CRISPR SpCas9 (K510A) mutant and application thereof

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108290933A (zh) * 2015-06-18 2018-07-17 The Broad Institute, Inc. CRISPR enzyme mutations reducing off-target effects
CN108350449A (zh) * 2015-08-28 2018-07-31 The General Hospital Corporation Engineered CRISPR-Cas9 nucleases

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
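CN108290933A (zh) * 2015-06-18 2018-07-17 The Broad Institute, Inc. CRISPR enzyme mutations reducing off-target effects
CN109536474A (zh) * 2015-06-18 2019-03-29 The Broad Institute, Inc. CRISPR enzyme mutations reducing off-target effects
CN108350449A (zh) * 2015-08-28 2018-07-31 The General Hospital Corporation Engineered CRISPR-Cas9 nucleases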

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
"High-fidelity CRISPR-Cas9 variants with undetectable genomewide"; Benjamin P. Kleinstiver et al.; Nature; 2016-01-28; Vol. 529; pp. 490-495 *

Also Published As

Publication number Publication date
CN110272881A (zh) 2019-09-24

Similar Documents

Publication Publication Date Title
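CN104498493B (zh) Method for specific knockout of hepatitis B virus using CRISPR/Cas9, and gRNA for specifically targeting HBV DNA
KR101106253B1 (ko) Escherichia coli comprising a polynucleotide encoding psicose 3-epimerase, and method for producing psicose using the same
CN107586779A (zh) Method for CASP3 gene knockout in mesenchymal stem cells using the CRISPR-Cas system
CN111235080B (zh) Genetically recombinant Escherichia coli and method for producing 5-hydroxytryptamine
CN109055544B (zh) Atherosclerosis molecular marker and application thereof
CN111154707B (zh) Genetically engineered Escherichia coli and method for producing melatonin
CN106867952A (zh) Genetically engineered Escherichia coli strain and method for producing L-threonine using the same
CN109182503A (zh) Atherosclerosis molecular marker and application thereof
CN110241098B (zh) Truncated high-specificity variant of the CRISPR nuclease SpCas9 of Streptococcus pyogenes and application thereof
CN107988250B (zh) Method for constructing a universal expression vector for exogenous genes in Chlamydomonas
CN110964725A (zh) sgRNA specifically recognizing the porcine KIT gene, and its encoding DNA, kit, and application
CN104278031B (zh) Xanthine-regulated promoter A, and recombinant expression vector and application thereof
CN111909914B (zh) High-PAM-compatibility truncated variant txCas9 of endonuclease SpCas9 and application thereof
CN110272881B (zh) High-specificity truncated variants TSpCas9-V1/V2 of endonuclease SpCas9 and applications thereof
CN101466833B (zh) Modified chondroitin synthase polypeptide and crystal thereof
CN110499336B (zh) Method for improving the efficiency of site-directed genome modification using small-molecule compounds
CN106479928B (zh) High-salt-tolerant and high-COD-tolerant Salinicoccus strain and endogenous plasmid derived from the strain
CN106636023B (zh) Method for enhancing the expression strength of the zwf gene promoter
CN109136228B (zh) Application of long non-coding RNA NKILA in bone tissue injury repair
CN112553237A (zh) Novel mariner transposon system, its application, and construction of a Bacillus subtilis insertion mutant library
CN110656120A (zh) Cloning method for Japanese encephalitis virus SA14-14-2 and application thereof
CN112662697B (zh) Chlamydomonas reinhardtii TCTN1 expression plasmid, and construction method and application thereof
RU2761660C1 (ru) Escherichia coli BL21(DE3)/pET32v11-Flpo cell strain producing the site-specific recombinase Flpe
CN113444708B (zh) Hyaluronidase mutant for subcutaneous injection drug formulations
CN106520818B (zh) Method for rapid complementation of deleted genes in Riemerella anatipestifer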

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant