CN109021111B - A gene base editor - Google Patents

A gene base editor

Publication number
CN109021111B
Authority
CN
China
Prior art keywords
leu
lys
glu
asp
ile
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201810185384.7A
Other languages
English (en)
Other versions
CN109021111A (zh)
Inventor
陈佳
杨力
黄行许
杨贝
王潇
李佳楠
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
ShanghaiTech University
Original Assignee
ShanghaiTech University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by ShanghaiTech University
Publication of CN109021111A
Application granted
Publication of CN109021111B
Legal status: Active
Anticipated expiration

Classifications

    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N9/00 Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
    • C12N9/14 Hydrolases (3)
    • C12N9/78 Hydrolases (3) acting on carbon to nitrogen bonds other than peptide bonds (3.5)
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N9/00 Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
    • C12N9/14 Hydrolases (3)
    • C12N9/16 Hydrolases (3) acting on ester bonds (3.1)
    • C12N9/22 Ribonucleases RNAses, DNAses
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/10 Processes for the isolation, preparation or purification of DNA or RNA
    • C12N15/102 Mutagenizing nucleic acids
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12P FERMENTATION OR ENZYME-USING PROCESSES TO SYNTHESISE A DESIRED CHEMICAL COMPOUND OR COMPOSITION OR TO SEPARATE OPTICAL ISOMERS FROM A RACEMIC MIXTURE
    • C12P19/00 Preparation of compounds containing saccharide radicals
    • C12P19/26 Preparation of nitrogen-containing carbohydrates
    • C12P19/28 N-glycosides
    • C12P19/30 Nucleotides
    • C CHEMISTRY; METALLURGY
    • C07 ORGANIC CHEMISTRY
    • C07K PEPTIDES
    • C07K2319/00 Fusion polypeptide

Abstract

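The invention provides a fusion protein (base editor) comprising the human apolipoprotein B mRNA editing enzyme catalytic subunit 3A (APOBEC3A), a CRISPR-associated Cas protein, and, optionally, a uracil glycosylase inhibitor (UGI). The base editor performs base editing in DNA by deaminating cytosine to uracil, and it retains high editing efficiency even when the cytosine lies in a GpC site or in a highly methylated state.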

Description

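A gene base editor
Technical field
The present invention relates to a gene base editor.
Background
Genome editing is a genetic engineering technique that uses programmable nucleases ("molecular scissors") to modify specific segments of an organism's genomic DNA through base insertion, deletion, or substitution, thereby editing a target gene. Genetic manipulation of cells by genome editing is widely applicable to basic life-science research, biotechnology development, agricultural technology development, and pharmaceutical research and development. For example, directly correcting disease-causing mutations in vivo could treat genetic diseases at their root; precise genetic engineering of crops could raise yields or confer resistance to environmental pollution or pathogen infection; and precise modification of microbial genomes could promote the development of renewable bioenergy.
Since its introduction, the CRISPR/Cas (clustered regularly interspaced short palindromic repeats/CRISPR-associated protein) genome editing system has offered advantages unmatched by other genome editing technologies. It is considered broadly usable in living cells and is, to date, the most effective and convenient genome editing system. Guided by a guide RNA (gRNA), a Cas nuclease locates a specific target site in the genome of many cell types and cleaves it to produce a DNA double-strand break (DSB); the cell's endogenous DNA repair machinery then carries out the edit. Depending on which DNA repair pathway is activated, genome editing results in either gene inactivation or correction of a mutation.
Generally, a DSB activates two main repair mechanisms: non-homologous end joining (NHEJ) and homology-directed repair (HDR). As the predominant repair pathway for DNA double-strand breaks, NHEJ introduces random base insertions or deletions at genomic sites near the DSB during repair, which inactivates the gene. In contrast, when HDR is activated, it uses an exogenous donor DNA as a template and, through homologous recombination, replaces the genomic DNA sequence at the target site with the donor sequence, thereby correcting the mutation.
In practice, however, because of the inherent limitations of homologous recombination, the efficiency of HDR-mediated gene correction has remained very low (typically below 5%). This severely limits the translation of CRISPR/Cas genome editing tools from research to application, especially in precision gene therapy, and is a major problem in the field of gene editing.
To improve the efficiency of mutation correction, base editors (BEs) have recently been developed. Existing base editors combine the CRISPR/Cas system with rat cytosine deaminase 1 (rat APOBEC1, rA1) and convert cytosine (C) to thymine (T).
However, rA1-based base editors cannot efficiently edit the C within a GpC site, which limits the sites they can edit. For example, GpT-to-GpC mutations can abolish RNA splice sites and thereby cause a variety of human diseases, yet existing rA1-based base editors cannot efficiently correct GpT-to-GpC mutations. A new type of base editor that edits efficiently at GpC sites would therefore enable efficient base editing at a wider range of sites in the genomes of various species and would greatly expand the applications of base editors, particularly in precision gene therapy for related diseases.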
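Summary of the invention
The present invention discloses a series of novel base editors (fusion proteins) and corresponding new methods of gene base editing. The invention shows that fusing human cytosine deaminase 3A (human APOBEC3A, hA3A) with a CRISPR/Cas system efficiently deaminates cytosine (C) to uracil (U) at the target site of the CRISPR/Cas system, resulting in a C-to-T mutation at a specific genomic site. When a uracil glycosylase inhibitor (UGI) is also included in the fusion protein, the editing efficiency of these base editors increases further. Surprisingly, even in a GpC dinucleotide context, these editors still achieve precise and efficient targeted base editing. Notably, these editors also edit methylated cytosine (methylated C) efficiently; because cytosine methylation is widespread in living cells, this high editing efficiency on methylated cytosine is of significant clinical relevance.
Accordingly, in embodiments disclosed herein, the invention provides a base editor comprising two fragments: the first fragment comprises the human apolipoprotein B mRNA editing enzyme catalytic subunit 3A (human APOBEC3A, hA3A), a cytosine deaminase, and the second fragment comprises a CRISPR/Cas system-associated protein. In some other embodiments, the fusion protein further comprises a uracil glycosylase inhibitor (UGI).
The base editors of this series are no more than 3000, 2500, 2200, 2100, 2000, 1900, 1800, 1700, 1600, or 1500 amino acids in size.
In some embodiments, the APOBEC3A portion of the base editor comprises the amino acid sequence of SEQ ID NO:1, or a sequence that has at least 90% sequence identity to amino acid residues 29-199 of SEQ ID NO:1 and retains cytidine deaminase activity. In some embodiments, the APOBEC3A portion comprises an amino acid sequence selected from SEQ ID NO:1-10.
In some embodiments, the Cas protein portion of the base editor is selected from the group consisting of SpCas9, FnCas9, St1Cas9, St3Cas9, NmCas9, SaCas9, AsCpf1, LbCpf1, FnCpf1, VQR SpCas9, EQR SpCas9, VRER SpCas9, RHA FnCas9, and KKH SaCas9. In some embodiments, the Cas protein portion is a mutant of a protein from this group that retains the DNA-binding activity of the Cas protein but has completely lost its DNA cleavage activity. In some embodiments, the Cas protein portion is a mutant of a protein from this group that retains the DNA-binding activity but has lost part of its DNA cleavage activity (that is, it can cut only one strand of double-stranded genomic DNA). In some embodiments, the Cas protein portion comprises the amino acid sequence shown in SEQ ID NO:11.
In some embodiments, the UGI portion of the base editor comprises the amino acid sequence shown in SEQ ID NO:12, or a sequence that has at least 90% sequence identity to the amino acid sequence of SEQ ID NO:12 and retains uracil glycosylase inhibitory activity.
In some embodiments, the first fragment is located on the N-terminal side of the second fragment; in some embodiments, the first fragment is on the N-terminal side of the second fragment, and the second fragment is on the N-terminal side of the UGI fragment.
In some embodiments, the base editor (fusion protein) further comprises a linker peptide of variable length between the first and second fragments. In some embodiments, the linker peptide has 1 to 100 amino acid residues. In some embodiments, at least 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, or 90% of the residues of the linker peptide are selected from alanine, glycine, cysteine, and serine. In some embodiments, the linker peptide has the amino acid sequence shown in SEQ ID NO:13 or 14. In some embodiments, the base editor (fusion protein) further comprises a nuclear localization sequence.
Examples of the base editors (fusion proteins) include, but are not limited to, the amino acid sequences shown in SEQ ID NO:16-20.
The invention also provides nucleotide sequences encoding the base editors of the disclosed embodiments. In another embodiment, a composition is provided that comprises, in addition to such an encoding nucleotide sequence, a pharmaceutically acceptable carrier. In certain embodiments, the composition further comprises the sequence of a guide RNA.
The invention also provides methods of using the base editors (fusion proteins) and related compositions. In one disclosed embodiment, a base editor (fusion protein) described in this application, together with a corresponding guide RNA, is targeted to a genomic sequence that is at least partially complementary to the guide RNA, where it efficiently catalyzes deamination of cytosine (C) at the target site. In some embodiments, the cytosine (C) is in a GpC dinucleotide context; in some embodiments, the cytosine (C) is methylated; in some embodiments, the contacting of the base editor with its target takes place in vitro, ex vivo, or in vivo.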
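Brief description of the drawings
Figure 1: hA3A-BE achieves efficient base editing at the C of GpC sites. Figure 1A: schematic of the method of co-expressing sgRNA and hA3A-BE. Figure 1B: compared with co-expression of sgRNA and BE3, co-expression of sgRNA and hA3A-BE achieved efficient C-to-T base editing at GpC sites in the sgRNA target sequences of Examples 1 (sgFANCF-M-L6) and 2 (sgSITE4). Numbers 1-20 indicate base positions in the sgRNA target sequence; "non-transfected" indicates the untransfected sample.
Figure 2: hA3A-BE-Y130F and hA3A-BE-Y132D narrow the base-editing window. Figure 2A: schematic of the method of co-expressing sgRNA/hA3A-BE-Y130F or sgRNA/hA3A-BE-Y132D. Figure 2B: compared with co-expression of sgRNA/hA3A-BE, co-expression of sgRNA/hA3A-BE-Y130F or sgRNA/hA3A-BE-Y132D narrowed the base-editing window in Examples 3 (sgSITE3) and 4 (sgEMX1), giving more precise base editing. Numbers 1-20 indicate base positions in the sgRNA target sequence; "non-transfected" indicates the untransfected sample.
Figure 3: hA3A-BE-W104A and hA3A-BE-D131Y increase base-editing efficiency. Figure 3A: schematic of the method of co-expressing sgRNA/hA3A-BE-W104A or sgRNA/hA3A-BE-D131Y. Figure 3B: compared with co-expression of sgRNA/hA3A-BE, co-expression of sgRNA/hA3A-BE-W104A or sgRNA/hA3A-BE-D131Y increased base-editing efficiency in Examples 5 (sgFANCF) and 6 (sgSITE2), giving more efficient base editing. Numbers 1-20 indicate base positions in the sgRNA target sequence; "non-transfected" indicates the untransfected sample.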
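Detailed description
The invention is further illustrated below with reference to specific examples. It should be understood that these examples are intended only to illustrate the invention and not to limit its scope. It should further be understood that, after reading the teachings of the invention, those skilled in the art may make various changes or modifications to it, and such equivalents likewise fall within the scope defined by the claims appended to this application.
Note that in this disclosure the term "a" or "an" entity refers to one or more of that entity; for example, "an antibody" should be understood to mean one or more antibodies. The terms "a" (or "an"), "one or more", and "at least one" are therefore used interchangeably herein.
As used herein, the term "polypeptide" encompasses the singular "polypeptide" as well as the plural "polypeptides", and refers to a molecule composed of monomers (amino acids) linearly linked by amide bonds (also known as peptide bonds). The term "polypeptide" refers to any chain or chains of two or more amino acids and does not refer to a specific length of the product. Thus peptides, dipeptides, tripeptides, oligopeptides, "proteins", "amino acid chains", and any other term used to denote a chain of two or more amino acids are included in the definition of "polypeptide", and the term "polypeptide" may be used in place of any of these terms. The term "polypeptide" also refers to the products of post-expression modification of a polypeptide, including but not limited to glycosylation, acetylation, phosphorylation, amidation, derivatization by known protecting/blocking groups, proteolytic cleavage, or modification by non-naturally occurring amino acids. A polypeptide may be derived from a natural biological source or produced by recombinant technology, but it is not necessarily translated from a designated nucleic acid sequence. It may be generated in any manner, including by chemical synthesis.
As used herein with respect to cells, polypeptides, or nucleic acids (such as DNA or RNA), the term "isolated" refers to molecules separated from the other DNAs or RNAs, respectively, that are present in the natural source of the macromolecule. The term "isolated" also refers to a nucleic acid or polypeptide that is substantially free of cellular material, viral material, or culture medium when produced by recombinant techniques, or substantially free of chemical precursors or other chemicals when chemically synthesized. In addition, "isolated nucleic acid" is meant to include nucleic acid fragments that do not occur naturally as fragments and would not be found on their own in the natural state. The term "isolated" is also used herein to refer to cells or polypeptides that are separated from other cells, proteins, or tissues; isolated polypeptides are meant to encompass both purified and recombinant polypeptides.
As used herein with respect to polypeptides or polynucleotides, the term "recombinant" means a form of the polypeptide or polynucleotide that does not exist in the natural state; non-limiting examples include polynucleotides or polypeptides created by combining polynucleotides or polypeptides in ways that would not normally occur in nature.
"Homology", "identity", or "similarity" refers to sequence similarity between two peptides or between two nucleic acid molecules. Homology can be determined by comparing corresponding positions in the aligned sequences: when a position in the compared sequences is occupied by the same base or amino acid, the molecules are homologous at that position. The degree of homology between sequences is a function of the number of matching or homologous positions shared by the sequences. An "unrelated" or "non-homologous" sequence shares less than 40% identity, and preferably less than 25% identity, with one of the sequences disclosed herein.
A polynucleotide or polynucleotide region (or a polypeptide or polypeptide region) having a certain percentage (for example, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 98%, or 99%) of sequence homology to another polynucleotide or polynucleotide region (or polypeptide or polypeptide region) means that, when the two sequences are aligned, that percentage of bases (or amino acids) is the same in both. The alignment and the percent homology or sequence identity can be determined using software programs and methods known in the art, for example those described in Ausubel et al. eds. (2007) Current Protocols in Molecular Biology. Preferably, default parameters are used for alignment. One alignment program is BLAST, using default parameters. In particular, when the programs BLASTN and BLASTP are used, the following default parameters apply: Genetic code=standard; filter=none; strand=both; cutoff=60; expect=10; Matrix=BLOSUM62; Descriptions=50 sequences; sort by=HIGH SCORE; Databases=non-redundant, GenBank+EMBL+DDBJ+PDB+GenBank CDS translations+SwissProtein+SPupdate+PIR. Biologically equivalent polynucleotides are those that have the specified percent homology and encode polypeptides having the same or similar biological activity.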
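As a rough illustration of the percent-identity definition above, the following Python sketch counts identical positions between two sequences that have already been aligned to the same length. It is a simplification under stated assumptions: the function name `percent_identity` is hypothetical, gaps are written as `-` and counted as mismatches, and the denominator is the alignment length (other conventions exist). A real comparison would first run an alignment program such as BLAST, as the text describes. The two peptides are residues 125-136 of SEQ ID NO:1 and SEQ ID NO:2 (Y130F) from the sequence listing, written in one-letter code.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of aligned columns holding the same residue.

    Assumes both inputs are already aligned (equal length, '-' for gaps).
    Gap columns are counted as mismatches.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    if not aligned_a:
        raise ValueError("sequences must not be empty")
    matches = sum(
        1 for a, b in zip(aligned_a, aligned_b) if a == b and a != "-"
    )
    return 100.0 * matches / len(aligned_a)


wild_type = "FAARIYDYDPLY"  # SEQ ID NO:1, residues 125-136
y130f = "FAARIFDYDPLY"      # SEQ ID NO:2, residues 125-136 (Tyr130 -> Phe)

identity = percent_identity(wild_type, y130f)
print(f"{identity:.1f}% identical")  # 11 of 12 positions match
```

Over the full 199-residue sequences the same single substitution would give a much higher identity, which is why thresholds such as "at least 90%" are always stated relative to a named reference region.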
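As used herein, the term "equivalent nucleic acid or polynucleotide" refers to a polynucleotide having a certain degree of homology or sequence identity to a polynucleotide disclosed herein or to its complement. A homolog of a double-stranded nucleic acid is intended to include polynucleotides having a certain degree of homology to either the coding strand or its complementary non-coding strand. In one aspect, a homolog of a nucleic acid is capable of hybridizing to the nucleic acid or its complement. Similarly, an "equivalent polypeptide" refers to a polypeptide having at least about 70%, 75%, 80%, 85%, 90%, 95%, 98%, or 99% homology or sequence identity to the amino acid sequence of a reference polypeptide disclosed herein. In some cases, an equivalent polypeptide or polynucleotide has 1, 2, 3, 4, or 5 additions, deletions, substitutions, or combinations thereof compared with the reference polypeptide or polynucleotide disclosed herein; in some cases, the equivalent sequence retains the same or a similar activity (e.g., epitope binding) or structure (e.g., salt bridges) as the reference sequence.
Hybridization reactions can be performed under conditions of different "stringency". In general, a low-stringency hybridization reaction is carried out at about 40 °C in about 10× SSC or a solution of equivalent ionic strength/temperature; a moderate-stringency hybridization is carried out at about 50 °C in about 6× SSC; and a high-stringency hybridization is carried out at about 60 °C in about 1× SSC. Hybridization reactions can also be performed under "physiological conditions" well known to those skilled in the art. "Physiological conditions" include, but are not limited to, the temperature, ionic strength, pH, and Mg2+ concentration normally found in a cell or organism.
When the polynucleotide is DNA, its sequence is written with the letters for the four nucleotide bases: adenine (A), cytosine (C), guanine (G), and thymine (T). When the polynucleotide is RNA, its sequence is written with the letters for adenine (A), cytosine (C), guanine (G), and uracil (U). The term "polynucleotide sequence" is thus the alphabetical representation of a polynucleotide molecule. This alphabetical representation can be entered into a database in a computer having a central processing unit and used for bioinformatics applications such as functional genomics and homology searching. The term "polymorphism" refers to the coexistence of more than one form of a gene or portion thereof; a "polymorphic region of a gene" is a position in the gene at which different nucleotide forms (i.e., different nucleotide sequences) occur. A polymorphic region can be a single nucleotide that differs between alleles.
In this disclosure, the terms "polynucleotide" and "oligonucleotide" are used interchangeably and refer to a polymeric form of nucleotides of any length, whether deoxyribonucleotides, ribonucleotides, or analogs thereof. A polynucleotide can have any three-dimensional structure and can perform any function, known or unknown. Examples of polynucleotides include, but are not limited to: genes or gene fragments (including probes, primers, ESTs, or SAGE tags), exons, introns, messenger RNA (mRNA), transfer RNA, ribosomal RNA, ribozymes, cDNA, dsRNA, siRNA, miRNA, recombinant polynucleotides, branched polynucleotides, plasmids, vectors, isolated DNA of any sequence, isolated RNA of any sequence, nucleic acid probes, and primers. A polynucleotide may also comprise modified nucleotides, such as methylated nucleotides and nucleotide analogs. If present, modifications may be imparted before or after assembly of the polynucleotide. The nucleotide sequence may be interrupted by non-nucleotide components. A polynucleotide may be further modified after polymerization, for example by conjugation with a labeling component. The term refers to both double-stranded and single-stranded molecules. Unless otherwise specified or required, any embodiment of a polynucleotide disclosed herein includes both the double-stranded form and each of the two complementary single-stranded forms known or predicted to make up the double-stranded form.
When applied to a polynucleotide, the term "encode" means that the polynucleotide "encodes" a polypeptide: in its native state, or when manipulated by methods well known to those skilled in the art, it can be transcribed and/or translated to produce the polypeptide of interest and/or a fragment thereof, or to produce an mRNA capable of encoding that polypeptide and/or fragment. The antisense strand is the complement of such a polynucleotide, and the coding sequence can be deduced from it.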
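Fusion proteins
Current rA1-based BEs (base editors) cannot efficiently edit a C in a GpC context, which limits their use. The present invention discloses a series of novel base editors (fusion proteins) and corresponding new methods of gene base editing. The invention shows that fusing human cytosine deaminase 3A (human APOBEC3A, hA3A) with a CRISPR/Cas system efficiently deaminates cytosine (C) to uracil (U) at the target site of the CRISPR/Cas system, resulting in a C-to-T mutation at a specific genomic site. When a uracil glycosylase inhibitor (UGI) is also included in the fusion protein, the editing efficiency of these base editors increases further.
Surprisingly, even in a GpC dinucleotide context, these editors still achieve precise and efficient targeted base editing. Notably, these editors also edit methylated cytosine (methylated C) efficiently; because cytosine methylation is widespread in living cells, this high editing efficiency on methylated cytosine is of significant clinical relevance.
Accordingly, in embodiments disclosed herein, the invention provides a base editor comprising two fragments: the first fragment comprises the human apolipoprotein B mRNA editing enzyme catalytic subunit 3A (human APOBEC3A, hA3A), a cytosine deaminase, and the second fragment comprises a CRISPR/Cas system-associated protein. In some other embodiments, the fusion protein further comprises a uracil glycosylase inhibitor (UGI).
APOBEC3A, also known as apolipoprotein B mRNA editing enzyme catalytic subunit 3A or A3A, is a member of the APOBEC3 family found in humans, non-human primates, and some other mammals. The APOBEC3A protein lacks the zinc-binding activity of other family members. Human APOBEC3A has two isoforms, isoform a (NP_663745.1; SEQ ID NO:1) and isoform b (NP_001257335.1; SEQ ID NO:6), both of which have deaminase activity; compared with isoform b, isoform a contains additional residues near the N-terminus. The term "APOBEC3A" also includes variants and mutants that have a certain level (e.g., 70%, 75%, 80%, 85%, 90%, 95%, 98%, 99%) of sequence identity to wild-type mammalian APOBEC3A and that retain cytidine deaminase activity. As shown in Example 1, certain mutants (e.g., Y130F (SEQ ID NO:2), Y132D (SEQ ID NO:3), W104A (SEQ ID NO:4), and D131Y (SEQ ID NO:5)) even outperform wild-type human APOBEC3A. Example sequences of these variants and mutants are provided in Table 1 below.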
Table 1. Sequences of APOBEC3A and its variants and mutants
[Table 1 appears as images in the original publication and is not reproduced here.]
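The point mutants named in the text (Y130F, Y132D, W104A, D131Y) can be derived mechanically from the wild-type sequence. The Python sketch below is illustrative only: the helper `apply_point_mutation` and the variable names are hypothetical, and the wild-type residues are copied in three-letter code from the `<400> 1` entry of the sequence listing. The helper checks that the wild-type residue at the stated position matches the mutation name before substituting it.

```python
import re

THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}
ONE_TO_THREE = {one: three for three, one in THREE_TO_ONE.items()}

# SEQ ID NO:1 (human APOBEC3A isoform a, 199 residues), from the listing.
SEQ_ID_NO_1 = """
Met Glu Ala Ser Pro Ala Ser Gly Pro Arg His Leu Met Asp Pro His
Ile Phe Thr Ser Asn Phe Asn Asn Gly Ile Gly Arg His Lys Thr Tyr
Leu Cys Tyr Glu Val Glu Arg Leu Asp Asn Gly Thr Ser Val Lys Met
Asp Gln His Arg Gly Phe Leu His Asn Gln Ala Lys Asn Leu Leu Cys
Gly Phe Tyr Gly Arg His Ala Glu Leu Arg Phe Leu Asp Leu Val Pro
Ser Leu Gln Leu Asp Pro Ala Gln Ile Tyr Arg Val Thr Trp Phe Ile
Ser Trp Ser Pro Cys Phe Ser Trp Gly Cys Ala Gly Glu Val Arg Ala
Phe Leu Gln Glu Asn Thr His Val Arg Leu Arg Ile Phe Ala Ala Arg
Ile Tyr Asp Tyr Asp Pro Leu Tyr Lys Glu Ala Leu Gln Met Leu Arg
Asp Ala Gly Ala Gln Val Ser Ile Met Thr Tyr Asp Glu Phe Lys His
Cys Trp Asp Thr Phe Val Asp His Gln Gly Cys Pro Phe Gln Pro Trp
Asp Gly Leu Asp Glu His Ser Gln Ala Leu Ser Gly Arg Leu Arg Ala
Ile Leu Gln Asn Gln Gly Asn
"""


def apply_point_mutation(residues, mutation):
    """Return a copy of `residues` with a mutation such as 'Y130F' applied.

    `residues` is a list of three-letter codes; positions are 1-based.
    """
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", mutation)
    if match is None:
        raise ValueError(f"cannot parse mutation {mutation!r}")
    old, position, new = match.group(1), int(match.group(2)), match.group(3)
    if new not in ONE_TO_THREE:
        raise ValueError(f"unknown amino acid {new!r}")
    if not 1 <= position <= len(residues):
        raise ValueError(f"position {position} is outside the sequence")
    found = THREE_TO_ONE[residues[position - 1]]
    if found != old:
        raise ValueError(f"expected {old} at {position}, found {found}")
    mutated = list(residues)
    mutated[position - 1] = ONE_TO_THREE[new]
    return mutated


wild_type = SEQ_ID_NO_1.split()
variants = {
    name: apply_point_mutation(wild_type, name)
    for name in ("Y130F", "Y132D", "W104A", "D131Y")
}
for name, residues in variants.items():
    print(name, " ".join(residues[128:133]))  # residues 129-133
```

Comparing the printed windows with residues 129-133 of SEQ ID NO:2, 3, and 5 in the listing is a quick consistency check; the W104A change lies outside this window, at residue 104.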
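The APOBEC3A protein may also be further modified at other amino acid positions, for example by addition, deletion, and/or substitution. Such modifications may be substitutions at one, two, three, or more amino acid positions. In one embodiment, the modification is a substitution at a single position. In some embodiments, such substitutions are conservative amino acid substitutions.
A "conservative amino acid substitution" is one in which an amino acid residue is replaced by another residue having a similar side chain. Families of amino acid residues with similar side chains are well defined in the art and include basic side chains (e.g., lysine, arginine, histidine), acidic side chains (e.g., aspartic acid, glutamic acid), uncharged polar side chains (e.g., glycine, asparagine, glutamine, serine, threonine, tyrosine, cysteine), nonpolar side chains (e.g., alanine, valine, leucine, isoleucine, proline, phenylalanine, methionine, tryptophan), β-branched side chains (e.g., threonine, valine, isoleucine), and aromatic side chains (e.g., tyrosine, phenylalanine, tryptophan, histidine). A nonessential amino acid residue in a fusion protein disclosed herein may therefore be replaced with another residue from the same side-chain family. In another embodiment, a string of amino acids can be replaced, by conservative substitution, with a structurally similar string that differs in the order and/or composition of its side-chain family members.
Conservative amino acid substitutions include, but are not limited to, those listed in the tables below. In Table A, each number indicates the similarity between two amino acids; a substitution is considered conservative when the number is greater than or equal to 0.
Table A. Amino acid similarity matrix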
    C   G   P   S   A   T   D   E   N   Q   H   K   R   V   M   I   L   F   Y   W
W  -8  -7  -6  -2  -6  -5  -7  -7  -4  -5  -3  -3   2  -6  -4  -5  -2   0   0  17
Y   0  -5  -5  -3  -3  -3  -4  -4  -2  -4   0  -4  -5  -2  -2  -1  -1   7  10
F  -4  -5  -5  -3  -4  -3  -6  -5  -4  -5  -2  -5  -4  -1   0   1   2   9
L  -6  -4  -3  -3  -2  -2  -4  -3  -3  -2  -2  -3  -3   2   4   2   6
I  -2  -3  -2  -1  -1   0  -2  -2  -2  -2  -2  -2  -2   4   2   5
M  -5  -3  -2  -2  -1  -1  -3  -2   0  -1  -2   0   0   2   6
V  -2  -1  -1  -1   0   0  -2  -2  -2  -2  -2  -2  -2   4
R  -4  -3   0   0  -2  -1  -1  -1   0   1   2   3   6
K  -5  -2  -1   0  -1   0   0   0   1   1   0   5
H  -3  -2   0  -1  -1  -1   1   1   2   3   6
Q  -5  -1   0  -1   0  -1   2   2   1   4
N  -4   0  -1   1   0   0   2   1   2
E  -5   0  -1   0   0   0   3   4
D  -5   1  -1   0   0   0   4
T  -2   0   0   1   1   3
A  -2   1   1   1   2
S   0   1   1   1
P  -3  -1   6
G  -3   5
C  12
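To make the "score of 0 or higher" rule concrete, the following Python sketch loads the lower-triangular matrix of Table A and looks up a pair of amino acids in either order. The names `build_scores`, `similarity`, and `is_conservative` are hypothetical; the numbers are copied from Table A, and the threshold of 0 is the one stated in the text.

```python
COLUMNS = "C G P S A T D E N Q H K R V M I L F Y W".split()

# Lower triangle of Table A: row label, then scores for columns C..row label.
LOWER_TRIANGLE = """
W -8 -7 -6 -2 -6 -5 -7 -7 -4 -5 -3 -3 2 -6 -4 -5 -2 0 0 17
Y 0 -5 -5 -3 -3 -3 -4 -4 -2 -4 0 -4 -5 -2 -2 -1 -1 7 10
F -4 -5 -5 -3 -4 -3 -6 -5 -4 -5 -2 -5 -4 -1 0 1 2 9
L -6 -4 -3 -3 -2 -2 -4 -3 -3 -2 -2 -3 -3 2 4 2 6
I -2 -3 -2 -1 -1 0 -2 -2 -2 -2 -2 -2 -2 4 2 5
M -5 -3 -2 -2 -1 -1 -3 -2 0 -1 -2 0 0 2 6
V -2 -1 -1 -1 0 0 -2 -2 -2 -2 -2 -2 -2 4
R -4 -3 0 0 -2 -1 -1 -1 0 1 2 3 6
K -5 -2 -1 0 -1 0 0 0 1 1 0 5
H -3 -2 0 -1 -1 -1 1 1 2 3 6
Q -5 -1 0 -1 0 -1 2 2 1 4
N -4 0 -1 1 0 0 2 1 2
E -5 0 -1 0 0 0 3 4
D -5 1 -1 0 0 0 4
T -2 0 0 1 1 3
A -2 1 1 1 2
S 0 1 1 1
P -3 -1 6
G -3 5
C 12
"""


def build_scores(text, columns):
    """Map each unordered residue pair to its similarity score."""
    scores = {}
    for line in text.strip().splitlines():
        row, *values = line.split()
        # zip stops at the shorter input, so each row fills columns C..row.
        for column, value in zip(columns, values):
            scores[frozenset((row, column))] = int(value)
    return scores


SCORES = build_scores(LOWER_TRIANGLE, COLUMNS)


def similarity(a, b):
    """Similarity score for residues a and b (one-letter codes, any order)."""
    return SCORES[frozenset((a.upper(), b.upper()))]


def is_conservative(a, b):
    """Apply the rule from the text: conservative when the score is >= 0."""
    return similarity(a, b) >= 0


for old, new in (("Y", "F"), ("Y", "D"), ("W", "A"), ("D", "Y")):
    print(f"{old}->{new}: score {similarity(old, new)}, "
          f"conservative={is_conservative(old, new)}")
```

Under this matrix, the Y-to-F exchange scores 7 and counts as conservative, whereas the Y/D and W/A exchanges score -4 and -6 and do not.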
Table B. Conservative amino acid substitutions
[Table B appears as images in the original publication and is not reproduced here.]
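The term "CRISPR/Cas9", or simply "Cas", refers to a family of RNA-guided DNA endonucleases associated with the CRISPR (clustered regularly interspaced short palindromic repeats) adaptive immune system found in Streptococcus pyogenes and other bacteria. Cas proteins include, but are not limited to, Streptococcus pyogenes Cas9 (SpCas9), Staphylococcus aureus Cas9 (SaCas9), Acidaminococcus sp. Cas12a (Cpf1), Lachnospiraceae bacterium Cas12a (Cpf1), and Francisella novicida Cas12a (Cpf1). Further examples of Cas proteins are provided in Komor et al., "CRISPR-Based Technologies for the Manipulation of Eukaryotic Genomes", Cell 168(1-2):20-36, published 12 January 2017.
In some embodiments, the Cas protein portion of the base editor is selected from the group consisting of SpCas9, FnCas9, St1Cas9, St3Cas9, NmCas9, SaCas9, AsCpf1, LbCpf1, FnCpf1, VQR SpCas9, EQR SpCas9, VRER SpCas9, RHA FnCas9, and KKH SaCas9. In some embodiments, the Cas protein portion is a mutant of a protein from this group that retains the DNA-binding activity of the Cas protein but either has no DNA cleavage activity or cannot cut both strands of double-stranded DNA at the same time.
For example, previous studies have found that the amino acids Asp10 and His840 of SpCas9 are critical for its DNA cleavage activity. When both are mutated to alanine, the mutant protein completely loses DNA cleavage activity. When only Asp10 is mutated to alanine, the mutant loses part of its DNA cleavage activity: it can no longer cut both strands to introduce a DNA double-strand break and can only cut one strand, introducing a nick. Such a Cas mutant is also called a Cas nickase. Sequences of Cas9 nickase include, but are not limited to, that listed in SEQ ID NO:11.
In some embodiments, the base editor (fusion protein) further comprises a uracil glycosylase inhibitor (UGI), whose sequences include, but are not limited to, that of Bacillus phage AR9 (YP_009283008.1). In some embodiments, the UGI portion comprises the amino acid sequence shown in SEQ ID NO:12, or a sequence that has at least 90% sequence identity to the amino acid sequence of SEQ ID NO:12 and retains uracil glycosylase inhibitory activity.
In some embodiments, the base editor (fusion protein) further comprises a linker peptide of variable length between the first and second fragments. In some embodiments, the linker peptide has 1 to 100 amino acid residues. In some embodiments, at least 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, or 90% of the residues of the linker peptide are selected from alanine, glycine, cysteine, and serine. In some embodiments, the linker peptide has the amino acid sequence shown in SEQ ID NO:13 or 14.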
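The linker criteria above (1 to 100 residues, with a minimum fraction drawn from alanine, glycine, cysteine, and serine) lend themselves to a simple check. The Python sketch below is a hypothetical illustration: the function names are invented here, and the two example linkers are generic peptides chosen for demonstration, not SEQ ID NO:13 or 14.

```python
SMALL_RESIDUES = frozenset("AGCS")  # alanine, glycine, cysteine, serine


def small_residue_fraction(linker: str) -> float:
    """Fraction of residues in `linker` (one-letter code) that are A, G, C or S."""
    if not linker:
        raise ValueError("linker must not be empty")
    return sum(residue in SMALL_RESIDUES for residue in linker.upper()) / len(linker)


def is_candidate_linker(linker: str, min_fraction: float = 0.5) -> bool:
    """True if the linker has 1-100 residues and enough A/G/C/S content.

    `min_fraction` stands in for one of the thresholds listed in the text
    (10% to 90%); the default of 0.5 is an arbitrary choice for this sketch.
    """
    if not 1 <= len(linker) <= 100:
        return False
    return small_residue_fraction(linker) >= min_fraction


flexible = "GGGGSGGGGS"  # hypothetical example, all residues are G or S
rigid = "EAAAKEAAAK"     # hypothetical example, 6 of 10 residues are A

print(small_residue_fraction(flexible), is_candidate_linker(flexible, 0.9))
print(small_residue_fraction(rigid), is_candidate_linker(rigid, 0.9))
```

Raising or lowering `min_fraction` corresponds to choosing a different one of the percentage thresholds named in the embodiments.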
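In some embodiments, APOBEC3A, the Cas protein, and UGI may be arranged in any order. In a preferred embodiment, however, APOBEC3A is placed on the N-terminal side of the Cas protein, and, when the fusion protein includes UGI, the Cas protein is preferably placed on the N-terminal side of the UGI.
In some embodiments, the base editor (fusion protein) further comprises a nuclear localization sequence.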
Table 2. Other sequences
[Table 2 appears as images in the original publication and is not reproduced here.]
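The invention also provides isolated polynucleotides or nucleic acid molecules (e.g., SEQ ID NO:21) encoding the base editors (fusion proteins) disclosed herein, or mutants or derivatives thereof. Methods of preparing fusion proteins are well known in the art and are described herein.
Compositions and methods
The invention also provides compositions of the base editors and methods of using them. Such a composition comprises an effective amount of the fusion protein and an acceptable carrier. In some embodiments, the composition further comprises a guide RNA complementary to the target DNA. Such a composition can be used for base editing of a target sequence in a sample.
The fusion proteins and their compositions can be used for base editing. In one embodiment, a method of editing a target polynucleotide is provided: a base editor (fusion protein) disclosed herein is brought into contact with the target polynucleotide by means of a guide RNA having at least partial sequence complementarity to the target polynucleotide, wherein the editing comprises deamination of a cytosine (C) in the target polynucleotide.
Current data indicate that the fusion protein can edit cytosines at any position and in any sequence context, for example CpC, ApC, GpC, TpC, CpA, CpG, CpC, CpT, and so on. Unexpectedly, we found that the fusion protein can edit cytosines at GpC dinucleotide sites and cytosines at methylated sites.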
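The dinucleotide contexts mentioned above can be enumerated directly from a target sequence. The Python sketch below is a toy model under stated assumptions: the 20-nt `target` is a made-up sequence, the function names are hypothetical, and `deaminate` simply rewrites chosen cytosines as T to represent the final C-to-T outcome. It does not model the editing window, strand choice, or efficiency of any real editor.

```python
def cytosine_contexts(target):
    """List (1-based position, 'NpC' context) for every C with a 5' neighbour."""
    target = target.upper()
    contexts = []
    for index, base in enumerate(target):
        if base == "C" and index > 0:
            contexts.append((index + 1, f"{target[index - 1]}pC"))
    return contexts


def deaminate(target, positions):
    """Return the target with the cytosines at the given 1-based positions read as T."""
    bases = list(target.upper())
    for position in positions:
        if bases[position - 1] != "C":
            raise ValueError(f"position {position} is not a cytosine")
        bases[position - 1] = "T"
    return "".join(bases)


target = "ATGCACCGTCAGGCTTACGA"  # hypothetical 20-nt target, positions 1-20

contexts = cytosine_contexts(target)
gpc_positions = [position for position, context in contexts if context == "GpC"]
edited = deaminate(target, gpc_positions)

print(contexts)
print("GpC cytosines at positions:", gpc_positions)
print("after C-to-T at GpC sites: ", edited)
```

Grouping the cytosines of a real sgRNA target by context in this way makes it easy to see which positions fall in GpC sites, the context that the text identifies as poorly edited by rA1-based editors.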
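Contact between the fusion protein (together with the guide RNA) and its target polynucleotide may take place in vitro, in particular in cells. The fusion proteins of the invention can play an important role in clinical therapy, whether ex vivo or in vivo.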
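Examples
Example 1: Base editors
An expression plasmid, pCMV-hA3A-BE, was constructed. Human apolipoprotein B mRNA editing enzyme catalytic subunit 3A (APOBEC3A, hA3A; SEQ ID NO:1) was fused with a Cas9 nickase and a uracil DNA glycosylase inhibitor from a Bacillus phage (SEQ ID NO:12) in a single expression vector. In the Cas9 nickase, the aspartate at position 10 is mutated to alanine, so the enzyme loses the ability to cut both strands and instead produces a nick in one strand.
The fusion expression vector hA3A-nCas9-UGI (hA3A-BE, SEQ ID NO:21) was co-transfected into eukaryotic cells with an expression vector for a single guide RNA (sgRNA) (Figure 1A), resulting in C-to-T base editing at the genomic site targeted by the guide RNA. The target region of the genomic DNA was amplified by PCR, and the C-to-T editing efficiency at the target site was measured by Sanger DNA sequencing. Compared with co-expression of sgRNA and BE3, co-expression of sgRNA and hA3A-BE achieved efficient C-to-T base editing at GpC sites in the sgRNA target sequences of Examples 1 (sgFANCF-M-L6) and 2 (sgSITE4) (Figure 1B, dashed boxes).
Next, the Y130F (SEQ ID NO:2) and Y132D (SEQ ID NO:3) mutations were each introduced into the original hA3A sequence to construct the hA3A-BE-Y130F and hA3A-BE-Y132D base editors (Figure 2A). Compared with co-expression of sgRNA/hA3A-BE, co-expression of sgRNA/hA3A-BE-Y130F or sgRNA/hA3A-BE-Y132D narrowed the base-editing window in Examples 3 (sgSITE3) and 4 (sgEMX1), giving more precise base editing (Figure 2B).
In addition, the W104A and D131Y mutations were each introduced into the original hA3A sequence to construct the hA3A-BE-W104A and hA3A-BE-D131Y base editors (Figure 3A). Compared with co-expression of sgRNA/hA3A-BE, co-expression of sgRNA/hA3A-BE-W104A or sgRNA/hA3A-BE-D131Y increased base-editing efficiency in Examples 5 (sgFANCF) and 6 (sgSITE2), giving more efficient base editing (Figure 3B).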
***
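The present disclosure is not limited in scope by the specific embodiments described, which are intended as illustrations of individual aspects of the invention; any compositions and methods that are functionally equivalent fall within the scope of this disclosure. Those skilled in the art may make various modifications and variations to the compositions and methods of the present disclosure without departing from its spirit or scope. The present disclosure therefore covers such modifications and variations, provided they come within the scope of the appended claims and their equivalents.
All publications and patent applications mentioned in this specification are incorporated herein by reference to the same extent as if each individual publication or patent application were specifically and individually indicated to be incorporated by reference.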
<110> ShanghaiTech University
<120> A gene base editor
<140> 2018101853847
<141> 2018-03-07
<160> 21
<170> SIPOSequenceListing 1.0
<210> 1
<211> 199
<212> PRT
<213> Artificial Sequence
<400> 1
Met Glu Ala Ser Pro Ala Ser Gly Pro Arg His Leu Met Asp Pro His
1 5 10 15
Ile Phe Thr Ser Asn Phe Asn Asn Gly Ile Gly Arg His Lys Thr Tyr
20 25 30
Leu Cys Tyr Glu Val Glu Arg Leu Asp Asn Gly Thr Ser Val Lys Met
35 40 45
Asp Gln His Arg Gly Phe Leu His Asn Gln Ala Lys Asn Leu Leu Cys
50 55 60
Gly Phe Tyr Gly Arg His Ala Glu Leu Arg Phe Leu Asp Leu Val Pro
65 70 75 80
Ser Leu Gln Leu Asp Pro Ala Gln Ile Tyr Arg Val Thr Trp Phe Ile
85 90 95
Ser Trp Ser Pro Cys Phe Ser Trp Gly Cys Ala Gly Glu Val Arg Ala
100 105 110
Phe Leu Gln Glu Asn Thr His Val Arg Leu Arg Ile Phe Ala Ala Arg
115 120 125
Ile Tyr Asp Tyr Asp Pro Leu Tyr Lys Glu Ala Leu Gln Met Leu Arg
130 135 140
Asp Ala Gly Ala Gln Val Ser Ile Met Thr Tyr Asp Glu Phe Lys His
145 150 155 160
Cys Trp Asp Thr Phe Val Asp His Gln Gly Cys Pro Phe Gln Pro Trp
165 170 175
Asp Gly Leu Asp Glu His Ser Gln Ala Leu Ser Gly Arg Leu Arg Ala
180 185 190
Ile Leu Gln Asn Gln Gly Asn
195
<210> 2
<211> 199
<212> PRT
<213> Artificial Sequence
<400> 2
Met Glu Ala Ser Pro Ala Ser Gly Pro Arg His Leu Met Asp Pro His
1 5 10 15
Ile Phe Thr Ser Asn Phe Asn Asn Gly Ile Gly Arg His Lys Thr Tyr
20 25 30
Leu Cys Tyr Glu Val Glu Arg Leu Asp Asn Gly Thr Ser Val Lys Met
35 40 45
Asp Gln His Arg Gly Phe Leu His Asn Gln Ala Lys Asn Leu Leu Cys
50 55 60
Gly Phe Tyr Gly Arg His Ala Glu Leu Arg Phe Leu Asp Leu Val Pro
65 70 75 80
Ser Leu Gln Leu Asp Pro Ala Gln Ile Tyr Arg Val Thr Trp Phe Ile
85 90 95
Ser Trp Ser Pro Cys Phe Ser Trp Gly Cys Ala Gly Glu Val Arg Ala
100 105 110
Phe Leu Gln Glu Asn Thr His Val Arg Leu Arg Ile Phe Ala Ala Arg
115 120 125
Ile Phe Asp Tyr Asp Pro Leu Tyr Lys Glu Ala Leu Gln Met Leu Arg
130 135 140
Asp Ala Gly Ala Gln Val Ser Ile Met Thr Tyr Asp Glu Phe Lys His
145 150 155 160
Cys Trp Asp Thr Phe Val Asp His Gln Gly Cys Pro Phe Gln Pro Trp
165 170 175
Asp Gly Leu Asp Glu His Ser Gln Ala Leu Ser Gly Arg Leu Arg Ala
180 185 190
Ile Leu Gln Asn Gln Gly Asn
195
<210> 3
<211> 199
<212> PRT
<213> Artificial Sequence
<400> 3
Met Glu Ala Ser Pro Ala Ser Gly Pro Arg His Leu Met Asp Pro His
1 5 10 15
Ile Phe Thr Ser Asn Phe Asn Asn Gly Ile Gly Arg His Lys Thr Tyr
20 25 30
Leu Cys Tyr Glu Val Glu Arg Leu Asp Asn Gly Thr Ser Val Lys Met
35 40 45
Asp Gln His Arg Gly Phe Leu His Asn Gln Ala Lys Asn Leu Leu Cys
50 55 60
Gly Phe Tyr Gly Arg His Ala Glu Leu Arg Phe Leu Asp Leu Val Pro
65 70 75 80
Ser Leu Gln Leu Asp Pro Ala Gln Ile Tyr Arg Val Thr Trp Phe Ile
85 90 95
Ser Trp Ser Pro Cys Phe Ser Trp Gly Cys Ala Gly Glu Val Arg Ala
100 105 110
Phe Leu Gln Glu Asn Thr His Val Arg Leu Arg Ile Phe Ala Ala Arg
115 120 125
Ile Tyr Asp Asp Asp Pro Leu Tyr Lys Glu Ala Leu Gln Met Leu Arg
130 135 140
Asp Ala Gly Ala Gln Val Ser Ile Met Thr Tyr Asp Glu Phe Lys His
145 150 155 160
Cys Trp Asp Thr Phe Val Asp His Gln Gly Cys Pro Phe Gln Pro Trp
165 170 175
Asp Gly Leu Asp Glu His Ser Gln Ala Leu Ser Gly Arg Leu Arg Ala
180 185 190
Ile Leu Gln Asn Gln Gly Asn
195
<210> 4
<211> 199
<212> PRT
<213> Artificial Sequence
<400> 4
Met Glu Ala Ser Pro Ala Ser Gly Pro Arg His Leu Met Asp Pro His
1 5 10 15
Ile Phe Thr Ser Asn Phe Asn Asn Gly Ile Gly Arg His Lys Thr Tyr
20 25 30
Leu Cys Tyr Glu Val Glu Arg Leu Asp Asn Gly Thr Ser Val Lys Met
35 40 45
Asp Gln His Arg Gly Phe Leu His Asn Gln Ala Lys Asn Leu Leu Cys
50 55 60
Gly Phe Tyr Gly Arg His Ala Glu Leu Arg Phe Leu Asp Leu Val Pro
65 70 75 80
Ser Leu Gln Leu Asp Pro Ala Gln Ile Tyr Arg Val Thr Trp Phe Ile
85 90 95
Ser Trp Ser Pro Cys Phe Ser Ala Gly Cys Ala Gly Glu Val Arg Ala
100 105 110
Phe Leu Gln Glu Asn Thr His Val Arg Leu Arg Ile Phe Ala Ala Arg
115 120 125
Ile Tyr Asp Tyr Asp Pro Leu Tyr Lys Glu Ala Leu Gln Met Leu Arg
130 135 140
Asp Ala Gly Ala Gln Val Ser Ile Met Thr Tyr Asp Glu Phe Lys His
145 150 155 160
Cys Trp Asp Thr Phe Val Asp His Gln Gly Cys Pro Phe Gln Pro Trp
165 170 175
Asp Gly Leu Asp Glu His Ser Gln Ala Leu Ser Gly Arg Leu Arg Ala
180 185 190
Ile Leu Gln Asn Gln Gly Asn
195
<210> 5
<211> 199
<212> PRT
<213> Artificial Sequence
<400> 5
Met Glu Ala Ser Pro Ala Ser Gly Pro Arg His Leu Met Asp Pro His
1 5 10 15
Ile Phe Thr Ser Asn Phe Asn Asn Gly Ile Gly Arg His Lys Thr Tyr
20 25 30
Leu Cys Tyr Glu Val Glu Arg Leu Asp Asn Gly Thr Ser Val Lys Met
35 40 45
Asp Gln His Arg Gly Phe Leu His Asn Gln Ala Lys Asn Leu Leu Cys
50 55 60
Gly Phe Tyr Gly Arg His Ala Glu Leu Arg Phe Leu Asp Leu Val Pro
65 70 75 80
Ser Leu Gln Leu Asp Pro Ala Gln Ile Tyr Arg Val Thr Trp Phe Ile
85 90 95
Ser Trp Ser Pro Cys Phe Ser Trp Gly Cys Ala Gly Glu Val Arg Ala
100 105 110
Phe Leu Gln Glu Asn Thr His Val Arg Leu Arg Ile Phe Ala Ala Arg
115 120 125
Ile Tyr Tyr Tyr Asp Pro Leu Tyr Lys Glu Ala Leu Gln Met Leu Arg
130 135 140
Asp Ala Gly Ala Gln Val Ser Ile Met Thr Tyr Asp Glu Phe Lys His
145 150 155 160
Cys Trp Asp Thr Phe Val Asp His Gln Gly Cys Pro Phe Gln Pro Trp
165 170 175
Asp Gly Leu Asp Glu His Ser Gln Ala Leu Ser Gly Arg Leu Arg Ala
180 185 190
Ile Leu Gln Asn Gln Gly Asn
195
<210> 6
<211> 181
<212> PRT
<213> Artificial Sequence
<400> 6
Met Glu Ala Ser Pro Ala Ser Gly Pro Arg His Lys Thr Tyr Leu Cys
1 5 10 15
Tyr Glu Val Glu Arg Leu Asp Asn Gly Thr Ser Val Lys Met Asp Gln
20 25 30
His Arg Gly Phe Leu His Asn Gln Ala Lys Asn Leu Leu Cys Gly Phe
35 40 45
Tyr Gly Arg His Ala Glu Leu Arg Phe Leu Asp Leu Val Pro Ser Leu
50 55 60
Gln Leu Asp Pro Ala Gln Ile Tyr Arg Val Thr Trp Phe Ile Ser Trp
65 70 75 80
Ser Pro Cys Phe Ser Trp Gly Cys Ala Gly Glu Val Arg Ala Phe Leu
85 90 95
Gln Glu Asn Thr His Val Arg Leu Arg Ile Phe Ala Ala Arg Ile Tyr
100 105 110
Asp Tyr Asp Pro Leu Tyr Lys Glu Ala Leu Gln Met Leu Arg Asp Ala
115 120 125
Gly Ala Gln Val Ser Ile Met Thr Tyr Asp Glu Phe Lys His Cys Trp
130 135 140
Asp Thr Phe Val Asp His Gln Gly Cys Pro Phe Gln Pro Trp Asp Gly
145 150 155 160
Leu Asp Glu His Ser Gln Ala Leu Ser Gly Arg Leu Arg Ala Ile Leu
165 170 175
Gln Asn Gln Gly Asn
180
<210> 7
<211> 181
<212> PRT
<213> Artificial Sequence
<400> 7
Met Glu Ala Ser Pro Ala Ser Gly Pro Arg His Lys Thr Tyr Leu Cys
1 5 10 15
Tyr Glu Val Glu Arg Leu Asp Asn Gly Thr Ser Val Lys Met Asp Gln
20 25 30
His Arg Gly Phe Leu His Asn Gln Ala Lys Asn Leu Leu Cys Gly Phe
35 40 45
Tyr Gly Arg His Ala Glu Leu Arg Phe Leu Asp Leu Val Pro Ser Leu
50 55 60
Gln Leu Asp Pro Ala Gln Ile Tyr Arg Val Thr Trp Phe Ile Ser Trp
65 70 75 80
Ser Pro Cys Phe Ser Trp Gly Cys Ala Gly Glu Val Arg Ala Phe Leu
85 90 95
Gln Glu Asn Thr His Val Arg Leu Arg Ile Phe Ala Ala Arg Ile Phe
100 105 110
Asp Tyr Asp Pro Leu Tyr Lys Glu Ala Leu Gln Met Leu Arg Asp Ala
115 120 125
Gly Ala Gln Val Ser Ile Met Thr Tyr Asp Glu Phe Lys His Cys Trp
130 135 140
Asp Thr Phe Val Asp His Gln Gly Cys Pro Phe Gln Pro Trp Asp Gly
145 150 155 160
Leu Asp Glu His Ser Gln Ala Leu Ser Gly Arg Leu Arg Ala Ile Leu
165 170 175
Gln Asn Gln Gly Asn
180
<210> 8
<211> 181
<212> PRT
<213> Artificial Sequence
<400> 8
Met Glu Ala Ser Pro Ala Ser Gly Pro Arg His Lys Thr Tyr Leu Cys
1 5 10 15
Tyr Glu Val Glu Arg Leu Asp Asn Gly Thr Ser Val Lys Met Asp Gln
20 25 30
His Arg Gly Phe Leu His Asn Gln Ala Lys Asn Leu Leu Cys Gly Phe
35 40 45
Tyr Gly Arg His Ala Glu Leu Arg Phe Leu Asp Leu Val Pro Ser Leu
50 55 60
Gln Leu Asp Pro Ala Gln Ile Tyr Arg Val Thr Trp Phe Ile Ser Trp
65 70 75 80
Ser Pro Cys Phe Ser Trp Gly Cys Ala Gly Glu Val Arg Ala Phe Leu
85 90 95
Gln Glu Asn Thr His Val Arg Leu Arg Ile Phe Ala Ala Arg Ile Tyr
100 105 110
Asp Asp Asp Pro Leu Tyr Lys Glu Ala Leu Gln Met Leu Arg Asp Ala
115 120 125
Gly Ala Gln Val Ser Ile Met Thr Tyr Asp Glu Phe Lys His Cys Trp
130 135 140
Asp Thr Phe Val Asp His Gln Gly Cys Pro Phe Gln Pro Trp Asp Gly
145 150 155 160
Leu Asp Glu His Ser Gln Ala Leu Ser Gly Arg Leu Arg Ala Ile Leu
165 170 175
Gln Asn Gln Gly Asn
180
<210> 9
<211> 181
<212> PRT
<213> Artificial Sequence
<400> 9
Met Glu Ala Ser Pro Ala Ser Gly Pro Arg His Lys Thr Tyr Leu Cys
1 5 10 15
Tyr Glu Val Glu Arg Leu Asp Asn Gly Thr Ser Val Lys Met Asp Gln
20 25 30
His Arg Gly Phe Leu His Asn Gln Ala Lys Asn Leu Leu Cys Gly Phe
35 40 45
Tyr Gly Arg His Ala Glu Leu Arg Phe Leu Asp Leu Val Pro Ser Leu
50 55 60
Gln Leu Asp Pro Ala Gln Ile Tyr Arg Val Thr Trp Phe Ile Ser Trp
65 70 75 80
Ser Pro Cys Phe Ser Ala Gly Cys Ala Gly Glu Val Arg Ala Phe Leu
85 90 95
Gln Glu Asn Thr His Val Arg Leu Arg Ile Phe Ala Ala Arg Ile Tyr
100 105 110
Asp Tyr Asp Pro Leu Tyr Lys Glu Ala Leu Gln Met Leu Arg Asp Ala
115 120 125
Gly Ala Gln Val Ser Ile Met Thr Tyr Asp Glu Phe Lys His Cys Trp
130 135 140
Asp Thr Phe Val Asp His Gln Gly Cys Pro Phe Gln Pro Trp Asp Gly
145 150 155 160
Leu Asp Glu His Ser Gln Ala Leu Ser Gly Arg Leu Arg Ala Ile Leu
165 170 175
Gln Asn Gln Gly Asn
180
<210> 10
<211> 181
<212> PRT
<213> Artificial Sequence
<400> 10
Met Glu Ala Ser Pro Ala Ser Gly Pro Arg His Lys Thr Tyr Leu Cys
1 5 10 15
Tyr Glu Val Glu Arg Leu Asp Asn Gly Thr Ser Val Lys Met Asp Gln
20 25 30
His Arg Gly Phe Leu His Asn Gln Ala Lys Asn Leu Leu Cys Gly Phe
35 40 45
Tyr Gly Arg His Ala Glu Leu Arg Phe Leu Asp Leu Val Pro Ser Leu
50 55 60
Gln Leu Asp Pro Ala Gln Ile Tyr Arg Val Thr Trp Phe Ile Ser Trp
65 70 75 80
Ser Pro Cys Phe Ser Trp Gly Cys Ala Gly Glu Val Arg Ala Phe Leu
85 90 95
Gln Glu Asn Thr His Val Arg Leu Arg Ile Phe Ala Ala Arg Ile Tyr
100 105 110
Tyr Tyr Asp Pro Leu Tyr Lys Glu Ala Leu Gln Met Leu Arg Asp Ala
115 120 125
Gly Ala Gln Val Ser Ile Met Thr Tyr Asp Glu Phe Lys His Cys Trp
130 135 140
Asp Thr Phe Val Asp His Gln Gly Cys Pro Phe Gln Pro Trp Asp Gly
145 150 155 160
Leu Asp Glu His Ser Gln Ala Leu Ser Gly Arg Leu Arg Ala Ile Leu
165 170 175
Gln Asn Gln Gly Asn
180
<210> 11
<211> 1399
<212> PRT
<213> Artificial Sequence
<400> 11
Met Tyr Pro Tyr Asp Val Pro Asp Tyr Ala Ser Pro Lys Lys Lys Arg
1 5 10 15
Lys Val Glu Ala Ser Asp Lys Lys Tyr Ser Ile Gly Leu Ala Ile Gly
20 25 30
Thr Asn Ser Val Gly Trp Ala Val Ile Thr Asp Glu Tyr Lys Val Pro
35 40 45
Ser Lys Lys Phe Lys Val Leu Gly Asn Thr Asp Arg His Ser Ile Lys
50 55 60
Lys Asn Leu Ile Gly Ala Leu Leu Phe Asp Ser Gly Glu Thr Ala Glu
65 70 75 80
Ala Thr Arg Leu Lys Arg Thr Ala Arg Arg Arg Tyr Thr Arg Arg Lys
85 90 95
Asn Arg Ile Cys Tyr Leu Gln Glu Ile Phe Ser Asn Glu Met Ala Lys
100 105 110
Val Asp Asp Ser Phe Phe His Arg Leu Glu Glu Ser Phe Leu Val Glu
115 120 125
Glu Asp Lys Lys His Glu Arg His Pro Ile Phe Gly Asn Ile Val Asp
130 135 140
Glu Val Ala Tyr His Glu Lys Tyr Pro Thr Ile Tyr His Leu Arg Lys
145 150 155 160
Lys Leu Val Asp Ser Thr Asp Lys Ala Asp Leu Arg Leu Ile Tyr Leu
165 170 175
Ala Leu Ala His Met Ile Lys Phe Arg Gly His Phe Leu Ile Glu Gly
180 185 190
Asp Leu Asn Pro Asp Asn Ser Asp Val Asp Lys Leu Phe Ile Gln Leu
195 200 205
Val Gln Thr Tyr Asn Gln Leu Phe Glu Glu Asn Pro Ile Asn Ala Ser
210 215 220
Gly Val Asp Ala Lys Ala Ile Leu Ser Ala Arg Leu Ser Lys Ser Arg
225 230 235 240
Arg Leu Glu Asn Leu Ile Ala Gln Leu Pro Gly Glu Lys Lys Asn Gly
245 250 255
Leu Phe Gly Asn Leu Ile Ala Leu Ser Leu Gly Leu Thr Pro Asn Phe
260 265 270
Lys Ser Asn Phe Asp Leu Ala Glu Asp Ala Lys Leu Gln Leu Ser Lys
275 280 285
Asp Thr Tyr Asp Asp Asp Leu Asp Asn Leu Leu Ala Gln Ile Gly Asp
290 295 300
Gln Tyr Ala Asp Leu Phe Leu Ala Ala Lys Asn Leu Ser Asp Ala Ile
305 310 315 320
Leu Leu Ser Asp Ile Leu Arg Val Asn Thr Glu Ile Thr Lys Ala Pro
325 330 335
Leu Ser Ala Ser Met Ile Lys Arg Tyr Asp Glu His His Gln Asp Leu
340 345 350
Thr Leu Leu Lys Ala Leu Val Arg Gln Gln Leu Pro Glu Lys Tyr Lys
355 360 365
Glu Ile Phe Phe Asp Gln Ser Lys Asn Gly Tyr Ala Gly Tyr Ile Asp
370 375 380
Gly Gly Ala Ser Gln Glu Glu Phe Tyr Lys Phe Ile Lys Pro Ile Leu
385 390 395 400
Glu Lys Met Asp Gly Thr Glu Glu Leu Leu Val Lys Leu Asn Arg Glu
405 410 415
Asp Leu Leu Arg Lys Gln Arg Thr Phe Asp Asn Gly Ser Ile Pro His
420 425 430
Gln Ile His Leu Gly Glu Leu His Ala Ile Leu Arg Arg Gln Glu Asp
435 440 445
Phe Tyr Pro Phe Leu Lys Asp Asn Arg Glu Lys Ile Glu Lys Ile Leu
450 455 460
Thr Phe Arg Ile Pro Tyr Tyr Val Gly Pro Leu Ala Arg Gly Asn Ser
465 470 475 480
Arg Phe Ala Trp Met Thr Arg Lys Ser Glu Glu Thr Ile Thr Pro Trp
485 490 495
Asn Phe Glu Glu Val Val Asp Lys Gly Ala Ser Ala Gln Ser Phe Ile
500 505 510
Glu Arg Met Thr Asn Phe Asp Lys Asn Leu Pro Asn Glu Lys Val Leu
515 520 525
Pro Lys His Ser Leu Leu Tyr Glu Tyr Phe Thr Val Tyr Asn Glu Leu
530 535 540
Thr Lys Val Lys Tyr Val Thr Glu Gly Met Arg Lys Pro Ala Phe Leu
545 550 555 560
Ser Gly Glu Gln Lys Lys Ala Ile Val Asp Leu Leu Phe Lys Thr Asn
565 570 575
Arg Lys Val Thr Val Lys Gln Leu Lys Glu Asp Tyr Phe Lys Lys Ile
580 585 590
Glu Cys Phe Asp Ser Val Glu Ile Ser Gly Val Glu Asp Arg Phe Asn
595 600 605
Ala Ser Leu Gly Thr Tyr His Asp Leu Leu Lys Ile Ile Lys Asp Lys
610 615 620
Asp Phe Leu Asp Asn Glu Glu Asn Glu Asp Ile Leu Glu Asp Ile Val
625 630 635 640
Leu Thr Leu Thr Leu Phe Glu Asp Arg Glu Met Ile Glu Glu Arg Leu
645 650 655
Lys Thr Tyr Ala His Leu Phe Asp Asp Lys Val Met Lys Gln Leu Lys
660 665 670
Arg Arg Arg Tyr Thr Gly Trp Gly Arg Leu Ser Arg Lys Leu Ile Asn
675 680 685
Gly Ile Arg Asp Lys Gln Ser Gly Lys Thr Ile Leu Asp Phe Leu Lys
690 695 700
Ser Asp Gly Phe Ala Asn Arg Asn Phe Met Gln Leu Ile His Asp Asp
705 710 715 720
Ser Leu Thr Phe Lys Glu Asp Ile Gln Lys Ala Gln Val Ser Gly Gln
725 730 735
Gly Asp Ser Leu His Glu His Ile Ala Asn Leu Ala Gly Ser Pro Ala
740 745 750
Ile Lys Lys Gly Ile Leu Gln Thr Val Lys Val Val Asp Glu Leu Val
755 760 765
Lys Val Met Gly Arg His Lys Pro Glu Asn Ile Val Ile Glu Met Ala
770 775 780
Arg Glu Asn Gln Thr Thr Gln Lys Gly Gln Lys Asn Ser Arg Glu Arg
785 790 795 800
Met Lys Arg Ile Glu Glu Gly Ile Lys Glu Leu Gly Ser Gln Ile Leu
805 810 815
Lys Glu His Pro Val Glu Asn Thr Gln Leu Gln Asn Glu Lys Leu Tyr
820 825 830
Leu Tyr Tyr Leu Gln Asn Gly Arg Asp Met Tyr Val Asp Gln Glu Leu
835 840 845
Asp Ile Asn Arg Leu Ser Asp Tyr Asp Val Asp His Ile Val Pro Gln
850 855 860
Ser Phe Leu Lys Asp Asp Ser Ile Asp Asn Lys Val Leu Thr Arg Ser
865 870 875 880
Asp Lys Asn Arg Gly Lys Ser Asp Asn Val Pro Ser Glu Glu Val Val
885 890 895
Lys Lys Met Lys Asn Tyr Trp Arg Gln Leu Leu Asn Ala Lys Leu Ile
900 905 910
Thr Gln Arg Lys Phe Asp Asn Leu Thr Lys Ala Glu Arg Gly Gly Leu
915 920 925
Ser Glu Leu Asp Lys Ala Gly Phe Ile Lys Arg Gln Leu Val Glu Thr
930 935 940
Arg Gln Ile Thr Lys His Val Ala Gln Ile Leu Asp Ser Arg Met Asn
945 950 955 960
Thr Lys Tyr Asp Glu Asn Asp Lys Leu Ile Arg Glu Val Lys Val Ile
965 970 975
Thr Leu Lys Ser Lys Leu Val Ser Asp Phe Arg Lys Asp Phe Gln Phe
980 985 990
Tyr Lys Val Arg Glu Ile Asn Asn Tyr His His Ala His Asp Ala Tyr
995 1000 1005
Leu Asn Ala Val Val Gly Thr Ala Leu Ile Lys Lys Tyr Pro Lys Leu
1010 1015 1020
Glu Ser Glu Phe Val Tyr Gly Asp Tyr Lys Val Tyr Asp Val Arg Lys
1025 1030 1035 1040
Met Ile Ala Lys Ser Glu Gln Glu Ile Gly Lys Ala Thr Ala Lys Tyr
1045 1050 1055
Phe Phe Tyr Ser Asn Ile Met Asn Phe Phe Lys Thr Glu Ile Thr Leu
1060 1065 1070
Ala Asn Gly Glu Ile Arg Lys Arg Pro Leu Ile Glu Thr Asn Gly Glu
1075 1080 1085
Thr Gly Glu Ile Val Trp Asp Lys Gly Arg Asp Phe Ala Thr Val Arg
1090 1095 1100
Lys Val Leu Ser Met Pro Gln Val Asn Ile Val Lys Lys Thr Glu Val
1105 1110 1115 1120
Gln Thr Gly Gly Phe Ser Lys Glu Ser Ile Leu Pro Lys Arg Asn Ser
1125 1130 1135
Asp Lys Leu Ile Ala Arg Lys Lys Asp Trp Asp Pro Lys Lys Tyr Gly
1140 1145 1150
Gly Phe Asp Ser Pro Thr Val Ala Tyr Ser Val Leu Val Val Ala Lys
1155 1160 1165
Val Glu Lys Gly Lys Ser Lys Lys Leu Lys Ser Val Lys Glu Leu Leu
1170 1175 1180
Gly Ile Thr Ile Met Glu Arg Ser Ser Phe Glu Lys Asn Pro Ile Asp
1185 1190 1195 1200
Phe Leu Glu Ala Lys Gly Tyr Lys Glu Val Lys Lys Asp Leu Ile Ile
1205 1210 1215
Lys Leu Pro Lys Tyr Ser Leu Phe Glu Leu Glu Asn Gly Arg Lys Arg
1220 1225 1230
Met Leu Ala Ser Ala Gly Glu Leu Gln Lys Gly Asn Glu Leu Ala Leu
1235 1240 1245
Pro Ser Lys Tyr Val Asn Phe Leu Tyr Leu Ala Ser His Tyr Glu Lys
1250 1255 1260
Leu Lys Gly Ser Pro Glu Asp Asn Glu Gln Lys Gln Leu Phe Val Glu
1265 1270 1275 1280
Gln His Lys His Tyr Leu Asp Glu Ile Ile Glu Gln Ile Ser Glu Phe
1285 1290 1295
Ser Lys Arg Val Ile Leu Ala Asp Ala Asn Leu Asp Lys Val Leu Ser
1300 1305 1310
Ala Tyr Asn Lys His Arg Asp Lys Pro Ile Arg Glu Gln Ala Glu Asn
1315 1320 1325
Ile Ile His Leu Phe Thr Leu Thr Asn Leu Gly Ala Pro Ala Ala Phe
1330 1335 1340
Lys Tyr Phe Asp Thr Thr Ile Asp Arg Lys Arg Tyr Thr Ser Thr Lys
1345 1350 1355 1360
Glu Val Leu Asp Ala Thr Leu Ile His Gln Ser Ile Thr Gly Leu Tyr
1365 1370 1375
Glu Thr Arg Ile Asp Leu Ser Gln Leu Gly Gly Asp Ser Pro Lys Lys
1380 1385 1390
Lys Arg Lys Val Glu Ala Ser
1395
<210> 12
<211> 83
<212> PRT
<213> Artificial Sequence
<400> 12
Thr Asn Leu Ser Asp Ile Ile Glu Lys Glu Thr Gly Lys Gln Leu Val
1 5 10 15
Ile Gln Glu Ser Ile Leu Met Leu Pro Glu Glu Val Glu Glu Val Ile
20 25 30
Gly Asn Lys Pro Glu Ser Asp Ile Leu Val His Thr Ala Tyr Asp Glu
35 40 45
Ser Thr Asp Glu Asn Val Met Leu Leu Thr Ser Asp Ala Pro Glu Tyr
50 55 60
Lys Pro Trp Ala Leu Val Ile Gln Asp Ser Asn Gly Glu Asn Lys Ile
65 70 75 80
Lys Met Leu
<210> 13
<211> 16
<212> PRT
<213> Artificial Sequence
<400> 13
Ser Gly Ser Glu Thr Pro Gly Thr Ser Glu Ser Ala Thr Pro Glu Ser
1 5 10 15
<210> 14
<211> 4
<212> PRT
<213> Artificial Sequence
<400> 14
Ser Gly Gly Ser
1
<210> 15
<211> 7
<212> PRT
<213> Artificial Sequence
<400> 15
Pro Lys Lys Lys Arg Lys Val
1 5
<210> 16
<211> 1680
<212> PRT
<213> Artificial Sequence
<400> 16
Met Glu Ala Ser Pro Ala Ser Gly Pro Arg His Leu Met Asp Pro His
1 5 10 15
Ile Phe Thr Ser Asn Phe Asn Asn Gly Ile Gly Arg His Lys Thr Tyr
20 25 30
Leu Cys Tyr Glu Val Glu Arg Leu Asp Asn Gly Thr Ser Val Lys Met
35 40 45
Asp Gln His Arg Gly Phe Leu His Asn Gln Ala Lys Asn Leu Leu Cys
50 55 60
Gly Phe Tyr Gly Arg His Ala Glu Leu Arg Phe Leu Asp Leu Val Pro
65 70 75 80
Ser Leu Gln Leu Asp Pro Ala Gln Ile Tyr Arg Val Thr Trp Phe Ile
85 90 95
Ser Trp Ser Pro Cys Phe Ser Trp Gly Cys Ala Gly Glu Val Arg Ala
100 105 110
Phe Leu Gln Glu Asn Thr His Val Arg Leu Arg Ile Phe Ala Ala Arg
115 120 125
Ile Tyr Asp Tyr Asp Pro Leu Tyr Lys Glu Ala Leu Gln Met Leu Arg
130 135 140
Asp Ala Gly Ala Gln Val Ser Ile Met Thr Tyr Asp Glu Phe Lys His
145 150 155 160
Cys Trp Asp Thr Phe Val Asp His Gln Gly Cys Pro Phe Gln Pro Trp
165 170 175
Asp Gly Leu Asp Glu His Ser Gln Ala Leu Ser Gly Arg Leu Arg Ala
180 185 190
Ile Leu Gln Asn Gln Gly Asn Ser Gly Ser Glu Thr Pro Gly Thr Ser
195 200 205
Glu Ser Ala Thr Pro Glu Ser Asp Lys Lys Tyr Ser Ile Gly Leu Ala
210 215 220
Ile Gly Thr Asn Ser Val Gly Trp Ala Val Ile Thr Asp Glu Tyr Lys
225 230 235 240
Val Pro Ser Lys Lys Phe Lys Val Leu Gly Asn Thr Asp Arg His Ser
245 250 255
Ile Lys Lys Asn Leu Ile Gly Ala Leu Leu Phe Asp Ser Gly Glu Thr
260 265 270
Ala Glu Ala Thr Arg Leu Lys Arg Thr Ala Arg Arg Arg Tyr Thr Arg
275 280 285
Arg Lys Asn Arg Ile Cys Tyr Leu Gln Glu Ile Phe Ser Asn Glu Met
290 295 300
Ala Lys Val Asp Asp Ser Phe Phe His Arg Leu Glu Glu Ser Phe Leu
305 310 315 320
Val Glu Glu Asp Lys Lys His Glu Arg His Pro Ile Phe Gly Asn Ile
325 330 335
Val Asp Glu Val Ala Tyr His Glu Lys Tyr Pro Thr Ile Tyr His Leu
340 345 350
Arg Lys Lys Leu Val Asp Ser Thr Asp Lys Ala Asp Leu Arg Leu Ile
355 360 365
Tyr Leu Ala Leu Ala His Met Ile Lys Phe Arg Gly His Phe Leu Ile
370 375 380
Glu Gly Asp Leu Asn Pro Asp Asn Ser Asp Val Asp Lys Leu Phe Ile
385 390 395 400
Gln Leu Val Gln Thr Tyr Asn Gln Leu Phe Glu Glu Asn Pro Ile Asn
405 410 415
Ala Ser Gly Val Asp Ala Lys Ala Ile Leu Ser Ala Arg Leu Ser Lys
420 425 430
Ser Arg Arg Leu Glu Asn Leu Ile Ala Gln Leu Pro Gly Glu Lys Lys
435 440 445
Asn Gly Leu Phe Gly Asn Leu Ile Ala Leu Ser Leu Gly Leu Thr Pro
450 455 460
Asn Phe Lys Ser Asn Phe Asp Leu Ala Glu Asp Ala Lys Leu Gln Leu
465 470 475 480
Ser Lys Asp Thr Tyr Asp Asp Asp Leu Asp Asn Leu Leu Ala Gln Ile
485 490 495
Gly Asp Gln Tyr Ala Asp Leu Phe Leu Ala Ala Lys Asn Leu Ser Asp
500 505 510
Ala Ile Leu Leu Ser Asp Ile Leu Arg Val Asn Thr Glu Ile Thr Lys
515 520 525
Ala Pro Leu Ser Ala Ser Met Ile Lys Arg Tyr Asp Glu His His Gln
530 535 540
Asp Leu Thr Leu Leu Lys Ala Leu Val Arg Gln Gln Leu Pro Glu Lys
545 550 555 560
Tyr Lys Glu Ile Phe Phe Asp Gln Ser Lys Asn Gly Tyr Ala Gly Tyr
565 570 575
Ile Asp Gly Gly Ala Ser Gln Glu Glu Phe Tyr Lys Phe Ile Lys Pro
580 585 590
Ile Leu Glu Lys Met Asp Gly Thr Glu Glu Leu Leu Val Lys Leu Asn
595 600 605
Arg Glu Asp Leu Leu Arg Lys Gln Arg Thr Phe Asp Asn Gly Ser Ile
610 615 620
Pro His Gln Ile His Leu Gly Glu Leu His Ala Ile Leu Arg Arg Gln
625 630 635 640
Glu Asp Phe Tyr Pro Phe Leu Lys Asp Asn Arg Glu Lys Ile Glu Lys
645 650 655
Ile Leu Thr Phe Arg Ile Pro Tyr Tyr Val Gly Pro Leu Ala Arg Gly
660 665 670
Asn Ser Arg Phe Ala Trp Met Thr Arg Lys Ser Glu Glu Thr Ile Thr
675 680 685
Pro Trp Asn Phe Glu Glu Val Val Asp Lys Gly Ala Ser Ala Gln Ser
690 695 700
Phe Ile Glu Arg Met Thr Asn Phe Asp Lys Asn Leu Pro Asn Glu Lys
705 710 715 720
Val Leu Pro Lys His Ser Leu Leu Tyr Glu Tyr Phe Thr Val Tyr Asn
725 730 735
Glu Leu Thr Lys Val Lys Tyr Val Thr Glu Gly Met Arg Lys Pro Ala
740 745 750
Phe Leu Ser Gly Glu Gln Lys Lys Ala Ile Val Asp Leu Leu Phe Lys
755 760 765
Thr Asn Arg Lys Val Thr Val Lys Gln Leu Lys Glu Asp Tyr Phe Lys
770 775 780
Lys Ile Glu Cys Phe Asp Ser Val Glu Ile Ser Gly Val Glu Asp Arg
785 790 795 800
Phe Asn Ala Ser Leu Gly Thr Tyr His Asp Leu Leu Lys Ile Ile Lys
805 810 815
Asp Lys Asp Phe Leu Asp Asn Glu Glu Asn Glu Asp Ile Leu Glu Asp
820 825 830
Ile Val Leu Thr Leu Thr Leu Phe Glu Asp Arg Glu Met Ile Glu Glu
835 840 845
Arg Leu Lys Thr Tyr Ala His Leu Phe Asp Asp Lys Val Met Lys Gln
850 855 860
Leu Lys Arg Arg Arg Tyr Thr Gly Trp Gly Arg Leu Ser Arg Lys Leu
865 870 875 880
Ile Asn Gly Ile Arg Asp Lys Gln Ser Gly Lys Thr Ile Leu Asp Phe
885 890 895
Leu Lys Ser Asp Gly Phe Ala Asn Arg Asn Phe Met Gln Leu Ile His
900 905 910
Asp Asp Ser Leu Thr Phe Lys Glu Asp Ile Gln Lys Ala Gln Val Ser
915 920 925
Gly Gln Gly Asp Ser Leu His Glu His Ile Ala Asn Leu Ala Gly Ser
930 935 940
Pro Ala Ile Lys Lys Gly Ile Leu Gln Thr Val Lys Val Val Asp Glu
945 950 955 960
Leu Val Lys Val Met Gly Arg His Lys Pro Glu Asn Ile Val Ile Glu
965 970 975
Met Ala Arg Glu Asn Gln Thr Thr Gln Lys Gly Gln Lys Asn Ser Arg
980 985 990
Glu Arg Met Lys Arg Ile Glu Glu Gly Ile Lys Glu Leu Gly Ser Gln
995 1000 1005
Ile Leu Lys Glu His Pro Val Glu Asn Thr Gln Leu Gln Asn Glu Lys
1010 1015 1020
Leu Tyr Leu Tyr Tyr Leu Gln Asn Gly Arg Asp Met Tyr Val Asp Gln
1025 1030 1035 1040
Glu Leu Asp Ile Asn Arg Leu Ser Asp Tyr Asp Val Asp His Ile Val
1045 1050 1055
Pro Gln Ser Phe Leu Lys Asp Asp Ser Ile Asp Asn Lys Val Leu Thr
1060 1065 1070
Arg Ser Asp Lys Asn Arg Gly Lys Ser Asp Asn Val Pro Ser Glu Glu
1075 1080 1085
Val Val Lys Lys Met Lys Asn Tyr Trp Arg Gln Leu Leu Asn Ala Lys
1090 1095 1100
Leu Ile Thr Gln Arg Lys Phe Asp Asn Leu Thr Lys Ala Glu Arg Gly
1105 1110 1115 1120
Gly Leu Ser Glu Leu Asp Lys Ala Gly Phe Ile Lys Arg Gln Leu Val
1125 1130 1135
Glu Thr Arg Gln Ile Thr Lys His Val Ala Gln Ile Leu Asp Ser Arg
1140 1145 1150
Met Asn Thr Lys Tyr Asp Glu Asn Asp Lys Leu Ile Arg Glu Val Lys
1155 1160 1165
Val Ile Thr Leu Lys Ser Lys Leu Val Ser Asp Phe Arg Lys Asp Phe
1170 1175 1180
Gln Phe Tyr Lys Val Arg Glu Ile Asn Asn Tyr His His Ala His Asp
1185 1190 1195 1200
Ala Tyr Leu Asn Ala Val Val Gly Thr Ala Leu Ile Lys Lys Tyr Pro
1205 1210 1215
Lys Leu Glu Ser Glu Phe Val Tyr Gly Asp Tyr Lys Val Tyr Asp Val
1220 1225 1230
Arg Lys Met Ile Ala Lys Ser Glu Gln Glu Ile Gly Lys Ala Thr Ala
1235 1240 1245
Lys Tyr Phe Phe Tyr Ser Asn Ile Met Asn Phe Phe Lys Thr Glu Ile
1250 1255 1260
Thr Leu Ala Asn Gly Glu Ile Arg Lys Arg Pro Leu Ile Glu Thr Asn
1265 1270 1275 1280
Gly Glu Thr Gly Glu Ile Val Trp Asp Lys Gly Arg Asp Phe Ala Thr
1285 1290 1295
Val Arg Lys Val Leu Ser Met Pro Gln Val Asn Ile Val Lys Lys Thr
1300 1305 1310
Glu Val Gln Thr Gly Gly Phe Ser Lys Glu Ser Ile Leu Pro Lys Arg
1315 1320 1325
Asn Ser Asp Lys Leu Ile Ala Arg Lys Lys Asp Trp Asp Pro Lys Lys
1330 1335 1340
Tyr Gly Gly Phe Asp Ser Pro Thr Val Ala Tyr Ser Val Leu Val Val
1345 1350 1355 1360
Ala Lys Val Glu Lys Gly Lys Ser Lys Lys Leu Lys Ser Val Lys Glu
1365 1370 1375
Leu Leu Gly Ile Thr Ile Met Glu Arg Ser Ser Phe Glu Lys Asn Pro
1380 1385 1390
Ile Asp Phe Leu Glu Ala Lys Gly Tyr Lys Glu Val Lys Lys Asp Leu
1395 1400 1405
Ile Ile Lys Leu Pro Lys Tyr Ser Leu Phe Glu Leu Glu Asn Gly Arg
1410 1415 1420
Lys Arg Met Leu Ala Ser Ala Gly Glu Leu Gln Lys Gly Asn Glu Leu
1425 1430 1435 1440
Ala Leu Pro Ser Lys Tyr Val Asn Phe Leu Tyr Leu Ala Ser His Tyr
1445 1450 1455
Glu Lys Leu Lys Gly Ser Pro Glu Asp Asn Glu Gln Lys Gln Leu Phe
1460 1465 1470
Val Glu Gln His Lys His Tyr Leu Asp Glu Ile Ile Glu Gln Ile Ser
1475 1480 1485
Glu Phe Ser Lys Arg Val Ile Leu Ala Asp Ala Asn Leu Asp Lys Val
1490 1495 1500
Leu Ser Ala Tyr Asn Lys His Arg Asp Lys Pro Ile Arg Glu Gln Ala
1505 1510 1515 1520
Glu Asn Ile Ile His Leu Phe Thr Leu Thr Asn Leu Gly Ala Pro Ala
1525 1530 1535
Ala Phe Lys Tyr Phe Asp Thr Thr Ile Asp Arg Lys Arg Tyr Thr Ser
1540 1545 1550
Thr Lys Glu Val Leu Asp Ala Thr Leu Ile His Gln Ser Ile Thr Gly
1555 1560 1565
Leu Tyr Glu Thr Arg Ile Asp Leu Ser Gln Leu Gly Gly Asp Ser Gly
1570 1575 1580
Gly Ser Thr Asn Leu Ser Asp Ile Ile Glu Lys Glu Thr Gly Lys Gln
1585 1590 1595 1600
Leu Val Ile Gln Glu Ser Ile Leu Met Leu Pro Glu Glu Val Glu Glu
1605 1610 1615
Val Ile Gly Asn Lys Pro Glu Ser Asp Ile Leu Val His Thr Ala Tyr
1620 1625 1630
Asp Glu Ser Thr Asp Glu Asn Val Met Leu Leu Thr Ser Asp Ala Pro
1635 1640 1645
Glu Tyr Lys Pro Trp Ala Leu Val Ile Gln Asp Ser Asn Gly Glu Asn
1650 1655 1660
Lys Ile Lys Met Leu Ser Gly Gly Ser Pro Lys Lys Lys Arg Lys Val
1665 1670 1675 1680
<210> 17
<211> 1680
<212> PRT
<213> Artificial Sequence
<400> 17
Met Glu Ala Ser Pro Ala Ser Gly Pro Arg His Leu Met Asp Pro His
1 5 10 15
Ile Phe Thr Ser Asn Phe Asn Asn Gly Ile Gly Arg His Lys Thr Tyr
20 25 30
Leu Cys Tyr Glu Val Glu Arg Leu Asp Asn Gly Thr Ser Val Lys Met
35 40 45
Asp Gln His Arg Gly Phe Leu His Asn Gln Ala Lys Asn Leu Leu Cys
50 55 60
Gly Phe Tyr Gly Arg His Ala Glu Leu Arg Phe Leu Asp Leu Val Pro
65 70 75 80
Ser Leu Gln Leu Asp Pro Ala Gln Ile Tyr Arg Val Thr Trp Phe Ile
85 90 95
Ser Trp Ser Pro Cys Phe Ser Trp Gly Cys Ala Gly Glu Val Arg Ala
100 105 110
Phe Leu Gln Glu Asn Thr His Val Arg Leu Arg Ile Phe Ala Ala Arg
115 120 125
Ile Phe Asp Tyr Asp Pro Leu Tyr Lys Glu Ala Leu Gln Met Leu Arg
130 135 140
Asp Ala Gly Ala Gln Val Ser Ile Met Thr Tyr Asp Glu Phe Lys His
145 150 155 160
Cys Trp Asp Thr Phe Val Asp His Gln Gly Cys Pro Phe Gln Pro Trp
165 170 175
Asp Gly Leu Asp Glu His Ser Gln Ala Leu Ser Gly Arg Leu Arg Ala
180 185 190
Ile Leu Gln Asn Gln Gly Asn Ser Gly Ser Glu Thr Pro Gly Thr Ser
195 200 205
Glu Ser Ala Thr Pro Glu Ser Asp Lys Lys Tyr Ser Ile Gly Leu Ala
210 215 220
Ile Gly Thr Asn Ser Val Gly Trp Ala Val Ile Thr Asp Glu Tyr Lys
225 230 235 240
Val Pro Ser Lys Lys Phe Lys Val Leu Gly Asn Thr Asp Arg His Ser
245 250 255
Ile Lys Lys Asn Leu Ile Gly Ala Leu Leu Phe Asp Ser Gly Glu Thr
260 265 270
Ala Glu Ala Thr Arg Leu Lys Arg Thr Ala Arg Arg Arg Tyr Thr Arg
275 280 285
Arg Lys Asn Arg Ile Cys Tyr Leu Gln Glu Ile Phe Ser Asn Glu Met
290 295 300
Ala Lys Val Asp Asp Ser Phe Phe His Arg Leu Glu Glu Ser Phe Leu
305 310 315 320
Val Glu Glu Asp Lys Lys His Glu Arg His Pro Ile Phe Gly Asn Ile
325 330 335
Val Asp Glu Val Ala Tyr His Glu Lys Tyr Pro Thr Ile Tyr His Leu
340 345 350
Arg Lys Lys Leu Val Asp Ser Thr Asp Lys Ala Asp Leu Arg Leu Ile
355 360 365
Tyr Leu Ala Leu Ala His Met Ile Lys Phe Arg Gly His Phe Leu Ile
370 375 380
Glu Gly Asp Leu Asn Pro Asp Asn Ser Asp Val Asp Lys Leu Phe Ile
385 390 395 400
Gln Leu Val Gln Thr Tyr Asn Gln Leu Phe Glu Glu Asn Pro Ile Asn
405 410 415
Ala Ser Gly Val Asp Ala Lys Ala Ile Leu Ser Ala Arg Leu Ser Lys
420 425 430
Ser Arg Arg Leu Glu Asn Leu Ile Ala Gln Leu Pro Gly Glu Lys Lys
435 440 445
Asn Gly Leu Phe Gly Asn Leu Ile Ala Leu Ser Leu Gly Leu Thr Pro
450 455 460
Asn Phe Lys Ser Asn Phe Asp Leu Ala Glu Asp Ala Lys Leu Gln Leu
465 470 475 480
Ser Lys Asp Thr Tyr Asp Asp Asp Leu Asp Asn Leu Leu Ala Gln Ile
485 490 495
Gly Asp Gln Tyr Ala Asp Leu Phe Leu Ala Ala Lys Asn Leu Ser Asp
500 505 510
Ala Ile Leu Leu Ser Asp Ile Leu Arg Val Asn Thr Glu Ile Thr Lys
515 520 525
Ala Pro Leu Ser Ala Ser Met Ile Lys Arg Tyr Asp Glu His His Gln
530 535 540
Asp Leu Thr Leu Leu Lys Ala Leu Val Arg Gln Gln Leu Pro Glu Lys
545 550 555 560
Tyr Lys Glu Ile Phe Phe Asp Gln Ser Lys Asn Gly Tyr Ala Gly Tyr
565 570 575
Ile Asp Gly Gly Ala Ser Gln Glu Glu Phe Tyr Lys Phe Ile Lys Pro
580 585 590
Ile Leu Glu Lys Met Asp Gly Thr Glu Glu Leu Leu Val Lys Leu Asn
595 600 605
Arg Glu Asp Leu Leu Arg Lys Gln Arg Thr Phe Asp Asn Gly Ser Ile
610 615 620
Pro His Gln Ile His Leu Gly Glu Leu His Ala Ile Leu Arg Arg Gln
625 630 635 640
Glu Asp Phe Tyr Pro Phe Leu Lys Asp Asn Arg Glu Lys Ile Glu Lys
645 650 655
Ile Leu Thr Phe Arg Ile Pro Tyr Tyr Val Gly Pro Leu Ala Arg Gly
660 665 670
Asn Ser Arg Phe Ala Trp Met Thr Arg Lys Ser Glu Glu Thr Ile Thr
675 680 685
Pro Trp Asn Phe Glu Glu Val Val Asp Lys Gly Ala Ser Ala Gln Ser
690 695 700
Phe Ile Glu Arg Met Thr Asn Phe Asp Lys Asn Leu Pro Asn Glu Lys
705 710 715 720
Val Leu Pro Lys His Ser Leu Leu Tyr Glu Tyr Phe Thr Val Tyr Asn
725 730 735
Glu Leu Thr Lys Val Lys Tyr Val Thr Glu Gly Met Arg Lys Pro Ala
740 745 750
Phe Leu Ser Gly Glu Gln Lys Lys Ala Ile Val Asp Leu Leu Phe Lys
755 760 765
Thr Asn Arg Lys Val Thr Val Lys Gln Leu Lys Glu Asp Tyr Phe Lys
770 775 780
Lys Ile Glu Cys Phe Asp Ser Val Glu Ile Ser Gly Val Glu Asp Arg
785 790 795 800
Phe Asn Ala Ser Leu Gly Thr Tyr His Asp Leu Leu Lys Ile Ile Lys
805 810 815
Asp Lys Asp Phe Leu Asp Asn Glu Glu Asn Glu Asp Ile Leu Glu Asp
820 825 830
Ile Val Leu Thr Leu Thr Leu Phe Glu Asp Arg Glu Met Ile Glu Glu
835 840 845
Arg Leu Lys Thr Tyr Ala His Leu Phe Asp Asp Lys Val Met Lys Gln
850 855 860
Leu Lys Arg Arg Arg Tyr Thr Gly Trp Gly Arg Leu Ser Arg Lys Leu
865 870 875 880
Ile Asn Gly Ile Arg Asp Lys Gln Ser Gly Lys Thr Ile Leu Asp Phe
885 890 895
Leu Lys Ser Asp Gly Phe Ala Asn Arg Asn Phe Met Gln Leu Ile His
900 905 910
Asp Asp Ser Leu Thr Phe Lys Glu Asp Ile Gln Lys Ala Gln Val Ser
915 920 925
Gly Gln Gly Asp Ser Leu His Glu His Ile Ala Asn Leu Ala Gly Ser
930 935 940
Pro Ala Ile Lys Lys Gly Ile Leu Gln Thr Val Lys Val Val Asp Glu
945 950 955 960
Leu Val Lys Val Met Gly Arg His Lys Pro Glu Asn Ile Val Ile Glu
965 970 975
Met Ala Arg Glu Asn Gln Thr Thr Gln Lys Gly Gln Lys Asn Ser Arg
980 985 990
Glu Arg Met Lys Arg Ile Glu Glu Gly Ile Lys Glu Leu Gly Ser Gln
995 1000 1005
Ile Leu Lys Glu His Pro Val Glu Asn Thr Gln Leu Gln Asn Glu Lys
1010 1015 1020
Leu Tyr Leu Tyr Tyr Leu Gln Asn Gly Arg Asp Met Tyr Val Asp Gln
1025 1030 1035 1040
Glu Leu Asp Ile Asn Arg Leu Ser Asp Tyr Asp Val Asp His Ile Val
1045 1050 1055
Pro Gln Ser Phe Leu Lys Asp Asp Ser Ile Asp Asn Lys Val Leu Thr
1060 1065 1070
Arg Ser Asp Lys Asn Arg Gly Lys Ser Asp Asn Val Pro Ser Glu Glu
1075 1080 1085
Val Val Lys Lys Met Lys Asn Tyr Trp Arg Gln Leu Leu Asn Ala Lys
1090 1095 1100
Leu Ile Thr Gln Arg Lys Phe Asp Asn Leu Thr Lys Ala Glu Arg Gly
1105 1110 1115 1120
Gly Leu Ser Glu Leu Asp Lys Ala Gly Phe Ile Lys Arg Gln Leu Val
1125 1130 1135
Glu Thr Arg Gln Ile Thr Lys His Val Ala Gln Ile Leu Asp Ser Arg
1140 1145 1150
Met Asn Thr Lys Tyr Asp Glu Asn Asp Lys Leu Ile Arg Glu Val Lys
1155 1160 1165
Val Ile Thr Leu Lys Ser Lys Leu Val Ser Asp Phe Arg Lys Asp Phe
1170 1175 1180
Gln Phe Tyr Lys Val Arg Glu Ile Asn Asn Tyr His His Ala His Asp
1185 1190 1195 1200
Ala Tyr Leu Asn Ala Val Val Gly Thr Ala Leu Ile Lys Lys Tyr Pro
1205 1210 1215
Lys Leu Glu Ser Glu Phe Val Tyr Gly Asp Tyr Lys Val Tyr Asp Val
1220 1225 1230
Arg Lys Met Ile Ala Lys Ser Glu Gln Glu Ile Gly Lys Ala Thr Ala
1235 1240 1245
Lys Tyr Phe Phe Tyr Ser Asn Ile Met Asn Phe Phe Lys Thr Glu Ile
1250 1255 1260
Thr Leu Ala Asn Gly Glu Ile Arg Lys Arg Pro Leu Ile Glu Thr Asn
1265 1270 1275 1280
Gly Glu Thr Gly Glu Ile Val Trp Asp Lys Gly Arg Asp Phe Ala Thr
1285 1290 1295
Val Arg Lys Val Leu Ser Met Pro Gln Val Asn Ile Val Lys Lys Thr
1300 1305 1310
Glu Val Gln Thr Gly Gly Phe Ser Lys Glu Ser Ile Leu Pro Lys Arg
1315 1320 1325
Asn Ser Asp Lys Leu Ile Ala Arg Lys Lys Asp Trp Asp Pro Lys Lys
1330 1335 1340
Tyr Gly Gly Phe Asp Ser Pro Thr Val Ala Tyr Ser Val Leu Val Val
1345 1350 1355 1360
Ala Lys Val Glu Lys Gly Lys Ser Lys Lys Leu Lys Ser Val Lys Glu
1365 1370 1375
Leu Leu Gly Ile Thr Ile Met Glu Arg Ser Ser Phe Glu Lys Asn Pro
1380 1385 1390
Ile Asp Phe Leu Glu Ala Lys Gly Tyr Lys Glu Val Lys Lys Asp Leu
1395 1400 1405
Ile Ile Lys Leu Pro Lys Tyr Ser Leu Phe Glu Leu Glu Asn Gly Arg
1410 1415 1420
Lys Arg Met Leu Ala Ser Ala Gly Glu Leu Gln Lys Gly Asn Glu Leu
1425 1430 1435 1440
Ala Leu Pro Ser Lys Tyr Val Asn Phe Leu Tyr Leu Ala Ser His Tyr
1445 1450 1455
Glu Lys Leu Lys Gly Ser Pro Glu Asp Asn Glu Gln Lys Gln Leu Phe
1460 1465 1470
Val Glu Gln His Lys His Tyr Leu Asp Glu Ile Ile Glu Gln Ile Ser
1475 1480 1485
Glu Phe Ser Lys Arg Val Ile Leu Ala Asp Ala Asn Leu Asp Lys Val
1490 1495 1500
Leu Ser Ala Tyr Asn Lys His Arg Asp Lys Pro Ile Arg Glu Gln Ala
1505 1510 1515 1520
Glu Asn Ile Ile His Leu Phe Thr Leu Thr Asn Leu Gly Ala Pro Ala
1525 1530 1535
Ala Phe Lys Tyr Phe Asp Thr Thr Ile Asp Arg Lys Arg Tyr Thr Ser
1540 1545 1550
Thr Lys Glu Val Leu Asp Ala Thr Leu Ile His Gln Ser Ile Thr Gly
1555 1560 1565
Leu Tyr Glu Thr Arg Ile Asp Leu Ser Gln Leu Gly Gly Asp Ser Gly
1570 1575 1580
Gly Ser Thr Asn Leu Ser Asp Ile Ile Glu Lys Glu Thr Gly Lys Gln
1585 1590 1595 1600
Leu Val Ile Gln Glu Ser Ile Leu Met Leu Pro Glu Glu Val Glu Glu
1605 1610 1615
Val Ile Gly Asn Lys Pro Glu Ser Asp Ile Leu Val His Thr Ala Tyr
1620 1625 1630
Asp Glu Ser Thr Asp Glu Asn Val Met Leu Leu Thr Ser Asp Ala Pro
1635 1640 1645
Glu Tyr Lys Pro Trp Ala Leu Val Ile Gln Asp Ser Asn Gly Glu Asn
1650 1655 1660
Lys Ile Lys Met Leu Ser Gly Gly Ser Pro Lys Lys Lys Arg Lys Val
1665 1670 1675 1680
<210> 18
<211> 1680
<212> PRT
<213> Artificial Sequence
<400> 18
Met Glu Ala Ser Pro Ala Ser Gly Pro Arg His Leu Met Asp Pro His
1 5 10 15
Ile Phe Thr Ser Asn Phe Asn Asn Gly Ile Gly Arg His Lys Thr Tyr
20 25 30
Leu Cys Tyr Glu Val Glu Arg Leu Asp Asn Gly Thr Ser Val Lys Met
35 40 45
Asp Gln His Arg Gly Phe Leu His Asn Gln Ala Lys Asn Leu Leu Cys
50 55 60
Gly Phe Tyr Gly Arg His Ala Glu Leu Arg Phe Leu Asp Leu Val Pro
65 70 75 80
Ser Leu Gln Leu Asp Pro Ala Gln Ile Tyr Arg Val Thr Trp Phe Ile
85 90 95
Ser Trp Ser Pro Cys Phe Ser Trp Gly Cys Ala Gly Glu Val Arg Ala
100 105 110
Phe Leu Gln Glu Asn Thr His Val Arg Leu Arg Ile Phe Ala Ala Arg
115 120 125
Ile Tyr Asp Asp Asp Pro Leu Tyr Lys Glu Ala Leu Gln Met Leu Arg
130 135 140
Asp Ala Gly Ala Gln Val Ser Ile Met Thr Tyr Asp Glu Phe Lys His
145 150 155 160
Cys Trp Asp Thr Phe Val Asp His Gln Gly Cys Pro Phe Gln Pro Trp
165 170 175
Asp Gly Leu Asp Glu His Ser Gln Ala Leu Ser Gly Arg Leu Arg Ala
180 185 190
Ile Leu Gln Asn Gln Gly Asn Ser Gly Ser Glu Thr Pro Gly Thr Ser
195 200 205
Glu Ser Ala Thr Pro Glu Ser Asp Lys Lys Tyr Ser Ile Gly Leu Ala
210 215 220
Ile Gly Thr Asn Ser Val Gly Trp Ala Val Ile Thr Asp Glu Tyr Lys
225 230 235 240
Val Pro Ser Lys Lys Phe Lys Val Leu Gly Asn Thr Asp Arg His Ser
245 250 255
Ile Lys Lys Asn Leu Ile Gly Ala Leu Leu Phe Asp Ser Gly Glu Thr
260 265 270
Ala Glu Ala Thr Arg Leu Lys Arg Thr Ala Arg Arg Arg Tyr Thr Arg
275 280 285
Arg Lys Asn Arg Ile Cys Tyr Leu Gln Glu Ile Phe Ser Asn Glu Met
290 295 300
Ala Lys Val Asp Asp Ser Phe Phe His Arg Leu Glu Glu Ser Phe Leu
305 310 315 320
Val Glu Glu Asp Lys Lys His Glu Arg His Pro Ile Phe Gly Asn Ile
325 330 335
Val Asp Glu Val Ala Tyr His Glu Lys Tyr Pro Thr Ile Tyr His Leu
340 345 350
Arg Lys Lys Leu Val Asp Ser Thr Asp Lys Ala Asp Leu Arg Leu Ile
355 360 365
Tyr Leu Ala Leu Ala His Met Ile Lys Phe Arg Gly His Phe Leu Ile
370 375 380
Glu Gly Asp Leu Asn Pro Asp Asn Ser Asp Val Asp Lys Leu Phe Ile
385 390 395 400
Gln Leu Val Gln Thr Tyr Asn Gln Leu Phe Glu Glu Asn Pro Ile Asn
405 410 415
Ala Ser Gly Val Asp Ala Lys Ala Ile Leu Ser Ala Arg Leu Ser Lys
420 425 430
Ser Arg Arg Leu Glu Asn Leu Ile Ala Gln Leu Pro Gly Glu Lys Lys
435 440 445
Asn Gly Leu Phe Gly Asn Leu Ile Ala Leu Ser Leu Gly Leu Thr Pro
450 455 460
Asn Phe Lys Ser Asn Phe Asp Leu Ala Glu Asp Ala Lys Leu Gln Leu
465 470 475 480
Ser Lys Asp Thr Tyr Asp Asp Asp Leu Asp Asn Leu Leu Ala Gln Ile
485 490 495
Gly Asp Gln Tyr Ala Asp Leu Phe Leu Ala Ala Lys Asn Leu Ser Asp
500 505 510
Ala Ile Leu Leu Ser Asp Ile Leu Arg Val Asn Thr Glu Ile Thr Lys
515 520 525
Ala Pro Leu Ser Ala Ser Met Ile Lys Arg Tyr Asp Glu His His Gln
530 535 540
Asp Leu Thr Leu Leu Lys Ala Leu Val Arg Gln Gln Leu Pro Glu Lys
545 550 555 560
Tyr Lys Glu Ile Phe Phe Asp Gln Ser Lys Asn Gly Tyr Ala Gly Tyr
565 570 575
Ile Asp Gly Gly Ala Ser Gln Glu Glu Phe Tyr Lys Phe Ile Lys Pro
580 585 590
Ile Leu Glu Lys Met Asp Gly Thr Glu Glu Leu Leu Val Lys Leu Asn
595 600 605
Arg Glu Asp Leu Leu Arg Lys Gln Arg Thr Phe Asp Asn Gly Ser Ile
610 615 620
Pro His Gln Ile His Leu Gly Glu Leu His Ala Ile Leu Arg Arg Gln
625 630 635 640
Glu Asp Phe Tyr Pro Phe Leu Lys Asp Asn Arg Glu Lys Ile Glu Lys
645 650 655
Ile Leu Thr Phe Arg Ile Pro Tyr Tyr Val Gly Pro Leu Ala Arg Gly
660 665 670
Asn Ser Arg Phe Ala Trp Met Thr Arg Lys Ser Glu Glu Thr Ile Thr
675 680 685
Pro Trp Asn Phe Glu Glu Val Val Asp Lys Gly Ala Ser Ala Gln Ser
690 695 700
Phe Ile Glu Arg Met Thr Asn Phe Asp Lys Asn Leu Pro Asn Glu Lys
705 710 715 720
Val Leu Pro Lys His Ser Leu Leu Tyr Glu Tyr Phe Thr Val Tyr Asn
725 730 735
Glu Leu Thr Lys Val Lys Tyr Val Thr Glu Gly Met Arg Lys Pro Ala
740 745 750
Phe Leu Ser Gly Glu Gln Lys Lys Ala Ile Val Asp Leu Leu Phe Lys
755 760 765
Thr Asn Arg Lys Val Thr Val Lys Gln Leu Lys Glu Asp Tyr Phe Lys
770 775 780
Lys Ile Glu Cys Phe Asp Ser Val Glu Ile Ser Gly Val Glu Asp Arg
785 790 795 800
Phe Asn Ala Ser Leu Gly Thr Tyr His Asp Leu Leu Lys Ile Ile Lys
805 810 815
Asp Lys Asp Phe Leu Asp Asn Glu Glu Asn Glu Asp Ile Leu Glu Asp
820 825 830
Ile Val Leu Thr Leu Thr Leu Phe Glu Asp Arg Glu Met Ile Glu Glu
835 840 845
Arg Leu Lys Thr Tyr Ala His Leu Phe Asp Asp Lys Val Met Lys Gln
850 855 860
Leu Lys Arg Arg Arg Tyr Thr Gly Trp Gly Arg Leu Ser Arg Lys Leu
865 870 875 880
Ile Asn Gly Ile Arg Asp Lys Gln Ser Gly Lys Thr Ile Leu Asp Phe
885 890 895
Leu Lys Ser Asp Gly Phe Ala Asn Arg Asn Phe Met Gln Leu Ile His
900 905 910
Asp Asp Ser Leu Thr Phe Lys Glu Asp Ile Gln Lys Ala Gln Val Ser
915 920 925
Gly Gln Gly Asp Ser Leu His Glu His Ile Ala Asn Leu Ala Gly Ser
930 935 940
Pro Ala Ile Lys Lys Gly Ile Leu Gln Thr Val Lys Val Val Asp Glu
945 950 955 960
Leu Val Lys Val Met Gly Arg His Lys Pro Glu Asn Ile Val Ile Glu
965 970 975
Met Ala Arg Glu Asn Gln Thr Thr Gln Lys Gly Gln Lys Asn Ser Arg
980 985 990
Glu Arg Met Lys Arg Ile Glu Glu Gly Ile Lys Glu Leu Gly Ser Gln
995 1000 1005
Ile Leu Lys Glu His Pro Val Glu Asn Thr Gln Leu Gln Asn Glu Lys
1010 1015 1020
Leu Tyr Leu Tyr Tyr Leu Gln Asn Gly Arg Asp Met Tyr Val Asp Gln
1025 1030 1035 1040
Glu Leu Asp Ile Asn Arg Leu Ser Asp Tyr Asp Val Asp His Ile Val
1045 1050 1055
Pro Gln Ser Phe Leu Lys Asp Asp Ser Ile Asp Asn Lys Val Leu Thr
1060 1065 1070
Arg Ser Asp Lys Asn Arg Gly Lys Ser Asp Asn Val Pro Ser Glu Glu
1075 1080 1085
Val Val Lys Lys Met Lys Asn Tyr Trp Arg Gln Leu Leu Asn Ala Lys
1090 1095 1100
Leu Ile Thr Gln Arg Lys Phe Asp Asn Leu Thr Lys Ala Glu Arg Gly
1105 1110 1115 1120
Gly Leu Ser Glu Leu Asp Lys Ala Gly Phe Ile Lys Arg Gln Leu Val
1125 1130 1135
Glu Thr Arg Gln Ile Thr Lys His Val Ala Gln Ile Leu Asp Ser Arg
1140 1145 1150
Met Asn Thr Lys Tyr Asp Glu Asn Asp Lys Leu Ile Arg Glu Val Lys
1155 1160 1165
Val Ile Thr Leu Lys Ser Lys Leu Val Ser Asp Phe Arg Lys Asp Phe
1170 1175 1180
Gln Phe Tyr Lys Val Arg Glu Ile Asn Asn Tyr His His Ala His Asp
1185 1190 1195 1200
Ala Tyr Leu Asn Ala Val Val Gly Thr Ala Leu Ile Lys Lys Tyr Pro
1205 1210 1215
Lys Leu Glu Ser Glu Phe Val Tyr Gly Asp Tyr Lys Val Tyr Asp Val
1220 1225 1230
Arg Lys Met Ile Ala Lys Ser Glu Gln Glu Ile Gly Lys Ala Thr Ala
1235 1240 1245
Lys Tyr Phe Phe Tyr Ser Asn Ile Met Asn Phe Phe Lys Thr Glu Ile
1250 1255 1260
Thr Leu Ala Asn Gly Glu Ile Arg Lys Arg Pro Leu Ile Glu Thr Asn
1265 1270 1275 1280
Gly Glu Thr Gly Glu Ile Val Trp Asp Lys Gly Arg Asp Phe Ala Thr
1285 1290 1295
Val Arg Lys Val Leu Ser Met Pro Gln Val Asn Ile Val Lys Lys Thr
1300 1305 1310
Glu Val Gln Thr Gly Gly Phe Ser Lys Glu Ser Ile Leu Pro Lys Arg
1315 1320 1325
Asn Ser Asp Lys Leu Ile Ala Arg Lys Lys Asp Trp Asp Pro Lys Lys
1330 1335 1340
Tyr Gly Gly Phe Asp Ser Pro Thr Val Ala Tyr Ser Val Leu Val Val
1345 1350 1355 1360
Ala Lys Val Glu Lys Gly Lys Ser Lys Lys Leu Lys Ser Val Lys Glu
1365 1370 1375
Leu Leu Gly Ile Thr Ile Met Glu Arg Ser Ser Phe Glu Lys Asn Pro
1380 1385 1390
Ile Asp Phe Leu Glu Ala Lys Gly Tyr Lys Glu Val Lys Lys Asp Leu
1395 1400 1405
Ile Ile Lys Leu Pro Lys Tyr Ser Leu Phe Glu Leu Glu Asn Gly Arg
1410 1415 1420
Lys Arg Met Leu Ala Ser Ala Gly Glu Leu Gln Lys Gly Asn Glu Leu
1425 1430 1435 1440
Ala Leu Pro Ser Lys Tyr Val Asn Phe Leu Tyr Leu Ala Ser His Tyr
1445 1450 1455
Glu Lys Leu Lys Gly Ser Pro Glu Asp Asn Glu Gln Lys Gln Leu Phe
1460 1465 1470
Val Glu Gln His Lys His Tyr Leu Asp Glu Ile Ile Glu Gln Ile Ser
1475 1480 1485
Glu Phe Ser Lys Arg Val Ile Leu Ala Asp Ala Asn Leu Asp Lys Val
1490 1495 1500
Leu Ser Ala Tyr Asn Lys His Arg Asp Lys Pro Ile Arg Glu Gln Ala
1505 1510 1515 1520
Glu Asn Ile Ile His Leu Phe Thr Leu Thr Asn Leu Gly Ala Pro Ala
1525 1530 1535
Ala Phe Lys Tyr Phe Asp Thr Thr Ile Asp Arg Lys Arg Tyr Thr Ser
1540 1545 1550
Thr Lys Glu Val Leu Asp Ala Thr Leu Ile His Gln Ser Ile Thr Gly
1555 1560 1565
Leu Tyr Glu Thr Arg Ile Asp Leu Ser Gln Leu Gly Gly Asp Ser Gly
1570 1575 1580
Gly Ser Thr Asn Leu Ser Asp Ile Ile Glu Lys Glu Thr Gly Lys Gln
1585 1590 1595 1600
Leu Val Ile Gln Glu Ser Ile Leu Met Leu Pro Glu Glu Val Glu Glu
1605 1610 1615
Val Ile Gly Asn Lys Pro Glu Ser Asp Ile Leu Val His Thr Ala Tyr
1620 1625 1630
Asp Glu Ser Thr Asp Glu Asn Val Met Leu Leu Thr Ser Asp Ala Pro
1635 1640 1645
Glu Tyr Lys Pro Trp Ala Leu Val Ile Gln Asp Ser Asn Gly Glu Asn
1650 1655 1660
Lys Ile Lys Met Leu Ser Gly Gly Ser Pro Lys Lys Lys Arg Lys Val
1665 1670 1675 1680
<210> 19
<211> 1680
<212> PRT
<213> Artificial Sequence
<400> 19
Met Glu Ala Ser Pro Ala Ser Gly Pro Arg His Leu Met Asp Pro His
1 5 10 15
Ile Phe Thr Ser Asn Phe Asn Asn Gly Ile Gly Arg His Lys Thr Tyr
20 25 30
Leu Cys Tyr Glu Val Glu Arg Leu Asp Asn Gly Thr Ser Val Lys Met
35 40 45
Asp Gln His Arg Gly Phe Leu His Asn Gln Ala Lys Asn Leu Leu Cys
50 55 60
Gly Phe Tyr Gly Arg His Ala Glu Leu Arg Phe Leu Asp Leu Val Pro
65 70 75 80
Ser Leu Gln Leu Asp Pro Ala Gln Ile Tyr Arg Val Thr Trp Phe Ile
85 90 95
Ser Trp Ser Pro Cys Phe Ser Ala Gly Cys Ala Gly Glu Val Arg Ala
100 105 110
Phe Leu Gln Glu Asn Thr His Val Arg Leu Arg Ile Phe Ala Ala Arg
115 120 125
Ile Tyr Asp Tyr Asp Pro Leu Tyr Lys Glu Ala Leu Gln Met Leu Arg
130 135 140
Asp Ala Gly Ala Gln Val Ser Ile Met Thr Tyr Asp Glu Phe Lys His
145 150 155 160
Cys Trp Asp Thr Phe Val Asp His Gln Gly Cys Pro Phe Gln Pro Trp
165 170 175
Asp Gly Leu Asp Glu His Ser Gln Ala Leu Ser Gly Arg Leu Arg Ala
180 185 190
Ile Leu Gln Asn Gln Gly Asn Ser Gly Ser Glu Thr Pro Gly Thr Ser
195 200 205
Glu Ser Ala Thr Pro Glu Ser Asp Lys Lys Tyr Ser Ile Gly Leu Ala
210 215 220
Ile Gly Thr Asn Ser Val Gly Trp Ala Val Ile Thr Asp Glu Tyr Lys
225 230 235 240
Val Pro Ser Lys Lys Phe Lys Val Leu Gly Asn Thr Asp Arg His Ser
245 250 255
Ile Lys Lys Asn Leu Ile Gly Ala Leu Leu Phe Asp Ser Gly Glu Thr
260 265 270
Ala Glu Ala Thr Arg Leu Lys Arg Thr Ala Arg Arg Arg Tyr Thr Arg
275 280 285
Arg Lys Asn Arg Ile Cys Tyr Leu Gln Glu Ile Phe Ser Asn Glu Met
290 295 300
Ala Lys Val Asp Asp Ser Phe Phe His Arg Leu Glu Glu Ser Phe Leu
305 310 315 320
Val Glu Glu Asp Lys Lys His Glu Arg His Pro Ile Phe Gly Asn Ile
325 330 335
Val Asp Glu Val Ala Tyr His Glu Lys Tyr Pro Thr Ile Tyr His Leu
340 345 350
Arg Lys Lys Leu Val Asp Ser Thr Asp Lys Ala Asp Leu Arg Leu Ile
355 360 365
Tyr Leu Ala Leu Ala His Met Ile Lys Phe Arg Gly His Phe Leu Ile
370 375 380
Glu Gly Asp Leu Asn Pro Asp Asn Ser Asp Val Asp Lys Leu Phe Ile
385 390 395 400
Gln Leu Val Gln Thr Tyr Asn Gln Leu Phe Glu Glu Asn Pro Ile Asn
405 410 415
Ala Ser Gly Val Asp Ala Lys Ala Ile Leu Ser Ala Arg Leu Ser Lys
420 425 430
Ser Arg Arg Leu Glu Asn Leu Ile Ala Gln Leu Pro Gly Glu Lys Lys
435 440 445
Asn Gly Leu Phe Gly Asn Leu Ile Ala Leu Ser Leu Gly Leu Thr Pro
450 455 460
Asn Phe Lys Ser Asn Phe Asp Leu Ala Glu Asp Ala Lys Leu Gln Leu
465 470 475 480
Ser Lys Asp Thr Tyr Asp Asp Asp Leu Asp Asn Leu Leu Ala Gln Ile
485 490 495
Gly Asp Gln Tyr Ala Asp Leu Phe Leu Ala Ala Lys Asn Leu Ser Asp
500 505 510
Ala Ile Leu Leu Ser Asp Ile Leu Arg Val Asn Thr Glu Ile Thr Lys
515 520 525
Ala Pro Leu Ser Ala Ser Met Ile Lys Arg Tyr Asp Glu His His Gln
530 535 540
Asp Leu Thr Leu Leu Lys Ala Leu Val Arg Gln Gln Leu Pro Glu Lys
545 550 555 560
Tyr Lys Glu Ile Phe Phe Asp Gln Ser Lys Asn Gly Tyr Ala Gly Tyr
565 570 575
Ile Asp Gly Gly Ala Ser Gln Glu Glu Phe Tyr Lys Phe Ile Lys Pro
580 585 590
Ile Leu Glu Lys Met Asp Gly Thr Glu Glu Leu Leu Val Lys Leu Asn
595 600 605
Arg Glu Asp Leu Leu Arg Lys Gln Arg Thr Phe Asp Asn Gly Ser Ile
610 615 620
Pro His Gln Ile His Leu Gly Glu Leu His Ala Ile Leu Arg Arg Gln
625 630 635 640
Glu Asp Phe Tyr Pro Phe Leu Lys Asp Asn Arg Glu Lys Ile Glu Lys
645 650 655
Ile Leu Thr Phe Arg Ile Pro Tyr Tyr Val Gly Pro Leu Ala Arg Gly
660 665 670
Asn Ser Arg Phe Ala Trp Met Thr Arg Lys Ser Glu Glu Thr Ile Thr
675 680 685
Pro Trp Asn Phe Glu Glu Val Val Asp Lys Gly Ala Ser Ala Gln Ser
690 695 700
Phe Ile Glu Arg Met Thr Asn Phe Asp Lys Asn Leu Pro Asn Glu Lys
705 710 715 720
Val Leu Pro Lys His Ser Leu Leu Tyr Glu Tyr Phe Thr Val Tyr Asn
725 730 735
Glu Leu Thr Lys Val Lys Tyr Val Thr Glu Gly Met Arg Lys Pro Ala
740 745 750
Phe Leu Ser Gly Glu Gln Lys Lys Ala Ile Val Asp Leu Leu Phe Lys
755 760 765
Thr Asn Arg Lys Val Thr Val Lys Gln Leu Lys Glu Asp Tyr Phe Lys
770 775 780
Lys Ile Glu Cys Phe Asp Ser Val Glu Ile Ser Gly Val Glu Asp Arg
785 790 795 800
Phe Asn Ala Ser Leu Gly Thr Tyr His Asp Leu Leu Lys Ile Ile Lys
805 810 815
Asp Lys Asp Phe Leu Asp Asn Glu Glu Asn Glu Asp Ile Leu Glu Asp
820 825 830
Ile Val Leu Thr Leu Thr Leu Phe Glu Asp Arg Glu Met Ile Glu Glu
835 840 845
Arg Leu Lys Thr Tyr Ala His Leu Phe Asp Asp Lys Val Met Lys Gln
850 855 860
Leu Lys Arg Arg Arg Tyr Thr Gly Trp Gly Arg Leu Ser Arg Lys Leu
865 870 875 880
Ile Asn Gly Ile Arg Asp Lys Gln Ser Gly Lys Thr Ile Leu Asp Phe
885 890 895
Leu Lys Ser Asp Gly Phe Ala Asn Arg Asn Phe Met Gln Leu Ile His
900 905 910
Asp Asp Ser Leu Thr Phe Lys Glu Asp Ile Gln Lys Ala Gln Val Ser
915 920 925
Gly Gln Gly Asp Ser Leu His Glu His Ile Ala Asn Leu Ala Gly Ser
930 935 940
Pro Ala Ile Lys Lys Gly Ile Leu Gln Thr Val Lys Val Val Asp Glu
945 950 955 960
Leu Val Lys Val Met Gly Arg His Lys Pro Glu Asn Ile Val Ile Glu
965 970 975
Met Ala Arg Glu Asn Gln Thr Thr Gln Lys Gly Gln Lys Asn Ser Arg
980 985 990
Glu Arg Met Lys Arg Ile Glu Glu Gly Ile Lys Glu Leu Gly Ser Gln
995 1000 1005
Ile Leu Lys Glu His Pro Val Glu Asn Thr Gln Leu Gln Asn Glu Lys
1010 1015 1020
Leu Tyr Leu Tyr Tyr Leu Gln Asn Gly Arg Asp Met Tyr Val Asp Gln
1025 1030 1035 1040
Glu Leu Asp Ile Asn Arg Leu Ser Asp Tyr Asp Val Asp His Ile Val
1045 1050 1055
Pro Gln Ser Phe Leu Lys Asp Asp Ser Ile Asp Asn Lys Val Leu Thr
1060 1065 1070
Arg Ser Asp Lys Asn Arg Gly Lys Ser Asp Asn Val Pro Ser Glu Glu
1075 1080 1085
Val Val Lys Lys Met Lys Asn Tyr Trp Arg Gln Leu Leu Asn Ala Lys
1090 1095 1100
Leu Ile Thr Gln Arg Lys Phe Asp Asn Leu Thr Lys Ala Glu Arg Gly
1105 1110 1115 1120
Gly Leu Ser Glu Leu Asp Lys Ala Gly Phe Ile Lys Arg Gln Leu Val
1125 1130 1135
Glu Thr Arg Gln Ile Thr Lys His Val Ala Gln Ile Leu Asp Ser Arg
1140 1145 1150
Met Asn Thr Lys Tyr Asp Glu Asn Asp Lys Leu Ile Arg Glu Val Lys
1155 1160 1165
Val Ile Thr Leu Lys Ser Lys Leu Val Ser Asp Phe Arg Lys Asp Phe
1170 1175 1180
Gln Phe Tyr Lys Val Arg Glu Ile Asn Asn Tyr His His Ala His Asp
1185 1190 1195 1200
Ala Tyr Leu Asn Ala Val Val Gly Thr Ala Leu Ile Lys Lys Tyr Pro
1205 1210 1215
Lys Leu Glu Ser Glu Phe Val Tyr Gly Asp Tyr Lys Val Tyr Asp Val
1220 1225 1230
Arg Lys Met Ile Ala Lys Ser Glu Gln Glu Ile Gly Lys Ala Thr Ala
1235 1240 1245
Lys Tyr Phe Phe Tyr Ser Asn Ile Met Asn Phe Phe Lys Thr Glu Ile
1250 1255 1260
Thr Leu Ala Asn Gly Glu Ile Arg Lys Arg Pro Leu Ile Glu Thr Asn
1265 1270 1275 1280
Gly Glu Thr Gly Glu Ile Val Trp Asp Lys Gly Arg Asp Phe Ala Thr
1285 1290 1295
Val Arg Lys Val Leu Ser Met Pro Gln Val Asn Ile Val Lys Lys Thr
1300 1305 1310
Glu Val Gln Thr Gly Gly Phe Ser Lys Glu Ser Ile Leu Pro Lys Arg
1315 1320 1325
Asn Ser Asp Lys Leu Ile Ala Arg Lys Lys Asp Trp Asp Pro Lys Lys
1330 1335 1340
Tyr Gly Gly Phe Asp Ser Pro Thr Val Ala Tyr Ser Val Leu Val Val
1345 1350 1355 1360
Ala Lys Val Glu Lys Gly Lys Ser Lys Lys Leu Lys Ser Val Lys Glu
1365 1370 1375
Leu Leu Gly Ile Thr Ile Met Glu Arg Ser Ser Phe Glu Lys Asn Pro
1380 1385 1390
Ile Asp Phe Leu Glu Ala Lys Gly Tyr Lys Glu Val Lys Lys Asp Leu
1395 1400 1405
Ile Ile Lys Leu Pro Lys Tyr Ser Leu Phe Glu Leu Glu Asn Gly Arg
1410 1415 1420
Lys Arg Met Leu Ala Ser Ala Gly Glu Leu Gln Lys Gly Asn Glu Leu
1425 1430 1435 1440
Ala Leu Pro Ser Lys Tyr Val Asn Phe Leu Tyr Leu Ala Ser His Tyr
1445 1450 1455
Glu Lys Leu Lys Gly Ser Pro Glu Asp Asn Glu Gln Lys Gln Leu Phe
1460 1465 1470
Val Glu Gln His Lys His Tyr Leu Asp Glu Ile Ile Glu Gln Ile Ser
1475 1480 1485
Glu Phe Ser Lys Arg Val Ile Leu Ala Asp Ala Asn Leu Asp Lys Val
1490 1495 1500
Leu Ser Ala Tyr Asn Lys His Arg Asp Lys Pro Ile Arg Glu Gln Ala
1505 1510 1515 1520
Glu Asn Ile Ile His Leu Phe Thr Leu Thr Asn Leu Gly Ala Pro Ala
1525 1530 1535
Ala Phe Lys Tyr Phe Asp Thr Thr Ile Asp Arg Lys Arg Tyr Thr Ser
1540 1545 1550
Thr Lys Glu Val Leu Asp Ala Thr Leu Ile His Gln Ser Ile Thr Gly
1555 1560 1565
Leu Tyr Glu Thr Arg Ile Asp Leu Ser Gln Leu Gly Gly Asp Ser Gly
1570 1575 1580
Gly Ser Thr Asn Leu Ser Asp Ile Ile Glu Lys Glu Thr Gly Lys Gln
1585 1590 1595 1600
Leu Val Ile Gln Glu Ser Ile Leu Met Leu Pro Glu Glu Val Glu Glu
1605 1610 1615
Val Ile Gly Asn Lys Pro Glu Ser Asp Ile Leu Val His Thr Ala Tyr
1620 1625 1630
Asp Glu Ser Thr Asp Glu Asn Val Met Leu Leu Thr Ser Asp Ala Pro
1635 1640 1645
Glu Tyr Lys Pro Trp Ala Leu Val Ile Gln Asp Ser Asn Gly Glu Asn
1650 1655 1660
Lys Ile Lys Met Leu Ser Gly Gly Ser Pro Lys Lys Lys Arg Lys Val
1665 1670 1675 1680
<210> 20
<211> 1680
<212> PRT
<213> Artificial Sequence
<400> 20
Met Glu Ala Ser Pro Ala Ser Gly Pro Arg His Leu Met Asp Pro His
1 5 10 15
Ile Phe Thr Ser Asn Phe Asn Asn Gly Ile Gly Arg His Lys Thr Tyr
20 25 30
Leu Cys Tyr Glu Val Glu Arg Leu Asp Asn Gly Thr Ser Val Lys Met
35 40 45
Asp Gln His Arg Gly Phe Leu His Asn Gln Ala Lys Asn Leu Leu Cys
50 55 60
Gly Phe Tyr Gly Arg His Ala Glu Leu Arg Phe Leu Asp Leu Val Pro
65 70 75 80
Ser Leu Gln Leu Asp Pro Ala Gln Ile Tyr Arg Val Thr Trp Phe Ile
85 90 95
Ser Trp Ser Pro Cys Phe Ser Trp Gly Cys Ala Gly Glu Val Arg Ala
100 105 110
Phe Leu Gln Glu Asn Thr His Val Arg Leu Arg Ile Phe Ala Ala Arg
115 120 125
Ile Tyr Tyr Tyr Asp Pro Leu Tyr Lys Glu Ala Leu Gln Met Leu Arg
130 135 140
Asp Ala Gly Ala Gln Val Ser Ile Met Thr Tyr Asp Glu Phe Lys His
145 150 155 160
Cys Trp Asp Thr Phe Val Asp His Gln Gly Cys Pro Phe Gln Pro Trp
165 170 175
Asp Gly Leu Asp Glu His Ser Gln Ala Leu Ser Gly Arg Leu Arg Ala
180 185 190
Ile Leu Gln Asn Gln Gly Asn Ser Gly Ser Glu Thr Pro Gly Thr Ser
195 200 205
Glu Ser Ala Thr Pro Glu Ser Asp Lys Lys Tyr Ser Ile Gly Leu Ala
210 215 220
Ile Gly Thr Asn Ser Val Gly Trp Ala Val Ile Thr Asp Glu Tyr Lys
225 230 235 240
Val Pro Ser Lys Lys Phe Lys Val Leu Gly Asn Thr Asp Arg His Ser
245 250 255
Ile Lys Lys Asn Leu Ile Gly Ala Leu Leu Phe Asp Ser Gly Glu Thr
260 265 270
Ala Glu Ala Thr Arg Leu Lys Arg Thr Ala Arg Arg Arg Tyr Thr Arg
275 280 285
Arg Lys Asn Arg Ile Cys Tyr Leu Gln Glu Ile Phe Ser Asn Glu Met
290 295 300
Ala Lys Val Asp Asp Ser Phe Phe His Arg Leu Glu Glu Ser Phe Leu
305 310 315 320
Val Glu Glu Asp Lys Lys His Glu Arg His Pro Ile Phe Gly Asn Ile
325 330 335
Val Asp Glu Val Ala Tyr His Glu Lys Tyr Pro Thr Ile Tyr His Leu
340 345 350
Arg Lys Lys Leu Val Asp Ser Thr Asp Lys Ala Asp Leu Arg Leu Ile
355 360 365
Tyr Leu Ala Leu Ala His Met Ile Lys Phe Arg Gly His Phe Leu Ile
370 375 380
Glu Gly Asp Leu Asn Pro Asp Asn Ser Asp Val Asp Lys Leu Phe Ile
385 390 395 400
Gln Leu Val Gln Thr Tyr Asn Gln Leu Phe Glu Glu Asn Pro Ile Asn
405 410 415
Ala Ser Gly Val Asp Ala Lys Ala Ile Leu Ser Ala Arg Leu Ser Lys
420 425 430
Ser Arg Arg Leu Glu Asn Leu Ile Ala Gln Leu Pro Gly Glu Lys Lys
435 440 445
Asn Gly Leu Phe Gly Asn Leu Ile Ala Leu Ser Leu Gly Leu Thr Pro
450 455 460
Asn Phe Lys Ser Asn Phe Asp Leu Ala Glu Asp Ala Lys Leu Gln Leu
465 470 475 480
Ser Lys Asp Thr Tyr Asp Asp Asp Leu Asp Asn Leu Leu Ala Gln Ile
485 490 495
Gly Asp Gln Tyr Ala Asp Leu Phe Leu Ala Ala Lys Asn Leu Ser Asp
500 505 510
Ala Ile Leu Leu Ser Asp Ile Leu Arg Val Asn Thr Glu Ile Thr Lys
515 520 525
Ala Pro Leu Ser Ala Ser Met Ile Lys Arg Tyr Asp Glu His His Gln
530 535 540
Asp Leu Thr Leu Leu Lys Ala Leu Val Arg Gln Gln Leu Pro Glu Lys
545 550 555 560
Tyr Lys Glu Ile Phe Phe Asp Gln Ser Lys Asn Gly Tyr Ala Gly Tyr
565 570 575
Ile Asp Gly Gly Ala Ser Gln Glu Glu Phe Tyr Lys Phe Ile Lys Pro
580 585 590
Ile Leu Glu Lys Met Asp Gly Thr Glu Glu Leu Leu Val Lys Leu Asn
595 600 605
Arg Glu Asp Leu Leu Arg Lys Gln Arg Thr Phe Asp Asn Gly Ser Ile
610 615 620
Pro His Gln Ile His Leu Gly Glu Leu His Ala Ile Leu Arg Arg Gln
625 630 635 640
Glu Asp Phe Tyr Pro Phe Leu Lys Asp Asn Arg Glu Lys Ile Glu Lys
645 650 655
Ile Leu Thr Phe Arg Ile Pro Tyr Tyr Val Gly Pro Leu Ala Arg Gly
660 665 670
Asn Ser Arg Phe Ala Trp Met Thr Arg Lys Ser Glu Glu Thr Ile Thr
675 680 685
Pro Trp Asn Phe Glu Glu Val Val Asp Lys Gly Ala Ser Ala Gln Ser
690 695 700
Phe Ile Glu Arg Met Thr Asn Phe Asp Lys Asn Leu Pro Asn Glu Lys
705 710 715 720
Val Leu Pro Lys His Ser Leu Leu Tyr Glu Tyr Phe Thr Val Tyr Asn
725 730 735
Glu Leu Thr Lys Val Lys Tyr Val Thr Glu Gly Met Arg Lys Pro Ala
740 745 750
Phe Leu Ser Gly Glu Gln Lys Lys Ala Ile Val Asp Leu Leu Phe Lys
755 760 765
Thr Asn Arg Lys Val Thr Val Lys Gln Leu Lys Glu Asp Tyr Phe Lys
770 775 780
Lys Ile Glu Cys Phe Asp Ser Val Glu Ile Ser Gly Val Glu Asp Arg
785 790 795 800
Phe Asn Ala Ser Leu Gly Thr Tyr His Asp Leu Leu Lys Ile Ile Lys
805 810 815
Asp Lys Asp Phe Leu Asp Asn Glu Glu Asn Glu Asp Ile Leu Glu Asp
820 825 830
Ile Val Leu Thr Leu Thr Leu Phe Glu Asp Arg Glu Met Ile Glu Glu
835 840 845
Arg Leu Lys Thr Tyr Ala His Leu Phe Asp Asp Lys Val Met Lys Gln
850 855 860
Leu Lys Arg Arg Arg Tyr Thr Gly Trp Gly Arg Leu Ser Arg Lys Leu
865 870 875 880
Ile Asn Gly Ile Arg Asp Lys Gln Ser Gly Lys Thr Ile Leu Asp Phe
885 890 895
Leu Lys Ser Asp Gly Phe Ala Asn Arg Asn Phe Met Gln Leu Ile His
900 905 910
Asp Asp Ser Leu Thr Phe Lys Glu Asp Ile Gln Lys Ala Gln Val Ser
915 920 925
Gly Gln Gly Asp Ser Leu His Glu His Ile Ala Asn Leu Ala Gly Ser
930 935 940
Pro Ala Ile Lys Lys Gly Ile Leu Gln Thr Val Lys Val Val Asp Glu
945 950 955 960
Leu Val Lys Val Met Gly Arg His Lys Pro Glu Asn Ile Val Ile Glu
965 970 975
Met Ala Arg Glu Asn Gln Thr Thr Gln Lys Gly Gln Lys Asn Ser Arg
980 985 990
Glu Arg Met Lys Arg Ile Glu Glu Gly Ile Lys Glu Leu Gly Ser Gln
995 1000 1005
Ile Leu Lys Glu His Pro Val Glu Asn Thr Gln Leu Gln Asn Glu Lys
1010 1015 1020
Leu Tyr Leu Tyr Tyr Leu Gln Asn Gly Arg Asp Met Tyr Val Asp Gln
1025 1030 1035 1040
Glu Leu Asp Ile Asn Arg Leu Ser Asp Tyr Asp Val Asp His Ile Val
1045 1050 1055
Pro Gln Ser Phe Leu Lys Asp Asp Ser Ile Asp Asn Lys Val Leu Thr
1060 1065 1070
Arg Ser Asp Lys Asn Arg Gly Lys Ser Asp Asn Val Pro Ser Glu Glu
1075 1080 1085
Val Val Lys Lys Met Lys Asn Tyr Trp Arg Gln Leu Leu Asn Ala Lys
1090 1095 1100
Leu Ile Thr Gln Arg Lys Phe Asp Asn Leu Thr Lys Ala Glu Arg Gly
1105 1110 1115 1120
Gly Leu Ser Glu Leu Asp Lys Ala Gly Phe Ile Lys Arg Gln Leu Val
1125 1130 1135
Glu Thr Arg Gln Ile Thr Lys His Val Ala Gln Ile Leu Asp Ser Arg
1140 1145 1150
Met Asn Thr Lys Tyr Asp Glu Asn Asp Lys Leu Ile Arg Glu Val Lys
1155 1160 1165
Val Ile Thr Leu Lys Ser Lys Leu Val Ser Asp Phe Arg Lys Asp Phe
1170 1175 1180
Gln Phe Tyr Lys Val Arg Glu Ile Asn Asn Tyr His His Ala His Asp
1185 1190 1195 1200
Ala Tyr Leu Asn Ala Val Val Gly Thr Ala Leu Ile Lys Lys Tyr Pro
1205 1210 1215
Lys Leu Glu Ser Glu Phe Val Tyr Gly Asp Tyr Lys Val Tyr Asp Val
1220 1225 1230
Arg Lys Met Ile Ala Lys Ser Glu Gln Glu Ile Gly Lys Ala Thr Ala
1235 1240 1245
Lys Tyr Phe Phe Tyr Ser Asn Ile Met Asn Phe Phe Lys Thr Glu Ile
1250 1255 1260
Thr Leu Ala Asn Gly Glu Ile Arg Lys Arg Pro Leu Ile Glu Thr Asn
1265 1270 1275 1280
Gly Glu Thr Gly Glu Ile Val Trp Asp Lys Gly Arg Asp Phe Ala Thr
1285 1290 1295
Val Arg Lys Val Leu Ser Met Pro Gln Val Asn Ile Val Lys Lys Thr
1300 1305 1310
Glu Val Gln Thr Gly Gly Phe Ser Lys Glu Ser Ile Leu Pro Lys Arg
1315 1320 1325
Asn Ser Asp Lys Leu Ile Ala Arg Lys Lys Asp Trp Asp Pro Lys Lys
1330 1335 1340
Tyr Gly Gly Phe Asp Ser Pro Thr Val Ala Tyr Ser Val Leu Val Val
1345 1350 1355 1360
Ala Lys Val Glu Lys Gly Lys Ser Lys Lys Leu Lys Ser Val Lys Glu
1365 1370 1375
Leu Leu Gly Ile Thr Ile Met Glu Arg Ser Ser Phe Glu Lys Asn Pro
1380 1385 1390
Ile Asp Phe Leu Glu Ala Lys Gly Tyr Lys Glu Val Lys Lys Asp Leu
1395 1400 1405
Ile Ile Lys Leu Pro Lys Tyr Ser Leu Phe Glu Leu Glu Asn Gly Arg
1410 1415 1420
Lys Arg Met Leu Ala Ser Ala Gly Glu Leu Gln Lys Gly Asn Glu Leu
1425 1430 1435 1440
Ala Leu Pro Ser Lys Tyr Val Asn Phe Leu Tyr Leu Ala Ser His Tyr
1445 1450 1455
Glu Lys Leu Lys Gly Ser Pro Glu Asp Asn Glu Gln Lys Gln Leu Phe
1460 1465 1470
Val Glu Gln His Lys His Tyr Leu Asp Glu Ile Ile Glu Gln Ile Ser
1475 1480 1485
Glu Phe Ser Lys Arg Val Ile Leu Ala Asp Ala Asn Leu Asp Lys Val
1490 1495 1500
Leu Ser Ala Tyr Asn Lys His Arg Asp Lys Pro Ile Arg Glu Gln Ala
1505 1510 1515 1520
Glu Asn Ile Ile His Leu Phe Thr Leu Thr Asn Leu Gly Ala Pro Ala
1525 1530 1535
Ala Phe Lys Tyr Phe Asp Thr Thr Ile Asp Arg Lys Arg Tyr Thr Ser
1540 1545 1550
Thr Lys Glu Val Leu Asp Ala Thr Leu Ile His Gln Ser Ile Thr Gly
1555 1560 1565
Leu Tyr Glu Thr Arg Ile Asp Leu Ser Gln Leu Gly Gly Asp Ser Gly
1570 1575 1580
Gly Ser Thr Asn Leu Ser Asp Ile Ile Glu Lys Glu Thr Gly Lys Gln
1585 1590 1595 1600
Leu Val Ile Gln Glu Ser Ile Leu Met Leu Pro Glu Glu Val Glu Glu
1605 1610 1615
Val Ile Gly Asn Lys Pro Glu Ser Asp Ile Leu Val His Thr Ala Tyr
1620 1625 1630
Asp Glu Ser Thr Asp Glu Asn Val Met Leu Leu Thr Ser Asp Ala Pro
1635 1640 1645
Glu Tyr Lys Pro Trp Ala Leu Val Ile Gln Asp Ser Asn Gly Glu Asn
1650 1655 1660
Lys Ile Lys Met Leu Ser Gly Gly Ser Pro Lys Lys Lys Arg Lys Val
1665 1670 1675 1680
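As a rough illustration of how a PRT record in this listing can be read programmatically, the Python sketch below converts three-letter residue lines into a one-letter string and skips the position rulers. The helper names (`parse_listing`, `within_size_limit`) are hypothetical and not part of the patent. The 2500-residue ceiling comes from claim 3, and 1680 is the `<211>` length declared for SEQ ID NO: 20.

```python
# Three-letter to one-letter codes for the 20 standard amino acids.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}


def parse_listing(lines):
    """Join the residue lines of a listing record, skipping ruler lines."""
    residues = []
    for line in lines:
        tokens = line.split()
        # Ruler lines hold only integers (e.g. "1 5 10 15"); ignore them.
        if not tokens or all(tok.isdigit() for tok in tokens):
            continue
        residues.extend(THREE_TO_ONE[tok] for tok in tokens)
    return "".join(residues)


def within_size_limit(length, limit=2500):
    """Check a protein length against the ceiling stated in claim 3."""
    return length <= limit


# First two residue lines of SEQ ID NO: 20, copied from the listing.
sample = [
    "Met Glu Ala Ser Pro Ala Ser Gly Pro Arg His Leu Met Asp Pro His",
    "1 5 10 15",
    "Ile Phe Thr Ser Asn Phe Asn Asn Gly Ile Gly Arg His Lys Thr Tyr",
    "20 25 30",
]
protein = parse_listing(sample)
print(protein, len(protein))
print(within_size_limit(1680))  # declared <211> length of SEQ ID NO: 20
```

Feeding the full record through `parse_listing` and comparing the result's length with the `<211>` value is a simple consistency check on a copied listing.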
<210> 21
<211> 8442
<212> DNA
<213> Artificial Sequence
<400> 21
atatgccaag tacgccccct attgacgtca atgacggtaa atggcccgcc tggcattatg 60
cccagtacat gaccttatgg gactttccta cttggcagta catctacgta ttagtcatcg 120
ctattaccat ggtgatgcgg ttttggcagt acatcaatgg gcgtggatag cggtttgact 180
cacggggatt tccaagtctc caccccattg acgtcaatgg gagtttgttt tggcaccaaa 240
atcaacggga ctttccaaaa tgtcgtaaca actccgcccc attgacgcaa atgggcggta 300
ggcgtgtacg gtgggaggtc tatataagca gagctggttt agtgaaccgt cagatccgct 360
agagatccgc ggccgctaat acgactcact atagggagag ccgccaccat ggaagccagc 420
ccagcatccg ggcccagaca cttgatggat ccacacatat tcacttccaa ctttaacaat 480
ggcattggaa ggcataagac ctacctgtgc tacgaagtgg agcgcctgga caatggcacc 540
tcggtcaaga tggaccagca caggggcttt ctacacaacc aggctaagaa tcttctctgt 600
ggcttttacg gccgccatgc ggagctgcgc ttcttggacc tggttccttc tttgcagttg 660
gacccggccc agatctacag ggtcacttgg ttcatctcct ggagcccctg cttctcctgg 720
ggctgtgccg gggaagtgcg tgcgttcctt caggagaaca cacacgtgag actgcgtatc 780
ttcgctgccc gcatctatga ttacgacccc ctatataagg aggcactgca aatgctgcgg 840
gatgctgggg cccaagtctc catcatgacc tacgatgaat ttaagcactg ctgggacacc 900
tttgtggacc accagggatg tcccttccag ccctgggatg gactagatga gcacagccaa 960
gccctgagtg ggaggctgcg ggccattctc cagaatcagg gaaacagcgg cagcgagact 1020
cccgggacct cagagtccgc cacacccgaa agtgataaaa agtattctat tggtttagcc 1080
atcggcacta attccgttgg atgggctgtc ataaccgatg aatacaaagt accttcaaag 1140
aaatttaagg tgttggggaa cacagaccgt cattcgatta aaaagaatct tatcggtgcc 1200
ctcctattcg atagtggcga aacggcagag gcgactcgcc tgaaacgaac cgctcggaga 1260
aggtatacac gtcgcaagaa ccgaatatgt tacttacaag aaatttttag caatgagatg 1320
gccaaagttg acgattcttt ctttcaccgt ttggaagagt ccttccttgt cgaagaggac 1380
aagaaacatg aacggcaccc catctttgga aacatagtag atgaggtggc atatcatgaa 1440
aagtacccaa cgatttatca cctcagaaaa aagctagttg actcaactga taaagcggac 1500
ctgaggttaa tctacttggc tcttgcccat atgataaagt tccgtgggca ctttctcatt 1560
gagggtgatc taaatccgga caactcggat gtcgacaaac tgttcatcca gttagtacaa 1620
acctataatc agttgtttga agagaaccct ataaatgcaa gtggcgtgga tgcgaaggct 1680
attcttagcg cccgcctctc taaatcccga cggctagaaa acctgatcgc acaattaccc 1740
ggagagaaga aaaatgggtt gttcggtaac cttatagcgc tctcactagg cctgacacca 1800
aattttaagt cgaacttcga cttagctgaa gatgccaaat tgcagcttag taaggacacg 1860
tacgatgacg atctcgacaa tctactggca caaattggag atcagtatgc ggacttattt 1920
ttggctgcca aaaaccttag cgatgcaatc ctcctatctg acatactgag agttaatact 1980
gagattacca aggcgccgtt atccgcttca atgatcaaaa ggtacgatga acatcaccaa 2040
gacttgacac ttctcaaggc cctagtccgt cagcaactgc ctgagaaata taaggaaata 2100
ttctttgatc agtcgaaaaa cgggtacgca ggttatattg acggcggagc gagtcaagag 2160
gaattctaca agtttatcaa acccatatta gagaagatgg atgggacgga agagttgctt 2220
gtaaaactca atcgcgaaga tctactgcga aagcagcgga ctttcgacaa cggtagcatt 2280
ccacatcaaa tccacttagg cgaattgcat gctatactta gaaggcagga ggatttttat 2340
ccgttcctca aagacaatcg tgaaaagatt gagaaaatcc taacctttcg cataccttac 2400
tatgtgggac ccctggcccg agggaactct cggttcgcat ggatgacaag aaagtccgaa 2460
gaaacgatta ctccatggaa ttttgaggaa gttgtcgata aaggtgcgtc agctcaatcg 2520
ttcatcgaga ggatgaccaa ctttgacaag aatttaccga acgaaaaagt attgcctaag 2580
cacagtttac tttacgagta tttcacagtg tacaatgaac tcacgaaagt taagtatgtc 2640
actgagggca tgcgtaaacc cgcctttcta agcggagaac agaagaaagc aatagtagat 2700
ctgttattca agaccaaccg caaagtgaca gttaagcaat tgaaagagga ctactttaag 2760
aaaattgaat gcttcgattc tgtcgagatc tccggggtag aagatcgatt taatgcgtca 2820
cttggtacgt atcatgacct cctaaagata attaaagata aggacttcct ggataacgaa 2880
gagaatgaag atatcttaga agatatagtg ttgactctta ccctctttga agatcgggaa 2940
atgattgagg aaagactaaa aacatacgct cacctgttcg acgataaggt tatgaaacag 3000
ttaaagaggc gtcgctatac gggctgggga cgattgtcgc ggaaacttat caacgggata 3060
agagacaagc aaagtggtaa aactattctc gattttctaa agagcgacgg cttcgccaat 3120
aggaacttta tgcagctgat ccatgatgac tctttaacct tcaaagagga tatacaaaag 3180
gcacaggttt ccggacaagg ggactcattg cacgaacata ttgcgaatct tgctggttcg 3240
ccagccatca aaaagggcat actccagaca gtcaaagtag tggatgagct agttaaggtc 3300
atgggacgtc acaaaccgga aaacattgta atcgagatgg cacgcgaaaa tcaaacgact 3360
cagaaggggc aaaaaaacag tcgagagcgg atgaagagaa tagaagaggg tattaaagaa 3420
ctgggcagcc agatcttaaa ggagcatcct gtggaaaata cccaattgca gaacgagaaa 3480
ctttacctct attacctaca aaatggaagg gacatgtatg ttgatcagga actggacata 3540
aaccgtttat ctgattacga cgtcgatcac attgtacccc aatccttttt gaaggacgat 3600
tcaatcgaca ataaagtgct tacacgctcg gataagaacc gagggaaaag tgacaatgtt 3660
ccaagcgagg aagtcgtaaa gaaaatgaag aactattggc ggcagctcct aaatgcgaaa 3720
ctgataacgc aaagaaagtt cgataactta actaaagctg agaggggtgg cttgtctgaa 3780
cttgacaagg ccggatttat taaacgtcag ctcgtggaaa cccgccaaat cacaaagcat 3840
gttgcacaga tactagattc ccgaatgaat acgaaatacg acgagaacga taagctgatt 3900
cgggaagtca aagtaatcac tttaaagtca aaattggtgt cggacttcag aaaggatttt 3960
caattctata aagttaggga gataaataac taccaccatg cgcacgacgc ttatcttaat 4020
gccgtcgtag ggaccgcact cattaagaaa tacccgaagc tagaaagtga gtttgtgtat 4080
ggtgattaca aagtttatga cgtccgtaag atgatcgcga aaagcgaaca ggagataggc 4140
aaggctacag ccaaatactt cttttattct aacattatga atttctttaa gacggaaatc 4200
actctggcaa acggagagat acgcaaacga cctttaattg aaaccaatgg ggagacaggt 4260
gaaatcgtat gggataaggg ccgggacttc gcgacggtga gaaaagtttt gtccatgccc 4320
caagtcaaca tagtaaagaa aactgaggtg cagaccggag ggttttcaaa ggaatcgatt 4380
cttccaaaaa ggaatagtga taagctcatc gctcgtaaaa aggactggga cccgaaaaag 4440
tacggtggct tcgatagccc tacagttgcc tattctgtcc tagtagtggc aaaagttgag 4500
aagggaaaat ccaagaaact gaagtcagtc aaagaattat tggggataac gattatggag 4560
cgctcgtctt ttgaaaagaa ccccatcgac ttccttgagg cgaaaggtta caaggaagta 4620
aaaaaggatc tcataattaa actaccaaag tatagtctgt ttgagttaga aaatggccga 4680
aaacggatgt tggctagcgc cggagagctt caaaagggga acgaactcgc actaccgtct 4740
aaatacgtga atttcctgta tttagcgtcc cattacgaga agttgaaagg ttcacctgaa 4800
gataacgaac agaagcaact ttttgttgag cagcacaaac attatctcga cgaaatcata 4860
gagcaaattt cggaattcag taagagagtc atcctagctg atgccaatct ggacaaagta 4920
ttaagcgcat acaacaagca cagggataaa cccatacgtg agcaggcgga aaatattatc 4980
catttgttta ctcttaccaa cctcggcgct ccagccgcat tcaagtattt tgacacaacg 5040
atagatcgca aacgatacac ttctaccaag gaggtgctag acgcgacact gattcaccaa 5100
tccatcacgg gattatatga aactcggata gatttgtcac agcttggggg tgactctggt 5160
ggttctacta atctgtcaga tattattgaa aaggagaccg gtaagcaact ggttatccag 5220
gaatccatcc tcatgctccc agaggaggtg gaagaagtca ttgggaacaa gccggaaagc 5280
gatatactcg tgcacaccgc ctacgacgag agcaccgacg agaatgtcat gcttctgact 5340
agcgacgccc ctgaatacaa gccttgggct ctggtcatac aggatagcaa cggtgagaac 5400
aagattaaga tgctctctgg tggttctccc aagaagaaga ggaaagtcta accggtcatc 5460
atcaccatca ccattgagtt taaacccgct gatcagcctc gactgtgcct tctagttgcc 5520
agccatctgt tgtttgcccc tcccccgtgc cttccttgac cctggaaggt gccactccca 5580
ctgtcctttc ctaataaaat gaggaaattg catcgcattg tctgagtagg tgtcattcta 5640
ttctgggggg tggggtgggg caggacagca agggggagga ttgggaagac aatagcaggc 5700
atgctgggga tgcggtgggc tctatggctt ctgaggcgga aagaaccagc tggggctcga 5760
taccgtcgac ctctagctag agcttggcgt aatcatggtc atagctgttt cctgtgtgaa 5820
attgttatcc gctcacaatt ccacacaaca tacgagccgg aagcataaag tgtaaagcct 5880
agggtgccta atgagtgagc taactcacat taattgcgtt gcgctcactg cccgctttcc 5940
agtcgggaaa cctgtcgtgc cagctgcatt aatgaatcgg ccaacgcgcg gggagaggcg 6000
gtttgcgtat tgggcgctct tccgcttcct cgctcactga ctcgctgcgc tcggtcgttc 6060
ggctgcggcg agcggtatca gctcactcaa aggcggtaat acggttatcc acagaatcag 6120
gggataacgc aggaaagaac atgtgagcaa aaggccagca aaaggccagg aaccgtaaaa 6180
aggccgcgtt gctggcgttt ttccataggc tccgcccccc tgacgagcat cacaaaaatc 6240
gacgctcaag tcagaggtgg cgaaacccga caggactata aagataccag gcgtttcccc 6300
ctggaagctc cctcgtgcgc tctcctgttc cgaccctgcc gcttaccgga tacctgtccg 6360
cctttctccc ttcgggaagc gtggcgcttt ctcatagctc acgctgtagg tatctcagtt 6420
cggtgtaggt cgttcgctcc aagctgggct gtgtgcacga accccccgtt cagcccgacc 6480
gctgcgcctt atccggtaac tatcgtcttg agtccaaccc ggtaagacac gacttatcgc 6540
cactggcagc agccactggt aacaggatta gcagagcgag gtatgtaggc ggtgctacag 6600
agttcttgaa gtggtggcct aactacggct acactagaag aacagtattt ggtatctgcg 6660
ctctgctgaa gccagttacc ttcggaaaaa gagttggtag ctcttgatcc ggcaaacaaa 6720
ccaccgctgg tagcggtggt ttttttgttt gcaagcagca gattacgcgc agaaaaaaag 6780
gatctcaaga agatcctttg atcttttcta cggggtctga cgctcagtgg aacgaaaact 6840
cacgttaagg gattttggtc atgagattat caaaaaggat cttcacctag atccttttaa 6900
attaaaaatg aagttttaaa tcaatctaaa gtatatatga gtaaacttgg tctgacagtt 6960
accaatgctt aatcagtgag gcacctatct cagcgatctg tctatttcgt tcatccatag 7020
ttgcctgact ccccgtcgtg tagataacta cgatacggga gggcttacca tctggcccca 7080
gtgctgcaat gataccgcga gacccacgct caccggctcc agatttatca gcaataaacc 7140
agccagccgg aagggccgag cgcagaagtg gtcctgcaac tttatccgcc tccatccagt 7200
ctattaattg ttgccgggaa gctagagtaa gtagttcgcc agttaatagt ttgcgcaacg 7260
ttgttgccat tgctacaggc atcgtggtgt cacgctcgtc gtttggtatg gcttcattca 7320
gctccggttc ccaacgatca aggcgagtta catgatcccc catgttgtgc aaaaaagcgg 7380
ttagctcctt cggtcctccg atcgttgtca gaagtaagtt ggccgcagtg ttatcactca 7440
tggttatggc agcactgcat aattctctta ctgtcatgcc atccgtaaga tgcttttctg 7500
tgactggtga gtactcaacc aagtcattct gagaatagtg tatgcggcga ccgagttgct 7560
cttgcccggc gtcaatacgg gataataccg cgccacatag cagaacttta aaagtgctca 7620
tcattggaaa acgttcttcg gggcgaaaac tctcaaggat cttaccgctg ttgagatcca 7680
gttcgatgta acccactcgt gcacccaact gatcttcagc atcttttact ttcaccagcg 7740
tttctgggtg agcaaaaaca ggaaggcaaa atgccgcaaa aaagggaata agggcgacac 7800
ggaaatgttg aatactcata ctcttccttt ttcaatatta ttgaagcatt tatcagggtt 7860
attgtctcat gagcggatac atatttgaat gtatttagaa aaataaacaa ataggggttc 7920
cgcgcacatt tccccgaaaa gtgccacctg acgtcgacgg atcgggagat cgatctcccg 7980
atcccctagg gtcgactctc agtacaatct gctctgatgc cgcatagtta agccagtatc 8040
tgctccctgc ttgtgtgttg gaggtcgctg agtagtgcgc gagcaaaatt taagctacaa 8100
caaggcaagg cttgaccgac aattgcatga agaatctgct tagggttagg cgttttgcgc 8160
tgcttcgcga tgtacgggcc agatatacgc gttgacattg attattgact agttattaat 8220
agtaatcaat tacggggtca ttagttcata gcccatatat ggagttccgc gttacataac 8280
ttacggtaaa tggcccgcct ggctgaccgc ccaacgaccc ccgcccattg acgtcaataa 8340
tgacgtatgt tcccatagta acgccaatag ggactttcca ttgacgtcaa tgggtggagt 8400
atttacggta aactgcccac ttggcagtac atcaagtgta tc 8442
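The coding region in SEQ ID NO: 21 appears to begin at the `atg` that spans the last two blocks of the line ending at position 420. The Python sketch below translates the first 16 codons from that point with the standard genetic code, and the result can be compared with the first residue line of SEQ ID NO: 20. The function name `translate` and the choice of fragment are illustrative assumptions. Only these 16 codons are checked here, so nothing is implied about the rest of the open reading frame.

```python
from itertools import product

# Standard genetic code, built in the conventional T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def translate(dna):
    """Translate DNA in frame 0 until a stop codon or the end of input."""
    dna = dna.replace(" ", "").upper()
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# 48 nt copied from SEQ ID NO: 21, starting at the presumed start codon
# (end of the line numbered 420, continuing into the line numbered 480).
orf_fragment = "at ggaagccagc ccagcatccg ggcccagaca cttgatggat ccacac"
print(translate(orf_fragment))
```

The translated fragment can be read against "Met Glu Ala Ser Pro Ala Ser Gly Pro Arg His Leu Met Asp Pro His", the first residue line of SEQ ID NO: 20.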

Claims (18)

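1. A fusion protein, characterized in that the fusion protein comprises a first fragment and a second fragment, wherein the first fragment is located N-terminal to the second fragment, the first fragment is human apolipoprotein B mRNA editing enzyme catalytic subunit 3A (APOBEC3A), and the second fragment is the CRISPR-associated (Cas) protein Cas9, wherein: the APOBEC3A is a mutant isoform a whose mutation, relative to wild-type isoform a, is selected from one or more of Y130F, Y132D, W104A, and D131Y; or the APOBEC3A is a mutant isoform b whose mutation, relative to wild-type isoform b, is selected from one or more of Y112F, Y114D, W86A, and D113Y; and the APOBEC3A retains cytidine deaminase activity.
2. The fusion protein of claim 1, characterized in that the fusion protein comprises a uracil glycosylase inhibitor (UGI).
3. The fusion protein of claim 1, characterized in that the fusion protein is no more than 2500 amino acids in size.
4. The fusion protein of claim 1, characterized in that the APOBEC3A is selected from SEQ ID NOs: 2-5 and 7-10.
5. The fusion protein of claim 1, characterized in that the Cas protein is selected from the group of proteins consisting of SpCas9, FnCas9, St1Cas9, St3Cas9, NmCas9, SaCas9, AsCpf1, LbCpf1, FnCpf1, VQR SpCas9, EQR SpCas9, VRER SpCas9, RHA FnCas9, and KKH SaCas9.
6. The fusion protein of claim 1, characterized in that the Cas protein is a mutant of a protein selected from the group of proteins consisting of SpCas9, FnCas9, St1Cas9, St3Cas9, NmCas9, SaCas9, AsCpf1, LbCpf1, FnCpf1, VQR SpCas9, EQR SpCas9, VRER SpCas9, RHA FnCas9, and KKH SaCas9, wherein the mutant retains the DNA-binding activity of the Cas protein but does not produce DNA double-strand breaks.
7. The fusion protein of claim 6, characterized in that the mutant of the Cas protein is capable of introducing a nick into one strand of the double-stranded DNA to which it binds.
8. The fusion protein of claim 6, characterized in that the Cas protein has the amino acid sequence shown in SEQ ID NO: 11.
9. The fusion protein of claim 2, characterized in that the UGI has the amino acid sequence shown in SEQ ID NO: 12.
10. A base editing method for deaminating a cytosine in a target polynucleotide, characterized in that the method comprises contacting the target polynucleotide with a fusion protein of any one of claims 1-9 and a guide RNA having at least partial sequence complementarity to the target polynucleotide, wherein the editing comprises deaminating the cytosine in the target polynucleotide, and the base editing method is a non-diagnostic or non-therapeutic method.
11. The method of claim 10, characterized in that the cytosine is in a GpC dinucleotide context.
12. The method of claim 11, characterized in that the cytosine is methylated.
13. Use of a base editor in the preparation of a medicament for deaminating a cytosine in a target polynucleotide, characterized in that the base editor comprises a fusion protein of any one of claims 1-9 and a guide RNA having at least partial sequence complementarity to the target polynucleotide.
14. The use of claim 13, characterized in that the cytosine is in a GpC dinucleotide context.
15. The use of claim 14, characterized in that the cytosine is methylated.
16. Application of the base editing method of claim 10 for deaminating a cytosine in a target polynucleotide in a non-diagnostic or non-therapeutic method of targeted base editing.
17. The application of claim 16, characterized in that the cytosine is in a GpC dinucleotide context.
18. The application of claim 16, characterized in that the cytosine is methylated.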
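The residue numbers that claim 1 gives for the two APOBEC3A isoforms differ by a constant (130 vs 112, 132 vs 114, 104 vs 86, 131 vs 113), which suggests an 18-residue numbering offset between isoform a and isoform b at these positions. The Python sketch below makes that arithmetic explicit. The function names and the `ISOFORM_OFFSET` constant are hypothetical, and the offset is inferred only from the four positions listed in the claim, not from an alignment of the full sequences.

```python
import re

# Inferred from claim 1: 130-112 = 132-114 = 104-86 = 131-113 = 18.
ISOFORM_OFFSET = 18


def parse_mutation(label):
    """Split a label such as 'Y130F' into (wild type, position, mutant)."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", label)
    if match is None:
        raise ValueError(f"not a point-mutation label: {label!r}")
    return match.group(1), int(match.group(2)), match.group(3)


def isoform_a_to_b(label, offset=ISOFORM_OFFSET):
    """Renumber an isoform a mutation label into isoform b numbering."""
    wild_type, position, mutant = parse_mutation(label)
    return f"{wild_type}{position - offset}{mutant}"


# Mutation sets exactly as listed in claim 1.
ISOFORM_A = ["Y130F", "Y132D", "W104A", "D131Y"]
ISOFORM_B = ["Y112F", "Y114D", "W86A", "D113Y"]

converted = [isoform_a_to_b(label) for label in ISOFORM_A]
print(converted)
print(converted == ISOFORM_B)
```

Treating the offset as a parameter keeps the sketch usable if a different numbering relationship turns out to apply elsewhere in the protein.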
CN201810185384.7A 2018-02-23 2018-03-07 A gene base editor Active CN109021111B (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN2018076991 2018-02-23
CNPCT/CN2018/076991 2018-02-23

Publications (2)

Publication Number Publication Date
CN109021111A CN109021111A (zh) 2018-12-18
CN109021111B true CN109021111B (zh) 2021-12-07

Family

ID=64143078

Family Applications (2)

Application Number Title Priority Date Filing Date
CN201810185384.7A Active CN109021111B (zh) 2018-02-23 2018-03-07 A gene base editor
CN201810647142.5A Pending CN108822217A (zh) 2018-02-23 2018-06-21 A gene base editor

Family Applications After (1)

Application Number Title Priority Date Filing Date
CN201810647142.5A Pending CN108822217A (zh) 2018-02-23 2018-06-21 A gene base editor

Country Status (1)

Country Link
CN (2) CN109021111B (zh)

Families Citing this family (40)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20150044192A1 (en) 2013-08-09 2015-02-12 President And Fellows Of Harvard College Methods for identifying a target site of a cas9 nuclease
US9359599B2 (en) 2013-08-22 2016-06-07 President And Fellows Of Harvard College Engineered transcription activator-like effector (TALE) domains and uses thereof
US9388430B2 (en) 2013-09-06 2016-07-12 President And Fellows Of Harvard College Cas9-recombinase fusion proteins and uses thereof
US9340799B2 (en) 2013-09-06 2016-05-17 President And Fellows Of Harvard College MRNA-sensing switchable gRNAs
US9526784B2 (en) 2013-09-06 2016-12-27 President And Fellows Of Harvard College Delivery system for functional nucleases
US9840699B2 (en) 2013-12-12 2017-12-12 President And Fellows Of Harvard College Methods for nucleic acid editing
WO2016022363A2 (en) 2014-07-30 2016-02-11 President And Fellows Of Harvard College Cas9 proteins including ligand-dependent inteins
US20190225955A1 (en) 2015-10-23 2019-07-25 President And Fellows Of Harvard College Evolved cas9 proteins for gene editing
KR102547316B1 (ko) 2016-08-03 2023-06-23 President And Fellows Of Harvard College Adenosine nucleobase editors and uses thereof
AU2017308889B2 (en) 2016-08-09 2023-11-09 President And Fellows Of Harvard College Programmable Cas9-recombinase fusion proteins and uses thereof
US11542509B2 (en) 2016-08-24 2023-01-03 President And Fellows Of Harvard College Incorporation of unnatural amino acids into proteins using base editing
KR20240007715A (ko) 2016-10-14 2024-01-16 President And Fellows Of Harvard College AAV delivery of nucleobase editors
US10745677B2 (en) 2016-12-23 2020-08-18 President And Fellows Of Harvard College Editing of CCR5 receptor gene to protect against HIV infection
US11898179B2 (en) 2017-03-09 2024-02-13 President And Fellows Of Harvard College Suppression of pain by gene editing
EP3592777A1 (en) 2017-03-10 2020-01-15 President and Fellows of Harvard College Cytosine to guanine base editor
US11268082B2 (en) 2017-03-23 2022-03-08 President And Fellows Of Harvard College Nucleobase editors comprising nucleic acid programmable DNA binding proteins
US11560566B2 (en) 2017-05-12 2023-01-24 President And Fellows Of Harvard College Aptazyme-embedded guide RNAs for use with CRISPR-Cas9 in genome editing and transcriptional activation
WO2019023680A1 (en) 2017-07-28 2019-01-31 President And Fellows Of Harvard College METHODS AND COMPOSITIONS FOR EVOLUTION OF BASIC EDITORS USING PHAGE-ASSISTED CONTINUOUS EVOLUTION (PACE)
WO2019139645A2 (en) 2017-08-30 2019-07-18 President And Fellows Of Harvard College High efficiency base editors comprising gam
US11795443B2 (en) 2017-10-16 2023-10-24 The Broad Institute, Inc. Uses of adenosine base editors
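CN116836300A (zh) * 2019-01-21 2023-10-03 ShanghaiTech University A base editing molecule and use thereof
CN109762846B (zh) * 2019-02-01 2020-11-24 Science and Technology Research Institute of the National Health Commission Reagents and methods for repairing the Krabbe disease-associated GALC C1586T mutation using base editing
JP2022526695A (ja) * 2019-02-02 2022-05-26 ShanghaiTech University Inhibition of unintended mutations in gene editing
CN110804628B (zh) * 2019-02-28 2023-05-12 Center for Excellence in Brain Science and Intelligence Technology, Chinese Academy of Sciences Highly specific single-base gene editing tool without off-target effects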
CA3130488A1 (en) 2019-03-19 2020-09-24 David R. Liu Methods and compositions for editing nucleotide sequences
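CN110029096B (zh) * 2019-05-09 2023-05-12 ShanghaiTech University An adenine base editing tool and use thereof
CN112048497B (zh) * 2019-06-06 2023-11-03 Huida (Shanghai) Biotechnology Co., Ltd. A novel single-base editing technology and application thereof
CN110407945A (zh) * 2019-06-14 2019-11-05 ShanghaiTech University An adenine base editing tool and use thereof
CN112175927B (zh) * 2019-07-02 2023-04-18 ShanghaiTech University A base editing tool and use thereof
CN117264998A (zh) * 2019-07-10 2023-12-22 Suzhou Qihe Shengke Biotechnology Co., Ltd. Dual-function genome editing system and use thereof
KR102258713B1 (ko) * 2019-07-31 2021-05-31 Industry-University Cooperation Foundation, Hanyang University Composition for cytosine base editing and use thereof
CN110467679B (zh) * 2019-08-06 2021-04-23 Guangzhou University A fusion protein, base editing tool and method, and application thereof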
US20220380749A1 (en) * 2019-08-20 2022-12-01 Tianjin Institute Of Industrial Biotechnology, Chinese Academy Of Sciences Base editing systems for achieving c to a and c to g base mutation and application thereof
EP3783104A1 (en) * 2019-08-20 2021-02-24 Kemijski Institut Coiled-coil mediated tethering of crispr-cas and exonucleases for enhanced genome editing
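CN112979823B (zh) * 2019-12-18 2022-04-08 East China Normal University A product and fusion protein for treating and/or preventing β-hemoglobinopathies
WO2021155607A1 (zh) * 2020-02-07 2021-08-12 Huida (Shanghai) Biotechnology Co., Ltd. Engineered cytosine base editor and application thereof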
GB2614813A (en) 2020-05-08 2023-07-19 Harvard College Methods and compositions for simultaneous editing of both strands of a target double-stranded nucleotide sequence
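CN115386623A (zh) * 2021-05-20 2022-11-25 Peking University Method and kit for detecting editing sites of base editors
CN113564145B (zh) * 2021-06-04 2023-07-28 Shanghai First People's Hospital Fusion protein for cytosine base editing and application thereof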
WO2023155901A1 (en) * 2022-02-17 2023-08-24 Correctsequence Therapeutics Mutant cytidine deaminases with improved editing precision

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105934516A (zh) * 2013-12-12 2016-09-07 President and Fellows of Harvard College Cas variants for gene editing
WO2017070632A2 (en) * 2015-10-23 2017-04-27 President And Fellows Of Harvard College Nucleobase editors and uses thereof
WO2018010516A1 (zh) * 2016-07-13 2018-01-18 陈奇涵 (Chen Qihan) A genomic DNA-specific editing method and application

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Structural determinants of human APOBEC3A enzymatic and nucleic acid binding properties; MITRA, M. et al.; Nucleic Acids Research; 2013-10-24; Vol. 42, No. 2; pp. 1095-1110 *

Also Published As

Publication number Publication date
CN109021111A (zh) 2018-12-18
CN108822217A (zh) 2018-11-16

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant