CN1824675A - Neisserial antigens
- Publication number
- CN1824675A (application CN200510113395A)
- Authority
- CN
- China
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Abstract
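The invention provides proteins from Neisseria meningitidis (strains A and B) and from Neisseria gonorrhoeae, including amino acid sequences, the corresponding nucleotide sequences, expression data, and serological data. The proteins are useful antigens for vaccines, immunogenic compositions, and/or diagnostics.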
Description
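This application is a divisional of Chinese patent application No. CN98812844.6, filed on 9 October 1998 and entitled "Neisserial antigens".
Technical field
This invention relates to bacteria of the genus Neisseria.
Background art
Neisseria meningitidis and Neisseria gonorrhoeae are non-motile, Gram-negative diplococci that are pathogenic in humans. N. meningitidis colonises the pharynx and causes meningitis (and, occasionally, septicaemia in the absence of meningitis); N. gonorrhoeae colonises the genital tract and causes gonorrhoea. Although the two pathogens colonise different areas of the body and cause completely different diseases, they are closely related. One feature that clearly distinguishes meningococcus from gonococcus is the polysaccharide capsule present in all pathogenic meningococci.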
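From 1983 to 1990, N. gonorrhoeae caused approximately 800,000 cases per year in the United States alone (Meitzner and Cohen, "Vaccines Against Gonococcal Infection", in New Generation Vaccines, 2nd edition, eds. Levine, Woodrow, Kaper and Cobon, Marcel Dekker, New York, 1997, pp. 817-842). The disease causes significant morbidity but limited mortality. Vaccination against N. gonorrhoeae would be highly desirable, but repeated attempts have failed. The main candidate antigens for such a vaccine are surface-exposed proteins such as pili, porins, opacity-associated proteins (Opas), and other surface-exposed proteins such as Lip, Laz, IgA1 protease, and transferrin-binding proteins. Lipooligosaccharide (LOS) has also been suggested as a vaccine (Meitzner and Cohen, supra).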
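N. meningitidis causes both endemic and epidemic disease. In the United States the attack rate is 0.6-1 per 100,000 persons per year, and it can be much greater during outbreaks (see Lieberman et al. (1996) "Safety and Immunogenicity of a Serogroups A/C Neisseria meningitidis Oligosaccharide-Protein Conjugate Vaccine in Young Children", JAMA 275(19):1499-1503; Schuchat et al. (1997) "Bacterial Meningitis in the United States in 1995", N Engl J Med 337(14):970-976). In developing countries, endemic disease rates are much higher, and during epidemics incidence can reach 500 cases per 100,000 persons per year. Mortality is high: 10-20% in the United States, and much higher in developing countries. Following the introduction of the conjugate vaccine against Haemophilus influenzae, N. meningitidis is the major cause of bacterial meningitis at all ages in the United States (Schuchat et al. (1997) supra).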
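Based on the organism's capsular polysaccharide, 12 serogroups of N. meningitidis have been identified. Group A is the pathogen most often implicated in epidemic disease in sub-Saharan Africa. Serogroups B and C are responsible for the majority of cases in the United States and in most developed countries. Serogroups W135 and Y are responsible for the remaining cases in the United States and developed countries. The meningococcal vaccine currently in use is a tetravalent polysaccharide vaccine composed of serogroups A, C, Y, and W135. Although efficacious in adolescents and adults, it induces a poor immune response and a short duration of protection, and it cannot be used in infants [e.g. Morbidity and Mortality Weekly Report, Vol. 46, No. RR-5 (1997)]. This is because polysaccharides are T-cell-independent antigens that induce a weak immune response which cannot be boosted by repeated immunisation. Following the success of vaccination against H. influenzae, conjugate vaccines against serogroups A and C have been developed and are at the final stage of clinical testing (Zollinger WD, "New and Improved Vaccines Against Meningococcal Disease", in: New Generation Vaccines, supra, pp. 469-488; Lieberman et al. (1996) supra; Costantino et al. (1992) "Development and phase I clinical testing of a conjugate vaccine against meningococcus A and C", Vaccine 10:691-698).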
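Meningococcus B remains a problem, however. This serogroup is currently responsible for approximately 50% of all meningitis cases in the United States, Europe, and South America. The polysaccharide approach cannot be used because the menB capsular polysaccharide is a polymer of α(2-8)-linked N-acetyl neuraminic acid that is also present in mammalian tissue. This results in tolerance to the antigen; indeed, if an immune response were elicited, it would be anti-self and therefore undesirable. To avoid induction of autoimmunity and to induce a protective immune response, the capsular polysaccharide has been chemically modified, for example by substituting N-propionyl groups for the N-acetyl groups, leaving the specific antigenicity unaltered (Romero and Outschoorn (1994) "Current status of meningococcal group B vaccine candidates: capsular or non-capsular?", Clin Microbiol Rev 7(4):559-575).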
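An alternative approach to menB vaccines uses complex mixtures of outer membrane proteins (OMPs), containing either the OMPs alone, or OMPs enriched in porins, or OMPs depleted of the class 4 OMPs that are believed to induce antibodies that block bactericidal activity. This approach produces vaccines that are not well characterised. They protect against the homologous strain but are generally ineffective where there are many antigenic variants of the outer membrane proteins. To overcome antigenic variability, multivalent vaccines containing up to nine different porins have been constructed (e.g. Poolman JT (1992) "Development of a meningococcal vaccine", Infect. Agents Dis. 4:13-28). Additional proteins used in outer membrane vaccines are the opa and opc proteins, but none of these approaches has overcome antigenic variability (e.g. Ala'Aldeen and Borriello (1996) "The meningococcal transferrin-binding proteins 1 and 2 are both surface exposed and generate bactericidal antibodies capable of killing homologous and heterologous strains", Vaccine 14(1):49-53).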
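A certain amount of sequence data is available for meningococcal and gonococcal genes and proteins (e.g. EP-A-0467714, WO96/29412), but it is by no means complete. Further sequences would provide an opportunity to identify secreted or surface-exposed proteins that are presumed targets for the immune system and that are not antigenically variable. For instance, some of the identified proteins could be components of efficacious vaccines against meningococcus B, some could be components of vaccines against all meningococcal serogroups, and others could be components of vaccines against all pathogenic Neisseriae.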
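Summary of the invention
The invention provides proteins comprising the Neisserial amino acid sequences disclosed in the examples. These sequences relate to N. meningitidis or N. gonorrhoeae.
The invention also provides proteins comprising sequences homologous to (i.e. having sequence identity with) the Neisserial amino acid sequences disclosed in the examples. Depending on the particular sequence, the degree of identity is preferably greater than 50% (e.g. 65%, 80%, 90%, or more). These homologous proteins include mutants and allelic variants of the sequences disclosed in the examples. Typically, 50% identity or more between two proteins is considered an indication of functional equivalence. Identity between proteins is preferably determined by the Smith-Waterman homology search algorithm as implemented in the MPSRCH program (Oxford Molecular), using an affine gap search with parameters gap open penalty = 12 and gap extension penalty = 1.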
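The identity calculation described above can be illustrated with a minimal sketch of Smith-Waterman local alignment with affine gaps. This is not the MPSRCH implementation. The gap penalties (open 12, extension 1) come from the text; the match/mismatch scores of +5/-4, the convention that a gap of length k costs 12 + (k - 1) × 1, and the definition of percent identity as identical columns divided by alignment length are assumptions made for illustration (a real search would normally use a substitution matrix).

```python
NEG = float("-inf")

def smith_waterman_affine(a, b, match=5, mismatch=-4, gap_open=12, gap_extend=1):
    """Best local alignment of a and b under affine gap penalties.

    Returns (score, aligned_a, aligned_b); gap positions are shown as '-'.
    match/mismatch are illustrative assumptions, not MPSRCH defaults.
    """
    n, m = len(a), len(b)
    new = lambda fill: [[fill] * (m + 1) for _ in range(n + 1)]
    # M: alignment ends in a residue pair; X: gap in b; Y: gap in a.
    M, X, Y = new(NEG), new(NEG), new(NEG)
    pM, pX, pY = new("S"), new("M"), new("M")   # traceback pointers
    best, end = 0, None
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            s = match if a[i - 1] == b[j - 1] else mismatch
            prev, src = 0, "S"                  # "S" = start a new local alignment
            for state, table in (("M", M), ("X", X), ("Y", Y)):
                if table[i - 1][j - 1] > prev:
                    prev, src = table[i - 1][j - 1], state
            M[i][j], pM[i][j] = prev + s, src
            open_x, ext_x = M[i - 1][j] - gap_open, X[i - 1][j] - gap_extend
            X[i][j], pX[i][j] = (open_x, "M") if open_x >= ext_x else (ext_x, "X")
            open_y, ext_y = M[i][j - 1] - gap_open, Y[i][j - 1] - gap_extend
            Y[i][j], pY[i][j] = (open_y, "M") if open_y >= ext_y else (ext_y, "Y")
            if M[i][j] > best:
                best, end = M[i][j], (i, j)
    if end is None:
        return 0, "", ""
    i, j = end
    state, out_a, out_b = "M", [], []
    while state != "S":                         # walk pointers back to the start
        if state == "M":
            out_a.append(a[i - 1]); out_b.append(b[j - 1])
            state, i, j = pM[i][j], i - 1, j - 1
        elif state == "X":
            out_a.append(a[i - 1]); out_b.append("-")
            state, i = pX[i][j], i - 1
        else:
            out_a.append("-"); out_b.append(b[j - 1])
            state, j = pY[i][j], j - 1
    return best, "".join(reversed(out_a)), "".join(reversed(out_b))

def percent_identity(aligned_a, aligned_b):
    """Identical columns as a percentage of alignment length (one common convention)."""
    if not aligned_a:
        return 0.0
    same = sum(x == y for x, y in zip(aligned_a, aligned_b))
    return 100.0 * same / len(aligned_a)

# Hypothetical toy peptides, not sequences from the examples.
score, al_a, al_b = smith_waterman_affine("ACDEFGHIKLMNPQRS", "ACDEFGHIWWKLMNPQRS")
print(score, al_a, al_b, round(percent_identity(al_a, al_b), 1))
```

Under these assumed scores, the two-residue insertion is bridged by a single gap because 16 matches minus one gap of length 2 outscores either ungapped half. Other identity conventions (for example, dividing by the shorter sequence length) would give different percentages, so the 50% threshold in the text is only meaningful together with a stated convention.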
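The invention further provides proteins comprising fragments of the Neisserial amino acid sequences disclosed in the examples. The fragments should comprise at least n consecutive amino acids from the sequences where, depending on the particular sequence, n is 7 or more (e.g. 8, 10, 12, 14, 16, 18, 20, or more). Preferably the fragments comprise an epitope from the sequence.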
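The "at least n consecutive amino acids" criterion can be checked mechanically. The sketch below lists every window of length n that a candidate protein shares with a reference sequence. The function name and the toy sequences are hypothetical; only the default n = 7 comes from the text.

```python
def shared_fragments(candidate, reference, n=7):
    """Return the sorted length-n windows present in both sequences."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    ref_windows = {reference[i:i + n] for i in range(len(reference) - n + 1)}
    cand_windows = {candidate[i:i + n] for i in range(len(candidate) - n + 1)}
    return sorted(ref_windows & cand_windows)

# Hypothetical sequences for illustration only.
reference = "MKKTAIAIAVALAGFATVAQA"
candidate = "GGSAIAIAVALGG"
print(shared_fragments(candidate, reference, n=7))
print(bool(shared_fragments(candidate, reference, n=10)))
```

A non-empty result means the candidate contains a fragment of at least n consecutive residues of the reference. The same function works unchanged for nucleotide sequences with a different n.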
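The proteins of the invention can, of course, be prepared by various means (e.g. recombinant expression, purification from cell culture, chemical synthesis) and in various forms (e.g. native, fusions). They are preferably prepared in substantially pure or isolated form (i.e. substantially free from other Neisserial or host cell proteins).
In a further aspect, the invention provides antibodies that bind to these proteins. These may be polyclonal or monoclonal and may be produced by any suitable means.
In a further aspect, the invention provides nucleic acid comprising the Neisserial nucleotide sequences disclosed in the examples. In addition, the invention provides nucleic acid comprising sequences homologous to (i.e. having sequence identity with) the Neisserial nucleotide sequences disclosed in the examples.
Furthermore, the invention provides nucleic acid that can hybridise to the Neisserial nucleic acid disclosed in the examples, preferably under "high stringency" conditions (65°C in a 0.1×SSC, 0.5% SDS solution).
Nucleic acid comprising fragments of these sequences is also provided. These should comprise at least n consecutive nucleotides from the Neisserial sequences where, depending on the particular sequence, n is 10 or more (e.g. 12, 14, 15, 18, 20, 25, 30, 35, 40, or more).
In a further aspect, the invention provides nucleic acid encoding the proteins and protein fragments of the invention.
It should also be appreciated that the invention provides nucleic acid comprising sequences complementary to those described above (e.g. for antisense or probing purposes).
Nucleic acid according to the invention can, of course, be prepared in many ways (e.g. by chemical synthesis, from genomic or cDNA libraries, or from the organism itself) and can take various forms (e.g. single-stranded, double-stranded, vectors, probes).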
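In addition, the term "nucleic acid" includes DNA and RNA, and also their analogues, such as those containing modified backbones, and also peptide nucleic acids (PNA).
In a further aspect, the invention provides vectors comprising nucleotide sequences of the invention (e.g. expression vectors) and host cells transformed with such vectors.
In a further aspect, the invention provides compositions comprising protein, antibody, and/or nucleic acid according to the invention. These compositions may be suitable as vaccines, for instance, or as diagnostic reagents, or as immunogenic compositions.
The invention also provides nucleic acid, protein, or antibody according to the invention for use as medicaments (e.g. as vaccines) or as diagnostic reagents. It also provides the use of nucleic acid, protein, or antibody according to the invention in the manufacture of: (i) a medicament for treating or preventing infection due to Neisserial bacteria; (ii) a diagnostic reagent for detecting the presence of Neisserial bacteria or of antibodies raised against Neisserial bacteria; and/or (iii) a reagent that can raise antibodies against Neisserial bacteria. The Neisserial bacteria may be any species or strain (such as N. gonorrhoeae, or any strain of N. meningitidis, such as strain A, strain B, or strain C).
The invention also provides a method of treating a patient, comprising administering to the patient a therapeutically effective amount of nucleic acid, protein, and/or antibody according to the invention.
In further aspects, the invention provides various processes.
A process for producing proteins of the invention is provided, comprising the step of culturing a host cell according to the invention under conditions that induce protein expression.
A process for producing protein or nucleic acid of the invention is provided, wherein the protein or nucleic acid is synthesised in part or in whole using chemical means.
A process for detecting polynucleotides of the invention is provided, comprising the steps of: (a) contacting a nucleic acid probe according to the invention with a biological sample under hybridising conditions to form duplexes; and (b) detecting the duplexes.
A process for detecting proteins of the invention is provided, comprising the steps of: (a) contacting an antibody according to the invention with a biological sample under conditions suitable for the formation of antibody-antigen complexes; and (b) detecting the complexes.
A summary of standard techniques and procedures that may be employed to perform the invention (e.g. to use the disclosed sequences for vaccination or diagnostic purposes) follows. This summary is not a limitation on the invention but, rather, gives examples that may be used but are not required.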
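General
Unless otherwise indicated, the practice of the invention employs conventional techniques of molecular biology, microbiology, recombinant DNA, and immunology, which are within the skill of the art. Such techniques are explained fully in the literature, e.g. Sambrook, Molecular Cloning: A Laboratory Manual, 2nd edition (1989); DNA Cloning, Volumes I and II (D.N. Glover ed. 1985); Oligonucleotide Synthesis (M.J. Gait ed. 1984); Nucleic Acid Hybridization (B.D. Hames and S.J. Higgins eds. 1984); Transcription and Translation (B.D. Hames and S.J. Higgins eds. 1984); Animal Cell Culture (R.I. Freshney ed. 1986); Immobilized Cells and Enzymes (IRL Press, 1986); B. Perbal, A Practical Guide to Molecular Cloning (1984); the Methods in Enzymology series (Academic Press, Inc.), especially volumes 154 and 155; Gene Transfer Vectors for Mammalian Cells (J.H. Miller and M.P. Calos eds. 1987, Cold Spring Harbor Laboratory); Mayer and Walker eds. (1987), Immunochemical Methods in Cell and Molecular Biology (Academic Press, London); Scopes (1987), Protein Purification: Principles and Practice, 2nd edition (Springer-Verlag, N.Y.); and Handbook of Experimental Immunology, Volumes I-IV (D.C. Weir and C.C. Blackwell eds. 1986).
Standard abbreviations for nucleotides and amino acids are used in this specification.
All publications, patents, and patent applications cited herein are incorporated by reference. In particular, the contents of UK patent applications 9723516.2, 9724190.5, 9724386.9, 9725158.1, 9726147.3, 9800759.4, and 9819016.8 are incorporated herein by reference.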
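Definitions
A composition containing X is "substantially free of" Y when at least 85% by weight of the total X+Y in the composition is X. Preferably, X comprises at least about 90% by weight of the total of X+Y in the composition, more preferably at least about 95% or even 99% by weight.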
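The weight-fraction definition above reduces to a single comparison. The following sketch encodes it, with the 85% threshold taken from the text and the function and parameter names chosen here for illustration.

```python
def substantially_free_of_y(x_weight, y_weight, threshold=0.85):
    """True when X makes up at least `threshold` of the combined X+Y weight."""
    total = x_weight + y_weight
    if total <= 0:
        raise ValueError("combined weight of X and Y must be positive")
    return x_weight / total >= threshold

print(substantially_free_of_y(90.0, 10.0))         # 90% X by weight
print(substantially_free_of_y(80.0, 20.0))         # 80% X by weight
print(substantially_free_of_y(99.0, 1.0, 0.95))    # stricter preferred threshold
```

Passing 0.90, 0.95, or 0.99 as the threshold corresponds to the preferred levels named in the definition.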
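The term "comprising" means "including" as well as "consisting of"; e.g. a composition "comprising" X may consist exclusively of X or may include something additional to X, such as X+Y.
The term "heterologous" refers to two biological components that are not found together in nature. The components may be host cells, genes, or regulatory regions such as promoters. Although the heterologous components are not found together in nature, they can function together, as when a promoter heterologous to a gene is operably linked to that gene. Another example is where a Neisserial sequence is heterologous to a mouse host cell. A further example would be two epitopes from the same or different proteins that have been assembled in a single protein in an arrangement not found in nature.
An "origin of replication" is a polynucleotide sequence that initiates and regulates replication of polynucleotides, such as an expression vector. The origin of replication behaves as an autonomous unit of polynucleotide replication within a cell, capable of replication under its own control. An origin of replication is needed for a vector to replicate in a particular host cell. With certain origins of replication, an expression vector can be reproduced at a high copy number in the presence of the appropriate proteins within the cell. Examples of origins are the autonomously replicating sequences, which are effective in yeast, and the viral T-antigen, which is effective in COS-7 cells.
A "mutant" sequence is defined as a DNA, RNA, or amino acid sequence differing from but having sequence identity with the native or disclosed sequence. Depending on the particular sequence, the degree of sequence identity between the native or disclosed sequence and the mutant sequence is preferably greater than 50% (e.g. 60%, 70%, 80%, 90%, 95%, 99%, or more, calculated using the Smith-Waterman algorithm as described above). As used herein, an "allelic variant" of a nucleic acid molecule, or region, for which nucleic acid sequence is provided herein is a nucleic acid molecule, or region, that occurs at essentially the same locus in the genome of another or second isolate and that, due to natural variation caused by, for example, mutation or recombination, has a similar but not identical nucleic acid sequence. A coding-region allelic variant typically encodes a protein having similar activity to that of the protein encoded by the gene to which it is being compared. An allelic variant can also comprise an alteration in the 5' or 3' untranslated regions of the gene, such as in regulatory control regions (e.g. see US patent 5,753,235).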
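Expression systems
The Neisserial nucleotide sequences can be expressed in a variety of different expression systems; for example, those used with mammalian cells, baculoviruses, plants, bacteria, and yeast.
i. Mammalian systems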
哺乳动物表达系统是本领域中已知的。哺乳动物启动子是能结合哺乳动物RNA聚合酶并启动下游(3′)编码序列(如结构基因)转录成mRNA的任何DNA序列。启动子具有一个转录起始区,其通常邻近编码序列的5′端,还具有一个TATA盒,其通常位于转录起始位点上游25-30个碱基对(bp)处。认为TATA盒指导RNA聚合酶II在正确位点开始RNA合成。哺乳动物启动子还含有一个上游启动子元件,其通常位于TATA盒上游100至200bp内。该上游启动子元件决定了转录启动的速度,并可在两个方向之一上起作用[Sambrook等人(1989)“克隆基因在哺乳动物细胞中的表达”《分子克隆实验指南》,第2版]。
哺乳动物病毒基因通常是高表达的,具有宽的宿主范围;因此,编码哺乳动物病毒基因的序列提供了特别有用的启动子序列。例子包括SV40早期启动子、小鼠乳房肿瘤病毒LTR启动子、腺病毒主要晚期启动子(Ad MLP)以及单纯疱疹病毒启动子。另外,从非病毒基因(如鼠金属硫蛋白基因)衍生的序列也提供了有用的启动子序列。表达可以是组成型的或受调控的(诱导的),这取决于该启动子能否在激素反应性细胞中用促糖皮质激素诱导。
增强元件(增强子)的存在,联合上述启动子元件通常会提高表达水平。增强子是这样一种调控性DNA序列,当其与同源或异源启动子相连,合成在正常的RNA起始位点开始时,它能刺激转录提高1000倍。当增强子位于转录起始位点的上游或下游,处于正常或翻转方向,或距离启动子1000个核苷酸以上的距离时,它均具有活性[Maniatis等人(1987)Science 236:1237;Alberts等人(1989)《细胞分子生物学》,第2版]。从病毒衍生获得的增强子元件可能是特别有用的,因为它们通常具有较宽的宿主范围。例子包括SV40早期基因增强子[Dijkema等人(1985)EMBO J.4:761]以及衍生自Rous肉瘤病毒的长末端重复序列(LTR)的增强子/启动子[Gorman等人(1982b)Proc.Natl.Acad.Sci.79:6777]以及来自人巨细胞病毒的增强子/启动子[Boshart等人(1985)Cell 41:521]。另外,一些增强子仅仅在诱导物(例如激素或金属离子)的存在下是可调节的并具有活性[Sassone-Corsi和Borelli(1986)Trends Genet.2:215;Maniatis等人(1987)Science 236:1237]。
DNA分子可在哺乳动物细胞中胞内表达。启动子序列可以和DNA分子直接相连,在这种情况下,重组蛋白的N端第一个氨基酸始终是甲硫氨酸,其由ATG起始密码子编码。如果需要,可通过和溴化氰体外培育来从蛋白上切下N端。
另外,外来蛋白也可从细胞中分泌到生长培养基中,方法是产生嵌合的DNA分子,该DNA分子编码的融合蛋白包括一前导序列片段,该片段在哺乳动物细胞中提供了外源蛋白的分泌。较佳的,在前导序列片段和外源基因之间可以有能在体内或体外断裂的加工位点。前导序列片段通常编码一种信号肽,该信号肽包含指导蛋白分泌出细胞的疏水性氨基酸。腺病毒三联前导序列是哺乳动物细胞中分泌外来蛋白的一个前导序列例子。
通常,哺乳动物细胞识别的转录终止和聚腺苷酸化序列是位于翻译终止密码子3′的调控区域,因此它和启动子元件一起连接在编码序列的侧面。成熟mRNA的3′端由定点的转录后断裂和聚腺苷酸化形成[Birnstiel等人(1985)Cell 41:349;Proudfoot和Whitelaw(1988)″真核RNA的终止和3′端加工″《转录和剪接》(B.D.Hames和D.M.Glover编辑);Proudfoot(1989)Trends Biochem.Sci.14:105]。这些序列指导mRNA的转录,mRNA能被翻译成该DNA编码的多肽。转录终止子/聚腺苷酸化信号的例子包括从SV40获得的那些[Sambrook等人(1989)“克隆基因在培养的哺乳动物细胞中的表达”《分子克隆实验指南》]。
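Usually, the above-described components, comprising a promoter, polyadenylation signal, and transcription termination sequence, are put together into expression constructs. Enhancers, introns with functional splice donor and acceptor sites, and leader sequences may also be included in an expression construct, if desired. Expression constructs are often maintained in a replicon, such as an extrachromosomal element (e.g. a plasmid) capable of stable maintenance in a host, such as mammalian cells or bacteria. Mammalian replication systems include those derived from animal viruses, which require trans-acting factors to replicate. For example, plasmids containing the replication systems of papovaviruses, such as SV40 [Gluzman (1981) Cell 23:175] or polyomavirus, replicate to extremely high copy number in the presence of the appropriate viral T antigen. Additional examples of mammalian replicons include those derived from bovine papillomavirus and Epstein-Barr virus. Additionally, the replicon may have two replication systems, thus allowing it to be maintained, for example, in mammalian cells for expression and in a prokaryotic host for cloning and amplification. Examples of such mammalian-bacteria shuttle vectors include pMT2 [Kaufman et al. (1989) Mol. Cell. Biol. 9:946] and pHEBO [Shimizu et al. (1986) Mol. Cell. Biol. 6:1074].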
所用的转化程序取决于待转化的宿主。将异源多核苷酸导入哺乳动物细胞中的方法是本领域所知的,其包括葡聚糖介导的转染、磷酸钙沉淀、Polybrene(1,5-二甲基-1,5-二氮十一亚甲基聚甲溴化物)介导的转染、原生质体融合、电穿孔、将多核苷酸包裹在脂质体中以及将DNA直接显微注射到胞核中。
可作为宿主进行表达的哺乳动物细胞系是本领域中已知的,其包括许多从美国典型培养物保藏中心(ATCC)获得的无限增殖细胞系,包括但不局限于,中国仓鼠卵巢(CHO)细胞、海拉细胞、幼仓鼠肾(BHK)细胞、猴肾细胞(COS)、人肝细胞癌细胞(如Hep G2)和其它许多细胞系。
ii. Baculovirus systems
编码蛋白质的多核苷酸也可插入合适的昆虫表达载体中,并与该载体中的控制元件操作性相连。载体构建采用本领域已知的技术。总地来说,表达系统的组分包括一种转移载体,通常是细菌质粒,其含有杆状病毒基因组片段以及便于插入待表达异源基因的限制性位点;野生型杆状病毒,其序列与转移载体中的杆状病毒特异性片段同源(这使得异源基因能同源重组到杆状病毒基因组中);以及合适的昆虫宿主细胞和生长培养基。
在将编码蛋白质的DNA序列插入转移载体中后,将载体和野生型病毒基因组转染到昆虫宿主细胞中,使载体和病毒基因组重组。表达包装的重组病毒,鉴定并纯化重组噬斑。杆状病毒/昆虫细胞表达系统材料及其方法,除别的以外,可以试剂盒形式购自Invitrogen,San Diego CA(″MaxBac″试剂盒)。这些技术通常是本领域技术人员所知的,在Summers和Smith的Texas Agricultural Experiment Station Bulletin No.1555(1987)(后称“Summer和Smith的文章”)中有充分描述。
在将编码蛋白质的DNA序列插入杆状病毒基因组之前,通常将上述组件,包括启动子、前导序列(如果需要)、感兴趣的编码序列以及转录终止序列装配在中间置换型构建物(转移载体)中。该构建物可含有单个基因以及操作性相连的调控元件;多个基因,每个基因有其自己的操作性相连调控元件;或是由同一组调控元件调控的多个基因。中间置换型构建物通常保持在一个复制子中,例如能在宿主(如细菌)内稳定保持的染色体外元件(如质粒)。复制子将具有一个复制系统,从而使其能保持在合适的宿主中进行克隆和扩增。
目前,用来将外源基因导入AcNPV的最常用的转移载体是pAc373。还可设计本领域技术人员已知的其它许多载体。这些载体例如包括,pVL985(其将多角体蛋白的起始密码子从ATG变为ATT,在ATT下游32个碱基对处引入一个BamHI克隆位点;见Luckow和Summers,Virology(1989)17:31)。
质粒通常还含有多角体蛋白聚腺苷酸化信号(Miller等人(1988)Ann.Rev.Microbiol.,42:177)以及用来在大肠杆菌中选择和繁殖的原核氨苄青霉素抗性(amp)基因和复制起点。
杆状病毒转移载体通常含有杆状病毒启动子。杆状病毒启动子是能结合杆状病毒RNA聚合酶并启动下游(5′到3′)编码序列(如结构基因)转录成mRNA的DNA序列。启动子具有一个转录起始区,该区通常邻近编码序列的5′端。该转录起始区通常包括一个RNA聚合酶结合位点以及一个转录起始位点。杆状病毒转移载体还可能有称为增强子的第二个区,如果该区域存在,它通常在结构基因的远端。表达可以是调控的或组成型的。
在病毒感染周期晚期大量转录的结构基因提供特别有用的启动子序列。例子包括从编码病毒多角体蛋白的基因衍生获得的序列,Friesen等人(1986)“杆状病毒基因表达的调控”《杆状病毒分子生物学》(Walter Doerfler编辑);EPO公开号127 839和155476;以及编码p10蛋白的基因,Vlak等人(1988),J.Gen.Virol.69:765。
编码合适的信号序列的DNA可以衍生自分泌的昆虫或杆状病毒蛋白(如杆状病毒多角体蛋白基因)的基因(Carbonell等人,(1988)Gene,73:409)。另外,由于哺乳动物细胞翻译后修饰的信号(如信号肽断裂、蛋白水解断裂和磷酸化)看来可被昆虫细胞识别,且分泌和胞核积累所需的信号看来在非脊椎动物细胞和脊椎动物细胞之间是保守的,因此也可用非昆虫来源的前导序列来提供昆虫中的分泌,这些前导序列例如是从编码人α-干扰素(Maeda等人(1985),Nature 315:592)、人胃泌素释放的肽(Lebacq-Verheyden等人(1988),Molec.Cell.Biol.8:3129)、人IL-2(Smith等人(1985)PNAS,82:8404)、小鼠IL-3(Miyajima等人(1987)Gene 58:273)和人葡糖脑苷脂酶(Martin等人(1988)DNA,7:99)的基因衍生获得的。
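Recombinant polypeptides or polyproteins may be expressed intracellularly or, if expressed with the proper regulatory sequences, can be secreted. Good intracellular expression of non-fused foreign proteins usually requires heterologous genes that ideally have a short leader sequence containing suitable translation initiation signals preceding an ATG start signal. If desired, methionine at the N-terminus may be cleaved from the mature protein by in vitro incubation with cyanogen bromide.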
另外,可通过产生嵌合的DNA分子将非天然分泌的重组聚蛋白或蛋白从昆虫细胞中分泌出来,该嵌合的DNA分子所编码的融合蛋白包含一前导序列片段,该片段提供了昆虫中分泌外源蛋白的作用。该前导序列片段通常编码一种信号肽,该信号肽包含的疏水性氨基酸指导蛋白质转移到内质网中。
在插入了编码该蛋白表达产物前体的DNA序列和/或基因后,用转移载体的异源DNA和野生型杆状病毒的基因组DNA共同转化(通常是共转染)昆虫细胞宿主。构建物的启动子和转录终止序列通常包含2-5kb的杆状病毒基因组片段。将异源DNA引入杆状病毒中所需位点内的方法是本领域所知的。(见Summers和Smith的文章,同上;Ju等人(1987);Smith等人,Mol.Cell.Biol.(1983)3:2156;和Luckow和Summers(1989))。例如,插入可以是通过同源双交换重组来插入基因如多角体蛋白基因中;插入还可以是插入工程改造入所需杆状病毒基因内的限制性酶切位点中。Miller等人(1989),Bioessays 4:91。当DNA序列被克隆在表达载体多角体蛋白基因位置中后,其5′和3′均侧接了多角体蛋白特异性序列,并位于多角体蛋白启动子的下游。
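The newly formed baculovirus expression vector is subsequently packaged into an infectious recombinant baculovirus. Homologous recombination occurs at low frequency (between about 1% and about 5%); thus, the majority of the virus produced after cotransfection is still wild-type virus. Therefore, a method is necessary to identify recombinant viruses. An advantage of the expression system is a visual screen that allows recombinant viruses to be distinguished. The polyhedrin protein, which is produced by the native virus, is produced at very high levels in the nuclei of infected cells at late times after viral infection. Accumulated polyhedrin protein forms occlusion bodies that also contain embedded particles. These occlusion bodies, 15 μm in size, are highly refractile, giving them a bright shiny appearance that is readily visualised under the light microscope. Cells infected with recombinant viruses lack occlusion bodies. To distinguish recombinant virus from wild-type virus, the transfection supernatant is plaqued onto a monolayer of insect cells by techniques known to those skilled in the art. Namely, the plaques are screened under the light microscope for the presence (indicative of wild-type virus) or absence (indicative of recombinant virus) of occlusion bodies. "Current Protocols in Microbiology" Vol. 2 (Ausubel et al. eds.) at 16.8 (Supp. 10, 1990); Summers and Smith, supra; Miller et al. (1989).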
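The low recombination frequency quoted above (about 1% to 5%) determines how many plaques must be screened. As a rough sketch, assume that each plaque is independently recombinant with probability p; this independence assumption is not from the text. Then the chance of seeing at least one recombinant among N plaques is 1 - (1 - p)^N, and the smallest N for a target confidence follows by solving for N.

```python
import math

def prob_at_least_one_recombinant(n_plaques, recomb_freq):
    """P(at least one recombinant plaque among n_plaques), assuming independence."""
    return 1.0 - (1.0 - recomb_freq) ** n_plaques

def plaques_needed(recomb_freq, confidence=0.95):
    """Smallest number of plaques giving at least `confidence` of one recombinant."""
    if not 0.0 < recomb_freq < 1.0 or not 0.0 < confidence < 1.0:
        raise ValueError("recomb_freq and confidence must lie strictly between 0 and 1")
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - recomb_freq))

for p in (0.01, 0.05):   # the two ends of the range given in the text
    n = plaques_needed(p)
    print(p, n, round(prob_at_least_one_recombinant(n, p), 4))
```

The numbers are only a planning aid: real plaque assays are affected by plating density and scoring errors, which this simple binomial model ignores.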
已经开发出感染进入几种昆虫细胞的重组杆状病毒表达载体。例如,已经开发出用于感染以下昆虫的细胞的重组杆状病毒:埃及伊蚊、苜蓿丫纹夜蛾、家蚕、黑尾果蝇、草地夜蛾和粉纹夜蛾(WO 89/046699;Carbonell等人(1985)J.Virol.56:153;Wright(1986)Nature 321:718;Smith等人(1983)Mol.Cell.Biol.3:2156;综述见Fraser等人(1989)Vitro Cell.Dev.Biol.25:225)。
可以购得细胞和细胞培养基用于在杆状病毒/表达系统中直接表达和融合表达异源多肽;细胞培养技术是本领域技术人员通常所知的。例如见Summers和Smith,同上。
然后,经修饰的昆虫细胞可以生长在合适的营养培养基中,该培养基能稳定地保持该质粒于修饰的昆虫宿主中。当表达产物基因处于可诱导的控制下时,可以使宿主生长至高密度,并诱导表达。另外,当表达是组成型表达时,产物将被连续表达到培养基中,营养性培养基必需不断循环,同时取出感兴趣的产物并补充消耗的营养物。产物可用以下这些技术来纯化:例如层析,如HPLC、亲和层析、离子交换层析等;电泳;密度梯度离心;溶剂抽提等。产物可按需作进一步纯化,以基本上除去所有也分泌到培养基中或由昆虫细胞裂解而产生的昆虫蛋白,以提供一种至少基本上不含宿主碎片如蛋白质、脂质和多糖的产物。
为了进行蛋白质表达,将从转化子衍生获得的重组宿主细胞培育在允许重组蛋白的编码序列表达的条件下。这些条件将随所选定的宿主细胞而变。然而,本领域技术人员容易根据本领域已知的知识来确定该条件。
iii. Plant systems
本领域中已知有许多植物细胞培养系统和全植物遗传表达系统。典型的植物细胞基因表达系统包括在以下专利中描述的那些,例如:US 5,693,506;US 5,659,122;和US5,608,143。Zenk,Phytochemistry 30:3861-3863(1991)中描述了在植物细胞培养物中遗传表达的其它例子。除上述参考文献外,关于植物蛋白信号肽的描述还可在下列文献中找到:Vaulcombe等人,Mol.Gen.Genet.209:33-40(1987);Chandler等人,PlantMolecular Biology 3:407-418(1984);Rogers,J.Biol.Chem.260:3731-3738(1985);Rothstein等人,Gene 55:353-356(1987);Whittier等人,Nucleic Acids Research15:2515-2535(1987);Wirsel等人,Molecular Microbiology 3:3-14(1989);Yu等人,Gene 122:247-253(1992)。关于用植物激素、赤霉素酸和赤霉素酸诱导分泌的酶调节植物基因表达的描述可在R.L.Jones和J.MacMillin,Gibberellins,《植物生理学进展》,Malcolm B.Wilkins编辑,1984 Pitman Publishing Limited,London,21-52页中找到。描述其它调节代谢的基因的参考文献参见:Sheen,Plant Cell,2:1027-1038(1990);Maas等人,EMBO J.9:3447-3452(1990);Benkel和Hickey,Proc.Natl.Acad.Sci.84:1337-1339(1987)。
通常,利用本领域已知的技术,将所需的多核苷酸序列插入一表达盒中,该表达盒含有为在植物中操作而设计的基因调控元件。将该表达盒插入所需的表达载体中,表达盒的上游和下游有适合在植物宿主中表达的伴随序列。该伴随序列可来自质粒或病毒,并为载体提供所需的性质,以允许载体将DNA从起初的克隆宿主(如细菌)中移动到所需植物宿主中。基础的细菌/植物载体构建物最好能提供宽的宿主范围原核复制起点;原核可选择标记;以及,对于农杆菌转化而言,宜提供T DNA序列用于农杆菌介导转移至植物染色体。当异源基因不易检测时,该构建物最好还具有一个适用于确定植物细胞是否已经转化的可选择标记基因。关于合适标记(例如对于禾草类家族成员)的综述可在Wilmink和Dons,1993,Plant Mol.Biol.Reptr,11(2):165-185中找到。
还建议采用合适将异源序列整合到植物基因组中的序列。这些序列可能包括用于同源重组的转座子序列以及允许将异源表达盒随机插入植物基因组中的Ti序列。合适的原核可选择标记包括抗生素(如氨苄青霉素或四环素)抗性标记。编码其它功能的其它DNA序列也可存在于载体中,这是本领域所知的。
本发明的核酸分子可包括在一个表达盒中来表达感兴趣的蛋白质。通常只有一个表达盒,但是两个或多个表达盒也是可行的。除了编码异源蛋白的序列外,重组表达盒还含有下列元件:启动子区域、植物5′非翻译序列、起始密码子(根据结构基因原来是否具有而定)、以及转录和翻译终止序列。表达盒5′和3′端的独特限制性酶位点能使表达盒方便地插入预先存在的载体中。
异源编码序列可以用于任何与本发明有关的蛋白。编码感兴趣的蛋白的序列将编码出一个信号肽,该信号肽能适当地加工和转运蛋白质,并且通常缺少可能会导致本发明的所需蛋白与膜结合的序列。由于对于大部分来说,转录起始区将针对发芽期间表达和转运的基因,采用提供转运的信号肽,也可提供转运感兴趣的蛋白质。通过这种方式,感兴趣的蛋白将从表达该蛋白的细胞中转运出来,并能被有效地收获。通常,种子中的分泌是通过糊粉或小盾体上皮层进入种子的胚乳。尽管不需要使蛋白从产生该蛋白的细胞中分泌出来,但是这种分泌有利于重组蛋白的分离和纯化。
由于所需基因产物的最终表达将在真核细胞中进行,因此需要确定克隆的基因部分是否含有作为内含子被宿主剪接体机制加工的序列。如果是这样,需要对“内含子”区进行定点诱变,以防止一部分遗传信息作为错误的内含子密码而丧失,Reed和Maniatis,Cell 41:95-105,1985。
可用微量移液管以机械方式转移重组DNA,将载体直接显微注射到植物细胞中。Crossway,Mol.Gen.Genet,202:179-185。还可用聚乙二醇将遗传物质转移到植物细胞中,Krens等人,Nature,296,72-74,1982。导入核酸片段的另一种方法是用小颗粒进行高速弹道贯穿,在这些小珠或颗粒的基质中或表面上带有核酸,Klein等人,Nature,327,70-73,1987,Knudsen和Muller,1991,Planta,185:330-336提出用颗粒轰击大麦胚乳以产生转基因大麦。还有一种导入方法是使原生质体和其它实体(微细胞(minicell)、细胞、溶酶体或其它可融合的脂质表面体)融合,Fraley等人,Proc.Natl.Acad.Sci.USA,79,1859-1863,1982。
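Vectors can also be introduced into plant cells by electroporation (Fromm et al., PNAS 82:5824, 1985). In this technique, plant protoplasts are electroporated in the presence of plasmids containing the gene construct. Electrical impulses of high field strength reversibly permeabilise biomembranes, allowing the introduction of the plasmids. Electroporated plant protoplasts reform the cell wall, divide, and form plant callus.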
本发明可转化所有的植物,从中能分离出原生质体并能培育成全再生植物,从而回收得到含有转基因的全植物。已经知道实际上可以从培育的细胞或组织再生所有的全植物,其包括但不局限于,甘蔗、甜菜、棉花、果实和其它树、豆科植物和蔬菜的所有主要种类。一些合适的植物包括,例如,草莓属、莲花属、苜蓿属、驴食豆属、三叶草属、胡卢巴属、豇豆属、柑橘属、亚麻属、老鹳草属、Manihot、Daucus、鼠耳芥属、芸苔属、萝卜属、白芥属、颠茄属、辣椒属、曼陀罗属、天仙子属、番茄属、烟草属、茄属、碧冬茄属、毛地黄属、Majorana、菊苣属、向日葵属、莴苣属、雀麦属、天门冬属、金鱼草属、龙骨角属、Nemesia、天竺葵属、稷属、狼尾草属、毛茛属、千里光属、Salpiglossis、香瓜属、Browaalia、大豆属、黑麦草属、玉蜀黍属、小麦、蜀黍属和曼陀罗属各种类。
各种植物的再生方式是不同的,但是通常是首先提供含有异源基因拷贝的转化的原生质体悬液。形成胼胝体组织,从胼胝体中诱生出枝条,随后是根。另外,从原生质体悬液可以诱生形成胚胎。这些胚胎象天然的胚胎那样发芽形成植物。培养基通常含有各种氨基酸和激素,如植物生长素和细胞分裂素。尤其是对于玉米和苜蓿属来说,在培养基中加入谷氨酸和脯氨酸也是很有利的。枝条和根通常同时发育。有效的再生取决于培养基、基因型以及培养史。如果控制了这三个变量,那么再生能完全再现和重复。
在一些植物细胞培养系统中,本发明所需的蛋白可能被排泄出来,或者蛋白可从全植物中提取出来。当本发明所需的蛋白被分泌到培养基中后,就可进行收集。或者,可以用机械方式破碎胚以及无胚-半种子或其它植物组织,以释放出分泌到细胞和组织之间的蛋白。将该混合物悬于缓冲液中,以提取可溶性蛋白。然后用常规的蛋白分离和纯化方法纯化重组蛋白。用常规方法调节时间、温度、pH、氧和体积等参数,以优化异源蛋白的表达和回收。
iv. Bacterial systems
细菌表达技术是本领域已知的。细菌启动子是能结合细菌RNA聚合酶并启动下游(3′)编码序列(如结构基因)转录成mRNA的DNA序列。启动子具有一个转录起始区,其通常位于编码序列的5′端附近。该转录起始区通常包括RNA聚合酶结合位点以及一个转录起始位点。细菌启动子可能还有第二个功能区域称为操纵子,它可能与毗邻的RNA合成开始的RNA聚合酶结合位点重叠。该操纵子允许(可诱导)对转录的负调节,因为基因阻遏蛋白可能结合操纵子并因而抑制特定基因的转录。在负调节元件(如操纵子)不存在时,可能发生组成型表达。另外,正调节可通过基因激活蛋白结合序列来实现,如果有的话,它通常邻近RNA聚合酶结合序列(5′)。基因激活蛋白的例子是分解代谢物激活剂蛋白(CAP),它帮助启动大肠杆菌(E.coli)中的lac操纵子的转录[Raibaud等人(1984)Annu.Rev.Genet.18:173]。因此,表达调控可能是正作用或负作用,从而增强或减弱了转录。
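Sequences encoding metabolic pathway enzymes provide particularly useful promoter sequences. Examples include promoter sequences derived from sugar-metabolising enzymes, such as galactose, lactose (lac) [Chang et al. (1977) Nature 198:1056], and maltose. Additional examples include promoter sequences derived from biosynthetic enzymes such as tryptophan (trp) [Goeddel et al. (1980) Nuc. Acids Res. 8:4057; Yelverton et al. (1981) Nucl. Acids Res. 9:731; US patent 4,738,921; EP-A-0036776 and EP-A-0121775]. The β-lactamase (bla) promoter system [Weissmann (1981) "The cloning of interferon and other mistakes", in Interferon 3 (ed. I. Gresser)], bacteriophage lambda PL [Shimatake et al. (1981) Nature 292:128], and T5 [US patent 4,689,406] promoter systems also provide useful promoter sequences.
In addition, synthetic promoters that do not occur in nature also function as bacterial promoters. For example, transcription activation sequences of one bacterial or bacteriophage promoter may be joined with the operon sequences of another bacterial or bacteriophage promoter, creating a synthetic hybrid promoter [US patent 4,551,433]. For example, the tac promoter is a hybrid trp-lac promoter comprised of both trp promoter and lac operon sequences that is regulated by the lac repressor [Amann et al. (1983) Gene 25:167; de Boer et al. (1983) Proc. Natl. Acad. Sci. 80:21]. Furthermore, a bacterial promoter can include naturally occurring promoters of non-bacterial origin that have the ability to bind bacterial RNA polymerase and initiate transcription. A naturally occurring promoter of non-bacterial origin can also be coupled with a compatible RNA polymerase to produce high levels of expression of some genes in prokaryotes [Studier et al. (1986) J. Mol. Biol. 189:113; Tabor et al. (1985) Proc. Natl. Acad. Sci. 82:1074]. In addition, a hybrid promoter can also be comprised of a bacteriophage promoter and an E. coli operator region (EPO A-0 267 851).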
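In addition to a functioning promoter sequence, an efficient ribosome binding site is also useful for the expression of foreign genes in prokaryotes. In E. coli, the ribosome binding site is called the Shine-Dalgarno (SD) sequence and includes an initiation codon (ATG) and a sequence 3-9 nucleotides in length located 3-11 nucleotides upstream of the initiation codon [Shine et al. (1975) Nature 254:34]. The SD sequence is thought to promote binding of mRNA to the ribosome by base pairing between the SD sequence and the 3' end of E. coli 16S rRNA [Steitz et al. (1979) "Genetic signals and nucleotide sequences in messenger RNA", in Biological Regulation and Development: Gene Expression (ed. R.F. Goldberger)]. For the expression of eukaryotic genes and of prokaryotic genes with weak ribosome binding sites, see [Sambrook et al. (1989) "Expression of cloned genes in Escherichia coli", in Molecular Cloning: A Laboratory Manual].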
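The geometry described above (a 3-9 nucleotide SD sequence placed 3-11 nucleotides upstream of ATG) can be sketched as a simple scan. The length and spacing ranges come from the text. The 9-mer TAAGGAGGT, used here as the motif that pairs with the 3' end of 16S rRNA, and the reading of "3-11 nucleotides upstream" as the number of bases between the motif and the ATG are assumptions for illustration; the function name is hypothetical.

```python
def find_sd_sites(seq, consensus="TAAGGAGGT", min_len=3, max_len=9,
                  min_spacer=3, max_spacer=11):
    """For each ATG, report the longest upstream stretch matching part of `consensus`.

    Returns a list of (atg_index, motif, spacer), where spacer is the number of
    bases between the end of the motif and the A of the ATG.
    """
    seq = seq.upper()
    sites = []
    atg = seq.find("ATG")
    while atg != -1:
        hit = None
        for length in range(max_len, min_len - 1, -1):      # prefer longer motifs
            for spacer in range(min_spacer, max_spacer + 1):
                start = atg - spacer - length
                if start < 0:
                    continue
                motif = seq[start:start + length]
                if motif in consensus:                       # contiguous sub-motif
                    hit = (atg, motif, spacer)
                    break
            if hit:
                break
        if hit:
            sites.append(hit)
        atg = seq.find("ATG", atg + 1)
    return sites

# Hypothetical test sequence: AGGAGG, five spacer bases, then ATG.
print(find_sd_sites("CCCCAGGAGGCCCCCATGAAA"))
```

Because 3-nucleotide matches are allowed, as the text permits, short hits will occur by chance in real sequences; raising `min_len` makes the scan more selective.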
DNA分子可以在胞内表达。启动子序列可以直接与DNA分子相连,在这种情况下,N端的第一个氨基酸始终是甲硫氨酸,其由ATG起始密码子编码。如果需要,可通过和溴化氰体外培育或通过和细菌甲硫氨酸N-端肽酶体内或体外培育,将N端的甲硫氨酸从蛋白质上切下(EPO-A-0 219 237)。
融合蛋白为直接表达提供了一种备选方案。通常,将编码内源细菌蛋白或其它稳定的蛋白之N端部分的DNA序列与异源编码序列的5′端融合。在表达时,该构建物将提供这两个氨基酸序列的融合物。例如,λ噬菌体细胞基因可以和外源基因的5′端相连并在细菌中表达。所得融合蛋白宜保留一个酶(因子Xa)加工位点,以便将噬菌体蛋白与外源基因切开[Nagai等人(1984)Nature 309:810]。融合蛋白也可用lacZ[Jia等人(1987)Gene 60:197],trpE[Allen等人(1987)J.Biotechnol.5:93;Makoff等人(1989),J.Gen.Microbiol.135:11]以及Chey[EP-A-0-324 647]基因的序列组成。两个氨基酸序列连接处的DNA序列可能编码或不编码可切割的位点。另一个例子是遍在蛋白融合蛋白。这种融合蛋白由遍在蛋白区域组成,该区域宜保留一个酶(例如遍在蛋白特异性加工蛋白酶)加工位点,以便将外源蛋白和遍在蛋白切开。通过这种方法,可以分离获得天然的外源蛋白[Miller等人(1989)Bio/Technology 7:698]。
另外,还可通过产生嵌合的DNA分子来将外源蛋白分泌出细胞,该嵌合的DNA分子编码的融合蛋白含有一个信号肽序列片段,该序列片段能使细菌中的外源蛋白分泌出来[美国专利4,336,336]。信号序列片段通常编码一个信号肽,该信号肽含有疏水性氨基酸,能指引蛋白分泌出细胞。蛋白质被分泌到生长培养基(革兰阳性菌)中或细胞内膜和外膜之间的周质间隙内(革兰阴性菌)。在编码的信号肽片段和外源基因之间宜具有能在体内或体外切割的加工位点。
编码合适信号序列的DNA可以从分泌性细菌蛋白的基因衍生获得,这些基因例如是大肠杆菌外膜蛋白基因(ompA)[Masui等人(1983),《基因表达的实验操作》;Ghrayeb等人(1984)EMBO J.3:2437]以及大肠杆菌碱性磷酸酶信号序列(phoA)[Oka等人(1985)Proc.Natl.Acad.Sci.82:7212]。另一个例子是,可采用各种芽孢杆菌菌株的α淀粉酶基因的信号序列将异源蛋白分泌出枯草芽孢杆菌[Palva等人(1982)Proc.Natl.Acad.Sci.USA 79:5582;EP-A-0 244 042]。
通常,细菌所识别的转录终止序列是位于翻译终止密码子3′的调控区,它和启动子一起侧接在编码序列的两侧。这些序列指导mRNA的转录,而mRNA能被翻译成该DNA所编码的多肽。转录终止序列通常包括约50个核苷酸的DNA序列,该序列能形成帮助终止转录的茎环结构。例子包括衍生自具有强启动子的基因(如大肠杆菌中的trp基因以及其它生物合成的基因)的转录终止序列。
上述组件,包括启动子、信号序列(如果需要的)、感兴趣的编码序列以及转录终止序列通常一起被放在表达构建物中。表达构建物通常以复制子的形式维持,例如能在宿主(如细菌)中稳定维持的染色体外元件(如质粒)。复制子具有一个复制系统,从而允许其维持在原核宿主中或进行表达或进行克隆和扩增。另外,复制子可以是高拷贝数或低拷贝数的质粒。高拷贝数质粒的拷贝数大致在约5至200之间,通常在约10至150之间。含有高拷贝数质粒的宿主宜含有至少约10个质粒,更佳的含有至少约20个质粒。根据载体以及外源蛋白对宿主的影响,可以选择高拷贝数或低拷贝数的载体。
另外,表达构建物可以和一个整合载体一起整合入细菌基因组中。整合载体通常含有至少一个序列与细菌染色体同源,从而允许该载体整合。整合看来是载体和细菌染色体中的同源DNA之间重组引起的。例如,用不同芽孢杆菌菌株的DNA构建的整合载体整合到芽孢杆菌染色体中(EP-A-0 127 328)。整合载体还可包含噬菌体或转座子序列。
通常,染色体外以及整合的表达构建物均含有可选择的标记,以便选择已经转化的菌株。可选择标记可在细菌宿主中表达,其包括赋予细菌对药物(如氨苄青霉素、氯霉素、红霉素、卡那霉素(新霉素)和四环素)抗性的基因[Davies等人(1978)Annu.Rev.Microbiol.32:469]。可选择标记还可包括生物合成性基因,如在组氨酸、色氨酸以及亮氨酸生物合成途径中的那些基因。
另外,上述某些组件可以一起放在转化载体中。转化载体通常包含一个可选择标记,如上所述,该载体以复制子形式维持或发展成一个整合载体。
已经开发出了用于转化到许多细菌中的表达和转化载体(无论是染色体外复制子还是整合载体)。例如,已经开发出了用于下列细菌的表达载体:枯草芽孢杆菌[Palva等人,(1982)Proc.Natl.Acad.Sci.USA 79:5582;EP-A-0 036 259和EP-A-0 063 953;WO 84/04541],大肠杆菌[Shimatake等人,(1981)Nature 292:128;Amann等人,(1985)Gene 40:183;Studier等人,(1986)J.Mol.Biol.189:113;EP-A-0 036 776,EP-A-0 136829和EP-A-0 136 907],酪链球菌[Powell等人,(1988)Appl.Environ.Microbiol.54:655];浅青紫链球菌[Powell等人,(1988)Appl.Environ.Microbiol.54:655],浅青紫链霉菌[US patent 4,745,056].
将外源DNA导入细菌宿主的方法是本领域熟知的,通常包括用氯化钙或其它试剂(如二价阳离子和DMSO)处理对细菌进行转化。DNA还可通过电穿孔方法导入细菌细胞。转化程序通常因待转化的细菌种类而不同。例如参见[Masson等人,(1989)FEMS Microbiol.Lett.60:273;Palva等人,(1982)Proc.Natl.Acad.Sci.USA 79:5582;EP-A-0 036 259和EP-A-0 063 953;WO 84/04541,芽孢杆菌],[Miller等人,(1988)Proc.Natl.Acad.Sci.85:856;Wang等人,(1990)J.Bacteriol.172:949,弯曲杆菌],[Cohen等人,(1973)Proc.Natl.Acad.Sci.69:2110;Dower等人,(1988)Nucleic Acids Res.16:6127;Kushner(1978)″用ColE1-衍生的质粒转化大肠杆菌的改进的方法″GeneticEngineering:Proceedings of the International Symposium on Genetic Engineering(H.W.Boyer和S.Nicosia编辑);Mandel等人,(1970)J.Mol.Biol.53:159;Taketo(1988)Biochim.Biophys.Acta 949:318;埃希氏菌],[Chassy等人,(1987)FEMS Microbiol.Lett.44:173乳酸杆菌];[Fiedler等人,(1988)Anal.Biochem 170:38,假单胞菌];[Augustin等人,(1990)FEMS Microbiol.Lett.66:203,葡萄球菌],[Barany等人,(1980)J.Bacteriol.144:698;Harlander(1987)″用电穿孔转化链球菌产乳酸微生物″Streptococcal Genetics(J.Ferretti和R.Curtiss III编辑);Perry等人,(1981)Infect.Immun.32:1295;Powell等人,(1988)Appl.Environ.Microbiol.54:655;Somkuti等人,(1987)Proc.4th Evr.Cong.Biotechnology 1:412,链球菌]。
v. Yeast expression
酵母表达系统也是本领域技术人员所知的。酵母启动子是能结合酵母RNA聚合酶并启动下游(3′)编码序列(如结构基因)转录成mRNA的DNA序列。启动子具有一个转录起始区,它通常位于编码序列的5′端附近。该转录起始区通常包括RNA聚合酶结合位点(″TATA″盒)以及一个转录起始位点。酵母启动子可能还有第二个功能区域称为上游激活序列(UAS),如果存在的话,它通常在结构基因的远端。UAS能调节表达(可诱导)。在UAS不存在时,发生组成型表达。表达的调控可能是正作用或负作用的,从而增强或减弱了转录。
酵母是一种发酵生物体,具有活泼的代谢途径,因此编码代谢途径中的酶的序列提供了特别有用的启动子序列。例子包括醇脱氢酶(ADH)(EP-A-0 284 044)、烯醇酶、葡萄糖激酶、葡萄糖-6-磷酸异构酶、甘油醛-3-磷酸-脱氢酶(GAP或GAPDH)、己糖激酶、磷酸果糖激酶、3-磷酸甘油酸变位酶、以及丙酮酸激酶(PyK)(EPO-A-0 329 203)。编码酸性磷酸酶的酵母PHO5基因也提供了有用的启动子序列[Myanohara等人(1983)Proc.Natl.Acad.Sci.USA 80:1]。
另外,非天然存在的合成的启动子也可象酵母启动子一样起作用。例如,一种酵母启动子的UAS序列可以和另一种酵母启动子的转录激活区连接在一起,形成合成的杂合启动子。这种杂合启动子的例子包括与GAP转录激活区相连的ADH调控序列(美国专利No.4,876,197和4,880,734)。杂合启动子的其它例子包括由ADH2、GAL4、GAL10或PHO5基因的调控序列组成的启动子与糖酵解酶基因如GAP或PyK的转录激活区组合(EP-A-0 164 556)。另外,酵母启动子可包括非酵母来源但能结合酵母RNA聚合酶并启动转录的天然存在的启动子。这些启动子的例子包括,尤其是,[Cohen等人,(1980)Proc.Natl.Acad.Sci.USA 77:1078;Henikoff等人,(1981)Nature 283:835;Hollenberg等人,(1981)Curr.Topics Microbiol.Immunol.96:119;Hollenberg等人,(1979)″细菌抗生素抗性基因在酿酒酵母中的表达″Plasmids of Medical,Environmentaland Commercial Importance(K.N.Timmis和A.Puhler编辑);Mercerau-Puigalon等人,(1980)Gene 11:163;Panthier等人,(1980)Curr.Genet.2:109]。
DNA分子可以在酵母菌胞内表达。启动子序列可以直接与DNA分子相连,在这种情况下,重组蛋白N端的第一个氨基酸始终是甲硫氨酸,其由ATG起始密码子编码。如果需要,可通过和溴化氰体外培育将N端的甲硫氨酸从蛋白质上切下。
象在哺乳动物、杆状病毒以及细菌表达系统中一样,融合蛋白为酵母表达系统提供了一种备选方案。通常,将编码内源酵母蛋白或其它稳定的蛋白之N端部分的DNA序列与异源编码序列的5′端融合。在表达时,该构建物将提供这两个氨基酸序列的融合物。例如,酵母或人超氧化物歧化酶(SOD)基因可以和外源基因5′端相连并在酵母中表达。两个氨基酸序列连接处的DNA序列可能编码或不编码可切割的位点。例如参见EP-A-0 196 056。另一个例子是遍在蛋白融合蛋白。这种融合蛋白由遍在蛋白区域组成,该区域宜保留一个酶(例如遍在蛋白特异性加工蛋白酶)加工位点,以便将外源蛋白和遍在蛋白切开。因此,通过这种方法,可以分离获得天然的外源蛋白(例如WO88/024066)。
另外,还可通过产生嵌合的DNA分子来将外源蛋白从细胞分泌到生长培养基中,该嵌合的DNA分子编码的融合蛋白含有一个前导序列片段,该前导序列片段能使酵母中的外源蛋白分泌出来。较佳的,在编码的前导片段和外来基因之间宜具有能在体内或体外切割的加工位点。该前导序列片段通常编码了含有疏水性氨基酸的信号肽,其指导蛋白从细胞分泌出来。
编码合适信号序列的DNA可以从分泌性酵母蛋白的基因衍生获得,这些基因例如有酵母转化酶基因(EP-A-0 012 873;JPO.62,096,086)以及A-因子基因(美国专利4,588,684)。另外,非酵母来源的前导序列(如干扰素前导序列)的存在也能提供分泌出酵母的作用(EP-A-0 060 057)。
较佳的一类分泌前导序列采用了酵母α-因子基因的片段,其含有″pre″信号序列和″pro″区。可采用的α因子片段的类型包括全长pre-pro α因子前导序列(约83个氨基酸残基)以及截短的α-因子前导序列(通常约25至50个氨基酸残基)(美国专利4,546,083和4,870,008;EP-A-0 324 274)。采用α-因子前导片段提供分泌作用的其它前导序列包括杂合的α-因子前导序列,其由第一个酵母的pre序列以及第二个酵母α因子的pro区域组成(例如见WO 89/02463)。
通常,被酵母识别的转录终止序列是位于翻译终止密码子3′的调控区,其和启动子一起侧接在编码序列的两侧。这些序列指导mRNA的转录,而mRNA能被翻译成该DNA所编码的多肽。转录终止序列和其它酵母识别的终止序列的例子例如是编码糖酵解酶的那些转录终止序列。
上述组件,包括启动子、信号序列(如果需要的)、感兴趣的编码序列以及转录终止序列,通常被一起放在表达构建物中。表达构建物通常以复制子的形式保持,例如能在宿主(如酵母或细菌)中稳定保持的染色体外元件(如质粒)。复制子可能具有两个复制系统,从而允许其能维持在例如酵母中进行表达,并能维持在原核宿主进行克隆和扩增。这些酵母-细菌穿梭载体的例子包括YEp24[Botstein等人(1979)Gene8:17-24],pCL/1[Brake等人,(1984)Proc.Natl.Acad.Sci.USA 81:4642-4646]和YRp17[Stinchcomb等人(1982)J.Mol.Biol.158:157]。另外,复制子可以是高拷贝数或低拷贝数的质粒。高拷贝数质粒的拷贝数大致在约5至200之间,通常在约10至150之间。含有高拷贝数质粒的宿主宜含有至少约10个质粒,更佳的含有至少约20个质粒。根据载体以及外源蛋白对宿主的影响,可以选择高拷贝数或低拷贝数的载体。例如参见Brake等人,同上。
另外,表达构建物可以和一个整合载体一起整合入酵母基因组中。整合载体通常含有至少一个序列与酵母染色体同源,从而允许该载体整合,最好含有两个同源序列侧接该表达构建物。整合看来是载体和酵母染色体中同源DNA之间重组引起的[Orr-Weaver等人(1983)Methods in Enzymol.101:228-245]。通过选择合适的同源序列插入载体中,可以使整合载体针对酵母中某一特定的基因座。见Orr-Weaver等人,同上。可以整合入一个或多个表达构建物,这可能会影响重组蛋白产生的水平[Rine等人(1983)Proc.Natl.Acad.Sci.USA 80:6750]。载体中的染色体序列可以载体中的单个片段形式存在(从而导致整个载体的整合),或是与染色体中的相邻片段同源的两个片段,这两个片段在载体中侧接在表达构建物两侧,从而导致仅仅表达构建物稳定地整合。
通常,染色体外以及整合的表达构建物均含有可选择的标记,以便选择已经转化的酵母菌株。可选择标记可包括能在酵母宿主中表达的生物合成基因(如ADE2、HIS4、LEU2、TRP1和ALG7以及G418抗性基因),这些基因分别赋予酵母细胞对衣霉素以及G418的抗性。另外,合适的可选择标记还可能为酵母在毒性化合物(如金属)存在下提供生长能力。例如,CUP1的存在使酵母能在铜离子存在下生长[Butt等人,(1987)Microbiol,Rev.51:351]。
另外,上述某些组件可以一起放在转化载体中。转化载体通常包含一个可选择标记,如上所述,该载体以复制子形式维持或发展成一个整合载体。
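Expression and transformation vectors, either extrachromosomal replicons or integrating vectors, have been developed for transformation into many yeasts. For example, expression vectors have been developed for the following yeasts: Candida albicans [Kurtz et al. (1986) Mol. Cell. Biol. 6:142], Candida maltosa [Kunze et al. (1985) J. Basic Microbiol. 25:141], Hansenula polymorpha [Gleeson et al. (1986) J. Gen. Microbiol. 132:3459; Roggenkamp et al. (1986) Mol. Gen. Genet. 202:302], Kluyveromyces fragilis [Das et al. (1984) J. Bacteriol. 158:1165], Kluyveromyces lactis [De Louvencourt et al. (1983) J. Bacteriol. 154:737; Van den Berg et al. (1990) Bio/Technology 8:135], Pichia guilliermondii [Kunze et al. (1985) J. Basic Microbiol. 25:141], Pichia pastoris [Cregg et al. (1985) Mol. Cell. Biol. 5:3376; US patents 4,837,148 and 4,929,555], Saccharomyces cerevisiae [Hinnen et al. (1978) Proc. Natl. Acad. Sci. USA 75:1929; Ito et al. (1983) J. Bacteriol. 153:163], Schizosaccharomyces pombe [Beach and Nurse (1981) Nature 300:706], and Yarrowia lipolytica [Davidow et al. (1985) Curr. Genet. 10:39; Gaillardin et al. (1985) Curr. Genet. 10:49].
Methods of introducing exogenous DNA into yeast hosts are well known in the art and typically include either the transformation of spheroplasts or of intact yeast cells treated with alkali cations. Transformation procedures usually vary with the yeast species to be transformed. See e.g. [Kurtz et al. (1986) Mol. Cell. Biol. 6:142; Kunze et al. (1985) J. Basic Microbiol. 25:141; Candida]; [Gleeson et al. (1986) J. Gen. Microbiol. 132:3459; Roggenkamp et al. (1986) Mol. Gen. Genet. 202:302; Hansenula]; [Das et al. (1984) J. Bacteriol. 158:1165; De Louvencourt et al. (1983) J. Bacteriol. 154:737; Van den Berg et al. (1990) Bio/Technology 8:135; Kluyveromyces]; [Cregg et al. (1985) Mol. Cell. Biol. 5:3376; Kunze et al. (1985) J. Basic Microbiol. 25:141; US patents 4,837,148 and 4,929,555; Pichia]; [Hinnen et al. (1978) Proc. Natl. Acad. Sci. USA 75:1929; Ito et al. (1983) J. Bacteriol. 153:163; Saccharomyces]; [Beach and Nurse (1981) Nature 300:706; Schizosaccharomyces]; [Davidow et al. (1985) Curr. Genet. 10:39; Gaillardin et al. (1985) Curr. Genet. 10:49; Yarrowia].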
Antibodies
本文所用的术语“抗体”指由至少一个抗体结合位点组成的一个或一组多肽。“抗体结合位点”是一个三维结合空间,其内表面形状和电荷分布与抗原表位的特征互补,从而使抗体与抗原结合。“抗体”例如包括,脊椎动物抗体、杂合抗体、嵌合抗体、人化抗体、经修饰的抗体、单价抗体、Fab蛋白以及单结构域抗体。
针对本发明蛋白的抗体可用于亲和层析、免疫试验以及区别/鉴定奈瑟球菌蛋白。
针对本发明蛋白的多克隆和单克隆抗体可用常规方法制得。通常,首先用蛋白来免疫合适的动物,较佳的是小鼠、大鼠、家兔或山羊。由于可获得的血清体积多,能获得标记的抗家兔和抗山羊抗体,因此对于制备多克隆抗血清来说,家兔和山羊是较佳的。免疫通常这样进行:将蛋白混合或乳化到盐水(较佳的是佐剂如Freund完全佐剂)中,然后肠胃外(通常是皮下或肌内)注射该混合物或乳剂。每次注射50-200微克的剂量就足够了。2-6周后用盐水(较佳的是用Freund不完全佐剂)配的蛋白质注射一次或多次以强化免疫。另外可以用本领域已知的方法进行体外免疫来产生抗体,从本发明的目的来看,认为其与体内免疫等效。将免疫后的动物血液抽取到玻璃或塑料容器中,25℃培育该血液1小时,然后4℃培育2-18小时,获得多克隆抗血清。离心(例如1000g 10分钟)回收血清。家兔每次取血可获得约20-50毫升。
用Kohler和Milstein的标准方法[Nature(1975)256:495-96]或其改进方法制得单克隆抗体。通常,如上所述对小鼠或大鼠免疫。然而,并非是对动物取血然后抽提血清,而是取出脾脏(以及任选地取出几个大的淋巴结),将其分离成单细胞。如果需要,可将细胞悬液(在除去非特异性粘附的细胞后)加入包被了蛋白质抗原的板或孔中,对脾细胞进行筛选。表达抗原特异性的膜结合免疫球蛋白的B细胞结合到板上,不象悬液其它物质那样被洗去。然后使所得B细胞或所有解离的脾细胞与骨髓瘤细胞融合形成杂交瘤,培养在选择性培养基(如次黄嘌呤、氨基蝶呤胸苷培养基,“HAT”)中。通过有限稀释接种所得杂交瘤,并测定特异性结合免疫抗原(且不结合无关抗原)的抗体的产生。然后,体外(例如在组织培养瓶或中空纤维反应器中)或体内(如小鼠腹水中)培养所选的分泌单克隆抗体的杂交瘤。
如果需要,抗体(无论是多克隆还是单克隆抗体)可用常规技术来标记。合适的标记包括荧光团、发色团、放射活性原子(具体是32P和125I)、密电子试剂、酶、以及具有特异性结合配偶的配体。酶通常靠其活性来检测。例如,辣根过氧化物酶通常是检测其将3,3′,5,5′-四甲基联苯胺(TMB)转变成蓝色的能力,可用分光光度计定量测定。“特异性结合配偶”指能以高特异性结合配体分子的蛋白质,例如抗原以及对其有特异性的单克隆抗体。其它特异性结合配偶包括生物素和亲和素或链亲和素,IgG和蛋白A,以及本领域已知的许多受体-配体对。应理解,上述内容并非要将各种标记分成不同的类,因为同一标记可在几种不同的模型中起作用。例如,125I可作为放射活性标记,或作为密电子试剂。HRP可作为酶或单抗的抗原。另外,一种物质可以和各种标记组合以获得所需的效果。例如,在实施本发明中,单抗和亲和素也需要标记,因此,可以用生物素标记单抗,并用标记了125I的亲和素检测其存在,或用标记HRP的抗生物素单抗检测其存在。其它替换和可能性对于本领域普通技术人员来说是显而易见的,所以应认作等价物属于本发明的范围。
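Pharmaceutical compositions
Pharmaceutical compositions can comprise polypeptides, antibodies, or nucleic acid of the invention. The pharmaceutical compositions will comprise a therapeutically effective amount of polypeptides, antibodies, or polynucleotides of the invention.
The term "therapeutically effective amount" as used herein refers to an amount of a therapeutic agent that treats, ameliorates, or prevents a desired disease or condition, or that exhibits a detectable therapeutic or preventative effect. The effect can be detected by, for example, chemical markers or antigen levels. Therapeutic effects also include reduction in physical symptoms, such as decreased body temperature. The precise effective amount for a subject will depend on the subject's size and health, the nature and extent of the condition, and the therapeutics or combination of therapeutics selected for administration. Thus, it is not useful to specify an exact effective amount in advance. However, the effective amount for a given situation can be determined by routine experimentation and is within the judgement of the clinician.
For purposes of the invention, an effective dose will be from about 0.01 mg/kg to 50 mg/kg, or 0.05 mg/kg to 10 mg/kg, of the DNA constructs in the individual to which it is administered.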
药物组合物还可含有药学上可接受的载体。术语“药学上可接受的载体”指用于治疗剂(例如抗体、多肽、基因或其它治疗剂)给药的载体。该术语指这样一些药剂载体:它们本身不诱导产生对接受该组合物的个体有害的抗体,且给药后没有过分的毒性。合适的载体可能是大的、代谢缓慢的大分子,如蛋白质、多糖、聚乳酸(polylacticacid)、聚乙醇酸、氨基酸聚合物、氨基酸共聚物以及无活性的病毒颗粒。这些载体是本领域普通技术人员所熟知的。
本文可用的药学上可接受的盐例如有:无机酸盐,如盐酸盐、氢溴酸盐、磷酸盐、硫酸盐等;以及有机酸的盐,如乙酸盐、丙酸盐、丙二酸盐、苯甲酸盐等。在Remington′sPharmaceutical Sciences(Mack Pub.Co.,N.J.1991)中可找到关于药学上可接受的赋形剂的充分讨论。
治疗性组合物中的药学上可接受的载体可含有液体,如水、盐水、甘油和乙醇。另外,这些载体中还可能存在辅助性的物质,如润湿剂或乳化剂、pH缓冲物质等。通常,可将治疗性组合物制成可注射剂,例如作为液体溶液或悬液;还可制成在注射前适合配入溶液或悬液中、液体载体的固体形式。脂质体也包括在药学上可接受的载体的定义中。
Delivery methods
一旦配成本发明的组合物,可将其直接给予对象。待治疗的对象可以是动物;尤其可以治疗人对象。
直接输送该组合物通常可通过皮下、腹膜内、静脉内或肌内注射或输送至组织间隙来实现。组合物也可输送至病灶区。其它给药方式包括口服和肺给药、栓剂和透皮或经皮肤应用(例如参见WO98/20734)、用针、基因枪或手持喷雾器(hypospray)。治疗剂量方案可以是单剂方案或多剂方案。
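Vaccines
Vaccines according to the invention may be either prophylactic (i.e. to prevent infection) or therapeutic (i.e. to treat disease after infection).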
这些疫苗包含免疫性抗原、免疫原、多肽、蛋白或核酸,通常与“药学上可接受的载体”组合,这些载体包括本身不诱导产生对接受该组合物的个体有害的抗体的任何载体。合适的载体通常是大的、代谢缓慢的大分子,如蛋白质、多糖、聚乳酸、聚乙醇酸、氨基酸聚合物、氨基酸共聚物、脂质凝集物(如油滴或脂质体)以及无活性的病毒颗粒。这些载体是本领域普通技术人员所熟知的。另外,这些载体可作为免疫刺激剂(“佐剂”)。另外,抗原或免疫原可以和细菌类毒素(如白喉、破伤风、霍乱、幽门螺杆菌等病原体的类毒素)偶联。
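Preferred adjuvants to enhance effectiveness of the composition include, but are not limited to: (1) aluminium salts (alum), such as aluminium hydroxide, aluminium phosphate, aluminium sulphate, etc.; (2) oil-in-water emulsion formulations (with or without other specific immunostimulating agents such as muramyl peptides (see below) or bacterial cell wall components), such as (a) MF59™ (WO 90/14837; Chapter 10 in Vaccine Design: The Subunit and Adjuvant Approach, eds. Powell and Newman, Plenum Press 1995), containing 5% squalene, 0.5% Tween 80, and 0.5% Span 85 (optionally containing various amounts of MTP-PE (see below), although not required), formulated into submicron particles using a microfluidizer such as the Model 110Y microfluidizer (Microfluidics, Newton, MA); (b) SAF, containing 10% squalene, 0.4% Tween 80, 5% pluronic-blocked polymer L121, and thr-MDP (see below), either microfluidized into a submicron emulsion or vortexed to generate a larger particle size emulsion; and (c) Ribi™ adjuvant system (RAS) (Ribi Immunochem, Hamilton, MT), containing 2% squalene, 0.2% Tween 80, and one or more bacterial cell wall components from the group consisting of monophosphoryl lipid A (MPL), trehalose dimycolate (TDM), and cell wall skeleton (CWS), preferably MPL+CWS (Detox™); (3) saponin adjuvants, such as Stimulon™ (Cambridge Bioscience, Worcester, MA), or particles generated therefrom such as ISCOMs (immunostimulating complexes); (4) complete Freund's adjuvant (CFA) and incomplete Freund's adjuvant (IFA); (5) cytokines, such as interleukins (e.g. IL-1, IL-2, IL-4, IL-5, IL-6, IL-7, IL-12, etc.), interferons (e.g. gamma interferon), macrophage colony stimulating factor (M-CSF), tumour necrosis factor (TNF), etc.; and (6) other substances that act as immunostimulating agents to enhance the effectiveness of the composition. Alum and MF59™ are preferred.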
如上所述,胞壁酰肽包括但不局限于,N-乙酰-胞壁酰-L-苏氨酰-D-异谷氨酰胺(thr-MDP)、N-乙酰-去胞壁酰-L-丙氨酰-D-异谷氨酰胺(nor-MDP)、N-乙酰胞壁酰-L-丙氨酰-D-异谷氨酰氨酰基-L-丙氨酸-2-(1′-2′-二棕榈酰-sn-甘油-3-羟基磷酰氧)-乙胺(MTP-PE)等。
免疫原性组合物(如免疫用抗原/免疫原/多肽/蛋白质/核酸,药学上可接受的载体以及佐剂)通常含有稀释剂,如水,盐水,甘油,乙醇等。另外,辅助性物质,如润湿剂或乳化剂、pH缓冲物质等可存在于该赋形剂中。
通常,可将免疫原性组合物制成可注射剂,例如作为液体溶液或悬液;还可制成在注射前适合配入溶液或悬液、液体赋形剂的固体形式。该制剂还可乳化或包封在脂质体中,在上述药学上可接受的载体下增强佐剂效果。
用作疫苗的免疫原性组合物包含免疫学有效量的抗原性或免疫原性多肽,以及上述其它所需的组分。“免疫学有效量”指以单剂或连续剂一部分给予个体的量对治疗或预防是有效的。该用量根据所治疗个体的健康状况和生理状况、所治疗个体的类别(如非人灵长类等)、个体免疫系统合成抗体的能力、所需的保护程度、疫苗的配制、治疗医师对医疗状况的评估、及其它的相关因素而定。预计该用量将在相对较宽的范围内,可通过常规实验来确定。
传统方法是从肠胃外(皮下、肌内、或透皮/经皮肤(如WO98/20734))途径通过注射给予免疫原性组合物。适合其它给药方式的其它配方包括口服和肺制剂、栓剂和透皮应用。治疗剂量可以是单剂方案或多剂方案。疫苗可以结合其它免疫调节剂一起给予。
作为以蛋白质为基础的疫苗的备选方案是,可以采用DNA疫苗接种[例如,Robinson和Torres(1997)Seminars in Immunology 9:271-283;Donnelly等人(1997)Annu Rev.Immunol 15:617-648;见下文]。
Gene delivery vehicles
用于输送构建物的基因治疗载体可以口服或全身性给予,其中所述构建物包括本发明治疗剂的编码序列,将其输送至哺乳动物以便在哺乳动物体内表达。这些构建物可利用体内或活体外方式中的病毒或非病毒载体方法。这些编码序列的表达可用内源哺乳动物启动子或异源启动子诱导。编码序列的体内表达可以是组成型的或受调控的。
本发明包括能表达所涉及的核酸序列的基因输送载体。基因输送载体宜为病毒载体,更佳的是逆转录病毒、腺病毒、腺伴随病毒(AAV)、疱疹病毒或甲病毒载体。病毒载体还可以是星状病毒、冠状病毒、正粘病毒、乳多空病毒、副粘病毒、细小病毒、小核糖核酸病毒、痘病毒或披膜病毒的病毒载体。通常参见Jolly(1994)Cancer GeneTherapy 1:51-64;Kimura(1994)Human Gene Therapy 5:845-852;Connelly(1995)Human Gene Therapy 6:185-193;以及Kaplitt(1994)Nature Genetics 6:148-153。
逆转录病毒载体是本领域中熟知的,我们认为任何逆转录病毒基因治疗载体均可用于本发明,包括B、C和D型逆转录病毒、异嗜性逆转录病毒(例如NZB-X1、NZB-X2和NZB9-1(见O′Neill(1985)J.Virol.53:160)广食性逆转录病毒如MCF和MCF-MLV(见Kelly(1983)J.Virol 45:291)、泡沫病毒和慢病毒。见《RNA肿瘤病毒》第2版,ColdSpring Harbor Laboratory,1985。
逆转录病毒基因治疗载体的诸部分可从不同逆转录病毒衍生获得。例如,逆转录载体LTR可以从鼠肉瘤病毒衍生获得,tRNA结合位点可以从Rous肉瘤病毒衍生获得,包装信号从鼠白血病病毒获得,第二链的合成起点从禽类白血病病毒获得。
可将这些重组逆转录病毒导入合适的包装细胞系,用来产生转导感受态逆转录病毒载体颗粒(见美国专利5,591,624)。通过将嵌合性整合酶掺入逆转录病毒颗粒,构建逆转录病毒载体,以便将其定点整合到宿主细胞DNA中(见WO96/37626)。较佳的是重组病毒载体是复制缺陷型重组病毒。
适合与上述逆转录病毒载体一起使用的包装细胞系是本领域中熟知的,很容易制得(见WO95/30763和WO92/05266),并能用来产生能生产重组载体颗粒的生产型细胞系(也称为载体细胞系或“VCL”)。包装细胞系宜从人亲代细胞(如HT1080细胞)或貂亲代细胞系制取,以便消除人血清的灭活作用。
用来构建逆转录病毒基因治疗载体的较佳的逆转录病毒包括禽类白血病病毒、牛白血病病毒、鼠白血病病毒、水貂细胞灶诱导病毒、鼠肉瘤病毒、网状内皮组织增殖病毒和Rous肉瘤病毒。特别佳的鼠白血病病毒包括4070A和1504A(Hartley和Rowe(1976)J Virol 19:19-25),Abelson(ATCC No.VR-999),Friend(ATCCNo.VR-245),Graffi,Gross(ATCC Nol VR-590),Kirsten,Harvey肉瘤病毒和Rauscher(ATCC No.VR-998)以及莫洛尼鼠白血病病毒(ATCC No.VR-190)。这些逆转录病毒可以从保藏机构或保藏中心如Rockville,Maryland的美国典型培养物保藏中心(ATCC)获得,或用常用的技术从已知来源分离获得。
可用于本发明的典型的已知逆转录病毒基因治疗载体包括在以下专利申请中描述的那些载体:GB2200651,EP0415731,EP0345242,EP0334301,WO89/02468;WO89/05349,WO89/09271,WO90/02806,WO90/07936,WO94/03622,WO93/25698,WO93/25234,WO93/11230,WO93/10218,WO91/02805,WO91/02825,WO95/07994,US 5,219,740,US 4,405,712,US 4,861,719,US 4,980,289,US 4,777,127,US 5,591,624.另见Vile(1993)Cancer Res 53:3860-3864;Vile(1993)Cancer Res 53:962-967;Ram(1993)Cancer Res 53(1993)83-88;Takamiya(1992)J Neurosci Res 33:493-503;Baba(1993)J Neurosurg 79:729-735;Mann(1983)Cell 33:153;Cane(1984)Proc Natl AcadSci 81:6349;以及Miller(1990)Human Gene Therapy 1。
人腺病毒基因治疗载体也是本领域中已知的,并可用于本发明。例如参见Berkner(1988)Biotechniques 6:616和Rosenfeld(1991)Science 252:431,以及WO93/07283,WO93/06223和WO93/07282。用于本发明的典型的已知的腺病毒基因治疗载体包括在上述文献以及下述专利中描述的那些例子:WO94/12649,WO93/03769,WO93/19191,WO94/28938,WO95/11984,WO95/00655,WO95/27071,WO95/29993,WO95/34671,WO96/05320,WO94/08026,WO94/11506,WO93/06223,WO94/24299,WO95/14102,WO95/24297,WO95/02697,WO94/28152,WO94/24299,WO95/09241,WO95/25807,WO95/05835,WO94/18922和WO95/09654。另外,可以采用Curiel(1992)Hum.Gene Ther.3:147-154中描述的给予和已杀死腺病毒相连的DNA的方法。本发明的基因输递载体还包括腺病毒伴随病毒(AAV)载体。用于本发明的这种载体的主要且较佳的例子是Srivastava,WO93/09239中公开的AAV-2为基的载体。最佳的AAV载体包含两个AAV反向末端重复序列,其中通过替换核苷酸对天然D-序列进行修饰,使至少5-18个天然的核苷酸(较佳的至少10-18个天然核苷酸,最佳的10个天然核苷酸)被保留下来,而D-序列其余的核苷酸缺失或被非天然核苷酸取代。AAV末端反向重复序列的天然D-序列是每个AAV反向末端重复序列中不参与HP形成的20个串联核苷酸的序列(即每一端有一个序列)。非天然的替换核苷酸可以是天然D-序列该位置中所见核苷酸除外的任何核苷酸。其它可采用典型AAV载体是pWP-19、pWN-1,两者均公开在Nahreini(1993)Gene 124:257-262中。这样的AAV的另一个例子是psub201(见Samulski(1987)J.Virol.61:3096)。另一个典型的AAV载体是Double-D ITR载体。Double-D ITR载体的构建方案公开在美国专利5,478,745中。还有其它的载体是公开在Carter的美国专利4,797,368和Muzyczka的美国专利5,139,941、Chartejee的美国专利5,474,935和Kotin的WO94/288157中的载体。可用于本发明的另一个AAV载体例子是SSV9AFABTKneo,它含有AFP增强子和白蛋白启动子,并且主要指导肝内表达。其结构和构建方案公开在Su(1996)Human Gene Therapy 7:463-470中。其它的AAV基因治疗载体在美国专利5,354,678,5,173,414,5,139,941,5,252,479中有所描述。
本发明的基因治疗载体还包括疱疹载体。主要且较佳的例子是含有编码胸苷激酶多肽的序列的单纯疱疹病毒载体,如公开在US5,288,641和EP0176170(Roizman)中的那些。其它典型的单纯疱疹病毒载体包括WO95/04139中公开的HFEM/ICP6-LacZ(Wistar Institute)、Geller(1988)Science 241:1667-1669以及WO90/09441和WO92/07945中公开的pHSVlac、Fink(1992)Human Gene Therapy3:11-19中描述的HSV Us3::pgC-lacZ、EP 0453242(Breakefield)中描述的HSV 7134、2RH 105和GAL4以及保藏于ATCC、保藏号为ATCC VR-977和ATCC VR-260的那些病毒。
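Alphavirus gene therapy vectors may also be employed in this invention. Preferred alphavirus vectors are Sindbis virus vectors. Other suitable examples are togaviruses, Semliki Forest virus (ATCC VR-67; ATCC VR-1247), Middleberg virus (ATCC VR-370), Ross River virus (ATCC VR-373; ATCC VR-1246), Venezuelan equine encephalitis virus (ATCC VR923; ATCC VR-1250; ATCC VR-1249; ATCC VR-532), and those described in US patents 5,091,309 and 5,217,879 and in WO92/10578. More particularly, the alphavirus vectors described in US Serial No. 08/405,627, filed March 15, 1995, WO94/21792, WO92/10578, WO95/07994, US 5,091,309 and US 5,217,879 can be used. Such alphaviruses may be obtained from depositories or collections such as the American Type Culture Collection (ATCC) in Rockville, Maryland, or isolated from known sources using commonly available techniques. Preferably, alphavirus vectors with reduced cytotoxicity are used (see USSN 08/679640).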
DNA载体系统,如真核分层的(layered)表达系统也可用于表达本发明的核酸。关于真核分层的表达系统详见WO95/07994。较佳的,本发明的真核分层表达系统宜从甲病毒载体衍生获得,更佳的从新培斯病毒载体衍生获得。
适用于本发明的其它病毒载体包括:从脊髓灰质炎病毒衍生的病毒,例如ATCCVR-58以及在Evans,Nature 339(1989)385和Sabin(1973)J.Biol.Standardization 1:115中描述的那些;鼻病毒,例如ATCC VR-1110以及在Arnold(1990)J Cell Biochem L401中描述的那些;痘病毒,如金黄色痘病毒或牛痘病毒,例如ATCC VR-111和ATCCVR-2010,以及在Fisher-Hoch(1989)Proc Natl Acad Sci 86:317;Flexner(1989)Ann NYAcad Sci 569:86,Flexner(1990)Vaccine 8:17;US 4,603,112,US 4,769,330以及WO89/01973中描述的那些;SV40病毒,例如ATCC VR-305以及在Mulligan(1979)Nature 277:108和Madzak(1992)J Gen Virol 73:1533中描述的那些;流感病毒,例如ATCC VR-797以及用例如US 5,166,057和Enami(1990)Proc Natl Acad Sci87:3802-3805;Enami和Palese(1991)J Virol 65:2711-2713;Luytjes(1989)Cell 59:110中所述的反基因技术制得的重组流感病毒(另见McMichael(1983)NEJ Med 309:13,Yap(1978)Nature 273:238以及Nature(1979)277:108);EP-0386882和Buchschacher(1992)J Virol.66:2731中描述的人免疫缺陷病毒;麻疹病毒,例如ATCC VR-67和VR-1247,以及EP-0440219中描述的那些;奥拉病毒,例如ATCC VR-368;Bebaru病毒,例如ATCC VR-600和ATCC VR-1240;Cabassou病毒,例如ATCC VR-922;屈曲病毒,例如ATCC VR-64和ATCC VR-1241;Fort Morgan病毒,例如ATCC VR-924;Getah病毒,例如ATCC VR-369和ATCC VR-1243;Kyzylagach病毒,例如ATCCVR-927;Mayaro病毒,例如ATCC VR-66;Mucambo病毒,例如ATCC VR-580和ATCC VR-1244;Ndumu病毒,例如ATCC VR-371;Pixuna病毒,例如ATCC VR-372和ATCC VR-1245;Tonate病毒,例如ATCC VR-925;Triniti病毒,例如ATCC VR-469;Una病毒,例如ATCC VR-374;Whataroa病毒,例如ATCC VR-926;Y-62-33病毒,例如ATCC VR-375;O′Nyong病毒,东部脑炎病毒,例如ATCC VR-65和ATCCVR-1242;西部脑炎病毒,例如ATCC VR-70,ATCC VR-1251,ATCC VR-622和ATCCVR-1252;和冠状病毒,例如ATCC VR-740和在Hamre(1966)Proc Soc Exp Biol Med121:190中描述的那些。
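Delivery of the compositions of this invention into cells is not limited to the viral vectors mentioned above. Other delivery methods and media may be employed, such as nucleic acid expression vectors; polycationic condensed DNA, linked or unlinked to killed adenovirus alone (see, for example, US Serial No. 08/366,787, filed December 30, 1994, and Curiel (1992) Hum Gene Ther 3:147-154); ligand-linked DNA (see, for example, Wu (1989) J. Biol. Chem. 264:16985-16987); eukaryotic cell delivery vehicle cells (see, for example, US Serial No. 08/240,030, filed May 9, 1994, and US Serial No. 08/404,796); deposition of photopolymerized hydrogel materials; a hand-held gene transfer particle gun (as described in US Patent 5,149,655); ionizing radiation (as described in US5,206,152 and WO92/11033); nucleic charge neutralization; or fusion with cell membranes. Additional approaches are described in Philip (1994) Mol Cell Biol 14:2411-2418 and in Woffendin (1994) Proc. Natl. Acad. Sci. 91:11581-11585.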
可以采用颗粒介导的基因转移,例如参见美国申请No.60/023,867。简言之,可将序列插入含有控制高水平表达的常规序列的常规载体中,然后和合成性基因转移分子一起培育,这些基因转移分子例如是聚合性DNA-结合阳离子(如聚赖氨酸、鱼精蛋白和白蛋白),其与细胞寻靶配体(如脱唾液酸血清类粘蛋白(如Wu和Wu(1987)J.Biol.Chem.262:4429-4432所述)、胰岛素(如Hucked(1990)Biochem Pharmacol40:253-263所述)、半乳糖(如Plank(1992)Bioconjugate Chem 3:533-539所述)、乳糖或运铁蛋白)相连。
还可使用裸露的DNA。典型的裸露DNA导入方法在WO 90/11902和US5,580,859中有所描述。用可生物降解的乳胶珠可以改善摄取效果。在对珠粒的胞吞作用开始后,DNA包被的乳胶珠粒被有效地运输到细胞中。通过处理珠粒以提高其疏水性可进一步改进该方法,从而帮助破坏核内体和将DNA释放到细胞质中。
可作为基因输送载体的脂质体在US 5,422,120,WO95/13796,WO94/23697,WO91/14445和EP-524,968中有所描述。如USSN 60/023,867中所描述的,在非病毒输送时,可将编码多肽的核酸序列插入含有控制高水平表达的常规序列的常规载体中,然后和合成性基因转移分子一起培育,这些基因转移分子例如是聚合性DNA-结合阳离子(如聚赖氨酸、鱼精蛋白和白蛋白),其与细胞寻靶配体(如脱唾液酸血清类粘蛋白、胰岛素、半乳糖、乳糖或运铁蛋白)相连。其它输送系统包括采用脂质体来包裹DNA,该DNA所含基因在各种组织特异性或活性普遍存在的启动子控制下。适用的其它非病毒输送系统包括机械输送系统,如Woffendin等人(1994)Proc.Natl.Acad.Sci.USA 91(24):11581-11585中描述的方法。另外,该系统的编码序列和表达产物可以通过光聚合的水凝胶材料的沉淀来输送。可用来输送编码序列的其它基因输送常规方法例如包括,用手提式基因转移颗粒枪(如美国专利5,149,655所述);用电离辐射来激活转移的基因(如US 5,206,152和WO92/11033所述)。
典型的脂质体和聚阳离子基因输送载体在下列文献中有所描述:US 5,422,120和4,762,915;WO 95/13796;WO94/23697;WO91/14445;EP-0524968;Stryer,Biochemistry,236-240页(1975)W.H.Freeman,San Francisco;Szoka(1980)BiochemBiophys Acta 600:1;Bayer(1979)Biochem Biophys Acta 550:464;Rivnay(1987)MethEnzymol 149:119;Wang(1987)Proc Natl Acad Sci 84:7851;Plant(1989)Anal Biochem176:420。
多核苷酸组合物可包含治疗有效量的基因治疗载体,其定义如上所述。出于本发明的目的,有效的剂量是给予个体约0.01毫克/千克至50毫克/千克或0.05毫克/千克至10毫克/千克的DNA构建物。
输送方法
一旦配制成后,本发明的多核苷酸组合物可以以下三种方式给予:(1)直接给予对象;(2)活体外输送至从对象衍生获得的细胞;或(3)体外表达重组蛋白。待处理的对象可以是哺乳动物或鸟类。另外,也可对人进行治疗。
直接输送该组合物通常可通过皮下、腹膜内、静脉内或肌内注射或输送至组织间隙来实现。组合物也可输送至病灶区。其它给药方式包括口服和肺给药、栓剂和透皮或经皮肤应用(例如参见WO98/20734)、用针、基因枪或手持喷雾器(hypospray)。治疗剂量方案可以是单剂方案或多剂方案。
活体外输送以及将转化的细胞重新植入对象体内的方法是本领域所熟知的,在例如WO93/14778中有所描述。用于活体外应用的细胞例子包括例如干细胞、尤其是造血细胞、淋巴细胞、巨噬细胞、树突细胞或肿瘤细胞。
通常,对于活体外和体外应用,核酸的输送可通过以下步骤来实现,例如有葡聚糖介导的转染、磷酸钙沉淀、Polybrene介导的转染、原生质体融合、电穿孔、将多核苷酸包囊在脂质体中以及将DNA直接显微注射到胞核中,所有这些均是本领域所熟知的。
多核苷酸和多肽药物组合物
除了上述的药学上可接受的载体和盐外,多核苷酸和多肽组合物中还可采用下列附加试剂。
A.多肽
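One example is polypeptides, which include, without limitation: asialoorosomucoid (ASOR); transferrin; asialoglycoproteins; antibodies; antibody fragments; ferritin; interleukins; interferons; granulocyte-macrophage colony stimulating factor (GM-CSF); granulocyte colony stimulating factor (G-CSF); macrophage colony stimulating factor (M-CSF); stem cell factor; and erythropoietin. Viral antigens, such as envelope proteins, can also be used. Proteins from other invasive organisms can be used as well, such as the 17-amino-acid peptide from the circumsporozoite protein of Plasmodium falciparum known as RII.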
B.激素,维生素等
其它可包括的种类例如是:激素、类固醇、雄激素、雌激素、甲状腺激素或维生素、叶酸。
C.聚亚烷基、多糖等
另外,聚(亚烷基)二醇可以和所需的多核苷酸/多肽组合在一起。在一个较佳的实施方案中,聚(亚烷基)二醇是聚乙二醇。另外,可以加入单糖、二糖或多糖。在此方面的一个较佳实施方案中,多糖是葡聚糖或DEAE-葡聚糖。另外有脱乙酰壳多糖和聚交酯-聚乙醇酸内酯共聚物。
D.脂质和脂质体
所需的多核苷酸/多肽还可在输送给对象或对象衍生的细胞之前包裹在脂质中或包裹在脂质体中。
脂质包裹通常用能稳定结合或捕获并保留核酸的脂质体来实现。浓缩的多核苷酸与脂质制剂之比可以变化,但是通常在约1∶1(毫克DNA∶微摩尔脂质)之间,或脂质更多。关于脂质体作为输送核酸的载体的综述参见Hug和Sleight(1991)Biochim.Biophys.Acta.1097:1-17;Straubinger(1983)Meth.Enzymol.101:512-527。
用于本发明的脂质体制剂包括阳离子(带正电荷)、阴离子(带负电荷)和中性制剂。阳离子脂质体已经显示出能以有功能的形式介导质粒DNA的胞内输送(Felgner(1987)Proc.Natl.Acad.Sci.USA 84:7413-7416);mRNA(Malone(1989)Proc.Natl.Acad.Sci.USA 86:6077-6081);和纯化的转录因子(Debs(1990)J.Biol.Chem.265:10189-10192)。
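Cationic liposomes are readily available. For example, N-[1-(2,3-dioleyloxy)propyl]-N,N,N-trimethylammonium (DOTMA) liposomes are available under the trademark Lipofectin from GIBCO BRL, Grand Island, NY (see also Felgner, supra). Other commercially available liposomes include transfectace (DDAB/DOPE) and DOTAP/DOPE (Boehringer). Other cationic liposomes can be prepared from readily available materials using techniques well known in the art. See, for example, Szoka (1978) PNAS 75:4194-4198 and WO90/11092 for a description of the synthesis of DOTAP (1,2-bis(oleoyloxy)-3-(trimethylammonio)propane) liposomes.
Similarly, anionic and neutral liposomes are readily available, for example from Avanti Polar Lipids (Birmingham, AL), or can easily be prepared from readily available materials. Such materials include phosphatidylcholine, cholesterol, phosphatidylethanolamine, dioleoylphosphatidylcholine (DOPC), dioleoylphosphatidylglycerol (DOPG) and dioleoylphosphatidylethanolamine (DOPE). These materials can also be mixed with the DOTMA and DOTAP starting materials in appropriate ratios. Methods for making liposomes from these materials are well known in the art.
The liposomes can comprise multilamellar vesicles (MLVs), small unilamellar vesicles (SUVs) or large unilamellar vesicles (LUVs). The various liposome-nucleic acid complexes are prepared using methods known in the art. See, for example, Straubinger (1983) Meth. Enzymol. 101:512-527; Szoka (1978) Proc. Natl. Acad. Sci. USA 75:4194-4198; Papahadjopoulos (1975) Biochim. Biophys. Acta 394:483; Wilson (1979) Cell 17:77; Deamer and Bangham (1976) Biochim. Biophys. Acta 443:629; Ostro (1977) Biochem. Biophys. Res. Commun. 76:836; Fraley (1979) Proc. Natl. Acad. Sci. USA 76:3348; Enoch and Strittmatter (1979) Proc. Natl. Acad. Sci. USA 76:145; Fraley (1980) J. Biol. Chem. 255:10431; Szoka and Papahadjopoulos (1978) Proc. Natl. Acad. Sci. USA 75:145; and Schaefer-Ridder (1982) Science 215:166.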
E.脂蛋白
另外,脂蛋白也可加入待输送的多核苷酸/多肽中。采用的脂蛋白的例子包括:乳糜微粒、HDL、IDL、LDL和VLDL。还可采用这些蛋白的突变体、片段或融合物。另外,可采用天然存在的脂蛋白的修饰物,例如乙酰化的LDL。这些脂蛋白能使多核苷酸的输送指向表达脂蛋白受体的细胞。较佳的,如果待输送的多核苷酸中加入了脂蛋白,则组合物中不加入其它寻靶的配体。
天然存在的脂蛋白包含脂质和蛋白部分。蛋白部分称为脱辅基蛋白。目前,已经分离并鉴定出了脱辅基蛋白A、B、C、D和E。其中至少有两个含有几种蛋白,用罗马数字AI、AII、AIV;CI、CII、CIII命名。
脂蛋白可包含多个脱辅基蛋白。例如,天然存在的乳糜微粒包含A、B、C和E,随着时间的推移,这些脂蛋白失去A,得到C和E脱辅基蛋白。VLDL包含A、B、C、和E脱辅基蛋白,LDL包含脱辅基蛋白B;HDL包含脱辅基蛋白A、C和E。
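The amino acid sequences of these apoproteins are known and are described in, for example, Breslow (1985) Annu Rev. Biochem 54:699; Law (1986) Adv. Exp Med. Biol. 151:162; Chen (1986) J Biol Chem 261:12918; Kane (1980) Proc Natl Acad Sci USA 77:2465; and Utermann (1984) Hum Genet 65:232.
Lipoproteins contain a variety of lipids, including triglycerides, cholesterol (free and esterified) and phospholipids. The lipid composition varies among naturally occurring lipoproteins; chylomicrons, for example, consist mainly of triglycerides. A more detailed description of the lipid content of naturally occurring lipoproteins can be found in, for example, Meth. Enzymol. 128 (1986). The lipid composition is chosen so that the conformation of the apoprotein is compatible with receptor-binding activity. It can also be chosen to promote hydrophobic interaction and association with the polynucleotide-binding molecule.
Naturally occurring lipoproteins can be isolated from serum by, for instance, ultracentrifugation. Such methods are described in Meth. Enzymol. (supra); Pitas (1980) J. Biol. Chem. 255:5454-5460; and Mahley (1979) J Clin. Invest 64:743-750. Lipoproteins can also be produced in vitro, or by recombinant methods in which apoprotein genes are expressed in a desired host cell. See, for example, Atkinson (1986) Annu Rev Biophys Chem 15:403 and Radding (1958) Biochim Biophys Acta 30:443. Lipoproteins can also be purchased from commercial suppliers, such as Biomedical Technologies, Inc., Stoughton, Massachusetts, USA. Further description of lipoproteins can be found in Zuckermann et al., PCT/US97/14465.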
F.聚阳离子试剂
聚阳离子试剂可以与或不与脂蛋白一起包括在含所需待输送多核苷酸/多肽的组合物中。
聚阳离子试剂通常在生理性相关的pH下表现出净的正电荷,并能中和核酸的电荷,以有助于输送至所需位置。这些试剂具有体外、活体外和体内的用途。聚阳离子试剂可用来将核酸通过肌内或皮下等输送至活的对象。
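The following are examples of polypeptides useful as polycationic agents: polylysine, polyarginine, polyornithine and protamine. Other examples include histones, protamines, human serum albumin, DNA-binding proteins, non-histone chromosomal proteins, and coat proteins from DNA viruses such as φX174. Transcription factors also contain domains that bind DNA and can therefore be used as nucleic acid condensing agents. Briefly, transcription factors such as C/CEBP, c-jun, c-fos, AP-1, AP-2, AP-3, CPF, Prot-1, Sp-1, Oct-1, Oct-2, CREP and TFIID contain basic domains that bind DNA sequences.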
聚阳离子有机试剂包括:精胺、亚精胺和腐胺。
从上面的清单可以推出聚阳离子试剂的尺寸和生理性能,以构建其它多肽聚阳离子试剂或产生合成的聚阳离子试剂。
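Synthetic polycationic agents that can be used include, for example, DEAE-dextran and polybrene. Lipofectin™ and lipofectAMINE™ are monomers that form polycationic complexes when combined with polynucleotides/polypeptides.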
免疫诊断试验
本发明的奈瑟球菌抗原可用于免疫试验来检测抗体水平(或相反,可用抗奈瑟球菌抗体来检测抗原水平)。根据明确的免疫试验,可以开发出重组抗原,以代替侵入性诊断性方法。针对生物学样品(例如包括血液或血清样品)中的奈瑟球菌蛋白的抗体可以被检测出来。免疫试验的设计可作很大变化,其各种方案均是本领域中已知的。免疫试验的方案可采取例如竞争性、或直接反应或夹心型试验。方案例如还可采用固体支持物,或可以采用免疫沉淀法。大多数试验涉及采用有标记的抗体或多肽;该标记例如可以是荧光标记、化学发光标记、放射活性标记或染料分子。扩增探针信号的试验也是已知的;其例子是采用生物素和亲和素的试验,酶标记的和介导的免疫试验,如ELISA试验。
将合适的材料(包括本发明的组合物)以及进行试验所需的其它试剂和材料(例如合适的缓冲液、盐溶液等)和合适的试验说明包装到合适的容器中,构成适用于免疫诊断且含有适当标记的试剂的试剂盒。
核酸杂交
“杂交”指两个核酸序列相互之间通过氢键而结合。通常,一个序列被固定到固体载体中,另一个将游离于溶液内。然后,在有利于形成氢键的条件下使两个序列相互接触。影响这种结合的因素包括:溶剂的类型和体积;反应温度;杂交时间;搅拌程度;封闭液相序列与固体载体非特异性连接的试剂(Denhardt′s试剂或BLOTTO);序列的浓度;是否用化合物来增加序列结合的速度(硫酸葡聚糖或聚乙二醇);以及杂交后洗涤条件的严谨程度。见Sambrook等人[同上]第2卷,第9章,9.47至9.57页。
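“Stringency” refers to conditions in a hybridization reaction that favor association of very similar sequences over sequences that differ. For example, the combination of temperature and salt concentration should be chosen so that the temperature is approximately 12 to 20°C below the calculated Tm of the hybrid under study. The temperature and salt conditions can often be determined empirically in preliminary experiments in which samples of genomic DNA immobilized on filters are hybridized to the sequence of interest and then washed under conditions of different stringency. See Sambrook et al., page 9.50.
Variables to consider when performing, for example, a Southern blot are (1) the complexity of the DNA being blotted and (2) the homology between the probe and the sequences being detected. The total amount of the fragment(s) to be studied can vary by a factor of 10, from 0.1 to 1 µg for a plasmid or phage digest to 10^-9 to 10^-8 g for a single-copy gene in a highly complex eukaryotic genome. For lower-complexity polynucleotides, substantially shorter blotting, hybridization and exposure times, a smaller amount of starting polynucleotide and a probe of lower specific activity can be used. For example, a single-copy yeast gene can be detected with an exposure time of only 1 hour starting with 1 µg of yeast DNA, blotting for 2 hours and hybridizing for 4-8 hours with a probe of 10^8 cpm/µg. For a single-copy mammalian gene, a conservative approach would start with 10 µg of DNA, blot overnight and hybridize overnight in the presence of 10% dextran sulfate using a probe of greater than 10^8 cpm/µg, resulting in an exposure time of about 24 hours.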
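Several factors can affect the melting temperature (Tm) of a DNA-DNA hybrid between the probe and the fragment of interest, and consequently the appropriate conditions for hybridization and washing. In many cases the probe is not 100% homologous to the fragment. Other commonly encountered variables include the length and total G+C content of the hybridizing sequences and the ionic strength and formamide content of the hybridization buffer. The effects of all of these factors can be approximated by a single equation:
Tm = 81 + 16.6(log10 Ci) + 0.4[%(G+C)] - 0.6(%formamide) - 600/n - 1.5(%mismatch)
where Ci is the salt concentration (monovalent ions) and n is the length of the hybrid in base pairs (slightly modified from Meinkoth and Wahl (1984) Anal. Biochem. 138:267-284).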
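The approximation above is straightforward to evaluate numerically. The following Python sketch simply transcribes the equation; the function name `hybrid_tm` and the example inputs (0.3 M monovalent salt, 50% G+C, 50% formamide, a 500 bp hybrid, no mismatch) are illustrative assumptions and are not values taken from this specification.

```python
import math

def hybrid_tm(salt_molar, gc_percent, formamide_percent, length_bp, mismatch_percent=0.0):
    """Approximate Tm (deg C) of a DNA-DNA hybrid, following the equation in the text.

    salt_molar        -- monovalent ion concentration Ci, in mol/l
    gc_percent        -- %(G+C) of the hybridizing sequence
    formamide_percent -- % formamide in the hybridization buffer
    length_bp         -- hybrid length n, in base pairs
    mismatch_percent  -- % mismatch between probe and target
    """
    if salt_molar <= 0 or length_bp <= 0:
        raise ValueError("salt concentration and hybrid length must be positive")
    return (81.0
            + 16.6 * math.log10(salt_molar)
            + 0.4 * gc_percent
            - 0.6 * formamide_percent
            - 600.0 / length_bp
            - 1.5 * mismatch_percent)

# Hypothetical conditions chosen only to exercise the formula.
print(round(hybrid_tm(0.3, 50.0, 50.0, 500), 1))
```

Each percent of mismatch lowers the estimate by 1.5°C, which is why probes of lower homology call for lower hybridization temperatures.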
在设计杂交实验时,影响核酸杂交的一些因素可以方便地予以改变。杂交和洗涤时的温度以及洗涤时的盐浓度的调节最为简单。随着杂交温度(即严谨度)的升高,不同源的链之间发生杂交的可能性变得更少,结果背景值降低。如果放射性标记的探针并非与固定的片段完全同源(这在基因家族和种间杂交实验中是常见的),则必须降低杂交温度,而背景值将会增加。洗涤温度以类似的方式影响杂交带的强度和背景值的程度。洗涤的严谨性也随盐浓度的降低而升高。
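In general, convenient hybridization temperatures in the presence of 50% formamide are 42°C for a probe that is 95% to 100% homologous to the target fragment, 37°C for 90% to 95% homology, and 32°C for 85% to 90% homology. For lower homologies, the formamide content should be lowered and the temperature adjusted accordingly, using the equation above. If the homology between the probe and the target fragment is not known, the simplest approach is to start with hybridization and wash conditions that are both nonstringent. If nonspecific bands or high background are observed after autoradiography, the filter can be washed at high stringency and re-exposed. If the time required for exposure makes this approach impractical, several hybridization and/or washing stringencies should be tested in parallel.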
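The temperature guidance in this paragraph amounts to a small lookup. The sketch below encodes it in Python; the function name is hypothetical, and assigning the shared boundary values (95% and 90%) to the higher-temperature bracket is an assumption, since the text lets the ranges touch.

```python
def hybridization_temp_50pct_formamide(homology_percent):
    """Convenient hybridization temperature (deg C) in 50% formamide, per the text.

    Returns None below 85% homology, where the text instead calls for lowering
    the formamide content and adjusting the temperature via the Tm equation.
    """
    if not 0 <= homology_percent <= 100:
        raise ValueError("homology must be between 0 and 100 percent")
    if homology_percent >= 95:
        return 42
    if homology_percent >= 90:
        return 37
    if homology_percent >= 85:
        return 32
    return None

for homology in (98, 92, 86, 70):
    print(homology, hybridization_temp_50pct_formamide(homology))
```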
核酸探针试验
采用本发明的核酸探针的方法(如PCR、分支DNA探针试验或印迹技术)能确定cDNA或mRNA的存在。如果探针和本发明的序列能形成稳定地足以被检测到的双链体或双链复合物,则称探针与本发明的序列“杂交”。
核酸探针将与本发明的奈瑟球菌核苷酸序列(包括有义和反义链)杂交。尽管有许多不同的核苷酸序列编码该氨基酸序列,但是天然的奈瑟球菌序列是较佳的,因为它是实际存在于细胞中的序列。mRNA代表一种编码序列,因此探针应与该编码序列互补;单链cDNA与mRNA互补,因此cDNA探针应与非编码序列互补。
探针序列无需和奈瑟球菌序列(或其互补体)相同,序列以及长度的一些差异能增加试验的灵敏度,如果核酸探针能和靶核苷酸形成能被检测的双链体的话。另外,核酸探针可包括其它核苷酸,以使形成的双链体稳定。其它奈瑟球菌序列也是有帮助的,可作为检测形成的双链体的标记。例如,非互补的核苷酸序列可以和探针的5′端相连,探针序列的其余部分与奈瑟球菌序列互补。或者,非互补的碱基或较长的序列能散布到探针中,只要探针序列与奈瑟球菌序列有足够的互补性以便与其杂交从而形成能被检测的双链体。
探针的确切长度和序列将取决于杂交条件,如温度,盐浓度等。例如,对于诊断应用,根据分析物序列的复杂程度,核酸探针通常含有至少10-20个核苷酸,较佳的有15-25个,更佳的有至少30个核苷酸,但是也可短于该长度。短的引物通常需要温度较低,以便和模板形成足够稳定的杂交复合物。
探针可用合成方法产生,例如Matteucci等人[J.Am.Chem.Soc.(1981)103:3185]的方法或Urdea等人[Proc.Natl.Acad.Sci.USA(1983)80:7461]的方法,或用市售的自动寡核苷酸合成仪合成。
可以根据偏好选择探针的化学特征。对于某些应用,DNA或RNA是合适的。对于其它的应用,可以加入修饰,例如骨架修饰,如硫代磷酸酯或甲基磷酸酯,可用来增加体内半衰期,改变RNA亲和力,增加核酸酶抗性等[例如参见Agrawal和Iyer(1995)Curr Opin Biotechnol 6:12-19;Agrawal(1996)TIBTECH 14:376-387];还可采用类似物如肽核酸[例如参见Corey(1997)TIBTECH 15:224-229;Buchardt等人(1993)TIBTECH 11:384-386]。
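Alternatively, the polymerase chain reaction (PCR) is another well-known means for detecting small amounts of target nucleic acid. The assay is described in Mullis et al. [Meth. Enzymol. (1987) 155:335-350] and in US patents 4,683,195 and 4,683,202. Two “primer” nucleotides hybridize with the target nucleic acids and are used to prime the reaction. The primers can comprise sequence that does not hybridize to the sequence of the amplification target (or its complement) to aid duplex stability or, for example, to incorporate a convenient restriction site. Typically, such sequence will flank the desired Neisserial sequence.
A thermostable polymerase creates copies of target nucleic acids from the primers, using the original target nucleic acids as a template. After a threshold amount of target nucleic acids has been generated by the polymerase, they can be detected by more traditional methods, such as Southern blots. When the Southern blot method is used, the labelled probe will hybridize to the Neisserial sequence (or its complement).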
另外,mRNA或cDNA也可用Sambrook等人[同上]中描述的传统印迹技术来检测。用凝胶电泳可纯化并分离利用聚合酶从mRNA产生的mRNA或cDNA。然后,将凝胶上的核酸印迹到固体载体如硝酸纤维素上。使固体载体与标记的探针接触,然后洗涤除去所有未杂交的探针。然后,检测含有标记探针的双链体。该探针通常用放射活性物质作标记。
附图简述
图1-20显示了实施例中,和ORF 37、5、2、15、22、28、32、4、61、76、89、97、106、138、23、25、27、79、85以及132的序列分析所得的生化数据。M1和M2是分子量标记。箭头表示主要重组产物的位置,或在Western印迹中,主要的脑膜炎奈瑟球菌免疫反应性条带的位置。TP表示脑膜炎奈瑟球菌总蛋白抽提物;OMV表示脑膜炎奈瑟球菌外膜泡囊制备物。在杀菌试验的结果中:菱形(◆)表示免疫前的数据;三角(▲)表示GST对照数据;圆圈(●)表示脑膜炎奈瑟球菌重组蛋白的数据。计算机分析显示了亲水性曲线(上方)、抗原性指数曲线(中间)以及AMPHI分析(下方)。用AMPHI程序预测T-细胞表位[Gao等人(1989)J.Immunol.143:3007;Roberts等人(1996)AIDS Res Hum Retrovir 12:593;Quakyi等人(1992)Scand J Immunol增版11:9],该程序可从DNASTAR,Inc(1228 South Park Street,Madison,Wisconsin 53715 USA)的Protean软件包中获得。
实施例
下列实施例描述已经在脑膜炎奈瑟球菌和淋病奈瑟球菌中鉴定的核酸序列及其推定的翻译产物。并非所有的核酸序列都是完整的,即它们编码的不是全长野生型蛋白。
实施例总体上采用下列形式:
●脑膜炎奈瑟球菌(B株)中已经鉴定的核苷酸序列
●该序列推定的翻译产物
●根据数据库比较用计算机分析翻译产物
●脑膜炎奈瑟球菌(A株)以及淋病奈瑟球菌中鉴定的对应的基因和蛋白序列
●可能具有适当抗原性的蛋白的特性描述
●生物化学分析(表达、纯化、ELISA、FACS等)的结果
实施例通常包括菌株和菌株之间的序列相同性细节情况。序列相似的蛋白其结构和功能通常是相似的,序列相同性通常表示有共同的进化起源。广泛采用功能已知的蛋白序列之间的比较,作为赋予其新序列推定蛋白功能的指南,在全基因组分析中证明这是特别有用的。
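Sequence comparisons were performed at NCBI (http://www.ncbi.nlm.nih.gov) using the algorithms BLAST, BLAST2, BLASTn, BLASTp, tBLASTn, BLASTx and tBLASTx [e.g. see Altschul et al. (1997) "Gapped BLAST and PSI-BLAST: a new generation of protein database search programs", Nucleic Acids Research 25:3389-3402]. Searches were performed against the following databases: non-redundant GenBank+EMBL+DDBJ+PDB sequences and non-redundant GenBank CDS translations+PDB+SwissProt+SPupdate+PIR sequences.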
为了比较脑膜炎球菌和淋球菌序列,用tBLASTx算法,在http://www.genome.ou.edu/gono_blast.html中执行。还用FASTA算法来比较ORF(购自GCG Wisconsin Package,9.0版)。
核苷酸序列中的点(例如SEQ ID 11中的495位)代表为了维持读码框而任意导入的核苷酸。同样,除去带双划线的核苷酸。小写字母(如SEQ ID 11的496位)代表在独立测序反应的序列对比时出现了多义性(实施例中的一些核苷酸序列是通过合并两个或多个实验的结果而获得的)。
用根据Esposti等人[″膜蛋白亲水性的关键评价″(1990)Eur J Biochem190:207-219]的统计研究的算法,扫描所有6个读码框中的核苷酸序列,以预测疏水性区域的存在。这些结构域代表潜在的跨膜区域或疏水性前导序列。
用ORFFINDER程序(NCBI)从片段化的核苷酸序列预测开放读框。
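To make the idea of scanning all six reading frames for open reading frames concrete, here is a minimal Python sketch. It is not the ORFFINDER program or its algorithm; the function names, the `min_codons` cut-off and the short test sequence (the first codons of SEQ ID 3 with a stop codon appended) are assumptions made for illustration. The standard genetic code is used.

```python
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code, built in TCAG order; '*' marks a stop codon.
CODON_TABLE = {a + b + c: AMINO[16 * i + 4 * j + k]
               for i, a in enumerate(BASES)
               for j, b in enumerate(BASES)
               for k, c in enumerate(BASES)}
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def translate(dna):
    """Translate complete codons; ambiguous codons (e.g. containing N) become 'X'."""
    dna = dna.upper()
    return "".join(CODON_TABLE.get(dna[i:i + 3], "X")
                   for i in range(0, len(dna) - 2, 3))

def six_frames(dna):
    """Return {(strand, offset): protein} for the three forward and three reverse frames."""
    dna = dna.upper()
    reverse = dna.translate(COMPLEMENT)[::-1]
    return {(strand, offset): translate(seq[offset:])
            for strand, seq in (("+", dna), ("-", reverse))
            for offset in range(3)}

def longest_orf(dna, min_codons=10):
    """Longest Met-to-stop stretch over all six frames, or None if none is long enough."""
    best = None
    for frame, protein in six_frames(dna).items():
        for segment in protein.split("*")[:-1]:  # the last piece is not stop-terminated
            start = segment.find("M")
            if start == -1:
                continue
            candidate = segment[start:]
            if len(candidate) >= min_codons and (best is None or len(candidate) > len(best[1])):
                best = (frame, candidate)
    return best

print(longest_orf("ATGAAACAGACAGTCAAATGGCTTGCCGCCGCCCTGATTTGA"))
```

Partial sequences such as SEQ ID 1 have no stop codon in the frame of interest, so a stop-terminated search like this one would need to be relaxed for fragments.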
有下划线的氨基酸序列代表用PSORT算法(http://www.psort.nibb.ac.jp)估测出的ORF中可能的跨膜区域或前导序列。还用MOTIFS程序(GCG Wisconsin和PROSITE)预测了功能性结构域。
可用各种试验来评价实施例中鉴定的蛋白的体内免疫原性。例如,可以重组表达蛋白,并用于免疫印迹筛选患者血清。蛋白和患者血清之间发生阳性反应表明该患者以前已经建立了对该所述蛋白的免疫应答,即该蛋白是免疫原。该方法还可用来鉴定免疫优势蛋白。
重组蛋白还可方便地用来例如在小鼠中制备抗体。这些抗体可用来直接确认蛋白位于细胞表面。将标记的抗体(例如对于FACS为荧光标记)与完整的细菌培育,细菌表面出现标记确认了该蛋白的位置。
具体地说,采用下列方法(A)至(S),来表达、纯化和分析本发明蛋白的生物化学特性:
A)染色体DNA制备
使脑膜炎奈瑟球菌2996菌株在100毫升GC培养基中生长至指数期,离心收获,重悬于5毫升缓冲液(20%蔗糖、50毫摩尔Tris-HCl、50毫摩尔EDTA、pH 8)中。冰上培育10分钟后,加入10毫升裂解溶液(50毫摩尔NaCl,1%Na-十二烷基肌氨酸钠,50微克/毫升蛋白酶K)裂解该细菌,37℃培育悬液2小时。用苯酚抽提两次(平衡至pH 8),用三氯甲烷/异戊醇(24∶1)抽提一次。加入0.3M乙酸钠和2体积乙醇,使DNA沉淀,离心收集。用70%乙醇洗涤沉淀一次,重新溶解在4毫升缓冲液(10毫摩尔Tris-HCl,1毫摩尔EDTA,pH 8)中。读取260纳米下OD值,测定DNA浓度。
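The final step, estimating DNA concentration from the OD260 reading, can be sketched as follows. The conversion factor of about 50 µg/ml per OD260 unit for double-stranded DNA (1 cm path length) is the usual laboratory convention and is an assumption here, as the text does not state which factor was used; the function name and example reading are likewise hypothetical.

```python
def dsdna_conc_ug_per_ml(od260, dilution_factor=1.0, ug_per_ml_per_od=50.0):
    """Estimate double-stranded DNA concentration from an absorbance reading.

    od260            -- absorbance at 260 nm of the diluted sample (1 cm path assumed)
    dilution_factor  -- how many fold the sample was diluted before reading
    ug_per_ml_per_od -- conventional factor for dsDNA (assumed, not from the text)
    """
    if od260 < 0 or dilution_factor <= 0:
        raise ValueError("absorbance must be non-negative and dilution positive")
    return od260 * dilution_factor * ug_per_ml_per_od

# Hypothetical reading: OD260 of 0.5 on a 10-fold dilution of the 4 ml preparation.
conc = dsdna_conc_ug_per_ml(0.5, dilution_factor=10)
print(conc, "ug/ml;", conc * 4, "ug total in 4 ml")
```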
B)寡核苷酸设计
用(a)脑膜炎球菌B的序列(当能获得时),或(b)淋球菌/脑膜炎球菌A序列(按需适应于脑膜炎球菌密码子偏好利用率),根据各ORF的编码序列,设计合成的寡核苷酸引物。推导紧靠预计的前导序列下游5′端扩增引物序列,忽略任何预计的信号肽。
对于大多数ORF,5′引物包括两个限制性酶识别位点(BamHI-NdeI,BamHI-NheI或EcoRI-NheI,这取决于基因自身的限制性方式);3′引物包括一个XhoI限制性位点。建立该步骤是为了指导各扩增产物(对应于各ORF)克隆到以下两个不同的表达系统中:pGEX-KG(用BamHI-XhoI或EcoRI-XhoI),以及pET21b+(用NdeI-XhoI或NheI-XhoI)。
5'-end primer tails: CGCGGATCCCATATG (BamHI-NdeI)
CGCGGATCCGCTAGC (BamHI-NheI)
CCGGAATTCTAGCTAGC (EcoRI-NheI)
3'-end primer tail: CCCGCTCGAG (XhoI)
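A quick way to check that each primer tail carries the intended recognition sites is to search the tail for them. The sketch below does this in Python for the four tails listed above (written as continuous strings); the recognition sequences are the standard ones for these enzymes, and the helper name `find_sites` is hypothetical.

```python
# Standard recognition sequences for the enzymes named in the text.
SITES = {
    "BamHI": "GGATCC",
    "NdeI": "CATATG",
    "NheI": "GCTAGC",
    "EcoRI": "GAATTC",
    "XhoI": "CTCGAG",
}

def find_sites(seq):
    """Return a sorted list of (0-based position, enzyme) for every site found in seq."""
    seq = seq.upper()
    hits = []
    for enzyme, site in SITES.items():
        start = seq.find(site)
        while start != -1:
            hits.append((start, enzyme))
            start = seq.find(site, start + 1)
    return sorted(hits)

for tail in ("CGCGGATCCCATATG", "CGCGGATCCGCTAGC", "CCGGAATTCTAGCTAGC", "CCCGCTCGAG"):
    print(tail, find_sites(tail))
```

The few bases in front of each site (for example CGC before GGATCC) are extra 5' nucleotides, a common design choice because many restriction enzymes cut poorly at the very end of a DNA fragment.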
对于ORF5、15、17、19、20、22、27、28、65和69,进行两个不同的扩增,将各ORF克隆到两个表达系统中。对于各ORF采用两个不同的5′引物;如前采用了同一3′XhoI引物:
5'-end primer tail: GGAATTCCATATGGCCATGG (NdeI)
5'-end primer tail: CGGGATCC (BamHI)
ORF76被克隆到pTRC表达载体中,并表达成氨基端His-tag融合。在该具体情况中,预计的信号肽包括在最终产物中。用下列引物掺入NheI-BamHI限制性位点:
5'-end primer tail: GATCAGCTAGCCATATG (NheI)
3'-end primer tail: CGGGATCC (BamHI)
引物不仅含有限制性酶识别序列,而且还包括与待扩增序列杂交的核苷酸。杂交核苷酸的数目取决于整个引物的解链温度,对于各引物可用下式测定:
Tm=4(G+C)+2(A+T) (排除尾部)
Tm=64.9+0.41(%GC)-600/N (整个引物)
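The two melting-temperature formulas above can be applied directly to a primer sequence. The Python sketch below implements both as written; the function names are hypothetical, and the sequences used are only examples (the BamHI site alone and the BamHI-NdeI tail from the text), not complete primers from Table 1.

```python
def tm_hybridizing_region(seq):
    """Tm = 4(G+C) + 2(A+T), applied to the hybridizing region only (tail excluded)."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    return 4 * gc + 2 * at

def tm_whole_primer(seq):
    """Tm = 64.9 + 0.41(%GC) - 600/N, applied to the whole primer of length N."""
    seq = seq.upper()
    n = len(seq)
    if n == 0:
        raise ValueError("primer sequence must not be empty")
    gc_percent = 100.0 * (seq.count("G") + seq.count("C")) / n
    return 64.9 + 0.41 * gc_percent - 600.0 / n

print(tm_hybridizing_region("GGATCC"))              # 4 G/C and 2 A/T bases
print(round(tm_whole_primer("CGCGGATCCCATATG"), 1))  # 15-mer with 60% GC
```

Keeping the two calculations separate mirrors the two-step cycling used later: the first cycles anneal at a temperature set by the hybridizing region, the remaining cycles at one set by the full-length oligonucleotide.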
对于整个寡核苷酸来说,所选寡核苷酸的平均解链温度为65-70℃,对于单单杂交区来说,平均解链温度为50-55℃。
表1(511页)显示了用于各种扩增的正向和反向引物。在某些情况下,应注意引物的序列没有与ORF中的序列完全匹配。在进行最初的扩增时,一些脑膜炎球菌ORF的完整的5′和/或3′序列并不是已知的,但是已经鉴定了其在淋球菌中的对应序列。为了进行扩增,可用淋球菌序列作为引物设计的根据,考虑密码偏好作了改变。具体地说,改变下列密码子:ATA→ATT;TCG→TCT;CAG→CAA;AAG→AAA;GAG→GAA;CGA→CGC;CGG→CGC;GGG→GGC。表1中的斜体核苷酸表明这种变化。应理解,一旦鉴定出了完整的序列,就不再需要该方法了。
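The codon changes listed above are all synonymous substitutions and can be applied mechanically to an in-frame coding sequence. The following Python sketch shows one way to do so; the function name and the short input sequence are hypothetical, and the sketch assumes the sequence starts in frame.

```python
# Codon replacements listed in the text (gonococcal codon -> preferred codon).
CODON_CHANGES = {
    "ATA": "ATT", "TCG": "TCT", "CAG": "CAA", "AAG": "AAA",
    "GAG": "GAA", "CGA": "CGC", "CGG": "CGC", "GGG": "GGC",
}

def adapt_codons(coding_seq):
    """Apply the listed replacements codon by codon; other codons are left unchanged."""
    seq = coding_seq.upper()
    codons = [seq[i:i + 3] for i in range(0, len(seq), 3)]
    return "".join(CODON_CHANGES.get(codon, codon) for codon in codons)

# Hypothetical in-frame fragment: ATG ATA CAG GGG TAA
print(adapt_codons("ATGATACAGGGGTAA"))
```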
用Perkin Elmer 394 DNA/RNA合成仪合成寡核苷酸,用2毫升氢氧化铵从柱上洗脱下,56℃培育5小时去保护。加入0.3M乙酸钠和2体积乙醇,使寡核苷酸沉淀。然后离心样品,将沉淀重悬于100微升或1毫升水中。用Perkin ElmerλBio分光光度计测定OD260,测得浓度,调节至2-10pmol/微升。
C)扩增
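The standard PCR protocol was as follows: 50-200 ng of genomic DNA was used as template in the presence of 20-40 µM of each oligonucleotide, 400-800 µM dNTP solution, 1x PCR buffer (including 1.5 mM MgCl2) and 2.5 units of TaqI DNA polymerase (using Perkin-Elmer AmpliTaQ, GIBCO Platinum, Pwo DNA polymerase or Takara Shuzo Taq polymerase).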
在一些例子中,通过加入10微升DMSO或50微升2M甜菜碱来优化PCR。
在加热开始后(在最初的95℃培育整个混合物3分钟期间加入聚合酶),每个样品经历两个步骤的扩增:开头5轮的进行用排除限制性酶尾部的寡核苷酸的解链温度作为杂交温度,随后的30轮根据全长寡核苷酸的杂交温度来进行。这些轮后是最后在72℃下延伸10分钟。
标准循环如下:
| | Denaturation | Hybridization | Elongation |
| First 5 cycles | 30 s at 95°C | 30 s at 50-55°C | 30-60 s at 72°C |
| Last 30 cycles | 30 s at 95°C | 30 s at 65-70°C | 30-60 s at 72°C |
延伸时间随待扩增ORF的长度不同而不同。
扩增用9600或2400 Perkin Elmer GeneAmp PCR系统进行。为了检查结果,将1/10的扩增体积装载到1-1.5%琼脂糖凝胶上,将各扩增片段的大小与DNA分子标记作比较。
将扩增的DNA直接上样到1%琼脂糖凝胶上,或是先用乙醇沉淀,然后重悬于合适的体积中,上样到1%琼脂糖凝胶上。然后用Qiagen凝胶抽提试剂盒按照生产商说明从凝胶中洗脱并纯化获得对应于大小正确条带的DNA片段。该DNA片段的最终体积为30微升或50微升的水,或10毫摩尔Tris,pH 8.5。
D)PCR片段的消化
将对应于扩增片段的纯化的DNA分成2等份,用以下物质进行双重消化:
-NdeI/XhoI或NheI/XhoI,用于克隆到pET-21b+中,该蛋白进一步表达成C-端His-尾融合物
-BamHI/XhoI或EcoRI/XhoI,用于克隆到pGEX-KG中,该蛋白进一步表达成N-端GST融合物
-对于ORF76,NheI/BamHI,用于克隆到pTRC-HisA载体中,该蛋白进一步表达成N-端His-尾融合物
-EcoRI/PstI,EcoRI/SalI,SalI/PstI,用于克隆到pGex-His中,该蛋白进一步表达成N-端His-尾融合物
在合适的缓冲液存在下,使各纯化的DNA片段与20单位的各种限制性酶(NewEngland Biolabs)在30或40微升的最终体积中培育(37℃培育3小时至过夜)。然后用QIAquick PCR纯化试剂盒按照生产商说明书纯化消化产物,洗脱到最终体积为30微升或50微升的水中或10毫摩尔Tris-HCl,pH 8.5中。在滴定的分子量标记存在下,通过1%琼脂糖凝胶电泳测定最终的DNA浓度。
E)克隆载体(pET22B,pGEX-KG,pTRC-His A和pGex-His)的消化
在合适的缓冲液存在下,使200微升反应体积中的限制性酶各50单位与10微克质粒37℃培育过夜,对10微克质粒进行双消化。在将全部消化物上样到1%琼脂糖凝胶上后,用Qiagen QIAquick凝胶抽提试剂盒从凝胶中纯化对应于消化载体的条带,将DNA洗脱到50微升10毫摩尔Tris-HCl,pH 8.5中。测定样品的OD260,评价其DNA浓度,并调节至50微克/微升。每个克隆步骤采用1微升质粒。
pGEX-His载体是经修饰的pGEX-2T载体,其在凝血酶断裂位点上游携带有一个编码6个组氨酸残基的区域,而且还含有载体pTRC99(Pharmacia)的多个克隆位点。
F)克隆
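The fragments corresponding to each ORF, previously digested and purified, were ligated into both pET22b and pGEX-KG. In a final volume of 20 µl, a molar ratio of 3:1 fragment/vector was ligated using 0.5 µl of NEB T4 DNA ligase (400 units/µl), in the presence of the buffer supplied by the manufacturer. The reaction was incubated at room temperature for 3 hours. In some experiments, ligation was performed using the Boehringer "Rapid Ligation Kit", following the manufacturer's instructions.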
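A 3:1 molar ratio of fragment to vector translates into different masses depending on the sizes of the two molecules. The Python sketch below shows the usual conversion; the function name and the example sizes and amounts (50 ng of a 5000 bp vector, a 500 bp insert) are assumptions chosen for illustration, since the text does not give them.

```python
def insert_mass_ng(vector_ng, vector_bp, insert_bp, molar_ratio=3.0):
    """Mass of insert (ng) giving the requested insert:vector molar ratio.

    For double-stranded DNA, moles are proportional to mass divided by length,
    so the insert mass scales with insert_bp / vector_bp.
    """
    if vector_bp <= 0 or insert_bp <= 0:
        raise ValueError("fragment lengths must be positive")
    return vector_ng * (insert_bp / vector_bp) * molar_ratio

# Hypothetical example values, not taken from the text.
print(round(insert_mass_ng(50, 5000, 500), 2))
```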
为了将重组质粒导入合适的菌株内,使100微升大肠杆菌DH5感受态细胞与连接酶反应溶液于冰上培育40分钟,然后37℃3分钟,然后在加入800微升LB肉汤后,再37℃培育20分钟。然后在Eppendorf微量离心机中以最大速度离心细胞,重悬于约200微升上清液中。然后将悬液接种到LB氨苄青霉素(100毫克/毫升)平板上。
使5个随机选择的菌落在2毫升(pGEX或pTC克隆)或5毫升(pET克隆)LB肉汤+100微克/毫升氨苄青霉素中37℃生长过夜,对重组克隆进行筛选。然后,使细胞沉淀,用Qiagen QIAprep旋转微量制备试剂盒,按照生产商说明书,将DNA抽提到最终体积为30微升。用NdeI/XhoI或BamHI/XhoI消化5微升各个微量制备物(约1微克),将整个消化物上样到1-1.5%琼脂糖凝胶上(取决于预计的插入物大小),与分子量标记(1Kb DNA梯序列,GIBCO)平行。根据正确的插入物大小筛选阳性克隆。
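For the cloning of ORFs 110, 111, 113, 115, 119, 122, 125 and 130, the double-digested PCR product was ligated into the double-digested vector using the EcoRI-PstI cloning sites or, for ORFs 115 and 127, the EcoRI-SalI sites or, for ORF 122, the SalI-PstI sites. After cloning, the recombinant plasmids were introduced into the E. coli host W3110. Individual clones were grown overnight at 37°C in L-broth with 50 µg/ml ampicillin.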
G)表达
将克隆到表达载体中的每个ORF转化入适合表达重组蛋白产物的菌株中。用1微升各构建物转化上述30微升大肠杆菌BL21(pGEX载体)、大肠杆菌TOP10(pTRC载体)或大肠杆菌BL21-DE3(pET载体)。在pGEX-His载体例子中,用相同的大肠杆菌菌株(W3110)进行最初的克隆和表达。将单个重组菌落接种到2毫升LB+Amp(100微克/毫升)中,37℃培育过夜,然后1∶30稀释在100毫升瓶中的20毫升LB+Amp(100微克/毫升)中,确保OD600在0.1至0.15之间。将瓶培育在30℃的回转水浴摇床中,直至OD表明达到适合诱导表达的指数生长(pET和pTRC载体的OD为0.4-0.8;pGEX和pGEX-His载体的OD为0.8-1)。对于pET,pTRC和pGEX-His载体,加入1毫摩尔IPTG,诱导蛋白质表达,而在pGEX系统情况下,IPTG的最终浓度为0.2毫摩尔。30℃培育3小时后,测OD检查样品的最终浓度。为了检查表达,取出各样品1毫升,在微量离心机中离心,将沉淀重悬于PBS中,用12%SDS-PAGE和考马斯蓝染色分析。6000g离心整个样品,将沉淀重悬于PBS中待用。
H)GST-融合蛋白大规模纯化
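A single colony was grown overnight at 37°C on an LB+Amp agar plate. The bacteria were inoculated into 20 ml of LB+Amp liquid culture in a water bath shaker and grown overnight. Bacteria were diluted 1:30 into 600 ml of fresh medium and allowed to grow at the optimal temperature (20-37°C) to OD550 0.8-1. Protein expression was induced with 0.2 mM IPTG, followed by three hours of incubation. The culture was centrifuged at 8000 rpm at 4°C. The supernatant was discarded and the bacterial pellet was resuspended in 7.5 ml of cold PBS. The cells were disrupted by sonication on ice for 30 seconds at 40 W using a Branson sonifier B-15, frozen and thawed twice, and centrifuged again. The supernatant was collected and mixed with 150 µl of Glutathione-Sepharose 4B resin (Pharmacia) (previously washed with PBS) and incubated at room temperature for 30 minutes. The sample was centrifuged at 700 g for 5 minutes at 4°C. The resin was washed twice with 10 ml of cold PBS for 10 minutes, resuspended in 1 ml of cold PBS and loaded onto a disposable column. The resin was washed twice with 2 ml of cold PBS until the flow-through reached an OD280 of 0.02-0.06. The GST-fusion protein was eluted by addition of 700 µl of cold glutathione elution buffer (10 mM reduced glutathione, 50 mM Tris-HCl) and fractions were collected until the OD280 was 0.1. 21 µl of each fraction was loaded on a 12% SDS gel using either the Biorad SDS-PAGE Molecular Weight Standard broad range (M1) (200, 116.25, 97.4, 66.2, 45, 31, 21.5, 14.4, 6.5 kDa) or the Amersham Rainbow Marker (M2) (220, 66, 46, 30, 21.5, 14.3 kDa) as standards. As the MW of GST is 26 kDa, this value must be added to the MW of each GST-fusion protein.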
I)His-融合物溶解度分析(ORF111-129)
为了分析His-融合物表达产物的溶解度,将3毫升培养物沉淀重悬于缓冲液M1[500微升PBS,pH 7.2]中。加入25微升溶菌酶(10毫克/毫升),4℃培育细菌15分钟。用Branson超声仪B-15以40W超声破碎沉淀30秒,冻融两次,然后再次离心分离成沉淀和上清液。收集上清液,将沉淀重悬于缓冲液M2[8M尿素,0.5M氯化钠,20毫摩尔咪唑和0.1M磷酸二氢钠]中,4℃培育3-4小时。离心后,收集上清液,将沉淀重悬于缓冲液M3[6M盐酸胍,0.5M氯化钠,20毫摩尔咪唑和0.1M磷酸二氢钠]中,4℃过夜。用SDS-PAGE分析所有步骤的上清液。
发现ORF113、119和120表达的蛋白溶于PBS,而ORF111、122、116以及129表达的蛋白的溶解需要尿素,ORF125和127的需要盐酸胍。
J)His融合物大规模纯化
使单菌落在LB+Amp琼脂板上37℃培育过夜。将细菌接种到20毫升LB+Amp培养液中,在水浴摇床中培育过夜。将细菌1∶30稀释到600毫升新鲜培养基中,使其在最适温度(20-37℃)下生长至OD550为0.6-0.8。加入1毫摩尔IPTG诱导蛋白质表达,进一步培育该培养物3小时。4℃、8000rpm离心培养物,弃去上清液,将细菌沉淀重悬于7.5毫升(i)冷的缓冲液A(300毫摩尔氯化钠,50毫摩尔磷酸缓冲液,10毫摩尔咪唑,pH 8,针对可溶性蛋白)或(ii)缓冲液B(尿素8M,10毫摩尔Tris-HCl,100毫摩尔磷酸缓冲液,pH 8.8,针对不溶性蛋白)。
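The cells were disrupted by sonication on ice for 30 seconds at 40 W using a Branson sonifier B-15, frozen and thawed twice, and centrifuged again.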
对于不溶性蛋白,-20℃保藏上清液,而将沉淀重悬于2毫升缓冲液C(6M盐酸胍,100毫摩尔磷酸缓冲液,10毫摩尔Tris-HCl,pH 7.5)中,在匀化器中处理10个循环。13000rpm离心产物40分钟。
收集上清液,与150微升Ni2+ -树脂(Pharmacia)(先用合适的缓冲液A或缓冲液B洗涤),室温下轻微搅动培育30分钟。4℃,700g离心样品5分钟。用10毫升缓冲液A或B洗涤树脂二次10分钟,重悬于1毫升缓冲液A或B中,上样于一次性柱中。用2毫升冷的缓冲液A 4℃洗涤树脂,或在室温下用2毫升缓冲液B洗涤树脂,直至流穿液OD280达到0.02-0.06。
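The resin was washed with either (i) 2 ml of cold 20 mM imidazole buffer (300 mM NaCl, 50 mM phosphate buffer, 20 mM imidazole, pH 8) or (ii) buffer D (8 M urea, 10 mM Tris-HCl, 100 mM phosphate buffer, pH 6.3) until the flow-through reached an OD280 of 0.02-0.06. The His-fusion protein was eluted by addition of 700 µl of either (i) cold elution buffer A (300 mM NaCl, 50 mM phosphate buffer, 250 mM imidazole, pH 8) or (ii) elution buffer B (8 M urea, 10 mM Tris-HCl, 100 mM phosphate buffer, pH 4.5), and fractions were collected until the OD280 was 0.1. 21 µl of each fraction was loaded on a 12% SDS gel.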
K)His-融合蛋白复性
在变性的蛋白中加入10%甘油。然后用透析缓冲液I(10%甘油,0.5M精氨酸,50毫摩尔磷酸缓冲液,5毫摩尔还原的谷胱苷肽,0.5毫摩尔氧化的谷胱苷肽,2M尿素,pH 8.8)将蛋白质稀释至20微克/毫升,用相同的缓冲液4℃透析12-14小时。用透析缓冲液II(10%甘油,0.5M精氨酸,50mM磷酸缓冲液,5毫摩尔还原的谷胱苷肽,0.5毫摩尔氧化的谷胱苷肽,pH 8.8)进一步4℃透析蛋白质12-14小时。用下式评价蛋白浓度:
蛋白质(毫克/毫升)=(1.55×OD280)-(0.76×OD260)
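The concentration formula above takes two absorbance readings and needs no further parameters. A minimal Python transcription follows; the function name and the example readings are hypothetical.

```python
def protein_mg_per_ml(od280, od260):
    """Protein concentration (mg/ml) from the formula in the text:
    (1.55 x OD280) - (0.76 x OD260)."""
    return 1.55 * od280 - 0.76 * od260

# Hypothetical absorbance readings for a refolded His-fusion sample.
print(round(protein_mg_per_ml(1.0, 0.5), 2))
```

The OD260 term corrects for nucleic acid carried over in the preparation, which also absorbs at 280 nm.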
L)His-融合物大规模纯化(ORF111-129)
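500 ml of bacterial culture was induced and the fusion proteins were solubilized in buffer M1, M2 or M3 using the procedure described above. The crude bacterial extract was loaded onto a Ni-NTA superflow column (Qiagen) equilibrated with buffer M1, M2 or M3, depending on the solubilization buffer of the fusion protein. Unbound material was eluted by washing the column with the same buffer. The specific protein was eluted with the corresponding buffer containing 500 mM imidazole and dialysed against the corresponding buffer without imidazole. After each run, the column was sanitized by washing with at least two column volumes of 0.5 M sodium hydroxide and re-equilibrated before the next use.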
M)小鼠免疫
用各纯化蛋白20微克腹膜内免疫小鼠。在ORF 2、4、15、22、27、28、37、76、89和97情况下,用氢氧化铝作为佐剂,在第1、21和42天免疫Balb-C小鼠,检测第56天所取样品中的免疫应答。对于ORF 44、106和132,用相同方案免疫CD1小鼠。对于ORF 25和40,用Freund佐剂,而不是氢氧化铝,免疫CD1小鼠,采用相同的免疫方案,只是在第42天而非56天测定免疫应答。同样,对于ORF 23、32、38和79,用Freund佐剂免疫CD1小鼠,但是在第49天测定免疫应答。
N)ELISA试验(血清分析)
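The acapsulated MenB M7 strain was plated on chocolate agar plates and incubated overnight at 37°C. Bacterial colonies were collected from the agar plates using a sterile swab and inoculated into 7 ml of Mueller-Hinton broth (Difco) containing 0.25% glucose. Bacterial growth was monitored every 30 minutes by following OD620. The bacteria were allowed to grow until the OD reached 0.3-0.4. The culture was centrifuged for 10 minutes at 10000 rpm. The supernatant was discarded and the bacteria were washed once with PBS, resuspended in PBS containing 0.025% formaldehyde, and incubated for 2 hours at room temperature and then overnight at 4°C with stirring. 100 µl of bacterial cells was added to each well of a 96-well Greiner plate and incubated overnight at 4°C. The wells were then washed three times with PBT washing buffer (0.1% Tween-20 in PBS). 200 µl of saturation buffer (2.7% polyvinylpyrrolidone 10 in water) was added to each well and the plates were incubated for 2 hours at 37°C. The wells were washed three times with PBT. 200 µl of diluted sera (dilution buffer: 1% BSA, 0.1% Tween-20, 0.1% sodium azide in PBS) was added to each well and the plates were incubated for 90 minutes at 37°C. The wells were washed three times with PBT. 100 µl of HRP-conjugated rabbit anti-mouse (Dako) serum, diluted 1:2000 in dilution buffer, was added to each well and the plates were incubated for 90 minutes at 37°C. The wells were washed three times with PBT buffer. 100 µl of substrate buffer for HRP (25 ml of citrate buffer pH 5, 10 mg of o-phenylenediamine and 10 µl of water) was added to each well and the plates were left at room temperature for 20 minutes. 100 µl of sulfuric acid was added to each well and OD490 was followed. The ELISA was considered positive when OD490 was 2.5 times that of the respective pre-immune serum.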
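The positivity criterion at the end of this protocol is a simple ratio test. In the Python sketch below, the function name and the example readings are hypothetical, and treating "2.5 times" as "at least 2.5 times" is an assumption about how the threshold was applied.

```python
def elisa_positive(od490_immune, od490_preimmune, fold=2.5):
    """True when the immune serum OD490 reaches `fold` times the matching
    pre-immune serum OD490 (threshold of 2.5 taken from the text)."""
    if od490_preimmune <= 0:
        raise ValueError("pre-immune OD490 must be positive")
    return od490_immune >= fold * od490_preimmune

# Hypothetical plate readings.
print(elisa_positive(0.50, 0.10))
print(elisa_positive(0.20, 0.10))
```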
O)FACScan细菌结合试验程序
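The acapsulated MenB M7 strain was plated on chocolate agar plates and incubated overnight at 37°C. Bacterial colonies were collected from the agar plates using a sterile swab and inoculated into 4 tubes, each containing 8 ml of Mueller-Hinton broth (Difco) with 0.25% glucose. Bacterial growth was monitored every 30 minutes by following OD620. The bacteria were allowed to grow until the OD reached 0.35-0.5. The culture was centrifuged for 10 minutes at 4000 rpm. The supernatant was discarded and the pellet was resuspended in blocking buffer (1% BSA, 0.4% sodium azide) and centrifuged for 5 minutes at 4000 rpm. Cells were resuspended in blocking buffer to reach an OD620 of 0.07. 100 µl of bacterial cells was added to each well of a Costar 96-well plate. 100 µl of diluted (1:200) sera (in blocking buffer) was added to each well and the plates were incubated for 2 hours at 4°C. Cells were centrifuged for 5 minutes at 4000 rpm, the supernatant was aspirated and the cells were washed by addition of 200 µl/well of blocking buffer. R-Phycoerythrin-conjugated F(ab)2 goat anti-mouse antibody, diluted 1:100, was added to each well and the plates were incubated for 1 hour at 4°C. Cells were spun down by centrifugation at 4000 rpm for 5 minutes and washed by addition of 200 µl/well of blocking buffer. The supernatant was aspirated and the cells were resuspended in 200 µl/well of PBS with 0.25% formaldehyde. Samples were transferred to FACScan tubes and read. The FACScan settings were: FL1 on, FL2 and FL3 off; FSC-H threshold: 92; FSC PMT voltage: E 02; SSC PMT: 474; Amp. Gains 7.1; FL-2 PMT: 539; compensation values: 0.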
P)OMV制备
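Bacteria were grown overnight on 5 GC plates, harvested with a loop and resuspended in 10 ml of 20 mM Tris-HCl. Heat inactivation was performed at 56°C for 30 minutes and the bacteria were disrupted by sonication for 10 minutes on ice (50% duty cycle, 50% output). Unbroken cells were removed by centrifugation at 5000 g for 10 minutes and the total cell envelope fraction was recovered by centrifugation at 50000 g at 4°C for 75 minutes. To extract cytoplasmic membrane proteins from the crude outer membranes, the whole fraction was resuspended in 2% sarkosyl (Sigma) and incubated at room temperature for 20 minutes. The suspension was centrifuged at 10000 g for 10 minutes to remove aggregates, and the supernatant was further ultracentrifuged at 50000 g for 75 minutes to pellet the outer membranes. The outer membranes were resuspended in 10 mM Tris-HCl, pH 8, and the protein concentration was measured by the Bio-Rad Protein assay, using BSA as a standard.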
Q)全抽提物制备
使细菌在GC板上生长过夜,用挑菌环收获,重悬于1毫升20毫摩尔Tris-HCl中。56℃热灭活30分钟。
R)Western印迹
将MenB菌株2996的纯化蛋白(每条泳道500ng)、外膜泡囊(5微克)和全细胞抽提物(25微克)上样于15%SDS-PAGE中并转移到硝酸纤维素膜上。转移在4℃、150mA、转移缓冲液(0.3%Tris碱,1.44%甘氨酸,20%甲醇)中进行2小时。在饱和缓冲液(10%脱脂乳、0.1%Triton X100,PBS配)中4℃培育过夜,使该膜饱和。用洗涤缓冲液(3%脱脂乳,0.1%Triton X100,PBS配)洗涤该膜两次,并与洗涤缓冲液1∶200稀释的小鼠血清37℃培育2小时。洗涤该膜两次,和稀释度为1∶2000的辣根过氧化物酶标记的抗小鼠Ig培育90分钟。用含0.1%Triton X100的PBS洗涤该膜两次,用Opti-4CN底物试剂盒(Bio-Rad)显影。加入水,终止反应。
S)杀菌试验
使MC58菌株在巧克力琼脂板上37℃生长过夜。收集5-7个菌落,用于接种7毫升Mueller-Hinton肉汤。在章动器上37℃培育该悬浮液,使其生长,至OD620为0.5-0.8。将培养液等分到1.5毫升无菌Eppendorf管中,在微量离心机中以最大速度离心20分钟。以Gey′s缓冲液(Gibco)洗涤沉淀一次,重悬于相同缓冲液中,至OD620为0.5,以Gey′s缓冲液稀释1∶20000,25℃保藏。
在96孔组织培养板的每个孔中加入50微升Gey′s缓冲液/1%BSA。在每个孔中加入25微升稀释的小鼠血清(1∶100稀释在Gey′s缓冲液/0.2%BSA中),4℃培育平板。将25微升前述细菌悬浮液加入每个孔中。每个孔中加入25微升热灭活(56℃水浴30分钟)或正常的幼兔补体。在加入幼兔补体后,立即将每个孔中22微升的样品接种到Mueller-Hinton琼脂板(时间0)。37℃转动培育96孔板1小时,然后将每个孔内22微升的样品接种到Mueller-Hinton琼脂板(时间1)上。过夜培育后,计数对应于时间0和时间1的菌落。
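The text reports colony counts at time 0 and time 1 but does not say how they were summarised. One common way to express such counts, assumed here purely for illustration, is the percentage of colony-forming units surviving after the one-hour incubation. The function name and the example counts in this Python sketch are hypothetical.

```python
def percent_survival(cfu_time0, cfu_time1):
    """Percentage of colonies remaining at time 1 relative to time 0.

    This summary statistic is an assumption; the text only states that
    colonies at both time points were counted.
    """
    if cfu_time0 <= 0:
        raise ValueError("time 0 colony count must be positive")
    return 100.0 * cfu_time1 / cfu_time0

# Hypothetical counts for one serum with active and with heat-inactivated complement.
print(percent_survival(200, 50))
print(percent_survival(200, 210))
```

Comparing wells that received active complement with wells that received heat-inactivated complement separates antibody-dependent killing from ordinary growth during the incubation.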
表II(520页)给出了克隆、表达和纯化结果的小结。
实施例1
在脑膜炎奈瑟球菌中鉴定出下列部分DNA序列<SEQ ID 1>:
1 ATGAAACAGA CAGTCAA.AT GCTTGCCGCC GCCCTGATTG CCTTGGGCTT
51 GAACCGACCG GTGTGGNCGG ATGACGTATC GGATTTTCGG GAAAACTTGC
101 A.GCGGCAGC ACAGGGAAAT GCAGCAGCCC AATACAATTT GGGCGCAATG
151 TAT.TACAAA GGACGCGCGT GCGCCGGGAT GATGCTGAAG CGGTCAGATG
201 GTATCGGCAG CCGGCGGAAC AGGGGTTAGC CCAAGCCCAA TACAATTTGG
251 GCTGGATGTA TGCCAACGGG CGCGC.GTGC GCCAAGATGA TACCGAAGCG
301 GTCAGATGGT ATCGGCAGGC GGCAGCGCAG GGGGTTGTCC AAGCCCAATA
351 CAATTTGGGC GTGATATATG CCGAAGGACG TGGAGTGCGC CAAGACGATG
401 TCGAAGCGGT CAGATGGTTT CGGCAGGCGG CAGCGCAGGG GGTAGCCCAA
451 GCCCAAAACA ATTTGGGCGT GATGTATGCC GAAAGANCGC GCGTGCGCCA
501 AGACCG...
它对应于氨基酸序列<SEQ ID 2;ORF37>:
1 MKQTVXMLAA ALIALGLNRP VWXDDVSDFR ENLXAAAQGN AAAQYNLGAM
51 YXQRTRVRRD DAEAVRWYRQ PAEQGLAQAQ YNLGWMYANG RXVRQDDTEA
101 VRWYRQAAAQ GVVQAQYNLG VIYAEGRGVR QDDVEAVRWF RQAAAQGVAQ
151 AQNNLGVMYA ERXRVRQD...
进一步的工作揭示了完整的核苷酸序列<SEQ ID 3>:
1 ATGAAACAGA CAGTCAAATG GCTTGCCGCC GCCCTGATTG CCTTGGGCTT
51 GAACCGAGCG GTGTGGGCGG ATGACGTATC GGATTTTCGG GAAAACTTGC
101 AGGCGGCAGC ACAGGGAAAT GCAGCAGCCC AATACAATTT GGGCGCAATG
151 TATTACAAAG GACGCGGCGT GCGCCGGGAT GATGCTGAAG CGGTCAGATG
201 GTATCGGCAG GCGGCGGAAC AGGGGTTAGC CCAAGCCCAA TACAATTTGG
251 GCTGGATGTA TGCCAACGGG CGCGGCGTGC GCCAAGATGA TACCGAAGCG
301 GTCAGATGGT ATCGGCAGGC GGCAGCGCAG GGGGTTGTCC AAGCCCAATA
351 CAATTTGGGC GTGATATATG CCGAAGGACG TGGAGTGCGC CAAGACGATG
401 TCGAAGCGGT CAGATGGTTT CGGCAGGCGG CAGCGCAGGG GGTAGCCCAA
451 GCCCAAAACA ATTTGGGCGT GATGTATGCC GAAAGACGCG GCGTGCGCCA
501 AGACCGCGCC CTTGCACAAG AATGGTTTGG CAAGGCTTGT CAAAACGGAG
551 ACCAAGACGG CTGCGACAAT GACCAACGCC TGAAGGCGGG TTATTGA
其对应于氨基酸序列<SEQ ID 4;ORF37-1>:
1 MKQTVKWLAA ALIALGLNRA VWADDVSDFR ENLQAAAQGN AAAQYNLGAM
51 YYKGRGVRRD DAEAVRWYRQ AAEQGLAQAQ YNLGWMYANG RGVRQDDTEA
101 VRWYRQAAAQ GVVQAQYNLG VIYAEGRGVR QDDVEAVRWF RQAAAQGVAQ
151 AQNNLGVMYA ERRGVRQDRA LAQEWFGKAC QNGDQDGCDN DQRLKAGY*
进一步的工作鉴定了脑膜炎奈瑟球菌菌株A中对应的基因<SEQ ID 5>:
1 ATGAAACAGA CAGTCAAATG GCTTGCCGCC GCCCTGATTG CCTTGGGCTT
51 GAACCAAGCG GTGTGGGCGG ATGACGTATC GGATTTTCGG GAAAACTTGC
101 AGGCGGCAGC ACAGGGAAAT GCAGCAGCCC AAAACAATTT GGGCGTGATG
151 TATGCCGAAA GACGCGGCGT GCGCCAAGAC CGCGCCCTTG CACAAGAATG
201 GCTTGGCAAG GCTTGTCAAA ACGGATACCA AGACAGCTGC GACAATGACC
251 AACGCCTGAA AGCGGGTTAT TGA
它编码的蛋白具有以下的氨基酸序列<SEQ ID 6;ORF37a>:
1 MKQTVKWLAA ALIALGLNQA VWADDVSDFR ENLQAAAQGN AAAQNNLGVM
51 YAERRGVRQD RALAQEWLGK ACQNGYQDSC DNDQRLKAGY *
最初鉴定的部分菌株B序列(ORF37)和ORF37a在75个氨基酸的重叠区内显示出有68.0%的相同性:
10 20 30 40 50 60
orf37.pep
MKQTVXMLAAALIALGLNRPVWXDDVSDFRENLXAAAQGNAAAQYNLGAMYXQRTRVRRD
||||| |||||||||||: || | ||||||||| |||||||||| |||:|| :| ||:|
orf37a
MKQTVKWLAAALIALGLNQAVWADDVSDFRENLQAAAQGNAAAQNNLGVMYAERRGVRQD
10 20 30 40 50 60
70 80 90 100 110 120
orf37.pep DAEAVRWYRQPAEQGLAQAQ YNLGWMYANGRXVRQDDTEAVRWYRQAAAQGVVQAQYNLG
| | :| : ::|
orf37a RALAQEWLGKACQNGYQDSC DNDQRLKAGYX
70 80 90
进一步的工作鉴定了淋病奈瑟球菌中的对应基因<SEQ ID 7>:
1 ATGAAACAGA CAGTCAAATG GCTTGCCGCC GCCCTGATTG CCTTGGGCTT
51 GAACCAAGCG GTGTGGGCGG GTGACGTATC GGATTTTCGG GAAAACTTGC
101 AGgcggcaGA ACaggGAAAT GCAGCAGCCC AATTCAATTT GGGCGTGATG
151 TATGAAAATG GACAAGGAGT TCGTCAAGAT TATGTACAGG CAGTGCAGTG
201 GTATCGCAAG GCTTCAGAAC AAGGGGATGC CCAAGCCCAA TACAATTTGG
251 GCTTGATGTA TTACGATGGA CGCGGCGTGC GCCAAGACCT TGCGCTCGCT
301 CAACAATGGC TTGGCAAGGC TTGTCAAAAC GGAGACCAAA ACAGCTGCGA
351 CAATGACCAA CGCCTGAAGG CGGGTTATTA A
它编码的蛋白质具有以下的氨基酸序列<SEQ ID 8;ORF37ng>:
1 MKQTVKWLAA ALIALGLNQA VWAGDVSDFR ENLQAAEQGN AAAQFNLGVM
51 YENGQGVRQD YVQAVQWYRK ASEQGDAQAQ YNLGLMYYDG RGVRQDLALA
101 QQWLGKACQN GDQNSCDNDQ RLKAGY*
最初鉴定的部分菌株B序列(ORF37)在与ORF37ng重叠的111个氨基酸内显示出64.9%的相同性:
orf37.pep MKQTVXMLAAALIALGLNRPVWXDDVSDFRENLXAAAQGNAAAQYNLGAMYXQRTRVRRD 60
||||| |||||||||||: || ||||||||| || |||||||:|||:|| : ||:|
orf37ng MKQTVKWLAAALIALGLNQAVWAGDVSDFRENLQAAEQGNAAAQFNLGVMYENGQGVRQD 60
orf37.pep DAEAVRWYRQPAEQGLAQAQYNLGWMYANGRXVRQDDTEAVRWYRQAAAQGVVQAQYNLG 120
::||:|||: :||| |||||||| || :|| |||| : | :| :| :|
orf37ng YVQAVQWYRKASEQGDAQAQYNLGLMYYDGRGVRQDLALAQQWLGKACQNGDQNSCDNDQ 120
orf37.pep VIYAEGRGVRQDDVEAVRWFRQAAAQGVAQAQNNLGVMYAERXRVRQD 168
orf37ng RLKAGY 126
完整的菌株B序列(ORF37-1)和ORF37ng在重叠的198个氨基酸中显示出51.5%的相同性:
10 20 30 40 50 60
orf37-1.pep MKQTVKWLAAALIALGLNRAVWADDVSDFRENLQAAAQGNAAAQYNLGAMYYKGRGVRRD
||||||||||||||||||:|||| |||||||||||| |||||||:|||:|| :|:|||:|
orf37ng MKQTVKWLAAALIALGLNQAVWAGDVSDFRENLQAAEQGNAAAQFNLGVMYENGQGVRQD
10 20 30 40 50 60
70 80 90 100 110 120
orf37-1.pep DAEAVRWYRQAAEQGLAQAQYNLGWMYANGRGVRQDDTEAVRWYRQAAAQGVVQAQYNLG
::||:|||:|:||| |||||||| || :|||||||
orf37ng YVQAVQWYRKASEQGDAQAQYNLGLMYYDGRGVRQD------------------------
70 80 90
130 140 150 160 170 180
orf37-1.pep VIYAEGRGVRQDDVEAVRWFRQAAAQGVAQAQNNLGVMYAERRGVRQDRALAQEWFGKAC
||||:|:||||
orf37ng ------------------------------------------------LALAQQWLGKAC
100
190 199
orf37-1.pep QNGDQDGCDNDQRLKAGYX
|||||::||||||||||||
orf37ng QNGDQNSCDNDQRLKAGYX
110 120
这些氨基酸序列的计算机分析表明了一个推定的前导序列,预计脑膜炎奈瑟球菌和淋病奈瑟球菌的此蛋白质及其表位能用作疫苗或诊断用的抗原,或用来产生抗体。
如上所述,将ORF37-1(11kDa)克隆到pET和pGex载体中,并在大肠杆菌中表达。用SDS-PAGE分析蛋白表达和纯化的产物。图1A显示了GST-融合蛋白亲和纯化的结果,图1B显示了His-融合物在大肠杆菌中表达的结果。用纯化的GST-融合蛋白免疫小鼠,用该小鼠的血清进行ELISA(阳性结果),FACS分析(图1C)和杀菌试验(图1D)。这些实验确证ORF37-1是一种外露蛋白,并且是一种有用的免疫原。
图1E显示了ORF37-1的亲水性、抗原性指数以及AMPHI区域。
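The identity figures quoted in this example are of the form "X% identity in an N amino acid overlap". The Python sketch below shows how such a figure can be computed from two already-aligned sequences of equal length. It is not the FASTA or BLAST implementation; the function name and the convention of counting every column that contains at least one residue as part of the overlap are assumptions. The two strings are the first 60 aligned residues of ORF37-1 and ORF37ng as shown above.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Return (percent identity, overlap length) for two aligned sequences.

    Columns in which both sequences have a gap are ignored; a column with a
    gap in one sequence counts towards the overlap but not towards identity.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    overlap = 0
    identical = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap and b == gap:
            continue
        overlap += 1
        if a == b:
            identical += 1
    if overlap == 0:
        raise ValueError("no overlapping columns")
    return 100.0 * identical / overlap, overlap

orf37_1 = "MKQTVKWLAAALIALGLNRAVWADDVSDFRENLQAAAQGNAAAQYNLGAMYYKGRGVRRD"
orf37ng = "MKQTVKWLAAALIALGLNQAVWAGDVSDFRENLQAAEQGNAAAQFNLGVMYENGQGVRQD"
print(percent_identity(orf37_1, orf37ng))
```

Because the full ORF37-1/ORF37ng alignment contains long gapped stretches, the identity over the whole overlap is much lower than over this N-terminal window.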
实施例2
在脑膜炎奈瑟球菌中鉴定出下列部分DNA序列<SEQ ID 9>:
TTCGGCGA CATCGGCGGT TTGAAGGTCA ATGCCCCCGT CAAATCCGCA
GGCGTATTGG TCGGGCGCGT CGGCGCTATC GGACTTGACC CGAAATCCTA
TCAGGCGAGG GTGCGCCTCG ATTTGGACGG CAAGTATCAG TTCAGCAGCG
ACGTTTCCGC GCAAATCCTG ACTTCsGGAC TTTTGGGCGA GCAGTACATC
GGGCTGCAGC AGGGCGGCGA CACGGAAAAC CTTGCTGCCG GCGACACCAT
CTCCGTAACC AGTTCTGCAA TGGTTCTGGA AAACCTTATC GGCAAATTCA
TGACGAGTTT TGCCGAGAAA AATGCCGACG GCGGCAATGC GGAAAAAGCC
GCCGAATAA
它对应于氨基酸序列<SEQ ID 10>:
1 FGDIGGLKVN APVKSAGVLV GRVGAIGLDP KSYQARVRLD LDGKYQFSSD
51 VSAQILTSGL LGEQYIGLQQ GGDTENLAAG DTISVTSSAM VLENLIGKFM
101 TSFAEKNADG GNAEKAAE*
这些氨基酸序列的计算机分析给出了下列结果:
Homology with a hypothetical H. influenzae protein (yrbd.haein; accession number p45029)
SEQ ID 9 and yrbd.haein show 48.4% identity in a 122 amino acid overlap:
20 30 40 50 60 70
yrbd.h LGIGALVFLGLRVANVQGFAETKSYTVTATFDNIGGLKVRAPLKIGGVVIGRVSAITLDE
|::||||||:||:| :||::|||:||:||
N.m FGDIGGLKVNAPVKSAGVLVGRVGAIGLDP
10 20 30
80 90 100 110 120 130
yrbd.h KSYLPKVSIAINQEYNEIPENSSLSIKTSGLLGEQYIALTMGFDDGDTAMLKNGSQIQDT
||| ::|::::: :| ::::: | | ||||||||||:| | |||: | :|: | |
N.m KSYQARVRLDLDGKY-QFSSDVSAQILTSGLLGEQYIGLQQG---GDTENLAAGDTISVT
40 50 60 70 80
140 150 160
yrbd.h TSAMVLEDLIGQFL--YGSKKSDGNEKSESTEQ
:||||||:|||:|: :::|::||:: ::::|:
N.m SSAMVLENLIGKFMTSFAEKNADGGNAEKAAEX
90 100 110 120
与淋病奈瑟球菌的预计的ORF的同源性
SEQ ID 9与淋病奈瑟球菌的预计的ORF在重叠的118个氨基酸内显示出有99.2%的相同性:
20 30 40 50 60 70
yrbd GAAAVAFLAFRVAGGAAFGGSDKTYAVYADFGDIGGLKVNAPVKSAGVLVGRVGAIGLDP
||||||||||||||||||||||||||||||
N.m FGDIGGLKVNAPVKSAGVLVGRVGAIGLDP
10 20 30
80 90 100 110 120 130
yrbd KSYQARVRLDLDGKYQFSSDVSAQILTSGLLGEQYIGLQQGGDTENLAAGDTISVTSSAM
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
N.m KSYQARVRLDLDGKYQFSSDVSAQILTSGLLGEQYIGLQQGGDTENLAAGDTISVTSSAM
40 50 60 70 80 90
140 150 160
yrbd VLENLIGKFMTSFAEKNAEGGNAEKAAEX
||||||||||||||||||:||||||||||
N.m VLENLIGKFMTSFAEKNADGGNAEKAAEX
100 110 120
完整的yrbd流感嗜血菌序列具有一个前导序列,预计全长的同源脑膜炎奈瑟球菌该蛋白也会有一个前导序列。这提示它可能是膜蛋白、分泌的蛋白或表面蛋白,且该蛋白或其表位之一可能是疫苗或诊断的有用抗原。
实施例3
在脑膜炎奈瑟球菌中鉴定出下列部分DNA序列<SEQ ID 11>:
1 ..ATTTTGATAT ACCTCATCCG CAAGAATCTA GGTTCGCCCG TCTTCTTCTT
51 TCAGGAACGC CCCGGAAAGG ACGGAAAACC TTTTAAAATG GTCAAATTCC
101 GTTCCATGCG CGACGGCTTG TATTCAGACG GCATTCCGCT GCCCGACGGA
151 GAACGCCTGA CACCGTTCGG CAAAAAACTG CGTGCCGcCA GTwTGGACGA
201 ACTGCCTGAA TTATGGAATA TCTTAAAAGG CGAGATGAGC CTGGTCGGCC
251 CCCGCCCGCT GCTGATGCAA TATCTGCCGC TGTACGACAA CTTCCAAAAC
301 CGCCGCCACG AAATGAAACC CGGCATTACC GGCTGGGCGC AGGTCAACGG
351 GCGCAACGCg CTTTCGTGGG ACGAAAAATT CGCCTGCGAT GTTTGGTATA
401 TCGACCACTT CAGCCTGTGC CTCGACATCA AAATCCTACT GCTGACGGTT
451 AAAAAAGTAT TAATCAAGGA AGGGATTTCC GCACAGGGCG AACA.aCCAT
501 GCCCCCTTTC ACAGGAAAAC GCAAACTCGC CGTCGTCGGT GCGGGCGGAC
551 ACGGAAAAGT CGTTGCCGAC CTTGCCGCCG CACTCGGCCG GTACAGGGAA
601 ATCGTTTTTC TGGACGACCG CGCACAAGGC AGCGTCAACG GCTTTTCCGT
651 CATCGGCACG ACGCTGCTGC TTGAAAACAG TTTATCGCCC GAACAATACG
701 ACGTCGCCGT CGCCGTCGGC AACAACCGCA TCCGCCGCCA AATCGCCGAA
751 AAAGCCGCCG CGCTCGGCTT CGCCCTGCCC GTACTGGTTC ATCCGGACGC
801 GACCGTCTCG CCTTCTGCAA CAGTCGGACA AGGCAGCGTC GTTATGGCGA
851 AAGCGGTCG..
它对应于氨基酸序列<SEQ ID 12;ORF3>:
1 ..ILIYLIRKNL GSPVFFFQER PGKDGKPFKM VKFRSMRDGL YSDGIPLPDG
51 ERLTPFGKKL RAASXDELPE LWNILKGEMS LVGPRPLLMQ YLPLYDNFQN
101 RRHEMKPGIT GWAQVNGRNA LSWDEKFACD VWYIDHFSLC LDIKILLLTV
151 KKVLIKEGIS AQGEXTMPPF TGKRKLAVVG AGGHGKVVAD LAAALGRYRE
201 IVFLDDRAQG SVNGFSVIGT TLLLENSLSP EQYDVAVAVG NNRIRRQIAE
251 KAAALGFALP VLVHPDATVS PSATVGQGSV VMAKAV..
进一步的序列分析揭示了完整的核苷酸序列<SEQ ID 13>:
1 ATGAGTAAAT TCTTCAAACG CCTGTTTGAC ATTGTTGCCT CCGCCTCGGG
51 ACTGATTTTC CTCTCGCCAG TATTTTTGAT TTTGATATAC CTCATCCGCA
101 AGAATCTAGG TTCGCCCGTC TTCTTCTTTC AGGAACGCCC CGGAAAGGAC
151 GGAAAACCTT TTAAAATGGT CAAATTCCGT TCCATGCGCG ACGCGCTTGA
201 TTCAGACGGC ATTCCGCTGC CCGACGGAGA ACGCCTGACA CCGTTCGGCA
251 AAAAACTGCG TGCCGCCAGT TTGGACGAAC TGCCTGAATT ATGGAATATC
301 TTAAAAGGCG AGATGAGCCT GGTCGGCCCC CGCCCGCTGC TGATGCAATA
351 TCTGCCGCTG TACGACAACT TCCAAAACCG CCGCCACGAA ATGAAACCCG
401 GCATTACCGG CTGGGCGCAG GTCAACGGGC GCAACGCGCT TTCGTGGGAC
451 GAAAAATTCG CCTGCGATGT TTGGTATATC GACCACTTCA GCCTGTGCCT
501 CGACATCAAA ATCCTACTGC TGACGGTTAA AAAAGTATTA ATCAAGGAAG
551 GGATTTCCGC ACAGGGCGAA GCCACCATGC CCCCTTTCAC AGGAAAACGC
601 AAACTCGCCG TCGTCGGTGC GGGCGGACAC GGAAAAGTCG TTGCCGACCT
651 TGCCGCCGCA CTCGGCCGGT ACAGGGAAAT CGTTTTTCTG GACGACCGCG
701 CACAAGGCAG CGTCAACGGC TTTTCCGTCA TCGGCACGAC GCTGCTGCTT
751 GAAAACAGTT TATCGCCCGA ACAATACGAC GTCGCCGTCG CCGTCGGCAA
801 CAACCGCATC CGCCGCCAAA TCGCCGAAAA AGCCGCCGCG CTCGGCTTCG
851 CCCTGCCCGT TCTGGTTCAT CCGGACGCGA CCGTCTCGCC TTCTGCAACA
901 GTCGGACAAG GCAGCGTCGT TATGGCGAAA GCCGTCGTAC AGGCAGGCAG
951 CGTATTGAAA GACGGCGTGA TTGTGAACAC TGCCGCCACC GTCGATCACG
1001 ACTGCCTGCT TAACGCTTTC GTCCACATCA GCCCAGGCGC GCACCTGTCG
1051 GGCAACACGC ATATCGGCGA AGAAAGCTGG ATAGGCACGG GCGCGTGCAG
1101 CCGCCAGCAG ATCCGTATCG GCAGCCGCGC AACCATTGGA GCGGGCGCAG
1151 TCGTCGTACG CGACGTTTCA GACGGCATGA CCGTCGCGGG CAATCCGGCA
1201 AAGCCGCTGC CGCGCAAAAA CCCCGAGACC TCGACAGCAT AA
This corresponds to the amino acid sequence <SEQ ID 14; ORF3-1>:
1 MSKFFKRLFD IVASASGLIF LSPVFLILIY LIRKNLGSPV FFFQERPGKD
51 GKPFKMVKFR SMRDALDSDG IPLPDGERLT PFGKKLRAAS LDELPELWNI
101 LKGEMSLVGP RPLLMQYLPL YDNFQNRRHE MKPGITGWAQ VNGRNALSWD
151 EKFACDVWYI DHFSLCLDIK ILLLTVKKVL IKEGISAQGE ATMPPFTGKR
201 KLAVVGAGGH GKVVADLAAA LGRYREIVFL DDRAQGSVNG FSVIGTTLLL
251 ENSLSPEQYD VAVAVGNNRI RRQIAEKAAA LGFALPVLVH PDATVSPSAT
301 VGQGSVVMAK AVVQAGSVLK DGVIVNTAAT VDHDCLLNAF VHISPGAHLS
351 GNTHIGEESW IGTGACSRQQ IRIGSRATIG AGAVVVRDVS DGMTVAGNPA
401 KPLPRKNPET STA*
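The correspondence between a nucleotide listing and its predicted protein can be checked by translating the open reading frame with the standard genetic code. The following is a minimal sketch under stated assumptions, not the software used for the original analysis: the function name `translate` is hypothetical, reading starts at the first base supplied, and any codon containing an ambiguous or unreadable base is reported as `X`, as in the partial sequences above.

```python
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a DNA string in frame 1; unknown codons become 'X'."""
    # Drop spacing and position numbers; lowercase (uncertain) bases are
    # read as uppercase, and placeholders such as '.' are kept as bases.
    seq = "".join(ch for ch in dna.upper() if not ch.isspace() and not ch.isdigit())
    usable = len(seq) - len(seq) % 3  # ignore a trailing incomplete codon
    codons = (seq[i:i + 3] for i in range(0, usable, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


# First and last bases of SEQ ID 13, copied from the listing above.
print(translate("ATGAGTAAAT TCTTCAAACG CCTGTTTGAC"))  # MSKFFKRLFD
print(translate("TCGACAGCAT AA"))                      # STA*
```

Because unrecognised codons map to `X`, the same function reproduces the `X` positions in partial sequences such as SEQ ID 12, provided the input starts on a codon boundary.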
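Computer analysis of this amino acid sequence gave the following results:
Homology with a predicted ORF from N. meningitidis (strain A)
ORF3 shows 93.0% identity over a 286 aa overlap with an ORF (ORF3a) from strain A of N. meningitidis: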
10 20 30
orf3.pep ILIYLIRKNLGSPVFFFQERPGKDGKPFKMVKFR
|||||| ||||||||||||||||||||||||||||
orf3a MSKFFKRLFDIVASASGLIFLSPVFLILIYLIRKNLGSPVFFFQERPGKDGKPFKMVKFR
10 20 30 40 50 60
40 50 60 70 80 90
orf3.pep SMRDGLYSDGIPLPDGERLTPFGKKLRAASXDELPELWNILKGEMSLVGPRPLLMQYLPL
||:|:| |||| |||||||||||||||||| ||||||||:|||:||||||||||||||||
orf3a SMHDALDSDGILLPDGERLTPFGKKLRAASLDELPELWNVLKGDMSLVGPRPLLMQYLPL
70 80 90 100 110 120
100 110 120 130 140 150
orf3.pep YDNFQNRRHEMKPGITGWAQVNGRNALSWDEKFACDVWYIDHFSLCLDIKILLLTVKKVL
|||||||||||||||||||||||||||||||:||||:|||||||||||||||||||||||
orf3a YDNFQNRRHEMKPGITGWAQVNGRNALSWDERFACDIWYIDHFSLCLDIKILLLTVKKVL
130 140 150 160 170 180
160 170 180 190 200 210
orf3.pep IKEGISAQGEXTMPPFTGKRKLAVVGAGGHGKVVADLAAALGRYREIVFLDDRAQGSVNG
|||||||||| ||||||||||||||||||||||||:|||||| | ||||||||:||||||
orf3a IKEGISAQGEATMPPFTGKRKLAVVGAGGHGKVVAELAAALGTYGEIVFLDDRVQGSVNG
190 200 210 220 230 240
220 230 240 250 260 270
orf3.pep FSVIGTTLLLENSLSPEQYDVAVAVGNNRIRRQIAEKAAALGFALPVLVHPDATVSPSAT
| ||||||||||||||||:|:|||||||||||||||||||||||||||:|||:|||||||
orf3a FPVIGTTLLLENSLSPEQFDIAVAVGNNRIRRQIAEKAAALGFALPVLIHPDSTVSPSAT
250 260 270 280 290 300
280
orf3.pep VGQGSVVMAKAV
||||:|||||||
orf3a VGQGGVVMAKAVVQADSVLKDGVIVNTAATVDHDCLLDAFVHISPGAHLSGNTRIGEESW
310 320 330 340 350 360
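Identity figures such as the one quoted for the alignment above can be recomputed from the two aligned rows. The sketch below is illustrative only; the function name `percent_identity` and the `-` gap character are assumptions, and alignment programs differ in how they treat gaps and ambiguous residues such as `X`, so the result may not match a reported figure exactly. Here gapped columns are skipped and `X` counts as a mismatch.

```python
def percent_identity(a: str, b: str, gap: str = "-") -> float:
    """Percent identity over the aligned columns of two equal-length rows."""
    if len(a) != len(b):
        raise ValueError("aligned rows must have the same length")
    # Keep only columns where both rows carry a residue (no gap on either side).
    columns = [(x, y) for x, y in zip(a.upper(), b.upper()) if gap not in (x, y)]
    if not columns:
        return 0.0
    matches = sum(1 for x, y in columns if x == y)
    return 100.0 * matches / len(columns)


# Final block of the ORF3 / ORF3a alignment above: one mismatch in 12 columns.
print(round(percent_identity("VGQGSVVMAKAV", "VGQGGVVMAKAV"), 1))  # 91.7
```

Applying the function to the concatenated rows of a complete alignment gives a single overall figure for the overlap.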
The full-length ORF3a nucleotide sequence <SEQ ID 15> is:
1 ATGAGTAAAT TCTTCAAACG CCTGTTTGAC ATTGTTGCCT CCGCCTCGGG
51 ACTGATTTTC CTCTCGCCAG TATTTTTGAT TTTGATATAC CTCATCCGCA
101 AGAATCTGGG TTCGCCCGTC TTCTTCTTTC AGGAACGCCC CGGAAAGGAC
151 GGAAAACCTT TTAAAATGGT CAAATTCCGT TCCATGCACG ACGCGCTTGA
201 TTCAGACGGC ATTCTGCTGC CCGACGGAGA ACGCCTGACA CCGTTCGGCA
251 AAAAACTGCG TGCCGCCAGT TTGGACGAAC TGCCCGAACT GTGGAACGTC
301 CTCAAAGGCG ACATGAGCCT GGTCGGCCCC CGCCCGCTGC TGATGCAATA
351 TCTGCCGCTG TACGACAACT TCCAAAACCG CCGCCACGAA ATGAAACCGG
401 GCATTACCGG CTGGGCGCAG GTCAACGGGC GCAACGCGCT TTCGTGGGAC
451 GAACGCTTCG CATGCGACAT CTGGTATATC GACCACTTCA GCCTGTGCCT
501 CGACATCAAA ATCCTACTGC TGACGGTTAA AAAAGTATTA ATCAAAGAAG
551 GGATTTCCGC ACAGGGCGAA GCCACCATGC CCCCTTTCAC AGGAAAACGC
601 AAACTTGCCG TCGTCGGTGC GGGCGGACAC GGCAAAGTCG TTGCCGAGCT
651 TGCCGCCGCA CTCGGCACAT ACGGCGAAAT CGTTTTTCTG GACGACCGCG
701 TCCAAGGCAG CGTCAACGGC TTCCCCGTCA TCGGCACGAC GCTGCTGCTT
751 GAAAACAGTT TATCGCCCGA ACAATTCGAC ATCGCCGTCG CCGTCGGCAA
801 CAACCGCATC CGCCGCCAAA TCGCCGAAAA AGCCGCCGCG CTCGGCTTCG
851 CCCTGCCCGT CCTGATTCAT CCGGACTCGA CCGTCTCGCC TTCTGCAACA
901 GTCGGACAAG GCGGCGTCGT TATGGCGAAA GCCGTCGTAC AGGCTGACAG
951 CGTATTGAAA GACGGCGTAA TTGTGAACAC TGCCGCCACC GTCGATCACG
1001 ATTGCCTGCT TGATGCTTTC GTCCACATCA GCCCGGGCGC GCACCTGTCG
1051 GGCAACACGC GTATCGGCGA AGAAAGCTGG ATAGGCACAG GCGCGTGCAG
1101 CCGCCAGCAG ATCCGTATCG GCAGCCGCGC AACCATTGGA GCGGGCGCAG
1151 TCGTCGTGCG CGACGTTTCA GACGGCATGA CCGTCGCGGG CAACCCGGCA
1201 AAACCATTGG CAGGCAAAAA TACCGAGACC CTGCGGTCGT AA
This is predicted to encode a protein having the following amino acid sequence <SEQ ID 16>:
1 MSKFFKRLFD IVASASGLIF LSPVFLILIY LIRKNLGSPV FFFQERPGKD
51 GKPFKMVKFR SMHDALDSDG ILLPDGERLT PFGKKLRAAS LDELPELWNV
101 LKGDMSLVGP RPLLMQYLPL YDNFQNRRHE MKPGITGWAQ VNGRNALSWD
151 ERFACDIWYI DHFSLCLDIK ILLLTVKKVL IKEGISAQGE ATMPPFTGKR
201 KLAVVGAGGH GKVVAELAAA LGTYGEIVFL DDRVQGSVNG FPVIGTTLLL
251 ENSLSPEQFD IAVAVGNNRI RRQIAEKAAA LGFALPVLIH PDSTVSPSAT
301 VGQGGVVMAK AVVQADSVLK DGVIVNTAAT VDHDCLLDAF VHISPGAHLS
351 GNTRIGEESW IGTGACSRQQ IRIGSRATIG AGAVVVRDVS DGMTVAGNPA
401 KPLAGKNTET LRS*
The two transmembrane domains are underlined.
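The source does not state how the transmembrane domains were predicted. As a rough illustration of how candidate membrane-spanning stretches can be located, the sketch below computes a sliding-window Kyte-Doolittle hydropathy profile; the choice of this scale, the 19-residue window, and the function name `hydropathy_profile` are all assumptions made for the example. With this scale, window means above roughly 1.6 are commonly taken as a hint of a membrane-spanning segment.

```python
# Kyte-Doolittle hydropathy index for the 20 standard amino acids.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(protein: str, window: int = 19) -> list[float]:
    """Mean hydropathy of every full window; unknown residues score 0."""
    scores = [KD.get(res, 0.0) for res in protein.upper() if res.isalpha()]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]


# First 50 residues of ORF3a (SEQ ID 16), copied from the listing above.
n_terminus = "MSKFFKRLFDIVASASGLIFLSPVFLILIYLIRKNLGSPVFFFQERPGKD"
profile = hydropathy_profile(n_terminus)
peak = max(range(len(profile)), key=profile.__getitem__)
print(f"most hydrophobic window starts at residue {peak + 1}, mean {profile[peak]:.2f}")
```

A profile of this kind only flags hydrophobic stretches; it does not distinguish a cleavable leader peptide from a true transmembrane helix.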
ORF3-1 and ORF3a show 94.6% identity over a 410 aa overlap:
10 20 30 40 50 60
orf3a.pep MSKFFKRLFDIVASASGLIFLSPVFLILIYLIRKNLGSPVFFFQERPGKDGKPFKMVKFR
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf3-1 MSKFFKRLFDIVASASGLIFLSPVFLILIYLIRKNLGSPVFFFQERPGKDGKPFKMVKFR
10 20 30 40 50 60
70 80 90 100 110 120
orf3a.pep SMHDALDSDGILLPDGERLTPFGKKLRAASLDELPELWNVLKGDMSLVGPRPLLMQYLPL
||:|||||||| |||||||||||||||||||||||||||:|||:||||||||||||||||
orf3-1 SMRDALDSDGIPLPDGERLTPFGKKLRAASLDELPELWNILKGEMSLVGPRPLLMQYLPL
70 80 90 100 110 120
130 140 150 160 170 180
orf3a.pep YDNFQNRRHEMKPGITGWAQVNGRNALSWDERFACDIWYIDHFSLCLDIKILLLTVKKVL
|||||||||||||||||||||||||||||||:||||:|||||||||||||||||||||||
orf3-1 YDNFQNRRHEMKPGITGWAQVNGRNALSWDEKFACDVWYIDHFSLCLDIKILLLTVKKVL
130 140 150 160 170 180
190 200 210 220 230 240
orf3a.pep IKEGISAQGEATMPPFTGKRKLAVVGAGGHGKVVAELAAALGTYGEIVFLDDRVQGSVNG
|||||||||||||||||||||||||||||||||||:|||||| | ||||||||:||||||
orf3-1 IKEGISAQGEATMPPFTGKRKLAVVGAGGHGKVVADLAAALGRYREIVFLDDRAQGSVNG
190 200 210 220 230 240
250 260 270 280 290 300
orf3a.pep FPVIGTTLLLENSLSPEQFDIAVAVGNNRIRRQIAEKAAALGFALPVLIHPDSTVSPSAT
| ||||||||||||||||:|:|||||||||||||||||||||||||||:|||:|||||||
orf3-1 FSVIGTTLLLENSLSPEQYDVAVAVGNNRIRRQIAEKAAALGFALPVLVHPDATVSPSAT
250 260 270 280 290 300
310 320 330 340 350 360
orf3a.pep VGQGGVVMAKAVVQADSVLKDGVIVNTAATVDHDCLLDAFVHISPGAHLSGNTRIGEESW
||||:|||||||||| |||||||||||||||||||||:|||||||||||||||:||||||
orf3-1 VGQGSVVMAKAVVQAGSVLKDGVIVNTAATVDHDCLLNAFVHISPGAHLSGNTHIGEESW
310 320 330 340 350 360
370 380 390 400 410
orf3a.pep IGTGACSRQQIRIGSRATIGAGAVVVRDVSDGMTVAGNPAKPLAGKNTETLRSX
||||||||||||||||||||||||||||||||||||||||||| || ||
orf3-1 IGTGACSRQQIRIGSRATIGAGAVVVRDVSDGMTVAGNPAKPLPRKNPETSTAX
370 380 390 400 410
Homology with a hypothetical protein encoded by the yvfc gene of B. subtilis (accession number Z71928)
ORF3 and the YVFC protein show 55% aa identity over a 170 aa overlap (BLASTp):
ORF3 3 IYLIRKNLGSPVFFFQERPGKDGKPFKMVKFRSMRDGLYSDGIPLPDGERLTPFGKKLRA 62
I ++R +GSPVFF Q RPG GKPF + KFR+M D S G LPD RLT G+ +R
yvfc 27 IAVVRLKIGSPVFFKQVRPGLHGKPFTLYKFRTMTDERDSKGNLLPDEVRLTKTGRLIRK 86
ORF3 63 ASXDELPELWNILKGEMSLVGPRPLLMQYLPLYDNFQNRRHEMKPGITGWAQVNGRNALS 122
S DELP+L N+LKG++SLVGPRPLLM YLPLY Q RRHE+KPGITGWAQ+NGRNA+S
yvfc 87 LSIDELPQLLNVLKGDLSLVGPRPLLMDYLPLYTEKQARRHEVKPGITGWAQINGRNAIS 146
ORF3 123 WDEKFACDVWYIDHFSLCLDXXXXXXXXXXXXXXEGISAQGEXTMPPFTG 172
W++KF DVWY+D++S LD EGI T FTG
yvfc 147 WEKKFELDVWYVDNWSFFLDLKILCLTVRKVLVSEGIQQTNHVTAERFTG 196
Homology with a predicted ORF from N. gonorrhoeae
ORF3 shows 86.3% identity over a 286 aa overlap with a predicted ORF (ORF3.ng) from N. gonorrhoeae:
orf3 ILIYLIRKNLGSPVFFFQERPGKDGKPFKMVKFR 34
:||||| ||| ||||||::||||||||||||||||
orf3ng MSKAVKRLFDIIASASGLIVLSPVFLVLIYLIRKNKGSPVFFIRERPGKDGKPFKMVKFR 60
orf3 SMRDGLYSDGIPLPD GERLTPFGKKLRAASXD ELPELWNILKGEMSLVGPRPLLMQYLPL 94
||||:| |||||||| :|||| |||||||:| | |||||||:||||||||||||||||||||
orf3ng SMRDALDSDGIPLPD SERLTDFGKKLRATSLD ELPELWNVLKGEMSLVGPRPLLMQYLPL 120
orf3 YDNFQNRRHEMKPGI TGWAQVNGRNALSWDEK FACDVWYIDHFSLCLDIKILLLTVKKVL 154
|::|||||||||||| ||||||||||||||||| |:||||| |:||: ||:|||:|||||||
orf3ng YNKFQNRRHEMKPGI TGWAQVNGRNALSWDEK FSCDVWYTDNFSFWLDMKILFLTVKKVL 180
orf3 IKEGISAQGEXTMPP FTGKRKLAVVGAGGHGK VVADLAAALGRYREIVFLDDRAQGSVNG 214
|||||||||| |||| |:|:|||||:||||||| |||:|||||| | ||||||||:||||||
orf3ng IKEGISAQGEATMPP FAGNRKLAVIGAGGHGK VVAELAAALGTYGEIVFLDDRTQGSVNG 240
orf3 FSVIGTTLLLENSLS PEQYDVAVAVGNNRIRR QIAEKAAALGFALPVLVHPDATVSPSAT 274
| ||||||||||||| |||:|::|||||||||| ||:|:|||||| ||||:||||||||||
orf3ng FPVIGTTLLLENSLS PEQFDITVAVGNNRIRR QITENAAALGFKLPVLIHPDATVSPSAI 300
orf3 VGQGSVVMAKAV 286
:|||||||||||
orf3ng IGQGSVVMAKAVVQA GSVLKDGVIVNTAATVD HDCLLDAFVHISPGAHLSGNTRIGEESR 360
The full-length ORF3ng nucleotide sequence <SEQ ID 17> is:
1 ATGAGTAAAG CCGTCAAACG CCTGTTCGAC ATCATCGCAT CCGCATCGGG
51 GCTGATTGTC CTGTCGCCCG TGTTTTTGGT TTTAATATAC CTCATCCGCA
101 AAAACTTAGG TTCGCCCGTC TTCTTCattC GGGAACGCCc cgGAAAGGAc
151 ggaaaacCTT TTAAAATGGT CAAATTCCGT TCCAtgcgcg acgcgcttGA
201 TTCAGACGGC ATTCCGCTGC CCGATAGCGA ACGCCTGACC GATTTCGGCA
251 AAAAATTACG CGCCACCAGT TTGGACGAAC TTCCTGAATT ATGGAATGTC
301 CTCAAAGGCG AGATGAGCCT GGTCGGCCCC CGCCCGCTTT TGATGCAGTA
351 TCTGCCGCTT TACAACAAAT TTCAAAACCG CCGCCACGAA ATGAAACCGG
401 GCATTACCGG CTGGGCGCAG GTCAACGGGC GCAACGCGCT TTCGTGGGAC
451 GAAAAGTTCT CCTGCGATGT TTGGTACACC GACAATTTCA GCTTTTGGCT
501 GGATATGAAA ATCCTGTTTC TGACAGTCAA AAAAGTCTTG ATTAAAGAAG
551 GCATTTCGGC GCAAGGGGAA GCCACCATGC CCCCTTTCGC GGGGAATCGC
601 AAACTCGCCG TTATCGGCGC GGGCGGACAC GGCAAAGTCG TTGCCGAGCT
651 TGCCGCCGCA CTCGGCACAT ACGGCGAAAT CGTTTTTCTG GACGACCGCA
701 CCCAAGGCAG CGTCAACGGC TTCCCCGTCA TCGGCACGAC GCTGCTGCTT
751 GAAAACAGTT TATCGCCCGA ACAATTCGAC ATCACCGTCG CCGTCGGCAA
801 CAACCGCATC CGCCGCCAAA TCACCGAAAA CGCCGCCGCG CTCGGCTTCA
851 AACTGCCCGT TCTGATTCAT CCCGACGCGA CCGTCTCGCC TTCTGCAATA
901 ATCGGACAAG GCAGCGTCGT AATGGCGAAA GCCGTCGTAC AGGCCGGCAG
951 CGTATTGAAA GACGGCGTGA TTGTGAACAC TGCCGCCACC GTCGATCACG
1001 ACTGCCTGCT TGACGCTTTC GtccaCATCA GCCCGGGCGC GCACCTGTCG
1051 GGCAACACGC GTATCGGCGA AGAAAGCCGG ATAGGCACGG GCGCGTGCAG
1101 CCGCCAGCAG ACAACCGTCG GCAGCGGGGT TACCgGcgGT GCAGGGgcGG
1151 TTATCGTATG CGACATCCCG GACGGCATGA CCGTCGCGGG CAACCCGGCA
1201 AAGCCCCTTA CGGGCAAAAA CCCCAAGACC GGGACGGCAT AA
This encodes a protein having the following amino acid sequence <SEQ ID 18>:
1 MSKAVKRLFD IIASASGLIV LSPVFLVLIY LIRKNLGSPV FFIRERPGKD
51 GKPFKMVKFR SMRDALDSDG IPLPDSERLT DFGKKLRATS LDELPELWNV
101 LKGEMSLVGP RPLLMQYLPL YNKFQNRRHE MKPGITGWAQ VNGRNALSWD
151 EKFSCDVWYT DNFSFWLDMK ILFLTVKKVL IKEGISAQGE ATMPPFAGNR
201 KLAVIGAGGH GKVVAELAAA LGTYGEIVFL DDRTQGSVNG FPVIGTTLLL
251 ENSLSPEQFD ITVAVGNNRI RRQITENAAA LGFKLPVLIH PDATVSPSAI
301 IGQGSVVMAK AVVQAGSVLK DGVIVNTAAT VDHDCLLDAF VHISPGAHLS
351 GNTRIGEESR IGTGACSRQQ TTVGSGVTAG AGAVIVCDIP DGMTVAGNPA
401 KPLTGKNPKT GTA*
This protein shows 86.9% identity with ORF3-1 over a 413 aa overlap:
10 20 30 40 50 60
orf3-1.pep MSKFFKRLFDIVASASGLIFLSPVFLILIYLIRKNLGSPVFFFQERPGKDGKPFKMVKFR
||| ||||||:||||||| ||||||:|||||||||||||||::||||||||||||||||
orf3ng MSKAVKRLFDIIASASGLIVLSPVFLVLIYLIRKNLGSPVFFIRERPGKDGKPFKMVKFR
10 20 30 40 50 60
70 80 90 100 110 120
orf3-1.pep SMRDALDSDGIPLPDGERLTPFGKKLRAASLDELPELWNILKGEMSLVGPRPLLMQYLPL
|||||||||||||||:|||| |||||||:||||||||||:||||||||||||||||||||
orf3ng SMRDALDSDGIPLPDSERLTDFGKKLRATSLDELPELWNVLKGEMSLVGPRPLLMQYLPL
70 80 90 100 110 120
130 140 150 160 170 180
orf3-1.pep YDNFQNRRHEMKPGITGWAQVNGRNALSWDEKFACDVWYIDHFSLCLDIKILLLTVKKVL
|::||||||||||||||||||||||||||||||:||||| |:||: ||:|||:|||||||
orf3ng YNKFQNRRHEMKPGITGWAQVNGRNALSWDEKFSCDVWYTDNFSFWLDMKILFLTVKKVL
130 140 150 160 170 180
190 200 210 220 230 240
orf3-1.pep IKEGISAQGEATMPPFTGKRKLAVVGAGGHGKVVADLAAALGRYREIVFLDDRAQGSVNG
||||||||||||||||:|:|||||:||||||||||:|||||| | ||||||||:||||||
orf3ng IKEGISAQGEATMPPFAGNRKLAVIGAGGHGKVVAELAAALGTYGEIVFLDDRTQGSVNG
190 200 210 220 230 240
250 260 270 280 290 300
orf3-1.pep FSVIGTTLLLENSLSPEQYDVAVAVGNNRIRRQIAEKAAALGFALPVLVHPDATVSPSAT
| ||||||||||||||||:|::||||||||||||:|:|||||| ||||:||||||||||
orf3ng FPVIGTTLLLENSLSPEQFDITVAVGNNRIRRQITENAAALGFKLPVLIHPDATVSPSAI
250 260 270 280 290 300
310 320 330 340 350 360
orf3-1.pep VGQGSVVMAKAVVQAGSVLKDGVIVNTAATVDHDCLLNAFVHISPGAHLSGNTHIGEESW
:||||||||||||||||||||||||||||||||||||:|||||||||||||||:|||||
orf3ng IGQGSVVMAKAVVQAGSVLKDGVIVNTAATVDHDCLLDAFVHISPGAHLSGNTRIGEESR
310 320 330 340 350 360
370 380 390 400 410
orf3-1.pep IGTGACSRQQIRIGSRATIGAGAVVVRDVSDGMTVAGNPAKPLPRKNPETSTAX
|||||||||| :|| :| |||||:| |: ||||||||||||| |||:|:|||
orf3ng IGTGACSRQQTTVGSGVTAGAGAVIVCDIPDGMTVAGNPAKPLTGKNPKTGTAX
370 380 390 400 410
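In addition, ORF3ng shows significant homology with a hypothetical protein from B. subtilis:
gnl|PID|e238668 (Z71928) hypothetical protein [Bacillus subtilis]
>gi|1945702|gnl|PID|e313004 (Z94043) hypothetical protein [Bacillus subtilis] >gi|2635938|gnl|PID|e1186113 (Z99121) similar to
capsular polysaccharide biosynthesis [Bacillus subtilis] Length = 202
Score = 235 bits (594), Expect = 3e-61
Identities = 114/195 (58%), Positives = 142/195 (72%)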
Query: 5 VKRLFDIIASASGLIVLSPVFLVLIYLIRKNLGSPVFFIRERPGKDGKPFKMVKFRSMRD 64
+KRLFD+ A+ L S + L I ++R +GSPVFF + RPG GKPF + KFR+M D
Sbjct: 3 LKRLFDLTAAIFLLCCTSVIILFTIAVVRLKIGSPVFFKQVRPGLHGKPFTLYKFRTMTD 62
Query: 65 ALDSDGIPLPDSERLTDFGKKLRATSLDELPELWNVLKGEMSLVGPRPLLMQYLPLYNKF 124
DS G LPD RLT G+ +R S+DELP+L NVLKG++SLVGPRPLLM YLPLY +
Sbjct: 63 ERDSKGNLLPDEVRLTKTGRLIRKLSIDELPQLLNVLKGDLSLVGPRPLLMDYLPLYTEK 122
Query: 125 QNRRHEMKPGITGWAQVNGRNALSWDEKFSCDVWYTDNFSFWLDMKILFLTVKKVLIKEG 184
Q RRHE+KPGITGWAQ+NGRNA+SW++KF DVWY DN+SF+LD+KIL LTV+KVL+ EG
Sbjct: 123 QARRHEVKPGITGWAQINGRNAISWEKKFELDVWYVDNWSFFLDLKILCLTVRKVLVSEG 182
Query: 185 ISAQGEATMPPFAGN 199
I T F G+
Sbjct: 183 IQQTNHVTAERFTGS 197
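The hypothetical product of the yvfc gene shows similarity to EXOY of R. meliloti, an exopolysaccharide production protein. On this basis, and given the two predicted transmembrane regions in the homologous N. gonorrhoeae sequence, these proteins, or their epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies.
Example 4
The following partial DNA sequence was identified in N. meningitidis <SEQ ID 19>: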
1 ..AACCATATGG CGATTGTCAT CGACGAATAC GGCGGCACAT CCGGCTTGGT
51 CACCTTTGAA GACATCATCG AGCAAATCGT CGGCGAAATC GAAGACGAGT
101 TTGACGAAGA CGATAGCGCC GACAATATCC ATGCCGTTTC TTCAGACACG
151 TGGCGCATCC ATGCAGCTAC CGAAATCGAA GACATCAACA CCTTCTTCGG
201 CACGGAATAC AGCATCGAAG AAGCCGACAC CATT.GGCGG CCTGGTCATT
251 CAAGAGTTGG GACATCTGCC CGTGCGCGGC GAAAAAGTCC TTATCGGCGG
301 TTTGCAGTTC ACCGTCGCAC GCGCCGACAA CCGCCGCCTG CATACGCTGA
351 TGGCGACCCG CGTGAAGTAA GC........ .....ACCGC CGTTTCTGCA
401 CAGTTTAG
This corresponds to the amino acid sequence <SEQ ID 20; ORF5>:
1 ..NHMAIVIDEY GGTSGLVTFE DIIEQIVGEI EDEFDEDDSA DNIHAVSSDT
51 WRIHAATEIE DINTFFGTEY SIEEADTIXR PGHSRVGTSA RARRKSPYRR
101 FAVHRRTRRQ PPPAYADGDP REVS....XR RFCTV*
Further sequence analysis revealed the complete DNA sequence <SEQ ID 21>:
1 ATGGACGGCG CACAACCGAA AACGAATTTT TTTGAACGCC TGATTGCCCG
51 ACTCGCCCGC GAACCCGATT CCGCCGAAGA CGTATTAAAC CTGCTTCGGC
101 AGGCGCACGA GCAGGAAGTT TTTGATGCGG ATACGCTTTT AAGATTGGAA
151 AAAGTCCTCG ATTTTTCCGA TTTGGAAGTG CGCGACGCGA TGATTACGCG
201 CAGCCGTATG AACGTTTTAA AAGAAAACGA CAGCATCGAG CGCATCACCG
251 CCTACGTTAT CGATACCGCC CATTCGCGCT TCCCCGTCAT CGGCGAAGAC
301 AAAGACGAAG TTTTGGGCAT TTTGCACGCC AAAGACCTGC TCAAATATAT
351 GTTTAACCCC GAGCAGTTCC ACCTCAAATC CATTCTCCGC CCCGCCGTCT
401 TCGTCCCCGA AGGCAAATCG CTGACCGCCC TTTTAAAAGA GTTCCGCGAA
451 CAGCGCAACC ATATGGCGAT TGTCATCGAC GAATACGGCG GCACATCCGG
501 CTTGGTCACC TTTGAAGACA TCATCGAGCA AATCGTCGGC GAAATCGAAG
551 ACGAGTTTGA CGAAGACGAT AGCGCCGACA ATATCCATGC CGTTTCTTCC
601 GAACGCTGGC GCATCCATGC AGCTACCGAA ATCGAAGACA TCAACACCTT
651 CTTCGGCACG GAATACAGCA GCGAAGAAGC CGACACCATT CGGCCTGGTC
701 ATTCAAGAGT TGGGACATCT GCCCGTGCGC GGCGAAAAAG TCCTTATCGG
751 CGGTTTGCAG TTCACCGTCG CACGCGCCGA CAACCGCCGC CTGCATACGC
801 TGATGGCGAC CCGCGTGAAG TAAGCACCGC CGTTTCTGCA CAGTTTAGGA
851 TGACGGTACG GGCGTTTTCT GTTTCAATCC GCCCCATCCG CCAAACATAA
This corresponds to the amino acid sequence <SEQ ID 22; ORF5-1>:
1 MDGAQPKTNF FERLIARLAR EPDSAEDVLN LLRQAHEQEV FDADTLLRLE
51 KVLDFSDLEV RDAMITRSRM NVLKENDSIE RITAYVIDTA HSRFPVIGED
101 KDEVLGILHA KDLLKYMFNP EQFHLKSILR PAVFVPEGKS LTALLKEFRE
151 QRNHMAIVID EYGGTSGLVT FEDIIEQIVG EIEDEFDEDD SADNIHAVSS
201 ERWRIHAATE IEDINTFFGT EYSSEEADTI RPGHSRVGTS ARARRKSPYR
251 RFAVHRRTRR QPPPAYADGD PREVSTAVSA QFRMTVRAFS VSIRPIRQT*
Further work identified the corresponding gene in strain A of N. meningitidis <SEQ ID 23>:
1 ATGGACGGCG CACAACCGAA AACAAATTTT TTNNAACGCC TGATTGCCCG
51 ACTCGCCCGC GAACCCGATT CCGCCGAAGA CGTATTGACC CTGTTGCGCC
101 AAGCGCACGA ACAGGAAGTA TTTGATGCGG ATACGCTTTT AAGATTGGAA
151 AAAGTCCTCG ATTTTTCTGA TTTGGAAGTG CGCGACGCGA TGATTACGCG
201 CAGCCGTATG AACGTTTTAA AAGAAAACGA CAGCATCGAA CGCATCACCG
251 CCTACGTTAT CGATACCGCC CATTCGCGCT TCCCCGTCAT CGGTGAAGAC
301 AAAGACGAAG TTTTGGGTAT TTTGCACGCC AAAGACCTGC TCAAATATAT
351 GTTCAACCCC GAGCAGTTCC ACCTCAAATC GATATTGCGC CCTGCCGTCT
401 TCGTCCCCGA AGGCAAATCG CTGACCGCCC TTTTAAAAGA GTTCCGCGAA
451 CAGCGCAACC ATATGGCAAT CGTCATCGAC GAATACGGCG GCACGTCGGG
501 TTTGGTAACT TTTGAAGACA TCATCGAGCA AATCGTCGGC GACATCGAAG
551 ATGAGTTTGA CGAAGACGAA AGCGCGGACA ACATCCACGC CGTTTCCGCC
601 GAACGCTGGC GCATCCACGC GGCTACCGAA ATCGAAGACA TCAACGCCTT
651 TTTCGGCACG GAATACAGCA GCGAAGAAGC CGACACCATC GGCGGCCNTG
701 GTCATTCAGG AATTGGNACA CCTGCCCGTG CGCGGCGAAA AAGTCNTTAT
751 CGGCGNNTTG CANTTCACNG TCGCCNGCGC NGACAACCGC CGCCTGCATA
801 CGCTGATGGC GACCCGCGTG AAGTAAGCTC CGCCGTTTCT GTACAGTTTA
851 GGATGACGGT ACGGGCGTTT TCTGTTTCAA TCCGCCCCAT CCGCCANACA
901 TAA
This encodes a protein having the following amino acid sequence <SEQ ID 24; ORF5a>:
1 MDGAQPKTNF XXRLIARLAR EPDSAEDVLT LLRQAHEQEV FDADTLLRLE
51 KVLDFSDLEV RDAMITRSRM NVLKENDSIE RITAYVIDTA HSRFPVIGED
101 KDEVLGILHA KDLLKYMFNP EQFHLKSILR PAVFVPEGKS LTALLKEFRE
151 QRNHMAIVID EYGGTSGLVT FEDIIEQIVG DIEDEFDEDE SADNIHAVSA
201 ERWRIHAATE IEDINAFFGT EYSSEEADTI GGXGHSGIGT PARARRKSXY
251 RRXAXHXRXR XQPPPAYADG DPREVSSAVS VQFRMTVRAF SVSIRPIRXT
301 *
The originally identified partial strain B sequence (ORF5) shows 54.7% identity with ORF5a over a 124 aa overlap:
10 20 30
orf5.pep NHMAIVIDEYGGTSGLVTFEDIIEQIVGEI
||||||||||||||||||||||||||||:|
orf5a FHLKSILRPAVFVPEGKSLTALLKEFREQRNHMAIVIDEYGGTSGLVTFEDIIEQIVGDI
130 140 150 160 170 180
40 50 60 70 80 90
orf5.pep EDEFDEDDSADNIHAVSSDTWRIHAATEIEDINTFFGTEYSIEEADTIXRPGHSRVGTSA
|||||||:|||||||||:: |||||||||||||:||||||| |||||| ||| :|| |
orf5a EDEFDEDESADNIHAVSAERWRIHAATEIEDINAFFGTEYSSEEADTIGGXGHSGIGTPA
190 200 210 220 230 240
100 110 120 130
orf5.pep RARRKSPYRRFAVHRRTRRQPPPAYADGDPREVSXXXXXRRFCTV
|||||| ||| | | |:| |||||||||||||||
orf5a RARRKSXYRRXAXHXRXRXQPPPAYADGDPREVSSAVSVQFRMTVRAFSVSIRPIRXTX
250 260 270 280 290 300
The complete strain B sequence (ORF5-1) and ORF5a show 92.7% identity over a 300 aa overlap:
10 20 30 40 50 60
orf5a.pep MDGAQPKTNFXXRLIARLAREPDSAEDVLTLLRQAHEQEVFDADTLLRLEKVLDFSDLEV
|||||||||| |||||||||||||||||:||||||||||||||||||||||||||||||
orf5-1 MDGAQPKTNFFERLIARLAREPDSAEDVLNLLRQAHEQEVFDADTLLRLEKVLDFSDLEV
10 20 30 40 50 60
70 80 90 100 110 120
orf5a.pep RDAMITRSRMNVLKENDSIERITAYVIDTAHSRFPVIGEDKDEVLGILHAKDLLKYMFNP
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf5-1 RDAMITRSRMNVLKENDSIERITAYVIDTAHSRFPVIGEDKDEVLGILHAKDLLKYMFNP
70 80 90 100 110 120
130 140 150 160 170 180
orf5a.pep EQFHLKSILRPAVFVPEGKSLTALLKEFREQRNHMAIVIDEYGGTSGLVTFEDIIEQIVG
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf5-1 EQFHLKSILRPAVFVPEGKSLTALLKEFREQRNHMAIVIDEYGGTSGLVTFEDIIEQIVG
130 140 150 160 170 180
190 200 210 220 230 240
orf5a.pep DIEDEFDEDESADNIHAVSAERWRIHAATEIEDINAFFGTEYSSEEADTIGGXGHSGIGT
:||||||||:|||||||||:|||||||||||||||:|||||||||||||| ||| :||
orf5-1 EIEDEFDEDDSADNIHAVSSERWRIHAATEIEDINTFFGTEYSSEEADTIRP-GHSRVGT
190 200 210 220 230
250 260 270 280 290 300
orf5a.pep PARARRKSXYRRXAXHXRXRXQPPPAYADGDPREVSSAVSVQFRMTVRAFSVSIRPIRXT
||||||| ||| | | |:| |||||||||||||||:|||:||||||||||||||||| |
orf5-1 SARARRKSPYRRFAVHRRTRRQPPPAYADGDPREVSTAVSAQFRMTVRAFSVSIRPIRQT
240 250 260 270 280 290
Further work identified a partial DNA sequence <SEQ ID 25> in N. gonorrhoeae, which encodes a protein having the amino acid sequence <SEQ ID 26; ORF5ng>:
1 MDGAQPKTNF FERLIARLAR EPDSAEDVLN LLRQAHEQEV FDADTLTRLE
51 KVLDFAELEV RDAMITRSRM NVLKENDSIE RITAYVIDTA HSRFPVIGED
101 KDEVLGILHA KDLLKYMFNP EQFHLKSVLR PAVFVPEGKS LTALLKEFRE
151 QRNHMAIVID EYGGTSGLVT FEDIIEQIVG DIEDEFDEDE SADDIHSVSA
201 ERWRIHAATE IEDINAFFGT EYGSEEADTI RRLGHSGIGT PARARRKSPY
251 RRFAVHRRPR RQPPPAHADG DPREVSRACP HRRFCTV*
Further analysis revealed the complete gonococcal nucleotide sequence <SEQ ID 27>:
1 ATGGACGGCG CACAACCGAA AACAAATTTT TTTGAACGCC TGATTGCCCG
51 ACTCGCCCGC GAACCCGATT CCGCCGAAGA CGTATTAAAC CTGCTTCGGC
101 AGGCGCACGA ACAGGAAGTT TTTGATGCCG ACACACTGAC CCGGCTGGAA
151 AAAGTATTGG ACTTTGCCGA GCTGGAAGTG CGCGATGCGA TGATTACGCG
201 CAGCCGCATG AACGTATTGA AAGAAAACGA CAGCATCGAA CGCATCACCG
251 CCTACGTCAT CGATACCGCC CATTCGCGCT TCCCCGTCAT CGGCGAAGAC
301 AAAGACGAAG TTTTGGGCAT TTTGCACGCC AAAGACCTGC TCAAATATAT
351 GTTCAACCCC GAGCAGTTCC ACCTGAAATC CGTCTTGCGC CCTGCCGTTT
401 TCGTGCCCGA AGGCAAATCT TTGACCGCCC TTTTAAAAGA GTTCCGCGAA
451 CAGCGCAACC ATATGGCAAT CGTCATCGAC GAATACGGCG GCACGTCGGG
501 TTTGGTCACC TTTGAAGACA TCATCGAGCA AATCGTCGGT GACATCGAAG
551 ACGAGTTTGA CGAAGACGAA AGCGccgacg acatCCACTC cgTTTccgCC
601 GAACGCTGGC GCATCCacgc ggctaCCGAA ATCGAAGaca TCAACGCCTT
651 TTTCGGTACG GAatacggca gcgaagaagc cgacaccatc cggcggctTG
701 GTCATTCAGG AATTGGGACA CCTGCCCGTG CGCGGCGAAA AAGTCCTTAt
751 cggcgGTTTG Cagttcaccg tCGCCCGCGC CGACAACCGC CGCCTGCACA
801 CGCTGATGGC GACCCGCGTG AAGTAAGCAG AGCCTGCCcg AccgccgttT
851 CTGCacAGTT TAGGatgACG gtaCGGTCGT TTTCTGTTTC AATCCGCCCC
901 ATCCGCCAAA CATAA
This encodes a protein having the amino acid sequence <SEQ ID 28; ORF5ng-1>:
1 MDGAQPKTNF FERLIARLAR EPDSAEDVLN LLRQAHEQEV FDADTLTRLE
51 KVLDFAELEV RDAMITRSRM NVLKENDSIE RITAYVIDTA HSRFPVIGED
101 KDEVLGILHA KDLLKYMFNP EQFHLKSVLR PAVFVPEGKS LTALLKEFRE
151 QRNHMAIVID EYGGTSGLVT FEDIIEQIVG DIEDEFDEDE SADDIHSVSA
201 ERWRIHAATE IEDINAFFGT EYGSEEADTI RRLGHSGIGT PARARRKSPY
251 RRFAVHRRPR RQPPPAHADG DPREVSRACP TAVSAQFRMT VRSFSVSIRP
301 IRQT*
The originally identified partial strain B sequence (ORF5) shows 83.1% identity with the partial gonococcal sequence (ORF5ng) over a 135 aa overlap:
orf5 NHMAIVIDEYGGTSGLVTFEDIIEQIVGEI 30
||||||||||||||||||||||||||||:|
orf5ng FHLKSVLRPAVFVPEGKSLTALLKEFREQRNHMAIVIDEYGGTSGLVTFEDIIEQIVGDI 182
orf5 EDEFDEDDSADNIHAVSSDTWRIHAATEIEDINTFFGTEYSIEEADTIXRPGHSRVGTSA 90
|||||||:|||:||:||:: |||||||||||||:||||||: |||||| | ||| :|| |
orf5ng EDEFDEDESADDIHSVSAERWRIHAATEIEDINAFFGTEYGSEEADTIRRLGHSGIGTPA 242
orf5 RARRKSPYRRFAVHRRTRRQPPPAYADGDPREVSX----RRFCTV 131
|||||||||||||||| |||||||:||||||||| ||||||
orf5ng RARRKSPYRRFAVHRRPRRQPPPAHADGDPREVSRACPHRRFCTV 287
The complete strain B and gonococcal sequences (ORF5-1 and ORF5ng-1) show 92.4% identity over a 304 aa overlap:
10 20 30 40 50 60
orf5ng-1.pep MDGAQPKTNFFERLIARLAREPDSAEDVLNLLRQAHEQEVFDADTLTRLEKVLDFAELEV
|||||||||||||||||||||||||||||||||||||||||||||| ||||||||::|||
orf5-1 MDGAQPKTNFFERLIARLAREPDSAEDVLNLLRQAHEQEVFDADTLLRLEKVLDFSDLEV
10 20 30 40 50 60
70 80 90 100 110 120
orf5ng-1.pep RDAMITRSRMNVLKENDSIERITAYVIDTAHSRFPVIGEDKDEVLGILHAKDLLKYMFNP
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf5-1 RDAMITRSRMNVLKENDSIERITAYVIDTAHSRFPVIGEDKDEVLGILHAKDLLKYMFNP
70 80 90 100 110 120
130 140 150 160 170 180
orf5ng-1.pep EQFHLKSVLRPAVFVPEGKSLTALLKEFREQRNHMAIVIDEYGGTSGLVTFEDIIEQIVG
|||||||:||||||||||||||||||||||||||||||||||||||||||||||||||||
orf5-1 EQFHLKSILRPAVFVPEGKSLTALLKEFREQRNHMAIVIDEYGGTSGLVTFEDIIEQIVG
130 140 150 160 170 180
190 200 210 220 230 240
orf5ng-1.pep DIEDEFDEDESADDIHSVSAERWRIHAATEIEDINAFFGTEYGSEEADTIRRLGHSGIGT
:||||||||:|||:||:||:|||||||||||||||:||||||:|||||||| ||| :||
orf5-1 EIEDEFDEDDSADNIHAVSSERWRIHAATEIEDINTFFGTEYSSEEADTIRP-GHSRVGT
190 200 210 220 230
250 260 270 280 290 300
orf5ng-1.pep PARARRKSPYRRFAVHRRPRRQPPPAHADGDPREVSRACPTAVSAQFRMTVRSFSVSIRP
||||||||||||||||| |||||||:||||||||| ||||||||||||:|||||||
orf5-1 SARARRKSPYRRFAVHRRTRRQPPPAYADGDPREVS----TAVSAQFRMTVRAFSVSIRP
240 250 260 270 280 290
orf5ng-1.pep IRQTX
|||||
orf5-1 IRQTX
300
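Computer analysis of these amino acid sequences indicated a putative leader sequence and identified the following homologies:
Homology with the hemolysin homolog TlyC of H. influenzae (accession number U32716)
ORF5 and the TlyC protein show 58% identity over a 77 aa overlap (BLASTp).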
ORF5 2 HMAIVIDEYGGTSGLVTFEDIIEQIVGEIEDEFDEDDSADNIHAVSSDTWRIHAATEIED 61
HMAIV+DE+G SGLVT EDI+EQIVG+IEDEFDE++ AD I +S T+ + A T+I+D
TlyC 166 HMAIVVDEFGAVSGLVTIEDILEQIVGDIEDEFDEEEIAD-IRQLSRHTYAVRALTDIDD 224
ORF5 62 INTFFGTEYSIEEADTI 78
N F T++ EE DTI
TlyC 225 FNAQFNTDFDDEEVDTI 241
ORF5ng-1 also shows significant homology with TlyC:
Scores: Init1: 301 Initn: 419 Opt: 668
Smith-Waterman score: 668; 45.9% identity over a 242 aa overlap
10 20 30 40 50
orf5ng-1.pep MDGAQPKTNFFERLIARLAR-EPDSAEDVLNLLRQAHEQEVFDADTLTRLEK
| ||: |::|: : | : |::::::|::::::::| :| :|
tlyc_haein MNDEQQNSNQSENTKKPFFQSLFGRFFQGELKNREELVEVIRDSEQNDLIDQNTREMIEG
10 20 30 40 50 60
60 70 80 90 100 109
orf5ng-1.pep VLDFAELEVRDAMITRSRMNVLKENDSIERITAYVIDTAHSRFPVIGE--DKDEVLGILH
|:::|||:||| || ||:: :::::::: :|::||||||||:: |:|:::||||
tlyc_haein VMEIAELRVRDIMIPRSQIIFIEDQQDLNTCLNTIIESAHSRFPVIADADDRDNIVGILH
70 80 90 100 110 120
110 120 130 140 150 160
orf5ng-1.pep AKDLLKYMF-NPEQFHLKSVLRPAVFVPEGKSLTALLKEFREQRNHMAIVIDEYGGTSGL
||||||:: : | | |:|:|||:|:|||:| : :||:|| :| |||||:||:|::|||
tlyc_haein AKDLLKFLREDAEVFDLSSLLRPVVIVPESKRVDRMLKDFRSERFHMAIVVDEFGAVSGL
130 140 150 160 170 180
170 180 190 200 210 220
orf5ng-1.pep VTFEDIIEQIVGDIEDEFDEDESADDIHSVSAERWRIHAATEIEDINAFFGTEYGSEEAD
||:|||:|||||||||||||:| || |:::| : : ::| |:|:|:|| |:|:: :||:|
tlyc_haein VTIEDILEQIVGDIEDEFDEEEIAD-IRQLSRHTYAVRALTDIDDFNAQFNTDFDDEEVD
190 200 210 220 230
230 240 250 260 270 280
orf5ng-1.pep TIRRLGHSGIG-TPARARRKSPYRRFAVHRRPRRQPPPAHADGDPREVSRACPTAVSAQF
|| | : :| | |:
tlyc_haein TIGGLIMQTFGYLPKRGEEIILKNLQFKVTSADSRRLIQLRVTVPDEHLAEMNNVDEKSE
240 250 260 270 280 290
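Homology with a hypothetical secreted protein from E. coli:
ORF5 shows homology with a hypothetical secreted protein from E. coli:
sp|P77392|YBEX_ECOLI hypothetical 33.3 kD protein in CUTE-ASNB intergenic region
>gi|1778577 (U82598) similar to H. influenzae [Escherichia coli] >gi|1786879 (AE000170) f292; this 292 aa ORF is 23% identical (9 gaps) to 272 residues of an approx. 440 aa
protein YTFL_HAEIN SW: P44717 [Escherichia coli] Length = 292
Score = 212 bits (533), Expect = 3e-54
Identities = 112/230 (48%), Positives = 149/230 (64%), Gaps = 3/230 (1%)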
Query: 2 DGAQPKTNFXXRLIARLAR-EPDSAEDVLTLLRQAHEQEVFDADTLLRLEKVLDFSDLEV 60
D K F L+++L EP + +++L L+R + + ++ D DT LE V+D +D V
Sbjct: 10 DTISNKKGFFSLLLSQLFHGEPKNRDELLALIRDSGQNDLIDEDTRDMLEGVMDIADQRV 69
Query: 61 RDAMITRSRMNVLKENDSIERITAYVIDTAHSRFPVIGEDKDEVLGILHAKDLLKYM-FN 119
RD MI RS+M LK N +++ +I++AHSRFPVI EDKD + GIL AKDLL +M +
Sbjct: 70 RDIMIPRSQMITLKRNQTLDECLDVIIESAHSRFPVISEDKDHIEGILMAKDLLPFMRSD 129
Query: 120 PEQFHLKSILRPAVFVPEGKSLTALLKEFREQRNHMAIVIDEYGGTSGLVTFEDIIEQIV 179
E F + +LR AV VPE K + +LKEFR QR HMAIVIDE+GG SGLVT EDI+E IV
Sbjct: 130 AEAFSMDKVLRQAVVVPESKRVDRMLKEFRSQRYHMAIVIDEFGGVSGLVTIEDILELIV 189
Query: 180 GDIEDEFDEDESADNIHAVSAERWRIHAATEIEDINAFFGTEYSSEEADT 229
G+IEDE+DE++ D +S W + A IED N FGT +S EE DT
Sbjct: 190 GEIEDEYDEEDDID-FRQLSRHTWTVRALASIEDFNEAFGTHFSDEEVDT 238
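Based on this analysis, including the amino acid homology with the TlyC hemolysin homolog of H. influenzae (hemolysins are secreted proteins), the proteins from N. meningitidis and N. gonorrhoeae are predicted to be secreted and could therefore be useful antigens for vaccines or diagnostics.
ORF5-1 (30.7 kDa) was cloned into pGex and expressed in E. coli, as described above. The products of protein expression and purification were analyzed by SDS-PAGE. Figure 2A shows the results of affinity purification of the GST-fusion protein. Purified GST-fusion protein was used to immunize mice, and the mouse sera were used for Western blot analysis (Figure 1B). These experiments confirm that ORF5-1 is a surface-exposed protein and a useful immunogen.
Example 5
The following partial DNA sequence was identified in N. meningitidis <SEQ ID 29>: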
1 ATGCGCGGCG GCAGGCCGGA TTCCGTTACC GTGCAGATTA TCGAAGGTTC
51 GCGTTTTTCG CATATGAGGA AAGTCATCGA CGCAACGCCC GACATCGGAC
101 ACGACACCAA AGGCTGGAGC AATGAAAAAC TGATGGCGGA AGTTGCGCCC
151 GATGCCTTCA GCGGCAATCC TGAAgGGCAG TTTTTCCCCG ACAGCTACGA
251 GCGATGCAAC GCCGCCTGAA TGA
GGCATG GGAAAGCAGG CAGGACGGGC
301 TGCCTTATAA AAACCCTTAT GAAATGCTGA TTATGGCGAr CCTGGTCGAA
351 AAGGAAACAG GGCATGAAGC CGAsCsCGAC CATGTcGCTT CCGTCTTCGT
401 CAACCGCCTG AAAATCGGTA TGCGCCTGCA AACCgAssCG TCCGTGATTT
451 ACGGCATGGG TGCGGCATAC AAGGGCAAAA TCCGTAAAGC CGACCTGCGC
501 CGCGACACGC CGTACAACAC CTACACGCGC GGCGGTCTGC CGCCAACCCC
551 GATTGCGCTG CCC..
This corresponds to the amino acid sequence <SEQ ID 30; ORF7>:
1 MRGGRPDSVT VQIIEGSRFS HMRKVIDATP DIGHDTKGWS NEKLMAEVAP
51 DAFSGNPEGQ FFPDSYEIDA GGSDLQIYQT AYKAMQRRLN EAWESRQDGL
101 PYKNPYEMLI MAXLVEKETG HEAXXDHVAS VFVNRLKIGM RLQTXXSVIY
151 GMGAAYKGKI RKADLRRDTP YNTYTRGGLP PTPIALP.
Further sequence analysis revealed the complete DNA sequence <SEQ ID 31>:
1 ATGTTGAGAA AATTGTTGAA ATGGTCTGCC GTTTTTTTGA CCGTGTCGGC
51 AGCCGTTTTC GCCGCGCTGC TTTTTGTTCC TAAGGATAAC GGCAGGGCAT
101 ACCGAATCAA AATTGCCAAA AACCAGGGTA TTTCGTCGGT CGGCAGGAAA
151 CTTGCCGAAG ACCGCATCGT GTTCAGCAGG CATGTTTTGA CGGCGGCGGC
201 CTACGTTTTG GGTGTGCACA ACAGGCTGCA TACGGGGACG TACAGATTGC
251 CTTCGGAAGT GTCTGCTTGG GATATCTTGC AGAAAATGCG CGGCGGCAGG
301 CCGGATTCCG TTACCGTGCA GATTATCGAA GGTTCGCGTT TTTCGCATAT
351 GAGGAAAGTC ATCGACGCAA CGCCCGACAT CGGACACGAC ACCAAAGGCT
401 GGAGCAATGA AAAACTGATG GCGGAAGTTG CGCCCGATGC CTTCAGCGGC
451 AATCCTGAAG GGCAGTTTTT CCCCGACAGC TACGAAATCG ATGCGGGCGG
501 CAGTGATTTG CAGATTTACC AAACCGCCTA CAAGGCGATG CAACGCCGCC
551 TGAATGAGGC ATGGGAAAGC AGGCAGGACG GGCTGCCTTA TAAAAACCCT
601 TATGAAATGC TGATTATGGC GAGCCTGGTC GAAAAGGAAA CAGGGCATGA
651 AGCCGACCGC GACCATGTCG CTTCCGTCTT CGTCAACCGC CTGAAAATCG
701 GTATGCGCCT GCAAACCGAC CCGTCCGTGA TTTACGGCAT GGGTGCGGCA
751 TACAAGGGCA AAATCCGTAA AGCCGACCTG CGCCGCGACA CGCCGTACAA
801 CACCTACACG CGCGGCGGTC TGCCGCCAAC CCCGATTGCG CTGCCCGGCA
851 AGGCGGCACT CGATGCCGCC GCCCATCCGT CCGGCGAAAA ATACCTGTAT
901 TTCGTGTCCA AAATGGACGG CACGGGCTTG AGCCAGTTCA GCCATGATTT
951 GACCGAACAC AATGCCGCCG TCCGCAAATA TATTTTGAAA AAATAA
This corresponds to the amino acid sequence <SEQ ID 32; ORF7-1>:
1 MLRKLLKWSA VFLTVSAAVF AALLFVPKDN GRAYRIKIAK NQGISSVGRK
51 LAEDRIVFSR HVLTAAAYVL GVHNRLHTGT YRLPSEVSAW DILQKMRGGR
101 PDSVTVQIIE GSRFSHMRKV IDATPDIGHD TKGWSNEKLM AEVAPDAFSG
151 NPEGQFFPDS YEIDAGGSDL QIYQTAYKAM QRRLNEAWES RQDGLPYKNP
201 YEMLIMASLV EKETGHEADR DHVASVFVNR LKIGMRLQTD PSVIYGMGAA
251 YKGKIRKADL RRDTPYNTYT RGGLPPTPIA LPGKAALDAA AHPSGEKYLY
301 FVSKMDGTGL SQFSHDLTEH NAAVRKYILK K*
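Computer analysis of this amino acid sequence gave the following results:
Homology with a hypothetical protein encoded by the yceg gene of H. influenzae (accession number P44270)
ORF7 and the yceg protein show 44% aa identity over a 192 aa overlap: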
ORF7 1 MRGGRPDSVTVQIIEGSRFSHMRKVIDATPDIGHDTKGWSNEKLMA-----EVAPDAFSG 55
+ G+ V+ IEG F RK ++ P + K SNE++ A ++ +
yceg 102 LNSGKEVQFNVKWIEGKTFKDWRKDLENAPHLVQTLKDKSNEEIFALLDLPDIGQNLELK 161
ORF7 56 NPEGQFFPDSYEIDAGGSDLQIYQTAYKAMQRRLNEAWESRQDGLPYKNPYEMLIMAXLV 115
N EG +PD+Y +DL++ + + + M++ LN+AW R + LP NPYEMLI+A +V
yceg 162 NVEGWLYPDTYNYTPKSTDLELLKRSAERMKKALNKAWNERDEDLPLANPYEMLILASIV 221
ORF7 116 EKETGHEAXXDHVASVFVNRLKIGMRLQTXXSVIYGMGAAYKGKIRKADLRRDTPYNTYT 175
EKETG VASVF+NRLK M+LQT +VIYGMG Y G IRK DL TPYNTY
yceg 222 EKETGIANERAKVASVFINRLKAKMKLQTDPTVIYGMGENYNGNIRKKDLETKTPYNTYV 281
ORF7 176 RGGLPPTPIALP 187
GLPPTPIA+P
yceg 282 IDGLPPTPIAMP 293
The full-length YCEG protein has the following sequence:
1 MKKFLIAILL LILILAGVAS FSYYKMTEFV KTPVNVQADE LLTIERGTTS
51 SKLATLFEQE KLIADGKLLP YLLKLKPELN KIKAGTYSLE NVKTVQDLLD
101 LLNSGKEVQF NVKWIEGKTF KDWRKDLENA PHLVQTLKDK SNEEIFALLD
151 LPDIGQNLEL KNVEGWLYPD TYNYTPKSTD LELLKRSAER MKKALNKAWN
201 ERDEDLPLAN PYEMLILASI VEKETGIANE RAKVASVFIN RLKAKMKLQT
251 DPTVIYGMGE NYNGNIRKKD LETKTPYNTY VIDGLPPTPI AMPSESSLQA
301 VANPEKTDFY YFVADGSGGH KFTRNLNEHN KAVQEYLRWY RSQKNAK
Homology with a predicted ORF from N. meningitidis (strain A)
ORF7 shows 95.2% identity over a 187 aa overlap with an ORF (ORF7a) from strain A of N. meningitidis:
10 20 30
orf7.pep MRGGRPDSVTVQIIEGSRFSHMRKVIDATP
||||||||||||||||||||||||||||||
orf7a AAYVLGVHNRLHTGTYRLPSEVSAWDILQKMRGGRPDSVTVQIIEGSRFSHMRKVIDATP
70 80 90 100 110 120
40 50 60 70 80 90
orf7.pep DIGHDTKGWSNEKLMAEVAPDAFSGNPEGQFFPDSYEIDAGGSDLQIYQTAYKAMQRRLN
|| |||||||||||||||||||||||||||||||||||||||||:||| ||||||||||
orf7a DIEHDTKGWSNEKLMAEVAPDAFSGNPEGQFFPDSYEIDAGGSDLRIYQIAYKAMQRRLN
130 140 150 160 170 180
100 110 120 130 140 150
orf7.pep EAWESRQDGLPYKNPYEMLIMAXLVEKETGHEAXXDHVASVFVNRLKIGMRLQTXXSVIY
|||||||||||||||||||||| |:|||||||| ||||||||||||||||||| ||||
orf7a EAWESRQDGLPYKNPYEMLIMASLIEKETGHEADRDHVASVFVNRLKIGMRLQTDPSVIY
190 200 210 220 230 240
160 170 180
orf7.pep GMGAAYKGKIRKADLRRDTPYNTYTRGGLPPTPIALP
|||||||||||||||||||||||||||||||||||||
orf7a GMGAAYKGKIRKADLRRDTPYNTYTRGGLPPTPIALPGKAALDAAAHPSGEKYLYFVSKM
250 260 270 280 290 300
orf7a DGTGLSQFSHDLTEHNAAVRKYILKKX
310 320 330
The full-length ORF7a nucleotide sequence <SEQ ID 33> is:
1 ATGTTGAGAA AATTGTTGAA ATGGTCTGCC GTTTTTTTGA CCGTATCGGC
51 AGCCGTTTTC GCCGCGCTGC TTTTCGTCCC TAAAGACAAC GGCAGGGCAT
101 ACAGGATTAA AATTGCCAAA AACCAGGGTA TTTCGTCGGT CGGCAGGAAA
151 CTTGCCGAAG ACCGCATCGT GTTCAGCAGG CATGTTTTGA CGGCGGCGGC
201 CTACGTTTTG GGTGTGCACA ACAGGCTGCA TACGGGGACG TACAGACTGC
251 CTTCGGAAGT GTCTGCTTGG GATATCTTGC AGAAAATGCG CGGCGGCAGG
301 CCGGATTCCG TTACCGTGCA GATTATCGAA GGTTCGCGTT TTTCGCATAT
351 GAGGAAAGTC ATCGACGCAA CGCCCGACAT CGAACACGAC ACCAAAGGCT
401 GGAGCAATGA AAAACTGATG GCGGAAGTTG CCCCTGATGC CTTCAGCGGC
451 AATCCTGAAG GGCAGTTTTT CCCCGACAGC TACGAAATCG ATGCGGGCGG
501 CAGCGATTTA CGGATTTACC AAATCGCCTA CAAGGCGATG CAACGCCGAC
551 TGAATGAGGC ATGGGAAAGC AGGCAGGACG GGCTGCCTTA TAAAAACCCT
601 TATGAAATGC TGATTATGGC GAGCCTGATC GAAAAGGAAA CAGGGCATGA
651 AGCCGACCGC GACCATGTCG CTTCCGTCTT CGTCAACCGC CTGAAAATCG
701 GTATGCGCCT GCAAACCGAC CCGTCCGTGA TTTACGGCAT GGGTGCGGCA
751 TACAAGGGCA AAATCCGTAA AGCCGACCTG CGCCGCGACA CGCCGTACAA
801 CACCTACACG CGCGGCGGTC TGCCGCCAAC CCCGATCGCG CTGCCCGGCA
851 AGGCGGCACT CGATGCCGCC GCCCATCCGT CCGGTGAAAA ATACCTGTAT
901 TTCGTGTCCA AAATGGACGG TACGGGCTTG AGCCAGTTCA GCCATGATTT
951 GACCGAACAC AACGCCGCCG TTCGCAAATA TATTTTGAAA AAATAA
This is predicted to encode a protein having the amino acid sequence <SEQ ID 34>:
1 MLRKLLKWSA VFLTVSAAVF AALLFVPKDN GRAYRIKIAK NQGISSVGRK
51 LAEDRIVFSR HVLTAAAYVL GVHNRLHTGT YRLPSEVSAW DILQKMRGGR
101 PDSVTVQIIE GSRFSHMRKV IDATPDIEHD TKGWSNEKLM AEVAPDAFSG
151 NPEGQFFPDS YEIDAGGSDL RIYQIAYKAM QRRLNEAWES RQDGLPYKNP
201 YEMLIMASLI EKETGHEADR DHVASVFVNR LKIGMRLQTD PSVIYGMGAA
251 YKGKIRKADL RRDTPYNTYT RGGLPPTPIA LPGKAALDAA AHPSGEKYLY
301 FVSKMDGTGL SQFSHDLTEH NAAVRKYILK K*
The leader peptide is underlined.
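The correspondence between SEQ ID 33 and SEQ ID 34 can be checked by translating the nucleotide sequence in frame. The following is a minimal sketch, not the software used for the analysis in this document. It assumes the standard genetic code, and the helper name `translate` is hypothetical.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons ordered T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a DNA string in frame 1; '*' marks a stop codon.

    Whitespace (as in the grouped listing) is ignored, codons containing
    anything other than A/C/G/T become 'X', and a trailing partial codon
    is dropped.
    """
    dna = "".join(dna.split()).upper()
    codons = (dna[i:i + 3] for i in range(0, len(dna) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


# First 50 nucleotides of SEQ ID 33, copied from the listing.
first_line = "ATGTTGAGAA AATTGTTGAA ATGGTCTGCC GTTTTTTTGA CCGTATCGGC"
print(translate(first_line))  # MLRKLLKWSAVFLTVS
```

The output matches the first 16 residues of SEQ ID 34; the two leftover nucleotides belong to the next codon and are ignored by this sketch.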
ORF7a and ORF7-1 show 98.1% identity over a 133 amino acid overlap:
10 20 30 40 50 60
orf7a.pep MLRKLLKWSAVFLTVSAAVFAALLFVPKDNGRAYRIKIAKNQGISSVGRKLAEDRIVFSR
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf7-1 MLRKLLKWSAVFLTVSAAVFAALLFVPKDNGRAYRIKIAKNQGISSVGRKLAEDRIVFSR
10 20 30 40 50 60
70 80 90 100 110 120
orf7a.pep HVLTAAAYVLGVHNRLHTGTYRLPSEVSAWDILQKMRGGRPDSVTVQIIEGSRFSHMRKV
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf7-1 HVLTAAAYVLGVHNRLHTGTYRLPSEVSAWDILQKMRGGRPDSVTVQIIEGSRFSHMRKV
70 80 90 100 110 120
130 140 150 160 170 180
orf7a.pep IDATPDIEHDTKGWSNEKLMAEVAPDAFSGNPEGQFFPDSYEIDAGGSDLRIYQIAYKAM
||||||| ||||||||||||||||||||||||||||||||||||||||||:||| |||||
orf7-1 IDATPDIGHDTKGWSNEKLMAEVAPDAFSGNPEGQFFPDSYEIDAGGSDLQIYQTAYKAM
130 140 150 160 170 180
190 200 210 220 230 240
orf7a.pep QRRLNEAWESRQDGLPYKNPYEMLIMASLIEKETGHEADRDHVASVFVNRLKIGMRLQTD
|||||||||||||||||||||||||||||:||||||||||||||||||||||||||||||
orf7-1 QRRLNEAWESRQDGLPYKNPYEMLIMASLVEKETGHEADRDHVASVFVNRLKIGMRLQTD
190 200 210 220 230 240
250 260 270 280 290 300
orf7a.pep PSVIYGMGAAYKGKIRKADLRRDTPYNTYTRGGLPPTPIALPGKAALDAAAHPSGEKYLY
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf7-1 PSVIYGMGAAYKGKIRKADLRRDTPYNTYTRGGLPPTPIALPGKAALDAAAHPSGEKYLY
250 260 270 280 290 300
310 320 330
orf7a.pep FVSKMDGTGLSQFSHDLTEHNAAVRKYILKKX
||||||||||||||||||||||||||||||||
orf7-1 FVSKMDGTGLSQFSHDLTEHNAAVRKYILKKX
310 320 330
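The identity figure quoted for an alignment like the one above is, in essence, the share of aligned columns that hold the same residue. A minimal sketch of that calculation follows. The function name is hypothetical, and the treatment of gaps is an assumption; the alignment program used for this document may count columns differently.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percentage of aligned columns holding the same residue.

    Both strings must be the same length (already aligned, '-' for gaps).
    Gap columns count toward the length but never as matches.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    matches = sum(a == b and a != "-" for a, b in zip(seq_a, seq_b))
    return 100.0 * matches / len(seq_a)


# One 60-column block of the alignment (orf7a.pep residues 121-180).
orf7a = "IDATPDIEHDTKGWSNEKLMAEVAPDAFSGNPEGQFFPDSYEIDAGGSDLRIYQIAYKAM"
orf7_1 = "IDATPDIGHDTKGWSNEKLMAEVAPDAFSGNPEGQFFPDSYEIDAGGSDLQIYQTAYKAM"
print(f"{percent_identity(orf7a, orf7_1):.1f}%")  # 95.0%
```

This block contains three of the substitutions visible in the alignment (E/G, R/Q and I/T), so 57 of its 60 columns match.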
Homology with a predicted ORF from N. gonorrhoeae
ORF7 shows 94.7% identity over a 187 amino acid overlap with a predicted ORF (ORF7.ng) from N. gonorrhoeae:
orf7 MRGGRPDSVTVQIIEGSRFSHMRKVIDATPDIGHDTKGWSNEKLMAEVAPDAFSGNPEGQ 60
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf7ng MRGGRPDSVTVQIIEGSRFSHMRKVIDATPDIGHDTKGWSNEKLMAEVAPDAFSGNPEGQ 60
orf7 FFPDSYEIDAGGSDLQIYQTAYKAMQRRLNEAWESRQDGLPYKNPYEMLIMAXLVEKETG 120
||||||||||||||||||||||||||||||||| :||||||||||||||||| |:|||||
orf7ng FFPDSYEIDAGGSDLQIYQTAYKAMQRRLNEAWAGRQDGLPYKNPYEMLIMASLIEKETG 120
orf7 HEAXXDHVASVFVNRLKIGMRLQTXXSVIYGMGAAYKGKIRKADLRRDTPYNTYTRGGLP 180
||| ||||||||||||||||||| ||||||||||||||||||||||||||||| ||||
orf7ng HEADRDHVASVFVNRLKIGMRLQTDPSVIYGMGAAYKGKIRKADLRRDTPYNTYTGGGLP 180
orf7 PTPIALP 187
|| ||||
orf7ng PTRIALPGKAAMDAAAHPSGEKYLYFVSKMDGTGLSQFSHDLTEHNAAVRKYILKK 236
The ORF7ng nucleotide sequence <SEQ ID 35> is predicted to encode a protein having the amino acid sequence <SEQ ID 36>:
1 MRGGRPDSVT VQIIEGSRFS HMRKVIDATP DIGHDTKGWS NEKLMAEVAP
51 DAFSGNPEGQ FFPDSYEIDA GGSDLQIYQT AYKAMQRRLN EAWAGRQDGL
101 PYKNPYEMLI MASLIEKETG HEADRDHVAS VFVNRLKIGM RLQTDPSVIY
151 GMGAAYKGKI RKADLRRDTP YNTYTGGGLP PTRIALPGKA AMDAAAHPSG
201 EKYLYFVSKM DGTGLSQFSH DLTEHNAAVR KYILKK*
Further sequence analysis revealed a partial DNA sequence of ORF7ng <SEQ ID 37>:
1 ..taccgaatca AGATTGCCAA AAATCAGGGT ATTTCGTCGG TCGGCAGGAA
51 ACTTGCcgaA GACCGCATCG TGTTCAGCAG GCATGTTTTG ACAGCGGCGG
101 CCTACGTTTT GGGTGTGCAC AACAGGCTGC ATACGGGGAC gTACAGATTG
151 CCTTCGGAAG TGTCTGCTTG GGATATCTTG CAGAAAATGC GCGGCGGCAG
201 GCCGGATTCC GTTACCGTGC AGATTATCGA AGGTTCGCGT TTTTCGCATA
251 TGAGGAAAGT CATCGACGCA ACGCCCGACA TCGGACACGA CACCAAAGGC
301 TGGAGCAATG AAAAACTGAT GGCGGAAGTT GCGCCCGATG CCTTCAGCGG
351 CAATCCTGAA GGGCAGTTTT TTCCCGACAG CTACGAAATC GATGCGGGCG
401 GCAGCGATTT GCAGATTTAC CAAACCGCCT ACAAGGCGAT GCAACGCCGC
451 CTGAACGAGG CATGGGCAGG CAGGCAGGAC GGGCTGCCTT ATAAAAACCC
501 TTATGAAATG CTGATTATGG CGAGCCTGAT CGAAAAGGAA ACGGGGCATG
551 AGGCCGACCG CGACCATGTC GCTTCCGTCT TCGTCAACCG CCTGAAAATC
601 GGTATGCGCC TGCAAACCGA CCCGTCCGTG ATTTACGGCA TGGGTGCGGC
651 ATACAAGGGC AAAATCCGTA AAGCCGACCT GCGCCGCGAC ACGCCGTACA
701 aCAccTAtac gggcgggggc ttgccgccaa cccggattgc gctgcccggC
751 Aaggcggcaa tggatgccgc cgcccacccg tccggcgaAa aatacctgTa
801 tttcgtgtcC AAAATGGACG GCACGGGCTT GAGCCAGTTC AGCCATGATT
851 TGACCGAACA CAACGCCGCc gTcCGCAAAT ATATTTTGAA AAAATAA
This corresponds to the amino acid sequence <SEQ ID 38; ORF7ng-1>:
1 ..YRIKIAKNQG ISSVGRKLAE DRIVFSRHVL TAAAYVLGVH NRLHTGTYRL
51 PSEVSAWDIL QKMRGGRPDS VTVQIIEGSR FSHMRKVIDA TPDIGHDTKG
101 WSNEKLMAEV APDAFSGNPE GQFFPDSYEI DAGGSDLQIY QTAYKAMQRR
151 LNEAWAGRQD GLPYKNPYEM LIMASLIEKE TGHEADRDHV ASVFVNRLKI
201 GMRLQTDPSV IYGMGAAYKG KIRKADLRRD TPYNTYTGGG LPPTRIALPG
251 KAAMDAAAHP SGEKYLYFVS KMDGTGLSQF SHDLTEHNAA VRKYILKK*
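Sequence listings in this format carry a leading position number, groups of ten residues, optional `..` markers for an incomplete end, and a trailing `*` for the stop codon. A minimal sketch for turning such a block into one contiguous string is shown below; the function name `parse_listing` is hypothetical and the cleaning rules are assumptions based on the layout used here.

```python
import re


def parse_listing(text: str) -> str:
    """Join a numbered, space-grouped sequence listing into one string."""
    residues = []
    for line in text.splitlines():
        # Drop the leading position number, if any.
        line = re.sub(r"^\s*\d+\s*", "", line)
        # Keep letters only: removes spaces, '..' markers and the '*' stop.
        residues.append(re.sub(r"[^A-Za-z]", "", line))
    return "".join(residues).upper()


# Last line of SEQ ID 38, copied from the listing.
last_line = "251 KAAMDAAAHP SGEKYLYFVS KMDGTGLSQF SHDLTEHNAA VRKYILKK*"
tail = parse_listing(last_line)
print(tail, 250 + len(tail))  # KAAMDAAAHPSGEKYLYFVSKMDGTGLSQFSHDLTEHNAAVRKYILKK 298
```

The final line starts at residue 251 and holds 48 residues, so the ORF7ng-1 fragment is 298 amino acids long.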
ORF7ng-1 and ORF7-1 show 98.0% identity over a 298 amino acid overlap:
10 20 30 40 50 60
orf7-1.pep KLLKWSAVFLTVSAAVFAALLFVPKDNGRAYRIKIAKNQGISSVGRKLAEDRIVFSRHVL
||||||||||||||||||||||||||||||
orf7ng-1 YRIKIAKNQGISSVGRKLAEDRIVFSRHVL
10 20 30
70 80 90 100 110 120
orf7-1.pep TAAAYVLGVHNRLHTGTYRLPSEVSAWDILQKMRGGRPDSVTVQIIEGSRFSHMRKVIDA
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf7ng-1 TAAAYVLGVHNRLHTGTYRLPSEVSAWDILQKMRGGRPDSVTVQIIEGSRFSHMRKVIDA
40 50 60 70 80 90
130 140 150 160 170 180
orf7-1.pep TPDIGHDTKGWSNEKLMAEVAPDAFSGNPEGQFFPDSYEIDAGGSDLQIYQTAYKAMQRR
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf7ng-1 TPDIGHDTKGWSNEKLMAEVAPDAFSGNPEGQFFPDSYEIDAGGSDLQIYQTAYKAMQRR
100 110 120 130 140 150
190 200 210 220 230 240
orf7-1.pep LNEAWESRQDGLPYKNPYEMLIMASLVEKETGHEADRDHVASVFVNRLKIGMRLQTDPSV
||||| :|||||||||||||||||||:|||||||||||||||||||||||||||||||||
orf7ng-1 LNEAWAGRQDGLPYKNPYEMLIMASLIEKETGHEADRDHVASVFVNRLKIGMRLQTDPSV
160 170 180 190 200 210
250 260 270 280 290 300
orf7-1.pep IYGMGAAYKGKIRKADLRRDTPYNTYTRGGLPPTPIALPGKAALDAAAHPSGEKYLYFVS
||||||||||||||||||||||||||| |||||| ||||||||:||||||||||||||||
orf7ng-1 IYGMGAAYKGKIRKADLRRDTPYNTYTGGGLPPTRIALPGKAAMDAAAHPSGEKYLYFVS
220 230 240 250 260 270
310 320 330
orf7-1.pep KMDGTGLSQFSHDLTEHNAAVRKYILKKX
|||||||||||||||||||||||||||||
orf7ng-1 KMDGTGLSQFSHDLTEHNAAVRKYILKKX
280 290
In addition, ORF7ng-1 shows significant homology with a hypothetical E. coli protein:
sp|P28306|YCEG_ECOLI HYPOTHETICAL 38.2 KD PROTEIN IN PABC-HOLB INTERGENIC REGION gi|1787339 (AE000210) o340; 100% identical to fragment YCEG_ECOLI
SW: P28306 but has 97 additional C-terminal residues [Escherichia coli] Length = 340
Score = 79 (36.2 bits), Expect = 5.0e-57, Sum P(2) = 5.0e-57
Identities = 20/87 (22%), Positives = 40/87 (45%)
Query: 10 GISSVGRKLAEDRIVFSRHVLTAAAYVLGVHNRLHTGTYRLPSEVSAWDILQKMRGGRPD 69
G ++G +L D+I+ V + + GTYR +++ ++L+ + G+
Sbjct: 49 GRLALGEQLYADKIINRPRVFQWLLRIEPDLSHFKAGTYRFTPQMTVREMLKLLESGKEA 108
Query: 70 SVTVQIIEGSRFSHMRKVIDATPDIGH 96
++++EG R S K + P I H
Sbjct: 109 QFPLRLVEGMRLSDYLKQLREAPYIKH 135
Score = 438 (200.7 bits), Expect = 5.0e-57, Sum P(2) = 5.0e-57
Identities = 84/155 (54%), Positives = 111/155 (71%)
Query: 120 EGQFFPDSYEIDAGGSDLQIYQTAYKAMQRRLNEAWAGRQDGLPYKNPYEMLIMASLIEK 179
EG F+PD++ A +D+ + + A+K M + ++ AW GR DGLPYK+ +++ MAS+IEK
Sbjct: 158 EGWFWPDTWMYTANTTDVALLKRAHKKMVKAVDSAWEGRADGLPYKDKNQLVTMASIIEK 217
Query: 180 ETGHEADRDHVASVFVNRLKIGMRLQTDPSVIYGMGAAYKGKIRKADLRRDTPYNTYTGG 239
ET ++RD VASVF+NRL+IGMRLQTDP+VIYGMG Y GK+ +ADL T YNTYT
Sbjct: 218 ETAVASERDKVASVFINRLRIGMRLQTDPTVIYGMGERYNGKLSRADLETPTAYNTYTIT 277
Query: 240 GLPPTRIALPGKAAMDAAAHPSGEKYLYFVSKMDG 274
GLPP IA PG ++ AAAHP+ YLYFV+ G
Sbjct: 278 GLPPGAIATPGADSLKAAAHPAKTPYLYFVADGKG 312
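The percentages in the BLAST output above appear to be integer-truncated ratios rather than rounded values (20/87 is about 22.99% but is reported as 22%). The short sketch below reproduces the four reported figures under that assumption. The function name is hypothetical, and the truncation rule is inferred from these numbers only, not taken from BLAST documentation.

```python
def blast_percent(count: int, aligned_length: int) -> int:
    """Integer percentage, truncated toward zero (assumed reporting rule)."""
    return (100 * count) // aligned_length


# (count, aligned length, percentage reported in the output above)
reported = [
    (20, 87, 22),    # identities, first segment
    (40, 87, 45),    # positives, first segment
    (84, 155, 54),   # identities, second segment
    (111, 155, 71),  # positives, second segment
]
for count, length, shown in reported:
    print(f"{count}/{length} -> {blast_percent(count, length)}% (reported {shown}%)")
```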
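Based on this analysis, including the fact that the H. influenzae YCEG protein possesses a possible leader sequence, it is predicted that this protein from N. meningitidis and N. gonorrhoeae, and its epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies.
Example 6
The following partial DNA sequence was identified in N. meningitidis <SEQ ID 39>: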
1 CGTTTCAAAA TGTTAACTGT GTTGACGGCA ACCTTGATTG CCGGACAGGT
51 ATCTGCCGCC GGAGGCGGTG CGGGGGATAT GAAACAGCCG AAGGAAGTCG
101 GAAAGGTTTT CAGAAAGCAG CAGCGTTACA GCGAGGAAGA AATCAAAAAC
151 GAACGCGCAC GGCTTGCGGC AGTGGGCGAG CGGGTTAATC AGATATTTAC
201 GTTGCTGGGA GGGGAAACCG CCTTGCAAAA GGGGCAGGCG GGAACGGCTC
251 TGGCAACCTA TATGCTGATG TTGGAACGCA CAAAATCCCC CGAAGTCGCC
301 GAACGCGCCT TGGAAATGGC CGTGTCGCTG AACGCGTTTG AACAGGCGGA
351 AATGATTTAT CAGAAATGGC GGCAGATTGA GCCTATACCG GGTAAGGCGC
401 AAAAACGGGC GGGGTGGCTG CGGAACGTGC TGAGGGAAAG AGGAAATCAG
451 CATCTGGACG GACGGGAAGA AGTGCTGGCT CAGGCGGACG AAGGACAG
This corresponds to the amino acid sequence <SEQ ID 40; ORF9>:
1 ..RFKMLTVLTA TLIAGQVSAA GGGAGDMKQP KEVGKVFRKQ QRYSEEEIKN
51 ERARLAAVGE RVNQIFTLLG GETALQKGQA GTALATYMLM LERTKSPEVA
101 ERALEMAVSL NAFEQAEMIY QKWRQIEPIP GKAQKRAGWL RNVLRERGNQ
151 HLDGREEVLA QADEGQ
Further sequence analysis revealed the complete DNA sequence <SEQ ID 41>:
1 ATGTTACCTA ACCGTTTCAA AATGTTAACT GTGTTGACGG CAACCTTGAT
51 TGCCGGACAG GTATCTGCCG CCGGAGGCGG TGCGGGGGAT ATGAAACAGC
101 CGAAGGAAGT CGGAAAGGTT TTCAGAAAGC AGCAGCGTTA CAGCGAGGAA
151 GAAATCAAAA ACGAACGCGC ACGGCTTGCG GCAGTGGGCG AGCGGGTTAA
201 TCAGATATTT ACGTTGCTGG GAGGGGAAAC CGCCTTGCAA AAGGGGCAGG
251 CGGGAACGGC TCTGGCAACC TATATGCTGA TGTTGGAACG CACAAAATCC
301 CCCGAAGTCG CCGAACGCGC CTTGGAAATG GCCGTGTCGC TGAACGCGTT
351 TGAACAGGCG GAAATGATTT ATCAGAAATG GCGGCAGATT GAGCCTATAC
401 CGGGTAAGGC GCAAAAACGG GCGGGGTGGC TGCGGAACGT GCTGAGGGAA
451 AGAGGAAATC AGCATCTGGA CGGACTGGAA GAAGTGCTGG CTCAGGCGGA
501 CGAAGGACAG AACCGCAGGG TGTTTTTATT GTTGGCACAA GCCGCCGTGC
551 AACAGGACGG GTTGGCGCAA AAAGCATCGA AAGCGGTTCG CCGCGCGGCG
601 TTGAAATATG AACATCTGCC CGAAGCGGCG GTTGCCGATG TGGTGTTCAG
651 CGTACAGGGA CGCGAAAAGG AAAAGGCAAT CGGAGCTTTG CAGCGTTTGG
701 CGAAGCTCGA TACGGAAATA TTGCCCCCCA CTTTAATGAC GTTGCGTCTG
751 ACTGCACGCA AATATCCCGA AATACTCGAC GGCTTTTTCG AGCAGACAGA
801 CACCCAAAAC CTTTCGGCCG TCTGGCAGGA AATGGAAATT ATGAATCTGG
851 TTTCCCTGCA CAGGCTGGAT GATGCCTATG CGCGTTTGAA CGTGCTGTTG
901 GAACGCAATC CGAATGCAGA CCTGTATATT CAGGCAGCGA TATTGGCGGC
951 AAACCGAAAA GAAGGTGCTT CCGTTATCGA CGGCTACGCC GAAAAGGCAT
1001 ACGGCAGGGG GACGGAGGAA CAGCGGAGCA GGGCGGCGCT AACGGCGGCG
1051 ATGATGTATG CCGACCGCAG GGATTACGCC AAAGTCAGGC AGTGGCTGAA
1101 AAAAGTATCC GCGCCGGAAT ACCTGTTCGA CAAAGGTGTG CTGGCGGCTG
1151 CGGCGGCTGT CGAGTTGGAC GGCGGCAGGG CGGCTTTGCG GCAGATCGGC
1201 AGGGTGCGGA AACTTCCCGA ACAGCAGGGG CGGTATTTTA CGGCAGACAA
1251 TTTGTCCAAA ATACAGATGC TCGCCCTGTC GAAGCTGCCC GATAAACGGG
1301 AGGCTTTGAG GGGGTTGGAC AAGATTATCG AAAAACCGCC TGCCGGCAGT
1351 AATACAGAGT TACAGGCAGA GGCATTGGTA CAGCGGTCAG TTGTTTACGA
1401 TCGGCTTGGC AAGCGGAAAA AAATGATTTC AGATCTTGAA AGGGCGTTCA
1451 GGCTTGCACC CGATAACGCT CAGATTATGA ATAATCTGGG CTACAGCCTG
1501 CTGACCGATT CCAAACGTTT GGACGAAGGT TTCGCCCTGC TTCAGACGGC
1551 ATACCAAATC AACCCGGACG ATACCGCTGT CAACGACAGC ATAGGCTGGG
1601 CGTATTACCT GAAAGGCGAC GCGGAAAGCG CGCTGCCGTA TCTGCGGTAT
1651 TCGTTTGAAA ACGACCCCGA GCCCGAAGTT GCCGCCCATT TGGGCGAAGT
1701 GTTGTGGGCA TTGGGCGAAC GCGATCAGGC GGTTGACGTA TGGACGCAGG
1751 CGGCACACCT TACGGGAGAC AAGAAAATAT GGCGGGAAAC GCTCAAACGT
1801 CACGGCATCG CATTGCCCCA ACCTTCCCGA AAACCTCGGA AATAA
This corresponds to the amino acid sequence <SEQ ID 42; ORF9-1>:
1 MLPNRFKMLT VLTATLIAGQ VSAAGGGAGD MKQPKEVGKV FRKQQRYSEE
51 EIKNERARLA AVGERVNQIF TLLGGETALQ KGQAGTALAT YMLMLERTKS
101 PEVAERALEM AVSLNAFEQA EMIYQKWRQI EPIPGKAQKR AGWLRNVLRE
151 RGNQHLDGLE EVLAQADEGQ NRRVFLLLAQ AAVQQDGLAQ KASKAVRRAA
201 LKYEHLPEAA VADVVFSVQG REKEKAIGAL QRLAKLDTEI LPPTLMTLRL
251 TARKYPEILD GFFEQTDTQN LSAVWQEMEI MNLVSLHRLD DAYARLNVLL
301 ERNPNADLYI QAAILAANRK EGASVIDGYA EKAYGRGTEE QRSRAALTAA
351 MMYADRRDYA KVRQWLKKVS APEYLFDKGV LAAAAAVELD GGRAALRQIG
401 RVRKLPEQQG RYFTADNLSK IQMLALSKLP DKREALRGLD KIIEKPPAGS
451 NTELQAEALV QRSVVYDRLG KRKKMISDLE RAFRLAPDNA QIMNNLGYSL
501 LTDSKRLDEG FALLQTAYQI NPDDTAVNDS IGWAYYLKGD AESALPYLRY
551 SFENDPEPEV AAHLGEVLWA LGERDQAVDV WTQAAHLTGD KKIWRETLKR
601 HGIALPQPSR KPRK*
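Computer analysis of this amino acid sequence gave the following results:
Homology with a predicted ORF from N. meningitidis (strain A)
ORF9 shows 89.8% identity over a 166 amino acid overlap with an ORF (ORF9a) from strain A of N. meningitidis: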
10 20 30 40 50
orf9.pep RFKMLTVLTATLIAGQVSAAGGGAGDMKQPKEVGKVFRKQQRYSEEEIKNERARLA
|| :|:||:|:|:|||: || ||:| | |||||||||||||||||||||||||||
orf9a MLPARFTILSVLAAALLAGQAYAA--GAADAKPPKEVGKVFRKQQRYSEEEIKNERARLA
10 20 30 40 50
60 70 80 90 100 110
orf9.pep AVGERVNQIFTLLGGETALQKGQAGTALATYMLMLERTKSPEVAERALEMAVSLNAFEQA
|||||||||||||| |||||||||||||||||||||||||||||||||||||||||||||
orf9a AVGERVNQIFTLLGXETALQKGQAGTALATYMLMLERTKSPEVAERALEMAVSLNAFEQA
60 70 80 90 100 110
120 130 140 150 160
orf9.pep EMIYQKWRQIEPIPGKAQKRAGWLRNVLRERGNQHLDGREEVLAQADEGQ
|||||||||||||||||||||||||||||||||||||| || |||||| |
orf9a EMIYQKWRQIEPIPGKAQKRAGWLRNVLRERGNQHLDGLEEXLAQADEXQNRRVFLLLAQ
120 130 140 150 160 170
orf9a AAVQQDGLAQKASKAVRRAALRYEHLPEAAVADVVFSVQXREKEKAIGALQRLAKLDTEI
180 190 200 210 220 230
The full-length ORF9a nucleotide sequence <SEQ ID 43> is:
1 ATGTTACCCG CCCGTTTCAC CATTTTATCT GTGCTCGCGG CAGCCCTGCT
51 TGCCGGGCAG GCGTATGCCG CCGGCGCGGC GGATGCGAAG CCGCCGAAGG
101 AAGTCGGAAA GGTTTTCAGA AAGCAGCAGC GTTACAGCGA GGAAGAAATC
151 AAAAACGAAC GCGCACGGCT TGCGGCAGTG GGCGAGCGGG TTAATCAGAT
201 ATTTACGTTG CTGGGANGGG AAACCGCCTT GCAAAAGGGG CAGGCGGGAA
251 CGGCTCTGGC AACCTATATG CTGATGTTGG AACGCACAAA ATCCCCCGAA
301 GTCGCCGAAC GCGCCTTGGA AATGGCCGTG TCNCTGAACG CGTTTGAACA
351 GGCGGAAATG ATTTATCAGA AATGGCGGCA GATTGAGCCT ATACCGGGTA
401 AGGCGCAAAA ACGGGCGGGG TGGCTGCGGA ACGTGCTGAG GGAAAGAGGA
451 AATCAGCATC TAGACGGACT GGAAGAANTG CTGGCTCAGG CGGACGAANG
501 ACAGAACCGC AGGGTGTTTT TATTGTTGGC ACAAGCCGCC GTGCAACAGG
551 ACGGGTTGGC GCAAAAAGCA TCGAAAGCGG TTCGCCGCGC GGCGTTGAGA
601 TATGAACATC TGCCCGAAGC GGCGGTTGCC GATGTGGTGT TCAGCGTACA
651 GGNACGCGAA AAGGAAAAGG CAATCGGAGC TTTGCAGCGT TTGGCGAAGC
701 TCGATACGGA AATATTGCCC CCCACTTTAA TGACGTTGCG TCTGACTGCA
751 CGCAAATATC CCGAAATACT CGACGGCTTT TTCGAGCAGA CAGACACCCA
801 AAACCTTTCG GCCGTCTGGC AGGAAATGGA AATTATGAAT CTGGTTTCCC
851 TGCACAGGCT GGATGATGCC TATGCGCGTT TGAACGTGCT GTTGGAACGC
901 AATCCGAATG CAGACCTGTA TATTCAGGCA GCGATATTGG CGGCAAACCG
951 AAAAGAANGT GCTTCCGTTA TCGACGGCTA CGCCGAAAAG GCATACGGCA
1001 GGGGGACGGG GGAACAGCGG GGCAGGGCGG CAATGACGGC GGCGATGATA
1051 TATGCCGACC GAAGGGATTA CACCAAAGTC AGGCAGTGGT TGAAAAAAGT
1101 GTCCGCGCCG GAATACCTGT TCGACAAAGG TGTGCTGGCG GCTGCGGCGG
1151 CTGTCGAGTT GGACNGCGGC AGGGCGGCTT TGCGGCAGAT CGGCAGGGTG
1201 CGGAAACTTC CCGAACAGCA GGGGCGGTAT TTTACGGCAG ACAATTTGTC
1251 CAAAATACAG ATGTTCGCCC TGTCGAAGCT GCCCGACAAA CGGGAGGCTT
1301 TGAGGGGGTT GGACAAGATT ATCGAAAAAC CGCCTGCCGG CAGTAATACA
1351 GAGTTACAGG CAGAGGCATT GGTACAGCGG TCAGTTGTTT ACGATCGGCT
1401 TGGCAAGCGG AAAAAAATGA TTTCAGATCT TGAAAGGGCG TTCAGGCTTG
1451 CACCCGATAA CGCTCAGATT ATGAATAATC TGGGCTACAG CCTGCTTTCC
1501 GATTCCAAAC GTTTGGACGA AGGCTTCGCC CTGCTTCAGA CGGCATACCA
1551 AATCAACCCG GACGATACCG CTGTCAACGA CAGCATAGGC TGGGCGTATT
1601 ACCTGAAANG CGACGCGGAA AGCGCGCTGC CGTATCTGCG GTATTCGTTT
1651 GAAAACGACC CCGAGCCCGA AGTTGCCGCC CATTTGGGCG AAGTGTTGTG
1701 GGCATTGGGC GAACGCGATC AGGCGGTTGA CGTATGGACG CAGGCGGCAC
1751 ACCTTACGGG AGACAAGAAA ATATGGCGGG AAACGCTCAA ACGTCACGGC
1801 ATCGCATTGC CCCAACCTTC CCGAAAACCT CGGAAATAA
This encodes a protein having the amino acid sequence <SEQ ID 44>:
1 MLPARFTILS VLAAALLAGQ AYAAGAADAK PPKEVGKVFR KQQRYSEEEI
51 KNERARLAAV GERVNQIFTL LGXETALQKG QAGTALATYM LMLERTKSPE
101 VAERALEMAV SLNAFEQAEM IYQKWRQIEP IPGKAQKRAG WLRNVLRERG
151 NQHLDGLEEX LAQADEXQNR RVFLLLAQAA VQQDGLAQKA SKAVRRAALR
201 YEHLPEAAVA DVVFSVQXRE KEKAIGALQR LAKLDTEILP PTLMTLRLTA
251 RKYPEILDGF FEQTDTQNLS AVWQEMEIMN LVSLHRLDDA YARLNVLLER
301 NPNADLYIQA AILAANRKEX ASVIDGYAEK AYGRGTGEQR GRAAMTAAMI
351 YADRRDYTKV RQWLKKVSAP EYLFDKGVLA AAAAVELDXG RAALRQIGRV
401 RKLPEQQGRY FTADNLSKIQ MFALSKLPDK REALRGLDKI IEKPPAGSNT
451 ELQAEALVQR SVVYDRLGKR KKMISDLERA FRLAPDNAQI MNNLGYSLLS
501 DSKRLDEGFA LLQTAYQINP DDTAVNDSIG WAYYLKXDAE SALPYLRYSF
551 ENDPEPEVAA HLGEVLWALG ERDQAVDVWT QAAHLTGDKK IWRETLKRHG
601 IALPQPSRKP RK*
ORF9a and ORF9-1 show 95.3% identity over a 614 amino acid overlap:
10 20 30 40 50
orf9a.pep MLPARFTILSVLAAALLAGQAYAAG--AADAKPPKEVGKVFRKQQRYSEEEIKNERARLA
||| || :|:||:|:|:|||: ||| |:| | |||||||||||||||||||||||||||
orf9-1 MLPNRFKMLTVLTATLIAGQVSAAGGGAGDMKQPKEVGKVFRKQQRYSEEEIKNERARLA
10 20 30 40 50 60
60 70 80 90 100 110
orf9a.pep AVGERVNQIFTLLGXETALQKGQAGTALATYMLMLERTKSPEVAERALEMAVSLNAFEQA
|||||||||||||| |||||||||||||||||||||||||||||||||||||||||||||
orf9-1 AVGERVNQIFTLLGGETALQKGQAGTALATYMLMLERTKSPEVAERALEMAVSLNAFEQA
70 80 90 100 110 120
120 130 140 150 160 170
orf9a.pep EMIYQKWRQIEPIPGKAQKRAGWLRNVLRERGNQHLDGLEEXLAQADEXQNRRVFLLLAQ
||||||||||||||||||||||||||||||||||||||||| |||||| |||||||||||
orf9-1 EMIYQKWRQIEPIPGKAQKRAGWLRNVLRERGNQHLDGLEEVLAQADEGQNRRVFLLLAQ
130 140 150 160 170 180
180 190 200 210 220 230
orf9a.pep AAVQQDGLAQKASKAVRRAALRYEHLPEAAVADVVFSVQXREKEKAIGALQRLAKLDTEI
|||||||||||||||||||||:||||||||||||||||| ||||||||||||||||||||
orf9-1 AAVQQDGLAQKASKAVRRAALKYEHLPEAAVADVVFSVQGREKEKAIGALQRLAKLDTEI
190 200 210 220 230 240
240 250 260 270 280 290
orf9a.pep LPPTLMTLRLTARKYPEILDGFFEQTDTQNLSAVWQEMEIMNLVSLHRLDDAYARLNVLL
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf9-1 LPPTLMTLRLTARKYPEILDGFFEQTDTQNLSAVWQEMEIMNLVSLHRLDDAYARLNVLL
250 260 270 280 290 300
300 310 320 330 340 350
orf9a.pep ERNPNADLYIQAAILAANRKEXASVIDGYAEKAYGRGTGEQRGRAAMTAAMIYADRRDYT
|||||||||||||||||||||||||||||||||||||| |||:|||:||||:|||||||:
orf9-1 ERNPNADLYIQAAILAANRKEGASVIDGYAEKAYGRGTEEQRSRAALTAAMMYADRRDYA
310 320 330 340 350 360
360 370 380 390 400 410
orf9a.pep KVRQWLKKVSAPEYLFDKGVLAAAAAVELDXGRAALRQIGRVRKLPEQQGRYFTADNLSK
|||||||||||||||||||||||||||||| |||||||||||||||||||||||||||||
orf9-1 KVRQWLKKVSAPEYLFDKGVLAAAAAVELDGGRAALRQIGRVRKLPEQQGRYFTADNLSK
370 380 390 400 410 420
420 430 440 450 460 470
orf9a.pep IQMFALSKLPDKREALRGLDKIIEKPPAGSNTELQAEALVQRSVVYDRLGKRKKMISDLE
|||:||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf9-1 IQMLALSKLPDKREALRGLDKIIEKPPAGSNTELQAEALVQRSVVYDRLGKRKKMISDLE
430 440 450 460 470 480
480 490 500 510 520 530
orf9a.pep RAFRLAPDNAQIMNNLGYSLLSDSKRLDEGFALLQTAYQINPDDTAVNDSIGWAYYLKXD
|||||||||||||||||||||:|||||||||||||||||||||||||||||||||||| |
orf9-1 RAFRLAPDNAQIMNNLGYSLLTDSKRLDEGFALLQTAYQINPDDTAVNDSIGWAYYLKGD
490 500 510 520 530 540
540 550 560 570 580 590
orf9a.pep AESALPYLRYSFENDPEPEVAAHLGEVLWALGERDQAVDVWTQAAHLTGDKKIWRETLKR
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf9-1 AESALPYLRYSFENDPEPEVAAHLGEVLWALGERDQAVDVWTQAAHLTGDKKIWRETLKR
550 560 570 580 590 600
600 610
orf9a.pep HGIALPQPSRKPRKX
|||||||||||||||
orf9-1 HGIALPQPSRKPRKX
610
Homology with a predicted ORF from N. gonorrhoeae
ORF9 shows 82.8% identity over a 163 amino acid overlap with a predicted ORF (ORF9.ng) from N. gonorrhoeae:
orf9 RFKMLTVLTATLIAGQVSAAGGGAGDMKQPKEVGKVFRKQQRYSEEEIKNERAR 54
|| :|:||:|:|:|||: || ||:|:: |||||||:||::|||||||||||||
orf9ng MIMLPARFTILSVLAAALLAGQAYAA--GAADVELPKEVGKVLRKHRRYSEEEIKNERAR 58
orf9 LAAVGERVNQIFTLLGGETALQKGQAGTALATYMLMLERTKSPEVAERALEMAVSLNAFE 114
|||||||||::|||||||||||||||||||||||||||||||||||||||||||||||||
orf9ng LAAVGERVNRVFTLLGGETALQKGQAGTALATYMLMLERTKSPEVAERALEMAVSLNAFE 118
orf9 QAEMIYQKWRQIEPIPGKAQKRAGWLRNVLRERGNQHLDGREEVLAQADEGQ 166
|||||||||||||||||:||| ||||||||:| || ||| ||| ||:|
orf9ng QAEMIYQKWRQIEPIPGEAQKPAGWLRNVLKEGGNPHLDRLEEVPAQSDYVHQPMIFLLL 178
The ORF9ng nucleotide sequence <SEQ ID 45> is predicted to encode a protein having the amino acid sequence <SEQ ID 46>:
1 MIMLPARFTI LSVLAAALLA GQAYAAGAAD VELPKEVGKV LRKHRRYSEE
51 EIKNERARLA AVGERVNRVF TLLGGETALQ KGQAGTALAT YMLMLERTKS
101 PEVAERALEM AVSLNAFEQA EMIYQKWRQI EPIPGEAQKP AGWLRNVLKE
151 GGNPHLDRLE EVPAQSDYVH QPMIFLLLVQ AAVQHGGVAQ KPSKAVRPAA
201 YNYEVLPETA GADAVFCVQG PQYEKAIQSF PPCGRNPQTE NIAPPFNELF
251 RPTARPISPK LLQRFFRTEP NLAKPFRPPG PEMETYQTGF PRPLTRNNPT
Amino acids 1-28 are a putative leader sequence, and amino acids 173-189 are predicted to be a transmembrane domain.
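The text does not say which method produced the transmembrane prediction. As an illustration only, the sketch below scores the stated segment with the Kyte-Doolittle hydropathy scale, a common first check for membrane-spanning stretches. The choice of scale, the function name `mean_hydropathy`, and the comparison with the upstream residues are all assumptions, not the method used for this document.

```python
# Kyte-Doolittle hydropathy index for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def mean_hydropathy(segment: str) -> float:
    """Average Kyte-Doolittle score of a peptide segment."""
    return sum(KYTE_DOOLITTLE[aa] for aa in segment) / len(segment)


# Residues 151-200 of SEQ ID 46, copied from the listing.
line_151 = "GGNPHLDRLEEVPAQSDYVHQPMIFLLLVQAAVQHGGVAQKPSKAVRPAA"
upstream = line_151[:173 - 151]                  # residues 151-172
segment = line_151[173 - 151:189 - 151 + 1]      # residues 173-189

print(segment)                                   # MIFLLLVQAAVQHGGVA
print(round(mean_hydropathy(segment), 2))        # 1.62
print(mean_hydropathy(upstream) < 0)             # True
```

The 173-189 segment scores clearly hydrophobic on this scale while the residues just before it score hydrophilic, which is at least consistent with the prediction stated above.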
Further sequence analysis revealed the full-length ORF9ng DNA sequence <SEQ ID 47>:
1 ATGTTACCCG CCCGTTTCAC TATTTTATCT GTCCTCGCAG CAGCCCTGCT
51 TGCCGGACAG GCGTATGCTG CCGGCGCGGC GGATGTGGAG CTGCCGAAGG
101 AAGTCGGAAA GGTTTTAAGG AAACATCGGC GTTACAGCGA GGAAGAAATC
151 AAAAACGAAC GCGCACGGCT TGCGGCAGTG GGCGAACGGG TCAACAGGGT
201 GTTTACGCTG TTGGGCGGTG AAACGGCTTT GCAGAAAGGG CAGGCGGGAA
251 CGGCTCTGGC AACCTATATG CTGATGTTGG AACGCACAAA ATCCCCCGAA
301 GTCGCCGAAC GCGCCTTGGA AATGGCCGTG TCGCTGAACG CGTTTGAACA
351 GGCGGAAATG ATTTATCAGA AATGgcggca gatcgagcct ataCcgggtg
401 aggcgcaaaa accgGcgggG tggctgcgga acgtattgaa ggaagggGGa
451 aaTCAGCATC TGGAcgggtt gaaagaggTG CtggcgcaAT cggacgatGT
501 GCAAAAAcgc aggaTATTTT TGCTGCTGGT GCAAGCCGCC GTGCagcagg
551 gTGGGGTGGC TCAAAAAGCA TCGAAAGCGG TTCGCcgtgc GGcgttgaAG
601 TATGAACATC TGCCcgaagc ggcggTTGCC GATGcggTGT TCGGCGTACA
651 GGGACGCGAA AAGGAAAagg caaTCGAAGC TTTGCAGCGT TTGGCGAAGC
701 TCGATACGGA AATATTGCCC CCCACTTTAA TGACGTTGCG TCTGACTGCA
751 CGCAAATATC CCGAAATACT CGACGGCTTT TTCGAGCAGA CAGACACCCA
801 AAACCTTTCG GCCGTCTGGC AGGAAATGGA AATTATGAAT CTGGTTTCCC
851 TGCGTAAGCC GGATGATGCC TATGCGCGTT TGAACGTGCT GTTGGAACAC
901 AACCCGAATG CAAACCTGTA TATTCAGGCG GCGATATTGG CGGCAAACCG
951 AAAAGAAGGT GCGTCCGTTA TCGACGGCTA CGCCGAAAAG GCATACGGCA
1001 GGGGGACGGG GGAACAGCGG GGCagggcgg cAATgacggc GGCGATGATA
1051 TATGCCGACC GCAGGGATTA CGCCAAAGTC AGGCAGTGGT TGAAAAAAGT
1101 GTCCGCGCCG GAATACCTGT TCGACAAAGG CGTGCTGGCG GCTGCGGCGG
1151 CTGCCGAATT GGACGGAGGC CGGGCGGCTT TGCGGCAGAT CGGCAGGGTG
1201 CGGAAACTTC CCGAACAGCA GGGGCGGTAT TTTACGGCAG ACAATTTGTC
1251 CAAAATACAG ATGCTCGCCC TGTCGAAGCT GCCCGACAAA CGGGAAGCCC
1301 TGATCGGGCT GAACAACATC ATCGCCAAAC TTTCGGCGGC GGGAAGCACG
1351 GAACCTTTGG CGGAAGCATT GGCACAGCGT TCCATTATTT ACGaacAGTT
1401 cggCAAACGG GGAAAAATGA TTGCCGACCT tgaAACcgcg CTCAAACTTA
1451 CGCCCGATAA TGCACAAATT ATGAATAATC TGGGCTACAG CCTGCTTTCC
1501 GATTCCAAAC GTTTGGACGA GGGTTTCGCC CTGCTTCAGA CGGCATACCA
1551 AATCAACCCG GACGATACCG CCGTTAACGA CAGCATAGGC TGGGCGTATT
1601 ACCTGAAAGG CGACgcggaA AGCGCGCTGC CGTATCTGcg gtattcgttt
1651 gAAAACGACC CCGAGCCCGA AGTTGCCGCC CATTTGGGCG AAGTGTTGTG
1701 GGCATTGGGC GAACGCGATC AGGCGGTTGA CGTATGGACG CAGGCGGCAC
1751 ACCTTAGGGG AGACAAGAAA ATATGGCGGG AGACGCTCAA ACGCTACGGA
1801 ATCGCCTTGC CCGAGCCTTC CCGAAAACCC CGGAAATAA
This encodes a protein having the amino acid sequence <SEQ ID 48>:
1 MLPARFTILS VLAAALLAGQ AYAAGAADVE LPKEVGKVLR KHRRYSEEEI
51 KNERARLAAV GERVNRVFTL LGGETALQKG QAGTALATYM LMLERTKSPE
101 VAERALEMAV SLNAFEQAEM IYQKWRQIEP IPGEAQKPAG WLRNVLKEGG
151 NQHLDGLKEV LAQSDDVQKR RIFLLLVQAA VQQGGVAQKA SKAVRRAALK
201 YEHLPEAAVA DAVFGVQGRE KEKAIEALQR LAKLDTEILP PTLMTLRLTA
251 RKYPEILDGF FEQTDTQNLS AVWQEMEIMN LVSLRKPDDA YARLNVLLEH
301 NPNANLYIQA AILAANRKEG ASVIDGYAEK AYGRGTGEQR GRAAMTAAMI
351 YADRRDYAKV RQWLKKVSAP EYLFDKGVLA AAAAAELDGG RAALRQIGRV
401 RKLPEQQGRY FTADNLSKIQ MLALSKLPDK REALIGLNNI IAKLSAAGST
451 EPLAEALAQR SIIYEQFGKR GKMIADLETA LKLTPDNAQI MNNLGYSLLS
501 DSKRLDEGFA LLQTAYQINP DDTAVNDSIG WAYYLKGDAE SALPYLRYSF
551 ENDPEPEVAA HLGEVLWALG ERDQAVDVWT QAAHLRGDKK IWRETLKRYG
601 IALPEPSRKP RK*
ORF9ng and ORF9-1 show 88.1% identity over a 614 amino acid overlap:
10 20 30 40 50 60
orf9-1.pep MLPNRFKMLTVLTATLIAGQVSAAGGGAGDMKQPKEVGKVFRKQQRYSEEEIKNERARLA
||| || :|:||:|:|:|||: ||| |:|:: |||||||:||::|||||||||||||||
orf9ng-1 MLPARFTILSVLAAALLAGQAYAAG--AADVELPKEVGKVLRKHRRYSEEEIKNERARLA
10 20 30 40 50
70 80 90 100 110 120
orf9-1.pep AVGERVNQIFTLLGGETALQKGQAGTALATYMLMLERTKSPEVAERALEMAVSLNAFEQA
|||||||::|||||||||||||||||||||||||||||||||||||||||||||||||||
orf9ng-1 AVGERVNRVFTLLGGETALQKGQAGTALATYMLMLERTKSPEVAERALEMAVSLNAFEQA
60 70 80 90 100 110
130 140 150 160 170 180
orf9-1.pep EMIYQKWRQIEPIPGKAQKRAGWLRNVLRERGNQHLDGLEEVLAQADEGQNRRVFLLLAQ
|||||||||||||||:||| ||||||||:| ||||||||:|||||:|: |:||:||||:|
orf9ng-1 EMIYQKWRQIEPIPGEAQKPAGWLRNVLKEGGNQHLDGLKEVLAQSDDVQKRRIFLLLVQ
120 130 140 150 160 170
190 200 210 220 230 240
orf9-1.pep AAVQQDGLAQKASKAVRRAALKYEHLPEAAVADVVFSVQGREKEKAIGALQRLAKLDTEI
||||| |:|||||||||||||||||||||||||:||:|||||||||| ||||||||||||
orf9ng-1 AAVQQGGVAQKASKAVRRAALKYEHLPEAAVADAVFGVQGREKEKAIEALQRLAKLDTEI
180 190 200 210 220 230
250 260 270 280 290 300
orf9-1.pep LPPTLMTLRLTARKYPEILDGFFEQTDTQNLSAVWQEMEIMNLVSLHRLDDAYARLNVLL
||||||||||||||||||||||||||||||||||||||||||||||:: |||||||||||
orf9ng-1 LPPTLMTLRLTARKYPEILDGFFEQTDTQNLSAVWQEMEIMNLVSLRKPDDAYARLNVLL
240 250 260 270 280 290
310 320 330 340 350 360
orf9-1.pep ERNPNADLYIQAAILAANRKEGASVIDGYAEKAYGRGTEEQRSRAALTAAMMYADRRDYA
|:||||:||||||||||||||||||||||||||||||| |||:|||:||||:||||||||
orf9ng-1 EHNPNANLYIQAAILAANRKEGASVIDGYAEKAYGRGTGEQRGRAAMTAAMIYADRRDYA
300 310 320 330 340 350
370 380 390 400 410 420
orf9-1.pep KVRQWLKKVSAPEYLFDKGVLAAAAAVELDGGRAALRQIGRVRKLPEQQGRYFTADNLSK
||||||||||||||||||||||||||:|||||||||||||||||||||||||||||||||
orf9ng-1 KVRQWLKKVSAPEYLFDKGVLAAAAAAELDGGRAALRQIGRVRKLPEQQGRYFTADNLSK
360 370 380 390 400 410
430 440 450 460 470 480
orf9-1.pep IQMLALSKLPDKREALRGLDKIIEKPPAGSNTELQAEALVQRSVVYDRLGKRKKMISDLE
|||||||||||||||| ||::|| | |:::|| ||||:|||::|:::||| |||:|||
orf9ng-1 IQMLALSKLPDKREALIGLNNIIAKLSAAGSTEPLAEALAQRSIIYEQFGKRGKMIADLE
420 430 440 450 460 470
490 500 510 520 530 540
orf9-1.pep RAFRLAPDNAQIMNNLGYSLLTDSKRLDEGFALLQTAYQINPDDTAVNDSIGWAYYLKGD
|::|:|||||||||||||||:||||||||||||||||||||||||||||||||||||||
orf9ng-1 TALKLTPDNAQIMNNLGYSLLSDSKRLDEGFALLQTAYQINPDDTAVNDSIGWAYYLKGD
480 490 500 510 520 530
550 560 570 580 590 600
orf9-1.pep AESALPYLRYSFENDPEPEVAAHLGEVLWALGERDQAVDVWTQAAHLTGDKKIWRETLKR
||||||||||||||||||||||||||||||||||||||||||||||| ||||||||||||
orf9ng-1 AESALPYLRYSFENDPEPEVAAHLGEVLWALGERDQAVDVWTQAAHLRGDKKIWRETLKR
540 550 560 570 580 590
610
orf9-1.pep HGIALPQPSRKPRKX
:|||||:||||||||
orf9ng-1 YGIALPEPSRKPRKX
600 610
In addition, ORF9ng shows significant homology with a hypothetical protein from P. aeruginosa:
sp|P42810|YHE3_PSEAE HYPOTHETICAL 64.8 KD PROTEIN IN HEMM-HEMA INTERGENIC REGION (ORF3)
>gi|1072999|pir||S49376 hypothetical protein 3 - Pseudomonas aeruginosa >gi|557259 (X82071) orf3 [Pseudomonas aeruginosa] Length = 576
Score = 128 bits (318), Expect = 1e-28
Identities = 138/587 (23%), Positives = 228/587 (38%), Gaps = 125/587 (21%)
Query: 67 VFTLLGGETALQKGQAGTALATYMLMLERTKSPEVAERALEMAVSLNAFEQAEMIYQKWR 126
+++LL E A Q+ + AL+ Y++ ++T+ P V+ERA +A L A ++A W
Sbjct: 53 LYSLLVAELAGQRNRFDIALSNYVVQAQKTRDPGVSERAFRIAEYLGADQEALDTSLLWA 112
Query: 127 QIEPIPGEAQKPAG--------------WLRNVLKEGGNQHLDGLKEVLAQSDDVQKRRI 172
+ P +AQ+ A ++ VL G+ H D L A++D + +
Sbjct: 113 RSAPDNLDAQRAAAIQLARAGRYEESMVYMEKVLNGQGDTHFDFLALSAAETDPDTRAGL 172
Query: 173 FXXXXXXXXXXXXXXXKASKAVRRAALKYEHLPEAAVADAVFGVQGREKEKAIEALQRLA 232
++ KY + + A+ Q ++A+ L+ +
Sbjct: 173 L------------------QSFDHLLKKYPNNGQLLFGKALLLQQDGRPDEALTLLEDNS 214
Query: 233 KLDTEILPPTLMTLRLTARK-----YPEILDGFFEQTDTQNLSAVWQEMEIMNLVSLRKP 287
E+ P L + L + K P + G E D + + + + LV +
Sbjct: 215 ASRHEVAPLLLRSRLLQSMKRSDEALPLLKAGIKEHPDDKRVRLAYARL----LVEQNRL 270
Query: 288 DDAYARLNVLLEHNPN---------------------ANLYIQAAI-------------- 312
DDA A L++ P+ A +Y++ +
Sbjct: 271 DDAKAEFAGLVQQFPDDDDDLRFSLALVCLEAQAWDEARIYLEELVERDSHVDAAHFNLG 330
Query: 313 -LAANRKEGASVIDGYAEKAYGRGTGEQRGRAAMTAAMIYADRRDYAKVRQWLKKVSAPE 371
LA +K+ A +D YA+ G G + T ++ A R D A R + P+
Sbjct: 331 RLAEEQKDTARALDEYAQ--VGPGNDFLPAQLRQTDVLLKAGRVDEAAQRLDKARSEQPD 388
Query: 372 YLFDKXXXXXXXXXXXXXXXXXXRQIGRVRKLPEQQGRYFTADNLSKIQMLALSKLPDKR 431
Y A L I+ ALS +
Sbjct: 389 Y----------------------------------------AIQLYLIEAEALSNNDQQE 408
Query: 432 EALIGLNNIIAKLSAAGSTEPLAEALAQRSIIYEQFGKRGKMIADLETALKLTPDNAQIM 491
+A + + + E L L RS++ E+ +M DL + PDNA +
Sbjct: 409 KAWQAIQEGLKQYP-----EDL-NLLYTRSMLAEKRNDLAQMEKDLRFVIAREPDNAMAL 462
Query: 492 NNLGYSLLSDSKRLDEGFALLQTAYQINPDDTAVNDSIGWAYYLKGDAESALPYLRYSFE 551
N LGY+L + R E L+ A+++NPDD A+ DS+GW Y +G A YLR + +
Sbjct: 463 NALGYTLADRTTRYGEARELILKAHKLNPDDPAILDSMGWINYRQGKLADAERYLRQALQ 522
Query: 552 NDPEPEVAAHLGEVLWALGERDQAVDVWTQAAHLRGDKKIWRETLKR 598
P+ EVAAHLGEVLWA G + A +W + + D + R T+KR
Sbjct: 523 RYPDHEVAAHLGEVLWAQGRQGDARAIWREYLDKQPDSDVLRRTIKR 569
gi|2983399 (AE000710) hypothetical protein [Aquifex aeolicus] Length = 545
Score = 81.5 bits (198), Expect = 1e-14
Identities = 61/198 (30%), Positives = 98/198 (48%), Gaps = 19/198 (9%)
Query: 408 GRYFTADNL-SKIQMLALSKLPDKREALIGLNNIIAKLSAAGSTEPLAEALAQ------- 459
G Y A L K ++LA PDK+E L + +K + + L +
Sbjct: 335 GNYEDAKRLIEKAKVLA----PDKKEILFLEADYYSKTKQYDKALEILKKLEKDYPNDSR 390
Query: 460 ----RSIIYEQFGKRGKMIADLETALKLTPDNAQIMNNLGYSLLS--DSKRLDEGFALLQ 513
+I+Y+ G L A++L P+N N LGYSLL +R++E L++
Sbjct: 391 VYFMEAIVYDNLGDIKNAEKALRKAIELDPENPDYYNYLGYSLLLWYGKERVEEAEELIK 450
Query: 514 TAYQINPDDTAVNDSIGWAYYLKGDAESALPYLRYSF-ENDPEPEVAAHLGEVLWALGER 572
A + +P++ A DS+GW YYLKGD E A+ YL + E +P V H+G+VL +G +
Sbjct: 451 KALEKDPENPAYIDSMGWVYYLKGDYERAMQYLLKALREAYDDPVVNEHVGDVLLKMGYK 510
Query: 573 DQAVDVWTQAAHLRGDKK 590
++A + + +A L + K
Sbjct: 511 EEARNYYERALKLLEEGK 528
Based on this analysis, it is predicted that this protein from N. meningitidis and N. gonorrhoeae, and its epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies.
Example 7
The following partial DNA sequence was identified in N. meningitidis <SEQ ID 49>:
1 AACCTCTACG CCGGCCCGCA GACCACATCC GTCATCGCAA ACATCGCCGA
51 CAACCTGCAA CTGGCCAAAG ACTACGGCAA AGTACACTGG TTCGCCTCCC
101 CGCTCTTCTG GCTCCTGAAC CAACTGCACA ACATCATCGG CAACTGGGGC
151 TGGGCGATTA TCGTTTTAAC CATCATCGTC AAAGCCGTAC TGTATCCATT
201 GACCAACGCC TCTTACCGCT CTATGGCGAA AATGCGTGCC GCCGCACCCA
251 AACTGCAAGC CATCAAAGAG AAATACGGCG ACGACCGTAT GGCGCAACAA
301 CAGGCGATGA TGCAGCTTTA CACAGACGAG AAAATCAACC CGCTGGGCG
351 GCTGCCTGCC TATGCTGTTG CAAATCCCCG TCTTCATCGG ATTGTATTGG
401 GCATTGTTCG CCTCCGTAGA ATTGCGCCAG GCACCTTGGC TGGGTTGGAT
451 TACCGACCTC AGCCGCGCCG ACCCCTACTA CATCCTGCCC ATCATTATGG
501 CGGCAACGAT GTTCGCCCAA ACTTATCTGA ACCCGCCGCC GAcCGACCCG
551 ATGCagGCGA AAATGATGAA AATCATGCCG TTGGTTTTCT CsGwCrTGTT
601 CTTCTTCTTC CCTGCCGGks TGGTATTGTA CTGGGTAGTC AACAACCTCC
651 TGACCATCGC CCAGCAATGG CACATCAACC GCAGCATCGA AAAACAACGC
701 GCCCAAGGCG AAGTCGTTTC CTAA
This corresponds to the amino acid sequence <SEQ ID 50; ORF11>:
1 ..NLYAGPQTTS VIANIADNLQ LAKDYGKVHW FASPLFWLLN QLHNIIGNWG
51 WAIIVLTIIV KAVLYPLTNA SYRSMAKMRA AAPKLQAIKE KYGDDRMAQQ
101 QAMMQLYTDE KINPLGGCLP MLLQIPVFIG LYWALFASVE LRQAPWLGWI
151 TDLSRADPYY ILPIIMAATM FAQTYLNPPP TDPMQAKMMK IMPLVFSXXF
201 FFFPAGXVLY WVVNNLLTIA QQWHINRSIE KQRAQGEVVS *
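The `X` residues in SEQ ID 50 line up with the lowercase ambiguity letters (`s`, `w`, `r`, `k`) in SEQ ID 49. A minimal sketch of how an ambiguous codon can be resolved is given below: a residue is emitted only when every possible expansion of the codon encodes the same amino acid, otherwise `X`. It assumes the standard genetic code and the IUPAC nucleotide codes; the function name `translate_ambiguous` is hypothetical and this is not the software used for this document.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons ordered T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def translate_ambiguous(dna: str) -> str:
    """Translate in frame 1, writing 'X' where a codon is not decidable."""
    dna = "".join(dna.split()).upper()
    protein = []
    for i in range(0, len(dna) - 2, 3):
        # Enumerate every concrete codon the ambiguous codon could stand for.
        options = product(*(IUPAC[base] for base in dna[i:i + 3]))
        residues = {CODON_TABLE["".join(codon)] for codon in options}
        protein.append(residues.pop() if len(residues) == 1 else "X")
    return "".join(protein)


# Codons spanning the ambiguous positions of SEQ ID 49.
print(translate_ambiguous("TTC TCs GwC rTG TTC"))  # FSXXF
print(translate_ambiguous("GCC GGk sTG GTA"))      # AGXV
```

`TCs` and `GGk` still resolve to serine and glycine because the third-position ambiguity is silent, whereas `GwC`, `rTG` and `sTG` each cover two different amino acids and are written as `X`, matching `LVFSXXF` and `PAGXVLY` in the listing.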
Further sequence analysis revealed the complete DNA sequence <SEQ ID 51>:
1 ATGGATTTTA AAAGACTCAC GGCGTTTTTC GCCATCGCGC TGGTGATTAT
51 GATCGGCTGG GAAAAGATGT TCCCCACTCC GAAGCCAGTC CCCGCGCCCC
101 AACAGGCAGC ACAACAACAG GCCGTAACCG CTTCCGCCGA AGCCGCGCTC
151 GCGCCCGCAA CGCCGATTAC CGTAACGACC GACACGGTTC AAGCCGTCAT
201 TGATGAAAAA AGCGGCGACC TGCGCCGGCT GACCCTGCTC AAATACAAAG
251 CAACCGGCGA CGAAAATAAA CCGTTCATCC TGTTTGGCGA CGGCAAAGAA
301 TACACCTACG TCGCCCAATC CGAACTTTTG GACGCGCAGG GCAACAACAT
351 TCTAAAAGGC ATCGGCTTTA GCGCACCGAA AAAACAGTAC AGCTTGGAAG
401 GCGACAAAGT TGAAGTCCGC CTGAGCGCGC CTGAAACACG CGGTCTGAAA
451 ATCGACAAAG TTTATACTTT CACCAAAGGC AGCTATCTGG TCAACGTCCG
501 CTTCGACATC GCCAACGGCA GCGGTCAAAC CGCCAACCTG AGCGCGGACT
551 ACCGCATCGT CCGCGACCAC AGCGAACCCG AGGGTCAAGG TTACTTTACC
601 CACTCTTACG TCGGCCCTGT TGTTTATACC CCTGAAGGCA ACTTCCAAAA
651 AGTCAGCTTT TCCGACTTGG ACGACGATGC CAAATCCGGC AAATCCGAGG
701 CCGAATACAT CCGCAAAACC CCGACCGGCT GGCTCGGCAT GATTGAACAC
751 CACTTCATGT CCACCTGGAT TCTCCAACCT AAAGGCAGAC AAAGCGTTTG
801 CGCCGCAGGC GAGTGCAACA TCGACATCAA ACGCCGCAAC GACAAGCTGT
851 ACAGCACCAG CGTCAGCGTG CCTTTAGCCG CCATCCAAAA CGGCGCGAAA
901 GCCGAAGCCT CCATCAACCT CTACGCCGGC CCGCAGACCA CATCCGTCAT
951 CGCAAACATC GCCGACAACC TGCAACTGGC CAAAGACTAC GGCAAAGTAC
1001 ACTGGTTCGC CTCCCCGCTC TTCTGGCTCC TGAACCAACT GCACAACATC
1051 ATCGGCAACT GGGGCTGGGC GATTATCGTT TTAACCATCA TCGTCAAAGC
1101 CGTACTGTAT CCATTGACCA ACGCCTCTTA CCGCTCTATG GCGAAAATGC
1151 GTGCCGCCGC ACCCAAACTG CAAGCCATCA AAGAGAAATA CGGCGACGAC
1201 CGTATGGCGC AACAACAGGC GATGATGCAG CTTTACACAG ACGAGAAAAT
1251 CAACCCGCTG GGCGGCTGCC TGCCTATGCT GTTGCAAATC CCCGTCTTCA
1301 TCGGATTGTA TTGGGCATTG TTCGCCTCCG TAGAATTGCG CCAGGCACCT
1351 TGGCTGGGTT GGATTACCGA CCTCAGCCGC GCCGACCCCT ACTACATCCT
1401 GCCCATCATT ATGGCGGCAA CGATGTTCGC CCAAACTTAT CTGAACCCGC
1451 CGCCGACCGA CCCGATGCAG GCGAAAATGA TGAAAATCAT GCCGTTGGTT
1501 TTCTCCGTCA TGTTCTTCTT CTTCCCTGCC GGTCTGGTAT TGTACTGGGT
1551 AGTCAACAAC CTCCTGACCA TCGCCCAGCA ATGGCACATC AACCGCAGCA
1601 TCGAAAAACA ACGCGCCCAA GGCGAAGTCG TTTCCTAA
This corresponds to the amino acid sequence <SEQ ID 52; ORF11-1>:
1 MDFKRLTAFF AIALVIMIGW EKMFPTPKPV PAPQQAAQQQ AVTASAEAAL
51 APATPITVTT DTVQAVIDEK SGDLRRLTLL KYKATGDENK PFILFGDGKE
101 YTYVAQSELL DAQGNNILKG IGFSAPKKQY SLEGDKVEVR LSAPETRGLK
151 IDKVYTFTKG SYLVNVRFDI ANGSGQTANL SADYRIVRDH SEPEGQGYFT
201 HSYVGPVVYT PEGNFQKVSF SDLDDDAKSG KSEAEYIRKT PTGWLGMIEH
251 HFMSTWILQP KGRQSVCAAG ECNIDIKRRN DKLYSTSVSV PLAAIQNGAK
301 AEASINLYAG PQTTSVIANI ADNLQLAKDY GKVHWFASPL FWLLNQLHNI
351 IGNWGWAIIV LTIIVKAVLY PLTNASYRSM AKMRAAAPKL QAIKEKYGDD
401 RMAQQQAMMQ LYTDEKINPL GGCLPMLLQI PVFIGLYWAL FASVELRQAP
451 WLGWITDLSR ADPYYILPII MAATMFAQTY LNPPPTDPMQ AKMMKIMPLV
501 FSVMFFFFPA GLVLYWVVNN LLTIAQQWHI NRSIEKQRAQ GEVVS*
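Computer analysis of this amino acid sequence gave the following results:
Homology with the 60 kDa inner-membrane protein (accession number P25754) of Pseudomonas putida
ORF11 and the 60 kDa protein show 58% amino acid identity over a 229 amino acid overlap (BLASTp).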
ORF11 2 LYAGPQTTSVIANIADNLQLAKDYGKVHWFASPLFWLLNQLHNIIGNWGWAIIVLTIIVK 61
LYAGP+ S + ++ L+L DYG + + A P+FWLL +H+++GNWGW+IIVLT+++K
60K 324 LYAGPKIQSKLKELSPGLELTVDYGFLWFIAQPIFWLLQHIHSLLGNWGWSIIVLTMLIK 383
ORF11 62 AVLYPLTNASYRSMAKMRAAAPKLQAIKEKYGDDRXXXXXXXXXLYTDEKINPLGGCLPM 121
+ +PL+ ASYRSMA+MRA APKL A+KE++GDDR LY EKINPLGGCLP+
60K 384 GLFFPLSAASYRSMARMRAVAPKLAALKERFGDDRQKMSQAMMELYKKEKINPLGGCLPI 443
ORF11 122 LLQIPVFIGLYWALFASVELRQAPWLGWITDLSRADPYYILPIIMAATMFAQTYLNPPPT 181
L+Q+PVF+ LYW L SVE+RQAPW+ WITDLS DP++ILPIIM ATMF Q LNP P
60K 444 LVQMPVFLALYWVLLESVEMRQAPWILWITDLSIKDPFFILPIIMGATMFIQQRLNPTPP 503
ORF11 182 DPMQAKAMKIMPLVXXXXXXXXPAGXVLYWVVNNLLTIAQQWHINRSIE 230
DPMQAK+MK+MP++ PAG VLYWVVNN L+I+QQW+I R IE
60K 504 DPMQAKVMKMMPIIFTFFFLWFPAGLVLYWVVNNCLSISQQWYITRRIE 552
Homology with a predicted ORF from N. meningitidis (strain A)
ORF11 shows 97.9% identity over a 240 amino acid overlap with an ORF (ORF11a) from strain A of N. meningitidis:
10 20 30
orf11.pep NLYAGPQTTSVIANIADNLQLAKDYGKVHW
||||||||||||||||||||||||||||||
orf11a IKRRNDKLYSTSVSVPLAAIQNGAKSXASINLYAGPQTTSVIANIADNLQLXKDYGKVHW
280 290 300 310 320 330
40 50 60 70 80 90
orf11.pep FASPLFWLLNQLHNIIGNWGWAIIVLTIIVKAVLYPLTNASYRSMAKMRAAAPKLQAIKE
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf11a FASPLFWLLNQLHNIIGNWGWAIIVLTIIVKAVLYPLTNASYRSMAKMRAAAPKLQAIKE
340 350 360 370 380 390
100 110 120 130 140 150
orf11.pep KYGDDRMAQQQAMMQLYTDEKINPLGGCLPMLLQIPVFIGLYWALFASVELRQAPWLGWI
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf11a KYGDDRMAQQQAMMQLYTDEKINPLGGCLPMLLQIPVFIGLYWALFASVELRQAPWLGWI
400 410 420 430 440 450
160 170 180 190 200 210
orf11.pep TDLSRADPYYILPIIMAATMFAQTYLNPPPTDPMQAKMMKIMPLVFSXXFFFFPAGXVLY
||||||||||||||||||||||||||||||||||||||||||||| ||||| |||| |||
orf11a TDLSRADPYYILPIIMAATMFAQTYLNPPPTDPMQAKMMKIMPLVXSXXFFXFPAGLVLY
460 470 480 490 500 510
220 230 240
orf11.pep WVVNNLLTIAQQWHINRSIEKQRAQGEVVSX
||:||||||||||||||||||||||||||||
orf11a WVINNLLTIAQQWHINRSIEKQRAQGEVVSX
520 530 540
The complete length ORF11a nucleotide sequence <SEQ ID 53> is:
1 ANGGATTTTA AAAGACTCAC NGNGTTTTTC GCCATCGCAC TGGTGATTAT
51 GATCGGATNG NAAANGATGT TCCCCACTCC GAAGCCCGTC CCCGCGCCCC
101 AACAGACGGC ACAACAACAG GCCGTAANCG CTTCCGCCGA AGCCGCGCTC
151 GCGCCCGNAN CGCCGATTAC CGTAACGACC GACACGGTTC AAGCCGTCAT
201 TGATGAAAAA AGCGGCGACC TGCGCCGGCT GACCCTGCTC AAATACAAAG
251 CAACCGGCGA CNAAAATAAA CCGTTCATCC TGTTTGGCGA CGGCAAANAA
301 TACACCTACN TCGCCCANTC CGAACTTTTG GACGCGCAGG GCAACAACAT
351 TCTAAAAGGC ATCGGCTTTA GCGCACCGAA AAAACAGTAC AGCTTGGAAG
401 GCGACAAAGT TGAAGTCCGC CTGAGCGCAC CTGAAACACG CGGTCTGAAA
451 ATCGACAAAG TTTATACTTT CACCAAAGGC AGCTATCTGG TCAACGTCCG
501 CTTCGACATC GCCAACGGCA GCGGTCAAAC CGCCAACCTG AGCGCGGACT
551 ACCGCATCGT CCGCGACCAC AGCGAACCCG AGGGTCAAGG CTACTTTACC
601 CACTCTTACG TCGGCCCTGT TGTTTATACC CCTGAAGGCA ACTTCCAAAA
651 AGTCAGCTTC TCCGACTTGG ACGACGATGC CAANTCCGGN AAATCCGAGG
701 CCGAATACAT CCGCAAAACC CNGACCGGCT GGCTCGGCAT GATTGAACAC
751 CACTTCATGT CCACCTGGAT CCTCCAACCC AAAGGCGGAC AAAGCGTTTG
801 CGCCGCTGGC GACTGCNGTA TNGACATCAA ACGCCGCAAC GACAAGCTGT
851 ACAGCACCAG CGTCAGCGTG CCTTTAGCCG CTATCCAAAA CGGTGCGAAA
901 TCCNAAGCCT CCATCAACCT CTACGCCGGC CCACAGACCA CATCNGTTAT
951 CGCAAACATC GCCGACAACC TGCAACTGGN CAAAGACTAC GGCAAAGTAC
1001 ACTGGTTCGC CTCCCCCCTC TTTTGGCTTT TGAACCAACT GCACAACATC
1051 ATCGGCAACT GGGGCTGGGC GATTATCGTT TTAACCATCA TCGTCAAAGC
1101 CGTACTGTAT CCATTGACCA ACGCCTCTTA CCGTTCGATG GCGAAAATGC
1151 GTGCCGCCGC GCCCAAACTG CAAGCCATCA AAGAGAAATA CGGCGACGAC
1201 CGTATGGCGC AGCAACAAGC CATGATGCAG CTTTACACAG ACGAGAAAAT
1251 CAACCCGCTG GGCGGCTGCC TGCCTATGCT GTTGCAAATC CCCGTCTTCA
1301 TCGGATTGTA TTGGGCATTG TTCGCCTCCG TAGAATTGCG CCAGGCACCT
1351 TGGCTGGGTT GGATTACCGA CCTCAGCCGC GCCGACCCNT ACTACATCCT
1401 GCCCATCATT ATGGCGGCAA CGATGTTCGC CCAAACCTAT CTGAACCCGC
1451 CGCCGACCGA CCCGATGCAG GCGAAAATGA TGAAAATCAT GCCTTTGGTT
1501 NTNTCNNNNA NGTTCTTCNN CTTCCCTGCC GGTCTGGTAT TGTACTGGGT
1551 GATCAACAAC CTCCTGACCA TCGCCCAGCA ATGGCACATC AACCGCAGCA
1601 TCGAAAAACA ACGCGCCCAA GGCGAAGTCG TTTCCTAA
This encodes a protein having amino acid sequence <SEQ ID 54>:
1 XDFKRLTXFF AIALVIMIGX XXMFPTPKPV PAPQQTAQQQ AVXASAEAAL
51 APXXPITVTT DTVQAVIDEK SGDLRRLTLL KYKATGDXNK PFILFGDGKX
101 YTYXAXSELL DAQGNNILKG IGFSAPKKQY SLEGDKVEVR LSAPETRGLK
151 IDKVYTFTKG SYLVNVRFDI ANGSGQTANL SADYRIVRDH SEPEGQGYFT
201 HSYVGPVVYT PEGNFQKVSF SDLDDDAXSG KSEAEYIRKT XTGWLGMIEH
251 HFMSTWILQP KGGQSVCAAG DCXXDIKRRN DKLYSTSVSV PLAAIQNGAK
301 SXASINLYAG PQTTSVIANI ADNLQLXKDY GKVHWFASPL FWLLNQLHNI
351 IGNWGWAIIV LTIIVKAVLY PLTNASYRSM AKMRAAAPKL QAIKEKYGDD
401 RMAQQQAMMQ LYTDEKINPL GGCLPMLLQI PVFIGLYWAL FASVELRQAP
451 WLGWITDLSR ADPYYILPII MAATMFAQTY LNPPPTDPMQ AKMMKIMPLV
501 XSXXFFXFPA GLVLYWVINN LLTIAQQWHI NRSIEKQRAQ GEVVS*
ORF11a and ORF11-1 show 95.2% identity over a 544 amino acid overlap:
10 20 30 40 50 60
orf11a.pep XDFKRLTXFFAIALVIMIGXXXMFPTPKPVPAPQQTAQQQAVXASAEAALAPXXPITVTT
||||||| ||||||||||| |||||||||||||:||||||:||||||||| :||||||
orf11-1 MDFKRLTAFFAIALVIMIGWEKMFPTPKPVPAPQQAAQQQAVTASAEAALAPATPITVTT
10 20 30 40 50 60
70 80 90 100 110 120
orf11a.pep DTVQAVIDEKSGDLRRLTLLKYKATGDXNKPFILFGDGKXYTYXAXSELLDAQGNNILKG
||||||||||||||||||||||||||| ||||||||||| ||| | ||||||||||||||
orf11-1 DTVQAVIDEKSGDLRRLTLLKYKATGDENKPFILFGDGKEYTYVAQSELLDAQGNNILKG
70 80 90 100 110 120
130 140 150 160 170 180
orf11a.pep IGFSAPKKQYSLEGDKVEVRLSAPETRGLKIDKVYTFTKGSYLVNVRFDIANGSGQTANL
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf11-1 IGFSAPKKQYSLEGDKVEVRLSAPETRGLKIDKVYTFTKGSYLVNVRFDIANGSGQTANL
130 140 150 160 170 180
190 200 210 220 230 240
orf11a.pep SADYRIVRDHSEPEGQGYFTHSYVGPVVYTPEGNFQKVSFSDLDDDAXSGKSEAEYIRKT
||||||||||||||||||||||||||||||||||||||||||||||| ||||||||||||
orf11-1 SADYRIVRDHSEPEGQGYFTHSYVGPVVYTPEGNFQKVSFSDLDDDAKSGKSEAEYIRKT
190 200 210 220 230 240
250 260 270 280 290 300
orf11a.pep XTGWLGMIEHHFMSTWILQPKGGQSVCAAGDCXXDIKRRNDKLYSTSVSVPLAAIQNGAK
|||||||||||||||||||||| |||||||:| ||||||||||||||||||||||||||
orf11-1 PTGWLGMIEHHFMSTWILQPKGRQSVCAAGECNIDIKRRNDKLYSTSVSVPLAAIQNGAK
250 260 270 280 290 300
310 320 330 340 350 360
orf11a.pep SXASINLYAGPQTTSVIANIADNLQLXKDYGKVHWFASPLFWLLNQLHNIIGNWGWAIIV
: |||||||||||||||||||||||| |||||||||||||||||||||||||||||||||
orf11-1 AEASINLYAGPQTTSVIANIADNLQLAKDYGKVHWFASPLFWLLNQLHNIIGNWGWAIIV
310 320 330 340 350 360
370 380 390 400 410 420
orf11a.pep LTIIVKAVLYPLTNASYRSMAKMRAAAPKLQAIKEKYGDDRMAQQQAMMQLYTDEKINPL
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf11-1 LTIIVKAVLYPLTNASYRSMAKMRAAAPKLQAIKEKYGDDRMAQQQAMMQLYTDEKINPL
370 380 390 400 410 420
430 440 450 460 470 480
orf11a.pep GGCLPMLLQIPVFIGLYWALFASVELRQAPWLGWITDLSRADPYYILPIIMAATMFAQTY
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf11-1 GGCLPMLLQIPVFIGLYWALFASVELRQAPWLGWITDLSRADPYYILPIIMAATMFAQTY
430 440 450 460 470 480
490 500 510 520 530 540
orf11a.pep LNPPPTDPMQAKMMKIMPLVXSXXFFXFPAGLVLYWVINNLLTIAQQWHINRSIEKQRAQ
|||||||||||||||||||| | || ||||||||||:||||||||||||||||||||||
orf11-1 LNPPPTDPMQAKMMKIMPLVFSVMFFFFPAGLVLYWVVNNLLTIAQQWHINRSIEKQRAQ
490 500 510 520 530 540
orf11a.pep GEVVSX
||||||
orf11-1 GEVVSX
Homology with a predicted ORF from N. gonorrhoeae
ORF11 shows 93.6% identity over a 240 amino acid overlap with a predicted ORF (ORF11.ng) from N. gonorrhoeae:
Orf11 NLYAGPQTTSVIANIADNLQLAKDYGKVHWFASPLFWLLNQLHNIIGNWGWAIIVLT 57
|||||||||||||||||||||||||||||||||||||||||||||||||||||:|||
orf11ng MAVNLYAGPQTTSVIANIADNLQLAKDYGKVHWFASPLFWLLNQLHNIIGNWGWAIVVLT 60
orf11 IIVKAVLYPLTNASYRSMAKMRAAAPKLQAIKEKYGDDRMAQQQAMMQLYTDEKINPLGG 117
||||||||||||||||||||||||||:||:|||||||||||||||||||: ||:||||||
orf11ng IIVKAVLYPLTNASYRSMAKMRAAAPELQTIKEKYGDDRMAQQQAMMQLFEDEEINPLGG 120
orf11 CLPMLLQIPVFIGLYWALFASVELRQAPWLGWITDLSRADPYYILPIIMAATMFAQTYLN 177
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf11ng CLPMLLQIPVFIGLYWALFASVELRQAPWLGWITDLSRADPYYILPIIMAATMFAQTYLN 180
orf11 PPPTDPMQAKMMKIMPLVFSXXFFFFPAGXVLYWVVNNLLTIAQQWHINRSIEKQRAQGE 237
|||||||||||||||||||| ||||||| ||||||||||||||||||||||||||||||
orf11ng PPPTDPMQAKMMKIMPLVFSVMFFFFPAGLVLYWVVNNLLTIAQQWHINRSIEKQRAQGE 240
orf11 VVS 240
|||
orf11ng VVS 243
The predicted ORF11ng nucleotide sequence <SEQ ID 55> encodes a protein having amino acid sequence <SEQ ID 56>:
1 MAVNLYAGPQ TTSVIANIAD NLQLAKDYGK VHWFASPLFW LLNQLHNIIG
51 NWGWAIVVLT IIVKAVLYPL TNASYRSMAK MRAAAPELQT IKEKYGDDRM
101 AQQQAMMQLF EDEEINPLGG CLPMLLQIPV FIGLYWALFA SVELRQAPWL
151 GWITDLSRAD PYYILPIIMA ATMFAQTYLN PPPTDPMQAK MMKIMPLVFS
201 VMFFFFPAGL VLYWVVNNLL TIAQQWHINR SIEKQRAQGE VVS*
Further sequence analysis revealed the complete gonococcal DNA sequence <SEQ ID 57>:
1 ATGGATTTTA AAAGACTCAC GGCGTTTTTC GCCATCGCGC TGGTGATTAT
51 GATCGGCTGG GAAAAAATGT TCCCCACCCC GAAACCCGTC CCCGCGCCCC
101 AACAGGCGGC ACAAAAACAG GCAGCAACCG CTTCCGCCGA AGCCGCGCTC
151 GCGCCCGCAA CGCCGATTAC CGTAACGACC GACACGGTTC AAGCCGTTAT
201 TGATGAAAAA AGTGGCGACC TGCGCCGGCT GACCCTGCTC AAATACAAAG
251 CAACCGGCGA CGAAAACAAA CCGTTCGTCC TGTTTGGCGA CGGCAAAGAA
301 TACACCTACG TCGCCCAATC CGAACTTTTG GACGCGCAGG GCAACAACAT
351 TCTGAAAGGC ATCGGCTTTA GCGCACCGAA AAAACAGTAC ACCCTCAACG
401 GCGACACAGT CGAAGTCCGC CTGAGCGCGC CCGAAACCAA CGGACTGAAA
451 ATCGACAAAG TCTATACCTT TACCAAAGAC AGCTATCTGG TCAACGTCCG
501 CTTCGACATC GCCAACGGCA GCGGTCAAAC CGCCAACCTG AGCGCGGACT
551 ACCGCATCGT CCGCGACCAC AGCGAACCCG AGGGTCAAGG CTACTTTACC
601 CACTCTTACG TCGGCCCTGT TGTTTATACC CCTGAAGGCA ACTTCCAAAA
651 AGTCAGCTTC TCCgacTTgg acgACGATGC gaaaTccggc aaATccgagg
701 ccgaatacaT CCGCAAAACC ccgaccggtt ggctcggcat gattgaacac
751 cacttcatgt ccacctggat cctccAAcct aaaggcggcc aaaacgtttg
801 cgcccaggga gactgccgta tcgacattaa aCgccgcaac gacaagctgt
851 acagcgcaag cgtcagcgtg cctttaaccg ctatcccaac ccgggggcca
901 aaaccgaaaa tggcggTCAA CCTGTATGCC GGTCCGCAAA CCACATCCGT
951 TATCGCAAAC ATCGCcgacA ACCTGCAACT GGCAAAAGAC TACGGTAAAG
1001 TACACTGGTT CGCATCGCCG CTCTTCTGGC TCCTGAACCA ACTGCACAAC
1051 ATTATCGGCA ACTGGGGCTG GGCAATCGTC GTTTTGACCA TCATCGTCAA
1101 AGCCGTACTG TATCCATTGA CCAACGcctc ctACCGTTCG ATGGCGAAAA
1151 TGCGTGccgc cgcacCcaaA CTGCAGACCA TCAAAGAAAA ATAcgGCGAC
1201 GACCGTATGG CGCAACAGCA AGCGATGATG CAGCTTTACA AAgacgAGAA
1251 AATCAACCCG CTGGGCGGCT GTctgcctat gctgttgCAA ATCCCCGTCT
1301 TCATCGGCTT GTACTGGGCA TTGTTCGCCT CCGTAGAATT GCGCCAGGCA
1351 CCTTGGCTGG GCTGGATTAC CGACCTCAGC CGCGCCGACC CCTACTACAT
1401 CCTGCCCATC ATTATGGCGG CAACGATGTT CGCCCAAACC TATCTGAACC
1451 CGCCGCCGAC CGACCCGATG CAGGCGAAAA TGATGAAAAT CATGCCGTTG
1501 GTTTTCTCCG TCATGTTCTT CTTCTTCCCT GCCGGTTTGG TTCTCTACTG
1551 GGTGGTCAAC AACCTCCTGA CCATCGCCCA GCAGTGGCAC ATCAACCGCA
1601 GCATCGAAAA ACAACGCGCC CAAGGCGAAG TCGTTTCCTA A
This encodes a protein having amino acid sequence <SEQ ID 58; ORF11ng-1>:
1 MDFKRLTAFF AIALVIMIGW EKMFPTPKPV PAPQQAAQKQ AATASAEAAL
51 APATPITVTT DTVQAVIDEK SGDLRRLTLL KYKATGDENK PFVLFGDGKE
101 YTYVAQSELL DAQGNNILKG IGFSAPKKQY TLNGDTVEVR LSAPETNGLK
151 IDKVYTFTKD SYLVNVRFDI ANGSGQTANL SADYRIVRDH SEPEGQGYFT
201 HSYVGPVVYT PEGNFQKVSF SDLDDDAKSG KSEAEYIRKT PTGWLGMIEH
251 HFMSTWILQP KGGQNVCAQG DCRIDIKRRN DKLYSASVSV PLTAIPTRGP
301 KPKMAVNLYA GPQTTSVIAN IADNLQLAKD YGKVHWFASP LFWLLNQLHN
351 IIGNWGWAIV VLTIIVKAVL YPLTNASYRS MAKMRAAAPK LQTIKEKYGD
401 DRMAQQQAMM QLYKDEKINP LGGCLPMLLQ IPVFIGLYWA LFASVELRQA
451 PWLGWITDLS RADPYYILPI IMAATMFAQT YLNPPPTDPM QAKMMKIMPL
501 VFSVMFFFFP AGLVLYWVVN NLLTIAQQWH INRSIEKQRA QGEVVS*
ORF11ng-1 and ORF11-1 show 95.1% identity over a 546 amino acid overlap:
10 20 30 40 50 60
orf11ng-1.pep MDFKRLTAFFAIALVIMIGWEKMFPTPKPVPAPQQAAQKQAATASAEAALAPATPITVTT
||||||||||||||||||||||||||||||||||||||:||:||||||||||||||||||
orf11-1 MDFKRLTAFFAIALVIMIGWEKMFPTPKPVPAPQQAAQQQAVTASAEAALAPATPITVTT
10 20 30 40 50 60
70 80 90 100 110 120
orf11ng-1.pep DTVQAVIDEKSGDLRRLTLLKYKATGDENKPFVLFGDGKEYTYVAQSELLDAQGNNILKG
||||||||||||||||||||||||||||||||:|||||||||||||||||||||||||||
orf11-1 DTVQAVIDEKSGDLRRLTLLKYKATGDENKPFILFGDGKEYTYVAQSELLDAQGNNILKG
70 80 90 100 110 120
130 140 150 160 170 180
orf11ng-1.pep IGFSAPKKQYTLNGDTVEVRLSAPETNGLKIDKVYTFTKDSYLVNVRFDIANGSGQTANL
||||||||||:|:|| |||||||||| |||||||||||| ||||||||||||||||||||
orf11-1 IGFSAPKKQYSLEGDKVEVRLSAPETRGLKIDKVYTFTKGSYLVNVRFDIANGSGQTANL
130 140 150 160 170 180
190 200 210 220 230 240
orf11ng-1.pep SADYRIVRDHSEPEGQGYFTHSYVGPVVYTPEGNFQKVSFSDLDDDAKSGKSEAEYIRKT
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf11-1 SADYRIVRDHSEPEGQGYFTHSYVGPVVYTPEGNFQKVSFSDLDDDAKSGKSEAEYIRKT
190 200 210 220 230 240
250 260 270 280 290 300
orf11ng-1.pep PTGWLGMIEHHFMSTWILQPKGGQNVCAQGDCRIDIKRRNDKLYSASVSVPLTAIPTRGP
|||||||||||||||||||||| |:||| |:| ||||||||||||:||||||:|| : |
orf11-1 PTGWLGMIEHHFMSTWILQPKGRQSVCAAGECNIDIKRRNDKLYSTSVSVPLAAIQN-GA
250 260 270 280 290
310 320 330 340 350 360
orf11ng-1.pep KPKMAVNLYAGPQTTSVIANIADNLQLAKDYGKVHWFASPLFWLLNQLHNIIGNWGWAIV
| : ::|||||||||||||||||||||||||||||||||||||||||||||||||||||:
orf11-1 KAEASINLYAGPQTTSVIANIADNLQLAKDYGKVHWFASPLFWLLNQLHNIIGNWGWAII
300 310 320 330 340 350
370 380 390 400 410 420
orf11ng-1.pep VLTIIVKAVLYPLTNASYRSMAKMRAAAPKLQTIKEKYGDDRMAQQQAMMQLYKDEKINP
||||||||||||||||||||||||||||||||:|||||||||||||||||||| ||||||
orf11-1 VLTIIVKAVLYPLTNASYRSMAKMRAAAPKLQAIKEKYGDDRMAQQQAMMQLYTDEKINP
360 370 380 390 400 410
430 440 450 460 470 480
orf11ng-1.pep LGGCLPMLLQIPVFIGLYWALFASVELRQAPWLGWITDLSRADPYYILPIIMAATMFAQT
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf11-1 LGGCLPMLLQIPVFIGLYWALFASVELRQAPWLGWITDLSRADPYYILPIIMAATMFAQT
420 430 440 450 460 470
490 500 510 520 530 540
orf11ng-1.pep YLNPPPTDPMQAKMMKIMPLVFSVMFFFFPAGLVLYWVVNNLLTIAQQWHINRSIEKQRA
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf11-1 YLNPPPTDPMQAKMMKIMPLVFSVMFFFFPAGLVLYWVVNNLLTIAQQWHINRSIEKQRA
480 490 500 510 520 530
orf11ng-1.pep QGEVVSX
|||||||
orf11-1 QGEVVSX
540
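In addition, ORF11ng-1 shows significant homology with an inner-membrane protein from the database (accession number P25754):
ID 60IM_PSEPU STANDARD; PRT; 560 AA.
AC P25754;
DT 01-MAY-1992 (REL. 22, CREATED)
DT 01-MAY-1992 (REL. 22, LAST SEQUENCE UPDATE)
DT 01-NOV-1995 (REL. 32, LAST ANNOTATION UPDATE)
DE 60 KD INNER-MEMBRANE PROTEIN....
SCORES Init1:1074 Initn:1293 Opt:1103
Smith-Waterman score: 1406; 41.5% identity in 574 aa overlap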
10 20 30 40
orf11ng-1.pep MDFKR---LTAFFAIALVIMIGW-----EKMFPT------------PKPVPAPQQAAQKQ
||:|| ::|: ::: |::: | : :|| | ||| :::|: :
p25754 MDIKRTILIAALAVVSYVMVLKWNDDYGQAALPTQNTAASTVAPGLPDGVPAGNNGASAD
10 20 30 40 50 60
50 60 70 80 90
orf11ng-1.pep AATASAEAALAPATPIT-------VTTDTVQAVIDEKSGDLRRLTLLKYKATGDE-NKPF
: :|:||:: | :|:: | ||::: :|| :||: :|:| || |: | ||
p25754 VPSANAESSPAELAPVALSKDLIRVKTDVLELAIDPVGGDIVQLNLPKYPRRQDHPNIPF
70 80 90 100 110 120
100 110 120 130 140
orf11ng-1.pep VLFGDGKEYTYVAQSELLDAQGNNILKGIG---FSAPKKQYTL-NGD---TVEVRLSAPE
|| :| | :|:||| | ::| : :: | ::| :|:| | :|: :|::::|
p25754 QLFDNGGERVYLAQSGLTGTDGPDA-RASGRPLYAAEQKSYQLADGQEQLVVDLKFS---
130 140 150 160 170
150 160 170 180 190 200
orf11ng-1.pep TNGLKIDKVYTFTKDSYLVNVRFDIANGSGQTANLSADYRIVRDHS-EPEGQGYF-THSY
||:: | ::| : | :|| : | | |||: | : :: || | :| :: | :|
p25754 DNGVNYIKRFSFKRGEYDLNVSYLIDNQSGQAWNGNMFAQLKRDASGDPSSSTATGTATY
180 190 200 210 220 230
210 220 230 240 250 260
orf11ng-1.pep VGPVVYTPEGNFQKVSFSDLDDDAKSGKSEAEYIRKTPTGWLGMIEHHFMSTWILQPKGG
:| :::| ::|||::|:| |:: :| :: ||:: ::|:|:::|| |:
p25754 LGAALWTASEPYKKVSMKDID---KGSLKE-----NVSGGWVAWLQHYFVTAWI-PAKSD
240 250 260 270 280
270 280 290 300 310 320
orf11ng-1.pep QNVCAQGDCRIDIKRRNDKLYSASVSVPLTAIPTRGPKPKMAVNLYAGPQTTSVIANIAD
:|| :: :: :: | : : |: ::|: | | : :: |||||: | : :::
p25754 NNV-------VQTRKDSQGNYIIGYTGPVISVPA-GGKVETSALLYAGPKIQSKLKELSP
290 300 310 320 330
330 340 350 360 370 380
orf11ng-1.pep NLQLAKDYGKVHWF-ASPLFWLLNQLHNIIGNWGWAIVVLTIIVKAVLYPLTNASYRSMA
:|:|: ||| : || |:|:||||:::|:::|||||:|:|||:::|::::||: |||||||
p25754 GLELTVDYGFL-WFIAQPIFWLLQHIHSLLGNWGWSIIVLTMLIKGLFFPLSAASYRSMA
340 350 360 370 380 390
390 400 410 420 430 440
orf11ng-1.pep KMRAAAPKLQTIKEKYGDDRMAQQQAMMQLYKDEKINPLGGCLPMLLQIPVFIGLYWALF
:|||:|||| ::||::||||: ::||||:||| |||||||||||:|:|:|||::|||:|:
p25754 RMRAVAPKLAALKERFGDDRQKMSQAMMELYKKEKINPLGGCLPILVQMPVFLALYWVLL
400 410 420 430 440 450
450 460 470 480 490 500
orf11ng-1.pep ASVELRQAPWLGWITDLSRADPYYILPIIMAATMFAQTYLNPPPTDPMQAKMMKIMPLVF
|||:|||||: |||||| ||::||||||:|||| | ||| | ||||||:||:||::|
p25754 ESVEMRQAPWILWITDLSIKDPFFILPIIMGATMFIQQRLNPTPPDPMQAKVMKMMPIIF
460 470 480 490 500 510
510 520 530 540
orf11ng-1.pep SVMFFFFPAGLVLYWVVNNLLTIAQQWHINRSIEKQRAQGEVVSX
: :|::||||||||||||| |:|:|||:|:| ||
p25754 TFFFLWFPAGLVLYWVVNNCLSISQQWYITRRIEAATKKAAA
520 530 540 550 560
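Based on this analysis, including the homology with the inner-membrane protein from P. putida and the predicted transmembrane domains (seen in both the meningococcal and gonococcal proteins), it is predicted that these proteins from N. meningitidis and N. gonorrhoeae, and their epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies.
Example 8
The following partial DNA sequence was identified in N. meningitidis <SEQ ID 59>: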
1 ..GCCGTCTTAA TCATCGAATT ATTGACGGGA ACGGTTTATC TTTTGGTTGT
51 NAGCGCGGCT TTGGCGGGTT CGGGCATTGC TTACGGGCTG ACCGGCAGTA
101 CGCCTGCCGC CGTCTTGACC GNCGCTCTGC TTTCCGCGCT GGGTATTTNG
151 TTCGTACACG CCAAAACCGC CGTTAGAAAA GTTGAAACGG ATTCATATCA
201 GGATTTGGAT GCCGGACAAT ATGTCGAAAT CCTCCGNCAC ACAGGCGGCA
251 ACCGTTACGA AGTT.TTTAT CGCGGTACG. ACTGGCAGGC TCAAAATACG
301 GGGCAAGAAG AGCTTGAACC AGGAACTCGC GCCCTCATTG TCCGCAAGGA
351 AGGCAACCTT CTTATTATCA CACACCCTTA A
This corresponds to the amino acid sequence <SEQ ID 60; ORF13>:
1 ..AVLIIELLTG TVYLLVVSAA LAGSGIAYGL TGSTPAAVLT XALLSALGIX
51 FVHAKTAVRK VETDSYQDLD AGQYVEILRH TGGNRYEVXY RGTXWQAQNT
101 GQEELEPGTR ALIVRKEGNL LIITHP*
Further sequence analysis elaborated the DNA sequence slightly <SEQ ID 61>:
1 ..GCCGTCTTAA TCATCGAATT ATTGACGGGA ACGGTTTATC TTTTGGTTGT
51 nAGCGCGGCT TTGGCGGGTT CGGGCATTGC TTACGGGCTG ACCGGCAGTA
101 CGCCTGCCGC CGTCTTGACC GnCGCTCTGC TTTCCGCGCT GGGTATTTnG
151 TTCGTACACG CCAAAACCGC CGTTAGAAAA GTTGAAACGG ATTCATATCA
201 GGATTTGGAT GCCGGACAAT ATGTCGAAAT CCTCCGACAC ACAGGCGGCA
251 ACCGTTACGA AGTTTTtTAT CGCGGTACGc ACTGGCAGGC TCAAAATACG
301 GGGCAAGAAG AGCTTGAACC AGGAACTCGC GCCCTCATTG TCCGCAAGGA
351 AGGCAACCTT CTTATTATCA CACACCCTTA A
This corresponds to the amino acid sequence <SEQ ID 62; ORF13-1>:
1 ..AVLIIELLTG TVYLLVVSAA LAGSGIAYGL TGSTPAAVLT XALLSALGIX
51 FVHAKTAVRK VETDSYQDLD AGQYVEILRH TGGNRYEVFY RGTHWQAQNT
101 GQEELEPGTR ALIVRKEGNL LIITHP*
Computer analysis of this amino acid sequence gave the following results:
Homology with a predicted ORF from N. meningitidis (strain A)
ORF13 shows 92.9% identity over a 126 amino acid overlap with an ORF (ORF13a) from strain A of N. meningitidis:
10 20 30 40 50
orf13.pep AVLIIELLTGTVYLLVVSAALAGSGIAYGLTGSTPAAVLTXALLSALGIXF
|||||||||||||||||||||||||||||||||||||||| |||||||| |
orf13a MTVWFVAAVAVLIIELLTGTVYLLVVSAALAGSGIAYGLTGSTPAAVLTAALLSALGIWF
10 20 30 40 50 60
60 70 80 90 100 110
orf13.pep VHAKTAVRKVETDSYQDLDAGQYVEILRHTGGNRYEVXYRGTXWQAQNTGQEELEPGTRA
||||||| |||||||||||||||:|||||:||||||| |||| |||||||||||||||||
orf13a VHAKTAVGKVETDSYQDLDAGQYAEILRHAGGNRYEVFYRGTHWQAQNTGQEELEPGTRA
70 80 90 100 110 120
120
orf13.pep LIVRKEGNLLIITHPX
||||||||||||::||
orf13a LIVRKEGNLLIIAKPX
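The 92.9% figure can be reproduced by direct counting. The sketch below is illustrative only: it assumes a gap-free overlap that starts at residue 10 of ORF13a, that undetermined residues (X) count as mismatches, and that the stop codon is excluded from the overlap. The function name and the offset constant are hypothetical; the program originally used for the comparison is not stated in this section.

```python
def percent_identity(a: str, b: str) -> tuple[int, int, float]:
    """Return (identical positions, overlap length, percent identity)
    for two equal-length, gap-free aligned protein sequences."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    # Assumption: an undetermined residue 'X' is never scored as a match.
    identical = sum(1 for x, y in zip(a, b) if x == y and x != "X")
    return identical, len(a), round(100.0 * identical / len(a), 1)


# ORF13 (SEQ ID 60) and ORF13a (SEQ ID 64), copied from the listings, stop codon omitted.
ORF13 = (
    "AVLIIELLTGTVYLLVVSAALAGSGIAYGLTGSTPAAVLTXALLSALGIXF"
    "VHAKTAVRKVETDSYQDLDAGQYVEILRHTGGNRYEVXYRGTXWQAQNTGQEELEPGTRA"
    "LIVRKEGNLLIITHP"
)
ORF13A = (
    "MTVWFVAAVAVLIIELLTGTVYLLVVSAALAGSGIAYGLTGSTPAAVLTAALLSALGIWF"
    "VHAKTAVGKVETDSYQDLDAGQYAEILRHAGGNRYEVFYRGTHWQAQNTGQEELEPGTRA"
    "LIVRKEGNLLIIAKP"
)
OFFSET = 9  # hypothetical name: ORF13 starts at residue 10 of ORF13a

identical, overlap, pct = percent_identity(ORF13, ORF13A[OFFSET:OFFSET + len(ORF13)])
print(identical, overlap, pct)  # 117 126 92.9
```

Four of the nine mismatches come from X positions in the partial ORF13 sequence, which is why the identity rises once those residues are resolved in ORF13-1.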
130
The complete length ORF13a nucleotide sequence <SEQ ID 63> is:
1 ATGACTGTAT GGTTTGTTGC CGCTGTTGCC GTCTTAATCA TCGAATTATT
51 GACGGGAACG GTTTATCTTT TGGTTGTCAG CGCGGCTTTG GCGGGTTCGG
101 GCATTGCTTA CGGGCTGACC GGCAGCACGC CTGCCGCCGT CTTGACCGCC
151 GCTCTGCTTT CCGCGCTGGG TATTTGGTTC GTACACGCCA AAACCGCCGT
201 GGGAAAAGTT GAAACGGATT CATATCAGGA TTTGGATGCC GGGCAATATG
251 CCGAAATCCT CCGGCACGCA GGCGGCAACC GTTACGAAGT TTTTTATCGC
301 GGTACGCACT GGCAGGCTCA AAATACGGGG CAAGAAGAGC TTGAACCAGG
351 AACGCGCGCC CTAATCGTCC GCAAGGAAGG CAACCTTCTT ATCATCGCAA
401 AACCTTAA
This encodes a protein having amino acid sequence <SEQ ID 64>:
1 MTVWFVAAVA VLIIELLTGT VYLLVVSAAL AGSGIAYGLT GSTPAAVLTA
51 ALLSALGIWF VHAKTAVGKV ETDSYQDLDA GQYAEILRHA GGNRYEVFYR
101 GTHWQAQNTG QEELEPGTRA LIVRKEGNLL IIAKP*
ORF13a and ORF13-1 show 94.4% identity over a 126 amino acid overlap:
10 20 30 40 50 60
orf13a.pep MTVWFVAAVAVLIIELLTGTVYLLVVSAALAGSGIAYGLTGSTPAAVLTAALLSALGIWF
|||||||||||||||||||||||||||||||||||||||| |||||||| |
orf13-1 AVLIIELLTGTVYLLVVSAALAGSGIAYGLTGSTPAAVLTXALLSALGIXF
10 20 30 40 50
70 80 90 100 110 120
orf13a.pep VHAKTAVGKVETDSYQDLDAGQYAEILRHAGGNRYEVFYRGTHWQAQNTGQEELEPGTRA
||||||| |||||||||||||||:|||||:||||||||||||||||||||||||||||||
orf13-1 VHAKTAVRKVETDSYQDLDAGQYVEILRHTGGNRYEVFYRGTHWQAQNTGQEELEPGTRA
60 70 80 90 100 110
orf13a.pep LIVRKEGNLLIIAKPX
||||||||||||::||
orf13-1 LIVRKEGNLLIITHPX
120
Homology with a predicted ORF from N. gonorrhoeae
ORF13 shows 89.7% identity over a 126 amino acid overlap with a predicted ORF (ORF13.ng) from N. gonorrhoeae:
orf13 AVLIIELLTGTVYLLVVSAALAGSGIAYGLTGSTPAAVLTXALLSALGIXF 51
|||||||||||||||||||||||||||||||||||||||| |||||||| |
orf13ng MTVWFVAAVAVLIIELLTGTVYLLVVSAALAGSGIAYGLTGSTPAAVLTAALLSALGIWF 60
orf13 VHAKTAVRKVETDSYQDLDAGQYVEILRHTGGNRYEVXYRGTXWQAQNTGQEELEPGTRA 111
||||||| |||||||||||:|:|:||||:|||||||| |||| ||||||||| :||||||
orf13ng VHAKTAVGKVETDSYQDLDTGKYAEILRYTGGNRYEVFYRGTHWQAQNTGQEVFEPGTRA 120
orf13 LIVRKEGNLLIITHP 126
||||||||||||::|
orf13ng LIVRKEGNLLIIANP 135
The complete length ORF13ng nucleotide sequence <SEQ ID 65> is:
1 ATGACTGTAT GGTTTGTTGC CGCTGTTGCC GTCTTAATCA TCGAATTATT
51 GACGGGAACG GTTTATCTTT TGGTTGTCAG CGCGGCTTTG GCGGGTTCGG
101 GCATTGCCTA CGGGCTGACT GGCAGCACGC CTGCCGCCGT CTTGACCGCC
151 GCACTGCTTT CCGCGCTGGG CATTTGGTTC GTACATGCCA AAACCGCCGT
201 GGGAAAAGTT GAAACGGATT CATATCAGGA TTTGGATACC GGAAAATATG
251 CCGAAATCCT CCGATACACA GGCGGCAACC GTTACGAAGT TTTTTATCGC
301 GGTACGCACT GGCAGGCGCA AAATACGGGG CAGGAAGTGT TTGAACCGGG
351 AACGCGCGCC CTCATCGTCC GCAAAGAAGG TAACCTTCTT ATCATCGCAA
401 ACCCTTAA
This encodes a protein having amino acid sequence <SEQ ID 66>:
1 MTVWFVAAVA VLIIELLTGT VYLLVVSAAL AGSGIAYGLT GSTPAAVLTA
51 ALLSALGIWF VHAKTAVGKV ETDSYQDLDT GKYAEILRYT GGNRYEVFYR
101 GTHWQAQNTG QEVFEPGTRA LIVRKEGNLL IIANP*
ORF13ng and ORF13-1 show 91.3% identity over a 126 amino acid overlap:
10 20 30 40 50
orf13-1.pep AVLIIELLTGTVYLLVVSAALAGSGIAYGLTGSTPAAVLTXALLSALGIXF
|||||||||||||||||||||||||||||||||||||||| |||||||| |
orf13ng MTVWFVAAVAVLIIELLTGTVYLLVVSAALAGSGIAYGLTGSTPAAVLTAALLSALGIWF
10 20 30 40 50 60
60 70 80 90 100 110
orf13-1.pep VHAKTAVRKVETDSYQDLDAGQYVEILRHTGGNRYEVFYRGTHWQAQNTGQEELEPGTRA
||||||| |||||||||||:|:|:||||:||||||||||||||||||||||| :||||||
orf13ng VHAKTAVGKVETDSYQDLDTGKYAEILRYTGGNRYEVFYRGTHWQAQNTGQEVFEPGTRA
70 80 90 100 110 120
120
orf13-1.pep LIVRKEGNLLIITHPX
||||||||||||::||
orf13ng LIVRKEGNLLIIANPX
130
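Based on this analysis, including the extended leader sequence in this protein, it is predicted that ORF13 and ORF13ng are likely to be outer-membrane proteins. It is thus predicted that these proteins from N. meningitidis and N. gonorrhoeae, and their epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies.
Example 9
The following DNA sequence was identified in N. meningitidis <SEQ ID 67>: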
1 ATGTwTGATT TCGGTTTrGG CGArCTGGTT TTTGTCGGCA TTATCGCCCT
51 GATwGtCCTC GGCCCCGAAC GCsTGCCCGA GGCCGCCCGC AyCGCCGGAC
101 GGcTCATCGG CAGGCTGCAA CGCTTTGTCG GcAGCGTCAA ACAGGAATTT
151 GACACTCAAA TCGAACTGGA AGAACTGAGG AAGGCAAAGC AGGAATTTGA
201 AGCTGCCGcC GCTCAGGTTC GAGACAGCCT CAAAGAAACC GGTACGGATA
251 TGGAAGGCAA TCTGCACGAC ATTTCCGACG GTCTGAAGCC TTGGGAAAAA
301 CTGCCCGAAC AGCGGACACC TGCCGATTTC GGTGTCGATG AAAACGGCAA
351 TCCGCT.TCC CGATGCGGCA AACACCCTAT CAGACGGCAT TTCCGACGTT
401 ATGCCGTC..
This corresponds to the amino acid sequence <SEQ ID 68; ORF2>:
1 MXDFGLGELV FVGIIALIVL GPERXPEAAR XAGRLIGRLQ RFVGSVKQEF
51 DTQIELEELR KAKQEFEAAA AQVRDSLKET GTDMEGNLHD ISDGLKPWEK
101 LPEQRTPADF GVDENGNPXS RCGKHPIRRH FRRYAV.
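The relationship between the ambiguity codes in the partial DNA sequence (w, r, s, y, k) and the X residues in the translated sequence can be illustrated with a short sketch. It assumes the standard genetic code and the IUPAC nucleotide ambiguity codes, and it assumes that a codon is reported as X only when its possible expansions encode more than one amino acid. The function and table names are hypothetical and this is not necessarily the software used to produce the listing.

```python
from itertools import product

# Standard genetic code, built in the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def translate(dna: str) -> str:
    """Translate in frame 1; emit 'X' when a codon is genuinely ambiguous."""
    dna = dna.upper()
    protein = []
    for i in range(0, len(dna) - 2, 3):
        # Assumption: any unrecognised symbol (for example '.') is treated like N.
        options = [IUPAC.get(base, "ACGT") for base in dna[i:i + 3]]
        residues = {CODON_TABLE["".join(c)] for c in product(*options)}
        protein.append(residues.pop() if len(residues) == 1 else "X")
    return "".join(protein)


# First 30 bases of SEQ ID 67, as listed above.
print(translate("ATGTwTGATTTCGGTTTrGGCGArCTGGTT"))  # MXDFGLGELV
```

Here TwT could encode Tyr or Phe and is reported as X, whereas TTr and GAr each encode a single residue (Leu and Glu) whichever base is present, which agrees with the first ten residues of ORF2.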
Further work revealed the complete nucleotide sequence <SEQ ID 69>:
1 ATGTTTGATT TCGGTTTGGG CGAGCTGGTT TTTGTCGGCA TTATCGCCCT
51 GATTGTCCTC GGCCCCGAAC GCCTGCCCGA GGCCGCCCGC ACCGCCGGAC
101 GGCTCATCGG CAGGCTGCAA CGCTTTGTCG GCAGCGTCAA ACAGGAATTT
151 GACACTCAAA TCGAACTGGA AGAACTGAGG AAGGCAAAGC AGGAATTTGA
201 AGCTGCCGCC GCTCAGGTTC GAGACAGCCT CAAAGAAACC GGTACGGATA
251 TGGAAGGCAA TCTGCACGAC ATTTCCGACG GTCTGAAGCC TTGGGAAAAA
301 CTGCCCGAAC AGCGGACACC TGCCGATTTC GGTGTCGATG AAAACGGCAA
351 TCCGCTTCCC GATGCGGCAA ACACCCTATC AGACGGCATT TCCGACGTTA
401 TGCCGTCCGA ACGTTCCTAC GCTTCCGCCG AAACCCTTGG GGACAGCGGG
451 CAAACCGGCA GTACAGCCGA ACCCGCGGAA ACCGACCAAG ACCGCGCATG
501 GCGGGAATAC CTGACTGCTT CTGCCGCCGC ACCCGTCGTA CAGACCGTCG
551 AAGTCAGCTA TATCGATACT GCTGTTGAAA CGCCTGTTCC GCACACCACT
601 TCCCTGCGCA AACAGGCAAT AAGCCGCAAA CGCGATTTTC GTCCGAAACA
651 CCGCGCCAAA CCTAAATTGC GCGTCCGTAA ATCATAA
This corresponds to the amino acid sequence <SEQ ID 70; ORF2-1>:
1 MFDFGLGELV FVGIIALIVL GPERLPEAAR TAGRLIGRLQ RFVGSVKQEF
51 DTQIELEELR KAKQEFEAAA AQVRDSLKET GTDMEGNLHD ISDGLKPWEK
101 LPEQRTPADF GVDENGNPLP DAANTLSDGI SDVMPSERSY ASAETLGDSG
151 QTGSTAEPAE TDQDRAWREY LTASAAAPVV QTVEVSYIDT AVETPVPHTT
201 SLRKQAISRK RDFRPKHRAK PKLRVRKS*
Further work identified the corresponding gene in strain A of N. meningitidis <SEQ ID 71>:
1 ATGTTTGATT TCGGTTTGGG CGAGCTGGTT TTTGTCGGCA TTATCGCCCT
51 GATTGTCCTC GGCCCCGAAC GCCTGCCCGA GGCCGCCCGC ACCGCCGGAC
101 GGCTCATCGG CAGGCTGCAA CGCTTTGTCG GCAGCGTCAA ACAGGAATTT
151 GACACGCAAA TCGAACTGGA AGAACTAAGG AAGGCAAAGC AGGAATTTGA
201 AGCTGCCGCT GCTCAGGTTC GAGACAGCCT CAAAGAAACC GGTACGGATA
251 TGGAGGGTAA TCTGCACGAC ATTTCCGACG GTCTGAAGCC TTGGGAAAAA
301 CTGCCCGAAC AGCGCACGCC TGCTGATTTC GGTGTCGATG AAAACGGCAA
351 TCCCTTTCCC GATGCGGCAA ACACCCTATT AGACGGCATT TCCGACGTTA
401 TGCCGTCCGA ACGTTCCTAC GCTTCCGCCG AAACCCTTGG GGACAGCGGG
451 CAAACCGGCA GTACAGCCGA ACCCGCGGAA ACCGACCAAG ACCGTGCATG
501 GCGGGAATAC CTGACTGCTT CTGCCGCCGC ACCCGTCGTA CAGACCGTCG
551 AAGTCAGCTA TATCGATACC GCTGTTGAAA CCCCTGTTCC GCATACCACT
601 TCGCTGCGTA AACAGGCAAT AAGCCGCAAA CGCGATTTGC GTCCTAAATC
651 CCGCGCCAAA CCTAAATTGC GCGTCCGTAA ATCATAA
This encodes a protein having amino acid sequence <SEQ ID 72; ORF2a>:
1 MFDFGLGELV FVGIIALIVL GPERLPEAAR TAGRLIGRLQ RFVGSVKQEF
51 DTQIELEELR KAKQEFEAAA AQVRDSLKET GTDMEGNLHD ISDGLKPWEK
101 LPEQRTPADF GVDENGNPFP DAANTLLDGI SDVMPSERSY ASAETLGDSG
151 QTGSTAEPAE TDQDRAWREY LTASAAAPVV QTVEVSYIDT AVETPVPHTT
201 SLRKQAISRK RDLRPKSRAK PKLRVRKS*
The originally-identified partial strain B sequence (ORF2) shows 97.5% identity over a 118 amino acid overlap with ORF2a:
10 20 30 40 50 60
orf2.pep MXDFGLGELVFVGIIALIVLGPERXPEAARXAGRLIGRLQRFVGSVKQEFDTQIELEELR
| |||||||||||||||||||||| ||||| |||||||||||||||||||||||||||||
orf2a MFDFGLGELVFVGIIALIVLGPERLPEAARTAGRLIGRLQRFVGSVKQEFDTQIELEELR
10 20 30 40 50 60
70 80 90 100 110 120
orf2.pep KAKQEFEAAAAQVRDSLKETGTDMEGNLHDISDGLKPWEKLPEQRTPADFGVDENGNPXS
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf2a KAKQEFEAAAAQVRDSLKETGTDMEGNLHDISDGLKPWEKLPEQRTPADFGVDENGNPFP
70 80 90 100 110 120
130
orf2.pep RCGKHPIRRHFRRYAV
orf2a DAANTLLDGISDVMPSERSYASAETLGDSGQTGSTAEPAETDQDRAWREYLTASAAAPVV
130 140 150 160 170 180
The complete strain B sequence (ORF2-1) and ORF2a show 98.2% identity over a 228 amino acid overlap:
orf2a.pep MFDFGLGELVFVGIIALIVLGPERLPEAARTAGRLIGRLQRFVGSVKQEFDTQIELEELR 60
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf2-1 MFDFGLGELVFVGIIALIVLGPERLPEAARTAGRLIGRLQRFVGSVKQEFDTQIELEELR 60
orf2a.pep KAKQEFEAAAAQVRDSLKETGTDMEGNLHDISDGLKPWEKLPEQRTPADFGVDENGNPFP 120
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||:|
orf2-1 KAKQEFEAAAAQVRDSLKETGTDMEGNLHDISDGLKPWEKLPEQRTPADFGVDENGNPLP 120
orf2a.pep DAANTLLDGISDVMPSERSYASAETLGDSGQTGSTAEPAETDQDRAWREYLTASAAAPVV 180
|||||| |||||||||||||||||||||||||||||||||||||||||||||||||||||
orf2-1 DAANTLSDGISDVMPSERSYASAETLGDSGQTGSTAEPAETDQDRAWREYLTASAAAPVV 180
orf2a.pep QTVEVSYIDTAVETPVPHTTSLRKQAISRKRDLRPKSRAKPKLRVRKSX 229
||||||||||||||||||||||||||||||||:||| ||||||||||||
orf2-1 QTVEVSYIDTAVETPVPHTTSLRKQAISRKRDFRPKHRAKPKLRVRKSX 229
Further work identified a partial DNA sequence in N. gonorrhoeae <SEQ ID 73> which encodes the following amino acid sequence <SEQ ID 74; ORF2ng>:
1 MFDFGLGELI FVGIIALIVL GPERLPEAAR TAGRLIGRLQ RFVGSVKQEL
51 DTQIELEELR KVKQAFEAAA AQVRDSLKET DTDMQNSLHD ISDGLKPWEK
101 LPEQRTPADF GVDEKGNSLS RYGKHRIRRH FRRYAV*
Further work identified the complete gonococcal gene sequence <SEQ ID 75>:
1 ATGTTTGATT TCGGTTTGGG CGAGCTGATT TTTGTCGGCA TTATCGCCCT
51 GATTGTCCTT GGTCCAGAAC GCCTGCCCGA AGCCGCCCGC ACTGCCGGAC
101 GGCTTATCGG CAGGCTGCAA CGCTTTGTAG GAAGCGTCAA ACAAGAACTT
151 GACACTCAAA TCGAACTGGA AGAGCTGAGG AAGGTCAAGC AGGCATTCGA
201 AGCTGCCGCC GCTCAGGTTC GAGACAGCCT CAAAGAAACC GATACGGATA
251 TGCAGAACAG TCTGCACGAC ATTTCCGACG GTCTGAAGCC TTGGGAAAAA
301 CTGCCCGAAC AGCGCACGCc tgccgatttc gGTGTCGATg AAAacggcaa
351 tccccttccc gATACGGCAA ACACCGTATC AGACGGCATT TCCGACGTTA
401 TGCCGTCTGA ACGTTCCGAT ACTtccgcCG AAACCCTTGG GGACGACAGG
451 CAAACCGGCA GTACAGCCGA ACCTGCGGAA ACCGACAAAG ACCGCGCATG
501 GCGGGAATAC CTGactgctt ctgccgccgc acctgtcgta Cagagggccg
551 tcgaagtcag ctaTATCGAT ACTGCTGTTG AAacgcctgT tccgcaCacc
601 acttccctgc gcaAACAGGC AATAAACCGC AAACGCGATT TttgtccgaA
651 ACACCGCGCc aAACCGAAat tgcgcgtcCG TAAATCATAA
This encodes a protein having amino acid sequence <SEQ ID 76; ORF2ng-1>:
1 MFDFGLGELI FVGIIALIVL GPERLPEAAR TAGRLIGRLQ RFVGSVKQEL
51 DTQIELEELR KVKQAFEAAA AQVRDSLKET DTDMQNSLHD ISDGLKPWEK
101 LPEQRTPADF GVDENGNPLP DTANTVSDGI SDVMPSERSD TSAETLGDDR
151 QTGSTAEPAE TDKDRAWREY LTASAAAPVV QRAVEVSYID TAVETPVPHT
201 TSLRKQAINR KRDFCPKHRA KPKLRVRKS*
The originally-identified partial strain B sequence (ORF2) and ORF2ng show 87.5% identity over a 136 amino acid overlap:
orf2.pep MXDFGLGELVFVGIIALIVLGPERXPEAARXAGRLIGRLQRFVGSVKQEFDTQIELEELR 60
| |||||||:|||||||||||||| |||||:||||||||||||||||||:||||||||||
orf2ng MFDFGLGELIFVGIIALIVLGPERLPEAARTAGRLIGRLQRFVGSVKQELDTQIELEELR 60
orf2.pep KAKQEFEAAAAQVRDSLKETGTDMEGNLHDISDGLKPWEKLPEQRTPADFGVDENGNPXS 120
|:|| ||||||||||||||| |||:::|||||||||||||||||||||||||||:||
orf2ng KVKQAFEAAAAQVRDSLKETDTDMQNSLHDISDGLKPWEKLPEQRTPADFGVDEKGNSLP 120
orf2.pep RCGKHPIRRHFRRYAV 136
| ||| ||||||||||
orf2ng RYGKHRIRRHFRRYAV 136
The complete strain B and gonococcal sequences (ORF2-1 and ORF2ng-1) show 91.7% identity over a 229 amino acid overlap:
10 20 30 40 50 60
orf2-1.pep MFDFGLGELVFVGIIALIVLGPERLPEAARTAGRLIGRLQRFVGSVKQEFDTQIELEELR
|||||||||:|||||||||||||||||||||||||||||||||||||||:||||||||||
orf2ng-1 MFDFGLGELIFVGIIALIVLGPERLPEAARTAGRLIGRLQRFVGSVKQELDTQIELEELR
10 20 30 40 50 60
70 80 90 100 110 120
orf2-1.pep KAKQEFEAAAAQVRDSLKETGTDMEGNLHDISDGLKPWEKLPEQRTPADFGVDENGNPLP
|:|| ||||||||||||||| |||:::|||||||||||||||||||||||||||||||||
orf2ng-1 KVKQAFEAAAAQVRDSLKETDTDMQNSLHDISDGLKPWEKLPEQRTPADFGVDENGNPLP
70 80 90 100 110 120
130 140 150 160 170 180
orf2-1.pep DAANTLSDGISDVMPSERSYASAETLGDSGQTGSTAEPAETDQDRAWREYLTASAAAPVV
|:|||:||||||||||||| :|||||||: ||||||||||||:|||||||||||||||||
orf2ng-1 DTANTVSDGISDVMPSERSDTSAETLGDDRQTGSTAEPAETDKDRAWREYLTASAAAPVV
130 140 150 160 170 180
190 200 210 220 229
orf2-1.pep Q-TVEVSYIDTAVETPVPHTTSLRKQAISRKRDFRPKHRAKPKLRVRKSX
| :|||||||||||||||||||||||||:||||| |||||||||||||||
orf2ng-1 QRAVEVSYIDTAVETPVPHTTSLRKQAINRKRDFCPKHRAKPKLRVRKSX
190 200 210 220 230
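Computer analysis of these amino acid sequences indicates a transmembrane region (underlined), and also reveals homology between the gonococcal sequence and the TatB protein of E. coli (34% identity, 59% positives):
gnl|PID|e1292181 (AJ005830) TatB protein [Escherichia coli] Length = 171
Score = 56.6 bits (134), Expect = 1e-07
Identities = 30/88 (34%), Positives = 52/88 (59%), Gaps = 1/88 (1%)
Query: 1 MFDFGLGELIFVGIIALIVLGPERLPEAARTAGRLIGRLQRFVGSVKQELDTQIELEELR 60
MFD G EL+ V II L+VLGP+RLP A +T I L+ +V+ EL +++L+E +
Sbjct: 1 MFDIGFSELLLVFIIGLVVLGPQRLPVAVKTVAGWIRALRSLATTVQNELTQELKLQEFQ 60
Query: 61 -KVKQAFEAAAAQVRDSLKETDTDMQNS 87
+K+ +A+ + LK + +++ +
Sbjct: 61 DSLKKVEKASLTNLTPELKASMDELRQA 88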
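The percentages in the BLAST summary follow directly from the reported counts. The short sketch below (the function name is hypothetical) recomputes them and makes the distinction explicit: 34% of the 88 aligned positions are identical residues, while 59% are "positives", meaning identical or conservatively substituted residues.

```python
def blast_percentages(identities: int, positives: int, gaps: int, length: int) -> dict[str, int]:
    """Recompute the rounded percentages shown in a BLAST summary line."""
    counts = {"identities": identities, "positives": positives, "gaps": gaps}
    return {name: round(100 * count / length) for name, count in counts.items()}


print(blast_percentages(30, 52, 1, 88))
# {'identities': 34, 'positives': 59, 'gaps': 1}
```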
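Based on this analysis, it is predicted that ORF2, ORF2a and ORF2ng are likely to be membrane proteins, and that these proteins from N. meningitidis and N. gonorrhoeae, and their epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies.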
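The method used to predict the transmembrane region is not stated in this section. As a purely illustrative stand-in, the sketch below applies a Kyte-Doolittle hydropathy scan (19-residue window, mean hydropathy above 1.6 taken as a candidate membrane-spanning segment) to the N-terminus of ORF2-1. The function names, window size and threshold are assumptions of this sketch, not parameters taken from the original analysis.

```python
# Kyte-Doolittle hydropathy index for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(seq: str, window: int = 19) -> list[float]:
    """Mean hydropathy of every full window along the sequence."""
    scores = [KYTE_DOOLITTLE[residue] for residue in seq]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]


def has_candidate_tm_segment(seq: str, window: int = 19, threshold: float = 1.6) -> bool:
    """True if any window exceeds the (assumed) membrane-spanning threshold."""
    profile = hydropathy_profile(seq, window)
    return bool(profile) and max(profile) > threshold


# First 40 residues of ORF2-1, taken from the alignments above.
ORF2_1_NTERM = "MFDFGLGELVFVGIIALIVLGPERLPEAARTAGRLIGRLQ"
print(has_candidate_tm_segment(ORF2_1_NTERM))  # True
```

The window beginning at residue 4 (FGLGELVFVGIIALIVLGP) has a mean hydropathy of about 2.2, consistent with a hydrophobic stretch near the N-terminus. Sequences containing undetermined residues (X) would need those positions handled before scanning.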
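ORF2-1 (16kDa) was cloned in the pET and pGEX vectors and expressed in E. coli, as described above. The products of protein expression and purification were analyzed by SDS-PAGE. Figure 3A shows the results of affinity purification of the GST-fusion protein, and Figure 3B shows the results of expression of the His-fusion in E. coli. Purified GST-fusion protein was used to immunize mice, whose sera were used for Western blot (Figure 3C), ELISA (positive result), and FACS analysis (Figure 3D). These experiments confirm that ORF2-1 is a surface-exposed protein, and that it is a useful immunogen.
Example 10
The following partial DNA sequence was identified in N. meningitidis <SEQ ID 77>: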
1 ATGCAAGCAC GGCTGCTGAT ACCTATTCTT TTTTCAGTTT TTATTTTATC
51 CGC.TGCGGG ACACTGACAG GTATTCCATC GCATGGCGgA GkTAAACgCT
101 TTgCGGTCGA ACAAGAACTT GTGGCCGCTT CTGCCAGAGC TGCCGTTAAA
151 GACATGGATT TACAGGCATT ACACGGACGA AAAGTTGCAT TGTACATTGC
251 ATTGATGCAC kGrTwCsTGG CGAATACATA AACAGCCCTG CCGTCCGTAC
301 CGATTACACC TATCCACGTT ACGAAACCAC CGCTGAAACA ACATCAGGCG
351 GTTTGACAGG TTTAACCACT TCTTTATCTA CACTTAATGC CCCTGCACTC
401 TCTCGCACCC AATCAGACGG TAGCGGAAGT AAAAGCAGTC TGGGCTTAAA
451 TATTGGCGGG ATGGGGGATT ATCGAAATGA AACCTTGACG ACTAACCCGC
501 GCGACACTGC CTTTCTTTCC CACTTGGTAC AGACCGTATT TTTCCTGCGC
551 GGCATAGACG TTGTTTCTCC TGCCAATGCC GATACAGATG TGTTTATTAA
601 CATCGACGTA TTCGGAACGA TACGCAACAG AACCGAAATG..
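Numbered listings of this kind can be checked for missing rows by comparing each row's leading index with the running residue count. The sketch below is a hypothetical helper, not part of the original disclosure; it assumes every row starts with a 1-based position followed by space-separated blocks, and that placeholder characters such as "." each occupy one position. Applied to the first rows of the listing above, it reports that the row expected at position 201 is absent.

```python
def find_index_gaps(listing: str) -> list[tuple[int, int]]:
    """Return (expected_index, found_index) for every row whose leading
    position does not follow on from the previous row."""
    gaps = []
    expected = None
    for line in listing.strip().splitlines():
        tokens = line.split()
        if not tokens or not tokens[0].isdigit():
            continue  # skip rows without a leading position
        index = int(tokens[0])
        if expected is not None and index != expected:
            gaps.append((expected, index))
        # Every character in the remaining blocks counts as one position.
        expected = index + sum(len(block) for block in tokens[1:])
    return gaps


EXCERPT = """
1 ATGCAAGCAC GGCTGCTGAT ACCTATTCTT TTTTCAGTTT TTATTTTATC
51 CGC.TGCGGG ACACTGACAG GTATTCCATC GCATGGCGgA GkTAAACgCT
101 TTgCGGTCGA ACAAGAACTT GTGGCCGCTT CTGCCAGAGC TGCCGTTAAA
151 GACATGGATT TACAGGCATT ACACGGACGA AAAGTTGCAT TGTACATTGC
251 ATTGATGCAC kGrTwCsTGG CGAATACATA AACAGCCCTG CCGTCCGTAC
"""

print(find_index_gaps(EXCERPT))  # [(201, 251)]
```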
This corresponds to the amino acid sequence <SEQ ID 78; ORF15>:
1 MQARLLIPIL FSVFILSACG TLTGIPSHGG XKRFAVEQEL VAASARAAVK
51 DMDLQALHGR KVALYIATMG DQGSGSLTGG RYSIDAXXXG EYINSPAVRT
101 DYTYPRYETT AETTSGGLTG LTTSLSTLNA PALSRTQSDG SGSKSSLGLN
151 IGGMGDYRNE TLTTNPRDTA FLSHLVQTVF FLRGIDVVSP ANADTDVFIN
201 IDVFGTIRNR TEM..
Further work revealed the complete nucleotide sequence <SEQ ID 79>:
1 ATGCAAGCAC GGCTGCTGAT ACCTATTCTT TTTTCAGTTT TTATTTTATC
51 CGCCTGCGGG ACACTGACAG GTATTCCATC GCATGGCGGA GGTAAACGCT
101 TTGCGGTCGA ACAAGAACTT GTGGCCGCTT CTGCCAGAGC TGCCGTTAAA
151 GACATGGATT TACAGGCATT ACACGGACGA AAAGTTGCAT TGTACATTGC
201 CACTATGGGC GACCAAGGTT CAGGCAGTTT GACAGGGGGT CGCTACTCCA
251 TTGATGCACT GATTCGTGGC GAATACATAA ACAGCCCTGC CGTCCGTACC
301 GATTACACCT ATCCACGTTA CGAAACCACC GCTGAAACAA CATCAGGCGG
351 TTTGACAGGT TTAACCACTT CTTTATCTAC ACTTAATGCC CCTGCACTCT
401 CTCGCACCCA ATCAGACGGT AGCGGAAGTA AAAGCAGTCT GGGCTTAAAT
451 ATTGGCGGGA TGGGGGATTA TCGAAATGAA ACCTTGACGA CTAACCCGCG
501 CGACACTGCC TTTCTTTCCC ACTTGGTACA GACCGTATTT TTCCTGCGCG
551 GCATAGACGT TGTTTCTCCT GCCAATGCCG ATACAGATGT GTTTATTAAC
601 ATCGACGTAT TCGGAACGAT ACGCAACAGA ACCGAAATGC ACCTATACAA
651 TGCCGAAACA CTGAAAGCCC AAACAAAACT GGAATATTTC GCAGTAGACA
701 GAACCAATAA AAAATTGCTC ATCAAACCAA AAACCAATGC GTTTGAAGCT
751 GCCTATAAAG AAAATTACGC ATTGTGGATG GGGCCGTATA AAGTAAGCAA
801 AGGAATTAAA CCGACGGAAG GATTAATGGT CGATTTCTCC GATATCCGAC
851 CATACGGCAA TCATACGGGT AACTCCGCCC CATCCGTAGA GGCTGATAAC
901 AGTCATGAGG GGTATGGATA CAGCGATGAA GTAGTGCGAC AACATAGACA
951 AGGACAACCT TGA
This corresponds to the amino acid sequence <SEQ ID 80; ORF15-1>:
1 MQARLLIPIL FSVFILSACG TLTGIPSHGG GKRFAVEQEL VAASARAAVK
51 DMDLQALHGR KVALYIATMG DQGSGSLTGG RYSIDALIRG EYINSPAVRT
101 DYTYPRYETT AETTSGGLTG LTTSLSTLNA PALSRTQSDG SGSKSSLGLN
151 IGGMGDYRNE TLTTNPRDTA FLSHLVQTVF FLRGIDVVSP ANADTDVFIN
201 IDVFGTIRNR TEMHLYNAET LKAQTKLEYF AVDRTNKKLL IKPKTNAFEA
251 AYKENYALWM GPYKVSKGIK PTEGLMVDFS DIRPYGNHTG NSAPSVEADN
301 SHEGYGYSDE VVRQHRQGQP*
Further work identified the corresponding gene in strain A of N. meningitidis <SEQ ID 81>:
1 ATGCAAGCAC GGCTGCTGAT ACCTATTCTT TTTTCAGTTT TTATTTTATC
51 CGCCTGCGGG ACACTGACAG GTATTCCATC GCATGGCGGA GGTAAACGCT
101 TTGCGGTCGA ACAAGAACTT GTGGCCGCTT CTGCCAGAGC TGCCGTTAAA
151 GACATGGATT TACAGGCATT ACACGGACGA AAAGTTGCAT TGTACATTGC
201 AACTATGGGC GACCAAGGTT CAGGCAGTTT GACAGGGGGT CGCTACTCCA
251 TTGATGCACT GATTCGTGGC GAATACATAA ACAGCCCTGC CGTCCGTACC
301 GATTACACCT ATCCACGTTA CGAAACCACC GCTGAAACAA CATCAGGCGG
351 TTTGACAGGT TTAACCACTT CTTTATCTAC ACTTAATGCC CCTGCACTCT
401 CGCGCACCCA ATCAGACGGT AGCGGAAGTA AAAGCAGTCT GGGCTTAAAT
451 ATTGGCGGGA TGGGGGATTA TCGAAATGAA ACCTTGACGA CTAACCCGCG
501 CGACACTGCC TTTCTTTCCC ACTTGGTACA GACCGTATTT TTCCTGCGCG
551 GCATAGACGT TGTTTCTCCT GCCAATGCCG ATACGGATGT GTTTATTAAC
601 ATCGACGTAT TCGGAACGAT ACGCAACAGA ACCGAAATGC ACCTATACAA
651 TGCCGAAACA CTGAAAGCCC AAACAAAACT GGAATATTTC GCAGTAGACA
701 GAACCAATAA AAAATTGCTC ATCAAACCAA AAACCAATGC GTTTGAAGCT
751 GCCTATAAAG AAAATTACGC ATTGTGGATG GGACCGTATA AAGTAAGCAA
801 AGGAATTAAA CCGACAGAAG GATTAATGGT CGATTTCTCC GATATCCAAC
851 CATACGGCAA TCATATGGGT AACTCTGCCC CATCCGTAGA GGCTGATAAC
901 AGTCATGAGG GGTATGGATA CAGCGATGAA GCAGTGCGAC GACATAGACA
951 AGGGCAACCT TGA
This encodes a protein having amino acid sequence <SEQ ID 82; ORF15a>:
1 MQARLLIPIL FSVFILSACG TLTGIPSHGG GKRFAVEQEL VAASARAAVK
51 DMDLQALHGR KVALYIATMG DQGSGSLTGG RYSIDALIRG EYINSPAVRT
101 DYTYPRYETT AETTSGGLTG LTTSLSTLNA PALSRTQSDG SGSKSSLGLN
151 IGGMGDYRNE TLTTNPRDTA FLSHLVQTVF FLRGIDVVSP ANADTDVFIN
201 IDVFGTIRNR TEMHLYNAET LKAQTKLEYF AVDRTNKKLL IKPKTNAFEA
251 AYKENYALWM GPYKVSKGIK PTEGLMVDFS DIQPYGNHMG NSAPSVEADN
301 SHEGYGYSDE AVRRHRQGQP*
The originally-identified partial strain B sequence (ORF15) and ORF15a show 98.1% identity over a 213 amino acid overlap:
10 20 30 40 50 60
orf15.pep MQARLLIPILFSVFILSACGTLTGIPSHGGXKRFAVEQELVAASARAAVKDMDLQALHGR
|||||||||||||||||||||||||||||| |||||||||||||||||||||||||||||
orf15a MQARLLIPILFSVFILSACGTLTGIPSHGGGKRFAVEQELVAASARAAVKDMDLQALHGR
10 20 30 40 50 60
70 80 90 100 110 120
orf15.pep KVALYIATMGDQGSGSLTGGRYSIDAXXXGEYINSPAVRTDYTYPRYETTAETTSGGLTG
||||||||||||||||||||||||||   |||||||||||||||||||||||||||||||
orf15a KVALYIATMGDQGSGSLTGGRYSIDALIRGEYINSPAVRTDYTYPRYETTAETTSGGLTG
70 80 90 100 110 120
130 140 150 160 170 180
orf15.pep LTTSLSTLNAPALSRTQSDGSGSKSSLGLNIGGMGDYRNETLTTNPRDTAFLSHLVQTVF
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf15a LTTSLSTLNAPALSRTQSDGSGSKSSLGLNIGGMGDYRNETLTTNPRDTAFLSHLVQTVF
130 140 150 160 170 180
190 200 210
orf15.pep FLRGIDVVSPANADTDVFINIDVFGTIRNRTEM
|||||||||||||||||||||||||||||||||
orf15a FLRGIDVVSPANADTDVFINIDVFGTIRNRTEMHLYNAETLKAQTKLEYFAVDRTNKKLL
190 200 210 220 230 240
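The identity figure quoted for this alignment can be reproduced by counting matching columns. The sketch below is illustrative only: it assumes that unresolved residues (X) count as mismatches and that the overlap length is the denominator, which is consistent with the 98.1% reported but is not stated in the text. The function and variable names are hypothetical.

```python
def percent_identity(a: str, b: str, unknown: str = "X") -> float:
    """Percent of aligned columns holding the same, resolved residue."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    # Assumption: an unresolved residue (X) never counts as a match.
    matches = sum(1 for x, y in zip(a, b) if x == y and x != unknown)
    return 100.0 * matches / len(a)


# First 213 residues of ORF15a, copied from the alignment above.
ORF15A_213 = (
    "MQARLLIPILFSVFILSACGTLTGIPSHGGGKRFAVEQELVAASARAAVKDMDLQALHGR"
    "KVALYIATMGDQGSGSLTGGRYSIDALIRGEYINSPAVRTDYTYPRYETTAETTSGGLTG"
    "LTTSLSTLNAPALSRTQSDGSGSKSSLGLNIGGMGDYRNETLTTNPRDTAFLSHLVQTVF"
    "FLRGIDVVSPANADTDVFINIDVFGTIRNRTEM"
)

# In the alignment, partial ORF15 differs from ORF15a only by unresolved
# residues at 1-based positions 31 and 87-89.
masked = list(ORF15A_213)
for pos in (31, 87, 88, 89):
    masked[pos - 1] = "X"
ORF15_PARTIAL = "".join(masked)

identity = percent_identity(ORF15_PARTIAL, ORF15A_213)
print(f"{identity:.1f}% identity over {len(ORF15A_213)} residues")
```

Four unresolved positions out of 213 leave 209 matching columns, which rounds to the 98.1% given in the text.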
The complete strain B sequence (ORF15-1) and ORF15a show 98.8% identity over a 320 amino acid overlap:
10 20 30 40 50 60
orf15a.pep MQARLLIPILFSVFILSACGTLTGIPSHGGGKRFAVEQELVAASARAAVKDMDLQALHGR
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf15-1 MQARLLIPILFSVFILSACGTLTGIPSHGGGKRFAVEQELVAASARAAVKDMDLQALHGR
10 20 30 40 50 60
70 80 90 100 110 120
orf15a.pep KVALYIATMGDQGSGSLTGGRYSIDALIRGEYINSPAVRTDYTYPRYETTAETTSGGLTG
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf15-1 KVALYIATMGDQGSGSLTGGRYSIDALIRGEYINSPAVRTDYTYPRYETTAETTSGGLTG
70 80 90 100 110 120
130 140 150 160 170 180
orf15a.pep LTTSLSTLNAPALSRTQSDGSGSKSSLGLNIGGMGDYRNETLTTNPRDTAFLSHLVQTVF
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf15-1 LTTSLSTLNAPALSRTQSDGSGSKSSLGLNIGGMGDYRNETLTTNPRDTAFLSHLVQTVF
130 140 150 160 170 180
190 200 210 220 230 240
orf15a.pep FLRGIDVVSPANADTDVFINIDVFGTIRNRTEMHLYNAETLKAQTKLEYFAVDRTNKKLL
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf15-1 FLRGIDVVSPANADTDVFINIDVFGTIRNRTEMHLYNAETLKAQTKLEYFAVDRTNKKLL
190 200 210 220 230 240
250 260 270 280 290 300
orf15a.pep IKPKTNAFEAAYKENYALWMGPYKVSKGIKPTEGLMVDFSDIQPYGNHMGNSAPSVEADN
||||||||||||||||||||||||||||||||||||||||||:||||| |||||||||||
orf15-1 IKPKTNAFEAAYKENYALWMGPYKVSKGIKPTEGLMVDFSDIRPYGNHTGNSAPSVEADN
250 260 270 280 290 300
310 320
orf15a.pep SHEGYGYSDEAVRRHRQGQPX
||||||||||:||:|||||||
orf15-1 SHEGYGYSDEVVRQHRQGQPX
310 320
Further work identified the corresponding gene in N. gonorrhoeae <SEQ ID 83>:
1 ATGCGGGCAC GGCTGCTGAT ACCTATTCTT TTTTCAGTTT TTATTTTATC
51 CGCCTGCGGG ACACTGACAG GTATTCCATC GCATGGCGGA GGCAAACGCT
101 TCGCGGTCGA ACAAGAACTT GTGGCCGCTT CTGCCAGAGC TGCCGTTAAA
151 GACATGGATT TACAGGCATT ACACGGACGA AAAGTTGCAT TGTACATTGC
201 AACTATGGGC GACCAAGGTT CAGGCAGTTT GACAGGGGGT CGCTACTCCA
251 TTGATGCACT GATTCGCGGC GAATACATAA ACAGCCCTGC CGTCCGCACC
301 GATTACACCT ATCCGCGTTA CGAAACCACC GCTGAAACAA CATCAGGCGG
351 TTTGACGGGT TTAACCACTT CTTTATCTAC ACTTAATGCC CCTGCACTCT
401 CGCGCACCCA ATCAGACGGT AGCGGAAGTA GGAGCAGTCT GGGCTTAAAT
451 ATTGGCGGGA TGGGGGATTA TCGAAATGAA ACCTTGACGA CCAACCCGCG
501 CGACACTGCC TTTCTTTCCC ACTTGGTGCA GACCGTATTT TTCCTGCGCG
551 GCATAGACGT TGTTTCTCCT GCCAATGCCG ATACAGATGT GTTTATTAAC
601 ATCGACGTAT TCGGAACGAT ACGCAACAGA ACCGAAATGC ACCTATACAA
651 TGCCGAAACA CTGAAAGCCC AAACAAAACT GGAATATTTC GCAGTAGACA
701 GAACCAATAA AAAATTGCTC ATCAAACCCA AAACCAATGC GTTTGAAGCT
751 GCCTATAAAG AAAATTACGC ATTGTGGATG GGGCCGTATA AAGTAAGCAA
801 AGGAATCAAA CCGACGGAAG GATTGATGGT CGATTTCTCC GATATCCAAC
851 CATACGGCAA TCATACGGGT AACTCCGCCC CATCCGTAGA GGCTGATAAC
901 AGTCATGAGG GGTATGGATA CAGCGATGAA GCAGTGCGAC AACATAGACA
951 AGGGCAACCT TGA
This encodes a protein having the amino acid sequence <SEQ ID 84; ORF15ng>:
1 MRARLLIPIL FSVFILSACG TLTGIPSHGG GKRFAVEQEL VAASARAAVK
51 DMDLQALHGR KVALYIATMG DQGSGSLTGG RYSIDALIRG EYINSPAVRT
101 DYTYPRYETT AETTSGGLTG LTTSLSTLNA PALSRTQSDG SGSRSSLGLN
151 IGGMGDYRNE TLTTNPRDTA FLSHLVQTVF FLRGIDVVSP ANADTDVFIN
201 IDVFGTIRNR TEMHLYNAET LKAQTKLEYF AVDRTNKKLL IKPKTNAFEA
251 AYKENYALWM GPYKVSKGIK PTEGLMVDFS DIQPYGNHTG NSAPSVEADN
301 SHEGYGYSDE AVRQHRQGQP *
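The relationship between the gonococcal gene <SEQ ID 83> and its protein <SEQ ID 84> can be checked by translating the coding sequence with the standard genetic code. The sketch below is a minimal illustration and not the tool used in the original work; it assumes the standard codon table and reading frame 1, and the helper names are hypothetical. Only the first 60 nucleotides of the listing are used.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def translate(dna: str) -> str:
    """Translate frame 1 of a DNA string; unknown codons become 'X'."""
    dna = "".join(dna.split()).upper()  # drop the block spacing of the listing
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


# First 60 nucleotides of SEQ ID 83, as printed in the listing.
FIRST_60_NT = (
    "ATGCGGGCAC GGCTGCTGAT ACCTATTCTT TTTTCAGTTT TTATTTTATC CGCCTGCGGG"
)
print(translate(FIRST_60_NT))
```

The result, MRARLLIPILFSVFILSACG, matches the first 20 residues of ORF15ng. Codons containing ambiguity symbols, such as the `.` or `N` positions in some partial sequences, are reported as X.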
The originally identified partial strain B sequence (ORF15) and ORF15ng show 97.2% identity over a 213 amino acid overlap:
orf15.pep MQARLLIPILFSVFILSACGTLTGIPSHGGXKRFAVEQELVAASARAAVKDMDLQALHGR 60
|:|||||||||||||||||||||||||||| |||||||||||||||||||||||||||||
orf15ng MRARLLIPILFSVFILSACGTLTGIPSHGGGKRFAVEQELVAASARAAVKDMDLQALHGR 60
orf15.pep KVALYIATMGDQGSGSLTGGRYSIDAXXXGEYINSPAVRTDYTYPRYETTAETTSGGLTG 120
||||||||||||||||||||||||||   |||||||||||||||||||||||||||||||
orf15ng KVALYIATMGDQGSGSLTGGRYSIDALIRGEYINSPAVRTDYTYPRYETTAETTSGGLTG 120
orf15.pep LTTSLSTLNAPALSRTQSDGSGSKSSLGLNIGGMGDYRNETLTTNPRDTAFLSHLVQTVF 180
|||||||||||||||||||||||:||||||||||||||||||||||||||||||||||||
orf15ng LTTSLSTLNAPALSRTQSDGSGSRSSLGLNIGGMGDYRNETLTTNPRDTAFLSHLVQTVF 180
orf15.pep FLRGIDVVSPANADTDVFINIDVFGTIRNRTEM 213
|||||||||||||||||||||||||||||||||
orf15ng FLRGIDVVSPANADTDVFINIDVFGTIRNRTEMHLYNAETLKAQTKLEYFAVDRTNKKLL 240
The complete strain B sequence (ORF15-1) and ORF15ng show 98.8% identity over a 320 amino acid overlap:
10 20 30 40 50 60
orf15-1.pep MQARLLIPILFSVFILSACGTLTGIPSHGGGKRFAVEQELVAASARAAVKDMDLQALHGR
|:||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf15ng MRARLLIPILFSVFILSACGTLTGIPSHGGGKRFAVEQELVAASARAAVKDMDLQALHGR
10 20 30 40 50 60
70 80 90 100 110 120
orf15-1.pep KVALYIATMGDQGSGSLTGGRYSIDALIRGEYINSPAVRTDYTYPRYETTAETTSGGLTG
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf15ng KVALYIATMGDQGSGSLTGGRYSIDALIRGEYINSPAVRTDYTYPRYETTAETTSGGLTG
70 80 90 100 110 120
130 140 150 160 170 180
orf15-1.pep LTTSLSTLNAPALSRTQSDGSGSKSSLGLNIGGMGDYRNETLTTNPRDTAFLSHLVQTVF
|||||||||||||||||||||||:||||||||||||||||||||||||||||||||||||
orf15ng LTTSLSTLNAPALSRTQSDGSGSRSSLGLNIGGMGDYRNETLTTNPRDTAFLSHLVQTVF
130 140 150 160 170 180
190 200 210 220 230 240
orf15-1.pep FLRGIDVVSPANADTDVFINIDVFGTIRNRTEMHLYNAETLKAQTKLEYFAVDRTNKKLL
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf15ng FLRGIDVVSPANADTDVFINIDVFGTIRNRTEMHLYNAETLKAQTKLEYFAVDRTNKKLL
190 200 210 220 230 240
250 260 270 280 290 300
orf15-1.pep IKPKTNAFEAAYKENYALWMGPYKVSKGIKPTEGLMVDFSDIRPYGNHTGNSAPSVEADN
||||||||||||||||||||||||||||||||||||||||||:|||||||||||||||||
orf15ng IKPKTNAFEAAYKENYALWMGPYKVSKGIKPTEGLMVDFSDIQPYGNHTGNSAPSVEADN
250 260 270 280 290 300
310 320
orf15-1.pep SHEGYGYSDEVVRQHRQGQPX
||||||||||:||||||||||
orf15ng SHEGYGYSDEAVRQHRQGQPX
310 320
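Computer analysis of these amino acid sequences revealed an ILSAC motif (a putative membrane lipoprotein lipid attachment site, as predicted by the MOTIFS program), which suggests a putative leader sequence. The proteins from N. meningitidis and N. gonorrhoeae, and their epitopes, are therefore predicted to be useful as antigens for vaccines or diagnostics, or for raising antibodies.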
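A motif search of this kind can be sketched with a regular expression. The pattern below is an assumption: it follows the commonly cited PROSITE-style consensus for the prokaryotic membrane lipoprotein lipid attachment site, and it may not be the exact rule that the MOTIFS program applied. Additional positional conditions that some implementations impose are omitted, and the function name is hypothetical.

```python
import re

# Assumed consensus: six residues that are not D, E, R or K, followed by
# [LIVMFWSTAG]{2} [LIVMFYSTAGCQ] [AGS] and the lipid-attached Cys.
LIPOBOX = re.compile(r"[^DERK]{6}[LIVMFWSTAG]{2}[LIVMFYSTAGCQ][AGS]C")

# N-terminal 30 residues of the two proteins, copied from the listings.
N_TERMINI = {
    "ORF15a": "MQARLLIPILFSVFILSACGTLTGIPSHGG",
    "ORF15ng": "MRARLLIPILFSVFILSACGTLTGIPSHGG",
}


def lipid_attachment_cys(seq: str):
    """Return the 1-based position of the candidate Cys, or None."""
    match = LIPOBOX.search(seq)
    # The match ends on the Cys, so the exclusive end index equals its
    # 1-based position in the sequence.
    return match.end() if match else None


for name, seq in N_TERMINI.items():
    print(name, lipid_attachment_cys(seq))
```

Under this assumed pattern, both sequences match with the Cys of ILSAC at position 19, which agrees with the motif highlighted in the text.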
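ORF15-1 (31.7 kDa) was cloned into pET and pGex vectors and expressed in E. coli, as described above. The products of protein expression and purification were analysed by SDS-PAGE. Figure 4A shows the results of affinity purification of the GST-fusion protein, and Figure 4B shows the results of expression of the His-fusion in E. coli. Purified GST-fusion protein was used to immunise mice, and the mouse sera were used for Western blotting (Figure 4C) and ELISA (positive result). These results confirm that ORF15-1 is a surface-exposed protein and a useful immunogen.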
Example 11
The following partial DNA sequence was identified in N. meningitidis <SEQ ID 85>:
1 ..GG.CAGCACA AAAAACAGGC GGTTAACGG AAAAACCGTA TTTACGATGA
51 TGCCGGGTAT GATATTCGGC GTATTCACGG GCGCATTCTC CGCAAAATAT
101 ATCCCCGCGT TCGGGCTTCA AATTTTCTTC ATCCTGTTTT TAACCGCCGT
151 CGCATTCAAA ACACTGCATA CCGACCCTCA GACGGCATCC CGCCCGCTGC
201 CCGGACTGCC CrGACTGACT GCGGTTTCCA CACTGTTCGG CACAATGTCG
251 AGCTGGGTCG GCATAGGCGG CGGTTCACTT TCCGTCCCCT TCTTAATCCA
301 CTGCGGCTTC CCCGCCCATA AAGCCATCGG CACATCATCC GGCCTTGCCT
351 GGCCGATTGC ACTCTCCGGC GCAATATCGT ATCTGCTCAA CGGCCTGAAT
401 ATTGCAGGAT TGCCCGAAGG GTCACTGGGC TTCCTTTACC TGCCCGCCGT
451 CGCCGTCCTC AGCGCGGCAA CCATTGCCTT TGCCCCGCTC GGTGTCAAAA
501 CCGCCCACAA ACTTTCTTCT GCCAAACTCA AAAAATC.TT CGGCATTATG
551 TTGCTTTTGA TTGCCGGAAA AATGCTGTAC AACCTGCTTT AA
This corresponds to the amino acid sequence <SEQ ID 86; ORF17>:
1 ..GQHKKQAVNG KTVFTMMPGM IFGVFTGAFS AKYIPAFGLQ IFFILFLTAV
51 AFKTLHTDPQ TASRPLPGLP XLTAVSTLFG TMSSWVGIGG GSLSVPFLIH
101 CGFPAHKAIG TSSGLAWPIA LSGAISYLLN GLNIAGLPEG SLGFLYLPAV
151 AVLSAATIAF APLGVKTAHK LSSAKLKKSF GIMLLLIAGK MLYNLL*
Further work revealed the complete nucleotide sequence <SEQ ID 87>:
1 ATGTGGCATT GGGACATTAT CTTAATCCTG CTTGCCGTAG GCAGTGCGGC
51 AGGTTTTATT GCCGGCCTGT TCGGCGTAGG CGGCGGCACG CTGATTGTCC
101 CTGTCGTTTT ATGGGTGCTT GATTTGCAGG GTTTGGCACA ACATCCTTAC
151 GCGCAACACC TCGCCGTCGG CACATCCTTC GCCGTCATGG TCTTCACCGC
201 CTTTTCCAGT ATGCTGGGGC AGCACAAAAA ACAGGCGGTC GACTGGAAAA
251 CCGTATTTAC GATGATGCCG GGTATGATAT TCGGCGTATT CACGGGCGCA
301 CTCTCCGCAA AATATATCCC CGCGTTCGGG CTTCAAATTT TCTTCATCCT
351 GTTTTTAACC GCCGTCGCAT TCAAAACACT GCATACCGAC CCTCAGACGG
401 CATCCCGCCC GCTGCCCGGA CTGCCCGGAC TGACTGCGGT TTCCACACTG
451 TTCGGCACAA TGTCGAGCTG GGTCGGCATA GGCGGCGGTT CACTTTCCGT
501 CCCCTTCTTA ATCCACTGCG GCTTCCCCGC CCATAAAGCC ATCGGCACAT
551 CATCCGGCCT TGCCTGGCCG ATTGCACTCT CCGGCGCAAT ATCGTATCTG
601 CTCAACGGCC TGAATATTGC AGGATTGCCC GAAGGGTCAC TGGGCTTCCT
651 TTACCTGCCC GCCGTCGCCG TCCTCAGCGC GGCAACCATT GCCTTTGCCC
701 CGCTCGGTGT CAAAACCGCC CACAAACTTT CTTCTGCCAA ACTCAAAAAA
751 Tc.TTCGGCA TTATGTTGCT TTTGATTGCC GGAAAAATGC TGTACAACCT
801 GCTTTAA
This corresponds to the amino acid sequence <SEQ ID 88; ORF17-1>:
1 MWHWDIILIL LAVGSAAGFI AGLFGVGGGT LIVPVVLWVL DLQGLAQHPY
51 AQHLAVGTSF AVMVFTAFSS MLGQHKKQAV DWKTVFTMMP GMIFGVFTGA
101 LSAKYIPAFG LQIFFILFLT AVAFKTLHTD PQTASRPLPG LPGLTAVSTL
151 FGTMSSWVGI GGGSLSVPFL IHCGFPAHKA IGTSSGLAWP IALSGAISYL
201 LNGLNIAGLP EGSLGFLYLP AVAVLSAATI AFAPLGVKTA HKLSSAKLKK
251 XFGIMLLLIA GKMLYNLL*
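Computer analysis of this amino acid sequence gave the following results:
Homology with the hypothetical transmembrane protein HI0902 of H. influenzae (accession number P44070)
ORF17 and the HI0902 protein show 28% amino acid identity over a 192 amino acid overlap: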
ORF17 3 HKKQAVNGKTVFTMMPGMIFGVFT-GAFSAKYIPAFGLQIF--FILFLTAVAFKTLHTDP 59
HK + + V + P ++ VF G F + +IF +++L ++ D
HI0902 72 HKLGNIVWQAVRILAPVIMLSVFICGLFIGRLDREISAKIFACLVVYLATKMVLSIKKD- 130
ORF17 60 QTASRPLPGLPXLTAVSTLFGTMSSWVGIGGGSLSVPFLIHCGFPAHKAIGTSSGLAWPI 119
Q ++ L L + L G SS GIGGG VPFL G +AIG+S+ +
HI0902 131 QVTTKSLTPLSSVIG-GILIGMASSAAGIGGGGFIVPFLTARGINIKQAIGSSAFCGMLL 189
ORF17 120 ALSGAISYLLNGLNIAGLPEGSLGFLYLPAVAVLSAATIAFAPLGVXXXXXXXXXXXXXX 179
+SG S++++G +PE SLG++YLPAV ++A + + LG
HI0902 190 GISGMFSFIVSGWGNPLMPEYSLGYIYLPAVLGITATSFFTSKLGASATAKLPVSTLKKG 249
ORF17 180 FGIMLLLIAGKM 191
F + L+++A M
HI0902 250 FALFLIVVAINM 261
Homology with a predicted ORF from N. meningitidis (strain A)
ORF17 and an ORF from N. meningitidis strain A (ORF17a) show 96.9% identity over a 196 amino acid overlap:
10 20 30
orf17.pep GQHKKQAVNGKTVFTMMPGMIFGVFTGAFS
||||||||: ||||||||||:||||:||:|
orf17a QGLAQHPYAQHLAVGTSFAVMVFTAFSSMLGQHKKQAVDWKTVFTMMPGMVFGVFAGALS
50 60 70 80 90 100
40 50 60 70 80 90
orf17.pep AKYIPAFGLQIFFILFLTAVAFKTLHTDPQTASRPLPGLPXLTAVSTLFGTMSSWVGIGG
|||||||||||||||||||||||||||||||||||||||| |||||||||||||||||||
orf17a AKYIPAFGLQIFFILFLTAVAFKTLHTDPQTASRPLPGLPGLTAVSTLFGTMSSWVGIGG
110 120 130 140 150 160
100 110 120 130 140 150
orf17.pep GSLSVPFLIHCGFPAHKAIGTSSGLAWPIALSGAISYLLNGLNIAGLPEGSLGFLYLPAV
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf17a GSLSVPFLIHCGFPAHKAIGTSSGLAWPIALSGAISYLLNGLNIAGLPEGSLGFLYLPAV
170 180 190 200 210 220
160 170 180 190
orf17.pep AVLSAATIAFAPLGVKTAHKLSSAKLKKSFGIMLLLIAGKMLYNLLX
|||||||||||||||||||||||||||||||||||||||||||||||
orf17a AVLSAATIAFAPLGVKTAHKLSSAKLKKSFGIMLLLIAGKMLYNLLX
230 240 250 260
The complete length ORF17a nucleotide sequence <SEQ ID 89> is:
1 ATGTGGCATT GGGACATTAT CTTAATCCTG CTTGCCGTAG GCAGTGCGGC
51 AGGTTTTATT GCCGGCCTGT TCGGCGTAGG CGGCGGCACG CTGATTGTCC
101 CTGTCGTTTT ATGGGTGCTT GATTTGCAGG GTTTGGCACA ACATCCTTAC
151 GCGCAACACC TCGCCGTCGG CACATCCTTC GCCGTCATGG TCTTCACCGC
201 CTTTTCCAGT ATGCTGGGGC AGCACAAAAA ACAGGCGGTC GACTGGAAAA
251 CCGTATTTAC GATGATGCCG GGTATGGTAT TCGGCGTATT CGCTGGCGCA
301 CTCTCCGCAA AATATATCCC AGCGTTCGGG CTTCAAATTT TCTTCATCCT
351 GTTTTTAACC GCCGTCGCAT TCAAAACACT GCATACCGAC CCTCAGACGG
401 CATCCCGCCC GCTGCCCGGA CTGCCCGGAC TGACTGCGGT TTCCACACTG
451 TTCGGCACAA TGTCGAGCTG GGTCGGCATA GGCGGCGGTT CACTTTCCGT
501 CCCCTTCTTA ATCCACTGCG GCTTCCCCGC CCATAAAGCC ATCGGCACAT
551 CATCCGGCCT TGCCTGGCCG ATTGCACTCT CCGGCGCAAT ATCGTATCTG
601 CTCAACGGCC TGAATATTGC AGGATTGCCC GAAGGGTCAC TGGGCTTCCT
651 TTACCTGCCC GCCGTCGCCG TCCTCAGCGC GGCAACCATT GCCTTTGCCC
701 CGCTCGGTGT CAAAACCGCC CACAAACTTT CTTCTGCCAA ACTCAAAAAA
751 TCCTTCGGCA TTATGTTGCT TTTGATTGCC GGAAAAATGC TGTACAACCT
801 GCTTTAA
This encodes a protein having the amino acid sequence <SEQ ID 90>:
1 MWHWDIILIL LAVGSAAGFI AGLFGVGGGT LIVPVVLWVL DLQGLAQHPY
51 AQHLAVGTSF AVMVFTAFSS MLGQHKKQAV DWKTVFTMMP GMVFGVFAGA
101 LSAKYIPAFG LQIFFILFLT AVAFKTLHTD PQTASRPLPG LPGLTAVSTL
151 FGTMSSWVGI GGGSLSVPFL IHCGFPAHKA IGTSSGLAWP IALSGAISYL
201 LNGLNIAGLP EGSLGFLYLP AVAVLSAATI AFAPLGVKTA HKLSSAKLKK
251 SFGIMLLLIA GKMLYNLL*
ORF17a and ORF17-1 show 98.9% identity over a 268 amino acid overlap:
10 20 30 40 50 60
orf17a.pep MWHWDIILILLAVGSAAGFIAGLFGVGGGTLIVPVVLWVLDLQGLAQHPYAQHLAVGTSF
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf17-1 MWHWDIILILLAVGSAAGFIAGLFGVGGGTLIVPVVLWVLDLQGLAQHPYAQHLAVGTSF
10 20 30 40 50 60
70 80 90 100 110 120
orf17a.pep AVMVFTAFSSMLGQHKKQAVDWKTVFTMMPGMVFGVFAGALSAKYIPAFGLQIFFILFLT
||||||||||||||||||||||||||||||||:||||:||||||||||||||||||||||
orf17-1 AVMVFTAFSSMLGQHKKQAVDWKTVFTMMPGMIFGVFTGALSAKYIPAFGLQIFFILFLT
70 80 90 100 110 120
130 140 150 160 170 180
orf17a.pep AVAFKTLHTDPQTASRPLPGLPGLTAVSTLFGTMSSWVGIGGGSLSVPFLIHCGFPAHKA
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf17-1 AVAFKTLHTDPQTASRPLPGLPGLTAVSTLFGTMSSWVGIGGGSLSVPFLIHCGFPAHKA
130 140 150 160 170 180
190 200 210 220 230 240
orf17a.pep IGTSSGLAWPIALSGAISYLLNGLNIAGLPEGSLGFLYLPAVAVLSAATIAFAPLGVKTA
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf17-1 IGTSSGLAWPIALSGAISYLLNGLNIAGLPEGSLGFLYLPAVAVLSAATIAFAPLGVKTA
190 200 210 220 230 240
250 260 269
orf17a.pep HKLSSAKLKKSFGIMLLLIAGKMLYNLLX
|||||||||| ||||||||||||||||||
orf17-1 HKLSSAKLKKXFGIMLLLIAGKMLYNLLX
250 260
Homology with a predicted ORF from N. gonorrhoeae
ORF17 and a predicted ORF from N. gonorrhoeae (ORF17.ng) show 93.9% identity over a 196 amino acid overlap:
orf17.pep GQHKKQAVNGKTVFTMMPGMIFGVFTGAFS 30
||||||||: ||:|:||||||||||:||:|
orf17ng QGLAQHPYAQHLAVGTSFAVMVFTAFSSMLGQHKKQAVDWKTIFAMMPGMIFGVFAGALS 102
orf17.pep AKYIPAFGLQIFFILFLTAVAFKTLHTDPQTASRPLPGLPXLTAVSTLFGTMSSWVGIGG 90
|||||||||||||||||||||||||||  ||||||||||| |||||||||:|||||||||
orf17ng AKYIPAFGLQIFFILFLTAVAFKTLHTGRQTASRPLPGLPGLTAVSTLFGAMSSWVGIGG 162
orf17.pep GSLSVPFLIHCGFPAHKAIGTSSGLAWPIALSGAISYLLNGLNIAGLPEGSLGFLYLPAV 150
||||||||||||||||||||||||||||||||||||||:|||||||||||||||||||||
orf17ng GSLSVPFLIHCGFPAHKAIGTSSGLAWPIALSGAISYLVNGLNIAGLPEGSLGFLYLPAV 202
orf17.pep AVLSAATIAFAPLGVKTAHKLSSAKLKKSFGIMLLLIAGKMLYNLL 196
|||||||||||||||||||||||||||:||||||||||||||||||
orf17ng AVLSAATIAFAPLGVKTAHKLSSAKLKESFGIMLLLIAGKMLYNLL 268
The predicted ORF17ng nucleotide sequence <SEQ ID 91> encodes a protein having the amino acid sequence <SEQ ID 92>:
1 MWHWDIILIL LAVGSAAGFI AGLFGVGGGT LIVPVVLWVL DLQGLAQHPY
51 AQHLAVGTSF AVMVFTAFSS MLGQHKKQAV DWKTIFAMMP GMIFGVFAGA
101 LSAKYIPAFG LQIFFILFLT AVAFKTLHTG RQTASRPLPG LPGLTAVSTL
151 FGAMSSWVGI GGGSLSVPFL IHCGFPAHKA IGTSSGLAWP IALSGAISYL
201 VNGLNIAGLP EGSLGFLYLP AVAVLSAATI AFAPLGVKTA HKLSSAKLKE
251 SFGIMLLLIA GKMLYNLL*
Further work revealed the complete gonococcal DNA sequence <SEQ ID 93>:
1 ATGTGGCATT GGGACATTAT CTTAATCCTG CTTGCcgtag gcAGTGCGGC
51 AGGTTTTATT GCCGGCCTGT Tcggtgtagg cggcgGTACG CTGATTGTCC
101 CTGTCGTTTT ATGGGTGCTT GATTTGCAGG GTTTGGCACA ACATCCTTAC
151 GCGCAACACC TCGCCGTCGG CAcaTccttc gcCGTCATGG TCTTCACCGC
201 CTTTTCCAGT ATGTTGGGGC AGCACAAAAA ACAGGCGGTC GACTGGAAAA
251 CCATATTTGC GATGATGCCG GGTATGATAT TCGGCGTATT CGCTGGCGCA
301 CTCTCCGCAA AATATATCCC CGCGTTCGGG CTTCAAATTT TCTTCATCCT
351 GTTTTTAACC GCCGTCGCAT TCAAAACACT GCATACCGGT CGTCAGACGG
401 CATCCCGCCC GCTGCCCGGG CTGCCCGGAC TGACTGCGGT TTCCACACTG
451 TTCGGCGCAA TGTCGAGCTG GGTCGGCATA GGCGGCGGTT CACTTTCCGT
501 CCCCTTCTTA ATCCACTGCG GCTTCCCCGC CCATAAAGCC ATCGGCACAT
551 CATCCGGCCT TGCCTGGCCG ATTGCACTCT CCGGCGCAAT ATCGTATCTG
601 GTCAACGGTC TGAATATTGC AGGATTGCCC GAAGGGTCGC TGGGCTTCCT
651 TTACCTGCCC GCCGTCGCCG TCCTCAGCGC GGCAACCATT GCCTTTGCCC
701 CGCTCGGTGT CAAAACCGCC CACAAACTTT CTTCTGCCAA ACTCAAAGAA
751 TCCTTCGGCA TTATGTTGCT TTTGATTGCC GGAAAAATGC TGTACAACCT
801 GCTTTAA
This corresponds to the amino acid sequence <SEQ ID 94; ORF17ng-1>:
1 MWHWDIILIL LAVGSAAGFI AGLFGVGGGT LIVPVVLWVL DLQGLAQHPY
51 AQHLAVGTSF AVMVFTAFSS MLGQHKKQAV DWKTIFAMMP GMIFGVFAGA
101 LSAKYIPAFG LQIFFILFLT AVAFKTLHTG RQTASRPLPG LPGLTAVSTL
151 FGAMSSWVGI GGGSLSVPFL IHCGFPAHKA IGTSSGLAWP IALSGAISYL
201 VNGLNIAGLP EGSLGFLYLP AVAVLSAATI AFAPLGVKTA HKLSSAKLKE
251 SFGIMLLLIA GKMLYNLL*
ORF17ng-1 and ORF17-1 show 96.6% identity over a 268 amino acid overlap:
10 20 30 40 50 60
orf17-1.pep MWHWDIILILLAVGSAAGFIAGLFGVGGGTLIVPVVLWVLDLQGLAQHPYAQHLAVGTSF
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf17ng-1 MWHWDIILILLAVGSAAGFIAGLFGVGGGTLIVPVVLWVLDLQGLAQHPYAQHLAVGTSF
10 20 30 40 50 60
70 80 90 100 110 120
orf17-1.pep AVMVFTAFSSMLGQHKKQAVDWKTVFTMMPGMIFGVFTGALSAKYIPAFGLQIFFILFLT
||||||||||||||||||||||||:|:||||||||||:||||||||||||||||||||||
orf17ng-1 AVMVFTAFSSMLGQHKKQAVDWKTIFAMMPGMIFGVFAGALSAKYIPAFGLQIFFILFLT
70 80 90 100 110 120
130 140 150 160 170 180
orf17-1.pep AVAFKTLHTDPQTASRPLPGLPGLTAVSTLFGTMSSWVGIGGGSLSVPFLIHCGFPAHKA
|||||||||  |||||||||||||||||||||:|||||||||||||||||||||||||||
orf17ng-1 AVAFKTLHTGRQTASRPLPGLPGLTAVSTLFGAMSSWVGIGGGSLSVPFLIHCGFPAHKA
130 140 150 160 170 180
190 200 210 220 230 240
orf17-1.pep IGTSSGLAWPIALSGAISYLLNGLNIAGLPEGSLGFLYLPAVAVLSAATIAFAPLGVKTA
||||||||||||||||||||:|||||||||||||||||||||||||||||||||||||||
orf17ng-1 IGTSSGLAWPIALSGAISYLVNGLNIAGLPEGSLGFLYLPAVAVLSAATIAFAPLGVKTA
190 200 210 220 230 240
250 260 269
orf17-1.pep HKLSSAKLKKXFGIMLLLIAGKMLYNLLX
|||||||||: ||||||||||||||||||
orf17ng-1 HKLSSAKLKESFGIMLLLIAGKMLYNLLX
250 260
In addition, ORF17ng-1 shows homology with a hypothetical H. influenzae protein:
sp|P44070|Y902_HAEIN HYPOTHETICAL PROTEIN HI0902 pir||G64015 hypothetical protein HI0902 - Haemophilus influenzae (strain Rd KW20) gi|1573922 (U32772) H. influenzae predicted coding region HI0902 [Haemophilus influenzae] Length = 264
Score = 74 (34.9 bits), Expect = 1.6e-23, Sum P(2) = 1.6e-23
Identities = 15/43 (34%), Positives = 23/43 (53%)
Query: 55 AVGTSFAVMVFTAFSSMLGQHKKQAVDWKTIFAMMPGMIFGVF 97
A+GTSFA +V T S HK + W+ + + P ++ VF
Sbjct: 52 ALGTSFATIVITGIGSAQRHHKLGNIVWQAVRILAPVIMLSVF 94
Score = 195 (91.9 bits), Expect = 1.6e-23, Sum P(2) = 1.6e-23
Identities = 44/114 (38%), Positives = 65/114 (57%)
Query: 150 LFGAMSSWVGIGGGSLSVPFLIHCGFPAHKAIGTSSGLAWPIALSGAISYLVNGLNIAGL 209
L G SS GIGGG VPFL G +AIG+S+ + +SG S++V+G +
Sbjct: 148 LIGMASSAAGIGGGGFIVPFLTARGINIKQAIGSSAFCGMLLGISGMFSFIVSGWGNPLM 207
Query: 210 PEGSLGFLYLPAVAVLSAATIAFAPLGVKTAHKLSSAKLKESFGIMLLLIAGKM 263
PE SLG++YLPAV ++A + + LG KL + LK+ F + L+++A M
Sbjct: 208 PEYSLGYIYLPAVLGITATSFFTSKLGASATAKLPVSTLKKGFALFLIVVAINM 261
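The identity and positive counts in this report can be recovered from the midline of each segment, where a letter marks an identical residue and `+` marks a conservative substitution. The sketch below is illustrative: the function names are hypothetical, and truncating the percentage to an integer is an assumption that happens to agree with the figures printed here (for example 44/114 is shown as 38%).

```python
def midline_counts(midline: str):
    """Count identities (letters) and positives (letters plus '+')."""
    identities = sum(ch.isalpha() for ch in midline)
    positives = identities + midline.count("+")
    return identities, positives


def blast_percent(count: int, length: int) -> int:
    # Assumption: the report truncates the percentage, it does not round.
    return count * 100 // length


# Midline of the first segment (query residues 55-97), as printed above.
MIDLINE = "A+GTSFA +V T S HK + W+ + + P ++ VF"
ALIGNED_LENGTH = 43

identities, positives = midline_counts(MIDLINE)
print(f"Identities = {identities}/{ALIGNED_LENGTH} "
      f"({blast_percent(identities, ALIGNED_LENGTH)}%)")
print(f"Positives = {positives}/{ALIGNED_LENGTH} "
      f"({blast_percent(positives, ALIGNED_LENGTH)}%)")
```

Because only characters are counted, the result does not depend on the column spacing, which has been lost in this text.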
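This analysis, including the homology with a hypothetical H. influenzae transmembrane protein, suggests that the proteins from N. meningitidis and N. gonorrhoeae, and their epitopes, could be useful as antigens for vaccines or diagnostics, or for raising antibodies.
Example 12
The following partial DNA sequence was identified in N. meningitidis <SEQ ID 95>: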
1 ..GGAAACGGAT GGCAGGCAGA CCCCGAACAT CCGCTGCTCG GGCTTTTTGC
51 CGTCAGTAAT GTATCGATGA CGCTTGCTTT TGTCGGAATA TGTGCGTTGG
101 TGCATTATTG CTTTTCGGGA ACGGTTCAAG TGTTTGTGTT TGCGGCACTG
151 CTCAAACTTT ATGCGCTGAA GCCGGTTTAT TGGTTCGTGT TGCAGTTTGT
201 GCTGATGGCG GTTGCCTATG TCCACCGCTG CGGTATAGAC CGGCAGCCGC
251 CGTCAACGTT CGGCGGCTCG CAGCTGCGAC TCGGCGGGTT GACGGCAGCG
301 TTGATGCAGG TCTCGGTACT GGTGCTGCTG CTTTCAGAAA TTGGAAGATA
351 A
This corresponds to the amino acid sequence <SEQ ID 96; ORF18>:
1 ..GNGWQADPEH PLLGLFAVSN VSMTLAFVGI CALVHYCFSG TVQVFVFAAL
51 LKLYALKPVY WFVLQFVLMA VAYVHRCGID RQPPSTFGGS QLRLGGLTAA
101 LMQVSVLVLL LSEIGR*
Further work revealed the complete nucleotide sequence <SEQ ID 97>:
1 ATGATTTTGC TGCATTTGGA TTTTTTGTCT GCCTTACTGT ATGCGGCGGT
51 TTTTCTGTTT CTGATATTCC GCGCAGGAAT GTTGCAATGG TTTTGGGCGA
101 GTATTATGCT GTGGCTGGGC ATATCGGTTT TGGGGGCAAA GCTGATGCCC
151 GGCATATGGG GAATGACCCG CGCCGCGCCC TTGTTCATCC CCCATTTTTA
201 CCTGACTTTG GGCAGCATAT TTTTTTTCAT CGGGCATTGG AACCGGAAAA
251 CAGATGGAAA CGGATGGCAG GCAGACCCCG AACATCCGCT GCTCGGGCTT
301 TTTGCCGTCA GTAATGTATC GATGACGCTT GCTTTTGTCG GAATATGTGC
351 GTTGGTGCAT TATTGCTTTT CGGGAACGGT TCAAGTGTTT GTGTTTGCGG
401 CACTGCTCAA ACTTTATGCG CTGAAGCCGG TTTATTGGTT CGTGTTGCAG
451 TTTGTGCTGA TGGCGGTTGC CTATGTCCAC CGCTGCGGTA TAGACCGGCA
501 GCCGCCGTCA ACGTTCGGCG GCTCGCAGCT GCGACTCGGC GGGTTGACGG
551 CAGCGTTGAT GCAGGTCTCG GTACTGGTGC TGCTGCTTTC AGAAATTGGA
601 AGATAA
This corresponds to the amino acid sequence <SEQ ID 98; ORF18-1>:
1 MILLHLDFLS ALLYAAVFLF LIFRAGMLQW FWASIMLWLG ISVLGAKLMP
51 GIWGMTRAAP LFIPHFYLTL GSIFFFIGHW NRKTDGNGWQ ADPEHPLLGL
101 FAVSNVSMTL AFVGICALVH YCFSGTVQVF VFAALLKLYA LKPVYWFVLQ
151 FVLMAVAYVH RCGIDRQPPS TFGGSQLRLG GLTAALMQVS VLVLLLSEIG
201 R*
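Computer analysis of this amino acid sequence gave the following results:
Homology with a predicted ORF from N. meningitidis (strain A)
ORF18 and an ORF from N. meningitidis strain A (ORF18a) show 98.3% identity over a 116 amino acid overlap: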
10 20 30
orf18.pep GNGWQADPEHPLLGLFAVSNVSMTLAFVGI
||||||||||||||||||||||||||||||
orf18a TRAAPLFIPHFYLTLGSIFFFIGHWNRKTDGNGWQADPEHPLLGLFAVSNVSMTLAFVGI
60 70 80 90 100 110
40 50 60 70 80 90
orf18.pep CALVHYCFSGTVQVFVFAALLKLYALKPVYWFVLQFVLMAVAYVHRCGIDRQPPSTFGGS
||||||||| ||||||||||||||||||||||||||||||||||||||||||||||||||
orf18a CALVHYCFSXTVQVFVFAALLKLYALKPVYWFVLQFVLMAVAYVHRCGIDRQPPSTFGGS
120 130 140 150 160 170
100 110
orf18.pep QLRLGGLTAALMQVSVLVLLLSEIGRX
||||||||||||| |||||||||||||
orf18a QLRLGGLTAALMQXSVLVLLLSEIGRX
180 190 200
The complete length ORF18a nucleotide sequence <SEQ ID 99> is:
1 ATGATTTTGC TGCATTTGGA TTTTTTGTCT GCCTTACTGT ATGCGGCGGT
51 TTTTCTGTTT CTGATATTCC GCGCAGGAAT GTTGCAATGG TTTTGGGCGA
101 GTATTATGCT GTGGCTGGGC ATATCGGTTT TGGGGGCAAA GCTGATGCCC
151 GGCATATGGG GAATGACCCG CGCCGCGCCC TTGTTCATCC CCCATTTTTA
201 CCTGACTTTG GGCAGCATAT TTTTTTTCAT CGGGCATTGG AACCGGAAAA
251 CGGATGGAAA CGGATGGCAG GCAGACCCCG AACATCCTCT GCTCGGGCTG
301 TTTGCCGTCA GTAATGTATC GATGACGCTT GCTTTTGTCG GAATATGTGC
351 GTTGGTGCAT TATTGCTTTT CGNGAACGGT TCAAGTGTTT GTGTTTGCGG
401 CACTGCTCAA ACTTTATGCG CTGAAGCCGG TTTATTGGTT CGTGTTGCAG
451 TTTGTGCTGA TGGCGGTTGC CTATGTCCAC CGCTGCGGTA TAGACCGGCA
501 GCCGCCGTCA ACGTTCGGCG GNTCGCAGCT GCGACTCGGC GGGTTGACGG
551 CAGCGTTGAT GCAGNTCTCG GTACTGGTGC TGCTGCTTTC AGAAATTGGA
601 AGATAA
This encodes a protein having the amino acid sequence <SEQ ID 100>:
1 MILLHLDFLS ALLYAAVFLF LIFRAGMLQW FWASIMLWLG ISVLGAKLMP
51 GIWGMTRAAP LFIPHFYLTL GSIFFFIGHW NRKTDGNGWQ ADPEHPLLGL
101 FAVSNVSMTL AFVGICALVH YCFSXTVQVF VFAALLKLYA LKPVYWFVLQ
151 FVLMAVAYVH RCGIDRQPPS TFGGSQLRLG GLTAALMQXS VLVLLLSEIG
201 R*
ORF18a and ORF18-1 show 99.0% identity over a 201 amino acid overlap:
10 20 30 40 50 60
orf18a.pep MILLHLDFLSALLYAAVFLFLIFRAGMLQWFWASIMLWLGISVLGAKLMPGIWGMTRAAP
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf18-1 MILLHLDFLSALLYAAVFLFLIFRAGMLQWFWASIMLWLGISVLGAKLMPGIWGMTRAAP
10 20 30 40 50 60
70 80 90 100 110 120
orf18a.pep LFIPHFYLTLGSIFFFIGHWNRKTDGNGWQADPEHPLLGLFAVSNVSMTLAFVGICALVH
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf18-1 LFIPHFYLTLGSIFFFIGHWNRKTDGNGWQADPEHPLLGLFAVSNVSMTLAFVGICALVH
70 80 90 100 110 120
130 140 150 160 170 180
orf18a.pep YCFSXTVQVFVFAALLKLYALKPVYWFVLQFVLMAVAYVHRCGIDRQPPSTFGGSQLRLG
|||| |||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf18-1 YCFSGTVQVFVFAALLKLYALKPVYWFVLQFVLMAVAYVHRCGIDRQPPSTFGGSQLRLG
130 140 150 160 170 180
190 200
orf18a.pep GLTAALMQXSVLVLLLSEIGRX
|||||||| |||||||||||||
orf18-1 GLTAALMQVSVLVLLLSEIGRX
190 200
Homology with a predicted ORF from N. gonorrhoeae
ORF18 and a predicted ORF from N. gonorrhoeae (ORF18.ng) show 93.1% identity over a 116 amino acid overlap:
orf18.pep GNGWQADPEHPLLGLFAVSNVSMTLAFVGI 30
||||||||||||||||||||||||||||||
orf18ng TRAAPLFIPHFYLTLGSIFFFIGYWNRKTDGNGWQADPEHPLLGLFAVSNVSMTLAFVGI 115
orf18.pep CALVHYCFSGTVQVFVFAALLKLYALKPVYWFVLQFVLMAVAYVHRCGIDRQPPSTFGGS 90
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf18ng CALVHYCFSGTVQVFVFAALLKLYALKPVYWFVLQFVLMAVAYVHRCGIDRQPPSTFGGS 175
orf18.pep QLRLGGLTAALMQVSVLVLLLSEIGR 116
||||| |:| ||||:| ::||:||||
orf18ng QLRLGVLAAMLMQVAVTAMLLAEIGR 201
The complete length ORF18ng nucleotide sequence <SEQ ID 101> is:
1 ATGATTTTGC TGCATTTGGA TTTTTTGTCT GCCTTACTGt aTGCGGcggt
51 tttTctgTTT CTGATATTCC GCGCAGGAAT GTTGCAATGG TTTTGGGCGA
101 GTATTGCGTT GTGGCTCGGC ATCTCGGTTT TAGGGGTAAA GCTGATGCCG
151 GGGATGTGGG GAATGACCCG CGCCGCGCCT TTGTTCATCC CCCATTTTTA
201 CCTGACTTTG GGCAGCATAT TTTTTTTCAT CGGGTATTGG AACCGGAAAA
251 CAGATGGAAA CGGATGGCAG GCAGACCCCG AACATCCGCT GCTCGGGCTT
301 TTTGCCGTCA GTAATGTATC GATGACGCTT GCTTTTGTCG GAATATGTGC
351 GTTGGTGCAT TATTGCTTTT CGGGAACGGT TCAAGTGTTT GTGTTTGCGG
401 CATTGCTCAA ACTTTATGCG CTGAAGCCGG TTTATTGGTT CGTGTTGCAG
451 TTTGTATTGA TGGCGGttgC CTATGTCCAC CGCTGCGGTA TAGACCGGCA
501 GCCGCCGTCA ACGTTCGGCG GTTCGCAGCT GCGACTCGGC GTGTTGGCGG
551 CGATGTTGAT GCAGGTTGCG GTAACGGCGA TGCTGCTTGC CGAAATCGGC
601 AGATGA
This encodes a protein having the amino acid sequence <SEQ ID 102>:
1 MILLHLDFLS ALLYAAVFLF LIFRAGMLQW FWASIALWLG ISVLGVKLMP
51 GMWGMTRAAP LFIPHFYLTL GSIFFFIGYW NRKTDGNGWQ ADPEHPLLGL
101 FAVSNVSMTL AFVGICALVH YCFSGTVQVF VFAALLKLYA LKPVYWFVLQ
151 FVLMAVAYVH RCGIDRQPPS TFGGSQLRLG VLAAMLMQVA VTAMLLAEIG
201 R*
This ORF18ng protein sequence shows 94.0% identity over a 201 amino acid overlap with ORF18-1:
10 20 30 40 50 60
orf18-1.pep MILLHLDFLSALLYAAVFLFLIFRAGMLQWFWASIMLWLGISVLGAKLMPGIWGMTRAAP
||||||||||||||||||||||||||||||||||| |||||||||:|||||:||||||||
orf18ng MILLHLDFLSALLYAAVFLFLIFRAGMLQWFWASIALWLGISVLGVKLMPGMWGMTRAAP
10 20 30 40 50 60
70 80 90 100 110 120
orf18-1.pep LFIPHFYLTLGSIFFFIGHWNRKTDGNGWQADPEHPLLGLFAVSNVSMTLAFVGICALVH
||||||||||||||||||:|||||||||||||||||||||||||||||||||||||||||
orf18ng LFIPHFYLTLGSIFFFIGYWNRKTDGNGWQADPEHPLLGLFAVSNVSMTLAFVGICALVH
70 80 90 100 110 120
130 140 150 160 170 180
orf18-1.pep YCFSGTVQVFVFAALLKLYALKPVYWFVLQFVLMAVAYVHRCGIDRQPPSTFGGSQLRLG
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf18ng YCFSGTVQVFVFAALLKLYALKPVYWFVLQFVLMAVAYVHRCGIDRQPPSTFGGSQLRLG
130 140 150 160 170 180
190 200
orf18-1.pep GLTAALMQVSVLVLLLSEIGRX
 |:| ||||:| ::||:|||||
orf18ng VLAAMLMQVAVTAMLLAEIGRX
190 200
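Based on this analysis, including the presence of several putative transmembrane domains in the gonococcal protein, it is predicted that the proteins from N. meningitidis and N. gonorrhoeae, and their epitopes, could be useful as antigens for vaccines or diagnostics, or for raising antibodies.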
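The text does not say how the putative transmembrane domains were predicted. One common heuristic is a sliding-window hydropathy scan. The sketch below is only an illustration of that heuristic, not the method used in the original analysis: it assumes the Kyte-Doolittle hydropathy scale, a 19-residue window and a cutoff of 1.6, and the names are hypothetical.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def window_means(seq: str, window: int = 19):
    """Mean hydropathy of every full-length window along the sequence."""
    return [
        sum(KD[res] for res in seq[i:i + window]) / window
        for i in range(len(seq) - window + 1)
    ]


# First 50 residues of the gonococcal protein <SEQ ID 102>.
ORF18NG_N_TERM = "MILLHLDFLSALLYAAVFLFLIFRAGMLQWFWASIALWLGISVLGVKLMP"

means = window_means(ORF18NG_N_TERM)
best = max(range(len(means)), key=means.__getitem__)
print(f"most hydrophobic window starts at residue {best + 1}, "
      f"mean hydropathy {means[best]:.2f}")
# Assumed heuristic: a 19-residue window averaging above 1.6 is a
# candidate membrane-spanning segment.
print("candidate transmembrane segment:", means[best] > 1.6)
```

A scan like this only flags hydrophobic stretches; it does not predict topology or the exact ends of a helix.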
Example 13
The following partial DNA sequence was identified in N. meningitidis <SEQ ID 103>:
1 ATGAAAACCC CACTCCTCAA GCCTCTGCTN ATTACCTCGC TTCCCGTTTT
51 CGCCAGTGTT TTTACCGCCG CCTCCATCGT CTGGCAGCTA GGCGAACCCA
101 AGCTCGCCAT GCCCTTCGTA CTCGGCATCA TCGCCGGCGG CCTTGTCGAT
151 TTGGACAACC NCNTGACCGG ACGGCTNAAA AACATCATCA CCACCGTCGC
201 CCTGTTCACC CTCTCCTCGC TCACGGCACA AAGCACCCTC GGCACAGGGC
251 TGCCCTTCAT CCTCGCCATG ACCCTGATGA CTT.CG.CTT CACCATTTTA
301 GGCGCGGNCG ...
This corresponds to the amino acid sequence <SEQ ID 104; ORF19>:
1 MKTPLLKPLL ITSLPVFASV FTAASIVWQL GEPKLAMPFV LGIIAGGLVD
51 LDNXXTGRLK NIITTVALFT LSSLTAQSTL GTGLPFILAM TLMTXXFTIL
101 GAX...
Further work revealed the complete nucleotide sequence <SEQ ID 105>:
1 ATGAAAACCC CACTCCTCAA GCCTCTGCTC ATTACCTCGC TTCCCGTTTT
51 CGCCAGTGTT TTTACCGCCG CCTCCATCGT CTGGCAGCTA GGCGAACCCA
101 AGCTCGCCAT GCCCTTCGTA CTCGGCATCA TCGCCGGCGG CCTTGTCGAT
151 TTGGACAACC GCCTGACCGG ACGGCTGAAA AACATCATCA CCACCGTCGC
201 CCTGTTCACC CTCTCCTCGC TCACGGCACA AAGCACCCTC GGCACAGGGC
251 TGCCCTTCAT CCTCGCCATG ACCCTGATGA CCTTCGGCTT CACCATTTTA
301 GGCGCGGTCG GGCTCAAATA CCGCACCTTC GCCTTCGGTG CACTCGCCGT
351 CGCCACCTAC ACCACACTTA CCTACACCCC CGAAACCTAC TGGCTGACCA
401 ACCCCTTCAT GATTTTATGC GGCACCGTAC TGTACAGCAC CGCCATCCTC
451 CTGTTCCAAA TCGTCCTGCC CCACCGCCCC GTCCAAGAAA GCGTCGCCAA
501 CGCCTACGAC GCACTCGGCG GCTACCTCGA AGCCAAAGCC GACTTCTTCG
551 ACCCCGATGA GGCAGCCTGG ATAGGCAACC GCCACATCGA CCTCGCCATG
601 AGCAACACCG GCGTCATCAC CGCCTTCAAC CAATGCCGTT CCGCCCTGTT
651 TTACCGCCTT CGCGGCAAAC ACCGCCACCC GCGCACCGCC AAAATGCTGC
701 GTTACTACTT TGCCGCCCAA GACATACACG AACGCATCAG CTCCGCCCAC
751 GTCGATTATC AGGAAATGTC CGAAAAATTC AAAAACACCG ACATCATCTT
801 CCGCATCCAC CGCCTGCTCG AAATGCAGGG ACAAGCCTGC CGCAACACCG
851 CCCAAGCCCT GCGCGCAAGC AAAGACTACG TTTACAGCAA ACGCCTCGGC
901 CGCGCCATCG AAGGCTGCCG CCAATCGCTG CGCCTCCTTT CAGACAGCAA
951 CGACAGTCCC GACATCCGCC ACCTGCGCCG CCTTCTCGAC AACCTCGGCA
1001 GCGTCGACCA GCAGTTCCGC CAACTCCAGC ACAACGGCCT GCAGGCAGAA
1051 AACGACCGCA TGGGCGACAC CCGCATCGCC GCCCTCGAAA CCAGCAGCCT
1101 CAAAAACACC TGGCAGGCAA TCCGTCCGCA GCTAAACCTC GAATCAGGCG
1151 TATTCCGCCA TGCCGTCCGC CTGTCCCTCG TCGTTGCCGC CGCCTGCACC
1201 ATCGTCGAAG CCCTCAACCT CAACCTCGGC TACTGGATAC TACTGACCGC
1251 CCTTTTCGTC TGCCAACCCA ACTACACCGC CACCAAAAGC CGCGTCCGCC
1301 AGCGCATCGC CGGCACCGTA CTCGGCGTAA TCGTCGGCTC GCTCGTCCCC
1351 TACTTCACCC CGTCTGTCGA AACCAAACTC TGGATTGTCA TCGCCAGTAC
1401 CACCCTCTTT TTCATGACCC GCACCTACAA ATACAGTTTC TCCACCTTCT
1451 TCATTACCAT TCAAGCCCTG ACCAGCCTCT CCCTCGCAGG TTTGGACGTA
1501 TACGCCGCCA TGCCCGTACG CATCATCGAC ACCATTATCG GCGCATCCCT
1551 TGCCTGGGCG GCAGTCAGCT ACCTGTGGCC AGACTGGAAA TACCTCACGC
1601 TCGAACGCAC CGCCGCCCTT GCCGTATGCA GCAACGGTGC CTATCTCGAA
1651 AAAATCACCG AACGCCTCAA AAGCGGCGAA ACCGGCGACG ACGTCGAATA
1701 CCGCGCCACC CGCCGCCGCG CCCACGAACA CACCGCCGCC CTCAGCAGCA
1751 CCCTTTCCGA CATGAGCAGC GAACCCGCAA AATTCGCCGA CAGCCTGCAA
1801 CCCGGCTTTA CCCTGCTCAA AACCGGCTAC GCCCTGACCG GCTACATCTC
1851 CGCCCTCGGC GCATACCGCA GCGAAATGCA CGAAGAATGC AGCCCCGACT
1901 TTACCGCACA GTTCCACCTC GCCGCCGAAC ACACCGCCCA CATCTTCCAA
1951 CACCTGCCCG AAACCGAACC CGACGACTTT CAGACAGCAC TGGATACACT
2001 GCGCGGCGAA CTCGACACCC TCCGCACCCA CAGCAGCGGA ACACAAAGCC
2051 ACATCCTCCT CCAACAGCTC CAACTCATCG CCCGACAGCT CGAACCCTAC
2101 TACCGCGCCT ACCGCCAAAT TCCGCACAGG CAGCCCCAAA ATGCAGCCTG
2151 A
This corresponds to the amino acid sequence <SEQ ID 106; ORF19-1>:
1 MKTPLLKPLL ITSLPVFASV FTAASIVWQL GEPKLAMPFV LGIIAGGLVD
51 LDNRLTGRLK NIITTVALFT LSSLTAQSTL GTGLPFILAM TLMTFGFTIL
101 GAVGLKYRTF AFGALAVATY TTLTYTPETY WLTNPFMILC GTVLYSTAIL
151 LFQIVLPHRP VQESVANAYD ALGGYLEAKA DFFDPDEAAW IGNRHIDLAM
201 SNTGVITAFN QCRSALFYRL RGKHRHPRTA KMLRYYFAAQ DIHERISSAH
251 VDYQEMSEKF KNTDIIFRIH RLLEMQGQAC RNTAQALRAS KDYVYSKRLG
301 RAIEGCRQSL RLLSDSNDSP DIRHLRRLLD NLGSVDQQFR QLQHNGLQAE
351 NDRMGDTRIA ALETSSLKNT WQAIRPQLNL ESGVFRHAVR LSLVVAAACT
401 IVEALNLNLG YWILLTALFV CQPNYTATKS RVRQRIAGTV LGVIVGSLVP
451 YFTPSVETKL WIVIASTTLF FMTRTYKYSF STFFITIQAL TSLSLAGLDV
501 YAAMPVRIID TIIGASLAWA AVSYLWPDWK YLTLERTAAL AVCSNGAYLE
551 KITERLKSGE TGDDVEYRAT RRRAHEHTAA LSSTLSDMSS EPAKFADSLQ
601 PGFTLLKTGY ALTGYISALG AYRSEMHEEC SPDFTAQFHL AAEHTAHIFQ
651 HLPETEPDDF QTALDTLRGE LDTLRTHSSG TQSHILLQQL QLIARQLEPY
701 YRAYRQIPHR QPQNAA*
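Computer analysis of this amino acid sequence gave the following results:
Homology with the predicted transmembrane protein YHFK of H. influenzae (accession number P44289)
ORF19 and the YHFK protein show 45% amino acid identity over a 97 amino acid overlap: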
orf19 6 LKPLLITSLPVFASVFTAASIVWQLGEPKLAMPFVLGIIAGGLVDLDNXXTGRLKNIITT 65
L +I+++PVF +V AA +W +MP +LGIIAGGLVDLDN TGRLKN+ T
YHFK 5 LNAKVISTIPVFIAVNIAAVGIWFFDISSQSMPLILGIIAGGLVDLDNRLTGRLKNVFFT 64
orf19 66 VALFTLSSLTAQSTLGTGLPFILAMTLMTXXFTILGA 102
+ F++SS Q +G + +I+ MT++T FT++GA
YHFK 65 LIAFSISSFIVQLHIGKPIQYIVLMTVLTFIFTMIGA 101
Homology with a predicted ORF from N. meningitidis (strain A)
ORF19 and an ORF from N. meningitidis strain A (ORF19a) show 92.2% identity over a 102 amino acid overlap:
10 20 30 40 50 60
orf19.pep MKTPLLKPLLITSLPVFASVFTAASIVWQLGEPKLAMPFVLGIIAGGLVDLDNXXTGRLK
|||| ||||||||||||||||||||||||||||||||||||||||||||||||  |||||
orf19a MKTPPLKPLLITSLPVFASVFTAASIVWQLGEPKLAMPFVLGIIAGGLVDLDNRLTGRLK
10 20 30 40 50 60
70 80 90 100
orf19.pep NIITTVALFTLSSLTAQSTLGTGLPFILAMTLMTXXFTILGAX
|||:||||||||||:|||||||||||||||||||  |||:||
orf19a NIIATVALFTLSSLVAQSTLGTGLPFILAMTLMTFGFTIMGAVGLKYRTFAFGALAVATY
70 80 90 100 110 120
orf19a TTLTYTPETYWLTNPFMILCGTVLYSTAIILFQIILPHRPVQENVANAYEALGSYLEAKA
130 140 150 160 170 180
The complete length ORF19a nucleotide sequence <SEQ ID 107> is:
1 ATGAAAACCC CACCCCTCAA GCCTCTGCTC ATTACCTCGC TTCCCGTTTT
51 CGCCAGTGTC TTTACCGCCG CCTCCATCGT CTGGCAGCTG GGCGAACCCA
101 AGCTCGCCAT GCCCTTCGTA CTCGGCATCA TCGCTGGCGG CCTGGTCGAT
151 TTGGACAACC GCCTGACCGG ACGGCTGAAA AACATCATCG CCACCGTCGC
201 CCTGTTCACC CTCTCCTCAC TTGTCGCGCA AAGCACCCTC GGCACAGGTT
251 TGCCATTCAT CCTCGCCATG ACCCTGATGA CTTTCGGCTT TACCATCATG
301 GGCGCGGTCG GGCTGAAATA CCGCACCTTC GCCTTCGGCG CACTCGCCGT
351 CGCCACCTAC ACCACACTTA CCTACACCCC CGAAACCTAC TGGCTGACCA
401 ACCCCTTTAT GATTCTGTGC GGAACCGTAC TGTACAGCAC CGCCATCATC
451 CTGTTCCAAA TCATCCTGCC CCACCGCCCC GTTCAAGAAA ACGTCGCCAA
501 CGCCTACGAA GCACTCGGCA GCTACCTCGA AGCCAAAGCC GACTTTTTCG
551 ATCCCGACGA AGCCGAATGG ATAGGCAACC GCCACATCGA CCTCGCCATG
601 AGCAACACCG GCGTCATCAC CGCCTTCAAC CAATGCCGTT CCGCCCTGTT
651 TTACCGCCTT CGCGGCAAAC ACCGCCACCC GCGCACCGCC AAAATGCTGC
701 GCTACTACTT CGCCGCCCAA GACATACACG AACGCATCAG CTCCGCCCAC
751 GTCGACTACC AAGAGATGTC CGAAAAATTC AAAAACACCG ACATCATCTT
801 CCGCATCCAC CGCCTGCTCG AAATGCAGGG ACAAGCCTGC CGCAACACCG
851 CCCAAGCCCT GCGCGCAAGC AAAGACTACG TTTACAGCAA ACGCCTCGGC
901 CGCGCCATCG AAGGCTGCCG CCAATCGCTG CGCCTCCTTT CAGACAGCAA
951 CGACAATCCC GACATCCGCC ACCTGCGCCG CCTTCTCGAC AACCTCGGCA
1001 GCGTCGACCA GCAGTTCCGC CAACTCCAGC ACAACGGCCT GCAGGCAGAA
1051 AACGACCGCA TGGGCGACAC CCGCATCGCC GCCCTCGAAA CCGGCAGCCT
1101 CAAAAACACC TGGCAGGCAA TCCGTCCGCA GCTAAACCTC GAATCAGGCG
1151 TATTCCGCCA TGCCGTCCGC CTGTCCCTTG TCGTTGCCGC CGCCTGCACC
1201 ATCGTCGAAG CCCTCAACCT CAACCTCGGC TACTGGATAC TACTGACCGC
1251 CCTTTTCGTC TGCCAACCCA ACTACACCGC CACCAAAAGC CGCGTCCGCC
1301 AGCGCATCGC CGGCACCGTA CTCGGCGTAA TCGTCGGCTC GCTCGTCCCC
1351 TACTTTACCC CCTCCGTCGA AACCAAACTC TGGATCGTCA TCGCCAGTAC
1401 CACCCTCTTT TTCATGACCC GCACCTACAA ATACAGCTTC TCGACATTTT
1451 TCATCACCAT TCAAGCCCTG ACCAGCCTCT CCCTCGCAGG GTTGGACGTA
1501 TACGCCGCCA TGCCCGTACG CATCATCGAC ACCATTATCG GCGCATCCCT
1551 TGCCTGGGCG GCAGTCAGCT ACCTGTGGCC AGACTGGAAA TACCTCACGC
1601 TCGAACGCAC CGCCGCCCTT GCCGTATGCA GCAACGGCGC CTATCTCGAA
1651 AAAATCACCG AACGCCTCAA AAGCGGCGAA ACCGGCGACG ACGTCGAATA
1701 CCGCGCCACC CGCCGCCGCG CCCACGAACA CACCGCCGCC CTCAGCAGCA
1751 CCCTTTCCGA CATGAGCAGC GAACCCGCAA AATTCGCCGA CAGCCTGCAA
1801 CCCGGCTTTA CCCTGCTCAA AACCGGCTAC GCCCTGACCG GCTACATCTC
1851 CGCCCTCGGC GCATACCGCA GCGAAATGCA CGAAGAATGC AGCCCCGACT
1901 TTACCGCACA GTTCCACCTC GCCGCCGAAC ACACCGCCCA CATCTTCCAA
1951 CACCTGCCCG AAACCGAACC CGACGACTTT CAGACAGCAC TGGATACACT
2001 GCGCGGCGAA CTCGACACCC TCCGCACCCA CAGCAGCGGA ACACAAAGCC
2051 ACATCCTCCT CCAACAGCTC CAACTCATCG CCCGGCAGCT CGAACCCTAC
2101 TACCGCGCCT ACCGACAAAT TCCGCACAGG CAGCCCCAAA ACGCAGCCTG
2151 A
This encodes a protein having the amino acid sequence <SEQ ID 108>:
1 MKTPPLKPLL ITSLPVFASV FTAASIVWQL GEPKLAMPFV LGIIAGGLVD
51 LDNRLTGRLK NIIATVALFT LSSLVAQSTL GTGLPFILAM TLMTFGFTIM
101 GAVGLKYRTF AFGALAVATY TTLTYTPETY WLTNPFMILC GTVLYSTAII
151 LFQIILPHRP VQENVANAYE ALGSYLEAKA DFFDPDEAEW IGNRHIDLAM
201 SNTGVITAFN QCRSALFYRL RGKHRHPRTA KMLRYYFAAQ DIHERISSAH
251 VDYQEMSEKF KNTDIIFRIH RLLEMQGQAC RNTAQALRAS KDYVYSKRLG
301 RAIEGCRQSL RLLSDSNDNP DIRHLRRLLD NLGSVDQQFR QLQHNGLQAE
351 NDRMGDTRIA ALETGSLKNT WQAIRPQLNL ESGVFRHAVR LSLVVAAACT
401 IVEALNLNLG YWILLTALFV CQPNYTATKS RVRQRIAGTV LGVIVGSLVP
451 YFTPSVETKL WIVIASTTLF FMTRTYKYSF STFFITIQAL TSLSLAGLDV
501 YAAMPVRIID TIIGASLAWA AVSYLWPDWK YLTLERTAAL AVCSNGAYLE
551 KITERLKSGE TGDDVEYRAT RRRAHEHTAA LSSTLSDMSS EPAKFADSLQ
601 PGFTLLKTGY ALTGYISALG AYRSEMHEEC SPDFTAQFHL AAEHTAHIFQ
651 HLPETEPDDF QTALDTLRGE LDTLRTHSSG TQSHILLQQL QLIARQLEPY
701 YRAYRQIPHR QPQNAA*
ORF19a and ORF19-1 show 98.3% identity over an overlap of 716 amino acids:
10 20 30 40 50 60
orf19a.pep MKTPPLKPLLITSLPVFASVFTAASIVWQLGEPKLAMPFVLGIIAGGLVDLDNRLTGRLK
|||| |||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf19-1 MKTPLLKPLLITSLPVFASVFTAASIVWQLGEPKLAMPFVLGIIAGGLVDLDNRLTGRLK
10 20 30 40 50 60
70 80 90 100 110 120
orf19a.pep NIIATVALFTLSSLVAQSTLGTGLPFILAMTLMTFGFTIMGAVGLKYRTFAFGALAVATY
|||:||||||||||:||||||||||||||||||||||||:||||||||||||||||||||
orf19-1 NIITTVALFTLSSLTAQSTLGTGLPFILAMTLMTFGFTILGAVGLKYRTFAFGALAVATY
70 80 90 100 110 120
130 140 150 160 170 180
orf19a.pep TTLTYTPETYWLTNPFMILCGTVLYSTAIILFQIILPHRPVQENVANAYEALGSYLEAKA
|||||||||||||||||||||||||||||:||||:||||||||:|||||:|||:||||||
orf19-1 TTLTYTPETYWLTNPFMILCGTVLYSTAILLFQIVLPHRPVQESVANAYDALGGYLEAKA
130 140 150 160 170 180
190 200 210 220 230 240
orf19a.pep DFFDPDEAEWIGNRHIDLAMSNTGVITAFNQCRSALFYRLRGKHRHPRTAKMLRYYFAAQ
|||||||| |||||||||||||||||||||||||||||||||||||||||||||||||||
orf19-1 DFFDPDEAAWIGNRHIDLAMSNTGVITAFNQCRSALFYRLRGKHRHPRTAKMLRYYFAAQ
190 200 210 220 230 240
250 260 270 280 290 300
orf19a.pep DIHERISSAHVDYQEMSEKFKNTDIIFRIHRLLEMQGQACRNTAQALRASKDYVYSKRLG
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf19-1 DIHERISSAHVDYQEMSEKFKNTDIIFRIHRLLEMQGQACRNTAQALRASKDYVYSKRLG
250 260 270 280 290 300
310 320 330 340 350 360
orf19a.pep RAIEGCRQSLRLLSDSNDNPDIRHLRRLLDNLGSVDQQFRQLQHNGLQAENDRMGDTRIA
||||||||||||||||||:|||||||||||||||||||||||||||||||||||||||||
orf19-1 RAIEGCRQSLRLLSDSNDSPDIRHLRRLLDNLGSVDQQFRQLQHNGLQAENDRMGDTRIA
310 320 330 340 350 360
370 380 390 400 410 420
orf19a.pep ALETGSLKNTWQAIRPQLNLESGVFRHAVRLSLVVAAACTIVEALNLNLGYWILLTALFV
||||:|||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf19-1 ALETSSLKNTWQAIRPQLNLESGVFRHAVRLSLVVAAACTIVEALNLNLGYWILLTALFV
370 380 390 400 410 420
430 440 450 460 470 480
orf19a.pep CQPNYTATKSRVRQRIAGTVLGVIVGSLVPYFTPSVETKLWIVIASTTLFFMTRTYKYSF
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf19-1 CQPNYTATKSRVRQRIAGTVLGVIVGSLVPYFTPSVETKLWIVIASTTLFFMTRTYKYSF
430 440 450 460 470 480
490 500 510 520 530 540
orf19a.pep STFFITIQALTSLSLAGLDVYAAMPVRIIDTIIGASLAWAAVSYLWPDWKYLTLERTAAL
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf19-1 STFFITIQALTSLSLAGLDVYAAMPVRIIDTIIGASLAWAAVSYLWPDWKYLTLERTAAL
490 500 510 520 530 540
550 560 570 580 590 600
orf19a.pep AVCSNGAYLEKITERLKSGETGDDVEYRATRRRAHEHTAALSSTLSDMSSEPAKFADSLQ
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf19-1 AVCSNGAYLEKITERLKSGETGDDVEYRATRRRAHEHTAALSSTLSDMSSEPAKFADSLQ
550 560 570 580 590 600
610 620 630 640 650 660
orf19a.pep PGFTLLKTGYALTGYISALGAYRSEMHEECSPDFTAQFHLAAEHTAHIFQHLPETEPDDF
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf19-1 PGFTLLKTGYALTGYISALGAYRSEMHEECSPDFTAQFHLAAEHTAHIFQHLPETEPDDF
610 620 630 640 650 660
670 680 690 700 710
orf19a.pep QTALDTLRGELDTLRTHSSGTQSHILLQQLQLIARQLEPYYRAYRQIPHRQPQNAAX
|||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf19-1 QTALDTLRGELDTLRTHSSGTQSHILLQQLQLIARQLEPYYRAYRQIPHRQPQNAAX
670 680 690 700 710
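The identity percentages quoted for these alignments can be checked with a short calculation. The following is a minimal sketch in Python, not the program used to produce the alignments in this document; the function names `percent_identity` and `match_line` are illustrative assumptions. It counts identical columns in two already-aligned sequences and is applied here only to the first 60-residue block of the ORF19a/ORF19-1 alignment, not to the full 716-residue overlap.

```python
def percent_identity(a: str, b: str) -> float:
    """Percentage of aligned columns in which both sequences carry the same residue.

    Both strings must already be aligned (equal length); '-' marks a gap
    and never counts as a match.
    """
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    if not a:
        raise ValueError("empty alignment")
    same = sum(1 for x, y in zip(a, b) if x == y and x != "-")
    return 100.0 * same / len(a)


def match_line(a: str, b: str) -> str:
    """Build a simple match line: '|' for identical residues, ' ' otherwise.

    The ':' marks for conservative substitutions shown in the document are
    not reproduced, because they depend on a scoring matrix not given here.
    """
    return "".join("|" if x == y and x != "-" else " " for x, y in zip(a, b))


# First 60 aligned residues of ORF19a and ORF19-1, copied from the alignment above.
orf19a = "MKTPPLKPLLITSLPVFASVFTAASIVWQLGEPKLAMPFVLGIIAGGLVDLDNRLTGRLK"
orf19_1 = "MKTPLLKPLLITSLPVFASVFTAASIVWQLGEPKLAMPFVLGIIAGGLVDLDNRLTGRLK"

print(round(percent_identity(orf19a, orf19_1), 1))  # 59 of 60 identical -> 98.3
print(match_line(orf19a, orf19_1))
```

Joining all blocks of an alignment into two strings before calling `percent_identity` gives the figure for the whole overlap.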
Homology with a predicted ORF from N. gonorrhoeae
ORF19 shows 95.1% identity over a 102 amino acid overlap with a predicted ORF (ORF19.ng) from N. gonorrhoeae:
orf19.pep MKTPLLKPLLITSLPVFASVFTAASIVWQLGEPKLAMPFVLGIIAGGLVDLDNXXTGRLK 60
||||||||||||||||||||||||||||||||||||||||||||||||||||| |||||
orf19ng MKTPLLKPLLITSLPVFASVFTAASIVWQLGEPKLAMPFVLGIIAGGLVDLDNRLTGRLK 60
orf19.pep NIITTVALFTLSSLTAQSTLGTGLPFILAMTLMTXXFTILGAX 103
|||:|||||||||||||||||||||||||||||| |||||||
orf19ng NIIATVALFTLSSLTAQSTLGTGLPFILAMTLMTFGFTILGAVGLKYRTFAFGALAVATY 120
The predicted ORF19ng nucleotide sequence <SEQ ID 109> encodes a protein having amino acid sequence <SEQ ID 110>:
1 MKTPLLKPLL ITSLPVFASV FTAASIVWQL GEPKLAMPFV LGIIAGGLVD
51 LDNRLTGRLK NIIATVALFT LSSLTAQSTL GTGLPFILAM TLMTFGFTIL
101 GAVGLKYRTF AFGALAVATY TTLTYTPETY WLTNPFMILC GTVLYSTAII
151 LFQIILPHRP VQESVANAYE ALGGYLEAKA DFFDPDEAAW IGNRHIDLAM
201 SNTGVITAFN QCRSALFYRL RGKHRHPRTA KMLRYYFAAQ DIHERISSAH
251 VDYQEMSEKF KNTDIIFRIR RLLEMQGQAC RNTAQAIRSG KDYVYSKRLG
301 RAIEGCRQSL RLLSDGNDSP DIRHLSRLLD NLGSVDQQFR QLRHSDSPAE
351 NDRMGDTRIA ALETGSFKNT *
Further work revealed the complete nucleotide sequence <SEQ ID 111>:
1 ATGAAAACCC CACTCCTCAA GCCTCTGCTC ATTACCTCGC TTCCCGTTTT
51 CGCCAGTGTC TTTACCGCCG CCTCCATCGT CTGGCAGCTA GGCGAACCCA
101 AGCTCGCCAT GCCCTTCGTA CTCGGCATCA TCGCCGGCGG CCTGGTCGAT
151 TTGGACAACC GCCTGACCGG ACGGCTGAAA AACATCATCG CCACCGTCGC
201 CCTGTTTACC CTCTCCTCGC TCACGGCGCA AAGCACCCTC GGCACAGGGC
251 TGCCCTTCAT CCTCGCCATG ACCCTGATGA CCTTCGGCTT TACCATTTTA
301 GGCGCGGTCG GGCTGAAATA CCGCACCTTC GCCTTCGGCG CACTCGCCGT
351 CGCCACCTAC ACCACGCTTA CCTACACCCC CGAAACCTAC TGGCTGACCA
401 ACCCCTTCAT GATTTTATGC GGCACCGTAC TGTACAGCAC CGCCATCATC
451 CTGTTCCAAA TCATCCTGCC CCACCGCCCC GTCCAAGAAA GCGTCGCCAA
501 TGCCTACGAA GCACTCGGCG GCTACCTCGA AGCCAAAGCC GACTTCTTCG
551 ACCCCGATGA GGCAGCCTGG ATAGGCAACC GCCACATCGA CCTCGCCATG
601 AGCAACACCG GCGTCATCAC CGCCTTCAAC CAATGCCGTT CCGCCCTGTT
651 TTACCGTTTG CGCGGCAAAC ACCGCCACCC GCGCACCGCC AAAATGCTGC
701 GCTACTACTT CGCCGCCCAA GACATCCACG AACGCATCAG CTCCGCCCAC
751 GTCGACTACC AAGAGATGTC CGAAAAATTC AAAAACACCG ACATCATCTT
801 CCGCATCCGC CGCCTGCTCG AAATGCAGGG GCAGGCGTGC CGCAACACCG
851 CCCAAGCCAT CCGGTCGGGC AAAGACTAcg tTTACAGCAA ACGCCTCGGA
901 CGCGCCATcg aaggctgCCG CCAGTCGCtg cgcctCCTTt cagacggcaA
951 CGACAGTCCC GACATCCGCC ACCTGAGccg CCTTCTCGAC AACCTCGgca
1001 GCGTcgacca gcagtTCcgc caactCCGAC ACAgcgactC CCCCGCcgaa
1051 Aacgaccgca tgggcgacaC CCGCATCGCC GCCCtcgaaa ccggcagctT
1101 caaaaaCAcc tggcaggCAA TCCGTCCGCa gctgaaCCTC GAATCatgCG
1151 TATTCCGCCA TGCCGTCCGC CTGTCCCTCG TCGTTGCCGC CGCCTGCACC
1201 ATCGTCgaag cCCTCAACCT CAACCTCGGC TACTGGATAC TGCTGACCGC
1251 CCTTTTCGTC TGCCAACCCA ACTACACCGC CACCAAAAGC CGCGTGTACC
1301 AACGCATCGC CGGCACCGTA CTCGGCGTAA TCGTCGGCTC GCTCGTCCCC
1351 TACTTCACCC CCTCCGTCGA AACCAAACTC TGGATTGTCA TCGCCGGTAC
1401 CACCCTGTTC TTCATGACCC GCACCTACAA ATACAGTTTC TCCACCTTCT
1451 TCATCACCAT TCAGGCACTG ACCAGCCTCT CCCTCGCAGG TTTGGACGTA
1501 TACGCCGCCA TGCCCGTGCG CATCATcgaC ACCATTATCG GCGCATCCCT
1551 TGCCTGGGCG GCGGTCAGCT ACCTGTGGCC AGACTGGAAA TACCTCACGC
1601 TCGAACGCAC CGCCGCCCTT GCCGTATGCA GCAGCGGCAC ATACCTCCAA
1651 AAAATTGCCG AACGCCTCAA AACCGGCGAA ACCGGCGACG ACATAGAATA
1701 CCGCATCACC CGCCGCCGCG CCCACGAACA CACCGCCGCC CTCAGCAGCA
1751 CCCTTTCCGA CATGAGCAGC GAACCCGCAA AATTCGCCGA CAGCCTGCAA
1801 CCCGGCTTTA CCCTGCTCAA AACCGGCTAC GCCCTGACCG GCTACATCTC
1851 CGCCCTCGGC GCATACCGCA GCGAAATGCA CGAAGAATGC AGCCCCGACT
1901 TTACCGCACA GTTCCACCTT GCCGCCGAAC ACACCGCCCA CATCTTCCAA
1951 CACCTGCCCG ACATGGGACC CGACGACTTT CAGACGGCAT TGGATACACT
2001 GCGCGGCGAA CTCGGCACCC TCCGCACCCG CAGCAGCGGA ACACAAAGCC
2051 ACATCCTCCT CCAACAGCTC CAACTCATCG CccgGCAACT CGAACCCTAC
2101 TACCGCGCCT ACCGACAAAT TCCGCACAGG CAGCCCCAAA ACGCAGCCTG
2151 A
This corresponds to the amino acid sequence <SEQ ID 112; ORF19ng-1>:
1 MKTPLLKPLL ITSLPVFASV FTAASIVWQL GEPKLAMPFV LGIIAGGLVD
51 LDNRLTGRLK NIIATVALFT LSSLTAQSTL GTGLPFILAM TLMTFGFTIL
101 GAVGLKYRTF AFGALAVATY TTLTYTPETY WLTNPFMILC GTVLYSTAII
151 LFQIILPHRP VQESVANAYE ALGGYLEAKA DFFDPDEAAW IGNRHIDLAM
201 SNTGVITAFN QCRSALFYRL RGKHRHPRTA KMLRYYFAAQ DIHERISSAH
251 VDYQEMSEKF KNTDIIFRIR RLLEMQGQAC RNTAQAIRSG KDYVYSKRLG
301 RAIEGCRQSL RLLSDGNDSP DIRHLSRLLD NLGSVDQQFR QLRHSDSPAE
351 NDRMGDTRIA ALETGSFKNT WQAIRPQLNL ESCVFRHAVR LSLVVAAACT
401 IVEALNLNLG YWILLTALFV CQPNYTATKS RVYQRIAGTV LGVIVGSLVP
451 YFTPSVETKL WIVIAGTTLF FMTRTYKYSF STFFITIQAL TSLSLAGLDV
501 YAAMPVRIID TIIGASLAWA AVSYLWPDWK YLTLERTAAL AVCSSGTYLQ
551 KIAERLKTGE TGDDIEYRIT RRRAHEHTAA LSSTLSDMSS EPAKFADSLQ
601 PGFTLLKTGY ALTGYISALG AYRSEMHEEC SPDFTAQFHL AAEHTAHIFQ
651 HLPDMGPDDF QTALDTLRGE LGTLRTRSSG TQSHILLQQL QLIARQLEPY
701 YRAYRQIPHR QPQNAA*
ORF19ng-1 and ORF19-1 show 95.5% identity over an overlap of 716 amino acids:
10 20 30 40 50 60
orf19-1.pep MKTPLLKPLLITSLPVFASVFTAASIVWQLGEPKLAMPFVLGIIAGGLVDLDNRLTGRLK
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf19ng-1 MKTPLLKPLLITSLPVFASVFTAASIVWQLGEPKLAMPFVLGIIAGGLVDLDNRLTGRLK
10 20 30 40 50 60
70 80 90 100 110 120
orf19-1.pep NIITTVALFTLSSLTAQSTLGTGLPFILAMTLMTFGFTILGAVGLKYRTFAFGALAVATY
|||:||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf19ng-1 NIIATVALFTLSSLTAQSTLGTGLPFILAMTLMTFGFTILGAVGLKYRTFAFGALAVATY
70 80 90 100 110 120
130 140 150 160 170 180
orf19-1.pep TTLTYTPETYWLTNPFMILCGTVLYSTAILLFQIVLPHRPVQESVANAYDALGGYLEAKA
|||||||||||||||||||||||||||||:||||:||||||||||||||:||||||||||
orf19ng-1 TTLTYTPETYWLTNPFMILCGTVLYSTAIILFQIILPHRPVQESVANAYEALGGYLEAKA
130 140 150 160 170 180
190 200 210 220 230 240
orf19-1.pep DFFDPDEAAWIGNRHIDLAMSNTGVITAFNQCRSALFYRLRGKHRHPRTAKMLRYYFAAQ
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf19ng-1 DFFDPDEAAWIGNRHIDLAMSNTGVITAFNQCRSALFYRLRGKHRHPRTAKMLRYYFAAQ
190 200 210 220 230 240
250 260 270 280 290 300
orf19-1.pep DIHERISSAHVDYQEMSEKFKNTDIIFRIHRLLEMQGQACRNTAQALRASKDYVYSKRLG
|||||||||||||||||||||||||||||:||||||||||||||||:|::||||||||||
orf19ng-1 DIHERISSAHVDYQEMSEKFKNTDIIFRIRRLLEMQGQACRNTAQAIRSGKDYVYSKRLG
250 260 270 280 290 300
310 320 330 340 350 360
orf19-1.pep RAIEGCRQSLRLLSDSNDSPDIRHLRRLLDNLGSVDQQFRQLQHNGLQAENDRMGDTRIA
|||||||||||||||:||||||||| ||||||||||||||||||| ||||||||||||
orf19ng-1 RAIEGCRQSLRLLSDGNDSPDIRHLSRLLDNLGSVDQQFRQLRHSDSPAENDRMGDTRIA
310 320 330 340 350 360
370 380 390 400 410 420
orf19-1.pep ALETSSLKNTWQAIRPQLNLESGVFRHAVRLSLVVAAACTIVEALNLNLGYWILLTALFV
||||:|:||||||||||||||| |||||||||||||||||||||||||||||||||||||
orf19ng-1 ALETGSFKNTWQAIRPQLNLESCVFRHAVRLSLVVAAACTIVEALNLNLGYWILLTALFV
370 380 390 400 410 420
430 440 450 460 470 480
orf19-1.pep CQPNYTATKSRVRQRIAGTVLGVIVGSLVPYFTPSVETKLWIVIASTTLFFMTRTYKYSF
|||||||||||| ||||||||||||||||||||||||||||||||:||||||||||||||
orf19ng-1 CQPNYTATKSRVYQRIAGTVLGVIVGSLVPYFTPSVETKLWIVIAGTTLFFMTRTYKYSF
430 440 450 460 470 480
490 500 510 520 530 540
orf19-1.pep STFFITIQALTSLSLAGLDVYAAMPVRIIDTIIGASLAWAAVSYLWPDWKYLTLERTAAL
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf19ng-1 STFFITIQALTSLSLAGLDVYAAMPVRIIDTIIGASLAWAAVSYLWPDWKYLTLERTAAL
490 500 510 520 530 540
550 560 570 580 590 600
orf19-1.pep AVCSNGAYLEKITERLKSGETGDDVEYRATRRRAHEHTAALSSTLSDMSSEPAKFADSLQ
||||:|:||:||:||||:||||||:||| |||||||||||||||||||||||||||||||
orf19ng-1 AVCSSGTYLQKIAERLKTGETGDDIEYRITRRRAHEHTAALSSTLSDMSSEPAKFADSLQ
550 560 570 580 590 600
610 620 630 640 650 660
orf19-1.pep PGFTLLKTGYALTGYISALGAYRSEMHEECSPDFTAQFHLAAEHTAHIFQHLPETEPDDF
|||||||||||||||||||||||||||||||||||||||||||||||||||||: ||||
orf19ng-1 PGFTLLKTGYALTGYISALGAYRSEMHEECSPDFTAQFHLAAEHTAHIFQHLPDMGPDDF
610 620 630 640 650 660
670 680 690 700 710
orf19-1.pep QTALDTLRGELDTLRTHSSGTQSHILLQQLQLIARQLEPYYRAYRQIPHRQPQNAAX
||||||||||| ||||:||||||||||||||||||||||||||||||||||||||||
orf19ng-1 QTALDTLRGELGTLRTRSSGTQSHILLQQLQLIARQLEPYYRAYRQIPHRQPQNAAX
670 680 690 700 710
In addition, ORF19ng-1 shows significant homology with a hypothetical gonococcal protein previously entered in the databases:
sp|033369|YOR2_NEIGO Hypothetical 45.5 kD protein (ORF2) gnl|PID|e1154438 (AJ002423) hypothetical protein [Neisseria gonorrhoeae] Length = 417
Score = 1512 (705.6 bits), Expect = 5.3e-203, P = 5.3e-203
Identities = 301/326 (92%), Positives = 306/326 (93%)
Query: 307 RQSLRLLSDGNDSPDIRHLSRLLDNLGSVDQQFRQLRHSDSPAENDRMGDTRIAALETGS 366
RQSLRLLSDGNDS DIRHLSRLLDNLGSVDQQFRQLRHSDSPAENDRMGDTRIAALETGS
Sbjct: 1 RQSLRLLSDGNDSXDIRHLSRLLDNLGSVDQQFRQLRHSDSPAENDRMGDTRIAALETGS 60
Query: 367 FKNTWQAIRPQLNLESCVFRHAVRLSLVVAAACTIVEALNLNLGYWILLTALFVCQPNYT 426
FKNTWQAIRPQLNLES VFRHAVRLSLVVAAACTIVEALNLNLGYWILLT LFVCQPNYT
Sbjct: 61 FKNTWQAIRPQLNLESGVFRHAVRLSLVVAAACTIVEALNLNLGYWILLTRLFVCQPNYT 120
Query: 427 ATKSRVYQRIAGTVLGVIVGSLVPYFTPSVETKLWIVIAGTTLFFMTRTYKYSFSTFFIT 486
ATKSRVYQRIAGTVLGVIVGSLVPYFTPSVETKLWIVIAGTTLFFMTRTYKYSFSTFFIT
Sbjct: 121 ATKSRVYQRIAGTVLGVIVGSLVPYFTPSVETKLWIVIAGTTLFFMTRTYKYSFSTFFIT 180
Query: 487 IQALTSLSLAGLDVYAAMPVRIIDTIIGASLAWAAVSYLWPDWKYLTLERTAALAVCSSG 546
IQALTSLSLAGLDVYAAMPVRIIDTIIGASLAWAAVSYLWPDWKYLTLERTAALAVCSSG
Sbjct: 181 IQALTSLSLAGLDVYAAMPVRIIDTIIGASLAWAAVSYLWPDWKYLTLERTAALAVCSSG 240
Query: 547 TYLQKIAERLKTGETGDDIEYRITRRRAHEHTAALSSTLSDMSSEPAKFADSLQPGFTLL 606
TYLQKIAERLKTGETGDDIEYRITRRRAHEHTAALSSTLSDMSSEPAKFAD+ P
Sbjct: 241 TYLQKIAERLKTGETGDDIEYRITRRRAHEHTAALSSTLSDMSSEPAKFADTCNPALPCS 300
Query: 607 KTGYALTGYISALGAYRSEMHEECSP 632
K ALTGYISALG ++ + +P
Sbjct: 301 KPATALTGYISALGHTAAKCTKNAAP 326
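Based on this analysis, including the presence of several putative transmembrane domains in the gonococcal protein (the first of which is also seen in the meningococcal protein) and the homology with the YHFK protein, it is predicted that these proteins from N. meningitidis and N. gonorrhoeae, and their epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies.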
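Putative transmembrane domains of this kind are commonly looked for with a sliding-window hydropathy profile. The sketch below is an illustration only: the document does not state which method was used, and the Kyte-Doolittle scale, the 19-residue window, the 1.6 threshold and the function names are all assumptions introduced here.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(seq: str, window: int = 19) -> list:
    """Mean hydropathy of every window of `window` consecutive residues.

    Raises KeyError for symbols outside the 20 standard residues
    (for example 'X' or the '*' stop marker), so strip those first.
    """
    if window < 1 or window > len(seq):
        raise ValueError("window must be between 1 and len(seq)")
    scores = [KD[aa] for aa in seq]
    return [sum(scores[i:i + window]) / window
            for i in range(len(seq) - window + 1)]


def hydrophobic_windows(seq: str, window: int = 19, threshold: float = 1.6) -> list:
    """1-based start positions of windows whose mean hydropathy reaches the threshold."""
    profile = hydropathy_profile(seq, window)
    return [i + 1 for i, value in enumerate(profile) if value >= threshold]


# Residues 1-50 of ORF19ng-1 (SEQ ID 112), copied from the sequence above.
n_terminus = "MKTPLLKPLLITSLPVFASVFTAASIVWQLGEPKLAMPFVLGIIAGGLVD"
print(hydrophobic_windows(n_terminus))
```

Runs of consecutive start positions returned by `hydrophobic_windows` mark candidate membrane-spanning stretches; they are hints that would need confirmation by other means.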
Example 14
The following DNA sequence, believed to be complete, was identified in N. meningitidis <SEQ ID 113>:
1 ATGAATATGC TGGGAGCTTT GGCAAAAGTC GGCAGCCTGA CGATGGTGTC
51 GCGCGTTTTG GGATTTGTGC GCGATACGGT CATTGCGCGG GCATTCGGCG
101 CGGGTATGGC GACGGATGCG TTTTTTGTCG CGTTCAAACT GCCCAACCTG
151 CTTCGCCGCG TGTTTGCGGA GGGGGCGTTT GCCCAAGCGT TTGTGCCGAT
201 TTTGGCGGAA TACAAGGAAA CGCGTTCAAA AGAGGCGG.C GAAGCCTTTA
251 TCCGCCATGT GGCGGGGATG CTGTCGTTTG TACTGGTTAT CGTTACCGCG
301 CTGGGCATAC TTGCCGCGCC TTGGGTGATT TATGTTTCCG CACCCAGTT
351 TTGCCCAAGA TGCCGACAAA TTTCAGCTCT CCATCGATTT GCTGCGGATT
401 ACGTTTCCTT ATATATTATT GATTTCCCTG TCTTCATTTG TCGGCTCGGT
451 ACTCAATTCT TATCATAAGT TCGGCATTCC GGCGTTTACG CCAC.GTTTC
501 TGAACGTGTC GTTTATCGTA TTCGCGCTGT TTTTCGTGCC GTATTTCGAT
551 CCGCCCGTTA CCGCGCyGGC GTGGGCGGTC TTTGTCGGCG GCATTTTGCA
601 ACTCGrmTTC CAACTGCCCT GGCTGGCGAA ACTGGGCTTT TTGAAACTGC
651 CCAAACtGAG TTTCAAAGAT GCGGCGGTCA ACCGCGTGAT GAAACAGATG
701 GCGCCTGCgA TTTTgGGCGT GAgCGTGGCG CAGGTTTCTT TGGTGATCAA
751 CACGATTTTc GCGTCTTATC TGCAATCGGG CAGCGTTTCA TGGATGTATT
801 ACGCCGACCG CATGATGGAG CTGCCCAGCG GCGTGCTGGG GGCGGCACTC
851 GGTACGATTT TGCTGCCGAC TTTGTCCAAA CACTCGGCAA ACCaAGATAC
901 GGaACAGTTT TCCGCCCTGC TCGACTGGGG TTTGCGCCTG TGCATGCtgc
951 TGACGCTGCC GGCGgcGGTC GGACTGGCGG TGTTGTCGTT cCCgCtGGTG
1001 GCGACGCTGT TTATGTACCG CGwATTTACG CTGTTTGACG CGCAGATGAC
1051 GCAACACGCG CTGATTGCCT ATTCTTTCGG TTTAATCGGC TTAATCATGA
1101 TTAAAGTGTT GGCACCCGGC TTCTATGCGC GGCAAAACAT CAAwAmGCCC
1151 GTCAAAATCG CCATCTTCAC GCTCATCTGC mCGCAGTTGA TGAACCTTGs
1251 GGCGCGTGTA TCAATGCCGG ATTGTTGTTT TACCTGTTGC GCAGACACGG
1301 TATTTACCAA CCTGG.CAAG GGTTGGGCAG CGTTCTT.AG CAAAAATGCT
1351 GcTCTCGCTC GCCGTGA
This corresponds to the amino acid sequence <SEQ ID 114; ORF20>:
1 MNMLGALAKV GSLTMVSRVL GFVRDTVIAR AFGAGMATDA FFVAFKLPNL
51 LRRVFAEGAF AQAFVPILAE YKETRSKEAX EAFIRHVAGM LSFVLVIVTA
101 LGILAAPWVI YVSAPSFAQD ADKFQLSIDL LRITFPYILL ISLSSFVGSV
151 LNSYHKFGIP AFTPXFLNVS FIVFALFFVP YFDPPVTAXA WAVFVGGILQ
201 LXFQLPWLAK LGFLKLPKLS FKDAAVNRVM KQMAPAILGV SVAQVSLVIN
251 TIFASYLQSG SVSWMYYADR MMELPSGVLG AALGTILLPT LSKHSANQDT
301 EQFSALLDWG LRLCMLLTLP AAVGLAVLSF PLVATLFMYR XFTLFDAQMT
351 QHALIAYSFG LIGLIMIKVL APGFYARQNI XXPVKIAIFT LICXQLMNLX
401 FXGPLXXIGL SLAIGLGACI NAGLLFYLLR RHGIYQPXQG LGSVLXQKCC
451 SRSP*
Further work on these sequences gave the complete DNA sequence <SEQ ID 115>:
1 ATGAATATGC TGGGAGCTTT GGCAAAAGTC GGCAGCCTGA CGATGGTGTC
51 GCGCGTTTTG GGATTTGTGC GCGATACGGT CATTGCGCGG GCATTCGGCG
101 CGGGTATGGC GACGGATGCG TTTTTTGTCG CGTTCAAACT GCCCAACCTG
151 CTTCGCCGCG TGTTTGCGGA GGGGGCGTTT GCCCAAGCGT TTGTGCCGAT
201 TTTGGCGGAA TACAAGGAAA CGCGTTCAAA AGAGGCGGCG GAGGCTTTTA
251 TCCGCCATGT GGCGGGGATG CTGTCGTTTG TACTGGTTAT CGTTACCGCG
301 CTGGGCATAC TTGCCGCGCC TTGGGTGATT TATGTTTCCG CACCCGGTTT
351 TGCCCAAGAT GCCGACAAAT TTCAGCTCTC CATCGATTTG CTGCGGATTA
401 CGTTTCCTTA TATATTATTG ATTTCCCTGT CTTCATTTGT CGGCTCGGTA
451 CTCAATTCTT ATCATAAGTT CGGCATTCCG GCGTTTACGC CCACGTTTCT
501 GAACGTGTCG TTTATCGTAT TCGCGCTGTT TTTCGTGCCG TATTTCGATC
551 CGCCCGTTAC CGCGCTGGCG TGGGCGGTCT TTGTCGGCGG CATTTTGCAA
601 CTCGGCTTCC AACTGCCCTG GCTGGCGAAA CTGGGCTTTT TGAAACTGCC
651 CAAACTGAGT TTCAAAGATG CGGCGGTCAA CCGCGTGATG AAACAGATGG
701 CGCCTGCGAT TTTGGGCGTG AGCGTGGCGC AGGTTTCTTT GGTGATCAAC
751 ACGATTTTCG CGTCTTATCT GCAATCGGGC AGCGTTTCAT GGATGTATTA
801 CGCCGACCGC ATGATGGAGC TGCCCAGCGG CGTGCTGGGG GCGGCACTCG
851 GTACGATTTT GCTGCCGACT TTGTCCAAAC ACTCGGCAAA CCAAGATACG
901 GAACAGTTTT CCGCCCTGCT CGACTGGGGT TTGCGCCTGT GCATGCTGCT
951 GACGCTGCCG GCGGCGGTCG GACTGGCGGT GTTGTCGTTC CCGCTGGTGG
1001 CGACGCTGTT TATGTACCGC GAATTTACGC TGTTTGACGC GCAGATGACG
1051 CAACACGCGC TGATTGCCTA TTCTTTCGGT TTAATCGGCT TAATCATGAT
1101 TAAAGTGTTG GCACCCGGCT TCTATGCGCG GCAAAACATC AAAACGCCCG
1151 TCAAAATCGC CATCTTCACG CTCATCTGCA CGCAGTTGAT GAACCTTGCC
1201 TTTATCGGCC CACTGAAACA CGTCGGACTT TCGCTTGCCA TCGGTCTGGG
1251 CGCGTGTATC AATGCCGGAT TGTTGTTTTA CCTGTTGCGC AGACACGGTA
1301 TTTACCAACC TGGCAAGGGT TGGGCAGCGT TCTTAGCAAA AATGCTGCTC
1351 TCGCTCGCCG TGATGTGCGG CGGACTGTGG GCAGCGCAGG CTTACCTGCC
1401 GTTTGAATGG GCGCACGCCG GCGGAATGCG GAAAGCGGGG CAGCTCTGCA
1451 TCCTGATTGC CGTCGGCGGC GGACTGTATT TCGCATCACT GGCGGCTTTG
1501 GGCTTCCGTC CGCGCCATTT CAAACGCGTG GAAAACTGA
This corresponds to the amino acid sequence <SEQ ID 116; ORF20-1>:
1 MNMLGALAKV GSLTMVSRVL GFVRDTVIAR AFGAGMATDA FFVAFKLPNL
51 LRRVFAEGAF AQAFVPILAE YKETRSKEAA EAFIRHVAGM LSFVLVIVTA
101 LGILAAPWVI YVSAPGFAQD ADKFQLSIDL LRITFPYILL ISLSSFVGSV
151 LNSYHKFGIP AFTPTFLNVS FIVFALFFVP YFDPPVTALA WAVFVGGILQ
201 LGFQLPWLAK LGFLKLPKLS FKDAAVNRVM KQMAPAILGV SVAQVSLVIN
251 TIFASYLQSG SVSWMYYADR MMELPSGVLG AALGTILLPT LSKHSANQDT
301 EQFSALLDWG LRLCMLLTLP AAVGLAVLSF PLVATLFMYR EFTLFDAQMT
351 QHALIAYSFG LIGLIMIKVL APGFYARQNI KTPVKIAIFT LICTQLMNLA
401 FIGPLKHVGL SLAIGLGACI NAGLLFYLLR RHGIYQPGKG WAAFLAKMLL
451 SLAVMCGGLW AAQAYLPFEW AHAGGMRKAG QLCILIAVGG GLYFASLAAL
501 GFRPRHFKRV EN*
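The correspondence between a DNA sequence and the amino acid sequence it encodes can be checked by translating the DNA with the standard genetic code. The following is a minimal sketch, assuming the standard code and reading frame 1; the helper names `clean` and `translate` are illustrative and not part of the original work. Codons containing an undetermined base (written as `.` or as an ambiguity letter in the partial sequences) are reported as `X`, and stop codons as `*`, mirroring the notation used in this document.

```python
BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def clean(block: str) -> str:
    """Drop position numbers and spaces from a numbered sequence block, then uppercase.

    Letters and '.' (an undetermined base) are kept so the reading frame is preserved.
    """
    return "".join(ch for ch in block if ch.isalpha() or ch == ".").upper()


def translate(dna: str) -> str:
    """Translate DNA in frame 1; any codon not made only of A/C/G/T becomes 'X'."""
    dna = clean(dna)
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X")
        for i in range(0, len(dna) - 2, 3)
    )


# First 30 nucleotides of SEQ ID 115, in the numbered layout used in this document.
print(translate("1 ATGAATATGC TGGGAGCTTT GGCAAAAGTC"))  # MNMLGALAKV
# In-frame fragment of SEQ ID 113 that contains an undetermined base.
print(translate("AAAGAGGCGG.CGAAGCC"))  # KEAXEA
```

The first call reproduces residues 1-10 of ORF20-1, and the second reproduces the `KEAXEA` stretch of ORF20, where the `X` arises from the undetermined base in the partial sequence.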
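Computer analysis of this amino acid sequence gave the following results:
Homology with the MviN virulence factor of S. typhimurium (accession number P37169)
ORF20 and the MviN protein show 63% amino acid identity over a 440 amino acid overlap: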
Orf20 1 MNMLGALAKVGSLTMVSRVLGFVRDTVIARAFGAGMATDAFFVAFKLPNLLRRVFAEGAF 60
MN+L +LA V S+TM SRVLGF RD ++AR FGAGMATDAFFVAFKLPNLLRR+FAEGAF
MviN 14 MNLLKSLAAVSSMTMFSRVLGFARDAIVARIFGAGMATDAFFVAFKLPNLLRRIFAEGAF 73
Orf20 61 AQAFVPILAEYKETRSKEAXEAFIRHVAGMLSFVLVIVTALGILAAPWVIYVSAPSFAQD 120
+QAFVPILAEYK + +EA F+ +V+G+L+ L +VT G+LAAPWVI V+AP FA
MviN 74 SQAFVPILAEYKSKQGEEATRIFVAYVSGLLTLALAVVTVAGMLAAPWVIMVTAPGFADT 133
Orf20 121 ADKFQLSIDLLRITFPYILLISLSSFVGSVLNSYHKFGIPAFTPXFLNVSFIVFALFFVP 180
ADKF L+ LLRITFPYILLISL+S VG++LN++++F IPAF P FLN+S I FALF P
MviN 134 ADKFALTTQLLRITFPYILLISLASLVGAILNTWNRFSIPAFAPTFLNISMIGFALFAAP 193
Orf20 181 YFDPPVTAXAWAVFVGGILQLXFQLPWLAKLGFLKLPKLSFKDAAVNRVMKQMAPAILGV 240
YF+PPV A AWAV VGG+LQL +QLP+L K+G L LP+++F+D RV+KQM PAILGV
MviN 194 YFNPPVLALAWAVTVGGVLQLVYQLPYLKKIGMLVLPRINFRDTGAMRVVKQMGPAILGV 253
Orf20 241 SVAQVSLVINTIFASYLQSGSVSWMYYADRMMELPSGVLGAALGTILLPTLSKHSANQDT 300
SV+Q+SL+INTIFAS+L SGSVSWMYYADR+ME PSGVLG ALGTILLP+LSK A+ +
MviN 254 SVSQISLIINTIFASFLASGSVSWMYYADRLMEFPSGVLGVALGTILLPSLSKSFASGNH 313
Orf20 301 EQFSALLDWGLRLCMLLTLPAAVGLAVLSFPLVATLFMYRXFTLFDAQMTQHALIAYSFG 360
+++ L+DWGLRLC LL LP+AV L+L+ PL +LF Y FT FDA MTQ ALIAYS G
MviN 314 DEYCRLMDWGLRLCFLLALPSAVALGILAKPLTVSLFQYGKFTAFDAAMTQRALIAYSVG 373
Orf20 361 LIGLIMIKVLAPGFYARQNIXXPVKIAIFTLICXQLMNLXFXXXXXXXXXXXXXXXXXCI 420
LIGLI++KVLAPGFY+RQ+I PVKIAI TLI QLMNL F C+
MviN 374 LIGLIVVKVLAPGFYSRQDIKTPVKIAIVTLIMTQLMNLAFIGPLKHAGLSLSIGLAACL 433
Orf20 421 NAGLLFYLLRRHGIYQPXQG 440
NA LL++ LR+ I+ P G
MviN 434 NASLLYWQLRKQNIFTPQPG 453
Homology with a predicted ORF from N. meningitidis (strain A)
ORF20 shows 93.5% identity over a 447 amino acid overlap with an ORF (ORF20a) from strain A of N. meningitidis:
10 20 30 40 50 60
orf20.pep MNMLGALAKVGSLTMVSRVLGFVRDTVIARAFGAGMATDAFFVAFKLPNLLRRVFAEGAF
|||||||:||||||||||||||||||||||||||||||||||||||||||||||||||||
orf20a MNMLGALVKVGSLTMVSRVLGFVRDTVIARAFGAGMATDAFFVAFKLPNLLRRVFAEGAF
10 20 30 40 50 60
70 80 90 100 110 120
orf20.pep AQAFVPILAEYKETRSKEAXEAFIRHVAGMLSFVLVIVTALGILAAPWVIYVSAPSFAQD
|||||||||||||||||||:|||||||||||||||||||||||||||||||||||:||:|
orf20a AQAFVPILAEYKETRSKEATEAFIRHVAGMLSFVLVIVTALGILAAPWVIYVSAPGFAKD
70 80 90 100 110 120
130 140 150 160 170 180
orf20.pep ADKFQLSIDLLRITFPYILLISLSSFVGSVLNSYHKFGIPAFTPXFLNVSFIVFALFFVP
|||||||||||||||||||||||||||||||||||||:||||||:|||||||||||||||
orf20a ADKFQLSIDLLRITFPYILLISLSSFVGSVLNSYHKFSIPAFTPTFLNVSFIVFALFFVP
130 140 150 160 170 180
190 200 210 220 230 240
orf20.pep YFDPPVTAXAWAVFVGGILQLXFQLPWLAKLGFLKLPKLSFKDAAVNRVMKQMAPAILGV
|||||||| |||||||||||| ||||||||||||||||||||||||||||||||||||||
orf20a YFDPPVTALAWAVFVGGILQLGFQLPWLAKLGFLKLPKLSFKDAAVNRVMKQMAPAILGV
190 200 210 220 230 240
250 260 270 280 290 300
orf20.pep SVAQVSLVINTIFASYLQSGSVSWMYYADRMMELPSGVLGAALGTILLPTLSKHSANQDT
||||:||||||||||||||||||||||||||||||:||||||||||||||||||||||||
orf20a SVAQISLVINTIFASYLQSGSVSWMYYADRMMELPGGVLGAALGTILLPTLSKHSANQDT
250 260 270 280 290 300
310 320 330 340 350 360
orf20.pep EQFSALLDWGLRLCMLLTLPAAVGLAVLSFPLVATLFMYRXFTLFDAQMTQHALIAYSFG
|||||||||||| |||||||||||:||||||||||||||| |||||||||||||||||||
orf20a EQFSALLDWGLRXCMLLTLPAAVGMAVLSFPLVATLFMYREFTLFDAQMTQHALIAYSFG
310 320 330 340 350 360
370 380 390 400 410 420
orf20.pep LIGLIMIKVLAPGFYARQNIXXPVKIAIFTLICXQLMNLXFXGPLXXIGLSLAIGLGACI
|||||||||||||||||||| :|||||||||||:||||| | |||  :||||||||||||
orf20a LIGLIMIKVLAPGFYARQNIKTPVKIAIFTLICTQLMNLAFIGPLKHVGLSLAIGLGACI
370 380 390 400 410 420
430 440 450
orf20.pep NAGLLFYLLRRHGIYQPXQGLGSVLXQKCCSRSPX
||||||||||||||||| :| : : | :
orf20a NAGLLFYLLRRHGIYQPGKGWAAFLAKMLLSLAVMGGGLYAAQIWLPFDWAHAGGMQKAA
430 440 450 460 470 480
The full-length ORF20a nucleotide sequence <SEQ ID 117> is:
1 ATGAATATGC TGGGAGCTTT GGTAAAAGTC GGCAGCCTGA CGATGGTGTC
51 GCGCGTTTTG GGATTTGTGC GCGATACGGT CATTGCGCGC GCATTCGGCG
101 CAGGCATGGC GACGGATGCG TTCTTTGTCG CGTTCAAACT GCCCAACCTG
151 CTTCGCCGCG TGTTTGCGGA GGGGGCGTTT GCCCAAGCGT TTGTGCCGAT
201 TTTGGCGGAA TATAAGGAAA CGCGTTCTAA AGAGGCGACG GAGGCTTTTA
251 TCCGCCATGT GGCGGGGATG CTGTCGTTTG TACTGGTCAT CGTTACCGCG
301 CTGGGCATAC TTGCCGCGCC TTGGGTGATT TATGTTTCCG CACCCGGTTT
351 TGCCAAAGAT GCCGACAAAT TTCAGCTCTC TATCGATTTG CTGCGGATTA
401 CGTTTCCTTA TATCTTATTG ATTTCACTTT CCTCTTTTGT CGGCTCGGTA
451 CTCAATTCCT ATCATAAATT CAGCATTCCT GCGTTTACGC CCACGTTCCT
501 GAACGTGTCG TTTATCGTAT TCGCGCTGTT TTTCGTGCCG TATTTCGATC
551 CTCCCGTTAC CGCGCTGGCT TGGGCGGTTT TTGTCGGCGG CATTTTGCAA
601 CTCGGCTTCC AACTGCCCTG GCTGGCGAAA CTGGGTTTTT TGAAACTGCC
651 CAAACTGAGT TTCAAAGATG CGGCGGTCAA CCGCGTGATG AAACAGATGG
701 CGCCTGCGAT TTTGGGCGTG AGCGTGGCGC AGATTTCTTT GGTGATCAAC
751 ACGATTTTCG CGTCTTATCT GCAATCGGGC AGCGTTTCAT GGATGTATTA
801 CGCCGACCGC ATGATGGAAC TGCCCGGCGG CGTGCTGGGG GCGGCACTCG
851 GTACGATTTT GCTGCCGACT TTGTCCAAAC ACTCGGCAAA CCAAGATACG
901 GAACAGTTTT CCGCCCTGCT CGACTGGGGT TTGCGCNTGT GCATGCTGCT
951 GACGCTGCCG GCGGCGGTCG GAATGGCGGT GTTGTCGTTC CCGCTGGTGG
1001 CAACCTTGTT TATGTACCGA GAATTCACGC TGTTTGACGC GCAGATGACG
1051 CAACACGCGC TGATTGCCTA TTCTTTCGGT TTAATCGGTT TAATCATGAT
1101 TAAAGTGTTG GCGCCCGGCT TTTATGCGCG GCAAAACATC AAAACGCCCG
1151 TCAAAATCGC CATCTTCACG CTCATTTGCA CGCAGTTGAT GAACCTTGCC
1201 TTTATCGGCC CACTGAAACA CGTCGGACTT TCGCTTGCCA TCGGTCTGGG
1251 CGCGTGTATC AATGCCGGAT TGTTGTTTTA CCTGTTGCGC AGACACGGTA
1301 TTTACCAACC TGGCAAGGGT TGGGCAGCGT TCTTGGCAAA AATGCTGCTC
1351 TCGCTCGCCG TGATGGGAGG CGGCCTGTAT GCCGCCCAAA TCTGGCTGCC
1401 GTTCGACTGG GCACACGCCG GCGGAATGCA AAAGGCCGCC CGGCTCTTCA
1451 TCCTGATTGC CGTCGGCGGC GGACTGTATT TCGCATCACT GGCGGCTTTG
1501 GGCTTCCGTC CGCGCCATTT CAAACGCGTG GAAAGCTGA
This encodes a protein having amino acid sequence <SEQ ID 118>:
1 MNMLGALVKV GSLTMVSRVL GFVRDTVIAR AFGAGMATDA FFVAFKLPNL
51 LRRVFAEGAF AQAFVPILAE YKETRSKEAT EAFIRHVAGM LSFVLVIVTA
101 LGILAAPWVI YVSAPGFAKD ADKFQLSIDL LRITFPYILL ISLSSFVGSV
151 LNSYHKFSIP AFTPTFLNVS FIVFALFFVP YFDPPVTALA WAVFVGGILQ
201 LGFQLPWLAK LGFLKLPKLS FKDAAVNRVM KQMAPAILGV SVAQISLVIN
251 TIFASYLQSG SVSWMYYADR MMELPGGVLG AALGTILLPT LSKHSANQDT
301 EQFSALLDWG LRXCMLLTLP AAVGMAVLSF PLVATLFMYR EFTLFDAQMT
351 QHALIAYSFG LIGLIMIKVL APGFYARQNI KTPVKIAIFT LICTQLMNLA
401 FIGPLKHVGL SLAIGLGACI NAGLLFYLLR RHGIYQPGKG WAAFLAKMLL
451 SLAVMGGGLY AAQIWLPFDW AHAGGMQKAA RLFILIAVGG GLYFASLAAL
501 GFRPRHFKRV ES*
ORF20a and ORF20-1 show 96.5% identity over an overlap of 512 amino acids:
10 20 30 40 50 60
orf20a.pep MNMLGALVKVGSLTMVSRVLGFVRDTVIARAFGAGMATDAFFVAFKLPNLLRRVFAEGAF
|||||||:||||||||||||||||||||||||||||||||||||||||||||||||||||
orf20-1 MNMLGALAKVGSLTMVSRVLGFVRDTVIARAFGAGMATDAFFVAFKLPNLLRRVFAEGAF
10 20 30 40 50 60
70 80 90 100 110 120
orf20a.pep AQAFVPILAEYKETRSKEATEAFIRHVAGMLSFVLVIVTALGILAAPWVIYVSAPGFAKD
|||||||||||||||||||:||||||||||||||||||||||||||||||||||||||:|
orf20-1 AQAFVPILAEYKETRSKEAAEAFIRHVAGMLSFVLVIVTALGILAAPWVIYVSAPGFAQD
70 80 90 100 110 120
130 140 150 160 170 180
orf20a.pep ADKFQLSIDLLRITFPYILLISLSSFVGSVLNSYHKFSIPAFTPTFLNVSFIVFALFFVP
|||||||||||||||||||||||||||||||||||||:||||||||||||||||||||||
orf20-1 ADKFQLSIDLLRITFPYILLISLSSFVGSVLNSYHKFGIPAFTPTFLNVSFIVFALFFVP
130 140 150 160 170 180
190 200 210 220 230 240
orf20a.pep YFDPPVTALAWAVFVGGILQLGFQLPWLAKLGFLKLPKLSFKDAAVNRVMKQMAPAILGV
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf20-1 YFDPPVTALAWAVFVGGILQLGFQLPWLAKLGFLKLPKLSFKDAAVNRVMKQMAPAILGV
190 200 210 220 230 240
250 260 270 280 290 300
orf20a.pep SVAQISLVINTIFASYLQSGSVSWMYYADRMMELPGGVLGAALGTILLPTLSKHSANQDT
||||:||||||||||||||||||||||||||||||:||||||||||||||||||||||||
orf20-1 SVAQVSLVINTIFASYLQSGSVSWMYYADRMMELPSGVLGAALGTILLPTLSKHSANQDT
250 260 270 280 290 300
310 320 330 340 350 360
orf20a.pep EQFSALLDWGLRXCMLLTLPAAVGMAVLSFPLVATLFMYREFTLFDAQMTQHALIAYSFG
|||||||||||| |||||||||||:|||||||||||||||||||||||||||||||||||
orf20-1 EQFSALLDWGLRLCMLLTLPAAVGLAVLSFPLVATLFMYREFTLFDAQMTQHALIAYSFG
310 320 330 340 350 360
370 380 390 400 410 420
orf20a.pep LIGLIMIKVLAPGFYARQNIKTPVKIAIFTLICTQLMNLAFIGPLKHVGLSLAIGLGACI
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf20-1 LIGLIMIKVLAPGFYARQNIKTPVKIAIFTLICTQLMNLAFIGPLKHVGLSLAIGLGACI
370 380 390 400 410 420
430 440 450 460 470 480
orf20a.pep NAGLLFYLLRRHGIYQPGKGWAAFLAKMLLSLAVMGGGLYAAQIWLPFDWAHAGGMQKAA
||||||||||||||||||||||||||||||||||| |||:||| :|||:|||||||:||:
orf20-1 NAGLLFYLLRRHGIYQPGKGWAAFLAKMLLSLAVMCGGLWAAQAYLPFEWAHAGGMRKAG
430 440 450 460 470 480
490 500 510
orf20a.pep RLFILIAVGGGLYFASLAALGFRPRHFKRVESX
:| ||||||||||||||||||||||||||||:|
orf20-1 QLCILIAVGGGLYFASLAALGFRPRHFKRVENX
490 500 510
Homology with a predicted ORF from N. gonorrhoeae
ORF20 shows 92.1% identity over a 454 amino acid overlap with a predicted ORF (ORF20ng) from N. gonorrhoeae:
orf20.pep MNMLGALAKVGSLTMVSRVLGFVRDTVIARAFGAGMATDAFFVAFKLPNLLRRVFAEGAF 60
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf20ng MNMLGALAKVGSLTMVSRVLGFVRDTVIARAFGAGMATDAFFVAFKLPNLLRRVFAEGAF 60
orf20.pep AQAFVPILAEYKETRSKEAXEAFIRHVAGMLSFVLVIVTALGILAAPWVIYVSAPSFAQD 120
||||||||||||||||||||:||||||||||||||::||||||||||||||||||:|::|
orf20ng AQAFVPILAEYKETRSKEATEAFIRHVAGMLSFVLIVVTALGILAAPWVIYVSAPGFTKD 120
orf20.pep ADKFQLSIDLLRITFPYILLISLSSFVGSVLNSYHKFGIPAFTPXFLNVSFIVFALFFVP 180
||||||||:||||||||||||||||||||:||||||||||||||:|||:|||||||||||
orf20ng ADKFQLSISLLRITFPYILLISLSSFVGSILNSYHKFGIPAFTPTFLNISFIVFALFFVP 180
orf20.pep YFDPPVTAXAWAVFVGGILQLXFQLPWLAKLGFLKLPKLSFKDAAVNRVMKQMAPAILGV 240
|||||||| |||||||||||| |||||||||||||||||:||||||||||||||||||||
orf20ng YFDPPVTALAWAVFVGGILQLGFQLPWLAKLGFLKLPKLNFKDAAVNRVMKQMAPAILGV 240
orf20.pep SVAQVSLVINTIFASYLQSGSVSWMYYADRMMELPSGVLGAALGTILLPTLSKHSANQDT 300
||||:||||||||||||||||||||||||||||||:||||||||||||||||||||||||
orf20ng SVAQISLVINTIFASYLQSGSVSWMYYADRMMELPGGVLGAALGTILLPTLSKHSANQDT 300
orf20.pep EQFSALLDWGLRLCMLLTLPAAVGLAVLSFPLVATLFMYRXFTLFDAQMTQHALIAYSFG 360
||||||||||||||||||||||:||||||||||||||||| |||||||||||||||||||
orf20ng EQFSALLDWGLRLCMLLTLPAAAGLAVLSFPLVATLFMYREFTLFDAQMTQHALIAYSFG 360
orf20.pep LIGLIMIKVLAPGFYARQNIXXPVKIAIFTLICXQLMNLXFXGPLXXIGLSLAIGLGACI 420
||||||||||| |||||||| :|||||||||||:||||| | ||| ||||||||||||
orf20ng LIGLIMIKVLASGFYARQNIKTPVKIAIFTLICTQLMNLAFIGPLKHAGLSLAIGLGACI 420
orf20.pep NAGLLFYLLRRHGIYQPXQGLGSVLXQKCCSRSP 454
||||||:|:|:||||:| ||||: :|||||||
orf20ng NAGLLFFLFRKHGIYRPGQGLGQPSWRKCCSRSP 454
The predicted ORF20ng nucleotide sequence <SEQ ID 119> encodes a protein having amino acid sequence <SEQ ID 120>:
1 MNMLGALAKV GSLTMVSRVL GFVRDTVIAR AFGAGMATDA FFVAFKLPNL
51 LRRVFAEGAF AQAFVPILAE YKETRSKEAT EAFIRHVAGM LSFVLIVVTA
101 LGILAAPWVI YVSAPGFTKD ADKFQLSISL LRITFPYILL ISLSSFVGSI
151 LNSYHKFGIP AFTPTFLNIS FIVFALFFVP YFDPPVTALA WAVFVGGILQ
201 LGFQLPWLAK LGFLKLPKLN FKDAAVNRVM KQMAPAILGV SVAQISLVIN
251 TIFASYLQSG SVSWMYYADR MMELPGGVLG AALGTILLPT LSKHSANQDT
301 EQFSALLDWG LRLCMLLTLP AAAGLAVLSF PLVATLFMYR EFTLFDAQMT
351 QHALIAYSFG LIGLIMIKVL ASGFYARQNI KTPVKIAIFT LICTQLMNLA
401 FIGPLKHAGL SLAIGLGACI NAGLLFFLFR KHGIYRPGQG LGQPSWRKCC
451 SRSP*
Further DNA analysis revealed the following DNA sequence <SEQ ID 121>:
1 ATGAATATGC TTGGAGCTTT GGCAAAAGTC GGCAGCCTGA CGATGGTGTC
51 GCGCGTTTTG GGATTTGTGC GCGATACGGT CATTGCGCGG GCATTCGGCG
101 CGGGTATGGC GACGGATGCG TTTTTTGTCG CGTTCAAACT GCCCAACCTG
151 CTTCGCCGCG TGTTTGCGGA GGGGGCGTTT GCCCAAGCGT TTGTGCCGAT
201 TTTGGCGGAA TATAAGGAAA CGCGTTCTAA AGAGGCGAcg gAGGCTTTTA
251 TCCGCCACGt tgcgggAatg CTGTCGTTTG TGCTGATcgt cGttacCGCG
301 CTGGGCATAC TTGCCGCgcc tTGGGTGATT TATGTTtccg CgcccGGCTT
351 TACCAAAGAC GCGGACAAGT TCCAACTTTC CATCAGCCTG CTGCGGATTA
401 CGTTTCCTTA TATATTATTG ATTTCTTTGT CTTCTTTTGT CGGCTCGATA
451 CTCAATTCCT ACCATAAGTT CGGCATTCCC GCGTTTACGC CCACGTTTTT
501 AAACATCTCT TTTATCGTAT TCGCACTGTT TTTCGTGCCG TATTTCGATC
551 CGCCCGTTAC CGCGCTGGCG TGGGCGGTTT TTGTCGGCGG TATTTTGCAG
601 CTCGGTTTCC AACTGCCGTG GCTGGCGAAA CTGGGCTTTT TGAAACTGCC
651 CAAACTGAAT TTCAAAGATG CGGCGGTCAA CCGCGTCATG AAACAGATGG
701 CGCCTGCGAT TTTGGGCGTG agcgTGGCGC AAATTTCTTT GgttATCAAC
751 ACGATTTTCG CGTCTTATCT GCAATCGGGC AGCGTTTCAT GGATGTatta
801 cgCCGACCGC ATGATGGAGc tgcgccGGGG CGTGCTGGGG GCTGCACTCG
851 GTACAATTTT GCTGCCGACT TTGTCCAAAC ACTCGGCAAA CCAAGATACG
901 GAACAGTTTT CCGCCCTGCT CGACTGGGGT TTGCGCCTGT GCATGCTGCT
951 GACGCTGCCG GCGGCGGccg GACTGGCGGT ATTGTCGTTC CCGCTGGTGG
1001 CGACGCTGTT TATGTACCGA GAATTCACGC TGTTTGACGC ACAAATGACG
1051 CAACACGCGC TGATTGCCTA TTCTTTCGGT TTAATCGGTT TAATTATGAT
1101 TAAAGTGTTG GCATCCGGCT TTTATGCGCG GCAAAACATC AAAACGCCCG
1151 TCAAAATCGC CATCTTCACG CTCATCTGCA CGCAGTTGAT GAACCTCGCC
1201 TTTATCGGTC CGTTGAAACA CGCCGGGCTT TCGCTCGCCA TCGGCCTGGG
1251 CGCGTGCATC AACGCCGGAT TGTTGTTCTT CCTGTTGCGC AAACACGGTA
1301 TTTACCGGCC cggcaggggt tgggcggcgt TCTTGGCGAA AATGCTGCTC
1351 GCGCTCGCCG TGATGTGCGG CGGACTGTGG GCGGCGCAGG CTTGCCTGCC
1401 GTTCGAATGG GCGCACGCCG GCGGAATGCG GAAAGCGGGG CAGCTCTGCA
1451 TCCTGATTGC CGTCGGCGGC GGACTGTATT TCGCATCTCT GGCGGCTTTG
1501 GGCTTCCGTC CGCGCCATTT CAAACGCGTG GAAAGCTGA
This encodes the following amino acid sequence <SEQ ID 122; ORF20ng-1>:
1 MNMLGALAKV GSLTMVSRVL GFVRDTVIAR AFGAGMATDA FFVAFKLPNL
51 LRRVFAEGAF AQAFVPILAE YKETRSKEAT EAFIRHVAGM LSFVLIVVTA
101 LGILAAPWVI YVSAPGFTKD ADKFQLSISL LRITFPYILL ISLSSFVGSI
151 LNSYHKFGIP AFTPTFLNIS FIVFALFFVP YFDPPVTALA WAVFVGGILQ
201 LGFQLPWLAK LGFLKLPKLN FKDAAVNRVM KQMAPAILGV SVAQISLVIN
251 TIFASYLQSG SVSWMYYADR MMELRRGVLG AALGTILLPT LSKHSANQDT
301 EQFSALLDWG LRLCMLLTLP AAAGLAVLSF PLVATLFMYR EFTLFDAQMT
351 QHALIAYSFG LIGLIMIKVL ASGFYARQNI KTPVKIAIFT LICTQLMNLA
401 FIGPLKHAGL SLAIGLGACI NAGLLFFLLR KHGIYRPGRG WAAFLAKMLL
451 ALAVMCGGLW AAQACLPFEW AHAGGMRKAG QLCILIAVGG GLYFASLAAL
501 GFRPRHFKRV ES*
ORF20ng-1 and ORF20-1 show 95.7% identity over an overlap of 512 amino acids:
10 20 30 40 50 60
orf20-1.pep MNMLGALAKVGSLTMVSRVLGFVRDTVIARAFGAGMATDAFFVAFKLPNLLRRVFAEGAF
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf20ng-1 MNMLGALAKVGSLTMVSRVLGFVRDTVIARAFGAGMATDAFFVAFKLPNLLRRVFAEGAF
10 20 30 40 50 60
70 80 90 100 110 120
orf20-1.pep AQAFVPILAEYKETRSKEAAEAFIRHVAGMLSFVLVIVTALGILAAPWVIYVSAPGFAQD
||||||||||||||||||:|||||||||||||||::|||||||||||||||||||||::|
orf20ng-1 AQAFVPILAEYKETRSKEATEAFIRHVAGMLSFVLIVVTALGILAAPWVIYVSAPGFTKD
70 80 90 100 110 120
130 140 150 160 170 180
orf20-1.pep ADKFQLSIDLLRITFPYILLISLSSFVGSVLNSYHKFGIPAFTPTFLNVSFIVFALFFVP
||||||||:||||||||||||||||||||:||||||||||||||||||:|||||||||||
orf20ng-1 ADKFQLSISLLRITFPYILLISLSSFVGSILNSYHKFGIPAFTPTFLNISFIVFALFFVP
130 140 150 160 170 180
190 200 210 220 230 240
orf20-1.pep YFDPPVTALAWAVFVGGILQLGFQLPWLAKLGFLKLPKLSFKDAAVNRVMKQMAPAILGV
|||||||||||||||||||||||||||||||||||||||:||||||||||||||||||||
orf20ng-1 YFDPPVTALAWAVFVGGILQLGFQLPWLAKLGFLKLPKLNFKDAAVNRVMKQMAPAILGV
190 200 210 220 230 240
250 260 270 280 290 300
orf20-1.pep SVAQVSLVINTIFASYLQSGSVSWMYYADRMMELPSGVLGAALGTILLPTLSKHSANQDT
||||:||||||||||||||||||||||||||||| ||||||||||||||||||||||||
orf20ng-1 SVAQISLVINTIFASYLQSGSVSWMYYADRMMELRRGVLGAALGTILLPTLSKHSANQDT
250 260 270 280 290 300
310 320 330 340 350 360
orf20-1.pep EQFSALLDWGLRLCMLLTLPAAVGLAVLSFPLVATLFMYREFTLFDAQMTQHALIAYSFG
||||||||||||||||||||||:|||||||||||||||||||||||||||||||||||||
orf20ng-1 EQFSALLDWGLRLCMLLTLPAAAGLAVLSFPLVATLFMYREFTLFDAQMTQHALIAYSFG
310 320 330 340 350 360
370 380 390 400 410 420
orf20-1.pep LIGLIMIKVLAPGFYARQNIKTPVKIAIFTLICTQLMNLAFIGPLKHVGLSLAIGLGACI
||||||||||| |||||||||||||||||||||||||||||||||||:||||||||||||
orf20ng-1 LIGLIMIKVLASGFYARQNIKTPVKIAIFTLICTQLMNLAFIGPLKHAGLSLAIGLGACI
370 380 390 400 410 420
430 440 450 460 470 480
orf20-1.pep NAGLLFYLLRRHGIYQPGKGWAAFLAKMLLSLAVMCGGLWAAQAYLPFEWAHAGGMRKAG
||||||:|||:||||:||:|||||||||||:||||||||||||| |||||||||||||||
orf20ng-1 NAGLLFFLLRKHGIYRPGRGWAAFLAKMLLALAVMCGGLWAAQACLPFEWAHAGGMRKAG
430 440 450 460 470 480
490 500 510
orf20-1.pep QLCILIAVGGGLYFASLAALGFRPRHFKRVENX
|||||||||||||||||||||||||||||||:|
orf20ng-1 QLCILIAVGGGLYFASLAALGFRPRHFKRVESX
490 500 510
In addition, ORF20ng-1 shows significant homology to a virulence factor of Salmonella typhimurium:
sp|P37169|MVIN_SALTY virulence factor MVIN pir||S40271 mviN protein - Salmonella typhimurium gi|438252 (Z26133) mviB gene product [Salmonella typhimurium] gnl|PID|d1005521 (D25292) ORF2 [Salmonella typhimurium] Length = 524
Score = 1573 (750.1 bits), Expect = 1.1e-220, Sum P(2) = 1.1e-220
Identities = 309/467 (66%), Positives = 368/467 (78%)
Query: 1 MNMLGALAKVGSLTMVSRVLGFVRDTVIARAFGAGMATDAFFVAFKLPNLLRRVFAEGAF 60
MN+L +LA V S+TM SRVLGF RD ++AR FGAGMATDAFFVAFKLPNLLRR+FAEGAF
Sbjct: 14 MNLLKSLAAVSSMTMFSRVLGFARDAIVARIFGAGMATDAFFVAFKLPNLLRRIFAEGAF 73
Query: 61 AQAFVPILAEYKETRSKEATEAFIRHVAGMLSFVLIVVTALGILAAPWVIYVSAPGFTKD 120
+QAFVPILAEYK + +EAT F+ +V+G+L+ L VVT G+LAAPWVI V+APGF
Sbjct: 74 SQAFVPILAEYKSKQGEEATRIFVAYVSGLLTLALAVVTVAGMLAAPWVIMVTAPGFADT 133
Query: 121 ADKFQLSISLLRITFPYILLISLSSFVGSILNSYHKFGIPAFTPTFLNISFIVFALFFVP 180
ADKF L+ LLRITFPYILLISL+S VG+ILN++++F IPAF PTFLNIS I FALF P
Sbjct: 134 ADKFALTTQLLRITFPYILLISLASLVGAILNTWNRFSIPAFAPTFLNISMIGFALFAAP 193
Query: 181 YFDPPVTALAWAVFVGGILQLGFQLPWLAKLGFLKLPKLNFKDAAVNRVMKQMAPAILGV 240
YF+PPV ALAWAV VGG+LQL +QLP+L K+G L LP++NF+D RV+KQM PAILGV
Sbjct: 194 YFNPPVLALAWAVTVGGVLQLVYQLPYLKKIGMLVLPRINFRDTGAMRVVKQMGPAILGV 253
Query: 241 SVAQISLVINTIFASYLQSGSVSWMYYADRMMELRRGVLGAALGTILLPTLSKHSANQDT 300
SV+QISL+INTIFAS+L SGSVSWMYYADR+ME GVLG ALGTILLP+LSK A+ +
Sbjct: 254 SVSQISLIINTIFASFLASGSVSWMYYADRLMEFPSGVLGVALGTILLPSLSKSFASGNH 313
Query: 301 EQFSALLDWGLRLCMLLTLPAAAGLAVLSFPLVATLFMYREFTLFDAQMTQHALIAYSFG 360
+++ L+DWGLRLC LL LP+A L +L+ PL +LF Y +FT FDA MTQ ALIAYS G
Sbjct: 314 DEYCRLMDWGLRLCFLLALPSAVALGILAKPLTVSLFQYGKFTAFDAAMTQRALIAYSVG 373
Query: 361 LIGLIMIKVLASGFYARQNIKTPVKIAIFTLICTQLMNLAFIGPLKHAGLSLAIGLGACI 420
LIGLI++KVLA GFY+RQ+IKTPVKIAI TLI TQLMNLAFIGPLKHAGLSL+IGL AC+
Sbjct: 374 LIGLIVVKVLAPGFYSRQDIKTPVKIAIVTLIMTQLMNLAFIGPLKHAGLSLSIGLAACL 433
Query: 421 NAGLLFFLLRKHGIYRPGRGWXXXXXXXXXXXXVMCGGLWAAQACLP 467
NA LL++ LRK I+ P GW VM L+ +P
Sbjct: 434 NASLLYWQLRKQNIFTPQPGWMWFLMRLIISVLVMAAVLFGVLHIMP 480
Score = 70 (33.4 bits), Expect = 1.1e-220, Sum P(2) = 1.1e-220
Identities = 14/41 (34%), Positives = 23/41 (56%)
Query: 469 EWAHAGGMRKAGQLCILIAVGGGLYFASLAALGFRPRHFKR 509
EW+ + + +L ++ G YFA+LA LGF+ + F R
Sbjct: 481 EWSQGSMLWRLLRLMAVVIAGIAAYFAALAVLGFKVKEFVR 521
Based on this analysis, including the homology with a virulence factor of Salmonella typhimurium, it is predicted that these proteins from N. meningitidis and N. gonorrhoeae could be useful antigens for vaccines or diagnostics, or for raising antibodies.
Example 15
The following partial DNA sequence <SEQ ID 123> was identified in N. meningitidis:
1 atGATTAAAA TCAAAAAAGG TCTAAACCTG CCCATCGCGG GCAGACCGGA
51 GCAAGCCGTT tACGACGGCC CGGCCaTTAC CGAAGtCGCG TTGCTTGGCG
101 AAGAATATGC CGGTATGCGC CCCTCGATGA AAGTCAAGGA AGGCGATGCC
151 GTcAAAAAAG GCCAAGTGCT GTTTGAAGAC AAAAAGAATC CGGGCGTGGT
201 GTTTACTGCG CCGGCTTCAG GcAAAATCGC CGCGATTCAC CGTGGCGAAA
251 AGCGCGTACT TCAGTCAGTC GTGATTGCCG TTGAArGCAA CGACGAAATC
301 GAGTTTGAAC GCTACGCACC TGAAGCGCTG GCAAACTTAA GCGGCGAAGA
351 AGTGCGCCGC AACCTGATCC AATCCGGTTT GTGGACTGCG CTGCGCACCC
401 GTCCGTTCAG CAAAATTCCT GCCGTCGATG CCGAGCCGTT CGCCATCTTC
451 GTCAATGCGA tGGACACCAA TCCG..
This corresponds to the amino acid sequence <SEQ ID 124; ORF22>:
1 MIKIKKGLNL PIAGRPEQAV YDGPAITEVA LLGEEYAGMR PSMKVKEGDA
51 VKKGQVLFED KKNPGVVFTA PASGKIAAIH RGEKRVLQSV VIAVEXNDEI
101 EFERYAPEAL ANLSGEEVRR NLIQSGLWTA LRTRPFSKIP AVDAEPFAIF
151 VNAMDTNP..
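The X at position 96 of ORF22 appears to arise from an ambiguous base call in the partial DNA sequence (the lower-case r, a purine, at nucleotide 286). The following Python sketch illustrates this; it is not the method used to prepare the listings. It assumes the standard genetic code and the IUPAC nucleotide ambiguity codes, and the function name `translate` is hypothetical. A codon is reported as X whenever its possible readings encode different residues.

```python
from itertools import product

# Standard genetic code in the conventional TCAG ordering (assumed here).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {a + b + c: AMINO[i]
               for i, (a, b, c) in enumerate(product(BASES, repeat=3))}

# IUPAC ambiguity codes that occur in the listings, plus the four plain bases.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
         "K": "GT", "M": "AC", "N": "ACGT"}


def translate(dna):
    """Translate a DNA listing; ambiguous codons become 'X'.

    Digits, spaces and gap dots from the printed layout are ignored.
    """
    seq = "".join(ch for ch in dna.upper() if ch in IUPAC)
    protein = []
    for i in range(0, len(seq) - 2, 3):
        codon = seq[i:i + 3]
        # Expand every possible reading of the codon and collect the residues.
        residues = {CODON_TABLE[a + b + c]
                    for a, b, c in product(*(IUPAC[x] for x in codon))}
        protein.append(residues.pop() if len(residues) == 1 else "X")
    return "".join(protein)


# Nucleotides 280-291 of SEQ ID 123: the codon rGC can read AGC (S) or GGC (G).
print(translate("GTTGAArGCAAC"))
```

An ambiguous base does not always produce an X: a codon such as CTN still translates to L, because all four readings encode leucine.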
Further work revealed the complete nucleotide sequence <SEQ ID 125>:
1 ATGATTAAAA TCAAAAAAGG TCTAAACCTG CCCATCGCGG GCAGACCGGA
51 GCAAGCCGTT TACGACGGCC CGGCCATTAC CGAAGTCGCG TTGCTTGGCG
101 AAGAATATGC CGGTATGCGC CCCTCGATGA AAGTCAAGGA AGGCGATGCC
151 GTCAAAAAAG GCCAAGTGCT GTTTGAAGAC AAAAAGAATC CGGGCGTGGT
201 GTTTACTGCG CCGGCTTCAG GCAAAATCGC CGCGATTCAC CGTGGCGAAA
251 AGCGCGTACT TCAGTCAGTC GTGATTGCCG TTGAAGGCAA CGACGAAATC
301 GAGTTTGAAC GCTACGCACC TGAAGCGCTG GCAAACTTAA GCGGCGAAGA
351 AGTGCGCCGC AACCTGATCC AATCCGGTTT GTGGACTGCG CTGCGCACCC
401 GTCCGTTCAG CAAAATTCCT GCCGTCGATG CCGAGCCGTT CGCCATCTTC
451 GTCAATGCGA TGGACACCAA TCCGCTGGCT GCCGACCCTA CGGTCATTAT
501 CAAAGAAGCC GCCGAGGATT TCAAACGCGG CCTGTTGGTA TTGAGCCGTT
551 TGACCGAACG CAAAATCCAT GTTTGTAAGG CAGCTGGCGC AGACGTGCCG
601 TCTGAAAATG CTGCCAACAT CGAAACACAT GAATTCGGCG GCCCGCATCC
651 TGCCGGTTTG AGTGGCACGC ACATTCATTT CATCGAGCCG GTCGGCGCGA
701 ATAAAACCGT GTGGACCATC AATTATCAAG ATGTAATTAC CATTGGCCGT
751 TTGTTTGCAA CAGGCCGTCT GAACACCGAG CGCGTGATTG CCCTAGGTGG
801 TTCTCAAGTC AACAAACCGC GCCTCTTGCG TACCGTTTTG GGTGCGAAAG
851 TATCGCAAAT TACTGCGGGC GAATTGGTTG ACACAGACAA CCGCGTGATT
901 TCCGGTTCGG TATTGAACGG CGCGATTACA CAAGGCGCGC ACGATTATTT
951 GGGACGCTAC CACAATCAGA TTTCCGTTAT CGAAGAAGGC CGCAGCAAAG
1001 AGCTGTTCGG CTGGGTTGCG CCGCAGCCGG ACAAATACTC CATCACGCGT
1051 ACAACCCTCG GCCATTTCCT GAAAAACAAA CTCTTCAAGT TCAACACAGC
1101 CGTCAACGGC GGCGACCGCG CCATGGTGCC GATTGGTACT TACGAGCGCG
1151 TGATGCCCTT GGATATCCTG CCCACCCTGC TTTTGCGCGA TTTAATCGTC
1201 GGCGATACCG ACAGCGCGCA GGCATTGGGT TGCTTGGAAT TGGACGAAGA
1251 AGACCTCGCT TTGTGCAGCT TCGTCTGCCC GGGCAAATAC GAATACGGCC
1301 CGCTGTTGCG CAAAGTGCTG GAAACCATTG AGAAGGAAGG CTGA
This corresponds to the amino acid sequence <SEQ ID 126; ORF22-1>:
1 MIKIKKGLNL PIAGRPEQAV YDGPAITEVA LLGEEYAGMR PSMKVKEGDA
51 VKKGQVLFED KKNPGVVFTA PASGKIAAIH RGEKRVLQSV VIAVEGNDEI
101 EFERYAPEAL ANLSGEEVRR NLIQSGLWTA LRTRPFSKIP AVDAEPFAIF
151 VNAMDTNPLA ADPTVIIKEA AEDFKRGLLV LSRLTERKIH VCKAAGADVP
201 SENAANIETH EFGGPHPAGL SGTHIHFIEP VGANKTVWTI NYQDVITIGR
251 LFATGRLNTE RVIALGGSQV NKPRLLRTVL GAKVSQITAG ELVDTDNRVI
301 SGSVLNGAIT QGAHDYLGRY HNQISVIEEG RSKELFGWVA PQPDKYSITR
351 TTLGHFLKNK LFKFNTAVNG GDRAMVPIGT YERVMPLDIL PTLLLRDLIV
401 GDTDSAQALG CLELDEEDLA LCSFVCPGKY EYGPLLRKVL ETIEKEG*
Further work identified the corresponding gene in strain A of N. meningitidis <SEQ ID 127>:
1 ATGATTAAAA TCAAAAAAGG TCTAAACCTG CCCATCGCGG GCAGACCGGA
51 GCAAGTCATT TATGACGGGC CCGTCATTAC CGAAGTCGCG TTGCTTGGCG
101 AAGAATATGC CGGTATGCGC CCCTNGATGA AAGTCAAGGA AGGCGATGCC
151 GTCAAAAAAG GCCAAGTGCT GTTTGAAGAC AAAAAGNATC CGGGCGTGGT
201 GTTTACCGCG CCNGTTTCAG GCAAAATCGC CGCCATCCAT CGCGGCGAAA
251 AGCGCGTACT TCAGTCGGTC GTGATTGCCG TTGAAGGCAA CGACGAAATC
301 GAGTTCGAAC GCTACGCGCC CGAAGCGTTG GCAAACTTAA GCGGCGANGA
351 ANTNNGNNGC AATCTGATCC AATCCGGTTT GTGGACTGCG CTGCGTANCC
401 GTCCGTTCAG CAAAATCCCT GCCGTCGATG CCGAGCCGTT CGCCATCTTC
451 GTCAATGCGA TGGACACCAA TCCGCTNGCG GCAGACCCTG TGGTTGTGAT
501 CAAAGAAGCC GNCGANGATT TCAGACGANG TNTGCTGGTA TTGAGCCGTT
551 TGACCGAGCG TAAAATCCAT GTGTGTAAGG CAGCTGGCGC AGACGTGCCG
601 TCTGAAAATG CTGCCAACAT CGAAACACAT GAATTCGGCG GCCCGCATCC
651 GGCCGGTTTG AGTGGCACGC ACATTCATTT CATTGAGCCG GTCGGTGCAA
701 ACAAAACCGT TTGGACCATC AATTATCAAG ATGTAATTGC CATCGGACGT
751 TTGTTTGCAA CAGGCCGTCT GAACACCGAG CGCGTGATTG CTTTGGGTGG
801 TTCTCAAGTC AACAAACCAC GCCTCTTGCG TACCGTTTTG GGTGCGAAAG
851 TATCGCAAAT TACTGCGGGC GAATTGGTTG ACGCAGACAA CCGCGTGATT
901 TCCGGTTCGG TATTGAACGG CGCGATTACA CAAGGCGCGC ACGATTATTT
951 GGGACGCTAC CACAATCAGA TTTCCGTTAT CGAAGAAGGC CGCAGCAAAG
1001 AGCTGTTCGG CTGGGTTGCG CCGCAGCCGG ACAAATACTC CATCACGCGT
1051 ACGACCCTCG GCCATTTCCT GAAAAACAAA CTCTTCAAGT TCACGACAGC
1101 CGTCAACGGT GGCGACCGCG CCATGGTGCC GATTGGTACT TACGAGCGCG
1151 TAATGCCGCT AGACATCCTG CCTACCCTGC TTTTGCGCGA TTTAATCGTC
1201 GGCGATACCG ACAGCGCGCA AGCATTGGGT TGCTTGGAAT TGGACGAAGA
1251 AGACCTCGCT TTGTGCAGCT TCGTCTGCCC GGGCAAATAC GAATANGGCC
1301 CGCTGTTGCG TAAGGTGCTG GAAACCNTTG AGAAGGAAGG CTGA
This encodes a protein having the amino acid sequence <SEQ ID 128; ORF22a>:
1 MIKIKKGLNL PIAGRPEQVI YDGPVITEVA LLGEEYAGMR PXMKVKEGDA
51 VKKGQVLFED KKXPGVVFTA PVSGKIAAIH RGEKRVLQSV VIAVEGNDEI
101 EFERYAPEAL ANLSGXEXXX NLIQSGLWTA LRXRPFSKIP AVDAEPFAIF
151 VNAMDTNPLA ADPVVVIKEA XXDFRRXXLV LSRLTERKIH VCKAAGADVP
201 SENAANIETH EFGGPHPAGL SGTHIHFIEP VGANKTVWTI NYQDVIAIGR
251 LFATGRLNTE RVIALGGSQV NKPRLLRTVL GAKVSQITAG ELVDADNRVI
301 SGSVLNGAIT QGAHDYLGRY HNQISVIEEG RSKELFGWVA PQPDKYSITR
351 TTLGHFLKNK LFKFTTAVNG GDRAMVPIGT YERVMPLDIL PTLLLRDLIV
401 GDTDSAQALG CLELDEEDLA LCSFVCPGKY EXGPLLRKVL ETXEKEG*
The originally identified partial strain B sequence (ORF22) shows 94.2% identity over a 158 aa overlap with ORF22a:
10 20 30 40 50 60
orf22.pep MIKIKKGLNLPIAGRPEQAVYDGPAITEVALLGEEYAGMRPSMKVKEGDAVKKGQVLFED
||||||||||||||||||::||||:|||||||||||||||| ||||||||||||||||||
orf22a MIKIKKGLNLPIAGRPEQVIYDGPVITEVALLGEEYAGMRPXMKVKEGDAVKKGQVLFED
10 20 30 40 50 60
70 80 90 100 110 120
orf22.pep KKNPGVVFTAPASGKIAAIHRGEKRVLQSVVIAVEXNDEIEFERYAPEALANLSGEEVRR
|| ||||||||:||||||||||||||||||||||| ||||||||||||||||||| |
orf22a KKXPGVVFTAPVSGKIAAIHRGEKRVLQSVVIAVEGNDEIEFERYAPEALANLSGXEXXX
70 80 90 100 110 120
130 140 150
orf22.pep NLIQSGLWTALRTRPFSKIPAVDAEPFAIFVNAMDTNP
||||||||||||:|||||||||||||||||||||||||
orf22a NLIQSGLWTALRXRPFSKIPAVDAEPFAIFVNAMDTNPLAADPVVVIKEAXXDFRRXXLV
130 140 150 160 170 180
The complete strain B sequence (ORF22-1) and ORF22a show 94.9% identity in a 447 aa overlap:
10 20 30 40 50 60
orf22a.pep MIKIKKGLNLPIAGRPEQVIYDGPVITEVALLGEEYAGMRPXMKVKEGDAVKKGQVLFED
||||||||||||||||||::||||:|||||||||||||||| ||||||||||||||||||
orf22-1 MIKIKKGLNLPIAGRPEQAVYDGPAITEVALLGEEYAGMRPSMKVKEGDAVKKGQVLFED
10 20 30 40 50 60
70 80 90 100 110 120
orf22a.pep KKXPGVVFTAPVSGKIAAIHRGEKRVLQSVVIAVEGNDEIEFERYAPEALANLSGXEXXX
|| ||||||||:||||||||||||||||||||||||||||||||||||||||||| |
orf22-1 KKNPGVVFTAPASGKIAAIHRGEKRVLQSVVIAVEGNDEIEFERYAPEALANLSGEEVRR
70 80 90 100 110 120
130 140 150 160 170 180
orf22a.pep NLIQSGLWTALRXRPFSKIPAVDAEPFAIFVNAMDTNPLAADPVVVIKEAXXDFRRXXLV
||||||||||||:||||||||||||||||||||||||||||||:|:|||| ||:| ||
orf22-1 NLIQSGLWTALRTRPFSKIPAVDAEPFAIFVNAMDTNPLAADPTVIIKEAAEDFKRGLLV
130 140 150 160 170 180
190 200 210 220 230 240
orf22a.pep LSRLTERKIHVCKAAGADVPSENAANIETHEFGGPHPAGLSGTHIHFIEPVGANKTVWTI
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf22-1 LSRLTERKIHVCKAAGADVPSENAANIETHEFGGPHPAGLSGTHIHFIEPVGANKTVWTI
190 200 210 220 230 240
250 260 270 280 290 300
orf22a.pep NYQDVIAIGRLFATGRLNTERVIALGGSQVNKPRLLRTVLGAKVSQITAGELVDADNRVI
||||||:|||||||||||||||||||||||||||||||||||||||||||||||:|||||
orf22-1 NYQDVITIGRLFATGRLNTERVIALGGSQVNKPRLLRTVLGAKVSQITAGELVDTDNRVI
250 260 270 280 290 300
310 320 330 340 350 360
orf22a.pep SGSVLNGAITQGAHDYLGRYHNQISVIEEGRSKELFGWVAPQPDKYSITRTTLGHFLKNK
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf22-1 SGSVLNGAITQGAHDYLGRYHNQISVIEEGRSKELFGWVAPQPDKYSITRTTLGHFLKNK
310 320 330 340 350 360
370 380 390 400 410 420
orf22a.pep LFKFTTAVNGGDRAMVPIGTYERVMPLDILPTLLLRDLIVGDTDSAQALGCLELDEEDLA
||||:|||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf22-1 LFKFNTAVNGGDRAMVPIGTYERVMPLDILPTLLLRDLIVGDTDSAQALGCLELDEEDLA
370 380 390 400 410 420
430 440
orf22a.pep LCSFVCPGKYEXGPLLRKVLETXEKEGX
||||||||||| |||||||||| |||||
orf22-1 LCSFVCPGKYEYGPLLRKVLETIEKEGX
430 440
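The identity percentages quoted for these alignments can be approximated by counting identical positions over the overlap. The following Python sketch is illustrative only. It assumes two already-aligned strings of equal length, and it assumes that an undetermined residue (X) or a gap counts as a mismatch; the function name `percent_identity` is hypothetical, and the alignment program used in the source may apply different conventions.

```python
def percent_identity(a, b, unknown="X", gap="-"):
    """Percentage of aligned positions carrying the same known residue.

    Assumes the two strings are already aligned and of equal length.
    Positions holding `unknown` or `gap` never count as identical.
    """
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    if not a:
        raise ValueError("aligned sequences must not be empty")
    matches = sum(1 for x, y in zip(a, b)
                  if x == y and x not in (unknown, gap))
    return 100.0 * matches / len(a)


# Residues 11-30 of ORF22a and ORF22-1, copied from the listings above.
orf22a_fragment = "PIAGRPEQVIYDGPVITEVA"
orf22_1_fragment = "PIAGRPEQAVYDGPAITEVA"
print(round(percent_identity(orf22a_fragment, orf22_1_fragment), 1))
```

In this 20-residue window, 17 positions are identical, so the function returns 85.0.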
Further work identified a partial gene sequence <SEQ ID 129> from N. gonorrhoeae, which encodes the following amino acid sequence <SEQ ID 130; ORF22ng>:
1 MIKIKKGLNL PIAGRPEQVI YDGPAITEVA LLGEEYVGMR PSMKIKEGEA
51 VKKGQVLFED KKNPGVVFTA PASGKIAAIH RGEKRVLQSV VIAVEGNDEI
101 EFERYVPEAL AKLSSEKVRR NLIQSGLWTA LRTRPFSKIP AVDAEPFAIF
151 VNAMDTNPLA ADPTVIIKEA AEDFKRGLLV LSRLTERKIH VCKAAGADVP
201 SENAANIETH EFGGPHPAGL SGTHIHFIEP VGANKTVWTI NYQDVIAIGR
251 LFVTGRLNTE RVVALGGLQV NKPRLLRTVL GAKVSQLTAG ELVDADNRVI
301 SGSVLNGAIA QGAHDYLGRY HN*
Further work identified the complete gonococcal gene <SEQ ID 131>:
1 ATGATTAAAA TCAAAAAAGG TCTAAATCTG CCCATCGCGG GCAGACCGGA
51 GCAAGTCATT TATGACGGCC CGGCCATTAC CGAAGTCGCG TTGCTTGGCG
101 AAGAATATGT CGGCATGCGC CCCTCGATGA AAATCAAGGA AGGTGAAGCC
151 GTCAAAAAAG GCCAAGTGCT GTTTGAAGAC AAAAAGAATC CGGGCGTAGT
201 ATTTACTGCG CCGGCTTCAG GCAAAATCGC CGCTATTCAC CGTGGCGAAA
251 AGCGCGTACT TCAGTCAGTC GTGATTGCCG TTGAAGGCAA CGACGAAATC
301 GAGTTCGAAC GCTACGTACC TGAAGCGCTG GCAAAATTGA GCAGCGAAAA
351 AGTGCGCCGC AACCTGATTC AATCAGGCTT ATGGACTGCG CTTCGCACCC
401 GTCCGTTCAG CAAAATCCCT GCCGTAGATG CCGAGCCGTT CGCCATCTTC
451 GTCAATGCGA TGGACACCAA TCCGCTGGCT GCCGACCCTA CGGTCATCAT
501 CAAAGAAGCC GCCGAAGACT TCAAACGCGG CCTGTTGGTA TTGAGCCGCC
551 TGACCGAACG TAAAATCCAT GTGTGTAAAG CAGCAGGCGC AGACGTGCCG
601 TCTGAAAATG CTGCCAATAT CGAAACACAT GAATTTGGCG GCCCGCATCC
651 TGCCGGCTTG AGTGGCACGC ACATTCATTT CATCGAGCCA GTCGGCGCGA
701 ATAAAACCGT GTGGACCATC AATTATCAAG ACGTGATTGC TATCGGACGT
751 TTGTTCGTAA CAGGCCGTCT GAATACCGAG CGCGTGGTTG CCTTGGGCGG
801 CCTGCAAGTC AACAAACCGC GCCTCTTGCG TACCGTTTTG GGTGCGAAGG
851 TGTCTCAACT TACCGCCGGC GAATTGGTTG ACGCGGACAA CCGCGTGATT
901 TCCGGTTCGG TATTGAACGG TGCGATTGCA CAAGGCGCGC ATGATTATTT
951 GGGACGCTAC CACAATCAGA TTTCCGTTAT CGAAGAAGGC CGCAGCAAAG
1001 AGCTGTTCGG CTGGGTTGCG CCGCAGCCGG ACAAATACTC CATCACGCGC
1051 ACCACTCTCG GCCATTTCCT AAAAAACAAA CTCTTCAAGT TCACGACAGC
1101 CGTCAACGGC GGCGACCGCG CCATGGTACC GATCGGCACT TATGAGCGCG
1151 TAATGCCGTT GGACATCCTG CCTACCTTGC TTTTGCGCGA TTTAATCGTC
1201 GGCGATACCG ACAGCGCGCA GGCTTTGGGT TGCTTGGAAT TGGACGAAGA
1251 AGACCTCGCT TTGTGCAGCT TCGTCTGCCC GGGCAAATAC GAATACGGCC
1301 CGCTGTTGCG CAAAGTGCTG GAAACCATTG AGAAGGAAGG CTGA
This encodes a protein having the amino acid sequence <SEQ ID 132; ORF22ng-1>:
1 MIKIKKGLNL PIAGRPEQVI YDGPAITEVA LLGEEYVGMR PSMKIKEGEA
51 VKKGQVLFED KKNPGVVFTA PASGKIAAIH RGEKRVLQSV VIAVEGNDEI
101 EFERYVPEAL AKLSSEKVRR NLIQSGLWTA LRTRPFSKIP AVDAEPFAIF
151 VNAMDTNPLA ADPTVIIKEA AEDFKRGLLV LSRLTERKIH VCKAAGADVP
201 SENAANIETH EFGGPHPAGL SGTHIHFIEP VGANKTVWTI NYQDVIAIGR
251 LFVTGRLNTE RVVALGGLQV NKPRLLRTVL GAKVSQLTAG ELVDADNRVI
301 SGSVLNGAIA QGAHDYLGRY HNQISVIEEG RSKELFGWVA PQPDKYSITR
351 TTLGHFLKNK LFKFTTAVNG GDRAMVPIGT YERVMPLDIL PTLLLRDLIV
401 GDTDSAQALG CLELDEEDLA LCSFVCPGKY EYGPLLRKVL ETIEKEG*
The originally identified partial strain B sequence (ORF22) shows 93.7% identity over a 158 aa overlap with ORF22ng:
orf22.pep MIKIKKGLNLPIAGRPEQAVYDGPAITEVALLGEEYAGMRPSMKVKEGDAVKKGQVLFED 60
||||||||||||||||||::||||||||||||||||:|||||||:|||:|||||||||||
orf22ng MIKIKKGLNLPIAGRPEQVIYDGPAITEVALLGEEYVGMRPSMKIKEGEAVKKGQVLFED 60
orf22.pep KKNPGVVFTAPASGKIAAIHRGEKRVLQSVVIAVEXNDEIEFERYAPEALANLSGEEVRR 120
||||||||||||||||||||||||||||||||||| |||||||||:|||||:||:|:|||
orf22ng KKNPGVVFTAPASGKIAAIHRGEKRVLQSVVIAVEGNDEIEFERYVPEALAKLSSEKVRR 120
orf22.pep NLIQSGLWTALRTRPFSKIPAVDAEPFAIFVNAMDTNP 158
||||||||||||||||||||||||||||||||||||||
orf22ng NLIQSGLWTALRTRPFSKIPAVDAEPFAIFVNAMDTNPLAADPTVIIKEAAEDFKRGLLV 180
The complete sequences from strain B (ORF22-1) and gonococcus (ORF22ng-1) show 96.2% identity in a 447 aa overlap:
10 20 30 40 50 60
orf22-1.pep MIKIKKGLNLPIAGRPEQAVYDGPAITEVALLGEEYAGMRPSMKVKEGDAVKKGQVLFED
||||||||||||||||||::||||||||||||||||:|||||||:|||:|||||||||||
orf22ng-1 MIKIKKGLNLPIAGRPEQVIYDGPAITEVALLGEEYVGMRPSMKIKEGEAVKKGQVLFED
10 20 30 40 50 60
70 80 90 100 110 120
orf22-1.pep KKNPGVVFTAPASGKIAAIHRGEKRVLQSVVIAVEGNDEIEFERYAPEALANLSGEEVRR
|||||||||||||||||||||||||||||||||||||||||||||:|||||:||:|:|||
orf22ng-1 KKNPGVVFTAPASGKIAAIHRGEKRVLQSVVIAVEGNDEIEFERYVPEALAKLSSEKVRR
70 80 90 100 110 120
130 140 150 160 170 180
orf22-1.pep NLIQSGLWTALRTRPFSKIPAVDAEPFAIFVNAMDTNPLAADPTVIIKEAAEDFKRGLLV
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf22ng-1 NLIQSGLWTALRTRPFSKIPAVDAEPFAIFVNAMDTNPLAADPTVIIKEAAEDFKRGLLV
130 140 150 160 170 180
190 200 210 220 230 240
orf22-1.pep LSRLTERKIHVCKAAGADVPSENAANIETHEFGGPHPAGLSGTHIHFIEPVGANKTVWTI
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf22ng-1 LSRLTERKIHVCKAAGADVPSENAANIETHEFGGPHPAGLSGTHIHFIEPVGANKTVWTI
190 200 210 220 230 240
250 260 270 280 290 300
orf22-1.pep NYQDVITIGRLFATGRLNTERVIALGGSQVNKPRLLRTVLGAKVSQITAGELVDTDNRVI
||||||:|||||:|||||||||:|||| ||||||||||||||||||:|||||||:|||||
orf22ng-1 NYQDVIAIGRLFVTGRLNTERVVALGGLQVNKPRLLRTVLGAKVSQLTAGELVDADNRVI
250 260 270 280 290 300
310 320 330 340 350 360
orf22-1.pep SGSVLNGAITQGAHDYLGRYHNQISVIEEGRSKELFGWVAPQPDKYSITRTTLGHFLKNK
|||||||||:||||||||||||||||||||||||||||||||||||||||||||||||||
orf22ng-1 SGSVLNGAIAQGAHDYLGRYHNQISVIEEGRSKELFGWVAPQPDKYSITRTTLGHFLKNK
310 320 330 340 350 360
370 380 390 400 410 420
orf22-1.pep LFKFNTAVNGGDRAMVPIGTYERVMPLDILPTLLLRDLIVGDTDSAQALGCLELDEEDLA
||||:|||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf22ng-1 LFKFTTAVNGGDRAMVPIGTYERVMPLDILPTLLLRDLIVGDTDSAQALGCLELDEEDLA
370 380 390 400 410 420
430 440
orf22-1.pep LCSFVCPGKYEYGPLLRKVLETIEKEGX
||||||||||||||||||||||||||||
orf22ng-1 LCSFVCPGKYEYGPLLRKVLETIEKEGX
430 440
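Computer analysis of these sequences gave the following results:
Homology with the 48 kDa outer membrane protein of Actinobacillus pleuropneumoniae (accession number U24492)
ORF22 and this 48 kDa protein show 72% aa identity in a 158 aa overlap: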
Orf22 1 MIKIKKGLNLPIAGRPEQAVYDGPAITEVALLGEEYAGMRPSMKVKEGDAVKKGQVLFED 60
MI IKKGL+LPIAG P Q+++G + EVA+LGEEY GMRPSMKV+EGD VKKGQVLFED
48kDa 1 MITIKKGLDLPIAGTPAQVIHNGNTVNEVAMLGEEYVGMRPSMKVREGDVVKKGQVLFED 60
orf22 61 KKNPGVVFTAPASGKIAAIHRGEKRVLQSVVIAVEXNDEIEFERYAPEALANLSGEEVRR 120
KKNPGVVFTAPASG + I+RGEKRVLQSVVI VE +++I F RY LA+LS E+V++
48kDa 61 KKNPGVVFTAPASGTVVTINRGEKRVLQSVVIKVEGDEQITFTRYEAAQLASLSAEQVKQ 120
orf22 121 NLIQSGLWTALRTRPFSKIPAVDAEPFAIFVNAMDTNP 158
NLI+SGLWTA RTRPFSK+PA+DA P +IFVNAMDTNP
48kDa 121 NLIESGLWTAFRTRPFSKVPALDAIPSSIFVNAMDTNP 158
ORF22a also shows homology to the 48 kDa Actinobacillus pleuropneumoniae protein:
gi|1185395 (U24492) 48 kDa outer membrane protein [Actinobacillus pleuropneumoniae] Length = 449
Score = 530 bits (1351), Expect = e-150
Identities = 274/450 (60%), Positives = 323/450 (70%), Gaps = 4/450 (0%)
Query: 1 MIKIKKGLNLPIAGRPEQVIYDGPVITEVALLGEEYAGMRPXMKVKEGDAVKKGQVLFED 60
MI IKKGL+LPIAG P QVI++G + EVA+LGEEY GMRP MKV+EGD VKKGQVLFED
Sbjct: 1 MITIKKGLDLPIAGTPAQVIHNGNTVNEVAMLGEEYVGMRPSMKVREGDVVKKGQVLFED 60
Query: 61 KKXPGVVFTAPVSGKIAAIHRGEKRVLQSVVIAVEGNDEIEFERYAPEALANLSGXEXXX 120
KK PGVVFTAP SG + I+RGEKRVLQSVVI VEG+++I F RY LA+LS +
Sbjct: 61 KKNPGVVFTAPASGTVVTINRGEKRVLQSVVIKVEGDEQITFTRYEAAQLASLSAEQVKQ 120
Query: 121 NLIQSGLWTALRXRPFSKIPAVDAEPFAIFVNAMDTNPLAADPVVVIKEAXXDFRRXXLV 180
NLI+SGLWTA R RPFSK+PA+DA P +IFVNAMDTNPLAADP VV+KE DF+ V
Sbjct: 121 NLIESGLWTAFRTRPFSKVPALDAIPSSIFVNAMDTNPLAADPEVVLKEYETDFKDGLTV 180
Query: 181 LSRL--TERKIHVCKAAGADVP-SENAANIETHEFGGPHPAGLSGTHIHFIEPVGANKTV 237
L+RL ++ +++CK A +++P S I F G HPAGL GTHIHF++PVGA K V
Sbjct: 181 LTRLFNGQKPVYLCKDADSNIPLSPAIEGITIKSFSGVHPAGLVGTHIHFVDPVGATKQV 240
Query: 238 WTINYQDVIAIGRLFATGRLNTERVIALGGSQVNKPRLLRTVLGAKVSQITAGELVDADN 297
W +NYQDVIAIG+LF TG L T+R+I+L G QV PRL+RT LGA+SQ+TA EL +N
Sbjct: 241 WHLNYQDVIAIGKLFTTGELFTDRIISLAGPQVKNPRLVRTRLGANLSQLTANELNAGEN 300
Query: 298 RVISGSVLNGAITQGAHDYLGRYHNQISVIEEGRSKELFGWVAPQPDKYSITRTTLGHFL 357
RVISGSVL+GA G DYLGRY Q+SV+ EGR KELFGW+ P DK+SITRT LGHF
Sbjct: 301 RVISGSVLSGATAAGPVDYLGRYALQVSVLAEGREKELFGWIMPGSDKFSITRTVLGHFG 360
Query: 358 KNKLFKFTTAVNGGDRAMVPIGTYERVMXXXXXXXXXXXXXXVGDTDSAQXXXXXXXXXX 417
K KLF FTTAV+GG+RAMVPIG YERVM GDTDSAQ
Sbjct: 361 K-KLFNFTTAVHGGERAMVPIGAYERVMPLDIIPTLLLRDLAAGDTDSAQNLGCLELDEE 419
Query: 418 XXXXXSFVCPGKYEXGPLLRKVLETXEKEG 447
++VCPGK GP+LR LE EKEG
ORF22ng-1 also shows homology with the OMP from Actinobacillus pleuropneumoniae:
gi|1185395 (U24492) 48 kDa outer membrane protein [Actinobacillus pleuropneumoniae] Length = 449
Score = 555 bits (1414), Expect = e-157
Identities = 284/450 (63%), Positives = 337/450 (74%), Gaps = 4/450 (0%)
Query: 27 MIKIKKGLNLPIAGRPEQVIYDGPAITEVALLGEEYVGMRPSMKIKEGEAVKKGQVLFED 86
MI IKKGL+LPIAG P QVI++G + EVA+LGEEYVGMRPSMK++EG+ VKKGQVLFED
Sbjct: 1 MITIKKGLDLPIAGTPAQVIHNGNTVNEVAMLGEEYVGMRPSMKVREGDVVKKGQVLFED 60
Query: 87 KKNPGVVFTAPASGKIAAIHRGEKRVLQSVVIAVEGNDEIEFERYVPEALAKLSSEKVRR 146
KKNPGVVFTAPASG + I+RGEKRVLQSVVI VEG+++I F RY LA LS+E+V++
Sbjct: 61 KKNPGVVFTAPASGTVVTINRGEKRVLQSVVIKVEGDEQITFTRYEAAQLASLSAEQVKQ 120
Query: 147 NLIQSGLWTALRTRPFSKIPAVDAEPFAIFVNAMDTNPLAADPTVIIKEAAEDFKRGLLV 206
NLI+SGLWTA RTRPFSK+PA+DA P +IFVNAMDTNPLAADP V++KE DFK GL V
Sbjct: 121 NLIESGLWTAFRTRPFSKVPALDAIPSSIFVNAMDTNPLAADPEVVLKEYETDFKDGLTV 180
Query: 207 LSRL--TERKIHVCKAAGADVP-SENAANIETHEFGGPHPAGLSGTHIHFIEPVGANKTV 263
L+RL ++ +++CK A +++P S I F G HPAGL GTHIHF++PVGA K V
Sbjct: 181 LTRLFNGQKPVYLCKDADSNIPLSPAIEGITIKSFSGVHPAGLVGTHIHFVDPVGATKQV 240
Query: 264 WTINYQDVIAIGRLFVTGRLNTERVVALGGLQVNKPRLLRTVLGAKVSQLTAGELVDADN 323
W+NYQDVIAIG+LF TG L T+R+++L G QV PRL+RT LGA +SQLTA EL +N
Sbjct: 241 WHLNYQDVIAIGKLFTTGELFTDRIISLAGPQVKNPRLVRTRLGANLSQLTANELNAGEN 300
Query: 324 RVISGSVLNGAIAQGAHDYLGRYHNQISVIEEGRSKELFGWVAPQPDKYSITRTTLGHFL 383
RVISGSVL+GA A G DYLGRY Q+SV+EGR KELFGW+P DK+SITRT LGHF
Sbjct: 301 RVISGSVLSGATAAGPVDYLGRYALQVSVLAEGREKELFGWIMPGSDKFSITRTVLGHFG 360
Query: 384 KNKLFKFTTAVNGGDRAMVPIGTYERVMXXXXXXXXXXXXXXVGDTDSAQXXXXXXXXXX 443
K KLF FTTAV+GG+RAMVPIG YERVM GDTDSAQ
Sbjct: 361 K-KLFNFTTAVHGGERAMVPIGAYERVMPLDIIPTLLLRDLAAGDTDSAQNLGCLELDEE 419
Query: 444 XXXXXSFVCPGKYEYGPLLRKVLETIEKEG 473
++VCPGK YGP+LR LE IEKEG
Sbjct: 420 DLALCTYVCPGKNNYGPMLRAALEKIEKEG 449
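Based on this analysis, including the homology with the outer membrane protein of Actinobacillus pleuropneumoniae, it is predicted that these proteins from N. meningitidis and N. gonorrhoeae, and their epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies.
ORF22-1 (35.4 kDa) was cloned in pET and pGEX vectors and expressed in E. coli, as described above. The products of protein expression and purification were analyzed by SDS-PAGE. Figure 5A shows the results of affinity purification of the GST-fusion protein, and Figure 5B shows the results of expression of the His-fusion in E. coli. Purified GST-fusion protein was used to immunise mice, and the mouse sera were used for ELISA (positive result) and FACS analysis (Figure 5C). These results confirm that ORF22-1 is a surface-exposed protein and a useful immunogen.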
Example 16
The following partial DNA sequence <SEQ ID 133> was identified in N. meningitidis:
1 ..GCGnCGnAAA TCATCCATCC CC..nACGTC GTAGGCCCTG AAGCCAACTG
51 GTTTTTTATG GTAGCCAGTA CGTTTGTGAT TGCTTTGATT GGTTATTTTG
101 TTACTGAAAA AATCGTCGAA CCGCAATTGG GCCCTTATCA ATCAGATTTG
151 TCACAAGAAG AAAAAGACAT TCGGCATTCC AATGAAATCA CGCCTTTGGA
201 ATATAAAGGA TTAATTTGGG CTGGCGTGGT GTTTGTTGCC TTATCCGCCC
251 TATTGGCTTG GAGCATCGTC CCTGCCGACG GTATTTTGCG TCATCCTGAA
301 ACAGGATTGG TTTCCGGTTC GCCGTTTTTA AAATCGATTG TTGTTTTTAT
351 TTTCTTGTTG TTTGCACTGC CGGGCATTGT TTATGGCCGG GTAACCCGAA
401 GTTTGCGCGG CGAACAGGAA GTCGTTAATG CGmyGGCCGA ATCGATGAGT
451 ACTCTGGsGC TTTmTTTGsw CAkcATCTTT TTTGCCGCAC AGTTTGTCGC
501 ATTTTTTAAT TGGACGAATA TTGGGCAATA TATTGCCGTT AAAGGGGCGA
551 CGTTCTTAAA AGAAGTCGGC TTGGGCGGCA GCGTGTTGTT TATCGGTTTT
601 ATTTTAATTT GTGCTTTTAT CAATCTGATG ATAGGCTCCG CCTCCGCGCA
651 ATGGGCGGTA ACTGCGCCGA TTTTCGTCCC TATGCTGATG TTGGCCGGCT
701 ACGCGCCCGA AGTCATTCAA GCCGCTTACC GCATCGGTGA TTCCGTTACC
751 AATATTATTA CGCCGATGAT GAGTTATTTC GGGCTGATTA TGGCGACGGT
801 GrkCmmmTAC AAAAAAGATG CGGGCGTGGG TaCGcTGATT wCTATGATGT
851 TGCCGTATTC CGCTTTCTTC TTGATTGCgT GGATTGCCTT ATTCTGCATT
901 TGGGTATTTg TTTTGGGCCT GCCCGTCGGT CCCGGCGCGC CCACATTCTA
951 TCCCGCACCT TAA
This corresponds to the amino acid sequence <SEQ ID 134; ORF12>:
1 ..AXXIIHPXXV VGPEANWFFM VASTFVIALI GYFVTEKIVE PQLGPYQSDL
51 SQEEKDIRHS NEITPLEYKG LIWAGVVFVA LSALLAWSIV PADGILRHPE
101 TGLVSGSPFL KSIVVFIFLL FALPGIVYGR VTRSLRGEQE VVNAXAESMS
151 TLXLXLXXIF FAAQFVAFFN WTNIGQYIAV KGATFLKEVG LGGSVLFIGF
201 ILICAFINLM IGSASAQWAV TAPIFVPMLM LAGYAPEVIQ AAYRIGDSVT
251 NIITPMMSYF GLIMATVXXY KKDAGVGTLI XMMLPYSAFF LIAWIALFCI
301 WVFVLGLPVG PGAPTFYPAP*
Further sequence analysis revealed the complete DNA sequence <SEQ ID 135>:
1 ATGAGTCAAA CCGATACGCA ACGGGACGGA CGATTTTTAC GCACAGTCGA
51 ATGGCTGGGC AATATGTTGC CGCATCCGGT TACGCTTTTT ATTATTTTCA
101 TTGTGTTATT GCTGATTGCC TCTGCCGTCG GTGCGTATTT CGGACTATCC
151 GTCCCCGATC CGCGCCCTGT TGGTGCGAAA GGACGTGCCG ATGACGGTTT
201 GATTTACATT GTCAGCCTGC TCAATGCCGA CGGTTTTATC AAAATCCTGA
251 CGCATACCGT TAAAAATTTC ACCGGTTTCG CGCCGTTGGG AACGGTGTTG
301 GTTTCTTTAT TGGGCGTGGG GATTGCGGAA AAATCGGGCT TGATTTCCGC
351 ATTAATGCGC TTATTGCTCA CAAAATCGCC ACGCAAACTC ACTACTTTTA
401 TGGTTGTTTT TACAGGGATT TTATCTAATA CCGCTTCTGA ATTGGGCTAT
451 GTCGTCCTAA TCCCTTTGTC CGCCATCATC TTTCATTCCC TCGGCCGCCA
501 TCCGCTTGCC GGTCTGGCTG CGGCTTTCGC CGGCGTTTCG GGCGGTTATT
551 CGGCCAATCT GTTCTTAGGC ACAATCGATC CGCTCTTGGC AGGCATCACC
601 CAACAGGCGG CGCAAATCAT CCATCCCGAC TACGTCGTAG GCCCTGAAGC
651 CAACTGGTTT TTTATGGTAG CCAGTACGTT TGTGATTGCT TTGATTGGTT
701 ATTTTGTTAC TGAAAAAATC GTCGAACCGC AATTGGGCCC TTATCAATCA
751 GATTTGTCAC AAGAAGAAAA AGACATTCGG CATTCCAATG AAATCACGCC
801 TTTGGAATAT AAAGGATTAA TTTGGGCTGG CGTGGTGTTT GTTGCCTTAT
851 CCGCCCTATT GGCTTGGAGC ATCGTCCCTG CCGACGGTAT TTTGCGTCAT
901 CCTGAAACAG GATTGGTTTC CGGTTCGCCG TTTTTAAAAT CGATTGTTGT
951 TTTTATTTTC TTGTTGTTTG CACTGCCGGG CATTGTTTAT GGCCGGGTAA
1001 CCCGAAGTTT GCGCGGCGAA CAGGAAGTCG TTAATGCGAT GGCCGAATCG
1051 ATGAGTACTC TGGGGCTTTA TTTGGTCATC ATCTTTTTTG CCGCACAGTT
1101 TGTCGCATTT TTTAATTGGA CGAATATTGG GCAATATATT GCCGTTAAAG
1151 GGGCGACGTT CTTAAAAGAA GTCGGCTTGG GCGGCAGCGT GTTGTTTATC
1201 GGTTTTATTT TAATTTGTGC TTTTATCAAT CTGATGATAG GCTCCGCCTC
1251 CGCGCAATGG GCGGTAACTG CGCCGATTTT CGTCCCTATG CTGATGTTGG
1301 CCGGCTACGC GCCCGAAGTC ATTCAAGCCG CTTACCGCAT CGGTGATTCC
1351 GTTACCAATA TTATTACGCC GATGATGAGT TATTTCGGGC TGATTATGGC
1401 GACGGTGATC AAATACAAAA AAGATGCGGG CGTGGGTACG CTGATTTCTA
1451 TGATGTTGCC GTATTCCGCT TTCTTCTTGA TTGCGTGGAT TGCCTTATTC
1501 TGCATTTGGG TATTTGTTTT GGGCCTGCCC GTCGGTCCCG GCGCGCCCAC
1551 ATTCTATCCC GCACCTTAA
This corresponds to the amino acid sequence <SEQ ID 136; ORF12-1>:
1 MSQTDTQRDG RFLRTVEWLG NMLPHPVTLF IIFIVLLLIA SAVGAYFGLS
51 VPDPRPVGAK GRADDGLIYI VSLLNADGFI KILTHTVKNF TGFAPLGTVL
101 VSLLGVGIAE KSGLISALMR LLLTKSPRKL TTFMVVFTGI LSNTASELGY
151 VVLIPLSAII FHSLGRHPLA GLAAAFAGVS GGYSANLFLG TIDPLLAGIT
201 QQAAQIIHPD YVVGPEANWF FMVASTFVIA LIGYFVTEKI VEPQLGPYQS
251 DLSQEEKDIR HSNEITPLEY KGLIWAGVVF VALSALLAWS IVPADGILRH
301 PETGLVSGSP FLKSIVVFIF LLFALPGIVY GRVTRSLRGE QEVVNAMAES
351 MSTLGLYLVI IFFAAQFVAF FNWTNIGQYI AVKGATFLKE VGLGGSVLFI
401 GFILICAFIN LMIGSASAQW AVTAPIFVPM LMLAGYAPEV IQAAYRIGDS
451 VTNIITPMMS YFGLIMATVI KYKKDAGVGT LISMMLPYSA FFLIAWIALF
501 CIWVFVLGLP VGPGAPTFYP AP*
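Computer analysis of this amino acid sequence gave the following results:
Homology with a predicted ORF from N. meningitidis (strain A)
ORF12 shows 96.3% identity over a 320 aa overlap with an ORF (ORF12a) from strain A of N. meningitidis: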
10 20 30
orf12.pep AXXIIHPXXVVGPEANWFFMVASTFVIALI
| |||| |||||||||||||||||||||
orf12a AAAFAGVSGGYSANLFLGTIDPLLAGITQQAAQIIHPDYVVGPEANWFFMVASTFVIALI
180 190 200 210 220 230
40 50 60 70 80 90
orf12.pep GYFVTEKIVEPQLGPYQSDLSQEEKDIRHSNEITPLEYKGLIWAGVVFVALSALLAWSIV
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf12a GYFVTEKIVEPQLGPYQSDLSQEEKDIRHSNEITPLEYKGLIWAGVVFVALSALLAWSIV
240 250 260 270 280 290
100 110 120 130 140 150
orf12.pep PADGILRHPETGLVSGSPFLKSIVVFIFLLFALPGIVYGRVTRSLRGEQEVVNAXAESMS
|||||||||||||||||||||||||||||||||||||||||||||||||||||| |||||
orf12a PADGILRHPETGLVSGSPFLKSIVVFIFLLFALPGIVYGRVTRSLRGEQEVVNAMAESMS
300 310 320 330 340 350
160 170 180 190 200 210
orf12.pep TLXLXLXXIFFAAQFVAFFNWTNIGQYIAVKGATFLKEVGLGGSVLFIGFILICAFINLM
|| | | ||||||||||||||||||||||||||||||||||||||||||||||||||||
orf12a TLGLYLVIIFFAAQFVAFFNWTNIGQYIAVKGATFLKEVGLGGSVLFIGFILICAFINLM
360 370 380 390 400 410
220 230 240 250 260 270
orf12.pep IGSASAQWAVTAPIFVPMLMLAGYAPEVIQAAYRIGDSVTNIITPMMSYFGLIMATVXXY
||||||||||||||||||||||||||||||||||||||||||||||||||||||||| |
orf12a IGSASAQWAVTAPIFVPMLMLAGYAPEVIQAAYRIGDSVTNIITPMMSYFGLIMATVIKY
420 430 440 450 460 470
280 290 300 310 320
orf12.pep KKDAGVGTLIXMMLPYSAFFLIAWIALFCIWVFVLGLPVGPGAPTFYPAPX
|||||||||| ||||||||||||||||||||||||||||||||||||||||
orf12a KKDAGVGTLISMMLPYSAFFLIAWIALFCIWVFVLGLPVGPGAPTFYPAPX
480 490 500 510 520
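The alignment above places the partial ORF12 sequence within a longer ORF, with each X in the partial sequence lying opposite a determined residue. The following Python sketch shows one way such a fragment could be located; it is illustrative only and is not the alignment program used for the analysis. It assumes an ungapped match in which X acts as a wildcard, and the function name `find_fragment` is hypothetical.

```python
def find_fragment(full, fragment, wildcard="X"):
    """Return the first 0-based offset where `fragment` matches `full`.

    A `wildcard` character in `fragment` matches any residue.
    Returns -1 when no ungapped match exists.
    """
    n, m = len(full), len(fragment)
    for start in range(n - m + 1):
        window = full[start:start + m]
        if all(f == wildcard or f == c for f, c in zip(fragment, window)):
            return start
    return -1


# Residues 201-220 of ORF12a, and residues 4-17 of the partial ORF12 sequence.
orf12a_window = "QQAAQIIHPDYVVGPEANWF"
orf12_fragment = "IIHPXXVVGPEANW"
print(find_fragment(orf12a_window, orf12_fragment))
```

The fragment matches at offset 5 of the window, which corresponds to residue 206 of ORF12a.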
The full-length ORF12a nucleotide sequence <SEQ ID 137> is:
1 ATGAGTCAAA CCGATACGCA ACGGGACGGA CGATTTTTAC GCACAGTCGA
51 ATGGCTGGGC AATATGTTGC CGCACCCGGT TACGCTTTTT ATTATTTTCA
101 TTGTGTTATT GCTGATTGCC TCTGCCGCCG GTGCGTATTT CGGACTATCC
151 GTCCCCGATC CGCGCCCTGT TGGTGCGAAA GGACGTGCCG ATGACGGTTT
201 GATTCACGTT GTCAGCCTGC TCGATGCTGA CGGTTTGATC AAAATCCTGA
251 CGCATACCGT TAAAAATTTC ACCGGTTTCG CGCCGTTGGG AACGGTGTTG
301 GTTTCTTTAT TGGGCGTGGG GATTGCGGAA AAATCGGGCT TGATTTCCGC
351 ATTAATGCGC TTATTGCTCA CAAAATCTCC ACGCAAACTC ACTACTTTTA
401 TGGTTGTTTT TACAGGGATT TTATCTAATA CCGCTTCTGA ATTGGGCTAT
451 GTCGTCCTAA TCCCTTTGTC CGCCATCATC TTTCATTCCC TCGGCCGCCA
501 TCCGCTTGCC GGTCTGGCTG CGGCTTTCGC CGGCGTTTCG GGCGGTTATT
551 CGGCCAATCT GTTCTTAGGC ACAATCGATC CGCTCTTGGC AGGCATCACC
601 CAACAGGCGG CGCAAATCAT CCATCCCGAC TACGTCGTAG GCCCTGAAGC
651 CAACTGGTTT TTTATGGTAG CCAGTACGTT TGTGATTGCT TTGATTGGTT
701 ATTTTGTTAC TGAAAAAATC GTCGAACCGC AATTGGGCCC TTATCAATCA
751 GATTTGTCAC AAGAAGAAAA AGACATTCGA CATTCCAATG AAATCACGCC
801 TTTGGAATAT AAAGGATTAA TTTGGGCTGG CGTGGTGTTT GTTGCCTTAT
851 CCGCCCTATT GGCTTGGAGC ATCGTCCCTG CCGACGGTAT TTTGCGTCAT
901 CCTGAAACAG GATTGGTTTC CGGTTCGCCG TTTTTAAAAT CAATTGTTGT
951 TTTTATTTTC TTGTTGTTTG CACTGCCGGG CATTGTTTAT GGCCGGGTAA
1001 CCCGAAGTTT GCGCGGCGAA CAGGAAGTCG TTAATGCGAT GGCCGAATCG
1051 ATGAGTACTC TGGGGCTTTA TTTGGTCATC ATCTTTTTTG CCGCACAGTT
1101 TGTCGCATTT TTTAATTGGA CGAATATTGG GCAATATATT GCCGTTAAAG
1151 GGGCGACGTT CTTAAAAGAA GTCGGCTTGG GCGGCAGCGT GTTGTTTATC
1201 GGTTTTATTT TAATTTGTGC TTTTATCAAT CTGATGATAG GCTCCGCCTC
1251 CGCGCAATGG GCGGTAACTG CGCCGATTTT CGTCCCTATG CTGATGTTGG
1301 CCGGCTACGC GCCCGAAGTC ATTCAAGCCG CTTACCGCAT CGGTGATTCC
1351 GTTACCAATA TTATTACGCC GATGATGAGT TATTTCGGGC TGATTATGGC
1401 GACGGTGATC AAATACAAAA AAGATGCGGG CGTGGGTACG CTGATTTCTA
1451 TGATGTTGCC GTATTCCGCT TTCTTCTTGA TTGCGTGGAT TGCCTTATTC
1501 TGCATTTGGG TATTTGTTTT GGGCCTGCCC GTCGGTCCCG GCGCGCCCAC
1551 ATTCTATCCC GCACCTTAA
This encodes a protein having the amino acid sequence <SEQ ID 138>:
1 MSQTDTQRDG RFLRTVEWLG NMLPHPVTLF IIFIVLLLIA SAAGAYFGLS
51 VPDPRPVGAK GRADDGLIHV VSLLDADGLI KILTHTVKNF TGFAPLGTVL
101 VSLLGVGIAE KSGLISALMR LLLTKSPRKL TTFMVVFTGI LSNTASELGY
151 VVLIPLSAII FHSLGRHPLA GLAAAFAGVS GGYSANLFLG TIDPLLAGIT
201 QQAAQIIHPD YVVGPEANWF FMVASTFVIA LIGYFVTEKI VEPQLGPYQS
251 DLSQEEKDIR HSNEITPLEY KGLIWAGVVF VALSALLAWS IVPADGILRH
301 PETGLVSGSP FLKSIVVFIF LLFALPGIVY GRVTRSLRGE QEVVNAMAES
351 MSTLGLYLVI IFFAAQFVAF FNWTNIGQYI AVKGATFLKE VGLGGSVLFI
401 GFILICAFIN LMIGSASAQW AVTAPIFVPM LMLAGYAPEV IQAAYRIGDS
451 VTNIITPMMS YFGLIMATVI KYKKDAGVGT LISMMLPYSA FFLIAWIALF
501 CIWVFVLGLP VGPGAPTFYP AP*
ORF12a and ORF12-1 show 99.0% identity in a 522 aa overlap:
10 20 30 40 50 60
orf12a.pep MSQTDTQRDGRFLRTVEWLGNMLPHPVTLFIIFIVLLLIASAAGAYFGLSVPDPRPVGAK
||||||||||||||||||||||||||||||||||||||||||:|||||||||||||||||
orf12-1 MSQTDTQRDGRFLRTVEWLGNMLPHPVTLFIIFIVLLLIASAVGAYFGLSVPDPRPVGAK
10 20 30 40 50 60
70 80 90 100 110 120
orf12a.pep GRADDGLIHVVSLLDADGLIKILTHTVKNFTGFAPLGTVLVSLLGVGIAEKSGLISALMR
||||||||::||||:|||:|||||||||||||||||||||||||||||||||||||||||
orf12-1 GRADDGLIYIVSLLNADGFIKILTHTVKNFTGFAPLGTVLVSLLGVGIAEKSGLISALMR
70 80 90 100 110 120
130 140 150 160 170 180
orf12a.pep LLLTKSPRKLTTFMVVFTGILSNTASELGYVVLIPLSAIIFHSLGRHPLAGLAAAFAGVS
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf12-1 LLLTKSPRKLTTFMVVFTGILSNTASELGYVVLIPLSAIIFHSLGRHPLAGLAAAFAGVS
130 140 150 160 170 180
190 200 210 220 230 240
orf12a.pep GGYSANLFLGTIDPLLAGITQQAAQIIHPDYVVGPEANWFFMVASTFVIALIGYFVTEKI
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf12-1 GGYSANLFLGTIDPLLAGITQQAAQIIHPDYVVGPEANWFFMVASTFVIALIGYFVTEKI
190 200 210 220 230 240
250 260 270 280 290 300
orf12a.pep VEPQLGPYQSDLSQEEKDIRHSNEITPLEYKGLIWAGVVFVALSALLAWSIVPADGILRH
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf12-1 VEPQLGPYQSDLSQEEKDIRHSNEITPLEYKGLIWAGVVFVALSALLAWSIVPADGILRH
250 260 270 280 290 300
310 320 330 340 350 360
orf12a.pep PETGLVSGSPFLKSIVVFIFLLFALPGIVYGRVTRSLRGEQEVVNAMAESMSTLGLYLVI
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf12-1 PETGLVSGSPFLKSIVVFIFLLFALPGIVYGRVTRSLRGEQEVVNAMAESMSTLGLYLVI
310 320 330 340 350 360
370 380 390 400 410 420
orf12a.pep IFFAAQFVAFFNWTNIGQYIAVKGATFLKEVGLGGSVLFIGFILICAFINLMIGSASAQW
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf12-1 IFFAAQFVAFFNWTNIGQYIAVKGATFLKEVGLGGSVLFIGFILICAFINLMIGSASAQW
370 380 390 400 410 420
430 440 450 460 470 480
orf12a.pep AVTAPIFVPMLMLAGYAPEVIQAAYRIGDSVTNIITPMMSYFGLIMATVIKYKKDAGVGT
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf12-1 AVTAPIFVPMLMLAGYAPEVIQAAYRIGDSVTNIITPMMSYFGLIMATVIKYKKDAGVGT
430 440 450 460 470 480
490 500 510 520
orf12a.pep LISMMLPYSAFFLIAWIALFCIWVFVLGLPVGPGAPTFYPAPX
|||||||||||||||||||||||||||||||||||||||||||
orf12-1 LISMMLPYSAFFLIAWIALFCIWVFVLGLPVGPGAPTFYPAPX
490 500 510 520
Homology with a predicted ORF from N. gonorrhoeae
ORF12 shows 92.5% identity over a 320 aa overlap with a predicted ORF (ORF12.ng) from N. gonorrhoeae:
orf12.pep AXXIIHPXXVVGPEANWFFMVASTFVIALI 30
| ||| |||||||||||:|||||||||
orf12ng AAAFAGVSGGYSANLFLGTIDPLLAGITQQAAQIIHPDYVVGPEANWFFMAASTFVIALI 232
orf12.pep GYFVTEKIVEPQLGPYQSDLSQEEKDIRHSNEITPLEYKGLIWAGVVFVALSALLAWSIV 90
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf12ng GYFVTEKIVEPQLGPYQSDLSQEEKDIRHSNEITPLEYKGLIWAGVVFVALSALLAWSIV 292
orf12.pep PADGILRHPETGLVSGSPFLKSIVVFIFLLFALPGIVYGRVTRSLRGEQEVVNAXAESMS 150
||||||||||||||:|||||||||||||||||||||||||:|||||||:||||| |||||
orf12ng PADGILRHPETGLVAGSPFLKSIVVFIFLLFALPGIVYGRITRSLRGEREVVNAMAESMS 352
orf12.pep TLXLXLXXIFFAAQFVAFFNWTNIGQYIAVKGATFLKEVGLGGSVLFIGFILICAFINLM 210
|| | | |||||||||||||||||||||||||:|||: ||||||||||||||||||||
orf12ng TLGLYLVIIFFAAQFVAFFNWTNIGQYIAVKGAVFLKKFRLGGSVLFIGFILICAFINLM 412
orf12.pep IGSASAQWAVTAPIFVPMLMLAGYAPEVIQAAYRIGDSVTNIITPMMSYFGLIMATVXXY 270
||||||||||||||||||||||| ||:|||||||||||||||||||||||||||||| |
orf12ng IGSASAQWAVTAPIFVPMLMLAGNAPQVIQAAYRIGDSVTNIITPMMSYFGLIMATVIKY 472
orf12.pep KKDAGVGTLIXMMLPYSAFFLIAWIALFCIWVFVLGLPVGPGAPTFYPAP 320
|||||||||| |||||||||||||||||||||||||||||||:|||||:|
orf12ng KKDAGVGTLISMMLPYSAFFLIAWIALFCIWVFVLGLPVGPGTPTFYPVP 522
The full-length ORF12ng nucleotide sequence <SEQ ID 139> is:
1 ATGAGTCAAA CCGACGCGCG TCGTAGCGGA CGATTTTTAC GCACAGTCGA
51 ATGGCTGGGC AATATGTTGC CGCACCCGGT TACGCTTTTT ATTATTTTCA
101 TTGTGTTATT GCTGATTGcc tctgCCGTCG GTGCGTATTT CGGACTATCC
151 GTCCCCGATC CGCGTCCTGT TGGGGCGAAA GGACGTGCCG ATGACGGTTT
201 GATTCACGTT GTCAGCCTGC TCGATGCCGA CGGTTTGATC AAAATCCTGA
251 CGCATACCGT TAAAAATTTC ACCGGTTTCG CGCCGTTGGG AACGGTGTTG
301 GTTTCTTTAT TGGGCGTGGG GATTGCGGAA AAATCGGGCT TGATTTCCGC
351 ATTAATGCGC TTATTGCTCA CAAAATCCCC ACGCAAACTC ACTACTTTTA
401 TGGTTGTTTT TACAGGGATT TTATCCAATA CGGCTTCTGA ATTGGGCTAT
451 GTCGTCCTAA TCCCTTTGTC CGCCGTCATC TTTCATTCGC TCGGCCGCCA
501 TCCGCTTGCC GGTTTGGCTG CGGCTTTCGC CGGCGTTTCG GGCGGTTATT
551 CGGCCAATCT GTTCTTAGGC ACAATCGATC CGCTCTTGGC AGGCATCACC
601 CAACAGGCGG CGCAAATCAT CCATCCCGAC TACGTCGTAG GCCCTGAAGC
651 CAACTGGTTT TTTATGGCAG CCAGTACGTT TGTGATTGCT TTGATTGGTT
701 ATTTTGTTAC TGAAAAAATC GTCGAACCGC AATTGGGCCC TTATCAATCA
751 GATTTGTCAC AAGAAGAAAA AGACATTCGG CATTCCAATG AAATCACGCC
801 TTTGGAATAT AAAGGATTAA TTTGGGCAGG CGTGGTGTTT GTTGCCTTAT
851 CCGCCCTATT GGCTTGGAGC ATCGTCCCTG CCGACGGTAT TTTGCGTCAT
901 CCTGAAACAG GATTGGTTGC CGGTTCGCCG TTTTTAAAAT CGATTGTTGT
951 TTTTATTTTC TTGTTGTTTG CGCTGCCGGG CATTGTTTAT GGCCGGATAA
1001 CCCGAAGTTT GCGCGGCGAA CGGGAAGTCG TTAATGCGAT GGCCGAATCG
1051 ATGAGTACTT TGGGACTTTA TTTGGTCATC ATCTTTTTTG CCGCACAGTT
1101 TGTCGCATTT TTTAATTGGA CGAATATTGG GCAATATATT GCCGTTAAAG
1151 GGGCGGTGTT CTTAAAAGAA GTCGGCTTGG GCGGCAGTGT GTTGTTTATC
1201 GGTTTTATTT TAATTTGTGC TTTTATCAAT CTGATGATAG GCTCCGCCTC
1251 CGCGCAATGG GCGGTAACTG CGCCGATTTT CGTCCCTATG CTGATGTTGG
1301 CCGGCTACGC GCCCGAAGTC ATTCAAGCCG CTTACCGCAT CGGTGATTCC
1351 GTTACCAATA TTATTACGCC GATGATGAGT TATTTCGGGC TGATTATGGC
1401 GACGGTAATC AAATACAAAA AAGATGCGGG CGTAGGCACG CTGATTTCTA
1451 TGATGTTGCC GTATTCCGCT TTCTTCTTAA TTGCATGGAT CGCCTTATTC
1501 TGCATTTGGG TATTTGTTTT GGGTCTGCCC GTCGGTCCCG GCACACCCAC
1551 ATTCTATCCG GTGCCTTAA
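A numbered listing like the one above can be collapsed back into a plain string before any further analysis. The following is a minimal sketch and not part of the original disclosure; the function name `parse_numbered_sequence` is hypothetical, and the example input is simply the last two printed lines of SEQ ID 139.

```python
def parse_numbered_sequence(listing: str) -> str:
    """Join a position-numbered, space-separated sequence listing into one string.

    Each line is assumed to start with a position number followed by blocks
    of letters; the number and all whitespace are discarded.
    """
    blocks = []
    for line in listing.splitlines():
        fields = line.split()
        # Drop the leading position number, if there is one.
        if fields and fields[0].isdigit():
            fields = fields[1:]
        blocks.extend(fields)
    # Upper-case so that mixed-case bases (e.g. "cc tctg") compare equal.
    return "".join(blocks).upper()


# Last two lines of SEQ ID 139 as printed above.
tail = """1501 TGCATTTGGG TATTTGTTTT GGGTCTGCCC GTCGGTCCCG GCACACCCAC
1551 ATTCTATCCG GTGCCTTAA"""

seq_tail = parse_numbered_sequence(tail)
# Position of the final base = start number of the first line + length - 1.
last_position = 1501 + len(seq_tail) - 1
print(last_position, last_position % 3)
```

The final base falls at position 1569, which is 523 codons. That is consistent with a 522-residue protein followed by a stop codon.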
This encodes a protein having amino acid sequence <SEQ ID 140>:
1 MSQTDARRSG RFLRTVEWLG NMLPHPVTLF IIFIVLLLIA SAVGAYFGLS
51 VPDPRPVGAK GRADDGLIHV VSLLDADGLI KILTHTVKNF TGFAPLGTVL
101 VSLLGVGIAE KSGLISALMR LLLTKSPRKL TTFMVVFTGI LSNTASELGY
151 VVLIPLSAVI FHSLGRHPLA GLAAAFAGVS GGYSANLFLG TIDPLLAGIT
201 QQAAQIIHPD YVVGPEANWF FMAASTFVIA LIGYFVTEKI VEPQLGPYQS
251 DLSQEEKDIR HSNEITPLEY KGLIWAGVVF VALSALLAWS IVPADGILRH
301 PETGLVAGSP FLKSIVVFIF LLFALPGIVY GRITRSLRGE REVVNAMAES
351 MSTLGLYLVI IFFAAQFVAF FNWTNIGQYI AVKGAVFLKK FRLGGSVLFI
401 GFILICAFIN LMIGSASAQW AVTAPIFVPM LMLAGNAPQV IQAAYRIGDS
451 VTNIITPMMS YFGLIMATVI KYKKDAGVGT LISMMLPYSA FFLIAWIALF
501 CIWVFVLGLP VGPGTPTFYP VP*
ORF12ng and ORF12-1 show 97.1% identity over a 522 aa overlap:
10 20 30 40 50 60
orf12-1.pep MSQTDTQRDGRFLRTVEWLGNMLPHPVTLFIIFIVLLLIASAVGAYFGLSVPDPRPVGAK
|||||::|:|||||||||||||||||||||||||||||||||||||||||||||||||||
orf12ng MSQTDARRSGRFLRTVEWLGNMLPHPVTLFIIFIVLLLIASAVGAYFGLSVPDPRPVGAK
10 20 30 40 50 60
70 80 90 100 110 120
orf12-1.pep GRADDGLIYIVSLLNADGFIKILTHTVKNFTGFAPLGTVLVSLLGVGIAEKSGLISALMR
||||||||::||||:|||:|||||||||||||||||||||||||||||||||||||||||
orf12ng GRADDGLIHVVSLLDADGLIKILTHTVKNFTGFAPLGTVLVSLLGVGIAEKSGLISALMR
70 80 90 100 110 120
130 140 150 160 170 180
orf12-1.pep LLLTKSPRKLTTFMVVFTGILSNTASELGYVVLIPLSAIIFHSLGRHPLAGLAAAFAGVS
||||||||||||||||||||||||||||||||||||||:|||||||||||||||||||||
orf12ng LLLTKSPRKLTTFMVVFTGILSNTASELGYVVLIPLSAVIFHSLGRHPLAGLAAAFAGVS
130 140 150 160 170 180
190 200 210 220 230 240
orf12-1.pep GGYSANLFLGTIDPLLAGITQQAAQIIHPDYVVGPEANWFFMVASTFVIALIGYFVTEKI
||||||||||||||||||||||||||||||||||||||||||:|||||||||||||||||
orf12ng GGYSANLFLGTIDPLLAGITQQAAQIIHPDYVVGPEANWFFMAASTFVIALIGYFVTEKI
190 200 210 220 230 240
250 260 270 280 290 300
orf12-1.pep VEPQLGPYQSDLSQEEKDIRHSNEITPLEYKGLIWAGVVFVALSALLAWSIVPADGILRH
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf12ng VEPQLGPYQSDLSQEEKDIRHSNEITPLEYKGLIWAGVVFVALSALLAWSIVPADGILRH
250 260 270 280 290 300
310 320 330 340 350 360
orf12-1.pep PETGLVSGSPFLKSIVVFIFLLFALPGIVYGRVTRSLRGEQEVVNAMAESMSTLGLYLVI
||||||:|||||||||||||||||||||||||:|||||||:|||||||||||||||||||
orf12ng PETGLVAGSPFLKSIVVFIFLLFALPGIVYGRITRSLRGEREVVNAMAESMSTLGLYLVI
310 320 330 340 350 360
370 380 390 400 410 420
orf12-1.pep IFFAAQFVAFFNWTNIGQYIAVKGATFLKEVGLGGSVLFIGFILICAFINLMIGSASAQW
|||||||||||||||||||||||||:||||||||||||||||||||||||||||||||||
orf12ng IFFAAQFVAFFNWTNIGQYIAVKGAVFLKEVGLGGSVLFIGFILICAFINLMIGSASAQW
370 380 390 400 410 420
430 440 450 460 470 480
orf12-1.pep AVTAPIFVPMLMLAGYAPEVIQAAYRIGDSVTNIITPMMSYFGLIMATVIKYKKDAGVGT
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf12ng AVTAPIFVPMLMLAGYAPEVIQAAYRIGDSVTNIITPMMSYFGLIMATVIKYKKDAGVGT
430 440 450 460 470 480
490 500 510 520
orf12-1.pep LISMMLPYSAFFLIAWIALFCIWVFVLGLPVGPGAPTFYPAPX
||||||||||||||||||||||||||||||||||:|||||:||
orf12ng LISMMLPYSAFFLIAWIALFCIWVFVLGLPVGPGTPTFYPVPX
490 500 510 520
In addition, ORF12ng shows significant homology with a hypothetical protein from E. coli:
sp|P46133|YDAH_ECOLI hypothetical 55.1 kD protein in the OGT-DBPA intergenic region
>gi|1787597 (AE000231) hypothetical protein in 5' region [Escherichia coli] Length = 510
Score = 329 bits (835), Expect = 2e-89
Identities = 178/507 (35%), Positives = 281/507 (55%), Gaps = 15/507 (2%)
Query: 8 RSGRFLRTVEWLGNMLPHPVTXXXXXXXXXXXASAVGAYFGLSVPDPRPVGAKGRADDGL 67
+SG+ VE +GN +PHP +A+ + FG +S +P D
Sbjct: 13 QSGKLYGWVERIGNKVPHPFLLFIYLIIVLMVTTAILSAFGVSAKNP--------TDGTP 64
Query: 68 IHVVSLLDADGLIKILTHTVKNFTGFAPXXXXXXXXXXXXIAEKSGLISALMRLLLTKSP 127
+ V +LL +GL L + +KNF+GFAP +AE+ GL+ ALM + +
Sbjct: 65 VVVKNLLSVEGLHWFLPNVIKNFSGFAPLGAILALVLGAGLAERVGLLPALMVKMASHVN 124
Query: 128 RKLTTFMVVFTGILSNTASELGYVVLIPLSAVIFHSLGRHPLAGLAAAFAGVSGGYSANL 187
+ ++MV+F S+ +S+ V++ P+ A+IF ++GRHP+AGL AA AGV G++ANL
Sbjct: 125 ARYASYMVLFIAFFSHISSDAALVIMPPMGALIFLAVGRHPVAGLLAAIAGVGCGFTANL 184
Query: 188 FLGTIDPLLAGITQQAAQIIHPDYVVGPEANWFFMAASTFVIALIGYFVTEKIVEPQLGP 247
+ T D LL+GI+ +AA +P V NW+FMA+S V+ ++G +T+KI+EP+LG
Sbjct: 185 LIVTTDVLLSGISTEAAAAFNPQMHVSVIDNWYFMASSVVVLTIVGGLITDKIIEPRLGQ 244
Query: 248 YQSDLSQEEKDIRHSNEITPLEYKGLIWAGVVFVALSALLAWSIVPADGILRHPETGLVA 307
+Q + ++ + + S GL AGVV + A +A ++P +GILR P V
Sbjct: 245 WQGNSDEKLQTLTESQRF------GLRIAGVVSLLFIAAIALMVIPQNGILRDPINHTVM 298
Query: 308 GSPFLKSIVVFIFLLFALPGIVYGRITRSLRGEREVVNAMAESMSTLGLYLXXXXXXXXX 367
SPF+K IV I L F + + YG TR++R + ++ + M E M + ++
Sbjct: 299 PSPFIKGIVPLIILFFFVVSLAYGIATRTIRRQADLPHLMIEPMKEMAGFIVMVFPLAQF 358
Query: 368 XXXXNWTNIGQYIAVKGAVFLKEVGLGGSVLFIGFILICAFINLMIGSASAQWAVTAPIF 427
NW+N+G++IAV L+ GL G F+G L+ +F+ + I S SA W++ APIF
Sbjct: 359 VAMFNWSNMGKFIAVGLTDILESSGLSGIPAFVGLALLSSFLCMFIASGSAIWSILAPIF 418
Query: 428 VPMLMLAGYAPEVIQAAYRIGDSVTNIITPMMSYFGLIMATVIKYKKDAGVGTLISMMLP 487
VPM ML G+P Q +RI DS + P+ + L + + +YK DA +GT S++LP
Sbjct: 419 VPMFMLLGFHPAFAQILFRIADSSVLPLAPVSPFVPLFLGFLQRYKPDAKLGTYYSLVLP 478
Query: 488 YSAFFLIAWIALFCIWVFVLGLPVGPG 514
Y FL+ W+ + W +++GLP+GPG
Sbjct: 479 YPLIFLVVWLLMLLAW-YLVGLPIGPG 504
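The percentages in the search header above follow directly from the raw counts. The sketch below is an illustration only (the function name `blast_percentages` is hypothetical). It assumes the program truncates to whole percentages rather than rounding, which is an inference from the printed figures (15/507 is about 2.96% but is reported as 2%), not something the source states.

```python
def blast_percentages(identities: int, positives: int, gaps: int, length: int) -> dict[str, int]:
    """Return whole-number percentages for an alignment of the given length."""
    counts = {"identities": identities, "positives": positives, "gaps": gaps}
    # Integer (floor) division: 15/507 is about 2.96%, reported as 2%.
    return {name: 100 * count // length for name, count in counts.items()}


# Counts taken from the header above.
stats = blast_percentages(identities=178, positives=281, gaps=15, length=507)
print(stats)  # {'identities': 35, 'positives': 55, 'gaps': 2}
```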
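Based on this analysis, including the presence of several putative transmembrane domains in the gonococcal protein and a predicted actinin-type actin-binding domain signature (shown in bold), it is predicted that these proteins from N. meningitidis and N. gonorrhoeae, and their epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies.
Example 17
The following partial DNA sequence was identified in N. meningitidis <SEQ ID 141>: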
1 ..ACAGCCGGCG CAGCAGGTTn CnCGGTCTTC GTTTTCGTAA CGGACAGTCA
51 GGTGGAGGTG TTCGGGAACA TCCAGACCGC AGTGGAAACA GGTTTTTTTC
101 ATGGCATTTC GGTTTCGTCT GTGTTTGGTG CGGCGGCACA AGACTCGGCA
151 ATgGCTTCGC GCAGTGCGTC TATACCGGTA TTTTCAGCAA CGGAAATGCG
201 GACGGcGgCA ATTTTTCCCG CAGCGTCGCG CCATATGCCC GTGTTTTgTT
251 CTTCAGACGG CAGCAGGTCG GTTTTGTTGT ACACCTTgAT GCACGGAaTA
301 TCGCCGGCAT GGATTTCTTG CAGTACGTTT TCCACGTCTT CAATCTGCTG
351 TCCGCTGTTC GGAGCGGCGG CATCGACGAC GTGCAGCAGC ACATCgGcTT
401 gCGCGGTTTC TTCCAGCGTG GCgGAAAAGG CGGAAATCAG TTTgTGCGGC
451 agATyGCTnA CGAATCCGAC GGTATCGGTC AGGATAATGC TGCATTCGGG
501 ACT..
This corresponds to the amino acid sequence <SEQ ID 142; ORF14>:
1 ..TAGAAGXXVF VFVTDSQVEV FGNIQTAVET GFFHGISVSS VFGAAAQDSA
51 MASRSASIPV FSATEMRTAA IFPAASRHMP VFCSSDGSRS VLLYTLMHGI
101 SPAWISCSTF STSSICCPLF GAAASTTCSS TSACAVSSSV AEKAEISLCG
151 RXLTNPTVSV RIMLHSG..
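Computer analysis of this amino acid sequence gave the following results:
Homology with a predicted ORF from N. meningitidis (strain A)
ORF14 shows 94.0% identity over a 167 aa overlap with an ORF (ORF14a) from strain A of N. meningitidis: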
10 20 30
orf14.pep TAGAAGXXVFVFVTDSQVEVFGNIQTAVET
|:|||| |||||||:|::||||:| ||||
orf14a GRQLGFLRVGGALFVITAQARVNNALCDCLTTGAAGFAVFVFVTDGQMQVFGNVQPAVET
150 160 170 180 190 200
40 50 60 70 80 90
orf14.pep GFFHGISVSSVFGAAAQDSAMASRSASIPVFSATEMRTAAIFPAASRHMPVFCSSDGSRS
||||||||||||||||| ||||||||||||||||||||||||||||||||||||||||||
orf14a GFFHGISVSSVFGAAAQYSAMASRSASIPVFSATEMRTAAIFPAASRHMPVFCSSDGSRS
210 220 230 240 250 260
100 110 120 130 140 150
orf14.pep VLLYTLMHGISPAWISCSTFSTSSICCPLFGAAASTTCSSTSACAVSSSVAEKAEISLCG
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf14a VLLYTLMHGISPAWISCSTFSTSSICCPLFGAAASTTCSSTSACAVSSSVAEKAEISLCG
270 280 290 300 310 320
160
orf14.pep RXLTNPTVSVRIMLHSG
| |||||||||||||||
orf14a RSLTNPTVSVRIMLHSGLMYSRRAVVSSVAKSWSFAYMPDLVSRLNRLDLPTLVX
330 340 350 360 370 380
The complete length ORF14a nucleotide sequence <SEQ ID 143> is:
1 ATGGAGGATT TGCAGGAAAT CGGGTTCGAT GTCGCCGCCG TAAAGGTAGG
51 TCGGCAGCGC GAACATCATC GTCTGCATCA TCCCCAGCCC GGCAACGGCG
101 AGGCGGACGA TGTATTGTTT GCGTTCTTTT TGGTTGGCGG CTTCGATTTT
151 TTGCGCGTCA TAGGGTGCGG CGGTGTAGCC TATCTGCCTG ATTTTCAACA
201 GAATGTCGGA AAGGCGGATT TTGCCGTCGT CCCAGACGAC GCGGCAGCGG
251 TGCGTGCTGT AATTGAGGTC GATGCGGACG ATGCCGTCTG TACGCAAAAG
301 CTGCTGTTCG ATCAGCCAGA CGCAGGCGGC GCAGGTGATG CCGCCGAGCA
351 TTAAAACCGC CTCGCGCGTG CCGCCGTGGG TTTCCACAAA GTCGGACTGG
401 ACTTCGGGCA GGTCGTACAG GCGGATTTGG TCGAGGATTT CTTGGGGCGG
451 CAGCTCGGTT TTTTGCGCGT CGGCGGTGCG TTGTTTGTAA TAACTGCCCA
501 AGCCCGCGTC AATAATGCTT TGTGCGACTG CCTGACAACC GGCGCAGCAG
551 GTTTCGCGGT CTTCGTTTTC GTAACGGACG GTCAGATGCA GGTTTTCGGG
601 AACGTCCAGC CCGCAGTGGA AACAGGTTTT TTTCATGGCA TTTCGGTTTC
651 GTCTGTGTTT GGTGCGGCGG CACAATACTC GGCAATGGCT TCGCGCAGTG
701 CGTCTATACC GGTATTTTCA GCAACGGAAA TGCGGACGGC GGCAATTTTT
751 CCCGCAGCGT CGCGCCATAT GCCCGTGTTT TGTTCTTCAG ACGGCAGCAG
801 GTCGGTTTTG TTGTACACCT TGATGCACGG AATATCGCCG GCATGGATTT
851 CTTGCAGTAC GTTTTCCACG TCTTCAATCT GCTGTCCGCT GTTCGGAGCG
901 GCGGCATCGA CGACGTGCAG CAGCACATCG GCTTGCGCGG TTTCTTCCAG
951 CGTGGCGGAA AAGGCGGAAA TCAGTTTGTG CGGCAGATCG CTGACGAATC
1001 CGACGGTATC GGTCAGGATA ATGCTGCATT CGGGACTGAT GTACAGCCGC
1051 CGCGCCGTCG TGTCGAGTGT GGCGAAAAGC TGGTCTTTCG CATATATGCC
1101 CGACTTGGTC AGCCGGTTGA ACAGACTGGA TTTGCCGACA TTGGTATAG
This encodes a protein having amino acid sequence <SEQ ID 144>:
1 MEDLQEIGFD VAAVKVGRQR EHHRLHHPQP GNGEADDVLF AFFLVGGFDF
51 LRVIGCGGVA YLPDFQQNVG KADFAVVPDD AAAVRAVIEV DADDAVCTQK
101 LLFDQPDAGG AGDAAEH*NR LARAAVGFHK VGLDFGQVVQ ADLVEDFLGR
151 QLGFLRVGGA LFVITAQARV NNALCDCLTT GAAGFAVFVF VTDGQMQVFG
201 NVQPAVETGF FHGISVSSVF GAAAQYSAMA SRSASIPVFS ATEMRTAAIF
251 PAASRHMPVF CSSDGSRSVL LYTLMHGISP AWISCSTFST SSICCPLFGA
301 AASTTCSSTS ACAVSSSVAE KAEISLCGRS LTNPTVSVRI MLHSGLMYSR
351 RAVVSSVAKS WSFAYMPDLV SRLNRLDLPT LV*
It should be noted that this sequence includes a stop codon at position 118.
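The internal stop can be checked directly against the nucleotide listing. The sketch below is illustrative only and assumes the standard genetic code; the helper names `translate` and `internal_stops` are hypothetical. The excerpt is nucleotides 343-360 of SEQ ID 143 as printed above, i.e. codons 115-120.

```python
from itertools import product

# Standard genetic code in the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna: str) -> str:
    """Translate DNA in frame 1; codons with non-ACGT symbols become 'X'."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3  # ignore a trailing partial codon
    return "".join(CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, usable, 3))


def internal_stops(protein: str, first_codon: int = 1) -> list[int]:
    """Return 1-based codon numbers of stops other than a terminal one."""
    body = protein[:-1] if protein.endswith("*") else protein
    return [first_codon + i for i, aa in enumerate(body) if aa == "*"]


excerpt = "GCCGAGCATTAAAACCGC"  # nt 343-360 of SEQ ID 143 (codons 115-120)
peptide = translate(excerpt)
stops = internal_stops(peptide, first_codon=115)
print(peptide, stops)  # AEH*NR [118]
```

The TAA at nucleotides 352-354 is the codon reported as position 118 in SEQ ID 144.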
Homology with a predicted ORF from N. gonorrhoeae
ORF14 shows 89.8% identity over a 167 aa overlap with a predicted ORF (ORF14.ng) from N. gonorrhoeae:
orf14.pep TAGAAGXXVFVFVTDSQVEVFGNIQTAVET 30
|| ||| ||:||:|:|::||||:| ||||
orf14ng GRQFGFFRVGGASFVITAQAGIDDALCDCLTADAAGFAVFAFVADGQMQVFGNVQPAVET 208
orf14.pep GFFHGISVSSVFGAAAQDSAMASRSASIPVFSATEMRTAAIFPAASRHMPVFCSSDGSRS 90
||||||||||||||||| ||||||||||||||||||||||||||||||||||||||||||
orf14ng GFFHGISVSSVFGAAAQYSAMASRSASIPVFSATEMRTAAIFPAASRHMPVFCSSDGSRS 268
orf14.pep VLLYTLMHGISPAWISCSTFSTSSICCPLFGAAASTTCSSTSACAVSSSVAEKAEISLCG 150
||||||||||| ||||||||||||||||||||||||||||||||:|||:|||||||||||
orf14ng VLLYTLMHGISWAWISCSTFSTSSICCPLFRAAASTTCSSTSACTVSSKVAEKAEISLCG 328
orf14.pep RXLTNPTVSVRIMLHSG 167
| |||||||||||||:|
orf14ng RSLTNPTVSVRIMLHAGLMYSRRAVVSRVAKSWSFAYMPDLVSRLNRLDLPTLV 382
The complete length ORF14ng nucleotide sequence <SEQ ID 145> is predicted to encode a protein having amino acid sequence <SEQ ID 146>:
1 MEDLQEIGFD VAAVKVGRQR EHHRLHHTQS GNGKADDVLF AFFLVGGFDF
51 LRVIGCGGVA CLPDFQQNVG EADFAVVPDD AAAVRAVIEV DADDAVCAQK
101 LLFDQPDAGG AGNAAEHQHC FVRAIMGFHK VGLDFGQVVQ ADLVEDFLGR
151 QFGFFRVGGA SFVITAQAGI DDALCDCLTA DAAGFAVFAF VADGQMQVFG
201 NVQPAVETGF FHGISVSSVF GAAAQYSAMA SRSASIPVFS ATEMRTAAIF
251 PAASRHMPVF CSSDGSRSVL LYTLMHGISW AWISCSTFST SSICCPLFRA
301 AASTTCSSTS ACTVSSKVAE KAEISLCGRS LTNPTVSVRI MLHAGLMYSR
351 RAVVSRVAKS WSFAYMPDLV SRLNRLDLPT LV*
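Based on the presence of a putative transmembrane domain in the gonococcal protein, it is predicted that these proteins from N. meningitidis and N. gonorrhoeae, and their epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies.
Example 18
The following partial DNA sequence was identified in N. meningitidis <SEQ ID 147>: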
1 ..GGCCATTACT CCGACCGCAC TTGGAAGCCG CGTTTGGNCG GCCGCCGTCT
51 GCCGTATCTG CTTTATGGCA CGCTGATTGC GGTTATTGTG ATGATTTTGA
101 TGCCGAACTC GGGCAGCTTC GGTTTCGGCT ATGCGTCGCT GGCGGCTTTG
151 TCGTTCGGCG CGCTGATGAT TGCGCTGTTA GACGTGTCGT CAAATATGGC
201 GATGCAGCCG TTTAAGATGA TGGTCGGCGA CATGGTCAAC GAGGAGCAGA
251 AAA.NTACGC CTACGGGATT CAAAGTTTCT TAGCAAATAC GGGCGCGGTC
301 GTGGCGGCGA TTCTGCCGTT TGTGTTTGCG TATATCGGTT TGGCGAACAC
351 CGCCGANAAA GGCGTTGTGC CGCAGACCGT GGTCGTGGCG TTTTATGTGG
401 GTGCGGCGTT GCTGGTGATT ACCAGCGCGT TCACGATTTT CAAAGTGAAG
451 GAATACGANC CGGAAACCTA CGCCCGTTAC CACGGCATCG ATGTCGCCGC
501 GAATCAGGAA AAAGCCAACT GGATCGCACT CTTAAAA.CC GCGC..
This corresponds to the amino acid sequence <SEQ ID 148; ORF16>:
1 ..GHYSDRTWKP RLXGRRLPYL LYGTLIAVIV MILMPNSGSF GFGYASLAAL
51 SFGALMIALL DVSSNMAMQP FKMMVGDMVN EEQKXYAYGI QSFLANTGAV
101 VAAILPFVFA YIGLANTAXK GVVPQTVVVA FYVGAALLVI TSAFTIFKVK
151 EYXPETYARY HGIDVAANQE KANWIALLKX A..
Further work revealed the complete nucleotide sequence <SEQ ID 149>:
1 ATGTCGGAAT ATACGCCTCA AACAGCAAAA CAAGGTTTGC CCGCGCTGGC
51 AAAAAGCACG ATTTGGATGC TCAGTTTCGG CTTTCTCGGC GTTCAGACGG
101 CCTTTACCCT GCAAAGCTCG CAAATGAGCC GCATTTTTCA AACGCTAGGC
151 GCAGACCCGC ACAATTTGGG CTGGTTTTTC ATCCTGCCGC CGCTGGCGGG
201 GATGCTGGTG CAGCCGATTG TCGGCCATTA CTCCGACCGC ACTTGGAAGC
251 CGCGTTTGGG CGGCCGCCGT CTGCCGTATC TGCTTTATGG CACGCTGATT
301 GCGGTTATTG TGATGATTTT GATGCCGAAC TCGGGCAGCT TCGGTTTCGG
351 CTATGCGTCG CTGGCGGCTT TGTCGTTCGG CGCGCTGATG ATTGCGCTGT
401 TAGACGTGTC GTCAAATATG GCGATGCAGC CGTTTAAGAT GATGGTCGGC
451 GACATGGTCA ACGAGGAGCA GAAAGGCTAC GCCTACGGGA TTCAAAGTTT
501 CTTAGCAAAT ACGGGCGCGG TCGTGGCGGC GATTCTGCCG TTTGTGTTTG
551 CGTATATCGG TTTGGCGAAC ACCGCCGAGA AAGGCGTTGT GCCGCAGACC
601 GTGGTCGTGG CGTTTTATGT GGGTGCGGCG TTGCTGGTGA TTACCAGCGC
651 GTTCACGATT TTCAAAGTGA AGGAATACGA TCCGGAAACC TACGCCCGTT
701 ACCACGGCAT CGATGTCGCC GCGAATCAGG AAAAAGCCAA CTGGATCGAA
751 CTCTTGAAAA CCGCGCCTAA GGCGTTTTGG ACGGTTACTT TGGTGCAATT
801 CTTCTGCTGG TTCGCCTTCC AATATATGTG GACTTACTCG GCAGGCGCGA
851 TTGCGGAAAA CGTCTGGCAC ACCACCGATG CGTCTTCCGT AGGTTATCAG
901 GAGGCGGGTA ACTGGTACGG CGTTTTGGCG GCGGTGCAGT CGGTTGCGGC
951 GGTGATTTGT TCGTTTGTAT TGGCGAAAGT GCCGAATAAA TACCATAAGG
1001 CGGGTTATTT CGGCTGTTTG GCTTTGGGCG CGCTCGGCTT TTTCTCCGTT
1051 TTCTTCATCG GCAACCAATA CGCGCTGGTG TTGTCTTATA CCTTAATCGG
1101 CATCGCTTGG GCGGGCATTA TCACTTATCC GCTGACGATT GTGACCAACG
1151 CCTTGTCGGG CAAGCATATG GGCACTTACT TGGGCTTGTT TAACGGCTCT
1201 ATCTGTATGC CTCAAATCGT CGCTTCGCTG TTGAGTTTCG TGCTTTTCCC
1251 TATGCTGGGC GGCTTGCAGG CCACTATGTT CTTGGTAGGG GGCGTCGTCC
1301 TGCTGCTGGG CGCGTTTTCC GTGTTCCTGA TTAAAGAAAC ACACGGCGGG
1351 GTTTGA
This corresponds to the amino acid sequence <SEQ ID 150; ORF16-1>:
1 MSEYTPQTAK QGLPALAKST IWMLSFGFLG VQTAFTLQSS QMSRIFQTLG
51 ADPHNLGWFF ILPPLAGMLV QPIVGHYSDR TWKPRLGGRR LPYLLYGTLI
101 AVIVMILMPN SGSFGFGYAS LAALSFGALM IALLDVSSNM AMQPFKMMVG
151 DMVNEEQKGY AYGIQSFLAN TGAVVAAILP FVFAYIGLAN TAEKGVVPQT
201 VVVAFYVGAA LLVITSAFTI FKVKEYDPET YARYHGIDVA ANQEKANWIE
251 LLKTAPKAFW TVTLVQFFCW FAFQYMWTYS AGAIAENVWH TTDASSVGYQ
301 EAGNWYGVLA AVQSVAAVIC SFVLAKVPNK YHKAGYFGCL ALGALGFFSV
351 FFIGNQYALV LSYTLIGIAW AGIITYPLTI VTNALSGKHM GTYLGLFNGS
401 ICMPQIVASL LSFVLFPMLG GLQATMFLVG GVVLLLGAFS VFLIKETHGG
451 V*
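Computer analysis of this amino acid sequence gave the following results:
Homology with a predicted ORF from N. meningitidis (strain A)
ORF16 shows 96.7% identity over a 181 aa overlap with an ORF (ORF16a) from strain A of N. meningitidis: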
10 20 30
orf16.pep GHYSDRTWKPRLXGRRLPYLLYGTLIAVIV
|||||||||||| |||||||||||||||||
orf16a IFQTLGADPHSLGWFFILPPLAGMLVQPIVGHYSDRTWKPRLGGRRLPYLLYGTLIAVIV
50 60 70 80 90 100
40 50 60 70 80 90
orf16.pep MILMPNSGSFGFGYASLAALSFGALMIALLDVSSNMAMQPFKMMVGDMVNEEQKXYAYGI
|||||||||||||||||||||||||||||||||||||||||||||||||||||| |||||
orf16a MILMPNSGSFGFGYASLAALSFGALMIALLDVSSNMAMQPFKMMVGDMVNEEQKGYAYGI
110 120 130 140 150 160
100 110 120 130 140 150
orf16.pep QSFLANTGAVVAAILPFVFAYIGLANTAXKGVVPQTVVVAFYVGAALLVITSAFTIFKVK
|||||||||||||||||||||||||||| |||||||||||||||||||||||||||||||
orf16a QSFLANTGAVVAAILPFVFAYIGLANTAEKGVVPQTVVVAFYVGAALLVITSAFTIFKVK
170 180 190 200 210 220
160 170 180
orf16.pep EYXPETYARYHGIDVAANQEKANWIALLKXA
|| |||||||||||||||||||||| |||:|
orf16a EYNPETYARYHGIDVAANQEKANWIELLKTAPKAFWTVTLVQFFCWFAFQYMWTYSAGAI
230 240 250 260 270 280
orf16a AENVWHTTDASSVGYQEAGNWYGVLAAVQSVAAVICSFVLAKVPNKYHKAGYFGCLALGA
290 300 310 320 330 340
The complete length ORF16a nucleotide sequence <SEQ ID 151> is:
1 ATGTCGGAAT ATACGCCTCA AACAGCAAAA CAAGGTTTGC CCGCGCTGGC
51 AAAAAGCACG ATTTGGATGC TCAGTTTCGG CTTTCTCGGC GTTCAGACGG
101 CCTTTACCCT GCAAAGCTCG CAGATGAGCC GCATCTTCCA GACGCTCGGT
151 GCCGATCCGC ACAGCCTCGG CTGGTTCTTT ATCCTGCCGC CGCTGGCGGG
201 GATGCTGGTG CAGCCGATTG TCGGCCATTA CTCCGACCGC ACTTGGAAGC
251 CGCGTTTGGG CGGCCGCCGT CTGCCGTATC TGCTTTATGG CACGCTGATT
301 GCGGTTATTG TGATGATTTT GATGCCGAAC TCGGGCAGCT TCGGTTTCGG
351 CTATGCGTCG CTGGCGGCTT TGTCGTTCGG CGCGCTGATG ATTGCGCTGT
401 TAGACGTGTC GTCAAATATG GCGATGCAGC CGTTTAAGAT GATGGTCGGC
451 GACATGGTCA ACGAGGAGCA GAAAGGCTAC GCCTACGGGA TTCAAAGTTT
501 CTTAGCGAAT ACGGGCGCGG TCGTGGCGGC GATTCTGCCG TTTGTGTTTG
551 CGTATATCGG TTTGGCGAAC ACCGCCGAGA AAGGCGTTGT GCCGCAGACC
601 GTGGTCGTGG CGTTTTATGT GGGTGCGGCG TTGCTGGTGA TTACCAGCGC
651 GTTCACGATT TTCAAAGTGA AGGAATACAA TCCGGAAACC TACGCCCGTT
701 ACCACGGCAT CGATGTCGCC GCGAATCAGG AAAAAGCCAA CTGGATCGAA
751 CTCTTGAAAA CCGCGCCTAA GGCGTTTTGG ACGGTTACTT TGGTGCAATT
801 CTTCTGCTGG TTCGCCTTCC AATATATGTG GACTTACTCG GCAGGCGCGA
851 TTGCGGAAAA CGTCTGGCAC ACCACCGATG CGTCTTCCGT AGGTTATCAG
901 GAGGCGGGTA ACTGGTACGG CGTTTTGGCG GCGGTGCAGT CGGTTGCGGC
951 GGTGATTTGT TCGTTTGTAT TGGCGAAAGT GCCGAATAAA TACCATAAGG
1001 CGGGTTATTT CGGCTGTTTG GCTTTGGGCG CGCTCGGCTT TTTCTCCGTT
1051 TTCTTCATCG GCAACCAATA CGCGCTGGTG TTGTCTTATA CCTTAATCGG
1101 CATCGCTTGG GCGGGCATTA TCACTTATCC GCTGACGATT GTGACCAACG
1151 CCTTGTCGGG CAAGCATATG GGCACTTACT TGGGCCTGTT TAACGGCTCT
1201 ATCTGTATGC CGCAAATCGT CGCTTCGCTG TTGAGTTTCG TGCTTTTCCC
1251 TATGCTGGGC GGCTTGCAGG CCACTATGTT CTTGGTAGGG GGCGTCGTCC
1301 TGCTGCTGGG CGCGTTTTCC GTGTTCCTGA TTAAAGAAAC ACACGGCGGG
1351 GTTTGA
This encodes a protein having amino acid sequence <SEQ ID 152>:
1 MSEYTPQTAK QGLPALAKST IWMLSFGFLG VQTAFTLQSS QMSRIFQTLG
51 ADPHSLGWFF ILPPLAGMLV QPIVGHYSDR TWKPRLGGRR LPYLLYGTLI
101 AVIVMILMPN SGSFGFGYAS LAALSFGALM IALLDVSSNM AMQPFKMMVG
151 DMVNEEQKGY AYGIQSFLAN TGAVVAAILP FVFAYIGLAN TAEKGVVPQT
201 VVVAFYVGAA LLVITSAFTI FKVKEYNPET YARYHGIDVA ANQEKANWIE
251 LLKTAPKAFW TVTLVQFFCW FAFQYMWTYS AGAIAENVWH TTDASSVGYQ
301 EAGNWYGVLA AVQSVAAVIC SFVLAKVPNK YHKAGYFGCL ALGALGFFSV
351 FFIGNQYALV LSYTLIGIAW AGIITYPLTI VTNALSGKHM GTYLGLFNGS
401 ICMPQIVASL LSFVLFPMLG GLQATMFLVG GVVLLLGAFS VFLIKETHGG
451 V*
ORF16a and ORF16-1 show 99.6% identity over a 451 aa overlap:
10 20 30 40 50 60
orf16a.pep MSEYTPQTAKQGLPALAKSTIWMLSFGFLGVQTAFTLQSSQMSRIFQTLGADPHSLGWFF
||||||||||||||||||||||||||||||||||||||||||||||||||||||:|||||
orf16-1 MSEYTPQTAKQGLPALAKSTIWMLSFGFLGVQTAFTLQSSQMSRIFQTLGADPHNLGWFF
10 20 30 40 50 60
70 80 90 100 110 120
orf16a.pep ILPPLAGMLVQPIVGHYSDRTWKPRLGGRRLPYLLYGTLIAVIVMILMPNSGSFGFGYAS
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf16-1 ILPPLAGMLVQPIVGHYSDRTWKPRLGGRRLPYLLYGTLIAVIVMILMPNSGSFGFGYAS
70 80 90 100 110 120
130 140 150 160 170 180
orf16a.pep LAALSFGALMIALLDVSSNMAMQPFKMMVGDMVNEEQKGYAYGIQSFLANTGAVVAAILP
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf16-1 LAALSFGALMIALLDVSSNMAMQPFKMMVGDMVNEEQKGYAYGIQSFLANTGAVVAAILP
130 140 150 160 170 180
190 200 210 220 230 240
orf16a.pep FVFAYIGLANTAEKGVVPQTVVVAFYVGAALLVITSAFTIFKVKEYNPETYARYHGIDVA
||||||||||||||||||||||||||||||||||||||||||||||:|||||||||||||
orf16-1 FVFAYIGLANTAEKGVVPQTVVVAFYVGAALLVITSAFTIFKVKEYDPETYARYHGIDVA
190 200 210 220 230 240
250 260 270 280 290 300
orf16a.pep ANQEKANWIELLKTAPKAFWTVTLVQFFCWFAFQYMWTYSAGAIAENVWHTTDASSVGYQ
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf16-1 ANQEKANWIELLKTAPKAFWTVTLVQFFCWFAFQYMWTYSAGAIAENVWHTTDASSVGYQ
250 260 270 280 290 300
310 320 330 340 350 360
orf16a.pep EAGNWYGVLAAVQSVAAVICSFVLAKVPNKYHKAGYFGCLALGALGFFSVFFIGNQYALV
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf16-1 EAGNWYGVLAAVQSVAAVICSFVLAKVPNKYHKAGYFGCLALGALGFFSVFFIGNQYALV
310 320 330 340 350 360
370 380 390 400 410 420
orf16a.pep LSYTLIGIAWAGIITYPLTIVTNALSGKHMGTYLGLFNGSICMPQIVASLLSFVLFPMLG
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf16-1 LSYTLIGIAWAGIITYPLTIVTNALSGKHMGTYLGLFNGSICMPQIVASLLSFVLFPMLG
370 380 390 400 410 420
430 440 450
orf16a.pep GLQATMFLVGGVVLLLGAFSVFLIKETHGGVX
||||||||||||||||||||||||||||||||
orf16-1 GLQATMFLVGGVVLLLGAFSVFLIKETHGGVX
430 440 450
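The identity figure quoted for this alignment can be reproduced by counting matching columns. The following is a minimal sketch for a pre-aligned pair of equal length; the function name `percent_identity` is hypothetical, and the alignment program used in the source may define the denominator differently. The short example uses residues 51-60 of the two sequences, which contain one of the two differences marked in the alignment above (S/N at position 55; the other is N/D at position 227).

```python
def percent_identity(a: str, b: str) -> float:
    """Percent identity between two pre-aligned sequences of equal length.

    '-' marks a gap; columns where both sequences have a gap are ignored.
    """
    if len(a) != len(b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for x, y in zip(a, b):
        if x == "-" and y == "-":
            continue
        compared += 1
        if x == y:
            matches += 1
    return 100.0 * matches / compared


# Residues 51-60 of ORF16a and ORF16-1.
print(percent_identity("ADPHSLGWFF", "ADPHNLGWFF"))  # 90.0

# Full length: 451 residues with 2 mismatches.
print(round(100 * (451 - 2) / 451, 1))  # 99.6
```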
Homology with a predicted ORF from N. gonorrhoeae
ORF16 shows 93.9% identity over a 181 aa overlap with a predicted ORF (ORF16.ng) from N. gonorrhoeae:
orf16.pep GHYSDRTWKPRLXGRRLPYLLYGTLIAVIV 30
|:|||||||||| |||||||||||||||||
orf16ng HFSNARRRPAQFGLVFHPAAAGGDAGSADSGYYSDRTWKPRLGGRRLPYLLYGTLIAVIV 131
orf16.pep MILMPNSGSFGFGYASLAALSFGALMIALLDVSSNMAMQPFKMMVGDMVNEEQKXYAYGI 90
|||||||||||||||||||||||||||||||||||||||||||||||||||||| |||||
orf16ng MILMPNSGSFGFGYASLAALSFGALMIALLDVSSNMAMQPFKMMVGDMVNEEQKSYAYGI 191
orf16.pep QSFLANTGAVVAAILPFVFAYIGLANTAXKGVVPQTVVVAFYVGAALLVITSAFTIFKVK 150
||||||| |||||||||||||||||||| |||||||||||||||||||:||||||| |||
orf16ng QSFLANTDAVVAAILPFVFAYIGLANTAEKGVVPQTVVVAFYVGAALLIITSAFTISKVK 251
orf16.pep EYXPETYARYHGIDVAANQEKANWIALLKXA 181
|| |||||||||||||||||||||: |||:|
orf16ng EYDPETYARYHGIDVAANQEKANWFELLKTAPKVFWTVTPVQFFCWFAFRYMWTYSAGAI 311
The complete length ORF16ng nucleotide sequence <SEQ ID 153> is:
1 ATGATAGGGG ATCGCCGCGC CGGCAACCAT TTCGGATTTT CCAAAGCAAA
51 TACTTTTCAA ATCAAAAAAA AGGATTTACT TTATGTCGGA ATATACGCCT
101 CAAACAGCAA AACAAGGTTT GCCCGCGCCG GCAAAAAGCA CGATTTGGAT
151 GTTGAGCTTC GGCTATCTCG GCGTTCAGAC GGCCTTTACC CTGCAAAGCT
201 CGCAGATGAG CCGCATTTTT CAAACGCTAG GCGCAGACCC GCACAATTTG
251 GGCTGGTTTT TCATCCTGCC GCCGCTGGCG GGGATGCTGG TTCAGCCGAT
301 AGTGGCTACT ACTCAGACCG CACTTGGAAG CCGCGCTTGG GCGGCCGCCG
351 CCTGCCGTAT CTGCTTTACG GCACGCTGAT TGCGGTCATC GTGATGATTT
401 TGATGCCGAA CTCGGGCAGC TTCGGTTTCG GCTATGCGTC GCTGGCGGCC
451 TTGTCGTTCG GCGCGCTGAT GATTGCGCTG TTGGACGTGT CGTCGAATAT
501 GGCGATGCAG CCGTTTAAGA TGATGGTCGG CGATATGGTC AACGAGGAGC
551 AGAAAAGCTA CGCCTACGGG ATTCAAAGTT TCTTAGCGAA TACGGACGCG
601 GTTGTGGCAG CGATTCTGCC GTTTGTGTTC GCGTATATCG GTTTGGCGAA
651 CACTGCCGAG AAAGGCGTTG TGCCACAAAC CGTGGTCGTA GCATTCTATG
701 TGGGTGCGGC GTTACTGATT ATTACCAGTG CGTTCACAAT CTCCAAAGTC
751 AAAGAATACG ACCCGGAAAC CTACGCCCGT TACCACGGCA TCGATGTCGC
801 CGCGAATCAG GAAAAAGCCA ACTGGTTCGA ACTCTTAAAA ACCGCGCCTA
851 AAGTGTTTTG GACGGTTACT CCGGTACAGT TTTTCTGCTG GTTCGCCTTC
901 CGGTATATGT GGACTTACTC GGCAGGCGCG ATTGCAGAAA ACGTCTGGCA
951 CACTACCGAT GCGTCTTCCG TAGGCCATCA GGAGGCGGGC AACCGGTACG
1001 GCGTTTTGGC GGCGGTGTAG
This encodes a protein having amino acid sequence <SEQ ID 154>:
1 MIGDRRAGNH FGFSKANTFQ IKKKDLLYVG IYASNSKTRF ARAGKKHDLD
51 VELRLSRRSD GLYPAKLADE PHFSNARRRP AQFGLVFHPA AAGGDAGSAD
101 SGYYSDRTWK PRLGGRRLPY LLYGTLIAVI VMILMPNSGS FGFGYASLAA
151 LSFGALMIAL LDVSSNMAMQ PFKMMVGDMV NEEQKSYAYG IQSFLANTDA
201 VVAAILPFVF AYIGLANTAE KGVVPQTVVV AFYVGAALLI ITSAFTISKV
251 KEYDPETYAR YHGIDVAANQ EKANWFELLK TAPKVFWTVT PVQFFCWFAF
301 RYMWTYSAGA IAENVWHTTD ASSVGHQEAG NRYGVLAAV*
ORF16ng and ORF16-1 show 89.3% identity over a 261 aa overlap:
30 40 50 60 70 80
orf16-1.pep MLSFGFLGVQTAFTLQSSQMSRIFQTLGADPHNLGWFFILPPLAGMLVQPI-VGHYSDRT
| ::| | | || : |:|||||
orf16ng DVELRLSRRSDGLYPAKLADEPHFSNARRRPAQFGLVF-HPAAAGGDAGSADSGYYSDRT
50 60 70 80 90 100
90 100 110 120 130 140
orf16-1.pep WKPRLGGRRLPYLLYGTLIAVIVMILMPNSGSFGFGYASLAALSFGALMIALLDVSSNMA
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf16ng WKPRLGGRRLPYLLYGTLIAVIVMILMPNSGSFGFGYASLAALSFGALMIALLDVSSNMA
110 120 130 140 150 160
150 160 170 180 190 200
orf16-1.pep MQPFKMMVGDMVNEEQKGYAYGIQSFLANTGAVVAAILPFVFAYIGLANTAEKGVVPQTV
|||||||||||||||||:|||||||||||| |||||||||||||||||||||||||||||
orf16ng MQPFKMMVGDMVNEEQKSYAYGIQSFLANTDAVVAAILPFVFAYIGLANTAEKGVVPQTV
170 180 190 200 210 220
210 220 230 240 250 260
orf16-1.pep VVAFYVGAALLVITSAFTIFKVKEYDPETYARYHGIDVAANQEKANWIELLKTAPKAFWT
|||||||||||:||||||| |||||||||||||||||||||||||||:||||||||:|||
orf16ng VVAFYVGAALLIITSAFTISKVKEYDPETYARYHGIDVAANQEKANWFELLKTAPKVFWT
230 240 250 260 270 280
270 280 290 300 310 320
orf16-1.pep VTLVQFFCWFAFQYMWTYSAGAIAENVWHTTDASSVGYQEAGNWYGVLAAVQSVAAVICS
|| ||||||||:||||||||||||||||||||||||:||||| |||||||||||||
orf16ng VTPVQFFCWFAFRYMWTYSAGAIAENVWHTTDASSVGHQEAGNRYGVLAAVX
290 300 310 320 330 340
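Based on this analysis, including the presence of several putative transmembrane domains in the gonococcal protein, it is predicted that these proteins from N. meningitidis and N. gonorrhoeae, and their epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies.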
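The source does not say which method was used to predict the transmembrane domains. As an illustration only, the sketch below applies one widely used heuristic, a Kyte-Doolittle hydropathy profile with a 19-residue window and a threshold of about 1.6. The function names `hydropathy_profile` and `candidate_tm_windows` are hypothetical, and the example segment is residues 91-110 of ORF16-1 as listed above.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(seq: str, window: int = 19) -> list[float]:
    """Mean hydropathy of every full window; unknown symbols score 0."""
    scores = [KD.get(aa, 0.0) for aa in seq.upper()]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]


def candidate_tm_windows(seq: str, window: int = 19, threshold: float = 1.6) -> list[int]:
    """1-based start positions of windows whose mean hydropathy meets the threshold."""
    profile = hydropathy_profile(seq, window)
    return [i + 1 for i, score in enumerate(profile) if score >= threshold]


segment = "LPYLLYGTLIAVIVMILMPN"  # residues 91-110 of ORF16-1
profile = hydropathy_profile(segment)
hits = candidate_tm_windows(segment)
print(len(profile), hits)
```

A window that passes the threshold is only a candidate; dedicated predictors use more information than average hydropathy.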
Example 19
The following partial DNA sequence was identified in N. meningitidis <SEQ ID 155>:
1 ATGTTGTTCC GTAAAACGAC CGCCGCCGTT TTGGCGCATA CCTTGATGCT
51 GAACGGCTGT ACGTTGATGT TGTGGGGAAT GAACAACCCG GTCAGCGAAA
101 CAATCACCCG NAAACACGTT GNCAAAGACC AAATCCGNGN CTTCGGTGTG
151 GTTGCCGAAG ACAATGCCCA ATTGGAAAAG GGCAGCCTGG TGATGATGGG
201 CGGAAAATAC TGGTTCGTCG TCAATCCCGA AGATTCGGCG AA.NTGACGG
251 GNATTTTGAN GGCAGGGCTG GACAAACCCT TCCAAATAGT TNAGGATACC
301 CCGAGCTATG C.TGCCACCA AGCCCTGCCG GTCAAACTCG GATCGNCTGG
351 CAGCCAGAAT...
This corresponds to the amino acid sequence <SEQ ID 156; ORF28>:
1 MLFRKTTAAV LAHTLMLNGC TLMLWGMNNP VSETITRKHV XKDQIRXFGV
51 VAEDNAQLEK GSLVMMGGKY WFVVNPEDSA XXTGILXAGL DKPFQIVXDT
101 PSYXCHQALP VKLGSXGSQN...
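In the partial sequence above, some codons containing an undetermined base still give a definite residue (CGN is R), while others are reported as X (GNC). The sketch below reproduces that behaviour under the assumption that every non-ACGT symbol may stand for any base; this is an assumption for illustration, and the function name `translate_ambiguous` is hypothetical. The excerpt is nucleotides 109-141 of SEQ ID 155 (codons 37-47).

```python
from itertools import product

# Standard genetic code in the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate_ambiguous(dna: str) -> str:
    """Translate frame 1, resolving an unknown base only when the residue is unambiguous."""
    dna = dna.upper()
    protein = []
    for i in range(0, len(dna) - len(dna) % 3, 3):
        codon = dna[i:i + 3]
        # Any symbol other than A, C, G or T is expanded to all four bases.
        options = [base if base in BASES else BASES for base in codon]
        residues = {CODON_TABLE["".join(c)] for c in product(*options)}
        # One possible residue: report it. Several: report X.
        protein.append(residues.pop() if len(residues) == 1 else "X")
    return "".join(protein)


excerpt = "CGNAAACACGTTGNCAAAGACCAAATCCGNGNC"  # nt 109-141 of SEQ ID 155
print(translate_ambiguous(excerpt))  # RKHVXKDQIRX
```

The result matches residues 37-47 of SEQ ID 156 as printed.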
Further work revealed the complete nucleotide sequence <SEQ ID 157>:
1 ATGTTGTTCC GTAAAACGAC CGCCGCCGTT TTGGCGGCAA CCTTGATGCT
51 GAACGGCTGT ACGTTGATGT TGTGGGGAAT GAACAACCCG GTCAGCGAAA
101 CAATCACCCG CAAACACGTT GACAAAGACC AAATCCGCGC CTTCGGTGTG
151 GTTGCCGAAG ACAATGCCCA ATTGGAAAAG GGCAGCCTGG TGATGATGGG
201 CGGAAAATAC TGGTTCGTCG TCAATCCCGA AGATTCGGCG AAGCTGACGG
251 GCATTTTGAA GGCAGGGCTG GACAAACCCT TCCAAATAGT TGAGGATACC
301 CCGAGCTATG CTCGCCACCA AGCCCTGCCG GTCAAACTCG AATCGCCTGG
351 CAGCCAGAAT TTCAGTACCG AAGGCCTTTG CCTGCGCTAC GATACCGACA
401 AGCCTGCCGA CATCGCCAAG CTGAAACAGC TCGGGTTTGA AGCGGTCAAA
451 CTCGACAATC GGACCATTTA CACGCGCTGC GTATCCGCCA AAGGCAAATA
501 CTACGCCACA CCGCAAAAAC TGAACGCCGA TTACCATTTT GAGCAAAGTG
551 TGCCTGCCGA TATTTATTAC ACGGTTACTG AAGAACATAC CGACAAATCC
601 AAGCTGTTTG CAAATATCTT ATATACGCCC CCCTTTTTGA TACTGGATGC
651 GGCGGGCGCG GTACTGGCCT TGCCTGCGGC GGCTCTGGGT GCGGTCGTGG
701 ATGCCGCCCG CAAATGA
This corresponds to the amino acid sequence <SEQ ID 158; ORF28-1>:
1 MLFRKTTAAV LAATLMLNGC TLMLWGMNNP VSETITRKHV DKDQIRAFGV
51 VAEDNAQLEK GSLVMMGGKY WFVVNPEDSA KLTGILKAGL DKPFQIVEDT
101 PSYARHQALP VKLESPGSQN FSTEGLCLRY DTDKPADIAK LKQLGFEAVK
151 LDNRTIYTRC VSAKGKYYAT PQKLNADYHF EQSVPADIYY TVTEEHTDKS
201 KLFANILYTP PFLILDAAGA VLALPAAALG AVVDAARK*
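Computer analysis of this amino acid sequence gave the following results:
Homology with a predicted ORF from N. meningitidis (strain A)
ORF28 shows 79.2% identity over a 120 aa overlap with an ORF (ORF28a) from strain A of N. meningitidis: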
10 20 30 40 50 60
orf28.pep MLFRKTTAAVLAHTLMLNGCTLMLWGMNNPVSETITRKHVXKDQIRXFGVVAEDNAQLEK
|||||||||||| ||||||||:|:||||:| ||| :|||| ||||| |||||||||||||
orf28a MLFRKTTAAVLAATLMLNGCTVMMWGMNSPFSETTARKHVDKDQIRAFGVVAEDNAQLEK
10 20 30 40 50 60
70 80 90 100 110 120
orf28.pep GSLVMMGGKYWFVVNPEDSAXXTGILXAGLDKPFQIVXDTPSYXCHQALPVKLGSXGSQN
|||||||||||||||||||| |||| ||||| ||:| :| : :||||||| | :|||
orf28a GSLVMMGGKYWFVVNPEDSAKLTGILKAGLDKQFQMVEPNPRFA-YQALPVKLESPASQN
70 80 90 100 110
orf28a FSTEGLCLRYDTDRPADIAKLKQLEFEAVELDNRTIYTRCVSAKGKYYATPQKLNADYHF
120 130 140 150 160 170
The complete length ORF28a nucleotide sequence <SEQ ID 159> is:
1 ATGTTGTTCC GTAAAACGAC CGCCGCCGTT TTGGCGGCAA CCTTGATGTT
51 GAACGGCTGT ACGGTAATGA TGTGGGGTAT GAACAGCCCG TTCAGCGAAA
101 CGACCGCCCG CAAACACGTT GACAAGGACC AAATCCGCGC CTTCGGTGTG
151 GTTGCCGAAG ACAATGCCCA ATTGGAAAAG GGCAGCCTGG TGATGATGGG
201 CGGGAAATAC TGGTTCGTCG TCAATCCTGA AGATTCGGCG AAGCTGACGG
251 GCATTTTGAA GGCCGGGTTG GACAAGCAGT TTCAAATGGT TGAGCCCAAC
301 CCGCGCTTTG CCTACCAAGC CCTGCCGGTC AAACTCGAAT CGCCCGCCAG
351 CCAGAATTTC AGTACCGAAG GCCTTTGCCT GCGCTACGAT ACCGACAGAC
401 CTGCCGACAT CGCCAAGCTG AAACAGCTTG AGTTTGAAGC GGTCGAACTC
451 GACAATCGGA CCATTTACAC GCGCTGCGTC TCCGCCAAAG GCAAATACTA
501 CGCCACACCG CAAAAACTGA ACGCCGATTA TCATTTTGAG CAAAGTGTGC
551 CTGCCGATAT TTATTACACG GTTACGAAAA AACATACCGA CAAATCCAAG
601 TTGTTTGAAA ATATTGCATA TACGCCCACC ACGTTGATAC TGGATGCGGT
651 GGGCGCGGTG CTGGCCTTGC CTGTCGCGGC GTTGATTGCA GCCACGAATT
701 CCTCAGACAA ATGA
This encodes a protein having amino acid sequence <SEQ ID 160>:
1 MLFRKTTAAV LAATLMLNGC TVMMWGMNSP FSETTARKHV DKDQIRAFGV
51 VAEDNAQLEK GSLVMMGGKY WFVVNPEDSA KLTGILKAGL DKQFQMVEPN
101 PRFAYQALPV KLESPASQNF STEGLCLRYD TDRPADIAKL KQLEFEAVEL
151 DNRTIYTRCV SAKGKYYATP QKLNADYHFE QSVPADIYYT VTKKHTDKSK
201 LFENIAYTPT TLILDAVGAV LALPVAALIA ATNSSDK*
ORF28a and ORF28-1 show 86.1% identity over a 238 aa overlap:
10 20 30 40 50 60
orf28a.pep MLFRKTTAAVLAATLMLNGCTVMMWGMNSPFSETTARKHVDKDQIRAFGVVAEDNAQLEK
|||||||||||||||||||||:|:||||:| ||| :||||||||||||||||||||||||
orf28-1 MLFRKTTAAVLAATLMLNGCTLMLWGMNNPVSETITRKHVDKDQIRAFGVVAEDNAQLEK
10 20 30 40 50 60
70 80 90 100 110 119
orf28a.pep GSLVMMGGKYWFVVNPEDSAKLTGILKAGLDKQFQMVEPNPRFA-YQALPVKLESPASQN
|||||||||||||||||||||||||||||||| ||:||:| :| :|||||||||||:|||
orf28-1 GSLVMMGGKYWFVVNPEDSAKLTGILKAGLDKPFQIVEDTPSYARHQALPVKLESPGSQN
70 80 90 100 110 120
120 130 140 150 160 170 179
orf28a.pep FSTEGLCLRYDTDRPADIAKLKQLEFEAVELDNRTIYTRCVSAKGKYYATPQKLNADYHF
|||||||||||||:|||||||||| ||||:||||||||||||||||||||||||||||||
orf28-1 FSTEGLCLRYDTDKPADIAKLKQLGFEAVKLDNRTIYTRCVSAKGKYYATPQKLNADYHF
130 140 150 160 170 180
180 190 200 210 220 230
orf28a.pep EQSVPADIYYTVTKKHTDKSKLFENIAYTPTTLILDAVGAVLALPVAALIAATNSSDKX
|||||||||||||::|||||||| || ||| |||||:|||||||:||| |::::: ||
orf28-1 EQSVPADIYYTVTEEHTDKSKLFANILYTPPFLILDAAGAVLALPAAALGAVVDAARKX
190 200 210 220 230
Homology with a predicted ORF from N. gonorrhoeae
ORF28 shows 84.2% identity over a 120 aa overlap with a predicted ORF (ORF28.ng) from N. gonorrhoeae:
orf28.pep MLFRKTTAAVLAHTLMLNGCTLMLWGMNNPVSETITRKHVXKDQIRXFGVVAEDNAQLEK 60
|||||||||||| ||:|||||:|| |||||||:||||||| ||||| ||||||||||||||
orf28ng MLFRKTTAAVLAATLILNGCTMMLRGMNNPVSQTITRKHVDKDQIRAFGVVAEDNAQLEK 60
orf28.pep GSLVMMGGKYWFVVNPEDSAXXTGILXAGLDKPFQIVXDTPSYXCHQALPVKLGSXGSQN 120
||||||||||||:|||||||| ||:|||||||||||| ||||| |||||||: : ||||
orf28ng GSLVMMGGKYWFAVNPEDSAKLTGLLKAGLDKPFQIVEDTPSYARHQALPVKFEAPGSQN 120
The complete length ORF28ng nucleotide sequence <SEQ ID 161> is:
1 ATGTTGTTCC GTAAAACGAC CGCCGCCGTT TTGGCGGCAA CCTTGATACT
51 GAACGGCTGT ACGATGATGT TGCGGGGGAT GAACAACCCG GTCAGCCAAA
101 CAATCACCCG CAAACACGTT GACAAAGACC AAATCCGCGC CTTCGGTGTG
151 GTTGCCGAAG ACAATGCCCA ATTGGAAAAG GGCAGCCTGG TGATGATGGG
201 CGGGAAATAC TGGTTCGCCG TCAATCCCGA AGATTCGGCG AAGCTGACGG
251 GCCTTTTGAA GGCCGGGTTG GACAAGCCCT TCCAAATAGT TGAGGATACC
301 CCGAGCTATG CCCGCCACCA AGCCCTGCCG GTCAAATTCG AAGCGCCCGG
351 CAGCCAGAAT TTCAGTACCG GAGGTCTTTG CCTGCGCTAT GATACCGGCA
401 GACCTGACGA CATCGCCAAG CTGAAACAGC TTGAGTTTAA AGCGGTCAAA
451 CTCGACAATC GGACCATTTA CACGCGCTGC GTATCCGCCA AAGGCAAATA
501 CTACGCCACG CCGCAAAAAC TGAACGCCGA TTATCATTTT GAGCAAAGTG
551 TGCCCGCCGA TATTTATTAT ACGGTTACTG AAAAACATAC CGACAAATCC
601 AAGCTGTTTG GAAATATCTT ATATACGCCC CCCTTGTTGA TATTGGATGC
651 GGCGGCCGCG GTGCTGGTCT TGCCTATGGC TCTGATTGCA GCCGCGAATT
701 CCTCAGACAA ATGA
This encodes a protein having amino acid sequence <SEQ ID 162>:
1 MLFRKTTAAV LAATLILNGC TMMLRGMNNP VSQTITRKHV DKDQIRAFGV
51 VAEDNAQLEK GSLVMMGGKY WFAVNPEDSA KLTGLLKAGL DKPFQIVEDT
101 PSYARHQALP VKFEAPGSQN FSTGGLCLRY DTGRPDDIAK LKQLEFKAVK
151 LDNRTIYTRC VSAKGKYYAT PQKLNADYHF EQSVPADIYY TVTEKHTDKS
201 KLFGNILYTP PLLILDAAAA VLVLPMALIA AANSSDK*
ORF28ng and ORF28-1 show 90.0% identity over a 231 aa overlap:
10 20 30 40 50 60
orf28-1.pep MLFRKTTAAVLAATLMLNGCTLMLWGMNNPVSETITRKHVDKDQIRAFGVVAEDNAQLEK
||||||||||||||:|||||:|| |||||||:||||||||||||||||||||||||||||
orf28ng MLFRKTTAAVLAATLILNGCTMMLRGMNNPVSQTITRKHVDKDQIRAFGVVAEDNAQLEK
10 20 30 40 50 60
70 80 90 100 110 120
orf28-1.pep GSLVMMGGKYWFVVNPEDSAKLTGILKAGLDKPFQIVEDTPSYARHQALPVKLESPGSQN
||||||||||||:|||||||||||:|||||||||||||||||||||||||||:|:|||||
orf28ng GSLVMMGGKYWFAVNPEDSAKLTGLLKAGLDKPFQIVEDTPSYARHQALPVKFEAPGSQN
70 80 90 100 110 120
130 140 150 160 170 180
orf28-1.pep FSTEGLCLRYDTDKPADIAKLKQLGFEAVKLDNRTIYTRCVSAKGKYYATPQKLNADYHF
||| |||||||| :| |||||||| |:|||||||||||||||||||||||||||||||||
orf28ng FSTGGLCLRYDTGRPDDIAKLKQLEFKAVKLDNRTIYTRCVSAKGKYYATPQKLNADYHF
130 140 150 160 170 180
190 200 210 220 230 239
orf28-1.pep EQSVPADIYYTVTEEHTDKSKLFANILYTPPFLILDAAGAVLALPAAALGAVVDAARKX
||||||||||||||:||||||||:|||||||:||||||:|||||| | ::|:
orf28ng EQSVPADIYYTVTEKHTDKSKLFGNILYTPPLLILDAAAAVLVLPMALIAAANSSDKX
190 200 210 220 230
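Based on this analysis (including the presence of a putative transmembrane domain in the gonococcal protein), it is predicted that these proteins from N. meningitidis and N. gonorrhoeae, and their epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies.
ORF28-1 (24 kDa) was cloned into pET and pGex vectors and expressed in E. coli, as described above. The products of protein expression and purification were analyzed by SDS-PAGE. Figure 6A shows the results of affinity purification of the GST-fusion protein, and Figure 6B shows the results of expression of the His-fusion in E. coli. Purified GST-fusion protein was used to immunise mice, and the mouse sera gave a positive result in ELISA. These results confirm that ORF28-1 is a surface-exposed protein and that it may be a useful immunogen.
Example 20
The following partial DNA sequence was identified in N. meningitidis <SEQ ID 163>: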
1 ..GTCAGTCCTG TACTGCCTAT TACACACGAA CGGACAGGGT TTGAAGGTGT
51 TATCGGTTAT GAAACCCATT TTTCAGGGCA CGGACATGAA GTACACAGTC
101 CGTTCGATCA TCATGATTCA AAAAGCACTT CTGATTTCAG CGGCGGTGTA
151 GACGGCGGTT TTACTGTTTA CCAACTTCAT CGAACATGGT CGGAAATCCA
201 TCCGGAGGAT GAATATGACG GGCCGCAAGC AGCG.ATTAT CCGCCCCCCG
251 GAGGAGCAAG GGATATATAC AGCTATTATG TCAAAGGAAC TTCAACAAAA
301 ACAAAGACTA GTATTGTCCC TCAAGCCCCA TTTTCAGACC GTTGGCTAGA
351 AGAAAATGCC GGTGCCGCCT CTGGT..
This corresponds to the amino acid sequence <SEQ ID 164; ORF29>:
1 ..VSPVLPITHE RTGFEGVIGY ETHFSGHGHE VHSPFDHHDS KSTSDFSGGV
51 DGGFTVYQLH RTWSEIHPED EYDGPQAAXY PPPGGARDIY SYYVKGTSTK
101 TKTSIVPQAP FSDRWLEENA GAASG..
Further work revealed the complete nucleotide sequence <SEQ ID 165>:
1 ATGAATTTGC CTATTCAAAA ATTCATGATG CTGTTTGCAG CAGCAATATC
51 GTTGCTGCAA ATCCCCATTA GTCATGCGAA CGGTTTGGAT GCCCGTTTGC
101 GCGATGATAT GCAGGCAAAA CACTACGAAC CGGGTGGTAA ATACCATCTG
151 TTTGGTAATG CTCGCGGCAG TGTTAAAAAG CGGGTTTACG CCGTCCAGAC
201 ATTTGATGCA ACTGCGGTCA GTCCTGTACT GCCTATTACA CACGAACGGA
251 CAGGGTTTGA AGGTGTTATC GGTTATGAAA CCCATTTTTC AGGGCACGGA
301 CATGAAGTAC ACAGTCCGTT CGATCATCAT GATTCAAAAA GCACTTCTGA
351 TTTCAGCGGC GGTGTAGACG GCGGTTTTAC TGTTTACCAA CTTCATCGAA
401 CAGGGTCGGA AATCCATCCG GAGGATGGAT ATGACGGGCC GCAAGGCAGC
451 GATTATCCGC CCCCCGGAGG AGCAAGGGAT ATATACAGCT ATTATGTCAA
501 AGGAACTTCA ACAAAAACAA AGACTAATAT TGTCCCTCAA GCCCCATTTT
551 CAGACCGTTG GCTAAAAGAA AATGCCGGTG CCGCCTCTGG TTTTTTCAGC
601 CGTGCGGATG AAGCAGGAAA ACTGATATGG GAAAGCGACC CCAATAAAAA
651 TTGGTGGGCT AACCGTATGG ATGATGTTCG CGGCATCGTC CAAGGTGCGG
701 TTAATCCTTT TTTAATGGGT TTTCAAGGAG TAGGGATTGG GGCAATTACA
751 GACAGTGCAG TAAGCCCGGT CACAGATACA GCCGCGCAGC AGACTCTACA
801 AGGTATTAAT GATTTAGGAA AATTAAGTCC GGAAGCACAA CTTGCTGCCG
851 CGAGCCTATT ACAGGACAGT GCTTTTGCGG TAAAAGACGG TATCAACTCT
901 GCCAAACAAT GGGCTGATGC CCATCCAAAT ATAACAGCTA CTGCCCAAAC
951 TGCCCTTTCC GCAGCAGAGG CCGCAGGTAC GGTTTGGAGA GGTAAAAAAG
1001 TAGAACTTAA CCCGACTAAA TGGGATTGGG TTAAAAATAC CGGTTATAAA
1051 AAACCTGCTG CCCGCCATAT GCAGACTTTA GATGGGGAGA TGGCAGGTGG
1101 GAATAAACCT ATTAAATCTT TACCAAACAG TGCCGCTGAA AAAAGAAAAC
1151 AAAATTTTGA GAAGTTTAAT AGTAACTGGA GTTCAGCAAG TTTTGATTCA
1201 GTGCACAAAA CACTAACTCC CAATGCACCT GGTATTTTAA GTCCTGATAA
1251 AGTTAAAACT CGATACACTA GTTTAGATGG AAAAATTACA ATTATAAAAG
1301 ATAACGAAAA CAACTATTTT AGAATCCATG ATAATTCACG AAAACAGTAT
1351 CTTGATTCAA ATGGTAATGC TGTGAAAACC GGTAATTTAC AAGGTAAGCA
1401 AGCAAAAGAT TATTTACAAC AACAAACTCA TATCAGGAAC TTAGACAAAT
1451 GA
This corresponds to the amino acid sequence <SEQ ID 166; ORF29-1>:
1 MNLPIQKFMM LFAAAISLLQ IPISHANGLD ARLRDDMQAK HYEPGGKYHL
51 FGNARGSVKK RVYAVQTFDA TAVSPVLPIT HERTGFEGVI GYETHFSGHG
101 HEVHSPFDHH DSKSTSDFSG GVDGGFTVYQ LHRTGSEIHP EDGYDGPQGS
151 DYPPPGGARD IYSYYVKGTS TKTKTNIVPQ APFSDRWLKE NAGAASGFFS
201 RADEAGKLIW ESDPNKNWWA NRMDDVRGIV QGAVNPFLMG FQGVGIGAIT
251 DSAVSPVTDT AAQQTLQGIN DLGKLSPEAQ LAAASLLQDS AFAVKDGINS
301 AKQWADAHPN ITATAQTALS AAEAAGTVWR GKKVELNPTK WDWVKNTGYK
351 KPAARHMQTL DGEMAGGNKP IKSLPNSAAE KRKQNFEKFN SNWSSASFDS
401 VHKTLTPNAP GILSPDKVKT RYTSLDGKIT IIKDNENNYF RIHDNSRKQY
451 LDSNGNAVKT GNLQGKQAKD YLQQQTHIRN LDK*
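Computer analysis of this amino acid sequence gave the following results:
Homology with a predicted ORF from N. meningitidis (strain A)
ORF29 shows 88.0% identity over a 125 aa overlap with an ORF (ORF29a) from strain A of N. meningitidis: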
10 20 30
orf29.pep VSPVLPITHERTGFEGVIGYETHFSGHGHE
|:|:||||||||||||:|||||||||||||
orf29a EPGGKYHLFGNARGSVKNRVYAVQTFDATAVGPILPITHERTGFEGIIGYETHFSGHGHE
50 60 70 80 90 100
40 50 60 70 80 90
orf29.pep VHSPFDHHDSKSTSDFSGGVDGGFTVYQLHRTWSEIHPEDEYDGPQAAXYPPPGGARDIY
||||||:||||||||||||||||||||||||| ||||||| |||||:: |||||||||||
orf29a VHSPFDNHDSKSTSDFSGGVDGGFTVYQLHRTGSEIHPEDGYDGPQGSDYPPPGGARDIY
110 120 130 140 150 160
100 110 120
orf29.pep SYYVKGTSTKTKTSIVPQAPFSDRWLEENAGAASG
||||||||||::|||:||||||||:|||||||||
orf29a XXYVKGTSTKTKSNIVPRAPFSDRWLKENAGAASGFFSRADEAGKLIWESDPNKNWWANR
170 180 190 200 210 220
orf29a MDDIRGIVQGAVNPFLMGFQGVGIGAITDSAVSPVTDTAAQQTLQGXNHLGXLSPEAQLA
230 240 250 260 270 280
The complete length ORF29a nucleotide sequence <SEQ ID 167> is:
1 ATGAATTNGC CTATTCAAAA ATTCATGATG CTGTTTGCAG CAGCAATATC
51 GTNGCTGCAA ATCCCNATTA GTCATGCGAA CGGTTTGGAT GCCCGTTTGC
101 GCGATGATAT GCAGGCAAAA CACTACGAAC CGGGTGGTAA ATACCATCTG
151 TTTGGTAATG CTCGCGGCAG TGTTAAAAAT CGGGTTTACG CCGTCCAAAC
201 ATTTGATGCA ACTGCGGTCG GCCCCATACT GCCTATTACA CACGAACGGA
251 CAGGATTTGA AGGCATTATC GGTTATGAAA CCCATTTTTC AGGACATGGA
301 CATGAAGTAC ACAGTCCGTT CGATAATCAT GATTCAAAAA GCACTTCTGA
351 TTTCAGCGGC GGCGTAGACG GTGGTTTTAC CGTTTACCAA CTTCATCGGA
401 CAGGGTCGGA AATCCATCCG GAGGATGGAT ATGACGGGCC GCAAGGCAGC
451 GATTATCCGC CCCCCGGAGG AGCAAGGGAT ATATACANNT ANTATGTCAA
501 AGGAACTTCA ACAAAAACAA AGAGTAATAT TGTTCCCCGA GCCCCATTTT
551 CAGACCGCTG GCTAAAAGAA AATGCCGGTG CCGCCTCTGG TTTTTTCAGC
601 CGTGCTGATG AAGCAGGAAA ACTGATATGG GAAAGCGACC CCAATAAAAA
651 TTGGTGGGCT AACCGTATGG ATGATATTCG CGGCATCGTC CAAGGTGCGG
701 TTAATCCTTT TTTAATGGGT TTTCAAGGAG TAGGGATTGG GGCAATTACA
751 GACAGTGCAG TAAGCCCGGT CACAGATACA GCCGCGCAGC AGACTCTACA
801 AGGTATNAAT CATTTAGGAA ANTTAAGTCC CGAAGCACAA CTTGCGGCTG
851 CAACCGCATT ACAAGACAGT GCTTTTGCGG TAAAAGACGG TATCAATTCC
901 GCCAGACAAT GGGCTGATGC CCATCCGAAT ATAACTGCAA CAGCCCAAAC
951 TGCCCTTGCC GTAGCAGANG CCGCAACTAC GGTTTGGGGC GGTAAAAAAG
1001 TAGAACTTAA CCCGACCAAA TGGGATTGGG TTAAAAATAC NGGCTATAAN
1051 ACACCTGCTG TTCGCACCAT GCATACTTTG GATGGGGAAA TGGCCGGTGG
1101 GAATAGACCG CCTAAATCTA TAACGTCCAA CAGCAAAGCA GATGCTTCCA
1151 CACAACCGTC TTTACAAGCG CAACTAATTG GAGAACAAAT TANNNNNGGG
1201 CATGCTTATA ACAAGCATGT CATAAGACAA CAAGAATTTA CGGATTTAAA
1251 TATCAATTCA CCAGCAGATT TTGCTCGGCA TATTGAAAAT ATTGTTAGCC
1301 ATCCANCAAA TATGAAAGAG TTACCTCGCG GTAGAACTGC GTATTGGGAT
1351 NATAAAACAG GGACNATAGT TATCCGAGAT AAAAATTCTG ACGATGGAGG
1401 TACAGCATTT AGACCAACAT CAGGTAAAAA ATATTATGAT GATTTATAG
This encodes a protein having the amino acid sequence <SEQ ID 168>:
1 MNXPIQKFMM LFAAAISXLQ IPISHANGLD ARLRDDMQAK HYEPGGKYHL
51 FGNARGSVKN RVYAVQTFDA TAVGPILPIT HERTGFEGII GYETHFSGHG
101 HEVHSPFDNH DSKSTSDFSG GVDGGFTVYQ LHRTGSEIHP EDGYDGPQGS
151 DYPPPGGARD IYXXYVKGTS TKTKSNIVPR APFSDRWLKE NAGAASGFFS
201 RADEAGKLIW ESDPNKNWWA NRMDDIRGIV QGAVNPFLMG FQGVGIGAIT
251 DSAVSPVTDT AAQQTLQGXN HLGXLSPEAQ LAAATALQDS AFAVKDGINS
301 ARQWADAHPN ITATAQTALA VAXAATTVWG GKKVELNPTK WDWVKNTGYX
351 TPAVRTMHTL DGEMAGGNRP PKSITSNSKA DASTQPSLQA QLIGEQIXXG
401 HAYNKHVIRQ QEFTDLNINS PADFARHIEN IVSHPXNMKE LPRGRTAYWD
451 XKTGTIVIRD KNSDDGGTAF RPTSGKKYYD DL*
ORF29a and ORF29-1 show 90.1% identity over a 385 aa overlap:
10 20 30 40 50 60
orf29a.pep MNXPIQKFMMLFAAAISXLQIPISHANGLDARLRDDMQAKHYEPGGKYHLFGNARGSVKN
|| |||||||||||||| |||||||||||||||||||||||||||||||||||||||||:
orf29-1 MNLPIQKFMMLFAAAISLLQIPISHANGLDARLRDDMQAKHYEPGGKYHLFGNARGSVKK
10 20 30 40 50 60
70 80 90 100 110 120
orf29a.pep RVYAVQTFDATAVGPILPITHERTGFEGIIGYETHFSGHGHEVHSPFDNHDSKSTSDFSG
|||||||||||||:|:||||||||||||:|||||||||||||||||||:|||||||||||
orf29-1 RVYAVQTFDATAVSPVLPITHERTGFEGVIGYETHFSGHGHEVHSPFDHHDSKSTSDFSG
70 80 90 100 110 120
130 140 150 160 170 180
orf29a.pep GVDGGFTVYQLHRTGSEIHPEDGYDGPQGSDYPPPGGARDIYXXYVKGTSTKTKSNIVPR
|||||||||||||||||||||||||||||||||||||||||| ||||||||||:||||:
orf29-1 GVDGGFTVYQLHRTGSEIHPEDGYDGPQGSDYPPPGGARDIYSYYVKGTSTKTKTNIVPQ
130 140 150 160 170 180
190 200 210 220 230 240
orf29a.pep APFSDRWLKENAGAASGFFSRADEAGKLIWESDPNKNWWANRMDDIRGIVQGAVNPFLMG
|||||||||||||||||||||||||||||||||||||||||||||:||||||||||||||
orf29-1 APFSDRWLKENAGAASGFFSRADEAGKLIWESDPNKNWWANRMDDVRGIVQGAVNPFLMG
190 200 210 220 230 240
250 260 270 280 290 300
orf29a.pep FQGVGIGAITDSAVSPVTDTAAQQTLQGXNHLGXLSPEAQLAAATALQDSAFAVKDGINS
|||||||||||||||||||||||||||| | ||||||||||||:||||||||||||||||
orf29-1 FQGVGIGAITDSAVSPVTDTAAQQTLQGINDLGKLSPEAQLAAASLLQDSAFAVKDGINS
250 260 270 280 290 300
310 320 330 340 350 360
orf29a.pep ARQWADAHPNITATAQTALAVAXAATTVWGGKKVELNPTKWDWVKNTGYXTPAVRTMHTL
|:|||||||||||||||||::| || ||| ||||||||||||||||||| ||:| |:||
orf29-1 AKQWADAHPNITATAQTALSAAEAAGTVWRGKKVELNPTKWDWVKNTGYKKPAARHMQTL
310 320 330 340 350 360
370 380 390 400 410 420
orf29a.pep DGEMAGGNRPPKSITSNSKADASTQPSLQAQLIGEQIXXGHAYNKHVIRQQEFTDLNINS
||||||||:| ||: || |: |
orf29-1 DGEMAGGNKPIKSLP-NSAAEKRKQNFEKFNSNWSSASFDSVHKTLTPNAPGILSPDKVK
370 380 390 400 410
Homology with a predicted ORF from N. gonorrhoeae
ORF29 shows 88.8% identity over a 125 aa overlap with a predicted ORF (ORF29.ng) from N. gonorrhoeae:
orf29.pep VSPVLPITHERTGFEGVIGYETHFSGHGHE 30
|:|:||||||||||||||||||||||||||
orf29ng EPGGKYHLFGNARGSVKNRVCAVQTFDATAVGPILPITHERTGFEGVIGYETHFSGHGHE 102
orf29.pep VHSPFDHHDSKSTSDFSGGVDGGFTVYQLHRTWSEIHPEDEYDGPQAAXYPPPGGARDIY 90
||||||:||||||||||||||||||||||||| ||||||| |||||:: |||||||||||
orf29ng VHSPFDNHDSKSTSDFSGGVDGGFTVYQLHRTGSEIHPEDGYDGPQGGGYPPPGGARDIY 162
orf29.pep SYYVKGTSTKTKTSIVPQAPFSDRWLEENAGAASG 125
||::||||||||:|||||||||||:|||||||
orf29ng SYHIKGTSTKTKINTVPQAPFSDRWLKENAGAASGFLSRADEAGKLIWENDPDKNWRANR 222
The predicted complete length ORF29ng nucleotide sequence <SEQ ID 169> encodes a protein having the amino acid sequence <SEQ ID 170>:
1 MNLPIQKFMM LFAAAISLLQ IPISHANGLD ARLRDDMQAK HYEPGGKYHL
51 FGNARGSVKN RVCAVQTFDA TAVGPILPIT HERTGFEGVI GYETHFSGHG
101 HEVHSPFDNH DSKSTSDFSG GVDGGFTVYQ LHRTGSEIHP EDGYDGPQGG
151 GYPPPGGARD IYSYHIKGTS TKTKINTVPQ APFSDRWLKE NAGAASGFLS
201 RADEAGKLIW ENDPDKNWRA NRMDDIRGIV QGAVNPFLTG FQGLGVGAIT
251 DSAVSPVTYA AARKTLQGIH NLGNLSPEAQ LAAATALQDS AFAVKDSINS
301 ARQWADAHPN ITATAQTALA VTEAATTVWG GKKVELNPAK WDWVKNTGYK
351 KPAARHMQTV DGEMAGGNKP LESKNTVTTN NFFENTGYTE KVLRQASNGD
401 YHGFPQSVDA FSENGTVIQI VGGDNIVRHK LYIPGSYKGK DGNFEYIREA
451 DGKINHRLFV PNQQLPEK*
In a second experiment, the following DNA sequence was identified <SEQ ID 171>:
1 atgAATTTGC CTATTCAAAA ATTCATGATG ctgttggcAg cggcaatatc
51 gatgctGCat ATCCCCATTA GTCATGCGAA CGGTTTGGAT GCCCGTTTGC
101 GCGATGATAT GCAGGCAAAA CACTACGAAC CGGGTGGCAA ATACCATCTG
151 TTTGGTAATG CTCGCGGCAG TGTTAAAAAT CGGGTTTGCG CCGTCCAAAC
201 ATTTGATGCA ACTGCGGTCG GCCCCATACT GCCTATTACA CACGAACGGA
251 CAGGATTTGA AGGTGTTATC GGCTATGAAA CCCATTTTTC AGGACACGGA
301 CACGAAGTAC ACAGTCCGTT CGATAATCAT GATTCAAAAA GCACTTCTGA
351 TTTCAGCGGC GGCGTAGACG GCGGTTTTAC CGTTTACCAA CTTCATCGGA
401 CAGGGTCGGA AATACATCCC GCAGACGGAT ATGACGGGCC TCAAGGCGGC
451 GGTTATCCGG AACCACAAGG GGCAAGGGAT ATATACAGCT ACCATATCAA
501 AGGAACTTCA ACCAAAACAA AGATAAACAC TGTTCCGCAA GCCCCTTTTT
551 CAGACCGCTG GCTAAAAGAA AATGCCGGTG CCGCTTCCGG TTTTCTCAGC
601 CGTGCGGATG AAGCAGGAAA ACTGATATGG GAAAACGACC CCGATAAAAA
651 TTGGCGGGCT AACCGTATGG ATGATATTCG CGGCATCGTC CAAGGTGCGG
701 TTAATCCTTT TTTAACGGGT TTTCAAGGGG TAGGGATTGG GGCAATTACA
751 GACAGTGCGG TAAGCCCGGT CACAGATACA GCCGCTCAGC AGACTCTACA
801 AGGTATTAAT GATTTAGGAA ATTTAAGTCC GGAAGCACAA CTTGCCGCCG
851 CGAGCCTATT ACAGGACAGT GCCTTTGCGG TAAAAGACGG CATCAATTCC
901 GCCAGACAAT GGGCTGATGC CCATCCGAAT ATAACAGCAA CAGCCCAAAC
951 TGCCCTTGCC GTAGCAGAGG CCGCAGGTAC GGTTTGGCGC GGTAAAAAAG
1001 TAGAACTTAA CCCGACCAAA TGGGATTGGG TTAAAAATAC CGGCTATAAA
1051 AAACCTGCTG CCCGCCATAT GCAGACTGTA GATGGGGAGA TGGCAGGGGG
1101 GAATAGACCG CCTAAATCTA TAACGTCGGA AGGAAAAGCT AATGCTGCAA
1151 CCTATCCTAA GTTGGTTAAT CAGCTAAATG AGCAAAACTT AAATAACATT
1201 GCGGCTCAAG ATCCAAGATT GAGTCTAGCT ATTCATGAGG GTAAAAAAAA
1251 TTTTCCAATA GGAACTGCAA CTTATGAAGA GGCAGATAGA CTAGGTAAAA
1301 TTTGGGTTGG TGAGGGTGCA AGACAAACTA GTGGAGGCGG ATGGTTAAGT
1351 AGAGATGGCA CTCGACAATA TCGGCCACCA ACAGAAAAAA AATCACAATT
1401 TGCAACTACA GGTATTCAAG CAAATTTTGA AACTTATACT ATTGATTCAA
1451 ATGAAAAAAG AAATAAAATT AAAAATGGAC ATTTAAATAT TAGGTAA
This encodes a protein having the amino acid sequence <SEQ ID 172; ORF29ng-1>:
1 MNLPIQKFMM LLAAAISMLH IPISHANGLD ARLRDDMQAK HYEPGGKYHL
51 FGNARGSVKN RVCAVQTFDA TAVGPILPIT HERTGFEGVI GYETHFSGHG
101 HEVHSPFDNH DSKSTSDFSG GVDGGFTVYQ LHRTGSEIHP ADGYDGPQGG
151 GYPEPQGARD IYSYHIKGTS TKTKINTVPQ APFSDRWLKE NAGAASGFLS
201 RADEAGKLIW ENDPDKNWRA NRMDDIRGIV QGAVNPFLTG FQGVGIGAIT
251 DSAVSPVTDT AAQQTLQGIN DLGNLSPEAQ LAAASLLQDS AFAVKDGINS
301 ARQWADAHPN ITATAQTALA VAEAAGTVWR GKKVELNPTK WDWVKNTGYK
351 KPAARHMQTV DGEMAGGNRP PKSITSEGKA NAATYPKLVN QLNEQNLNNI
401 AAQDPRLSLA IHEGKKNFPI GTATYEEADR LGKIWVGEGA RQTSGGGWLS
451 RDGTRQYRPP TEKKSQFATT GIQANFETYT IDSNEKRNKI KNGHLNIR*
ORF29ng-1 and ORF29-1 show 86.0% identity over a 401 aa overlap:
10 20 30 40 50 60
orf29ng-1.pep MNLPIQKFMMLLAAAISMLHIPISHANGLDARLRDDMQAKHYEPGGKYHLFGNARGSVKN
|||||||||||:|||||:|:|||||||||||||||||||||||||||||||||||||||:
orf29-1 MNLPIQKFMMLFAAAISLLQIPISHANGLDARLRDDMQAKHYEPGGKYHLFGNARGSVKK
10 20 30 40 50 60
70 80 90 100 110 120
orf29ng-1.pep RVCAVQTFDATAVGPILPITHERTGFEGVIGYETHFSGHGHEVHSPFDNHDSKSTSDFSG
|| ||||||||||:|:||||||||||||||||||||||||||||||||:|||||||||||
orf29-1 RVYAVQTFDATAVSPVLPITHERTGFEGVIGYETHFSGHGHEVHSPFDHHDSKSTSDFSG
70 80 90 100 110 120
130 140 150 160 170 180
orf29ng-1.pep GVDGGFTVYQLHRTGSEIHPADGYDGPQGGGYPEPQGARDIYSYHIKGTSTKTKINTVPQ
|||||||||||||||||||| ||||||||: || | ||||||||::|||||||| | |||
orf29-1 GVDGGFTVYQLHRTGSEIHPEDGYDGPQGSDYPPPGGARDIYSYYVKGTSTKTKTNIVPQ
130 140 150 160 170 180
190 200 210 220 230 240
orf29ng-1.pep APFSDRWLKENAGAASGFLSRADEAGKLIWENDPDKNWRANRMDDIRGIVQGAVNPFLTG
||||||||||||||||||:||||||||||||:||:||| ||||||:|||||||||||| |
orf29-1 APFSDRWLKENAGAASGFFSRADEAGKLIWESDPNKNWWANRMDDVRGIVQGAVNPFLMG
190 200 210 220 230 240
250 260 270 280 290 300
orf29ng-1.pep FQGVGIGAITDSAVSPVTDTAAQQTLQGINDLGNLSPEAQLAAASLLQDSAFAVKDGINS
|||||||||||||||||||||||||||||||||:||||||||||||||||||||||||||
orf29-1 FQGVGIGAITDSAVSPVTDTAAQQTLQGINDLGKLSPEAQLAAASLLQDSAFAVKDGINS
250 260 270 280 290 300
310 320 330 340 350 360
orf29ng-1.pep ARQWADAHPNITATAQTALAVAEAAGTVWRGKKVELNPTKWDWVKNTGYKKPAARHMQTV
|:|||||||||||||||||::||||||||||||||||||||||||||||||||||||||:
orf29-1 AKQWADAHPNITATAQTALSAAEAAGTVWRGKKVELNPTKWDWVKNTGYKKPAARHMQTL
310 320 330 340 350 360
370 380 390 400 410 419
orf29ng-1.pep DGEMAGGNRPPKSI-TSEGKANAATYPKLVNQLNEQNLNNIAAQDPRLSLAIHEGKKNFP
||||||||:| ||: :| :: :: |: :: : :::::
orf29-1 DGEMAGGNKPIKSLPNSAAEKRKQNFEKFNSNWSSASFDSVHKTLTPNAPGILSPDKVKT
370 380 390 400 410 420
420 430 440 450 460 470 479
orf29ng-1.pep IGTATYEEADRLGKIWVGEGARQTSGGGWLSRDGTRQYRPPTEKKSQFATTGIQANFETY
orf29-1 RYTSLDGKITIIKDNENNYFRIHDNSRKQYLDSNGNAVKTGNLQGKQAKDYLQQQTHIRN
430 440 450 460 470 480
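Based on this analysis, including the presence of a putative leader sequence in the gonococcal protein, it is predicted that these proteins from N. meningitidis and N. gonorrhoeae, and their epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies.
Example 21
The following partial DNA sequence was identified in N. meningitidis <SEQ ID 173>: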
1 ATGAAAAAAC AAATCACCGC AGCCGTAATG ATGCTGTCTA TGATTGCCCC
51 CGCAATGGCA AACGGCTTGG ACAATCAGGC ATTTGAAGAC CAAATGTTCC
101 ACACGCGGGC AGATGCACCG ATGCAG...
This corresponds to the amino acid sequence <SEQ ID 174; ORF30>:
1 MKKQITAAVM MLSMIAPAMA NGLDNQAFED QMFHTRADAP MQ..
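The correspondence between the partial DNA sequence <SEQ ID 173> and the amino acid sequence <SEQ ID 174> can be checked by conceptual translation. The following is a minimal Python sketch, assuming reading frame 1 and the standard codon assignments (the bacterial table uses the same codon-to-amino-acid assignments). The function name `translate` and the convention of writing `X` for any codon that contains an ambiguous base are illustrative assumptions, not part of the original disclosure.

```python
from itertools import product

# Standard codon assignments, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna: str) -> str:
    """Translate DNA in frame 1; position numbers, spaces and dots are ignored."""
    seq = "".join(ch for ch in dna.upper() if ch.isalpha())
    usable = len(seq) - len(seq) % 3  # drop a trailing incomplete codon
    codons = (seq[i:i + 3] for i in range(0, usable, 3))
    # A codon with an ambiguous base (for example N) is reported as X.
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


# Partial meningococcal sequence <SEQ ID 173>, copied from the listing above.
SEQ_ID_173 = """
1 ATGAAAAAAC AAATCACCGC AGCCGTAATG ATGCTGTCTA TGATTGCCCC
51 CGCAATGGCA AACGGCTTGG ACAATCAGGC ATTTGAAGAC CAAATGTTCC
101 ACACGCGGGC AGATGCACCG ATGCAG...
"""

print(translate(SEQ_ID_173))
# MKKQITAAVMMLSMIAPAMANGLDNQAFEDQMFHTRADAPMQ
```

The same function can be applied to the strain A listings, where an N in the nucleotide sequence yields an X in the translated protein.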
Further work revealed the complete nucleotide sequence <SEQ ID 175>:
1 ATGAAAAAAC AAATCACCGC AGCCGTAATG ATGCTGTCTA TGATTGCCCC
51 CGCAATGGCA AACGGCTTGG ACAATCAGGC ATTTGAAGAC CAAGTGTTCC
101 ACACGCGGGC AGATGCACCG ATGCAGTTGG CGGAGCTTTC TCAAAAGGAG
151 ATGAAGGAGA CAGAGGGGGC GTTTCTTCCA TTGGCTATCT TGGGTGGTGC
201 TGCCATTGGT ATGTGGACAC AGCATGGTTT TAGTTATGCA ACGACAGGCA
251 GACCAGCTTC TGTTAGAGAT GTTGCTATTG CTGGCGGATT AGGCGCAATT
301 CCTGGTGGTG TAGGCGCCGC AGGAAAGGTT GTTTCCTTTG CTAAATATGG
351 ACGTGAGATT AAAATCGGCA ATAATATGCG GATAGCCCCT TTCGGTAATA
401 GAACAGGTCA TCCTATTGGA AAATTTCCCC ATTATCATCG TCGAGTTACG
451 GATAATACGG GCAAGACTTT GCCTGGACAG GGAATTGGTC GTCATCGCCC
501 TTGGGAATCA AAATCTACGG ACAGATCATG GAAAAACCGC TTCTAA
This corresponds to the amino acid sequence <SEQ ID 176; ORF30-1>:
1 MKKQITAAVM MLSMIAPAMA NGLDNQAFED QVFHTRADAP MQLAELSQKE
51 MKETEGAFLP LAILGGAAIG MWTQHGFSYA TTGRPASVRD VAIAGGLGAI
101 PGGVGAAGKV VSFAKYGREI KIGNNMRIAP FGNRTGHPIG KFPHYHRRVT
151 DNTGKTLPGQ GIGRHRPWES KSTDRSWKNR F*
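Computer analysis of this amino acid sequence gave the following results:
Homology with a predicted ORF from N. meningitidis (strain A)
ORF30 shows 97.6% identity over a 42 aa overlap with an ORF (ORF30a) from strain A of N. meningitidis: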
10 20 30 40
orf30.pep MKKQITAAVMMLSMIAPAMANGLDNQAFEDQMFHTRADAPMQ
|||||||||||||||||||||||||||||||:||||||||||
orf30a MKKQITAAVMMLSMIAPAMANGLDNQAFEDQVFHTRADAPMQLAELSQKEMKXTXGAFLP
10 20 30 40 50 60
orf30a LXILGGAAIGMWTQHGFSYATTGRPASVRDVAIAGGLGAIPGXVGAAGKVVSFAKYGREI
70 80 90 100 110 120
The complete length ORF30a nucleotide sequence <SEQ ID 177> is:
1 ATGAAAAAAC AAATCACCGC AGCCGTAATG ATGCTGTCTA TGATTGCCCC
51 CGCAATGGCA AACGGCTTGG ACAATCAGGC ATTTGAAGAC CAAGTGTTCC
101 ACACGCGGGC AGATGCACCG ATGCAGTTGG CGGAGCTTTC TCAAAAGGAG
151 ATGAAGGANA CAGNGGGGGC GTTTCTTCCA TTGGNTATCT TGGGTGGTGC
201 TGCCATTGGT ATGTGGACAC AGCATGGTTT TAGTTATGCA ACGACAGGCA
251 GACCAGCTTC TGTTAGAGAT GTTGCTATTG CTGGCGGATT AGGCGCAATT
301 CCTGGTGNTG TAGGCGCCGC AGGAAAGGTT GTTTCCTTTG CTAAATATGG
351 ACGTGAGATT AAAATCGGCA ATAATATGCG GATAGCCCCT TTCGGTAATA
401 GAACAGGTCA TCCTATTGGN AAATTTCCCC ATTATCATCG TCGAGTTACG
451 GATAATACGG GCAAGACTTT GCCTGGACAG GGAATTGGTC GTCATCGCCC
501 TTGGGAATCA AAATCTACGG ACAGATCATG GAAAAACCGC TTCTAA
This encodes a protein having the amino acid sequence <SEQ ID 178>:
1 MKKQITAAVM MLSMIAPAMA NGLDNQAFED QVFHTRADAP MQLAELSQKE
51 MKXTXGAFLP LXILGGAAIG MWTQHGFSYA TTGRPASVRD VAIAGGLGAI
101 PGXVGAAGKV VSFAKYGREI KIGNNMRIAP FGNRTGHPIG KFPHYHRRVT
151 DNTGKTLPGQ GIGRHRPWES KSTDRSWKNR F*
ORF30a and ORF30-1 show 97.8% identity over a 181 aa overlap:
orf30a.pep MKKQITAAVMMLSMIAPAMANGLDNQAFEDQVFHTRADAPMQLAELSQKEMKXTXGAFLP 60
|||||||||||||||||||||||||||||||||||||||||||||||||||| | |||||
orf30-1 MKKQITAAVMMLSMIAPAMANGLDNQAFEDQVFHTRADAPMQLAELSQKEMKETEGAFLP 60
orf30a.pep LXILGGAAIGMWTQHGFSYATTGRPASVRDVAIAGGLGAIPGXVGAAGKVVSFAKYGREI 120
| |||||||||||||||||||||||||||||||||||||||| |||||||||||||||||
orf30-1 LAILGGAAIGMWTQHGFSYATTGRPASVRDVAIAGGLGAIPGGVGAAGKVVSFAKYGREI 120
orf30a.pep KIGNNMRIAPFGNRTGHPIGKFPHYHRRVTDNTGKTLPGQGIGRHRPWESKSTDRSWKNR 180
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf30-1 KIGNNMRIAPFGNRTGHPIGKFPHYHRRVTDNTGKTLPGQGIGRHRPWESKSTDRSWKNR 180
orf30a.pep FX
||
orf30-1 FX
Homology with a predicted ORF from N. gonorrhoeae
ORF30 shows 97.6% identity over a 42 aa overlap with a predicted ORF (ORF30.ng) from N. gonorrhoeae:
orf30.pep MKKQITAAVMMLSMIAPAMANGLDNQAFEDQMFHTRADAPMQ 42
|||||||||||||||||||||||||||||||:||||||||||
orf30ng MKKQITAAVMMLSMIAPAMANGLDNQAFEDQVFHTRADAPMQLAELSQKEMKETEGAFLP 60
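The identity figure quoted for this overlap can be reproduced by counting matching columns. The sketch below assumes the two sequences are already aligned and of equal length (as in this gap-free 42 aa overlap). It does not perform an alignment itself, and the helper name `percent_identity` is hypothetical. The figures in the text were presumably produced by an alignment program, which this does not replace.

```python
def percent_identity(a: str, b: str) -> float:
    """Percentage of identical columns between two pre-aligned sequences."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have the same length")
    # A column only counts as a match if it holds the same residue, not a gap.
    matches = sum(1 for x, y in zip(a, b) if x == y and x != "-")
    return 100.0 * matches / len(a)


orf30 = "MKKQITAAVMMLSMIAPAMANGLDNQAFEDQMFHTRADAPMQ"    # <SEQ ID 174>
orf30ng = "MKKQITAAVMMLSMIAPAMANGLDNQAFEDQVFHTRADAPMQ"  # first 42 aa of <SEQ ID 180>

print(f"{percent_identity(orf30, orf30ng):.1f}% identity over {len(orf30)} aa")
# 97.6% identity over 42 aa
```

The single mismatch is at residue 32 (M in ORF30, V in ORF30ng), so 41 of 42 columns match.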
The complete length ORF30ng nucleotide sequence <SEQ ID 179> is:
1 ATGAAAAAAC AAATCACCGC AGCCGTAATG ATGCTGTCTA TGATCGCCCC
51 CGCAATGGCA AACGGATTGG ACAATCAGGC ATTTGAAGAC CAAGTGTTCC
101 ACACGCGGGC AGATGCGCCG ATGCAGTTGG CGGAGCTTTC TCAGAAGGAG
151 ATGAAGGAGA CTGAAGGGGC TTTTCTTCCA TTGGCTATCT TGGGTGGTGC
201 TGCCATTGGT ATGTGGACAC AGCATGGTTT TAGTTATGCA ACGACAGGCA
251 GACCAGCTTC TGTTAGAGAT GTTGCTGGCG GATTAGGCGC AATTCCTGGT
301 GATGTAGGTG CTGCAGGAAA GGTTGTTTCC TTTGCTAAAT ATGGACGTGA
351 GATTAAAATC GGCAATAATA TGCGGATAGC CCCTTTCGGT AATAGAACAG
401 GTCATCCTAT TGGAAAATTT CCCCATTATC ATCGTCGAGT TACGGATAAT
451 ACGGGCAAGA CTTTGCCTGG ACAGGGAATT GGTCGTCATC GCCCTTGGGA
501 ATCAAAATCT ACGGACAGAT CATGGAAAAA CCGCTTCTAA
This encodes a protein having the amino acid sequence <SEQ ID 180>:
1 MKKQITAAVM MLSMIAPAMA NGLDNQAFED QVFHTRADAP MQLAELSQKE
51 MKETEGAFLP LAILGGAAIG MWTQHGFSYA TTGRPASVRD VAGGLGAIPG
101 DVGAAGKVVS FAKYGREIKI GNNMRIAPFG NRTGHPIGKF PHYHRRVTDN
151 TGKTLPGQGI GRHRPWESKS TDRSWKNRF*
ORF30ng and ORF30-1 show 98.3% identity over a 181 aa overlap:
10 20 30 40 50 60
orf30ng.pep MKKQITAAVMMLSMIAPAMANGLDNQAFEDQVFHTRADAPMQLAELSQKEMKETEGAFLP
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf30-1 MKKQITAAVMMLSMIAPAMANGLDNQAFEDQVFHTRADAPMQLAELSQKEMKETEGAFLP
10 20 30 40 50 60
70 80 90 100 110
orf30ng.pep LAILGGAAIGMWTQHGFSYATTGRPASVRDVA--GGLGAIPGDVGAAGKVVSFAKYGREI
|||||||||||||||||||||||||||||||| ||||||||||| ||||||||||||||
orf30-1 LAILGGAAIGMWTQHGFSYATTGRPASVRDVAIAGGLGAIPGGVGAAGKVVSFAKYGREI
70 80 90 100 110 120
120 130 140 150 160 170
orf30ng.pep KIGNNMRIAPFGNRTGHPIGKFPHYHRRVTDNTGKTLPGQGIGRHRPWESKSTDRSWKNR
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf30-1 KIGNNMRIAPFGNRTGHPIGKFPHYHRRVTDNTGKTLPGQGIGRHRPWESKSTDRSWKNR
130 140 150 160 170 180
180
orf30ng.pep FX
||
orf30-1 FX
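Based on this analysis, including the presence of a putative leader sequence in the gonococcal protein, it is predicted that these proteins from N. meningitidis and N. gonorrhoeae, and their epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies.
Example 22
The following partial DNA sequence was identified in N. meningitidis <SEQ ID 181>: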
1 ATGAATAAAA CTCTCTATCG TGTAATTTTC AACCGCAAAC GTGGGGCTGT
51 GrTAGCCGTT GCTGAAACTA CCAAGCGCGA AGGTAAAAGC TGTGCCGATA
101 GTGATTCAGG CAGCGCTCAT GTGAAATCTG TTCCTTTTGG TACTACTCAT
151 GCACCTGTTT GTg.CGTTaC AAATATCTTT TCTTTTTCTT TATTGGGCTT
201 TTCTTTATGT TTGGCTGTAG GtacGGyCAA TATTGCTTTT GCTGATGGCA
251 TT..
This corresponds to the amino acid sequence <SEQ ID 182; ORF31>:
1 MNKTLYRVIF NRKRGAVXAV AETTKREGKS CADSDSGSAH VKSVPFGTTH
51 APVCXVTNIF SFSLLGFSLC LAVGTXNIAF ADGI..
Further work revealed a further partial nucleotide sequence <SEQ ID 183>:
1 ATGAATAAAA CTCTCTATCG TGTAATTTTC AACCGCAAAC GTGGGGCTGT
51 GGTAGCCGTT GCTGAAACTA CCAAGCGCGA AGGTAAAAGC TGTGCCGATA
101 GTGATTCAGG CAGCGCTCAT GTGAAATCTG TTCCTTTTGG TACTACTCAT
151 GCACCTGTTT GTCGTTCAAA TATCTTTTCT TTTTCTTTAT TGGGCTTTTC
201 TTTATGTTTG GCTGTAGGTA CGGCCAATAT TGCTTTTGCT GATGGCATT..
This corresponds to the amino acid sequence <SEQ ID 184; ORF31-1>:
1 MNKTLYRVIF NRKRGAVVAV AETTKREGKS CADSDSGSAH VKSVPFGTTH
51 APVCRSNIFS FSLLGFSLCL AVGTANIAFA DGI..
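Computer analysis of this amino acid sequence gave the following results:
Homology with a predicted ORF from N. gonorrhoeae
ORF31 shows 76.2% identity over an 84 aa overlap with a predicted ORF (ORF31.ng) from N. gonorrhoeae: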
orf31.pep MNKTLYRVIFNRKRGAVXAVAETTKREGKSCADSDSGSAHVKSVPFGTTHAPVCXVTNIF 60
||||||||||||||||| |||||||||||||||| |||::|||| | || :: |
orf31ng MNKTLYRVIFNRKRGAVVAVAETTKREGKSCADSGSGSVYVKSVSFIPTH------SKAF 54
orf31.pep SFSLLGFSLCLAVGTXNIAFADGI 84
|| ||||||||:|| ||||||||
orf31ng CFSALGFSLCLALGTVNIAFADGIITDKAAPKTQQATILQTGNGIPQVNIQTPTSAGVSV 114
The complete length ORF31ng nucleotide sequence <SEQ ID 185> is:
1 ATGAACAAAA CCCTCTATCG TGTGATTTTC AACCGCAAAC GCGGTGCTGT
51 GGTAGCTGTT GCCGAAACCA CCAAGCGCGA AGGTAAAAGC TGTGCCGATA
101 GTGGTTCGGG CAGCGTTTAT GTGAAATCCG TTTCTTTCAT TCCTACTCAT
151 TCCAAAGCCT TTTGTTTTTC TGCATTAGGC TTTTCTTTAT GTTTGGCTTT
201 GGGTACGGTC AATATTGCTT TTGCTGACGG CATTATTACT GATAAAGCTG
251 CTCCTAAAAC CCAACAAGCC ACGATTCTGC AAACAGGTaa cGGCATACCG
301 CAAGTCAATA TTCAAACCCC TACTTCGGCA GGGGTTTCTG TTAATCAATA
351 TGCCCAGTTT GATGTGGGTA ATCGCGGGGC GATTTTAAAC AACAGTCGCA
401 GCAACACCCA AACACAGCTA GGCGGTTGGA TTCAAGGCAA TCCTTGGTTG
451 ACAAGGGGCG AAGCACGTGT GGTTGTAAAC CAAATCAACA GCAGCCATCC
501 TTCACAACTG AATGGCTATA TTGAAGTGGG TGGACGACGT GCAGAAGTCG
551 TTATTGCCAA TCCGGCAGGG ATTGCAGTCA ATGGTGGTGG TTTTATCAAT
601 GCTTCCCGTG CCACTTTGAC GACAGGCCAA CCGCAATATC AAGCAGGAGA
651 CTTTAGCGGC TTTAAGATAA GGCAAGGCAA TGCTGTAATC GCCGGACACG
701 GTTTGGATGC CCGTGATACC GATTTCACAC GTATTCTTGT ATGCCAACAA
751 AATCACCTTG ATCAGTACGG CCGAACAAGC AGGCATTCGT AA
This encodes a protein having the amino acid sequence <SEQ ID 186>:
1 MNKTLYRVIF NRKRGAVVAV AETTKREGKS CADSGSGSVY VKSVSFIPTH
51 SKAFCFSALG FSLCLALGTV NIAFADGIIT DKAAPKTQQA TILQTGNGIP
101 QVNIQTPTSA GVSVNQYAQF DVGNRGAILN NSRSNTQTQL GGWIQGNPWL
151 TRGEARVVVN QINSSHPSQL NGYIEVGGRR AEVVIANPAG IAVNGGGFIN
201 ASRATLTTGQ PQYQAGDFSG FKIRQGNAVI AGHGLDARDT DFTRILVCQQ
251 NHLDQYGRTS RHS*
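This gonococcal protein shows 50% identity over a 149 aa overlap with the pore-forming hemolysin-like HecA protein from Erwinia chrysanthemi (accession number L39897):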
orf31ng 96 GNGIPQVNIQTPTSAGVSVNQYAQFDVGNRGAILNNSRSN-TQTQLGGWIQGNPWLTRGE 154
GNG+P VNI TP ++G+S N+Y F+V NRG ILNN + T +QLGG IQ NP L
HecA 45 GNGVPVVNIATPDASGLSHNRYHDFNVDNRGLILNNGTARLTPSQLGGLIQNNPNLNGRA 104
Orf31ng 155 ARVVVNQINSSHPSQLNGYIEVGGRRAEVVIANPAGIAVNGGGFINASRATLTTGQPQYQ 214
A ++N++ S + S+L GY+EV G+A VV+ANP GI +G GF+N R TLTTG PQ+
HecA 105 AAAILNEVVSPNRSRLAGYLEVAGQAANVVVANPYGITCSGCGFLNTPRLTLTTGTPQFD 164
Orf31ng 215 -AGDFSGFKIRQGNAVIAGHGLDARDTDF 242
AG SG +R G+ +I G GLDA +D+
HecA 165 AAGGLSGLDVRGGDILIDGAGLDASRSDY 193
In addition, ORF31ng and ORF31-1 show 79.5% identity over an 83 aa overlap:
10 20 30 40 50 60
orf31-1.pep MNKTLYRVIFNRKRGAVVAVAETTKREGKSCADSDSGSAHVKSVPFGTTHAPVCRSNIFS
|||||||||||||||||||||||||||||||||| |||::|||| | || |:|
orf31ng MNKTLYRVIFNRKRGAVVAVAETTKREGKSCADSGSGSVYVKSVSFIPTH-----SKAFC
10 20 30 40 50
70 80
orf31-1.pep FSLLGFSLCLAVGTANIAFADGI
|| ||||||||:||:||||||||
orf31ng FSALGFSLCLALGTVNIAFADGIITDKAAPKTQQATILQTGNGIPQVNIQTPTSAGVSVN
60 70 80 90 100 110
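Based on this analysis, including the homology with hemolysins and adhesins, it is predicted that these proteins from N. meningitidis and N. gonorrhoeae, and their epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies.
Example 23
The following partial DNA sequence was identified in N. meningitidis <SEQ ID 187>: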
1 ATGAATACTC CTCCTTTTGT CTGTTGGATT TTTTGCAAGG TCATCGACAA
51 TTTCGGCGAC ATCGGCGTTT CGTGGCGGCT CGCCCGTGTT TTGCACCGCG
101 AACTCGGTTG GCAGGTGCAT TTGTGGACGG ACGATGTGTC CGCCTTGCGT
151 GCGCTTTGCC CTGATTTGCC CGATGTTCCC TGCGTTCATC AGGATATTCA
201 TGTCCGCACT TGGCATTCCG ATGCGGCAGA TATTGATACC GCG..
This corresponds to the amino acid sequence <SEQ ID 188; ORF32>:
1 MNTPPFVCWI FCKVIDNFGD IGVSWRLARV LHRELGWQVH LWTDDVSALR
51 ALCPDLPDVP CVHQDIHVRT WHSDAADIDT A..
Further work revealed the complete nucleotide sequence <SEQ ID 189>:
1 ATGAATACTC CTCCTTTTGT CTGTTGGATT TTTTGCAAGG TCATCGACAA
51 TTTCGGCGAC ATCGGCGTTT CGTGGCGGCT CGCCCGTGTT TTGCACCGCG
101 AACTCGGTTG GCAGGTGCAT TTGTGGACGG ACGATGTGTC CGCCTTGCGT
151 GCGCTTTGCC CTGATTTGCC CGATGTTCCC TGCGTTCATC AGGATATTCA
201 TGTCCGCACT TGGCATTCCG ATGCGGCAGA TATTGATACC GCGCCTGTTC
251 CCGATGTCGT CATCGAAACT TTTGCCTGCG ACCTGCCCGA AAATGTGCTG
301 CACATTATCC GCCGACACAA GCCGCTTTGG CTGAATTGGG AATATTTGAG
351 CGCGGAGGAA AGCAATGAAA GGCTGCATCT GATGCCTTCG CCGCAGGAGG
401 GTGTTCAAAA ATATTTTTGG TTTATGGGTT TCAGCGAAAA AAGCGGCGGG
451 TTGATACGCG AACGTGATTA CTGCGAAGCC GTCCGTTTCG ATACTGAAGC
501 CCTGCGAGAG CGGCTGATGC TGCCCGAAAA AAACGCCTCC GAATGGCTGC
551 TTTTCGGCTA TCGGAGCGAT GTTTGGGCAA AGTGGCTGGA AATGTGGCGA
601 CAGGCAGGCA GCCCGATGAC ACTGTTGCTG GCGGGGACGC AAATCATCGA
651 CAGCCTCAAA CAAAGCGGCG TTATTCCGCA AGATGCCCTG CAAAACGACG
701 GCGATGTTTT TCAGACGGCA TCCGTCCGCC TCGTCAAAAT CCCTTTCGTG
751 CCGCAACAGG ACTTCGACCA ACTGCTGCAC CTTGCCGACT GCGCCGTCAT
801 CCGCGGCGAA GACAGTTTCG TGCGCGCCCA GCTTGCGGGC AAACCCTTCT
851 TTTGGCACAT CTACCCGCAA GACGAGAATG TCCATCTCGA CAAACTCCAC
901 GCCTTTTGGG ATAAGGCACA CGGTTTCTAC ACGCCCGAAA CCGTGTCGGC
951 ACACCGCCGT CTTTCGGACG ACCTCAACGG CGGAGAGGCT TTATCCGCAA
1001 CACAACGCCT CGAATGTTGG CAAACCCTGC AACAACATCA AAACGGCTGG
1051 CGGCAAGGCG CGGAGGATTG GAGCCGTTAT CTTTTCGGGC AGCCGTCAGC
1101 TCCTGAAAAA CTCGCTGCCT TTGTTTCAAA GCATCAAAAA ATACGCTAG
This corresponds to the amino acid sequence <SEQ ID 190; ORF32-1>:
1 MNTPPFVCWI FCKVIDNFGD IGVSWRLARV LHRELGWQVH LWTDDVSALR
51 ALCPDLPDVP CVHQDIHVRT WHSDAADIDT APVPDVVIET FACDLPENVL
101 HIIRRHKPLW LNWEYLSAEE SNERLHLMPS PQEGVQKYFW FMGFSEKSGG
151 LIRERDYCEA VRFDTEALRE RLMLPEKNAS EWLLFGYRSD VWAKWLEMWR
201 QAGSPMTLLL AGTQIIDSLK QSGVIPQDAL QNDGDVFQTA SVRLVKIPFV
251 PQQDFDQLLH LADCAVIRGE DSFVRAQLAG KPFFWHIYPQ DENVHLDKLH
301 AFWDKAHGFY TPETVSAHRR LSDDLNGGEA LSATQRLECW QTLQQHQNGW
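351 RQGAEDWSRY LFGQPSAPEK LAAFVSKHQK IR*
Computer analysis of this amino acid sequence gave the following results:
Homology with a predicted ORF from N. meningitidis (strain A)
ORF32 shows 93.8% identity over an 81 aa overlap with an ORF (ORF32a) from strain A of N. meningitidis: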
10 20 30 40 50 60
orf32.pep MNTPPFVCWIFCKVIDNFGDIGVSWRLARVLHRELGWQVHLWTDDVSALRALCPDLPDVP
|||||| ||||||||||||||||||||||||||||||||||||||||||||||||||
orf32a MNTPPFSAGXFCKVIDNFGDIGVSWRLARVLHRELGWQVHLWTDDVSALRALCPDLPDVX
10 20 30 40 50 60
70 80
orf32.pep CVHQDIHVRTWHSDAADIDTA
|||||||||||||||||||||
orf32a CVHQDIHVRTWHSDAADIDTAPVXDVVIETFACDLPENVLHIIRRHKPLWLXWEYLSAEX
70 80 90 100 110 120
The complete length ORF32a nucleotide sequence <SEQ ID 191> is:
1 ATGAATACTC CTCCTTTTTC TGCTGGANTT TTTTGCAAGG TCATCGACAA
51 TTTCGGCGAC ATCGGCGTTT CGTGGCGGCT TGCCCGTGTT TTGCACCGCG
101 AACTCGGTTG GCAGGTGCAT TTGTGGACGG ACGATGTGTC CGCCTTGCGT
151 GCGCTTTGCC CTGATTTGCC CGATGTTCNC TGCGTTCATC AGGATATTCA
201 TGTCCGCACT TGGCATTCCG ATGCGGCAGA TATTGATACC GCGCCTGTTC
251 NCGATGTCGT CATCGAAACT TTTGCCTGCG ACCTGCCCGA AAATGTGCTG
301 CACATCATCC GCCGACACAA GCCGCTTTGG CTGAANTGGG AATATTTGAG
351 CGCGGAGGAN AGCAATGAAA GGCTGCACNT GATGCCTTCG CCGCAGGAGA
401 GTGTTCNAAA ATANTTTTGG TTTATGGGTT TCAGCGAANN NAGCGGCGGA
451 CTGATACGCG AACGCGATTA CTGCGAAGCC GTCCGTTTCG ATAGCGGAGC
501 CTTGCGCAAG AGGCTGATGC TTCCCGAAAA AAACGNCCCC GAATGGCTGC
551 TTTTCGGCTA TCGGAGCGAT GTTTGGGCAA AGTGGCTGGA AATGTGGCGA
601 CAGGCAGGCA GTCCGTTGAC ACTTTTGCTG GCNGGGGCGC ANATTATCGA
651 CAGCCTCAAA CAAAACGGCG TTATTCCGCA AGATGCCCTG CAAAACGACG
701 GCGATGTTTT TCAGACGGCA TCCGTCCGCC TCGTCAAAAT CCCTTTCGTG
751 CCGCAACAGG ACTTCGACAA ACTGCTGCAC CTTGCCGACT GCGCCGTCAT
801 CCGCGGCGAA GACAGTTTCG TGCGCGCCCA GCTTGCGGGC AAACCCTTCT
851 TTTGGCACAT CTACCCGCAA GATGAGAATG TCCATCTCGA CAAACTCCAC
901 GCCTTTTGGG ATAAGGCACA CGGTTTCTAC ACGCCCGAAA CCGCATCGGC
951 ACACCGCCGC CTTTCAGACG ACCTCAACGG CGGAGAGGCT TTATCCGCAA
1001 CACAACGCCT CGAATGTTGG CAAATCCTGC AACAACATCA AAACGGCTGG
1051 CGGCAAGGCG CGGAGGATTG GAGCCGTTAT CTTTTTGGGC AGCCTTCCGC
1101 ATCCGAAAAA CTCGCCGCCT TTGTTTCAAA GCATCAAAAA ATACGCTAG
This encodes a protein having the amino acid sequence <SEQ ID 192>:
1 MNTPPFSAGX FCKVIDNFGD IGVSWRLARV LHRELGWQVH LWTDDVSALR
51 ALCPDLPDVX CVHQDIHVRT WHSDAADIDT APVXDVVIET FACDLPENVL
101 HIIRRHKPLW LXWEYLSAEX SNERLHXMPS PQESVXKXFW FMGFSEXSGG
151 LIRERDYCEA VRFDSGALRK RLMLPEKNXP EWLLFGYRSD VWAKWLEMWR
201 QAGSPLTLLL AGAXIIDSLK QNGVIPQDAL QNDGDVFQTA SVRLVKIPFV
251 PQQDFDKLLH LADCAVIRGE DSFVRAQLAG KPFFWHIYPQ DENVHLDKLH
301 AFWDKAHGFY TPETASAHRR LSDDLNGGEA LSATQRLECW QILQQHQNGW
351 RQGAEDWSRY LFGQPSASEK LAAFVSKHQK IR*
ORF32a and ORF32-1 show 93.2% identity over a 382 aa overlap:
10 20 30 40 50 60
orf32-1.pep MNTPPFVCWIFCKVIDNFGDIGVSWRLARVLHRELGWQVHLWTDDVSALRALCPDLPDVP
|||||| ||||||||||||||||||||||||||||||||||||||||||||||||||
orf32a MNTPPFSAGXFCKVIDNFGDIGVSWRLARVLHRELGWQVHLWTDDVSALRALCPDLPDVX
10 20 30 40 50 60
70 80 90 100 110 120
orf32-1.pep CVHQDIHVRTWHSDAADIDTAPVPDVVIETFACDLPENVLHIIRRHKPLWLNWEYLSAEE
||||||||||||||||||||||| ||||||||||||||||||||||||||| ||||||||
orf32a CVHQDIHVRTWHSDAADIDTAPVXDVVIETFACDLPENVLHIIRRHKPLWLXWEYLSAEX
70 80 90 100 110 120
130 140 150 160 170 180
orf32-1.pep SNERLHLMPSPQEGVQKYFWFMGFSEKSGGLIRERDYCEAVRFDTEALRERLMLPEKNAS
|||||| ||||||:| | |||||||| |||||||||||||||||: |||:||||||||||
orf32a SNERLHXMPSPQESVXKXFWFMGFSEXSGGLIRERDYCEAVRFDSGALRKRLMLPEKNXP
130 140 150 160 170 180
190 200 210 220 230 240
orf32-1.pep EWLLFGYRSDVWAKWLEMWRQAGSPMTLLLAGTQIIDSLKQSGVIPQDALQNDGDVFQTA
|||||||||||||||||||||||||:||||||: |||||||:||||||||||||||||||
orf32a EWLLFGYRSDVWAKWLEMWRQAGSPLTLLLAGAXIIDSLKQNGVIPQDALQNDGDVFQTA
190 200 210 220 230 240
250 260 270 280 290 300
orf32-1.pep SVRLVKIPFVPQQDFDQLLHLADCAVIRGEDSFVRAQLAGKPFFWHIYPQDENVHLDKLH
||||||||||||||||:|||||||||||||||||||||||||||||||||||||||||||
orf32a SVRLVKIPFVPQQDFDKLLHLADCAVIRGEDSFVRAQLAGKPFFWHIYPQDENVHLDKLH
250 260 270 280 290 300
310 320 330 340 350 360
orf32-1.pep AFWDKAHGFYTPETVSAHRRLSDDLNGGEALSATQRLECWQTLQQHQNGWRQGAEDWSRY
||||||||||||||:|||||||||||||||||||||||||| ||||||||||||||||||
orf32a AFWDKAHGFYTPETASAHRRLSDDLNGGEALSATQRLECWQILQQHQNGWRQGAEDWSRY
310 320 330 340 350 360
370 380
orf32-1.pep LFGQPSAPEKLAAFVSKHQKIRX
||||||| |||||||||||||||
orf32a LFGQPSASEKLAAFVSKHQKIRX
370 380
Homology with a predicted ORF from N. gonorrhoeae
ORF32 shows 95.1% identity over an 82 aa overlap with a predicted ORF (ORF32.ng) from N. gonorrhoeae:
orf32.pep MNTPPF-VCWIFCKVIDNFGDIGVSWRLARVLHRELGWQVHLWTDDVSALRALCPDLP 57
||| | |||||||||||||||||||||||||||||||||||||||||||||||||||
orf32ng MVMNTYAFPVCWIFCKVIDNFGDIGVSWRLARVLHRELGWQVHLWTDDVSALRALCPDLP 60
orf32.pep DVPCVHQDIHVRTWHSDAADIDTA 81
||| ||||||||||||||||||||
orf32ng DVPFVHQDIHVRTWHSDAADIDTAPVPDAVIETFACDLPENVLNIIRRHKPLWLNWEYLS 120
The predicted ORF32ng nucleotide sequence <SEQ ID 193> encodes a protein having the amino acid sequence <SEQ ID 194>:
1 MVMNTYAFPV CWIFCKVIDN FGDIGVSWRL ARVLHRELGW QVHLWTDDVS
51 ALRALCPDLP DVPFVHQDIH VRTWHSDAAD IDTAPVPDAV IETFACDLPE
101 NVLNIIRRHK PLWLNWEYLS AEESNERLHL MPSPQEGVQK YFWFMGFSEK
151 SGGLIRERDY REAVRFDTEA LRRRLVLPEK NAPEWLLFGY RGDVWAKWLD
201 MWQQAGSLMT LLLAGAQIID SLKQSGVIPQ NALQNEGGVF QTASVRLVKI
251 PFVPQQDFDK LLHLADCAVI RGEDSFVRTQ LAGKPFFWHI YPQDENVHLD
301 KLHAFWDKAY GFYTPETASV HRLLSDDLNG GEALSATQRL ECGVL*
Further sequencing revealed the following DNA sequence <SEQ ID 195>:
1 ATGAATACAT ACGCTTTTCC TGTCTGTTGG ATTTTTTGCA AGGTCATCGA
51 CAATTTCGGC GACATCGGCG TTTCGTGGCG GCTCGCCCGT GTTTTGCACC
101 GCGAACTCGG TTGGCAGGTG CATTTGTGGA CGGACGACGT GTCCGCCTTG
151 CGCGCGCTTT GTCCCGATTT GCCCGATGTT CCCTTCGTTC ATCAGGATAT
201 TCATGTCCGC ACTTGGCATT CCGATGCGGC AGACATTGAT ACCGCGCCCG
251 TTCCCGATGC CGTTATCGAA ACTTTTGCCT GCGACCTGCC CGAAAATGTG
301 CTGAACATCA TCCGCCGACA CAAACCGCTT TGGCTGAATT GGGAATATTT
351 GAGCGCGGAG GAAAGCAATG AAAGGCTGCA CCTGATGCCT TCGCCGCAGG
401 AGGGCGTTCA AAAATATTTT TGGTTTATGG GTTTCAGCGA AAAAAGCGGC
451 GGGTTGATAC GCGAACGCGA TTACCGCGAA GCCGTCCGTT TCGATACCGA
501 AGCCCTGCGC CGGCGGCTGG TGCTGCCCGA AAAAAACGCC CCCGAATGGC
551 TGCTTTTCGG CTATCGGGGC GATGTTTGGG CAAAGTGGCT GGACATGTGG
601 CAACAGGCAG GCAGCCTGAT GACCCTACTG CTGGCGGGGG CGCAAATTAT
651 CGACAGCCTC AAACAAAGCG GCGTTATTCC GCAAAACGCC CTGCAAAAtg
701 aaggcgGTGT CTTTCagacG gcatccgTcC gccttGTCAA AAtcCCGTTC
751 GTGCcGCAAC AGGAcTTCGA CAAATTGCTG CAcctcgcCG ACTGCGCCGT
801 GATACGCGGC GAAGACAGTT TCGTGCGTAC CCAGCTTGCC GGAAAACCCT
851 TTTTTTGGCA CATCTACCCG CAAGACGAGA ATGTCCATCT CGACAAACTC
901 CACGCCTTTT GGGATAAGGC ATACGGCTTC TACACGCCCG AAACCGCATC
951 GGTGCACCGC CTCCTTTCGG ACGACCTCAA CGGCGGAGAG GCTTTATCCG
1001 CAACACAACG CCTCGAATGT TGGCAAACCC TGCAACAACA TCAAAACGGC
1051 TGGCGGCAAG GCGCGGAGGA TTGGAGCCGT TATCTTTTCG GGCAGCCTTC
1101 CGCATCCGAA AAACTCGCCG CCTTTGTTTC AAAGCATCAA AAAATACGCT
1151 AG
This encodes a protein having the amino acid sequence <SEQ ID 196; ORF32ng-1>:
1 MNTYAFPVCW IFCKVIDNFG DIGVSWRLAR VLHRELGWQV HLWTDDVSAL
51 RALCPDLPDV PFVHQDIHVR TWHSDAADID TAPVPDAVIE TFACDLPENV
101 LNIIRRHKPL WLNWEYLSAE ESNERLHLMP SPQEGVQKYF WFMGFSEKSG
151 GLIRERDYRE AVRFDTEALR RRLVLPEKNA PEWLLFGYRG DVWAKWLDMW
201 QQAGSLMTLL LAGAQIIDSL KQSGVIPQNA LQNEGGVFQT ASVRLVKIPF
251 VPQQDFDKLL HLADCAVIRG EDSFVRTQLA GKPFFWHIYP QDENVHLDKL
301 HAFWDKAYGF YTPETASVHR LLSDDLNGGE ALSATQRLEC WQTLQQHQNG
351 WRQGAEDWSR YLFGQPSASE KLAAFVSKHQ KIR*
ORF32ng-1 and ORF32-1 show 93.5% identity over a 383 aa overlap:
10 20 30 40 50 59
orf32-1.pep MNTPPF-VCWIFCKVIDNFGDIGVSWRLARVLHRELGWQVHLWTDDVSALRALCPDLPDV
||| | |||||||||||||||||||||||||||||||||||||||||||||||||||||
orf32ng-1 MNTYAFPVCWIFCKVIDNFGDIGVSWRLARVLHRELGWQVHLWTDDVSALRALCPDLPDV
10 20 30 40 50 60
60 70 80 90 100 110 119
orf32-1.pep PCVHQDIHVRTWHSDAADIDTAPVPDVVIETFACDLPENVLHIIRRHKPLWLNWEYLSAE
| ||||||||||||||||||||||||:||||||||||||||:||||||||||||||||||
orf32ng-1 PFVHQDIHVRTWHSDAADIDTAPVPDAVIETFACDLPENVLNIIRRHKPLWLNWEYLSAE
70 80 90 100 110 120
120 130 140 150 160 170 179
orf32-1.pep ESNERLHLMPSPQEGVQKYFWFMGFSEKSGGLIRERDYCEAVRFDTEALRERLMLPEKNA
||||||||||||||||||||||||||||||||||||||| |||||||||||:||:|||||
orf32ng-1 ESNERLHLMPSPQEGVQKYFWFMGFSEKSGGLIRERDYREAVRFDTEALRRRLVLPEKNA
130 140 150 160 170 180
180 190 200 210 220 230 239
orf32-1.pep SEWLLFGYRSDVWAKWLEMWRQAGSPMTLLLAGTQIIDSLKQSGVIPQDALQNDGDVFQT
||||||||:|||||||:||:|||| |||||||:||||||||||||||:||||:| ||||
orf32ng-1 PEWLLFGYRGDVWAKWLDMWQQAGSLMTLLLAGAQIIDSLKQSGVIPQNALQNEGGVFQT
190 200 210 220 230 240
240 250 260 270 280 290 299
orf32-1.pep ASVRLVKIPFVPQQDFDQLLHLADCAVIRGEDSFVRAQLAGKPFFWHIYPQDENVHLDKL
||||||||||||||||:||||||||||||||||||:||||||||||||||||||||||||
orf32ng-1 ASVRLVKIPFVPQQDFDKLLHLADCAVIRGEDSFVRTQLAGKPFFWHIYPQDENVHLDKL
250 260 270 280 290 300
300 310 320 330 340 350 359
orf32-1.pep HAFWDKAHGFYTPETVSAHRRLSDDLNGGEALSATQRLECWQTLQQHQNGWRQGAEDWSR
|||||||:|||||||||:|| |||||||||||||||||||||||||||||||||||||||
orf32ng-1 HAFWDKAYGFYTPETASVHRLLSDDLNGGEALSATQRLECWQTLQQHQNGWRQGAEDWSR
310 320 330 340 350 360
360 370 380
orf32-1.pep YLFGQPSAPEKLAAFVSKHQKIRX
|||||||| |||||||||||||||
orf32ng-1 YLFGQPSASEKLAAFVSKHQKIRX
370 380
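Based on this analysis, including the presence in the gonococcal protein of an RGD sequence, which is characteristic of adhesins, it is predicted that these proteins from N. meningitidis and N. gonorrhoeae, and their epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies.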
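The RGD sequence mentioned above can be located with a simple substring scan. The following sketch uses two 20-residue fragments copied from the listings (residues 181-200 of ORF32ng-1 <SEQ ID 196> and the corresponding residues 180-199 of ORF32-1 <SEQ ID 190>). The helper `find_motif` and its `offset` parameter are hypothetical names introduced only for illustration.

```python
from typing import List


def find_motif(protein: str, motif: str = "RGD", offset: int = 0) -> List[int]:
    """Return 1-based start positions of motif, shifted by offset residues."""
    positions = []
    start = protein.find(motif)
    while start != -1:
        positions.append(start + 1 + offset)
        start = protein.find(motif, start + 1)
    return positions


orf32ng_1_fragment = "PEWLLFGYRGDVWAKWLDMW"  # ORF32ng-1, residues 181-200
orf32_1_fragment = "SEWLLFGYRSDVWAKWLEMW"    # ORF32-1, residues 180-199

print(find_motif(orf32ng_1_fragment, offset=180))  # [189]
print(find_motif(orf32_1_fragment, offset=179))    # []
```

In these fragments the gonococcal sequence carries RGD at residues 189-191, whereas the aligned meningococcal region reads RSD.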
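As described above, ORF32-1 (42 kDa) was cloned into pET and pGEX vectors and expressed in E. coli. The products of protein expression and purification were analyzed by SDS-PAGE. Figure 7A shows the result of affinity purification of the His-fusion protein, and Figure 7B shows the result of expression of the GST-fusion in E. coli. Purified His-fusion protein was used to immunize mice, and the mouse sera gave a positive result in ELISA. These results confirm that ORF32-1 is a surface-exposed protein and that it is a useful immunogen.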
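As a rough plausibility check on the 42 kDa figure, the mass of a polypeptide can be approximated from its length. The sketch below assumes that the quoted mass refers to the unfused ORF32-1 polypeptide of 382 residues <SEQ ID 190> and uses the common rule of thumb of about 110 Da per residue. It is an approximation, not the method used in the original work, and the function name is hypothetical.

```python
AVERAGE_RESIDUE_MASS_DA = 110.0  # rule-of-thumb average mass per amino acid


def approx_mass_kda(n_residues: int) -> float:
    """Approximate polypeptide mass in kDa from the number of residues."""
    return n_residues * AVERAGE_RESIDUE_MASS_DA / 1000.0


print(f"{approx_mass_kda(382):.0f} kDa")  # 42 kDa
```

A composition-based calculation would give a more exact value, but the estimate is enough to see that the stated size is consistent with the sequence length.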
Example 24
The following partial DNA sequence was identified in N. meningitidis <SEQ ID 197>:
1 ..TTGTTCCTGC GTGTNAAAGT GGGGCGTTTT TTCAGCAGTC CGGCGACGTG
51 GTTTCGGGNC AAAGACCCTG TAAATCAGGC GGTGTTGCGG CTGTATNCGG
101 ACGAGTGGCG GCA.ACTTCG GTACGTTGGA AAATAGNCGC AACGTCGCAC
151 AGCCTGTGGC TCTGCACGCT GCTCGGAATG CTGGTGTCGG TATTGTTGCT
201 GCTTTTGGTG CGGCAATATA CGTTCAACTG GGAAAGCACG CTGTTGAGCA
251 ATGCCGCTTC GGTACGCGCG GTGGAAATGT TGGCATGGCT GCCGTCGAAA
301 CTCGGTTTCC CTGTCCCCGA TGCGCGGTCG GTCATCGAAG GCCGTCTGAA
351 CGGCAATATT GCCGATGCGC GGGCTTGGTC GGGGCTGCTG GTCGNCAGTA
401 TCGCCTGCTA NGGCATCCTG CCGCGCCTG..
This corresponds to the amino acid sequence <SEQ ID 198; ORF33>:
1 ..LFLRVKVGRF FSSPATWFRX KDPVNQAVLR LYXDEWRXTS VRWKIXATSH
51 SLWLCTLLGM LVSVLLLLLV RQYTFNWEST LLSNAASVRA VEMLAWLPSK
101 LGFPVPDARS VIEGRLNGNI ADARAWSGLL VXSIACXGIL PRL..
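The correspondence between the partial DNA sequence and the partial protein, including how undetermined bases (N) give rise to undetermined residues (X), can be illustrated with a short translation routine. This is a sketch under assumptions: it uses the standard genetic code, treats a codon as determined only when every expansion of N yields the same residue, and the function names are illustrative rather than taken from the specification.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate_codon(codon: str) -> str:
    """Translate one codon; N is expanded to all four bases.

    If every expansion encodes the same residue (e.g. GTN -> V) that
    residue is returned, otherwise the residue is reported as X.
    """
    codon = codon.upper()
    if any(base not in "ACGTN" for base in codon):
        return "X"  # gap characters or other symbols cannot be resolved
    options = [BASES if base == "N" else base for base in codon]
    residues = {CODON_TABLE["".join(c)] for c in product(*options)}
    return residues.pop() if len(residues) == 1 else "X"


def translate(dna: str) -> str:
    """Translate a DNA string in frame 1, ignoring whitespace."""
    dna = "".join(dna.split())
    return "".join(translate_codon(dna[i:i + 3]) for i in range(0, len(dna) - 2, 3))


# First 30 bases of the partial sequence <SEQ ID 197>.
print(translate("TTGTTCCTGC GTGTNAAAGT GGGGCGTTTT"))
# Bases 31-70, which contain the ambiguous codon GNC.
print(translate("TTCAGCAGTC CGGCGACGTG GTTTCGGGNC AAAGACCCTG"))
```

With these inputs the codon GTN still resolves to valine, whereas GNC does not resolve and is reported as X, matching the first X in the ORF33 listing.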
Further work revealed the complete nucleotide sequence <SEQ ID 199>:
1 ATGTTGAATC CATCCCGAAA ACTGGTTGAG CTGGTCCGTA TTTTGGACGA
51 AGGCGGTTTT ATTTTCAGCG GCGATCCCGT ACAGGCGACG GAGGCTTTGC
101 GCCGCGTGGA CGGCAGTACG GAGGAAAAAA TCATCCGTCG GGCGGAGATG
151 ATTGACAGGA ACCGTATGCT GCGGGAGACG TTGGAACGTG TGCGTGCGGG
201 GTCGTTCTGG TTGTGGGTGG TGGCGGCGAC GTTTGCATTT TTTACCGGTT
251 TTTCAGTCAC TTATCTTCTA ATGGACAATC AGGGTCTGAA TTTCTTTTTG
301 GTTTTGGCGG GCGTGTTGGG CATGAATACG CTGATGCTGG CAGTATGGTT
351 GGCAATGTTG TTCCTGCGTG TGAAAGTGGG GCGTTTTTTC AGCAGTCCGG
401 CGACGTGGTT TCGGGGCAAA GACCCTGTAA ATCAGGCGGT GTTGCGGCTG
451 TATGCGGACG AGTGGCGGCA ACCTTCGGTA CGTTGGAAAA TAGGCGCAAC
501 GTCGCACAGC CTGTGGCTCT GCACGCTGCT CGGAATGCTG GTGTCGGTAT
551 TGTTGCTGCT TTTGGTGCGG CAATATACGT TCAACTGGGA AAGCACGCTG
601 TTGAGCAATG CCGCTTCGGT ACGCGCGGTG GAAATGTTGG CATGGCTGCC
651 GTCGAAACTC GGTTTCCCTG TCCCCGATGC GCGGGCGGTC ATCGAAGGCC
701 GTCTGAACGG CAATATTGCC GATGCGCGGG CTTGGTCGGG GCTGCTGGTC
751 GGCAGTATCG CCTGCTACGG CATCCTGCCG CGCCTGCTGG CTTGGGTAGT
801 GTGTAAAATC CTTTTGAAAA CAAGCGAAAA CGGATTGGAT TTGGAAAAGC
851 CCTATTATCA GGCGGTCATC CGCCGCTGGC AGAACAAAAT CACCGATGCG
901 GATACGCGTC GGGAAACCGT GTCCGCCGTT TCACCGAAAA TCATCTTGAA
951 CGATGCGCCG AAATGGGCGG TCATGCTGGA GACCGAGTGG CAGGACGGCG
1001 AATGGTTCGA GGGCAGGCTG GCGCAGGAAT GGCTGGATAA GGGCGTTGCC
1051 ACCAATCGGG AACAGGTTGC CGCGCTGGAG ACAGAGCTGA AGCAGAAACC
1101 GGCGCAACTG CTTATCGGCG TGCGCGCCCA AACTGTGCCG GACCGCGGCG
1151 TGTTGCGGCA GATTGTCCGA CTCTCGGAAG CGGCGCAGGG CGGCGCGGTG
1201 GTGCAGCTTT TGGCGGAACA GGGGCTTTCA GACGACCTTT CGGAAAAGCT
1251 GGAACATTGG CGTAACGCGC TGGCCGAATG CGGCGCGGCG TGGCTTGAGC
1301 CTGACAGGGC GGCGCAGGAA GGGCGTTTGA AAGACCAATA A
This corresponds to the amino acid sequence <SEQ ID 200; ORF33-1>:
1 MLNPSRKLVE LVRILDEGGF IFSGDPVQAT EALRRVDGST EEKIIRRAEM
51 IDRNRMLRET LERVRAGSFW LWVVAATFAF FTGFSVTYLL MDNQGLNFFL
101 VLAGVLGMNT LMLAVWLAML FLRVKVGRFF SSPATWFRGK DPVNQAVLRL
151 YADEWRQPSV RWKIGATSHS LWLCTLLGML VSVLLLLLVR QYTFNWESTL
201 LSNAASVRAV EMLAWLPSKL GFPVPDARAV IEGRLNGNIA DARAWSGLLV
251 GSIACYGILP RLLAWVVCKI LLKTSENGLD LEKPYYQAVI RRWQNKITDA
301 DTRRETVSAV SPKIILNDAP KWAVMLETEW QDGEWFEGRL AQEWLDKGVA
351 TNREQVAALE TELKQKPAQL LIGVRAQTVP DRGVLRQIVR LSEAAQGGAV
401 VQLLAEQGLS DDLSEKLEHW RNALAECGAA WLEPDRAAQE GRLKDQ*
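Computer analysis of this amino acid sequence gave the following results:
Homology with a predicted ORF from N. meningitidis (strain A)
ORF33 shows 90.9% identity over a 143 aa overlap with an ORF (ORF33a) from strain A of N. meningitidis: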
10 20 30
orf33.pep LFLRVKVGRFFSSPATWFRXKDPVNQAVLR
||||||||||||||||||| ||||||||||
orf33a LMDNQGLNFFLVLAGVXGMNTLMLAVWLAMLFLRVKVGRFFSSPATWFRGKDPVNQAVLR
90 100 110 120 130 140
40 50 60 70 80 90
orf33.pep LYXDEWRXTSVRWKIXATSHSLWLCTLLGMLVSVLLLLLVRQYTFNWESTLLSNAASVRA
|| ||||| |||||| ||||||| ||||||||||||||||| |||||||||||||::::|||
orf33a LYADEWRXPSVRWKIGATSHSLWLCTLLGMLVSVLLLLLVRQYTFNWESTLLGDSSSVRL
150 160 170 180 190 200
100 110 120 130 140
orf33.pep VEMLAWLPSKLGFPVPDARSVIEGRLNGNIADARAWSGLLVXSIACXGILPRL
||||||||:||||||||||:||||||||||||||||||||| |||| ||||||
orf33a VEMLAWLPAKLGFPVPDARAVIEGRLNGNIADARAWSGLLVGSIACYGILPRLLAWAVCK
210 220 230 240 250 260
orf33a ILXXTSENGLDLEKXXXXXXIRRWQNKITDADTRRETVSAVSPKIVLNDAPKWAVMLETE
270 280 290 300 310 320
The complete length ORF33a nucleotide sequence <SEQ ID 201> is:
1 ATGTTGAATC CATCCCGAAA ACTGGTTGAG CTGGTCCGTA TTTTGGAAGA
51 AGGCGGCTTT ATTTTCAGCG GCGATCCCGT GCAGGCGACG GAGGCTTTGC
101 GCCGCGTGGA CGGCAGTACG GAGGAAAAAA TCATCCGTCG GGCGAAGATG
151 ATCGACAGGA ACCGTATGCT GCGGGAGACG TTGGAACGTG TGCGTGCGGG
201 GTCGTTCTGG TTGTGGGTGG CGGCGGCGAC GTTTGCGTTT NTTACCGNTT
251 TTTCAGTTAC TTATCTTCTA ATGGACAATC AGGGTCTGAA TTTCTTTTTG
301 GTTTTGGCGG GCGTGNTGGG CATGAATACG CTGATGCTGG CAGTATGGTT
351 GGCAATGTTG TTCCTGCGCG TGAAAGTGGG GCGTTTTTTC AGCAGTCCGG
401 CGACGTGGTT TCGGGGCAAA GACCCTGTCA ATCAGGCGGT GTTGCGGCTG
451 TATGCGGACG AGTGGCGGCN ACCTTCGGTA CGTTGGAAAA TAGGCGCAAC
501 GTCGCACAGC CTGTGGCTCT GCACGCTGCT CGGAATGCTG GTGTCGGTAT
551 TGTTGCTGCT TTTGGTGCGG CAATATACGT TCAACTGGGA AAGCACGCTG
601 TTGGGCGATT CGTCTTCGGT ACGGCTGGTG GAAATGTTGG CATGGCTGCC
651 TGCGAAACTG GGTTTTCCCG TGCCTGATGC GCGGGCGGTC ATCGAAGGTC
701 GTCTGAACGG CAATATTGCC GATGCGCGGG CTTGGTCGGG GCTGCTGGTC
751 GGCAGTATCG CCTGCTACGG CATCCTGCCG CGCCTCTTGG CTTGGGCGGT
801 ATGCAAAATC CTTNTGNAAA CAAGCGAAAA CGGCTTGGAT TTGGAAAAGC
851 NCNNNNNTCN NNCGNTCATC CGCCGCTGGC AGAACAAAAT CACCGATGCG
901 GATACGCGTC GGGAAACCGT GTCCGCCGTT TCGCCGAAAA TCGTCTTGAA
951 CGATGCGCCG AAATGGGCGG TCATGCTGGA GACCGAATGG CAGGACGGCG
1001 AATGGTTCGA GGGCAGGCTG GCGCAGGAAT GGCTGGATAA GGGCGTTGCC
1051 GCCAATCGGG AACAGGTTGC CGCGCTGGAG ACAGAGCTGA AGCAGAAACC
1101 GGCGCAACTG CTTATCGGCG TGCGCGCCCA AACTGTGCCC GACCGCGGCG
1151 TGTTGCGGCA GATCGTCCGA CTTTCGGAAG CGGCGCAGGG CGGCGCGGTG
1201 GTGCANCTTT TGGCGGAACA GGGGCTTTCA GACGACCTTT CGGAAAAGCT
1251 GGAACATTGG CGTAACGCGC TGACCGAATG CGGCGCGGCG TGGCTGGAAC
1301 CCGACAGAGC GGCGCAGGAA GGCCGTCTGA AAACCAACGA CCGCACTTGA
This encodes a protein having amino acid sequence <SEQ ID 202>:
1 MLNPSRKLVE LVRILEEGGF IFSGDPVQAT EALRRVDGST EEKIIRRAKM
51 IDRNRMLRET LERVRAGSFW LWVAAATFAF XTXFSVTYLL MDNQGLNFFL
101 VLAGVXGMNT LMLAVWLAML FLRVKVGRFF SSPATWFRGK DPVNQAVLRL
151 YADEWRXPSV RWKIGATSHS LWLCTLLGML VSVLLLLLVR QYTFNWESTL
201 LGDSSSVRLV EMLAWLPAKL GFPVPDARAV IEGRLNGNIA DARAWSGLLV
251 GSIACYGILP RLLAWAVCKI LXXTSENGLD LEKXXXXXXI RRWQNKITDA
301 DTRRETVSAV SPKIVLNDAP KWAVMLETEW QDGEWFEGRL AQEWLDKGVA
351 ANREQVAALE TELKQKPAQL LIGVRAQTVP DRGVLRQIVR LSEAAQGGAV
401 VXLLAEQGLS DDLSEKLEHW RNALTECGAA WLEPDRAAQE GRLKTNDRT*
ORF33a and ORF33-1 show 94.1% identity in a 444 aa overlap:
10 20 30 40 50 60
orf33a.pep MLNPSRKLVELVRILEEGGFIFSGDPVQATEALRRVDGSTEEKIIRRAKMIDRNRMLRET
|||||||||||||||:||||||||||||||||||||||||||||||||:|||||||||||
orf33-1 MLNPSRKLVELVRILDEGGFIFSGDPVQATEALRRVDGSTEEKIIRRAEMIDRNRMLRET
10 20 30 40 50 60
70 80 90 100 110 120
orf33a.pep LERVRAGSFWLWVAAATFAFXTXFSVTYLLMDNQGLNFFLVLAGVXGMNTLMLAVWLAML
||||||||||||||||:|||||| | ||||||||||||||||||| ||||||||||||||
orf33-1 LERVRAGSFWLWVVAATFAFFTGFSVTYLLMDNQGLNFFLVLAGVLGMNTLMLAVWLAML
70 80 90 100 110 120
130 140 150 160 170 180
orf33a.pep FLRVKVGRFFSSPATWFRGKDPVNQAVLRLYADEWRXPSVRWKIGATSHSLWLCTLLGML
|||||||||||||||||||||||||||||||||||| |||||||||||||||||||||||
orf33-1 FLRVKVGRFFSSPATWFRGKDPVNQAVLRLYADEWRQPSVRWKIGATSHSLWLCTLLGML
130 140 150 160 170 180
190 200 210 220 230 240
orf33a.pep VSVLLLLLVRQYTFNWESTLLGDSSSVRLVEMLAWLPAKLGFPVPDARAVIEGRLNGNIA
|||||||||||||||||||||::::||| ||||||||:||||||||||||||||||||||
orf33-1 VSVLLLLLVRQYTFNWESTLLSNAASVRAVEMLAWLPSKLGFPVPDARAVIEGRLNGNIA
190 200 210 220 230 240
250 260 270 280 290 300
orf33a.pep DARAWSGLLVGSIACYGILPRLLAWAVCKILXXTSENGLDLEKXXXXXXIRRWQNKITDA
|||||||||||||||||||||||||:||||| |||||||||| |||||||||||
orf33-1 DARAWSGLLVGSIACYGILPRLLAWVVCKILLKTSENGLDLEKPYYQAVIRRWQNKITDA
250 260 270 280 290 300
310 320 330 340 350 360
orf33a.pep DTRRETVSAVSPKIVLNDAPKWAVMLETEWQDGEWFEGRLAQEWLDKGVAANREQVAALE
||||||||||||||:|||||||||||||||||||||||||||||||||||:|||||||||
orf33-1 DTRRETVSAVSPKIILNDAPKWAVMLETEWQDGEWFEGRLAQEWLDKGVATNREQVAALE
310 320 330 340 350 360
370 380 390 400 410 420
orf33a.pep TELKQKPAQLLIGVRAQTVPDRGVLRQIVRLSEAAQGGAVVXLLAEQGLSDDLSEKLEHW
||||||||||||||||||||||||||||||||||||||||| ||||||||||||||||||
orf33-1 TELKQKPAQLLIGVRAQTVPDRGVLRQIVRLSEAAQGGAVVQLLAEQGLSDDLSEKLEHW
370 380 390 400 410 420
430 440 450
orf33a.pep RNALTECGAAWLEPDRAAQEGRLKTNDRTX
||||:|||||||||||||||||||
orf33-1 RNALAECGAAWLEPDRAAQEGRLKDQX
430 440
Homology with a predicted ORF from N. gonorrhoeae
ORF33 shows 91.6% identity over a 143 aa overlap with a predicted ORF (ORF33.ng) from N. gonorrhoeae:
orf33.pep LFLRVKVGRFFSSPATWFRXKDPVNQAVLR 30
||||||||||||||||||| | ||||||||
orf33ng LMDNQGLNFFLVLAGVLGMNTLMLAVWLATLFLRVKVGRFFSSPATWFRGKGPVNQAVLR 100
orf33.pep LYXDEWRXTSVRWKIXATSHSLWLCTLLGMLVSVLLLLLVRQYTFNWESTLLSNAASVRA 90
|| |:|| |||||| ||:|||||||||||||||||||||||||||||||||||||||||
orf33ng LYADQWRQPSVRWKIGATAHSLWLCTLLGMLVSVLLLLLVRQYTFNWESTLLSNAASVRA 160
orf33.pep VEMLAWLPSKLGFPVPDARSVIEGRLNGNIADARAWSGLLVXSIACXGILPRL 143
|||||||||||||||||||:||||||||||||||||||||| ||:| ||||||
orf33ng VEMLAWLPSKLGFPVPDARAVIEGRLNGNIADARAWSGLLVGSIVCYGILPRLLAWVVCK 220
An ORF33ng nucleotide sequence <SEQ ID 203> is predicted to encode a protein having amino acid sequence <SEQ ID 204>:
1 MIDRDRMLRD TLERVRAGSF WLWVVVASMM FTAGFSGTYL LMDNQGLNFF
51 LVLAGVLGMN TLMLAVWLAT LFLRVKVGRF FSSPATWFRG KGPVNQAVLR
101 LYADQWRQPS VRWKIGATAH SLWLCTLLGM LVSVLLLLLV RQYTFNWEST
151 LLSNAASVRA VEMLAWLPSK LGFPVPDARA VIEGRLNGNI ADARAWSGLL
201 VGSIVCYGIL PRLLAWVVCK ILLKTSENGL DLEKTYYQAV IRRWQNKITD
251 ADTRRETVSA VSPKIVLNDA PKWALMLETE WQDGQWFEGR LAQEWLDKGV
301 AANREQVAAL ETELKQKPAQ LLIGVRAQTV PDRGVLRQIV RLSEAAQGGA
351 VVQLLAEQGL SDDLSEKLEH WRNALTECGA AWLEPDRVAQ EGRLKDQ*
Further sequence analysis revealed the following DNA sequence <SEQ ID 205>:
1 ATGTTGaatC CATCCCgaAA ACTGgttgag ctGgTCCgtA Ttttgaataa
51 agggggtTTT attttcagcg gcgatcctgt gcaggcgacg gaggctttgc
101 gccgcgtgga cggcAGTACG GAggAaaaaa tcttccgtcg GGCGGAGAtg
151 atcgACAGGg accgtatgtt gcgggACaCg TtggaacGTG TGCGTGCggg
201 gtcgtTctgG TTATGGGTGG TggtggCAtC gATGATGTtt aCCGCCGGAT
251 TTTCAGgcac ttatCttCTG ATGGACaatC AGGGGCtGAA TtTCTTTTTA
301 GTTTTggcgG GAGTGTtggG CATGaatacG ctgATGCTGG CAGTATGGtt
351 gGCAACGTTG TTCCTGCGCG TGAAAGTGGG ACGGTTTTTC AGCAGTCCGG
401 CGACGTGGTT TCGGGGCAAA GGCCCTGTAA ATCAGGCGGT GTTGCGGCTG
451 TATGCGGACC AGTGGCGGCA ACCTTCGGTA CGATGGAAAA TAGGCGCAAC
501 GGCGCACAGC TTGTGGCTCT GCACGCTGCT CGGAATGCTG GTGTCGGTAT
551 TGCTGCTGCT TTTGGTGCGG CAATATACGT TCAACTGGGA AAGCACGCTG
601 TTGAGCAATG CCGCTTCGGT ACGCGCGGTG GAAATGTTGG CATGGCTGCC
651 GTCGAAACTC GGTTTCCCTG TCCCCGATGC GCGGGCGGTC ATCGAAGGTC
701 GTCTGAACGG CAATATTGCC GATGCGCGGG CTTGGTCGGG GCTGCTGGTC
751 GGCAGTATCG TCTGCTACGG CATCCTGCCG CGCCTCTTGG CTTGGGTAGT
801 GTGTAAAATC CTTTTGAAAA CAAGCGAAAA CGGattgGAT TTGGAAAAAA
851 CCTATTATCA GGCGGTCATC CGCCGCTGGC AGAACAAAAT CACCGATGCG
901 GATACGCGTC GGGAAACCGT GTCCGCCGTT TCGCcgaAAA TCGTCTTGAA
951 CGATGCGCCG AAATGGGCGC TCATGCTGGA GACCGAGTGG CAGGACGGCC
1001 AATGGTTCGA GGGCAGGCTG GCGCAGGAAT GGCTGGATAA GGGCGTTGCC
1051 GCCAATCGGG AACAGGTTGC CGCGCTGGAG ACAGAGCTGA AGCAGAAACC
1101 GGCGCAACTG CTTATCGGCG TACGCGCCCA AACTGTGCCG GACCGGGGCG
1151 TGCTGCGGCA GATTGTGCGG CTTTCGGAAG CGGCGCAGGG CGGCGCGGTG
1201 GTGCAGCTTT TGGCGGAACA GGGGCTTTCA GACGACCTTT CGGAAAAGCT
1251 GGAACATTGG CGTAACGCGC TGACCGAATG CGGCGCGGCG TGGCTTGAGC
1301 CTGACAGGGT GGCGCAGGAA GGCCGTTTGA AAGACCAATA A
This encodes a protein having amino acid sequence <SEQ ID 206; ORF33ng-1>:
1 MLNPSRKLVE LVRILNKGGF IFSGDPVQAT EALRRVDGST EEKIFRRAEM
51 IDRDRMLRDT LERVRAGSFW LWVVVASMMF TAGFSGTYLL MDNQGLNFFL
101 VLAGVLGMNT LMLAVWLATL FLRVKVGRFF SSPATWFRGK GPVNQAVLRL
151 YADQWRQPSV RWKIGATAHS LWLCTLLGML VSVLLLLLVR QYTFNWESTL
201 LSNAASVRAV EMLAWLPSKL GFPVPDARAV IEGRLNGNIA DARAWSGLLV
251 GSIVCYGILP RLLAWVVCKI LLKTSENGLD LEKTYYQAVI RRWQNKITDA
301 DTRRETVSAV SPKIVLNDAP KWALMLETEW QDGQWFEGRL AQEWLDKGVA
351 ANREQVAALE TELKQKPAQL LIGVRAQTVP DRGVLRQIVR LSEAAQGGAV
401 VQLLAEQGLS DDLSEKLEHW RNALTECGAA WLEPDRVAQE GRLKDQ*
ORF33ng-1 and ORF33-1 show 94.6% identity in a 446 aa overlap:
10 20 30 40 50 60
orf33-1.pep MLNPSRKLVELVRILDEGGFIFSGDPVQATEALRRVDGSTEEKIIRRAEMIDRNRMLRET
||||||||||||||::||||||||||||||||||||||||||||:||||||||:||||:|
orf33ng-1 MLNPSRKLVELVRILNKGGFIFSGDPVQATEALRRVDGSTEEKIFRRAEMIDRDRMLRDT
10 20 30 40 50 60
70 80 90 100 110 120
orf33-1.pep LERVRAGSFWLWVVAATFAFFTGFSVTYLLMDNQGLNFFLVLAGVLGMNTLMLAVWLAML
||||||||||||||:|:: | :||| |||||||||||||||||||||||||||||||| |
orf33ng-1 LERVRAGSFWLWVVVASMMFTAGFSGTYLLMDNQGLNFFLVLAGVLGMNTLMLAVWLATL
70 80 90 100 110 120
130 140 150 160 170 180
orf33-1.pep FLRVKVGRFFSSPATWFRGKDPVNQAVLRLYADEWRQPSVRWKIGATSHSLWLCTLLGML
|||||||||||||||||||| ||||||||||||:|||||||||||||:||||||||||||
orf33ng-1 FLRVKVGRFFSSPATWFRGKGPVNQAVLRLYADQWRQPSVRWKIGATAHSLWLCTLLGML
130 140 150 160 170 180
190 200 210 220 230 240
orf33-1.pep VSVLLLLLVRQYTFNWESTLLSNAASVRAVEMLAWLPSKLGFPVPDARAVIEGRLNGNIA
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf33ng-1 VSVLLLLLVRQYTFNWESTLLSNAASVRAVEMLAWLPSKLGFPVPDARAVIEGRLNGNIA
190 200 210 220 230 240
250 260 270 280 290 300
orf33-1.pep DARAWSGLLVGSIACYGILPRLLAWVVCKILLKTSENGLDLEKPYYQAVIRRWQNKITDA
|||||||||||||:||||||||||||||||||||||||||||| ||||||||||||||||
orf33ng-1 DARAWSGLLVGSIVCYGILPRLLAWVVCKILLKTSENGLDLEKTYYQAVIRRWQNKITDA
250 260 270 280 290 300
310 320 330 340 350 360
orf33-1.pep DTRRETVSAVSPKIILNDAPKWAVMLETEWQDGEWFEGRLAQEWLDKGVATNREQVAALE
||||||||||||||:||||||||:|||||||||:||||||||||||||||:|||||||||
orf33ng-1 DTRRETVSAVSPKIVLNDAPKWALMLETEWQDGQWFEGRLAQEWLDKGVAANREQVAALE
310 320 330 340 350 360
370 380 390 400 410 420
orf33-1.pep TELKQKPAQLLIGVRAQTVPDRGVLRQIVRLSEAAQGGAVVQLLAEQGLSDDLSEKLEHW
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf33ng-1 TELKQKPAQLLIGVRAQTVPDRGVLRQIVRLSEAAQGGAVVQLLAEQGLSDDLSEKLEHW
370 380 390 400 410 420
430 440
orf33-1.pep RNALAECGAAWLEPDRAAQEGRLKDQX
||||:|||||||||||:||||||||||
orf33ng-1 RNALTECGAAWLEPDRVAQEGRLKDQX
430 440
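Based on the presence of several putative transmembrane domains in the gonococcal protein, it is predicted that these proteins from N. meningitidis and N. gonorrhoeae, and their epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies.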
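The specification does not say how the putative transmembrane domains were predicted. As an illustration only, one common heuristic is a Kyte-Doolittle hydropathy scan: average the hydropathy index over a sliding window and flag windows above a cutoff. The window length of 19 and the cutoff of 1.6 below are conventional choices for this heuristic and are assumptions here, as are the function names.

```python
# Kyte-Doolittle hydropathy index for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_windows(sequence: str, window: int = 19) -> list[tuple[int, float]]:
    """Return (1-based start, mean hydropathy) for every full window."""
    scores = [KYTE_DOOLITTLE[residue] for residue in sequence]
    return [
        (start + 1, sum(scores[start:start + window]) / window)
        for start in range(len(scores) - window + 1)
    ]


def candidate_segments(sequence: str, window: int = 19,
                       threshold: float = 1.6) -> list[tuple[int, float]]:
    """Keep only the windows whose mean hydropathy reaches the threshold."""
    return [
        (start, round(mean, 2))
        for start, mean in hydropathy_windows(sequence, window)
        if mean >= threshold
    ]


# A 40-residue stretch of ORF33ng-1 taken from the listing above.
SEGMENT = "RWKIGATAHSLWLCTLLGMLVSVLLLLLVRQYTFNWESTL"

for start, mean in candidate_segments(SEGMENT):
    print(f"window starting at segment position {start}: mean hydropathy {mean}")
```

Windows that pass the cutoff are only candidates for membrane-spanning segments; the scan cannot handle undetermined residues (X) and says nothing about topology.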
Example 25
The following partial DNA sequence was identified in N. meningitidis <SEQ ID 207>:
1 ..CAGAAGAGTT TGTCGAGAAT TTCTTTATGG GGTTTGGGCG GCGTGTTTTT
51 CGGGGTGTCC GGTCTGGTAT GGTTTTCTTT GGGCGTTTCT TT.GAGTGCG
101 CCTGTTTTTC GGGTGTTTCT TTTCGGGGTT CGGGACGGGG GACGTTTGTG
151 GGCAGTACGG GGGTTTCTTT GAGTGTGTTT TCAGCTTGTG TTCC.GGCGT
201 CGTCCGGCTG CCTGTCGGTT TGAGCTGTGT CGGCAGGTTG CG..GTTTGA
251 CCCGGTTTTT CTTGGGTGCG GCAGGGGACG TCATTCTCCT GCCGCTTTCG
301 TCTGTGCCGT CCGGCTGTGC GGGTTCGGAT GAGGCGGCGT GGTGGTGTTC
351 GGGTTGGGCG GCATCTTGT CCGACTACGC CGTTTGGCAG CCAGAATTCG
401 GTTTCGCGGG GGCTGTCGGT GTGTTGCGGT TCGGCTTGAA GGGTTTTGTC
451 GTCC..
This corresponds to the amino acid sequence <SEQ ID 208; ORF34>:
1 ..QKSLSRISLW GLGGVFFGVS GLVWFSLGVS XECACFSGVS FRGSGRGTFV
51 GSTGVSLSVF SACVXGVVRL PVGLSCVGRL XXLTRFFLGA AGDVILLPLS
101 SVPSGCAGSD EAAWWCSGWA ASCPTTPFGS QNSVSRGLSV CCGSA*RVLS
151 S..
Further work revealed the complete nucleotide sequence <SEQ ID 209>:
1 ATGATGATGC CGTTCATAAT GCTTCCTTGG ATTGCkGGTG TGCCTGCCGT
51 GCCGGGTCAG AATAGGTTGT CCAGAATTTC TTTATGGGGT TTGGGCGGCG
101 TGTTTTTCGG GGTGTCCGGT TTGGTATGGT TTTCTTTGGG CGTTTCTTTG
151 GGCTGCGCCT GTTTTTCGGG TGTTTCTTTT CGGGGTTCGG GACGGGGGAC
201 GTTTGTGGGC AGTACGGGGG TTTCTTTGAG TGTGTTTTCA GCTTGTGTTC
251 CGGCGTCGTC CGGCTGCCTG TCGGTTTGAG CTGTGTCGGC AGGTTGCGGT
301 TTGACCCGGT TTTTCTTGGG TGCGGCAGGG GACGGCAGTC CGCTGCCGCT
351 TTCGTCTGTG CCGTCCGGCT GTGCGGGTTC GGATGAGGCG GCGTGGTGGT
401 GTTCGGGTTG GGCGGCATCT TGTCCGACTA CGCCGTTTGG CAGCCAGAAT
451 TCGGTTTCGC GGGGGCTGTC GGTGTGTTGC GGTTCGGCTT GAAGGGTTTT
501 GTCGCCGTTC GGGTTGAATG TGCTGACGAT GCCTATTGCC AATGCGCCGA
551 TGGCGGCGAT ACAGATGAGC AATACGGCGC GTATCAGGAG TTTGGGGGTC
601 AGCCTGAAGG GTTTGTTCGG TTTTTTTGCC ATTTTGATTG TGCTTTTGGG
651 GTGTCGGGCA ATGCCGTCTG AAGGCGGTTC AGACGGCATT GCCGAGTCAG
701 CGTTGGACGT AGTTTTGGTA GAGGGTGATG ACTTTTTGTA CGCCGACGGT
751 GGTGCTGACT TTTTGGGTAA TCTGCGCCTG TTCTTCGGGG GTGAGGATGC
801 CCATAACGTA GGTTACGTTG CCGTAGGTAA CGATTTTGAC GCGCGCCTGT
851 GTGGCGGGGC TGATGCCCAA CAGCGTGGCG CGGACTTTGG ATGTGTTCCA
901 AGTGTCGCCG GCGATGTCGC CGGCAGTGCG CGGCAGGGAG GCGACGGTAA
951 TATAGTTGTA CACGCCTTCG GCGGCCTGTT CGGAACGTGC AATCTGACCG
1001 ACGAACTGTT TTTCGCCTTC GGTGGCGACT TGTCCGAGCA GCAGCAGGTG
1051 GCGGTTGTAG CCGACGACGG AGATTTGGGG CGTGTAGCCT TTGGTTTGGT
1101 TGTTTTGGCG CAGATAGGAA CGGGCGGTGG TTTCGATACG CAACGCCATA
1151 ACGTTGTCGT CGGTTTGCGC GCCGGTGGTT CGGCGGTCGA CGGCGGATTT
1201 CGCGCCGACG GCGGCGCTTC CGATTACTGC GCTGACGCAG CCGCTAAGGG
1251 CAAGGCTGAA AATGGCGGCA ATCAGGGTGC GGACGGTGTG CGGTTTGGGT
1301 TTCATCGGGT GCTTCCTTTC TTGGGCGTTT CAGACGGCAT TGCTTTGCGC
1351 CATGCCGTCT GA
This corresponds to the amino acid sequence <SEQ ID 210; ORF34-1>:
1 MMMPFIMLPW IAGVPAVPGQ NRLSRISLWG LGGVFFGVSG LVWFSLGVSL
51 GCACFSGVSF RGSGRGTFVG STGVSLSVFS ACVPASSGCL SV*AVSAGCG
101 LTRFFLGAAG DGSPLPLSSV PSGCAGSDEA AWWCSGWAAS CPTTPFGSQN
151 SVSRGLSVCC GSA*RVLSPF GLNVLTMPIA NAPMAAIQMS NTARIRSLGV
201 SLKGLFGFFA ILIVLLGCRA MPSEGGSDGI AESALDVVLV EGDDFLYADG
251 GADFLGNLRL FFGGEDAHNV GYVAVGNDFD ARLCGGADAQ QRGADFGCVP
301 SVAGDVAGSA RQGGDGNIVV HAFGGLFGTC NLTDELFFAF GGDLSEQQQV
351 AVVADDGDLG RVAFGLVVLA QIGTGGGFDT QRHNVVVGLR AGGSAVDGGF
401 RADGGASDYC ADAAAKGKAE NGGNQGADGV RFGFHRVLPF LGVSDGIALR
451 HAV*
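Computer analysis of this amino acid sequence gave the following results:
Homology with a predicted ORF from N. meningitidis (strain A)
ORF34 shows 73.3% identity over a 161 aa overlap with an ORF (ORF34a) from strain A of N. meningitidis: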
10 20 30
orf34.pep QKSLSRISLWGLGGVFFGVSGLVWFSLGVSXE------CAC
|| ||| |||||| |||||||||| ||||| ||| |||
orf34a MMXPXIMLPWIAGVPAVPGQKRLSRXSLWGLGGXFFGVSGLVWFSLGVSXSLGVSXGCAC
10 20 30 40 50 60
40 50 60 70 80 90
orf34.pep FSGVSFRGSGRGTFVGSTGVSLSVFSACVXGVVRLPVGLSCVGRLXX-----LTRFFLGA
|||| |||||||| ||||||||||||||||: |:: :|:: ||| | ||
orf34a FSGVSFRGSGRGTFVGSTGVSLSVFSACA------PASSGCLSVXAVSAGCGLTRXFXGA
70 80 90 100 110
100 110 120 130 140 150
orf34.pep AGDVILLPLSSVPSGCAGSDEAAWWCSGWAASCPTTPFGSQNSVSRGLSVCCGSAXRVLS
||| ||||||||||||:|| | |||||||||||||||||||||||||||||: ||||
orf34a AGDGSPLPLSSVPSGCAGADEEAXXCSGWAASCPTTPFGSQNSVSRGLSVCCGSVWRVLS
120 130 140 150 160 170
orf34.pep S
orf34a PFGXNVLTMPIANAPMAVIQMSNTARIRSLGVSLKGLFXFFAILIVLLGCRAMPSEGGSD
180 190 200 210 220 230
The complete length ORF34a nucleotide sequence <SEQ ID 211> is:
1 ATGATGATNC CGTTNATAAT GCTTCCTTGG ATTGCGGGTG TGCCTGCCGT
51 GCCGGGTCAG AAGAGGTTGT CGAGAANTTC TTTATGGGGT TTAGGCGGCN
101 TGTTTTTCGG GGTGTCCGGT TTGGTATGGT TTTCTTTGGG CGTTTCTNTT
151 TCTTTGGGTG TTTCTNTGGG CTGTGCCTGT TTTTCGGGTG TTTCTTTTCG
201 GGGTTCGGGA CGGGGGACGT TTGTGGGCAG TACNGGGGTT TCTTTGAGTG
251 TGTTTTCAGC TTGTGCTCCG GCGTCGTCCG GCTGCCTGTC GGTTTNAGCT
301 GTGTCGGCAG GTTGCGGTTT GACCCGGNTT TTCTTNGGTG CGGCAGGGGA
351 CGGCAGTCCG CTGCCGCTTT CGTCTGTGCC GTCCGGCTGT GCGGGTGCGG
401 ATGAGGAGGC GTNGTNGTGT TCGGGTTGGG CGGCATCTTG TCCGACTACG
451 CCGTTTGGCA GCCAGAATTC GGTTTCGCGG GGGCTGTCGG TGTGTTGCGG
501 TTCGGTNTGG AGGGTTTTGT CNCCGTTCGG GTNGAATGTG CTGACGATGC
551 CTATTGCCAA TGCGCCGATG GCGGTGATAC AGATGAGCAA TACGGCGCGT
601 ATCAGGAGTT TGGGGGTCAG CCTGAAGGGT TTGTTCNGTT TTTTTGCCAT
651 TTTGATTGTG CTTTTGGGGT GTCGGGCAAT GCCGTCTGAA GGCGGTTCAG
701 ACGGCATTGC CGAGTCAGCG TTGGACGTAG TTTNGGTAGA GGGTGATGAC
751 TTTTTGTACG CCGACGGTGG TGCTGACTTT TTGGGTAATC TGCGCCTGTT
801 CTTCGGGGGT GAGGATGCCC ATAACGTAGG TTACGTTGCC GTAGGTAACG
851 ATTTTGACGC GCGCCTGTGT GGCGGGGCTG ATGCCCAACA GCGTGGCGCG
901 GACTTTGGAT GTGTTCCAAG TGTCGCCGGC GATGTCGCCG GCAGTGCGCG
951 GCAGGGAGGC GACGGTAATG TANTTGTACA CGCCTTCGGC GGCCTGTTCG
1001 GAACGTGCAA TCTGACCGAC GAACTGTTTC TCGCCTTCGG TGGCGACTTG
1051 TCCGAGCAGC AGCAGGTGGC GGTTGTAGCC GACAACGGAG ATTTGGGGCG
1101 TGTANCCTTT GGTTTGGTTG TTTTGGCGCA GATAGGAGCG GGCGGTGGTT
1151 TCGATACGCA GCGCCATTAC GTTGTCGTCG GTTNGCGCGC CGGTGGTTCG
1201 GCGGTCGACG GCGGATTTCG CGCCGACCGC CGCGCCGCCG ACGACTGCGC
1251 TGACGCAGCC GCCGAGGGCA AGGCTGAGGA CGGCGGCAGT CAGGGTGCGG
1301 ACGGTGTGCG GTTTGGGTTT CATCGGGTGC TTCCTTTCTT GGGCGTTTCA
1351 GACGGCATTG CTTTGCGCCA TGCCGTCTGA
This encodes a protein having amino acid sequence <SEQ ID 212>:
1 MMXPXIMLPW IAGVPAVPGQ KRLSRXSLWG LGGXFFGVSG LVWFSLGVSX
51 SLGVSXGCAC FSGVSFRGSG RGTFVGSTGV SLSVFSACAP ASSGCLSVXA
101 VSAGCGLTRX FXGAAGDGSP LPLSSVPSGC AGADEEAXXC SGWAASCPTT
151 PFGSQNSVSR GLSVCCGSVW RVLSPFGXNV LTMPIANAPM AVIQMSNTAR
201 IRSLGVSLKG LFXFFAILIV LLGCRAMPSE GGSDGIAESA LDVVXVEGDD
251 FLYADGGADF LGNLRLFFGG EDAHNVGYVA VGNDFDARLC GGADAQQRGA
301 DFGCVPSVAG DVAGSARQGG DGNVXVHAFG GLFGTCNLTD ELFLAFGGDL
351 SEQQQVAVVA DNGDLGRVXF GLVVLAQIGA GGGFDTQRHY VVVGXRAGGS
401 AVDGGFRADR RAADDCADAA AEGKAEDGGS QGADGVRFGF HRVLPFLGVS
451 DGIALRHAV*
ORF34a and ORF34-1 show 91.3% identity in a 459 aa overlap:
10 20 30 40 50 60
orf34a.pep MMXPXIMLPWIAGVPAVPGQKRLSRXSLWGLGGXFFGVSGLVWFSLGVSXSLGVSXGCAC
|| | |||||||||||||||:|||| ||||||| ||||||||||||||| ||||
orf34-1 MMMPFIMLPWIAGVPAVPGQNRLSRISLWGLGGVFFGVSGLVWFSLGVSL------GCAC
10 20 30 40 50
70 80 90 100 110 120
orf34a.pep FSGVSFRGSGRGTFVGSTGVSLSVFSACAPASSGCLSVXAVSAGCGLTRXFXGAAGDGSP
||||||||||||||||||||||||||||:|||||||||||||||||||| | ||||||||
orf34-1 FSGVSFRGSGRGTFVGSTGVSLSVFSACVPASSGCLSVXAVSAGCGLTRFFLGAAGDGSP
60 70 80 90 100 110
130 140 150 160 170 180
orf34a.pep LPLSSVPSGCAGADEEAXXCSGWAASCPTTPFGSQNSVSRGLSVCCGSVWRVLSPFGXNV
||||||||||||:|| | |||||||||||||||||||||||||||||: ||||||| ||
orf34-1 LPLSSVPSGCAGSDEAAWWCSGWAASCPTTPFGSQNSVSRGLSVCCGSAXRVLSPFGLNV
120 130 140 150 160 170
190 200 210 220 230 240
orf34a.pep LTMPIANAPMAVIQMSNTARIRSLGVSLKGLFXFFAILIVLLGCRAMPSEGGSDGIAESA
|||||||||||:|||||||||||||||||||| |||||||||||||||||||||||||||
orf34-1 LTMPIANAPMAAIQMSNTARIRSLGVSLKGLFGFFAILIVLLGCRAMPSEGGSDGIAESA
180 190 200 210 220 230
250 260 270 280 290 300
orf34a.pep LDVVXVEGDDFLYADGGADFLGNLRLFFGGEDAHNVGYVAVGNDFDARLCGGADAQQRGA
|||| |||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf34-1 LDVVLVEGDDFLYADGGADFLGNLRLFFGGEDAHNVGYVAVGNDFDARLCGGADAQQRGA
240 250 260 270 280 290
310 320 330 340 350 360
orf34a.pep DFGCVPSVAGDVAGSARQGGDGNVXVHAFGGLFGTCNLTDELFLAFGGDLSEQQQVAVVA
|||||||||||||||||||||||: ||||||||||||||||||:||||||||||||||||
orf34-1 DFGCVPSVAGDVAGSARQGGDGNIVVHAFGGLFGTCNLTDELFFAFGGDLSEQQQVAVVA
300 310 320 330 340 350
370 380 390 400 410 420
orf34a.pep DNGDLGRVXFGLVVLAQIGAGGGFDTQRHYVVVGXRAGGSAVDGGFRADRRAADDCADAA
|:|||||| ||||||||||:||||||||| |||| |||||||||||||| |:| |||||
orf34-1 DDGDLGRVAFGLVVLAQIGTGGGFDTQRHNVVVGLRAGGSAVDGGFRADGGASDYCADAA
360 370 380 390 400 410
430 440 450 460
orf34a.pep AEGKAEDGGSQGADGVRFGFHRVLPFLGVSDGIALRHAVX
|:||||:||:||||||||||||||||||||||||||||||
orf34-1 AKGKAENGGNQGADGVRFGFHRVLPFLGVSDGIALRHAVX
420 430 440 450
Homology with a predicted ORF from N. gonorrhoeae
ORF34 shows 77.6% identity over a 161 aa overlap with a predicted ORF (ORF34.ng) from N. gonorrhoeae:
orf34.pep QKSLSRISLWGLGGVFFGVSGLVWFSLGVSXE------CAC 35
|| |||||||||:||||||||||||||||| |||
orf34ng MMMPFIMLPWIAGVPAVPGQKRLSRISLWGLAGVFFGVSGLVWFSLGVSFSLGVSLGCAC 60
orf34.pep FSGVSFRGSGRGTFVGSTGVSLSVFSACVXGVVRLPVGLSCV-----GRLXXLTRFFLGA 90
|||||||||| |:|||||||||||||||| :||: | : || ||||||||
orf34ng FSGVSFRGSGWGAFVGSTGVSLSVFSACVP----VPVNESAARAASEGR--GLTRFFLGA 114
orf34.pep AGDVILLPLSSVPSGCAGSDEAAWWCSGWAASCPTTPFGSQNSVSRGLSVCCGSAXRVLS 150
||| |||||||||||||||||||||||||||||:||||||||||||||||||: ||||
orf34ng AGDGSPLPLSSVPSGCAGSDEAAWWCSGWAASCPTAPFGSQNSVSRGLSVCCGSVWRVLS 174
orf34.pep S 175
orf34ng PFGLNVLTMPTANAPMAVIQMSNTARIRSLGVSLKGLFGFFAILIVLLGCRAMPSEGGSD 234
The complete length ORF34ng nucleotide sequence <SEQ ID 213> is:
1 ATGATGATGC CGTTCATAAT GCTTCCTTGG ATTGCGGGTG TGCCTGCCGT
51 GCCGGGTCAA AAGAGGTTGT CGAGAATCTC TTTATGGGGT TTGGCCGGCG
101 TGTTTTTCGG GGTGTCCGGT TTGGTATGGT TTTCTTTGGG CGTTTCTTTT
151 TCTTTGGGTG TTTCTTTGGG CTGCGCCTGT TTTTCGGGTG TTTCTTTTCG
201 GGGTTCGGGA TGGGGGGCGT TTGTGGGCAG TACGGGGGTT TCTTTGAGTG
251 TGTTTTCAGC TTGTGTTCCG GTGCCGGTTA ACGAATCGGC TGCCCGGGCC
301 GCATCCGAAG GGCGCGGTTT gACCCGGTTT TTCTTGGGTG CGGCAGGGGA
351 CGGCAGTCCG CTGCCGCTTT CTTCTGTGCC GTCCGGCTGT GCGGGTTCGG
401 ATGAGGCGGC GTGGTGGTGT TCGGGTTGGG CGGCATCTTG TCCGACGGCG
451 CCGTTTGGCA GCCAGAATTC GGTTTCGCGG GGGCTGTCGG TGTGTTGCGG
501 TTCGGTTTGG AGGGTTTTGT CGCCGTTCGG GTTGAATGTG CTGACGATGC
551 CTACTGCCAA TGCGCCGATG GCGGTGATAC AGATGAGCAA TACGGCGCGT
601 ATCAGGAGTT TGGGGGTCAG CCTGAAGGGT TTGTTCGGTT TTTTTGCCAT
651 TTTGATTGTG CTTTTGGGGT GTCGGGCAAT GCCGTCTGAA GGCGGTTCAG
701 ACGGCATTGC CGAGTCAGCG TTGGACGTAG TTTTGGTAGA GGGTAATGAC
751 TTTTTGTACG CCGAcggTGG TGCTGACTTT TTGGGTAATC TGCGCCTGTT
801 CTTCGGGGGT GAGGATGCCC ATAACGTAGG TTACATTGCC GTAGGTAATG
851 ATTTTGACGC GCGCCTGTGT AGCGGGGCTG ATGCCCAGCA GcgtgGCGCG
901 GACTTTGGAC GTGTTCCAAG TGTCGCCGGC GATGTCGCCC GCAGTGCGCG
951 GCAGGGAGGC GACGGTAATG TAGTTGTATA CGCCTTCGGC GGCCTGTTCG
1001 GAACGTGCAA TCTGACCGAC GAACTGTTTT TCGCCTTCGG TGGCGACTTG
1051 TCCGAGCAGC AGCAGGTGGC GGTTGTAGCC GACGACGGAG ATTTGGGGCG
1101 TGTAGCCTTT GGTTTGGTTG TTTTGGCGCA GGTAGGAACG GGCGGTGGTT
1151 TCGATACGCA ACGCCATAAC GTtgtCATCG GTTtgcgcgc CGGTGGTTcg
1201 gCGGTCGATG ACGGATTTTG CGCCGACGGC GGCCCCGCCG ACGACTGCGC
1251 TGAAGCAGCC GCCGAGGGCA AGGCTGAGGA CGGCGGCAAT CAGGGTGCGG
1301 ACGGTGTGTG GTTTGGGTTT CATCGGGGAC TTCCTTTCTT GGGCGTTTCA
1351 GACGGCATTG CTTTGCGCCA TGCCGTCTGA
This encodes a protein having amino acid sequence <SEQ ID 214>:
1 MMMPFIMLPW IAGVPAVPGQ KRLSRISLWG LAGVFFGVSG LVWFSLGVSF
51 SLGVSLGCAC FSGVSFRGSG WGAFVGSTGV SLSVFSACVP VPVNESAARA
101 ASEGRGLTRF FLGAAGDGSP LPLSSVPSGC AGSDEAAWWC SGWAASCPTA
151 PFGSQNSVSR GLSVCCGSVW RVLSPFGLNV LTMPTANAPM AVIQMSNTAR
201 IRSLGVSLKG LFGFFAILIV LLGCRAMPSE GGSDGIAESA LDVVLVEGND
251 FLYADGGADF LGNLRLFFGG EDAHNVGYIA VGNDFDARLC SGADAQQRGA
301 DFGRVPSVAG DVARSARQGG DGNVVVYAFG GLFGTCNLTD ELFFAFGGDL
351 SEQQQVAVVA DDGDLGRVAF GLVVLAQVGT GGGFDTQRHN VVIGLRAGGS
401 AVDDGFCADG GPADDCAEAA AEGKAEDGGN QGADGVWFGF HRGLPFLGVS
451 DGIALRHAV*
ORF34ng and ORF34-1 show 90.0% identity in a 459 aa overlap:
10 20 30 40 50
orf34-1.pep MMMPFIMLPWIAGVPAVPGQNRLSRISLWGLGGVFFGVSGLVWFSLGVS------LGCAC
||||||||||||||||||||:||||||||||:||||||||||||||||| |||||
orf34ng MMMPFIMLPWIAGVPAVPGQKRLSRISLWGLAGVFFGVSGLVWFSLGVSFSLGVSLGCAC
10 20 30 40 50 60
60 70 80 90 100 110
orf34-1.pep FSGVSFRGSGRGTFVGSTGVSLSVFSACVPASSGCLSVXAVSAGCGLTRFFLGAAGDGSP
|||||||||| |:|||||||||||||||||: : :: |:| | |||||||||||||||
orf34ng FSGVSFRGSGWGAFVGSTGVSLSVFSACVPVPVNESAARAASEGRGLTRFFLGAAGDGSP
70 80 90 100 110 120
120 130 140 150 160 170
orf34-1.pep LPLSSVPSGCAGSDEAAWWCSGWAASCPTTPFGSQNSVSRGLSVCCGSAXRVLSPFGLNV
|||||||||||||||||||||||||||||:||||||||||||||||||: ||||||||||
orf34ng LPLSSVPSGCAGSDEAAWWCSGWAASCPTAPFGSQNSVSRGLSVCCGSVWRVLSPFGLNV
130 140 150 160 170 180
180 190 200 210 220 230
orf34-1.pep LTMPIANAPMAAIQMSNTARIRSLGVSLKGLFGFFAILIVLLGCRAMPSEGGSDGIAESA
|||| ||||||:||||||||||||||||||||||||||||||||||||||||||||||||
orf34ng LTMPTANAPMAVIQMSNTARIRSLGVSLKGLFGFFAILIVLLGCRAMPSEGGSDGIAESA
190 200 210 220 230 240
240 250 260 270 280 290
orf34-1.pep LDVVLVEGDDFLYADGGADFLGNLRLFFGGEDAHNVGYVAVGNDFDARLCGGADAQQRGA
||||||||:|||||||||||||||||||||||||||||:|||||||||||:|||||||||
orf34ng LDVVLVEGNDFLYADGGADFLGNLRLFFGGEDAHNVGYIAVGNDFDARLCSGADAQQRGA
250 260 270 280 290 300
300 310 320 330 340 350
orf34-1.pep DFGCVPSVAGDVAGSARQGGDGNIVVHAFGGLFGTCNLTDELFFAFGGDLSEQQQVAVVA
||| |||||||||||||||||||:||:|||||||||||||||||||||||||||||||||
orf34ng DFGRVPSVAGDVARSARQGGDGNVVVYAFGGLFGTCNLTDELFFAFGGDLSEQQQVAVVA
310 320 330 340 350 360
360 370 380 390 400 410
orf34-1.pep DDGDLGRVAFGLVVLAQIGTGGGFDTQRHNVVVGLRAGGSAVDGGFRADGGASDYCADAA
|||||||||||||||||:||||||||||||||:|||||||||| || |||| :| ||:||
orf34ng DDGDLGRVAFGLVVLAQVGTGGGFDTQRHNVVIGLRAGGSAVDDGFCADGGPADDCAEAA
370 380 390 400 410 420
420 430 440 450
orf34-1.pep AKGKAENGGNQGADGVRFGFHRVLPFLGVSDGIALRHAVX
|:||||:||||||||| ||||| |||||||||||||||||
orf34ng AEGKAEDGGNQGADGVWFGFHRGLPFLGVSDGIALRHAVX
430 440 450 460
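The six-residue gap near the start of the ORF34-1 / ORF34ng comparison shows why these alignments need gap handling rather than a simple position-by-position comparison. The program used to produce the alignments is not named in this section, so the following is only a sketch of the general idea: a plain Needleman-Wunsch global alignment with a linear gap penalty. The scoring values (+1, -1, -2) and the function name are assumptions for illustration.

```python
def global_align(a: str, b: str, match: int = 1, mismatch: int = -1,
                 gap: int = -2) -> tuple[str, str, int]:
    """Needleman-Wunsch global alignment with a linear gap penalty."""
    n, m = len(a), len(b)
    # score[i][j] = best score for aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            pair = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + pair,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)

    # Trace back from the bottom-right corner, preferring the diagonal.
    top, bottom = [], []
    i, j = n, m
    while i > 0 or j > 0:
        pair = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + pair:
            top.append(a[i - 1]); bottom.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            top.append(a[i - 1]); bottom.append("-"); i -= 1
        else:
            top.append("-"); bottom.append(b[j - 1]); j -= 1
    return "".join(reversed(top)), "".join(reversed(bottom)), score[n][m]


# Short stretches around the gap, copied from the ORF34-1 and ORF34ng rows.
ORF34_1_PART = "WFSLGVSLGCACFSG"
ORF34NG_PART = "WFSLGVSFSLGVSLGCACFSG"

aligned_top, aligned_bottom, total = global_align(ORF34_1_PART, ORF34NG_PART)
print(aligned_top)
print(aligned_bottom)
print("score:", total)
```

Because the inserted residues repeat part of the flanking sequence (FSLGVS), several gap placements score equally, so the exact position of the gap in the output may differ from the one printed in the alignment above. Production tools also use substitution matrices and affine gap penalties, which this sketch omits.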
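Based on this analysis, including the presence of a putative leader sequence (double-underlined) and several putative transmembrane domains (single-underlined) in the gonococcal protein, it is predicted that these proteins from N. meningitidis and N. gonorrhoeae, and their epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies.
Example 26
The following partial DNA sequence was identified in N. meningitidis <SEQ ID 215>: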
1 ATGAAAACCT TCTTCAAAAC CCTTTCCGCC GCCGCACTCG CGCTCATCCT
51 CGCCGCCTGC GGATT.CAAA AAGACAGCGC GCCCGCCGCA TCCGCTTCTG
101 CCGCCGCCGA CAACGGCGCG GCG AAAAAA GAAATCGTCT TCGGCACGAC
151 CGTCGGCGAC TTCGGCGATA TGGTCAAAGA ACAAATCCAA GCCGAGCTGG
201 AGAAAAAAGG CTACACCGTC AAACTGGTCG AGTTTACCGA CTATGTACGC
251 CCGAATCTGG CATTGGCTGA GGGCGAGTTG
This corresponds to the amino acid sequence <SEQ ID 216; ORF4>:
1 MKTFFKTLSA AALALILAAC G.QKDSAPAA SASAAADNGA AKKEIVFGTT
51 VGDFGDMVKE QIQAELEKKG YTVKLVEFTD YVRPNLALAE GEL
Further sequence analysis revealed the complete nucleotide sequence <SEQ ID 217>:
1 ATGAAAACCT TCTTCAAAAC CCTTTCCGCC GCCGCACTCG CGCTCATCCT
51 CGCCGCCTGC GGCGGTCAAA AAGACAGCGC GCCCGCCGCA TCCGCTTCTG
101 CCGCCGCCGA CAACGGCGCG GCGAAAAAAG AAATCGTCTT CGGCACGACC
151 GTCGGCGACT TCGGCGATAT GGTCAAAGAA CAAATCCAAG CCGAGCTGGA
201 GAAAAAAGGC TACACCGTCA AACTGGTCGA GTTTACCGAC TATGTACGCC
251 CGAATCTGGC ATTGGCTGAG GGCGAGTTGG ACATCAACGT CTTCCAACAC
301 AAACCCTATC TTGACGACTT CAAAAAAGAA CACAATCTGG ACATCACCGA
351 AGTCTTCCAA GTGCCGACCG CGCCTTTGGG ACTGTACCCG GGCAAGCTGA
401 AATCGCTGGA AGAAGTCAAA GACGGCAGCA CCGTATCCGC GCCCAACGAC
451 CCGTCCAACT TCGCCCGCGT CTTGGTGATG CTCGACGAAC TGGGTTGGAT
501 CAAACTCAAA GACGGCATCA ATCCGTTGAC CGCATCCAAA GCGGACATCG
551 CCGAGAACCT GAAAAACATC AAAATCGTCG AGCTTGAAGC CGCGCAACTG
601 CCGCGTAGCC GCGCCGACGT GGATTTTGCC GTCGTCAACG GCAACTACGC
651 CATAAGCAGC GGCATGAAGC TGACCGAAGC CCTGTTCCAA GAACCGAGCT
701 TTGCCTATGT CAACTGGTCT GCCGTCAAAA CCGCCGACAA AGACAGCCAA
751 TGGCTTAAAG ACGTAACCGA GGCCTATAAC TCCGACGCGT TCAAAGCCTA
801 CGCGCACAAA CGCTTCGAGG GCTACAAATC CCCTGCCGCA TGGAATGAAG
851 GCGCAGCCAA ATAA
This corresponds to the amino acid sequence <SEQ ID 218; ORF4-1>:
1 MKTFFKTLSA AALALILAAC GGQKDSAPAA SASAAADNGA AKKEIVFGTT
51 VGDFGDMVKE QIQAELEKKG YTVKLVEFTD YVRPNLALAE GELDINVFQH
101 KPYLDDFKKE HNLDITEVFQ VPTAPLGLYP GKLKSLEEVK DGSTVSAPND
151 PSNFARVLVM LDELGWIKLK DGINPLTASK ADIAENLKNI KIVELEAAQL
201 PRSRADVDFA VVNGNYAISS GMKLTEALFQ EPSFAYVNWS AVKTADKDSQ
251 WLKDVTEAYN SDAFKAYAHK RFEGYKSPAA WNEGAAK*
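Computer analysis of this amino acid sequence gave the following results:
Homology with a predicted ORF from N. meningitidis (strain A)
ORF4 shows 93.5% identity over a 93 aa overlap with an ORF (ORF4a) from strain A of N. meningitidis: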
10 20 30 40 50 59
orf4.pep MKTFFKTLSAAALALILAACG-QKDSAPAASASAAADNGAAKKEIVFGTTVGDFGDMVKE
||||||||||||||||||| || ||||||||||||||||||| ||||||||||||||||||
orf4a MKTFFKTLSAAALALILAACGGQKDSAPAASASAAADNGAAXKEIVFGTTVGDFGDMVKE
10 20 30 40 50 60
60 70 80 90
orf4.pep QIQAELEKKGYTVKLVEFTDYVRPNLALAEGEL
|| ||||||||||||| ||||| |||||||||
orf4a XIQPELEKKGYTVKLVEXTDYVRXNLALAEGELDINVXQHXXYLDDXKKXHNLDITXVXQ
70 80 90 100 110 120
orf4a VPTAPLGLYPGKLKSLXXVKXGSTVSAPNDPXXFXRVLVMLDELGXIKLKDXIXXXXXXX
130 140 150 160 170 180
The complete length ORF4a nucleotide sequence <SEQ ID 219> is:
1 ATGAAAACCT TCTTCAAAAC CCTTTCCGCC GCCGCACTCG CGCTCATCCT
51 CGCCGCCTGC GGCGGTCAAA AAGATAGCGC GCCCGCCGCA TCCGCTTCTG
101 CCGCCGCCGA CAACGGCGCG GCGAANAAAG AAATCGTCTT CGGCACGACC
151 GTCGGCGACT TCGGCGATAT GGTCAAAGAA CANATCCAAC CCGAGCTGGA
201 GAAAAAAGGC TACACCGTCA AACTGGTCGA GTNTACCGAC TATGTGCGCN
251 CGAATCTGGC ATTGGCTGAG GGCGAGTTGG ACATCAACGT CTTNCAACAC
301 ANACNCTATC TTGACGACTN CAAAAAANAA CACAATCTGG ACATCACCNN
351 AGTCTTNCAA GTGCCGACCG CGCCTTTGGG ACTGTACCCG GGCAAGCTGA
401 AATCGCTGGA NNAAGTCAAA GANGGCAGCA CCGTATCCGC GCCCAACGAC
451 CCGTNNNACT TCGNCCGCGT CTTGGTGATG CTCGACGAAC TGGGTTNGAT
501 CAAACTCAAA GACNGCATCA NNNNGNNGNN NNNANCNANA NNNGANANNN
551 NNNNANNNNT NNNNNNNNNN NNNNNCNNCG NNNNNNNANN NNNNNNNNNN
601 NCGNNTNNNN NNGCNNNNNT NNANNNTNNN NNCNNCNNNN NNNNNTNNNN
651 NANNANNAGC GGCATGAAGC TGACCGAAGC CCTGTTCCAA GAACCGAGCT
701 TTGCCTATGT CAACTGGTCT GCCGTCAAAA CCGCCGACAA AGACAGCCAA
751 TGGCTTAAAG ACGTAACCGA GGCCTATAAC TCCGACGCGT TCAAAGCCTA
801 CGCGCACAAA CGCTTCGAGG GCTACAAATC CCCTGCCGCA TGGAATGAAG
851 GCGCAGCCAA ATAA
This is predicted to encode a protein having amino acid sequence <SEQ ID 220>:
1 MKTFFKTLSA AALALILAAC GGQKDSAPAA SASAAADNGA AXKEIVFGTT
51 VGDFGDMVKE XIQPELEKKG YTVKLVEXTD YVRXNLALAE GELDINVXQH
101 XXYLDDXKKX HNLDITXVXQ VPTAPLGLYP GKLKSLXXVK XGSTVSAPND
151 PXXFXRVLVM LDELGXIKLK DXIXXXXXXX XXXXXXXXXX XXXXXXXXXX
201 XXXXAXXXXX XXXXXXXXXS GMKLTEALFQ EPSFAYVNWS AVKTADKDSQ
251 WLKDVTEAYN SDAFKAYAHK RFEGYKSPAA WNEGAAK*
The leader peptide is underlined.
Further analysis of these strain A sequences revealed the complete DNA sequence <SEQ ID 221>:
1 ATGAAAACCT TCTTCAAAAC CCTTTCCGCC GCCGCACTCG CGCTCATCCT
51 CGCCGCCTGC GGCGGTCAAA AAGATAGCGC GCCCGCCGCA TCCGCTTCTG
101 CCGCCGCCGA CAACGGCGCG GCGAAAAAAG AAATCGTCTT CGGCACGACC
151 GTCGGCGACT TCGGCGATAT GGTCAAAGAA CAAATCCAAC CCGAGCTGGA
201 GAAAAAAGGC TACACCGTCA AACTGGTCGA GTTTACCGAC TATGTGCGCC
251 CGAATCTGGC ATTGGCTGAG GGCGAGTTGG ACATCAACGT CTTCCAACAC
301 AAACCCTATC TTGACGACTT CAAAAAAGAA CACAATCTGG ACATCACCGA
351 AGTCTTCCAA GTGCCGACCG CGCCTTTGGG ACTGTACCCG GGCAAGCTGA
401 AATCGCTGGA AGAAGTCAAA GACGGCAGCA CCGTATCCGC GCCCAACGAC
451 CCGTCCAACT TCGCCCGCGT CTTGGTGATG CTCGACGAAC TGGGTTGGAT
501 CAAACTCAAA GACGGCATCA ATCCGCTGAC CGCATCCAAA GCGGACATTG
551 CCGAAAACCT GAAAAACATC AAAATCGTCG AGCTTGAAGC CGCGCAACTG
601 CCGCGTAGCC GCGCCGACGT GGATTTTGCC GTCGTCAACG GCAACTACGC
651 CATAAGCAGC GGCATGAAGC TGACCGAAGC CCTGTTCCAA GAACCGAGCT
701 TTGCCTATGT CAACTGGTCT GCCGTCAAAA CCGCCGACAA AGACAGCCAA
751 TGGCTTAAAG ACGTAACCGA GGCCTATAAC TCCGACGCGT TCAAAGCCTA
801 CGCGCACAAA CGCTTCGAGG GCTACAAATC CCCTGCCGCA TGGAATGAAG
851 GCGCAGCCAA ATAA
This encodes a protein having amino acid sequence <SEQ ID 222; ORF4a-1>:
1 MKTFFKTLSA AALALILAAC GGQKDSAPAA SASAAADNGA AKKEIVFGTT
51 VGDFGDMVKE QIQPELEKKG YTVKLVEFTD YVRPNLALAE GELDINVFQH
101 KPYLDDFKKE HNLDITEVFQ VPTAPLGLYP GKLKSLEEVK DGSTVSAPND
151 PSNFARVLVM LDELGWIKLK DGINPLTASK ADIAENLKNI KIVELEAAQL
201 PRSRADVDFA VVNGNYAISS GMKLTEALFQ EPSFAYVNWS AVKTADKDSQ
251 WLKDVTEAYN SDAFKAYAHK RFEGYKSPAA WNEGAAK*
ORF4a-1 and ORF4-1 show 99.7% identity in a 287 aa overlap:
10 20 30 40 50 60
orf4a-1 MKTFFKTLSAAALALILAACGGQKDSAPAASASAAADNGAAKKEIVFGTTVGDFGDMVKE
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf4-1 MKTFFKTLSAAALALILAACGGQKDSAPAASASAAADNGAAKKEIVFGTTVGDFGDMVKE
10 20 30 40 50 60
70 80 90 100 110 120
orf4a-1 QIQPELEKKGYTVKLVEFTDYVRPNLALAEGELDINVFQHKPYLDDFKKEHNLDITEVFQ
||| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf4-1 QIQAELEKKGYTVKLVEFTDYVRPNLALAEGELDINVFQHKPYLDDFKKEHNLDITEVFQ
70 80 90 100 110 120
130 140 150 160 170 180
orf4a-1 VPTAPLGLYPGKLKSLEEVKDGSTVSAPNDPSNFARVLVMLDELGWIKLKDGINPLTASK
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf4-1 VPTAPLGLYPGKLKSLEEVKDGSTVSAPNDPSNFARVLVMLDELGWIKLKDGINPLTASK
130 140 150 160 170 180
190 200 210 220 230 240
orf4a-1 ADIAENLKNIKIVELEAAQLPRSRADVDFAVVNGNYAISSGMKLTEALFQEPSFAYVNWS
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf4-1 ADIAENLKNIKIVELEAAQLPRSRADVDFAVVNGNYAISSGMKLTEALFQEPSFAYVNWS
190 200 210 220 230 240
250 260 270 280
orf4a-1 AVKTADKDSQWLKDVTEAYNSDAFKAYAHKRFEGYKSPAAWNEGAAKX
||||||||||||||||||||||||||||||||||||||||||||||||
orf4-1 AVKTADKDSQWLKDVTEAYNSDAFKAYAHKRFEGYKSPAAWNEGAAKX
250 260 270 280
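The identity figures quoted for these alignments are the fraction of aligned columns in which both sequences carry the same residue. A minimal sketch of that calculation follows; the function name `percent_identity` is hypothetical, and gaps are assumed to be written as '-'. It is applied to the second block of the alignment above (residues 61 to 120), which contains the single P/A difference.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over the aligned columns of two gapped sequences."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Ignore columns that are gaps in both sequences.
    columns = [(x, y) for x, y in zip(aligned_a, aligned_b)
               if not (x == "-" and y == "-")]
    matches = sum(1 for x, y in columns if x == y and x != "-")
    return 100.0 * matches / len(columns)


# Residues 61-120 of the ORF4a-1 / ORF4-1 alignment.
orf4a_1 = "QIQPELEKKGYTVKLVEFTDYVRPNLALAEGELDINVFQHKPYLDDFKKEHNLDITEVFQ"
orf4_1 = "QIQAELEKKGYTVKLVEFTDYVRPNLALAEGELDINVFQHKPYLDDFKKEHNLDITEVFQ"
print(round(percent_identity(orf4a_1, orf4_1), 1))  # 98.3

# One mismatch over the whole 287-residue overlap gives the quoted value.
print(round(100.0 * 286 / 287, 1))  # 99.7
```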
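Homology with an outer membrane protein of Pasteurella haemolytica (accession number Q08869)
ORF4 and this outer membrane protein show 33% amino acid identity over a 91 amino acid overlap: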
10 20
lip2.pasha MNFKKLLGVALVSALALTACKDEKAQAP----
|| | ::|| || |:|| :|: |
ORF4 VXTPNPDGRTPCPSFLFETATTSGENMKTFFKTLSAAAL--ALILAACGFKKTARPPHPL
110 120 130 140 150
30 40 50 60 70 80
lip2.pasha -ATTAKTENKAPLKVGVMTGPEAQMTEVAVKIAKEKYGLDVELVQFTEYTQPNAALHSKD
: :: | : |: :| ::|:: :: || | |:||:||:|::|| || :
ORF4 LPPPTTARRKKEIVFGTTVGDFGDMVKEQIQAELEKKGYTVKLVEFTDYVRPNLALAEGE
160 170 180 190 200 210
90 100 110 120 130 140
lip2.pasha LDANAFQTVPYLEQEVKDRGYKLAIIGNTLVWPIAAYSKKIKNISELKDGATVAIPNNAS
|
ORF4 L.....
Homology with a predicted ORF from N. gonorrhoeae
ORF4 shows 93.6% identity over a 94 amino acid overlap with a predicted ORF (ORF4.ng) from N. gonorrhoeae:
10 20 30
orf4nm.pep MKTFFKTLSAAALALILAACGXQKDSAPAA
|||||||||:|:||||||||| ||||||||
orf4ng RANAVXTPNPDGRTPCLSFLFETATTSGENMKTFFKTLSTASLALILAACGGQKDSAPAA
200 210 220 230 240 250
40 50 60 70 80 89
orf4nm.pep SASA-AADNGAAKKEIVFGTTVGDFGDMVKEQIQAELEKKGYTVKLVEFTDYVRPNLALA
||:| :||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf4ng SAAAPSADNGAAKKEIVFGTTVGDFGDMVKEQIQAELEKKGYTVKLVEFTDYVRPNLALA
260 270 280 290 300 310
90
orf4nm.pep EGEL
||||
orf4ng EGELDINVFQHKPYLDDFKKEHNLDITEAFQVPTAPLGLYPGKLKSLEEVKDGSTVSAPN
320 330 340 350 360 370
The predicted full-length ORF4ng nucleotide sequence <SEQ ID 223> encodes a protein having the amino acid sequence <SEQ ID 224>:
1 MKTFFKTLST ASLALILAAC GGQKDSAPAA SAAAPSADNG AAKKEIVFGT
51 TVGDFGDMVK EQIQAELEKK GYTVKLVEFT DYVRPNLALA EGELDINVFQ
101 HKPYLDDFKK EHNLDITEAF QVPTAPLGLY PGKLKSLEEV KDGSTVSAPN
151 DPSNFARALV MLNELGWIKL KDGINPLTAS KADIAENLKN IKIVELEAAQ
201 LPRSRADVDF AVVNGNYAIS SGMKLTEALF QEPSFAYVNW SAVKTADKDS
251 QWLKDVTEAY NSDAFKAYAH KRFEGYKYPA AWNEGAAK*
Further analysis revealed the full-length ORF4ng DNA sequence <SEQ ID 225> to be:
1 atgAAAACCT TCTTCAAAAC cctttccgcc gccgcaCTCG CGCTCATCCT
51 CGCAGCCTGc ggCggtcaAA AAGACAGCGC GCCCgcagcc tctgcCGCCG
101 CCCCTTCTGC CGATAACGgc gCgGCGAAAA AAGAAAtcgt ctTCGGCACG
151 Accgtgggcg acttcggcgA TAtggTCAAA GAACAAATCC AagcCGAgct
201 gGAGAAAAAA GgctACACcg tcAAattggt cgaatttacc gactatgtGC
251 gCCCGAATCT GGCATTGGCG GAGGGCGAGT TGGACATCAA CGTCTTCCAA
301 CACAAACCCT ATCTTGACGA TTTCAAAAAA GAACACAACC TGGACATCAC
351 CGAAGCCTTC CAAGTGCCGA CCGCGCCTTT GGGACTGTAT CCGGGCAAAC
401 TGAAATCGCT GGAAGAAGTC AAAGACGGCA GCACCGTATC CGCGCCCAac
451 gACccgTCCA ACTTCGCACG CGCCTTGGTG ATGCTGAACG AACTGGGTTG
501 GATCAAACTC AAAGACGGCA TCAATCCGCT GACCGCATCC AAAGCCGACA
551 TCGCGGAAAA CCTGAAAAAC ATCAAAATCG TCGAGCTTGA AGCCGCACAA
601 CTGCCGCGCA GCCGCGCCGA CGTGGATTTT GCCGTCGTCA ACGGCAACTA
651 CGCCATAAGC AGCGGCATGA AGCTGACCGA AGCCCTGTTC CAAGAGCCGA
701 GCTTTGCCTA TGTCAACTGG TCTGCCgtcA AAACCGCCGA CAAAGACAGC
751 CAATGGCTTA AAGACGTAAC CGAGGCCTAT AACTCCGACG CGTTCAAAGC
801 CTACGCGCAC AAACGCTTCG AGGGCTACAA ATACCCTGCC GCATGGAATG
851 AAGGCGCAGC CAAATAA
It encodes a protein having the amino acid sequence <SEQ ID 226; ORF4ng-1>:
1 MKTFFKTLSA AALALILAAC GGQKDSAPAA SAAAPSADNG AAKKEIVFGT
51 TVGDFGDMVK EQIQAELEKK GYTVKLVEFT DYVRPNLALA EGELDINVFQ
101 HKPYLDDFKK EHNLDITEAF QVPTAPLGLY PGKLKSLEEV KDGSTVSAPN
151 DPSNFARALV MLNELGWIKL KDGINPLTAS KADIAENLKN IKIVELEAAQ
201 LPRSRADVDF AVVNGNYAIS SGMKLTEALF QEPSFAYVNW SAVKTADKDS
251 QWLKDVTEAY NSDAFKAYAH KRFEGYKYPA AWNEGAAK*
It shows 97.6% identity over a 288 amino acid overlap with ORF4-1:
10 20 30 40 50 59
orf4-1.pep MKTFFKTLSAAALALILAACGGQKDSAPAASASA-AADNGAAKKEIVFGTTVGDFGDMVK
||||||||||||||||||||||||||||||||:| :||||||||||||||||||||||||
orf4ng-1 MKTFFKTLSAAALALILAACGGQKDSAPAASAAAPSADNGAAKKEIVFGTTVGDFGDMVK
10 20 30 40 50 60
60 70 80 90 100 110 119
orf4-1.pep EQIQAELEKKGYTVKLVEFTDYVRPNLALAEGELDINVFQHKPYLDDFKKEHNLDITEVF
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||:|
orf4ng-1 EQIQAELEKKGYTVKLVEFTDYVRPNLALAEGELDINVFQHKPYLDDFKKEHNLDITEAF
70 80 90 100 110 120
120 130 140 150 160 170 179
orf4-1.pep QVPTAPLGLYPGKLKSLEEVKDGSTVSAPNDPSNFARVLVMLDELGWIKLKDGINPLTAS
|||||||||||||||||||||||||||||||||||||:||||:|||||||||||||||||
orf4ng-1 QVPTAPLGLYPGKLKSLEEVKDGSTVSAPNDPSNFARALVMLNELGWIKLKDGINPLTAS
130 140 150 160 170 180
180 190 200 210 220 230 239
orf4-1.pep KADIAENLKNIKIVELEAAQLPRSRADVDFAVVNGNYAISSGMKLTEALFQEPSFAYVNW
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf4ng-1 KADIAENLKNIKIVELEAAQLPRSRADVDFAVVNGNYAISSGMKLTEALFQEPSFAYVNW
190 200 210 220 230 240
240 250 260 270 280
orf4-1.pep SAVKTADKDSQWLKDVTEAYNSDAFKAYAHKRFEGYKSPAAWNEGAAKX
||||||||||||||||||||||||||||||||||||| |||||||||||
orf4ng-1 SAVKTADKDSQWLKDVTEAYNSDAFKAYAHKRFEGYKYPAAWNEGAAKX
250 260 270 280
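In addition, ORF4ng-1 shows significant homology with an outer membrane protein from the database:
ID LIP2_PASHA STANDARD; PRT; 276 AA.
AC Q08869;
DT 01-NOV-1995 (REL. 32, CREATED)
DT 01-NOV-1995 (REL. 32, LAST SEQUENCE UPDATE)
DT 01-NOV-1995 (REL. 32, LAST ANNOTATION UPDATE)
DE 28.2 KD OUTER MEMBRANE PROTEIN PRECURSOR....
SCORES Init1: 279 Initn: 416 Opt: 494
Smith-Waterman score: 494; 36.0% identity in 275 aa overlap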
10 20 30 40 50
orf4ng-1.pep MKTFFKTLSAAAL--ALILAACGGQKDSAPAASAAAPSADNGAAKKEIVFGTTVGDFGDM
|| | ::|| || |:|| :| :|||::| :::| | | |: :| ::|
lip2_pasha MNFKKLLGVALVSALALTACKDEKAQAPATTA---KTENKAPLK---VGVMTGPEAQM
10 20 30 40 50
60 70 80 90 100 110
orf4ng-1.pep VKEQIQAELEKKGYTVKLVEFTDYVRPNLALAEGELDINVFQHKPYLDDFKKEHNLDITE
:: :: || | |:||:||:|::|| || :|| |:|| |||:: |::: ::
lip2_pasha TEVAVKIAKEKYGLDVELVQFTEYTQPNAALHSKDLDANAFQTVPYLEQEVKDRGYKLAI
60 70 80 90 100 110
120 130 140 150 160 170
orf4ng-1.pep AFQVPTAPLGLYPGKLKSLEEVKDGSTVSAPNDPSNFARALVMLNELGWIKLKDGINPLT
:: : |:: | |:|:: |:|||:||: ||: || ||||::|: | :|||| | :
lip2_pasha IGNTLVWPIAAYSKKIKNISELKDGATVAIPNNASNTARALLLLQAHGLLKLKDPKN-VF
120 130 140 150 160 170
180 190 200 210 220 230
orf4ng-1.pep ASKADIAENLKNIKIVELEAAQLPRSRADVDFAVVNGNYAISSGMKLTE--ALFQEPSFA
|:: || || ||||||: ::: | | ||::||:|::|| ::|:: : : : :
lip2_pasha ATENDIIENPKNIKIVQADTSLLTRMLDDVELAVINNTYAGQAGLSPDKDGIIVESKDSP
180 190 200 210 220 230
240 250 260 270 280 289
orf4ng-1.pep YVNWSAVKTADKDSQWLKDVTEAYNSDAFKAYAHKRFEGYKYPAAWNEGAAKX
||| : : :||: |: ::::::: | | |:|
lip2_pasha YVNLVVSREDNKDDPRLQTFVKSFQTEEVFQEALKLFNGGVVKGW
240 250 260 270
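The Smith-Waterman score reported above is the score of the best local alignment between the two proteins. The sketch below illustrates the dynamic-programming recurrence only. It assumes a simple match/mismatch scheme with a linear gap penalty, whereas database search programs typically use an amino acid substitution matrix and affine gap penalties, so it will not reproduce the value 494. The function name and the default scores are hypothetical.

```python
def smith_waterman_score(a: str, b: str,
                         match: int = 2, mismatch: int = -1,
                         gap: int = -1) -> int:
    """Best local alignment score of a and b (linear gap penalty)."""
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            curr[j] = max(
                0,                  # start a new local alignment
                prev[j - 1] + sub,  # align a[i-1] with b[j-1]
                prev[j] + gap,      # gap in b
                curr[j - 1] + gap,  # gap in a
            )
            best = max(best, curr[j])
        prev = curr
    return best


# Eight matches and one gap: 8 * 2 - 1 = 15.
print(smith_waterman_score("ACGTACGT", "ACGTTACGT"))  # 15
```

Only two rows of the score matrix are kept because the sketch returns the score alone; recovering the aligned residues would need the full matrix and a traceback.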
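Based on this analysis, including the homology with the outer membrane protein of Pasteurella haemolytica and the presence of a putative prokaryotic membrane lipoprotein lipid attachment site in the gonococcal protein, it is predicted that these proteins from N. meningitidis and N. gonorrhoeae, and their epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies.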
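A putative lipid attachment site of this kind can be located with a simple pattern search. The sketch below is an assumption-laden illustration: the source does not state which tool or pattern was used, and the regular expression and the two extra rules (cysteine between positions 15 and 35, at least one Lys or Arg in the first seven residues) follow the PROSITE PS00013 entry as commonly quoted. The function name is hypothetical.

```python
import re

# Assumed pattern (PROSITE PS00013 as commonly quoted):
# {DERK}(6)-[LIVMFWSTAG](2)-[LIVMFYSTAGCQ]-[AGS]-C
# A lookahead is used so that overlapping candidates are all reported.
LIPOBOX = re.compile(
    r"(?=([^DERK]{6}[LIVMFWSTAG]{2}[LIVMFYSTAGCQ][AGS]C))"
)


def lipid_attachment_sites(protein: str) -> list:
    """Return 1-based positions of candidate lipid-modified cysteines."""
    seq = "".join(protein.split()).upper()
    # Assumed rule: a Lys or Arg within the first seven residues.
    if not any(residue in "KR" for residue in seq[:7]):
        return []
    sites = []
    for hit in LIPOBOX.finditer(seq):
        cys_pos = hit.start(1) + 11  # the pattern is 11 residues long
        # Assumed rule: the cysteine lies between positions 15 and 35.
        if 15 <= cys_pos <= 35:
            sites.append(cys_pos)
    return sites


# N-terminus of ORF4ng-1 (SEQ ID 226).
print(lipid_attachment_sites("MKTFFKTLSA AALALILAAC GGQKDSAPAA"))  # [20]
```

Under these assumptions the candidate cysteine is Cys-20, the last residue of the ALILAAC stretch.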
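As described above, ORF4-1 (30 kDa) was cloned in pET and pGEX vectors and expressed in E. coli. The products of protein expression and purification were analyzed by SDS-PAGE. Figures 8A and 8B show the results of affinity purification of the His-fusion protein and the GST-fusion protein, respectively. Purified His-fusion protein was used to immunize mice, whose sera were used for ELISA (positive result), Western blot (Figure 8C), FACS analysis (Figure 8D), and a bactericidal assay (Figure 8E). These results confirm that ORF4-1 is a surface-exposed protein and that it is a useful immunogen.
Figure 8F shows plots of hydrophilicity, antigenic index and AMPHI regions for ORF4-1.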
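A hydrophilicity plot of this type is usually a sliding-window average of per-residue values. The sketch below is a hedged illustration rather than the method used for the figure: the source does not name the scale or window, so the Hopp-Woods hydrophilicity values and a window of seven residues are assumptions, as is the function name.

```python
# Hopp-Woods hydrophilicity values (assumed scale; not stated in the source).
HOPP_WOODS = {
    "R": 3.0, "D": 3.0, "E": 3.0, "K": 3.0, "S": 0.3,
    "N": 0.2, "Q": 0.2, "G": 0.0, "P": 0.0, "T": -0.4,
    "A": -0.5, "H": -0.5, "C": -1.0, "M": -1.3, "V": -1.5,
    "I": -1.8, "L": -1.8, "Y": -2.3, "F": -2.5, "W": -3.4,
}


def hydrophilicity_profile(protein: str, window: int = 7) -> list:
    """Mean hydrophilicity of every full window along the sequence."""
    seq = "".join(protein.split()).upper().rstrip("*")
    values = [HOPP_WOODS[residue] for residue in seq]
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]


# First 20 residues of ORF4-1 / ORF4a-1 as listed in this example.
profile = hydrophilicity_profile("MKTFFKTLSA AALALILAAC")
print(len(profile))  # 14
```

Positive values indicate hydrophilic stretches that are more likely to be exposed; plotting `profile` against the window start position gives a curve of the kind described for Figure 8F.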
Example 27
The following partial DNA sequence <SEQ ID 227> was identified in N. meningitidis:
1 CCTCGTCGTC CTCGGCATGC TCCAGTTTCA AGGGGCGATT TACTCCAAGG
51 CGGTGGAACG TATGCTCGGC ACGGTCATCG GGCTGGGCGC GGGTTTGGGC
101 GTTTTATGGC TGAACCAGCA TTATTTCCAC GGCAACCTCC TCTTCTACCT
151 CACCGTCGGC ACGGCAAGCG CACTGGCCGG CTGGGCGGCG GTCGGCAAAA
201 ACGGCTACGT CCCTmTGCTG GCAGGGCTGA CGATGTGTAT GCTCATCGGC
251 GACAACGGCA GCGAATGGCT CGACAGCGGA CTCATGCGCG CCATGAACGT
301 CCTCATCGGC GyGGCCATCG CCATCGCCGC CGCCAAACTG CTGCCGCTGA
351 AATCCACACT GATGTGGCGT TTCATGCTTG CCGACAACCT GGCCGACTGC
401 AGCAAAATGA TTGCCGAAAT CAGCAACGGC AGGCGCATGA CCCGCGAACG
451 CCTCGAGGAG AACATGGCGA AAATGCGCCA AATCAACGCA CGCATGGTCA
501 AAAGCCGCAG CCATCTCGCC GCCACATCGG GCGAAAGCTG CATCAGCCCC
551 GCCATGATGG AAGCCATGCA GCACGCCCAC CGTAAAATCG TCAACACCAC
601 CGAGCTGCTC CTGACCACCG CCGCCAAGCT GCAATCTCCC AAACTCAACG
651 GCAGCGAAAT CCGGCTGCTT GACCGCCACT TCACACTGCT CCAAAC....
701 .......... .......... ........GC AGACACGCCC GCCGCATCCG
751 CATCGACACC GCCATCAACC CCGAACTGGA AGCCCTCGCC GAACACCTCC
801 ACTACCAATG GCAGGGCTTC CTCTGGCTCA GCACCGATAT GCGTCAGGAA
851 ATTTCCGCCC TCGTCATCCT GCTGCAACGC ACCCGCCGCA AATGGCTGGA
901 TGCCCACGAA CGCCAACACC TGCGCCAAAG CCTGCTTGA
This corresponds to the amino acid sequence <SEQ ID 228; ORF8>:
1 ......PRRP RHAPVSRGDL LQGGGTYARH GHRAGRGFGR FMAEPALFPR
51 QPPLLPHRRH GKRTGRLGGG RQKRLRPXAG RADDVYAHRR QRQRMARQRT
101 HARHERPHRR GHRHRRRQTA AAEIHTDVAF HACRQPGRLQ QNDCRNQQRQ
151 AHDPRTPRGE HGENAPNQRT HGQKPQPSRR HIGRKLHQPR HDGSHAARPP
201 XNRQHHRAAP DHRRQAAISQ TQRQRNPAAX PPLHTAPN.. .........Q
251 TRPPHPHRHR HQPRTGSPRR TPPLPMAGLP LAQHRYASGN FRPRHPAATH
301 PPQMAGCPRT PTPAPKPA*
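Computer analysis of this amino acid sequence gave the following results:
Sequence motifs
ORF8 is proline-rich and has a distribution of proline residues consistent with a surface localization. Furthermore, the presence of an RGD motif may indicate a possible role in bacterial adhesion events.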
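Both observations can be reproduced by a direct scan of the sequence. The sketch below uses hypothetical helper names and is applied to the first listed row of SEQ ID 228 with the leading unknown positions removed, so positions are relative to that fragment.

```python
def motif_positions(protein: str, motif: str) -> list:
    """Return the 1-based start positions of every occurrence of motif."""
    seq = "".join(protein.split()).upper()
    positions = []
    start = seq.find(motif)
    while start != -1:
        positions.append(start + 1)
        start = seq.find(motif, start + 1)
    return positions


def residue_fraction(protein: str, residue: str) -> float:
    """Fraction of the sequence made up of the given residue."""
    seq = "".join(protein.split()).upper()
    return seq.count(residue) / len(seq)


# First row of SEQ ID 228 (ORF8), without the leading dots.
orf8_fragment = "PRRP RHAPVSRGDL LQGGGTYARH GHRAGRGFGR FMAEPALFPR"
print(motif_positions(orf8_fragment, "RGD"))            # [11]
print(round(residue_fraction(orf8_fragment, "P"), 3))   # 0.114
```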
Homology with a predicted ORF from N. gonorrhoeae
ORF8 shows 86.5% identity over a 312 amino acid overlap with a predicted ORF (ORF8.ng) from N. gonorrhoeae:
orf8ng 1 MDRDDRLRRPRHAPVPRRDLLQRGGTYARYGHRAGRGFGRFMAEPALFPR 50
|||||||| | |||| ||||||:||||||||||||||||||||
orf8.pep 1 ......PRRPRHAPVSRGDLLQGGGTYARHGHRAGRGFGRFMAEPALFPR 44
orf8ng 51 QPPLLPDHRHGKRTGRLGGGRQKRLRPYVGGADDVHAHRRQRQRMARQRP 100
|||||| |||||||||||||||||| | ||||:|||||||||||||||
orf8.pep 45 QPPLLPHRRHGKRTGRLGGGRQKRLRPXAGRADDVYAHRRQRQRMARQRT 94
orf8ng 101 DARDERPHRRRHRHCRRQTAAAEIHTDVAFHACRQPGRLQQNDCRNQQRQ 150
|| |||||| ||| ||||||||||||||||||||||| |||||||||||
orf8.pep 95 HARHERPHRRGHRHRRRQTAAAEIHTDVAFHACRQPGRMQQNDCRNQQRQ 144
orf8ng 151 AYDARTFGAEYGQNAPNQRTHGQKPQPPRRHIGRKPHQPLHDGSHAARPP 200
|:| || |:|:|||||||||||||| ||||||| ||| ||||||||||
orf8.pep 145 AHDPRTPRGEHGENAPNQRTHGQKPQPSRRHIGRKLHQPRHDGSHAARPP 194
orf8ng 201 QNRQHHRAAPDHRRQAAISQTQRQRNPAARPPLHTAPNRPATNRRPHQRQ 250
|||||||||||||||||||||||||||| |||||||| |
orf8.pep 195 XNRQHHRAAPDHRRQAAISQTQRQRNPAAXPPLHTAPN...........Q 244
orf8ng 251 TRPPHPHRHRHQPRTGSPRRTPPLPMAGFPLAQHQYASGNFRPRHPPATH 300
|||||||||||||||||||||||||||| ||||| ||||||||||| |||
orf8.pep 245 TRPPHPHRHRHQPRTGSPRRTPPLPMAGLPLAQHRYASGNFRPRHPAATH 294
orf8ng 301 PPQMAGCPRTPTPAPKPA* 319
||||||||||||||||||
orf8.pep 295 PPQMAGCPRTPTPAPKPA* 313
The predicted full-length ORF8ng nucleotide sequence <SEQ ID 229> encodes a protein having the amino acid sequence <SEQ ID 230>:
1 MDRDDRLRRP RHAPVPRRDL LQRGGTYARY GHRAGRGFGR FMAEPALFPR
51 QPPLLPDHRH GKRTGRLGGG RQKRLRPYVG GADDVHAHRR QRQRMARQRP
101 DARDERPHRR RHRHCRRQTA AAEIHTDVAF HACRQPGRLQ QNDCRNQQRQ
151 AYDARTFGAE YGQNAPNQRT HGQKPQPPRR HIGRKPHQPL HDGSHAARPP
201 QNRQHHRAAP DHRRQAAISQ TQRQRNPAAR PPLHTAPNRP ATNRRPHQRQ
251 TRPPHPHRHR HQPRTGSPRR TPPLPMAGFP LAQHQYASGN FRPRHPPATH
301 PPQMAGCPRT PTPAPKPA*
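Based on the sequence motifs in these proteins, it is predicted that the proteins from N. meningitidis and N. gonorrhoeae, and their epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies.
Example 28
The following partial DNA sequence <SEQ ID 231> was identified in N. meningitidis: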
1 ..GAAATCAGCC TGCGGTCCGA CNACAGGCCG GTTTCCGTGN CGAAGCGGCG
51 GGATTCGGAA CGTTTTCTGC TGTTGGACGG CGGCAACAGC CGGCTCAAGT
101 GGGCGTGGGT GGAAAACGGC ACGTTCGCAA CCGTCGGTAG CGCGCCGTAC
151 CGCGATTTGT CGCCTTTGGG CGCGGAGTGG GCGGAAAAGG CGGATGGAAA
201 TGTCCGCATC GTCGGTTGCG CTGTGTGCGG AGAATTCAAA AAGGCACAAG
251 TGCAGGAACA GCTCGCCCGA AAAATCGAGT GGCTGCCGTC TTCCGCACAG
301 GCTTT.GGCA TACGCAACCA CTACCGCCAC CCCGAAGAAC ACGGTTCCGA
351 CCGCTGGTTC AACGCCTTGG GCAGCCGCCG CTTCAGCCGC AACGCCTGCG
401 TCGTCGTCAG TTGCGGCACG GCGGTAACGG TTGACGCGCT CACCGATGAC
451 GGACATTATC TCGGAGA.GG AACCATCATG CCCGGTTTCC ACCTGATGAA
501 AGAATCGCTC GCCGTCCGAA CCGCCAACCT CAACCGGCAC GCCGGTAAGC
551 GTTATCCTTT CCCGACCGG..
This corresponds to the amino acid sequence <SEQ ID 232; ORF61>:
1 ..EISLRSDXRP VSVXKRRDSE RFLLLDGGNS RLKWAWVENG TFATVGSAPY
51 RDLSPLGAEW AEKADGNVRI VGCAVCGEFK KAQVQEQLAR KIEWLPSSAQ
101 AXGIRNHYRH PEEHGSDRWF NALGSRRFSR NACVVVSCGT AVTVDALTDD
151 GHYLGXGTIM PGFHLMKESL AVRTANLNRH AGKRYPFPT..
Further work revealed the complete nucleotide sequence <SEQ ID 233>:
1 ATGACGGTTT TGAAGCTTTC GCACTGGCGG GTGTTGGCGG AGCTTGCCGA
51 CGGTTTGCCG CAACACGTCT CGCAACTGGC GCGTATGGCG GATATGAAGC
101 CGCAGCAGCT CAACGGTTTT TGGCAGCAGA TGCCGGCGCA CATACGCGGG
151 CTGTTGCGCC AACACGACGG CTATTGGCGG CTGGTGCGCC CATTGGCGGT
201 TTTCGATGCC GAAGGTTTGC GCGAGCTGGG GGAAAGGTCG GGTTTTCAGA
251 CGGCATTGAA GCACGAGTGC GCGTCCAGCA ACGACGAGAT ACTGGAATTG
301 GCGCGGATTG CGCCGGACAA GGCGCACAAA ACCATATGCG TGACCCACCT
351 GCAAAGTAAG GGCAGGGGGC GGCAGGGGCG GAAGTGGTCG CACCGTTTGG
401 GCGAGTGTCT GATGTTCAGT TTTGGCTGGG TGTTTGACCG GCCGCAGTAT
451 GAGTTGGGTT CGCTGTCGCC TGTTGCGGCA GTGGCGTGTC GGCGCGCCTT
501 GTCGCGTTTA GGTTTGGATG TGCAGATTAA GTGGCCCAAT GATTTGGTTG
551 TCGGACGCGA CAAATTGGGC GGCATTCTGA TTGAAACGGT CAGGACGGGC
601 GGCAAAACGG TTGCCGTGGT CGGTATCGGC ATCAATTTTG TCCTGCCCAA
651 GGAAGTAGAA AATGCCGCTT CCGTGCAATC GCTGTTTCAG ACGGCATCGC
701 GGCGGGGCAA TGCCGATGCC GCCGTGCTGC TGGAAACGCT GTTGGTGGAA
751 CTGGACGCGG TGTTGTTGCA ATATGCGCGG GACGGATTTG CGCCTTTTGT
801 GGCGGAATAT CAGGCTGCCA ACCGCGACCA CGGCAAGGCG GTATTGCTGT
851 TGCGCGACGG CGAAACCGTG TTCGAAGGCA CGGTTAAAGG CGTGGACGGA
901 CAAGGCGTTT TGCACTTGGA AACGGCAGAG GGCAAACAGA CGGTCGTCAG
951 CGGCGAAATC AGCCTGCGGT CCGACGACAG GCCGGTTTCC GTGCCGAAGC
1001 GGCGGGATTC GGAACGTTTT CTGCTGTTGG ACGGCGGCAA CAGCCGGCTC
1051 AAGTGGGCGT GGGTGGAAAA CGGCACGTTC GCAACCGTCG GTAGCGCGCC
1101 GTACCGCGAT TTGTCGCCTT TGGGCGCGGA GTGGGCGGAA AAGGCGGATG
1151 GAAATGTCCG CATCGTCGGT TGCGCTGTGT GCGGAGAATT CAAAAAGGCA
1201 CAAGTGCAGG AACAGCTCGC CCGAAAAATC GAGTGGCTGC CGTCTTCCGC
1251 ACAGGCTTTG GGCATACGCA ACCACTACCG CCACCCCGAA GAACACGGTT
1301 CCGACCGCTG GTTCAACGCC TTGGGCAGCC GCCGCTTCAG CCGCAACGCC
1351 TGCGTCGTCG TCAGTTGCGG CACGGCGGTA ACGGTTGACG CGCTCACCGA
1401 TGACGGACAT TATCTCGGGG GAACCATCAT GCCCGGTTTC CACCTGATGA
1451 AAGAATCGCT CGCCGTCCGA ACCGCCAACC TCAACCGGCA CGCCGGTAAG
1501 CGTTATCCTT TCCCGACCAC AACGGGCAAT GCCGTCGCCA GCGGCATGAT
1551 GGATGCGGTT TGCGGCTCGG TTATGATGAT GCACGGGCGT TTGAAAGAAA
1601 AAACCGGGGC GGGCAAGCCT GTCGATGTCA TCATTACCGG CGGCGGCGCG
1651 GCAAAAGTTG CCGAAGCCCT GCCGCCTGCA TTTTTGGCGG AAAATACCGT
1701 GCGCGTGGCG GACAACCTCG TCATTTACGG GTTGTTGAAC ATGATTGCCG
1751 CCGAAGGCAG GGAATATGAA CATATTTAA
This corresponds to the amino acid sequence <SEQ ID 234; ORF61-1>:
1 MTVLKLSHWR VLAELADGLP QHVSQLARMA DMKPQQLNGF WQQMPAHIRG
51 LLRQHDGYWR LVRPLAVFDA EGLRELGERS GFQTALKHEC ASSNDEILEL
101 ARIAPDKAHK TICVTHLQSK GRGRQGRKWS HRLGECLMFS FGWVFDRPQY
151 ELGSLSPVAA VACRRALSRL GLDVQIKWPN DLVVGRDKLG GILIETVRTG
201 GKTVAVVGIG INFVLPKEVE NAASVQSLFQ TASRRGNADA AVLLETLLVE
251 LDAVLLQYAR DGFAPFVAEY QAANRDHGKA VLLLRDGETV FEGTVKGVDG
301 QGVLHLETAE GKQTVVSGEI SLRSDDRPVS VPKRRDSERF LLLDGGNSRL
351 KWAWVENGTF ATVGSAPYRD LSPLGAEWAE KADGNVRIVG CAVCGEFKKA
401 QVQEQLARKI EWLPSSAQAL GIRNHYRHPE EHGSDRWFNA LGSRRFSRNA
451 CVVVSCGTAV TVDALTDDGH YLGGTIMPGF HLMKESLAVR TANLNRHAGK
501 RYPFPTTTGN AVASGMMDAV CGSVMMMHGR LKEKTGAGKP VDVIITGGGA
551 AKVAEALPPA FLAENTVRVA DNLVIYGLLN MIAAEGREYE HI*
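Figure 9 shows plots of hydrophilicity, antigenic index and AMPHI regions for ORF61-1. Further computer analysis of this amino acid sequence gave the following results:
Homology with the baf protein of Bordetella parapertussis (accession number U12020)
ORF61 and the baf protein show 33% amino acid identity over a 166 amino acid overlap: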
orf61 23 LLLDGGNSRLKWAWVE-NGTFATVGSAPYR----DLSPLGAEWAEKADGNVRIVGCAVCG 77
+L+D GNSRLK W + + A AP DL LG A R +G V G
baf 3 ILIDSGNSRLKVGWFDPDAPQAAREPAPVAFDNLDLDALGRWLATLPRRPQRALGVNVAG 62
orf61 78 EFKKAQVQEQLAR---KIEWLPSSAQAXGIRNHYRHPEEHGSDRW---FNALGSRRFSRN 131
+ + L I WL + A G+RN YR+P++ G+DRW L +
baf 63 LARGEAIAATLRAGGCDIRWLRAQPLAMGLRNGYRNPDQLGADRWACMVGVLARQPSVHP 122
orf61 132 ACVVVSCGTAVTVDALTDDGHYLGXGTIMPGFHLMKESLAVRTANL 177
+V S GTA T+D + D + G G I+PG +M+ +LA TA+L
baf 123 PLLVASFGTATTLDTIGPDNVFPG-GLILPGPAMMRGALAYGTAHL 167
Homology with a predicted ORF from N. meningitidis (strain A)
ORF61 shows 97.4% identity over a 189 amino acid overlap with an ORF (ORF61a) from strain A of N. meningitidis:
10 20 30
orf61.pep EISLRSDXRPVSVXKRRDSERFLLLDGGNS
||||||| ||||| ||||||||||||||||
orf61a TVFEGTVKGVDGQGVLHLETAEGKQTVVSGEISLRSDDRPVSVPKRRDSERFLLLDGGNS
290 300 310 320 330 340
40 50 60 70 80 90
orf61.pep RLKWAWVENGTFATVGSAPYRDLSPLGAEWAEKADGNVRIVGCAVCGEFKKAQVQEQLAR
|||||||||||||||||||||||||||||||||:||||||||||||||||||||||||||
orf61a RLKWAWVENGTFATVGSAPYRDLSPLGAEWAEKVDGNVRIVGCAVCGEFKKAQVQEQLAR
350 360 370 380 390 400
100 110 120 130 140 150
orf61.pep KIEWLPSSAQAXGIRNHYRHPEEHGSDRWFNALGSRRFSRNACVVVSCGTAVTVDALTDD
||||||||||| ||||||||||||||||||||||||||||||||||||||||||||||||
orf61a KIEWLPSSAQALGIRNHYRHPEEHGSDRWFNALGSRRFSRNACVVVSCGTAVTVDALTDD
410 420 430 440 450 460
160 170 180 189
orf61.pep GHYLGXGTIMPGFHLMKESLAVRTANLNRHAGKRYPFPT
||||| |||||||||||||||||||||||||||||||||
orf61a GHYLG-GTIMPGFHLMKESLAVRTANLNRHAGKRYPFPTTTGNAVASGMMDAVCGSVMMM
470 480 490 500 510 520
orf61a HGRLKEKTGAGKPVDVIITGGGAAKVAEALPPAFLAENTVRVADNLVIHGLLNLIAAEGG
530 540 550 560 570 580
The full-length ORF61a nucleotide sequence <SEQ ID 235> is:
1 ATGACGGTTT TGAAGCCTTC GCACTGGCGG GTGTTGGCGG AGCTTGCCGA
51 CGGTTTGCCG CAACACGTCT CGCAACTGGC GCGTATGGCG GATATGAAGC
101 CGCAGCAGCT CAACGGTTTT TGGCAGCAGA TGCCGGCGCA CATACGCGGG
151 CTGTTGCGCC AACACGACGG CTATTGGCGG CTGGTGCGCC CATTGGCGGT
201 TTTCGATGCC GAAGGTTTGC GCGAGCTGGG GGAAAGGTCG GGTTTTCAGA
251 CGGCATTGAA GCACGAGTGC GCGTCCAGCA ACGACGAGAT ACTGGAATTG
301 GCGCGGATTG CGCCGGACAA GGCGCACAAA ACCATATGTG TGACCCACCT
351 GCAAAGTAAG GGCAGGGGGC GGCAGGGGCG GAAGTGGTCG CACCGTTTGG
401 GCGAGTGTCT GATGTTCAGT TTTGGCTGGG TGTTTGACCG GCCGCAGTAT
451 GAGTTGGGTT CGCTGTCGCC TGTTGCGGCA GTGGCGTGCC GGCGCGCCTT
501 GTCGCGTTTG GGTTTGAAAA CGCAAATCAA GTGGCCAAAC GATTTGGTCG
551 TCGGACGCGA CAAATTGGGC GGCATTCTGA TTGAAACGGT CAGGACGGGC
601 GGCAAAACGG TTGCCGTGGT CGGTATCGGC ATCAATTTCG TGCTGCCCAA
651 GGAAGTGGAA AACGCCGCTT CCGTGCAATC GCTGTTTCAG ACGGCATCGC
701 GGCGGGGAAA TGCCGATGCC GCCGTGTTGC TGGAAACGCT GTTGGCGGAA
751 CTTGATGCGG TGTTGTTGCA ATATGCGCGG GACGGATTTG CGCCTTTTGT
801 GGCGGAATAT CAGGCTGCCA ACCGCGACCA CGGCAAGGCG GTATTGCTGT
851 TGCGCGACGG CGAAACCGTG TTCGAAGGCA CGGTTAAAGG CGTGGACGGA
901 CAAGGCGTTC TGCACTTGGA AACGGCAGAG GGCAAACAGA CGGTCGTCAG
951 CGGCGAAATC AGCCTGCGGT CCGACGACAG GCCGGTTTCC GTGCCGAAGC
1001 GGCGGGATTC GGAACGTTTT CTGCTGTTGG ACGGCGGCAA CAGCCGGCTC
1051 AAGTGGGCGT GGGTGGAAAA CGGCACGTTC GCAACCGTCG GTAGCGCGCC
1101 GTACCGCGAT TTGTCGCCTT TGGGCGCGGA GTGGGCGGAA AAGGTGGATG
1151 GAAATGTCCG CATCGTCGGT TGCGCCGTGT GCGGAGAATT CAAAAAGGCA
1201 CAAGTGCAGG AACAGCTCGC CCGAAAAATC GAGTGGCTGC CGTCTTCCGC
1251 ACAGGCTTTG GGCATACGCA ACCACTACCG CCACCCCGAA GAACACGGTT
1301 CCGACCGCTG GTTCAACGCC TTGGGCAGCC GCCGCTTCAG CCGCAACGCC
1351 TGCGTCGTCG TCAGTTGCGG CACGGCGGTA ACGGTTGACG CGCTCACCGA
1401 TGACGGACAT TATCTCGGGG GAACCATCAT GCCCGGTTTC CACCTGATGA
1451 AAGAATCGCT CGCCGTCCGA ACCGCCAACC TCAACCGGCA CGCCGGTAAG
1501 CGTTATCCTT TCCCGACCAC AACGGGCAAT GCCGTCGCCA GCGGCATGAT
1551 GGATGCGGTT TGCGGCTCGG TTATGATGAT GCACGGGCGT TTGAAAGAAA
1601 AAACCGGGGC GGGCAAGCCT GTCGATGTCA TCATTACCGG CGGCGGCGCG
1651 GCAAAAGTTG CCGAAGCCCT GCCGCCTGCA TTTTTGGCGG AAAATACCGT
1701 GCGCGTGGCG GACAACCTCG TCATTCACGG GCTGCTGAAC CTGATTGCCG
1751 CCGAAGGCGG GGAATCGGAA CATACTTAA
It encodes a protein having the amino acid sequence <SEQ ID 236>:
1 MTVLKPSHWR VLAELADGLP QHVSQLARMA DMKPQQLNGF WQQMPAHIRG
51 LLRQHDGYWR LVRPLAVFDA EGLRELGERS GFQTALKHEC ASSNDEILEL
101 ARIAPDKAHK TICVTHLQSK GRGRQGRKWS HRLGECLMFS FGWVFDRPQY
151 ELGSLSPVAA VACRRALSRL GLKTQIKWPN DLVVGRDKLG GILIETVRTG
201 GKTVAVVGIG INFVLPKEVE NAASVQSLFQ TASRRGNADA AVLLETLLAE
251 LDAVLLQYAR DGFAPFVAEY QAANRDHGKA VLLLRDGETV FEGTVKGVDG
301 QGVLHLETAE GKQTVVSGEI SLRSDDRPVS VPKRRDSERF LLLDGGNSRL
351 KWAWVENGTF ATVGSAPYRD LSPLGAEWAE KVDGNVRIVG CAVCGEFKKA
401 QVQEQLARKI EWLPSSAQAL GIRNHYRHPE EHGSDRWFNA LGSRRFSRNA
451 CVVVSCGTAV TVDALTDDGH YLGGTIMPGF HLMKESLAVR TANLNRHAGK
501 RYPFPTTTGN AVASGMMDAV CGSVMMMHGR LKEKTGAGKP VDVIITGGGA
551 AKVAEALPPA FLAENTVRVA DNLVIHGLLN LIAAEGGESE HT*
ORF61a and ORF61-1 show 98.5% identity over a 591 amino acid overlap:
10 20 30 40 50 60
orf61a.pep MTVLKPSHWRVLAELADGLPQHVSQLARMADMKPQQLNGFWQQMPAHIRGLLRQHDGYWR
||||| ||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf61-1 MTVLKLSHWRVLAELADGLPQHVSQLARMADMKPQQLNGFWQQMPAHIRGLLRQHDGYWR
10 20 30 40 50 60
70 80 90 100 110 120
orf61a.pep LVRPLAVFDAEGLRELGERSGFQTALKHECASSNDEILELARIAPDKAHKTICVTHLQSK
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf61-1 LVRPLAVFDAEGLRELGERSGFQTALKHECASSNDEILELARIAPDKAHKTICVTHLQSK
70 80 90 100 110 120
130 140 150 160 170 180
orf61a.pep GRGRQGRKWSHRLGECLMFSFGWVFDRPQYELGSLSPVAAVACRRALSRLGLKTQIKWPN
|||||||||||||||||||||||||||||||||||||||||||||||||||| :||||||
orf61-1 GRGRQGRKWSHRLGECLMFSFGWVFDRPQYELGSLSPVAAVACRRALSRLGLDVQIKWPN
130 140 150 160 170 180
190 200 210 220 230 240
orf61a.pep DLVVGRDKLGGILIETVRTGGKTVAVVGIGINFVLPKEVENAASVQSLFQTASRRGNADA
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf61-1 DLVVGRDKLGGILIETVRTGGKTVAVVGIGINFVLPKEVENAASVQSLFQTASRRGNADA
190 200 210 220 230 240
250 260 270 280 290 300
orf61a.pep AVLLETLLAELDAVLLQYARDGFAPFVAEYQAANRDHGKAVLLLRDGETVFEGTVKGVDG
||||||||:|||||||||||||||||||||||||||||||||||||||||||||||||||
orf61-1 AVLLETLLVELDAVLLQYARDGFAPFVAEYQAANRDHGKAVLLLRDGETVFEGTVKGVDG
250 260 270 280 290 300
310 320 330 340 350 360
orf61a.pep QGVLHLETAEGKQTVVSGEISLRSDDRPVSVPKRRDSERFLLLDGGNSRLKWAWVENGTF
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf61-1 QGVLHLETAEGKQTVVSGEISLRSDDRPVSVPKRRDSERFLLLDGGNSRLKWAWVENGTF
310 320 330 340 350 360
370 380 390 400 410 420
orf61a.pep ATVGSAPYRDLSPLGAEWAEKVDGNVRIVGCAVCGEFKKAQVQEQLARKIEWLPSSAQAL
|||||||||||||||||||||:||||||||||||||||||||||||||||||||||||||
orf61-1 ATVGSAPYRDLSPLGAEWAEKADGNVRIVGCAVCGEFKKAQVQEQLARKIEWLPSSAQAL
370 380 390 400 410 420
430 440 450 460 470 480
orf61a.pep GIRNHYRHPEEHGSDRWFNALGSRRFSRNACVVVSCGTAVTVDALTDDGHYLGGTIMPGF
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf61-1 GIRNHYRHPEEHGSDRWFNALGSRRFSRNACVVVSCGTAVTVDALTDDGHYLGGTIMPGF
430 440 450 460 470 480
490 500 510 520 530 540
orf61a.pep HLMKESLAVRTANLNRHAGKRYPFPTTTGNAVASGMMDAVCGSVMMMHGRLKEKTGAGKP
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf61-1 HLMKESLAVRTANLNRHAGKRYPFPTTTGNAVASGMMDAVCGSVMMMHGRLKEKTGAGKP
490 500 510 520 530 540
550 560 570 580 590
orf61a.pep VDVIITGGGAAKVAEALPPAFLAENTVRVADNLVIHGLLNLIAAEGGESEHTX
|||||||||||||||||||||||||||||||||||:||||:||||| | ||
orf61-1 VDVIITGGGAAKVAEALPPAFLAENTVRVADNLVIYGLLNMIAAEGREYEHIX
550 560 570 580 590
Homology with a predicted ORF from N. gonorrhoeae
ORF61 shows 94.2% identity over a 189 amino acid overlap with a predicted ORF (ORF61.ng) from N. gonorrhoeae:
orf61.pep EISLRSDXRPVSVXKRRDSERFLLLDGGNS 30
||||| | | ||| || ||||||||:||||
orf61ng TVCEGTVKGVDGRGVLHLETAEGEQTVVSGEISLRPDNRSVSVPKRPDSERFLLLEGGNS 211
orf61.pep RLKWAWVENGTFATVGSAPYRDLSPLGAEWAEKADGNVRIVGCAVCGEFKKAQVQEQLAR 90
|||||||||||||||||||||||||||||||||||||||||||||||| |||||:|||||
orf61ng RLKWAWVENGTFATVGSAPYRDLSPLGAEWAEKADGNVRIVGCAVCGESKKAQVKEQLAR 271
orf61.pep KIEWLPSSAQAXGIRNHYRHPEEHGSDRWFNALGSRRFSRNACVVVSCGTAVTVDALTDD 150
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf61ng KIEWLPSSAQALGIRNHYRHPEEHGSDRWFNALGSRRFSRNACVVVSCGTAVTVDALTDD 331
orf61.pep GHYLGXGTIMPGFHLMKESLAVRTANLNRHAGKRYPFPT 189
||||| ||||||||||||||||||||||| |||||||||
orf61ng GHYLG-GTIMPGFHLMKESLAVRTANLNRPAGKRYPFPTTTGNAVASGMMDAVCGSIMMM 390
The predicted ORF61ng nucleotide sequence <SEQ ID 237> encodes a protein having the amino acid sequence <SEQ ID 238>:
1 MFSFGWAFDR PQYELGSLSP VAALACRRAL GCLGLETQIK WPNDLVVGRD
51 KLGGILIETV RAGGKTVAVV GIGINFVLPK EVENAASVQS LFQTASRRGN
101 ADAAVLLETL LAELGAVLEQ YAEEGFAPFL NEYETANRDH GKAVLLLRDG
151 ETVCEGTVKG VDGRGVLHLE TAEGEQTVVS GEISLRPDNR SVSVPKRPDS
201 ERFLLLEGGN SRLKWAWVEN GTFATVGSAP YRDLSPLGAE WAEKADGNVR
251 IVGCAVCGES KKAQVKEQLA RKIEWLPSSA QALGIRNHYR HPEEHGSDRW
301 FNALGSRRFS RNACVVVSCG TAVTVDALTD DGHYLGGTIM PGFHLMKESL
351 AVRTANLNRP AGKRYPFPTT TGNAVASGMM DAVCGSIMMM HGRLKEKNGA
401 GKPVDVIITG GGAAKVAEAL PPAFLAENTV RVADNLVIHG LLNLIAAEGG
451 ESEHA*
Further analysis revealed the complete gonococcal DNA sequence <SEQ ID 239> to be:
1 ATGACGGTTT TGAAGCCTTC GCATTGGCGG GTGTTGGCGG AGCTTGCCGA
51 CGGTTTGCCG CAACACGTAT CGCAATTGGC GCGTGAGGCG GACATGAAGC
101 CGCAGCAGCT CAACGGTTTT TGGCAGCAGA TGCCGGCGCA TATACGCGGG
151 CTGTTGCGCC AACACGACGG CTATTGGCGG CTGGTGCGCC CCTTGGCGGT
201 TTTCGATGCC GAAGGTTTGC GCGATCTGGG GGAAAGGTCG GGTTTTCAGA
251 CGGCATTGAA GCACGAGTGC GCGTCCAGCA ACGACGAGAT ACTGGAATTG
301 GCGCGGATTG CGCCGGACAA GGCGCACAAA ACCATATGCG TGACCCACCT
351 GCAAAGTAAG GGCAGGGGGC GGCAGGGGCG GAAGTGGTCG CACCGTTTGG
401 GCGAGTGCCT GATGTTCAGT TTCGGCTGGG CGTTTGACCG GCCGCAGTAT
451 GAGTTGGGTT CGCTGTCGCC TGTTGCGGCA CTTGCGTGCC GGCGCGCTTT
501 GGGGTGTTTG GGTTTGGAAA CGCAAATCAA GTGGCCAAAC GATTTGGTCG
551 TCGGACGCGA CAAATTGGGC GGCATTCTGA TTGAAACAGT CAGGGCGGGC
601 GGTAAAACGG TTGCCGTGGT CGGTATCGGC ATCAATTTCG TGCTGCCCAA
651 GGAAGTGGAA AACGCCGCTT CCGTGCAGTC GCTGTTTCAG ACGGCATCGC
701 GGCGGGGCAA TGCCGATGCC GCCGTATTGC TGGAAACATT GCTTGCGGAA
751 CTGGGCGCGG TGTTGGAACA ATATGCGGAA GAAGGGTTCG CGCCATTTTT
801 AAATGAGTAT GAAACGGCCA ACCGCGACCA CGGCAAGGCG GTATTGCTGT
851 TGCGCGACGG CGAAACCGTG TGCGAAGGCA CGGTTAAAGG CGTGGACGGA
901 CGAGGCGTTC TGCACTTGGA AACGGCAgaa ggcgaACAGa cggtcgtcag
951 cggcgaaaTC AGcctGCggc ccgacaacaG GTCGGtttcc gtgccgaagc
1001 ggccggatTC GgaacgtTTT tTGCtgttgg aaggcgggaa cagccgGCTC
1051 AAGTGGGCGT GggtggAAAa cggcacgttc gcaaccgtgg gcagcgcgCc
1101 gtaCCGCGAT TTGTCGCCTT TGGGCGCGGA GTGGGCGGAA AAGGCGGATG
1151 GAAATGTCCG CATCGTCGGT TGCGCCGTGT GCGGAGAATC CAAAAAGGCA
1201 CAAGTGAAGG AACAGCTCGC CCGAAAAATC GAGTGGCTGC CGTCTTCCGC
1251 ACAGGCTTTG GGCATACGCA ACCACTACCG CCACCCCGAA GAACACGGTT
1301 CCGACCGTTG GTTCAACGCC TTGGGCAGCC GCCGCTTCAG CCGCAACGCC
1351 TGCGTCGTCG TCAGTTGCGG CACGGCGGTA ACGGTTGACG CGCTCACCGA
1401 TGACGGACAT TATCTCGGCG GAACCATCAT GCCCGGCTTC CACCTGATGA
1451 AAGAATCGCT CGCCGTCCGA ACCGCCAACC TCAACCGCCC CGCCGGCAAA
1501 CGTTACCCTT TCCCGACCAC AACGGGCAAC GCCGTCGCAA GCGGCATGAT
1551 GGACGCGGTT TGCGGCTCGA TAATGATGAT GCACGGCCGT TTGAAAGAAA
1601 AAAACGGCGC GGGCAAGCCT GTCGATGTCA TCATTACCGG CGGCGGCGCG
1651 GCGAAAGTCG CCGAAGCCCT GCCGCCTGCA TTTTTGGCGG AAAATACCGT
1701 GCGCGTGGCG GACAACCTCG TCATCCACGG GCTGCTGAAC CTGATTGCCG
1751 CCGAAGGCGG GGAATCGGAA CACGCTTAA
This corresponds to the amino acid sequence <SEQ ID 240; ORF61ng-1>:
1 MTVLKPSHWR VLAELADGLP QHVSQLAREA DMKPQQLNGF WQQMPAHIRG
51 LLRQHDGYWR LVRPLAVFDA EGLRDLGERS GFQTALKHEC ASSNDEILEL
101 ARIAPDKAHK TICVTHLQSK GRGRQGRKWS HRLGECLMFS FGWAFDRPQY
151 ELGSLSPVAA LACRRALGCL GLETQIKWPN DLVVGRDKLG GILIETVRAG
201 GKTVAVVGIG INFVLPKEVE NAASVQSLFQ TASRRGNADA AVLLETLLAE
251 LGAVLEQYAE EGFAPFLNEY ETANRDHGKA VLLLRDGETV CEGTVKGVDG
301 RGVLHLETAE GEQTVVSGEI SLRPDNRSVS VPKRPDSERF LLLEGGNSRL
351 KWAWVENGTF ATVGSAPYRD LSPLGAEWAE KADGNVRIVG CAVCGESKKA
401 QVKEQLARKI EWLPSSAQAL GIRNHYRHPE EHGSDRWFNA LGSRRFSRNA
451 CVVVSCGTAV TVDALTDDGH YLGGTIMPGF HLMKESLAVR TANLNRPAGK
501 RYPFPTTTGN AVASGMMDAV CGSIMMMHGR LKEKNGAGKP VDVIITGGGA
551 AKVAEALPPA FLAENTVRVA DNLVIHGLLN LIAAEGGESE HA*
ORF61ng-1 and ORF61-1 show 93.9% identity over a 591 amino acid overlap:
orf61ng-1.pep MTVLKPSHWRVLAELADGLPQHVSQLAREADMKPQQLNGFWQQMPAHIRGLLRQHDGYWR 60
||||| |||||||||||||||||||||| |||||||||||||||||||||||||||||||
orf61-1 MTVLKLSHWRVLAELADGLPQHVSQLARMADMKPQQLNGFWQQMPAHIRGLLRQHDGYWR 60
orf61ng-1.pep LVRPLAVFDAEGLRDLGERSGFQTALKHECASSNDEILELARIAPDKAHKTICVTHLQSK 120
||||||||||||||:|||||||||||||||||||||||||||||||||||||||||||||
orf61-1 LVRPLAVFDAEGLRELGERSGFQTALKHECASSNDEILELARIAPDKAHKTICVTHLQSK 120
orf61ng-1.pep GRGRQGRKWSHRLGECLMFSFGWAFDRPQYELGSLSPVAALACRRALGCLGLETQIKWPN 180
|||||||||||||||||||||||:|||||||||||||||||||||||: |||::||||||
orf61-1 GRGRQGRKWSHRLGECLMFSFGWVFDRPQYELGSLSPVAAVACRRALSRLGLDVQIKWPN 180
orf61ng-1.pep DLVVGRDKLGGILIETVRAGGKTVAVVGIGINFVLPKEVENAASVQSLFQTASRRGNADA 240
||||||||||||||||||:|||||||||||||||||||||||||||||||||||||||||
orf61-1 DLVVGRDKLGGILIETVRTGGKTVAVVGIGINFVLPKEVENAASVQSLFQTASRRGNADA 240
orf61ng-1.pep AVLLETLLAELGAVLEQYAEEGFAPFLNEYETANRDHGKAVLLLRDGETVCEGTVKGVDG 300
||||||||:|| ||| |||::|||||: ||::|||||||||||||||||| |||||||||
orf61-1 AVLLETLLVELDAVLLQYARDGFAPFVAEYQAANRDHGKAVLLLRDGETVFEGTVKGVDG 300
orf61ng-1.pep RGVLHLETAEGEQTVVSGEISLRPDNRSVSVPKRPDSERFLLLEGGNSRLKWAWVENGTF 360
:||||||||||:||||||||||| |:| |||||| ||||||||:|||||||||||||||
orf61-1 QGVLHLETAEGKQTVVSGEISLRSDDRPVSVPKRRDSERFLLLDGGNSRLKWAWVENGTF 360
orf61ng-1.pep ATVGSAPYRDLSPLGAEWAEKADGNVRIVGCAVCGESKKAQVKEQLARKIEWLPSSAQAL 420
|||||||||||||||||||||||||||||||||||| |||||:|||||||||||||||||
orf61-1 ATVGSAPYRDLSPLGAEWAEKADGNVRIVGCAVCGEFKKAQVQEQLARKIEWLPSSAQAL 420
orf61ng-1.pep GIRNHYRHPEEHGSDRWFNALGSRRFSRNACVVVSCGTAVTVDALTDDGHYLGGTIMPGF 480
|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf61-1 GIRNHYRHPEEHGSDRWFNALGSRRFSRNACVVVSCGTAVTVDALTDDGHYLGGTIMPGF 480
orf61ng-1.pep HLMKESLAVRTANLNRPAGKRYPFPTTTGNAVASGMMDAVCGSIMMMHGRLKEKNGAGKP 540
|||||||||||||||| ||||||||||||||||||||||||||:||||||||||:|||||
orf61-1 HLMKESLAVRTANLNRHAGKRYPFPTTTGNAVASGMMDAVCGSVMMMHGRLKEKTGAGKP 540
orf61ng-1.pep VDVIITGGGAAKVAEALPPAFLAENTVRVADNLVIHGLLNLIAAEGGESEHAX 593
||||||||||||||||||||||||||||||||||:||||:|||||| | ||
orf61-1 VDVIITGGGAAKVAEALPPAFLAENTVRVADNLVIYGLLNMIAAEGREYEHIX 593
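Based on this analysis, including the homology with the baf protein of Bordetella parapertussis and the presence of a putative prokaryotic membrane lipoprotein lipid attachment site, it is predicted that these proteins from N. meningitidis and N. gonorrhoeae, and their epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies.
Example 29
The following partial DNA sequence <SEQ ID 241> was identified in N. meningitidis: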
1 ATGTTTTACC AAATCCTTGC CCTGATTATC TGGAGCAGCT CGTTTATTGC
51 CGCCAAATAT GTCTATGGCG GCATCGATCC CGCATTGATG GTCGGCGTGC
101 GCCTGCTAAT TGCCGCGCTG CCTGCACTGC CCGCCTGCCG CCGTCATGTC
151 GGCAAGATTC CGCGTGAGGA ATGGAAGCCG TTGCTGATTG TGTCGTTCGT
201 CAACTATGTG CTGACCCTGC TGCTTCAGTT TGTCGGGTTG AAATACACTT
251 CCGCCGCCAG CGCATCGGTC ATTGTCGGAC TCGAGCCGCT GCTGATGGTG
301 TTTGTCGGAC ACTTTTTCTT CAACGACAAA GCGCGTGCCT ACCACTGGAT
351 ATGCGGCGCG GCGGCATTTG CCGGTGTCGC GCTGCTGATG GCGGGCGGTG
401 CGGaAGAGGG CGGCGaAGTC GGCTGGTTCG GCTGCCTGCT GGTGTTGTTG
451 GCGGGCGCGG GCTTTTGTGC CGCTATGCGT CCGACGCAAA GGCTGATTGC
501 ACGCATCGGC GCACCGGCAT TCACATCTGT TTCCATTGCC GCCGCATCGT
551 TGATGTGCCT GCCGTTTTCG CTTGCTTTGG CGCAAAGTTA TACCGTGGAC
601 TGGAGCGTCG GGATGGTATT GTCGCTGCTG TATTTGGGTT TGGGGTGC..
This corresponds to the amino acid sequence <SEQ ID 242; ORF62>:
1 MFYQILALII WSSSFIAAKY VYGGIDPALM VGVRLLIAAL PALPACRRHV
51 GKIPREEWKP LLIVSFVNYV LTLLLQFVGL KYTSAASASV IVGLEPLLMV
101 FVGHFFFNDK ARAYHWICGA AAFAGVALLM AGGAEEGGEV GWFGCLLVLL
151 AGAGFCAAMR PTQRLIARIG APAFTSVSIA AASLMCLPFS LALAQSYTVD
201 WSVGMVLSLL YLGLGC..
Further work revealed the complete nucleotide sequence <SEQ ID 243>:
1 ATGTTTTACC AAATCCTTGC CCTGATTATC TGGAGCAGCT CGTTTATTGC
51 CGCCAAATAT GTCTATGGCG GCATCGATCC CGCATTGATG GTCGGCGTGC
101 GCCTGCTAAT TGCCGCGCTG CCTGCACTGC CCGCCTGCCG CCGTCATGTC
151 GGCAAGATTC CGCGTGAGGA ATGGAAGCCG TTGCTGATTG TGTCGTTCGT
201 CAACTATGTG CTGACCCTGC TGCTTCAGTT TGTCGGGTTG AAATACACTT
251 CCGCCGCCAG CGCATCGGTC ATTGTCGGAC TCGAGCCGCT GCTGATGGTG
301 TTTGTCGGAC ACTTTTTCTT CAACGACAAA GCGCGTGCCT ACCACTGGAT
351 ATGCGGCGCG GCGGCATTTG CCGGTGTCGC GCTGCTGATG GCGGGCGGTG
401 CGGAAGAGGG CGGCGAAGTC GGCTGGTTCG GCTGCCTGCT GGTGTTGTTG
451 GCGGGCGCGG GCTTTTGTGC CGCTATGCGT CCGACGCAAA GGCTGATTGC
501 ACGCATCGGC GCACCGGCAT TCACATCTGT TTCCATTGCC GCCGCATCGT
551 TGATGTGCCT GCCGTTTTCG CTTGCTTTGG CGCAAAGTTA TACCGTGGAC
601 TGGAGCGTCG GGATGGTATT GTCGCTGCTG TATTTGGGTT TGGGGTGCGG
651 CTGGTACGCC TATTGGCTGT GGAACAAGGG GATGAGCCGT GTTCCTGCCA
701 ATGTTTCGGG ACTGTTGATT TCGCTCGAAC CCGTCGTCGG CGTGCTGCTG
751 GCGGTTTTGA TTTTGGGCGA ACACCTGTCG CCCGTGTCCG CCTTGGGCGT
801 GTTTGTCGTC ATCGCCGCCA CCTTGGTTGC CGGCCGGCTG TCGCATCAAA
851 AATAA
This corresponds to the amino acid sequence <SEQ ID 244; ORF62-1>:
1 MFYQILALII WSSSFIAAKY VYGGIDPALM VGVRLLIAAL PALPACRRHV
51 GKIPREEWKP LLIVSFVNYV LTLLLQFVGL KYTSAASASV IVGLEPLLMV
101 FVGHFFFNDK ARAYHWICGA AAFAGVALLM AGGAEEGGEV GWFGCLLVLL
151 AGAGFCAAMR PTQRLIARIG APAFTSVSIA AASLMCLPFS LALAQSYTVD
201 WSVGMVLSLL YLGLGCGWYA YWLWNKGMSR VPANVSGLLI SLEPVVGVLL
251 AVLILGEHLS PVSALGVFVV IAATLVAGRL SHQK*
Computer analysis of this amino acid sequence gave the following results:
Homology with hypothetical transmembrane protein HI0976 of Haemophilus influenzae (accession number Q57147)
ORF62 and HI0976 show 50% amino acid identity over a 114 amino acid overlap:
Orf62 1 MFYQILALIIWSSSFIAAKYVYGGIDPALMVGVRXXXXXXXXXXXCRRHVGKIPREEWKP 60
M YQILAL+IWSSS I K Y +DP L+V VR R KI + K
HI0976 1 MLYQILALLIWSSSLIVGKLTYSMMDPVLVVQVRLIIAMIIVMPLFLRRWKKIDKPMRKQ 60
Orf62 61 LLIVSFVNYVLTLLLQFVGLKYTSAASASVIVGLEPLLMVFVGHFFFNDKARAY 114
L ++F NY LLQF+GLKYTSA+SA ++GLEPLL+VFVGHFFF K +
HI0976 61 LWWLAFFNYTAVFLLQFIGLKYTSASSAVTMIGLEPLLVVFVGHFFFKTKQNGF 114
Homology with a predicted ORF from N. meningitidis (strain A)
ORF62 shows 99.5% identity over a 216 amino acid overlap with an ORF (ORF62a) from strain A of N. meningitidis:
10 20 30 40 50 60
orf62.pep MFYQILALIIWSSSFIAAKYVYGGIDPALMVGVRLLIAALPALPACRRHVGKIPREEWKP
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf62a MFYQILALIIWSSSFIAAKYVYGGIDPALMVGVRLLIAALPALPACRRHVGKIPREEWKP
10 20 30 40 50 60
70 80 90 100 110 120
orf62.pep LLIVSFVNYVLTLLLQFVGLKYTSAASASVIVGLEPLLMVFVGHFFFNDKARAYHWICGA
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf62a LLIVSFVNYVLTLLLQFVGLKYTSAASASVIVGLEPLLMVFVGHFFFNDKARAYHWICGA
70 80 90 100 110 120
130 140 150 160 170 180
orf62.pep AAFAGVALLMAGGAEEGGEVGWFGCLLVLLAGAGFCAAMRPTQRLIARIGAPAFTSVSIA
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf62a AAFAGVALLMAGGAEEGGEVGWFGCLLVLLAGAGFCAAMRPTQRLIARIGAPAFTSVSIA
130 140 150 160 170 180
190 200 210
orf62.pep AASLMCLPFSLALAQSYTVDWSVGMVLSLLYLGLGC
|||||||||||||||||||||||||||||||||:||
orf62a AASLMCLPFSLALAQSYTVDWSVGMVLSLLYLGVGCSWYAYWLWNKGMSRVPANVSGLLI
190 200 210 220 230 240
orf62a SLEPVVGVLLAVLILGEHLSPVSVLGVFVVIAATLVAGRLSHQKX
250 260 270 280
The full-length ORF62a nucleotide sequence <SEQ ID 245> is:
1 ATGTTTTACC AAATCCTTGC CCTGATTATC TGGAGCAGCT CGTTTATTGC
51 CGCCAAATAT GTCTATGGCG GCATCGATCC CGCATTGATG GTCGGCGTGC
101 GCCTGCTGAT TGCTGCGCTG CCTGCACTGC CCGCCTGCCG CCGTCATGTC
151 GGCAAGATTC CGCGTGAGGA ATGGAAGCCG TTGCTGATTG TGTCGTTCGT
201 CAACTATGTG CTGACCCTGC TACTTCAGTT TGTCGGGTTG AAATACACTT
251 CCGCCGCCAG CGCATCGGTC ATTGTCGGAC TCGAGCCACT GCTGATGGTG
301 TTTGTCGGAC ACTTTTTCTT CAACGACAAA GCGCGTGCCT ACCACTGGAT
351 ATGCGGCGCG GCGGCATTTG CCGGTGTCGC GCTGCTGATG GCGGGCGGTG
401 CGGAAGAGGG CGGCGAAGTC GGCTGGTTCG GCTGCCTGCT GGTGTTGTTG
451 GCGGGCGCGG GCTTTTGTGC CGCTATGCGT CCGACGCAAA GGCTGATTGC
501 ACGCATCGGC GCACCGGCAT TCACATCTGT TTCCATTGCC GCCGCATCGT
551 TGATGTGCCT GCCGTTTTCG CTTGCTTTGG CGCAAAGTTA TACCGTGGAC
601 TGGAGCGTCG GAATGGTATT GTCGCTGCTG TATTTGGGCG TGGGGTGCAG
651 CTGGTACGCC TATTGGCTGT GGAACAAGGG GATGAGCCGT GTTCCTGCCA
701 ACGTTTCGGG ACTGTTGATT TCGCTCGAAC CCGTCGTCGG CGTGCTGCTG
751 GCGGTTTTGA TTTTGGGCGA ACACCTGTCG CCCGTGTCCG TCTTGGGCGT
801 GTTTGTCGTC ATCGCCGCCA CCTTGGTTGC CGGCCGGCTG TCGCATCAAA
851 AATAA
This encodes a protein having amino acid sequence <SEQ ID 246>:
1 MFYQILALII WSSSFIAAKY VYGGIDPALM VGVRLLIAAL PALPACRRHV
51 GKIPREEWKP LLIVSFVNYV LTLLLQFVGL KYTSAASASV IVGLEPLLMV
101 FVGHFFFNDK ARAYHWICGA AAFAGVALLM AGGAEEGGEV GWFGCLLVLL
151 AGAGFCAAMR PTQRLIARIG APAFTSVSIA AASLMCLPFS LALAQSYTVD
201 WSVGMVLSLL YLGVGCSWYA YWLWNKGMSR VPANVSGLLI SLEPVVGVLL
251 AVLILGEHLS PVSVLGVFVV IAATLVAGRL SHQK*
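The relationship between the nucleotide listing and the encoded protein can be illustrated with a short translation check. This is a sketch assuming the standard genetic code; the function name `translate` is hypothetical and the source does not say which software produced the translation. The two fragments are the first 30 nucleotides and the last 15 nucleotides (positions 841-855, which are in frame) of SEQ ID 245.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

def translate(dna):
    """Translate an in-frame DNA string; spaces are ignored and any
    trailing partial codon is dropped. '*' marks a stop codon."""
    dna = dna.replace(" ", "").upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))

# SEQ ID 245 is 855 nt long: 285 codons, i.e. 284 residues plus a stop codon.
orf_length_nt = 855
codons = orf_length_nt // 3
residues = codons - 1

five_prime = "ATGTTTTACC AAATCCTTGC CCTGATTATC"   # nt 1-30
three_prime = "TCGCATCAAA AATAA"                   # nt 841-855

print(translate(five_prime))   # MFYQILALII
print(translate(three_prime))  # SHQK*
```

The 284-residue length agrees with the 284 aa overlap reported for the comparison of ORF62a with ORF62-1.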
ORF62a and ORF62-1 show 98.9% identity over a 284 aa overlap:
orf62a.pep MFYQILALIIWSSSFIAAKYVYGGIDPALMVGVRLLIAALPALPACRRHVGKIPREEWKP 60
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf62-1 MFYQILALIIWSSSFIAAKYVYGGIDPALMVGVRLLIAALPALPACRRHVGKIPREEWKP 60
orf62a.pep LLIVSFVNYVLTLLLQFVGLKYTSAASASVIVGLEPLLMVFVGHFFFNDKARAYHWICGA 120
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf62-1 LLIVSFVNYVLTLLLQFVGLKYTSAASASVIVGLEPLLMVFVGHFFFNDKARAYHWICGA 120
orf62a.pep AAFAGVALLMAGGAEEGGEVGWFGCLLVLLAGAGFCAAMRPTQRLIARIGAPAFTSVSIA 180
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf62-1 AAFAGVALLMAGGAEEGGEVGWFGCLLVLLAGAGFCAAMRPTQRLIARIGAPAFTSVSIA 180
orf62a.pep AASLMCLPFSLALAQSYTVDWSVGMVLSLLYLGVGCSWYAYWLWNKGMSRVPANVSGLLI 240
|||||||||||||||||||||||||||||||||:||:|||||||||||||||||||||||
orf62-1 AASLMCLPFSLALAQSYTVDWSVGMVLSLLYLGLGCGWYAYWLWNKGMSRVPANVSGLLI 240
orf62a.pep SLEPVVGVLLAVLILGEHLSPVSVLGVFVVIAATLVAGRLSHQKX 285
|||||||||||||||||||||||:|||||||||||||||||||||
orf62-1 SLEPVVGVLLAVLILGEHLSPVSALGVFVVIAATLVAGRLSHQKX 285
Homology with a predicted ORF from N. gonorrhoeae
ORF62 shows 99.5% identity over a 216 aa overlap with a predicted ORF (ORF62.ng) from N. gonorrhoeae:
orf62.pep MFYQILALIIWSSSFIAAKYVYGGIDPALMVGVRLLIAALPALPACRRHVGKIPREEWKP 60
|||||||||||:||||||||||||||||||||||||||||||||||||||||||||||||
orf62ng MFYQILALIIWGSSFIAAKYVYGGIDPALMVGVRLLIAALPALPACRRHVGKIPREEWKP 60
orf62.pep LLIVSFVNYVLTLLLQFVGLKYTSAASASVIVGLEPLLMVFVGHFFFNDKARAYHWICGA 120
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf62ng LLIVSFVNYVLTLLLQFVGLKYTSAASASVIVGLEPLLMVFVGHFFFNDKARAYHWICGA 120
orf62.pep AAFAGVALLMAGGAEEGGEVGWFGCLLVLLAGAGFCAAMRPTQRLIARIGAPAFTSVSIA 180
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf62ng AAFAGVALLMAGGAEEGGEVGWFGCLLVLLAGAGFCAAMRPTQRLIARIGAPAFTSVSIA 180
orf62.pep AASLMCLPFSLALAQSYTVDWSVGMVLSLLYLGLGC 216
||||||||||||||||||||||||||||||||||||
orf62ng AASLMCLPFSLALAQSYTVDWSVGMVLSLLYLGLGCGWYAYWLWNKGMSRVPANASGLLI 240
The full-length ORF62ng nucleotide sequence <SEQ ID 247> is:
1 ATGTTTTACC AAATCCTTGC CCTGATTATC TGGGGCAGCT CGTTTATTGC
51 CGCCAAATAT GTCTATGGCG GCATCGATCC CGCATTGATG GTCGGCGTGC
101 GCCTGCTGAT TGCCGCGCTG CCTGCACTGC CCGCCTGCCG CCGTCATGTC
151 GGCAAGATTC CGCGTGAGGA ATGGAAGCCG TTGCTGATTG TGTCGTTCGT
201 CAACTATGTG CTGACCCTGC TGCTTCAGTT TGTCGGGTTG AAATACACTT
251 CCGCCGCCAG CGCATCGGTC ATTGTCGGAC TCGAGCCGCT GCTGATGGTG
301 TTTGTCGGAC ACTTTTTCTT CAACGACAAA GCGCGTGCCT ACCACTGGAT
351 ATGCGGCGCG GCGGCATTTG CCGGTGTCGC GCTGCTGATG GCGGGCGGTG
401 CGGAAGAGGG CGGCGAAGTC GGCTGGTTCG GCTGCCTGCT GGTGTTGTTG
451 GCGGGCGCGG GCTTTTGTGC CGCTATGCGT CCGACGCAAA GGCTGATTGC
501 CCGCATCGGC GCACCGGCAT TCACATCTGT TTCCATTGCC GCCGCATCGT
551 TGATGTGCCT GCCGTTTTCG CTTGCTTTGG CGCAAAGTTA TACCGTGGAC
601 TGGAGCGTCG GGATGGTATT GTCGCTGTTG TATTTGGGTT TGGGGTGCGG
651 CTGGTACGCC TATTGGCTGT GGAACAAGGG GATGAGCCGT GTTCCTGCCA
701 ACGCGTCGGG ACTGTTGATT TCGCTCGAAC CCGTCGTCGG CGTGCTGTTG
751 GCGGTTTTGA TTTTGGGCGA ACATTTATCG CCCGTGTCCG CCTTGGGCGT
801 GTTTGTCGTC ATCGCCGCCA CTTTCGCCGC CGGCCGGCTG TCGCGCAGGG
851 ACGCGCAAAA CGGCAATGCC GTCTGA
This encodes a protein having amino acid sequence <SEQ ID 248>:
1 MFYQILALII WGSSFIAAKY VYGGIDPALM VGVRLLIAAL PALPACRRHV
51 GKIPREEWKP LLIVSFVNYV LTLLLQFVGL KYTSAASASV IVGLEPLLMV
101 FVGHFFFNDK ARAYHWICGA AAFAGVALLM AGGAEEGGEV GWFGCLLVLL
151 AGAGFCAAMR PTQRLIARIG APAFTSVSIA AASLMCLPFS LALAQSYTVD
201 WSVGMVLSLL YLGLGCGWYA YWLWNKGMSR VPANASGLLI SLEPVVGVLL
251 AVLILGEHLS PVSALGVFVV IAATFAAGRL SRRDAQNGNA V*
ORF62ng and ORF62-1 show 97.9% identity over a 283 aa overlap:
10 20 30 40 50 60
orf62ng.pep MFYQILALIIWGSSFIAAKYVYGGIDPALMVGVRLLIAALPALPACRRHVGKIPREEWKP
||||||||||||:|||||||||||||||||||||||||||||||||||||||||||||||
orf62-1 MFYQILALIIWSSSFIAAKYVYGGIDPALMVGVRLLIAALPALPACRRHVGKIPREEWKP
10 20 30 40 50 60
70 80 90 100 110 120
orf62ng.pep LLIVSFVNYVLTLLLQFVGLKYTSAASASVIVGLEPLLMVFVGHFFFNDKARAYHWICGA
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf62-1 LLIVSFVNYVLTLLLQFVGLKYTSAASASVIVGLEPLLMVFVGHFFFNDKARAYHWICGA
70 80 90 100 110 120
130 140 150 160 170 180
orf62ng.pep AAFAGVALLMAGGAEEGGEVGWFGCLLVLLAGAGFCAAMRPTQRLIARIGAPAFTSVSIA
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf62-1 AAFAGVALLMAGGAEEGGEVGWFGCLLVLLAGAGFCAAMRPTQRLIARIGAPAFTSVSIA
130 140 150 160 170 180
190 200 210 220 230 240
orf62ng.pep AASLMCLPFSLALAQSYTVDWSVGMVLSLLYLGLGCGWYAYWLWNKGMSRVPANASGLLI
||||||||||||||||||||||||||||||||||||||||||||||||||||||:|||||
orf62-1 AASLMCLPFSLALAQSYTVDWSVGMVLSLLYLGLGCGWYAYWLWNKGMSRVPANVSGLLI
190 200 210 220 230 240
250 260 270 280 290
orf62ng.pep SLEPVVGVLLAVLILGEHLSPVSALGVFVVIAATFAAGRLSRRDAQNGNAVX
|||||||||||||||||||||||||||||||||||::||||::
orf62-1 SLEPVVGVLLAVLILGEHLSPVSALGVFVVIAATLVAGRLSHQKX
250 260 270 280
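In addition, ORF62ng shows significant homology with a hypothetical H. influenzae protein:
sp|Q57147|Y976_HAEIN hypothetical protein HI0976 >gi|1074589|pir||B64163 hypothetical protein HI0976 - Haemophilus influenzae (strain Rd KW20)
>gi|1574004 (U32778) hypothetical [Haemophilus influenzae] Length = 128
Score = 106 bits (262), Expect = 2e-22
Identities = 56/114 (49%), Positives = 68/114 (59%)
Query: 1 MFYQILALIIWGSSFIAAKYVYGGIDPALMVGVRXXXXXXXXXXXCRRHVGKIPREEWKP 60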
M YQILAL+IW SS I K Y +DP L+V VR R KI + K
Sbjct: 1 MLYQILALLIWSSSLIVGKLTYSMMDPVLVVQVRLIIAMIIVMPLFLRRWKKIDKPMRKQ 60
Query: 61 LLIVSFVNYVLTLLLQFVGLKYTSAASASVIVGLEPLLMVFVGHFFFNDKARAY 114
L ++F NY LLQF+GLKYTSA+SA ++GLEPLL+VFVGHFFF K +
Sbjct: 61 LWWLAFFNYTAVFLLQFIGLKYTSASSAVTMIGLEPLLVVFVGHFFFKTKQNGF 114
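The percentages in the database hit above follow directly from the counts it reports. The sketch below is a small illustrative calculation; the function name `blast_percent` is hypothetical. It uses integer (floor) division because that reproduces both quoted figures, whereas rounding to the nearest integer would give 60% for 68/114. That the search program truncated its percentages is an inference from these numbers, not something the source states.

```python
def blast_percent(count, length):
    """Whole-number percentage obtained by truncation (floor division)."""
    return (100 * count) // length

identities = blast_percent(56, 114)   # identical residues / aligned length
positives = blast_percent(68, 114)    # identical or conservatively substituted residues

print(f"Identities = 56/114 ({identities}%), Positives = 68/114 ({positives}%)")
```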
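Based on this analysis, including the homology with the H. influenzae transmembrane protein and the presence of a putative leader sequence and several transmembrane domains in the gonococcal protein, it is predicted that these proteins from N. meningitidis and N. gonorrhoeae, and their epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies.
Example 30
The following partial DNA sequence was identified in N. meningitidis <SEQ ID 249>: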
1 ATGCGCCGTT TTCTACCGAT CGCAGCCATA TGCGCmGwms TCCTGkkGTA
51 sGGACTGACG GCGGCAACCG GCAGCACCAG TTCGCTGGCG GATTATTTCT
101 GGTGGATTGT TGCGTTCAGC GCAATGCTGC TGCTGGTGTT GTCCGCCGTT
151 TTGGCACGTT ATGTCATATT GCTGTTGAAA GACAGGCGCG ACGGCGTATT
251 GCCGkACTGC CCGGCGTGTT TCTGTTCGGC TTTCCCGCAC AGTTCATCAA
301 CGGCACGATT AATTCGTGGT TCGGCAACGA TACCCACGAG GCGCTTGAAC
351 GCAGCCTCAA TTTGAGCAAG TCCGCATTGA ATTTGGCGGC AGACAACGCC
401 CTCGGCAACG CCGTCCCCGT GCAGATAGAC CTCATCGGCG CGGCTTCCCT
451 GCCCGGGGAT ATGGGCAGGG TGCTGGAACA TTACGCCGGC AGCGGTTTTG
501 CCCAGCTTGC CCTGTACAAy ksCGCAAGCG GCAAAATCGA AAAAAGCATC
551 AACCCGCACA AGCTCGATCA GCCGTTTCCA GGTAAGGCGC GTTGGGAaAa
601 AATCCaACGG GCGGGTTCGG TCAGGGATTT GGAAAGCATA GGCGGCGTAT
651 TGTaCGCGCA GGGCTGGCTG TCGGCGGGTA CGCACwACGG GCGCGATTAC
701 GCCTTGTTTT TCCGTCAGCC GGTTCCCAAA GGCGTGGCAG AGGATGCCGT
751 yTTAATCGAA AAGGCAAGGG CGAAATATGC TGAGTTGAGT TACAGCAAAA
801 AAGGTTTGCA GACCTTTTTC CTGGCAACCC TGCTGATTGC CTCGCTGCTG
851 TCGATTTTTC TTGCACTGGT CATGGCACTG TATTTCGCCC GCCGTTTCGT
901 CGAACCCGTC CTATCGCTTG CCGAGGGGGC GAAGGCGGTG GCGCAAGGCG
951 ATTTCAGCCA GACGCGCCCC GTGTTGCGCA ACGACGAGTT CGGACGCTTG
1001 ACCArGTTGT TCAACCACAT GACCGAGCAG CTTTCCATCG CCAAAGATGC
1051 AGACGAGCGC AACCGCCGGC GCGAGGAAGC CGCCAGGCAT TATCTTGAAT
1101 GCGTGTTGGA GGGGCTGACC ACGGGCGTGG TGGTGTTTGA CGAACAAGGC
1151 TGTCTGAAAA CCTTCAACAA AGCGGCGGGT ACC..
This corresponds to the amino acid sequence <SEQ ID 250; ORF64>:
1 MRRFLPIAAI CAXXLXXGLT AATGSTSSLA DYFWWIVAFS AMLLLVLSAV
51 LARYVILLLK DRRDGVFGSX XAKXPXXXMF TLVAXLPGVF LFGFPAQFIN
101 GTINSWFGND THEALERSLN LSKSALNLAA DNALGNAVPV QIDLIGAASL
151 PGDMGRVLEH YAGSGFAQLA LYNXASGKIE KSINPHKLDQ PFPGKARWEK
201 IQRAGSVRDL ESIGGVLYAQ GWLSAGTHXG RDYALFFRQP VPKGVAEDAV
251 LIEKARAKYA ELSYSKKGLQ TFFLATLLIA SLLSIFLALV MALYFARRFV
301 EPVLSLAEGA KAVAQGDFSQ TRPVLRNDEF GRLTXLFNHM TEQLSIAKDA
351 DERNRRREEA ARHYLECVLE GLTTGVVVFD EQGCLKTFNK AAGT..
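The partial DNA sequence contains lower-case IUPAC ambiguity codes (m, w, s, k, y, r), and the corresponding protein positions appear as X. The sketch below shows one way such codons can be translated; it is an illustration only, assuming the standard genetic code, and the function name `translate_ambiguous` is hypothetical. A codon is reported as X unless every possible expansion of its ambiguity codes encodes the same amino acid. Only the ambiguity codes that occur in this listing, plus N, are included.

```python
from itertools import product

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# IUPAC nucleotide codes (subset) and the bases each one stands for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "M": "AC", "R": "AG", "W": "AT", "S": "CG", "Y": "CT", "K": "GT",
    "N": "ACGT",
}

def translate_ambiguous(dna):
    """Translate in-frame DNA; a codon whose expansions disagree gives 'X'."""
    dna = dna.replace(" ", "").upper()
    protein = []
    for pos in range(0, len(dna) - len(dna) % 3, 3):
        codon = dna[pos:pos + 3]
        expansions = product(*(IUPAC[base] for base in codon))
        residues = {CODON_TABLE["".join(e)] for e in expansions}
        protein.append(residues.pop() if len(residues) == 1 else "X")
    return "".join(protein)

# Nucleotides 28-60 of SEQ ID 249 (codons 10-20 of the reading frame).
fragment = "ATATGCGCmGwmsTCCTGkkGTAsGGACTGACG"
print(translate_ambiguous(fragment))  # ICAXXLXXGLT
```

The result matches residues 10-20 of SEQ ID 250: the codon GCm is still read as A because both GCA and GCC encode alanine, while Gwm, sTC, kkG and TAs are ambiguous at the protein level.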
Further work revealed the complete nucleotide sequence <SEQ ID 251>:
1 ATGCGCCGTT TTCTACCGAT CGCAGCCATA TGCGCCGTCG TCCTGTTGTA
51 CGGACTGACG GCGGCAACCG GCAGCACCAG TTCGCTGGCG GATTATTTCT
101 GGTGGATTGT TGCGTTCAGC GCAATGCTGC TGCTGGTGTT GTCCGCCGTT
151 TTGGCACGTT ATGTCATATT GCTGTTGAAA GACAGGCGCG ACGGCGTATT
201 CGGTTCGCAG ATTGCCAAAC GCCTTTCTGG GATGTTTACG CTGGTTGCCG
251 TACTGCCCGG CGTGTTTCTG TTCGGCGTTT CCGCACAGTT CATCAACGGC
301 ACGATTAATT CGTGGTTCGG CAACGATACC CACGAGGCGC TTGAACGCAG
351 CCTCAATTTG AGCAAGTCCG CATTGAATTT GGCGGCAGAC AACGCCCTCG
401 GCAACGCCGT CCCCGTGCAG ATAGACCTCA TCGGCGCGGC TTCCCTGCCC
451 GGGGATATGG GCAGGGTGCT GGAACATTAC GCCGGCAGCG GTTTTGCCCA
501 GCTTGCCCTG TACAATGCCG CAAGCGGCAA AATCGAAAAA AGCATCAACC
551 CGCACAAGCT CGATCAGCCG TTTCCAGGTA AGGCGCGTTG GGAAAAAATC
601 CAACGGGCGG GTTCGGTCAG GGATTTGGAA AGCATAGGCG GCGTATTGTA
651 CGCGCAGGGC TGGCTGTCGG CGGGTACGCA CAACGGGCGC GATTACGCCT
701 TGTTTTTCCG TCAGCCGGTT CCCAAAGGCG TGGCAGAGGA TGCCGTCTTA
751 ATCGAAAAGG CAAGGGCGAA ATATGCTGAG TTGAGTTACA GCAAAAAAGG
801 TTTGCAGACC TTTTTCCTGG CAACCCTGCT GATTGCCTCG CTGCTGTCGA
851 TTTTTCTTGC ACTGGTCATG GCACTGTATT TCGCCCGCCG TTTCGTCGAA
901 CCCGTCCTAT CGCTTGCCGA GGGGGCGAAG GCGGTGGCGC AAGGCGATTT
951 CAGCCAGACG CGCCCCGTGT TGCGCAACGA CGAGTTCGGA CGCTTGACCA
1001 AGTTGTTCAA CCACATGACC GAGCAGCTTT CCATCGCCAA AGAAGCAGAC
1051 GAGCGCAACC GCCGGCGCGA GGAAGCCGCC AGGCATTATC TTGAATGCGT
1101 GTTGGAGGGG CTGACCACGG GCGTGGTGGT GTTTGACGAA CAAGGCTGTC
1151 TGAAAACCTT CAACAAAGCG GCGGAACAGA TTTTGGGGAT GCCGCTTACC
1201 CCCCTGTGGG GCAGCAGCCG GCACGGTTGG CACGGCGTTT CGGCGCAGCA
1251 GTCCCTGCTT GCCGAAGTGT TTGCCGCCAT CGGCGCGGCG GCAGGTACGG
1301 ACAAACCGGT CCATGTGAAA TATGCCGCGC CGGACGATGC CAAAATCCTG
1351 CTGGGCAAGG CAACCGTCCT GCCCGAAGAC AACGGCAACG GCGTGGTAAT
1401 GGTGATTGAC GACATCACCG TTTTGATACA CGCGCAAAAA GAAGCCGCGT
1451 GGGGCGAAGT GGCGAAGCGG CTGGCACACG AAATCCGCAA TCCGCTCACG
1501 CCCATCCAGC TTTCCGCCGA ACGGCTGGCG TGGAAATTGG GCGGGAAGCT
1551 GGATGAGCAG GATGCGCAAA TCCTGACGCG TTCGACCGAC ACCATCGTCA
1601 AACAGGTGGC GGCATTGAAG GAAATGGTCG AAGCATTCCG CAATTATGCG
1651 CGTTCCCCTT CGCTCAAATT GGAAAATCAG GATTTGAACG CCTTAATCGG
1701 CGATGTGTTG GCATTGTATG AAGCCGGTCC GTGCCGGTTT GCGGCGGAGC
1751 TTGCCGGCGA ACCGCTGACG GTGGCGGCGG ATACGACCGC CATGCGGCAG
1801 GTGCTGCACA ATATTTTCAA AAATGCCGCC GAAGCGGCGG AAGAAGCCGA
1851 TGTGCCCGAA GTCAGGGTAA AATCGGAAAC AGGGCAGGAC GGTCGGATTG
1901 TCCTGACGGT TTGCGACAAC GGCAAAGGGT TCGGCAGGGA AATGCTGCAC
1951 AACGCCTTCG AGCCGTATGT AACGGACAAA CCGGCGGGAA CGGGATTGGG
2001 TCTGCCTGTG GTGAAAAAAA TCATTGAAGA ACACGGCGGC CGCATCAGCC
2051 TGAGCAATCA GGATGCGGGT GGCGCGTGTG TCAGAATCAT CTTGCCAAAA
2101 ACGGTAAAAA CTTATGCGTA G
This corresponds to the amino acid sequence <SEQ ID 252; ORF64-1>:
1 MRRFLPIAAI CAVVLLYGLT AATGSTSSLA DYFWWIVAFS AMLLLVLSAV
51 LARYVILLLK DRRDGVFGSQ IAKRLSGMFT LVAVLPGVFL FGVSAQFING
101 TINSWFGNDT HEALERSLNL SKSALNLAAD NALGNAVPVQ IDLIGAASLP
151 GDMGRVLEHY AGSGFAQLAL YNAASGKIEK SINPHKLDQP FPGKARWEKI
201 QRAGSVRDLE SIGGVLYAQG WLSAGTHNGR DYALFFRQPV PKGVAEDAVL
251 IEKARAKYAE LSYSKKGLQT FFLATLLIAS LLSIFLALVM ALYFARRFVE
301 PVLSLAEGAK AVAQGDFSQT RPVLRNDEFG RLTKLFNHMT EQLSIAKEAD
351 ERNRRREEAA RHYLECVLEG LTTGVVVFDE QGCLKTFNKA AEQILGMPLT
401 PLWGSSRHGW HGVSAQQSLL AEVFAAIGAA AGTDKPVHVK YAAPDDAKIL
451 LGKATVLPED NGNGVVMVID DITVLIHAQK EAAWGEVAKR LAHEIRNPLT
501 PIQLSAERLA WKLGGKLDEQ DAQILTRSTD TIVKQVAALK EMVEAFRNYA
551 RSPSLKLENQ DLNALIGDVL ALYEAGPCRF AAELAGEPLT VAADTTAMRQ
601 VLHNIFKNAA EAAEEADVPE VRVKSETGQD GRIVLTVCDN GKGFGREMLH
651 NAFEPYVTDK PAGTGLGLPV VKKIIEEHGG RISLSNQDAG GACVRIILPK
701 TVKTYA*
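Computer analysis of this amino acid sequence gave the following results:
Homology with a predicted ORF from N. meningitidis (strain A)
ORF64 shows 92.6% identity over a 392 aa overlap with an ORF (ORF64a) from strain A of N. meningitidis: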
10 20 30 40 50 60
orf64.pep MRRFLPIAAICAXXLXXGLTAATGSTSSLADYFWWIVAFSAMLLLVLSAVLARYVILLLK
          ||||||||||||  |  |||||||||||||||||||||||||||||||||||||||||||
orf64a    MRRFLPIAAICAVVLLYGLTAATGSTSSLADYFWWIVAFSAMLLLVLSAVLARYVILLLK
10 20 30 40 50 60
70 80 90 100 110 120
orf64.pep DRRDGVFGSXXAKXPXXXMFTLVAXLPGVFLFGFPAQFINGTINSWFGNDTHEALERSLN
          |||||||||  ||     |||||| ||||||||  |||||||||||||||||||||||||
orf64a    DRRDGVFGSQIAKR-LSGMFTLVAVLPGVFLFGVSAQFINGTINSWFGNDTHEALERSLN
70 80 90 100 110
130 140 150 160 170 180
orf64.pep LSKSALNLAADNALGNAVPVQIDLIGAASLPGDMGRVLEHYAGSGFAQLALYNXASGKIE
|||||||||||||||||:||||| ||||||| ||||||||||||||||||||| ||||||
orf64a LSKSALNLAADNALGNAIPVQIDXIGAASLPXDMGRVLEHYAGSGFAQLALYNAASGKIE
120 130 140 150 160 170
190 200 210 220 230 240
orf64.pep KSINPHKLDQPFPGKARWEKIQRAGSVRDLESIGGVLYAQGWLSAGTHXGRDYALFFRQP
||||||||||||||||||||||:|||||| ||||||||| ||||| || |||||||||||
orf64a KSINPHKLDQPFPGKARWEKIQQAGSVRDXESIGGVLYAXGWLSAXTHNGRDYALFFRQP
180 190 200 210 220 230
250 260 270 280 290 300
orf64.pep VPKGVAEDAVLIEKARAKYAELSYSKKGLQTFFLATLLIASLLSIFLALVMALYFARRFV
||||||||||||||||| |||||||||||||||||||||||||||||||||||||||
orf64a VPKGVAEDAVLIEKARAXXXXLSYSKKGLQTFFLATLLIASLLSIFLALVMALYFARRFV
240 250 260 270 280 290
310 320 330 340 350 360
orf64.pep EPVLSLAEGAKAVAQGDFSQTRPVLRNDEFGRLTXLFNHMTEQLSIAKDADERNRRREEA
|||||||||||||||||||||||||||||||||| |||||||||||||:|||||||||||
orf64a EPVLSLAEGAKAVAQGDFSQTRPVLRNDEFGRLTKLFNHMTEQLSIAKEADERNRRREEA
300 310 320 330 340 350
370 380 390
orf64.pep ARHYLECVLEGLTTGVVVFDEQGCLKTFNKAAGT
||||||||||||||||||||||||||||||||
orf64a ARHYLECVLEGLTTGVVVFDEQGCLKTFNKAAEQILGMPLTPLWGSSRHGWHGVSAQQSL
360 370 380 390 400 410
orf64a LAEVFAAIGAAAGTDKPVHVKYAAPDDAKILLGKATVLPEDNXNGVVMVIDDITVLIHAQ
420 430 440 450 460 470
The full-length ORF64a nucleotide sequence <SEQ ID 253> is:
1 ATGCGCCGTT TTCTACCGAT CGCAGCCATA TGCGCCGTCG TCCTGTTGTA
51 CGGACTGACG GCGGCAACCG GCAGCACCAG TTCGCTGGCG GATTATTTCT
101 GGTGGATTGT TGCGTTCAGC GCAATGCTGC TGCTGGTGTT GTCCGCCGTT
151 TTGGCACGTT ATGTCATATT GCTGTTGAAA GACAGGCGCG ACGGCGTATT
201 CGGTTCGCAG ATTGCCAAAC GCCTTTCCGG GATGTTTACG CTGGTTGCCG
251 TACTGCCCGG CGTGTTTCTG TTCGGCGTTT CCGCACAGTT TATCAACGGC
301 ACGATTAATT CGTGGTTCGG CAACGATACC CACGAGGCGC TTGAACGCAG
351 CCTCAATTTG AGCAAGTCCG CATTGAATCT GGCGGCAGAC AACGCCCTTG
401 GCAACGCCAT CCCCGTGCAG ATAGACNTCA TCGGCGCGGC TTCCCTGCCC
451 NGGGATATGG GCAGGGTGCT GGAACATTAC GCCGGCAGCG GTTTTGCCCA
501 GCTTGCCCTG TACAATGCCG CAAGCGGCAA AATCGAAAAA AGCATCAACC
551 CGCACAAGCT CGATCAGCCG TTTCCAGGTA AGGCGCGTTG GGAAAAAATC
601 CAACAGGCGG GTTCGGTCAG GGATNNGGAA AGCATAGGCG GCGTATTGTA
651 CGCGCANGGC TGGCTGTCGG CAGNNACGCA CAACGGGCGC GATTACGCCT
701 TGTTTTTCCG TCAGCCGGTT CCCAAAGGCG TGGCAGAGGA TGCCGTCTTA
751 ATCGAAAAGG CAAGGGCGNA ANANNNTNAG TTGAGTTACA GCAAAAAAGG
801 TTTGCAGACC TTTTTCCTNG CAACCCTGCT GATTGCCTCN CTGCTGTCGA
851 TTTTTCTTGC ACTGGTCATG GCACTGTATT TCGCCCGCCG TTTCGTCGAA
901 CCCGTCCTAT CGCTTGCCGA GGGGGCGAAG GCGGTGGCGC AAGGCGATTT
951 CAGCCAGACG CGCCCCGTGT TGCGCAACGA CGAGTTCGGA CGCTTGACCA
1001 AGTTGTTCAA CCACATGACC GAGCAGCTTT CCATCGCCAA AGAAGCAGAC
1051 GAGCGCAACC GCCGGCGCGA GGAAGCCGCC AGACATTATC TCGAATGCGT
1101 GTTGGAGGGG CTGACCACGG GCGTGGTGGT GTTTGACGAA CAAGGCTGTC
1151 TGAAAACCTT CAACAAAGCG GCGGAACAGA TTTTGGGGAT GCCGCTTACC
1201 CCCCTGTGGG GCAGCAGCCG GCACGGTTGG CACGGCGTTT CGGCGCAGCA
1251 GTCCCTGCTT GCCGAAGTGT TTGCCGCCAT CGGCGCGGCG GCAGGTACGG
1301 ACAAACCGGT CCATGTGAAA TATGCCGCGC CGGACGATGC CAAAATCCTG
1351 CTGGGCAAGG CAACCGTCCT GCCCGAAGAC AACNGCAACG GCGTGGTAAT
1401 GGTGATTGAC GACATCACCG TTTTGATACA CGCGCAAAAA GAAGCCGCGT
1451 GGGGCGAAGT GGCAAAACGG CTGGCACACG AAATCCGCAA TCCGCTCACG
1501 CCCATCCAGC TTTCTGCCGA ACGGCTGGCG TGGAAATTGG GCGGGAAGCT
1551 GGACGAGCAN GACGCGCAAA TCCTGACACG TTCGACCGAC ACCATCATCA
1601 AACAAGTGGC GGCATTAAAA GAAATGGTCG AGGCATTCCG CAATTACNCG
1651 CGTTCCCCTT CGNCTCAATT GGAAAATCAG GATTTGAACG CCTTAATCGG
1701 CGATGTGTTG GCATTGTACG AAGCTGGTCC GTGCCGGTTT GCGGCGGAAC
1751 TTGCCGGCGA ACCGCTGATG ATGGCGGCGG ATACGACCGC CATGCGGCAG
1801 GTGCTGCACA ATATTTTCAA AAATGCCGCC GAAGCGGCGG AAGAAGCCGA
1851 TGTGCCCGAA GTCAGGGTAA AATCGGAAGC GGGGCAGGAC GGACGGATTG
1901 TCCTGACAGT TTGCGACAAC GGCAAGGGGT TCGGCAGGGA AATGCTGCAC
1951 AATGCCTTCG AGCCGTATGT AACGGACAAA CCGGCTGGAA CGGGATTGNG
2001 ACTGCCCGTG GTGAAAAAAA TCATTGAAGA ACACGGCGGC CNCATCAGCC
2051 TGAGCAATCA GGATGCGGGC GGCGCGTNTG TCAGAATCAT CTTGCCAAAA
2101 ACGGTAGAAA CTTATGCGTA G
This encodes a protein having amino acid sequence <SEQ ID 254>:
1 MRRFLPIAAI CAVVLLYGLT AATGSTSSLA DYFWWIVAFS AMLLLVLSAV
51 LARYVILLLK DRRDGVFGSQ IAKRLSGMFT LVAVLPGVFL FGVSAQFING
101 TINSWFGNDT HEALERSLNL SKSALNLAAD NALGNAIPVQ IDXIGAASLP
151 XDMGRVLEHY AGSGFAQLAL YNAASGKIEK SINPHKLDQP FPGKARWEKI
201 QQAGSVRDXE SIGGVLYAXG WLSAXTHNGR DYALFFRQPV PKGVAEDAVL
251 IEKARAXXXX LSYSKKGLQT FFLATLLIAS LLSIFLALVM ALYFARRFVE
301 PVLSLAEGAK AVAQGDFSQT RPVLRNDEFG RLTKLFNHMT EQLSIAKEAD
351 ERNRRREEAA RHYLECVLEG LTTGVVVFDE QGCLKTFNKA AEQILGMPLT
401 PLWGSSRHGW HGVSAQQSLL AEVFAAIGAA AGTDKPVHVK YAAPDDAKIL
451 LGKATVLPED NXNGVVMVID DITVLIHAQK EAAWGEVAKR LAHEIRNPLT
501 PIQLSAERLA WKLGGKLDEX DAQILTRSTD TIIKQVAALK EMVEAFRNYX
551 RSPSXQLENQ DLNALIGDVL ALYEAGPCRF AAELAGEPLM MAADTTAMRQ
601 VLHNIFKNAA EAAEEADVPE VRVKSEAGQD GRIVLTVCDN GKGFGREMLH
651 NAFEPYVTDK PAGTGLXLPV VKKIIEEHGG XISLSNQDAG GAXVRIILPK
701 TVETYA*
ORF64a and ORF64-1 show 96.6% identity over a 706 aa overlap:
10 20 30 40 50 60
orf64a.pep MRRFLPIAAICAVVLLYGLTAATGSTSSLADYFWWIVAFSAMLLLVLSAVLARYVILLLK
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf64-1 MRRFLPIAAICAVVLLYGLTAATGSTSSLADYFWWIVAFSAMLLLVLSAVLARYVILLLK
10 20 30 40 50 60
70 80 90 100 110 120
orf64a.pep DRRDGVFGSQIAKRLSGMFTLVAVLPGVFLFGVSAQFINGTINSWFGNDTHEALERSLNL
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf64-1 DRRDGVFGSQIAKRLSGMFTLVAVLPGVFLFGVSAQFINGTINSWFGNDTHEALERSLNL
70 80 90 100 110 120
130 140 150 160 170 180
orf64a.pep SKSALNLAADNALGNAIPVQIDXIGAASLPXDMGRVLEHYAGSGFAQLALYNAASGKIEK
||||||||||||||||:||||| ||||||| |||||||||||||||||||||||||||||
orf64-1 SKSALNLAADNALGNAVPVQIDLIGAASLPGDMGRVLEHYAGSGFAQLALYNAASGKIEK
130 140 150 160 170 180
190 200 210 220 230 240
orf64a.pep SINPHKLDQPFPGKARWEKIQQAGSVRDXESIGGVLYAXGWLSAXTHNGRDYALFFRQPV
|||||||||||||||||||||:|||||| ||||||||| ||||| |||||||||||||||
orf64-1 SINPHKLDQPFPGKARWEKIQRAGSVRDLESIGGVLYAQGWLSAGTHNGRDYALFFRQPV
190 200 210 220 230 240
250 260 270 280 290 300
orf64a.pep PKGVAEDAVLIEKARAXXXXLSYSKKGLQTFFLATLLIASLLSIFLALVMALYFARRFVE
|||||||||||||||| ||||||||||||||||||||||||||||||||||||||||
orf64-1 PKGVAEDAVLIEKARAKYAELSYSKKGLQTFFLATLLIASLLSIFLALVMALYFARRFVE
250 260 270 280 290 300
310 320 330 340 350 360
orf64a.pep PVLSLAEGAKAVAQGDFSQTRPVLRNDEFGRLTKLFNHMTEQLSIAKEADERNRRREEAA
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf64-1 PVLSLAEGAKAVAQGDFSQTRPVLRNDEFGRLTKLFNHMTEQLSIAKEADERNRRREEAA
310 320 330 340 350 360
370 380 390 400 410 420
orf64a.pep RHYLECVLEGLTTGVVVFDEQGCLKTFNKAAEQILGMPLTPLWGSSRHGWHGVSAQQSLL
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf64-1 RHYLECVLEGLTTGVVVFDEQGCLKTFNKAAEQILGMPLTPLWGSSRHGWHGVSAQQSLL
370 380 390 400 410 420
430 440 450 460 470 480
orf64a.pep AEVFAAIGAAAGTDKPVHVKYAAPDDAKILLGKATVLPEDNXNGVVMVIDDITVLIHAQK
||||||||||||||||||||||||||||||||||||||||| ||||||||||||||||||
orf64-1 AEVFAAIGAAAGTDKPVHVKYAAPDDAKILLGKATVLPEDNGNGVVMVIDDITVLIHAQK
430 440 450 460 470 480
490 500 510 520 530 540
orf64a.pep EAAWGEVAKRLAHEIRNPLTPIQLSAERLAWKLGGKLDEXDAQILTRSTDTIIKQVAALK
||||||||||||||||||||||||||||||||||||||| ||||||||||||:|||||||
orf64-1 EAAWGEVAKRLAHEIRNPLTPIQLSAERLAWKLGGKLDEQDAQILTRSTDTIVKQVAALK
490 500 510 520 530 540
550 560 570 580 590 600
orf64a.pep EMVEAFRNYXRSPSXQLENQDLNALIGDVLALYEAGPCRFAAELAGEPLMMAADTTAMRQ
||||||||| |||| :||||||||||||||||||||||||||||||||| :|||||||||
orf64-1 EMVEAFRNYARSPSLKLENQDLNALIGDVLALYEAGPCRFAAELAGEPLTVAADTTAMRQ
550 560 570 580 590 600
610 620 630 640 650 660
orf64a.pep VLHNIFKNAAEAAEEADVPEVRVKSEAGQDGRIVLTVCDNGKGFGREMLHNAFEPYVTDK
||||||||||||||||||||||||||:|||||||||||||||||||||||||||||||||
orf64-1 VLHNIFKNAAEAAEEADVPEVRVKSETGQDGRIVLTVCDNGKGFGREMLHNAFEPYVTDK
610 620 630 640 650 660
670 680 690 700
orf64a.pep PAGTGLXLPVVKKIIEEHGGXISLSNQDAGGAXVRIILPKTVETYAX
|||||| ||||||||||||| ||||||||||| |||||||||:||||
orf64-1 PAGTGLGLPVVKKIIEEHGGRISLSNQDAGGACVRIILPKTVKTYAX
670 680 690 700
Homology with a predicted ORF from N. gonorrhoeae
ORF64 shows 86.6% identity over a 387 aa overlap with a predicted ORF (ORF64.ng) from N. gonorrhoeae:
orf64.pep MRRFLPIAAICAXXLXXGLTAATGSTSSLADYFWWIVAFSAMLLLVLSAVLARYVILLLK 60
|||||||||||| | ||||||||||||||||||||:||||||||||||||||||||||
orf64ng MRRFLPIAAICAVVLLYGLTAATGSTSSLADYFWWIVSFSAMLLLVLSAVLARYVILLLK 60
orf64.pep DRRDGVFGSXXAKXPXXXMFTLVAXLPGVFLFGFPAQFINGTINSWFGNDTHEALERSLN 120
|||:||||| || |||||| |||:||||: |||||||||||||||||||||||||
orf64ng DRRNGVFGSQIAKR-LSGMFTLVAVLPGLFLFGISAQFINGTINSWFGNDTHEALERSLN 119
orf64.pep LSKSALNLAADNALGNAVPVQIDLIGAASLPGDMGRVLEHYAGSGFAQLALYNXASGKIE 180
||||||:||||||::|||||||||||:||| |:|| ||||||||||||||||| ||||||
orf64ng LSKSALDLAADNAVSNAVPVQIDLIGTASLSGNMGSVLEHYAGSGFAQLALYNAASGKIE 179
orf64.pep KSINPHKLDQPFPGKARWEKIQRAGSVRDLESIGGVLYAQGWLSAGTHXGRDYALFFRQP 240
||||||::|||:| | :||:||::||||:||||||||||||||||||| |||||||||||
orf64ng KSINPHQFDQPLPDKEHWEQIQQTGSVRSLESIGGVLYAQGWLSAGTHNGRDYALFFRQP 239
orf64.pep VPKGVAEDAVLIEKARAKYAELSYSKKGLQTFFLATLLIASLLSIFLALVMALYFARRFV 300
:|::||:|||||||||||||||||||||||||||:|||||||||||||||||||||||||
orf64ng IPENVAQDAVLIEKARAKYAELSYSKKGLQTFFLVTLLIASLLSIFLALVMALYFARRFV 299
orf64.pep EPVLSLAEGAKAVAQGDFSQTRPVLRNDEFGRLTXLFNHMTEQLSIAKDADERNRRREEA 360
||:||||||||||||||||||||||||||||||| |||||||||||||:|||||||||||
orf64ng EPILSLAEGAKAVAQGDFSQTRPVLRNDEFGRLTKLFNHMTEQLSIAKEADERNRRREEA 359
orf64.pep ARHYLECVLEGLTTGVVVFDEQGCLKTFNKAAGT 394
|||||||||:|||||||| :| :|
orf64ng ARHYLECVLDGLTTGVVVSYPLSCCRTAVFSTCHSSPLSYF 400
The ORF64ng nucleotide sequence <SEQ ID 255> is predicted to encode a protein having amino acid sequence <SEQ ID 256>:
1 MRRFLPIAAI CAVVLLYGLT AATGSTSSLA DYFWWIVSFS AMLLLVLSAV
51 LARYVILLLK DRRNGVFGSQ IAKRLSGMFT LVAVLPGLFL FGISAQFING
101 TINSWFGNDT HEALERSLNL SKSALDLAAD NAVSNAVPVQ IDLIGTASLS
151 GNMGSVLEHY AGSGFAQLAL YNAASGKIEK SINPHQFDQP LPDKEHWEQI
201 QQTGSVRSLE SIGGVLYAQG WLSAGTHNGR DYALFFRQPI PENVAQDAVL
251 IEKARAKYAE LSYSKKGLQT FFLVTLLIAS LLSIFLALVM ALYFARRFVE
301 PILSLAEGAK AVAQGDFSQT RPVLRNDEFG RLTKLFNHMT EQLSIAKEAD
351 ERNRRREEAA RHYLECVLDG LTTGVVVSYP LSCCRTAVFS TCHSSPLSYF*
Further work revealed the complete gonococcal DNA sequence <SEQ ID 257>:
1 ATGCGCCGCT TCCTACCGAT CGCAGCCATA TGCGCCGTCG TCCTGCTGTA
51 CGGATTGACG GCGGCGACCG GCAGCACCAG TTCGCTGGCG GATTATTTCT
101 GGTGGATAGT CTCGTTCAGC GCAATGCTGC TGCTGGTGTT GTCCGCCGTT
151 TTGGCACGTT ATGTCATATT GCTGTTGAAA GACAGGCGCA ACGGCGTGTT
201 CGGTTCGCAG ATTGCCAAAC GCCTTTCCGG GATGTTCACG CTGGTCGCCG
251 TACTGCCCGG CTTGTTCCTG TTCGGCATTT CCGCGCAGTT TATCAACGGC
301 ACGATTAATT CGTGGTTCGG CAACGACACC CACGAAGCCC TCGAACGCAG
351 CCTTAATTTG AGCAAGTCCG CACTGGATTT GGCGGCAGAC AATGCCGTCA
401 GCAACGCCGT TCCCGTACAG ATAGACCTCA TCGGCACCGC CTCCCTGTCG
451 GGCAATATGG GCAGTGTGCT GGAACACTAC GCCGGCAGCG GTTTTGCCCA
501 GCTTGCCCTG TACAATGCCG CAAGCGGGAA AATCGAAAAA AGCATCAATC
551 CGCACCAATT CGACCAGCCG CTTCCCGACA AAGAACATTG GGAACAGATT
601 CAGCAGACCG GTTCGGTTCG GAGTTTGGAA AGCATAGGCG GCGTATTGTA
651 CGCGCAGGGA TGGTTGTCGG CAGGTACGCA CAACGGGCGC GATTACGCGC
701 TGTTCTTCCG CCAGCCGATT CCCGAAAATG TGGCACAGGA TGCCGTTCTG
751 ATTGAAAAGG CGCGGGCGAA ATATGCCGAA TTGAGTTACA GCAAAAAAGG
801 TTTGCAGACC TTTTTTCTGG TAACCCTGCT GATTGCCTCG CTGCTGTCGA
851 TTTTTCTTGC GCTGGTAATG GCACTGTATT TTGCCCGCCG TTTCGTCGAA
901 CCCATTCTGT CGCTTGCCGA GGGCGCAAAG GCGGTGGCGC AGGGTGATTT
951 CAGCCAGACG CGCCCCGTAT TGCGCAACGA CGAGTTCGGA CGTTTGACCA
1001 AGCTGTTCAA CCATATGACC GAGCAGCTTT CCATCGCCAA AGAAGCAGAC
1051 GAACGCAACC GCCGGCGCGA GGAAGCCGCC CGTCACTACC TCGAGTGCGT
1101 GTTGGATGGG TTGACTACCG GTGTGGTGGT GTTTGACGAA AAAGGCCGTT
1151 TGAAAACCTT CAACAAGGCG GCGGAACAGA TTTTGGGGAT GCCGCTCGCC
1201 CCCCTGTGGG GCAGCAGCCG GCACGGTTGG CACGGCGTTT CGGCGCAGCA
1251 GTCCCTGCTT GCCGAAGTGT TtgccgccAT CGGTGCGGCG GCAGGTACGG
1301 ACAAACCGGT CCAGGTGGAA TATGCCGCGC CGGACGATGC CAAAATCCTG
1351 CTGGGCAAGG CGACGGTATT GCCCGAAGAC AACGGCAACG GCGTGGTGAT
1401 GGTGATTGAC GACATCACCG TGCTGATACG CGCGCAAAAA GAAGCCGCGT
1451 GGGGTGAAGT GGCGAAGCGG CTGGCACACG AAATCCGCAA TCCGCTCACG
1501 CCCATCCAGC TTTCCGCCGA ACGGCTGGCG TGGAAATTGG GCGGGAAGCT
1551 GGACGATCAG GACGCGCAAA TCCTGACGCG TtcgACCGAC ACCATCATCA
1601 AACAGgtggc gGCGTTAAAA GAAATGGTCG AGGCATTCCG CAATTACGCG
1651 CGCGCCCCTT CGCTCAAACT GGAAAATCAG GATTTGAACG CCTTAATCGG
1701 CGATGTTTTG GCCCTGTACG AAGCCGGCCC GTGCCGGTTT GAGGCGGAAC
1751 TTGCCGGCGA ACCGCTGATG ATGGCGGCGG ATACGACCGC CATGCGGCAG
1801 GTGCTGCACA ATATTTTCAA AAATGCCGCC GAAGCGGCGG AAGAAGCCGA
1851 TATGCCCGAA GTCAGGGTAA AATCGGAAAC GGGGCAGGAC GGACGGATTG
1901 TCCTGACGGT TTGCGACAAC GGCAAGGGAT TCGGCAAGGA AATGCTGCAC
1951 AATGCTTTCG AGCCGTATGT GACGGATAAG CCGGCGGGAA CGGGACTGGG
2001 TCTGCCTGTA GTGAAAAAAA TCATTGGAGA ACACGGCGGC CGCATCAGCC
2051 TGAGCAATCA GGATGCGGGT GGGGCGTGTG TCAGAATCAT CTTGCCAAAA
2101 ACGGTAGAAA CTTATGCGTA G
This corresponds to the amino acid sequence <SEQ ID 258; ORF64ng-1>:
1 MRRFLPIAAI CAVVLLYGLT AATGSTSSLA DYFWWIVSFS AMLLLVLSAV
51 LARYVILLLK DRRNGVFGSQ IAKRLSGMFT LVAVLPGLFL FGISAQFING
101 TINSWFGNDT HEALERSLNL SKSALDLAAD NAVSNAVPVQ IDLIGTASLS
151 GNMGSVLEHY AGSGFAQLAL YNAASGKIEK SINPHQFDQP LPDKEHWEQI
201 QQTGSVRSLE SIGGVLYAQG WLSAGTHNGR DYALFFRQPI PENVAQDAVL
251 IEKARAKYAE LSYSKKGLQT FFLVTLLIAS LLSIFLALVM ALYFARRFVE
301 PILSLAEGAK AVAQGDFSQT RPVLRNDEFG RLTKLFNHMT EQLSIAKEAD
351 ERNRRREEAA RHYLECVLDG LTTGVVVFDE KGRLKTFNKA AEQILGMPLA
401 PLWGSSRHGW HGVSAQQSLL AEVFAAIGAA AGTDKPVQVE YAAPDDAKIL
451 LGKATVLPED NGNGVVMVID DITVLIRAQK EAAWGEVAKR LAHEIRNPLT
501 PIQLSAERLA WKLGGKLDDQ DAQILTRSTD TIIKQVAALK EMVEAFRNYA
551 RAPSLKLENQ DLNALIGDVL ALYEAGPCRF EAELAGEPLM MAADTTAMRQ
601 VLHNIFKNAA EAAEEADMPE VRVKSETGQD GRIVLTVCDN GKGFGKEMLH
651 NAFEPYVTDK PAGTGLGLPV VKKIIGEHGG RISLSNQDAG GACVRIILPK
701 TVETYA*
ORF64ng-1 and ORF64-1 show 93.8% identity over a 706 aa overlap:
10 20 30 40 50 60
orf64ng-1.pep MRRFLPIAAICAVVLLYGLTAATGSTSSLADYFWWIVSFSAMLLLVLSAVLARYVILLLK
|||||||||||||||||||||||||||||||||||||:||||||||||||||||||||||
orf64-1 MRRFLPIAAICAVVLLYGLTAATGSTSSLADYFWWIVAFSAMLLLVLSAVLARYVILLLK
10 20 30 40 50 60
70 80 90 100 110 120
orf64ng-1.pep DRRNGVFGSQIAKRLSGMFTLVAVLPGLFLFGISAQFINGTINSWFGNDTHEALERSLNL
|||:|||||||||||||||||||||||:||||:|||||||||||||||||||||||||||
orf64-1 DRRDGVFGSQIAKRLSGMFTLVAVLPGVFLFGVSAQFINGTINSWFGNDTHEALERSLNL
70 80 90 100 110 120
130 140 150 160 170 180
orf64ng-1.pep SKSALDLAADNAVSNAVPVQIDLIGTASLSGNMGSVLEHYAGSGFAQLALYNAASGKIEK
|||||:||||||::|||||||||||:||| |:|| |||||||||||||||||||||||||
orf64-1 SKSALNLAADNALGNAVPVQIDLIGAASLPGDMGRVLEHYAGSGFAQLALYNAASGKIEK
130 140 150 160 170 180
190 200 210 220 230 240
orf64ng-1.pep SINPHQFDQPLPDKEHWEQIQQTGSVRSLESIGGVLYAQGWLSAGTHNGRDYALFFRQPI
|||||::|||:| | :||:||::||||:|||||||||||||||||||||||||||||||:
orf64-1 SINPHKLDQPFPGKARWEKIQRAGSVRDLESIGGVLYAQGWLSAGTHNGRDYALFFRQPV
190 200 210 220 230 240
250 260 270 280 290 300
orf64ng-1.pep PENVAQDAVLIEKARAKYAELSYSKKGLQTFFLVTLLIASLLSIFLALVMALYFARRFVE
|::||:|||||||||||||||||||||||||||:||||||||||||||||||||||||||
orf64-1 PKGVAEDAVLIEKARAKYAELSYSKKGLQTFFLATLLIASLLSIFLALVMALYFARRFVE
250 260 270 280 290 300
310 320 330 340 350 360
orf64ng-1.pep PILSLAEGAKAVAQGDFSQTRPVLRNDEFGRLTKLFNHMTEQLSIAKEADERNRRREEAA
|:||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf64-1 PVLSLAEGAKAVAQGDFSQTRPVLRNDEFGRLTKLFNHMTEQLSIAKEADERNRRREEAA
310 320 330 340 350 360
370 380 390 400 410 420
orf64ng-1.pep RHYLECVLDGLTTGVVVFDEKGRLKTFNKAAEQILGMPLAPLWGSSRHGWHGVSAQQSLL
||||||||:|||||||||||:| ||||||||||||||||:||||||||||||||||||||
orf64-1 RHYLECVLEGLTTGVVVFDEQGCLKTFNKAAEQILGMPLTPLWGSSRHGWHGVSAQQSLL
370 380 390 400 410 420
430 440 450 460 470 480
orf64ng-1.pep AEVFAAIGAAAGTDKPVQVEYAAPDDAKILLGKATVLPEDNGNGVVMVIDDITVLIRAQK
|||||||||||||||||:|:||||||||||||||||||||||||||||||||||||:|||
orf64-1 AEVFAAIGAAAGTDKPVHVKYAAPDDAKILLGKATVLPEDNGNGVVMVIDDITVLIHAQK
430 440 450 460 470 480
490 500 510 520 530 540
orf64ng-1.pep EAAWGEVAKRLAHEIRNPLTPIQLSAERLAWKLGGKLDDQDAQILTRSTDTIIKQVAALK
||||||||||||||||||||||||||||||||||||||:|||||||||||||:|||||||
orf64-1 EAAWGEVAKRLAHEIRNPLTPIQLSAERLAWKLGGKLDEQDAQILTRSTDTIVKQVAALK
490 500 510 520 530 540
550 560 570 580 590 600
orf64ng-1.pep EMVEAFRNYARAPSLKLENQDLNALIGDVLALYEAGPCRFEAELAGEPLMMAADTTAMRQ
|||||||||||:|||||||||||||||||||||||||||| |||||||| :|||||||||
orf64-1 EMVEAFRNYARSPSLKLENQDLNALIGDVLALYEAGPCRFAAELAGEPLTVAADTTAMRQ
550 560 570 580 590 600
610 620 630 640 650 660
orf64ng-1.pep VLHNIFKNAAEAAEEADMPEVRVKSETGQDGRIVLTVCDNGKGFGKEMLHNAFEPYVTDK
|||||||||||||||||:|||||||||||||||||||||||||||:||||||||||||||
orf64-1 VLHNIFKNAAEAAEEADVPEVRVKSETGQDGRIVLTVCDNGKGFGREMLHNAFEPYVTDK
610 620 630 640 650 660
670 680 690 700
orf64ng-1.pep PAGTGLGLPVVKKIIGEHGGRISLSNQDAGGACVRIILPKTVETYAX
||||||||||||||| ||||||||||||||||||||||||||:||||
orf64-1 PAGTGLGLPVVKKIIEEHGGRISLSNQDAGGACVRIILPKTVKTYAX
670 680 690 700
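In addition, ORF64ng-1 shows significant homology with a protein from Azorhizobium caulinodans:
sp|Q04850|NTRY_AZOCA nitrogen regulation protein NTRY >gi|77479|pir||S18624 ntrY protein - Azorhizobium caulinodans >gi|38737 (X63841) NtrY gene product [Azorhizobium caulinodans] Length = 771
Score = 218 bits (550), Expect = 7e-56
Identities = 195/720 (27%), Positives = 320/720 (44%), Gaps = 58/720 (8%)
Query: 7 IAAICAVVLLYGLTAATGSTSSLADYFWWIXXXXXXXXXXXXXXXXRYVILLLKDRRNGV 66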
I+A+ ++L GLT + + + R + + K R G
Sbjct: 35 ISALATFLILMGLTPVVPTHQVVIS----VLLVNAAAVLILSAMVGREIWRIAKARARGR 90
Query: 67 FGSQIAKRLSGMFTLVAVLPGLFLFGISAQFINGTINSWFGNDTHEALERSLNLSKSALD 126
+++ R+ G+F +V+V+P + + +++ ++ ++ WF T E + S++++++ +
Sbjct: 91 AAARLHIRIVGLFAVVSVVPAILVAVVASLTLDRGLDRWFSMRTQEIVASSVSVAQTYVR 150
Query: 127 LAADNAVSNAVPVQIDLIGTASLSGNMGSVLEHYAG--SGFAQLALYNAASGKIEKSINP 184
A N + + + DL S+ Y G S F Q+ AA + ++
Sbjct: 151 EHALNIRGDILAMSADLTRLKSV----------YEGDRSRFNQILTAQAALRNLPGAMLI 200
Query: 185 HQFDQPLPDKEHWEQIQQTGSVRSLESIGGVLYAQGWLSAGTHNGRDYA----------- 233
+ D + ++ + I + V + +IG Q + N DY
Sbjct: 201 RR-DLSVVERAN-VNIGREFIVPANLAIGDATPDQPVIYLP--NDADYVAAVVPLKDYDD 256
Query: 234 --LFFRQPIPENVAQDAVLIEKARAKYAELSYSKKGLQTFFLVTXXXXXXXXXXXXXVMA 291
L+ + I V ++ A Y L + G+Q F + +
Sbjct: 257 LYLYVARLIDPRVIGYLKTTQETLADYRSLEERRFGVQVAFALMYAVITLIVLLSAVWLG 316
Query: 292 LYFARRFVEPILSLAEGAKAVAQGDFSQTRPVLRND-EFGRLTKLFNHMTEQLSIXXXXX 350
L F++ V PI L A VA+G+ P+ R + + L + FN MT +L
Sbjct: 317 LNFSKWLVAPIRRLMSAADHVAEGNLDVRVPIYRAEGDLASLAETFNKMTHELRSQREAI 376
Query: 351 XXXXXXXXXXXHYLECVLDGLTTGVVVFDEKGRLKTFNKAAEQILGMPLAPLWGSSRHGW 410
+E VL G+ GV+ D + R+ N++AE++LG L+ + RH
Sbjct: 377 LTARDQIDSRRRFTEAVLSGVGAGVIGLDSQERITILNRSAERLLG--LSEVEALHRHLA 434
Query: 411 HGVSAQQSLLAEVFXXXXXXXXTDKPVQVEYAAPDDAKILLGKATVLPEDNG---NGVVM 467
V LL E + VQ D + + V E + +G V+
Sbjct: 435 EVVPETAGLLEEA------EHARQRSVQGNITLTRDGRERVFAVRVTTEQSPEAEHGWVV 488
Query: 468 VIDDITVLIRAQKEAAWGEVAKRLAHEIRNPLTPIQLSAERLAWKLGGKLDDQDAQILTR 527
+DDIT LI AQ+ +AW +VA+R+AHEI+NPLTPIQLSAERL K G + QD +I +
Sbjct: 489 TLDDITELISAQRTSAWADVARRIAHEIKNPLTPIQLSAERLKRKFGRHV-TQDREIFDQ 547
Query: 528 STDTIIKQVAALKEMVEAFRNYARAPSLKLENQDLNALIGDVLALYEAGPCRFEAELAGE 587
TDTII+QV + MV+ F ++AR P +++QD++ +I +L G +
Sbjct: 548 CTDTIIRQVGDIGRMVDEFSSFARMPKPVVDSQDMSEIIRQTVFLMRVGHPEVVFDSEVP 607
Query: 588 PLMMAA-DTTAMRQVLHNIFKNXXXXXXXXDMPEVRVK-------SETGQDGRIVLTVCD 639
P M A D + Q L NI KN P+VR + + G+D +V+ + D
Sbjct: 608 PAMPARFDRRLVSQALTNILKNAAEAIEAVP-PDVRGQGRIRVSANRVGED--LVIDIID 664
Query: 640 NGKGFGKEMLHNAFEPYVTDKPAGTGLGLPVVKKIIGEHGGRISLSNQDAG-GACVRIIL 698
NG G +E + EPYVT + GTGLGL +V KI+ EHGG I L++ G GA +R+ L
Sbjct: 665 NGTGLPQESRNRLLEPYVTTREKGTGLGLAIVGKIMEEHGGGIELNDAPEGRGAWIRLTL 724
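Based on this analysis, including the presence in the gonococcal protein of a putative leader sequence (double-underlined) and several putative transmembrane domains (single-underlined), it is predicted that this protein from N. meningitidis and N. gonorrhoeae, and its epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies.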
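The source does not state how the putative transmembrane domains were predicted. As an illustration only, the sketch below applies a common heuristic that is assumed here rather than taken from the patent: a Kyte-Doolittle hydropathy profile with a 19-residue window, flagging windows whose mean score is at least 1.6. The function names and the threshold are illustrative choices. The test segment is residues 265-310 of ORF64ng-1 (SEQ ID 258).

```python
# Kyte-Doolittle hydropathy scale (assumed method; not named in the source).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

def hydropathy_profile(seq, window=19):
    """Mean hydropathy of every full window along the sequence."""
    scores = [KD[aa] for aa in seq]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]

def hydrophobic_windows(seq, window=19, threshold=1.6):
    """1-based start positions of windows at or above the threshold."""
    profile = hydropathy_profile(seq, window)
    return [i + 1 for i, value in enumerate(profile) if value >= threshold]

# Residues 265-310 of ORF64ng-1, copied from SEQ ID 258.
segment = "KKGLQTFFLVTLLIASLLSIFLALVMALYFARRFVEPILSLAEGAK"
# A charged stretch from the same protein (residues 351-369) for contrast.
charged = "DERNRRREEAARHYLECVL"

print(max(hydropathy_profile(segment)))
print(hydrophobic_windows(segment))
print(hydrophobic_windows(charged))
```

Under these assumptions the first segment contains windows well above the threshold, as expected for a membrane-spanning stretch, while the charged stretch yields none.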
Example 31
The following partial DNA sequence was identified in N. meningitidis <SEQ ID 259>:
1 ATGTACGCAT TTACCGCCGC ACAGCAACAG AAGGCACTCT TCCGGCTGGT
51 GCTTTTTCAT ATCCTCATCA TCGCCGCCAG CAACTATCTG GTGCAGTTCC
101 CTTTCCAAAT TTTCGGCATC CACACCACTT GGGGCGCATT TTCCTTTCCC
151 TTCATCTTCC TTGCCACCGA CCTGACCGTC CGCATTTTCG GTTCTCACTT
201 GGCACGGCGG ATTATCTTTT GGGTGATGTT CCCCGCCCTT TTGCTTTCCT
251 ACGTCTTTTC CGTTTTGTTC CACAACGGCA GTTGGACAGG CTTGGGCGCG
301 CTGTCCGAAT TCAACACCTT TGTCGGACGC ATCGCCTTAG CCAGCTTTGC
351 CGCCTACGCG ATCGGACAAA TCCTTGATAT TTTTGTATTC AACAAATTAC
401 GCCGTCTGAA AGCGTGGTGG ATTGCACCGA ACGCATCAAC CGTCATCGGG
451 CACGCGTTGG ATACG...
This corresponds to the amino acid sequence <SEQ ID 260; ORF66>:
1 MYAFTAAQQQ KALFRLVLFH ILIIAASNYL VQFPFQIFGI HTTWGAFSFP
51 FIFLATDLTV RIFGSHLARR IIFWVMFPAL LLSYVFSVLF HNGSWTGLGA
101 LSEFNTFVGR IALASFAAYA IGQILDIFVF NKLRRLKAWW IAPNASTVIG
151 HALDT...
Further work revealed the complete nucleotide sequence <SEQ ID 261>:
1 ATGTACGCAT TTACCGCCGC ACAGCAACAG AAGGCACTCT TCCGGCTGGT
51 GCTTTTTCAT ATCCTCATCA TCGCCGCCAG CAACTATCTG GTGCAGTTCC
101 CTTTCCAAAT TTTCGGCATC CACACCACTT GGGGCGCATT TTCCTTTCCC
151 TTCATCTTCC TTGCCACCGA CCTGACCGTC CGCATTTTCG GTTCTCACTT
201 GGCACGGCGG ATTATCTTTT GGGTGATGTT CCCCGCCCTT TTGCTTTCCT
251 ACGTCTTTTC CGTTTTGTTC CACAACGGCA GTTGGACAGG CTTGGGCGCG
301 CTGTCCGAAT TCAACACCTT TGTCGGACGC ATCGCCTTAG CCAGCTTTGC
351 CGCCTACGCG ATCGGACAAA TCCTTGATAT TTTTGTATTC AACAAATTAC
401 GCCGTCTGAA AGCGTGGTGG ATTGCACCGA CCGCATCAAC CGTCATCGGC
451 AACGCCTTGG ATACGCTGGT ATTTTTCGCC GTTGCCTTCT ACGCAAGCAG
501 CGATGGATTT ATGGCGGCAA ACTGGCAGGG CATCGCTTTT GTCGATTACC
551 TGTTCAAACT TACCGTCTGC ACCCTCTTCT TCCTGCCCGC CTACGGCGTG
601 ATACTGAATC TGCTGACGAA AAAACTGACA ACCCTGCAAA CCAAACAGGC
651 GCAAGACCGC CCCGCGCCCT CGCTGCAAAA TCCGTAA
This corresponds to the amino acid sequence <SEQ ID 262; ORF66-1>:
1 MYAFTAAQQQ KALFRLVLFH ILIIAASNYL VQFPFQIFGI HTTWGAFSFP
51 FIFLATDLTV RIFGSHLARR IIFWVMFPAL LLSYVFSVLF HNGSWTGLGA
101 LSEFNTFVGR IALASFAAYA IGQILDIFVF NKLRRLKAWW IAPTASTVIG
151 NALDTLVFFA VAFYASSDGF MAANWQGIAF VDYLFKLTVC TLFFLPAYGV
201 ILNLLTKKLT TLQTKQAQDR PAPSLQNP*
该氨基酸序列的计算机分析给出了下列结果:
与大肠杆菌的假设蛋白o221(登录号P37619)的同源性
ORF66和o221蛋白在155个氨基酸的重叠区内有67%的氨基酸相同性:
orf66 1 MYAFTAAQQQKALFRLVLFHILIIAASNYLVQFPFQIFGIHTTWGAFSFPFIFLATDLTV 60
M F+ Q+ KALF L LFH+L+I +SNYLVQ P I G HTTWGAFSFPFIFLATDLTV
o221 1 MNVFSQTQRYKALFWLSLFHLLVITSSNYLVQLPVSILGFHTTWGAFSFPFIFLATDLTV 60
orf66 61 RIFGSHLARRIIFWVMFPALLLSYVFSVLFHNGSWTGLGALSEFNTFVGRIALASFAAYA 120
RIFG+ LARRIIF VM PALL+SYV S LF+GSW G GAL +FN FV RIA ASF AYA
o221 61 RIFGAPLARRIIFAVMIPALLISYVISSLFYMGSWQGFGALAHFNLFVARIATASFMAYA 120
orf66 121 IGQILDIFVFNKLRRLKAWWIAPNASTVIGHALDT 155
+GQILD+ VFN+LR+ + WW+AP AST+ G+ DT
o221 121 LGQILDVHVFNRLRQSRRWWLAPTASTLFGNVSDT 155
Homology with a predicted ORF from N. meningitidis (strain A)
ORF66 shows 96.1% identity over a 155 aa overlap with an ORF (ORF66a) from strain A of N. meningitidis:
10 20 30 40 50 60
orf66.pep MYAFTAAQQQKALFRLVLFHILIIAASNYLVQFPFQIFGIHTTWGAFSFPFIFLATDLTV
|||||||||||||| |||||||||||||||||||||| ||||||||||||||||||||||
orf66a MYAFTAAQQQKALFWLVLFHILIIAASNYLVQFPFQISGIHTTWGAFSFPFIFLATDLTV
10 20 30 40 50 60
70 80 90 100 110 120
orf66.pep RIFGSHLARRIIFWVMFPALLLSYVFSVLFHNGSWTGLGALSEFNTFVGRIALASFAAYA
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf66a RIFGSHLARRIIFWVMFPALLLSYVFSVLFHNGSWTGLGALSEFNTFVGRIALASFAAYA
70 80 90 100 110 120
130 140 150
orf66.pep IGQILDIFVFNKLRRLKAWWIAPNASTVIGHALDT
:|||||||||||||||||||:||:||||||:||||
orf66a LGQILDIFVFNKLRRLKAWWVAPTASTVIGNALDTLVFFAVAFYASSDGFMAANWQGIAF
130 140 150 160 170 180
orf66a VDYLFKLTVCGLFFLPAYGVILNLLTKKLTTLQTKQAQDRPAPSLQNPX
190 200 210 220
The complete length ORF66a nucleotide sequence <SEQ ID 263> is:
1 ATGTACGCAT TTACCGCCGC ACAGCAACAG AAGGCACTCT TCTGGCTGGT
51 GCTTTTTCAT ATCCTCATCA TCGCCGCCAG CAACTATCTG GTGCAGTTCC
101 CCTTCCAAAT TTCCGGCATC CACACCACTT GGGGCGCGTT TTCCTTTCCC
151 TTCATCTTCC TCGCCACCGA CCTGACCGTC CGCATTTTCG GTTCGCACTT
201 GGCACGGCGG ATTATCTTTT GGGTCATGTT CCCCGCCCTT TTGCTTTCCT
251 ACGTCTTTTC CGTTTTGTTC CACAACGGCA GTTGGACGGG CTTGGGCGCG
301 CTGTCCGAAT TCAACACCTT TGTCGGACGC ATCGCGCTGG CAAGTTTTGC
351 CGCCTACGCG CTCGGACAAA TCCTTGATAT TTTTGTGTTC AACAAATTAC
401 GCCGTCTGAA AGCGTGGTGG GTTGCCCCGA CTGCATCAAC CGTCATCGGC
451 AACGCCTTAG ATACGTTGGT ATTTTTCGCC GTTGCCTTCT ACGCAAGCAG
501 CGATGGATTT ATGGCGGCAA ACTGGCAGGG CATCGCTTTT GTCGATTACC
551 TGTTCAAACT CACCGTCTGC GGTCTGTTTT TCCTGCCCGC CTACGGCGTG
601 ATTCTGAATC TGCTGACGAA AAAACTGACG ACCCTGCAAA CCAAACAGGC
651 GCAAGACCGC CCCGCGCCCT CGCTGCAAAA TCCGTAA
This encodes a protein having the amino acid sequence <SEQ ID 264>:
1 MYAFTAAQQQ KALFWLVLFH ILIIAASNYL VQFPFQISGI HTTWGAFSFP
51 FIFLATDLTV RIFGSHLARR IIFWVMFPAL LLSYVFSVLF HNGSWTGLGA
101 LSEFNTFVGR IALASFAAYA LGQILDIFVF NKLRRLKAWW VAPTASTVIG
151 NALDTLVFFA VAFYASSDGF MAANWQGIAF VDYLFKLTVC GLFFLPAYGV
201 ILNLLTKKLT TLQTKQAQDR PAPSLQNP*
ORF66a and ORF66-1 show 97.8% identity in a 228 aa overlap:
10 20 30 40 50 60
orf66a.pep MYAFTAAQQQKALFWLVLFHILIIAASNYLVQFPFQISGIHTTWGAFSFPFIFLATDLTV
|||||||||||||| |||||||||||||||||||||| ||||||||||||||||||||||
orf66-1 MYAFTAAQQQKALFRLVLFHILIIAASNYLVQFPFQIFGIHTTWGAFSFPFIFLATDLTV
10 20 30 40 50 60
70 80 90 100 110 120
orf66a.pep RIFGSHLARRIIFWVMFPALLLSYVFSVLFHNGSWTGLGALSEFNTFVGRIALASFAAYA
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf66-1 RIFGSHLARRIIFWVMFPALLLSYVFSVLFHNGSWTGLGALSEFNTFVGRIALASFAAYA
70 80 90 100 110 120
130 140 150 160 170 180
orf66a.pep LGQILDIFVFNKLRRLKAWWVAPTASTVIGNALDTLVFFAVAFYASSDGFMAANWQGIAF
:|||||||||||||||||||:|||||||||||||||||||||||||||||||||||||||
orf66-1 IGQILDIFVFNKLRRLKAWWIAPTASTVIGNALDTLVFFAVAFYASSDGFMAANWQGIAF
130 140 150 160 170 180
190 200 210 220 229
orf66a.pep VDYLFKLTVCGLFFLPAYGVILNLLTKKLTTLQTKQAQDRPAPSLQNPX
|||||||||| ||||||||||||||||||||||||||||||||||||||
orf66-1 VDYLFKLTVCTLFFLPAYGVILNLLTKKLTTLQTKQAQDRPAPSLQNPX
190 200 210 220
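The identity figures quoted for these alignments can be reproduced by counting matching positions across the overlap. The Python sketch below is illustrative only: the function name `percent_identity` and the variable names are assumptions, not part of the original analysis, and the function assumes two pre-aligned, ungapped strings of equal length. The example input is the first 60-residue block of the ORF66a / ORF66-1 alignment above.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Return the percentage of identical positions in two pre-aligned sequences.

    Assumes the sequences are already aligned, ungapped and of equal length.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    matches = sum(1 for a, b in zip(seq_a, seq_b) if a == b)
    return 100.0 * matches / len(seq_a)


# Residues 1-60 of ORF66a and ORF66-1, copied from the alignment above.
ORF66A_1_60 = "MYAFTAAQQQKALFWLVLFHILIIAASNYLVQFPFQISGIHTTWGAFSFPFIFLATDLTV"
ORF66_1_1_60 = "MYAFTAAQQQKALFRLVLFHILIIAASNYLVQFPFQIFGIHTTWGAFSFPFIFLATDLTV"

identity = percent_identity(ORF66A_1_60, ORF66_1_1_60)
print(f"{identity:.1f}% identity over {len(ORF66A_1_60)} aa")
```

Applied to the full 228-residue sequences, the same count gives the overall figure stated for the overlap; gapped alignments would need the gap columns handled explicitly.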
Homology with a predicted ORF from N. gonorrhoeae
ORF66 shows 94.2% identity over a 155 aa overlap with a predicted ORF (ORF66.ng) from N. gonorrhoeae:
orf66.pep MYAFTAAQQQKALFRLVLFHILIIAASNYLVQFPFQIFGIHTTWGAFSFPFIFLATDLTV 60
|||:|||||||||||||||||||||||||||||||:||||||||||||||||||||||||
orf66ng MYALTAAQQQKALFRLVLFHILIIAASNYLVQFPFRIFGIHTTWGAFSFPFIFLATDLTV 60
orf66.pep RIFGSHLARRIIFWVMFPALLLSYVFSVLFHNGSWTGLGALSEFNTFVGRIALASFAAYA 120
||||||||||||||||||||| |||||||||||||||||| |:|||||||||||||||||
orf66ng RIFGSHLARRIIFWVMFPALSLSYVFSVLFHNGSWTGLGAPSQFNTFVGRIALASFAAYA 120
orf66.pep IGQILDIFVFNKLRRLKAWWIAPNASTVIGHALDT 155
:|||||||||:|||||||||||| ||||||:||||
orf66ng LGQILDIFVFDKLRRLKAWWIAPAASTVIGNALDTLVFFAVAFYASSDEFMAANWQGIAF 180
The complete length ORF66ng nucleotide sequence <SEQ ID 265> is:
1 ATGTACGCAT TGACCGCCGC ACAGCAACAG AAGGCACTCT TCCGGCTGGT
51 GCTTTTCCAT ATCCTCATCA TCGCCGCCAG CAACTATCTG GTGCAGTTCC
101 CCTTCCGGAT TTTCGGCATC CACACCACTT GGGGCGCGTT TTCCTTTCCC
151 TTCATCTTCC TCGCCACCGA CCTGACCGTC CGCATTTTCG GTTCGCACTT
201 GGCGCGGCGG ATTATCTTTT GGGTGATGTT CCCCGCCCTT ttgCTTTcat
251 aCGTCTTTTC CGTTTTGTTC CACAACGGCA GTTGGACGGG CTTGGGCGCG
301 ctgTCCCAAT TCAACACCTT TGTCGGACGC ATCGCGCTGG CAAGTTTTGC
351 CGCCTACGCG CTCGGACAAA TCCTTGATAT TTTCGTATTC GACAAATTAC
401 GCCGTCTGAA AGCGTGGTGG ATTGCCCCGG CCGCATCAAC CGTCATCGGC
451 AATGCACTGG ACACGTTAGT ATTTTTTGCC GTTGCCTTTT ACGCAAGCAG
501 CGATGAATTT ATGGCGGCAA ACTGGCAGGG CATCGCTTTT GTCGATTACC
551 TGTTCAAACT TACCGTCTGC ACCCTCTTCT TCCTGCCCGC CTACGGCGTG
601 ATACTGAATC TGCTGACGAA AAAACTGACG GCCCTGCAAA CCAAACAGGC
651 GCAAGACCGC CCCGTGCCCT CGCTGCAAAA TCCGTAA
This encodes a protein having the amino acid sequence <SEQ ID 266>:
1 MYALTAAQQQ KALFRLVLFH ILIIAASNYL VQFPFRIFGI HTTWGAFSFP
51 FIFLATDLTV RIFGSHLARR IIFWVMFPAL SLSYVFSVLF HNGSWTGLGA
101 PSQFNTFVGR IALASFAAYA LGQILDIFVF DKLRRLKAWW IAPAASTVIG
151 NALDTLVFFA VAFYASSDEF MAANWQGIAF VDYLFKLTVC TLFFLPAYGV
201 ILNLLTKKLT ALQTKQAQDR PVPSLQNP*
An alternative annotated sequence is:
1 MYALTAAQQQ KALFRLVLFH ILIIAASNYL VQFPFRIFGI HTTWGAFSFP
51 FIFLATDLTV RIFGSHLARR IIFWVMFPAL LLSYVFSVLF HNGSWTGLGA
101 LSQFNTFVGR IALASFAAYA LGQILDIFVF DKLRRLKAWW IAPAASTVIG
151 NALDTLVFFA VAFYASSDEF MAANWQGIAF VDYLFKLTVC TLFFLPAYGV
201 ILNLLTKKLT ALQTKQAQDR PVPSLQNP*
ORF66ng and ORF66-1 show 96.1% identity in a 228 aa overlap:
orf66-1.pep MYAFTAAQQQKALFRLVLFHILIIAASNYLVQFPFQIFGIHTTWGAFSFPFIFLATDLTV 60
|||:|||||||||||||||||||||||||||||||:||||||||||||||||||||||||
orf66ng MYALTAAQQQKALFRLVLFHILIIAASNYLVQFPFRIFGIHTTWGAFSFPFIFLATDLTV 60
orf66-1.pep RIFGSHLARRIIFWVMFPALLLSYVFSVLFHNGSWTGLGALSEFNTFVGRIALASFAAYA 120
||||||||||||||||||||||||||||||||||||||||||:|||||||||||||||||
orf66ng RIFGSHLARRIIFWVMFPALLLSYVFSVLFHNGSWTGLGALSQFNTFVGRIALASFAAYA 120
orf66-1.pep IGQILDIFVFNKLRRLKAWWIAPTASTVIGNALDTLVFFAVAFYASSDGFMAANWQGIAF 180
:|||||||||:||||||||||||:|||||||||||||||||||||||| |||||||||||
orf66ng LGQILDIFVFDKLRRLKAWWIAPAASTVIGNALDTLVFFAVAFYASSDEFMAANWQGIAF 180
orf66-1.pep VDYLFKLTVCTLFFLPAYGVILNLLTKKLTTLQTKQAQDRPAPSLQNPX 229
||||||||||||||||||||||||||||||:||||||||||:|||||||
orf66ng VDYLFKLTVCTLFFLPAYGVILNLLTKKLTALQTKQAQDRPVPSLQNPX 229
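In addition, ORF66ng shows significant homology with an ORF from E. coli:
sp|P37619|YHHQ_ECOLI HYPOTHETICAL 25.3 KD PROTEIN IN FTSY-NIKA INTERGENIC REGION (O221)
>gi|1073495|pir||S47690 hypothetical protein o221 - Escherichia coli >gi|466607 (U00039) No definition line found [Escherichia coli]
>gi|1789882 (AE000423) hypothetical 25.3 kD protein in ftsY-nikA intergenic region [Escherichia coli] Length = 221
Score = 273 bits (692), Expect = 5e-73
Identities = 132/203 (65%), Positives = 155/203 (76%)
Query: 1 MYALTAAQQQKALFRLVLFHILIIAASNYLVQFPFRIFGIHTTWGAFSFPFIFLATDLTV 60
M + Q+ KALF L LFH+L+I +SNYLVQ P I G HTTWGAFSFPFIFLATDLTV
Sbjct: 1 MNVFSQTQRYKALFWLSLFHLLVITSSNYLVQLPVSILGFHTTWGAFSFPFIFLATDLTV 60
Query: 61 RIFGSHLARRIIFWVMFPALLLSYVFSVLFHNGSWTGLGALSQFNTFVGRIALASFAAYA 120
RIFG+ LARRIIF VM PALL+SYV S LF+ GSW G GAL+ FN FV RIA ASF AYA
Sbjct: 61 RIFGAPLARRIIFAVMIPALLISYVISSLFYMGSWQGFGALAHFNLFVARIATASFMAYA 120
Query: 121 LGQILDIFVFDKLRRLKAWWIAPAASTVIGNALDTLVFFAVAFYASSDEFMAANWQGIAF 180
LGQILD+ VF++LR+ + WW+AP AST+ GN DTL FF +AF+ S D FMA +W IA
Sbjct: 121 LGQILDVHVFNRLRQSRRWWLAPTASTLFGNVSDTLAFFFIAFWRSPDAFMAEHWMEIAL 180
Query: 181 VDYLFKLTVCTLFFLPAYGVILN 203
VDY FK+ + +FFLP YGV+LN
Sbjct: 181 VDYCFKVLISIVFFLPMYGVLLN 203
Based on this analysis, including the homology with the E. coli protein and the presence of several putative transmembrane domains in the gonococcal protein, it is predicted that these proteins from N. meningitidis and N. gonorrhoeae, and their epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies.
Example 32
The following partial DNA sequence was identified in N. meningitidis <SEQ ID 267>: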
1 ATGGTCATAA AATATACAAA TTTGAATTTT GCGAAATTGT CGATAATTGC
51 AATTTTGATG ATGTATTCGT TTGAAGCGAA TGCAAAyGCA GTmwrAATAT
101 CTGAAACTGT TTCAGTTGAT ACCGGACAAG GTGCGAAAAT TCATAAGTTT
151 GTACCTAAAA ATAGTAAAAC TTATTCATCT GATTTAATAA AAACGGTAGA
201 TTTAACACAC AyyCCTACGG GCGCAAAAGC CCGAATCAAC GCCAAAATAA
251 CCGCCAGCGT ATCCCGCGCC GGCGTATTGG CGGGGGTCGG CAAACTTGCC
301 CGCTTAGgCG CGAAATTCAG CACAAGGGCG GTtCCCTATG TCGGAACAGC
351 CcTTTTAGCC CACGACGTAT ACGAAAcTTT CAAAGAAGAC ATACAGGCAC
401 GAGGCTACCA ATACGACCCC GAAACCGACA AATTTGTAAA AGGCTACGAA
451 TATAGTAATT GCCTTTGGTA CGAAGACAAA AGACGTATTA ATAGAACCTA
501 TGGCTGCTAC GGCGTTGAT..
This corresponds to the amino acid sequence <SEQ ID 268; ORF72>:
1 MVIKYTNLNF AKLSIIAILM MYSFEANANA VXISETVSVD TGQGAKIHKF
51 VPKNSKTYSS DLIKTVDLTH XPTGAKARIN AKITASVSRA GVLAGVGKLA
101 RLGAKFSTRA VPYVGTALLA HDVYETFKED IQARGYQYDP ETDKFVKGYE
151 YSNCLWYEDK RRINRTYGCY GVD..
Further work revealed the complete nucleotide sequence <SEQ ID 269>:
1 ATGGTCATAA AATATACAAA TTTGAATTTT GCGAAATTGT CGATAATTGC
51 AATTTTGATG ATGTATTCGT TTGAAGCGAA TGCAAATGCA GTAAAAATAT
101 CTGAAACTGT TTCAGTTGAT ACCGGACAAG GTGCGAAAAT TCATAAGTTT
151 GTACCTAAAA ATAGTAAAAC TTATTCATCT GATTTAATAA AAACGGTAGA
201 TTTAACACAC ATCCCTACGG GCGCAAAAGC CCGAATCAAC GCCAAAATAA
251 CCGCCAGCGT ATCCCGCGCC GGCGTATTGG CGGGGGTCGG CAAACTTGCC
301 CGCTTAGGCG CGAAATTCAG CACAAGGGCG GTTCCCTATG TCGGAACAGC
351 CCTTTTAGCC CACGACGTAT ACGAAACTTT CAAAGAAGAC ATACAGGCAC
401 GAGGCTACCA ATACGACCCC GAAACCGACA AATTTGCAAA GGTCTCAGGC
451 TAA
This corresponds to the amino acid sequence <SEQ ID 270; ORF72-1>:
1 MVIKYTNLNF AKLSIIAILM MYSFEANANA VKISETVSVD TGQGAKIHKF
51 VPKNSKTYSS DLIKTVDLTH IPTGAKARIN AKITASVSRA GVLAGVGKLA
101 RLGAKFSTRA VPYVGTALLA HDVYETFKED IQARGYQYDP ETDKFAKVSG
151 *
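The correspondence between each nucleotide sequence and its amino acid sequence follows from reading the DNA in frame, three bases at a time. The Python sketch below is an illustration rather than the tool used in the original work: the names `CODON_TABLE` and `translate` are assumptions, the table is the standard genetic code, and any codon containing an ambiguity code (such as the lower-case `y`, `m`, `w`, `r` in SEQ ID 267) is simply reported as `X`.

```python
from itertools import product

# Standard genetic code in TCAG order: TTT, TTC, TTA, TTG, TCT, ...
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate DNA in frame 1; whitespace is ignored, '*' marks a stop codon.

    Codons that are not plain A/C/G/T triplets are reported as 'X'
    (a simplification: no attempt is made to resolve ambiguity codes).
    """
    dna = "".join(dna.split()).upper()
    protein = []
    for i in range(0, len(dna) - 2, 3):
        protein.append(CODON_TABLE.get(dna[i:i + 3], "X"))
    return "".join(protein)


# First 30 and last 24 nucleotides of SEQ ID 269, copied from the listing above.
n_terminus = translate("ATGGTCATAA AATATACAAA TTTGAATTTT")
c_terminus = translate("AAATTTGCAAAGGTCTCAGGCTAA")
print(n_terminus, c_terminus)
```

The two fragments translate to the first ten and last seven residues of ORF72-1, followed by the stop codon.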
Computer analysis of this amino acid sequence gave the following results:
Homology with a predicted ORF from N. meningitidis (strain A)
ORF72 shows 98.0% identity over a 147 aa overlap with an ORF (ORF72a) from strain A of N. meningitidis:
10 20 30 40 50 60
orf72.pep MVIKYTNLNFAKLSIIAILMMYSFEANANAVXISETVSVDTGQGAKIHKFVPKNSKTYSS
||||||||||||||||||||||||||||||| ||||||||||||||||||||||||||||
orf72a MVIKYTNLNFAKLSIIAILMMYSFEANANAVKISETVSVDTGQGAKIHKFVPKNSKTYSS
10 20 30 40 50 60
70 80 90 100 110 120
orf72.pep DLIKTVDLTHXPTGAKARINAKITASVSRAGVLAGVGKLARLGAKFSTRAVPYVGTALLA
|||||||||| |||||||||||||||||||||||||||||||||||||||||||||||||
orf72a DLIKTVDLTHIPTGAKARINAKITASVSRAGVLAGVGKLARLGAKFSTRAVPYVGTALLA
70 80 90 100 110 120
130 140 150 160 170
orf72.pep HDVYETFKEDIQARGYQYDPETDKFVKGYEYSNCLWYEDKRRINRTYGCYGVD
|||||||||||||||||||||||||:|
orf72a HDVYETFKEDIQARGYQYDPETDKFAKVSGX
130 140 150
The complete length ORF72a nucleotide sequence <SEQ ID 271> is:
1 ATGGTCATAA AATATACAAA TTTGAATTTT GCGAAATTGT CGATAATTGC
51 AATTTTGATG ATGTATTCGT TTGAAGCGAA TGCAAATGCA GTAAAAATAT
101 CTGAAACTGT TTCAGTTGAT ACCGGACAAG GTGCGAAAAT TCATAAGTTT
151 GTACCTAAAA ATAGTAAAAC TTATTCATCT GATTTAATAA AAACGGTAGA
201 TTTAACACAC ATCCCTACGG GCGCAAAAGC CCGAATCAAC GCCAAAATAA
251 CCGCCAGCGT ATCCCGCGCC GGCGTATTGG CGGGGGTCGG CAAACTTGCC
301 CGCTTAGGCG CGAAATTCAG CACAAGGGCG GTTCCCTATG TCGGAACAGC
351 CCTTTTAGCC CACGACGTAT ACGAAACTTT CAAAGAAGAC ATACAGGCAC
401 GAGGCTACCA ATACGACCCC GAAACCGACA AATTTGCAAA GGTCTCAGGC
451 TAA
This encodes a protein having the amino acid sequence <SEQ ID 272>:
1 MVIKYTNLNF AKLSIIAILM MYSFEANANA VKISETVSVD TGQGAKIHKF
51 VPKNSKTYSS DLIKTVDLTH IPTGAKARIN AKITASVSRA GVLAGVGKLA
101 RLGAKFSTRA VPYVGTALLA HDVYETFKED IQARGYQYDP ETDKFAKVSG
151 *
ORF72a and ORF72-1 show 100.0% identity in a 150 aa overlap:
10 20 30 40 50 60
orf72a.pep MVIKYTNLNFAKLSIIAILMMYSFEANANAVKISETVSVDTGQGAKIHKFVPKNSKTYSS
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf72-1 MVIKYTNLNFAKLSIIAILMMYSFEANANAVKISETVSVDTGQGAKIHKFVPKNSKTYSS
10 20 30 40 50 60
70 80 90 100 110 120
orf72a.pep DLIKTVDLTHIPTGAKARINAKITASVSRAGVLAGVGKLARLGAKFSTRAVPYVGTALLA
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf72-1 DLIKTVDLTHIPTGAKARINAKITASVSRAGVLAGVGKLARLGAKFSTRAVPYVGTALLA
70 80 90 100 110 120
130 140 150
orf72a.pep HDVYETFKEDIQARGYQYDPETDKFAKVSGX
|||||||||||||||||||||||||||||||
orf72-1 HDVYETFKEDIQARGYQYDPETDKFAKVSGX
130 140 150
Homology with a predicted ORF from N. gonorrhoeae
ORF72 shows 89% identity over a 173 aa overlap with a predicted ORF (ORF72.ng) from N. gonorrhoeae:
orf72.pep MVIKYTNLNFAKLSIIAILMMYSFEANANAVXISETVSVDTGQGAKIHKFVPKNSKTYSS 60
|| |:|||||||||||||||||||||||||| ||||:|||||||||:||||||:|: |||
orf72ng MVTKHTNLNFAKLSIIAILMMYSFEANANAVKISETLSVDTGQGAKVHKFVPKSSNIYSS 60
orf72.pep DLIKTVDLTHXPTGAKARINAKITASVSRAGVLAGVGKLARLGAKFSTRAVPYVGTALLA 120
|| |:||||| ||||||||||||||||||||||:|||||:| ||||:|||||||||||||
orf72ng DLTKAVDLTHIPTGAKARINAKITASVSRAGVLSGVGKLVRQGAKFGTRAVPYVGTALLA 120
orf72.pep HDVYETFKEDIQARGYQYDPETDKFVKGYEYSNCLWYEDKRRINRTYGCYGVD 173
||||||||||||||||| :||||||||||||:|||||||:|||||||||||||
orf72ng HDVYETFKEDIQARGCRYDPETDKFVKGYEYANCLWYEDERRINRTYGCYGVDSSIMRLM 180
The ORF72ng nucleotide sequence <SEQ ID 273> is predicted to encode a protein having the amino acid sequence <SEQ ID 274>:
1 MVTKHTNLNF AKLSIIAILM MYSFEANANA VKISETLSVD TGQGAKVHKF
51 VPKSSNIYSS DLTKAVDLTH IPTGAKARIN AKITASVSRA GVLSGVGKLV
101 RQGAKFGTRA VPYVGTALLA HDVYETFKED IQARGCRYDP ETDKFVKGYE
151 YANCLWYEDE RRINRTYGCY GVDSSIMRLM PDRSRFPEVK QLMESQMYRL
201 ARPFWNWRKE ELNKLSSLDW NNFVLNRCTF DWNGGGCAVN KGDDFRAGAS
251 FSLGRNPKYK EEMDAKKPEE ILSLKVDADP DKYIEATGYP GYSEKVEVAP
301 GTKVNMGPVT DRNGNPVQVA ATFGRDAQGN TTADVQVIPR PDLTPASAEA
351 PHAQPLPEVS PAENPANNPD PDENPGTRPN PEPDPDLNPD ANPDTDGQPG
401 TSPDSPAVPD RPNGRHRKER KEGEDGGLSC DYFPEILACQ EMGKPSDRMF
451 HDISIPQVTD DKTWSSHNFL PSNGVCPQPK TFHVFGRQYR ASYEPLCVFA
501 EKIRFAVLLA FIIMSAFVVF GSLGGE*
After further analysis, the following gonococcal DNA sequence <SEQ ID 275> was identified:
1 ATGGTCACAA AACATACAAA TTTGAATTTT GCGAAATTGT CGATAATTGC
51 AATTTTGATG ATGTATTCGT TTGAAGCGAA TGCAAATGCA GTAAAAATAT
101 CTGAAACTCT TTCGGTTGAT ACCGGACAAG GCGCGAAAGT TCATAAGTTC
151 GTTCCTAAAT CAAGTAATAT TTATTCATCT GATTTAACAA AAGCGGTAGA
201 TTTAACGCAT ATCCCCACGG GCGCAAAAGC CCGAATCAAC GCCAAAATAA
251 CCGCCAGCGT ATCCCGCGCC GGCGTATTGT CGGGGGTCGG CAAACTTGTC
301 CGCCAAGGCG CGAAATTCGG CACAAGGGCG GTTCCCTATG TCGGAACAGC
351 CCTTTTAGCC CACGACGTAT ACGAAACTTT CAAAGAAGAC ATACAGGCAC
401 GAGGCTGCCG ATACGATCCC GAAACCGACA AATTT
This corresponds to the amino acid sequence <SEQ ID 276; ORF72ng-1>:
1 MVTKHTNLNF AKLSIIAILM MYSFEANANA VKISETLSVD TGQGAKVHKF
51 VPKSSNIYSS DLTKAVDLTH IPTGAKARIN AKITASVSRA GVLSGVGKLV
101 RQGAKFGTRA VPYVGTALLA HDVYETFKED IQARGCRYDP ETDKF
ORF72ng-1 and ORF72-1 show 89.7% identity in a 145 aa overlap:
10 20 30 40 50 60
orf72ng-1.pe MVTKHTNLNFAKLSIIAILMMYSFEANANAVKISETLSVDTGQGAKVHKFVPKSSNIYSS
|| |:|||||||||||||||||||||||||||||||:|||||||||:||||||:|: |||
orf72-1 MVIKYTNLNFAKLSIIAILMMYSFEANANAVKISETVSVDTGQGAKIHKFVPKNSKTYSS
10 20 30 40 50 60
70 80 90 100 110 120
orf72ng-1.pe DLTKAVDLTHIPTGAKARINAKITASVSRAGVLSGVGKLVRQGAKFGTRAVPYVGTALLA
|| |:||||||||||||||||||||||||||||:|||||:| ||||:|||||||||||||
orf72-1 DLIKTVDLTHIPTGAKARINAKITASVSRAGVLAGVGKLARLGAKFSTRAVPYVGTALLA
70 80 90 100 110 120
130 140
orf72ng-1.pe HDVYETFKEDIQARGCRYDPETDKF
||||||||||||||| :||||||||
orf72-1 HDVYETFKEDIQARGYQYDPETDKFAKVSGX
130 140 150
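Based on this analysis, including the presence of a putative leader sequence and several transmembrane domains in the gonococcal protein, it is predicted that this protein from N. meningitidis and N. gonorrhoeae, and its epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies.
Example 33
The following partial DNA sequence was identified in N. meningitidis <SEQ ID 277>: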
1 ATGAGATTTT TCGGTATCGG TTTTTTGGTG CTGCTGTTTT TGGAGATTAT
51 GTCGATTGTG TGGGTTGCCG ATTGGCTGGG CGGCGGCTGG ACGTTGTTTT
101 TGATGGCGGC AGGTTTTGCC GCCGGCGTGC TGATGCTCAG GCAAACCGGG
151 CTGACCGGT CTTTTATTGG CGGGCGCGGC AATGAGAAGC GGCGGGAAGG
201 TATCCGTTTA TCAGATGTTG TGGCCTATC..
This corresponds to the amino acid sequence <SEQ ID 278; ORF73>:
1 MRFFGIGFLV LLFLEIMSIV WVADWLGGGW TLFLMAAGFA AGVLMLRQTG
51 LTGLLLAGAA MRSGGKVSVY QMLWPI..
Further work revealed the complete nucleotide sequence <SEQ ID 279>:
1 ATGAGATTTT TCGGTATCGG TTTTTTGGTG CTGCTGTTTT TGGAGATTAT
51 GTCGATTGTG TGGGTTGCCG ATTGGCTGGG CGGCGGCTGG ACGTTGTTTT
101 TGATGGCGGC AGGTTTTGCC GCCGGCGTGC TGATGCTCAG GCATACGGGG
151 CTGTCCGGTC TTTTATTGGC GGGCGCGGCA ATGAGAAGCG GCGGGAGGGT
201 ATCCGTTTAT CAGATGTTGT GGCCTATCCG TTATACGGTG GCGGCTGTGT
251 GTCTGATGAG TCCGGGATTC GTATCCTCGG TGTTGGCGGT ATTGCTGCTG
301 CTGCCGTTTA AGGGAGGGGC AGTGTTGCAG GCAGGAGGTG CGGAAAATTT
351 TTTCAACATG AACCAATCGG GCAGAAAAGA GGGCTTTTCC CGCGATGACG
401 ATATTATCGA GGGAGAATAT ACGGTTGAAG AGCCTTACGG CGGCAATCGT
451 TCCCGAAACG CCATCGAACA CAAAAAAGAC GAATAA
This corresponds to the amino acid sequence <SEQ ID 280; ORF73-1>:
1 MRFFGIGFLV LLFLEIMSIV WVADWLGGGW TLFLMAAGFA AGVLMLRHTG
51 LSGLLLAGAA MRSGGRVSVY QMLWPIRYTV AAVCLMSPGF VSSVLAVLLL
101 LPFKGGAVLQ AGGAENFFNM NQSGRKEGFS RDDDIIEGEY TVEEPYGGNR
151 SRNAIEHKKD E*
Computer analysis of this amino acid sequence gave the following results:
Homology with a predicted ORF from N. meningitidis (strain A)
ORF73 shows 90.8% identity over a 76 aa overlap with an ORF (ORF73a) from strain A of N. meningitidis:
10 20 30 40 50 60
orf73.pep MRFFGIGFLVLLFLEIMSIVWVADWLGGGWTLFLMAAGFAAGVLMLRQTGLTGLLLAGAA
||||||||||||||||||||||||||||||||||||| |||||:|||:|||:||||||||
orf73a MRFFGIGFLVLLFLEIMSIVWVADWLGGGWTLFLMAATFAAGVVMLRHTGLSGLLLAGAA
10 20 30 40 50 60
70
orf73.pep MRSGGKVSVYQMLWPI
|||||:|||| ||| |
orf73a MRSGGRVSVYXMLWXIRYTVAAVCXMSPGFVSSVXAVLLXLPFKGGAVLQAGGAENFFNM
The complete length ORF73a nucleotide sequence <SEQ ID 281> is:
1 ATGAGATTTT TCGGTATCGG TTTTTTGGTG CTGCTGTTTT TGGAGATTAT
51 GTCGATTGTG TGGGTTGCCG ATTGGTTGGG CGGCGGTTGG ACGCTGTTTC
101 TAATGGCGGC AACCTTTGCC GCCGGCGTGG TGATGCTCAG GCATACGGGG
151 CTGTCCGGTC TTTTATTGGC GGGCGCGGCA ATGAGAAGCG GCGGGAGGGT
201 ATCCGTTTAT CANATGTTGT GGCNTATCCG TTATACGGTG GCGGCGGTGT
251 GTCNGATGAG TCCGGGATTC GTATCCTCGG TGTNGGCGGT ATTGCTGNTG
301 CTNCCGTTTA AGGGAGGTGC AGTGTTGCAG GCAGGAGGTG CGGAAAATTT
351 TTTCAACATG AACCANTCGG GCAGAAAAGA NGGCNTTTCC CGCGATGACG
401 ATATTATCGA GGGGGAATAT ACGGTTGAAG ANCCTTACGG CGGCANTCGT
451 TTCCGAAACG CCNTNGAACA CAAAAAAGAC GAATAA
This encodes a protein having the amino acid sequence <SEQ ID 282>:
1 MRFFGIGFLV LLFLEIMSIV WVADWLGGGW TLFLMAATFA AGVVMLRHTG
51 LSGLLLAGAA MRSGGRVSVY XMLWXIRYTV AAVCXMSPGF VSSVXAVLLX
101 LPFKGGAVLQ AGGAENFFNM NXSGRKXGXS RDDDIIEGEY TVEXPYGGXR
151 FRNAXEHKKD E*
ORF73a and ORF73-1 show 91.3% identity in a 161 aa overlap:
10 20 30 40 50 60
orf73a.pep MRFFGIGFLVLLFLEIMSIVWVADWLGGGWTLFLMAATFAAGVVMLRHTGLSGLLLAGAA
||||||||||||||||||||||||||||||||||||| |||||:||||||||||||||||
orf73-1 MRFFGIGFLVLLFLEIMSIVWVADWLGGGWTLFLMAAGFAAGVLMLRHTGLSGLLLAGAA
10 20 30 40 50 60
70 80 90 100 110 120
orf73a.pep MRSGGRVSVYXMLWXIRYTVAAVCXMSPGFVSSVXAVLLXLPFKGGAVLQAGGAENFFNM
|||||||||| ||| ||||||||| ||||||||| |||| ||||||||||||||||||||
orf73-1 MRSGGRVSVYQMLWPIRYTVAAVCLMSPGFVSSVLAVLLLLPFKGGAVLQAGGAENFFNM
70 80 90 100 110 120
130 140 150 160
orf73a.pep NXSGRKXGXSRDDDIIEGEYTVEXPYGGXRFRNAXEHKKDEX
| |||| | |||||||||||||| |||| | ||| |||||||
orf73-1 NQSGRKEGFSRDDDIIEGEYTVEEPYGGNRSRNAIEHKKDEX
130 140 150 160
Homology with a predicted ORF from N. gonorrhoeae
ORF73 shows 92.1% identity over a 76 aa overlap with a predicted ORF (ORF73.ng) from N. gonorrhoeae:
orf73.pep MRFFGIGFLVLLFLEIMSIVWVADWLGGGWTLFLMAAGFAAGVLMLRQTGLTGLLLAGAA 60
||||||||||||||||||||||||||||||||||||| |||||||||:|||:||||||||
orf73ng MRFFGIGFLVLLFLEIMSIVWVADWLGGGWTLFLMAATFAAGVLMLRHTGLSGLLLAGAA 60
orf73.pep MRSGGKVSVYQMLWPI 76
::|:||||||||||||
orf73ng VKSSGKVSVYQMLWPIRYTVAAVCLMSPGFVSSVLAVLLLLPFKGGAVLQAGGAENFFNM 120
The complete length ORF73ng nucleotide sequence <SEQ ID 283> is:
1 ATGAGATTTT TCGGTATCGG TTTTTTGGTG CTGCTGTTTT TGGAAATTAT
51 GTCGATTGTG TGGGTTGCCG ATTGGCTGGG CGGCGGTTGG AcgcTGTTTC
101 TAATGGCGGC AACCTTTGCC GCCGGTGTGC TGATGCTCAG GCATAcggGG
151 CTGTCCGGTC TTTTATTGGC TGGCGCGGCG GTAAAAagta gtgGGAAGGT
201 ATCTGTTTAT CagatgtTGT GGCCTATCCG TTATAcggtg gcggcggtgT
251 GTCTGatgag tCcggGATTC GTATCCTccg tgttggCGGT ATTGCTGCTG
301 CTGCcgttta aggGaggGgc agtgttgcag gcaggaggtg cggaaaATTT
351 TTTCAACATg aaCcaatcgg gcagaaAaga gggatttttc cacgatgacg
401 atattatcga gggagaatat acggttgaaa aacctgacgg cggcaatcgt
451 tcccgaAAcg ccatcgaaca cgaaaAagac gaataA
This encodes a protein having the amino acid sequence <SEQ ID 284>:
1 MRFFGIGFLV LLFLEIMSIV WVADWLGGGW TLFLMAATFA AGVLMLRHTG
51 LSGLLLAGAA VKSSGKVSVY QMLWPIRYTV AAVCLMSPGF VSSVLAVLLL
101 LPFKGGAVLQ AGGAENFFNM NQSGRKEGFF HDDDIIEGEY TVEKPDGGNR
151 SRNAIEHEKD E*
ORF73ng and ORF73-1 show 93.8% identity in a 161 aa overlap:
10 20 30 40 50 60
orf73-1.pep MRFFGIGFLVLLFLEIMSIVWVADWLGGGWTLFLMAAGFAAGVLMLRHTGLSGLLLAGAA
||||||||||||||||||||||||||||||||||||| ||||||||||||||||||||||
orf73ng MRFFGIGFLVLLFLEIMSIVWVADWLGGGWTLFLMAATFAAGVLMLRHTGLSGLLLAGAA
10 20 30 40 50 60
70 80 90 100 110 120
orf73-1.pep MRSGGRVSVYQMLWPIRYTVAAVCLMSPGFVSSVLAVLLLLPFKGGAVLQAGGAENFFNM
::|:|:||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf73ng VKSSGKVSVYQMLWPIRYTVAAVCLMSPGFVSSVLAVLLLLPFKGGAVLQAGGAENFFNM
70 80 90 100 110 120
130 140 150 160
orf73-1.pep NQSGRKEGFSRDDDIIEGEYTVEEPYGGNRSRNAIEHKKDEX
||||||||| :||||||||||||:| |||||||||||:||||
orf73ng NQSGRKEGFFHDDDIIEGEYTVEKPDGGNRSRNAIEHEKDEX
130 140 150 160
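Based on this analysis, including the presence of a putative leader sequence and putative transmembrane domains in the gonococcal protein, it is predicted that this protein from N. meningitidis and N. gonorrhoeae, and its epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies.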
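This passage does not say which program produced the transmembrane-domain predictions. One widely used heuristic is a sliding-window average of Kyte-Doolittle hydropathy values, where windows with a high average are candidate membrane-spanning stretches. The Python sketch below is offered under that assumption only: the function names, the 19-residue window and the 1.6 threshold are conventional illustrative choices, not values taken from this document.

```python
# Kyte-Doolittle hydropathy index for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(protein: str, window: int = 19) -> list[float]:
    """Mean hydropathy of every full window along the sequence.

    Unknown residues (for example 'X') contribute 0.0; a sequence shorter
    than the window yields an empty profile.
    """
    scores = [KYTE_DOOLITTLE.get(aa, 0.0) for aa in protein.upper()]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]


def hydrophobic_windows(protein: str, window: int = 19,
                        threshold: float = 1.6) -> list[int]:
    """1-based start positions of windows whose mean hydropathy exceeds threshold."""
    profile = hydropathy_profile(protein, window)
    return [i + 1 for i, value in enumerate(profile) if value > threshold]


# Residues 1-50 of ORF73-1, copied from the listing above.
ORF73_1_N_TERMINUS = "MRFFGIGFLVLLFLEIMSIVWVADWLGGGWTLFLMAAGFAAGVLMLRHTG"
candidate_starts = hydrophobic_windows(ORF73_1_N_TERMINUS)
print(candidate_starts)
```

A hydropathy scan of this kind only flags hydrophobic stretches; it does not distinguish a cleavable leader sequence from a membrane-spanning helix, so dedicated signal-peptide and topology predictors would be needed to support the conclusions drawn in the text.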
Example 34
The following partial DNA sequence was identified in N. meningitidis <SEQ ID 285>:
1 ATGTTTGTTT TTCAGACGGC ATTCTT.ATG TTTCAGAAAC ATTTGCAGAA
51 AGCCTCCGAC AGCGTCGTCG GAGGGACATT ATACGTGGTT GCCACGCCCA
101 TCGGCAATTT GGCGGACATT ACCCTGCGCG CTTTGGCGGT ATTGCAAAAG
151 GCG....... .....GCCGA AGACACGCGC GTTACCGCAC AGCTTTTGAG
201 CGCGTACGGC ATTCAGGGCA AACTCGTCAG TGTGCGCGAA CACAACGAAC
251 GGCAGATGGC GGACAAGATT GTCGGCTATC TTTCAGACGG CATGGTTGTG
301 GCACAGGTTT CCGATGCGGG TACGCCGGCC GTGTGCGACC CGGGCGCGAA
351 ACTCGCCCGC CGCGTGCGTG AGGCCGGGTT TAAAGTCGTT CCCGTCGTGG
401 GCGCAAC.GC GGTGATGGCG GCTTTGAGCG TGGCCGGTGT GGAAGGATCC
451 GATTTTTATT TCAACGGTTT TGTACCGCCG AAATCGGGAG AACGCAGGAA
501 ACTGTTTGCC AAATGGGTGC GGGCGGCGTT TCCTATCGTC ATGTTTGAAA
551 CGCCGCACCG CATCGGTGCA GCGCTTGCCG ATATGGCGGA ACTGTTCCCC
601 GAACGCCGAT TAATGCTGGC GCGCGAAATT ACGAAAACGT TTGAAACGTT
651 CTTAAGCGGC ACGGTTGGGG AAATTCAGAC GGCATTGTCT GCCGACGGCG
701 ACCAATCGCG CGGCGAGATG GTGTTGGTGC TTTATCCGGC GCAGGATGAA
751 AAACACGAAG GCTTGTCCGA GTCCGCGCAA AACATCATGA AAATCCTCAC
801 AGCCGAGCTG CCGACCAAAC AGGCGGCGGA GCTTGCTGCC AAAATCACGG
851 GCGAGGGAAA GAAAGCTTTG TACGAT..
This corresponds to the amino acid sequence <SEQ ID 286; ORF75>:
1 MFVFQTAFXM FQKHLQKASD SVVGGTLYVV ATPIGNLADI TLRALAVLQK
51 A....AEDTR VTAQLLSAYG IQGKLVSVRE HNERQMADKI VGYLSDGMVV
101 AQVSDAGTPA VCDPGAKLAR RVREAGFKVV PVVGAXAVMA ALSVAGVEGS
151 DFYFNGFVPP KSGERRKLFA KWVRAAFPIV MFETPHRIGA ALADMAELFP
201 ERRLMLAREI TKTFETFLSG TVGEIQTALS ADGDQSRGEM VLVLYPAQDE
251 KHEGLSESAQ NIMKILTAEL PTKQAAELAA KITGEGKKAL YD..
Further work revealed the complete nucleotide sequence <SEQ ID 287>:
1 ATGTTTCAGA AACATTTGCA GAAAGCCTCC GACAGCGTCG TCGGAGGGAC
51 ATTATACGTG GTTGCCACGC CCATCGGCAA TTTGGCGGAC ATTACCCTGC
101 GCGCTTTGGC GGTATTGCAA AAGGCGGACA TCATCTGTGC CGAAGACACG
151 CGCGTTACCG CACAGCTTTT GAGCGCGTAC GGCATTCAGG GCAAACTCGT
201 CAGTGTGCGC GAACACAACG AACGGCAGAT GGCGGACAAG ATTGTCGGCT
251 ATCTTTCAGA CGGCATGGTT GTGGCACAGG TTTCCGATGC GGGTACGCCG
301 GCCGTGTGCG ACCCGGGCGC GAAACTCGCC CGCCGCGTGC GTGAGGCCGG
351 GTTTAAAGTC GTTCCCGTCG TGGGCGCAAG CGCGGTGATG GCGGCTTTGA
401 GCGTGGCCGG TGTGGAAGGA TCCGATTTTT ATTTCAACGG TTTTGTACCG
451 CCGAAATCGG GAGAACGCAG GAAACTGTTT GCCAAATGGG TGCGGGCGGC
501 GTTTCCTATC GTCATGTTTG AAACGCCGCA CCGCATCGGT GCGACGCTTG
551 CCGATATGGC GGAACTGTTC CCCGAACGCC GATTAATGCT GGCGCGCGAA
601 ATTACGAAAA CGTTTGAAAC GTTCTTAAGC GGCACGGTTG GGGAAATTCA
651 GACGGCATTG TCTGCCGACG GCAACCAATC GCGCGGCGAG ATGGTGTTGG
701 TGCTTTATCC GGCGCAGGAT GAAAAACACG AAGGCTTGTC CGAGTCCGCG
751 CAAAACATCA TGAAAATCCT CACAGCCGAG CTGCCGACCA AACAGGCGGC
801 GGAGCTTGCT GCCAAAATCA CGGGCGAGGG AAAGAAAGCT TTGTACGATC
851 TGGCTCTGTC TTGGAAAAAC AAATAG
This corresponds to the amino acid sequence <SEQ ID 288; ORF75-1>:
1 MFQKHLQKAS DSVVGGTLYV VATPIGNLAD ITLRALAVLQ KADIICAEDT
51 RVTAQLLSAY GIQGKLVSVR EHNERQMADK IVGYLSDGMV VAQVSDAGTP
101 AVCDPGAKLA RRVREAGFKV VPVVGASAVM AALSVAGVEG SDFYFNGFVP
151 PKSGERRKLF AKWVRAAFPI VMFETPHRIG ATLADMAELF PERRLMLARE
201 ITKTFETFLS GTVGEIQTAL SADGNQSRGE MVLVLYPAQD EKHEGLSESA
251 QNIMKILTAE LPTKQAAELA AKITGEGKKA LYDLALSWKN K*
Computer analysis of this amino acid sequence gave the following results:
Homology with a predicted ORF from N. meningitidis (strain A)
ORF75 shows 95.8% identity over a 283 aa overlap with an ORF (ORF75a) from strain A of N. meningitidis:
10 20 30 40 50 60
orf75.pep MFVFQTAFXMFQKHLQKASDSVVGGTLYVVATPIGNLADITLRALAVLQKAXXXXAEDTR
|||||||||||||||||||||||||||||||||||||||||| |||||
orf75a MFQKHLQKASDSVVGGTLYVVATPIGNLADITLRALAVLQKADIICAEDTR
10 20 30 40 50
70 80 90 100 110 120
orf75.pep VTAQLLSAYGIQGKLVSVREHNERQMADKIVGYLSDGMVVAQVSDAGTPAVCDPGAKLAR
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf75a VTAQLLSAYGIQGKLVSVREHNERQMADKIVGYLSDGMVVAQVSDAGTPAVCDPGAKLAR
60 70 80 90 100 110
130 140 150 160 170 180
orf75.pep RVREAGFKVVPVVGAXAVMAALSVAGVEGSDFYFNGFVPPKSGERRKLFAKWVRAAFPIV
||||:|||||||||| ||||||||||| ||||||||||||||||||||||||||:|||:|
orf75a RVREVGFKVVPVVGASAVMAALSVAGVAGSDFYFNGFVPPKSGERRKLFAKWVRVAFPVV
120 130 140 150 160 170
190 200 210 220 230 240
orf75.pep MFETPHRIGAALADMAELFPERRLMLAREITKTFETFLSGTVGEIQTALSADGDQSRGEM
|||||||||||:||||||||||||||||||||||||||||||||||||||:|||:|||||
orf75a MFETPHRIGATLADMAELFPERRLMLAREITKTFETFLSGTVGEIQTALAADGNQSRGEM
180 190 200 210 220 230
250 260 270 280 290
orf75.pep VLVLYPAQDEKHEGLSESAQNIMKILTAELPTKQAAELAAKITGEGKKALYD
||||||||||||||||||||||||||||||||||||||||||||||||||||
orf75a VLVLYPAQDEKHEGLSESAQNIMKILTAELPTKQAAELAAKITGEGKKALYDLALSWKNK
240 250 260 270 280 290
orf75a X
The complete length ORF75a nucleotide sequence <SEQ ID 289> is:
1 ATGTTTCAGA AACATTTGCA GAAAGCCTCC GACAGCGTCG TCGGAGGGAC
51 ATTATACGTG GTTGCCACGC CCATCGGCAA TTTGGCGGAC ATTACCCTGC
101 GCGCTTTGGC GGTATTGCAA AAGGCGGACA TCATCTGTGC CGAAGACACG
151 CGCGTTACCG CGCAGCTTTT GAGCGCGTAC GGCATTCAGG GCAAACTCGT
201 CAGCGTGCGC GAACACAACG AACGGCAGAT GGCGGACAAG ATTGTCGGCT
251 ATCTTTCAGA CGGCATGGTT GTGGCACAGG TTTCCGATGC GGGTACGCCG
301 GCCGTGTGCG ACCCGGGCGC GAAACTCGCC CGCCGCGTGC GTGAGGTCGG
351 GTTTAAAGTT GTCCCTGTTG TCGGCGCAAG CGCGGTGATG GCGGCTTTGA
401 GTGTGGCTGG TGTGGCGGGA TCCGATTTTT ATTTCAACGG TTTTGTACCG
451 CCGAAATCGG GCGAACGTAG GAAATTGTTT GCCAAATGGG TGCGGGTGGC
501 GTTTCCCGTC GTGATGTTTG AAACGCCGCA CCGCATCGGG GCGACGCTTG
551 CCGATATGGC GGAACTGTTC CCCGAACGCC GATTAATGCT GGCGCGCGAA
601 ATCACGAAAA CGTTTGAAAC GTTCTTAAGC GGCACGGTTG GGGAAATTCA
651 GACGGCATTG GCGGCGGACG GCAACCAATC GCGCGGCGAG ATGGTGTTGG
701 TGCTTTATCC GGCGCAGGAT GAAAAACACG AAGGCTTGTC CGAGTCCGCG
751 CAAAACATCA TGAAAATCCT CACAGCCGAG CTGCCGACCA AACAGGCGGC
801 GGAGCTTGCC GCCAAAATCA CGGGCGAGGG AAAAAAAGCT TTGTACGATC
851 TGGCACTGTC TTGGAAAAAC AAATGA
This encodes a protein having the amino acid sequence <SEQ ID 290>:
1 MFQKHLQKAS DSVVGGTLYV VATPIGNLAD ITLRALAVLQ KADIICAEDT
51 RVTAQLLSAY GIQGKLVSVR EHNERQMADK IVGYLSDGMV VAQVSDAGTP
101 AVCDPGAKLA RRVREVGFKV VPVVGASAVM AALSVAGVAG SDFYFNGFVP
151 PKSGERRKLF AKWVRVAFPV VMFETPHRIG ATLADMAELF PERRLMLARE
201 ITKTFETFLS GTVGEIQTAL AADGNQSRGE MVLVLYPAQD EKHEGLSESA
251 QNIMKILTAE LPTKQAAELA AKITGEGKKA LYDLALSWKN K*
ORF75a and ORF75-1 show 98.3% identity in a 291 aa overlap:
10 20 30 40 50 60
orf75a.pep MFQKHLQKASDSVVGGTLYVVATPIGNLADITLRALAVLQKADIICAEDTRVTAQLLSAY
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf75-1 MFQKHLQKASDSVVGGTLYVVATPIGNLADITLRALAVLQKADIICAEDTRVTAQLLSAY
10 20 30 40 50 60
70 80 90 100 110 120
orf75a.pep GIQGKLVSVREHNERQMADKIVGYLSDGMVVAQVSDAGTPAVCDPGAKLARRVREVGFKV
|||||||||||||||||||||||||||||||||||||||||||||||||||||||:||||
orf75-1 GIQGKLVSVREHNERQMADKIVGYLSDGMVVAQVSDAGTPAVCDPGAKLARRVREAGFKV
70 80 90 100 110 120
130 140 150 160 170 180
orf75a.pep VPVVGASAVMAALSVAGVAGSDFYFNGFVPPKSGERRKLFAKWVRVAFPVVMFETPHRIG
|||||||||||||||||| ||||||||||||||||||||||||||:|||:||||||||||
orf75-1 VPVVGASAVMAALSVAGVEGSDFYFNGFVPPKSGERRKLFAKWVRAAFPIVMFETPHRIG
130 140 150 160 170 180
190 200 210 220 230 240
orf75a.pep ATLADMAELFPERRLMLAREITKTFETFLSGTVGEIQTALAADGNQSRGEMVLVLYPAQD
||||||||||||||||||||||||||||||||||||||||:|||||||||||||||||||
orf75-1 ATLADMAELFPERRLMLAREITKTFETFLSGTVGEIQTALSADGNQSRGEMVLVLYPAQD
190 200 210 220 230 240
250 260 270 280 290
orf75a.pep EKHEGLSESAQNIMKILTAELPTKQAAELAAKITGEGKKALYDLALSWKNKX
||||||||||||||||||||||||||||||||||||||||||||||||||||
orf75-1 EKHEGLSESAQNIMKILTAELPTKQAAELAAKITGEGKKALYDLALSWKNKX
250 260 270 280 290
Homology with a predicted ORF from N. gonorrhoeae
ORF75 shows 93.2% identity over a 292 aa overlap with a predicted ORF (ORF75.ng) from N. gonorrhoeae:
orf75.pep MFVFQTAFXMFQKHLQKASDSVVGGTLYVVATPIGNLADITLRALAVLQKA----AEDTR 56
| |||||| |||||||||||||||||||||||||||||||||||||||||| |||||
orf75ng MSVFQTAFFMFQKHLQKASDSVVGGTLYVVATPIGNLADITLRALAVLQKADIICAEDTR 60
orf75.pep VTAQLLSAYGIQGKLVSVREHNERQMADKIVGYLSDGMVVAQVSDAGTPAVCDPGAKLAR 116
|||||||||||||:|||||||||||||||::|:||||:||||||||||||||||||||||
orf75ng VTAQLLSAYGIQGRLVSVREHNERQMADKVIGFLSDGLVVAQVSDAGTPAVCDPGAKLAR 120
orf75.pep RVREAGFKVVPVVGAXAVMAALSVAGVEGSDFYFNGFVPPKSGERRKLFAKWVRAAFPIV 176
||||||||||||||| ||||||||||| |||||||||||||||||||||||||||||:|
orf75ng RVREAGFKVVPVVGASAVMAALSVAGVAESDFYFNGFVPPKSGERRKLFAKWVRAAFPVV 180
orf75.pep MFETPHRIGAALADMAELFPERRLMLAREITKTFETFLSGTVGEIQTALSADGDQSRGEM 236
||||||||||:||||||||||||||||||||||||||||||||||||||:|||:||||||
orf75ng MFETPHRIGATLADMAELFPERRLMLAREITKTFETFLSGTVGEIQTALAADGNQSRGEM 240
orf75.pep VLVLYPAQDEKHEGLSESAQNIMKILTAELPTKQAAELAAKITGEGKKALYD 288
||||||||||||||||||||| ||||:|||||||||||||||||||||||||
orf75ng VLVLYPAQDEKHEGLSESAQNAMKILAAELPTKQAAELAAKITGEGKKALYDLALSWKNK 300
The ORF75ng nucleotide sequence <SEQ ID 291> is predicted to encode a protein having the amino acid sequence <SEQ ID 292>:
1 MSVFQTAFFM FQKHLQKASD SVVGGTLYVV ATPIGNLADI TLRALAVLQK
51 ADIICAEDTR VTAQLLSAYG IQGRLVSVRE HNERQMADKV IGFLSDGLVV
101 AQVSDAGTPA VCDPGAKLAR RVREAGFKVV PVVGASAVMA ALSVAGVAES
151 DFYFNGFVPP KSGERRKLFA KWVRAAFPVV MFETPHRIGA TLADMAELFP
201 ERRLMLAREI TKTFETFLSG TVGEIQTALA ADGNQSRGEM VLVLYPAQDE
251 KHEGLSESAQ NAMKILAAEL PTKQAAELAA KITGEGKKAL YDLALSWKNK
301 *
After further analysis, the following gonococcal DNA sequence <SEQ ID 293> was identified:
1 ATGTTTCAGA AACACTTGCA GAAAGCCTCC GACAGCGTCG TCGGAGGGAC
51 ATTATACGTG GTTGCCACGC CCATCGGCAA TTTGGCAGAC ATTACCCTGC
101 GCGCTTTGGC GGTATTGCAA AAGGCGGACA TCATTTGTGC CGAAGACACG
151 CGCGTTACTG CGCAGCTTTT GAGCGCGTAC GGCATTCAGG GCAGGTTGGT
201 CAGTGTGCGC GAACACAACG AGCGGCAGAT GGCGGACAAG GTAATCGGTT
251 TCCTTTCAGA CGGCCTGGTT GTGGCGCAGG TTTCCGATGC GGGTACGCCG
301 GCCGTGTGCG ACCCGGGCGC GAAACTCGCC CGCCGCGTGC GCGAAGCAGG
351 GTTCAAAGTC GTTCCCGTCG TGGGCGCAAG CGCGGTAATG GCGGCGTTGA
401 GTGTGGCCGG TGTGGCGGAA TCCGATTTTT ATTTCAACGG TTTTGTACCG
451 CCGAAATCGG GCGAACGTAG GAAATTGTTT GCCAAATGGG TGCGGGCGGC
501 ATTTCCTGTC GTCATGTTTG AAACGCCGCA CCGAATCGGG GCAACGCTTG
551 CCGATATGGC GGAATTGTTC CCCGAACGCC GTCTGATGCT GGCGCGCGAA
601 ATCACGAAAA CGTTTGAAAC GTTCTTAAGC GGCACGGTTG GGGAAATTCA
651 GACGGCATTG GCGGCGGACG GCAACCAATC GCGCGGCGAG ATGGTGTTGG
701 TGCTTTATCC GGCGCAGGAT GAAAAACACG AAGGCTTGTC CGAGTCTGCG
751 CAAAATGCGA TGAAAATCCT TGCGGCCGAG CTGCCGACCA AGCAGGCGGC
801 GGAGCTTGCC GCCAAGATTA CAGGTGAGGG CAAAAAGGCT TTGTACGATT
851 TGGCACTGTC GTGGAAAAAC AAATGA
This corresponds to the amino acid sequence <SEQ ID 294; ORF75ng-1>:
1 MFQKHLQKAS DSVVGGTLYV VATPIGNLAD ITLRALAVLQ KADIICAEDT
51 RVTAQLLSAY GIQGRLVSVR EHNERQMADK VIGFLSDGLV VAQVSDAGTP
101 AVCDPGAKLA RRVREAGFKV VPVVGASAVM AALSVAGVAE SDFYFNGFVP
151 PKSGERRKLF AKWVRAAFPV VMFETPHRIG ATLADMAELF PERRLMLARE
201 ITKTFETFLS GTVGEIQTAL AADGNQSRGE MVLVLYPAQD EKHEGLSESA
251 QNAMKILAAE LPTKQAAELA AKITGEGKKA LYDLALSWKN K*
ORF75ng-1 and ORF75-1 show 96.2% identity in a 291 aa overlap:
10 20 30 40 50 60
orf75-1.pep MFQKHLQKASDSVVGGTLYVVATPIGNLADITLRALAVLQKADIICAEDTRVTAQLLSAY
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf75ng-1 MFQKHLQKASDSVVGGTLYVVATPIGNLADITLRALAVLQKADIICAEDTRVTAQLLSAY
10 20 30 40 50 60
70 80 90 100 110 120
orf75-1.pep GIQGKLVSVREHNERQMADKIVGYLSDGMVVAQVSDAGTPAVCDPGAKLARRVREAGFKV
||||:|||||||||||||||::|:||||:|||||||||||||||||||||||||||||||
orf75ng-1 GIQGRLVSVREHNERQMADKVIGFLSDGLVVAQVSDAGTPAVCDPGAKLARRVREAGFKV
70 80 90 100 110 120
130 140 150 160 170 180
orf75-1.pep VPVVGASAVMAALSVAGVEGSDFYFNGFVPPKSGERRKLFAKWVRAAFPIVMFETPHRIG
|||||||||||||||||| |||||||||||||||||||||||||||||:||||||||||
orf75ng-1 VPVVGASAVMAALSVAGVAESDFYFNGFVPPKSGERRKLFAKWVRAAFPVVMFETPHRIG
130 140 150 160 170 180
190 200 210 220 230 240
orf75-1.pep ATLADMAELFPERRLMLAREITKTFETFLSGTVGEIQTALSADGNQSRGEMVLVLYPAQD
||||||||||||||||||||||||||||||||||||||||:|||||||||||||||||||
orf75ng-1 ATLADMAELFPERRLMLAREITKTFETFLSGTVGEIQTALAADGNQSRGEMVLVLYPAQD
190 200 210 220 230 240
250 260 270 280 290
orf75-1.pep EKHEGLSESAQNIMKILTAELPTKQAAELAAKITGEGKKALYDLALSWKNKX
|||||||||||| ||||:||||||||||||||||||||||||||||||||||
orf75ng-1 EKHEGLSESAQNAMKILAAELPTKQAAELAAKITGEGKKALYDLALSWKNKX
250 260 270 280 290
In addition, ORF75ng-1 shows significant homology with a hypothetical E. coli protein:
sp|P45528|YRAL_ECOLI HYPOTHETICAL 31.3 KD PROTEIN IN AGAI-MTR INTERGENIC REGION (F286)
>gi|606086 (U18997) ORF_f286 [Escherichia coli]
>gi|1789535 (AE000395) hypothetical 31.3 kD protein in agai-mtr intergenic region [Escherichia coli] Length = 286
Score = 218 bits (550), Expect = 3e-56
Identities = 128/284 (45%), Positives = 171/284 (60%), Gaps = 4/284 (1%)
Query: 4 KHLQKASDSVVGGTLYVVATPIGNLADITLRALAVLQKADIICAEDTRVTAQLLSAYGIQ 63
K Q A+S G LY+V TPIGNLADIT RAL VLQ D+I AEDTR T LL +GI
Sbjct: 2 KQHQSADNSQ--GQLYIVPTPIGNLADITQRALEVLQAVDLIAAEDTRHTGLLLQHFGIN 59
Query: 64 GRLVSVREHNERQMADKVIGFLSDGLVVAQVSDAGTPAVCDPGAKLARRVREAGFKVVPV 123
RL ++ +HNE+Q A+ ++ L +G +A VSDAGTP + DPG L R REAG +VVP+
Sbjct: 60 ARLFALHDHNEQQKAETLLAKLQEGQNIALVSDAGTPLINDPGYHLVRTCREAGIRVVPL 119
Query: 124 VGASAVMAALSVAGVAESDFYFNGFVPPKSGERRKLFAKWVRAAFPVVMFETPHRIGATL 183
G A + ALS AG+ F + GF+P KS RR ++ +E+ HR+ +L
Sbjct: 120 PGPCAAITALSAAGLPSDRFCYEGFLPAKSKGRRDALKAIEAEPRTLIFYESTHRLLDSL 179
Query: 184 ADMAELFPERR-LMLAREITKTFETFLSGTVGEIQTALAADGNQSRGEMVLVLYPAQDEK 242
D+ + E R ++LARE+TKT+ET VGE+ + D N+ +GEMVL++ +
Sbjct: 180 EDIVAVLGESRYVVLARELTKTWETIHGAPVGELLAWVKEDENRRKGEMVLIV-EGHKAQ 238
Query: 243 HEGLSESAQNAMKILAAELPTKQAAELAAKITGEGKKALYDLAL 286
E L A + +L AELP K+AA LAA+I G K ALY AL
Sbjct: 239 EEDLPADALRTLALLQAELPLKKAAALAAEIHGVKKNALYKYAL 282
Based on this analysis, including the presence of a putative transmembrane domain in the gonococcal protein, it is predicted that this protein from N. meningitidis and N. gonorrhoeae, and its epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies.
Example 35
The following partial DNA sequence was identified in N. meningitidis <SEQ ID 295>:
1 ATGAAACAGA AAAAAACCGC TGCCGCAGTT ATTGCTGCAA TGTTGGCAGG
51 TTTTGCGGCA GC.AAAGCAC CCGAAATCGA CCCGGCTTTG ..........
//
651 .......... ...GAGTTGG TCAGAAACCA GTTGGAGCAG GGTTTGAGAC
701 AGGAAAAAGC CCGCTTGAAA ATCGATGCCC TTTTGGAAGA AAACGGTGTC
751 AAACCGTAA
This corresponds to the amino acid sequence <SEQ ID 296; ORF76>:
1 MKQKKTAAAV IAAMLAGFAA XKAPEIDPAL .......... ..........
//
201 .......... .......... ELVRNQLEQG LRQEKARLKI DALLEENGVK
251 P*
Further work revealed the complete nucleotide sequence <SEQ ID 297>:
1 ATGAAACAGA AAAAAACCGC TGCCGCAGTT ATTGCTGCAA TGTTGGCAGG
51 TTTTGCGGCA GCCAAAGCAC CCGAAATCGA CCCGGCTTTG GTGGATACGC
101 TGGTGGCGCA GATCATGCAG CAGGCAGACC GGCATGCGGA GCAGTCCCAA
151 AAACCGGACG GGCAGGCAAT CCGAAACGAT GCCGTCCGCC GGCTACAAAC
201 TTTGGAAGTT TTGAAAAACA GGGCATTGAA GGAAGGTTTG GATAAGGATA
251 AGGATGTCCA AAACCGCTTT AAAATCGCCG AAGCGTCTTT TTATGCCGAG
301 GAGTACGTCC GTTTTCTGGA ACGTTCGGAA ACGGTTTCCG AAGACGAGCT
351 GCACAAGTTT TACGAACAGC AAATCCGCAT GATCAAATTG CAGCAGGTCA
401 GCTTCGCAAC CGAAGAGGAG GCGCGTCAGG CGCAGCAGCT CCTGCTCAAA
451 GGGCTGTCTT TTGAAGGGCT GATGAAGCGT TATCCGAACG ACGAGCAGGC
501 TTTTGACGGT TTCATTATGG CGCAGCAGCT TCCCGAGCCG CTGGCTTCGC
551 AGTTTGCCGC GATGAATCGG GGCGACGTTA CCCGCGATCC GGTCAAATTG
601 GGCGAACGCT ATTATCTGTT CAAACTCAGC GAGGTCGGGA AAAACCCCGA
651 CGCGCAGCCT TTCGAGTTGG TCAGAAACCA GTTGGAGCAG GGTTTGAGAC
701 AGGAAAAAGC CCGCTTGAAA ATCGATGCCC TTTTGGAAGA AAACGGTGTC
751 AAACCGTAA
This corresponds to the amino acid sequence <SEQ ID 298; ORF76-1>:
1 MKQKKTAAAV IAAMLAGFAA AKAPEIDPAL VDTLVAQIMQ QADRHAEQSQ
51 KPDGQAIRND AVRRLQTLEV LKNRALKEGL DKDKDVQNRF KIAEASFYAE
101 EYVRFLERSE TVSEDELHKF YEQQIRMIKL QQVSFATEEE ARQAQQLLLK
151 GLSFEGLMKR YPNDEQAFDG FIMAQQLPEP LASQFAAMNR GDVTRDPVKL
201 GERYYLFKLS EVGKNPDAQP FELVRNQLEQ GLRQEKARLK IDALLEENGV
251 KP*
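The relationship between a nucleotide sequence and the amino acid sequence it encodes can be checked with a few lines of code. The following is a minimal sketch, assuming the standard genetic code; the names `CODON_TABLE` and `translate` are hypothetical helpers introduced here for illustration and are not part of the original disclosure. Only the first 30 and the last 9 nucleotides of <SEQ ID 297> are used.

```python
# Illustrative sketch only: helper names are assumptions, not from the source.
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(dna: str) -> str:
    """Translate a DNA string in frame 1; spaces between 10-mers are ignored."""
    dna = dna.replace(" ", "").upper()
    usable = len(dna) - len(dna) % 3  # drop any trailing partial codon
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# First 30 nucleotides and last 9 nucleotides of <SEQ ID 297>.
n_terminus = translate("ATGAAACAGA AAAAAACCGC TGCCGCAGTT")
c_terminus = translate("AAACCGTAA")
print(n_terminus, c_terminus)
```

The two fragments translate to the first ten residues and the final `KP*` of <SEQ ID 298>, where `*` marks the stop codon.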
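Computer analysis of this amino acid sequence gave the following results:
Homology with a predicted ORF from N. meningitidis (strain A)
ORF76 shows 96.7% identity over a 30 aa overlap and 96.8% identity over a 31 aa overlap with an ORF (ORF76a) from strain A of N. meningitidis: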
10 20 30
orf76.pep MKQKKTAAAVIAAMLAGFAAXKAPEIDPAL
|||||||||||||||||||| || |||||||
orf76a MKQKKTAAAVIAAMLAGFAAAKAPEIDPALVDTLVAQIMQQADRHAEQSQKPDGQAIRND
10 20 30 40 50 60
//
70 80 90
orf76.pep XELVRNQLEQGLRQEKARLKIDALLEENGVKPX
|||||||||||||||||||||||:|||||||||
orf76a DVTRDPVKLGERYYLFKLSEVGKNPDAQPFELVRNQLEQGLRQEKARLKIDAILEENGVKPX
200 210 220 230 240 250
The complete length ORF76a nucleotide sequence <SEQ ID 299> is:
1 ATGAAACAGA AAAAAACCGC TGCCGCAGTT ATTGCTGCAA TGTTGGCAGG
51 TTTTGCGGCA GCCAAAGCAC CCGAAATCGA CCCGGCTTTG GTGGATACGC
101 TGGTGGCGCA GATCATGCAG CAGGCAGACC GGCATGCGGA GCAGTCCCAA
151 AAACCGGACG GGCAGGCAAT CCGAAACGAT GCCGTCCGTC GGCTGCAAAC
201 TTTGGAAGTT TTGAAAAACA GGGCATTGAA GGAAGGTTTG GATAAGGATA
251 AGGATGTCCA AAACCGCTTT AAAATCGCCG AAGCGTCTTT TTATGCCGAG
301 GAGTACGTCC GTTTTCTGGA ACGTTCGGAA ACGGTTTCCG AAAGCGCACT
351 GCGTCAGTTT TATGAGCGGC AAATCCGCAT GATCAAATTG CAGCAGGTCA
401 GCTTCGCAAC CGAAGAGGAG GCGCGTCAGG CGCAGCAGCT CCTGCTCAAA
451 GGGCTGTCTT TTGAAGGGCT GATGAAGCGT TATCCGAACG ACGAGCAGGC
501 TTTTGACGGT TTCATTATGG CGCAGCAGCT TCCCGAGCCG CTGGCTTCGC
551 AGTTTGCAGC GATGAATCGG GGCGACGTTA CCCGCGATCC GGTCAAATTG
601 GGCGAACGCT ATTATCTGTT CAAACTCAGC GAGGTCGGGA AAAACCCCGA
651 CGCGCAGCCT TTCGAGTTGG TCAGAAACCA GTTGGAACAA GGTTTGAGAC
701 AGGAAAAAGC CCGCTTGAAA ATCGATGCCA TTTTGGAAGA AAACGGTGTC
751 AAACCGTAA
This encodes a protein having amino acid sequence <SEQ ID 300>:
1 MKQKKTAAAV IAAMLAGFAA AKAPEIDPAL VDTLVAQIMQ QADRHAEQSQ
51 KPDGQAIRND AVRRLQTLEV LKNRALKEGL DKDKDVQNRF KIAEASFYAE
101 EYVRFLERSE TVSESALRQF YERQIRMIKL QQVSFATEEE ARQAQQLLLK
151 GLSFEGLMKR YPNDEQAFDG FIMAQQLPEP LASQFAAMNR GDVTRDPVKL
201 GERYYLFKLS EVGKNPDAQP FELVRNQLEQ GLRQEKARLK IDAILEENGV
251 KP*
ORF76a and ORF76-1 show 97.6% identity in 252 aa overlap:
10 20 30 40 50 60
orf76a.pep MKQKKTAAAVIAAMLAGFAAAKAPEIDPALVDTLVAQIMQQADRHAEQSQKPDGQAIRND
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf76-1 MKQKKTAAAVIAAMLAGFAAAKAPEIDPALVDTLVAQIMQQADRHAEQSQKPDGQAIRND
10 20 30 40 50 60
70 80 90 100 110 120
orf76a.pep AVRRLQTLEVLKNRALKEGLDKDKDVQNRFKIAEASFYAEEYVRFLERSETVSESALRQF
||||||||||||||||||||||||||||||||||||||||||||||||||||||: |::|
orf76-1 AVRRLQTLEVLKNRALKEGLDKDKDVQNRFKIAEASFYAEEYVRFLERSETVSEDELHKF
70 80 90 100 110 120
130 140 150 160 170 180
orf76a.pep YERQIRMIKLQQVSFATEEEARQAQQLLLKGLSFEGLMKRYPNDEQAFDGFIMAQQLPEP
||:|||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf76-1 YEQQIRMIKLQQVSFATEEEARQAQQLLLKGLSFEGLMKRYPNDEQAFDGFIMAQQLPEP
130 140 150 160 170 180
190 200 210 220 230 240
orf76a.pep LASQFAAMNRGDVTRDPVKLGERYYLFKLSEVGKNPDAQPFELVRNQLEQGLRQEKARLK
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf76-1 LASQFAAMNRGDVTRDPVKLGERYYLFKLSEVGKNPDAQPFELVRNQLEQGLRQEKARLK
190 200 210 220 230 240
250
orf76a.pep IDAILEENGVKPX
|||:|||||||||
orf76-1 IDALLEENGVKPX
250
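The identity figure quoted for this alignment can be reproduced by counting matching columns. The sketch below assumes an ungapped alignment of equal-length strings; `percent_identity` is a hypothetical helper introduced for illustration. It is applied to residues 61-120 of the two sequences, the row that contains four of the six substitutions visible in the alignment.

```python
# Illustrative sketch only: percent_identity is an assumed helper name.
def percent_identity(a: str, b: str) -> float:
    """Percentage of aligned columns in which both sequences carry the same residue."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return 100.0 * matches / len(a)


# Residues 61-120 of orf76a and orf76-1, copied from the alignment.
orf76a_row = "AVRRLQTLEVLKNRALKEGLDKDKDVQNRFKIAEASFYAEEYVRFLERSETVSESALRQF"
orf76_1_row = "AVRRLQTLEVLKNRALKEGLDKDKDVQNRFKIAEASFYAEEYVRFLERSETVSEDELHKF"
row_identity = round(percent_identity(orf76a_row, orf76_1_row), 1)

# Whole alignment: 252 columns (251 residues plus the stop) with 6 substitutions.
overall_identity = round(100.0 * (252 - 6) / 252, 1)
print(row_identity, overall_identity)
```

With six substitutions in 252 columns the overall value is 246/252, which rounds to the 97.6% stated in the text.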
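Homology with a predicted ORF from N. gonorrhoeae
ORF76 shows 96.7% and 100% identity in 30 and 31 aa overlap at the N-terminus and C-terminus, respectively, with a predicted ORF (ORF76.ng) from N. gonorrhoeae: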
orf76.pep MKQKKTAAAVIAAMLAGFAAXKAPEIDPAL 30
|||||||||||||||||||| |||||||||
orf76ng MKQKKTAAAVIAAMLAGFAAAKAPEIDPALVDTLVAQIMQQADRHAEQSQRPDGQAIRND 60
//
orf76.pep ELVRNQLEQGLRQEKARLKIDALLEENGVKP 251
|||||||||||||||||||||||||||||||
orf76ng VTRNPVKLGERYYLFKLGAVGKNPDAQPFELVRNQLEQGLRQEKARLKIDALLEENGVKP 251
The complete length ORF76ng nucleotide sequence <SEQ ID 301> is:
1 ATGAAACAGA AAAAGACCGC TGCCGCAGTT ATTGCTGCAA TGTTGGCAGG
51 TTTTGCGGCA GCCAAAGCAC CCGAAATCGA CCCGGCTTTG GTGGATACGC
101 TGGTGGCGCA GATCATGCAG CAGGCAGACC GGCATGCGGA GCAGTCCCAA
151 AGACCGGACG GGCAGGCAAT CCGAAACGAT GCCGTCCGCC GGCTGCAAAC
201 TTTGGAAGTT TTGAAAAACA GGGCATTGAA GGAAGGTTTG GATAAGGATA
251 AGGATGTCCA AAACCGCTTT AAAATCGCCG AAGCGTCTTT TTATGCCGAG
301 GAGTACGTCC GTTTTCTGGA ACGTTCGGAA ACGGTTTCCG AAAGCGCACT
351 GCGTCAGTTT TATGAGCGGC AAATCCGCAT GATCAAATTG CAGCAGGTCA
401 GCTTCGCAAC CGAAGAGGAG GCGCGTCAGG CGCAGCAGCT CCTGCTCAAA
451 GGGCTGTCTT TTGAAGGGCT GATGAAGCGT TATCCGAACG ACGAGCAGGC
501 GTTCGACGGT TTCATTATGG CGCAGCAGCT TCCCGAGCCG CTGGCTTcgc
551 agtttgCCGG TATGAACCGT GGCGACGTTA CCCGCAATCC GGTCAAATTG
601 GGCGAACGCT ATTACCTGTT CAAACTCGGC GCGGTCGGGA AAAACCCCGA
651 CGCGCAGCCT TTCGAGTTGG TCAGAAACCA GTTGGAACAA GGTTTGAGGC
701 AGGAAAAAGC CCGCTTGAAA ATCGATGCCC TTTTGGAaga Aaacggtgtc
751 AaacCGTAA
This encodes a protein having amino acid sequence <SEQ ID 302>:
1 MKQKKTAAAV IAAMLAGFAA AKAPEIDPAL VDTLVAQIMQ QADRHAEQSQ
51 RPDGQAIRND AVRRLQTLEV LKNRALKEGL DKDKDVQNRF KIAEASFYAE
101 EYVRFLERSE TVSESALRQF YERQIRMIKL QQVSFATEEE ARQAQQLLLK
151 GLSFEGLMKR YPNDEQAFDG FIMAQQLPEP LASQFAGMNR GDVTRNPVKL
201 GERYYLFKLG AVGKNPDAQP FELVRNQLEQ GLRQEKARLK IDALLEENGV
251 KP*
ORF76ng and ORF76-1 show 96.0% identity in 252 aa overlap:
10 20 30 40 50 60
orf76-1.pep MKQKKTAAAVIAAMLAGFAAAKAPEIDPALVDTLVAQIMQQADRHAEQSQKPDGQAIRND
||||||||||||||||||||||||||||||||||||||||||||||||||:|||||||||
orf76ng MKQKKTAAAVIAAMLAGFAAAKAPEIDPALVDTLVAQIMQQADRHAEQSQRPDGQAIRND
10 20 30 40 50 60
70 80 90 100 110 120
orf76-1.pep AVRRLQTLEVLKNRALKEGLDKDKDVQNRFKIAEASFYAEEYVRFLERSETVSEDELHKF
||||||||||||||||||||||||||||||||||||||||||||||||||||||: |::|
orf76ng AVRRLQTLEVLKNRALKEGLDKDKDVQNRFKIAEASFYAEEYVRFLERSETVSESALRQF
70 80 90 100 110 120
130 140 150 160 170 180
orf76-1.pep YEQQIRMIKLQQVSFATEEEARQAQQLLLKGLSFEGLMKRYPNDEQAFDGFIMAQQLPEP
||:|||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf76ng YERQIRMIKLQQVSFATEEEARQAQQLLLKGLSFEGLMKRYPNDEQAFDGFIMAQQLPEP
130 140 150 160 170 180
190 200 210 220 230 240
orf76-1.pep LASQFAAMNRGDVTRDPVKLGERYYLFKLSEVGKNPDAQPFELVRNQLEQGLRQEKARLK
||||||:||||||||:|||||||||||||:||||||||||||||||||||||||||||||
orf76ng LASQFAGMNRGDVTRNPVKLGERYYLFKLGAVGKNPDAQPFELVRNQLEQGLRQEKARLK
190 200 210 220 230 240
250
orf76-1.pep IDALLEENGVKPX
|||||||||||||
orf76ng IDALLEENGVKPX
250
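In addition, ORF76ng shows significant homology with a B. subtilis export protein precursor:
sp|P24327|PRSA_BACSU PROTEIN EXPORT PROTEIN PRSA PRECURSOR >gi|98227|pir||S15269 33K lipoprotein - Bacillus subtilis >gi|39782 (X57271) 33kDa lipoprotein [Bacillus subtilis]
>gi|2226124|gnl|PID|e325181 (Y14077) 33kDa lipoprotein [Bacillus subtilis] >gi|2633331|gnl|PID|e1182997 (Z99109) molecular chaperone [Bacillus subtilis] Length = 292
Score = 50.4 bits (118), Expect = 1e-05
Identities = 48/199 (24%), Positives = 82/199 (41%), Gaps = 32/199 (16%)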
Query: 70 VLKNRALKEGLDK-----DKDVQNRFKIAEASF----------YAEEYVRFLERSETVSE 114
VL ++ LDK DK++ N+ K + Y ++Y++ + E +++
Sbjct: 53 VLTQLVQEKVLDKKYKVSDKEIDNKLKEYKTQLGDQYTALEKQYGKDYLKEQVKYELLTQ 112
Query: 115 SA-----------LRQFYERQIRMIKLQQVSFATEEEARQAQQLLLKGLSFEGLMKRYPN 163
A +++++E I+ + A ++ A + ++ L KG FE L K Y
Sbjct: 113 KAAKDNIKVTDADIKEYWEGLKGKIRASHILVADKKTAEEVEKKLKKGEKFEDLAKEYST 172
Query: 164 DEQAFDG-----FIMAQQLPEPLASQFAAMNRGDVTRDPVKLGERYYLFKLSEVGKNPDA 218
D A G F Q+ E + + G+V+ DPVK Y++ K +E D
Sbjct: 173 DSSASKGGDLGWFAKEGQMDETFSKAAFKLKTGEVS-DPVKTQYGYHIIKKTEERGKYDD 231
Query: 219 QPFELVRNQLEQGLRQEKA 237
EL LEQ L A
Sbjct: 232 MKKELKSEVLEQKLNDNAA 250
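Based on this analysis, including the presence of a putative leader sequence and an RGD motif in the gonococcal protein, it is predicted that the proteins from N. meningitidis and N. gonorrhoeae, and their epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies.
ORF76-1 (27.8 kDa) was cloned in the pET vector and expressed in E. coli, as described above. The products of protein expression and purification were analysed by SDS-PAGE. Figure 10A shows the results of affinity purification of the His-fusion protein. Purified His-fusion protein was used to immunise mice, whose sera were used for Western blot (Figure 10B), ELISA (positive result), and FACS analysis (Figure 10C). These experiments confirm that ORF76-1 is a surface-exposed protein and a useful immunogen.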
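The RGD motif noted in the analysis of the gonococcal protein can be located by a plain substring search. The following is a minimal sketch; `find_motif` is a hypothetical helper introduced here, and the input is residues 181-200 of <SEQ ID 302> as printed in the sequence listing, with the `offset` argument converting string indices back to residue numbers.

```python
# Illustrative sketch only: find_motif is an assumed helper name.
def find_motif(protein: str, motif: str, offset: int = 1) -> list:
    """Return the 1-based residue numbers at which `motif` starts.

    `offset` is the residue number of the first character of `protein`.
    """
    positions = []
    start = protein.find(motif)
    while start != -1:
        positions.append(start + offset)
        start = protein.find(motif, start + 1)
    return positions


# Residues 181-200 of <SEQ ID 302> (ORF76ng), spaces removed.
segment = "LASQFAGMNRGDVTRNPVKL"
rgd_positions = find_motif(segment, "RGD", offset=181)
print(rgd_positions)
```

Under these assumptions the motif starts at residue 190 of the gonococcal sequence.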
Example 36
The following partial DNA sequence was identified in N. meningitidis <SEQ ID 303>:
1 ATGAAAAAAT CTTTCCTTAC GCTTGTTCTG TATTCGTCTT TACTTACCGC
51 CAGCGAAATT GCC
TACCCC TTGGAATTGG GGATTGAAAC CTTACCGGCG
101 GCAAAAATTG CGGAAACGTT TGCGCTGACA TTTGTGATTG CTGCGCTGTA
151 TCTGTTTGCG CGTAATAAGG TGACGCGTTT GTTGATTGCG GTGTTTTTTG
201 CGTTCAGCAT TATTGCCAAC AATGTGCATT ACGCGGATTA TCAAAGCTGG
251 ATGACG.... .......... .......... .......... ..........
//
1201 .......... CAAACCGTAT TCGAGCAGCT GCAAAAGACT CCTGACGGCA
1251 ACTGGCTGTT TGCCTATACC TCCGATCATG GCCAGTATGT TCGCCAAGAT
1301 ATCTACAATC AAGGCACGGT GCAGCCCGAC AGCTATCTCG TGCCGCTAGT
1351 GTTGTACAGC CCGGATAAGG CCGTGCAACA GGCTGCCAAC CAGGCTTTTG
1401 CGCCTTGCGA GATTGCCTTC CATCAGCAGC TTTCAACGTT CCTGATTCAC
1451 ACGTTGGGCT ACGATATGCC GGTTTCAGGT TGTCGCGAAG GCTCGGTAAC
1501 GGGCAACCTG ATTACGGGTG ATGCAGGCAG CTTGAACATT CGCGACGGCA
1551 AGGCGGAATA TGTTTATCCG CAATGA
This corresponds to the amino acid sequence <SEQ ID 304; ORF81>:
1 MKKSFLTLVL YSSLLTASEI AYPLELGIET LPAAKIAETF ALTFVIAALY
51 LFARNKVTRL LIAVFFAFSI IANNVHYADY QSWMT..... ..........
//
401 ...QTVFEQL QKTPDGNWLF AYTSDHGQYV RQDIYNQGTV QPDSYLVPLV
451 LYSPDKAVQQ AANQAFAPCE IAFHQQLSTF LIHTLGYDMP VSGCREGSVT
501 GNLITGDAGS LNIRDGKAEY VYPQ*
Further work revealed the complete nucleotide sequence <SEQ ID 305>:
1 ATGAAAAAAT CTTTCCTTAC GCTTGTTCTG TATTCGTCTT TACTTACCGC
51 CAGCGAAATT GCCTATCGCT TTGTATTTGG GATTGAAACC TTACCGGCGG
101 CAAAAATTGC GGAAACGTTT GCGCTGACAT TTGTGATTGC TGCGCTGTAT
151 CTGTTTGCGC GTTATAAGGT GACGCGTTTG TTGATTGCGG TGTTTTTTGC
201 GTTCAGCATT ATTGCCAACA ATGTGCATTA CGCGGTTTAT CAAAGCTGGA
251 TGACGGGCAT CAATTATTGG CTGATGCTGA AAGAGGTTAC CGAAGTCGGC
301 AGCGCGGGTG CGTCGATGTT GGATAAGTTG TGGCTGCCTG TGTTGTGGGG
351 CGTGTTGGAA GTCATGTTGT TTTGCAGCCT TGCCAAGTTC CGCCGTAAGA
401 CGCATTTTTC TGCCGATATA CTGTTTGCCT TCCTAATGCT GATGATTTTC
451 GTGCGTTCGT TCGACACGAA ACAAGAGCAC GGTATTTCGC CCAAACCGAC
501 ATACAGCCGC ATCAAAGCCA ATTATTTCAG CTTCGGTTAT TTTGTCGGAC
551 GCGTGTTGCC GTATCAGTTG TTTGATTTAA GCAGGATTCC CGCCTTTAAG
601 CAGCCTGCTC CAAGCAAAAT CGGGCAGGGC AGTGTTCAAA ATATCGTCCT
651 GATTATGGGC GAAAGCGAAA GCGCGGCGCA TTTGAAGCTG TTTGGCTACG
701 GACGCGAAAC TTCGCCGTTT TTAACCCGGC TGTCGCAAGC CGATTTTAAG
751 CCGATTGTGA AACAAAGTTA TTCCGCAGGC TTTATGACTG CAGTGTCCCT
801 GCCCAGTTTT TTCAATGCGA TACCGCACGC CAACGGCTTG GAACAAATCA
851 GCGGCGGCGA TACCAATATG TTCCGCCTCG CCAAAGAGCA GGGCTATGAA
901 ACGTATTTTT ACAGCGCGCA GGCGGAAAAC GAGATGGCGA TTTTGAACTT
951 AATCGGTAAG AAATGGATAG ACCATCTGAT TCAGCCGACG CAACTTGGCT
1001 ACGGCAACGG CGACAATATG CCCGATGAGA AGCTGCTGCC GTTGTTCGAC
1051 AAAATCAATT TGCAGCAGGG CAAGCATTTT ATCGTGTTGC ACCAACGCGG
1101 TTCGCACGCC CCATACGGCG CATTGTTGCA GCCTCAAGAT AAAGTATTCG
1151 GCGAAGCCGA TATTGTGGAT AAGTACGACA ACACCATCCA CAAAACCGAC
1201 CAAATGATTC AAACCGTATT CGAGCAGCTG CAAAAGCAGC CTGACGGCAA
1251 CTGGCTGTTT GCCTATACCT CCGATCATGG CCAGTATGTT CGCCAAGATA
1301 TCTACAATCA AGGCACGGTG CAGCCCGACA GCTATCTCGT GCCGCTAGTG
1351 TTGTACAGCC CGGATAAGGC CGTGCAACAG GCTGCCAACC AGGCTTTTGC
1401 GCCTTGCGAG ATTGCCTTCC ATCAGCAGCT TTCAACGTTC CTGATTCACA
1451 CGTTGGGCTA CGATATGCCG GTTTCAGGTT GTCGCGAAGG CTCGGTAACG
1501 GGCAACCTGA TTACGGGTGA TGCAGGCAGC TTGAACATTC GCGACGGCAA
1551 GGCGGAATAT GTTTATCCGC AATGA
This corresponds to the amino acid sequence <SEQ ID 306; ORF81-1>:
1 MKKSFLTLVL YSSLLTASEI AYRFVFGIET LPAAKIAETF ALTFVIAALY
51 LFARYKVTRL LIAVFFAFSI IANNVHYAVY QSWMTGINYW LMLKEVTEVG
101 SAGASMLDKL WLPVLWGVLE VMLFCSLAKF RRKTHFSADI LFAFLMLMIF
151 VRSFDTKQEH GISPKPTYSR IKANYFSFGY FVGRVLPYQL FDLSRIPAFK
201 QPAPSKIGQG SVQNIVLIMG ESESAAHLKL FGYGRETSPF LTRLSQADFK
251 PIVKQSYSAG FMTAVSLPSF FNAIPHANGL EQISGGDTNM FRLAKEQGYE
301 TYFYSAQAEN EMAILNLIGK KWIDHLIQPT QLGYGNGDNM PDEKLLPLFD
351 KINLQQGKHF IVLHQRGSHA PYGALLQPQD KVFGEADIVD KYDNTIHKTD
401 QMIQTVFEQL QKQPDGNWLF AYTSDHGQYV RQDIYNQGTV QPDSYLVPLV
451 LYSPDKAVQQ AANQAFAPCE IAFHQQLSTF LIHTLGYDMP VSGCREGSVT
501 GNLITGDAGS LNIRDGKAEY VYPQ*
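Computer analysis of this amino acid sequence gave the following results:
Homology with a predicted ORF from N. meningitidis (strain A)
ORF81 shows 84.7% identity over an 85 aa overlap and 99.2% identity over a 121 aa overlap with an ORF (ORF81a) from strain A of N. meningitidis: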
10 20 30 40 50 60
orf81.pep MKKSFLTLVLYSSLLTASEIAYPLELGIETLPAAKIAETFALTFVIAALYLFARNKVTRL
||||:::| ||||||||||||| : :|||||||||:|||||||||||||||||| |:|||
orf81a MKKSLFVLFLYSSLLTASEIAYRFVFGIETLPAAKMAETFALTFVIAALYLFARYKATRL
10 20 30 40 50 60
70 80
orf81.pep LIAVFFAFSIIANNVHYADYQSWMT
|||||||||||||||||| ||||:|
orf81a LIAVFFAFSIIANNVHYAVYQSWITGINYWLMLKEITEVGGAGASMLDKLWLPALWGVLE
70 80 90 100 110 120
//
120 130 140
orf81.pep QTVFEQLQKTPDGNWLFAYTSDHGQYVRQD
||||||||| ||||||||||||||||||||
orf81a IPHANGLEQISGGDIVDKYDNTIHKTDQMIQTVFEQLQKQPDGNWLFAYTSDHGQYVRQD
280 290 300 310 320 330
150 160 170 180 190 200
orf81.pep IYNQGTVQPDSYLVPLVLYSPDKAVQQAANQAFAPCEIAFHQQLSTFLIHTLGYDMPVSG
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf81a IYNQGTVQPDSYLVPLVLYSPDKAVQQAANQAFAPCEIAFHQQLSTFLIHTLGYDMPVSG
340 350 360 370 380 390
210 220 230
orf81.pep CREGSVTGNLITGDAGSLNIRDGKAEYVYPQX
||||||||||||||||||||||||||||||||
orf81a CREGSVTGNLITGDAGSLNIRDGKAEYVYPQX
400 410 420
The complete length ORF81a nucleotide sequence <SEQ ID 307> is:
1 ATGAAAAAAT CCCTTTTCGT TCTCTTTCTG TATTCGTCCC TACTTACTGC
51 CAGCGAAATT GCTTATCGCT TTGTATTCGG AATTGAAACC TTACCGGCTG
101 CAAAAATGGC AGAAACGTTT GCGCTGACAT TTGTGATTGC TGCGCTGTAT
151 CTGTTTGCGC GTTATAAGGC AACGCGTTTG TTGATTGCGG TGTTTTTCGC
201 GTTCAGCATT ATTGCCAACA ATGTGCATTA CGCGGTTTAT CAAAGCTGGA
251 TAACGGGCAT TAATTATTGG CTGATGCTGA AAGAGATTAC CGAAGTTGGC
301 GGCGCAGGGG CGTCGATGTT GGATAAGTTG TGGCTGCCTG CGTTGTGGGG
351 CGTGTTGGAA GTCATGTTGT TTTGCAGCCT TGCCAAGTTC CGCCGTAAGA
401 CGCATTTTTC TGCCGATATA CTGTTTGCCT TCCTAATGCT GATGATTTTC
451 GTGCGTTCGT TCGACACGAA ACAAGAACAC GGTATTTCGC CCAAACCGAC
501 ATACAGCCGC ATCAAAGCCA ATTATTTCAG CTTCGGTTAT TTTGTCGGAC
551 GCGTGTTGCC GTATCAGTTG TTTGATTTAA GCAAGATTCC TGTGTTCAAA
601 CAGCCTGCTC CAAGCAGAAT CGGGCAAGGC AGTATTCAAA ATATCGTCCT
651 GATTATGGGC GAAAGCGAAA GCGCGGCGCA TTTGAAATTG TTTGGCTACG
701 GGCGCGAAAC TTCGCCGTTT TTGACCCAGC TTTCGCAAGC CGATTTTAAG
751 CCGATTGTGA AACAAAGTTA TTCCGCAGGC TTTATGACGG CAGTATCCCT
801 GCCCAGTTTC TTTAACGTCA TACCGCATGC CAACGGCTTG GAACAAATCA
851 GCGGCGGCGA TATTGTGGAT AAGTACGACA ACACCATCCA CAAAACCGAC
901 CAAATGATTC AAACCGTATT CGAGCAGCTG CAAAAGCAGC CTGACGGCAA
951 CTGGCTGTTT GCCTATACCT CCGATCATGG CCAGTATGTT CGCCAAGATA
1001 TCTACAATCA AGGCACGGTG CAGCCCGACA GCTATCTCGT GCCGCTGGTG
1051 TTGTACAGCC CGGATAAGGC CGTGCAACAG GCTGCCAACC AGGCTTTTGC
1101 GCCTTGCGAG ATTGCCTTCC ATCAGCAGCT TTCAACGTTC CTGATTCACA
1151 CGTTGGGCTA CGATATGCCG GTTTCAGGTT GTCGCGAAGG CTCGGTAACG
1201 GGCAACCTGA TTACGGGTGA TGCAGGCAGC TTGAACATTC GCGACGGCAA
1251 GGCGGAATAT GTTTATCCGC AATGA
This encodes a protein having amino acid sequence <SEQ ID 308>:
1 MKKSLFVLFL YSSLLTASEI AYRFVFGIET LPAAKMAETF ALTFVIAALY
51 LFARYKATRL LIAVFFAFSI IANNVHYAVY QSWITGINYW LMLKEITEVG
101 GAGASMLDKL WLPALWGVLE VMLFCSLAKF RRKTHFSADI LFAFLMLMIF
151 VRSFDTKQEH GISPKPTYSR IKANYFSFGY FVGRVLPYQL FDLSKIPVFK
201 QPAPSRIGQG SIQNIVLIMG ESESAAHLKL FGYGRETSPF LTQLSQADFK
251 PIVKQSYSAG FMTAVSLPSF FNVIPHANGL EQISGGDIVD KYDNTIHKTD
301 QMIQTVFEQL QKQPDGNWLF AYTSDHGQYV RQDIYNQGTV QPDSYLVPLV
351 LYSPDKAVQQ AANQAFAPCE IAFHQQLSTF LIHTLGYDMP VSGCREGSVT
401 GNLITGDAGS LNIRDGKAEY VYPQ*
ORF81a and ORF81-1 show 77.9% identity in 524 aa overlap:
10 20 30 40 50 60
orf81a.pep MKKSLFVLFLYSSLLTASEIAYRFVFGIETLPAAKMAETFALTFVIAALYLFARYKATRL
||||:::| ||||||||||||||||||||||||||:||||||||||||||||||||:|||
orf81-1 MKKSFLTLVLYSSLLTASEIAYRFVFGIETLPAAKIAETFALTFVIAALYLFARYKVTRL
10 20 30 40 50 60
70 80 90 100 110 120
orf81a.pep LIAVFFAFSIIANNVHYAVYQSWITGINYWLMLKEITEVGGAGASMLDKLWLPALWGVLE
||||||||||||||||||||||||:|||||||||||:||||:||||||||||||:|||||
orf81-1 LIAVFFAFSIIANNVHYAVYQSWMTGINYWLMLKEVTEVGSAGASMLDKLWLPVLWGVLE
70 80 90 100 110 120
130 140 150 160 170 180
orf81a.pep VMLFCSLAKFRRKTHFSADILFAFLMLMIFVRSFDTKQEHGISPKPTYSRIKANYFSFGY
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf81-1 VMLFCSLAKFRRKTHFSADILFAFLMLMIFVRSFDTKQEHGISPKPTYSRIKANYFSFGY
130 140 150 160 170 180
190 200 210 220 230 240
orf81a.pep FVGRVLPYQLFDLSKIPVFKQPAPSRIGQGSIQNIVLIMGESESAAHLKLFGYGRETSPF
||||||||||||||:||:|||||||:|||||:||||||||||||||||||||||||||||
orf81-1 FVGRVLPYQLFDLSRIPAFKQPAPSKIGQGSVQNIVLIMGESESAAHLKLFGYGRETSPF
190 200 210 220 230 240
250 260 270 280
orf81a.pep LTQLSQADFKPIVKQSYSAGFMTAVSLPSFFNVIPHANGLEQISGGD-------------
||:|||||||||||||||||||||||||||||:||||||||||||||
orf81-1 LTRLSQADFKPIVKQSYSAGFMTAVSLPSFFNAIPHANGLEQISGGDTNMFRLAKEQGYE
250 260 270 280 290 300
orf81a.pep ------------------------------------------------------------
orf81-1 TYFYSAQAENEMAILNLIGKKWIDHLIQPTQLGYGNGDNMPDEKLLPLFDKINLQQGKHF
310 320 330 340 350 360
290 300 310 320
orf81a.pep ---------------------------IVDKYDNTIHKTDQMIQTVFEQLQKQPDGNWLF
|||||||||||||||||||||||||||||||||
orf81-1 IVLHQRGSHAPYGALLQPQDKVFGEADIVDKYDNTIHKTDQMIQTVFEQLQKQPDGNWLF
370 380 390 400 410 420
330 340 350 360 370 380
orf81a.pep AYTSDHGQYVRQDIYNQGTVQPDSYLVPLVLYSPDKAVQQAANQAFAPCEIAFHQQLSTF
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf81-1 AYTSDHGQYVRQDIYNQGTVQPDSYLVPLVLYSPDKAVQQAANQAFAPCEIAFHQQLSTF
430 440 450 460 470 480
390 400 410 420
orf81a.pep LIHTLGYDMPVSGCREGSVTGNLITGDAGSLNIRDGKAEYVYPQX
|||||||||||||||||||||||||||||||||||||||||||||
orf81-1 LIHTLGYDMPVSGCREGSVTGNLITGDAGSLNIRDGKAEYVYPQX
490 500 510 520
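Homology with a predicted ORF from N. gonorrhoeae
ORF81 shows 82.4% and 97.5% identity in 85 and 121 aa overlap at the N- and C-termini, respectively, with a predicted ORF (ORF81.ng) from N. gonorrhoeae: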
orf81.pep MKKSFLTLVLYSSLLTASEIAYPLELGIETLPAAKIAETFALTFVIAALYLFARNKVTRL 60
||||:::| ||||||||||||| : :|||||||||:||||||||:||||||||| |::||
orf81ng MKKSLFVLFLYSSLLTASEIAYRFVFGIETLPAAKMAETFALTFMIAALYLFARYKASRL 60
orf81.pep LIAVFFAFSIIANNVHYADYQSWMT 85
|||||||||:|||||||||||||||
orf81ng LIAVFFAFSMIANNVHYAVYQSWMTGINYWLMLKEVTEVGSAGASMLDKLWLPALWGVAE 120
//
orf81.pep QTVFEQLQKTPDGNWLFAYTSDHGQYVRQD 433
||||||||| ||||||||||||||||||||
orf81ng ALLQPQDKVFGEADIVDKYDNTIHKTDQMIQTVFEQLQKQPDGNWLFAYTSDHGQYVRQD 433
orf81.pep IYNQGTVQPDSYLVPLVLYSPDKAVQQAANQAFAPCEIAFHQQLSTFLIHTLGYDMPVSG 493
||||||||||||:|||||||||||||||||||||||||||||||||||||||||||||||
orf81ng IYNQGTVQPDSYIVPLVLYSPDKAVQQAANQAFAPCEIAFHQQLSTFLIHTLGYDMPVSG 493
orf81.pep CREGSVTGNLITGDAGSLNIRDGKAEYVYPQ 524
|||||||||||||||||||||:|||||||||
orf81ng CREGSVTGNLITGDAGSLNIRNGKAEYVYPQ 524
The complete length ORF81ng nucleotide sequence <SEQ ID 309> is:
1 ATGAAAAAAT CCCTTTTCGT TCTCTTTCTG TATTCATCCC TACTTACCGC
51 CAGCGAAATC GCCTATCGCT TTGTATTCGG AATTGAAACC TTACCGGCTG
101 CAAAAATGGC GGAAACGTTT GCGCTGACAT TTATGATTGC TGCGCTGTAT
151 CTGTTTGCGC GTTATAAGGC TTCGCGGCTG CTGATTGCGG TGTTTTTCGC
201 GTTCAGCATG ATTGCCAACA ATGTGCATTA CGCGGTTTAT CAAAGCTGGA
251 TGACGGGTAT TAACTATTGG CTGATGCTGA AAGAGGTTAC CGAAGTCGGC
301 AGCGCGGGCG CGTCGATGTT GGATAAGTTG TGGCTGCCTG CTTTGTGGGG
351 CGTGGCGGAA GTCATGTTGT TTTGCAGCCT TGCCAAGTTC CGCCGTAAGA
401 CGCATTTTTC TGCCGATATA CTGTTTGCCT TCCTAATGCT GATGATTTTC
451 GTGCGTTCGT TCGACACGAA ACAAGAGCAC GGTATTTCGC CCAAACCGAC
501 ATACAGCCGC ATCAAAGCCA ATTATTTCAG CTTCGGTTAT TTTGTCGGGC
551 GCGTGTTGCC GTATCAGTTG TTTGATTTAA GCAAGATCCC TGTGTTCAAA
601 CAGCCTGCTC CAAGCAAAAT CGGGCAAGGC AGTATTCAAA ATATCGTCCT
651 GATTATGGGC GAAAGCGAAA GCGCGGCGCA TTTGAAATTG TTTGGTTACG
701 GGCGCGAAAC TTCGCCGTTT TTAACCCGGC TGTCGCAAGC CGATTTTAAG
751 CCGATTGTGA AACAAAGTTA TTCCGCAGGC TTTATGACGG CAGTATCCCT
801 GCCCAGTTTC TTTAACGTCA TACCGCACGC CAACGGCTTG GAACAAATCA
851 GCGGCGGCGA TACCAATATG TTCCGCCTCG CCAAAGAGCA GGGCTATGAA
901 ACGTATTTTT ACAGTGCCCA GGCTGAAAAC CAAATGGCAA TTTTGAACTT
951 AATCGGTAAG AAATGGATAG ACCATCTGAT TCAGCCGACG CAACTTGGCT
1001 ACGGCAACGG CGACAATATG CCCGATGAGA AGCTGCTGCC GTTGTTCGAC
1051 AAAATCAATT TGCAGCAGGG CAGGCATTTT ATCGTGTTGC ACCAACGCGG
1101 TTCGCACGCC CCATACGGCG CATTGTTGCA GCCTCAAGAT AAAGTATTCG
1151 GCGAAGCCGA TATTGTGGAT AAGTACGACA ACACCATCCA CAAAACCGAC
1201 CAAATGATTC AAACCGTATT CGAGCAGCTG CAAAAGCAGC CTGACGGCAA
1251 CTGGCTGTTT GCCTATACCT CCGATCATGG CCAGTATGTG CGCCAAGATA
1301 TCTACAATCA AGGCACGGTG CAGCCCGACA GCTATATTGT GCCTCTGGTT
1351 TTGTACAGCC CGGATAAGGC CGTGCAACAG GCTGCCAACC AGGCTTTTGC
1401 GCCTTGCGAG ATTGCCTTCC ATCAGCAGCT TTCAACGTTC CTGATTCACA
1451 CGTTGGGCTA CGATATGCCG GTTTCAGGTT GTCGCGAAGG CTCGGTAACA
1501 GGCAACCTGA TTACGGGCGA TGCAGGCAGC TTGAACATTC GCAACGGCAA
1551 GGCGGAATAT GTTTATCCGC AATAA
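This encodes a protein having amino acid sequence <SEQ ID 310>:
1 MKKSLFVLFL YSSLLTASEI AYRFVFGIET LPAAKMAETF ALTFMIAALY
51 LFARYKASRL LIAVFFAFSM IANNVHYAVY QSWMTGINYW LMLKEVTEVG
101 SAGASMLDKL WLPALWGVAE VMLFCSLAKF RRKTHFSADI LFAFLMLMIF
151 VRSFDTKQEH GISPKPTYSR IKANYFSFGY FVGRVLPYQL FDLSKIPVFK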
VRSFDTKQEH GISPKPTYSR IKANYFSFGY FVGRVLPYQL FDLSKIPVFK
201 QPAPSKIGQG SIQNIVLIMG ESESAAHLKL FGYGRETSPF LTRLSQADFK
251 PIVKQSYSAG FMTAVSLPSF FNVIPHANGL EQISGGDTNM FRLAKEQGYE
301 TYFYSAQAEN QMAILNLIGK KWIDHLIQPT QLGYGNGDNM PDEKLLPLFD
351 KINLQQGRHF IVLHQRGSHA PYGALLQPQD KVFGEADIVD KYDNTIHKTD
401 QMIQTVFEQL QKQPDGNWLF AYTSDHGQYV RQDIYNQGTV QPDSYIVPLV
451 LYSPDKAVQQ AANQAFAPCE IAFHQQLSTF LIHTLGYDMP VSGCREGSVT
501 GNLITGDAGS LNIRNGKAEY VYPQ*
ORF81ng and ORF81-1 show 96.4% identity in 524 aa overlap:
10 20 30 40 50 60
orf81ng-1.pep MKKSLFVLFLYSSLLTASEIAYRFVFGIETLPAAKMAETFALTFMIAALYLFARYKASRL
||||:::| ||||||||||||||||||||||||||:||||||||:|||||||||||::||
orf81-1 MKKSFLTLVLYSSLLTASEIAYRFVFGIETLPAAKIAETFALTFVIAALYLFARYKVTRL
10 20 30 40 50 60
70 80 90 100 110 120
orf81ng-1.pep LIAVFFAFSMIANNVHYAVYQSWMTGINYWLMLKEVTEVGSAGASMLDKLWLPALWGVAE
|||||||||:|||||||||||||||||||||||||||||||||||||||||||:|||| |
orf81-1 LIAVFFAFSIIANNVHYAVYQSWMTGINYWLMLKEVTEVGSAGASMLDKLWLPVLWGVLE
70 80 90 100 110 120
130 140 150 160 170 180
orf81ng-1.pep VMLFCSLAKFRRKTHFSADILFAFLMLMIFVRSFDTKQEHGISPKPTYSRIKANYFSFGY
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf81-1 VMLFCSLAKFRRKTHFSADILFAFLMLMIFVRSFDTKQEHGISPKPTYSRIKANYFSFGY
130 140 150 160 170 180
190 200 210 220 230 240
orf81ng-1.pep FVGRVLPYQLFDLSKIPVFKQPAPSKIGQGSIQNIVLIMGESESAAHLKLFGYGRETSPF
||||||||||||||:||:|||||||||||||:||||||||||||||||||||||||||||
orf81-1 FVGRVLPYQLFDLSRIPAFKQPAPSKIGQGSVQNIVLIMGESESAAHLKLFGYGRETSPF
190 200 210 220 230 240
250 260 270 280 290 300
orf81ng-1.pep LTRLSQADFKPIVKQSYSAGFMTAVSLPSFFNVIPHANGLEQISGGDTNMFRLAKEQGYE
||||||||||||||||||||||||||||||||:|||||||||||||||||||||||||||
orf81-1 LTRLSQADFKPIVKQSYSAGFMTAVSLPSFFNAIPHANGLEQISGGDTNMFRLAKEQGYE
250 260 270 280 290 300
310 320 330 340 350 360
orf81ng-1.pep TYFYSAQAENQMAILNLIGKKWIDHLIQPTQLGYGNGDNMPDEKLLPLFDKINLQQGRHF
||||||||||:||||||||||||||||||||||||||||||||||||||||||||||:||
orf81-1 TYFYSAQAENEMAILNLIGKKWIDHLIQPTQLGYGNGDNMPDEKLLPLFDKINLQQGKHF
310 320 330 340 350 360
370 380 390 400 410 420
orf81ng-1.pep IVLHQRGSHAPYGALLQPQDKVFGEADIVDKYDNTIHKTDQMIQTVFEQLQKQPDGNWLF
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf81-1 IVLHQRGSHAPYGALLQPQDKVFGEADIVDKYDNTIHKTDQMIQTVFEQLQKQPDGNWLF
370 380 390 400 410 420
430 440 450 460 470 480
orf81ng-1.pep AYTSDHGQYVRQDIYNQGTVQPDSYIVPLVLYSPDKAVQQAANQAFAPCEIAFHQQLSTF
|||||||||||||||||||||||||:||||||||||||||||||||||||||||||||||
orf81-1 AYTSDHGQYVRQDIYNQGTVQPDSYLVPLVLYSPDKAVQQAANQAFAPCEIAFHQQLSTF
430 440 450 460 470 480
490 500 510 520
orf81ng-1.pep LIHTLGYDMPVSGCREGSVTGNLITGDAGSLNIRNGKAEYVYPQX
||||||||||||||||||||||||||||||||||:||||||||||
orf81-1 LIHTLGYDMPVSGCREGSVTGNLITGDAGSLNIRDGKAEYVYPQX
490 500 510 520
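In addition, ORF81ng shows significant homology with an E. coli OMP:
gi|1256380 (U50906) outer membrane protein adhesin-binding protein [E. coli] Length = 547
Score = 87.4 bits (213), Expect = 2e-16
Identities = 122/468 (26%), Positives = 198/468 (42%), Gaps = 70/468 (14%)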
Query: 25 VFGIETLPAAKMAETFA-LTFMIAALYLFARYKAS--RLLIAVFFAFSMIANNVHYAVYQ 81
VFGI L A+ A L F + + + R + RLL+A F + A ++ ++Y
Sbjct: 29 VFGITNLVASSGAHMVQRLLFFVLTILVVKRISSLPLRLLVAAPFVL-LTAADMSISLY- 86
Query: 82 SWMT-------GINYWLMLKEVTEVGSAGASMLDKLWLPALWGVAEVMLFCSLAKFRRKT 134
SW T G ++ + EV A ML ++ P L A + L +
Sbjct: 87 SWCTFGTTFNDGFAISVLQSDPDEV----AKMLG-MYSPYLCAFAFLSLLFLAVIIKYDV 141
Query: 135 HFSADILFAFLMLMIFVRSF---------DTKQEHGISPKPTYSRIKAN--YFSFGYFVG 183
+ L+L++ S D K ++ SP SR +F+ YF
Sbjct: 142 SLPTKKVTGILLLIVISGSLFSACQFAYKDAKNKNAFSPYILASRFATYTPFFNLNYFAL 201
Query: 184 RVLPYQ--LFDLSKIPVFKQPAPSKIGQGSIQNIVLIMGESESAAHLKLFGYGRETSPFL 241
+Q L + +P F+ + I VLI+GES ++ L+GY R T+P +
Sbjct: 202 AAKEHQRLLSIANTVPYFQL----SVRDTGIDTYVLIVGESVRVDNMSLYGYTRSTTPQV 257
Query: 242 TRLSQADFKPIVKQSYSAGFMTAVSLP---SFFNVIPHANGLEQISGGDTNMFRLAKEQG 298
+Q + Q+ S TA+S+P + +V+ H I N+ +A + G
Sbjct: 258 E--AQRKQIKLFNQAISGAPYTALSVPLSLTADSVLSH-----DIHNYPDNIINMANQAG 310
Query: 299 YETYFYSAQA---ENQMAILNLIGKKWIDHLIQPTQLGYGNGDNMPDEKLLPLFDKINLQ 355
++T++ S+Q+ +N A+ ++ ++ + Y G DE LLP + Q
Sbjct: 311 FQTFWLSSQSAFRQNGTAVTSI--------AMRAMETVYVRGF---DELLLPHLSQALQQ 359
Query: 356 --QGRHFIVLHQRGSHAPYGALLQPQDKVFGEADIVDK-YDNTIHKTDQMIQTVFEQLQK 412
Q + IVLH GSH P + VF D D YDN+IH TD ++ VFE L+
Sbjct: 360 NTQQKKLIVLHLNGSHEPACSAYPQSSAVFQPQDDQDACYDNSIHYTDSLLGQVFELLK- 418
Query: 413 QPDGNWLFAYTSDHG---QYVRQDIYNQG--TVQPDSYIVPL-VLYSP 454
D Y +DHG ++++Y G +Y VP+ + YSP
Sbjct: 419 --DRRASVMYFADHGLERDPTKKNVYFHGGREASQQAYHVPMFIWYSP 464
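Based on this analysis, including the presence of a putative leader sequence (double-underlined) and several putative transmembrane domains (single-underlined) in the gonococcal protein, it is predicted that the proteins from N. meningitidis and N. gonorrhoeae, and their epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies.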
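The percentages in the database comparison summary above follow directly from the reported counts of 122 identities, 198 positives and 70 gaps over 468 aligned positions. The sketch below assumes the percentages are obtained by integer truncation; that assumption is consistent with the three reported values but is not stated in the source, and `blast_percent` is a hypothetical helper.

```python
# Illustrative sketch only: blast_percent is an assumed helper name, and
# truncation (rather than rounding) is an assumption that fits the reported values.
def blast_percent(count: int, aligned_length: int) -> int:
    """Percentage of aligned positions, truncated to an integer."""
    return (100 * count) // aligned_length


aligned_length = 468
identities = blast_percent(122, aligned_length)
positives = blast_percent(198, aligned_length)
gaps = blast_percent(70, aligned_length)
print(identities, positives, gaps)

# For comparison, conventional rounding of the gap fraction gives a different value.
gaps_rounded = round(100 * 70 / aligned_length)
```

The gap figure is the informative case: 70/468 is about 14.96%, which truncates to the reported 14% but would round to 15%.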
Example 37
The following partial DNA sequence was identified in N. meningitidis <SEQ ID 311>:
1 ...ACCCTGCTCC TCTTCATCCC CCTCGTCCTC ACAC.GTGCG GCACACTGAC
51 CGGCATACTC GCCCaCGGCG GCGGCAAACG CTTTGCCGTC GAACAAGAAC
101 TCGTCGCCGC ATCGTCCCGC GCCGCCGTCA AAGAAATGGA TTTGTCCGCC
151 yTAAAAGGAC GCAAAGCCGC CyTTTACGTC TCCGTTATGG GCGACCAAGG
201 TTCGGGCAAC ATAAGCGGCG GACGCTACTC TATCGACGCA CTGATACGCG
251 GCGGCTACCA CAACAACCCC GAAAGTGCCA CCCAATACAG CTACCCCGCC
301 TACGACACTA CCGCCACCAC CAAATCCGAC GCGCTCTCCA GCGTAACCAC
351 TTCCACATCG CTTTTGAACG CCCCCGCCGC CGyCyTGACG AAAAACAGCG
401 GACGCAAAGG CGAACGcTCC GCCGGACTGT CCGTCAACGG CACGGGCGAC
451 TACCGCAACG AAACCCTGCT CGCCAACCCC CGCGACGTTT CCTTCCTGAC
501 CAACCTCATC CAAACCGTCT TCTACCTGCG CGGCATCGAA GTCgTACCGC
551 CCGrATACGC CGACACCGAC GTATTCGTAA CCGTCGACGT A...
This corresponds to the amino acid sequence <SEQ ID 312; ORF83>:
1 ..TLLLFIPLVL TXCGTLTGIL AHGGGKRFAV EQELVAASSR AAVKEMDLSA
51 LKGRKAAXYV SVMGDQGSGN ISGGRYSIDA LIRGGYHNNP ESATQYSYPA
101 YDTTATTKSD ALSSVTTSTS LLNAPAAXLT KNSGRKGERS AGLSVNGTGD
151 YRNETLLANP RDVSFLTNLI QTVFYLRGIE VVPPXYADTD VFVTVDV..
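The partial sequence <SEQ ID 311> contains the ambiguity codes `y` and `r` and an undetermined position written as `.`, and <SEQ ID 312> shows `X` where the encoded residue cannot be resolved. The sketch below illustrates one way such codons can be handled, assuming the IUPAC meanings y = C or T and r = A or G and the standard genetic code; `translate_codon` and the tables are hypothetical helpers, not the method used in the source.

```python
# Illustrative sketch only: names are assumptions, not from the source.
from itertools import product

BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}
# Subset of IUPAC nucleotide codes needed here: R = purine, Y = pyrimidine.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "R": "AG", "Y": "CT"}


def translate_codon(codon: str) -> str:
    """Return the residue if every expansion of the codon agrees, otherwise 'X'."""
    try:
        choices = [IUPAC[base] for base in codon.upper()]
    except KeyError:  # e.g. '.' for an undetermined base
        return "X"
    residues = {CODON_TABLE["".join(combo)] for combo in product(*choices)}
    return residues.pop() if len(residues) == 1 else "X"


# Codons taken from <SEQ ID 311>, read in the frame of <SEQ ID 312>.
results = [translate_codon(c) for c in ("yTA", "yTT", "GyC", "GrA", "C.G")]
print(results)
```

Here `yTA` resolves to leucine because both CTA and TTA encode it, whereas `yTT`, `GyC`, `GrA` and `C.G` remain `X`, matching the residues shown in the partial amino acid sequence.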
Further work revealed the complete nucleotide sequence <SEQ ID 313>:
1 ATGAAAACCC TGCTCCTCCT CATCCCCCTC GTCCTCACAG CCTGCGGCAC
51 ACTGACCGGC ATACCCGCCC ACGGCGGCGG CAAACGCTTT GCCGTCGAAC
101 AAGAACTCGT CGCCGCATCG TCCCGCGCCG CCGTCAAAGA AATGGATTTG
151 TCCGCCCTAA AAGGACGCAA AGCCGCCCTT TACGTCTCCG TTATGGGCGA
201 CCAAGGTTCG GGCAACATAA GCGGCGGACG CTACTCTATC GACGCACTGA
251 TACGCGGCGG CTACCACAAC AACCCCGAAA GTGCCACCCA ATACAGCTAC
301 CCCGCCTACG ACACTACCGC CACCACCAAA TCCGACGCGC TCTCCAGCGT
351 AACCACTTCC ACATCGCTTT TGAACGCCCC CGCCGCCGCC CTGACGAAAA
401 ACAGCGGACG CAAAGGCGAA CGCTCCGCCG GACTGTCCGT CAACGGCACG
451 GGCGACTACC GCAACGAAAC CCTGCTCGCC AACCCCCGCG ACGTTTCCTT
501 CCTGACCAAC CTCATCCAAA CCGTCTTCTA CCTGCGCGGC ATCGAAGTCG
551 TACCGCCCGA ATACGCCGAC ACCGACGTAT TCGTAACCGT CGACGTATTC
601 GGCACCGTCC GCAGCCGTAC CGAACTGCAC CTCTACAACG CCGAAACCCT
651 TAAAGCCCAA ACCAAGCTCG AATATTTCGC CGTTGACCGC GACAGCCGGA
701 AACTGCTGAT TACCCCTAAA ACCGCCGCCT ACGAATCCCA ATACCAAGAA
751 CAATACGCCC TTTGGACCGG CCCTTACAAA GTCAGCAAAA CCGTCAAAGC
801 CTCAGACCGC CTGATGGTCG ATTTCTCCGA CATTACCCCC TACGGCGACA
851 CAACCGCCCA AAACCGTCCC GACTTCAAAC AAAACAACGG TAAAAAACCC
901 GATGTCGGCA ACGAAGTCAT CCGCCGCCGC AAAGGAGGAT AA
This corresponds to the amino acid sequence <SEQ ID 314; ORF83-1>:
1 MKTLLLLIPL VLTACGTLTG IPAHGGGKRF AVEQELVAAS SRAAVKEMDL
51 SALKGRKAAL YVSVMGDQGS GNISGGRYSI DALIRGGYHN NPESATQYSY
101 PAYDTTATTK SDALSSVTTS TSLLNAPAAA LTKNSGRKGE RSAGLSVNGT
151 GDYRNETLLA NPRDVSFLTN LIQTVFYLRG IEVVPPEYAD TDVFVTVDVF
201 GTVRSRTELH LYNAETLKAQ TKLEYFAVDR DSRKLLITPK TAAYESQYQE
251 QYALWTGPYK VSKTVKASDR LMVDFSDITP YGDTTAQNRP DFKQNNGKKP
301 DVGNEVIRRR KGG*
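Computer analysis of this amino acid sequence gave the following results:
Homology with a predicted ORF from N. meningitidis (strain A)
ORF83 shows 96.4% identity over a 197 aa overlap with an ORF (ORF83a) from strain A of N. meningitidis: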
10 20 30 40 50
orf83.pep TLLLFIPLVLTXCGTLTGILAHGGGKRFAVEQELVAASSRAAVKEMDLSALKGRKAAX
||| :|||||| ||||||| ||||||||||||||||||||||||||||||||||||||
orf83a MKTLLXLIPLVLTACGTLTGIPAHGGGKRFAVEQELVAASSRAAVKEMDLSALKGRKAAL
10 20 30 40 50 60
60 70 80 90 100 110
orf83.pep YVSVMGDQGSGNISGGRYSIDALIRGGYHNNPESATQYSYPAYDTTATTKSDALSSVTTS
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf83a YVSVMGDQGSGNISGGRYSIDALIRGGYHNNPESATQYSYPAYDTTATTKSDALSSVTTS
70 80 90 100 110 120
120 130 140 150 160 170
orf83.pep TSLLNAPAAXLTKNSGRKGERSAGLSVNGTGDYRNETLLANPRDVSFLTNLIQTVFYLRG
||||||||| ||||||||||||||||||||||||||||||||||||||||||||||||||
orf83a TSLLNAPAAALTKNSGRKGERSAGLSVNGTGDYRNETLLANPRDVSFLTNLIQTVFYLRG
130 140 150 160 170 180
180 190
orf83.pep IEVVPPXYADTDVFVTVDV
|||||| ||||||||||||
orf83a IEVVPPEYADTDVFVTVDVFGTVRSRTELHLYNAETLKAQTKLEYFAVDRDSRKLLIAPK
190 200 210 220 230 240
The complete length ORF83a nucleotide sequence <SEQ ID 315> is:
1 ATGAAAACCC TGCTCNTCCT CATCCCCCTC GTCCTCACAG CCTGCGGCAC
51 ACTGACCGGC ATACCCGCCC ACGGCGGCGG CAAACGCTTT GCCGTCGAAC
101 AAGAACTCGT CGCCGCATCG TCCCGCGCCG CCGTCAAAGA AATGGACTTG
151 TCCGCCCTGA AAGGACGCAA AGCCGCCCTT TACGTCTCCG TTATGGGCGA
201 CCAAGGTTCG GGCAACATAA GCGGCGGACG CTACTCTATC GACGCACTGA
251 TACGCGGCGG CTACCACAAC AACCCCGAAA GTGCCACCCA ATACAGCTAC
301 CCCGCCTACG ACACTACCGC CACCACCAAA TCCGACGCGC TCTCCAGCGT
351 AACCACTTCC ACATCGCTTT TGAACGCCCC CGCCGCCGCC CTGACGAAAA
401 ACAGCGGACG CAAAGGCGAA CGCTCCGCCG GACTGTCCGT CAACGGCACG
451 GGCGACTACC GCAACGAAAC CCTGCTCGCC AACCCCCGCG ACGTTTCCTT
501 CCTGACCAAC CTCATCCAAA CCGTCTTCTA CCTGCGCGGC ATCGAAGTCG
551 TACCGCCCGA ATACGCCGAC ACCGACGTAT TCGTAACCGT CGACGTATTC
601 GGCACCGTCC GCAGCCGCAC CGAACTGCAC CTCTACAACG CCGAAACCCT
651 TAAAGCCCAA ACCAAGCTCG AATATTTCGC CGTTGACCGC GACAGCCGGA
701 AACTGCTGAT TGCCCCTAAA ACCGCCGCCT ACGAATCCCA ATACCAAGAA
751 CAATACGCCC TCTGGATGGG ACCTTACAGC GTCGGCAAAA CCGTCAAAGC
801 CTCAGACCGC CTGATGGTCG ATTTCTCCGA CATCACCCCC TACGGCGACA
851 CAACCGCCCA AAACCGTCCC GACTTCAAAC AAAACAACGG TAAAAAACCC
901 GATGTCGGCA ACGAAGTCAT CCGCCGCCGC AAAGGAGGAT AA
This encodes a protein having amino acid sequence <SEQ ID 316>:
1 MKTLLXLIPL VLTACGTLTG IPAHGGGKRF AVEQELVAAS SRAAVKEMDL
51 SALKGRKAAL YVSVMGDQGS GNISGGRYSI DALIRGGYHN NPESATQYSY
101 PAYDTTATTK SDALSSVTTS TSLLNAPAAA LTKNSGRKGE RSAGLSVNGT
151 GDYRNETLLA NPRDVSFLTN LIQTVFYLRG IEVVPPEYAD TDVFVTVDVF
201 GTVRSRTELH LYNAETLKAQ TKLEYFAVDR DSRKLLIAPK TAAYESQYQE
251 QYALWMGPYS VGKTVKASDR LMVDFSDITP YGDTTAQNRP DFKQNNGKKP
301 DVGNEVIRRR KGG*
ORF83a and ORF83-1 show 98.4% identity in 313 aa overlap:
10 20 30 40 50 60
orf83a.pep MKTLLXLIPLVLTACGTLTGIPAHGGGKRFAVEQELVAASSRAAVKEMDLSALKGRKAAL
||||| ||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf83-1 MKTLLLLIPLVLTACGTLTGIPAHGGGKRFAVEQELVAASSRAAVKEMDLSALKGRKAAL
10 20 30 40 50 60
70 80 90 100 110 120
orf83a.pep YVSVMGDQGSGNISGGRYSIDALIRGGYHNNPESATQYSYPAYDTTATTKSDALSSVTTS
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf83-1 YVSVMGDQGSGNISGGRYSIDALIRGGYHNNPESATQYSYPAYDTTATTKSDALSSVTTS
70 80 90 100 110 120
130 140 150 160 170 180
orf83a.pep TSLLNAPAAALTKNSGRKGERSAGLSVNGTGDYRNETLLANPRDVSFLTNLIQTVFYLRG
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf83-1 TSLLNAPAAALTKNSGRKGERSAGLSVNGTGDYRNETLLANPRDVSFLTNLIQTVFYLRG
130 140 150 160 170 180
190 200 210 220 230 240
orf83a.pep IEVVPPEYADTDVFVTVDVFGTVRSRTELHLYNAETLKAQTKLEYFAVDRDSRKLLIAPK
|||||||||||||||||||||||||||||||||||||||||||||||||||||||||:||
orf83-1 IEVVPPEYADTDVFVTVDVFGTVRSRTELHLYNAETLKAQTKLEYFAVDRDSRKLLITPK
190 200 210 220 230 240
250 260 270 280 290 300
orf83a.pep TAAYESQYQEQYALWMGPYSVGKTVKASDRLMVDFSDITPYGDTTAQNRPDFKQNNGKKP
||||||||||||||| |||:|:||||||||||||||||||||||||||||||||||||||
orf83-1 TAAYESQYQEQYALWTGPYKVSKTVKASDRLMVDFSDITPYGDTTAQNRPDFKQNNGKKP
250 260 270 280 290 300
310
orf83a.pep DVGNEVIRRRKGGX
||||||||||||||
orf83-1 DVGNEVIRRRKGGX
310
Homology with a predicted ORF from N. gonorrhoeae
ORF83 shows 94.9% identity over a 197 aa overlap with a predicted ORF (ORF83.ng) from N. gonorrhoeae:
orf83.pep TLLLFIPLVLTXCGTLTGILAHGGGKRFAVEQELVAASSRAAVKEMDLSALKGRKAAX 58
||||:|||||| ||||||| ||||||||||||||||||||||||||||||||||||||
orf83ng MKTLLLLIPLVLTACGTLTGIPAHGGGKRFAVEQELVAASSRAAVKEMDLSALKGRKAAL 60
orf83.pep YVSVMGDQGSGNISGGRYSIDALIRGGYHNNPESATQYSYPAYDTTATTKSDALSSVTTS 118
||||||||||||||||||||||||||||||||:|||:||||||||||||||||||:||||
orf83ng YVSVMGDQGSGNISGGRYSIDALIRGGYHNNPDSATRYSYPAYDTTATTKSDALSGVTTS 120
orf83.pep TSLLNAPAAXLTKNSGRKGERSAGLSVNGTGDYRNETLLANPRDVSFLTNLIQTVFYLRG 178
||||||||| ||||:|||||||||||||||||||||||||||||||||||||||||||||
orf83ng TSLLNAPAAALTKNNGRKGERSAGLSVNGTGDYRNETLLANPRDVSFLTNLIQTVFYLRG 180
orf83.pep IEVVPPXYADTDVFVTVDV 197
|||||||||||||||||||
orf83ng IEVVPPEYADTDVFVTVDVFGTVRSRTELHLYNAETLKAQTKLEYFAVDRDSRKLLIAPK 240
The complete length ORF83ng nucleotide sequence <SEQ ID 317> is:
1 ATGAAAACCC TGCTCCTCCT CATCCCCCTC GTACTCACCG CCTGCGGCAC
51 ACTGACCGGC ATACCCGCCC ACGGCGGCGG CAAACGCTTT GCCGTCGAAC
101 AGGAACTCGT CGCCGCATCG TCCCGCGCCG CCGTCAAAGA AATGGACTTG
151 TCCGCCCTGA AAGGACGCAA AGCCGCCCTT TACGTCTCCG TTATGGGCGA
201 CCAAGGTTCG GGCAACATAA GCGGCGGACG CTACTCCATC GACGCACTGA
251 TACGCGGCGG CTACCACAAC AACCCCGACA GCGCCACCCG ATACAGCTAC
301 CCCGCCTATG ACACTACCGC CACCACCAAA TCCGACGCGC TCTCCGGCGT
351 AACCACTTCC ACATCGCTTT TGAACGCCCC CGCCGCCGCC CTGACGAAAA
401 ACAACGGACG CAAAGGCGAA CGCTCCGCCG GACTGTCCGT CAACGGCACG
451 GGCGACTACC GCAACGAAAC CCTGCTCGCC AACCCCCGCG ACGTTTCCTT
501 CCTGACCAAC CTCATCCAAA CCGTCTTCTA CCTGCGCGGC ATCGAAGTCG
551 TACCGCCCGA ATACGCCGAC ACCGACGTAT TCGTAACCGT CGACGTATTC
601 GGCACCGTCC GCAGCCGTAC CGAACTGCAC CTCTACAACG CCGAAACCCT
651 TAAAGCCCAA ACCAAGCTCG AATATTTCGC CGTCGACCGC GACAGCCGGA
701 AACTGCTGAT TGCCCCTAAA ACCGCCGCCT ACGAATCCCA ATACCAAGAA
751 CAATACGCCC TCTGGATGGG ACCTTACAGC GTCGGCAAAA CCGTCAAAGC
801 CTCAGACCGC CTGATGGTCG ATTTCTCCGA CATCACCCCC TACGGCGACA
851 CAACCGCCCA AAACCGTCCC GACTTCAAAC AAAACAACGG TAAAAACCCC
901 GATGTCGGCA ACGAAGTCAT CCGCCGCCGC AAAGGAGGAT AA
This encodes a protein having the amino acid sequence <SEQ ID 318>:
1 MKTLLLLIPL VLTACGTLTG IPAHGGGKRF AVEQELVAAS SRAAVKEMDL
51 SALKGRKAAL YVSVMGDQGS GNISGGRYSI DALIRGGYHN NPDSATRYSY
101 PAYDTTATTK SDALSGVTTS TSLLNAPAAA LTKNNGRKGE RSAGLSVNGT
151 GDYRNETLLA NPRDVSFLTN LIQTVFYLRG IEVVPPEYAD TDVFVTVDVF
201 GTVRSRTELH LYNAETLKAQ TKLEYFAVDR DSRKLLIAPK TAAYESQYQE
251 QYALWMGPYS VGKTVKASDR LMVDFSDITP YGDTTAQNRP DFKQNNGKNP
301 DVGNEVIRRR KGG*
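The relationship between the nucleotide listing <SEQ ID 317> and the protein listing <SEQ ID 318> can be checked by translating the open reading frame codon by codon. The following is a minimal sketch in Python, assuming the standard genetic code; the function name `translate` is illustrative and is not part of the original disclosure. Codons containing characters other than A, C, G or T are reported as X.

```python
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(dna: str) -> str:
    """Translate a DNA string in frame 1; unknown codons become 'X'."""
    seq = "".join(dna.split()).upper()  # drop the spaces between 10-base blocks
    codons = (seq[i:i + 3] for i in range(0, len(seq) - 2, 3))
    return "".join(CODON_TABLE.get(c, "X") for c in codons)


# First and last stretches of <SEQ ID 317> as listed above.
print(translate("ATGAAAACCC TGCTCCTCCT CATCCCCCTC"))
print(translate("GATGTCGGCA ACGAAGTCAT CCGCCGCCGC AAAGGAGGAT AA"))
```

The two calls reproduce the start (MKTLLLLIPL) and the end (DVGNEVIRRRKGG followed by the stop codon, printed as *) of the protein listing.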
ORF83ng and ORF83-1 show 97.1% identity over a 313 aa overlap:
10 20 30 40 50 60
orf83-1.pep MKTLLLLIPLVLTACGTLTGIPAHGGGKRFAVEQELVAASSRAAVKEMDLSALKGRKAAL
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf83ng MKTLLLLIPLVLTACGTLTGIPAHGGGKRFAVEQELVAASSRAAVKEMDLSALKGRKAAL
10 20 30 40 50 60
70 80 90 100 110 120
orf83-1.pep YVSVMGDQGSGNISGGRYSIDALIRGGYHNNPESATQYSYPAYDTTATTKSDALSSVTTS
||||||||||||||||||||||||||||||||:|||:||||||||||||||||||:||||
orf83ng YVSVMGDQGSGNISGGRYSIDALIRGGYHNNPDSATRYSYPAYDTTATTKSDALSGVTTS
70 80 90 100 110 120
130 140 150 160 170 180
orf83-1.pep TSLLNAPAAALTKNSGRKGERSAGLSVNGTGDYRNETLLANPRDVSFLTNLIQTVFYLRG
||||||||||||||:|||||||||||||||||||||||||||||||||||||||||||||
orf83ng TSLLNAPAAALTKNNGRKGERSAGLSVNGTGDYRNETLLANPRDVSFLTNLIQTVFYLRG
130 140 150 160 170 180
190 200 210 220 230 240
orf83-1.pep IEVVPPEYADTDVFVTVDVFGTVRSRTELHLYNAETLKAQTKLEYFAVDRDSRKLLITPK
|||||||||||||||||||||||||||||||||||||||||||||||||||||||||:||
orf83ng IEVVPPEYADTDVFVTVDVFGTVRSRTELHLYNAETLKAQTKLEYFAVDRDSRKLLIAPK
190 200 210 220 230 240
250 260 270 280 290 300
orf83-1.pep TAAYESQYQEQYALWTGPYKVSKTVKASDRLMVDFSDITPYGDTTAQNRPDFKQNNGKKP
||||||||||||||| |||:|:||||||||||||||||||||||||||||||||||||:|
orf83ng TAAYESQYQEQYALWMGPYSVGKTVKASDRLMVDFSDITPYGDTTAQNRPDFKQNNGKNP
250 260 270 280 290 300
310
orf83-1.pep DVGNEVIRRRKGGX
||||||||||||||
orf83ng DVGNEVIRRRKGGX
310
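Based on this analysis, including the presence of a putative ATP/GTP-binding site motif A (P-loop) in the gonococcal protein (double-underlined) and a putative prokaryotic membrane lipoprotein lipid attachment site (single-underlined), it is predicted that these proteins from N. meningitidis and N. gonorrhoeae, and their epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies.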
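The ATP/GTP-binding site motif A (P-loop) is commonly written as the PROSITE-style consensus [AG]-x(4)-G-K-[ST]. The text does not say which tool produced the annotation, so the following Python sketch simply assumes that consensus and scans a protein string for it with a regular expression; `find_p_loops` is a hypothetical helper name.

```python
import re

# PROSITE-style consensus [AG]-x(4)-G-K-[ST] as a regular expression.
# Assumption: this is the pattern behind the "motif A (P-loop)" annotation.
P_LOOP = re.compile(r"(?=([AG].{4}GK[ST]))")  # lookahead also finds overlapping hits


def find_p_loops(protein: str) -> list[tuple[int, str]]:
    """Return (1-based start, matched segment) for every P-loop candidate."""
    seq = "".join(protein.split()).upper()
    return [(m.start() + 1, m.group(1)) for m in P_LOOP.finditer(seq)]


# Residues 251-270 of ORF83ng, taken from the ORF83ng/ORF83-1 alignment above.
fragment = "QYALWMGPYSVGKTVKASDR"
for start, hit in find_p_loops(fragment):
    print(251 + start - 1, hit)
```

For this fragment the scan reports a single candidate, GPYSVGKT, beginning at residue 257 of ORF83ng.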
Example 38
The following DNA sequence, believed to be complete, was identified in N. meningitidis <SEQ ID 319>:
1 ATGGCAGAGA TCTGTTTGAT AACCGGCACG CCCGGTTCAG GGAAAACATT
51 AAAAATGGTT TCCATGATGG CGAATGATGA AATGTTTAAG CCTGATGAAA
101 AAGCCATACG CCGTAAAGTA TTTACGAACA TAAAAGGCTT GAAAATACCG
151 CACACCTACA TAGAAACGGA CGCAAAAAAG CTGCCGAAAT CGACAGATGA
201 GCAGCTTTCG GCGCATGATA TGTACGAATG GATAAAGAAG CCCGAAAATA
251 TCGGGTCTAT TGTCATTGTA GATGAAGCTC AAGACGTATG GCCGGCACGC
301 TCGGCAGGTT CAAAAATCCC TGAAAATGTC CAATGGCTGA ATACGCACAG
351 ACATCAGGGC ATTGATATAT TTGTTTTGAC TCAAGGTCCT AAGCTTCTAG
401 ATCAAAATCT TAGAACGCTT GTACGGAAAC ATTACCACAT CGCTTCAAAC
451 AAGATGGGTA TGCGTACGCT TTTAGAATGG AAAATATGCG CGGACGATCC
501 CGTAAAAATG GCATCAAGCG CATTCTCCAG TATCTATACA CTGGATAAAA
551 AAGTTTATGA CTTGTAysrr TmmGCGGAAG TTCATACCGT AAATAAGGTC
601 AAGCGGTCAA AGTGGTTTTA CACTCTGCCa GTAATAGTAT TGCTGATTCC
651 CGTGTTTGTC GGCCTGTCCT ATAAAATGTT GagCaGTTAC GGAAAAAAAC
701 aGGAAGAACC CGCAGCACAA GAATCGGCGG CAACAGAACA GCAGGCAGTA
751 CTTCCGGATA AAACAGAAGG CGAGCCGGTA AATAACGGCA ACCTTACCGC
801 AGATATGTTT GTTCCGACAT TGTCCGAaAA ACCCGrAAGC AAGCcgaTTT
851 ATAACGGTGT AAGGCAGGTA AGAACCTTTG AATATATAGC AGGCTGTATA
901 GAAGGCGGAA GAACCGGATG CGCCTGCTAT TCGCaTCAAG GGACGGCATt
1001 CCGTTTAACC CaTACAAAGA AGAAAGCCAA GGGCAGGAAG TTCAGCAAAG
1101 CCGTAGCAGA ACCTAATGTA CGATAATTGG GAAGAACGCG GGAAACCGTT
This corresponds to the amino acid sequence <SEQ ID 320; ORF84>:
1 MAEICLITGT PGSGKTLKMV SMMANDEMFK PDEKAIRRKV FTNIKGLKIP
51 HTYIETDAKK LPKSTDEQLS AHDMYEWIKK PENIGSIVIV DEAQDVWPAR
101 SAGSKIPENV QWLNTHRHQG IDIFVLTQGP KLLDQNLRTL VRKHYHIASN
151 KMGMRTLLEW KICADDPVKM ASSAFSSIYT LDKKVYDLYX XAEVHTVNKV
201 KRSKWFYTLP VIVLLIPVFV GLSYKMLSSY GKKQEEPAAQ ESAATEQQAV
251 LPDKTEGEPV NNGNLTADMF VPTLSEKPXS KPIYNGVRQV RTFEYIAGCI
301 EGGRTGCACY SHQGTALKEV TELMCKDYVK NGLPFNPYKE ESQGQEVQQS
351 AQQHSDRAQV ATLGGKPXQN LMYDNWEERG KPFEGIGGGV VGSAN*
Further work revealed the complete nucleotide sequence <SEQ ID 321>:
1 ATGGCAGAGA TCTGTTTGAT AACCGGCACG CCCGGTTCAG GGAAAACATT
51 AAAAATGGTT TCCATGATGG CGAATGATGA AATGTTTAAG CCTGATGAAA
101 ACGGCATACG CCGTAAAGTA TTTACGAACA TAAAAGGCTT GAAAATACCG
151 CACACCTACA TAGAAACGGA CGCAAAAAAG CTGCCGAAAT CGACAGATGA
201 GCAGCTTTCG GCGCATGATA TGTACGAATG GATAAAGAAG CCCGAAAATA
251 TCGGGTCTAT TGTCATTGTA GATGAAGCTC AAGACGTATG GCCGGCACGC
301 TCGGCAGGTT CAAAAATCCC TGAAAATGTC CAATGGCTGA ATACGCACAG
351 ACATCAGGGC ATTGATATAT TTGTTTTGAC TCAAGGTCCT AAGCTTCTAG
401 ATCAAAATCT TAGAACGCTT GTACGGAAAC ATTACCACAT CGCTTCAAAC
451 AAGATGGGTA TGCGTACGCT TTTAGAATGG AAAATATGCG CGGACGATCC
501 CGTAAAAATG GCATCAAGCG CATTCTCCAG TATCTATACA CTGGATAAAA
551 AAGTTTATGA CTTGTACGAA TCAGCGGAAG TTCATACCGT AAATAAGGTC
601 AAGCGGTCAA AGTGGTTTTA CACTCTGCCA GTAATAGTAT TGCTGATTCC
651 CGTGTTTGTC GGCCTGTCCT ATAAAATGTT GAGCAGTTAC GGAAAAAAAC
701 AGGAAGAACC CGCAGCACAA GAATCGGCGG CAACAGAACA GCAGGCAGTA
751 CTTCCGGATA AAACAGAAGG CGAGCCGGTA AATAACGGCA ACCTTACCGC
801 AGATATGTTT GTTCCGACAT TGTCCGAAAA ACCCGAAAGC AAGCCGATTT
851 ATAACGGTGT AAGGCAGGTA AGAACCTTTG AATATATAGC AGGCTGTATA
901 GAAGGCGGAA GAACCGGATG CGCCTGCTAT TCGCATCAAG GGACGGCATT
951 GAAAGAAGTG ACGGAGTTGA TGTGCAAGGA CTATGTAAAA AACGGCTTGC
1001 CGTTTAACCC ATACAAAGAA GAAAGCCAAG GGCAGGAAGT TCAGCAAAGC
1051 GCGCAGCAAC ATTCGGACAG GGCGCAAGTT GCCACATTGG GCGGAAAACC
1101 GTAGCAGAAC CTAATGTACG ATAATTGGGA AGAACGCGGG AAACCGTTTG
1151 AAGGAATCGG CGGGGGCGTG GTCGGATCGG CAAACTGA
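The listings in this document are printed with a leading position counter followed by blocks of ten residues. Before a listing can be compared or translated, it has to be joined into one continuous string. The sketch below shows one way to do this in Python; `parse_listing` is an illustrative name, and the parser assumes only that each line may start with a purely numeric counter.

```python
def parse_listing(text: str) -> str:
    """Join a numbered, 10-per-block sequence listing into one string."""
    parts = []
    for line in text.splitlines():
        tokens = line.split()
        if tokens and tokens[0].isdigit():
            tokens = tokens[1:]  # drop the leading position counter
        parts.extend(tokens)
    return "".join(parts)


# Last two lines of <SEQ ID 321> as printed above.
tail = """\
1101 GTAGCAGAAC CTAATGTACG ATAATTGGGA AGAACGCGGG AAACCGTTTG
1151 AAGGAATCGG CGGGGGCGTG GTCGGATCGG CAAACTGA
"""
seq = parse_listing(tail)
print(len(seq), seq[-3:])
```

Because the last line starts at position 1151 and holds 38 bases, the full listing is 1188 nucleotides long, that is, 396 codons including the terminal TGA.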
This corresponds to the amino acid sequence <SEQ ID 322; ORF84-1>:
1 MAEICLITGT PGSGKTLKMV SMMANDEMFK PDENGIRRKV FTNIKGLKIP
51 HTYIETDAKK LPKSTDEQLS AHDMYEWIKK PENIGSIVIV DEAQDVWPAR
101 SAGSKIPENV QWLNTHRHQG IDIFVLTQGP KLLDQNLRTL VRKHYHIASN
151 KMGMRTLLEW KICADDPVKM ASSAFSSIYT LDKKVYDLYE SAEVHTVNKV
201 KRSKWFYTLP VIVLLIPVFV GLSYKMLSSY GKKQEEPAAQ ESAATEQQAV
251 LPDKTEGEPV NNGNLTADMF VPTLSEKPES KPIYNGVRQV RTFEYIAGCI
301 EGGRTGCACY SHQGTALKEV TELMCKDYVK NGLPFNPYKE ESQGQEVQQS
351 AQQHSDRAQV ATLGGKP*QN LMYDNWEERG KPFEGIGGGV VGSAN*
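Computer analysis of this amino acid sequence gave the following results:
Homology with a predicted ORF from N. meningitidis (strain A)
ORF84 shows 93.9% identity over a 395 aa overlap with an ORF (ORF84a) from strain A of N. meningitidis: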
10 20 30 40 50 60
orf84.pep MAEICLITGTPGSGKTLKMVSMMANDEMFKPDEKAIRRKVFTNIKGLKIPHTYIETDAKK
|||||||||||||||||||||||||||||||||::|||||||||||||||||||||||||
orf84a MAEICLITGTPGSGKTLKMVSMMANDEMFKPDENGIRRKVFTNIKGLKIPHTYIETDAKK
10 20 30 40 50 60
70 80 90 100 110 120
orf84.pep LPKSTDEQLSAHDMYEWIKKPENIGSIVIVDEAQDVWPARSAGSKIPENVQWLNTHRHQG
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf84a LPKSTDEQLSAHDMYEWIKKPENIGSIVIVDEAQDVWPARSAGSKIPENVQWLNTHRHQG
70 80 90 100 110 120
130 140 150 160 170 180
orf84.pep IDIFVLTQGPKLLDQNLRTLVRKHYHIASNKMGMRTLLEWKICADDPVKMASSAFSSIYT
||||||||| ||||||||||||||||||||||||||||||||||||||||||||||||||
orf84a IDIFVLTQGSKLLDQNLRTLVRKHYHIASNKMGMRTLLEWKICADDPVKMASSAFSSIYT
130 140 150 160 170 180
190 200 210 220 230 240
orf84.pep LDKKVYDLYXXAEVHTVNKVKRSKWFYTLPVIVLLIPVFVGLSYKMLSSYGKKQEEPAAQ
||||||||| |||||||||||||||||||||:|||||||||||||||||||||||||||
orf84a LDKKVYDLYESAEVHTVNKVKRSKWFYTLPVIILLIPVFVGLSYKMLSSYGKKQEEPAAQ
190 200 210 220 230 240
250 260 270 280 290 300
orf84.pep ESAATEQQAVLPDKTEGEPVNNGNLTADMFVPTLSEKPXSKPIYNGVRQVRTFEYIAGCI
||||||:|||: |||||||||||||||||||||||||| ||||||||||||||||||||:
orf84a ESAATEHQAVFQDKTEGEPVNNGNLTADMFVPTLSEKPESKPIYNGVRQVRTFEYIAGCV
250 260 270 280 290 300
310 320 330 340 350 360
orf84.pep EGGRTGCACYSHQGTALKEVTELMCKDYVKNGLPFNPYKEESQGQEVQQSAQQHSDRAQV
|||||||:|||||||||||:|: |||||::||||||||||||||::|||| |:|||| ||
orf84a EGGRTGCTCYSHQGTALKEITKEMCKDYARNGLPFNPYKEESQGRDVQQSEQHHSDRPQV
310 320 330 340 350 360
370 380 390
orf84.pep ATLGGKPXQNLMYDNWEERGKPFEGIGGGVVGSANX
||||||| ||||||||:|||||||||||||||||||
orf84a ATLGGKPWQNLMYDNWQERGKPFEGIGGGVVGSANX
370 380 390
The complete length ORF84a nucleotide sequence <SEQ ID 323> is:
1 ATGGCAGAGA TCTGTTTGAT AACCGGCACG CCCGGTTCAG GGAAAACATT
51 AAAAATGGTT TCCATGATGG CAAACGATGA AATGTTTAAG CCGGATGAAA
101 ACGGCATACG CCGTAAAGTA TTTACGAACA TCAAAGGCTT GAAGATACCG
151 CACACCTACA TAGAAACGGA CGCGAAAAAG CTGCCGAAAT CGACAGATGA
201 GCAGCTTTCG GCGCATGATA TGTACGAATG GATAAAGAAG CCCGAAAATA
251 TCGGGTCTAT TGTCATTGTA GATGAAGCTC AAGACGTATG GCCGGCACGC
301 TCGGCAGGTT CAAAAATCCC TGAAAATGTC CAATGGCTGA ATACGCACAG
351 ACATCAGGGC ATTGATATAT TTGTTTTGAC TCAAGGCTCT AAGCTTCTAG
401 ATCAAAATCT TAGAACGCTT GTACGGAAAC ATTACCACAT CGCTTCAAAC
451 AAGATGGGTA TGCGTACGCT TTTAGAATGG AAAATATGCG CGGACGATCC
501 CGTAAAAATG GCATCAAGCG CATTCTCCAG TATCTATACA CTGGATAAAA
551 AAGTTTATGA CTTGTACGAA TCAGCGGAAG TTCATACCGT AAATAAGGTC
601 AAGCGGTCAA AATGGTTTTA TACTCTGCCA GTAATAATAT TGCTGATTCC
651 CGTTTTTGTC GGCCTGTCCT ATAAAATGTT AAGTAGTTAT GGAAAAAAAC
701 AGGAAGAACC CGCAGCACAA GAATCGGCGG CAACAGAACA TCAGGCAGTA
751 TTTCAGGATA AAACAGAAGG CGAGCCGGTA AACAACGGTA ACCTTACCGC
801 AGATATGTTT GTTCCGACAT TGTCCGAAAA ACCCGAAAGC AAGCCGATTT
851 ATAACGGTGT AAGGCAGGTA AGAACCTTTG AATATATAGC AGGCTGTGTA
901 GAAGGCGGAA GAACCGGATG CACATGCTAT TCGCATCAAG GGACGGCATT
951 GAAAGAAATT ACAAAGGAAA TGTGCAAGGA TTACGCAAGA AACGGATTGC
1001 CGTTTAACCC ATATAAAGAA GAAAGCCAAG GGCGGGATGT CCAGCAAAGT
1051 GAGCAGCACC ATTCGGACAG ACCGCAAGTT GCCACGTTGG GCGGAAAGCC
1101 GTGGCAAAAT CTTATGTATG ATAATTGGCA GGAGCGCGGA AAACCGTTTG
1151 AAGGAATCGG CGGGGGCGTG GTCGGATCGG CAAACTGA
This encodes a protein having the amino acid sequence <SEQ ID 324>:
1 MAEICLITGT PGSGKTLKMV SMMANDEMFK PDENGIRRKV FTNIKGLKIP
51 HTYIETDAKK LPKSTDEQLS AHDMYEWIKK PENIGSIVIV DEAQDVWPAR
101 SAGSKIPENV QWLNTHRHQG IDIFVLTQGS KLLDQNLRTL VRKHYHIASN
151 KMGMRTLLEW KICADDPVKM ASSAFSSIYT LDKKVYDLYE SAEVHTVNKV
201 KRSKWFYTLP VIILLIPVFV GLSYKMLSSY GKKQEEPAAQ ESAATEHQAV
251 FQDKTEGEPV NNGNLTADMF VPTLSEKPES KPIYNGVRQV RTFEYIAGCV
301 EGGRTGCTCY SHQGTALKEI TKEMCKDYAR NGLPFNPYKE ESQGRDVQQS
351 EQHHSDRPQV ATLGGKPWQN LMYDNWQERG KPFEGIGGGV VGSAN*
ORF84a and ORF84-1 show 95.2% identity over a 395 aa overlap:
10 20 30 40 50 60
orf84a.pep MAEICLITGTPGSGKTLKMVSMMANDEMFKPDENGIRRKVFTNIKGLKIPHTYIETDAKK
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf84-1 MAEICLITGTPGSGKTLKMVSMMANDEMFKPDENGIRRKVFTNIKGLKIPHTYIETDAKK
10 20 30 40 50 60
70 80 90 100 110 120
orf84a.pep LPKSTDEQLSAHDMYEWIKKPENIGSIVIVDEAQDVWPARSAGSKIPENVQWLNTHRHQG
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf84-1 LPKSTDEQLSAHDMYEWIKKPENIGSIVIVDEAQDVWPARSAGSKIPENVQWLNTHRHQG
70 80 90 100 110 120
130 140 150 160 170 180
orf84a.pep IDIFVLTQGSKLLDQNLRTLVRKHYHIASNKMGMRTLLEWKICADDPVKMASSAFSSIYT
||||||||| ||||||||||||||||||||||||||||||||||||||||||||||||||
orf84-1 IDIFVLTQGPKLLDQNLRTLVRKHYHIASNKMGMRTLLEWKICADDPVKMASSAFSSIYT
130 140 150 160 170 180
190 200 210 220 230 240
orf84a.pep LDKKVYDLYESAEVHTVNKVKRSKWFYTLPVIILLIPVFVGLSYKMLSSYGKKQEEPAAQ
||||||||||||||||||||||||||||||||:|||||||||||||||||||||||||||
orf84-1 LDKKVYDLYESAEVHTVNKVKRSKWFYTLPVIVLLIPVFVGLSYKMLSSYGKKQEEPAAQ
190 200 210 220 230 240
250 260 270 280 290 300
orf84a.pep ESAATEHQAVFQDKTEGEPVNNGNLTADMFVPTLSEKPESKPIYNGVRQVRTFEYIAGCV
||||||:|||: |||||||||||||||||||||||||||||||||||||||||||||||:
orf84-1 ESAATEQQAVLPDKTEGEPVNNGNLTADMFVPTLSEKPESKPIYNGVRQVRTFEYIAGCI
250 260 270 280 290 300
310 320 330 340 350 360
orf84a.pep EGGRTGCTCYSHQGTALKEITKEMCKDYARNGLPFNPYKEESQGRDVQQSEQHHSDRPQV
|||||||:|||||||||||:|:|||||::||||||||||||||::|||| |:|||| ||
orf84-1 EGGRTGCACYSHQGTALKEVTELMCKDYVKNGLPFNPYKEESQGQEVQQSAQQHSDRAQV
310 320 330 340 350 360
370 380 390
orf84a.pep ATLGGKPWQNLMYDNWQERGKPFEGIGGGVVGSANX
||||||| ||||||||:|||||||||||||||||||
orf84-1 ATLGGKPXQNLMYDNWEERGKPFEGIGGGVVGSANX
370 380 390
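The identity percentages quoted for each alignment are the fraction of aligned positions that carry the same residue in both sequences. The following Python sketch shows that calculation for a pair of equal-length aligned rows; `identity` is an illustrative helper, and gap handling is deliberately left out. A hand count of the rows above suggests 19 differing positions, so the quoted 95.2% appears to correspond to 376 identical residues out of 395.

```python
def identity(a: str, b: str) -> tuple[int, int, float]:
    """Return (identical positions, aligned length, percent identity)."""
    if len(a) != len(b):
        raise ValueError("aligned rows must have the same length")
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return matches, len(a), 100.0 * matches / len(a)


# Residues 181-240 of the ORF84a / ORF84-1 alignment above.
a = "LDKKVYDLYESAEVHTVNKVKRSKWFYTLPVIILLIPVFVGLSYKMLSSYGKKQEEPAAQ"
b = "LDKKVYDLYESAEVHTVNKVKRSKWFYTLPVIVLLIPVFVGLSYKMLSSYGKKQEEPAAQ"
matches, length, percent = identity(a, b)
print(matches, length, round(percent, 1))

# Whole alignment, using the hand-counted totals (an assumption, see above).
print(round(100.0 * 376 / 395, 1))
```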
Homology with a predicted ORF from N. gonorrhoeae
ORF84 shows 94.2% identity over a 395 aa overlap with a predicted ORF (ORF84.ng) from N. gonorrhoeae:
orf84.pep MAEICLITGTPGSGKTLKMVSMMANDEMFKPDEKAIRRKVFTNIKGLKIPHTYIETDAKK 60
|||||||||||||||||||||||||||||||||:::||||||||||||||||:|||||||
orf84ng MAEICLITGTPGSGKTLKMVSMMANDEMFKPDENGVRRKVFTNIKGLKIPHTHIETDAKK 60
orf84.pep LPKSTDEQLSAHDMYEWIKKPENIGSIVIVDEAQDVWPARSAGSKIPENVQWLNTHRHQG 120
|||||||||||||||||||||||:|:||||||||||||||||||||||||||||||||||
orf84ng LPKSTDEQLSAHDMYEWIKKPENVGAIVIVDEAQDVWPARSAGSKIPENVQWLNTHRHQG 120
orf84.pep IDIFVLTQGPKLLDQNLRTLVRKHYHIASNKMGMRTLLEWKICADDPVKMASSAFSSIYT 180
|||||||||||||||||||||::|||||:||||:|||||||:||||||||||||||||||
orf84ng IDIFVLTQGPKLLDQNLRTLVKRHYHIAANKMGLRTLLEWKVCADDPVKMASSAFSSIYT 180
orf84.pep LDKKVYDLYXXAEVHTVNKVKRSKWFYTLPVIVLLIPVFVGLSYKMLSSYGKKQEEPAAQ 240
||||||||| ||:|||||||||||||:||||:||||:|||||||||:||||||||||||
orf84ng LDKKVYDLYESAEIHTVNKVKRSKWFYALPVIILLIPLFVGLSYKMLGSYGKKQEEPAAQ 240
orf84.pep ESAATEQQAVLPDKTEGEPVNNGNLTADMFVPTLSEKPXSKPIYNGVRQVRTFEYIAGCI 300
|||||||||||||||||| ||||||||||||||| ||| |||||||||||||||||||||
orf84ng ESAATEQQAVLPDKTEGESVNNGNLTADMFVPTLPEKPESKPIYNGVRQVRTFEYIAGCI 300
orf84.pep EGGRTGCACYSHQGTALKEVTELMCKDYVKNGLPFNPYKEESQGQEVQQSAQQHSDRAQV 360
|||||||:||||||||||||||||||||||||||||||||||||||||||||||||||||
orf84ng EGGRTGCTCYSHQGTALKEVTELMCKDYVKNGLPFNPYKEESQGQEVQQSAQQHSDRAQV 360
orf84.pep ATLGGKPXQNLMYDNWEERGKPFEGIGGGVVGSAN 395
||||||| |||||||||||||||||||||||||||
orf84ng ATLGGKPQQNLMYDNWEERGKPFEGIGGGVVGSAN 395
The complete length ORF84ng nucleotide sequence <SEQ ID 325> is:
1 ATGGCAGAAA TCTGTTTGAT AACCGGCACG CCCGGTTCAG GGAAAACATT
51 AAAAATGGTT TCCATGATGG CAAACGATGA AATGTTTAAG CCAGATGAAA
101 ACGGCGTACG CCGTAAAGTA TTTACGAACA TCAAAGGTTT GAAGATACCG
151 CACACCCACA TAGAAACAGA CGCAAAGAAG CTGCCGAAAT CAACCGATGA
201 ACAGCTTTCG GCGCATGATA TGTATGAATG GATCAAGAAG CCTGAAAacg
251 tcggcgCAAT CGTTATTGTC GATGAGGCGC AAGACGTATG GCCCGCACGC
301 TccgCAGGTT CGAAAATCCC CGAAAACGTC CAATGGCTGA ACACACACAG
351 GCATCAGGGC ATAGATATAT TTGTATTGAC ACAAGGTCCT AAACTCTTAG
401 ATCAGAACTT GCGAACATTG GTTAAAAGAC ATTACCACAT TGCGGCCAAC
451 AAAATGGGTT TGCGTACCCT GCTTGAATGG AAAGTATGCG CGGATGACCC
501 GGTAAAAATG GCATCAAGTG CATTTTCCAG TATCTACACA CTGGATAAAA
551 AAGTTTATGA CTTGTACGAA TCCGCAGAAA TTCACACGGT AAACAAAGTC
601 AAGCGTTCAA AATGGTTTTA TGCATTGCCC GTCATCATAT TATTGATTCC
651 GCTATTTGTC GGTTTGTCTT ACAAAATGTT GGGCAGTTAC GGAAAAAAAC
701 AGGAAGAACC CGCAGCACAA GAATCGGCGG CAACAGAACA GCAGGCAGTA
751 CTTCCGGATA AAACAGAAGG AGAATCGGTG AATAACGGAA ACCTTACGGC
801 AGATATGTTT GTTCCGACAT TGCCCGAAAA ACCCGAAAGC AAGCCGATTT
851 ATAACGGTGT AAGGCAGGTA AGGACCTTTG AATATATAGC AGGCTGTATA
901 GAAGGCGGAA GAACCGGATG CACCTGCTAT TCGCATCAAG GGACGGCATT
951 GAAAGAAGTG ACGGAGTTGA TGTGCAAGGA CTATGTAAAA AACGGCTTGC
1001 CGTTTAACCC ATACAAAGAA GAAAGCCAAG GGCAGGAAGT TCAGCAAAGC
1051 GCGCAGCAAC ATTCGGACAG GGCGCAAGTT GCCACCTTGG GCGGAAAACC
1101 GCAGCAGAAC CTAATGTACG ACAATTGGGA AGAACGCGGG AAACCGTTTG
1151 AAGGAATCGG CGGGGGCGTG GTCGGATCGG CAAACTGA
This encodes a protein having the amino acid sequence <SEQ ID 326>:
1 MAEICLITGT PGSGKTLKMV SMMANDEMFK PDENGVRRKV FTNIKGLKIP
51 HTHIETDAKK LPKSTDEQLS AHDMYEWIKK PENVGAIVIV DEAQDVWPAR
101 SAGSKIPENV QWLNTHRHQG IDIFVLTQGP KLLDQNLRTL VKRHYHIAAN
151 KMGLRTLLEW KVCADDPVKM ASSAFSSIYT LDKKVYDLYE SAEIHTVNKV
201 KRSKWFYALP VIILLIPLFV GLSYKMLGSY GKKQEEPAAQ ESAATEQQAV
251 LPDKTEGESV NNGNLTADMF VPTLPEKPES KPIYNGVRQV RTFEYIAGCI
301 EGGRTGCTCY SHQGTALKEV TELMCKDYVK NGLPFNPYKE ESQGQEVQQS
351 AQQHSDRAQV ATLGGKPQQN LMYDNWEERG KPFEGIGGGV VGSAN*
ORF84ng and ORF84-1 show 95.4% identity over a 395 aa overlap:
10 20 30 40 50 60
orf84-1.pep MAEICLITGTPGSGKTLKMVSMMANDEMFKPDENGIRRKVFTNIKGLKIPHTYIETDAKK
|||||||||||||||||||||||||||||||||||:||||||||||||||||:|||||||
orf84ng MAEICLITGTPGSGKTLKMVSMMANDEMFKPDENGVRRKVFTNIKGLKIPHTHIETDAKK
10 20 30 40 50 60
70 80 90 100 110 120
orf84-1.pep LPKSTDEQLSAHDMYEWIKKPENIGSIVIVDEAQDVWPARSAGSKIPENVQWLNTHRHQG
|||||||||||||||||||||||:|:||||||||||||||||||||||||||||||||||
orf84ng LPKSTDEQLSAHDMYEWIKKPENVGAIVIVDEAQDVWPARSAGSKIPENVQWLNTHRHQG
70 80 90 100 110 120
130 140 150 160 170 180
orf84-1.pep IDIFVLTQGPKLLDQNLRTLVRKHYHIASNKMGMRTLLEWKICADDPVKMASSAFSSIYT
|||||||||||||||||||||::|||||:||||:|||||||:||||||||||||||||||
orf84ng IDIFVLTQGPKLLDQNLRTLVKRHYHIAANKMGLRTLLEWKVCADDPVKMASSAFSSIYT
130 140 150 160 170 180
190 200 210 220 230 240
orf84-1.pep LDKKVYDLYESAEVHTVNKVKRSKWFYTLPVIVLLIPVFVGLSYKMLSSYGKKQEEPAAQ
|||||||||||||:|||||||||||||:||||:||||:|||||||||:||||||||||||
orf84ng LDKKVYDLYESAEIHTVNKVKRSKWFYALPVIILLIPLFVGLSYKMLGSYGKKQEEPAAQ
190 200 210 220 230 240
250 260 270 280 290 300
orf84-1.pep ESAATEQQAVLPDKTEGEPVNNGNLTADMFVPTLSEKPESKPIYNGVRQVRTFEYIAGCI
|||||||||||||||||| ||||||||||||||| |||||||||||||||||||||||||
orf84ng ESAATEQQAVLPDKTEGESVNNGNLTADMFVPTLPEKPESKPIYNGVRQVRTFEYIAGCI
250 260 270 280 290 300
310 320 330 340 350 360
orf84-1.pep EGGRTGCACYSHQGTALKEVTELMCKDYVKNGLPFNPYKEESQGQEVQQSAQQHSDRAQV
|||||||:||||||||||||||||||||||||||||||||||||||||||||||||||||
orf84ng EGGRTGCTCYSHQGTALKEVTELMCKDYVKNGLPFNPYKEESQGQEVQQSAQQHSDRAQV
310 320 330 340 350 360
370 380 390
orf84-1.pep ATLGGKPXQNLMYDNWEERGKPFEGIGGGVVGSANX
||||||| ||||||||||||||||||||||||||||
orf84ng ATLGGKPQQNLMYDNWEERGKPFEGIGGGVVGSANX
370 380 390
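Based on this analysis, including the presence of a putative transmembrane domain (single-underlined) and a putative ATP/GTP-binding site motif A (P-loop, double-underlined) in the gonococcal protein, it is predicted that these proteins from N. meningitidis and N. gonorrhoeae, and their epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies.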
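The text does not state how the putative transmembrane domain was predicted. One widely used way to look for candidate membrane-spanning stretches is a sliding-window average of Kyte-Doolittle hydropathy values. The Python sketch below illustrates that general technique, not the method used for this analysis; the window length of 10 and the helper names are assumptions chosen for illustration.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def mean_hydropathy(segment: str) -> float:
    """Average Kyte-Doolittle score of a peptide segment."""
    return sum(KD[aa] for aa in segment) / len(segment)


def hydropathy_profile(protein: str, window: int = 10) -> list[float]:
    """Mean hydropathy of every window of the given (assumed) length."""
    return [mean_hydropathy(protein[i:i + window])
            for i in range(len(protein) - window + 1)]


# Residues 201-250 of ORF84-1 as listed above.
region = "KRSKWFYTLPVIVLLIPVFVGLSYKMLSSYGKKQEEPAAQESAATEQQAV"
profile = hydropathy_profile(region)
best = max(range(len(profile)), key=profile.__getitem__)
print(201 + best, region[best:best + 10], round(profile[best], 2))
```

Strongly positive windows, such as the stretch around VIVLLIPVFV, are the kind of region that would be flagged as a transmembrane candidate, whereas the charged stretch that follows scores negative.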
Example 39
The following partial DNA sequence was identified in N. meningitidis <SEQ ID 327>:
1 GTGGTTTTCC TGAATGCCGA CAACGGGATA TTGGTTCAGG ACTTGCCTTT
51 TGAAGTCAAA CTGAAAAAAT TCCATATCGA TTTTTACAAT ACGGGTATGC
101 CGCGTGATTT CGCCAGCGAT ATTGAAGTGA CGGACAAGGC AACCGGTGAG
151 AAACTCGAGC GCACCATCCG CGTGAACCAT CCTTTGACCT TGCACGGCAT
201 CACGATTTAT CAGGCGAGTT TTGCCGACGG CGGTTCGGAT TTGACATTCA
251 AGGCGTGGAA TTTGGGTGAT GCTTCGCGCG AGCCTGTCGT GTTGAAGGCA
301 ACATCCATAC ACCAGTTTCC GTTGGAAATT GGCAAACACA AATATCGTCT
351 TGAGTTCGAT CAGTTCACTT CTATGAATGT GGAGGACATG AGCGAGGGCG
401 CGGAACGGGA AAAAAGCCTG AAATCCACGC TGCCCGATGT CCGCGCCGTT
451 ACTCAGGAAG GTCACAAATA CACCAAT... .......... .....TACCG
501 TATCCGTGAT GCGCCAGGCC AGGCGGTCGA ATATAAAAAC TATATGCTGC
551 CGGTTTTGCA GGAACAGGAT TATTTTTGGA TTACCGGCAC GCGCAGCGC.
601 TTGCAGCAGC AATACCGCTG GCTGCGTATC CCCTTGGACA AGCAGTTGAA
651 AGCGGACACC TTTATGGCAT TGCGTGAGTT TTTGAAAGAT GGGGAAGGGC
701 GCAAACGTCT .GTTGCCGAC GCAACCAAAG GCGCACCTGC CGAAATCCGC
751 GAACAATTCA TGCTGGCTGC GGAAAACACG CTGAACATCT TTGCACAAAA
801 AGGCTATTTG GGATTGGACG AATTTATTAC GTCCAATATC CCGAAAGAGC
851 AGCAGGATAA GATGCAGGGC TATTTCTACG AAATGCTTTA CGGCGTGATG
901 AACGCTGCTT TGGATGAAAC CAT.ACCCGG TACGGCTTGC CCGAATGGCA
951 GCAGGATGAA GCGCGGAATC GTTTCCTGCT GCACAGTATG GATGCGTACA
1001 CGGGTTTGAC CGAATATCCC GCGCCTATGC TGCTGCAACT TGATGGGTTT
1051 TCCGAGGTGC GTTCGTCGGG TTTGCAGATG ACCCGTTCCC C.GGTCCGCT
1101 TTTGGTCTAT CTC...
This corresponds to the amino acid sequence <SEQ ID 328; ORF88>:
1 MVFLNADNGI LVQDLPFEVK LKKFHIDFYN TGMPRDFASD IEVTDKATGE
51 KLERTIRVNH PLTLHGITIY QASFADGGSD LTFKAWNLGD ASREPVVLKA
101 TSIHQFPLEI GKHKYRLEFD QFTSMNVEDM SEGAEREKSL KSTLPDVRAV
151 TQEGHKYTNX XXXXXYRIRD APGQAVEYKN YMLPVLQEQD YFWITGTRSX
201 LQQQYRWLRI PLDKQLKADT FMALREFLKD GEGRKRXVAD ATKGAPAEIR
251 EQFMLAAENT LNIFAQKGYL GLDEFITSNI PKEQQDKMQG YFYEMLYGVM
301 NAALDETXTR YGLPEWQQDE ARNRFLLHSM DAYTGLTEYP APMLLQLDGF
351 SEVRSSGLQM TRSXGPLLVY L...
Further work revealed the complete nucleotide sequence <SEQ ID 329>:
1 ATGAGTAAAT CCCGTAGATC TCCCCCACTT CTTTCCCGTC CGTGGTTCGC
51 TTTTTTCAGC TCCATGCGCT TTGCAGTCGC TTTGCTCAGT CTGCTGGGTA
101 TTGCATCGGT TATCGGTACG GTGTTGCAGC AAAACCAGCC GCAGACGGAT
151 TATTTGGTCA AATTCGGATC GTTTTGGGCG CAGATTTTTG GTTTTCTGGG
201 ACTGTATGAC GTCTATGCTT CGGCATGGTT TGTCGTTATC ATGATGTTTT
251 TGGTGGTTTC TACCAGTTTG TGCCTGATTC GCAATGTGCC GCCGTTCTGG
301 CGCGAAATGA AGTCTTTTCG GGAAAAGGTT AAAGAAAAAT CTCTGGCGGC
351 GATGCGCCAT TCTTCGCTGT TGGATGTAAA AATTGCGCCC GAGGTTGCCA
401 AACGTTATCT GGAAGTACAA GGTTTTCAGG GAAAAACCAT TAACCGTGAA
451 GACGGGTCGG TTCTGATTGC CGCCAAAAAA GGCACAATGA ACAAATGGGG
501 CTATATCTTT GCCCATGTTG CTTTGATTGT CATTTGCCTG GGCGGGTTGA
551 TAGACAGTAA CCTGCTGTTG AAACTGGGTA TGCTGACCGG TCGGATTGTT
601 CCGGACAATC AGGCGGTTTA TGCCAAGGAT TTCAAGCCCG AAAGTATTTT
651 GGGTGCGTCC AATCTCTCAT TTAGGGGCAA CGTCAATATT TCCGAGGGGC
701 AGAGTGCGGA TGTGGTTTTC CTGAATGCCG ACAACGGGAT ATTGGTTCAG
751 GACTTGCCTT TTGAAGTCAA ACTGAAAAAA TTCCATATCG ATTTTTACAA
801 TACGGGTATG CCGCGTGATT TCGCCAGCGA TATTGAAGTG ACGGACAAGG
851 CAACCGGTGA GAAACTCGAG CGCACCATCC GCGTGAACCA TCCTTTGACC
901 TTGCACGGCA TCACGATTTA TCAGGCGAGT TTTGCCGACG GCGGTTCGGA
951 TTTGACATTC AAGGCGTGGA ATTTGGGTGA TGCTTCGCGC GAGCCTGTCG
1001 TGTTGAAGGC AACATCCATA CACCAGTTTC CGTTGGAAAT TGGCAAACAC
1051 AAATATCGTC TTGAGTTCGA TCAGTTCACT TCTATGAATG TGGAGGACAT
1101 GAGCGAGGGC GCGGAACGGG AAAAAAGCCT GAAATCCACG CTGAACGATG
1151 TCCGCGCCGT TACTCAGGAA GGTAAAAAAT ACACCAATAT CGGCCCTTCC
1201 ATTGTTTACC GTATCCGTGA TGCGGCAGGG CAGGCGGTCG AATATAAAAA
1251 CTATATGCTG CCGGTTTTGC AGGAACAGGA TTATTTTTGG ATTACCGGCA
1301 CGCGCAGCGG CTTGCAGCAG CAATACCGCT GGCTGCGTAT CCCCTTGGAC
1351 AAGCAGTTGA AAGCGGACAC CTTTATGGCA TTGCGTGAGT TTTTGAAAGA
1401 TGGGGAAGGG CGCAAACGTC TGGTTGCCGA CGCAACCAAA GGCGCACCTG
1451 CCGAAATCCG CGAACAATTC ATGCTGGCTG CGGAAAACAC GCTGAACATC
1501 TTTGCACAAA AAGGCTATTT GGGATTGGAC GAATTTATTA CGTCCAATAT
1551 CCCGAAAGAG CAGCAGGATA AGATGCAGGG CTATTTCTAC GAAATGCTTT
1601 ACGGCGTGAT GAACGCTGCT TTGGATGAAA CCATACGCCG GTACGGCTTG
1651 CCCGAATGGC AGCAGGATGA AGCGCGGAAT CGTTTCCTGC TGCACAGTAT
1701 GGATGCGTAC ACGGGTTTGA CCGAATATCC CGCGCCTATG CTGCTGCAAC
1751 TTGATGGGTT TTCCGAGGTG CGTTCGTCGG GTTTGCAGAT GACCCGTTCC
1801 CCGGGTGCGC TTTTGGTCTA TCTCGGCTCG GTGCTGTTGG TATTGGGTAC
1851 GGTATTGATG TTTTATGTGC GCGAAAAACG GGCGTGGGTA TTGTTTTCAG
1901 ACGGCAAAAT CCGTTTTGCC ATGTCTTCGG CCCGCAGCGA ACGGGATTTG
1951 CAGAAGGAAT TTCCAAAACA CGTCGAGAGT CTGCAACGGC TCGGCAAGGA
2001 CTTGAATCAT GACTGA
This corresponds to the amino acid sequence <SEQ ID 330; ORF88-1>:
1 MSKSRRSPPL LSRPWFAFFS SMRFAVALLS LLGIASVIGT VLQQNQPQTD
51 YLVKFGSFWA QIFGFLGLYD VYASAWFVVI MMFLVVSTSL CLIRNVPPFW
101 REMKSFREKV KEKSLAAMRH SSLLDVKIAP EVAKRYLEVQ GFQGKTINRE
151 DGSVLIAAKK GTMNKWGYIF AHVALIVICL GGLIDSNLLL KLGMLTGRIV
201 PDNQAVYAKD FKPESILGAS NLSFRGNVNI SEGQSADVVF LNADNGILVQ
251 DLPFEVKLKK FHIDFYNTGM PRDFASDIEV TDKATGEKLE RTIRVNHPLT
301 LHGITIYQAS FADGGSDLTF KAWNLGDASR EPVVLKATSI HQFPLEIGKH
351 KYRLEFDQFT SMNVEDMSEG AEREKSLKST LNDVRAVTQE GKKYTNIGPS
401 IVYRIRDAAG QAVEYKNYML PVLQEQDYFW ITGTRSGLQQ QYRWLRIPLD
451 KQLKADTFMA LREFLKDGEG RKRLVADATK GAPAEIREQF MLAAENTLNI
501 FAQKGYLGLD EFITSNIPKE QQDKMQGYFY EMLYGVMNAA LDETIRRYGL
551 PEWQQDEARN RFLLHSMDAY TGLTEYPAPM LLQLDGFSEV RSSGLQMTRS
601 PGALLVYLGS VLLVLGTVLM FYVREKRAWV LFSDGKIRFA MSSARSERDL
651 QKEFPKHVES LQRLGKDLNH D*
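Computer analysis of this amino acid sequence gave the following results:
Homology with a predicted ORF from N. meningitidis (strain A)
ORF88 shows 95.7% identity over a 371 aa overlap with an ORF (ORF88a) from strain A of N. meningitidis: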
10 20 30
orf88.pep MVFLNADNGILVQDLPFEVKLKKFHIDFYN
:|||||||||||||||||||||||||||||
orf88a AKDFKPESILGASNLSFRGNVNISEGQSADVVFLNADNGILVQDLPFEVKLKKFHIDFYN
210 220 230 240 250 260
40 50 60 70 80 90
orf88.pep TGMPRDFASDIEVTDKATGEKLERTIRVNHPLTLHGITIYQASFADGGSDLTFKAWNLGD
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf88a TGMPRDFASDIEVTDKATGEKLERTIRVNHPLTLHGITIYQASFADGGSDLTFKAWNLGD
270 280 290 300 310 320
100 110 120 130 140 150
orf88.pep ASREPVVLKATSIHQFPLEIGKHKYRLEFDQFTSMNVEDMSEGAEREKSLKSTLPDVRAV
|||||||||||||||||||||||||||||||||||||||||||||||||||||| |||||
orf88a ASREPVVLKATSIHQFPLEIGKHKYRLEFDQFTSMNVEDMSEGAEREKSLKSTLNDVRAV
330 340 350 360 370 380
160 170 180 190 200 210
orf88.pep TQEGHKYTNXXXXXXYRIRDAPGQAVEYKNYMLPVLQEQDYFWITGTRSXLQQQYRWLRI
||||:|||| |||||| ||||||||||||||||||||||||||| ||||||||||
orf88a TQEGKKYTNIGPSIVYRIRDAAGQAVEYKNYMLPVLQEQDYFWITGTRSGLQQQYRWLRI
390 400 410 420 430 440
220 230 240 250 260 270
orf88.pep PLDKQLKADTFMALREFLKDGEGRKRXVADATKGAPAEIREQFMLAAENTLNIFAQKGYL
|||||||||||||||||||||||||| |||||||||||||||||||||||||||||||||
orf88a PLDKQLKADTFMALREFLKDGEGRKRLVADATKGAPAEIREQFMLAAENTLNIFAQKGYL
450 460 470 480 490 500
280 290 300 310 320 330
orf88.pep GLDEFITSNIPKEQQDKMQGYFYEMLYGVMNAALDETXTRYGLPEWQQDEARNRFLLHSM
||||||||||||||||||||||||||||||||||||| |||||||||||||||||||||
orf88a GLDEFITSNIPKEQQDKMQGYFYEMLYGVMNAALDETIRRYGLPEWQQDEARNRFLLHSM
510 520 530 540 550 560
340 350 360 370
orf88.pep DAYTGLTEYPAPMLLQLDGFSEVRSSGLQMTRSXGPLLVYL
||||||||||||||||||||||||||||||||| | |||||
orf88a DAYTGLTEYPAPMLLQLDGFSEVRSSGLQMTRSPGALLVYLGSVLLVLGTVLMFYVREKR
570 580 590 600 610 620
orf88a AWVLFSDGKIRFAMSSARSERDLQKEFPKHVESLQRLGKDLNHDX
630 640 650 660 670
The complete length ORF88a nucleotide sequence <SEQ ID 331> is:
1 ATGAGTAAAT CCCGTAGATC TCCCCCACTT CTTTCCCGTC CGTGGTTCGC
51 TTTTTTCAGC TCCATGCGCT TTGCGGTCGC TTTGCTCAGT CTGCTGGGTA
101 TTGCATCGGT TATCGGTACG GTGTTGCAGC AAAACCAGCC GCAGACGGAT
151 TATTTGGTCA AATTCGGATC GTTTTGGGCG CAGATTTTTG GTTTTCTGGG
201 ACTGTATGAC GTCTATGCTT CGGCATGGTT TGTCGTTATC ATGATGTTTT
251 TGGTGGTTTC TACCAGTTTG TGCCTGATTC GCAATGTGCC GCCGTTCTGG
301 CGCGAAATGA AGTCTTTTCG GGAAAAGGTT AAAGAAAAAT CTCTGGCGGC
351 GATGCGCCAT TCTTCGCTGT TGGATGTAAA AATTGCGCCC GAGGTTGCCA
401 AACGTTATCT GGAAGTACAA GGTTTTCAGG GAAAAACCAT TAACCGTGAA
451 GACGGGTCGG TTCTGATTGC CGCCAAAAAA GGCACAATGA ACAAATGGGG
501 CTATATCTTT GCCCATGTTG CTTTGATTGT CATTTGCCTG GGCGGGTTGA
551 TAGACAGTAA CCTGCTGTTG AAACTGGGTA TGCTGACCGG TCGGATTGTT
601 CCGGACAATC AGGCGGTTTA TGCCAAGGAT TTCAAGCCCG AAAGTATTTT
651 GGGTGCGTCC AATCTCTCAT TTAGGGGCAA CGTCAATATT TCCGAGGGGC
701 AGAGTGCGGA TGTGGTTTTC CTGAATGCCG ACAACGGGAT ATTGGTTCAG
751 GACTTGCCTT TTGAAGTCAA ACTGAAAAAA TTCCATATCG ATTTTTACAA
801 TACGGGTATG CCGCGCGATT TTGCCAGTGA TATTGAAGTA ACGGATAAGG
851 CAACCGGTGA GAAACTCGAG CGCACCATCC GCGTGAACCA TCCTTTGACC
901 TTGCACGGCA TCACGATTTA TCAGGCGAGT TTTGCCGACG GCGGTTCGGA
951 TTTGACATTC AAGGCGTGGA ATTTGGGTGA TGCTTCGCGC GAGCCTGTCG
1001 TGTTGAAGGC AACATCCATA CACCAGTTTC CGTTGGAAAT TGGCAAACAC
1051 AAATATCGTC TTGAGTTCGA TCAGTTTACT TCTATGAATG TGGAGGACAT
1101 GAGCGAGGGC GCGGAACGGG AAAAAAGCCT GAAATCCACG CTGAACGATG
1151 TCCGCGCCGT TACTCAGGAA GGTAAAAAAT ACACCAATAT CGGCCCTTCC
1201 ATTGTTTACC GTATCCGTGA TGCGGCAGGG CAGGCGGTCG AATATAAAAA
1251 CTATATGCTG CCGGTTTTGC AGGAACAGGA TTATTTTTGG ATTACCGGCA
1301 CGCGCAGCGG CTTGCAGCAG CAATACCGCT GGCTGCGTAT CCCCTTGGAC
1351 AAGCAGTTGA AAGCGGACAC CTTTATGGCA TTGCGTGAGT TTTTGAAAGA
1401 TGGGGAAGGG CGCAAACGTC TGGTTGCCGA CGCAACCAAA GGCGCACCTG
1451 CCGAAATCCG CGAACAATTC ATGCTGGCTG CGGAAAACAC GCTGAACATC
1501 TTTGCACAAA AAGGCTATTT GGGATTGGAC GAATTTATTA CGTCCAATAT
1551 CCCGAAAGAG CAGCAGGATA AGATGCAGGG CTATTTCTAC GAAATGCTTT
1601 ACGGCGTGAT GAACGCTGCT TTGGATGAAA CCATACGCCG GTACGGCTTG
1651 CCCGAATGGC AGCAGGATGA AGCGCGGAAT CGTTTCCTGC TGCACAGTAT
1701 GGATGCGTAC ACGGGTTTGA CCGAATATCC CGCGCCTATG CTGCTGCAAC
1751 TTGATGGGTT TTCCGAGGTG CGTTCGTCGG GTTTGCAGAT GACCCGTTCC
1801 CCGGGTGCGC TTTTGGTCTA TCTCGGCTCG GTGCTGTTGG TATTGGGTAC
1851 GGTATTGATG TTTTATGTGC GCGAAAAACG GGCGTGGGTA TTGTTTTCAG
1901 ACGGCAAAAT CCGTTTTGCC ATGTCTTCGG CCCGCAGCGA ACGGGATTTG
1951 CAGAAGGAAT TTCCAAAACA CGTCGAGAGT CTGCAACGGC TCGGCAAGGA
2001 CTTGAATCAT GACTGA
This encodes a protein having the amino acid sequence <SEQ ID 332>:
1 MSKSRRSPPL LSRPWFAFFS SMRFAVALLS LLGIASVIGT VLQQNQPQTD
51 YLVKFGSFWA QIFGFLGLYD VYASAWFVVI MMFLVVSTSL CLIRNVPPFW
101 REMKSFREKV KEKSLAAMRH SSLLDVKIAP EVAKRYLEVQ GFQGKTINRE
151 DGSVLIAAKK GTMNKWGYIF AHVALIVICL GGLIDSNLLL KLGMLTGRIV
201 PDNQAVYAKD FKPESILGAS NLSFRGNVNI SEGQSADVVF LNADNGILVQ
251 DLPFEVKLKK FHIDFYNTGM PRDFASDIEV TDKATGEKLE RTIRVNHPLT
301 LHGITIYQAS FADGGSDLTF KAWNLGDASR EPVVLKATSI HQFPLEIGKH
351 KYRLEFDQFT SMNVEDMSEG AEREKSLKST LNDVRAVTQE GKKYTNIGPS
401 IVYRIRDAAG QAVEYKNYML PVLQEQDYFW ITGTRSGLQQ QYRWLRIPLD
451 KQLKADTFMA LREFLKDGEG RKRLVADATK GAPAEIREQF MLAAENTLNI
501 FAQKGYLGLD EFITSNIPKE QQDKMQGYFY EMLYGVMNAA LDETIRRYGL
551 PEWQQDEARN RFLLHSMDAY TGLTEYPAPM LLQLDGFSEV RSSGLQMTRS
601 PGALLVYLGS VLLVLGTVLM FYVREKRAWV LFSDGKIRFA MSSARSERDL
651 QKEFPKHVES LQRLGKDLNH D*
ORF88a and ORF88-1 show 100.0% identity over a 671 aa overlap:
orf88a.pep MSKSRRSPPLLSRPWFAFFSSMRFAVALLSLLGIASVIGTVLQQNQPQTDYLVKFGSFWA 60
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf88-1 MSKSRRSPPLLSRPWFAFFSSMRFAVALLSLLGIASVIGTVLQQNQPQTDYLVKFGSFWA 60
orf88a.pep QIFGFLGLYDVYASAWFVVIMMFLVVSTSLCLIRNVPPFWREMKSFREKVKEKSLAAMRH 120
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf88-1 QIFGFLGLYDVYASAWFVVIMMFLVVSTSLCLIRNVPPFWREMKSFREKVKEKSLAAMRH 120
orf88a.pep SSLLDVKIAPEVAKRYLEVQGFQGKTINREDGSVLIAAKKGTMNKWGYIFAHVALIVICL 180
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf88-1 SSLLDVKIAPEVAKRYLEVQGFQGKTINREDGSVLIAAKKGTMNKWGYIFAHVALIVICL 180
orf88a.pep GGLIDSNLLLKLGMLTGRIVPDNQAVYAKDFKPESILGASNLSFRGNVNISEGQSADVVF 240
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf88-1 GGLIDSNLLLKLGMLTGRIVPDNQAVYAKDFKPESILGASNLSFRGNVNISEGQSADVVF 240
orf88a.pep LNADNGILVQDLPFEVKLKKFHIDFYNTGMPRDFASDIEVTDKATGEKLERTIRVNHPLT 300
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf88-1 LNADNGILVQDLPFEVKLKKFHIDFYNTGMPRDFASDIEVTDKATGEKLERTIRVNHPLT 300
orf88a.pep LHGITIYQASFADGGSDLTFKAWNLGDASREPVVLKATSIHQFPLEIGKHKYRLEFDQFT 360
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf88-1 LHGITIYQASFADGGSDLTFKAWNLGDASREPVVLKATSIHQFPLEIGKHKYRLEFDQFT 360
orf88a.pep SMNVEDMSEGAEREKSLKSTLNDVRAVTQEGKKYTNIGPSIVYRIRDAAGQAVEYKNYML 420
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf88-1 SMNVEDMSEGAEREKSLKSTLNDVRAVTQEGKKYTNIGPSIVYRIRDAAGQAVEYKNYML 420
orf88a.pep PVLQEQDYFWITGTRSGLQQQYRWLRIPLDKQLKADTFMALREFLKDGEGRKRLVADATK 480
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf88-1 PVLQEQDYFWITGTRSGLQQQYRWLRIPLDKQLKADTFMALREFLKDGEGRKRLVADATK 480
orf88a.pep GAPAEIREQFMLAAENTLNIFAQKGYLGLDEFITSNIPKEQQDKMQGYFYEMLYGVMNAA 540
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf88-1 GAPAEIREQFMLAAENTLNIFAQKGYLGLDEFITSNIPKEQQDKMQGYFYEMLYGVMNAA 540
orf88a.pep LDETIRRYGLPEWQQDEARNRFLLHSMDAYTGLTEYPAPMLLQLDGFSEVRSSGLQMTRS 600
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf88-1 LDETIRRYGLPEWQQDEARNRFLLHSMDAYTGLTEYPAPMLLQLDGFSEVRSSGLQMTRS 600
orf88a.pep PGALLVYLGSVLLVLGTVLMFYVREKRAWVLFSDGKIRFAMSSARSERDLQKEFPKHVES 660
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf88-1 PGALLVYLGSVLLVLGTVLMFYVREKRAWVLFSDGKIRFAMSSARSERDLQKEFPKHVES 660
orf88a.pep LQRLGKDLNHD 672
|||||||||||
orf88-1 LQRLGKDLNHD 672
Homology with a predicted ORF from N. gonorrhoeae
ORF88 shows 93.8% identity over a 371 aa overlap with a predicted ORF (ORF88.ng) from N. gonorrhoeae:
orf88.pep MVFLNADNGILVQDLPFEVKLKKFHIDFYNTGMPRDFASDIEVTDKATGEKLERTIRVNH 60
|||||||||:||||||||||||||||||||||||||||||||||||||||||||||||||
orf88ng MVFLNADNGMLVQDLPFEVKLKKFHIDFYNTGMPRDFASDIEVTDKATGEKLERTIRVNH 60
orf88.pep PLTLHGITIYQASFADGGSDLTFKAWNLGDASREPVVLKATSIHQFPLEIGKHKYRLEFD 120
|||||||||||||||||||||||||||| |||||||||||||||||||||||||||||||
orf88ng PLTLHGITIYQASFADGGSDLTFKAWNLRDASREPVVLKATSIHQFPLEIGKHKYRLEFD 120
orf88.pep QFTSMNVEDMSEGAEREKSLKSTLPDVRAVTQEGHKYTNXXXXXXYRIRDAPGQAVEYKN 180
|||||||||||||||||||||||| |||||||||:|||| |||||| ||||||||
orf88ng QFTSMNVEDMSEGAEREKSLKSTLNDVRAVTQEGKKYTNIGPSIVYRIRDAAGQAVEYKN 180
orf88.pep YMLPVLQEQDYFWITGTRSXLQQQYRWLRIPLDKQLKADTFMALREFLKDGEGRKRXVAD 240
||||:||::||||:||||| |||||||||||||||||||||||||||||||||||| |||
orf88ng YMLPILQDKDYFWLTGTRSGLQQQYRWLRIPLDKQLKADTFMALREFLKDGEGRKRLVAD 240
orf88.pep ATKGAPAEIREQFMLAAENTLNIFAQKGYLGLDEFITSNIPKEQQDKMQGYFYEMLYGVM 300
||| |||||||||||||||||||||||||||||||||||||| |||||||||||||||||
orf88ng ATKDAPAEIREQFMLAAENTLNIFAQKGYLGLDEFITSNIPKGQQDKMQGYFYEMLYGVM 300
orf88.pep NAALDETXTRYGLPEWQQDEARNRFLLHSMDAYTGLTEYPAPMLLQLDGFSEVRSSGLQM 360
||||||| |||||||||||||||||||||||||||||||||||||||||||||||||||
orf88ng NAALDETIRRYGLPEWQQDEARNRFLLHSMDAYTGLTEYPAPMLLQLDGFSEVRSSGLQM 360
orf88.pep TRSXGPLLVYL 371
||| | |||||
orf88ng TRSPGALLVYLGSVLLVLGTVFMFYVPKKRAWVLFSNXKIRFAMSSARSERDLQKEFPKH 420
An ORF88ng nucleotide sequence <SEQ ID 333> was predicted to encode a protein having the amino acid sequence <SEQ ID 334>:
1 MVFLNADNGM LVQDLPFEVK LKKFHIDFYN TGMPRDFASD IEVTDKATGE
51 KLERTIRVNH PLTLHGITIY QASFADGGSD LTFKAWNLRD ASREPVVLKA
101 TSIHQFPLEI GKHKYRLEFD QFTSMNVEDM SEGAEREKSL KSTLNDVRAV
151 TQEGKKYTNI GPSIVYRIRD AAGQAVEYKN YMLPILQDKD YFWLTGTRSG
201 LQQQYRWLRI PLDKQLKADT FMALREFLKD GEGRKRLVAD ATKDAPAEIR
251 EQFMLAAENT LNIFAQKGYL GLDEFITSNI PKGQQDKMQG YFYEMLYGVM
301 NAALDETIRR YGLPEWQQDE ARNRFLLHSM DAYTGLTEYP APMLLQLDGF
351 SEVRSSGLQM TRSPGALLVY LGSVLLVLGT VFMFYVPKKR AWVLFSNXKI
401 RFAMSSARSE RDLQKEFPKH VESLQRLGKD LNHD*
Further work revealed the complete gonococcal DNA sequence <SEQ ID 335>:
1 ATGAGTAAAT CCCGTATATC TCCCACACTT CTTTCCCGTC CGTGGTTCGC
51 TTTTTTCAGC TCCATGCGCT TTGCGGTCGC TTTGCTCAGT CTGCTGGGTA
101 TTGCATCGGT TATCGGCACG GTGTTACAGC AAAACCAGCC GCAGACGGAT
151 TATTTGGTCA AATTCGGACC GTTTTGGACT CGGATTTTTG ATTTTTTGGG
201 TTTGTATGAT GTCTATGCTT CGGCATGGTT TGTCGTTATC ATGATGTTTC
251 TGGTGGTTTC TACCAGTTTG TGTTTAATCC GTAACGTTCC GCCGTTTTGG
301 CGCGAAATGA AGTCTTTCCG GGAAAAGGTT AAAGAAAAAT CTCTGGCGGC
351 GATGCGCCAT TCTTCGCTGT TGGATGTAAA AATTGCCCCC GAAGTTGCCA
401 AACGTTATCT GGAGGTGCGG GGTTTTCAGG GAAAAACCGT CAGCCGTGAG
451 GACGGGTCGG TTCTGATTGC CGCCAAAAAA GGCAcaatga acaaATGGGG
501 CTATATCTTT GCccaagtag ctTTGATTGT CATTTGCCTG GGCGGGTTGA
551 TAGACAGTAA CCTGCTGCTG AAGCTGGGTA TGCTGGCCGG TCGGATTGTT
601 CCGGACAATC AGGCGGTTTA TGCCAAGGAT TTCAAGCCCG AAAGTATTTT
651 GGGTGCGTCC AATCTCTCAT TTAGGGGCAA CGTCAATATT TCCGAGGGGC
701 AAAGTGCGGA TGTGGTTTTC CTGAATGCCG ACAACGGGAT GTTGGTTCAG
751 GACTTGCCTT TTGAAGTCAA ACTGAAAAAA TTCCATATCG ATTTTTACAA
801 TACGGGTATG CCGCGCGATT TTGCCAGCGA TATTGAAGTA ACGGACAAGG
851 CAACCGGTGA GAAACTCGAG CGCACCATCC GCGTGAACCA TCCTTTGACC
901 TTGCACGGCA TCACGATTTA TCAGGCGAGT TTTGCCGACG GCGGTTCGGA
951 TTTGACATTC AAGGCGTGGA ATTTGAGGGA TGCTTCGCGC GAACCTGTCG
1001 TGTTGAAGGC AACCTCCATA CACCAGTTTC CGTTGGAAAT CGGCAAACAC
1051 AAATATCGTC TTGAGTTCGA TCAGTTCACT TCTATGAATG TGGAGGACAT
1101 GAGCGAGGGT GCGGAACGGG AAAAAAGCCT GAAATCCACT CTGAACGATG
1151 TCCGCGCCGT TACTCAGGAA GGTAAAAAAT ACACCAATAT CGGCCCTTCC
1201 ATCGTGTACC GCATCCGTGA TGcggCAGGG CAGGCGGTCG AATATAAAAA
1251 CTATATGCTG CCGATTTTGC AGGACAAAGA TTATTTTTGG CTGACCGGCA
1301 CGCGCAGCGG CTTGCAGCAG CAATACCGCT GGCTGCGTAT CCCCTTGGAC
1351 AAGCAGTTGA AAGCGGACAC CTTTATGGCA TTGCGTGAGT TTTTGAAAGA
1401 TGGGGAAGGG CGCAAACGTC TGGTTGCCGA CGCAACCAAA GACGCACCTG
1451 CCGAAATCCG CGAACAATTC ATGCTGGCTG CGGAAAACAC GCTGAATATC
1501 TTTGCGCAAA AAGGCTATTT GGGATTGGAC GAATTTATTA CGTCCAATAT
1551 CCCGAAAGGG CAGCAGGATA AGATGCAGGG CTATTTCTAC GAAATGCTTT
1601 ACGGCGTGAT GAACGCTGCT TTGGATGAAA CCATACGCCG GTACGGCTTG
1651 CCCGAATGGC AGCAGGATGA AGCGCGGAAC CGTTTCCTGC TGCACAGTAT
1701 GGATGCCTAT ACGGGGCTGA CGGAATATCC CGCGCCTATG CTGCTCCAGC
1751 TTGACGGGTT TTCCGAGGTG CGTTCCTCAG GTTTGCAGAT GACCCGTTCG
1801 CCGGGTGCGC TTTTGGTCTA TCtcggctcg gtattgttgg TTTTGGgtac
1851 ggtaTttatg tTTTATGTGC GCGAAAAACG GGCGTGGgta tTGTTTTCag
1901 aCGGCAAAAT CCGTTTTGCT ATGtCTTcgg CCcgcagcga ACGGGATTTG
1951 cAGAaggaaT TTCCAAAACA CGtcgAGAGC CTGCAACggc tcggcaaggA
2001 CttgaaTCAT GACTga
This corresponds to the amino acid sequence <SEQ ID 336; ORF88ng-1>:
1 MSKSRISPTL LSRPWFAFFS SMRFAVALLS LLGIASVIGT VLQQNQPQTD
51 YLVKFGPFWT RIFDFLGLYD VYASAWFVVI MMFLVVSTSL CLIRNVPPFW
101 REMKSFREKV KEKSLAAMRH SSLLDVKIAP EVAKRYLEVR GFQGKTVSRE
151 DGSVLIAAKK GTMNKWGYIF AQVALIVICL GGLIDSNLLL KLGMLAGRIV
201 PDNQAVYAKD FKPESILGAS NLSFRGNVNI SEGQSADVVF LNADNGMLVQ
251 DLPFEVKLKK FHIDFYNTGM PRDFASDIEV TDKATGEKLE RTIRVNHPLT
301 LHGITIYQAS FADGGSDLTF KAWNLRDASR EPVVLKATSI HQFPLEIGKH
351 KYRLEFDQFT SMNVEDMSEG AEREKSLKST LNDVRAVTQE GKKYTNIGPS
401 IVYRIRDAAG QAVEYKNYML PILQDKDYFW LTGTRSGLQQ QYRWLRIPLD
451 KQLKADTFMA LREFLKDGEG RKRLVADATK DAPAEIREQF MLAAENTLNI
501 FAQKGYLGLD EFITSNIPKG QQDKMQGYFY EMLYGVMNAA LDETIRRYGL
551 PEWQQDEARN RFLLHSMDAY TGLTEYPAPM LLQLDGFSEV RSSGLQMTRS
601 PGALLVYLGS VLLVLGTVFM FYVREKRAWV LFSDGKIRFA MSSARSERDL
651 QKEFPKHVES LQRLGKDLNH D*
ORF88ng-1 and ORF88-1 show 97.0% identity in a 671 amino acid overlap:
orf88-1.pep MSKSRRSPPLLSRPWFAFFSSMRFAVALLSLLGIASVIGTVLQQNQPQTDYLVKFGSFWA 60
||||| || ||||||||||||||||||||||||||||||||||||||||||||||| ||:
orf88ng-1 MSKSRISPTLLSRPWFAFFSSMRFAVALLSLLGIASVIGTVLQQNQPQTDYLVKFGPFWT 60
orf88-1.pep QIFGFLGLYDVYASAWFVVIMMFLVVSTSLCLIRNVPPFWREMKSFREKVKEKSLAAMRH 120
:|| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf88ng-1 RIFDFLGLYDVYASAWFVVIMMFLVVSTSLCLIRNVPPFWREMKSFREKVKEKSLAAMRH 120
orf88-1.pep SSLLDVKIAPEVAKRYLEVQGFQGKTINREDGSVLIAAKKGTMNKWGYIFAHVALIVICL 180
|||||||||||||||||||:||||||::|||||||||||||||||||||||:||||||||
orf88ng-1 SSLLDVKIAPEVAKRYLEVRGFQGKTVSREDGSVLIAAKKGTMNKWGYIFAQVALIVICL 180
orf88-1.pep GGLIDSNLLLKLGMLTGRIVPDNQAVYAKDFKPESILGASNLSFRGNVNISEGQSADVVF 240
|||||||||||||||:||||||||||||||||||||||||||||||||||||||||||||
orf88ng-1 GGLIDSNLLLKLGMLAGRIVPDNQAVYAKDFKPESILGASNLSFRGNVNISEGQSADVVF 240
orf88-1.pep LNADNGILVQDLPFEVKLKKFHIDFYNTGMPRDFASDIEVTDKATGEKLERTIRVNHPLT 300
||||||:|||||||||||||||||||||||||||||||||||||||||||||||||||||
orf88ng-1 LNADNGMLVQDLPFEVKLKKFHIDFYNTGMPRDFASDIEVTDKATGEKLERTIRVNHPLT 300
orf88-1.pep LHGITIYQASFADGGSDLTFKAWNLGDASREPVVLKATSIHQFPLEIGKHKYRLEFDQFT 360
||||||||||||||||||||||||| ||||||||||||||||||||||||||||||||||
orf88ng-1 LHGITIYQASFADGGSDLTFKAWNLRDASREPVVLKATSIHQFPLEIGKHKYRLEFDQFT 360
orf88-1.pep SMNVEDMSEGAEREKSLKSTLNDVRAVTQEGKKYTNIGPSIVYRIRDAAGQAVEYKNYML 420
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf88ng-1 SMNVEDMSEGAEREKSLKSTLNDVRAVTQEGKKYTNIGPSIVYRIRDAAGQAVEYKNYML 420
orf88-1.pep PVLQEQDYFWITGTRSGLQQQYRWLRIPLDKQLKADTFMALREFLKDGEGRKRLVADATK 480
|:||::||||:|||||||||||||||||||||||||||||||||||||||||||||||||
orf88ng-1 PILQDKDYFWLTGTRSGLQQQYRWLRIPLDKQLKADTFMALREFLKDGEGRKRLVADATK 480
orf88-1.pep GAPAEIREQFMLAAENTLNIFAQKGYLGLDEFITSNIPKEQQDKMQGYFYEMLYGVMNAA 540
||||||||||||||||||||||||||||||||||||||| ||||||||||||||||||||
orf88ng-1 DAPAEIREQFMLAAENTLNIFAQKGYLGLDEFITSNIPKGQQDKMQGYFYEMLYGVMNAA 540
orf88-1.pep LDETIRRYGLPEWQQDEARNRFLLHSMDAYTGLTEYPAPMLLQLDGFSEVRSSGLQMTRS 600
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf88ng-1 LDETIRRYGLPEWQQDEARNRFLLHSMDAYTGLTEYPAPMLLQLDGFSEVRSSGLQMTRS 600
orf88-1.pep PGALLVYLGSVLLVLGTVLMFYVREKRAWVLFSDGKIRFAMSSARSERDLQKEFPKHVES 660
||||||||||||||||||:|||||||||||||||||||||||||||||||||||||||||
orf88ng-1 PGALLVYLGSVLLVLGTVFMFYVREKRAWVLFSDGKIRFAMSSARSERDLQKEFPKHVES 660
orf88-1.pep LQRLGKDLNHD 671
|||||||||||
orf88ng-1 LQRLGKDLNHD 671
In addition, ORF88ng-1 shows homology to a hypothetical protein from Aquifex aeolicus:
gi|2984296 (AE000771) hypothetical protein [Aquifex aeolicus] Length = 537
Score = 94.4 bits (231), Expect = 2e-18
Identities = 91/334 (27%), Positives = 159/334 (47%), Gaps = 59/334 (17%)
Query: 16 FAFFSSMRFAVALLSLLGIASVIG-TVLQQNQPQTDYLVKFGPFWTRIFDFLGLYDVYAS 74
+ F +S++ A+ ++ +LGI S++G T ++QNQ YL +FG L L DV+ S
Sbjct: 80 YDFLASLKLAIFIMLVLGILSMLGSTYIKQNQSFEWYLDQFGYDVGIWIWKLWLNDVFHS 139
Query: 75 AWFVVIMMFLVVSTSLCLIRNVPPFWREMKSFREKVKEKSLAAMRHSSLLDVKIAPEVAK 134
++++ ++ L V+ C I+ +P W++ S +E++ + A +H + VKI P+ K
Sbjct: 140 WYYILFIVLLAVNLIFCSIKRLPRVWKQAFS-KERILKLDEHAEKHLKPITVKI-PDKDK 197
Query: 135 --RYLEVRGFQGKTVSREDGSVLIAAKKGTMNKWGYIFAQVALIVICLGGLIDSNLLLKL 192
++L +GF+ V E + + A+KG ++ G +AL+VI G LID
Sbjct: 198 VLKFLLKKGFK-VFVEEEGNKLYVFAEKGRFSRLGVYITHIALLVIMAGALID------- 249
Query: 193 GMLAGRIVPDNQAVYAKDFKPESILGASNLSFRGNVNISEGQSADVVFLNADNGMLVQDL 252
+I+G RG++ ++EG + DV+ + A+ L
Sbjct: 250 ----------------------AIVGV-----RGSLIVAEGDTNDVMLVGAE--QKPYKL 280
Query: 253 PFEVKLKKFHIDFY---NTGMPRDFA-------SDIEVTDKATGEKLER--TIRVNHPLT 300
PF V L F I Y N + + FA SDIE+ + G K+E T++VN P
Sbjct: 281 PFAVHLIDFRIKTYAEENPNVDKRFAQAVSSYESDIEIIN---GGKVEAKGTVKVNEPFD 337
Query: 301 LHGITIYQASFA--DGGSDLTFKAWNLRDASREP 332
++QA++ DG S + + + A +P
Sbjct: 338 FGRYRLFQATYGILDGTSGMGVIVVDRKKAHEDP 371
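Based on this analysis, including the presence of a putative transmembrane domain in the gonococcal protein, it is predicted that this protein from N. meningitidis and N. gonorrhoeae, and its epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies.
Example 40
The following DNA sequence, believed to be complete, was identified in N. meningitidis <SEQ ID 337>: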
1 ATGATGAGTA ATAmAATGGm ACAAAAAGGG TTTACATTGA TTGmGmTGAT
51 GATAGTCGTC GCGATACTCG GCATTATCAG CGTCATTGCC ATACCTTCTT
101 ATCmAAGTTA TATTGAAAAA GGCTATCAGT CCCAGCTTTA TACGGAGATG
151 GyCGGTATCA ACAATATTTC CAAACAGTTT ATTTTGAAAA ATCCCCTGGA
201 CGATAATCAG ACCATCGAGA ACAAACTGGA AATATTTGTC TCAGGCTATA
251 AGATGAATCC GAAAATTGCC AAAAAaTATA GTGTTTCGGT AAAGTTTGTC
301 GATAAGGAAA AATCAAGGGC ATACAGGTTG GTCGGCGTTC CGAAGGCGGG
351 GACGGGTTAT ACTTTGTCGG TATGGATGAA CAGCGTGGGC GACGGATACA
401 AATGCCGTGA TGCCGCTTCT GCCCAAGCCC ATTTGGAGAC CTTGTCCTCA
451 GATGTCGGCT GTGAAGCCTT CTCTAATCGT AAAAAATAA
This corresponds to the amino acid sequence <SEQ ID 338; ORF89>:
1 MMSNXMXQKG FTLIXXMIVV AILGIISVIA IPSYXSYIEK GYQSQLYTEM
51 XGINNISKQF ILKNPLDDNQ TIENKLEIFV SGYKMNPKIA KKYSVSVKFV
101 DKEKSRAYRL VGVPKAGTGY TLSVWMNSVG DGYKCRDAAS AQAHLETLSS
151 DVGCEAFSNR KK*
Further work revealed the complete nucleotide sequence <SEQ ID 339>:
1 ATGATGAGTA ATAAAATGGA ACAAAAAGGG TTTACATTGA TTGAGATGAT
51 GATAGTCGTC GCGATACTCG GCATTATCAG CGTCATTGCC ATACCTTCTT
101 ATCAAAGTTA TATTGAAAAA GGCTATCAGT CCCAGCTTTA TACGGAGATG
151 GTCGGTATCA ACAATATTTC CAAACAGTTT ATTTTGAAAA ATCCCCTGGA
201 CGATAATCAG ACCATCGAGA ACAAACTGGA AATATTTGTC TCAGGCTATA
251 AGATGAATCC GAAAATTGCC AAAAAATATA GTGTTTCGGT AAAGTTTGTC
301 GATAAGGAAA AATCAAGGGC ATACAGGTTG GTCGGCGTTC CGAAGGCGGG
351 GACGGGTTAT ACTTTGTCGG TATGGATGAA CAGCGTGGGC GACGGATACA
401 AATGCCGTGA TGCCGCTTCT GCCCAAGCCC ATTTGGAGAC CTTGTCCTCA
451 GATGTCGGCT GTGAAGCCTT CTCTAATCGT AAAAAATAA
This corresponds to the amino acid sequence <SEQ ID 340; ORF89-1>:
1 MMSNKMEQKG FTLIEMMIVV AILGIISVIA IPSYQSYIEK GYQSQLYTEM
51 VGINNISKQF ILKNPLDDNQ TIENKLEIFV SGYKMNPKIA KKYSVSVKFV
101 DKEKSRAYRL VGVPKAGTGY TLSVWMNSVG DGYKCRDAAS AQAHLETLSS
151 DVGCEAFSNR KK*
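The correspondence between the nucleotide listings and the amino acid listings above can be checked with a short translation routine. The following is a minimal sketch, assuming the standard genetic code, reading frame 1, and the convention (inferred from the listings, not stated in the text) that any codon containing an ambiguity code or a "." placeholder is reported as X. The function name `translate` is illustrative only.

```python
# Standard genetic code, with codons ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(dna: str) -> str:
    """Translate DNA in frame 1; codons that are not plain A/C/G/T become 'X'."""
    # Keep letters and '.' placeholders so that every listed position counts;
    # digits and spaces from the numbered listing are dropped.
    seq = "".join(ch for ch in dna.upper() if ch.isalpha() or ch == ".")
    protein = []
    for pos in range(0, len(seq) - 2, 3):
        protein.append(CODON_TABLE.get(seq[pos:pos + 3], "X"))
    return "".join(protein)


# First 30 nucleotides of SEQ ID 337 (with ambiguity codes) and of SEQ ID 339.
partial = translate("ATGATGAGTA ATAmAATGGm ACAAAAAGGG")
complete = translate("ATGATGAGTA ATAAAATGGA ACAAAAAGGG")
print(partial)   # MMSNXMXQKG
print(complete)  # MMSNKMEQKG
```

The two outputs match the first ten residues listed for ORF89 and ORF89-1 respectively, which illustrates how resolving the ambiguous bases removes the X positions.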
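Computer analysis of this amino acid sequence gave the following results:
Homology with PilE of N. gonorrhoeae (accession number Z69260)
ORF89 and the PilE protein show 30% amino acid identity in a 120 amino acid overlap: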
orf89 8 QKGFTLIXXMIVVAILGIISVIAIPSYXSYIEKGYQSQLYTEMXGINNISKQFILKNPL- 66
QKGFTLI MIV+AI+GI++ +A+P+Y Y + S+ G + ++ L + +
PilE 5 QKGFTLIELMIVIAIVGILAAVALPAYQDYTARAQVSEAILLAEGQKSAVTEYYLNHGIW 64
orf89 67 -DDNQTIENKLEIFVSGYKMNPKIAKKYSVSVKFVDKEKSRAYRLVGVPKAGTGYTLSVW 125
DN + +G + KI KY SV + GV K G LS+W
PilE 65 PKDNTS---------AGVASSDKIKGKYVQSVTVAKGVVTAEMASTGVNKEIQGKKLSLW 115
Homology with a predicted ORF from N. meningitidis (strain A)
ORF89 shows 83.3% identity over a 162 amino acid overlap with an ORF (ORF89a) from strain A of N. meningitidis:
10 20 30 40 50 60
orf89.pep MMSNXMXQKGFTLIXXMIVVAILGIISVIAIPSYXSYIEKGYQSQLYTEMXGINNISKQF
|||| | ||||||||| || ||| ||||||||||||||||| ||||||||
orf89a MMSNKMEQKGFTLIXXXXXXAIXXXXSVIXXXXYXSYIEKGYQSQLYTEMVGINNISKQX
10 20 30 40 50 60
70 80 90 100 110 120
orf89.pep ILKNPLDDNQTIENKLEIFVSGYKMNPKIAKKYSVSVKFVDKEKSRAYRLVGVPKAGTGY
||||||||||||::||||||||||||||||:||:|||:||::|| ||| ||||||:||||
orf89a ILKNPLDDNQTIKSKLEIFVSGYKMNPKIAEKYNVSVHFVNEEKPRAYSLVGVPKTGTGY
70 80 90 100 110 120
130 140 150 160
orf89.pep TLSVWMNSVGDGYKCRDAASAQAHLETLSSDVGCEAFSNRKKX
|||||||||||||||||||||:|||||||||||||||||||||
orf89a TLSVWMNSVGDGYKCRDAASARAHLETLSSDVGCEAFSNRKKX
130 140 150 160
The complete length ORF89a nucleotide sequence <SEQ ID 341> is:
1 ATGATGAGTA ATAAAATGGA ACAAAAAGGG TTTACATTGA TTGNGANGNT
51 NATNGNCNTC GCGATACNCN GCNTTANCAG CGTCATTNCN ATNNNTNCNT
101 ATCNNAGTTA TATTGAAAAA GGCTATCAGT CCCAGCTTTA TACGGAGATG
151 GTCGGTATCA ACAATATTTC CAAACAGTNT ATTTTGAAAA ATCCCCTGGA
201 CGATAATCAG ACCATCAAGA GCAAACTGGA AATATTTGTC TCAGGCTATA
251 AGATGAATCC GAAAATTGCC GAAAAATATA ATGTTTCGGT GCATTTTGTC
301 AATGAGGAAA AACCNAGGGC ATACAGCTTG GTCGGCGTTC CAAAGACGGG
351 GACGGGTTAT ACTTTGTCGG TATGGATGAA CAGCGTGGGC GACGGATACA
401 AATGCCGTGA TGCCGCTTCT GCCCGAGCCC ATTTGGAGAC CTTGTCCTCA
451 GATGTCGGCT GTGAAGCCTT CTCTAATCGT AAAAAATAG
This encodes a protein having the amino acid sequence <SEQ ID 342>:
1 MMSNKMEQKG FTLIXXXXXX AIXXXXSVIX XXXYXSYIEK GYQSQLYTEM
51 VGINNISKQX ILKNPLDDNQ TIKSKLEIFV SGYKMNPKIA EKYNVSVHFV
101 NEEKPRAYSL VGVPKTGTGY TLSVWMNSVG DGYKCRDAAS ARAHLETLSS
151 DVGCEAFSNR KK*
ORF89a and ORF89-1 show 83.3% identity in a 162 amino acid overlap:
10 20 30 40 50 60
orf89a.pep MMSNKMEQKGFTLIXXXXXXAIXXXXSVIXXXXYXSYIEKGYQSQLYTEMVGINNISKQX
|||||||||||||| || ||| | |||||||||||||||||||||||||
orf89-1 MMSNKMEQKGFTLIEMMIVVAILGIISVIAIPSYQSYIEKGYQSQLYTEMVGINNISKQF
10 20 30 40 50 60
70 80 90 100 110 120
orf89a.pep ILKNPLDDNQTIKSKLEIFVSGYKMNPKIAEKYNVSVHFVNEEKPRAYSLVGVPKTGTGY
||||||||||||::||||||||||||||||:||:|||:||::|| ||| ||||||:||||
orf89-1 ILKNPLDDNQTIENKLEIFVSGYKMNPKIAKKYSVSVKFVDKEKSRAYRLVGVPKAGTGY
70 80 90 100 110 120
130 140 150 160
orf89a.pep TLSVWMNSVGDGYKCRDAASARAHLETLSSDVGCEAFSNRKKX
|||||||||||||||||||||:|||||||||||||||||||||
orf89-1 TLSVWMNSVGDGYKCRDAASAQAHLETLSSDVGCEAFSNRKKX
130 140 150 160
Homology with a predicted ORF from N. gonorrhoeae
ORF89 shows 84.6% identity over a 162 amino acid overlap with a predicted ORF (ORF89.ng) from N. gonorrhoeae:
orf89 MMSNXMXQKGFTLIXXMIVVAILGIISVIAIPSYXSYIEKGYQSQLYTEMXGINNISKQF 60
|||| | ||||||| ||||:||||||||||||| ||||||||||||||| ||||: |||
orf89ng MMSNKMEQKGFTLIEMMIVVTILGIISVIAIPSYQSYIEKGYQSQLYTEMVGINNVLKQF 60
orf89 ILKNPLDDNQTIENKLEIFVSGYKMNPKIAKKYSVSVKFVDKEKSRAYRLVGVPKAGTGY 120
||||| |||:|:::||:||||||||||||||||||||:||| || |||||||||:|||||
orf89ng ILKNPQDDNDTLKSKLKIFVSGYKMNPKIAKKYSVSVRFVDAEKPRAYRLVGVPNAGTGY 120
orf89 TLSVWMNSVGDGYKCRDAASAQAHLETLSSDVGCEAFSNRKK 162
||||||||||||||||||:||||: :|||:| ||||||||||
orf89ng TLSVWMNSVGDGYKCRDATSAQAYSDTLSADSGCEAFSNRKK 162
The complete length ORF89ng nucleotide sequence <SEQ ID 343> is:
1 aTGATGAGCA ATAAAATGGA ACAAAAAGGG TTTACATTGA TTGAGATGAT
51 GATAGTTGTC ACGATACTCG GCATCATCAG CGTCATTGCC ATACCTTCTT
101 ATCAGAGTTA TATTGAAAAA GGCTATCAGT CCCAGCTTTA TACGGAGATG
151 GTCGGTATCA ACAATGTTCT CAAACAGTTT ATTTTGAAAA ATCCCCAGGA
201 CGATAATGAT ACCCTCAAGA GCAAACTGAA AATATTTGTC TCAGGCTATA
251 AGATGAATCC GAAAAttgCC AAAAAATATA GTGTTTCGGt aaggtttGTC
301 gatGCGGAAA AACCAAGGGC ATACAGGTTG GTCGGCGTTC CGAACGCGGG
351 GACGGGTTAT ACTTTGTCGG TATGGATGAA CAGCGTGGGC GACGGATACA
401 AATGCCGTGA TGCCACTTCT GCCCAGGCCT ATTCGGACAC CTTGTCCGCA
451 GATAGCGGCT GTGAAGCTTT CTCTAATCGT AAAAAATAG
This encodes a protein having the amino acid sequence <SEQ ID 344>:
1 MMSNKMEQKG FTLIEMMIVV TILGIISVIA IPSYQSYIEK GYQSQLYTEM
51 VGINNVLKQF ILKNPQDDND TLKSKLKIFV SGYKMNPKIA KKYSVSVRFV
101 DAEKPRAYRL VGVPNAGTGY TLSVWMNSVG DGYKCRDATS AQAYSDTLSA
151 DSGCEAFSNR KK*
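This gonococcal protein has a putative leader sequence (underlined) and an N-terminal methylation site (NMePhe or type-4 pili, double-underlined). In addition, ORF89ng and ORF89-1 show 88.3% identity in a 162 amino acid overlap: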
10 20 30 40 50 60
orf89-1.pep MMSNKMEQKGFTLIEMMIVVAILGIISVIAIPSYQSYIEKGYQSQLYTEMVGINNISKQF
||||||||||||||||||||:||||||||||||||||||||||||||||||||||: |||
orf89ng MMSNKMEQKGFTLIEMMIVVTILGIISVIAIPSYQSYIEKGYQSQLYTEMVGINNVLKQF
10 20 30 40 50 60
70 80 90 100 110 120
orf89-1.pep ILKNPLDDNQTIENKLEIFVSGYKMNPKIAKKYSVSVKFVDKEKSRAYRLVGVPKAGTGY
||||| |||:|:::||:||||||||||||||||||||:||| || |||||||||:|||||
orf89ng ILKNPQDDNDTLKSKLKIFVSGYKMNPKIAKKYSVSVRFVDAEKPRAYRLVGVPNAGTGY
70 80 90 100 110 120
130 140 150 160
orf89-1.pep TLSVWMNSVGDGYKCRDAASAQAHLETLSSDVGCEAFSNRKKX
||||||||||||||||||:||||: :|||:| |||||||||||
orf89ng TLSVWMNSVGDGYKCRDATSAQAYSDTLSADSGCEAFSNRKKX
130 140 150 160
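Based on this analysis, including the gonococcal motif and the homology with the known PilE protein, it is predicted that these proteins from N. meningitidis and N. gonorrhoeae, and their epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies.
ORF89-1 (13.6 kDa) was cloned in the pGex vector and expressed in E. coli, as described above. The products of protein expression and purification were analyzed by SDS-PAGE. Figure 11A shows the results of affinity purification of the GST-fusion protein. Purified GST-fusion protein was used to immunize mice, whose sera gave a positive result in the ELISA test, confirming that ORF89-1 is a surface-exposed protein and a useful immunogen.
Example 41
The following partial DNA sequence was identified in N. meningitidis <SEQ ID 345>: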
1 ATGAAAAAAT CCTCCCTCAT CAGCGCATTG GGCATCGGTA TTTTGAGCAT
51 CGGCATGGCA TTTGCCGCCC CTGCCGACGC GGTAAGCCAA ATCCGTCAAA
101 ACGCCACTCA AGTATTGAGC ATCTTAAAAA ACGGCGATGC CAACACCGCT
151 CGCCAAAAAG CCGAAGCCTA TGCGATTCCC TATTTCGATT TCCAACGTAT
201 GACCGCATTG GCGGTCGGCA ACCCTTGGsG CACCG.GTCC GACG.GCAAA
251 AACAAGCGTT GGCCn.AGAA TTTCAACCC...
This corresponds to the amino acid sequence <SEQ ID 346; ORF91>:
1 MKKSSLISAL GIGILSIGMA FAAPADAVSQ IRQNATQVLS ILKNGDANTA
51 RQKAEAYAIP YFDFQRMTAL AVGNPWXTXS DXQKQALAXE FQP...
Further work revealed the complete nucleotide sequence <SEQ ID 347>:
1 ATGAAAAAAT CCTCCCTCAT CAGCGCATTG GGCATCGGTA TTTTGAGCAT
51 CGGCATGGCA TTTGCCGCCC CTGCCGACGC GGTAAGCCAA ATCCGTCAAA
101 ACGCCACTCA AGTATTGAGC ATCTTAAAAA ACGGCGATGC CAACACCGCT
151 CGCCAAAAAG CCGAAGCCTA TGCGATTCCC TATTTCGATT TCCAACGTAT
201 GACCGCATTG GCGGTCGGCA ACCCTTGGCG CACCGCGTCC GACGCGCAAA
251 AACAAGCGTT GGCCAAAGAA TTTCAAACCC TGCTGATCCG CACCTATTCC
301 GGCACGATGC TGAAATTAAA AAACGCCAAC GTCAACGTCA AAGACAATCC
351 CATCGTCAAT AAAGGCGGCA AAGAAATCAT CGTCCGCGCC GAAGTCGGCG
401 TACCCGGGCA AAAACCCGTC AACATGGACT TCACCACCTA CCAAAGCGGC
451 GGTAAATACC GTACCTACAA CGTCGCCATC GAAGGCGCGA GCCTGGTTAC
501 CGTGTACCGC AACCAATTCG GCGAAATTAT CAAAGCGAAA GGCGTGGACG
551 GACTGATTGC CGAGTTGAAA GCCAAAAACG GCGGCAAATA A
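The numbered listings in this document place the 1-based position of the first residue at the start of each line, followed by blocks of ten residues. A minimal sketch of how such a listing could be joined into one string, while checking that no line is missing, is shown below. The function name `parse_listing` and the decision to raise `ValueError` on a position mismatch are assumptions made for illustration.

```python
def parse_listing(lines):
    """Join numbered sequence-listing lines into a single string.

    Each line starts with the 1-based position of its first residue,
    followed by space-separated blocks of residues. A ValueError is
    raised when a position number does not match the running length,
    which signals a missing or truncated line.
    """
    sequence = ""
    for line in lines:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        start, blocks = int(fields[0]), fields[1:]
        expected = len(sequence) + 1
        if start != expected:
            raise ValueError(f"expected position {expected}, found {start}")
        sequence += "".join(blocks)
    return sequence


# First two lines of SEQ ID 347 as listed above.
listing = [
    "1 ATGAAAAAAT CCTCCCTCAT CAGCGCATTG GGCATCGGTA TTTTGAGCAT",
    "51 CGGCATGGCA TTTGCCGCCC CTGCCGACGC GGTAAGCCAA ATCCGTCAAA",
]
seq = parse_listing(listing)
print(len(seq))  # 100
```

A listing whose first line starts at position 51 would be rejected by this check, which is a simple way to notice a line lost during extraction.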
This corresponds to the amino acid sequence <SEQ ID 348; ORF91-1>:
1 MKKSSLISAL GIGILSIGMA FAAPADAVSQ IRQNATQVLS ILKNGDANTA
51 RQKAEAYAIP YFDFQRMTAL AVGNPWRTAS DAQKQALAKE FQTLLIRTYS
101 GTMLKLKNAN VNVKDNPIVN KGGKEIIVRA EVGVPGQKPV NMDFTTYQSG
151 GKYRTYNVAI EGASLVTVYR NQFGEIIKAK GVDGLIAELK AKNGGK*
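Computer analysis of this amino acid sequence gave the following results:
Homology with a predicted ORF from N. meningitidis (strain A)
ORF91 shows 92.4% identity over a 92 amino acid overlap with an ORF (ORF91a) from strain A of N. meningitidis: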
10 20 30 40 50 60
orf91.pep MKKSSLISALGIGILSIGMAFAAPADAVSQIRQNATQVLSILKNGDANTARQKAEAYAIP
|||||:||||||||||||||||||||||:||||||||||||||:||||||||||||||||
orf91a MKKSSFISALGIGILSIGMAFAAPADAVNQIRQNATQVLSILKSGDANTARQKAEAYAIP
10 20 30 40 50 60
70 80 90
orf91.pep YFDFQRMTALAVGNPWXTXSDXQKQALAXEFQP
|||||||||||||||| | || |||||| |||
orf91a YFDFQRMTALAVGNPWRTASDAQKQALAKEFQTLLIRTYSGTMLKLKNANVNVKDNPIVN
70 80 90 100 110 120
orf91a KGGKEIIVRAEVGVPGQKPVNMDFTTYQSGGKYRTYNVAIEGASLVTVYRNQFGEIIKAK
130 140 150 160 170 180
The complete length ORF91a nucleotide sequence <SEQ ID 349> is:
1 ATGAAAAAAT CCTCCTTCAT CAGCGCATTG GGCATCGGTA TTTTGAGCAT
51 CGGCATGGCA TTTGCCGCCC CTGCCGACGC GGTAAACCAA ATCCGTCAAA
101 ACGCCACTCA AGTATTGAGC ATCTTAAAAA GCGGTGATGC CAACACCGCC
151 CGCCAAAAAG CCGAAGCCTA TGCGATTCCC TATTTCGATT TCCAACGTAT
201 GACCGCATTG GCGGTCGGCA ACCCTTGGCG CACCGCGTCC GACGCGCAAA
251 AACAAGCGTT GGCCAAAGAA TTTCAAACCC TGCTGATCCG CACCTATTCC
301 GGCACGATGC TGAAATTAAA AAACGCCAAC GTCAACGTCA AAGACAATCC
351 CATCGTCAAT AAAGGCGGCA AAGAAATCAT CGTCCGCGCC GAAGTCGGCG
401 TACCCGGGCA AAAACCCGTC AACATGGACT TCACCACCTA CCAAAGCGGC
451 GGTAAATACC GTACCTACAA CGTCGCCATC GAAGGCGCGA GCCTGGTTAC
501 CGTGTACCGC AACCAATTCG GCGAAATTAT CAAAGCGAAA GGCGTGGACG
551 GACTGATTGC CGAGTTGAAG GCTAAAAACG GCAGCAAGTA A
This encodes a protein having the amino acid sequence <SEQ ID 350>:
1 MKKSSFISAL GIGILSIGMA FAAPADAVNQ IRQNATQVLS ILKSGDANTA
51 RQKAEAYAIP YFDFQRMTAL AVGNPWRTAS DAQKQALAKE FQTLLIRTYS
101 GTMLKLKNAN VNVKDNPIVN KGGKEIIVRA EVGVPGQKPV NMDFTTYQSG
151 GKYRTYNVAI EGASLVTVYR NQFGEIIKAK GVDGLIAELK AKNGSK*
ORF91a and ORF91-1 show 98.0% identity in a 196 amino acid overlap:
10 20 30 40 50 60
orf91a.pep MKKSSFISALGIGILSIGMAFAAPADAVNQIRQNATQVLSILKSGDANTARQKAEAYAIP
|||||:||||||||||||||||||||||:||||||||||||||:||||||||||||||||
orf91-1 MKKSSLISALGIGILSIGMAFAAPADAVSQIRQNATQVLSILKNGDANTARQKAEAYAIP
10 20 30 40 50 60
70 80 90 100 110 120
orf91a.pep YFDFQRMTALAVGNPWRTASDAQKQALAKEFQTLLIRTYSGTMLKLKNANVNVKDNPIVN
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf91-1 YFDFQRMTALAVGNPWRTASDAQKQALAKEFQTLLIRTYSGTMLKLKNANVNVKDNPIVN
70 80 90 100 110 120
130 140 150 160 170 180
orf91a.pep KGGKEIIVRAEVGVPGQKPVNMDFTTYQSGGKYRTYNVAIEGASLVTVYRNQFGEIIKAK
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf91-1 KGGKEIIVRAEVGVPGQKPVNMDFTTYQSGGKYRTYNVAIEGASLVTVYRNQFGEIIKAK
130 140 150 160 170 180
190
orf91a.pep GVDGLIAELKAKNGSKX
||||||||||||||:||
orf91-1 GVDGLIAELKAKNGGKX
190
Homology with a predicted ORF from N. gonorrhoeae
ORF91 shows 84.8% identity over a 92 amino acid overlap with a predicted ORF (ORF91.ng) from N. gonorrhoeae:
orf91.pep MKKSSLISALGIGILSIGMAFAAPADAVSQIRQNATQVLSILKNGDANTARQKAEAYAIP 60
:||||:||||||||||||||||:|||||:||||||||||:|||:||| :|| ||||||:|
orf91ng VKKSSFISALGIGILSIGMAFASPADAVGQIRQNATQVLTILKSGDAASARPKAEAYAVP 60
orf91.pep YFDFQRMTALAVGNPWXTXSDXQKQALAXEFQP 93
|||||||||||||||| | || |||||| |||
orf91ng YFDFQRMTALAVGNPWRTASDAQKQALAKEFQTLLIRTYSGTMLKFKNATVNVKDNPIVN 120
The complete length ORF91ng nucleotide sequence <SEQ ID 351> is predicted to encode a protein having the amino acid sequence <SEQ ID 352>:
1 VKKSSFISAL GIGILSIGMA FASPADAVGQ IRQNATQVLT ILKSGDAASA
51 RPKAEAYAVP YFDFQRMTAL AVGNPWRTAS DAQKQALAKE FQTLLIRTYS
101 GTMLKFKNAT VNVKDNPIVN KGGKEIVVRA EVGIPGQKPV NMDFTTYQSG
151 GKYRTYNVAI EGTSLVTVYR NQFGEIIKAK GIDGLIAELK AKNGGK*
Further work revealed the complete nucleotide sequence <SEQ ID 353>:
1 ATGAAAAAAT CCTCCTTCAT CAGCGCATTG GGCATCGGTA TTTTGAGCAT
51 CGGCATGGCA TTTGCCTCCC CGGCCGACGC AGTGGGACAA ATCCGCCAAA
101 ACGCCACACA GGTTTTGACC ATCCTCAAAA GCGGCGACGC GGCTTCTGCA
151 CGCCCAAAAG CCGAAGCCTA TGCGGTTCCC TATTTCGATT TCCAACGTAT
201 GACCGCATTG GCGGTCGGCA ACCCTTGGCG TACCGCGTCC GACGCGCAAA
251 AACAAGCGTT GGCCAAAGAA TTTCAAACCC TGCTGATCCG CACCTATTCC
301 GGCACGATGC TGAAATTCAA AAACGCGACC GTCAACGTCA AAGACAATCC
351 CATCGTCAAT AAGGGCGGCA AGGAAATCGT CGTCCGTGCC GAAGTCGGCA
401 TCCCCGGTCA GAAGCCCGTC AATATGGACT TTACCACCTA CCAAAGCGGC
451 GGCAAATACC GTACCTACAA CGTCGCCATC GAAGGCACGA GCCTGGTTAC
501 CGTGTACCGC AACCAATTCG GCGAAATCAT CAAAGCCAAA GGCATCGACG
551 GGCTGATTGC CGAGTTGAAA GCCAAAAACG GCGGCAAATA A
This corresponds to the amino acid sequence <SEQ ID 354; ORF91ng-1>:
1 MKKSSFISAL GIGILSIGMA FASPADAVGQ IRQNATQVLT ILKSGDAASA
51 RPKAEAYAVP YFDFQRMTAL AVGNPWRTAS DAQKQALAKE FQTLLIRTYS
101 GTMLKFKNAT VNVKDNPIVN KGGKEIVVRA EVGIPGQKPV NMDFTTYQSG
151 GKYRTYNVAI EGTSLVTVYR NQFGEIIKAK GIDGLIAELK AKNGGK*
ORF91ng-1 and ORF91-1 show 92.3% identity in a 196 amino acid overlap:
10 20 30 40 50 60
orf91-1.pep MKKSSLISALGIGILSIGMAFAAPADAVSQIRQNATQVLSILKNGDANTARQKAEAYAIP
|||||:||||||||||||||||:|||||:||||||||||:|||:||| :|| ||||||:|
orf91ng-1 MKKSSFISALGIGILSIGMAFASPADAVGQIRQNATQVLTILKSGDAASARPKAEAYAVP
10 20 30 40 50 60
70 80 90 100 110 120
orf91-1.pep YFDFQRMTALAVGNPWRTASDAQKQALAKEFQTLLIRTYSGTMLKLKNANVNVKDNPIVN
|||||||||||||||||||||||||||||||||||||||||||||:|||:||||||||||
orf91ng-1 YFDFQRMTALAVGNPWRTASDAQKQALAKEFQTLLIRTYSGTMLKFKNATVNVKDNPIVN
70 80 90 100 110 120
130 140 150 160 170 180
orf91-1.pep KGGKEIIVRAEVGVPGQKPVNMDFTTYQSGGKYRTYNVAIEGASLVTVYRNQFGEIIKAK
||||||:||||||:||||||||||||||||||||||||||||:|||||||||||||||||
orf91ng-1 KGGKEIVVRAEVGIPGQKPVNMDFTTYQSGGKYRTYNVAIEGTSLVTVYRNQFGEIIKAK
130 140 150 160 170 180
190
orf91-1.pep GVDGLIAELKAKNGGKX
|:|||||||||||||||
orf91ng-1 GIDGLIAELKAKNGGKX
190
In addition, ORF91ng-1 shows homology to a hypothetical E. coli protein:
sp|P45390|YRBC_ECOLI HYPOTHETICAL 24.0 KD PROTEIN IN MURA-RPON INTERGENIC REGION PRECURSOR (F211) >gi|606130 (U18997) ORF_f211 [Escherichia coli] >gi|1789583 (AE000399) hypothetical 24.0 kD protein in murZ-rpoN intergenic region [Escherichia coli] Length = 211
Score = 70.6 bits (170), Expect = 6e-12
Identities = 42/137 (30%), Positives = 76/137 (54%), Gaps = 6/137 (4%)
Query: 59 VPYFDFQRMTALAVGNPWRTASDAQKQALAKEFQTLLIRTYSGTMLKFKNATVNVKDNPI 118
+PY + AL +G +++A+ AQ++A F+ L + Y + + T + P
Sbjct: 65 LPYVQVKYAGALVLGQYYKSATPAQREAYFAAFREYLKQAYGQALAMYHGQTYQIA--PE 122
Query: 119 VNKGGKEIV-VRAEVGIP-GQKPVNMDFTTYQSG--GKYRTYNVAIEGTSLVTVYRNQFG 174
G K IV +R + P G+ PV +DF ++ G ++ Y++ EG S++T +N++G
Sbjct: 123 QPLGDKTIVPIRVTIIDPNGRPPVRLDFQWRKNSQTGNWQAYDMIAEGVSMITTKQNEWG 182
Query: 175 EIIKAKGIDGLIAELKA 191
+++ KGIDGL A+LK+
Sbjct: 183 TLLRTKGIDGLTAQLKS 199
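Based on this analysis, including the presence of a putative leader sequence in the gonococcal protein, it is predicted that this protein from N. meningitidis and N. gonorrhoeae, and its epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies.
Example 42
The following DNA sequence was identified in N. meningitidis <SEQ ID 355>: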
1 ATGAAACACA TACTCCCCCT GATTGCCGCA TCCGCACTCT GCATTTCAAC
51 CGCTTCGGCA CATCCTGCCA GCGAACCGTC CACTCAAAAC GAAACCGCTA
101 TGATCACGCA TACCCTCATC TCAAAATACA GTTTTGGnnn nnnnnnnnnn
151 nnnnnnnnnn nnGCCATAAA AAGCAAAGGG ATGGACATTT TTGCCGTCAT
201 CGACCATCAG GAAGCCGCAC GCCGAAACGG CTTAACGATG CAGCCGGCAA
251 AAGTCATCGT CTTCGGCACG CCCAAAGCCG GCACGCCGCT GATGGTCAAA
301 GACCCCGCCT TCGCCCTGCA ACTGCCCCTA CGCGTCCTCG TTACCGAAAC
351 GGACGGCAAA GTACGCGCCG CCTATACCGA TACGCGCGCC CTCATCGCCG
401 GCAGCCGCAT CGGTTTCGAC GAAGTGGCAA ACACTTTGGC AAACGCCGAA
451 AAACTGATAC AAAAAACCGT AGGCGAATAA
This corresponds to the amino acid sequence <SEQ ID 356; ORF97>:
1 MKHILPLIAA SALCISTASA HPASEPSTQN ETAMITHTLI SKYSFGXXXX
51 XXXXAIKSKG MDIFAVIDHQ EAARRNGLTM QPAKVIVFGT PKAGTPLMVK
101 DPAFALQLPL RVLVTETDGK VRAAYTDTRA LIAGSRIGFD EVANTLANAE
151 KLIQKTVGE*
Further work revealed the complete nucleotide sequence <SEQ ID 357>:
1 ATGAAACACA TACTCCCCCT GATTGCCGCA TCCGCACTCT GCATTTCAAC
51 CGCTTCGGCA CATCCTGCCA GCGAACCGTC CACCCAAAAC GAAACCGCTA
101 TGACCACGCA TACCCTCACC TCAAAATACA GTTTTGACGA AACCGTCAGC
151 CGCCTTGAAA CCGCCATAAA AAGCAAAGGG ATGGACATTT TTGCCGTCAT
201 CGACCATCAG GAAGCCGCCC GCCGAAACGG CTTAACGATG CAGCCGGCAA
251 AAGTCATCGT CTTCGGCACG CCCAAAGCCG GCACGCCGCT GATGGTCAAA
301 GACCCCGCCT TCGCCCTGCA ACTGCCCCTA CGCGTCCTCG TTACCGAAAC
351 GGACGGCAAA GTACGCGCCG CCTATACCGA TACGCGCGCC CTCATCGCCG
401 GCAGCCGCAT CGGTTTCGAC GAAGTGGCAA ACACTTTGGC AAACGCCGAA
451 AAACTGATAC AAAAAACCGT AGGCGAATAA
This corresponds to the amino acid sequence <SEQ ID 358; ORF97-1>:
1 MKHILPLIAA SALCISTASA HPASEPSTQN ETAMTTHTLT SKYSFDETVS
51 RLETAIKSKG MDIFAVIDHQ EAARRNGLTM QPAKVIVFGT PKAGTPLMVK
101 DPAFALQLPL RVLVTETDGK VRAAYTDTRA LIAGSRIGFD EVANTLANAE
151 KLIQKTVGE*
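Computer analysis of this amino acid sequence gave the following results:
Homology with a predicted ORF from N. meningitidis (strain A)
ORF97 shows 88.7% identity over a 159 amino acid overlap with an ORF (ORF97a) from strain A of N. meningitidis: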
10 20 30 40 50 60
orf97.pep MKHILPLIAASALCISTASAHPASEPSTQNETAMITHTLISKYSFGXXXXXXXXAIKSKG
| ||||| |||||||||| ||||||:||||||| |||| ||||| : :||||||
orf97a MXHILPLXXASALCISTASXHPASEPQTQNETAMTTHTLTSKYSFDETVSRLETAIKSKG
10 20 30 40 50 60
70 80 90 100 110 120
orf97.pep MDIFAVIDHQEAARRNGLTMQPAKVIVFGTPKAGTPLMVKDPAFALQLPLRVLVTETDGK
|||||||||||||||||||||||||||||||||||||||||||||||||||| |||||||
orf97a MDIFAVIDHQEAARRNGLTMQPAKVIVFGTPKAGTPLMVKDPAFALQLPLRVXVTETDGK
70 80 90 100 110 120
130 140 150 160
orf97.pep VRAAYTDTRALIAGSRIGFDEVANTLANAEKLIQKTVGEX
||||||||||||||||||||||||||||||||||||:|||
orf97a VRAAYTDTRALIAGSRIGFDEVANTLANAEKLIQKTIGEX
130 140 150 160
The complete length ORF97a nucleotide sequence <SEQ ID 359> is:
1 ATGANACACA TACTCCCCCT GANTGNCGCA TCCGCACTCT GCATTTCAAC
51 CGCTTCGGNN CATCCTGCCA GCGAACCGCA AACCCAAAAC GAAACCGCTA
101 TGACCACGCA TACCCTCACC TCAAAATACA GTTTTGACGA AACCGTCAGC
151 CGCCTTGAAA CCGCCATAAA AAGCAAAGGG ATGGACATTT TTGCCGTCAT
201 CGACCATCAG GAAGCCGCCC GCCGAAACGG CTTAACGATG CAGCCGGCAA
251 AAGTCATCGT CTTCGGCACG CCCAAAGCCG GTACGCCGCT GATGGTCAAA
301 GACCCCGCCT TCGCCCTGCA ACTGCCCCTG CGCGTCNTCG TTACCGAAAC
351 GGACGGCAAA GTACGCGCCG CCTATACCGA TACGCGCGCC CTCATCGCCG
401 GCAGCCGCAT CGGTTTCGAC GAAGTGGCAA ACACTTTGGC AAACGCCGAA
451 AAACTGATAC AAAAAACCAT AGGCGAATAA
This encodes a protein having the amino acid sequence <SEQ ID 360>:
1 MXHILPLXXA SALCISTASX HPASEPQTQN ETAMTTHTLT SKYSFDETVS
51 RLETAIKSKG MDIFAVIDHQ EAARRNGLTM QPAKVIVFGT PKAGTPLMVK
101 DPAFALQLPL RVXVTETDGK VRAAYTDTRA LIAGSRIGFD EVANTLANAE
151 KLIQKTIGE*
ORF97a and ORF97-1 show 95.6% identity in a 159 amino acid overlap:
10 20 30 40 50 60
orf97a.pep MXHILPLXXASALCISTASXHPASEPQTQNETAMTTHTLTSKYSFDETVSRLETAIKSKG
| ||||| |||||||||| ||||||:|||||||||||||||||||||||||||||||||
orf97-1 MKHILPLIAASALCISTASAHPASEPSTQNETAMTTHTLTSKYSFDETVSRLETAIKSKG
10 20 30 40 50 60
70 80 90 100 110 120
orf97a.pep MDIFAVIDHQEAARRNGLTMQPAKVIVFGTPKAGTPLMVKDPAFALQLPLRVXVTETDGK
|||||||||||||||||||||||||||||||||||||||||||||||||||| |||||||
orf97-1 MDIFAVIDHQEAARRNGLTMQPAKVIVFGTPKAGTPLMVKDPAFALQLPLRVLVTETDGK
70 80 90 100 110 120
130 140 150 160
orf97a.pep VRAAYTDTRALIAGSRIGFDEVANTLANAEKLIQKTIGEX
||||||||||||||||||||||||||||||||||||:|||
orf97-1 VRAAYTDTRALIAGSRIGFDEVANTLANAEKLIQKTVGEX
130 140 150 160
Homology with a predicted ORF from N. gonorrhoeae
ORF97 shows 88.1% identity over a 159 amino acid overlap with a predicted ORF (ORF97.ng) from N. gonorrhoeae:
orf97.pep MKHILPLIAASALCISTASAHPASEPSTQNETAMITHTLISKYSFGXXXXXXXXAIKSKG 60
|||||| |||||:||||||||||::| ||||||| |||| ||||| : :||||||
orf97ng MKHILPPIAASAFCISTASAHPAGKPPTQNETAMTTHTLTSKYSFDETVSRLETAIKSKG 60
orf97.pep MDIFAVIDHQEAARRNGLTMQPAKVIVFGTPKAGTPLMVKDPAFALQLPLRVLVTETDGK 120
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf97ng MDIFAVIDHQEAARRNGLTMQPAKVIVFGTPKAGTPLMVKDPAFALQLPLRVLVTETDGK 120
orf97.pep VRAAYTDTRALIAGSRIGFDEVANTLANAEKLIQKTVGE 159
||:|||||||||:||||:|||||||||||||||||||||
orf97ng VRTAYTDTRALIVGSRISFDEVANTLANAEKLIQKTVGE 159
The complete length ORF97ng nucleotide sequence <SEQ ID 361> is predicted to encode a protein having the amino acid sequence <SEQ ID 362>:
1 MKHILPPIAA SAFCISTASA HPAGKPPTQN ETAMTTHTLT SKYSFDETVS
51 RLETAIKSKG MDIFAVIDHQ EAARRNGLTM QPAKVIVFGT PKAGTPLMVK
101 DPAFALQLPL RVLVTETDGK VRTAYTDTRA LIVGSRISFD EVANTLANAE
151 KLIQKTVGE*
Further work revealed the complete nucleotide sequence <SEQ ID 363>:
1 ATGAAACACA TACTCCCcct gatcgccgca TccgcactCT GCATTTCAAC
51 CGCTTCGGCA CACCCTGCCG GCAAACCGCC CACCCAAAAC GAAACCGCTA
101 TGACCACGCA CACCCTCACC TCGAAATACA GTTTTGACGA AACCGTCAGC
151 CGCCTTGAAA CCGCCATAAA AAGCAAAGGG ATGGACATTT TTGCCGTCAT
201 CGACCATCAG GAAGCGGCAC GCCGAAACGG CCTGACCATG CAGCCGGCAA
251 AAGTCATCGT CTTCGGCACG CCCAAGGCCG GTACGCCgct GATGGTCAAA
301 GACCCCGCCT TCGCCCTGCA ACTGCCCCTG CGCGTCCTCG TTACCGAAAC
351 GGACGGCAAA GTACGCACCG CCTATACCGA TACGCGCGCC CTCATCGTCG
401 GCAGCCGCAT CAGTTTCGAC GAAGTGGCAA ACACTTTGGC AAACGCCGAA
451 AAACTGATAC AAAAAACCGT AGGCGAATAA
This corresponds to the amino acid sequence <SEQ ID 364; ORF97ng-1>:
1 MKHILPLIAA SALCISTASA HPAGKPPTQN ETAMTTHTLT SKYSFDETVS
51 RLETAIKSKG MDIFAVIDHQ EAARRNGLTM QPAKVIVFGT PKAGTPLMVK
101 DPAFALQLPL RVLVTETDGK VRTAYTDTRA LIVGSRISFD EVANTLANAE
151 KLIQKTVGE*
ORF97ng-1 and ORF97-1 show 96.2% identity in a 159 amino acid overlap:
10 20 30 40 50 60
orf97-1.pep MKHILPLIAASALCISTASAHPASEPSTQNETAMTTHTLTSKYSFDETVSRLETAIKSKG
|||||||||||||||||||||||::| |||||||||||||||||||||||||||||||||
orf97ng-1 MKHILPLIAASALCISTASAHPAGKPPTQNETAMTTHTLTSKYSFDETVSRLETAIKSKG
10 20 30 40 50 60
70 80 90 100 110 120
orf97-1.pep MDIFAVIDHQEAARRNGLTMQPAKVIVFGTPKAGTPLMVKDPAFALQLPLRVLVTETDGK
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf97ng-1 MDIFAVIDHQEAARRNGLTMQPAKVIVFGTPKAGTPLMVKDPAFALQLPLRVLVTETDGK
70 80 90 100 110 120
130 140 150 160
orf97-1.pep VRAAYTDTRALIAGSRIGFDEVANTLANAEKLIQKTVGEX
||:|||||||||:||||:||||||||||||||||||||||
orf97ng-1 VRTAYTDTRALIVGSRISFDEVANTLANAEKLIQKTVGEX
130 140 150 160
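The identity figure quoted for this alignment can be reproduced by counting matching positions. The sketch below assumes a gap-free alignment of equal-length sequences, which holds for this ORF97-1 / ORF97ng-1 pair; the helper name `percent_identity` is illustrative and is not taken from the alignment software used for the original analysis.

```python
def percent_identity(a: str, b: str) -> float:
    """Percentage of aligned positions carrying the same residue."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    # A gap character aligned with a gap character is not counted as a match.
    matches = sum(1 for x, y in zip(a, b) if x == y and x != "-")
    return 100.0 * matches / len(a)


# ORF97-1 (SEQ ID 358) and ORF97ng-1 (SEQ ID 364), stop codon omitted.
orf97_1 = (
    "MKHILPLIAASALCISTASAHPASEPSTQNETAMTTHTLTSKYSFDETVS"
    "RLETAIKSKGMDIFAVIDHQEAARRNGLTMQPAKVIVFGTPKAGTPLMVK"
    "DPAFALQLPLRVLVTETDGKVRAAYTDTRALIAGSRIGFDEVANTLANAE"
    "KLIQKTVGE"
)
orf97ng_1 = (
    "MKHILPLIAASALCISTASAHPAGKPPTQNETAMTTHTLTSKYSFDETVS"
    "RLETAIKSKGMDIFAVIDHQEAARRNGLTMQPAKVIVFGTPKAGTPLMVK"
    "DPAFALQLPLRVLVTETDGKVRTAYTDTRALIVGSRISFDEVANTLANAE"
    "KLIQKTVGE"
)
identity = percent_identity(orf97_1, orf97ng_1)
print(round(identity, 1))  # 96.2
```

Six of the 159 positions differ, giving 153/159, which rounds to the 96.2% stated above.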
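Based on this analysis, including the presence of a putative leader sequence in the gonococcal protein, it is predicted that this protein from N. meningitidis and N. gonorrhoeae, and its epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies.
ORF97-1 (15.3 kDa) was cloned in pET and pGex vectors and expressed in E. coli, as described above. The products of protein expression and purification were analyzed by SDS-PAGE. Figures 12A and 12B show the results of affinity purification of the GST-fusion protein and the His-fusion protein, respectively. Purified GST-fusion protein was used to immunize mice, whose sera were used for Western blot (Figure 12C), ELISA (positive result), and FACS analysis (Figure 12D). These experiments confirm that ORF97-1 is a surface-exposed protein and a useful immunogen.
Figure 12E shows plots of hydrophilicity, antigenic index, and AMPHI regions for ORF97-1.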
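A per-residue profile of the kind plotted in Figure 12E can be approximated with a sliding-window average over a residue scale. The sketch below uses the Kyte-Doolittle hydropathy scale as a stand-in, since the text does not say which scale or window size was used for the figure; on this scale positive values indicate hydrophobic stretches and negative values indicate hydrophilic ones. The names `KYTE_DOOLITTLE` and `hydropathy_profile` and the default window of 9 are assumptions.

```python
# Kyte-Doolittle hydropathy values (positive = hydrophobic).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(protein: str, window: int = 9) -> list:
    """Mean hydropathy of every full window along the protein."""
    scores = [KYTE_DOOLITTLE[residue] for residue in protein]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]


# First 20 residues of ORF97-1 (SEQ ID 358), the putative leader region.
leader = "MKHILPLIAASALCISTASA"
profile = hydropathy_profile(leader)
print(len(profile), round(max(profile), 2))
```

A run of strongly positive window averages near the N-terminus is what one would expect from a hydrophobic leader sequence, whereas surface-exposed segments tend to score negative on this scale.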
Example 43
The following DNA sequence, believed to be complete, was identified in N. meningitidis <SEQ ID 365>:
1 ATGGCTTTTA TTACGCGCTT ATTCAAAAGC AGTAAATGGC TGATTGTGCC
51 GCTGATGCTC CCCGCCTTTC AGAATGTGGC GGCGGAGGGG ATAGATGTGA
101 GCCGTGCCGA AGCGAGGATA ACCGACGGCG GGCAGCTTTC CATCAGCAGC
151 CGCTTCCAAA CCGAGCTGCC CGACCAGCTC CAACAGGCGT TGCGCCGGGg
201 CGTGCCGCTC AACTTTACCT TAAGCTGGCA GCTTTCCGCC CCGATAATCG
251 CTTCTTATCG GTTTAAATTG GGGCAACTGA TTGGCGATGA CGACaATATT
301 GACTACAAAC TGAGTTTCCA TCCGCTGACc AaACGCTACC GCGTTACCgT
351 CGgCGCGTTT TCGACAGACT ACGACACCTT GGATGCGGCA TTGCGCGCGA
401 CCGGCGCGGT TGCCAACTGG AAAGTCCTGA ACAAAGGCGC GCTGTCCGGT
451 GCGGAAGCAG GGGAAACCAA GGCGGAAATC CGCCTGACGC TGTCCACTTC
501 AAAACTGCCC AAGCCTTTTC AAATCAATGC ATTGACTTCT CAAAACTGGC
551 ATTTGGATTC GGGTTGGAAA CCTCTAAACA TCATCGGGAA CAAATAA
This corresponds to the amino acid sequence <SEQ ID 366; ORF106>:
1 MAFITRLFKS SKWLIVPLML PAFQNVAAEG IDVSRAEARI TDGGQLSISS
51 RFQTELPDQL QQALRRGVPL NFTLSWQLSA PIIASYRFKL GQLIGDDDNI
101 DYKLSFHPLT KRYRVTVGAF STDYDTLDAA LRATGAVANW KVLNKGALSG
151 AEAGETKAEI RLTLSTSKLP KPFQINALTS QNWHLDSGWK PLNIIGNK*
Further work revealed the following DNA sequence <SEQ ID 367>:
1 ATGGCTTTTA TTACGCGCTT ATTCAAAAGC AGTAAATGGC TGATTGTGCC
51 GCTGATGCTC CCCGCCTTTC AGAATGTGGC GGCGGAGGGG ATAGATGTGA
101 GCCGTGCCGA AGCGAGGATA ACCGACGGCG GGCAGCTTTC CATCAGCAGC
151 CGCTTCCAAA CCGAGCTGCC CGACCAGCTC CAACAGGCGT TGCGCCGGGG
201 CGTGCCGCTC AACTTTACCT TAAGCTGGCA GCTTTCCGCC CCGATAATCG
251 CTTCTTATCG GTTTAAATTG GGGCAACTGA TTGGCGATGA CGACAATATT
301 GACTACAAAC TGAGTTTCCA TCCGCTGACC AACCGCTACC GCGTTACCGT
351 CGGCGCGTTT TCGACAGACT ACGACACCTT GGATGCGGCA TTGCGCGCGA
401 CCGGCGCGGT TGCCAACTGG AAAGTCCTGA ACAAAGGCGC GCTGTCCGGT
451 GCGGAAGCAG GGGAAACCAA GGCGGAAATC CGCCTGACGC TGTCCACTTC
501 AAAACTGCCC AAGCCTTTTC AAATCAATGC ATTGACTTCT CAAAACTGGC
551 ATTTGGATTC GGGTTGGAAA CCTCTAAACA TCATCGGGAA CAAATAA
This corresponds to the amino acid sequence <SEQ ID 368; ORF106-1>:
1 MAFITRLFKS SKWLIVPLML PAFQNVAAEG IDVSRAEARI TDGGQLSISS
51 RFQTELPDQL QQALRRGVPL NFTLSWQLSA PIIASYRFKL GQLIGDDDNI
101 DYKLSFHPLT NRYRVTVGAF STDYDTLDAA LRATGAVANW KVLNKGALSG
151 AEAGETKAEI RLTLSTSKLP KPFQINALTS QNWHLDSGWK PLNIIGNK*
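Computer analysis of this amino acid sequence gave the following results:
Homology with a predicted ORF from N. meningitidis (strain A)
ORF106 shows 87.4% identity over a 199 amino acid overlap with an ORF (ORF106a) from strain A of N. meningitidis: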
10 20 30 40 50 59
orf106.pep MAFITRLFKSSK-WLIVPLMLPAFQNVAAEGIDVSRAEARITDGGQLSISSRFQTELPDQ
|||||||||| | ||:: || :: ::||||||||||||||:|||||| ||||||||||
orf106a MAFITRLFKSIKQWLVLLPMLSVLPDAAAEGIDVSRAEARIXDGGQLSXXSRFQTELPDQ
10 20 30 40 50 60
60 70 80 90 100 110 119
orf106.pep LQQALRRGVPLNFTLSWQLSAPIIASYRFKLGQLIGDDDNIDYKLSFHPLTKRYRVTVGA
|| | ||| || || ||||||||||||| ||||||||| |||||||||||:||||||||
orf106a LQXAXXRGVXLNXTLXWQLSAPIIASYRFXLGQLIGDDDXIDYKLSFHPLTNRYRVTVGA
70 80 90 100 110 120
120 130 140 150 160 170 179
orf106.pep FSTDYDTLDAALRATGAVANWKVLNKGALSGAEAGETKAEIRLTLSTSKLPKPFQINALT
||| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf106a FSTXYDTLDAALRATGAVANWKVLNKGALSGAEAGETKAEIRLTLSTSKLPKPFQINALT
130 140 150 160 170 180
180 190 199
orf106.pep SQNWHLDSGWKPLNIIGNKX
||||||||||||||||||||
orf106a SQNWHLDSGWKPLNIIGNKX
190 200
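Due to the substitution of K by N at residue 111, the homology between ORF106a and ORF106-1 is 87.9% over the same 199 amino acid overlap.
The complete length ORF106a nucleotide sequence <SEQ ID 369> is: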
1 ATGGCTTTTA TTACGCGCTT ATTCAAAAGC ATTAAACAAT GGCTTGTGCT
51 GCTGCCGATG CTTTCCGTTT TGCCGGACGC GGCGGCGGAG GGGATAGATG
101 TGAGCCGCGC CGAAGCGAGG ATAANCGACG GCGGGCAGCT TTCCATNAGN
151 AGCCGCTTCC AAACCGAGCT GCCCGACCAG CTCCAANNNG CGNNGNGCCG
201 GGGCGTGNCG CTCAACTNTA CCTTAAGNTG GCAGCTTTCC GCCCCGATAA
251 TCGCTTCTTA TCGGTTTNAA TTGGGGCAAC TGATTGGCGA TGACGACNAT
301 ATTGACTACA AACTGAGTTT CCATCCGCTG ACCAACCGCT ACCGCGTTAC
351 CGTCGGCGCG TTTTCGACAG ANTACGACAC CTTGGATGCG GCATTGCGCG
401 CGACCGGCGC GGTTGCCAAC TGGAAAGTCC TGAACAAAGG CGCGCTGTCC
451 GGTGCGGAAG CAGGGGAAAC CAAGGCGGAA ATCCGCCTGA CGCTGTCCAC
501 TTCAAAACTG CCCAAGCCTT TTCAAATCAA TGCATTGACT TCTCAAAACT
551 GGCATTTGGA TTCGGGTTGG AAACCTCTAA ACATCATCGG GAACAAATAA
This encodes a protein having the amino acid sequence <SEQ ID 370>:
1 MAFITRLFKS IKQWLVLLPM LSVLPDAAAE GIDVSRAEAR IXDGGQLSXX
51 SRFQTELPDQ LQXAXXRGVX LNXTLXWQLS APIIASYRFX LGQLIGDDDX
101 IDYKLSFHPL TNRYRVTVGA FSTXYDTLDA ALRATGAVAN WKVLNKGALS
151 GAEAGETKAE IRLTLSTSKL PKPFQINALT SQNWHLDSGW KPLNIIGNK*
Homology with a predicted ORF from N. gonorrhoeae
ORF106 shows 90.5% identity over a 199 amino acid overlap with a predicted ORF (ORF106.ng) from N. gonorrhoeae:
orf106.pep MAFITRLFKSSK-WLIVPLMLPAFQNVAAEGIDVSRAEARITDGGQLSISSRFQTELPDQ 59
|||||||||| | ||:: :| :: ::||||| ::||||||||||:||||||||||||||
orf106ng MAFITRLFKSIKQWLVLLPILSVLPDAAAEGIAATRAEARITDGGRLSISSRFQTELPDQ 60
orf106.pep LQQALRRGVPLNFTLSWQLSAPIIASYRFKLGQLIGDDDNIDYKLSFHPLTKRYRVTVGA 119
|||||||||||||||||||||| ||||||||||||||||||||||||||||:||||||||
orf106ng LQQALRRGVPLNFTLSWQLSAPTIASYRFKLGQLIGDDDNIDYKLSFHPLTNRYRVTVGA 120
orf106.pep FSTDYDTLDAALRATGAVANWKVLNKGALSGAEAGETKAEIRLTLSTSKLPKPFQINALT 179
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf106ng FSTDYDTLDAALRATGAVANWKVLNKGALSGAEAGETKAEIRLTLSTSKLPKPFQINALT 180
orf106.pep SQNWHLDSGWKPLNIIGNK 198
|||||||||||||||||||
orf106ng SQNWHLDSGWKPLNIIGNK 199
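Due to the substitution of K by N at residue 111, the homology between ORF106ng and ORF106-1 is 91.0% over the same 199 amino acid overlap.
The complete length ORF106ng nucleotide sequence <SEQ ID 371> is: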
1 ATGGCTTTTA TTACGCGCTT ATTCAAAAGC ATTAAACAAT GGCTTGTGCT
51 GTTGCCGATA CTCTCCGTTT TGCCGGACGC GGCGGCGGAG GGCATTGCCG
101 CGACCCGCGC CGAAGCGAGG ATAACCGACG GCGGGCGGCT TTCCATCAGC
151 AGCCGCTTCC AAACCGAGCT GCCCGACCAG CTCCAACAGG CGTTGCGCCG
201 GGGCGTACCG CTCAACTTTA CCTTAAGCTG GCAGCTTTCC GCCCCGACAA
251 TCGCTTCTTA TCGGTTTAAA TTGGGGCAAC TGATTGGCGA TGACGACAAT
301 ATTGACTACA AACTAAGTTT CCATCCGCTG ACCAACCGCT ACCGCGTTAC
351 CGTCGGCGCA TTTTCCACCG ATTACGACAC TTTGGATGCG GCATTGCGCG
401 CGACCGGCGC GGTTGCCAAC TGGAAAGTCC TGAACAAAGG CGCGTTGTCC
451 GGTGCGGAAG CAGGGGAAAC CAAGGCGGAA ATCCGCCTGA CGCTGTCCAC
501 TTCAAAACTG CCCAAGCCTT TCCAAATCAA CGCATTGACT TCTCAAAACT
551 GGCATTTGGA TTCGGGTTGG AAACCTCTAA ACATCATCGG GAACAAATAA
This encodes a protein having the amino acid sequence <SEQ ID 372>:
1 MAFITRLFKS IKQWLVLLPI LSVLPDAAAE GIAATRAEAR ITDGGRLSIS
51 SRFQTELPDQ LQQALRRGVP LNFTLSWQLS APTIASYRFK LGQLIGDDDN
101 IDYKLSFHPL TNRYRVTVGA FSTDYDTLDA ALRATGAVAN WKVLNKGALS
151 GAEAGETKAE IRLTLSTSKL PKPFQINALT SQNWHLDSGW KPLNIIGNK*
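Based on this analysis, including the presence of a putative leader sequence in the gonococcal protein, it is predicted that this protein from N. meningitidis and N. gonorrhoeae, and its epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies.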
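As described above, ORF106-1 (18 kDa) was cloned into pET and pGEX vectors and expressed in E. coli. The products of protein expression and purification were analysed by SDS-PAGE. Figure 13A shows the result of affinity purification of the His-fusion protein, and Figure 13B shows the result of expression of the GST-fusion in E. coli. Purified His-fusion protein was used to immunise mice, and the mouse sera were used for FACS analysis (Figure 13C). These experiments confirm that ORF106-1 is a surface-exposed protein and a useful immunogen.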
Example 44
The following DNA sequence, believed to be complete, was identified in N. meningitidis <SEQ ID 373>:
1 ATGGACACAA AAGAAATCCT CGG.TACGCG GcAGGcTCGA TCGGCAGCGC
51 GGTTTTAGCC GTCATCATCc TGCCGCTGCT GTCGTGGTAT TTCCCCGCCG
101 ACGACATCGG GCGCATCGTG CTGATGCAGA CGGCGGCGGG GCTgACGGTG
151 TCGGTGTTGT GCCTCGGGCT GGATCAGGCA TACGTCCGCG AATACTATGC
201 CACCGCCGAC AAAGACAcCT TGTTCAAAAC CCTGTTCCTG CCGCCGCTGC
251 TGTCTGCCGC CGCGATAGCC GCCCTGCTGC TTTCCCGCCC GTCCCTGCCG
301 TCTGAAATCC TGTTTTCACT CGACGATGCC gCCGCCGGCa TCGGGCTGGT
351 GCTGTTTGAA CtGAGCTTCC TGCCCATCCG cTTTCTCTTA CTGGTTTTGC
401 GTATGGAAGG ACGCGCCcTT GCCTTTTCGT CCGCGCAACT CGTGCcCAAG
451 CTCGCCATCC TGCTGCTG.T GCCGCTGACG GTCGGGCTGC TGCACTTTCC
501 AGCGAACACC GCCGTCCTGA CCGCCGTTTA CGCGCTGGCA AACCTTGCCG
551 CCGCCGCCTT TTTGCTGTTT CAAAACCGAT GCCGTCTGAA GGCCGTCCGG
601 CACGCACCGT TTTCGCCCGC CGTCCTGCAC CGGGGG.TGC GCTACGGCAT
651 ACCGATCGCA CTGAGCAGCA TCGCCTATTG GGGGCTGGCA TCCGCCGACC
701 GTTTGTTCCT GAAAAAATAT GCCGGCCTGG AACAGCTCGG CGTTTATTCG
751 ATGGGTATTT CGTTCGGCGG GGCGGCATTA TTGTTCCAAA GCATCTTTTC
801 AACGGTCTGG ACACCGTATA TTTTCCGCGC AATCGAAGAA AACGCCCCGC
851 CCGCTCGCCT CTCGGCAACG GCAGAATCCG CCGCCGCCCT GCTTGCCTCC
901 GCCCTCTGC. TGACCGGCAT TTTCTCGCCC CTTGCCTCCC TCCTGCTGCC
951 GGAAAACTAC GCCGCCGTCC GGTTTATCGT CGTATCGTGT ATG.TGCCGC
1001 CGCTGTTTTG CACGCTGGCG GAAATCAGCG GCATCGGTTT GAACGTCGTT
1051 CGCAAAACGC GCCCGATCGC GCTCGCCACC TTGGGCGCGC TGGCGGCAAA
1101 CCTGCTGCTG CTGGGGCTTG ACCGTGCCGT ACCGGCGAGG CCGCC.GGCG
1151 CGGCGGTTGC CTGTGCCGCC TCATTCTGGC TGTTTTTTGC CTTCAAGACC
1201 GAAAGCTCyT GCCGCCTGTG GCAGCCGCTC AAACGCCTGC CGCTTTATCT
1251 GCACACATTG TTCTGCCTGA CCTCCTCGGC GGCCTACACC TGCTTCGGCA
1301 CGCCGGCAAA CTATCCCCTG TTTGCCGGCG TATGGGCGGC ATATCTGGCA
1351 GGCTGCATCC TGCGCCACCG GAAAGATTTG CACAAACTGT TTCATTATTT
1401 GAAAAAACAA GGTTTCCCAT TATGA
This corresponds to the amino acid sequence <SEQ ID 374; ORF10>:
1 MDTKEILXYA AGSIGSAVLA VIILPLLSWY FPADDIGRIV LMQTAAGLTV
51 SVLCLGLDQA YVREYYATAD KDTLFKTLFL PPLLSAAAIA ALLLSRPSLP
101 SEILFSLDDA AAGIGLVLFE LSFLPIRFLL LVLRMEGRAL AFSSAQLVPK
151 LAILLLXPLT VGLLHFPANT AVLTAVYALA NLAAAAFLLF QNRCRLKAVR
201 HAPFSPAVLH RGXRYGIPIA LSSIAYWGLA SADRLFLKKY AGLEQLGVYS
251 MGISFGGAAL LFQSIFSTVW TPYIFRAIEE NAPPARLSAT AESAAALLAS
301 ALCXTGIFSP LASLLLPENY AAVRFIVVSC MXPPLFCTLA EISGIGLNVV
351 RKTRPIALAT LGALAANLLL LGLDRAVPAR PXGAAVACAA SFWLFFAFKT
401 ESSCRLWQPL KRLPLYLHTL FCLTSSAAYT CFGTPANYPL FAGVWAAYLA
451 GCILRHRKDL HKLFHYLKKQ GFPL*
Further sequence analysis revealed the complete DNA sequence <SEQ ID 375>:
1 ATGGACACAA AAGAAATCCT CGGCTACGCG GCAGGCTCGA TCGGCAGCGC
51 GGTTTTAGCC GTCATCATCC TGCCGCTGCT GTCGTGGTAT TTCCCCGCCG
101 ACGACATCGG GCGCATCGTG CTGATGCAGA CGGCGGCGGG GCTGACGGTG
151 TCGGTGTTGT GCCTCGGGCT GGATCAGGCA TACGTCCGCG AATACTATGC
201 CACCGCCGAC AAAGACACCT TGTTCAAAAC CCTGTTCCTG CCGCCGCTGC
251 TGTCTGCCGC CGCGATAGCC GCCCTGCTGC TTTCCCGCCC GTCCCTGCCG
301 TCTGAAATCC TGTTTTCACT CGACGATGCC GCCGCCGGCA TCGGGCTGGT
351 GCTGTTTGAA CTGAGCTTCC TGCCCATCCG CTTTCTCTTA CTGGTTTTGC
401 GTATGGAAGG ACGCGCCCTT GCCTTTTCGT CCGCGCAACT CGTGCCCAAG
451 CTCGCCATCC TGCTGCTGCT GCCGCTGACG GTCGGGCTGC TGCACTTTCC
501 AGCGAACACC GCCGTCCTGA CCGCCGTTTA CGCGCTGGCA AACCTTGCCG
551 CCGCCGCCTT TTTGCTGTTT CAAAACCGAT GCCGTCTGAA GGCCGTCCGG
601 CACGCACCGT TTTCGCCCGC CGTCCTGCAC CGGGGGCTGC GCTACGGCAT
651 ACCGATCGCA CTGAGCAGCA TCGCCTATTG GGGGCTGGCA TCCGCCGACC
701 GTTTGTTCCT GAAAAAATAT GCCGGCCTGG AACAGCTCGG CGTTTATTCG
751 ATGGGTATTT CGTTCGGCGG GGCGGCATTA TTGTTCCAAA GCATCTTTTC
801 AACGGTCTGG ACACCGTATA TTTTCCGCGC AATCGAAGAA AACGCCCCGC
851 CCGCCCGCCT CTCGGCAACG GCAGAATCCG CCGCCGCCCT GCTTGCCTCC
901 GCCCTCTGCC TGACCGGCAT TTTCTCGCCC CTTGCCTCCC TCCTGCTGCC
951 GGAAAACTAC GCCGCCGTCC GGTTTATCGT CGTATCGTGT ATGCTGCCGC
1001 CGCTGTTTTG CACGCTGGCG GAAATCAGCG GCATCGGTTT GAACGTCGTC
1051 CGCAAAACGC GCCCGATCGC GCTCGCCACC TTGGGCGCGC TGGCGGCAAA
1101 CCTGCTGCTG CTGGGGCTTG CCGTGCCGTC CGGCGGCGCG CGCGGCGCGG
1151 CGGTTGCCTG TGCCGCCTCA TTCTGGCTGT TTTTTGCCTT CAAGACCGAA
1201 AGCTCCTGCC GCCTGTGGCA GCCGCTCAAA CGCCTGCCGC TTTATCTGCA
1251 CACATTGTTC TGCCTGACCT CCTCGGCGGC CTACACCTGC TTCGGCACGC
1301 CGGCAAACTA TCCCCTGTTT GCCGGCGTAT GGGCGGCATA TCTGGCAGGC
1351 TGCATCCTGC GCCACCGGAA AGATTTGCAC AAACTGTTTC ATTATTTGAA
1401 AAAACAAGGT TTCCCATTAT GA
This corresponds to the amino acid sequence <SEQ ID 376; ORF10-1>:
1 MDTKEILGYA AGSIGSAVLA VIILPLLSWY FPADDIGRIV LMQTAAGLTV
51 SVLCLGLDQA YVREYYATAD KDTLFKTLFL PPLLSAAAIA ALLLSRPSLP
101 SEILFSLDDA AAGIGLVLFE LSFLPIRFLL LVLRMEGRAL AFSSAQLVPK
151 LAILLLLPLT VGLLHFPANT AVLTAVYALA NLAAAAFLLF QNRCRLKAVR
201 HAPFSPAVLH RGLRYGIPIA LSSIAYWGLA SADRLFLKKY AGLEQLGVYS
251 MGISFGGAAL LFQSIFSTVW TPYIFRAIEE NAPPARLSAT AESAAALLAS
301 ALCLTGIFSP LASLLLPENY AAVRFIVVSC MLPPLFCTLA EISGIGLNVV
351 RKTRPIALAT LGALAANLLL LGLAVPSGGA RGAAVACAAS FWLFFAFKTE
401 SSCRLWQPLK RLPLYLHTLF CLTSSAAYTC FGTPANYPLF AGVWAAYLAG
451 CILRHRKDLH KLFHYLKKQG FPL*
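Computer analysis of this amino acid sequence gave the following results:
Prediction
ORF10-1 is predicted to be the precursor of an integral membrane protein, since it contains several (12-13) potential transmembrane segments and a probable cleavable signal peptide.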
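The patent does not say which program produced the transmembrane prediction. As a rough illustration of how hydrophobic stretches are commonly screened, the sketch below applies a sliding-window Kyte-Doolittle hydropathy average. The window of 19 residues and the cut-off of 1.6 are conventional choices assumed here, not values taken from the patent, and the function name `hydrophobic_windows` is hypothetical. It is a crude screen and not a substitute for a dedicated topology predictor.

```python
# Kyte-Doolittle hydropathy scale
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydrophobic_windows(protein, window=19, threshold=1.6):
    """Return 1-based start positions of windows whose mean hydropathy >= threshold.

    Unknown residues (e.g. X) score 0.0 so that positions are not shifted.
    """
    scores = [KD.get(aa, 0.0) for aa in protein.upper().rstrip("*")]
    hits = []
    for i in range(len(scores) - window + 1):
        mean = sum(scores[i:i + window]) / window
        if mean >= threshold:
            hits.append(i + 1)
    return hits


# Residues 1-50 of ORF10-1 <SEQ ID 376>, spaces removed
n_terminus = "MDTKEILGYAAGSIGSAVLAVIILPLLSWYFPADDIGRIVLMQTAAGLTV"
print(hydrophobic_windows(n_terminus))
```

Runs of consecutive start positions indicate one candidate segment; merging such runs and counting them gives an estimate that can be compared with the 12-13 segments stated above.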
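Homology with EpsM from Streptococcus salivarius subsp. thermophilus (accession number U40830)
ORF10 shows homology with the epsM gene of S. salivarius subsp. thermophilus, which encodes a protein of a size similar to ORF10 and is involved in exopolysaccharide synthesis. It also shows other homologies with prokaryotic membrane proteins: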
Identities = (25%)
Query: 213 LRYGIPLALSSLAYWGLASADRLFLKKYAGLEQLGVYSMGISFGGAALLLQSIFSTVW 270
L Y +PL SS+ +W L ++ R F+ + G G+ ++ + +IF+ W
Sbjct: 210 LYYALPLIPSSILWWLLNASSRYFVLFFLGAGANGLLAVATKIPSIISIFNTIFTQAW 267
Identities = 15/57 (26%), Positives = 31/57 (54%)
Query: 7 LGYAAGSIGSAVLAVIILPLLSWYFPADDIGRIVLMQTAAGLTVSVLCLGLDQAYVR 63
L + G++GS +L +++PL ++ + G L QT A L + ++ + + A +R
Sbjct: 12 LVFTIGNLGSKLLVFLLVPLYTYAMTPQEYGMADLYQTTANLLLPLITMNVFDATLR 68
Identities = 16/96 (16%), Positives = 36/96 (37%)
Query: 307 IFSPLASLLLPENYAAVRFTVVSCMLPPLFYTLTEISGIGLNVVRKTRPIXXXXXXXXXX 366
+ P+ ++ +YA+ V ML LF + ++ G ++T+ +
Sbjct: 305 VLKPIVEKVVSSDYASSWQYVPFFMLSMLFSSFSDFFGTNYIAAKQTKGVFMTSIYGTIV 364
Homology with a predicted ORF from N. meningitidis (strain A)
ORF10 shows 95.4% identity over a 475 aa overlap with an ORF (ORF10a) from strain A of N. meningitidis:
10 20 30 40 50 60
orf10.pep MDTKEILXYAAGSIGSAVLAVIILPLLSWYFPADDIGRIVLMQTAAGLTVSVLCLGLDQA
||||||| ||||||||||||||||||||||||||||||||||||||||||||||||||||
orf10a MDTKEILGYAAGSIGSAVLAVIILPLLSWYFPADDIGRIVLMQTAAGLTVSVLCLGLDQA
10 20 30 40 50 60
70 80 90 100 110 120
orf10.pep YVREYYATADKDTLFKTLFLPPLLSAAAIAALLLSRPSLPSEILFSLDDAAAGIGLVLFE
|||||||:||||||||||||||||||||||||||||||||||||||||||||||||||||
orf10a YVREYYAAADKDTLFKTLFLPPLLSAAAIAALLLSRPSLPSEILFSLDDAAAGIGLVLFE
70 80 90 100 110 120
130 140 150 160 170 180
orf10.pep LSFLPIRFLLLVLRMEGRALAFSSAQLVPKLAILLLXPLTVGLLHFPANTAVLTAVYALA
|||||||||||||||||||||||||||| ||||||| |||||||||||||||||||||||
orf10a LSFLPIRFLLLVLRMEGRALAFSSAQLVSKLAILLLLPLTVGLLHFPANTAVLTAVYALA
130 140 150 160 170 180
190 200 210 220 230 240
orf10.pep NLAAAAFLLFQNRCRLKAVRHAPFSPAVLHRGXRYGIPIALSSIAYWGLASADRLFLKKY
||||||||||||||||||||:|||| |||||| |||||||||||||||||||||||||||
orf10a NLAAAAFLLFQNRCRLKAVRRAPFSSAVLHRGLRYGIPIALSSIAYWGLASADRLFLKKY
190 200 210 220 230 240
250 260 270 280 290 300
orf10.pep AGLEQLGVYSMGISFGGAALLFQSIFSTVWTPYIFRAIEENAPPARLSATAESAAALLAS
||||||||||||||||||||||||||||||||||||||| ||||||||||||||||||||
orf10a AGLEQLGVYSMGISFGGAALLFQSIFSTVWTPYIFRAIEANAPPARLSATAESAAALLAS
250 260 270 280 290 300
310 320 330 340 350 360
orf10.pep ALCXTGIFSPLASLLLPENYAAVRFIVVSCMXPPLFCTLAEISGIGLNVVRKTRPIALAT
||| ||||||||||||||||||||||||||| |||||||:||||||||||||||||||||
orf10a ALCLTGIFSPLASLLLPENYAAVRFIVVSCMLPPLFCTLVEISGIGLNVVRKTRPIALAT
310 320 330 340 350 360
370 380 390 400 410 419
orf10.pep LGALAANLLLLGLDRAVPAR-PXGAAVACAASFWLFFAFKTESSCRLWQPLKRLPLYLHT
||||||||||||| |||: ||||||||||||||:|||||||||||||||||||:||
orf10a LGALAANLLLLGL--AVPSGGARGAAVACAASFWLFFVFKTESSCRLWQPLKRLPLYMHT
370 380 390 400 410
420 430 440 450 460 470
orf10.pep LFCLTSSAAYTCFGTPANYPLFAGVWAAYLAGCILRHRKDLHKLFHYLKKQGFPLX
||||:||||||||||||||||||||||:||||||||||||||||||||||||||||
orf10a LFCLASSAAYTCFGTPANYPLFAGVWAVYLAGCILRHRKDLHKLFHYLKKQGFPLX
420 430 440 450 460 470
The complete length ORF10a nucleotide sequence <SEQ ID 377> is:
1 ATGGACACAA AAGAAATCCT CGGCTACGCG GCAGGCTCGA TCGGCAGCGC
51 GGTTTTAGCC GTCATCATCC TGCCGCTGCT GTCGTGGTAT TTCCCTGCCG
101 ACGACATCGG ACGCATCGTG CTGATGCAGA CGGCGGCGGG GCTGACGGTG
151 TCGGTGTTGT GCCTCGGGCT GGATCAGGCA TACGTCCGCG AATACTATGC
201 CGCCGCCGAC AAAGACACTT TGTTCAAAAC CCTGTTCCTG CCGCCGCTGC
251 TGTCTGCCGC CGCGATAGCC GCCCTGCTGC TTTCCCGCCC ATCCCTGCCG
301 TCTGAAATCC TGTTTTCGCT CGACGATGCC GCCGCCGGCA TCGGGCTGGT
351 GCTGTTTGAA CTGAGCTTCC TGCCCATCCG CTTTCTCTTA CTGGTTTTGC
401 GTATGGAAGG ACGCGCCCTT GCCTTTTCGT CCGCGCAACT CGTGTCCAAG
451 CTCGCCATCC TGCTGCTGCT GCCGCTGACG GTCGGGCTGC TGCACTTTCC
501 GGCGAACACC GCCGTCCTGA CCGCCGTTTA CGCGCTGGCA AACCTTGCCG
551 CCGCCGCCTT TTTGCTGTTT CAAAACCGAT GCCGTCTGAA GGCCGTCCGG
601 CGCGCACCGT TTTCATCCGC CGTCCTGCAT CGCGGCCTGC GCTACGGCAT
651 ACCGATCGCA CTAAGCAGCA TCGCCTATTG GGGGCTGGCA TCCGCCGACC
701 GTTTGTTCCT GAAAAAATAT GCCGGCCTAG AACAGCTCGG CGTTTATTCG
751 ATGGGTATTT CGTTCGGCGG AGCGGCATTA TTGTTCCAAA GCATCTTTTC
801 AACGGTCTGG ACACCGTATA TTTTCCGCGC AATCGAAGCA AACGCCCCGC
851 CCGCCCGCCT CTCGGCAACG GCAGAATCCG CCGCCGCCCT GCTTGCCTCC
901 GCCCTCTGCC TGACCGGCAT TTTCTCGCCC CTCGCCTCCC TCCTGCTGCC
951 GGAAAACTAC GCCGCCGTCC GGTTTATCGT CGTATCGTGT ATGCTGCCTC
1001 CGCTGTTTTG CACGCTGGTA GAAATCAGCG GCATCGGTTT GAACGTCGTC
1051 CGAAAAACAC GCCCGATCGC GCTCGCCACC TTGGGCGCGC TGGCGGCAAA
1101 CCTGCTGCTG CTGGGGCTTG CCGTACCGTC CGGCGGCGCG CGCGGCGCGG
1151 CGGTTGCCTG TGCCGCCTCA TTTTGGCTGT TTTTTGTTTT CAAGACCGAA
1201 AGCTCCTGCC GCCTGTGGCA GCCGCTCAAA CGCCTGCCGC TTTATATGCA
1251 CACATTGTTC TGCCTGGCCT CCTCGGCGGC CTACACCTGC TTCGGCACTC
1301 CGGCAAACTA CCCCCTGTTT GCCGGCGTAT GGGCGGTATA TCTGGCAGGC
1351 TGCATCCTGC GCCACCGGAA AGATTTGCAC AAACTGTTTC ATTATTTGAA
1401 AAAACAAGGT TTCCCATTAT GA
This encodes a protein having the amino acid sequence <SEQ ID 378>:
1 MDTKEILGYA AGSIGSAVLA VIILPLLSWY FPADDIGRIV LMQTAAGLTV
51 SVLCLGLDQA YVREYYAAAD KDTLFKTLFL PPLLSAAAIA ALLLSRPSLP
101 SEILFSLDDA AAGIGLVLFE LSFLPIRFLL LVLRMEGRAL AFSSAQLVSK
151 LAILLLLPLT VGLLHFPANT AVLTAVYALA NLAAAAFLLF QNRCRLKAVR
201 RAPFSSAVLH RGLRYGIPIA LSSIAYWGLA SADRLFLKKY AGLEQLGVYS
251 MGISFGGAAL LFQSIFSTVW TPYIFRAIEA NAPPARLSAT AESAAALLAS
301 ALCLTGIFSP LASLLLPENY AAVRFIVVSC MLPPLFCTLV EISGIGLNVV
351 RKTRPIALAT LGALAANLLL LGLAVPSGGA RGAAVACAAS FWLFFVFKTE
401 SSCRLWQPLK RLPLYMHTLF CLASSAAYTC FGTPANYPLF AGVWAVYLAG
451 CILRHRKDLH KLFHYLKKQG FPL*
ORF10a and ORF10-1 show 95.4% identity in a 475 aa overlap:
10 20 30 40 50 60
orf10-1.pep MDTKEILXYAAGSIGSAVLAVIILPLLSWYFPADDIGRIVLMQTAAGLTVSVLCLGLDQA
||||||| ||||||||||||||||||||||||||||||||||||||||||||||||||||
orf10a MDTKEILGYAAGSIGSAVLAVIILPLLSWYFPADDIGRIVLMQTAAGLTVSVLCLGLDQA
10 20 30 40 50 60
70 80 90 100 110 120
orf10-1.pep YVREYYATADKDTLFKTLFLPPLLSAAAIAALLLSRPSLPSEILFSLDDAAAGIGLVLFE
|||||||:||||||||||||||||||||||||||||||||||||||||||||||||||||
orf10a YVREYYAAADKDTLFKTLFLPPLLSAAAIAALLLSRPSLPSEILFSLDDAAAGIGLVLFE
70 80 90 100 110 120
130 140 150 160 170 180
orf10-1.pep LSFLPIRFLLLVLRMEGRALAFSSAQLVPKLAILLLXPLTVGLLHFPANTAVLTAVYALA
|||||||||||||||||||||||||||| ||||||| |||||||||||||||||||||||
orf10a LSFLPIRFLLLVLRMEGRALAFSSAQLVSKLAILLLLPLTVGLLHFPANTAVLTAVYALA
130 140 150 160 170 180
190 200 210 220 230 240
orf10-1.pep NLAAAAFLLFQNRCRLKAVRHAPFSPAVLHRGXRYGIPIALSSIAYWGLASADRLFLKKY
||||||||||||||||||||:|||| |||||| |||||||||||||||||||||||||||
orf10a NLAAAAFLLFQNRCRLKAVRRAPFSSAVLHRGLRYGIPIALSSIAYWGLASADRLFLKKY
190 200 210 220 230 240
250 260 270 280 290 300
orf10-1.pep AGLEQLGVYSMGISFGGAALLFQSIFSTVWTPYIFRAIEENAPPARLSATAESAAALLAS
||||||||||||||||||||||||||||||||||||||| ||||||||||||||||||||
orf10a AGLEQLGVYSMGISFGGAALLFQSIFSTVWTPYIFRAIEANAPPARLSATAESAAALLAS
250 260 270 280 290 300
310 320 330 340 350 360
orf10-1.pep ALCXTGIFSPLASLLLPENYAAVRFIVVSCMXPPLFCTLAEISGIGLNVVRKTRPIALAT
||| ||||||||||||||||||||||||||| |||||||:||||||||||||||||||||
orf10a ALCLTGIFSPLASLLLPENYAAVRFIVVSCMLPPLFCTLVEISGIGLNVVRKTRPIALAT
310 320 330 340 350 360
370 380 390 400 410 419
orf10-1.pep LGALAANLLLLGLDRAVPAR-PXGAAVACAASFWLFFAFKTESSCRLWQPLKRLPLYLHT
||||||||||||| |||: ||||||||||||||:|||||||||||||||||||:||
orf10a LGALAANLLLLGL--AVPSGGARGAAVACAASFWLFFVFKTESSCRLWQPLKRLPLYMHT
370 380 390 400 410
420 430 440 450 460 470
orf10-1.pep LFCLTSSAAYTCFGTPANYPLFAGVWAAYLAGCILRHRKDLHKLFHYLKKQGFPLX
||||:||||||||||||||||||||||:||||||||||||||||||||||||||||
orf10a LFCLASSAAYTCFGTPANYPLFAGVWAVYLAGCILRHRKDLHKLFHYLKKQGFPLX
420 430 440 450 460 470
Homology with a predicted ORF from N. gonorrhoeae
ORF10 shows 94.1% identity over a 475 aa overlap with a predicted ORF (ORF10.ng) from N. gonorrhoeae:
orf10ng.pep MDTKEILGYAAGSIGSAVLAVIILPLLSWYFPADDIGRIVLMQTAAGLTVSVLCLGLDQA 60
||||||| ||||||||||||||||||||||||||||||||||||||||||||||||||||
orf10nm MDTKEILXYAAGSIGSAVLAVIILPLLSWYFPADDIGRIVLMQTAAGLTVSVLCLGLDQA 60
orf10ng.pep YVREYYAAADKDTLFKTLFLPPLLFSAAIAALLLSRPSLPSEILFSLDDAAAGIGLVLFE 120
|||||||:|||||||||||||||| :||||||||||||||||||||||||||||||||||
orf10nm YVREYYATADKDTLFKTLFLPPLLSAAAIAALLLSRPSLPSEILFSLDDAAAGIGLVLFE 120
orf10ng.pep LSFLPIRFLLLVLRMEGRALAFSSAQLVPKLAILLLLPLTVGLLHFPANTSVLTAVYALA 180
|||||||||||||||||||||||||||||||||||| |||||||||||||:|||||||||
orf10nm LSFLPIRFLLLVLRMEGRALAFSSAQLVPKLAILLLXPLTVGLLHFPANTAVLTAVYALA 180
orf10ng.pep NLAAAAFLLFQNRCRLKAVRRAPFSPAVLHRGLRYGIPLALSSLAYWGLASADRLFLKKY 240
||||||||||||||||||||:||||||||||| |||||:||||:||||||||||||||||
orf10nm NLAAAAFLLFQNRCRLKAVRHAPFSPAVLHRGXRYGIPIALSSIAYWGLASADRLFLKKY 240
orf10ng.pep AGLEQLGVYSMGISFGGAALLLQSIFSTVWTPYIFRAIEENATPARLSATAESAAALLAS 300
|||||||||||||||||||||:|||||||||||||||||||| |||||||||||||||||
orf10nm AGLEQLGVYSMGISFGGAALLFQSIFSTVWTPYIFRAIEENAPPARLSATAESAAALLAS 300
orf10ng.pep ALCLTGIFSPLASLLLPENYAAVRFTVVSCMLPPLFYTLTEISGIGLNVVRKTRPIALAT 360
||| ||||||||||||||||||||| ||||| |||| ||:||||||||||||||||||||
orf10nm ALCXTGIFSPLASLLLPENYAAVRFIVVSCMXPPLFCTLAEISGIGLNVVRKTRPIALAT 360
370 380 390 400 410
orf10ng.pep LGALAANLLLLGL--AVPSGGTRGAAVACAASFWLFFVFKTESSCRLWQPLKRLPLYMHT
||||||||||||| |||: ||||||||||||||:|||||||||||||||||||:||
orf10nm LGALAANLLLLGLDRAVPAR-PXGAAVACAASFWLFFAFKTESSCRLWQPLKRLPLYLHT
370 380 390 400 410
420 430 440 450 460 470
orf10ng.pep LFCLASSAAYTCFGTPANYPLFAGVWAAYLAGCILRHRKNLHKLFHYLKKQGFPLX
||||:||||||||||||||||||||||||||||||||||:||||||||||||||||
orf10nm LFCLTSSAAYTCFGTPANYPLFAGVWAAYLAGCILRHRKDLHKLFHYLKKQGFPLX
420 430 440 450 460 470
The complete length ORF10ng nucleotide sequence <SEQ ID 379> is:
1 ATGGACACAA AAGAAATCCT CGGCTACGCG GCAGGCTCGA TCGGCAGCGC
51 GGTTTTAGCC GTCATCATCC TGCCGCTGCT GTCGTGGTAT TTCcccgCCG
101 ACGACATCGG GCGCATCGTG CTGATGCAGA CGGCGGCGGG ACTGACGGTG
151 TCGGTATTGT GCCTCGGGCT GGATCAGGCA TACGTCCGCG AATACTATGC
201 CGCCGCCGAC AAAGACACTT TGTTCAAAAC CCTGTTCCTG CCGCCGCTGC
251 TGTTTTCCGC CGCGATAGCC GCCCTGCTGC TTTCCCGCCC GTCCCTGCCG
301 TCTGAAATCC TGTTTTCGCT CGACGATGCC GCCGCCGGCA TCGGGCTGGT
351 GCTGTTTGAA CTGAGCTTCC TGCCCATCCG CTTTCTCTTA CTGGTTTTGC
401 GTATGGAAGG GCGCGCCCTT GCCTTTTCGT CCGCGCAACT CGTGCCCAAA
451 CTCGCCATTC TGCTGCTGTT GCCGCTGACG GTCGGGCTGC TGCACTTTCC
501 GGCGAACACC TCCGTCCTGA CCGCCGTTTA CGCGCTGGCA AACCTTGCCG
551 CCGCCGCCTT TTTGCTGTTT CAAAACCGAT GCCGTCTGAA GGCCGTCCGG
601 CGCGCGCCGT TTTCGCCCGC CGTCCTGCAC CGGGGGCTGC GCTACGGCAT
651 ACCGCTCGCA CTGAGCAGCC TTGCCTATTG GGGGCTGGCA TCCGCCGACC
701 GTTTGTTCCT GAAAAAATAT GCGGGCCTGG AACAGCTCGG CGTTTATTCG
751 ATGGGTATTT CGTTCGGCGG GGCGGCATTA TTGCTCCAAA GCATCTTTTC
801 AACGGTCTGG ACACCGTATA TTTTCCGTGC AATCGAAGAA AACGCCACGC
851 CCGCCCGCCT CTCGGCAACG GCAGAATCCG CCGCCGCCCT GCTTGCCTCC
901 GCCCTCTGCC TGACCGGAAT TTTCTCGCCC CTCGCCTCCC TCCTGCTGCC
951 GGAAAACTAC GCCGCCGTCC GGTTTACCGT CGTATCGTGT ATGCTGccgc
1001 cgctGTTTTA CACGCTGACC GAAATCAGCG GCATCGGTTT GAACGTCGTC
1051 CGCAAAACGC GTCCGATCGC GCTTGCCACC TTGGGCGCGC TGGCGGCAAA
1101 CCTGCTGCTG CTGGGGCTTG CCGTACCGTC CGGCGGCACG CGCGGCGCGG
1151 CGGTTGCCTG TGCCGCCTCA TTCTGGTTGT TTTTTGTTTT CAAGACAGAA
1201 AGCTCCTGCC GCCTGTGGCA GCCGCTCAAA CGCCTGCCGC TTTATATGCA
1251 CACATTGTTC TGCCTgGCCT CCTCGGCGGC CTACACCTGC TTCGGCACAC
1301 CGGCAAACTA CCCcctgttt gccggcgtAT GGGCGGCATA TCTGGCAGGC
1351 TGCATCCTGC GCCACCGGAA AAATTTGCAC AAACTGTTTC ATTATTTGAA
1401 AAAACAAGGT TTCCCATTAT GA
This encodes a protein having the amino acid sequence <SEQ ID 380>:
1 MDTKEILGYA AGSIGSAVLA VIILPLLSWY FPADDIGRIV LMQTAAGLTV
51 SVLCLGLDQA YVREYYAAAD KDTLFKTLFL PPLLFSAAIA ALLLSRPSLP
101 SEILFSLDDA AAGIGLVLFE LSFLPIRFLL LVLRMEGRAL AFSSAQLVPK
151 LAILLLLPLT VGLLHFPANT SVLTAVYALA NLAAAAFLLF QNRCRLKAVR
201 RAPFSPAVLH RGLRYGIPLA LSSLAYWGLA SADRLFLKKY AGLEQLGVYS
251 MGISFGGAAL LLQSIFSTVW TPYIFRAIEE NATPARLSAT AESAAALLAS
301 ALCLTGIFSP LASLLLPENY AAVRFTVVSC MLPPLFYTLT EISGIGLNVV
351 RKTRPIALAT LGALAANLLL LGLAVPSGGT RGAAVACAAS FWLFFVFKTE
401 SSCRLWQPLK RLPLYMHTLF CLASSAAYTC FGTPANYPLF AGVWAAYLAG
451 CILRHRKNLH KLFHYLKKQG FPL*
ORF10ng and ORF10-1 show 96.4% identity in a 473 aa overlap:
10 20 30 40 50 60
orf10-1.pep MDTKEILGYAAGSIGSAVLAVIILPLLSWYFPADDIGRIVLMQTAAGLTVSVLCLGLDQA
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf10ng-1 MDTKEILGYAAGSIGSAVLAVIILPLLSWYFPADDIGRIVLMQTAAGLTVSVLCLGLDQA
10 20 30 40 50 60
70 80 90 100 110 120
orf10-1.pep YVREYYATADKDTLFKTLFLPPLLSAAAIAALLLSRPSLPSEILFSLDDAAAGIGLVLFE
|||||||:|||||||||||||||| :||||||||||||||||||||||||||||||||||
orf10ng-1 YVREYYAAADKDTLFKTLFLPPLLFSAAIAALLLSRPSLPSEILFSLDDAAAGIGLVLFE
70 80 90 100 110 120
130 140 150 160 170 180
orf10-1.pep LSFLPIRFLLLVLRMEGRALAFSSAQLVPKLAILLLLPLTVGLLHFPANTAVLTAVYALA
||||||||||||||||||||||||||||||||||||||||||||||||||:|||||||||
orf10ng-1 LSFLPIRFLLLVLRMEGRALAFSSAQLVPKLAILLLLPLTVGLLHFPANTSVLTAVYALA
130 140 150 160 170 180
190 200 210 220 230 240
orf10-1.pep NLAAAAFLLFQNRCRLKAVRHAPFSPAVLHRGLRYGIPIALSSIAYWGLASADRLFLKKY
||||||||||||||||||||:|||||||||||||||||:||||:||||||||||||||||
orf10ng-1 NLAAAAFLLFQNRCRLKAVRRAPFSPAVLHRGLRYGIPLALSSLAYWGLASADRLFLKKY
190 200 210 220 230 240
250 260 270 280 290 300
orf10-1.pep AGLEQLGVYSMGISFGGAALLFQSIFSTVWTPYIFRAIEENAPPARLSATAESAAALLAS
|||||||||||||||||||||:|||||||||||||||||||| |||||||||||||||||
orf10ng-1 AGLEQLGVYSMGISFGGAALLLQSIFSTVWTPYIFRAIEENATPARLSATAESAAALLAS
250 260 270 280 290 300
310 320 330 340 350 360
orf10-1.pep ALCLTGIFSPLASLLLPENYAAVRFIVVSCMLPPLFCTLAEISGIGLNVVRKTRPIALAT
||||||||||||||||||||||||| |||||||||| ||:||||||||||||||||||||
orf10ng-1 ALCLTGIFSPLASLLLPENYAAVRFTVVSCMLPPLFYTLTEISGIGLNVVRKTRPIALAT
310 320 330 340 350 360
370 380 390 400 410 420
orf10-1.pep LGALAANLLLLGLAVPSGGARGAAVACAASFWLFFAFKTESSCRLWQPLKRLPLYLHTLF
|||||||||||||||||||:|||||||||||||||:|||||||||||||||||||:||||
orf10ng-1 LGALAANLLLLGLAVPSGGTRGAAVACAASFWLFFVFKTESSCRLWQPLKRLPLYMHTLF
370 380 390 400 410 420
430 440 450 460 470
orf10-1.pep CLTSSAAYTCFGTPANYPLFAGVWAAYLAGCILRHRKDLHKLFHYLKKQGFPLX
||:||||||||||||||||||||||||||||||||||:||||||||||||||||
orf10ng-1 CLASSAAYTCFGTPANYPLFAGVWAAYLAGCILRHRKNLHKLFHYLKKQGFPLX
430 440 450 460 470
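Based on this analysis (including the presence of a putative leader peptide and several transmembrane segments, and the presence of a leucine-zipper motif (4 Leu residues spaced by 6 amino acids, shown in bold)), it is predicted that these proteins from N. meningitidis and N. gonorrhoeae, and their epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies.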
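The leucine-zipper pattern described here, four Leu residues each separated by six other residues, can be searched for with a regular expression. The sketch below is an illustration under that reading of the text. The pattern `L.{6}L.{6}L.{6}L` and the function name `find_leucine_repeats` are assumptions, not the tool used in the patent. A lookahead is used so that overlapping occurrences are also reported.

```python
import re

# L-x6-L-x6-L-x6-L, wrapped in a lookahead to allow overlapping matches
LEU_REPEAT = re.compile(r"(?=L.{6}L.{6}L.{6}L)")


def find_leucine_repeats(protein):
    """Return 1-based positions of the first Leu of each L-x6-L-x6-L-x6-L match."""
    return [m.start() + 1 for m in LEU_REPEAT.finditer(protein.upper())]


# Residues 131-160 of ORF10-1 <SEQ ID 376> as printed, spaces removed
fragment = "LVLRMEGRALAFSSAQLVPKLAILLLLPLT"
print(find_leucine_repeats(fragment))  # [3] -> Leu133, Leu140, Leu147, Leu154
```

In this fragment the pattern matches once, starting at Leu133 of the full sequence. Whether these are the residues the patent marked in bold cannot be confirmed from this text, since the bold formatting is not preserved.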
Example 45
The following partial DNA sequence was identified in N. meningitidis <SEQ ID 381>:
1..ATCCTGAAAC CGCATAACCA GCTTAAGGAA GACATCCAAC CTGATCCGGC
51 CGATCAAAAC GCCTTGTCCG AACCGGATGC TGCGACAGAG GCAGAGCAGT
101 CGGATGCGGA AAATGCTGCC GACAAGCAGC CCGTTGCCGA TAAAGCCGAC
151 GAGGTTGAAG AAAAGGCGGG CGAGCCGGAA CGGGAAGAGC CGGACGGACA
201 GGCAGTGCGT AAGAAAGCGC TGACGGAAGA GCGTGAACAA ACCGTCAGGG
251 AAAAAGCGCA GAAGAAAGAT GCCGAAACGG TTAAAATACA AGCGGTAAAA
301 CCGTCTAAAG AAACAGAGAA AAAAGCTTCA AAAGAAGAGA AAAAGGCGGC
351 GAAGGAAAAA GTTGCACCCA AACCAACCCC GGAACAAATC CTCAACAGCG
401 GCAgCATCGA AAAmGCGCGC AgTGCCGCCG CCAAAGAAGT GCAGAAAATG
451 AA.AACGTCC GACAAGGCGG AAGC.AACGC ATTATCTGCA AATGGGCGCG
501 TATGCCGACC GTCAGAGCGC GGAAGGGCAG CGTGCCAAAC TGGCAATCTT
551 GGGCATATCT TCCAAGGTGG TCGGTTATCA GGCGGGACAT AAAACGCTTT
601 ACCGGGTGCA AAGCGGCAAT ATGTCTGCCG ATGCGGTGA
This corresponds to the amino acid sequence <SEQ ID 382; ORF65>:
1..ILKPHNQLKE DIQPDPADQN ALSEPDAATE AEQSDAENAA DKQPVADKAD
51 EVEEKAGEPE REEPDGQAVR KKALTEEREQ TVREKAQKKD AETVKIQAVK
101 PSKETEKKAS KEEKKAAKEK VAPKPTPEQI LNSGSIEXAR SAAAKEVQKM
151 XNVRQGGSXR IICKWARMPT VRARKGSVPN WQSWAYLPRW SVIRRDIKRF
201 TGCKAAICLP MR*
Further work revealed the complete nucleotide sequence <SEQ ID 383>:
1 ATGTTTATGA ACAAATTTTC CCAATCCGGA AAAGGTCTGT CCGGTTTTTT
51 CTTCGGTTTG ATACTGGCGA CGGTCATTAT TGCCGGTATT TTGTTTTATC
101 TGAACCAGAG CGGTCAAAAT GCGTTCAAAA TCCCGGCTTC GTCGAAGCAG
151 CCTGCAGAAA CGGAAATCCT GAAACCGAAA AACCAGCCTA AGGAAGACAT
201 CCAACCTGAA CCGGCCGATC AAAACGCCTT GTCCGAACCG GATGCTGCGA
251 CAGAGGCAGA GCAGTCGGAT GCGGAAAAAG CTGCCGACAA GCAGCCCGTT
301 GCCGATAAAG CCGACGAGGT TGAAGAAAAG GCGGGCGAGC CGGAACGGGA
351 AGAGCCGGAC GGACAGGCAG TGCGTAAGAA AGCGCTGACG GAAGAGCGTG
401 AACAAACCGT CAGGGAAAAA GCGCAGAAGA AAGATGCCGA AACGGTTAAA
451 AAACAAGCGG TAAAACCGTC TAAAGAAACA GAGAAAAAAG CTTCAAAAGA
501 AGAGAAAAAG GCGGCGAAGG AAAAAGTTGC ACCCAAACCA ACCCCGGAAC
551 AAATCCTCAA CAGCGGCAGC ATCGAAAAAG CGCGCAGTGC CGCCGCCAAA
601 GAAGTGCAGA AAATGAAAAC GTCCGACAAG GCGGAAGCAA CGCATTATCT
651 GCAAATGGGC GCGTATGCCG ACCGTCAGAG CGCGGAAGGG CAGCGTGCCA
701 AACTGGCAAT CTTGGGCATA TCTTCCAAGG TGGTCGGTTA TCAGGCGGGA
751 CATAAAACGC TTTACCGGGT GCAAAGCGGC AATATGTCTG CCGATGCGGT
801 GAAAAAAATG CAGGACGAGT TGAAAAAACA TGAAGTCGCC AGCCTGATCC
851 GTTCTATCGA AAGCAAATAA
This corresponds to the amino acid sequence <SEQ ID 384; ORF65-1>:
1 MFMNKFSQSG KGLSGFFFGL ILATVIIAGI LFYLNQSGQN AFKIPASSKQ
51 PAETEILKPK NQPKEDIQPE PADQNALSEP DAATEAEQSD AEKAADKQPV
101 ADKADEVEEK AGEPEREEPD GQAVRKKALT EEREQTVREK AQKKDAETVK
151 KQAVKPSKET EKKASKEEKK AAKEKVAPKP TPEQILNSGS IEKARSAAAK
201 EVQKMKTSDK AEATHYLQMG AYADRQSAEG QRAKLAILGI SSKVVGYQAG
251 HKTLYRVQSG NMSADAVKKM QDELKKHEVA SLIRSIESK*
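Computer analysis of this amino acid sequence gave the following results:
Homology with a predicted ORF from N. meningitidis (strain A)
ORF65 shows 92.0% identity over a 150 aa overlap with an ORF (ORF65a) from strain A of N. meningitidis: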
10 20 30
orf65.pep ILKPHNQLKEDIQPDPADQNALSEPDAATE
||||:|| ||||||:||||||||||||| |
orf65a IIAGILFYLNQSGQNAFKIPVPSKQPAETEILKPKNQPKEDIQPEPADQNALSEPDAAKE
30 40 50 60 70 80
40 50 60 70 80 90
orf65.pep AEQSDAENAADKQPVADKADEVEEKAGEPEREEPDGQAVRKKALTEEREQTVREKAQKKD
|||||||:|||||||||||||||||| |||||: |||||||||||||||||| |||||||
orf65a AEQSDAEKAADKQPVADKADEVEEKADEPEREKSDGQAVRKKALTEEREQTVGEKAQKKD
90 100 110 120 130 140
100 110 120 130 140 150
orf65.pep AETVKIQAVKPSKETEKKASKEEKKAAKEKVAPKPTPEQILNSGSIEXARSAAAKEVQKM
||||| |||||||||||||||||||| |||||||||||||||||||| ||||||||||||
orf65a AETVKKQAVKPSKETEKKASKEEKKAEKEKVAPKPTPEQILNSGSIEKARSAAAKEVQKM
150 160 170 180 190 200
160 170 180 190 200 210
orf65.pep XNVRQGGSXRIICKWARMPTVRARKGSVPNWQSWAYLPRWSVIRRDIKRFTGCKAAICLP
orf65a KTPDKAEATHYLQMGAYADRRSAEGQRAKLAILGISSKVVGYQAGHKTLYRVQSGNMSAD
210 220 230 240 250 260
The complete length ORF65a nucleotide sequence <SEQ ID 385> is:
1 ATGTTTATGA ACAAATTTTC CCAATCCGGA AAAGGTCTGT CCGGTTTTTT
51 CTTCGGTTTG ATACTGGCGA CGGTCATTAT TGCCGGTATT TTGTTTTATC
101 TGAACCAGAG CGGTCAAAAT GCGTTCAAAA TCCCGGTTCC GTCGAAGCAG
151 CCTGCAGAAA CGGAAATCCT GAAACCGAAA AACCAGCCTA AGGAAGACAT
201 CCAACCTGAA CCGGCCGATC AAAACGCCTT GTCCGAACCG GATGCTGCGA
251 AAGAGGCAGA GCAGTCGGAT GCGGAAAAAG CTGCCGACAA GCAGCCCGTT
301 GCCGACAAAG CCGACGAGGT TGAGGAAAAG GCGGACGAGC CGGAGCGGGA
351 AAAGTCGGAC GGACAGGCAG TGCGCAAGAA AGCACTGACG GAAGAGCGTG
401 AACAAACCGT CGGGGAAAAA GCGCAGAAGA AAGATGCCGA AACGGTTAAA
451 AAACAAGCGG TAAAACCATC TAAAGAAACA GAGAAAAAAG CTTCAAAAGA
501 AGAGAAAAAG GCGGAGAAGG AAAAAGTTGC ACCCAAACCG ACCCCGGAAC
551 AAATCCTCAA CAGCGGCAGC ATCGAAAAAG CGCGCAGTGC CGCTGCCAAA
601 GAAGTGCAGA AAATGAAAAC GCCCGACAAG GCGGAAGCAA CGCATTATCT
651 GCAAATGGGC GCGTATGCCG ACCGCCGGAG CGCGGAAGGG CAGCGTGCCA
701 AACTGGCAAT CTTGGGCATA TCTTCCAAGG TGGTCGGTTA TCAGGCGGGA
751 CATAAAACGC TTTACCGGGT GCAAAGCGGC AATATGTCTG CCGATGCGGT
801 GAAAAAAATG CAGGACGAGT TGAAAAAACA TGAAGTCGCC AGCCTGATCC
851 GTTCTATCGA AAGCAAATAA
This encodes a protein having the amino acid sequence <SEQ ID 386>:
1 MFMNKFSQSG KGLSGFFFGL ILATVIIAGI LFYLNQSGQN AFKIPVPSKQ
51 PAETEILKPK NQPKEDIQPE PADQNALSEP DAAKEAEQSD AEKAADKQPV
101 ADKADEVEEK ADEPEREKSD GQAVRKKALT EEREQTVGEK AQKKDAETVK
151 KQAVKPSKET EKKASKEEKK AEKEKVAPKP TPEQILNSGS IEKARSAAAK
201 EVQKMKTPDK AEATHYLQMG AYADRRSAEG QRAKLAILGI SSKVVGYQAG
251 HKTLYRVQSG NMSADAVKKM QDELKKHEVA SLIRSIESK*
ORF65a and ORF65-1 show 96.5% identity in a 289 aa overlap:
10 20 30 40 50 60
orf65a.pep MFMNKFSQSGKGLSGFFFGLILATVIIAGILFYLNQSGQNAFKIPVPSKQPAETEILKPK
|||||||||||||||||||||||||||||||||||||||||||||: |||||||||||||
orf65-1 MFMNKFSQSGKGLSGFFFGLILATVIIAGILFYLNQSGQNAFKIPASSKQPAETEILKPK
10 20 30 40 50 60
70 80 90 100 110 120
orf65a.pep NQPKEDIQPEPADQNALSEPDAAKEAEQSDAEKAADKQPVADKADEVEEKADEPEREKSD
||||||||||||||||||||||| ||||||||||||||||||||||||||| |||||: |
orf65-1 NQPKEDIQPEPADQNALSEPDAATEAEQSDAEKAADKQPVADKADEVEEKAGEPEREEPD
70 80 90 100 110 120
130 140 150 160 170 180
orf65a.pep GQAVRKKALTEEREQTVGEKAQKKDAETVKKQAVKPSKETEKKASKEEKKAEKEKVAPKP
||||||||||||||||| ||||||||||||||||||||||||||||||||| ||||||||
orf65-1 GQAVRKKALTEEREQTVREKAQKKDAETVKKQAVKPSKETEKKASKEEKKAAKEKVAPKP
130 140 150 160 170 180
190 200 210 220 230 240
orf65a.pep TPEQILNSGSIEKARSAAAKEVQKMKTPDKAEATHYLQMGAYADRRSAEGQRAKLAILGI
||||||||||||||||||||||||||| |||||||||||||||||:||||||||||||||
orf65-1 TPEQILNSGSIEKARSAAAKEVQKMKTSDKAEATHYLQMGAYADRQSAEGQRAKLAILGI
190 200 210 220 230 240
250 260 270 280 290
orf65a.pep SSKVVGYQAGHKTLYRVQSGNMSADAVKKMQDELKKHEVASLIRSIESKX
||||||||||||||||||||||||||||||||||||||||||||||||||
orf65-1 SSKVVGYQAGHKTLYRVQSGNMSADAVKKMQDELKKHEVASLIRSIESKX
250 260 270 280 290
Homology with a predicted ORF from N. gonorrhoeae
ORF65 shows 89.6% identity over a 212 aa overlap with a predicted ORF (ORF65.ng) from N. gonorrhoeae:
30 40 50 60 70 80
ORF65ng IIAGILLYLNQGGQNAFKIPAPSKQPAETEILKLKNQPKEDIQPEPADQNALSEPDVAKE
||| :|| ||||||:|||||||||||:| |
ORF65 ILKPHNQLKEDIQPDPADQNALSEPDAATE
10 20 30
90 100 110 120 130 140
ORF65ng AEQSDAEKAADKQPVADKADEVEEKAGEPEREEPDGQAVRKKALTEEREQTVREKAQKKD
|||||||:||||||||||||||||||||||||||||||||||||||||||||||||||||
ORF65 AEQSDAENAADKQPVADKADEVEEKAGEPEREEPDGQAVRKKALTEEREQTVREKAQKKD
40 50 60 70 80 90
150 160 170 180 190 200
ORF65ng AETVKKKAVKPSKETEKKASKEEKKAAKEKVAPKPTPEQILNSRSIEKARSAAAKEVQKM
||||| :|||||||||||||||||||||||||||||||||||| ||| ||||||||||||
ORF65 AETVKIQAVKPSKETEKKASKEEKKAAKEKVAPKPTPEQILNSGSIEXARSAAAKEVQKM
100 110 120 130 140 150
210 220 230 240 250 260
ORF65ng KNFGQGGSQRIICKWARMPNPGARKGSVPNWQSWAYLPKWSAIRRDIKRFTACKAAICPP
| |||| ||||||||||: ||||||||||||||||:||:|||||||||:|||||| |
ORF65 XNVRQGGSXRIICKWARMPTVRARKGSVPNWQSWAYLPRWSVIRRDIKRFTGCKAAICLP
160 170 180 190 200 210
ORF65ng MR
||
ORF65 MR
An ORF65ng nucleotide sequence <SEQ ID 387> is predicted to encode a protein having the amino acid sequence <SEQ ID 388>:
1 MFMNKFSQSG KGLSGFFFGL ILATVIIAGI LLYLNQGGQN AFKIPAPSKQ
51 PAETEILKLK NQPKEDIQPE PADQNALSEP DVAKEAEQSD AEKAADKQPV
101 ADKADEVEEK AGEPEREEPD GQAVRKKALT EEREQTVREK AQKKDAETVK
151 KKAVKPSKET EKKASKEEKK AAKEKVAPKP TPEQILNSRS IEKARSAAAK
201 EVQKMKNFGQ GGSQRIICKW ARMPNPGARK GSVPNWQSWA YLPKWSAIRR
251 DIKRFTACKA AICPPMR*
After further analysis, the complete gonococcal DNA sequence <SEQ ID 389> was found to be:
1 ATGTTTATGA ACAAATTTTC CCAATCCGGA AAAGGTCTGT CCGGTTTCTT
51 CTTCGGTTTG ATACTGGCAA CGGTCATTAT TGCCGGTATT TTGCTTTATC
101 TGAACCAGGG CGGTCAAAAT GCGTTCAAAA TCCCGGCTCC GTCGAAGCAG
151 CCTGCAGAAA CGGAAATCCT GAAACTGAAA AACCAGCCTA AGGAAGACAT
201 CCAACCTGAA CCGGCCGATC AAAACGCCTT GTCCGAACCG GATGTTGCGA
251 AAGAGGCAGA GCAGTCGGAT GCGGAAAAAG CTGCCGACAA GCAGCCCGTT
301 GCCGACAAag ccgacgAGGT TGAAGAAAag GcGGgcgAgc cggaACGGga
351 aGAGCCGGAC ggACAGGCAG TGCGCAAGAA AGCACTGAcg gAAGAgcGTG
401 AACAAACcgt cagggAAAAA GCGCagaaga AAGATGCCGA AACGgTTAAA
451 AAacaaGCgg tAaaaccgtc tAAAGAAACa gagaaaaaag cTtcaaaaga
501 agagaaaaag gcggcgaaag aaaAAGttgc acccaaaccg accccggaaC
551 aaatcctcaa cagccgCagc atcgaaaaag cgcgtagtgc cgctgccaaa
601 gaAgtgcaGA AAatgaaaaa ctTtgggcaa ggcgGaagcc aacgcattaT
651 CTGcaaatgg gcgcgtatgc cgaccgtccg gagcgcggaA gggcagcgtg
701 ccaaACtggc aAtcttgGgc atatctTccg aagtggtcgG CTATCAGGCG
751 GGACATAAAA CGCTTTACCG CGTGCAAagc GGCAatatgt ccgccgatgc
801 gGTGAAAAAA ATGCAGGACG AGTTGAAAAA GCATGGGGtt gcCAGCCTGA
851 TCCGTGcgAT TGAAGGCAAA TAA
This encodes the following amino acid sequence <SEQ ID 390>:
1 MFMNKFSQSG KGLSGFFFGL ILATVIIAGI LLYLNQGGQN AFKIPAPSKQ
51 PAETEILKLK NQPKEDIQPE PADQNALSEP DVAKEAEQSD AEKAADKQPV
101 ADKADEVEEK AGEPEREEPD GQAVRKKALT EEREQTVREK AQKKDAETVK
151 KQAVKPSKET EKKASKEEKK AAKEKVAPKP TPEQILNSRS IEKARSAAAK
201 EVQKMKNFGQ GGSQRIICKW ARMPTVRSAE GQRAKLAILG ISSEVVGYQA
251 GHKTLYRVQS GNMSADAVKK MQDELKKHGV ASLIRAIEGK *
ORF65ng-1 and ORF65-1 show 89.0% identity in a 290 aa overlap:
10 20 30 40 50 60
orf65-1.pep MFMNKFSQSGKGLSGFFFGLILATVIIAGILFYLNQSGQNAFKIPASSKQPAETEILKPK
|||||||||||||||||||||||||||||||:||||:||||||||| ||||||||||| |
orf65ng-1 MFMNKFSQSGKGLSGFFFGLILATVIIAGILLYLNQGGQNAFKIPAPSKQPAETEILKLK
10 20 30 40 50 60
70 80 90 100 110 120
orf65-1.pep NQPKEDIQPEPADQNALSEPDAATEAEQSDAEKAADKQPVADKADEVEEKAGEPEREEPD
|||||||||||||||||||||:| ||||||||||||||||||||||||||||||||||||
orf65ng-1 NQPKEDIQPEPADQNALSEPDVAKEAEQSDAEKAADKQPVADKADEVEEKAGEPEREEPD
70 80 90 100 110 120
130 140 150 160 170 180
orf65-1.pep GQAVRKKALTEEREQTVREKAQKKDAETVKKQAVKPSKETEKKASKEEKKAAKEKVAPKP
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf65ng-1 GQAVRKKALTEEREQTVREKAQKKDAETVKKQAVKPSKETEKKASKEEKKAAKEKVAPKP
130 140 150 160 170 180
190 200 210 220 230 239
orf65-1.pep TPEQILNSGSIEKARSAAAKEVQKMKTSDKAEATHYL-QMGAYADRQSAEGQRAKLAILG
|||||||| |||||||||||||||||: :: : : : : : : :|||||||||||||
orf65ng-1 TPEQILNSRSIEKARSAAAKEVQKMKNFGQGGSQRIICKWARMPTVRSAEGQRAKLAILG
190 200 210 220 230 240
240 250 260 270 280 290
orf65-1.pep ISSKVVGYQAGHKTLYRVQSGNMSADAVKKMQDELKKHEVASLIRSIESKX
|||:|||||||||||||||||||||||||||||||||| ||||||:||:||
orf65ng-1 ISSEVVGYQAGHKTLYRVQSGNMSADAVKKMQDELKKHGVASLIRAIEGKX
250 260 270 280 290
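Based on this analysis, including the presence of a putative transmembrane domain in the gonococcal protein, it is predicted that this protein from N. meningitidis and N. gonorrhoeae, and its epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies.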
Example 46
The following DNA sequence, believed to be complete, was identified in N. meningitidis <SEQ ID 391>:
1 ATGAACCACG ACATCACTTT CCTCACCCTG TTCCTACTCG GTkTCTTCGG
51 CGGAAcGCAC TGCATCGGTA TGTGCGGCGG ATTAAGCAGC GcGTTTGs.s
101 TCCAACTCCC CCCGCATATC AACCGCTTTT GGCTGATCCT GCTGCTTAAC
151 ACAGGACGGG TAAGCAGCTA TACGGCAAtC GGCCTGATAC TCGGATTAAT
201 CGGACAGGTC GGCGTTTCAC TCGAcCAaAC CCGCGTCCTG CAGAATATTT
251 TATACACGGC CGCCAACCTC CTGCTGCTCT TTTTAGGCTT ATACTTGAGC
301 GGTATTTCTT CCTTGGCGGC AAAAATCGAG AAaATCGGCA AACCGATATG
351 GCGGAACCTG AACCCGATAC TCAACCGGCT GTTACCCATA AAATCCATAC
401 CCGCCTGCCT tGCGgTCGGA ATATTATGGG GCTGGCTGCC GTGCGGACTG
451 GTTTACAGCG CGTCGCTTTA CGCGCTGGGA AgCGGTAGTG CGGCAACGGG
501 CGGGTTATAT ATGCTTGCCT TTGCACTGGG TACGCTGCCC AATCTTtTAG
551 CAATCGGCAT TTTtTCCCTG CAACTGAAwA AAATCATGCA AAACCGATAT
601 ATCCGCCTGT GTACGGGATT ATCCGTATCA TTATGGGCAT TATGGAAACT
651 TGCCGTCCTG TGGCTGTAA
This corresponds to the amino acid sequence <SEQ ID 392; ORF103>:
1 MNHDITFLTL FLLGXFGGTH CIGMCGGLSS AFXXQLPPHI NRFWLILLLN
51 TGRVSSYTAI GLILGLIGQV GVSLDQTRVL QNILYTAANL LLLFLGLYLS
101 GISSLAAKIE KIGKPIWRNL NPILNRLLPI KSIPACLAVG ILWGWLPCGL
151 VYSASLYALG SGSAATGGLY MLAFALGTLP NLLAIGIFSL QLXKIMQNRY
201 IRLCTGLSVS LWALWKLAVL WL*
Further work elaborated the DNA sequence <SEQ ID 393>:
1 ATGAACCACG ACATCACTTT CCTCACCCTG TTCCTACTCG GTTTCTTCGG
51 CGGAACGCAC TGCATCGGTA TGTGCGGCGG ATTAAGCAGC GCGTTTGCGC
101 TCCAACTCCC CCCGCATATC AACCGCTTTT GGCTGATCCT GCTGCTTAAC
151 ACAGGACGGG TAAGCAGCTA TACGGCAATC GGCCTGATAC TCGGATTAAT
201 CGGACAGGTC GGCGTTTCAC TCGACCAAAC CCGCGTCCTG CAGAATATTT
251 TATACACGGC CGCCAACCTC CTGCTGCTCT TTTTAGGCTT ATACTTGAGC
301 GGTATTTCTT CCTTGGCGGC AAAAATCGAG AAAATCGGCA AACCGATATG
351 GCGGAACCTG AACCCGATAC TCAACCGGCT GTTACCCATA AAATCCATAC
401 CCGCCTGCCT TGCGGTCGGA ATATTATGGG GCTGGCTGCC GTGCGGACTG
451 GTTTACAGCG CGTCGCTTTA CGCGCTGGGA AGCGGTAGTG CGGCAACGGG
501 CGGGTTATAT ATGCTTGCCT TTGCACTGGG TACGCTGCCC AATCTTTTAG
551 CAATCGGCAT TTTTTCCCTG CAACTGAAAA AAATCATGCA AAACCGATAT
601 ATCCGCCTGT GTACGGGATT ATCCGTATCA TTATGGGCAT TATGGAAACT
651 TGCCGTCCTG TGGCTGTAA
This corresponds to the amino acid sequence <SEQ ID 394; ORF103-1>:
1 MNHDITFLTL FLLGFFGGTH CIGMCGGLSS AFALQLPPHI NRFWLILLLN
51 TGRVSSYTAI GLILGLIGQV GVSLDQTRVL QNILYTAANL LLLFLGLYLS
101 GISSLAAKIE KIGKPIWRNL NPILNRLLPI KSIPACLAVG ILWGWLPCGL
151 VYSASLYALG SGSAATGGLY MLAFALGTLP NLLAIGIFSL QLKKIMQNRY
201 IRLCTGLSVS LWALWKLAVL WL*
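Computer analysis of this amino acid sequence gave the following results:
Homology with a predicted ORF from N. meningitidis (strain A)
ORF103 shows 93.8% identity over a 222 aa overlap with an ORF (ORF103.a) from strain A of N. meningitidis: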
10 20 30 40 50 60
orf103.pep MNHDITFLTLFLLGXFGGTHCIGMCGGLSSAFXXQLPPHINRFWLILLLNTGRVSSYTAI
|| ||||||||||| ||||||||||||||||| |||||||| |||||||||||||||||
orf103a MNXDITFLTLFLLGFFGGTHCIGMCGGLSSAFALQLPPHINRXWLILLLNTGRVSSYTAI
10 20 30 40 50 60
70 80 90 100 110 120
orf103.pep GLILGLIGQVGVSLDQTRVLQNILYTAANLLLLFLGLYLSGISSLAAKIEKIGKPIWRNL
||||||||||||||||||| ||||||||||||||||||||||||||||||||||||||||
orf103a GLILGLIGQVGVSLDQTRVXQNILYTAANLLLLFLGLYLSGISSLAAKIEKIGKPIWRNL
70 80 90 100 110 120
130 140 150 160 170 180
orf103.pep NPILNRLLPIKSIPACLAVGILWGWLPCGLVYSASLYALGSGSAATGGLYMLAFALGTLP
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf103a NPILNRLLPIKSIPACLAVGILWGWLPCGLVYSASLYALGSGSAATGGLYMLAFALGTLP
130 140 150 160 170 180
190 200 210 220
orf103.pep NLLAIGIFSLQLXKIMQNRYIRLCTGLSVSLWALWKLAVLWLX
|| ||||||||||||||||||||||||||||||||||||||||
orf103a NLXAIGIFSLQLXKIMQNRYIRLCTGLSVSLWALWKLAVLWLX
190 200 210 220
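A percent-identity figure of the kind quoted for these alignments can be approximated from two aligned rows of equal length. The sketch below is an assumption-laden illustration: it counts a column as identical only when both rows carry the same residue and that residue is neither the unknown symbol `X` nor a gap `-`. The alignment program used in the source may score `X` differently, so the result need not reproduce the quoted percentages exactly. The name `percent_identity` is hypothetical.

```python
def percent_identity(row_a: str, row_b: str, unknown: str = "X") -> float:
    """Percentage of aligned columns holding the same resolved residue."""
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have equal length")
    if not row_a:
        raise ValueError("aligned rows must not be empty")
    identical = sum(
        1
        for a, b in zip(row_a, row_b)
        # Unknown residues and gaps never count as identities here.
        if a == b and a not in (unknown, "-")
    )
    return 100.0 * identical / len(row_a)
```

The denominator is the full overlap length, which matches the way the text reports identity "over a 222 aa overlap".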
The complete length ORF103a nucleotide sequence <SEQ ID 395> is:
1 ATGAACCANG ACATCACTTT CCTCACCCTG TTCCTACTCG GTTTCTTCGG
51 CGGAACGCAC TGCATCGGTA TGTGCGGCGG ATTAAGCAGC GCGTTTGCGC
101 TCCAACTCCC CCCGCATATC AACCGCTTNT GGCTGATCCT GCTGCTTAAC
151 ACAGGACGGG TAAGCAGCTA TACGGCAATC GGCCTGATAC TCGGATTAAT
201 CGGACAGGTC GGCGTTTCAC TCGACCAAAC CCGCGTCNTG CAGAATATTT
251 TATACACGGC CGCCAACCTC CTGCTGCTCT TTTTAGGCTT ATACTTGAGC
301 GGTATTTCTT CCTTGGCGGC AAAAATCGAG AAAATCGGCA AACCGATATG
351 GCGGAACCTG AACCCGATAC TCAACCGGCT GTTACCCATA AAATCCATAC
401 CCGCCTGCCT TGCGGTCGGA ATATTATGGG GCTGGCTGCC GTGCGGACTA
451 GTTTACAGCG CGTCGCTTTA CGCGCTGGGA AGCGGTAGTG CGGCAACGGG
501 CGGGTTATAT ATGCTTGCCT TTGCACTGGG TACGCTGCCC AATCTTTNGG
551 CAATCGGCAT TTTTTCCCTG CAACTGNAAA AAATCATGCA AAACCGATAT
601 ATCCGCCTGT GTACGGGATT ATCCGTATCA TTATGGGCAT TATGGAAACT
651 TGCCGTCCTG TGGCTGTAA
This encodes a protein having the amino acid sequence <SEQ ID 396>:
1 MNXDITFLTL FLLGFFGGTH CIGMCGGLSS AFALQLPPHI NRXWLILLLN
51 TGRVSSYTAI GLILGLIGQV GVSLDQTRVX QNILYTAANL LLLFLGLYLS
101 GISSLAAKIE KIGKPIWRNL NPILNRLLPI KSIPACLAVG ILWGWLPCGL
151 VYSASLYALG SGSAATGGLY MLAFALGTLP NLXAIGIFSL QLXKIMQNRY
201 IRLCTGLSVS LWALWKLAVL WL*
ORF103a and ORF103-1 show 97.7% identity over a 222 aa overlap:
10 20 30 40 50 60
orf103a.pep MNXDITFLTLFLLGFFGGTHCIGMCGGLSSAFALQLPPHINRXWLILLLNTGRVSSYTAI
|| ||||||||||||||||||||||||||||||||||||||| |||||||||||||||||
orf103-1 MNHDITFLTLFLLGFFGGTHCIGMCGGLSSAFALQLPPHINRFWLILLLNTGRVSSYTAI
10 20 30 40 50 60
70 80 90 100 110 120
orf103a.pep GLILGLIGQVGVSLDQTRVXQNILYTAANLLLLFLGLYLSGISSLAAKIEKIGKPIWRNL
||||||||||||||||||| ||||||||||||||||||||||||||||||||||||||||
orf103-1 GLILGLIGQVGVSLDQTRVLQNILYTAANLLLLFLGLYLSGISSLAAKIEKIGKPIWRNL
70 80 90 100 110 120
130 140 150 160 170 180
orf103a.pep NPILNRLLPIKSIPACLAVGILWGWLPCGLVYSASLYALGSGSAATGGLYMLAFALGTLP
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf103-1 NPILNRLLPIKSIPACLAVGILWGWLPCGLVYSASLYALGSGSAATGGLYMLAFALGTLP
130 140 150 160 170 180
190 200 210 220
orf103a.pep NLXAIGIFSLQLXKIMQNRYIRLCTGLSVSLWALWKLAVLWLX
|| ||||||||| ||||||||||||||||||||||||||||||
orf103-1 NLLAIGIFSLQLKKIMQNRYIRLCTGLSVSLWALWKLAVLWLX
190 200 210 220
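When two rows differ at only a handful of columns, as in the alignment above, it can be convenient to list the differing positions explicitly. The following sketch does this for two aligned rows of equal length and reports 1-based positions, matching the numbering used in the alignments. The function name `differences` is hypothetical, and treating `X` against a resolved residue as a difference is an assumption.

```python
from typing import List, Tuple


def differences(row_a: str, row_b: str) -> List[Tuple[int, str, str]]:
    """Return (position, residue_a, residue_b) for every differing column."""
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have equal length")
    return [
        (pos, a, b)
        # enumerate from 1 so positions match the alignment rulers
        for pos, (a, b) in enumerate(zip(row_a, row_b), start=1)
        if a != b
    ]
```

Applied to the first ten residues of the two rows, `differences("MNXDITFLTL", "MNHDITFLTL")` returns a single entry for position 3, where the strain A sequence carries an unresolved residue.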
Homology with a predicted ORF from N. gonorrhoeae
ORF103 shows 95.5% identity over a 222 aa overlap with a predicted ORF (ORF103.ng) from N. gonorrhoeae:
orf103.pep MNHDITFLTLFLLGXFGGTHCIGMCGGLSSAFXXQLPPHINRFWLILLLNTGRVSSYTAI 60
|||||||||||||| ||||||||||||||||| |||||||||||||||||||:||||||
orf103ng MNHDITFLTLFLLGFFGGTHCIGMCGGLSSAFALQLPPHINRFWLILLLNTGRISSYTAI 60
orf103.pep GLILGLIGQVGVSLDQTRVLQNILYTAANLLLLFLGLYLSGISSLAAKIEKIGKPIWRNL 120
||:||||||:|:|||||||||||||||:||||||||||||||||||||||||||||||||
orf103ng GLMLGLIGQLGISLDQTRVLQNILYTASNLLLLFLGLYLSGISSLAAKIEKIGKPIWRNL 120
orf103.pep NPILNRLLPIKSIPACLAVGILWGWLPCGLVYSASLYALGSGSAATGGLYMLAFALGTLP 180
||||||||||||||||||||||||||||||||||||||||||||:|||||||||||||||
orf103ng NPILNRLLPIKSIPACLAVGILWGWLPCGLVYSASLYALGSGSATTGGLYMLAFALGTLP 180
orf103.pep NLLAIGIFSLQLXKIMQNRYIRLCTGLSVSLWALWKLAVLWL 222
|||||||||||| |||||||||||||||||||||||||||||
orf103ng NLLAIGIFSLQLKKIMQNRYIRLCTGLSVSLWALWKLAVLWL 222
The complete length ORF103ng nucleotide sequence <SEQ ID 397> is:
1 ATGAACCACG ACATCACTTT CCTCACCCTG TTCCTGCTCG GTTTCTTCGG
51 CGGAACTCAC TGCATCGGTA TGTGCGGCGG ATTAAGCAGC GCGTTTGCGC
101 TCCAACTCCC CCCGCATATC AACCGCTTTT GGCTGATTCT GCTGCTTAAC
151 ACAGGACGGA TAAGCAGCTA TACGGCAATC GGCCTGATGC TCGGATTAAT
201 CGGACAACTC GGCATTTCAC TCGACCAAAc ccgcgTCCTG CAAAATATTT
251 tatacacagc ctccaaCCTC CTGCTGCTCT TTTTAGGCTT ATACTTGAGC
301 GGTATTTCTT CCTTGGCGGC AAAAATCGAG AAAATCGGCA AACCGATATG
351 GCGCAACCTG AACCCGATAC TCAACCGGCT GCTGCCCATA AAATCCATAC
401 CCGCCTGCCT TGCTGTCGGA ATATTATGGG GCTGGCTGCC GTGCGGACTG
451 GTTTACAGCG CATCACTTTA CGCGCTGGGA AGCGGTAGTG CGACAACCGG
501 CGGACTGTAT ATGCTTGCCT TTGCACTGGG TACGCTGCCC AATCTTTTGG
551 CAATCGGCAT TTTTTCCCTG CAACTGAAAA AAATCATGCA AAACCGATAT
601 ATCCGCCTGT GTACAGGATT ATCCGTATCA TTATGGGCAT TATGGAAGCT
651 TGCCGTCCTG TGGCTGTAA
This encodes a protein having the amino acid sequence <SEQ ID 398>:
1 MNHDITFLTL FLLGFFGGTH CIGMCGGLSS AFALQLPPHI NRFWLILLLN
51 TGRISSYTAI GLMLGLIGQL GISLDQTRVL QNILYTASNL LLLFLGLYLS
101 GISSLAAKIE KIGKPIWRNL NPILNRLLPI KSIPACLAVG ILWGWLPCGL
151 VYSASLYALG SGSATTGGLY MLAFALGTLP NLLAIGIFSL QLKKIMQNRY
201 IRLCTGLSVS LWALWKLAVL WL*
In addition, ORF103ng and ORF103-1 show 97.3% identity over a 222 aa overlap:
10 20 30 40 50 60
orf103-1.pep MNHDITFLTLFLLGFFGGTHCIGMCGGLSSAFALQLPPHINRFWLILLLNTGRVSSYTAI
|||||||||||||||||||||||||||||||||||||||||||||||||||||:||||||
orf103ng MNHDITFLTLFLLGFFGGTHCIGMCGGLSSAFALQLPPHINRFWLILLLNTGRISSYTAI
10 20 30 40 50 60
70 80 90 100 110 120
orf103-1.pep GLILGLIGQVGVSLDQTRVLQNILYTAANLLLLFLGLYLSGISSLAAKIEKIGKPIWRNL
||:||||||:|:|||||||||||||||:||||||||||||||||||||||||||||||||
orf103ng GLMLGLIGQLGISLDQTRVLQNILYTASNLLLLFLGLYLSGISSLAAKIEKIGKPIWRNL
70 80 90 100 110 120
130 140 150 160 170 180
orf103-1.pep NPILNRLLPIKSIPACLAVGILWGWLPCGLVYSASLYALGSGSAATGGLYMLAFALGTLP
||||||||||||||||||||||||||||||||||||||||||||:|||||||||||||||
orf103ng NPILNRLLPIKSIPACLAVGILWGWLPCGLVYSASLYALGSGSATTGGLYMLAFALGTLP
130 140 150 160 170 180
190 200 210 220
orf103-1.pep NLLAIGIFSLQLKKIMQNRYIRLCTGLSVSLWALWKLAVLWLX
|||||||||||||||||||||||||||||||||||||||||||
orf103ng NLLAIGIFSLQLKKIMQNRYIRLCTGLSVSLWALWKLAVLWLX
190 200 210 220
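Based on this analysis, including the presence of a putative leader sequence (double-underlined) and several putative transmembrane domains (single-underlined) in the gonococcal protein, it is predicted that these proteins from N. meningitidis and N. gonorrhoeae, and their epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies.
Example 47
The following partial DNA sequence was identified in N. meningitidis <SEQ ID 399>: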
1 ATGGAAAACC AAAGGCCGCT CCTAGGCTTT CGCTTGGCAC TTTTGGCGGC
51 GATGACGTGG GGAACGCTGC CGAT.TCCGT GCGGCAGGTA TTGAAGTTTG
101 TCGATGCGCC GACGCTGGTG TGGGTGCGTT TTACCGTGGC GGCGGCGGTA
151 TTGTTTGTTT TGCTGGCACT GGGCGGGCGG CTGCcGAAGC GGCG
GGATT
201 TTTCTTGGTG CTCATTCAGG CTGCTGCTGC TCGGCGTGGC GGGCATTTCG
251 GCAAACTTTG TGCTGATTGC CCAAGGGCTG CATTATATTT CGCCGACCAC
301 GACGCAGGTT TTGTGGCAGA TTTCGCCGTT TACGATGATT GTwGTCGGTG
351 TGTTGGTGTT TAAAGACCGG ATGACTGCCG CTCAGAAAAT CGGCTTGGTT
401 TTGCTGCTTG CCGGTTTGCT TATGTATTTT AACGATAAAT TCGGCGAGTT
451 GTCGGGTTTG GGCGCGTATG C.AAGGGCGT GTTGCTGTGT GCGGCAGGCA
501 GTATGGCATG GGTGTGTAAT GCCGTGGCGC AAAAGCTGCT GTCGGCGCAA
551 TTCGGGCCGC AACAGATTCT GCTGTTGATT TATGCGGCAA GTGCCGCCGT
601 GTTCCTGCCG TTTGCCGAAC CGGCACACAT CGGAAGTATG GACGGTACGT
651 TGGCGTGGGT ATGTATTGCG TATTGCTGCT TGAATACGTT AATCGGTTAC
701 GGCTCGTTCG GCGAGGCGTT GAAACATTGG GAGGCTTCCA AAGTCAGCGC
751 GGTAACAACC TTGCTCCCCG TGTTTACCGT AATAAATACT TTGCTCGGGC
801 ATTATGTGAT GCCTGAAACT TTTGCCGCGC CGGA..
This corresponds to the amino acid sequence <SEQ ID 400; ORF104>:
1 MENQRPLLGF RLALLAAMTW GTLPXSVRQV LKFVDAPTLV WVRFTVAAAV
51 LFVLLALGGR LPKRRDFSWC SFRLLLLGVA GISANFVLIA QGLHYISPTT
101 TQVLWQISPF TMIVVGVLVF KDRMTAAQKI GLVLLLAGLL MYFNDKFGEL
151 SGLGAYXKGV LLCAAGSMAW VCNAVAQKLL SAQFGPQQIL LLIYAASAAV
201 FLPFAEPAHI GSMDGTLAWV CIAYCCLNTL IGYGSFGEAL KHWEASKVSA
251 VTTLLPVFTV INTLLGHYVM PETFAAP...
Further work revealed the partial DNA sequence <SEQ ID 401>:
1 ATGGAAAACC AAAGGCCGCT CCTAGGCTTC GCGTTGGCAC TTTTGGCGGC
51 GATGACGTGG GGAACGCTGC CGATTGCCGT GCGGCAGGTA TTGAAGTTTG
101 TCGATGCGCC GACGCTGGTG TGGGTGCGTT TTACCGTGGC GGCGGCGGTA
151 TTGTTTGTTT TGCTGGCACT GGGCGGGCGG CTGCCGAAGC GGCGGGATTT
201 TTCTTGGTGC TCATTCAGGC TGCTGCTGCT CGGCGTGGCG GGCATTTCGG
251 CAAACTTTGT GCTGATTGCC CAAGGGCTGC ATTATATTTC GCCGACCACG
301 ACGCAGGTTT TGTGGCAGAT TTCGCCGTTT ACGATGATTG TTGTCGGTGT
351 GTTGGTGTTT AAAGACCGGA TGACTGCCGC TCAGAAAATC GGCTTGGTTT
401 TGCTGCTTGC CGGTTTGCTT ATGTTTTTTA ACGATAAATT CGGCGAGTTG
451 TCGGGTTTGG GCGCGTATGC GAAGGGCGTG TTGCTGTGTG CGGCAGGCAG
501 TATGGCATGG GTGTGTTATG CCGTGGCGCA AAAGCTGCTG TCGGCGCAAT
551 TCGGGCCGCA ACAGATTCTG CTGTTGATTT ATGCGGCAAG TGCCGCCGTG
601 TTCCTGCCGT TTGCCGAACC GGCACACATC GGAAGTTTGG ACGGTACGTT
651 GGCGTGGGTT TGTTTTGCGT ATTGCTGCTT GAATACGTTA ATCGGTTACG
701 GCTCGTTCGG CGAGGCGTTG AAACATTGGG AGGCTTCCAA AGTCAGCGCG
751 GTAACAACCT TGCTCCCCGT GTTTACCGTA ATAwTwwCTT TGCTCGGGCA
801 TTATGTGATG CCTGAAACTT TTGCCGCGCC GGA...
This corresponds to the amino acid sequence <SEQ ID 402; ORF104-1>:
1 MENQRPLLGF ALALLAAMTW GTLPIAVRQV LKFVDAPTLV WVRFTVAAAV
51 LFVLLALGGR LPKRRDFSWC SFRLLLLGVA GISANFVLIA QGLHYISPTT
101 TQVLWQISPF TMIVVGVLVF KDRMTAAQKI GLVLLLAGLL MFFNDKFGEL
151 SGLGAYAKGV LLCAAGSMAW VCYAVAQKLL SAQFGPQQIL LLIYAASAAV
201 FLPFAEPAHI GSLDGTLAWV CFAYCCLNTL IGYGSFGEAL KHWEASKVSA
251 VTTLLPVFTV IXXLLGHYVM PETFAAP...
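Computer analysis of this amino acid sequence gave the following results:
Homology with the hypothetical H. influenzae protein HI0878 (accession number U32769)
ORF104 and HI0878 show 40% amino acid identity over a 277 aa overlap: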
orf104 4 QRPLLGFRLALLAAMTWGTLPXSVRQVLKFVDAPTLVWXXXXXXXXXXXXXXXXXXXXP- 62
Q+PLLGF AL+ AM WG+LP +++QVL ++A T+VW P
HI0878 3 QQPLLGFTFALITAMAWGSLPIALKQVLSVMNAQTIVWYRFIIAAVSLLALLAYKKQLPE 62
orf104 63 --KRRDFSWCSFRLLLLGVAGISANFVLIAQGLHYISPTTTQVLWQISPFTMIVVGVLVF 120
K R ++W ++L+GV G+++NF+L + L+YI P+ Q+ +S F M++ GVL+F
HI0878 63 LMKVRQYAW----IMLIGVIGLTSNFLLFSSSLNYIEPSVAQIFIHLSSFGMLICGVLIF 118
orf104 121 KDRMTAAQKIXXXXXXXXXXMYFNDKFGELSGLGAYXKGVLLCAAGSMAWVCNAVAQKLL 180
K+++ QKI ++FND+F +GL Y GV+L G++ WV +AQKL+
HI0878 119 KEKLGLHQKIGLFLLLIGLGLFFNDRFDAFAGLNQYSTGVILGVGGALIWVAYGMAQKLM 178
orf104 181 SAQFGPQQILLLIYAASAAVFLPFAEPAHIGSMDGTLAWVCIAYCCLNTLIGYGSFGEAL 240
+F QQILL++Y A F+P A+ + + + LA +C YCCLNTLIGYGS+ EAL
HI0878 179 LRKFNSQQILLMMYLGCAIAFMPMADFSQVQELT-PLALICFIYCCLNTLIGYGSYAEAL 237
orf104 241 KHWEASKVSAVTTLLPVFTVINTLLGHYVMPETFAAP 277
W+ SKVS V TL+P+FT++ + + HY P FAAP
HI0878 238 NRWDVSKVSVVITLVPLFTILFSHIAHYFSPADFAAP 274
Homology with a predicted ORF from N. meningitidis (strain A)
ORF104 shows 95.3% identity over a 277 aa overlap with an ORF (ORF104a) from strain A of N. meningitidis:
10 20 30 40 50 60
orf104.pep MENQRPLLGFRLALLAAMTWGTLPXSVRQVLKFVDAPTLVWVRFTVAAAVLFVLLALGGR
|||||||||| ||||||||||||| :||||||||||||||||||||||||||||||||||
orf104a MENQRPLLGFALALLAAMTWGTLPIAVRQVLKFVDAPTLVWVRFTVAAAVLFVLLALGGR
10 20 30 40 50 60
70 80 90 100 110 120
orf104.pep LPKRRDFSWCSFRLLLLGVAGISANFVLIAQGLHYISPTTTQVLWQISPFTMIVVGVLVF
||| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf104a LPKWRDFSWCSFRLLLLGVAGISANFVLIAQGLHYISPTTTQVLWQISPFTMIVVGVLVF
70 80 90 100 110 120
130 140 150 160 170 180
orf104.pep KDRMTAAQKIGLVLLLAGLLMYFNDKFGELSGLGAYXKGVLLCAAGSMAWVCNAVAQKLL
|||||||||||||||||||||:|||||||||||||| ||||||||||||||| |||||||
orf104a KDRMTAAQKIGLVLLLAGLLMFFNDKFGELSGLGAYAKGVLLCAAGSMAWVCYAVAQKLL
130 140 150 160 170 180
190 200 210 220 230 240
orf104.pep SAQFGPQQILLLIYAASAAVFLPFAEPAHIGSMDGTLAWVCIAYCCLNTLIGYGSFGEAL
|||||||||||||||||||||||||| |||||:||||||||:||||||||||||||||||
orf104a SAQFGPQQILLLIYAASAAVFLPFAELAHIGSLDGTLAWVCFAYCCLNTLIGYGSFGEAL
190 200 210 220 230 240
250 260 270
orf104.pep KHWEASKVSAVTTLLPVFTVINTLLGHYVMPETFAAP
||||||||||||||||||||| :||||||||:|||||
orf104a KHWEASKVSAVTTLLPVFTVIFSLLGHYVMPDTFAAPDMNGLGYAGALVVVGGAVTAAVG
250 260 270 280 290 300
The complete length ORF104a nucleotide sequence <SEQ ID 403> is:
1 ATGGAAAACC AAAGGCCGCT CCTAGGCTTC GCGTTGGCAC TTTTGGCGGC
51 GATGACGTGG GGAACGCTGC CGATTGCCGT GCGGCAGGTA TTGAAGTTTG
101 TCGATGCGCC GACGCTGGTG TGGGTGCGTT TTACCGTGGC GGCGGCGGTA
151 TTGTTTGTTT TGCTGGCATT GGGCGGGCGG CTGCCGAAGT GGCGGGATTT
201 TTCTTGGTGC TCATTCAGGC TGCTGCTGCT CGGCGTGGCG GGCATTTCGG
251 CAAACTTTGT GCTGATTGCC CAAGGGCTGC ATTATATTTC GCCGACCACG
301 ACGCAGGTTT TGTGGCAGAT TTCGCCGTTT ACGATGATTG TTGTCGGTGT
351 GTTGGTGTTT AAAGACCGGA TGACTGCCGC TCAGAAAATC GGCTTGGTTT
401 TGCTGCTTGC CGGTTTGCTT ATGTTTTTTA ACGATAAATT CGGCGAGTTG
451 TCGGGTTTGG GCGCGTATGC GAAGGGCGTG TTGCTGTGTG CGGCAGGCAG
501 TATGGCATGG GTGTGTTATG CCGTGGCGCA AAAGCTGCTG TCGGCGCAAT
551 TCGGGCCGCA ACAGATTCTG CTGTTGATTT ATGCGGCAAG TGCCGCCGTG
601 TTCCTGCCGT TTGCCGAACT GGCACACATC GGAAGTTTGG ACGGTACGTT
651 GGCGTGGGTT TGTTTTGCGT ATTGCTGCTT GAATACGTTA ATCGGTTACG
701 GCTCGTTCGG CGAGGCGTTG AAACATTGGG AGGCTTCCAA AGTCAGCGCG
751 GTAACAACCT TGCTCCCCGT GTTTACCGTA ATATTTTCTT TGCTCGGGCA
801 TTATGTGATG CCTGATACTT TTGCCGCGCC GGATATGAAC GGTTTGGGTT
851 ATGCCGGCGC ACTGGTCGTG GTCGGGGGTG CGGTTACGGC GGCGGTGGGG
901 GACAGGCTGT TCAAACGCCG CTAG
This encodes a protein having the amino acid sequence <SEQ ID 404>:
1 MENQRPLLGF ALALLAAMTW GTLPIAVRQV LKFVDAPTLV WVRFTVAAAV
51 LFVLLALGGR LPKWRDFSWC SFRLLLLGVA GISANFVLIA QGLHYISPTT
101 TQVLWQISPF TMIVVGVLVF KDRMTAAQKI GLVLLLAGLL MFFNDKFGEL
151 SGLGAYAKGV LLCAAGSMAW VCYAVAQKLL SAQFGPQQIL LLIYAASAAV
201 FLPFAELAHI GSLDGTLAWV CFAYCCLNTL IGYGSFGEAL KHWEASKVSA
251 VTTLLPVFTV IFSLLGHYVM PDTFAAPDMN GLGYAGALVV VGGAVTAAVG
301 DRLFKRR*
ORF104a and ORF104-1 show 98.2% identity over a 277 aa overlap:
10 20 30 40 50 60
orf104a.pep MENQRPLLGFALALLAAMTWGTLPIAVRQVLKFVDAPTLVWVRFTVAAAVLFVLLALGGR
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf104-1 MENQRPLLGFALALLAAMTWGTLPIAVRQVLKFVDAPTLVWVRFTVAAAVLFVLLALGGR
10 20 30 40 50 60
70 80 90 100 110 120
orf104a.pep LPKWRDFSWCSFRLLLLGVAGISANFVLIAQGLHYISPTTTQVLWQISPFTMIVVGVLVF
||| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf104-1 LPKRRDFSWCSFRLLLLGVAGISANFVLIAQGLHYISPTTTQVLWQISPFTMIVVGVLVF
70 80 90 100 110 120
130 140 150 160 170 180
orf104a.pep KDRMTAAQKIGLVLLLAGLLMFFNDKFGELSGLGAYAKGVLLCAAGSMAWVCYAVAQKLL
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf104-1 KDRMTAAQKIGLVLLLAGLLMFFNDKFGELSGLGAYAKGVLLCAAGSMAWVCYAVAQKLL
130 140 150 160 170 180
190 200 210 220 230 240
orf104a.pep SAQFGPQQILLLIYAASAAVFLPFAELAHIGSLDGTLAWVCFAYCCLNTLIGYGSFGEAL
|||||||||||||||||||||||||| |||||||||||||||||||||||||||||||||
orf104-1 SAQFGPQQILLLIYAASAAVFLPFAEPAHIGSLDGTLAWVCFAYCCLNTLIGYGSFGEAL
190 200 210 220 230 240
250 260 270 280 290 300
orf104a.pep KHWEASKVSAVTTLLPVFTVIFSLLGHYVMPDTFAAPDMNGLGYAGALVVVGGAVTAAVG
||||||||||||||||||||| ||||||||:|||||
orf104-1 KHWEASKVSAVTTLLPVFTVIXXLLGHYVMPETFAAP
250 260 270
Homology with a predicted ORF from N. gonorrhoeae
ORF104 shows 93.9% identity over a 277 aa overlap with a predicted ORF (ORF104.ng) from N. gonorrhoeae:
orf104.pep MENQRPLLGFRLALLAAMTWGTLPXSVRQVLKFVDAPTLVWVRFTVAAAVLFVLLALGGR 60
|||||||||| ||||||||||||| :||||||||||||||||||||||||||||||||||
orf104ng MENQRPLLGFALALLAAMTWGTLPIAVRQVLKFVDAPTLVWVRFTVAAAVLFVLLALGGR 60
orf104.pep LPKRRDFSWCSFRLLLLGVAGISANFVLIAQGLHYISPTTTQVLWQISPFTMIVVGVLVF 120
||||||||| |||||||||:||||||||||||||||||||||||||||||||||||||||
orf104ng LPKRRDFSWHSFRLLLLGVTGISANFVLIAQGLHYISPTTTQVLWQISPFTMIVVGVLVF 120
orf104.pep KDRMTAAQKIGLVLLLAGLLMYFNDKFGELSGLGAYXKGVLLCAAGSMAWVCNAVAQKLL 180
||||||||||||||||:||||:|||||||||||||| ||||||||||||||| |||||||
orf104ng KDRMTAAQKIGLVLLLVGLLMFFNDKFGELSGLGAYAKGVLLCAAGSMAWVCYAVAQKLL 180
orf104.pep SAQFGPQQILLLIYAASAAVFLPFAEPAHIGSMDGTLAWVCIAYCCLNTLIGYGSFGEAL 240
|||||||||||||||||||||| ||||||||:||||||||::|||||||||||||||||
orf104ng SAQFGPQQILLLIYAASAAVFLLXAEPAHIGSLDGTLAWVCFVYCCLNTLIGYGSFGEAL 240
orf104.pep KHWEASKVSAVTTLLPVFTVINTLLGHYVMPETFAAP 277
||||||||||||||||||||| :||||||||:|||||
orf104ng KHWEASKVSAVTTLLPVFTVIFSLLGHYVMPDTFAAPDMNGLGYVGALVVVGGAVTAAVG 300
The complete length ORF104ng nucleotide sequence <SEQ ID 405> is predicted to encode a protein having the amino acid sequence <SEQ ID 406>:
1 MENQRPLLGF ALALLAAMTW GTLPIAVRQV LKFVDAPTLV WVRFTVAAAV
51 LFVLLALGGR LPKRRDFSWH SFRLLLLGVT GISANFVLIA QGLHYISPTT
101 TQVLWQISPF TMIVVGVLVF KDRMTAAQKI GLVLLLVGLL MFFNDKFGEL
151 SGLGAYAKGV LLCAAGSMAW VCYAVAQKLL SAQFGPQQIL LLIYAASAAV
201 FLLXAEPAHI GSLDGTLAWV CFVYCCLNTL IGYGSFGEAL KHWEASKVSA
251 VTTLLPVFTV IFSLLGHYVM PDTFAAPDMN GLGYVGALVV VGGAVTAAVG
301 DRPFKRR*
Further work revealed the complete gonococcal nucleotide sequence <SEQ ID 407>:
1 ATGGAAAACC AAAGGCCGCT CCTAGGCTTC GCGTTGGCAC TTTTGGCGGC
51 GATGACGTGG GGGACGCTGC CGATTGCCGT GCGGCAGGTA TTGAAGTTTG
101 TCGATGCGCC GACGCTGGTG TGGGTGCGTT TTACCGTGGC GGCGGCGGTA
151 TTGTTTGTTT TGCTGGCATT GGGCGGGCGG CTGCCGAAGC GGCGGGATTT
201 TTCTTGGCAT TCATTCAGGC TGCTGCTGCT CGGCGTGACG GGCATTTCGG
251 CAAACTTTGT GCTGATTGCC CAAGGGCTGC ATTATATTTC GCCGACCACG
301 ACGCAGGTTT TGTGGCAGAT TTCGCCGTTT ACGATGATTG TTGTCGGCGT
351 GTTGGTGTTT AAAGACCGGA tgaCTGCCGC GCAGAAAATC GGTTTGGTTT
401 TGCTGCttgT CGGTttgCTT ATGTTTTtta ACGACAAATT CGGCGAGTTG
451 TCGGGTTTGG GCGCGTATGC GAAGGGCGTG TTGCTGTGTG CGGCAGGCAG
501 TATGGCCTGG GTGTGTTATG CCGTGGCGCA AAAGCTGCTG TCGGCGCAAT
551 TCGGGCCGCA ACAGATTCTG CTGTTGATTT ATGCGGcaag tgccgccGTG
601 TTCCtgccgT TTGccgaaCC GGCACACATC GGAAGTTTgg aCGGTACGtt
651 GGCGTGGGTT TGTTTTGTGT ATTGCTGCTT GAATACGTTA ATCGGTTACG
701 GCTCGTTCGG CGAGGCGTTG AAACATTGGG AGGCTTCCAA AGTCAGCGCG
751 GTAACAACCT TGCTCCCCGT GTTTACCGTA ATATTTTCTT TGCTCGGGCA
801 TTATGTGATG CCTGATACTT TTGCCGCGCC GGATATGAAC GGTTTGGGTT
851 ATGTCGGCGC ACTGGTCGTG GTCGGGGGTG CGGTTACGGC GGCGGTGGGG
901 GACAGGCCGT TCAAACGCCG CTAG
This corresponds to the amino acid sequence <SEQ ID 408; ORF104ng-1>:
1 MENQRPLLGF ALALLAAMTW GTLPIAVRQV LKFVDAPTLV WVRFTVAAAV
51 LFVLLALGGR LPKRRDFSWH SFRLLLLGVT GISANFVLIA QGLHYISPTT
101 TQVLWQISPF TMIVVGVLVF KDRMTAAQKI GLVLLLVGLL MFFNDKFGEL
151 SGLGAYAKGV LLCAAGSMAW VCYAVAQKLL SAQFGPQQIL LLIYAASAAV
201 FLPFAEPAHI GSLDGTLAWV CFVYCCLNTL IGYGSFGEAL KHWEASKVSA
251 VTTLLPVFTV IFSLLGHYVM PDTFAAPDMN GLGYVGALVV VGGAVTAAVG
301 DRPFKRR*
ORF104ng-1 and ORF104-1 show 97.5% identity over a 277 aa overlap:
10 20 30 40 50 60
orf104-1.pep MENQRPLLGFALALLAAMTWGTLPIAVRQVLKFVDAPTLVWVRFTVAAAVLFVLLALGGR
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf104ng-1 MENQRPLLGFALALLAAMTWGTLPIAVRQVLKFVDAPTLVWVRFTVAAAVLFVLLALGGR
10 20 30 40 50 60
70 80 90 100 110 120
orf104-1.pep LPKRRDFSWCSFRLLLLGVAGISANFVLIAQGLHYISPTTTQVLWQISPFTMIVVGVLVF
||||||||| |||||||||:||||||||||||||||||||||||||||||||||||||||
orf104ng-1 LPKRRDFSWHSFRLLLLGVTGISANFVLIAQGLHYISPTTTQVLWQISPFTMIVVGVLVF
70 80 90 100 110 120
130 140 150 160 170 180
orf104-1.pep KDRMTAAQKIGLVLLLAGLLMFFNDKFGELSGLGAYAKGVLLCAAGSMAWVCYAVAQKLL
||||||||||||||||:|||||||||||||||||||||||||||||||||||||||||||
orf104ng-1 KDRMTAAQKIGLVLLLVGLLMFFNDKFGELSGLGAYAKGVLLCAAGSMAWVCYAVAQKLL
130 140 150 160 170 180
190 200 210 220 230 240
orf104-1.pep SAQFGPQQILLLIYAASAAVFLPFAEPAHIGSLDGTLAWVCFAYCCLNTLIGYGSFGEAL
||||||||||||||||||||||||||||||||||||||||||:|||||||||||||||||
orf104ng-1 SAQFGPQQILLLIYAASAAVFLPFAEPAHIGSLDGTLAWVCFVYCCLNTLIGYGSFGEAL
190 200 210 220 230 240
250 260 270
orf104-1.pep KHWEASKVSAVTTLLPVFTVIXXLLGHYVMPETFAAP
||||||||||||||||||||| ||||||||:|||||
orf104ng-1 KHWEASKVSAVTTLLPVFTVIFSLLGHYVMPDTFAAPDMNGLGYVGALVVVGGAVTAAVG
250 260 270 280 290 300
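In addition, ORF104ng-1 shows significant homology to a hypothetical H. influenzae protein:
gi|1573895 (U32769) hypothetical [Haemophilus influenzae] Length = 306
Score = 237 bits (598), Expect = 8e-62
Identities = 114/280 (40%), Positives = 168/280 (59%), Gaps = 8/280 (2%)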
Query: 30 QRPXXXXXXXXXXXMTWGTLPIAVRQVLKFVDAPTLVWXXXXXXXXXXXXXXXXXXXXP- 88
Q+P M WG+LPIA++QVL ++A T+VW P
Sbjct: 3 QQPLLGFTFALITAMAWGSLPIALKQVLSVMNAQTIVWYRFIIAAVSLLALLAYKKQLPE 62
Query: 89 --KRRDFSWHSFRLLLLGVTGISANFVLIAQGLHYISPTTTQVLWQISPFTMIVVGVLVF 146
K R ++W ++L+GV G+++NF+L + L+YI P+ Q+ +S F M++ GVL+F
Sbjct: 63 LMKVRQYAW----IMLIGVIGLTSNFLLFSSSLNYIEPSVAQIFIHLSSFGMLICGVLIF 118
Query: 147 KDRMTAAQKIXXXXXXXXXXMFFNDKFGELSGLGAYAKGVLLCAAGSMAWVCYAVAQKLL 206
K+++ QKI +FFND+F +GL Y+ GV+L G++ WV Y +AQKL+
Sbjct: 119 KEKLGLHQKIGLFLLLIGLGLFFNDRFDAFAGLNQYSTGVILGVGGALIWVAYGMAQKLM 178
Query: 207 SAQFGPQQILLLIYAASAAVFLPFAEPAHIGSLDGTLAWVCFVYCCLNTLIGYGSFGEAL 266
+F QQILL++Y A F+P A+ + + L LA +CF+YCCLNTLIGYGS+ EAL
Sbjct: 179 LRKFNSQQILLMMYLGCAIAFMPMADFSQVQELT-PLALICFIYCCLNTLIGYGSYAEAL 237
Query: 267 KHWEASKVSAVTTLLPVFTVIFSLLGHYVMPDTFAAPDMN 306
W+ SKVS V TL+P+FT++FS + HY P FAAP++N
Sbjct: 238 NRWDVSKVSVVITLVPLFTILFSHIAHYFSPADFAAPELN 277
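Based on this analysis, including the presence of a putative leader sequence and several putative transmembrane domains in the gonococcal protein, it is predicted that these proteins from N. meningitidis and N. gonorrhoeae, and their epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies.
Example 48
The following partial DNA sequence was identified in N. meningitidis <SEQ ID 409>: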
1 ATGGTAGCTC GTCGGGCTCA TAACCCGAAG GTCGTAGGTT CGAATCCTGT
51 .CCCGCAACC TAATTTCAAA CCCCTCGGTT CAATGCCGAG GG.GTTTTGT
101 T.TTGCCTGT TTCCTGTTTC CTGTTTCCTG CCGCCTCCGT TTTTTGCCGG
151 ATTTTCCTTC CGGCCGCAAT ATCGGAACGG CAGACCGCCG TCTGTTTGCG
201 GTTGCAAATT CAGGCAGTTT GGCTACAATC TTCCGCATTG TCTTCAAGAA
251 AGCCAACCAT GCCGACCGTC CGTTTTACCG AATCCGTCAG CAAACAAGAC
301 CTTGATGCTC TGTTCGAGTG GGCAAAAGCA AGTTACGGTG CAGAAAGTTG
351 CTGGAAAACG CTGTATCTGA ACGGTCysCC TTTGGGCAAC CTGTCGCCGG
401 AATGGGTGGA ACGCGTsmmA AAAGACTGGG AGGCAGGCTG CyCGGAGTCT
451 TCAGACGGCA TTTTTCTGAA TgCGGACGGc TGgCctGATA TGGgCGGAcg
501 cTTACAGCAC CTCGCCCTCG GTTGGCACTG TGCGGGGCTG TTGGACGgsT
551 GGCGCAACGA GTGTTTCGAC CTGACCGACG GCGGCGGCAA CCCCTTGTTC
601 ACGCTCGaAc GCGCCGyTTT mCGTCCTkTC GGACTGCTCA GCCGCGCCGT
651 CCATCTCAAC GGTCTGACCG AATCGGACGG CCGATGGCAT TTCTGGATAG
701 GCAGGCGCAG TCCGCACAAA GCAGTCGATC CCAACAAACT CGACAATACT
751 rCCGCCGGCG GTGTTTCCGG CGGCGAAATG CCGTCTGAAG CCGTGTGTCG
801 CGAAAGCAGC GAAGAAGCCG GTTTGGATAA AACGCTGcTT CCGCTCATCC
851 GCCCGGTATC GCAGCTGCAC AGCCTGCGCT CCGTCAGCCG GGGTGTACAC
901 AATGAAATCC TGTATGTATT CGATGCCGTC CTGCCG...
This corresponds to the amino acid sequence <SEQ ID 410; ORF105>:
1 MVARRAHNPK VVGSNPXPAT XFQTPRFNAE XVLXLPVSCF LFPAASVFCR
51 IFLPAAISER QTAVCLRLQI QAVWLQSSAL SSRKPTMPTV RFTESVSKQD
101 LDALFEWAKA SYGAESCWKT LYLNGXPLGN LSPEWVERVX KDWEAGCXES
151 SDGIFLNADG WPDMGGRLQH LALGWHCAGL LDGWRNECFD LTDGGGNPLF
201 TLERAXXRPX GLLSRAVHLN GLTESDGRWH FWIGRRSPHK AVDPNKLDNT
251 XAGGVSGGEM PSEAVCRESS EEAGLDKTLL PLIRPVSQLH SLRSVSRGVH
301 NEILYVFDAV LP...
Further work revealed the complete nucleotide sequence <SEQ ID 411>:
1 ATGCCGACCG TCCGTTTTAC CGAATCCGTC AGCAAACAAG ACCTTGATGC
51 TCTGTTCGAG TGGGCAAAAG CAAGTTACGG TGCAGAAAGT TGCTGGAAAA
101 CGCTGTATCT GAACGGTCTG CCTTTGGGCA ACCTGTCGCC GGAATGGGTG
151 GAACGCGTCA AAAAAGACTG GGAGGCAGGC TGCTCGGAGT CTTCAGACGG
201 CATTTTTCTG AATGCGGACG GCTGGCCTGA TATGGGCGGA CGCTTACAGC
251 ACCTCGCCCT CGGTTGGCAC TGTGCGGGGC TGTTGGACGG CTGGCGCAAC
301 GAGTGTTTCG ACCTGACCGA CGGCGGCGGC AACCCCTTGT TCACGCTCGA
351 ACGCGCCGCT TTCCGTCCTT TCGGACTGCT CAGCCGCGCC GTCCATCTCA
401 ACGGTCTGAC CGAATCGGAC GGCCGATGGC ATTTCTGGAT AGGCAGGCGC
451 AGTCCGCACA AAGCAGTCGA TCCCAACAAA CTCGACAATA CTGCCGCCGG
501 CGGTGTTTCC GGCGGCGAAA TGCCGTCTGA AGCCGTGTGT CGCGAAAGCA
551 GCGAAGAAGC CGGTTTGGAT AAAACGCTGC TTCCGCTCAT CCGCCCGGTA
601 TCGCAGCTGC ACAGCCTGCG CTCCGTCAGC CGGGGTGTAC ACAATGAAAT
651 CCTGTATGTA TTCGATGCCG TCCTGCCCGA AACCTTCCTG CCTGAAAATC
701 AGGATGGCGA AGTGGCGGGT TTTGAGAAAA TGGACATCGG CGGTCTGTTG
751 GATGCCATGT TGTCGGGAAA CATGATGCAC GACGCGCAAC TGGTTACGCT
801 GGACGCGTTT TGCCGTTACG GTCTGATTGA TGCCGCCCAT CCGCTGTCCG
851 AGTGGCTGGA CGGCATACGT TTATAG
This corresponds to the amino acid sequence <SEQ ID 412; ORF105-1>:
1 MPTVRFTESV SKQDLDALFE WAKASYGAES CWKTLYLNGL PLGNLSPEWV
51 ERVKKDWEAG CSESSDGIFL NADGWPDMGG RLQHLALGWH CAGLLDGWRN
101 ECFDLTDGGG NPLFTLERAA FRPFGLLSRA VHLNGLTESD GRWHFWIGRR
151 SPHKAVDPNK LDNTAAGGVS GGEMPSEAVC RESSEEAGLD KTLLPLIRPV
201 SQLHSLRSVS RGVHNEILYV FDAVLPETFL PENQDGEVAG FEKMDIGGLL
251 DAMLSGNMMH DAQLVTLDAF CRYGLIDAAH PLSEWLDGIR L*
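Computer analysis of this amino acid sequence gave the following results:
Homology with a predicted ORF from N. meningitidis (strain A)
ORF105 shows 89.4% identity over a 226 aa overlap with an ORF (ORF105a) from strain A of N. meningitidis: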
60 70 80 90 100 110
orf105.pep ISERQTAVCLRLQIQAVWLQSSALSSRKPTMPTVRFTESVSKQDLDALFEWAKASYGAES
||||||||||||:|||||||||||||||||
orf105a MPTVRFTESVSKHDLDALFEWAKASYGAES
10 20 30
120 130 140 150 160 170
orf105.pep CWKTLYLNGXPLGNLSPEWVERVXKDWEAGCXESSDGIFLNADGWPDMGGRLQHLALGWH
||||||||| |||||||||:||| ||||||| ||||||||||||||||| |||||| |:
orf105a CWKTLYLNGLPLGNLSPEWAERVKKDWEAGCSESSDGIFLNADGWPDMGRRLQHLARIWK
40 50 60 70 80 90
180 190 200 210 220 230
orf105.pep CAGLLDGWRNECFDLTDGGGNPLFTLERAXXRPXGLLSRAVHLNGLTESDGRWHFWIGRR
|||| |||:|||||||||:||||:|||| || ||||||||||||:|||||||||||||
orf105a EAGLLHGWRDECFDLTDGGSNPLFALERAAFRPFGLLSRAVHLNGLVESDGRWHFWIGRR
100 110 120 130 140 150
240 250 260 270 280 290
orf105.pep SPHKAVDPNKLDNTXAGGVSGGEMPSEAVCRESSEEAGLDKTLLPLIRPVSQLHSLRSVS
||||||||:||||| |||||:||:|||:||||||||||||||||||||||||||||| ||
orf105a SPHKAVDPDKLDNTAAGGVSSGELPSETVCRESSEEAGLDKTLLPLIRPVSQLHSLRPVS
160 170 180 190 200 210
300 310
orf105.pep RGVHNEILYVFDAVLP
||||||||||||||||
orf105a RGVHNEILYVFDAVLPETFLPENQDGEVAGFEKMDIGGLLAAMLSGNMMHDAQLVTLDAF
220 230 240 250 260 270
The complete length ORF105a nucleotide sequence <SEQ ID 413> is:
1 ATGCCGACCG TCCGTTTTAC CGAATCCGTC AGCAAACACG ACCTTGATGC
51 CCTATTCGAG TGGGCAAAGG CAAGTTACGG TGCGGAAAGT TGCTGGAAAA
101 CGCTGTATCT GAACGGTCTG CCTTTGGGCA ATCTGTCGCC GGAATGGGCG
151 GAGCGCGTCA AAAAAGACTG GGAGGCAGGC TGCTCGGAGT CTTCAGACGG
201 CATTTTCCTG AATGCGGACG GCTGGCCAGA TATGGGCAGA CGCTTGCAGC
251 ACCTCGCCCG AATATGGAAA GAAGCGGGAC TGCTTCACGG CTGGCGCGAC
301 GAGTGTTTCG ACCTGACCGA CGGCGGCAGC AATCCCTTGT TCGCGCTCGA
351 ACGCGCCGCT TTCCGTCCGT TCGGACTGCT CAGCCGCGCC GTCCATCTCA
401 ACGGTTTGGT CGAATCGGAC GGCCGATGGC ATTTCTGGAT AGGCAGGCGC
451 AGTCCGCACA AAGCAGTCGA TCCCGACAAA CTCGACAATA CTGCCGCCGG
501 CGGTGTTTCC AGCGGTGAAT TGCCGTCTGA AACCGTGTGT CGCGAAAGCA
551 GCGAAGAAGC CGGTTTGGAT AAAACGCTGC TTCCGCTCAT CCGCCCGGTA
601 TCGCAGCTGC ACAGCCTGCG CCCCGTCAGC CGGGGTGTGC ACAATGAAAT
651 CCTGTATGTA TTCGATGCCG TCCTGCCCGA AACCTTCCTG CCTGAAAATC
701 AGGATGGCGA AGTGGCGGGT TTTGAGAAAA TGGACATCGG CGGTCTGTTG
751 GCTGCCATGT TGTCGGGAAA CATGATGCAC GACGCGCAAC TGGTTACGCT
801 GGACGCGTTT TGCCGTTACG GTCTGATTGA TGCCGCCCAT CCGCTGTCCG
851 AGTGGCTGGA CGGCATACGT TTATAG
This encodes a protein having the amino acid sequence <SEQ ID 414>:
1 MPTVRFTESV SKHDLDALFE WAKASYGAES CWKTLYLNGL PLGNLSPEWA
51 ERVKKDWEAG CSESSDGIFL NADGWPDMGR RLQHLARIWK EAGLLHGWRD
101 ECFDLTDGGS NPLFALERAA FRPFGLLSRA VHLNGLVESD GRWHFWIGRR
151 SPHKAVDPDK LDNTAAGGVS SGELPSETVC RESSEEAGLD KTLLPLIRPV
201 SQLHSLRPVS RGVHNEILYV FDAVLPETFL PENQDGEVAG FEKMDIGGLL
251 AAMLSGNMMH DAQLVTLDAF CRYGLIDAAH PLSEWLDGIR L*
ORF105a and ORF105-1 show 93.8% identity over a 291 aa overlap:
10 20 30 40 50 60
orf105a.pep MPTVRFTESVSKHDLDALFEWAKASYGAESCWKTLYLNGLPLGNLSPEWAERVKKDWEAG
||||||||||||:||||||||||||||||||||||||||||||||||||:||||||||||
orf105-1 MPTVRFTESVSKQDLDALFEWAKASYGAESCWKTLYLNGLPLGNLSPEWVERVKKDWEAG
10 20 30 40 50 60
70 80 90 100 110 120
orf105a.pep CSESSDGIFLNADGWPDMGRRLQHLARIWKEAGLLHGWRDECFDLTDGGSNPLFALERAA
||||||||||||||||||| |||||| |: |||| |||:|||||||||:||||:|||||
orf105-1 CSESSDGIFLNADGWPDMGGRLQHLALGWHCAGLLDGWRNECFDLTDGGGNPLFTLERAA
70 80 90 100 110 120
130 140 150 160 170 180
orf105a.pep FRPFGLLSRAVHLNGLVESDGRWHFWIGRRSPHKAVDPDKLDNTAAGGVSSGELPSETVC
||||||||||||||||:|||||||||||||||||||||:|||||||||||:||:|||:||
orf105-1 FRPFGLLSRAVHLNGLTESDGRWHFWIGRRSPHKAVDPNKLDNTAAGGVSGGEMPSEAVC
130 140 150 160 170 180
190 200 210 220 230 240
orf105a.pep RESSEEAGLDKTLLPLIRPVSQLHSLRPVSRGVHNEILYVFDAVLPETFLPENQDGEVAG
||||||||||||||||||||||||||| ||||||||||||||||||||||||||||||||
orf105-1 RESSEEAGLDKTLLPLIRPVSQLHSLRSVSRGVHNEILYVFDAVLPETFLPENQDGEVAG
190 200 210 220 230 240
250 260 270 280 290
orf105a.pep FEKMDIGGLLAAMLSGNMMHDAQLVTLDAFCRYGLIDAAHPLSEWLDGIRLX
|||||||||| |||||||||||||||||||||||||||||||||||||||||
orf105-1 FEKMDIGGLLDAMLSGNMMHDAQLVTLDAFCRYGLIDAAHPLSEWLDGIRLX
250 260 270 280 290
Homology with a predicted ORF from N. gonorrhoeae
ORF105 shows 87.5% identity over a 312 aa overlap with a predicted ORF (ORF105.ng) from N. gonorrhoeae:
orf105.pep MVARRAHNPKVVGSNPXPATXFQTPRFNAEXVLXLPVSCFLFPAASVFCRIFLPAAISER 60
|||||||||||||||| ||| :|||||||| || |||||||||||||||||||||
orf105ng MVARRAHNPKVVGSNPAPATKYQTPRFNAEGVLF-----FLFPAASVFCRIFLPAAISER 55
orf105.pep QTAVCLRLQIQAVWLQSSALSSRKPTMPTVRFTESVSKQDLDALFEWAKASYGAESCWKT 120
|:|||||||||||||||||| ||||:|||||||||||||||||||| |||||||||||||
orf105ng QAAVCLRLQIQAVWLQSSALCSRKPAMPTVRFTESVSKQDLDALFERAKASYGAESCWKT 115
orf105.pep LYLNGXPLGNLSPEWVERVXKDWEAGCXESSDGIFLNADGWPDMGGRLQHLALGWHCAGL 180
|||| |||||||||:||: ||||||| |||:|||||||||||||||||||| |: |||
orf105ng LYLNRLPLGNLSPEWAERIKKDWEAGCSESSNGIFLNADGWPDMGGRLQHLARTWNKAGL 175
orf105.pep LDGWRNECFDLTDGGGNPLFTLERAXXRPXGLLSRAVHLNGLTESDGRWHFWIGRRSPHK 240
| ||||||||||||||||||||||| || ||| ||||||||:||:||||||||||||||
orf105ng LHGWRNECFDLTDGGGNPLFTLERAAFRPFGLLIRAVHLNGLVESNGRWHFWIGRRSPHK 235
orf105.pep AVDPNKLDNTXAGGVSGGEMPSEAVCRESSEEAGLDKTLLPLIRPVSQLHSLRSVSRGVH 300
||||:|||| :|||||||||||||||||||||||||||:|||||||:||||| ||||||
orf105ng AVDPGKLDNIAGGGVSGGEMPSEAVCRESSEEAGLDKTLFPLIRPVSRLHSLRPVSRGVH 295
orf105.pep NEILYVFDAVLP 312
||||||||||||
orf105ng NEILYVFDAVLPETFLPENQDGEVAGFEKMDIGGLLDAMLSKNMMHDAQLVTLDAFYRYG 355
The complete length ORF105ng nucleotide sequence <SEQ ID 415> is predicted to encode a protein having the amino acid sequence <SEQ ID 416>:
1 MVARRAHNPK VVGSNPAPAT KYQTPRFNAE GVLFFLFPAA SVFCRIFLPA
51 AISERQAAVC LRLQIQAVWL QSSALCSRKP AMPTVRFTES VSKQDLDALF
101 ERAKASYGAE SCWKTLYLNR LPLGNLSPEW AERIKKDWEA GCSESSNGIF
151 LNADGWPDMG GRLQHLARTW NKAGLLHGWR NECFDLTDGG GNPLFTLERA
201 AFRPFGLLIR AVHLNGLVES NGRWHFWIGR RSPHKAVDPG KLDNIAGGGV
251 SGGEMPSEAV CRESSEEAGL DKTLFPLIRP VSRLHSLRPV SRGVHNEILY
301 VFDAVLPETF LPENQDGEVA GFEKMDIGGL LDAMLSKNMM HDAQLVTLDA
351 FYRYGLIDAA HPLSEWLDGI RL*
Further work revealed the complete nucleotide sequence <SEQ ID 417>:
1 ATGCCGACCG TCCGTTTTAC CGAATCCGTC AGCAAACAAG ACCTTGATGC
51 CCTGTTCGAG CGGGCAAAAG CAAGTTACGG TGCCGAAAGT TGCTGGAAAA
101 CGCTGTATCT GAACCGTCTT CCTTTGGGCA ATCTGTCGCC GGAATGGGCT
151 GAGCGCATCA AAAAAGACTG GGAGGCAGGC TGCTCCGAGT CTTCAGACGG
201 CATTTTTCTG AATGCGGACG GCTGGCCGGA TATGGGCGGA CGCTTGCAGC
251 ACCTCGCCCG CACATGGAAC AAGGCGGGGC TGCTTCACGG ATGGCGCAAC
301 GAGTGTTTCG ACCTGACCGA CGGCGGCGGC AACCCCTTGT TCACGCTCGA
351 ACGCGCCGCT TTCCGTCCGT TCGGACTACT CAGCCGCGCC GTCCATCTCA
401 ACGGTTTGGT CGAATCGAAC GGCAGATGGC ATTTTTGGAT AGGCAGGCGC
451 AGTCCGCACA AAGCAGTCGa tcCCGGCAAG CTCGACAATA TTGCCGGCGG
501 CGGTGTTTCC GGCGGCGAAA TGCCGTCTGA AGCCGTGTGC CGCGAAAGCA
551 GCGAAGAAGC CGGTTTGGAT AAAACGCTGT TTCCGCTCAT CCGCCCAGTA
601 TCGCGGCTGC ACAGCCTTCG CCCCGTCAGC CGAGGTGTGC ACAATGAAAT
651 CCTGTATGTG TTCGATGCCG TCCTGCCCGA AACCTTCCTG CCTGAAAATC
701 AGGATGGCGA GGTAGCGGGT TTTGAAAAGA TGGACATTGG CGGCCTATTG
751 GATGCCATGT TGTCGAAAAA CATGATGCAC GACGCGCAAC TGGTTACGCT
801 GGACGCGTTT TACCGTTACG GTCTGATTGA TGCCGCCCAT CCGCTGTCCG
851 AGTGGCTGGA CGGCATACGT TTATAG
This corresponds to the amino acid sequence <SEQ ID 418; ORF105ng-1>:
1 MPTVRFTESV SKQDLDALFE RAKASYGAES CWKTLYLNRL PLGNLSPEWA
51 ERIKKDWEAG CSESSDGIFL NADGWPDMGG RLQHLARTWN KAGLLHGWRN
101 ECFDLTDGGG NPLFTLERAA FRPFGLLSRA VHLNGLVESN GRWHFWIGRR
151 SPHKAVDPGK LDNIAGGGVS GGEMPSEAVC RESSEEAGLD KTLFPLIRPV
201 SRLHSLRPVS RGVHNEILYV FDAVLPETFL PENQDGEVAG FEKMDIGGLL
251 DAMLSKNMMH DAQLVTLDAF YRYGLIDAAH PLSEWLDGIR L*
ORF105ng-1 and ORF105-1 show 93.5% identity over a 291 aa overlap:
10 20 30 40 50 60
orf105-1.pep MPTVRFTESVSKQDLDALFEWAKASYGAESCWKTLYLNGLPLGNLSPEWVERVKKDWEAG
|||||||||||||||||||| ||||||||||||||||| ||||||||||:||:|||||||
orf105ng-1 MPTVRFTESVSKQDLDALFERAKASYGAESCWKTLYLNRLPLGNLSPEWAERIKKDWEAG
10 20 30 40 50 60
70 80 90 100 110 120
orf105-1.pep CSESSDGIFLNADGWPDMGGRLQHLALGWHCAGLLDGWRNECFDLTDGGGNPLFTLERAA
|||||||||||||||||||||||||| |: |||| ||||||||||||||||||||||||
orf105ng-1 CSESSDGIFLNADGWPDMGGRLQHLARTWNKAGLLHGWRNECFDLTDGGGNPLFTLERAA
70 80 90 100 110 120
130 140 150 160 170 180
orf105-1.pep FRPFGLLSRAVHLNGLTESDGRWHFWIGRRSPHKAVDPNKLDNTAAGGVSGGEMPSEAVC
||||||||||||||||:||:||||||||||||||||||:|||| |:||||||||||||||
orf105ng-1 FRPFGLLSRAVHLNGLVESNGRWHFWIGRRSPHKAVDPGKLDNIAGGGVSGGEMPSEAVC
130 140 150 160 170 180
190 200 210 220 230 240
orf105-1.pep RESSEEAGLDKTLLPLIRPVSQLHSLRSVSRGVHNEILYVFDAVLPETFLPENQDGEVAG
|||||||||||||:|||||||:||||| ||||||||||||||||||||||||||||||||
orf105ng-1 RESSEEAGLDKTLFPLIRPVSRLHSLRPVSRGVHNEILYVFDAVLPETFLPENQDGEVAG
190 200 210 220 230 240
250 260 270 280 290
orf105-1.pep FEKMDIGGLLDAMLSGNMMHDAQLVTLDAFCRYGLIDAAHPLSEWLDGIRLX
||||||||||||||| |||||||||||||| |||||||||||||||||||||
orf105ng-1 FEKMDIGGLLDAMLSKNMMHDAQLVTLDAFYRYGLIDAAHPLSEWLDGIRLX
250 260 270 280 290
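In addition, ORF105ng-1 shows homology to a yeast enzyme:
sp|P41888|TNR3_SCHPO thiamin pyrophosphokinase (TPK) (thiamin kinase)
>gi|1076928|pir||S52350 thiamin pyrophosphokinase (EC 2.7.6.2) - fission yeast (Schizosaccharomyces pombe) >gi|666111 (X84417) thiamin pyrophosphokinase [Schizosaccharomyces pombe] >gi|2330852|gnl|PID|e334056 (Z98533) thiamin pyrophosphokinase [Schizosaccharomyces pombe] Length = 569
Score = 105 bits (259), Expect = 4e-22
Identities = 64/192 (33%), Positives = 94/192 (48%), Gaps = 3/192 (1%)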
Query: 268 NKAGLLHGWRNECFDLTDGGGNPLFTLERAAFRPFGLLSRAVHLNGLVESNGRW--HFWI 441
N G+ WRNE + + P+ +ER F FG LS VH + + W+
Sbjct: 96 NTFGIADQWRNELYTVYGKSKKPVLAVERGGFWLFGFLSTGVHCTMYIPATKEHPLRIWV 155
Query: 442 GRRSPHKAVDPGKLDNIAGGGVSGGEMPSEAVCRESSEEAGLDKTLFPLIRPVSRLHSLR 621
RRSP K P LDN GG++ G+ + +E SEEA LD + LI P + ++
Sbjct: 156 PRRSPTKQTWPNYLDNSVAGGIAHGDSVIGTMIKEFSEEANLDVSSMNLI-PCGTVSYIK 214
Query: 622 PVSRG-VHNEILYVFDAVLPETFLPENQDGEVAGFEKMDIGGLLDAMLSKNMMHDAQLVT 798
R + E+ YVFD + + +P DGEVAGF + + +L + K+ + LV
Sbjct: 215 MEKRHWIQPELQYVFDLPVDDLVIPRINDGEVAGFSLLPLNQVLHELELKSFKPNCALVL 274
Query: 799 LDAFYRYGLIDAAHP 843
LD R+G+I HP
Sbjct: 275 LDFLIRHGIITPQHP 289
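The query coordinates in the alignment above (268 to 843) span three units per aligned residue, which suggests that they are nucleotide positions in the translated query rather than amino acid positions. Under that assumption, a nucleotide coordinate can be mapped to a 1-based residue number as sketched below. The function name `nt_to_residue` and the default reading frame starting at nucleotide 1 are illustrative assumptions.

```python
def nt_to_residue(nt_pos: int, frame_start: int = 1) -> int:
    """Map a 1-based nucleotide position to its 1-based codon index."""
    if nt_pos < frame_start:
        raise ValueError("position lies before the start of the frame")
    # Each codon covers three consecutive nucleotides of the frame.
    return (nt_pos - frame_start) // 3 + 1
```

With this mapping, nucleotide 268 falls in codon 90 and nucleotide 441 in codon 147, which correspond to residues N90 and I147 of ORF105ng-1 at the two ends of the first query row.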
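Based on this analysis, including the presence of a putative transmembrane domain in the gonococcal protein, it is predicted that these proteins from N. meningitidis and N. gonorrhoeae, and their epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies.
Example 49
The following DNA sequence, believed to be complete, was identified in N. meningitidis <SEQ ID 419>: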
1 ATGAATAGAC CCAAGCAACC CTTCTTCCGT CCCGAAGTCG CCGTTGCCCG
51 CCAAACCAGC CTGACGGGTA AAGTGATTCT GACACGACCG TTGTCATTTT
101 CCCTATGGAC GACATTTGCA TCGATATCTG CGTTATTGAT TATCCTGTTT
151 TTGATATTTG GTAACTATAC GCGAAAGACA ACAGTGGAGG GACAAATTTT
201 ACCTGCATCG GGCGTAATCA GGGTGTATGC ACCGgATACG rGkACAATTA
251 CAGCGAAATT CGTGGAAGAT GGmsAAAAGG TTAAGGCTGG CGACAAGCTA
301 TTTGCGCTTT CGACCTCACG TTTCGGCGCA GGAGGTAGCG TGCAGCAGCA
351 GTTGAAAACG GAGGCAGTTT TGAAGAAAAC GTTGGCAGAA CAGGAACTGG
401 GTCGTCTGAA GCTGATACAC GGGAATGAAA CGCGCAgCcT TAAAGCAACT
451 GTCGAACGTT TGGAAAACCA GGAACTCCAT ATTTCGCAAC AGATAGACGG
501 TCAGAAAAGG CGCATTAGAC TTGCGGAAGA AATGTTGCAG AAATATCGTT
551 TCCTATCCGC .CAATGA
This corresponds to the amino acid sequence <SEQ ID 420; ORF107>:
1 MNRPKQPFFR PEVAVARQTS LTGKVILTRP LSFSLWTTFA SISALLIILF
51 LIFGNYTRKT TVEGQILPAS GVIRVYAPDT XTITAKFVED GXKVKAGDKL
101 FALSTSRFGA GGSVQQQLKT EAVLKKTLAE QELGRLKLIH GNETRSLKAT
151 VERLENQELH ISQQIDGQKR RIRLAEEMLQ KYRFLSXQ*
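Computer analysis of this amino acid sequence gave the following results:
Homology with a predicted ORF from N. meningitidis (strain A)
ORF107 shows 97.8% identity over a 186 aa overlap with an ORF (ORF107a) from strain A of N. meningitidis: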
10 20 30 40 50 60
orf107.pep MNRPKQPFFRPEVAVARQTSLTGKVILTRPLSFSLWTTFASISALLIILFLIFGNYTRKT
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf107a MNRPKQPFFRPEVAVARQTSLTGKVILTRPLSFSLWTTFASISALLIILFLIFGNYTRKT
10 20 30 40 50 60
70 80 90 100 110 120
orf107.pep TVEGQILPASGVIRVYAPDTXTITAKFVEDGXKVKAGDKLFALSTSRFGAGGSVQQQLKT
|||||||||||||||||||| |||||| ||| ||||||||||||||||||| ||||||||
orf107a TVEGQILPASGVIRVYAPDTGTITAKFXEDGEKVKAGDKLFALSTSRFGAGDSVQQQLKT
70 80 90 100 110 120
130 140 150 160 170 180
orf107.pep EAVLKKTLAEQELGRLKLIHGNETRSLKATVERLENQELHISQQIDGQKRRIRLAEEMLQ
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf107a EAVLKKTLAEQELGRLKLIHGNETRSLKATVERLENQELHISQQIDGQKRRIRLAEEMLQ
130 140 150 160 170 180
189
orf107.pep KYRFLSXQX
||||||
orf107a KYRFLSANDAVPKQEMMNVKAELLEQKAKLDAYRREEVGLLQEIRTQNLTLXSLPQAAX
190 200 210 220 230
The complete length ORF107a nucleotide sequence <SEQ ID 421> is:
1 ATGAATAGAC CCAAGCAACC NTTCTTCCGT CCCGAAGTCG CCGTTGCCCG
51 CCAAACCAGC CTGACGGGTA AAGTGATTCT GACACGACCG TTGTCATTTT
101 CCCTATGGAC GACATTTGCA TCGATATCTG CGTTATTGAT TATCCTGTTT
151 TTGATATTTG GTAACTATAC GCGAAAGACA ACAGTGGAGG GACAAATTTT
201 ACCTGCATCG GGCGTAATCA GGGTGTATGC ACCGGATACG GGGACAATTA
251 CNGCGAAATT CNTGGAAGAT GGAGAAAAGG TTAAGGCTGG CGACAAGCTA
301 TTTGCGCTTT CGACCTCACG TTTCGGCGCA GGAGATAGCG TGCAGCAGCA
351 GTTGAAAACG GAGGCAGTTT TGAAGAAAAC GTTGGCAGAA CAGGAACTGG
401 GTCGTCTGAA GCTGATACAC GGGAATGAAA CGCGCAGCCT TAAAGCAACT
451 GTCGAACGTT TGGAAAACCA GGAACTCCAT ATTTCGCAAC AGATAGACGG
501 TCAGAAAAGG CGCATTAGAC TTGCGGAAGA AATGTTGCAG AAATATCGTT
551 TCCTATCCGC CAATGATGCA GTGCCAAAAC AAGAAATGAT GAATGTCAAG
601 GCAGAGCTTT TAGAGCAGAA AGCCAAACTT GATGCCTACC GCCGAGAAGA
651 AGTCGGGCTG CTTCAGGAAA TCCGCACGCA GAATCTGACA TTGGNNAGCC
701 TCCCCCAAGC GGCATGA
This encodes a protein having amino acid sequence <SEQ ID 422>:
1 MNRPKQPFFR PEVAVARQTS LTGKVILTRP LSFSLWTTFA SISALLIILF
51 LIFGNYTRKT TVEGQILPAS GVIRVYAPDT GTITAKFXED GEKVKAGDKL
101 FALSTSRFGA GDSVQQQLKT EAVLKKTLAE QELGRLKLIH GNETRSLKAT
151 VERLENQELH ISQQIDGQKR RIRLAEEMLQ KYRFLSANDA VPKQEMMNVK
201 AELLEQKAKL DAYRREEVGL LQEIRTQNLT LXSLPQAA*
Homology with a predicted ORF from N. gonorrhoeae
ORF107 shows 95.7% identity over a 188 aa overlap with a predicted ORF (ORF107.ng) from N. gonorrhoeae:
orf107.pep MNRPKQPFFRPEVAVARQTSLTGKVILTRPLSFSLWTTFASISALLIILFLIFGNYTRKT 60
||||||||||||||:|||||||||||||||||||||||||||||||||||||||||||||
orf107ng MNRPKQPFFRPEVAIARQTSLTGKVILTRPLSFSLWTTFASISALLIILFLIFGNYTRKT 60
orf107.pep TVEGQILPASGVIRVYAPDTXTITAKFVEDGXKVKAGDKLFALSTSRFGAGGSVQQQLKT 120
|:|||||||||||||||||| |||||||||| ||||||||||||||||||||||||||||
orf107ng TMEGQILPASGVIRVYAPDTGTITAKFVEDGEKVKAGDKLFALSTSRFGAGGSVQQQLKT 120
orf107.pep EAVLKKTLAEQELGRLKLIHGNETRSLKATVERLENQELHISQQIDGQKRRIRLAEEMLQ 180
|||||||||||||||||||| ||||||||||||||||:|||||||||||||||||||||:
orf107ng EAVLKKTLAEQELGRLKLIHENETRSLKATVERLENQKLHISQQIDGQKRRIRLAEEMLR 180
orf107.pep KYRFLSXQ 188
|||||| |
orf107ng KYRFLSAQ 188
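The identity figure quoted for this alignment can be recomputed from the two aligned sequences. The sketch below is a minimal illustration, assuming an ungapped alignment and treating X (undetermined residue) as a mismatch; `percent_identity` is a hypothetical helper, not the program used to produce the alignment.

```python
def percent_identity(seq_a: str, seq_b: str):
    """Count identical positions in two aligned, equal-length sequences.

    Positions holding X (undetermined) or '-' (gap) are never counted
    as identical.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    identical = sum(
        1 for a, b in zip(seq_a, seq_b) if a == b and a not in "X-"
    )
    return identical, len(seq_a), 100.0 * identical / len(seq_a)


# Sequences copied from the ORF107 / ORF107ng alignment above.
ORF107 = (
    "MNRPKQPFFRPEVAVARQTSLTGKVILTRPLSFSLWTTFASISALLIILFLIFGNYTRKT"
    "TVEGQILPASGVIRVYAPDTXTITAKFVEDGXKVKAGDKLFALSTSRFGAGGSVQQQLKT"
    "EAVLKKTLAEQELGRLKLIHGNETRSLKATVERLENQELHISQQIDGQKRRIRLAEEMLQ"
    "KYRFLSXQ"
)
ORF107NG = (
    "MNRPKQPFFRPEVAIARQTSLTGKVILTRPLSFSLWTTFASISALLIILFLIFGNYTRKT"
    "TMEGQILPASGVIRVYAPDTGTITAKFVEDGEKVKAGDKLFALSTSRFGAGGSVQQQLKT"
    "EAVLKKTLAEQELGRLKLIHENETRSLKATVERLENQKLHISQQIDGQKRRIRLAEEMLR"
    "KYRFLSAQ"
)

identical, overlap, pct = percent_identity(ORF107, ORF107NG)
print(f"{identical}/{overlap} identical residues = {pct:.1f}%")
```

The eight differing positions (three of them X residues in ORF107) leave 180 identical residues out of 188, which agrees with the 95.7% stated above.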
The complete length ORF107ng nucleotide sequence <SEQ ID 423> is predicted to encode a protein having amino acid sequence <SEQ ID 424>:
1 MNRPKQPFFR PEVAIARQTS LTGKVILTRP LSFSLWTTFA SISALLIILF
51 LIFGNYTRKT TMEGQILPAS GVIRVYAPDT GTITAKFVED GEKVKAGDKL
101 FALSTSRFGA GGSVQQQLKT EAVLKKTLAE QELGRLKLIH ENETRSLKAT
151 VERLENQKLH ISQQIDGQKR RIRLAEEMLR KYRFLSAQ*
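Based on the presence of a putative transmembrane domain in the gonococcal protein, it is predicted that the proteins from N. meningitidis and N. gonorrhoeae, and their epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies.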
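The text does not state how the putative transmembrane domain was predicted. As one illustration only, the sketch below applies a Kyte-Doolittle hydropathy scan, a commonly used heuristic that is swapped in here and is not claimed to be the method of this document. The window of 19 residues and the threshold of 1.6 are the values conventionally used with that scale; `hydropathy_profile` and `candidate_tm_starts` are hypothetical names.

```python
# Kyte-Doolittle hydropathy index for the 20 standard amino acids.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(seq: str, window: int = 19) -> list:
    """Mean hydropathy of every full window; unknown residues score 0."""
    scores = [KD.get(residue, 0.0) for residue in seq]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]


def candidate_tm_starts(seq: str, window: int = 19, threshold: float = 1.6) -> list:
    """1-based start positions of windows at or above the threshold."""
    profile = hydropathy_profile(seq, window)
    return [i + 1 for i, score in enumerate(profile) if score >= threshold]


# First 60 residues of the gonococcal protein (SEQ ID 424).
ORF107NG_N_TERMINUS = (
    "MNRPKQPFFRPEVAIARQTSLTGKVILTRPLSFSLWTTFASISALLIILFLIFGNYTRKT"
)

print(candidate_tm_starts(ORF107NG_N_TERMINUS))
```

With these settings the charged N-terminus scores below the threshold, while windows covering the hydrophobic stretch around residues 35 to 53 score above it.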
Example 50
The following DNA sequence, believed to be complete, was identified in N. meningitidis <SEQ ID 425>:
1 ATGCTGAATA CTTTTTTTGC CGTATTGGGC GGCTGCCTGC TGCT.TTGCC
51 GTGCGGCAAA TCCGTAAATA CGGCGGTACA GCCGCAAAAC GCGGTACAAA
101 GCGCGCCGAA ACCGGTTTTC AAAGTCATAT ATATCGACAA TACGGCGATT
151 GCCGGTTTGG ATTTGGGACA AAGCAGCGAA GGCAAAACCA ACGACGGCAA
201 AAAACAAATC AGTTATCCGA TTAAAGGCTT GCCGGAACAA AATGTTATCC
251 GACTGATCGG CAAGCATCCC GGCGACTTGG AAGCCGTCAG CGGCAAATGT
301 ATGGAAACCG ATGATAAGGA CAGTCCGGCA GGTTGGGCAG AAAACGGCGT
351 GTGCCATACC TTGTTTGCCA AACTGGTGGG CAATATCGCC GAAGACGGCG
401 GCAAACTGAC GGATTACCTA GTTTCGCATG CCGCCCTGCA ACCCTATCAG
451 GCAGGCAAAA GCGGCTATGC CGCCGTGCAG AACGGACGCT ATGTGCTGGA
501 AATCGACAGC GAAGGGGCGT TTTATTTCCG CCGCCGCCAT TATTGA
This corresponds to the amino acid sequence <SEQ ID 426; ORF108>:
1 MLNTFFAVLG GCLLXLPCGK SVNTAVQPQN AVQSAPKPVF KVIYIDNTAI
51 AGLDLGQSSE GKTNDGKKQI SYPIKGLPEQ NVIRLIGKHP GDLEAVSGKC
101 METDDKDSPA GWAENGVCHT LFAKLVGNIA EDGGKLTDYL VSHAALQPYQ
151 AGKSGYAAVQ NGRYVLEIDS EGAFYFRRRH Y*
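The correspondence between SEQ ID 425 and SEQ ID 426 can be checked by translating the DNA in frame 1. The sketch below uses the standard genetic code and assumes that each `.` in the listing stands for a single undetermined base, so that any codon containing it is reported as X; `translate` is a hypothetical helper written for this illustration.

```python
# Standard genetic code, bases ordered T, C, A, G at each codon position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna: str) -> str:
    """Translate frame 1; codons with non-ACGT symbols become X."""
    dna = "".join(dna.split()).upper()  # drop the spacing between 10-mers
    protein = []
    for i in range(0, len(dna) - 2, 3):
        protein.append(CODON_TABLE.get(dna[i:i + 3], "X"))
    return "".join(protein)


# First 50 nucleotides of SEQ ID 425 as printed above.
fragment = "ATGCTGAATA CTTTTTTTGC CGTATTGGGC GGCTGCCTGC TGCT.TTGCC"
print(translate(fragment))
```

The 16 complete codons of this fragment translate to the first 16 residues of ORF108, including the X at position 15 that arises from the undetermined base.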
Further work revealed the following DNA sequence <SEQ ID 427>:
1 ATGCTGAAAA CATCTTTTGC CGTATTGGGC GGCTGCCTGC TGCTTGCCGC
51 CTGCGGCAAA TCCGAAAATA CGGCGGAACA GCCGCAAAAC GCGGTACAAA
101 GCGCGCCGAA ACCGGTTTTC AAAGTCAAAT ATATCGACAA TACGGCGATT
151 GCCGGTTTGG ATTTGGGACA AAGCAGCGAA GGCAAAACCA ACGACGGCAA
201 AAAACAAATC AGTTATCCGA TTAAAGGCTT GCCGGAACAA AATGTTATCC
251 GACTGATCGG CAAGCATCCC GGCGACTTGG AAGCCGTCAG CGGCAAATGT
301 ATGGAAACCG ATGATAAGGA CAGTCCGGCA GGTTGGGCAG AAAACGGCGT
351 GTGCCATACC TTGTTTGCCA AACTGGTGGG CAATATCGCC GAAGACGGCG
401 GCAAACTGAC GGATTACCTA GTTTCGCATG CCGCCCTGCA ACCCTATCAG
451 GCAGGCAAAA GCGGCTATGC CGCCGTGCAG AACGGACGCT ATGTGCTGGA
501 AATCGACAGC GAAGGGGCGT TTTATTTCCG CCGCCGCCAT TATTGA
This corresponds to the amino acid sequence <SEQ ID 428; ORF108-1>:
1 MLKTSFAVLG GCLLLAACGK SENTAEQPQN AVQSAPKPVF KVKYIDNTAI
51 AGLDLGQSSE GKTNDGKKQI SYPIKGLPEQ NVIRLIGKHP GDLEAVSGKC
101 METDDKDSPA GWAENGVCHT LFAKLVGNIA EDGGKLTDYL VSHAALQPYQ
151 AGKSGYAAVQ NGRYVLEIDS EGAFYFRRRH Y*
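Computer analysis of this amino acid sequence gave the following results:
Homology with a predicted ORF from N. gonorrhoeae
ORF108 shows 88.4% identity over a 181 aa overlap with a predicted ORF (ORF108.ng) from N. gonorrhoeae: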
orf108.pep MLNTFFAVLGGCLLXLPCGKSVNTAVQPQNAVQSAPKPVFKVIYIDNTAIAGLDLGQSSE 60
||: ||||||||| |||| ||| |||||:|||||||||| |||||||||| ||||||
orf108ng MLKIPFAVLGGCLLLAACGKSENTAEQPQNAAQSAPKPVFKVKYIDNTAIAGLALGQSSE 60
orf108.pep GKTNDGKKQISYPIKGLPEQNVIRLIGKHPGDLEAVSGKCMETDDKDSPAGWAENGVCHT 120
|||||||||||||||||||||::|| ||||:||||| ||||||| ||:|:||||||||||
orf108ng GKTNDGKKQISYPIKGLPEQNAVRLTGKHPNDLEAVVGKCMETDGKDAPSGWAENGVCHT 120
orf108.pep LFAKLVGNIAEDGGKLTDYLVSHAALQPYQAGKSGYAAVQNGRYVLEIDSEGAFYFRRRHY 181
||||||||||||||||||||:||:|||||||||||||||||||||||||||||||||||||
orf108ng LFAKLVGNIAEDGGKLTDYLISHSALQPYQAGKSGYAAVQNGRYVLEIDSEGAFYFRRRHY 181
ORF108-1 and ORF108ng show 92.3% identity over the same 181 aa overlap:
orf108-1.pep MLKTSFAVLGGCLLLAACGKSENTAEQPQNAVQSAPKPVFKVKYIDNTAIAGLDLGQSSE 60
||| ||||||||||||||||||||||||||:||||||||||||||||||||| ||||||
orf108ng-1 MLKIPFAVLGGCLLLAACGKSENTAEQPQNAAQSAPKPVFKVKYIDNTAIAGLALGQSSE 60
orf108-1.pep GKTNDGKKQISYPIKGLPEQNVIRLIGKHPGDLEAVSGKCMETDDKDSPAGWAENGVCHT 120
|||||||||||||||||||||::|| ||||:||||| ||||||| ||:|:||||||||||
orf108ng-1 GKTNDGKKQISYPIKGLPEQNAVRLTGKHPNDLEAVVGKCMETDGKDAPSGWAENGVCHT 120
orf108-1.pep LFAKLVGNIAEDGGKLTDYLVSHAALQPYQAGKSGYAAVQNGRYVLEIDSEGAFYFRRRHY 181
||||||||||||||||||||:||:|||||||||||||||||||||||||||||||||||||
orf108ng-1 LFAKLVGNIAEDGGKLTDYLISHSALQPYQAGKSGYAAVQNGRYVLEIDSEGAFYFRRRHY 181
The complete length ORF108ng nucleotide sequence <SEQ ID 429> is:
1 ATGCTGAAAa tacctTTTGC CGTGTtgggc ggCtgcctGC TGCTTGCCGC
51 CTGCGGCAAA TCCGAAAATa cggcggaACA GCCGCAAAAT gcggCACAAA
101 GCGCGCCGAA ACCGGTTTTC AAAGTCAAAT ACATCGACAA TACGGCGATT
151 GCCGGTTTGG CTTTGGGACA AAGTAGCGAA GGCAAAACCA acgacgGCAA
201 AAAACAAATC AGTTATccgA TTAAAGGCTT GCCGGAACAA Aacgccgtcc
251 gGCTGACCGG AAAGCATCCC AACGACTTGG AagccgtcgT CGGCAAATGT
301 ATGGAAACCG ACGGAAAGGA CGCGCCTTCG GGCTGGGCGG AAAACGGCGT
351 GTGCCATACC TTGTTTGCCA AACTGGTGGG CAATATCGCC GAAGACGGCG
401 GCAAACTGAC TGATTACCTG ATTTCGCATT CCGCCCTGCA ACCCTATCAG
451 GCAGGCAAAA GCGGCTATGC CGCCGTGCAG AACGGACGCT ATGTGCTGGA
501 AATCGACAGC GagggGGCGT TTTATttccg ccgccgccat tattgA
This encodes a protein having amino acid sequence <SEQ ID 430>:
1 MLKIPFAVLG GCLLLAACGK SENTAEQPQN AAQSAPKPVF KVKYIDNTAI
51 AGLALGQSSE GKTNDGKKQI SYPIKGLPEQ NAVRLTGKHP NDLEAVVGKC
101 METDGKDAPS GWAENGVCHT LFAKLVGNIA EDGGKLTDYL ISHSALQPYQ
151 AGKSGYAAVQ NGRYVLEIDS EGAFYFRRRH Y*
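Based on this analysis, including the presence of a predicted prokaryotic membrane lipoprotein lipid attachment site (underlined) and a putative ATP/GTP-binding site motif A (P-loop, double-underlined) in the gonococcal protein, it is predicted that the proteins from N. meningitidis and N. gonorrhoeae, and their epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies.
Example 51
The following DNA sequence was identified in N. meningitidis <SEQ ID 431>: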
1 ATGGAAGATT TATATATAAT ACTCGCTTTG GGTTTGGTTG CGATGATTGC
51 CGgATTTATC GATgcgatTg cGggCGGGGG TGGTTTGATT ACGCTGCCCG
101 CACTCTTGTT GGCAGGTATT CCTCCCGTGT CGGCAATTGC CACCAACAAG
151 CTGCAAgCAG CCGCTGCTAC GTTTTCAGCT ACGGTTTCTT TTGCACGCAA
201 AGGTTTGATT GATTGGAAGA AAGGTCTCCC GATTGCCGCA GCATCGTTTG
251 TAGGCGGCGT GGcCGGTGCA TTATCGGTCA GCTTGGTTTC CAAAGATATT
301 CTgCTgGCGG TCGTGCCGGT TTTGTTGATA TTTGTCGCAC TGTATTTTGT
351 GTTTTCGCCC AAGCTCGACG GCAGTAAGGA AGGCAAAGCC AGAATGTCTT
401 TTTTTCTGTT cGGGCTGACG GTCGC.ACCG CTTTTGGGTT TTTACGACGG
451 TGTGTTCGGA CCGGGTGTCG GCTCGTTTTT TCTGATTGCC TTTATTGTTT
501 TGCTCGGCTG CAAgCTGTTG AACGCGATGT CTTACACCAA ATTGGCGAAC
551 GTTGCCTGCA ATCTTGGTTC GCTATCGGTA TTCCTGCTGC ACGGTTCGAT
601 TATTTTCCCG ATTGCGGCAA CGaTGGCGGT CGGTGCGTTT GTCGGtGCGA
651 ATTTAgGTGC GAGATTTGCC GTaCgctTCG GTTCGAAGCT GATTAA
This corresponds to the amino acid sequence <SEQ ID 432; ORF109>:
1 MEDLYIILAL GLVAMIAGFI DAIAGGGGLI TLPALLLAGI PPVSAIATNK
51 LQAAAATFSA TVSFARKGLI DWKKGLPIAA ASFVGGVAGA LSVSLVSKDI
101 LLAVVPVLLI FVALYFVFSP KLDGSKEGKA RMSFFLFGLT VXTAFGFLRR
151 CVRTGCRLVF SDCLYCFARL QAVERDVLHQ IGERCLQSWF AIGIPAARFD
201 YFPDCGNDGG RCVCRCEFRC EICRTLRFEA D*
Further work revealed the following DNA sequence <SEQ ID 433>:
1 ATGGAAGATT TATATATAAT ACTCGCTTTG GGTTTGGTTG CGATGATTGC
51 CGGATTTATC GATGCGATTG CGGGCGGGGG TGGTTTGATT ACGCTGCCCG
101 CACTCTTGTT GGCAGGTATT CCTCCCGTGT CGGCAATTGC CACCAACAAG
151 CTGCAAGCAG CCGCTGCTAC GTTTTCAGCT ACGGTTTCTT TTGCACGCAA
201 AGGTTTGATT GATTGGAAGA AAGGTCTCCC GATTGCCGCA GCATCGTTTG
251 TAGGCGGCGT GGCCGGTGCA TTATCGGTCA GCTTGGTTTC CAAAGATATT
301 CTGCTGGCGG TCGTGCCGGT TTTGTTGATA TTTGTCGCAC TGTATTTTGT
351 GTTTTCGCCC AAGCTCGACG GCAGTAAGGA AGGCAAAGCC AGAATGTCTT
401 TTTTTCTGTT CGGGCTGACG GTCGCACCGC TTTTGGGTTT TTACGACGGT
451 GTGTTCGGAC CGGGTGTCGG CTCGTTTTTT CTGATTGCCT TTATTGTTTT
501 GCTCGGCTGC AAGCTGTTGA ACGCGATGTC TTACACCAAA TTGGCGAACG
551 TTGCCTGCAA TCTTGGTTCG CTATCGGTAT TCCTGCTGCA CGGTTCGATT
601 ATTTTCCCGA TTGCGGCAAC GATGGCGGTC GGTGCGTTTG TCGGTGCGAA
651 TTTAGGTGCG AGATTTGCCG TCCGCTTCGG TTCGAAGCTG ATTAAGCCGC
701 TGCTGATTGT CATCAGCATT TCGATGGCTG TGAAATTGTT GATAGACGAG
751 AGAAATCCGC TGTATCAGAT GATTGTTTCG ATGTTTTAA
This corresponds to the amino acid sequence <SEQ ID 434; ORF109-1>:
1 MEDLYIILAL GLVAMIAGFI DAIAGGGGLI TLPALLLAGI PPVSAIATNK
51 LQAAAATFSA TVSFARKGLI DWKKGLPIAA ASFVGGVAGA LSVSLVSKDI
101 LLAVVPVLLI FVALYFVFSP KLDGSKEGKA RMSFFLFGLT VAPLLGFYDG
151 VFGPGVGSFF LIAFIVLLGC KLLNAMSYTK LANVACNLGS LSVFLLHGSI
201 IFPIAATMAV GAFVGANLGA RFAVRFGSKL IKPLLIVISI SMAVKLLIDE
251 RNPLYQMIVS MF*
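Computer analysis of this amino acid sequence gave the following results:
Homology with a predicted ORF from N. meningitidis (strain A)
ORF109 shows 95.9% identity over a 147 aa overlap with an ORF (ORF109a) from strain A of N. meningitidis: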
10 20 30 40 50 60
orf109.pep MEDLYIILALGLVAMIAGFIDAIAGGGGLITLPALLLAGIPPVSAIATNKLQAAAATFSA
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf109a MEDLYIILALGLVAMIAGFIDAIAGGGGLITLPALLLAGIPPVSAIATNKLQAAAATFSA
10 20 30 40 50 60
70 80 90 100 110 120
orf109.pep TVSFARKGLIDWKKGLPIAAASFVGGVAGALSVSLVSKDILLAVVPVLLIFVALYFVFSP
|||||||||||||||||||||||:|||:||||||||||||||||||||||||||||||||
orf109a TVSFARKGLIDWKKGLPIAAASFAGGVVGALSVSLVSKDILLAVVPVLLIFVALYFVFSP
70 80 90 100 110 120
130 140 150 160 170 180
orf109.pep KLDGSKEGKARMSFFLFGLTVXTAFGFLRRCVRTGCRLVFSDCLYCFARLQAVERDVLHQ
||||||||||||||||||||| :||
orf109a KLDGSKEGKARMSFFLFGLTVAPLLGFYDGVFGPGVGSFFLIAFIVLLGCKLLNAMSYTK
130 140 150 160 170 180
The complete length ORF109a nucleotide sequence <SEQ ID 435> is:
1 ATGGAAGATT TATACATAAT ACTCGCTTTG GGTTTGGTTG CGATGATTGC
51 CGGATTTATC GATGCGATTG CGGGTGGGGG TGGTTTGATT ACGCTGCCTG
101 CACTCTTGTT GGCAGGTATT CCTCCCGTGT CGGCAATTGC CACCAACAAG
151 CTGCAAGCAG CCGCTGCTAC GTTTTCGGCT ACGGTTTCTT TTGCACGCAA
201 AGGTTTGATT GATTGGAAGA AAGGTCTCCC GATTGCGGCA GCATCGTTTG
251 CAGGCGGCGT GGTCGGTGCA TTATCGGTCA GCTTGGTTTC CAAAGATATT
301 CTGCTGGCGG TCGTGCCGGT TTTGTTGATA TTTGTCGCGC TGTATTTTGT
351 GTTTTCGCCC AAGCTCGACG GCAGTAAGGA AGGCAAAGCC AGAATGTCTT
401 TTTTTCTGTT CGGTCTGACG GTTGCACCAC TTTTGGGTTT TTACGACGGT
451 GTGTTCGGAC CGGGTGTCGG CTCGTTTTTT CTGATTGCCT TTATTGTTTT
501 GCTCGGCTGC AAGCTGTTGA ACGCGATGTC TTACACCAAA TTGGCGAACG
551 TTGCCTGCAA TCTTGGTTCG CTATCGGTAT TCCTGCTGCA CGGTTCGATT
601 ATTTTCCCGA TTGCGGCAAC GATGGCGGTC GGTGCGTTTG TCGGTGCGAA
651 TTTAGGTGCG AGATTTGCCG TCCGCTTCGG TTCGAAGCTG ATTAAGCCGC
701 TGCTGATTGT CATCAGCATT TCGATGGCTG TGAAATTGTT GATAGACGAG
751 AGAAATCCGC TGTATCAGAT GATTGTTTCG ATGTTTTAA
This encodes a protein having amino acid sequence <SEQ ID 436>:
1 MEDLYIILAL GLVAMIAGFI DAIAGGGGLI TLPALLLAGI PPVSAIATNK
51 LQAAAATFSA TVSFARKGLI DWKKGLPIAA ASFAGGVVGA LSVSLVSKDI
101 LLAVVPVLLI FVALYFVFSP KLDGSKEGKA RMSFFLFGLT VAPLLGFYDG
151 VFGPGVGSFF LIAFIVLLGC KLLNAMSYTK LANVACNLGS LSVFLLHGSI
201 IFPIAATMAV GAFVGANLGA RFAVRFGSKL IKPLLIVISI SMAVKLLIDE
251 RNPLYQMIVS MF*
ORF109a and ORF109-1 show 99.2% identity over a 262 aa overlap:
10 20 30 40 50 60
orf109a.pep MEDLYIILALGLVAMIAGFIDAIAGGGGLITLPALLLAGIPPVSAIATNKLQAAAATFSA
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf109-1 MEDLYIILALGLVAMIAGFIDAIAGGGGLITLPALLLAGIPPVSAIATNKLQAAAATFSA
10 20 30 40 50 60
70 80 90 100 110 120
orf109a.pep TVSFARKGLIDWKKGLPIAAASFAGGVVGALSVSLVSKDILLAVVPVLLIFVALYFVFSP
|||||||||||||||||||||||:|||:||||||||||||||||||||||||||||||||
orf109-1 TVSFARKGLIDWKKGLPIAAASFVGGVAGALSVSLVSKDILLAVVPVLLIFVALYFVFSP
70 80 90 100 110 120
130 140 150 160 170 180
orf109a.pep KLDGSKEGKARMSFFLFGLTVAPLLGFYDGVFGPGVGSFFLIAFIVLLGCKLLNAMSYTK
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf109-1 KLDGSKEGKARMSFFLFGLTVAPLLGFYDGVFGPGVGSFFLIAFIVLLGCKLLNAMSYTK
130 140 150 160 170 180
190 200 210 220 230 240
orf109a.pep LANVACNLGSLSVFLLHGSIIFPIAATMAVGAFVGANLGARFAVRFGSKLIKPLLIVISI
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf109-1 LANVACNLGSLSVFLLHGSIIFPIAATMAVGAFVGANLGARFAVRFGSKLIKPLLIVISI
190 200 210 220 230 240
250 260
orf109a.pep SMAVKLLIDERNPLYQMIVSMFX
|||||||||||||||||||||||
orf109-1 SMAVKLLIDERNPLYQMIVSMFX
250 260
Homology with a predicted ORF from N. gonorrhoeae
ORF109 shows 98.3% identity over a 231 aa overlap with a predicted ORF (ORF109.ng) from N. gonorrhoeae:
orf109.pep MEDLYIILALGLVAMIAGFIDAIAGGGGLITLPALLLAGIPPVSAIATNKLQAAAATFSA 60
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf109ng MEDLYIILALGLVAMIAGFIDAIAGGGGLITLPALLLAGIPPVSAIATNKLQAAAATFSA 60
orf109.pep TVSFARKGLIDWKKGLPIAAASFVGGVAGALSVSLVSKDILLAVVPVLLIFVALYFVFSP 120
|||||||||||||||||||||||:|||:||||||||||||||||||||||||||||||||
orf109ng TVSFARKGLIDWKKGLPIAAASFAGGVVGALSVSLVSKDILLAVVPVLLIFVALYFVFSP 120
orf109.pep KLDGSKEGKARMSFFLFGLTVXTAFGFLRRCVRTGCRLVFSDCLYCFARLQAVERDVLHQ 180
||||||||||||||||||||| ||||||||||||||||||||||||||||||||||||||
orf109ng KLDGSKEGKARMSFFLFGLTVATAFGFLRRCVRTGCRLVFSDCLYCFARLQAVERDVLHQ 180
orf109.pep IGERCLQSWFAIGIPAARFDYFPDCGNDGGRCVCRCEFRCEICRTLRFEAD 231
|||||||||||||||||||||||||||||||||||||||||||| ||||||
orf109ng IGERCLQSWFAIGIPAARFDYFPDCGNDGGRCVCRCEFRCEICRPLRFEAD 231
The ORF109ng nucleotide sequence <SEQ ID 437> is predicted to encode a protein having amino acid sequence <SEQ ID 438>:
1 MEDLYIILAL GLVAMIAGFI DAIAGGGGLI TLPALLLAGI PPVSAIATNK
51 LQAAAATFSA TVSFARKGLI DWKKGLPIAA ASFAGGVVGA LSVSLVSKDI
101 LLAVVPVLLI FVALYFVFSP KLDGSKEGKA RMSFFLFGLT VATAFGFLRR
151 CVRTGCRLVF SDCLYCFARL QAVERDVLHQ IGERCLQSWF AIGIPAARFD
201 YFPDCGNDGG RCVCRCEFRC EICRPLRFEA D*
Further work revealed the following gonococcal DNA sequence <SEQ ID 439>:
1 ATGGAAGATT TATACATAAT ACTCGCTTTG GGTTTGGTTG CGATGATCGC
51 CGGATTTATC GATGCGATTG CGGGCGGGGG TGGTTTGATT ACGCTGCCTG
101 CACTCTTGTT GGCAGGTATT CCTCCCGTGT CGGCAATTGC CACCAACAAG
151 CTGCAAGCAG CCGCTGCTAC GTTTTCGGCT ACGGTTTCTT TTGCACGCAA
201 AGGTTTGATT GATTGGAAGA AAGGTCTCCC GATTGCCGCA GCATCGTTTG
251 CAGGCGGCGT GGTCGGTGCA TTATCGGTCA GCTTGGTTTC CAAAGATATT
301 TTGCTGGCGG TCGTGCCGGT TTTGTTGATA TTTGTCGCGC TGTATTTTGT
351 GTTTTCGCCC AAGCTCGACG GCAGTAAGGA AGGCAAAGCC AGAATGTCTT
401 TTTTTCTATT CGGGCTGACG GTTGCACCGC TTTTGGGTTT TTACGACGGT
451 GTGTTCGGAC CGGGTGTCGG CTCGTTTTTT CTGATTGCCT TTATTGTTTT
501 GCTCGGCTGC AAGCTGTTGA ACGCGATGTC TTACACCAAA TTGGCGAACG
551 TTGCTTGCAA TCTTGGTTCG CTATCGGTAT TCCTGCTGCA CGGTTCGATT
601 ATTTTCCCGA TTGTGGCAAC GATGGCGGTC GGTGCGTTTG TCGGTGCGAA
651 TTTAGGTGCG AGATTTGCCG TCCGCTTCGG TTCGAAGCTG ATTAAGCCGC
701 TGCTGATTGT CATCAGCATT TCGATGGCTG TGAAATTGTT GATAGACGAG
751 AGAAATCCGC TGTATCAGAT GATTGTTTCG ATGTTTTAA
This corresponds to the amino acid sequence <SEQ ID 440; ORF109ng-1>:
1 MEDLYIILAL GLVAMIAGFI DAIAGGGGLI TLPALLLAGI PPVSAIATNK
51 LQAAAATFSA TVSFARKGLI DWKKGLPIAA ASFAGGVVGA LSVSLVSKDI
101 LLAVVPVLLI FVALYFVFSP KLDGSKEGKA RMSFFLFGLT VAPLLGFYDG
151 VFGPGVGSFF LIAFIVLLGC KLLNAMSYTK LANVACNLGS LSVFLLHGSI
201 IFPIVATMAV GAFVGANLGA RFAVRFGSKL IKPLLIVISI SMAVKLLIDE
251 RNPLYQMIVS MF*
ORF109ng-1 and ORF109-1 show 98.9% identity over a 262 aa overlap:
10 20 30 40 50 60
orf109ng-1.pep MEDLYIILALGLVAMIAGFIDAIAGGGGLITLPALLLAGIPPVSAIATNKLQAAAATFSA
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf109-1 MEDLYIILALGLVAMIAGFIDAIAGGGGLITLPALLLAGIPPVSAIATNKLQAAAATFSA
10 20 30 40 50 60
70 80 90 100 110 120
orf109ng-1.pep TVSFARKGLIDWKKGLPIAAASFAGGVVGALSVSLVSKDILLAVVPVLLIFVALYFVFSP
|||||||||||||||||||||||:|||:||||||||||||||||||||||||||||||||
orf109-1 TVSFARKGLIDWKKGLPIAAASFVGGVAGALSVSLVSKDILLAVVPVLLIFVALYFVFSP
70 80 90 100 110 120
130 140 150 160 170 180
orf109ng-1.pep KLDGSKEGKARMSFFLFGLTVAPLLGFYDGVFGPGVGSFFLIAFIVLLGCKLLNAMSYTK
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf109-1 KLDGSKEGKARMSFFLFGLTVAPLLGFYDGVFGPGVGSFFLIAFIVLLGCKLLNAMSYTK
130 140 150 160 170 180
190 200 210 220 230 240
orf109ng-1.pep LANVACNLGSLSVFLLHGSIIFPIVATMAVGAFVGANLGARFAVRFGSKLIKPLLIVISI
||||||||||||||||||||||||:|||||||||||||||||||||||||||||||||||
orf109-1 LANVACNLGSLSVFLLHGSIIFPIAATMAVGAFVGANLGARFAVRFGSKLIKPLLIVISI
190 200 210 220 230 240
250 260
orf109ng-1.pep SMAVKLLIDERNPLYQMIVSMFX
|||||||||||||||||||||||
orf109-1 SMAVKLLIDERNPLYQMIVSMFX
250 260
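In addition, ORF109ng-1 shows homology with a hypothetical Pseudomonas protein:
sp|P29942|YCB9_PSEDE HYPOTHETICAL 27.4 KD PROTEIN IN COBO 3' REGION (ORF9)
>gi|94984|pir||I38164 hypothetical protein 9 - Pseudomonas sp. >gi|551929 (M62866) ORF9 [Pseudomonas denitrificans] Length = 261
Score = 175 bits (439), Expect = 3e-43
Identities = 83/214 (38%), Positives = 131/214 (60%), Gaps = 1/214 (0%)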
Query: 41 PPVSAIATNKLQXXXXXXXXXXXXXRKGLIDWKKGLPIXXXXXXXXXXXXXXXXXXXKDI 100
PP+ + TNKLQ R+G ++ K+ LP+ D+
Sbjct: 43 PPLQTLGTNKLQGLFGSGSATLSYARRGHVNLKEQLPMALMSAAGAVLGALLATIVPGDV 102
Query: 101 LLAVVPVLLIFVALYFVFSPKLDGSKEGKARMSFFLFGLTVAPLLGFYDGVFGPGVGSFF 160
L A++P LLI +ALYF P + G + +R++ F+F LT+ PL+GFYDGVFGPG GSFF
Sbjct: 103 LKAILPFLLIAIALYFGLKPNM-GDVDQHSRVTPFVFTLTLVPLIGFYDGVFGPGTGSFF 161
Query: 161 LIAFIVLLGCKLLNAMSYTKLANVACNLGSLSVFLLHGSIIFPIVATMAVGAFVGANLGA 220
++ F+ L G +L A ++TK N N+G+ VFL G++++ + M +G F+GA +G+
Sbjct: 162 MLGFVTLAGFGVLKATAHTKFLNFGSNVGAFGVFLFFGAVLWKVGLLMGLGQFLGAQVGS 221
Query: 221 RFAVRFGSKLIKPLLIVISISMAVKLLIDERNPL 254
R+A+ G+K+IKPLL+++SI++A++LL D +PL
Sbjct: 222 RYAMAKGAKIIKPLLVIVSIALAIRLLADPTHPL 255
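Based on this analysis, including the presence of a putative leader sequence (double-underlined) and several putative transmembrane domains (single-underlined) in the gonococcal protein, it is predicted that the proteins from N. meningitidis and N. gonorrhoeae, and their epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies.
Example 52
The following partial DNA sequence was identified in N. meningitidis <SEQ ID 441>: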
1 ..CTGCTAGG ATTGCATCGG TTATCGGTAC GCTGTTGCA GCAAAACCAG
51 CCGCAGACGG ATTATTTGGT CAAATTCGGA TCGTTTTGGG CGAG.ATTTT
101 TGGTTTTCTG GGACTGTATG ACGTCTATGC TTCGGCATGG TTTGTCGTTA
151 TCATGATGTT TTTGGTGGTT TCTACCAGTT TGTGCCTGAT TCGCAATGTG
201 CCGCCGTTCT GGCGCGAAAT GAAGTCTTTT CGGGAAAAGG TTAAAGAAAA
251 ATCTCTGGCG GCGATGCGCC ATTCTTCGCT GTTGGATGTA AAAATTGCGC
301 CCGAGGTTGC CAAACGTTAT CTGGAAGTAC AAGGTTTTCA GGGGAAAACC
351 ATTAACCGTG AAGACGGGTC GGTTCTGATT GCCGCCAAAA AAGGCACAAT
401 GAACAAATGG GGCTATATCT TTGCCCATGT TGCTTTGATT GTCATTTGCC
451 TGGGCGGGTT GATAGACAGT AACCTGCTGT TGAAACTGGG TATGCTGACC
551 CCGAAAGTAT .TTTGGGTGC gTCCAATCTC TCATTTAGGG GCAACGTCAA
601 TATTTCCG.A GGGGCAGAgT GCGGATGTGG TTTTCCTGA
This corresponds to the amino acid sequence <SEQ ID 442; ORF110>:
1 ..LLGIASVIGT LLQQNQPQTD YLVKFGSFWA XIFGFLGLYD VYASAWFVVI
51 MMFLVVSTSL CLIRNVPPFW REMKSFREKV KEKSLAAMRH SSLLDVKIAP
101 EVAKRYLEVQ GFQGKTINRE DGSVLIAAKK GTMNKWGYIF AHVALIVICL
151 GGLIDSNLLL KLGMLTGRIF RTIRRFMPRI XKPESXFGCV QSLI*GQRQY
201 FXRGRVRMWF S*
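Computer analysis of this amino acid sequence gave the following results:
Homology with ORF88a from N. meningitidis (strain A)
ORF110 shows 91.5% identity over a 188 aa overlap with ORF88a from strain A of N. meningitidis: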
10 20 30 40 50 60
orf88a.pep MSKSRRSPPLLSRPWFAFFSSMRFAVALLSLLGIASVIGTVLQQNQPQTDYLVKFGSFWA
||||||||||:| ||||||||||||||||||
orf110 LLGIASVIGTLLQQNQPQTDYLVKFGSFWA
10 20 30
70 80 90 100 110 120
orf88a.pep QIFGFLGLYDVYASAWFVVIMMFLVVSTSLCLIRNVPPFWREMKSFREKVKEKSLAAMRH
|||||||||||||||| ||||||||||||||||| |||||||||||||||||||||||||||
orf110 XIFGFLGLYDVYASAWFVVIMMFLVVSTSLCLIRNVPPFWREMKSFREKVKEKSLAAMRH
40 50 60 70 80 90
130 140 150 160 170 180
orf88a.pep SSLLDVKIAPEVAKRYLEVQGFQGKTINREDGSVLIAAKKGTMNKWGYIFAHVALIVICL
||||||||||||||||||||||||||||||||||||||||||||||| |||||||||||||
orf110 SSLLDVKIAPEVAKRYLEVQGFQGKTINREDGSVLIAAKKGTMNKWGYIFAHVALIVICL
100 110 120 130 140 150
190 200 210 220 230 240
orf88a.pep GGLIDSNLLLKLGMLTGRIVPDNQAVYAKDFKPESILGASNLSFRGNVNISEGQSADVVF
|||| ||||||||||||||| : : : |||| :|
orf110 GGLIDSNLLLKLGMLTGRIFRTIRRFMPRIXKPESXFGCVQSLIXGQRQYFXRGRVRMWF
160 170 180 190 200 210
250 260 270 280 290 300
orf88a.pep LNADNGILVQDLPFEVKLKKFHIDFYNTGMPRDFASDIEVTDKATGEKLERTIRVNHPLT
orf110 SX
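However, ORF88 and ORF110 do not align with each other, because they represent two different fragments of the same protein.
Homology with a predicted ORF from N. gonorrhoeae
ORF110 shows 88.6% identity over a 211 aa overlap with a predicted ORF (ORF110.ng) from N. gonorrhoeae: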
orf110.pep LLGIASVIGTLLQQNQPQTDYLVKFGSFWA 30
||||||||||:||||||||||||||| ||:
orf110ng MSKSRISPTLLSRPWFAFFSSMRFAVALLSLLGIASVIGTVLQQNQPQTDYLVKFGPFWT 60
orf110.pep XIFGFLGLYDVYASAWFVVIMMFLVVSTSLCLIRNVPPFWREMKSFREKVKEKSLAAMRH 90
|| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf110ng RIFDFLGLYDVYASAWFVVIMMFLVVSTSLCLIRNVPPFWREMKSFREKVKEKSLAAMRH 120
orf110.pep SSLLDVKIAPEVAKRYLEVQGFQGKTINREDGSVLIAAKKGTMNKWGYIFAHVALIVICL 150
|||||||||||||||||||:||||||::||||||||||||||||||||| ||||||||||
orf110ng SSLLDVKIAPEVAKRYLEVRGFQGKTVSREDGSVLIAAKKGTMNKWGYIXAHVALIVICL 180
orf110.pep GGLIDSNLLLKLGMLTGRIFRTIRRFMPRIXKPESXFGCVQSLIXGQRQYFXRGRVRMWF 210
| ||: |||||||||:| |||: || |||| |||| :| ||||| |||||| ||:|||||
orf110ng GRLINXNLLLKLGMLAGSIFRNNRRVMPRISKPESIWGGVQSLIKGQRQYFQRGKVRMWF 240
orf110.pep S 211
|
orf110ng S 241
The complete length ORF110ng nucleotide sequence <SEQ ID 443> is predicted to encode a protein having amino acid sequence <SEQ ID 444>:
1 MSKSRISPTL LSRPWFAFFS SMRFAVALLS LLGIASVIGT VLQQNQPQTD
51 YLVKFGPFWT RIFDFLGLYD VYASAWFVVI MMFLVVSTSL CLIRNVPPFW
101 REMKSFREKV KEKSLAAMRH SSLLDVKIAP EVAKRYLEVR GFQGKTVSRE
151 DGSVLIAAKK GTMNKWGYIX AHVALIVICL GRLINXNLLL KLGMLAGSIF
201 RNNRRVMPRI SKPESIWGGV QSLIKGQRQY FQRGKVRMWF S*
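Based on the presence of putative transmembrane domains in the gonococcal protein, it is predicted that the proteins from N. meningitidis and N. gonorrhoeae, and their epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies.
Example 53
The following DNA sequence was identified in N. meningitidis <SEQ ID 445>: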
1 ATGCCGTCTG AAACACGCCT GCCGAACTTT ATCCGCGTCT TGATATTTGC
51 CCTGGGTTTC ATCTTCCTGA ACGCCTGTTC GGAACAAACC GCGCAAACCG
101 TTACCCTGCA AGGCGAAACG ATGGGCACGA CCTATACCGT CAAATACCTT
151 TCAAATAATC GGGACAAACT CCCCTCACCT GCCGAAATAC AAAAACGCAT
201 CGATGACGCG CTTAAAGAAG TCAACCGGCA GATGTCCACC TATCAGCCCG
251 ACTCCGAAAT CAGCCGGTTC AACCAACACA CAGCCGGCAA GCCCCTCCGC
301 ATTTCAAGCG ACTTCGCACA CGTTACTGCC GAAGCCGTCC GCCTGAACCG
351 CCTGACACAC GGCGCGCTGG ACGTAACCGT CGGCCCCTTG GTCAACCTTT
401 GGGGATTCGG CCCCGACAAA TCCGTTACCC GTGAACCGTC GCCGGAACAA
451 ATCAAACAGG CGGCATCTTA TACGGGCATA GACAAAATCA TTTTGAAACA
501 AGGCAAAGAT TACGCTTCCT TGAGCAAAAC CCACCCCAAG GCCTATTTGG
551 ATTTATCTTC GATTGCCAAA GGCTTCGGCG TTGATAAAGT TGCGGGCGAA
601 CTGGAAAAAT ACGGCATTCA AAATTATCTG GTCGAAATCG GCGGCGAGTT
651 GCACGGCAAA GGCAAAAACG CGCGCGGCGA ACCGTGGCGC ATCGGTATCG
701 AGCAGCCCAA TATCGTCCAA GGCGGCAATA CGCAGATTAT CGTCCCGCTG
751 AACAACCGTT CGCTTGCCAC TTCCGGCGAT TACCGTATTT TCCACGTCGA
801 TAAAAACGGC AAACGCCTCT CCCATATCAT CAACCCGAAC AACAAACGAC
851 CCATCAGCCA CAACCTCGCC TCCATCAGCG TGGTCGCAGA CAGTGCGATG
901 ACGGCGGACG GCTTGTCCAC AGGATTATTC GTATTGGGCG AAACCGAAGC
951 CTTAAAGCTG GCAGAGCGCG AAAAACTCGC TGTTTTCCTG ATTGTCAGGG
1001 ATAAAGGCGG CTACCGCACC GCCATGTCTT CCGAATTTGA AAAACTGCTC
1051 CGCTAA
This corresponds to the amino acid sequence <SEQ ID 446; ORF111>:
1 MPSETRLPNF IRVLIFALGF IFLNACSEQT AQTVTLQGET MGTTYTVKYL
51 SNNRDKLPSP AEIQKRIDDA LKEVNRQMST YQPDSEISRF NQHTAGKPLR
101 ISSDFAHVTA EAVRLNRLTH GALDVTVGPL VNLWGFGPDK SVTREPSPEQ
151 IKQAASYTGI DKIILKQGKD YASLSKTHPK AYLDLSSIAK GFGVDKVAGE
201 LEKYGIQNYL VEIGGELHGK GKNARGEPWR IGIEQPNIVQ GGNTQIIVPL
251 NNRSLATSGD YRIFHVDKNG KRLSHIINPN NKRPISHNLA SISVVADSAM
301 TADGLSTGLF VLGETEALKL AEREKLAVFL IVRDKGGYRT AMSSEFEKLL
351 R*
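Computer analysis of this amino acid sequence gave the following results:
Homology with a predicted ORF from N. meningitidis (strain A)
ORF111 shows 96.9% identity over a 351 aa overlap with an ORF (ORF111a) from strain A of N. meningitidis: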
10 20 30 40 50 60
orf111a.pep MPSETRLPNFIRTLIFALSFIFLNACSEQTAQTVTLQGETMGTTYTVKYLSNNRDXLPSP
||||||||||||:|||||:|||||||||||||||||||||||||||||||||||| ||||
orf111 MPSETRLPNFIRVLIFALGFIFLNACSEQTAQTVTLQGETMGTTYTVKYLSNNRDKLPSP
10 20 30 40 50 60
70 80 90 100 110 120
orf111a.pep AEIQXRIDDALKEVNRQMSTYQPDSEISRFNQHTAGKPLRISSDFAHVTAEAVHLNRLTH
|||| ||||||||||||||||||||||||||||||||||||||||||||||||:||||||
orf111 AEIQKRIDDALKEVNRQMSTYQPDSEISRFNQHTAGKPLRISSDFAHVTAEAVRLNRLTH
70 80 90 100 110 120
130 140 150 160 170 180
orf111a.pep GALDVTVGPLVNLWGFGPDKSVTREPSPEQIKQAASYTGIDKIILKQGKDYASLSKTHPK
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf111 GALDVTVGPLVNLWGFGPDKSVTREPSPEQIKQAASYTGIDKIILKQGKDYASLSKTHPK
130 140 150 160 170 180
190 200 210 220 230 240
orf111a.pep AYLDLSSIAKGFGVDXVAGELEKYGIQNYLVEIGGELHGKXKNARGEPWRIGIEQPNIVQ
||||||||||||||| |||||||||||||||||||||||| |||||||||||||||||||
orf111 AYLDLSSIAKGFGVDKVAGELEKYGIQNYLVEIGGELHGKGKNARGEPWRIGIEQPNIVQ
190 200 210 220 230 240
250 260 270 280 290 300
orf111a.pep GGNTQIIVPLNNRSXATSGDYRIFHVDKSGKRLSHIINPNNKRPISHNLASISVXADSAM
|||||||||||||| |||||||||||||:||||||||||||||||||||||||| |||||
orf111 GGNTQIIVPLNNRSLATSGDYRIFHVDKNGKRLSHIINPNNKRPISHNLASISVVADSAM
250 260 270 280 290 300
310 320 330 340 350
orf111a.pep TADGXSTGLFVLGETEALKLAEREKLAVFLIVRDKGGYRTAMSSEFEKLLRX
|||| |||||||||||||||||||||||||||||||||||||||||||||||
orf111 TADGLSTGLFVLGETEALKLAEREKLAVFLIVRDKGGYRTAMSSEFEKLLRX
310 320 330 340 350
The complete length ORF111a nucleotide sequence <SEQ ID 447> is:
1 ATGCCGTCTG AAACACGCCT GCCGAACTTT ATCCGCACCT TGATATTTGC
51 CCTGAGTTTT ATCTTCCTGA ACGCCTGTTC GGAACAAACC GCGCAAACCG
101 TTACCCTGCA AGGTGAAACG ATGGGCACGA CCTATACCGT CAAATACCTT
151 TCAAATAATC GGGACNAACT CCCNTCACCT GCCGAAATAC AAAANCGCAT
201 CGATGACGCG CTTAAAGAAG TCAACCGGCA GATGTCCACC TATCAGCCCG
251 ACTCCGAAAT CAGCCGGTTC AACCAACACA CAGCCGGCAA GCCCCTCCGC
301 ATTTCAAGCG ACTTCGCACA CGTTACTGCC GAAGCCGTCC ACCTGAACCG
351 CCTGACACAC GGCGCGCTGG ACGTAACCGT CGGCCCCTTG GTCAACCTTT
401 GGGGATTCGG CCCCGACAAA TCCGTTACCC GTGAACCGTC GCCGGAACAA
451 ATCAAACAAG CAGCATCTTA TACGGGCATA GACAAAATCA TTTTGAAACA
501 AGGCAAAGAT TACGCTTCCT TGAGCAAAAC CCACCCCAAG GCCTATTTGG
551 ATTTATCTTC GATTGCCAAA GGCTTCGGCG TTGATNANGT TGCGGGCGAA
601 CTGGAAAAAT ACGGCATTCA AAATTATCTG GTCGAAATCG GCGGNGAGTT
651 GCACGGCAAA GNCAAAAACG CGCGCGGCGA ACCTTGGCGC ATCGGCATCG
701 AACAGCCCAA CATCGTCCAA GGCGGCAATA CGCAGATTAT CGTCCCGCTG
751 AACAACCGTT CGNTTGCCAC TTCCGGCGAT TACCGTATTT TCCACGTCGA
801 TAAAAGCGGC AAACGCCTCT CCCATATCAT TAATCCGAAC AACAAACGAC
851 CCATCAGCCA CAACCTCGCC TCCATCAGCG TGNTCGCAGA CAGTGCGATG
901 ACGGCGGACG GCTTNTCCAC AGGATTATTC GTATTGGGCG AAACCGAAGC
951 CTTAAAGCTG GCAGAGCGCG AAAAACTCGC TGTTTTCCTG ATTGTCAGGG
1001 ATAAAGGCGG CTACCGCACC GCCATGTCTT CCGAATTTGA AAAACTGCTC
1051 CGCTAA
This encodes a protein having amino acid sequence <SEQ ID 448>:
1 MPSETRLPNF IRTLIFALSF IFLNACSEQT AQTVTLQGET MGTTYTVKYL
51 SNNRDXLPSP AEIQXRIDDA LKEVNRQMST YQPDSEISRF NQHTAGKPLR
101 ISSDFAHVTA EAVHLNRLTH GALDVTVGPL VNLWGFGPDK SVTREPSPEQ
151 IKQAASYTGI DKIILKQGKD YASLSKTHPK AYLDLSSIAK GFGVDXVAGE
201 LEKYGIQNYL VEIGGELHGK XKNARGEPWR IGIEQPNIVQ GGNTQIIVPL
251 NNRSXATSGD YRIFHVDKSG KRLSHIINPN NKRPISHNLA SISVXADSAM
301 TADGXSTGLF VLGETEALKL AEREKLAVFL IVRDKGGYRT AMSSEFEKLL
351 R*
Homology with a predicted ORF from N. gonorrhoeae
ORF111 shows 96.6% identity over a 351 aa overlap with a predicted ORF (ORF111.ng) from N. gonorrhoeae:
10 20 30 40 50 60
orf111ng MPSETRLPNLIRALIFALGFIFLNACSEQTAQTVTLQGETMGTTYTVKYLSNNRDKLPSP
|||||||||:||:|||||||||||||||||||||||||||||||||||||||||||||||
orf111 MPSETRLPNFIRVLIFALGFIFLNACSEQTAQTVTLQGETMGTTYTVKYLSNNRDKLPSP
10 20 30 40 50 60
70 80 90 100 110 120
orf111ng AKIQKRIDDALKEVNRQMSTYQTDSEISRFNQHTAGKPLRISSDFAHVTAEAVRLNRLTH
|:|||||||||||||||||||| |||||||||||||||||||||||||||||||||||||
orf111 AEIQKRIDDALKEVNRQMSTYQPDSEISRFNQHTAGKPLRISSDFAHVTAEAVRLNRLTH
70 80 90 100 110 120
130 140 150 160 170 180
orf111ng GALDVTVGPLVNLWGFGPDKSVTREPSPEQIKQAASYTGIDKIILQQGKDYASLSKTHPK
|||||||||||||||||||||||||||||||||||||||||||||:||||||||||||||
orf111 GALDVTVGPLVNLWGFGPDKSVTREPSPEQIKQAASYTGIDKIILKQGKDYASLSKTHPK
130 140 150 160 170 180
190 200 210 220 230 240
orf111ng AYLDLSSIAKGFGVDKVAGELEKYGIQNYLVEIGGELHGKGKNAHGEPWRIGIEQPNIIQ
||||||||||||||||||||||||||||||||||||||||||||:|||||||||||||:|
orf111 AYLDLSSIAKGFGVDKVAGELEKYGIQNYLVEIGGELHGKGKNARGEPWRIGIEQPNIVQ
190 200 210 220 230 240
250 260 270 280 290 300
orf111ng GGNTQIIVPLNNRSLATSGDYRIFHVDKNGKRLSHIINPNNKRPISHNLASISVVSDSAM
|||||||||||||||||||||||||||||||||||||||||||||||||||||||:||||
orf111 GGNTQIIVPLNNRSLATSGDYRIFHVDKNGKRLSHIINPNNKRPISHNLASISVVADSAM
250 260 270 280 290 300
310 320 330 340 350
orf111ng TADGLSTGLFVLGETEALRLAEQEKLAVFLIVRDKDGYRTAMSSEFAKLLRX
||||||||||||||||||:|||:|||||||||||| |||||||||| |||||
orf111 TADGLSTGLFVLGETEALKLAEREKLAVFLIVRDKGGYRTAMSSEFEKLLRX
310 320 330 340 350
The complete length ORF111ng nucleotide sequence <SEQ ID 449> is:
1 ATGCCGTCTG AAACACGCCT GCCGAACCTT ATCCGCGCCT TGATATTTGC
51 CCTGGGTTTC ATCTTCCTGA ACGCCTGTTC GGaacaaacC GCGCAaaccg
101 TTACCCTGCA AGGCGAAAcg aTGGGTACGA CCTATACCGT CAAATACCTT
151 TCAAATAATC GGGACAAACT CCCCTCCCCT GCCAAAATAC AAAAGCGCAT
201 TGATGATGCG CTTAAAGAAG TCAACCGGCA GATGTCCACC TACCAGACCG
251 ATTCCGAAAT CAGCCGGTTC AACCAACACA CAGCCGGCAA GCCCCTCCGC
301 ATTTCAAGCG ATTTCGCACA CGTTACCGCC GAAGCCGTCC GCCTGAACCG
351 CCTGACTCAC GGCGCACTGG ACGTAACCGT CGGCCCTTTG GTCAACCTTT
401 GGGGGTTCGG CCCCGACAAA TCCGTTACCC GTGAACCGTC GCCGGAACAA
451 ATCAAACAGG CGGCATCTTA TACGGGCATA GACAAAATCA TTTTGCAACA
501 AGGCAAAGAT TACGCTTCCT TGAGCAAAAC CCACCCCAAA GCCTATTTGG
551 ATTTATCTTC GATTGCCAAA GGCTTCGGCG TTGATAAAGT TGCGGGCGAA
601 CTGGAAAAAT ACGGCATTCA AAATTATCTG GTCGAAAtcg gcggcGAGTT
651 GCACGGCAAA GGCAAAAATG CGCACGGCGA ACCGTGGCGC ATCGGTATAG
701 AGCAACCCAA TATCATCCAA GgcgGCAata CGCAGATTAt cgtcccgctg
751 aaCaaccgtt cgctTGCCAC TTCCGGCGAT TAccgtaTTT tccacgtcgA
801 TAAAAAcggc aaacgccttt cccacaTCAT CAATCCCaAC aacAAACgac
851 ccATCAGcca caacctcgcc tccatcagcg tggtctcAGA CAGTGCAATG
901 ACGGCGGACG GTTtatCCAC AGGATTATTT GTTTTAGGCG AAACCGAAGC
951 CTTAAGGCTG GCAGAACAAG AAAAACTCGC TGTTTTCCTA ATTGTCCGGG
1001 ATAAGGACGG CTACCGCACC GCCATGTCTT CCGAATTTGC CAAGCTGCTC
1051 CGCTAA
This encodes a protein having amino acid sequence <SEQ ID 450>:
1 MPSETRLPNL IRALIFALGF IFLNACSEQT AQTVTLQGET MGTTYTVKYL
51 SNNRDKLPSP AKIQKRIDDA LKEVNRQMST YQTDSEISRF NQHTAGKPLR
101 ISSDFAHVTA EAVRLNRLTH GALDVTVGPL VNLWGFGPDK SVTREPSPEQ
151 IKQAASYTGI DKIILQQGKD YASLSKTHPK AYLDLSSIAK GFGVDKVAGE
201 LEKYGIQNYL VEIGGELHGK GKNAHGEPWR IGIEQPNIIQ GGNTQIIVPL
251 NNRSLATSGD YRIFHVDKNG KRLSHIINPN NKRPISHNLA SISVVSDSAM
301 TADGLSTGLF VLGETEALRL AEQEKLAVFL IVRDKDGYRT AMSSEFAKLL
351 R*
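This protein shows homology with a hypothetical lipoprotein precursor from H. influenzae:
sp|P44550|YOJL_HAEIN HYPOTHETICAL LIPOPROTEIN HI0172 PRECURSOR >gi|1074292|pir|4 hypothetical protein HI0172 - Haemophilus influenzae (strain Rd KW20) >gi|1573128 (U32702) hypothetical [Haemophilus influenzae] Length = 346
Score = 353 bits (896), Expect = 9e-97
Identities = 181/344 (52%), Positives = 247/344 (71%), Gaps = 4/344 (1%)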
Query: 7 LPNLIRALIFALGFIFLNACSEQTAQTVTLQGETMGTTYTVKYLSNNRDKLPSPAKIQKR 66
+ LI +I + L AC ++T + ++L G+TMGTTY VKYL + S K +
Sbjct: 1 MKKLISGIIAVAMALSLAACQKET-KVISLSGKTMGTTYHVKYLDDGSITATSE-KTHEE 58
Query: 67 IDDALKEVNRQMSTYQTDSEISRFNQHT-AGKPLRISSDFAHVTAEAVRLNRLTHGALDV 125
I+ LK+VN +MSTY+ DSE+SRFNQ+T P+ IS+DFA V AEA+RLN++T GALDV
Sbjct: 59 IEAILKDVNAKMSTYKKDSELSRFNQNTQVNTPIEISADFAKVLAEAIRLNKVTEGALDV 118
Query: 126 TVGPLVNLWGFGPDKSVTREPSPEQIKQAASYTGIDKIILQQGKDYASLSKTHPKAYLDL 185
TVGP+VNLWGFGP+K ++P+PEQ+ + ++ GIDKI L K+ A+LSK P+ Y+DL
Sbjct: 119 TVGPVVNLWGFGPEKRPEKQPTPEQLAERQAWVGIDKITLDTNKEKATLSKALPQVYVDL 178
Query: 186 SSIAKGFGVDKVAGELEKYGIQNYLVEIGGELHGKGKNAHGEPWRIGIEQPNIIQGGNTQ 245
SSIAKGFGVD+VA +LE+ QNY+VEIGGE+ KGKN G+PW+I IE+P +
Sbjct: 179 SSIAKGFGVDQVAEKLEQLNAQNYMVEIGGEIRAKGKNIEGKPWQIAIEKPTTTGERAVE 238
Query: 246 IIVPLNNRSLATSGDYRIFHVDKNGKRLSHIINPNNKRPISHNLASISVVSDSAMTADGL 305
++ LNN +A+SGDYRI+ ++NGKR +H I+P PI H+LASI+V++ ++MTADGL
Sbjct: 239 AVIGLNNMGMASSGDYRIY-FEENGKRFAHEIDPKTGYPIQHHLASITVLAPTSMTADGL 297
Query: 306 STGLFVLGETEALRLAEQEKLAVFLIVRDKDGYRTAMSSEFAKL 349
STGLFVLGE +AL +AE+ LAV+LI+R +G+ T SS F KL
Sbjct: 298 STGLFVLGEDKALEVAEKNNLAVYLIIRTDNGFVTKSSSAFKKL 341
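Based on this analysis, it is predicted that the proteins from N. meningitidis and N. gonorrhoeae, and their epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies.
Example 54
The following partial DNA sequence was identified in N. meningitidis <SEQ ID 451>: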
1 ..CCGTGCCGCC GACAGGGCGA CGACGTGTAT GCGGCGCACG CGTCCCGTCA
51 AAAATTGTGG CTGCGCTTCA TCGGCGGCCG GTCGCATCAA AATATACGGG
101 GCGGCGCGGC TGCGGACGGG TGGCGCAAAG GCGTGCAAAT CGGCGGCGAG
151 GTGTTTGTAC GGCAAAATGA AGGCAGCCkA yTGGCAATCG GCGTGATGGG
201 CGGCAGGGCC GGCCAGCACG CwTCAGTCAA CGGCAAAGGC GGTGCGGCAG
251 gCAGTGATTT GTATGGTTAT GgCGGGGgTG TTTATGCTgC GTGGCATCAG
301 TTGCGCGATA AACAAACGGG TgCGTATTTG GACGGCTGGT TGCAATACCA
351 ACGTTTCAAA CACCGCATCA ATGATGAAAA CCGTGCGGAA CgCTACAAAA
401 CCAAAGGTTG GACGGCTTCT GTCGAAGGCG GCTACAACGC GCTTGTGGCG
451 GAAGGCATTG TCGGAAAAGG CAATAATGTG CGGTTTTACC TACAACCGCA
501 GgCGCAGTTT ACCTACTTGG GCGTAAACGG CGGCTTTACC GACAGCGAGG
551 GGACGGCGGT CGGACTGCTC GGCAGCGGTC AGTGGCAAAG CCGCGCCGGC
601 AtTCGGGCAA AAACCCGTTT TGCTTTGCGT AACGGTGTCA ATCTTCAGCC
651 TTTTGCCGCT TTTAATGTtt TGCACAGGTC AAAATCTTTC GGCGTGGAAA
701 TGGACGGCGA AAAACAGACG CTGGCAGGCA GGACGGCACT CGAAGGGCGG
751 TTCGGTATTG AAGCCGGTTG GAAAGGCCAT ATGTCCGCA..
This corresponds to the amino acid sequence <SEQ ID 452; ORF35>:
1 ..PCRRQGDDVY AAHASRQKLW LRFIGGRSHQ NIRGGAAADG WRKGVQIGGE
51 VFVRQNEGSX LAIGVMGGRA GQHASVNGKG GAAGSDLYGY GGGVYAAWHQ
101 LRDKQTGAYL DGWLQYQRFK HRINDENRAE RYKTKGWTAS VEGGYNALVA
151 EGIVGKGNNV RFYLQPQAQF TYLGVNGGFT DSEGTAVGLL GSGQWQSRAG
201 IRAKTRFALR NGVNLQPFAA FNVLHRSKSF GVEMDGEKQT LAGRTALEGR
251 FGIEAGWKGH MSA..
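The correspondence between the partial DNA sequence and the ORF35 amino acid sequence can be checked with a short translation routine. The following is a minimal sketch, not the method used in the original work: it assumes the standard genetic code, reads the sequence in the frame in which it is supplied, and reports `X` wherever an ambiguous base (for example `k` or `y` in SEQ ID 451) leaves the residue undetermined. The function name `translate` is a hypothetical helper introduced for illustration.

```python
from itertools import product

# Standard genetic code, codons enumerated with bases in T, C, A, G order.
BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# IUPAC nucleotide ambiguity codes and the bases each one may stand for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def translate(dna: str) -> str:
    """Translate DNA in frame 1; ambiguous or unreadable codons become 'X'.

    Whitespace and position numbers are ignored, and leading or trailing
    '.' truncation markers are stripped (an assumption about the listing
    format used in this document).
    """
    seq = "".join(ch for ch in dna.upper()
                  if not ch.isspace() and not ch.isdigit()).strip(".")
    protein = []
    for i in range(0, len(seq) - 2, 3):
        codon = seq[i:i + 3]
        if any(ch not in IUPAC for ch in codon):
            protein.append("X")  # unreadable base inside the codon
            continue
        # Expand the ambiguity codes and keep the residue only if it is unique.
        residues = {CODON_TABLE["".join(p)]
                    for p in product(*(IUPAC[ch] for ch in codon))}
        protein.append(residues.pop() if len(residues) == 1 else "X")
    return "".join(protein)


# First 30 bases of SEQ ID 451 and a stretch containing ambiguous bases.
print(translate("CCGTGCCGCC GACAGGGCGA CGACGTGTAT"))
print(translate("GAAGGCAGCCkAyTGGCA"))
```

Because `yTG` encodes leucine whether `y` is C or T, the sketch resolves it to `L`, while `CkA` (CGA or CTA) stays `X`, which agrees with the `GSX LA` stretch in the amino acid sequence above.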
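Computer analysis of this amino acid sequence gave the following results:
Homology with a putative secreted VirG homologue of N. meningitidis (accession number A32247)
ORF35 and the virg-h protein show 51% amino acid identity over a 261 aa overlap: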
Orf35 5 QGDDVYAAHASRQKLWLRFIGGRSHQNIRGGAA-ADGWRKGVQIGGEVFVRQNEGSXLAI 63
+ D++ R+ LWLR I G S+Q ++G A +G+RKGVQ+GGEVF QNE + L+I
virg-h 396 KNSDIFDRTLPRKGLWLRVIDGHSNQWVQGKTAPVEGYRKGVQLGGEVFTWQNESNQLSI 455
Orf35 64 GVMGGRAGQHASVNGKG--GAAGSDLYGYGGGVYAAWHQLRDKQTGAYLDGWLQYQRFKH 121
G+MGG+A Q ++ + ++ G+G GVYA WHQL+DKQTGAY D W+QYQRF+H
virg-h 456 GLMGGQAEQRSTFHNPDTDNLTTGNVKGFGAGVYATWHQLQDKQTGAYADSWMQYQRFRH 515
Orf35 122 RINDENRAERYKTKGWTASVEGGYNALVAEGIVGKGNNVRFYLQPQAQFTYLGVNGGFTD 181
RIN E+ ER+ +KG TAS+E GYNAL+AE KGN++R YLQPQAQ TYLGVNG F+D
virg-h 516 RINTEDGTERFTSKGITASIEAGYNALLAEHFTKKGNSLRVYLQPQAQLTYLGVNGKFSD 575
Orf35 182 SEGTAVGLLGSGQWQSRAGIRAKTRFALRNGVNLQPFAAFNVLHRSKSFGVEMDGEKQTL 241
SE V LLGS Q Q+R G++AK +F+L + ++PFAA N L+ +K FGVEMDGE++ +
virg-h 576 SENAHVNLLGSRQLQTRVGVQAKAQFSLYKNIAIEPFAAVNALYHNKPFGVEMDGERRVI 635
Orf35 242 AGRTALEGRFGIEAGWKGHMS 262
+TA+E + G+ K H++
virg-h 636 NNKTAIESQLGVAVKIKSHLT 656
Homology with a predicted ORF from N. meningitidis (strain A)
ORF35 shows 96.9% identity over a 259 aa overlap with an ORF (ORF35a) from strain A of N. meningitidis:
10 20 30
orf35.pep PCRRQGDDVYAAHASRQKLWLRFIGGRSHQNIRG
:||||||| ||||||||||||||||||||
orf35a QRLAIPEAEAVLYAQQAYAANTLFGLRAADRGDDVYAADPSRQKLWLRFIGGRSHQNIRG
310 320 330 340 350 360
40 50 60 70 80 90
orf35.pep GAAADGWRKGVQIGGEVFVRQNEGSXLAIGVMGGRAGQHASVNGKGGAAGSDLYGYGGGV
|||||| |||||||||||||||||| ||||||||||||||||||||||||| |:||||||
orf35a GAAADGRRKGVQIGGEVFVRQNEGSRLAIGVMGGRAGQHASVNGKGGAAGSYLHGYGGGV
370 380 390 400 410 420
100 110 120 130 140 150
orf35.pep YAAWHQLRDKQTGAYLDGWLQYQRFKHRINDENRAERYKTKGWTASVEGGYNALVAEGIV
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||:|
orf35a YAAWHQLRDKQTGAYLDGWLQYQRFKHRINDENRAERYKTKGWTASVEGGYNALVAEGVV
430 440 450 460 470 480
160 170 180 190 200 210
orf35.pep GKGNNVRFYLQPQAQFTYLGVNGGFTDSEGTAVGLLGSGQWQSRAGIRAKTRFALRNGVN
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf35a GKGNNVRFYLQPQAQFTYLGVNGGFTDSEGTAVGLLGSGQWQSRAGIRAKTRFALRNGVN
490 500 510 520 530 540
220 230 240 250 260
orf35.pep LQPFAAFNVLHRSKSFGVEMDGEKQTLAGRTALEGRFGIEAGWKGHMSA
|||||||||||||||||||||||||||||||||||||||||||||||||
orf35a LQPFAAFNVLHRSKSFGVEMDGEKQTLAGRTALEGRFGIEAGWKGHMSARIGYGKRTDGD
550 560 570 580 590 600
orf35a KEAALSLKWLFX
610 620
The full-length ORF35a nucleotide sequence <SEQ ID 453> is:
1 ATGTTCAGAG CTCAGCTTGG TTCAAATACT CGTTCTACCA AAATCGGCGA
51 CGATGCCGAT TTTTCATTTT CAGACAAGCC GAAACCCGGC ACTTCCCATT
101 ATTTTTCCAG CGGTAAAACC GATCAAAATT CATCCGAATA TGGGTATGAC
151 GAAATCAATA TCCAAGGTAA AAACTACAAT AGCGGCATAC TCGCCGTCGA
201 TAATATGCCC GTTGTTAAGA AATATATTAC AGATACTTAC GGGGATAATT
251 TAAAGGATGC GGTTAAGAAG CAATTACAGG ATTTATACAA AACAAGACCC
301 GAAGCTTGGG AAGAAAATAA AAAACGGACT GAGGAGGCGT ATATAGAACA
351 GCTTGGACCA AAATTTAGTA TACTCAAACA GAAAAACCCC GATTTAATTA
401 ATAAATTGGT AGAAGATTCC GTACTCACTC CTCATAGTAA TACATCACAG
451 ACTAGTCTCA ACAACATCTT CAATAAAAAA TTACACGTCA AAATCGAAAA
501 CAAATCCCAC GTCGCCGGAC AGGTGTTGGA ACTGACCAAG ATGACGCTGA
551 AAGATTCCCT TTGGGAACCG CGCCGCCATT CCGACATCCA TATGCTGGAA
601 ACTTCCGATA ATGCCCGCAT CCGCCTGAAC ACGAAAGATG AAAAACTGAC
651 CGTCCATAAA GCGTATCAGG GCGGTGCGGA TTTCCTGTTC GGCTACGACG
701 TGCGGGAGTC GGACAAACCC GCCCTGACCT TTGAAGAAAA AGTCAGCGGA
751 CAATCCGGCG TGGTTTTGGA ACGCCGGCCG GAAAATCTGA AAACGCTCGA
801 CGGGCGCAAA CTGATTGCGG CGGAAAAGGC AGACTCTAAT TCGTTTGCGT
851 TTAAACAAAA TTACCGGCAG GGACTGTACG AATTATTGCT CAAGCAATGC
901 GAAGGCGGAT TTTGCTTGGG CGTGCAGCGT TTGGCTATCC CCGAGGCGGA
951 AGCGGTTTTA TATGCCCAAC AGGCTTATGC GGCAAATACT TTGTTCGGGC
1001 TGCGTGCCGC CGACAGGGGC GACGACGTGT ATGCCGCCGA TCCGTCCCGT
1051 CAAAAATTGT GGCTGCGCTT CATCGGCGGC CGGTCGCATC AAAATATACG
1101 GGGCGGCGCG GCTGCGGACG GGCGGCGCAA AGGCGTGCAA ATCGGCGGCG
1151 AGGTGTTTGT ACGGCAAAAT GAAGGCAGCC GGCTGGCAAT CGGCGTGATG
1201 GGCGGCAGGG CTGGCCAGCA CGCATCAGTC AACGGCAAAG GCGGTGCGGC
1251 AGGCAGTTAT TTGCATGGTT ATGGCGGGGG TGTTTATGCT GCGTGGCATC
1301 AGTTGCGCGA TAAACAAACG GGTGCGTATT TGGACGGCTG GTTGCAATAC
1351 CAACGTTTCA AACACCGCAT CAATGATGAA AACCGTGCGG AACGCTACAA
1401 AACCAAAGGT TGGACGGCTT CTGTCGAAGG CGGCTACAAC GCGCTTGTGG
1451 CGGAAGGCGT TGTCGGAAAA GGCAATAATG TGCGGTTTTA CCTGCAACCG
1501 CAGGCGCAGT TTACCTACTT GGGCGTAAAC GGCGGCTTTA CCGACAGCGA
1551 GGGGACGGCG GTCGGACTGC TCGGCAGCGG TCAGTGGCAA AGCCGCGCCG
1601 GCATTCGGGC AAAAACCCGT TTTGCTTTGC GTAACGGTGT CAATCTTCAG
1651 CCTTTTGCCG CTTTTAATGT TTTGCACAGG TCAAAATCTT TCGGCGTGGA
1701 AATGGACGGC GAAAAACAGA CGCTGGCAGG CAGGACGGCG CTCGAAGGGC
1751 GGTTCGGCAT TGAAGCCGGT TGGAAAGGCC ATATGTCCGC ACGCATCGGA
1801 TACGGCAAAA GGACGGACGG CGACAAAGAA GCCGCATTGT CGCTCAAATG
1851 GCTGTTTTGA
This encodes a protein having the amino acid sequence <SEQ ID 454>:
1 MFRAQLGSNT RSTKIGDDAD FSFSDKPKPG TSHYFSSGKT DQNSSEYGYD
51 EINIQGKNYN SGILAVDNMP VVKKYITDTY GDNLKDAVKK QLQDLYKTRP
101 EAWEENKKRT EEAYIEQLGP KFSILKQKNP DLINKLVEDS VLTPHSNTSQ
151 TSLNNIFNKK LHVKIENKSH VAGQVLELTK MTLKDSLWEP RRHSDIHMLE
201 TSDNARIRLN TKDEKLTVHK AYQGGADFLF GYDVRESDKP ALTFEEKVSG
251 QSGVVLERRP ENLKTLDGRK LIAAEKADSN SFAFKQNYRQ GLYELLLKQC
301 EGGFCLGVQR LAIPEAEAVL YAQQAYAANT LFGLRAADRG DDVYAADPSR
351 QKLWLRFIGG RSHQNIRGGA AADGRRKGVQ IGGEVFVRQN EGSRLAIGVM
401 GGRAGQHASV NGKGGAAGSY LHGYGGGVYA AWHQLRDKQT GAYLDGWLQY
451 QRFKHRINDE NRAERYKTKG WTASVEGGYN ALVAEGVVGK GNNVRFYLQP
501 QAQFTYLGVN GGFTDSEGTA VGLLGSGQWQ SRAGIRAKTR FALRNGVNLQ
551 PFAAFNVLHR SKSFGVEMDG EKQTLAGRTA LEGRFGIEAG WKGHMSARIG
601 YGKRTDGDKE AALSLKWLF*
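The sequences in this document are printed in numbered rows of ten-residue groups. To use them in analysis software they first have to be joined into one contiguous string. The sketch below shows one way to do this, assuming that each row begins with a position number and that `..` marks a truncated end; `parse_listing` is a hypothetical helper name, not part of any library.

```python
import re


def parse_listing(lines):
    """Join numbered, space-grouped sequence rows into one string.

    Assumptions: each row may start with a position number, groups are
    separated by whitespace, and '..' at either end of the sequence marks
    truncation. Internal characters are kept exactly as printed.
    """
    parts = []
    for line in lines:
        body = re.sub(r"^\s*\d+", "", line)   # drop the leading position number
        body = re.sub(r"\s+", "", body)       # drop the group separators
        parts.append(body)
    return "".join(parts).strip(".")


# First two rows of SEQ ID 454 as printed above.
rows = [
    "1 MFRAQLGSNT RSTKIGDDAD FSFSDKPKPG TSHYFSSGKT DQNSSEYGYD",
    "51 EINIQGKNYN SGILAVDNMP VVKKYITDTY GDNLKDAVKK QLQDLYKTRP",
]
sequence = parse_listing(rows)
print(len(sequence), sequence[:10])
```

The trailing `*` that marks a stop codon is left in place, so a caller who wants only residues would need to strip it separately.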
Homology with a predicted ORF from N. gonorrhoeae
ORF35 shows 51.7% identity over a 261 aa overlap with a predicted ORF (ORF35ngh) from N. gonorrhoeae:
orf35.pep PCRRQGDDVYAAHASRQKLWLRFIGGRSHQNIRG 34
:::|:: |: |||| | |:|:| ::|
orf35ngh FTKVQERDDIAIYAQQAQAANTLFALRLNDKNSDIFDRTLPRKGLWLRVIDGHSNQWVQG 370
orf35.pep GAA-ADGWRKGVQIGGEVFVRQNEGSXLAIGVMGGRAGQHASVNGKG--GAAGSDLYGYG 91
:| ::|:|||||:|||||: |||:: |:||:|||:| |::: : : : ::: |:|
orf35ngh KTAPVEGYRKGVQLGGEVFTWQNESNQLSIGLMGGQAEQRSTFRNPDTDNLTTGNVKGFG 430
orf35.pep GGVYAAWHQLRDKQTGAYLDGWLQYQRFKHRINDENRAERYKTKGWTASVEGGYNALVAE 151
:||||:||||:|||||||:|:|:|||||:|||| | :||: :|| |||:|:|||||:||
orf35ngh AGVYATWHQLQDKQTGAYVDSWMQYQRFRHRINTEYATERFTSKGITASIEAGYNALLAE 490
orf35.pep GIVGKGNNVRFYLQPQAQFTYLGVNGGFTDSEGTAVGLLGSGQWQSRAGIRAKTRFALRN 211
:: |||::| |||||||:||||||| |:|||:: |:|||| | |||:|::||::||: |
orf35ngh HFTKKGNSLRVYLQPQAQLTYLGVNGKFSDSENAQVNLLGSRQLQSRVGVQAKAQFAFTN 550
orf35.pep GVNLQPFAAFNVLHRSKSFGVEMDGEKQTLAGRTALEGRFGIEAGWKGHMSA 263
||::|||:| | ::::| ||||:||::::: ::|::| ::|: | |:|::
orf35ngh GVTFQPFVAVNSIYQQKPFGVEIDGDRRVINNKTVIETQLGVAAKIKSHLTLQASFNRQT 610
The partial ORF35ngh nucleotide sequence <SEQ ID 455> is predicted to encode a protein having the partial amino acid sequence <SEQ ID 456>:
1 ..KKLRDRNSEY WKEETYHIKS NGRTYPNIPA LFPKHPFDPF ENINNSKKIS
51 FYDKEYTEDY LVGFARGFGV EKRNGEEEKP LRQYFKDCVN TENSNNDNCK
101 ISSFGNYGPI LIKSDIFALA SQIKNSHINS EILSVGNYIE WLRPTLNKLT
151 GWQEHLYAGL DPFHYIEVTD NSHVIGQTID LGALELTNSL WKPRWNSNID
201 YLITKNAEIR FNTKNESLLV KEDYAGGARF RFAYDLKDKV PEIPVLTFEK
251 NITGTSDIIF EGKALDNLKH LDGHQIVKVN DTADKDAFRL SSKYRKGIYT
301 LSLQQRPEGF FTKVQERDDI AIYAQQAQAA NTLFALRLND KNSDIFDRTL
351 PRKGLWLRVI DGHSNQWVQG KTAPVEGYRK GVQLGGEVFT WQNESNQLSI
401 GLMGGQAEQR STFRNPDTDN LTTGNVKGFG AGVYATWHQL QDKQTGAYVD
451 SWMQYQRFRH RINTEYATER FTSKGITASI EAGYNALLAE HFTKKGNSLR
501 VYLQPQAQLT YLGVNGKFSD SENAQVNLLG SRQLQSRVGV QAKAQFAFTN
551 GVTFQPFVAV NSIYQQKPFG VEIDGDRRVI NNKTVIETQL GVAAKIKSHL
601 TLQASFNRQT SKHHHAKQGA LNLQWTF*
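Based on this prediction, these proteins from N. meningitidis and N. gonorrhoeae, and their epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies.
Example 55
The following partial DNA sequence was identified in N. meningitidis <SEQ ID 457>: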
1 ..GCGGAATATG TTCAGTTCTC TATAGATTTG TTCAGTGTGG GTAAATCGGG
51 GGGCGGTATA CCTAAGGCTA AGCCTGTGTT TGATGCGAAA CCGAGATGGG
101 AGGTTGATAG GAAGCTTAAT AAATTGACAA CTCGTGAGCA GGTGGAGAAA
151 AATGTTCAGG AAACGAGAAG AAGGAGTCAG AGTAGTCAGT TTAAAGCCCA
201 TGCGCAACGA GAATGGGAAA ATAAAACAGG GTTAGATTTT AATCATTTTA
251 TAGGTGGTGA TATCAATAAA AAAGGCACAG TAACAGGAGG GCATAGTCTA
301 ACCCGTGGTG ATGTACGGGT GATACAACAA ACCTCGGCAC CTGATAAACA
351 TGGGGT.TTA TCAAGCGACA GTGGAAATTN A
This corresponds to the amino acid sequence <SEQ ID 458; ORF46>:
1 ..AEYVQFSIDL FSVGKSGGGI PKAKPVFDAK PRWEVDRKLN KLTTREQVEK
51 NVQETRRRSQ SSQFKAHAQR EWENKTGLDF NHFIGGDINK KGTVTGGHSL
101 TRGDVRVIQQ TSAPDKHGXL SSDSGNX
Further work revealed the partial nucleotide sequence <SEQ ID 459>:
1 ..GCAGTGTGCC TnCCGATGCA TGCACACGCC TCAnATTTGG CAAACGATTC
51 TTTTATCCGG CAGGTTCTCG ACCGTCAGCA TTTCGAACCC GACGGGAAAT
101 ACCACCTATT CGGCAGCAGG GGGGAACTTG CCGAGCGCCA GTCTCATATC
151 GGATTGGGAA AAATACAAAG CCATCAGTTG GGCAACCTGA TGATTCAACA
201 GGCGGCCATT AAAGGAAATA TCGGCTACAT TGTCCGCTTT TCCGATCACG
251 GGCACGAAGT CCATTCCCCs TTCGACAACC ATGCCTCACA TTCCGATTCT
301 GATGAAGCCG GTAGTCCCGT TGACGGATTT AGCCTTTACC GCATCCATTG
351 GGACGGATAC GAACACCATC CCGCCGACGG CTATGACGGG CCACAGGGCG
401 GCGGCTATCC CGCTCCCAAA GGCGCGAGGG ATATATACAG TTACGACATA
451 AAAGGCGTTG CCCAAAATAT CCGCCTCAAC CTGACCGACA ACCGCAGCAC
501 CGGACAACGG CTTGCCGACC GTTTCCACAA TGCCGGTAGT ATGCTGACGC
551 AAGGAGTAGG CGACGGATTC AAACGCGCCA CCCGATACAG CCCCGAGCTG
601 GACAGATCGG GCAATGCCGC CGAAGCCTTC AACGGCACTG CAGATATCGT
651 TAAAAACATC ATCGGCGCTG CAGGAGAAAT TGT
This corresponds to the amino acid sequence <SEQ ID 460; ORF46-1>:
1 ..AVCLPMHAHA SXLANDSFIR QVLDRQHFEP DGKYHLFGSR GELAERQSHI
51 GLGKIQSHQL GNLMIQQAAI KGNIGYIVRF SDHGHEVHSP FDNHASHSDS
101 DEAGSPVDGF SLYRIHWDGY EHHPADGYDG PQGGGYPAPK GARDIYSYDI
151 KGVAQNIRLN LTDNRSTGQR LADRFHNAGS MLTQGVGDGF KRATRYSPEL
201 DRSGNAAEAF NGTADIVKNI IGAAGEI
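Computer analysis of this amino acid sequence gave the following results:
Homology with a predicted ORF from N. gonorrhoeae
ORF46 shows 98.2% identity over a 111 aa overlap with a predicted ORF (ORF46ng) from N. gonorrhoeae: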
orf46.pep AEYVQFSIDLFSVGKSGGGIPKAKPVFDAKPRWEVDRKLNKLTTR 45
||||||||||||||||||||||||||||||
orf46ng PKTGVPFDGKGFPNFEKHVKYDTKLDIQELSGGGIPKAKPVFDAKPRWEVDRKLNKLTTR 217
orf46.pep EQVEKNVQETRRRSQSSQFKAHAQREWENKTGLDFNHFIGGDINKKGTVTGGHSLTRGDV 105
|||||||||||||||||||||||||||||||||||||||||||||||:||||||||||||
orf46ng EQVEKNVQETRRRSQSSQFKAHAQREWENKTGLDFNHFIGGDINKKGAVTGGHSLTRGDV 277
orf46.pep RVIQQTSAPDKHGXLSSDSGN 126
||||||||||||| |||||||
orf46ng RVIQQTSAPDKHGVLSSDSGN 298
The partial ORF46ng nucleotide sequence <SEQ ID 461> is predicted to encode a protein having the partial amino acid sequence <SEQ ID 462>:
1 ..RRLKHCCHAR LGSAFHRKQD GAHQRFGRYG ATQRLCRSSH PRLGSPKPQC
51 RTRHRSRQQY LYGSHPHQRD WSCPGKIQLG RHHGTSCRAV ADXRDRICER
101 EIRRQRQXCR CRLGKIPSLS IPKYPLKLEQ RYGKENITSS TVPPSNGKNV
151 KLADQRHPKT GVPFDGKGFP NFEKHVKYDT KLDIQELSGG GIPKAKPVFD
201 AKPRWEVDRK LNKLTTREQV EKNVQETRRR SQSSQFKAHA QREWENKTGL
251 DFNHFIGGDI NKKGAVTGGH SLTRGDVRVI QQTSAPDKHG VLSSDSGN*
Further work revealed the complete gonococcal DNA sequence <SEQ ID 463>:
1 TTGGGCATTT CCCGCAAAAT ATCCCTTATT CTGTCCATAC TGGCAGTGTG
51 CCTGCCGATG CATGCACACG CCTCAGATTT GGcaAACGAT CCCTTTATCC
101 GgCaggttcT CGaccGTCAG CATTTCGaac ccgacggGAa ATACCaCCTA
151 TTcggCaGCA GGGGGGAGCT TgccnagcGC aacggccATa tcggattggG
201 aaacaTAcaa Agccatcagt tGggccacct gatgattcaa caggcggccg
251 ttgaaggaaA TAtcgGctac attgtccgct tttccgatca cgggcacaaa
301 ttccattcgc ccttcGAcaa ccaTGCCTCA CATTCCGATT CTGACGAAGC
351 CGGTAGTCCC GTTGACGGAT TCAGCCTTTA CCGCATCCAT TGGGACGGAT
401 ACGAACACCA TCCCGCCGAC GGCTATGACG GGCCACAGGG CGGCGGCTAT
451 CCCGCTCCCA AAGGCGCGAG GGATATATAC AGCTACGACA TAAAAGGCGT
501 TGCCCAAAAT ATCCGCCTCA ACCTGACCGA CAACCGCAGC ACCGGACAAC
551 GGCTTGCCGA CCGTTTCCAC AATGCCGGCG CTATGCTGAC GCAAGGAGTA
601 GGCGACGGAT TCAAACGCGC CACCCGATAC AGCCCCGAGC TGGACAGATC
651 GGGCAATGCc gccGAAGCCT TCAACGGCAC TGCAGATATC GTCAAAAACA
701 TCATCGGCGC GGCAGGAGAA ATTGTCGGCG CAGGCGATGC CGTGCagGGT
751 ATAAGCGAAG GCTCAAACAT TGCTGTCATG CACGGCTTGG GTCTGCTTTC
801 CACCGAAAAC AAGATGGCGC GCATCAACGA TTTGGCAGAT ATGGCGCAAC
851 TCAAAGACTA TGCCGCAGCA GCCATCCGCG ATTGGGCAGT CCAAAACCCC
901 AATGCCGCAC AAGGCATAGA AGCCGTCAGC AATATCTTTA TGGCAGCCAT
951 CCCCATCAAA GGGATTGGAG CTGTCCGGGG AAAATACGGC TTGGGCGGCA
1001 TCACGGCACA TCCTGTCAAG CGGTCGCAGA TGGGCGCGAT CGCATTGCCG
1051 AAAGGGAAAT CCGCCGTCAG CGACAATTTT GCCGATGCGG CATACGCCAA
1101 ATACCCGTCC CCTTACCATT CCCGAAATAT CCGTTCAAAC TTGGAGCAGC
1151 GTTACGGCAA AGAAAACATC ACCTCCTCAA CCGTGCCGCC GTCAAACGGC
1201 AAAAATGTCA AACTGGCAGA CCAACGCCAC CCGAAGACAG GCGTACCGTT
1251 TGACGGTAAA GGGTTTCCGA ATTTTGAGAA GCACGTGAAA TATGATACGA
1301 AGCTCGATAT TCAAGAATTA TCGGGGGGCG GTATACCTAA GGCTAAGCCT
1351 GTGTTTGATG CGAAACCGAG ATGGGAGGTT GATAGGAAGC TTAATAAATT
1401 GACAACTCGT GAGCAGGTGG AGAAAAATGT TCAGGAAACG AGAAGAAGGA
1451 GTCAGAGTAG TCAGTTTAAA GCCCATGCGC AACGAGAATG GGAAAATAAA
1501 ACAGGGTTAG ATTTTAATCA TTTTATAGGT GGTGATATCA ATAAGAAAGG
1551 CACAGTAACA GGAGGGCATA GTCTAACCCG TGGTGATGTA CGGGTGATAC
1601 AACAAACCTC GGCACCTGAT AAACATGGGG TTTATCAAGC GACAGTGGAA
1651 ATTAAAAAGC CTGATGGAAG TTGGGAGGTG AAAACGAAAA AAGGTGGGAA
1701 AGTGATGACC AAGCACACCA TGTTCCCAAA AGATTGGGAT GAGGCTAGAA
1751 TTAGGGCTGA AGTTACTTCG GCTTGGGAAA GTAGAATAAT GCTTAAGGAT
1801 AATAAATGGC AGGGTACAAG TAAATCGGGT ATTAAAATAG AAGGATTTAC
1851 CGAACCTAAT AGAACAGCAT ATCCCATTTA TGAATAG
This corresponds to the amino acid sequence <SEQ ID 464; ORF46ng-1>:
1 LGISRKISLI LSILAVCLPM HAHASDLAND PFIRQVLDRQ HFEPDGKYHL
51 FGSRGELAXR NGHIGLGNIQ SHQLGHLMIQ QAAVEGNIGY IVRFSDHGHK
101 FHSPFDNHAS HSDSDEAGSP VDGFSLYRIH WDGYEHHPAD GYDGPQGGGY
151 PAPKGARDIY SYDIKGVAQN IRLNLTDNRS TGQRLADRFH NAGAMLTQGV
201 GDGFKRATRY SPELDRSGNA AEAFNGTADI VKNIIGAAGE IVGAGDAVQG
251 ISEGSNIAVM HGLGLLSTEN KMARINDLAD MAQLKDYAAA AIRDWAVQNP
301 NAAQGIEAVS NIFMAAIPIK GIGAVRGKYG LGGITAHPVK RSQMGAIALP
351 KGKSAVSDNF ADAAYAKYPS PYHSRNIRSN LEQRYGKENI TSSTVPPSNG
401 KNVKLADQRH PKTGVPFDGK GFPNFEKHVK YDTKLDIQEL SGGGIPKAKP
451 VFDAKPRWEV DRKLNKLTTR EQVEKNVQET RRRSQSSQFK AHAQREWENK
501 TGLDFNHFIG GDINKKGTVT GGHSLTRGDV RVIQQTSAPD KHGVYQATVE
551 IKKPDGSWEV KTKKGGKVMT KHTMFPKDWD EARIRAEVTS AWESRIMLKD
601 NKWQGTSKSG IKIEGFTEPN RTAYPIYE*
ORF46ng-1 and ORF46-1 show 94.7% identity over a 227 aa overlap:
10 20 30 40
orf46-1.pep AVCLPMHAHASXLANDSFIRQVLDRQHFEPDGKYHLFGSRGELAER
||||||||||| |||| ||||||||||||||||||||||||||| |
orf46ng-1 LGISRKISLILSILAVCLPMHAHASDLANDPFIRQVLDRQHFEPDGKYHLFGSRGELAXR
10 20 30 40 50 60
50 60 70 80 90 100
orf46-1.pep QSHIGLGKIQSHQLGNLMIQQAAIKGNIGYIVRFSDHGHEVHSPFDNHASHSDSDEAGSP
::|||||:|||||||:|||||||::||||||||||||||: |||||||||||||||||||
orf46ng-1 NGHIGLGNIQSHQLGHLMIQQAAVEGNIGYIVRFSDHGHKFHSPFDNHASHSDSDEAGSP
70 80 90 100 110 120
110 120 130 140 150 160
orf46-1.pep VDGFSLYRIHWDGYEHHPADGYDGPQGGGYPAPKGARDIYSYDIKGVAQNIRLNLTDNRS
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf46ng-1 VDGFSLYRIHWDGYEHHPADGYDGPQGGGYPAPKGARDIYSYDIKGVAQNIRLNLTDNRS
130 140 150 160 170 180
170 180 190 200 210 220
orf46-1.pep TGQRLADRFHNAGSMLTQGVGDGFKRATRYSPELDRSGNAAEAFNGTADIVKNIIGAAGE
|||||||||||||:||||||||||||||||||||||||||||||||||||||||||||||
orf46ng-1 TGQRLADRFHNAGAMLTQGVGDGFKRATRYSPELDRSGNAAEAFNGTADIVKNIIGAAGE
190 200 210 220 230 240
orf46-1.pep I
|
orf46ng-1 IVGAGDAVQGISEGSNIAVMHGLGLLSTENKMARINDLADMAQLKDYAAAAIRDWAVQNP
250 260 270 280 290 300
Homology with a predicted ORF from N. meningitidis (strain A)
ORF46ng-1 shows 87.4% identity over a 486 aa overlap with an ORF (ORF46a) from strain A of N. meningitidis:
10 20 30 40 50 60
orf46a.pep LGISRKISLILSILAVCLPMHAHASDLANDSFIRQVLDRQHFEPDGKYHLFGSRGELAER
|||||||||||||||||||||||||||||| ||||||||||||||||||||||||||| |
orf46ng-1 LGISRKISLILSILAVCLPMHAHASDLANDPFIRQVLDRQHFEPDGKYHLFGSRGELAXR
10 20 30 40 50 60
70 80 90 100 110 120
orf46a.pep SGHIGLGNIQSHQLGNLFIQQAAIKGNIGYIVRFSDHGHEVHSPFDNHASHSDSDEAGSP
:||||||||||||||:|:|||||::||||||||||||||: |||||||||||||||||||
orf46ng-1 NGHIGLGNIQSHQLGHLMIQQAAVEGNIGYIVRFSDHGHKFHSPFDNHASHSDSDEAGSP
70 80 90 100 110 120
130 140 150 160 170 180
orf46a.pep VDGFSLYRIHWDGYEHHPADGYDGPQGGGYPAPKGARDIYSYDIKGVAQNIRLNLTDNRS
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf46ng-1 VDGFSLYRIHWDGYEHHPADGYDGPQGGGYPAPKGARDIYSYDIKGVAQNIRLNLTDNRS
130 140 150 160 170 180
190 200 210 220 230 240
orf46a.pep TGQRLVDRFHNTGSMLTQGVGDGFKRATRYSPELDRSGNAAEAFNGTADIVKNIIGAAGE
|||||:|||||:|:||||||||||||||||||||||||||||||||||||||||||||||
orf46ng-1 TGQRLADRFHNAGAMLTQGVGDGFKRATRYSPELDRSGNAAEAFNGTADIVKNIIGAAGE
190 200 210 220 230 240
250 260 270 280 290 300
orf46a.pep IVGAGDAVQGISEGSNIAVMHGLGLLSTENKMARINDLADMAQLKDYAAAAIRDWAVQNP
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf46ng-1 IVGAGDAVQGISEGSNIAVMHGLGLLSTENKMARINDLADMAQLKDYAAAAIRDWAVQNP
250 260 270 280 290 300
310 320 330 340 350 360
orf46a.pep NAAQGIEAVSNIFTAVIPVKGIGAVRGKYGLGGITAHPVKRSQMGEIALPKGKSAVSDNF
||||||||||||| |:||:|||||||||||||||||||||||||| ||||||||||||||
orf46ng-1 NAAQGIEAVSNIFMAAIPIKGIGAVRGKYGLGGITAHPVKRSQMGAIALPKGKSAVSDNF
310 320 330 340 350 360
370 380 390 400 410 420
orf46a.pep ADAAYAKYPSPYHSRNIRSNLEQRYGKENITSSTVPPSNGKNVKLANKRHPKTKVPFDGK
||||||||||||||||||||||||||||||||||||||||||||||::||||| ||||||
orf46ng-1 ADAAYAKYPSPYHSRNIRSNLEQRYGKENITSSTVPPSNGKNVKLADQRHPKTGVPFDGK
370 380 390 400 410 420
430 440 450 460 470
orf46a.pep GFPNFEKDVKYDTRINTAVPQVN----PIDEPVFN--PKGSVGSAHSWSITARIQYAKLP
||||||| |||||::: : ::: | :|||: |: | : ::|:| | |
orf46ng-1 GFPNFEKHVKYDTKLD--IQELSGGGIPKAKPVFDAKPRWEVDRKLN-KLTTREQVEKNV
430 440 450 460 470
480 490 500 510 520 530
orf46a.pep RQGRIRYIPPKNYSPSAPLPKGPNNGYLDKFGNEWTKGPSRTKGQEFEWDVQLSKTGREQ
:: | |
orf46ng-1 QETRRRSQSSQFKAHAQREWENKTGLDFNHFIGGDINKKGTVTGGHSLTRGDVRVIQQTS
480 490 500 510 520 530
The full-length ORF46a DNA sequence <SEQ ID 465> is:
1 TTGGGCATTT CCCGCAAAAT ATCCCTTATT CTGTCCATAC TGGCAGTGTG
51 CCTGCCGATG CATGCACACG CCTCAGATTT GGCAAACGAT TCTTTTATCC
101 GGCAGGTTCT CGACCGTCAG CATTTCGAAC CCGACGGGAA ATACCACCTA
151 TTCGGCAGCA GGGGGGAACT TGCCGAGCGC AGCGGTCATA TCGGATTGGG
201 AAACATACAA AGCCATCAGT TGGGCAACCT GTTCATCCAG CAGGCGGCCA
251 TTAAAGGAAA TATCGGCTAC ATTGTCCGCT TTTCCGATCA CGGGCACGAA
301 GTCCATTCCC CCTTCGACAA CCATGCCTCA CATTCCGATT CTGATGAAGC
351 CGGTAGTCCC GTTGACGGAT TCAGCCTTTA CCGCATCCAT TGGGACGGAT
401 ACGAACACCA TCCCGCCGAC GGCTATGACG GGCCACAGGG CGGCGGCTAT
451 CCCGCTCCCA AAGGCGCGAG GGATATATAC AGCTACGACA TAAAAGGCGT
501 TGCCCAAAAT ATCCGCCTCA ACCTGACCGA CAACCGCAGC ACCGGACAAC
551 GGCTTGTCGA CCGTTTCCAC AATACCGGTA GTATGCTGAC GCAAGGAGTA
601 GGCGACGGAT TCAAACGCGC CACCCGATAC AGCCCCGAGC TGGACAGATC
651 GGGCAATGCC GCCGAAGCTT TCAACGGCAC TGCAGATATC GTCAAAAACA
701 TCATCGGCGC GGCAGGAGAA ATTGTCGGCG CAGGCGATGC CGTGCAGGGT
751 ATAAGCGAAG GCTCAAACAT TGCTGTTATG CACGGCTTGG GTCTGCTTTC
801 CACCGAAAAC AAGATGGCGC GCATCAACGA TTTGGCAGAT ATGGCGCAAC
851 TCAAAGACTA TGCCGCAGCA GCCATCCGCG ATTGGGCAGT CCAAAACCCC
901 AATGCCGCAC AAGGCATAGA AGCCGTCAGC AATATCTTTA CGGCAGTCAT
951 CCCCGTCAAA GGGATTGGAG CTGTTCGGGG AAAATACGGC TTGGGCGGCA
1001 TCACGGCACA TCCTGTCAAG CGGTCGCAGA TGGGCGAGAT CGCATTGCCG
1051 AAAGGGAAAT CCGCCGTCAG CGACAATTTT GCCGATGCGG CATACGCCAA
1101 ATACCCGTCC CCTTACCATT CCCGAAATAT CCGTTCAAAC TTGGAGCAGC
1151 GTTACGGCAA AGAAAACATC ACCTCCTCAA CCGTGCCGCC GTCAAACGGA
1201 AAGAATGTGA AACTGGCAAA CAAACGCCAC CCGAAGACCA AAGTGCCGTT
1251 TGACGGTAAA GGGTTTCCGA ATTTTGAAAA AGACGTAAAA TACGATACGA
1301 GAATTAATAC CGCTGTACCA CAAGTGAATC CTATAGATGA ACCCGTCTTT
1351 AATCCTAAAG GTTCTGTCGG ATCGGCTCAT TCTTGGTCTA TAACTGCCAG
1401 AATTCAATAC GCAAAATTAC CAAGGCAAGG TAGAATCAGA TATATCCCAC
1451 CTAAAAATTA CTCTCCTTCA GCACCGCTAC CAAAAGGACC TAATAATGGA
1501 TATTTGGATA AATTTGGTAA TGAATGGACT AAAGGTCCAT CAAGAACTAA
1551 AGGTCAAGAA TTTGAATGGG ATGTTCAATT GTCTAAAACA GGAAGAGAGC
1601 AACTTGGATG GGCTAGTAGG GATGGTAAGC ATTTAAATAT ATCAATTGAT
1651 GGAAAGATTA CACACAAATG A
This corresponds to the amino acid sequence <SEQ ID 466>:
1 LGISRKISLI LSILAVCLPM HAHASDLAND SFIRQVLDRQ HFEPDGKYHL
51 FGSRGELAER SGHIGLGNIQ SHQLGNLFIQ QAAIKGNIGY IVRFSDHGHE
101 VHSPFDNHAS HSDSDEAGSP VDGFSLYRIH WDGYEHHPAD GYDGPQGGGY
151 PAPKGARDIY SYDIKGVAQN IRLNLTDNRS TGQRLVDRFH NTGSMLTQGV
201 GDGFKRATRY SPELDRSGNA AEAFNGTADI VKNIIGAAGE IVGAGDAVQG
251 ISEGSNIAVM HGLGLLSTEN KMARINDLAD MAQLKDYAAA AIRDWAVQNP
301 NAAQGIEAVS NIFTAVIPVK GIGAVRGKYG LGGITAHPVK RSQMGEIALP
351 KGKSAVSDNF ADAAYAKYPS PYHSRNIRSN LEQRYGKENI TSSTVPPSNG
401 KNVKLANKRH PKTKVPFDGK GFPNFEKDVK YDTRINTAVP QVNPIDEPVF
451 NPKGSVGSAH SWSITARIQY AKLPRQGRIR YIPPKNYSPS APLPKGPNNG
501 YLDKFGNEWT KGPSRTKGQE FEWDVQLSKT GREQLGWASR DGKHLNISID
551 GKITHK*
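Based on this analysis, including the presence in the gonococcal protein of an RGD sequence, which is typical of adhesins, it is predicted that this protein from N. meningitidis and N. gonorrhoeae, and its epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies.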
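The RGD motif referred to here can be located in the gonococcal sequence with a simple substring scan. The following sketch is illustrative only: `find_motif` is a hypothetical helper, and the fragment is residues 501 to 550 of SEQ ID 464 as printed above, with the `offset` argument used to report positions in the numbering of the full sequence.

```python
def find_motif(protein: str, motif: str = "RGD", offset: int = 1):
    """Return the 1-based start positions of every occurrence of `motif`.

    `offset` is the sequence position of the first character of `protein`,
    so a fragment can be searched while keeping full-length numbering.
    """
    protein = protein.upper()
    motif = motif.upper()
    positions = []
    start = protein.find(motif)
    while start != -1:
        positions.append(start + offset)
        start = protein.find(motif, start + 1)  # also finds overlapping hits
    return positions


# Residues 501-550 of ORF46ng-1 (SEQ ID 464), copied from the listing above.
fragment = "TGLDFNHFIGGDINKKGTVTGGHSLTRGDVRVIQQTSAPDKHGVYQATVE"
hits = find_motif(fragment, "RGD", offset=501)
print(hits)
```

A motif match of this kind only flags a candidate site; whether the tripeptide is exposed and functional in the folded protein is not something a sequence scan can show.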
Example 56
The following partial DNA sequence was identified in N. meningitidis <SEQ ID 467>:
1 ATGAATATTC ACACCCTGCT CTCCAAACAA TGGACGCTGC CGCCATTCCT
51 GCCGAAACGG CTGCTGCTGT CCCTGCTGAT ACTGCTTGCC CCCAATGCGG
101 TGTTTTGGGT TTTGGCACTG CTGACCGCCA CCGCCCGCCC GATTGTCAAT
151 TTGGACTATC TTCCCGCCGC GCTGCTGATC GCCCTGCCTT GGCGTTTCGT
201 CAAAATTGCC GGCGTATTGG CGTTTTGGCT GGCGGTTTTG TTTGACGGGC
251 TGATGATGGT GATCCAACTC TTCCCTTTTA TGGATCTCAT CGGCGCCATC
301 AACCTCGTCC CCTTCATCCT GACCGCCCCC GCCCCTTATC AGATAATGAC
351 CGGGCTG...
This corresponds to the amino acid sequence <SEQ ID 468; ORF48>:
1 MNIHTLLSKQ WTLPPFLPKR LLLSLLILLA PNAVFWVLAL LTATARPIVN
51 LDYLPAALLI ALPWRFVKIA GVLAFWLAVL FDGLMMVIQL FPFMDLIGAI
101 NLVPFILTAP APYQIMTGL...
Further work revealed the complete nucleotide sequence <SEQ ID 469>:
1 ATGAATATTC ACACCCTGCT CTCCAAACAA TGGACGCTGC CGCCATTCCT
51 GCCGAAACGG CTGCTGCTGT CCCTGCTGAT ACTGCTTGCC CCCAATGCGG
101 TGTTTTGGGT TTTGGCACTG CTGACCGCCA CCGCCCGCCC GATTGTCAAT
151 TTGGACTATC TTCCCGCCGC GCTGCTGATC GCCCTGCCTT GGCGTTTCGT
201 CAAAATTGCC GGCGTATTGG CGTTTTGGCT GGCGGTTTTG TTTGACGGGC
251 TGATGATGGT GATCCAACTC TTCCCTTTTA TGGATCTCAT CGGCGCCATC
301 AACCTCGTCC CCTTCATCCT GACCGCCCCC GCCCCTTATC AGATAATGAC
351 CGGGCTGTTG CTGCTGTATA TGCTGGCGAT GCCGTTTGTG TTGCAGAAAG
401 CCGCCGCCAA AACCGACTTC CGGCACATTG CCGTCTGCGC CGCCGTTGTG
451 GCGGCAGCCG GCTATTTCAC CGGCCATTTG AGTTACTACG ACCGGGGTCG
501 GATGGCCAAT ATCTTCGGCG CAAACAACTT CTACTACGCC AAAAGTCAGG
551 CGATGCTCTA CACCGTCAGC CAGAATGCCG ACTTTATTAC CGCCGGCCTG
601 GTCGATCCCG TCTTCCTCCC CTTGGGCAAT CAACAGCGTG CCGCCACGCA
651 TCTGAACGAG CCGAAATCTC AAAAAATCCT CTTTATCGTC GCCGAATCTT
701 GGGGGCTGCC GGCCAATCCC GAACTTCAAA ACGCCACTTT TGCCAAACTG
751 CTGGCGCAAA AAGACCGTTT TTCGGTTTGG GAAAGCGGCA GTTTTCCCTT
801 CATCGGCGCG ACGGTCGAAG GCGAAATGCG CGAACTGTGT GCCTACGGCG
851 GTTTGCGCGG GTTCGCACTG CGCCGCGCGC CCGACGAAAA ATTTGCCCGC
901 TGCCTCCCCA ACCGTTTGAA ACAAGAAGGT TACGCCACCT TTGCGATGCA
951 CGGCGCGGGC AGTTCGCTTT ACGACCGCTT CAGCTGGTAT CCGAGGGCGG
1001 GCTTTCAAGA AATCAAAACC GCCGAAAACC TGATCGGTAA AAAAACCTGC
1051 GCCATTTTCG GCGGCGTGTG CGACAGCGAG CTGTTCGGCG AAGTGTCGGC
1101 ATTTTTCAAA AAACACGACA AGGGACTGTT TTACTGGATG ACGCTGACCA
1151 GCCACGCCGA CTATCCCGAA TCCGACATTT TCAACCACAG GCTCAAATGC
1201 ACCGAATATG GCCTGCCCGC CGAAACCGAC CTCTGCCGCA ATTTCAGCCT
1251 GCACACCCAA TTCTTCGACC AACTGGCGGA TTTGATCCAA CGCCCCGAAA
1301 TGAAAGGCAC GGAAGTCATC ATCGTCGGCG ACCATCCGCC GCCCGTCGGC
1351 AACCTCAATG AAACCTTCCG CTACCTCAAA CAGGGGCACG TCGCCTGGCT
1401 GAACTTCAAA ATCAAATAA
This corresponds to the amino acid sequence <SEQ ID 470; ORF48-1>:
1 MNIHTLLSKQ WTLPPFLPKR LLLSLLILLA PNAVFWVLAL LTATARPIVN
51 LDYLPAALLI ALPWRFVKIA GVLAFWLAVL FDGLMMVIQL FPFMDLIGAI
101 NLVPFILTAP APYQIMTGLL LLYMLAMPFV LQKAAAKTDF RHIAVCAAVV
151 AAAGYFTGHL SYYDRGRMAN IFGANNFYYA KSQAMLYTVS QNADFITAGL
201 VDPVFLPLGN QQRAATHLNE PKSQKILFIV AESWGLPANP ELQNATFAKL
251 LAQKDRFSVW ESGSFPFIGA TVEGEMRELC AYGGLRGFAL RRAPDEKFAR
301 CLPNRLKQEG YATFAMHGAG SSLYDRFSWY PRAGFQEIKT AENLIGKKTC
351 AIFGGVCDSE LFGEVSAFFK KHDKGLFYWM TLTSHADYPE SDIFNHRLKC
401 TEYGLPAETD LCRNFSLHTQ FFDQLADLIQ RPEMKGTEVI IVGDHPPPVG
451 NLNETFRYLK QGHVAWLNFK IK*
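Computer analysis of this amino acid sequence gave the following results:
Homology with a predicted ORF from N. meningitidis (strain A)
ORF48 shows 94.1% identity over a 119 aa overlap with an ORF (ORF48a) from strain A of N. meningitidis: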
10 20 30 40 50 60
orf48.pep MNIHTLLSKQWTLPPFLPKRLLLSLLILLAPNAVFWVLALLTATARPIVNLDYLPAALLI
||||||||||||||||||||||||||||| ||||||||||||||||||||| ||||||||
orf48a MNIHTLLSKQWTLPPFLPKRLLLSLLILLXPNAVFWVLALLTATARPIVNLXYLPAALLI
10 20 30 40 50 60
70 80 90 100 110 119
orf48.pep ALPWRFVKIAGVLAFWLAVLFDGLMMVIQLFPFMDLIGAINLVPFILTAPAPYQIMTGL
||||| ||| |||| ||||||||||||||||||||||||||||||| |||| |||||||
orf48a ALPWRXVKIXGVLAXWLAVLFDGLMMVIQLFPFMDLIGAINLVPFIXTAPALYQIMTGLL
70 80 90 100 110 120
orf48a LLYMLAMPFVLQKAAAKTDFRHIAACAAVVVAAGYFTGHLSXYDRGRMANIFGANNFYYA
130 140 150 160 170 180
The full-length ORF48a nucleotide sequence <SEQ ID 471> is:
1 ATGAATATTC ACACCCTGCT CTCCAAACAA TGGACGCTGC CGCCATTCCT
51 GCCGAAACGG CTGCTGCTGT CCCTGCTGAT ACTGCTNNCC CCCAATGCGG
101 TGTTTTGGGT TTTGGCACTG CTGACCGCCA CCGCCCGCCC GATTGTCAAT
151 TTGGANTACC TTCCCGCCGC GCTGCTGATC GCCCTGCCTT GGCGTNTCGT
201 CAAAATTGNC GGCGTATTGG CGTNTTGGCT GGCGGTTTTG TTTGACGGGC
251 TGATGATGGT GATCCAACTC TTCCCTTTTA TGGATCTCAT CGGCGCCATC
301 AACCTCGTCC CCTTCATCNT GACCGCCCCC GCCCTTTATC AGATAATGAC
351 CGGGCTGTTA CTGCTGTATA TGCTGGCGAT GCCGTTTGTG TTGCAGAAAG
401 CCGCCGCCAA AACCGACTTC CGACACATTG CCGCCTGTGC CGCCGTTGTG
451 GTGGCAGCCG GCTATTTTAC CGGCCATTTG AGTTANTACG ACCGGGGGCG
501 GATGGCCAAT ATCTTCGGCG CAAACAACTT CTATTACGCC AAAAGTCAGG
551 CGATGCTCTA CACCGTCAGC CAGAATGCCG ACTTTATTAC CGCCGGCCTG
601 GTCGATCCCG TCTTCCTCCC CTTGGGCAAT CAACAGCGTG CCGCCACGCA
651 TCTGAACGAG CCGAAATCTC AAAAAATCCT CTTTATCGTC GCCGAATCTT
701 GGGGGCTGCC GGCCAATCCC GAACTTCAAA ACGCCACTTT TGCCAAACTG
751 CTGGCGCAAA AAGANCGTTT TTCGGTTTGG GAAAGCGGCA GTTTTCCCTT
801 CATCGGCGCG ACGATCGAAG GCGAAATGCG CGAACTGTGT GCCTACGGCG
851 GTTTGCGCGG GTTCGCACTG CGCCGCGCGC CCGACGAAAA ATTTGCCCGC
901 TGCCTCCCCA ACCGTTTGAA ACAAGAAGGT TACGCCACCT TTGCGATGCA
951 CGGCGCGGGC AGTTCGCTTT ACGACCGCTT CAGCTGGTAT CCGAGGGCGG
1001 GCTTTCAAGA AATCAAAACC GCCGAAAACC TGATCGGTAA AAAAACCTGC
1051 GCCATTTTCG GCGGCGTGTG CGACAGCGAG CTGTTCGGCG AAGTGTCGGC
1101 ANTTTTCAAA AAACACGACA AGGGACTGTT TTACTGGATG ACGCTGACCA
1151 GCCACGCCGA CTATCCCGAA TCNGACATTT TCAACCACAG GCTCAAATGC
1201 ACCGAATATG GCCTGCCCGC CGAAACCGAC NTCTGCCGCA ATTTCAGCCT
1251 GCACACCCAA TTCTTCGACC AACTGGCGGA TTTGATCCAA CGCCCCGAAA
1301 TGAAAGGCAC GGAAGTCATC ATCGTCGGCG ACCATCCGCC GCCCGTCGGC
1351 AACCTCAATG AAACCTTCCG CTACCTCAAA CAGGGGCACG TCGNCTGGCT
1401 GAACTTCAAA ATCAAATAA
This encodes a protein having the amino acid sequence <SEQ ID 472>:
1 MNIHTLLSKQ WTLPPFLPKR LLLSLLILLX PNAVFWVLAL LTATARPIVN
51 LXYLPAALLI ALPWRXVKIX GVLAXWLAVL FDGLMMVIQL FPFMDLIGAI
101 NLVPFIXTAP ALYQIMTGLL LLYMLAMPFV LQKAAAKTDF RHIAACAAVV
151 VAAGYFTGHL SXYDRGRMAN IFGANNFYYA KSQAMLYTVS QNADFITAGL
201 VDPVFLPLGN QQRAATHLNE PKSQKILFIV AESWGLPANP ELQNATFAKL
251 LAQKXRFSVW ESGSFPFIGA TIEGEMRELC AYGGLRGFAL RRAPDEKFAR
301 CLPNRLKQEG YATFAMHGAG SSLYDRFSWY PRAGFQEIKT AENLIGKKTC
351 AIFGGVCDSE LFGEVSAXFK KHDKGLFYWM TLTSHADYPE SDIFNHRLKC
401 TEYGLPAETD XCRNFSLHTQ FFDQLADLIQ RPEMKGTEVI IVGDHPPPVG
451 NLNETFRYLK QGHVXWLNFK IK*
ORF48a and ORF48-1 show 96.8% identity over a 472 aa overlap:
10 20 30 40 50 60
orf48a.pep MNIHTLLSKQWTLPPFLPKRLLLSLLILLXPNAVFWVLALLTATARPIVNLXYLPAALLI
||||||||||||||||||||||||||||| ||||||||||||||||||||| ||||||||
orf48-1 MNIHTLLSKQWTLPPFLPKRLLLSLLILLAPNAVFWVLALLTATARPIVNLDYLPAALLI
10 20 30 40 50 60
70 80 90 100 110 120
orf48a.pep ALPWRXVKIXGVLAXWLAVLFDGLMMVIQLFPFMDLIGAINLVPFIXTAPALYQIMTGLL
||||| ||| |||| ||||||||||||||||||||||||||||||| |||| ||||||||
orf48-1 ALPWRFVKIAGVLAFWLAVLFDGLMMVIQLFPFMDLIGAINLVPFILTAPAPYQIMTGLL
70 80 90 100 110 120
130 140 150 160 170 180
orf48a.pep LLYMLAMPFVLQKAAAKTDFRHIAACAAVVVAAGYFTGHLSXYDRGRMANIFGANNFYYA
||||||||||||||||||||||||:|||||:|||||||||| ||||||||||||||||||
orf48-1 LLYMLAMPFVLQKAAAKTDFRHIAVCAAVVAAAGYFTGHLSYYDRGRMANIFGANNFYYA
130 140 150 160 170 180
190 200 210 220 230 240
orf48a.pep KSQAMLYTVSQNADFITAGLVDPVFLPLGNQQRAATHLNEPKSQKILFIVAESWGLPANP
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf48-1 KSQAMLYTVSQNADFITAGLVDPVFLPLGNQQRAATHLNEPKSQKILFIVAESWGLPANP
190 200 210 220 230 240
250 260 270 280 290 300
orf48a.pep ELQNATFAKLLAQKXRFSVWESGSFPFIGATIEGEMRELCAYGGLRGFALRRAPDEKFAR
|||||||||||||| ||||||||||||||||:||||||||||||||||||||||||||||
orf48-1 ELQNATFAKLLAQKDRFSVWESGSFPFIGATVEGEMRELCAYGGLRGFALRRAPDEKFAR
250 260 270 280 290 300
310 320 330 340 350 360
orf48a.pep CLPNRLKQEGYATFAMHGAGSSLYDRFSWYPRAGFQEIKTAENLIGKKTCAIFGGVCDSE
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf48-1 CLPNRLKQEGYATFAMHGAGSSLYDRFSWYPRAGFQEIKTAENLIGKKTCAIFGGVCDSE
310 320 330 340 350 360
370 380 390 400 410 420
orf48a.pep LFGEVSAXFKKHDKGLFYWMTLTSHADYPESDIFNHRLKCTEYGLPAETDXCRNFSLHTQ
||||||| |||||||||||||||||||||||||||||||||||||||||| |||||||||
orf48-1 LFGEVSAFFKKHDKGLFYWMTLTSHADYPESDIFNHRLKCTEYGLPAETDLCRNFSLHTQ
370 380 390 400 410 420
430 440 450 460 470
orf48a.pep FFDQLADLIQRPEMKGTEVIIVGDHPPPVGNLNETFRYLKQGHVXWLNFKIKX
|||||||||||||||||||||||||||||||||||||||||||| ||||||||
orf48-1 FFDQLADLIQRPEMKGTEVIIVGDHPPPVGNLNETFRYLKQGHVAWLNFKIKX
430 440 450 460 470
Homology with a predicted ORF from N. gonorrhoeae
ORF48 shows 97.5% identity over a 119 aa overlap with a predicted ORF (ORF48ng) from N. gonorrhoeae:
orf48.pep MNIHTLLSKQWTLPPFLPKRLLLSLLILLAPNAVFWVLALLTATARPIVNLDYLPAALLI 60
||||:|||:|||||||||||||||||||||||||||||||||||||||||||||||||||
orf48ng MNIHALLSEQWTLPPFLPKRLLLSLLILLAPNAVFWVLALLTATARPIVNLDYLPAALLI 60
orf48.pep ALPWRFVKIAGVLAFWLAVLFDGLMMVIQLFPFMDLIGAINLVPFILTAPAPYQIMTGL 119
|||||||||||||||| ||||||||||||||||||||||||||||||||||||||||||
orf48ng ALPWRFVKIAGVLAFWPAVLFDGLMMVIQLFPFMDLIGAINLVPFILTAPAPYQIMTGLL 120
The ORF48ng nucleotide sequence <SEQ ID 473> is predicted to encode a protein having the amino acid sequence <SEQ ID 474>:
1
RPTVN
51 LDYLPAALLI ALPWRFVKIA GVLAFWPAVL FDGLMMVIQL FPFMDLIGAI
101 NLVPFILTAP APYQIMTGLL LLYMLAMPFV LQKAAVKTDF RHIAVCAAVV
151 AAARYFTGPF ELLRTGGRWQ YVQHRRLLLS GSRASFRRRQ KADVLRRLGN
201 PYASMGNGG..
Further work identified the complete gonococcal DNA sequence <SEQ ID 475>:
1 ATGAATATTC ACGCCCTGCT CTCCGAACAA TGGACGCTGC CGCCATTCCT
51 GCCGAAACGG CTGCTGCTGT CCCTGCTGAT ACTGCTGGCC CCCAATGCGG
101 TGTTTTGGGT TTTGGCACTG CTGACCGCCA CCGCCCGCCC GATTGTCAAT
151 TTGGACTACC TTCCCGCCGC GCTGCTGATC GCCCTGCCTT GGCGTTTCGT
201 CAAAATTGCC GGCGTATTGG CGTTTTGGCC GGCGGTTTTG TTTGACGGGC
251 TGATGATGGT GATCCAACTC TTCCCTTTTA TGGACCTCAT CGGCGCCATC
301 AACCTCGTCC CCTTCATCCT GACCGCCCCC GCCCCTTATC AGATAATGAC
351 CGGGCTGTTG CTGCTGTATA TGCTGGCGAT GCCGTTTGTG TTGCAAAAAG
401 CCGCCGTCAA AACCGACTTC CGACACATTG CCGTCTGTGC CGCCGTTGTG
451 GCGGCAGCCG GCTATTTCAC CGGCCATTTG AGTTACTACG ACCGGGGGCG
501 GATGGCCAAT ATCTTCGGCG CAAACAACTT CTATTACGCc aAAAGTCAGG
551 CGATGCTCTA CACCGTCAGC CAGAATGCCG ACTTTATTAC CGCCGgcctG
601 GTCGACCCCG TCTTCCTCCC CTTGGGCAAT CAGCAGCGTG CCGCCACGCG
651 GCTGAGTGAG CCGAAATCTC AAAAAATCCT CTTTATCGTC GCCGAATCTT
701 GGGGGCTGCC GGGCAATCCC GAGCTTCAAA ACGCCACTTT TGCCAAACTG
751 CTGGCGCAAA AAGACCGTTT TTCGGTTTGG GAAAGCGGCA GTTTTCCCTT
801 CATCGGCGCG ACGGTCGAAG GCGAAATGCG CGAATTGTGC GCCTACGGCG
851 GTTTGCGCGG GTTCGCACTG CGCCGCGCGC CCGACGAAAA ATTTGCCCGC
901 TGCCTCCCCA ACCGTTTGAA ACAAGAAGGT TACGCCACCT TTGCGATGCA
951 CGGCGCGGGT AGTTCGCTTT ACGACCGCTT CAGCTGGTAT CCGAGGGCGG
1001 GCTTTCAAAA AATCAAAACC GCCGAAAACC TGATCGGTAA AAAAACCTGC
1051 GCCATTTTCG GCGGCGTGTG CGACAGCGAG CTGTTCGGCG AAGTGTCGGC
1101 ATTTTTCAAA AAACACGACA AGGGACTGTT TTACTGGATG ACGCTGACCA
1151 GCCACGCCGA CTATCCCGAA TCCGACATTT TCAACCACAG GCTCAAATGC
1201 ACCGAATACG GCCTGCCCGC CGAAACCGAC CTCTGCCGCA ATTTCAGCCT
1251 GCACACCCAA TtcttcgACC AACTGGCGGA TTTGATCCGA CGCCCCGAAA
1301 TGAAAGGCAC GGAAGTCATC ATCGTCGGCG ACCATCCGCC GCCCGTCGGC
1351 AACCTCAATG AAACCTTCCG CTACCTCAAA CAGGGACACG TCGCCTGGCT
1401 GCACTTCAAA ATCAAATAA
This encodes a protein having the amino acid sequence <SEQ ID 476; ORF48ng-1>:
1 MNIHALLSEQ WTLPPFLPKR LLLSLLILLA PNAVFWVLAL LTATARPIVN
51 LDYLPAALLI ALPWRFVKIA GVLAFWPAVL FDGLMMVIQL FPFMDLIGAI
101 NLVPFILTAP APYQIMTGLL LLYMLAMPFV LQKAAVKTDF RHIAVCAAVV
151 AAAGYFTGHL SYYDRGRMAN IFGANNFYYA KSQAMLYTVS QNADFITAGL
201 VDPVFLPLGN QQRAATRLSE PKSQKILFIV AESWGLPGNP ELQNATFAKL
251 LAQKDRFSVW ESGSFPFIGA TVEGEMRELC AYGGLRGFAL RRAPDEKFAR
301 CLPNRLKQEG YATFAMHGAG SSLYDRFSWY PRAGFQKIKT AENLIGKKTC
351 AIFGGVCDSE LFGEVSAFFK KHDKGLFYWM TLTSHADYPE SDIFNHRLKC
401 TEYGLPAETD LCRNFSLHTQ FFDQLADLIR RPEMKGTEVI IVGDHPPPVG
451 NLNETFRYLK QGHVAWLHFK IK*
ORF48ng-1 and ORF48-1 show 97.9% identity over a 472 aa overlap:
10 20 30 40 50 60
orf48-1.pep MNIHTLLSKQWTLPPFLPKRLLLSLLILLAPNAVFWVLALLTATARPIVNLDYLPAALLI
||||:|||:|||||||||||||||||||||||||||||||||||||||||||||||||||
orf48ng-1 MNIHALLSEQWTLPPFLPKRLLLSLLILLAPNAVFWVLALLTATARPIVNLDYLPAALLI
10 20 30 40 50 60
70 80 90 100 110 120
orf48-1.pep ALPWRFVKIAGVLAFWLAVLFDGLMMVIQLFPFMDLIGAINLVPFILTAPAPYQIMTGLL
|||||||||||||||| |||||||||||||||||||||||||||||||||||||||||||
orf48ng-1 ALPWRFVKIAGVLAFWPAVLFDGLMMVIQLFPFMDLIGAINLVPFILTAPAPYQIMTGLL
70 80 90 100 110 120
130 140 150 160 170 180
orf48-1.pep LLYMLAMPFVLQKAAAKTDFRHIAVCAAVVAAAGYFTGHLSYYDRGRMANIFGANNFYYA
|||||||||||||||:||||||||||||||||||||||||||||||||||||||||||||
orf48ng-1 LLYMLAMPFVLQKAAVKTDFRHIAVCAAVVAAAGYFTGHLSYYDRGRMANIFGANNFYYA
130 140 150 160 170 180
190 200 210 220 230 240
orf48-1.pep KSQAMLYTVSQNADFITAGLVDPVFLPLGNQQRAATHLNEPKSQKILFIVAESWGLPANP
||||||||||||||||||||||||||||||||||||:|:||||||||||||||||||:||
orf48ng-1 KSQAMLYTVSQNADFITAGLVDPVFLPLGNQQRAATRLSEPKSQKILFIVAESWGLPGNP
190 200 210 220 230 240
250 260 270 280 290 300
orf48-1.pep ELQNATFAKLLAQKDRFSVWESGSFPFIGATVEGEMRELCAYGGLRGFALRRAPDEKFAR
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf48ng-1 ELQNATFAKLLAQKDRFSVWESGSFPFIGATVEGEMRELCAYGGLRGFALRRAPDEKFAR
250 260 270 280 290 300
310 320 330 340 350 360
orf48-1.pep CLPNRLKQEGYATFAMHGAGSSLYDRFSWYPRAGFQEIKTAENLIGKKTCAIFGGVCDSE
||||||||||||||||||||||||||||||||||||:|||||||||||||||||||||||
orf48ng-1 CLPNRLKQEGYATFAMHGAGSSLYDRFSWYPRAGFQKIKTAENLIGKKTCAIFGGVCDSE
310 320 330 340 350 360
370 380 390 400 410 420
orf48-1.pep LFGEVSAFFKKHDKGLFYWMTLTSHADYPESDIFNHRLKCTEYGLPAETDLCRNFSLHTQ
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf48ng-1 LFGEVSAFFKKHDKGLFYWMTLTSHADYPESDIFNHRLKCTEYGLPAETDLCRNFSLHTQ
370 380 390 400 410 420
430 440 450 460 470
orf48-1.pep FFDQLADLIQRPEMKGTEVIIVGDHPPPVGNLNETFRYLKQGHVAWLNFKIKX
|||||||||:|||||||||||||||||||||||||||||||||||||:|||||
orf48ng-1 FFDQLADLIRRPEMKGTEVIIVGDHPPPVGNLNETFRYLKQGHVAWLHFKIKX
430 440 450 460 470
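Based on this analysis, including the presence of a putative leader sequence (double-underlined) and two putative transmembrane domains (single-underlined) in the gonococcal protein, it is predicted that this protein from N. meningitidis and N. gonorrhoeae, and its epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies.
Example 57
The following partial DNA sequence was identified in N. meningitidis <SEQ ID 477>: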
1 ..GTGAGCGGAC GTTACCGCGC TTTGGATCGC GTTTCCAAAA TCATCATCGT
51 TACTTTGAGT ATCGCCACGC TTGCCGCCGC CGGCATCGCT ATGTCGCGCG
101 GTATGCAGAT GCAGTCCGAT TTTATCGAGC CGACACCGTG GACGCTTGCC
151 GGTTTGGGCT TCCTGATCGC GCTGATGGGC TGGATGCCCG CGCCGATTGA
201 AATTTCCGCC ATCAATTCTT TGTGGGTAAC CGAAAAACAA CGCATCAATC
251 CTTCCGAATA CCGCGACGGG ATTTTTGAAT TCAACGTCGG TTATATCGCC
301 AGTGCGGTTT TGGCTTTGGT TTTCCTTGCA CTGGGCGC.G TAGCGCCGAA
351 CGGCAACGGC GA.ACAGTGC AGATGGCGGG CGGCAAATAT AACGGGCAAT
401 TGATCAATAT GTACGCC..
This corresponds to the amino acid sequence <SEQ ID 478; ORF53>:
1 ..VSGRYRALDR VSKIIIVTLS IATLAAAGIA MSRGMQMQSD FIEPTPWTLA
51 GLGFLIALMG WMPAPIEISA INSLWVTEKQ RINPSEYRDG IFEFNVGYIA
101 SAVLALVFLA LGXVAPNGNG XTVQMAGGKY NGQLINMYA..
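The correspondence between the partial DNA sequence <SEQ ID 477> and ORF53 can be checked by translating the DNA codon by codon in the reading frame shown. The following is a minimal sketch, assuming the standard codon assignments and treating any codon that contains an undetermined base (written as "." or "N" in the listings) as X. The function name `translate` is illustrative and is not part of the original disclosure.

```python
from itertools import product

# Standard codon assignments, enumerated in T, C, A, G order for each position.
BASES = "TCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def translate(dna):
    """Translate DNA in frame 1; codons with undetermined bases become 'X'."""
    seq = "".join(dna.split()).upper()  # drop the 10-base block spacing
    protein = []
    for i in range(0, len(seq) - 2, 3):
        protein.append(CODON_TABLE.get(seq[i:i + 3], "X"))
    return "".join(protein)


# First 30 bases of <SEQ ID 477>, as printed in the listing above.
print(translate("GTGAGCGGAC GTTACCGCGC TTTGGATCGC"))  # VSGRYRALDR
```

An undetermined base propagates to a single X in the protein, which is how the X residues in ORF53 (for example at position 113) arise from the "." characters in the DNA listing.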
Further work revealed the complete nucleotide sequence <SEQ ID 479>:
1 ATGTCCGAAC AACATATTTC GACTTGGAAA AGTAAAATCA ACGCATTGGG
51 TCCGGGGATC ATGATGGCTT CGGCGGCGGT CGGCGGTTCG CACCTGATTG
101 CCTCGACGCA GGCGGGCGCG CTTTACGGCT GGCAGATCGC GCTCATCATC
151 ATCCTGACCA ACCTCTTCAA ATACCCGTTT TTCCGCTTCA GCGCGCATTA
201 CACGCTGGAC ACGGGCAAGA GCCTGATTGA AGGTTATGCC GAGAAAAGCC
251 GCGTTTATTT GTGGGTATTC CTGATTTTGT GCATCCTCTC CGCCACGATT
301 AACGCGGGCG CGGTCGCCAT TGTAACCGCC GCCATCGTCA AAATGGCGAT
351 TCCCTCGCTG ATGTTTGATG CCGGCACGGT TGCCGCCTTG ATTATGGCAT
401 CCTGCCTGAT TATTTTGGTG AGCGGACGTT ACCGCGCTTT GGATCGCGTT
451 TCCAAAATCA TCATCGTTAC TTTGAGTATC GCCACGCTTG CCGCCGCCGG
501 CATCGCTATG TCGCGCGGTA TGCAGATGCA GTCCGATTTT ATCGAGCCGA
551 CACCGTGGAC GCTTGCCGGT TTGGGCTTCC TGATCGCGCT GATGGGCTGG
601 ATGCCCGCGC CGATTGAAAT TTCCGCCATC AATTCTTTGT GGGTAACCGA
651 AAAACAACGC ATCAATCCTT CCGAATACCG CGACGGGATT TTTGATTTCA
701 ACGTCGGTTA TATCGCCAGT GCGGTTTTGG CTTTGGTTTT CCTTGCACTG
751 GGCGCGTTTG TGCAATACGG CAACGGCGAA GCAGTGCAGA TGGCGGGCGG
801 CAAATATATC GGGCAATTGA TCAATATGTA CGCCGTTACC ATCGGCGGCT
851 GGTCGCGCCC GCTGGTGGCG TTTATCGCGT TTGCCTGTAT GTACGGCACG
901 ACGATTACCG TCGTGGACGG CTATGCCCGT GCCATTGCCG AACCCGTGCG
951 CCTGCTGCGC GGAAAAGACA AAACGGGCAA CGCCGAATTC TTTGCCTGGA
1001 ATATTTGGGT GGCGGGCAGC GGTTTGGCGG TGATTTTCTG GTTTGACGGC
1051 GTAATGGCGA ATCTGCTCAA ATTTGCGATG ATTGCCGCTT TTGTGTCCGC
1101 CCCTGTGTTT GCCTGGCTGA ATTACCGTTT GGTTAAAGGT GATGAAAAAC
1151 ACAAACTCAC ATCAGGTATG AATGCCCTTG CATTGGCAGG CTTGATTTAT
1201 CTGACCGGTT TTACCGTTTT GTTCTTATTG AATTTGGCGG GAATGTTCAA
1251 ATGA
This corresponds to the amino acid sequence <SEQ ID 480; ORF53-1>:
1 MSEQHISTWK SKINALGPGI MMASAAVGGS HLIASTQAGA LYGWQIALII
51 ILTNLFKYPF FRFSAHYTLD TGKSLIEGYA EKSRVYLWVF LILCILSATI
101 NAGAVAIVTA AIVKMAIPSL MFDAGTVAAL IMASCLIILV SGRYRALDRV
151 SKIIIVTLSI ATLAAAGIAM SRGMQMQSDF IEPTPWTLAG LGFLIALMGW
201 MPAPIEISAI NSLWVTEKQR INPSEYRDGI FDFNVGYIAS AVLALVFLAL
251 GAFVQYGNGE AVQMAGGKYI GQLINMYAVT IGGWSRPLVA FIAFACMYGT
301 TITVVDGYAR AIAEPVRLLR GKDKTGNAEF FAWNIWVAGS GLAVIFWFDG
351 VMANLLKFAM IAAFVSAPVF AWLNYRLVKG DEKHKLTSGM NALALAGLIY
401 LTGFTVLFLL NLAGMFK*
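Computer analysis of this amino acid sequence gave the following results:
Homology with a predicted ORF from N. meningitidis (strain A)
ORF53 shows 93.5% identity over a 139 aa overlap with an ORF (ORF53a) from strain A of N. meningitidis: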
10 20 30
orf53.pep VSGRYRALDRVSKIIIVTLSIATLAAAGIA
||||||||||||||||||||||||||||||
orf53a AAIVKMAIPSLMFDAGTVAALIMASCLIILVSGRYRALDRVSKIIIVTLSIATLAAAGIA
110 120 130 140 150 160
40 50 60 70 80 90
orf53.pep MSRGMQMQSDFIEPTPWTLAGLGFLIALMGWMPAPIEISAINSLWVTEKQRINPSEYRDG
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf53a MSRGMQMQSDFIEPTPWTLAGLGFLIALMGWMPAPIEISAINSLWVTEKQRINPSEYRDG
170 180 190 200 210 220
100 110 120 130 139
orf53.pep IFEFNVGYIASAVLALVFLALGXVAPNGNGXTVQMAGGKYNGQLINMYA
||:||||||||||||||||||| : ||| :|||||||| ||||||||
orf53a IFDFNVGYIASAVLALVFLALGAFVQYGNGEAVQMAGGKYIGQLINMYAVTIGGWSRPLV
230 240 250 260 270 280
orf53a AFIAFACMYGTTITVVDGYARAIAEPVRLLRGKDKTGNAEFFAWNIWVAGSGLAVIFWFD
290 300 310 320 330 340
The complete length ORF53a nucleotide sequence <SEQ ID 481> is:
1 ATGTCCGAAC AACATATTTC GACTTGGAAA AGTAAAATCA ACGCATTGGG
51 ACCGGGGATT ATGATGGCTT CGGCGGCGGT CGGCGGTTCG CACCTGATTG
101 CCTCGACGCA GGCGGGCGCG CTTTACGGCT GGCAGATCGC GCTCATCATC
151 ATCCTGACCA ACCTCTTCAA ATACCCGTTT TTCCGCTTCA GCGCGCATTA
201 CACGCTGGAC ACGGGCAAGA GCCTGATTGA AGGTTATGCC GAGAAAAGCC
251 GCGTTTATTT GTGGGTATTC CTGATTTTGT GCATCCTCTC CGCCACGATT
301 AACGCGGGCG CGGTCGCCAT TGTAACCGCC GCCATCGTCA AAATGGCGAT
351 TCCCTCGCTG ATGTTTGATG CCGGCACGGT TGCCGCCTTG ATTATGGCAT
401 CCTGCCTGAT TATTTTGGTG AGCGGACGTT ACCGCGCTTT GGATCGCGTT
451 TCCAAAATCA TCATCGTTAC TTTGAGTATC GCCACGCTTG CCGCCGCCGG
501 CATCGCTATG TCGCGCGGTA TGCAGATGCA GTCCGATTTT ATCGAGCCGA
551 CACCGTGGAC GCTTGCCGGT TTGGGCTTCC TGATCGCGCT GATGGGCTGG
601 ATGCCCGCGC CGATTGAAAT TTCCGCCATC AATTCTTTGT GGGTAACCGA
651 AAAACAACGC ATCAATCCTT CCGAATACCG CGACGGGATT TTTGATTTCA
701 ACGTCGGTTA TATCGCCAGT GCGGTTTTGG CTTTGGTTTT CCTTGCACTG
751 GGCGCGTTTG TGCAATACGG CAACGGCGAA GCAGTGCAGA TGGCGGGCGG
801 CAAATATATC GGGCAATTGA TCAATATGTA CGCCGTTACC ATCGGCGGCT
851 GGTCGCGCCC GCTGGTGGCG TTTATCGCGT TTGCCTGTAT GTACGGCACG
901 ACGATTACCG TTGTGGACGG CTATGCCCGT GCCATTGCCG AACCCGTGCG
951 CCTGCTGCGC GGAAAAGACA AAACGGGCAA CGCCGAATTC TTTGCCTGGA
1001 ATATTTGGGT GGCGGGCAGC GGTTTGGCGG TGATTTTCTG GTTTGACGGC
1051 GTAATGGCGA ATCTGCTCAA ATTTGCGATG ATTGCCGCTT TTGTGTCCGC
1101 CCCTGTGTTT GCCTGGCTGA ATTACCGTTT GGTCAAAGGT GATGAAAAAC
1151 ACAAACTCAC ATCAGGTATG AATGCCCTTG CATTGGCAGG CTTGATTTAT
1201 CTGACCGGTT TTACCGTTTT GTTCTTATTG AATTTGGCGG GAATGTTCAA
1251 ATGA
This encodes a protein having the amino acid sequence <SEQ ID 482>:
1 MSEQHISTWK SKINALGPGI MMASAAVGGS HLIASTQAGA LYGWQIALII
51 ILTNLFKYPF FRFSAHYTLD TGKSLIEGYA EKSRVYLWVF LILCILSATI
101 NAGAVAIVTA AIVKMAIPSL MFDAGTVAAL IMASCLIILV SGRYRALDRV
151 SKIIIVTLSI ATLAAAGIAM SRGMQMQSDF IEPTPWTLAG LGFLIALMGW
201 MPAPIEISAI NSLWVTEKQR INPSEYRDGI FDFNVGYIAS AVLALVFLAL
251 GAFVQYGNGE AVQMAGGKYI GQLINMYAVT IGGWSRPLVA FIAFACMYGT
301 TITVVDGYAR AIAEPVRLLR GKDKTGNAEF FAWNIWVAGS GLAVIFWFDG
351 VMANLLKFAM IAAFVSAPVF AWLNYRLVKG DEKHKLTSGM NALALAGLIY
401 LTGFTVLFLL NLAGMFK*
ORF53a and ORF53-1 show 100.0% identity over a 417 aa overlap:
10 20 30 40 50 60
orf53a.pep MSEQHISTWKSKINALGPGIMMASAAVGGSHLIASTQAGALYGWQIALIIILTNLFKYPF
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf53-1 MSEQHISTWKSKINALGPGIMMASAAVGGSHLIASTQAGALYGWQIALIIILTNLFKYPF
10 20 30 40 50 60
70 80 90 100 110 120
orf53a.pep FRFSAHYTLDTGKSLIEGYAEKSRVYLWVFLILCILSATINAGAVAIVTAAIVKMAIPSL
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf53-1 FRFSAHYTLDTGKSLIEGYAEKSRVYLWVFLILCILSATINAGAVAIVTAAIVKMAIPSL
70 80 90 100 110 120
130 140 150 160 170 180
orf53a.pep MFDAGTVAALIMASCLIILVSGRYRALDRVSKIIIVTLSIATLAAAGIAMSRGMQMQSDF
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf53-1 MFDAGTVAALIMASCLIILVSGRYRALDRVSKIIIVTLSIATLAAAGIAMSRGMQMQSDF
130 140 150 160 170 180
190 200 210 220 230 240
orf53a.pep IEPTPWTLAGLGFLIALMGWMPAPIEISAINSLWVTEKQRINPSEYRDGIFDFNVGYIAS
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf53-1 IEPTPWTLAGLGFLIALMGWMPAPIEISAINSLWVTEKQRINPSEYRDGIFDFNVGYIAS
190 200 210 220 230 240
250 260 270 280 290 300
orf53a.pep AVLALVFLALGAFVQYGNGEAVQMAGGKYIGQLINMYAVTIGGWSRPLVAFIAFACMYGT
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf53-1 AVLALVFLALGAFVQYGNGEAVQMAGGKYIGQLINMYAVTIGGWSRPLVAFIAFACMYGT
250 260 270 280 290 300
310 320 330 340 350 360
orf53a.pep TITVVDGYARAIAEPVRLLRGKDKTGNAEFFAWNIWVAGSGLAVIFWFDGVMANLLKFAM
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf53-1 TITVVDGYARAIAEPVRLLRGKDKTGNAEFFAWNIWVAGSGLAVIFWFDGVMANLLKFAM
310 320 330 340 350 360
370 380 390 400 410
orf53a.pep IAAFVSAPVFAWLNYRLVKGDEKHKLTSGMNALALAGLIYLTGFTVLFLLNLAGMFKX
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf53-1 IAAFVSAPVFAWLNYRLVKGDEKHKLTSGMNALALAGLIYLTGFTVLFLLNLAGMFKX
370 380 390 400 410
Homology with a predicted ORF from N. gonorrhoeae
ORF53 shows 92.1% identity over a 139 aa overlap with a predicted ORF (ORF53ng) from N. gonorrhoeae:
orf53.pep VSGRYRALDRVSKIIIVTLSIATLAAAGIA 30
||||||||||||||||||||||||||||||
orf53ng AAIVKMAIPSLMFDAGTVAALIMASCLIILVSGRYRALDRVSKIIIVTLSIATLAAAGIA 91
orf53.pep MSRGMQMQSDFIEPTPWTLAGLGFLIALMGWMPAPIEISAINSLWVTEKQRINPSEYRDG 90
|||||||| |||||||||||||||||||||||||||||||||||||||||||||||||||
orf53ng MSRGMQMQPDFIEPTPWTLAGLGFLIALMGWMPAPIEISAINSLWVTEKQRINPSEYRDG 151
orf53.pep IFEFNVGYIASAVLALVFLALGXVAPNGNGXTVQMAGGKYNGQLINMYA 139
||:||||||||||||||||||| : ||| :|||:|||| ||||||||
orf53ng IFDFNVGYIASAVLALVFLALGAFVQYGNGEAVQMGGGKYIGQLINMYAVTIGGGSRPLV 211
The predicted ORF53ng nucleotide sequence <SEQ ID 483> encodes a protein having the amino acid sequence <SEQ ID 484>:
1
51 ALIMASCLII LVSGRYRALD RVSKIIIVTL SIATLAAAGI AMSRGMQMQP
101 DFIEPTPWTL AGLGFLIALM GWMPAPIEIS AINSLWVTEK QRINPSEYRD
151 GIFDFNVGYI ASAVLALVFL ALGAFVQYGN GEAVQMGGGK YIGQLINMYA
201 VTIGGGSRPL VAFIAFACMY GAASTVVDGY ARAIAEPVRL LRGKDKTARP
251 IVLLEKLGGR HRFGRDFLV*
Further analysis revealed the following partial gonococcal DNA sequence <SEQ ID 485>:
1 ..aagaAAAGCT GCGTTTATTT GTGGGTTTTT TTGATTTTGT GTATCGCCTC
51 CGCCACGATT AACGCGGGCG CGGTCGCCAT TGTAACCGCC GCCATCGTCA
101 AAATGGCGAT TCCCTCGCTG ATGTTTGATG CCGGCACGGT TGCCGCCTTG
151 ATTATGGCAT CCTGCCTGAT TATTTTGGTG AGCGGACGTT ACCGCGCTTT
201 GGATCGTGTT TCCAAAATCA TCATTGTTAC TTTGAGCATC GCCACGCTTG
251 CCGCCGCCGG CATCGCTATG TCGCGCGGTA TGCAGATGCA GCCCGATTTT
301 ATCGAGCCGA CACCGTGGAC GCTTGCCGGT TTGGGCTTCC TGATCGCGCT
351 GATGGGCTGG ATGCCCGCGC CGATCGAAAT TTCCGCCATC AATTCTTTGT
401 GGGTAACCGA AAAACAACGC ATCAATCCTT CTGAATACCG CGACGGGATT
451 TTCGATTTCA ACGTCGGTTA TATCGCcagT GCGGTTTTGG CTTTGGTTTT
501 CCTTGCACTG GGCGCGTTTG TGCAATACGG CAACGGCGAA GCAGTGCAGA
551 TGGCGGGCGG CAAATATATC GGGCAATTGA TTAATATGTA TGCCGTAACC
601 ATCGGCGGCT GGTCTCGTCC GCTGGTGGCG TTTATCGCGT TTGCCTGTAT
651 GTACGGCACG ACGATTACCG TTGTGGACGG TTATGCGCGT GCCATTGCCG
701 AACCCGTGCG CCTGCTGCGC GGCAGGGATA AAACCGGCAA CGCCGAGTTG
751 TTtgccTGGA ATATTTGGGT GGCGGGCAGC GGTTTGGCGG TGATTTTCTG
801 GTTTGACggc gcaaTGGCgG AACtgcTCAA ATTTGCGATG ATtgccgcCT
851 TTGTGTCCGC CCCTGTGTTC GCCTGGCTCA ACTACCGCCT CGTCAAAGGG
901 GACAAACGCC ACAGGCTTAC CGCCGGTATG AACGCCCTTG CCATTGTCGG
951 CCTGCTCTAC CTGGCCGGGT TTGCCGTTTT GTTCCTGTTG AACCTTACCG
1001 GACTTTTGGC ATAG
This corresponds to the amino acid sequence <SEQ ID 486; ORF53ng-1>:
1 ..KKSCVYLWVF LILCIASATI NAGAVAIVTA AIVKMAIPSL MFDAGTVAAL
51 IMASCLIILV SGRYRALDRV SKIIIVTLSI ATLAAAGIAM SRGMQMQPDF
101 IEPTPWTLAG LGFLIALMGW MPAPIEISAI NSLWVTEKQR INPSEYRDGI
151 FDFNVGYIAS AVLALVFLAL GAFVQYGNGE AVQMAGGKYI GQLINMYAVT
201 IGGWSRPLVA FIAFACMYGT TITVVDGYAR AIAEPVRLLR GRDKTGNAEL
251 FAWNIWVAGS GLAVIFWFDG AMAELLKFAM IAAFVSAPVF AWLNYRLVKG
301 DKRHRLTAGM NALAIVGLLY LAGFAVLFLL NLTGLLA*
ORF53ng-1 and ORF53-1 show 94.0% identity over a 336 aa overlap:
60 70 80 90 100 110
orf53-1.pep ILTNLFKYPFFRFSAHYTLDTGKSLIEGYAEKSRVYLWVFLILCILSATINAGAVAIVTA
:|| ||||||||||| ||||||||||||||
orf53ng-1 KKSCVYLWVFLILCIASATINAGAVAIVTA
10 20 30
120 130 140 150 160 170
orf53-1.pep AIVKMAIPSLMFDAGTVAALIMASCLIILVSGRYRALDRVSKIIIVTLSIATLAAAGIAM
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf53ng-1 AIVKMAIPSLMFDAGTVAALIMASCLIILVSGRYRALDRVSKIIIVTLSIATLAAAGIAM
40 50 60 70 80 90
180 190 200 210 220 230
orf53-1.pep SRGMQMQSDFIEPTPWTLAGLGFLIALMGWMPAPIEISAINSLWVTEKQRINPSEYRDGI
||||||| ||||||||||||||||||||||||||||||||||||||||||||||||||||
orf53ng-1 SRGMQMQPDFIEPTPWTLAGLGFLIALMGWMPAPIEISAINSLWVTEKQRINPSEYRDGI
100 110 120 130 140 150
240 250 260 270 280 290
orf53-1.pep FDFNVGYIASAVLALVFLALGAFVQYGNGEAVQMAGGKYIGQLINMYAVTIGGWSRPLVA
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf53ng-1 FDFNVGYIASAVLALVFLALGAFVQYGNGEAVQMAGGKYIGQLINMYAVTIGGWSRPLVA
160 170 180 190 200 210
300 310 320 330 340 350
orf53-1.pep FIAFACMYGTTITVVDGYARAIAEPVRLLRGKDKTGNAEFFAWNIWVAGSGLAVIFWFDG
|||||||||||||||||||||||||||||||:|||||||:||||||||||||||||||||
orf53ng-1 FIAFACMYGTTITVVDGYARAIAEPVRLLRGRDKTGNAELFAWNIWVAGSGLAVIFWFDG
220 230 240 250 260 270
360 370 380 390 400 410
orf53-1.pep VMANLLKFAMIAAFVSAPVFAWLNYRLVKGDEKHKLTSGMNALALAGLIYLTGFTVLFLL
:||:|||||||||||||||||||||||||||::|:||:||||||::||:||:||:|||||
orf53ng-1 AMAELLKFAMIAAFVSAPVFAWLNYRLVKGDKRHRLTAGMNALAIVGLLYLAGFAVLFLL
280 290 300 310 320 330
orf53-1.pep NLAGMFKX
||:|::
orf53ng-1 NLTGLLAX
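The identity figures quoted for these overlaps are the fraction of aligned positions that carry the same residue in both sequences. A minimal sketch of that calculation for an ungapped overlap is given below. The function name `percent_identity` is hypothetical, and the original alignment program may treat gaps, X residues and the terminal stop symbol differently, so the sketch is only expected to reproduce the quoted percentages approximately.

```python
def percent_identity(seq_a, seq_b):
    """Percentage of positions with identical residues in an ungapped overlap."""
    if len(seq_a) != len(seq_b):
        raise ValueError("overlap segments must have the same length")
    if not seq_a:
        raise ValueError("overlap segments must not be empty")
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return 100.0 * matches / len(seq_a)


# Residues 321-330 of ORF53-1 against the aligned residues of ORF53ng-1,
# copied from the alignment above (two conservative substitutions, K/R and F/L).
print(percent_identity("GKDKTGNAEF", "GRDKTGNAEL"))
```

Conservative substitutions, marked ":" in the alignments, count as mismatches here, which is consistent with the match lines using "|" only for identical residues.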
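Based on this analysis, including the presence of a putative leader sequence (double-underlined) and several putative transmembrane domains (single-underlined) in the gonococcal protein, it is predicted that this protein from N. meningitidis and N. gonorrhoeae, and its epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies.
Example 58
The following partial DNA sequence was identified in N. meningitidis <SEQ ID 487>: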
1 ..TTGCGGGAAA CGGCATATGT TTTGGATAGT TTTGATCGTT ATTTTGTTGT
51 TGCGCTTGCC GGCTTGTTTT TTGTCCGCGC ACAATCCGAA CGCGAGTGGA
101 TGCGCGAGGT TTCTGCGTGG CAGGAAAAGA AAGGGGAAAA ACAGGCGGAG
151 CTGCCTGAAA TCAAAGACGG TATGCCCGAT TTTCCCGAAC TTGCCCTGAT
201 GCTTTTCCAC GCCGTCAAAA CGGCAGTGTA TTGGCTGTTT GTCGGTGTCG
251 TCCGTTTCTG CCGAAACTAT CTGGCGCACG AATCCGAACC GGACAGGCCC
301 GTTCCGCCT..
This corresponds to the amino acid sequence <SEQ ID 488; ORF58>:
1 ..LRETAYVLDS FDRYFVVALA GLFFVRAQSE REWMREVSAW QEKKGEKQAE
51 LPEIKDGMPD FPELALMLFH AVKTAVYWLF VGVVRFCRNY LAHESEPDRP
101 VPP..
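The listings in this document print sequences in numbered rows of fifty positions, split into blocks of ten, with ".." marking a truncated end of a partial sequence. To compare or translate a listed sequence it first has to be turned back into one continuous string. The following is a minimal sketch of such a helper; the name `parse_listing` is hypothetical, and the handling of ".." as a truncation marker (while a single "." is kept as an undetermined base) is an assumption drawn from how the listings here are printed.

```python
import re


def parse_listing(text):
    """Join a numbered sequence listing into one continuous sequence string."""
    chunks = []
    for line in text.splitlines():
        line = re.sub(r"^\s*\d+\s*", "", line)  # leading position number
        line = line.replace("..", "")           # truncation markers at the ends
        chunks.append("".join(line.split()))    # remove ten-residue block spacing
    return "".join(chunks)


# The partial ORF58 sequence <SEQ ID 488> as listed above.
orf58 = parse_listing(
    "1 ..LRETAYVLDS FDRYFVVALA GLFFVRAQSE REWMREVSAW QEKKGEKQAE\n"
    "51 LPEIKDGMPD FPELALMLFH AVKTAVYWLF VGVVRFCRNY LAHESEPDRP\n"
    "101 VPP.."
)
print(len(orf58))  # 103
```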
Further work revealed the complete nucleotide sequence <SEQ ID 489>:
1 ATGTTTTGGA TAGTTTTGAT CGTTATTTTG TTGCTTGCGC TTGCCGGCTT
51 GTTTTTTGTC CGCGCACAAT CCGAACGCGA GTGGATGCGC GAGGTTTCTG
101 CGTGGCAGGA AAAGAAAGGG GAAAAACAGG CGGAGCTGCC TGAAATCAAA
151 GACGGTATGC CCGATTTTCC CGAACTTGCC CTGATGCTTT TCCATGCCGT
201 CAAAACGGCA GTGTATTGGC TGTTTGTCGG TGTCGTCCGT TTCTGCCGAA
251 ACTATCTGGC GCACGAATCC GAACCGGACA GGCCCGTTCC GCCTGCTTCT
301 GCAAACCGTG CGGATGTTCC GACCGCATCC GACGGATATT CAGACAGTGG
351 AAACGGGACG GAAGAAGCGG AAACGGAAGA AGCAGAAGCT GCGGAGGAAG
401 AGGCTGCCGA TACGGAAGAC ATTGCAACTG CCGTAATCGA CAACCGCCGC
451 ATCCCATTCG ACCGGAGTAT TGCTGAAGGG TTGATGCCGT CTGAAAGCGA
501 AATTTCGCCC GTCCGTCCGG TTTTTAAAGA AATCACTTTG GAAGAAGCAA
551 CGCGTGCTTT AAACAGCGCG GCTTTAAGGG AAACGAAAAA ACGCTATATC
601 GATGCATTTG AGAAAAACGA AACAGCGGTC CCCAAAGTCC GCGTGTCCGA
651 TACCCCGATG GAAGGGCTGC AGATTATCGG TTTGGACGAC CCTGTGCTTC
701 AACGCACGTA TTCCCATATG TTCGATGCGG ACAAAGAAGC GTTTTCCGAG
751 TCTGCGGATT ACGGATTTGA GCCGTATTTT GAGAAGCAGC ATCCGTCTGC
801 CTTTTCTGCA GTCAAAGCCG AAAATGCACG GAATGCGCCG TTCCACCGTC
851 ATGCAGGGCA GGGGAAAGGG CAGGCGGAGG CAAAATCCCC GGATGTTTCC
901 CAAGGGCAGT CCGTTTCAGA CGGCACGGCC GTCCGCGATG CCCGCCGCCG
951 CGTTTCCGTC AATTTGAAAG AACCGAACAA GGCAACGGTT TCTGCGGAGG
1001 CGCGAATTTC TCGCCTGATT CCGGAAAGTC AGACGGTTGT CGGGAAACGG
1051 GATGTCGAAA TGCCGTCTGA AACCGAAAAT GTTTTCACGG AAACCGTTTC
1101 GTCTGTGGGA TACGGCGGTC CGGTTTATGA TGAAACTGCC GATATCCATA
1151 TTGAAGAACC TGCCGCGCCC GATGCTTGGG TGGTCGAACC ACCCGAAGTG
1201 CCGAAAGTTC CCATGACCGC AATCGATATT CAGCCGCCGC CTCCCGTATC
1251 GGAAATCTAC AACCGTACCT ATGAACCGCC GTCAGGATTC GAGCAGGTGC
1301 AACGCAGCCG CATTGCCGAG ACCGACCATC TTGCCGATGA TGTTTTGAAT
1351 GGAGGTTGGC AGGAGGAAAC CGCCGCTATT GCGGATGACG GCAGTGAAGG
1401 TGCGGCAGAG CGGTCAAGCG GGCAATATCT GTCGGAAACC GAAGCGTTCG
1451 GGCATGACAG TCAGGCGGTT TGTCCGTTTG AAAATGTGCC GTCTGAACGC
1501 CCGTCCTGCC GGGTATCGGA TACGGAAGCG GATGAAGGGG CGTTCCCATC
1551 TGAAGAAACC GGTGCGGTAT CCGAACACCT GCCGACAACC GACCTGCTTC
1601 TGCCTCCGCT GTTCAATCCC GAGGCGACGC AAACCGAAGA AGAACTGTTG
1651 GAAAACAGCA TCACCATCGA AGAAAAATTG GCGGAGTTCA AAGTCAAGGT
1701 CAAGGTTGTC GATTCTTATT CCGGCCCCGT AATTACGCGT TATGAAATCG
1751 AACCCGATGT CGGCGTGCGC GGCAATTCCG TTCTGAATCT GGAAAAAGAT
1801 TTGGCGCGTT CGCTCGGCGT GGCTTCCATC CGCGTTGTCG AAACCATCCC
1851 CGGCAAAACC TGCATGGGTT TGGAACTTCC GAACCCGAAA CGCCAAATGA
1901 TACGCCTGAG CGAAATCTTC AATTCGCCCG AGTTTGCCGA ATCCAAATCC
1951 AAGCTGACGC TCGCGCTCGG TCAGGACATC ACCGGACAGC CCGTCGTAAC
2001 CGACTTGGGA AAAGCACCGC ATTTGTTGGT TGCCGGCACG ACCGGTTCGG
2051 GCAAATCGGT GGGTGTCAAC GCGATGATTC TGTCTATGCT TTTCAAAGCC
2101 GCGCCGGAAG ACGTGCGTAT GATTATGATC GATCCGAAAA TGCTGGAATT
2151 GAGCATTTAC GAAGGCATCC CGCACCTGCT CGCCCCTGTC GTTACCGATA
2201 TGAAGCTGGC GGCAAACGCG CTGAACTGGT GTGTTAACGA AATGGAAAAA
2251 CGCTACCGCC TGATGAGCTT TATGGGCGTG CGTAATCTTG CGGGCTTCAA
2301 TCAAAAAATC GCCGAAGCCG CAGCAAGGGG AGAAAAAATC GGCAATCCGT
2351 TCAGCCTCAC GCCCGACGAT CCCGAACCTT TGGAAAAACT GCCGTTTATC
2401 GTGGTCGTGG TCGATGAGTT TGCCGACCTG ATGATGACGG CAGGCAAGAA
2451 AATCGAAGAA CTGATTGCCC GCCTCGCCCA AAAAGCCCGC GCGGCAGGCA
2501 TCCATTTGAT TCTTGCCACA CAACGCCCCA GCGTCGATGT CATCACGGGT
2551 CTGATTAAGG CGAACATCCC GACGCGTATC GCGTTCCAAG TGTCCAGCAA
2601 AATCGACAGC CGCACGATTC TCGACCAAAT GGGCGCGGAA AACCTGCTCG
2651 GTCAGGGCGA TATGCTGTTC CTGCTGCCGG GTACTGCCTA TCCGCAGCGC
2701 GTTCACGGCG CGTTTGCCTC GGATGAAGAG GTGCACCGCG TGGTCGAATA
2751 TTTGAAACAG TTTGGCGAAC CGGACTATGT TGACGATATT TTGAGCGGCG
2801 GCGGCAGCGA AGAGCTGCCC GGCATCGGGC GCAGCGGCGA CGACGAAACC
2851 GATCCGATGT ACGACGAGGC CGTATCCGTT GTCCTGAAAA CGCGCAAAGC
2901 CAGCATTTCG GGCGTACAGC GCGCCTTGCG TATCGGCTAC AACCGCGCCG
2951 CGCGTCTGAT TGACCAGATG GAGGCGGAAG GCATTGTGTC CGCACCGGAA
3001 CACAACGGCA ACCGTACGAT TCTCGTCCCC TTGGACAATG CTTGA
This corresponds to the amino acid sequence <SEQ ID 490; ORF58-1>:
1 MFWIVLIVIL LLALAGLFFV RAQSEREWMR EVSAWQEKKG EKQAELPEIK
51 DGMPDFPELA LMLFHAVKTA VYWLFVGVVR FCRNYLAHES EPDRPVPPAS
101 ANRADVPTAS DGYSDSGNGT EEAETEEAEA AEEEAADTED IATAVIDNRR
151 IPFDRSIAEG LMPSESEISP VRPVFKEITL EEATRALNSA ALRETKKRYI
201 DAFEKNETAV PKVRVSDTPM EGLQIIGLDD PVLQRTYSHM FDADKEAFSE
251 SADYGFEPYF EKQHPSAFSA VKAENARNAP FHRHAGQGKG QAEAKSPDVS
301 QGQSVSDGTA VRDARRRVSV NLKEPNKATV SAEARISRLI PESQTVVGKR
351 DVEMPSETEN VFTETVSSVG YGGPVYDETA DIHIEEPAAP DAWVVEPPEV
401 PKVPMTAIDI QPPPPVSEIY NRTYEPPSGF EQVQRSRIAE TDHLADDVLN
451 GGWQEETAAI ADDGSEGAAE RSSGQYLSET EAFGHDSQAV CPFENVPSER
501 PSCRVSDTEA DEGAFPSEET GAVSEHLPTT DLLLPPLFNP EATQTEEELL
551 ENSITIEEKL AEFKVKVKVV DSYSGPVITR YEIEPDVGVR GNSVLNLEKD
601 LARSLGVASI RVVETIPGKT CMGLELPNPK RQMIRLSEIF NSPEFAESKS
651 KLTLALGQDI TGQPVVTDLG KAPHLLVAGT TGSGKSVGVN AMILSMLFKA
701 APEDVRMIMI DPKMLELSIY EGIPHLLAPV VTDMKLAANA LNWCVNEMEK
751 RYRLMSFMGV RNLAGFNQKI AEAAARGEKI GNPFSLTPDD PEPLEKLPFI
801 VVVVDEFADL MMTAGKKIEE LIARLAQKAR AAGIHLILAT QRPSVDVITG
851 LIKANIPTRI AFQVSSKIDS RTILDQMGAE NLLGQGDMLF LLPGTAYPQR
901 VHGAFASDEE VHRVVEYLKQ FGEPDYVDDI LSGGGSEELP GIGRSGDDET
951 DPMYDEAVSV VLKTRKASIS GVQRALRIGY NRAARLIDQM EAEGIVSAPE
1001 HNGNRTILVP LDNA*
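Computer analysis of this amino acid sequence predicted the indicated transmembrane regions and gave the following results:
Homology with a predicted ORF from N. meningitidis (strain A)
ORF58 shows 96.6% identity over an 89 aa overlap with an ORF (ORF58a) from strain A of N. meningitidis: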
10 20 30 40 50 60
orf58.pep LRETAYVLDSFDRYFVVALAGLFFVRAQSEREWMREVSAWQEKKGEKQAELPEIKDGMPD
:::|||||||||||||||||||||||||||||||||||||||||||
orf58a MFWIVLIVILLLALAGLFFVRAQSEREWMREVSAWQEKKGEKQAELPEIKDGMPD
10 20 30 40 50
70 80 90 100
orf58.pep FPELALMLFHAVKTAVYWLFVGVVRFCRNYLAHESEPDRPVPP
|||||||||||||||||||||||||||||||||||||||||||
orf58a FPELALMLFHAVKTAVYWLFVGVVRFCRNYLAHESEPDRPVPPASANRADVPTASDGYSD
60 70 80 90 100 110
The complete length ORF58a nucleotide sequence <SEQ ID 491> is:
1 ATGTTTTGGA TAGTTTTGAT CGTTATTTTG TTGCTTGCGC TTGCCGGCTT
51 GTTTTTTGTC CGCGCACAAT CCGAACGCGA GTGGATGCGC GAGGTTTCTG
101 CGTGGCAGGA AAAGAAAGGG GAAAAACAGG CGGAGCTGCC TGAAATCAAA
151 GACGGTATGC CCGATTTTCC CGAACTTGCC CTGATGCTTT TCCATGCCGT
201 CAAAACGGCA GTGTATTGGC TGTTTGTCGG TGTCGTCCGT TTCTGCCGAA
251 ACTATCTGGC GCACGAATCC GAACCGGACA GGCCCGTTCC GCCTGCTTCT
301 GCAAATCGTG CGGATGTTCC GACCGCATCC GACGGATATT CAGACAGTGG
351 AAACGGGACG GAAGAAGCGG AAACGGAAGA AGCAGAAGCT GCGGAGGAAG
401 AGGCTGCCGA TACGGAAGAC ATTGCAACTG CCGTAATCGA CAACCGCCGC
451 ATCCCATTCG ACCGGAGTAT TGCTGAAGGG TTGATGCCGT CTGAAAGCGA
501 AATTTCGCCC GTCCGTCCGG TTTTTAAGGA AATCACTTTG GAAGAAGCAA
551 CGCGTGCTTT AAACAGCGCG GCTTTAAGGG AAACGAAAAA ACGCTATATC
601 GATGCATTTG AGAAAAACGA AACAGCGGTC CCCAAAGTCC GCGTGTCCGA
651 TACCCCGATG GAAGGGCTGC AGATTATCGG TTTGGACGAC CCTGTGCTTC
701 AACGCACGTA TTCCCGTATG TTCGATGCGG ACAAAGAAGC GTTTTCCGAG
751 TCTGCGGATT ACGGATTTGA GCCGTATTTT GAGAAGCAGC ATCCGTCTGC
801 CTTTTCTGCA GTCAAAGCCG AAAATGCACG GAATGCGCCG TTCCGCCGTC
851 ATGCAGGGCA GGGNAAAGGG CAGGCGGAGG CNAAATCCCC GGATGTTTCC
901 CAAGGGCAGT CCGTTTCAGA CGGCACAGCC GTCCGCGATG CCNGCCGCCG
951 CGTTTCCGTC AATTTGAAAG AACCGAACAA GGCAACGGTT TCTGCGGAGG
1001 CGCGGATTTC GCGCCTGATT CCGGAAAGTC GGACGGTTGT CGGGAAACGG
1051 GATGTCGAAA TGCCGTCTGA AACCGAAAAT GTTTTCACGG AAANTGTTTC
1101 GTCTGTGGGA TACGGCGNTC CGGTTTATGA TGAAACTGCC GATATCCATA
1151 TTGAAGAACC TGCCGCGCCC GATGCTTGGG TGGTCGAACC ACCCGAAGTG
1201 CCGAAAGTTC CCATGCCCGC AATNGATATT CCGCCGCCGC CTCCCGTATC
1251 GGAAATCTAC AACCGTACCT ATGAACCGCC GGCAGGATTC GAGCAGGTGC
1301 AACGCAGCCG CATTGCCGAA ACCGATCATC TTGCCGATGA TGTTTTGAAT
1351 GGAGGTTGGC AGGAGGAAAC CGCCGCTATT GCGAATGACG GCAGTGAGGG
1401 TGTGGCAGAG CGGTCAAGCG GGCAATATTT GTCGGAAACC GAAGCGTTCG
1451 GGCATGACAG TCAGGCGGTT TGTCCGTTTG AAAATGTGCC GTCTGAACGC
1501 CCGTCCCGCC GGGCATNGGA TACGGAAGCG GATGAAGGGG CGTTCCAATC
1551 TGAAGAAACC GGTGCGGTAT CCGAACACCT GCCGACAACC GACCTGCTTC
1601 TGCCGCCGCT GTTCAATCCC GGGGCGACGC AAACCGAAGA AGANCTGTTG
1651 GANAACAGCA TCACCATCGA AGAAAAATNG GCGGAGTTCA AAGTCAAGGT
1701 CAAGGTTGTC GATTCTTATT CCGGCCCCGT GATTACGCGT TATGAAATCG
1751 AACCCGATGT CGGCGTGCGC GGCAATTCCG TTCTAAATCT GGAAAAAGAN
1801 TTGGCGCGTT CGCTCGGCGT GGCTTCCATC CGCGTTGTCG AAACCATCCT
1851 CGGCAAAACC TGTATGGGTT TGGAACTTCC GAACCCGAAA CGCCAAATGA
1901 TACGCCTGAG CGAAATCTTC AATTCGCCCG AGTTTGCCGA ATCCAAATCC
1951 AAGCTGACGC TCGCGCTCGG TCAGGACATC ACCGGACAGC CCGTCGTAAC
2001 CGACTTGGGC AAAGCACCGC ATTTGTTGGT TGCCGGCACG ACCGGTTCGG
2051 GCAAATCGGT GGGTGTCAAC GCGATGATTC TGTCTATGCT TTTCAAAGCC
2101 GCGCCGGAAG ACGTGCGTAT GATTATGATC GATCCGAAAA TGCTGGAATT
2151 GAGCATTTAC GAAGGCATCC CGCACCTGCT CGCCCCTGTC GTTACCGATA
2201 TGAAGCTGGC GGCAAACGCG CTGAACTGGT GTGTTAACGA AATGGAAAAA
2251 CGCTACCGCC TGATGAGCTT TATGGGCGTG CGCAATCTTG CGGGTNTCAA
2301 TCAAAAAATC GCCGAAGCCG CAGCAAGGGG GGAGAAAATC GGCAACCCGT
2351 TCAGCCTCAC GCCCGACAAT CCCGAACCTT TGGANAAATT GCCGTTTATC
2401 GTGGTCGTGG TTGATGAGTT TGCCGACCTG ATGATGACGG CAGGCAAGAA
2451 AATCGAAGAA CTGATTGCCC GCCTCGCCCA AAAAGCCCGC GCGGCAGGCA
2501 TCCATCTTAT CCTTGCCACA CAACGCCCCA GTGTCGATGT CATCACGGGT
2551 CTGATTAAGG CGAACATCCC GACGCGTATC GCGTTCCAAG TGTCCAGCAA
2601 AATCGACAGC CGCACGATTC TTGACCAAAT GGGTGCGGAA AACCTGCTCG
2651 GGCAGGGCGA TATGCTGTTC CTGCCGCCGG GTACGGCCTA TCCGCAGCGC
2701 GTTCACGGCG CGTTTGCCTC GGATGAAGAG GTGCACCGCG TGGTCGAATA
2751 TCTGAAACAG TTTGGCGAAC CGGACTATGT TGACGATATN TTGAGCGGCG
2801 GTATGTCCGA CGATTTGCTG GGAATCAGCC GGAGCGGCGA CGGCGAAACC
2851 GATCCGATGT ACGACGAGGC CGTGTCNGTT GTTTTGAAAA CGCGCAAAGC
2901 CAGCATTTCT GGCGTGCAGC GCGCATTGCG TATCGGCTAT AATCGCGCCG
2951 CGCGTCTGAT TGACCAGATG GAGGCGGAAG GCATTGTGTC CGCACCGGAA
3001 CACAACGGCA ACCGTACGAT TCTCGTCCCC TTNGACAATG CTTGA
This encodes a protein having the amino acid sequence <SEQ ID 492>:
1 MFWIVLIVIL LLALAGLFFV RAQSEREWMR EVSAWQEKKG EKQAELPEIK
51 DGMPDFPELA LMLFHAVKTA VYWLFVGVVR FCRNYLAHES EPDRPVPPAS
101 ANRADVPTAS DGYSDSGNGT EEAETEEAEA AEEEAADTED IATAVIDNRR
151 IPFDRSIAEG LMPSESEISP VRPVFKEITL EEATRALNSA ALRETKKRYI
201 DAFEKNETAV PKVRVSDTPM EGLQIIGLDD PVLQRTYSRM FDADKEAFSE
251 SADYGFEPYF EKQHPSAFSA VKAENARNAP FRRHAGQGKG QAEAKSPDVS
301 QGQSVSDGTA VRDAXRRVSV NLKEPNKATV SAEARISRLI PESRTVVGKR
351 DVEMPSETEN VFTEXVSSVG YGXPVYDETA DIHIEEPAAP DAWVVEPPEV
401 PKVPMPAXDI PPPPPVSEIY NRTYEPPAGF EQVQRSRIAE TDHLADDVLN
451 GGWQEETAAI ANDGSEGVAE RSSGQYLSET EAFGHDSQAV CPFENVPSER
501 PSRRAXDTEA DEGAFQSEET GAVSEHLPTT DLLLPPLFNP GATQTEEXLL
551 XNSITIEEKX AEFKVKVKVV DSYSGPVITR YEIEPDVGVR GNSVLNLEKX
601 LARSLGVASI RVVETILGKT CMGLELPNPK RQMIRLSEIF NSPEFAESKS
651 KLTLALGQDI TGQPVVTDLG KAPHLLVAGT TGSGKSVGVN AMILSMLFKA
701 APEDVRMIMI DPKMLELSIY EGIPHLLAPV VTDMKLAANA LNWCVNEMEK
751 RYRLMSFMGV RNLAGXNQKI AEAAARGEKI GNPFSLTPDN PEPLXKLPFI
801 VVVVDEFADL MMTAGKKIEE LIARLAQKAR AAGIHLILAT QRPSVDVITG
851 LIKANIPTRI AFQVSSKIDS RTILDQMGAE NLLGQGDMLF LPPGTAYPQR
901 VHGAFASDEE VHRVVEYLKQ FGEPDYVDDX LSGGMSDDLL GISRSGDGET
951 DPMYDEAVSV VLKTRKASIS GVQRALRIGY NRAARLIDQM EAEGIVSAPE
1001 HNGNRTILVP XDNA*
ORF58a and ORF58-1 show 96.6% identity over a 1014 aa overlap:
10 20 30 40 50 60
orf58a.pep MFWIVLIVILLLALAGLFFVRAQSEREWMREVSAWQEKKGEKQAELPEIKDGMPDFPELA
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf58-1 MFWIVLIVILLLALAGLFFVRAQSEREWMREVSAWQEKKGEKQAELPEIKDGMPDFPELA
10 20 30 40 50 60
70 80 90 100 110 120
orf58a.pep LMLFHAVKTAVYWLFVGVVRFCRNYLAHESEPDRPVPPASANRADVPTASDGYSDSGNGT
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf58-1 LMLFHAVKTAVYWLFVGVVRFCRNYLAHESEPDRPVPPASANRADVPTASDGYSDSGNGT
70 80 90 100 110 120
130 140 150 160 170 180
orf58a.pep EEAETEEAEAAEEEAADTEDIATAVIDNRRIPFDRSIAEGLMPSESEISPVRPVFKEITL
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf58-1 EEAETEEAEAAEEEAADTEDIATAVIDNRRIPFDRSIAEGLMPSESEISPVRPVFKEITL
130 140 150 160 170 180
190 200 210 220 230 240
orf58a.pep EEATRALNSAALRETKKRYIDAFEKNETAVPKVRVSDTPMEGLQIIGLDDPVLQRTYSRM
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||:|
orf58-1 EEATRALNSAALRETKKRYIDAFEKNETAVPKVRVSDTPMEGLQIIGLDDPVLQRTYSHM
190 200 210 220 230 240
250 260 270 280 290 300
orf58a.pep FDADKEAFSESADYGFEPYFEKQHPSAFSAVKAENARNAPFRRHAGQGKGQAEAKSPDVS
|||||||||||||||||||||||||||||||||||||||||:||||||||||||||||||
orf58-1 FDADKEAFSESADYGFEPYFEKQHPSAFSAVKAENARNAPFHRHAGQGKGQAEAKSPDVS
250 260 270 280 290 300
310 320 330 340 350 360
orf58a.pep QGQSVSDGTAVRDAXRRVSVNLKEPNKATVSAEARISRLIPESRTVVGKRDVEMPSETEN
|||||||||||||| ||||||||||||||||||||||||||||:||||||||||||||||
orf58-1 QGQSVSDGTAVRDARRRVSVNLKEPNKATVSAEARISRLIPESQTVVGKRDVEMPSETEN
310 320 330 340 350 360
370 380 390 400 410 420
orf58a.pep VFTEXVSSVGYGXPVYDETADIHIEEPAAPDAWVVEPPEVPKVPMPAXDIPPPPPVSEIY
||||:||||||| |||||||||||||||||||||||||||||||| | || |||||||||
orf58-1 VFTETVSSVGYGGPVYDETADIHIEEPAAPDAWVVEPPEVPKVPMTAIDIQPPPPVSEIY
370 380 390 400 410 420
430 440 450 460 470 480
orf58a.pep NRTYEPPAGFEQVQRSRIAETDHLADDVLNGGWQEETAAIANDGSEGVAERSSGQYLSET
|||||||:|||||||||||||||||||||||||||||||||:|||||:||||||||||||
orf58-1 NRTYEPPSGFEQVQRSRIAETDHLADDVLNGGWQEETAAIADDGSEGAAERSSGQYLSET
430 440 450 460 470 480
490 500 510 520 530 540
orf58a.pep EAFGHDSQAVCPFENVPSERPSRRAXDTEADEGAFQSEETGAVSEHLPTTDLLLPPLFNP
|||||||||||||||||||||| |: ||||||||| ||||||||||||||||||||||||
orf58-1 EAFGHDSQAVCPFENVPSERPSCRVSDTEADEGAFPSEETGAVSEHLPTTDLLLPPLFNP
490 500 510 520 530 540
550 560 570 580 590 600
orf58a.pep GATQTEEXLLXNSITIEEKXAEFKVKVKVVDSYSGPVITRYEIEPDVGVRGNSVLNLEKX
|||||| || |||||||| |||||||||||||||||||||||||||||||||||||||
orf58-1 EATQTEEELLENSITIEEKLAEFKVKVKVVDSYSGPVITRYEIEPDVGVRGNSVLNLEKD
550 560 570 580 590 600
610 620 630 640 650 660
orf58a.pep LARSLGVASIRVVETILGKTCMGLELPNPKRQMIRLSEIFNSPEFAESKSKLTLALGQDI
|||||||||||||||| |||||||||||||||||||||||||||||||||||||||||||
orf58-1 LARSLGVASIRVVETIPGKTCMGLELPNPKRQMIRLSEIFNSPEFAESKSKLTLALGQDI
610 620 630 640 650 660
670 680 690 700 710 720
orf58a.pep TGQPVVTDLGKAPHLLVAGTTGSGKSVGVNAMILSMLFKAAPEDVRMIMIDPKMLELSIY
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf58-1 TGQPVVTDLGKAPHLLVAGTTGSGKSVGVNAMILSMLFKAAPEDVRMIMIDPKMLELSIY
670 680 690 700 710 720
730 740 750 760 770 780
orf58a.pep EGIPHLLAPVVTDMKLAANALNWCVNEMEKRYRLMSFMGVRNLAGXNQKIAEAAARGEKI
||||||||||||||||||||||||||||||||||||||||||||| ||||||||||||||
orf58-1 EGIPHLLAPVVTDMKLAANALNWCVNEMEKRYRLMSFMGVRNLAGFNQKIAEAAARGEKI
730 740 750 760 770 780
790 800 810 820 830 840
orf58a.pep GNPFSLTPDNPEPLXKLPFIVVVVDEFADLMMTAGKKIEELIARLAQKARAAGIHLILAT
|||||||||:|||| |||||||||||||||||||||||||||||||||||||||||||||
orf58-1 GNPFSLTPDDPEPLEKLPFIVVVVDEFADLMMTAGKKIEELIARLAQKARAAGIHLILAT
790 800 810 820 830 840
850 860 870 880 890 900
orf58a.pep QRPSVDVITGLIKANIPTRIAFQVSSKIDSRTILDQMGAENLLGQGDMLFLPPGTAYPQR
||||||||||||||||||||||||||||||||||||||||||||||||||| ||||||||
orf58-1 QRPSVDVITGLIKANIPTRIAFQVSSKIDSRTILDQMGAENLLGQGDMLFLLPGTAYPQR
850 860 870 880 890 900
910 920 930 940 950 960
orf58a.pep VHGAFASDEEVHRVVEYLKQFGEPDYVDDXLSGGMSDDLLGISRSGDGETDPMYDEAVSV
||||||||||||||||||||||||||||| |||| |::| ||:|||| ||||||||||||
orf58-1 VHGAFASDEEVHRVVEYLKQFGEPDYVDDILSGGGSEELPGIGRSGDDETDPMYDEAVSV
910 920 930 940 950 960
970 980 990 1000 1010
orf58a.pep VLKTRKASISGVQRALRIGYNRAARLIDQMEAEGIVSAPEHNGNRTILVPXDNAX
|||||||||||||||||||||||||||||||||||||||||||||||||| ||||
orf58-1 VLKTRKASISGVQRALRIGYNRAARLIDQMEAEGIVSAPEHNGNRTILVPLDNAX
970 980 990 1000 1010
Homology with a predicted ORF from N. gonorrhoeae
ORF58 shows complete identity over a 9 aa overlap with a predicted ORF (ORF58ng) from N. gonorrhoeae:
orf58.pep ALMLFHAVKTAVYWLFVGVVRFCRNYLAHESEPDRPVPP 103
|||||||||
orf58ng SEPDRPVPPASANRADVPTASDGYSDSGNG 30
The predicted ORF58ng nucleotide sequence <SEQ ID 493> encodes a protein having the partial amino acid sequence <SEQ ID 494>:
1 ..SEPDRPVPPA SANRADVPTA SDGYSDSGNG TEEAETEAAE AAEEEAADTE
51 DIATAVIDNR RIPFDRSIAE GLMQSESKTS PVRPVFKEIT LEEATRALSS
101 AALRETKKRY IDAFEKNGTA VPKVRVSDTP MEGLQIIGLD DPVLQRTYSR
151 MFDADKEAFS ESADYGFEPY FEKQHPSAFS AVKAENARNA PFRRHAGQEK
201 GQAEAKSPDV SQGQSVSDGT AVRDARRRVS VNLKEPNKAT VSAEARISRL
251 IPESRTVVGK RDVEMPSETE NVFTETVSSV GYGGPVYDEA ADIHIEEPAA
301 PDAWVVEPPE VPEVAVPEID ILPPPPVSEI YNRTYEPPAG FEQAQRSRIA
351 ETDHLAADVL NGGWQEETAA IADDGSEGAA ERSSGQYLSE TEAFGHDSQA
401 VCPFEDVPSE RPSCRVSDTE ADEGAFQSEE TGAVSEHLPT TDLLLPPLFN
451 PEATQTEEEL LENSITIEEK LAEFKVKVKV VDSYSGPVIT RYEIEPDVGV
501 RGNSVLNLEK DLARSLGVAS IRVVETIPGK TCMGLELPNP KRQMIRLSEI
601 NAMILSMLFK AAPEDVRMIM IDPKMLELSI YEGITHLLAP VVTDMKLAAN
651 ALNWCVNEME KRYRLMSFMG VRNLAGFNQK IAEAAARGEK IGNPFSLTPD
701 DPEPLEKLPF IVVVVDEFAD LMMTAGKKIE ELIARLAQKA RAAGIHLILA
751 TQRPSVDVIT GLIKANIPTR IAFQVSSKID SRTILDQMGA ENLLGQGDML
801 FLPPGTAYPQ RVHGAFASDE EVHRVVEYLK QFGEPDYVDD ILSGGGSEEL
851 PGIGRSGDGE TDPMYDEAVS VVLKTRKASI SGVQRALRIG YNRAARLIDQ
901 MEAEGIVSAP EHNGNRTILV PLDNA*
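This partial gonococcal sequence contains a predicted transmembrane region and a predicted ATP/GTP-binding site motif A (P-loop; double-underlined). In addition, it has a domain homologous to the FtsK cell division protein of E. coli. Alignment of ORF58ng with FtsK (accession number P46889) shows 65% amino acid identity over a 459 aa overlap: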
ORF58ng: 467 IEEKLAEFKVKVKVVDSYSGPVITRYEIEPDVGVRGNSVLNLEKDLARSLGVASIRVVET 526
+E +LA+F++K VV+ GPVITR+E+ GV+ + NL +DLARSL ++RVVE
FtsK: 868 VEARLADFRIKADVVNYSPGPVITRFELNLAPGVKAARISNLSRDLARSLSTVAVRVVEV 927
ORF58ng: 527 IPGKTCMGLELPNPKRQMIRLSEIFNSPEFAESKSKLTLALGQDITGQPVVTDLGKAPHL 586
IPGK +GLELPN KRQ + L E+ ++ +F ++ S LT+ LG+DI G+PVV DL K PHL
FtsK: 928 IPGKPYVGLELPNKKRQTVYLREVLDNAKFRDNPSPLTVVLGKDIAGEPVVADLAKMPHL 987
ORF58ng: 587 LVAGTTGSGKSVGVNAMILSMLFKAAPEDVRMIMIDPKMLELSIYEGITHLLAPVVTDMK 646
LVAGTTGSGKSVGVNAMILSML+KA PEDVR IMIDPKMLELS+YEGI HLL VVTDMK
FtsK: 988 LVAGTTGSGKSVGVNAMILSMLYKAQPEDVRFIMIDPKMLELSVYEGIPHLLTEVVTDMK 1047
ORF58ng: 647 LAANALNWCVNEMEKRYRLMSFMGVRNLAGFNQKIAEAAARGEKIGNPFSLTPDDPEP-- 704
AANAL WCVNEME+RY+LMS +GVRNLAG+N+KIAEA I +P+ D +
FtsK: 1048 DAANALRWCVNEMERRYKLMSALGVRNLAGYNEKIAEADRMMRPIPDPYWKPGDSMDAQH 1107
ORF58ng: 705 --LEKLPFIVVVVDEFADLMMTAGKKIEELIARLAQKARAAGIHLILATQRPSVDVITGL 762
L+K P+IVV+VDEFADLMMT GKK+EELIARLAQKARAAGIHL+LATQRPSVDVITGL
FtsK: 1108 PVLKKEPYIVVLVDEFADLMMTVGKKVEELIARLAQKARAAGIHLVLATQRPSVDVITGL 1167
ORF58ng: 763 IKANIPTRIAFQVSSKIDSRTILDQMGAENLLGQGDMLFLPPGTAYPQRVHGAFASDEEV 822
IKANIPTRIAF VSSKIDSRTILDQ GAE+LLG GDML+ P + P RVHGAF D+EV
FtsK: 1168 IKANIPTRIAFTVSSKIDSRTILDQAGAESLLGMGDMLYSGPNSTLPVRVHGAFVRDQEV 1227
ORF58ng: 823 HRVVEYLKQFGEPDYVDDILSGGGSEELPGIGRSGDGETDPMYDEAVSVVLKTRKASISG 882
H VV+ K G P YVD I S SE G G G E DP++D+AV V + RKASISG
FtsK: 1228 HAVVQDWKARGRPQYVDGITSDSESEGGAG-GFDGAEELDPLFDQAVQFVTEKRKASISG 1286
ORF58ng: 883 VQRALRIGYNRAARLIDQMEAEGIVSAPEHNGNRTILVP 921
VQR RIGYNRAAR+I+QMEA+GIVS HNGNR+L P
FtsK: 1287 VQRQFRIGYNRAARIIEQMEAQGIVSEQGHNGNREVLAP 1325
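The ATP/GTP-binding site motif A (P-loop) mentioned above can be located with a simple pattern search. The sketch below assumes the consensus commonly given for this motif, [AG]-x(4)-G-K-[ST]; that pattern is not stated in this document and is used here only as an illustration, and the function name `find_p_loops` is hypothetical.

```python
import re

# Assumed consensus for ATP/GTP-binding site motif A (P-loop): [AG]-x(4)-G-K-[ST]
P_LOOP = re.compile(r"[AG].{4}GK[ST]")


def find_p_loops(protein):
    """Return (1-based start, matched residues) for each P-loop-like match."""
    return [(m.start() + 1, m.group()) for m in P_LOOP.finditer(protein)]


# Segment of ORF58ng taken from the FtsK alignment above.
segment = "HLLVAGTTGSGKSVGVNAMILSML"
print(find_p_loops(segment))  # [(6, 'GTTGSGKS')]
```

The matched residues GTTGSGKS are conserved between ORF58ng and FtsK in the alignment, which is consistent with the homology described in the text.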
Further work on ORF58ng revealed the complete gonococcal DNA sequence <SEQ ID 495>:
1 ATGTTTTGGA TAGTTTTGAT CGTTATtgtg TTGCTTGCGC TTGCCGGCCT
51 GTTTTTTGTC CGCGCACAAT CCGAACGCGA GTGGATGCGC GAGGTTTCTG
101 CGTGGCAGGA AAAGAAAGGG GAAAAACAGG CGGAGCTGCC TGAAATCAAA
151 GACGGTATGC CCGATTTTCC CGAGTTTTCC CTGATGCTTT TCCATGCCGT
201 CAAAACGGCA GTGTATTGGC TGTTTGTCGG TGTCGTCCGT TTCTGCCGAA
251 ACTATCTGGC GCACGAATCC GAACCGGACA GGCCCGTTCC GCCTGCTTCT
301 GCAAACCGTG CGGATGTTCC GACCGCATCC GACGGGTATT CAGACAGTGG
351 AAACGGGACG GAAGAAGCGG AAACGGAAGC AGCAGAAGCT GCGGAGGAAG
401 AGGCTGCCgA TACgGAAGAC ATTGCAACTG CCGTAATCGA CAACCGCCGC
451 ATCCcatTCG ACCGGAGTAT TGCTGAAGGG TTGATGCAGT CTGAAAGCAA
501 AACTTCGCCC GTCCGTCCGG TTTTTAAGGA AATCACTTTG GAAGAAGCAA
551 CGCGTGCTTT AAGCAGCGCG GCTTTAAGGG AAACGAAAAA ACGCTATATC
601 GATGCATTTG AGAAAAACGG AACAGCCGTC CCCAAAGTAC GCGTGTCCGA
651 TACCCCGATG GAAGGGCTGC AGATTATCGG TTTGGACGAC CCTGTGCTTC
701 AACGCACGTA TTCCCGTATG TTTGATGCGG ACAAAGAAGC GTTTTCCGAG
751 TCTGCGGATT ACGGATTTGA GCCGTATTTT GAGAAGCAGC ATCCGTCTGC
801 CTTTTCTGCA GTCAAAGCCG AAAATGCACG GAATGCGCCG TTCCGCCGTC
851 ATGCAGGGCA GGAGAAAGGG CAGGCGGAGG CAAAATCCCC GGATGTTTCC
901 CAAGGGCAGT CCGTTTCAGA CGGCACAGCC GTCCGCGATG CCCGCCGCCG
951 CGTTTCCGTC AATTTGAAAG AACCGAACAA GGCAACGGTT TCTGCGGAGG
1001 CGCGGATTTC GCGCCTGATT CCGGAAAGTC GGACGGTTGT CGGGAAACGG
1051 GATGTCGAAA TGCCGTCTGA AACCGAAAAT GTTTTCACGG AAACCGTTTC
1101 GTCTGTGGGA TACGGCGGTC CGGTTTATGA TGAAGCTGCC GATATCCATA
1151 TTGAAGAGCC TGCCGCGCCC GATGCTTGGG TGGTCGAACC ACCCGAAGTG
1201 CCGGAGGTAG CCGTACCCGA AATCGATATT CTGCCGCCGC CTCCCGTATC
1251 GGAAATCTAC AACCGTACCT ATGAGCCGCC GGCAGGATTC GAGCAGGCGC
1301 AACGCAGCCG CATTGCCGAA ACCGACCATC TTGCCGCTGA TGTTTTGAAT
1351 GGAGGTTGGC AGGAGGAAAC CGCCGCTATT GCAGATGACG GCAGTGAGGG
1401 TGCGGCAGAG CGGTCAAGCG GGCAATATCT GTCGGAAACC GAAGCGTTCG
1451 GGCATGACAG TCAGGCGGTT TGTCCGTTTG AAGATGTGCC GTCTGAACGC
1501 CCGTCCTGCC GGGTATCGGA TACGGAAGCG GATGAAGGGG CGTTCCAATC
1551 GGAAGAGACC GGTGCGGTAT CCGAACACCT GCCGACAACC GACCTGCTTC
1601 TGCCTCCGCT GTTCAATCCC GAGGCGACGC AAACCGAAGA AGAACTGTTG
1651 GAAAACAGCA TCACCATCGA AGAAAAATTG GCGGAGTTCA AAGTCAAGGT
1701 CAAGGTTGTC GATTCTTATT CCGGCCCCGT GATTACGCGT TATGAAATCG
1751 AACCCGATGT CGGCGTGCGC GGCAATTCCG TTCTGAATTT GGAAAAAGAC
1801 TTGGCGCGTT CGCTCGGCGT GGCTTCCATC CGCGTTGTCG AAACCATCCC
1851 CGGCAAAACC TGCATGGGTT TGGAACTTCC GAACCCGAAA CGCCAAATGA
1901 TACGCCTGAG CGAAATTTTC AATTCGCCCG AGTTTGCCGA ATCCAAATCC
1951 AAGCTGACGC TCGCGCTCGG TCAGGACATT ACCGGACAGC CCGTCGTAAC
2001 CGACTTGGGC AAAGCACCGC ATTTGCTGGT TGCCGGCACG ACCGGTTCGG
2051 GCAAATCGGT GGGTGTCAAC GCGATGATTC TGTCTATGCT TTTCAAAGCC
2101 GCGCCGGAAG ACGTGCGTAT GATTATGATC GATCCGAAAA TGCTGGAATT
2151 GAGCATTTAC GAAGGCATCA CGCACCTGCT CGCCCCTGTC GTTACCGATA
2201 TGAAGCTGGC GGCAAACGCG CTGAACTGGT GTGTTAACGA AATGGAAAAA
2251 CGCTACCGCC TGATGAGCTT TATGGGCGTG CGCAATCTTG CGGGCTTCAA
2301 CCAAAAAATC GCCGAAGCCG CAGCAAGGGG AGAAAAAATC GGCAATCCGT
2351 TCAGCCTCAC GCCCGACGAT CCCGAACCTT TGGAAAAACT GCCGTTTATC
2401 GTGGTCGTGG TCGATGAGTT TGCCGATTTG ATGATGACGG CAGGCAAGAA
2451 AATCGAAGAA CTGATTGCGC GCCTCGCCCA AAAAGCCCGC GCGGCAGGCA
2501 TCCACCTTAT CCTTGCCACA CAACGCCCCA GCGTCGATGT CATCACGGGT
2551 CTGATTAAGG CGAACATCCC GACGCGTATC GCGTTCCAAG TGTCCAGCAA
2601 AATCGACAGC CGCACGATTC TCGACCAAAT GGGCGCGGAA AACCTGCTCG
2651 GTCAGGGCGA TATGCTGTTC CTGCCGCCGG GTACTGCCTA TCCGCAGCGC
2701 GTTCACGGCG CGTTTGCCTC GGATGAAGAG GTGCACCGCG TGGTCGAATA
2751 TCTGAAGCAG TTTGGCGAGC CGGACTATGT TGACGATATT TTGAGCGGCG
2801 GCGGCAGCGA AGAGCTGCCC GGCATCGGGC GCAGCGGCGA CGGCGAAACC
2851 GATCCGATGT ACGACGAGGC CGTATCCGTT GTCCTGAAAA CGCGCAAAGC
2901 CAGCATTTCG GGCGTACAGC GCGCCTTGCG CATCGGCTAC AACCGCGCCG
2951 CGCGTCTGAT TGACCAAATG GAAGCGGAAG GCATTGTGTC CGCACCGGAA
3001 CACAACGGCA ACCGTACGAT TCTCGTCCCC TTGGACAATG CTTGA
This corresponds to the amino acid sequence <SEQ ID 496; ORF58ng-1>:
1 MFWIVLIVIV LLALAGLFFV RAQSEREWMR EVSAWQEKKG EKQAELPEIK
51 DGMPDFPEFS LMLFHAVKTA VYWLFVGVVR FCRNYLAHES EPDRPVPPAS
101 ANRADVPTAS DGYSDSGNGT EEAETEAAEA AEEEAADTED IATAVIDNRR
151 IPFDRSIAEG LMQSESKTSP VRPVFKEITL EEATRALSSA ALRETKKRYI
201 DAFEKNGTAV PKVRVSDTPM EGLQIIGLDD PVLQRTYSRM FDADKEAFSE
251 SADYGFEPYF EKQHPSAFSA VKAENARNAP FRRHAGQEKG QAEAKSPDVS
301 QGQSVSDGTA VRDARRRVSV NLKEPNKATV SAEARISRLI PESRTVVGKR
351 DVEMPSETEN VFTETVSSVG YGGPVYDEAA DIHIEEPAAP DAWVVEPPEV
401 PEVAVPEIDI LPPPPVSEIY NRTYEPPAGF EQAQRSRIAE TDHLAADVLN
451 GGWQEETAAI ADDGSEGAAE RSSGQYLSET EAFGHDSQAV CPFEDVPSER
501 PSCRVSDTEA DEGAFQSEET GAVSEHLPTT DLLLPPLFNP EATQTEEELL
551 ENSITIEEKL AEFKVKVKVV DSYSGPVITR YEIEPDVGVR GNSVLNLEKD
601 LARSLGVASI RVVETIPGKT CMGLELPNPK RQMIRLSEIF NSPEFAESKS
651 KLTLALGQDI TGQPVVTDLG KAPHLLVAGT TGSGKSVGVN AMILSMLFKA
701 APEDVRMIMI DPKMLELSIY EGITHLLAPV VTDMKLAANA LNWCVNEMEK
751 RYRLMSFMGV RNLAGFNQKI AEAAARGEKI GNPFSLTPDD PEPLEKLPFI
801 VVVVDEFADL MMTAGKKIEE LIARLAQKAR AAGIHLILAT QRPSVDVITG
851 LIKANIPTRI AFQVSSKIDS RTILDQMGAE NLLGQGDMLF LPPGTAYPQR
901 VHGAFASDEE VHRVVEYLKQ FGEPDYVDDI LSGGGSEELP GIGRSGDGET
951 DPMYDEAVSV VLKTRKASIS GVQRALRIGY NRAARLIDQM EAEGIVSAPE
1001 HNGNRTILVP LDNA*
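The correspondence between a nucleotide listing and its amino acid listing can be checked by translating the DNA with the standard genetic code. The following is a minimal sketch, not part of the original disclosure: the function name `translate` and the handling of position numbers and lowercase bases are assumptions made for illustration only.

```python
import re
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna: str) -> str:
    """Translate a DNA listing in reading frame 1 (hypothetical helper).

    Position numbers and spaces are stripped, lowercase bases are
    upper-cased, and any codon containing an ambiguous base such as N
    is reported as 'X'. A stop codon is reported as '*'.
    """
    seq = re.sub(r"[^A-Za-z]", "", dna).upper()
    usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
    return "".join(CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, usable, 3))


# First 30 bases and final line of the <SEQ ID 495> listing.
print(translate("1 ATGTTTTGGA TAGTTTTGAT CGTTATtgtg"))
print(translate("3001 CACAACGGCA ACCGTACGAT TCTCGTCCCC TTGGACAATG CTTGA"))
```

Under these assumptions, the two fragments translate to the first ten and last fourteen residues (plus the stop) of the ORF58ng-1 listing.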
ORF58ng-1 and ORF58-1 show 97.2% identity over a 1014 amino acid overlap:
10 20 30 40 50 60
orf58-1.pep MFWIVLIVILLLALAGLFFVRAQSEREWMREVSAWQEKKGEKQAELPEIKDGMPDFPELA
|||||||||:||||||||||||||||||||||||||||||||||||||||||||||||::
orf58ng-1 MFWIVLIVIVLLALAGLFFVRAQSEREWMREVSAWQEKKGEKQAELPEIKDGMPDFPEFS
10 20 30 40 50 60
70 80 90 100 110 120
orf58-1.pep LMLFHAVKTAVYWLFVGVVRFCRNYLAHESEPDRPVPPASANRADVPTASDGYSDSGNGT
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf58ng-1 LMLFHAVKTAVYWLFVGVVRFCRNYLAHESEPDRPVPPASANRADVPTASDGYSDSGNGT
70 80 90 100 110 120
130 140 150 160 170 180
orf58-1.pep EEAETEEAEAAEEEAADTEDIATAVIDNRRIPFDRSIAEGLMPSESEISPVRPVFKEITL
|||||| ||||||||||||||||||||||||||||||||||| |||: ||||||||||||
orf58ng-1 EEAETEAAEAAEEEAADTEDIATAVIDNRRIPFDRSIAEGLMQSESKTSPVRPVFKEITL
130 140 150 160 170 180
190 200 210 220 230 240
orf58-1.pep EEATRALNSAALRETKKRYIDAFEKNETAVPKVRVSDTPMEGLQIIGLDDPVLQRTYSHM
|||||||:|||||||||||||||||| |||||||||||||||||||||||||||||||:|
orf58ng-1 EEATRALSSAALRETKKRYIDAFEKNGTAVPKVRVSDTPMEGLQIIGLDDPVLQRTYSRM
190 200 210 220 230 240
250 260 270 280 290 300
orf58-1.pep FDADKEAFSESADYGFEPYFEKQHPSAFSAVKAENARNAPFHRHAGQGKGQAEAKSPDVS
|||||||||||||||||||||||||||||||||||||||||:||||| ||||||||||||
orf58ng-1 FDADKEAFSESADYGFEPYFEKQHPSAFSAVKAENARNAPFRRHAGQEKGQAEAKSPDVS
250 260 270 280 290 300
310 320 330 340 350 360
orf58-1.pep QGQSVSDGTAVRDARRRVSVNLKEPNKATVSAEARISRLIPESQTVVGKRDVEMPSETEN
|||||||||||||||||||||||||||||||||||||||||||:||||||||||||||||
orf58ng-1 QGQSVSDGTAVRDARRRVSVNLKEPNKATVSAEARISRLIPESRTVVGKRDVEMPSETEN
310 320 330 340 350 360
370 380 390 400 410 420
orf58-1.pep VFTETVSSVGYGGPVYDETADIHIEEPAAPDAWVVEPPEVPKVPMTAIDIQPPPPVSEIY
||||||||||||||||||:||||||||||||||||||||||:| : ||| |||||||||
orf58ng-1 VFTETVSSVGYGGPVYDEAADIHIEEPAAPDAWVVEPPEVPEVAVPEIDILPPPPVSEIY
370 380 390 400 410 420
430 440 450 460 470 480
orf58-1.pep NRTYEPPSGFEQVQRSRIAETDHLADDVLNGGWQEETAAIADDGSEGAAERSSGQYLSET
|||||||:||||:|||||||||||| ||||||||||||||||||||||||||||||||||
orf58ng-1 NRTYEPPAGFEQAQRSRIAETDHLAADVLNGGWQEETAAIADDGSEGAAERSSGQYLSET
430 440 450 460 470 480
490 500 510 520 530 540
orf58-1.pep EAFGHDSQAVCPFENVPSERPSCRVSDTEADEGAFPSEETGAVSEHLPTTDLLLPPLFNP
||||||||||||||:|||||||||||||||||||| ||||||||||||||||||||||||
orf58ng-1 EAFGHDSQAVCPFEDVPSERPSCRVSDTEADEGAFQSEETGAVSEHLPTTDLLLPPLFNP
490 500 510 520 530 540
550 560 570 580 590 600
orf58-1.pep EATQTEEELLENSITIEEKLAEFKVKVKVVDSYSGPVITRYEIEPDVGVRGNSVLNLEKD
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf58ng-1 EATQTEEELLENSITIEEKLAEFKVKVKVVDSYSGPVITRYEIEPDVGVRGNSVLNLEKD
550 560 570 580 590 600
610 620 630 640 650 660
orf58-1.pep LARSLGVASIRVVETIPGKTCMGLELPNPKRQMIRLSEIFNSPEFAESKSKLTLALGQDI
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf58ng-1 LARSLGVASIRVVETIPGKTCMGLELPNPKRQMIRLSEIFNSPEFAESKSKLTLALGQDI
610 620 630 640 650 660
670 680 690 700 710 720
orf58-1.pep TGQPVVTDLGKAPHLLVAGTTGSGKSVGVNAMILSMLFKAAPEDVRMIMIDPKMLELSIY
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf58ng-1 TGQPVVTDLGKAPHLLVAGTTGSGKSVGVNAMILSMLFKAAPEDVRMIMIDPKMLELSIY
670 680 690 700 710 720
730 740 750 760 770 780
orf58-1.pep EGIPHLLAPVVTDMKLAANALNWCVNEMEKRYRLMSFMGVRNLAGFNQKIAEAAARGEKI
||| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf58ng-1 EGITHLLAPVVTDMKLAANALNWCVNEMEKRYRLMSFMGVRNLAGFNQKIAEAAARGEKI
730 740 750 760 770 780
790 800 810 820 830 840
orf58-1.pep GNPFSLTPDDPEPLEKLPFIVVVVDEFADLMMTAGKKIEELIARLAQKARAAGIHLILAT
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf58ng-1 GNPFSLTPDDPEPLEKLPFIVVVVDEFADLMMTAGKKIEELIARLAQKARAAGIHLILAT
790 800 810 820 830 840
850 860 870 880 890 900
orf58-1.pep QRPSVDVITGLIKANIPTRIAFQVSSKIDSRTILDQMGAENLLGQGDMLFLLPGTAYPQR
||||||||||||||||||||||||||||||||||||||||||||||||||| ||||||||
orf58ng-1 QRPSVDVITGLIKANIPTRIAFQVSSKIDSRTILDQMGAENLLGQGDMLFLPPGTAYPQR
850 860 870 880 890 900
910 920 930 940 950 960
orf58-1.pep VHGAFASDEEVHRVVEYLKQFGEPDYVDDILSGGGSEELPGIGRSGDDETDPMYDEAVSV
||||||||||||||||||||||||||||||||||||||||||||||| ||||||||||||
orf58ng-1 VHGAFASDEEVHRVVEYLKQFGEPDYVDDILSGGGSEELPGIGRSGDGETDPMYDEAVSV
910 920 930 940 950 960
970 980 990 1000 1010
orf58-1.pep VLKTRKASISGVQRALRIGYNRAARLIDQMEAEGIVSAPEHNGNRTILVPLDNAX
|||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf58ng-1 VLKTRKASISGVQRALRIGYNRAARLIDQMEAEGIVSAPEHNGNRTILVPLDNAX
970 980 990 1000 1010
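The identity figure quoted for an alignment of this kind is the share of aligned columns in which both sequences carry the same residue. The sketch below shows one way to compute it; it is an illustration only, the function name `percent_identity` is hypothetical, and the choice of denominator (all alignment columns, including gapped ones) is an assumption, since alignment programs differ on this point.

```python
def percent_identity(a: str, b: str) -> float:
    """Percent identity of two already-aligned sequences of equal length.

    Gap characters ('-') never count as matches. The denominator is the
    full alignment length, which is an assumption of this sketch.
    """
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    if not a:
        raise ValueError("alignment is empty")
    matches = sum(1 for x, y in zip(a, b) if x == y and x != "-")
    return 100.0 * matches / len(a)


# Residues 1-10 of orf58-1.pep and orf58ng-1 differ at one position.
print(percent_identity("MFWIVLIVIL", "MFWIVLIVIV"))
```

Applied to the complete aligned strings rather than this ten-residue excerpt, the same calculation gives the whole-overlap figure.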
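In addition, ORF58ng-1 shows significant homology to the E. coli protein FtsK:
sp|P46889|FTSK_ECOLI cell division protein FTSK >gi|1651412|gnl|PID|d1015290 (D1 division protein FtsK [Escherichia coli] >gi|1651418|gnl|PID|d1015296 (D90727) cell division protein FtsK [Escherichia coli] >gi|1787117 (AE000191) cell division protein FtsK [Escherichia coli] Length = 1329
Score = 576 bits (1469), Expect = e-163
Identities = 301/459 (65%), Positives = 353/459 (76%), Gaps = 5/459 (1%)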
Query: 556 IEEKLAEFKVKVKVVDSYSGPVITRYEIEPDVGVRGNSVLNLEKDLARSLGVASIRVVET 615
+E +LA+F++K VV+ GPVITR+E+ GV+ + NL +DLARSL ++RVVE
Sbjct: 868 VEARLADFRIKADVVNYSPGPVITRFELNLAPGVKAARISNLSRDLARSLSTVAVRVVEV 927
Query: 616 IPGKTCMGLELPNPKRQMIRLSEIFNSPEFAESKSKLTLALGQDITGQPVVTDLGKAPHL 675
IPGK +GLELPN KRQ + L E+ ++ +F ++ S LT+ LG+DI G+PVV DL K PHL
Sbjct: 928 IPGKPYVGLELPNKKRQTVYLREVLDNAKFRDNPSPLTVVLGKDIAGEPVVADLAKMPHL 987
Query: 676 LVAGTTGSGKSVGVNAMILSMLFKAAPEDVRMIMIDPKMLELSIYEGITHLLAPVVTDMK 735
LVAGTTGSGKSVGVNAMILSML+KA PEDVR IMIDPKMLELS+YEGI HLL VVTDMK
Sbjct: 988 LVAGTTGSGKSVGVNAMILSMLYKAQPEDVRFIMIDPKMLELSVYEGIPHLLTEVVTDMK 1047
Query: 736 LAANALNWCVNEMEKRYRLMSFMGVRNLAGFNQKIAEAAARGEKIGNPFSLTPDDPEP-- 793
AANAL WCVNEME+RY+LMS +GVRNLAG+N+KIAEA I +P+ D +
Sbjct: 1048 DAANALRWCVNEMERRYKLMSALGVRNLAGYNEKIAEADRMMRPIPDPYWKPGDSMDAQH 1107
Query: 794 --LEKLPFIVVVVDEFADLMMTAGKKIEELIARLAQKARAAGIHLILATQRPSVDVITGL 851
L+K P+IVV+VDEFADLMMT GKK+EELIARLAQKARAAGIHL+LATQRPSVDVITGL
Sbjct: 1108 PVLKKEPYIVVLVDEFADLMMTVGKKVEELIARLAQKARAAGIHLVLATQRPSVDVITGL 1167
Query: 852 IKANIPTRIAFQVSSKIDSRTILDQMGAENLLGQGDMLFLPPGTAYPQRVHGAFASDEEV 911
IKANIPTRIAF VSSKIDSRTILDQ GAE+LLG GDML+ P + P RVHGAF D+EV
Sbjct: 1168 IKANIPTRIAFTVSSKIDSRTILDQAGAESLLGMGDMLYSGPNSTLPVRVHGAFVRDQEV 1227
Query: 912 HRVVEYLKQFGEPDYVDDILSGGGSEELPGIGRSGDGETDPMYDEAVSVVLKTRKASISG 971
H VV+ K G P YVD I S SE G G G E DP++D+AV V + RKASISG
Sbjct: 1228 HAVVQDWKARGRPQYVDGITSDSESEGGAG-GFDGAEELDPLFDQAVQFVTEKRKASISG 1286
Query: 972 VQRALRIGYNRAARLIDQMEAEGIVSAPEHNGNRTILVP 1010
VQR RIGYNRAAR+I+QMEA+GIVS HNGNR +L P
Sbjct: 1287 VQRQFRIGYNRAARIIEQMEAQGIVSEQGHNGNREVLAP 1325
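Based on this analysis, it is predicted that this protein from N. meningitidis and N. gonorrhoeae, and its epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies.
Example 59
The following partial DNA sequence was identified in N. meningitidis <SEQ ID 497>: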
1 ATGATTTATC AAAGAAACCT CATCAAAGAA CTCTCTTTTA CCGCCGTCGG
51 CATTTTCGTC GTCCTCTTGG CGGTATTGGT CTCCACGCAG GCAATCAACC
101 TGCTCGGCCG TGCCGCCGAC GGGC..GTGA TCGCCATCGA TGCCGTGTTG
151 GCATTGGTCG GCTTCTGGGT C......... .......... ..........
//
901 .........A TTGCCATCGG TTTGTTTTTA ATTTACCAAA ACGGGCTGAC
951 CCTGCTTTTT GAAGCCGTGG AAGACGGCAA AATCCATTTT TGGCTCGGAC
1001 TGCTGCCTAT GCACATTATC ATGTTTGTCC TTGCACTCAT CCTGTTGCGC
1051 GTCCGCAGTA TGCCCAGCCA GCCCTTGTGG CAGGCGGTTG GCAAAAGTCT
1101 GACATTGAAA GGCGGAAAAT GA
This corresponds to the amino acid sequence <SEQ ID 498; ORF101>:
1 MIYQRNLIKE LSFTAVGIFV VLLAVLVSTQ AINLLGRAAD GXVIAIDAVL
51 ALVGFWV... .......... .......... .......... ..........
//
301 ...IAIGLFL IYQNGLTLLF EAVEDGKIHF WLGLLPMHII MFVLALILLR
351 VRSMPSQPFW QAVGKSLTLK GGK*
Further work revealed the complete nucleotide sequence <SEQ ID 499>:
1 ATGATTTATC AAAGAAACCT CATCAAAGAA CTCTCTTTTA CCGCCGTCGG
51 CATTTTCGTC GTCCTCTTGG CGGTATTGGT CTCCACGCAG GCAATCAACC
101 TGCTCGGCCG TGCCGCCGAC GGGCGTGTCG CCATCGATGC CGTGTTGGCA
151 TTGGTCGGCT TCTGGGTCAT CGGTATGACG CCGCTTTTGC TGGTGTTGAC
201 CGCATTTATC AGTACGTTGA CCGTGTTGAC CCGCTACTGG CGCGACAGCG
251 AAATGTCGGT CTGGCTATCC TGCGGATTGG CATTGAAACA ATGGATACGC
301 CCGGTGATGC AGTTTGCCGT GCCGTTTGCC GTTTTGGTTG CCGTCATGCA
351 GCTTTGGGTG ATACCGTGGG CAGAGCTACG CAGCCGCGAA TACGCTGAAA
401 TCCTGAAGCA GAAGCAGGAA TTGTCTTTGG TGGAGGCAGG CGAGTTCAAC
451 AGTTTGGGCA AGCGCAACGG CAGGGTTTAT TTTGTCGAAA CCTTCGATAC
501 CGAATCCGGC ATCATGAAAA ACCTGTTCCT GCGCGAACAG GACAAAAACG
551 GCGGCGACAA CATCATCTTC GCCAAAGAAG GTAACTTCTC GCTGAACGAC
601 AACAAACGCA CGCTCGAATT GCGCCACGGC TACCGTTACA GCGGCACGCC
651 CGGACGCGCC GACTACAATC AGGTTTCCTT CCAAAAACTC AACCTGATTA
701 TCAGCACCAC GCCCAAACTC ATCGACCCCG TTTCCCACCG CCGTACCATT
751 CCGACCGCCC AACTGATTGG CAGCAGCAAC CCGCAACATC AGGCGGAATT
801 GATGTGGCGC ATCTCGCTGA CCGTCAGCGT CCTCCTACTC TGCCTGCTTG
851 CCGTGCCGCT TTCCTATTTC AACCCGCGCA GCGGACATAC CTACAATATC
901 TTGATTGCCA TCGGTTTGTT TTTAATTTAC CAAAACGGGC TGACCCTGCT
951 TTTTGAAGCC GTGGAAGACG GCAAAATCCA TTTTTGGCTC GGACTGCTGC
1001 CTATGCACAT TATCATGTTT GCCGTTGCAC TCATCCTGTT GCGCGTCCGC
1051 AGTATGCCCA GCCAGCCCTT CTGGCAGGCG GTTGGCAAAA GTCTGACATT
1101 GAAAGGCGGA AAATGA
This corresponds to the amino acid sequence <SEQ ID 500; ORF101-1>:
1 MIYQRNLIKE LSFTAVGIFV VLLAVLVSTQ AINLLGRAAD GRVAIDAVLA
51 LVGFWVIGMT PLLLVLTAFI STLTVLTRYW RDSEMSVWLS CGLALKQWIR
101 PVMQFAVPFA VLVAVMQLWV IPWAELRSRE YAEILKQKQE LSLVEAGEFN
151 SLGKRNGRVY FVETFDTESG IMKNLFLREQ DKNGGDNIIF AKEGNFSLND
201 NKRTLELRHG YRYSGTPGRA DYNQVSFQKL NLIISTTPKL IDPVSHRRTI
251 PTAQLIGSSN PQHQAELMWR ISLTVSVLLL CLLAVPLSYF NPRSGHTYNI
301 LIAIGLFLIY QNGLTLLFEA VEDGKIHFWL GLLPMHIIMF AVALILLRVR
351 SMPSQPFWQA VGKSLTLKGG K*
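Computer analysis of this amino acid sequence gave the following results:
Homology with a predicted ORF from N. meningitidis (strain A)
ORF101 shows 91.2% identity over a 57 amino acid overlap and 95.7% identity over a 69 amino acid overlap with an ORF (ORF101a) from strain A of N. meningitidis: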
10 20 30 40 50
orf101.pep MIYQRNLIKELSFTAVGIFVVLLAVLVSTQAINLLGRAADGXVIAIDAVLALVGFWVX
|||||||||||||||||||||||||||||||||||| ||| ||||||||||||||
orf101a MIYQRNLIKELSFTAVGIFVVLLAVLVSTQAINLLGXAADXRX-AIDAVLALVGFWVXXM
10 20 30 40 50
//
90 100 110
orf101.pep .............................IAIGLFLIYQNGLTLLFEAVEDGKIHFWLGL
||||||||||||||||||||||||||||||
orf101a LTVSVLLLCLLAVPLSYFNPRSGHTYNILXAIGLFLIYQNGLTLLFEAVEDGKIHFWLGL
280 290 300 310 320 330
120 130 140 150
orf101.pep LPMHIIMFVLALILLRVRSMPSQPFWQAVGKSLTLKGGKX
|||||||||:|::|||||||||||||||||||||||||||
orf101a LPMHIIMFVIAIVLLRVRSMPSQPFWQAVGKSLTLKGGKX
340 350 360 370
The complete length ORF101a nucleotide sequence <SEQ ID 501> is:
1 ATGATTTATC AAAGAAACCT CATCAAAGAA CTCTCTTTTA CCGCCGTCGG
51 CATTTTCGTC GTCCTCTTGG CGGTATTGGT CTCCACGCAG GCAATCAACC
101 TGCTCGGCCN TGCCGCCGAC NGGCGTNTCG CCATCGATGC CGTGTTGGCA
151 TTGGTCGGCT TCTGGGTCNN NNGNATGACG CCGCTTTTGC TNGTGTTGAC
201 CGCATTTATC AGTACGTTGA CCGTGTTGAC CCGCTACTGG CGNGACAGCG
251 AAATGTCGGT CTGGNTATCC TGCGGATTGG CATTGAAACA ATGGATACGC
301 CCGGTGATGC AGTTTGCCGT GCCGTTTGCC GTTTTGGTTG CCGTCATGCA
351 GCTTTGGGTG ATACCGTGGG CAGAGCTACG CAGCCGCGAA TACGCTGAAA
401 TCCTGAAGCA GAAGCAGGAA TTGTCTTTGG TGGAGGCAGG CGGGTTCAAC
451 AGTTTGGGCA AGCGCAACGG CAGGGTTTAT TTTGTCGAAA CCTTCGATAC
501 CGAATCCGGC ATCATGAAAA ACCTGTTCCT GCGCGAACAG GACAAAAACG
551 GCGGCGACAA CATCATCTTC NCCAAAGAAA GTAACTTCTC GCTGAACGAC
601 AACAAACGCA CGCTCGAATT GCGCCACGGC TACCGTTACA GCGGCACGCC
651 CGGACGCGCC GACTACAATC AGGTTTCCTT CCNAAAACTC AACCTGATTA
701 TCAGCACCAC GCCCAAACTC ATCGACCCCG TTTCCCACCG CCGTACNATN
751 CCNACNGCCC AACTGATTGG CAGCAGCAAC CCGCAACATC ANGCGGAATT
801 GATGTGGCGC ATCTCGCTGA CCGTCAGCGT CCTCCTACTC TGCCTGCTTG
851 CCGTGCCGCT TTCCTATTTC AACCCGCGCA GCGGACATAC CTACAATATC
901 TTGANTGCCA TCGGTTTGTT TTTAATTTAC CAAAACGGGC TGACCCTGCT
951 TTTTGAAGCC GTGGAAGACG GCAAAATCCA TTTTTGGCTC GGACTGCTGC
1001 CTATGCACAT CATCATGTTC GTCATCGCAA TCGTACTTCT GCGCGTCCGC
1051 AGCATGCCCA GCCAGCCCTT CTGGCAGGCG GTTGGCAAAA GTCTGACATT
1101 GAAAGGCGGA AAATGA
This encodes a protein having the amino acid sequence <SEQ ID 502>:
1 MIYQRNLIKE LSFTAVGIFV VLLAVLVSTQ AINLLGXAAD XRXAIDAVLA
51 LVGFWVXXMT PLLLVLTAFI STLTVLTRYW RDSEMSVWXS CGLALKQWIR
101 PVMQFAVPFA VLVAVMQLWV IPWAELRSRE YAEILKQKQE LSLVEAGGFN
151 SLGKRNGRVY FVETFDTESG IMKNLFLREQ DKNGGDNIIF XKESNFSLND
201 NKRTLELRHG YRYSGTPGRA DYNQVSFXKL NLIISTTPKL IDPVSHRRTX
251 PTAQLIGSSN PQHXAELMWR ISLTVSVLLL CLLAVPLSYF NPRSGHTYNI
301 LXAIGLFLIY QNGLTLLFEA VEDGKIHFWL GLLPMHIIMF VIAIVLLRVR
351 SMPSQPFWQA VGKSLTLKGG K*
ORF101a and ORF101-1 show 95.4% identity over a 371 amino acid overlap:
orf101a.pep MIYQRNLIKELSFTAVGIFVVLLAVLVSTQAINLLGXAADXRXAIDAVLALVGFWVXXMT 60
|||||||||||||||||||||||||||||||||||| ||| | ||||||||||||| ||
orf101-1 MIYQRNLIKELSFTAVGIFVVLLAVLVSTQAINLLGRAADGRVAIDAVLALVGFWVIGMT 60
orf101a.pep PLLLVLTAFISTLTVLTRYWRDSEMSVWXSCGLALKQWIRPVMQFAVPFAVLVAVMQLWV 120
|||||||||||||||||||||||||||| |||||||||||||||||||||||||||||||
orf101-1 PLLLVLTAFISTLTVLTRYWRDSEMSVWLSCGLALKQWIRPVMQFAVPFAVLVAVMQLWV 120
orf101a.pep IPWAELRSREYAEILKQKQELSLVEAGGFNSLGKRNGRVYFVETFDTESGIMKNLFLREQ 180
||||||||||||||||||||||||||| ||||||||||||||||||||||||||||||||
orf101-1 IPWAELRSREYAEILKQKQELSLVEAGEFNSLGKRNGRVYFVETFDTESGIMKNLFLREQ 180
orf101a.pep DKNGGDNIIFXKESNFSLNDNKRTLELRHGYRYSGTPGRADYNQVSFXKLNLIISTTPKL 240
|||||||||| ||:||||||||||||||||||||||||||||||||| ||||||||||||
orf101-1 DKNGGDNIIFAKEGNFSLNDNKRTLELRHGYRYSGTPGRADYNQVSFQKLNLIISTTPKL 240
orf101a.pep IDPVSHRRTXPTAQLIGSSNPQHXAELMWRISLTVSVLLLCLLAVPLSYFNPRSGHTYNI 300
||||||||| ||||||||||||| ||||||||||||||||||||||||||||||||||||
orf101-1 IDPVSHRRTIPTAQLIGSSNPQHQAELMWRISLTVSVLLLCLLAVPLSYFNPRSGHTYNI 300
orf101a.pep LXAIGLFLIYQNGLTLLFEAVEDGKIHFWLGLLPMHIIMFVIAIVLLRVRSMPSQPFWQA 360
| ||||||||||||||||||||||||||||||||||||||::|::|||||||||||||||
orf101-1 LIAIGLFLIYQNGLTLLFEAVEDGKIHFWLGLLPMHIIMFAVALILLRVRSMPSQPFWQA 360
orf101a.pep VGKSLTLKGGK 371
|||||||||||
orf101-1 VGKSLTLKGGK 371
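Homology with a predicted ORF from N. gonorrhoeae
ORF101 shows 96.5% identity over a 57 amino acid overlap in the N-terminal domain and 95.1% identity over a 61 amino acid overlap in the C-terminal domain with a predicted ORF (ORF101ng) from N. gonorrhoeae: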
orf101.pep MIYQRNLIKELSFTAVGIFVVLLAVLVSTQAINLLGRAADGXVIAIDAVLALVGFWV 57
||||||||||||||||||||||||||||||||||||||||| | |||||||||||||
orf101ng MIYQRNLIKELSFTAVGIFVVLLAVLVSTQAINLLGRAADGRV-AIDAVLALVGFWVIGM 59
//
orf101.pep IAIGLFLIYQNGLTLLFEAVEDGKIHFWLG 333
||||||||||||||||||||||||||||||
orf101ng SLTVSVLLLCLLAVPLSYFNPRSGHTYNILIAIGLFLIYQNGLTLLFEAVEDGKIHFWLG 331
orf101.pep LLPMHIIMFVLALILLRVRSMPSQPFWQAVGKSLTLKGGK 373
||||||||||:|::|||||||||||||||||
orf101ng LLPMHIIMFVIAIVLLRVRSMPSQPFWQAVG 362
The ORF101ng nucleotide sequence <SEQ ID 503> is predicted to encode a protein having the partial amino acid sequence <SEQ ID 504>:
51 LVGFWVIGMT PLLLVLTAFI STLTVLTRYW RDSEMSVWLS CGLALKQWIR
101 PVMQFAVPFA ILIAVMQLWV IPWAELRSRE YAEILKQKQE LSLVEAGEFN
151 NLGKRNGRVY FVETFDTESG IMKNLFLREQ DKNGGDNIIF AKEGNFSLKD
201 NKRTLELRHG YRYSGTPGRA DYNQVSFQKL NLIISTTPKL IDPVSHRRTI
251 STAQLIGSSN PQHQAELMWR ISLTVSVLLL CLLAVPLSYF NPRSGHTYNI
301 LIAIGLFLIY QNGLTLLFEA VEDGKIHFWL GLLPMHIIMF VIAIVLLRVR
351 SMPSQPFWQA VG...
Further work revealed the complete nucleotide sequence <SEQ ID 505>:
1 ATGATTTATC AAAGAAACCT CATCAAAGAA CTCTCTTTTA CCGCCGTCGG
51 CATTTTCGTC GTCCTCTTGG CGGTGTTGGT GTCCACGCAG GCGATCAACC
101 TGCTTGGCCG CGCAGCTGAC GGGCGTGTCG CCATCGATGC CGTGTTGGCC
151 TTAGTCGGCT TCTGGGTCAT CGGTATGACC CCGCTTTTGC TGGTGTTGAC
201 CGCATTCATC AGCACGCTGA CCGTATTGAC CCGCTACTGG CGCGACAGCG
251 AAATGTCGGT CTGGCTATCC TGCGGATTGG CGTTGAAACA GTGGATACGC
301 CCCGTCATGC AGTTTGCCGT GCCGTTTGCC ATCCTGATTG CCGTCATGCA
351 GCTTTGGGTG ATACCGTGGG CAGAGCTGCG CAGCCGCGAA TATGCCGAAA
401 TTTTGAAGCA GAAGCAGGAA TTGTCTTTGG TGGAAGCCGG CGAGTTCAAT
451 AACTTGGGCA AGCGCAACGG CAgggtttaT TtcgtcgaaA CCTTTGACAC
501 CGaatccgGC ATCATGAAAA ACCTGTtcct GGGCGAACAG GACAAAAACG
551 gcggcgacaA CATCATCTTC GCcaaaGAag gtaactTctc gctgaaggaC
601 AACAAAcgca cgctcgaATT GCGCCACGGC TACCGTTACA GCGGcacgcC
651 CGGacGCGCc gactaCAATC AGGTTtcctt cCAAAAacTc aacctgATta
701 TCAGCACCAC GCCCAAacTT ATCGaccCCG TTTCCCACCG CCGCACCATT
751 tcgacCGCCC AAcTGATTGG CAGCAGCAAT CCGCAACATC AGGCAGAATT
801 GATGTGGCGC ATCTCGCTGA CCGTCAGCGT CCTCCTGCTC TGCCTACTCG
851 CCGTGCCGCT TTCCTATTTC AACCCGCGCA GCGGACATAC CTACAATATC
901 TTGATTGCCA TCGGTTTGTT TTTAATTTAC CAAAACGGGC TGACCCTGCT
951 TTTTGAAGCC GTGGAAGACG GCAAAATCCA TTTTTGGCTC GGACTGCTGC
1001 CTATGCACAT CATCATGTTC GTCATCGCAA TCGTACTTCT GCGCGTCCGC
1051 AGTATGCCCA GCCAGCCCTT CTGGCAGGCG GTTGGCAAAA GTCTGACATT
1101 GAAAGgcgGA AAATGA
This corresponds to the amino acid sequence <SEQ ID 506; ORF101ng-1>:
1 MIYQRNLIKE LSFTAVGIFV VLLAVLVSTQ AINLLGRAAD GRVAIDAVLA
51 LVGFWVIGMT PLLLVLTAFI STLTVLTRYW RDSEMSVWLS CGLALKQWIR
101 PVMQFAVPFA ILIAVMQLWV IPWAELRSRE YAEILKQKQE LSLVEAGEFN
151 NLGKRNGRVY FVETFDTESG IMKNLFLREQ DKNGGDNIIF AKEGNFSLKD
201 NKRTLELRHG YRYSGTPGRA DYNQVSFQKL NLIISTTPKL IDPVSHRRTI
251 STAQLIGSSN PQHQAELMWR ISLTVSVLLL CLLAVPLSYF NPRSGHTYNI
301 LIAIGLFLIY QNGLTLLFEA VEDGKIHFWL GLLPMHIIMF VIAIVLLRVR
351 SMPSQPFWQA VGKSLTLKGG K*
ORF101ng-1 and ORF101-1 show 97.6% identity over a 371 amino acid overlap:
10 20 30 40 50 60
orf101-1.pep MIYQRNLIKELSFTAVGIFVVLLAVLVSTQAINLLGRAADGRVAIDAVLALVGFWVIGMT
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf101ng-1 MIYQRNLIKELSFTAVGIFVVLLAVLVSTQAINLLGRAADGRVAIDAVLALVGFWVIGMT
10 20 30 40 50 60
70 80 90 100 110 120
orf101-1.pep PLLLVLTAFISTLTVLTRYWRDSEMSVWLSCGLALKQWIRPVMQFAVPFAVLVAVMQLWV
||||||||||||||||||||||||||||||||||||||||||||||||||:|:|||||||
orf101ng-1 PLLLVLTAFISTLTVLTRYWRDSEMSVWLSCGLALKQWIRPVMQFAVPFAILIAVMQLWV
70 80 90 100 110 120
130 140 150 160 170 180
orf101-1.pep IPWAELRSREYAEILKQKQELSLVEAGEFNSLGKRNGRVYFVETFDTESGIMKNLFLREQ
||||||||||||||||||||||||||||||:|||||||||||||||||||||||||||||
orf101ng-1 IPWAELRSREYAEILKQKQELSLVEAGEFNNLGKRNGRVYFVETFDTESGIMKNLFLREQ
130 140 150 160 170 180
190 200 210 220 230 240
orf101-1.pep DKNGGDNIIFAKEGNFSLNDNKRTLELRHGYRYSGTPGRADYNQVSFQKLNLIISTTPKL
||||||||||||||||||:|||||||||||||||||||||||||||||||||||||||||
orf101ng-1 DKNGGDNIIFAKEGNFSLKDNKRTLELRHGYRYSGTPGRADYNQVSFQKLNLIISTTPKL
190 200 210 220 230 240
250 260 270 280 290 300
orf101-1.pep IDPVSHRRTIPTAQLIGSSNPQHQAELMWRISLTVSVLLLCLLAVPLSYFNPRSGHTYNI
|||||||||| |||||||||||||||||||||||||||||||||||||||||||||||||
orf101ng-1 IDPVSHRRTISTAQLIGSSNPQHQAELMWRISLTVSVLLLCLLAVPLSYFNPRSGHTYNI
250 260 270 280 290 300
310 320 330 340 350 360
orf101-1.pep LIAIGLFLIYQNGLTLLFEAVEDGKIHFWLGLLPMHIIMFAVALILLRVRSMPSQPFWQA
||||||||||||||||||||||||||||||||||||||||::|::|||||||||||||||
orf101ng-1 LIAIGLFLIYQNGLTLLFEAVEDGKIHFWLGLLPMHIIMFVIAIVLLRVRSMPSQPFWQA
310 320 330 340 350 360
370
orf101-1.pep VGKSLTLKGGKX
||||||||||||
orf101ng-1 VGKSLTLKGGKX
370
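Based on this analysis (including the presence in the gonococcal protein of a putative leader sequence (double-underlined) and several putative transmembrane domains (single-underlined)), it is predicted that this protein from N. meningitidis and N. gonorrhoeae, and its epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies.
Example 60
The following partial DNA sequence was identified in N. meningitidis <SEQ ID 507>: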
1 ..GGTGGTGGTT TTATCAATGC TTCCTGTGCC ACTTTGACGA CAGCCAAACC
51 GCAATATCAA GCAGGAGACC TTAGCGCTTT TAAGATAAGG CAAGGCAATG
101 TTGTAATCGC CGGACACGGT TTGGATGCAC GTGATACCGA TTACACACGT
151 ATTCTCAGTT ATCATTCCAA AATCGATGCA CCCGTATGGG GACAAGATGT
201 TCGTGTCGTC GCGGGACAAA ACGATGTGGC CGCAACAGGT GATGCACATT
251 CGCCTATTCT CAATAATGCT GCTGCCAATA CGTCAAACAA TACAGCCAAC
301 AACGGCACAC ATATCCCTTT ATTTGCGATT GATACAGGCA AATTAGGAGG
351 TAT.GTATGC CAACAAAATC ACCTTGATCA GTACGGTCGA GCAAGCAGGC
401 ATTCGTAA
This corresponds to the amino acid sequence <SEQ ID 508; ORF113>:
1 ..GGGFINASCA TLTTAKPQYQ AGDLSAFKIR QGNVVIAGHG LDARDTDYTR
51 ILSYHSKIDA PVWGQDVRVV AGQNDVAATG DAHSPILNNA AANTSNNTAN
101 NGTHIPLFAI DTGKLGGXVC QQNHLDQYGR ASRHS*
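Computer analysis of this amino acid sequence gave the following results:
Homology with the pspA putative secreted protein of N. meningitidis (accession number AF030941)
ORF113 and pspA show 44% amino acid identity over a 179 amino acid overlap: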
orf113 GGGFINASCATLTTAKPQYQAGDLSAFKIRQGNVVIAGHGLDARDTDYTRILSYHSKIDA 60
GGG INA+ TLT+ P G+L+ F + G VVI G GLD D DYTRILS ++I+A
pspa GGGLINAASVTLTSGVPVLNNGNLTGFDVSSGKVVIGGKGLDTSDADYTRILSRAAEINA 256
orf113 PVWGQDVRVVAGQNDVAATGDAHSPILXXXXXXXXXXXXXXGTHIPLFAIDTGKLGGMYA 120
VWG+DV+VV+G+N + G + P AIDT LGGMYA
pspa GVWGKDVKVVSGKNKLDFDG---------SLAKTASAPSSSDSVTPTVAIDTATLGGMYA 307
orf113 NKITLISTVEQAGIRNQGQWFASAGNVAVNAEGKLVNTGMIAATGENHAVSLHARNVHN 179
+KITLIST A IRN+G+ FA+ G V ++A+GKL N+G I A +++ A+ V N
pspa DKITLISTDNGAVIRNKGRIFAATGGVTLSADGKLSNSGSIDAA----EITISAQTVDN 362
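Homology with a predicted ORF from N. gonorrhoeae
ORF113 shows 86.5% identity over a 52 amino acid overlap in the N-terminal part and 94.1% identity over a 17 amino acid overlap in the C-terminal part with a predicted ORF (ORF113ng) from N. gonorrhoeae: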
orf113 GGGFINASCATLTTAKPQYQAGDLSAFKIR 30
|||||||| |||||::||||||||:|:|||
orf113ng SHPSQLNGYIEVGGRRAEVVIANPAGIAVNGGGFINASRATLTTGQPQYQAGDFSGFKIR 224
orf113 QGNVVIAGHGLDARDTDYTRILSYHSKIDAPVWGQDVRVVAGQNDVAATGDAHSPILNNA 90
|||:|||||||||||||:||||
orf113ng QGNAVIAGHGLDARDTDFTRILVCQQNHLDQYGRTSRHS 263
orf113 IDTGKLGGXVCQQNHLDQYGRASRHS 135
||||||||||||:||||
orf113ng DFSGFKIRQGNAVIAGHGLDARDTDFTRILVCQQNHLDQYGRTSRHS 263
The complete length ORF113ng nucleotide sequence <SEQ ID 509> is predicted to encode a protein having the amino acid sequence <SEQ ID 510>:
1 MNKTLYRVIF NRKRGAVVAV AETTKREGKS CADSGSGSVY VKSVSFTPTH
51 SKAFCFSALG FSLCLALGTV NIAFADGIIT DKAAPKTQQA TILQTGNGIP
101 QVNIQTPTSA GVSVNQYAQF DVGNRGAILN NSRSNTQTQL GGWIQGNPWL
151 TRGEARVVVN QINSSHPSQL NGYIEVGGRR AEVVIANPAG IAVNGGGFIN
201 ASRATLTTGQ PQYQAGDFSG FKIRQGNAVI AGHGLDARDT DFTRILVCQQ
251 NHLDQYGRTS RHS*
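Based on this analysis, it is predicted that these proteins from N. meningitidis and N. gonorrhoeae, and their epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies.
Example 61
The following partial DNA sequence was identified in N. meningitidis <SEQ ID 511>: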
1 ..TCAACGGGAC ATAGCGAACA AAATTACACT TTGCCGCGAG AAATCACACG
51 CAACATTTCA CTGGGTTCAT TTGCCTATGA ATCGCATCGC AAAGCATTAA
101 GCCATCATGC GCCCAGCCAA GGCACTGAGT TGCCGCAAAG CAACGGTATT
151 TCGCTACCCT ATACGTCCAA TTCTTTTACC CCATTACCCA GCAGCAGCTT
201 ATACATTATC AATCCTGTCA ATAAAGGCTA TCTTGTTGAA ACCGATCCAC
251 GCTTTGCCAA CTACCGTCAA TGGTTGGGTA GTGACTATAT GCtGGACAGC
301 CTCAAACTAG ACCCAAACAA TTTACATAAA CGTTTGGGTG ATGGTTATTA
351 CGAGCAACGT TTAATCAATG AACAAATCGC AGAGCTGACA GGGCATCGTC
401 GTTTAGAcGG TTATCAAAAC GACGAAGAAC AATTTAAAGC CTTAATGGAT
451 AATGGCGCGA CTGCGGCACG TTcGATGAAT CTCAGCGTTG GCATTGCATT
501 AAGTGCCGAG CAAGTAGCGC AACTGACCAG CGATATTGTT TGGTTGGTAC
551 AAAAAGAAGT TAAGCTTCCT GATGGCGGCA CACAAACCGT ATTGGTGCCA
601 CAGGTTTATG TACGCGTTAA AAATGGCGAC ATAGACGGTA AAGGTGCATT
651 GTTGTCAGGC AGCAATACAC AAATCAATGT TTCAGGCAGC CTGAAAAACT
701 CAGGCACGAT TGCAGGgCGC AATGCGCTTA TTATCAATAC CGATACGCTA
751 GACAATATCG GTGGGCGTAT TCATGCGCAA AAATCAGCGG TTACGGCCAC
801 ACAAGACATC AATAATATTG GCGGCATGCT TTCTGCCGAA CAGACATTAT
851 TGCTCAACGC AGGCAACAAC ATCAACAGCC AAAGCACCAC CGCCAGCAGT
901 CAAAATACAC AAGGCAGCAG CACCTACCTA GACCGAATGG CAGGTATTTA
951 TATCACAGGC AAAGAAAAAG GTGTTT..
This corresponds to the amino acid sequence <SEQ ID 512; ORF115>:
1 ..STGHSEQNYT LPREITRNIS LGSFAYESHR KALSHHAPSQ GTELPQSNGI
51 SLPYTSNSFT PLPSSSLYII NPVNKGYLVE TDPRFANYRQ WLGSDYMLDS
101 LKLDPNNLHK RLGDGYYEQR LINEQIAELT GHRRLDGYQN DEEQFKALMD
151 NGATAARSMN LSVGIALSAE QVAQLTSDIV WLVQKEVKLP DGGTQTVLVP
201 QVYVRVKNGD IDGKGALLSG SNTQINVSGS LKNSGTIAGR NALIINTDTL
251 DNIGGRIHAQ KSAVTATQDI NNIGGMLSAE QTLLLNAGNN INSQSTTASS
301 QNTQGSSTYL DRMAGIYITG KEKGV..
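Computer analysis of this amino acid sequence gave the following results:
Homology with the pspA putative secreted protein of N. meningitidis (accession number AF030941)
ORF115 and the pspA protein show 50% amino acid identity over a 325 amino acid overlap: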
Orf115:1 STGHSEQNYTLPREITRNISLGSFAYESHRKALSHHAPSQGTELPQSNGISLPYTSNSFT 60
STG+S Y E++ +I +G AY+ + + P + NGI +T
pspA: 778 STGYSRSPYEPAPEVS-SIRMGISAYKGYAPQQASDIPGTVVPVVAENGIHPTFT----- 831
Orf115:61 PLPSSSLYIINPVNKGYLVETDPRFANYRQWLGSDYMLDSLKLDPNNLHKRLGDGYYEQR 120
LP+SSL+ I P NKGYL+ETDP F +YR+WLGS YML +L+ DPN++HKRLGDGYYEQ+
pspA: 832 -LPNSSLFAIAPNNKGYLIETDPAFTDYRKWLGSGYMLAALQQDPNHIHKRLGDGYYEQK 890
Orf115:121 LINEQIAELTGHRRLDGYQNDEEQFKALMDNGATAARSMNLSVGIALSAEQVAQLTSDIV 180
L+NEQIA+LTG+RRLDGY NDEEQFKALMDNG T A+ + L+ GIALSAEQVA+LTSDIV
pspA: 891 LVNEQIAKLTGYRRLDGYTNDEEQFKALMDNGITIAKELQLTPGIALSAEQVARLTSDIV 950
Orf115:181 WLVQKEVKLPDGGTQTVLVPQVYVRVKNGDIDGKGALLSGSNTQINVSGSLKN-SGTIAG 239
WL + V LPDG TQTVL P+VYVR + D++G+GALLSGS I SG+++N G IAG
pspA: 951 WLENETVTLPDGTTQTVLKPKVYVRARPKDMNGQGALLSGSVVDIG-SGAIENRGGLIAG 1009
Orf115:240 RNALIINTDTLDNIGGRIHAQKSAVTATQDINNIGGMLSAEQTLLLNAGXXXXXXXXXXX 299
R ALI+N + N+ G + + A DI N G + AE LLL A
pspA: 1010 REALILNAQNIKNLQGDLQGKNIFAAAGSDITNTGS-IGAENALLLKASNNIESRSETRS 1068
Orf115:300 XXXXXXXXXYLDRMAGIYITGKEKG 324
+ R+AGIY+TG++ G
pspA: 1069 NQNEQGSVRNIGRVAGIYLTGRQNG 1093
Homology with a predicted ORF from N. gonorrhoeae
ORF115 shows 91.9% identity over a 334 amino acid overlap with a predicted ORF (ORF115ng) from N. gonorrhoeae:
orf115.pep STGHSEQNYTLPREITRNISLGSFAYESHRK 31
||| |||||||:||||:||||||||||| |
orf115ng NEQTFGEKKVFSENGKLHNYWRARRKGHDETGHREQNYTLPEEITRDISLGSFAYESHSK 71
orf115.pep ALSHHAPSQGTELPQSN----------GISLPYTSNSFTPLPSSSLYIINPVNKGYLVET 81
|||:||||||||||||| ||||||| |||||||:||||||||:||||||||
orf115ng ALSRHAPSQGTELPQSNRDNIRTAKSNGISLPYTPNSFTPLPGSSLYIINPANKGYLVET 131
orf115.pep DPRFANYRQWLGSDYMLDSLKLDPNNLHKRLGDGYYEQRLINEQIAELTGHRRLDGYQND 141
||||||||||||||||| ||||||||||||||||||||||||||||||||||||||||||
orf115ng DPRFANYRQWLGSDYMLGSLKLDPNNLHKRLGDGYYEQRLINEQIAELTGHRRLDGYQND 191
orf115.pep EEQFKALMDNGATAARSMNLSVGIALSAEQVAQLTSDIVWLVQKEVKLPDGGTQTVLVPQ 201
||||||||||||||||||||||||||||||:||||||||||||||||||||||||||:||
orf115ng EEQFKALMDNGATAARSMNLSVGIALSAEQAAQLTSDIVWLVQKEVKLPDGGTQTVLMPQ 251
orf115.pep VYVRVKNGDIDGKGALLSGSNTQINVSGSLKNSGTIAGRNALIINTDTLDNIGGRIHAQK 261
|||||||| |||||||||||||||||||||||||||||||||||||||||||||||||||
orf115ng VYVRVKNGGIDGKGALLSGSNTQINVSGSLKNSGTIAGRNALIINTDTLDNIGGRIHAQK 311
orf115.pep SAVTATQDINNIGGMLSAEQTLLLNAGNNINSQSTTASSQNTQGSSTYLDRMAGIYITGK 321
||||||||||||||:||||||||||||||||:|||: ||||:||||||||||||||||||
orf115ng SAVTATQDINNIGGILSAEQTLLLNAGNNINNQSTAKSSQNAQGSSTYLDRMAGIYITGK 371
orf115.pep EKGV 325
||||
orf115ng EKGVLAAQAGKDINIIAGQISNQSDQGQTRLQAGRDINLDTVQTGKYQEIHFDADNHTIR 431
The ORF115ng nucleotide sequence <SEQ ID 513> is predicted to encode a protein having the amino acid sequence <SEQ ID 514>:
1 MLVQTEKDGL HNEQTFGEKK VFSENGKLHN YWRARRKGHD ETGHREQNYT
51 LPEEITRDIS LGSFAYESHS KALSRHAPSQ GTELPQSNRD NIRTAKSNGI
101 SLPYTPNSFT PLPGSSLYII NPANKGYLVE TDPRFANYRQ WLGSDYMLGS
151 LKLDPNNLHK RLGDGYYEQR LINEQIAELT GHRRLDGYQN DEEQFKALMD
201 NGATAARSMN LSVGIALSAE QAAQLTSDIV WLVQKEVKLP DGGTQTVLMP
251 QVYVRVKNGG IDGKGALLSG SNTQINVSGS LKNSGTIAGR NALIINTDTL
301 DNIGGRIHAQ KSAVTATQDI NNIGGILSAE QTLLLNAGNN INNQSTAKSS
351 QNAQGSSTYL DRMAGIYITG KEKGVLAAQA GKDINIIAGQ ISNQSDQGQT
401 RLQAGRDINL DTVQTGKYQE IHFDADNHTI RGSTNEVGSS IQTKGDVTLL
451 SGNNLNAKAA EVGSAKGTLA VYAKNDITIS SGIHAGQVDD ASKHTGRSGG
501 GNKLVITDKA QSHHETAQSS TFEGKQVVLQ AGNDANILGS NVISDNGTRI
551 QAGNHVRIGT TQTQSQSETY HQTQKSGLMS AGIGFTIGSK TNTQENQSQS
601 NEHTGSTVGS LKGDTTIVAS KHYEQTGSNV SSPEGNNLIS TQSMDIGAAQ
651 NQLNSKTTQT YEQKGLTVAF SSPVTDLAQQ AIAVAHKAAK QFDKAKTTAL
701 MPWRLPMQVG RLFKQAKAPK K*
Further work revealed the following partial gonococcal DNA sequence <SEQ ID 515>:
1 TTGCTTGTGC AAACAGAAAA AGACGGTTTG CATAACGAGC AAACCTTTGG
51 CGAGAAGAAA GTCTTCAGCG AAAATGGTAA GTTGCACAAC TACTGGCGTG
101 CGCGTCGTAA AGGACATGAT GAAACAGGGC ATCGTGAACA AAATTATACT
151 TTGCCGGAGG AAATCACACG CGACATTTCA CTGGGTTCAT TTGCCTATGA
201 ATCGCATAGC AAAGCATTAA GCCGTCATGC GCCCAGCCAA GGCACTGAGT
251 TGCCACAAAG TAACCGGGAT AATATCCGTA CTGCGAAAAG CAACGGTATT
301 TCGCTACCCT ATACGCCCAA TTCTTTTACC CCATTACCCG GCAGCAGCTT
351 ATACATTATC AATCCTGCCA ATAAAGGCTA TCTTGTTGAA ACCGATCCAC
401 GCTTTGCCAA CTACCGTCAA TGGTTGGGTA GTGACTATAT GCTGGGCAGC
451 CTCAAACTAG ACCCAAACAA TTTACATAAA CGTTTGGGTG ATGGTTATTA
501 CGAGCAACGT TTAATCAATG AACAAATCGC AGAGCTGACA GGGCATCGTC
551 GTTTAGACGG TTATCAAAAC GACGAAGAAC AATTTAAAGC CTTAATGGAT
601 AATGGCGCGA CTGCGGCACG TTCGATGAAT CTCAGCGTTG GCATTGCATT
651 AAGTGCCGAG CAAGCAGCGC AACTGACCAG CGATATTGTT TGGTTGGTAC
701 AAAAAGAAGT TAAACTTCCT GATGGCGGCA CACAAACCGT ATTGATGCCA
751 CAGGTTTATG TACGCGTTAA AAATGGCGGC ATAGACGGTA AAGGTGCATT
801 GTTGTCAGGC AGCAATACAC AAATCAATGT TTCAGGCAGC CTGAAAAACT
851 CAGGCACGAT TGCAGGGCGC AATGCGCTTA TTATCAATAC CGATACGCTA
901 GACAATATCG GTGGGCGTAT TCATGCGCAA AAATCAGCGG TTACGGCCAC
951 ACAAGACATC AATAATATTG GCGGCATTCT TTCTGCCGAA CAGACATTAT
1001 TGCTCAATGC GGGTAACAAC ATCAACAACC AAAGCACGGC CAAGAGCAGT
1051 CAAAATGCAC AAGGTAGCAG CACCTACCTA GACCGAATGG CAGGTATTTA
1101 TATCACAGGC AAAGAAAAAG GTGTTTTAGC AGCGCAGGCA GGCAAAGACA
1151 TCAACATCAT TGCCGGTCAA ATCAGCAATC AATCAGATCA AGGGCAAACC
1201 CGGCTGCAGG CAGGACGCGA CATTAACCTG GATACGGTAC AAACCGGCAA
1251 ATATCAAGAA ATCCATTTTG ATGCCGATAA CCATACCATC CGAGGTTCAA
1301 CGAACGAAGT CGGCAGCAGC ATTCAAACAA AAGGCGATGT TACCCtatTG
1351 TCAGGGAATA ATCTCAATGC CAAAGCTGCC GAAGTCGGCA GCGCAAAAGG
1401 CACACTTGCC GTGTATGCTA AAAATGACAT TACTATCAGC TCAGGCATCC
1451 ATGCCGGCCA AGTTGATGAT GCGTCCAAAC ATACAGGCAG AAGCGGCGGC
1501 GGTAATAAAT TAGTCATTAC CGATAAAGCC CAAAGTCATC ACGAAACTGC
1551 TCAAAGCAGC ACCTTTGAAG GCAAGCAAGT TGTATTGCAG GCAGGAAACG
1601 ATGCCAACAT CCTTGGCAGT AATGTTATTT CCGATAATGG CACCCGGATT
1651 CAAGCAGGCA ATCATGTTCG CATTGGTACA ACCCAAACTC AAAGCCAAAG
1701 CGAAACCTAT CATCAAACCC AAAAATCAGG ATTGATGAGT GCAGGTATCG
1751 GCTTCACTAT TGGCAGCAAG ACAAACACAC AAGAAAACCA ATCCCAAAGC
1801 AACGAACATA CAGGCAGTAC CGTAGGCAGC CTGAAAGGCG ATACCACCAT
1851 TGTTGCAAGC AAACACTACG AACAAACCGG CAGCAACGTT TCCAGCCCTG
1901 AGGGCAACAA CCTTATCAGC ACGCAAAGTA TGGATATTGG CGCAGCACAA
1951 AACCAATTAA ACAGCAAAAC CACCCAAACC TACGAACAAA AAGGCTTAAC
2001 GGTGGCATTC AGTTCGCCCG TTACCGATTT GGCACAACAA GCGATTGCCG
2051 TAGCACACAA AGCAGCAAAC AAGTCGGACA AAGCAAAAAC GACCGCGTTA
2101 ATGCCATGGC GGCTGCCAAT GCAGGTTGGC AGGCCTATCA AACAGGCAAA
2151 GGCGCACAAA ACTTAG
This corresponds to the amino acid sequence <SEQ ID 516; ORF115ng-1>:
1 LLVQTEKDGL HNEQTFGEKK VFSENGKLHN YWRARRKGHD ETGHREQNYT
51 LPEEITRDIS LGSFAYESHS KALSRHAPSQ GTELPQSNRD NIRTAKSNGI
101 SLPYTPNSFT PLPGSSLYII NPANKGYLVE TDPRFANYRQ WLGSDYMLGS
151 LKLDPNNLHK RLGDGYYEQR LINEQIAELT GHRRLDGYQN DEEQFKALMD
201 NGATAARSMN LSVGIALSAE QAAQLTSDIV WLVQKEVKLP DGGTQTVLMP
251 QVYVRVKNGG IDGKGALLSG SNTQINVSGS LKNSGTIAGR NALIINTDTL
301 DNIGGRIHAQ KSAVTATQDI NNIGGILSAE QTLLLNAGNN INNQSTAKSS
351 QNAQGSSTYL DRMAGIYITG KEKGVLAAQA GKDINIIAGQ ISNQSDQGQT
401 RLQAGRDINL DTVQTGKYQE IHFDADNHTI RGSTNEVGSS IQTKGDVTLL
451 SGNNLNAKAA EVGSAKGTLA VYAKNDITIS SGIHAGQVDD ASKHTGRSGG
501 GNKLVITDKA QSHHETAQSS TFEGKQVVLQ AGNDANILGS NVISDNGTRI
551 QAGNHVRIGT TQTQSQSETY HQTQKSGLMS AGIGFTIGSK TNTQENQSQS
601 NEHTGSTVGS LKGDTTIVAS KHYEQTGSNV SSPEGNNLIS TQSMDIGAAQ
651 NQLNSKTTQT YEQKGLTVAF SSPVTDLAQQ AIAVAHKAAN KSDKAKTTAL
701 MPWRLPMQVG RPIKQAKAHK T*
This gonococcal protein (ORF115ng-1) shows 91.9% identity over 334 amino acids with ORF115:
20 30 40 50 60 70
orf115ng-1.p NEQTFGEKKVFSENGKLHNYWRARRKGHDETGHREQNYTLPEEITRDISLGSFAYESHSK
||| |||||||:||||:||||||||||| |
orf115 STGHSEQNYTLPREITRNISLGSFAYESHRK
10 20 30
80 90 100 110 120 130
orf115ng-1.p ALSRHAPSQGTELPQSNRDNIRTAKSNGISLPYTPNSFTPLPGSSLYIINPANKGYLVET
|||:||||||||||||| ||||||| |||||||:||||||||:||||||||
orf115 ALSHHAPSQGTELPQSN----------GISLPYTSNSFTPLPSSSLYIINPVNKGYLVET
40 50 60 70 80
140 150 160 170 180 190
orf115ng-1.p DPRFANYRQWLGSDYMLGSLKLDPNNLHKRLGDGYYEQRLINEQIAELTGHRRLDGYQND
||||||||||||||||| ||||||||||||||||||||||||||||||||||||||||||
orf115 DPRFANYRQWLGSDYMLDSLKLDPNNLHKRLGDGYYEQRLINEQIAELTGHRRLDGYQND
90 100 110 120 130 140
200 210 220 230 240 250
orf115ng-1.p EEQFKALMDNGATAARSMNLSVGIALSAEQAAQLTSDIVWLVQKEVKLPDGGTQTVLMPQ
||||||||||||||||||||||||||||||:||||||||||||||||||||||||||:||
orf115 EEQFKALMDNGATAARSMNLSVGIALSAEQVAQLTSDIVWLVQKEVKLPDGGTQTVLVPQ
150 160 170 180 190 200
260 270 280 290 300 310
orf115ng-1.p VYVRVKNGGIDGKGALLSGSNTQINVSGSLKNSGTIAGRNALIINTDTLDNIGGRIHAQK
|||||||| |||||||||||||||||||||||||||||||||||||||||||||||||||
orf115 VYVRVKNGDIDGKGALLSGSNTQINVSGSLKNSGTIAGRNALIINTDTLDNIGGRIHAQK
210 220 230 240 250 260
320 330 340 350 360 370
orf115ng-1.p SAVTATQDINNIGGILSAEQTLLLNAGNNINNQSTAKSSQNAQGSSTYLDRMAGIYITGK
||||||||||||||:||||||||||||||||:|||: ||||:||||||||||||||||||
orf115 SAVTATQDINNIGGMLSAEQTLLLNAGNNINSQSTTASSQNTQGSSTYLDRMAGIYITGK
270 280 290 300 310 320
380 390 400 410 420 430
orf115ng-1.p EKGVLAAQAGKDINIIAGQISNQSDQGQTRLQAGRDINLDTVQTGKYQEIHFDADNHTIR
||||
orf115 EKGV
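In addition, it shows homology with a secreted N. meningitidis protein in the database:
gi|2623258 (AF030941) putative secreted protein [Neisseria meningitidis] Length = 2273
Score = 604 bits (1541), Expect = e-172
Identities = 325/678 (47%), Positives = 449/678 (65%), Gaps = 22/678 (3%)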
Query: 1 LLVQTEKDGLHNEQTFGEKKVFSENGKLHNYWRARRKGHDETGHREQNYTLPEEITRDIS 60
L+V T + L N++T G K + ++ G LH Y R +KG D TG+ Y E++ I
Sbjct: 739 LIVGTPESALDNDETLGTKTI-TDKGDLHRYHRHHKKGRDSTGYSRSPYEPAPEVS-SIR 796
Query: 61 LGSFAYESHSKALSRHAPSQGTELPQSNRDNIRTAKSNGISLPYTPNSFTPLPGSSLYII 120
+G AY+ + AP Q +++P + + NGI +T LP SSL+ I
Sbjct: 797 MGISAYKGY-------APQQASDIPGTV---VPVVAENGIHPTFT------LPNSSLFAI 840
Query: 121 NPANKGYLVETDPRFANYRQWLGSDYMLGSLKLDPNNLHKRLGDGYYEQRLINEQIAELT 180
P NKGYL+ETDP F +YR+WLGS YML +L+ DPN++HKRLGDGYYEQ+L+NEQIA+LT
Sbjct: 841 APNNKGYLIETDPAFTDYRKWLGSGYMLAALQQDPNHIHKRLGDGYYEQKLVNEQIAKLT 900
Query: 181 GHRRLDGYQNDEEQFKALMDNGATAARSMNLSVGIALSAEQAAQLTSDIVWLVQKEVKLP 240
G+RRLDGY NDEEQFKALMDNG T A+ + L+ GIALSAEQ A+LTSDIVWL + V LP
Sbjct: 901 GYRRLDGYTNDEEQFKALMDNGITIAKELQLTPGIALSAEQVARLTSDIVWLENETVTLP 960
Query: 241 DGGTQTVLMPQVYVRVKNGGIDGKGALLSGSNTQINVSGSLKN-SGTIAGRNALIINTDT 299
DG TQTVL P+VYVR + ++G+GALLSGS I SG+++N G IAGR ALI+N
Sbjct: 961 DGTTQTVLKPKVYVRARPKDMNGQGALLSGSVVDIG-SGAIENRGGLIAGREALILNAQN 1019
Query: 300 LDNIGGRIHAQKSAVTATQDINNIGGILSAEQTLLLNAGNNINNQSTAKSSQNAQGSSTY 359
+ N+ G + + A DI N G I AE LLL A NNI ++S +S+QN QGS
Sbjct: 1020 IKNLQGDLQGKNIFAAAGSDITNTGSI-GAENALLLKASNNIESRSETRSNQNEQGSVRN 1078
Query: 360 LDRMAGIYITGKEKGVLAAQAGKDINIIAGQISNQSDQGQTRLQAGRDINLDTVQTGKYQ 419
+ R+AGIY+TG++ G + AG +I + A +++NQS+ GQT L AG DI DT + Q
Sbjct: 1079 IGRVAGIYLTGRQNGSVLLDAGNNIVLTASELTNQSEDGQTVLNAGGDIRSDTTGISRNQ 1138
Query: 420 EIHFDADNHTIRGSTNEVGSSIQTKGDVTLLSGNNLNAKAAEVGSAKGTLAVYAKNDITI 479
FD+DN+ IR NEVGS+I+T+G+++L + ++ +AAEVGS +G L + A DI +
Sbjct: 1139 NTIFDSDNYVIRKEQNEVGSTIRTRGNLSLNAKGDIRIRAAEVGSEQGRLKLAAGRDIKV 1198
Query: 480 SSGIHAGQVDDASKHTGRSGGGNKLVITDKAQSHHETAQSSTFEGKQVVLQAGNDANILG 539
+G + +DA K+TGRSGGG K +T ++ + A S T +GK+++L +G D + G
Sbjct: 1199 EAGKAHTETEDALKYTGRSGGGIKQKMTRHLKNQNGQAVSGTLDGKEIILVSGRDITVTG 1258
Query: 540 SNVISDNGTRIQAGNHVRIGTTQTQSQSETYHQTQKSGLM-SAGIGFTIGSKTNTQENQS 598
SN+I+DN T + A N++ + +T+S+S ++ +KSGLM S GIGFT GSK +TQ N+S
Sbjct: 1259 SNIIADNHTILSAKNNIVLKAAETRSRSAEMNKKEKSGLMGSGGIGFTAGSKKDTQTNRS 1318
Query: 599 QSNEHTGSTVGSLKGDTTIVASKHYEQTGSNVSSPEGNNLISTQSMDIGAAQNQLNSKTT 658
++ HT S VGSL G+T I A KHY QTGS +SSP+G+ IS+ + I AAQN+ + ++
Sbjct: 1319 ETVSHTESVVGSLNGNTLISAGKHYTQTGSTISSPQGDVGISSGKISIDAAQNRYSQESK 1378
Query: 659 QTYEQKGLTVAFSSPVTD 676
Q YEQKG+TVA S PV +
Sbjct: 1379 QVYEQKGVTVAISVPVVN 1396
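Based on this analysis, it is predicted that this protein from N. meningitidis and N. gonorrhoeae, and its epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies.
Example 62
The following partial DNA sequence was identified in N. meningitidis <SEQ ID 517>: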
1 ..TCAGGGAATA ACCTCAATGC CAAAGCTGCC GAAGTCAGCA GCGCAAACGG
51 TACACTCGCT GTGTCTGCCA ATAATGACAT CAACATCAGC GCAGGCATCA
101 ACACGACCCA TGTTGATGAT GCGTCCAAAC ACACAGGCAG AAGCGGTGGT
151 GGCAATAAAT TAGTCATTAC CGATAAAGCC CAAAGTCATC ACGAAACCGC
201 CCAAAGCAGC ACCTTTGAAG GCAAGCAAGT TGTATTGCAG GCAGGAAACG
251 ATGCCAACAT CCTTGGCAGC AATGTTATTT CCGATAATGG CACCCAGATT
301 CAAGCAGGCA ATCATGTTCG CATTGGTACA ACCCAAACTC AAAGCCAAAG
351 CGAAACCTAT CATCAAACCC AGAAATCAGG ATTGATGAGT GCAGGTATCG
401 GCTTCACTAT TGGCAGCAAG ACAAACACAC AAGAAAACCA ATCCCAAAGC
451 AACGAACATA CAGGCAGTAC CGTAGGCAGC TTGAAAGGCG ATACCACCAT
501 TGTTGCAGGC AAACACTACG AACAAATCGG CAGTACCGTT TCCAGCCCGG
551 AAGGCAACAA TACCATCTAT GCCCAAAGCA TAGACATTCA AGCGGCACAC
601 AACAAATTAA ACAGTAATAC CACCCAAACC TATGAACAAA AAGG.CTAAC
651 GGTGGCATTC AGTTCGCCCG TTACCGATTT GGCACAACAA ...
This corresponds to the amino acid sequence <SEQ ID 518; ORF117>:
1 ..SGNNLNAKAA EVSSANGTLA VSANNDINIS AGINTTHVDD ASKHTGRSGG
51 GNKLVITDKA QSHHETAQSS TFEGKQVVLQ AGNDANILGS NVISDNGTQI
101 QAGNHVRIGT TQTQSQSETY HQTQKSGLMS AGIGFTIGSK TNTQENQSQS
151 NEHTGSTVGS LKGDTTIVAG KHYEQIGSTV SSPEGNNTIY AQSIDIQAAH
201 NKLNSNTTQT YEQKXLTVAF SSPVTDLAQQ ...
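Computer analysis of this amino acid sequence gave the following results:
Homology with the pspA putative secreted protein of N. meningitidis (accession number AF030941)
ORF117 and the pspA protein show 45% amino acid identity in a 224 amino acid overlap: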
Orf117:4 NLNAKAAEVSSANGTLAVSANNDINISAGINTTHVDDASKHTGRSGGGNKLVITDKAQSH 63
++ +AAEV S G L ++A DI + AG T +DA K+TGRSGGG K +T ++
pspA: 1173 DIRIRAAEVGSEQGRLKLAAGRDIKVEAGKAHTETEDALKYTGRSGGGIKQKMTRHLKNQ 1232
Orf117:64 HETAQSSTFEGKQVVLQAGNDANILGSNVISDNGTQIQAGNHVRIGTTQTQSQSETYHQT 123
+ A S T +GK+++L +G D + GSN+I+DN T + A N++ + +T+S+S ++
pspA: 1233 NGQAVSGTLDGKEIILVSGRDITVTGSNIIADNHTILSAKNNIVLKAAETRSRSAEMNKK 1292
Orf117:124 QKSGLM-SAGIGFTIGSKTNTQENQSQSNEHTGSTVGSLKGDTTIVAGKHYEQIGSTVSS 182
+KSGLM S GIGFT GSK +TQ N+S++ HT S VGSL G+T I AGKHY Q GST+SS
pspA: 1293 EKSGLMGSGGIGFTAGSKKDTQTNRSETVSHTESVVGSLNGNTLISAGKHYTQTGSTISS 1352
Orf117:183 PEGNNTIYAQSIDIQAAHNKLNSNTTQTYEQKXLTVAFSSPVTD 226
P+G+ I + I I AA N+ + + Q YEQK +TVA S PV +
pspA: 1353 PQGDVGISSGKISIDAAQNRYSQESKQVYEQKGVTVAISVPVVN 1396
Homology with a predicted ORF from N. gonorrhoeae
ORF117 shows 90% identity over a 230 amino acid overlap with a predicted ORF (ORF117ng) from N. gonorrhoeae:
orf117.pep SGNNLNAKAAEVSSANGTLAVSANNDINIS 30
||||||||||||:||:||||| |:|||:||
orf117ng IHFDADNHTIRGSTNEVGSSIQTKGDVTLLSGNNLNAKAAEVGSAKGTLAVYAKNDITIS 480
orf117.pep AGINTTHVDDASKHTGRSGGGNKLVITDKAQSHHETAQSSTFEGKQVVLQAGNDANILGS 90
:||:: :|||||||||||||||||||||||||||||||||||||||||||||||||||||
orf117ng SGIHAGQVDDASKHTGRSGGGNKLVITDKAQSHHETAQSSTFEGKQVVLQAGNDANILGS 540
orf117.pep NVISDNGTQIQAGNHVRIGTTQTQSQSETYHQTQKSGLMSAGIGFTIGSKTNTQENQSQS 150
||||||||:|||||||||||||||||||||||||||||||||||||||||||||||||||
orf117ng NVISDNGTRIQAGNHVRIGTTQTQSQSETYHQTQKSGLMSAGIGFTIGSKTNTQENQSQS 600
orf117.pep NEHTGSTVGSLKGDTTIVAGKHYEQIGSTVSSPEGNNTIYAQSIDIQAAHNKLNSNTTQT 210
|||||||||||||||||||:||||| ||:|||||||| | :||:|| ||:|:|||:||||
orf117ng NEHTGSTVGSLKGDTTIVASKHYEQTGSNVSSPEGNNLISTQSMDIGAAQNQLNSKTTQT 660
orf117.pep YEQKXLTVAFSSPVTDLAQQ 230
|||| |||||||||||||||
orf117ng YEQKGLTVAFSSPVTDLAQQAIAVAHKAAKQFDKAKTTALMPWRLPMQVGRLFKQAKAPK 720
The ORF117ng nucleotide sequence <SEQ ID 519> is predicted to encode a protein having the amino acid sequence <SEQ ID 520>:
1 ..LLVQTEKDGL HNEQTFGEKK VFSENGKLHN YWRARRKGHD ETGHREQNYT
51 LPEEITRDIS LGSFAYESHS KALSRHAPSQ GTELPQSNRD NIRTAKSNGI
101 SLPYTPNSFT PLPGSSLYII NPANKGYLVE TDPRFANYRQ WLGSDYMLGS
151 LKLDPNNLHK RLGDGYYEQR LINEQIAELT GHRRLDGYQN DEEQFKALMD
201 NGATAARSMN LSVGIALSAE QAAQLTSDIV WLVQKEVKLP DGGTQTVLMP
251 QVYVRVKNGG IDGKGALLSG SNTQINVSGS LKNSGTIAGR NALIINTDTL
301 DNIGGRIHAQ KSAVTATQDI NNIGGILSAE QTLLLNAGNN INNQSTAKSS
351 QNAQGSSTYL DRMAGIYITG KEKGVLAAQA GKDINIIAGQ ISNQSDQGQT
401 RLQAGRDINL DTVQTGKYQE IHFDADNHTI RGSTNEVGSS IQTKGDVTLL
451 SGNNLNAKAA EVGSAKGTLA VYAKNDITIS SGIHAGQVDD ASKHTGRSGG
501 GNKLVITDKA QSHHETAQSS TFEGKQVVLQ AGNDANILGS NVISDNGTRI
551 QAGNHVRIGT TQTQSQSETY HQTQKSGLMS AGIGFTIGSK TNTQENQSQS
601 NEHTGSTVGS LKGDTTIVAS KHYEQTGSNV SSPEGNNLIS TQSMDIGAAQ
651 NQLNSKTTQT YEQKGLTVAF SSPVTDLAQQ AIAVAHKAAK QFDKAKTTAL
701 MPWRLPMQVG RLFKQAKAPK K*
Further work revealed the following partial gonococcal DNA sequence <SEQ ID 521>:
1 TTGCTTGTGC AAACAGAAAA AGACGGTTTG CATAACGAGC AAACCTTTGG
51 CGAGAAGAAA GTCTTCAGCG AAAATGGTAA GTTGCACAAC TACTGGCGTG
101 CGCGTCGTAA AGGACATGAT GAAACAGGGC ATCGTGAACA AAATTATACT
151 TTGCCGGAGG AAATCACACG CGACATTTCA CTGGGTTCAT TTGCCTATGA
201 ATCGCATAGC AAAGCATTAA GCCGTCATGC GCCCAGCCAA GGCACTGAGT
251 TGCCACAAAG TAACCGGGAT AATATCCGTA CTGCGAAAAG CAACGGTATT
301 TCGCTACCCT ATACGCCCAA TTCTTTTACC CCATTACCCG GCAGCAGCTT
351 ATACATTATC AATCCTGCCA ATAAAGGCTA TCTTGTTGAA ACCGATCCAC
401 GCTTTGCCAA CTACCGTCAA TGGTTGGGTA GTGACTATAT GCTGGGCAGC
451 CTCAAACTAG ACCCAAACAA TTTACATAAA CGTTTGGGTG ATGGTTATTA
501 CGAGCAACGT TTAATCAATG AACAAATCGC AGAGCTGACA GGGCATCGTC
551 GTTTAGACGG TTATCAAAAC GACGAAGAAC AATTTAAAGC CTTAATGGAT
601 AATGGCGCGA CTGCGGCACG TTCGATGAAT CTCAGCGTTG GCATTGCATT
651 AAGTGCCGAG CAAGCAGCGC AACTGACCAG CGATATTGTT TGGTTGGTAC
701 AAAAAGAAGT TAAACTTCCT GATGGCGGCA CACAAACCGT ATTGATGCCA
751 CAGGTTTATG TACGCGTTAA AAATGGCGGC ATAGACGGTA AAGGTGCATT
801 GTTGTCAGGC AGCAATACAC AAATCAATGT TTCAGGCAGC CTGAAAAACT
851 CAGGCACGAT TGCAGGGCGC AATGCGCTTA TTATCAATAC CGATACGCTA
901 GACAATATCG GTGGGCGTAT TCATGCGCAA AAATCAGCGG TTACGGCCAC
951 ACAAGACATC AATAATATTG GCGGCATTCT TTCTGCCGAA CAGACATTAT
1001 TGCTCAATGC GGGTAACAAC ATCAACAACC AAAGCACGGC CAAGAGCAGT
1051 CAAAATGCAC AAGGTAGCAG CACCTACCTA GACCGAATGG CAGGTATTTA
1101 TATCACAGGC AAAGAAAAAG GTGTTTTAGC AGCGCAGGCA GGCAAAGACA
1151 TCAACATCAT TGCCGGTCAA ATCAGCAATC AATCAGATCA AGGGCAAACC
1201 CGGCTGCAGG CAGGACGCGA CATTAACCTG GATACGGTAC AAACCGGCAA
1251 ATATCAAGAA ATCCATTTTG ATGCCGATAA CCATACCATC CGAGGTTCAA
1301 CGAACGAAGT CGGCAGCAGC ATTCAAACAA AAGGCGATGT TACCCtatTG
1351 TCAGGGAATA ATCTCAATGC CAAAGCTGCC GAAGTCGGCA GCGCAAAAGG
1401 CACACTTGCC GTGTATGCTA AAAATGACAT TACTATCAGC TCAGGCATCC
1451 ATGCCGGCCA AGTTGATGAT GCGTCCAAAC ATACAGGCAG AAGCGGCGGC
1501 GGTAATAAAT TAGTCATTAC CGATAAAGCC CAAAGTCATC ACGAAACTGC
1551 TCAAAGCAGC ACCTTTGAAG GCAAGCAAGT TGTATTGCAG GCAGGAAACG
1601 ATGCCAACAT CCTTGGCAGT AATGTTATTT CCGATAATGG CACCCGGATT
1651 CAAGCAGGCA ATCATGTTCG CATTGGTACA ACCCAAACTC AAAGCCAAAG
1701 CGAAACCTAT CATCAAACCC AAAAATCAGG ATTGATGAGT GCAGGTATCG
1751 GCTTCACTAT TGGCAGCAAG ACAAACACAC AAGAAAACCA ATCCCAAAGC
1801 AACGAACATA CAGGCAGTAC CGTAGGCAGC CTGAAAGGCG ATACCACCAT
1851 TGTTGCAAGC AAACACTACG AACAAACCGG CAGCAACGTT TCCAGCCCTG
1901 AGGGCAACAA CCTTATCAGC ACGCAAAGTA TGGATATTGG CGCAGCACAA
1951 AACCAATTAA ACAGCAAAAC CACCCAAACC TACGAACAAA AAGGCTTAAC
2001 GGTGGCATTC AGTTCGCCCG TTACCGATTT GGCACAACAA GCGATTGCCG
2051 TAGCACACAA AGCAGCAAAC AAGTCGGACA AAGCAAAAAC GACCGCGTTA
2101 ATGCCATGGC GGCTGCCAAT GCAGGTTGGC AGGCCTATCA AACAGGCAAA
2151 GGCGCACAAA ACTTAG
This corresponds to the amino acid sequence <SEQ ID 522; ORF117ng-1>:
1 LLVQTEKDGL HNEQTFGEKK VFSENGKLHN YWRARRKGHD ETGHREQNYT
51 LPEEITRDIS LGSFAYESHS KALSRHAPSQ GTELPQSNRD NIRTAKSNGI
101 SLPYTPNSFT PLPGSSLYII NPANKGYLVE TDPRFANYRQ WLGSDYMLGS
151 LKLDPNNLHK RLGDGYYEQR LINEQIAELT GHRRLDGYQN DEEQFKALMD
201 NGATAARSMN LSVGIALSAE QAAQLTSDIV WLVQKEVKLP DGGTQTVLMP
251 QVYVRVKNGG IDGKGALLSG SNTQINVSGS LKNSGTIAGR NALIINTDTL
301 DNIGGRIHAQ KSAVTATQDI NNIGGILSAE QTLLLNAGNN INNQSTAKSS
351 QNAQGSSTYL DRMAGIYITG KEKGVLAAQA GKDINIIAGQ ISNQSDQGQT
401 RLQAGRDINL DTVQTGKYQE IHFDADNHTI RGSTNEVGSS IQTKGDVTLL
451 SGNNLNAKAA EVGSAKGTLA VYAKNDITIS SGIHAGQVDD ASKHTGRSGG
501 GNKLVITDKA QSHHETAQSS TFEGKQVVLQ AGNDANILGS NVISDNGTRI
551 QAGNHVRIGT TQTQSQSETY HQTQKSGLMS AGIGFTIGSK TNTQENQSQS
601 NEHTGSTVGS LKGDTTIVAS KHYEQTGSNV SSPEGNNLIS TQSMDIGAAQ
651 NQLNSKTTQT YEQKGLTVAF SSPVTDLAQQ AIAVAHKAAN KSDKAKTTAL
701 MPWRLPMQVG RPIKQAKAHK T*
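ORF117ng-1 and ORF117 likewise show 90% identity over a 230 amino acid overlap. In addition, ORF117ng-1 shows homology with a secreted N. meningitidis protein in the database:
gi|2623258 (AF030941) putative secreted protein [Neisseria meningitidis] Length = 2273
Score = 604 bits (1541), Expect = e-172
Identities = 325/678 (47%), Positives = 449/678 (65%), Gaps = 22/678 (3%)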
Query: 1 LLVQTEKDGLHNEQTFGEKKVFSENGKLHNYWRARRKGHDETGHREQNYTLPEEITRDIS 60
L+V T + L N++T G K + ++ G LH Y R +KG D TG+ Y E++ I
Sbjct: 739 LIVGTPESALDNDETLGTKTI-TDKGDLHRYHRHHKKGRDSTGYSRSPYEPAPEVS-SIR 796
Query: 61 LGSFAYESHSKALSRHAPSQGTELPQSNRDNIRTAKSNGISLPYTPNSFTPLPGSSLYII 120
+G AY+ + AP Q +++P + + NGI +T LP SSL+ I
Sbjct: 797 MGISAYKGY-------APQQASDIPGTV---VPVVAENGIHPTFT------LPNSSLFAI 840
Query: 121 NPANKGYLVETDPRFANYRQWLGSDYMLGSLKLDPNNLHKRLGDGYYEQRLINEQIAELT 180
P NKGYL+ETDP F +YR+WLGS YML +L+ DPN++HKRLGDGYYEQ+L+NEQIA+LT
Sbjct: 841 APNNKGYLIETDPAFTDYRKWLGSGYMLAALQQDPNHIHKRLGDGYYEQKLVNEQIAKLT 900
Query: 181 GHRRLDGYQNDEEQFKALMDNGATAARSMNLSVGIALSAEQAAQLTSDIVWLVQKEVKLP 240
G+RRLDGY NDEEQFKALMDNG T A+ + L+ GIALSAEQ A+LTSDIVWL + V LP
Sbjct: 901 GYRRLDGYTNDEEQFKALMDNGITIAKELQLTPGIALSAEQVARLTSDIVWLENETVTLP 960
Query: 241 DGGTQTVLMPQVYVRVKNGGIDGKGALLSGSNTQINVSGSLKN-SGTIAGRNALIINTDT 299
DG TQTVL P+VYVR + ++G+GALLSGS I SG+++N G IAGR ALI+N
Sbjct: 961 DGTTQTVLKPKVYVRARPKDMNGQGALLSGSVVDIG-SGAIENRGGLIAGREALILNAQN 1019
Query: 300 LDNIGGRIHAQKSAVTATQDINNIGGILSAEQTLLLNAGNNINNQSTAKSSQNAQGSSTY 359
+ N+ G + + A DI N G I AE LLL A NNI ++S +S+QN QGS
Sbjct: 1020 IKNLQGDLQGKNIFAAAGSDITNTGSI-GAENALLLKASNNIESRSETRSNQNEQGSVRN 1078
Query: 360 LDRMAGIYITGKEKGVLAAQAGKDINIIAGQISNQSDQGQTRLQAGRDINLDTVQTGKYQ 419
+ R+AGIY+TG++ G + AG +I + A +++NQS+ GQT L AG DI DT + Q
Sbjct: 1079 IGRVAGIYLTGRQNGSVLLDAGNNIVLTASELTNQSEDGQTVLNAGGDIRSDTTGISRNQ 1138
Query: 420 EIHFDADNHTIRGSTNEVGSSIQTKGDVTLLSGNNLNAKAAEVGSAKGTLAVYAKNDITI 479
FD+DN+ IR NEVGS+I+T+G+++L + ++ +AAEVGS +G L + A DI +
Sbjct: 1139 NTIFDSDNYVIRKEQNEVGSTIRTRGNLSLNAKGDIRIRAAEVGSEQGRLKLAAGRDIKV 1198
Query: 480 SSGIHAGQVDDASKHTGRSGGGNKLVITDKAQSHHETAQSSTFEGKQVVLQAGNDANILG 539
+G + +DA K+TGRSGGG K +T ++ + A S T +GK+++L +G D + G
Sbjct: 1199 EAGKAHTETEDALKYTGRSGGGIKQKMTRHLKNQNGQAVSGTLDGKEIILVSGRDITVTG 1258
Query: 540 SNVISDNGTRIQAGNHVRIGTTQTQSQSETYHQTQKSGLM-SAGIGFTIGSKTNTQENQS 598
SN+I+DN T + A N++ + +T+S+S ++ +KSGLM S GIGFT GSK +TQ N+S
Sbjct: 1259 SNIIADNHTILSAKNNIVLKAAETRSRSAEMNKKEKSGLMGSGGIGFTAGSKKDTQTNRS 1318
Query: 599 QSNEHTGSTVGSLKGDTTIVASKHYEQTGSNVSSPEGNNLISTQSMDIGAAQNQLNSKTT 658
++ HT S VGSL G+T I A KHY QTGS +SSP+G+ IS+ + I AAQN+ + ++
Sbjct: 1319 ETVSHTESVVGSLNGNTLISAGKHYTQTGSTISSPQGDVGISSGKISIDAAQNRYSQESK 1378
Query: 659 QTYEQKGLTVAFSSPVTD 676
Q YEQKG+TVA S PV +
Sbjct: 1379 QVYEQKGVTVAISVPVVN 1396
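Based on this analysis, it is predicted that this protein from N. meningitidis and N. gonorrhoeae, and its epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies.
Example 63
The following partial DNA sequence was identified in N. meningitidis <SEQ ID 523>: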
1 ATGATTTACA TCGTACTGTT TCTAGCTGTC GTCCTCGCCG TTGTCGCCTA
51 CAACATGTAT CAGGAAAACC AATACCGCAA AAAAGTGCGC GACCAGTTCG
101 GACACTCCGA CAAAGATGCC CTGCTCAACA GCAwAACCAG CCATGTCCGC
151 GACGGCAAAC CGTCCGGCGG GTCAGTCATG ATGCCGAAAC CCCAACCGGC
201 GGTCAAAAAA ACGGCAAAAC CCCAAGACCC CGyCATGCGC AACCTGCAAG
251 AACAGGATGC CGTCTACATC GCCAAGCAGA AACAGGCAAA AGCCTCCCCG
301 TTCAAAACCG AAATCGAAAC CGCCTTGGAA GAAAGCGGCA TTATCGGCAA
351 CTCCGCCCAC ACCGTTTCCG AACCCCAAAC CGGACATTCC GCAACGAAAC
401 CTGCCGACGC GTCGGCAAAA CCTGCACCCG TTCCGCAAAC ACCTGCAAAA
451 CCGCTGATTA CGCTCAAAGA ACTGTCAAAA GTCGAATTAT CCTGGTTTGA
501 CGTGCGCATC GACTTCATCT CCTAT...
This corresponds to the amino acid sequence <SEQ ID 524; ORF119>:
1 MIYIVLFLAV VLAVVAYNMY QENQYRKKVR DQFGHSDKDA LLNSXTSHVR
51 DGKPSGGSVM MPKPQPAVKK TAKPQDPXMR NLQEQDAVYI AKQKQAKASP
101 FKTEIETALE ESGIIGNSAH TVSEPQTGHS ATKPADASAK PAPVPQTPAK
151 PLITLKELSK VELSWFDVRI DFISY...
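The correspondence between the partial DNA sequence <SEQ ID 523> and ORF119 <SEQ ID 524> can be illustrated with a minimal Python sketch. It assumes the standard genetic code, input that starts on a codon boundary, and the convention (inferred from the listings, not stated in the text) that a codon containing an ambiguity symbol such as `w` or `y` is reported as `X`. The names `CODON_TABLE`, `clean` and `translate` are illustrative only and are not part of the original disclosure.

```python
from itertools import product

# Standard genetic code, codons enumerated with base order T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def clean(block):
    """Drop position labels and spacing from a blocked listing; uppercase the rest."""
    return "".join(ch for ch in block.upper() if not ch.isdigit() and not ch.isspace())


def translate(block):
    """Translate in frame 1; any codon not in the table (e.g. 'AWA') becomes 'X'."""
    dna = clean(block)
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


# Nucleotides 121-150 of SEQ ID 523, which contain the ambiguous base 'w'.
fragment = "CTGCTCAACA GCAwAACCAG CCATGTCCGC"
print(translate(fragment))  # LLNSXTSHVR (residues 41-50 of ORF119)
```

The ambiguous codon `AwA` can be read as either AAA (Lys) or ATA (Ile), which is why the partial ORF119 listing shows `X` at residue 45 while the complete sequence ORF119-1 shows `K`.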
Further work revealed the complete nucleotide sequence <SEQ ID 525>:
1 ATGATTTACA TCGTACTGTT TCTAGCTGTC GTCCTCGCCG TTGTCGCCTA
51 CAACATGTAT CAGGAAAACC AATACCGCAA AAAAGTGCGC GACCAGTTCG
101 GACACTCCGA CAAAGATGCC CTGCTCAACA GCAAAACCAG CCATGTCCGC
151 GACGGCAAAC CGTCCGGCGG GTCAGTCATG ATGCCGAAAC CCCAACCGGC
201 GGTCAAAAAA ACGGCAAAAC CCCAAGACCC CGCCATGCGC AACCTGCAAG
251 AACAGGATGC CGTCTACATC GCCAAGCAGA AACAGGCAAA AGCCTCCCCG
301 TTCAAAACCG AAATCGAAAC CGCCTTGGAA GAAAGCGGCA TTATCGGCAA
351 CTCCGCCCAC ACCGTTTCCG AACCCCAAAC CGGACATTCC GCACCGAAAC
401 CTGCCGACGC GCCGGCAAAA CCTGCACCCG TTCCGCAAAC ACCTGCAAAA
451 CCGCTGATTA CGCTCAAAGA ACTGTCAAAA GTCGAATTAC CCTGGTTTGA
501 CGTGCGCTTC GACTTCATCT CCTATATCGC GCTGACCGAA GCCAAAGAAC
551 TGCACGCACT GCCGCGCCTT TCCAACCGCT GCCGCTACCA GATTGTCGGC
601 TGCACCATGG ACGACCATTT CCAGATTGCC GAACCCATCC CGGGCATCCG
651 CTATCAGGCA TTTATCGTGG GTATTCAGGC AGTCAGCCGC AACGGACTTG
701 CCTCGCAGGA AGAACTCTCC GCATTCAACC GCCAGGTGGA CGCATTCGCA
751 CAAAGCATGG GCGGTCAGAC GCTGCACACC GACCTTGCCG CCTTTATCGA
801 AGTGGCTTCC GCACTGGACG CATTCTGCGC GCGCGTCGAC CAGACCATCG
851 CCATCCATTT GGTTTCCCCG ACCAGCATCA GCGGCGTAGA ACTGCGTTCC
901 GCCGTAACGG GCGTGGGTTT CGTTTTGGAA GACGACGGCG CGTTCCACTA
951 TACCGACACG TCGGGCTCGA CCATGTTCTC CATCTGCTCG CTCAACAACG
1001 AGCCGTTTAC CAACGCCCTT TTGGACAACC AGTCCTACAA AGGCTTCAGT
1051 ATGCTGCTCG ACATCCCGCA CTCTCCGGCA GGCGAAAAAA CCTTCGACGA
1101 TTTGTTTATG GATTTGGCGG TACGCCTGTC CGGCCAGTTG AACCTGAATC
1151 TGGTCAACGA CAAAATGGAA GAAGTTTCGA CCCAATGGCT CAAAGACGTG
1201 CGCACTTATG TATTGGCGCG TCAGTCCGAG ATGCTCAAAG TCGGTATCGA
1251 ACCGGGCGGC AAAACCGCAT TGCGCCTGTT CTCCTAA
This corresponds to the amino acid sequence <SEQ ID 526; ORF119-1>:
1 MIYIVLFLAV VLAVVAYNMY QENQYRKKVR DQFGHSDKDA LLNSKTSHVR
51 DGKPSGGSVM MPKPQPAVKK TAKPQDPAMR NLQEQDAVYI AKQKQAKASP
101 FKTEIETALE ESGIIGNSAH TVSEPQTGHS APKPADAPAK PAPVPQTPAK
151 PLITLKELSK VELPWFDVRF DFISYIALTE AKELHALPRL SNRCRYQIVG
201 CTMDDHFQIA EPIPGIRYQA FIVGIQAVSR NGLASQEELS AFNRQVDAFA
251 QSMGGQTLHT DLAAFIEVAS ALDAFCARVD QTIAIHLVSP TSISGVELRS
301 AVTGVGFVLE DDGAFHYTDT SGSTMFSICS LNNEPFTNAL LDNQSYKGFS
351 MLLDIPHSPA GEKTFDDLFM DLAVRLSGQL NLNLVNDKME EVSTQWLKDV
401 RTYVLARQSE MLKVGIEPGG KTALRLFS*
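Computer analysis of this amino acid sequence gave the following results:
Homology with a predicted ORF from N. meningitidis (strain A)
ORF119 shows 93.7% identity over a 175 amino acid overlap with an ORF (ORF119a) from strain A of N. meningitidis: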
10 20 30 40 50 60
orf119.pep MIYIVLFLAVVLAVVAYNMYQENQYRKKVRDQFGHSDKDALLNSXTSHVRDGKPSGGSVM
|||||||||:|||||||||||||||||||||||||||||||||| |||||||||||| ||
orf119a MIYIVLFLAAVLAVVAYNMYQENQYRKKVRDQFGHSDKDALLNSKTSHVRDGKPSGGPVM
10 20 30 40 50 60
70 80 90 100 110 120
orf119.pep MPKPQPAVKKTAKPQDPXMRNLQEQDAVYIAKQKQAKASPFKTEIETALEESGIIGNSAH
||||||||||||| ||| ||||||||||||||||||||||||||||||||||||||||||
orf119a MPKPQPAVKKTAKSQDPAMRNLQEQDAVYIAKQKQAKASPFKTEIETALEESGIIGNSAH
70 80 90 100 110 120
130 140 150 160 170
orf119.pep TVSEPQTGHSATKPADASAKPAPVPQTPAKPLITLKELSKVELSWFDVRIDFISY
|| |||||||| ||||| |||:||||||||||||||||||||| |||||:|||||
orf119a TVPEPQTGHSAPKPADAPAKPVPVPQTPAKPLITLKELSKVELPWFDVRFDFISYIALTE
130 140 150 160 170 180
orf119a AKELHALPRLSNRCRYQIVGCTMDDHFQIAEPIPGIRYQAFIVGIQAVSRNGLASQEELS
190 200 210 220 230 240
The complete length ORF119a nucleotide sequence <SEQ ID 527> is:
1 ATGATTTACA TCGTACTGTT CCTCGCCGCC GTCCTCGCCG TTGTCGCCTA
51 CAATATGTAT CAGGAAAACC AATACCGCAA AAAAGTGCGC GACCAGTTCG
101 GGCACTCCGA CAAAGATGCC CTGCTCAACA GCAAAACCAG CCATGTCCGC
151 GACGGCAAAC CGTCCGGCGG GCCAGTCATG ATGCCGAAAC CCCAACCGGC
201 GGTCAAAAAA ACGGCAAAAT CCCAAGACCC CGCCATGCGC AACCTGCAAG
251 AGCAGGATGC CGTCTACATC GCCAAGCAGA AACAGGCAAA AGCCTCCCCG
301 TTCAAAACCG AAATCGAAAC CGCCTTGGAA GAAAGCGGCA TTATCGGCAA
351 CTCCGCCCAC ACCGTTCCCG AACCCCAAAC CGGACATTCC GCACCAAAAC
401 CTGCCGACGC GCCGGCAAAA CCTGTTCCCG TTCCGCAAAC GCCGGCAAAA
451 CCGCTGATTA CGCTCAAAGA GCTGTCGAAG GTCGAGCTGC CCTGGTTTGA
501 CGTGCGCTTC GACTTCATCT CTTATATCGC GCTGACCGAA GCCAAAGAAC
551 TGCACGCACT GCCGCGCCTT TCCAACCGCT GCCGCTACCA GATTGTCGGC
601 TGCACCATGG ACGACCATTT CCAGATTGCC GAACCCATCC CGGGCATCCG
651 CTATCAGGCA TTTATCGTGG GTATTCAGGC AGTCAGCCGC AACGGACTTG
701 CCTCGCAGGA AGAACTCTCC GCATTCAACC GCCAGGTGGA TGCATTCGCA
751 CACAGCATGG GCGGTCAGAC GCTGCACACC GACCTTGCCG CCTTTATCGA
801 AGTGGCTTCC GCACTGGACG CATTCTGCGC GCGCGTCGAC CAGACTATCG
851 CCATCCATTT GGTTTCCCCG ACCAGCATCA GCGGCGTAGA ACTGCGTTCC
901 GCCGTAACGG GCGTGGGTTT CGTTTTGGAA GACGACGGCG CGTTCCACTA
951 TACCGACACG TCGGGCTCGA CCATGTTCTC CATCTGCTCG CTCAACAACG
1001 AGCCGTTTAC CAATGCCCTT TTGGACAACC AGTCCTATAA AGGCTTCAGT
1051 ATGCTGCTCG ACATCCCGCA CTCTCCGGCA GGCGAAAAAA CCTTCGACGA
1101 TTTGTTTATG GATTTGGCGG TACGCCTGTC CGGCCAGTTG AACCTGAATC
1151 TGGTCAACGA CAAAATGGAA GAAGTTTCGA CCCAATGGCT CAAAGACGTG
1201 CGCACTTATG TATTGGCTCG TCAGTCCGAG ATGCTCAAAG TCGGTATCGA
1251 ACCGGGCGGC AAAACCGCAT TGCGCCTGTT CTCCTAA
This encodes a protein having the amino acid sequence <SEQ ID 528>:
1 MIYIVLFLAA VLAVVAYNMY QENQYRKKVR DQFGHSDKDA LLNSKTSHVR
51 DGKPSGGPVM MPKPQPAVKK TAKSQDPAMR NLQEQDAVYI AKQKQAKASP
101 FKTEIETALE ESGIIGNSAH TVPEPQTGHS APKPADAPAK PVPVPQTPAK
151 PLITLKELSK VELPWFDVRF DFISYIALTE AKELHALPRL SNRCRYQIVG
201 CTMDDHFQIA EPIPGIRYQA FIVGIQAVSR NGLASQEELS AFNRQVDAFA
251 HSMGGQTLHT DLAAFIEVAS ALDAFCARVD QTIAIHLVSP TSISGVELRS
301 AVTGVGFVLE DDGAFHYTDT SGSTMFSICS LNNEPFTNAL LDNQSYKGFS
351 MLLDIPHSPA GEKTFDDLFM DLAVRLSGQL NLNLVNDKME EVSTQWLKDV
401 RTYVLARQSE MLKVGIEPGG KTALRLFS*
ORF119a and ORF119-1 show 98.6% identity in a 428 amino acid overlap:
10 20 30 40 50 60
orf119a.pep MIYIVLFLAAVLAVVAYNMYQENQYRKKVRDQFGHSDKDALLNSKTSHVRDGKPSGGPVM
|||||||||:||||||||||||||||||||||||||||||||||||||||||||||| ||
orf119-1 MIYIVLFLAVVLAVVAYNMYQENQYRKKVRDQFGHSDKDALLNSKTSHVRDGKPSGGSVM
10 20 30 40 50 60
70 80 90 100 110 120
orf119a.pep MPKPQPAVKKTAKSQDPAMRNLQEQDAVYIAKQKQAKASPFKTEIETALEESGIIGNSAH
||||||||||||| ||||||||||||||||||||||||||||||||||||||||||||||
orf119-1 MPKPQPAVKKTAKPQDPAMRNLQEQDAVYIAKQKQAKASPFKTEIETALEESGIIGNSAH
70 80 90 100 110 120
130 140 150 160 170 180
orf119a.pep TVPEPQTGHSAPKPADAPAKPVPVPQTPAKPLITLKELSKVELPWFDVRFDFISYIALTE
|| ||||||||||||||||||:||||||||||||||||||||||||||||||||||||||
orf119-1 TVSEPQTGHSAPKPADAPAKPAPVPQTPAKPLITLKELSKVELPWFDVRFDFISYIALTE
130 140 150 160 170 180
190 200 210 220 230 240
orf119a.pep AKELHALPRLSNRCRYQIVGCTMDDHFQIAEPIPGIRYQAFIVGIQAVSRNGLASQEELS
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf119-1 AKELHALPRLSNRCRYQIVGCTMDDHFQIAEPIPGIRYQAFIVGIQAVSRNGLASQEELS
190 200 210 220 230 240
250 260 270 280 290 300
orf119a.pep AFNRQVDAFAHSMGGQTLHTDLAAFIEVASALDAFCARVDQTIAIHLVSPTSISGVELRS
||||||||||:|||||||||||||||||||||||||||||||||||||||||||||||||
orf119-1 AFNRQVDAFAQSMGGQTLHTDLAAFIEVASALDAFCARVDQTIAIHLVSPTSISGVELRS
250 260 270 280 290 300
310 320 330 340 350 360
orf119a.pep AVTGVGFVLEDDGAFHYTDTSGSTMFSICSLNNEPFTNALLDNQSYKGFSMLLDIPHSPA
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf119-1 AVTGVGFVLEDDGAFHYTDTSGSTMFSICSLNNEPFTNALLDNQSYKGFSMLLDIPHSPA
310 320 330 340 350 360
370 380 390 400 410 420
orf119a.pep GEKTFDDLFMDLAVRLSGQLNLNLVNDKMEEVSTQWLKDVRTYVLARQSEMLKVGIEPGG
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf119-1 GEKTFDDLFMDLAVRLSGQLNLNLVNDKMEEVSTQWLKDVRTYVLARQSEMLKVGIEPGG
370 380 390 400 410 420
429
orf119a.pep KTALRLFSX
|||||||||
orf119-1 KTALRLFSX
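The identity figures quoted for these pairwise alignments can be reproduced by counting identical columns. The following Python sketch is illustrative only: the function name `percent_identity` is hypothetical, and it assumes that identity is defined as identical columns divided by the total number of aligned columns, with gap columns (`-`) counted as mismatches.

```python
def percent_identity(a, b, gap="-"):
    """Return (identical columns, total columns, percent) for two aligned strings."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have the same length")
    matches = sum(1 for x, y in zip(a, b) if x == y and x != gap)
    return matches, len(a), 100.0 * matches / len(a)


# First 60 aligned residues of ORF119a and ORF119-1 from the alignment above.
orf119a = "MIYIVLFLAAVLAVVAYNMYQENQYRKKVRDQFGHSDKDALLNSKTSHVRDGKPSGGPVM"
orf119_1 = "MIYIVLFLAVVLAVVAYNMYQENQYRKKVRDQFGHSDKDALLNSKTSHVRDGKPSGGSVM"

matches, total, pct = percent_identity(orf119a, orf119_1)
print(matches, total, round(pct, 1))  # 58 60 96.7
```

Applied to the whole alignment, which shows six mismatched columns, the same calculation gives 422/428, in agreement with the 98.6% stated for ORF119a and ORF119-1.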
Homology with a predicted ORF from N. gonorrhoeae
ORF119 shows 93.1% identity over a 175 amino acid overlap with a predicted ORF (ORF119ng) from N. gonorrhoeae:
orf119.pep MIYIVLFLAVVLAVVAYNMYQENQYRKKVRDQFGHSDKDALLNSXTSHVRDGKPSGGSVM 60
|||||||||:|||||||||||||||||||||||||||||||||| |||||||||||| ||
orf119ng MIYIVLFLAAVLAVVAYNMYQENQYRKKVRDQFGHSDKDALLNSKTSHVRDGKPSGGPVM 60
orf119.pep MPKPQPAVKKTAKPQDPXMRNLQEQDAVYIAKQKQAKASPFKTEIETALEESGIIGNSAH 120
|||||||||| ||||| |||||||||||||||||||||||||||||||||| ||||||||
orf119ng MPKPQPAVKKPAKPQDSAMRNLQEQDAVYIAKQKQAKASPFKTEIETALEEIGIIGNSAH 120
orf119.pep TVSEPQTGHSATKPADASAKPAPVPQTPAKPLITLKELSKVELSWFDVRIDFISY 175
||||||||||| ||||| |||:||||||||||||||||||||| |||||:|||||
orf119ng TVSEPQTGHSAPKPADAPAKPVPVPQTPAKPLITLKELSKVELPWFDVRFDFISYIALTE 180
The complete length ORF119ng nucleotide sequence <SEQ ID 529> is:
1 ATGATTTACA TCGTACTGTT CCTCGCCGCC GTCCTCGCCG TTGTCGCCTA
51 CAATATGTAT CAGGAAAACC AATACCGCAA AAAAGTGCGC GACCAGTTCG
101 GACACTCCGA CAAAGATGCC CTGCTCAACA GCAAAACCAG CCATGTCCGC
151 GACGGCAAAC CGTCCGGCGG GCCAGTCATG ATGCCGAAAC CCCAACCGGC
201 GGTCAAAAAA CCGGCCAAAC CCCAAGACTC CGCCATGCGC AACCTGCAAG
251 AACAGGATGC CGTCTACATC GCCAAGCAGA AACAGGCAAA AGCCTCCCCG
301 TTCAAAACCG AAATCGAAAC CGCCTTGGAA GAAATCGGCA TTATCGGCAA
351 CTCCGCCCAC ACCGTTTCCG AACCCCAAAC CGGACATTCC GCACCGAAAC
401 CTGCCGACGC GCCGGCAAAA CCCGTTCCCG TTCCGCAAAC GCCGGCAAAA
451 CCGCTGATTA CGCTCAAAGA GCTGTCGAAG GTCGAGCTGC CCTGGTTTGA
501 CGTGCGCTtc gACTTCATCT CCTATATCGC GCTGACCGAA GCCAAAGAAC
551 TGCACGCACT GCCGCGCCTT tccAACCGCT GCCGCTACCA GATTGTCGGC
601 TGCACCATGG ACGACCATTT CCAGATTGCC GAACCCATCC CGGGCATCCG
651 CTATCAGGCA TTTATCGTGG GTATCCAGGC AGTCAGCCGC AACGGACTTG
701 CCTCGCAGGA AGAACTCTCC GCATTCAACC GCCAGGCGGA CGCATTCGCA
751 CAAAGCATGG GCGGTCAGAC GCTGCACACC GACCTTGCCG CCTTTATCGA
801 AGTGGCTTCC GCACTGGACG CATTCTGCGC GCGCGTCGAC CAGACCATCG
851 CCATCCATTT GGTTTCGCCG ACCAGCATCA GCGGCGTAGA ACTGCGTTCC
901 GCCGTAACGG GCGTGGGTTT CGTTTTGGAA GACGACGGCG CGTTCCACTA
951 TACCGACACG TCGGGCTCGA CCATGTTCTC CATCTGCTCG CTCAACAACG
1001 AGCCGTTTAC CAATGCCCTT TTGGACAACC AGTCCTACAA AGGCTTCAGT
1051 ATGCTGCTCG ACATCCCGCA CTCTCCGGCA GGCGAAAAAA CCTTCGACGA
1101 TTTGTTTATG GATTTGGCGG TACGCCTGTC CGGTCAGTTG AACCTGAATC
1151 TGGTCAACGA CAAAATGGAA GAAGTTTCGA CCCAATGGCT CAAAGACGTA
1201 CGCACTTATG TATTGGCGCG TCAGTCCGAG ATGCTCAAAG TCGGTATCGA
1251 ACCGGGCGGC AAAACCGCCC TGCGCCTGTT TTCATAA
This encodes a protein having the amino acid sequence <SEQ ID 530>:
1 MIYIVLFLAA VLAVVAYNMY QENQYRKKVR DQFGHSDKDA LLNSKTSHVR
51 DGKPSGGPVM MPKPQPAVKK PAKPQDSAMR NLQEQDAVYI AKQKQAKASP
101 FKTEIETALE EIGIIGNSAH TVSEPQTGHS APKPADAPAK PVPVPQTPAK
151 PLITLKELSK VELPWFDVRF DFISYIALTE AKELHALPRL SNRCRYQIVG
201 CTMDDHFQIA EPIPGIRYQA FIVGIQAVSR NGLASQEELS AFNRQADAFA
251 QSMGGQTLHT DLAAFIEVAS ALDAFCARVD QTIAIHLVSP TSISGVELRS
301 AVTGVGFVLE DDGAFHYTDT SGSTMFSICS LNNEPFTNAL LDNQSYKGFS
351 MLLDIPHSPA GEKTFDDLFM DLAVRLSGQL NLNLVNDKME EVSTQWLKDV
401 RTYVLARQSE MLKVGIEPGG KTALRLFS*
ORF119ng and ORF119-1 show 98.4% identity in a 428 amino acid overlap:
10 20 30 40 50 60
orf119ng MIYIVLFLAAVLAVVAYNMYQENQYRKKVRDQFGHSDKDALLNSKTSHVRDGKPSGGPVM
|||||||||:||||||||||||||||||||||||||||||||||||||||||||||| ||
orf119-1 MIYIVLFLAVVLAVVAYNMYQENQYRKKVRDQFGHSDKDALLNSKTSHVRDGKPSGGSVM
10 20 30 40 50 60
70 80 90 100 110 120
orf119ng MPKPQPAVKKPAKPQDSAMRNLQEQDAVYIAKQKQAKASPFKTEIETALEEIGIIGNSAH
|||||||||| ||||| |||||||||||||||||||||||||||||||||| ||||||||
orf119-1 MPKPQPAVKKTAKPQDPAMRNLQEQDAVYIAKQKQAKASPFKTEIETALEESGIIGNSAH
70 80 90 100 110 120
130 140 150 160 170 180
orf119ng TVSEPQTGHSAPKPADAPAKPVPVPQTPAKPLITLKELSKVELPWFDVRFDFISYIALTE
|||||||||||||||||||||:||||||||||||||||||||||||||||||||||||||
orf119-1 TVSEPQTGHSAPKPADAPAKPAPVPQTPAKPLITLKELSKVELPWFDVRFDFISYIALTE
130 140 150 160 170 180
190 200 210 220 230 240
orf119ng AKELHALPRLSNRCRYQIVGCTMDDHFQIAEPIPGIRYQAFIVGIQAVSRNGLASQEELS
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf119-1 AKELHALPRLSNRCRYQIVGCTMDDHFQIAEPIPGIRYQAFIVGIQAVSRNGLASQEELS
190 200 210 220 230 240
250 260 270 280 290 300
orf119ng AFNRQADAFAQSMGGQTLHTDLAAFIEVASALDAFCARVDQTIAIHLVSPTSISGVELRS
|||||:||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf119-1 AFNRQVDAFAQSMGGQTLHTDLAAFIEVASALDAFCARVDQTIAIHLVSPTSISGVELRS
250 260 270 280 290 300
310 320 330 340 350 360
orf119ng AVTGVGFVLEDDGAFHYTDTSGSTMFSICSLNNEPFTNALLDNQSYKGFSMLLDIPHSPA
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf119-1 AVTGVGFVLEDDGAFHYTDTSGSTMFSICSLNNEPFTNALLDNQSYKGFSMLLDIPHSPA
310 320 330 340 350 360
370 380 390 400 410 420
orf119ng GEKTFDDLFMDLAVRLSGQLNLNLVNDKMEEVSTQWLKDVRTYVLARQSEMLKVGIEPGG
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf119-1 GEKTFDDLFMDLAVRLSGQLNLNLVNDKMEEVSTQWLKDVRTYVLARQSEMLKVGIEPGG
370 380 390 400 410 420
429
orf119ng KTALRLFSX
|||||||||
orf119-1 KTALRLFSX
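Based on this analysis, including the presence of a putative leader sequence in the gonococcal protein, it is predicted that the proteins from N. meningitidis and N. gonorrhoeae, and their epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies.
Example 64
The following partial DNA sequence was identified in N. meningitidis <SEQ ID 531>: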
1 ..GCGCGGCACG GCACGGAAGA TTTCTTCATG AACAACAGCG ACAC.ATCAG
51 GCAGATAGTC GAAAGCACCA CCGGTACGAT GAAGCTGCTG ATTTCCTCCA
101 TCGCCCTGAT TTCATTGGTA GTCGGCGGCA TCGGCGTGAT GAACATCATG
151 CTGGTGTCCG TTACCGAGCG CACCAAAGAA ATCGGCATAC GGATGGCAAT
201 CGGCGCGCGG CGCGGCAATA TTTyGCAGCA GTTTTTGATT GAGGCGGTGT
251 TAATCTGCGT CATCGGCGGT TTGGTCGGCG TGGGTTTGTC CGCCGCCGTC
301 AGCCTCGTGT TCAATCATTT TGTAACCGAC TTCCCGATGG ACATTTCCGC
351 CATGTCCGTC ATCGGCGCGG TCGCCTGTTC GACCGGAATC GGCATCGCGT
401 TCGGCTTTAT GCCTGCCAAT AAAGCAGCCA AACTCAATCC GATAGACGCA
451 TTGGCACAGG ATTGA
This corresponds to the amino acid sequence <SEQ ID 532; ORF134>:
1 ..ARHGTEDFFM NNSDXIRQIV ESTTGTMKLL ISSIALISLV VGGIGVMNIM
51 LVSVTERTKE IGIRMAIGAR RGNIXQQFLI EAVLICVIGG LVGVGLSAAV
101 SLVFNHFVTD FPMDISAMSV IGAVACSTGI GIAFGFMPAN KAAKLNPIDA
151 LAQD*
Further work revealed the complete nucleotide sequence <SEQ ID 533>:
1 ATGTCGGTGC AAGCAGTATT GGCGCACAAA ATGCGTTCGC TTCTGACGAT
51 GCTCGGCATC ATCATCGGTA TCGCGTCGGT GGTTTCCGTC GTCGCATTGG
101 GCAATGGTTC GCAGAAAAAA ATCCTTGAAG ACATCAGTTC GATAGGGACG
151 AACACCATCA GCATCTTCCC GGGGCGCGGC TTCGGCGACA GGCGCAGCGG
201 CAGGATTAAA ACCCTGACCA TAGACGACGC AAAAATCATC GCCAAACAAA
251 GCTACGTTGC TTCCGCCACG CCCATGACTT CGAGCGGCGG CACGCTGACT
301 TACCGCAACA CCGACCTGAC CGCCTCGCTT TACGGCGTGG GCGAACAATA
351 TTTCGACGTG CGCGGACTGA AGCTGGAAAC GGGGCGGCTG TTTGACGAAA
401 ACGATGTGAA AGAAGACGCG CAGGTCGTCG TCATCGACCA AAATGTCAAA
451 GACAAACTCT TTGCGGACTC GGATCCGTTG GGTAAAACCA TTTTGTTCAG
501 GAAACGCCCC TTGACCGTCA TCGGCGTGAT GAAAAAAGAC GAAAACGCTT
551 TCGGCAATTC CGACGTGCTG ATGCTTTGGT CGCCCTATAC GACGGTGATG
601 CACCAAATCA CAGGCGAGAG CCACACCAAC TCCATCACCG TCAAAATCAA
651 AGACAATGCC AATACCCAGG TTGCCGAAAA AGGGCTGACC GATCTGCTCA
701 AAGCGCGGCA CGGCACGGAA GATTTCTTCA TGAACAACAG CGACAGCATC
751 AGGCAGATAG TCGAAAGCAC CACCGGTACG ATGAAGCTGC TGATTTCCTC
801 CATCGCCCTG ATTTCATTGG TAGTCGGCGG CATCGGCGTG ATGAACATCA
851 TGCTGGTGTC CGTTACCGAG CGCACCAAAG AAATCGGCAT ACGGATGGCA
901 ATCGGCGCGC GGCGCGGCAA TATTTTGCAG CAGTTTTTGA TTGAGGCGGT
951 GTTAATCTGC GTCATCGGCG GTTTGGTCGG CGTGGGTTTG TCCGCCGCCG
1001 TCAGCCTCGT GTTCAATCAT TTTGTAACCG ACTTCCCGAT GGACATTTCC
1051 GCCATGTCCG TCATCGGCGC GGTCGCCTGT TCGACCGGAA TCGGCATCGC
1101 GTTCGGCTTT ATGCCTGCCA ATAAAGCAGC CAAACTCAAT CCGATAGACG
1151 CATTGGCACA GGATTGA
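The position labels in these listings allow a simple consistency check between a nucleotide sequence and the length of the protein it encodes. The sketch below is a hypothetical helper (`parse_listing` is not part of the original disclosure) that reads numbered, space-grouped lines, verifies that each label continues from the previous line, and derives the total length of <SEQ ID 533> from its last two lines.

```python
def parse_listing(text):
    """Return (first_position, sequence) from a numbered, space-grouped listing."""
    first = None
    groups = []
    for line in text.strip().splitlines():
        label, *blocks = line.split()
        pos = int(label)
        if first is None:
            first = pos
        # The label must equal the position of the next unread residue.
        expected = first + sum(len(g) for g in groups)
        if pos != expected:
            raise ValueError(f"line labelled {pos}, expected {expected}")
        groups.extend(blocks)
    return first, "".join(groups)


# Last two lines of the complete nucleotide sequence <SEQ ID 533>.
TAIL = """
1101 GTTCGGCTTT ATGCCTGCCA ATAAAGCAGC CAAACTCAAT CCGATAGACG
1151 CATTGGCACA GGATTGA
"""

first, seq = parse_listing(TAIL)
total_nt = first + len(seq) - 1       # position of the final nucleotide
residues = total_nt // 3 - 1          # codons minus the terminal stop codon
print(total_nt, total_nt % 3, residues)  # 1167 0 388
```

A length of 1167 nucleotides, ending in the stop codon TGA, corresponds to a protein of 388 amino acids.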
This corresponds to the amino acid sequence <SEQ ID 534; ORF134-1>:
1 MSVQAVLAHK MRSLLTMLGI IIGIASVVSV VALGNGSQKK ILEDISSIGT
51 NTISIFPGRG FGDRRSGRIK TLTIDDAKII AKQSYVASAT PMTSSGGTLT
101 YRNTDLTASL YGVGEQYFDV RGLKLETGRL FDENDVKEDA QVVVIDQNVK
151 DKLFADSDPL GKTILFRKRP LTVIGVMKKD ENAFGNSDVL MLWSPYTTVM
201 HQITGESHTN SITVKIKDNA NTQVAEKGLT DLLKARHGTE DFFMNNSDSI
251 RQIVESTTGT MKLLISSIAL ISLVVGGIGV MNIMLVSVTE RTKEIGIRMA
301 IGARRGNILQ QFLIEAVLIC VIGGLVGVGL SAAVSLVFNH FVTDFPMDIS
351 AMSVIGAVAC STGIGIAFGF MPANKAAKLN PIDALAQD*
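Computer analysis of this amino acid sequence gave the following results:
Homology with the hypothetical o648 protein of E. coli (accession number AE000189)
ORF134 and the o648 protein show 45% amino acid identity in a 153 amino acid overlap: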
Orf134:2 RHGTEDFFMNNSDXIRQIVESTTGTMKXXXXXXXXXXXVVGGIGVMNIMLVSVTERTKEI 61
RHG +DFF N D + + VE TT T++ VVGGIGVMNIMLVSVTERT+EI
o648: 496 RHGKKDFFTWNMDGVLKTVEKTTRTLQLFLTLVAVISLVVGGIGVMNIMLVSVTERTREI 555
Orf134:62 GIRMAIGARRGNIXQQFLIEAXXXXXXXXXXXXXXXXXXXXXFNHFVTDFPMDISAMSVI 121
GIRMA+GAR ++ QQFLIEA F+ + + S ++++
o648: 556 GIRMAVGARASDVLQQFLIEAVLVCLVGGALGITLSLLIAFTLQLFLPGWEIGFSPLALL 615
Orf134:122 GAVACSTGIGIAFGFMPANKAAKLNPIDALAQD 154
A CST GI FG++PA AA+L+P+DALA++
o648: 616 LAFLCSTVTGILFGWLPARNAARLDPVDALARE 648
Homology with a predicted ORF from N. meningitidis (strain A)
ORF134 shows 98.7% identity over a 154 amino acid overlap with an ORF (ORF134a) from strain A of N. meningitidis:
10 20 30
orf134.pep ARHGTEDFFMNNSDXIRQIVESTTGTMKLL
|||||||||||||| |||||||||||||||
orf134a GESHTNSITVKIKDNANTQVAEKGLTDLLKARHGTEDFFMNNSDSIRQIVESTTGTMKLL
210 220 230 240 250 260
40 50 60 70 80 90
orf134.pep ISSIALISLVVGGIGVMNIMLVSVTERTKEIGIRMAIGARRGNIXQQFLIEAVLICVIGG
|||||||||||||||||||||||||||||||||||||||||||| |||||||||||||||
orf134a ISSIALISLVVGGIGVMNIMLVSVTERTKEIGIRMAIGARRGNILQQFLIEAVLICVIGG
270 280 290 300 310 320
100 110 120 130 140 150
orf134.pep LVGVGLSAAVSLVFNHFVTDFPMDISAMSVIGAVACSTGIGIAFGFMPANKAAKLNPIDA
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf134a LVGVGLSAAVSLVFNHFVTDFPMDISAMSVIGAVACSTGIGIAFGFMPANKAAKLNPIDA
330 340 350 360 370 380
orf134.pep LAQDX
|||||
orf134a LAQDX
The complete length ORF134a nucleotide sequence <SEQ ID 535> is:
1 ATGTCGGTGC AAGCAGTATT GGCGCACAAA ATGCGTTCGC TTCTGACGAT
51 GCTCGGCATC ATCATCGGTA TCGCTTCGGT TGTCTCCGTC GTCGCATTGG
101 GCAACGGTTC GCAGAAAAAA ATCCTTGAAG ACATCAGTTC GATAGGGACG
151 AACACCATCA GCATCTTCCC AGGGCGCGGC TTCGGCGACA GGCGCAGCGG
201 CAGGATTAAA ACCCTGACCA TAGACGACGC AAAAATCATC GCCAAACAAA
251 GCTACGTTGC TTCCGCCACG CCCATGACTT CGAGCGGCGG CACGCTGACT
301 TACCGCAATA CCGACCTGAC CGCTTCTTTG TACGGTGTGG GCGAACAATA
351 TTTCGACGTG CGCGGGCTGA AGCTGGAAAC GGGGCGGCTG TTTGACGAAA
401 ACGATGTGAA AGAAGACGCG CAGGTCGTCG TCATCGACCA AAATGTCAAA
451 GACAAACTCT TTGCGGACTC GGATCCGTTG GGTAAAACCA TTTTGTTCAG
501 GAAACGCCCC TTGACCGTCA TCGGCGTGAT GAAAAAAGAC GAAAACGCTT
551 TCGGCAATTC CGACGTGCTG ATGCTTTGGT CGCCCTATAC GACGGTGATG
601 CACCAAATCA CAGGCGAGAG CCACACCAAC TCCATCACCG TCAAAATCAA
651 AGACAATGCC AATACCCAGG TTGCCGAAAA AGGGCTGACC GATCTGCTCA
701 AAGCGCGGCA CGGCACGGAA GATTTCTTCA TGAACAACAG CGACAGCATC
751 AGGCAGATAG TCGAAAGCAC CACCGGTACG ATGAAGCTGC TGATTTCCTC
801 CATCGCCCTG ATTTCATTGG TAGTCGGCGG CATCGGCGTG ATGAACATCA
851 TGCTGGTGTC CGTTACCGAG CGCACCAAAG AAATCGGCAT ACGGATGGCA
901 ATCGGCGCGC GGCGCGGCAA TATTTTGCAG CAGTTTTTGA TTGAGGCGGT
951 GTTAATCTGC GTCATCGGCG GTTTGGTCGG CGTGGGTTTG TCCGCCGCCG
1001 TCAGCCTCGT GTTCAATCAT TTTGTAACCG ACTTCCCGAT GGACATTTCC
1051 GCCATGTCCG TCATCGGCGC GGTCGCCTGT TCGACCGGAA TCGGCATCGC
1101 GTTCGGCTTT ATGCCTGCCA ATAAAGCAGC CAAACTCAAT CCGATAGATG
1151 CATTGGCGCA GGATTGA
This encodes a protein having the amino acid sequence <SEQ ID 536>:
1 MSVQAVLAHK MRSLLTMLGI IIGIASVVSV VALGNGSQKK ILEDISSIGT
51 NTISIFPGRG FGDRRSGRIK TLTIDDAKII AKQSYVASAT PMTSSGGTLT
101 YRNTDLTASL YGVGEQYFDV RGLKLETGRL FDENDVKEDA QVVVIDQNVK
151 DKLFADSDPL GKTILFRKRP LTVIGVMKKD ENAFGNSDVL MLWSPYTTVM
201 HQITGESHTN SITVKIKDNA NTQVAEKGLT DLLKARHGTE DFFMNNSDSI
251 RQIVESTTGT MKLLISSIAL ISLVVGGIGV MNIMLVSVTE RTKEIGIRMA
301 IGARRGNILQ QFLIEAVLIC VIGGLVGVGL SAAVSLVFNH FVTDFPMDIS
351 AMSVIGAVAC STGIGIAFGF MPANKAAKLN PIDALAQD*
ORF134a and ORF134-1 show 100.0% identity in a 388 amino acid overlap:
orf134a.pep MSVQAVLAHKMRSLLTMLGIIIGIASVVSVVALGNGSQKKILEDISSIGTNTISIFPGRG
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf134-1 MSVQAVLAHKMRSLLTMLGIIIGIASVVSVVALGNGSQKKILEDISSIGTNTISIFPGRG
orf134a.pep FGDRRSGRIKTLTIDDAKIIAKQSYVASATPMTSSGGTLTYRNTDLTASLYGVGEQYFDV
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf134-1 FGDRRSGRIKTLTIDDAKIIAKQSYVASATPMTSSGGTLTYRNTDLTASLYGVGEQYFDV
orf134a.pep RGLKLETGRLFDENDVKEDAQVVVIDQNVKDKLFADSDPLGKTILFRKRPLTVIGVMKKD
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf134-1 RGLKLETGRLFDENDVKEDAQVVVIDQNVKDKLFADSDPLGKTILFRKRPLTVIGVMKKD
orf134a.pep ENAFGNSDVLMLWSPYTTVMHQITGESHTNSITVKIKDNANTQVAEKGLTDLLKARHGTE
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf134-1 ENAFGNSDVLMLWSPYTTVMHQITGESHTNSITVKIKDNANTQVAEKGLTDLLKARHGTE
orf134a.pep DFFMNNSDSIRQIVESTTGTMKLLISSIALISLVVGGIGVMNIMLVSVTERTKEIGIRMA
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf134-1 DFFMNNSDSIRQIVESTTGTMKLLISSIALISLVVGGIGVMNIMLVSVTERTKEIGIRMA
orf134a.pep IGARRGNILQQFLIEAVLICVIGGLVGVGLSAAVSLVFNHFVTDFPMDISAMSVIGAVAC
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf134-1 IGARRGNILQQFLIEAVLICVIGGLVGVGLSAAVSLVFNHFVTDFPMDISAMSVIGAVAC
orf134a.pep STGIGIAFGFMPANKAAKLNPIDALAQDX
|||||||||||||||||||||||||||||
orf134-1 STGIGIAFGFMPANKAAKLNPIDALAQDX
Homology with a predicted ORF from N. gonorrhoeae
ORF134 shows 96.8% identity over a 154-amino-acid overlap with a predicted ORF (ORF134.ng) from N. gonorrhoeae:
orf134.pep ARHGTEDFFMNNSDXIRQIVESTTGTMKLL 30
|||||||||||||| |||:|||||||||||
orf134ng GESHTNSITVKIKDNANTRVAEKGLAELLKARHGTEDFFMNNSDSIRQMVESTTGTMKLL 264
orf134.pep ISSIALISLVVGGIGVMNIMLVSVTERTKEIGIRMAIGARRGNIXQQFLIEAVLICVIGG 90
|||||||||||||||||||||||||||||||||||||||||||| |||||||||||:|||
orf134ng ISSIALISLVVGGIGVMNIMLVSVTERTKEIGIRMAIGARRGNILQQFLIEAVLICIIGG 324
orf134.pep LVGVGLSAAVSLVFNHFVTDFPMDISAMSVIGAVACSTGIGIAFGFMPANKAAKLNPIDA 150
||||||||||||||||||||||||||| ||||||||||||||||||||||||||||||||
orf134ng LVGVGLSAAVSLVFNHFVTDFPMDISAASVIGAVACSTGIGIAFGFMPANKAAKLNPIDA 384
orf134.pep LAQD 154
||||
orf134ng LAQD 388
The complete-length ORF134ng nucleotide sequence <SEQ ID 537> is:
1 ATGTCGGTGC AAGCAGTATT GGCGCACAAA ATGCGTTCGC TTCTGACCAT
51 GCTCGGCATC ATCATCGGTA TCGCTTCGGT TGTCTCCGTC GTCGCGCTGG
101 GCAACGGTTC GCAGAAAAAA ATCCTCGAAG ACATCAGTTC GATGGGGACG
151 AACACCATCA GCATCTTCCC CGGGCGCGGC TTCGGCGACA GGCGCAGCGG
201 CAAAATCAAA ACCCTGACCA TAGACGACGC AAAAATCATC GCCAAACAAA
251 GCTACGTTGC CTCCGCCACG CCCATGACTT CGAGCGGCGG CACGCTGACC
301 TACCGCAATA CCGACCTGAC CGCTTCTTTG TACGGTGTGG GCGAACAATA
351 TTTCGACGTG CGCGGGCTGA AGCTGGAAAC GGGGCGGCTG TTTGATGAGA
401 ACGATGTGAA AGAAGACGCG CAAGTCGTCG TCATCGACCA AAATGTCAAA
451 GACAAACTCT TTGCGGACTC GGATCCGTTG GGTAAAACCA TTTTGTTCAG
501 GAAACGCCCC TTGACCGTCA TCGGCGTGAT GAAAAAAGAC GAAAACGCTT
551 TCGGCAATTC CGACGTGCTG ATGCTTTGGT CGCCCTATAC GACGGTGATG
601 CACCAAATCA CAGGCGAGAG CCACACCAAC TCCATCACCG TCAAAATCAA
651 AGACAATGCC AATACCCGGG TTGCCGAAAA AGGGCTGGCC GAGCTGCTCA
701 AAGCACGGCA CGGCACGGAA GACTTCTTTA TGAACAACAG CGACAGCATC
751 AGGCAGATGG TCGAAAGCAC CACCGGTACG ATGAAGCTGC TGATTTCCTC
801 CATCGCCCTG ATTTCATTGG TAGTCGGCGG CATCGGTGTG ATGAACATTA
851 TGCTGGTGTC CGTTACCGAG CGCACCAAAG AAATCGGCAT ACGGATGGCA
901 ATCGGCGCGC GGCGCGGCAA TATTTTGCAG CAGTTTTTGA TTGAGGCGGT
951 GTTAATCTGC ATCATCGGAG GCTTGGTCGG CGTAGGTTTG TCCGCCGCCG
1001 TCAGCCTCGT GTTCAATCAT TTTGTAACCG ATTTCCCGAT GGACATTTCG
1051 GCGGCATCCG TTATCGGGGC GGTCGCCTGT TCGACCGGAA TCGGCATCGC
1101 GTTCGGCTTT ATGCCTGCCA ATAAGGCAGC CAAACTCAAT CCGATAGATG
1151 CATTGGCGCA GGATTGA
This encodes a protein having the amino acid sequence <SEQ ID 538>:
1 MSVQAVLAHK MRSLLTMLGI IIGIASVVSV VALGNGSQKK ILEDISSMGT
51 NTISIFPGRG FGDRRSGKIK TLTIDDAKII AKQSYVASAT PMTSSGGTLT
101 YRNTDLTASL YGVGEQYFDV RGLKLETGRL FDENDVKEDA QVVVIDQNVK
151 DKLFADSDPL GKTILFRKRP LTVIGVMKKD ENAFGNSDVL MLWSPYTTVM
201 HQITGESHTN SITVKIKDNA NTRVAEKGLA ELLKARHGTE DFFMNNSDSI
251 RQMVESTTGT MKLLISSIAL ISLVVGGIGV MNIMLVSVTE RTKEIGIRMA
301 IGARRGNILQ QFLIEAVLIC IIGGLVGVGL SAAVSLVFNH FVTDFPMDIS
351 AASVIGAVAC STGIGIAFGF MPANKAAKLN PIDALAQD*
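The correspondence between a listed nucleotide sequence and the protein it encodes can be checked by translating the open reading frame. The following is only an illustrative Python sketch, not a procedure described in this specification. It assumes the standard genetic code (NCBI translation table 1) and an in-frame input. The name `translate` is a hypothetical helper, and reporting codons that contain ambiguity symbols as `X` is an assumption.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons ordered T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def translate(dna: str) -> str:
    """Translate an in-frame DNA string; '*' marks a stop codon."""
    seq = "".join(dna.split()).upper()      # drop the spaces between 10-base groups
    usable = len(seq) - len(seq) % 3        # ignore a trailing partial codon
    # Codons with ambiguity symbols (N, M, K, ...) are not in the table -> 'X'.
    return "".join(CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, usable, 3))


# First 60 bases of the ORF134ng nucleotide sequence as listed above.
first_60 = "ATGTCGGTGC AAGCAGTATT GGCGCACAAA ATGCGTTCGC TTCTGACCAT GCTCGGCATC"
print(translate(first_60))  # MSVQAVLAHKMRSLLTMLGI
```

The output agrees with the first twenty residues of the listed protein sequence.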
ORF134ng and ORF134-1 show 97.9% identity over a 388-amino-acid overlap:
orf134ng MSVQAVLAHKMRSLLTMLGIIIGIASVVSVVALGNGSQKKILEDISSMGTNTISIFPGRG
|||||||||||||||||||||||||||||||||||||||||||||||:||||||||||||
orf134-1 MSVQAVLAHKMRSLLTMLGIIIGIASVVSVVALGNGSQKKILEDISSIGTNTISIFPGRG
orf134ng FGDRRSGKIKTLTIDDAKIIAKQSYVASATPMTSSGGTLTYRNTDLTASLYGVGEQYFDV
|||||||:||||||||||||||||||||||||||||||||||||||||||||||||||||
orf134-1 FGDRRSGRIKTLTIDDAKIIAKQSYVASATPMTSSGGTLTYRNTDLTASLYGVGEQYFDV
orf134ng RGLKLETGRLFDENDVKEDAQVVVIDQNVKDKLFADSDPLGKTILFRKRPLTVIGVMKKD
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf134-1 RGLKLETGRLFDENDVKEDAQVVVIDQNVKDKLFADSDPLGKTILFRKRPLTVIGVMKKD
orf134ng ENAFGNSDVLMLWSPYTTVMHQITGESHTNSITVKIKDNANTRVAEKGLAELLKARHGTE
||||||||||||||||||||||||||||||||||||||||||:|||||::||||||||||
orf134-1 ENAFGNSDVLMLWSPYTTVMHQITGESHTNSITVKIKDNANTQVAEKGLTDLLKARHGTE
orf134ng DFFMNNSDSIRQMVESTTGTMKLLISSIALISLVVGGIGVMNIMLVSVTERTKEIGIRMA
||||||||||||:|||||||||||||||||||||||||||||||||||||||||||||||
orf134-1 DFFMNNSDSIRQIVESTTGTMKLLISSIALISLVVGGIGVMNIMLVSVTERTKEIGIRMA
orf134ng IGARRGNILQQFLIEAVLICIIGGLVGVGLSAAVSLVFNHFVTDFPMDISAASVIGAVAC
||||||||||||||||||||:|||||||||||||||||||||||||||||| ||||||||
orf134-1 IGARRGNILQQFLIEAVLICVIGGLVGVGLSAAVSLVFNHFVTDFPMDISAMSVIGAVAC
orf134ng STGIGIAFGFMPANKAAKLNPIDALAQDX
|||||||||||||||||||||||||||||
orf134-1 STGIGIAFGFMPANKAAKLNPIDALAQDX
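The identity figures quoted for these comparisons are the fraction of aligned columns in which both sequences carry the same residue. A minimal Python sketch of that calculation follows. The function name `percent_identity` is hypothetical, and counting gap columns as part of the overlap is an assumption about the alignment program's convention rather than something stated in this specification.

```python
def percent_identity(row_a: str, row_b: str, gap: str = "-"):
    """Return (identical columns, overlap length, percent identity) for two aligned rows."""
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have the same length")
    # Columns in which both rows are gaps carry no information and are skipped.
    columns = [(x, y) for x, y in zip(row_a.upper(), row_b.upper())
               if not (x == gap and y == gap)]
    if not columns:
        raise ValueError("no aligned columns")
    identical = sum(1 for x, y in columns if x == y and x != gap)
    return identical, len(columns), 100.0 * identical / len(columns)


# Fourth 60-residue block of the ORF134ng / ORF134-1 alignment shown above.
orf134ng = "ENAFGNSDVLMLWSPYTTVMHQITGESHTNSITVKIKDNANTRVAEKGLAELLKARHGTE"
orf134_1 = "ENAFGNSDVLMLWSPYTTVMHQITGESHTNSITVKIKDNANTQVAEKGLTDLLKARHGTE"
print(percent_identity(orf134ng, orf134_1))  # (57, 60, 95.0)
```

Over the whole alignment, the eight non-identical columns marked above leave 380 identical positions out of 388, which is the 97.9% quoted.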
ORF134ng also shows homology to an E. coli ABC transporter:
sp|P75831|YBJZ_ECOLI hypothetical ABC transporter ATP-binding protein YBJZ >gi5 (AE000189) o648; similar to YBBA_HAEIN SW:P45247 [Escherichia coli] Length = 648
Score = 297 bits (753), Expect = 6e-80
Identities = 162/389 (41%), Positives = 230/389 (58%), Gaps = 1/389 (0%)
Query: 1 MSVQAVLAHKMRSLLTMLXXXXXXXXXXXXXXLGNGSQKKILEDISSMGTNTISIFPGRG 60
M+ +A+ A+KMR+LLTML +G+ +++ +L DI S+GTNTI ++PG+
Sbjct: 260 MAWRALAANKMRTLLTMLGIIIGIASVVSIVVVGDAAKQMVLADIRSIGTNTIDVYPGKD 319
Query: 61 FGDRRSGKIKTLTIDDAKIIAKQSYVASATPMTSSGGTLTYRNTDLTASLYGVGEQYFDV 120
FGD + L DD I KQ +VASATP S L Y N D+ AS GV YF+V
Sbjct: 320 FGDDDPQYQQALKYDDLIAIQKQPWVASATPAVSQNLRLRYNNVDVAASANGVSGDYFNV 379
Query: 121 RGLKLETGRLFDENDVKEDAQVVVIDQNVKDKLFAD-SDPLGKTILFRKRPLTVIGVMKK 179
G+ G F++ + AQVVV+D N + +LF +D +G+ IL P VIGV ++
Sbjct: 380 YGMTFSEGNTFNQEQLNGRAQVVVLDSNTRRQLFPHKADVVGEVILVGNMPARVIGVAEE 439
Query: 180 DENAFGNSDVLMLWSPYTTVMHQITGESHTNSITVKIKDNANTRVAEKGLAELLKARHGT 239
++ FG+S VL +W PY+T+ ++ G+S NSITV++K+ ++ AE+ L LL RHG
Sbjct: 440 KQSMFGSSKVLRVWLPYSTMSGRVMGQSWLNSITVRVKEGFDSAEAEQQLTRLLSLRHGK 499
Query: 240 EDFFMNNSDSIRQMVESTTGTMKXXXXXXXXXXXVVGGIGVMNIMLVSVTERTKEIGIRM 299
+DFF N D + + VE TT T++ VVGGIGVMNIMLVSVTERT+EIGIRM
Sbjct: 500 KDFFTWNMDGVLKTVEKTTRTLQLFLTLVAVISLVVGGIGVMNIMLVSVTERTREIGIRM 559
Query: 300 AIGARRGNILQQFLIEXXXXXXXXXXXXXXXXXXXXXXFNHFVTDFPMDISAASVIGAVA 359
A+GAR ++LQQFLIE F+ + + S +++ A
Sbjct: 560 AVGARASDVLQQFLIEAVLVCLVGGALGITLSLLIAFTLQLFLPGWEIGFSPLALLLAFL 619
Query: 360 CSTGIGIAFGFMPANKAAKLNPIDALAQD 388
CST GI FG++PA AA+L+P+DALA++
Sbjct: 620 CSTVTGILFGWLPARNAARLDPVDALARE 648
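Based on this analysis, including the presence of a leader peptide and transmembrane regions in the gonococcal protein, it is predicted that these proteins from N. meningitidis and N. gonorrhoeae, and their epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies.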
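This specification does not state which program was used to predict the transmembrane regions. As an illustration only, candidate membrane-spanning segments can be located with a sliding-window hydropathy scan. The sketch below assumes the Kyte-Doolittle hydropathy scale with the commonly quoted window of 19 residues and threshold of 1.6; the function name `hydrophobic_windows` is hypothetical.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydrophobic_windows(protein: str, window: int = 19, threshold: float = 1.6):
    """Return (1-based start, mean hydropathy) for every window at or above threshold."""
    seq = protein.upper()
    hits = []
    for i in range(len(seq) - window + 1):
        segment = seq[i:i + window]
        if any(res not in KD for res in segment):
            continue                      # skip windows containing X, * or other symbols
        mean = sum(KD[res] for res in segment) / window
        if mean >= threshold:
            hits.append((i + 1, round(mean, 2)))
    return hits


# Residues 261-290 of the gonococcal ORF134 protein listed above.
segment = "MKLLISSIALISLVVGGIGVMNIMLVSVTE"
for start, mean in hydrophobic_windows(segment):
    print(start, mean)
```

A window that scores above the threshold is only a candidate; dedicated topology predictors apply further criteria.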
Example 65
The following partial DNA sequence was identified in N. meningitidis <SEQ ID 539>:
1 ..GGGACGGGAG CGATGCTGCT GCTGTTTTAC GCGGTAACGA T.CTGCCTTT
51 GGCCACTGGC GTTACCCTGA GTTACACCTC GTCGATTTTT TTGGCGGTAT
101 TTTCCTTCCT GATTTTGAAA GAACGGATTT CCGTTTACAC GCAGGCGGTG
151 CTGCTCCTTG GTTTTGCCGG CGTGGTATTG CTGCTTAATC CCTCGTTCCG
201 CAGCGGTCAG GAAACGGCGG CACTCGCCGG GCTGGCGGGC GGCGCGATGT
251 CCGGCTGGGC GTATTTGAAA GTGCGCGAAC TGTCTTTGGC GGGCGAACCC
301 GGCTGGCGCG TCGTGTTTTA CCTTTCCGTG ACAGGTGTGG CGATGTCGTC
351 GGTTTGGGCG ACGCTGACCG GCTGGCACAC CCTGTCCTTT CCATCGGCAG
401 TTTATCTGTC GTGCATCGGC GTGTCCGCGC TGATTGCCCA ACTGTCGATG
451 ACGCGCGCCT ACAAAGTCGG CGACAAATTC ACGGTTGCCT CGCTTTCCTA
501 TATGACCGTC GTTTTTTCCG CTCTGTCTGC CGCATTTTTT CTGGGCGAAG
601 ATTTTGA
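In this listing the row numbering passes directly from 501 to 601, so a row of the partial sequence appears to be absent from the text as reproduced here. Numbered sequence blocks of this kind can be checked mechanically before any further analysis. The following Python sketch is illustrative only: the function name `parse_numbered_block` is hypothetical, and it assumes that every row starts with its 1-based position and that full rows hold 50 symbols.

```python
def parse_numbered_block(lines, row_width: int = 50):
    """Join numbered sequence rows and report position ranges with no row.

    Returns (sequence, gaps), where gaps is a list of (first, last) positions
    that are not covered by any row.
    """
    chunks, gaps = [], []
    expected_start = None
    for line in lines:
        fields = line.split()
        if not fields or not fields[0].isdigit():
            continue                              # not a numbered sequence row
        start = int(fields[0])
        if expected_start is not None and start != expected_start:
            gaps.append((expected_start, start - 1))
        chunks.append("".join(fields[1:]))
        expected_start = start + row_width
    return "".join(chunks), gaps


# Last three rows of the partial sequence as they appear above.
rows = [
    "451 ACGCGCGCCT ACAAAGTCGG CGACAAATTC ACGGTTGCCT CGCTTTCCTA",
    "501 TATGACCGTC GTTTTTTCCG CTCTGTCTGC CGCATTTTTT CTGGGCGAAG",
    "601 ATTTTGA",
]
sequence, gaps = parse_numbered_block(rows)
print(gaps)  # [(551, 600)]
```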
This corresponds to the amino acid sequence <SEQ ID 540; ORF135>:
1 ..GTGAMLLLFY AVTILPLATG VTLSYTSSIF LAVFSFLILK ERISVYTQAV
51 LLLGFAGVVL LLNPSFRSGQ ETAALAGLAG GAMSGWAYLK VRELSLAGEP
101 GWRVVFYLSV TGVAMSSVWA TLTGWHTLSF PSAVYLSCIG VSALIAQLSM
151 TRAYKVGDKF TVASLSYMTV VFSALSAAFF LGEELFWQEI LGMCIIISAV
201 F*
Further work revealed the complete nucleotide sequence <SEQ ID 541>:
1 ATGGATACCG CAAAAAAAGA CATTTTAGGA TCGGGCTGGA TGCTGGTGGC
51 GGCGGCCTGC TTTACCATTA TGAACGTATT GATTAAAGAG GCATCGGCAA
101 AATTTGCCCT CGGCAGCGGC GAATTGGTCT TTTGGCGCAT GCTGTTTTCA
151 ACCGTTGCGC TCGGGGCTGC CGCCGTATTG CGTCGGGACA mCTTCCGCAC
201 GCCCCATTGG AAAAACCACT TAAACCGCAG TATGGTCGGG ACGGGGGCGA
251 TGCTGCTGCT GTTTTACGCG GTAACGCATC TGCCTTTGGC CACTGGCGTT
301 ACCCTGAGTT ACACCTCGTC GATTTTTTTG GCGGTATTTT CCTTCCTGAT
351 TTTGAAAGAA CGGATTTCCG TTTACACGCA GGCGGTGCTG CTCCTTGGTT
401 TTGCCGGCGT GGTATTGCTG CTTAATCCCT CGTTCCGCAG CGGTCAGGAA
451 ACGGCGGCAC TCGCCGGGCT GGCGGGCGGC GCGATGTCCG GCTGGGCGTA
501 TTTGAAAGTG CGCGAACTGT CTTTGGCGGG CGAACCCGGC TGGCGCGTCG
551 TGTTTTACCT TTCCGTGACA GGTGTGGCGA TGTCGTCGGT TTGGGCGACG
601 CTGACCGGCT GGCACACCCT GTCCTTTCCA TCGGCAGTTT ATCTGTCGTG
651 CATCGGCGTG TCCGCGCTGA TTGCCCAACT GTCGATGACG CGCGCCTACA
701 AAGTCGGCGA CAAATTCACG GTTGCCTCGC TTTCCTATAT GACCGTCGTT
751 TTTTCCGCTC TGTCTGCCGC ATTTTTTCTG GGCGAAGAGC TTTTCTGGCA
801 GGAAATACTC GGTATGTGCA TCATCATCCT CAGCGGTATT TTGAGCAGCA
851 TCCGCCCCAC TGCCTTCAAA CAGCGGCTGC AATCCCTGTT CCGCCAAAGA
901 TAA
This corresponds to the amino acid sequence <SEQ ID 542; ORF135-1>:
1 MDTAKKDILG SGWMLVAAAC FTIMNVLIKE ASAKFALGSG ELVFWRMLFS
51 TVALGAAAVL RRDXFRTPHW KNHLNRSMVG TGAMLLLFYA VTHLPLATGV
101 TLSYTSSIFL AVFSFLILKE RISVYTQAVL LLGFAGVVLL LNPSFRSGQE
151 TAALAGLAGG AMSGWAYLKV RELSLAGEPG WRVVFYLSVT GVAMSSVWAT
201 LTGWHTLSFP SAVYLSCIGV SALIAQLSMT RAYKVGDKFT VASLSYMTVV
251 FSALSAAFFL GEELFWQEIL GMCIIILSGI LSSIRPTAFK QRLQSLFRQR
301 *
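The complete nucleotide sequence above carries the ambiguity symbol `m` at position 191, and residue 64 of the encoded protein is correspondingly given as X. Which amino acids such a codon could encode can be listed by expanding the IUPAC nucleotide ambiguity codes. The sketch below is an illustration under that assumption, using the standard genetic code; the function name `possible_residues` is hypothetical.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons ordered T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

# IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def possible_residues(codon: str) -> set:
    """Return every amino acid an ambiguous codon could encode."""
    if len(codon) != 3:
        raise ValueError("a codon has exactly three bases")
    options = [IUPAC[base] for base in codon.upper()]
    return {CODON_TABLE["".join(bases)] for bases in product(*options)}


# Codon 64 (positions 190-192) of the sequence above reads A-m-C.
print(sorted(possible_residues("AmC")))  # ['N', 'T']
```

The strain A sequence given later in this example has ACC (Thr) at the same codon.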
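Computer analysis of this amino acid sequence gave the following results:
Homology with a predicted ORF from N. meningitidis (strain A)
ORF135 shows 99.0% identity over a 197-amino-acid overlap with an ORF (ORF135a) from strain A of N. meningitidis: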
10 20 30
orf135.pep GTGAMLLLFYAVTILPLATGVTLSYTSSIF
||||||||||||| ||||||||||||||||
orf135a STVALGAAAVLRRDTFRTPHWKNHLNRSMVGTGAMLLLFYAVTHLPLATGVTLSYTSSIF
50 60 70 80 90 100
40 50 60 70 80 90
orf135.pep LAVFSFLILKERISVYTQAVLLLGFAGVVLLLNPSFRSGQETAALAGLAGGAMSGWAYLK
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf135a LAVFSFLILKERISVYTQAVLLLGFAGVVLLLNPSFRSGQETAALAGLAGGAMSGWAYLK
110 120 130 140 150 160
100 110 120 130 140 150
orf135.pep VRELSLAGEPGWRVVFYLSVTGVAMSSVWATLTGWHTLSFPSAVYLSCIGVSALIAQLSM
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf135a VRELSLAGEPGWRVVFYLSVTGVAMSSVWATLTGWHTLSFPSAVYLSCIGVSALIAQLSM
170 180 190 200 210 220
160 170 180 190 200
orf135.pep TRAYKVGDKFTVASLSYMTVVFSALSAAFFLGEELFWQEILGMCIIISAVFX
|||||||||||||||||||||||||||||||:|||||||||||||||
orf135a TRAYKVGDKFTVASLSYMTVVFSALSAAFFLAEELFWQEILGMCIIILSGILSSIRPTAF
230 240 250 260 270 280
orf135a KQRLQSLFRQRX
290 300
The complete-length ORF135a nucleotide sequence <SEQ ID 543> is:
1 ATGGATACCG CAAAAAAAGA CATTTTAGGA TCGGGCTGGA TGCTGGTGGC
51 GGCGGCCTGC TTTACCATTA TGAACGTATT GATTAAAGAG GCATCGGCAA
101 AATTTGCCCT CGGCAGCGGC GAATTGGTCT TTTGGCGCAT GCTGTTTTCA
151 ACCGTTGCGC TCGGGGCTGC CGCCGTATTG CGTCGGGACA CCTTCCGCAC
201 GCCCCATTGG AAAAACCACT TAAACCGCAG TATGGTCGGG ACGGGGGCGA
251 TGCTGCTGCT GTTTTACGCG GTAACGCATC TGCCTTTGGC CACCGGCGTT
301 ACCCTGAGTT ACACCTCGTC GATTTTTTTG GCGGTATTTT CCTTCCTGAT
351 TTTGAAAGAA CGGATTTCCG TTTACACGCA GGCGGTGCTG CTCCTTGGTT
401 TTGCCGGCGT GGTATTGCTG CTTAATCCCT CGTTCCGCAG CGGTCAGGAA
451 ACGGCGGCAC TCGCCGGGCT GGCGGGCGGC GCGATGTCCG GCTGGGCGTA
501 TTTGAAAGTG CGCGAACTGT CTTTGGCGGG CGAACCCGGC TGGCGCGTCG
551 TGTTTTACCT TTCCGTGACA GGTGTGGCGA TGTCATCGGT TTGGGCGACG
601 CTGACCGGCT GGCACACCCT GTCCTTTCCA TCGGCAGTTT ATCTGTCGTG
651 CATCGGCGTG TCCGCGCTGA TTGCCCAACT GTCGATGACG CGCGCCTACA
701 AAGTCGGCGA CAAATTCACG GTTGCCTCGC TTTCCTATAT GACCGTCGTT
751 TTTTCCGCTC TGTCTGCCGC ATTTTTTCTG GCCGAAGAGC TTTTCTGGCA
801 GGAAATACTC GGTATGTGCA TCATCATCCT CAGCGGTATT TTGAGCAGCA
851 TCCGCCCCAC TGCCTTCAAA CAGCGGCTGC AATCCCTGTT CCGCCAAAGA
901 TAA
This encodes a protein having the amino acid sequence <SEQ ID 544>:
1 MDTAKKDILG SGWMLVAAAC FTIMNVLIKE ASAKFALGSG ELVFWRMLFS
51 TVALGAAAVL RRDTFRTPHW KNHLNRSMVG TGAMLLLFYA VTHLPLATGV
101 TLSYTSSIFL AVFSFLILKE RISVYTQAVL LLGFAGVVLL LNPSFRSGQE
151 TAALAGLAGG AMSGWAYLKV RELSLAGEPG WRVVFYLSVT GVAMSSVWAT
201 LTGWHTLSFP SAVYLSCIGV SALIAQLSMT RAYKVGDKFT VASLSYMTVV
251 FSALSAAFFL AEELFWQEIL GMCIIILSGI LSSIRPTAFK QRLQSLFRQR
301 *
ORF135a and ORF135-1 show 99.3% identity over a 300-amino-acid overlap:
orf135a.pep MDTAKKDILGSGWMLVAAACFTIMNVLIKEASAKFALGSGELVFWRMLFSTVALGAAAVL
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf135-1 MDTAKKDILGSGWMLVAAACFTIMNVLIKEASAKFALGSGELVFWRMLFSTVALGAAAVL
orf135a.pep RRDTFRTPHWKNHLNRSMVGTGAMLLLFYAVTHLPLATGVTLSYTSSIFLAVFSFLILKE
|||:||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf135-1 RRDXFRTPHWKNHLNRSMVGTGAMLLLFYAVTHLPLATGVTLSYTSSIFLAVFSFLILKE
orf135a.pep RISVYTQAVLLLGFAGVVLLLNPSFRSGQETAALAGLAGGAMSGWAYLKVRELSLAGEPG
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf135-1 RISVYTQAVLLLGFAGVVLLLNPSFRSGQETAALAGLAGGAMSGWAYLKVRELSLAGEPG
orf135a.pep WRVVFYLSVTGVAMSSVWATLTGWHTLSFPSAVYLSCIGVSALIAQLSMTRAYKVGDKFT
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf135-1 WRVVFYLSVTGVAMSSVWATLTGWHTLSFPSAVYLSCIGVSALIAQLSMTRAYKVGDKFT
orf135a.pep VASLSYMTVVFSALSAAFFLAEELFWQEILGMCIIILSGILSSIRPTAFKQRLQSLFRQR
||||||||||||||||||||:|||||||||||||||||||||||||||||||||||||||
orf135-1 VASLSYMTVVFSALSAAFFLGEELFWQEILGMCIIILSGILSSIRPTAFKQRLQSLFRQR
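Differences between two strains at the nucleotide level do not always change the encoded protein. Comparing the coding sequences codon by codon separates silent substitutions from those that alter a residue. The following Python sketch is illustrative and not part of the analysis reported here. It assumes two in-frame coding sequences of equal length and the standard genetic code; the function name `codon_differences` is hypothetical.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons ordered T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def codon_differences(cds_a: str, cds_b: str, first_codon: int = 1):
    """List (codon number, codon_a, codon_b, aa_a, aa_b, kind) for differing codons."""
    a = "".join(cds_a.split()).upper()
    b = "".join(cds_b.split()).upper()
    if len(a) != len(b) or len(a) % 3:
        raise ValueError("sequences must be in frame and of equal length")
    diffs = []
    for i in range(0, len(a), 3):
        ca, cb = a[i:i + 3], b[i:i + 3]
        if ca == cb:
            continue
        aa, ab = CODON_TABLE.get(ca, "X"), CODON_TABLE.get(cb, "X")
        if "X" in (aa, ab):
            kind = "ambiguous"            # at least one codon has an ambiguity symbol
        elif aa == ab:
            kind = "synonymous"
        else:
            kind = "non-synonymous"
        diffs.append((first_codon + i // 3, ca, cb, aa, ab, kind))
    return diffs


# Positions 751-798 (codons 251-266) of the ORF135a and ORF135-1 sequences above.
orf135a = "TTTTCCGCTC TGTCTGCCGC ATTTTTTCTG GCCGAAGAGC TTTTCTGG"
orf135_1 = "TTTTCCGCTC TGTCTGCCGC ATTTTTTCTG GGCGAAGAGC TTTTCTGG"
print(codon_differences(orf135a, orf135_1, first_codon=251))
# [(261, 'GCC', 'GGC', 'A', 'G', 'non-synonymous')]
```

This single substitution accounts for the Ala/Gly difference at residue 261 in the alignment above.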
Homology with a predicted ORF from N. gonorrhoeae
ORF135 shows 97% identity over a 201-amino-acid overlap with a predicted ORF (ORF135ng) from N. gonorrhoeae:
orf135.pep GTGAMLLLFYAVTXLPLATGVTLSYTSSIF 30
||||||||||||| |||:||||||||||||
orf135ng STVTLGAAAVLRRDTFRTPHWKNHLNRSMVGTGAMLLLFYAVTHLPLTTGVTLSYTSSIF 335
orf135.pep LAVFSFLILKERISVYTQAVLLLGFAGVVLLLNPSFRSGQETAALAGLAGGAMSGWAYLK 90
||||||||||||||||||||||||||||||||||||||||| ||||||||||||||||||
orf135ng LAVFSFLILKERISVYTQAVLLLGFAGVVLLLNPSFRSGQEPAALAGLAGGAMSGWAYLK 395
orf135.pep VRELSLAGEPGWRVVFYLSVTGVAMSSVWATLTGWHTLSFPSAVYLSCIGVSALIAQLSM 150
|||||||||||||||||||:||||||||||||||||||||||||||| ||||||||||||
orf135ng VRELSLAGEPGWRVVFYLSATGVAMSSVWATLTGWHTLSFPSAVYLSGIGVSALIAQLSM 455
orf135.pep TRAYKVGDKFTVASLSYMTVVFSALSAAFFLGEELFWQEILGMCIIISAVF 201
|||||||||||||||||||||||||||||||||||||||||||||||||:|
orf135ng TRAYKVGDKFTVASLSYMTVVFSALSAAFFLGEELFWQEILGMCIIISAAF 506
The predicted ORF135ng nucleotide sequence <SEQ ID 545> encodes a protein having the amino acid sequence <SEQ ID 546>:
1 MPSEKAFRRH LRTASFQGLH LHHFHQKVGK CGIIGFGIHI FPTLLPAAQG
51 ILDIQLGLFR IDFAALAVYR RTQVDFIHTV IDGIASDQAF SEVVQILRRL
101 NLGHFTDTHL IAQARRFIAD FGNIRPMRRG EAKTFCRCFR FDGIDGIHGD
151 FRQCGHINRL APGKDCRNGK RDKYFFHTRH YNQVCLEKTN CSARKIKFRH
201 QKQAKTHSTS LAARFTIRPS LSQRPFMDTA KKDILGSGWM LVAAACFTVM
251 NVLIKEASAK FALGSGELVF WRMLFSTVTL GAAAVLRRDT FRTPHWKNHL
301 NRSMVGTGAM LLLFYAVTHL PLTTGVTLSY TSSIFLAVFS FLILKERISV
351 YTQAVLLLGF AGVVLLLNPS FRSGQEPAAL AGLAGGAMSG WAYLKVRELS
401 LAGEPGWRVV FYLSATGVAM SSVWATLTGW HTLSFPSAVY LSGIGVSALI
451 AQLSMTRAYK VGDKFTVASL SYMTVVFSAL SAAFFLGEEL FWQEILGMCI
501 IISAAF*
Further work revealed the following gonococcal sequence <SEQ ID 547>:
1 ATGGATACCG CAAAAAAAGA CATTTTAGGA TCGGGCTGGA TGCTGGTGGC
51 GGCGGCCTGC TTCACCGTTA TGAACGTATT GATTAAAGAG GCATCGGCAA
101 AATTTGCCCT CGGCAGCGGC GAATTGGTCT TTTGGCGCAT GCTGTTTTCA
151 ACCGTTACGC TCGGTGCTGC CGCCGTATTG CGGCGCGACA CCTTCCGCAC
201 GCCCCATTGG AAAAACCACT TAAACCGCAG TATGGTCGGG ACGGGGGCGA
251 TGCTGCTGCT GTTTTACGCG GTAACGCATC TGCCTTTGAC AACCGGCGTT
301 ACCCTGAGTT ACACCTCGTC GATTTTTttg GCGGTATTTT CCTTCCTGAT
351 TTTGAAAGAA CGGATTTCCG TTTACACGCA GGCGGTGCTG CTCCTTGGTT
401 TTGCCGGCGT GGTATTGCTG CTTAATCCCT CGTTCCGCAG CGGTCAGGAA
451 CCGGCGGCAC TCGCCGGGCT GGCGGGCGGC GCGATGTCCG GCTGGGCGTA
501 TTTGAAAGTG CGCGAACTGT CTTTGGCGGG CGAACCCGGC TGGCGCGTCG
551 TGTTTTACCT TTCCGCAACC GGCGTGGCGA TGTCGTCggt ttgggcgacg
601 Ctgaccggct ggCACAcccT GTCCTTTcca tcggcagttt ATCtgtCGGG
651 CATCGGCGTG tccgcgCtgA TTGCCCAaCT GtcgatgAcg cGCGcctaca
701 aaGTCGGCGA CAAATTCACG GTTGCCTCGC tttcctaTAt gaccgtcGTC
751 TTTTCCGCCC TGTCTGCCGC ATTTTTTCTg ggcgaagagc tttTCtggCA
801 GGAAATACTC GGTATGTGCA TCATTAtccT CAGCGGCATT TTGAGCAGCA
851 TCCGCCCCAT TGCCTTCAAA CAGCGGCTGC AAGCCCTCTT CCGCCAAAGA
901 TAA
This corresponds to the amino acid sequence <SEQ ID 548; ORF135ng-1>:
1 MDTAKKDILG SGWMLVAAAC FTVMNVLIKE ASAKFALGSG ELVFWRMLFS
51 TVTLGAAAVL RRDTFRTPHW KNHLNRSMVG TGAMLLLFYA VTHLPLTTGV
101 TLSYTSSIFL AVFSFLILKE RISVYTQAVL LLGFAGVVLL LNPSFRSGQE
151 PAALAGLAGG AMSGWAYLKV RELSLAGEPG WRVVFYLSAT GVAMSSVWAT
201 LTGWHTLSFP SAVYLSGIGV SALIAQLSMT RAYKVGDKFT VASLSYMTVV
251 FSALSAAFFL GEELFWQEIL GMCIIILSGI LSSIRPIAFK QRLQALFRQR
301 *
ORF135ng-1 and ORF135-1 show 97.0% identity over a 300-amino-acid overlap:
orf135ng-1.pep MDTAKKDILGSGWMLVAAACFTVMNVLIKEASAKFALGSGELVFWRMLFSTVTLGAAAVL
||||||||||||||||||||||:|||||||||||||||||||||||||||||:|||||||
orf135-1 MDTAKKDILGSGWMLVAAACFTIMNVLIKEASAKFALGSGELVFWRMLFSTVALGAAAVL
orf135ng-1.pep RRDTFRTPHWKNHLNRSMVGTGAMLLLFYAVTHLPLTTGVTLSYTSSIFLAVFSFLILKE
|||:||||||||||||||||||||||||||||||||:|||||||||||||||||||||||
orf135-1 RRDXFRTPHWKNHLNRSMVGTGAMLLLFYAVTHLPLATGVTLSYTSSIFLAVFSFLILKE
orf135ng-1.pep RISVYTQAVLLLGFAGVVLLLNPSFRSGQEPAALAGLAGGAMSGWAYLKVRELSLAGEPG
|||||||||||||||||||||||||||||| |||||||||||||||||||||||||||||
orf135-1 RISVYTQAVLLLGFAGVVLLLNPSFRSGQETAALAGLAGGAMSGWAYLKVRELSLAGEPG
orf135ng-1.pep WRVVFYLSATGVAMSSVWATLTGWHTLSFPSAVYLSGIGVSALIAQLSMTRAYKVGDKFT
||||||||:||||||||||||||||||||||||||| |||||||||||||||||||||||
orf135-1 WRVVFYLSVTGVAMSSVWATLTGWHTLSFPSAVYLSCIGVSALIAQLSMTRAYKVGDKFT
orf135ng-1.pep VASLSYMTVVFSALSAAFFLGEELFWQEILGMCIIILSGILSSIRPIAFKQRLQALFRQR
|||||||||||||||||||||||||||||||||||||||||||||| |||||||:|||||
orf135-1 VASLSYMTVVFSALSAAFFLGEELFWQEILGMCIIILSGILSSIRPTAFKQRLQSLFRQR
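Based on this analysis, including the presence of several putative transmembrane domains in the gonococcal protein, it is predicted that this protein from N. meningitidis and N. gonorrhoeae, and its epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies.
Example 66
The following DNA sequence was identified in N. meningitidis <SEQ ID 549>: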
1 ATGAAGCGGC GTATAGCCGT CTTCGTCCTG TTCCCGCAGA TAATCCGAGT
51 TTTGGGACAA CTGTTGCCGA AAATCGTCAA TACAGTTCCG GCACATCGGA
101 TGCTCTTCCA GATTTTCGGG ATGTTCTTTT TCTTCATACA CCAGCAATAT
151 CTGCCCGGGA TCGCCGAAAT CGATTCCCCA TGCGGCATCG TGTTCGGTGC
201 GCTCCTCTTC CGTCATCTGC CCGCGCATTG CCTGTATGGT AAAGCCGCCG
251 TAGGGGATGC CgTTGCACAC GAACATCCAG TCGCTGATGT CGTCAACCGG
301 AACGCAAACG cTTTCGCCTT GTTCGACATT GGTCAGTTCG CCsGGTTCAT
351 TGTTCAGCAC ACCGTAAATA TAAAGACCGT CAAAATAAAT ATCGTCGATC
401 CACATATGTT CGCAAATTTC GCCGTCTTCG CCGTCTTGGA AAAAAGGGAC
451 TTTGACCATG GCAAAATCCA AGGCGGAAAT AATGCGGCGG CGTTCCCAAA
501 AAAGcTCGCG CCAAAAATAT TTGAATGTTT TACGGGCGCG TTCGTCGGCA
551 CGGTTTACCG GTTCGTCTGC CTGTTCTACA TAATAAATGA CGGAATCGCC
651 GCTTTCTgcC kTCGGCATCC GATTCGGATT TGAAAAGTTC mmrwyATTCG
701 GAATAG
This corresponds to the amino acid sequence <SEQ ID 550; ORF136>:
1 MKRRIAVFVL FPQIIRVLGQ LLPKIVNTVP AHRMLFQIFG MFFFFIHQQY
51 LPGIAEIDSP CGIVFGALLF RHLPAHCLYG KAAVGDAVAH EHPVADVVNR
101 NANAFALFDI GQFAXFIVQH TVNIKTVKIN IVDPHMFANF AVFAVLEKRD
151 FDHGKIQGGN NAAAFPKKLA PKIFECFTGA FVGTVYRFVC LFYIINDGIA
201 HHSAPQRVRY LFAPYCGFLP SASDSDLKSS XXSE*
Further work revealed the complete nucleotide sequence <SEQ ID 551>:
1 ATGATGAAGC GGCGTATAGC CGTCTTCGTC CTGTTCCCGC AGATAATCCG
51 AGTTTTGGGA CAACTGTTGC CGAAAATCGT CAATACAGTT CCGGCACATC
101 GGATGCTCTT CCAGATTTTC GGGATGTTCT TTTTCTTCAT ACACCAGCAA
151 TATCTGCCCG GGATCGCCGA AATCGATTCC CCATGCGGCA TCGTGTTCGG
201 TGCGCTCCTC TTCCGTCATC TGCCCGCGCA TTGCCTGTAT GGTAAAGCCG
251 CCGTAGGGGA TGCCGTTGCA CACGAACATC CAGTCGCTGA TGTCGTCAAC
301 CGGAACGCAA ACGCTTTCGC CTTGTTCGAC ATTGGTCAGT TCGCCGGGTT
351 CATTGTTCAG CACACCGTAA ATATAAAGAC CGTCAAAATA AATATCGTCG
401 ATCCACATAT GTTCGCAAAT TTCGCCGTCT TCGCCGTCTT GGAAAAAAGG
451 GACTTTGACC ATGGCAAAAT CCAAGGCGGA AATAATGCGG CGGCGTTCCC
501 AAAAAAGCTC GCGCCAAAAA TATTTGAATG TTTTACGGGC GCGTTCGTCG
551 GCACGGTTTA CCGGTTCGTC TGCCTGTTCT ACATAATAAA TGACGGAATC
601 GCCCATCATT CTGCTCCTCA ACGTGTACGG TATCTGTTTG CACCTTACTG
651 CGGCTTTCTG CCTTCGGCAT CCGATTCGGA TTTGAAAAGT TCCAAATATT
701 CGGAATAG
This corresponds to the amino acid sequence <SEQ ID 552; ORF136-1>:
1 MMKRRIAVFV LFPQIIRVLG QLLPKIVNTV PAHRMLFQIF GMFFFFIHQQ
51 YLPGIAEIDS PCGIVFGALL FRHLPAHCLY GKAAVGDAVA HEHPVADVVN
101 RNANAFALFD IGQFAGFIVQ HTVNIKTVKI NIVDPHMFAN FAVFAVLEKR
151 DFDHGKIQGG NNAAAFPKKL APKIFECFTG AFVGTVYRFV CLFYIINDGI
201 AHHSAPQRVR YLFAPYCGFL PSASDSDLKS SKYSE*
Computer analysis of this amino acid sequence gave the following results:
Homology with a predicted ORF from N. meningitidis (strain A)
ORF136 shows 71.7% identity over a 237-amino-acid overlap with an ORF (ORF136a) from strain A of N. meningitidis:
10 20 30 40 50 59
orf136.pep MKRRIAVFVLFPQIIRVLGQLLPKIVNTVPAHRMLFQIFGMFFFFIHQQYLPGIAEIDS
||||||||||: | ||:|||||||||||||||||||| |||||||||||||||||||||
orf136a MMKRRIAVFVLLMQKIRILGQLLPKIVNTVPAHRMLFQXFGMFFFFIHQQYLPGIAEIDS
10 20 30 40 50 60
60 70 80 90 100 110 119
orf136.pep PCGIVFGALLFRHLPAHCLYGKAAVGDAVAHEHPVADVVNRNANAFALFDIGQFAXFIVQ
|||||||:||||| :||||||||||:|||||||||||||||||||||||||||| ||||
orf136a PCGIVFGTLLFRHXSTHCLYGKAAVGNAVAHEHPVADVVNRNANAFALFDIGQFAGFIVQ
70 80 90 100 110 120
120 130 140 150 160 170 179
orf136.pep HTVNIKTVKINIVDPHMFANFAVFAVLEKRDFDHGKIQGGNNAAAFPKKLAPKIFECFTG
|::|:||||||||||||||||| ||||||| : :| : |: | :: : :
orf136a HAINVKTVKINIVDPHMFANFAXFAVLEKRALTMAKSKXXXMRRRSQKSSRQKYLNVLRA
130 140 150 160 170 180
180 190 200 210 220 230
orf136.pep AFVGTVYRFVCLFYIINDGIAHH---SAPQRVRYLFAPYCGFLPSASDSDLKSSXXSEX
: ||: | : ::: |||||||||||||||||||||||||||| |||
orf136a R---SPARFTGLSACSTXXMTESPIISAPQRVRYLFAPYCGFLPSASDSDLKSSKYSEX
190 200 210 220 230
The complete-length ORF136a nucleotide sequence <SEQ ID 553> is:
1 ATGATGAAGC GGCGTATAGC CGTCTTCGTC CTGCTCATGC AGAAAATCCG
51 GATTTTGGGA CAACTGTTGC CGAAAATCGT CAATACAGTT CCGGCACATC
101 GGATGCTCTT CCAGATNTTC GGGATGTTCT TTTTCTTCAT ACACCAGCAA
151 TACCTGCCCG GGATCGCCGA AATCGATTCC CCATGCGGCA TCGTGTTCGG
201 TACGCTCCTC TTCCGTCATC NGTCCACGCA TTGCCTGTAT GGTAAAGCCG
251 CCGTAGGGAA TGCCGTTGCA CACGAACATC CAGTCGCTGA TGTCGTCAAC
301 CGGAACGCAA ACGCTTTCGC CTTGTTCGAC ATTGGTCAGT TCGCCGGGTT
351 CATTGTTCAG CACGCCATAA ATGTAAAGAC CGTCAAAATA AATATCGTCG
401 ATCCACATAT GTTCGCAAAT TTCGCCNTCT TCGCCGTCTT GGAAAAAAGG
451 GCTTTGACCA TGGCAAAATC TAAGGNGNNA NNGATGCGGC GGCGTTCCCA
501 AAAAAGCTCG CGCCAAAAAT ATTTGAATGT TTTGCGGGCG CGTTCGCCGG
551 CACGGTTTAC CGGTTTGTCT GCCTGTTCTA CATAATAAAT GACGGAATCG
601 CCCATCATAT CTGCTCCTCA ACGTGTACGG TATCTGTTTG CACCTTACTG
651 CGGCTTTCTG CCTTCGGCAT CCGATTCGGA TTTGAAAAGT TCCAAATATT
701 CGGAATAG
This encodes a protein having the amino acid sequence <SEQ ID 554>:
1 MMKRRIAVFV LLMQKIRILG QLLPKIVNTV PAHRMLFQXF GMFFFFIHQQ
51 YLPGIAEIDS PCGIVFGTLL FRHXSTHCLY GKAAVGNAVA HEHPVADVVN
101 RNANAFALFD IGQFAGFIVQ HAINVKTVKI NIVDPHMFAN FAXFAVLEKR
151 ALTMAKSKXX XMRRRSQKSS RQKYLNVLRA RSPARFTGLS ACST**MTES
201 PIISAPQRVR YLFAPYCGFL PSASDSDLKS SKYSE*
ORF136a and ORF136-1 show 73.1% identity over a 238-amino-acid overlap:
10 20 30 40 50 60
orf136a.pep MMKRRIAVFVLLMQKIRILGQLLPKIVNTVPAHRMLFQXFGMFFFFIHQQYLPGIAEIDS
|||||||||||: | ||:|||||||||||||||||||| |||||||||||||||||||||
orf136-1 MMKRRIAVFVLFPQIIRVLGQLLPKIVNTVPAHRMLFQIFGMFFFFIHQQYLPGIAEIDS
10 20 30 40 50 60
70 80 90 100 110 120
orf136a.pep PCGIVFGTLLFRHXSTHCLYGKAAVGNAVAHEHPVADVVNRNANAFALFDIGQFAGFIVQ
|||||||:||||| :||||||||||:|||||||||||||||||||||||||||||||||
orf136-1 PCGIVFGALLFRHLPAHCLYGKAAVGDAVAHEHPVADVVNRNANAFALFDIGQFAGFIVQ
70 80 90 100 110 120
130 140 150 160 170 180
orf136a.pep HAINVKTVKINIVDPHMFANFAXFAVLEKRALTMAKSKXXXMRRRSQKSSRQKYLNVLRA
|::|:||||||||||||||||| ||||||| : :| : |: | :: : :
orf136-1 HTVNIKTVKINIVDPHMFANFAVFAVLEKRDFDHGKIQGGNNAAAFPKKLAPKIFECFTG
130 140 150 160 170 180
190 200 210 220 230
orf136a.pep R---SPARFTGLSACSTXXMTESPIISAPQRVRYLFAPYCGFLPSASDSDLKSSKYSEX
: ||: | : ::: |||||||||||||||||||||||||||||||||
orf136-1 AFVGTVYRFVCLFYIINDGIAHH---SAPQRVRYLFAPYCGFLPSASDSDLKSSKYSEX
190 200 210 220 230
Homology with a predicted ORF from N. gonorrhoeae
ORF136 shows 92.3% identity over a 234-amino-acid overlap with a predicted ORF (ORF136ng) from N. gonorrhoeae:
orf136.pep MKRRIAVFVLFPQIIRVLGQLLPKIVNTVPAHRMLFQIFGMFFFFIHQQYLPGIAEIDS 59
||||||||||: | ||:||||||||||||||||||||||||||||||:|||||||||||
orf136ng MMKRRIAVFVLLMQKIRILGQLLPKIVNTVPAHRMLFQIFGMFFFFIHRQYLPGIAEIDS 60
orf136.pep PCGIVFGALLFRHLPAHCLYGKAAVGDAVAHEHPVADVVNRNANAFALFDIGQFAXFIVQ 119
| |||||:|||||| |||||||||||||||||||||||:|||||||||||||| | ||||
orf136ng PGGIVFGTLLFRHLSAHCLYGKAAVGDAVAHEHPVADVANRNANAFALFDIGQSAGFIVQ 120
orf136.pep HTVNIKTVKINIVDPHMFANFAVFAVLEKRDFDHGKIQGGNNAAAFPKKLAPKIFECFTG 179
|||||||||||||||||||||||||||||||||||||||||||||||||||||:||||||
orf136ng HTVNIKTVKINIVDPHMFANFAVFAVLEKRDFDHGKIQGGNNAAAFPKKLAPKVFECFTG 180
orf136.pep AFVGTVYRFVCLFYIINDGIAHHSAPQRVRYLFAPYCGFLPSASDSDLKSSXXSE 234
||:||||||||||||||||||||:|||||||||||| |||| ||||||||| ||
orf136ng AFAGTVYRFVCLFYIINDGIAHHTAPQRVRYLFAPYRGFLPPASDSDLKSSKYSE 235
The complete-length ORF136ng nucleotide sequence <SEQ ID 555> is:
1 ATGATGAAGC GGCGTATAGC CGTCTTCGTC CTGCTCATGC AGAAAATCCG
51 GATTTTGGGA CAACTGTTGC CGAAAATCGT CAATACAGTT CCGGCACATC
101 GGATGCTCTT CCAAATTTTC GGGATGTTCT TTTTCTTCAT ACACCGGCAA
151 TACCTGCCCG GGATCGCCGA AATCGATTCC CCAGGCGGTA TCGTGTTCGG
201 TACGCTCCTC TTCCGTCATC TGTCCGCGCA TTGCCTGTAC GGTAAAGCCG
251 CCGTAGGGGA TGCCGTTGCA CACGAACATC CAGTCGCTGA TGTCGCCAAC
301 CGGAACGCAA ACGCTTTCGC CTTGTTCGAC ATTGGTCAGT CCGCCGGGTT
351 CATTGTTCAG CACACCGTAA ATATAAAGAC CGTCAAAATA AATATCGTCG
401 ATCCACATAT GTTCGCAAAT TTCGCCGTCT TCGCCGTCTT GGAAAAAAGG
451 GACTTTGACC ATGGCAAAAT CCAAGGCGGA AATAATGCGG CGGCGTTCCC
501 AAAAAAGCTC GCGCCAAAAG TATTTGAATG TTTTACGGGC GCGTTCGCCG
551 GCACGGTTTA CCGGTTCGTC TGCCTGTTCT ACATAATAAA TGACGGAATC
601 GCCCATCATA CTGCTCCTCA ACGTGTACGG TATCTGTTTG CACCTTACCG
651 CGGTTTTCTA CCTCCGGCAT CCGATTCGGA TTTGAAAAGT TCCAAATATT
701 CGGAATAG
This encodes a protein having the amino acid sequence <SEQ ID 556>:
1 MMKRRIAVFV LLMQKIRILG QLLPKIVNTV PAHRMLFQIF GMFFFFIHRQ
51 YLPGIAEIDS PGGIVFGTLL FRHLSAHCLY GKAAVGDAVA HEHPVADVAN
101 RNANAFALFD IGQSAGFIVQ HTVNIKTVKI NIVDPHMFAN FAVFAVLEKR
151 DFDHGKIQGG NNAAAFPKKL APKVFECFTG AFAGTVYRFV CLFYIINDGI
201 AHHTAPQRVR YLFAPYRGFL PPASDSDLKS SKYSE*
ORF136ng and ORF136-1 show 93.6% identity over a 235-amino-acid overlap:
orf136ng MMKRRIAVFVLLMQKIRILGQLLPKIVNTVPAHRMLFQIFGMFFFFIHRQYLPGIAEIDS
|||||||||||: | ||:||||||||||||||||||||||||||||||:|||||||||||
orf136-1 MMKRRIAVFVLFPQIIRVLGQLLPKIVNTVPAHRMLFQIFGMFFFFIHQQYLPGIAEIDS
orf136ng PGGIVFGTLLFRHLSAHCLYGKAAVGDAVAHEHPVADVANRNANAFALFDIGQSAGFIVQ
| |||||:|||||| |||||||||||||||||||||||:|||||||||||||| ||||||
orf136-1 PCGIVFGALLFRHLPAHCLYGKAAVGDAVAHEHPVADVVNRNANAFALFDIGQFAGFIVQ
orf136ng HTVNIKTVKINIVDPHMFANFAVFAVLEKRDFDHGKIQGGNNAAAFPKKLAPKVFECFTG
|||||||||||||||||||||||||||||||||||||||||||||||||||||:||||||
orf136-1 HTVNIKTVKINIVDPHMFANFAVFAVLEKRDFDHGKIQGGNNAAAFPKKLAPKIFECFTG
orf136ng AFAGTVYRFVCLFYIINDGIAHHTAPQRVRYLFAPYRGFLPPASDSDLKSSKYSEX
||:||||||||||||||||||||:|||||||||||| |||| ||||||||||||||
orf136-1 AFVGTVYRFVCLFYIINDGIAHHSAPQRVRYLFAPYCGFLPSASDSDLKSSKYSEX
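Based on the presence of a putative transmembrane domain in the gonococcal protein, it is predicted that this protein from N. meningitidis and N. gonorrhoeae, and its epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies.
Example 67
The following partial DNA sequence was identified in N. meningitidis <SEQ ID 557>: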
1 ATGGAAAATA TGGTAACGTT TTCAAAAATC AGACCGCTTT TGGCAATCGC
51 CGCCGCCGCG TTGCTTGCCG CC.TGCGGAC GGCGGGAAAT AATGCTGTCC
101 GCAAGCCGGT GCAAACCGCC AAACCCGCCG CAGTGGTCGG TTTGGCACTC
151 GGTGGCGGCG CATCTAAAGG ATTTGCCCAT GTAGGTATTA TTAAGGTTTT
201 GAAAGAAAAC GGTATTCCTG TGAAGGTGGT TACCGGCACC TCCGCAGGTT
251 CGATTGTCGG CAACCTTTTT GCATCGGGTA TGTCGCCCGA CCGCCTCGAA
301 TTGGAAGCCG AAATTTTAGG CAAAACCGAT TTGGTCGATT TAACCTTGTC
351 CACCAATGGG TTTATCAAAG GCGCAAAGCT GCAAAATTAC ATCAACCGAA
401 AACTCCGCGG CATGCAGATT CAGCAGTTTC CCATCAAATT TGCCGCC..
This corresponds to the amino acid sequence <SEQ ID 558; ORF137>:
1 MENMVTFSKI RPLLAIAAAA LLAAXRTAGN NAVRKPVQTA KPAAVVGLAL
51 GGGASKGFAH VGIIKVLKEN GIPVKVVTGT SAGSIVGNLF ASGMSPDRLE
101 LEAEILGKTD LVDLTLSTNG FIKGAKLQNY INRKLRGMQI QQFPIKFAA..
Further work revealed the complete nucleotide sequence <SEQ ID 559>:
1 ATGGAAAATA TGGTAACGTT TTCAAAAATC AGACCGCTTT TGGCAATCGC
51 CGCCGCCGCG TTGCTTGCCG CCTGCGGCAC GGCGGGAAAT AATGCTGTCC
101 GCAAGCCGGT GCAAACCGCC AAACCCGCCG CAGTGGTCGG TTTGGCACTC
151 GGTGGCGGCG CATCTAAAGG ATTTGCCCAT GTAGGTATTA TTAAGGTTTT
201 GAAAGAAAAC GGTATTCCTG TGAAGGTGGT TACCGGCACA TCGGCAGGTT
251 CGATTGTCGG CAGCCTTTTT GCATCGGGTA TGTCGCCCGA CCGCCTCGAA
301 TTGGAAGCCG AAATTTTAGG CAAAACCGAT TTGGTCGATT TAACCTTGTC
351 CACCAGTGGT TTTATCAAAG GCGAAAAGCT GCAAAATTAC ATCAACCGAA
401 AAGTCGGCGG CAGGCAGATT CAGCAGTTTC CCATCAAATT TGCCGCCGTT
451 GCTACTGATT TTGAAACCGG CAAGGCCGTC GCTTTCAATC AGGGGAATGC
501 CGGGCAGGCT GTGCGCGCTT CCGCCGCCAT TCCCAATGTG TTCCAACCCG
551 TTATCATCGG CAGGCATACA TATGTTGACG GCGGTCTGTC GCAGCCCGTG
601 CCCGTCAGTG CCGCCCGGCG GCAGGGGGCG AATTTCGTGA TTGCCGTCGA
651 TATTTCCGCC CGTCCGGGCA AAAACATCAG CCAAGGTTTC TTCTCTTATC
701 TCGATCAGAC GCTGAACGTA ATGAGCGTTT CTGCGTTGCA AAATGAGTTG
751 GGGCAGGCGG ATGTGGTTAT CAAACCGCAG GTTTTGGATT TGGGTGCAGT
801 CGGCGGATTC GATCAGAAAA AACGCGCCAT CCGGTTGGGT GAGGAGGCAG
851 CACGTGCCGC ATTGCCTGAA ATCAAACGCA AACTGGCGGC ATACCGTTAT
901 TGA
This corresponds to the amino acid sequence <SEQ ID 560; ORF137-1>:
1 MENMVTFSKI RPLLAIAAAA LLAACGTAGN NAVRKPVQTA KPAAVVGLAL
51 GGGASKGFAH VGIIKVLKEN GIPVKVVTGT SAGSIVGSLF ASGMSPDRLE
101 LEAEILGKTD LVDLTLSTSG FIKGEKLQNY INRKVGGRQI QQFPIKFAAV
151 ATDFETGKAV AFNQGNAGQA VRASAAIPNV FQPVIIGRHT YVDGGLSQPV
201 PVSAARRQGA NFVIAVDISA RPGKNISQGF FSYLDQTLNV MSVSALQNEL
251 GQADVVIKPQ VLDLGAVGGF DQKKRAIRLG EEAARAALPE IKRKLAAYRY
301 *
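Computer analysis of this amino acid sequence gave the following results:
Homology with a predicted ORF from N. meningitidis (strain A)
ORF137 shows 93.3% identity over a 149-amino-acid overlap with an ORF (ORF137a) from strain A of N. meningitidis: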
10 20 30 40 50 60
orf137.pep MENMVTFSKIRPLLAIAAAALLAAXRTAGNNAVRKPVQTAKPAAVVGLALGGGASKGFAH
|||||||||||||||||||||||| ||||||:|||||||||||||||||||||||||||
orf137a MENMVTFSKIRPLLAIAAAALLAACGTAGNNAARKPVQTAKPAAVVGLALGGGASKGFAH
10 20 30 40 50 60
70 80 90 100 110 120
orf137.pep VGIIKVLKENGIPVKVVTGTSAGSIVGNLFASGMSPDRLELEAEILGKTDLVDLTLSTNG
|||||||||||||||||||||||||||:||||||||||||||||||||||||||||||:|
orf137a VGIIKVLKENGIPVKVVTGTSAGSIVGSLFASGMSPDRLELEAEILGKTDLVDLTLSTSG
70 80 90 100 110 120
130 140 149
orf137.pep FIKGAKLQNYINRKLRGMQIQQFPIKFAA
|||| |||||||||: | :||||||||||
orf137a FIKGEKLQNYINRKVGGRRIQQFPIKFAAVATDFETGKAVAFNQGNAGQAVRASAAIPNV
130 140 150 160 170 180
The complete-length ORF137a nucleotide sequence <SEQ ID 561> is:
1 ATGGAAAATA TGGTAACGTT TTCAAAAATC AGACCGCTTT TGGCAATCGC
51 CGCCGCCGCG TTGCTTGCCG CCTGCGGCAC GGCGGGAAAT AATGCTGCCC
101 GCAAGCCGGT GCAAACCGCC AAACCCGCCG CAGTGGTCGG TTTGGCACTC
151 GGTGGCGGCG CATCTAAAGG ATTTGCCCAT GTAGGTATTA TTAAGGTTTT
201 GAAAGAAAAC GGTATTCCTG TGAAGGTGGT TACCGGCACA TCGGCAGGTT
251 CGATAGTCGG CAGCCTTTTT GCATCGGGTA TGTCGCCCGA CCGCCTCGAA
301 TTGGAAGCCG AAATTTTAGG TAAAACCGAT TTGGTCGATT TAACCTTGTC
351 CACCAGTGGT TTTATCAAAG GCGAAAAGCT GCAAAATTAC ATCAACCGAA
401 AAGTCGGCGG CAGGCGGATT CAGCAGTTTC CCATCAAATT TGCCGCCGTT
451 GCTACTGATT TTGAAACCGG CAAGGCCGTC GCTTTCAATC AAGGGAATGC
501 CGGGCAGGCT GTGCGCGCTT CCGCCGCCAT TCCCAATGTG TTCCAACCCG
551 TTATCATCGG CAGGCATACA TATGTTGACG GCGGTCTGTC GCAGCCCGTG
601 CCCGTCAGTG CCGCCCGGCG GCANGNNNNG NATNTCGTGA TTGCCGTCGA
651 TATTTCCGCC CGTCCGAGCA AAAACATCAG CCAAGGCTTC TTCTCTTATC
701 TCGATCAGAC GCTGAACGTA ATGAGCGTTT CCGCGTTGCA AAATGAGTTG
751 GGGCAGGCGG ATGTGGTTAT CAAACCGCAG GTTTTGGATT TGGGTGCAGT
801 CGGCGGATTC GATCAGAAAA AACGCGCCAT CCGGTTGGGT GAGGAGGCAG
851 CACGTGCCGC ATTGCCTGAA ATCAAACGCA AACTGGCGGC ATACCGTTAT
901 TGA
This encodes a protein having the amino acid sequence <SEQ ID 562>:
1 MENMVTFSKI RPLLAIAAAA LLAACGTAGN NAARKPVQTA KPAAVVGLAL
51 GGGASKGFAH VGIIKVLKEN GIPVKVVTGT SAGSIVGSLF ASGMSPDRLE
101 LEAEILGKTD LVDLTLSTSG FIKGEKLQNY INRKVGGRRI QQFPIKFAAV
151 ATDFETGKAV AFNQGNAGQA VRASAAIPNV FQPVIIGRHT YVDGGLSQPV
201 PVSAARRXXX XXVIAVDISA RPSKNISQGF FSYLDQTLNV MSVSALQNEL
251 GQADVVIKPQ VLDLGAVGGF DQKKRAIRLG EEAARAALPE IKRKLAAYRY
301 *
ORF137a and ORF137-1 show 97.3% identity over a 300-amino-acid overlap:
orf137a.pep MENMVTFSKIRPLLAIAAAALLAACGTAGNNAARKPVQTAKPAAVVGLALGGGASKGFAH
||||||||||||||||||||||||||||||||:|||||||||||||||||||||||||||
orf137-1 MENMVTFSKIRPLLAIAAAALLAACGTAGNNAVRKPVQTAKPAAVVGLALGGGASKGFAH
orf137a.pep VGIIKVLKENGIPVKVVTGTSAGSIVGSLFASGMSPDRLELEAEILGKTDLVDLTLSTSG
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf137-1 VGIIKVLKENGIPVKVVTGTSAGSIVGSLFASGMSPDRLELEAEILGKTDLVDLTLSTSG
orf137a.pep FIKGEKLQNYINRKVGGRRIQQFPIKFAAVATDFETGKAVAFNQGNAGQAVRASAAIPNV
||||||||||||||||||:|||||||||||||||||||||||||||||||||||||||||
orf137-1 FIKGEKLQNYINRKVGGRQIQQFPIKFAAVATDFETGKAVAFNQGNAGQAVRASAAIPNV
orf137a.pep FQPVIIGRHTYVDGGLSQPVPVSAARRXXXXXVIAVDISARPSKNISQGFFSYLDQTLNV
||||||||||||||||||||||||||| ||||||||||:|||||||||||||||||
orf137-1 FQPVIIGRHTYVDGGLSQPVPVSAARRQGANFVIAVDISARPGKNISQGFFSYLDQTLNV
orf137a.pep MSVSALQNELGQADVVIKPQVLDLGAVGGFDQKKRAIRLGEEAARAALPEIKRKLAAYRY
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf137-1 MSVSALQNELGQADVVIKPQVLDLGAVGGFDQKKRAIRLGEEAARAALPEIKRKLAAYRY
Homology with a predicted ORF from N. gonorrhoeae
ORF137 shows 89.9% identity over a 149-amino-acid overlap with a predicted ORF (ORF137ng) from N. gonorrhoeae:
orf137.pep MENMVTFSKIRPLLAIAAAALLAAXRTAGNNAVRKPVQTAKPAAVVGLALGGGASKGFAH 60
||||||||||| :||||||||||| ||||||:|||||||||||||:|||||||||||||
orf137ng MENMVTFSKIRSFLAIAAAALLAACGTAGNNAARKPVQTAKPAAVVALALGGGASKGFAH 60
orf137.pep VGIIKVLKENGIPVKVVTGTSAGSIVGNLFASGMSPDRLELEAEILGKTDLVDLTLSTNG 120
:||:|||||||||||||||||||||||:|:||||||||||||||||||||||||||||:|
orf137ng IGIVKVLKENGIPVKVVTGTSAGSIVGSLLASGMSPDRLELEAEILGKTDLVDLTLSTSG 120
orf137.pep FIKGAKLQNYINRKLRGMQIQQFPIKFAA 149
|||| |||||||||: | |||||||||||
orf137ng FIKGEKLQNYINRKVGGRQIQQFPIKFAAVATDFETGKAVAFNQGNAGQAVRASAAIPNV 180
The complete-length ORF137ng nucleotide sequence <SEQ ID 563> is:
1 ATGGAAAATA TGGTAACGTT TTCAAAAATC AGATCATTTT TGGCAATCGC
51 CGCCGCCGCG TTGCTTGCCG CCTGCGGTAC GGCGGGAAAC AATGCCGCCC
101 GCAAGCCGGT GCAAACCGCC AAACCCGCCG CAGTGGTCGC TTTGGCACTC
151 GGTGGCGGCG CATCTAAAGG ATTTGCCCAT ATAGGAATTG TTAAGGTTTT
201 GAAAGAAAAC GGTATTCCTG TGAAGGTGGT TACCGGCACA TCGGCAGGTT
251 CGATAGTCGG CAGCCTTTTG GCATCGGGTA TGTCGCCCGA CCGCCTCGAA
301 TTGGAAGCCG AGATTTTAGG TAAAACCGAT TTAGTCGATT TAACCTTGTC
351 CACCAGTGGT TTTATCAAAG GCGAAAAGCT GCAAAATTAC ATCAACCGAA
401 AAGTCGGCGG CAGGCAGATT CAGCAGTTTC CCATCAAATT TGCCGCCGTT
451 GCCACTGATT TTGAAACCGG CAAGGCCGTC GCTTTCAATC AAGGGAATGC
501 CGGGCAGGCG GTTCGTGCTT CCGCCGCCAT TCCCAATGTG TTCCAGCCAG
551 TCATCATCGG CAGGCACAAA TATGTTGACG GCGGTCTGTC GCAGCCCGTG
601 CCCGTCAGTG CCGCTCGGCG GCAGGGGGCG AATTTCGTGA TTGCCGTCGA
651 TATTTCCGCA CGTCCGAGCA AAAATGTCGG TCAAGGTTTC TTCTCTTATC
701 TCGATCAGAC GCTGAACGTG ATGAGCGTTT CCGTGTTGCA AAACGAGTTG
751 gggcAGGCGG ATGTGGTTAT CAAACCGCag gtTTTGGATT TGGGTGCAGT
801 CGGCGGATTC GATCAGAAAA AGCGCGCCAT CCGGTTGGGC GAGGAGGCAG
851 CACGTGCCGC ATTGCCTGAA ATCAAACGCA AACTGGCGGC ATACCGTTAT
901 TGA
This encodes a protein having amino acid sequence <SEQ ID 564>:
1 MENMVTFSKI RSFLAIAAAA LLAACGTAGN NAARKPVQTA KPAAVVALAL
51 GGGASKGFAH IGIVKVLKEN GIPVKVVTGT SAGSIVGSLL ASGMSPDRLE
101 LEAEILGKTD LVDLTLSTSG FIKGEKLQNY INRKVGGRQI QQFPIKFAAV
151 ATDFETGKAV AFNQGNAGQA VRASAAIPNV FQPVIIGRHK YVDGGLSQPV
201 PVSAARRQGA NFVIAVDISA RPSKNVGQGF FSYLDQTLNV MSVSVLQNEL
251 GQADVVIKPQ VLDLGAVGGF DQKKRAIRLG EEAARAALPE IKRKLAAYRY
301 *
ORF137ng and ORF137-1 show 96.0% identity in 300 aa overlap:
orf137ng MENMVTFSKIRSFLAIAAAALLAACGTAGNNAARKPVQTAKPAAVVALALGGGASKGFAH
||||||||||| :|||||||||||||||||||:|||||||||||||:|||||||||||||
orf137-1 MENMVTFSKIRPLLAIAAAALLAACGTAGNNAVRKPVQTAKPAAVVGLALGGGASKGFAH
orf137ng IGIVKVLKENGIPVKVVTGTSAGSIVGSLLASGMSPDRLELEAEILGKTDLVDLTLSTSG
:||:|||||||||||||||||||||||||:||||||||||||||||||||||||||||||
orf137-1 VGIIKVLKENGIPVKVVTGTSAGSIVGSLFASGMSPDRLELEAEILGKTDLVDLTLSTSG
orf137ng FIKGEKLQNYINRKVGGRQIQQFPIKFAAVATDFETGKAVAFNQGNAGQAVRASAAIPNV
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf137-1 FIKGEKLQNYINRKVGGRQIQQFPIKFAAVATDFETGKAVAFNQGNAGQAVRASAAIPNV
orf137ng FQPVIIGRHKYVDGGLSQPVPVSAARRQGANFVIAVDISARPSKNVGQGFFSYLDQTLNV
||||||||| ||||||||||||||||||||||||||||||||:||::|||||||||||||
orf137-1 FQPVIIGRHTYVDGGLSQPVPVSAARRQGANFVIAVDISARPGKNISQGFFSYLDQTLNV
orf137ng MSVSVLQNELGQADVVIKPQVLDLGAVGGFDQKKRAIRLGEEAARAALPEIKRKLAAYRY
||||:|||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf137-1 MSVSALQNELGQADVVIKPQVLDLGAVGGFDQKKRAIRLGEEAARAALPEIKRKLAAYRY
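Based on the presence of a predicted prokaryotic membrane lipoprotein lipid attachment site (underlined) in the gonococcal protein, it is predicted that the proteins from N. meningitidis and N. gonorrhoeae, and their epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies.
Example 68
The following partial DNA sequence was identified in N. meningitidis <SEQ ID 565>: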
1 ATGTTTCGTT TACAATTCAG GCTGTTTCCC CCTTTGCGAA CCGCCATGCA
51 CATCCTGTTG ACCGCCCTGC TCAAATGCCT CTCCCTGcTG CCGCTTTCCT
101 GTCTGCACAC GCTGGGAAAC CGGCTCGGAC ATCTGGCGTT TTACCTTTTA
151 AAGGAAGACC GCGCGCGCAT CGTCGCCmAT ATGCGGCAGG CGGGTTTGAA
201 CCCCGACCCC AAAACGGTCA AAGCCGTTTT TGCGGAAACG GCAAAAGGCG
251 GTTTGGAACT TGCCCCCGCG TTTTTCAGAA AACCGGAAGA CATAGAAACA
301 ATGTTCAAAG CGGTACACGG CTGGGAACAT GTGCAGCAGG CTTTGGACAA
351 ACACGAAGGG CTGCTATTC..
This corresponds to the amino acid sequence <SEQ ID 566; ORF138>:
1 MFRLQFRLFP PLRTAMHILL TALLKCLSLL PLSCLHTLGN RLGHLAFYLL
51 KEDRARIVAX MRQAGLNPDP KTVKAVFAET AKGGLELAPA FFRKPEDIET
101 MFKAVHGWEH VQQALDKHEG LLF
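The correspondence between the partial DNA sequence and ORF138 can be reproduced by translating in frame with the standard genetic code. The sketch below is illustrative Python, not the authors' tooling; the names `CODON_TABLE` and `translate` are hypothetical. It assumes that any codon containing a non-ACGT symbol (such as the IUPAC ambiguity code `m` at position 178 of the listing) is reported as `X`, which is how residue 60 appears in the ORF138 sequence above.

```python
# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna: str) -> str:
    """Translate DNA in frame 1; incomplete trailing codons are ignored.

    Assumption: a codon with any ambiguous base is emitted as 'X',
    even when every possible reading would give the same residue.
    """
    dna = dna.upper().replace(" ", "")
    residues = []
    for i in range(0, len(dna) - 2, 3):
        residues.append(CODON_TABLE.get(dna[i:i + 3], "X"))
    return "".join(residues)


# Nucleotides 1-21 and 151-190 of the partial sequence (both start in frame).
print(translate("ATGTTTCGTT TACAATTCAG G"))
print(translate("AAGGAAGACC GCGCGCGCAT CGTCGCCmAT ATGCGGCAGG"))
```

The second call covers the ambiguous codon, so its output contains an `X` where the full-length sequence later resolves the residue.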
Further work revealed the complete nucleotide sequence <SEQ ID 567>:
1 ATGTTTCGTT TACAATTCAG GCTGTTTCCC CCTTTGCGAA CCGCCATGCA
51 CATCCTGTTG ACCGCCCTGC TCAAATGCCT CTCCCTGCTG CCGCTTTCCT
101 GTCTGCACAC GCTGGGAAAC CGGCTCGGAC ATCTGGCGTT TTACCTTTTA
151 AAGGAAGACC GCGCGCGCAT CGTCGCCAAT ATGCGGCAGG CGGGTTTGAA
201 CCCCGACCCC AAAACGGTCA AAGCCGTTTT TGCGGAAACG GCAAAAGGCG
251 GTTTGGAACT TGCCCCCGCG TTTTTCAGAA AACCGGAAGA CATAGAAACA
301 ATGTTCAAAG CGGTACACGG CTGGGAACAT GTGCAGCAGG CTTTGGACAA
351 ACACGAAGGG CTGCTATTCA TCACGCCGCA CATCGGCAGC TACGATTTGG
401 GCGGACGCTA CATCAGCCAG CAGCTTCCGT TCCCGCTGAC CGCCATGTAC
451 AAACCGCCGA AAATCAAAGC GATAGACAAA ATCATGCAGG CGGGCAGGGT
501 TCGCGGCAAA GGAAAAACCG CGCCTACCAG CATACAAGGG GTCAAACAAA
551 TCATCAAAGC CCTGCGTTCG GGCGAAGCAA CCATCGTCCT GCCCGACCAC
601 GTCCCCTCCC CTCAAGAAGG CGGGGAAGGC GTATGGGTGG ATTTCTTCGG
651 CAAACCTGCC TATACCATGA CGCTGGCGGC AAAATTGGCA CACGTCAAAG
701 GCGTGAAAAC CCTGTTTTTC TGCTGCGAAC GCCTGCCTGG CGGACAAGGT
751 TTCGATTTGC ACATCCGCCC CGTCCAAGGG GAATTGAACG GCGACAAAGC
801 CCATGATGCC GCCGTGTTCA ACCGCAATGC CGAATATTGG ATACGCCGTT
851 TTCCGACGCA GTATCTGTTT ATGTACAACC GCTACAAAAT GCCGTAA
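Sequence listings in this document are printed as a leading coordinate followed by blocks of ten residues. To work with them programmatically, the coordinates and spacing have to be stripped first. The following Python sketch shows one way to do that; `parse_listing` is a hypothetical helper, and it assumes every line starts with an optional coordinate and that `..` marks truncation rather than sequence. Applied to the last two lines of the listing above, it yields 97 nucleotides; with the 16 preceding full lines the sequence is 897 nt long, that is 299 codons including the stop, which agrees with the 298-residue protein.

```python
import re


def parse_listing(lines):
    """Join a numbered, ten-residue-block listing into one uppercase string.

    Assumptions: each line may begin with a numeric coordinate, and anything
    other than letters or '*' (spaces, '..' truncation marks) is layout.
    """
    chunks = []
    for line in lines:
        line = re.sub(r"^\s*\d+\s*", "", line)          # drop the coordinate
        chunks.append(re.sub(r"[^A-Za-z*]", "", line))  # drop spacing and dots
    return "".join(chunks).upper()


tail = [
    "801 CCATGATGCC GCCGTGTTCA ACCGCAATGC CGAATATTGG ATACGCCGTT",
    "851 TTCCGACGCA GTATCTGTTT ATGTACAACC GCTACAAAAT GCCGTAA",
]
seq = parse_listing(tail)
print(len(seq), seq[-3:])
```

The same helper works for the amino acid listings, because `*` is kept as the stop symbol.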
This corresponds to the amino acid sequence <SEQ ID 568; ORF138-1>:
1 MFRLQFRLFP PLRTAMHILL TALLKCLSLL PLSCLHTLGN RLGHLAFYLL
51 KEDRARIVAN MRQAGLNPDP KTVKAVFAET AKGGLELAPA FFRKPEDIET
101 MFKAVHGWEH VQQALDKHEG LLFITPHIGS YDLGGRYISQ QLPFPLTAMY
151 KPPKIKAIDK IMQAGRVRGK GKTAPTSIQG VKQIIKALRS GEATIVLPDH
201 VPSPQEGGEG VWVDFFGKPA YTMTLAAKLA HVKGVKTLFF CCERLPGGQG
251 FDLHIRPVQG ELNGDKAHDA AVFNRNAEYW IRRFPTQYLF MYNRYKMP*
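Computer analysis of this amino acid sequence gave the following results:
Homology with a predicted ORF from N. meningitidis (strain A)
ORF138 shows 99.2% identity over a 123 aa overlap with an ORF (ORF138a) from strain A of N. meningitidis: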
10 20 30 40 50 60
orf138.pep MFRLQFRLFPPLRTAMHILLTALLKCLSLLPLSCLHTLGNRLGHLAFYLLKEDRARIVAX
|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf138a MFRLQFRLFPPLRTAMHILLTALLKCLSLLPLSCLHTLGNRLGHLAFYLLKEDRARIVAN
10 20 30 40 50 60
70 80 90 100 110 120
orf138.pep MRQAGLNPDPKTVKAVFAETAKGGLELAPAFFRKPEDIETMFKAVHGWEHVQQALDKHEG
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf138a MRQAGLNPDPKTVKAVFAETAKGGLELAPAFFRKPEDIETMFKAVHGWEHVQQALDKHEG
70 80 90 100 110 120
orf138.pep LLF
|||
orf138a LLFITPHIGSYDLGGRYISQQLPFPLTAMYKPPKIKAIDKIMQAGRVRGKGKTAPTSIQG
130 140 150 160 170 180
The complete length ORF138a nucleotide sequence <SEQ ID 569> is:
1 ATGTTTCGTT TACAATTCAG GCTGTTTCCC CCTTTGCGAA CCGCCATGCA
51 CATCCTGTTG ACCGCCCTGC TCAAATGCCT CTCCCTGCTG CCGCTTTCCT
101 GTCTGCACAC GCTGGGAAAC CGGCTCGGAC ATCTGGCGTT TTACCTTTTA
151 AAGGAAGACC GCGCGCGCAT CGTCGCCAAT ATGCGTCAGG CAGGCATGAA
201 TCCCGACCCC AAAACGGTCA AAGCCGTTTT TGCGGAAACG GCAAAAGGCG
251 GTTTGGAACT TGCCCCCGCG TTTTTCAGAA AACCGGAAGA CATAGAAACA
301 ATGTTCAAAG CGGTACACGG CTGGGAACAT GTGCAGCAGG CTTTGGACAA
351 ACACGAAGGG CTGCTATTCA TCACGCCGCA CATCGGCAGC TACGATTTGG
401 GCGGACGCTA CATCAGCCAG CAGCTTCCGT TCCCGCTGAC CGCCATGTAC
451 AAACCGCCGA AAATCAAAGC GATAGACAAA ATCATGCAGG CGGGCAGGGT
501 TCGCGGCAAA GGAAAAACCG CGCCTACCAG CATACAAGGG GTCAAACAAA
551 TCATCAAAGC CCTGCGTTCG GGCGAAGCAA CCATCGTCCT GCCCGACCAC
601 GTCCCCTCCC CTCAAGAAGG CGGGGAAGGC GTATGGGTGG ATTTCTTCGG
651 CAAACCTGCC TATACCATGA CGCTGGCGGC AAAATTGGCA CACGTCAAAG
701 GCGTGAAAAC CCTGTTTTTC TGCTGCGAAC GCCTGCCTGG CGGACAAGGT
751 TTCGATTTGC ACATCCGCCC CGTCCAAGGG GAATTGAACG GCGACAAAGC
801 CCATGATGCC GCCGTGTTCA ACCGCAATGC CGAATATTGG ATACGCCGTT
851 TTCCGACGCA GTATCTGTTT ATGTACAACC GCTACAAAAT GCCGTAA
This encodes a protein having amino acid sequence <SEQ ID 570>:
1 MFRLQFRLFP PLRTAMHILL TALLKCLSLL PLSCLHTLGN RLGHLAFYLL
51 KEDRARIVAN MRQAGLNPDP KTVKAVFAET AKGGLELAPA FFRKPEDIET
101 MFKAVHGWEH VQQALDKHEG LLFITPHIGS YDLGGRYISQ QLPFPLTAMY
151 KPPKIKAIDK IMQAGRVRGK GKTAPTSIQG VKQIIKALRS GEATIVLPDH
201 VPSPQEGGEG VWVDFFGKPA YTMTLAAKLA HVKGVKTLFF CCERLPGGQG
251 FDLHIRPVQG ELNGDKAHDA AVFNRNAEYW IRRFPTQYLF MYNRYKMP*
ORF138a and ORF138-1 show 99.7% identity in 298 aa overlap:
orf138a.pep MFRLQFRLFPPLRTAMHILLTALLKCLSLLPLSCLHTLGNRLGHLAFYLLKEDRARIVAN
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf138-1 MFRLQFRLFPPLRTAMHILLTALLKCLSLLPLSCLHTLGNRLGHLAFYLLKEDRARIVAN
orf138a.pep MRQAGMNPDPKTVKAVFAETAKGGLELAPAFFRKPEDIETMFKAVHGWEHVQQALDKHEG
|||||:||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf138-1 MRQAGLNPDPKTVKAVFAETAKGGLELAPAFFRKPEDIETMFKAVHGWEHVQQALDKHEG
orf138a.pep LLFITPHIGSYDLGGRYISQQLPFPLTAMYKPPKIKAIDKIMQAGRVRGKGKTAPTSIQG
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf138-1 LLFITPHIGSYDLGGRYISQQLPFPLTAMYKPPKIKAIDKIMQAGRVRGKGKTAPTSIQG
orf138a.pep VKQIIKALRSGEATIVLPDHVPSPQEGGEGVWVDFFGKPAYTMTLAAKLAHVKGVKTLFF
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf138-1 VKQIIKALRSGEATIVLPDHVPSPQEGGEGVWVDFFGKPAYTMTLAAKLAHVKGVKTLFF
orf138a.pep CCERLPGGQGFDLHIRPVQGELNGDKAHDAAVFNRNAEYWIRRFPTQYLFMYNRYKMP
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf138-1 CCERLPGGQGFDLHIRPVQGELNGDKAHDAAVFNRNAEYWIRRFPTQYLFMYNRYKMP
Homology with a predicted ORF from N. gonorrhoeae
ORF138 shows 94.3% identity over a 123 aa overlap with a predicted ORF (ORF138ng) from N. gonorrhoeae:
orf138.pep MFRLQFRLFPPLRTAMHILLTALLKCLSLLPLSCLHTLGNRLGHLAFYLLKEDRARIVAX 60
|||||||||||||||||||||||||||||| ||||||||||||||||||||||||||||
orf138ng MFRLQFRLFPPLRTAMHILLTALLKCLSLLSLSCLHTLGNRLGHLAFYLLKEDRARIVAN 60
orf138.pep MRQAGLNPDPKTVKAVFAETAKGGLELAPAFFRKPEDIETMFKAVHGWEHVQQALDKHEG 120
||||||||| :||||||||||| |||||||||:|||||||||||||||||||||||| ||
orf138ng MRQAGLNPDTQTVKAVFAETAKCGLELAPAFFKKPEDIETMFKAVHGWEHVQQALDKGEG 120
orf138.pep LLF 123
|||
orf138ng LLFITPHIGSYDLGGRYISQQLPFHLTAMYKPPKIKAIDKIMQAGRVRGKGKTAPTGIQG 180
The complete length ORF138ng nucleotide sequence <SEQ ID 571> is:
1 ATGTTTCGTT TACAATTCAG GCTGTTTCCC CCTTTGCGAA CCGCCATGCA
51 CATCCTGTTG ACCGCCCTGC TCAAATGCCT CTCCCTGCTG TCGCTTTCCT
101 GTCTGCACAC GCTGGGAAAC CGGCTCGGAC ATCTGGCGTT TTACCTTTTA
151 AAGGAAGACC GCGCGCGCAT CGTCGCCAAT ATGCGGCAGG CGGGTTTGAA
201 CCCCGACACG CAGACGGTCA AAGCCGTTTT TGCGGAAACG GCAAAATGCG
251 GTTTGGAACT TGCCCCCGCG TTTTTCAAAA AACCGGAAGA CATCGAAACA
301 ATGTTCAAAG CGGTACACGG CTGGGAACAC GTGCAGCAGG CTTTGGACAA
351 GGGCGAAGGG CTGCTGTTCA TCACGCCGCA CATCGGCAGC TACGATTTGG
401 GCGGACGCTA CATCAGCCAG CAGCTTCCGT TCCACCTGAC CGCCATGTAC
451 AAGCCGCCGA AAATCAAAGC GATAGACAAA ATCATGCAGG CGGGCAGGGT
501 GCGCGGCAAA GGCAAAACcg cgcccaccgg catACAAGGG GTCAAACAAA
551 tcatcaAGGC CCTGCGCGCG GGCGAGGCAA CCAtcATCCT GCCCGACCAC
601 GTCCCTTCTC CGCAGGAagg cggCGGCGTG TGGGCGGATT TTTTCGGCAA
651 ACCTGCATAc acCATGACAC TGGCGGCAAA ATTGGCACAC GTCAAAGGCG
701 TGAAAACCCT GTTTTTCTGC TGCGAACGCC TGCCCGACGG ACAAGGCTTC
751 GTGTTGCACA TCCGCCCCGT CCAAGGGGAA TTGAACGGCA ACAAAGCCCA
801 CGATGCCGCC GTGTTCAACC GCAATACCGA ATATTGGATA CGCCGTTTTC
851 CGACGCAGTA TCTGTTTATG TACAACCGCT ATAAAACGCC GTAA
This encodes a protein having amino acid sequence <SEQ ID 572>:
1 MFRLQFRLFP PLRTAMHILL TALLKCLSLL SLSCLHTLGN RLGHLAFYLL
51 KEDRARIVAN MRQAGLNPDT QTVKAVFAET AKCGLELAPA FFKKPEDIET
101 MFKAVHGWEH VQQALDKGEG LLFITPHIGS YDLGGRYISQ QLPFHLTAMY
151 KPPKIKAIDK IMQAGRVRGK GKTAPTGIQG VKQIIKALRA GEATIILPDH
201 VPSPQEGGGV WADFFGKPAY TMTLAAKLAH VKGVKTLFFC CERLPDGQGF
251 VLHIRPVQGE LNGNKAHDAA VFNRNTEYWI RRFPTQYLFM YNRYKTP*
ORF138ng and ORF138-1 show 94.3% identity in 299 aa overlap:
orf138-1.pep MFRLQFRLFPPLRTAMHILLTALLKCLSLLPLSCLHTLGNRLGHLAFYLLKEDRARIVAN
|||||||||||||||||||||||||||||| |||||||||||||||||||||||||||||
orf138ng MFRLQFRLFPPLRTAMHILLTALLKCLSLLSLSCLHTLGNRLGHLAFYLLKEDRARIVAN
orf138-1.pep MRQAGLNPDPKTVKAVFAETAKGGLELAPAFFRKPEDIETMFKAVHGWEHVQQALDKHEG
||||||||| :||||||||||| |||||||||:|||||||||||||||||||||||| ||
orf138ng MRQAGLNPDTQTVKAVFAETAKCGLELAPAFFKKPEDIETMFKAVHGWEHVQQALDKGEG
orf138-1.pep LLFITPHIGSYDLGGRYISQQLPFPLTAMYKPPKIKAIDKIMQAGRVRGKGKTAPTSIQG
|||||||||||||||||||||||| |||||||||||||||||||||||||||||||:|||
orf138ng LLFITPHIGSYDLGGRYISQQLPFHLTAMYKPPKIKAIDKIMQAGRVRGKGKTAPTGIQG
orf138-1.pep VKQIIKALRSGEATIVLPDHVPSPQEGGEGVWVDFFGKPAYTMTLAAKLAHVKGVKTLFF
|||||||||:|||||:|||||||||||| |||:|||||||||||||||||||||||||||
orf138ng VKQIIKALRAGEATIILPDHVPSPQEGG-GVWADFFGKPAYTMTLAAKLAHVKGVKTLFF
orf138-1.pep CCERLPGGQGFDLHIRPVQGELNGDKAHDAAVFNRNAEYWIRRFPTQYLFMYNRYKMP
|||||| |||| ||||||||||||:|||||||||||:||||||||||||||||||| |
orf138ng CCERLPDGQGFVLHIRPVQGELNGNKAHDAAVFNRNTEYWIRRFPTQYLFMYNRYKTP
In addition, ORF138ng is homologous to the htrB protein from Pseudomonas fluorescens:
gnl|PID|e334283 (Y14568) htrB [Pseudomonas fluorescens] Length = 253
Score = 80.8 bits (196), Expect = 9e-15
Identities = 49/151 (32%), Positives = 79/151 (51%), Gaps = 6/151 (3%)
Query: 101 MFKAVHGWEHVQQALDKGEGLLFITPHIGSYD-LGGRYISQQLPFHLTAMYKPPKIKAID 159
+ + V G E +++AL G+G++ IT H+G+++ L Y SQ P Y+PPK+KA+D
Sbjct: 94 LVREVEGLEVLKEALASGKGVVGITSHLGNWEVLNHFYCSQCKPI---IFYRPPKLKAVD 150
Query: 160 KIMQAGRVRGKGKTAPTGIQGVKQIIKALRAGEATIILPDHVPSPQEGGGVWADFFGKPA 219
++++ RV+ K A + +G+ +IK +R G I D P P E G++ FF A
Sbjct: 151 ELLRKQRVQLGNKVAASTKEGILSVIKEVRKGGQVGIPAD--PEPAESAGIFVPFFATQA 208
Query: 220 YTMTLAAKLAHVKGVKTLFFCCERLPDGQGF 250
T + +F RLPDG G+
Sbjct: 209 LTSKFVPNMLAGGKAVGVFLHALRLPDGSGY 239
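Based on this analysis, including the presence of a putative transmembrane domain in the gonococcal protein, it is predicted that the proteins from N. meningitidis and N. gonorrhoeae, and their epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies.
As described above, ORF138-1 (57 kDa) was cloned into the pGex vector and expressed in E. coli. The products of protein expression and purification were analyzed by SDS-PAGE. Figure 14A shows the results of affinity purification of the GST-fusion protein. Purified GST-fusion protein was used to immunize mice, and the mouse sera were used for ELISA (positive result) and FACS analysis (Figure 14B). These experiments confirm that ORF138-1 is a surface-exposed protein and a useful immunogen.
Example 69
The following partial DNA sequence was identified in N. meningitidis <SEQ ID 573>: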
1 ..GCGTGGTCGG CCGGCGAATC GTGGCGTGTG TTAATGGAAA GTGAAACGTG
51 GCATGCGGTG TGGAATACTT TGCGCTTCTC GGCGGCGGCG GTGTATGCGG
101 CAGCGGTTTT GGGTGTGGTG TATGCGGCGC CGGCGCGGCG GTCGGCGTGG
151 ATGCGCGGGC TGATGTTTTA GCCGTTTATG GTGTCGCCGG TTTGTGTTTC
201 GGCGGGCGTG CTGCTGCTTT ATCCGCAGTG GACGGCTTCG TTGCCGTTGC
251 TGCTGGCGAT GTATGCGCTG CTGGCGTATC CGTTTGTGGC AAAAGATGTT
301 TTATCAGCCT GGGATGCACT GCCGCCGGAT TACGGCAGGG CGGCGGCGGG
351 TTTGGGTGCA AACGGCTTTC AGACGGCATG CCGCATCACG TTCCCCCTCT
401 TGAAACCGGC GTTGCGGCGC GGTCTGACTT TGGCGGCGGC AACCTGCGTG
451 GGCGAATTTG CGGCGACATT GTTTCTGTCG CGTCCGGAAT GGCAGACGCT
501 GACGACTTTG ATTTATGCCT ATTTGGGACG CGCGGGTGAG GATAATTACG
551 CGCGGGCGAT GGTGCTG..
This corresponds to the amino acid sequence <SEQ ID 574; ORF139>:
1 ..AWSAGESWRV LMESETWHAV WNTLRFSAAA VYAAAVLGVV YAAPARRSAW
51 MRGLMFXPFM VSPVCVSAGV LLLYPQWTAS LPLLLAMYAL LAYPFVAKDV
101 LSAWDALPPD YGRAAAGLGA NGFQTACRIT FPLLKPALRR GLTLAAATCV
151 GEFAATLFLS RPEWQTLTTL IYAYLGRAGE DNYARAMVL..
Further work revealed the complete nucleotide sequence <SEQ ID 575>:
1 ATGGATGGAC GGCGTTGGGT GGTATGGGGT GCTTTTGCCC TGCTGCCTTC
51 GGCTTTTTTG GCGGTAATGG TCGTTGCGCC TTTGTGGGCG GTGGCGGCGT
101 ATGACGGTTT GGCGTGGCGC GCGGTGCTGT CGGATGCCTA TATGCTCAAA
151 CGTTTGGCGT GGACGGTATT TCAGGCAGCG GCAACCTGTG TGCTGGTGCT
201 GCCTTTGGGC GTGCCTGTCG CGTGGGTGCT GGCGCGGCTG GCGTTTCCGG
251 GGCGGGCTTT GGTGCTGCGC CTGCTGATGC TGCCTTTTGT GATGCCCACG
301 TTGGTGGCGG GCGTGGGCGT GCTGGCCCTG TTCGGGGCGG ACGGGCTGTT
351 GTGGCGCGGC AGGCAGGATA CGCCGTATCT GTTGTTGTAC GGCAATGTGT
401 TTTTCAACCT TCCTGTGTTG GTCAGGGCGG CGTATCAGGG GTTTGTGCAA
451 GTGCCTGCGG CACGGCTTCA GACGGCACGG ACGTTGGGCG CGGGGGCGTG
501 GCGGCGGTTT TGGGACATTG AAATGCCCGT TTTGCGCCCG TGGCTTGCCG
551 GCGGCGTGTG CCTTGTCTTT CTGTATTGTT TTTCCGGGTT CGGGCTGGCG
601 CTGCTGCTGG GCGGCAGCCG TTATGCCACG GTCGAAGTGG AAATTTACCA
651 GTTGGTCATG TTCGAACTCG ATATGGCGGT TGCTTCGGTG CTGGTGTGGC
701 TGGTGTTGGG GGTAACGGCG GCGGCAGGGT TGCTGTATGC GTGGTTCGGC
751 AGGCGCGCGG TTTCGGATAA GGCGGTTTCC CCTGTGATGC CGTCGCCGCC
801 GCAGTCGGTC GGGGAATATG TGCTGCTGGC GTTTGCGGCG GCGGTGTTGT
851 CTGTGTGCTG CCTGTTTCCT TTGTTGGCAA TTGTTGTGAA AGCGTGGTCG
901 GCCGGCGAAT CGTGGCGTGT GTTAATGGAA AGTGAAACGT GGCAGGCGGT
951 GTGGAATACT TTGCGCTTCT CGGCGGCGGC GGTGTATGCG GCGGCGGTTT
1001 TGGGTGTGGT GTATGCGGCG GCGGCGCGGC GGTCGGCGTG GATGCGCGGG
1051 CTGATGTTTT TGCCGTTTAT GGTGTCGCCG GTTTGTGTTT CGGCGGGCGT
1101 GCTGCTGCTT TATCCGCAGT GGACGGCTTC GTTGCCGTTG CTGCTGGCGA
1151 TGTATGCGCT GCTGGCGTAT CCGTTTGTGG CAAAAGATGT TTTATCAGCC
1201 TGGGATGCAC TGCCGCCGGA TTACGGCAGG GCGGCGGCGG GTTTGGGTGC
1251 AAACGGCTTT CAGACGGCAT GCCGCATCAC GTTCCCCCTC TTGAAACCGG
1301 CGTTGCGGCG CGGTCTGACT TTGGCGGCGG CAACCTGCGT GGGCGAATTT
1351 GCGGCGACAT TGTTTCTGTC GCGTCCGGAA TGGCAGACGC TGACGACTTT
1401 GATTTATGCC TATTTGGGAC GCGCGGGTGA GGATAATTAC GCGCGGGCGA
1451 TGGTGCTGAC ATTGCTGTTG GCGGCGTTCG CGCTGGGTAT TTTCCTGCTG
1501 TTGGACGGCG GCGAAGGCGG AAAACAGACG GAAACGTTAT AA
This corresponds to the amino acid sequence <SEQ ID 576; ORF139-1>:
1 MDGRRWVVWG AFALLPSAFL AVMVVAPLWA VAAYDGLAWR AVLSDAYMLK
51 RLAWTVFQAA ATCVLVLPLG VPVAWVLARL AFPGRALVLR LLMLPFVMPT
101 LVAGVGVLAL FGADGLLWRG RQDTPYLLLY GNVFFNLPVL VRAAYQGFVQ
151 VPAARLQTAR TLGAGAWRRF WDIEMPVLRP WLAGGVCLVF LYCFSGFGLA
201 LLLGGSRYAT VEVEIYQLVM FELDMAVASV LVWLVLGVTA AAGLLYAWFG
251 RRAVSDKAVS PVMPSPPQSV GEYVLLAFAA AVLSVCCLFP LLAIVVKAWS
301 AGESWRVLME SETWQAVWNT LRFSAAAVYA AAVLGVVYAA AARRSAWMRG
351 LMFLPFMVSP VCVSAGVLLL YPQWTASLPL LLAMYALLAY PFVAKDVLSA
401 WDALPPDYGR AAAGLGANGF QTACRITFPL LKPALRRGLT LAAATCVGEF
451 AATLFLSRPE WQTLTTLIYA YLGRAGEDNY ARAMVLTLLL AAFALGIFLL
501 LDGGEGGKQT ETL*
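The partial ORF139 sequence corresponds to an internal stretch of ORF139-1, not to its N-terminus. Where a fragment lies within a full-length protein can be found with a simple ungapped scan that tolerates a few mismatches. The Python sketch below illustrates this; `best_offset` is a hypothetical helper and the scan is a brute-force substitute for the alignment software used in the source. It is run on an excerpt of ORF139-1 (residues 291 to 324) and the first 23 residues of ORF139.

```python
def best_offset(fragment: str, full: str):
    """Return (offset, mismatches) for the ungapped placement of `fragment`
    in `full` with the fewest mismatches (0-based offset, first best wins)."""
    if not fragment or len(fragment) > len(full):
        raise ValueError("fragment must be non-empty and no longer than full")
    best = None
    for offset in range(len(full) - len(fragment) + 1):
        window = full[offset:offset + len(fragment)]
        mismatches = sum(1 for a, b in zip(fragment, window) if a != b)
        if best is None or mismatches < best[1]:
            best = (offset, mismatches)
    return best


orf139_start = "AWSAGESWRVLMESETWHAVWNT"                  # ORF139, residues 1-23
orf139_1_excerpt = "LLAIVVKAWSAGESWRVLMESETWQAVWNTLRFS"   # ORF139-1, residues 291-324
print(best_offset(orf139_start, orf139_1_excerpt))
```

An offset of 7 in this excerpt places the start of ORF139 at residue 298 of ORF139-1, with the single mismatch being the H/Q difference at the residue aligned with position 315.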
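Computer analysis of this amino acid sequence gave the following results:
Homology with a predicted ORF from N. meningitidis (strain A)
ORF139 shows 94.7% identity over a 189 aa overlap with an ORF (ORF139a) from strain A of N. meningitidis: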
10 20 30
orf139.pep AWSAGESWRVLMESETWHAVWNTLRFSAAA
|||||||||||||||||:||||| ||||||
orf139a QSVGEYVLLAFAAAVXSVCCLFXLLAIVVKAWSAGESWRVLMESETWQAVWNTXRFSAAA
270 280 290 300 310 320
40 50 60 70 80 90
orf139.pep VYAAAVLGVVYAAPARRSAWMRGLMFXPFMVSPVCVSAGVLLLYPQWTASLPLLLAMYAL
||||||||||||| |||||||||||| |||||||||||||||| ||||||||||||||||
orf139a VYAAAVLGVVYAAAARRSAWMRGLMFLPFMVSPVCVSAGVLLLXPQWTASLPLLLAMYAL
330 340 350 360 370 380
100 110 120 130 140 150
orf139.pep LAYPFVAKDVLSAWDALPPDYGRAAAGLGANGFQTACRITFPLLKPALRRGLTLAAATCV
||||||||||||| ||||||||||||||||||||||||||||||||||||||||||||||
orf139a LAYPFVAKDVLSAXDALPPDYGRAAAGLGANGFQTACRITFPLLKPALRRGLTLAAATCV
390 400 410 420 430 440
160 170 180 189
orf139.pep GEFAATLFLSRPEWQTLTTLIYAYLGRAGEDNYARAMVL
|||||||| || |||||||||||| |||| |||||||||
orf139a GEFAATLFXSRXEWQTLTTLIYAYXGRAGXDNYARAMVLTLLLAAFALGXFLLLDGGEGG
450 460 470 480 490 500
The complete length ORF139a nucleotide sequence <SEQ ID 577> is:
1 ATGGATGGAC GGCGTTGGGC GGTATGGGGT GCTTTTGCCC TGCTGCCTTC
51 GGCTTTTTTG GCGGCAATGG TCGTTGCGCC TTTGTGGGCG GTGGCGGCGT
101 ATGACGGTTT GGCGTGGCGC GCGGTGCTGT CGGATGCCTA TATGCTCAAA
151 CGTTTGGCGT GGACGGTATT TCAGGCAGCG GCAACCTGTG TGCTGGTGCT
201 GCCTTTGGGC GTGCCTGTCG CGTGGGTGCT GGCGCGGCTG GCGTTTCCGG
251 GGCGGGCTTT GGTGCTGCGC CTGCTGATGC TGCCTTTTGT GATGCCCACG
301 TTGGTGGCGG GCGTGGGCGT GCTGGCTCTG TTCGGGGCGG ACGGCCTGTN
351 GTGGCGCGGC TGGCAGGATA CGCCGTATCT GTTGTTGTAC GGCAATGTGT
401 TTTTTNACCT TCCTGTGTTG GTCAGGGCGG CATATCAGGG GTTTGTGCAA
451 GTGCCTGCGG CACGGCTTCA GACGGCACNG ACATTGGGCG CGGGGGCGTG
501 GCGGCGGTTT TGGGACATTG AAATGCCCGT TTTGCGCCCG TGGCTTGCCG
551 GCGGCGTGTG CCTTGTCTTC CTGTATTGTT TTTCGGGGTT CGGGCTGGCA
601 TTGCTGCTGG GCGGCAGCCG TTATGCCACG GTCGAAGTGG AAATTTACCA
651 GTTGGTCATG TTCGAACTCG ATATGGCGGT TGCTTCGGTG CTNGTGTGGC
701 TGGTGTNGGG GGTAACNGCG GCGGCAGGGT TGCTGTATGC GTGGTTCGGC
751 AGGCGCGCGG TTTCGGATAA GGCNGTTTCC CCTGTGATGC CGTCGCCGCC
801 GCAGTCGGTC GGGGAATATG TGCTNCTGGC GTTTGCGGCG GCGGTGTNGT
851 CTGTGTGCTG CCTGTTTCNT TTGTTGGCAA TTGTTGTGAA AGCGTGGTCG
901 GCCGGCGAAT CGTGGCGTGT GTTAATGGAA AGTGAAACGT GGCAGGCGGT
951 GTGGAATACT NTGCGCTTCT CGGCGGCGGC GGTGTATGCG GCGGCGGTTT
1001 TGGGTGTGGT GTATGCGGCG GCGGCGCGGC GGTCGGCGTG GATGCGCGGG
1051 CTGATGTTTT TGCCGTTTAT GGTGTCGCCG GTTTGTGTTT CGGCGGGCGT
1101 GCTGCTGCTT NATCCGCAGT GGACGGCTTC GTTGCCGCTG CTGCTGGCGA
1151 TGTATGCGCT GCTGGCGTAT CCGTTTGTGG CAAAAGATGT TTTATCAGCC
1201 TGNGATGCAC TGCCGCCGGA TTACGGCAGG GCGGCGGCGG GTTTGGGTGC
1251 AAACGGCTTT CAGACGGCAT GCCGCATCAC GTTCCCCCTC TTGAAACCGG
1301 CGTTGCGGCG CGGTCTGACT TTGGCGGCGG CAACCTGCGT GGGCGAATTT
1351 GCGGCAACCT TGTTCNTGTC GCGTCNCGAG TGGCAGACGC TGACGACTTT
1401 GATTTATGCC TATNTGGGAC GCGCGGGTGA NGATAATTAC GCGCGGGCGA
1451 TGGTGCTGAC ATTGCTGTTG GCGGCGTTCG CGCTGGGTAT NTTCCTGCTG
1501 TTGGACGGCG GCGAAGGCGG AAAACGGACG GAAACGTTAT AA
This encodes a protein having amino acid sequence <SEQ ID 578>:
1 MDGRRWAVWG AFALLPSAFL AAMVVAPLWA VAAYDGLAWR AVLSDAYMLK
51 RLAWTVFQAA ATCVLVLPLG VPVAWVLARL AFPGRALVLR LLMLPFVMPT
101 LVAGVGVLAL FGADGLXWRG WQDTPYLLLY GNVFFXLPVL VRAAYQGFVQ
151 VPAARLQTAX TLGAGAWRRF WDIEMPVLRP WLAGGVCLVF LYCFSGFGLA
201 LLLGGSRYAT VEVEIYQLVM FELDMAVASV LVWLVXGVTA AAGLLYAWFG
251 RRAVSDKAVS PVMPSPPQSV GEYVLLAFAA AVXSVCCLFX LLAIVVKAWS
301 AGESWRVLME SETWQAVWNT XRFSAAAVYA AAVLGVVYAA AARRSAWMRG
351 LMFLPFMVSP VCVSAGVLLL XPQWTASLPL LLAMYALLAY PFVAKDVLSA
401 XDALPPDYGR AAAGLGANGF QTACRITFPL LKPALRRGLT LAAATCVGEF
451 AATLFXSRXE WQTLTTLIYA YXGRAGXDNY ARAMVLTLLL AAFALGXFLL
501 LDGGEGGKRT ETL*
ORF139a and ORF139-1 show 96.5% identity in 514 aa overlap:
orf139a.pep MDGRRWAVWGAFALLPSAFLAAMVVAPLWAVAAYDGLAWRAVLSDAYMLKRLAWTVFQAA
||||||:||||||||||||||:||||||||||||||||||||||||||||||||||||||
orf139-1 MDGRRWVVWGAFALLPSAFLAVMVVAPLWAVAAYDGLAWRAVLSDAYMLKRLAWTVFQAA
orf139a.pep ATCVLVLPLGVPVAWVLARLAFPGRALVLRLLMLPFVMPTLVAGVGVLALFGADGLXWRG
|||||||||||||||||||||||||||||||||||||||||||||||||||||||| |||
orf139-1 ATCVLVLPLGVPVAWVLARLAFPGRALVLRLLMLPFVMPTLVAGVGVLALFGADGLLWRG
orf139a.pep WQDTPYLLLYGNVFFXLPVLVRAAYQGFVQVPAARLQTAXTLGAGAWRRFWDIEMPVLRP
|||||||||||||| ||||||||||||||||||||||| ||||||||||||||||||||
orf139-1 RQDTPYLLLYGNVFFNLPVLVRAAYQGFVQVPAARLQTARTLGAGAWRRFWDIEMPVLRP
orf139a.pep WLAGGVCLVFLYCFSGFGLALLLGGSRYATVEVEIYQLVMFELDMAVASVLVWLVXGVTA
||||||||||||||||||||||||||||||||||||||||||||||||||||||| ||||
orf139-1 WLAGGVCLVFLYCFSGFGLALLLGGSRYATVEVEIYQLVMFELDMAVASVLVWLVLGVTA
orf139a.pep AAGLLYAWFGRRAVSDKAVSPVMPSPPQSVGEYVLLAFAAAVXSVCCLFXLLAIVVKAWS
|||||||||||||||||||||||||||||||||||||||||| |||||| ||||||||||
orf139-1 AAGLLYAWFGRRAVSDKAVSPVMPSPPQSVGEYVLLAFAAAVLSVCCLFPLLAIVVKAWS
orf139a.pep AGESWRVLMESETWQAVWNTXRFSAAAVYAAAVLGVVYAAAARRSAWMRGLMFLPFMVSP
|||||||||||||||||||| |||||||||||||||||||||||||||||||||||||||
orf139-1 AGESWRVLMESETWQAVWNTLRFSAAAVYAAAVLGVVYAAAARRSAWMRGLMFLPFMVSP
orf139a.pep VCVSAGVLLLXPQWTASLPLLLAMYALLAYPFVAKDVLSAXDALPPDYGRAAAGLGANGF
|||||||||| ||||||||||||||||||||||||||||| ||||||||||||||| |||
orf139-1 VCVSAGVLLLYPQWTASLPLLLAMYALLAYPFVAKDVLSAWDALPPDYGRAAAGLGANGF
orf139a.pep QTACRITFPLLKPALRRGLTLAAATCVGEFAATLFXSRXEWQTLTTLIYAYXGRAGXDNY
||||||||||||||||||||||||||||||||||| || |||||||||||| |||| |||
orf139-1 QTACRITFPLLKPALRRGLTLAAATCVGEFAATLFLSRPEWQTLTTLIYAYLGRAGEDNY
orf139a.pep ARAMVLTLLLAAFALGXFLLLDGGEGGKRTETLX
|||||||||||||||| |||||||||||:|||||
orf139-1 ARAMVLTLLLAAFALGIFLLLDGGEGGKQTETLX
Homology with a predicted ORF from N. gonorrhoeae
ORF139 shows 95.2% identity over a 189 aa overlap with a predicted ORF (ORF139ng) from N. gonorrhoeae:
orf139.pep AWSAGESWRVLMESETWHAVWNTLRFSAAA 30
||||||| |||||||||:||||||||||||
orf139ng QSVGEYVLLAFSVAVLSVCCLFPLSAIVVKAWSAGESRRVLMESETWQAVWNTLRFSAAA 327
orf139.pep VYAAAVLGVVYAAPARRSAWMRGLMFXPFMVSPVCVSAGVLLLYPQWTASLPLLLAMYAL 90
|:||||||||||| ||| :|||||:| |||||||||||||||||| ||||||||||||||
orf139ng VFAAAVLGVVYAAAARRLVWMRGLVFLPFMVSPVCVSAGVLLLYPGWTASLPLLLAMYAL 387
orf139.pep LAYPFVAKDVLSAWDALPPDYGRAAAGLGANGFQTACRITFPLLKPALRRGLTLAAATCV 150
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf139ng LAYPFVAKDVLSAWDALPPDYGRAAAGLGANGFQTACRITFPLLKPALRRGLTLAAATCV 447
orf139.pep GEFAATLFLSRPEWQTLTTLIYAYLGRAGEDNYARAMVL 189
|||||||||||||||||||||||||||||||||||||||
orf139ng GEFAATLFLSRPEWQTLTTLIYAYLGRAGEDNYARAMVLTLLLSAFAVCIFLLLDNGEGG 507
The complete length ORF139ng nucleotide sequence <SEQ ID 579> is predicted to encode a protein having amino acid sequence <SEQ ID 580>:
1 MDGRCWAVRG AFSLLPSAFL AVMVVAPLWA VAAYDGLAWR AVLSDAYMLK
51 RLAWTVFQAA ATCVLVLPLG VPVAWVLARL AFPGRALVLR LLMLPFVMPT
101 LVAGVGVLAL FGADGLLWRG RQDTPYLLLY GNVFFNLPVL VRAAYQGFAQ
151 VPAARLQTAR TLGAGAWRPF WDIEMPVLRP WLAGGVCLVF LYCFSGFGLA
201 LLLGGSRYAT VEVEIYQLVM FELDMAGASA LVWLVLGVTA AAGLLYAWFG
251 RRAVSDKAVS PVMPSPPQSV GEYVLLAFSV AVLSVCCLFP LSAIVVKAWS
301 AGESRRVLME SETWQAVWNT LRFSAAAVFA AAVLGVVYAA AARRLVWMRG
351 LVFLPFMVSP VCVSAGVLLL YPGWTASLPL LLAMYALLAY PFVAKDVLSA
401 WDALPPDYGR AAAGLGANGF QTACRITFPL LKPALRRGLT LAAATCVGEF
451 AATLFLSRPE WQTLTTLIYA YLGRAGEDNY ARAMVLTLLL SAFAVCIFLL
501 LDNGEGGKRT ETL*
Further work revealed a variant gonococcal DNA sequence <SEQ ID 581>:
1 ATGGATGGAC GGTGTTGGGC GGTACGGGGT GCTTTTTCCC TGCTGCCTTC
51 GGCTTTTTTG GCGGTAATGG TCGTTGCGCC TTTGTGGGCG GTGGCGGCGT
101 ATGACGGTTT GGCGTGGCGC GCGGTGCTGT CGGATGCCTA TATGCTCAAA
151 CGTTTGGCGT GGACGGTGTT TCAGGCGGCG GCAACCTGTG TGCTGGTGCT
201 GCCTTTGGGC GTGCCTGTCG CGTGGGTGCT GGCGCGGCTG GCGTTCCCGG
251 GGCGGGCTTT GGTGCTGCGC CTGCTGATGC TGCCGTTTGT GATGCCCACG
301 CTGGTGGCGG GCGTGGGCGT GCTGGCTCTG TTCGGGGCGG ACGGGCTGTT
351 GTGGCGCGGC CGGCAGGATA CGCCGTATCT GTTGTTGTAC GGCAATGTGT
401 TTTTCAACCT GCCCGTGTTG GTCAGGGCGG CGTATCAGGG GTTTGCTCAA
451 GTGCCTGCGG CACGGCTTCA GACGGCACGG ACGTTGGGCG CGGGGGCGTG
501 GCGGCGGTTT TGGGACATTG AAATGCCCGT TTTGCGCCCG TGGCTTGCCG
551 GCGGCGTGTG CCTTGTCTTC CTGTATTGTT TTTCGGGGTT CGGGCTGGCA
601 TTGCTGTTGG GCGGCAGCCG TTATGCCACG GTCGAAGTGG AAATTTACCA
651 GTTGGTTATG TTCGAACTCG ATATGGCGGG GGCTTCGGCG CTGGTGTGGC
701 TGGTGTTGGG GGTAACGGCG GCGGCAGGGT TGCTGTATGC GTGGTTCGGC
751 AGGCGCGCGG TTTCGGATAA GGCGGTTTCC CCCGTGATGC CGTCGCCGCC
801 GCAATCGGTG GGGGAATATG TATTGCTGGC ATTTTCGGTG GCGGTGTTGT
851 CCGTGTGCTG CCTGTTTCCT TTGTCGGCAA TTGTTGTGAA AGCGTGGTCG
901 GCCGGCGAAT CGCGGCGTGT GTTAATGGAA AGTGAAACGT GGCAGGCAGT
951 GTGGAATACt ttGCGCTTTT CGGCGGCGGC GGTGTTTGCG GCGGCGGTTT
1001 TGGGTGTGGT GTATGCGGCG GCGGCGCGGC GGCTGGTGTG GATGCGCGGA
1051 CTGGTGTTTT TACCGTTTAT GGTGTCGCCG GTTTGTGTTT CGGCGGGCGT
1101 GCTGCTGCTT TATCCGGGGT GGACGGCTTC GTTACCGCTG CTGCTGGCGA
1151 TGTATGCGCT GCTGGCGTAT CCGTTTGTGG CAAAAGATGT TTTATCGGCC
1201 TGGGATGCAC TGCCGCCGGA TTACGGCAGG GCGGCGGCAG GTTTGGGCGC
1251 AAACGGCTTT CAGACGGCAT GCCGTATCAC GTTCCCCCTC TTGAAACCGG
1301 CGTTGCGGCG CGGTCTGACT TTGGCGGCGG CGACGTGTGT GGGCGAATTT
1351 GCGGCAACCT TGTTCCTGTC GCGTCCGGAA TGGCAGACGT TGACGACTTT
1401 GATTTATGCC TATTTGGGGC GTGCGGGTGA GGACAATTAT GCGCGGGCAA
1451 TGGTGTTGAC ATTGCTGTTG TCGGCATTTG CGGTGTGCAT TTTCCTGCTG
1501 TTGGACAACG GCGAAGGCGg aaaACGGACG GAAACGTTAT AA
This corresponds to the amino acid sequence <SEQ ID 582; ORF139ng-1>:
1 MDGRCWAVRG AFSLLPSAFL AVMVVAPLWA VAAYDGLAWR AVLSDAYMLK
51 RLAWTVFQAA ATCVLVLPLG VPVAWVLARL AFPGRALVLR LLMLPFVMPT
101 LVAGVGVLAL FGADGLLWRG RQDTPYLLLY GNVFFNLPVL VRAAYQGFAQ
151 VPAARLQTAR TLGAGAWRRF WDIEMPVLRP WLAGGVCLVF LYCFSGFGLA
201 LLLGGSRYAT VEVEIYQLVM FELDMAGASA LVWLVLGVTA AAGLLYAWFG
251 RRAVSDKAVS PVMPSPPQSV GEYVLLAFSV AVLSVCCLFP LSAIVVKAWS
301 AGESRRVLME SETWQAVWNT LRFSAAAVFA AAVLGVVYAA AARRLVWMRG
351 LVFLPFMVSP VCVSAGVLLL YPGWTASLPL LLAMYALLAY PFVAKDVLSA
401 WDALPPDYGR AAAGLGANGF QTACRITFPL LKPALRRGLT LAAATCVGEF
451 AATLFLSRPE WQTLTTLIYA YLGRAGEDNY ARAMVLTLLL SAFAVCIFLL
501 LDNGEGGKRT ETL*
ORF139ng-1 and ORF139-1 show 95.9% identity in 513 aa overlap:
orf139ng MDGRCWAVRGAFSLLPSAFLAVMVVAPLWAVAAYDGLAWRAVLSDAYMLKRLAWTVFQAA
|||| |:| |||:|||||||||||||||||||||||||||||||||||||||||||||||
orf139-1 MDGRRWVVWGAFALLPSAFLAVMVVAPLWAVAAYDGLAWRAVLSDAYMLKRLAWTVFQAA
orf139ng ATCVLVLPLGVPVAWVLARLAFPGRALVLRLLMLPFVMPTLVAGVGVLALFGADGLLWRG
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf139-1 ATCVLVLPLGVPVAWVLARLAFPGRALVLRLLMLPFVMPTLVAGVGVLALFGADGLLWRG
orf139ng RQDTPYLLLYGNVFFNLPVLVRAAYQGFAQVPAARLQTARTLGAGAWRRFWDIEMPVLRP
||||||||||||||||||||||||||||:|||||||||||||||||||||||||||||||
orf139-1 RQDTPYLLLYGNVFFNLPVLVRAAYQGFVQVPAARLQTARTLGAGAWRRFWDIEMPVLRP
orf139ng WLAGGVCLVFLYCFSGFGLALLLGGSRYATVEVEIYQLVMFELDMAGASALVWLVLGVTA
|||||||||||||||||||||||||||||||||||||||||||||| ||:||||||||||
orf139-1 WLAGGVCLVFLYCFSGFGLALLLGGSRYATVEVEIYQLVMFELDMAVASVLVWLVLGVTA
orf139ng AAGLLYAWFGRRAVSDKAVSPVMPSPPQSVGEYVLLAFSVAVLSVCCLFPLSAIVVKAWS
||||||||||||||||||||||||||||||||||||||::||||||||||| ||||||||
orf139-1 AAGLLYAWFGRRAVSDKAVSPVMPSPPQSVGEYVLLAFAAAVLSVCCLFPLLAIVVKAWS
orf139ng AGESRRVLMESETWQAVWNTLRFSAAAVFAAAVLGVVYAAAARRLVWMRGLVFLPFMVSP
|||| |||||||||||||||||||||||:||||||||||||||| :|||||:||||||||
orf139-1 AGESWRVLMESETWQAVWNTLRFSAAAVYAAAVLGVVYAAAARRSAWMRGLMFLPFMVSP
orf139ng VCVSAGVLLLYPGWTASLPLLLAMYALLAYPFVAKDVLSAWDALPPDYGRAAAGLGANGF
|||||||||||| |||||||||||||||||||||||||||||||||||||||||||||||
orf139-1 VCVSAGVLLLYPQWTASLPLLLAMYALLAYPFVAKDVLSAWDALPPDYGRAAAGLGANGF
orf139ng QTACRITFPLLKPALRRGLTLAAATCVGEFAATLFLSRPEWQTLTTLIYAYLGRAGEDNY
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf139-1 QTACRITFPLLKPALRRGLTLAAATCVGEFAATLFLSRPEWQTLTTLIYAYLGRAGEDNY
orf139ng ARAMVLTLLLSAFAVCIFLLLDNGEGGKRTETL
||||||||||:|||: ||||||:|||||:||||
orf139-1 ARAMVLTLLLAAFALGIFLLLDGGEGGKQTETL
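Based on the presence of a predicted binding-protein-dependent transport systems inner membrane component signature (underlined) in the gonococcal protein, it is predicted that the proteins from N. meningitidis and N. gonorrhoeae, and their epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies.
Example 70
The following partial DNA sequence was identified in N. meningitidis <SEQ ID 583>: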
1 ATGGACGGCT GGACACAGAC GCTGTCCGCG CAAACCCTGT TGGGCATTTC
51 GGCGGCGGCA ATCATCCTCA TTCTGATTTT AATCGTCAGA TTCCGCATCC
101 ACGCGCTGCT GACACTGGTC ATCGTCAGCC TGCTGACGGC TTTGGCAACC
151 GGTTTGCCCA CAGGCAGCAT TGTCAAAGAC ATACTGGTCA AAAACTTCGG
201 CGGCACGCTC GGCGGCGTGG CGCTTCTGGT CGGCCTGGGC GCGATGCTCG
251 AACGTTTGGT C...
This corresponds to the amino acid sequence <SEQ ID 584; ORF140>:
1 MDGWTQTLSA QTLLGISAAA IILILILIVR FRIHALLTLV IVSLLTALAT
51 GLPTGSIVKD ILVKNFGGTL GGVALLVGLG AMLERLV..
Further work revealed the complete nucleotide sequence <SEQ ID 585>:
1 ATGGACGGCT GGACACAGAC GCTGTCCGCG CAAACCCTGT TGGGCATTTC
51 GGCGGCGGCA ATCATCCTCA TTCTGATTTT AATCGTCAAA TTCCGCATCC
101 ACGCGCTGCT GACACTGGTC ATCGTCAGCC TGCTGACGGC TTTGGCAACC
151 GGTTTGCCCA CAGGCAGCAT TGTCAACGAC ATACTGGTCA AAAACTTCGG
201 CGGCACGCTC GGCGGCGTGG CGCTTCTGGT CGGCCTGGGC GCGATGCTCG
251 GACGTTTGGT CGAAACATCC GGCGGCGCAC AGTCGCTGGC GGACGCGCTG
301 ATCCGGATGT TCGGCGAAAA ACGCGCACCG TTCGCGCTGG GCGTTGCCTC
351 GCTGATTTTC GGCTTCCCGA TTTTCTTCGA TGCCGGACTA ATCGTCATGC
401 TGCCCATCGT GTTCGCCACC GCACGGCGCA TGAAACAGGA CGTACTGCCC
451 TTCGCGCTTG CCTCCATCGG CGCATTTTCC GTCATGCACG TCTTCCTGCC
501 GCCCCATCCG GGCCCGATTG CCGCTTCCGA ATTTTACGGC GCGAACATCG
551 GCCAAGTTTT GATTTTGGGT CTGCCGACCG CCTTCATCAC ATGGTATTTC
601 AGCGGCTATA TGCTCGGCAA AGTGTTGGGG CGCACCATCC ATGTTCCCGT
651 TCCCGAACTG CTCAGCGGCG GCACGCAAGA CAACGACCTG CCGAAAGAAC
701 CTGCCAAAGC AGGAACGGTC GTCGCCATCA TGCTGATTCC CATGCTGCTG
751 ATTTTCCTGA ATACCGGCGT ATCGGCCCTC ATCAGCGAAA AACTCGTAAG
801 TGCGGACGAA ACCTGGGTTC AGACGGCAAA AATAATCGGT TCGACACCGA
851 TCGCCCTTCT GATTTCCGTA TTGGTCGCAC TGTTTGTCTT GGGACGCAAA
901 CGCGGCGAAA GCGGCAGCGC GTTGGAAAAA ACCGTGGACG GCGCACTCGC
951 CCCCGTCTGT TCCGTGATTC TGATTACCGG CGCGGGCGGT ATGTTCGGCG
1001 GCGTTTTGCG CGCTTCCGGC ATCGGCAAGG CACTCGCCGA CAGCATGGCG
1051 GATTTGGGCA TTCCCGTCCT TTTGGGCTGT TTCCTTGTCG CCTTGGCACT
1101 GCGTATCGCG CAAGGTTCGG CAACCGTCGC CCTGACCACC GCCGCCGCGC
1151 TGATGGCTCC TGCCGTTGCC GCCGCCGGCT TTACCGACTG GCAGCTCGCC
1201 TGTATCGTAT TGGCAACGGC GGCAGGTTCG GTCGGTTGCA GCCACTTCAA
1251 CGACTCCGGC TTCTGGCTGG TCGGCCGTCT CTTGGACATG GACGTACCGA
1301 CCACGCTGAA AACCTGGACG GTCAACCAAA CCCTCATCGC ACTCATCGGC
1351 TTTGCCTTGT CCGCACTGCT GTTCGCCATC GTCTGA
This corresponds to the amino acid sequence <SEQ ID 586; ORF140-1>:
1 MDGWTQTLSA QTLLGISAAA IILILILIVK FRIHALLTLV IVSLLTALAT
51 GLPTGSIVND ILVKNFGGTL GGVALLVGLG AMLGRLVETS GGAQSLADAL
101 IRMFGEKRAP FALGVASLIF GFPIFFDAGL IVMLPIVFAT ARRMKQDVLP
151 FALASIGAFS VMHVFLPPHP GPIAASEFYG ANIGQVLILG LPTAFITWYF
201 SGYMLGKVLG RTIHVPVPEL LSGGTQDNDL PKEPAKAGTV VAIMLIPMLL
251 IFLNTGVSAL ISEKLVSADE TWVQTAKIIG STPIALLISV LVALFVLGRK
301 RGESGSALEK TVDGALAPVC SVILITGAGG MFGGVLRASG IGKALADSMA
351 DLGIPVLLGC FLVALALRIA QGSATVALTT AAALMAPAVA AAGFTDWQLA
401 CIVLATAAGS VGCSHFNDSG FWLVGRLLDM DVPTTLKTWT VNQTLIALIG
451 FALSALLFAI V*
Computer analysis of this amino acid sequence gave the following results:
Homology with a predicted ORF from N. meningitidis (strain A)
ORF140 shows 95.4% identity over an 87 aa overlap with an ORF (ORF140a) from strain A of N. meningitidis:
10 20 30 40 50 60
orf140.pep MDGWTQTLSAQTLLGISAAAIILILILIVRFRIHALLTLVIVSLLTALATGLPTGSIVKD
|||||||||||||||||||||||||||||:||||||||||||||||||||||||||||:|
orf140a MDGWTQTLSAQTLLGISAAAIILILILIVKFRIHALLTLVIVSLLTALATGLPTGSIVND
10 20 30 40 50 60
70 80
orf140.pep ILVKNFGGTLGGVALLVGLGAMLERLV
:|||||||||||||||||||||| |||
orf140a VLVKNFGGTLGGVALLVGLGAMLGRLVETSGGAQSLADALIRMFGEKRAPFALGVASLIF
70 80 90 100 110 120
The complete length ORF140a nucleotide sequence <SEQ ID 587> is:
1 ATGGACGGCT GGACACAGAC GCTGTCCGCG CAAACCCTGT TGGGCATTTC
51 GGCGGCGGCA ATCATCCTCA TTCTGATTTT AATCGTCAAA TTCCGCATCC
101 ACGCGCTGCT GACACTGGTC ATCGTCAGCC TGCTGACGGC TTTGGCAACC
151 GGTTTGCCCA CAGGCAGCAT TGTCAACGAC GTACTGGTCA AAAACTTCGG
201 CGGCACGCTC GGCGGCGTGG CGCTTCTGGT CGGCCTGGGC GCGATGCTCG
251 GACGTTTGGT CGAAACATCC GGCGGCGCAC AGTCGCTGGC GGACGCGCTG
301 ATCCGGATGT TCGGCGAAAA ACGCGCACCG TTCGCGCTGG GCGTTGCCTC
351 GCTGATTTTC GGCTTCCCGA TTTTCTTCGA TGCCGGACTA ATCGTCATGC
401 TGCCCATCGT GTTCGCCACC GCACGGCGCA TGAAACAGGA CGTACTGCCC
451 TTCGCGCTTG CCTCCATCGG CGCATTTTCC GTCATGCACG TCTTCCTGCC
501 GCCCCATCCG GGCCCGATTG CCGCTTCCGA ATTTTACGGC GCGAACATCG
551 GCCAAGTTTT GATTTTGGGT CTGCCGACCG CCTTCATCAC ATGGTATTTC
601 AGCGGCTATA TGCTCGGCAA AGTGTTGGGG CGCACCATCC ATGTTCCCGT
651 TCCCGAACTG CTCAGCGGCG GCACGCAAGA CAACGACCTG CCGAAAGAAC
701 CTGCCAAAGC AGGAACGGTC GTCGCCATCA TGCTGATTCC CATGCTGCTG
751 ATTTTCCTGA ATACCGGCGT ATCGGCCCTC ATCAGCGAAA AACTCGTAAG
801 TGCGGACGAA ACCTGGGTTC AGACGGCAAA AATAATCGGT TCGACACCGA
851 TCGCCCTTCT GATTTCCGTA TTGGTCGCAC TGTTTGTCTT GGGACGCAAA
901 CGCGGCGAAA GCGGCAGCGC GTTGGAAAAA ACCGTGGACG GCGCACTCGC
951 CCCCGTCTGT TCCGTGATTC TGATTACCGG CGCGGGCGGT ATGTTCGGCG
1001 GCGTTTTGCG CGCTTCCGGC ATCGGCAAGG CACTCGCCGA CAGCATGGCG
1051 GATTTGGGCA TTCCCGTCCT TTTGGGCTGT TTCCTTGTCG CCTTGGCACT
1101 GCGTATCGCG CAAGGTTCGG CAACCGTCGC CCTGACCACC GCCGCCGCGC
1151 TGATGGCTCC TGCCGTTGCC GCCGCCGGCT TTACCGACTG GCAGCTCGCC
1201 TGTATCGTAT TGGCAACGGC GGCAGGTTCG GTCGGTTGCA GCCACTTCAA
1251 CGACTCCGGC TTCTGGCTGG TCGGCCGCCT CTTGGACATG GACGTACCGA
1301 CCACGCTGAA AACCTGGACG GTCAACCAAA CCCTCATCGC ACTCATCGGC
1351 TTTGCCTTGT CCGCACTGCT GTTCGCCATC GTCTGA
This encodes a protein having amino acid sequence <SEQ ID 588>:
1 MDGWTQTLSA QTLLGISAAA IILILILIVK FRIHALLTLV IVSLLTALAT
51 GLPTGSIVND VLVKNFGGTL GGVALLVGLG AMLGRLVETS GGAQSLADAL
101 IRMFGEKRAP FALGVASLIF GFPIFFDAGL IVMLPIVFAT ARRMKQDVLP
151 FALASIGAFS VMHVFLPPHP GPIAASEFYG ANIGQVLILG LPTAFITWYF
201 SGYMLGKVLG RTIHVPVPEL LSGGTQDNDL PKEPAKAGTV VAIMLIPMLL
251 IFLNTGVSAL ISEKLVSADE TWVQTAKIIG STPIALLISV LVALFVLGRK
301 RGESGSALEK TVDGALAPVC SVILITGAGG MFGGVLRASG IGKALADSMA
351 DLGIPVLLGC FLVALALRIA QGSATVALTT AAALMAPAVA AAGFTDWQLA
401 CIVLATAAGS VGCSHFNDSG FWLVGRLLDM DVPTTLKTWT VNQTLIALIG
451 FALSALLFAI V*
ORF140a and ORF140-1 show 99.8% identity in 461 aa overlap:
orf140-1.pep MDGWTQTLSAQTLLGISAAAIILILILIVKFRIHALLTLVIVSLLTALATGLPTGSIVND 60
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf140a MDGWTQTLSAQTLLGISAAAIILILILIVKFRIHALLTLVIVSLLTALATGLPTGSIVND 60
orf140-1.pep ILVKNFGGTLGGVALLVGLGAMLGRLVETSGGAQSLADALIRMFGEKRAPFALGVASLIF 120
:|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf140a VLVKNFGGTLGGVALLVGLGAMLGRLVETSGGAQSLADALIRMFGEKRAPFALGVASLIF 120
orf140-1.pep GFPIFFDAGLIVMLPIVFATARRMKQDVLPFALASIGAFSVMHVFLPPHPGPIAASEFYG 180
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf140a GFPIFFDAGLIVMLPIVFATARRMKQDVLPFALASIGAFSVMHVFLPPHPGPIAASEFYG 180
orf140-1.pep ANIGQVLILGLPTAFITWYFSGYMLGKVLGRTIHVPVPELLSGGTQDNDLPKEPAKAGTV 240
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf140a ANIGQVLILGLPTAFITWYFSGYMLGKVLGRTIHVPVPELLSGGTQDNDLPKEPAKAGTV 240
orf140-1.pep VAIMLIPMLLIFLNTGVSALISEKLVSADETWVQTAKIIGSTPIALLISVLVALFVLGRK 300
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf140a VAIMLIPMLLIFLNTGVSALISEKLVSADETWVQTAKIIGSTPIALLISVLVALFVLGRK 300
orf140-1.pep RGESGSALEKTVDGALAPVCSVILITGAGGMFGGVLRASGIGKALADSMADLGIPVLLGC 360
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf140a RGESGSALEKTVDGALAPVCSVILITGAGGMFGGVLRASGIGKALADSMADLGIPVLLGC 360
orf140-1.pep FLVALALRIAQGSATVALTTAAALMAPAVAAAGFTDWQLACIVLATAAGSVGCSHFNDSG 420
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf140a FLVALALRIAQGSATVALTTAAALMAPAVAAAGFTDWQLACIVLATAAGSVGCSHFNDSG 420
orf140-1.pep FWLVGRLLDMDVPTTLKTWTVNQTLIALIGFALSALLFAIV 461
|||||||||||||||||||||||||||||||||||||||||
orf140a FWLVGRLLDMDVPTTLKTWTVNQTLIALIGFALSALLFAIV 461
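The identity percentages quoted for these pairwise alignments can be reproduced by counting matching columns. The following is a minimal sketch, assuming that identity is the number of identical columns divided by the total number of aligned columns; the function name `percent_identity` is hypothetical and is not taken from the alignment software used for the original analysis.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of alignment columns that hold the same residue in both rows."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    # A column counts as identical only when both rows carry the same residue;
    # gap characters ("-") never count as matches.
    matches = sum(1 for a, b in zip(aligned_a, aligned_b) if a == b and a != "-")
    return 100.0 * matches / len(aligned_a)


# Residues 56-65 of orf140-1.pep and orf140a (a short excerpt chosen for brevity).
print(percent_identity("SIVNDILVKN", "SIVNDVLVKN"))  # 90.0

# A single mismatch in a gap-free 461-column overlap.
print(round(100.0 * 460 / 461, 1))  # 99.8
```

Under this assumption, the single I/V difference at position 61 of the ORF140a / ORF140-1 alignment is consistent with the stated 99.8% identity over 461 residues.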
Homology with a predicted ORF from N. gonorrhoeae
ORF140 shows 92% identity over an 87 aa overlap with a predicted ORF (ORF140ng) from N. gonorrhoeae:
orf140.pep MDGWTQTLSAQTLLGISAAAIILILILIVRFRIHALLTLVIVSLLTALATGLPTGSIVKD 60
||| |||||||||||||||||||||||||:|||:|||||||:||||||||||||||||:|
orf140ng MDGRTQTLSAQTLLGISAAAIILILILIVKFRIRALLTLVIASLLTALATGLPTGSIVND 60
orf140.pep ILVKNFGGTLGGVALLVGLGAMLERLV 87
:|||||||||||||||||||||| |||
orf140ng VLVKNFGGTLGGVALLVGLGAMLGRLVETSGGAQSLADALIRMFGEKRAPFAPGVASLIF 120
The predicted full-length ORF140ng nucleotide sequence <SEQ ID 589> encodes a protein having the amino acid sequence <SEQ ID 590>:
101 IRMFGEKRAP FAPGVASLIF GFPIFFDAGL IVMLPIVFAT ARRMKQDVLP
151 FALASVGAFS VMHVFLPPHP GPIAASEFYG ANIGQVLILG LPTAFITWYF
201 SGYMLGKVLG RAIHVPVPEL LSGGTQDSDP PKEPAKAGTV VAVMLIPMLL
251 IFLNTGVSAL ISEKLVSADE TWVQTAKMIG STPVALLISV LAALLVLGRK
301 RGESGSTLEK TVDGALAPAC SVILITGAGG MFGGVLRASG IGKALADSMA
351 DLGIPVLLGC FLVALALRIA QGSATVALTT AAALMAPAVA AAGFTDWQLA
401 CIVLATAAGS VGCSHFNDSG FWLVGRLSDM DVPTTLKTWT VNQTLIAFIG
451 FALSALLFAI V*
Further work revealed a variant gonococcal DNA sequence <SEQ ID 591>:
1 ATGGACGGCC GGACACAGAC GCTGTCCGCG CAAACCTTGT TGGGCATTTC
51 GGCGGCGGCA ATCATCCTCA TTCTGATTTT AATCGTCAAA TTCCGCATCC
101 GCGCGCTGCT GACACTGGTC ATCGCCAGCC TGCTGACGGC TTTGGCAACC
151 GGTTTGCCCA CAGGCAGCAT CGTCAACGAC GTACTGGTCA AAAACTTCGG
201 CGGCACGCTC GGCGGCGTGG CGCTTCTGGT CGGTCTGGGC GCAATGCTCG
251 GACGTTTGGT AGAAACATCC GGCGGCGCAC AGTCGCTGGC GGACGCGCTG
301 ATCCGGATGT TCGGCGAAAA ACGCGCACCG TTCGCTCCGG GCGTTGCCTC
351 GCTGATTTTC GGCTTCCCGA TTTTCTTCGA TGCCGGACTA ATCGTCATGC
401 TGCCCATCGT ATTCGCCACC GCACGGCGCA TGAAACAGGA CGTACTGCCC
451 TTCGCGCTTG CCTCCGTCGG CGCATTTTCC GTCATGCACG TCTTCCTGCC
501 GCCCCATCCG GGCCCGATTG CCGCTTCCGA ATTTTACGGC GCGAACATCG
551 GCCAGGTTTT GATTTTGGGT CTGCCGACCG CCTTCATCAC ATGGTATTTC
601 AGCGGCTATA TGCTCGGCAA AGTGTTGGGG CGCGCCATCC ATGTTCCCGT
651 TCCCGAACTG CTCAGCGGCG GCACGCAAGA CAGCGACCCG CCGAAAGAAC
701 CTGCCAAAGC AGGAACGGTC GTCGCCGTCA TGCTGATTCC CATGCTGCTG
751 ATTTTCCTGA ATACCGGCGT ATCAGCCCTC ATCAGCGAAA AACTCGTAAG
801 TGCGGACGAA ACTTGGGTTC AGACGGCAAA AATGATCGGT TCGACACCTG
851 TCGCCCTTCT GATTTCCGTA TTGGCCGCAC TGTTGGTCTT GGGACGCAAA
901 CGCGGCGAAA GCGGCAGCAC GTTGGAAAAA ACCGTGGACG GCGCACTCGC
951 CCCCGCCTGT TCCGTGATTC TGATTACCGG CGCGGGCGGT ATGTTCGGCG
1001 GCGTTTTGCG CGCTTCCGGC ATCGGCAAGG CACTCGCCGA CAGCATGGCG
1051 GATTTGGGCA TTCCCGTCCT TTTGGGCTGC TTCCTTGTCG CCTTGGCACT
1101 GCGTATCGCG CAAGGTTCGG CAACCGTCGC CCTGACCACA GCCGCCGCGC
1151 TGATGGCTCC TGCCGTTGCC GCCGCCGGCT TTACCGACTG GCAGCTCGCC
1201 TGTATCGTAT TGGCAACGGC GGCAGGTTCG GTCGGTTGCA GCCACTTCAA
1251 CGACTCCGGC TTCTGGCTGG TCGGCCGCCT CTTGGATATG GACGTACCGA
1301 CCACGCTGAA AACCTGGACG GTCAACCAAA CCCTCATCGC ATTCATCGGC
1351 TTTGCCTTGT CCGCACTGCT GTTTGCCATC GTCTGA
This corresponds to the amino acid sequence <SEQ ID 592; ORF140ng-1>:
1 MDGRTQTLSA QTLLGISAAA IILILILIVK FRIRALLTLV IASLLTALAT
51 GLPTGSIVND VLVKNFGGTL GGVALLVGLG AMLGRLVETS GGAQSLADAL
101 IRMFGEKRAP FAPGVASLIF GFPIFFDAGL IVMLPIVFAT ARRMKQDVLP
151 FALASVGAFS VMHVFLPPHP GPIAASEFYG ANIGQVLILG LPTAFITWYF
201 SGYMLGKVLG RAIHVPVPEL LSGGTQDSDP PKEPAKAGTV VAVMLIPMLL
251 IFLNTGVSAL ISEKLVSADE TWVQTAKMIG STPVALLISV LAALLVLGRK
301 RGESGSTLEK TVDGALAPAC SVILITGAGG MFGGVLRASG IGKALADSMA
351 DLGIPVLLGC FLVALALRIA QGSATVALTT AAALMAPAVA AAGFTDWQLA
401 CIVLATAAGS VGCSHFNDSG FWLVGRLLDM DVPTTLKTWT VNQTLIAFIG
451 FALSALLFAI V*
ORF140ng-1 and ORF140-1 show 96.3% identity in a 461 aa overlap:
orf140ng-1.pep MDGRTQTLSAQTLLGISAAAIILILILIVKFRIRALLTLVIASLLTALATGLPTGSIVND
||| |||||||||||||||||||||||||||||:|||||||:||||||||||||||||||
orf140-1 MDGWTQTLSAQTLLGISAAAIILILILIVKFRIHALLTLVIVSLLTALATGLPTGSIVND
orf140ng-1.pep VLVKNFGGTLGGVALLVGLGAMLGRLVETSGGAQSLADALIRMFGEKRAPFAPGVASLIF
:||||||||||||||||||||||||||||||||||||||||||||||||||| |||||||
orf140-1 ILVKNFGGTLGGVALLVGLGAMLGRLVETSGGAQSLADALIRMFGEKRAPFALGVASLIF
orf140ng-1.pep GFPIFFDAGLIVMLPIVFATARRMKQDVLPFALASVGAFSVMHVFLPPHPGPIAASEFYG
|||||||||||||||||||||||||||||||||||:||||||||||||||||||||||||
orf140-1 GFPIFFDAGLIVMLPIVFATARRMKQDVLPFALASIGAFSVMHVFLPPHPGPIAASEFYG
orf140ng-1.pep ANIGQVLILGLPTAFITWYFSGYMLGKVLGRAIHVPVPELLSGGTQDSDPPKEPAKAGTV
|||||||||||||||||||||||||||||||:|||||||||||||||:| ||||||||||
orf140-1 ANIGQVLILGLPTAFITWYFSGYMLGKVLGRTIHVPVPELLSGGTQDNDLPKEPAKAGTV
orf140ng-1.pep VAVMLIPMLLIFLNTGVSALISEKLVSADETWVQTAKMIGSTPVALLISVLAALLVLGRK
||:||||||||||||||||||||||||||||||||||:|||||:|||||||:||:|||||
orf140-1 VAIMLIPMLLIFLNTGVSALISEKLVSADETWVQTAKIIGSTPIALLISVLVALFVLGRK
orf140ng-1.pep RGESGSTLEKTVDGALAPACSVILITGAGGMFGGVLRASGIGKALADSMADLGIPVLLGC
||||||:|||||||||||:|||||||||||||||||||||||||||||||||||||||||
orf140-1 RGESGSALEKTVDGALAPVCSVILITGAGGMFGGVLRASGIGKALADSMADLGIPVLLGC
orf140ng-1.pep FLVALALRIAQGSATVALTTAAALMAPAVAAAGFTDWQLACIVLATAAGSVGCSHFNDSG
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf140-1 FLVALALRIAQGSATVALTTAAALMAPAVAAAGFTDWQLACIVLATAAGSVGCSHFNDSG
orf140ng-1.pep FWLVGRLLDMDVPTTLKTWTVNQTLIAFIGFALSALLFAIV
|||||||||||||||||||||||||||:|||||||||||||
orf140-1 FWLVGRLLDMDVPTTLKTWTVNQTLIALIGFALSALLFAIV
In addition, ORF140ng-1 is homologous to an E. coli protein:
gi|882633 (U29579) ORF_o454 [Escherichia coli] >gi|1789097 (AE000358) o454;
This 454 aa ORF is 34 pct identical (9 gaps) to 444 residues of an approx. 456 aa protein GNTP_BACLI SW: P46832 [Escherichia coli] Length = 454
Score = 210 bits (529), Expect = 1e-53
Identities = 130/384 (33%), Positives = 194/384 (49%), Gaps = 19/384 (4%)
Query: 88 ETSGGAQSLADALIRMFGEKRAPFAPGVASLIFGFPIFFDAGLIVMLPIVFATARRMKQD 147
E SGGA+SLA+ R G+KR A +A+ G P+FFD G I++ PI++ A+ K
Sbjct: 80 EHSGGAESLANYFSRKLGDKRTIAALTLAAFFLGIPVFFDVGFIILAPIIYGFAKVAKIS 139
Query: 148 VLPFALASVGAFSVMHVFLPPHPGPIAASEFYGANIGQVLILGLPTAFITWYFSGYMLGK 207
L F L G +HV +PPHPGP+AA+ A+IG + I+G+ + I GY K
Sbjct: 140 PLKFGLPVAGIMLTVHVAVPPHPGPVAAAGLLHADIGWLTIIGIAIS-IPVGVVGYFAAK 198
Query: 208 VLGRAIHVPVPELL----------SGGTQDSDPPKEPAKAGTVVAVMLIPMLLIFLNTGV 257
++ + + E+L G T+ SD P A V ++++IP+ +I T
Sbjct: 199 IINKRQYAMSVEVLEQMQLAPASEEGATKLSDKINPPGVA-LVTSLIVIPIAIIMAGT-- 255
Query: 258 SALISEKLVSADETWVQTAKMIGSTPXXXXXXXXXXXXXXGRKRGESGSTLEKTVDGALA 317
+S L+ + T ++IGS +RG S + AL
Sbjct: 256 ---VSATLMPPSHPLLGTLQLIGSPMVALMIALVLAFWLLALRRGWSLQHTSDIMGSALP 312
Query: 318 PACSVILITGAGGMFGGVLRASGIGKALADSMADLGIPVLLGCFLVALALRIAQGSXXXX 377
A VIL+TGAGG+FG VL SG+GKALA+ + + +P+L F+++LALR +QGS
Sbjct: 313 TAAVVILVTGAGGVFGKVLVESGVGKALANMLQMIDLPLLPAAFIISLALRASQGS--AT 370
Query: 378 XXXXXXXXXXXXXXXGFTDWQLACIVLATAAGSVGCSHFNDSGFWLVGRLLDMDVPTTLK 437
G Q + LA G +G SH NDSGFW+V + L + V LK
Sbjct: 371 VAILTTGGLLSEAVMGLNPIQCVLVTLAACFGGLGASHINDSGFWIVTKYLGLSVADGLK 430
Query: 438 TWTVNQTLIAFIGFALSALLFAIV 461
TWTV T++ F GF ++ ++A++
Sbjct: 431 TWTVLTTILGFTGFLITWCVWAVI 454
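Based on this analysis, including the identification of a putative leader sequence (double-underlined) and several putative transmembrane domains (single-underlined) in the gonococcal protein, it is predicted that these proteins from N. meningitidis and N. gonorrhoeae, and their epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies.
Example 71
The following partial DNA sequence was identified in N. meningitidis <SEQ ID 593>: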
1 ..GATTTCGGCA TATCGCCCGT GTATCTTTGG GTTGCCGCCG CGTTCAAACA
51 TTTGCTGTCG CCGTGGGCTG CCGACTCATA CGATGTCGCA CGCTTTGCAG
101 GCGTATTTTT TGCCGTTATC GGACTGACTT CCTGCGGCTT TGCCGGTTTC
151 AACTTTTTGG GCAGACACCA CGGGCGCAC. GTCGTCCTGA TTCTCATCGG
201 CTGTATCGGG CTGATTCCAG TTGCCCATTT CCTCAACCCC GCTGCCGCCG
251 CCTTTGCCGC CGCCGGACTG GTGCTGCACG GTTATTCTTT GGCTCGCCGG
301 CGCGTGATTG CCGCCTCTTT TCTGCTCGGT ACGGGCTGGA CGCTGATGTC
351 GTTGGCAGCA GCTTATCCGG CAGCATTTGC CCTGATGCTG CCCTTGCCCG
401 TACTGATGTT TTTCCGTCCG ..
This corresponds to the amino acid sequence <SEQ ID 594; ORF141>:
1 ..DFGISPVYLW VAAAFKHLLS PWAADSYDVA RFAGVFFAVI GLTSCGFAGF
51 NFLGRHHGRX VVLILIGCIG LIPVAHFLNP AAAAFAAAGL VLHGYSLARR
101 RVIAASFLLG TGWTLMSLAA AYPAAFALML PLPVLMFFRP ..
Further work revealed the complete nucleotide sequence <SEQ ID 595>:
1 ATGCTGACCT ATACCCCGCC CGATGCCCGC CCGCCCGCCA AAACCCACGA
51 AAAGCCGTGG CTGCTGCTGT TGATGGCGTT TGCCTGGTTG TGGCCCGGCG
101 TGTTTTCCCA CGATTTGTGG AATCCTGACG AACCTGCCGT CTATACCGCC
151 GTCGAAGCAC TGGCAGGCAG CCCCACCCCC TTGGTTGCCC ATCTGTTCGG
201 TCAAACCGAT TTCGGCATAC CGCCCGTGTA TCTTTGGGTT GCCGCCGCGT
251 TCAAACATTT GCTGTCGCCG TGGGCTGCCG ACTCATACGA TGCCGCACGC
301 TTTGCAGGCG TATTTTTTGC CGTTATCGGA CTGACTTCCT GCGGCTTTGC
351 CGGTTTCAAC TTTTTGGGCA GACACCACGG GCGCAgCGTC GTCCTGATTC
401 TCATCGGCTG TATCGGGCTG ATTCCAGTTG CCCATTTCCT CAACCCCGCT
451 GCCGCCGCCT TTGCCGCCGC CGGACTGGTG CTGCACGGTT ATTCTTTGGC
501 TCGCCGGCGC GTGATTGCCG CCTCTTTTCT GCTCGGTACG GGCTGGACGC
551 TGATGTCGTT GGCAGCAGCT TATCCGGCAG CATTTGCCCT GATGCTGCCC
601 TTGCCCGTAC TGATGTTTTT CCGTCCGTGG CAAAGCAGGC GTTTGATGTT
651 GACGGCAGTC GCCTCACTTG CCTTTGCCCT GCCGCTTATG ACCGTTTACC
701 CGCTGCTCTT GGCAAAAACG CAGCCCGCGC TGTTCGCGCA ATGGCTCGAC
751 TATCACGTTT TCGGTACGTT CGGCGGCGTG CGGCACGTTC AGACGGCATT
801 CAGTTTGTTT TACTATCTGA AAAACCTGCT TTGGTTTGCA TTGCCCGCGC
851 TGCCGCTGGC GGTTTGGACG GTTTGCCGCA CGCGCCTGTT TTCGACCGAC
901 TGGGGGATTT TGGGCGTCGT CTGGATGCTT GCCGTTTTGG TGCTGCTTGC
951 CGTCAATCCG CAGCGTTTTC AGGATAACCT CGTCTGGCTG CTTCCGCCGC
1001 TTGCCCTGTT CGGCGCGGCG CAACTGGACA GCCTGAGGCG CGGCGCGGCG
1051 GCGTTTGTCA ACTGGTTCGG CATTATGGCG TTCGGACTGT TTGCCGTGTT
1101 CCTGTGGACG GGCTTTTTCG CCATGAATTA CGGCTGGCCC GCCAAGCTTG
1151 CCGAACGCGC CGCCTATTTC AGCCCGTATT ATGTTCCTGA TATCGATCCC
1201 ATTCCGATGG CGGTTGCCGT ACTGTTCACA CCCTTGTGGC TGTGGGCGAT
1251 TACCCGGAAA AACATACGCG GCAGGCAGGC GGTTACCAAC TGGGCGGCAG
1301 GCGTTACCCT GACCTGGGCT TTGCTGATGA CGCTGTTCCT GCCGTGGCTG
1351 GACGCGGCGA AAAGCCACGC GCCGGTCGTC CGGAGTATGG AGGCATCGCT
1401 TTCCCCGGAA TTGAAACGGG AGCTTTCAGA CGGCATCGAG TGTATCGGCA
1451 TAGGCGGCGG CGACCTGCAC ACGCGGATTG TTTGGACGCA GTACGGCACA
1501 TTGCCGCACC GCGTCGGCGA TGTACAATGC CGCTACCGCA TCGTCCTCCT
1551 GCCCCAAAAT GCGGATGCGC CGCAAGGCTG GCAGACGGTT TGGCAGGGTG
1601 CGCGTCCGCG CAACAAAGAC AGTAAGTTCG CACTGATACG GAAAATCGGG
1651 GAAAATATAT AA
This corresponds to the amino acid sequence <SEQ ID 596; ORF141-1>:
1 MLTYTPPDAR PPAKTHEKPW LLLLMAFAWL WPGVFSHDLW NPDEPAVYTA
51 VEALAGSPTP LVAHLFGQTD FGIPPVYLWV AAAFKHLLSP WAADSYDAAR
101 FAGVFFAVIG LTSCGFAGFN FLGRHHGRSV VLILIGCIGL IPVAHFLNPA
151 AAAFAAAGLV LHGYSLARRR VIAASFLLGT GWTLMSLAAA YPAAFALMLP
201 LPVLMFFRPW QSRRLMLTAV ASLAFALPLM TVYPLLLAKT QPALFAQWLD
251 YHVFGTFGGV RHVQTAFSLF YYLKNLLWFA LPALPLAVWT VCRTRLFSTD
301 WGILGVVWML AVLVLLAVNP QRFQDNLVWL LPPLALFGAA QLDSLRRGAA
351 AFVNWFGIMA FGLFAVFLWT GFFAMNYGWP AKLAERAAYF SPYYVPDIDP
401 IPMAVAVLFT PLWLWAITRK NIRGRQAVTN WAAGVTLTWA LLMTLFLPWL
451 DAAKSHAPVV RSMEASLSPE LKRELSDGIE CIGIGGGDLH TRIVWTQYGT
501 LPHRVGDVQC RYRIVLLPQN ADAPQGWQTV WQGARPRNKD SKFALIRKIG
551 ENI*
Computer analysis of this amino acid sequence gave the following results:
Homology with a predicted ORF from N. meningitidis (strain A)
ORF141 shows 95.0% identity over a 140 aa overlap with an ORF (ORF141a) from strain A of N. meningitidis:
10 20 30
orf141.pep DFGISPVYLWVAAAFKHLLSPWAADSYDVA
|||| |||||||||||||||||||| ||:|
orf141a WNPDEPAVYTAVEALAGSPTPLVAHLFGQIDFGIPPVYLWVAAAFKHLLSPWAADPYDAA
40 50 60 70 80 90
40 50 60 70 80 90
orf141.pep RFAGVFFAVIGLTSCGFAGFNFLGRHHGRXVVLILIGCIGLIPVAHFLNPAAAAFAAAGL
|||||||||:||||||||||||||||||| |||||||||||||::|||||||||||||||
orf141a RFAGVFFAVVGLTSCGFAGFNFLGRHHGRSVVLILIGCIGLIPTVHFLNPAAAAFAAAGL
100 110 120 130 140 150
100 110 120 130 140
orf141.pep VLHGYSLARRRVIAASFLLGTGWTLMSLAAAYPAAFALMLPLPVLMFFRP
||||||||||||||||||||||||||||||||||||||||||||||||||
orf141a VLHGYSLARRRVIAASFLLGTGWTLMSLAAAYPAAFALMLPLPVLMFFRPWQSRRLMLTA
160 170 180 190 200 210
orf141a VASLAFALPLMTVYPLLLAKTQPALFAQWLDDHVFGTFGGVRHIQTAFSLFYYLKNLLWF
220 230 240 250 260 270
The full-length ORF141a nucleotide sequence <SEQ ID 597> is:
1 ATGCTGACCT ATACCCCGCC CGATGCCCGC CCGCCCGCCA AAACCCACGA
51 AAAGCCGTGG CTGTTGCTGT TGATGGCGTT TGCCTGGTTG TGGCCCGGCG
101 TGTTTTCCCA CGATTTGTGG AATCCTGACG AACCTGCCGT CTATACCGCC
151 GTCGAAGCAC TGGCAGGCAG CCCCACCCCT TTGGTTGCCC ATCTGTTCGG
201 TCAAATCGAT TTCGGCATAC CGCCCGTGTA TCTTTGGGTT GCCGCCGCGT
251 TCAAACATTT GCTGTCGCCG TGGGCTGCCG ACCCGTATGA TGCCGCACGC
301 TTTGCCGGCG TGTTTTTCGC CGTTGTCGGA CTGACTTCCT GCGGCTTTGC
351 CGGTTTCAAC TTTTTGGGCA GACACCACGG GCGCAGCGTC GTCCTGATTC
401 TCATCGGCTG TATCGGGCTG ATTCCGACCG TACACTTTCT CAACCCCGCT
451 GCCGCCGCCT TTGCCGCCGC CGGACTGGTG CTGCACGGTT ATTCTTTGGC
501 TCGCCGGCGC GTGATTGCCG CCTCTTTTCT GCTCGGTACG GGTTGGACGC
551 TGATGTCGTT GGCAGCAGCT TATCCGGCGG CATTTGCCCT GATGCTGCCC
601 CTGCCCGTGC TGATGTTTTT CCGTCCGTGG CAAAGCAGGC GTTTGATGTT
651 GACGGCAGTC GCCTCGCTTG CCTTTGCCCT GCCGCTTATG ACCGTTTACC
701 CGCTGCTCTT GGCAAAAACG CAGCCCGCGC TGTTCGCGCA ATGGCTCGAC
751 GATCACGTTT TCGGTACGTT CGGCGGCGTG CGGCACATTC AGACGGCATT
801 CAGTTTGTTT TACTATCTGA AAAACCTGCT TTGGTTTGCA TTGCCTGCGC
851 TGCCGCTGGC GGTTTGGACG GTTTGCCGCA CGCGCCTGTT TTCGACCGAC
901 TGGGGGATTT TGGGCGTCGT CTGGATGCTT GCCGTTTTGG TGCTGCTTGC
951 CGTCAATCCG CAGCGTTTTC AGGATAACCT CGTCTGGCTG CTTCCGCCGC
1001 TTGCCCTGTT CGGCGCGGCG CAACTGGACA GCCTGAGACG CGGCGCGGCG
1051 GCGTTTGTCA ACTGGTTCGG CATTATGGCG TTCGGACTGT TTGCCGTGTT
1101 CCTGTGGACG GGCTTTTTCG CCATGAATTA CGGCTGGCCC GCCAAGCTTG
1151 CCGAACGCGC CGCCTATTTC AGCCCGTATT ATGTTCCTGA TATCGATCCC
1201 ATTCCGATGG CGGTTGCCGT ACTGTTCACA CCCTTGTGGC TGTGGGCGAT
1251 TACCCGCAAA AACATACGCG GCAGGCAGGC GGTTACCAAC TGGGCGGCAG
1301 GCGTTACCCT GACCTGGGCT TTGCTGATGA CGCTGTTCCT GCCGTGGCTG
1351 GACGCGGCGA AAAGCCACGC GCCCGTCGTC CGGAGTATGG AGGCATCGCT
1401 TTCCCCGGAA TTAAAACGGG AGCTTTCAGA CGGCATCGAG TGTATCGACA
1451 TAGGCGGCGG CGACCTACAC ACGCGGATTG TTTGGACGCA GTACGGCACA
1501 TTGCCGCACC GCGTCGGCGA TGTACAATGC CGCTACCGCA TCGTCCGCTT
1551 GCCCCAAAAC GCGGATGCGC CGCAAGGCTG GCAGACGGTC TGGCAGGGTG
1601 CGCGCCCGCG CAACAAAGAC AGTAAGTTCG CACTGATACG GAAAACCGGG
1651 GAAAATATAT TAAAAACAAC AGATTGA
It encodes a protein having the amino acid sequence <SEQ ID 598>:
1 MLTYTPPDAR PPAKTHEKPW LLLLMAFAWL WPGVFSHDLW NPDEPAVYTA
51 VEALAGSPTP LVAHLFGQID FGIPPVYLWV AAAFKHLLSP WAADPYDAAR
101 FAGVFFAVVG LTSCGFAGFN FLGRHHGRSV VLILIGCIGL IPTVHFLNPA
151 AAAFAAAGLV LHGYSLARRR VIAASFLLGT GWTLMSLAAA YPAAFALMLP
201 LPVLMFFRPW QSRRLMLTAV ASLAFALPLM TVYPLLLAKT QPALFAQWLD
251 DHVFGTFGGV RHIQTAFSLF YYLKNLLWFA LPALPLAVWT VCRTRLFSTD
301 WGILGVVWML AVLVLLAVNP QRFQDNLVWL LPPLALFGAA QLDSLRRGAA
351 AFVNWFGIMA FGLFAVFLWT GFFAMNYGWP AKLAERAAYF SPYYVPDIDP
401 IPMAVAVLFT PLWLWAITRK NIRGRQAVTN WAAGVTLTWA LLMTLFLPWL
451 DAAKSHAPVV RSMEASLSPE LKRELSDGIE CIDIGGGDLH TRIVWTQYGT
501 LPHRVGDVQC RYRIVRLPQN ADAPQGWQTV WQGARPRNKD SKFALIRKTG
551 ENILKTTD*
ORF141a and ORF141-1 show 98.2% identity in a 553 aa overlap:
orf141a.pep MLTYTPPDARPPAKTHEKPWLLLLMAFAWLWPGVFSHDLWNPDEPAVYTAVEALAGSPTP
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf141-1 MLTYTPPDARPPAKTHEKPWLLLLMAFAWLWPGVFSHDLWNPDEPAVYTAVEALAGSPTP
orf141a.pep LVAHLFGQIDFGIPPVYLWVAAAFKHLLSPWAADPYDAARFAGVFFAVVGLTSCGFAGFN
|||||||| ||||||||||||||||||||||||| |||||||||||||:|||||||||||
orf141-1 LVAHLFGQTDFGIPPVYLWVAAAFKHLLSPWAADSYDAARFAGVFFAVIGLTSCGFAGFN
orf141a.pep FLGRHHGRSVVLILIGCIGLIPTVHFLNPAAAAFAAAGLVLHGYSLARRRVIAASFLLGT
||||||||||||||||||||||::||||||||||||||||||||||||||||||||||||
orf141-1 FLGRHHGRSVVLILIGCIGLIPVAHFLNPAAAAFAAAGLVLHGYSLARRRVIAASFLLGT
orf141a.pep GWTLMSLAAAYPAAFALMLPLPVLMFFRPWQSRRLMLTAVASLAFALPLMTVYPLLLAKT
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf141-1 GWTLMSLAAAYPAAFALMLPLPVLMFFRPWQSRRLMLTAVASLAFALPLMTVYPLLLAKT
orf141a.pep QPALFAQWLDDHVFGTFGGVRHIQTAFSLFYYLKNLLWFALPALPLAVWTVCRTRLFSTD
|||||||||| |||||||||||:|||||||||||||||||||||||||||||||||||||
orf141-1 QPALFAQWLDYHVFGTFGGVRHVQTAFSLFYYLKNLLWFALPALPLAVWTVCRTRLFSTD
orf141a.pep WGILGVVWMLAVLVLLAVNPQRFQDNLVWLLPPLALFGAAQLDSLRRGAAAFVNWFGIMA
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf141-1 WGILGVVWMLAVLVLLAVNPQRFQDNLVWLLPPLALFGAAQLDSLRRGAAAFVNWFGIMA
orf141a.pep FGLFAVFLWTGFFAMNYGWPAKLAERAAYFSPYYVPDIDPIPMAVAVLFTPLWLWAITRK
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf141-1 FGLFAVFLWTGFFAMNYGWPAKLAERAAYFSPYYVPDIDPIPMAVAVLFTPLWLWAITRK
orf141a.pep NIRGRQAVTNWAAGVTLTWALLMTLFLPWLDAAKSHAPVVRSMEASLSPELKRELSDGIE
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf141-1 NIRGRQAVTNWAAGVTLTWALLMTLFLPWLDAAKSHAPVVRSMEASLSPELKRELSDGIE
orf141a.pep CIDIGGGDLHTRIVWTQYGTLPHRVGDVQCRYRIVRLPQNADAPQGWQTVWQGARPRNKD
|| |||||||||||||||||||||||||||||||| ||||||||||||||||||||||||
orf141-1 CIGIGGGDLHTRIVWTQYGTLPHRVGDVQCRYRIVLLPQNADAPQGWQTVWQGARPRNKD
orf141a.pep SKFALIRKTGENI
|||||||| ||||
orf141-1 SKFALIRKIGENI
Homology with a predicted ORF from N. gonorrhoeae
ORF141 shows 95% identity over a 140 aa overlap with a predicted ORF (ORF141ng) from N. gonorrhoeae:
orf141.pep DFGISPVYLWVAAAFKHLLSPWAADSYDVA 30
|||| ||||||||||||||||||| ||:|
orf141ng WNPAEPAVYTAVEALAGSPTPLVAHLFGQTDFGIPPVYLWVAAAFKHLLSPWAAHPYDAA 126
orf141.pep RFAGVFFAVIGLTSCGFAGFNFLGRHHGRXVVLILIGCIGLIPVAHFLNPAAAAFAAAGL 90
||||||||||||||||||||||||||||| |||| ||||||||||||:||||||||||||
orf141ng RFAGVFFAVIGLTSCGFAGFNFLGRHHGRSVVLIHIGCIGLIPVAHFFNPAAAAFAAAGL 186
orf141.pep VLHGYSLARRRVIAASFLLGTGWTLMSLAAAYPAAFALMLPLPVLMFFRP 140
||||||||||||||||||||||||||||||||||||||||||||||||||
orf141ng VLHGYSLARRRVIAASFLLGTGWTLMSLAAAYPAAFALMLPLPVLMFFRPWQSRRLMLTA 246
The predicted ORF141ng nucleotide sequence <SEQ ID 599> encodes a protein having the amino acid sequence <SEQ ID 600>:
1 MPSEAVSARP LCEYLLHLAI RPFLLTLMLT YTPPDARPPA KTHEKPWLLL
51 LMAFAWLWPG VFSHDLWNPA EPAVYTAVEA LAGSPTPLVA HLFGQTDFGI
101 PPVYLWVAAA FKHLLSPWAA HPYDAARFAG VFFAVIGLTS CGFAGFNFLG
151 RHHGRSVVLI HIGCIGLIPV AHFFNPAAAA FAAAGLVLHG YSLARRRVIA
201 ASFLLGTGWT LMSLAAAYPA AFALMLPLPV LMFFRPWQSR RLMLTAVASL
251 AFALPLMTVY PLLLAKTQPA LFAQWLNYHV FGTFGGVRHI QRAFSLFHYL
301 KNLLWFAPPG LPLAVWTVCR TRLFSTDWGI LGIVWMLAVL VLLAFNPQRF
351 QDNLVWLLPP LALFGAAQLD SLRRGAAAFV NWFGIMAFGL FAVFLWTGFF
401 AMNYGWPAKL AERAAYFSPY YVPDIDPIPM AVAVLFTPLW LWAITRKNIR
451 GRQAVTNWAA GVTLTWALLM TLFLPWLDAA KSHAPVVRSM EASFSPELKR
501 ELSDGIECIG IGGGDLHTRI VWTQYGTLPH RVGDVRCRYR IVRLPQNADA
551 PQGWQTVWQG ARPRNKDSKF ALIRKIGENI LKTTD*
Further work revealed the following gonococcal DNA sequence <SEQ ID 601>:
1 ATGCTGACCT ATACCCCGCC CGATGCCCGC CCGCCCGCCA AAACCCACGA
51 AAAACCGTGG CTGCTGCTGT TGATGGCGTT TGCCTGGCTG TGGCCCGGCG
101 TGTTTTCCCA CGATTTGTGG AATCCTGCCG AACCTGCCGT CTATACCGCC
151 GTCGAAGCAC TGGCAGGCAG CCCCACCCCC TTGGTTGCCC ATCTGTTCGG
201 TCAAACCGAT TTCGGCATAC CGCCCGTGTA TCTTTGGGTT GCCGCCGCAT
251 TCAAACATTT GCTGTCGCCG TGGGCAGCCG ACCCGTATGA TGCCGCACGC
301 TTTGCAGGCG TATTTTTTGC CGTTATCGGA CTGACTTCTT GCGGCTTTGC
351 CGGTTTCAAC TTTTTGGGCA GACACCACGG GCGCAGCGTT GTTTTAATCC
401 ATATCGGCTG TATCGGGCTG ATTCCGGTTG CCCATTTCCT CAATCCcgcc
451 gccgccgcct tTGCCGCCGC CGGACTGGTG CTGCacggct actcgctgGC
501 ACGCCGGCGC GTGATtgccg cctctTtccT GCTCGGTACG GGTTGGACGT
551 TGATGTCGCT GGCGGCAGCT TATCCGGCGG CGTTTGCGCT GATGCTGCCC
601 CTGCCCGTGC TGATGTTTTT CCGTCCGTGG CAAAGCAGGC GTTTGATGTT
651 GACGGCAGTC GCCTCGCTTG CCTTTGCCCT GCCGCTTATG ACCGTTTACC
701 CGCTGCTCtt gGCAAAAACG CAGCCCGCGC TGTTTGCGCA ATGGCTCAAC
751 TATCACGTTT TCGGTACGTt cggcgGCGTG CGGCAcaTTC AGAggGCatT
801 Cagtttgttt cactatctgA AAaatctgct ttggttcgca ccgcccgggC
851 TGCCGCTGGC GGTTTGGACG GTTTGCCGCA CACGCCTGTT TTCGACCGAC
901 TGGGGGATTT TGGGCATTGT CTGGATGCTT GCCGTTTTGG TGCTGCTCGC
951 CTTTAATCCG CAGCGTTTTC AAGACAACCT CGTCTGGCTG CTGCCGCCGC
1001 TTGCCCTGTT CGGCGCGGCG CAACTGGACA GCCTGAGGCG CGGCGCGGCG
1051 GCTTTTGTCA ACTGGTTCGG CATTATGGCG TTCGGGCTGT TTGCCGTGTT
1101 CCTGTGGACG GGCTTTTTCG CCATGAATTA CGGCTGGCCC GCCAAGCTTG
1151 CCGAACGCGC CGCCTACTTC AGCCCGTATT ACGTTCCCGA CATCGATCCC
1201 ATTCCGATGG CGGTTGCCGT ACTGTTCACA CCCTTGTGGC TGTGGGCGAT
1251 TACCCGGAAA AACATACGCG GCAGGCAGGC GGTTACCAAC TGGGCGGCAG
1301 GCGTTACCCT GACCTGGGCT TTGCTGATGA CGCTGTTCCT GCCGTGGCTG
1351 GACGCGGCGA AAAGCCACGC GCCCGTCGTC CGGAGTATGG AGGCATCGTT
1401 TTCCCCGGAA TTAAAACGGG AGCTTTCAGA CGGCATCGAG TGTATCGGCA
1451 TAGGCGGCGG CGACCTGCAC ACGCGGATTG TTTGGACGCA GTACGGCACA
1501 TTGCCGCACC GCGTCGGCGA TGTCCGTTGC CGCTACCGTA TCGTCCGCCT
1551 GCCCCAAAAC GCGGATGCGC CGCAAGGCTG GCAGACGGTC TGGCAGGGTG
1601 CGCGCCCGCG CAACAAAGAC AGTAAGTTTG CACTGATACG GAAAATCGGG
1651 GAAAATATAT TAAAAACAAC AGATTGA
This corresponds to the amino acid sequence <SEQ ID 602; ORF141ng-1>:
1 MLTYTPPDAR PPAKTHEKPW LLLLMAFAWL WPGVFSHDLW NPAEPAVYTA
51 VEALAGSPTP LVAHLFGQTD FGIPPVYLWV AAAFKHLLSP WAADPYDAAR
101 FAGVFFAVIG LTSCGFAGFN FLGRHHGRSV VLIHIGCIGL IPVAHFLNPA
151 AAAFAAAGLV LHGYSLARRR VIAASFLLGT GWTLMSLAAA YPAAFALMLP
201 LPVLMFFRPW QSRRLMLTAV ASLAFALPLM TVYPLLLAKT QPALFAQWLN
251 YHVFGTFGGV RHIQRAFSLF HYLKNLLWFA PPGLPLAVWT VCRTRLFSTD
301 WGILGIVWML AVLVLLAFNP QRFQDNLVWL LPPLALFGAA QLDSLRRGAA
351 AFVNWFGIMA FGLFAVFLWT GFFAMNYGWP AKLAERAAYF SPYYVPDIDP
401 IPMAVAVLFT PLWLWAITRK NIRGRQAVTN WAAGVTLTWA LLMTLFLPWL
451 DAAKSHAPVV RSMEASFSPE LKRELSDGIE CIGIGGGDLH TRIVWTQYGT
501 LPHRVGDVRC RYRIVRLPQN ADAPQGWQTV WQGARPRNKD SKFALIRKIG
551 ENILKTTD*
ORF141ng-1 and ORF141-1 show 97.5% identity in a 553 aa overlap:
orf141ng-1.pep MLTYTPPDARPPAKTHEKPWLLLLMAFAWLWPGVFSHDLWNPAEPAVYTAVEALAGSPTP
|||||||||||||||||||||||||||||||||||||||||| |||||||||||||||||
orf141-1 MLTYTPPDARPPAKTHEKPWLLLLMAFAWLWPGVFSHDLWNPDEPAVYTAVEALAGSPTP
orf141ng-1.pep LVAHLFGQTDFGIPPVYLWVAAAFKHLLSPWAADPYDAARFAGVFFAVIGLTSCGFAGFN
|||||||||||||||||||||||||||||||||| |||||||||||||||||||||||||
orf141-1 LVAHLFGQTDFGIPPVYLWVAAAFKHLLSPWAADSYDAARFAGVFFAVIGLTSCGFAGFN
orf141ng-1.pep FLGRHHGRSVVLIHIGCIGLIPVAHFLNPAAAAFAAAGLVLHGYSLARRRVIAASFLLGT
||||||||||||| ||||||||||||||||||||||||||||||||||||||||||||||
orf141-1 FLGRHHGRSVVLILIGCIGLIPVAHFLNPAAAAFAAAGLVLHGYSLARRRVIAASFLLGT
orf141ng-1.pep GWTLMSLAAAYPAAFALMLPLPVLMFFRPWQSRRLMLTAVASLAFALPLMTVYPLLLAKT
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf141-1 GWTLMSLAAAYPAAFALMLPLPVLMFFRPWQSRRLMLTAVASLAFALPLMTVYPLLLAKT
orf141ng-1.pep QPALFAQWLNYHVFGTFGGVRHIQRAFSLFHYLKNLLWFAPPGLPLAVWTVCRTRLFSTD
|||||||||:||||||||||||:| |||||:||||||||| |:|||||||||||||||||
orf141-1 QPALFAQWLDYHVFGTFGGVRHVQTAFSLFYYLKNLLWFALPALPLAVWTVCRTRLFSTD
orf141ng-1.pep WGILGIVWMLAVLVLLAFNPQRFQDNLVWLLPPLALFGAAQLDSLRRGAAAFVNWFGIMA
|||||:||||||||||| ||||||||||||||||||||||||||||||||||||||||||
orf141-1 WGILGVVWMLAVLVLLAVNPQRFQDNLVWLLPPLALFGAAQLDSLRRGAAAFVNWFGIMA
orf141ng-1.pep FGLFAVFLWTGFFAMNYGWPAKLAERAAYFSPYYVPDIDPIPMAVAVLFTPLWLWAITRK
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf141-1 FGLFAVFLWTGFFAMNYGWPAKLAERAAYFSPYYVPDIDPIPMAVAVLFTPLWLWAITRK
orf141ng-1.pep NIRGRQAVTNWAAGVTLTWALLMTLFLPWLDAAKSHAPVVRSMEASFSPELKRELSDGIE
||||||||||||||||||||||||||||||||||||||||||||||:|||||||||||||
orf141-1 NIRGRQAVTNWAAGVTLTWALLMTLFLPWLDAAKSHAPVVRSMEASLSPELKRELSDGIE
orf141ng-1.pep CIGIGGGDLHTRIVWTQYGTLPHRVGDVRCRYRIVRLPQNADAPQGWQTVWQGARPRNKD
||||||||||||||||||||||||||||:|||||| ||||||||||||||||||||||||
orf141-1 CIGIGGGDLHTRIVWTQYGTLPHRVGDVQCRYRIVLLPQNADAPQGWQTVWQGARPRNKD
orf141ng-1.pep SKFALIRKIGENILKTTDX
|||||||||||||
orf141-1 SKFALIRKIGENIX
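Based on the presence of several putative transmembrane domains in the gonococcal protein, it is predicted that these proteins from N. meningitidis and N. gonorrhoeae, and their epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies.
Example 72
The following partial DNA sequence was identified in N. meningitidis <SEQ ID 603>: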
1 ..CAATCCGCCA AATGGTTATC GGGCCAAACT CTAGTCGGCA CAGCAATTGG
51 GATACGCGGG CAGATAAAGC TTGGCGGCAA CCTGCATTAC GATATATTTA
101 CCGGCCGCGC ATTGAAAAAG CCCGAATTTT TCCAATCAAG GAAATGGGCA
151 AGCGGTTTTC AGGTAGGCTA TACGTTTTAA
This corresponds to the amino acid sequence <SEQ ID 604; ORF142>:
1 ..QSAKWLSGQT LVGTAIGIRG QIKLGGNLHY DIFTGRALKK PEFFQSRKWA
51 SGFQVGYTF*
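The correspondence between a DNA sequence and its amino acid sequence can be checked by translating the DNA codon by codon. The sketch below assumes the standard genetic code and a reading frame that starts at the first base shown. The names `CODON_TABLE` and `translate` are hypothetical, and reporting any codon that contains a non-ACGT symbol as `X` is an assumption made here to mirror the way unresolved residues appear in the listings.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate DNA in frame 1; '*' marks a stop codon, 'X' an unresolved codon."""
    # Drop the blanks that separate the 10-base groups in the listing.
    seq = "".join(dna.split()).upper()
    usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, usable, 3)
    )


# First 30 bases of the partial sequence <SEQ ID 603>.
print(translate("CAATCCGCCA AATGGTTATC GGGCCAAACT"))  # QSAKWLSGQT
```

The output matches the first ten residues listed for ORF142, which suggests that the partial sequence is read in frame from its first base.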
Further work revealed the complete nucleotide sequence <SEQ ID 605>:
1 ATGGATAATT CGGGTAGTGA GGCGACAGGA AAATACCAAG GAAATATCAC
51 TTTCTCTGCC GACAATCCTT TGGGACTGAG TGATATGTTC TATGTAAATT
101 ATGGACGTTC GATTGGCGGT ACGCCCGATG AGGAAAGTTT TGACGGCCAT
151 CGCAAAGAAG GCGGATCAAA CAATTACGCC GTACATTATT CAGCCCCTTT
201 CGGTAAATGG ACATGGGCAT TCAATCACAA TGGCTACCGT TACCATCAGG
251 CAGTTTCCGG ATTATCGGAA GTCTATGACT ATAATGGAAA AAGTTACAAT
301 ACTGATTTCG GCTTCAACCG CCTGTTGTAT CGTGATGCCA AACGCAAAAC
351 CTATCTCGGT GTAAAACTGT GGATGAGGGA AACAAAAAGT TACATTGATG
401 ATGCCGAACT GACTGTACAA CGGCGTAAAA CTGCGGGTTG GTTGGCAGAA
451 CTTTCCCACA AAGAATATAT CGGTCGCAGT ACGGCAGATT TTAAGTTGAA
501 ATATAAACGC GGCACCGGCA TGAAAGATGC TCTGCGCGCG CCTGAAGAAG
551 CCTTTGGCGA AGGCACGTCA CGTATGAAAA TTTGGACGGC ATCGGCTGAT
601 GTAAATACTC CTTTTCAAAT CGGTAAACAG CTATTTGCCT ATGACACATC
651 CGTTCATGCA CAATGGAACA AAACCCCGCT AACATCGCAA GACAAACTGG
701 CTATCGGCGG ACACCACACC GTACGTGGCT TCGACGGTGA AATGAGTTTG
751 TCTGCCGAGC GGGGATGGTA TTGGCGCAAC GATTTGAGCT GGCAATTTAA
801 ACCAGGCCAT CAGCTTTATC TTGGGGCTGA TGTAGGACAT GTTTCAGGAC
851 AATCCGCCAA ATGGTTATCG GGCCAAACTC TAGTCGGCAC AGCAATTGGG
901 ATACGCGGGC AGATAAAGCT TGGCGGCAAC CTGCATTACG ATATATTTAC
951 CGGCCGCGCA TTGAAAAAGC CCGAATTTTT CCAATCAAGG AAATGGGCAA
1001 GCGGTTTTCA GGTAGGCTAT ACGTTTTAA
This corresponds to the amino acid sequence <SEQ ID 606; ORF142-1>:
1 MDNSGSEATG KYQGNITFSA DNPLGLSDMF YVNYGRSIGG TPDEESFDGH
51 RKEGGSNNYA VHYSAPFGKW TWAFNHNGYR YHQAVSGLSE VYDYNGKSYN
101 TDFGFNRLLY RDAKRKTYLG VKLWMRETKS YIDDAELTVQ RRKTAGWLAE
151 LSHKEYIGRS TADFKLKYKR GTGMKDALRA PEEAFGEGTS RMKIWTASAD
201 VNTPFQIGKQ LFAYDTSVHA QWNKTPLTSQ DKLAIGGHHT VRGFDGEMSL
251 SAERGWYWRN DLSWQFKPGH QLYLGADVGH VSGQSAKWLS GQTLVGTAIG
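301 IRGQIKLGGN LHYDIFTGRA LKKPEFFQSR KWASGFQVGY TF*
Computer analysis of this amino acid sequence gave the following results:
Homology with a predicted ORF from N. gonorrhoeae
ORF142 shows 88.1% identity over a 59 aa overlap with a predicted ORF (ORF142ng) from N. gonorrhoeae: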
orf142.pep QSAKWLSGQTLVGTAIGIRGQIKLGGNLHY 30
|||||||||||:||||||||||||||||||
orf142ng RGWYWRNDLSWQFKPGHQLYLGADVGHVSGQSAKWLSGQTLAGTAIGIRGQIKLGGNLHY 313
orf142.pep DIFTGRALKKPEFFQSRKWASGFQVGYTF 59
||||||||||||:||::||::||||||:|
orf142ng DIFTGRALKKPEYFQTKKWVTGFQVGYSF 342
The full-length ORF142ng nucleotide sequence <SEQ ID 607> is:
1 ATGGATAATT CGGGTAGTGA GGCGACAGGA AAATACCAAG GAAATATCAC
51 TTTCTCTGCC GACAATCCTT TTGGACTGAG TGATATGTTC TATGTAAATT
101 ATGGACGTTC AATTGGCGGT ACGCCCGATG AGGAAAATTT TGACGGCCAT
151 CGCAAAGAAG GCGGATCAAA CAATTACGCC GTACATTATT CAGCCCCTTT
201 CGGTAAATGG ACATGGGCAT TCAATCACAA TGGCTACCGT TACCATCAGG
251 CGGTTTCCGG ATTATCGGAA GTCTATGACT ATAATGGAAA AAGTTACAAC
301 ACTGATTTCG GCTTCAACCG CCTGTTGTAT CGTGATGCCA AACGCAAAAC
351 CTATCTCAGT GTAAAACTGT GGACGAGGGA AACAAAAAGT TACATTGATG
401 ATGCCGAACT GACTGTACAA CGGCGTAAAA CCACAGGTTG GTTGGCAGAA
451 CTTTCCCACA AAGGATATAT CGGTCGCAGT ACGGCAGATT TTAAGTTGAA
501 ATATAAACAC GGCACCGGCA TGAAAGATGC TCTGCGCGCG CCTGAAGAAG
551 CCTTTGGCGA AGGCACGTCA CGTATGAAAA TTTGGACGGC ATCGGCTGAT
601 GTAAATACTC CTTTTCAAAT CGGTAAACAG CTATTTGCCT ATGACACATC
651 CGTTCATGCA CAATGGAACA AAACCCCGCT AACATCGCAA GACAAACTGG
701 CTATCGGCGG ACACCACACC GTACGTGGCT TCGACGGTGA AATGAGTTTG
751 CCTGCCGAGC GGGGATGGTA TTGGCGCAAC GATTTGAGCT GGCAATTTAA
801 ACCAGGCCAT CAGCTTTATC TTGGGGCTGA TGTAGGACAT GTTTCAGGAC
851 AATCCGCCAA ATGGTTATCG GGCCAAACTC TAGCCGGCAC AGCAATTGGG
901 ATACGCGGGC AGATAAAGCT TGGCGGCAAC CTGCATTACG ATATATTTAC
951 CGGCCGTGCA TTGAAAAAGC CCGAATATTT TCAGACGAAG AAATGGGTAA
1001 CGGGGTTTCA GGTGGGTTAT TCGTTTTGA
It encodes a protein having the amino acid sequence <SEQ ID 608>:
1 MDNSGSEATG KYQGNITFSA DNPFGLSDMF YVNYGRSIGG TPDEENFDGH
51 RKEGGSNNYA VHYSAPFGKW TWAFNHNGYR YHQAVSGLSE VYDYNGKSYN
101 TDFGFNRLLY RDAKRKTYLS VKLWTRETKS YIDDAELTVQ RRKTTGWLAE
151 LSHKGYIGRS TADFKLKYKH GTGMKDALRA PEEAFGEGTS RMKIWTASAD
201 VNTPFQIGKQ LFAYDTSVHA QWNKTPLTSQ DKLAIGGHHT VRGFDGEMSL
251 PAERGWYWRN DLSWQFKPGH QLYLGADVGH VSGQSAKWLS GQTLAGTAIG
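301 IRGQIKLGGN LHYDIFTGRA LKKPEYFQTK KWVTGFQVGY SF*
The underlined sequence (an aromatic-Xaa-aromatic amino acid motif, here the C-terminal residues YSF) is commonly found at the C-terminus of outer membrane proteins.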
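The C-terminal motif described above can be tested for programmatically. The following is a minimal sketch, assuming that "aromatic" means phenylalanine, tryptophan or tyrosine (F, W, Y) and that the motif occupies the last three residues before the stop symbol; the function name `has_c_terminal_aromatic_motif` is hypothetical.

```python
# Assumption: the aromatic residues are Phe (F), Trp (W) and Tyr (Y).
AROMATIC = frozenset("FWY")


def has_c_terminal_aromatic_motif(protein: str) -> bool:
    """True if the last three residues read aromatic-Xaa-aromatic."""
    # Remove listing blanks and the trailing stop symbol ('*') if present.
    residues = "".join(protein.split()).rstrip("*").upper()
    if len(residues) < 3:
        return False
    first, _middle, last = residues[-3:]
    return first in AROMATIC and last in AROMATIC


# Final listing line of <SEQ ID 608> (ORF142ng), which ends in Y-S-F.
print(has_c_terminal_aromatic_motif("KWVTGFQVGY SF*"))  # True
# Final listing line of <SEQ ID 602> (ORF141ng-1), which ends in T-T-D.
print(has_c_terminal_aromatic_motif("ENILKTTD*"))  # False
```

This is only a pattern check on the sequence end; on its own it does not establish that a protein is located in the outer membrane.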
ORF142ng and ORF142-1 show 95.6% identity in a 342 aa overlap:
orf142-1.pep MDNSGSEATGKYQGNITFSADNPLGLSDMFYVNYGRSIGGTPDEESFDGHRKEGGSNNYA
|||||||||||||||||||||||:|||||||||||||||||||||:||||||||||||||
orf142ng-1 MDNSGSEATGKYQGNITFSADNPFGLSDMFYVNYGRSIGGTPDEENFDGHRKEGGSNNYA
orf142-1.pep VHYSAPFGKWTWAFNHNGYRYHQAVSGLSEVYDYNGKSYNTDFGFNRLLYRDAKRKTYLG
|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||:
orf142ng-1 VHYSAPFGKWTWAFNHNGYRYHQAVSGLSEVYDYNGKSYNTDFGFNRLLYRDAKRKTYLS
orf142-1.pep VKLWMRETKSYIDDAELTVQRRKTAGWLAELSHKEYIGRSTADFKLKYKRGTGMKDALRA
|||| |||||||||||||||||||:||||||||| ||||||||||||||:||||||||||
orf142ng-1 VKLWTRETKSYIDDAELTVQRRKTTGWLAELSHKGYIGRSTADFKLKYKHGTGMKDALRA
orf142-1.pep PEEAFGEGTSRMKIWTASADVNTPFQIGKQLFAYDTSVHAQWNKTPLTSQDKLAIGGHHT
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf142ng-1 PEEAFGEGTSRMKIWTASADVNTPFQIGKQLFAYDTSVHAQWNKTPLTSQDKLAIGGHHT
orf142-1.pep VRGFDGEMSLSAERGWYWRNDLSWQFKPGHQLYLGADVGHVSGQSAKWLSGQTLVGTAIG
|||||||||| |||||||||||||||||||||||||||||||||||||||||||:|||||
orf142ng-1 VRGFDGEMSLPAERGWYWRNDLSWQFKPGHQLYLGADVGHVSGQSAKWLSGQTLAGTAIG
orf142-1.pep IRGQIKLGGNLHYDIFTGRALKKPEFFQSRKWASGFQVGYTF
|||||||||||||||||||||||||:||::||::||||||:|
orf142ng-1 IRGQIKLGGNLHYDIFTGRALKKPEYFQTKKWVTGFQVGYSF
In addition, ORF142ng is homologous to the HecB protein of Erwinia chrysanthemi:
gi|1772622 (L39897) HecB [Erwinia chrysanthemi] Length = 558
Score = 119 bits (295), Expect = 3e-26
Identities = 88/346 (25%), Positives = 151/346 (43%), Gaps = 22/346 (6%)
Query: 2 DNSGSEATGKYQGNITFSADNPFGLSDMFYVNYGRSIGGTPDEENFDGHRKEGGSNNYAV 61
DNSG ++TG+ Q N + + DN FGL+D ++++ G S + +D + G
Sbjct: 230 DNSGQKSTGEEQLNGSLALDNVFGLADQWFISAGHS---SRFATSHDAESLQAG------ 280
Query: 62 HYSAPFGKWTWAFNHNGYRYHQAVSGLSEVYDYNGKSYNTDFGFNRLLYRDAKRKTYLSV 121
+S P+G W +N++ RY + G S F +R+++RD KT ++
Sbjct: 281 -FSMPYGYWNLGYNYSQSRYRNTFINRDFPWHSTGDSDTHRFSLSRVVFRDGTMKTAIAG 339
Query: 122 KLWTRETKSYIDDAELTVQRRKTTGWLAELSHKGYIGRSTADFKLKYKHGTGMKDALRAP 181
R +Y++ + L RK + ++H + A F Y G +
Sbjct: 340 TFSQRTGNNYLNGSLLPSSSRKLSSVSLGVNHSQKLWGGLATFNPTYNRGVRWLGSETDT 399
Query: 182 EEAFGEGTSRMKIWTASADVNTPFQIGKQLFAYDTSVHAQWNKTPLTSQDKLAIGGHHTV 241
+++ E + WT SA P Y S++ Q++ L ++L +GG ++
Sbjct: 400 DKSADEPRAEFNKWTLSASYYHPV---TDSITYLGSLYGQYSARALYGSEQLTLGGESSI 456
Query: 242 RGFDGEMSLPAERGWYWRNDLSWQFKP----GHQLYLGA-DVGHVSGQSAKWLSGQTLAG 296
RGF E RG YWRN+L+WQ G+ ++ A D GH+ + +L G
Sbjct: 457 RGF-REQYTSGNRGAYWRNELNWQAWQLPVLGNVTFMAAVDGGHLYNHKQDNSTAASLWG 515
Query: 297 TAIGIRGQIKLGGNLHYDIFTGRALKKPEYFQTKKWVTGFQVGYSF 342
A+G+ + L + G + P + Q V G++VG SF
Sbjct: 516 GAVGMTVASRW---LSQQVTVGWPISYPAWLQPDTMVVGYRVGLSF 558
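Based on this analysis, it is predicted that these proteins from N. meningitidis and N. gonorrhoeae, and their epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies.
Example 73
The following partial DNA sequence was identified in N. meningitidis <SEQ ID 609>: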
51 GCCGACATCG ATACCGCTTT GAACCTGTTG TACCGTTTGC AAAAACTCGA
101 ATTCCTCTAT GGCGATGAAA ACGGTCATTC AGACGGCATC AATTTGwCGG
151 ACGAGCAATT GCCGTTGCTG ATGGAACAAT TGTCCGGCAG CGGTAAGGCG
201 TTATTGGTCG ATCGGAACGG TCTGTATCTT GCCAACGCCA ATTTCCATCA
251 TGAGGCGGCG GAAGAGTTGG GGTTGTTGGC GGCAGAAGTC GCACAGATGG
301 AAAAGAAATA CCGGCTGCTG ATTAAGAACA AC..
This corresponds to the amino acid sequence <SEQ ID 610; ORF143>:
1 MRTKWSAVRS CTWADTADID TALNLLYRLQ KLEFLYGDEN GHSDGINLXD
51 EQLPLLMEQL SGSGKALLVD RNGLYLANAN FHHEAAEELG LLAAEVAQME
101 KKYRLLIKNN ..
Further work revealed the complete nucleotide sequence <SEQ ID 611>:
1 ATGGAATCAA CACTTTCACT ACAAGCAAAT TTATATCCCC GCCTGACTCC
51 TGCCGGTGCA TTTTATGCCG TATCCAGCGA TGCCCCCAGT GCCGGTAAAA
101 CTTTGTTGCA CAGCCTGTTG AAAGCAGATG CGGACGAAAT GGTCAGCAGT
151 GAGAAGCTGC TTACTTGGGC GGACACCGCC GACATCGATA CCGCTTTGAA
201 CCTGTTGTAC CGTTTGCAAA AACTCGAATT CCTCTATGGC GATGAAAACG
251 GTCATTCAGA CGGCATCAAT TTGTCGGACG AGCAATTGCC GTTGCTGATG
301 GAACAATTGT CCGGCAGCGG TAAGGCGTTA TTGGTCGATC GGAACGGTCT
351 GTATCTTGCC AACGCCAATT TCCATCATGA GGCGGCGGAA GAGTTGGGGT
401 TGTTGGCGGC AGAAGTCGCA CAGATGGAAA AGAAATACCG GCTGCTGATT
451 AAGAACAACC TGTATATCAA CAATAACGCT TGGGGCGTTT GCGATCCTTC
501 CGGTCAGAGC GAATTGACAT TTTTCCCATT GTATATCGGT TCAACCAAAT
551 TTATTTTGGT TATCGGCGGC ATTCCCGATT TGGGCAAAGA GGCATTTGTT
601 ACTTTGGTAA GGATTTTATA CCGCCGTTAC AGCAACCGCG TGTAA
This corresponds to the amino acid sequence <SEQ ID 612; ORF143-1>:
1 MESTLSLQAN LYPRLTPAGA FYAVSSDAPS AGKTLLHSLL KADADEMVSS
51 EKLLTWADTA DIDTALNLLY RLQKLEFLYG DENGHSDGIN LSDEQLPLLM
101 EQLSGSGKAL LVDRNGLYLA NANFHHEAAE ELGLLAAEVA QMEKKYRLLI
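151 KNNLYINNNA WGVCDPSGQS ELTFFPLYIG STKFILVIGG IPDLGKEAFV
201 TLVRILYRRY SNRV*
Computer analysis of this amino acid sequence gave the following results:
Homology with a predicted ORF from N. meningitidis (strain A)
ORF143 shows 92.4% identity over a 105 aa overlap with an ORF (ORF143a) from strain A of N. meningitidis: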
10 20 30
orf143.pep MRTKWSAVRSCTWADTADIDTALNLLYRLQKLEFL
|: : ||| ||||||||||||||||||||
orf143a GAFYAVSSDXPSAGKTLLHSLLKADADEMVSSEKLLTWAXTADIDTALNLLYRLQKLEFL
20 30 40 50 60 70
40 50 60 70 80 90
orf143.pep YGDENGHSDGINLXDEQLPLLMEQLSGSGKALLVDRNGLYLANANFHHEAAEELGLLAAE
||||||||||||| ||||||||||||||||||||||||||||||||||||||||||||||
orf143a YGDENGHSDGINLSDEQLPLLMEQLSGSGKALLVDRNGLYLANANFHHEAAEELGLLAAE
80 90 100 110 120 130
100 110
orf143.pep VAQMEKKYRLLIKNN
|||||||||| ||||
orf143a VAQMEKKYRLXIKNNLYINNNAWGVCDPSGQSELTFFPLYIGSTKFILVIGGIPDLGKEA
140 150 160 170 180 190
The full-length ORF143a nucleotide sequence <SEQ ID 613> is:
1 ATGGAATCAA CANTTTCACT ACAAGCAAAT TTATATCNCC GCCTGACTCC
51 TGCCGGTGCA TTTTATGCCG TATCCAGCGA TGNCCCCAGT GCCGGTAAAA
101 CTTTGTTGCA CAGCCTGTTG AAAGCGGATG CGGACGAAAT GGTNAGCAGT
151 GAGAAGCTGC TTACCTGGGC GGANACCGCC GACATCGATA CCGCTTTGAA
201 CCTGTTGTAC CGTTTGCAAA AACTCGAATT CCTCTATGGC GATGAAAACG
251 GTCATTCAGA CGGCATCAAT TTGTCGGACG AGCAATTGCC GTTGCTGATG
301 GAACAATTGT CCGGCAGCGG TAAGGCGTTA TTGGTCGATC GGAACGGTCT
351 GTATCTTGCC AACGCCAATT TCCATCATGA GGCGGCGGAA GAGTTGGGGT
401 TGTTGGCGGC AGAAGTCGCA CAGATGGAAA AGAAATACCG GCTGCNNATT
451 AAGAACAACC TGTATATCAA CAATAACGCT TGGGGCGTTT GCGATCCTTC
501 CGGTCAGAGC GAATTGACAT TTTTCCCATT GTATATCGGT TCAACCAAAT
551 TTATTTTGGT TATCGGCGGC ATTCCCGATT TGGGCAAAGA GGCATTTGTT
601 ACTTTGGTAA GGATNTTATA CCNCCNGTTA CAGCAACCGC GTGTAAAACT
651 TGGGAGAGAG GANGGGTTAT GCAGCAATTA TTGA
This encodes a protein having the amino acid sequence <SEQ ID 614>:
1 MESTXSLQAN LYXRLTPAGA FYAVSSDXPS AGKTLLHSLL KADADEMVSS
51 EKLLTWAXTA DIDTALNLLY RLQKLEFLYG DENGHSDGIN LSDEQLPLLM
101 EQLSGSGKAL LVDRNGLYLA NANFHHEAAE ELGLLAAEVA QMEKKYRLXI
151 KNNLYINNNA WGVCDPSGQS ELTFFPLYIG STKFILVIGG IPDLGKEAFV
201 TLVRXLYXXL QQPRVKLGRE XGLCSNY*
ORF143a and ORF143-1 show 97.1% identity over a 207 aa overlap:
orf143a.pep MESTXSLQANLYXRLTPAGAFYAVSSDXPSAGKTLLHSLLKADADEMVSSEKLLTWAXTA
|||| ||||||| |||||||||||||| ||||||||||||||||||||||||||||| ||
orf143-1 MESTLSLQANLYPRLTPAGAFYAVSSDAPSAGKTLLHSLLKADADEMVSSEKLLTWADTA
orf143a.pep DIDTALNLLYRLQKLEFLYGDENGHSDGINLSDEQLPLLMEQLSGSGKALLVDRNGLYLA
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf143-1 DIDTALNLLYRLQKLEFLYGDENGHSDGINLSDEQLPLLMEQLSGSGKALLVDRNGLYLA
orf143a.pep NANFHHEAAEELGLLAAEVAQMEKKYRLXIKNNLYINNNAWGVCDPSGQSELTFFPLYIG
|||||||||||||||||||||||||||| |||||||||||||||||||||||||||||||
orf143-1 NANFHHEAAEELGLLAAEVAQMEKKYRLLIKNNLYINNNAWGVCDPSGQSELTFFPLYIG
orf143a.pep STKFILVIGGIPDLGKEAFVTLVRXLY
|||||||||||||||||||||||| ||
orf143-1 STKFILVIGGIPDLGKEAFVTLVRILY
Homology with a predicted ORF from N. gonorrhoeae
ORF143 shows 95.5% identity over a 110 aa overlap with a predicted ORF (ORF143ng) from N. gonorrhoeae:
orf143.pep MRTKWSAVRSCTWADTADIDTALNLLYRLQKLEFLYGDENGHSDGINLXDEQLPLLMEQL 60
|||||||||||: ||||||||||||||||||||||||||||||||||| |||||||||||
orf143ng MRTKWSAVRSCSRADTADIDTALNLLYRLQKLEFLYGDENGHSDGINLSDEQLPLLMEQL 60
orf143.pep SGSGKALLVDRNGLYLANANFHHEAAEELGLLAAEVAQMEKKYRLLIKNN 110
|||||||||||||||||||||||||:||||||||||||||||||||||:||
orf143ng SGSGKALLVDRNGLYLANANFHHESAEELGLLAAEVAQMEKKYRLLIRNNLYINNNAWGV 120
The ORF143ng nucleotide sequence <SEQ ID 615> is predicted to encode a protein having the amino acid sequence <SEQ ID 616>:
1 MRTKWSAVRS CSRADTADID TALNLLYRLQ KLEFLYGDEN GHSDGINLSD
51 EQLPLLMEQL SGSGKALLVD RNGLYLANAN FHHESAEELG LLAAEVAQME
101 KKYRLLIRNN LYINNNAWGV CDPSGQSELT FFPLYIGSTK FILVIAGIPD
151 LSKGGICYFG KDFIPPLQQP RVKLGTGGIM RQLLISILED LNNTSTDIIA
201 SAVISTDGLP MATMLPSHLN SDRVGAISAT LLALGSRSVQ ELACGELEQV
251 MIKGKSGYIL LSQAGKDAVL VLVAKETGRL GLILLDAKRA ARHIAEAI*
Further work revealed the following gonococcal DNA sequence <SEQ ID 617>:
1 ATGGAATCAA CACTTTCACT ACAAGCGAAT TTATATCCCT GCCTGACTCC
51 TGCCGGTGCA TTTTATGCCG TATCCAGCGA TGCCCCCAGT GCCGGTAAAA
101 CTTTGTTGCG CAGCCTGTTG AAAGCGGATG CGGACGAAGT GGTCAGCAGT
151 GAGAAGCTGC TCGCGGCGGA CACCGCCGAC ATCGATACCG CTTTGAACCT
201 GTTGTACCGT TTGCAAAAAC TCGAATTCCT CTATGGCGAT GAAAACGGTC
251 ATTCAGACGG CATCAATTTG TCGGACGAGC AATTGCCGTT GCTGATGGAA
301 CAATTGTCCG GCAGCGGTAA GGCATTATTG GTCGATCGGA ACGGTCTGTA
351 TCTTGCCAAC GCCAATTTCC ATCATGAGTC GGCGGAAGAG TTGGGGTTGT
401 TGGCGGCAGA AGTCGCACAG ATGGAAAAGA AATACCGGCT GCTGATTAGG
451 AACAACCTGT ATATCAACAA TAACGCTTGG GGCGTTTGCG ATCCTTCCGG
501 TCAGAGCGAA TTGACATTTT TCCCATTGTA TATCGGTTCA ACCAAATTTA
551 TTTTGGTTAT CGCCGGCATT CCCGATTTGA GCAAAGAGGC ATTTGTTACT
601 TTGGTAAGGA TTTTATACCG CCGTTACAGC AACCGCGTGT AA
This corresponds to the amino acid sequence <SEQ ID 618; ORF143ng-1>:
1 MESTLSLQAN LYPCLTPAGA FYAVSSDAPS AGKTLLRSLL KADADEVVSS
51 EKLLAADTAD IDTALNLLYR LQKLEFLYGD ENGHSDGINL SDEQLPLLME
101 QLSGSGKALL VDRNGLYLAN ANFHHESAEE LGLLAAEVAQ MEKKYRLLIR
151 NNLYINNNAW GVCDPSGQSE LTFFPLYIGS TKFILVIAGI PDLSKEAFVT
201 LVRILYRRYS NRV*
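The correspondence between the gonococcal DNA sequence <SEQ ID 617> and the ORF143ng-1 protein above can be checked by translating the nucleotide listing in frame 1 with the standard genetic code. The following Python sketch is illustrative only: the helper names `clean_listing` and `translate` are hypothetical, and rendering any codon that contains an ambiguous base (such as N) as X is an assumption consistent with how X appears in the sequences here, not a description of the software actually used.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def clean_listing(text):
    """Strip position numbers and spaces from a numbered listing line.

    Assumption: the listing is complete; '.' placeholders are not handled.
    """
    return "".join(ch for ch in text if ch.isalpha()).upper()


def translate(dna):
    """Translate frame 1; codons with ambiguous bases become 'X'.

    Trailing bases that do not fill a codon are ignored.
    """
    protein = []
    for i in range(0, len(dna) - len(dna) % 3, 3):
        protein.append(CODON_TABLE.get(dna[i:i + 3], "X"))
    return "".join(protein)


# First and last lines of the SEQ ID 617 listing.
first = clean_listing("1 ATGGAATCAA CACTTTCACT ACAAGCGAAT TTATATCCCT GCCTGACTCC")
last = clean_listing("601 TTGGTAAGGA TTTTATACCG CCGTTACAGC AACCGCGTGT AA")
print(translate(first))  # MESTLSLQANLYPCLT
print(translate(last))   # LVRILYRRYSNRV*
```

Both outputs agree with the start and end of the ORF143ng-1 sequence listed above; `*` marks the stop codon.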
ORF143ng-1 and ORF143-1 show 95.8% identity over a 214 aa overlap:
orf143ng-1.pep MESTLSLQANLYPCLTPAGAFYAVSSDAPSAGKTLLRSLLKADADEVVSSEKLLA-ADTA 59
||||||||||||| ||||||||||||||||||||||:|||||||||:|||||||: ||||
orf143-1 MESTLSLQANLYPRLTPAGAFYAVSSDAPSAGKTLLHSLLKADADEMVSSEKLLTWADTA 60
orf143ng-1.pep DIDTALNLLYRLQKLEFLYGDENGHSDGINLSDEQLPLLMEQLSGSGKALLVDRNGLYLA 119
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf143-1 DIDTALNLLYRLQKLEFLYGDENGHSDGINLSDEQLPLLMEQLSGSGKALLVDRNGLYLA 120
orf143ng-1.pep NANFHHESAEELGLLAAEVAQMEKKYRLLIRNNLYINNNAWGVCDPSGQSELTFFPLYIG 179
|||||||:||||||||||||||||||||||:|||||||||||||||||||||||||||||
orf143-1 NANFHHEAAEELGLLAAEVAQMEKKYRLLIKNNLYINNNAWGVCDPSGQSELTFFPLYIG 180
orf143ng-1.pep STKFILVIAGIPDLSKEAFVTLVRILYRRYSNRV 213
||||||||:|||||:|||||||||||||||||||
orf143-1 STKFILVIGGIPDLGKEAFVTLVRILYRRYSNRV 214
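The identity figure quoted for this alignment can be reproduced by counting identical columns and dividing by the overlap length. The sketch below is a minimal illustration, assuming that gap columns and unknown residues (X) count toward the overlap but never as identities; the function name `percent_identity` is hypothetical and the alignment program used in the original work is not specified here.

```python
def percent_identity(row_a, row_b):
    """Return (identical columns, overlap length, percent identity).

    Both rows must come from the same alignment, so they have equal length.
    Gap columns ('-') and unknown residues ('X') count toward the overlap
    but never as identities (assumption).
    """
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have equal length")
    identical = sum(
        1 for a, b in zip(row_a, row_b)
        if a == b and a not in "-X"
    )
    overlap = len(row_a)
    return identical, overlap, 100.0 * identical / overlap


# The four alignment blocks above, concatenated row by row.
ng1 = (
    "MESTLSLQANLYPCLTPAGAFYAVSSDAPSAGKTLLRSLLKADADEVVSSEKLLA-ADTA"
    "DIDTALNLLYRLQKLEFLYGDENGHSDGINLSDEQLPLLMEQLSGSGKALLVDRNGLYLA"
    "NANFHHESAEELGLLAAEVAQMEKKYRLLIRNNLYINNNAWGVCDPSGQSELTFFPLYIG"
    "STKFILVIAGIPDLSKEAFVTLVRILYRRYSNRV"
)
mc1 = (
    "MESTLSLQANLYPRLTPAGAFYAVSSDAPSAGKTLLHSLLKADADEMVSSEKLLTWADTA"
    "DIDTALNLLYRLQKLEFLYGDENGHSDGINLSDEQLPLLMEQLSGSGKALLVDRNGLYLA"
    "NANFHHEAAEELGLLAAEVAQMEKKYRLLIKNNLYINNNAWGVCDPSGQSELTFFPLYIG"
    "STKFILVIGGIPDLGKEAFVTLVRILYRRYSNRV"
)
identical, overlap, pct = percent_identity(ng1, mc1)
print(f"{identical}/{overlap} = {pct:.1f}%")  # 205/214 = 95.8%
```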
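Based on the presence of a putative transmembrane domain in the gonococcal protein, it is predicted that these proteins from N. meningitidis and N. gonorrhoeae, and their epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies.
Example 74
The following partial DNA sequence was identified in N. meningitidis <SEQ ID 619>: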
1 ATGACCTTTT TACAACGTTT GCAAGGTTTG GCAGACAATA AAATCTGTGC
51 GTTTGCATGG TTCGTCGTCC GCCGCTTTGA TGAAGAACGC GTACCGCAGr
101 CGGCGGCAAG CATGACGTTT ACGACGCTGC TGGCACTCGT CCCCGTGCTG
151 ACCGTGATGG TGGCGGTCGC TTCGATTTTC CCCGTGTTCG ACCGCTGGTC
201 GGATTCGTTC GTCTCCTTCG TCAACCAAAC CATTGTGCCG CA.GGCGCGG
251 ACATGGTGTT CGACTATATC AATGCGTTCC GCGAGCAGGC GAACCGGCTG
301 ACGGCAATCG GCAGCGTGAT GCTGGTCGTT ACCTCGCTGA TGCTGATTCG
401 CCGTGGATG..
This corresponds to the amino acid sequence <SEQ ID 620; ORF144>:
1 MTFLQRLQGL ADNKICAFAW FVVRRFDEER VPQXAASMTF TTLLALVPVL
51 TVMVAVASIF PVFDRWSDSF VSFVNQTIVP XGADMVFDYI NAFREQANRL
101 TAIGSVMLVV TSLMLIRTID NTFNRIWRVX XQRPWM...
Further work revealed the complete nucleotide sequence <SEQ ID 621>:
1 ATGACCTTTT TACAACGTTT GCAAGGTTTG GCAGACAATA AAATCTGTGC
51 GTTTGCATGG TTCGTCGTCC GCCGCTTTGA TGAAGAACGC GTACCGCAGG
101 CGGCGGCAAG CATGACGTTT ACGACGCTGC TGGCACTCGT CCCCGTGCTG
151 ACCGTGATGG TGGCGGTCGC TTCGATTTTC CCCGTGTTCG ACCGCTGGTC
201 GGATTCGTTC GTCTCCTTCG TCAACCAAAC CATTGTGCCG CAGGGCGCGG
251 ACATGGTGTT CGACTATATC AATGCGTTCC GCGAGCAGGC GAACCGGCTG
301 ACGGCAATCG GCAGCGTGAT GCTGGTCGTT ACCTCGCTGA TGCTGATTCG
351 GACGATAGAC AATACGTTCA ACCGCATCTG GCGGGTCAAT TCCCAGCGTC
401 CGTGGATGAT GCAGTTTCTC GTCTATTGGG CTTTACTGAC GTTCGGGCCG
451 CTGTCTTTGG GCGTGGGCAT TTCCTTTATG GTCGGCTCGG TACAGGATGC
501 CGCGCTTGCC TCAGGTGCGC CGCAGTGGTC GGGCGCGTTG CGAACGGCGG
551 CGACGCTGAC CTTCATGACG CTTTTGCTGT GGGGGCTGTA CCGCTTCGTG
601 CCAAACCGCT TCGTTCCCGC GCGGCAGGCG TTTGTCGGGG CTTTGGCAAC
651 AGCGTTTTGT CTGGAAACCG CGCGCTCCCT CTTCACTTGG TATATGGGCA
701 ATTTCGACGG CTACCGCTCG ATTTACGGCG CGTTTGCCGC CGTGCCGTTT
751 TTTCTGTTGT GGCTGAACCT GTTGTGGACG CTGGTCTTGG GCGGCGCGGT
801 GCTGACTTCT TCACTCTCCT ACTGGCAGGG AGAAGCGTTC CGCAGGGGCT
851 TCGACTCGCG CGGACGGTTT GACGACGTGT TGAAAATCCT GCTGCTTCTG
901 GATGCGGCGC AAAAAGAAGG CAAAGCCTTG CCTGTTCAGG AGTTCAGACG
951 GCATATCAAT ATGGGCTACG ACGAGTTGGG CGAGCTTTTG GAAAAGCTGG
1001 CGCGGCACGG CTACATCTAT TCCGGCAGAC AGGGTTGGGT GTTGAAAACG
1051 GGGGCGGATT CGATTGAGTT GAACGAACTC TTCAAGCTCT TCGTTTACCG
1101 TCCGTTGCCT GTGGAAAGGG ATCATGTGAA CCAAGCTGTC GATGCGGTAA
1151 TGACACCGTG TTTGCAGACT TTGAACATGA CGCTGGCAGA GTTTGACGCT
1201 CAGGCGAAAA AACGGCAGTA G
This corresponds to the amino acid sequence <SEQ ID 622; ORF144-1>:
1 MTFLQRLQGL ADNKICAFAW FVVRRFDEER VPQAAASMTF TTLLALVPVL
51 TVMVAVASIF PVFDRWSDSF VSFVNQTIVP QGADMVFDYI NAFREQANRL
101 TAIGSVMLVV TSLMLIRTID NTFNRIWRVN SQRPWMMQFL VYWALLTFGP
151 LSLGVGISFM VGSVQDAALA SGAPQWSGAL RTAATLTFMT LLLWGLYRFV
201 PNRFVPARQA FVGALATAFC LETARSLFTW YMGNFDGYRS IYGAFAAVPF
251 FLLWLNLLWT LVLGGAVLTS SLSYWQGEAF RRGFDSRGRF DDVLKILLLL
301 DAAQKEGKAL PVQEFRRHIN MGYDELGELL EKLARHGYIY SGRQGWVLKT
351 GADSIELNEL FKLFVYRPLP VERDHVNQAV DAVMTPCLQT LNMTLAEFDA
401 QAKKRQ*
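Computer analysis of this amino acid sequence gave the following results:
Homology with a predicted ORF from N. meningitidis (strain A)
ORF144 shows 96.3% identity over a 136 aa overlap with an ORF (ORF144a) from strain A of N. meningitidis: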
10 20 30 40 50 60
orf144.pep MTFLQRLQGLADNKICAFAWFVVRRFDEERVPQXAASMTFTTLLALVPVLTVMVAVASIF
||||||||||||||||||||||||||||||||| ||||||||||||||||||||||||||
orf144a MTFLQRLQGLADNKICAFAWFVVRRFDEERVPQAAASMTFTTLLALVPVLTVMVAVASIF
10 20 30 40 50 60
70 80 90 100 110 120
orf144.pep PVFDRWSDSFVSFVNQTIVPXGADMVFDYINAFREQANRLTAIGSVMLVVTSLMLIRTID
|||||||||||||||||||| ||||||||||||||||||||||||||||||| |||||||
orf144a PVFDRWSDSFVSFVNQTIVPQGADMVFDYINAFREQANRLTAIGSVMLVVTSXMLIRTID
70 80 90 100 110 120
130
orf144.pep NTFNRIWRVXXQRPWM
||||||||| |||||
orf144a NTFNRIWRVNSQRPWMMQFLVYWALLTFGPLSLGVGISFXVGSVQDAALASGAPQWSGAL
130 140 150 160 170 180
The complete length ORF144a nucleotide sequence <SEQ ID 623> is:
1 ATGACCTTTT TACAACGTTT GCAAGGTTTG GCAGACAATA AAATCTGTGC
51 GTTTGCATGG TTCGTCGTCC GCCGCTTTGA TGAAGAACGC GTACCGCAGG
101 CGGCGGCAAG CATGACGTTT ACGACACTGC TGGCACTCGT CCCCGTGCTG
151 ACCGTGATGG TGGCGGTCGC TTCGATTTTC CCCGTGTTCG ACCGNTGGTC
201 GGATTCGTTC GTCTCCTTCG TCAACCAAAC CATTGTGCCG CAGGGCGCGG
251 ACATGGTNTT CGACTATATC AATGCGTTCC GCGAGCAGGC GAACCGGCTG
301 ACGGCAATCG GCAGCGTGAT GCTGGTCGTT ACCTCGCNGA TGCTGATTCG
351 GACGATAGAC AATACGTTCA ACCGCATCTG GCGGGTCAAT TCCCAGCGTC
401 CGTGGATGAT GCAGTTTCTC GTCTATTGGG CTTTACTGAC GTTCGGGCCG
451 CTGTCTTTGG GCGTGGGCAT TTCCTTTATN GTCGGCTCGG TACAGGATGC
501 CGCGCTTGCC TCAGGTGCGC CGCAGTGGTC GGGCGCGTTG CGAACGGCGG
551 CGACGCTGAN CTTCATGACG CTTTTGCTGT GGGGGCTGTA CCGCTNCGTG
601 CCAAACCGCT TCGTTCCCGC GCGGCANGCG TTTGTCGGGG CTTTGGCAAC
651 AGCGTTCTGT CTGGAAACCG CGCGTTCCCT CTTTACTTGG TATATGGGCA
701 ATTTCGACGG CTACCGCTCG ATTTACGGNG CGTTTGCCGC CGTGCCGTTT
751 TTTCTGTTGT GGCTGAACCT GTTGTGGACG CTGGTCTTGG GCGGCGCGGT
801 GCTGACTTCT TCACTCTCCT ACTGGCAGGG AGAAGCGTTC CGCAGGGNCT
851 TCGACTCGCG CGGACGGTTT GACGACGTGT TGAAAATCCT GCTGCTTCTG
901 GATGCGGCGC AAAAAGAAGG CNAAGCCTTG CCTGTTCAGG AGTTCAGACG
951 GCATATCAAT ATGGGCTACG ACGAGTTGGG CGAGCTTTTG GAAAAGCTGG
1001 CGCGGCACGG CTACATCTAT TCCGGCAGAC AGGGTTGGGT GTTGAAAACG
1051 GGGGCGGATT CGATTGAGTT GAACGAACTC TTCAAGCTCT TCGTTTACCG
1101 TCCGTTGCCT GTGGAAAGGG ATCATGTGAA CCAAGCTGTC GATGCGGTAA
1151 TGATGCCGTG TTTGCAGACT TTGAACATGA CGCTGGCAGA GTTTGACGCT
1201 CAGGCGAAAA AACAGCAGCA ATCTTGA
This encodes a protein having the amino acid sequence <SEQ ID 624>:
1 MTFLQRLQGL ADNKICAFAW FVVRRFDEER VPQAAASMTF TTLLALVPVL
51 TVMVAVASIF PVFDRWSDSF VSFVNQTIVP QGADMVFDYI NAFREQANRL
101 TAIGSVMLVV TSXMLIRTID NTFNRIWRVN SQRPWMMQFL VYWALLTFGP
151 LSLGVGISFX VGSVQDAALA SGAPQWSGAL RTAATLXFMT LLLWGLYRXV
201 PNRFVPARXA FVGALATAFC LETARSLFTW YMGNFDGYRS IYGAFAAVPF
251 FLLWLNLLWT LVLGGAVLTS SLSYWQGEAF RRXFDSRGRF DDVLKILLLL
301 DAAQKEGXAL PVQEFRRHIN MGYDELGELL EKLARHGYIY SGRQGWVLKT
351 GADSIELNEL FKLFVYRPLP VERDHVNQAV DAVMMPCLQT LNMTLAEFDA
401 QAKKQQQS*
ORF144a and ORF144-1 show 97.8% identity over a 406 aa overlap:
orf144a.pep MTFLQRLQGLADNKICAFAWFVVRRFDEERVPQAAASMTFTTLLALVPVLTVMVAVASIF
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf144-1 MTFLQRLQGLADNKICAFAWFVVRRFDEERVPQAAASMTFTTLLALVPVLTVMVAVASIF
orf144a.pep PVFDRWSDSFVSFVNQTIVPQGADMVFDYINAFREQANRLTAIGSVMLVVTSXMLIRTID
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf144-1 PVFDRWSDSFVSFVNQTIVPQGADMVFDYINAFREQANRLTAIGSVMLVVTSLMLIRTID
orf144a.pep NTFNRIWRVNSQRPWMMQFLVYWALLTFGPLSLGVGISFXVGSVQDAALASGAPQWSGAL
||||||||||||||||||||||||||||||||||||||| ||||||||||||||||||||
orf144-1 NTFNRIWRVNSQRPWMMQFLVYWALLTFGPLSLGVGISFMVGSVQDAALASGAPQWSGAL
orf144a.pep RTAATLXFMTLLLWGLYRXVPNRFVPARXAFVGALATAFCLETARSLFTWYMGNFDGYRS
||||||:||||||||||| ||||||||| |||||||||||||||||||||||||||||||
orf144-1 RTAATLTFMTLLLWGLYRFVPNRFVPARQAFVGALATAFCLETARSLFTWYMGNFDGYRS
orf144a.pep IYGAFAAVPFFLLWLNLLWTLVLGGAVLTSSLSYWQGEAFRRXFDSRGRFDDVLKILLLL
|||||||||||||||||||||||||||||||||||||||||| |||||||||||||||||
orf144-1 IYGAFAAVPFFLLWLNLLWTLVLGGAVLTSSLSYWQGEAFRRGFDSRGRFDDVLKILLLL
orf144a.pep DAAQKEGXALPVQEFRRHINMGYDELGELLEKLARHGYIYSGRQGWVLKTGADSIELNEL
||||||| ||||||||||||||||||||||||||||||||||||||||||||||||||||
orf144-1 DAAQKEGKALPVQEFRRHINMGYDELGELLEKLARHGYIYSGRQGWVLKTGADSIELNEL
orf144a.pep FKLFVYRPLPVERDHVNQAVDAVMMPCLQTLNMTLAEFDAQAKKQQQS 408
|||||||||||||||||||||||| |||||||||||||||||||:|
orf144-1 FKLFVYRPLPVERDHVNQAVDAVMTPCLQTLNMTLAEFDAQAKKRQ 406
Homology with a predicted ORF from N. gonorrhoeae
ORF144 shows 91.2% identity over a 136 aa overlap with a predicted ORF (ORF144ng) from N. gonorrhoeae:
orf144.pep MTFLQRLQGLADNKICAFAWFVVRRFDEERVPQXAASMTFTTLLALVPVLTVMVAVASIF 60
||||| || ||||||||||||:|||:|||||| ||||||||||||||||||||||||||
orf144ng MTFLQCWQGSADNKICAFAWFVIRRFSEERVPQAAASMTFTTLLALVPVLTVMVAVASIF 60
orf144.pep PVFDRWSDSFVSFVNQTIVPXGADMVFDYINAFREQANRLTAIGSVMLVVTSLMLIRTID 120
|||||||||||||||||||| |||||||||:|||:|||||||||||||||||||||||||
orf144ng PVFDRWSDSFVSFVNQTIVPQGADMVFDYIDAFRDQANRLTAIGSVMLVVTSLMLIRTID 120
orf144.pep NTFNRIWRVXXQRPWM 136
|:||||||| :|||||
orf144ng NAFNRIWRVNTQRPWMMQFLVYWALLTFGPLSLGVGISFMVGSVQDSVLSSGAQQWADAL 180
The complete length ORF144ng nucleotide sequence <SEQ ID 625> is predicted to encode a protein having the amino acid sequence <SEQ ID 626>:
1 MTFLQCWQGS ADNKICAFAW FVIRRFSEER VPQAAASMTF TTLLALVPVL
51 TVMVAVASIF PVFDRWSDSF VSFVNQTIVP QGADMVFDYI DAFRDQANRL
101 TAIGSVMLVV TSLMLIRTID NAFNRIWRVN TQRPWMMQFL VYWALLTFGP
151 LSLGVGISFM VGSVQDSVLS SGAQQWADAL KTAARLAFMT LLLWGLYRFV
201 PNRFVPARQA FVGALITAFC LETARFLFTW YMGNFDGYRS IYGAFAAVPF
251 FLLWLNLLWT LVLGGAVLTS SLSYWQGEAF RRGFDSRGRF DDVLKILLLL
301 DAAQKEGRTL SVQEFRRHIN MGYDELGELL EKLARYGYIY SGRQGWVLKT
351 GADSIELSEL FKLFVYRPLP VERDHVNQAV DAVMTPCLQT LNMTLAEFDA
401 QAKKQQQS*
Further work revealed the following gonococcal DNA sequence <SEQ ID 627>:
1 ATGACCTTTT TACAACGTTG GCAAGGTTTG GCGGACAATA AAATCTGTGC
51 ATTTGCATGG TTCGTCATCC GCCGTTTCAG TGAAGAGCGC GTACCGCAGG
101 CAGCGGCGAG CATGACGTTT ACGACACTGC TGGCACTCGT CCCCGTACTG
151 ACCGTAATGG TCGCGGTCGC TTCGATTTTC CCCGTGTTCG ACCGCTGGTC
201 GGATTCGTTC GTCTCCTTCG TCAACCAAAC CATTGTGCCG CAGGGCGCGG
251 ATATGGTGTT CGACTATATC GACGCATTCC GCGATCAGGC AAACCGGCTG
301 ACCGCCATCG GCAGCGTGAT GCTGGTCGTA ACCTCGCTGA TGCTGATTCG
351 GACGATAGAC AATGCGTTCA ACCGCATCTG GCGGGTTAAC ACGCAACGCC
401 CCTGGATGAT GCAGTTCCTC GTTTATTGGG CGTTGCTGAC TTTCGGGCCT
451 TTGTCTTTGG GTGTGGGCAT TTCCTTTATG GTCGGGTCGG TTCAAGACTC
501 CGTACTCTCC TCCGGAGCGC AACAATGGGC GGACGCGTTG AAGACGGCGG
551 CAAGGCTGGC TTTCATGACG CTTTTGCTGT GGGGGCTGTA CCGCTTCGTG
601 CCCAACCGCT TCGTGCCCGC CCGGCAGGCG TTTGTCGGAG CTTTGATTAC
651 GGCATTCTGC CTGGAGACGG CACGTTTCCT GTTCACCTGG TATATGGGCA
701 ATTTCGACGG CTACCGCTCG ATTTACGGCG CATTTGCCGC CGTGCCGTTT
751 TTCCTGCTGT GGTTAAACCT GCTGTGGACG CTGGTCTTGG GCGGGGCGGT
801 GCTGACTTCG TCGCTGTCTT ATTGGCAGGG CGAGGCCTTC CGCAGGGGAT
851 TCGACTCGCG CGGACGGTTT GACGACGTGT TGAAAATCCT GCTGCTTCTG
901 GATGCGGCGC AAAAAGAAGG CCGAACCCTG TCCGTTCAGG AGTTCAGACG
951 GCATATCAAT ATGGGTTACG ATGAATTGGG CGAGCTTTTG GAAAAGCTGG
1001 CGCGGTACGG CTATATCTAT TCCGGCAGAC AGGGCTGGGT TTTGAAAACG
1051 GGGGCGGATT CGATTGAGTT GAGCGAACTC TTCAAGCTCT TCGTGTACCG
1101 CCCGTTGCct gtggaAAGGG ATCATGTGAA CCAAGCTGtc gaTGCGGTAA
1151 TGAcgccgtG TTTGCAGACT TTGAACATGA CGCTGGCGGA GTTTGACGCT
1201 CAGgcgAAAA AACAGCAGCA GTCTTGA
This encodes a variant of ORF144ng, having the amino acid sequence <SEQ ID 628; ORF144ng-1>:
1 MTFLQRWQGL ADNKICAFAW FVIRRFSEER VPQAAASMTF TTLLALVPVL
51 TVMVAVASIF PVFDRWSDSF VSFVNQTIVP QGADMVFDYI DAFRDQANRL
101 TAIGSVMLVV TSLMLIRTID NAFNRIWRVN TQRPWMMQFL VYWALLTFGP
151 LSLGVGISFM VGSVQDSVLS SGAQQWADAL KTAARLAFMT LLLWGLYRFV
201 PNRFVPARQA FVGALITAFC LETARFLFTW YMGNFDGYRS IYGAFAAVPF
251 FLLWLNLLWT LVLGGAVLTS SLSYWQGEAF RRGFDSRGRF DDVLKILLLL
301 DAAQKEGRTL SVQEFRRHIN MGYDELGELL EKLARYGYIY SGRQGWVLKT
351 GADSIELSEL FKLFVYRPLP VERDHVNQAV DAVMTPCLQT LNMTLAEFDA
401 QAKKQQQS*
ORF144ng-1 and ORF144-1 show 94.1% identity over a 406 aa overlap:
orf144ng-1.pep MTFLQRWQGLADNKICAFAWFVIRRFSEERVPQAAASMTFTTLLALVPVLTVMVAVASIF
|||||| |||||||||||||||:|||:|||||||||||||||||||||||||||||||||
orf144-1 MTFLQRLQGLADNKICAFAWFVVRRFDEERVPQAAASMTFTTLLALVPVLTVMVAVASIF
orf144ng-1.pep PVFDRWSDSFVSFVNQTIVPQGADMVFDYIDAFRDQANRLTAIGSVMLVVTSLMLIRTID
||||||||||||||||||||||||||||||:|||:|||||||||||||||||||||||||
orf144-1 PVFDRWSDSFVSFVNQTIVPQGADMVFDYINAFREQANRLTAIGSVMLVVTSLMLIRTID
orf144ng-1.pep NAFNRIWRVNTQRPWMMQFLVYWALLTFGPLSLGVGISFMVGSVQDSVLSSGAQQWADAL
|:||||||||:|||||||||||||||||||||||||||||||||||::|:||| ||: ||
orf144-1 NTFNRIWRVNSQRPWMMQFLVYWALLTFGPLSLGVGISFMVGSVQDAALASGAPQWSGAL
orf144ng-1.pep KTAARLAFMTLLLWGLYRFVPNRFVPARQAFVGALITAFCLETARFLFTWYMGNFDGYRS
:||| |:|||||||||||||||||||||||||||| ||||||||| ||||||||||||||
orf144-1 RTAATLTFMTLLLWGLYRFVPNRFVPARQAFVGALATAFCLETARSLFTWYMGNFDGYRS
orf144ng-1.pep IYGAFAAVPFFLLWLNLLWTLVLGGAVLTSSLSYWQGEAFRRGFDSRGRFDDVLKILLLL
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf144-1 IYGAFAAVPFFLLWLNLLWTLVLGGAVLTSSLSYWQGEAFRRGFDSRGRFDDVLKILLLL
orf144ng-1.pep DAAQKEGRTLSVQEFRRHINMGYDELGELLEKLARYGYIYSGRQGWVLKTGADSIELSEL
|||||||::| ||||||||||||||||||||||||:|||||||||||||||||||||:||
orf144-1 DAAQKEGKALPVQEFRRHINMGYDELGELLEKLARHGYIYSGRQGWVLKTGADSIELNEL
orf144ng-1.pep FKLFVYRPLPVERDHVNQAVDAVMTPCLQTLNMTLAEFDAQAKKQQQS
||||||||||||||||||||||||||||||||||||||||||||:|
orf144-1 FKLFVYRPLPVERDHVNQAVDAVMTPCLQTLNMTLAEFDAQAKKRQ
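Based on this analysis, including the identification of several putative transmembrane domains in the gonococcal protein, it is predicted that these proteins from N. meningitidis and N. gonorrhoeae, and their epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies.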
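This section does not state how the putative transmembrane domains were predicted. As a purely illustrative stand-in, the sketch below scans a protein with a Kyte-Doolittle hydropathy window; the window length of 19 and the threshold of 1.6 are the conventional values for that scale and are assumptions here, as are the function names and the choice to score unknown residues (X) as 0.0.

```python
# Kyte-Doolittle hydropathy index for the 20 standard amino acids.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def mean_hydropathy(segment):
    """Average hydropathy; unknown residues such as X score 0.0 (assumption)."""
    return sum(KD.get(aa, 0.0) for aa in segment) / len(segment)


def hydrophobic_windows(protein, window=19, threshold=1.6):
    """1-based start positions of windows whose mean hydropathy >= threshold."""
    return [
        start + 1
        for start in range(len(protein) - window + 1)
        if mean_hydropathy(protein[start:start + window]) >= threshold
    ]


# Residues 43-60 of ORF144-1, a strongly hydrophobic stretch.
print(round(mean_hydropathy("LLALVPVLTVMVAVASIF"), 2))  # 2.65
```

Runs of consecutive start positions returned by `hydrophobic_windows` indicate candidate membrane-spanning segments; a dedicated topology predictor would be needed for anything beyond a first pass.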
Example 75
The following partial DNA sequence was identified in N. meningitidis <SEQ ID 629>:
1 ..AGACACGCCC GCCGCATCCG CATCGACACC GCCATCAACC CCGAACTGGA
51 AGCCCTCGCC GAACACCTCC ACTACCAATG GCAGGGCTTC CTCTGGCTCA
101 GCACCGATAT GCGTCAGGAA ATTTCCGCCC TCGTCATCCT GCTGCAACGC
151 ACCCGCCGCA AATGGCTGGA TGCCCACGAA CGCCAACACC TGCGCCAAAG
201 CCTGCTTGAA ACACGGGAAC ACGGCTGA
This corresponds to the amino acid sequence <SEQ ID 630; ORF146>:
1 ..RHARRIRIDT AINPELEALA EHLHYQWQGF LWLSTDMRQE ISALVILLQR
51 TRRKWLDAHE RQHLRQSLLE TREHG*
Further work revealed the complete nucleotide sequence <SEQ ID 631>:
1 ATGAACACCT CGCAACGCAA CCGCCTCGTC AGCCGCTGGC TCAACTCCTA
51 CGAACGCTAC CGCTACCGCC GCCTCATCCA CGCCGTCCGG CTCGGCGGGG
101 CCGTCCTGTT CGCCACCGCC TCCGCCCGGC TGCTCCACCT CCAACACGGC
151 GAGTGGATAG GGATGACCGT CTTCGTCGTC CTCGGCATGC TCCAGTTTCA
201 AGGGGCGATT TACTCCAAGG CGGTGGAACG TATGCTCGGC ACGGTCATCG
251 GGCTGGGCGC GGGTTTGGGC GTTTTATGGC TGAACCAGCA TTATTTCCAC
301 GGCAACCTCC TCTTCTACCT CACCGTCGGC ACGGCAAGCG CACTGGCCGG
351 CTGGGCGGCG GTCGGCAAAA ACGGCTACGT CCCTATGCTG GCAGGGCTGA
401 CGATGTGTAT GCTCATCGGC GACAACGGCA GCGAATGGCT CGACAGCGGA
451 CTCATGCGCG CCATGAACGT CCTCATCGGC GCGGCCATCG CCATCGCCGC
501 CGCCAAACTG CTGCCGCTGA AATCCACACT GATGTGGCGT TTCATGCTTG
551 CCGACAACCT GGCCGACTGC AGCAAAATGA TTGCCGAAAT CAGCAACGGC
601 AGGCGCATGA CCCGCGAACG CCTCGAGGAG AACATGGCGA AAATGCGCCA
651 AATCAACGCA CGCATGGTCA AAAGCCGCAG CCATCTCGCC GCCACATCGG
701 GCGAAAGCCG CATCAGCCCC GCCATGATGG AAGCCATGCA GCACGCCCAC
751 CGTAAAATCG TCAACACCAC CGAGCTGCTC CTGACCACCG CCGCCAAGCT
801 GCAATCTCCC AAACTCAACG GCAGCGAAAT CCGGCTGCTT GACCGCCACT
851 TCACACTGCT CCAAACCGAC CTGCAACAAA CCGTCGCCCT TATCAACGGC
901 AGACACGCCC GCCGCATCCG CATCGACACC GCCATCAACC CCGAACTGGA
951 AGCCCTCGCC GAACACCTCC ACTACCAATG GCAGGGCTTC CTCTGGCTCA
1001 GCACCAATAT GCGTCAGGAA ATTTCCGCCC TCGTCATCCT GCTGCAACGC
1051 ACCCGCCGCA AATGGCTGGA TGCCCACGAA CGCCAACACC TGCGCCAAAG
1101 CCTGCTTGAA ACACGGGAAC ACGGCTGA
This corresponds to the amino acid sequence <SEQ ID 632; ORF146-1>:
1 MNTSQRNRLV SRWLNSYERY RYRRLIHAVR LGGAVLFATA SARLLHLQHG
51 EWIGMTVFVV LGMLQFQGAI YSKAVERMLG TVIGLGAGLG VLWLNQHYFH
101 GNLLFYLTVG TASALAGWAA VGKNGYVPML AGLTMCMLIG DNGSEWLDSG
151 LMRAMNVLIG AAIAIAAAKL LPLKSTLMWR FMLADNLADC SKMIAEISNG
201 RRMTRERLEE NMAKMRQINA RMVKSRSHLA ATSGESRISP AMMEAMQHAH
251 RKIVNTTELL LTTAAKLQSP KLNGSEIRLL DRHFTLLQTD LQQTVALING
301 RHARRIRIDT AINPELEALA EHLHYQWQGF LWLSTNMRQE ISALVILLQR
351 TRRKWLDAHE RQHLRQSLLE TREHG*
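Computer analysis of this amino acid sequence gave the following results:
Homology with a predicted ORF from N. meningitidis (strain A)
ORF146 shows 98.6% identity over a 74 aa overlap with an ORF (ORF146a) from strain A of N. meningitidis: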
10 20 30
orf146.pep RHARRIRIDTAINPELEALAEHLHYQWQGF
||||||||||||||||||||||||||||||
orf146a KLNGSEIRLLDRHFTLLQTDLQQTVALINGRHARRIRIDTAINPELEALAEHLHYQWQGF
280 290 300 310 320 330
40 50 60 70
orf146.pep LWLSTDMRQEISALVILLQRTRRKWLDAHERQHLRQSLLETREHGX
|||||:||||||||||||||||||||||||||||||||||||||:
orf146a LWLSTNMRQEISALVILLQRTRRKWLDAHERQHLRQSLLETREHSX
340 350 360 370
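The alignment above places the partial ORF146 fragment at the C-terminal end of the longer ORF146a protein. A simple way to locate such a fragment, when no gaps are expected, is to slide it along the longer sequence and keep the offset with the most identical residues. The sketch below illustrates this under that no-gap assumption; `best_ungapped_offset` is a hypothetical helper, not the alignment program used in the original work.

```python
def best_ungapped_offset(fragment, full):
    """Slide `fragment` along `full` without gaps.

    Returns (offset, matches): the 0-based offset in `full` with the most
    identical residues, and that count. Ties go to the smallest offset.
    """
    if len(fragment) > len(full):
        raise ValueError("fragment is longer than the full-length sequence")
    best_offset, best_matches = 0, -1
    for offset in range(len(full) - len(fragment) + 1):
        matches = sum(
            1 for a, b in zip(fragment, full[offset:offset + len(fragment)])
            if a == b
        )
        if matches > best_matches:
            best_offset, best_matches = offset, matches
    return best_offset, best_matches


# Partial ORF146 and the ORF146a excerpt shown in the alignment above.
fragment = (
    "RHARRIRIDTAINPELEALAEHLHYQWQGF"
    "LWLSTDMRQEISALVILLQRTRRKWLDAHERQHLRQSLLETREHG"
)
excerpt = (
    "KLNGSEIRLLDRHFTLLQTDLQQTVALINGRHARRIRIDTAINPELEALAEHLHYQWQGF"
    "LWLSTNMRQEISALVILLQRTRRKWLDAHERQHLRQSLLETREHS"
)
offset, matches = best_ungapped_offset(fragment, excerpt)
print(offset)  # 30
```

The offset is relative to the excerpt, which itself starts partway into ORF146a.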
The complete length ORF146a nucleotide sequence <SEQ ID 633> is:
1 ATGAACACCT CGCAACGCAA CCGCCTCGTC AGCCGCTGGC TCAACTCCTA
51 CGAACGCTAC CGCTACCGCC GCCTCATCCA CGCCGTCCGG CTCGGCGGGG
101 CCGTCCTGTT CGCCACCGCC TCCGCCCGGC TGCTCCACCT CCAACACGGC
151 GAGTGGATAG GGATGACCGT CTTCGTCGTC CTCGGCATGC TCCAGTTTCA
201 AGGGGCGATT TACTCCAAGG CGGTGGAACG TATGCTCGGC ACGGTCATCG
251 GGCTGGGCGC GGGTTTGGGC GTTTTATGGC TGAACCAGCA TTATTTCCAC
301 GGCAACCTCC TCTTCTACCT CACCGTCGGC ACGGCAAGCG CACTGGCCGG
351 CTGGGCGGCG GTCGGCAAAA ACGGCTACGT CCCTATGCTG GCGGGGCTGA
401 CGATGTGCAT GCTCATCGGC GACAACGGCA GCGAATGGTT CGACAGCGGC
451 CTGATGCGCG CGATGAACGT CCTCATCGGC GCGGCCATCG CCATCGCCGC
501 CGCCAAACTG CTGCCGCTGA AATCCACACT GATGTGGCGT TTCATGCTTG
551 CCGACAACCT GACCGACTGC AGCAAAATGA TTGCCGAAAT CAGCAACGGC
601 AGGCGCATGA CCCGCGAACG CCTCGAAGAG AACATGGCGA AAATGCGCCA
651 AATCAACGCA CGCATGGTCA AAAGCCGCAG CCACCTCGCC GCCACATCGG
701 GCGAAAGCCG CATCAGCCCC GCCATGATGG AAGCCATGCA GCACGCCCAC
751 CGTAAAATTG TCAACACCAC CGAGCTGCTC CTGACCACCG CCGCCAAGCT
801 GCAATCTCCC AAACTCAACG GCAGCGAAAT CCGGCTGCTT GACCGCCACT
851 TCACACTGCT CCAAACCGAC CTGCAACAAA CCGTCGCCCT TATCAACGGC
901 AGACACGCCC GCCGCATCCG CATCGACACC GCCATCAACC CCGAACTGGA
951 AGCCCTCGCC GAACACCTCC ACTACCAATG GCAGGGCTTC CTCTGGCTCA
1001 GCACCAATAT GCGTCAGGAA ATTTCCGCCC TCGTCATCCT GCTGCAACGC
1051 ACCCGCCGCA AATGGCTGGA TGCCCACGAA CGCCAACACC TGCGCCAAAG
1101 CCTGCTTGAA ACACGGGAAC ACAGTTGA
This encodes a protein having the amino acid sequence <SEQ ID 634>:
1 MNTSQRNRLV SRWLNSYERY RYRRLIHAVR LGGAVLFATA SARLLHLQHG
51 EWIGMTVFVV LGMLQFQGAI YSKAVERMLG TVIGLGAGLG VLWLNQHYFH
101 GNLLFYLTVG TASALAGWAA VGKNGYVPML AGLTMCMLIG DNGSEWFDSG
151 LMRAMNVLIG AAIAIAAAKL LPLKSTLMWR FMLADNLTDC SKMIAEISNG
201 RRMTRERLEE NMAKMRQINA RMVKSRSHLA ATSGESRISP AMMEAMQHAH
251 RKIVNTTELL LTTAAKLQSP KLNGSEIRLL DRHFTLLQTD LQQTVALING
301 RHARRIRIDT AINPELEALA EHLHYQWQGF LWLSTNMRQE ISALVILLQR
351 TRRKWLDAHE RQHLRQSLLE TREHS*
ORF146a and ORF146-1 show 99.5% identity over a 374 aa overlap:
orf146a.pep MNTSQRNRLVSRWLNSYERYRYRRLIHAVRLGGAVLFATASARLLHLQHGEWIGMTVFVV
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf146-1 MNTSQRNRLVSRWLNSYERYRYRRLIHAVRLGGAVLFATASARLLHLQHGEWIGMTVFVV
orf146a.pep LGMLQFQGAIYSKAVERMLGTVIGLGAGLGVLWLNQHYFHGNLLFYLTVGTASALAGWAA
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf146-1 LGMLQFQGAIYSKAVERMLGTVIGLGAGLGVLWLNQHYFHGNLLFYLTVGTASALAGWAA
orf146a.pep VGKNGYVPMLAGLTMCMLIGDNGSEWFDSGLMRAMNVLIGAAIAIAAAKLLPLKSTLMWR
||||||||||||||||||||||||||:|||||||||||||||||||||||||||||||||
orf146-1 VGKNGYVPMLAGLTMCMLIGDNGSEWLDSGLMRAMNVLIGAAIAIAAAKLLPLKSTLMWR
orf146a.pep FMLADNLTDCSKMIAEISNGRRMTRERLEENMAKMRQINARMVKSRSHLAATSGESRISP
|||||||:||||||||||||||||||||||||||||||||||||||||||||||||||||
orf146-1 FMLADNLADCSKMIAEISNGRRMTRERLEENMAKMRQINARMVKSRSHLAATSGESRISP
orf146a.pep AMMEAMQHAHRKIVNTTELLLTTAAKLQSPKLNGSEIRLLDRHFTLLQTDLQQTVALING
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf146-1 AMMEAMQHAHRKIVNTTELLLTTAAKLQSPKLNGSEIRLLDRHFTLLQTDLQQTVALING
orf146a.pep RHARRIRIDTAINPELEALAEHLHYQWQGFLWLSTNMRQEISALVILLQRTRRKWLDAHE
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf146-1 RHARRIRIDTAINPELEALAEHLHYQWQGFLWLSTNMRQEISALVILLQRTRRKWLDAHE
orf146a.pep RQHLRQSLLETREHSX
||||||||||||||:
orf146-1 RQHLRQSLLETREHGX
Homology with a predicted ORF from N. gonorrhoeae
ORF146 shows 97.3% identity over a 75 aa overlap with a predicted ORF (ORF146ng) from N. gonorrhoeae:
orf146.pep RHARRIRIDTAINPELEALAEHLHYQWQGF 30
||||||||||||||||||||||||||||||
orf146ng KLNGSEIRLLDRHFTLLQTDLQQTAALINGRHARRIRIDTAINPELEALAEHLHYQWQGF 364
orf146.pep LWLSTDMRQEISALVILLQRTRRKWLDAHERQHLRQSLLETREHG 75
|||||:|||||||||| ||||||||||||||||||||||||||||
orf146ng LWLSTNMRQEISALVIPLQRTRRKWLDAHERQHLRQSLLETREHG 409
The ORF146ng nucleotide sequence <SEQ ID 635> is predicted to encode a protein having the amino acid sequence <SEQ ID 636>:
1 MSGVRFPSPA PIPSTDPPSG SLCFFTFPLQ TASDMNSSQR KRLSGRWLNS
51 YERYRHRRLI HAVRLGGTVL FATALARLLH LQHGEWIGMT VFVVLGMLQF
101 QGAIYSNAVE RMLGTVIGLG AGLGVLWLNQ HYFHGNLLFY LTIGTASALA
151 GWAAVGKNGY VPMLAGLTMC MLIGDNGSEW LDSGLMRAMN VLIGAAIAIA
201 AAKLLPLKST LMWRFMLADN LADCSKMIAE ISNGRRMTRE RLEQNMVKMR
251 QINARMVKSR SHLAATSGES RISPSMMEAM QHAHRKIVNT TELLLTTAAK
301 LQSPKLNGSE IRLLDRHFTL LQTDLQQTAA LINGRHARRI RIDTAINPEL
351 EALAEHLHYQ WQGFLWLSTN MRQEISALVI PLQRTRRKWL DAHERQHLRQ
401 SLLETREHG*
Further work revealed the following gonococcal DNA sequence <SEQ ID 637>:
1 ATGAACTCCT CGCAACGCAA ACGCCTTTCC GgccGCTGGC TCAACTCCTA
51 CGAACGCTac cGCCaccGCC GCCTCATACA TGCCGTGCGG CTCGGCggaa
101 ccgtCCTGTT CGCCACCGCA CTCGCCCGgc tACTCCACCT CCAacacggc
151 gAATGGATAG GGAtgaCCGT CTTCGTCGTC CTCGGCATGC TCCAGTTCCA
201 AGGCgcgatt tActccaacg cggtgGAacg taTGctcggt acggtcatcg
251 ggctgGGCGC GGGTTTGGgc gTTTTATGGC TGAACCAGCA TTAtttccac
301 ggcaacCTcc tcttctacct gaccatcggc acggcaagcg cactggccgg
351 ctGGGCGGCG GTCGGCAAAA acggctacgt ccctatgctg GCGGGGctgA
401 CGATGTGCAT gctcatcggc gACAACGGCA GCGAATGGCT CGACAGCGGC
451 CTGATGCGCG CGATGAACGT CCTCATCGGC GCCGCCATCG CCATTGCCGC
501 CGCCAAACTG CTGCCGCTGA AATCCACACT GATGTGGCGT TTCATGCTTG
551 CCGACAACCT GGCCGACTGC AGCAAAATGA TTGCCGAAAT CAGCAACGGC
601 AGGCGTATGA CGCGCGAACG TTTGGAGCAG AATATGGTCA AAATGCGCCA
651 AATCAACGCA CGCATGGTCA AAAGCCGCAG CCACCTCGCC GCCACATCGG
701 GCGAAAGCCG CATCAGCCCC TCCATGATGG AAGCCATGCA GCACGCCCAC
751 CGCAAAATCG TCAACACCAC CGAGCTGCTC CTGACCACCG CCGCCAAGCT
801 GCAATCTCCC AAACTCAACG GCAGCGAAAT CCGGCTGCTC GACCGCCACT
851 TCACACTGCT CCAAACCGAC CTGCAACAAA CCGCCGCCCT CATCAACGGC
901 AGACACGCCC GCCGCATCCG CATCGACACC GCCATCAACC CCGAACTGGA
951 AGCCCTCGCC GAACACCTCC ACTACCAATG GCAGGGCTTC CTCTGGCTCA
1001 GCACCAATAT GCGTCAGGAA ATTTCCGCCC TCGTCATCCT GCTGCAACGC
1051 ACCCGCCGCA AATGGCTGGA TGCCCACGAA CGCCAACACC TGCGCCAAAG
1101 CCTGCTTGAA ACACGGGAAC ACGGCTGA
This corresponds to the amino acid sequence <SEQ ID 638; ORF146ng-1>:
1 MNSSQRKRLS GRWLNSYERY RHRRLIHAVR LGGTVLFATA LARLLHLQHG
51 EWIGMTVFVV LGMLQFQGAI YSNAVERMLG TVIGLGAGLG VLWLNQHYFH
101 GNLLFYLTIG TASALAGWAA VGKNGYVPML AGLTMCMLIG DNGSEWLDSG
151 LMRAMNVLIG AAIAIAAAKL LPLKSTLMWR FMLADNLADC SKMIAEISNG
201 RRMTRERLEQ NMVKMRQINA RMVKSRSHLA ATSGESRISP SMMEAMQHAH
251 RKIVNTTELL LTTAAKLQSP KLNGSEIRLL DRHFTLLQTD LQQTAALING
301 RHARRIRIDT AINPELEALA EHLHYQWQGF LWLSTNMRQE ISALVILLQR
351 TRRKWLDAHE RQHLRQSLLE TREHG*
ORF146ng-1 and ORF146-1 show 96.5% identity over a 375 aa overlap:
orf146-1.pep MNTSQRNRLVSRWLNSYERYRYRRLIHAVRLGGAVLFATASARLLHLQHGEWIGMTVFVV
||:|||:|| :||||||||||:|||||||||||:|||||| |||||||||||||||||||
orf146ng-1 MNSSQRKRLSGRWLNSYERYRHRRLIHAVRLGGTVLFATALARLLHLQHGEWIGMTVFVV
orf146-1.pep LGMLQFQGAIYSKAVERMLGTVIGLGAGLGVLWLNQHYFHGNLLFYLTVGTASALAGWAA
||||||||||||:|||||||||||||||||||||||||||||||||||:|||||||||||
orf146ng-1 LGMLQFQGAIYSNAVERMLGTVIGLGAGLGVLWLNQHYFHGNLLFYLTIGTASALAGWAA
orf146-1.pep VGKNGYVPMLAGLTMCMLIGDNGSEWLDSGLMRAMNVLIGAAIAIAAAKLLPLKSTLMWR
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf146ng-1 VGKNGYVPMLAGLTMCMLIGDNGSEWLDSGLMRAMNVLIGAAIAIAAAKLLPLKSTLMWR
orf146-1.pep FMLADNLADCSKMIAEISNGRRMTRERLEENMAKMRQINARMVKSRSHLAATSGESRISP
|||||||||||||||||||||||||||||:||:|||||||||||||||||||||||||||
orf146ng-1 FMLADNLADCSKMIAEISNGRRMTRERLEQNMVKMRQINARMVKSRSHLAATSGESRISP
orf146-1.pep AMMEAMQHAHRKIVNTTELLLTTAAKLQSPKLNGSEIRLLDRHFTLLQTDLQQTVALING
:|||||||||||||||||||||||||||||||||||||||||||||||||||||:|||||
orf146ng-1 SMMEAMQHAHRKIVNTTELLLTTAAKLQSPKLNGSEIRLLDRHFTLLQTDLQQTAALING
orf146-1.pep RHARRIRIDTAINPELEALAEHLHYQWQGFLWLSTNMRQEISALVILLQRTRRKWLDAHE
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf146ng-1 RHARRIRIDTAINPELEALAEHLHYQWQGFLWLSTNMRQEISALVILLQRTRRKWLDAHE
orf146-1.pep RQHLRQSLLETREHGX
||||||||||||||||
orf146ng-1 RQHLRQSLLETREHGX
In addition, ORF146ng-1 shows homology with a hypothetical E. coli protein:
sp|P33011|YEEA_ECOLI HYPOTHETICAL 40.0 KD PROTEIN IN COBU-SBMC INTERGENIC REGION
>gi|1736674|gnl|PID|d1016553 (D90838) ORF_ID:o348#20; similar to [SwissProt Accession Number P33011] [Escherichia coli] >gi|1736682|gnl|PID|d1016560 (D90839) ORF_ID:o348#20; similar to [SwissProt Accession Number P33011] [Escherichia coli] >gi|1788318 (AE000292) f352; 100% identical to fragment YEEA_ECOLI SW: P33011 but has 203 additional C-terminal residues [Escherichia coli] Length = 352
Score = 109 bits (271), Expect = 2e-23
Identities = 89/347 (25%), Positives = 150/347 (42%), Gaps = 21/347 (6%)
Query: 20 YRHRRLIHAVRLGGTVLFATALARLLHLQHGEWIGMTVFVVLGMLQFQGAIYSNAVERML 79
YRH R++H R+ L + RL + W +T+ V++G + F G + A ER+
Sbjct: 15 YRHYRIVHGTRVALAFLLTFLIIRLFTIPESTWPLVTMVVIMGPISFWGNVVPRAFERIG 74
Query: 80 GTVIGLGAGLGVLWLNQHYFHGNLLFYLTIGTASALAGWAAVGKNGYVPMLAGLTMCMLI 139
GTV+G GL L L L + A L GW A+GK Y +L G+T+ +++
Sbjct: 75 GTVLGSILGLIALQLE---LISLPLMLVWCAAAMFLCGWLALGKKPYQGLLIGVTLAIVV 131
Query: 140 GDNGSEWLDSGLMRAMNVLIGXXXXXXXXKLLPLKSTLMWRFMLADNLADCSKMIAEISN 199
G E +D+ L R+ +V++G + P ++ + WR LA +L + +++ +
Sbjct: 132 GSPTGE-IDTALWRSGDVILGSLLAMLFTGIWPQRAFIHWRIQLAKSLTEYNRVYQSAFS 190
Query: 200 GRRMTRERLEQNMVKMRQINARMVKSRSHLAATSGESRISPSMMEAMQHAHRKIVNXXXX 259
+ R RLE ++ K+ VK R +A S E+RI S+ E +Q +R +V
Sbjct: 191 PNLLERPRLESHLQKLL---TDAVKMRGLIAPASKETRIPKSIYEGIQTINRNLVCMLEL 247
Query: 260 XXXXXXXXQSPK---LNGSEIRLLDRHFXXXXXXXXXXAALINGRHARRIRIDTAINPEL 316
+ LN ++R D AL G +N +
Sbjct: 248 QINAYWATRPSHFVLLNAQKLR--DTQHMMQQILLSLVHALYEGNPQPVFANTEKLNDAV 305
Query: 317 EALAEHL--HYQWQ-------GFLWLSTNMRQEISALVILLQRTRRK 354
E L + L H+ + G++WL+ ++ L L+ R RK
Sbjct: 306 EELRQLLNNHHDLKVVETPIYGYVWLNMETAHQLELLSNLICRALRK 352
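Based on this analysis, including the identification of several transmembrane domains in the gonococcal protein, it is predicted that these proteins from N. meningitidis and N. gonorrhoeae, and their epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies.
Example 76
The following partial DNA sequence was identified in N. meningitidis <SEQ ID 639>: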
1 ..GCCGAAGACA CGCGCGTTAC CGCACAGCTT TTGAGCGCGT ACGGCATTCA
51 GGGCAAACTC GTCAGTGTGC GCGAACACAA CGAACGGCAG ATGGCGGACA
101 AGATTGTCGG CTATCTTTCA GACGGCATGG TTGTGGCACA GGTTTCCGAT
151 GCGGGTACGC CGGCCGTGTG CGACCCGGGC GCGAAACTCG CCCGCCGCGT
201 GCGTGAGGCC GGGTTTAAAG TCGTTCCCGT CGTGGGCGCA AC.GCGGTGA
251 TGGCGGCTTT GAGCGTGGCC GGTGTGGAAG GATCCGATTT TTATTTCAAC
301 GGTTTTGTAC CGCCGAAATC GGGAGAACGC AGGAAACTGT TTGCCAAATG
351 GGTGCGGGCG GCGTTTCCTA TCGTCATGTT TGAAACGCCG CACCGCATCG
401 GTGCAGCGCT TGCCGATATG GCGGAACTGT TCCCCGAACG CCGATTAATG
451 CTGGCGCGCG AAATTACGAA AACGTTTGAA ACGTTCTTAA GCGGCACGGT
501 TGGGGAAATT CAGACGGCAT TGTCTGCCGA CGGCGACCAA TCGCGCGGCG
551 AGATGGTGTT GGTGCTTTAT CCGGCGCAGG ATGAAAAACA CGAAGGCTTG
601 TCCGAGTCCG CGCAAAACAT CATGAAAATC CTCACAGCCG AGCTGCCGAC
651 CAAACAGGCG GCGGAGCTTG CTGCCAAAAT CACGGGCGAG GGAAAGAAAG
701 CTTTGTACGA T..
This corresponds to the amino acid sequence <SEQ ID 640; ORF147>:
1 ..AEDTRVTAQL LSAYGIQGKL VSVREHNERQ MADKIVGYLS DGMVVAQVSD
51 AGTPAVCDPG AKLARRVREA GFKVVPVVGA XAVMAALSVA GVEGSDFYFN
101 GFVPPKSGER RKLFAKWVRA AFPIVMFETP HRIGAALADM AELFPERRLM
151 LAREITKTFE TFLSGTVGEI QTALSADGDQ SRGEMVLVLY PAQDEKHEGL
201 SESAQNIMKI LTAELPTKQA AELAAKITGE GKKALYD..
Further work revealed the complete nucleotide sequence <SEQ ID 641>:
1 ATGTTTCAGA AACATTTGCA GAAAGCCTCC GACAGCGTCG TCGGAGGGAC
51 ATTATACGTG GTTGCCACGC CCATCGGCAA TTTGGCGGAC ATTACCCTGC
101 GCGCTTTGGC GGTATTGCAA AAGGCGGACA TCATCTGTGC CGAAGACACG
151 CGCGTTACCG CACAGCTTTT GAGCGCGTAC GGCATTCAGG GCAAACTCGT
201 CAGTGTGCGC GAACACAACG AACGGCAGAT GGCGGACAAG ATTGTCGGCT
251 ATCTTTCAGA CGGCATGGTT GTGGCACAGG TTTCCGATGC GGGTACGCCG
301 GCCGTGTGCG ACCCGGGCGC GAAACTCGCC CGCCGCGTGC GTGAGGCCGG
351 GTTTAAAGTC GTTCCCGTCG TGGGCGCAAG CGCGGTGATG GCGGCTTTGA
401 GCGTGGCCGG TGTGGAAGGA TCCGATTTTT ATTTCAACGG TTTTGTACCG
451 CCGAAATCGG GAGAACGCAG GAAACTGTTT GCCAAATGGG TGCGGGCGGC
501 GTTTCCTATC GTCATGTTTG AAACGCCGCA CCGCATCGGT GCGACGCTTG
551 CCGATATGGC GGAACTGTTC CCCGAACGCC GATTAATGCT GGCGCGCGAA
601 ATTACGAAAA CGTTTGAAAC GTTCTTAAGC GGCACGGTTG GGGAAATTCA
651 GACGGCATTG TCTGCCGACG GCAACCAATC GCGCGGCGAG ATGGTGTTGG
701 TGCTTTATCC GGCGCAGGAT GAAAAACACG AAGGCTTGTC CGAGTCCGCG
751 CAAAACATCA TGAAAATCCT CACAGCCGAG CTGCCGACCA AACAGGCGGC
801 GGAGCTTGCT GCCAAAATCA CGGGCGAGGG AAAGAAAGCT TTGTACGATC
851 TGGCTCTGTC TTGGAAAAAC AAATAG
This corresponds to the amino acid sequence <SEQ ID 642; ORF147-1>:
1 MFQKHLQKAS DSVVGGTLYV VATPIGNLAD ITLRALAVLQ KADIICAEDT
51 RVTAQLLSAY GIQGKLVSVR EHNERQMADK IVGYLSDGMV VAQVSDAGTP
101 AVCDPGAKLA RRVREAGFKV VPVVGASAVM AALSVAGVEG SDFYFNGFVP
151 PKSGERRKLF AKWVRAAFPI VMFETPHRIG ATLADMAELF PERRLMLARE
201 ITKTFETFLS GTVGEIQTAL SADGNQSRGE MVLVLYPAQD EKHEGLSESA
251 QNIMKILTAE LPTKQAAELA AKITGEGKKA LYDLALSWKN K*
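Computer analysis of this amino acid sequence gave the following results:
Homology with the hypothetical protein ORF286 of E. coli (accession number U18997)
ORF147 and the E. coli ORF286 protein show 36% aa identity over a 237 aa overlap: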
Orf147:1 AEDTRVTAQLLSAYGIQGKLVSVREHNERQMADKIVGYLSDGMVVAQVSDAGTPAVCDPG 60
AEDTR T LL +GI +L ++ +HNE+Q A+ ++ L +G +A VSDAGTP + DPG
Orf286:43 AEDTRHTGLLLQHFGINARLFALHDHNEQQKAETLLAKLQEGQNIALVSDAGTPLINDPG 102
Orf147:61 AKLARRVREXXXXXXXXXXXXXXXXXXXXXXXEGSDFYFNGFVPPKSGERRKLFAKWVRA 120
L R RE F + GF+P KS RR
Orf286:103 YHLVRTCREAGIRVVPLPGPCAAITALSAAGLPSDRFCYEGFLPAKSKGRRDALKAIEAE 162
Orf147:121 AFPIVMFETPHRIGAALADMAELFPERR-LMLAREITKTFETFLSGTVGEIQTALSADGD 179
++ +E+ HR+ +L D+ + E R ++LARE+TKT+ET VGE+ + D +
Orf286:163 PRTLIFYESTHRLLDSLEDIVAVLGESRYVVLARELTKTWETIHGAPVGELLAWVKEDEN 222
Orf147:180 QSRGEMVLVLYPAQDEKHEGLSESAQNIMKILTAELPTKQAAELAAKITGEGKKALY 236
+ +GEMVL++ + E L A + +L AELP K+AA LAA+I G K ALY
Orf286:223 RRKGEMVLIV-EGHKAQEEDLPADALRTLALLQAELPLKKAAALAAEIHGVKKNALY 278
Homology with a predicted ORF from N. meningitidis (strain A)
ORF147 shows 96.6% identity over a 237 aa overlap with ORF75a from strain A of N. meningitidis:
10 20 30
orf147.pep AEDTRVTAQLLSAYGIQGKLVSVREHNERQ
||||||||||||||||||||||||||||||
orf75a TLYVVATPIGNLADITLRALAVLQKADIICAEDTRVTAQLLSAYGIQGKLVSVREHNERQ
20 30 40 50 60 70
40 50 60 70 80 90
orf147.pep MADKIVGYLSDGMVVAQVSDAGTPAVCDPGAKLARRVREAGFKVVPVVGAXAVMAALSVA
|||||||||||||||||||||||||||||||||||||||:|||||||||| |||||||||
orf75a MADKIVGYLSDGMVVAQVSDAGTPAVCDPGAKLARRVREVGFKVVPVVGASAVMAALSVA
80 90 100 110 120 130
100 110 120 130 140 150
orf147.pep GVEGSDFYFNGFVPPKSGERRKLFAKWVRAAFPIVMFETPHRIGAALADMAELFPERRLM
|| ||||||||||||||||||||||||||:|||:|||||||||||:||||||||||||||
orf75a GVAGSDFYFNGFVPPKSGERRKLFAKWVRVAFPVVMFETPHRIGATLADMAELFPERRLM
140 150 160 170 180 190
160 170 180 190 200 210
orf147.pep LAREITKTFETFLSGTVGEIQTALSADGDQSRGEMVLVLYPAQDEKHEGLSESAQNIMKI
||||||||||||||||||||||||:|||:|||||||||||||||||||||||||||||||
orf75a LAREITKTFETFLSGTVGEIQTALAADGNQSRGEMVLVLYPAQDEKHEGLSESAQNIMKI
200 210 220 230 240 250
220 230
orf147.pep LTAELPTKQAAELAAKITGEGKKALYD
|||||||||||||||||||||||||||
orf75a LTAELPTKQAAELAAKITGEGKKALYDLALSWKNKX
260 270 280 290
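ORF147a is identical to ORF75a, which includes amino acids 56-292 of ORF75.
Homology with a predicted ORF from N. gonorrhoeae
ORF147 shows 94.1% identity over a 237 aa overlap with a predicted ORF (ORF147ng) from N. gonorrhoeae: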
orf147.pep AEDTRVTAQLLSAYGIQGKLVSVREHNERQ 30
||||||||||||||||||:|||||||||||
orf147ng TLYVVATPIGNLADITLRALAVLQKADIICAEDTRVTAQLLSAYGIQGRLVSVREHNERQ 85
orf147.pep MADKIVGYLSDGMVVAQVSDAGTPAVCDPGAKLARRVREAGFKVVPVVGAXAVMAALSVA 90
||||::|:||||:||||||||||||||||||||||||||||||||||||| |||||||||
orf147ng MADKVIGFLSDGLVVAQVSDAGTPAVCDPGAKLARRVREAGFKVVPVVGASAVMAALSVA 145
orf147.pep GVEGSDFYFNGFVPPKSGERRKLFAKWVRAAFPIVMFETPHRIGAALADMAELFPERRLM 150
|| |||||||||||||||||||||||||||||:|||||||||||:||||||||||||||
orf147ng GVAESDFYFNGFVPPKSGERRKLFAKWVRAAFPVVMFETPHRIGATLADMAELFPERRLM 205
orf147.pep LAREITKTFETFLSGTVGEIQTALSADGDQSRGEMVLVLYPAQDEKHEGLSESAQNIMKI 210
||||||||||||||||||||||||:|||:|||||||||||||||||||||||||| |||
orf147ng LAREITKTFETFLSGTVGEIQTALAADGNQSRGEMVLVLYPAQDEKHEGLSESAQNAMKI 265
orf147.pep LTAELPTKQAAELAAKITGEGKKALYD 237
|:|||||||||||||||||||||||||
orf147ng LAAELPTKQAAELAAKITGEGKKALYDLALSWKNK 300
The ORF147ng nucleotide sequence <SEQ ID 643> is predicted to encode a protein having amino acid sequence <SEQ ID 644>:
1 MSVFQTAFFM FQKHLQKASD SVVGGTLYVV ATPIGNLADI TLRALAVLQK
51 ADIICAEDTR VTAQLLSAYG IQGRLVSVRE HNERQMADKV IGFLSDGLVV
101 AQVSDAGTPA VCDPGAKLAR RVREAGFKVV PVVGASAVMA ALSVAGVAES
151 DFYFNGFVPP KSGERRKLFA KWVRAAFPVV MFETPHRIGA TLADMAELFP
201 ERRLMLAREI TKTFETFLSG TVGEIQTALA ADGNQSRGEM VLVLYPAQDE
251 KHEGLSESAQ NAMKILAAEL PTKQAAELAA KITGEGKKAL YDLALSWKNK
301 *
Further work revealed the following gonococcal DNA sequence <SEQ ID 645>:
1 ATGTTTCAGA AACACTTGCA GAAAGCCTCC GACAGCGTCG TCGGAGGGAC
51 ATTATACGTG GTTGCCACGC CCATCGGCAA TTTGGCAGAC ATTACCCTGC
101 GCGCTTTGGC GGTATTGCAA AAGGCGGACA TCATTTGTGC CGAAGACACG
151 CGCGTTACTG CGCAGCTTTT GAGCGCGTAC GGCATTCAGG GCAGGTTGGT
201 CAGTGTGCGC GAACACAACG AGCGGCAGAT GGCGGACAAG GTAATCGGTT
251 TCCTTTCAGA CGGCCTGGTT GTGGCGCAGG TTTCCGATGC GGGTACGCCG
301 GCCGTGTGCG ACCCGGGCGC GAAACTCGCC CGCCGCGTGC GCGAAGCAGG
351 GTTCAAAGTC GTTCCCGTCG TGGGCGCAAG CGCGGTAATG GCGGCGTTGA
401 GTGTGGCCGG TGTGGCGGAA TCCGATTTTT ATTTCAACGG TTTTGTACCG
451 CCGAAATCGG GCGAACGTAG GAAATTGTTT GCCAAATGGG TGCGGGCGGC
501 ATTTCCTGTC GTCATGTTTG AAACGCCGCA CCGAATCGGG GCAACGCTTG
551 CCGATATGGC GGAATTGTTC CCCGAACGCC GTCTGATGCT GGCGCGCGAA
601 ATCACGAAAA CGTTTGAAAC GTTCTTAAGC GGCACGGTTG GGGAAATTCA
651 GACGGCATTG GCGGCGGACG GCAACCAATC GCGCGGCGAG ATGGTGTTGG
701 TGCTTTATCC GGCGCAGGAT GAAAAACACG AAGGCTTGTC CGAGTCTGCG
751 CAAAATGCGA TGAAAATCCT TGCGGCCGAG CTGCCGACCA AGCAGGCGGC
801 GGAGCTTGCC GCCAAGATTA CAGGTGAGGG CAAAAAGGCT TTGTACGATT
851 TGGCACTGTC GTGGAAAAAC AAATGA
This corresponds to the amino acid sequence <SEQ ID 646; ORF147ng-1>:
1 MFQKHLQKAS DSVVGGTLYV VATPIGNLAD ITLRALAVLQ KADIICAEDT
51 RVTAQLLSAY GIQGRLVSVR EHNERQMADK VIGFLSDGLV VAQVSDAGTP
101 AVCDPGAKLA RRVREAGFKV VPVVGASAVM AALSVAGVAE SDFYFNGFVP
151 PKSGERRKLF AKWVRAAFPV VMFETPHRIG ATLADMAELF PERRLMLARE
201 ITKTFETFLS GTVGEIQTAL AADGNQSRGE MVLVLYPAQD EKHEGLSESA
251 QNAMKILAAE LPTKQAAELA AKITGEGKKA LYDLALSWKN K*
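The correspondence between a DNA listing and its amino acid sequence can be reproduced by translating codon by codon. The sketch below assumes the standard genetic code and a reading frame starting at nucleotide 1; the names `CODON_TABLE` and `translate` are illustrative and do not come from the text. Position numbers and spaces copied from the listing are ignored.

```python
from itertools import product

# Standard genetic code, codons ordered with T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a DNA listing in frame 1, stopping at the first stop codon.

    Digits and whitespace from the listing layout are dropped; any codon
    that is not made of A, C, G, T only is reported as X.
    """
    seq = "".join(ch for ch in dna.upper() if ch.isalpha())
    protein = []
    for i in range(0, len(seq) - 2, 3):
        aa = CODON_TABLE.get(seq[i:i + 3], "X")
        protein.append(aa)
        if aa == "*":
            break
    return "".join(protein)


if __name__ == "__main__":
    # First 30 nucleotides of <SEQ ID 645> as printed above.
    print(translate("1 ATGTTTCAGA AACACTTGCA GAAAGCCTCC"))
```

Applied to the first line of <SEQ ID 645>, this yields the first ten residues of <SEQ ID 646>.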
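ORF147ng shows homology with a hypothetical E. coli protein:
sp|P45528|YRAL_ECOLI HYPOTHETICAL 31.3 KD PROTEIN IN AGAI-MTR INTERGENIC REGION (F286)
>gi|606086 (U18997) ORF_f286 [Escherichia coli]
>gi|1789535 (AE000395) hypothetical 31.3 kD protein in agai-mtr intergenic region [Escherichia coli] Length = 286
Score = 218 bits (550), Expect = 3e-56
Identities = 128/284 (45%), Positives = 171/284 (60%), Gaps = 4/284 (1%)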
Query: 4 KHLQKASDSVVGGTLYVVATPIGNLADITLRALAVLQKADIICAEDTRVTAQLLSAYGIQ 63
K Q A +S G LY+V TPIGNLADIT RAL VLQ D+I AEDTR T LL +GI
Sbjct: 2 KQHQSADNSQ--GQLYIVPTPIGNLADITQRALEVLQAVDLIAAEDTRHTGLLLQHFGIN 59
Query: 64 GRLVSVREHNERQMADKVIGFLSDGLVVAQVSDAGTPAVCDPGAKLARRVREAGFKVVPV 123
RL ++ +HNE+Q A+ ++ L +G +A VSDAGTP + DPG L R REAG +VVP+
Sbjct: 60 ARLFALHDHNEQQKAETLLAKLQEGQNIALVSDAGTPLINDPGYHLVRTCREAGIRVVPL 119
Query: 124 VGASAVMAALSVAGVAESDFYFNGFVPPKSGERRKLFAKWVRAAFPVVMFETPHRIGATL 183
G A + ALS AG+ F + GF+P KS RR ++ +E+ HR+ +L
Sbjct: 120 PGPCAAITALSAAGLPSDRFCYEGFLPAKSKGRRDALKAIEAEPRTLIFYESTHRLLDSL 179
Query: 184 ADMAELFPERR-LMLAREITKTFETFLSGTVGEIQTALAADGNQSRGEMVLVLYPAQDEK 242
D+ + E R ++LARE+TKT+ET VGE+ + D N+ +GEMVL++ +
Sbjct: 180 EDIVAVLGESRYVVLARELTKTWETIHGAPVGELLAWVKEDENRRKGEMVLIV-EGHKAQ 238
Query: 243 HEGLSESAQNAMKILAAELPTKQAAELAAKITGEGKKALYDLAL 286
E L A + +L AELP K+AA LAA+I G K ALY AL
Sbjct: 239 EEDLPADALRTLALLQAELPLKKAAALAAEIHGVKKNALYKYAL 282
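The summary statistics reported with the E. coli alignment are simple ratios over the 284 aligned columns. The sketch below parses such a summary line and recomputes the percentages. It assumes the English BLAST wording ("Identities", "Positives", "Gaps") and that the reported percentage is the ratio truncated to an integer; the function names are illustrative.

```python
import re

# Matches fields such as "Identities = 128/284 (45%)".
STAT_FIELD = re.compile(r"(\w+)\s*=\s*(\d+)/(\d+)\s*\((\d+)%\)")


def parse_blast_stats(line: str) -> dict:
    """Map each field name to (count, aligned columns, reported percent)."""
    return {
        name: (int(count), int(total), int(percent))
        for name, count, total, percent in STAT_FIELD.findall(line)
    }


def truncated_percent(count: int, total: int) -> int:
    """Percentage truncated to an integer (assumed reporting convention)."""
    return 100 * count // total


if __name__ == "__main__":
    summary = "Identities = 128/284 (45%), Positives = 171/284 (60%), Gaps = 4/284 (1%)"
    for name, (count, total, reported) in parse_blast_stats(summary).items():
        print(name, count, total, reported, truncated_percent(count, total))
```

"Positives" counts identical residues plus conservative substitutions (the `+` marks in the middle line), so it is always at least as large as "Identities".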
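Based on this computer analysis, and on the presence of a putative transmembrane domain in the gonococcal protein, it is predicted that these proteins from N. meningitidis and N. gonorrhoeae, and their epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies.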
Example 77
The following partial DNA sequence was identified in N. meningitidis <SEQ ID 647>:
1 ATGAAAACAA CCGACAAACG GACAACCGAA ACACACCGCA AAGCCCCGAA
51 AACCGGTCGC ATCCGCTTCT C.GCTGCTTA CTTAGCCATA TGCCTGTCGT
101 TCGGCATTCT TCCCCAAGCC TGGGCGGGAC ACACTTATTT CGGCATCAAC
151 TACCAATACT ATCGCGACTT TGCCGAAAAT AAAGGCAAGT TTGCAGTCGG
201 GGCGAAAGAT ATTGAGGTTT ACAACAAAAA AGGGGAGTTG GTCGGCAAAT
251 CAATGACAAA AGCCCCGATG ATTGATTTTT CTGTGGTGTC GCGTAACGGC
351 GCGGCTATAA CAACGTTGAT TTTGGTGCGG AAGGAAk.AA tATCCC.GAT
401 CAACAwCGww TTACTTATAA AATTGTGAAA CGGAATAATT ATAAAGCAGG
451 GACTAAAGGC CATCCTTATG GCGGCGATTA TCATATGCCG CGTTTGCATA
501 AATwTGTCAC AGATGCAGAA CCTGTTGAAA TGACCAGTTA TATGGATGGG
551 CGGAAATATA TCGATCAAAA TAATTACCCT GACCGTGTTC GTATTGGGGC
601 AGGCAGGCAA TATTGGCGAT CTGATGAAGA TGAGCCCAAT AACCGCGAAA
651 GTTCATATCA TATTGCAAGT .......... .......... ..........
701 .......... .....GGCTC ACCAATGTTT ATCTATGATG CCCAAAAGCA
751 AAAGTGGTTA ATTAATGGGG TATTGCAAAC GGGCAACCCC TATATAGGAA
801 AAAGCAATGG CTTCCAGCTG GTTCGTAAAG ATTGGTTCTA TGATGAAATC
851 TTTGCTGGAG ATACCCATTC AGTATTCTAC GAACCACGTC AAAATGGGAA
901 ATACTCTTTT AACGACGATA ATAATGGCAC AGGAAAAATC AATGCCAAAC
951 ATGAACACAA TTCTCTGCCT AATAGATTAA AAACACGAAC CGTTCAATTG
1001 TTTAATGTTT CTTTATCCGA GACAGCAAGA GAACCTGTTT ATCATGCTGC
1051 AGGTGGTGTC AACAGTTATC GACCCAGACT GAATAATGGA GAAAATATTT
1101 CCTTTATTGA CGAAGGAAAA GGCGAATTGA TACTTACCAG CAACATCAAT
1151 CAAGGTGCTG GAGGATTATA TTTCCAAGGA GATTTTACGG TCTCGCCTGA
1201 AAATAACGAA ACTTGGCAAG GCGCGGGCGT TCATATCAGT GAAGACAGTA
1251 CCGTTACTTG GAAAGTAAAC GGCGTGGCAA ACGACCGCCT GTCCAAAATC
1301 GGCAAAGGCA CGCTG..... .......... .......... ..........
//
2101 .......... .......... .......... .......... ...GATAAAG
2151 TGACTGCTTC ATTGACTAAG ACCGACATCA GCGGCAATGT CGATCTTGCC
2201 GATCACGCTC ATTTAAATCT CACAGGGCTT GCCACACTCA ACGGCAATCT
2251 TAGTGCAAAT GGCGATACAC GTTATACAGT CAGCCACAAC GCCACCCAAA
2301 ACGGCAACCk TAgCCtCGtG G.sAATGcCC AAGCAACATT TAATCAAGCC
2351 ACATTAAACG GCAACACATC GGCTTCgGGC AATGCTTCAT TTAATCTAAG
2401 CGACCACGCC GTACAAAACG GCAGTCTGAC GCTTTCCGGC AACGCTAAGG
2451 CAAACGTAAG CCATTCCGCA CTCAACGGTA ATGTCTCCCT AGCCGATAAG
2501 GCAGTATTCC ATTTTGAAAG CAGCCGCTTT ACCGGACAAA TCAGCGGCGG
2551 CAagGATACG GCATTACACT TAAAAGACAG CGAATGGACG CTGCCGTCAg
2601 GarCGGAATT AGGCAATTTA AACCTTGACA ACGCCACCAT TACaCTCAAT
2651 TCCGCCTATC GCCACGATGC GGCAGGGGCG CAAACCGGCA GTGCGACAGA
2701 TGCGCCGCGC CGCCGTTCGC GCCGTTCGCG CCGTTCCCTA TTATmCGTTA
2751 CACCGCCAAC TTCGGTAGAA TCCCGTTTCA ACACGCTGAC GGTAAACGGC
2801 AAATTGAACG GTCAGGGAAC ATTCCGCTTT ATGTCGGAAC TCTTCGGCTA
2851 CCGCAGCGAC AAATTGAAGC TGGCGGAAAG TTCCGAAGGC ACTTACACCT
2901 TGGCGGTCAA CAATACCGGC AACGAACCTG CAAGCCTCGA ACAATTGACG
2951 GTAGTGGAAG GAAAAGACAA CAAACCGCTG TCCGAAAACC TTAATTTCAC
3001 CCTGCAAAAC GAACACGTCG ATGCAGGCGC GTGG...... ..........
//
3551 .......... .......... ....TTAGAC CGCGTATTTG CCGAAGACCG
3601 CCGCAACGCC GTTTGGACAA GCGGCATCCG GGACACCAAA CACTACCGTT
3651 CGCAAGATTT CCGCGCCTAC CGCCAACAAA CCGACCTGCG CCAAATCGGT
3701 ATGCAGAAAA ACCTCGGCAG CGGGCGCGTC GGCATCCTGT TTTCGCACAA
3751 CCGGACCGAA AACACCTTCG ACGACGGCAT CGGCAACTCG GCACGGCTTG
3801 CCCACGGCGC CGTTTTCGGG CAATACGGCA TCGACAGGTT CTACATCGGC
3901 GAGsmAAAwT CCGCCGCCGC GTGCtGCATT ACGGCATTCA GGCACGAtAC
3951 CGCGCCGgtt tCggCGgATt CGGCATCGAA CCGCACATCG GCGCAACGCg
4001 ctATTTCGTC CAAAAAGCGG ATTACCGCTA CGAAAACGTC AATATCGCCA
4051 CCCCCGGCCT TGCATTCAAC CGcTACCGCG CGGGCATTAa GGCAGATTAT
4101 TCATTCAAAC CGGCGCAACA CATTTCCATC ACGCCTTATT TGAGCCTGTC
4151 CTATACCGAT GCCGCTTCGG GCAAAGTCCG AACACGCGTC AATACCGCCG
4201 TATTGGCTCA GGATTTCGGC AAAACCCGCA GTGCGGAATG GGgCGTAAAC
4251 GCCGAAATCA AAGGTTTCAC GCTGTCCCTC CACGCTGCCG CCGCCAAAGG
4301 CCCGCAACTG GAAGCGCAAC ACAGCGCGGG CATCAAATTA GGCTACCGCT
4351 GGTAA...
This corresponds to the amino acid sequence <SEQ ID 648; ORF1>:
1 MKTTDKRTTE THRKAPKTGR IRFXAAYLAI CLSFGILPQA WAGHTYFGIN
51 YQYYRDFAEN KGKFAVGAKD IEVYNKKGEL VGKSMTKAPM IDFSVVSRNG
101 VAALVGVQYI VSVAHNGGYN NVDFGAEGXN IXDQXRXTYK IVKRNNYKAG
151 TKGHPYGGDY HMPRLHKXVT DAEPVEMTSY MDGRKYIDQN NYPDRVRIGA
201 GRQYWRSDED EPNNRESSYH IAS....... ........GS PMFIYDAQKQ
251 KWLINGVLQT GNPYIGKSNG FQLVRKDWFY DEIFAGDTHS VFYEPRQNGK
301 YSFNDDNNGT GKINAKHEHN SLPNRLKTRT VQLFNVSLSE TAREPVYHAA
351 GGVNSYRPRL NNGENISFID EGKGELILTS NINQGAGGLY FQGDFTVSPE
401 NNETWQGAGV HISEDSTVTW KVNGVANDRL SKIGKGTL.. ..........
//
701 .......... ....DKVTAS LTKTDISGNV DLADHAHLNL TGLATLNGNL
751 SANGDTRYTV SHNATQNGNX SLVXNAQATF NQATLNGNTS ASGNASFNLS
801 DHAVQNGSLT LSGNAKANVS HSALNGNVSL ADKAVFHFES SRFTGQISGG
851 KDTALHLKDS EWTLPSGXEL GNLNLDNATI TLNSAYRHDA AGAQTGSATD
901 APRRRSRRSR RSLLXVTPPT SVESRFNTLT VNGKLNGQGT FRFMSELFGY
951 RSDKLKLAES SEGTYTLAVN NTGNEPASLE QLTVVEGKDN KPLSENLNFT
1001 LQNEHVDAGA W......... .......... .......... ..........
//
1151 .......... .......... .......... .......... .LDRVFAEDR
1201 RNAVWTSGIR DTKHYRSQDF RAYRQQTDLR QIGMQKNLGS GRVGILFSHN
1251 RTENTFDDGI GNSARLAHGA VFGQYGIDRF YIGISAGAGF SSGSLSDGIG
1301 XKXRRRVLHY GIQARYRAGF GGFGIEPHIG ATRYFVQKAD YRYENVNIAT
1351 PGLAFNRYRA GIKADYSFKP AQHISITPYL SLSYTDAASG KVRTRVNTAV
1401 LAQDFGKTRS AEWGVNAEIK GFTLSLHAAA AKGPQLEAQH SAGIKLGYRW
1451 *
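The partial listing <SEQ ID 647> contains lowercase letters such as k, w, s, m and r, as well as dots, and the corresponding positions in <SEQ ID 648> appear as X. The text does not say how these symbols are defined. The sketch below assumes that the letters are IUPAC nucleotide ambiguity codes and that a dot marks an undetermined position; the table and the function name `ambiguous_positions` are illustrative.

```python
# IUPAC nucleotide codes: each symbol maps to the bases it may represent.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def ambiguous_positions(fragment: str):
    """List (index, symbol, possible bases) for every non-A/C/G/T symbol.

    Whitespace and digits from the listing layout are ignored; a symbol
    that is not an IUPAC code (for example a dot) is reported with "?".
    """
    seq = "".join(ch for ch in fragment if ch.isalpha() or ch == ".")
    found = []
    for index, symbol in enumerate(seq):
        upper = symbol.upper()
        if upper in "ACGT":
            continue
        found.append((index, symbol, IUPAC.get(upper, "?")))
    return found


if __name__ == "__main__":
    # Two ten-base groups copied from <SEQ ID 647>.
    for group in ("CAACAwCGww", "GAAGGAAk.A"):
        print(group, ambiguous_positions(group))
```

Under this reading, any codon that contains one of these symbols cannot be assigned a single residue, which is consistent with the X positions in the partial protein sequence.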
Further sequence analysis revealed the complete nucleotide sequence <SEQ ID 649>:
1 ATGAAAACAA CCGACAAACG GACAACCGAA ACACACCGCA AAGCCCCGAA
51 AACCGGCCGC ATCCGCTTCT CGCCTGCTTA CTTAGCCATA TGCCTGTCGT
101 TCGGCATTCT TCCCCAAGCC TGGGCGGGAC ACACTTATTT CGGCATCAAC
151 TACCAATACT ATCGCGACTT TGCCGAAAAT AAAGGCAAGT TTGCAGTCGG
201 GGCGAAAGAT ATTGAGGTTT ACAACAAAAA AGGGGAGTTG GTCGGCAAAT
251 CAATGACAAA AGCCCCGATG ATTGATTTTT CTGTGGTGTC GCGTAACGGC
301 GTGGCGGCAT TGGTGGGCGA TCAATATATT GTGAGCGTGG CACATAACGG
351 CGGCTATAAC AACGTTGATT TTGGTGCGGA AGGAAGAAAT CCCGATCAAC
401 ATCGTTTTAC TTATAAAATT GTGAAACGGA ATAATTATAA AGCAGGGACT
451 AAAGGCCATC CTTATGGCGG CGATTATCAT ATGCCGCGTT TGCATAAATT
501 TGTCACAGAT GCAGAACCTG TTGAAATGAC CAGTTATATG GATGGGCGGA
551 AATATATCGA TCAAAATAAT TACCCTGACC GTGTTCGTAT TGGGGCAGGC
601 AGGCAATATT GGCGATCTGA TGAAGATGAG CCCAATAACC GCGAAAGTTC
651 ATATCATATT GCAAGTGCGT ATTCTTGGCT CGTTGGTGGC AATACCTTTG
701 CACAAAATGG ATCAGGTGGT GGCACAGTCA ACTTAGGTAG TGAAAAAATT
751 AAACATAGCC CATATGGTTT TTTACCAACA GGAGGCTCAT TTGGCGACAG
801 TGGCTCACCA ATGTTTATCT ATGATGCCCA AAAGCAAAAG TGGTTAATTA
851 ATGGGGTATT GCAAACGGGC AACCCCTATA TAGGAAAAAG CAATGGCTTC
901 CAGCTGGTTC GTAAAGATTG GTTCTATGAT GAAATCTTTG CTGGAGATAC
951 CCATTCAGTA TTCTACGAAC CACGTCAAAA TGGGAAATAC TCTTTTAACG
1001 ACGATAATAA TGGCACAGGA AAAATCAATG CCAAACATGA ACACAATTCT
1051 CTGCCTAATA GATTAAAAAC ACGAACCGTT CAATTGTTTA ATGTTTCTTT
1101 ATCCGAGACA GCAAGAGAAC CTGTTTATCA TGCTGCAGGT GGTGTCAACA
1151 GTTATCGACC CAGACTGAAT AATGGAGAAA ATATTTCCTT TATTGACGAA
1201 GGAAAAGGCG AATTGATACT TACCAGCAAC ATCAATCAAG GTGCTGGAGG
1251 ATTATATTTC CAAGGAGATT TTACGGTCTC GCCTGAAAAT AACGAAACTT
1301 GGCAAGGCGC GGGCGTTCAT ATCAGTGAAG ACAGTACCGT TACTTGGAAA
1351 GTAAACGGCG TGGCAAACGA CCGCCTGTCC AAAATCGGCA AAGGCACGCT
1401 GCACGTTCAA GCCAAAGGGG AAAACCAAGG CTCGATCAGC GTGGGCGACG
1451 GTACAGTCAT TTTGGATCAG CAGGCAGACG ATAAAGGCAA AAAACAAGCC
1501 TTTAGTGAAA TCGGCTTGGT CAGCGGCAGG GGTACGGTGC AACTGAATGC
1551 CGATAATCAG TTCAACCCCG ACAAACTCTA TTTCGGCTTT CGCGGCGGAC
1601 GTTTGGATTT AAACGGGCAT TCGCTTTCGT TCCACCGTAT TCAAAATACC
1651 GATGAAGGGG CGATGATTGT CAACCACAAT CAAGACAAAG AATCCACCGT
1701 TACCATTACA GGCAATAAAG ATATTGCTAC AACCGGCAAT AACAACAGCT
1751 TGGATAGCAA AAAAGAAATT GCCTACAACG GTTGGTTTGG CGAGAAAGAT
1801 ACGACCAAAA CGAACGGGCG GCTCAACCTT GTTTACCAGC CCGCCGCAGA
1851 AGACCGCACC CTGCTGCTTT CCGGCGGAAC AAATTTAAAC GGCAACATCA
1901 CGCAAACAAA CGGCAAACTG TTTTTCAGCG GCAGACCAAC ACCGCACGCC
1951 TACAATCATT TAAACGACCA TTGGTCGCAA AAAGAGGGCA TTCCTCGCGG
2001 GGAAATCGTG TGGGACAACG ACTGGATCAA CCGCACATTT AAAGCGGAAA
2051 ACTTCCAAAT TAAAGGCGGA CAGGCGGTGG TTTCCCGCAA TGTTGCCAAA
2101 GTGAAAGGCG ATTGGCATTT GAGCAATCAC GCCCAAGCAG TTTTTGGTGT
2151 CGCACCGCAT CAAAGCCACA CAATCTGTAC ACGTTCGGAC TGGACGGGTC
2201 TGACAAATTG TGTCGAAAAA ACCATTACCG ACGATAAAGT GATTGCTTCA
2251 TTGACTAAGA CCGACATCAG CGGCAATGTC GATCTTGCCG ATCACGCTCA
2301 TTTAAATCTC ACAGGGCTTG CCACACTCAA CGGCAATCTT AGTGCAAATG
2351 GCGATACACG TTATACAGTC AGCCACAACG CCACCCAAAA CGGCAACCTT
2401 AGCCTCGTGG GCAATGCCCA AGCAACATTT AATCAAGCCA CATTAAACGG
2451 CAACACATCG GCTTCGGGCA ATGCTTCATT TAATCTAAGC GACCACGCCG
2501 TACAAAACGG CAGTCTGACG CTTTCCGGCA ACGCTAAGGC AAACGTAAGC
2551 CATTCCGCAC TCAACGGTAA TGTCTCCCTA GCCGATAAGG CAGTATTCCA
2601 TTTTGAAAGC AGCCGCTTTA CCGGACAAAT CAGCGGCGGC AAGGATACGG
2651 CATTACACTT AAAAGACAGC GAATGGACGC TGCCGTCAGG CACGGAATTA
2701 GGCAATTTAA ACCTTGACAA CGCCACCATT ACACTCAATT CCGCCTATCG
2751 CCACGATGCG GCAGGGGCGC AAACCGGCAG TGCGACAGAT GCGCCGCGCC
2801 GCCGTTCGCG CCGTTCGCGC CGTTCCCTAT TATCCGTTAC ACCGCCAACT
2851 TCGGTAGAAT CCCGTTTCAA CACGCTGACG GTAAACGGCA AATTGAACGG
2901 TCAGGGAACA TTCCGCTTTA TGTCGGAACT CTTCGGCTAC CGCAGCGACA
2951 AATTGAAGCT GGCGGAAAGT TCCGAAGGCA CTTACACCTT GGCGGTCAAC
3001 AATACCGGCA ACGAACCTGC AAGCCTCGAA CAATTGACGG TAGTGGAAGG
3051 AAAAGACAAC AAACCGCTGT CCGAAAACCT TAATTTCACC CTGCAAAACG
3101 AACACGTCGA TGCCGGCGCG TGGCGTTACC AACTCATCCG CAAAGACGGC
3151 GAGTTCCGCC TGCATAATCC GGTCAAAGAA CAAGAGCTTT CCGACAAACT
3201 CGGCAAGGCA GAAGCCAAAA AACAGGCGGA AAAAGACAAC GCGCAAAGCC
3251 TTGACGCGCT GATTGCGGCC GGGCGCGATG CCGTCGAAAA GACAGAAAGC
3301 GTTGCCGAAC CGGCCCGGCA GGCAGGCGGG GAAAATGTCG GCATTATGCA
3351 GGCGGAGGAA GAGAAAAAAC GGGTGCAGGC GGATAAAGAC ACCGCCTTGG
3401 CGAAACAGCG CGAAGCGGAA ACCCGGCCGG CTACCACCGC CTTCCCCCGC
3451 GCCCGCCGCG CCCGCCGGGA TTTGCCGCAA CTGCAACCCC AACCGCAGCC
3501 CCAACCGCAG CGCGACCTGA TCAGCCGTTA TGCCAATAGC GGTTTGAGTG
3551 AATTTTCCGC CACGCTCAAC AGCGTTTTCG CCGTACAGGA CGAATTAGAC
3601 CGCGTATTTG CCGAAGACCG CCGCAACGCC GTTTGGACAA GCGGCATCCG
3651 GGACACCAAA CACTACCGTT CGCAAGATTT CCGCGCCTAC CGCCAACAAA
3701 CCGACCTGCG CCAAATCGGT ATGCAGAAAA ACCTCGGCAG CGGGCGCGTC
3751 GGCATCCTGT TTTCGCACAA CCGGACCGAA AACACCTTCG ACGACGGCAT
3801 CGGCAACTCG GCACGGCTTG CCCACGGCGC CGTTTTCGGG CAATACGGCA
3851 TCGACAGGTT CTACATCGGC ATCAGCGCGG GCGCGGGTTT TAGCAGCGGC
3901 AGCCTTTCAG ACGGCATCGG AGGCAAAATC CGCCGCCGCG TGCTGCATTA
3951 CGGCATTCAG GCACGATACC GCGCCGGTTT CGGCGGATTC GGCATCGAAC
4001 CGCACATCGG CGCAACGCGC TATTTCGTCC AAAAAGCGGA TTACCGCTAC
4051 GAAAACGTCA ATATCGCCAC CCCCGGCCTT GCATTCAACC GCTACCGCGC
4101 GGGCATTAAG GCAGATTATT CATTCAAACC GGCGCAACAC ATTTCCATCA
4151 CGCCTTATTT GAGCCTGTCC TATACCGATG CCGCTTCGGG CAAAGTCCGA
4201 ACACGCGTCA ATACCGCCGT ATTGGCTCAG GATTTCGGCA AAACCCGCAG
4251 TGCGGAATGG GGCGTAAACG CCGAAATCAA AGGTTTCACG CTGTCCCTCC
4301 ACGCTGCCGC CGCCAAAGGC CCGCAACTGG AAGCGCAACA CAGCGCGGGC
4351 ATCAAATTAG GCTACCGCTG GTAA
This corresponds to the amino acid sequence <SEQ ID 650; ORF1-1>:
1 MKTTDKRTTE THRKAPKTGR IRFSPAYLAI CLSFGILPQA WAGHTYFGIN
51 YQYYRDFAEN KGKFAVGAKD IEVYNKKGEL VGKSMTKAPM IDFSVVSRNG
101 VAALVGDQYI VSVAHNGGYN NVDFGAEGRN PDQHRFTYKI VKRNNYKAGT
151 KGHPYGGDYH MPRLHKFVTD AEPVEMTSYM DGRKYIDQNN YPDRVRIGAG
201 RQYWRSDEDE PNNRESSYHI ASAYSWLVGG NTFAQNGSGG GTVNLGSEKI
251 KHSPYGFLPT GGSFGDSGSP MFIYDAQKQK WLINGVLQTG NPYIGKSNGF
301 QLVRKDWFYD EIFAGDTHSV FYEPRQNGKY SFNDDNNGTG KINAKHEHNS
351 LPNRLKTRTV QLFNVSLSET AREPVYHAAG GVNSYRPRLN NGENISFIDE
401 GKGELILTSN INQGAGGLYF QGDFTVSPEN NETWQGAGVH ISEDSTVTWK
451 VNGVANDRLS KIGKGTLHVQ AKGENQGSIS VGDGTVILDQ QADDKGKKQA
501 FSEIGLVSGR GTVQLNADNQ FNPDKLYFGF RGGRLDLNGH SLSFHRIQNT
551 DEGAMIVNHN QDKESTVTIT GNKDIATTGN NNSLDSKKEI AYNGWFGEKD
601 TTKTNGRLNL VYQPAAEDRT LLLSGGTNLN GNITQTNGKL FFSGRPTPHA
651 YNHLNDHWSQ KEGIPRGEIV WDNDWINRTF KAENFQIKGG QAVVSRNVAK
701 VKGDWHLSNH AQAVFGVAPH QSHTICTRSD WTGLTNCVEK TITDDKVIAS
751 LTKTDISGNV DLADHAHLNL TGLATLNGNL SANGDTRYTV SHNATQNGNL
801 SLVGNAQATF NQATLNGNTS ASGNASFNLS DHAVQNGSLT LSGNAKANVS
851 HSALNGNVSL ADKAVFHFES SRFTGQISGG KDTALHLKDS EWTLPSGTEL
901 GNLNLDNATI TLNSAYRHDA AGAQTGSATD APRRRSRRSR RSLLSVTPPT
951 SVESRFNTLT VNGKLNGQGT FRFMSELFGY RSDKLKLAES SEGTYTLAVN
1001 NTGNEPASLE QLTVVEGKDN KPLSENLNFT LQNEHVDAGA WRYQLIRKDG
1051 EFRLHNPVKE QELSDKLGKA EAKKQAEKDN AQSLDALIAA GRDAVEKTES
1101 VAEPARQAGG ENVGIMQAEE EKKRVQADKD TALAKQREAE TRPATTAFPR
1151 ARRARRDLPQ LQPQPQPQPQ RDLISRYANS GLSEFSATLN SVFAVQDELD
1201 RVFAEDRRNA VWTSGIRDTK HYRSQDFRAY RQQTDLRQIG MQKNLGSGRV
1251 GILFSHNRTE NTFDDGIGNS ARLAHGAVFG QYGIDRFYIG ISAGAGFSSG
1301 SLSDGIGGKI RRRVLHYGIQ ARYRAGFGGF GIEPHIGATR YFVQKADYRY
1351 ENVNIATPGL AFNRYRAGIK ADYSFKPAQH ISITPYLSLS YTDAASGKVR
1401 TRVNTAVLAQ DFGKTRSAEW GVNAEIKGFT LSLHAAAAKG PQLEAQHSAG
1451 IKLGYRW*
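Computer analysis of these sequences gave the following results:
Homology with a predicted ORF from N. meningitidis (strain A)
ORF1 shows 57.8% identity over a 1456 aa overlap with an ORF (ORF1a) from strain A of N. meningitidis: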
10 20 30 40 50 60
orf1.pep MKTTDKRTTETHRKAPKTGRIRFXAAYLAICLSFGILPQAWAGHTYFGINYQYYRDFAEN
||||||||||||||||||||||| |||||||||||||||||||||||||||||||||||
orf1a MKTTDKRTTETHRKAPKTGRIRFSPAYLAICLSFGILPQAWAGHTYFGINYQYYRDFAEN
10 20 30 40 50 60
70 80 90 100 110 120
orf1.pep KGKFAVGAKDIEVYNKKGELVGKSMTKAPMIDFSVVSRNGVAALVGVQYIVSVAHNGGYN
|||||||||||||||||||||||||||||||||||||||||||||| |||||||||||||
orf1a KGKFAVGAKDIEVYNKKGELVGKSMTKAPMIDFSVVSRNGVAALVGDQYIVSVAHNGGYN
70 80 90 100 110 120
130 140 150 160 170 180
orf1.pep NVDFGAEGXNIXDQXRXTYKIVKRNNYKAGTKGHPYGGDYHMPRLHKXVTDAEPVEMTSY
|||||||||| || | :|:|||||||| :: |||:|| ||||||| |||||||||||
orf1a NVDFGAEGXN-PDQHRFSYQIVKRNNYKPDNS-HPYNGDXHMPRLHKFVTDAEPVEMTSD
130 140 150 160 170
190 200 210
orf1.pep MDGRKYIDQNNYPDRVRIGAGRQYWRSDEDEP---------------------NN-----
| | | |:::||:|||||:|::||| |:|: ||
orf1a MRGNTYSDKEKYPERVRIGSGHHYWRYDDDKHGDLSYSGAWLIGGNTHMQGWGNNGVXSL
180 190 200 210 220 230
220 230 240 250 260
orf1.pep ----RESSYH----IA-----SGSPMFIYDAQKQKWLINGVLQTGNPYIGKSNGFQLVRK
|::: : || ||||||||| ::|||:||||||| || |: |||||:||
orf1a SGDVRHANDYGPMPIAGAAGDSGSPMFIYDKTNNKWLLNGVLQTGYPYSGRENGFQLIRK
240 250 260 270 280 290
270 280 290 300 310 320
orf1.pep DWFYDEIFAGDTHSVFYEPRQNGKYSFNDDNNGTGKINAKHEHNSLPNRLKTRTVQLFNV
|||||:|: ||||:| :|||:||::||:::||||| :: :|: | | :||::||:||:
orf1a DWFYDDIYRGDTHTVXFEPRSNGHFSFTSNNNGTGTVTETNEKVSNP-KLKVQTVRLFDE
300 310 320 330 340 350
330 340 350 360 370 380
orf1.pep SLSETAREPVYHAAGGVNSYRPRLNNGENISFIDEGKGELILTSNINQGAGGLYFQGDFT
||:|||:|||| ||||||:||||||||||:|||| |:|:|||::|||||||||||:||||
orf1a SLNETDKEPVY-AAGGVNQYRPRLNNGENLSFIDYGNGKLILSNNINQGAGGLYFEGDFT
360 370 380 390 400 410
390 400 410 420 430
orf1.pep VSPENNETWQGAGVHISEDSTVTWKVNGVANDRLSKIGKGTL------------------
||||||||||||||||||||||||||||||||||||||||||
orf1a VSPENNETWQGAGVHISEDSTVTWKVNGVANDRLSKIGKGTLHVQAKGENQGSISVGDGT
420 430 440 450 460 470
orf1.pep ------------------------------------------------------------
orf1a VILDQQADDKGKKQAFSEIGLXSGRGTVQLNADNQFNPDKLYFGFRGGRLDLNGHSLSFH
480 490 500 510 520 530
orf1.pep ------------------------------------------------------------
orf1a RIQNTDEGAMIXXHNATTTSTVTITGNESITQPSGKNINRLNYSKEIAYNGWFGEKDTTK
540 550 560 570 580 590
orf1.pep ------------------------------------------------------------
orf1a TNGRLNLVYQPAAEDRTXLLSGGTNLNGNITQTNGKLFFSGRPTPHAYNHLGSGWSKMEG
600 610 620 630 640 650
orf1.pep ------------------------------------------------------------
orf1a IPQGEIVWDNDWIXRTFKAENFHIQGGQAVISRNVAKVEGDXHLSNHAQAVFGVAPHQSH
660 670 680 690 700 710
440 450 460 470 480
orf1.pep ----------------XXXXXDKVTASLTKTDISGNVDLADHAHLNLTGLATLNGNLSAN
: || : ||| ||||||| || | | |:| |:| ||||||
orf1a TICTRSDWTGLTNCVEXXITDDKVIASLTKTDXSGXVXLXXXXXXXLXGXAXLXGNLSAN
720 730 740 750 760 770
490 500 510 520 530 540
orf1.pep GDTRYTVSHNATQNGNXSLVXNAQATFNQATLNGNTSASGNASFNLSDHAVQNGSLTLSG
|||||||||||||||| ||| ||||||||||||||:| |||||||||::|:||||||||
orf1a GDTRYTVSHNATQNGNLSLVGNAQATFNQATLNGNXSXSGNASFNLSNNAAQNGSLTLSD
780 790 800 810 820 830
550 560 570 580 590 600
orf1.pep NAKANVSHSALNGNVSLADKAVFHFESSRFTGQISGGKDTALHLKDSEWTLPSGXELGNL
||||||||||||||||||||||||||:||||||:||:| |||||||||||||||:|||||
orf1a NAKANVSHSALNGNVSLADKAVFHFENSRFTGQLSGSKXTALHLKDSEWTLPSGTELGNL
840 850 860 870 880 890
610 620 630 640 650 660
orf1.pep NLDNATITLNSAYRHDAAGAQTGSATDAPRRRSRRSRRSLLXVTPPTSVESRFNTLTVNG
||||||||||||||||||||||| ::|:|||||||| || ||||||||||||||||||
orf1a NLDNATITLNSAYRHDAAGAQTGXVSDTPRRRSRRS---LLSVTPPTSVESRFNTLTVNG
900 910 920 930 940 950
670 680 690 700 710 720
orf1.pep KLNGQGTFRFMSELFGYRSDKLKLAESSEGTYTLAVNNTGNEPASLEQLTVVEGKDNKPL
||| |||||||||||||||||||||||||||||||||||||||:||:|||||||||||||
orf1a KLNXQGTFRFMSELFGYRSDKLKLAESSEGTYTLAVNNTGNEPVSLDQLTVVEGKDNKPL
960 970 980 990 1000 1010
730 740 750
orf1.pep SENLNFTLQNEHVDAGAW------------------------------------------
||||||||||||||||||
orf1a SENLNFTLQNEHVDAGAWRYQLIRKDGEFRLHNPVKEQELSDKLGKAEAKKQAEKDNAQS
1020 1030 1040 1050 1060 1070
orf1.pep ------------------------------------------------------------
orf1a LDALIAAGRDAAEKTESVAEPARXAGGENVGIMQAEEEKKRVQADKDSALAKQREAETRP
1080 1090 1100 1110 1120 1130
760
orf1.pep ---------------------------------------------------------LDR
|||
orf1a XTTAFPRARXARRDLPQPQPQPQPQPQPQRDLXSRYANSGLSEFSATLNSVFAVQDELDR
1140 1150 1160 1170 1180 1190
770 780 790 800 810 820
orf1.pep VFAEDRRNAVWTSGIRDTKHYRSQDFRAYRQQTDLRQIGMQKNLGSGRVGILFSHNRTEN
||||||||||||| || |||||||||||||||||||||||||||||||||||||||||||
orf1a VFAEDRRNAVWTSXIRXTKHYRSQDFRAYRQQTDLRQIGMQKNLGSGRVGILFSHNRTEN
1200 1210 1220 1230 1240 1250
830 840 850 860 870 880
orf1.pep TFDDGIGNSARLAHGAVFGQYGIDRFYIGISAGAGFSSGSLSDGIGXKXRRRVLHYGIQA
:|||||||||||||||||||||| || ||||:||||||| |||||||| |||||||||||
orf1a XFDDGIGNSARLAHGAVFGQYGIGRFDIGISTGAGFSSGXLSDGIGGKIRRRVLHYGIQA
1260 1270 1280 1290 1300 1310
890 900 910 920 930 940
orf1.pep RYRAGFGGFGIEPHIGATRYFVQKADYRYENVNIATPGLAFNRYRAGIKADYSFKPAQHI
|||||||||||||:|||||||||||||||||||||||||||||||||||||||||||||
orf1a RYRAGFGGFGIEPYIGATRYFVQKADYRYENVNIATPGLAFNRYRAGIKADYSFKPAQHX
1320 1330 1340 1350 1360 1370
950 960 970 980 990 1000
orf1.pep SITPYLSLSYTDAASGKVRTRVNTAVLAQDFGKTRSAEWGVNAEIKGFTLSLHAAAAKGP
||||| ||||||||||||||||||||||||||||||||||||||||||||| ||||||||
orf1a SITPYXSLSYTDAASGKVRTRVNTAVLAQDFGKTRSAEWGVNAEIKGFTLSXHAAAAKGP
1380 1390 1400 1410 1420 1430
1010 1020
orf1.pep QLEAQHSAGIKLGYRWX
|||||||||||||||||
orf1a QLEAQHSAGIKLGYRWX
1440 1450
The complete length ORF1a nucleotide sequence <SEQ ID 651> is:
1 ATGAAAACAA CCGACAAACG GACAACCGAA ACACACCGCA AAGCCCCGAA
51 AACCGGCCGC ATCCGCTTCT CGCCTGCTTA CTTAGCCATA TGCCTGTCGT
101 TCGGCATTCT TCCCCAAGCT TGGGCGGGAC ACACTTATTT CGGCATCAAC
151 TACCAATACT ATCGCGACTT TGCCGAAAAT AAAGGCAAGT TTGCAGTCGG
201 GGCGAAAGAT ATTGAGGTNT ACAACAAAAA AGGGGAGTTG GTCGGCAAAT
251 CAATGACAAA AGCCCCGATG ATTGATTTTT CTGTGGTGTC GCGTAACGGC
301 GTGGCGGCAT TGGTGGGCGA TCAATATATT GTGAGCGTGG CACATAACGG
351 CGGCTATAAC AACGTTGATT TTGGTGCGGA AGGAAGNAAT CCCGATCAGC
401 ACCGTTTTTC TTACCAAATT GTGAAAAGAA ATAATTATAA GCCTGACAAT
451 TCACACCCTT ACAACGGCGA TTANCATATG CCGCGTTTGC ATAAATTTGT
501 CACAGATGCA GAACCTGTCG AAATGACGAG TGACATGAGG GGGAATACCT
551 ATTCCGATAA AGAAAAATAT CCCGAGCGTG TCCGCATCGG CTCAGGACAC
601 CACTATTGGC GTTATGATGA TGACAAACAC GGCGATTTAT CCTACTCCGG
651 CGCATGGTTA ATTGGCGGCA ATACACATAT GCAGGGTTGG GGAAATAATG
701 GCGTANTTAG TTTGAGCGGC GATGTGCGCC ATGCCAACGA CTATGGCCCT
751 ATGCCGATTG CAGGTGCGGC AGGCGACAGC GGTTCGCCAA TGTTTATTTA
801 TGACAAAACA AACAATAAAT GGCTGCTCAA CGGAGTTTTA CAAACCGGCT
851 ACCCTTATTC CGGCAGGGAA AACGGTTTCC AGCTGATACG CAAAGATTGG
901 TTCTACGATG ACATTTACAG AGGCGATACA CATACCGTCT NTTTTGAACC
951 GCGCAGTAAC GGACATTTTT CCTTTACATC CAACAACAAC GGTACGGGTA
1001 CGGTAACAGA AACCAACGAA AAGGTNTCCA ATCCAAAGCT TAAAGTACAG
1051 ACAGTCCGAC TGTTTGACGA ATCTTTGAAT GAAACTGATA AAGAACCAGT
1101 TTACGCGGCA GGGGGTGTTA ATCAGTACCG TCCAAGGTTA AACAACGGTG
1151 AAAACCTTTC TTTTATCGAT TACGGCAACG GCAAACTCAT CTTATCAAAC
1201 AACATCAACC AAGGCGCGGG CGGTTTGTAT TTTGAAGGTG ATTTTACGGT
1251 CTCGCCTGAA AACAACGAAA CGTGGCAAGG CGCGGGCGTT CATATCAGTG
1301 AAGACAGTAC CGTTACTTGG AAAGTAAACG GCGTGGCAAA CGACCGCCTG
1351 TCCAAAATCG GCAAAGGCAC GCTGCACGTT CAAGCCAAAG GGGAAAACCA
1401 AGGCTCGATC AGCGTGGGCG ACGGTACAGT CATTTTGGAT CAGCAGGCAG
1451 ACGATAAAGG CAAAAAACAA GCCTTTAGTG AAATCGGCTT GNTCAGCGGC
1501 AGGGGTACGG TGCAACTGAA TGCCGATAAT CAGTTCAACC CCGACAAACT
1551 CTATTTCGGC TTTCGCGGCG GACGTTTGGA TTTAAACGGG CATTCGCTTT
1601 CGTTCCACCG TATTCAAAAT ACCGATGAAG GGGCGATGAT TGNCNATCAT
1651 AATGCCACAA CAACATCCAC CGTTACCATT ACAGGGAATG AAAGTATTAC
1701 ACAACCGAGT GGTAAGAATA TCAATAGACT TAATTACAGC AAAGAAATTG
1751 CCTACAACGG TTGGTTTGGC GAGAAAGATA CGACCAAAAC GAACGGGCGG
1801 CTCAACCTTG TTTACCAGCC CGCCGCAGAA GACCGCACCC NGCTGCTTTC
1851 CGGCGGAACA AATTTAAACG GCAACATCAC GCAAACAAAC GGCAAACTGT
1901 TTTTCAGCGG CAGACCGACA CCGCACGCCT ACAATCATTT AGGAAGCGGG
1951 TGGTCAAAAA TGGAAGGTAT CCCACAAGGA GAAATCGTGT GGGACAACGA
2001 CTGGATCNAC CGCACGTTTA AAGCGGAAAA TTTCCATATT CAGGGCGGGC
2051 AGGCGGTGAT TTCCCGCAAT GTTGCCAAAG TGGAAGGCGA TTGNCATTTG
2101 AGCAATCACG CCCAAGCAGT TTTTGGTGTC GCACCGCATC AAAGCCATAC
2151 AATCTGTACA CGTTCGGACT GGACNGGTCT GACAAATTGT GTCGAANAAA
2201 NCATTACCGA CGATAAAGTG ATTGCTTCAT TGACTAAGAC NGACNTNAGC
2251 GGCANTGTNA GNCTNNCCNA TNACGNTNNT TNAAANCTCN CNGGGCNTGC
2301 NNCACTNAAN GGCAATCTTA GTGCAAATGG CGATACACGT TATACAGTCA
2351 GCCACAACGC CACCCAAAAC GGCAACCTTA GCCTCGTGGG CAATGCCCAA
2401 GCAACATTTA ATCAAGCCAC ATTAAACGGC AACNCATCGG NTTCGGGCAA
2451 TGCTTCATTT AATCTAAGCA ACAACGCCGC ACAAAACGGC AGTCTGACGC
2501 TTTCCGACAA CGCTAAGGCA AACGTAAGCC ATTCCGCACT CAACGGCAAT
2551 GTCTCCCTAG CCGATAAGGC AGTATTCCAT TTTGAAAACA GCCGCTTTAC
2601 CGGACAACTC AGCGGCAGCA AGGANACAGC ATTACACTTA AAAGACAGCG
2651 AATGGACGCT GCCGTCAGGC ACGGAATTAG GCAATTTAAA CCTTGACAAC
2701 GCCACCATTA CACTCAATTC CGCCTATCGC CACGATGCTG CAGGCGCGCA
2751 AACCGGCAGN GTGTCAGACA CGCCGCGCCG CCGTTCGCGC CGTTCCCTAT
2801 TATCCGTTAC ACCGCCAACT TCGGTAGAAT CCCGTTTCAA CACGCTGACG
2851 GTAAACGGCA AATTGAACNG TCAAGGAACA TTCCGCTTTA TGTCGGAACT
2901 CTTCGGCTAC CGAAGCGACA AATTGAAGCT GGCGGAAAGT TCCGAAGGNA
2951 CTTACACCTT GGCGGTCAAC AATACCGGCA ACGAACCCGT AAGCCTCGAT
3001 CAATTGACGG TAGTGGAAGG GAAAGACAAC AAACCGCTGT CCGAAAACCT
3051 TAATTTCACC CTGCAAAACG AACACGTCGA TGCCGGCGCG TGGCGTTACC
3101 AACTCATCCG CAAAGACGGC GAGTTCCGCC TGCATAATCC GGTCAAAGAA
3151 CAAGAGCTTT CCGACAAACT CGGCAAGGCA GAAGCCAAAA AACAGGCGGA
3201 AAAAGACAAC GCGCAAAGCC TTGACGCGCT GATTGCGGCC GGGCGCGATG
3251 CCGCCGAAAA GACAGAAAGC GTTGCCGAAC CGGCCCGGCN GGCAGGCGGG
3301 GAAAATGTCG GCATTATGCA GGCGGAGGAA GAGAAAAAAC GGGTGCAGGC
3351 GGATAAAGAC AGCGCNTTGG CGAAACAGCG CGAAGCGGAA ACCCGGCCGG
3401 NTACCACCGC CTTCCCCCGC GCCCGCNGCG CCCGCCGGGA TTTGCCGCAA
3451 CCGCAGCCCC AACCGCAACC TCAACCCCAA CCGCAGCGCG ACCTGATNAG
3501 CCGTTATGCC AATAGCGGTT TGAGTGAATT TTCCGCCACG CTCAACAGCG
3551 TTTTCGCCGT ACAGGACGAA TTGGACCGCG TGTTTGCCGA AGACCGCCGC
3601 AACGCNGTTT GGACAAGCNG CATCCGGNAC ACCAAACACT ACCGTTCGCA
3651 AGATTTCCGC GCCTACCGCC AACAAACCGA CCTGCGCCAA ATCGGTATGC
3701 AGAAAAACCT CGGCAGCGGG CGCGTCGGCA TCCTGTTTTC GCACAACCGG
3751 ACCGAAAACA NCTTCGACGA CGGCATCGGC AACTCGGCAC GGCTTGCCCA
3801 CGGCGCCGTT TTCGGGCAAT ACGGCATCGG CAGGTTCGAC ATCGGCATCA
3851 GCACGGGCGC GGGTTTTAGC AGCGGCANTC TNTCAGACGG CATCGGAGGC
3901 AAAATCCGCC GCCGCGTGCT GCATTACGGC ATTCAGGCAC GATACCGCGC
3951 CGGTTTCGGC GGATTCGGCA TCGAACCGTA CATCGGCGCA ACGCGCTATT
4001 TCGTCCAAAA AGCGGATTAC CGCTACGAAA ACGTCAATAT CGCCACCCCC
4051 GGTCTTGCGT TCAACCGNTA CCGNGCGGGC ATTAAGGCAG ATTATTCATT
4101 CAAACCGGCG CAACACATNT CCATCACNCC TTATTTNAGC CTGTCCTATA
4151 CCGATGCCGC TTCGGGCAAA GTCCGAACAC GCGTCAATAC CGCNGTATTG
4201 GCTCAGGATT TCGGCAAAAC CCGCAGTGCG GAATGGGGCG TAAACGCCGA
4251 AATCAAAGGT TTCACGCTGT CCNTCCACGC TGCCGCCGCC AAAGGNCCGC
4301 AACTGGAAGC GCAACACAGC GCGGGCATCA AATTAGGCTA CCGCTGGTAA
This encodes a protein having amino acid sequence <SEQ ID 652>:
1 MKTTDKRTTE THRKAPKTGR IRFSPAYLAI CLSFGILPQA WAGHTYFGIN
51 YQYYRDFAEN KGKFAVGAKD IEVYNKKGEL VGKSMTKAPM IDFSVVSRNG
101 VAALVGDQYI VSVAHNGGYN NVDFGAEGXN PDQHRFSYQI VKRNNYKPDN
151 SHPYNGDXHM PRLHKFVTDA EPVEMTSDMR GNTYSDKEKY PERVRIGSGH
201 HYWRYDDDKH GDLSYSGAWL IGGNTHMQGW GNNGVXSLSG DVRHANDYGP
251 MPIAGAAGDS GSPMFIYDKT NNKWLLNGVL QTGYPYSGRE NGFQLIRKDW
301 FYDDIYRGDT HTVXFEPRSN GHFSFTSNNN GTGTVTETNE KVSNPKLKVQ
351 TVRLFDESLN ETDKEPVYAA GGVNQYRPRL NNGENLSFID YGNGKLILSN
401 NINQGAGGLY FEGDFTVSPE NNETWQGAGV HISEDSTVTW KVNGVANDRL
451 SKIGKGTLHV QAKGENQGSI SVGDGTVILD QQADDKGKKQ AFSEIGLXSG
501 RGTVQLNADN QFNPDKLYFG FRGGRLDLNG HSLSFHRIQN TDEGAMIXXH
551 NATTTSTVTI TGNESITQPS GKNINRLNYS KEIAYNGWFG EKDTTKTNGR
601 LNLVYQPAAE DRTXLLSGGT NLNGNITQTN GKLFFSGRPT PHAYNHLGSG
651 WSKMEGIPQG EIVWDNDWIX RTFKAENFHI QGGQAVISRN VAKVEGDXHL
701 SNHAQAVFGV APHQSHTICT RSDWTGLTNC VEXXITDDKV IASLTKTDXS
751 GXVXLXXXXX XXLXGXAXLX GNLSANGDTR YTVSHNATQN GNLSLVGNAQ
801 ATFNQATLNG NXSXSGNASF NLSNNAAQNG SLTLSDNAKA NVSHSALNGN
851 VSLADKAVFH FENSRFTGQL SGSKXTALHL KDSEWTLPSG TELGNLNLDN
901 ATITLNSAYR HDAAGAQTGX VSDTPRRRSR RSLLSVTPPT SVESRFNTLT
951 VNGKLNXQGT FRFMSELFGY RSDKLKLAES SEGTYTLAVN NTGNEPVSLD
1001 QLTVVEGKDN KPLSENLNFT LQNEHVDAGA WRYQLIRKDG EFRLHNPVKE
1051 QELSDKLGKA EAKKQAEKDN AQSLDALIAA GRDAAEKTES VAEPARXAGG
1101 ENVGIMQAEE EKKRVQADKD SALAKQREAE TRPXTTAFPR ARXARRDLPQ
1151 PQPQPQPQPQ PQRDLXSRYA NSGLSEFSAT LNSVFAVQDE LDRVFAEDRR
1201 NAVWTSXIRX TKHYRSQDFR AYRQQTDLRQ IGMQKNLGSG RVGILFSHNR
1251 TENXFDDGIG NSARLAHGAV FGQYGIGRFD IGISTGAGFS SGXLSDGIGG
1301 KIRRRVLHYG IQARYRAGFG GFGIEPYIGA TRYFVQKADY RYENVNIATP
1351 GLAFNRYRAG IKADYSFKPA QHXSITPYXS LSYTDAASGK VRTRVNTAVL
1401 AQDFGKTRSA EWGVNAEIKG FTLSXHAAAA KGPQLEAQHS AGIKLGYRW*
The transmembrane region is underlined.
ORF1-1 shows 86.3% identity over a 1462 aa overlap with ORF1a:
10 20 30 40 50 60
orf1a.pep MKTTDKRTTETHRKAPKTGRIRFSPAYLAICLSFGILPQAWAGHTYFGINYQYYRDFAEN
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf1-1 MKTTDKRTTETHRKAPKTGRIRFSPAYLAICLSFGILPQAWAGHTYFGINYQYYRDFAEN
10 20 30 40 50 60
70 80 90 100 110 120
orf1a.pep KGKFAVGAKDIEVYNKKGELVGKSMTKAPMIDFSVVSRNGVAALVGDQYIVSVAHNGGYN
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf1-1 KGKFAVGAKDIEVYNKKGELVGKSMTKAPMIDFSVVSRNGVAALVGDQYIVSVAHNGGYN
70 80 90 100 110 120
130 140 150 160 170 179
orf1a.pep NVDFGAEGXNPDQHRFSYQIVKRNNYKPDNS-HPYNGDXHMPRLHKFVTDAEPVEMTSDM
|||||||| |||||||:|:|||||||| :: |||:|| ||||||||||||||||||| |
orf1-1 NVDFGAEGRNPDQHRFTYKIVKRNNYKAGTKGHPYGGDYHMPRLHKFVTDAEPVEMTSYM
130 140 150 160 170 180
180 190 200 210 220 230
orf1a.pep RGNTYSDKEKYPERVRIGSGHHYWRYDDDKHGDL--SYSGA----WLIGGNTHMQGWGNN
| | |:::||:|||||:|::||| |:|: :: || | ||:|||| |: :::
orf1-1 DGRKYIDQNNYPDRVRIGAGRQYWRSDEDEPNNRESSYHIASAYSWLVGGNTFAQNGSGG
190 200 210 220 230 240
240 250 260 270 280 290
orf1a.pep GVXSLSGD-VRHANDYGPMPIAGAAGDSGSPMFIYDKTNNKWLLNGVLQTGYPYSGRENG
|: :|::: :: || :| : ||||||||||||| ::|||:||||||| || |: ||
orf1-1 GTVNLGSEKIKHS-PYGFLPTGGSFGDSGSPMFIYDAQKQKWLINGVLQTGNPYIGKSNG
250 260 270 280 290
300 310 320 330 340 350
orf1a.pep FQLIRKDWFYDDIYRGDTHTVXFEPRSNGHFSFTSNNNGTGTVTETNEKVSNP-KLKVQT
|||:|||||||:|: ||||:| :|||:||::||:::||||| :: :|: | | :||::|
orf1-1 FQLVRKDWFYDEIFAGDTHSVFYEPRQNGKYSFNDDNNGTGKINAKHEHNSLPNRLKTRT
300 310 320 330 340 350
360 370 380 390 400 410
orf1a.pep VRLFDESLNETDKEPVY-AAGGVNQYRPRLNNGENLSFIDYGNGKLILSNNINQGAGGLY
|:||: ||:|| :|||| ||||||:||||||||||:||||||:|:|||::||||||||||
orf1-1 VQLFNVSLSETAREPVYHAAGGVNSYRPRLNNGENISFIDEGKGELILTSNINQGAGGLY
360 370 380 390 400 410
420 430 440 450 460 470
orf1a.pep FEGDFTVSPENNETWQGAGVHISEDSTVTWKVNGVANDRLSKIGKGTLHVQAKGENQGSI
|:||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf1-1 FQGDFTVSPENNETWQGAGVHISEDSTVTWKVNGVANDRLSKIGKGTLHVQAKGENQGSI
420 430 440 450 460 470
480 490 500 510 520 530
orf1a.pep SVGDGTVILDQQADDKGKKQAFSEIGLXSGRGTVQLNADNQFNPDKLYFGFRGGRLDLNG
||||||||||||||||||||||||||| ||||||||||||||||||||||||||||||||
orf1-1 SVGDGTVILDQQADDKGKKQAFSEIGLVSGRGTVQLNADNQFNPDKLYFGFRGGRLDLNG
480 490 500 510 520 530
540 550 560 570 580 590
orf1a.pep HSLSFHRIQNTDEGAMIXXHNATTTSTVTITGNESITQPSGKNINRLNYSKEIAYNGWFG
||||||||||||||||| || ||||||||::|: :|:| | |: :||||||||||
orf1-1 HSLSFHRIQNTDEGAMIVNHNQDKESTVTITGNKDIAT-TGNN-NSLDSKKEIAYNGWFG
540 550 560 570 580 590
600 610 620 630 640 650
orf1a.pep EKDTTKTNGRLNLVYQPAAEDRTXLLSGGTNLNGNITQTNGKLFFSGRPTPHAYNHLGSG
||||||||||||||||||||||| |||||||||||||||||||||||||||||||||::
orf1-1 EKDTTKTNGRLNLVYQPAAEDRTLLLSGGTNLNGNITQTNGKLFFSGRPTPHAYNHLNDH
600 610 620 630 640 650
660 670 680 690 700 710
orf1a.pep WSKMEGIPQGEIVWDNDWIXRTFKAENFHIQGGQAVISRNVAKVEGDXHLSNHAQAVFGV
||: ||||:|||||||||| ||||||||:|:|||||:|||||||:|| ||||||||||||
orf1-1 WSQKEGIPRGEIVWDNDWINRTFKAENFQIKGGQAVVSRNVAKVKGDWHLSNHAQAVFGV
660 670 680 690 700 710
720 730 740 750 760 770
orf1a.pep APHQSHTICTRSDWTGLTNCVEXXITDDKVIASLTKTDXSGXVXLXXXXXXXLXGXAXLX
|||||||||||||||||||||| :|||||||||||||| || | | |:| |:|
orf1-1 APHQSHTICTRSDWTGLTNCVEKTITDDKVIASLTKTDISGNVDLADHAHLNLTGLATLN
720 730 740 750 760 770
780 790 800 810 820 830
orf1a.pep GNLSANGDTRYTVSHNATQNGNLSLVGNAQATFNQATLNGNXSXSGNASFNLSNNAAQNG
|||||||||||||||||||||||||||||||||||||||||:| |||||||||::|:|||
orf1-1 GNLSANGDTRYTVSHNATQNGNLSLVGNAQATFNQATLNGNTSASGNASFNLSDHAVQNG
780 790 800 810 820 830
840 850 860 870 880 890
orf1a.pep SLTLSDNAKANVSHSALNGNVSLADKAVFHFENSRFTGQLSGSKXTALHLKDSEWTLPSG
|||||:||||||||||||||||||||||||||:||||||:||:| |||||||||||||||
orf1-1 SLTLSGNAKANVSHSALNGNVSLADKAVFHFESSRFTGQISGGKDTALHLKDSEWTLPSG
840 850 860 870 880 890
900 910 920 930 940
orf1a.pep TELGNLNLDNATITLNSAYRHDAAGAQTGXVSDTPRRRSRRS---LLSVTPPTSVESRFN
||||||||||||||||||||||||||||| ::|:|||||||| |||||||||||||||
orf1-1 TELGNLNLDNATITLNSAYRHDAAGAQTGSATDAPRRRSRRSRRSLLSVTPPTSVESRFN
900 910 920 930 940 950
950 960 970 980 990 1000
orf1a.pep TLTVNGKLNXQGTFRFMSELFGYRSDKLKLAESSEGTYTLAVNNTGNEPVSLDQLTVVEG
||||||||| |||||||||||||||||||||||||||||||||||||||:||:|||||||
orf1-1 TLTVNGKLNGQGTFRFMSELFGYRSDKLKLAESSEGTYTLAVNNTGNEPASLEQLTVVEG
960 970 980 990 1000 1010
1010 1020 1030 1040 1050 1060
orf1a.pep KDNKPLSENLNFTLQNEHVDAGAWRYQLIRKDGEFRLHNPVKEQELSDKLGKAEAKKQAE
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf1-1 KDNKPLSENLNFTLQNEHVDAGAWRYQLIRKDGEFRLHNPVKEQELSDKLGKAEAKKQAE
1020 1030 1040 1050 1060 1070
1070 1080 1090 1100 1110 1120
orf1a.pep KDNAQSLDALIAAGRDAAEKTESVAEPARXAGGENVGIMQAEEEKKRVQADKDSALAKQR
||||||||||||:|||||||||||||||| |||||||||||||||||||||||:||||||
orf1-1 KDNAQSLDALIAAGRDAVEKTESVAEPARQAGGENVGIMQAEEEKKRVQADKDTALAKQR
1080 1090 1100 1110 1120 1130
1130 1140 1150 1160 1170 1180
orf1a.pep EAETRPXTTAFPRARXARRDLPQPQPQPQPQPQPQRDLXSRYANSGLSEFSATLNSVFAV
|||||| |||||||| ||||||| |||||||| |||| |||||||||||||||||||||
orf1-1 EAETRPATTAFPRARRARRDLPQLQPQPQPQP--QRDLISRYANSGLSEFSATLNSVFAV
1140 1150 1160 1170 1180 1190
1190 1200 1210 1220 1230 1240
orf1a.pep QDELDRVFAEDRRNAVWTSXIRXTKHYRSQDFRAYRQQTDLRQIGMQKNLGSGRVGILFS
||||||||||||||||||| || |||||||||||||||||||||||||||||||||||||
orf1-1 QDELDRVFAEDRRNAVWTSGIRDTKHYRSQDFRAYRQQTDLRQIGMQKNLGSGRVGILFS
1200 1210 1220 1230 1240 1250
1250 1260 1270 1280 1290 1300
orf1a.pep HNRTENXFDDGIGNSARLAHGAVFGQYGIGRFDIGISTGAGFSSGXLSDGIGGKIRRRVL
||||||:|||||||||||||||||||||| || ||||:||||||| ||||||||||||||
orf1-1 HNRTENTFDDGIGNSARLAHGAVFGQYGIDRFYIGISAGAGFSSGSLSDGIGGKIRRRVL
1260 1270 1280 1290 1300 1310
1310 1320 1330 1340 1350 1360
orf1a.pep HYGIQARYRAGFGGFGIEPYIGATRYFVQKADYRYENVNIATPGLAFNRYRAGIKADYSF
|||||||||||||||||||:||||||||||||||||||||||||||||||||||||||||
orf1-1 HYGIQARYRAGFGGFGIEPHIGATRYFVQKADYRYENVNIATPGLAFNRYRAGIKADYSF
1320 1330 1340 1350 1360 1370
1370 1380 1390 1400 1410 1420
orf1a.pep KPAQHXSITPYXSLSYTDAASGKVRTRVNTAVLAQDFGKTRSAEWGVNAEIKGFTLSXHA
||||| ||||| ||||||||||||||||||||||||||||||||||||||||||||| ||
orf1-1 KPAQHISITPYLSLSYTDAASGKVRTRVNTAVLAQDFGKTRSAEWGVNAEIKGFTLSLHA
1380 1390 1400 1410 1420 1430
1430 1440 1450
orf1a.pep AAAKGPQLEAQHSAGIKLGYRWX
|||||||||||||||||||||||
orf1-1 AAAKGPQLEAQHSAGIKLGYRWX
1440 1450
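Homology with the adhesion and penetration protein hap precursor of Haemophilus influenzae (accession number P45387)
Amino acids 23-423 of ORF1 show 59% amino acid identity with the hap protein over a 450 amino acid overlap: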
orf1 23 FXAAYLAICLSFGILPQAWAGHTYFGINYQYYRDFAENKGKFAVGAKDIEVYNKKGELVG 82
F +L C+S GI QAWAGHTYFGI+YQYYRDFAENKGKF VGAK+IEVYNK+G+LVG
hap 6 FRLNFLTACVSLGIASQAWAGHTYFGIDYQYYRDFAENKGKFTVGAKNIEVYNKEGQLVG 65
orf1 83 KSMTKAPMIDFSVVSRNGVAALVGVQYIVSVAHNGGYNNVDFGAEGXNIXDQXRXTYKIV 142
SMTKAPMIDFSVVSRNGVAALVG QYIVSVAHNGGYN+VDFGAEG N DQ R TY+IV
hap 66 TSMTKAPMIDFSVVSRNGVAALVGDQYIVSVAHNGGYNDVDFGAEGRN-PDQHRFTYQIV 124
orf1 143 KRNNYKAGTKGHPYGGDYHMPRLHKXVTDAEPVEMTSYMDGRKYIDQNNYPDRVRIGAGR 202
KRNNY+A + HPY GDYHMPRLHK VT+AEPV MT+ MDG+ Y D+ NYP+RVRIG+GR
hap 125 KRNNYQAWERKHPYDGDYHMPRLHKFVTEAEPVGMTTNMDGKVYADRENYPERVRIGSGR 184
orf1 203 QYWRSDEDEPNNRESSYHIA---------------------------------------- 222
QYWR+D+DE N SSY+++
hap 185 QYWRTDKDEETNVHSSYYVSGAYRYLTAGNTHTQSGNGNGTVNLSGNVVSPNHYGPLPTG 244
orf1 223 -----SGSPMFIYDAQKQKWLINGVLQTGNPYIGKSNGFQLVRKDWFYDEIFAGDTHSVF 277
SGSPMFIYDA+K++WLIN VLQTG+P+ G+ NGFQL+R++WFY+E+ A DT SVF
hap 245 GSKGDSGSPMFIYDAKKKQWLINAVLQTGHPFFGRGNGFQLIREEWFYNEVLAVDTPSVF 304
orf1 278 --YEPRQNGKYSFNDDNNGTGKIN-AKHEHNSLPNRLKTRTVQLFNVSLSETAREPVYHA 334
Y P NG YSF +N+GTGK+ + + + + TV+LFN SL++TA+E V A
hap 305 QRYIPPINGHYSFVSNNDGTGKLTLTRPSKDGSKAKSEVGTVKLFNPSLNQTAKEHV-KA 363
orf1 335 AGGVNSYRPRLNNGENISFIDEGKGELILTSNINQGAGGLYFQGDFTV-SPENNETWQGA 393
A G N Y+PR+ G+NI D+GKG L + +NINQGAGGLYF+G+F V +NN TWQGA
hap 364 AAGYNIYQPRMEYGKNIYLGDQGKGTLTIENNINQGAGGLYFEGNFVVKGKQNNITWQGA 423
orf1 394 GVHISEDSTVTWKVNGVANDRLSKIGKGTL 423
GV I +D+TV WKV+ NDRLSKIG GTL
hap 424 GVSIGQDATVEWKVHNPENDRLSKIGIGTL 453
Amino acids 715-1011 of ORF1 show 50% amino acid identity with the hap protein over a 258 amino acid overlap:
orf1 41 DTRYTVSHNATQ-NGNXSLVXNAQATFNQ-ATLNGNTSASGNASFNLSDHAVQNGSLTLS 98
DT+ S TQ NG+ +L NA + A LNGN + ++ F LS++A Q G++ LS
hap 733 DTKVINSIPITQINGSINLTNNATVNIHGLAKLNGNVTLIDHSQFTLSNNATQTGNIKLS 792
orf1 99 GNAKANVSHSALNGNVSLADKAVFHFESSRFTGQISGGKDTALHLKDSEWTLPSGXELGN 158
+A A V+++ LNGNV L D A F ++S F QI G KDT + L+++ WT+PS L N
hap 793 NHANATVNNATLNGNVHLTDSAQFSLKNSHFWHQIQGDKDTTVTLENATWTMPSDTTLQN 852
orf1 159 LNLDNATITLNSAYRHDAAGAQTGSATDAPXXXXXXXXXXLLXVTPPTSVESRFNTLTVN 218
L L+N+T+TLNSAY + S+ +AP L T PTS E RFNTLTVN
hap 853 LTLNNSTVTLNSAY--------SASSNNAPRHRRS-----LETETTPTSAEHRFNTLTVN 899
orf1 219 GKLNGQGTFRFMSELFGYRSDKLKLAESSEGTYTLAVNNTGNEPASLEQLTVVEGKDNKP 278
GKL+GQGTF+F S LFGY+SDKLKL+ +EG YTL+V NTG EP +LEQLT++E DNKP
hap 900 GKLSGQGTFQFTSSLFGYKSDKLKLSNDAEGDYTLSVRNTGKEPVTLEQLTLIESLDNKP 959
orf1 279 LSENLNFTLQNEHVDAGA 296
LS+ L FTL+N+HVDAGA
hap 960 LSDKLKFTLENDHVDAGA 977
Amino acids 1192-1450 of ORF1 show 41% amino acid identity with the hap protein over a 259 amino acid overlap:
orf1 1 LDRVFAEDRRNAVWTSGIRDTKHYRSQDFRAYRQQTDLRQIGMQKNLGSGRVGILFSHNR 60
LDR+F + ++AVWT+ +D + Y S FRAY+Q+T+LRQIG+QK L +GR+G +FSH+R
hap 1135 LDRLFVDQAQSAVWTNIAQDKRRYDSDAFRAYQQKTNLRQIGVQKALANGRIGAVFSHSR 1194
orf1 61 TENTFDDGIGNSARLAHGAVFGQYGIDRFYXXXXXXXXXXXXXXXXXIGXKXRRRVLHYG 120
++NTFD+ + N A L + F QY K R+ ++YG
hap 1195 SDNTFDEQVKNHATLTMMSGFAQYQWGDLQFGVNVGTGISASKMAEEQSRKIHRKAINYG 1254
orf1 121 IQARYRAGFGGFGIEPHIGATRYFVQKADYRYENVNIATPGLAFNRYRAGIKADYSFKPA 180
+ A Y+ G GI+P+ G RYF+++ +Y+ E V + TP LAFNRY AGI+ DY+F P
hap 1255 VNASYQFRLGQLGIQPYFGVNRYFIERENYQSEEVRVKTPSLAFNRYNAGIRVDYTFTPT 1314
orf1 181 QHISITPYLSLSYTDAASGKVRTRVNTAVLAQDFGKTRSAEWGVNAEIKGFTLSLHAAAA 240
+IS+ PY ++Y D ++ V+T VN VL Q FG+ E G+ AEI F +S + +
hap 1315 DNISVKPYFFVNYVDVSNANVQTTVNLTVLQQPFGRYWQKEVGLKAEILHFQISAFISKS 1374
orf1 241 KGPQLEAQHSAGIKLGYRW 259
+G QL Q + G+KLGYRW
hap 1375 QGSQLGKQQNVGVKLGYRW 1393
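Homology with a predicted ORF from N. gonorrhoeae
Fragments of ORF1 show 83.5%, 88.3% and 97.7% identity with a predicted ORF from N. gonorrhoeae (ORF1ng) over overlaps of 467, 298 and 259 amino acids, respectively: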
orf1.pep MKTTDKRTTETHRKAPKTGRIRFXAAYLAICLSFGILPQAWAGHTYFGINYQYYRDFAEN 60
||||||||||||||||||||||| ||||||||||||||| |||||||||||||||||||
orf1ng MKTTDKRTTETHRKAPKTGRIRFSPAYLAICLSFGILPQARAGHTYFGINYQYYRDFAEN 60
orf1.pep KGKFAVGAKDIEVYNKKGELVGKSMTKAPMIDFSVVSRNGVAALVGVQYIVSVAHNGGYN 120
||||||||||||||||||||||||||||||||||||||||||||:| |||||||||||||
orf1ng KGKFAVGAKDIEVYNKKGELVGKSMTKAPMIDFSVVSRNGVAALAGDQYIVSVAHNGGYN 120
orf1.pep NVDFGAEGXNIXDQXRXTYKIVKRNNYKAGTKGHPYGGDYHMPRLHKXVTDAEPVEMTSY 180
|||||||||| || | :|:|||||||||||:||||||||||||||| ||||||||||||
orf1ng NVDFGAEGSN-PDQHRFSYQIVKRNNYKAGTNGHPYGGDYHMPRLHKFVTDAEPVEMTSY 179
orf1.pep MDGRKYIDQNNYPDRVRIGAGRQYWRSDEDEPNNRESSYHIAS----------------- 223
||| || | |:||||||||||||||||||||||||||||||||
orf1ng MDGWKYADLNKYPDRVRIGAGRQYWRSDEDEPNNRESSYHIASAYSWLVGGNTFAQNGSG 239
orf1.pep ----------------------------GSPMFIYDAQKQKWLINGVLQTGNPYIGKSNG 255
||||||||||||||||||||||||||||||||
orf1ng GGTVNLGSEKIKHSPYGFLPTGGSFGDSGSPMFIYDAQKQKWLINGVLQTGNPYIGKSNG 289
orf1.pep FQLVRKDWFYDEIFAGDTHSVFYEPRQNGKYSFNDDNNGTGKINAKHEHNSLPNRLKTRT 315
201 GGCGAAAGAT ATTGAGGTTT ACAACAAAAA AGGGGAGTTG GTCGGCAAAT
251 CGATGACGAA AGCCCCGATG ATTGATTTTT CTGTGGTATC GCGTAACGGC
301 GTGGCGGCAT TGGCGGGCGA TCAATATATT GTGAGCGTGG CACATAACGG
351 CGGCTATAAC AATGTTGATT TTGGTGCGGA GGGAAGCAAT CCCGATCAGC
401 ACCGCTTTTC TTACCAAATT GTGAAAAGAA ATAATTATAA AGCAGGGACT
451 AACGGCCATC CTTATGGCGG CGATTATCAT ATGCCGCGTT TGCACAAATT
501 TGTCACAGAT GCAGAACCTG TTGAGATGAC CAGTTATATG GATGGGTGGA
551 AATACGCTGA TTTAAATAAA TACCCTGATC GTGTTCGAAT CGGAGCAGGC
601 AGACAATATT GGCGGTCTGA TGAAGACGAA CCCAATAACC GCGAAAGTTC
651 ATATCATATT GCAAGCGCAT ATTCTTGGCT CGTCGGTGGC AATACCTTTG
701 CACAAAATGG ATCAGGTGGT GGCACAGTCA ACTTAGGTAG CGAAAAAATT
751 AAACATAGCC CATATGGTTT TTTACCAACA GGAGGCTCAT TTGGCGACAG
801 TGGCTCACCA ATGTTTATCT ATGATGCCCA AAAGCAAAAG TGGTTAATTA
851 ATGGGGTATT GCAAACAGGC AACCCCTATA TAGGAAAAAG CAATGGCTTC
901 CAGCTAGTTC GTAAAGATTG GTTCTATGAT GAAATCTTTG CTGGAGATAC
951 CCATTCAGTA TTCTACGAAC CACATCAAAA TGGGAAATAC TTTTTTAACG
1001 ACAATAATAA TGGCGCAGGA AAAATCGATG CCAAACATAA ACACTATTCT
1051 CTACCTTATA GATTAAAAAC ACGAACCGTT CAATTGTTTA ATGTTTCTTT
1101 ATCCGAGACA GCAAGAGAAC CTGTTTATCA TGCTGCAGGT GGGGTCAACA
1151 GTTATCGACC CAGACTGAAT AATGGAGAAA ATATTTCCTT TATTGACAAA
1201 GGAAAAGGTG AATTGATACT TACCAGCAAC ATCAACCAAG GCGCGGGCGG
1251 TTTGTATTTT GAGGGTAATT TTACGGTCTC GCCTAAAAAC AACGAAACGT
1301 GGCAAGGCGC GGGCGTTCAT ATCAGTGATG GCAGTACCGT TACTTGGAAA
1351 GTAAACGGCG TGGCAAACGA CCGCCTGTCC AAAATCGGCA AAGGCACGCT
1401 GCTGGTTCAA GCCAAAGGGG AAAACCAAGG CTCGGTCAGC GTGGGCGACG
1451 GTAAAGTCAT CTTAGATCAG CAGGCGGACG ATCAAGGCAA AAAACAAGCC
1501 TTTAGTGAAA TCGGCTTGGT CAGCGGCAGG GGGACGGTGC AACTGAATGC
1551 CGATAATCAG TTCAACCCCG ACAAACTCTA TTTCGGCTTT CGCGGCGGAC
1601 GTTTGGATTT GAACGGGCAT TCGCTTTCGT TCCACCGCAT TCAAAATACC
1651 GATGAAGGGG CGATGATTGT CAACCACAAT CAAGACAAAG AATCCACCGT
1701 TACCATTACA GGCAATAAAG ATATTACTAC AACCGGCAAT AACAACAACT
1751 TGGATAGCAA AAAAGAAATT GCCTACAACG GTTGGTTTGG CGAGAAAGAT
1801 GCAACCAAAA CGAACGGGCG GCTCAATCTG AATTACCAAC CGGAAGAAGC
1851 GGATCGCACT TTACTGCTTT CCGGCGGAAC AAATTTAAAC GGCAATATCA
1901 CGCAAACAAA CGGCAAACTG TTTTTCAGCG GCAGACCGAC ACCGCACGCC
1951 TACAATCATT TAGGAAGCGG GTGGTCAAAA ATGGAAGGTA TCCCACAAGG
2001 AGAAATCGTG TGGGACAACG ATTGGATCGA CCGCACATTT AAAGCGGAAA
2051 ACTTCCATAT TCAGGGCGGA CAAGCGGTGG TTTCCCGCAA TGTTGCCAAA
2101 GTGGAAGGCG ATTGGCATTT AAGCAATCAC GCCCAAGCAG TTTTCGGTGT
2151 CGCACCGCAT CAAAGCCACA CAATCTGTAC ACGTTCGGAC TGGACGGGTC
2201 TGACAAGTTG TACCGAAAAA ACCATTACCG ACGATAAAGT GATTGCTTCA
2251 TTGAGCAAGA CCGACATCAG AGGCAATGTC AGCCTTGCCG ATCACGCTCA
2301 TTTAAATCTC ACAGGACTTG CCACACTCAA CGGCAATCTT AGTGCAGGCG
2351 GAGACACGCA CTATACGGTT ACGCGCAACG CCACCCAAAA CGGCAACCTC
2401 AGCCTCGTGG GCAATGCCCA AGCAACATTT AATCAAGCCA CATTAAACGG
2451 CAACACATCG GCTTCGGACA ATGCTTCATT TAATCTAAGC AACAACGCCG
2501 TACAAAACGG CAGTCTGACG CTTTCCGACA ACGCTAAGGC AAACGTAAGC
2551 CATTCCGCAC TCAACGGCAA TGTCTCCCTA GCCGATAAGG CAGTATTCCA
2601 TTTTGAAAAC AGCCGCTTTA CCGGAAAAAT CAGCGGCGGC AAGGATACGG
2651 CATTACACTT AAAAGACAGC GAATGGACGC TGCCGTCGGG CACGGAATTA
2701 GGCAATTTAA ACCTTGACAA CGCCACCATT ACACTCAATT CCGCCTATCG
2751 ACACGATGCG GCAGGCGCGC AAACCGGCAG TGCGGCAGAT GCGCCGCGCC
2801 GCCGTTCGCG CCGTTCCCTA TTATCCGTTA CGCCGCCAAC TTCGGCAGAA
2851 TCCCGTTTCA ACACGCTGAC GGTAAACGGC AAATTGAACG GTCAGGGAAC
2901 ATTCCGCTTT ATGTCGGAAC TCTTCGGCTA CCGCAGCGGC AAATTGAAGC
2951 TGGCGGAAAG TTCCGAAGGC ACTTACACCT TGGCTGTCAA CAATACCGGC
3001 AACGAACCCG TAAGTCTCGA GCAATTGACG GTAGTGGAAG GAAAAGACAA
3051 CACACCGCTG TCCGAAAATC TTAATTTCAC CCTGCaaaAc gaacacgtcg
3101 atgccggcgc atggCGTTAT CAGCTTATCC gcaaagacgG CGAGTTCCgc
3151 CTGCATAATC CGGTCAAAGA ACAAGAGCTT TCCGACAAAC TCGGCAAGgc
3201 gggagaaACA GAggccgccT TGACGGCAAA ACAGGCacaA CTTGCCGCCA
3251 AAcaacaggc ggaaaAAGAC AACgcgcaaa gccttgAcgc gctgattgcg
3301 gCcgggcgca atgccaccga AAAGGCAgaa agtgttgccg aaccgGCCCG
3351 GCAGGCAGGC GGGGAAAAtg ccgGCATTAT GCAGGCGGAG GAAGAGAAAA
3401 AACGGGTGCA GGCGGATAAA GACACCGCCT TGGCGAAACA GCGCGAAGCG
3451 GAAACCCGGC CGGCTACCAC CGCCTTCCCC CGCGCCCGCC GCGCCCGCCG
3501 GGATTTGCCG CAACCGCAGC CCCAACCGCA ACCCCAACCG CAGCGCGACC
3551 TGATCAGCCG TTATGCCAAT AGCGGTTTGA GTGAATTTTC CGCCACGCTC
3601 AACAGCGTTT TCGCCGTACA GGACGAATTG GACCGCGTGT TTGCCGAAGA
3651 CCGCCGCAAC GCCGTTTGGA CAAGCGGCAT CCGGGACACC AAACACTACC
3701 GTTCGCAAGA TTTCCGCGCC TACCGCCAAC AAACCGACCT GCGCCAAATC
3751 GGTATGCAGA AAAACCTCGG CAGCGGGCGC GTCGGCATCC TGTTTTCGCA
3801 CAACCGGACC GGAAACACCT TCGACGACGG CATCGGCAAC TCGGCACGGC
3851 TTGCCCACGG TGCCGTTTTC GGGCAATACG GCATCGGCAG GTTCGACATC
3901 GGCATCAGCG CGGGCGCGGG TTTTAGTAGC GGCAGCCTTT CAGACGGCAT
3951 CAGAGGCAAA ATCCGCCGCC GCGTGCTGCA TTACGGCATT CAGGCAAGAT
4001 ACCGCGCAGG TTTCGGCGGA TTCGGCATCG AACCGCACAT CGGCGCAACG
4051 CGCTATTTCG TCCAAAAAGC GGATTACCGA TACGAAAACG TCAATATCGC
4101 CACCCCGGGC CTTGCATTCA ACCGCTACCG CGCGGGCATT AAGGCAGATT
4151 ATTCATTCAA ACCGGCGCAA CACATTTCCA TCACGCCTTA TTTGAGCCTG
4201 TCCTATACCG ATGCCGCTTC CGGCAAAGTC CGAACGCGCG TCAATACCGC
4251 CGTATTGGCG CAGGATTTCG GCAAAACCCG CAGTGCGGAA TGGGGCGTAA
4301 ACGCCGAAAT CAAAGGTTTC ACGCTGTCCC TCCACGCTGC CGCCGCCAAG
4351 GGGCCGCAAT TGGAAGCGCA GCACAGCGCG GGCATCAAAT TAGGCTACCG
4401 CTGGTAA
This is predicted to encode a protein having the amino acid sequence <SEQ ID 654>:
1 MKTTDKRTTE THRKAPKTGR IRFSPAYLAI CLSFGILPQA RAGHTYFGIN
51 YQYYRDFAEN KGKFAVGAKD IEVYNKKGEL VGKSMTKAPM IDFSVVSRNG
101 VAALAGDQYI VSVAHNGGYN NVDFGAEGSN PDQHRFSYQI VKRNNYKAGT
151 NGHPYGGDYH MPRLHKFVTD AEPVEMTSYM DGWKYADLNK YPDRVRIGAG
201 RQYWRSDEDE PNNRESSYHI ASAYSWLVGG NTFAQNGSGG GTVNLGSEKI
251 KHSPYGFLPT GGSFGDSGSP MFIYDAQKQK WLINGVLQTG NPYIGKSNGF
301 QLVRKDWFYD EIFAGDTHSV FYEPHQNGKY FFNDNNNGAG KIDAKHKHYS
351 LPYRLKTRTV QLFNVSLSET AREPVYHAAG GVNSYRPRLN NGENISFIDK
401 GKGELILTSN INQGAGGLYF EGNFTVSPKN NETWQGAGVH ISDGSTVTWK
451 VNGVANDRLS KIGKGTLLVQ AKGENQGSVS VGDGKVILDQ QADDQGKKQA
501 FSEIGLVSGR GTVQLNADNQ FNPDKLYFGF RGGRLDLNGH SLSFHRIQNT
551 DEGAMIVNHN QDKESTVTIT GNKDITTTGN NNNLDSKKEI AYNGWFGEKD
601 ATKTNGGLNL NYPPEEADRT LLLSGGTNLN GNITQTNGKL FFSGRPTPHA
651 YNHLGSGWSK MEGIPQGEIV WDNDWIDRTF KAENFHIQGG QAVVSRNVAK
701 VEGDWHLSNH AQAVFGVAPH QSHTICTRSD WTGLTSCTEK TITDDKVIAS
751 LSKTDVRGNV SLADHAHLNL TGLATFNGNL VQAETRTIRL RANATQNGNL
801 SLVGNAQATF NQATLNGNTS ASDNASFNLS NNAVQNGSLT LSDNAKANVS
851 HSALNGNVSL ADKAVFHFEN SRFTGKISGG KDTALHLKDS EWTLPSGTEL
901 GNLNLDNATI TLNSAYRHDA AGAQTGSAAD APRRRSRRSL LSVTPPTSAE
951 SRFNTLTVNG KLNGQGTFRF MSELFGYRSG KLKLAESSEG TYTLAVNNTG
1001 NEPVSLEQLT VVEGKDNTPL SENLNFTLQN EHVDAGAWRY QLIRKDGEFR
1051 LHNPVKEQEL SDKLGKAGET EAALTAKQAQ LAAKQQAEKD NAQSLDALIA
1101 AGRNATEKAE SVAEPARQAG GENAGIMQAE EEKKRVQADK DTALAKQREA
1151 ETRPATTAFP RARRARRDLP QPQPQPQPQP QRDLISRYAN SGLSEFSATL
1201 NSVFAVQDEL DRVFAEDRRN AVWTSGIRDT KHYRSQDFRA YRQQTDLRQI
1251 GMQKNLGSGR VGILFSHNRT GNTFDDGIGN SARLAHGAVF GQYGIGRFDI
1301 GISAGAGFSS GSLSDGIRGK IRRRVLHYGI QARYRAGFGG FGIEPHIGAT
1351 RYFVQKADYR YENVNIATPG LAFNRYRAGI KADYSFKPAQ HISITPYLSL
1401 SYTDAASGKV RTRVNTAVLA QDFGKTRSAE WGVNAEIKGF TLSLHAAAAK
1451 GPQLEAQHSA GIKLGYRW*
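The underlined and double-underlined sequences represent the active site of a serine protease (trypsin family) and an ATP/GTP-binding site motif A (P-loop).
ORF1-1 and ORF1ng show 93.7% identity over a 1471 amino acid overlap: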
10 20 30 40 50 60
orf1-1.pep MKTTDKRTTETHRKAPKTGRIRFSPAYLAICLSFGILPQAWAGHTYFGINYQYYRDFAEN
|||||||||||||||||||||||||||||||||||||||| |||||||||||||||||||
orf1ng-1 MKTTDKRTTETHRKAPKTGRIRFSPAYLAICLSFGILPQARAGHTYFGINYQYYRDFAEN
10 20 30 40 50 60
70 80 90 100 110 120
orf1-1.pep KGKFAVGAKDIEVYNKKGELVGKSMTKAPMIDFSVVSRNGVAALVGDQYIVSVAHNGGYN
||||||||||||||||||||||||||||||||||||||||||||:|||||||||||||||
orf1ng-1 KGKFAVGAKDIEVYNKKGELVGKSMTKAPMIDFSVVSRNGVAALAGDQYIVSVAHNGGYN
70 80 90 100 110 120
130 140 150 160 170 180
orf1-1.pep NVDFGAEGRNPDQHRFTYKIVKRNNYKAGTKGHPYGGDYHMPRLHKFVTDAEPVEMTSYM
|||||||| |||||||:|:|||||||||||:|||||||||||||||||||||||||||||
orf1ng-1 NVDFGAEGSNPDQHRFSYQIVKRNNYKAGTNGHPYGGDYHMPRLHKFVTDAEPVEMTSYM
130 140 150 160 170 180
190 200 210 220 230 240
orf1-1.pep DGRKYIDQNNYPDRVRIGAGRQYWRSDEDEPNNRESSYHIASAYSWLVGGNTFAQNGSGG
|| || | |:||||||||||||||||||||||||||||||||||||||||||||||||||
orf1ng-1 DGWKYADLNKYPDRVRIGAGRQYWRSDEDEPNNRESSYHIASAYSWLVGGNTFAQNGSGG
190 200 210 220 230 240
250 260 270 280 290 300
orf1-1.pep GTVNLGSEKIKHSPYGFLPTGGSFGDSGSPMFIYDAQKQKWLINGVLQTGNPYIGKSNGF
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf1ng-1 GTVNLGSEKIKHSPYGFLPTGGSFGDSGSPMFIYDAQKQKWLINGVLQTGNPYIGKSNGF
250 260 270 280 290 300
310 320 330 340 350 360
orf1-1.pep QLVRKDWFYDEIFAGDTHSVFYEPRQNGKYSFNDDNNGTGKINAKHEHNSLPNRLKTRTV
||||||||||||||||||||||||:||||| |||:|||:|||:|||:| ||| |||||||
orf1ng-1 QLVRKDWFYDEIFAGDTHSVFYEPHQNGKYFFNDNNNGAGKIDAKHKHYSLPYRLKTRTV
310 320 330 340 350 360
370 380 390 400 410 420
orf1-1.pep QLFNVSLSETAREPVYHAAGGVNSYRPRLNNGENISFIDEGKGELILTSNINQGAGGLYF
|||||||||||||||||||||||||||||||||||||||||||:||||||||||||||||
orf1ng-1 QLFNVSLSETAREPVYHAAGGVNSYRPRLNNGENISFIDKGKGELILTSNINQGAGGLYF
370 380 390 400 410 420
430 440 450 460 470 480
orf1-1.pep QGDFTVSPENNETWQGAGVHISEDSTVTWKVNGVANDRLSKIGKGTLHVQAKGENQGSIS
:|:||||||||:||||||||||: ||||||||||||||||||||||| ||||||||||:|
orf1ng-1 EGNFTVSPKNNETWQGAGVHISDGSTVTWKVNGVANDRLSKIGKGTLLVQAKGENQGSVS
430 440 450 460 470 480
490 500 510 520 530 540
orf1-1.pep VGDGTVILDQQADDKGKKQAFSEIGLVSGRGTVQLNADNQFNPDKLYFGFRGGRLDLNGH
||| |||||||||:|||||||||||||||||||||||||||||||||||||||||||||
orf1ng-1 VGDGKVILDQQADDQGKKQAFSEIGLVSGRGTVQLNADNQFNPDKLYFGFRGGRLDLNGH
490 500 510 520 530 540
550 560 570 580 590 600
orf1-1.pep SLSFHRIQNTDEGAMIVNHNQDKESTVTITGNKDIATTGNNNSLDSKKEIAYNGWFGEKD
|||||||||||||||||||||||||||||||||||:||||||:|||||||||||||||||
orf1ng-1 SLSFHRIQNTDEGAMIVNHNQDKESTVTITGNKDITTTGNNNNLDSKKEIAYNGWFGEKD
550 560 570 580 590 600
610 620 630 640 650 660
orf1-1.pep TTKTNGRLNLVYQPAAEDRTLLLSGGTNLNGNITQTNGKLFFSGRPTPHAYNHLNDHWSQ
:||||||||| ||| ||||||||||||||||||||||||||||||||||||:: ||:
orf1ng-1 ATKTNGRLNLNYQPEEADRTLLLSGGTNLNGNITQTNGKLFFSGRPTPHAYNHLGSGWSK
610 620 630 640 650 660
670 680 690 700 710 720
orf1-1.pep KEGIPRGEIVWDNDWINRTFKAENFQIKGGQAVVSRNVAKVKGDWHLSNHAQAVFGVAPH
|||||:||||||||||:||||||||:|:|||||||||||||:||||||||||||||||||
orf1ng-1 MEGIPQGEIVWDNDWIDRTFKAENFHIQGGQAVVSRNVAKVEGDWHLSNHAQAVFGVAPH
670 680 690 700 710 720
730 740 750 760 770 780
orf1-1.pep QSHTICTRSDWTGLTNCVEKTITDDKVIASLTKTDISGNVDLADHAHLNLTGLATLNGNL
|||||||||||||||:|:|||||||||||||:|||| |||:|||||||||||||||||||
orf1ng-1 QSHTICTRSDWTGLTSCTEKTITDDKVIASLSKTDIRGNVSLADHAHLNLTGLATLNGNL
730 740 750 760 770 780
790 800 810 820 830 840
orf1-1.pep SANGDTRYTVSHNATQNGNLSLVGNAQATFNQATLNGNTSASGNASFNLSDHAVQNGSLT
||:|||:|||::|||||||||||||||||||||||||||||| |||||||::||||||||
orf1ng-1 SAGGDTHYTVTRNATQNGNLSLVGNAQATFNQATLNGNTSASDNASFNLSNNAVQNGSLT
790 800 810 820 830 840
850 860 870 880 890 900
orf1-1.pep LSGNAKANVSHSALNGNVSLADKAVFHFESSRFTGQISGGKDTALHLKDSEWTLPSGTEL
|| ||||||||||||||||||||||||||:|||||:||||||||||||||||||||||||
orf1ng-1 LSDNAKANVSHSALNGNVSLADKAVFHFENSRFTGKISGGKDTALHLKDSEWTLPSGTEL
850 860 870 880 890 900
910 920 930 940 950 960
orf1-1.pep GNLNLDNATITLNSAYRHDAAGAQTGSATDAPRRRSRRSRRSLLSVTPPTSVESRFNTLT
||||||||||||||||||||||||||||:|||||||| |||||||||||:||||||||
orf1ng-1 GNLNLDNATITLNSAYRHDAAGAQTGSAADAPRRRSR---RSLLSVTPPTSAESRFNTLT
910 920 930 940 950
970 980 990 1000 1010 1020
orf1-1.pep VNGKLNGQGTFRFMSELFGYRSDKLKLAESSEGTYTLAVNNTGNEPASLEQLTVVEGKDN
|||||||||||||||||||||| |||||||||||||||||||||||:|||||||||||||
orf1ng-1 VNGKLNGQGTFRFMSELFGYRSGKLKLAESSEGTYTLAVNNTGNEPVSLEQLTVVEGKDN
960 970 980 990 1000 1010
1030 1040 1050 1060 1070
orf1-1.pep KPLSENLNFTLQNEHVDAGAWRYQLIRKDGEFRLHNPVKEQELSDKLGKA----------
|||||||||||||||||||||||||||||||||||||||||||||||||
orf1ng-1 TPLSENLNFTLQNEHVDAGAWRYQLIRKDGEFRLHNPVKEQELSDKLGKAGETEAALTAK
1020 1030 1040 1050 1060 1070
1080 1090 1100 1110 1120
orf1-1.pep ----EAKKQAEKDNAQSLDALIAAGRDAVEKTESVAEPARQAGGENVGIMQAEEEKKRVQ
||:||||||||||||||||||:|:||:||||||||||||||:|||||||||||||
orf1ng-1 QAQLAAKQQAEKDNAQSLDALIAAGRNATEKAESVAEPARQAGGENAGIMQAEEEKKRVQ
1080 1090 1100 1110 1120 1130
1130 1140 1150 1160 1170 1180
orf1-1.pep ADKDTALAKQREAETRPATTAFPRARRARRDLPQLQPQPQPQPQRDLISRYANSGLSEFS
|||||||||||||||||||||||||||||||||| |||||||||||||||||||||||||
orf1ng-1 ADKDTALAKQREAETRPATTAFPRARRARRDLPQPQPQPQPQPQRDLISRYANSGLSEFS
1140 1150 1160 1170 1180 1190
1190 1200 1210 1220 1230 1240
orf1-1.pep ATLNSVFAVQDELDRVFAEDRRNAVWTSGIRDTKHYRSQDFRAYRQQTDLRQIGMQKNLG
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf1ng-1 ATLNSVFAVQDELDRVFAEDRRNAVWTSGIRDTKHYRSQDFRAYRQQTDLRQIGMQKNLG
1200 1210 1220 1230 1240 1250
1250 1260 1270 1280 1290 1300
orf1-1.pep SGRVGILFSHNRTENTFDDGIGNSARLAHGAVFGQYGIDRFYIGISAGAGFSSGSLSDGI
||||||||||||| |||||||||||||||||||||||| || ||||||||||||||||||
orf1ng-1 SGRVGILFSHNRTGNTFDDGIGNSARLAHGAVFGQYGIGRFDIGISAGAGFSSGSLSDGI
1260 1270 1280 1290 1300 1310
1310 1320 1330 1340 1350 1360
orf1-1.pep GGKIRRRVLHYGIQARYRAGFGGFGIEPHIGATRYFVQKADYRYENVNIATPGLAFNRYR
|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf1ng-1 RGKIRRRVLHYGIQARYRAGFGGFGIEPHIGATRYFVQKADYRYENVNIATPGLAFNRYR
1320 1330 1340 1350 1360 1370
1370 1380 1390 1400 1410 1420
orf1-1.pep AGIKADYSFKPAQHISITPYLSLSYTDAASGKVRTRVNTAVLAQDFGKTRSAEWGVNAEI
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf1ng-1 AGIKADYSFKPAQHISITPYLSLSYTDAASGKVRTRVNTAVLAQDFGKTRSAEWGVNAEI
1380 1390 1400 1410 1420 1430
1430 1440 1450
orf1-1.pep KGFTLSLHAAAAKGPQLEAQHSAGIKLGYRWX
||||||||||||||||||||||||||||||||
orf1ng-1 KGFTLSLHAAAAKGPQLEAQHSAGIKLGYRWX
1440 1450 1460
In addition, ORF1ng shows 55.7% identity with the hap protein (P45387) over a 1455 amino acid overlap:
SCORES Init1: 1104 Initn: 4632 Opt: 2680
Smith-Waterman score: 5165; 55.7% identity in 1455 aa overlap
10 20 30 40 50 60
orf1ng-1.pep MKTTDKRTTETHRKAPKTGRIRFSPAYLAICLSFGILPQARAGHTYFGINYQYYRDFAEN
| :|: |:|:||: || ||||||||:||||||||||
p45387 MKKTVFRLNFLTACISLGIVSQAWAGHTYFGIDYQYYRDFAEN
10 20 30 40
70 80 90 100 110 120
orf1ng-1.pep KGKFAVGAKDIEVYNKKGELVGKSMTKAPMIDFSVVSRNGVAALAGDQYIVSVAHNGGYN
||||:|||::|:||||:|:||| ||| |||||||||||||||||: :||||||||| ||:
p45387 KGKFTVGAQNIKVYNKQGQLVGTSMTKAPMIDFSVVSRNGVAALVENQYIVSVAHNVGYT
50 60 70 80 90 100
130 140 150 160 170 180
orf1ng-1.pep NVDFGAEGSNPDQHRFSYQIVKRNNYKAGTNGHPYGGDYHMPRLHKFVTDAEPVEMTSYM
:|||||||:|||||||:|:|||||||| | |||| ||| ||||||||:| |::||| |
p45387 DVDFGAEGNNPDQHRFTYKIVKRNNYKKD-NLHPYEDDYHNPRLHKFVTEAAPIDMTSNM
110 120 130 140 150 160
190 200 210 220 230 240
orf1ng-1.pep DGWKYADLNKYPDRVRIGAGRQYWRSDEDEPNNRESSYHIASAYSWLVGGNTFAQNGSGG
:| |:| :|||:|||||:|||:||:|:|: : ::|:|| :|::||| | |:|:
p45387 NGSTYSDRTKYPERVRIGSGRQFWRNDQDKGD------QVAGAYHYLTAGNTHNQRGAGN
170 180 190 200 210
250 260 270 280 290 300
orf1ng-1.pep GTVNLGSEKIKHSPYGFLPTGGSFGDSGSPMFIYDAQKQKWLINGVLQTGNPYIGKSNGF
| ||:: | : || || :|| ||||||||||||:||||||||:|:||||: || |||
p45387 GYSYLGGDVRKAGEYGPLPIAGSKGDSGSPMFIYDAEKQKWLINGILREGNPFEGKENGF
220 230 240 250 260 270
310 320 330 340 350 360
orf1ng-1.pep QLVRKDWFYDEIFAGDTHSVFYEPHQNGKYFFNDNNNGAGKIDAKHKHYSLPYRLKTRTV
|||||::| |||| | |: :| || | :: |:|| |:| | ::| ::| :
p45387 QLVRKSYF-DEIFERDLHTSLYTRAGNGVYTISGNDNGQGSITQKS---GIPSEIK---I
280 290 300 310 320
370 380 390 400 410 419
orf1ng-1.pep QLFNVSLSETAREPVYHAA-GGVNSYRPRLNNGENISFIDKGKGELILTSNINQGAGGLY
| |:|| :: |:: | | | |||||||:: |:|: :| ||::|:|||||||||
p45387 TLANMSLPLKEKDKVHNPRYDGPNIYSPRLNNGETLYFMDQKQGSLIFASDINQGAGGLY
330 340 350 360 370 380
420 430 440 450 460 470 479
orf1ng-1.pep FEGNFTVSPKNNETWQGAGVHISDGSTVTWKVNGVANDRLSKIGKGTLLVQAKGENQGSV
|||||||||::|:||||||:|:|::|||||||||| :||||||||||| |||||||:||:
p45387 FEGNFTVSPNSNQTWQGAGIHVSENSTVTWKVNGVEHDRLSKIGKGTLHVQAKGENKGSI
390 400 410 420 430 440
480 490 500 510 520 530 539
orf1ng-1.pep SVGDGKVILDQQADDQGKKQAFSEIGLVSGRGTVQLNADNQFNPDKLYFGFRGGRLDLNG
|||||||||:|||||||:||||||||||||||||||| |:||: ||:|||||||||||||
p45387 SVGDGKVILEQQADDQGNKQAFSEIGLVSGRGTVQLNDDKQFDTDKFYFGFRGGRLDLNG
450 460 470 480 490 500
540 550 560 570 580 590
orf1ng-1.pep HSLSFHRIQNTDEGAMIVNHNQDKESTVTITGNKDITT-TGNN-NNLDSKKEIAYNGWFG
|||:|:||||||||||||||| : ::||||||::|: :||| |:|| :||||||||||
p45387 HSLTFKRIQNTDEGAMIVNHNTTQAANVTITGNESIVLPNGNNINKLDYRKEIAYNGWFG
510 520 530 540 550 560
600 610 620 630 640 650
orf1ng-1.pep EKDATKTNGRLNLNYQPEEADRTLLLSGGTNLNGNITQTNGKLFFSGRPTPHAYNHLGSG
| | :| |||||| |:| ||||||||||||:|:||||:|||||||||||||||||::
p45387 ETDKNKHNGRLNLIYKPTTEDRTLLLSGGTNLKGDITQTKGKLFFSGRPTPHAYNHLNKR
570 580 590 600 610 620
660 670 680 690 700 710
orf1ng-1.pep WSKMEGIPQGEIVWDNDWIDRTFKAENFHIQGGQAVVSRNVAKVEGDWHLSNHAQAVFGV
||:||||||||||||:|||:||||||||:|:||:|||||||:::||:| :||:|:|:|||
p45387 WSEMEGIPQGEIVWDHDWINRTFKAENFQIKGGSAVVSRNVSSIEGNWTVSNNANATFGV
630 640 650 660 670 680
720 730 740 750 760 770
orf1ng-1.pep APHQSHTICTRSDWTGLTSCTEKTITDDKVIASLSKTDIRGNVSLADHAHLNLTGLATLN
:|:|::||||||||||||:| : :|| ||| |: ||:| |:::|:|:| |: ||| ||
p45387 VPNQQNTICTRSDWTGLTTCQKVDLTDTKVINSIPKTQINGSINLTDNATANVKGLAKLN
690 700 710 720 730 740
780 790 800 810 820 830
orf1ng-1.pep GNLSAGGDTHYTVTRNATQNGNLSLVGNAQATFNQATLNGNTSASDNASFNLSNNAVQNG
||:: :::::|:|||||:| |
p45387 GNVTL---------------------------------------TNHSQFTLSNNATQIG
750 760 770
840 850 860 870 880 890
orf1ng-1.pep SLTLSDNAKANVSHSALNGNVSLADKAVFHFENSRFTGKISGGKDTALHLKDSEWTLPSG
:: ||||: |:|::: ||||| |:|:| | ::||:|: :|:| | |:: |::: ||:||
p45387 NIRLSDNSTATVDNANLNGNVHLTDSAQFSLKNSHFSHQIQGDKGTTVTLENATWTMPSD
780 790 800 810 820 830
900 910 920 930 940 950
orf1ng-1.pep TELGNLNLDNATITLNSAYRHDAAGAQTGSAADAPRRRSRRSLLSVTPPTSAESRFNTLT
| | ||:|:|:|||||||| ::|: ::||||| | : | ||||| ||||||
p45387 TTLQNLTLNNSTITLNSAY--------SASSNNTPRRRS---LETETTPTSAEHRFNTLT
840 850 860 870
960 970 980 990 1000 1010
orf1ng-1.pep VNGKLNGQGTFRFMSELFGYRSGKLKLAESSEGTYTLAVNNTGNEPVSLEQLTVVEGKDN
|||||:|||||:| | ||||:| ||||::::|| | |:| |||:|| :|||||:||:|||
p45387 VNGKLSGQGTFQFTSSLFGYKSDKLKLSNDAEGDYILSVRNTGKEPETLEQLTLVESKDN
880 890 900 910 920 930
1020 1030 1040 1050 1060 1070
orf1ng-1.pep TPLSENLNFTLQNEHVDAGAWRYQLIRKDGEFRLHNPVKEQELSDKLGKAGETEAALTAK
|||::|:|||:|:|||||| ||:|:::|||||||||:||||| : | :| ::| :| ||
p45387 QPLSDKLKFTLENDHVDAGALRYKLVKNDGEFRLHNPIKEQELHNDLVRAEQAERTLEAK
940 950 960 970 980 990
1080 1090 1100 1110 1120 1130
orf1ng-1.pep QAQLAAKQQAEKDNAQSLDALIAAGRNAT-EKAESVAEPARQAGGENAGIMQAEEEKKRV
|:: :|| |: : :::| | || :: ::: | |:|| :| :::: : |:|
p45387 QVEPTAKTQTGEPKVRSRRAARAAFPDTLPDQSLLNALEAKQAE-LTAETQKSKAKTKKV
1000 1010 1020 1030 1040 1050
1140 1150 1160 1170 1180 1190
orf1ng-1.pep QADK---DTALAKQREAETRPATTAFPRARRARRD-LPQPQPQPQPQPQRDLISRYANSG
:: : : | | : | :: :::::| | | : : | : |:||||||:||:
p45387 RSKRAVFSDPLLDQSLFALEAALEVIDAPQQSEKDRLAQEEAEKQ-RKQKDLISRYSNSA
1060 1070 1080 1090 1100 1110
1200 1210 1220 1230 1240 1250
orf1ng-1.pep LSEFSATLNSVFAVQDELDRVFAEDRRNAVWTSGIRDTKHYRSQDFRAYRQQ-TDLRQIG
|||:|||:||:::|||||||:|::: ::||||: :| ::| |: ||||:|| |:|||||
p45387 LSELSATVNSMLSVQDELDRLFVDQAQSAVWTNIAQDKRRYDSDAFRAYQQQKTNLRQIG
1120 1130 1140 1150 1160 1170
1260 1270 1280 1290 1300 1310
orf1ng-1.pep MQKNLGSGRVGILFSHNRTGNTFDDGIGNSARLAHGAVFGQYGIGRFDIGISAGAGFSSG
:|| |::||:| :|||:|: ||||: : | | |: : |:|| | :::|:::|:|:|::
p45387 VQKALANGRIGAVFSHSRSDNTFDEQVKNHATLTMMSGFAQYQWGDLQFGVNVGTGISAS
1180 1190 1200 1210 1220 1230
1320 1330 1340 1350 1360 1370
orf1ng-1.pep SLSDGIRGKIRRRVLHYGIQARYRAGFGGFGIEPHIGATRYFVQKADYRYENVNIATPGL
:::: ||:|::::||::| |: :| :||:|::|::|||::: :|: |:| : ||:|
p45387 KMAEEQSRKIHRKAINYGVNASYQFRLGQLGIQPYFGVNRYFIERENYQSEEVRVKTPSL
1240 1250 1260 1270 1280 1290
1380 1390 1400 1410 1420 1430
orf1ng-1.pep AFNRYRAGIKADYSFKPAQHISITPYLSLSYTDAASGKVRTRVNTAVLAQDFGKTRSAEW
||||| |||::||:| |:::||: ||: ::|:|:::::|:| || :|| | ||: : |
p45387 AFNRYNAGIRVDYTFTPTDNISVKPYFFVNYVDVSNANVQTTVNLTVLQQPFGRYWQKEV
1300 1310 1320 1330 1340 1350
1440 1450 1460 1469
orf1ng-1.pep GVNAEIKGFTLSLHAAAAKGPQLEAQHSAGIKLGYRWX
|::||| | :| : ::| || |:::|:||||||
p45387 GLKAEILHFQISAFISKSQGSQLGKQQNVGVKLGYRW
1360 1370 1380 1390
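The Smith-Waterman score quoted for the ORF1ng/hap comparison is the score of the best local alignment between the two sequences. As a minimal sketch of how such a score is computed, the Python function below uses simple match, mismatch and linear gap values. These values and the name `smith_waterman_score` are illustrative assumptions; they are not the substitution matrix or gap penalties behind the figures reported above, so the sketch will not reproduce the score of 5165.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Return the best local alignment score between sequences a and b.

    Scoring is deliberately simple (assumed values): a fixed reward for
    identical residues, a fixed penalty for mismatches, and a linear
    gap penalty. Real protein comparisons normally use a substitution
    matrix and affine gap penalties instead.
    """
    prev = [0] * (len(b) + 1)  # previous row of the dynamic-programming table
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # A local alignment may restart anywhere, hence the floor at 0.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


# Two short fragments taken from the first block of the alignment above.
orf1ng_fragment = "HTYFGINYQYYRDFAEN"
hap_fragment = "HTYFGIDYQYYRDFAEN"
print(smith_waterman_score(orf1ng_fragment, hap_fragment))
```

Only two rows of the table are kept in memory, which is enough when just the score, and not the alignment itself, is needed.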
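Based on this analysis, it is predicted that these proteins from N. meningitidis and N. gonorrhoeae, and their epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies.
Example 78
The following partial DNA sequence was identified in N. meningitidis <SEQ ID 655>: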
1 ..AAGGTGTGGC AATTTGTCGA AGA.CCGCTG CGTGCCGTCG TGCCTGCCGA
51 CAGTTTTGAA CCGACCGCGC AAAAATTGAA CCTGTTTAAG GCGGGTGCGG
101 CAACCATTTT GTTTTATGAA GATCAAAATG TCGTCAAAGG TTTGCAGGAG
151 CAGTTCCCTG CTTATGCCGC TAACTTCCCC GTTTGGGCGg ATCAGGCAAA
201 CGCGATGGTG CAGTATGCCG TTTGGACGAC ACTTGCCGCG GTCGGCGTAG
251 GTGCAAACCT GCAACATTAC AATCCCTTGC CCGATGCGGC GATTGCCAAA
301 GCGTGGAATA TCCCCGAAAA CTGGTTGTTG CGCGCACAAA TGGTTATCGG
351 CGGTATTGAA GGGGCGGCAG GTGAAAAGAC CTTTGAACCC GTTGCAGAAC
401 GTTTGAAAGT GTTCGGCGCA TAA
This corresponds to the amino acid sequence <SEQ ID 656; ORF6>:
1 ..KVWQFVEXPL RAVVPADSFE PTAQKLNLFK AGAATILFYE DQNVVKGLQE
51 QFPAYAANFP VWADQANAMV QYAVWTTLAA VGVGANLQHY NPLPDAAIAK
101 AWNIPENWLL RAQMVIGGIE GAAGEKTFEP VAERLKVFGA *
Further sequence analysis revealed a further partial DNA sequence <SEQ ID 657>:
1 ..CTGCGTGCCG TCGTGCCTGC CGACAGTTTT GAACCGACCG CGCAAAAATT
51 GAACCTGTTT AAGGCGGGTG CGGCAACCAT TTTGTTTTAT GAAGATCAAA
101 ATGTCGTCAA AGGTTTGCAG GAGCAGTTCC CTGCTTATGC CGCTAACTTC
151 CCCGTTTGGG CGGATCAGGC AAACGCGATG GTGCAGTATG CCGTTTGGAC
201 GACACTTGCC GCGGTCGGCG TAGGTGCAAA CCTGCAACAT TACAATCCCT
251 TGCCCGATGC GGCGATTGCC AAAGCGTGGA ATATCCCCGA AAACTGGTTG
301 TTGCGCGCAC AAATGGTTAT CGGCGGTATT GAAGGGGCGG CAGGTGAAAA
351 GACCTTTGAA CCCGTTGCAG AACGTTTGAA AGTGTTCGGC GCATAA
This corresponds to the amino acid sequence <SEQ ID 658; ORF6-1>:
1 ..LRAVVPADSF EPTAQKLNLF KAGAATILFY EDQNVVKGLQ EQFPAYAANF
51 PVWADQANAM VQYAVWTTLA AVGVGANLQH YNPLPDAAIA KAWNIPENWL
101 LRAQMVIGGI EGAAGEKTFE PVAERLKVFG A*
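The correspondence between a numbered DNA listing and its amino acid sequence can be checked by translating the DNA with the standard genetic code. The Python sketch below is one way to do this; the helper names `clean_listing` and `translate` are hypothetical, the listing is assumed to start in reading frame, and any codon containing an unresolved base (for example `.`) is reported as `X`, as the listings here appear to do.

```python
BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def clean_listing(lines):
    """Join numbered listing lines into one upper-case sequence string.

    Drops the leading position number on each line, the spaces between
    blocks of ten, and the '..' markers at the ends of partial sequences.
    Note that an unresolved '.' at the very end of a sequence is also dropped.
    """
    parts = []
    for line in lines:
        tokens = line.split()
        if tokens and tokens[0].isdigit():
            tokens = tokens[1:]
        parts.extend(tokens)
    return "".join(parts).strip(".").upper()


def translate(dna):
    """Translate DNA codon by codon; unknown codons become 'X', stops '*'."""
    usable = len(dna) - len(dna) % 3  # ignore an incomplete trailing codon
    return "".join(
        CODON_TABLE.get(dna[pos:pos + 3], "X") for pos in range(0, usable, 3)
    )


# First line of the partial DNA sequence <SEQ ID 657>.
first_line = ["1 ..CTGCGTGCCG TCGTGCCTGC CGACAGTTTT GAACCGACCG CGCAAAAATT"]
print(translate(clean_listing(first_line)))
```

The printed residues can be compared by eye with the start of the ORF6-1 sequence given above.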
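Computer analysis of this amino acid sequence gave the following results:
Homology with a predicted ORF from N. meningitidis (strain A)
ORF6 shows 98.6% identity over a 140 amino acid overlap with an ORF from strain A of N. meningitidis (ORF6a):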
10 20 30
orf6.pep KVWQFVEXPLRAVVPADSFEPTAQKLNLFK
||||||| |||||||||||||||||||||
orf6a QIVEHAVLHTPSSFNSQSARVVVLFGEEHDKVWQFVEDALRAVVPADSFEPTAQKLNLFK
40 50 60 70 80 90
40 50 60 70 80 90
orf6.pep AGAATILFYEDQNVVKGLQEQFPAYAANFPVWADQANAMVQYAVWTTLAAVGVGANLQHY
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf6a AGAATILFYEDQNVVKGLQEQFPAYAANFPVWADQANAMVQYAVWTTLAAVGVGANLQHY
100 110 120 130 140 150
100 110 120 130 140
orf6.pep NPLPDAAIAKAWNIPENWLLRAQMVIGGIEGAAGEKTFEPVAERLKVFGAX
|||||||||||||||||||||||||||||||||||||||||||||||||||
orf6a NPLPDAAIAKAWNIPENWLLRAQMVIGGIEGAAGEKTFEPVAERLKVFGAX
160 170 180 190 200
The full-length ORF6a nucleotide sequence <SEQ ID 659> is:
1 ATGACCCGTC AATCTCTGCA ACAGGCTGCC GAAAGCCGCC GTTCCATTTA
51 TTCGTTAAAT AAAAATCTGC CCGTCGGCAA AGATGAAATC GTCCAAATCG
101 TCGAACACGC CGTTTTGCAC ACACCTTCTT CGTTCAATTC CCAATCTGCC
151 CGTGTGGTCG TGCTGTTTGG CGAAGAGCAT GATAAGGTGT GGCAATTTGT
201 CGAAGACGCG CTGCGTGCCG TCGTGCCTGC CGACAGTTTT GAACCGACCG
251 CGCAAAAATT GAACCTGTTT AAGGCGGGTG CGGCAACTAT TTTGTTTTAT
301 GAAGATCAAA ATGTCGTCAA AGGTTTGCAG GAGCAGTTCC CTGCTTATGC
351 CGCCAACTTT CCCGTTTGGG CGGACCAGGC GAACGCGATG GTGCAGTATG
401 CCGTTTGGAC GACACTTGCC GCGGTCGGCG TAGGTGCAAA CCTGCAACAT
451 TACAATCCCT TGCCCGATGC GGCGATTGCC AAAGCGTGGA ATATCCCCGA
501 AAACTGGTTG TTGCGCGCAC AAATGGTTAT CGGCGGTATT GAAGGGGCGG
551 CAGGTGAAAA GACCTTTGAA CCAGTTGCAG AACGTTTGAA AGTGTTCGGC
601 GCATAA
This is predicted to encode a protein having the amino acid sequence <SEQ ID 660>:
1 MTRQSLQQAA ESRRSIYSLN KNLPVGKDEI VQIVEHAVLH TPSSFNSQSA
51 RVVVLFGEEH DKVWQFVEDA LRAVVPADSF EPTAQKLNLF KAGAATILFY
101 EDQNVVKGLQ EQFPAYAANF PVWADQANAM VQYAVWTTLA AVGVGANLQH
151 YNPLPDAAIA KAWNIPENWL LRAQMVIGGI EGAAGEKTFE PVAERLKVFG
201 A*
ORF6a and ORF6-1 show 100.0% identity over a 131 amino acid overlap:
50 60 70 80 90 100
orf6a.pep TPSSFNSQSARVVVLFGEEHDKVWQFVEDALRAVVPADSFEPTAQKLNLFKAGAATILFY
||||||||||||||||||||||||||||||
orf6-1 LRAVVPADSFEPTAQKLNLFKAGAATILFY
10 20 30
110 120 130 140 150 160
orf6a.pep EDQNVVKGLQEQFPAYAANFPVWADQANAMVQYAVWTTLAAVGVGANLQHYNPLPDAAIA
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf6-1 EDQNVVKGLQEQFPAYAANFPVWADQANAMVQYAVWTTLAAVGVGANLQHYNPLPDAAIA
40 50 60 70 80 90
170 180 190 200
orf6a.pep KAWNIPENWLLRAQMVIGGIEGAAGEKTFEPVAERLKVFGAX
||||||||||||||||||||||||||||||||||||||||||
orf6-1 KAWNIPENWLLRAQMVIGGIEGAAGEKTFEPVAERLKVFGAX
100 110 120 130
Homology with a predicted ORF from N. gonorrhoeae
ORF6 shows 95.7% identity over a 140 amino acid overlap with a predicted ORF from N. gonorrhoeae (ORF6ng):
orf6.pep KVWQFVEXPLRAVVPADSFEPTAQKLNLFK 30
||||||| |||||||||||||||||:|||
orf6ng SNVSLDMSNPTVLRMGLPLYIASLRRGAIYKVWQFVEDALRAVVPADSFEPTAQKLKLFK 64
orf6.pep AGAATILFYEDQNVVKGLQEQFPAYAANFPVWADQANAMVQYAVWTTLAAVGVGANLQHY 90
||||||||||||||||||||||||||||||||||||||||||||||||||||:|||||||
orf6ng AGAATILFYEDQNVVKGLQEQFPAYAANFPVWADQANAMVQYAVWTTLAAVGAGANLQHY 124
orf6.pep NPLPDAAIAKAWNIPENWLLRAQMVIGGIEGAAGEKTFEPVAERLKVFGA 140
|||||:||||||||||||||||||||||||||||||:|||||||||||||
orf6ng NPLPDVAIAKAWNIPENWLLRAQMVIGGIEGAAGEKVFEPVAERLKVFGA 174
The full-length ORF6ng nucleotide sequence <SEQ ID 661> was identified:
1 ATGGCCGTTG CGTCAAATGT CAGCTTGGAT ATGTCCAATC CTACGGTGTT
51 ACGCATGGGA TTACCCTTAT ATATTGCGTC CCTAAGAAGG GGCGCAATAT
101 ATAAGGTGTG GCAATTTGTC GAAGACGCGC TGCGTGCCGT CGTGCCTGCC
151 GACAGTTTTG AACCGACCGC GCAAAAATTG AAGCTGTTTA AGGCGGGCGC
201 GGCAACCATT TTGTTTTATG AAGATCAAAA TGTCGTCAAA GGTTTGCAGG
251 AGCAGTTCCC TGCTTATGCC GCCAACTTTC CCGTTTGGGC GGACCAGGCG
301 AACGCTATGG TACAGTATGC CGTCTGGACG ACACTTGCCG CGGTCGGTGC
351 AGGTGCAAAT CTGCAACATT ACAACCCCTT GCCCGATGTG GCGATTGCTA
401 AAGCGTGGAA TATTCCCGAA AACTGGCTGT TGCGCGCGCA AATGGTTATC
451 GGTGGTATTG AAGGGGcggc aggtgaaaaa gtctttgaac CCGTTGCgga
501 acgtttgAAA GTGTTCGGCG CATAA
This encodes a protein having the amino acid sequence <SEQ ID 662>:
1 MAVASNVSLD MSNPTVLRMG LPLYIASLRR GAIYKVWQFV EDALRAVVPA
51 DSFEPTAQKL KLFKAGAATI LFYEDQNVVK GLQEQFPAYA ANFPVWADQA
101 NAMVQYAVWT TLAAVGAGAN LQHYNPLPDV AIAKAWNIPE NWLLRAQMVI
151 GGIEGAAGEK VFEPVAERLK VFGA*
ORF6ng and ORF6-1 show 96.9% identity over a 131 amino acid overlap:
10 20 30
orf6-1.pep LRAVVPADSFEPTAQKLNLFKAGAATILFY
|||||||||||||||||:||||||||||||
orf6ng PTVLRMGLPLYIASLRRGAIYKVWQFVEDALRAVVPADSFEPTAQKLKLFKAGAATILFY
20 30 40 50 60 70
40 50 60 70 80 90
orf6-1.pep EDQNVVKGLQEQFPAYAANFPVWADQANAMVQYAVWTTLAAVGVGANLQHYNPLPDAAIA
|||||||||||||||||||||||||||||||||||||||||||:||||||||||||:|||
orf6ng EDQNVVKGLQEQFPAYAANFPVWADQANAMVQYAVWTTLAAVGAGANLQHYNPLPDVAIA
80 90 100 110 120 130
100 110 120 130
orf6-1.pep KAWNIPENWLLRAQMVIGGIEGAAGEKTFEPVAERLKVFGAX
|||||||||||||||||||||||||||:||||||||||||||
orf6ng KAWNIPENWLLRAQMVIGGIEGAAGEKVFEPVAERLKVFGAX
140 150 160 170
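It is predicted that this protein from N. meningitidis and N. gonorrhoeae, and its epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies.
Example 79
The following partial DNA sequence was identified in N. meningitidis <SEQ ID 663>: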
1 ..GGCTACAACT ACCTGTTCGC GCGCGGCAGC CGCATCGCCA ACTACCAAAT
101 ACCGCCGCCT ATGAGCGCGT AGAAGTCGTG CGCGGCGTGG CGGGGCTGCT
151 GGACGGCACG GGCGAGCCTT CCGCCACCGT CAATCTGGTG CGCAAACGCC
201 TGACCCGCAA GCCATTGTTT GAAGTCCGCG CCGAAGCgGG CAACCGcAAA
251 CATTTCGGGC TGGACGCGGA CGTATCGGGC AGCCTGAACA CCGAAG.crC
301 rCTGCGCgGC CGCCTGGTTT CCAcCTTCGG ACGCGGCGAC TCGTGGCGGC
351 GGCGCGAACG CAGCCGskAT GCCGAACTCT ACGGCATTTT GGAATACGAC
401 ATCGCACCGC AAACCCGCGT CCACGCArGC ATGGACTACC AGCAGGCGAA
451 AGAAACCGCC GACGCGCCGC TCAGcTACGC CGTGTACGAC AGCCAAGGTT
501 ATGCCACCGC CTTCGGCCCG AAAGACAACC CCGCCACAAA TTGGGCGAAC
551 AGCCACCACC GTGCGCTCAA CCTGTTCGCC GGCATCGAAC ACCGCTTCAA
601 CCAAGACTGG AAACTCAAAG CCGAATACGA CTAC..
This corresponds to the amino acid sequence <SEQ ID 664; ORF23>:
1 ..GYNYLFARGS RIANYQINGI PVADALADTG NANTAAYERV EVVRGVAGLL
51 DGTGEPSATV NLVRKRLTRK PLFEVRAEAG NRKHFGLDAD VSGSLNTEXX
101 LRGRLVSTFG RGDSWRRRER SRXAELYGIL EYDIAPQTRV HAXMDYQQAK
151 ETADAPLSYA VYDSQGYATA FGPKDNPATN WANSHHRALN LFAGIEHRFN
201 QDWKLKAEYD Y..
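The partial DNA sequence contains ambiguity codes (for example `r`, `s` and `k`), and the corresponding protein shows `X` at some of those codons but not all. A sketch of why: a codon with an ambiguous base still yields a definite residue when every possible reading encodes the same amino acid. The code below assumes the standard genetic code and the usual IUPAC meanings of the symbols; `translate_ambiguous` is a hypothetical helper, and it does not handle the `.` placeholders that also appear in the listing.

```python
from itertools import product

BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# IUPAC nucleotide codes (two-base ambiguities plus N).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "N": "ACGT",
}


def translate_ambiguous(dna: str) -> str:
    """Translate in frame 1; a codon becomes 'X' only if its possible
    readings encode more than one residue."""
    dna = "".join(dna.split()).upper()
    protein = []
    for i in range(0, len(dna) - 2, 3):
        readings = product(*(IUPAC[base] for base in dna[i:i + 3]))
        residues = {CODON_TABLE["".join(r)] for r in readings}
        protein.append(residues.pop() if len(residues) == 1 else "X")
    return "".join(protein)


# Fragment from the line numbered 351 above: CGs is R either way, kAT is D or Y.
print(translate_ambiguous("CGCAGCCGskATGCCGAA"))  # RSRXAE
# Fragment from the line numbered 401 above: rGC is S or G.
print(translate_ambiguous("CACGCArGCATG"))  # HAXM
```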
Further work revealed the complete nucleotide sequence <SEQ ID 665>:
1 ATGACACGCT TCAAATATTC CCTGCTGTTT GCCGCCCTGT TGCCCGTGTA
51 CGCGCAGGCC GATGTTTCTG TTTCAGACGA CCCCAAACCG CAGGAAAGCA
101 CTGAATTGCC GACCATCACC GTTACCGCCG ACCGCACCGC GAGTTCCAAC
151 GACGGCTACA CTGTTTCCGG CACGCACACC CCGCTCGGGC TGCCCATGAC
201 CCTGCGCGAA ATCCCGCAGA GCGTCAGCGT CATCACATCG CAACAAATGC
251 GCGACCAAAA CATCAAAACG CTCGACCGCG CCCTGTTGCA GGCGACCGGC
301 ACCAGCCGCC AGATTTACGG CTCCGACCGC GCGGGCTACA ACTACCTGTT
351 CGCGCGCGGC AGCCGCATCG CCAACTACCA AATCAACGGC ATCCCCGTTG
401 CCGACGCGCT GGCCGATACG GGCAATGCCA ACACCGCCGC CTATGAGCGC
451 GTAGAAGTCG TGCGCGGCGT GGCGGGGCTG CTGGACGGCA CGGGCGAGCC
501 TTCCGCCACC GTCAATCTGG TGCGCAAACG CCTGACCCGC AAGCCATTGT
551 TTGAAGTCCG CGCCGAAGCG GGCAACCGCA AACATTTCGG GCTGGACGCG
601 GACGTATCGG GCAGCCTGAA CACCGAAGGC ACGCTGCGCG GCCGCCTGGT
651 TTCCACCTTC GGACGCGGCG ACTCGTGGCG GCGGCGCGAA CGCAGCCGCG
701 ATGCCGAACT CTACGGCATT TTGGAATACG ACATCGCACC GCAAACCCGC
751 GTCCACGCAG GCATGGACTA CCAGCAGGCG AAAGAAACCG CCGACGCGCC
801 GCTCAGCTAC GCCGTGTACG ACAGCCAAGG TTATGCCACC GCCTTCGGCC
851 CGAAAGACAA CCCCGCCACA AATTGGGCGA ACAGCCGCCA CCGTGCGCTC
901 AACCTGTTCG CCGGCATCGA ACACCGCTTC AACCAAGACT GGAAACTCAA
951 AGCCGAATAC GACTACACCC GCAGCCGCTT CCGCCAGCCC TACGGCGTAG
1001 CAGGCGTGCT TTCCATCGAC CACAACACCG CCGCCACCGA CCTGATTCCC
1051 GGTTATTGGC ACGCCGACCC GCGCACCCAC AGCGCCAGCG TGTCATTGAT
1101 CGGCAAATAC CGCCTGTTCG GCCGCGAACA CGATTTAATC GCGGGTATCA
1151 ACGGTTACAA ATACGCCAGC AACAAATACG GCGAACGCAG CATCATCCCC
1201 AACGCCATTC CCAACGCCTA CGAATTTTCC CGCACGGGTG CCTACCCGCA
1251 GCCTGCATCG TTTGCCCAAA CCATCCCGCA ATACGGCACC AGGCGGCAAA
1301 TCGGCGGCTA TCTCGCCACC CGTTTCCGCG CCGCCGACAA CCTTTCGCTG
1351 ATTTTGGGCG GACGATACAC CCGTTACCGC ACCGGCAGCT ACGACAGCCG
1401 CACACAAGGC ATGACCTATG TGTCCGCCAA CCGTTTCACC CCCTACACAG
1451 GCATCGTGTT CGACCTGACC GGCAACCTGT CTCTTTACGG CTCGTACAGC
1501 AGCCTGTTCG TCCCGCAATC GCAAAAAGAC GAACACGGCA GCTACCTGAA
1551 ACCCGTAACC GGCAACAATC TGGAAGCCGG CATCAAAGGC GAATGGCTTG
1601 AAGGCCGTCT GAACGCATCC GCCGCCGTGT ACCGCGCCCG TAAAAACAAC
1651 CTCGCCACCG CAGCAGGACG CGACCCGAGC GGCAACACCT ACTACCGCGC
1701 CGCCAACCAA GCCAAAACCC ACGGCTGGGA AATCGAAGTC GGCGGCCGCA
1751 TCACGCCCGA ATGGCAGATA CAGGCAGGTT ACAGCCAAAG CAAAACCCGC
1801 GACCAAGACG GCAGCCGCCT GAACCCCGAC AGCGTACCCG AACGCAGCTT
1851 CAAACTCTTC ACTGCCTACC ACTTTGCCCC CGAAGCCCCC AGCGGCTGGA
1901 CCATCGGCGC AGGCGTGCGC TGGCAGAGCG AAACCCACAC CGACCCTGCC
1951 ACGCTCCGCA TCCCCAACCC CGCCGCCAAA GCCCGCGCCG CCGACAACAG
2001 CCGCCAAAAA GCCTACGCCG TCGCCGACAT CATGGCGCGT TACCGCTTCA
2051 ATCCGCGCGC CGAACTGTCG CTGAACGTGG ACAATCTGTT CAACAAACAC
2101 TACCGCACCC AGCCCGACCG CCACAGCTAC GGCGCACTGC GGACAGTGAA
2151 CGCGGCGTTT ACCTATCGGT TTAAATAA
This corresponds to the amino acid sequence <SEQ ID 666; ORF23-1>:
1 MTRFKYSLLF AALLPVYAQA DVSVSDDPKP QESTELPTIT VTADRTASSN
51 DGYTVSGTHT PLGLPMTLRE IPQSVSVITS QQMRDQNIKT LDRALLQATG
101 TSRQIYGSDR AGYNYLFARG SRIANYQING IPVADALADT GNANTAAYER
151 VEVVRGVAGL LDGTGEPSAT VNLVRKRLTR KPLFEVRAEA GNRKHFGLDA
201 DVSGSLNTEG TLRGRLVSTF GRGDSWRRRE RSRDAELYGI LEYDIAPQTR
251 VHAGMDYQQA KETADAPLSY AVYDSQGYAT AFGPKDNPAT NWANSRHRAL
301 NLFAGIEHRF NQDWKLKAEY DYTRSRFRQP YGVAGVLSID HNTAATDLIP
351 GYWHADPRTH SASVSLIGKY RLFGREHDLI AGINGYKYAS NKYGERSIIP
401 NAIPNAYEFS RTGAYPQPAS FAQTIPQYGT RRQIGGYLAT RFRAADNLSL
451 ILGGRYTRYR TGSYDSRTQG MTYVSANRFT PYTGIVFDLT GNLSLYGSYS
501 SLFVPQSQKD EHGSYLKPVT GNNLEAGIKG EWLEGRLNAS AAVYRARKNN
551 LATAAGRDPS GNTYYRAANQ AKTHGWEIEV GGRITPEWQI QAGYSQSKTR
601 DQDGSRLNPD SVPERSFKLF TAYHFAPEAP SGWTIGAGVR WQSETHTDPA
651 TLRIPNPAAK ARAADNSRQK AYAVADIMAR YRFNPRAELS LNVDNLFNKH
701 YRTQPDRHSY GALRTVNAAF TYRFK*
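Computer analysis of this amino acid sequence gave the following results:
Homology with the ferric-pseudobactin receptor PupB of Pseudomonas putida (accession number P38047)
ORF23 and the PupB protein show 32% aa identity over a 205 aa overlap: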
Orf23 6 FARGSRIANYQINGIPVADALADTGNANTAAYERVEVVRGVAGLLDGTGEPSATVNLVRK 65
++RG I NY+++G+P + L D + + A ++RVE+VRG GL+ G G PSAT+NL+RK
PupB 215 WSRGFAIQNYEVDGVPTSTRL-DNYSQSMAMFDRVEIVRGATGLISGMGNPSATINLIRK 273
Orf23 66 RLTRKPLFEVRAEAGNRKHFGLDADVSGSLNTEXXLRGRLVSTFXXXXXXXXXXXXXXAE 125
R T+ + EAGN +G DVSG L +RGR V+ +
PupB 274 RPTAEAQASITGEAGNWDRYGTGFDVSGPLTETGNIRGRFVADYKTEKAWIDRYNQQSQL 333
Orf23 126 LYGILEYDIAPQTRVHAXMDYQQAKETADAPLSYAVYD--SQGYATAFGPKDNPATNWAN 183
+YGI E+D++ T + Y + D+PL + S G T N A +W+
PupB 334 MYGITEFDLSEDTLLTVGFSY--LRSDIDSPLRSGLPTRFSTGERTNLKRSLNAAPDWSY 391
Orf23 184 SHHRALNLFAGIEHRFNQDWKLKAE 208
+ H + F IE + W K E
PupB 392 NDHEQTSFFTSIEQQLGNGWSGKIE 416
Homology with a predicted ORF from N. meningitidis (strain A)
ORF23 shows 95.7% identity over a 211 aa overlap with an ORF (ORF23a) from strain A of N. meningitidis:
10 20 30
orf23.pep GYNYLFARGSRIANYQINGIPVADALADTG
||||||||||||||||||||||||||||||
orf23a QMRDQNIKALDRALLQATGTSRQIYGSDRAGYNYLFARGSRIANYQINGIPVADALADTG
90 100 110 120 130 140
40 50 60 70 80 90
orf23.pep NANTAAYERVEVVRGVAGLLDGTGEPSATVNLVRKRLTRKPLFEVRAEAGNRKHFGLDAD
|||||||||||||||||||||||||||||||||||| |||||||||||||||||||| ||
orf23a NANTAAYERVEVVRGVAGLLDGTGEPSATVNLVRKRPTRKPLFEVRAEAGNRKHFGLGAD
150 160 170 180 190 200
100 110 120 130 140 150
orf23.pep VSGSLNTEXXLRGRLVSTFGRGDSWRRRERSRXAELYGILEYDIAPQTRVHAXMDYQQAK
||||||:| :||||||||||||||||:||||| ||||||||||||||||||| |||||||
orf23a VSGSLNAEGTLRGRLVSTFGRGDSWRQRERSRDAELYGILEYDIAPQTRVHAGMDYQQAK
210 220 230 240 250 260
160 170 180 190 200 210
orf23.pep ETADAPLSYAVYDSQGYATAFGPKDNPATNWANSHHRALNLFAGIEHRFNQDWKLKAEYD
||||||||||||||||||||||||||||||||||:|||||||||||||||||||||||||
orf23a ETADAPLSYAVYDSQGYATAFGPKDNPATNWANSRHRALNLFAGIEHRFNQDWKLKAEYD
270 280 290 300 310 320
orf23.pep Y
|
orf23a YTRSRFRQPYGVAGVLSIDHNTAATDLIPGYWHADPRTHSASVSLIGKYRLFGREHDLIA
330 340 350 360 370 380
The complete length ORF23a nucleotide sequence <SEQ ID 667> is:
1 ATGACACGCT TCAAATATTC CCTGCTGTTT GCCGCCCTGT TGCCCGTGTA
51 CGCGCAGGCC GATGTTTCTG TTTCAGACGA CCCAAAACCG CAGGAAAGCA
101 CTGAATTGCC GACCATCACC GTTACCGCCG ACCGCACCGC GAGTTCCAAC
151 GACGGCTACA CTGTTTCCGG CACGCACACC CCGCTCGGGC TGCCCATGAC
201 CCTGCGCGAA ATCCCGCAGA GCGTCAGCGT CATCACATCG CAACAAATGC
251 GCGACCAAAA CATCAAAGCG CTCGACCGCG CCCTGTTGCA GGCGACCGGC
301 ACCAGCCGCC AGATTTACGG CTCCGACCGC GCGGGCTACA ACTACCTGTT
351 CGCGCGCGGC AGCCGCATCG CCAACTACCA AATCAACGGC ATCCCCGTTG
401 CCGACGCGCT GGCCGATACG GGCAATGCCA ACACCGCCGC CTATGAGCGC
451 GTAGAAGTCG TGCGCGGCGT GGCGGGGCTG CTGGACGGCA CGGGCGAGCC
501 TTCCGCCACC GTCAATCTGG TGCGCAAACG CCCGACCCGC AAGCCATTGT
551 TTGAAGTCCG CGCCGAAGCG GGCAACCGCA AACATTTCGG GCTGGGCGCG
601 GACGTATCGG GCAGCCTGAA TGCCGAAGGC ACGCTGCGCG GCCGCCTGGT
651 TTCCACCTTC GGACGCGGCG ACTCGTGGCG GCAGCGCGAA CGCAGCCGCG
701 ATGCCGAACT CTACGGCATT TTGGAATACG ACATCGCACC GCAAACCCGC
751 GTCCACGCAG GCATGGACTA CCAGCAGGCG AAAGAAACCG CCGACGCGCC
801 GCTCAGCTAC GCCGTGTACG ACAGCCAAGG TTATGCCACC GCCTTCGGCC
851 CGAAAGACAA CCCCGCCACA AATTGGGCGA ACAGCCGCCA CCGTGCGCTC
901 AACCTGTTCG CCGGCATCGA ACACCGCTTC AACCAAGACT GGAAACTCAA
951 AGCCGAATAC GACTACACCC GCAGCCGCTT CCGCCAGCCC TACGGCGTAG
1001 CAGGCGTGCT TTCCATCGAC CACAACACCG CCGCCACCGA CCTGATTCCC
1051 GGTTATTGGC ACGCCGACCC GCGCACCCAC AGCGCCAGCG TGTCATTAAT
1101 CGGCAAATAC CGCCTGTTCG GCCGCGAACA CGATTTAATC GCGGGTATCA
1151 ACGGTTACAA ATACGCCAGC AACAAATACG GCGAACGCAG CATCATCCCC
1201 AACGCCATTC CCAACGCCTA CGAATTTTCC CGCACGGGTG CCTACCCGCA
1251 GCCTGCATCG TTTGCCCAAA CCATCCCGCA ATACGGCACC AGGCGGCAAA
1301 TCGGCGGCTA TCTCGCCACC CGTTTCCGCG CCGCCGACAA CCTTTCGCTG
1351 ATACTCGGCG GCAGATACAG CCGTTACCGC ACCGGCAGCT ACGACAGCCG
1401 CACACAAGGC ATGACCTATG TGTCCGCCAA CCGTTTCACC CCCTACACAG
1451 GCATCGTGTT CGACCTGACC GGCAACCTGT CGCTTTACGG CTCGTACAGC
1501 AGCCTGTTCG TCCCGCAATC GCAAAAAGAC GAACACGGCA GCTACCTGAA
1551 ACCCGTAACC GGCAACAATC TGGAAGCCGG CATCAAAGGC GAATGGCTTG
1601 AAGGCCGTCT GAACGCATCC GCCGCCGTGT ACCGCGCCCG TAAAAACAAC
1651 CTCGCCACCG CAGCAGGACG CGACCCGAGC GGCAACACCT ACTACCGCGC
1701 CGCCAACCAA GCCAAAACCC ACGGCTGGGA AATCGAAGTC GGCGGCCGCA
1751 TCACGCCCGA ATGGCAGATA CAGGCAGGTT ACAGCCAAAG CAAAACCCGC
1801 GACCAAGACG GCAGCCGCCT GAACCCCGAC AGCGTACCCG AACGCAGCTT
1851 CAAACTCTTC ACTGCCTACC ACTTTGCCCC CGAAGCCCCC AGCGGCTGGA
1901 CCATCGGCGC AGGCGTGCGC TGGCAGAGCG AAACCCACAC CGACCCTGCC
1951 ACGCTCCGCA TCCCCAACCC CGCCGCCAAA GCCCGCGCCG CCGACAACAG
2001 CCGCCAAAAA GCCTACGCCG TCGCCGACAT CATGGCGCGT TACCGCTTCA
2051 ATCCGCGCGC CGAACTGTCG CTGAACGTGG ACAATCTGTT CAACAAACAC
2101 TACCGCACCC AGCCCGACCG CCACAGCTAC GGCGCACTGC GGACAGTGAA
2151 CGCGGCGTTT ACCTATCGGT TTAAATAA
This encodes a protein having amino acid sequence <SEQ ID 668>:
1 MTRFKYSLLF AALLPVYAQA DVSVSDDPKP QESTELPTIT VTADRTASSN
51 DGYTVSGTHT PLGLPMTLRE IPQSVSVITS QQMRDQNIKA LDRALLQATG
101 TSRQIYGSDR AGYNYLFARG SRIANYQING IPVADALADT GNANTAAYER
151 VEVVRGVAGL LDGTGEPSAT VNLVRKRPTR KPLFEVRAEA GNRKHFGLGA
201 DVSGSLNAEG TLRGRLVSTF GRGDSWRQRE RSRDAELYGI LEYDIAPQTR
251 VHAGMDYQQA KETADAPLSY AVYDSQGYAT AFGPKDNPAT NWANSRHRAL
301 NLFAGIEHRF NQDWKLKAEY DYTRSRFRQP YGVAGVLSID HNTAATDLIP
351 GYWHADPRTH SASVSLIGKY RLFGREHDLI AGINGYKYAS NKYGERSIIP
401 NAIPNAYEFS RTGAYPQPAS FAQTIPQYGT RRQIGGYLAT RFRAADNLSL
451 ILGGRYSRYR TGSYDSRTQG MTYVSANRFT PYTGIVFDLT GNLSLYGSYS
501 SLFVPQSQKD EHGSYLKPVT GNNLEAGIKG EWLEGRLNAS AAVYRARKNN
551 LATAAGRDPS GNTYYRAANQ AKTHGWEIEV GGRITPEWQI QAGYSQSKTR
601 DQDGSRLNPD SVPERSFKLF TAYHFAPEAP SGWTIGAGVR WQSETHTDPA
651 TLRIPNPAAK ARAADNSRQK AYAVADIMAR YRFNPRAELS LNVDNLFNKH
701 YRTQPDRHSY GALRTVNAAF TYRFK*
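Sequence listings in this format prefix each line with the position of its first residue and split residues into blocks of ten. A minimal sketch of joining such a listing into one string, with a consistency check on the position labels, is shown below. The function `parse_listing` and the short input lines are hypothetical illustrations, not part of the patent.

```python
import re


def parse_listing(lines):
    """Join numbered sequence-listing lines into one string.

    Returns (sequence, gaps). Each entry in gaps is (expected, found) for a
    line whose position label does not follow from the residues read so far.
    Lines without a leading label are treated as continuations.
    """
    parts = []
    gaps = []
    length = 0
    for line in lines:
        m = re.match(r"\s*(\d+)\b\s*(.*)", line)
        if m:
            label, rest = int(m.group(1)), m.group(2)
            if label != length + 1:
                gaps.append((length + 1, label))
                length = label - 1  # resynchronise on the printed label
        else:
            rest = line
        residues = re.sub(r"[^A-Za-z*]", "", rest)
        parts.append(residues)
        length += len(residues)
    return "".join(parts), gaps


# Hypothetical listing with 20 residues per line.
seq, gaps = parse_listing(["1 ATGGCCGTTG CGTCAAATGT", "21 CAGCTTGGAT"])
print(seq, gaps)  # ATGGCCGTTGCGTCAAATGTCAGCTTGGAT []

# The same listing with ten residues missing from the first line.
_, gaps_missing = parse_listing(["1 ATGGCCGTTG", "21 CAGCTTGGAT"])
print(gaps_missing)  # [(11, 21)]
```

Reporting label mismatches rather than silently concatenating makes a dropped line visible before the sequence is translated or aligned.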
ORF23a and ORF23-1 show 99.2% identity over a 725 aa overlap:
10 20 30 40 50 60
orf23a.pep MTRFKYSLLFAALLPVYAQADVSVSDDPKPQESTELPTITVTADRTASSNDGYTVSGTHT
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf23-1 MTRFKYSLLFAALLPVYAQADVSVSDDPKPQESTELPTITVTADRTASSNDGYTVSGTHT
10 20 30 40 50 60
70 80 90 100 110 120
orf23a.pep PLGLPMTLREIPQSVSVITSQQMRDQNIKALDRALLQATGTSRQIYGSDRAGYNYLFARG
|||||||||||||||||||||||||||||||||:||||||||||||||||||||||||||
orf23-1 PLGLPMTLREIPQSVSVITSQQMRDQNIKTLDRALLQATGTSRQIYGSDRAGYNYLFARG
70 80 90 100 110 120
130 140 150 160 170 180
orf23a.pep SRIANYQINGIPVADALADTGNANTAAYERVEVVRGVAGLLDGTGEPSATVNLVRKRPTR
||||||||||||||||||||||||||||||||||||||||||||||||||||||||| ||
orf23-1 SRIANYQINGIPVADALADTGNANTAAYERVEVVRGVAGLLDGTGEPSATVNLVRKRLTR
130 140 150 160 170 180
190 200 210 220 230 240
orf23a.pep KPLFEVRAEAGNRKHFGLGADVSGSLNAEGTLRGRLVSTFGRGDSWRQRERSRDAELYGI
|||||||||||||||||| ||||||||:|||||||||||||||||||:||||||||||||
orf23-1 KPLFEVRAEAGNRKHFGLDADVSGSLNTEGTLRGRLVSTFGRGDSWRRRERSRDAELYGI
190 200 210 220 230 240
250 260 270 280 290 300
orf23a.pep LEYDIAPQTRVHAGMDYQQAKETADAPLSYAVYDSQGYATAFGPKDNPATNWANSRHRAL
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf23-1 LEYDIAPQTRVHAGMDYQQAKETADAPLSYAVYDSQGYATAFGPKDNPATNWANSRHRAL
250 260 270 280 290 300
310 320 330 340 350 360
orf23a.pep NLFAGIEHRFNQDWKLKAEYDYTRSRFRQPYGVAGVLSIDHNTAATDLIPGYWHADPRTH
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf23-1 NLFAGIEHRFNQDWKLKAEYDYTRSRFRQPYGVAGVLSIDHNTAATDLIPGYWHADPRTH
310 320 330 340 350 360
370 380 390 400 410 420
orf23a.pep SASVSLIGKYRLFGREHDLIAGINGYKYASNKYGERSIIPNAIPNAYEFSRTGAYPQPAS
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf23-1 SASVSLIGKYRLFGREHDLIAGINGYKYASNKYGERSIIPNAIPNAYEFSRTGAYPQPAS
370 380 390 400 410 420
430 440 450 460 470 480
orf23a.pep FAQTIPQYGTRRQIGGYLATRFRAADNLSLILGGRYSRYRTGSYDSRTQGMTYVSANRFT
||||||||||||||||||||||||||||||||||||:|||||||||||||||||||||||
orf23-1 FAQTIPQYGTRRQIGGYLATRFRAADNLSLILGGRYTRYRTGSYDSRTQGMTYVSANRFT
430 440 450 460 470 480
490 500 510 520 530 540
orf23a.pep PYTGIVFDLTGNLSLYGSYSSLFVPQSQKDEHGSYLKPVTGNNLEAGIKGEWLEGRLNAS
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf23-1 PYTGIVFDLTGNLSLYGSYSSLFVPQSQKDEHGSYLKPVTGNNLEAGIKGEWLEGRLNAS
490 500 510 520 530 540
550 560 570 580 590 600
orf23a.pep AAVYRARKNNLATAAGRDPSGNTYYRAANQAKTHGWEIEVGGRITPEWQIQAGYSQSKTR
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf23-1 AAVYRARKNNLATAAGRDPSGNTYYRAANQAKTHGWEIEVGGRITPEWQIQAGYSQSKTR
550 560 570 580 590 600
610 620 630 640 650 660
orf23a.pep DQDGSRLNPDSVPERSFKLFTAYHFAPEAPSGWTIGAGVRWQSETHTDPATLRIPNPAAK
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf23-1 DQDGSRLNPDSVPERSFKLFTAYHFAPEAPSGWTIGAGVRWQSETHTDPATLRIPNPAAK
610 620 630 640 650 660
670 680 690 700 710 720
orf23a.pep ARAADNSRQKAYAVADIMARYRFNPRAELSLNVDNLFNKHYRTQPDRHSYGALRTVNAAF
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf23-1 ARAADNSRQKAYAVADIMARYRFNPRAELSLNVDNLFNKHYRTQPDRHSYGALRTVNAAF
670 680 690 700 710 720
orf23a.pep TYRFKX
||||||
orf23-1 TYRFKX
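When two sequences are this close, it can be more useful to list the differing positions than to read the match line. The sketch below does that for one 60-residue row of the alignment above; `differences` is a hypothetical helper and assumes an ungapped alignment.

```python
def differences(a: str, b: str, start: int = 1):
    """List (position, residue_a, residue_b) wherever two ungapped,
    equal-length aligned sequences differ. Positions begin at `start`."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have the same length")
    return [(start + i, x, y) for i, (x, y) in enumerate(zip(a, b)) if x != y]


# Residues 181-240 of ORF23a and ORF23-1, copied from the alignment above.
orf23a_row = "KPLFEVRAEAGNRKHFGLGADVSGSLNAEGTLRGRLVSTFGRGDSWRQRERSRDAELYGI"
orf23_1_row = "KPLFEVRAEAGNRKHFGLDADVSGSLNTEGTLRGRLVSTFGRGDSWRRRERSRDAELYGI"

print(differences(orf23a_row, orf23_1_row, start=181))
# [(199, 'G', 'D'), (208, 'A', 'T'), (228, 'Q', 'R')]
```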
Homology with a predicted ORF from N. gonorrhoeae
ORF23 shows 93.4% identity over a 211 aa overlap with a predicted ORF (ORF23ng) from N. gonorrhoeae:
orf23.pep GYNYLFARGSRIANYQINGIPVADALADTGNANTAAYERVEVVRGVAGLLD 51
||||||||||||||||||||||||||||||||||||||||||||||||| |
orf23ng SAVDACRIPGYNYLFARGSRIANYQINGIPVADALADTGNANTAAYERVEVVRGVAGLPD 60
orf23.pep GTGEPSATVNLVRKRLTRKPLFEVRAEAGNRKHFGLDADVSGSLNTEXXLRGRLVSTFGR 111
||||||||||||||: |||||||||||||||||||| ||||||||:| :|||||||||||
orf23ng GTGEPSATVNLVRKHPTRKPLFEVRAEAGNRKHFGLGADVSGSLNAEGTLRGRLVSTFGR 120
orf23.pep GDSWRRRERSRXAELYGILEYDIAPQTRVHAXMDYQQAKETADAPLSYAVYDSQGYATAF 171
|||||: |||| ||||||||||||||||||| ||||||||||||||||||||||||||||
orf23ng GDSWRQLERSRDAELYGILEYDIAPQTRVHAGMDYQQAKETADAPLSYAVYDSQGYATAF 180
orf23.pep GPKDNPATNWANSHHRALNLFAGIEHRFNQDWKLKAEYDY 211
||||||||||:||::|||||||||||||||||||||||||
orf23ng GPKDNPATNWSNSRNRALNLFAGIEHRFNQDWKLKAEYDYTRSRFRQPYGVAGVLSIDHS 240
The predicted ORF23ng nucleotide sequence <SEQ ID 669> encodes a protein comprising amino acid sequence <SEQ ID 670>:
1 SAVDACRIPG YNYLFARGSR IANYQINGIP VADALADTGN ANTAAYERVE
51 VVRGVAGLPD GTGEPSATVN LVRKHPTRKP LFEVRAEAGN RKHFGLGADV
101 SGSLNAEGTL RGRLVSTFGR GDSWRQLERS RDAELYGILE YDIAPQTRVH
151 AGMDYQQAKE TADAPLSYAV YDSQGYATAF GPKDNPATNW SNSRNRALNL
201 FAGIEHRFNQ DWKLKAEYDY TRSRFRQPYG VAGVLSIDHS TAATDLIPGY
251 WHADPRTHSA SMSLTGKYRL FGREHDLIAG INGYKYASNK YGERSIIPNA
301 IPNAYEFSRT GAYPQPSSFA QTIPQYDTRR QIGGYLATRF RAADNLSLIL
351 GGRYSRYRAG SYNSRTQGMT YVSANRFTPY TGIVFDLTGN LSLYGSYSSL
401 FVPQLQKDEH GSYLKPVTGN NLEADIKGEW LEGRLNASAA VYRARKNNLA
451 TAAGRDQSGN TYYRAANQAK THGWEIEVGG RITPEWQIQA GYSQSKPRDQ
501 DGSRLNPDSV PERSFKLFTA YHLAPEAPSG RTIGAGVRRQ GETHTDPAAL
551 RIPNPAAKAR AVANSRQKAY AVADIMARYR FNPRTELSLN VDNLFNKHYR
601 TQPDRHSYGA LRTVNAAFTY RFK*
Further work revealed the complete nucleotide sequence <SEQ ID 671>:
1 ATGACACGCT TCAAATACTC CCTGCTTTTT GCCGCCCTGC TACCCGTGTA
51 CGCGCAGGCC GATGTTTCTG TTTCAGACGA CCCCAAACCG CAGGAAAGCA
101 CCGAATTGCC GACCATCACC GTTACCGCCG ACCGCACCGC GAGTTCCAAC
151 GACGGCTACA CCGTTTCCGG CACGCACACC CCGTTCGGGC TGCCCATGAC
201 CCTGCGCGAA ATCCCGCAGA GCGTCAGCGT CATCACATCG CAACAAATGC
251 GCGACCAAAA CATCAAAACG CTCGACCGCG CCCTGTTGCA GGCGACCGGC
301 ACCAGCCGCC AGATTTACGG CTCCGACCGC GCGGGCTACA ACTACCTGTT
351 CGCGCGCGGC AGCCGCATCG CCAACTACCA AATCAACGGC ATCCCCGTTG
401 CCGACGCGCT GGCCGATACG GGCAATGCCA ACACCGCCGC CTATGAGCGC
451 GTAGAAGTCG TGCGCGGCGT GGCGGGGCTG CCGGACGGCA CGGGCGAGCC
501 TTCTGCCACC GTCAATCTGG TACGCAAACA CCCGACCCGC AAGCCATTGT
551 TTGAAGTCCG CGCCGAAGCC GGCAACCGCA AACATTTCGG GCTGGGCGCG
601 GACGTATCGG GCAGCCTGAA CGCCGAAGGC ACGCTGCGCG GCCGCCTGGT
651 TTCCACCTTC GGACGCGGCG ACTCGTGGCG GCAGCTCGAA CGCAGCCGCG
701 ATGCCGAACT CTACGGCATT TTGGAATACG ACATCGCACC GCAAACCCGC
751 GTCCACGCAG GCATGGACTA CCAGCAGGCG AAAGAAACCG CAGACGCGCC
801 GCTCAGCTAC GCCGTGTACG ACAGCCAAGG TTATGCCACC GCCTTCGGCC
851 CAAAAGACAA CCCCGCCACA AATTGGTCGA ACAGCCGCAA CCGTGCGCTC
901 AACCTGTTCG CCGGCATAGA ACACCGCTTC AACCAAGACT GGAAACTCAA
951 AGCCGAATAC GACTACACCC GTAGCCGCTT CCGCCAGCCC TACGGTGTGG
1001 CAGGCGTACT TTCCATCGAC CACAGCACTG CCGCCACCGA CCTGATTCCC
1051 GGTTATTGGC ACGCcgatcc GCGCACCCAC AGCGCCAGCA TGTCATTGAC
1101 CGGCAAATAC CgcctGTTCG GCCGCGAGCA CGATTTAATC GCGGGTATCA
1151 ACGGCTACAA ATACGCCAGC AACAAATACG GCGAACGCAG CATCATTCCC
1201 AACGCCATTC CCAACGCCTA CGAATTTTCC CGCACGGGCG CCTATCCGCA
1251 GCCATCATCG TTTGCCCAAA CCATCCCGCA ATACGACACC AGGCGGCAAA
1301 TCGGCGGCTA TCTCGCCACC CGTTTCCGCG CCGCCGACAA CCTTTCGCTG
1351 ATACTCGGCG GCAGATACAG CCGCTACCGC GCAGGCAGCT ACAACAGCCG
1401 CACACAAGGC ATGACCTATG TGTCCGCCAA CCGTTTCACC CCCTACACAG
1451 GCATCGTGTT CGATCTGACC GGCAACCTGT CGCTTTACGG CTCGTACAGC
1501 AGCCTGTTCG TCCCGCAATT GCAAAAAGAC GAACACGGCA GCTACCTGAA
1551 ACCCGTAACC GGCAACAATC TGGAAGCCGA CATCAAAGGC GAATGGCTTG
1601 AAGGGCGTCT GAACGCATCC GCCGCCGTGT ACCGCGCCCG TAAAAACAAC
1651 CTCGCCACCG CAGCAGGACG CGACCAGAGC GGCAACACCT ACTATCGCGC
1701 CGCCAACCAA GCCAAAACCC ACGGCTGGGA AATCGAAGTC GGCGGCCGCA
1751 TCACGCCCGA ATGGCAGATA CAGGCAGGCT ACAGCCAAAG CAAACCCCGC
1801 GACCAAGACG GCAGCCGCCT GAACCCCGAC AGCGTAcCCG AACGCAGCTT
1851 CAAACTCTTC ACCGCCTACC ACTTAGCCCC CGAAGCCCCC AGCGGCCGGA
1901 CCATcggTGC GGGTGTGCGC CGGCAGGGCG AAACCCACAC CGACCCAGCC
1951 GCGCTCCGCA TCCCCAACCC CGCCGCCAAA GCCCGCGCCG TCGCCAACAG
2001 CCGCCAGAAA GCCTACGCCG TCGCCGACAT CATGGCGCGT TACCGCTTCA
2051 ATCCGCGCAC CGAACTGTCG CTGAACGTGG ACAACCTGTT CAACAAACAC
2101 TACCGCACCC AGCCCGACCG CCACAGCTAC GGCGCACTGC GGACAGTGAA
2151 CGCGGCGTTT ACCTATCGGT TTAAATAA
This corresponds to the amino acid sequence <SEQ ID 672; ORF23ng-1>:
1 MTRFKYSLLF AALLPVYAQA DVSVSDDPKP QESTELPTIT VTADRTASSN
51 DGYTVSGTHT PFGLPMTLRE IPQSVSVITS QQMRDQNIKT LDRALLQATG
101 TSRQIYGSDR AGYNYLFARG SRIANYQING IPVADALADT GNANTAAYER
151 VEVVRGVAGL PDGTGEPSAT VNLVRKHPTR KPLFEVRAEA GNRKHFGLGA
201 DVSGSLNAEG TLRGRLVSTF GRGDSWRQLE RSRDAELYGI LEYDIAPQTR
251 VHAGMDYQQA KETADAPLSY AVYDSQGYAT AFGPKDNPAT NWSNSRNRAL
301 NLFAGIEHRF NQDWKLKAEY DYTRSRFRQP YGVAGVLSID HSTAATDLIP
351 GYWHADPRTH SASMSLTGKY RLFGREHDLI AGINGYKYAS NKYGERSIIP
401 NAIPNAYEFS RTGAYPQPSS FAQTIPQYDT RRQIGGYLAT RFRAADNLSL
451 ILGGRYSRYR AGSYNSRTQG MTYVSANRFT PYTGIVFDLT GNLSLYGSYS
501 SLFVPQLQKD EHGSYLKPVT GNNLEADIKG EWLEGRLNAS AAVYRARKNN
551 LATAAGRDQS GNTYYRAANQ AKTHGWEIEV GGRITPEWQI QAGYSQSKPR
601 DQDGSRLNPD SVPERSFKLF TAYHLAPEAP SGRTIGAGVR RQGETHTDPA
651 ALRIPNPAAK ARAVANSRQK AYAVADIMAR YRFNPRTELS LNVDNLFNKH
701 YRTQPDRHSY GALRTVNAAF TYRFK*
ORF23ng-1 and ORF23-1 show 95.9% identity over a 725 aa overlap:
10 20 30 40 50 60
orf23-1.pep MTRFKYSLLFAALLPVYAQADVSVSDDPKPQESTELPTITVTADRTASSNDGYTVSGTHT
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf23ng-1 MTRFKYSLLFAALLPVYAQADVSVSDDPKPQESTELPTITVTADRTASSNDGYTVSGTHT
10 20 30 40 50 60
70 80 90 100 110 120
orf23-1.pep PLGLPMTLREIPQSVSVITSQQMRDQNIKTLDRALLQATGTSRQIYGSDRAGYNYLFARG
|:||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf23ng-1 PFGLPMTLREIPQSVSVITSQQMRDQNIKTLDRALLQATGTSRQIYGSDRAGYNYLFARG
70 80 90 100 110 120
130 140 150 160 170 180
orf23-1.pep SRIANYQINGIPVADALADTGNANTAAYERVEVVRGVAGLLDGTGEPSATVNLVRKRLTR
|||||||||||||||||||||||||||||||||||||||| |||||||||||||||: ||
orf23ng-1 SRIANYQINGIPVADALADTGNANTAAYERVEVVRGVAGLPDGTGEPSATVNLVRKHPTR
130 140 150 160 170 180
190 200 210 220 230 240
orf23-1.pep KPLFEVRAEAGNRKHFGLDADVSGSLNTEGTLRGRLVSTFGRGDSWRRRERSRDAELYGI
|||||||||||||||||| ||||||||:|||||||||||||||||||: |||||||||||
orf23ng-1 KPLFEVRAEAGNRKHFGLGADVSGSLNAEGTLRGRLVSTFGRGDSWRQLERSRDAELYGI
190 200 210 220 230 240
250 260 270 280 290 300
orf23-1.pep LEYDIAPQTRVHAGMDYQQAKETADAPLSYAVYDSQGYATAFGPKDNPATNWANSRHRAL
||||||||||||||||||||||||||||||||||||||||||||||||||||:|||:|||
orf23ng-1 LEYDIAPQTRVHAGMDYQQAKETADAPLSYAVYDSQGYATAFGPKDNPATNWSNSRNRAL
250 260 270 280 290 300
310 320 330 340 350 360
orf23-1.pep NLFAGIEHRFNQDWKLKAEYDYTRSRFRQPYGVAGVLSIDHNTAATDLIPGYWHADPRTH
|||||||||||||||||||||||||||||||||||||||||:||||||||||||||||||
orf23ng-1 NLFAGIEHRFNQDWKLKAEYDYTRSRFRQPYGVAGVLSIDHSTAATDLIPGYWHADPRTH
310 320 330 340 350 360
370 380 390 400 410 420
orf23-1.pep SASVSLIGKYRLFGREHDLIAGINGYKYASNKYGERSIIPNAIPNAYEFSRTGAYPQPAS
|||:|| |||||||||||||||||||||||||||||||||||||||||||||||||||:|
orf23ng-1 SASMSLTGKYRLFGREHDLIAGINGYKYASNKYGERSIIPNAIPNAYEFSRTGAYPQPSS
370 380 390 400 410 420
430 440 450 460 470 480
orf23-1.pep FAQTIPQYGTRRQIGGYLATRFRAADNLSLILGGRYTRYRTGSYDSRTQGMTYVSANRFT
|||||||| |||||||||||||||||||||||||||:|||:|||:|||||||||||||||
orf23ng-1 FAQTIPQYDTRRQIGGYLATRFRAADNLSLILGGRYSRYRAGSYNSRTQGMTYVSANRFT
430 440 450 460 470 480
490 500 510 520 530 540
orf23-1.pep PYTGIVFDLTGNLSLYGSYSSLFVPQSQKDEHGSYLKPVTGNNLEAGIKGEWLEGRLNAS
|||||||||||||||||||||||||| ||||||||||||||||||| |||||||||||||
orf23ng-1 PYTGIVFDLTGNLSLYGSYSSLFVPQLQKDEHGSYLKPVTGNNLEADIKGEWLEGRLNAS
490 500 510 520 530 540
550 560 570 580 590 600
orf23-1.pep AAVYRARKNNLATAAGRDPSGNTYYRAANQAKTHGWEIEVGGRITPEWQIQAGYSQSKTR
|||||||||||||||||| ||||||||||||||||||||||||||||||||||||||| |
orf23ng-1 AAVYRARKNNLATAAGRDQSGNTYYRAANQAKTHGWEIEVGGRITPEWQIQAGYSQSKPR
550 560 570 580 590 600
610 620 630 640 650 660
orf23-1.pep DQDGSRLNPDSVPERSFKLFTAYHFAPEAPSGWTIGAGVRWQSETHTDPATLRIPNPAAK
||||||||||||||||||||||||:||||||| ||||||| |:|||||||:|||||||||
orf23ng-1 DQDGSRLNPDSVPERSFKLFTAYHLAPEAPSGRTIGAGVRRQGETHTDPAALRIPNPAAK
610 620 630 640 650 660
670 680 690 700 710 720
orf23-1.pep ARAADNSRQKAYAVADIMARYRFNPRAELSLNVDNLFNKHYRTQPDRHSYGALRTVNAAF
|||: |||||||||||||||||||||:|||||||||||||||||||||||||||||||||
orf23ng-1 ARAVANSRQKAYAVADIMARYRFNPRTELSLNVDNLFNKHYRTQPDRHSYGALRTVNAAF
670 680 690 700 710 720
orf23-1.pep TYRFKX
||||||
orf23ng-1 TYRFKX
In addition, ORF23ng-1 shows significant homology with an OMP from E. coli:
sp|P16869|FHUE_ECOLI outer membrane receptor for FE(III)-coprogen, FE(III)-ferrioxamine and FE(III)-RHODOTRULIC ACID precursor >gi|1651542|gnl|PID|d1015403 (D90745) outer membrane protein FhuE precursor [Escherichia coli] >gi|1651545|gnl|PID|d1015405 (D90746) outer membrane protein FhuE precursor [Escherichia coli] >gi|1787344 (AE000210) outer membrane receptor for FE(III)-coprogen, FE(III)-ferrioxamine and FE(III)-RHODOTRULIC ACID precursor [Escherichia coli] Length = 729
Score = 332 bits (843), Expect = 3e-90
Identities = 228/717 (31%), Positives = 350/717 (48%), Gaps = 60/717 (8%)
Query: 38 TITVTADRTASSN--DGYTVSGTHTPFGLPMTLREIPQSVSVITSQQMRDQNIKTLDRAL 95
T+ V TA + + Y+V+ T + MT R+IPQSV++++ Q+M DQ ++TL +
Sbjct: 43 TVIVEGSATAPDDGENDYSVTSTSAGTKMQMTQRDIPQSVTIVSQQRMEDQQLQTLGEVM 102
Query: 96 LQATGTSRQIYGSDRAGYNYLFARGSRIANYQINGIP--------VADALADTGNANTAA 147
G S+ SDRA Y ++RG +I NY ++GIP + DAL+D A
Sbjct: 103 ENTLGISKSQADSDRALY---YSRGFQIDNYMVDGIPTYFESRWNLGDALSDM-----AL 154
Query: 148 YERVEVVRGVAGLPDGTGEPSATVNLVRKHPTRKPLF-EVRAEAGNRKHFGLGADVSGSL 206
+ERVEVVRG GL GTG PSA +N+VRKH T + +V AE G+ AD+ L
Sbjct: 155 FERVEVVRGATGLMTGTGNPSAAINMVRKHATSREFKGDVSAEYGSWNKERYVADLQSPL 214
Query: 207 NAEGTLRGRLVSTFGRGDSWRQLERSRDAELYGILEYDIAPQTRVHAGMDYQQAKETADA 266
+G +R R+V + DSW S GI++ D+ T + AG +YQ+ +
Sbjct: 215 TEDGKIRARIVGGYQNNDSWLDRYNSEKTFFSGIVDADLGDLTTLSAGYEYQRIDVNSPT 274
Query: 267 PLSYAVYDSQGYATAFGPKDNPATNWSNSRNRALNLFAGIEHRFNQDWKLKAEYDYTRSR 326
+++ G + ++ + A +W+ + +F ++ +F W+ ++
Sbjct: 275 WGGLPRWNTDGSSNSYDRARSTAPDWAYNDKEINKVFMTLKQQFADTWQATLNATHSEVE 334
Query: 327 F--RQPYGVAGVLSIDHSTAA--TDLIPGY-------WHADPRTHSA-SMSLTGKYRLFG 374
F + Y A V D ++ PG+ W++ R A + G Y LFG
Sbjct: 335 FDSKMMYVDAYVNKADGMLVGPYSNYGPGFDYVGGTGWNSGKRKVDALDLFADGSYELFG 394
Query: 375 REHDLIAGINGYKYASNKYGER--SIIPNAIPNAYEFSRTGAYPQPSSFAQTIPQYDTRR 432
R+H+L+ G Y +N+Y +I P+ I + Y F+ G +PQ Q++ Q DT
Sbjct: 395 RQHNLMFG-GSYSKQNNRYFSSWANIFPDEIGSFYNFN--GNFPQTDWSPQSLAQDDTTH 451
Query: 433 QIGGYLATRFRAADNLSLILGGRYSRYRAGSYNSRTQGMTY-VSANRFTPYTGIVFDXXX 491
Y ATR AD L LILG RY+ +R + +TY + N TPY G+VFD
Sbjct: 452 MKSLYAATRVTLADPLHLILGARYTNWRVDT-------LTYSMEKNHTTPYAGLVFDIND 504
Query: 492 XXXXXXXXXXXFVPQLQKDEHGSYLKPVTGNNLEADIKGEWLEGRLNASAAVYRARKNNL 551
F PQ +D G YL P+TGNN E +K +W+ RL + A++R ++N+
Sbjct: 505 NWSTYASYTSIFQPQNDRDSSGKYLAPITGNNYELGLKSDWMNSRLTTTLAIFRIEQDNV 564
Query: 552 ATAAGR---DQSGNTYYRAANQAKTHGWEIEVGGRITPEWQIQAGYSQSKPRDQDGSRLN 608
A + G +G T Y+A + + G E E+ G IT WQ+ G ++ D +G+ +N
Sbjct: 565 AQSTGTPIPGSNGETAYKAVDGTVSKGVEFELNGAITDNWQLTFGATRYIAEDNEGNAVN 624
Query: 609 PDSVPERSFKLFTAYHLAPEAPSGRTIGAGVRRQGETHTDPAALRIPNPAAKARAVANSR 668
P ++P + K+FT+Y L P P T+G GV Q +TD P RA
Sbjct: 625 P-NLPRTTVKMFTSYRL-PVMPE-LTVGGGVNWQNRVYTDTV-----TPYGTFRA----E 672
Query: 669 QKAYAVADIMARYRFNPRTELSLNVDNLFNKHYRTQPDRH-SYGALRTVNAAFTYRF 724
Q +YA+ D+ RY+ L NV+NLF+K Y T + YG R + TY+F
Sbjct: 673 QGSYALVDLFTRYQVTKNFSLQGNVNNLFDKTYDTNVEGSIVYGTPRNFSITGTYQF 729
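The three percentages in the alignment summary are consistent with integer truncation of each count divided by the alignment length (for example, 228/717 is about 31.8% and is reported as 31%). The sketch below reproduces them; this is an observation about the reported numbers, not a description of how the search program computes them, and `truncated_percent` is a hypothetical name.

```python
def truncated_percent(count: int, length: int) -> int:
    """Integer percentage with the fractional part discarded."""
    return count * 100 // length


alignment_length = 717
summary = {"Identities": 228, "Positives": 350, "Gaps": 60}

for label, count in summary.items():
    pct = truncated_percent(count, alignment_length)
    print(f"{label} = {count}/{alignment_length} ({pct}%)")
# Identities = 228/717 (31%)
# Positives = 350/717 (48%)
# Gaps = 60/717 (8%)
```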
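Based on this analysis, it is predicted that these proteins from N. meningitidis and N. gonorrhoeae, and their epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies.
ORF23-1 (77.5 kDa) was cloned into pET and pGex vectors and expressed in E. coli, as described above. The products of protein expression and purification were analyzed by SDS-PAGE. Figure 15A shows the results of affinity purification of the His-fusion protein, and Figure 15B shows the results of expression of the GST-fusion in E. coli. Purified His-fusion protein was used to immunize mice, and the mouse sera were used for Western blot (Figure 15C) and ELISA (positive result). These experiments confirm that ORF23-1 is a surface-exposed protein and that it is a useful immunogen.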
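A molecular weight such as the one quoted for ORF23-1 can be estimated from an amino acid sequence by summing average residue masses and adding one water molecule. The sketch below uses commonly tabulated average residue masses; it is an illustration only, `molecular_weight_da` is a hypothetical name, and a value computed this way may differ from the quoted figure depending on whether a leader peptide or a fusion tag is included.

```python
# Average residue masses in daltons (amino acid minus one water).
RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER = 18.01528


def molecular_weight_da(protein: str) -> float:
    """Estimated average mass in daltons of an unmodified polypeptide.
    Whitespace and a trailing '*' are ignored; other symbols raise KeyError."""
    residues = "".join(protein.split()).upper().rstrip("*")
    if not residues:
        raise ValueError("empty sequence")
    return sum(RESIDUE_MASS[aa] for aa in residues) + WATER


# First ten residues of ORF23-1, in the listing's block format.
print(f"{molecular_weight_da('MTRFKYSLLF') / 1000:.2f} kDa")
```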
Example 80
The following partial DNA sequence was identified in N. meningitidis <SEQ ID 673>:
1 ATGCGCACGG CAGTGGTTTT GCTGTTGATC ATGCCGATGG CGGCTTCGTC
51 GGCAATGATG CCGGAAATGG TGTGCGCGGG CGTGTCGCCG GGAACGGCAA
101 TCATATCCAA GCCGACCGAA CAAACGGCGG TCATGGCTTC GAGTTTGTCC
151 AGCGTCAgcA CGCCTGCTTC GGCGgcGgCa ATCATACCTT CGTCTTCGGA
201 AACGGGGATA AACGcGCCAC TCAAACCCCC GACCGCGCTG GAAGCCATCA
251 TGCCGCCTTT TTTCACGGCA TCGTTCAGCA ATGCCAAAGC TGCTGTTGTG
301 CCGTGCGTAC CGCAGACGCT CAAGCCCATT TnTTCAAGAA TGCGTGCCAC
351 TnAGTCGCCG ACGGGG..
This corresponds to the amino acid sequence <SEQ ID 674; ORF24>:
1 MRTAVVLLLI MPMAASSAMM PEMVCAGVSP GTAIISKPTE QTAVMASSLS
51 SVSTPASAAA IIPSSSETGI NAPLKPPTAL EAIMPPFFTA SFSNAKAAVV
101 PCVPQTLKPI XSRMRATXSP TG..
Further work revealed the complete nucleotide sequence <SEQ ID 675>:
1 ATGCGCACGG CAGTGGTTTT GCTGTTGATC ATGCCGATGG CGGCTTCGTC
51 GGCAATGATG CCGGAAATGG TGTGCGCGGG CGTGTCGCCG GGAACGGCAA
101 TCATATCCAA GCCGACCGAA CAAACGGCGG TCATGGCTTC GAGTTTGTCC
151 AGCGTCAGCA CGCCTGCTTC GGCGGCGGCA ATCATACCTT CGTCTTCGGA
201 AACGGGGATA AACGCGCCAC TCAAACCCCC GACCGCGCTG GAAGCCATCA
251 TGCCGCCTTT TTTCACGGCA TCGTTCAGCA ATGCCAAAGC TGCTGTTGTG
301 CCGTGCGTAC CGCAGACGCT CAAGCCCATT TCTTCAAGAA TGCGTGCCAC
351 TGAGTCGCCG ACGGCGGGGG TCGGCGCCAG CGACAAGTCG AGAATACCAA
401 ACGGGATATT CAGCATTTTT GAGGCTTCGC GGCCGATGAG TTCGCCCACG
451 CGGGTAATTT TGAAAGCAGT TTTCTTCACT ACTTCCGCAA CTTCGGTCAA
501 TGTCGTTGCA TCTGAATTTT CCAACGCGGC TTTTACGACA CCTGGGCCGG
551 ATACGCCGAC ATTGATAACG GCATCCGCTT CGCCCGAACC ATGAAACGCG
601 CCCGCCATAA ACGGGTTGTC TTCCACCGCG TTGCAGAACA CGACAATTTT
651 AGCGCAGCCG AAACCTTCGG GCGTGATTTC CGCCGTGCGT TTGACGGTTT
701 CGCCCGCCAG CTTGACCGCA TCCATATTGA TACCGGCACG CGTACTGCCG
751 ATATTGATGG AGCTGCACAC AATATCGGTA GTCTTCATCG CTTCGGGAAT
801 GGAGCGGATT AACACCTCAT CCGAAGGCGA CATCCCTTTT TGCACCAACG
851 CGGAAAAACC GCCGATAAAA GACACACCGA TGGCTTTGGC AGCTTTATCC
901 AAAGTTTGCG CCACGCTGAC GTAA
This corresponds to the amino acid sequence <SEQ ID 676; ORF24-1>:
1 MRTAVVLLLI MPMAASSAMM PEMVCAGVSP GTAIISKPTE QTAVMASSLS
51 SVSTPASAAA IIPSSSETGI NAPLKPPTAL EAIMPPFFTA SFSNAKAAVV
101 PCVPQTLKPI SSRMRATESP TAGVGASDKS RIPNGIFSIF EASRPMSSPT
151 RVILKAVFFT TSATSVNVVA SEFSNAAFTT PGPDTPTLIT ASASPEP*NA
201 PAINGLSSTA LQNTTILAQP KPSGVISAVR LTVSPASLTA SILIPARVLP
251 ILMELHTISV VFIASGMERI NTSSEGDIPF CTNAEKPPIK DTPMALAALS
301 KVCATLT*
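Computer analysis of this amino acid sequence gave the following results:
Homology with a predicted ORF from N. meningitidis (strain A)
ORF24 shows 96.4% identity over a 307 aa overlap with an ORF (ORF24a) from strain A of N. meningitidis: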
10 20 30 40 50 60
orf24a.pep MRTAVVLLLIMPMAASSAMMPEMVCAGVSPGTAIISXPTEQTAVIASSLSNVSTPASAAA
|||||||||||||||||||||||||||||||||||| |||||||:|||||:|||||||||
orf24 MRTAVVLLLIMPMAASSAMMPEMVCAGVSPGTAIISKPTEQTAVMASSLSSVSTPASAAA
10 20 30 40 50 60
70 80 90 100 110 120
orf24a.pep IIPSSSXTGINAPLKPPTALEAIMPPFFTASFSNAKAAVVPCVPQTLKPISSRMRATESP
|||||| |||||||||||||||||||||||||||||||||||||||||||||||||||||
orf24 IIPSSSETGINAPLKPPTALEAIMPPFFTASFSNAKAAVVPCVPQTLKPISSRMRATESP
70 80 90 100 110 120
130 140 150 160 170 180
orf24a.pep TAGVGASDKSRIPNGIFSIFEASRPMSSPTRVILKAVFFTTSATSVNVVASEFSNAAFTT
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf24 TAGVGASDKSRIPNGIFSIFEASRPMSSPTRVILKAVFFTTSATSVNVVASEFSNAAFTT
130 140 150 160 170 180
190 200 210 220 230 240
orf24a.pep PGPDTPTLITASASPEPXNAPAIXGLSSXALQNTTILAQPKPSSVISXVRLMVSPASLTA
||||||||||||||||||||||| ||||:||||||||||||||:||| ||| ||||||||
orf24 PGPDTPTLITASASPEPXNAPAINGLSSTALQNTTILAQPKPSGVISAVRLTVSPASLTA
190 200 210 220 230 240
250 260 270 280 290 300
orf24a.pep SILIPARVLPILMELHTISVVFIASGMERXNTSSEGDIPFCTSAEKPPIKDTPMALAALS
||||||||||||||||||||||||||||| ||||||||||||:|||||||||||||||||
orf24 SILIPARVLPILMELHTISVVFIASGMERINTSSEGDIPFCTNAEKPPIKDTPMALAALS
250 260 270 280 290 300
orf24a.pep KVCATLTX
||||||||
orf24 KVCATLTX
The complete length ORF24a nucleotide sequence <SEQ ID 677> is:
1 ATGCGCACGG CAGTGGTTTT GCTGTTGATC ATGCCGATGG CGGCTTCGTC
51 GGCAATGATG CCGGAAATGG TGTGCGCGGG TGTGTCGCCG GGAACGGCAA
101 TCATATCCAA NCCGACCGAA CAAACGGCGG TCATCGCTTC GAGTTTATCC
151 AACGTCAGCA CGCCTGCTTC GGCGGCGGCA ATCATACCTT CGTCTTCGGA
201 NACGGGGATA AACGCGCCAC TCAAACCGCC AACCGCGCTC GAAGCCATCA
251 TGCCGCCCTT TTTCACGGCA TCGTTCAGCA ATGCCAAAGC TGCTGTTGTG
301 CCGTGCGTAC CGCAGACGCT CAAACCCATT TCTTCAAGAA TGCGCGCCAC
351 CGAGTCGCCG ACGGCAGGGG TCGGTGCCAG CGACAAGTCG AGAATACCAA
401 ACGGGATATT CAGCATTTTT GAGGCTTCGC GGCCGATGAG TTCGCCCACG
451 CGGGTAATTT TGAAGGCGGT TTTCTTCACA ACTTCGGCAA CTTCGGTCAA
501 TGTCGTTGCA TCCGAATTTT CCAACGCGGC TTTTACGACA CCCGGGCCGG
551 ATACGCCGAC ATTAATCACA GCATCCGCTT CGCCTGAGCC GTGAAACGCG
601 CCCGCCATAN ACGGGTTGTC TTCCNCCGCG TTGCAGAACA CGACGATTTT
651 GGCGCAGCCG AAACCTTCTA GTGTGATTTC ANCCGTGCGT TTGATGGTTT
701 CGCCCGCCAG TCTGACCGCG TCCATATTGA TACCGGCGCG CGTACTGCCG
751 ATATTGATGG AGCTGCACAC GATATCAGTA GTCTTCATCG CTTCGGGAAT
801 GGAACGGATN AACACCTCGT CAGAAGGCGA CATACCTTTT TGCACCAGCG
851 CGGAAAAGCC GCCAATAAAA GACACGCCGA TGGCTTTGGC AGCCTTATCC
901 AAAGTTTGCG CCACGCTGAC GTAA
This encodes a protein having amino acid sequence <SEQ ID 678>:
1 MRTAVVLLLI MPMAASSAMM PEMVCAGVSP GTAIISXPTE QTAVIASSLS
51 NVSTPASAAA IIPSSSXTGI NAPLKPPTAL EAIMPPFFTA SFSNAKAAVV
101 PCVPQTLKPI SSRMRATESP TAGVGASDKS RIPNGIFSIF EASRPMSSPT
151 RVILKAVFFT TSATSVNVVA SEFSNAAFTT PGPDTPTLIT ASASPEP*NA
201 PAIXGLSSXA LQNTTILAQP KPSSVISXVR LMVSPASLTA SILIPARVLP
251 ILMELHTISV VFIASGMERX NTSSEGDIPF CTSAEKPPIK DTPMALAALS
301 KVCATLT*
It should be noted that this protein includes a stop codon at position 198.
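An in-frame stop codon of this kind can be located by scanning the translated sequence for `*` before the final position. A minimal sketch, with `internal_stops` as a hypothetical helper and the row below transcribed with the stop written as `*`:

```python
def internal_stops(protein: str, offset: int = 0) -> list:
    """1-based positions of '*' other than a terminal one.
    `offset` is the number of residues preceding the given string."""
    body = protein[:-1] if protein.endswith("*") else protein
    return [offset + i + 1 for i, aa in enumerate(body) if aa == "*"]


# Residues 151-200 of the ORF24a translation.
row_151_200 = "RVILKAVFFTTSATSVNVVASEFSNAAFTTPGPDTPTLITASASPEP*NA"
print(internal_stops(row_151_200, offset=150))  # [198]
```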
ORF24a and ORF24-1 show 96.4% identity over a 307 aa overlap:
10 20 30 40 50 60
orf24a.pep MRTAVVLLLIMPMAASSAMMPEMVCAGVSPGTAIISXPTEQTAVIASSLSNVSTPASAAA
|||||||||||||||||||||||||||||||||||| |||||||:|||||:|||||||||
orf24-1 MRTAVVLLLIMPMAASSAMMPEMVCAGVSPGTAIISKPTEQTAVMASSLSSVSTPASAAA
10 20 30 40 50 60
70 80 90 100 110 120
orf24a.pep IIPSSSXTGINAPLKPPTALEAIMPPFFTASFSNAKAAVVPCVPQTLKPISSRMRATESP
|||||| |||||||||||||||||||||||||||||||||||||||||||||||||||||
orf24-1 IIPSSSETGINAPLKPPTALEAIMPPFFTASFSNAKAAVVPCVPQTLKPISSRMRATESP
70 80 90 100 110 120
130 140 150 160 170 180
orf24a.pep TAGVGASDKSRIPNGIFSIFEASRPMSSPTRVILKAVFFTTSATSVNVVASEFSNAAFTT
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf24-1 TAGVGASDKSRIPNGIFSIFEASRPMSSPTRVILKAVFFTTSATSVNVVASEFSNAAFTT
130 140 150 160 170 180
190 200 210 220 230 240
orf24a.pep PGPDTPTLITASASPEPXNAPAIXGLSSXALQNTTILAQPKPSSVISXVRLMVSPASLTA
||||||||||||||||||||||| ||||:||||||||||||||:||| ||| ||||||||
orf24-1 PGPDTPTLITASASPEPXNAPAINGLSSTALQNTTILAQPKPSGVISAVRLTVSPASLTA
190 200 210 220 230 240
250 260 270 280 290 300
orf24a.pep SILIPARVLPILMELHTISVVFIASGMERXNTSSEGDIPFCTSAEKPPIKDTPMALAALS
||||||||||||||||||||||||||||| ||||||||||||:|||||||||||||||||
orf24-1 SILIPARVLPILMELHTISVVFIASGMERINTSSEGDIPFCTNAEKPPIKDTPMALAALS
250 260 270 280 290 300
orf24a.pep KVCATLTX
||||||||
orf24-1 KVCATLTX
Homology with a predicted ORF from N. gonorrhoeae
ORF24 shows 96.7% identity over a 121 aa overlap with a predicted ORF (ORF24ng) from N. gonorrhoeae:
orf24.pep MRTAVVLLLIMPMAASSAMMPEMVCAGVSPGTAIISKPTEQTAVMASSLSSVSTPASAAA 60
||||||||||||||||||||||||||||||||||:|||||||||||||||||:|||||||
orf24ng MRTAVVLLLIMPMAASSAMMPEMVCAGVSPGTAIMSKPTEQTAVMASSLSSVNTPASAAA 60
orf24.pep IIPSSSETGINAPLKPPTALEAIMPPFFTASFSNAKAAVVPCVPQTLKPIXSRMRATXSP 120
|||||||||||||||||||||||||||||||||||||||||||||||||| |||||| ||
orf24ng IIPSSSETGINAPLKPPTALEAIMPPFFTASFSNAKAAVVPCVPQTLKPISSRMRATESP 120
orf24.pep TG 122
|:
orf24ng TAGVGASDKSRMPNGIFSIFEASRPMSSPTRVILKAVFFTTSATSVRLTASEFSSAALTT 180
The complete length ORF24ng nucleotide sequence <SEQ ID 679> is:
1 ATGCGCACGG CGGTGGTTTT GCTGTTGATC ATGCCGATGG CGGCTTCGTC
51 GGCGATGATG CCGGAAATGG TGTGCGCGGG CGTGTCGCCG GGAACGGCAA
101 TCATGTCCAA ACCAACGGAG CAGACGGCGG TCATGGCTTC GAGTTTGTCC
151 AGCGTCAACA CGCCTGCCTC GGCGGCGGCA ATCATACCTT CGTCTTCGGA
201 AACGGGGATA AACGCGCCGC TCAAACCGCC GACCGCGCTG GAAGCCATCA
251 TGCCGCCCTT TTTCACGGCA TCGTTCAGCA ATGCCAAAGC TGCTGTTGTG
301 CCGTGCGTAC CGCAGACGCT CAAGCCCATT TCTTCAAGAA TGCGCGCCAC
351 CGAGTCGCCG ACGGCGGGGG TCGGTGCCAG CGACAAATCG AGAATGCCGA
401 ACGGGATATT CAGCATTTTT GAGGCTTCGC GACCGATGAG TTCGCCCACG
451 CGGGTGATTT TGAAAGCGGT TTTCTTCACG ACTTCGGCGA CCTCGGTCAG
501 GCTGACCGCG TCCGAATTTT CCAGCGCGGC TTTGACCACG CCTGGACCGG
551 ATACGCCGAC ATTAATCACA GCATCCGCTT CGCCCGAGCC GTGGAACGCA
601 CCCGCCATAA ACGGATTGTC TTCCACCGCG TTGCAGAACA CGACGATTTT
651 GGCGCAGCCG AAACCTTCGG GTGTGATTTC AGCCGTGCGT TTGATGGTTT
701 CGCCTGCCAG CTTGACCGCA TCCATATTGA TACCGGCACG CGTGCTGCCG
751 ATATTGATGG AGCTGCACAC GATATCGGTA GTTTTCATCG CTTCGGGAAC
801 GGAACGGATC AACACCTCAT CCGAAGGCGA CATACCTTTT TGCACCAGCG
851 CGGAAAAGCC GCCGATAAAG GACACGCCGA TGGCTTTGGC TGCCTTGTCC
901 AAAGTCTGCG CCACGCTGAC ATAA
This encodes a protein having amino acid sequence <SEQ ID 680>:
1 MRTAVVLLLI MPMAASSAMM PEMVCAGVSP GTAIMSKPTE QTAVMASSLS
51 SVNTPASAAA IIPSSSETGI NAPLKPPTAL EAIMPPFFTA SFSNAKAAVV
101 PCVPQTLKPI SSRMRATESP TAGVGASDKS RMPNGIFSIF EASRPMSSPT
151 RVILKAVFFT TSATSVRLTA SEFSSAALTT PGPDTPTLIT ASASPEPWNA
201 PAINGLSSTA LQNTTILAQP KPSGVISAVR LMVSPASLTA SILIPARVLP
251 ILMELHTISV VFIASGTERI NTSSEGDIPF CTSAEKPPIK DTPMALAALS
301 KVCATLT*
ORF24ng and ORF24-1 show 96.1% identity in 307 aa overlap:
10 20 30 40 50 60
orf24-1.pep MRTAVVLLLIMPMAASSAMMPEMVCAGVSPGTAIISKPTEQTAVMASSLSSVSTPASAAA
||||||||||||||||||||||||||||||||||:|||||||||||||||||:|||||||
orf24ng MRTAVVLLLIMPMAASSAMMPEMVCAGVSPGTAIMSKPTEQTAVMASSLSSVNTPASAAA
10 20 30 40 50 60
70 80 90 100 110 120
orf24-1.pep IIPSSSETGINAPLKPPTALEAIMPPFFTASFSNAKAAVVPCVPQTLKPISSRMRATESP
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf24ng IIPSSSETGINAPLKPPTALEAIMPPFFTASFSNAKAAVVPCVPQTLKPISSRMRATESP
70 80 90 100 110 120
130 140 150 160 170 180
orf24-1.pep TAGVGASDKSRIPNGIFSIFEASRPMSSPTRVILKAVFFTTSATSVNVVASEFSNAAFTT
|||||||||||:|||||||||||||||||||||||||||||||||| ::|||||:||:||
orf24ng TAGVGASDKSRMPNGIFSIFEASRPMSSPTRVILKAVFFTTSATSVRLTASEFSSAALTT
130 140 150 160 170 180
190 200 210 220 230 240
orf24-1.pep PGPDTPTLITASASPEPXNAPAINGLSSTALQNTTILAQPKPSGVISAVRLTVSPASLTA
||||||||||||||||| ||||||||||||||||||||||||||||||||| ||||||||
orf24ng PGPDTPTLITASASPEPWNAPAINGLSSTALQNTTILAQPKPSGVISAVRLMVSPASLTA
190 200 210 220 230 240
250 260 270 280 290 300
orf24-1.pep SILIPARVLPILMELHTISVVFIASGMERINTSSEGDIPFCTNAEKPPIKDTPMALAALS
|||||||||||||||||||||||||| |||||||||||||||:|||||||||||||||||
orf24ng SILIPARVLPILMELHTISVVFIASGTERINTSSEGDIPFCTSAEKPPIKDTPMALAALS
250 260 270 280 290 300
orf24-1.pep KVCATLTX
||||||||
orf24ng KVCATLTX
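Based on this analysis, including the presence of a putative leader sequence (first 18 aa, double-underlined) and putative transmembrane domains (single-underlined) in the gonococcal protein, it is predicted that the proteins from N. meningitidis and N. gonorrhoeae, and their epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies.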
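The text does not say which program predicted the leader sequence and transmembrane domains. As an illustration only, a sliding-window hydropathy profile is one common way to flag such hydrophobic stretches. The sketch below uses the published Kyte-Doolittle scale; the choice of scale, the 9-residue window, and the names `hydropathy_profile` and `ORF24NG_NTERM` are assumptions made for this example, not the method used in the source.

```python
# Kyte-Doolittle hydropathy values (positive = hydrophobic).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(seq, window=9):
    """Mean hydropathy of every full window; unknown residues (X) score 0."""
    scores = [KD.get(aa, 0.0) for aa in seq.upper()]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]


# First 60 residues of ORF24ng, copied from the alignment above.
ORF24NG_NTERM = "MRTAVVLLLIMPMAASSAMMPEMVCAGVSPGTAIMSKPTEQTAVMASSLSSVNTPASAAA"

profile = hydropathy_profile(ORF24NG_NTERM, window=9)

# Mean hydropathy over the 18 residues described as the putative leader.
leader_mean = sum(KD[aa] for aa in ORF24NG_NTERM[:18]) / 18
print(f"leader mean hydropathy: {leader_mean:.2f}")
```

A clearly positive mean over the first 18 residues is consistent with a hydrophobic leader, but a real prediction would also need to consider the cleavage site and charge distribution.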
Example 81
The following partial DNA sequence was identified in N. meningitidis <SEQ ID 681>:
1 ..ACCGACGTGC AAAAAGAGTT GGTCGGCGAA CAACGCAAGT GGGCGCAGGA
51 AAAAATCAGC AACTGCCGAC AAGCCGCCGC GCAGGCAGAC CGGCAGGAAT
101 ACGCCGAATA CCTCAAGCTG CAATGCGACA CGCGGATGAC GCGCGAACGG
151 ATACAGTATC TTCGCGGCTA TTCCATCGAT TAG
This corresponds to the amino acid sequence <SEQ ID 682; ORF25>:
1 ..TDVQKELVGE QRKWAQEKIS NCRQAAAQAD RQEYAEYLKL QCDTRMTRER
51 IQYLRGYSID*
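The correspondence between the partial DNA sequence and its amino acid sequence can be checked by translating the listing with the standard genetic code. The following is a minimal sketch: the helper names `clean` and `translate` are made up for this example, and it assumes the listing starts in frame at its first base (the leading `..` only marks the missing upstream region).

```python
import re

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def clean(listing):
    """Drop position numbers, spaces and dots, keeping only A/C/G/T."""
    return re.sub(r"[^ACGT]", "", listing.upper())


def translate(dna):
    """Translate in frame 1; '*' marks a stop codon."""
    codons = (dna[i:i + 3] for i in range(0, len(dna) - 2, 3))
    return "".join(CODON_TABLE[c] for c in codons)


# Partial DNA sequence <SEQ ID 681>, copied from the listing above.
SEQ_ID_681 = """
1 ..ACCGACGTGC AAAAAGAGTT GGTCGGCGAA CAACGCAAGT GGGCGCAGGA
51 AAAAATCAGC AACTGCCGAC AAGCCGCCGC GCAGGCAGAC CGGCAGGAAT
101 ACGCCGAATA CCTCAAGCTG CAATGCGACA CGCGGATGAC GCGCGAACGG
151 ATACAGTATC TTCGCGGCTA TTCCATCGAT TAG
"""

protein = translate(clean(SEQ_ID_681))
print(protein)
```

The result reproduces the ORF25 fragment listed as <SEQ ID 682>, including the terminal stop.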
Further work revealed the complete nucleotide sequence <SEQ ID 683>:
1 ATGTATCGGA AACTCATTGC GCTGCCGTTT GCCCTGCTGC TTGGCGCTTG
51 CGGCAGGGAA GAACCGCCCA AGGCATTGGA ATGCGCCAAC CCCGCCGTGT
101 TGCAAGGCAT ACGCGGCAAT ATTCAGGAAA CGCTCACGCA GGAAGCGCGT
151 TCTTTCGCGC GCGAAGACGG CAGGCAGTTT GTCGATGCCG ACAAAATTAT
201 CGCCGCCGCC TACGGTTTGG CGTTTTCTTT GGAACACGCT TCGGAAACGC
251 AGGAAGGCGG GCGCACGTTC TGTATCGCCG ATTTGAACAT TACCGTGCCG
301 TCTGAAACGC TTGCCGATGC CAAGGCAAAC AGCCCCCTGT TGTACGGGGA
351 AACTGCTTTG TCGGATATTG TGCGGCAGAA GACGGGCGGC AATGTCGAGT
401 TTAAAGACGG CGTATTGACG GCAGCCGTCC GCTTCCTGCC CGTCAAAGAC
451 GGTCAGACGG CATTTGTCGA CAACACGGTC GGTATGGCGG CGCAAACGCT
501 GTCTGCCGCG CTGCTGCCTT ACGGCGTGAA GAGCATCGTG ATGATAGACG
551 GCAAGGCGGT GAAAAAAGAA GACGCGGTCA GGATTTTGAG CGGAAAAGCC
601 CGTGAAGAAG AACCGTCCAA ACCCACGCCC GAAGACATTT TGGAACACAA
651 TGCCGCCGGC GGCGATGCGG GCGTACCCCA AGCCGCAGAA GGCGCGCCCG
701 AACCGGAAAT CCTGCATCCT GACGACGGCG AGCGTGCCGA TACCGTTACC
751 GTATCACGGG GCGAAGTGGA AGAGGCGCGC GTACAAAACC AGCGTGCGGA
801 ATCCGAAATT ACCAAACTTT GGGGAGGACT CGATACCGAC GTGCAAAAAG
851 AGTTGGTCGG CGAACAACGC AAGTGGGCGC AGGAAAAAAT CAGCAACTGC
901 CGACAAGCCG CCGCGCAGGC AGACCGGCAG GAATACGCCG AATACCTCAA
951 GCTGCAATGC GACACGCGGA TGACGCGCGA ACGGATACAG TATCTTCGCG
1001 GCTATTCCAT CGATTAG
This corresponds to the amino acid sequence <SEQ ID 684; ORF25-1>:
1 MYRKLIALPF ALLLAACGRE EPPKALECAN PAVLQGIRGN IQETLTQEAR
51 SFAREDGRQF VDADKIIAAA YGLAFSLEHA SETQEGGRTF CIADLNITVP
101 SETLADAKAN SPLLYGETAL SDIVRQKTGG NVEFKDGVLT AAVRFLPVKD
151 GQTAFVDNTV GMAAQTLSAA LLPYGVKSIV MIDGKAVKKE DAVRILSGKA
201 REEEPSKPTP EDILEHNAAG GDAGVPQAAE GAPEPEILHP DDGERADTVT
251 VSRGEVEEAR VQNQRAESEI TKLWGGLDTD VQKELVGEQR KWAQEKISNC
301 RQAAAQADRQ EYAEYLKLQC DTRMTRERIQ YLRGYSID*
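Computer analysis of this amino acid sequence gave the following results:
Homology with a predicted ORF from N. meningitidis (strain A)
ORF25 shows 98.3% identity over a 60 aa overlap with an ORF (ORF25a) from strain A of N. meningitidis: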
10 20 30
orf25.pep TDVQKELVGEQRKWAQEKISNCRQAAAQAD
|||||||||| |||||||||||||||||||
orf25a VTVSRGEVEEARVQNQRAESEITKLWGGLDTDVQKELVGEXRKWAQEKISNCRQAAAQAD
250 260 270 280 290 300
40 50 60
orf25.pep RQEYAEYLKLQCDTRMTRERIQYLRGYSIDX
|||||||||||||||||||||||||||||||
orf25a RQEYAEYLKLQCDTRMTRERIQYLRGYSIDX
310 320 330
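The identity figure quoted for this overlap can be reproduced by counting matching columns. The sketch below is a simplified, ungapped comparison (the function name `percent_identity` is made up for this example); it treats the ambiguous residue X as a mismatch and leaves out the terminal stop symbol, which is an assumption about how the 60 aa overlap was counted.

```python
def percent_identity(a, b):
    """Percent of identical positions between two equal-length, ungapped sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return 100.0 * matches / len(a)


# The 60-residue overlap from the alignment above (stop symbol omitted).
ORF25 = "TDVQKELVGEQRKWAQEKISNCRQAAAQADRQEYAEYLKLQCDTRMTRERIQYLRGYSID"
ORF25A = "TDVQKELVGEXRKWAQEKISNCRQAAAQADRQEYAEYLKLQCDTRMTRERIQYLRGYSID"

identity = percent_identity(ORF25, ORF25A)
print(f"{identity:.1f}% identity over {len(ORF25)} aa")
```

The single mismatch is the X in ORF25a, giving 59 identical positions out of 60.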
The complete length ORF25a nucleotide sequence <SEQ ID 685> is:
1 ATGTATCGGA AACTCATTGC GCTGCCGTTT GCCCTGCTGC TTGCCGCTTG
51 CGGCAGGGAA GAACCGCCCA AGGCATTGGA ATGCGCCAAC CCCGCCGTGT
101 TGCAANGCAT ACGCNGCAAT ATTCAGGAAA CGCTCACGCA GGAAGCGCGT
151 TCTTTCGCGC GCGAAGACNG CANGCAGTTT GTCGATGCCG ACNAAATTAT
201 CGCCGCCGCC TANGNTNNGN NGNTNTCTTT GGAACACGCT TCGGAAACGC
251 AGGAAGGCGG GCGCACGTTC TGTNTCGCCG ATTTGAACAT TACCGTGCCG
301 TCTGAAACGC TTGCCGATGC CAAGGCAAAC AGCCCCCTGC TGTACGGGGA
351 AACCGCTTTG TCGGATATTG TGCGGCAGAA GACGGGCGGC AATGTCGAGT
401 TTAAAGACGG CGTATTGACG GCAGCCGTCC GCTTCCTACC CGTCAAAGAC
451 GGTCAGANGG CATTTGTCGA CAACACGGTC GGTATGGCGG CGCAAACGCT
501 GTCTGCCGCG TTGCTGCCTT ACGGCGTGAA GAGCATCGTG ATGATAGACG
551 GCAAGGCGGT AAAAAAAGAA GACGCGGTCA GGATTNTGAG CNGANAAGCC
601 CGTGAANAAG AACCGTCCAA ANCCNNGCCC GAAGACATTT TGGAACATAA
651 TGCCGCCGGA GGGGATGCAG ACGTACCCCA AGCCGGAGAA GACGCGCCCG
701 AACCGGAAAT CCTGCATCCT GACGACGGCG AGCGTGCCGA TACCGTTACC
751 GTATCACGGG GCGAAGTGGA AGAGGCGCGN GTACAAAACC AGCGTGCGGA
801 ATCCGAAATT ACCAAACTTT GGGGAGGACT CGATACCGAC GTGCAAAAAG
851 AGTTGGTCGG CGAANAACGC AAGTGGGCGC AGGAAAAAAT CAGCAACTGC
901 CGACAAGCCG CCGCGCAGGC AGACCGGCAG GAATACGCCG AATACCTCAA
951 GCTGCAATGC GACACGCGGA TGACGCGCGA ACGGATACAG TATCTTCGCG
1001 GCTATTCCAT CGATTAG
This encodes a protein having amino acid sequence <SEQ ID 686>:
1 MYRKLIALPF ALLLAACGRE EPPKALECAN PAVLQXIRXN IQETLTQEAR
51 SFAREDXXQF VDADXIIAAA XXXXXSLEHA SETQEGGRTF CXADLNITVP
101 SETLADAKAN SPLLYGETAL SDIVRQKTGG NVEFKDGVLT AAVRFLPVKD
151 GQXAFVDNTV GMAAQTLSAA LLPYGVKSIV MIDGKAVKKE DAVRIXSXXA
201 REXEPSKXXP EDILEHNAAG GDADVPQAGE DAPEPEILHP DDGERADTVT
251 VSRGEVEEAR VQNQRAESEI TKLWGGLDTD VQKELVGEXR KWAQEKISNC
301 RQAAAQADRQ EYAEYLKLQC DTRMTRERIQ YLRGYSID*
ORF25a and ORF25-1 show 93.5% identity in 338 aa overlap:
10 20 30 40 50 60
orf25a.pep MYRKLIALPFALLLAACGREEPPKALECANPAVLQXIRXNIQETLTQEARSFAREDXXQF
||||||||||||||||||||||||||||||||||| || ||||||||||||||||| ||
orf25-1 MYRKLIALPFALLLAACGREEPPKALECANPAVLQGIRGNIQETLTQEARSFAREDGRQF
10 20 30 40 50 60
70 80 90 100 110 120
orf25a.pep VDADXIIAAAXXXXXSLEHASETQEGGRTFCXADLNITVPSETLADAKANSPLLYGETAL
|||| ||||| |||||||||||||||| ||||||||||||||||||||||||||||
orf25-1 VDADKIIAAAYGLAFSLEHASETQEGGRTFCIADLNITVPSETLADAKANSPLLYGETAL
70 80 90 100 110 120
130 140 150 160 170 180
orf25a.pep SDIVRQKTGGNVEFKDGVLTAAVRFLPVKDGQXAFVDNTVGMAAQTLSAALLPYGVKSIV
||||||||||||||||||||||||||||||||:|||||||||||||||||||||||||||
orf25-1 SDIVRQKTGGNVEFKDGVLTAAVRFLPVKDGQTAFVDNTVGMAAQTLSAALLPYGVKSIV
130 140 150 160 170 180
190 200 210 220 230 240
orf25a.pep MIDGKAVKKEDAVRIXSXXAREXEPSKXXPEDILEHNAAGGDADVPQAGEDAPEPEILHP
||||||||||||||| | ||| |||| :|||||||||||||| ||||:| |||||||||
orf25-1 MIDGKAVKKEDAVRILSGKAREEEPSKPTPEDILEHNAAGGDAGVPQAAEGAPEPEILHP
190 200 210 220 230 240
250 260 270 280 290 300
orf25a.pep DDGERADTVTVSRGEVEEARVQNQRAESEITKLWGGLDTDVQKELVGEXRKWAQEKISNC
|||||||||||||||||||||||||||||||||||||||||||||||| |||||||||||
orf25-1 DDGERADTVTVSRGEVEEARVQNQRAESEITKLWGGLDTDVQKELVGEQRKWAQEKISNC
250 260 270 280 290 300
310 320 330 339
orf25a.pep RQAAAQADRQEYAEYLKLQCDTRMTRERIQYLRGYSIDX
|||||||||||||||||||||||||||||||||||||||
orf25-1 RQAAAQADRQEYAEYLKLQCDTRMTRERIQYLRGYSIDX
310 320 330
Homology with a predicted ORF from N. gonorrhoeae
ORF25 shows 100% identity over a 60 aa overlap with a predicted ORF (ORF25ng) from N. gonorrhoeae:
orf25.pep TDVQKELVGEQRKWAQEKISNCRQAAAQAD 30
||||||||||||||||||||||||||||||
orf25ng VTVSRGEVEEARVQNQRAESEITKLWGGLDTDVQKELVGEQRKWAQEKISNCRQAAAQAD 308
orf25.pep RQEYAEYLKLQCDTRMTRERIQYLRGYSID 60
||||||||||||||||||||||||||||||
orf25ng RQEYAEYLKLQCDTRMTRERIQYLRGYSID 338
The complete length ORF25ng nucleotide sequence <SEQ ID 687> is:
1 ATGTATCGGA AACTCATTGC GCTGCCGTTT GCCCTGCTGC TTGCAGCGTG
51 CGGCAGGGAA GAACCGCCCA AGGCGTTGGA ATGCGCCAAC CCCGCCGTGT
101 TGCAGGACAT ACGCGGCAGT ATTCAGGAAA CGCTCACGCA GGAAGCGCGT
151 TCTTTCGCGC GCGAAGACGG CAGGCAGTTT GTCGATGCCG ACAAAATTAT
201 CGCCGCCGCC TACGGTTTGG CGTTTTCTTT GGAACACGCT TCGGAAACGC
251 AGGAAGGCGG GCGCACGTTC TGTATCGCCG ATTTGAACAT TACCGTGCCG
301 TCTGAAACGC TTGCCGATGC CGAGGCAAAC AGCCCCCTGC TGTATGGGGA
351 AACGTCTTTG GCAGACATCG TGCAGCAGAA GACGGGCGGC AATGTCGAGT
401 TTAAAGACGG CGTATTGACG GCAGCCGTCC GCTTCCTGCC CGCCAAAGAC
451 GCTCGGACGG CATTTATCGA CAACACGGTC GGTATGGCGA CGCAAACGCT
501 GTCTGCCGCG TTGCTGCCTT ACGGCGTGAA GAGCATCGTG ATGATAGACG
551 GCAAGGCGGT GACAAAAGAA GACGCGGTCA GGGTTTTGAG CGGCAAAGCC
601 CGTGAAGAAG AACCGTCCAA ACCCACCCCC GAAGACATTT TGGAACACAA
651 TGCCGCCGGC GGCGATGCGG GCGTACCCCA AGCCGCAGAA GGCGCACCCG
701 AACCCGAAAT CCTGCATCCC GACGACGTCG AGCGTGCCGA TACCGTTACC
751 GTATCACGGG GCGAAGTGGA AGAGGCGCGC GTACAAAACC AACGTGCGGA
801 ATCCGAAATT ACCAAACTTT GGGGAGGACT CGATACCGAC GTGCAAAAAG
851 AGTTGGTCGG CGAACAGCGC AAGTGGGCGC AGGAAAAAAT CAGcaactgc
901 cgACAAGCCG CCGCGCAGGC AGACCGGCAG GAATACGCCG AATACCTCAA
951 GCTCCAATGC GACACGCGGA TGACGCGCGA ACggaTACAG TATCTTCGCG
1001 GCTATTCCAT CGATTAG
This encodes a protein having amino acid sequence <SEQ ID 688>:
1 MYRKLIALPF ALLLAACGRE EPPKALECAN PAVLQDIRGS IQETLTQEAR
51 SFAREDGRQF VDADKIIAAA YGLAFSLEHA SETQEGGRTF CIADLNITVP
101 SETLADAEAN SPLLYGETSL ADIVQQKTGG NVEFKDGVLT AAVRFLPAKD
151 ARTAFIDNTV GMATQTLSAA LLPYGVKSIV MIDGKAVTKE DAVRVLSGKA
201 REEEPSKPTP EDILEHNAAG GDAGVPQAAE GAPEPEILHP DDVERADTVT
251 VSRGEVEEAR VQNQRAESEI TKLWGGLDTD VQKELVGEQR KWAQEKISNC
301 RQAAAQADRQ EYAEYLKLQC DTRMTRERIQ YLRGYSID*
ORF25ng and ORF25-1 show 95.9% identity in 338 aa overlap:
10 20 30 40 50 60
orf25-1.pep MYRKLIALPFALLLAACGREEPPKALECANPAVLQGIRGNIQETLTQEARSFAREDGRQF
||||||||||||||||||||||||||||||||||| |||:||||||||||||||||||||
orf25ng MYRKLIALPFALLLAACGREEPPKALECANPAVLQDIRGSIQETLTQEARSFAREDGRQF
10 20 30 40 50 60
70 80 90 100 110 120
orf25-1.pep VDADKIIAAAYGLAFSLEHASETQEGGRTFCIADLNITVPSETLADAKANSPLLYGETAL
|||||||||||||||||||||||||||||||||||||||||||||||:||||||||||:|
orf25ng VDADKIIAAAYGLAFSLEHASETQEGGRTFCIADLNITVPSETLADAEANSPLLYGETSL
70 80 90 100 110 120
130 140 150 160 170 180
orf25-1.pep SDIVRQKTGGNVEFKDGVLTAAVRFLPVKDGQTAFVDNTVGMAAQTLSAALLPYGVKSIV
:|||:||||||||||||||||||||||:||::|||:|||||||:||||||||||||||||
orf25ng ADIVQQKTGGNVEFKDGVLTAAVRFLPAKDARTAFIDNTVGMATQTLSAALLPYGVKSIV
130 140 150 160 170 180
190 200 210 220 230 240
orf25-1.pep MIDGKAVKKEDAVRILSGKAREEEPSKPTPEDILEHNAAGGDAGVPQAAEGAPEPEILHP
||||||| ||||||:|||||||||||||||||||||||||||||||||||||||||||||
orf25ng MIDGKAVTKEDAVRVLSGKAREEEPSKPTPEDILEHNAAGGDAGVPQAAEGAPEPEILHP
190 200 210 220 230 240
250 260 270 280 290 300
orf25-1.pep DDGERADTVTVSRGEVEEARVQNQRAESEITKLWGGLDTDVQKELVGEQRKWAQEKISNC
|| |||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf25ng DDVERADTVTVSRGEVEEARVQNQRAESEITKLWGGLDTDVQKELVGEQRKWAQEKISNC
250 260 270 280 290 300
310 320 330 339
orf25-1.pep RQAAAQADRQEYAEYLKLQCDTRMTRERIQYLRGYSIDX
|||||||||||||||||||||||||||||||||||||||
orf25ng RQAAAQADRQEYAEYLKLQCDTRMTRERIQYLRGYSIDX
310 320 330
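Based on this analysis, including the presence of a predicted prokaryotic membrane lipoprotein lipid attachment site (underlined) in the gonococcal protein, it is predicted that the proteins from N. meningitidis and N. gonorrhoeae, and their epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies.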
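A lipoprotein lipid attachment site of this kind is usually recognised by a short motif ending in the cysteine that becomes lipidated. The sketch below scans for such a motif with a regular expression. The pattern and the two extra rules (cysteine between positions 15 and 35, at least one Lys or Arg in the first seven residues) are modelled on the PROSITE lipoprotein signature as recalled here and should be treated as an assumption; the source does not state which tool or pattern was used, and `find_lipobox` is a hypothetical helper name.

```python
import re

# Assumed lipobox-style pattern: six non-charged residues, two small or
# hydrophobic residues, one permissive position, [AGS], then the Cys.
LIPOBOX = re.compile(r"[^DERK]{6}[LIVMFWSTAG]{2}[LIVMFYSTAGCQ][AGS]C")


def find_lipobox(protein):
    """Return the 1-based position of the candidate lipidated Cys, or None."""
    has_basic = any(aa in "KR" for aa in protein[:7])
    for start in range(len(protein)):
        m = LIPOBOX.match(protein, start)
        if m is None:
            continue
        cys_pos = m.end()  # index just past the Cys equals its 1-based position
        if has_basic and 15 <= cys_pos <= 35:
            return cys_pos
    return None


# N-terminus of ORF25ng <SEQ ID 688>, copied from the listing above.
ORF25NG_NTERM = "MYRKLIALPFALLLAACGREEPPKALECANPAVLQDIRGS"

print(find_lipobox(ORF25NG_NTERM))
```

Under these assumed rules the scan picks out the cysteine in `LLAAC` near the end of the N-terminal hydrophobic stretch.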
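ORF25-1 (37 kDa) was cloned in pET and pGex vectors and expressed in E. coli, as described above. The products of protein expression and purification were analyzed by SDS-PAGE. Figure 16A shows the results of affinity purification of the GST-fusion protein, and Figure 16B shows the results of expression of the His-fusion in E. coli. Purified His-fusion protein was used to immunise mice, whose sera were used for Western blot (Figure 16C), ELISA (positive result), and FACS analysis (Figure 16D). These experiments confirm that ORF25-1 is a surface-exposed protein and that it is a useful immunogen.
Figure 16E shows plots of hydrophilicity, antigenic index and AMPHI regions for ORF25-1.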
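As a rough plausibility check on the 37 kDa figure given for ORF25-1, the mass can be estimated from the sequence length. The sketch assumes an average residue mass of about 110 Da, which is a common rule of thumb and not a value taken from the source; it ignores the GST or His fusion partner and any processing of the N-terminus.

```python
AVERAGE_RESIDUE_MASS_DA = 110.0  # rule-of-thumb assumption, not a measured value


def approx_mass_kda(n_residues):
    """Approximate protein mass in kDa from the number of residues."""
    return n_residues * AVERAGE_RESIDUE_MASS_DA / 1000.0


ORF25_1_LENGTH = 338  # residues in <SEQ ID 684>, excluding the stop symbol

mass = approx_mass_kda(ORF25_1_LENGTH)
print(f"approximately {mass:.1f} kDa")
```

An exact value would sum the individual residue masses, so this estimate only shows that the stated size is of the right order for a 338-residue protein.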
Example 82
The following partial DNA sequence was identified in N. meningitidis <SEQ ID 689>:
1 ATGCAGCTGA TCGACTATTC ACATTCATTT TTCTCGGTTG TGCCACCCTT
51 TTTGGCACTG GCACTTGCCG TCATTACCCG CCGCGTACTG CTGTCTTTAG
101 GCATCGGTAT TCTGGwysGC GTTGCCTTTT TGGTCGGCGG CAACCCCGTC
151 GACGGTCTGA CACACCTGAA AGACATGGTC GTCGGCTTGG CTTGGTCAGA
201 CGsyGATTGG TCGCTGGGCA AACCAAAAAT CTTGGTTTTC CkGATACTTT
251 TGGGTATTTT TACTTCCCTG CTGACCTACT CCGGCAGCAA T.........
//
851 .......... .......... .......... ........AC TTCGCTGGTA
901 TTCGGCGGCA CTTGCGGCGT CTTTGCCGTC GTTCTCTGCA CGCTCGGCAC
951 GATTAAAACC GCCGACTATC CCAAAGCCGT TTGGCAGGGT GCGAAATCTA
1001 TGTTCGGCGC AATCGCCATT TTAATCCTCG CTTGGCTCAT CAGTACGGTT
1051 GTCGGCGAAA TGCACACCGG CGATTACCTC TCCACACTGG TTGCGGGCAA
1101 CATCCATCCC GGCTTCCTGC CCGTCATCCT CTTCCTGCTC GCCAGCGTGA
1151 TGGCGTTTGC CACAGGCACA AGCTGGGGGA CGTTCGGCAT TATGCTGCCG
1201 ATTGCCGCCG CCATGGCGGT CAAAGTCGAA CCCGCGCTGA TTATCCCGTG
1251 TATGTCCGCA GTAATGGCGG GGGCGGTATG CGGCGACCAC TGCTCGCCCA
1301 TTTCCGACAC GACCATCCTG TCGTCCACCG GCGCGCGCTG CAACCACATC
1351 GACCACGTTA CCTCGCAACT GCCTTACGCC TTAACCGTTG CCGCCGCCGC
1401 CGCATCGGGC TACCTCGCAT TGGGTCTGAC AAAATCCGCG CTGTTGGGCT
1451 TTGGCACGAC AGGCATTGTA TTGGCGGTGC TGATTTTTCT GTTGAAAGAT
1501 AAAAAA..
This corresponds to the amino acid sequence <SEQ ID 690; ORF26>:
1 MQLIDYSHSF FSVVPPFLAL ALAVITRRVL LSLGIGILXX VAFLVGGNPV
51 DGLTHLKDMV VGLAWSDXDW SLGKPKILVF XILLGIFTSL LTYSGSN...
//
251 .......... .......... .......... .......... ......TSLV
301 FGGTCGVFAV VLCTLGTIKT ADYPKAVWQG AKSMFGAIAI LILAWLISTV
351 VGEMHTGDYL STLVAGNIHP GFLPVILFLL ASVMAFATGT SWGTFGIMLP
401 IAAAMAVKVE PALIIPCMSA VMAGAVCGDH CSPISDTTIL SSTGARCNHI
451 DHVTSQLPYA LTVAAAAASG YLALGLTKSA LLGFGTTGIV LAVLIFLLKD
501 KK..
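The lowercase letters in the partial DNA listing (`w`, `y`, `s`, `k`) are IUPAC ambiguity codes, and the X residues in the ORF26 fragment fall exactly where a codon contains one of them. The sketch below shows one way to translate such codons: expand every ambiguity code and emit an amino acid only when all expansions agree. The function names are made up for this example, and the approach is an illustration rather than the procedure used in the source.

```python
from itertools import product

# IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate_codon(codon):
    """Translate one codon; return 'X' if its expansions disagree."""
    options = [IUPAC[base] for base in codon.upper()]
    residues = {CODON_TABLE["".join(c)] for c in product(*options)}
    return residues.pop() if len(residues) == 1 else "X"


def translate_ambiguous(dna):
    """Translate in frame 1, tolerating IUPAC ambiguity codes."""
    return "".join(
        translate_codon(dna[i:i + 3]) for i in range(0, len(dna) - 2, 3)
    )


# In-frame fragment of <SEQ ID 689> spanning codons 35 to 40.
fragment = "ATCGGTATTCTGGwysGC"
print(translate_ambiguous(fragment))
```

The fragment gives `IGILXX`, matching residues 35 to 40 of the ORF26 listing, and the completed sequence resolves the two X positions to V and G.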
Further work revealed the complete nucleotide sequence <SEQ ID 691>:
1 ATGCAGCTGA TCGACTATTC ACATTCATTT TTCTCGGTTG TGCCACCCTT
51 TTTGGCACTG GCACTTGCCG TCATTACCCG CCGCGTACTG CTGTCTTTAG
101 GCATCGGTAT TCTGGTCGGC GTTGCCTTTT TGGTCGGCGG CAACCCCGTC
151 GACGGTCTGA CACACCTGAA AGACATGGTC GTCGGCTTGG CTTGGTCAGA
201 CGGCGATTGG TCGCTGGGCA AACCAAAAAT CTTGGTTTTC CTGATACTTT
251 TGGGTATTTT TACTTCCCTG CTGACCTACT CCGGCAGCAA TCAGGCGTTT
301 GCCGACTGGG CAAAACGGCA CATTAAAAAC CGGCGCGGCG CGAAAATGCT
351 GACCGCCTGC CTCGTGTTCG TAACCTTTAT CGACGACTAT TTCCACAGTC
401 TCGCCGTCGG TGCGATTGCC CGCCCCGTTA CCGACAAGTT TAAAGTTTCC
451 CGCACCAAAC TCGCCTACAT CCTCGACTCC ACTGCCGCTC CTATGTGCGT
501 GCTGATGCCC GTTTCAAGCT GGGGCGCGTC GATTATCGCC ACGCTTGCCG
551 GACTGCTCGT TACCTACAAA ATCACCGAAT ACACGCCGAT GGGGACGTTT
601 GTCGCCATGA GCCTGATGAA CTATTACGCA CTGTTTGCCC TGATTATGGT
651 GTTCGTCGTC GCATGGTTTT CCTTCGACAT CGGCTCGATG GCACGTTTCG
701 AACAAGCCGC GTTGAACGAA GCCCACGATG AAACTGCCGT TTCAGACGCT
751 ACCAAAGGTC GTGTTTACGC ACTGATTATT CCCGTTTTGG CCTTAATCGC
801 CTCAACGGTT TCCGCCATGA TCTACACCGG CGCGCAGGCA AGCGAAACCT
851 TCAGCATTTT GGGGGCATTT GAAAACACGG ACGTAAACAC TTCGCTGGTA
901 TTCGGCGGCA CTTGCGGCGT CCTTGCCGTC GTTCTCTGCA CGCTCGGCAC
951 GATTAAAACC GCCGACTATC CCAAAGCCGT TTGGCAGGGT GCGAAATCTA
1001 TGTTCGGCGC AATCGCCATT TTAATCCTCG CTTGGCTCAT CAGTACGGTT
1051 GTCGGCGAAA TGCACACCGG CGATTACCTC TCCACACTGG TTGCGGGCAA
1101 CATCCATCCC GGCTTCCTGC CCGTCATCCT CTTCCTGCTC GCCAGCGTGA
1151 TGGCGTTTGC CACAGGCACA AGCTGGGGGA CGTTCGGCAT TATGCTGCCG
1201 ATTGCCGCCG CCATGGCGGT CAAAGTCGAA CCCGCGCTGA TTATCCCGTG
1251 TATGTCCGCA GTAATGGCGG GGGCGGTATG CGGCGACCAC TGCTCGCCCA
1301 TTTCCGACAC GACCATCCTG TCGTCCACCG GCGCGCGCTG CAACCACATC
1351 GACCACGTTA CCTCGCAACT GCCTTACGCC TTAACCGTTG CCGCCGCCGC
1401 CGCATCGGGC TACCTCGCAT TGGGTCTGAC AAAATCCGCG CTGTTGGGCT
1451 TTGGCACGAC AGGCATTGTA TTGGCGGTGC TGATTTTTCT GTTGAAAGAT
1501 AAAAAACGCG CCAACGCCTG A
This corresponds to the amino acid sequence <SEQ ID 692; ORF26-1>:
1 MQLIDYSHSF FSVVPPFLAL ALAVITRRVL LSLGIGILVG VAFLVGGNPV
51 DGLTHLKDMV VGLAWSDGDW SLGKPKILVF LILLGIFTSL LTYSGSNQAF
101 ADWAKRHIKN RRGAKMLTAC LVFVTFIDDY FHSLAVGAIA RPVTDKFKVS
151 RTKLAYILDS TAAPMCVLMP VSSWGASIIA TLAGLLVTYK ITEYTPMGTF
201 VAMSLMNYYA LFALIMVFVV AWFSFDIGSM ARFEQAALNE AHDETAVSDA
251 TKGRVYALII PVLALIASTV SAMIYTGAQA SETFSILGAF ENTDVNTSLV
301 FGGTCGVLAV VLCTLGTIKT ADYPKAVWQG AKSMFGAIAI LILAWLISTV
351 VGEMHTGDYL STLVAGNIHP GFLPVILFLL ASVMAFATGT SWGTFGIMLP
401 IAAAMAVKVE PALIIPCMSA VMAGAVCGDH CSPISDTTIL SSTGARCNHI
451 DHVTSQLPYA LTVAAAAASG YLALGLTKSA LLGFGTTGIV LAVLIFLLKD
501 KKRANA*
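Computer analysis of this amino acid sequence gave the following results:
Homology with the hypothetical transmembrane protein HI1586 of H. influenzae (accession number P44263)
ORF26 and HI1586 show 53% and 49% amino acid identity in 97 and 221 aa overlaps at the N-terminus and C-terminus, respectively: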
Orf26 1 MQLIDYSHSFFSVVPPFLALALAVITRRVXXXXXXXXXXXVAFLVGGNPVDGLTHLKDMV 60
M+LID+S S +S+VP LA+ LA+ TRRV L +L V
HI1586 14 MELIDFSSSVWSIVPALLAIILAIATRRVLVSLSAGIIIGSLMLSDWQIGSAFNYLVKNV 73
Orf26 61 VGLAWSDXDWSLGKPKILVFXILLGIFTSLLTYSGSN 97
V L ++D + + I++F +LLG+ T+LLT SGSN
HI1586 74 VSLVYADGEIN-SNMNIVLFLLLLGVLTALLTVSGSN 109
//
Orf26 86 IFTSLLTYSGS--NTSLVFGGTCGVFAVVLCTL--GTIKTADYPKAVWQGAKSMFGXXXX 141
+F+ L T+ + TSLV GG C + L + + +Y ++ G KSM G
HI1586 299 VFSVLGTFENTVVGTSLVVGGFCSIIISTLLIILDRQVSVPEYVRSWIVGIKSMSGAIAI 358
Orf26 142 XXXXXXXSTVVGEMHTGDYLSTLVAGNIHPGFLPVILFLLASVMAFATGTSWGTFGIMLP 201
+ +VG+M TG YLS+LV+GNI FLPVILF+L + MAF+TGTSWGTFGIMLP
HI1586 359 LFFAWTINKIVGDMQTGKYLSSLVSGNIPMQFLPVILFVLGAAMAFSTGTSWGTFGIMLP 418
Orf26 202 IAAAMAVKVEPALIIPCMSAVMAGAVCGDHCSPISDTTILSSTGARCNHIDHVTSQXXXX 261
IAAAMA P L++PC+SAVMAGAVCGDHCSP+SDTTILSSTGA+CNHIDHVT+Q
HI1586 419 IAAAMAANAAPELLLPCLSAVMAGAVCGDHCSPVSDTTILSSTGAKCNHIDHVTTQLPYA 478
Orf26 262 XXXXXXXXXXXXXXXXXKSALLGFGTTGIVLAVLIFLLKDK 302
S L GF T + L V+IF +K +
HI1586 479 ATVATATSIGYIVVGFTYSGLAGFAATAVSLIVIIFAVKKR 519
Homology with a predicted ORF from N. meningitidis (strain A)
ORF26 shows 58.2% identity over a 502 aa overlap with an ORF (ORF26a) from strain A of N. meningitidis:
10 20 30 40 50 60
orf26.pep MQLIDYSHSFFSVVPPFLALALAVITRRVLLSLGIGILXXVAFLVGGNPVDGLTHLKDMV
||||||||||||||||||||| ||||||| |||||||||| ||||| |||||||||||||||
orf26a MQLIDYSHSFFSVVPPFLALALAVITRRVLLSLGIGILVGVAFLVGGNPVDGLTHLKDMV
10 20 30 40 50 60
70 80 90 99
orf26.pep VGLAWSDXDWSLGKPKILVFXILLGIFTSLLTYSGSNXX---------------------
|||||||||||||||| |||| |||||||||||| ||||
orf26a VGLAWSDGDWSLGKPKXLVFLILLGIFTSLLTYSGSNQAFADWAKRHIKNRRGAKMLTAC
70 80 90 100 110 120
orf26.pep ------------------------------------------------------------
orf26a LVFVTFIDDYFHSLAVGAXARPVTDKFKVSRAKLAYILDSTAAPMCVLMPVSSWGASIIA
130 140 150 160 170 180
orf26.pep ------------------------------------------------------------
orf26a TLAGLLVTYKITEYTPMGTFVAMSLMNYYALFALIMVFVVAWFSFDIGSMARFEQAALNE
190 200 210 220 230 240
100 110
orf26.pep --------------------------------------------------------TSLV
||||
orf26a AHDETAVSDGSWGRVYALIIPVLALIASTVSAMIYTGAQASETFSILGAFENTDVNTSLV
250 260 270 280 290 300
120 130 140 150 160 170
orf26.pep FGGTCGVFAVVLCTLGTIKTADYPKAVWQGAKSMFGAIAILILAWLISTVVGEMHTGDYL
|||||||:||||||||||| ||||||||||||||||||||||||||||||||||||||||
orf26a FGGTCGVLAVVLCTLGTIKIADYPKAVWQGAKSMFGAIAILILAWLISTVVGEMHTGDYL
310 320 330 340 350 360
180 190 200 210 220 230
orf26.pep STLVAGNIHPGFLPVILFLLASVMAFATGTSWGTFGIMLPIAAAMAVKVEPALIIPCMSA
|||||||||| ||| ||||||||||||| ||||| |||||||||||||||| |: |:||||||||
orf26a STLVAGNIHPGFLXVILFLLASVMAFATGTSWGTFGIMLPIAAAMAVKVDPSLIIPCMSA
370 380 390 400 410 420
240 250 260 270 280 290
orf26.pep VMAGAVCGDHCSPISDTTILSSTGARCNHIDHVTSQLPYALTVAAAAASGYLALGLTKSA
|||||||| ||||||||||||||||||||||||||||||| ||||||||||||||||| ||||
orf26a VMAGAVCGDHCSPISDTTILSSTGARCNHIDHVTSQLPYALTVAAAAASGYLALGLTKSA
430 440 450 460 470 480
300 310
orf26.pep LLGFGTTGIVLAVLIFLLKDKK
||||||:|||||||||| |||||
orf26a LLGFGXTGIVLAVLIFLLKDKKRANAX
490 500
The complete length ORF26a nucleotide sequence <SEQ ID 693> is:
1 ATGCAGCTGA TCGACTATTC ACATTCATTT TTCTCGGTTG TGCCACCCTT
51 TTTGGCACTG GCACTTGCCG TCATTACCCG CCGCGTACTG CTGTCTTTAG
101 GCATCGGTAT TCTGGTCGGC GTTGCCTTTT TGGTCGGCGG CAACCCCGTC
151 GACGGTCTGA CACACCTGAA AGACATGGTC GTCGGCTTGG CTTGGTCAGA
201 CGGCGATTGG TCGCTGGGCA AACCAAAANT CTTGGTTTTC CTGATACTTT
251 TGGGTATTTT TACTTCCCTG CTGACCTACT CCGGCAGCAA TCAGGCGTTT
301 GCCGACTGGG CAAAACGGCA CATTAAAAAC CGGCGCGGCG CGAAAATGCT
351 GACCGCCTGC CTCGTGTTCG TAACCTTTAT CGACGACTAT TTCCACAGTC
401 TCGCCGTCGG TGCGNTTGCC CGCCCCGTTA CCGACAAGTT TAAAGTTTCC
451 CGCGCCAAAC TCGCCTACAT CCTCGACTCC ACTGCCGCGC CTATGTGCGT
501 GCTGATGCCC GTTTCAAGCT GGGGCGCGTC GATTATCGCC ACGCTTGCCG
551 GACTGCTCGT TACCTACAAA ATCACCGAAT ACACGCCGAT GGGGACGTTT
601 GTCGCCATGA GCCTGATGAA CTATTACGCA CTGTTTGCCC TGATTATGGT
651 GTTCGTCGTC GCATGGTTCT CCTTCGACAT CGGCTCGATG GCACGTTTCG
701 AACAAGCCGC GTTGAACGAA GCCCACGATG AAACTGCCGT TTCAGACGGC
751 AGCTGGGGCA GGGTTTACGC ATTGATTATT CCCGTTTTGG CCTTAATCGC
801 CTCAACGGTT TCCGCCATGA TCTACACCGG TGCACAGGCA AGCGAAACCT
851 TCAGCATTTT GGGTGCATTT GAAAATACGG ACGTGAACAC TTCGCTGGTA
901 TTCGGCGGCA CTTGCGGCGT GCTTGCCGTC GTCCTCTGCA CGCTCGGCAC
951 GATTAAAATC GCCGATTATC CCAAAGCCGT TTGGCAGGGT GCGAAATCCA
1001 TGTTCGGCGC AATCGCCATT TTAATCCTTG CCTGGCTCAT CAGTACGGTT
1051 GTCGGCGAAA TGCACACAGG CGACTACCTC TCCACGCTGG TTGCGGGCAA
1101 CATCCATCCC GGCTTCCTGN CCGTCATCCT TTTCCTGCTC GCCAGCGTGA
1151 TGGCGTTTGC CACAGGCACA AGCTGGGGGA CGTTCGGCAT CATGCTGCCG
1201 ATTGCCGCCG CCATGGCGGT CAAAGTCGAT CCCTCACTGA TTATCCCGTG
1251 TATGTCCGCC GTGATGGCGG GGGCGGTATG CGGCGACCAC TGCTCGCCCA
1301 TTTCCGACAC GACCATCCTG TCGTCCACCG GCGCGCGCTG CAACCACATC
1351 GACCACGTTA CNTCGCAACT GCCTTACGCC TTAACCGTTG CCGCCGCCGC
1401 CGCATCGGGN TACCTCGCAT TGGGTCTGAC AAAATCCGCG CTGTTGGGTT
1451 TTGGCANGAC AGGCATTGTA TTGGCGGTGC TGATTTTTCT GTTGAAAGAT
1501 AAAAAACGCG CCAACGCCTG A
This encodes a protein having amino acid sequence <SEQ ID 694>:
1 MQLIDYSHSF FSVVPPFLAL ALAVITRRVL LSLGIGILVG VAFLVGGNPV
51 DGLTHLKDMV VGLAWSDGDW SLGKPKXLVF LILLGIFTSL LTYSGSNQAF
101 ADWAKRHIKN RRGAKMLTAC LVFVTFIDDY FHSLAVGAXA RPVTDKFKVS
151 RAKLAYILDS TAAPMCVLMP VSSWGASIIA TLAGLLVTYK ITEYTPMGTF
201 VAMSLMNYYA LFALIMVFVV AWFSFDIGSM ARFEQAALNE AHDETAVSDG
251 SWGRVYALII PVLALIASTV SAMIYTGAQA SETFSILGAF ENTDVNTSLV
301 FGGTCGVLAV VLCTLGTIKI ADYPKAVWQG AKSMFGAIAI LILAWLISTV
351 VGEMHTGDYL STLVAGNIHP GFLXVILFLL ASVMAFATGT SWGTFGIMLP
401 IAAAMAVKVD PSLIIPCMSA VMAGAVCGDH CSPISDTTIL SSTGARCNHI
451 DHVTSQLPYA LTVAAAAASG YLALGLTKSA LLGFGXTGIV LAVLIFLLKD
501 KKRANA*
ORF26a and ORF26-1 show 97.8% identity in 506 aa overlap:
10 20 30 40 50 60
orf26a.pep MQLIDYSHSFFSVVPPFLALALAVITRRVLLSLGIGILVGVAFLVGGNPVDGLTHLKDMV
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf26-1 MQLIDYSHSFFSVVPPFLALALAVITRRVLLSLGIGILVGVAFLVGGNPVDGLTHLKDMV
10 20 30 40 50 60
70 80 90 100 110 120
orf26a.pep VGLAWSDGDWSLGKPKXLVFLILLGIFTSLLTYSGSNQAFADWAKRHIKNRRGAKMLTAC
|||||||||||||||| |||||||||||||||||||||||||||||||||||||||||||
orf26-1 VGLAWSDGDWSLGKPKILVFLILLGIFTSLLTYSGSNQAFADWAKRHIKNRRGAKMLTAC
70 80 90 100 110 120
130 140 150 160 170 180
orf26a.pep LVFVTFIDDYFHSLAVGAXARPVTDKFKVSRAKLAYILDSTAAPMCVLMPVSSWGASIIA
|||||||||||||||||| |||||||||||||||:|||||||||||||||||||||||||
orf26-1 LVFVTFIDDYFHSLAVGAIARPVTDKFKVSRTKLAYILDSTAAPMCVLMPVSSWGASIIA
130 140 150 160 170 180
190 200 210 220 230 240
orf26a.pep TLAGLLVTYKITEYTPMGTFVAMSLMNYYALFALIMVFVVAWFSFDIGSMARFEQAALNE
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf26-1 TLAGLLVTYKITEYTPMGTFVAMSLMNYYALFALIMVFVVAWFSFDIGSMARFEQAALNE
190 200 210 220 230 240
250 260 270 280 290 300
orf26a.pep AHDETAVSDGSWGRVYALIIPVLALIASTVSAMIYTGAQASETFSILGAFENTDVNTSLV
|||||||||:: ||||||||||||||||||||||||||||||||||||||||||||||||
orf26-1 AHDETAVSDATKGRVYALIIPVLALIASTVSAMIYTGAQASETFSILGAFENTDVNTSLV
250 260 270 280 290 300
310 320 330 340 350 360
orf26a.pep FGGTCGVLAVVLCTLGTIKIADYPKAVWQGAKSMFGAIAILILAWLISTVVGEMHTGDYL
||||||||||||||||||| ||||||||||||||||||||||||||||||||||||||||
orf26-1 FGGTCGVLAVVLCTLGTIKTADYPKAVWQGAKSMFGAIAILILAWLISTVVGEMHTGDYL
310 320 330 340 350 360
370 380 390 400 410 420
orf26a.pep STLVAGNIHPGFLXVILFLLASVMAFATGTSWGTFGIMLPIAAAMAVKVDPSLIIPCMSA
||||||||||||| |||||||||||||||||||||||||||||||||||:|:||||||||
orf26-1 STLVAGNIHPGFLPVILFLLASVMAFATGTSWGTFGIMLPIAAAMAVKVEPALIIPCMSA
370 380 390 400 410 420
430 440 450 460 470 480
orf26a.pep VMAGAVCGDHCSPISDTTILSSTGARCNHIDHVTSQLPYALTVAAAAASGYLALGLTKSA
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf26-1 VMAGAVCGDHCSPISDTTILSSTGARCNHIDHVTSQLPYALTVAAAAASGYLALGLTKSA
430 440 450 460 470 480
490 500
orf26a.pep LLGFGXTGIVLAVLIFLLKDKKRANAX
|||||:|||||||||||||||||||||
orf26-1 LLGFGTTGIVLAVLIFLLKDKKRANAX
490 500
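Homology with a predicted ORF from N. gonorrhoeae
ORF26 shows 94.8% and 99% identity in 97 and 206 aa overlaps at the N-terminus and C-terminus, respectively, with a predicted ORF (ORF26ng) from N. gonorrhoeae: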
orf26.pep MQLIDYSHSFFSVVPPFLALALAVITRRVLLSLGIGILXXVAFLVGGNPVDGLTHLKDMV 60
|||||||||||||||||||||||||||||||||||||| ||||||||||||||||||||
orf26ng MQLIDYSHSFFSVVPPFLALALAVITRRVLLSLGIGILVGVAFLVGGNPVDGLTHLKDMV 60
orf26.pep VGLAWSDXDWSLGKPKILVFXILLGIFTSLLTYSGSN 97
|||||:| |||||||||||| ||||||||||||||||
orf26ng VGLAWADGDWSLGKPKILVFLILLGIFTSLLTYSGSNQAFADWAKRHIKNRCGAKMLTAC 120
//
orf26.pep TSLVFGGTCGVFAVVLCTLGTIKTADYPKA 326
|||||||||||:||||||:|||||||||||
orf26ng ASTVSAMIYTGAQASETFSILGAFENTDVNTSLVFGGTCGVLAVVLCTFGTIKTADYPKA 326
orf26.pep VWQGAKSMFGAIAILILAWLISTVVGEMHTGDYLSTLVAGNIHPGFLPVILFLLASVMAF 386
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf26ng VWQGAKSMFGAIAILILAWLISTVVGEMHTGDYLSTLVAGNIHPGFLPVILFLLASVMAF 386
orf26.pep ATGTSWGTFGIMLPIAAAMAVKVEPALIIPCMSAVMAGAVCGDHCSPISDTTILSSTGAR 446
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf26ng ATGTSWGTFGIMLPIAAAMAVKVEPALIIPCMSAVMAGAVCGDHCSPISDTTILSSTGAR 446
orf26.pep CNHIDHVTSQLPYALTVAAAAASGYLALGLTKSALLGFGTTGIVLAVLIFLLKDKK 502
||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf26ng CNHIDHVTSQLPYALTVAAAAASGYLALGLTKSALLGFGTTGIVLAVLIFLLKDKKRADV 506
The complete length ORF26ng nucleotide sequence <SEQ ID 695> is:
1 ATGCAGCTGA TTGACTATTC ACATTCATTT TTCTCGGTTG TGCCACCCTT
51 TTTGGCACTG GCACTTGCCG TCATTACCCG CCGCGTACTG CTGTCTTTAG
101 GCATCGGTAT TTTGGTCGGC GTTGCCTTTT TGGTCGGCGG CAACCCCGTC
151 GACGGTCTGA CACACCTGAA AGACATGGTC GTCGGCTTGG CTTGGGCAGA
201 CGGCGATTGG TCGCTGGGCA AACCAAAAAT CTTGGTTTTC CTGATACTTT
251 TGGGCATTTT CACTTCACTG CTGACCTACT CCGGCAGCAA TCAGGCGTTT
301 GCCGACTGGG CAAAACGGCA CATTAAAAAC CGGTGCGGCG CGAAAATGCT
351 GACCGCCTGC CTCGTGTTCG TAACCTTTAT CGACGACTAT TTCCACAGCC
401 TCGCCGTCGG TGCGATTGCC CGCCCCGTTA CCGACAAGTT TAAAGTTTCC
451 CGCGCCAAAC TCGCCTACAT CCTCGACTCC ACTGCCTCGC CCATGTGCGT
501 GCTGATGGCC GTTTCAAGCT GGGGCGCGTC GATTATCGCC ACGCTTGCCG
551 GATTGCTCGT TACCTACAAA ATTACCGAAT ACACGCCGAT GGGGACGTTT
601 GTCGCCATGA GCCTGATGAA CTATTACGCG CTGTTTGCCC TGATTATGGT
651 ATTCGTCGTC GCATGGTTCT CCTTCGACAT CGGCTCGAtg gCGCGTTTCG
701 AACAGGCTGC GTTGAACGAA gcccaggacg aaaccgccgc tTCAGACgCT
751 ACCAAAGGTC GTGTTTACGC ATTGATTATT CCCGTTTTGG CCTTAATCGC
801 CTCAACGGTT TCCGCCATGA TCTACACCGG CGCGCAGGCA AGCGAAACCT
851 TCAGCATTTT GGGGGCATTT GAAAATACCG ACGTAAACAC TTCGCTGGTA
901 TTCGGCGGCA CTTGCGGCGT GCTTGCCGTC GTCCTCTGCA CGTTCGGCAC
951 GATTAAAACC GCCGATTATC CCAAAGCCGT GTGGCAGGGT GCGAAATCCA
1001 TGTTCGGCGC AATCGCCATT TTAATCCTCG CCTGGCTCAT CAGTACGGTT
1051 GTCGGCGAAA TGCACACGGG CGACTACCTC TCCACGCTGG TTGCGGGCAA
1101 CATCCATCCC GGCTTCCTGC CCGTCATCCT CTTCCTGCTC GCCAGCGTGA
1151 TGGCGTTTGC CACAGGCACA AGCTGGGGGA CGTTCGGCAT TATGCTGCCG
1201 ATTGCCGCCG CCATGGCGGT CAAAGTCGAA CCCGCGCTGA TTAtcccGTG
1251 TATGTCCGCA GTAATGGCGG GGGCGGTATG CGGCGACCAC TGTTCGCCCA
1301 TCTCCGACAC GACCATCCTG TCGTCCACCG GCGCGCGCTG CAACCACATC
1351 GACCACGTTA CCTCGCAACT GCCTTATGCC CTGACGGTTG CCGCCGCCGC
1401 CGCATCGGGC TACCTCGCAT TGGGTCTGAC AAAATCCGCG CTGTTGGGCT
1451 TTGGCACGAC CGGTATTGTA TTGGCGGTGC TGATTTTTCT GTTGAAAGAT
1501 AAAAAACGCG CCGACGTTTG A
This encodes a protein having amino acid sequence <SEQ ID 696>:
1 MQLIDYSHSF FSVVPPFLAL ALAVITRRVL LSLGIGILVG VAFLVGGNPV
51 DGLTHLKDMV VGLAWADGDW SLGKPKILVF LILLGIFTSL LTYSGSNQAF
101 ADWAKRHIKN RCGAKMLTAC LVFVTFIDDY FHSLAVGAIA RPVTDKFKVS
151 RAKLAYILDS TASPMCVLMP VSSWGASIIA TLAGLLVTYK ITEYTPMGTF
201 VAMSLMNYYA LFALIMVFVV AWFSFDIGSM ARFEQAALNE AQDETAASDA
251 TKGRVYALII PVLALIASTV SAMIYTGAQA SETFSILGAF ENTDVNTSLV
301 FGGTCGVLAV VLCTFGTIKT ADYPKAVWQG AKSMFGAIAI LILAWLISTV
351 VGEMHTGDYL STLVAGNIHP GFLPVILFLL ASVMAFATGT SWGTFGIMLP
401 IAAAMAVKVE PALIIPCMSA VMAGAVCGDH CSPISDTTIL SSTGARCNHI
451 DHVTSQLPYA LTVAAAAASG YLALGLTKSA LLGFGTTGIV LAVLIFLLKD
501 KKRADV*
ORF26ng and ORF26-1 show 98.4% identity in 505 aa overlap:
10 20 30 40 50 60
orf26-1.pep MQLIDYSHSFFSVVPPFLALALAVITRRVLLSLGIGILVGVAFLVGGNPVDGLTHLKDMV
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf26ng MQLIDYSHSFFSVVPPFLALALAVITRRVLLSLGIGILVGVAFLVGGNPVDGLTHLKDMV
10 20 30 40 50 60
70 80 90 100 110 120
orf26-1.pep VGLAWSDGDWSLGKPKILVFLILLGIFTSLLTYSGSNQAFADWAKRHIKNRRGAKMLTAC
|||||:||||||||||||||||||||||||||||||||||||||||||||| ||||||||
orf26ng VGLAWADGDWSLGKPKILVFLILLGIFTSLLTYSGSNQAFADWAKRHIKNRCGAKMLTAC
70 80 90 100 110 120
130 140 150 160 170 180
orf26-1.pep LVFVTFIDDYFHSLAVGAIARPVTDKFKVSRTKLAYILDSTAAPMCVLMPVSSWGASIIA
|||||||||||||||||||||||||||||||:||||||||||:|||||||||||||||||
orf26ng LVFVTFIDDYFHSLAVGAIARPVTDKFKVSRAKLAYILDSTASPMCVLMPVSSWGASIIA
130 140 150 160 170 180
190 200 210 220 230 240
orf26-1.pep TLAGLLVTYKITEYTPMGTFVAMSLMNYYALFALIMVFVVAWFSFDIGSMARFEQAALNE
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf26ng TLAGLLVTYKITEYTPMGTFVAMSLMNYYALFALIMVFVVAWFSFDIGSMARFEQAALNE
190 200 210 220 230 240
250 260 270 280 290 300
orf26-1.pep AHDETAVSDATKGRVYALIIPVLALIASTVSAMIYTGAQASETFSILGAFENTDVNTSLV
|:||||:|||||||||||||||||||||||||||||||||||||||||||||||||||||
orf26ng AQDETAASDATKGRVYALIIPVLALIASTVSAMIYTGAQASETFSILGAFENTDVNTSLV
250 260 270 280 290 300
310 320 330 340 350 360
orf26-1.pep FGGTCGVLAVVLCTLGTIKTADYPKAVWQGAKSMFGAIAILILAWLISTVVGEMHTGDYL
||||||||||||||:|||||||||||||||||||||||||||||||||||||||||||||
orf26ng FGGTCGVLAVVLCTFGTIKTADYPKAVWQGAKSMFGAIAILILAWLISTVVGEMHTGDYL
310 320 330 340 350 360
370 380 390 400 410 420
orf26-1.pep STLVAGNIHPGFLPVILFLLASVMAFATGTSWGTFGIMLPIAAAMAVKVEPALIIPCMSA
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf26ng STLVAGNIHPGFLPVILFLLASVMAFATGTSWGTFGIMLPIAAAMAVKVEPALIIPCMSA
370 380 390 400 410 420
430 440 450 460 470 480
orf26-1.pep VMAGAVCGDHCSPISDTTILSSTGARCNHIDHVTSQLPYALTVAAAAASGYLALGLTKSA
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf26ng VMAGAVCGDHCSPISDTTILSSTGARCNHIDHVTSQLPYALTVAAAAASGYLALGLTKSA
430 440 450 460 470 480
490 500
orf26-1.pep LLGFGTTGIVLAVLIFLLKDKKRANAX
||||||||||||||||||||||||::
orf26ng LLGFGTTGIVLAVLIFLLKDKKRADVX
490 500
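In addition, ORF26ng shows significant homology with a hypothetical H. influenzae protein:
sp|P44263|YF86_HAEIN HYPOTHETICAL PROTEIN HI1586 >gi|1074850|pir||C64037 hypothetical protein HI1586 - Haemophilus influenzae (strain Rd KW20) >gi|1574427 (U32832) H. influenzae predicted coding region HI1586 [Haemophilus influenzae] Length = 519
Score = 538 bits (1370), Expect = e-152
Identities = 280/507 (55%), Positives = 346/507 (68%), Gaps = 7/507 (1%)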
Query: 1 MQLIDYSHSFFSVVPPFLALALAVITRRXXXXXXXXXXXXXAFLVGGNPVDGLTHLKDMV 60
M+LID+S S +S+VP LA+ LA+ TRR L +L V
Sbjct: 14 MELIDFSSSVWSIVPALLAIILAIATRRVLVSLSAGIIIGSLMLSDWQIGSAFNYLVKNV 73
Query: 61 VGLAWADGDWSLGKPKILVFLILLGIFTSLLTYSGSNQAFADWAKRHIKNRCGAKMLTAC 120
V L +ADG+ + I++FL+LLG+ T+LLT SGSN+AFA+WA+ IK R GAK+L A
Sbjct: 74 VSLVYADGEIN-SNMNIVLFLLLLGVLTALLTVSGSNRAFAEWAQSRIKGRRGAKLLAAS 132
Query: 121 LVFVTFIDDYFHSLAVGAIARPVTDKFKVSRAKLAYILDSTASPMCVLMPVSSWGASIIA 180
LVFVTFIDDYFHSLAVGAIARPVTD+FKVSRAKLAYILDSTA+PMCV+MPVSSWGA II
Sbjct: 133 LVFVTFIDDYFHSLAVGAIARPVTDRFKVSRAKLAYILDSTAAPMCVMMPVSSWGAYIIT 192
Query: 181 TLAGLLVTYKITEYTPMGTFVAMSLMNYYALFALIMVFVVAWFSFDIGSMARFEQAALNE 240
+ GLL TY ITEYTP+G FVAMS MN+YA+F++IMVF VA+FSFDI SM R E+ AL
Sbjct: 193 LIGGLLATYSITEYTPIGAFVAMSSMNFYAIFSIIMVFFVAYFSFDIASMVRHEKLALKN 252
Query: 241 AQDETAASDATKGRVYALIIPVLALIASTVSAMIYTGAQA----SETFSILGAFENTDVN 296
+D+ TKG+V LI+P+L LI +TVS MIYTGA+A + FS+LG FENT V
Sbjct: 253 TEDQLEEETGTKGQVRNLILPILVLIIATVSMMIYTGAEALAADGKVFSVLGTFENTVVG 312
Query: 297 TSLVFGGTCGVL--AVVLCTFGTIKTADYPKAVWQGAKSMFGXXXXXXXXXXXSTVVGEM 354
TSLV GG C ++ +++ + +Y ++ G KSM G + +VG+M
Sbjct: 313 TSLVVGGFCSIIISTLLIILDRQVSVPEYVRSWIVGIKSMSGAIAILFFAWTINKIVGDM 372
Query: 355 HTGDYLSTLVAGNIHPGFLPVILFLLASVMAFATGTSWGTFGIMLPIAAAMAVKVEPALI 414
TG YLS+LV+GNI FLPVILF+L + MAF+TGTSWGTFGIMLPIAAAMA P L+
Sbjct: 373 QTGKYLSSLVSGNIPMQFLPVILFVLGAAMAFSTGTSWGTFGIMLPIAAAMAANAAPELL 432
Query: 415 IPCMSAVMAGAVCGDHCSPISDTTILSSTGARCNHIDHVTSQXXXXXXXXXXXXXXXXXX 474
+PC+SAVMAGAVCGDHCSP+SDTTILSSTGA+CNHIDHVT+Q
Sbjct: 433 LPCLSAVMAGAVCGDHCSPVSDTTILSSTGAKCNHIDHVTTQLPYAATVATATSIGYIVV 492
Query: 475 XXXKSALLGFGTTGIVLAVLIFLLKDK 501
S L GF T + L V+IF +K +
Sbjct: 493 GFTYSGLAGFAATAVSLIVIIFAVKKR 519
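The percentages in the BLAST summary for this hit follow directly from the reported counts. The sketch below recomputes them from the numbers given in the summary (280 identities, 346 positives and 7 gaps over an alignment length of 507); the function name is made up for this example, and rounding to the nearest integer is an assumption about how the report formats the values.

```python
def blast_percentages(identities, positives, gaps, length):
    """Return (identity %, positive %, gap %) as whole numbers."""
    return tuple(round(100 * n / length) for n in (identities, positives, gaps))


# Counts taken from the BLAST summary for ORF26ng versus HI1586.
summary = blast_percentages(identities=280, positives=346, gaps=7, length=507)
print(summary)
```

"Positives" counts identical residues plus conservative substitutions (the `+` signs in the middle line of the alignment), which is why it exceeds the identity figure.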
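Based on this analysis, it is predicted that these proteins from N. meningitidis and N. gonorrhoeae, and their epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies.
Example 83
The following partial DNA sequence was identified in N. meningitidis <SEQ ID 697>: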
1 ..AAGCAATGGT ATGCCGACGN .AGTATCAAG ACGGAAATGG TTATGGTCAA
51 CGATGAGCCT GCCAAAATTC TGACTTGGGA TGAAAGCGGC CGATTACTCT
101 CGGAACTGTC TATCCGCCAC CATCAACGCA ACGGGGTGGT TTTGGAGTGG
151 TATGAAGATG GTTCTAAAAA GAGCGAAGT. GTTTATCAGG ATGACAAGTT
201 GGTCAGGAAA ACCCAGTGGG ATAAGGATGG TTATTTAATC GAACCCTGA
This corresponds to the amino acid sequence <SEQ ID 698; ORF27>:
1 ..KQWYADXSIK TEMVMVNDEP AKILTWDESG RLLSELSIRH HQRNGVVLEW
51 YEDGSKKSEX VYQDDKLVRK TQWDKDGYLI EP*
Further work revealed the complete nucleotide sequence <SEQ ID 699>:
1 ATGAAAAAAT TATCTCGGAT TGTATTTTCA ACTGTCCTGT TGGGTTTTTC
51 GGCCGCTTTG CCGGCGCAGA CCTATTCTGT TTATTTTAAT CAGAACGGAA
101 AGCTGACGGC GACGATGTCT TCTGCCGCTT ATATCAGGCA ATATAGTGTG
151 GTGGCGGGTA TTGCGCACGC GCAGGATTTT TATTATCCGT CGATGAAGAA
201 ATATTCTGAA CCTTATATCG TTGCTTCAAC GCAAATCAAA TCTTTTGTGC
251 CTACCCTGCA AAACGGTATG TTGATTTTGT GGCATTTTAA TGGTCAGAAA
301 AAAATGGCGG GGGGCTTCAG CAAGGGTAAG CCGGACGGGG AGTGGGTCAA
351 CTGGTATCCG AACGGTAAAA AATCTGCCGT TATGCCTTAT AAAAATGGCT
401 TGAGTGAGGG TACGGGATAC CGCTATTACC GTAACGGCGG CAAGGAAAGC
451 GAAATCCAGT TTAAGCAAAA TAAGGCAAAC GGCGTATGGA AGCAATGGTA
501 TGCCGACGGC AGTATCAAGA CGGAAATGGT TATGGTCAAC GATGAGCCTG
551 CCAAAATTCT GACTTGGGAT GAAAGCGGCC GATTACTCTC GGAACTGTCT
601 ATCCGCCACC ATCAACGCAA CGGGGTGGTT TTGGAGTGGT ATGAAGATGG
651 TTCTAAAAAG AGCGAAGCTG TTTATCAGGA TGACAAGTTG GTCAGGAAAA
701 CCCAGTGGGA TAAGGATGGT TATTTAATCG AACCCTGA
This corresponds to the amino acid sequence <SEQ ID 700; ORF27-1>:
1 MKKLSRIVFS TVLLGFSAAL PAQTYSVYFN QNGKLTATMS SAAYIRQYSV
51 VAGIAHAQDF YYPSMKKYSE PYIVASTQIK SFVPTLQNGM LILWHFNGQK
101 KMAGGFSKGK PDGEWVNWYP NGKKSAVMPY KNGLSEGTGY RYYRNGGKES
151 EIQFKQNKAN GVWKQWYADG SIKTEMVMVN DEPAKILTWD ESGRLLSELS
201 IRHHQRNGVV LEWYEDGSKK SEAVYQDDKL VRKTQWDKDG YLIEP*
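The relationship between SEQ ID 699 and ORF27-1 can be checked by translating the nucleotide listing in reading frame 1. The following is a minimal sketch, assuming the standard codon assignments (which bacterial translation table 11 shares); the names `CODON_TABLE` and `translate` are illustrative and do not come from the patent.

```python
# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(dna: str) -> str:
    """Translate DNA in frame 1; codons not in the table become 'X'."""
    # Remove the spacing between 10-base groups used in the listing.
    seq = "".join(dna.split()).upper()
    codons = (seq[i:i + 3] for i in range(0, len(seq) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


# First 30 and last 9 nucleotides of SEQ ID 699, copied from the listing.
print(translate("ATGAAAAAAT TATCTCGGAT TGTATTTTCA"))  # MKKLSRIVFS
print(translate("GAACCCTGA"))                          # EP*
```

The two fragments reproduce the first ten residues and the C-terminal `EP*` of ORF27-1 as listed. Position numbers and leading dots in the listing would need to be stripped before passing a whole sequence to this function.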
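Computer analysis of this amino acid sequence gave the following results:
Homology with a predicted ORF from N. meningitidis (strain A)
ORF27 shows 91.5% identity over an 82 aa overlap with an ORF (ORF27a) from strain A of N. meningitidis: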
10 20 30
orf27.pep KQWYADXSIKTEMVMVNDEPAKILTWDESG
|||||| :||||||||||||||||||||||
orf27a LSEGTGXRYYRNGGKESEIQFKQNKANGVWKQWYADGNIKTEMVMVNDEPAKILTWDESG
140 150 160 170 180 190
40 50 60 70 80
orf27.pep RLLSELSIRHHQRNGVVLEWYEDGSKKSEXVYQDDKLVRKTQWDKDGYLIEPX
||||||||:|| ||||||||||||||||| |||||||||||||| ||||||||
orf27a RLLSELSIHHHXRNGVVLEWYEDGSKKXEAVYQDDKLVRKTQWDXDGYLIEPX
200 210 220 230 240
The complete length ORF27a nucleotide sequence <SEQ ID 701> is:
1 ATGAAAAAAT TATCTCGGAT TGTATTTTCA ACTGTCCTGT TGGGTTTTTC
51 GGCCGCTTTG CCGGCGCAGA NCTATTCTGT TTATTTTAAT CAGAACGGGA
101 AACTGACGGC GACGNTGTCT TCTGCCGCNT ATATCAGGCA ATATAGTGTG
151 GCGGAGGGTA TTGCGCACGC GCAGGANTTT TANTATCCGT CGATGAAGAA
201 ATATTCCGAA CCTTATATCG TTGCTTCAAC GCAAATCAAA TCTTTTGTGC
251 CTACCCTGCA AAACGGTATG TTGATTTTGT GGCATTTTAA NGGTCAGAAA
301 AAAATGGCNG GGGGCTTCAG CAAGGGTAAG CCGGACGGGG AGTGGGTCAA
351 CTGGTATCCG AACGGTAAAA AATCTGCCGT TATGCCTTAT AAAAATGGTT
401 TGAGTGAAGG TACGGGGTNN CGCTATTACC GTAACGGCGG CAAGGAAAGC
451 GAAATCCAGT TTAAACAGAA TAAGGCAAAC GGCGTATGGA AGCAATGGTA
501 TGCCGACGGC AATATCAAAA CGGAAATGGT TATGGTCAAT GATGAGCCTG
551 CCAAAATTCT GACATGGGAT GAAAGCGGTC GATTACTCTC GGAACTGTCT
601 ATCCATCATC ATNAACGTAA TGGAGTAGTC TTAGAGTGGT ATGAAGATGG
651 TTCTAAAAAG ANTGAAGCTG TTTATCAGGA TGATAAGTTG GTCAGGAAAA
701 CCCAGTGGGA TAANGATGGT TATTTAATCG AACCCTGA
This encodes a protein having amino acid sequence <SEQ ID 702>:
1 MKKLSRIVFS TVLLGFSAAL PAQXYSVYFN QNGKLTATXS SAAYIRQYSV
51 AEGIAHAQXF XYPSMKKYSE PYIVASTQIK SFVPTLQNGM LILWHFXGQK
101 KMAGGFSKGK PDGEWVNWYP NGKKSAVMPY KNGLSEGTGX RYYRNGGKES
151 EIQFKQNKAN GVWKQWYADG NIKTEMVMVN DEPAKILTWD ESGRLLSELS
201 IHHHXRNGVV LEWYEDGSKK XEAVYQDDKL VRKTQWDXDG YLIEP*
ORF27a and ORF27-1 show 94.7% identity in a 245 aa overlap:
10 20 30 40 50 60
orf27a.pep MKKLSRIVFSTVLLGFSAALPAQXYSVYFNQNGKLTATXSSAAYIRQYSVAEGIAHAQXF
|||||||||||||||||||||||:|||||||||||||| |||||||||||: |||||| |
orf27-1 MKKLSRIVFSTVLLGFSAALPAQTYSVYFNQNGKLTATMSSAAYIRQYSVVAGIAHAQDF
10 20 30 40 50 60
70 80 90 100 110 120
orf27a.pep XYPSMKKYSEPYIVASTQIKSFVPTLQNGMLILWHFXGQKKMAGGFSKGKPDGEWVNWYP
||||||||||||||||||||||||||||||||||| |||||||||||||||||||||||
orf27-1 YYPSMKKYSEPYIVASTQIKSFVPTLQNGMLILWHFNGQKKMAGGFSKGKPDGEWVNWYP
70 80 90 100 110 120
130 140 150 160 170 180
orf27a.pep NGKKSAVMPYKNGLSEGTGXRYYRNGGKESEIQFKQNKANGVWKQWYADGNIKTEMVMVN
||||||||||||||||||| ||||||||||||||||||||||||||||||:|||||||||
orf27-1 NGKKSAVMPYKNGLSEGTGYRYYRNGGKESEIQFKQNKANGVWKQWYADGSIKTEMVMVN
130 140 150 160 170 180
190 200 210 220 230 240
orf27a.pep DEPAKILTWDESGRLLSELSIHHHXRNGVVLEWYEDGSKKXEAVYQDDKLVRKTQWDXDG
|||||||||||||||||||||:|| ||||||||||||||| |||||||||||||||| ||
orf27-1 DEPAKILTWDESGRLLSELSIRHHQRNGVVLEWYEDGSKKSEAVYQDDKLVRKTQWDKDG
190 200 210 220 230 240
orf27a.pep YLIEPX
||||||
orf27-1 YLIEPX
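Many of the non-identical columns in the ORF27a alignment involve `X`, an unresolved residue in the strain A sequence, rather than a true substitution. The sketch below separates the two cases for the first 60 aligned residues. The function name `classify_columns` and the decision to count any column containing `X` as unresolved are assumptions made for illustration only.

```python
def classify_columns(a: str, b: str, ambiguous: str = "X"):
    """Return (identical, unresolved, substituted) counts for two aligned sequences."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    identical = unresolved = substituted = 0
    for x, y in zip(a, b):
        if x in ambiguous or y in ambiguous:
            unresolved += 1      # one side is an undetermined residue
        elif x == y:
            identical += 1
        else:
            substituted += 1     # two different, fully determined residues
    return identical, unresolved, substituted


# Residues 1-60 of orf27a.pep and orf27-1, copied from the alignment above.
orf27a = "MKKLSRIVFSTVLLGFSAALPAQXYSVYFNQNGKLTATXSSAAYIRQYSVAEGIAHAQXF"
orf27_1 = "MKKLSRIVFSTVLLGFSAALPAQTYSVYFNQNGKLTATMSSAAYIRQYSVVAGIAHAQDF"

print(classify_columns(orf27a, orf27_1))  # (55, 3, 2)
```

In this excerpt only two of the five non-identical columns are real substitutions. Over the whole overlap, the alignment shows 13 non-identical columns in 245, and 232/245 rounds to the quoted 94.7%, so the reported figure appears to count `X` columns as mismatches.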
Homology with a predicted ORF from N. gonorrhoeae
ORF27 shows 96.3% identity over an 82 aa overlap with a predicted ORF (ORF27ng) from N. gonorrhoeae:
orf27.pep KQWYADXSIKTEMVMVNDEPAKILTWDESG 30
|||||| |||||||||||||||||||||||
orf27ng LSEGTGYRYYRNGGKESEIQFKQNKANGVWKQWYADGSIKTEMVMVNDEPAKILTWDESG 193
orf27.pep RLLSELSIRHHQRNGVVLEWYEDGSKKSEXVYQDDKLVRKTQWDKDGYLIEP 82
|||||||||||:||||||||||||||||| ||||||||||||||||||||||
orf27ng RLLSELSIRHHKRNGVYLEWYEDGSKKSEAVYQDDKLYRKTQWDKDGYLIEP 245
The complete length ORF27ng nucleotide sequence <SEQ ID 703> is:
1 ATGAAGAAAT TATCTCGGAT TGTATTTTCA ATCGTACTGT TGGGTTTTTC
51 GGCCGCTTTG CCGGCGCAGA CCTATTCTGT TTATTTTAAT CAGAACGGGA
101 AACTGACGGC GACGATGTCT TCTGCCGCTT ATATCAGGCA ATATAGTGTG
151 GCGGCGGGTA TCGCACACGC GCAGGATTTT TATTATCCGT CGATGAAGAA
201 ATATTCCGAA CCTTATATCG TTGCTTCAAC GCAAATCAAA TCTTTTGTGC
251 CTACCCTGCA AAACGGTATG TTGATTTTGT GGCATTTTAA TGGTCAGAAA
301 AAAATGGCGG GGGGCTTCAG CAAGGGTAAG CCGGACGGGG AATGGGTCAA
351 CTGGTATCCG AACGGTAAAA AATCTGCGGT TATGCCTTAT AAAAATGGCT
401 TGAGTGAGGG TACGGGATAC CGTTATTACC GTAACGGCGG CAAGGAAAGC
451 GAAATCCAGT TTAAGCAAAA TAAGGCGAAC GGCGTATGGA AGCAATGGTA
501 TGCCGATGGA AGTATCAAGA CGGAAATGGT TATGGTCAAC GATGAGCCTG
551 CCAAAATTCT GACTTGGGAT GAAAGCGGCC GATTACTTTC GGAACTGTCT
601 ATCCGCCACC ATAAACGCAA CGGGGTGGTT TTGGAGTGGT ATGAAGATGG
651 TTCTAAAAAG AGCGAGGCTG TTTATCAGGA TGACAAGTTG GTCAGGAAAA
701 CCCAATGGGA TAAGGATGGT TATTTAATCG AACCCTGA
This encodes a protein having amino acid sequence <SEQ ID 704>:
1 MKKLSRIVFS IVLLGFSAAL PAQTYSVYFN QNGKLTATMS SAAYIRQYSV
51 AAGIAHAQDF YYPSMKKYSE PYIVASTQIK SFVPTLQNGM LILWHFNGQK
101 KMAGGFSKGK PDGEWVNWYP NGKKSAVMPY KNGLSEGTGY RYYRNGGKES
151 EIQFKQNKAN GVWKQWYADG SIKTEMVMVN DEPAKILTWD ESGRLLSELS
201 IRHHKRNGVV LEWYEDGSKK SEAVYQDDKL VRKTQWDKDG YLIEP*
ORF27ng and ORF27-1 show 98.8% identity in a 245 aa overlap:
10 20 30 40 50 60
orf27-1.pep MKKLSRIVFSTVLLGFSAALPAQTYSVYFNQNGKLTATMSSAAYIRQYSVVAGIAHAQDF
|||||||||| |||||||||||||||||||||||||||||||||||||||:|||||||||
orf27ng MKKLSRIVFSIVLLGFSAALPAQTYSVYFNQNGKLTATMSSAAYIRQYSVAAGIAHAQDF
10 20 30 40 50 60
70 80 90 100 110 120
orf27-1.pep YYPSMKKYSEPYIVASTQIKSFVPTLQNGMLILWHFNGQKKMAGGFSKGKPDGEWVNWYP
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf27ng YYPSMKKYSEPYIVASTQIKSFVPTLQNGMLILWHFNGQKKMAGGFSKGKPDGEWVNWYP
70 80 90 100 110 120
130 140 150 160 170 180
orf27-1.pep NGKKSAVMPYKNGLSEGTGYRYYRNGGKESEIQFKQNKANGVWKQWYADGSIKTEMVMVN
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf27ng NGKKSAVMPYKNGLSEGTGYRYYRNGGKESEIQFKQNKANGVWKQWYADGSIKTEMVMVN
130 140 150 160 170 180
190 200 210 220 230 240
orf27-1.pep DEPAKILTWDESGRLLSELSIRHHQRNGVVLEWYEDGSKKSEAVYQDDKLVRKTQWDKDG
||||||||||||||||||||||||:|||||||||||||||||||||||||||||||||||
orf27ng DEPAKILTWDESGRLLSELSIRHHKRNGVVLEWYEDGSKKSEAVYQDDKLVRKTQWDKDG
190 200 210 220 230 240
orf27-1.pep YLIEPX
||||||
orf27ng YLIEPX
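The few differences between the meningococcal and gonococcal proteins can be listed as individual substitutions. This is a small sketch that reports them in the common `T11I` style (reference residue, 1-based position, variant residue). The helper name `substitutions` is hypothetical, and only the first 60 aligned residues are used.

```python
def substitutions(reference: str, other: str):
    """List differing positions of two equal-length aligned sequences, e.g. 'T11I'."""
    if len(reference) != len(other):
        raise ValueError("aligned sequences must have equal length")
    return [
        f"{ref}{pos}{alt}"
        for pos, (ref, alt) in enumerate(zip(reference, other), start=1)
        if ref != alt
    ]


# Residues 1-60 of orf27-1.pep and orf27ng, copied from the alignment above.
orf27_1 = "MKKLSRIVFSTVLLGFSAALPAQTYSVYFNQNGKLTATMSSAAYIRQYSVVAGIAHAQDF"
orf27ng = "MKKLSRIVFSIVLLGFSAALPAQTYSVYFNQNGKLTATMSSAAYIRQYSVAAGIAHAQDF"

print(substitutions(orf27_1, orf27ng))  # ['T11I', 'V51A']
```

The alignment shows one further difference outside this excerpt (Q/K at position 205), giving three differences in 245 residues, which is consistent with the stated 98.8% identity.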
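Based on this analysis, including the presence of a putative leader sequence in the gonococcal protein, it is predicted that this protein from N. meningitidis and N. gonorrhoeae, and its epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies.
ORF27-1 (24.5 kDa) was cloned in pET and pGex vectors and expressed in E. coli, as described above. The products of protein expression and purification were analyzed by SDS-PAGE. Figure 17A shows the results of affinity purification of the GST-fusion protein, and Figure 17B shows the results of expression of the His-fusion in E. coli. Purified GST-fusion protein was used to immunise mice, and the mouse sera were used in an ELISA, which gave a positive result. This confirms that ORF27-1 is a surface-exposed protein and a useful immunogen.
Example 84
The following partial DNA sequence was identified in N. meningitidis <SEQ ID 705>: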
1 ATGAAATTTA CCAAGCACCC CGTCTGGGCA ATGGCGTTCC GCCCATTTTA
51 TTCGCTGGCG GCTCTGTACG GCGCATTGTC CGTATTGCTG TGGGGTTTCG
101 GCTACACGGG AACGCACkAG CTGTCCGGTT TCTATTGGCA CGCGCATGAg
151 ATGATTTGGG GTTATGCCGG ACTGGTCGTC ATCGCCTTCC TGCTGACCGC
201 CGTCGCCACT TGGACGGGGC AGCCGCCCAC GCGGGGCGGC GTaTCTGGTC
251 GGCTTGACTA TCTTTTGGCT GGCTGCGCGG ATTGCCGCCT TTATCCCGGG
301 TTGGGGTGCG TCGGCAAGCG GCATACTCGG TACGCTGTTT TTCTGGTACG
351 GCGCGGTGTG CATGGCTTTG CCCGTTATCC GTTCGCAGAA TCAACGCAAC
401 TATGTTgCCG TGTTCGCGCT GTTCGTCTTG GGCGGCACGC ATGCGGCGTT
451 CCACGTCCAG CTGCACAACG GCAACCTAGG CGGACTCTTG AGCGGATTGC
501 AGTCGGGCTT GGTGATG
This corresponds to the amino acid sequence <SEQ ID 706; ORF47>:
1 MKFTKHPVWA MAFRPFYSLA ALYGALSVLL WGFGYTGTHX LSGFYWHAHE
51 MIWGYAGLVV IAFLLTAVAT WTGQPPTRGG VLVGLTIFWL AARIAAFIPG
101 WGASASGILG TLFFWYGAVC MALPVIRSQN QRNYVAVFAL FVLGGTHAAF
151 HVQLHNGNLG GLLSGLQSGL VM
Further work revealed the complete nucleotide sequence <SEQ ID 707>:
1 ATGAAATTTA CCAAGCACCC CGTCTGGGCA ATGGCGTTCC GCCCATTTTA
51 TTCGCTGGCG GCTCTGTACG GCGCATTGTC CGTATTGCTG TGGGGTTTCG
101 GCTACACGGG AACGCACGAG CTGTCCGGTT TCTATTGGCA CGCGCATGAG
151 ATGATTTGGG GTTATGCCGG ACTGGTCGTC ATCGCCTTCC TGCTGACCGC
201 CGTCGCCACT TGGACGGGGC AGCCGCCCAC GCGGGGCGGC GTTCTGGTCG
251 GCTTGACTAT CTTTTGGCTG GCTGCGCGGA TTGCCGCCTT TATCCCGGGT
301 TGGGGTGCGT CGGCAAGCGG CATACTCGGT ACGCTGTTTT TCTGGTACGG
351 CGCGGTGTGC ATGGCTTTGC CCGTTATCCG TTCGCAGAAT CAACGCAACT
401 ATGTTGCCGT GTTCGCGCTG TTCGTCTTGG GCGGCACGCA TGCGGCGTTC
451 CACGTCCAGC TGCACAACGG CAACCTAGGC GGACTCTTGA GCGGATTGCA
501 GTCGGGCTTG GTGATGGTGT CGGGTTTTAT CGGTCTGATT GGTACGCGGA
551 TTATTTCGTT TTTTACGTCC AAACGCTTGA ATGTGCCGCA GATTCCCAGT
601 CCGAAATGGG TGGCGCAGGC TTCGCTGTGG CTGCCCATGC TGACTGCCAT
651 GCTGATGGCG CACGGTGTGT TGGCTTGGCT GTCTGCCGTT TTTGCCTTTG
701 CGGCAGGTGT GATTTTTACC GTGCAGGTGT ACCGCTGGTG GTATAAACCC
751 GTGTTGAAAG AGCCGATGCT GTGGATTCTG TTTGCCGGCT ATCTGTTTAC
801 CGGATTGGGG CTGATTGCGG TCGGCGCGTC TTATTTCAAA CCCGCTTTCC
851 TCAATCTGGG TGTGCATCTG ATCGGGGTCG GCGGTATCGG CGTGCTGACT
901 TTGGGCATGA TGGCGCGTAC CGCGCTTGGT CATACGGGCA ATCCGATTTA
951 TCCGCCGCCC AAAGCCGTTC CCGTTGCGTT TTGGCTGATG ATGGCGGCAA
1001 CCGCCGTCCG TATGGTTGCC GTATTTTCTT CCGGCACTGC CTACACGCAC
1051 AGCATCCGCA CCTCTTCGGT TTTGTTTGCA CTCGCGCTTT TGGTGTATGC
1101 GTGGAAGTAT ATTCCTTGGC TGATTCGTCC GCGTTCGGAC GGCAGGCCCG
1151 GTTGA
This corresponds to the amino acid sequence <SEQ ID 708; ORF47-1>:
1 MKFTKHPVWA MAFRPFYSLA ALYGALSVLL WGFGYTGTHE LSGFYWHAHE
51 MIWGYAGLVV IAFLLTAVAT WTGQPPTRGG VLVGLTIFWL AARIAAFIPG
101 WGASASGILG TLFFWYGAVC MALPVIRSQN QRNYVAVFAL FVLGGTHAAF
151 HVQLHNGNLG GLLSGLQSGL VMVSGFIGLI GTRIISFFTS KRLNVPQIPS
201 PKWVAQASLW LPMLTAMLMA HGVLAWLSAV FAFAAGVIFT VQVYRWWYKP
251 VLKEPMLWIL FAGYLFTGLG LIAVGASYFK PAFLNLGVHL IGVGGIGVLT
301 LGMMARTALG HTGNPIYPPP KAVPVAFWLM MAATAVRMVA VFSSGTAYTH
351 SIRTSSVLFA LALLVYAWKY IPWLIRPRSD GRPG*
Computer analysis of this amino acid sequence predicted a leader peptide and also gave the following results:
Homology with a predicted ORF from N. meningitidis (strain A)
ORF47 shows 99.4% identity over a 172 aa overlap with an ORF (ORF47a) from strain A of N. meningitidis:
10 20 30 40 50 60
orf47.pep MKFTKHPVWAMAFRPFYSLAALYGALSVLLWGFGYTGTHXLSGFYWHAHEMIWGYAGLVV
||||||||||||||||||||||||||||||||||||||| ||||||||||||||||||||
orf47a MKFTKHPVWAMAFRPFYSLAALYGALSVLLWGFGYTGTHELSGFYWHAHEMIWGYAGLVV
10 20 30 40 50 60
70 80 90 100 110 120
orf47.pep IAFLLTAVATWTGQPPTRGGVLVGLTIFWLAARIAAFIPGWGASASGILGTLFFWYGAVC
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf47a IAFLLTAVATWTGQPPTRGGVLVGLTIFWLAARIAAFIPGWGASASGILGTLFFWYGAVC
70 80 90 100 110 120
130 140 150 160 170
orf47.pep MALPVIRSQNQRNYVAVFALFVLGGTHAAFHVQLHNGNLGGLLSGLQSGLVM
||||||||||||||||||||||||||||||||||||||||||||||||||||
orf47a MALPVIRSQNQRNYVAVFALFVLGGTHAAFHVQLHNGNLGGLLSGLQSGLVMVSGFIGLI
130 140 150 160 170 180
orf47a GTRIISFFTSKRLNVPQIPSPKWVAQASLWLPMLTAMLMAHGVMPWLSAAFAFAAGVIFT
190 200 210 220 230 240
The complete length ORF47a nucleotide sequence <SEQ ID 709> is:
1 ATGAAATTTA CCAAGCACCC CGTTTGGGCA ATGGCGTTCC GCCCGTTTTA
51 TTCACTGGCG GCTCTGTACG GCGCATTGTC CGTATTGCTG TGGGGTTTCG
101 GCTACACGGG AACGCACGAG CTGTCCGGTT TCTATTGGCA CGCGCATGAG
151 ATGATTTGGG GTTATGCCGG ACTGGTCGTC ATCGCCTTCC TGCTGACCGC
201 CGTCGCCACT TGGACGGGGC AGCCGCCCAC GCGGGGCGGC GTTCTGGTCG
251 GCTTGACTAT CTTTTGGCTG GCTGCGCGGA TTGCCGCCTT TATCCCGGGT
301 TGGGGTGCGT CGGCAAGCGG CATACTCGGT ACGCTGTTTT TCTGGTACGG
351 CGCGGTGTGC ATGGCTTTGC CCGTTATCCG TTCGCAGAAT CAACGCAATT
401 ATGTTGCCGT GTTCGCGCTG TTCGTCTTGG GCGGTACGCA CGCGGCGTTC
451 CACGTCCAGC TGCACAACGG CAACCTAGGC GGACTCTTGA GCGGATTGCA
501 GTCGGGCTTG GTGATGGTGT CGGGTTTTAT CGGTCTGATT GGTACGCGGA
551 TTATTTCGTT TTTTACGTCC AAACGGTTGA ATGTGCCGCA GATTCCCAGT
601 CCGAAATGGG TGGCGCAGGC TTCGCTGTGG CTGCCCATGC TGACCGCCAT
651 GCTGATGGCG CACGGCGTGA TGCCTTGGCT GTCGGCGGCT TTCGCGTTTG
701 CGGCAGGTGT GATTTTTACC GTGCAGGTGT ACCGCTGGTG GTATAAGCCT
751 GTGTTGAAAG AGCCGATGCT GTGGATTCTG TTTGCCGGCT ATCTGTTTAC
801 CGGATTGGGG CTGATTGCGG TCGGCGCGTC TTATTTCAAA CCCGCTTTCC
851 TCAATCTGGG TGTGCATCTG ATCGGGGTCG GCGGTATCGG CGTGCTGACT
901 TTGGGCATGA TGGCGCGTAC CGCGCTCGGT CATACGGGCA ATCCGATTTA
951 TCCGCCGCCC AAAGCCGTTC CCGTTGCGTT TTGGCTGATG ATGGCGGCAA
1001 CCGCCGTCCG TATGGTTGCC GTATTTTCTT CCGGCACTGC CTACACGCAC
1051 AGCATACGCA CCTCTTCGGT TTTGTTTGCA CTCGCGCTTT TGGTGTATGC
1101 GTGGAAGTAT ATTCCTTGGC TGATTCGTCC GCGTTCGGAC GGCAGGCCCG
1151 GTTGA
This encodes a protein having amino acid sequence <SEQ ID 710>:
1 MKFTKHPVWA MAFRPFYSLA ALYGALSVLL WGFGYTGTHE LSGFYWHAHE
51 MIWGYAGLVV IAFLLTAVAT WTGQPPTRGG VLVGLTIFWL AARIAAFIPG
101 WGASASGILG TLFFWYGAVC MALPVIRSQN QRNYVAVFAL FVLGGTHAAF
151 HVQLHNGNLG GLLSGLQSGL VMVSGFIGLI GTRIISFFTS KRLNVPQIPS
201 PKWVAQASLW LPMLTAMLMA HGVMPWLSAA FAFAAGVIFT VQVYRWWYKP
251 VLKEPMLWIL FAGYLFTGLG LIAVGASYFK PAFLNLGVHL IGVGGIGVLT
301 LGMMARTALG HTGNPIYPPP KAVPVAFWLM MAATAVRMVA VFSSGTAYTH
351 SIRTSSVLFA LALLVYAWKY IPWLIRPRSD GRPG*
ORF47a and ORF47-1 show 99.2% identity in a 384 aa overlap:
10 20 30 40 50 60
orf47a.pep MKFTKHPVWAMAFRPFYSLAALYGALSVLLWGFGYTGTHELSGFYWHAHEMIWGYAGLVV
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf47-1 MKFTKHPVWAMAFRPFYSLAALYGALSVLLWGFGYTGTHELSGFYWHAHEMIWGYAGLVV
10 20 30 40 50 60
70 80 90 100 110 120
orf47a.pep IAFLLTAVATWTGQPPTRGGVLVGLTIFWLAARIAAFIPGWGASASGILGTLFFWYGAVC
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf47-1 IAFLLTAVATWTGQPPTRGGVLVGLTIFWLAARIAAFIPGWGASASGILGTLFFWYGAVC
70 80 90 100 110 120
130 140 150 160 170 180
orf47a.pep MALPVIRSQNQRNYVAVFALFVLGGTHAAFHVQLHNGNLGGLLSGLQSGLVMVSGFIGLI
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf47-1 MALPVIRSQNQRNYVAVFALFVLGGTHAAFHVQLHNGNLGGLLSGLQSGLVMVSGFIGLI
130 140 150 160 170 180
190 200 210 220 230 240
orf47a.pep GTRIISFFTSKRLNVPQIPSPKWVAQASLWLPMLTAMLMAHGVMPWLSAAFAFAAGVIFT
|||||||||||||||||||||||||||||||||||||||||||: ||||:||||||||||
orf47-1 GTRIISFFTSKRLNVPQIPSPKWVAQASLWLPMLTAMLMAHGVLAWLSAVFAFAAGVIFT
190 200 210 220 230 240
250 260 270 280 290 300
orf47a.pep VQVYRWWYKPVLKEPMLWILFAGYLFTGLGLIAVGASYFKPAFLNLGVHLIGVGGIGVLT
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf47-1 VQVYRWWYKPVLKEPMLWILFAGYLFTGLGLIAVGASYFKPAFLNLGVHLIGVGGIGVLT
250 260 270 280 290 300
310 320 330 340 350 360
orf47a.pep LGMMARTALGHTGNPIYPPPKAVPVAFWLMMAATAVRMVAVFSSGTAYTHSIRTSSVLFA
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf47-1 LGMMARTALGHTGNPIYPPPKAVPVAFWLMMAATAVRMVAVFSSGTAYTHSIRTSSVLFA
310 320 330 340 350 360
370 380
orf47a.pep LALLVYAWKYIPWLIRPRSDGRPGX
|||||||||||||||||||||||||
orf47-1 LALLVYAWKYIPWLIRPRSDGRPGX
370 380
Homology with a predicted ORF from N. gonorrhoeae
ORF47 shows 97.1% identity over a 172 aa overlap with a predicted ORF (ORF47ng) from N. gonorrhoeae:
ORF47 MKFTKHPVWAMAFRPFYSLAALYGALSVLLWGFGYTGTHELSGFYWHAHEMIWGYAGLVV 60
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
ORF47ng MKFTKHPVWAMAFRPFYSLAALYGALSVLLWGFGYTGTHELSGFYWHAHEMIWGYAGLVV 60
ORF47 IAFLLTAVATWTGQPPTRGGVLVGLTIFWLAARIAAFIPGWGASASGILGTLFFWYGAVC 120
|||||||||||||||||||||||||| ||||||||||||||||:||||||||||||||||
ORF47ng IAFLLTAVATWTGQPPTRGGVLVGLTAFWLAARIAAFIPGWGAAASGILGTLFFWYGAVC 120
ORF47 MALPVIRSQNQRNYVAVFALFVLGGTHAAFHVQLHNGNLGGLLSGLQSGLVM 172
||||||||||:||||||||:||||||||||||||||||||||||||||||||
ORF47ng MALPVIRSQNRRNYVAVFAIFVLGGTHAAFHVQLHNGNLGGLLSGLQSGLVMVWGFIGLI 180
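The ORF47ng nucleotide sequence <SEQ ID 711> is predicted to encode a protein comprising amino acid sequence <SEQ ID 712>:
1 MKFTKHPVWA MAFRPFYSLA ALYGALSVLL WGFGYTGTHE LSGFYWHAHE
51 MIWGYAGLVV IAFLLTAVAT WTGQPPTRGG VLVGLTAFWL AARIAAFIPG
101 WGAAASGILG TLFFWYGAVC MALPVIRSQN RRNYVAVFAI FVLGGTHAAF
151 HVQLHNGNLG GLLSGLQSGL VMVWGFIGLI GMKIISFFTS KRLKLPQIPS
201 PKWVAHASLW LPMLNAILMA HRVMPWLSAA FPFAAGVIFT VQVYAGGITP
251 IEETSCGSVA GICYRLGNSS G
The predicted leader peptide and transmembrane domains are identical to sequences in the meningococcal protein, except for an Ile/Ala substitution at position 87 and a Leu/Ile substitution at position 140 (see also Pseudomonas stutzeri orf396, accession number e246540):
TM segments in ORF47ng
INTEGRAL Likelihood = -5.63 Transmembrane 52-68
INTEGRAL Likelihood = -3.88 Transmembrane 169-185
INTEGRAL Likelihood = -3.08 Transmembrane 82-98
INTEGRAL Likelihood = -1.91 Transmembrane 134-150
INTEGRAL Likelihood = -1.44 Transmembrane 107-123
INTEGRAL Likelihood = -1.38 Transmembrane 227-243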
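The two substitutions mentioned above (positions 87 and 140) can be located relative to the predicted transmembrane segments with a simple interval lookup. The sketch below hard-codes the six segments from the list; the names `TM_SEGMENTS` and `segment_containing` are illustrative, not part of any prediction tool.

```python
# (likelihood score, first residue, last residue), copied from the list above.
TM_SEGMENTS = [
    (-5.63, 52, 68),
    (-3.88, 169, 185),
    (-3.08, 82, 98),
    (-1.91, 134, 150),
    (-1.44, 107, 123),
    (-1.38, 227, 243),
]


def segment_containing(position: int):
    """Return (start, end) of the predicted TM segment holding a residue, or None."""
    for _score, start, end in TM_SEGMENTS:
        if start <= position <= end:
            return (start, end)
    return None


# Segments ordered along the sequence rather than by score.
print(sorted((start, end) for _score, start, end in TM_SEGMENTS))
print(segment_containing(87))   # (82, 98)
print(segment_containing(140))  # (134, 150)
```

Both substituted positions fall inside predicted transmembrane segments (82-98 and 134-150), and each listed segment spans 17 residues.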
Further work revealed the complete gonococcal DNA sequence <SEQ ID 713>:
1 ATGAAATTTA CCAAACATCC CGTCTGGGCA ATGGCGTTCC GCCCGTTTTA
51 TTCACTGGCG GCACTGTACG GCGCATTGTC CGTATTGCTG TGGGGTTTCG
101 GCTACACGGG AACGCACGAG CTGTCCGGTT TCTATTGGCA CGCGCATGAG
151 ATGATTTGGG GTTATGCCGG TCTCGTCGTC ATCGCCTTCC TGCTGACCGC
201 CGTCGCCACT TGGACGGGAC AGCCGCCCAC GAGGGGCGGC GTTCTGGTCG
251 GCTTGACCGC CTTTTGGCTG GCTGCGCGGA TTGCCGCCTT TATCCCGGGT
301 TGGGGTGCGG CGGCAAGCGG CATACTCGGT ACGCTGTTTT TCTGGTACGG
351 CGCGGTGTGC ATGGCTTTGC CCGTTATCCG TtcgCAAAAC CGGCGCAACT
401 ATGtcgCCGT ATTCGCAATA TTTGTGCTGG GCGGTACGCA TGCGgcgTTC
451 CACGtccAgc tGCACAACGG CAACCTAGGC GGACTCTTGA GCGGATTGCA
501 GTCGGGCCTG GTTATGGTGT CGGGCTTTAT CGGCCTGATT GGGATGAGGA
551 TTATTTCGTT TTTTACGTCC AAACGGTTGA ACGTGCCGCA GATTCCCAGT
601 CCGAAATGGG TGGCGCAGGC TTCGCTGTGG CTACCCATGC TGACCGCCAT
651 ACTGATGGCG CACGGCGTGA TGCCTTGGCT GTCGGCGGCT TTCGCGTTTG
701 CGGCGGGCGT GATTTTTACC GTACAGGTGT ACCGCTGGTG GTATAAACCC
751 GTATTGAAAG AACCGATGCT GTGGATTCTG TTTGCCGGCT ATCTGTTTAC
801 CGGATTGGGG CTGATTGCGG TCGGCGCGTC TTATTTCAAA CCTGCCTTCC
851 TCAATCTGGG CGTACATCTG ATCGGGGTCG GCGGTATCGG CGTGCTGACT
901 TTGGGCATGA TGGCGCGTAC CGCGCTCGGT CATACGGGCA ATTCGATTTA
951 TCCGCCGCCC AAAGCCGTTC CCGTTGCGTT TTGGCTGATG ATGGCGGCAA
1001 CCGCCGTCCG TATGGTTGCC GTATTTTCTT CCGGCACTGC CTACACGCAC
1051 AGCATCCGCA CGTCTTCGGT TTTGTTTGCA CTCGCGCTGC TGGTGTATGC
1101 GTGGAAATAC ATTCCGTGGC TGATCCGTCC GCGTTCGGAC GGCAGGCCCG
1151 GTTGA
This encodes a protein having amino acid sequence <SEQ ID 714; ORF47ng-1>:
1 MKFTKHPVWA MAFRPFYSLA ALYGALSVLL WGFGYTGTHE LSGFYWHAHE
51 MIWGYAGLVV IAFLLTAVAT WTGQPPTRGG VLVGLTAFWL AARIAAFIPG
101 WGAAASGILG TLFFWYGAVC MALPVIRSQN RRNYVAVFAI FVLGGTHAAF
151 HVQLHNGNLG GLLSGLQSGL VMVSGFIGLI GMRIISFFTS KRLNVPQIPS
201 PKWVAQASLW LPMLTAILMA HGVMPWLSAA FAFAAGVIFT VQVYRWWYKP
251 VLKEPMLWIL FAGYLFTGLG LIAVGASYFK PAFLNLGVHL IGVGGIGVLT
301 LGMMARTALG HTGNSIYPPP KAVPVAFWLM MAATAVRMVA VFSSGTAYTH
351 SIRTSSVLFA LALLVYAWKY IPWLIRPRSD GRPG*
ORF47ng-1 and ORF47-1 show 97.4% identity in a 384 aa overlap:
10 20 30 40 50 60
orf47-1.pep MKFTKHPVWAMAFRPFYSLAALYGALSVLLWGFGYTGTHELSGFYWHAHEMIWGYAGLVV
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf47ng-1 MKFTKHPVWAMAFRPFYSLAALYGALSVLLWGFGYTGTHELSGFYWHAHEMIWGYAGLVV
10 20 30 40 50 60
70 80 90 100 110 120
orf47-1.pep IAFLLTAVATWTGQPPTRGGVLVGLTIFWLAARIAAFIPGWGASASGILGTLFFWYGAVC
|||||||||||||||||||||||||| ||||||||||||||||:||||||||||||||||
orf47ng-1 IAFLLTAVATWTGQPPTRGGVLVGLTAFWLAARIAAFIPGWGAAASGILGTLFFWYGAVC
70 80 90 100 110 120
130 140 150 160 170 180
orf47-1.pep MALPVIRSQNQRNYVAVFALFVLGGTHAAFHVQLHNGNLGGLLSGLQSGLVMVSGFIGLI
||||||||:||||||||||:||||||||||||||||||||||||||||||||||||||||
orf47ng-1 MALPVIRSQNRRNYVAVFAIFVLGGTHAAFHVQLHNGNLGGLLSGLQSGLVMVSGFIGLI
130 140 150 160 170 180
190 200 210 220 230 240
orf47-1.pep GTRIISFFTSKRLNVPQIPSPKWVAQASLWLPMLTAMLMAHGVLAWLSAVFAFAAGVIFT
| ||||||||||||||||||||||||||||||||||:||||||: ||||:||||||||||
orf47ng-1 GMRIISFFTSKRLNVPQIPSPKWVAQASLWLPMLTAILMAHGVMPWLSAAFAFAAGVIFT
190 200 210 220 230 240
250 260 270 280 290 300
orf47-1.pep VQVYRWWYKPVLKEPMLWILFAGYLFTGLGLIAVGASYFKPAFLNLGVHLIGVGGIGVLT
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf47ng-1 VQVYRWWYKPVLKEPMLWILFAGYLFTGLGLIAVGASYFKPAFLNLGVHLIGVGGIGVLT
250 260 270 280 290 300
310 320 330 340 350 360
orf47-1.pep LGMMARTALGHTGNPIYPPPKAVPVAFWLMMAATAVRMVAVFSSGTAYTHSIRTSSVLFA
|||||||||||||| |||||||||||||||||||||||||||||||||||||||||||||
orf47ng-1 LGMMARTALGHTGNSIYPPPKAVPVAFWLMMAATAVRMVAVFSSGTAYTHSIRTSSVLFA
310 320 330 340 350 360
370 380
orf47-1.pep LALLVYAWKYIPWLIRPRSDGRPGX
|||||||||||||||||||||||||
orf47ng-1 LALLVYAWKYIPWLIRPRSDGRPGX
370 380
In addition, ORF47ng-1 shows significant homology with an ORF from Pseudomonas stutzeri:
gnl|PID|e246540 (Z73914) ORF396 protein [Pseudomonas stutzeri] Length = 396
Score = 155 bits (389), Expect = 5e-37
Identities = 121/391 (30%), Positives = 169/391 (42%), Gaps = 21/391 (5%)
Query: 7 PVWAMAFRPFYSLAALYGALSVLLWGFGYTGTHELSGFY-------WHAHEMIWGYAGLV 59
P+W+ AFRPF+ +LY L++ LW +TG GF WH HEM++G+A +
Sbjct: 14 PIWRLAFRPFFLAGSLYALLAIPLWVAAWTGLWP--GFQPTGGWLAWHRHEMLFGFAMAI 71
Query: 60 VIAFLLTAVATWTGQPPTRGGVLVGLTAFWLAARIAAFIPGWGAAASGILGTLFFWYGAV 119
V FLLTAV TWTGQ G LVGL A WLAAR+ ++ G AA L LF
Sbjct: 72 VAGFLLTAVQTWTGQTAPSGNRLVGLAAVWLAARL-GWLFGLPAAWLAPLDLLFLVALVW 130
Query: 120 CMALPVIRSQNRRNYVAVFAIFVLGGTHAAFXXXXXXXXXXXXXXXXXXXXXMVSGFIGL 179
MA + + +RNY V + ++ G +V+ + L
Sbjct: 131 MMAQMLWAVRQKRNYPIVVVLSLMLGADVLILTGLLQGNDALQRQGVLAGLWLVAALMAL 190
Query: 180 IGMRIISFFTSKRLNVPQIPSP-KWVAQASLWLPMLTAILMAHGV----MPWLSAAFAFA 234
IG R+I FFT + L P W+ A L + A+L A GV P L F A
Sbjct: 191 IGGRVIPFFTQRGLGKVDAVKPWVWLDVALLVGTGVIALLHAFGVAMRPQPLLGLLFV-A 249
Query: 235 AGVIFTVQVYRWWYKPVLKEPMLWILFAGYLFTGLGLIAVGASYF-KPAFXXXXXXXXXX 293
GV +++ RW+ K + K +LW L L+ + + +F A
Sbjct: 250 IGVGHLLRLMRWYDKGIWKVGLLWSLHVAMLWLVVAAFGLALWHFGLLAQSSPSLHALSV 309
Query: 294 XXXXXXXXXMMARTALGHTGNSIYPPPKAVPVAFWLXXXXXXXXXXXXFSSGTAYTHSIR 353
M+AR LGHTG + P + AF L F S +
Sbjct: 310 GSMSGLILAMIARVTLGHTGRPLQLPAGIIG-AFVL---FNLGTAARVFLSVAWPVGGLW 365
Query: 354 TSSVLFALALLVYAWKYIPWLIRPRSDGRPG 384
++V +LA +Y W+Y P L+ R DG PG
Sbjct: 366 LAAVCWTLAFALYVWRYAPMLVAARVDGHPG 396
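Based on this analysis, it is predicted that this protein from N. meningitidis and N. gonorrhoeae, and its epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies.
Example 85
The following partial DNA sequence was identified in N. meningitidis <SEQ ID 715>: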
1 ..ATGCCGTCTG AAGGTTCAGA CGGCmTCGGT GyCGGGGAAy CAGAAGyGGT
51 AGCGCATGCC CAATGAGACT TCGTGGGTTT TGAAGCGGGT GTTTTCCAAG
101 CGTCCCCAGT TGTGGTAACG GTATCCGGTG TCyAArGTCA GCTTGGGyGT
151 GATGTCGAAa CCGACACCGG CGATGACACC AAGACCyAmG CTGCTGATrC
201 TGTkGCTTTC GTGATAGGsA GGTTTGyTGG kmksAsyTTG TAyrATwkkG
251 CCTssCwsTG kAGmGCCkTk CkyTGGTkkA swGrwArTAG TCGTGGTTTy
301 TkTTyyCACC GAATGAACyT GATGTTTAAC GTGTCCGTAG GCGACGCGCG
351 CGCCGATATA GGGTTTGAAT TTATCGTTGA GTTTGAAATC GTAAATGGCG
401 GACAAGCCGA GAGAAGAAAC GGCGTGGAAG CTGCCGTTTC CCTGATGTTT
451 TGTTTGGGTT TCTTTGTAGT TGTTGTTTAT CTCTTCAGTA ACTTTTTTAG
501 TAGAAGAATT ACTTTCTTTC CATTTTCTGT AACTGGCATA ATCTGCCGCT
551 ATTCTCCAGC CGCCGAAATC ..
This corresponds to the amino acid sequence <SEQ ID 716; ORF67>:
1 ..MPSEGSDGXG XGEXEXVAHA QXDFVGFEAG VFQASPVVVT VSGVXXQLGX
51 DVETDTGDDT KTXAADXVAF VIGRFXGXXL YXXAXXXXAX XWXXXXSRGF
101 XXHRMNLMFN VSVGDARADI GFEFIVEFEI VNGGQAERRN GVEAAVSLMF
151 CLGFFVVVVY LFSNFFSRRI TFFPFSVTGI ICRYSPAAEI ..
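Computer analysis of this amino acid sequence gave the following results:
Homology with a predicted ORF from N. gonorrhoeae
ORF67 shows 51.8% identity over a 199 aa overlap with a predicted ORF (ORF67ng) from N. gonorrhoeae: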
orf67.pep MPSEGSDGXGXGEXEXVAHAQXDFVGFEAG 30
|||||||| | || | ||||| |||||||
orf67ng TNFEIAVLSGMTVRVFYCARPAPVNGGRLKMPSEGSDGIGIGESEAVAHAQRGFVGFEAG 146
90 100 110 120 130 140
orf67.pep VFQASPVVVTVSGVXXQLGXDVETDTGDDTKTXAADXVAFVIGRFXGXXLYXXAXXXXAX 90
|||||||||:|:|| | | || : : ::: || |||:|| | : :
orf67ng VFQASPVVVAVAGVQGQAGRDVYAHARHRAEAQAAAAVAFLIGVFLRMSVRINRNCCVSI 206
orf67.pep XWXXXXSRGFXXHRMNLMFNVSVGDARADIGFEFIVEFEIVNGGQAERRNGVEAAVSLMF 150
: | : |:: : :|||||||:||||||:|||||||||||||||||| || |||
orf67ng TRVGGKSTCYFFSRIDAVSDVSVGDARTDIGFEFVVEFEIVNGGQAERRNGVECAVFLMF 266
orf67.pep CLGFFVV--------VVYLFSNFFSRRITFF-PFSVTGIICRYSPAAEI 190
| | | :: |: |: : | : || ||||| :||||:
orf67ng RLLVFYVKLVAAKSFIILSFQLFYVHGIFIVVPFPVTGIIRGDAPAAEVVADRHPGVDGM 326
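The ORF67ng nucleotide sequence <SEQ ID 717> is predicted to encode a protein comprising amino acid sequence <SEQ ID 718>:
1 MPSETVGSIV NVGVDESVGF SPPFPSIQHF YRFHRIHRIR LFRPPGPMQL
51 NRHSHGSGNL GRGVWATVLS DKFPCGQVRI PACAGMTNFE IAVLSGMTVR
101 VFYCARPAPV NGGRLKMPSE GSDGIGIGES EAVAHAQRGF VGFEAGVFQA
151 SPVVVAVAGV QGQAGRDVYA HARHRAEAQA AAAVAFLIGV FLRMSVRINR
201 NCCVSITRVG GKSTCYFFSR IDAVSDVSVG DARTDIGFEF VVEFEIVNGG
251 QAERRNGVEC AVFLMFRLLV FYVKLVAAKS FIILSFQLFY VHGIFIVVPF
301 PVTGIIRGDA PAAEVVADRH PGVDGMRTDV SEIIAYRAYF VFAWSGWFRI
351 IVGNAFGGVG*
Based on the presence of several putative transmembrane domains in the gonococcal protein, it is predicted that this protein from N. meningitidis and N. gonorrhoeae, and its epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies.
Example 86
The following partial DNA sequence was identified in N. meningitidis <SEQ ID 719>: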
1 ATGTTTGCTT TTTTAGAAGC CTTTTTTGTC GAATACGGTT ATGCGGCTGT
51 TTTTTTTGTA TTGGTCATCT GCGGTTTCGG CGTGCCGATT CCCGAGGATT
101 TGACCTTGGT AACAGGCGGC GTGATTTCGG GTATGGGTTA TACCAATCCG
151 CATATTATGT TTGCAGTCGG TATGCTCGGC GTATTGGTCG GGGACGGCAT
201 CATGTTCGCC GCCGGACGAA TTTGGGGGCA GArArTCCTA rGGTTCArAC
251 CTATTGCGsG CATCATGACG CCGrAACGTT ATGAGCAGGT TCAGGAAAAA
301 TTCGACAAAT ACGGTAACTG GGTCTTATTT GTCGCCCGTT TCCTGCCCGG
351 TTTGAGAACG GCCGTATTTG TTACAGCCGG TATCAGCCGC AAGGTTTCAT
401 ACTTGCGTTT TATCATTATG GATGGACTGG CCGCA...
This corresponds to the amino acid sequence <SEQ ID 720; ORF78>:
1 MFAFLEAFFV EYGYAAVFFV LVICGFGVPI PEDLTLVTGG VISGMGYTNP
51 HIMFAVGMLG VLVGDGIMFA AGRIWGQXXL XFXPIAXIMT PXRYEQVQEK
101 FDKYGNWVLF VARFLPGLRT AVFVTAGISR KVSYLRFIIM DGLAA...
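The `X` residues in ORF78 line up with codons of SEQ ID 719 that contain lowercase nucleotide ambiguity codes (for example `r` for A or G, `s` for C or G). The sketch below shows one possible convention for translating such codons: expand every ambiguity code and emit `X` unless all expansions encode the same residue. The patent listing may simply write `X` for any ambiguous codon, so the rule used here is an assumption, as are the names `IUPAC` and `translate_codon`.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}

# IUPAC nucleotide ambiguity codes and the bases each one stands for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def translate_codon(codon: str) -> str:
    """Translate one codon; return 'X' when its ambiguity codes allow several residues."""
    # Symbols outside the IUPAC set (such as '.') are treated like N.
    choices = [IUPAC.get(base, "ACGT") for base in codon.upper()]
    residues = {CODON_TABLE["".join(bases)] for bases in product(*choices)}
    return residues.pop() if len(residues) == 1 else "X"


# Codons 78-80 of SEQ ID 719 (nucleotides 232-240: ArA rTC CTA).
print([translate_codon(c) for c in ("ArA", "rTC", "CTA")])  # ['X', 'X', 'L']
```

`ArA` can be AAA (Lys) or AGA (Arg) and `rTC` can be ATC (Ile) or GTC (Val), so both yield `X`, matching residues 78 and 79 of ORF78. The completed sequence SEQ ID 721 resolves these codons to AAA and ATC.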
Further work revealed the complete nucleotide sequence <SEQ ID 721>:
1 ATGTTTGCTT TTTTAGAAGC CTTTTTTGTC GAATACGGTT ATGCGGCTGT
51 TTTTTTTGTA TTGGTCATCT GCGGTTTCGG CGTGCCGATT CCCGAGGATT
101 TGACCTTGGT AACAGGCGGC GTGATTTCGG GTATGGGTTA TACCAATCCG
151 CATATTATGT TTGCAGTCGG TATGCTCGGC GTATTGGTCG GGGACGGCAT
201 CATGTTCGCC GCCGGACGAA TTTGGGGGCA GAAAATCCTA AGGTTCAAAC
251 CTATTGCGCG CATCATGACG CCGAAACGTT ATGAGCAGGT TCAGGAAAAA
301 TTCGACAAAT ACGGTAACTG GGTCTTATTT GTCGCCCGTT TCCTGCCCGG
351 TTTGAGAACG GCCGTATTTG TTACAGCCGG TATCAGCCGC AAGGTTTCAT
401 ACTTGCGTTT TATCATTATG GATGGACTGG CCGCACTGAT TTCCGTCCCT
451 ATTTGGATTT ATCTGGGCGA ATACGGTGCG CACAACATCG ATTGGCTGAT
501 GGCGAAAATG CACAGCCTGC AATCGGGTAT TTTTGTTATC TTGGGTATAG
551 GTGCGACCGT TGTCGCTTGG ATTTGGTGGA AAAAACGCCA ACGTATCCAG
601 TTTTACCGCA GCAAATTGAA AGAAAAGCGG GCGCAACGCA AAGCCGCCAA
651 GGCAGCCAAA AAAGCCGCGC AAAGCAAACA ATAA
This corresponds to the amino acid sequence <SEQ ID 722; ORF78-1>:
1 MFAFLEAFFV EYGYAAVFFV LVICGFGVPI PEDLTLVTGG VISGMGYTNP
51 HIMFAVGMLG VLVGDGIMFA AGRIWGQKIL RFKPIARIMT PKRYEQVQEK
101 FDKYGNWVLF VARFLPGLRT AVFVTAGISR KVSYLRFIIM DGLAALISVP
151 IWIYLGEYGA HNIDWLMAKM HSLQSGIFVI LGIGATVVAW IWWKKRQRIQ
201 FYRSKLKEKR AQRKAAKAAK KAAQSKQ*
Computer analysis of this amino acid sequence predicted several transmembrane domains, and also gave the following results:
Homology with the dedA homologue of H. influenzae (accession number P45280)
ORF78 and the dedA homologue show 58% aa identity in a 144 aa overlap:
Orf78:4 FLEAFFVEYGYAAVFFVLVICGFGVPIPEDLTLVTGGVISGM--GYTNPHIMFAVGMLGV 61
FL FF EYGY AV FVL+ICGFGVPIPED+TLV+GGVI+G+ N H+M V M+GV
DedA: 20 FLIGFFTEYGYWAVLFVLIICGFGVPIPEDITLVSGGVIAGLYPENVNSHLMLLVSMIGV 79
Orf78:62 LVGDGIMFAAGRIWGQXXLXFXPIAXIMTPXRYEQVQEKFDKYGNWVLFVARFLPGLRTA 121
L GD M+ GRI+G L F PI I+T R V+EKF +YGN VLFVARFLPGLR
DedA: 80 LAGDSCMYWLGRIYGTKILRFRPIRRIVTLQRLRMVREKFSQYGNRVLFVARFLPGLRAP 139
Orf78:122 VFVTAGISRKVSYLRFIIMDGLAA 145
+++ +GI+R+VSY+RF+++D AA
DedA: 140 IYMVSGITRRVSYVRFVLIDFCAA 163
Homology with a predicted ORF from N. meningitidis (strain A)
ORF78 shows 93.8% identity over a 145 aa overlap with an ORF (ORF78a) from strain A of N. meningitidis:
10 20 30 40 50 60
orf78.pep MFAFLEAFFVEYGYAAVFFVLVICGFGVPIPEDLTLVTGGVISGMGYTNPHIMFAVGMLG
|||:||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf78a MFALLEAFFVEYGYAAVFFVLVICGFGVPIPEDLTLVTGGVISGMGYTNPHIMFAVGMLG
10 20 30 40 50 60
70 80 90 100 110 120
orf78.pep VLVGDGIMFAAGRIWGQXXLXFXPIAXIMTPXRYEQVQEKFDKYGNWVLFVARFLPGLRT
|||||||||||||||||  | | ||| |||| || |||||||||||||||||||||||||
orf78a VLVGDGIMFAAGRIWGQKILKFKPIARIMTPKRYAQVQEKFDKYGNWVLFVARFLPGLRT
70 80 90 100 110 120
130 140
orf78.pep AVFVTAGISRKVSYLRFIIMDGLAA
|||||||||||||||||:|||||||
orf78a AVFVTAGISRKVSYLRFLIMDGLAALISVPVWIYLGEYGAHNIDWLMAKMHSLQSGIFIA
130 140 150 160 170 180
The complete length ORF78a nucleotide sequence <SEQ ID 723> is:
1 ATGTTTGCCC TTTTGGAAGC CTTTTTTGTC GAATACGGCT ATGCGGCCGT
51 GTTTTTCGTT TTGGTCATCT GCGGTTTCGG CGTGCCGATT CCCGAGGATT
101 TGACCTTGGT AACAGGCGGC GTGATTTCGG GTATGGGTTA TACCAATCCG
151 CATATTATGT TTGCAGTCGG TATGCTCGGC GTATTGGTCG GGGACGGCAT
201 CATGTTCGCC GCCGGACGCA TCTGGGGGCA GAAAATCCTC AAGTTCAAAC
251 CGATTGCGCG CATCATGACG CCGAAACGTT ACGCACAGGT TCAGGAAAAA
301 TTCGACAAAT ACGGCAACTG GGTGTTATTT GTCGCTCGTT TCCTGCCCGG
351 TTTGCGGACT GCCGTTTTCG TTACCGCCGG CATCAGCCGC AAAGTATCGT
401 ATCTGCGCTT TCTGATTATG GACGGGCTTG CCGCGCTGAT TTCCGTGCCC
451 GTTTGGATTT ACTTGGGCGA GTACGGCGCG CACAACATCG ATTGGCTGAT
501 GGCGAAAATG CACAGCCTGC AATCCGGCAT CTTCATCGCA TTGGGCGTGC
551 TGGCGGCGGC GCTGGCGTGG TTCTGGTGGC GCAAACGCCG ACATTATCAG
601 CTTTACCGCG CACAATTGAG CGAAAAACGC GCCAAACGCA AGGCGGAAAA
651 GGCAGCGAAA AAAGCGGCAC AGAAGCAGCA GTAA
This encodes a protein having amino acid sequence <SEQ ID 724>:
1 MFALLEAFFV EYGYAAVFFV LVICGFGVPI PEDLTLVTGG VISGMGYTNP
51 HIMFAVGMLG VLVGDGIMFA AGRIWGQKIL KFKPIARIMT PKRYAQVQEK
101 FDKYGNWVLF VARFLPGLRT AVFVTAGISR KVSYLRFLIM DGLAALISVP
151 VWIYLGEYGA HNIDWLMAKM HSLQSGIFIA LGVLAAALAW FWWRKRRHYQ
201 LYRAQLSEKR AKRKAEKAAK KAAQKQQ*
ORF78a and ORF78-1 show 89.0% identity in a 227 aa overlap:
10 20 30 40 50 60
orf78a.pep MFALLEAFFVEYGYAAVFFVLVICGFGVPIPEDLTLVTGGVISGMGYTNPHIMFAVGMLG
|||:||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf78-1 MFAFLEAFFVEYGYAAVFFVLVICGFGVPIPEDLTLVTGGVISGMGYTNPHIMFAVGMLG
10 20 30 40 50 60
70 80 90 100 110 120
orf78a.pep VLVGDGIMFAAGRIWGQKILKFKPIARIMTPKRYAQVQEKFDKYGNWVLFVARFLPGLRT
||||||||||||||||||||:||||||||||||| |||||||||||||||||||||||||
orf78-1 VLVGDGIMFAAGRIWGQKILRFKPIARIMTPKRYEQVQEKFDKYGNWVLFVARFLPGLRT
70 80 90 100 110 120
130 140 150 160 170 180
orf78a.pep AVFVTAGISRKVSYLRFLIMDGLAALISVPVWIYLGEYGAHNIDWLMAKMHSLQSGIFIA
|||||||||||||||||:||||||||||||:|||||||||||||||||||||||||||:
orf78-1 AVFVTAGISRKVSYLRFIIMDGLAALISVPIWIYLGEYGAHNIDWLMAKMHSLQSGIFVI
130 140 150 160 170 180
190 200 210 220
orf78a.pep LGVLAAALAWFWWRKRRHYQLYRAQLSEKRAKRKAEKAAKKAAQKQQX
||: |:::||:||:||:: |:||::|:||||:||| ||||||||::||
orf78-1 LGIGATVVAWIWWKKRQRIQFYRSKLKEKRAQRKAAKAAKKAAQSKQX
190 200 210 220
Homology with a predicted ORF from N. gonorrhoeae
ORF78 shows 97.4% identity over a 38 aa overlap with a predicted ORF (ORF78ng) from N. gonorrhoeae:
orf78.pep XXLXFXPIAXIMTPXRYEQVQEKFDKYGNWVLFVARFLPGLRTAVFVTAGISRKVSYLRF 137
||||||||||||||||||||||||||||||
orf78ng YPVLFVARFLPGLRTAVFVTAGISRKVSYLRF 32
orf78.pep IIMDGLAA 145
:|||||||
orf78ng LIMDGLAALISVPVWIYLGEYGAHNIDWLMAKMHSLQSGIFIALGVLAAALAWFWWRKRR 92
The ORF78ng nucleotide sequence <SEQ ID 725> is predicted to encode a protein having amino acid sequence <SEQ ID 726>:
1 ..YPVLFVARFL PGLRTAVFVT AGISRKVSYL RFLIMDGLAA LISVPVWIYL
51 GEYGAHNIDW LMAKMHSLQS GIFIALGVLA AALAWFWWRK RRHYQLYRAQ
101 LSEKRAKRKA EKAAKKAAQK QQ*
Further work revealed the complete gonococcal nucleotide sequence <SEQ ID 727>:
1 atgtttgccc tttTggaagc CTTTTTTGTC GAAtacggCt atgcGGCCGT
51 GTTTTTCGTT TTGGTCATCT GCGGTTTCGG CGTGCCGATT CCCGAAGATT
101 TGACCTTGGT AACGGGCGGC GTGATTTCGG GTATGGGTTA TACCAATCCG
151 CATATTATGT TTGCGGTCGG TATGCTCGGC GTGTTGGCGG GCGACGGCGT
201 GATGTTTGCC GCCGGACGCA TCTGGGGGCA GAAAATCCTC AAGTTCAAAC
251 CGATTGCGCG CATCATGACG CCGAAACGTT ACGCGCAGGT TCAGGAAAAA
301 TTCGACAAAT ACGGCAACTG GGTTCTGTTT GTCGCCCGTT TCCTGCCGGG
351 TTTGCGGACT GCCGTTTTCG TTACCGCCGG CATCAGCCGC AAAGTATCGT
401 ATCTGCGCTT TCTGATTATG GACGGGCTGG CCGCGCTGAT TTCCGTGCCC
451 GTTTGGATTT ACTTGGGCGA GTACGGCGCG CACAACATCG ATTGGCTGAT
501 GGCGAAAATG CACAGCCTGC AATCGGGCAT CTTCATCGCA TTGGGCGTGC
551 TGGCGGCGGC GCTGGCGTGG TTCTGGTGGC GCAAACGCCG ACATTATCAG
601 CTTTACCGCG CACAATTGAG CGAAAAACGC GCCAAACGCA AGGCGGAAAA
651 GGCAGCGAAA AAAGCGGCAC AGAAGCAGCA GTAa
This corresponds to the amino acid sequence <SEQ ID 728; ORF78ng-1>:
1 MFALLEAFFV EYGYAAVFFV LVICGFGVPI PEDLTLVTGG VISGMGYTNP
51 HIMFAVGMLG VLAGDGVMFA AGRIWGQKIL KFKPIARIMT PKRYAQVQEK
101 FDKYGNWVLF VARFLPGLRT AVFVTAGISR KVSYLRFLIM DGLAALISVP
151 VWIYLGEYGA HNIDWLMAKM HSLQSGIFIA LGVLAAALAW FWWRKRRHYQ
201 LYRAQLSEKR AKRKAEKAAK KAAQKQQ*
ORF78ng-1 and ORF78-1 show 88.1% identity in a 227 aa overlap:
10 20 30 40 50 60
orf78-1.pep MFAFLEAFFVEYGYAAVFFVLVICGFGVPIPEDLTLVTGGVISGMGYTNPHIMFAVGMLG
|||:||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf78ng-1 MFALLEAFFVEYGYAAVFFVLVICGFGVPIPEDLTLVTGGVISGMGYTNPHIMFAVGMLG
10 20 30 40 50 60
70 80 90 100 110 120
orf78-1.pep VLVGDGIMFAAGRIWGQKILRFKPIARIMTPKRYEQVQEKFDKYGNWVLFVARFLPGLRT
||:|||:|||||||||||||:||||||||||||| |||||||||||||||||||||||||
orf78ng-1 VLAGDGVMFAAGRIWGQKILKFKPIARIMTPKRYAQVQEKFDKYGNWVLFVARFLPGLRT
70 80 90 100 110 120
130 140 150 160 170 180
orf78-1.pep AVFVTAGISRKVSYLRFIIMDGLAALISVPIWIYLGEYGAHNIDWLMAKMHSLQSGIFVI
|||||||||||||||||:||||||||||||:|||||||||||||||||||||||||||:
orf78ng-1 AVFVTAGISRKVSYLRFLIMDGLAALISVPVWIYLGEYGAHNIDWLMAKMHSLQSGIFIA
130 140 150 160 170 180
190 200 210 220
orf78-1.pep LGIGATVVAWIWWKKRQRIQFYRSKLKEKRAQRKAAKAAKKAAQSKQX
||: |:::||:||:||:: |:||::|:||||:||| ||||||||::||
orf78ng-1 LGVLAAALAWFWWRKRRHYQLYRAQLSEKRAKRKAEKAAKKAAQKQQX
190 200 210 220
In addition, ORF78ng-1 shows homology with the dedA protein of H. influenzae:
sp|P45280|YG29_HAEIN hypothetical protein HI1629 >gi|1073983|pir||D64133 dedA protein (dedA) homolog - Haemophilus influenzae (strain Rd KW20)
>gi|1574476 (U32836) dedA protein (dedA) [Haemophilus influenzae] Length = 212
Score = 223 bits (563), Expect = 7e-58
Identities = 108/182 (59%), Positives = 140/182 (76%), Gaps = 2/182 (1%)
Query: 5 LEAFFVEYGYAAVFFVLVICGFGVPIPEDLTLVTGGVISGM--GYTNPHIMFAVGMLGVL 62
L FF EYGY AV FVL+ICGFGVPIPED+TLV+GGVI+G+ N H+M V M+GVL
Sbjct: 21 LIGFFTEYGYWAVLFVLIICGFGVPIPEDITLVSGGVIAGLYPENVNSHLMLLVSMIGVL 80
Query: 63 AGDGVMFAAGRIWGQKILKFKPIARIMTPKRYAQVQEKFDKYGNWVLFVARFLPGLRTAV 122
AGD M+ GRI+G KIL+F+PI RI+T +R V+EKF +YGN VLFVARFLPGLR +
Sbjct: 81 AGDSCMYWLGRIYGTKILRFRPIRRIVTLQRLRMVREKFSQYGNRVLFVARFLPGLRAPI 140
Query: 123 FVTAGISRKVSYLRFLIMDGLAALISVPVWIYLGEYGAHNIDWLMAKMHSLQSGIFIALG 182
++ +GI+R+VSY+RF+++D AA+ISVP+WIYLGE GA N+DWL ++ Q I+I +G
Sbjct: 141 YMVSGITRRVSYVRFVLIDFCAAIISVPIWIYLGELGAKNLDWLHTQIQKGQIVIYIFIG 200
Query: 183 VL 184
L
Sbjct: 201 YL 202
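The percentages in the summary line follow directly from the raw counts. The figures for this hit are consistent with integer truncation rather than rounding (140/182 is about 76.9%, reported as 76%); that truncation rule is an inference from the numbers shown here, not something the patent states. The helper name `blast_percent` is hypothetical.

```python
def blast_percent(count: int, alignment_length: int) -> int:
    """Percentage of alignment columns, truncated to an integer."""
    return (100 * count) // alignment_length


alignment_length = 182
for label, count in (("Identities", 108), ("Positives", 140), ("Gaps", 2)):
    pct = blast_percent(count, alignment_length)
    print(f"{label} = {count}/{alignment_length} ({pct}%)")

# Rounding instead of truncating would report 77% for the positives.
print(round(100 * 140 / 182))
```

Running this reproduces the three percentages in the summary line above (59%, 76%, 1%).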
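Based on this analysis, including the presence of putative transmembrane domains, it is predicted that these proteins from N. meningitidis and N. gonorrhoeae, and their epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies.
Example 87
The following partial DNA sequence was identified in N. meningitidis <SEQ ID 729>: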
1 ATGAAAAAAT TATTGGCGGC CGTGATGATG GCAGGTTTGG CAGGCGCGGT
51 TTCCGCCGCC GGAGTCCACG TTGAGGACGG CTGGGCGCGC ACCACCGTCG
101 AAGGTATGAA AATAGGCGGC GCGTTCATGA AAATCCACAA CGACGAAGCC
151 AAACAAGACT TTTTGCTCGG CGGAAGCAGC CCCGTTGCCG ACCGCGTCGA
201 AGTGCATACC CACATCAACG ACAACGGCGT GATGCGGATG CGCGAAGTCG
251 AAGGCGGCGT GCCTTTGGAA GCGAAATCCG TTACCGAACT CAAACCCGGC
301 AGCTATCATG TGATGTTTAT GGGTTTGAAA AAACAATTAA AAGAGGGCGA
351 TAAAATTCCC GTTACCCTGA AATTTAAAAA CGCCAAAGCG CAAACCGTCC
401 AACTGGAAGT CAAAATCGCG CCGATGCCGG CAATGAACCA C...
This corresponds to the amino acid sequence <SEQ ID 730; ORF79>:
1 MKKLLAAVMM AGLAGAVSAA GVHVEDGWAR TTVEGMKIGG AFMKIHNDEA
51 KQDFLLGGSS PVADRVEVHT HINDNGVMRM REVEGGVPLE AKSVTELKPG
101 SYHVMFMGLK KQLKEGDKIP VTLKFKNAKA QTVQLEVKIA PMPAMNH..
Further work revealed the complete nucleotide sequence <SEQ ID 731>:
1 ATGAAAAAAT TATTGGCGGC CGTGATGATG GCAGGTTTGG CAGGCGCGGT
51 TTCCGCCGCC GGAGTCCACG TTGAGGACGG CTGGGCGCGC ACCACCGTCG
101 AAGGTATGAA AATAGGCGGC GCGTTCATGA AAATCCACAA CGACGAAGCC
151 AAACAAGACT TTTTGCTCGG CGGAAGCAGC CCCGTTGCCG ACCGCGTCGA
201 AGTGCATACC CACATCAACG ACAACGGCGT GATGCGGATG CGCGAAGTCG
251 AAGGCGGCGT GCCTTTGGAA GCGAAATCCG TTACCGAACT CAAACCCGGC
301 AGCTATCATG TGATGTTTAT GGGTTTGAAA AAACAATTAA AAGAGGGCGA
351 TAAAATTCCC GTTACCCTGA AATTTAAAAA CGCCAAAGCG CAAACCGTCC
401 AACTGGAAGT CAAAATCGCG CCGATGCCGG CAATGAACCA CGGTCATCAC
451 CACGGCGAAG CGCATCAGCA CTAA
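The sequence listings in this document are printed in numbered rows of ten-residue groups. As a minimal sketch (the helper name `parse_sequence_block` is illustrative and not part of the patent), such a block can be joined into one contiguous string before any further analysis; the example uses the last two rows of SEQ ID 731.

```python
def parse_sequence_block(block: str) -> str:
    """Join a numbered, space-grouped sequence listing into one string.

    Purely numeric tokens (the leading position numbers) are dropped and
    only letters are kept, so trailing dots that mark a partial sequence
    are ignored. Intended for nucleotide listings.
    """
    residues = []
    for line in block.splitlines():
        for token in line.split():
            if token.isdigit():
                continue  # leading position number such as "401"
            residues.append("".join(ch for ch in token if ch.isalpha()))
    return "".join(residues).upper()


# Last two rows of SEQ ID 731 as printed above.
tail = (
    "401 AACTGGAAGT CAAAATCGCG CCGATGCCGG CAATGAACCA CGGTCATCAC\n"
    "451 CACGGCGAAG CGCATCAGCA CTAA"
)
sequence = parse_sequence_block(tail)
print(len(sequence))  # 74
```

Going by the position numbering, the complete listing is 474 nucleotides long, that is 158 codons, which is consistent with a 157-residue protein followed by a stop codon.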
This corresponds to the amino acid sequence <SEQ ID 732; ORF79-1>:
1 MKKLLAAVMM AGLAGAVSAA GVHVEDGWAR TTVEGMKIGG AFMKIHNDEA
51 KQDFLLGGSS PVADRVEVHT HINDNGVMRM REVEGGVPLE AKSVTELKPG
101 SYHVMFMGLK KQLKEGDKIP VTLKFKNAKA QTVQLEVKIA PMPAMNHGHH
151 HGEAHQH*
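Computer analysis of this amino acid sequence revealed a putative leader peptide and also gave the following results:
Homology with a predicted ORF from Neisseria meningitidis (strain A)
ORF79 shows 94.6% identity over a 147 aa overlap with an ORF (ORF79a) from strain A of Neisseria meningitidis: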
10 20 30 40 50 60
orf79.pep MKKLLAAVMMAGLAGAVSAAGVHVEDGWARTTVEGMKIGGAFMKIHNDEAKQDFLLGGSS
|| ||||||||||||||||||:|||||||||||||||:||||||||||||||||||||||
orf79a MKXLLAAVMMAGLAGAVSAAGIHVEDGWARTTVEGMKMGGAFMKIHNDEAKQDFLLGGSS
10 20 30 40 50 60
70 80 90 100 110 120
orf79.pep PVADRVEVHTHINDNGVMRMREVEGGVPLEAKSVTELKPGSYHVMFMGLKKQLKEGDKIP
|||||||||||||||||||||||||||||||||||||||||||||||| ||||| |||||
orf79a PVADRVEVHTHINDNGVMRMREVEGGVPLEAKSVTELKPGSYHVMFMGXKKQLKXGDKIP
70 80 90 100 110 120
130 140
orf79.pep VTLKFKNAKAQTVQLEVKIAPMPAMNH
|||||||||||||||||| ||| ||:|
orf79a VTLKFKNAKAQTVQLEVKTAPMSAMDHGHHHGEAHQHX
130 140 150
The complete length ORF79a nucleotide sequence <SEQ ID 733> is:
1 ATGAAANAAC TATTGGCAGC CGTGATGATG GCAGGTTTGG CAGGCGCGGT
51 TTCCGCCGCC GGAATCCACG TTGAGGACGG CTGGGCGCGC ACCACCGTCG
101 AAGGTATGAA AATGGGCGGC GCGTTCATGA AAATCCACAA CGACGAAGCC
151 AAACAAGACT TTTTGCTCGG CGGAAGCAGC CCTGTTGCCG ACCGCGTCGA
201 AGTGCATACC CATATCAATG ATAACGGTGT GATGCGGATG CGCGAAGTCG
251 AAGGCGGCGT GCCTTTGGAG GCGAAATCCG TTACCGAACT CAAACCCGGC
301 AGCTATCATG TCATGTTTAT GGGTNTGAAA AAACAATTAA AAGANGGCGA
351 CAAGATTCCC GTTACCCTGA AATTTAAAAA CGCCAAAGCA CAAACCGTCC
401 AACTGGAAGT CAAAACCGCG CCGATGTCGG CAATGGACCA CGGTCATCAC
451 CACGGCGAAG CGCATCAGCA CTAA
This encodes a protein having amino acid sequence <SEQ ID 734>:
1 MKXLLAAVMM AGLAGAVSAA GIHVEDGWAR TTVEGMKMGG AFMKIHNDEA
51 KQDFLLGGSS PVADRVEVHT HINDNGVMRM REVEGGVPLE AKSVTELKPG
101 SYHVMFMGXK KQLKXGDKIP VTLKFKNAKA QTVQLEVKTA PMSAMDHGHH
151 HGEAHQH*
ORF79a and ORF79-1 show 94.9% identity over a 157 aa overlap:
10 20 30 40 50 60
orf79a.pep MKXLLAAVMMAGLAGAVSAAGIHVEDGWARTTVEGMKMGGAFMKIHNDEAKQDFLLGGSS
|| ||||||||||||||||||:|||||||||||||||:||||||||||||||||||||||
orf79-1 MKKLLAAVMMAGLAGAVSAAGVHVEDGWARTTVEGMKIGGAFMKIHNDEAKQDFLLGGSS
10 20 30 40 50 60
70 80 90 100 110 120
orf79a.pep PVADRVEVHTHINDNGVMRMREVEGGVPLEAKSVTELKPGSYHVMFMGXKKQLKXGDKIP
|||||||||||||||||||||||||||||||||||||||||||||||| ||||| |||||
orf79-1 PVADRVEVHTHINDNGVMRMREVEGGVPLEAKSVTELKPGSYHVMFMGLKKQLKEGDKIP
70 80 90 100 110 120
130 140 150
orf79a.pep VTLKFKNAKAQTVQLEVKTAPMSAMDHGHHHGEAHQHX
|||||||||||||||||| ||| ||:||||||||||||
orf79-1 VTLKFKNAKAQTVQLEVKIAPMPAMNHGHHHGEAHQHX
130 140 150
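The identity figures quoted for these pairwise alignments can be reproduced by counting matching columns. The sketch below is illustrative only: it assumes an ungapped alignment of equal-length sequences and treats an undetermined residue (X) as a mismatch, and the function names are not taken from the patent. It is applied to the last alignment block above, with the trailing stop placeholder removed.

```python
def count_matches(a: str, b: str) -> int:
    """Count columns where two equal-length aligned sequences agree."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have the same length")
    return sum(x == y for x, y in zip(a, b))


def percent_identity(a: str, b: str) -> float:
    """Percentage of identical columns in an ungapped alignment."""
    return 100.0 * count_matches(a, b) / len(a)


# Final block of the ORF79a / ORF79-1 alignment (residues 121-157).
orf79a_tail = "VTLKFKNAKAQTVQLEVKTAPMSAMDHGHHHGEAHQH"
orf79_1_tail = "VTLKFKNAKAQTVQLEVKIAPMPAMNHGHHHGEAHQH"

print(count_matches(orf79a_tail, orf79_1_tail))                # 34
print(round(percent_identity(orf79a_tail, orf79_1_tail), 1))   # 91.9
```

On the same basis, the reported 94.9% over 157 residues corresponds to 149 identical positions.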
Homology with a predicted ORF from Neisseria gonorrhoeae
ORF79 shows 96.1% identity over a 76 aa overlap with a predicted ORF (ORF79ng) from Neisseria gonorrhoeae:
orf79.pep FMKIHNDEAKQDFLLGGSSPVADRVEVHTHINDNGVMRMREVEGGVPLEAKSVTELKPGS 101
||||||||||||:|||||||||||||||||
orf79ng INDNGVMRMREVKGGVPLEAKSVTELKPGS 30
orf79.pep YHVMFMGLKKQLKEGDKIPVTLKFKNAKAQTVQLEVKIAPMPAMNH 147
||||||||||||||||||||||||||||||||||||| ||| ||||
orf79ng YHVMFMGLKKQLKEGDKIPVTLKFKNAKAQTVQLEVKTAPMSAMNHGHHHGEAHQH 86
The ORF79ng nucleotide sequence <SEQ ID 735> is predicted to encode a protein comprising amino acid sequence <SEQ ID 736>:
1 ..INDNGVMRMR EVKGGVPLEA KSVTELKPGS YHVMFMGLKK QLKEGDKIPV
51 TLKFKNAKAQ TVQLEVKTAP MSAMNHGHHH GEAHQH*
Further work revealed the complete gonococcal DNA sequence <SEQ ID 737>:
1 ATGAAAAAAT TATTGGCAGC CGTGATGATG GCAGGTTTGG CAGGCGCGGT
51 TTccgccgCc GGagTccAtG TCGAggACGG CTGGGCGCGc accaCTGtcg
101 aaggtATgaa aatggGCGGC GCgttCATga aaATCCACAA CGACGaaGcc
151 atacaaGACt ttgtgcTCgg CGGaagcatg cccgttgccg accgcGTCGA
201 AGTGCAtaca cacATCAACG ACAACGGCGT GATGCGTATG CGCGAAGTCA
251 AAGGCGGCGT GCCTTTGGAG GCGAAATCCG TTACCGAACT CAAACCCGGC
301 AGCTATCACG TGATGTTTAT GGGTTTGAAA AAACAACTGA AAGAGGGCGA
351 CAAGATTCCC GTTACCCTGA AATTTAAAAA CGCCAAAGCG CAAACCGTCC
401 AACTGGAAGT CAAAACCGCG CCGATGTCGG CAATGAACCA CGGTCATCAC
451 CACGGCGAAG CGCATCAGCA CTAA
This corresponds to the amino acid sequence <SEQ ID 738; ORF79ng-1>:
1 MKKLLAAVMM AGLAGAVSAA GVHVEDGWAR TTVEGMKMGG AFMKIHNDEA
51 IQDFVLGGSM PVADRVEVHT HINDNGVMRM REVKGGVPLE AKSVTELKPG
101 SYHVMFMGLK KQLKEGDKIP VTLKFKNAKA QTVQLEVKTA PMSAMNHGHH
151 HGEAHQH*
ORF79ng-1 and ORF79-1 show 95.5% identity over a 157 aa overlap:
10 20 30 40 50 60
orf79-1.pep MKKLLAAVMMAGLAGAVSAAGVHVEDGWARTTVEGMKIGGAFMKIHNDEAKQDFLLGGSS
|||||||||||||||||||||||||||||||||||||:|||||||||||| |||:|||||
orf79ng-1 MKKLLAAVMMAGLAGAVSAAGVHVEDGWARTTVEGMKMGGAFMKIHNDEAIQDFVLGGSM
10 20 30 40 50 60
70 80 90 100 110 120
orf79-1.pep PVADRVEVHTHINDNGVMRMREVEGGVPLEAKSVTELKPGSYHVMFMGLKKQLKEGDKIP
|||||||||||||||||||||||:||||||||||||||||||||||||||||||||||||
orf79ng-1 PVADRVEVHTHINDNGVMRMREVKGGVPLEAKSVTELKPGSYHVMFMGLKKQLKEGDKIP
70 80 90 100 110 120
130 140 150
orf79-1.pep VTLKFKNAKAQTVQLEVKIAPMPAMNHGHHHGEAHQHX
|||||||||||||||||| ||| |||||||||||||||
orf79ng-1 VTLKFKNAKAQTVQLEVKTAPMSAMNHGHHHGEAHQHX
130 140 150
In addition, ORF79ng-1 shows significant homology with a protein from Aquifex aeolicus:
gi|2983695 (AE000731) putative protein [Aquifex aeolicus] Length = 151
Score = 63.6 bits (152), Expect = 6e-10
Identities = 38/114 (33%), Positives = 58/114 (50%), Gaps = 1/114 (0%)
Query: 24 VEDGWARTTVEGMKMGGAFMKIHNDEAIQDFVLGGSMPVADRVEVHTHINDNGVMRMREV 83
V+ W G M I N+ D+++G +A RVE+H + +N V +M
Sbjct: 27 VKHPWVMEPPPGPNTTMMGMIIVNEGDEPDYLIGAKTDIAQRVELHKTVIENDVAKMVPQ 86
Query: 84 KGGVPLEAKSVTELKPGSYHVMFMGLKKQLKEGDKIPVTLKFKNAKAQTVQLEV 137
+ + + K E K YHVM +GLKK++KEGDK+ V L F+ + TV+ V
Sbjct: 87 ER-IEIPPKGKVEFKHHGYHVMIIGLKKRIKEGDKVKVELIFEKSGKITVEAPV 139
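The percentages in the BLAST summary lines above are consistent with integer truncation of the ratio rather than rounding: 58/114 is about 50.9% but is reported as 50%. A small sketch of that arithmetic follows; the function name `blast_percent` is illustrative, and the truncation rule is inferred from the reported numbers rather than stated in the text.

```python
def blast_percent(numerator: int, denominator: int) -> int:
    """Integer percentage with truncation, matching the values reported above."""
    return 100 * numerator // denominator


# Aquifex aeolicus hit for ORF79ng-1
print(blast_percent(38, 114), blast_percent(58, 114), blast_percent(1, 114))
# Haemophilus influenzae dedA hit for ORF78ng-1
print(blast_percent(108, 182), blast_percent(140, 182), blast_percent(2, 182))
```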
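Based on this analysis, it is predicted that this protein from Neisseria meningitidis and Neisseria gonorrhoeae, and its epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies.
ORF79-1 (15.6 kDa) was cloned into a pET vector and expressed in E. coli, as described above. The products of protein expression and purification were analyzed by SDS-PAGE. Figure 18A shows the result of affinity purification of the His-fusion protein. Mice were immunized with the purified His-fusion protein, and their sera were used for ELISA (positive result) and FACS analysis (Figure 18B). These experiments confirm that ORF79-1 is a surface-exposed protein and a useful immunogen.
Example 88
The following DNA sequence, believed to be complete, was identified in Neisseria meningitidis <SEQ ID 739>: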
1 ATGACGGTAA CTGCGGCCGA AGGCGGCAAA GCTGCCAAGG CGTTAAAAAA
51 ATATCTGATT ACGGGCATTT TGGTCTGGCT GCCGATTGCG GTAACGGTTT
101 GGGTGGTTTC CTATATCGTT TCCGCGTCCG ATCAGCTCGT CAACCTGCTG
151 CCGAAGCAAT GGCGGCCGCA ATATGTTTTG GGGTTTAATA TCCCGGGGCT
201 GGGCGTTATC GTTGCCATTG CCGTATTGTT TGTAACCGGA TTGTTTGCCG
251 CCAACGTATT GGGTCGGCAG ATCCTCGCCG CGTGGGACAG CCTGTTGGGG
301 CGGATTCCGG TTGTGAAAtC CATCTATTCG AGTGTGAAAA AAGTATCCGA
351 ATacgTGCTG TCCGACAGCA GCCGTTCGTT TAAAACGCCG GTACTCGTGC
401 CGTTTCCCCA GCCCGGTATT TGGACGATyG CTTTCGTGTC AGGGCAGGTG
451 TCGAATGCGG TTAAGGCCGC ATTGCCGAAs GACGGCGATT ATCTTTCCGT
501 GTATGTTCCG ACCACGCCGA ATCCGACCGG CGGTTACTAT ATTATGGTAA
551 AGAAAAGCGA TGTGCGCGAA CTCGATATGA GCGTGGACGA AsCATTGAAA
601 TATGTGATTT CGCTGGGTAT GGTCATCCCT GACGACCTGC CCGTCAAAAC
651 ATTGGCAsGA CCTATGCCGT CTGAAAAGGC GGATTTGCCC GAACAACAAT
701 AA
This corresponds to the amino acid sequence <SEQ ID 740; ORF98>:
1 MTVTAAEGGK AAKALKKYLI TGILVWLPIA VTVWVVSYIV SASDQLVNLL
51 PKQWRPQYVL GFNIPGLGVI VAIAVLFVTG LFAANVLGRQ ILAAWDSLLG
101 RIPVVKSIYS SVKKVSEYVL SDSSRSFKTP VLVPFPQPGI WTIAFVSGQV
151 SNAVKAALPX DGDYLSVYVP TTPNPTGGYY IMVKKSDVRE LDMSVDEXLK
201 YVISLGMVIP DDLPVKTLAX PMPSEKADLP EQQ*
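SEQ ID 739 contains lower-case IUPAC ambiguity codes (y, s), and the translated ORF98 shows X at some of the corresponding positions but not all. One way to obtain that pattern, sketched here as an assumption rather than as the method actually used for the patent, is to expand each ambiguous codon into all its possible codons and report an amino acid only when every expansion agrees. The standard genetic code is assumed, and all names are illustrative.

```python
from itertools import product

# Standard genetic code, codons ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def translate_codon(codon: str) -> str:
    """Return the amino acid if all expansions of the codon agree, else X."""
    options = [IUPAC[base] for base in codon.upper()]
    amino_acids = {CODON_TABLE["".join(c)] for c in product(*options)}
    return amino_acids.pop() if len(amino_acids) == 1 else "X"


def translate(dna: str) -> str:
    """Translate in frame 1, ignoring whitespace and any trailing partial codon."""
    dna = "".join(dna.split())
    return "".join(translate_codon(dna[i:i + 3]) for i in range(0, len(dna) - 2, 3))


print(translate("ATy"))           # I  (ATC and ATT both encode Ile)
print(translate("GACGAAsCATTG"))  # DEXL (sCA is CCA or GCA, Pro or Ala)
```

Under this rule the codon `ATy` still gives I (as in WTIAFVSGQV), whereas `AAs` and `sCA` give X, matching the X residues printed in ORF98.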
Further work revealed the complete nucleotide sequence <SEQ ID 741>:
1 ATGACGGAAC nTGCGGCCGA AGGCGGCAAA GCTGCCAArG CGTTAAAAAA
51 ATATCTGATT ACGGGCATTT TGGTCTGGCT GCCGATTGCG GTAACGGTTT
101 GGGTGGTTTC CTATATCGTT TCCGCGTCCG ATCAGCTCGT CAACCTGCTG
151 CCGAAGCAAT GGCGGCCGCA ATATGTTTTG GGGTTTAATA TCCCGGGGCT
201 GGGCGTTATC GTTGCCATTG CCGTATTGTT TGTAACCGGA TTGTTTGCCG
251 CCAACGTATT GGGTCGGCAG ATCCTCGCCG CGTGGGACAG CCTGTTGGGG
301 CGGATTCCGG TTGTGAAATC CATCTATTCG AGTGTGAAAA AAGTATCCGA
351 ATCGCTGCTG TCCGACAGCA GCCGTTCGTT TAAAACGCCG GTACTCGTGC
401 CGTTTCCCCA GCCCGGTATT TGGACGATTG CTTTCGTGTC AGGGCAGGTG
451 TCGAATGCGG TTAAGGCCGC ATTGCCGAAG GACGGCGATT ATCTTTCCGT
501 GTATGTTCCG ACCACGCCGA ATCCGACCGG CGGTTACTAT ATTATGGTAA
551 AGAAAAGCGA TGTGCGCGAA CTCGATATGA GCGTGGACGA AGCATTGAAA
601 TATGTGATTT CGCTGGGTAT GGTCATCCCT GACGACCTGC CCGTCAAAAC
651 ATTGGCAGGA CCTATGCCGT CTGAAAAGGC GGATTTGCCC GAACAACAAT
701 AA
This corresponds to the amino acid sequence <SEQ ID 742; ORF98-1>:
1 MTEXAAEGGK AAKALKKYLI TGILVWLPIA VTVWVVSYIV SASDQLVNLL
51 PKQWRPQYVL GFNIPGLGVI VAIAVLFVTG LFAANVLGRQ ILAAWDSLLG
101 RIPVVKSIYS SVKKVSESLL SDSSRSFKTP VLVPFPQPGI WTIAFVSGQV
151 SNAVKAALPK DGDYLSVYVP TTPNPTGGYY IMVKKSDVRE LDMSVDEALK
201 YVISLGMVIP DDLPVKTLAG PMPSEKADLP EQQ*
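Computer analysis of this amino acid sequence gave the following results:
Homology with a predicted ORF from Neisseria meningitidis (strain A)
ORF98 shows 96.1% identity over a 233 aa overlap with an ORF (ORF98a) from strain A of Neisseria meningitidis: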
10 20 30 40 50 60
orf98.pep MTVTAAEGGKAAKALKKYLITGILVWLPIAVTVWVVSYIVSASDQLVNLLPKQWRPQYVL
|| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf98a MTEPAAEGGKAAKALKKYLITGILVWLPIAVTVWVVSYIVSASDQLVNLLPKQWRPQYVL
10 20 30 40 50 60
70 80 90 100 110 120
orf98.pep GFNIPGLGVIVAIAVLFVTGLFAANVLGRQILAAWDSLLGRIPVVKSIYSSVKKVSEYVL
||||||||||||||||||||||||||||||||||||||||||||||||||||||||| :|
orf98a GFNIPGLGVIVAIAVLFVTGLFAANVLGRQILAAWDSLLGRIPVVKSIYSSVKKVSXSLL
70 80 90 100 110 120
130 140 150 160 170 180
orf98.pep SDSSRSFKTPVLVPFPQPGIWTIAFVSGQVSNAVKAALPXDGDYLSVYVPTTPNPTGGYY
||||||||||||||||| ||||||||||||||||||||| ||||||||||||||||||||
orf98a SDSSRSFKTPVLVPFPQSGIWTIAFVSGQVSNAVKAALPKDGDYLSVYVPTTPNPTGGYY
130 140 150 160 170 180
190 200 210 220 230
orf98.pep IMVKKSDVRELDMSVDEXLKYVISLGMVIPDDLPVKTLAXPMPSEKADLPEQQX
||||||||||||||||| ||||||||||||||||||||| ||||||||||||||
orf98a IMVKKSDVRELDMSVDEALKYVISLGMVIPDDLPVKTLAGPMPSEKADLPEQQX
190 200 210 220 230
The complete length ORF98a nucleotide sequence <SEQ ID 743> is:
1 ATGACGGAAC CTGCGGCCGA AGGCGGCAAA GCTGCCAAGG CGTTAAAAAA
51 ATATCTGATT ACGGGCATTT TGGTCTGGCT GCCGATTGCG GTAACGGTTT
101 GGGTGGTTTC CTATATCGTT TCCGCGTCCG ATCAGCTCGT CAACCTGCTG
151 CCGAAGCAAT GGCGGCCGCA ATATGTTTTG GGGTTTAATA TCCCGGGGCT
201 GGGCGTTATC GTTGCCATTG CCGTATTGTT TGTAACCGGA TTATTTGCCG
251 CAAACGTATT GGGCCGGCAG ATTCTTGCCG CGTGGGACAG CTTGTTGGGG
301 CGGATTCCGG TTGTGAAGTC CATCTATTCG AGTGTGAAAA AAGTATCCGA
351 NTCGTTGCTG TCCGACAGCA GCCGTTCGTT TAAAACACCA GTACTCGTGC
401 CGTTTCCCCA ATCGGGTATT TGGACAATCG CATTCGTGTC CGGTCAGGTG
451 TCGAATGCGG TTAAGGCCGC ATTGCCGAAG GACGGCGATT ATCTTTCCGT
501 GTATGTTCCG ACCACGCCGA ATCCGACCGG CGGTTACTAT ATTATGGTAA
551 AGAAAAGCGA TGTGCGCGAA CTCGATATGA GCGTGGACGA AGCGTTGAAA
601 TATGTGATTT CGCTGGGTAT GGTCATCCCT GACGACCTGC CCGTCAAAAC
651 ATTGGCAGGA CCTATGCCGT CTGAAAAGGC GGATTTGCCC GAACAACAAT
701 AA
This encodes a protein having amino acid sequence <SEQ ID 744>:
1 MTEPAAEGGK AAKALKKYLI TGILVWLPIA VTVWVVSYIV SASDQLVNLL
51 PKQWRPQYVL GFNIPGLGVI VAIAVLFVTG LFAANVLGRQ ILAAWDSLLG
101 RIPVVKSIYS SVKKVSXSLL SDSSRSFKTP VLVPFPQSGI WTIAFVSGQV
151 SNAVKAALPK DGDYLSVYVP TTPNPTGGYY IMVKKSDVRE LDMSVDEALK
201 YVISLGMVIP DDLPVKTLAG PMPSEKADLP EQQ*
ORF98a and ORF98-1 show 98.7% identity over a 233 aa overlap:
10 20 30 40 50 60
orf98a.pep MTEPAAEGGKAAKALKKYLITGILVWLPIAVTVWVVSYIVSASDQLVNLLPKQWRPQYVL
||| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf98-1 MTEXAAEGGKAAKALKKYLITGILVWLPIAVTVWVVSYIVSASDQLVNLLPKQWRPQYVL
10 20 30 40 50 60
70 80 90 100 110 120
orf98a.pep GFNIPGLGVIVAIAVLFVTGLFAANVLGRQILAAWDSLLGRIPVVKSIYSSVKKVSXSLL
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf98-1 GFNIPGLGVIVAIAVLFVTGLFAANVLGRQILAAWDSLLGRIPVVKSIYSSVKKVSESLL
70 80 90 100 110 120
130 140 150 160 170 180
orf98a.pep SDSSRSFKTPVLVPFPQSGIWTIAFVSGQVSNAVKAALPKDGDYLSVYVPTTPNPTGGYY
||||||||||||||||| ||||||||||||||||||||||||||||||||||||||||||
orf98-1 SDSSRSFKTPVLVPFPQPGIWTIAFVSGQVSNAVKAALPKDGDYLSVYVPTTPNPTGGYY
130 140 150 160 170 180
190 200 210 220 230
orf98a.pep IMVKKSDVRELDMSVDEALKYVISLGMVIPDDLPVKTLAGPMPSEKADLPEQQX
||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf98-1 IMVKKSDVRELDMSVDEALKYVISLGMVIPDDLPVKTLAGPMPSEKADLPEQQX
190 200 210 220 230
Homology with a predicted ORF from Neisseria gonorrhoeae
ORF98 shows 95.3% identity over a 233 aa overlap with a predicted ORF (ORF98ng) from Neisseria gonorrhoeae:
10 20 30 40 50 60
orf98.pep MTVTAAEGGKAAKALKKYLITGILVWLPIAVTVWVVSYIVSASDQLVNLLPKQWRPQYVL 60
|| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf98ng MTEPAAEGGKAAKALKKYLITGILVWLPIAVTVWVVSYIVSASDQLVNLLPKQWRPQYVL 60
orf98.pep GFNIPGLGVIVAIAVLFVTGLFAANVLGRQILAAWDSLLGRIPVVKSIYSSVKKVSEYVL 120
||||||||||||||||||||||||||||||||||||||| ||||||||||||||||| :|
orf98ng GFNIPGLGVIVAIAVLFVTGLFAANVLGRQILAAWDSLLXRIPVVKSIYSSVKKVSESLL 120
orf98.pep SDSSRSFKTPVLVPFPQPGIWTIAFVSGQVSNAVKAALPXDGDYLSVYVPTTPNPTGGYY 180
||||||||||||||||| ||||||||||||||||||||| ||||||||||||||||||||
orf98ng SDSSRSFKTPVLVPFPQSGIWTIAFVSGQVSNAVKAALPQDGDYLSVYVPTTPNPTGGYY 180
orf98.pep IMVKKSDVRELDMSVDEXLKYVISLGMVIPDDLPVKTLAXPMPSEKADLPEQQ 233
||||||||||||||||| ||||||||||||||||||||| ||| |||:|||||
orf98ng IMVKKSDVRELDMSVDEALKYVISLGMVIPDDLPVKTLAGPMPPEKAELPEQQ 233
The complete length ORF98ng nucleotide sequence <SEQ ID 745> is predicted to encode a protein having amino acid sequence <SEQ ID 746>:
1 MTEPAAEGGK AAKALKKYLI TGILVWLPIA VTVWVVSYIV SASDQLVNLL
51 PKQWRPQYVL GFNIPGLGVI VAIAVLFVTG LFAANVLGRQ ILAAWDSLLX
101 RIPVVKSIYS SVKKVSESLL SDSSRSFKTP VLVPFPQSGI WTIAFVSGQV
151 SNAVKAALPQ DGDYLSVYVP TTPNPTGGYY IMVKKSDVRE LDMSVDEALK
201 YVISLGMVIP DDLPVKTLAG PMPPEKAELP EQQ*
Further work revealed the complete nucleotide sequence <SEQ ID 747>:
1 ATGACGGAAC CTGCGGCCGA AGGCGGCAAA GCTGCCAAGG CGTTAAAAAA
51 ATATCTGATT ACAGGCATTT TGGTCTGGCT GCCGATTGCG GTAACGGTTT
101 GGGTGGTTTC CTATATCGTT TCCGCGTCCG ACCAGCTTGT CAACCTGCTG
151 CCGAAGCAAT GGCGGCCGCA ATATGTTTTG GGGTTTAATA TCCCCGGGCT
201 CGGCGTTATT GTTGCCATTG CCGTATTGTT TGTAACCGGA TTATTTGCCG
251 CAAACGTGTT GGGCCGGCAG ATTCTTGCCG CGTGGGACAG CCTGTTgggg
301 cggaTTCCGG TTGTCAAATC CATCTATTCG AGTGTGAAAA AAGTATCCGA
351 ATCGCTGCTG TCCGACAGCA GCCGTTCGTT TAAAACGCCG GTACTCGTGC
401 CGTTTCCCCA ATCGGGTATT TGGACAATCG CATTCGTGTC CGGTCAGGTG
451 TCGAATGCGG TTAAGGCCGC ATTGCCGCAG GATGGCGATT ATCTTTCCGT
501 GTATGTCCCG ACCACGCCCA ACCCGACCGG CGGTTACTAT ATTATGGTAA
551 AGAAAAGCGA TGTGCGCGAA CTCGATATGA GCGTGGACGA AGCGTTGAAA
601 TATGTGATTT CGCTGGGTAT GGTCATCCCT GACGACCTGC CCGTCAAAAC
651 ATTGGCAGGA CCTATGCCGC CTGAAAAGGC GGAGTTGCCC GAACAACAAT
701 AA
This corresponds to the amino acid sequence <SEQ ID 748; ORF98ng-1>:
1 MTEPAAEGGK AAKALKKYLI TGILVWLPIA VTVWVVSYIV SASDQLVNLL
51 PKQWRPQYVL GFNIPGLGVI VAIAVLFVTG LFAANVLGRQ ILAAWDSLLG
101 RIPVVKSIYS SVKKVSESLL SDSSRSFKTP VLVPFPQSGI WTIAFVSGQV
151 SNAVKAALPQ DGDYLSVYVP TTPNPTGGYY IMVKKSDVRE LDMSVDEALK
201 YVISLGMVIP DDLPVKTLAG PMPPEKAELP EQQ*
ORF98ng-1 and ORF98-1 show 97.9% identity over a 233 aa overlap:
10 20 30 40 50 60
orf98-1.pep MTEXAAEGGKAAKALKKYLITGILVWLPIAVTVWVVSYIVSASDQLVNLLPKQWRPQYVL
||| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf98ng-1 MTEPAAEGGKAAKALKKYLITGILVWLPIAVTVWVVSYIVSASDQLVNLLPKQWRPQYVL
10 20 30 40 50 60
70 80 90 100 110 120
orf98-1.pep GFNIPGLGVIVAIAVLFVTGLFAANVLGRQILAAWDSLLGRIPVVKSIYSSVKKVSESLL
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf98ng-1 GFNIPGLGVIVAIAVLFVTGLFAANVLGRQILAAWDSLLGRIPVVKSIYSSVKKVSESLL
70 80 90 100 110 120
130 140 150 160 170 180
orf98-1.pep SDSSRSFKTPVLVPFPQPGIWTIAFVSGQVSNAVKAALPKDGDYLSVYVPTTPNPTGGYY
||||||||||||||||| |||||||||||||||||||||:||||||||||||||||||||
orf98ng-1 SDSSRSFKTPVLVPFPQSGIWTIAFVSGQVSNAVKAALPQDGDYLSVYVPTTPNPTGGYY
130 140 150 160 170 180
190 200 210 220 230
orf98-1.pep IMVKKSDVRELDMSVDEALKYVISLGMVIPDDLPVKTLAGPMPSEKADLPEQQX
||||||||||||||||||||||||||||||||||||||||||| |||:||||||
orf98ng-1 IMVKKSDVRELDMSVDEALKYVISLGMVIPDDLPVKTLAGPMPPEKAELPEQQX
190 200 210 220 230
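Based on this analysis, including the fact that the putative transmembrane domains in the gonococcal protein are identical to the sequences in the meningococcal protein, it is predicted that this protein from Neisseria meningitidis and Neisseria gonorrhoeae, and its epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies.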
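The text does not say how the putative transmembrane domains were predicted. As a purely illustrative sketch, a common approach is a sliding-window hydropathy scan; the example below assumes the Kyte-Doolittle scale with a 19-residue window and a threshold of 1.6, none of which is taken from the patent, and applies it to the N-terminal 50 residues of ORF98ng-1.

```python
# Kyte-Doolittle hydropathy values (an assumed choice of scale).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def mean_hydropathy(segment: str) -> float:
    """Average hydropathy of a segment; unknown residues (e.g. X) score 0."""
    return sum(KD.get(res, 0.0) for res in segment) / len(segment)


def hydrophobic_windows(protein: str, window: int = 19, threshold: float = 1.6):
    """Return 0-based start indices of windows whose mean hydropathy >= threshold."""
    return [
        i
        for i in range(len(protein) - window + 1)
        if mean_hydropathy(protein[i:i + window]) >= threshold
    ]


# First 50 residues of ORF98ng-1 (SEQ ID 748).
nterm = "MTEPAAEGGKAAKALKKYLITGILVWLPIAVTVWVVSYIVSASDQLVNLL"
starts = hydrophobic_windows(nterm)
print(starts)
```

Windows that clear the threshold are only candidates for membrane-spanning segments and would need confirmation by other means.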
Example 89
The following partial DNA sequence was identified in Neisseria meningitidis <SEQ ID 749>:
1 ATgAAAACGG TAGTCTGGAT TGTCGTCCTG TTTGCCGCCG CCGTCGGACT
51 GGCGCTGGCT TCGGGCATTT ACACCGGCGA CGTGTATATC GTACTCGGAC
101 AGACCATGCT CAGAATCAAC CTGCACGCCT TTGTGTTAGG TTCGCTGATT
201 ATATCCCCGA AAAGATGCAG CGTTTCGGTT CGGCnCGTAA AGGCCkCAAG
251 ssCGsGCTTG CCTTGAACAA GGCGGGTTTG GCGTATTTTG AAGGGCGTTT
301 TGAAAAGGCG GAACTAGAAG CCTCACGCGT GTTGGTCAAC AAAGtAGGCC
401 AGATGGAAAA CATCGAssTG CGCGACCGTT ATCTTGCGGA AATCGCCAAA
451 CTGCCGGAAA AACAGCAGCT TTCCCGTTAT CTTTTGTTGG CGGAATCGGC
501 GTTGAACCGG CGCGATTACG AAGCGGCGGA AGCCAATCTT CATGCGGCGG
551 CGAAGATGAA TGCCAACCTT ACGCGCCTCG TGCGTCTGCA .ATTCGTTAC
601 GCTTTCGACA GGGGCGACGC GTTGCAGGTT CTGGCAAAAA CCGAAAAACT
651 TTCCAAGGCG GGCGCGTTGG GCAAATCGGA AATGGAACGG TATCAAAATT
701 GGGCATAT C GTCGCCAGCT GGCGGATGCT GCCGATGCCG CCGCTTTGAA
751 AACCTGCCTG AAGCGGATTC CCGACAGCCT CAAAAACGGG GAATTGAGCG
801 TATCGGTTGC GGAAAAGTAC GAACGTTTGG GACTGTATGC CGATGCGGTC
851 AAATGGGTCA AACAGCATTA TCCGCAsAAC CGCCGCCCCG AGCTTTTGGA
901 AGCCTTTGTC GAAAGCGTGC GCTTTTTGGG CGAGCGCGAA CAGCAGAAAG
951 CCATCGATTT TGCCGATGCT TGGCTGAAAG AACAGCCCGA TAACGCGCTT
1001 CTGCTGATGT ATCTCGGTCG GCTCGCCTTC GGCCGCAAAC TTTGGGGCAA
1051 GGCAAAAGGC TACCTTGAAG CGAGCATTGC ATTAAAGCCG AGTATTTCCG
1101 CGCGTTTGGT TCTAACAAAG GTTTTCGACG AAATCGGAGA ACCGCAGAAG
1151 GCGGAGGCGC AC...
This corresponds to the amino acid sequence <SEQ ID 750; ORF100>:
1 MKTVVWIVVL FAAAVGLALA SGIYTGDVYI VLGQTMLRIN LHAFVLGSLI
51 AVVVWYFLFK FIIGVLNIPE KMQRFGSARK GXKXXLALNK AGLAYFEGRF
101 EKAELEASRV LVNKVGRDNR TLALMLXAHA AGQMENIXXR DRYLAEIAKL
151 PEKQQLSRYL LLAESALNRR DYEAAEANLH AAAKMNANLT RLVRLXIRYA
201 FDRGDALQVL AKTEKLSKAG ALGKSEMERY QNWAYRRQLA DAADAAALKT
251 CLKRIPDSLK NGELSVSVAE KYERLGLYAD AVKWVKQHYP XNRRPELLEA
301 FVESVRFLGE REQQKAIDFA DAWLKEQPDN ALLLMYLGRL AFGRKLWGKA
351 KGYLEASIAL KPSISARLVL TKVFDEIGEP QKAEAH...
Further work revealed the complete nucleotide sequence <SEQ ID 751>:
1 ATGAAAACGG TAGTCTGGAT TGTCGTCCTG TTTGCCGCCG CCGTCGGACT
51 GGCGCTGGCT TCGGGCATTT ACACCGGCGA CGTGTATATC GTACTCGGAC
101 AGACCATGCT CAGAATCAAC CTGCACGCCT TTGTGTTAGG TTCGCTGATT
151 GCCGTCGTGG TGTGGTATTT CTTGTTTAAA TTCATTATCG GCGTACTCAA
201 TATCCCCGAA AAGATGCAGC GTTTCGGTTC GGCGCGTAAA GGCCGCAAGG
251 CCGCGCTTGC CTTGAACAAG GCGGGTTTGG CGTATTTTGA AGGGCGTTTT
301 GAAAAGGCGG AACTAGAAGC CTCACGCGTG TTGGTCAACA AAGAGGCCGG
351 AGACAACCGG ACTTTGGCAT TGATGCTGGG CGCGCACGCC GCCGGACAGA
401 TGGAAAACAT CGAGCTGCGC GACCGTTATC TTGCGGAAAT CGCCAAACTG
451 CCGGAAAAAC AGCAGCTTTC CCGTTATCTT TTGTTGGCGG AATCGGCGTT
501 GAACCGGCGC GATTACGAAG CGGCGGAAGC CAATCTTCAT GCGGCGGCGA
551 AGATGAATGC CAACCTTACG CGCCTCGTGC GTCTGCAACT TCGTTACGCT
601 TTCGACAGGG GCGACGCGTT GCAGGTTCTG GCAAAAACCG AAAAACTTTC
651 CAAGGCGGGC GCGTTGGGCA AATCGGAAAT GGAACGGTAT CAAAATTGGG
701 CATACCGCCG CCAGCTGGCG GATGCTGCCG ATGCCGCCGC TTTGAAAACC
751 TGCCTGAAGC GGATTCCCGA CAGCCTCAAA AACGGGGAAT TGAGCGTATC
801 GGTTGCGGAA AAGTACGAAC GTTTGGGACT GTATGCCGAT GCGGTCAAAT
851 GGGTCAAACA GCATTATCCG CACAACCGCC GCCCCGAGCT TTTGGAAGCC
901 TTTGTCGAAA GCGTGCGCTT TTTGGGCGAG CGCGAACAGC AGAAAGCCAT
951 CGATTTTGCC GATGCTTGGC TGAAAGAACA GCCCGATAAC GCGCTTCTGC
1001 TGATGTATCT CGGTCGGCTC GCCTACGGCC GCAAACTTTG GGGCAAGGCA
1051 AAAGGCTACC TTGAAGCGAG CATTGCATTA AAGCCGAGTA TTTCCGCGCG
1101 TTTGGTTCTA GCAAAGGTTT TCGACGAAAT CGGAGAACCG CAGAAGGCGG
1151 AGGCGCAGCG CAACTTGGTT TTGGAAGCCG TCTCCGATGA CGAACGTCAC
1201 GCAGCGTTAG AGCAGCATAG CTGA
This corresponds to the amino acid sequence <SEQ ID 752; ORF100-1>:
1 MKTVVWIVVL FAAAVGLALA SGIYTGDVYI VLGQTMLRIN LHAFVLGSLI
51 AVVVWYFLFK FIIGVLNIPE KMQRFGSARK GRKAALALNK AGLAYFEGRF
101 EKAELEASRV LVNKEAGDNR TLALMLGAHA AGQMENIELR DRYLAEIAKL
151 PEKQQLSRYL LLAESALNRR DYEAAEANLH AAAKMNANLT RLVRLQLRYA
201 FDRGDALQVL AKTEKLSKAG ALGKSEMERY QNWAYRRQLA DAADAAALKT
251 CLKRIPDSLK NGELSVSVAE KYERLGLYAD AVKWVKQHYP HNRRPELLEA
301 FVESVRFLGE REQQKAIDFA DAWLKEQPDN ALLLMYLGRL AYGRKLWGKA
351 KGYLEASIAL KPSISARLVL AKVFDEIGEP QKAEAQRNLV LEAVSDDERH
401 AALEQHS*
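Computer analysis of this amino acid sequence gave the following results:
Homology with a predicted ORF from Neisseria meningitidis (strain A)
ORF100 shows 93.5% identity over a 386 aa overlap with an ORF (ORF100a) from strain A of Neisseria meningitidis: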
10 20 30 40 50 60
orf100.pep MKTVVWIVVLFAAAVGLALASGIYTGDVYIVLGQTMLRINLHAFVLGSLIAVVVWYFLFK
|||||||||||||| |||||||| ||||||||||||||||||||||||||||||||||||
orf100a MKTVVWIVVLFAAAXGLALASGIXTGDVYIVLGQTMLRINLHAFVLGSLIAVVVWYFLFK
10 20 30 40 50 60
70 80 90 100 110 120
orf100.pep FIIGVLNIPEKMQRFGSARKGXKXXLALNKAGLAYFEGRFEKAELEASRVLVNKVGRDNR
||||||| ||||||||||||| | |||||||||||||||||||||||||| || : |||
orf100a FIIGVLNXPEKMQRFGSARKGRKAALALNKAGLAYFEGRFEKAELEASRVLGNKEAGDNR
70 80 90 100 110 120
130 140 150 160 170 180
orf100.pep TLALMLXAHAAGQMENIXXRDRYLAEIAKLPEKQQLSRYLLLAESALNRRDYEAAEANLH
|||||| |||||||||| |||||||||||||||||||||||||||||||||||||||||
orf100a TLALMLGAHAAGQMENIELRDRYLAEIAKLPEKQQLSRYLLLAESALNRRDYEAAEANLH
130 140 150 160 170 180
190 200 210 220 230 240
orf100.pep AAAKMNANLTRLVRLXIRYAFDRGDALQVLAKTEKLSKAGALGKSEMERYQNWAYRRQLA
||||||||||||||| :||||||||||||||||| ||||| ||||||||||||||||||
orf100a AAAKMNANLTRLVRLQLRYAFDRGDALQVLAKTEKXSKAGAXGKSEMERYQNWAYRRQLX
190 200 210 220 230 240
250 260 270 280 290 300
orf100.pep DAADAAALKTCLKRIPDSLKNGELSVSVAEKYERLGLYADAVKWVKQHYPXNRRPELLEA
|||||||||||||||||||||||||||||||||||||||||||||||||| |||||||||
orf100a DAADAAALKTCLKRIPDSLKNGELSVSVAEKYERLGLYADAVKWVKQHYPHNRRPELLEA
250 260 270 280 290 300
310 320 330 340 350 360
orf100.pep FVESVRFLGEREQQKAIDFADAWLKEQPDNALLLMYLGRLAFGRKLWGKAKGYLEASIAL
|||||||||||:|||||||||||||||||||||||||||||:||||||||||||||||||
orf100a FVESVRFLGERDQQKAIDFADAWLKEQPDNALLLXYLGRLAYGRKLWGKAKGYLEASIAL
310 320 330 340 350 360
370 380
orf100.pep KPSISARLVLTKVFDEIGEPQKAEAH
||||||||||:||||| ||||||||:
orf100a KPSISARLVLAKVFDETGEPQKAEAQRNLVLASVAEENRPSAETHX
370 380 390 400
The complete length ORF100a nucleotide sequence <SEQ ID 753> is:
1 ATGAAAACGG TAGTCTGGAT TGTCGTCCTG TTTGCCGCCG CNNTCGGGCT
51 GGCATTGGCG TCGGGCATTN ACACCGGCGA CGTGTATATC GTACTCGGAC
101 AGACCATGCT CAGAATCAAC CTGCACGCCT TTGTGTTAGG TTCGCTGATT
151 GCCGTCGTGG TGTGGTATTT CCTGTTCAAA TTCATCATCG GCGTACTCAA
201 TANCCCCGAA AAGATGCAGC GTTTCGGTTC GGCGCGTAAA GGCCGCAAGG
251 CCGCGCTTGC TTTGAACAAG GCGGGTTTGG CGTATTTTGA AGGGCGTTTT
301 GAAAAGGCGG AACTTGAAGC CTCGCGCGTA TTGGGAAACA AAGAGGCGGG
351 GGATAACCGG ACTTTGGCAT TGATGTTGGG CGCACATGCC GCCGGGCAGA
401 TGGAAAACAT CGAGCTGCGC GACCGTTATC TTGCGGAAAT CGCCAAACTG
451 CCGGAAAAGC AGCAGCTTTC CCGTTATCTT TTGTTGGCGG AATCGGCGTT
501 GAACCGGCGC GATTACGAAG CGGCGGAAGC CAATCTTCAT GCGGCGGCGA
551 AGATGAATGC CAACCTTACG CGCCTCGTGC GTCTGCAACT TCGTTACGCT
601 TTCGACAGGG GCGACGCGTT GCAGGTTCTG GCAAAAACCG AAAAANTTTC
651 CAAGGCGGGC GCGTNGGGCA AATCGGAAAT GGAACGGTAT CAAAATTGGG
701 CATACCGCCG CCAGCTGNCG GATGCTGCCG ATGCCGCCGC TTTGAAAACC
751 TGCCTGAAGC GGATTCCCGA CAGCCTCAAA AACGGGGAAT TGAGCGTATC
801 GGTTGCGGAA AAGTACGAAC GTTTGGGACT GTATGCCGAT GCGGTCAAAT
851 GGGTCAAACA GCATTATCCG CACAACCGCC GACCCGAACT TTTGGAAGCN
901 TTTGTCGAAA GCGTGCGCTT TTTGGGCGAA CGCGATCAGC AGAAAGCCAT
951 CGATTTTGCC GATGCTTGGC TGAAAGAACA GCCCGATAAT GCGCTTCTGC
1001 TGANGTATCT CGGTCGGCTC GCCTACGGCC GCAAACTTTG GGGCAAGGCA
1051 AAAGGCTACC TTGAAGCGAG CATTGCATTA AAGCCGAGTA TTTCCGCGCG
1101 TTTGGTTCTG GCAAAGGTTT TTGACGAAAC CGGAGAACCG CAGAAGGCGG
1151 AGGCGCAGCG CAACTTGGTT TTGGCAAGCG TTGCCGAGGA AAACCGNCCT
1201 TCCGCCGAAA CCCATTGA
This encodes a protein having amino acid sequence <SEQ ID 754>:
1 MKTVVWIVVL FAAAXGLALA SGIXTGDVYI VLGQTMLRIN LHAFVLGSLI
51 AVVVWYFLFK FIIGVLNXPE KMQRFGSARK GRKAALALNK AGLAYFEGRF
101 EKAELEASRV LGNKEAGDNR TLALMLGAHA AGQMENIELR DRYLAEIAKL
151 PEKQQLSRYL LLAESALNRR DYEAAEANLH AAAKMNANLT RLVRLQLRYA
201 FDRGDALQVL AKTEKXSKAG AXGKSEMERY QNWAYRRQLX DAADAAALKT
251 CLKRIPDSLK NGELSVSVAE KYERLGLYAD AVKWVKQHYP HNRRPELLEA
301 FVESVRFLGE RDQQKAIDFA DAWLKEQPDN ALLLXYLGRL AYGRKLWGKA
351 KGYLEASIAL KPSISARLVL AKVFDETGEP QKAEAQRNLV LASVAEENRP
401 SAETH*
ORF100a and ORF100-1 show 95.1% identity over a 406 aa overlap:
10 20 30 40 50 60
orf100a.pep MKTVVWIVVLFAAAXGLALASGIXTGDVYIVLGQTMLRINLHAFVLGSLIAVVVWYFLFK
|||||||||||||| |||||||| ||||||||||||||||||||||||||||||||||||
orf100-1 MKTVVWIVVLFAAAVGLALASGIYTGDVYIVLGQTMLRINLHAFVLGSLIAVVVWYFLFK
10 20 30 40 50 60
70 80 90 100 110 120
orf100a.pep FIIGVLNXPEKMQRFGSARKGRKAALALNKAGLAYFEGRFEKAELEASRVLGNKEAGDNR
||||||| ||||||||||||||||||||||||||||||||||||||||||| ||||||||
orf100-1 FIIGVLNIPEKMQRFGSARKGRKAALALNKAGLAYFEGRFEKAELEASRVLVNKEAGDNR
70 80 90 100 110 120
130 140 150 160 170 180
orf100a.pep TLALMLGAHAAGQMENIELRDRYLAEIAKLPEKQQLSRYLLLAESALNRRDYEAAEANLH
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf100-1 TLALMLGAHAAGQMENIELRDRYLAEIAKLPEKQQLSRYLLLAESALNRRDYEAAEANLH
130 140 150 160 170 180
190 200 210 220 230 240
orf100a.pep AAAKMNANLTRLVRLQLRYAFDRGDALQVLAKTEKXSKAGAXGKSEMERYQNWAYRRQLX
||||||||||||||||||||||||||||||||||| ||||| |||||||||||||||||
orf100-1 AAAKMNANLTRLVRLQLRYAFDRGDALQVLAKTEKLSKAGALGKSEMERYQNWAYRRQLA
190 200 210 220 230 240
250 260 270 280 290 300
orf100a.pep DAADAAALKTCLKRIPDSLKNGELSVSVAEKYERLGLYADAVKWVKQHYPHNRRPELLEA
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf100-1 DAADAAALKTCLKRIPDSLKNGELSVSVAEKYERLGLYADAVKWVKQHYPHNRRPELLEA
250 260 270 280 290 300
310 320 330 340 350 360
orf100a.pep FVESVRFLGERDQQKAIDFADAWLKEQPDNALLLXYLGRLAYGRKLWGKAKGYLEASIAL
|||||||||||:|||||||||||||||||||||| |||||||||||||||||||||||||
orf100-1 FVESVRFLGEREQQKAIDFADAWLKEQPDNALLLMYLGRLAYGRKLWGKAKGYLEASIAL
310 320 330 340 350 360
370 380 390 400
orf100a.pep KPSISARLVLAKVFDETGEPQKAEAQRNLVLASVAEENRPSA-ETHX
|||||||||||||||| |||||||||||||| :|::::| :| | |
orf100-1 KPSISARLVLAKVFDEIGEPQKAEAQRNLVLEAVSDDERHAALEQHSX
370 380 390 400
Homology with a predicted ORF from Neisseria gonorrhoeae
ORF100 shows 93.3% identity over a 386 aa overlap with a predicted ORF (ORF100ng) from Neisseria gonorrhoeae:
orf100.pep MKTVVWIVVLFAAAVGLALASGIYTGDVYIVLGQTMLRINLHAFVLGSLIAVVVWYFLFK 60
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf100ng MKTVVWIVVLFAAAVGLALASGIYTGDVYIVLGQTMLRINLHAFVLGSLIAVVVWYFLFK 60
orf100.pep FIIGVLNIPEKMQRFGSARKGXKXXLALNKAGLAYFEGRFEKAELEASRVLVNKVGRDNR 120
||||||||||:|:| |||||| | |||||||||||||||||||||||||| || : |||
orf100ng FIIGVLNIPENMRRSGSARKGRKAALALNKAGLAYFEGRFEKAELEASRVLGNKEAGDNR 120
orf100.pep TLALMLXAHAAGQMENIXXRDRYLAEIAKLPEKQQLSRYLLLAESALNRRDYEAAEANLH 180
|||||| |||||||||| |||||||||||||||||||||||||||||||||||||||||
orf100ng TLALMLGAHAAGQMENIELRDRYLAEIAKLPEKQQLSRYLLLAESALNRRDYEAAEANLH 180
orf100.pep AAAKMNANLTRLVRLXIRYAFDRGDALQVLAKTEKLSKAGALGKSEMERYQNWAYRRQLA 240
||||||||||||||| :|||||||||||||||||||||||||||||||||||||||||:|
orf100ng AAAKMNANLTRLVRLQLRYAFDRGDALQVLAKTEKLSKAGALGKSEMERYQNWAYRRQMA 240
orf100.pep DAADAAALKTCLKRIPDSLKNGELSVSVAEKYERLGLYADAVKWVKQHYPXNRRPELLEA 300
|||||||||||||||||||||||||||||||||||||||||||||||||| |||||||||
orf100ng DAADAAALKTCLKRIPDSLKNGELSVSVAEKYERLGLYADAVKWVKQHYPHNRRPELLEA 300
orf100.pep FVESVRFLGEREQQKAIDFADAWLKEQPDNALLLMYLGRLAFGRKLWGKAKGYLEASIAL 360
|||||||||||||||||||:|||||||||||||||||||||:||||||||||||||||||
orf100ng FVESVRFLGEREQQKAIDFADSWLKEQPDNALLLMYLGRLAYGRKLWGKAKGYLEASIAL 360
orf100.pep KPSISARLVLTKVFDEIGEPQKAEAH 386
|||| |||||:||||| :: |||||:
orf100ng KPSIPARLVLAKVFDETAQSQKAEAQRNLVLASVAGENRPSAETR 405
The complete length ORF100ng nucleotide sequence <SEQ ID 755> is:
1 ATGAAAACGG TAGTCTGGAT TGTTGTCCTG TTTGCCGCCG CCGTCGGACT
51 GGCGCTGGCT TCGGGCATTT ACACCGGCGA CGTGTATATC GTACTCGGAC
101 AGACCATGCT CAGAATCAAC CTGCACGCCT TTGTGTTAGG TTCGCTGATT
151 GCCGTCGTGG TGTGGTATTT CCTGTTTAAA TTCATCATCG GCGTACTCAA
201 TATCCCCGAA AATATGCGGC GTTCCGGTTC GGCGCGGAAA GGCCGCAAGG
251 CCGCGCTTGC CTTGAATAAG GCGGGTTTGG CGTATTTCGA AGGGCGTTTT
301 GAAAAGGCGG AACTCGAAGC CTCTCGAGTG TTGGGCAACA AAGAGGCCGG
351 AGACAACCGG ACTTTGGCAT TGATGCTGGG CGCGCACGCG GCAGGACAGA
401 TGGAAAATAT CGAGCTGCGC GACCGTTATC TTGCGGAAAT CGCCAAACTG
451 CCGGAAAAAC AGCAGCTTTC CCGCTATCTT CTGCTGGCGG AATCGGCGTT
501 AAACCGGCGC GATTACGAAG CGGCGGAAGC CAATCTTCAT GCGGCGGCGA
551 AGATGAATGC CAACCTTACG CGCCTCGTGC GTCTGCAACT TCGTTACGCC
601 TTCGATCGGG GCGATGCGTT GCAGGTTCTG GCAAAAaccG AAAAACTTTC
651 CAAGGCGGGC GCGTTGGGCA AATCGGAAAT GGAACGGTAT CAAAATTGGG
701 CATACCGCCG CCAGATGGCG GATGCTGCCG ATGCCGCCGC TTTGAAAACC
751 TGCCTGAAGC GGATTCCCGA CAGCCTCAAA AACGGGGAAT TGagcGTATC
801 GGTTGCGGAA AAGTACGAAC GTTTGGGACT GTATGCCGAT GCGGTCAAAT
851 GGGTCAAACA GCATTATCCG CACAACCGCC GCCCCGAGCT TTTGGAAGCC
901 TTTGTCGAAA GCGTGCGCTT TTTGGGCGAG CGCGAACAGC AGAAAGCCAT
951 CGATTTTGCC GATTCTTGGC TGAAAGAACA GCCCGATAAC GCGCTTCTGC
1001 TGATGTATCT CGGCCGGCTC GCCTACGGCC GCAAACTTTG GGGTAAGGCA
1051 AAAGGCTACC TTGAAGCGAG TATTGCACTG AAGCCGAGTA TTCCGGCGCG
1101 TTTGGTGTTG GCAAAGGTTT TTGACGAAAC CGCACAGTCG CAAAAAGCCG
1151 AAGCACAGCG CAACTTGGTT TTGGCAAGCG TTGCCGGGGA AAACCGCCCT
1201 TCCGCCGAAA CCCGTTGA
This encodes a protein having amino acid sequence <SEQ ID 756>:
1 MKTVVWIVVL FAAAVGLALA SGIYTGDVYI VLGQTMLRIN LHAFVLGSLI
51 AVVVWYFLFK FIIGVLNIPE NMRRSGSARK GRKAALALNK AGLAYFEGRF
101 EKAELEASRV LGNKEAGDNR TLALMLGAHA AGQMENIELR DRYLAEIAKL
151 PEKQQLSRYL LLAESALNRR DYEAAEANLH AAAKMNANLT RLVRLQLRYA
201 FDRGDALQVL AKTEKLSKAG ALGKSEMERY QNWAYRRQMA DAADAAALKT
251 CLKRIPDSLK NGELSVSVAE KYERLGLYAD AVKWVKQHYP HNRRPELLEA
301 FVESVRFLGE REQQKAIDFA DSWLKEQPDN ALLLMYLGRL AYGRKLWGKA
351 KGYLEASIAL KPSIPARLVL AKVFDETAQS QKAEAQRNLV LASVAGENRP
401 SAETR*
ORF100ng and ORF100-1 show 95.3% identity over a 402 aa overlap:
10 20 30 40 50 60
orf100-1.pep MKTVVWIVVLFAAAVGLALASGIYTGDVYIVLGQTMLRINLHAFVLGSLIAVVVWYFLFK
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf100ng MKTVVWIVVLFAAAVGLALASGIYTGDVYIVLGQTMLRINLHAFVLGSLIAVVVWYFLFK
10 20 30 40 50 60
70 80 90 100 110 120
orf100-1.pep FIIGVLNIPEKMQRFGSARKGRKAALALNKAGLAYFEGRFEKAELEASRVLVNKEAGDNR
||||||||||:|:| |||||||||||||||||||||||||||||||||| ||||||||||
orf100ng FIIGVLNIPENMRRSGSARKGRKAALALNKAGLAYFEGRFEKAELEASRVLGNKEAGDNR
70 80 90 100 110 120
130 140 150 160 170 180
orf100-1.pep TLALMLGAHAAGQMENIELRDRYLAEIAKLPEKQQLSRYLLLAESALNRRDYEAAEANLH
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf100ng TLALMLGAHAAGQMENIELRDRYLAEIAKLPEKQQLSRYLLLAESALNRRDYEAAEANLH
130 140 150 160 170 180
190 200 210 220 230 240
orf100-1.pep AAAKMNANLTRLVRLQLRYAFDRGDALQVLAKTEKLSKAGALGKSEMERYQNWAYRRQLA
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||:|
orf100ng AAAKMNANLTRLVRLQLRYAFDRGDALQVLAKTEKLSKAGALGKSEMERYQNWAYRRQMA
190 200 210 220 230 240
250 260 270 280 290 300
orf100-1.pep DAADAAALKTCLKRIPDSLKNGELSVSVAEKYERLGLYADAVKWVKQHYPHNRRPELLEA
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf100ng DAADAAALKTCLKRIPDSLKNGELSVSVAEKYERLGLYADAVKWVKQHYPHNRRPELLEA
250 260 270 280 290 300
310 320 330 340 350 360
orf100-1.pep FVESVRFLGEREQQKAIDFADAWLKEQPDNALLLMYLGRLAYGRKLWGKAKGYLEASIAL
|||||||||||||||||||||:||||||||||||||||||||||||||||||||||||||
orf100ng FVESVRFLGEREQQKAIDFADSWLKEQPDNALLLMYLGRLAYGRKLWGKAKGYLEASIAL
310 320 330 340 350 360
370 380 390 400
orf100-1.pep KPSISARLVLAKVFDEIGEPQKAEAQRNLVLEAVSDDERHAALEQHSX
|||| ||||||||||| :: ||||||||||| :|: ::| :|
orf100ng KPSIPARLVLAKVFDETAQSQKAEAQRNLVLASVAGENRPSAETRX
370 380 390 400
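Based on this analysis, including the presence of a putative leader sequence, a putative transmembrane domain and an RGD motif, it is predicted that this protein from Neisseria meningitidis and Neisseria gonorrhoeae, and its epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies.
Example 90
The following DNA sequence, believed to be complete, was identified in Neisseria meningitidis <SEQ ID 757>: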
1 ATGATGTTTT CTTGGTTCAA GCTGTTTCAC TTGTTTTTTG TCATTTCGTG
51 GTTTGCAGGG CTGTTTTACC TGCCGAGGAT TTTCGTCAAT ATGGCGATGA
101 TTGATGTGCC GCGCGGCAAT CCCGAGTATG TGCGTCTGTC GGGCATGGCG
151 GTGCGGCTGT ACCGTTTTAT GTCGCCGTTG GGCTTCGGCG CGGTCGTGTT
201 CGGCGCGGCG ATACCGTTTG CCGCCGGCTG GTGGGGCAGC GGCTGGGTAC
251 ACGTCAAACT GTGTTTGGGC TTGATGCTCT TGGCTTACCA GTTGTATTGC
301 GGCGTGCTGC TGCGCCGTTT TCAGGATTAC AGCAATGCTT TTTCACACCG
351 CTGGTACCGC GTGTTCAACG AAATCCCCGT GCTGCTGATG GTTGCCGCGC
401 TGTATsTGGT CGTGTTCAAA CCGTTTTGA
This corresponds to the amino acid sequence <SEQ ID 758; ORF102>:
1 MMFSWFKLFH LFFVISWFAG LFYLPRIFVN MAMIDVPRGN PEYVRLSGMA
51 VRLYRFMSPL GFGAVVFGAA IPFAAGWWGS GWVHVKLCLG LMLLAYQLYC
101 GVLLRRFQDY SNAFSHRWYR VFNEIPVLLM VAALYXVVFK PF*
Further work revealed the complete nucleotide sequence <SEQ ID 759>:
1 ATGATGTTTT CTTGGTTCAA GCTGTTTCAC TTGTTTTTTG TCATTTCGTG
51 GTTTGCAGGG CTGTTTTACC TGCCGAGGAT TTTCGTCAAT ATGGCGATGA
101 TTGATGTGCC GCGCGGCAAT CCCGAGTATG TGCGTCTGTC GGGCATGGCG
151 GTGCGGCTGT ACCGTTTTAT GTCGCCGTTG GGCTTCGGCG CGGTCGTGTT
201 CGGCGCGGCG ATACCGTTTG CCGCCGGCTG GTGGGGCAGC GGCTGGGTAC
251 ACGTCAAACT GTGTTTGGGC TTGATGCTCT TGGCTTACCA GTTGTATTGC
301 GGCGTGCTGC TGCGCCGTTT TCAGGATTAC AGCAATGCTT TTTCACACCG
351 CTGGTACCGC GTGTTCAACG AAATCCCCGT GCTGCTGATG GTTGCCGCGC
401 TGTATCTGGT CGTGTTCAAA CCGTTTTGA
This corresponds to the amino acid sequence <SEQ ID 760; ORF102-1>:
1 MMFSWFKLFH LFFVISWFAG LFYLPRIFVN MAMIDVPRGN PEYVRLSGMA
51 VRLYRFMSPL GFGAVVFGAA IPFAAGWWGS GWVHVKLCLG LMLLAYQLYC
101 GVLLRRFQDY SNAFSHRWYR VFNEIPVLLM VAALYLVVFK PF*
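Computer analysis of this amino acid sequence gave the following results:
Homology with the hypothetical integral membrane protein HP1484 of Helicobacter pylori (accession number AE000647)
ORF102 and HP1484 show 33% amino acid identity over a 143 aa overlap: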
orf102 3 FSWFKLFHLFFVISWFAGLFYLPRIFVNMAMIDVPRGNPEYVRLSGMAVRLYRFMSPLGF 62
F W K FH+ VISW A LFYLPR+FV A + V++ +LY F++
HP1484 8 FLWVKAFHVIAVISWMAALFYLPRLFVYHAENAHKKEFVGVVQIQEK--KLYSFIASPAM 65
orf102 63 GAVVFGAAIPFAAG---WWGSGWVHVKLCLGLMLLAYQLYCGVLLRRFQDYSNAFSHRWY 119
G + + + GW+H KL L ++LLAY YC +R + + R+Y
HP1484 66 GFTLITGILMLLIEPTLFKSGGWLHAKLALVVLLLAYHFYCKKCMRELEKDPTRRNARFY 125
orf102 120 RVFNEIPXXXXXXXXXXXXFKPF 142
RVFNE P KPF
HP1484 126 RVFNEAPTILMILIVILVVVKPF 148
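Homology with a predicted ORF from Neisseria meningitidis (strain A)
ORF102 shows 99.3% identity over a 142 aa overlap with an ORF (ORF102a) from strain A of Neisseria meningitidis: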
10 20 30 40 50 60
orf102.pep MMFSWFKLFHLFFVISWFAGLFYLPRIFVNMAMIDVPRGNPEYVRLSGMAVRLYRFMSPL
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf102a MMFSWFKLFHLFFVISWFAGLFYLPRIFVNMAMIDVPRGNPEYVRLSGMAVRLYRFMSPL
10 20 30 40 50 60
70 80 90 100 110 120
orf102.pep GFGAVVFGAAIPFAAGWWGSGWVHVKLCLGLMLLAYQLYCGVLLRRFQDYSNAFSHRWYR
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf102a GFGAVVFGAAIPFAAGWWGSGWVHVKLCLGLMLLAYQLYCGVLLRRFQDYSNAFSHRWYR
70 80 90 100 110 120
130 140
orf102.pep VFNEIPVLLMVAALYXVVFKPFX
||||||||||||||| |||||||
orf102a VFNEIPVLLMVAALYLVVFKPFX
130 140
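The identity figure quoted for this alignment can be reproduced by counting matching columns. The sketch below assumes the two sequences are already aligned without gaps and that the undetermined residue X counts as a mismatch; the function name `percent_identity` is hypothetical, and the patent does not state which alignment program produced its figures.

```python
def percent_identity(a: str, b: str) -> float:
    """Percentage of aligned columns that hold the same residue.

    Assumes `a` and `b` are gap-free and of equal length. Columns that
    contain the undetermined residue 'X' are counted as mismatches.
    """
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    matches = sum(1 for x, y in zip(a, b) if x == y and x != "X")
    return 100.0 * matches / len(a)


# Rows copied from the ORF102 / ORF102a alignment, without the final stop marker.
orf102 = (
    "MMFSWFKLFHLFFVISWFAGLFYLPRIFVNMAMIDVPRGNPEYVRLSGMAVRLYRFMSPL"
    "GFGAVVFGAAIPFAAGWWGSGWVHVKLCLGLMLLAYQLYCGVLLRRFQDYSNAFSHRWYR"
    "VFNEIPVLLMVAALYXVVFKPF"
)
# In the alignment, ORF102a differs only at the undetermined position, where it has L.
orf102a = orf102.replace("X", "L")

print(f"{percent_identity(orf102, orf102a):.1f}% identity")
```

With one mismatching column out of 142, this gives 141/142, which rounds to the 99.3% stated in the text.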
The complete length ORF102a nucleotide sequence <SEQ ID 761> is:
1 ATGATGTTTT CTTGGTTCAA GCTGTTTCAC TTGTTTTTTG TCATTTCGTG
51 GTTTGCAGGG CTGTTTTACC TGCCGAGGAT TTTCGTCAAT ATGGCGATGA
101 TTGATGTGCC GCGCGGCAAT CCCGAGTATG TGCGTCTGTC GGGCATGGCG
151 GTGCGGCTGT ACCGTTTTAT GTCGCCGTTG GGCTTCGGCG CGGTCGTGTT
201 CGGCGCGGCG ATACCGTTTG CCGCCGGCTG GTGGGGCAGC GGCTGGGTAC
251 ACGTCAAACT GTGTTTGGGC TTGATGCTCT TGGCTTACCA GTTGTATTGC
301 GGCGTGCTGC TGCGCCGTTT TCAGGATTAC AGCAATGCTT TTTCACACCG
351 CTGGTACCGC GTGTTCAACG AAATCCCCGT GCTGCTGATG GTTGCCGCGC
401 TGTATCTGGT CGTGTTCAAA CCGTTTTGA
This encodes a protein having the amino acid sequence <SEQ ID 762>:
1 MMFSWFKLFH LFFVISWFAG LFYLPRIFVN MAMIDVPRGN PEYVRLSGMA
51 VRLYRFMSPL GFGAVVFGAA IPFAAGWWGS GWVHVKLCLG LMLLAYQLYC
101 GVLLRRFQDY SNAFSHRWYR VFNEIPVLLM VAALYLVVFK PF*
ORF102a and ORF102-1 are identical over a 142 aa overlap:
10 20 30 40 50 60
orf102a.pep MMFSWFKLFHLFFVISWFAGLFYLPRIFVNMAMIDVPRGNPEYVRLSGMAVRLYRFMSPL
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf102-1 MMFSWFKLFHLFFVISWFAGLFYLPRIFVNMAMIDVPRGNPEYVRLSGMAVRLYRFMSPL
10 20 30 40 50 60
70 80 90 100 110 120
orf102a.pep GFGAVVFGAAIPFAAGWWGSGWVHVKLCLGLMLLAYQLYCGVLLRRFQDYSNAFSHRWYR
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf102-1 GFGAVVFGAAIPFAAGWWGSGWVHVKLCLGLMLLAYQLYCGVLLRRFQDYSNAFSHRWYR
70 80 90 100 110 120
130 140
orf102a.pep VFNEIPVLLMVAALYLVVFKPFX
|||||||||||||||||||||||
orf102-1 VFNEIPVLLMVAALYLVVFKPFX
130 140
Homology with a predicted ORF from N. gonorrhoeae
ORF102 shows 97.9% identity over a 142 aa overlap with a predicted ORF (ORF102ng) from N. gonorrhoeae:
orf102.pep MMFSWFKLFHLFFVISWFAGLFYLPRIFVNMAMIDVPRGNPEYVRLSGMAVRLYRFMSPL 60
|||||||||||||||||||||||||||||||||||:||||||||||||||||||||||||
orf102ng MMFSWFKLFHLFFVISWFAGLFYLPRIFVNMAMIDAPRGNPEYVRLSGMAVRLYRFMSPL 60
orf102.pep GFGAVVFGAAIPFAAGWWGSGWVHVKLCLGLMLLAYQLYCGVLLRRFQDYSNAFSHRWYR 120
|||||||||||||||| |||||||||||||||||||||||||||||||||||||||||||
orf102ng GFGAVVFGAAIPFAAGRWGSGWVHVKLCLGLMLLAYQLYCGVLLRRFQDYSNAFSHRWYR 120
orf102.pep VFNEIPVLLMVAALYXVVFKPF 142
||||||||||||||| ||||||
orf102ng VFNEIPVLLMVAALYLVVFKPF 142
The complete length ORF102ng nucleotide sequence <SEQ ID 763> is:
1 ATGATGTTTT CTTGGTTCAA GCTGTTTCAC TTGTTTTTTG TCATTTCGTG
51 GTTTGCAGGG CTGTTTTACC TGCCGAGGAT TTTCGTCAAT ATGGCGATGA
101 TTGATGCGCC GCGCGGCAAT CCCGAGTATG TGCGCCTGTC GGGGATGGCG
151 GTGCGGTTGT ACCGTTTTAT GTCGCCTTTG GGTTTCGGCG CGGTCGTGTT
201 CGGCGCGGCG ATACCGTTTG CCGCcggccg GTGGGGCagc ggctggGTTC
251 ACGTCAAACT GTGTTTGGGC TTGATGCTCT TGGCTTATCA GTTGTATTGC
301 GGCGTGCTGC TGCGCCGTTT TCAGGATTAC AGCAATGCTT TTTCACACCG
351 CTGGTACCGC GTGTTCAAcg aAATCCCCGT GCTGCTGATG GTTGCCGCGC
401 TGTATCTGGT CGTGTTCAAA CCGTTTTGA
This encodes a protein having the amino acid sequence <SEQ ID 764>:
1 MMFSWFKLFH LFFVISWFAG LFYLPRIFVN MAMIDAPRGN PEYVRLSGMA
51 VRLYRFMSPL GFGAVVFGAA IPFAAGRWGS GWVHVKLCLG LMLLAYQLYC
101 GVLLRRFQDY SNAFSHRWYR VFNEIPVLLM VAALYLVVFK PF*
ORF102ng and ORF102-1 show 98.6% identity over a 142 aa overlap:
10 20 30 40 50 60
orf102-1.pep MMFSWFKLFHLFFVISWFAGLFYLPRIFVNMAMIDVPRGNPEYVRLSGMAVRLYRFMSPL
|||||||||||||||||||||||||||||||||||:||||||||||||||||||||||||
orf102ng MMFSWFKLFHLFFVISWFAGLFYLPRIFVNMAMIDAPRGNPEYVRLSGMAVRLYRFMSPL
10 20 30 40 50 60
70 80 90 100 110 120
orf102-1.pep GFGAVVFGAAIPFAAGWWGSGWVHVKLCLGLMLLAYQLYCGVLLRRFQDYSNAFSHRWYR
|||||||||||||||| |||||||||||||||||||||||||||||||||||||||||||
orf102ng GFGAVVFGAAIPFAAGRWGSGWVHVKLCLGLMLLAYQLYCGVLLRRFQDYSNAFSHRWYR
70 80 90 100 110 120
130 140
orf102-1.pep VFNEIPVLLMVAALYLVVFKPFX
|||||||||||||||||||||||
orf102ng VFNEIPVLLMVAALYLVVFKPFX
130 140
In addition, ORF102ng shows significant homology with a membrane protein from H. pylori:
gi|2314656 (AE000647) conserved hypothetical integral protein [Helicobacter pylori] Length = 148
Score = 79.2 bits (192), Expect = 1e-14
Identities = 50/147 (34%), Positives = 68/147 (46%), Gaps = 13/147 (8%)
Query: 3 FSWFKLFHLFFVISWFAGLFYLPRIFVNMAMIDAPRGNPEYVRLSGMAVRLYRFMSPLGF 62
F W K FH+ VISW A LFYLPR+FV A + V++ +LY F++
Sbjct: 8 FLWVKAFHVIAVISWMAALFYLPRLFVYHAENAHKKEFVGVVQIQEK--KLYSFIASPAM 65
Query: 63 GAVVFGAAIP-------FAAGRWGSGWVHVKLCLGLMLLAYQLYCGVLLRRFQDYSNAFS 115
G + + F +G GW+H KL L ++LLAY YC +R + +
Sbjct: 66 GFTLITGILMLLIEPTLFKSG----GWLHAKLALVVLLLAYHFYCKKCMRELEKDPTRRN 121
Query: 116 HRWYRVFNEIPXXXXXXXXXXXXFKPF 142
R+YRVFNE P KPF
Sbjct: 122 ARFYRVFNEAPTILMILIVILVVVKPF 148
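The percentages in the summary line of this hit follow from the counts it reports. The sketch below assumes that the reported values are integer percentages truncated rather than rounded, which is consistent with 13/147 being shown as 8%; the helper name `blast_percent` is hypothetical.

```python
def blast_percent(count: int, length: int) -> int:
    """Integer percentage, truncated (assumed reporting convention)."""
    return (100 * count) // length


alignment_length = 147  # aligned columns, including gap positions
for label, count in (("Identities", 50), ("Positives", 68), ("Gaps", 13)):
    pct = blast_percent(count, alignment_length)
    print(f"{label} = {count}/{alignment_length} ({pct}%)")
```

Under that assumption the three counts give 34%, 46% and 8%, matching the figures quoted for the H. pylori hit.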
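Based on this analysis, it is predicted that these proteins from N. meningitidis and N. gonorrhoeae, and their epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies.
Example 91
The following partial DNA sequence was identified in N. meningitidis <SEQ ID 765>: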
1 ATGGCAAAAA TGATGAAATG GGCGGCTGTT GCGGCGGTCG CGGCGGCAGC
51 GGTTTGGGGC GGATGGTCTT AACTGAAGCC CGAGCCGCAC GTGCTTGATA
101 TTACGGAAAC GGTCAGGCGC GGC //.....
//.. ATTTCGTTTA CGATTTTGTC CGAACCGGAT ACGCCGATTA AGGCGAAGCT
51 CGACAGCGTC GACCCCGGGC TGACCACGAT GTCGTCGGGC GGTTACAACA
101 GCAGTACGGA TACGGCTTCC AATGCGGTCT ACTATTATGC CCGTTCGTTT
151 GTGCCGAATC CGGACGGCAA ACTCGCCACG GGGATGACGA CGCAGAATAC
201 GGTTGAAATC GACGGCGTGA AAAATGTGCT GATTATTCCG TCGCTGACCG
251 TGAAAAATCG CGGCGGCAAG GCGTTTGTGC GCGTGTTGGG TGCGGACGGC
301 AAGGCGGCGG AACGCGAAAT CCGGACCGGT ATGAGAGACA GTATGAATAC
351 CGAAGTAAAA AGCGGGTTGA AAGAGGGGGA CAAAGTGGTC ATCTCCGAAA
401 TAACCGCCGC CGAGCAACAG GAAAGCGGCG AACGCGCCCT AGGCGGCCCG
451 CCGCGCCGAT AA
This corresponds to the amino acid sequence <SEQ ID 766; ORF85>:
1 MAKMMKWAAV AAVAAAAVWG GWS.LKPEPH VLDITETVRR G........
51 .......... .......... .......... .......... ..........
101 .......... .......... .......... .......... ..........
151 .......... .......... .......... .......... ..........
201 .......... .......... .......... .........I SFTILSEPDT
251 PIKAKLDSVD PGLTTMSSGG YNSSTDTASN AVYYYARSFV PNPDGKLATG
301 MTTQNTVEID GVKNVLIIPS LTVKNRGGKA FVRVLGADGK AAEREIRTGM
351 RDSMNTEVKS GLKEGDKVVI SEITAAEQQE SGERALGGPP RR*
Further work revealed the partial nucleotide sequence <SEQ ID 767>:
1 ..GTATCGGTCG GCGCGCAGGC ATCGGGGCAG ATTAAGATAC TTTATGTCAA
51 ACTCGGGCAA CAGGTTAAAA AGGGCGATTT GATTGCGGAA ATCAATTCGA
101 CCTCGCAGAC CAATACGCTC AATACGGAAA AATCCAAGTT GGAAACGTAT
151 CAGGCGAAGC TGGTGTCGGC ACAGATTGCA TTGGGCAGCG CGGAGAAGAA
201 ATATAAGCGT CAGGCGGCGT TATGGAAGGA AAACGCGACT TCCAAAGAGG
251 ATTTGGAAAG CGCGCAGGAT GCGTTTGCCG CCGCCAAAGC CAATGTTGCC
301 GAGCTGAAGG CTTTAATCAG ACAGAGCAAA ATTTCCATCA ATACCGCCGA
351 GTCGGAATTG GGCTACACGC GCATTACCGC AACGATGGAC GGCACGGTGG
401 TGGCGATTCT CGTGGAAGAG GGGCAGACTG TGAACGCGGC GCAGTCTACG
451 CCGACGATTG TCCAATTGGC GAATCTGGAT ATGATGTTGA ACAAAATGCA
501 GATTGCCGAG GGCGATATTA CCAAGGTGAA GGCGGGGCAG GATATTTCGT
551 TTACGATTTT GTCCGAACCG GATACGCCGA TTAAGGCGAA GCTCGACAGC
601 GTCGACCCCG GGCTGACCAC GATGTCGTCG GGCGGTTACA ACAGCAGTAC
651 GGATACGGCT TCCAATGCGG TCTACTATTA TGCCCGTTCG TTTGTGCCGA
701 ATCCGGACGG CAAACTCGCC ACGGGGATGA CGACGCAGAA TACGGTTGAA
751 ATCGACGGCG TGAAAAATGT GCTGATTATT CCGTCGCTGA CCGTGAAAAA
801 TCGCGGCGGC AAGGCGTTTG TGCGCGTGTT GGGTGCGGAC GGCAAGGCGG
851 CGGAACGCGA AATCCGGACC GGTATGAGAG ACAGTATGAA TACCGAAGTA
901 AAAAGCGGGT TGAAAGAGGG GGACAAAGTG GTCATCTCCG AAATAACCGC
951 CGCCGAGCAA CAGGAAAGCG GCGAACGCGC CCTAGGCGGC CCGCCGCGCC
1001 GATAA
This corresponds to the amino acid sequence <SEQ ID 768; ORF85-1>:
1 ..VSVGAQASGQ IKILYVKLGQ QVKKGDLIAE INSTSQTNTL NTEKSKLETY
51 QAKLVSAQIA LGSAEKKYKR QAALWKENAT SKEDLESAQD AFAAAKANVA
101 ELKALIRQSK ISINTAESEL GYTRITATMD GTVVAILVEE GQTVNAAQST
151 PTIVQLANLD MMLNKMQIAE GDITKVKAGQ DISFTILSEP DTPIKAKLDS
201 VDPGLTTMSS GGYNSSTDTA SNAVYYYARS FVPNPDGKLA TGMTTQNTVE
251 IDGVKNVLII PSLTVKNRGG KAFVRVLGAD GKAAEREIRT GMRDSMNTEV
301 KSGLKEGDKV VISEITAAEQ QESGERALGG PPRR*
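Computer analysis of this amino acid sequence gave the following results:
Homology with a predicted ORF from N. meningitidis (strain A)
ORF85 shows 87.8% identity over a 41 aa overlap and 99.3% identity over a 153 aa overlap with an ORF (ORF85a) from strain A of N. meningitidis: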
10 20 30 40
orf85.pep MAKMMKWAAVAAVAAAAVWGGWS-LKPEPHVLDITETVRRG
||||||||||||||||||||||| |||||:: ||||||||
orf85a MAKMMKWAAVAAVAAAAVWGGWSYLKPEPQAAYITETVRRGDISRTVSATGEISPSNLVS
10 20 30 40 50 60
80 90 100
orf85.pep ..............................ISFTILSEPDTPIKAKLDSVDPGLTTMSSG
||||||||||||||||||||||||||||||
orf85a TIVQLANLDMMLNKMQIAEGDITKVKAGQDISFTILSEPDTPIKAKLDSVDPGLTTMSSG
210 220 230 240 250 260
110 120 130 140 150 160
orf85.pep GYNSSTDTASNAVYYYARSFVPNPDGKLATGMTTQNTVEIDGVKNVLIIPSLTVKNRGGK
|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||:
orf85a GYNSSTDTASNAVYYYARSFVPNPDGKLATGMTTQNTVEIDGVKNVLIIPSLTVKNRGGR
270 280 290 300 310 320
170 180 190 200 210 220
orf85.pep AFVRVLGADGKAAEREIRTGMRDSMNTEVKSGLKEGDKVVISEITAAEQQESGERALGGP
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf85a AFVRVLGADGKAAEREIRTGMRDSMNTEVKSGLKEGDKVVISEITAAEQQESGERALGGP
330 340 350 360 370 380
230
orf85.pep PRRX
||||
orf85a PRRX
390
The complete length ORF85a nucleotide sequence <SEQ ID 769> is:
1 ATGGCAAAAA TGATGAAATG GGCGGCTGTT GCGGCGGTCG CGGCGGCAGC
51 GGTTTGGGGC GGATGGTCTT ATCTGAAGCC CGAGCCGCAG GCTGCTTATA
101 TTACGGAAAC GGTCAGGCGC GGCGACATCA GCCGGACGGT TTCTGCAACA
151 GGGGAGATTT CGCCGTCCAA CCTGGTATCG GTCGGCGCGC AGGCATCGGG
201 GCAGATTAAG AAACTTTATG TCAAACTCGG GCAACAGGTT AAAAAGGGCG
251 ATTTGATTGC GGAAATCAAT TCGACCTCGC AGACCAATAC GCTCAATACG
301 GAAAAATCCA AATTGGAAAC GTATCAGGCG AAGCTGGTGT CGGCACAGAT
351 TGCATTGGGC AGCGCGGAGA AGAAATATAA GCGTCAGGCG GCGTTGTGGA
401 AGGATGATGC GACCGCTAAA GAAGATTTGG AAAGCGCACA GGATGCGCTT
451 GCCGCCGCCA AAGCCAATGT TGCCGAGCTG AAGGCTCTAA TCAGACAGAG
501 CAAAATTTCC ATCAATACCG CCGAGTCGGA ATTGGGCTAC ACGCGCATTA
551 CCGCAACGAT GGACGGCACG GTGGTGGCGA TTCTCGTGGA AGAGGGGCAG
601 ACTGTGAACG CGGCGCAGTC TACGCCGACG ATTGTCCAAT TGGCGAATCT
651 GGATATGATG TTGAACAAAA TGCAGATTGC CGAGGGCGAT ATTACCAAGG
701 TGAAGGCGGG GCAGGATATT TCGTTTACGA TTTTGTCCGA ACCGGATACG
751 CCGATTAAGG CGAAGCTCGA CAGCGTCGAC CCCGGGCTGA CCACGATGTC
801 GTCGGGCGGC TACAACAGCA GTACGGATAC GGCTTCCAAT GCGGTCTACT
851 ATTATGCCCG TTCGTTTGTG CCGAATCCGG ACGGCAAACT CGCCACGGGG
901 ATGACGACGC AGAATACGGT TGAAATCGAC GGTGTGAAAA ATGTGCTGAT
951 TATTCCGTCG CTGACCGTGA AAAATCGCGG CGGCAGGGCG TTTGTGCGCG
1001 TGTTGGGTGC AGACGGCAAG GCGGCGGAAC GCGAAATCCG GACCGGTATG
1051 AGAGACAGTA TGAATACCGA AGTAAAAAGC GGGTTGAAAG AGGGGGACAA
1101 AGTGGTCATC TCCGAAATAA CCGCCGCCGA GCAGCAGGAA AGCGGCGAAC
1151 GCGCCCTAGG CGGCCCGCCG CGCCGATAA
This encodes a protein having the amino acid sequence <SEQ ID 770>:
1 MAKMMKWAAV AAVAAAAVWG GWSYLKPEPQ AAYITETVRR GDISRTVSAT
51 GEISPSNLVS VGAQASGQIK KLYVKLGQQV KKGDLIAEIN STSQTNTLNT
101 EKSKLETYQA KLVSAQIALG SAEKKYKRQA ALWKDDATAK EDLESAQDAL
151 AAAKANVAEL KALIRQSKIS INTAESELGY TRITATMDGT VVAILVEEGQ
201 TVNAAQSTPT IVQLANLDMM LNKMQIAEGD ITKVKAGQDI SFTILSEPDT
251 PIKAKLDSVD PGLTTMSSGG YNSSTDTASN AVYYYARSFV PNPDGKLATG
301 MTTQNTVEID GVKNVLIIPS LTVKNRGGRA FVRVLGADGK AAEREIRTGM
351 RDSMNTEVKS GLKEGDKVVI SEITAAEQQE SGERALGGPP RR*
ORF85a and ORF85-1 show 98.2% identity over a 334 aa overlap:
30 40 50 60 70 80
orf85a.pep PQAAYITETVRRGDISRTVSATGEISPSNLVSVGAQASGQIKKLYVKLGQQVKKGDLIAE
|||||||||||| |||||||||||||||||
orf85-1 VSVGAQASGQIKILYVKLGQQVKKGDLIAE
10 20 30
90 100 110 120 130 140
orf85a.pep INSTSQTNTLNTEKSKLETYQAKLVSAQIALGSAEKKYKRQAALWKDDATAKEDLESAQD
||||||||||||||||||||||||||||||||||||||||||||||::||:|||||||||
orf85-1 INSTSQTNTLNTEKSKLETYQAKLVSAQIALGSAEKKYKRQAALWKENATSKEDLESAQD
40 50 60 70 80 90
150 160 170 180 190 200
orf85a.pep ALAAAKANVAELKALIRQSKISINTAESELGYTRITATMDGTVVAILVEEGQTVNAAQST
|:||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf85-1 AFAAAKANVAELKALIRQSKISINTAESELGYTRITATMDGTVVAILVEEGQTVNAAQST
100 110 120 130 140 150
210 220 230 240 250 260
orf85a.pep PTIVQLANLDMMLNKMQIAEGDITKVKAGQDISFTILSEPDTPIKAKLDSVDPGLTTMSS
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf85-1 PTIVQLANLDMMLNKMQIAEGDITKVKAGQDISFTILSEPDTPIKAKLDSVDPGLTTMSS
160 170 180 190 200 210
270 280 290 300 310 320
orf85a.pep GGYNSSTDTASNAVYYYARSFVPNPDGKLATGMTTQNTVEIDGVKNVLIIPSLTVKNRGG
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf85-1 GGYNSSTDTASNAVYYYARSFVPNPDGKLATGMTTQNTVEIDGVKNVLIIPSLTVKNRGG
220 230 240 250 260 270
330 340 350 360 370 380
orf85a.pep RAFVRVLGADGKAAEREIRTGMRDSMNTEVKSGLKEGDKVVISEITAAEQQESGERALGG
:|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf85-1 KAFVRVLGADGKAAEREIRTGMRDSMNTEVKSGLKEGDKVVISEITAAEQQESGERALGG
280 290 300 310 320 330
390
orf85a.pep PPRRX
|||||
orf85-1 PPRRX
Figure 19D shows plots of hydrophilicity, antigenic index and AMPHI regions for ORF85a.
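A hydrophilicity plot of this kind is typically a sliding-window average of per-residue values. The text does not say which scale or window was used for Figure 19D, so the sketch below is only an illustration: it assumes the Kyte-Doolittle hydropathy scale with the sign flipped (so that higher means more hydrophilic) and a 7-residue window, and the function name `hydrophilicity_profile` is hypothetical.

```python
# Kyte-Doolittle hydropathy values (positive = hydrophobic).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5, "Q": -3.5, "E": -3.5,
    "G": -0.4, "H": -3.2, "I": 4.5, "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8,
    "P": -1.6, "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydrophilicity_profile(protein: str, window: int = 7) -> list[float]:
    """Sliding-window mean of negated hydropathy (higher = more hydrophilic)."""
    values = [-KYTE_DOOLITTLE[residue] for residue in protein]
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]


n_terminus = "MAKMMKWAAVAAVAAAAVWGGWSYLKPEPQ"  # first 30 residues of ORF85a
profile = hydrophilicity_profile(n_terminus)
print(len(profile), min(profile), max(profile))
```

Each value in the returned list corresponds to one window position, so a sequence of length n yields n - window + 1 points to plot.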
Homology with a predicted ORF from N. gonorrhoeae
ORF85 shows a high degree of identity with a predicted ORF (ORF85ng) from N. gonorrhoeae:
ORF85 1 MAKMMKWAAVAAVAAAAVWGGWS.LKPEPHVLDITETVRRG......... 40
||||||||||||||||||||||| |||||:: |||:||||
ORF85ng 1 MAKMMKWAAVAAVAAAAVWGGWSYLKPEPQAAYITEAVRRGDISRTVSAT 50
. . . . .
ORF85 .......................................ISFTILSEPDT 250
|||||||||||
ORF85ng 201 TVNAAQSTPTIVQLANLDMMLNKMQIAEGDITKVKAGQDISFTILSEPDT 250
. . . . .
ORF85 251 PIKAKLDSVDPGLTTMSSGGYNSSTDTASNAVYYYARSFVPNPDGKLATG 300
||||||||||||||||||||||||||||||||||||||||||||||||||
ORF85ng 251 PIKAKLDSVDPGLTTMSSGGYNSSTDTASNAVYYYARSFVPNPDGKLATG 300
. . . . .
ORF85 301 MTTQNTVEIDGVKNVLIIPSLTVKNRGGKAFVRVLGADGKAAEREIRTGM 350
||||||||||||||||:|||||||||||||||||||||||| ||||||||
ORF85ng 301 MTTQNTVEIDGVKNVLLIPSLTVKNRGGKAFVRVLGADGKAVEREIRTGM 350
. . . . .
ORF85 351 RDSMNTEVKSGLKEGDKVVISEITAAEQQESGERALGGPPRR 393
:|||||||||||||||||||||||||||||||||||||||||
ORF85ng 351 KDSMNTEVKSGLKEGDKVVISEITAAEQQESGERALGGPPRR 393
The complete length ORF85ng nucleotide sequence <SEQ ID 771> is:
1 ATGGCAAAAA TGATGAAATG GGCGGCTGTT GCGGCGGTCG CGGCGGCaac
51 GGTTTGGGGC GGATGGTCTT ATCTGAAGCC CGAACCGCAG GCTGCTTATA
101 TTACGGAaac ggTCAGGCGC GGCGATATCA GCCGGACGGT TTCCGCGACG
151 GgcgAGATTT CGCCGTCCAA CCTGGTATCG GTCGGCGCGC AGGCTTCGGG
201 GCAGATTAAA AAGCTTTATG TCAAACTCGG GCAACAGGTC AAAAAGGGCG
251 ATTTGATTGC GGAAATCAAT TCGACCACGC AGACCAACAC GATCGATATG
301 GAAAAATCCA AATTGGAAAC GTATCAGGCG AAGCTGGTGT CGGCACAGAT
351 TGCATTGGGC AGCGCGGAGA AGAAATATAA GCGTCAGGCG GCGTTGTGGA
401 AGGATGATGC GACCTCTAAA GAAGATTTGG AAAGCGCGCA GGATGCGCTT
451 GCCGCCGCCA AAGCCAATGT TGCCGAGTTG AAGGCTTTAA TCAGACAGAG
501 CAAAATTTCC ATCAATACCG CCGAGTCGGA TTTGGGCTAC ACGCGCATTA
551 CCGCGACGAT GGACGGCACG GTGGTGGCGA TTCCCGTGGA AGAGGGGCAG
601 ACTGTGAACG CGGCGCAGTC TACGCCGACG ATTGTCCAAT TGGCGAATCT
651 GGATATGATG TTGAACAAAA TGCAGATTGC CGAGGGCGAT ATTACCAAGG
701 TGAAGGCGGG GCAGGATATT TCGTTTACGA TTTTGTCCGA ACCGGATACG
751 CCGATTAAGG CGAAGCTCGA CAGCGTCGAC CCCGGGCTGA CCACGATGTC
801 GTCGGGCGGC TACAACAGCA GTACGGATAC GGCTTCCAAT GCGGTCTATT
851 ATTATGCCCG TTCGTTTGTG CCGAATCCGG ACGGCAAACT CGCCACGGGG
901 ATGACGACGC AGAATACGGT TGAAATCGAC GGTGTGAAAA ATGTGTTGCT
951 TATTCCGTCG CTGACCGTGA AAAATCGCGG CGGCAAGGCG TTCGTACGCG
1001 TGTTGGGTGC GGACGGCAAG GCAGTGGAAC GCGAAATCCG GACCGGTATG
1051 AAAGACAGTA TGAATACCGA AGTGAAAAGC GGGTTGAAAG AGGGGGACAA
1101 AGTGGTCATC TCCGAAATAA CCGCCGCCGA GCAGCAGGAA AGCGGCGAAC
1151 GCGCCCTAGG CGGCCCGCCG CGCCGATAA
This encodes a protein having the amino acid sequence <SEQ ID 772>:
1 MAKMMKWAAV AAVAAAAVWG GWSYLKPEPQ AAYITEAVRR GDISRTVSAT
51 GEISPSNLVS VGAQASGQIK KLYVKLGQQV KKGDLIAEIN STTQTNTIDM
101 EKSKLETYQA KLVSAQIALG SAEKKYKRQA ALWKDDATSK EDLESAQDAL
151 AAAKANVAEL KALIRQSKIS INTAESDLGY TRITATMDGT VVAIPVEEGQ
201 TVNAAQSTPT IVQLANLDMM LNKMQIAEGD ITKVKAGQDI SFTILSEPDT
251 PIKAKLDSVD PGLTTMSSGG YNSSTDTASN AVYYYARSFV PNPDGKLATG
301 MTTQNTVEID GVKNVLLIPS LTVKNRGGKA FVRVLGADGK AVEREIRTGM
351 KDSMNTEVKS GLKEGDKVVI SEITAAEQQE SGERALGGPP RR*
ORF85ng and ORF85-1 show 96.1% identity over a 334 aa overlap:
30 40 50 60 70 80
orf85ng PQAAYITETVRRGDISRTVSATGEISPSNLVSVGAQASGQIKKLYVKLGQQVKKGDLIAE
|||||||||||| |||||||||||||||||
orf85-1 VSVGAQASGQIKILYVKLGQQVKKGDLIAE
10 20 30
90 100 110 120 130 140
orf85ng INSTTQTNTIDMEKSKLETYQAKLVSAQIALGSAEKKYKRQAALWKDDATSKEDLESAQD
||||:||||:: ||||||||||||||||||||||||||||||||||::||||||||||||
orf85-1 INSTSQTNTLNTEKSKLETYQAKLVSAQIALGSAEKKYKRQAALWKENATSKEDLESAQD
40 50 60 70 80 90
150 160 170 180 190 200
orf85ng ALAAAKANVAELKALIRQSKISINTAESDLGYTRITATMDGTVVAIPVEEGQTVNAAQST
|:||||||||||||||||||||||||||:||||||||||||||||| |||||||||||||
orf85-1 AFAAAKANVAELKALIRQSKISINTAESELGYTRITATMDGTVVAILVEEGQTVNAAQST
100 110 120 130 140 150
210 220 230 240 250 260
orf85ng PTIVQLANLDMMLNKMQIAEGDITKVKAGQDISFTILSEPDTPIKAKLDSVDPGLTTMSS
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf85-1 PTIVQLANLDMMLNKMQIAEGDITKVKAGQDISFTILSEPDTPIKAKLDSVDPGLTTMSS
160 170 180 190 200 210
270 280 290 300 310 320
orf85ng GGYNSSTDTASNAVYYYARSFVPNPDGKLATGMTTQNTVEIDGVKNVLLIPSLTVKNRGG
||||||||||||||||||||||||||||||||||||||||||||||||:|||||||||||
orf85-1 GGYNSSTDTASNAVYYYARSFVPNPDGKLATGMTTQNTVEIDGVKNVLIIPSLTVKNRGG
220 230 240 250 260 270
330 340 350 360 370 380
orf85ng KAFVRVLGADGKAVEREIRTGMKDSMNTEVKSGLKEGDKVVISEITAAEQQESGERALGG
|||||||||||||:||||||||:|||||||||||||||||||||||||||||||||||||
orf85-1 KAFVRVLGADGKAAEREIRTGMRDSMNTEVKSGLKEGDKVVISEITAAEQQESGERALGG
280 290 300 310 320 330
390
orf85ng PPRRX
|||||
orf85-1 PPRRX
In addition, ORF85ng shows significant homology with a membrane fusion protein from E. coli:
gi|1787104 (AE000189) o380; 27% identical (27 gaps) to 332 residues of membrane fusion protein precursor, MTRC_NEIGO SW:P43505 (412 aa) [Escherichia coli] Length = 380
Score = 193 bits (485), Expect = 2e-48
Identities = 120/345 (34%), Positives = 182/345 (51%), Gaps = 13/345 (3%)
Query: 29 PQAAYITETVRRGDISRTVSATGEISPSNLVSVGAQASGQIKKLYVKLGQQVKKGDLIAE 88
P Y T VR GD+ ++V ATG++ V VGAQ SGQ+K L V +G +VKK L+
Sbjct: 41 PVPTYQTLIVRPGDLQQSVLATGKLDALRKVDVGAQVSGQLKTLSVAIGDKVKKDQLLGV 100
Query: 89 INSTTQTNTIDMEKSKLETYQAKLVSAQIALGSAEKKYKRQAALWKDDATSKEXXXXXXX 148
I+ N I ++ L +A+ A+ L A Y RQ L + A S++
Sbjct: 101 IDPEQAENQIKEVEATLMELRAQRQQAEAELKLARVTYSRQQRLAQTKAVSQQDLDTAAT 160
Query: 149 XXXXXXXXXXXXXXXIRQSKISINTAESDLGYTRITATMDGTVVAIPVEEGQTVNAAQST 208
I++++ S++TA+++L YTRI A M G V I +GQTV AAQ
Sbjct: 161 EMAVKQAQIGTIDAQIKRNQASLDTAKTNLDYTRIVAPMAGEVTQITTLQGQTVIAAQQA 220
Query: 209 PTIVQLANLDMMLNKMQIAEGDITKVKAGQDISFTILSEPDTPIKAKLDSVDPGLTTMSS 268
P I+ LA++ ML K Q++E D+ +K GQ FT+L +P T + ++ V P
Sbjct: 221 PNILTLADMSAMLVKAQVSEADVIHLKPGQKAWFTVLGDPLTRYEGQIKDVLP------- 273
Query: 269 GGYNSSTDTASNAVYYYARSFVPNPDGKLATGMTTQNTVEIDGVKNVLLIPSLTVKNRGG 328
+ + ++A++YYAR VPNP+G L MT Q +++ VKNVL IP + + G
Sbjct: 274 -----TPEKVNDAIFYYARFEVPNPNGLLRLDMTAQVHIQLTDVKNVLTIPLSALGDPVG 328
Query: 329 KAFVRV-LGADGKAVEREIRTGMKDSMNTEVKSGLKEGDKVVISE 372
+V L +G+ ERE+ G ++ + E+ GL+ GD+VVI E
Sbjct: 329 DNRYKVKLLRNGETREREVTIGARNDTDVEIVKGLEAGDEVVIGE 373
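Based on this analysis, it is predicted that this protein from N. meningitidis and N. gonorrhoeae, and its epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies.
ORF85-1 (40.4 kDa) was cloned into the pGex vector and expressed in E. coli, as described above. The products of protein expression and purification were analyzed by SDS-PAGE. Figure 19A shows the results of affinity purification of the GST-fusion protein. Purified GST-fusion protein was used to immunise mice, and the mouse sera were used for Western blot (Figure 19B), FACS analysis (Figure 19C) and ELISA (positive result). These experiments confirm that ORF85-1 is a surface-exposed protein and a useful immunogen.
Example 92
The following partial DNA sequence was identified in N. meningitidis <SEQ ID 773>: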
1 ..ATTCCCGCCA CGATGACATT TGAACGCAGC GGCAATGCTT ACAAAATCGT
51 TTCGACGATT AAAGTGCCGC TATACAATAT CCGTTTCGAG TCCGGCGGTA
101 CGGTTGTCGG CAATACCCTG CACCCTACCT ACTATAGAGA CATACGCAGG
151 GGCAAACTGT ATGCGGAAgc CAAATTCGCC GACgGcAGCG TAACTTACGG
201 CAAAGCGGGC GAGAGCAAAA CCGAGCAAAG CCCCAAGGCT ATGGATTTGT
251 TCACGCTTGC CTGGCAGTTG GCGGCAAATG ACGCGAAACT CCCCCCGGGG
301 CTGAAAATCA CCAACGGCAA AAAACTTTAT TCCGTCGGCG GTTTGAATAA
351 GGCGGGTACA GGAAAATACA GCATAGGCGG CGTGGAAACC GAAGTCGTCA
401 AATATCGGGT GCGGCGCGGC GACGATGCGG TAATGTATTT cTTCGCACCG
451 TCCCTGAACA ATATTCCGGC ACAAATCGGC TATACCGACG ACGGCAAAAC
501 CTATACGCTG AAACTCAAAT CGGTGCAGAT CAACGGCCAG GCAGCCAAAC
551 CGTAA
This corresponds to the amino acid sequence <SEQ ID 774; ORF120>:
1 ..IPATMTFERS GNAYKIVSTI KVPLYNIRFE SGGTVVGNTL HPTYYRDIRR
51 GKLYAEAKFA DGSVTYGKAG ESKTEQSPKA MDLFTLAWQL AANDAKLPPG
101 LKITNGKKLY SVGGLNKAGT GKYSIGGVET EVVKYRVRRG DDAVMYFFAP
151 SLNNIPAQIG YTDDGKTYTL KLKSVQINGQ AAKP*
Further work revealed the complete nucleotide sequence <SEQ ID 775>:
1 ATGATGAAGA CTTTTAAAAA TATATTTTCC GCCGCCATTT TGTCCGCCGC
51 CCTGCCGTGC GCGTATGCGG CAGGGCTGCC CCAATCCGCC GTGCTGCACT
101 ATTCCGGCAG CTACGGCATT CCCGCCACGA TGACATTTGA ACGCAGCGGC
151 AATGCTTACA AAATCGTTTC GACGATTAAA GTGCCGCTAT ACAATATCCG
201 TTTCGAGTCC GGCGGTACGG TTGTCGGCAA TACCCTGCAC CCTACCTACT
251 ATAGAGACAT ACGCAGGGGC AAACTGTATG CGGAAGCCAA ATTCGCCGAC
301 GGCAGCGTAA CTTACGGCAA AGCGGGCGAG AGCAAAACCG AGCAAAGCCC
351 CAAGGCTATG GATTTGTTCA CGCTTGCCTG GCAGTTGGCG GCAAATGACG
401 CGAAACTCCC CCCGGGGCTG AAAATCACCA ACGGCAAAAA ACTTTATTCC
451 GTCGGCGGTT TGAATAAGGC GGGTACAGGA AAATACAGCA TAGGCGGCGT
501 GGAAACCGAA GTCGTCAAAT ATCGGGTGCG GCGCGGCGAC GATGCGGTAA
551 TGTATTTCTT CGCACCGTCC CTGAACAATA TTCCGGCACA AATCGGCTAT
601 ACCGACGACG GCAAAACCTA TACGCTGAAA CTCAAATCGG TGCAGATCAA
651 CGGCCAGGCA GCCAAACCGT AA
This corresponds to the amino acid sequence <SEQ ID 776; ORF120-1>:
1 MMKTFKNIFS AAILSAALPC AYAAGLPQSA VLHYSGSYGI PATMTFERSG
51 NAYKIVSTIK VPLYNIRFES GGTVVGNTLH PTYYRDIRRG KLYAEAKFAD
101 GSVTYGKAGE SKTEQSPKAM DLFTLAWQLA ANDAKLPPGL KITNGKKLYS
151 VGGLNKAGTG KYSIGGVETE VVKYRVRRGD DAVMYFFAPS LNNIPAQIGY
201 TDDGKTYTLK LKSVQINGQA AKP*
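The partial sequence ORF120 is an internal fragment of the complete ORF120-1, and its position can be found with a plain substring search. This is a minimal sketch; the helper name `locate_fragment` is hypothetical, and only the first 50 residues of ORF120-1 and the first 11 residues of ORF120 are used.

```python
from typing import Optional


def locate_fragment(full: str, fragment: str) -> Optional[int]:
    """Return the 1-based position of `fragment` in `full`, or None if absent."""
    index = full.find(fragment)  # 0-based, -1 when not found
    return None if index < 0 else index + 1


orf120_1_start = "MMKTFKNIFSAAILSAALPCAYAAGLPQSAVLHYSGSYGIPATMTFERSG"  # residues 1-50
orf120_start = "IPATMTFERSG"  # first residues of the partial ORF120

print(locate_fragment(orf120_1_start, orf120_start))  # 40
```

A start at residue 40 of a 223 aa protein leaves 184 residues, which agrees with the 184 aa overlaps reported for ORF120 in this example.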
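Computer analysis of this amino acid sequence gave the following results:
Homology with a predicted ORF from N. meningitidis (strain A)
ORF120 shows 92.4% identity over a 184 aa overlap with an ORF (ORF120a) from strain A of N. meningitidis: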
10 20 30
orf120.pep IPATMTFERSGNAYKIVSTIKVPLYNIRFE
|||| : ||||||||||||||||||||
orf120a SAAILSAALPCAYAAGLPXSAVLHYSGSYGIPATXXXXXXXNAXKIVSTIKVPLYNIRFE
10 20 30 40 50 60
40 50 60 70 80 90
orf120.pep SGGTVVGNTLHPTYYRDIRRGKLYAEAKFADGSVTYGKAGESKTEQSPKAMDLFTLAWQL
||||||||||||||||||||||||||||||||||||||| : |||||||||||||||
orf120a SGGTVVGNTLHPTYYRDIRRGKLYAEAKFADGSVTYGKAXXXXXXQSPKAMDLFTLAWQL
70 80 90 100 110 120
100 110 120 130 140 150
orf120.pep AANDAKLPPGLKITNGKKLYSVGGLNKAGTGKYSIGGVETEVVKYRVRRGDDAVMYFFAP
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf120a AANDAKLPPGLKITNGKKLYSVGGLNKAGTGKYSIGGVETEVVKYRVRRGDDAVMYFFAP
130 140 150 160 170 180
160 170 180
orf120.pep SLNNIPAQIGYTDDGKTYTLKLKSVQINGQAAKPX
|||||||||||||||||||||||||||||||||||
orf120a SLNNIPAQIGYTDDGKTYTLKLKSVQINGQAAKPX
190 200 210 220
The complete length ORF120a nucleotide sequence <SEQ ID 777> is:
1 ATGATGAAGA CTTTTAAAAA TATATTTTCC GCCGCCATTT TGTCCGCCGC
51 CCTGCCGTGC GCGTATGCGG CAGGGCTGCC CNAATCCGCC GTGCTGCACT
101 ATTCCGGCAG CTACGGCATT CCCGCCACNA NNANNTNNGN ACNNNGNGNC
151 AATGCTTNCA AAATCGTTTC GACGATTAAA GTGCCGCTAT ACAATATCCG
201 TTTCGAGTCC GGCGGTACGG TTGTCGGCAA TACCCTGCAC CCTACCTACT
251 ATAGAGACAT ACGCAGGGGC AAACTGTATG CGGAAGCCAA ATTCGCCGAC
301 GGCAGCGTAA CCTACGGCAA AGCGGNNNNN ANCNNNNNNG NGCAAAGCCC
351 CAAGGCTATG GATTTGTTCA CGCTTGCNTG GCAGTTGGCG GCAAATGACG
401 CGAAACTCCC CCCGGGGCTG AAAATCACCA ACGGCAAAAA ACTTTATTCC
451 GTCGGCGGTT TGAATAAGGC GGGTACAGGA AAATACAGCA TAGGCGGCGT
501 GGAAACCGAA GTCGTCAAAT ATCGGGTGCG GCGCGGCGAC GATGCGGTAA
551 TGTATTTCTT CGCACCGTCC CTGAACAATA TTCCGGCACA AATCGGCTAT
601 ACCGACGACG GCAAAACCTA TACGCTGAAA CTCAAATCGG TGCAGATCAA
651 CGGCCAGGCA GCCAAACCGT AA
This encodes a protein having the amino acid sequence <SEQ ID 778>:
1 MMKTFKNIFS AAILSAALPC AYAAGLPXSA VLHYSGSYGI PATXXXXXXX
51 NAXKIVSTIK VPLYNIRFES GGTVVGNTLH PTYYRDIRRG KLYAEAKFAD
101 GSVTYGKAXX XXXXQSPKAM DLFTLAWQLA ANDAKLPPGL KITNGKKLYS
151 VGGLNKAGTG KYSIGGVETE VVKYRVRRGD DAVMYFFAPS LNNIPAQIGY
201 TDDGKTYTLK LKSVQINGQA AKP*
ORF120a and ORF120-1 show 93.3% identity over a 223 aa overlap:
10 20 30 40 50 60
orf120a.pep MMKTFKNIFSAAILSAALPCAYAAGLPXSAVLHYSGSYGIPATXXXXXXXNAXKIVSTIK
||||||||||||||||||||||||||| ||||||||||||||| : ||| |||||||
orf120-1 MMKTFKNIFSAAILSAALPCAYAAGLPQSAVLHYSGSYGIPATMTFERSGNAYKIVSTIK
10 20 30 40 50 60
70 80 90 100 110 120
orf120a.pep VPLYNIRFESGGTVVGNTLHPTYYRDIRRGKLYAEAKFADGSVTYGKAXXXXXXQSPKAM
|||||||||||||||||||||||||||||||||||||||||||||||| : ||||||
orf120-1 VPLYNIRFESGGTVVGNTLHPTYYRDIRRGKLYAEAKFADGSVTYGKAGESKTEQSPKAM
70 80 90 100 110 120
130 140 150 160 170 180
orf120a.pep DLFTLAWQLAANDAKLPPGLKITNGKKLYSVGGLNKAGTGKYSIGGVETEVVKYRVRRGD
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf120-1 DLFTLAWQLAANDAKLPPGLKITNGKKLYSVGGLNKAGTGKYSIGGVETEVVKYRVRRGD
130 140 150 160 170 180
190 200 210 220
orf120a.pep DAVMYFFAPSLNNIPAQIGYTDDGKTYTLKLKSVQINGQAAKPX
||||||||||||||||||||||||||||||||||||||||||||
orf120-1 DAVMYFFAPSLNNIPAQIGYTDDGKTYTLKLKSVQINGQAAKPX
190 200 210 220
Homology with a predicted ORF from N. gonorrhoeae
ORF120 shows 97.8% identity over a 184 aa overlap with a predicted ORF (ORF120ng) from N. gonorrhoeae:
orf120.pep IPATMTFERSGNAYKIVSTIKVPLYNIRFE 30
||||||||||||||||||||||||||||||
orf120ng SAAILSAALPCAYAARLPQSAVLHYSGSYGIPATMTFERSGNAYKIVSTIKVPLYNIRFE 69
orf120.pep SGGTVVGNTLHPTYYRDIRRGKLYAEAKFADGSVTYGKAGESKTEQSPKAMDLFTLAWQL 90
||||||||||||:||:|||||||||||||||||||||||||||||||||||||||||||
orf120ng SGGTVVGNTLHPAYYKDIRRGKLYAEAKFADGSVTYGKAGESKTEQSPKAMDLFTLAWQL 129
orf120.pep AANDAKLPPGLKITNGKKLYSVGGLNKAGTGKYSIGGVETEVVKYRVRRGDDAVMYFFAP 150
||||||||||||||||||||||||||||||||||||||||||||||||||||:| |||||
orf120ng AANDAKLPPGLKITNGKKLYSVGGLNKAGTGKYSIGGVETEVVKYRVRRGDDTVTYFFAP 189
orf120.pep SLNNIPAQIGYTDDGKTYTLKLKSVQINGQAAKP 184
||||||||||||||||||||||||||||||||||
orf120ng SLNNIPAQIGYTDDGKTYTLKLKSVQINGQAAKP 223
The complete length ORF120ng nucleotide sequence <SEQ ID 779> is:
1 ATGATGAAGA CTTTTAAAAA TATATTTTCC GCCGCCATTT TGTCCGCCGC
51 CCTGCCGTGC GCGTATGCGG CAAGGCTACC CCAATCCGCC GTGCTGCACT
101 ATTCCGGCAG CTACGGCATT CCCGCCACGA TGACATTTGA ACGCAGCGGC
151 AATGCTTACA AAATCGTTTC GACGATTAAA GTGCCGCTAT ACAATATCCG
201 TTTCGAATCC GGCGGTACGG TTGTCGGCAA TACCCTGCAC CCTGCCTACT
251 ATAAAGACAT ACGCAGGGGC AAACTGTATG CGGAAGCCAA ATTCGCCGAC
301 GGCAGCGTAA CCTACGGCAA AGCGGGCGAG AGCAAAACCG AGCAAAGCCC
351 CAAGGCTATG GATTTGTTCA CGCTTGCCTG GCAGTTGGCG GCAAATGACG
401 CGAAACTCCC CCCGGGTCTG AAAATCACCA ACGGCAAAAA ACTTTATTCC
451 GTCGGCGGCC TGAATAAGGC GGGTACGGGA AAATACAGCA TaggCGGCGT
501 GGAAACCGAA GTCGTCAAAT ATCGGGTGCG GCGCGGCGAC GATACGGTAA
551 CGTATTTCTT CGCACCGTCC CTGAACAATA TTCCGGCACA AATCGGCTAT
601 ACCGACGACG GCAAAACCTA TACGCTGAAG CTCAAATCGG TGCAGATCAA
651 CGGACAGGCC GCCAAACCGT AA
This encodes a protein having the amino acid sequence <SEQ ID 780>:
1 MMKTFKNIFS AAILSAALPC AYAARLPQSA VLHYSGSYGI PATMTFERSG
51 NAYKIVSTIK VPLYNIRFES GGTVVGNTLH PAYYKDIRRG KLYAEAKFAD
101 GSVTYGKAGE SKTEQSPKAM DLFTLAWQLA ANDAKLPPGL KITNGKKLYS
151 VGGLNKAGTG KYSIGGVETE VVKYRVRRGD DTVTYFFAPS LNNIPAQIGY
201 TDDGKTYTLK LKSVQINGQA AKP*
ORF120ng shows 97.8% identity over a 223 aa overlap with ORF120-1:
10 20 30 40 50 60
orf120-1.pep MMKTFKNIFSAAILSAALPCAYAAGLPQSAVLHYSGSYGIPATMTFERSGNAYKIVSTIK
|||||||||||||||||||||||| |||||||||||||||||||||||||||||||||||
orf120ng MMKTFKNIFSAAILSAALPCAYAARLPQSAVLHYSGSYGIPATMTFERSGNAYKIVSTIK
10 20 30 40 50 60
70 80 90 100 110 120
orf120-1.pep VPLYNIRFESGGTVVGNTLHPTYYRDIRRGKLYAEAKFADGSVTYGKAGESKTEQSPKAM
|||||||||||||||||||||:||:|||||||||||||||||||||||||||||||||||
orf120ng VPLYNIRFESGGTVVGNTLHPAYYKDIRRGKLYAEAKFADGSVTYGKAGESKTEQSPKAM
70 80 90 100 110 120
130 140 150 160 170 180
orf120-1.pep DLFTLAWQLAANDAKLPPGLKITNGKKLYSVGGLNKAGTGKYSIGGVETEVVKYRVRRGD
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf120ng DLFTLAWQLAANDAKLPPGLKITNGKKLYSVGGLNKAGTGKYSIGGVETEVVKYRVRRGD
130 140 150 160 170 180
190 200 210 220
orf120-1.pep DAVMYFFAPSLNNIPAQIGYTDDGKTYTLKLKSVQINGQAAKPX
|:| ||||||||||||||||||||||||||||||||||||||||
orf120ng DTVTYFFAPSLNNIPAQIGYTDDGKTYTLKLKSVQINGQAAKPX
190 200 210 220
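This analysis, including the presence of a putative leader sequence in the gonococcal protein, suggests that this protein from N. meningitidis and N. gonorrhoeae, and its epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies.
Example 93
The following partial DNA sequence was identified in N. meningitidis <SEQ ID 781>: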
1 ATGTATCGGA GGAAAGGGCG GGGCATCAAG CCGTGGATGG GTGCCGGTGC
51 .GCGTTTGCC GCCTTGGTCT GGCTGGTTTT CGCGCTCGGC GATACTTTGA
101 CTCCGTTTGC GGTTGCGGCG GTGCTGGCGT ATGTATTGGA CCCTTTGGTC
151 GAATGGTTGC AGAAAAAGGG TTTGAACCGT GCATCCGCTT CGATGTCTGT
201 GATGGTGTTT TCCTTGATTT TGTTGTTGGC ATTATTGTTG ATTATCGTCC
251 CTATGCTGGT CGGGCAGTTC AACAATTTGG CATCGCGCCT GCCCCAATTA
301 ATCGGTTTTA TGCAGAACAC GCTGCTGCCG TGGTTGAAAA ATACAATCGG
351 CGGATATGTG GAAATCGATC AGGCATCTAT TATTGCGTGG CTTCAGGCGC
401 ATACGGGAGA GTTGAGCAAC GCGCTTAAGG CGTGGTTTCC CGTTTTGATG
451 AGGCAGGGCG GCAATATT..
This corresponds to the amino acid sequence <SEQ ID 782; ORF121>:
1 MYRRKGRGIK PWMGAGXAFA ALVWLVFALG DTLTPFAVAA VLAYVLDPLV
51 EWLQKKGLNR ASASMSVMVF SLILLLALLL IIVPMLVGQF NNLASRLPQL
101 IGFMQNTLLP WLKNTIGGYV EIDQASIIAW LQAHTGELSN ALKAWFPVLM
151 RQGGNI..
Further work revealed the complete nucleotide sequence <SEQ ID 783>:
1 ATGTATCGGA GGAAAGGGCG GGGCATCAAG CCGTGGATGG GTGCCGGTGC
51 GGCGTTTGCC GCCTTGGTCT GGCTGGTTTT CGCGCTCGGC GATACTTTGA
101 CTCCGTTTGC GGTTGCGGCG GTGCTGGCGT ATGTATTGGA CCCTTTGGTC
151 GAATGGTTGC AGAAAAAGGG TTTGAACCGT GCATCCGCTT CGATGTCTGT
201 GATGGTGTTT TCCTTGATTT TGTTGTTGGC ATTATTGTTG ATTATCGTCC
251 CTATGCTGGT CGGGCAGTTC AACAATTTGG CATCGCGCCT GCCCCAATTA
301 ATCGGTTTTA TGCAGAACAC GCTGCTGCCG TGGTTGAAAA ATACAATCGG
351 CGGATATGTG GAAATCGATC AGGCATCTAT TATTGCGTGG CTTCAGGCGC
401 ATACGGGAGA GTTGAGCAAC GCGCTTAAGG CGTGGTTTCC CGTTTTGATG
451 AGGCAGGGCG GCAATATTGT CAGCAGTATC GGCAACCTGC TGCTGCTTCC
501 CTTGCTGCTT TACTATTTCC TGCTGGATTG GCAGCGGTGG TCGTGCGGCA
551 TTGCCAAACT GGTTCCGAgG CGTTTTGCCG GTGCTTATAC GCGCATTACA
601 GGCAATTTGA ACGAGGTATT GGGCGAATTT TTGCGCGGGC AGCTTCTGGT
651 AATGCTGATT ATGGGCTTGG TTTACGGTTT GGGATTGGTG CTGGTCGGGC
701 TGGATTCGGG GTTTGCCATC GGTATGCTTG CCGGTATTTT GGTGTTTGTC
751 CCTTATCTCG GGGCGTTTAC GGGATTGCTG CTTGCCACCG TCGCCGCCTT
801 GCTCCAGTTC GGTTCGTGGA ACGGCATCCT ATCGGTTTGG GCGGTTTTTG
851 CCGTAGGACA GTTTCTCGAA AGTTTTTTCA TTACGCCGAA AATCGTGGGA
901 GACCGTATCG GGCTGTCGCC GTTTTGGGTT ATCTTTTCGC TGATGGCGTT
951 CGGGCAGCTG ATGGGCTTTG TCGGAATGTT GGCGGGATTG CCTTTGGCCG
1001 CCGTAACCTT GGTCTTGCTT CGCGAGGGCG TGCAGAAATA TTTTGCCGGC
1051 AGTTTTTACC GGGGCAGGTA G
This corresponds to the amino acid sequence <SEQ ID 784; ORF121-1>:
1 MYRRKGRGIK PWMGAGAAFA ALVWLVFALG DTLTPFAVAA VLAYVLDPLV
51 EWLQKKGLNR ASASMSVMVF SLILLLALLL IIVPMLVGQF NNLASRLPQL
101 IGFMQNTLLP WLKNTIGGYV EIDQASIIAW LQAHTGELSN ALKAWFPVLM
151 RQGGNIVSSI GNLLLLPLLL YYFLLDWQRW SCGIAKLVPR RFAGAYTRIT
201 GNLNEVLGEF LRGQLLVMLI MGLVYGLGLV LVGLDSGFAI GMLAGILVFV
251 PYLGAFTGLL LATVAALLQF GSWNGILSVW AVFAVGQFLE SFFITPKIVG
301 DRIGLSPFWV IFSLMAFGQL MGFVGMLAGL PLAAVTLVLL REGVQKYFAG
351 SFYRGR*
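Computer analysis of this amino acid sequence gave the following results:
Homology with a predicted ORF from N. meningitidis (strain A)
ORF121 shows 98.7% identity over a 156 aa overlap with an ORF (ORF121a) from strain A of N. meningitidis: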
10 20 30 40 50 60
orf121.pep MYRRKGRGIKPWMGAGXAFAALVWLVFALGDTLTPFAVAAVLAYVLDPLVEWLQKKGLNR
||||||||||||| || |||||||||||||||||||||||||||||||||||||||||||
orf121a MYRRKGRGIKPWMDAGAAFAALVWLVFALGDTLTPFAVAAVLAYVLDPLVEWLQKKGLNR
10 20 30 40 50 60
70 80 90 100 110 120
orf121.pep ASASMSVMVFSLILLLALLLIIVPMLVGQFNNLASRLPQLIGFMQNTLLPWLKNTIGGYV
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf121a ASASMSVMVFSLILLLALLLIIVPMLVGQFNNLASRLPQLIGFMQNTLLPWLKNTIGGYV
70 80 90 100 110 120
130 140 150
orf121.pep EIDQASIIAWLQAHTGELSNALKAWFPVLMRQGGNI
||||||||||||||||||||||||||||||||||||
orf121a EIDQASIIAWLQAHTGELSNALKAWFPVLMRQGGNIVSSIGNLLLLPLLLYYFLLDWQRW
130 140 150 160 170 180
orf121a SCGIAKLVPRRFAGAYTRITGNLNEVLGEFLRGQLLVMLIMGLVYGLGLVLVGLDSGFAI
190 200 210 220 230 240
The complete length ORF121a nucleotide sequence <SEQ ID 785> is:
1 ATGTATCGGA GGAAAGGGCG GGGCATCAAG CCGTGGATGG ATGCCGGTGC
51 GGCGTTTGCC GCCTTGGTCT GGCTGGTTTT CGCGCTCGGC GATACTTTGA
101 CTCCGTTTGC GGTTGCGGCG GTGCTGGCGT ATGTATTGGA CCCTTTGGTC
151 GAATGGTTGC AGAAAAAGGG TTTGAACCGT GCATCCGCTT CGATGTCTGT
201 GATGGTGTTT TCCTTGATTT TGTTGTTGGC ATTATTGTTG ATTATTGTCC
251 CTATGCTGGT CGGGCAGTTC AACAATTTGG CATCGCGCCT GCCCCAATTA
301 ATCGGTTTTA TGCAGAACAC GCTGCTGCCG TGGTTGAAAA ATACAATCGG
351 CGGATATGTG GAAATCGATC AGGCATCTAT TATTGCGTGG CTTCAGGCGC
401 ATACGGGCGA GTTGAGCAAC GCGCTTAAGG CGTGGTTTCC CGTTTTGATG
451 AGGCAGGGCG GCAATATTGT CAGCAGTATC GGCAACCTGC TGCTGCTTCC
501 CTTGCTGCTT TACTATTTCC TGCTGGATTG GCAGCGGTGG TCGTGCGGCA
551 TTGCCAAACT GGTTCCGAGG CGTTTTGCCG GTGCTTATAC GCGCATTACA
601 GGCAATTTGA ACGAGGTATT GGGCGAATTT TTGCGCGGGC AGCTTCTGGT
651 GATGCTGATT ATGGGTTTGG TTTACGGCTT GGGGTTGGTG CTGGTCGGGC
701 TGGATTCGGG GTTTGCAATC GGTATGGTTG CCGGTATTTT GGTTTTTGTT
751 CCCTATTTGG GCGCGTTTAC AGGACTGCTG CTGGCAACCG TCGCCGCCTT
801 GCTCCAGTTC GGTTCGTGGA ACGGCATCTT GGCTGTTTGG GCGGTTTTTG
851 CCGTAGGACA GTTTCTCGAA AGTTTTTTCA TTACGCCGAA AATCGTGGGA
901 GACCGTATCG GCCTGTCGCC GTTTTGGGTT ATCTTTTCGC TGATGGCGTT
951 CGGGCAGCTG ATGGGCTTTG TCGGAATGTT GGCCGGATTG CCTTTGGCCG
1001 CCGTAACCTT GGTCTTGCTT CGCGAGGGCG TGCAGAAATA TTTTGCCGGC
1051 AGTTTTTACC GGGGCAGGTA G
This encodes a protein having the amino acid sequence <SEQ ID 786>:
1 MYRRKGRGIK PWMDAGAAFA ALVWLVFALG DTLTPFAVAA VLAYVLDPLV
51 EWLQKKGLNR ASASMSVMVF SLILLLALLL IIVPMLVGQF NNLASRLPQL
101 IGFMQNTLLP WLKNTIGGYV EIDQASIIAW LQAHTGELSN ALKAWFPVLM
151 RQGGNIVSSI GNLLLLPLLL YYFLLDWQRW SCGIAKLVPR RFAGAYTRIT
201 GNLNEVLGEF LRGQLLVMLI MGLVYGLGLV LVGLDSGFAI GMVAGILVFV
251 PYLGAFTGLL LATVAALLQF GSWNGILAVW AVFAVGQFLE SFFITPKIVG
301 DRIGLSPFWV IFSLMAFGQL MGFVGMLAGL PLAAVTLVLL REGVQKYFAG
351 SFYRGR*
ORF121a and ORF121-1 show 99.2% identity over a 356 aa overlap:
10 20 30 40 50 60
orf121a.pep MYRRKGRGIKPWMDAGAAFAALVWLVFALGDTLTPFAVAAVLAYVLDPLVEWLQKKGLNR
|||||||||||| |||||||||||||||||||||||||||||||||||||||||||||||
orf121-1 MYRRKGRGIKPWMGAGAAFAALVWLVFALGDTLTPFAVAAVLAYVLDPLVEWLQKKGLNR
10 20 30 40 50 60
70 80 90 100 110 120
orf121a.pep ASASMSVMVFSLILLLALLLIIVPMLVGQFNNLASRLPQLIGFMQNTLLPWLKNTIGGYV
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf121-1 ASASMSVMVFSLILLLALLLIIVPMLVGQFNNLASRLPQLIGFMQNTLLPWLKNTIGGYV
70 80 90 100 110 120
130 140 150 160 170 180
orf121a.pep EIDQASIIAWLQAHTGELSNALKAWFPVLMRQGGNIVSSIGNLLLLPLLLYYFLLDWQRW
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf121-1 EIDQASIIAWLQAHTGELSNALKAWFPVLMRQGGNIVSSIGNLLLLPLLLYYFLLDWQRW
130 140 150 160 170 180
190 200 210 220 230 240
orf121a.pep SCGIAKLVPRRFAGAYTRITGNLNEVLGEFLRGQLLVMLIMGLVYGLGLVLVGLDSGFAI
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf121-1 SCGIAKLVPRRFAGAYTRITGNLNEVLGEFLRGQLLVMLIMGLVYGLGLVLVGLDSGFAI
190 200 210 220 230 240
250 260 270 280 290 300
orf121a.pep GMVAGILVFVPYLGAFTGLLLATVAALLQFGSWNGILAVWAVFAVGQFLESFFITPKIVG
||:||||||||||||||||||||||||||||||||||:||||||||||||||||||||||
orf121-1 GMLAGILVFVPYLGAFTGLLLATVAALLQFGSWNGILSVWAVFAVGQFLESFFITPKIVG
250 260 270 280 290 300
310 320 330 340 350
orf121a.pep DRIGLSPFWVIFSLMAFGQLMGFVGMLAGLPLAAVTLVLLREGVQKYFAGSFYRGRX
|||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf121-1 DRIGLSPFWVIFSLMAFGQLMGFVGMLAGLPLAAVTLVLLREGVQKYFAGSFYRGRX
310 320 330 340 350
Homology with a predicted ORF from N. gonorrhoeae
ORF121 shows 97.4% identity over a 156 aa overlap with a predicted ORF (ORF121ng) from N. gonorrhoeae:
orf121.pep MYRRKGRGIKPWMGAGXAFAALVWLVFALGDTLTPFAVAAVLAYVLDPLVEWLQKKGLNR 60
|||||||||||||||| |||||||||:|||||||||||||||||||||||||||||||||
orf121ng MYRRKGRGIKPWMGAGAAFAALVWLVYALGDTLTPFAVAAVLAYVLDPLVEWLQKKGLNR 60
orf121.pep ASASMSVMVFSLILLLALLLIIVPMLVGQFNNLASRLPQLIGFMQNTLLPWLKNTIGGYV 120
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf121ng ASASMSVMVFSLILLLALLLIIVPMLVGQFNNLASRLPQLIGFMQNTLLPWLKNTIGGYV 120
orf121.pep EIDQASIIAWLQAHTGELSNALKAWFPVLMRQGGNI 156
||||||||||:|||||||||||||||||||:|||||
orf121ng EIDQASIIAWFQAHTGELSNALKAWFPVLMKQGGNIVSTIGNLLLPPLLLYYFLLDWHRW 180
The ORF121ng nucleotide sequence <SEQ ID 787> is predicted to encode a protein having amino acid sequence <SEQ ID 788>:
1 MYRRKGRGIK PWMGAGAAFA ALVWLVYALG DTLTPFAVAA VLAYVLDPLV
51 EWLQKKGLNR ASASMSVMVF SLILLLALLL IIVPMLVGQF NNLASRLPQL
101 IGFMQNTLLP WLKNTIGGYV EIDQASIIAW FQAHTGELSN ALKAWFPVLM
151 KQGGNIVSTI GNLLLPPLLL YYFLLDWHRW SCGIPKLVPR RFAGAYTRIT
201 GNLNKVWGKF LRGQLLGETE RGAVVCRVGR ECWEGGGARS RPSDDGWPRW
251 GGG*
Further work revealed the following gonococcal DNA sequence <SEQ ID 789>:
1 ATGTATCGGA GAAAAGGACG GGGCATCAAG CCGTGGATGG GTGCCGGCGC
51 GGCGTTTGCC GCCTTGGTCT GGCTGGTTTA CGCGCTCGGC GATACTTTGA
101 CTCCGTTTGC GGTTGCGGCG GTGCTGGCGT ATGTGTTGGA CCCTTTGGTC
151 GAATGGTTGC AGAAAAAGGG TTTGAACCGT GCATCCGCTT CGATGTCTGT
201 GATGGTGTTT TCCTTGATTT TGTTGTTGGC ATTATTGTTG ATTATTGTCC
251 CTATGCTGGT CGGGCAGTTC AATAATTTGG CATCTCGCCT GCCCCAATTA
301 ATCGGTTTTA TGCAGAACAC GCTGCTGCCG TGGTTGAAAA ATACAATCGG
351 CGGATATGTG GAAATCGATC AGGCATCTAT TATTGCGTGG TTTCAGGCGC
401 ATACGGGCGA GTTGAGCAAC GCGCTTAAGG CGTGGTTTCC CGTTTTGATG
451 AAACAGGGCG GCAATATTGT CAGCAGTATC GGCAACCTGC TGCTGCCGCC
501 CTTGCTGCTT TACTATTTCC TGCTGGATTG GCAGCGGTGG TCGTGCGGCA
551 TCGCCAAACT GGTTCCGAGG CGTTTTGCCG GTGCTTATAC GCGCATTACG
601 GGTAATTTGA ACGAGGTATT GGGCGAATTT TTGCGCGGTC AGCTTCTGGT
651 GATGCTGATT ATGGGCTTGG TTTACGGTTT GGGATTGATG CTAGTCGGAC
701 TGGATTCGGG ATTTGCCATC GGTATGGTTG CCGGTATTTT GGTGTTTGTC
751 CCCTATTTGG GTGCGTTTAC GGGATTGCTG CTTGCCACTG TTGCAGCCTT
801 GCTCCAGTTC GGTTCGTGGA ACGGAATCTT GGCTGTTTGG GCGGTTTTTG
851 CCGTCGGTCA GTTTCTCGAA AGTTTTTTCA TTACGCCGAA AATTGTAGGA
901 GACCGTATCG GCCTGTCGCC GTTTTGGGTT ATCTTTTCGC TGATGGCGTT
951 CGGAGAGCTG ATGGGCTTTG TCGGAATGTT GGCCGGATTG CCTTTGGCCG
1001 CCGTAACCTT GGTCTTGCTT CGCGAGGGCG CGCAGAAATA TTTTGCCGGC
1051 AGTTTTTACC GGGGCAGGTA G
This corresponds to the amino acid sequence <SEQ ID 790; ORF121ng-1>:
1 MYRRKGRGIK PWMGAGAAFA ALVWLVYALG DTLTPFAVAA VLAYVLDPLV
51 EWLQKKGLNR ASASMSVMVF SLILLLALLL IIVPMLVGQF NNLASRLPQL
101 IGFMQNTLLP WLKNTIGGYV EIDQASIIAW FQAHTGELSN ALKAWFPVLM
151 KQGGNIVSSI GNLLLPPLLL YYFLLDWQRW SCGIAKLVPR RFAGAYTRIT
201 GNLNEVLGEF LRGQLLVMLI MGLVYGLGLM LVGLDSGFAI GMVAGILVFV
251 PYLGAFTGLL LATVAALLQF GSWNGILAVW AVFAVGQFLE SFFITPKIVG
301 DRIGLSPFWV IFSLMAFGEL MGFVGMLAGL PLAAVTLVLL REGAQKYFAG
351 SFYRGR*
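The correspondence between the DNA sequence <SEQ ID 789> and the amino acid sequence <SEQ ID 790> can be spot-checked by translating codons in frame. The sketch below assumes the standard genetic code and reading frame 1; the names `CODON_TABLE` and `translate` are illustrative and not taken from the specification. Only the first 48 and the last 21 nucleotides of the listing are used.

```python
BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}

def translate(dna: str) -> str:
    """Translate in frame 1; '*' marks a stop codon. Trailing bases are ignored."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))

# Nucleotides 1-48 and 1051-1071 of SEQ ID 789, spaces removed.
start = "ATGTATCGGAGAAAAGGACGGGGCATCAAGCCGTGGATGGGTGCCGGC"
end = "AGTTTTTACCGGGGCAGGTAG"

n_term = translate(start)
c_term = translate(end)
print(n_term, c_term)
```

The 1071 nucleotides of the listing give 357 codons, that is, 356 residues followed by a stop codon, which agrees with the length of the protein sequence.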
ORF121ng-1 and ORF121-1 show 97.5% identity in a 356 amino acid overlap:
10 20 30 40 50 60
orf121-1.pep MYRRKGRGIKPWMGAGAAFAALVWLVFALGDTLTPFAVAAVLAYVLDPLVEWLQKKGLNR
||||||||||||||||||||||||||:|||||||||||||||||||||||||||||||||
orf121ng-1 MYRRKGRGIKPWMGAGAAFAALVWLVYALGDTLTPFAVAAVLAYVLDPLVEWLQKKGLNR
10 20 30 40 50 60
70 80 90 100 110 120
orf121-1.pep ASASMSVMVFSLILLLALLLIIVPMLVGQFNNLASRLPQLIGFMQNTLLPWLKNTIGGYV
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf121ng-1 ASASMSVMVFSLILLLALLLIIVPMLVGQFNNLASRLPQLIGFMQNTLLPWLKNTIGGYV
70 80 90 100 110 120
130 140 150 160 170 180
orf121-1.pep EIDQASIIAWLQAHTGELSNALKAWFPVLMRQGGNIVSSIGNLLLLPLLLYYFLLDWQRW
||||||||||:|||||||||||||||||||:||||||||||||| |||||||||||||||
orf121ng-1 EIDQASIIAWFQAHTGELSNALKAWFPVLMKQGGNIVSSIGNLLLPPLLLYYFLLDWQRW
130 140 150 160 170 180
190 200 210 220 230 240
orf121-1.pep SCGIAKLVPRRFAGAYTRITGNLNEVLGEFLRGQLLVMLIMGLVYGLGLVLVGLDSGFAI
|||||||||||||||||||||||||||||||||||||||||||||||||:||||||||||
orf121ng-1 SCGIAKLVPRRFAGAYTRITGNLNEVLGEFLRGQLLVMLIMGLVYGLGLMLVGLDSGFAI
190 200 210 220 230 240
250 260 270 280 290 300
orf121-1.pep GMLAGILVFVPYLGAFTGLLLATVAALLQFGSWNGILSVWAVFAVGQFLESFFITPKIVG
||:||||||||||||||||||||||||||||||||||:||||||||||||||||||||||
orf121ng-1 GMVAGILVFVPYLGAFTGLLLATVAALLQFGSWNGILAVWAVFAVGQFLESFFITPKIVG
250 260 270 280 290 300
310 320 330 340 350
orf121-1.pep DRIGLSPFWVIFSLMAFGQLMGFVGMLAGLPLAAVTLVLLREGVQKYFAGSFYRGRX
||||||||||||||||||:||||||||||||||||||||||||:|||||||||||||
orf121ng-1 DRIGLSPFWVIFSLMAFGELMGFVGMLAGLPLAAVTLVLLREGAQKYFAGSFYRGRX
310 320 330 340 350
In addition, ORF121ng-1 shows homology with a permease from H. influenzae:
sp|P43969|PERM_HAEIN Putative permease PERM homolog Length = 349
Score = 69.9 bits (168), Expect = 2e-11
Identities = 67/317 (21%), Positives = 120/317 (37%), Gaps = 7/317 (2%)
Query: 26 VYALGDTLTPFAVAAVLAYVLDPLVEWL-QKKGLNRASASMSVMVFSXXXXXXXXXXXVP 84
+Y GD + P +A VL+Y+L+ + +L Q R A++ + VP
Sbjct: 32 IYFFGDLIAPLLIALVLSYLLEIPINFLNQYLKCPRMLATILIFGSFIGLAAVFFLVLVP 91
Query: 85 MLVGQFNNLASRLPQLIGFMQNTLLPWLKNTIGGYVE-IDQASIIAWFQAHTGELSNALK 143
ML Q +L S LP + N WL N Y E ID + + + F + ++ +
Sbjct: 92 MLWNQTISLLSDLPAMF----NKSNEWLLNLPKNYPELIDYSMVDSIFNSVREKILGFGE 147
Query: 144 AWFPVLMKQGGNIVSSIGNXXXXXXXXXXXXXDWQRWSCGIAKLVPRRFAGAYTRITGNL 203
+ + + N+VS D G+++ +P+ A+ R +
Sbjct: 148 SAVKLSLASIMNLVSLGIYAFLVPLMMFFMLKDKSELLQGVSRFLPKNRNLAFXRWK-EM 206
Query: 204 NEVLGEFLRGQXXXXXXXXXXXXXXXXXXXXDSGFAIGMVAGILVFVPYXXXXXXXXXXX 263
+ + ++ G+ + + G+ V VPY
Sbjct: 207 QQQISNYIHGKLLEILIVTLITYIIFLIFGLNYPLLLAFAVGLSVLVPYIGAVIVTIPVA 266
Query: 264 XXXXXQFGSWNGILAVWAVFAVGQFLESFFITPKIVGDRIGLSPFWVIFSLMAFGELMGF 323
QFG + FAV Q L+ + P + + + L P +I S++ FG L GF
Sbjct: 267 LVALFQFGISPTFWYIIIAFAVSQLLDGNLLVPYLFSEAVNLHPLIIIISVLIFGGLWGF 326
Query: 324 VGMLAGLPLAAVTLVLL 340
G+ +PLA + ++
Sbjct: 327 WGVFFAIPLATLVKAVI 343
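The percentages in the summary line of the database hit follow from the raw counts over the 317 aligned columns. The sketch below assumes that the report truncates percentages to whole numbers, which is consistent with 120/317 (about 37.9%) being shown as 37%; the helper name `blast_percent` is illustrative.

```python
def blast_percent(count: int, aligned_length: int) -> int:
    """Whole-number percentage, truncated rather than rounded (assumption)."""
    return count * 100 // aligned_length

aligned_length = 317
counts = {"identities": 67, "positives": 120, "gaps": 7}

percents = {name: blast_percent(n, aligned_length) for name, n in counts.items()}
for name, pct in percents.items():
    print(f"{name}: {counts[name]}/{aligned_length} ({pct}%)")
```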
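Based on this analysis, including the presence of a putative leader sequence and transmembrane domains in the two proteins, it is predicted that these proteins from N. meningitidis and N. gonorrhoeae, and their epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies.
Example 94
The following partial DNA sequence was identified in N. meningitidis <SEQ ID 791>: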
1 ..ACTGCTTTTT CGGCGGCGCT GCGCTTGAGT CCATCATGAC TCGTCATATT
51 TTTGTCCTTT GGGAAACCGT ATCAACAAAC AGCCGCCATC TTAACATTTT
101 TTTGCACGTC CTGCCCGCCG CGTTCAAATG CGTACCAGCA ATACCGCCGC
151 CTGCGCCTCT ATGCCTTCCA TCCGCCCGAG ATAGCCGAGT TTTTCGTTGG
201 TTTTGCCTTT GATGTTGACG CACGAAATGT CTATGCCCAA ATCGGCGGCG
251 ATGTTGGCAC GCATTTGCGG AATGTGCGGC GCGAGTGTGG GTTTCTGTGC
301 AATCACGGTC GTATCGACAT TGACCGCCTG CCAACCCTGC GCCTGAACGC
351 TTTGATACGC CGCACGCAAA AGGACGCGGC TGTCCGCATC TTTGAACTCT
401 GCGGCGGTGT CGGGGAAATG GCTGCCGATA TCGCCCAAAC CTGCCGCACC
451 GAGCAGCGCG TCGGTAACGG CGTGCAGCAG CGCATCGGCA TCGGAGTGTC
501 CGAGCAGCCC TTTTTCAAAT GGGATTTCAA CTCCGCCAAG TATCAG..
This corresponds to the amino acid sequence <SEQ ID 792; ORF122>:
1 ..TAFSAALRLS PSXLVIFLSF GKPYQQTAAI LTFFCTSCPP RSNAYQQYRR
51 LRLYAFHPPE IAEFFVGFAF DVDARNVYAQ IGGDVGTHLR NVRRECGFLC
101 NHGRIDIDRL PTLRLNALIR RTQKDAAVRI FELCGGVGEM AADIAQTCRT
151 EQRVGNGVQQ RIGIGVSEQP FFKWDFNSAK YQ..
Further work revealed the complete nucleotide sequence <SEQ ID 793>:
1 ATATCGTACT GGGCAAGCAG TTCGCCGGAT TTTTTGGAAG TAGATACCGC
51 GCCTTTGATT TTTTTGCCGC TCTTACCCAA GGCTTCGATG AAAAAGTTGA
101 TGGTCGAGCC GGTACCGATG CCGATATATT CATTTTCGGG TACGAATTCG
151 ACTGCTTTTT CGGCGGCGAT GCGCTTGAGT TCGTCTTGTG TCGTCATATT
201 TTTGTCCTTT GGGAAACCGT ATCAACAAAC AGCCGCCATC TTAACATTTT
251 TTTGCACGTC CTGCCCGCCG CGTTCAAATG CGTACCAGCA ATACCGCCGC
301 CTGCGCCTCT ATGCCTTCCA TCCGCCCGAG ATAGCCGAGT TTTTCGTTGG
351 TTTTGCCTTT GATGTTGACG CACGAAATGT CTATGCCCAA ATCGGCGGCG
401 ATGTTGGCAC GCATTTGCGG AATGTGCGGC GCGAGTTTGG GTTTCTGTGC
451 AATCACGGTC GTATCGACAT TGACCGCCTG CCAACCCTGC GCCTGAACGC
501 TTTGATACGC CGCACGCAAA AGGACGCGGC TGTCCGCATC TTTGAACTCT
551 GCGGCGGTGT CGGGGAAATG GCTGCCGATA TCGCCCAAAC CTGCCGCACC
601 GAGCAGCGCG TCGGTAACGG CGTGCAGCAG CGCATCGGCA TCGGAGTGTC
651 CGAGCAGCCC TTTTTCAAAT GGGATTTCAA CTCCGCCAAG TATCAGCTTT
701 CTGCCTTCGG TCAGTTGGTG GACATCGTAG CCCTGTCCGA TACGGATGTT
751 CGTCATCGTT TGTGTTCCTG A
This corresponds to the amino acid sequence <SEQ ID 794; ORF122-1>:
1 ISYWASSSPD FLEVDTAPLI FLPLLPKASM KKLMVEPVPM PIYSFSGTNS
51 TAFSAAMRLS SSCVVIFLSF GKPYQQTAAI LTFFCTSCPP RSNAYQQYRR
101 LRLYAFHPPE IAEFFVGFAF DVDARNVYAQ IGGDVGTHLR NVRREFGFLC
151 NHGRIDIDRL PTLRLNALIR RTQKDAAVRI FELCGGVGEM AADIAQTCRT
201 EQRVGNGVQQ RIGIGVSEQP FFKWDFNSAK YQLSAFGQLV DIVALSDTDV
251 RHRLCS*
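The listings in this document carry a position number at the start of each row and a space after every ten residues. To compare or translate a sequence, the rows are first joined into one continuous string. The sketch below shows one way to do this; the helper name `join_listing` is illustrative. It keeps only letters and the `*` stop marker, so it also removes the `.` placeholders found in some partial DNA listings, which would shift positions if they matter.

```python
import re

def join_listing(rows):
    """Strip position numbers and spacing from listing rows; return one string."""
    parts = []
    for row in rows:
        body = re.sub(r"^\s*\d+\s*", "", row)          # leading position number
        parts.append(re.sub(r"[^A-Za-z*]", "", body))  # keep residues and '*'
    return "".join(parts)

# Last two rows of the ORF122-1 listing above.
rows = [
    "201 EQRVGNGVQQ RIGIGVSEQP FFKWDFNSAK YQLSAFGQLV DIVALSDTDV",
    "251 RHRLCS*",
]
tail = join_listing(rows)
print(tail, len(tail))
```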
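Computer analysis of this amino acid sequence gave the following results:
Homology with a predicted ORF from N. meningitidis (strain A)
ORF122 shows 94.0% identity over a 182 amino acid overlap with an ORF (ORF122a) from strain A of N. meningitidis: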
10 20 30
orf122.pep TAFSAALRLSPSXLVIFLSFGKPYQQTAAI
||||||:||| | :||||||||||||||||
orf122a FLPLLPKASMKKLMVEPVPMPMYSFSGTNSTAFSAAMRLSSSCVVIFLSFGKPYQQTAAI
30 40 50 60 70 80
40 50 60 70 80 90
orf122.pep LTFFCTSCPPRSNAYQQYRRLRLYAFHPPEIAEFFVGFAFDVDARNVYAQIGGDVGTHLR
|||| |||||||| ||||||||||||| |||:|||||||| |||||||||||||||||||
orf122a LTFFXTSCPPRSNPYQQYRRLRLYAFHAPEITEFFVGFAFXVDARNVYAQIGGDVGTHLR
90 100 110 120 130 140
100 110 120 130 140 150
orf122.pep NVRRECGFLCNHGRIDIDRLPTLRLNALIRRTQKDAAVRIFELCGGVGEMAADIAQTCRT
|:||| ||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf122a NMRREFGFLCNHGRIDIDRLPTLRLNALIRRTQKDAAVRIFELCGGVGEMAADIAQTCRT
150 160 170 180 190 200
160 170 180
orf122.pep EQRVGNGVQQRIGIGVSEQPFFKWDFNSAKYQ
||||||||||||||||||||||||||||||||
orf122a EQRVGNGVQQRIGIGVSEQPFFKWDFNSAKYQLSAFGQLVDIVALSDTDVRHRLCSX
210 220 230 240 250
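Because ORF122 is a partial sequence, the alignment above places it at an offset within the longer ORF122a. The position of such a fragment can be found by sliding it along the longer sequence and counting identical residues at each offset. This is a simple ungapped search written for illustration, not the method used in the specification; the function name `best_ungapped_offset` is an assumption.

```python
def best_ungapped_offset(query: str, target: str):
    """Return (offset, matches) for the ungapped placement with most identities."""
    best_offset, best_matches = -1, -1
    for offset in range(len(target) - len(query) + 1):
        window = target[offset:offset + len(query)]
        matches = sum(q == t for q, t in zip(query, window))
        if matches > best_matches:
            best_offset, best_matches = offset, matches
    return best_offset, best_matches

# Residues 1-30 of ORF122 and residues 21-80 of ORF122a, from the alignment above.
orf122_start = "TAFSAALRLSPSXLVIFLSFGKPYQQTAAI"
orf122a_part = "FLPLLPKASMKKLMVEPVPMPMYSFSGTNSTAFSAAMRLSSSCVVIFLSFGKPYQQTAAI"

offset, matches = best_ungapped_offset(orf122_start, orf122a_part)
print(offset, matches)
```

A zero-based offset of 30 within an excerpt that begins at residue 21 places the start of ORF122 at residue 51 of ORF122a.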
The complete length ORF122a nucleotide sequence <SEQ ID 795> is:
1 ATATCATATT GGGCAAGCAG TTCACTGGAT TTTTTGGAAG TAGATACCGC
51 GCCTTTGATT TTTTTGCCGC TCTTACCCAA GGCTTCGATG AAAAAGTTGA
101 TGGTCGAACC GGTACCGATG CCGATGTATT CGTTTTCGGG TACGAATTCG
151 ACTGCNTTTT CGGCGGCGAT GCGCTTGAGT TCGTCTTGTG TCGTCATATT
201 TTTGTCCTTT GGGAAACCGT ATCAACAAAC AGCCGCCATC TTAACATTTT
251 TTNNNACGTC CTGCCCGCCG CGTTCAAATC CTTACCAGCA ATACCGCCGC
301 CTGCGACTCT ATGCCTTCCA TGCGCCCGAG ATAACCGAGT TTTTCGTTGG
351 TTTTGCCTTT GANGTTGACG CACGAAATGT CTATGCCCAA ATCGGCGGCG
401 ATGTTGGCAC GCATTTGCGG AATATGCGGC GCGAGTTTGG GTTTCTGTGC
451 AATCACGGTC GTATCGACAT TGACCGCCTG CCAACCCTGC GCCTGAACGC
501 TTTGATACGC CGCACGCAAA AGGACGCGGC TGTCCGCATC TTTGAACTCT
551 GCGGCGGTGT CGGGGAAATG GCTGCCGATA TCGCCCAAAC CTGCCGCACC
601 GAGCAGCGCG TCGGTAACGG CGTGCAGCAG CGCATCGGCA TCGGAGTGTC
651 CGAGCAGCCC TTTTTCAAAT GGGATTTCAA CTCCGCCAAG TATCAGCTTT
701 CTGCCTTCGG TCAGTTGGTG GACATCGTAG CCCTGTCCGA TACGGATGTT
751 CGTCATCGTT TGTGTTCCTG A
This encodes a protein having amino acid sequence <SEQ ID 796>:
1 ISYWASSSLD FLEVDTAPLI FLPLLPKASM KKLMVEPVPM PMYSFSGTNS
51 TAFSAAMRLS SSCVVIFLSF GKPYQQTAAI LTFFXTSCPP RSNPYQQYRR
101 LRLYAFHAPE ITEFFVGFAF XVDARNVYAQ IGGDVGTHLR NMRREFGFLC
151 NHGRIDIDRL PTLRLNALIR RTQKDAAVRI FELCGGVGEM AADIAQTCRT
201 EQRVGNGVQQ RIGIGVSEQP FFKWDFNSAK YQLSAFGQLV DIVALSDTDV
251 RHRLCS*
ORF122a and ORF122-1 show 96.9% identity in a 256 amino acid overlap:
10 20 30 40 50 60
orf122a.pep ISYWASSSLDFLEVDTAPLIFLPLLPKASMKKLMVEPVPMPMYSFSGTNSTAFSAAMRLS
|||||||| ||||||||||||||||||||||||||||||:||||||||||||||||||||
orf122-1 ISYWASSSPDFLEVDTAPLIFLPLLPKASMKKLMVEPVPMPIYSFSGTNSTAFSAAMRLS
10 20 30 40 50 60
70 80 90 100 110 120
orf122a.pep SSCVVIFLSFGKPYQQTAAILTFFXTSCPPRSNPYQQYRRLRLYAFHAPEITEFFVGFAF
|||||||||||||||||||||||| |||||||| |||||| |||||| |||:||||||||
orf122-1 SSCVVIFLSFGKPYQQTAAILTFFCTSCPPRSNAYQQYRRLRLYAFHPPEIAEFFVGFAF
70 80 90 100 110 120
130 140 150 160 170 180
orf122a.pep XVDARNVYAQIGGDVGTHLRNMRREFGFLCNHGRIDIDRLPTLRLNALIRRTQKDAAVRI
|||||||||||||||||||||:||||||||||||||||||||||||||||||||||||||
orf122-1 DVDARNVYAQIGGDVGTHLRNVRREFGFLCNHGRIDIDRLPTLRLNALIRRTQKDAAVRI
130 140 150 160 170 180
190 200 210 220 230 240
orf122a.pep FELCGGVGEMAADIAQTCRTEQRVGNGVQQRIGIGVSEQPFFKWDFNSAKYQLSAFGQLV
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf122-1 FELCGGVGEMAADIAQTCRTEQRVGNGVQQRIGIGVSEQPFFKWDFNSAKYQLSAFGQLV
190 200 210 220 230 240
250
orf122a.pep DIVALSDTDVRHRLCSX
|||||||||||||||||
orf122-1 DIVALSDTDVRHRLCSX
250
Homology with a predicted ORF from N. gonorrhoeae
ORF122 shows 89.6% identity over a 182 amino acid overlap with an ORF (ORF122ng) from N. gonorrhoeae:
orf122.pep TAFSAALRLSPSXLVIFLSFGKPYQQTAAI 30
||||||:||| | :||||||||||||||||
orf122ng FLPLLPKASMKKLMVEPVPMPMYSFSGTNSTAFSAAMRLSSSCVVIFLSFGKPYQQTAAI 80
orf122.pep LTFFCTSCPPRSNAYQQYRRLRLYAFHPPEIAEFFVGFAFDVDARNVYAQIGGDVGTHLR 90
||||||| ||||| |||||||||||||||||||||||||||:||||: :|||||||||||
orf122ng LTFFCTSWPPRSNPYQQYRRLRLYAFHPPEIAEFFVGFAFDIDARNIDTQIGGDVGTHLR 140
orf122.pep NVRRECGFLCNHGRIDIDRLPTLRLNALIRRTQKDAAVRIFELCGGVGEMAADIAQTCRT 150
||| | ||||||||||||:|||||||||||||||||||||||||||||:||||:||||||
orf122ng NVRCEFGFLCNHGRIDIDHLPTLRLNALIRRTQKDAAVRIFELCGGVGKMAADVAQTCRT 200
orf122.pep EQRVGNGVQQRIGIGVSEQPFFKWDFNSAKYQ 182
|||||||||||:|| : |||||||||||||||
orf122ng EQRVGNGVQQRVGIRMPEQPFFKWDFNSAKYQLSAFGQLVDIVALSDTDIRHRLCS 256
The complete length ORF122ng nucleotide sequence <SEQ ID 797> is:
1 ATGTCGTACC GGGCAAGCAG TTCGCCGGAT TTTTTGGAGG TTGAAACCGC
51 GCCTTTGATT TTTTTACCGC TTTTGCCCAA GGCTTCGATG AAGAAATTGa
101 tgGTCGAACC GgtaCCGATG CCGATGTATT CGTTTTCGGG TACGAATTCG
151 ACTGCTTTTT CGGCGGCGAT GCGCttgAgt TCgtcttgcg TcgTCATATT
201 TTTAtccttt gGGAAaccct atcaAcaAAc agccgccatC TTAACATTTT
251 TTTGCACGtc ctggccgccg cgttcaAATc cgtaccaGca ataccgccgc
301 ctgcgcctCT AtgcCTTCCA TCCGCCCGAG ATAGCCGAGT TTTTCGTTGG
351 TTTTGCCTTT GATatTGACG CACGAAATAT CGatacCCAa atcggcgGCG
401 ATGTTGGCAC GCATTTGCGG AATGTGCGGT GCGAGTTTGG GTTTCTGTGC
451 AATCACGGTC GTATCGACAT TGACCACCTG CCAACCCTGC GCCTGAACGC
501 TTTGATACGC CGCACGCAAA AGGACGCGGC TGTCCGCATC TTTGAACTCT
551 GCGGCGGTGT CGGGAAAATG GCTGCCGATG TCGCCCAAAC CTGCCGCACC
601 GAGCAGCgcg tcggtaaCGG CGTGCAGCAG cgcgTcgGCA TCCGAATGCC
651 CGAGCAGCCC TTTTTCAAAT GGGATTTCAA CTCCGCCAAG TATCAGCTTT
701 CTGCCTTCGG TCAATTGGTG GACATCGTAG CCCTGTCCGA TACGGATATT
751 CGTCATCGTT TGTGTTCCTG A
This encodes a protein having amino acid sequence <SEQ ID 798>:
1 MSYRASSSPD FLEVETAPLI FLPLLPKASM KKLMVEPVPM PMYSFSGTNS
51 TAFSAAMRLS SSCVVIFLSF GKPYQQTAAI LTFFCTSWPP RSNPYQQYRR
101 LRLYAFHPPE IAEFFVGFAF DIDARNIDTQ IGGDVGTHLR NVRCEFGFLC
151 NHGRIDIDHL PTLRLNALIR RTQKDAAVRI FELCGGVGKM AADVAQTCRT
201 EQRVGNGVQQ RVGIRMPEQP FFKWDFNSAK YQLSAFGQLV DIVALSDTDI
251 RHRLCS*
ORF122ng and ORF122-1 show 92.6% identity in a 256 amino acid overlap:
10 20 30 40 50 60
orf122-1.pep ISYWASSSPDFLEVDTAPLIFLPLLPKASMKKLMVEPVPMPIYSFSGTNSTAFSAAMRLS
:|| ||||||||||:||||||||||||||||||||||||||:||||||||||||||||||
orf122ng MSYRASSSPDFLEVETAPLIFLPLLPKASMKKLMVEPVPMPMYSFSGTNSTAFSAAMRLS
10 20 30 40 50 60
70 80 90 100 110 120
orf122-1.pep SSCVVIFLSFGKPYQQTAAILTFFCTSCPPRSNAYQQYRRLRLYAFHPPEIAEFFVGFAF
|||||||||||||||||||||||| |||||||| ||||||||||||||||||||||||||
orf122ng SSCVVIFLSFGKPYQQTAAILTFFCTSWPPRSNPYQQYRRLRLYAFHPPEIAEFFVGFAF
70 80 90 100 110 120
130 140 150 160 170 180
orf122-1.pep DVDARNVYAQIGGDVGTHLRNVRREFGFLCNHGRIDIDRLPTLRLNALIRRTQKDAAVRI
|:||||: :|||||||||||||| ||||||||||||||:|||||||||||||||||||||
orf122ng DIDARNIDTQIGGDVGTHLRNVRCEFGFLCNHGRIDIDHLPTLRLNALIRRTQKDAAVRI
130 140 150 160 170 180
190 200 210 220 230 240
orf122-1.pep FELCGGVGEMAADIAQTCRTEQRVGNGVQQRIGIGVSEQPFFKWDFNSAKYQLSAFGQLV
||||||||:||||:|||||||||||||||||:|| : |||||||||||||||||||||||
orf122ng FELCGGVGKMAADVAQTCRTEQRVGNGVQQRVGIRMPEQPFFKWDFNSAKYQLSAFGQLV
190 200 210 220 230 240
250
orf122-1.pep DIVALSDTDVRHRLCSX
|||||||||:|||||||
orf122ng DIVALSDTDIRHRLCSX
250
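Based on this analysis, it is predicted that these proteins from N. meningitidis and N. gonorrhoeae, and their epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies.
Example 95
The following partial DNA sequence was identified in N. meningitidis <SEQ ID 799>: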
1 ..GCCGGCGCGA GTGCGAACAA CATTTCCGCG CGTTTTGCGG AAACACCCGT
51 CGCTGTCAGC GTTACCCTGA TCGGCACGGT ACTTGCCGTC ATGCTGCCCG
101 TTACCGAATA TGAAAACTTC CTGCTGCTTA TCGGCTCGGT ATTTGCGCCG
This corresponds to the amino acid sequence <SEQ ID 800; ORF125>:
1 ..AGASANNISA RFAETPVAVS VTLIGTVLAV MLPVTEYENF LLLIGSVFAP
51 MGGFDCRLFR LETA*
Further work revealed the complete nucleotide sequence <SEQ ID 801>:
1 ATGTCGGGCA ATGCCTCCTC TCCTTCATCT TCCTCCGCCA TCGGGCTGAT
51 TTGGTTCGGC GCGGCGGTAT CGATTGCCGA AATCAGCACG GGTACGCTGC
101 TTGCGCCTTT GGGCTGGCAG CGCGGTCTGG CGGCTCTACT TTTGGGTCAT
151 GCCGTCGGCG GCGCGCTGTT TTTTGCGGCG GCGTATATCG GCGCACTGAC
201 CGGACGCAGC TCGATGGAAA GCGTGCGCCT GTCGTTCGGC AAACGCGGTT
251 CAGTGCTGTT TTCCGTGGCG AATATGCTGC AACTGGCCGG CTGGACGGCG
301 GTGATGATTT ACGCCGGCGC AACGGTCAGC TCCGCTTTGG GCAAAGTGTT
351 GTGGGACGGC GAATCTTTTG TCTGGTGGGC ATTGGCAAAC GGCGCGCTGA
401 TTGTGCTGTG GCTGGTTTTC GGCGCACGCA AAACAGGCGG GCTGAAAACC
451 GTTTCGATGC TGCTGATGCT GTTGGCGGTT CTGTGGCTGA GTGCCGAAGT
501 CTTTTCCACG GCAGGCAGCA CCGCCGCACA GGTTTCAGAC GGCATGAGTT
551 TCGGAACGGC AGTCGAGCTG TCCGCCGTGA TGCCGCTTTC CTGGCTGCCG
601 CTTGCCGCCG ACTACACGCG CCACGCGCGC CGCCCGTTTG CGGCAACCCT
651 GACGGCAACG CTCGCCTACA CGCTGACCGG CTGCTGGATG TATGCCTTGG
701 GTTTGGCAGC GGCGTTGTTC ACCGGAGAAA CCGACGTGGC AAAAATCCTG
751 CTGGGCGCAG GTTTGGGTGC GGCAGGCATT TTGGCGGTCG TCCTCTCCAC
801 CGTTACCACA ACGTTTCTCG ATGCCTATTC CGCCGGCGCG AGTGCGAACA
851 ACATTTCCGC GCGTTTTGCG GAAACACCCG TCGCTGTCGG CGTTACCCTG
901 ATCGGCACGG TACTTGCCGT CATGCTGCCC GTTACCGAAT ATGAAAACTT
951 CCTGCTGCTT ATCGGCTCGG TATTTGCGCC GATGGCGGCG GTTTTGATTG
1001 CCGACTTTTT CGTCTTGAAA CGGCGTGAGG AGATTGAAGG CTTTGACTTT
1051 GCCGGACTGG TTCTGTGGCT TGCGGGCTTC ATCCTCTACC GCTTCCTGCT
1101 CTCGTCCGGC TGGGAAAGCA GCATCGGTCT GACCGCCCCC GTAATGTCTG
1151 CCGTTGCCAT TGCCACCGTA TCGGTACGCC TTTTCTTTAA AAAAACCCAA
1201 TCTTTACAAA GGAACCCGTC ATGA
This corresponds to the amino acid sequence <SEQ ID 802; ORF125-1>:
1 MSGNASSPSS SSAIGLIWFG AAVSIAEIST GTLLAPLGWQ RGLAALLLGH
51 AVGGALFFAA AYIGALTGRS SMESVRLSFG KRGSVLFSVA NMLQLAGWTA
101 VMIYAGATVS SALGKVLWDG ESFVWWALAN GALIVLWLVF GARKTGGLKT
151 VSMLLMLLAV LWLSAEVFST AGSTAAQVSD GMSFGTAVEL SAVMPLSWLP
201 LAADYTRHAR RPFAATLTAT LAYTLTGCWM YALGLAAALF TGETDVAKIL
251 LGAGLGAAGI LAVVLSTVTT TFLDAYSAGA SANNISARFA ETPVAVGVTL
301 IGTVLAVMLP VTEYENFLLL IGSVFAPMAA VLIADFFVLK RREEIEGFDF
351 AGLVLWLAGF ILYRFLLSSG WESSIGLTAP VMSAVAIATV SVRLFFKKTQ
401 SLQRNPS*
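Computer analysis of this amino acid sequence gave the following results:
Homology with a predicted ORF from N. meningitidis (strain A)
ORF125 shows 76.5% identity over a 51 amino acid overlap with an ORF (ORF125a) from strain A of N. meningitidis: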
10 20 30
orf125.pep AGASANNISARFAETPVAVSVTLIGTVLAV
||:|||||||:::| |:||:|:::||:|||
orf125a KILLGAGLGAAGILAVVLSTVTTTFLDAYSAGVSANNISAKLSEIPIAVAVAVVGTLLAV
250 260 270 280 290 300
40 50 60
orf125.pep MLPVTEYENFLLLIGSVFAPMGGFDCRLFRLETAX
:||||||||||||||||||||:
orf125a LLPVTEYENFLLLIGSVFAPMAAVLIADFFVLKRREEIEG
310 320 330 340
The ORF125a partial nucleotide sequence <SEQ ID 803> is:
1 ATGTCGGGCA ATGCCTCCTC TCNTTCATCT TCCGCCGCCA TCGGGCTGAT
51 TTGGTTCGGC GCGGCGGTAT CGATTGCCGA AATCAGCACG GGTACACTGC
101 TTGCGCCTTT GGGCTGGCAG CGCGGTCTGG CNGCTCTGCT TTTGGGTCAT
151 GCCGTCGGCG GCGCGCTGTT TTTTGCGGCG GCGTATATCG GCGCACTGAC
201 CGGACNCANC TCGATGGAAA GCGTGCGCCT GTCGTTCGGC AAACGCGGTT
251 CAGTGCTGTT TTCCGTGGCG AATATGCTGC AACTGGCCGG CTGGACGGCG
301 GTGATGATTT ACGCCGGCGC AACGGTCAGC TCCGCTTTGG GCAAAGTGTT
351 GTGGGACGGC GAATCTTTTG TCTGGTGGGC ATTGGCAAAC GGCGCGCTGA
401 TTGTGCTGTG GCTGGTTTTC GGCGCACGCA AAACAGGCGG GCTGAAAACC
451 GTTTCGATGC TGCTGATGCT GTTGGCGGTT CTGTGGCTGA GTGCCGAANT
501 NTTTTCCACG GCAGGCAGCA CCGCCGCANN GGTNNCAGAC GGCATGAGTT
551 TCGGAACGGC AGTCGAGCTG TCCGCCGTNA TGCCGCTTTC TTGGCTGCCG
601 CTGGCCGCCG ACTACACGCG CCACGCGCGC CGCCCGTTTG CGGCAACCCT
651 GACGGCAACG CTCGCCTACA CGCTGACCGG CTGCTGGATG TATGCCTTGG
701 GTTTGGCAGC GGCGTTGTTC ACCGGAGAAA CCGACGTGGC AAAAATCCTG
751 CTGGGCGCAG GTTTGGGTGC GGCAGGCATT TTGGCGGTCG TCCTGTCGAC
801 CGTTACCACC ACTTTTCTCG ATGCNTACTC CGCCGGCGTA AGTGCCAACA
851 ATATTTCCGC CAAACTTTCG GAAATACCNA TCGCCGTTGC CGTCGCCGTT
901 GTCGGCACAC TGCTTGCCGT CCTCCTGCCC GTTACCGAAT ATGAAAACTT
951 CCTGCTGCTT ATCGGCTCGG TATTTGCGCC GATGGCGGCG GTTTTGATTG
1001 CCGACTTTTT CGTCTTGAAA CGGCGTGAGG AGATTGAAGG C..
This encodes a protein having the partial amino acid sequence <SEQ ID 804>:
1 MSGNASSXSS SAAIGLIWFG AAVSIAEIST GTLLAPLGWQ RGLAALLLGH
51 AVGGALFFAA AYIGALTGXX SMESVRLSFG KRGSVLFSVA NMLQLAGWTA
101 VMIYAGATVS SALGKVLWDG ESFVWWALAN GALIVLWLVF GARKTGGLKT
151 VSMLLMLLAV LWLSAEXFST AGSTAAXVXD GMSFGTAVEL SAVMPLSWLP
201 LAADYTRHAR RPFAATLTAT LAYTLTGCWM YALGLAAALF TGETDVAKIL
251 LGAGLGAAGI LAVVLSTVTT TFLDAYSAGV SANNISAKLS EIPIAVAVAV
301 VGTLLAVLLP VTEYENFLLL IGSVFAPMAA VLIADFFVLK RREEIEG..
ORF125a and ORF125-1 show 94.5% identity in a 347 amino acid overlap:
10 20 30 40 50 60
orf125a.pep MSGNASSXSSSAAIGLIWFGAAVSIAEISTGTLLAPLGWQRGLAALLLGHAVGGALFFAA
||||||| |||:||||||||||||||||||||||||||||||||||||||||||||||||
orf125-1 MSGNASSPSSSSAIGLIWFGAAVSIAEISTGTLLAPLGWQRGLAALLLGHAVGGALFFAA
10 20 30 40 50 60
70 80 90 100 110 120
orf125a.pep AYIGALTGXXSMESVRLSFGKRGSVLFSVANMLQLAGWTAVMIYAGATVSSALGKVLWDG
|||||||| ||||||||||||||||||||||||||||||||||||||||||||||||||
orf125-1 AYIGALTGRSSMESVRLSFGKRGSVLFSVANMLQLAGWTAVMIYAGATVSSALGKVLWDG
70 80 90 100 110 120
130 140 150 160 170 180
orf125a.pep ESFVWWALANGALIVLWLVFGARKTGGLKTVSMLLMLLAVLWLSAEXFSTAGSTAAXVXD
|||||||||||||||||||||||||||||||||||||||||||||| ||||||||| | |
orf125-1 ESFVWWALANGALIVLWLVFGARKTGGLKTVSMLLMLLAVLWLSAEVFSTAGSTAAQVSD
130 140 150 160 170 180
190 200 210 220 230 240
orf125a.pep GMSFGTAVELSAVMPLSWLPLAADYTRHARRPFAATLTATLAYTLTGCWMYALGLAAALF
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf125-1 GMSFGTAVELSAVMPLSWLPLAADYTRHARRPFAATLTATLAYTLTGCWMYALGLAAALF
190 200 210 220 230 240
250 260 270 280 290 300
orf125a.pep TGETDVAKILLGAGLGAAGILAVVLSTVTTTFLDAYSAGVSANNISAKLSEIPIAVAVAV
|||||||||||||||||||||||||||||||||||||||:|||||||:::| |:||:|::
orf125-1 TGETDVAKILLGAGLGAAGILAVVLSTVTTTFLDAYSAGASANNISARFAETPVAVGVTL
250 260 270 280 290 300
310 320 330 340
orf125a.pep VGTLLAVLLPVTEYENFLLLIGSVFAPMAAVLIADFFVLKRREEIEG
:||:|||:|||||||||||||||||||||||||||||||||||||||
orf125-1 IGTVLAVMLPVTEYENFLLLIGSVFAPMAAVLIADFFVLKRREEIEGFDFAGLVLWLAGF
310 320 330 340 350 360
Homology with a predicted ORF from N. gonorrhoeae
ORF125 shows 86.2% identity over a 65 amino acid overlap with a predicted ORF (ORF125ng) from N. gonorrhoeae:
orf125.pep AGASANNISARFAETPVAVSVTLIGTVLAV 30
|||||||||||||| ||||:|||| |||||
orf125ng KILLGAGLGITGILAVVLSTVTTTFLDTYSAGASANNISARFAEIPVAVGVTLIRTVLAV 308
orf125.pep MLPVTEYENFLLLIGSVFAPM-GGFDCRLFRLETA 64
|||||||:|||||| |||:|| |||||||| |:||
orf125ng MLPVTEYKNFLLLIRSVFGPMAGGFDCRLFCLKTA 343
The ORF125ng nucleotide sequence <SEQ ID 805> is predicted to encode a protein having amino acid sequence <SEQ ID 806>:
1 MSGNASSPSS SAAIGLVWFG AAVSIAEIST GTLLAPLGWQ RGLAALLLGH
51 AVGGALFFAA AYIGALTGRS SMESVRLSFG KCGSVLFSVA NMLQLAGWTA
101 VMIYVGATVS SALGKVLWDG ESFVWWALAN GALIVLWLVF GARRTGGLKT
151 VSMLLMLLAV LWLSVEVFAS SGTNAAPAVS DGMTFGTAVE LSAVMPLSWL
201 PLAADYTRQA RRPFAATLTA TLAYTLTGCW MYALGLAAAL FTGETDVAKI
251 LLGAGLGITG ILAVVLSTVT TTFLDTYSAG ASANNISARF AEIPVAVGVT
301 LIRTVLAVML PVTEYKNFLL LIRSVFGPMA GGFDCRLFCL KTA*
Further work revealed the following gonococcal DNA sequence <SEQ ID 807>:
1 ATGTCGGGCA ATGCCTCCTC TCCTTCATCT TCCGCCGCCA TCGGGCTGGT
51 TTGGTTCGGC GCGGCGGTAT CGATTGCCGA AATCAGCACG GGTACGCTGC
101 TCGCCCCCTT GGGCTGGCAG CGCGGTCTGG CGGCCCTGCT TTTGGGTCAT
151 GCCGTCGGCG GCGCGCTGTT TTTTGCGGCG GCGTATATCG GCGCACTGAC
201 CGGACGCAGC TCGATGGAAA GTGTGCGCCT GTCGTTCGGC AAATGCGGTT
251 CAGTGCTGTT TTCCGTGGCG AATATGCTGC AACTGGCCGG CTGGACGGCG
301 GTGATGATTT ACGTCGGCGC AACGGTCAGC TCCGCTTTGG GCAAAGTGTT
351 GTGGGACGGC GAATCCTTTG TCTGGTGGGC ATTGGCAAAC GGCGCACTGA
401 TCGTGCTGTG GCTGGTTTTC GGCGCACGCA GAACGGGCGG GCTGAAAACC
451 GTTTCGATGC TGCTGATGCT GCTTGCCGTG TTGTGGTTGA GCGTCGAAGT
501 GTTCGCTTCG TCCGGCACAA ACGCCGCGCC CGCCGTTTCA GACGGCATGA
551 CCTTCGGAAC GGCAGTCGAA CTGTCCGCCG TCATGCCGCT TTCCTGGCTG
601 CCGCTGGCCG CCGACTACAC GCGCCAAGCA CGCCGCCCGT TTGCGGCAAC
651 CCTGACGGCA ACGCTCGCCT ATACGCTGAC GGGCTGCTGG ATGTATGCCT
701 TGGGTTTGGC GGCGGCTCTG TTTACCGGAG AAACCGACGT GGCGAAAATC
751 CTGTTGGGCG CGGGCTTGGG CATAACGGGC ATTCTGGCAG TCGTCCTCTC
801 CACCGTTACC ACAACGTTTC TCGATACCTA TTCCGCCGGC GCGAGTGCGA
851 ACAACATTTC CGCGCGTTTT GCGGAAATAC CCGTCGCTGT CGGCGTTACC
901 CTGATCGGCA CGGTGCTTGC CGTCATGCTG CCCGTTACCG AATATAAAAA
951 CTTCCTGCTG CTTATCGGCT CGGTATTTGC GCCGATGGCG GCGGTTTTGA
1001 TTGCCGACTT TTTCGTCTTA AAACGGCGTG AGGAGATTGA AGGCTTTGAC
1051 TTTGCCGGAC TGGTTCTGTG GCTGGCAGGC TTCATCCTCT ACCGCTTCCT
1101 GCTCTCGTCC GGTTGGGAAA GCAGCATCGG TCTGACCGCC CCCGTAATGT
1151 CTGCCGTTGC CATTGCCACC GTATCGGTAC GCCTTTTCTT TAAAAAAACC
1201 CAATCTTTAC AAAGGAACCC GTCATGA
This corresponds to the amino acid sequence <SEQ ID 808; ORF125ng-1>:
1 MSGNASSPSS SAAIGLVWFG AAVSIAEIST GTLLAPLGWQ RGLAALLLGH
51 AVGGALFFAA AYIGALTGRS SMESVRLSFG KCGSVLFSVA NMLQLAGWTA
101 VMIYVGATVS SALGKVLWDG ESFVWWALAN GALIVLWLVF GARRTGGLKT
151 VSMLLMLLAV LWLSVEVFAS SGTNAAPAVS DGMTFGTAVE LSAVMPLSWL
201 PLAADYTRQA RRPFAATLTA TLAYTLTGCW MYALGLAAAL FTGETDVAKI
251 LLGAGLGITG ILAVVLSTVT TTFLDTYSAG ASANNISARF AEIPVAVGVT
301 LIGTVLAVML PVTEYKNFLL LIGSVFAPMA AVLIADFFVL KRREEIEGFD
351 FAGLVLWLAG FILYRFLLSS GWESSIGLTA PVMSAVAIAT VSVRLFFKKT
401 QSLQRNPS*
ORF125ng-1 and ORF125-1 show 95.1% identity in a 408 amino acid overlap:
10 20 30 40 50 60
orf125-1.pep MSGNASSPSSSSAIGLIWFGAAVSIAEISTGTLLAPLGWQRGLAALLLGHAVGGALFFAA
|||||||||||:||||:|||||||||||||||||||||||||||||||||||||||||||
orf125ng-1 MSGNASSPSSSAAIGLVWFGAAVSIAEISTGTLLAPLGWQRGLAALLLGHAVGGALFFAA
10 20 30 40 50 60
70 80 90 100 110 120
orf125-1.pep AYIGALTGRSSMESVRLSFGKRGSVLFSVANMLQLAGWTAVMIYAGATVSSALGKVLWDG
||||||||||||||||||||| |||||||||||||||||:||||||||||||||||||||
orf125ng-1 AYIGALTGRSSMESVRLSFGKCGSVLFSVANMLQLAGWTAVMIYVGATVSSALGKVLWDG
70 80 90 100 110 120
130 140 150 160 170 179
orf125-1.pep ESFVWWALANGALIVLWLVFGARKTGGLKTVSMLLMLLAVLWLSAEVFSTAGSTAAQ-VS
|||||||||||||||||||||||:||||||||||||||||||||:|||:::|::|| ||
orf125ng-1 ESFVWWALANGALIVLWLVFGARRTGGLKTVSMLLMLLAVLWLSVEVFASSGTNAAPAVS
130 140 150 160 170 180
180 190 200 210 220 230 239
orf125-1.pep DGMSFGTAVELSAVMPLSWLPLAADYTRHARRPFAATLTATLAYTLTGCWMYALGLAAAL
|||:||||||||||||||||||||||||:|||||||||||||||||||||||||||||||
orf125ng-1 DGMTFGTAVELSAVMPLSWLPLAADYTRQARRPFAATLTATLAYTLTGCWMYALGLAAAL
190 200 210 220 230 240
240 250 260 270 280 290 299
orf125-1.pep FTGETDVAKILLGAGLGAAGILAVVLSTVTTTFLDAYSAGASANNISARFAETPVAVGVT
||||||||||||||||| :||||||||||||||||:|||||||||||||||| |||||||
orf125ng-1 FTGETDVAKILLGAGLGITGILAVVLSTVTTTFLDTYSAGASANNISARFAEIPVAVGVT
250 260 270 280 290 300
300 310 320 330 340 350 359
orf125-1.pep LIGTVLAVMLPVTEYENFLLLIGSVFAPMAAVLIADFFVLKRREEIEGFDFAGLVLWLAG
|||||||||||||||:||||||||||||||||||||||||||||||||||||||||||||
orf125ng-1 LIGTVLAVMLPVTEYKNFLLLIGSVFAPMAAVLIADFFVLKRREEIEGFDFAGLVLWLAG
310 320 330 340 350 360
360 370 380 390 400
orf125-1.pep FILYRFLLSSGWESSIGLTAPVMSAVAIATVSVRLFFKKTQSLQRNPSX
|||||||||||||||||||||||||||||||||||||||||||||||||
orf125ng-1 FILYRFLLSSGWESSIGLTAPVMSAVAIATVSVRLFFKKTQSLQRNPSX
370 380 390 400
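The alignment above contains a one-residue gap, which an ungapped comparison cannot place. The text here does not state which program produced the alignment, so the following is only a textbook Needleman-Wunsch global alignment with illustrative scores (match +1, mismatch -1, gap -2), applied to the short region around the gap. All names and score values are assumptions.

```python
def needleman_wunsch(a: str, b: str, match=1, mismatch=-1, gap=-2):
    """Global alignment by dynamic programming; returns (aligned_a, aligned_b, score)."""
    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    # Trace back from the bottom-right cell to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b)), score[n][m]

# Region around the gap, gap characters removed (from the alignment above).
seq_a = "AEVFSTAGSTAAQVSDGMSFG"    # ORF125-1
seq_b = "VEVFASSGTNAAPAVSDGMTFG"   # ORF125ng-1

aligned_a, aligned_b, nw_score = needleman_wunsch(seq_a, seq_b)
print(aligned_a)
print(aligned_b)
```

With different scoring parameters or a substitution matrix, the gap may be placed in a different column from the one shown in the alignment above.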
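Based on this analysis, including the presence of a putative leader sequence and transmembrane domains in the gonococcal protein, it is predicted that these proteins from N. meningitidis and N. gonorrhoeae, and their epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies.
Example 96
The following partial DNA sequence was identified in N. meningitidis <SEQ ID 809>: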
1 ATGACCCGTA TCGCCATCCT CGGCGGCGGC CTCTCGGGAA GGCTGACCGC
51 GTTGCAGCTT GCAGAACAAG GTTATCAGAT TGCACTTTTC GATAAAAGCT
101 GCCGCCGGGG CGAACACGCC GCCGCCTATG TAGCCGCCGC CATGCTCGCG
151 CCTGCAGCGG A.ACGGTCGA AGCCACGCCC GAAGTGGTCA GGCTGGGCAG
201 GCAGAGCATC CCGCTTTGGC GCGGCATCCG ATGCCGTCTG AACACGCACA
251 CGATGATGCA GGAAAACGGC AGCCTGATTG TATGGCACGG GCAGGACAAG
301 CCATTATCCA GCGAGTTCGT CCGCCATCTC AAACGCGGCG GCGT.ACGGA
351 TGACGAAATC GTCCGTTGGC GCGCCGACGA CATCGCCGAA CGCGAACCGC
401 AACTCGGCGG ACGTTTTTAA GACGGCATCT ACCTGCCGAC CGAAGC.CAG
451 CTCGACGGGC GGCAATTATA GTCTGCACTT GCCGACGCTT TGGACGAACT
501 GAACGTCCCC TGCCATTGGG AACACGAATG CGTCCCCGAA GCCTGCAAG..
This corresponds to the amino acid sequence <SEQ ID 810; ORF126>:
1 MTRIAILGGG LSGRLTALQL AEQGYQIALF DKSCRRGEHA AAYVAAAMLA
51 PAAXTVEATP EVVRLGRQSI PLWRGIRCRL NTHTMMQENG SLIVWHGQDK
101 PLSSEFVRHL KRGGXTDDEI VRWRADDIAE REPQLGGRFX DGIYLPTEXQ
151 LDGRQLXSAL ADALDELNVP CHWEHECVPE ACK...
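The partial DNA sequence <SEQ ID 809> contains `.` characters, read here as unresolved bases (an assumption), and the ORF126 protein contains X residues. A codon that includes an unresolved base cannot be assigned an amino acid. The sketch below translates with the standard genetic code and writes X for any such codon; it covers only nucleotides 151-171 and does not attempt to explain every X in the listing. The names `CODON_TABLE` and `translate_partial` are illustrative.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code in the same codon order that product() generates.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def translate_partial(dna: str, unknown: str = "X") -> str:
    """Translate in frame 1; codons with a non-ACGT symbol become `unknown`."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE.get(dna[i:i + 3], unknown)
                   for i in range(0, usable, 3))

# Nucleotides 151-171 of SEQ ID 809, spaces removed.
fragment = "CCTGCAGCGGA.ACGGTCGAA"
peptide = translate_partial(fragment)
print(peptide)
```

The result matches residues 51-57 of the ORF126 listing, where the X at residue 54 falls on the codon `GA.`.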
Further work revealed the complete nucleotide sequence <SEQ ID 811>:
1 ATGACCCGTA TCGCCATCCT CGGCGGCGGC CTCTCGGGAA GGCTGACCGC
51 GTTGCAGCTT GCAGAACAAG GTTATCAGAT TGCACTTTTC GATAAAGGCT
101 GCCGCCGGGG CGAACACGCC GCCGCCTATG TTGCCGCCGC CATGCTCGCG
151 CCTGCGGCGG AAGCGGTCGA AGCCACGCCC GAAGTGGTCA GGCTGGGCAG
201 GCAGAGCATC CCGCTTTGGC GCGGCATCCG ATGCCGTCTG AACACGCACA
251 CGATGATGCA GGAAAACGGC AGCCTGATTG TGTGGCACGG GCAGGACAAG
301 CCATTATCCA GCGAGTTCGT CCGCCATCTC AAACGCGGCG GCGTAGCGGA
351 TGACGAAATC GTCCGTTGGC GCGCCGACGA CATCGCCGAA CGCGAACCGC
401 AACTCGGCGG ACGTTTTTCA GACGGCATCT ACCTGCCGAC CGAAGGCCAG
451 CTCGACGGGC GGCAAATATT GTCTGCACTT GCCGACGCTT TGGACGAACT
501 GAACGTCCCC TGCCATTGGG AACACGAATG CGTCCCCGAA GGCCTGCAAG
551 CCCAATACGA CTGGCTGATC GACTGCCGCG GCTACGGCGC AAAAACCGCG
601 TGGAACCAAT CCCCCGAGCA CACCAGCACC CTGCGCGGCA TACGCGGCGA
651 AGTGGCGCGG GTTTACACAC CCGAAATCAC GCTCAACCGC CCCGTGCGTC
701 TGCTCCATCC GCGTTATCCG CTCTACATCG CCCCGAAAGA AAACCACGTC
751 TTCGTCATCG GCGCGACCCA AATCGAAAGC GAAAGCCAAG CCCCCGCCAG
801 CGTGCGTTCA GGGTTGGAAC TCTTGTCCGC ACTCTATGCC ATCCACCCCG
851 CCTTCGGCGA AGCCGACATC CTCGAAATCG CCACCGGCCT GCGCCCCACG
901 CTCAACCACC ACAACCCCGA AATCCGTTAC AACCGCGCCC GACGCCTGAT
951 TGAAATCAAC GGCCTTTTCC GCCACGGTTT CATGATCTCC CCCGCCGTAA
1001 CCGCCGCCGC CGCCAGATTG GCAGTGGCAC TGTTTGACGG AAAAGACGCG
1051 CCCGAACGCG ATAAAGAAAG CGGTTTGGCG TATATCCGAA GACAAGATTA
1101 A
This corresponds to the amino acid sequence <SEQ ID 812; ORF126-1>:
1 MTRIAILGGG LSGRLTALQL AEQGYQIALF DKGCRRGEHA AAYVAAAMLA
51 PAAEAVEATP EVVRLGRQSI PLWRGIRCRL NTHTMMQENG SLIVWHGQDK
101 PLSSEFVRHL KRGGVADDEI VRWRADDIAE REPQLGGRFS DGIYLPTEGQ
151 LDGRQILSAL ADALDELNVP CHWEHECVPE GLQAQYDWLI DCRGYGAKTA
201 WNQSPEHTST LRGIRGEVAR VYTPEITLNR PVRLLHPRYP LYIAPKENHV
251 FVIGATQIES ESQAPASVRS GLELLSALYA IHPAFGEADI LEIATGLRPT
301 LNHHNPEIRY NRARRLIEIN GLFRHGFMIS PAVTAAAARL AVALFDGKDA
351 PERDKESGLA YIRRQD*
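Computer analysis of this amino acid sequence gave the following results:
Homology with a predicted ORF from N. meningitidis (strain A)
ORF126 shows 90.0% identity over a 180 amino acid overlap with an ORF (ORF126a) from strain A of N. meningitidis: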
10 20 30 40 50 60
orf126.pep MTRIAILGGGLSGRLTALQLAEQGYQIALFDKSCRRGEHAAAYVAAAMLAPAAXTVEATP
||||||||||||||||||||||||||||||||:|||||||||||||||||||| :|||||
orf126a MTRIAILGGGLSGRLTALQLAEQGYQIALFDKGCRRGEHAAAYVAAAMLAPAAEAVEATP
10 20 30 40 50 60
70 80 90 100 110 120
orf126.pep EVVRLGRQSIPLWRGIRCRLNTHTMMQENGSLIVWHGQDKPLSSEFVRHLKRGGXTDDEI
|||||||| |||||||||:|:| :|| ||||||||||||||||:|||||||||| :|| |
orf126a EVVRLGRQXIPLWRGIRCHLKTPAMMXENGSLIVWHGQDKPLSNEFVRHLKRGGVADDXI
70 80 90 100 110 120
130 140 150 160 170 180
orf126.pep VRWRADDIAEREPQLGGRFXDGIYLPTEXQLDGRQLXSALADALDELNVPCHWEHECVPE
||||||||||||||||||| |||||||| ||||||: ||||||||||||||||||||:||
orf126a VRWRADDIAEREPQLGGRFSDGIYLPTEGQLDGRQILSALADALDELNVPCHWEHECAPE
130 140 150 160 170 180
The complete length ORF126a nucleotide sequence <SEQ ID 813> is:
1 ATGACCCGTA TCGCCATCCT CGGCGGCGGC CTCTCNGGAA GGCTGACCGC
51 ACTGCAGCTT GCAGAACAAG GTTATCAGAT TGCACTTTTC GATAAAGGCT
101 GCCGCCGGGG CGAACACGCC GCCGCCTATG TTGCCGCCGC CATGCTCGCG
151 CCTGCGGCGG AAGCGGTCGA AGCCACGCCT GAAGTGGTCA GGCTGGGCAG
201 GCAGANCATC CCGCTTTGGC GCGGCATCCG ATGCCATCTG AAAACGCCTG
251 CCATGATGCA NGAAAACGGC AGCCTGATTG TGTGGCACGG GCAGGACAAA
301 CCTTTATCCA ACGAGTTCGT CCGCCATCTC AAACGCGGCG GCGTAGCGGA
351 TGACNAAATC GTCCGTTGGC GCGCCGACGA CATCGCCGAA CGCGAACCGC
401 AACTCGGCGG ACGTTTTTCA GACGGCATCT ACCTGCCGAC CGAAGGCCAG
451 CTCGACGGGC GGCAAATATT GTCTGCACTT GCCGACGCTT TGGACGAACT
501 GAACGTCCCC TGCCATTGGG AACACGAATG TGCCCCCGAA GACTTGCAAG
551 CCCAATACGA CTGGCTGATC GACTGCCGCG GCTACGGCGC AAAAACCGCG
601 TGGAACCAAT CCCCCGANNA NACCAGCACC CTGCGCGGCA TACGCGGCGA
651 AGTGGCGCGG GTTTACACAC CCGAAATCAC GCTCAACCGC CCCGTGCGCC
701 TGCTACACCC GCGCTATCCG CTNTACATCG CCCCGAAAGA AAACCNCGTC
751 TTCGTCATCG GCGCGACCCA AATCGAAAGC GAAAGCCAAG CACCTGCCAG
801 CGTGCGTTCC GGGCTGGAAC TCTTATCCGC ACTCTATGCC GTCCACCCCG
851 CCTTCGGCGA AGCCGACATC CTCGAAATCG CCACCGGCCT GCGCCCCACG
901 CTCAATCACC ACAACCCCGA AATCCGTTAC AACCGCGCCC GACGCCTGAT
951 TGAAATCAAC GGCCTTTTCC GCCACGGTTT CATGATCTCC CCCGCCGTAA
1001 CCGCCGCCGC CGTCAGATTG GCAGTGGCAC TGTTTGACGG AAAAGANGCG
1051 CCCGAACGCG ATGAAGAAAG CGGTTTGGCG TATATCCGAA GACAAGATTA
1101 A
This encodes a protein having amino acid sequence <SEQ ID 814>:
1 MTRIAILGGG LSGRLTALQL AEQGYQIALF DKGCRRGEHA AAYVAAAMLA
51 PAAEAVEATP EVVRLGRQXI PLWRGIRCHL KTPAMMXENG SLIVWHGQDK
101 PLSNEFVRHL KRGGVADDXI VRWRADDIAE REPQLGGRFS DGIYLPTEGQ
151 LDGRQILSAL ADALDELNVP CHWEHECAPE DLQAQYDWLI DCRGYGAKTA
201 WNQSPXXTST LRGIRGEVAR VYTPEITLNR PVRLLHPRYP LYIAPKENXV
251 FVIGATQIES ESQAPASVRS GLELLSALYA VHPAFGEADI LEIATGLRPT
301 LNHHNPEIRY NRARRLIEIN GLFRHGFMIS PAVTAAAVRL AVALFDGKXA
351 PERDEESGLA YIRRQD*
ORF126a and ORF126-1 show 95.4% identity in a 366 amino acid overlap:
10 20 30 40 50 60
orf126a.pep MTRIAILGGGLSGRLTALQLAEQGYQIALFDKGCRRGEHAAAYVAAAMLAPAAEAVEATP
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf126-1 MTRIAILGGGLSGRLTALQLAEQGYQIALFDKGCRRGEHAAAYVAAAMLAPAAEAVEATP
10 20 30 40 50 60
70 80 90 100 110 120
orf126a.pep EVVRLGRQXIPLWRGIRCHLKTPAMMXENGSLIVWHGQDKPLSNEFVRHLKRGGVADDXI
|||||||| |||||||||:|:| :|| ||||||||||||||||:|||||||||||||| |
orf126-1 EVVRLGRQSIPLWRGIRCRLNTHTMMQENGSLIVWHGQDKPLSSEFVRHLKRGGVADDEI
70 80 90 100 110 120
130 140 150 160 170 180
orf126a.pep VRWRADDIAEREPQLGGRFSDGIYLPTEGQLDGRQILSALADALDELNVPCHWEHECAPE
|||||||||||||||||||||||||||||||||||||||||||||||||||||||||:||
orf126-1 VRWRADDIAEREPQLGGRFSDGIYLPTEGQLDGRQILSALADALDELNVPCHWEHECVPE
130 140 150 160 170 180
190 200 210 220 230 240
orf126a.pep DLQAQYDWLIDCRGYGAKTAWNQSPXXTSTLRGIRGEVARVYTPEITLNRPVRLLHPRYP
||||||||||||||||||||||||| |||||||||||||||||||||||||||||||||
orf126-1 GLQAQYDWLIDCRGYGAKTAWNQSPEHTSTLRGIRGEVARVYTPEITLNRPVRLLHPRYP
190 200 210 220 230 240
250 260 270 280 290 300
orf126a.pep LYIAPKENXVFVIGATQIESESQAPASVRSGLELLSALYAVHPAFGEADILEIATGLRPT
|||||||| |||||||||||||||||||||||||||||||:|||||||||||||||||||
orf126-1 LYIAPKENHVFVIGATQIESESQAPASVRSGLELLSALYAIHPAFGEADILEIATGLRPT
250 260 270 280 290 300
310 320 330 340 350 360
orf126a.pep LNHHNPEIRYNRARRLIEINGLFRHGFMISPAVTAAAVRLAVALFDGKXAPERDEESGLA
||||||||||||||||||||||||||||||||:||||||||||||||| |||||:|||||
orf126-1 LNHHNPEIRYNRARRLIEINGLFRHGFMISPAVTAAAARLAVALFDGKDAPERDKESGLA
310 320 330 340 350 360
orf126a.pep YIRRQDX
|||||||
orf126-1 YIRRQDX
Homology with a predicted ORF from N. gonorrhoeae
ORF126 shows 90% identity over a 180 amino acid overlap with a predicted ORF (ORF126ng) from N. gonorrhoeae:
orf126.pep MTRIAILGGGLSGRLTALQLAEQGYQIALFDKSCRRGEHAAAYVAAAMLAPAAXTVEATP 60
|||||:||||||||||||||||||||| ||||: |:||||||||||||||||| :|||||
orf126ng MTRIAVLGGGLSGRLTALQLAEQGYQIELFDKGTRQGEHAAAYVAAAMLAPAAEAVEATP 60
orf126.pep EVVRLGRQSIPLWRGIRCRLNTHTMMQENGSLIVWHGQDKPLSSEFVRHLKRGGXTDDEI 120
||:||||||||||||||||||| ||||||||||||||||||||||||||||||| :||||
orf126ng EVIRLGRQSIPLWRGIRCRLNTLTMMQENGSLIVWHGQDKPLSSEFVRHLKRGGVADDEI 120
orf126.pep VRWRADDIAEREPQLGGRFXDGIYLPTEXQLDGRQLXSALADALDELNVPCHWEHECVPE 180
||||||:|||||||||||| |||||||| ||||||: ||||||||||||||||||||:|:
orf126ng VRWRADEIAEREPQLGGRFSDGIYLPTEGQLDGRQILSALADALDELNVPCHWEHECAPQ 180
The predicted ORF126ng nucleotide sequence <SEQ ID 815> encodes a protein having the amino acid sequence <SEQ ID 816>:
1 MTRIAVLGGG LSGRLTALQL AEQGYQIELF DKGTRQGEHA AAYVAAAMLA
51 PAAEAVEATP EVIRLGRQSI PLWRGIRCRL NTLTMMQENG SLIVWHGQDK
101 PLSSEFVRHL KRGGVADDEI VRWRADEIAE REPQLGGRFS DGIYLPTEGQ
151 LDGRQILSAL ADALDELNVP CHWEHECAPQ DLQAQYDWVI DCRGYGAKTA
201 WNQSPEHTST LRGIRGEVRG FTRPKSRSTA PCACCTRAIR STSPRKKTTS
251 SSSARPKSKA KAKPPPAYVP GWNSYPRSMP STPPSAKPTS SKWRPGLRPT
301 LNHHNPEIRY SRERRLIEIN GLFRHGFMIS PAVTAAAVRL AVALFDGKDA
351 PERDEESGLA YIGRQD*
Further work revealed the following gonococcal DNA sequence <SEQ ID 817>:
1 ATGACCCGTA TCGCCGTCCT CGGAGGCGGC CTTTCCGGAA GGCTGACCGC
51 ATTGCAGCTT GCAGAACAAG GTTATCAGAT TGAACTTTTC GACAAGGGCA
101 CCCGCCAAGG CGAACACGCC GCCGCCTATG TTGCCGCCGC GATGCTCGCG
151 CCTGCGGCGG AAGCGGTCGA GGCAACGCCC GAAGTCATCA GGCTGGGCAG
201 GCAGAGCATT CCGCTTTGGC GCGGCATCCG ATGCCGTCTG AACACGCTCA
251 CGATGATGCA GGAAAACGGC AGCCTGATTG TGTGGCACGG GCAGGACAAG
301 CCATTATCCA GCGAGTTCGT CCGCCATCTC AAACGCGGCG GCGTAGCGGA
351 TGACGAAATC GTCCGTTGGC GCGCCGATGA AATCGCCGAA CGCGAACCGC
401 AACTCGGCGG ACGTTTTTCA GACGGCATCT ACCTGCCGAC CGAAGGCCAG
451 CTCGACGGGC GGCAAATATT GTCTGCACTT GCCGACGCTT TGGACGAACT
501 GAACGTCCCT TGCCATTGGG AACACGAATG CGCCCCCCAA GACCTGCAAG
551 CCCAATACGA CTGGGTAATC GACTGCCGGG GCTACGGCGC GAAAACCGCG
601 TGGAACCAAT CCCCCGAGCA CACCAGCACC TTGCGCGGCA TACGCGGCGA
651 AGTGGCGCGG GTTTACACGC CCGAAATCAC GCTCAACCGC CCCGTGCGCC
701 TGCTGCACCC GCGCTATCCG CTCTACATCG CCCCGAAAGA AAACCACGTC
751 TTCGTCATCG GCGCGACCCA AATCGAAAGC GAAAGCCAAG CCCCCGCCAG
801 CGTACGTTCC GGGCTGGAAC TCTTATCCGC GCTCTATGCC GTCCACCCCG
851 CCTTCGGCGA AGCCGACATC CTCGAAATCG CCGCCGGCCT GCGCCCCACG
901 CTCAACCACC ACAACCCCGA AATCCGCTAC AGCCGCGAAC GCCGCCTCAT
951 CGAAATCAAC GGCCTTTTCC GGCACGGCTT TATGATTTCC CCCGCCGTAA
1001 CCGCCGCCGC CGTCAGATTG GCAGTGGCAC TGTTTGACGG AAAAGACGCG
1051 CCCGAACGTG ATGAAGAAAG CGGTTTGGCG TATATCGGAA GACAAGATTA
1101 A
This corresponds to the amino acid sequence <SEQ ID 818; ORF126ng-1>:
1 MTRIAVLGGG LSGRLTALQL AEQGYQIELF DKGTRQGEHA AAYVAAAMLA
51 PAAEAVEATP EVIRLGRQSI PLWRGIRCRL NTLTMMQENG SLIVWHGQDK
101 PLSSEFVRHL KRGGVADDEI VRWRADEIAE REPQLGGRFS DGIYLPTEGQ
151 LDGRQILSAL ADALDELNVP CHWEHECAPQ DLQAQYDWVI DCRGYGAKTA
201 WNQSPEHTST LRGIRGEVAR VYTPEITLNR PVRLLHPRYP LYIAPKENHV
251 FVIGATQIES ESQAPASVRS GLELLSALYA VHPAFGEADI LEIAAGLRPT
301 LNHHNPEIRY SRERRLIEIN GLFRHGFMIS PAVTAAAVRL AVALFDGKDA
351 PERDEESGLA YIGRQD*
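The listings in this document print each sequence in blocks of ten residues, with a running position number at the start of each line and `*` marking the stop. As a minimal sketch (the function name `parse_sequence_block` is hypothetical and not part of the original disclosure), such a listing can be collapsed into one contiguous string before any comparison is made:

```python
import re


def parse_sequence_block(lines):
    """Join a numbered, space-separated sequence listing into one string.

    Position numbers are skipped, and anything that is not a letter
    (the '*' stop marker, '..' gap markers) is dropped.
    """
    residues = []
    for line in lines:
        for token in line.split():
            if token.isdigit():
                continue  # running position number, not sequence
            residues.append(re.sub(r"[^A-Za-z]", "", token))
    return "".join(residues)


# Last two lines of the ORF126ng-1 listing <SEQ ID 818>
block = [
    "301 LNHHNPEIRY SRERRLIEIN GLFRHGFMIS PAVTAAAVRL AVALFDGKDA",
    "351 PERDEESGLA YIGRQD*",
]
tail = parse_sequence_block(block)
print(tail, len(tail))
```

Because the parser ignores line boundaries, it also tolerates a listing line that has been wrapped in the middle of a block.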
ORF126ng-1 and ORF126-1 show 95.1% identity over a 366-amino-acid overlap:
10 20 30 40 50 60
orf126-1.pep MTRIAILGGGLSGRLTALQLAEQGYQIALFDKGCRRGEHAAAYVAAAMLAPAAEAVEATP
|||||:|||||||||||||| |||||| ||||| |:||||||||||||||||||||||||
orf126ng-1 MTRIAVLGGGLSGRLTALQLAEQGYQIELFDKGTRQGEHAAAYVAAAMLAPAAEAVEATP
10 20 30 40 50 60
70 80 90 100 110 120
orf126-1.pep EVVRLGRQSIPLWRGIRCRLNTHTMMQENGSLIVWHGQDKPLSSEFVRHLKRGGVADDEI
||:||||||||||||||||||| |||||||||||||||||||||||||||||||||||||
orf126ng-1 EVIRLGRQSIPLWRGIRCRLNTLTMMQENGSLIVWHGQDKPLSSEFVRHLKRGGVADDEI
70 80 90 100 110 120
130 140 150 160 170 180
orf126-1.pep VRWRADDIAEREPQLGGRFSDGIYLPTEGQLDGRQILSALADALDELNVPCHWEHECVPE
||||||:||||||||||||||||||||||||||||||||||||||||||||||||||:|:
orf126ng-1 VRWRADEIAEREPQLGGRFSDGIYLPTEGQLDGRQILSALADALDELNVPCHWEHECAPQ
130 140 150 160 170 180
190 200 210 220 230 240
orf126-1.pep GLQAQYDWLIDCRGYGAKTAWNQSPEHTSTLRGIRGEVARVYTPEITLNRPVRLLHPRYP
||||||||:|||||||||||||||||||||||||||||||||||||||||||||||||||
orf126ng-1 DLQAQYDWVIDCRGYGAKTAWNQSPEHTSTLRGIRGEVARVYTPEITLNRPVRLLHPRYP
190 200 210 220 230 240
250 260 270 280 290 300
orf126-1.pep LYIAPKENHVFVIGATQIESESQAPASVRSGLELLSALYAIHPAFGEADILEIATGLRPT
||||||||||||||||||||||||||||||||||||||||:|||||||||||||:|||||
orf126ng-1 LYIAPKENHVFVIGATQIESESQAPASVRSGLELLSALYAVHPAFGEADILEIAAGLRPT
250 260 270 280 290 300
310 320 330 340 350 360
orf126-1.pep LNHHNPEIRYNRARRLIEINGLFRHGFMISPAVTAAAARLAVALFDGKDAPERDKESGLA
||||||||||:| ||||||||||||||||||||||||:||||||||||||||||:|||||
orf126ng-1 LNHHNPEIRYSRERRLIEINGLFRHGFMISPAVTAAAVRLAVALFDGKDAPERDEESGLA
310 320 330 340 350 360
orf126-1.pep YIRRQDX
|| ||||
orf126ng-1 YIGRQDX
In addition, ORF126ng-1 shows homology with a putative Rhizobium oxidase flavoprotein:
gi|2627327 (AF004408) putative amino acid oxidase flavoprotein [Rhizobium etli] Length = 327
Score = 169 bits (423), Expect = 3e-41
Identities = 112/329 (34%), Positives = 163/329 (49%), Gaps = 25/329 (7%)
Query: 3 RIAVLGGGLSGRLTALQLAEQGYQIELFDKGTRQGEHXXXXXXXXXXXXXXXXXXXXXXX 62
RI V G G++G A QL G+++ L ++ G
Sbjct: 2 RILVNGAGVAGLTVAWQLYRHGFRVTLAERAGTVGA-GASGFAGGMLAPWCERESAEEPV 60
Query: 63 IRLGRQSIPLWRGIRCRLNTLTMMQENGSLIVWHGQDKPLSSEFVRHLKRGGVADDEIVR 122
+ LGR + W + G+L+V G+D F R G DE+
Sbjct: 61 LTLGRLAADWWEAA-----LPGHVHRRGTLVVAGGRDTGELDRFSRRTS-GWEWLDEVA- 113
Query: 123 WRADEIAEREPQLGGRFSDGIYLPTEGQLDGRQILSALADALDELNVPCHWEHECAPQDL 182
IA EP L GRF ++ E LD RQ L+ALA L++ + +
Sbjct: 114 -----IAALEPDLAGRFRRALFFRQEAHLDPRQALAALAAGLEDARMRLTLG---VVGES 165
Query: 183 QAQYDWVIDCRGYGAKTAWNQSPEHTSTLRGIRGEVARVYTPEITLNRPVRLLHPRYPLY 242
+D V+DC G LRG+RGE+ V T E++L+RPVRLLHPR+P+Y
Sbjct: 166 DVDHDRVVDCTGAA-------QIGRLPGLRGVRGEMLCVETTEVSLSRPVRLLHPRHPIY 218
Query: 243 IAPKENHVFVIGATQIESESQAPASVRSGLELLSALYAVHPAFGEADILEIAAGLRPTLN 302
I P++ + F++GAT IES+ P + RS +ELL+A YA+HPAFGEA + E AG+RP
Sbjct: 219 IVPRDKNRFMVGATMIESDDGGPITARSLMELLNAAYAMHPAFGEARVTETGAGVRPAYP 278
Query: 303 HHNPEIRYSRERRLIEINGLFRHGFMISP 331
+P R ++E R + +NGL+RHGF+++P
Sbjct: 279 DNLP--RVTQEGRTLHVNGLYRHGFLLAP 305
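The summary statistics of the database hit above are reported as fractions with a percentage in parentheses. The following sketch recomputes those percentages from the fractions; it uses the English BLAST field names, the helper name `blast_percentages` is hypothetical, and the use of integer truncation rather than rounding is an assumption that happens to reproduce the reported values (163/329 is 49.5%, printed as 49%).

```python
import re


def blast_percentages(stats_line):
    """Return {label: (matches, length, percent)} for each 'label = n/m' pair.

    Percentages are truncated to whole numbers (assumed convention).
    """
    result = {}
    for label, n, m in re.findall(r"(\w+)\s*=\s*(\d+)/(\d+)", stats_line):
        n, m = int(n), int(m)
        result[label] = (n, m, 100 * n // m)
    return result


stats = blast_percentages(
    "Identities = 112/329 (34%), Positives = 163/329 (49%), Gaps = 25/329 (7%)"
)
for label, (n, m, pct) in stats.items():
    print(f"{label}: {n}/{m} = {pct}%")
```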
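Based on this analysis, it is predicted that this protein from N. meningitidis and N. gonorrhoeae, and its epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies.
Example 97
The following DNA sequence, believed to be complete, was identified in N. meningitidis <SEQ ID 819>: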
1 ATGACTGATA ATCGGGGGTT TACGCTGGTT GAATTAATAT CAGTGGTCTT
51 GATATTGTCT GTACTTGCTT TAATTGTTTA TCCGAGCTAT CGCAATTATG
101 TTGAGAAAGC AAAGATAAAT GCAGTGCGGG CAGCCTTGTT AGAAAATGCA
151 CATTTTATGG AAAAGTTTTA TCTGCAGAAT GGGAGGTTTA AACAAACATC
201 TACCAAGTGG CCAAGTTTGC CGATTAAAGA GGCAGAAGGC TTTTGTATCC
251 GTTTGAATGG AATCGtCGCG CGGG..GCTT TAGACAGTAA ATTCATGTTG
301 AAGGCGGTAG CCATAGATAA AGATAAAAAT CCTTTTATTA TTAAGATGAA
351 TGAAAATCTA GTAACCTTTATTTGCAAGA AGTCCGCCAG TTCGTGTAGT
401 GACGGGCTGG ATTATTTTAA AGGAAATGAT AAGGACTGCA AGTTACTTAA
451 GTAG
This corresponds to the amino acid sequence <SEQ ID 820; ORF127>:
1 MTDNRGFTLV ELISVVLILS VLALIVYPSY RNYVEKAKIN AVRAALLENA
51 HFMEKFYLQN GRFKQTSTKW PSLPIKEAEG FCIRLNGIVA RXALDSKFML
101 KAVAIDKDKN PFIIKMNENL VTFICKKSAS SCSDGLDYFK GNDKDCKLLK
151 *
Further work revealed the following DNA sequence <SEQ ID 821>:
1 ATGACTGATA ATCGGGGGTT TACGCTGGTT GAATTAATAT CAGTGGTCTT
51 GATATTGTCT GTACTTGCTT TAATTGTTTA TCCGAGCTAT CGCAATTATG
101 TTGAGAAAGC AAAGATAAAT GCAGTGCGGG CAGCCTTGTT AGAAAATGCA
151 CATTTTATGG AAAAGTTTTA TCTGCAGAAT GGGAGGTTTA AACAAACATC
201 TACCAAGTGG CCAAGTTTGC CGATTAAAGA GGCAGAAGGC TTTTGTATCC
251 GTTTGAATGG AATCGCGCGC GGGGCTTTAG ACAGTAAATT CATGTTGAAG
301 GCGGTAGCCA TAGATAAAGA TAAAAATCCT TTTATTATTA AGATGAATGA
351 AAATCTAGTA ACCTTTATTT GCAAGAAGTC CGCCAGTTCG TGTAGTGACG
401 GGCTGGATTA TTTTAAAGGA AATGATAAGG ACTGCAAGTT ACTTAAGTAG
This corresponds to the amino acid sequence <SEQ ID 822; ORF127-1>:
1 MTDNRGFTLV ELISVVLILS VLALIVYPSY RNYVEKAKIN AVRAALLENA
51 HFMEKFYLQN GRFKQTSTKW PSLPIKEAEG FCIRLNGIAR GALDSKFMLK
101 AVAIDKDKNP FIIKMNENLV TFICKKSASS CSDGLDYFKG NDKDCKLLK*
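The correspondence between a DNA listing and its amino acid sequence can be checked by translating the open reading frame codon by codon. The sketch below is illustrative only: it assumes the standard genetic code, reads frame 1 from the first base, and writes `X` for any codon it cannot resolve (the names `CODON_TABLE` and `translate` are hypothetical). It is applied to the first 30 and last 12 bases of <SEQ ID 821>.

```python
BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna):
    """Translate frame 1 of a DNA string; unresolved codons become 'X'."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3  # ignore a trailing partial codon
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, usable, 3)
    )


start = translate("ATGACTGATAATCGGGGGTTTACGCTGGTT")  # bases 1-30
end = translate("TTACTTAAGTAG")                      # bases 439-450
print(start, end)
```

The two fragments translate to the first ten residues and the final `LLK*` of the ORF127-1 listing.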
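Computer analysis of this amino acid sequence gave the following results:
Homology with a predicted ORF from N. meningitidis (strain A)
ORF127 and the ORF from N. meningitidis strain A (ORF127a) show 98.0% identity over a 150-amino-acid overlap: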
10 20 30 40 50 60
orf127.pep MTDNRGFTLVELISVVLILSVLALIVYPSYRNYVEKAKINAVRAALLENAHFMEKFYLQN
||||||||||||||||||||||||||||||||||||:|||||||||||||||||||||||
orf127a MTDNRGFTLVELISVVLILSVLALIVYPSYRNYVEKAKINTVRAALLENAHFMEKFYLQN
10 20 30 40 50 60
70 80 90 100 110 120
orf127.pep GRFKQTSTKWPSLPIKEAEGFCIRLNGIVARXALDSKFMLKAVAIDKDKNPFIIKMNENL
|||||||||||||||||||||||||||| || ||||||||||||||||||||||||||||
orf127a GRFKQTSTKWPSLPIKEAEGFCIRLNGI-ARGALDSKFMLKAVAIDKDKNPFIIKMNENL
70 80 90 100 110
130 140 150
orf127.pep VTFICKKSASSCSDGLDYFKGNDKDCKLLKX
|||||||||||||||||||||||||||||||
orf127a VTFICKKSASSCSDGLDYFKGNDKDCKLLKX
120 130 140 150
The full-length ORF127a nucleotide sequence <SEQ ID 823> is:
1 ATGACTGATA ATCGGGGGTT TACGCTGGTT GAATTAATAT CAGTGGTCTT
51 GATATTGTCT GTACTTGCTT TAATTGTTTA TCCGAGCTAT CGCAATTATG
101 TTGAGAAAGC AAAGATAAAT ACAGTGCGGG CAGCCTTGTT AGAAAATGCA
151 CATTTTATGG AAAAGTTTTA TCTGCAGAAT GGGAGATTTA AACAAACATC
201 TACCAAATGG CCAAGTTTGC CGATTAAAGA GGCAGAAGGC TTTTGTATCC
251 GTTTGAATGG AATCGCGCGC GGGGCCTTAG ACAGTAAATT CATGTTGAAG
301 GCGGTAGCCA TAGATAAAGA TAAAAATCCT TTTATTATTA AGATGAATGA
351 AAATCTAGTA ACCTTTATTT GCAAGAAGTC CGCCAGTTCG TGTAGTGACG
401 GGCTGGATTA TTTTAAAGGA AATGATAAGG ACTGCAAGTT ACTTAAGTAG
It encodes a protein having the amino acid sequence <SEQ ID 824>:
1 MTDNRGFTLV ELISVVLILS VLALIVYPSY RNYVEKAKIN TVRAALLENA
51 HFMEKFYLQN GRFKQTSTKW PSLPIKEAEG FCIRLNGIAR GALDSKFMLK
101 AVAIDKDKNP FIIKMNENLV TFICKKSASS CSDGLDYFKG NDKDCKLLK*
ORF127a and ORF127-1 show 99.3% identity over a 149-amino-acid overlap:
10 20 30 40 50 60
orf127a.pep MTDNRGFTLVELISVVLILSVLALIVYPSYRNYVEKAKINTVRAALLENAHFMEKFYLQN
||||||||||||||||||||||||||||||||||||||||:|||||||||||||||||||
orf127-1 MTDNRGFTLVELISVVLILSVLALIVYPSYRNYVEKAKINAVRAALLENAHFMEKFYLQN
10 20 30 40 50 60
70 80 90 100 110 120
orf127a.pep GRFKQTSTKWPSLPIKEAEGFCIRLNGIARGALDSKFMLKAVAIDKDKNPFIIKMNENLV
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf127-1 GRFKQTSTKWPSLPIKEAEGFCIRLNGIARGALDSKFMLKAVAIDKDKNPFIIKMNENLV
70 80 90 100 110 120
130 140 150
orf127a.pep TFICKKSASSCSDGLDYFKGNDKDCKLLKX
||||||||||||||||||||||||||||||
orf127-1 TFICKKSASSCSDGLDYFKGNDKDCKLLKX
130 140 150
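The identity figure quoted for an ungapped alignment such as the one above is simply the fraction of aligned positions carrying the same residue. A minimal sketch follows (the function name `percent_identity` is hypothetical, and the stop marker is left out of both sequences): ORF127a and ORF127-1 differ only at position 41 (T versus A), which gives 148 matches over 149 residues.

```python
def percent_identity(a, b):
    """Return (matches, length, percent) for two equal-length aligned strings."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have the same length")
    matches = sum(x == y for x, y in zip(a, b))
    return matches, len(a), 100.0 * matches / len(a)


# ORF127a <SEQ ID 824>, written in blocks of ten as in the listing
ORF127A = (
    "MTDNRGFTLV" "ELISVVLILS" "VLALIVYPSY" "RNYVEKAKIN" "TVRAALLENA"
    "HFMEKFYLQN" "GRFKQTSTKW" "PSLPIKEAEG" "FCIRLNGIAR" "GALDSKFMLK"
    "AVAIDKDKNP" "FIIKMNENLV" "TFICKKSASS" "CSDGLDYFKG" "NDKDCKLLK"
)
# ORF127-1 <SEQ ID 822> carries A instead of T at position 41
ORF127_1 = ORF127A[:40] + "A" + ORF127A[41:]

matches, length, pct = percent_identity(ORF127A, ORF127_1)
print(f"{matches}/{length} = {pct:.1f}%")
```

Gapped alignments need the gap columns handled explicitly, which this sketch does not attempt.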
Homology with a predicted ORF from N. gonorrhoeae
ORF127 and the predicted ORF from N. gonorrhoeae (ORF127ng) show 97.3% identity over a 150-amino-acid overlap:
orf127.pep MTDNRGFTLVELISVVLILSVLALIVYPSYRNYVEKAKINAVRAALLENAHFMEKFYLQN 60
|||||||||||||||||||||||||||||||||||||||||||||:||||||||||||||
orf127ng MTDNRGFTLVELISVVLILSVLALIVYPSYRNYVEKAKINAVRAAFLENAHFMEKFYLQN 60
orf127.pep GRFKQTSTKWPSLPIKEAEGFCIRLNGIVARXALDSKFMLKAVAIDKDKNPFIIKMNENL 120
|||||||||||||||||||||||||||| || ||||||||||||||||||||||||||||
orf127ng GRFKQTSTKWPSLPIKEAEGFCIRLNGI-ARGALDSKFMLKAVAIDKDKNPFIIKMNENL 119
orf127.pep VTFICKKSASSCSDGLDYFKGNDKDCKLLK 150
|||||||||||||| |||||||||||||||
orf127ng VTFICKKSASSCSDRLDYFKGNDKDCKLLK 149
The full-length ORF127ng nucleotide sequence <SEQ ID 825> is:
1 ATGACTGATA ATCGGGGGTT TACACTGGTT GAATTAATAT CAGTGGTCTT
51 GATATTGTCT GTACTTGCTT TAATTGTTTA TCCGAGCTAT CGCAATTATG
101 TTGAGAAAGC AAAGATAAAT GCAGTGCGGG CAGCCTTGTT AGAAAATGCA
151 CATTTTATGG AAAAGTTTTA TCTGCAGAAT GGGAGATTTA AACAAACATC
201 TACCAAATGG CCAAGTTTGC CGATTAAAGA GGCAGAAGGC TTTTGTATCC
251 GTTTGAATGG AATCGCGCGC GGGGCTTTAG ACAGTAAATT CATGTTGAAG
301 GCGGTAGCCA TAGATAAAGA TAAAAATCCT TTTATTATTA AGATGAATGA
351 AAATCTAGTA ACCTTTATTT GCAAGAAGTC CGCCAGTTCG TGTAGTGACG
401 GGCTGGATTA TTTTAAAGGA AATGATAAGG ACTGCAAGTT ACTTAAGTAG
It encodes a protein having the amino acid sequence <SEQ ID 826>:
1 MTDNRGFTLV ELISVVLILS VLALIVYPSY RNYVEKAKIN AVRAAFLENA
51 HFMEKFYLQN GRFKQTSTKW PSLPIKEAEG FCIRLNGIAR GALDSKFMLK
101 AVAIDKDKNP FIIKMNENLV TFICKKSASS CSDRLDYFKG NDKDCKLLK*
ORF127ng and ORF127-1 show 100.0% identity over a 149-amino-acid overlap:
10 20 30 40 50 60
orf127-1.pep MTDNRGFTLVELISVVLILSVLALIVYPSYRNYVEKAKINAVRAALLENAHFMEKFYLQN
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf127ng-1 MTDNRGFTLVELISVVLILSVLALIVYPSYRNYVEKAKINAVRAALLENAHFMEKFYLQN
10 20 30 40 50 60
70 80 90 100 110 120
orf127-1.pep GRFKQTSTKWPSLPIKEAEGFCIRLNGIARGALDSKFMLKAVAIDKDKNPFIIKMNENLV
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf127ng-1 GRFKQTSTKWPSLPIKEAEGFCIRLNGIARGALDSKFMLKAVAIDKDKNPFIIKMNENLV
70 80 90 100 110 120
130 140 150
orf127-1.pep TFICKKSASSCSDGLDYFKGNDKDCKLLKX
||||||||||||||||||||||||||||||
orf127ng-1 TFICKKSASSCSDGLDYFKGNDKDCKLLKX
130 140 150
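Based on this analysis, including the fact that both the meningococcal and gonococcal proteins have a predicted transmembrane domain, it is predicted that this protein from N. meningitidis and N. gonorrhoeae, and its epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies.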
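The text does not say which program predicted the transmembrane domain. One common approach, shown here purely as an assumed illustration, is a sliding-window average of Kyte-Doolittle hydropathy values, where a 19-residue window averaging above about 1.6 is usually taken to suggest a membrane-spanning segment. The function name `hydropathy_windows` and the choice of the first 40 residues of ORF127-1 are illustrative.

```python
# Kyte-Doolittle hydropathy index for the 20 standard amino acids
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5, "Q": -3.5,
    "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5, "L": 3.8, "K": -3.9,
    "M": 1.9, "F": 2.8, "P": -1.6, "S": -0.8, "T": -0.7, "W": -0.9,
    "Y": -1.3, "V": 4.2,
}


def hydropathy_windows(seq, window=19):
    """Return (1-based start, mean hydropathy) for every full window."""
    scores = [KD[r] for r in seq]
    return [
        (i + 1, sum(scores[i:i + window]) / window)
        for i in range(len(seq) - window + 1)
    ]


# First 40 residues of ORF127-1 <SEQ ID 822>
N_TERM = "MTDNRGFTLVELISVVLILSVLALIVYPSYRNYVEKAKIN"
windows = hydropathy_windows(N_TERM)
best_start, best_score = max(windows, key=lambda w: w[1])
print(f"most hydrophobic window starts at residue {best_start}: {best_score:.2f}")
```

On this fragment the highest-scoring window lies over the hydrophobic stretch that follows the first few N-terminal residues, consistent with the single predicted transmembrane domain mentioned above.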
Example 98
The following partial DNA sequence was identified in N. meningitidis <SEQ ID 827>:
1 ..GTGTCGCTGG CTTCGGTGAT TGCCTCTCAA ATCTTCCTTT ACGAAGATTT
51 CAACCAAATG CGGAAAACCC GTGGAGCTAT CTGCGGTTTT CTTGTCCAAT
101 ATTTATCTGG GGTTTCAGCA GGGGTATTTC GATTTGAGTG CCGACGAGAA
151 CCCCGTACTG CATATCTGGT CTTTGGCAGT AGAGGAACAG TATTACCTCC
201 TGTATCCCCT TTTGCTGATA TTTTGCTGCA AAAAAACCAA ATCGCTACGG
251 GTGCTGCGTA ACATCAGCAT CATCCTGTTT TTGATTTTGA CTGCCTCATC
301 GTTTTTGCCA AGCGGGTTTT ATACCGACAT CCTCAACCAA CCCAATACTT
351 ATTACCTTTC GACACTGAGG TTTCCCGAGC TGTTGGCAGG TTCGCTGCTG
401 GCGGTTTACG GGCAAACGCA AAACGGCAGA CGGCAAACAG CAAATGGAAA
451 ACGGCAGTTG CTTTCATCAC TCTGCTTCGG CGCATTGCTT GCCTGCCTGT
501 TCGTGATTGA CAAACACAAT CCGTTTATCC CGGGAATGAC CCTGCTCCTT
551 CCCTGCCTGC TGACGGCACT GCTTATCCGG AGTATGCAAT ACGGGACACT
601 TCCGACCCGC ATCCTGTCGG CAAGCCCCAT CGTATTTGTC GGCAAAATCT
651 CTTATTCCCT ATACCTGTAC CATTGGATTT TTATTGCTTT CGCTCCGCTC
701 ATTAGAGGCG GGAAACAGCT CGGACTGCCT GCCG..
This corresponds to the amino acid sequence <SEQ ID 828; ORF128>:
1 ..VSLASVIASQ IFLYEDFNQM RKTVELSAVF LSNIYLGFQQ GYFDLSADEN
51 PVLHIWSLAV EEQYYLLYPL LLIFCCKKTK SLRVLRNISI ILFLILTASS
101 FLPSGFYTDI LNQPNTYYLS TLRFPELLAG SLLAVYGQTQ NGRRQTANGK
151 RQLLSSLCFG ALLACLFVID KHNPFIPGMT LLLPCLLTAL LIRSMQYGTL
201 PTRILSASPI VFVGKISYSL YLYHWIFIAF APLIRGGKQL GLPA..
Further work revealed the complete nucleotide sequence <SEQ ID 829>:
1 ATGCAAGCTG TCCGATACAG ACCGGAAATT GACGGATTGC GGGCCGTCGC
51 CGTGCTATCC GTCATGATTT TCCACCTGAA TAACCGCTGG CTGCCCGGAG
101 GATTCCTGGG GGTGGACATT TTCTTTGTCA TCTCAGGATT CCTCATTACC
151 GGCATCATTC TTTCTGAAAT ACAGAACGGT TCTTTTTCTT TCCGGGATTT
201 TTATACCCGC AGGATTAAGC GGATTTATCC TGCCTTTATT GCGGCCGTGT
251 CGCTGGCTTC GGTGATTGCC TCTCAAATCT TCCTTTACGA AGATTTCAAC
301 CAAATGCGGA AAACCGTGGA GCTTTCTGCG GTTTTCTTGT CCAATATTTA
351 TCTGGGGTTT CAGCAGGGGT ATTTCGATTT GAGTGCCGAC GAGAACCCCG
401 TACTGCATAT CTGGTCTTTG GCAGTAGAGG AACAGTATTA CCTCCTGTAT
451 CCCCTTTTGC TGATATTTTG CTGCAAAAAA ACCAAATCGC TACGGGTGCT
501 GCGTAACATC AGCATCATCC TGTTTTTGAT TTTGACTGCC TCATCGTTTT
551 TGCCAAGCGG GTTTTATACC GACATCCTCA ACCAACCCAA TACTTATTAC
601 CTTTCGACAC TGAGGTTTCC CGAGCTGTTG GCAGGTTCGC TGCTGGCGGT
651 TTACGGGCAA ACGCAAAACG GCAGACGGCA AACAGCAAAT GGAAAACGGC
701 AGTTGCTTTC ATCACTCTGC TTCGGCGCAT TGCTTGCCTG CCTGTTCGTG
751 ATTGACAAAC ACAATCCGTT TATCCCGGGA ATGACCCTGC TCCTTCCCTG
801 CCTGCTGACG GCACTGCTTA TCCGGAGTAT GCAATACGGG ACACTTCCGA
851 CCCGCATCCT GTCGGCAAGC CCCATCGTAT TTGTCGGCAA AATCTCTTAT
901 TCCCTATACC TGTACCATTG GATTTTTATT GCTTTCGCCC ATTACATTAC
951 AGGCGACAAA CAGCTCGGAC TGCCTGCCGT ATCGGCGGTT GCCGCGTTGA
1001 CGGCCGGATT TTCCCTGTTG AGTTATTATT TGATTGAACA GCCGCTTAGA
1051 AAACGGAAGA TGACCTTCAA AAAGGCATTT TTCTGCCTCT ATCTCGCCCC
1101 GTCCCTGATA CTTGTCGGTT ACAACCTGTA CGCAAGGGGG ATATTGAAAC
1151 AGGAACACCT CCGCCCGTTG CCCGGCGCGC CCCTTGCTGC GGAAAATCAT
1201 TTTCCGGAAA CCGTCCTGAC CCTCGGCGAC TCGCACGCCG GACACCTGAG
1251 GGGGTTTCTG GATTATGTCG GCAGCCGGGA AGGGTGGAAA GCCAAAATCC
1301 TGTCCCTCGA TTCGGAGTGT TTGGTTTGGG TAGATGAGAA GCTGGCAGAC
1351 AACCCGTTAT GTCGAAAATA CCGGGATGAA GTTGAAAAAG CCGAAGCCGT
1401 TTTCATTGCC CAATTCTATG ATTTGAGGAT GGGCGGCCAG CCTGTGCCGA
1451 GATTTGAAGC GCAATCCTTC CTAATACCCG GGTTCCCAGC CCGATTCAGG
1501 GAAACCGTCA AAAGGATAGC CGCCGTCAAA CCCGTCTATG TTTTTGCAAA
1551 CAACACATCA ATCAGCCGTT CGCCCCTGAG GGAGGAAAAA TTGAAAAGAT
1601 TTGCCGCAAA CCAATATCTC CGCCCCATTC AGGCTATGGG CGACATCGGC
1651 AAGAGCAATC AGGCGGTCTT TGATTTGATT AAAGATATTC CCAATGTGCA
1701 TTGGGTGGAC GCACAAAAAT ACCTGCCCAA AAACACGGTC GAAATATACG
1751 GCCGCTATCT TTACGGCGAC CAAGACCACC TGACCTATTT CGGTTCTTAT
1801 TATATGGGGC GGGAATTCCA CAAACACGAA CGCCTGCTTA AATCTTCCCA
1851 CGGCGGCGCA TTGCAGTAG
This corresponds to the amino acid sequence <SEQ ID 830; ORF128-1>:
1 MQAVRYRPEI DGLRAVAVLS VMIFHLNNRW LPGGFLGVDI FFVISGFLIT
51 GIILSEIQNG SFSFRDFYTR RIKRIYPAFI AAVSLASVIA SQIFLYEDFN
101 QMRKTVELSA VFLSNIYLGF QQGYFDLSAD ENPVLHIWSL AVEEQYYLLY
151 PLLLIFCCKK TKSLRVLRNI SIILFLILTA SSFLPSGFYT DILNQPNTYY
201 LSTLRFPELL AGSLLAVYGQ TQNGRRQTAN GKRQLLSSLC FGALLACLFV
251 IDKHNPFIPG MTLLLPCLLT ALLIRSMQYG TLPTRILSAS PIVFVGKISY
301 SLYLYHWIFI AFAHYITGDK QLGLPAVSAV AALTAGFSLL SYYLIEQPLR
351 KRKMTFKKAF FCLYLAPSLI LVGYNLYARG ILKQEHLRPL PGAPLAAENH
401 FPETVLTLGD SHAGHLRGFL DYVGSREGWK AKILSLDSEC LVWVDEKLAD
451 NPLCRKYRDE VEKAEAVFIA QFYDLRMGGQ PVPRFEAQSF LIPGFPARFR
501 ETVKRIAAVK PVYVFANNTS ISRSPLREEK LKRFAANQYL RPIQAMGDIG
551 KSNQAVFDLI KDIPNVHWVD AQKYLPKNTV EIYGRYLYGD QDHLTYFGSY
601 YMGREFHKHE RLLKSSHGGA LQ*
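The partial sequence ORF128 <SEQ ID 828> can be placed within the complete ORF128-1 sequence by a plain substring search. The sketch below is illustrative (the helper name `locate_fragment` is hypothetical); it uses the first 100 residues of ORF128-1 and the first 18 residues of ORF128.

```python
# First 100 residues of ORF128-1 <SEQ ID 830>, in blocks of ten
ORF128_1_START = (
    "MQAVRYRPEI" "DGLRAVAVLS" "VMIFHLNNRW" "LPGGFLGVDI" "FFVISGFLIT"
    "GIILSEIQNG" "SFSFRDFYTR" "RIKRIYPAFI" "AAVSLASVIA" "SQIFLYEDFN"
)
# First 18 residues of the partial ORF128 <SEQ ID 828>
ORF128_START = "VSLASVIASQIFLYEDFN"


def locate_fragment(full, fragment):
    """Return the 1-based position of fragment in full, or None if absent."""
    idx = full.find(fragment)
    return idx + 1 if idx >= 0 else None


position = locate_fragment(ORF128_1_START, ORF128_START)
print(position)
```

The fragment begins at residue 83 of ORF128-1, so the complete sequence adds 82 residues upstream of the point where the partial sequence starts.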
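Computer analysis of this amino acid sequence gave the following results:
Homology with the hypothetical integral membrane protein HI0392 of Haemophilus influenzae (accession number U32723)
ORF128 and HI0392 show 52% amino acid identity over a 180-amino-acid overlap: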
Orf128:1 VSLASVIASQIFLYEDFNQMRKTVELSAVFLSNIYLGFQQGYFDLSADENPVLHIWSLAV 60
++L S IAS IF+Y DFN++RKT+EL+ FLSN YLG QGYFDLSA+ENPVLHIWSLAV
HI0392:46 MALVSFIASAIFIYNDFNKLRKTIELAIAFLSNFYLGLTQGYFDLSANENPVLHIWSLAV 105
Orf128:61 EEQXXXXXXXXXIFCCKKTKSLRVLRNISIILFLILTASSFLPSGFYTDILNQPNTYYLS 120
E Q I KK + ++VL I++ILF IL A+SF+ + FY ++L+QPN YYLS
HI0392:106 EGQYYLIFPLILILAYKKFREVKVLFIITLILFFILLATSFVSANFYKEVLHQPNIYYLS 165
Orf128:121 TLRFPELLAGSLLAVYGQTQNGRRQTANGKRQLLSSLCFGALLACLFVIDKHNPFIPGMT 180
LRFPELL GSLLA+Y N + Q + +L+ L L +CLF+++ + FIPG+T
HI0392:166 NLRFPELLVGSLLAIYHNLSN-KVQLSKQVNNILAILSTLLLFSCLFLMNNNIAFIPGIT 224
Homology with a predicted ORF from N. meningitidis (strain A)
ORF128 and the ORF from N. meningitidis strain A (ORF128a) show 98.0% identity over a 244-amino-acid overlap:
10 20 30
orf128.pep VSLASVIASQIFLYEDFNQMRKTVELSAVF
||||||||||||||||||||||||||||||
orf128a ILSEIQNGSFSFRDFYTRRIKRIYPAFIAAVSLASVIASQIFLYEDFNQMRKTVELSAVF
60 70 80 90 100 110
40 50 60 70 80 90
orf128.pep LSNIYLGFQQGYFDLSADENPVLHIWSLAVEEQYYLLYPLLLIFCCKKTKSLRVLRNISI
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf128a LSNIYLGFQQGYFDLSADENPVLHIWSLAVEEQYYLLYPLLLIFCCKKTKSLRVLRNISI
120 130 140 150 160 170
100 110 120 130 140 150
orf128.pep ILFLILTASSFLPSGFYTDILNQPNTYYLSTLRFPELLAGSLLAVYGQTQNGRRQTANGK
||||||||:|||||||||||||||||||||||||||||||||||||||||||||||||||
orf128a ILFLILTATSFLPSGFYTDILNQPNTYYLSTLRFPELLAGSLLAVYGQTQNGRRQTANGK
180 190 200 210 220 230
160 170 180 190 200 210
orf128.pep RQLLSSLCFGALLACLFVIDKHNPFIPGMTLLLPCLLTALLIRSMQYGTLPTRILSASPI
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf128a RQLLSSLCFGALLACLFVIDKHNPFIPGMTLLLPCLLTALLIRSMQYGTLPTRILSASPI
240 250 260 270 280 290
220 230 240
orf128.pep VFVGKISYSLYLYHWIFIAFAPLIRGGKQLGLPA
||||||||||||||||||||| | | |||||||
orf128a VFVGKISYSLYLYHWIFIAFAHYITGDKQLGLPAVSAVAALTAGFSLLSYYLIEQPLRKR
300 310 320 330 340 350
orf128a KMTFKKAFFCLYLAPSLILVGYNLYARGILKQEHLRPLPGAPLAAENHFPETVLTLGDSH
360 370 380 390 400 410
The full-length ORF128a nucleotide sequence <SEQ ID 831> is:
1 ATGCAAGCTG TCCGATACAG ACCGGAAATT GACGGATTGC GGGCCGTCGC
51 CGTGCTATCC GTCATGATTT TCCACCTGAA TAACCGCTGG CTGCCCGGAG
101 GATTCCTGGG GGTGGACATT TTCTTTGTCA TCTCAGGATT CCTCATTACC
151 GGCATCATTC TTTCTGAAAT ACAGAACGGT TCTTTTTCTT TCCGGGATTT
201 TTATACCCGC AGGATTAAGC GGATTTATCC TGCTTTTATT GCGGCCGTGT
251 CGCTGGCTTC GGTGATTGCC TCTCAAATCT TCCTTTACGA AGATTTCAAC
301 CAAATGCGGA AAACCGTGGA GCTTTCTGCG GTTTTCTTGT CCAATATTTA
351 TCTGGGGTTT CAGCAGGGGT ATTTCGATTT GAGTGCCGAC GAGAACCCCG
401 TACTGCATAT CTGGTCTTTG GCAGTAGAGG AACAGTATTA CCTCCTGTAT
451 CCTCTTTTGC TGATATTTTG CTGCAAAAAA ACAAAATCGC TACGGGTGCT
501 GCGTAACATC AGCATCATCC TATTTCTGAT TTTGACTGCC ACATCGTTTT
551 TGCCAAGCGG GTTTTATACC GATATTCTCA ACCAACCCAA TACTTATTAC
601 CTTTCGACAC TGAGGTTTCC CGAGCTGTTG GCAGGTTCGC TGCTGGCGGT
651 TTACGGGCAA ACGCAAAACG GCAGACGGCA AACAGCAAAT GGAAAACGGC
701 AGTTGCTTTC ATCACTCTGC TTCGGCGCAT TGCTTGCCTG CCTGTTCGTG
751 ATTGACAAAC ACAATCCGTT TATCCCGGGA ATGACCCTGC TCCTTCCCTG
801 CCTGCTGACG GCACTGCTTA TCCGGAGTAT GCAATACGGG ACACTTCCGA
851 CCCGCATCCT GTCGGCAAGC CCCATCGTAT TTGTCGGCAA AATCTCTTAT
901 TCCCTATACC TGTACCATTG GATTTTTATT GCTTTCGCCC ATTACATTAC
951 AGGCGACAAA CAGCTCGGAC TGCCTGCCGT ATCGGCGGTT GCCGCGTTGA
1001 CGGCCGGATT TTCCCTGTTG AGTTATTATT TGATTGAACA GCCGCTTAGA
1051 AAACGGAAGA TGACCTTCAA AAAGGCATTT TTCTGCCTCT ATCTCGCCCC
1101 GTCCCTGATA CTTGTCGGTT ACAACCTGTA CGCAAGGGGG ATATTGAAAC
1151 AGGAACACCT CCGCCCGTTG CCCGGCGCGC CCCTTGCTGC GGAAAATCAT
1201 TTTCCGGAAA CCGTCCTGAC CCTCGGCGAC TCGCACGCCG GACACCTGCG
1251 GGGGTTTCTG GATTATGTCG GCAGCCGGGA AGGGTGGAAA GCCAAAATCC
1301 TGTCCCTCGA TTCGGAGTGT TTGGTTTGGG TAGATGAGAA GCTGGCAGAC
1351 AACCCGTTAT GTCGAAAATA CCGGGATGAA GTTGAAAAAG CCGAAGCCGT
1401 TTTCATTGCC CAATTCTATG ATTTGAGGAT GGGCGGCCAG CCCGTGCCGA
1451 GATTTGAAGC GCAATCCTTC CTAATACCCG GGTTCCCAGC CCGATTCAGG
1501 GAAACCGTCA AAAGGATAGC CGCCGTCAAA CCCGTCTATG TTTTTGCAAA
1551 CAACACATCA ATCAGCCGTT CGCCCCTGAG GGAGGAAAAA TTGAAAAGAT
1601 TTGCCGCAAA CCAATATCTC CGCCCCATTC AGGCTATGGG CGACATCGGC
1651 AAGAGCAATC AGGCGGTCTT TGATTTGATT AAAGATATTC CCAATGTGCA
1701 TTGGGTGGAC GCACAAAAAT ACCTGCCCAA AAACACGGTC GAAATATACG
1751 GCCGCTATCT TTACGGCGAC CAAGACCACC TGACCTATTT CGGTTCTTAT
1801 TATATGGGGC GGGAATTTCA CAAACACGAA CGCCTGCTTA AATCTTCTCG
1851 CGACGGCGCA TTGCAGTAG
It encodes a protein having the amino acid sequence <SEQ ID 832>:
1 MQAVRYRPEI DGLRAVAVLS VMIFHLNNRW LPGGFLGVDI FFVISGFLIT
51 GIILSEIQNG SFSFRDFYTR RIKRIYPAFI AAVSLASVIA SQIFLYEDFN
101 QMRKTVELSA VFLSNIYLGF QQGYFDLSAD ENPVLHIWSL AVEEQYYLLY
151 PLLLIFCCKK TKSLRVLRNI SIILFLILTA TSFLPSGFYT DILNQPNTYY
201 LSTLRFPELL AGSLLAVYGQ TQNGRRQTAN GKRQLLSSLC FGALLACLFV
251 IDKHNPFIPG MTLLLPCLLT ALLIRSMQYG TLPTRILSAS PIVFVGKISY
301 SLYLYHWIFI AFAHYITGDK QLGLPAVSAV AALTAGFSLL SYYLIEQPLR
351 KRKMTFKKAF FCLYLAPSLI LVGYNLYARG ILKQEHLRPL PGAPLAAENH
401 FPETVLTLGD SHAGHLRGFL DYVGSREGWK AKILSLDSEC LVWVDEKLAD
451 NPLCRKYRDE VEKAEAVFIA QFYDLRMGGQ PVPRFEAQSF LIPGFPARFR
501 ETVKRIAAVK PVYVFANNTS ISRSPLREEK LKRFAANQYL RPIQAMGDIG
551 KSNQAVFDLI KDIPNVHWVD AQKYLPKNTV EIYGRYLYGD QDHLTYFGSY
601 YMGREFHKHE RLLKSSRDGA LQ*
ORF128a and ORF128-1 show 99.5% identity over a 622-amino-acid overlap:
orf128a.pep MQAVRYRPEIDGLRAVAVLSVMIFHLNNRWLPGGFLGVDIFFVISGFLITGIILSEIQNG
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf128-1 MQAVRYRPEIDGLRAVAVLSVMIFHLNNRWLPGGFLGVDIFFVISGFLITGIILSEIQNG
orf128a.pep SFSFRDFYTRRIKRIYPAFIAAVSLASVIASQIFLYEDFNQMRKTVELSAVFLSNIYLGF
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf128-1 SFSFRDFYTRRIKRIYPAFIAAVSLASVIASQIFLYEDFNQMRKTVELSAVFLSNIYLGF
orf128a.pep QQGYFDLSADENPVLHIWSLAVEEQYYLLYPLLLIFCCKKTKSLRVLRNISIILFLILTA
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf128-1 QQGYFDLSADENPVLHIWSLAVEEQYYLLYPLLLIFCCKKTKSLRVLRNISIILFLILTA
orf128a.pep TSFLPSGFYTDILNQPNTYYLSTLRFPELLAGSLLAVYGQTQNGRRQTANGKRQLLSSLC
:|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf128-1 SSFLPSGFYTDILNQPNTYYLSTLRFPELLAGSLLAVYGQTQNGRRQTANGKRQLLSSLC
orf128a.pep FGALLACLFVIDKHNPFIPGMTLLLPCLLTALLIRSMQYGTLPTRILSASPIVFVGKISY
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf128-1 FGALLACLFVIDKHNPFIPGMTLLLPCLLTALLIRSMQYGTLPTRILSASPIVFVGKISY
orf128a.pep SLYLYHWIFIAFAHYITGDKQLGLPAVSAVAALTAGFSLLSYYLIEQPLRKRKMTFKKAF
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf128-1 SLYLYHWIFIAFAHYITGDKQLGLPAVSAVAALTAGFSLLSYYLIEQPLRKRKMTFKKAF
orf128a.pep FCLYLAPSLILVGYNLYARGILKQEHLRPLPGAPLAAENHFPETVLTLGDSHAGHLRGFL
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf128-1 FCLYLAPSLILVGYNLYARGILKQEHLRPLPGAPLAAENHFPETVLTLGDSHAGHLRGFL
orf128a.pep DYVGSREGWKAKILSLDSECLVWVDEKLADNPLCRKYRDEVEKAEAVFIAQFYDLRMGGQ
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf128-1 DYVGSREGWKAKILSLDSECLVWVDEKLADNPLCRKYRDEVEKAEAVFIAQFYDLRMGGQ
orf128a.pep PVPRFEAQSFLIPGFPARFRETVKRIAAVKPVYVFANNTSISRSPLREEKLKRFAANQYL
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf128-1 PVPRFEAQSFLIPGFPARFRETVKRIAAVKPVYVFANNTSISRSPLREEKLKRFAANQYL
orf128a.pep RPIQAMGDIGKSNQAVFDLIKDIPNVHWVDAQKYLPKNTVEIYGRYLYGDQDHLTYFGSY
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf128-1 RPIQAMGDIGKSNQAVFDLIKDIPNVHWVDAQKYLPKNTVEIYGRYLYGDQDHLTYFGSY
orf128a.pep YMGREFHKHERLLKSSRDGALQX
||||||||||||||||: |||||
orf128-1 YMGREFHKHERLLKSSHGGALQX
Homology with a predicted ORF from N. gonorrhoeae
ORF128 and the predicted ORF from N. gonorrhoeae (ORF128ng) show 93.4% identity over a 244-amino-acid overlap:
orf128.pep VSLASVIASQIFLYEDFNQMRKTVELSAVF 30
|||||||||||||||||||||||:|||:||
orf128ng ILSEIQNGSFSFRDFYTRRIKRIYPAFIAAVSLASVIASQIFLYEDFNQMRKTIELSTVF 112
orf128.pep LSNIYLGFQQGYFDLSADENPVLHIWSLAVEEQYYLLYPLLLIFCCKKTKSLRVLRNISI 90
||||||||: ||||||||||||||||||||||||||||||||||| ||||||||||||||
orf128ng LSNIYLGFRLGYFDLSADENPVLHIWSLAVEEQYYLLYPLLLIFCYKKTKSLRVLRNISI 172
orf128.pep ILFLILTASSFLPSGFYTDILNQPNTYYLSTLRFPELLAGSLLAVYGQTQNGRRQTANGK 150
|||||||||||||:||||||||||||||||||||||||:||||||||||||||||| |||
orf128ng ILFLILTASSFLPAGFYTDILNQPNTYYLSTLRFPELLVGSLLAVYGQTQNGRRQTENGK 232
orf128.pep RQLLSSLCFGALLACLFVIDKHNPFIPGMTLLLPCLLTALLIRSMQYGTLPTRILSASPI 210
|||| ||||||||:||||||||:|||||:|||||||||||||||||||||||||||||||
orf128ng RQLLSLLCFGALLVCLFVIDKHDPFIPGITLLLPCLLTALLIRSMQYGTLPTRILSASPI 292
orf128.pep VFVGKISYSLYLYHWIFIAFAPLIRGGKQLGLPA 244
||||||||||||||||||||| | | |||||||
orf128ng VFVGKISYSLYLYHWIFIAFAHYITGDKQLGLPAVSAVAALTAGFSLLSYYLIEQPLRKR 352
The full-length ORF128ng nucleotide sequence <SEQ ID 833> is:
1 ATGCAAGCTG TCCGATACAG GCCTGAAATT GACGGATTGC GGGCCGTCGC
51 CGTGCTATCC GTCATTATTT TCCACCTGAA TAACCGCTGG CTGCCCGGAG
101 GATTCCTGGG GGTGGACATT TTCTTTGTCA TCTCGGGATT CCTCATTACC
151 AACATCATTC TTTCTGAAAT ACAGAACGGT TCTTTTTCTT TCCGGGATTT
201 TTATACCCGC AGGATTAAGC GGATTTATCC TGCTTTTATT GCGGCCGTGT
251 CCCTGGCTTC GGTGATTGCT TCTCAAATCT TCCTTTACGA AGATTTCAAC
301 CAAATGAGGA AAACCATAGA GCTTTCTACG GTTTTTTTGT CCAATATTTA
351 TTTGGGGTTC CGATTGGGGT ATTTCGATTT GAGTGCCGAC GAGAACCCCG
401 TACTGCATAT CTGGTCTTTG GCGGTAGAGG AACAGTATTA CCTCCTGTAT
451 CCTCTTTTGC TGATATTCTG TTACAAAAAA ACCAAATCAC TACGGGTGCT
501 GCGTAATATC AGCATCATCC TGTTTCTGAT TTTGACCGCA TCATCGTTTT
551 TGCCGGCCGG GTTTTATACC GACATCCTCA ACCAACCcaa TACTTATTAC
601 CTTTCGACAC TGAGGTTTCC CGAGCTGTTG GTGGGTTCGC TGTTGGCGGT
651 TTACGGGCAA ACGCAAAACG GCAGACGGCA AACAGAAAAT GGAAAACGGC
701 AGTTGCTTTC ATTACTCTGT TTCGGCGCat tgCTTGTCTG CCTGTTCGTG
751 ATCGACAAAC ACGATCCGTT TATCCCGGGA ATAACCCTGC TCCTTCCCTG
801 CCTGCTGACG GCGCTGCTTA TCCGGAGTAT GCAATACGGG ACACTTCCGA
851 CCCGCATCCT GTCGGCAAGC CCCATCGTAT TTGTCGGCAA AATCTCTTAT
901 TCCCTATACC TGTACCATTG GATTTTTATT GCCTTCGCCC ATTACATTAC
951 AGGCGACAAA CAGCTCGGAC TGCCTGCCGT ATCGGCGGTT GCCGCGTTGA
1001 CGGCCGGATT TTCCCTGTTG AGCTATTATT TGATTGAACA GCCGCTTAGA
1051 AAACGGAAGA TGACCTTCAA AAAGGCATTT TTCTGCCTTT ATCTCGCCCC
1101 GTCCCTGATG CTTGTCGGTT ACAACCTGTA TTCAAGAGGG ATATTGAAAC
1151 AGGAACACCT CCGCCCGCTG CCCGGCACGC CCGTTGCTGC GGAAAATAAT
1201 TTTCCGGAAA CCGTCTTGAC CCTCGGCGAC TCGCACGCCG GACACCTGCG
1251 GGGGTTTCTG GATTATGTCG GCGGCAGGGA AGGGTGGAAA GCTAAAATCC
1301 TGTCCCTCGA TTCGGAGTGT TTGGTTTGGG TGGATGAGAA GCTGGCAGAC
1351 AACCCGTTGT GCCGAAAATA CCGGGATGAA GTTGAAAAAG CCGAAGCTGT
1401 TTTCATTGCC CAATTCTATG ATTTGAGGAT GGGCGGCCAG CCCGTGCCGA
1451 GATTTGAAGC GCAATCCTTC CTGATACCCG GGTTCAAAGC CCGATTCAGG
1501 GAAACCGTCA AGAGGATAGC CGCCGTCAAA CCTGTATATG TTTTTGCAAA
1551 CAATACATCA ATCAGCCGTT CTCCCTTGAG GGAGGAAAAA TTGAAAAGAT
1601 TTGCTATAAA CCAATACCTC CGGCCTATTC GGGCTATGGG CGACATCGGC
1651 AAGAGCAATC AGGCGGTCTT TGATTTGGTT AAAGATATTC CCAATGTGCA
1701 TTGGGTGGAC GCACAAAAAT ACCTGCCCAA AAACACGGTC GAAATACACG
1751 GACGCTATCT TTACGGCGAC CAAGACCACC TGACCTATTT CGGTTCTTAT
1801 TATATGGGGC GGGAATTTCA CAAACACGAA CGCCTGCTCA AGCATTCCCG
1851 AGGCGGCGCA TTGCAGTAG
It encodes a protein having the amino acid sequence <SEQ ID 834>:
1 MQAVRYRPEI DGLRAVAVLS VIIFHLNNRW LPGGFLGVDI FFVISGFLIT
51 NIILSEIQNG SFSFRDFYTR RIKRIYPAFI AAVSLASVIA SQIFLYEDFN
101 QMRKTIELST VFLSNIYLGF RLGYFDLSAD ENPVLHIWSL AVEEQYYLLY
151 PLLLIFCYKK TKSLRVLRNI SIILFLILTA SSFLPAGFYT DILNQPNTYY
201 LSTLRFPELL VGSLLAVYGQ TQNGRRQTEN GKRQLLSLLC FGALLVCLFV
251 IDKHDPFIPG ITLLLPCLLT ALLIRSMQYG TLPTRILSAS PIVFVGKISY
301 SLYLYHWIFI AFAHYITGDK QLGLPAVSAV AALTAGFSLL SYYLIEQPLR
351 KRKMTFKKAF FCLYLAPSLM LVGYNLYSRG ILKQEHLRPL PGTPVAAENN
401 FPETVLTLGD SHAGHLRGFL DYVGGREGWK AKILSLDSEC LVWVDEKLAD
451 NPLCRKYRDE VEKAEAVFIA QFYDLRMGGQ PVPRFEAQSF LIPGFKARFR
501 ETVKRIAAVK PVYVFANNTS ISRSPLREEK LKRFAINQYL RPIRAMGDIG
551 KSNQAVFDLV KDIPNVHWVD AQKYLPKNTV EIHGRYLYGD QDHLTYFGSY
601 YMGREFHKHE RLLKHSRGGA LQ*
ORF128ng and ORF128-1 show 95.7% identity over a 622-amino-acid overlap:
orf128-1.pep MQAVRYRPEIDGLRAVAVLSVMIFHLNNRWLPGGFLGVDIFFVISGFLITGIILSEIQNG
|||||||||||||||||||||:||||||||||||||||||||||||||||:|||||||||
orf128ng MQAVRYRPEIDGLRAVAVLSVIIFHLNNRWLPGGFLGVDIFFVISGFLITNIILSEIQNG
orf128-1.pep SFSFRDFYTRRIKRIYPAFIAAVSLASVIASQIFLYEDFNQMRKTVELSAVFLSNIYLGF
|||||||||||||||||||||||||||||||||||||||||||||:|||:||||||||||
orf128ng SFSFRDFYTRRIKRIYPAFIAAVSLASVIASQIFLYEDFNQMRKTIELSTVFLSNIYLGF
orf128-1.pep QQGYFDLSADENPVLHIWSLAVEEQYYLLYPLLLIFCCKKTKSLRVLRNISIILFLILTA
:|||||||||||||||||||||||||||||||||||| ||||||||||||||||||||||
orf128ng RLGYFDLSADENPVLHIWSLAVEEQYYLLYPLLLIFCYKKTKSLRVLRNISIILFLILTA
orf128-1.pep SSFLPSGFYTDILNQPNTYYLSTLRFPELLAGSLLAVYGQTQNGRRQTANGKRQLLSSLC
|||||:||||||||||||||||||||||||:||||||||||||||||| |||||||| ||
orf128ng SSFLPAGFYTDILNQPNTYYLSTLRFPELLVGSLLAVYGQTQNGRRQTENGKRQLLSLLC
orf128-1.pep FGALLACLFVIDKHNPFIPGMTLLLPCLLTALLIRSMQYGTLPTRILSASPIVFVGKISY
|||||:||||||||:|||||:|||||||||||||||||||||||||||||||||||||||
orf128ng FGALLVCLFVIDKHDPFIPGITLLLPCLLTALLIRSMQYGTLPTRILSASPIVFVGKISY
orf128-1.pep SLYLYHWIFIAFAHYITGDKQLGLPAVSAVAALTAGFSLLSYYLIEQPLRKRKMTFKKAF
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf128ng SLYLYHWIFIAFAHYITGDKQLGLPAVSAVAALTAGFSLLSYYLIEQPLRKRKMTFKKAF
orf128-1.pep FCLYLAPSLILVGYNLYARGILKQEHLRPLPGAPLAAENHFPETVLTLGDSHAGHLRGFL
|||||||||:|||||||:||||||||||||||:|:||||:||||||||||||||||||||
orf128ng FCLYLAPSLMLVGYNLYSRGILKQEHLRPLPGTPVAAENNFPETVLTLGDSHAGHLRGFL
orf128-1.pep DYVGSREGWKAKILSLDSECLVWVDEKLADNPLCRKYRDEVEKAEAVFIAQFYDLRMGGQ
||||:|||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf128ng DYVGGREGWKAKILSLDSECLVWVDEKLADNPLCRKYRDEVEKAEAVFIAQFYDLRMGGQ
orf128-1.pep PVPRFEAQSFLIPGFPARFRETVKRIAAVKPVYVFANNTSISRSPLREEKLKRFAANQYL
||||||||||||||| ||||||||||||||||||||||||||||||||||||||| ||||
orf128ng PVPRFEAQSFLIPGFKARFRETVKRIAAVKPVYVFANNTSISRSPLREEKLKRFAINQYL
orf128-1.pep RPIQAMGDIGKSNQAVFDLIKDIPNVHWVDAQKYLPKNTVEIYGRYLYGDQDHLTYFGSY
|||:|||||||||||||||:||||||||||||||||||||||:|||||||||||||||||
orf128ng RPIRAMGDIGKSNQAVFDLVKDIPNVHWVDAQKYLPKNTVEIHGRYLYGDQDHLTYFGSY
orf128-1.pep YMGREFHKHERLLKSSHGGALQX
|||||||||||||| |:||||||
orf128ng YMGREFHKHERLLKHSRGGALQX
610 620
In addition, ORF128ng shows homology with a hypothetical Haemophilus influenzae protein:
sp|P43993|Y392_HAEIN hypothetical protein HI0392 >gi|1074385|pir||B64007 hypothetical protein HI0392 - Haemophilus influenzae (strain Rd KW20) >gi|1573364 (U32723) H. influenzae predicted coding region HI0392 [Haemophilus influenzae] Length = 245
Score = 239 bits (604), Expect = 3e-62
Identities = 124/225 (55%), Positives = 152/225 (67%), Gaps = 1/225 (0%)
Query: 38 VDIFFVISGFLITNIILSEIQNGSFSFRDFYTRRIKRIYPXXXXXXXXXXXXXXXXFLYE 97
+DIFFVISGFLIT II++EIQ SFS + FYTRRIKRIYP F+Y
Sbjct: 1 MDIFFVISGFLITGIIITEIQQNSFSLKQFYTRRIKRIYPAFITVMALVSFIASAIFIYN 60
Query: 98 DFNQMRKTIELSTVFLSNIYLGFRLGYFDLSADENPVLHIWSLAVEEQXXXXXXXXXIFC 157
DFN++RKTIEL+ FLSN YLG GYFDLSA+ENPVLHIWSLAVE Q I
Sbjct: 61 DFNKLRKTIELAIAFLSNFYLGLTQGYFDLSANENPVLHIWSLAVEGQYYLIFPLILILA 120
Query: 158 YKKTKSLRVLRNISIILFLILTASSFLPAGFYTDILNQPNTYYLSTLRFPELLVGSLLAV 217
YKK + ++VL I++ILF IL A+SF+ A FY ++L+QPN YYLS LRFPELLVGSLLA+
Sbjct: 121 YKKFREVKVLFIITLILFFILLATSFVSANFYKEVLHQPNIYYLSNLRFPELLVGSLLAI 180
Query: 218 YGQTQNGRRQTENGKRQLLSLLCFGALLVCLFVIDKHDPFIPGIT 262
Y N + Q +L++L L CLF+++ + FIPGIT
Sbjct: 181 YHNLSN-KVQLSKQVNNILAILSTLLLFSCLFLMNNNIAFIPGIT 224
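Based on this analysis, including the identification of several putative transmembrane domains, it is predicted that these proteins from N. meningitidis and N. gonorrhoeae, and their epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies.
Example 99
The following partial DNA sequence was identified in N. meningitidis <SEQ ID 835>: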
1 ..ATTATTTACG AATACCGCTG GATGTTTCTT TACGGCGCAC TGACGACCTT
51 GGGGCTGACG GTCGTGGCAA C.GCGGGCGG TTCGGTATTG GGTCTGTTGT
101 TGGCGTTGGC GCGCCTGATT CACTTGGAAA AAGCCGGTGC GCCGATGCGC
151 GTGCTGGCGT GGGCGTTGCG TAAAGTTTCG CTGCTGTATG TTACGCTGTT
201 CCGGGGTACG CCGCTGTTTG TGCAGATTGT GATTTGGGCG TATGTGTGGT
251 TTCCGTTTTT CGTC..
This corresponds to the amino acid sequence <SEQ ID 836; ORF129>:
1 ..IIYEYRWMFL YGALTTLGLT VVAXAGGSVL GLLLALARLI HLEKAGAPMR
51 VLAWALRKVS LLYVTLFRGT PLFVQIVIWA YVWFPFFV..
Further work revealed the complete nucleotide sequence <SEQ ID 837>:
1 ATGGATTTTC GTTTTGACAT TATTTACGAA TACCGCTGGA TGTTTCTTTA
51 CGGCGCACTG ACGACCTTGG GGCTGACGGT CGTGGCAACG GCGGGCGGTT
101 CGGTATTGGG TCTGTTGTTG GCGTTGGCGC GCCTGATTCA CTTGGAAAAA
151 GCCGGTGCGC CGATGCGCGT GCTGGCGTGG GCGTTGCGTA AAGTTTCGCT
201 GCTGTATGTT ACGCTGTTCC GGGGTACGCC GCTGTTTGTG CAGATTGTGA
251 TTTGGGCGTA TGTGTGGTTT CCGTTTTTCG TCCATCCTTC AGACGGCATT
301 TTGGTCAGCG GCGAGGCGGC AATCGCGCTG CGTCGCGGAT ACGGGCCGCT
351 GATTGCCGGT TCTTTGGCAC TGATCGCCAA CTCGGGGGCG TATATCTGTG
401 AGATTTTCCG CGCGGGCATC CAGTCTATAG ACAAAGGACA GATGGAGGCG
451 GCGCGTTCTT TGGGGCTGAC CTATCCGCAG GCGATGCGCT ATGTGATTCT
501 GCCGCAGGCA TTGCGCCGCA TGCTGCCGCC TTTGGCGAGC GAGTTCATCA
551 CGCTCTTGAA AGACAGCTCG CTGCTGTCGG TCATTGCTGT GGCGGAGTTG
601 GCGTATGTTC AGAATACGAT TACGGGCCGG TATTCGGTTT ATGAAGAACC
651 GCTTTACACC GTCGCCCTGA TTTATCTGTT GATGACGACT TTCTTAGGCT
701 GGATATTCCT GCGTTTGGAA AAACGTTACA ATCCGCAACA CCGCTGA
This corresponds to the amino acid sequence <SEQ ID 838; ORF129-1>:
1 MDFRFDIIYE YRWMFLYGAL TTLGLTVVAT AGGSVLGLLL ALARLIHLEK
51 AGAPMRVLAW ALRKVSLLYV TLFRGTPLFV QIVIWAYVWF PFFVHPSDGI
101 LVSGEAAIAL RRGYGPLIAG SLALIANSGA YICEIFRAGI QSIDKGQMEA
151 ARSLGLTYPQ AMRYVILPQA LRRMLPPLAS EFITLLKDSS LLSVIAVAEL
201 AYVQNTITGR YSVYEEPLYT VALIYLLMTT FLGWIFLRLE KRYNPQHR*
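Computer analysis of this amino acid sequence gave the following results:
Homology with a predicted ORF from N. meningitidis (strain A)
ORF129 and the ORF from N. meningitidis strain A (ORF129a) show 98.9% identity over an 88-amino-acid overlap: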
10 20 30 40 50
orf129.pep IIYEYRWMFLYGALTTLGLT
VVAXAGGSVLGLLLALARLIHLEKAGAPMRVLAW
|||||||||||||||||||||||:||||||||||||||||||||||||||||||
orf129a MDFRFDIIYEYRWMFLYGALTTLGLT
VVATAGGSVLGLLLALARLIHLEKAGAPMRVLAW
10 20 30 40 50 60
60 70 80
orf129.pep ALRKVSLLYVTLFRGTP
LFVQIVIWAYVWFPFFV
||||||||||||||||||||||||||||||||||
orf129a ALRKVSLLYVTLFRGTP
LFVQIVIWAYVWFPFFVHPSDGILVSGEAAIALRRGYGP
LIAG
70 80 90 100 110 120
orf129a
SLALIANSGAYICEIFRAGIQSIDKGQMEAARSLGLTYPQAMRYVILPQALRRMLPPLAS
130 140 150 160 170 180
The complete length ORF129a nucleotide sequence <SEQ ID 839> is:
1 ATGGATTTTC GTTTTGACAT TATTTACGAA TACCGCTGGA TGTTTCTTTA
51 CGGCGCACTG ACGACCTTGG GGCTGACGGT CGTGGCGACG GCGGGCGGTT
101 CGGTATTGGG TCTGTTGTTG GCGTTGGCGC GCCTGATTCA CTTGGAAAAA
151 GCCGGTGCGC CGATGCGCGT GCTGGCGTGG GCGTTGCGTA AGGTTTCGCT
201 GCTGTATGTT ACGCTGTTCC GGGGTACGCC GCTGTTTGTG CAGATTGTGA
251 TTTGGGCGTA TGTGTGGTTT CCGTTTTTCG TCCATCCTTC AGACGGCATT
301 TTGGTTAGCG GCGAGGCGGC AATCGCGCTG CGTCGCGGAT ACGGGCCGCT
351 GATTGCCGGT TCTTTGGCAC TGATCGCCAA CTCGGGGGCG TATATCTGTG
401 AGATTTTCCG CGCGGGCATC CAGTCTATAG ACAAAGGACA GATGGAGGCG
451 GCGCGTTCTT TGGGGCTGAC CTATCCGCAG GCGATGCGCT ATGTGATTCT
501 GCCGCAGGCA TTGCGCCGTA TGCTGCCGCC TTTGGCGAGC GAGTTCATCA
551 CGCTCTTGAA AGACAGCTCG CTGCTGTCGG TCATTGCTGT GGCGGAGTTG
601 GCGTATGTTC AGAATACGAT TACGGGCCGG TATTCGGTTT ATGAAGAACC
651 GCTTTACACC GTCGCCCTGA TTTATCTGTT GATGACGACT TTCTTAGGCT
701 GGATATTCCT GCGTTTGGAA AAACGTTACA ATCCGCAACA CCGCTGA
This encodes a protein having the amino acid sequence <SEQ ID 840>:
1 MDFRFDIIYE YRWMFLYGAL TTLGLTVVAT AGGSVLGLLL ALARLIHLEK
51 AGAPMRVLAW ALRKVSLLYV TLFRGTPLFV QIVIWAYVWF PFFVHPSDGI
101 LVSGEAAIAL RRGYGPLIAG SLALIANSGA YICEIFRAGI QSIDKGQMEA
151 ARSLGLTYPQ AMRYVILPQA LRRMLPPLAS EFITLLKDSS LLSVIAVAEL
201 AYVQNTITGR YSVYEEPLYT VALIYLLMTT FLGWIFLRLE KRYNPQHR*
ORF129a and ORF129-1 show 100.0% identity over a 248 aa overlap:
orf129a.pep MDFRFDIIYEYRWMFLYGALTTLGLTVVATAGGSVLGLLLALARLIHLEKAGAPMRVLAW
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf129-1 MDFRFDIIYEYRWMFLYGALTTLGLTVVATAGGSVLGLLLALARLIHLEKAGAPMRVLAW
orf129a.pep ALRKVSLLYVTLFRGTPLFVQIVIWAYVWFPFFVHPSDGILVSGEAAIALRRGYGPLIAG
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf129-1 ALRKVSLLYVTLFRGTPLFVQIVIWAYVWFPFFVHPSDGILVSGEAAIALRRGYGPLIAG
orf129a.pep SLALIANSGAYICEIFRAGIQSIDKGQMEAARSLGLTYPQAMRYVILPQALRRMLPPLAS
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf129-1 SLALIANSGAYICEIFRAGIQSIDKGQMEAARSLGLTYPQAMRYVILPQALRRMLPPLAS
orf129a.pep EFITLLKDSSLLSVIAVAELAYVQNTITGRYSVYEEPLYTVALIYLLMTTFLGWIFLRLE
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf129-1 EFITLLKDSSLLSVIAVAELAYVQNTITGRYSVYEEPLYTVALIYLLMTTFLGWIFLRLE
orf129a.pep KRYNPQHRX
|||||||||
orf129-1 KRYNPQHRX
Homology with a predicted ORF from N. gonorrhoeae
ORF129 shows 98.9% identity over an 88 aa overlap with a predicted ORF (ORF129ng) from N. gonorrhoeae:
orf129.pep IIYEYRWMFLYGALTTLGLTVVAXAGGSVLGLLLALARLIHLEKAGAPMRVLAW 54
|||||||||||||||||||||||:||||||||||||||||||||||||||||||
orf129ng MDFRFDIIYEYRWMFLYGALTTLGLTVVATAGGSVLGLLLALARLIHLEKAGAPMRVLAW 60
orf129.pep ALRKVSLLYVTLFRGTPLFVQIVIWAYVWFPFFV 88
||||||||||||||||||||||||||||||||||
orf129ng ALRKVSLLYVTLFRGTPLFVQIVIWAYVWFPFFVILHTAFLGNAMRQSRRVPDKGRWIAG 120
The ORF129ng nucleotide sequence <SEQ ID 841> is predicted to encode a protein having the amino acid sequence <SEQ ID 842>:
1 MDFRFDIIYE YRWMFLYGAL TTLGLTVVAT AGGSVLGLLL ALARLIHLEK
51 AGAPMRVLAW ALRKVSLLYV TLFRGTPLFV QIVIWAYVWF PFFVILHTAF
101 LGNAMRQSRR VPDKGRWIAG SLELNCQPRG RKTRGEFPPG ESNLGTEPRN
151 PLSMGQRRFP GCENWYPPQN FIKK*
Further work revealed the following gonococcal DNA sequence <SEQ ID 843>:
1 ATGGATTTTc gtTTTGACAT TATTTAcgaA TACCGCTGGA TGTTTCTTTA
51 CGGCGCACTG Acgaccttgg ggctgacggt cgtggcgacg gCGGGCGGTT
101 CGGtattggG TCTGTTGTTG GCGTTGGCGC GCCTGATTCA CTTGGAAAAA
151 GCCGGTGCGC CGATGCGCGT GCTGGCGTGG GCGTTGCGTA AGGTTTCGCT
201 GCTGTACGTT ACCCTGTTCC GGGGTACCCC GCTGTTTGTG CAGATTGTGA
251 TTTGGGCGTA TGTGTGGTTT CCGTTTTTCG TCCATCCTTC AGACGGCATT
301 TTGGTCAGCG GCGAGGCGGC AATCGCGCTG CGTCGCGGAT ACGGGCCGCT
351 GATTGCCGGT TCTTTGGCAC TGATCGCCAA CTCGGGGGCG TATATCTGTG
401 AGATTTTCCG CGCGGGCATC CAGTCTATAG ACAAAGGACA GATGGAGGCG
451 GCGTGTTCTT TGGGACTGAC CTATCCGCAG GCGATGCGCT ATGTGATTCT
501 GCCGCAGGCA TTGCGCCGTA TGCTGCCGCC TTTGGCGAGC GAGTTCATCA
551 CGCTCTTGAA AGACAGCTCG CTGCTGTCGG TCATTGCTGT GGCGGAGTTG
601 GCGTATGTTC AGAATACGAT TACGGGCCGG TATTCGGTTT ATGAAGAACC
651 GCTTTACACC GCCGCCCTGA TTTATCTGTT GATGACGACT TTCTTAGGCT
701 GGATATTCCT GCGTTTGGAA AAACGTTACA ATCCGCAACA CCGCTGA
This corresponds to the amino acid sequence <SEQ ID 844; ORF129ng-1>:
1 MDFRFDIIYE YRWMFLYGAL TTLGLTVVAT AGGSVLGLLL ALARLIHLEK
51 AGAPMRVLAW ALRKVSLLYV TLFRGTPLFV QIVIWAYVWF PFFVHPSDGI
101 LVSGEAAIAL RRGYGPLIAG SLALIANSGA YICEIFRAGI QSIDKGQMEA
151 ACSLGLTYPQ AMRYVILPQA LRRMLPPLAS EFITLLKDSS LLSVIAVAEL
201 AYVQNTITGR YSVYEEPLYT AALIYLLMTT FLGWIFLRLE KRYNPQHR*
ORF129ng-1 and ORF129-1 show 99.2% identity over a 248 aa overlap:
orf129-1.pep MDFRFDIIYEYRWMFLYGALTTLGLTVVATAGGSVLGLLLALARLIHLEKAGAPMRVLAW
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf129ng-1 MDFRFDIIYEYRWMFLYGALTTLGLTVVATAGGSVLGLLLALARLIHLEKAGAPMRVLAW
orf129-1.pep ALRKVSLLYVTLFRGTPLFVQIVIWAYVWFPFFVHPSDGILVSGEAAIALRRGYGPLIAG
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf129ng-1 ALRKVSLLYVTLFRGTPLFVQIVIWAYVWFPFFVHPSDGILVSGEAAIALRRGYGPLIAG
orf129-1.pep SLALIANSGAYICEIFRAGIQSIDKGQMEAARSLGLTYPQAMRYVILPQALRRMLPPLAS
||||||||||||||||||||||||||||||| ||||||||||||||||||||||||||||
orf129ng-1 SLALIANSGAYICEIFRAGIQSIDKGQMEAACSLGLTYPQAMRYVILPQALRRMLPPLAS
orf129-1.pep EFITLLKDSSLLSVIAVAELAYVQNTITGRYSVYEEPLYTVALIYLLMTTFLGWIFLRLE
||||||||||||||||||||||||||||||||||||||||:|||||||||||||||||||
orf129ng-1 EFITLLKDSSLLSVIAVAELAYVQNTITGRYSVYEEPLYTAALIYLLMTTFLGWIFLRLE
orf129-1.pep KRYNPQHRX
|||||||||
orf129ng-1 KRYNPQHRX
In addition, ORF129ng-1 is homologous to an ABC transporter from Archaeoglobus fulgidus:
2650409 (AE001090) glutamine ABC transporter, permease protein (glnP) [Archaeoglobus fulgidus] Length = 224
Score = 132 bits (329), Expect = 2e-30
Identities = 86/178 (48%), Positives = 103/178 (57%), Gaps = 18/178 (10%)
Query: 65 VSLLYVTLFRGTPLFVQIVIWAYVWFPFFVHPSDGILVSGEAAIALRRGYGPLIAGSLAL 124
+S YV + RGTPL VQI+I +F P+ GI + E A G +AL
Sbjct: 58 ISTAYVEVIRGTPLLVQILI------VYFGLPAIGINLQPEPA------------GIIAL 99
Query: 125 IANSGAYICEIFRAGIQSIDKGQMEAACSLGLTYPQAMRYVILPQALRRMLPPLASEFIT 184
SGAYI EI RAGI+SI GQMEAA SLG+TY QAMRYVI PQA R +LP L +EFI
Sbjct: 100 SICSGAYIAEIYRAGIESIPIGQMEAARSLGMTYLQAMRYVIFPQAFRNILPALGNEFIA 159
Query: 185 LLKDSSLLSVIAVAELAYVQNTITGRYSVYEEPLYTAALIYLLMTTFLGWIFLRLEKR 242
LLKDSSLLSVI++ EL V I P AL YL+MT L + +K+
Sbjct: 160 LLKDSSLLSVISIVELTRVGRQIVNTTFNAWTPFLGVALFYLMMTIPLSRLVAYSQKK 217
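The percentages in the summary line of this alignment follow directly from the raw counts. The short sketch below recomputes them. The reported figures are consistent with integer truncation (103/178 is about 57.9% but is reported as 57%); that rounding rule is inferred from these numbers and is not stated in the source. `blast_percent` is a hypothetical helper name.

```python
def blast_percent(matches, aligned_length):
    """Integer percentage, truncated rather than rounded."""
    return 100 * matches // aligned_length

# Counts copied from the summary line of the alignment above.
summary = {"Identities": (86, 178), "Positives": (103, 178), "Gaps": (18, 178)}
for label, (n, total) in summary.items():
    print(f"{label} = {n}/{total} ({blast_percent(n, total)}%)")
```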
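Based on this analysis, including the identification of several transmembrane domains in both proteins, it is predicted that this protein from N. meningitidis and N. gonorrhoeae, and its epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies.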
Example 100
The following partial DNA sequence was identified in N. meningitidis <SEQ ID 845>:
1 ..CTGAAAGAAT GCCGTCTGAA AGACCCTGTT TTTATTCCAA ATATCGTTTA
51 TAAGAACATC GCCATTACTT TCCTGCTCTT GCACGCCGCC GCCGAACTTT
101 GGCTGCCCGC GCAAACCGCC GGTTTTACCG CGCTCGCCGT CGGCTTCATC
151 CTGCTCGCCA AGCTGCGTGA gCTTCACCAT CACGAACTCT TACGTAAACA
251 TTGTGGACAG GCGCGGCGwA ATTACAAAAC CTGCCCGCyT CCGCGCCCCT
301 GCACCTGATT ACCCTCGGCG GCATGATGGG CGGCGTGATG ATGGTGTGGc
351 TGACCGCCGG ACTGTGGCAC AGCGGCTTTA CCAAACTCGA CTACCCCAAA
401 CTCTGCCGCA TTGCCGTCCC CATCCTTTTC GCCGCCGCCG TCTCGCGCGC
451 TTTCTTGrTG AACGTGAACC CGrTATTTTT CATTACCGTT CCTGCGATTC
501 TGACCGCCGC CGTATTCGTA CTGTATCTTT TCrCGTTTAT ACCGATATTT
551 CGGGCGAATG CGTTTACAGA CGATCCGGAr TAr
This corresponds to the amino acid sequence <SEQ ID 846; ORF130>:
1 ..LKECRLKDPV FIPNIVYKNI AITFLLLHAA AELWLPAQTA GFTALAVGFI
51 LLAKLRELHH HELLRKHYVR TYYLLQLFAA AGSLWTGAAX LQNLPASAPL
101 HLITLGGMMG GVMMVWLTAG LWHSGFTKLD YPKLCRIAVP ILFAAAVSRA
151 FLXNVNPXFF ITVPAILTAA VFVLYLFXFI PIFRANAFTD DPE*
Further work revealed the complete nucleotide sequence <SEQ ID 847>:
1 ATGCGGCCGT TTTTCGTCGG CGCGGCGGTG CTTGCCATAC TCGGTGCGCT
51 GGTGTTTTTC ATCAACCCCG GTGCCATCGT CCTGCACCGC CAAATTTTCT
101 TGGAACTTAT GCTGCCGGCG GCATACGGCG GTTTTTTGAC TGCGGCTTTG
151 TTGGACTGGA CGGGTTTTTC GGGTAACCTG AAACCTGTCG CGACTTTGAT
201 GGCGGCATTA TTGCTCGCCG CATCCGCTAT ACTGCCCTTT TCGCCGCAAA
251 CTGCCTCGTT TTTCGTCGCC GCCTATTGGC TGGTGTTGCT GCTGTTCTGC
301 GCCCGGCTGA TTTGGCTAGA CCGAAACACC GACAACTTCG CCCTGCTAAT
351 GTTACTTGCC GCGTTCACTG TTTTTCAGAC GGCATATGCC GTCAGCGGCG
401 ATTTGAACCT GTTGCGCGCG CAAGTGCATC TAAATATGGC GGCGGTGATG
451 TTCGTATCCG TGCGCGTCAG TATTCTTTTG GGCGCGGAAG CCCTGAAAGA
501 ATGCCGTCTG AAAGACCCTG TTTTTATTCC AAATATCGTT TATAAAAACA
551 TCGCCATTAC TTTCCTGCTC TTGCACGCCG CCGCCGAACT TTGGCTGCCC
601 GCGCAAACCG CCGGTTTTAC CGCGCTCGCC GTCGGCTTCA TCCTGCTCGC
651 CAAGCTGCGT GAGCTTCACC ATCACGAACT CTTACGTAAA CACTACGTCC
701 GCACTTATTA CCTGCTCCAA CTCTTTGCCG CCGCAGGCTA TTTGTGGACA
751 GGCGCGGCGA AATTACAAAA CCTGCCCGCC TCCGCGCCCC TGCACCTGAT
801 TACCCTCGGC GGCATGATGG GCGGCGTGAT GATGGTGTGG CTGACCGCCG
851 GACTGTGGCA CAGCGGCTTT ACCAAACTCG ACTACCCCAA ACTCTGCCGC
901 ATTGCCGTCC CCATCCTTTT CGCCGCCGCC GTCTCGCGCG CTTTCTTGAT
951 GAACGTGAAC CCGATATTTT TCATTACCGT TCCTGCGATT CTGACCGCCG
1001 CCGTATTCGT ACTGTATCTT TTCACGTTTA TACCGATATT TCGGGCGAAT
1051 GCGTTTACAG ACGATCCGGA ATAA
This corresponds to the amino acid sequence <SEQ ID 848; ORF130-1>:
1 MRPFFVGAAV LAILGALVFF INPGAIVLHR QIFLELMLPA AYGGFLTAAL
51 LDWTGFSGNL KPVATLMAAL LLAASAILPF SPQTASFFVA AYWLVLLLFC
101 ARLIWLDRNT DNFALLMLLA AFTVFQTAYA VSGDLNLLRA QVHLNMAAVM
151 FVSVRVSILL GAEALKECRL KDPVFIPNIV YKNIAITFLL LHAAAELWLP
201 AQTAGFTALA VGFILLAKLR ELHHHELLRK HYVRTYYLLQ LFAAAGYLWT
251 GAAKLQNLPA SAPLHLITLG GMMGGVMMVW LTAGLWHSGF TKLDYPKLCR
301 IAVPILFAAA VSRAFLMNVN PIFFITVPAI LTAAVFVLYL FTFIPIFRAN
351 AFTDDPE*
Computer analysis of this amino acid sequence gave the following results:
Homology with a predicted ORF from N. meningitidis (strain A)
ORF130 shows 94.3% identity over a 193 aa overlap with an ORF (ORF130a) from strain A of N. meningitidis:
10 20 30
orf130.pep LKECRLKDPVFIPNIVYKNIAITFLLLHAA
||||||||||||||:|||||||||||||||
orf130a LNLLRAQVHLNMAAVMFVSVRVSILLGAEALKECRLKDPVFIPNVVYKNIAITFLLLHAA
140 150 160 170 180 190
40 50 60 70 80 90
orf130.pep AELWLPAQTAGFTALAVGFILLAKLRELHHHELLRKHYVRTYYLLQLFAAAGSLWTGAAX
|||||||||||||:|||||||||||||||||||||||||||||||||||||| |||||||
orf130a AELWLPAQTAGFTSLAVGFILLAKLRELHHHELLRKHYVRTYYLLQLFAAAGYLWTGAAK
200 210 220 230 240 250
100 110 120 130 140 150
orf130.pep LQNLPASAPLHLITLGGMMGGVMMVWLTAGLWHSGFTKLDYPKLCRIAVPILFAAAVSRA
||||||||||||||||||||:|||||||||||||||||||||||||||||||||||||||
orf130a LQNLPASAPLHLITLGGMMGSVMMVWLTAGLWHSGFTKLDYPKLCRIAVPILFAAAVSRA
260 270 280 290 300 310
160 170 180 190
orf130.pep FLXNVNPXFFITVPAILTAAVFVLYLFXFIPIFRANAFTDDPEX
| ||||| ||||||||||||||||||::|:||||||||||||||
orf130a VLMNVNPIFFITVPAILTAAVFVLYLLTFVPIFRANAFTDDPEX
320 330 340 350
The complete length ORF130a nucleotide sequence <SEQ ID 849> is:
1 ATGCGGCCGT TTTTCGTCGG CGCGGCGGTG CTTGCCATAC TCGGTGCGCT
51 GGTGTTTTTC ATCAACCCCG GTGCCATCGT CCTGCACCGC CAAATTTTCT
101 TGGAACTTAT GCTGCCGGCG GCATACGGCG GTTTTTTGAC TGCGGCTTTG
151 TTGGACTGGA CGGGTTTTTC GGGTAACCTG AAACCTGTCG CGACTTTGAT
201 GGCGGCATTA TTGCTCGCCG CATCCGCTAT ACTGCCCTTT TCGCCGCAAA
251 CTGCCTCGTT TTTCGTCGCC GCCTATTGGC TGGTGTTGCT GCTGTTCTGC
301 GCCCGGCTGA TTTGGCTAGA CCGAAACACC GACAACTTCG CCCTGCTAAT
351 GTTACTTGCC GCGTTCACTG TTTTTCAGAC GGCATATGCC GTCAGCGGCG
401 ATTTGAACCT GTTGCGCGCG CAAGTGCATC TAAATATGGC GGCGGTGATG
451 TTCGTATCCG TGCGCGTCAG TATTCTTTTG GGCGCGGAAG CCCTGAAAGA
501 ATGCCGTCTG AAAGACCCAG TATTCATCCC CAATGTCGTC TATAAAAACA
551 TCGCCATTAC CTTCCTGCTC CTGCACGCCG CCGCCGAACT TTGGCTGCCT
601 GCGCAAACCG CCGGTTTTAC CTCGCTCGCC GTCGGCTTTA TCCTGCTTGC
651 CAAGCTGCGT GAGCTTCACC ATCACGAACT CCTGCGCAAA CACTACGTCC
701 GCACTTATTA CCTGCTCCAA CTCTTTGCCG CCGCAGGCTA TTTGTGGACA
751 GGCGCGGCGA AATTACAAAA CCTGCCCGCC TCCGCGCCCC TGCACCTGAT
801 TACCCTCGGT GGCATGATGG GCAGCGTGAT GATGGTGTGG CTGACTGCCG
851 GACTGTGGCA CAGCGGCTTT ACCAAGCTCG ACTACCCGAA ACTCTGCCGC
901 ATCGCCGTCC CCATCCTNTT CGCCGCCGCC GTTTCGCGCG CTGTTTTAAT
951 GAACGTAAAC CCGATATTCT TCATCACCGT CCCCGCAATT CTGACCGCCG
1001 CCGTGTTCGT GCTTTACCTG CTGACATTCG TACCGATCTT TCGGGCGAAC
1051 GCGTTTACAG ACGATCCGGA ATAA
This encodes a protein having the amino acid sequence <SEQ ID 850>:
1 MRPFFVGAAV LAILGALVFF INPGAIVLHR QIFLELMLPA AYGGFLTAAL
51 LDWTGFSGNL KPVATLMAAL LLAASAILPF SPQTASFFVA AYWLVLLLFC
101 ARLIWLDRNT DNFALLMLLA AFTVFQTAYA VSGDLNLLRA QVHLNMAAVM
151 FVSVRVSILL GAEALKECRL KDPVFIPNVV YKNIAITFLL LHAAAELWLP
201 AQTAGFTSLA VGFILLAKLR ELHHHELLRK HYVRTYYLLQ LFAAAGYLWT
251 GAAKLQNLPA SAPLHLITLG GMMGSVMMVW LTAGLWHSGF TKLDYPKLCR
301 IAVPILFAAA VSRAVLMNVN PIFFITVPAI LTAAVFVLYL LTFVPIFRAN
351 AFTDDPE*
ORF130a and ORF130-1 show 98.3% identity over a 357 aa overlap:
orf130a.pep MRPFFVGAAVLAILGALVFFINPGAIVLHRQIFLELMLPAAYGGFLTAALLDWTGFSGNL
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf130-1 MRPFFVGAAVLAILGALVFFINPGAIVLHRQIFLELMLPAAYGGFLTAALLDWTGFSGNL
orf130a.pep KPVATLMAALLLAASAILPFSPQTASFFVAAYWLVLLLFCARLIWLDRNTDNFALLMLLA
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf130-1 KPVATLMAALLLAASAILPFSPQTASFFVAAYWLVLLLFCARLIWLDRNTDNFALLMLLA
orf130a.pep AFTVFQTAYAVSGDLNLLRAQVHLNMAAVMFVSVRVSILLGAEALKECRLKDPVFIPNVV
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||:|
orf130-1 AFTVFQTAYAVSGDLNLLRAQVHLNMAAVMFVSVRVSILLGAEALKECRLKDPVFIPNIV
orf130a.pep YKNIAITFLLLHAAAELWLPAQTAGFTSLAVGFILLAKLRELHHHELLRKHYVRTYYLLQ
|||||||||||||||||||||||||||:||||||||||||||||||||||||||||||||
orf130-1 YKNIAITFLLLHAAAELWLPAQTAGFTALAVGFILLAKLRELHHHELLRKHYVRTYYLLQ
orf130a.pep LFAAAGYLWTGAAKLQNLPASAPLHLITLGGMMGSVMMVWLTAGLWHSGFTKLDYPKLCR
||||||||||||||||||||||||||||||||||:|||||||||||||||||||||||||
orf130-1 LFAAAGYLWTGAAKLQNLPASAPLHLITLGGMMGGVMMVWLTAGLWHSGFTKLDYPKLCR
orf130a.pep IAVPILFAAAVSRAVLMNVNPIFFITVPAILTAAVFVLYLLTFVPIFRANAFTDDPE
|||||||||||||| |||||||||||||||||||||||||:||:|||||||||||||
orf130-1 IAVPILFAAAVSRAFLMNVNPIFFITVPAILTAAVFVLYLFTFIPIFRANAFTDDPE
Homology with a predicted ORF from N. gonorrhoeae
ORF130 shows 91.7% identity over a 193 aa overlap with a predicted ORF (ORF130ng) from N. gonorrhoeae:
orf130.pep LKECRLKDPVFIPNIVYKNIAITFLLLHAA 30
||||||||||||||::||||||| ||||||
orf130ng LNLLRAQVHLNMAAVMFVSVRVSVLLGTETLKECRLKDPVFIPNVIYKNIAIT-LLLHAA 201
orf130.pep AELWLPAQTAGFTALAVGFILLAKLRELHHHELLRKHYVRTYYLLQLFAAAGSLWTGAAX 90
|||||||||||||||||||||||||||||||||||||||||||||||||||| |||||||
orf130ng AELWLPAQTAGFTALAVGFILLAKLRELHHHELLRKHYVRTYYLLQLFAAAGYLWTGAAK 261
orf130.pep LQNLPASAPLHLITLGGMMGGVMMVWLTAGLWHSGFTKLDYPKLCRIAVPILFAAAVSRA 150
|||||||||||||||||| |||||||||||||||||||||||||||||| ||||:|||||
orf130ng LQNLPASAPLHLITLGGMTGGVMMVWLTAGLWHSGFTKLDYPKLCRIAVSILFASAVSRA 321
orf130.pep FLXNVNPXFFITVPAILTAAVFVLYLFXFIPIFRANAFTDDPE 193
| |||| |||||| |||||||:|||::|:|||||||||||||
orf130ng VLMNVNPIFFITVPEILTAAVFMLYLLTFVPIFRANAFTDDPE 364
The ORF130ng nucleotide sequence <SEQ ID 851> is predicted to encode a protein having the amino acid sequence <SEQ ID 852>:
1 MNKFFTHPMR PFFVGAAVLA ILGALVFFHQ PRRYHPAPPN FLGTYAAGCI
51 RRFFDYRFVG PDGFFRQPET CRYFDGGVVA CCGCFIAVFT ATCRIFRRRL
101 LAGVAAVLRL ADLARRQHRT LRSVDVTAAF TVFQTAYAVS GDLNLLRAQV
151 HLNMAAVMFV SVRVSVLLGT ETLKECRLKD PVFIPNVIYK NIAITLLLHA
201 AAELWLPAQT AGFTALAVGF ILLAKLRELH HHELLRKHYV RTYYLLQLFA
251 AAGYLWTGAA KLQNLPASAP LHLITLGGMT GGVMMVWLTA GLWHSGFTKL
301 DYPKLCRIAV SILFASAVSR AVLMNVNPIF FITVPEILTA AVFMLYLLTF
351 VPIFRANAFT DDPE*
Further work revealed the following gonococcal DNA sequence <SEQ ID 853>:
1 ATGCGCCCGT TTTTCGTCGG TGCGGCAGTA CTTGCCATAC TCGGTGCGTT
51 GGTGTTTTTT ATCAACCCCG GCGCTATCAT CCTGCACCGC CAAATTTTCT
101 TGGAACTTAT GCTGCCGGCT GCATACGGCG GTTTTTTGAC TACCGCTTTG
151 TTGGACCGGA CGGGTTTTTC AGGCAACCTG AAACCTGCCG CTACTTTGAT
201 GGCGGTGTTG TTGCTTGTTG CGGCTGTTTT ATTGCCGTTT TTACCGCAAC
251 TTGCCGCATT TTTCGTCGCC GCCTATTGGC TGGTGTTGCT GCTGTTCTGC
301 GCCTGGCTGA TTTGGCTCGA CCGCAACACC GACAACTTCG CTCTGTTGAT
351 GTTACTTGCC GCATTTACCG TTTTTCAGAC GGCCTATGCC GTCAGCGGCG
401 ATTTGAACTT ACTGCGCGCG CAAGTGCATT TGAATATGGC GGCGGTCATG
451 TTCGTATCCG TCCGCGTCAG CGTCCTTTTG GGCACGGAAA CCCTGAAAGA
501 ATGCCGTCTG AAAGACCCCG TATTCATCCC CAACGTTATC TATAAAAACA
551 TCGCCATCAC CCTGCTGCTG CACGCCGCCG CCGAACTTTG GCTGCCCGCG
601 CAAACCGCCG GTTTTACTGC GCTTGCCGTC GGCTTCATCC TGCTCGCCAA
651 GCTGCGCGAA CTGCACCATC ACGAACTCTT ACGCAAACAC TACGTCCGCA
701 CTTATTACCT GCTCCAGCTC TTTGCCGCCG CAGGTTATCT GTGGACAGGC
751 GCGGCGAAAC TGCAAAACCT GCCCGCCTCC GCGCCCCTGC ACCTGATTAC
801 CCTCGGCGGC ATGACGGGTG GCGTGATGAT GGTGTGGCTG ACTGCCGGAC
851 TGTGGCACAG CGGCTTTACC AAACTCGACT ACCCGAAACT CTGCCGCATC
901 GCCGTCTCCA TCCTTTTCGC CTCCGCCGTT TCGCGCGCTG TTTTAATGAA
951 CGTGAATCCG ATATTCTTCA TCACCGTTCC CGAGATTCTG ACCGCCGCCG
1001 TGTTCATGCT TTACCTGCTG ACGTTCGTAC CGATTTTTCG AGCGAACGCG
1051 TTTACAGACG ATCCGGAATA A
This corresponds to the amino acid sequence <SEQ ID 854; ORF130ng-1>:
1 MRPFFVGAAV LAILGALVFF INPGAIILHR QIFLELMLPA AYGGFLTTAL
51 LDRTGFSGNL KPAATLMAVL LLVAAVLLPF LPQLAAFFVA AYWLVLLLFC
101 AWLIWLDRNT DNFALLMLLA AFTVFQTAYA VSGDLNLLRA QVHLNMAAVM
151 FVSVRVSVLL GTETLKECRL KDPVFIPNVI YKNIAITLLL HAAAELWLPA
201 QTAGFTALAV GFILLAKLRE LHHHELLRKH YVRTYYLLQL FAAAGYLWTG
251 AAKLQNLPAS APLHLITLGG MTGGVMMVWL TAGLWHSGFT KLDYPKLCRI
301 AVSILFASAV SRAVLMNVNP IFFITVPEIL TAAVFMLYLL TFVPIFRANA
351 FTDDPE*
ORF130ng-1 and ORF130-1 show 92.4% identity over a 357 aa overlap:
orf130-1.pep MRPFFVGAAVLAILGALVFFINPGAIVLHRQIFLELMLPAAYGGFLTAALLDWTGFSGNL
||||||||||||||||||||||||||:||||||||||||||||||||:||||||||||||
orf130ng-1 MRPFFVGAAVLAILGALVFFINPGAIILHRQIFLELMLPAAYGGFLTTALLDRTGFSGNL
orf130-1.pep KPVATLMAALLLAASAILPFSPQTASFFVAAYWLVLLLFCARLIWLDRNTDNFALLMLLA
||:|||||:|||:|:::||| || |:||||||||||||||| ||||||||||||||||||
orf130ng-1 KPAATLMAVLLLVAAVLLPFLPQLAAFFVAAYWLVLLLFCAWLIWLDRNTDNFALLMLLA
orf130-1.pep AFTVFQTAYAVSGDLNLLRAQVHLNMAAVMFVSVRVSILLGAEALKECRLKDPVFIPNIV
|||||||||||||||||||||||||||||||||||||:|||:|:||||||||||||||::
orf130ng-1 AFTVFQTAYAVSGDLNLLRAQVHLNMAAVMFVSVRVSVLLGTETLKECRLKDPVFIPNVI
orf130-1.pep YKNIAITFLLLHAAAELWLPAQTAGFTALAVGFILLAKLRELHHHELLRKHYVRTYYLLQ
||||||| ||||||||||||||||||||||||||||||||||||||||||||||||||||
orf130ng-1 YKNIAIT-LLLHAAAELWLPAQTAGFTALAVGFILLAKLRELHHHELLRKHYVRTYYLLQ
orf130-1.pep LFAAAGYLWTGAAKLQNLPASAPLHLITLGGMMGGVMMVWLTAGLWHSGFTKLDYPKLCR
|||||||||||||||||||||||||||||||| |||||||||||||||||||||||||||
orf130ng-1 LFAAAGYLWTGAAKLQNLPASAPLHLITLGGMTGGVMMVWLTAGLWHSGFTKLDYPKLCR
orf130-1.pep IAVPILFAAAVSRAFLMNVNPIFFITVPAILTAAVFVLYLFTFIPIFRANAFTDDPEX
||| ||||:||||| ||||||||||||| |||||:|||:||:||||||||||||||||
orf130ng-1 IAVSILFASAVSRAVLMNVNPIFFITVPEILTAAVFMLYLLTFVPIFRANAFTDDPEX
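Based on this analysis, it is predicted that this protein from N. meningitidis and N. gonorrhoeae, and its epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies.
Example 101
The following partial DNA sequence was identified in N. meningitidis <SEQ ID 855>: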
1 ATGGAAATTC GGGCAATAAA ATATACGGCA ATGGCTGCGT TGCTTGCATT
51 TACGGTTGCA GGCTGCCGGC TGGCGGGGTG GTATGAGTGT TCGTCCCTCA
101 CCGGCTGGTG TAAGCCGAGA AAACCGGCTG CCATCGATTT TTGGGATATT
151 GGCGGCGAGA GTCCGCCGTC TTTAGGGGAC TACGAGATAC CGCTTTCAGA
201 CGGCAATAGT TCCGTCAGGG CAAACGAATA TGAATCCGCA CAACAATCTT
251 ACTTTTACAG GAAAATAGGG AAGTTTGAAG C.TGCGGGCT GGATTGGCGT
301 ACGCGTGACG GCAAACCTTT GATTGAGACG TTCAAACAGG GAGGATTTGA
351 CTGCTTGGAA AAG..
This corresponds to the amino acid sequence <SEQ ID 856; ORF131>:
1 MEIRAIKYTA MAALLAFTVA GCRLAGWYEC SSLTGWCKPR KPAAIDFWDI
51 GGESPPSLGD YEIPLSDGNS SVRANEYESA QQSYFYRKIG KFEXCGLDWR
101 TRDGKPLIET FKQGGFDCLE K..
Further work revealed the complete nucleotide sequence <SEQ ID 857>:
1 ATGGAAATTC GGGCAATAAA ATATACGGCA ATGGCTGCGT TGCTTGCATT
51 TACGGTTGCA GGCTGCCGGC TGGCGGGGTG GTATGAGTGT TCGTCCCTCA
101 CCGGCTGGTG TAAGCCGAGA AAACCGGCTG CCATCGATTT TTGGGATATT
151 GGCGGCGAGA GTCCGCCGTC TTTAGGGGAC TACGAGATAC CGCTTTCAGA
201 CGGCAATCGT TCCGTCAGGG CAAACGAATA TGAATCCGCA CAACAATCTT
251 ACTTTTACAG GAAAATAGGG AAGTTTGAAG CCTGCGGGCT GGATTGGCGT
301 ACGCGTGACG GCAAACCTTT GATTGAGACG TTCAAACAGG GAGGATTTGA
351 CTGCTTGGAA AAGCAGGGGT TGCGGCGCAA CGGTCTGTCC GAGCGCGTCC
401 GATGGTAA
This corresponds to the amino acid sequence <SEQ ID 858; ORF131-1>:
1 MEIRAIKYTA MAALLAFTVA GCRLAGWYEC SSLTGWCKPR KPAAIDFWDI
51 GGESPPSLGD YEIPLSDGNR SVRANEYESA QQSYFYRKIG KFEACGLDWR
101 TRDGKPLIET FKQGGFDCLE KQGLRRNGLS ERVRW*
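The correspondence between a nucleotide listing and its amino acid sequence can be checked mechanically. The following is a minimal sketch, assuming the standard genetic code and reading frame 1; the helper names `clean` and `translate` are illustrative and not part of the source. It strips the position numbers and spacing used in this document's listings, translates SEQ ID 857, and compares the result with SEQ ID 858.

```python
import re

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}

def clean(listing):
    """Drop position numbers and whitespace from a listing laid out as in this document."""
    return re.sub(r"[^A-Za-z*]", "", listing).upper()

def translate(dna):
    """Translate frame 1 codon by codon; unknown codons become X, trailing bases are ignored."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, usable, 3))

SEQ_ID_857 = """
1 ATGGAAATTC GGGCAATAAA ATATACGGCA ATGGCTGCGT TGCTTGCATT
51 TACGGTTGCA GGCTGCCGGC TGGCGGGGTG GTATGAGTGT TCGTCCCTCA
101 CCGGCTGGTG TAAGCCGAGA AAACCGGCTG CCATCGATTT TTGGGATATT
151 GGCGGCGAGA GTCCGCCGTC TTTAGGGGAC TACGAGATAC CGCTTTCAGA
201 CGGCAATCGT TCCGTCAGGG CAAACGAATA TGAATCCGCA CAACAATCTT
251 ACTTTTACAG GAAAATAGGG AAGTTTGAAG CCTGCGGGCT GGATTGGCGT
301 ACGCGTGACG GCAAACCTTT GATTGAGACG TTCAAACAGG GAGGATTTGA
351 CTGCTTGGAA AAGCAGGGGT TGCGGCGCAA CGGTCTGTCC GAGCGCGTCC
401 GATGGTAA
"""
SEQ_ID_858 = """
1 MEIRAIKYTA MAALLAFTVA GCRLAGWYEC SSLTGWCKPR KPAAIDFWDI
51 GGESPPSLGD YEIPLSDGNR SVRANEYESA QQSYFYRKIG KFEACGLDWR
101 TRDGKPLIET FKQGGFDCLE KQGLRRNGLS ERVRW*
"""

protein = translate(clean(SEQ_ID_857))
print("translation matches listing:", protein == clean(SEQ_ID_858))
```

The same check applies to any of the full-length sequences in these examples. Partial sequences that contain ambiguity codes such as `.`, `r`, `w` or `y` would need extra handling, which this sketch leaves out.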
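Computer analysis of this amino acid sequence gave the following results:
Homology with a predicted ORF from N. meningitidis (strain A)
ORF131 shows 95.0% identity over a 121 aa overlap with an ORF (ORF131a) from strain A of N. meningitidis: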
10 20 30 40 50 60
orf131.pep MEIRAIKYTAMAALLAFTVAGCRLAGWYECSSLTGWCKPRKPAAIDFWDIGGESPPSLGD
|||||||||||||||||||||||||||||||||:|||||||||||||||||||||||| |
orf131a MEIRAIKYTAMAALLAFTVAGCRLAGWYECSSLSGWCKPRKPAAIDFWDIGGESPPSLED
10 20 30 40 50 60
70 80 90 100 110 120
orf131.pep YEIPLSDGNSSVRANEYESAQQSYFYRKIGKFEXCGLDWRTRDGKPLIETFKQGGFDCLE
||||||||| ||||||||||||||||||||||| ||||||||||||||||||| |||||:
orf131a YEIPLSDGNRSVRANEYESAQQSYFYRKIGKFEACGLDWRTRDGKPLIETFKQEGFDCLK
70 80 90 100 110 120
orf131.pep K
|
orf131a KQGLRRNGLSERVRWX
130
The complete length ORF131a nucleotide sequence <SEQ ID 859> is:
1 ATGGAAATTC GGGCAATAAA ATATACGGCA ATGGCTGCGT TGCTTGCATT
51 TACGGTTGCA GGCTGCCGGT TGGCAGGTTG GTATGAGTGT TCGTCCCTGT
101 CCGGCTGGTG TAAGCCGAGA AAACCTGCCG CCATCGATTT TTGGGATATT
151 GGCGGCGAGA GTCCTCCGTC TTTAGAGGAC TACGAGATAC CGCTTTCAGA
201 CGGCAATCGT TCCGTCAGGG CAAACGAATA TGAATCCGCA CAACAATCTT
251 ACTTTTACAG GAAAATAGGG AAGTTTGAAG CCTGCGGGTT GGATTGGCGT
301 ACGCGTGACG GCAAACCTTT GATTGAGACG TTCAAACAGG AAGGTTTTGA
351 TTGTTTGAAA AAGCAGGGGT TGCGGCGCAA CGGTCTGTCC GAGCGCGTCC
401 GATGGTAA
This encodes a protein having the amino acid sequence <SEQ ID 860>:
1 MEIRAIKYTA MAALLAFTVA GCRLAGWYEC SSLSGWCKPR KPAAIDFWDI
51 GGESPPSLED YEIPLSDGNR SVRANEYESA QQSYFYRKIG KFEACGLDWR
101 TRDGKPLIET FKQEGFDCLK KQGLRRNGLS ERVRW*
ORF131a and ORF131-1 show 97.0% identity over a 135 aa overlap:
orf131a.pep MEIRAIKYTAMAALLAFTVAGCRLAGWYECSSLSGWCKPRKPAAIDFWDIGGESPPSLED
|||||||||||||||||||||||||||||||||:|||||||||||||||||||||||| |
orf131-1 MEIRAIKYTAMAALLAFTVAGCRLAGWYECSSLTGWCKPRKPAAIDFWDIGGESPPSLGD
orf131a.pep YEIPLSDGNRSVRANEYESAQQSYFYRKIGKFEACGLDWRTRDGKPLIETFKQEGFDCLK
||||||||||||||||||||||||||||||||||||||||||||||||||||| |||||:
orf131-1 YEIPLSDGNRSVRANEYESAQQSYFYRKIGKFEACGLDWRTRDGKPLIETFKQGGFDCLE
orf131a.pep KQGLRRNGLSERVRWX
||||||||||||||||
orf131-1 KQGLRRNGLSERVRWX
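The identity figure quoted for this alignment can be reproduced by counting matching positions. The sketch below assumes two pre-aligned sequences of equal length with no gaps, which holds for ORF131a and ORF131-1 (stop codon excluded); `percent_identity` is a hypothetical helper name, and gapped alignments would need a different treatment.

```python
def percent_identity(a, b):
    """Percent of identical positions between two equal-length, pre-aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)

# SEQ ID 860 (ORF131a) and SEQ ID 858 (ORF131-1), without the stop codon.
ORF131A = (
    "MEIRAIKYTAMAALLAFTVAGCRLAGWYECSSLSGWCKPRKPAAIDFWDI"
    "GGESPPSLEDYEIPLSDGNRSVRANEYESAQQSYFYRKIGKFEACGLDWR"
    "TRDGKPLIETFKQEGFDCLKKQGLRRNGLSERVRW"
)
ORF131_1 = (
    "MEIRAIKYTAMAALLAFTVAGCRLAGWYECSSLTGWCKPRKPAAIDFWDI"
    "GGESPPSLGDYEIPLSDGNRSVRANEYESAQQSYFYRKIGKFEACGLDWR"
    "TRDGKPLIETFKQGGFDCLEKQGLRRNGLSERVRW"
)

print(f"{percent_identity(ORF131A, ORF131_1):.1f}% identity over {len(ORF131A)} aa")
```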
Homology with a predicted ORF from N. gonorrhoeae
ORF131 shows 89.3% identity over a 121 aa overlap with a predicted ORF (ORF131ng) from N. gonorrhoeae:
orf131.pep MEIRAIKYTAMAALLAFTVAGCRLAGWYECSSLTGWCKPRKPAAIDFWDIGGESPPSLGD 60
||||:||||| |||:||||||||||||||| ||:||||||||||||||||||||| || |
orf131ng MEIRVIKYTATAALFAFTVAGCRLAGWYECLSLSGWCKPRKPAAIDFWDIGGESPLSLED 60
orf131.pep YEIPLSDGNSSVRANEYESAQQSYFYRKIGKFEXCGLDWRTRDGKPLIETFKQGGFDCLE 120
||||||||| |||||||||||:||||||||||| |||||||||||||:| ||| ||||||
orf131ng YEIPLSDGNRSVRANEYESAQKSYFYRKIGKFEACGLDWRTRDGKPLVERFKQEGFDCLE 120
orf131.pep K 121
|
orf131ng KQGLRRNGLSERVRW 134
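The complete length ORF131ng nucleotide sequence <SEQ ID 861> is predicted to encode a protein having the amino acid sequence <SEQ ID 862>:
1 MEIRVIKYTA TAALFAFTVA GCRLAGWYEC LSLSGWCKPR KPAAIDFWDI
51 GGESPLSLED YEIPLSDGNR SVRANEYESA QKSYFYRKIG KFEACGLDWR
101 TRDGKPLVER FKQEGFDCLE KQGLRRNGLS ERVRW*
Further work revealed the following gonococcal DNA sequence <SEQ ID 863>: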
1 ATGGAAATTC GGGTAATAAA ATATACGGCA ACGGCTGCGT TGTTTGCATT
51 TACGGTTGCA GGCTGCCGGC TGGCGGGGTG GTATGAGTGT TCGTCCTTGT
101 CCGGCTGGTG TAAGCCGAGA AAACCTGCCG CCATCGATTT TTGGGATATT
151 GGCGGCGAGA GtccgctGTC TTTAGAGGAC TACGAGATAC CGCTTTCAGA
201 CGGCAATCGT TCCGTCAGGG CAAACGAATA TGAATCCGCG CAAAAATCTT
251 ACTTTTATAG GAAAATAGGG AAGTTTGAAG CCTGCGGGTT GGATTGGCGT
301 ACGCGTGACG GCAAACCTTT GGTTGAGAGG TTCAAACAGG AAGGTTTCGA
351 CTGTTTGGAA AAGCAGGGGT TGCGGCGCAA CGGCCTGTCC GAGCGCGTCC
401 GATGGTAA
This corresponds to the amino acid sequence <SEQ ID 864; ORF131ng-1>:
1 MEIRVIKYTA TAALFAFTVA GCRLAGWYEC SSLSGWCKPR KPAAIDFWDI
51 GGESPLSLED YEIPLSDGNR SVRANEYESA QKSYFYRKIG KFEACGLDWR
101 TRDGKPLVER FKQEGFDCLE KQGLRRNGLS ERVRW*
ORF131ng-1 and ORF131-1 show 92.6% identity over a 135 aa overlap:
orf131ng-1.pep MEIRVIKYTATAALFAFTVAGCRLAGWYECSSLSGWCKPRKPAAIDFWDIGGESPLSLED
||||:||||| |||:||||||||||||||||||:||||||||||||||||||||| || |
orf131-1 MEIRAIKYTAMAALLAFTVAGCRLAGWYECSSLTGWCKPRKPAAIDFWDIGGESPPSLGD
orf131ng-1.pep YEIPLSDGNRSVRANEYESAQKSYFYRKIGKFEACGLDWRTRDGKPLVERFKQEGFDCLE
|||||||||||||||||||||:|||||||||||||||||||||||||:| ||| ||||||
orf131-1 YEIPLSDGNRSVRANEYESAQQSYFYRKIGKFEACGLDWRTRDGKPLIETFKQGGFDCLE
orf131ng-1.pep KQGLRRNGLSERVRWX
||||||||||||||||
orf131-1 KQGLRRNGLSERVRWX
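Based on the presence of a predicted prokaryotic membrane lipoprotein lipid attachment site, it is predicted that this protein from N. meningitidis and N. gonorrhoeae, and its epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies.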
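The source does not name the tool used to predict the lipid attachment site. As an illustration, the sketch below searches the start of ORF131-1 with a lipobox-style regular expression modelled on the PROSITE prokaryotic lipoprotein signature. The exact pattern, the position limits (cysteine between residues 15 and 35) and the requirement for a Lys or Arg in the first seven residues are assumptions recalled from that signature, not taken from this document, and `find_lipid_attachment_site` is a hypothetical helper name.

```python
import re

# Assumed lipobox-style pattern: six residues that are not D/E/R/K, then a short
# hydrophobic/small stretch ending in the cysteine that would carry the lipid.
# The lookahead lets overlapping candidates be examined.
LIPOBOX = re.compile(r"(?=([^DERK]{6}[LIVMFWSTAG]{2}[LIVMFYSTAGCQ][AGS]C))")

def find_lipid_attachment_site(protein, min_pos=15, max_pos=35):
    """Return the 1-based position of the candidate cysteine, or None."""
    # A positively charged residue near the N-terminus is also expected.
    if not re.search(r"[KR]", protein[:7]):
        return None
    for match in LIPOBOX.finditer(protein):
        cys_pos = match.end(1)  # end of the captured motif = 1-based index of the C
        if min_pos <= cys_pos <= max_pos:
            return cys_pos
    return None

# First 50 residues of SEQ ID 858 (ORF131-1).
orf131_1_start = "MEIRAIKYTAMAALLAFTVAGCRLAGWYECSSLTGWCKPRKPAAIDFWDI"
print("candidate lipidated cysteine at position:", find_lipid_attachment_site(orf131_1_start))
```

A pattern hit of this kind only suggests a lipoprotein; it says nothing about whether the protein is actually processed or surface exposed.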
Example 102
The following partial DNA sequence was identified in N. meningitidis <SEQ ID 865>:
1 ATGAAACACA TCCATATTAT CGGTATCGGC GGCACGTTTA TGGGCGGGCT
51 TGCCGCCATT GCCAAAGAAG CGGGGTTTGA AGTCAGCGGT TGCGACGCGA
101 AGATGTATCC GCCGATGAGC ACCCAGCTCG AAGCCTTGGG TATAGACGTG
151 TATGAAGGCT TCGATGCCGC TCAGTTGGAC GAATTTAAAG CCGACGTTTA
201 CGTTATCGGC AATGTCGCCA AGCGCGGGAT GGATGTGGTT GAAGCGATTT
251 TGAACCTCGG CCTGCCtTAT ATtTcCGGCC CGCAATGGCT GTCGGAAAAC
301 GTGCTGCACC ATCATTGGGT ACTCGGTGTG GCGGGGACgC ACGGCAAAAC
351 GACCACCGCC TCCATGCTCG CATGGGTCTT GGAATATgCC GGCCTCGCGC
401 CGGGCTTCCT TATtGGCGGC GTACC.GGAA AATttCGGCG TTTCCGCCCG
451 CCTGCCGCAA ACGCCGCGCC AAGACCCGAA CAGCCAATCG CCGTTTTTcG
501 TCATCGAAGC CGACGAATAC GACACCGCCT TTtTCGACAA ACGTTCTAAA
551 TtCGTGCATT ACCGTCCGCG TACCGCCGTG TTGAACAATC TGGAATTCGA
601 CCACGCCGAC ATCTTTGCCG ACTTGGGCGC GATACAGACc CAGTTCCACT
651 ACCTCGTGCG TACCGTGCCG TCTGAAGGCT TAATCGTCTG CAACGGACGG
701 CAGCAAAGCC TGCAAGATAC TTTGGACAAA GGCTGCTGGA CGCCGGTGGA
751 AAAATTCGGC ACGGAACACG GCTGGCA..
This corresponds to the amino acid sequence <SEQ ID 866; ORF132>:
1 MKHIHIIGIG GTFMGGLAAI AKEAGFEVSG CDAKMYPPMS TQLEALGIDV
51 YEGFDAAQLD EFKADVYVIG NVAKRGMDVV EAILNLGLPY ISGPQWLSEN
101 VLHHHWVLGV AGTHGKTTTA SMLAWVLEYA GLAPGFLIGG VXGKFRRFRP
151 PAANAAPRPE QPIAVFRHRS RRIRHRLFRQ TFXIRALPSA YRRVEQSGIR
201 PRRHLCRLGR DTDPVPLPRA YRAVXRLNRL QRTAAKPARY FGQRLLDAGG
251 KIRHGTRLA..
Further work revealed the complete nucleotide sequence <SEQ ID 867>:
1 ATGAAACACA TCCATATTAT CGGTATCGGC GGCACGTTTA TGGGCGGGCT
51 TGCCGCCATT GCCAAAGAAG CGGGGTTTGA AGTCAGCGGT TGCGACGCGA
101 AGATGTATCC GCCGATGAGC ACCCAGCTCG AAGCCTTGGG TATAGACGTG
151 TATGAAGGCT TCGATGCCGC TCAGTTGGAC GAATTTAAAG CCGACGTTTA
201 CGTTATCGGC AATGTCGCCA AGCGCGGGAT GGATGTGGTT GAAGCGATTT
251 TGAACCTCGG CCTGCCTTAT ATTTCCGGCC CGCAATGGCT GTCGGAAAAC
301 GTGCTGCACC ATCATTGGGT ACTCGGTGTG GCGGGGACGC ACGGCAAAAC
351 GACCACCGCC TCCATGCTCG CATGGGTCTT GGAATATGCC GGCCTCGCGC
401 CGGGCTTCCT TATTGGCGGC GTACCGGAAA ATTTCGGCGT TTCCGCCCGC
451 CTGCCGCAAA CGCCGCGCCA AGACCCGAAC AGCCAATCGC CGTTTTTCGT
501 CATCGAAGCC GACGAATACG ACACCGCCTT TTTCGACAAA CGTTCTAAAT
551 TCGTGCATTA CCGTCCGCGT ACCGCCGTGT TGAACAATCT GGAATTCGAC
601 CACGCCGACA TCTTTGCCGA CTTGGGCGCG ATACAGACCC AGTTCCACTA
651 CCTCGTGCGT ACCGTGCCGT CTGAAGGCTT AATCGTCTGC AACGGACGGC
701 AGCAAAGCCT GCAAGATACT TTGGACAAAG GCTGCTGGAC GCCGGTGGAA
751 AAATTCGGCA CGGAACACGG CTGGCAGGCC GGCGAAGCCA ATGCCGACGG
801 CTCGTTCGAC GTGTTGCTCG ACGGCAAAAC CGCCGGACGC GTCAAATGGG
851 ATTTGATGGG CAGGCACAAC CGCATGAACG CGCTCGCCGT CATTGCCGCC
901 GCGCGTCATG TCGGTGTCGA TATTCAGACC GCCTGCGAAG CCTTGGGCGC
951 GTTTAAAAAC GTCAAACGCC GGATGGAAAT CAAAGGCACG GCAAACGGCA
1001 TCACCGTTTA CGACGACTTC GCCCACCACC CGACCGCCAT CGAAACCACG
1051 ATTCAAGGTT TGCGCCAACG CGTCGGCGGC GCGCGCATCC TCGCCGTCCT
1101 CGAACCGCGT TCCAACACGA TGAAGCTGGG CACGATGAAG TCCGCCCTGC
1151 CTGTAAGCCT CAAAGAAGCC GACCAAGTGT TCTGCTACGC CGGCGGCGTG
1201 GACTGGGACG TCGCCGAAGC CCTCGCGCCT TTGGGCGGCA GGCTGAACGT
1251 CGGCAAAGAC TTCGATGCCT TCGTTGCCGA AATCGTGAAA AACGCCGAAG
1301 TAGGCGACCA TATTTTGGTG ATGAGCAACG GCGGTTTCGG CGGAATACAC
1351 GGAAAGCTGC TGGAAGCTTT GAGATAG
This corresponds to the amino acid sequence <SEQ ID 868; ORF132-1>:
1 MKHIHIIGIG GTFMGGLAAI AKEAGFEVSG CDAKMYPPMS TQLEALGIDV
51 YEGFDAAQLD EFKADVYVIG NVAKRGMDVV EAILNLGLPY ISGPQWLSEN
101 VLHHHWVLGV AGTHGKTTTA SMLAWVLEYA GLAPGFLIGG VPENFGVSAR
151 LPQTPRQDPN SQSPFFVIEA DEYDTAFFDK RSKFVHYRPR TAVLNNLEFD
201 HADIFADLGA IQTQFHYLVR TVPSEGLIVC NGRQQSLQDT LDKGCWTPVE
251 KFGTEHGWQA GEANADGSFD VLLDGKTAGR VKWDLMGRHN RMNALAVIAA
301 ARHVGVDIQT ACEALGAFKN VKRRMEIKGT ANGITVYDDF AHHPTAIETT
351 IQGLRQRVGG ARILAVLEPR SNTMKLGTMK SALPVSLKEA DQVFCYAGGV
401 DWDVAEALAP LGGRLNVGKD FDAFVAEIVK NAEVGDHILV MSNGGFGGIH
451 GKLLEALR*
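Computer analysis of this amino acid sequence gave the following results:
Homology with the hypothetical protein o457 of E. coli (accession number U14003)
ORF132 and o457 show 58% aa identity over a 140 aa overlap: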
Orf132:4 IHIIGIGGTFMGGLAAIAKEAGFEVSGCDAKMYPPMSTQLEALGIDVYEGFDAAQLDEFK 63
IHI+GI GTFMGGLA +A++ G EV+G DA +YPPMST LE GI++ +G+DA+QL+ +
o457:3 IHILGICGTFMGGLAMLARQLGHEVTGSDANVYPPMSTLLEKQGIELIQGYDASQLEP-Q 61
Orf132:64 ADVYVIGNVAKRGMDVVEAILNLGLPYISGPQWLSENVLHHHWVLGVAGTHGKTTTASML 123
D+ +IGN RG VEA+L +PY+SGPQWL + VL WVL VAGTHGKTTTA M
o457:62 PDLVIIGNAMTRGNPCVEAVLEKNIPYMSGPQWLHDFVLRDRWVLAVAGTHGKTTTAGMA 121
Orf132:124 AWVLEYAGLAPGFLIGGVXG 143
W+LE G PGF+IGGV G
o457:122 TWILEQCGYKPGFVIGGVPG 141
Homology with a predicted ORF from N. meningitidis (strain A)
ORF132 shows 74.6% identity over a 189 aa overlap with an ORF (ORF132a) from strain A of N. meningitidis:
10 20 30 40 50 60
orf132.pep MKHIHIIGIGGTFMGGLAAIAKEAGFEVSGCDAKMYPPMSTQLEALGIDVYEGFDAAQLD
||||||||||||||||:|||||||||| |||||||||||||||||||| ||||||:||||
orf132a MKHIHIIGIGGTFMGGIAAIAKEAGFEXSGCDAKMYPPMSTQLEALGIGVYEGFDTAQLD
10 20 30 40 50 60
70 80 90 100 110 120
orf132.pep EFKADVYVIGNVAKRGMDVVEAILNLGLPYISGPQWLSENVLHHHWVLGVAGTHGKTTTA
||||||||||||||||||||||| |||||||||||||:|| ||||| |||| ||||||||
orf132a EFKADVYVIGNVAKRGMDVVEAILNRGLPYISGPQWLAENXLHHHWXLGVAXTHGKTTTA
70 80 90 100 110 120
130 140 150 160
orf132.pep SMLAWVLEYAGLAPGFLIGGVXGKFR---RFRPPAANAAPRPEQPI----------AVFR
||||||||||||||||||||| :| |: | : | ::|: | |
orf132a SMLAWVLEYAGLAPGFXIGGVPENFSVSARL-PQTPRQDPNSQSPFFVIEADEYDTAFFD
130 140 150 160 170
170 180 190 200 210 220
orf132.pep HRSRRIRHRLFRQTFXIRALPSAYRRVEQSGIRPRRHLCRLGRDTDPVPLPRAYRAVXRL
:||: :::|
orf132a KRSKFVHYRPRTAVLNNLEFDHADIFADLGAIQTQFHHLVRTVPSEGLIVCNGRQQSLQD
180 190 200 210 220 230
The complete length ORF132a nucleotide sequence <SEQ ID 869> is:
1 ATGAAACACA TCCACATTAT CGGTATCGGC GGCACGTTTA TGGGTGGGAT
51 TGCCGCCATT GCCAAAGAAG CAGGGTTTGA ANTCAGCGGT TGCGATGCGA
101 AGATGTATCC GCCGATGAGC ACCCAGCTCG AAGCCTTGGG CATAGGCGTG
151 TATGAAGGCT TCGACACCGC GCAGTTGGAC GAATTTAAAG CCGACGTTTA
201 CGTTATCGGC AATGTCGCCA AGCGCGGGAT GGATGTGGTT GAAGCGATTT
251 TGAACCGTGG GCTGCCTTAT ATTTCCGGCC CGCAATGGCT GGCTGAAAAC
301 NTGCTGCACC ATCATTGGNN ACTCGGCGTG GCGGNGACGC ACGGCAAAAC
351 GACCACCGCG TCTATGCTCG CGTGGGTTTT GGAATATGCC GGACTCGCAC
401 CGGGCTTCNT TATCGGCGGC GTACCGGAAA ACTTCAGCGT TTCCGCCCGC
451 CTGCCGCAAA CGCCGCGCCA AGACCCGAAC AGCCAATCGC CGTTTTTCGT
501 CATTGAAGCC GACGAATACG ACACCGCGTT TTTCGACAAA CGCTCCAAAT
551 TCGTGCATTA CCGTCCGCGT ACCGCCGTGT TGAACAATCT GGAATTCGAC
601 CACGCCGACA TCTTCGCCGA TTTGGGCGCG ATACAGACCC AGTTCCACCA
651 CCTCGTGCGT ACCGTGCCGT CTGAAGGCCT CATCGTCTGC AACGGACGGC
701 AGCAAAGCCT GCAAGACACT TTGGACAAAG GCTGCTGGAC GCCGGTGGAA
751 AAATTCGGCA CGGAACACGG CTGGCAGGCC GGCGAAGCCA ATGCCGATGG
801 CTCGTTCGAC GTGTTGCTTG ACGGCAAAAA AGCCGGACAC GTCGCTTGGA
851 GTTTGATGGG CGGACACAAC CGCATGAACG CGCTCGCNGT CATCGCCGCC
901 GCGCGTCATG CCGGAGTNGA CATTCAGACG GCCTGCGAAG CCTTGAGCAC
951 GTTTAAAAAC GTCAAACGCC GCATGGAAAT CAAAGGCACG GCAAACGGTA
1001 TCACCGTTTA CGACGACTTC GCCCACCATC CGACCGCTAT CGAAACCACG
1051 ATTCAAGGTT TGCGCCAGCG CGTCGGCGGC GCGCGCATCC TCGCCGTCCT
1101 CGAACCGCGT TCCAATACGA TGAAGCTGGG TACGATGAAA GCCGCCCTGC
1151 CCGCAAGCCT CAAAGAAGCC GACCAAGTGT TCTGNTACGC CGGCGGCGCG
1201 GACTGGGACG TTGCCGAAGC CCTCGCGCCT TTGGGCGGCA GGCTGCACGT
1251 CGGCAAAGAC TTCGATGCCT TCGTTGCCGA AATCGTGAAA AACGCCGAAG
1301 CAGGCGACCA TATTTTGGTG ATGAGCAACG GCGGTTTCGG CGGAATACAC
1351 ACCAAACTGC TGGACGCTTT GAGATAG
This encodes a protein having the amino acid sequence <SEQ ID 870>:
1 MKHIHIIGIG GTFMGGIAAI AKEAGFEXSG CDAKMYPPMS TQLEALGIGV
51 YEGFDTAQLD EFKADVYVIG NVAKRGMDVV EAILNRGLPY ISGPQWLAEN
101 XLHHHWXLGV AXTHGKTTTA SMLAWVLEYA GLAPGFXIGG VPENFSVSAR
151 LPQTPRQDPN SQSPFFVIEA DEYDTAFFDK RSKFVHYRPR TAVLNNLEFD
201 HADIFADLGA IQTQFHHLVR TVPSEGLIVC NGRQQSLQDT LDKGCWTPVE
251 KFGTEHGWQA GEANADGSFD VLLDGKKAGH VAWSLMGGHN RMNALAVIAA
301 ARHAGVDIQT ACEALSTFKN VKRRMEIKGT ANGITVYDDF AHHPTAIETT
351 IQGLRQRVGG ARILAVLEPR SNTMKLGTMK AALPASLKEA DQVFXYAGGA
401 DWDVAEALAP LGGRLHVGKD FDAFVAEIVK NAEAGDHILV MSNGGFGGIH
451 TKLLDALR*
ORF132a and ORF132-1 show 93.9% identity over a 458 aa overlap:
orf132a.pep MKHIHIIGIGGTFMGGIAAIAKEAGFEXSGCDAKMYPPMSTQLEALGIGVYEGFDTAQLD
|||||||||||||||||||:||||||| |||||||||||||||||||| ||||||:||||
orf132-1 MKHIHIIGIGGTFMGGLAAIAKEAGFEVSGCDAKMYPPMSTQLEALGIDVYEGFDAAQLD
orf132a.pep EFKADVYVIGNVAKRGMDVVEAILNRGLPYISGPQWLAENXLHHHWXLGVAXTHGKTTTA
||||||||||||||||||||||||| |||||||||||:|| ||||| |||||||||||||
orf132-1 EFKADVYVIGNVAKRGMDVVEAILNLGLPYISGPQWLSENVLHHHWVLGVAGTHGKTTTA
orf132a.pep SMLAWVLEYAGLAPGFXIGGVPENFSVSARLPQTPRQDPNSQSPFFVIEADEYDTAFFDK
|||||||||||||||| ||||||||:||||||||||||||||||||||||||||||||||
orf132-1 SMLAWVLEYAGLAPGFLIGGVPENFGVSARLPQTPRQDPNSQSPFFVIEADEYDTAFFDK
orf132a.pep RSKFVHYRPRTAVLNNLEFDHADIFADLGAIQTQFHHLVRTVPSEGLIVCNGRQQSLQDT
||||||||||||||||||||||||||||||||||||:|||||||||||||||||||||||
orf132-1 RSKFVHYRPRTAVLNNLEFDHADIFADLGAIQTQFHYLVRTVPSEGLIVCNGRQQSLQDT
orf132a.pep LDKGCWTPVEKFGTEHGWQAGEANADGSFDVLLDGKKAGHVAWSLMGGHNRMNALAVIAA
|||||||||||||||||||||||||||||||||||| ||:| |:||| ||||||||||||
orf132-1 LDKGCWTPVEKFGTEHGWQAGEANADGSFDVLLDGKTAGRVKWDLMGRHNRMNALAVIAA
orf132a.pep ARHAGVDIQTACEALSTFKNVKRRMEIKGTANGITVYDDFAHHPTAIETTIQGLRQRVGG
|||:|||||||||||::|||||||||||||||||||||||||||||||||||||||||||
orf132-1 ARHVGVDIQTACEALGAFKNVKRRMEIKGTANGITVYDDFAHHPTAIETTIQGLRQRVGG
orf132a.pep ARILAVLEPRSNTMKLGTMKAALPASLKEADQVFXYAGGADWDVAEALAPLGGRLHVGKD
||||||||||||||||||||:|||:||||||||| |:||||||||||||||||||:||||
orf132-1 ARILAVLEPRSNTMKLGTMKSALPVSLKEADQVFCYAGGVDWDVAEALAPLGGRLNVGKD
orf132a.pep FDAFVAEIVKNAEAGDHILVMSNGGFGGIHTKLLDALRX
|||||||||||:|||||||||||||||||| |||:||||
orf132-1 FDAFVAEIVKNAEVGDHILVMSNGGFGGIHGKLLEALRX
Homology with a predicted ORF from N. gonorrhoeae
ORF132 shows 89.6% identity over a 259 aa overlap with a predicted ORF (ORF132ng) from N. gonorrhoeae:
orf132.pep MKHIHIIGIGGTFMGGLAAIAKEAGFEVSGCDAKMYPPMSTQLEALGIDVYEGFDAAQLD 60
||||||||||||||||:|||||||||:||||||||||||||||||||| |:||||||||:
orf132ng MKHIHIIGIGGTFMGGIAAIAKEAGFKVSGCDAKMYPPMSTQLEALGIGVHEGFDAAQLE 60
orf132.pep EFKADVYVIGNVAKRGMDVVEAILNLGLPYISGPQWLSENVLHHHWVLGVAGTHGKTTTA 120
||:||:|||||||:||||||||| |||||||||||||:||||||||||||||||||||||
orf132ng EFQADIYVIGNVARRGMDVVEAILNRGLPYISGPQWLAENVLHHHWVLGVAGTHGKTTTA 120
orf132.pep SMLAWVLEYAGLAPGFLIGGVXGKFRRFRPPAANAAPRPEQPIAVFRHRSRRIRHRLFRQ 180
||||||||||||||||||||| |||||||||:|||| |||| ||||||||||||||||||
orf132ng SMLAWVLEYAGLAPGFLIGGVPGKFRRFRPPTANAASRPEQQIAVFRHRSRRIRHRLFRQ 180
orf132.pep TFXIRALPSAYRRVEQSGIRPRRHLCRLGRDTDPVPLPRAYRAVXRLNRLQRTAAKPARY 240
|: |||| ||||||||||||||||| |||||||||| |||:|:: | :||||||||||||
orf132ng TLQIRALSPAYRRVEQSGIRPRRHLRRLGRDTDPVPPPRAHRTIRRPHRLQRTAAKPARY 240
orf132.pep FGQRLLDAGGKIRHGTRLA 259
|||||||||||||| ||||
orf132ng FGQRLLDAGGKIRHRTRLADW 261
The predicted ORF132ng nucleotide sequence <SEQ ID 871> encodes a protein having the amino acid sequence <SEQ ID 872>:
1 MKHIHIIGIG GTFMGGIAAI AKEAGFKVSG CDAKMYPPMS TQLEALGIGV
51 HEGFDAAQLE EFQADIYVIG NVARRGMDVV EAILNRGLPY ISGPQWLAEN
101 VLHHHWVLGV AGTHGKTTTA SMLAWVLEYA GLAPGFLIGG VPGKFRRFRP
151 PTANAASRPE QQIAVFRHRS RRIRHRLFRQ TLQIRALSPA YRRVEQSGIR
201 PRRHLRRLGR DTDPVPPPRA HRTIRRPHRL QRTAAKPARY FGQRLLDAGG
251 KIRHRTRLAD W*
Further work revealed the following gonococcal DNA sequence <SEQ ID 873>:
1 ATGAAACACA TCCACATTAT CGGTATCGGC GGCACGTTTA TGGGCGGGAT
51 TGCCGCCATT GCCAAAGAAG CCGGGTTCAA AGTCAGCGGT TGCGACGCGA
101 AGATGTATCC GCCGATGAGC ACCCAGCTCG AAGCCTTGGG CATAGGCGTA
151 CACGAAGGCT TCGATGCCGC GCAGTTGGAA GAATTTCAAG CCGATATTTA
201 CGTCATCGGC AATGTCGCCA GGCGCGGGAT GGATGTGGTC GAGGCGATTT
251 TGAACCGTGG GCTGCCTTAT ATTTCCGGCC CGCAATGGCT GGCTGAAAac
301 GTGCtgcacc atcaTTGGgt ACTCGGCGTG GcagggaCGC ACGGcaaAac
351 gaccaCcGcg tCCATGCTCG CCTGGGTCTT GGAATATGCC GGACTCGCGC
401 CGGGCTTCCT CATCGGCGGt gtaccggaAA ATTTCGGCGT TTCCGCCCGC
451 CTACCGCAAA CGCCGCGTCA AGACCCGAAC AGCAAATCGC CGTTTTTCGT
501 CATCGAAGCC GACGAATACG ACACCGCCTT TTTCGACAAA CGCTCCAAAT
551 TCGTGCATTA TCGCCCGCGT ACCGCCGTGT TGAACAATCT GGAATTCGAC
601 CACGCCGACA TCTTCGCCGA CTTGGGCGCG ATACAGACCC AGTTCCACCA
651 CCTCGTGCGC ACCGTACCAT CCGAAGGCCT CATCGTCTGC AACGGACAGC
701 AGCAAAGCCT GCAAGATACT TTGGACAAAG GCTGCTGGAC GCCGGTGGAA
751 AAATTCGGCA CCGGACACGG CTGGCAGATT GGTGAAGTCA ATGCCGACGG
801 CTCGTTCGAC GTATTGCTTG ACGGCAAAAA AGCCGGACAC GTCGCATGGG
851 ATTTGATGGG CGGACACAAC CGCATGAACG CGCTCGCCGT CATCGCTGCC
901 GCACGCCATG CCGGAGTCGA TGTTCAGACG GCCTGCGAAG CCTTGGGTGC
951 GTTTAAAAAC GTCAAACGCC GCATGGAAAT CAAAGGCACG GCAAACGGCA
1001 TCACCGTTTA CGACGATTTC GCCCACCACC CGACCGCCAT CGAAACCACG
1051 ATTCAAGGTT TGCGCCAACG TGTCGGCGGC GCGCGCATCC TCGCCGTCCT
1101 CGAGCCGCGT TCCAACACCA TGAAACTCGG CACGATGAAG TCCGCCCTGC
1151 CCGCAAGCCT CAAAGAAGCC GACCAAGTGT TCTGCTACGC CGGCGGCGCG
1201 GACTGGGACG TTGCCGAAGC CCTCGCGCCT TTGGGCTGCA GGCTGCGCGT
1251 CGGTAAAGAT TTCGATACCT TCGTTGCCGA AATTGTGAAA AACGCCCGAA
1301 CCGGCGACCA TATTTTGGTG ATGAGCAACG GCGGTTTCGG CGGAATACAC
1351 ACCAAACTGC TGGACGCTTT GAGATAG
This corresponds to the amino acid sequence <SEQ ID 874; ORF132ng-1>:
1 MKHIHIIGIG GTFMGGIAAI AKEAGFKVSG CDAKMYPPMS TQLEALGIGV
51 HEGFDAAQLE EFQADIYVIG NVARRGMDVV EAILNRGLPY ISGPQWLAEN
101 VLHHHWVLGV AGTHGKTTTA SMLAWVLEYA GLAPGFLIGG VPENFGVSAR
151 LPQTPRQDPN SKSPFFVIEA DEYDTAFFDK RSKFVHYRPR TAVLNNLEFD
201 HADIFADLGA IQTQFHHLVR TVPSEGLIVC NGQQQSLQDT LDKGCWTPVE
251 KFGTGHGWQI GEVNADGSFD VLLDGKKAGH VAWDLMGGHN RMNALAVIAA
301 ARHAGVDVQT ACEALGAFKN VKRRMEIKGT ANGITVYDDF AHHPTAIETT
351 IQGLRQRVGG ARILAVLEPR SNTMKLGTMK SALPASLKEA DQVFCYAGGA
401 DWDVAEALAP LGCRLRVGKD FDTFVAEIVK NARTGDHILV MSNGGFGGIH
451 TKLLDALR*
ORF132ng-1 and ORF132-1 show 93.2% identity over a 458-amino-acid overlap:
orf132ng-1.pep MKHIHIIGIGGTFMGGIAAIAKEAGFKVSGCDAKMYPPMSTQLEALGIGVHEGFDAAQLE
|||||||||||||||||||:||||||:||||||||||||||||||||| |:||||||||:
orf132-1 MKHIHIIGIGGTFMGGLAAIAKEAGFEVSGCDAKMYPPMSTQLEALGIDVYEGFDAAQLD
orf132ng-1.pep EFQADIYVIGNVARRGMDVVEAILNRGLPYISGPQWLAENVLHHHWVLGVAGTHGKTTTA
||:||:|||||||:||||||||||| |||||||||||:||||||||||||||||||||||
orf132-1 EFKADVYVIGNVAKRGMDVVEAILNLGLPYISGPQWLSENVLHHHWVLGVAGTHGKTTTA
orf132ng-1.pep SMLAWVLEYAGLAPGFLIGGVPENFGVSARLPQTPRQDPNSKSPFFVIEADEYDTAFFDK
|||||||||||||||||||||||||||||||||||||||||:||||||||||||||||||
orf132-1 SMLAWVLEYAGLAPGFLIGGVPENFGVSARLPQTPRQDPNSQSPFFVIEADEYDTAFFDK
orf132ng-1.pep RSKFVHYRPRTAVLNNLEFDHADIFADLGAIQTQFHHLVRTVPSEGLIVCNGQQQSLQDT
||||||||||||||||||||||||||||||||||||:|||||||||||||||:|||||||
orf132-1 RSKFVHYRPRTAVLNNLEFDHADIFADLGAIQTQFHYLVRTVPSEGLIVCNGRQQSLQDT
orf132ng-1.pep LDKGCWTPVEKFGTGHGWQIGEVNADGSFDVLLDGKKAGHVAWDLMGGHNRMNALAVIAA
|||||||||||||| |||| ||:||||||||||||| ||:| |||| |||||||||||||
orf132-1 LDKGCWTPVEKFGTEHGWQAGEANADGSFDVLLDGKTAGRVKWDLMGRHNRMNALAVIAA
orf132ng-1.pep ARHAGVDVQTACEALGAFKNVKRRMEIKGTANGITVYDDFAHHPTAIETTIQGLRQRVGG
|||:|||:||||||||||||||||||||||||||||||||||||||||||||||||||||
orf132-1 ARHVGVDIQTACEALGAFKNVKRRMEIKGTANGITVYDDFAHHPTAIETTIQGLRQRVGG
orf132ng-1.pep ARILAVLEPRSNTMKLGTMKSALPASLKEADQVFCYAGGADWDVAEALAPLGCRLRVGKD
|||||||||||||||||||||:||||||||||||||:||||||||||||||| || ||||
orf132-1 ARILAVLEPRSNTMKLGTMKSALPVSLKEADQVFCYAGGVDWDVAEALAPLGGRLNVGKD
orf132ng-1.pep FDTFVAEIVKNARTGDHILVMSNGGFGGIHTKLLDALRX
||:|||||||||::|||||||||||||||| |||:||||
orf132-1 FDAFVAEIVKNAEVGDHILVMSNGGFGGIHGKLLEALRX
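In addition, ORF132ng-1 is homologous to a hypothetical E. coli protein:
pir||S56459 hypothetical protein o457 - Escherichia coli >gi|537075 (U14003)
ORF_o457 [Escherichia coli] >gi|1790680 (AE000494) hypothetical 48.5 kD protein in fbp-pmba intergenic region [Escherichia coli] Length = 457
Score = 474 bits (1207), Expect = e-133
Identities = 249/439 (56%), Positives = 294/439 (66%), Gaps = 13/439 (2%)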
Query: 22 KEAGFKVSGCDAKMYPPMSTQLEALGIGVHEGFDAAQLEEFQADIYVIGNVARRGMDVVE 81
++ G +V+G DA +YPPMST LE GI + +G+DA+QLE Q D+ +IGN RG VE
Sbjct: 21 RQLGHEVTGSDANVYPPMSTLLEKQGIELIQGYDASQLEP-QPDLVIIGNAMTRGNPCVE 79
Query: 82 AILNRGLPYISGPQWLAENVLHHHWVLGVAGTHGKTTTASMLAWVLEYAGLAPGFLIGGV 141
A+L + +PY+SGPQWL + VL WVL VAGTHGKTTTA M W+LE G PGF+IGGV
Sbjct: 80 AVLEKNIPYMSGPQWLHDFVLRDRWVLAVAGTHGKTTTAGMATWILEQCGYKPGFVIGGV 139
Query: 142 PENFGVSARLPQTPRQDPNSKSPFFVIEADEYDTAFFDKRSKFVHYRPRTAVLNNLEFDH 201
P NF VSA L +S FFVIEADEYD AFFDKRSKFVHY PRT +LNNLEFDH
Sbjct: 140 PGNFEVSAHL---------GESDFFVIEADEYDCAFFDKRSKFVHYCPRTLILNNLEFDH 190
Query: 202 ADIFADLGAIQTQFHHLVRTVPSEGLIVCNGQQQSLQDTLDKGCWTPVEKFGTGHGWQIG 261
ADIF DL AIQ QFHHLVR VP +G I+ +L+ T+ GCW+ E G WQ
Sbjct: 191 ADIFDDLKAIQKQFHHLVRIVPGQGRIIWPENDINLKQTMAMGCWSEQELVGEQGHWQAK 250
Query: 262 EVNADGS-FDVLLDGKKAGHVAWDLMGGHNRMNALAVIAAARHAGVDVQTACEALGAFKN 320
++ D S ++VLLDG+K G V W L+G HN N L IAAARH GV A ALG+F N
Sbjct: 251 KLTTDASEWEVLLDGEKVGEVKWSLVGEHNMHNGLMAIAAARHVGVAPADAANALGSFIN 310
Query: 321 VKRRMEIKGTANGITVYDDFAHHPTAIETTIQGLRQRVGG-ARILAVLEPRSNTMKLGTM 379
+RR+E++G ANG+TVYDDFAHHPTAI T+ LR +VGG ARI+AVLEPRSNTMK+G
Sbjct: 311 ARRRLELRGEANGYTVYDDFAHHPTAILATLAALRGKVGGTARIIAVLEPRSNTMKMGIC 370
Query: 380 KSALPASLKEADQVF-CYAGGADWDVAEALAPLGCRLRVGKDFDTFVAEIVKNARTGDHI 438
K L SL AD+VF W VAE D DT +VK A+ GDHI
Sbjct: 371 KDDLAPSLGRADEVFLLQPAHIPWQVAEVAEACVQPAHWSGDVDTLADMVVKTAQPGDHI 430
Query: 439 LVMSNGGFGGIHTKLLDAL 457
LVMSNGGFGGIH KLLD L
Sbjct: 431 LVMSNGGFGGIHQKLLDGL 449
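The percentages in the database-search summary above (identities 249/439, positives 294/439, gaps 13/439) are consistent with integer truncation rather than rounding: 294/439 is about 66.97%, yet it is reported as 66%. The short Python sketch below reproduces the three reported figures under that assumption; the helper name `truncated_percent` is hypothetical, and the truncation rule is inferred from the numbers, not stated in the document.

```python
def truncated_percent(count: int, length: int) -> int:
    """Integer percentage obtained by discarding the fractional part.

    Assumption: the search program reports floor(100 * count / length).
    """
    if length <= 0:
        raise ValueError("alignment length must be positive")
    return (100 * count) // length


alignment_length = 439
reported = {
    "identities": truncated_percent(249, alignment_length),
    "positives": truncated_percent(294, alignment_length),
    "gaps": truncated_percent(13, alignment_length),
}
```

One consequence is that a reported value such as 56% may stand for anything from 56.0% up to just under 57%, which matters when comparing these figures with the one-decimal identities quoted for the Neisseria alignments.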
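On the basis of this analysis, it is predicted that these proteins from N. meningitidis and N. gonorrhoeae, and their epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies.
As described above, ORF132-1 (26.4 kDa) was cloned into the pET and pGEX vectors and expressed in E. coli. The products of protein expression and purification were analysed by SDS-PAGE. Figure 20A shows the results of affinity purification of the His-fusion protein, and Figure 20B shows the results of expression of the GST-fusion in E. coli. Purified His-fusion protein was used to immunise mice, and the mouse sera were used for FACS analysis (Figure 20C) and ELISA (positive result). These experiments confirm that ORF132 is a surface-exposed protein and a useful immunogen.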
Example 103
The following partial DNA sequence was identified in N. meningitidis <SEQ ID 875>:
1 ..CCGGGCTATT ACGGCTCGGA TGACGAATTT AAGCGGGCAT TCGGAGAAAA
51 CTCGCCGACA TmCAAGAAAC ATTGCAACCG GAGCTGCGGG ATTTATGAAC
101 CCGTATTGAA AAAATACGGC AAAAAGCGCG CCAACAACCA TTCGGTCAGC
151 ATTAGTGCGG ACTTCGGCGA TTATTTCATG CCGTTCGCCA GCTATTCGCG
201 CACACACCGT ATGCCCAACA TCCAAGAAAT GTATTTTTCC CAAATCGGCG
251 ACTCCGGCGT TCACACCGCC TTAAAACCAG AGCGCGCAAA CACTTGGCAA
301 TTTGGCTTCr ATACCTATAA AAAAGGATTG TTAAAACAAG ATGATACATT
351 AGGATTAAAA CTGGTCGGCT ACCGCAGCCG CATCGACAAC TACATCCACA
401 ACGTTTACGG GAAATGGTGG GATTTGAACG GGGATATTCC GAGCTGGGTC
451 AGCAGCACCG GGCTTGCCTA CACCATCCAA CATCGCrATT TCAwAGACAA
501 AGTGCATCAA nnnnnnnnnn nnnnnnnnnn nnnnTACGAT TATGGGCGTT
551 TTTTCACCAA CCTTTCTTAC GCCTATCAAA AAAGCACGCA ACCGACCAAC
601 TTCAGCGATG CGAGCGAATC GCCCAACAAT GCGTCCAAAG AAGACCAACT
651 CAAACAAGGT TATGGGTTGA GCAGGGTTTC CGCCCTGCCG CGAGATTACG
701 GACGTTTGGA AGTCGGTACG CGCTGGTTGG GCAACAAACT GACTTTGGGC
751 GGCGCGATGC GCTATTTCGG CAAGAGCATC CGCGCGACGG CTGAAGAACG
801 CTATATCGAC GGCACCAACG GGGGAAATAC CAGCAATTTC CGGCAACTGG
851 GCAAGCGTTC CATCAAACAA ACCGAAACTC TTGCCCGCCA GCCTTTGATT
901 TTwGATTTTa ACGCCGCTTA CGAGCCGAAG AAAAACCTTA TTTTCCGCGC
951 CGAAGTCAAA AATCTGTTCG ACAGGCGTTA TATCGATCCG CTCGATGCGG
1001 GCAATGATGC GGCAAC.GAG CGTTATTACA GCTCGTTCGA CCCGAAAGAC
1051 AAGGACrrAG ACGTAACGTG TAATGCTGAT AAAACGTTGT GCaACGGCAA
1101 ATACGGCGGC ACAAGCAAAA GCGTATTGAC CAATTTTGCA CGCGGACGCA
1151 CCTTTTTgAT GACGATGAGC TACAAGTTTT AA
This corresponds to the amino acid sequence <SEQ ID 876; ORF133>:
1 ..PGYYGSDDEF KRAFGENSPT XKKHCNRSCG IYEPVLKKYG KKRANNHSVS
51 ISADFGDYFM PFASYSRTHR MPNIQEMYFS QIGDSGVHTA LKPERANTWQ
101 FGFXTYKKGL LKQDDTLGLK LVGYRSRIDN YIHNVYGKWW DLNGDIPSWV
151 SSTGLAYTIQ HRXFXDKVHQ XXXXXXXXYD YGRFFTNLSY AYQKSTQPTN
201 FSDASESPNN ASKEDQLKQG YGLSRVSALP RDYGRLEVGT RWLGNKLTLG
251 GAMRYFGKSI RATAEERYID GTNGGNTSNF RQLGKRSIKQ TETLARQPLI
301 XDFNAAYEPK KNLIFRAEVK NLFDRRYIDP LDAGNDAAXE RYYSSFDPKD
351 KDXDVTCNAD KTLCNGKYGG TSKSVLTNFA RGRTFLMTMS YKF*
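The `X` residues in the ORF133 protein line up with ambiguous bases in the DNA listing (for example `m`, `r`, `w` and runs of `n`). The Python sketch below shows one plausible way such positions arise: a codon is translated to `X` whenever its IUPAC ambiguity codes allow more than one amino acid. This is an illustration under stated assumptions, not the procedure used in the document; the names `translate_codon` and `translate` are hypothetical, and the standard codon assignments are assumed.

```python
from itertools import product

# Standard codon assignments, enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}

# IUPAC nucleotide ambiguity codes mapped to the bases they can stand for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "M": "AC", "K": "GT", "S": "CG", "W": "AT",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def translate_codon(codon: str) -> str:
    """Return the amino acid for a codon, or 'X' if it is not unique."""
    codon = codon.upper()
    if len(codon) != 3 or any(base not in IUPAC for base in codon):
        return "X"  # unreadable position such as '.'
    residues = {
        CODON_TABLE[a + b + c]
        for a in IUPAC[codon[0]]
        for b in IUPAC[codon[1]]
        for c in IUPAC[codon[2]]
    }
    return residues.pop() if len(residues) == 1 else "X"


def translate(dna: str) -> str:
    """Translate a DNA string in frame 1, ignoring spaces and case."""
    dna = "".join(dna.split())
    return "".join(
        translate_codon(dna[i:i + 3]) for i in range(0, len(dna) - 2, 3)
    )


# Codons spanning the first ambiguous base of SEQ ID 875 (TmC is TAC or TCC).
fragment = translate("CCGACA TmCAAGAAA")
```

In this fragment `TmC` can be read as `TAC` (Tyr) or `TCC` (Ser), so it is reported as `X`; in the later, unambiguous ORF133-1 sequence the same position reads `TAC`, giving Tyr.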
Further work revealed the partial DNA sequence <SEQ ID 877>:
1 GAGGCGCAGA TACAGGTTTT GGAAGATGTG CACGTCAAGG CGAAGCGCGT
51 ACCGAAAGAC AAAAAAGTGT TTACCGATGC GCGTGCCGTA TCGACCCGTC
101 AGGATATATT CAAATCCAGC GAAAACCTCG ACAACATCGT ACGCAGCATC
151 CCCGGTGCGT TTACACAGCA AGATAAAAGC TCGGGCATTG TGTCTTTGAA
201 TATTCGCGGC GACAGCGGGT TCGGGCGGGT CAATACGATG GTGGACGGCA
251 TCACGCAGAC CTTTTATTCG ACTTCTACCG ATGCGGGCAG GGCAGGCGGT
301 TCATCTCAAT TCGGTGCATC TGTCGACAGC AATTTTATTG CCGGACTGGA
351 TGTCGTCAAA GGCAGCTTCA GCGGCTCGGC AGGCATCAAC AGCCTTGCCG
401 GTTCGGCGAA TCTGCGGACT TTAGGCGTGG ATGACGTCGT TCAGGGCAAT
451 AATACCTACG GCCTGCTGCT AAAAGGTCTG ACCGGCACCA ATTCAACCAA
501 AGGTAATGCG ATGGCGGCGA TAGGTGCGCG CAAATGGCTG GAAAGCGGAG
551 CATCTGTCGG TGTGCTTTAC GGGCACAGCA GGCGCAGCGT GGCGCAAAAT
601 TACCGCGTGG GCGGCGGCGG GCAGCACATC GGAAATTTTG GCGCGGAATA
651 TTTGGAACGG CGCAAGCAGC GATATTTTGT ACAAGAGGGT GCTTTGAAAT
701 TCAATTCCGA CAGCGGAAAA TGGGAGCGGG ATTTACAAAG GCAACAGTGG
751 AAATACAAGC CGTATAAAAA TTACAACAAC CAAGAACTAC AaAAATACAT
801 CGAAGAGCAT GACAAAAGCT GGCGGGAAAA CCTg.CaCCG CAATACGACA
851 TTACCCCCAT CGATCCGTCC AGCCTGAAGC AGCAGTCGGC AGGCAATCTG
901 TTTAAATTGG AATACGACGG CGTATTCAAT AAATACACGG CGCAATTTCG
951 CGATTTAAAC ACCAAAATCG GCAGCCGCAA AATCATCAAC CGCAATTATC
1001 AGTTCAATTA CGGTTTGTCT TTGAACCCGT ATACCAACCT CAATCTGACC
1051 GCAGCCTACA ATTCGGGCAG GCAGAAATAT CCGAAAGGGT CGAAGTTTAC
1101 AGGCTGGGGG CTTTTAAAGG ATTTTGAAAC CTACAACAAC GCGAAAATCC
1151 TCGACCTCAA CAACACCGCC ACCTTCCGGC TGCCCCGCGA AACCGAGTTG
1201 CAAACCACTT TGGGCTTCAA TTATTTCCAC AACGAATACG GCAAAAACCG
1251 CTTTCCTGAA GAATTGGGGC TGTTTTTCGA CGGTCCTGAT CAGGACAACG
1301 GGCTTTATTC CTATTTGGGG CGGTTTAAGG GCGATAAAGG GCTGCTGCCC
1351 CAAAAATCAA CCATTGTCCA ACCGGCCGGC AGCCAATATT TCAACACGTT
1401 CTACTTCGAT GCCGCGCTCA AAAAAGACAT TTACCGCTTA AACTACAGCA
1451 CCAATACCGT CGGCTACCGT TTCGGCGGCG AATATACGGG CTATTACGGC
1501 TCGGATGACG AATTTAAGCG GGCATTCGGA GAAAACTCGC CGACATACAA
1551 GAAACATTGC AACCGGAGCT GCGGGATTTA TGAACCCGTA TTGAAAAAAT
1601 ACGGCAAAAA GCGCGCCAAC AACCATTCGG TCAGCATTAG TGCGGACTTC
1651 GGCGATTATT TCATGCCGTT CGCCAGCTAT TCGCGCACAC ACCGTATGCC
1701 CAACATCCAA GAAATGTATT TTTCCCAAAT CGGCGACTCC GGCGTTCACA
1751 CCGCCTTAAA ACCAGAGCGC GCAAACACTT GGCAATTTGG CTTCAATACC
1801 TATAAAAAAG GATTGTTAAA ACAAGATGAT ACATTAGGAT TAAAACTGGT
1851 CGGCTACCGC AGCCGCATCG ACAACTACAT CCACAACGTT TACGGGAAAT
1901 GGTGGGATTT GAACGGGGAT ATTCCGAGCT GGGTCAGCAG CACCGGGCTT
1951 GCCTACACCA TCCAACATCG CAATTTCAAA GACAAAGTGC ACAAACACGG
2001 TTTTGAGTTG GAGCTGAATT ACGATTATGG GCGTTTTTTC ACCAACCTTT
2051 CTTACGCCTA TCAAAAAAGC ACGCAACCGA CCAACTTCAG CGATGCGAGC
2101 GAATCGCCCA ACAATGCGTC CAAAGAAGAC CAACTCAAAC AAGGTTATGG
2151 GTTGAGCAGG GTTTCCGCCC TGCCGCGAGA TTACGGACGT TTGGAAGTCG
2201 GTACGCGCTG GTTGGGCAAC AAACTGACTT TGGGCGGCGC GATGCGCTAT
2251 TTCGGCAAGA GCATCCGCGC GACGGCTGAA GAACGCTATA TCGACGGCAC
2301 CAACGGGGGA AATACCAGCA ATTTCCGGCA ACTGGGCAAG CGTTCCATCA
2351 AACAAACCGA AACTCTTGCC CGCCAGCCTT TGATTTTTGA TTTTTACGCC
2401 GCTTACGAGC CGAAGAAAAA CCTTATTTTC CGCGCCGAAG TCAAAAATCT
2451 GTTCGACAGG CGTTATATCG ATCCGCTCGA TGCGGGCAAT GATGCGGCAA
2501 CGCAGCGTTA TTACAGCTCG TTCGACCCGA AAGACAAGGA CGAAGACGTA
2551 ACGTGTAATG CTGATAAAAC GTTGTGCAAC GGCAAATACG GCGGCACAAG
2601 CAAAAGCGTA TTGACCAATT TTGCACGCGG ACGCACCTTT TTGATGACGA
2651 TGAGCTACAA GTTTTAA
This corresponds to the amino acid sequence <SEQ ID 878; ORF133-1>:
1 EAQIQVLEDV HVKAKRVPKD KKVFTDARAV STRQDIFKSS ENLDNIVRSI
51 PGAFTQQDKS SGIVSLNIRG DSGFGRVNTM VDGITQTFYS TSTDAGRAGG
101 SSQFGASVDS NFIAGLDVVK GSFSGSAGIN SLAGSANLRT LGVDDVVQGN
151 NTYGLLLKGL TGTNSTKGNA MAAIGARKWL ESGASVGVLY GHSRRSVAQN
201 YRVGGGGQHI GNFGAEYLER RKQRYFVQEG ALKFNSDSGK WERDLQRQQW
251 KYKPYKNYNN QELQKYIEEH DKSWRENLXP QYDITPIDPS SLKQQSAGNL
301 FKLEYDGVFN KYTAQFRDLN TKIGSRKIIN RNYQFNYGLS LNPYTNLNLT
351 AAYNSGRQKY PKGSKFTGWG LLKDFETYNN AKILDLNNTA TFRLPRETEL
401 QTTLGFNYFH NEYGKNRFPE ELGLFFDGPD QDNGLYSYLG RFKGDKGLLP
451 QKSTIVQPAG SQYFNTFYFD AALKKDIYRL NYSTNTVGYR FGGEYTGYYG
501 SDDEFKRAFG ENSPTYKKHC NRSCGIYEPV LKKYGKKRAN NHSVSISADF
551 GDYFMPFASY SRTHRMPNIQ EMYFSQIGDS GVHTALKPER ANTWQFGFNT
601 YKKGLLKQDD TLGLKLVGYR SRIDNYIHNV YGKWWDLNGD IPSWVSSTGL
651 AYTIQHRNFK DKVHKHGFEL ELNYDYGRFF TNLSYAYQKS TQPTNFSDAS
701 ESPNNASKED QLKQGYGLSR VSALPRDYGR LEVGTRWLGN KLTLGGAMRY
751 FGKSIRATAE ERYIDGTNGG NTSNFRQLGK RSIKQTETLA RQPLIFDFYA
801 AYEPKKNLIF RAEVKNLFDR RYIDPLDAGN DAATQRYYSS FDPKDKDEDV
851 TCNADKTLCN GKYGGTSKSV LTNFARGRTF LMTMSYKF*
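Computer analysis of this amino acid sequence gave the following results:
Homology with the putative TonB-dependent receptor HI121 of H. influenzae (accession number U32801)
ORF133 and HI121 show 57% amino acid identity over a 363-amino-acid overlap: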
Orf133:31 IYEPVLKKYGKKRANNHSVSISADFGDYFMPFASYSRTHRMPNIQEMYFSQIGDSGVHTA 90
I EP+L K G K+A NHS ++SA+ DYFMPF +YSRTHRMPNIQEM+FSQ+ ++GV+TA
HI121: 563 INEPILHKSGHKKAFNHSATLSAELSDYFMPFFTYSRTHRMPNIQEMFFSQVSNAGVNTA 622
Orf133:91 LKPERANTWQFGFXTYKKGLLKQDDTLGLKLVGYRSRIDNYIHNVYGKWWDLNGDIPSWV 150
LKPE+++T+Q GF TYKKGL QDD LG+KLVGYRS I NYIHNVYG WW +P+W
HI121: 623 LKPEQSDTYQLGFNTYKKGLFTQDDVLGVKLVGYRSFIKNYIHNVYGVWW--RDGMPTWA 680
Orf133:151 SSTGLAYTIQHRXFXDKVHXXXXXXXXXYDYGRFFTNLSYAYQKSTQPTNFSDASESPNN 210
S G YTI H+ + V YD GRFF N+SYAYQ++ QPTN++DAS PNN
HI121: 681 ESNGFKYTIAHQNYKPIVKKSGYELEINYDMGRFFANVSYAYQRTNQPTNYADASPRPNN 740
Orf133:211 ASKEDQLKQGYGLSRVSALPRDYGRLEVGTRWLGNKLTLGGAMRYFGKSIRATAEERYID 270
AS+ED LKQGYGLSRVS LP+DYGRLE+GTRW KLTLG A RY+GKS RAT EE YI+
HI121: 741 ASQEDILKQGYGLSRVSMLPKDYGRLELGTRWFDQKLTLGLAARYYGKSKRATIEEEYIN 800
Orf133:271 GTNGGNTSNFRQLGKRSIKQTETLARQPLIXDFNAAYEPKKNLIFRAEVKNLFDRRYIDP 330
G+ + R+ ++K+TE + +QP+I D + +YEP K+LI +AEV+NL D+RY+DP
HI121:801 GSR-FKKNTLRRENYYAVKKTEDIKKQPIILDLHVSYEPIKDLIIKAEVQNLLDKRYVDP 859
Orf133:331 LDAGNDAAXERYYSSFDPKDKDXDVTCNADKTLCNGKYGGTSKSVLTNFARGRTFLMTMS 390
LDAGNDAA +RYYSS + + C D + C GG+ K+VL NFARGRT++++++
HI121:860 LDAGNDAASQRYYSSL-----NNSIECAQDSSAC----GGSDKTVLYNFARGRTYILSLN 910
Orf133:391 YKF 393
YKF
HI121:911 YKF 913
Homology with a predicted ORF from N. meningitidis (strain A)
ORF133 and an ORF from N. meningitidis strain A (ORF133a) show 90.8% identity over a 392-amino-acid overlap:
10 20 30
orf133.pep PGYYGSDDEFKRAFGENSPTXKKHCNRSCGI
|||| ||||||||||||||| |||||:||||
orf133a FYFDAALKKDIYRLNYSTNTVGYRFGGXYTGYYXSDDEFKRAFGENSPTYXKHCNQSCGI
450 460 470 480 490 500
40 50 60 70 80 90
orf133.pep YEPVLKKYGKKRANNHSVSISADFGDYFMPFASYSRTHRMPNIQEMYFSQIGDSGVHTAL
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf133a YEPVLKKYGKKRANNHSVSISADFGDYFMPFASYSRTHRMPNIQEMYFSQIGDSGVHTAL
510 520 530 540 550 560
100 110 120 130 140 150
orf133.pep KPERANTWQFGFXTYKKGLLKQDDTLGLKLVGYRSRIDNYIHNVYGKWWDLNGDIPSWVS
|||||||||||| ||||||||||| ||||||||||||| ||||||||||||||:||||||
orf133a KPERANTWQFGFNTYKKGLLKQDDILGLKLVGYRSRIDXYIHNVYGKWWDLNGNIPSWVS
570 580 590 600 610 620
160 170 180 190 200 210
orf133.pep STGLAYTIQHRXFXDKVHQXXXXXXXXYDYGRFFTNLSYAYQKSTQPTNFSDASESPNNA
||||||||||| | ||||: ||| |||||||||||||||||||||||||||||
orf133a STGLAYTIQHRNFKDKVHKHGFELELNYDYXRFFTNLSYAYQKSTQPTNFSDASESPNNA
630 640 650 660 670 680
220 230 240 250 260 270
orf133.pep SKEDQLKQGYGLSRVSALPRDYGRLEVGTRWLGNKLTLGGAMRYFGKSIRATAEERYIDG
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf133a SKEDQLKQGYGLSRVSALPRDYGRLEVGTRWLGNKLTLGGAMRYFGKSIRATAEERYIDX
690 700 710 720 730 740
280 290 300 310 320 330
orf133.pep TNGGNTSNFRQLGKRSIKQTETLARQPLIXDFNAAYEPKKNLIFRAEVKNLFDRRYIDPL
||| |||||||||||| ||||||||||| | ||||||| |||||||||||||||||||
orf133a TNGXXTSNFRQLGKRSIXQTETLARQPLIFDXYAAYEPKKXLIFRAEVKNLFDRRYIDPL
750 760 770 780 790 800
340 350 360 370 380 390
orf133.pep DAGNDAAXERYYSSFDPKDKDXDVTCNADKTLCNGKYGGTSKSVLTNFARGRTFLMTMSY
|||||||::|||||||||||| :|||| |:||||||||||||||||||||| |||:||||
orf133a DAGNDAATQRYYSSFDPKDKDEEVTCNDDNTLCNGKYGGTSKSVLTNFARGXTFLITMSY
810 820 830 840 850 860
orf133.pep KFX
|||
orf133a KFX
870
A partial ORF133a nucleotide sequence <SEQ ID 879> is:
1 AAAGACAAAA AAGTGTTTAC CGATGCGCGT GCCGTATCGA CCCGTCAGGA
51 TATATTCAAA TCCANCGAAA ACCTCGACAA CATCGTACGC ANCATCCCCG
101 GTGCGTTTAC ACANCAANAT AAAAGCTCGG GCNTTGTGTC TTTGAATATT
151 CGCNGCGACA GCGGGTTCGG GCGGGTCAAT ACNATGGTNG ACGGCATCAC
201 NCANACCTTT TATTCGACTT CTACCGATGC GGGCAGGGCA GGCGGTTCAT
251 CTCAATTCGG TGCATCTGTC GACAGCAATT TTATNGCCGG ACTGGATGTC
301 GTCAAAGGCA GCTTCAGCGG CTCGGCAGGC ATCAACAGCC TTGCCGGTTC
351 GGCGAATCTG CGGACTTTAN GCGTGGATGA TGTCGTTCAG GGCAATANTA
401 CNTACGGCCT GCTGCTAAAA GGTCTGACCG GCACCAATTC AACCAAAGGT
451 AATGCGATGG CGGCGATAGG TGCGCGCAAA TGGCTGGAAA GCGGAGCATC
501 TGTCGGTGTG CTTTACGGGC ACAGCAGGCG CAGCGTGGCG CAAAATTACC
551 GCGTGGGCGG CGGCGGGCAG CACATCGGAA ATTTTGGCGC GGAATATCTG
601 GAACGACGCA AGCAACGATA TTTTGAGCAA GAAGGCGGGT TGAAATTCAA
651 TTCCAACAGC GGAAAATGGG AGCGGGATTT CCAAAAGTCG TACTGGAAAA
701 CCAAGTGGTA TCAAAAATAC GATGCCCCCC AAGAACTGCA AAAATACATC
751 GAAGGTCATG ATAAAAGCTG GCGGGAAAAC CTGGCGCCGC AATACGACAT
801 CACCCCCATC GATCCGTCCA GCCTGAAGCN GCAGTCGGCA GGCAACCTGT
851 TTAAATTGGA ATACGACGGC GTATTCAATA AATACACGGC GCAATTTCGC
901 GATTTAAACA CCAAAATCGG CAGCCGCAAA ATCATCAACC GCAATTATCA
951 ATTCAATTAC GGTTTGTCTT TGAACCCGTA TACCAACCTC AATCTGACCG
1001 CAGCCTACAA TTCGGGCAGG CAGAAATATC CGAAAGGGTC GAAGTTTACA
1051 GGCTGGGGGC TTTTNAAAGA TTTTGAAACC TACAACAACG CAAAAATCCT
1101 CGACCTCANC AACACCTCCA CCTTCCGGCT GCCCCGTGAA ACCGAGTTGC
1151 AAACCACTTT GGGCTTCAAT TATTTCCACA ACGAATACGG CAAAAACCGC
1201 TTTCCTGAAG AATTGGGGCT GTTTTTCGAC GGTCCGGATC ANGACAACGG
1251 GCTTTATTCC TATTTGGGGC GGTTTAAGGG CGATAAAGGG CTGCTGCCCC
1301 AAAAATCAAC CATTGTCCAA CCGGCCGGCA GCCAATATTT CAACACGTTC
1351 TACTTCGATG CCGCGCTCAA AAAAGACATT TACCGCTTAA ACTACAGCAC
1401 CAATACCGTC GGCTACCGTT TCGGCGGCNA ATATACGGGC TATTACNGCT
1451 CGGATGACGA ATTTAAGCGG GCATTCGGAG AAAACTCGCC GACATACANG
1501 AAACATTGCA ACCAGAGCTG CGGAATTTAT GAACCCGTAT TGAAAAAATA
1551 CGGCAAAAAG CGCGCCAACA ACCATTCGGT CAGCATTAGT GCGGACTTCG
1601 GCGATTATTT CATGCCGTTC GCCAGCTATT CGCGCACACA CCGTATGCCC
1651 AACATCCAAG AAATGTATTT TTCCCAAATC GGCGACTCCG GCGTTCACAC
1701 CGCCTTAAAA CCAGAGCGCG CAAACACTTG GCAATTTGGC TTCAATACCT
1751 ATAAAAAAGG ATTGTTAAAA CAAGATGATA TATTAGGATT AAAACTGGTC
1801 GGCTACCGCA GCCGCATCGA CNACTACATC CACAACGTTT ACGGGAAATG
1851 GTGGGATTTG AACGGGAATA TTCCGAGCTG GGTCAGCAGC ACCGGGCTTG
1901 CCTACACCAT CCAACACCGC AATTTCAAAG ACAAAGTGCA CAAACACGGT
1951 TTTGAGTTGG AGCTGAATTA CGATTATNGG CGTTTTTTCA CCAACCTTTC
2001 TTACGCCTAT CAAAAAAGCA CGCAACCGAC CAACTTCAGC GATGCGAGCG
2051 AATCGCCCAA CAATGCGTCC AAAGAAGACC AACTCAAACA AGGTTATGGG
2101 TTGAGCAGGG TTTCCGCCCT GCCGCGAGAT TACGGACGTT TGGAAGTCGG
2151 TACGCGCTGG TTGGGCAACA AACTGACTTT GGGCGGCGCG ATGCGCTATT
2201 TCGGCAAGAG CATCCGCGCG ACGGCTGAAG AACGCTATAT CGACGNCACC
2251 AATGGGGNAN NTACCAGCAA TTTCCGGCAA CTGGGCAAGC GTTCCATCAN
2301 ACAAACCGAA ACCCTTGCCC GCCAGCCTTT GATTTTTGAT TTNTACGCCG
2351 CTTACGAGCC GAAGAAAAAN CTTATTTTCC GCGCCGAAGT CAAAAATCTG
2401 TTCGACAGGC GTTATATCGA TCCGCTCGAT GCGGGCAATG ATGCGGCAAC
2451 GCAGCGTTAT TACAGTTCGT TCGACCCGAA AGACAAGGAC GAAGAAGTAA
2501 CGTGTAATGA TGATAACACG TTATGCAACG GCAAATACGG CGGCACAAGC
2551 AAAAGCGTAT TGACCAATTT TGCACGCGGA CNCACCTTTT TGATAACGAT
2601 GAGCTACAAG TTTTAA
This encodes a protein having the (partial) amino acid sequence <SEQ ID 880>:
1 KDKKVFTDAR AVSTRQDIFK SXENLDNIVR XIPGAFTXQX KSSGXVSLNI
51 RXDSGFGRVN TMVDGITXTF YSTSTDAGRA GGSSQFGASV DSNFXAGLDV
101 VKGSFSGSAG INSLAGSANL RTLXVDDVVQ GNXTYGLLLK GLTGTNSTKG
151 NAMAAIGARK WLESGASVGV LYGHSRRSVA QNYRVGGGGQ HIGNFGAEYL
201 ERRKQRYFEQ EGGLKFNSNS GKWERDFQKS YWKTKWYQKY DAPQELQKYI
251 EGHDKSWREN LAPQYDITPI DPSSLKXQSA GNLFKLEYDG VFNKYTAQFR
301 DLNTKIGSRK IINRNYQFNY GLSLNPYTNL NLTAAYNSGR QKYPKGSKFT
351 GWGLXKDFET YNNAKILDLX NTSTFRLPRE TELQTTLGFN YFHNEYGKNR
401 FPEELGLFFD GPDXDNGLYS YLGRFKGDKG LLPQKSTIVQ PAGSQYFNTF
451 YFDAALKKDI YRLNYSTNTV GYRFGGXYTG YYXSDDEFKR AFGENSPTYX
501 KHCNQSCGIY EPVLKKYGKK RANNHSVSIS ADFGDYFMPF ASYSRTHRMP
551 NIQEMYFSQI GDSGVHTALK PERANTWQFG FNTYKKGLLK QDDILGLKLV
601 GYRSRIDXYI HNVYGKWWDL NGNIPSWVSS TGLAYTIQHR NFKDKVHKHG
651 FELELNYDYX RFFTNLSYAY QKSTQPTNFS DASESPNNAS KEDQLKQGYG
701 LSRVSALPRD YGRLEVGTRW LGNKLTLGGA MRYFGKSIRA TAEERYIDXT
751 NGXXTSNFRQ LGKRSIXQTE TLARQPLIFD XYAAYEPKKX LIFRAEVKNL
801 FDRRYIDPLD AGNDAATQRY YSSFDPKDKD EEVTCNDDNT LCNGKYGGTS
851 KSVLTNFARG XTFLITMSYK F*
ORF133a and ORF133-1 show 94.3% identity over an 871-amino-acid overlap:
10 20 30 40
orf133a.pep KDKKVFTDARAVSTRQDIFKSXENLDNIVRXIPGAFTXQXKS
||||||||||||||||||||| |||||||| |||||| | ||
orf133-1 EAQIQVLEDVHVKAKRVPKDKKVFTDARAVSTRQDIFKSSENLDNIVRSIPGAFTQQDKS
10 20 30 40 50 60
50 60 70 80 90 100
orf133a.pep SGXVSLNIRXDSGFGRVNTMVDGITXTFYSTSTDAGRAGGSSQFGASVDSNFXAGLDVVK
|| |||||| ||||||||||||||| |||||||||||||||||||||||||| |||||||
orf133-1 SGIVSLNIRGDSGFGRVNTMVDGITQTFYSTSTDAGRAGGSSQFGASVDSNFIAGLDVVK
70 80 90 100 110 120
110 120 130 140 150 160
orf133a.pep GSFSGSAGINSLAGSANLRTLXVDDVVQGNXTYGLLLKGLTGTNSTKGNAMAAIGARKWL
||||||||||||||||||||| |||||||| |||||||||||||||||||||||||||||
orf133-1 GSFSGSAGINSLAGSANLRTLGVDDVVQGNNTYGLLLKGLTGTNSTKGNAMAAIGARKWL
130 140 150 160 170 180
170 180 190 200 210 220
orf133a.pep ESGASVGVLYGHSRRSVAQNYRVGGGGQHIGNFGAEYLERRKQRYFEQEGGLKFNSNSGK
|||||||||||||||||||||||||||||||||||||||||||||| |||:|||||:|||
orf133-1 ESGASVGVLYGHSRRSVAQNYRVGGGGQHIGNFGAEYLERRKQRYFVQEGALKFNSDSGK
190 200 210 220 230 240
230 240 250 260 270 280
orf133a.pep WERDFQKSYWKTKWYQKYDAPQELQKYIEGHDKSWRENLAPQYDITPIDPSSLKXQSAGN
||||:|:: || | |::|: |||||||| ||||||||| |||||||||||||| |||||
orf133-1 WERDLQRQQWKYKPYKNYNN-QELQKYIEEHDKSWRENLXPQYDITPIDPSSLKQQSAGN
250 260 270 280 290
290 300 310 320 330 340
orf133a.pep LFKLEYDGVFNKYTAQFRDLNTKIGSRKIINRNYQFNYGLSLNPYTNLNLTAAYNSGRQK
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf133-1 LFKLEYDGVFNKYTAQFRDLNTKIGSRKIINRNYQFNYGLSLNPYTNLNLTAAYNSGRQK
300 310 320 330 340 350
350 360 370 380 390 400
orf133a.pep YPKGSKFTGWGLXKDFETYNNAKILDLXNTSTFRLPRETELQTTLGFNYFHNEYGKNRFP
|||||||||||| |||||||||||||| ||:|||||||||||||||||||||||||||||
orf133-1 YPKGSKFTGWGLLKDFETYNNAKILDLNNTATFRLPRETELQTTLGFNYFHNEYGKNRFP
360 370 380 390 400 410
410 420 430 440 450 460
orf133a.pep EELGLFFDGPDXDNGLYSYLGRFKGDKGLLPQKSTIVQPAGSQYFNTFYFDAALKKDIYR
||||||||||| ||||||||||||||||||||||||||||||||||||||||||||||||
orf133-1 EELGLFFDGPDQDNGLYSYLGRFKGDKGLLPQKSTIVQPAGSQYFNTFYFDAALKKDIYR
420 430 440 450 460 470
470 480 490 500 510 520
orf133a.pep LNYSTNTVGYRFGGXYTGYYXSDDEFKRAFGENSPTYXKHCNQSCGIYEPVLKKYGKKRA
|||||||||||||| ||||| |||||||||||||||| ||||:|||||||||||||||||
orf133-1 LNYSTNTVGYRFGGEYTGYYGSDDEFKRAFGENSPTYKKHCNRSCGIYEPVLKKYGKKRA
480 490 500 510 520 530
530 540 550 560 570 580
orf133a.pep NNHSVSISADFGDYFMPFASYSRTHRMPNIQEMYFSQIGDSGVHTALKPERANTWQFGFN
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf133-1 NNHSVSISADFGDYFMPFASYSRTHRMPNIQEMYFSQIGDSGVHTALKPERANTWQFGFN
540 550 560 570 580 590
590 600 610 620 630 640
orf133a.pep TYKKGLLKQDDILGLKLVGYRSRIDXYIHNVYGKWWDLNGNIPSWVSSTGLAYTIQHRNF
||||||||||| ||||||||||||| ||||||||||||:|||||||||||||||||||||
orf133-1 TYKKGLLKQDDTLGLKLVGYRSRIDNYIHNVYGKWWDLNGDIPSWVSSTGLAYTIQHRNF
600 610 620 630 640 650
650 660 670 680 690 700
orf133a.pep KDKVHKHGFELELNYDYXRFFTNLSYAYQKSTQPTNFSDASESPNNASKEDQLKQGYGLS
||||||||||||||||| ||||||||||||||||||||||||||||||||||||||||||
orf133-1 KDKVHKHGFELELNYDYGRFFTNLSYAYQKSTQPTNFSDASESPNNASKEDQLKQGYGLS
660 670 680 690 700 710
710 720 730 740 750 760
orf133a.pep RVSALPRDYGRLEVGTRWLGNKLTLGGAMRYFGKSIRATAEERYIDXTNGXXTSNFRQLG
|||||||||||||||||||||||||||||||||||||||||||||| ||| ||||||||
orf133-1 RVSALPRDYGRLEVGTRWLGNKLTLGGAMRYFGKSIRATAEERYIDGTNGGNTSNFRQLG
720 730 740 750 760 770
770 780 790 800 810 820
orf133a.pep KRSIXQTETLARQPLIFDXYAAYEPKKXLIFRAEVKNLFDRRYIDPLDAGNDAATQRYYS
|||| ||||||||||||| |||||||| ||||||||||||||||||||||||||||||||
orf133-1 KRSIKQTETLARQPLIFDFYAAYEPKKNLIFRAEVKNLFDRRYIDPLDAGNDAATQRYYS
780 790 800 810 820 830
830 840 850 860 870
orf133a.pep SFDPKDKDEEVTCNDDNTLCNGKYGGTSKSVLTNFARGXTFLITMSYKFX
|||||||||:|||| |:||||||||||||||||||||| |||:|||||||
orf133-1 SFDPKDKDEDVTCNADKTLCNGKYGGTSKSVLTNFARGRTFLMTMSYKFX
840 850 860 870 880
Homology with a predicted ORF from N. gonorrhoeae
ORF133 and the predicted ORF from N. gonorrhoeae (ORF133ng) show 92.3% identity over a 392-amino-acid overlap:
orf133.pep PGYYGSDDEFKRAFGENSPTXKKHCNRSCGI 31
|||||::|||||||||||: |:||:|||:
orf133ng FYFDAALKKDIYRLNYSTNAINYRFGGEYTGYYGSENEFKRAFGENSPAYKEHCDPSCGL 560
orf133.pep YEPVLKKYGKKRANNHSVSISADFGDYFMPFASYSRTHRMPNIQEMYFSQIGDSGVHTAL 91
||||||||||||||||||||||||||||||||:|||||||||||||||||||||||||||
orf133ng YEPVLKKYGKKRANNHSVSISADFGDYFMPFAGYSRTHRMPNIQEMYFSQIGDSGVHTAL 620
orf133.pep KPERANTWQFGFXTYKKGLLKQDDTLGLKLVGYRSRIDNYIHNVYGKWWDLNGDIPSWVS 151
|||||||||||| ||||||||||| ||||||||||||||||||||||||||||||||||:
orf133ng KPERANTWQFGFNTYKKGLLKQDDILGLKLVGYRSRIDNYIHNVYGKWWDLNGDIPSWVG 680
orf133.pep STGLAYTIQHRXFXDKVHQXXXXXXXXYDYGRFFTNLSYAYQKSTQPTNFSDASESPNNA 211
||||||||:|| | ||||: |||||||||||||||||||||||||||||||||
orf133ng STGLAYTIRHRNFKDKVHKHGFELELNYDYGRFFTNLSYAYQKSTQPTNFSDASESPNNA 740
orf133.pep SKEDQLKQGYGLSRVSALPRDYGRLEVGTRWLGNKLTLGGAMRYFGKSIRATAEERYIDG 271
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf133ng SKEDQLKQGYGLSRVSALPRDYGRLEVGTRWLGNKLTLGGAMRYFGKSIRATAEERYIDG 800
orf133.pep TNGGNTSNFRQLGKRSIKQTETLARQPLIXDFNAAYEPKKNLIFRAEVKNLFDRRYIDPL 331
|||||||| |||||||||||||||||||| || |||||||||||||||||||||||||||
orf133ng TNGGNTSNVRQLGKRSIKQTETLARQPLIFDFYAAYEPKKNLIFRAEVKNLFDRRYIDPL 860
orf133.pep DAGNDAAXERYYSSFDPKDKDXDVTCNADKTLCNGKYGGTSKSVLTNFARGRTFLMTMSY 391
|||||||::|||||||||||| ||||||||||||||||||||||||||||||||||||||
orf133ng DAGNDAATQRYYSSFDPKDKDEDVTCNADKTLCNGKYGGTSKSVLTNFARGRTFLMTMSY 920
orf133.pep KF 393
||
orf133ng KF 922
The predicted full-length ORF133ng nucleotide sequence <SEQ ID 881> encodes a protein having the amino acid sequence <SEQ ID 882>:
1 MRSSFRLKPI CFYLMGVMLY HHSYAEDAGR AGSEAQIQVL EDVHVKAKRV
51 PKDKKVFTDA RAVSTRQDYF KSGENLDNIV RSIPGAFTQQ DKSSGIVSLN
101 IRGDSGFGRV NTMVDGITQT FYSTSTDAGR AGGSSQFGAS VDSNFIAGLD
151 VVKGSFSGSA GINSLAGSAN LRTLGVDDVV QGNNTYGLLL KGLTGTNSTK
201 GNAMAAIGAR KWLESGASVG VLYGHSRRGV AQNYRVGGGG QHIGNFGEEY
251 LERRKQQYFV QEGGLKFNAG SGKWERDLQR QYWKTKWYKK YEDPQELQKY
301 IEEHDKSWRE NLAPQYDITP IDPSGLKQQS AGNLLNLEYD GVFNKYTAQF
351 RDLNTRIGSR KIINRNYQFN YGLSLNPYTN LNLTAAYNSG RQKYPKGAKF
401 TGWGLLKDFE TYNNAKILDL NNTATFRLPR ETELQTTLGF NYFHNEYGKN
451 RFPEELGLFF DGPDQDNGLY SYLGRFKGDK GLLPQKSTIV QPAGSQYFNT
501 FYFDAALKKD IYRLNYSTNA INYRFGGEYT GYYGSENEFK RAFGENSPAY
551 KEHCDPSCGL YEPVLKKYGK KRANNHSVSI SADFGDYFMP FAGYSRTHRM
601 PNIQEMYFSQ IGDSGVHTAL KPERANTWQF GFNTYKKGLL KQDDILGLKL
651 VGYRSRIDNY IHNVYGKWWD LNGDIPSWVG STGLAYTIRH RNFKDKVHKH
701 GFELELNYDY GRFFTNLSYA YQKSTQPTNF SDASESPNNA SKEDQLKQGY
751 GLSRVSALPR DYGRLEVGTR WLGNKLTLGG AMRYFGKSIR ATAEERYIDG
801 TNGGNTSNVR QLGKRSIKQT ETLARQPLIF DFYAAYEPKK NLIFRAEVKN
851 LFDRRYIDPL DAGNDAATQR YYSSFDPKDK DEDVTCNADK TLCNGKYGGT
901 SKSVLTNFAR GRTFLMTMSY KF*
A variant was also identified, encoded by the gonococcal DNA sequence <SEQ ID 883>:
1 ATGAGATCTT CTTTCCGGTT GAAGCCGATT TGTTTTTATC TTATGGGTGT
51 TATGCTATAT CATCATAGTT ATGCCGAAGA TGCAGGGCGC GCGGGCAGCG
101 AGGCGCAGAT ACAGGTTTTG GAAGATGTGC ACGTCAAGGC GAAGCGCGTA
151 CCGAAAGACA AAAAAGTGTT TACCGATGCG CGTGCCGTAT CGACCCGTca
201 gGATGTGTTC AAATCCGGCG AAAACCTCGA CAACATCGTA CGCAGCATAC
251 CCGGTGCGTT TACACAGCAA GATAAAAGCT CGGGCATTGT GTCTTTGAAT
301 ATTCGCGGCG ACAGCGGGTT CGGGCGGGTC AATACGATGG TGGACGGCAT
351 CACGCAGACC TTTTATTCGA CTTCTACCGA TGCGGGCAGG GCAGGCGGTT
401 CATCTCAATT CGGTGCATCT GTCGACAGCA ATTTTATTGC CGGACTGGAT
451 GTCGTCAAAG GCAGCTTCAG CGGCTCGGCA GGCATCAACA GCCTTGCCGG
501 TTCGGCGAAT CTGCGGACTT TAGGCGTGGA TGACGTCGTT CAGGGCAATA
551 ATACCTACGG CCTGCTGCTA AAAGGTCTGA CCGGCACCAA TTCAACCAAA
601 GGTAATGCGA TGGCGGCGAT AGGTGCGCGC AAATGGCTGG AAAGCGGAGC
651 GTCTGTCGGT GTGCTTTACG GGCACAGCAG GCGCGGCGTG GCGCAAAATT
701 ACCGCGTGGG CGGCGGCGGG CAGCACATCG GAAATTTTGG TGAAGAATAT
751 CTGGAACGGC GCAAACAGCA ATATTTTGTA CAAGAGGGTG GTTTGAAATT
801 CAATGCCGGC AGCGGAAAAT GGGAACGGGA TTTGCAAAGG CAATACTGGA
851 AAACAAAGTG GTATAAAAAA TACGAAGACC CCCAAGAACT GCAAAAATAC
901 ATCGAAGAGC ATGATAAAAG CTGGCGGGAA AACCTGGCGC CGCAATACGA
951 CATCACCCCC ATCGATCCGT CCGGCCTGAA GCAGCAGTCG GCAGGCAATC
1001 TGTTTAAATT GGAATACGAC GGCGTATTCA ATAAATACAC GGCGCAATTT
1051 CGCGATTTAA ACACCAGAAT CGGCAGCCGC AAAATCATCA ACCGCAATTA
1101 TCAATTCAAT TACGGTTTGT CTTTGAACCC GTATACCAAC CTCAATCTGA
1151 CCGCAGCCTA CAATTCGGGC AGGCAGAAAT ATCCGAAAGG GGCGAAGTTT
1201 ACAGGCTGGG GGCTTTTAAA AGATTTTGAA ACCTACAACA ACGCGAAAAT
1251 CCTCGACCTC AACAACACCG CCACCTTCCG GCTGCCCCGC GAAACCGAGT
1301 TGCAAACCAC TTTGGGCTTC AATTATTTCC ACAACGAATA CGGCAAAAAC
1351 CGCTTTCCTG AAGAATTGGG GCTGTTTTTC GACGGTCCTG ATCAGGACAA
1401 CGGGCTTTAT TCCTATTTGG GGCGGTTTAA GGGCGATAAA GGGCTGTTGC
1451 CTCAAAAATC AACCATTGTC CAACCGGCCG GCAGCCAATA TTTCAACACG
1501 TTCTACTTCG ATGCCGCGCT CAAAAAAGAC ATTTACCGCT TAAACTACAG
1551 CACCAATGCA ATCAACTACC GTTTCGGCGG CGAATATACG GGCTATTACG
1601 GCTCGGAAAA CGAATTTAAG CGGGCATTCG GAGAAAACTC GCCGGCATAC
1651 AAGGAACATT GCGACCCGAG CTGCGGGCTT TATGAACCCG TATTGAAAAA
1701 ATACGGCAAA AAGCGCGCCA ACAACCATTC GGTCAGCATT AGTGCGGACT
1751 TCGGCGATTA TTTCATGCCG TTCGCCGGCT ATTCGCGCAC ACACCGTATG
1801 CCCAACATCC AAGAAATGTA TTTTTCCCAA ATCGGCGACT CCGGCGTTCA
1851 CACCGCCTTA AAACCAGAGC GCGCAAACAC TTGGCAATTT GGCTTCAATA
1901 CCTATAAAAA AGGATTGTTA AAACAAGATG ATATATTAGG ATTGAAACTG
1951 GTCGGCTACC GCAGCCGCAT TGACAACTAC ATCCACAACG TTTACGGGAA
2001 ATGGTGGGAT TTGAACGGGG ATATTCCGAG CTGGGTCGGC AGCACCGGGC
2051 TTGCCTACAC CATCCGACAC CGCAATTTCA AAGACAAAGT GCACAAACAC
2101 GGTTTTGAGC TGGAGCTGAA TTACGATTAT GGGCGTTTTT TCACCAACCT
2151 TTCTTACGCC TATCAAAAAA GCACGCAACC GACCAATTTC AGCGATGCGA
2201 GCGAATCGCC CAACAATGCC tccaaAGAAG ACCAACTCAA ACAAGGTTAT
2251 GGGCTGAGCA GGGTTTCCGC CCTGCCGCGA GATTACGGAC GTTTGGAAGT
2301 CGGTACGCGC TGGTTGGGCA ACAAACTGAC TTTGGGCGGC GCGAtgcGCT
2351 ATTTCGGCAA GAGCATCCGC GCGACGGCTG AAGAACGCTA TATCGACGGC
2401 ACCAACGGGG GAAATACCAG CAATGTCCGG CAACTGGGCA AGCGTTCCAT
2451 CAAACAAACC GAAACCCTTG CCCGACAGCC TTTGATTTTT GATTTTTACG
2501 CCGCTTACGA GCCGAAGAAA AACCTTATTT TCCGCGCCGA AGTCAAAAAC
2551 CTGTTCGACA GGCGTTATAT CGATCCGCTC GATGCGGGCA ATGATGCGGC
2601 AACGCAGCGT TATTACAGCT CGTTCGACCC GAAAGACAAG GACGAAGACG
2651 TAACGTGTAA TGCTGATAAA ACGTTGTGCA ACGGCAAATA CGGCGGCACA
2701 AGCAAAAGCG TATTGACCAA TTTCGCACGC GGACGCACCT TCTTGATGAC
2751 GATGAGCTAC AAGTTTTAA
This corresponds to the amino acid sequence <SEQ ID 884; ORF133ng-1>:
1 MRSSFRLKPI CFYLMGVMLY HHSYAEDAGR AGSEAQIQVL EDVHVKAKRV
51 PKDKKVFTDA RAVSTRQDVF KSGENLDNIV RSIPGAFTQQ DKSSGIVSLN
101 IRGDSGFGRV NTMVDGITQT FYSTSTDAGR AGGSSQFGAS VDSNFIAGLD
151 VVKGSFSGSA GINSLAGSAN LRTLGVDDVV QGNNTYGLLL KGLTGTNSTK
201 GNAMAAIGAR KWLESGASVG VLYGHSRRGV AQNYRVGGGG QHIGNFGEEY
251 LERRKQQYFV QEGGLKFNAG SGKWERDLQR QYWKTKWYKK YEDPQELQKY
301 IEEHDKSWRE NLAPQYDITP IDPSGLKQQS AGNLFKLEYD GVFNKYTAQF
351 RDLNTRIGSR KIINRNYQFN YGLSLNPYTN LNLTAAYNSG RQKYPKGAKF
401 TGWGLLKDFE TYNNAKILDL NNTATFRLPR ETELQTTLGF NYFHNEYGKN
451 RFPEELGLFF DGPDQDNGLY SYLGRFKGDK GLLPQKSTIV QPAGSQYFNT
501 FYFDAALKKD IYRLNYSTNA INYRFGGEYT GYYGSENEFK RAFGENSPAY
551 KEHCDPSCGL YEPVLKKYGK KRANNHSVSI SADFGDYFMP FAGYSRTHRM
601 PNIQEMYFSQ IGDSGVHTAL KPERANTWQF GFNTYKKGLL KQDDILGLKL
651 VGYRSRIDNY IHNVYGKWWD LNGDIPSWVG STGLAYTIRH RNFKDKVHKH
701 GFELELNYDY GRFFTNLSYA YQKSTQPTNF SDASESPNNA SKEDQLKQGY
751 GLSRVSALPR DYGRLEVGTR WLGNKLTLGG AMRYFGKSIR ATAEERYIDG
801 TNGGNTSNVR QLGKRSIKQT ETLARQPLIF DFYAAYEPKK NLIFRAEVKN
851 LFDRRYIDPL DAGNDAATQR YYSSFDPKDK DEDVTCNADK TLCNGKYGGT
901 SKSVLTNFAR GRTFLMTMSY KF*
ORF133ng-1 and ORF133-1 show 96.2% identity over an 889-amino-acid overlap:
10 20 30 40 50 60
orf133ng-1.pep SFRLKPICFYLMGVMLYHHSYAEDAGRAGSEAQIQVLEDVHVKAKRVPKDKKVFTDARAV
||||||||||||||||||||||||||||||
orf133-1 EAQIQVLEDVHVKAKRVPKDKKVFTDARAV
10 20 30
70 80 90 100 110 120
orf133ng-1.pep STRQDVFKSGENLDNIVRSIPGAFTQQDKSSGIVSLNIRGDSGFGRVNTMVDGITQTFYS
|||||:|||:||||||||||||||||||||||||||||||||||||||||||||||||||
orf133-1 STRQDIFKSSENLDNIVRSIPGAFTQQDKSSGIVSLNIRGDSGFGRVNTMVDGITQTFYS
40 50 60 70 80 90
130 140 150 160 170 180
orf133ng-1.pep TSTDAGRAGGSSQFGASVDSNFIAGLDVVKGSFSGSAGINSLAGSANLRTLGVDDVVQGN
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf133-1 TSTDAGRAGGSSQFGASVDSNFIAGLDVVKGSFSGSAGINSLAGSANLRTLGVDDVVQGN
100 110 120 130 140 150
190 200 210 220 230 240
orf133ng-1.pep NTYGLLLKGLTGTNSTKGNAMAAIGARKWLESGASVGVLYGHSRRGVAQNYRVGGGGQHI
||||||||||||||||||||||||||||||||||||||||:|||||||||||||||||||
orf133-1 NTYGLLLKGLTGTNSTKGNAMAAIGARKWLESGASVGVLYGHSRRSVAQNYRVGGGGQHI
160 170 180 190 200 210
250 260 270 280 290 300
orf133ng-1.pep GNFGEEYLERRKQQYFVQEGGLKFNAGSGKWERDLQRQYWKTKWYKKYEDPQELQKYIEE
|||| ||||||||:||||||:||||: ||||||||||| || | ||:|:: |||||||||
orf133-1 GNFGAEYLERRKQRYFVQEGALKFNSDSGKWERDLQRQQWKYKPYKNYNN-QELQKYIEE
220 230 240 250 260
310 320 330 340 350 360
orf133ng-1.pep HDKSWRENLAPQYDITPIDPSGLKQQSAGNLFKLEYDGVFNKYTAQFRDLNTRIGSRKII
||||||||| |||||||||||:||||||||||||||||||||||||||||||:|||||||
orf133-1 HDKSWRENLXPQYDITPIDPSSLKQQSAGNLFKLEYDGVFNKYTAQFRDLNTKIGSRKII
270 280 290 300 310 320
370 380 390 400 410 420
orf133ng-1.pep NRNYQFNYGLSLNPYTNLNLTAAYNSGRQKYPKGAKFTGWGLLKDFETYNNAKILDLNNT
||||||||||||||||||||||||||||||||||:|||||||||||||||||||||||||
orf133-1 NRNYQFNYGLSLNPYTNLNLTAAYNSGRQKYPKGSKFTGWGLLKDFETYNNAKILDLNNT
330 340 350 360 370 380
430 440 450 460 470 480
orf133ng-1.pep ATFRLPRETELQTTLGFNYFHNEYGKNRFPEELGLFFDGPDQDNGLYSYLGRFKGDKGLL
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf133-1 ATFRLPRETELQTTLGFNYFHNEYGKNRFPEELGLFFDGPDQDNGLYSYLGRFKGDKGLL
390 400 410 420 430 440
490 500 510 520 530 540
orf133ng-1.pep PQKSTIVQPAGSQYFNTFYFDAALKKDIYRLNYSTNAINYRFGGEYTGYYGSENEFKRAF
||||||||||||||||||||||||||||||||||||:::|||||||||||||::||||||
orf133-1 PQKSTIVQPAGSQYFNTFYFDAALKKDIYRLNYSTNTVGYRFGGEYTGYYGSDDEFKRAF
450 460 470 480 490 500
550 560 570 580 590 600
orf133ng-1.pep GENSPAYKEHCDPSCGLYEPVLKKYGKKRANNHSVSISADFGDYFMPFAGYSRTHRMPNI
|||||:||:||: |||:||||||||||||||||||||||||||||||||:||||||||||
orf133-1 GENSPTYKKHCNRSCGIYEPVLKKYGKKRANNHSVSISADFGDYFMPFASYSRTHRMPNI
510 520 530 540 550 560
610 620 630 640 650 660
orf133ng-1.pep QEMYFSQIGDSGVHTALKPERANTWQFGFNTYKKGLLKQDDILGLKLVGYRSRIDNYIHN
||||||||||||||||||||||||||||||||||||||||| ||||||||||||||||||
orf133-1 QEMYFSQIGDSGVHTALKPERANTWQFGFNTYKKGLLKQDDTLGLKLVGYRSRIDNYIHN
570 580 590 600 610 620
670 680 690 700 710 720
orf133ng-1.pep VYGKWWDLNGDIPSWVGSTGLAYTIRHRNFKDKVHKHGFELELNYDYGRFFTNLSYAYQK
||||||||||||||||:||||||||:||||||||||||||||||||||||||||||||||
orf133-1 VYGKWWDLNGDIPSWVSSTGLAYTIQHRNFKDKVHKHGFELELNYDYGRFFTNLSYAYQK
630 640 650 660 670 680
730 740 750 760 770 780
orf133ng-1.pep STQPTNFSDASESPNNASKEDQLKQGYGLSRVSALPRDYGRLEVGTRWLGNKLTLGGAMR
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf133-1 STQPTNFSDASESPNNASKEDQLKQGYGLSRVSALPRDYGRLEVGTRWLGNKLTLGGAMR
690 700 710 720 730 740
790 800 810 820 830 840
orf133ng-1.pep YFGKSIRATAEERYIDGTNGGNTSNVRQLGKRSIKQTETLARQPLIFDFYAAYEPKKNLI
||||||||||||||||||||||||| ||||||||||||||||||||||||||||||||||
orf133-1 YFGKSIRATAEERYIDGTNGGNTSNFRQLGKRSIKQTETLARQPLIFDFYAAYEPKKNLI
750 760 770 780 790 800
850 860 870 880 890 900
orf133ng-1.pep FRAEVKNLFDRRYIDPLDAGNDAATQRYYSSFDPKDKDEDVTCNADKTLCNGKYGGTSKS
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf133-1 FRAEVKNLFDRRYIDPLDAGNDAATQRYYSSFDPKDKDEDVTCNADKTLCNGKYGGTSKS
810 820 830 840 850 860
910 920
orf133ng-1.pep VLTNFARGRTFLMTMSYKFX
||||||||||||||||||||
orf133-1 VLTNFARGRTFLMTMSYKFX
870 880
In addition, ORF133ng-1 is homologous to a TonB-dependent receptor from Haemophilus influenzae:
sp|P45114|YC17_HAEIN Probable TonB-dependent receptor HI1217 precursor
>gi|1075372|pir||G64110 transferrin binding protein 1 precursor (tbp1) homolog - Haemophilus influenzae (strain Rd KW20) >gi|1574147 (U32801) transferrin binding protein 1 precursor (tbp1) [Haemophilus influenzae] Length = 913
Score = 930 bits (2377), Expect = 0.0
Identities = 476/921 (51%), Positives = 619/921 (66%), Gaps = 72/921 (7%)
Query: 38 QVLEDVHVKAKRVPKDKKVFTDARAVSTRQDVFKSGENLDNIVRSIPGAFTQQDKSSGIV 97
+ L + V K + DKK FT+A+A STR++VFK + +D ++RSIPGAFTQQDK SG+V
Sbjct: 29 ETLGQIDVVEKVISNDKKPFTEAKAKSTRENVFKETQTIDQVIRSIPGAFTQQDKGSGVV 88
Query: 98 SLNIRGDSGFGRVNTMVDGITQTFYSTSTDAGRAGGSSQFGASVDSNFIAGLDVVKGSFS 157
S+NIRG++G GRVNTMVDG+TQTFYST+ D+G++GGSSQFGA++D NFIAG+DV K +FS
Sbjct: 89 SVNIRGENGLGRVNTMVDGVTQTFYSTALDSGQSGGSSQFGAAIDPNFIAGVDVNKSNFS 148
Query: 158 GSAGINSLAGSANLRTLGVDDVVQXXXXXXXXXXXXXXXXXXXXXAMAAIGARKWLESGA 217
G++GIN+LAGSAN RTLGV+DV+ M RKWL++G
Sbjct: 149 GASGINALAGSANFRTLGVNDVITDDKPFGIILKGMTGSNATKSNFMTMAAGRKWLDNGG 208
Query: 218 SVGVLYGHSRRGVAQNYRVGGGGQHIGNFGEEYLERRKQQYFVQEGGLKFNAGSGKWERD 277
VGV+YG+S+R V+Q+YR+ GGG+ + + G++ L + K+ YF + G N G+W D
Sbjct: 209 YVGVVYGYSQREVSQDYRI-GGGERLASLGQDILAKEKEAYF-RNAGYILNP-EGQWTPD 265
Query: 278 LQRQYWK-----------TKWY--------------------KKYEDPQELQK---YIEE 303
L +++W +Y KK +D ++LQK IEE
Sbjct: 266 LSKKHWSCNKPDYQKNGDCSYYRIGSAAKTRREILQELLTNGKKPKDIEKLQKGNDGIEE 325
Query: 304 HDKSWRENLAPQYDITPIDPSGLKQQSAGNLFKLEYDGVFNKYTAQFRDLNTRIGSRKII 363
DKS+ N QY + PI+P L+ +S +L K EY AQ R L+ +IGSRKI
Sbjct: 326 TDKSFERN-KDQYSVAPIEPGSLQSRSRSHLLKFEYGDDHQNLGAQLRTLDNKIGSRKIE 384
Query: 364 NRNYQFNYGLSLNPYTNLNLTAAYNSGRQKYPKGAKFTGWGLLKDFETYNNAKILDLNNT 423
NRNYQ NY +N Y +LNL AA+N G+ YPKG F GW + T N A I+D+NN+
Sbjct: 385 NRNYQVNYNFNNNSYLDLNLMAAHNIGKTIYPKGGFFAGWQVADKLITKNVANIVDINNS 444
Query: 424 ATFRLPRETELQTTLGFNYFHNEYGKNRFPEELGLFFDGPDQDNGLYSY--LGRFKGDKG 481
TF LP+E +L+TTLGFNYF NEY KNRFPEEL LF++ D GLYS+ GR+ G K
Sbjct: 445 HTFLLPKEIDLKTTLGFNYFTNEYSKNRFPEELSLFYNDASHDQGLYSHSKRGRYSGTKS 504
Query: 482 LLPQKSTIVQPAGSQYFNTFYFDAALKKDIYRLNYSTNAINYRFGGEYTGYYGSENEFKR 541
LLPQ+S I+QP+G Q F T YFD AL K IY LNYS N +Y F GEY GY
Sbjct: 505 LLPQRSVILQPSGKQKFKTVYFDTALSKGIYHLNYSVNFTHYAFNGEYVGY--------- 555
Query: 542 AFGENSPAYKEHCDPSCGLYEPVLKKYGKKRANNHSVSISADFGDYFMPFAGYSRTHRMP 601
EN+ + + EP+L K G K+A NHS ++SA+ DYFMPF YSRTHRMP
Sbjct: 556 ---ENTAGQQ--------INEPILHKSGHKKAFNHSATLSAELSDYFMPFFTYSRTHRMP 604
Query: 602 NIQEMYFSQIGDSGVHTALKPERANTWQFGFNTYKKGLLKQDDILGLKLVGYRSRIDNYI 661
NIQEM+FSQ+ ++GV+TALKPE+++T+Q GFNTYKKGL QDD+LG+KLVGYRS I NYI
Sbjct: 605 NIQEMFFSQVSNAGVNTALKPEQSDTYQLGFNTYKKGLFTQDDVLGVKLVGYRSFIKNYI 664
Query: 662 HNVYGKWWDLNGDIPSWVGSTGLAYTIRHRNFKDKVHKHGFELELNYDYGRFFTNLSYAY 721
HNVYG WW +P+W S G YTI H+N+K V K G ELE+NYD GRFF N+SYAY
Sbjct: 665 HNVYGVWW--RDGMPTWAESNGFKYTIAHQNYKPIVKKSGVELEINYDMGRFFANVSYAY 722
Query: 722 QKSTQPTNFSDASESPNNASKEDQLKQGYGLSRVSALPRDYGRLEVGTRWLGNKLTLGGA 781
Q++ QPTN++DAS PNNAS+ED LKQGYGLSRVS LP+DYGRLE+GTRW KLTLG A
Sbjct: 723 QRTNQPTNYADASPRPNNASQEDILKQGYGLSRVSMLPKDYGRLELGTRWFDQKLTLGLA 782
Query: 782 MRYFGKSIRATAEERYIDGTNGGNTSNVRQLGKRSIKQTETLARQPLIFDFYAAYEPKKN 841
RY+GKS RAT EE YI+G+ + +R+ ++K+TE + +QP+I D + +YEP K+
Sbjct: 783 ARYYGKSKRATIEEEYINGSR-FKKNTLRRENYYAVKKTEDIKKQPIILDLHVSYEPIKD 841
Query: 842 LIFRAEVKNLFDRRYIDPLDAGNDAATQRYYSSFDPKDKDEDVTCNADKTLCNGKYGGTS 901
LI +AEV+NL D+RY+DPLDAGNDAA+QRYYSS + + C D + C GG+
Sbjct: 842 LIIKAEVQNLLDKRYVDPLDAGNDAASQRYYSSL-----NNSIECAQDSSAC----GGSD 892
Query: 902 KSVLTNFARGRTFLMTMSYKF 922
K+VL NFARGRT++++++YKF
Sbjct: 893 KTVLYNFARGRTYILSLNYKF 913
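The underlined motif in the gonococcal protein is predicted to be an ATP/GTP-binding site motif A (P-loop). This analysis suggests that these proteins from N. meningitidis and N. gonorrhoeae, and their epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies.
Example 104
The following partial DNA sequence was identified in N. meningitidis <SEQ ID 885>: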
1 ATGAACCTGA TTTCACGTTA CATCATCCGT CAAATGGCGG TTATGGCGGT
51 TTACGCGCTC CTTGCCTTCC TCGCTTTGTA CAGCTTTTTT GAAATCCTGT
101 ACGAAACCGG CAACCTCGGC AAAGGCAGTT ACGGCATATG GGAAATGCTG
151 GGCTACACCG CCCTCAAAAT GCCCGCCCGC GCCTACGAAC TGATTCCCCT
201 CGCCGTCCTT ATCGGCGGAC TGGTCTCCCT CAGCCAGCTT GCCGCCGGCA
251 GCGAACTGAC CGTCATCAAA GCCAGCGGCA TGAGCACCAA AAAGCTGCTG
301 TTGATTCTGT CGCAGTTCGG TTTTATTTTT GCTATTGCCA CCGTCGCGCT
351 CGGCGAATGG GTTGCGCCCA CACTGAGCCA AAAAGCCGAA AACATCAAAG
401 CCGCCGCCAT CAACGGCAAA ATCAGCACCG GCAATACCGG CCTTTGGCTG
451 AAAGAAAAAA ACAGCGTGAT CAATGTGCGC GAAATGTTGC CCGACCAT..
This corresponds to the amino acid sequence <SEQ ID 886; ORF112>:
1 MNLISRYIIR QMAVMAVYAL LAFLALYSFF EILYETGNLG KGSYGIWEML
51 GYTALKMPAR AYELIPLAVL IGGLVSLSQL AAGSELTVIK ASGMSTKKLL
101 LILSQFGFIF AIATVALGEW VAPTLSQKAE NIKAAAINGK ISTGNTGLWL
151 KEKNSVINVR EMLPDH...
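The correspondence between a nucleotide sequence such as <SEQ ID 885> and its amino acid sequence can be checked with a short script. The following is a minimal sketch, assuming the standard genetic code and reading frame 1; the function name `translate` is illustrative and not part of the specification. Codons containing ambiguity codes (for example `r`, `k`, `s` or `N`, as seen in some of the listed sequences) are reported as `X`, which matches how such positions appear in the listed proteins.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order (third base fastest).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna: str) -> str:
    """Translate DNA in frame 1; unknown or ambiguous codons become 'X'."""
    seq = "".join(dna.split()).upper()  # drop the spaces used in the listing
    codons = (seq[i:i + 3] for i in range(0, len(seq) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


# First 60 nucleotides of <SEQ ID 885>, copied from the listing above.
first_60 = ("ATGAACCTGA TTTCACGTTA CATCATCCGT CAAATGGCGG TTATGGCGGT "
            "TTACGCGCTC")
print(translate(first_60))
```

A trailing incomplete codon is ignored, so partial sequences can be passed in directly.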
Further work revealed the partial nucleotide sequence <SEQ ID 887>:
1 ATGAACCTGA TTTCACGTTA CATCATCCGT CAAATGGCGG TTATGGCGGT
51 TTACGCGCTC CTTGCCTTCC TCGCTTTGTA CAGCTTTTTT GAAATCCTGT
101 ACGAAACCGG CAACCTCGGC AAAGGCAGTT ACGGCATATG GGAAATGCTG
151 gGCTACACCG CCCTCAAAAT GCCCGCCCGC GCCTACGAAC TGATTCCCCT
201 CGCCGTCCTT ATCGGCGGAC TGGTCTCCCT CAGCCAGCTT GCCGCCGGCA
251 GCGAACTGAC CGTCATCAAA GCCAGCGGCA TGAGCACCAA AAAGCTGCTG
301 TTGATTCTGT CGCAGTTCGG TTTTATTTTT GCTATTGCCA CCGTCGCGCT
351 CGGCGAATGG GTTGCGCCCA CACTGAGCCA AAAAGCCGAA AACATCAAAG
401 CCGCCGCCAT CAACGGCAAA ATCAGCACCG GCAATACCGG CCTTTGGCTG
451 AAAGAAAAAA ACAGCrTkAT CAATGTGCGC GAAATGTTGC CCGACCATAC
501 GCTTTTGGGC ATCAAAATTT GGGCGCGCAA CGATAAAAAC GAATTGGCAG
551 AGGCAGTGGA AGCCGATTCC GCCGTTTTGA ACAGCGACGG CAGTTGGCAG
601 TTGAAAAACA TCCGCCGCAG CACGCTTGGC GAAGACAAAG TCGAGGTCTC
651 TATTGCGGCT GAAGAAAACT GGCCGATTTC CGTCAAACGC AACCTGATGG
701 ACGTATTGCT CGTCAAACCC GACCAAATGT CCGTCGGCGA ACTGACCACC
751 TACATCCGCC ACCTCCAAAA CAACAGCCAA AACACCCGAA TCTACGCCAT
801 CGCATGGTGG CGCAAATTGG TTTACCCCGC CGCAGCCTGG GTGATGGCGC
851 TCGTCGCCTT TGCCTTTACC CCGCAAACCA CCCGCCACGG CAATATGGGC
901 TTAAAACTCT TCGGCGGCAT CTGTsTCGGA TTGCTGTTCC ACCTTGCCGG
951 ACGGCTCTTT GGGTTTACCA GCCAACTCGG...
This corresponds to the amino acid sequence <SEQ ID 888; ORF112-1>:
1 MNLISRYIIR QMAVMAVYAL LAFLALYSFF EILYETGNLG KGSYGIWEML
51 GYTALKMPAR AYELIPLAVL IGGLVSLSQL AAGSELTVIK ASGMSTKKLL
101 LILSQFGFIF AIATVALGEW VAPTLSQKAE NIKAAAINGK ISTGNTGLWL
151 KEKNSXINVR EMLPDHTLLG IKIWARNDKN ELAEAVEADS AVLNSDGSWQ
201 LKNIRRSTLG EDKVEVSIAA EENWPISVKR NLMDVLLVKP DQMSVGELTT
251 YIRHLQNNSQ NTRIYAIAWW RKLVYPAAAW VMALVAFAFT PQTTRHGNMG
301 LKLFGGICXG LLFHLAGRLF GFTSQL...
Computer analysis of this amino acid sequence predicted two transmembrane domains and gave the following results:
Homology with a predicted ORF from N. meningitidis (strain A)
ORF112 shows 96.4% identity over a 166 amino acid overlap with an ORF (ORF112a) from strain A of N. meningitidis:
10 20 30 40 50 60
orf112.pep MNLISRYIIRQMAVMAVYALLAFLALYSFFEILYETGNLGKGSYGIWEMLGYTALKMPAR
||||||||||||||||||||||||||||||||||||||||||||||||| ||||||| ||
orf112a MNLISRYIIRQMAVMAVYALLAFLALYSFFEILYETGNLGKGSYGIWEMXGYTALKMXAR
10 20 30 40 50 60
70 80 90 100 110 120
orf112.pep AYELIPLAVLIGGLVSLSQLAAGSELTVIKASGMSTKKLLLILSQFGFIFAIATVALGEW
||||:||||||||||| |||||||||:|||||||||||||||||||||||||||||||||
orf112a AYELMPLAVLIGGLVSXSQLAAGSELXVIKASGMSTKKLLLILSQFGFIFAIATVALGEW
70 80 90 100 110 120
130 140 150 160
orf112.pep VAPTLSQKAENIKAAAINGKISTGNTGLWLKEKNSVINVREMLPDH
|||||||||||||||||||||||||||||||||||:||||||||||
orf112a VAPTLSQKAENIKAAAINGKISTGNTGLWLKEKNSIINVREMLPDHTLLGIKIWARNDKN
130 140 150 160 170 180
orf112a ELAEAVEADSAVLNSDGSWQLKNIRRSTLGEDKVEVSIAAEEXWPISVKRNLMDVLLVKP
190 200 210 220 230 240
The ORF112a nucleotide sequence <SEQ ID 889> is:
1 ATGAACCTGA TTTCACGTTA CATCATCCGT CAAATGGCGG TTATGGCGGT
51 TTACGCGCTC CTTGCCTTCC TCGCTTTGTA CAGCTTTTTT GAAATCCTGT
101 ACGAAACCGG CAACCTCGGC AAAGGCAGTT ACGGCATATG GGAAATGNTG
151 GGNTACACCG CCCTCAAAAT GNCCGCCCGC GCCTACGAAC TGATGCCCCT
201 CGCCGTCCTT ATCGGCGGAC TGGTCTCTNT CAGCCAGCTT GCCGCCGGCA
251 GCGAACTGAN CGTCATCAAA GCCAGCGGCA TGAGCACCAA AAAGCTGCTG
301 TTGATTCTGT CGCAGTTCGG TTTTATTTTT GCTATTGCCA CCGTCGCGCT
351 CGGCGAATGG GTTGCGCCCA CACTGAGCCA AAAAGCCGAA AACATCAAAG
401 CCGCGGCCAT CAACGGCAAA ATCAGTACCG GCAATACCGG CCTTTGGCTG
451 AAAGAAAAAA ACAGCATTAT CAATGTGCGC GAAATGTTGC CCGACCATAC
501 CCTGCTGGGC ATTAAAATCT GGGCCCGCAA CGATAAAAAC GAACTGGCAG
551 AGGCAGTGGA AGCCGATTCC GCCGTTTTGA ACAGCGACGG CAGTTGGCAG
601 TTGAAAAACA TCCGCCGCAG CACGCTTGGC GAAGACAAAG TCGAGGTCTC
651 TATTGCGGCT GAAGAAAANT GGCCGATTTC CGTCAAACGC AACCTGATGG
701 ACGTATTGCT CGTCAAACCC GACCAAATGT CCGTCGGCGA ACTGACCACC
751 TACATCCGCC ACCTCCAAAN NNACAGCCAA AACACCCGAA TCTACGCCAT
801 CGCATGGTGG CGCAAATTGG TTTACCCCGC CGCAGCCTGG GTGATGGCGC
851 TCGTCGCCTT TGCCTTTACC CCGCAAACCA CCCGCCACGG CAATATGGGC
901 TTAAAANTCT TCGGCGGCAT CTGTCTCGGA TTGCTGTTCC ACCTTGCCGG
951 NCGGCTCTTC NGGTTTACCA GCCAACTCTA CGGCATCCCG CCCTTCCTCG
1001 NCGGCGCACT ACCTACCATA GCCTTCGCCT TGCTCGCCGT TTGGCTGATA
1051 CGCAAACAGG AAAAACGCTA A
This encodes a protein having the amino acid sequence <SEQ ID 890>:
1 MNLISRYIIR QMAVMAVYAL LAFLALYSFF EILYETGNLG KGSYGIWEMX
51 GYTALKMXAR AYELMPLAVL IGGLVSXSQL AAGSELXVIK ASGMSTKKLL
101 LILSQFGFIF AIATVALGEW VAPTLSQKAE NIKAAAINGK ISTGNTGLWL
151 KEKNSIINVR EMLPDHTLLG IKIWARNDKN ELAEAVEADS AVLNSDGSWQ
201 LKNIRRSTLG EDKVEVSIAA EEXWPISVKR NLMDVLLVKP DQMSVGELTT
251 YIRHLQXXSQ NTRIYAIAWW RKLVYPAAAW VMALVAFAFT PQTTRHGNMG
301 LKXFGGICLG LLFHLAGRLF XFTSQLYGIP PFLXGALPTI AFALLAVWLI
351 RKQEKR*
ORF112a and ORF112-1 show 96.3% identity over a 326 amino acid overlap:
orf112a.pep MNLISRYIIRQMAVMAVYALLAFLALYSFFEILYETGNLGKGSYGIWEMXGYTALKMXAR
||||||||||||||||||||||||||||||||||||||||||||||||| ||||||| ||
orf112-1 MNLISRYIIRQMAVMAVYALLAFLALYSFFEILYETGNLGKGSYGIWEMLGYTALKMPAR
orf112a.pep AYELMPLAVLIGGLVSXSQLAAGSELXVIKASGMSTKKLLLILSQFGFIFAIATVALGEW
||||:||||||||||| |||||||||:|||||||||||||||||||||||||||||||||
orf112-1 AYELIPLAVLIGGLVSLSQLAAGSELTVIKASGMSTKKLLLILSQFGFIFAIATVALGEW
orf112a.pep VAPTLSQKAENIKAAAINGKISTGNTGLWLKEKNSIINVREMLPDHTLLGIKIWARNDKN
||||||||||||||||||||||||||||||||||| ||||||||||||||||||||||||
orf112-1 VAPTLSQKAENIKAAAINGKISTGNTGLWLKEKNSXINVREMLPDHTLLGIKIWARNDKN
orf112a.pep ELAEAVEADSAVLNSDGSWQLKNIRRSTLGEDKVEVSIAAEEXWPISVKRNLMDVLLVKP
|||||||||||||||||||||||||||||||||||||||||| |||||||||||||||||
orf112-1 ELAEAVEADSAVLNSDGSWQLKNIRRSTLGEDKVEVSIAAEENWPISVKRNLMDVLLVKP
orf112a.pep DQMSVGELTTYIRHLQXXSQNTRIYAIAWWRKLVYPAAAWVMALVAFAFTPQTTRHGNMG
|||||||||||||||| ||||||||||||||||||||||||||||||||||||||||||
orf112-1 DQMSVGELTTYIRHLQNNSQNTRIYAIAWWRKLVYPAAAWVMALVAFAFTPQTTRHGNMG
orf112a.pep LKXFGGICLGLLFHLAGRLFXFTSQLYGIPPFLXGALPTIAFALLAVWLIRKQEKRX
|| ||||| ||||||||||| |||||
orf112-1 LKLFGGICXGLLFHLAGRLFGFTSQL
Homology with a predicted ORF from N. gonorrhoeae
ORF112 shows 95.8% identity over a 166 amino acid overlap with a predicted ORF (ORF112ng) from N. gonorrhoeae:
orf112.pep MNLISRYIIRQMAVMAVYALLAFLALYSFFEILYETGNLGKGSYGIWEMLGYTALKMPAR 60
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf112ng MNLISRYIIRQMAVMAVYALLAFLALYSFFEILYETGNLGKGSYGIWEMLGYTALKMPAR 60
orf112.pep AYELIPLAVLIGGLVSLSQLAAGSELTVIKASGMSTKKLLLILSQFGFIFAIATVALGEW 120
||||:|||||||||:|||||||||||:|||||||||||||||||||||||||||:|||||
orf112ng AYELMPLAVLIGGLASLSQLAAGSELAVIKASGMSTKKLLLILSQFGFIFAIAAVALGEW 120
orf112.pep VAPTLSQKAENIKAAAINGKISTGNTGLWLKEKNSVINVREMLPDH 166
|||||||||||||||||||||||||||||||||:|:|||| |||||
orf112ng VAPTLSQKAENIKAAAINGKISTGNTGLWLKEKTSIINVRGMLPDHTLLGIKIWARNDKN 180
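The identity figure quoted for this overlap can be reproduced by counting matching positions. The following is a minimal sketch, assuming the two sequences are already aligned without gaps (as they are in this particular overlap); the function name `percent_identity` is illustrative, and the alignment program actually used for the specification is not reproduced here. An `X` residue is simply treated as a mismatch.

```python
def percent_identity(a: str, b: str) -> float:
    """Percent of identical residues over the overlap of two ungapped, pre-aligned sequences."""
    overlap = min(len(a), len(b))
    matches = sum(1 for x, y in zip(a, b) if x == y)  # zip stops at the shorter sequence
    return 100.0 * matches / overlap


# ORF112 (166 residues) and the first 166 residues of ORF112ng, copied from the alignment above.
ORF112 = (
    "MNLISRYIIRQMAVMAVYALLAFLALYSFFEILYETGNLGKGSYGIWEMLGYTALKMPAR"
    "AYELIPLAVLIGGLVSLSQLAAGSELTVIKASGMSTKKLLLILSQFGFIFAIATVALGEW"
    "VAPTLSQKAENIKAAAINGKISTGNTGLWLKEKNSVINVREMLPDH"
)
ORF112NG = (
    "MNLISRYIIRQMAVMAVYALLAFLALYSFFEILYETGNLGKGSYGIWEMLGYTALKMPAR"
    "AYELMPLAVLIGGLASLSQLAAGSELAVIKASGMSTKKLLLILSQFGFIFAIAAVALGEW"
    "VAPTLSQKAENIKAAAINGKISTGNTGLWLKEKTSIINVRGMLPDH"
)

print(f"{percent_identity(ORF112, ORF112NG):.1f}% identity over {len(ORF112)} aa")
```

Seven of the 166 aligned positions differ, which gives the 95.8% stated in the text. Alignments that contain gaps would need a gap-aware comparison instead.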
The complete length ORF112ng nucleotide sequence <SEQ ID 891> is:
1 ATGAACCTGA TTTCACGTTA CATCATCCGC CAAATGGCGG TTATGGCGGT
51 TTACGCGCTC CTTGCCTTCC TCGCTTTGTA CAGCTTTTTT GAAATCCTGT
101 ACGAAACCGG CAACCTCGGC AAAGGCAGTT ACGGCATATG GGAAATGCTG
151 GGCTACACCG CCCTCAAAAT GCCCGCCCGC GCCTACGAAC TCATGCCCCT
201 CGCCGTCCTC ATCGGCGGAC TGGCCTCTCT CAGCCAGCTT GCCGCCGGCA
251 GCGAACTGGC CGTCATCAAA GCCAGCGGCA TGAGCACCAA AAAGCTGCTG
301 TTGATTCTGT CTCAGTTCGG TTTTATTTTT GCTATTGCCG CCGTCGCGCT
351 CGGCGAATGG GTTGCGCCCA CGCTGAGCCA AAAAGCCGAA AACATCAAag
401 cCGCCGCCAt taacggCAAA ATCAGCAccg gcAATACCGG CCTTTggcTG
451 AAAGAAAAAa ccAGCATTAT CAATGTGcGc GGAATGTTGC CCGACCATAC
501 GCTTTTGGGC ATCAAAATTT GGGCGCGCAA CGATAAAAAC GAATTGGCAG
551 AGGCAGTGGA AGCCGATTCC GCCGTTTTGA ACAGCGACGG CAGCTGGCAG
601 TTGAAAAACA TCCGCCGCAG CATCATGGGT ACAGACAAAA TCGAAACATC
651 cgCCGCCGCC GAAGAAACTT gGCCGATTGC CGTCAGACGC AACCTGATGG
701 ACGTATTGCT CGTCAAGCCC GACCAAATGT CCGTCGGCGA GCTGACCACC
751 TACATCCGCC ACCTCCAAAA CAACAGCCAA AACACCCAAA TCTACGCCAT
801 CGCATGGTGG CGTAAACTCG TTTACCCCGT CGCCGCATGG GTCATGGCGC
851 TCGTTGCCTT CGCCTTTACG CCGCAAACCA CGCGCCACGG CAATATGGGC
901 TTAAAACTCT TCGGCGGCAT CTGTCTCGGA TTGCTGTTCC ACCTTGCCGG
951 CAGGCTCTTC GGGTTTACCA GCCAACTCTA CGGCACCCCA CCCTTCCTCG
1001 CCGGCGCACT GCCTACCATA GCCTTCGCCT TGCTCGCTGT TTGGCTGATA
1051 CGCAAACAGG AAAAACGTTG A
This encodes a protein having the amino acid sequence <SEQ ID 892>:
1 MNLISRYIIR QMAVMAVYAL LAFLALYSFF EILYETGNLG KGSYGIWEML
51 GYTALKMPAR AYELMPLAVL IGGLASLSQL AAGSELAVIK ASGMSTKKLL
101 LILSQFGFIF AIAAVALGEW VAPTLSQKAE NIKAAAINGK ISTGNTGLWL
151 KEKTSIINVR GMLPDHTLLG IKIWARNDKN ELAEAVEADS AVLNSDGSWQ
201 LKNIRRSIMG TDKIETSAAA EETWPIAVRR NLMDVLLVKP DQMSVGELTT
251 YIRHLQNNSQ NTQIYAIAWW RKLVYPVAAW VMALVAFAFT PQTTRHGNMG
301 LKLFGGICLG LLFHLAGRLF GFTSQLYGTP PFLAGALPTI AFALLAVWLI
351 RKQEKR*
ORF112ng and ORF112-1 show 94.2% identity over a 326 amino acid overlap:
10 20 30 40 50 60
orf112ng MNLISRYIIRQMAVMAVYALLAFLALYSFFEILYETGNLGKGSYGIWEMLGYTALKMPAR
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
orf112-1 MNLISRYIIRQMAVMAVYALLAFLALYSFFEILYETGNLGKGSYGIWEMLGYTALKMPAR
10 20 30 40 50 60
70 80 90 100 110 120
orf112ng AYELMPLAVLIGGLASLSQLAAGSELAVIKASGMSTKKLLLILSQFGFIFAIAAVALGEW
||||:|||||||||:|||||||||||:||||||||||||||||||||||||||:||||||
orf112-1 AYELIPLAVLIGGLVSLSQLAAGSELTVIKASGMSTKKLLLILSQFGFIFAIATVALGEW
70 80 90 100 110 120
130 140 150 160 170 180
orf112ng VAPTLSQKAENIKAAAINGKISTGNTGLWLKEKTSIINVRGMLPDHTLLGIKIWARNDKN
|||||||||||||||||||||||||||||||||:| |||| |||||||||||||||||||
orf112-1 VAPTLSQKAENIKAAAINGKISTGNTGLWLKEKNSXINVREMLPDHTLLGIKIWARNDKN
130 140 150 160 170 180
190 200 210 220 230 240
orf112ng ELAEAVEADSAVLNSDGSWQLKNIRRSIMGTDKIETSAAAEETWPIAVRRNLMDVLLVKP
||||||||||||||||||||||||||| :| ||:|:| ||||:|||:|:|||||||||||
orf112-1 ELAEAVEADSAVLNSDGSWQLKNIRRSTLGEDKVEVSIAAEENWPISVKRNLMDVLLVKP
190 200 210 220 230 240
250 260 270 280 290 300
orf112ng DQMSVGELTTYIRHLQNNSQNTQIYAIAWWRKLVYPVAAWVMALVAFAFTPQTTRHGNMG
|||||||||||||||||||:||||||||||||||||:|||||||||||||||||||||||
orf112-1 DQMSVGELTTYIRHLQNNSQNTRIYAIAWWRKLVYPAAAWVMALVAFAFTPQTTRHGNMG
250 260 270 280 290 300
310 320 330 340 350
orf112ng LKLFGGICLGLLFHLAGRLFGFTSQLYGTPPFLAGALPTIAFALLAVWLIRKQEKRX
|||||||| |||||||||||||||||
orf112-1 LKLFGGICXGLLFHLAGRLFGFTSQL
310 320
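This analysis suggests that these proteins from N. meningitidis and N. gonorrhoeae, and their epitopes, could be useful antigens for vaccines or diagnostics, or for raising antibodies.
It will be appreciated that the invention has been described by way of example only, and that modifications may be made whilst remaining within the spirit and scope of the invention.
Table I - PCR primers
ORF | Primer | Sequence | Restriction sites |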
ORF 1ORF 2ORF 2-1ORF 4ORF 5ORF 6ORF 7ORF 8ORF 9ORF 10 | 正向反向正向反向正向反向正向反向正向正向反向正向反向正向反向正向反向正向反向正向反向 | CGC GGATCCGCTAGC-GGACACACTTATTTCGGCCCG CTCGAG-CCAGCGGTAGCCTAATTGC GGATCCCATATG-TTTGATTTCGGTTTGGGCCCG CTCGAG-GACGGCATAACGGCGGC GGATCCCATATG-TTTGATTTCGGTTTGGGCCCG CTCGAG-TGATTTACGGACGCGCAGC GGATCCCATATG-TGCGGAGGTCAAAAAGACCCCG CTCGAG-TTTGGCTGCGCCTTCGGAATTC CATATGG CCATGG-TGGAAGGCGCACAACCCG GGATCC-ATGGAAGGCGCACAACCCCG CTCGAG-GACTGTGCAAAAACGGCGC GGATCCCATATG-ACCCGTCAATCTCTGCACCCG CTCGAG-TGCGCCGAACACTTTCCGC GGATCCGCTAGC-GCGCTGCTTTTTGTTCCCCCG CTCGAG-TTTCAAAATATATTTGCGGAGC GGATCCCATATG-GCTCAACTGCTTCGTACCCCG CTCGAG-AGCAGGCTTTGGCGCCGC GGATCCCATATG-CCGAAGGAAGTCGGAAACCCG CTCGAG-TTTCCGAGGTTTTCGGGGC GGATCCCATATG-GACACAAAAGAAATCCTCCCCG CTCGAG-TAATGGGAAACCTTGTTTT | BamHI-NheIXhoIBamHI-NdeIXhoIBamHI-NdeIXhoIBamHI-NdeIXhoINdeI-NcoIBamHIXhoIBamHI-NdeIXhoIBamHI-NheIXhoIBamHI-NdeIXhoIBamHI-NdeIXhoIBamHI-NdeIXhoI |
ORF 11ORF 13ORF 15ORF 17ORF 18ORF 19ORF 20ORF 22ORF 23ORF 24 | 正向反向正向反向正向正向反向正向正向反向正向反向正向正向反向正向正向反向正向正向反向正向反向正向正向 | GC GGATCCCATATG-GCGGTCAACCTCTACGCCCG CTCGAG-GGAAACGACTTCGCCCGC GGATCCCATATG-GCTCTGCTTTCCGCGCCCCG CTCGAG-AGGGTGTGTGATAATAAGGGAATTC CATATGG CCATGG-GCGGGACACTGACAGCG GGATCC-TGCGGGACACTGACAGGCCCG CTCGAG-AGGTTGGCCTTGTCTATGGGAATT CCATATGG CCATGG-TTGCCGGCCTGTTCGCG GGATCC-ATTGCCGGCCTGTTCGCCCG CTCGAG-AAGCAGGTTGTACAGCGC GGATCCCATATG-ATTTTGCTGCATTTGGATCCCG CTCGAG-TCTTCCAATTTCTGAAAGCGGAATTC CATATGG CCATGG-TCGCCAGTGTTTTTACCCG GGATCC-TTCGCCAGTGTTTTTACCGCCCG CTCGAG-GGTGTTTTTGAAGCTGCCGGAATTC CATATGG CCATGG-TCGGCGCGGGTATGCG GGATCC-TTCGGCGCGGGTATGCCCG CTCGAG-CGGCGAGCGAGAGCAGGAATTC CATATGG CCATGG-TGATTAAAATCAAAAAAGGTCTCG GGATCC-ATGATTAAAATCAAAAAAGGTCTAAACCCCCG CTCGAG-ATTATGATAGCGGCCCCGC GGATCCCATATG-GATGTTTCTGTTTCAGACCCCG CTCGAG-TTTAAACCGATAGGTAAACGGGAATTC CATATGG CCATGG-TGATGCCGGAAATGGTGCG GGATCC-ATGATGCCGGAAATGGTG | BamHI-NdeIXhoIBamHI-NdeIXhoINdeI-NcoIBamHIXhoINdeI-NcoIBamHIXhoIBamHI-NdeIXhoINdeI-NcoIBamHIXhoINdeI-NcoIBamHIXhoINdeI-NcoIBamHIXhoIBamHI-NdeIXhoINdeI-NcoIBamHI |
ORF 25ORF 26ORF 27ORF 28ORF 29ORF 32ORF 33ORF 35ORF 37ORF 58 | 反向正向反向正向反向正向正向反向正向正向反向正向正向反向正向反向正向反向正向正向反向正向反向正向反向 | CCCG CTCGAG-TGTCAGCGTGGCGCAGC GGATCCCATATG-TATCGCAAACTGATTGCCCCG CTCGAG-ATCGATGGAATAGCCGGC GGATCCCATATG-CAGCTGATCGACTATTCCCCG CTCGAG-GACATCGGCGCGTTTTGGAATTC CATATGG CCATGG-AGACCTATTCTGTTTACG GGATCC-CAGACCTATTCTGTTTATTTTAATCCCCG CTCGAG-GGGTTCGATTAAATAACCATGGAATTC CATATGG CCATGG-ACGGCTGTACGTTGATGTCG GGATCC-AACGGCTGTACGTTGATGCCCG CTCGAG-TTTGTCAGAGGAATTCGCGGC GGATCCCATATG-AACGGTTTGGATGCCCGCGC GGATCCGCTAGC-AACGGTTTGGATGCCCGCCCG CTCGAG-TTTGTCTAAGTTCCTGATATGCGC GGATCCCATATG-AATACTCCTCCTTTTGCCCG CTCGAG-GCGTATTTTTTGATGCTTTGGC GGATCCCATATG-ATTGATAGGGATCGTATGCCCG CTCGAG-TTGATCTTTCAAACGGCCGC GGATCCCATATG-TTCAGAGCTCAGCTTCGC GGATCCGCTAGC-TTCAGAGCTCAGCTTCCCG CTCGAG-AAACAGCCATTTGAGCGAGC GGATCCCATATG-GATGACGTATCGGATTTTCCCG CTCGAG-ATAGCCCGCTTTCAGGCGC GGATCCGCTAGC-TCCGAACGCGAGTGGATCCCG CTCGAG-AGCATTGTCCAAGGGGAC | XhoIBamHI-NdeIXhoIBamHI-NdeIXhoINdeI-NcoIBamHIXhoINdeI-NcoIBamHIXhoIBamHI-NdeIBamHI-NheIXhoIBamHI-NdeIXhoIBamHI-NdeIXhoIBamHI-NdeIBamHI-NheIXhoIBamHI-NdeIXhoIBamHI-NheIXhoI |
ORF 65ORF 66ORF 72ORF 73ORF 75ORF 76ORF 79ORF 83ORF 84ORF 85ORF 89 | 正向正向反向正向反向正向反向正向反向正向反向正向反向正向反向正向反向正向反向正向反向正向正向 | GGAATTC CATATGG CCATGG-TGCTGTATCTGAATCAAGCG GGATCC-TTGCTGTATCTGAATCAAGGCCCG CTCGAG-CCGCATCGGCAGACAGC GGATCCCATATG-TACGCATTTACCGCCGCCCG CTCGAG-TGGATTTTGCAGAGATGGCGC GGATCCCATATG-AATGCAGTAAAAATATCTGACCCG CTCGAG-GCCTGAGACCTTTGCAAGC GGATCCCATATG-AGATTTTTCGGTATCGGCCCG CTCGAG-TTCATCTTTTTCATGTTCGGC GGATCCCATATG-TCTGTCTTTCAAACGGCCCCG CTCGAG-TTTGTTTTTGCAAGACAGGATCA GCTAGCCATATG-AAACAGAAAAAAACCGCCG GGATCC-TTACGGTTTGACACCGTTCGC GGATCCCATATG-GTTTCCGCCGCCGCCCG CTCGAG-GTGCTGATGCGCTTCGGC GGATCCCATATG-AAAACCCTGCTGCTGCCCCG CTCGAG-GCCGCCTTTGCGGCGC GGATCCCATATG-GCAGAGATCTGTTTGCCCG CTCGAG-GTTTGCCGATCCGACCACGC GGATCCCATATG-GCGGTTTGGGGCGGACCCG CTCGAG-TCGGCGCGGCGGGCGGAATTC CATATGG CCATGG-CCATACCTTCTTATCACG GGATCC-GCCATACCTTCTTATCAGAG | NdeI-NcoIBamHIXhoIBamHI-NdeIXhoIBamHI-NdeIXhoIBamHI-NdeIXhoIBamHI-NdeIXhoINheI-NdeIBamHIBamHI-NdeIXhoIBamHI-NdeIXhoIBamHI-NdeIXhoIBamHI-NdeIXhoINdeI-NcoIBamHI |
ORF 97ORF 98ORF 100ORF 101ORF 102ORF 103ORF 104ORF 105ORF 106ORF 109ORF 110 | 反向正向反向正向反向正向反向正向反向正向反向正向反向正向反向正向反向正向反向正向反向正向反向 | CCCG CTCGAG-TTTTTTGCGATTAGAAAAAGCGC GGATCCCATATG-CATCCTGCCAGCGAACCCCG CTCGAG-TTCGCCTACGGTTTTTTGGC GGATCCCATATG-ACGGTAACTGCGGCCCG CTCGAG-TTGTTGTTCGGGCAAATCGCGGATCCCATATG-TCGGGCATTTACACCGCCCG CTCGAG-ACGGGTTTCGGCGGAAGC GGATCCCATATG-ATTTATCAAAGAAACCTCCCCG CTCGAG-TTTTCCGCCTTTCAATGTGC GGATCCCATATG-GCAGGGCTGTTTTACCCCCG CTCGAG-AAACGGTTTGAACACGACGC GGATCCCATATG-AACCACGACATCACCCCG CTCGAG-CAGCCACAGGACGGCGC GGATCCCATATG-ACGTGGGGAACGCCCCG CTCGAG-GCGGCGTTTGAACGGCGC GGATCCCATATG-ACCAAATTTCAAACCCCTCCCCG CTCGAG-TAAACGAATGCCGTCCAGGC GGATCCCATATG-AGGATAACCGACGGCGCCCG CTCGAG-TTTGTTCCCGATGATGTTGC GGATCCCATATG-GAAGATTTATATATAATACTCGCCCG CTCGAG-ATCAGCTTCGAACCGAAGAAA GAATTC-ATGAGTAAATCCCGTAGATCTCCCAAA CTGCAG-GGAAAACCACATCCGCACTCTGCC | XhoIBamHI-NdeIXhoIBamHI-NdeIXhoIBamHI-NdeIXhoIBamHI-NdeIXhoIBamHI-NdeIXhoIBamHI-NdeIXhoIBamHI-NdeIXhoIBamHI-NdeIXhoIBamHI-NdeIXhoIBamHI-NdeIXhoIEcoRIPstI |
ORF 111ORF 113ORF 115ORF 119ORF 120ORF 121ORF 122ORF 125ORF 126ORF 127ORF 128ORF 129 | 正向反向正向反向正向反向正向反向正向反向正向反向正向反向正向反向正向反向正向反向正向反向正向反向 | AAA GAATTC-GCACCGCAAAAGGCAAAAACCGCAAAA CTGCAG-TCTGCGCGTTTTCGGGCAGGGTGGAAA GAATTC-ATGAACAAAACCCTCTATCGTGTGATTTTCAACCGAAA CTGCAG-TTACGAATGCCTGCTTGCTCGACCGTACTGAAA GAATTC-TTGCTTGTGCAAACAGAAAAAGACGGAAAAAA GTCGAC-CTATTTTTTAGGGGCTTTTGCTTGTTTGAAAAGCCTGCCAAA GAATTC-TACAACATGTATCAGGAAAACCAATACCGAAA CTGCAG-TTATGAAAACAGGCGCAGGGCGGTTTTGCCAAA GAATTC-GCAAGGCTACCCCAATCCGCCGTGAAA CTGCAG-CGGTTTGGCTGCCTGGCCGTTGATAAA GAATTC-GCCTTGGTCTGGCTGGTTTTCGCAAA CTGCAG-TCATCCGCCACCCCACCTCGGCCATCCATCAAAAAA GTCGAC-ATGTCTTACCG CGCAAGCAGTTCTCCAAA CTGCAG-TCAGGAACACAAACGATGACGAATATCCGTATCAAA GAATTC-GCGCTGTTTTTTGCGGCGGCGTATAAA CTGCAG-CGCCGTTTCAAGACGAAAAAGTCGAAA GAATTC-GCGGAAACGGTCGAAGAAA CTGCAG-TTAATCTTGTCTTCCGATATACAAA GAATTC-ATGACTGATAATCGGGGGTTTACGAAAAAA GTCGAC-CTTAAGTAACTTGCAGTCCTTATCAAA GAATTC-ATGCAAGCTGTCCGCTACAGGCCAAA CTGCAG-CTATTGCAATGCGCCGCCGCGGGAATGAATGAGCAGGCGAAA GAATTC-ATGGATTTTCGTTTTGACATTATTTACGAATACCGAAA CTGCAG-TTATTTTTTGATGAAATTTTGGGGCGG | EcoRIPstIEcoRIPstIEcoRISalIEcoRIPstIEcoRIPstIEcoRIPstISalIPstIEcoRIPstIEcoRIPstIEcoRISatIEcoRIPstIEcoRIPstI |
ORF 130ORF 131ORF 132ORF 133ORF 134ORF 135ORF 136ORF 137ORF 138ORF 139ORF 140ORF 141 | 正向反向正向反向正向反向正向反向正向反向正向反向正向反向正向反向正向反向正向反向正向反向正向 | AAA GAATTC-GCAGTACTTGCCAT TCTCGGTGCGAAA CTGCAG-CTCCGGATCGTCTGTAAACGCATTGC GGATCCCATATG-GAAATTCGGGCAATAAAATCCCG CTCGAG-CCAGCGGACGCGTTCGC GGATCCCATATG-AAAGAAGCGGGGTTTGCCCG CTCGAG-CCAATCTGCCAGCCGTCGCGGATCCCATATG-GAAGATGCAGGGCGCGCCCG CTCGAG-AAACTTGTAGCTCATCGTGC GGATCCCATATG-TCTGTGCAAGCAGTATTGCCCG CTCGAG-ATCCTGTGCCAATGCGGC GGATCCCATATG-CCGTCTGAAAAAGCTTTCCCG CTCGAG-AAATACCGCTGAGGATGCGC GGATCCGCTAGC-ATGAAGCGGCGTATAGCCCCCG CTCGAG-TTCCGAATATTTGGAACTTTTCGC GGATCCCATATG-GGCACGGCGGGAAATACCCG CTCGAG-ATAACGGTATGCCGCCGC GGATCCCATATG-TTTCGTTTACAATTCAGGCCCCG CTCGAG-CGGCGTTTTATAGCGGGC GGATCCCATATG-GCTTTTTTGGCGGTAATGCCCG CTCGAG-TAACGTTTCCGTGCGTTTGC GGATCCCATATG-TTGCCCACAGGCAGCCCCG CTCGAG-GACGATGGCAAACAGCGC GGATCCCATATG-CCGTCTGAAGCAGTCT | EcoRIPstIBamHI-NdeIXhoIBamHI-NdeIXhoIBamHI-NdeIXhoIBamHI-NdeIXhoIBamHI-NdeIXhoIBamHI-NheIXhoIBamHI-NdeIXhoIBamHI-NdeIXhoIBamHI-NdeIXhoIBamHI-NdeIXhoIBamHI-NdeI |
ORF 142ORF 143ORF 144ORF 147 | 反向正向反向正向反向正向反向正向反向 | CCCG CTCGAG-ATCTGTTGTTTTTAAAATATTGC GGATCCCATATG-GATAATTCTGGTAGTGAAGCCCG CTCGAG-AAACGTATAGCCTACCTGC GGATCCCATATG-GATACCGCTTTGAACCTCCCG CTCGAG-AATGGCTTCCGCAATATGGC GGATCCCATATG-ACCTTTTTACAACGTTTGCCCCG CTCGAG-AGATTGTTGTTGTTTTTTCGGC GGATCCCATATG-TCTGTCTTTCAAACGGCCCCG CTCGAG-TTTGTTTTTGCAAGACAG | XhoIBamHI-NdeIXhoIBamHI-NdeIXhoIBamHI-NdeIXhoIBamHI-NdeIXhoI |
NB:
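- Restriction sites are underlined.
- For ORF110-130, where the ORF itself carries an EcoRI site (e.g. ORF122), a SalI site is used in the forward primer. Similarly, where the ORF carries a PstI site (e.g. ORF115 and ORF127), a SalI site is used in the reverse primer.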
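The site-selection rule in the note above can be expressed as a small check on the ORF sequence. The following is a minimal sketch, assuming the usual recognition sequences for the three enzymes (EcoRI GAATTC, PstI CTGCAG, SalI GTCGAC, all of which appear in the primers of Table I); the function names and the short test sequences are hypothetical and are not taken from the specification. The case of an ORF that also contains a SalI site is not covered by the note and is not handled here.

```python
from typing import Tuple

# Recognition sequences of the enzymes named in the note to Table I.
# All three are palindromic, so scanning one strand is sufficient.
SITES = {"EcoRI": "GAATTC", "PstI": "CTGCAG", "SalI": "GTCGAC"}


def has_site(orf_dna: str, enzyme: str) -> bool:
    """True if the ORF contains the enzyme's recognition sequence."""
    return SITES[enzyme] in "".join(orf_dna.split()).upper()


def choose_cloning_sites(orf_dna: str) -> Tuple[str, str]:
    """Return (forward_site, reverse_site) for the EcoRI/PstI cloning scheme.

    SalI replaces EcoRI in the forward primer, or PstI in the reverse primer,
    whenever the ORF itself would be cut by the default enzyme.
    """
    forward = "SalI" if has_site(orf_dna, "EcoRI") else "EcoRI"
    reverse = "SalI" if has_site(orf_dna, "PstI") else "PstI"
    return forward, reverse


# Hypothetical toy sequences, for illustration only.
print(choose_cloning_sites("ATGAAACCCGGG"))      # no internal sites
print(choose_cloning_sites("ATGGAATTCAAATAA"))   # internal EcoRI site
print(choose_cloning_sites("ATGCTGCAGAAATAA"))   # internal PstI site
```

Checking the ORF before choosing the primer sites avoids cutting the insert during cloning, which is the rationale the note gives for the SalI substitutions.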
Table II - Summary of cloning, expression and purification
ORF | PCR/cloning | His-fusion expression | GST-fusion expression | Purification |
orf1 | + | + | + | His-fusion |
orf2 | + | + | + | GST-fusion |
orf2.1 | + | n.d. | + | GST-fusion |
orf4 | + | + | + | His-fusion |
orf5 | + | n.d. | + | GST-fusion |
orf6 | + | + | + | GST-fusion |
orf7 | + | + | + | GST-fusion |
orf8 | + | n.d. | n.d. | |
orf9 | + | + | + | GST-fusion |
orf10 | + | n.d. | n.d. | |
orf11 | + | n.d. | n.d. | |
orf13 | + | n.d. | + | GST-fusion |
orf15 | + | + | + | GST-fusion |
orf17 | + | n.d. | n.d. | |
orf18 | + | n.d. | n.d. | |
orf19 | + | n.d. | n.d. | |
orf20 | + | n.d. | n.d. | |
orf22 | + | + | + | GST-fusion |
orf23 | + | + | + | His-fusion |
orf24 | + | n.d. | n.d. | |
orf25 | + | + | + | His-fusion |
orf26 | + | n.d. | n.d. | |
orf27 | + | + | + | GST-fusion |
orf28 | + | + | + | GST-fusion |
orf29 | + | n.d. | n.d. | |
orf32 | + | + | + | His-fusion |
orf33 | + | n.d. | n.d. | |
orf35 | + | n.d. | n.d. | |
orf37 | +- | + | + | GST-fusion |
orf58 | + | n.d. | n.d. | |
orf65 | + | n.d. | n.d. | |
orf66 | + | n.d. | n.d. | |
orf72 | + | + | n.d. | His-fusion |
orf73 | + | n.d. | + | n.d. |
orf75 | + | n.d. | n.d. | |
orf76 | + | + | n.d. | His-fusion |
orf79 | + | + | n.d. | His-fusion |
orf83 | + | n.d. | + | n.d. |
orf84 | + | n.d. | n.d. | |
orf85 | + | n.d. | + | GST-fusion |
orf89 | + | n.d. | + | GST-fusion |
orf97 | + | + | + | GST-fusion |
orf98 | + | n.d. | n.d. | |
orf100 | + | n.d. | n.d. | |
orf101 | + | n.d. | n.d. | |
orf102 | + | n.d. | n.d. | |
orf103 | + | n.d. | n.d. | |
orf104 | + | n.d. | n.d. | |
orf105 | + | n.d. | n.d. | |
orf106 | + | + | + | His-fusion |
orf109 | + | n.d. | n.d. | |
orf110 | + | n.d. | n.d. | |
orf111 | + | + | n.d. | His-fusion |
orf113 | + | + | n.d. | His-fusion |
orf115 | n.d. | n.d. | n.d. | |
orf119 | + | + | n.d. | His-fusion |
orf120 | + | + | n.d. | His-fusion |
orf121 | + | n.d. | n.d. | |
orf122 | + | + | n.d. | His-fusion |
orf125 | + | + | n.d. | His-fusion |
orf126 | + | + | n.d. | His-fusion |
orf127 | + | + | n.d. | His-fusion |
orf128 | + | n.d. | n.d. | |
orf129 | + | + | n.d. | His-fusion |
orf130 | + | n.d. | n.d. | |
orf131 | + | + | + | n.d. |
orf132 | + | + | + | His-fusion |
orf133 | + | n.d. | + | GST-fusion |
orf134 | + | n.d. | n.d. | |
orf135 | + | n.d. | n.d. | |
orf136 | + | n.d. | n.d. | |
orf137 | + | n.d. | + | GST-fusion |
orf138 | + | n.d. | + | GST-fusion |
orf139 | + | n.d. | n.d. | |
orf140 | + | n.d. | n.d. | |
orf141 | + | n.d. | n.d. | |
orf142 | + | n.d. | n.d. | |
orf143 | + | n.d. | n.d. | |
orf144 | + | n.d. | + | n.d. |
orf147 | + | n.d. | n.d. | |
n.d. = not determined
Claims (17)
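1. A protein comprising an amino acid sequence selected from SEQ IDs 2, 4, 6 and 8.
2. A nucleic acid molecule which encodes the protein of claim 1.
3. The nucleic acid molecule of claim 2, comprising a nucleotide sequence selected from SEQ IDs 1, 3, 5 and 7.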
4. A protein comprising an amino acid sequence selected from SEQ IDs 2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, 50, 52, 54, 56, 58, 60, 62, 64, 66, 68, 70, 72, 74, 76, 78, 80, 82, 84, 86, 88, 90, 92, 94, 96, 98, 100, 102, 104, 106, 108, 110, 112, 114, 116, 118, 120, 122, 124, 126, 128, 130, 132, 134, 136, 138, 140, 142, 144, 146, 148, 150, 152, 154, 156, 158, 160, 162, 164, 166, 168, 170, 172, 174, 176, 178, 180, 182, 184, 186, 188, 190, 192, 194, 196, 198, 200, 202, 204, 206, 208, 210, 212, 214, 216, 218, 220, 222, 224, 226, 228, 230, 232, 234, 236, 238, 240, 242, 244, 246, 248, 250, 252, 254, 256, 258, 260, 262, 264, 266, 268, 270, 272, 274, 276, 278, 280, 282, 284, 286, 288, 290, 292, 294, 296, 298, 300, 302, 304, 306, 308, 310, 312, 314, 316, 318, 320, 322, 324, 326, 328, 330, 332, 334, 336, 338, 340, 342, 344, 346, 348, 350, 352, 354, 356, 358, 360, 362, 364, 366, 368, 370, 372, 374, 376, 378, 380, 382, 384, 386, 388, 390, 392, 394, 396, 398, 400, 402, 404, 406, 408, 410, 412, 414, 416, 418, 420, 422, 424, 426, 428, 430, 432, 434, 436, 438, 440, 442, 444, 446, 448, 450, 452, 454, 456, 458, 460, 462, 464, 466, 468, 470, 472, 474, 476, 478, 480, 482, 484, 486, 488, 490, 492, 494, 496, 498, 500, 502, 504, 506, 508, 510, 512, 514, 516, 518, 520, 522, 524, 526, 528, 530, 532, 534, 536, 538, 540, 542, 544, 546, 548, 550, 552, 554, 556, 558, 560, 562, 564, 566, 568, 570, 572, 574, 576, 578, 580, 582, 584, 586, 588, 590, 592, 594, 596, 598, 600, 602, 604, 606, 608, 610, 612, 614, 616, 618, 620, 622, 624, 626, 628, 630, 632, 634, 636, 638, 640, 642, 644, 646, 648, 650, 652, 654, 656, 658, 660, 662, 664, 666, 668, 670, 672, 674, 676, 678, 680, 682, 684, 686, 688, 690, 692, 694, 696, 698, 700, 702, 704, 706, 708, 710, 712, 714, 716, 718, 720, 722, 724, 726, 728, 730, 732, 734, 736, 738, 740, 742, 744, 746, 748, 750, 752, 754, 756, 758, 760, 762, 764, 766, 768, 770, 772, 774, 776, 778, 780, 782, 784, 786, 788, 790, 792, 794, 796, 798, 800, 802, 804, 806, 808, 810, 812, 814, 816, 818, 820, 822, 824, 826, 828, 830, 832, 834, 836, 838, 840, 842, 844, 846, 848, 850, 852, 854, 856, 858, 860, 862, 864, 866, 868, 870, 872, 874, 876, 878, 880, 882, 884, 886, 888, 890 and 892.
5. A protein having 50% or greater sequence identity to the protein of claim 4.
6. A protein comprising a fragment of an amino acid sequence selected from SEQ IDs 2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, 50, 52, 54, 56, 58, 60, 62, 64, 66, 68, 70, 72, 74, 76, 78, 80, 82, 84, 86, 88, 90, 92, 94, 96, 98, 100, 102, 104, 106, 108, 110, 112, 114, 116, 118, 120, 122, 124, 126, 128, 130, 132, 134, 136, 138, 140, 142, 144, 146, 148, 150, 152, 154, 156, 158, 160, 162, 164, 166, 168, 170, 172, 174, 176, 178, 180, 182, 184, 186, 188, 190, 192, 194, 196, 198, 200, 202, 204, 206, 208, 210, 212, 214, 216, 218, 220, 222, 224, 226, 228, 230, 232, 234, 236, 238, 240, 242, 244, 246, 248, 250, 252, 254, 256, 258, 260, 262, 264, 266, 268, 270, 272, 274, 276, 278, 280, 282, 284, 286, 288, 290, 292, 294, 296, 298, 300, 302, 304, 306, 308, 310, 312, 314, 316, 318, 320, 322, 324, 326, 328, 330, 332, 334, 336, 338, 340, 342, 344, 346, 348, 350, 352, 354, 356, 358, 360, 362, 364, 366, 368, 370, 372, 374, 376, 378, 380, 382, 384, 386, 388, 390, 392, 394, 396, 398, 400, 402, 404, 406, 408, 410, 412, 414, 416, 418, 420, 422, 424, 426, 428, 430, 432, 434, 436, 438, 440, 442, 444, 446, 448, 450, 452, 454, 456, 458, 460, 462, 464, 466, 468, 470, 472, 474, 476, 478, 480, 482, 484, 486, 488, 490, 492, 494, 496, 498, 500, 502, 504, 506, 508, 510, 512, 514, 516, 518, 520, 522, 524, 526, 528, 530, 532, 534, 536, 538, 540, 542, 544, 546, 548, 550, 552, 554, 556, 558, 560, 562, 564, 566, 568, 570, 572, 574, 576, 578, 580, 582, 584, 586, 588, 590, 592, 594, 596, 598, 600, 602, 604, 606, 608, 610, 612, 614, 616, 618, 620, 622, 624, 626, 628, 630, 632, 634, 636, 638, 640, 642, 644, 646, 648, 650, 652, 654, 656, 658, 660, 662, 664, 666, 668, 670, 672, 674, 676, 678, 680, 682, 684, 686, 688, 690, 692, 694, 696, 698, 700, 702, 704, 706, 708, 710, 712, 714, 716, 718, 720, 722, 724, 726, 728, 730, 732, 734, 736, 738, 740, 742, 744, 746, 748, 750, 752, 754, 756, 758, 760, 762, 764, 766, 768, 770, 772, 774, 776, 778, 780, 782, 784, 786, 788, 790, 792, 794, 796, 798, 800, 802, 804, 806, 808, 810, 812, 814, 816, 818, 820, 822, 824, 826, 828, 830, 832, 834, 836, 838, 840, 842, 844, 846, 848, 850, 852, 854, 856, 858, 860, 862, 864, 866, 868, 870, 872, 874, 876, 878, 880, 882, 884, 886, 888, 890 and 892.
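7. An antibody which binds to a protein according to any one of claims 4 to 6.
8. A nucleic acid molecule which encodes a protein according to any one of claims 4 to 6.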
9. A nucleic acid molecule according to claim 8, comprising a nucleotide sequence selected from SEQ IDs 1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29, 31, 33, 35, 37, 39, 41, 43, 45, 47, 49, 51, 53, 55, 57, 59, 61, 63, 65, 67, 69, 71, 73, 75, 77, 79, 81, 83, 85, 87, 89, 91, 93, 95, 97, 99, 101, 103, 105, 107, 109, 111, 113, 115, 117, 119, 121, 123, 125, 127, 129, 131, 133, 135, 137, 139, 141, 143, 145, 147, 149, 151, 153, 155, 157, 159, 161, 163, 165, 167, 169, 171, 173, 175, 177, 179, 181, 183, 185, 187, 189, 191, 193, 195, 197, 199, 201, 203, 205, 207, 209, 211, 213, 215, 217, 219, 221, 223, 225, 227, 229, 231, 233, 235, 237, 239, 241, 243, 245, 247, 249, 251, 253, 255, 257, 259, 261, 263, 265, 267, 269, 271, 273, 275, 277, 279, 281, 283, 285, 287, 289, 291, 293, 295, 297, 299, 301, 303, 305, 307, 309, 311, 313, 315, 317, 319, 321, 323, 325, 327, 329, 331, 333, 335, 337, 339, 341, 343, 345, 347, 349, 351, 353, 355, 357, 359, 361, 363, 365, 367, 369, 371, 373, 375, 377, 379, 381, 383, 385, 387, 389, 391, 393, 395, 397, 399, 401, 403, 405, 407, 409, 411, 413, 415, 417, 419, 421, 423, 425, 427, 429, 431, 433, 435, 437, 439, 441, 443, 445, 447, 449, 451, 453, 455, 457, 459, 461, 463, 465, 467, 469, 471, 473, 475, 477, 479, 481, 483, 485, 487, 489, 491, 493, 495, 497, 499, 501, 503, 505, 507, 509, 511, 513, 515, 517, 519, 521, 523, 525, 527, 529, 531, 533, 535, 537, 539, 541, 543, 545, 547, 549, 551, 553, 555, 557, 559, 561, 563, 565, 567, 569, 571, 573, 575, 577, 579, 581, 583, 585, 587, 589, 591, 593, 595, 597, 599, 601, 603, 605, 607, 609, 611, 613, 615, 617, 619, 621, 623, 625, 627, 629, 631, 633, 635, 637, 639, 641, 643, 645, 647, 649, 651, 653, 655, 657, 659, 661, 663, 665, 667, 669, 671, 673, 675, 677, 679, 681, 683, 685, 687, 689, 691, 693, 695, 697, 699, 701, 703, 705, 707, 709, 711, 713, 715, 717, 719, 721, 723, 725, 727, 729, 731, 733, 735, 737, 739, 741, 743, 745, 747, 749, 751, 753, 755, 757, 759, 761, 763, 765, 767, 769, 771, 773, 775, 777, 779, 781, 783, 785, 787, 789, 791, 793, 795, 797, 799, 801, 803, 805, 807, 809, 811, 813, 815, 817, 819, 821, 823, 825, 827, 829, 831, 833, 835, 837, 839, 841, 843, 845, 847, 849, 851, 853, 855, 857, 859, 861, 863, 865, 867, 869, 871, 873, 875, 877, 879, 881, 883, 885, 887, 889 and 891.
10. A nucleic acid molecule comprising a fragment of a nucleotide sequence selected from SEQ ID 1,3,5,7,9,11,13,15,17,19,21,23,25,27,29,31,33,35,37,39,41,43,45,47,49,51,53,55,57,59,61,63,65,67,69,71,73,75,77,79,81,83,85,87,89,91,93,95,97,99,101,103,105,107,109,111,113,115,117,119,121,123,125,127,129,131,133,135,137,139,141,143,145,147,149,151,153,155,157,159,161,163,165,167,169,171,173,175,177,179,181,183,185,187,189,191,193,195,197,199,201,203,205,207,209,211,213,215,217,219,221,223,225,227,229,231,233,235,237,239,241,243,245,247,249,251,253,255,257,259,261,263,265,267,269,271,273,275,277,279,281,283,285,287,289,291,293,295,297,299,301,303,305,307,309,311,313,315,317,319,321,323,325,327,329,331,333,335,337,339,341,343,345,347,349,351,353,355,357,359,361,363,365,367,369,371,373,375,377,379,381,383,385,387,389,391,393,395,397,399,401,403,405,407,409,411,413,415,417,419,421,423,425,427,429,431,433,435,437,439,441,443,445,447,449,451,453,455,457,459,461,463,465,467,469,471,473,475,477,479,481,483,485,487,489,491,493,495,497,499,501,503,505,507,509,511,513,515,517,519,521,523,525,527,529,531,533,535,537,539,541,543,545,547,549,551,553,555,557,559,561,563,565,567,569,571,573,575,577,579,581,583,585,587,589,591,593,595,597,599,601,603,605,607,609,611,613,615,617,619,621,623,625,627,629,631,633,635,637,639,641,643,645,647,649,651,653,655,657,659,661,663,665,667,669,671,673,675,677,679,681,683,685,687,689,691,693,695,697,699,701,703,705,707,709,711,713,715,717,719,721,723,725,727,729,731,733,735,737,739,741,743,745,747,749,751,753,755,757,759,761,763,765,767,769,771,773,775,777,779,781,783,785,787,789,791,793,795,797,799,801,803,805,807,809,811,813,815,817,819,821,823,825,827,829,831,833,835,837,839,841,843,845,847,849,851,853,855,857,859,861,863,865,867,869,871,873,875,877,879,881,883,885,887,889 and 891.
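11. A nucleic acid molecule comprising a nucleotide sequence complementary to a nucleic acid molecule according to any one of claims 8 to 10.
12. A nucleic acid molecule comprising a nucleotide sequence having 50% or greater sequence identity to the sequence of a nucleic acid molecule according to any one of claims 8 to 11.
13. A nucleic acid molecule which can hybridise to a nucleic acid molecule according to any one of claims 8 to 12 under high stringency conditions.
14. A composition comprising a protein, a nucleic acid molecule, or an antibody according to any preceding claim.
15. A composition according to claim 14, being a vaccine composition or a diagnostic composition.
16. Use of a composition according to claim 14 or 15 as a medicament.
17. Use of a composition according to claim 14 in the manufacture of a medicament for the treatment or prevention of infection due to Neisserial bacteria.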
Applications Claiming Priority (14)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
GBGB9723516.2A GB9723516D0 (zh) | 1997-11-06 | 1997-11-06 | |
GB9723516.2 | 1997-11-06 | ||
GB9724190.5 | 1997-11-14 | ||
GBGB9724190.5A GB9724190D0 (zh) | 1997-11-14 | 1997-11-14 | |
GB9724386.9 | 1997-11-18 | ||
GBGB9724386.9A GB9724386D0 (zh) | 1997-11-18 | 1997-11-18 | |
GBGB9725158.1A GB9725158D0 (zh) | 1997-11-27 | 1997-11-27 | |
GB9725158.1 | 1997-11-27 | ||
GBGB9726147.3A GB9726147D0 (en) | 1997-12-10 | 1997-12-10 | Antigens |
GB9726147.3 | 1997-12-10 | ||
GBGB9800759.4A GB9800759D0 (zh) | 1998-01-14 | 1998-01-14 | |
GB9800759.4 | 1998-01-14 | ||
GB9819016.8 | 1998-09-01 | ||
GBGB9819016.8A GB9819016D0 (zh) | 1998-09-01 | 1998-09-01 |
Related Parent Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CNB988128446A Division CN1263854C (zh) | 1997-11-06 | 1998-10-09 | Neisserial antigens |
Publications (2)
Publication Number | Publication Date |
---|---|
CN1824675A (zh) | 2006-08-30 |
CN1824675B (zh) | 2010-12-01 |
Family
ID=10821707
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN2005101133957A Expired - Fee Related CN1824675B (zh) | 1997-11-06 | 1998-10-09 | Neisserial antigens |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN1824675B (zh) |
GB (1) | GB9723516D0 (zh) |
- 1997
  - 1997-11-06: GB GBGB9723516.2A, patent GB9723516D0, not active (Ceased)
- 1998
  - 1998-10-09: CN CN2005101133957A, patent CN1824675B, not active (Expired - Fee Related)
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
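CN108823329A (zh) * | 2018-06-28 | 2018-11-16 | Tobacco Research Institute, Chinese Academy of Agricultural Sciences | Molecular marker for identifying tobacco sua-CMS cytoplasmic male sterile lines |
CN108823329B (zh) * | 2018-06-28 | 2022-03-22 | Tobacco Research Institute, Chinese Academy of Agricultural Sciences | Molecular marker for identifying tobacco sua-CMS cytoplasmic male sterile lines |
CN114137209A (zh) * | 2021-02-01 | 2022-03-04 | Yellow Sea Fisheries Research Institute, Chinese Academy of Fishery Sciences | Immunofluorescence test strip for rapid detection of ostreid herpesvirus antigen, and application thereof |
CN114137209B (zh) * | 2021-02-01 | 2024-03-01 | Yellow Sea Fisheries Research Institute, Chinese Academy of Fishery Sciences | Immunofluorescence test strip for rapid detection of ostreid herpesvirus antigen, and application thereof |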
Also Published As
Publication number | Publication date |
---|---|
GB9723516D0 (zh) | 1998-01-07 |
CN1824675B (zh) | 2010-12-01 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN1263854C (zh) | | Neisserial antigens |
US8293251B2 (en) | | Neisserial antigens |
CN1224708C (zh) | | Neisseria meningitidis antigens |
CN100379757C (zh) | | Neisseria meningitidis antigens and compositions |
CN1359426A (zh) | | Neisserial genomic sequences and methods of their use |
KR102607213B1 (ko) | | Ammonia-oxidizing Nitrosomonas eutropha strain D23 |
AU745787B2 (en) | | Enterococcus faecalis polynucleotides and polypeptides |
CN1338005A (zh) | | Neisserial genomic sequences and uses thereof |
CN1451046A (zh) | | Conserved Neisserial antigens |
CN1774447A (zh) | | Streptococcus pneumoniae antigens |
JP2014068652A (ja) | | Neisserial antigenic peptides |
RU2673715C2 (ru) | | Vaccine against Haemophilus parasuis serotype 4 |
CN1824675B (zh) | | Neisserial antigens |
CN1911959A (zh) | | Neisserial genomic sequences and uses thereof |
KR20190059562A (ko) | | Novel Bacillus subtilis having γPGA activity and use thereof |
MXPA00004363A (en) | | Neisserial antigens |
AU2003235364B2 (en) | | Neisseria meningitidis antigens and compositions |
KR20220135669A (ko) | | Novel surfactin-producing Bacillus subtilis strain and use thereof |
CN1186516A (zh) | | Nucleic acid and amino acid sequences relating to Helicobacter pylori for diagnostics and therapeutics |
AU1546202A (en) | | Enterococcus faecalis polynucleotides and polypeptides |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
 | C06 | Publication | |
 | PB01 | Publication | |
 | C10 | Entry into substantive examination | |
 | SE01 | Entry into force of request for substantive examination | |
 | C14 | Grant of patent or utility model | |
 | GR01 | Patent grant | |
 | C17 | Cessation of patent right | |
 | CF01 | Termination of patent right due to non-payment of annual fee | Granted publication date: 20101201; Termination date: 20131009 |